Other Multi-dimensional Real-data MPI Transforms

Next: FFTW MPI Transposes, Previous: Multi-dimensional MPI DFTs of Real Data, Up: Distributed-memory FFTW with MPI

6.6 Other multi-dimensional Real-Data MPI Transforms

FFTW's MPI interface also supports multi-dimensional ‘r2r’ transforms of all kinds supported by the serial interface (e.g. discrete cosine and sine transforms, discrete Hartley transforms, etc.). Only multi-dimensional ‘r2r’ transforms, not one-dimensional transforms, are currently parallelized.

These are used much like the multidimensional complex DFTs discussed above, except that the data is real rather than complex, and one needs to pass an r2r transform kind (fftw_r2r_kind) for each dimension as in the serial FFTW (see More DFTs of Real Data).

For example, one might perform a two-dimensional L × M that is an REDFT10 (DCT-II) in the first dimension and an RODFT10 (DST-II) in the second dimension with code like:

         const ptrdiff_t L = ..., M = ...;
         fftw_plan plan;
         double *data;
         ptrdiff_t alloc_local, local_n0, local_0_start, i, j;
     
         /* get local data size and allocate */
         alloc_local = fftw_mpi_local_size_2d(L, M, MPI_COMM_WORLD,
                                              &local_n0, &local_0_start);
         data = fftw_alloc_real(alloc_local);
     
         /* create plan for in-place REDFT10 x RODFT10 */
         plan = fftw_mpi_plan_r2r_2d(L, M, data, data, MPI_COMM_WORLD,
                                     FFTW_REDFT10, FFTW_RODFT10, FFTW_MEASURE);
     
         /* initialize data to some function my_function(x,y) */
         for (i = 0; i < local_n0; ++i) for (j = 0; j < M; ++j)
            data[i*M + j] = my_function(local_0_start + i, j);
     
         /* compute transforms, in-place, as many times as desired */
         fftw_execute(plan);
     
         fftw_destroy_plan(plan);

Notice that we use the same ‘local_size’ functions as we did for complex data, only now we interpret the sizes in terms of real rather than complex values, and correspondingly use fftw_alloc_real.