Next: The Discrete Hartley Transform, Previous: The Halfcomplex-format DFT, Up: More DFTs of Real Data

The Fourier transform of a real-even function f(-x) = f(x) is real-even, and i times the Fourier transform of a real-odd function f(-x) = -f(x) is real-odd. Similar results hold for a discrete Fourier transform, and thus for these symmetries the need for complex inputs/outputs is entirely eliminated. Moreover, one gains a factor of two in speed/space from the fact that the data are real, and an additional factor of two from the even/odd symmetry: only the non-redundant (first) half of the array need be stored. The result is the real-even DFT (REDFT) and the real-odd DFT (RODFT), also known as the discrete cosine and sine transforms (DCT and DST), respectively.

(In this section, we describe the 1d transforms; multi-dimensional transforms are just a separable product of these transforms operating along each dimension.)

Because of the discrete sampling, one has an additional choice: is the
data even/odd around a sampling point, or around the point halfway
between two samples? The latter corresponds to *shifting* the
samples by *half* an interval, and gives rise to several transform
variants denoted by REDFTab and RODFTab: a and
b are 0 or 1, and indicate whether the input
(a) and/or output (b) are shifted by half a sample
(1 means it is shifted). These are also known as types I-IV of
the DCT and DST, and all four types are supported by FFTW's r2r
interface.^{1}

The r2r kinds for the various REDFT and RODFT types supported by FFTW,
along with the boundary conditions at both ends of the *input*
array (`n`

real numbers `in[j=0..n-1]`

), are:

`FFTW_REDFT00`

(DCT-I): even around j=0 and even around j=n-1.`FFTW_REDFT10`

(DCT-II, “the” DCT): even around j=-0.5 and even around j=n-0.5.`FFTW_REDFT01`

(DCT-III, “the” IDCT): even around j=0 and odd around j=n.`FFTW_REDFT11`

(DCT-IV): even around j=-0.5 and odd around j=n-0.5.`FFTW_RODFT00`

(DST-I): odd around j=-1 and odd around j=n.`FFTW_RODFT10`

(DST-II): odd around j=-0.5 and odd around j=n-0.5.`FFTW_RODFT01`

(DST-III): odd around j=-1 and even around j=n-1.`FFTW_RODFT11`

(DST-IV): odd around j=-0.5 and even around j=n-0.5.

Note that these symmetries apply to the “logical” array being
transformed; **there are no constraints on your physical input
data**. So, for example, if you specify a size-5 REDFT00 (DCT-I) of the
data abcde, it corresponds to the DFT of the logical even array
abcdedcb of size 8. A size-4 REDFT10 (DCT-II) of the data
abcd corresponds to the size-8 logical DFT of the even array
abcddcba, shifted by half a sample.

All of these transforms are invertible. The inverse of R*DFT00 is
R*DFT00; of R*DFT10 is R*DFT01 and vice versa (these are often called
simply “the” DCT and IDCT, respectively); and of R*DFT11 is R*DFT11.
However, the transforms computed by FFTW are unnormalized, exactly
like the corresponding real and complex DFTs, so computing a transform
followed by its inverse yields the original array scaled by N,
where N is the *logical* DFT size. For REDFT00,
N=2(n-1); for RODFT00, N=2(n+1); otherwise, N=2n.

Note that the boundary conditions of the transform output array are given by the input boundary conditions of the inverse transform. Thus, the above transforms are all inequivalent in terms of input/output boundary conditions, even neglecting the 0.5 shift difference.

FFTW is most efficient when N is a product of small factors; note
that this *differs* from the factorization of the physical size
`n`

for REDFT00 and RODFT00! There is another oddity: `n=1`

REDFT00 transforms correspond to N=0, and so are *not
defined* (the planner will return `NULL`

). Otherwise, any positive
`n`

is supported.

For the precise mathematical definitions of these transforms as used by FFTW, see What FFTW Really Computes. (For people accustomed to the DCT/DST, FFTW's definitions have a coefficient of 2 in front of the cos/sin functions so that they correspond precisely to an even/odd DFT of size N. Some authors also include additional multiplicative factors of √2for selected inputs and outputs; this makes the transform orthogonal, but sacrifices the direct equivalence to a symmetric DFT.)

Since the required flavor of even/odd DFT depends upon your problem,
you are the best judge of this choice, but we can make a few comments
on relative efficiency to help you in your selection. In particular,
R*DFT01 and R*DFT10 tend to be slightly faster than R*DFT11
(especially for odd sizes), while the R*DFT00 transforms are sometimes
significantly slower (especially for even sizes).^{2}

Thus, if only the boundary conditions on the transform inputs are specified, we generally recommend R*DFT10 over R*DFT00 and R*DFT01 over R*DFT11 (unless the half-sample shift or the self-inverse property is significant for your problem).

If performance is important to you and you are using only small sizes (say n<200), e.g. for multi-dimensional transforms, then you might consider generating hard-coded transforms of those sizes and types that you are interested in (see Generating your own code).

We are interested in hearing what types of symmetric transforms you find most useful.

[1] There are also type V-VIII transforms, which
correspond to a logical DFT of *odd* size N, independent of
whether the physical size `n`

is odd, but we do not support these
variants.

[2] R*DFT00 is sometimes slower in FFTW because we discovered that the standard algorithm for computing this by a pre/post-processed real DFT—the algorithm used in FFTPACK, Numerical Recipes, and other sources for decades now—has serious numerical problems: it already loses several decimal places of accuracy for 16k sizes. There seem to be only two alternatives in the literature that do not suffer similarly: a recursive decomposition into smaller DCTs, which would require a large set of codelets for efficiency and generality, or sacrificing a factor of 2 in speed to use a real DFT of twice the size. We currently employ the latter technique for general n, as well as a limited form of the former method: a split-radix decomposition when n is odd (N a multiple of 4). For N containing many factors of 2, the split-radix method seems to recover most of the speed of the standard algorithm without the accuracy tradeoff.