- Based on VkFFT 1.3.1
- Add support for more than 3 FFT dimensions (defaults to 8 in pyvkfft, can be changed when installing)
- Add options to manually or automatically tune the FFT performance for the used GPUs.
- Add pyvkfft-benchmark script.
- The VkFFT source is now included as a git subproject
- Actually use cuda_stream parameter in the pyvkfft.fft interface
- Take into account current cuda device when automatically caching VkFFTApp using the pyvkfft.fft interface(#26) This enable using multiple GPUs in a multi-threaded approach.
- The pyvkfft-test will ignore the PoCL OpenCL platform if another is available and unless it is explicitly selected.
- Fix MANIFEST.in including vkfft_cuda.cu for python setup.py sdist
- VkFFT 1.2.33, now using Rader algorithm for better performance with many non-radix sizes.
- Fix R2C tests when using numpy (scipy unavailable) [#19]
- Add support for F-ordered arrays (C2C and R2C)
- Allow selection of backend for non-systematic pvkfft-test
- Add parameter to select the OpenCL platform in pyvkfft-test
- For pyopencl, transforms will use the array's queue by default instead of the application's (a warning will be written to notify this change when they differ; it can be disabled using config.WARN_OPENCL_QUEUE_MISMATCH). A queue can also be supplied to the fft() and ifft() methods. (from @isuruf, vincefn#17)
- Fix simple fft interface import when only pycuda is used
- Add cuda_driver_version, cuda_compile_version, cuda_runtime_version functions.
- Add simpler interface to run benchmarks, using separate processes.
- add pyvkfft-test-suite for long tests (up to 30 hours) for validation before new releases.
- Correct the dtype of the returned array for fft.rfftn() and fft.irfftn() in the case of an inplace transform
- Pycuda: cast the gpudata pointer to int for comparisons
- Fix TestFFT colour attribute default value
- Added accuracy unit tests, which can be used systematically using the 'pyvkfft-test' installed script
- An extensive testing is now made before official releases, evaluating all type of transforms (c2c, r2c, dct, 1, 2 and 3D, in and out-of-place, norm 0 and 1), different GPUs, both OpenCL and CUDA, etc... Comparison is made against pyfftw, scipy or numpy.
- Update to VkFFT 1.2.21, with support for DCT types 1, 2, 3 and 4, also fixing a number of issues (see closed issues at https://github.com/DTolm/VkFFT/issues), and passing all tests on different GPUs (OpenCL and CUDA, AMD and nVidia)
- Raise a RuntimeError if the VkFFTApp initialisation or the GPU kernel launch fails, with the corresponding VkFFT error.
- [BUG] Correct inverse FFT calculation using pyvkfft.fft.ifftn()
- Installation from source using 'python setup.py install' is now disabled - 'pip install' should always be used.
- Added config.USE_LUT and config.FFT_CACHE_NB variables, which can be used to modify the default behaviour, and can also be set e.g. with the PYVKFFT_USE_LUT environment variable.
- Support for windows installation (cuda and opencl) [requires visual studio with c++ tools and the cuda toolkit with nvcc. Untested with the AMD SDK]
- Remove Cython reference in setup.py
- Enable transforms for any array size (VkFFT Bluestein algorithm)
- Allow 3D transforms on arrays with >3 dimensions (batch transform)
- Support for transforms on a given list of axes, instead of only the first ndim axes. Unavailable for R2C.
- Added a simple pyvkfft.fft interface with fftn, ifftn, rfftn, irfftn functions which automatically recognize the type of GPU arrays and cache the generated VkFFTApp (FFT plans).
- Direct Cosine Transform (DCT) of type 2, 3 and 4 (EXPERIMENTAL)
- Support CuPy arrays in addition to PyCUDA and PyOpenCL
- OpenCL: test for half and double-precision support
- OpenCL: relax accuracy requirements in unit tests
- Fix shape test for out-of-place R2C transforms
- Add a base VkFFTApp class common to OpenCL and CUDA
- Installation: fix macOS compilation. Allow selection of backends from an environment variable
- Initial release, in phase with VkFFT 1.2.2