NVIDIA updates CUDA Toolkit to version 3.1
NVIDIA has released an updated version, release 3.1, of its CUDA Toolkit, the software package that lets developers use NVIDIA GPUs to run parallel code that is not necessarily tied to graphics applications. The CUDA Toolkit is available for Windows, Linux and Mac OS, in both 32-bit and 64-bit versions.
The toolkit must be paired with specific NVIDIA ForceWare drivers, release 257.21; both the software and the drivers are available for download from the NVIDIA website. These are the notes NVIDIA provided with the new toolkit release:
- GPUDirect(tm) gives 3rd party devices direct access to CUDA Memory
- Support for 16-way concurrency allows up to 16 different kernels to run at the same time on Fermi architecture GPUs
- Runtime / Driver interoperability enables applications to mix-n-match use of the CUDA Driver API with CUDA C Runtime and math libraries via buffer sharing and context migration
- New language features added to CUDA C / C++ include:
  - Support for printf() in device code
  - Support for function pointers and recursion makes it easier to port many existing algorithms to Fermi GPUs
- Unified Visual Profiler now supports both CUDA C/C++ and OpenCL, and now includes support for CUDA Driver API tracing
- Math Libraries Performance Improvements, including:
  - Improved performance of selected transcendental functions from the log, pow, erf, and gamma families
  - Significant improvements in double-precision FFT performance on Fermi-architecture GPUs for 2^n transform sizes
  - Streaming API now supported in CUBLAS for overlapping copy and compute operations
  - CUFFT Real-to-complex (R2C) and complex-to-real (C2R) optimizations for 2^n data sizes
  - Improved performance for GEMV and SYMV subroutines in CUBLAS
  - Optimized double-precision implementations of divide and reciprocal routines for the Fermi architecture
- New and updated SDK code samples demonstrating how to use:
  - Function pointers in CUDA C/C++ kernels
  - OpenCL / Direct3D buffer sharing
  - Hidden Markov Model in OpenCL
  - Microsoft Excel GPGPU example showing how to run an Excel function on the GPU
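
As an illustration of the new language features in the list above (printf() in device code, function pointers and recursion), here is a minimal sketch rather than NVIDIA sample code. The names UnaryOp, doubleIt, squareIt, factorial and demoKernel are hypothetical and exist only for this example. It assumes a GPU of compute capability 2.0 (Fermi) or later, and it uses the current runtime call cudaDeviceSynchronize(); toolkits of the 3.1 era exposed the same operation as cudaThreadSynchronize().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical function-pointer type, introduced only for this sketch.
typedef int (*UnaryOp)(int);

__device__ int doubleIt(int x) { return 2 * x; }
__device__ int squareIt(int x) { return x * x; }

// Recursive device function; recursion needs compute capability 2.0 or later.
__device__ int factorial(int n) {
    return (n <= 1) ? 1 : n * factorial(n - 1);
}

__global__ void demoKernel(int n) {
    int tid = static_cast<int>(threadIdx.x);

    // Pick a device function at run time through a function pointer.
    UnaryOp op = (tid % 2 == 0) ? doubleIt : squareIt;
    int value = op(tid + 1);

    // Formatted output directly from device code.
    printf("thread %d: op(%d) = %d, %d! = %d\n",
           tid, tid + 1, value, n, factorial(n));
}

int main() {
    // One block of four threads is enough to exercise all three features.
    demoKernel<<<1, 4>>>(5);

    // Report launch errors (for example, an unsupported architecture).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Device-side printf output is flushed when the host synchronizes.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

With a Fermi-era toolkit this would be built with something like `nvcc -arch=sm_20 demo.cu -o demo` (the file name is an assumption); current toolkits no longer accept sm_20 and target newer architectures by default. The order in which the threads' lines appear is not guaranteed.
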
Windows developers should be sure to check out the new debugging and profiling features in Parallel Nsight for Visual Studio at www.nvidia.com/nsight.
Please refer to the release notes and Getting Started Guides for more information.
For more information on general purpose computing features of the Fermi architecture, see: www.nvidia.com/fermi.
Note: The developer driver packages below provide baseline support for the widest number of NVIDIA products in the smallest number of installers. More recent production driver packages for end users are available at www.nvidia.com/drivers.
- Support for Tesla 20-series (C2050) is not yet in these developer drivers. We will update in ~1 week
- Production Tesla 20-series drivers will be available at http://www.nvidia.com/drivers in ~1 week