Running number-crunching code on a top-of-the-line graphics card can be 10x faster than on a comparably high-end CPU. Not every application will see a 10x speedup, but Monte Carlo simulations, image processing, and matrix-product-heavy code are excellent candidates. Thanks to general-purpose linear-algebra-on-the-GPU libraries – like gpuarray, gnumpy and Data.Array.Accelerate – it’s possible to run high-level Matlab, Python or Haskell code on the GPU.
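As a minimal sketch of what that looks like in practice, here’s elementwise math with PyCUDA’s gpuarray. The array shape is arbitrary, and the CPU fallback is only there so the snippet still runs on a machine without a CUDA device – on real hardware the `except` branch never fires:

```python
import numpy as np

a = np.random.randn(1000, 1000).astype(np.float32)

try:
    import pycuda.autoinit          # sets up a CUDA context on the default device
    import pycuda.gpuarray as gpuarray
    a_dev = gpuarray.to_gpu(a)      # copy the array into GPU memory
    b = (2 * a_dev + 1).get()       # elementwise math runs on the device; .get() copies back
except ImportError:
    b = 2 * a + 1                   # plain NumPy fallback when PyCUDA isn't installed
```

The point is that the GPU version reads almost exactly like the NumPy version – the library handles the device transfers and kernel launches.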
In other words, GPU computing is very accessible. So which GPU should you choose to do your work?
Although in theory OpenCL allows general-purpose GPU programming on multiple platforms, in practice most high-level libraries support only CUDA-based computing. Therefore, NVIDIA graphics cards are effectively your only choice. NVIDIA has many CUDA-enabled products, but in terms of features per price point your choices boil down to two product lines: GeForce and Tesla.
A GeForce card is simply a consumer-level graphics card that you can also use for GPU computing. A Tesla card is a dedicated board that doesn’t have VGA/DVI output; its only purpose is computing. In either case, you’ll want an extra graphics card to drive your display; a cheap GeForce will do. Oh, and you’ll need a giant power supply to run any of these cards. On the upside, your office will never get cold in the winter.
What’s the difference between a top-of-the-line GeForce and a Tesla?
- Price: a GeForce is about 3x cheaper – the cheapest Tesla is about $3k, while the most expensive GeForce is in the $1k range
- Memory: some Tesla variants have more on-board RAM
- Double-precision performance: some GeForce cards are much slower than a Tesla in double-precision
- Reliability and management: a Tesla has error-checking memory, is built to run 24/7, and has some hooks for remote management via HPC software
Notice how I didn’t mention single-precision performance? That’s because in terms of memory bandwidth, number of CUDA cores, and single-precision TFlops, Tesla and top-of-the-line GeForce cards are very similar – and sometimes the specs even favor the much cheaper GeForce cards.
If you’re running a cluster, reliability is probably very important to you, in which case, by all means, get a Tesla.
If instead you’re actively trying to avoid having to set up a cluster by buying a GPU, then get the GeForce Titan. The Titan is the only GeForce that doesn’t have handicapped double-precision capability: it’s as fast as a Tesla for double-precision tasks. It also has 6GB of RAM, which is as much as the entry-level Tesla that costs 3x more. It’s currently $1,250 on Amazon, although this price is sure to be outdated by the time you read this.
The GeForce GTX 780 is a very good option as well. If you’re uncommitted about GPU computing, it’s an excellent cost-conscious choice at half the price of the Titan – about $550 currently. I’m using a GTX 680 (last generation’s top-of-the-line GeForce) and it is stupendous for fitting convolutional neural nets. Beware, however, that filling up the GTX 780’s puny 3GB of onboard RAM is really easy, so you will get annoyed with it eventually. A 6GB version is rumored to be out soon. Low double-precision performance also means that you’ll have to jump through a few hoops to adapt your code to single precision.
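To make the 3GB constraint concrete, here’s a back-of-the-envelope calculation. The layer shape below is made up for illustration (it’s not from any particular model), but it shows both how fast activations eat memory and why dropping from double to single precision matters:

```python
import numpy as np

# Hypothetical conv-net activations for one minibatch:
# 256 images x 96 feature maps x 55 x 55 spatial positions.
acts = np.zeros((256, 96, 55, 55), dtype=np.float32)
print(f"float32: {acts.nbytes / 1024**3:.2f} GB")   # roughly 0.28 GB for one layer

# The same buffer in double precision costs exactly twice as much,
# so on a 3GB card, casting everything to float32 buys real headroom.
acts64 = acts.astype(np.float64)
print(f"float64: {acts64.nbytes / 1024**3:.2f} GB")
```

One layer’s activations at ~0.28GB – and a deep net keeps several such buffers plus weights and gradients around – is how 3GB disappears.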
Still, think about the difference between “I can fit my model in a day vs. I can fit my model in 10 days.” Completely different experience. It will enhance your quality of life and productivity significantly and potentially lead to better science (you can copy and paste this sentence when you send your pitch to your advisor).