gpu for trading model research

rickyvic


Total Posts: 222
Joined: Jul 2013
 
Posted: 2020-09-14 20:48
Hi guys, this is a hardware rather than a software question.

I am exploring GPUs as a means to offload heavy-duty calculations from the CPU.
The most demanding processes are mainly data loading from flat files, parsing, manipulation, and databasing.

Most computations are done in double precision, though admittedly I have never tried single precision.

I am currently using a mix of R and MATLAB, so it is wise to take into account the packages I can use (they will most likely default to double precision).

Which NVIDIA card should I buy (Tesla K20/K40/K80, Titan Z, Tesla V100)?
Many cards vs. one super-powerful one, etc...
I suppose a lot of the time will be spent on transfer overhead from the CPU to the GPU and vice versa.

Any pointers appreciated

"amicus Plato sed magis amica Veritas"

ax


Total Posts: 78
Joined: Jul 2007
 
Posted: 2020-09-17 13:32
What benefit does a GPU provide for data loading from flat files, parsing, and databasing?

Depending on what you mean by manipulating, there may be an upside.

Its Grisha


Total Posts: 59
Joined: Nov 2019
 
Posted: 2020-09-17 14:46
Not an expert on CUDA etc., but I have dabbled a bit...

> which NVIDIA card to buy

I would not invest in cards right away; the best way to try this out is to get a GPU instance on Google Cloud/AWS. I think the Tesla V100 is the fastest card they offer.

Before forking out the time and money to build your own servers, try it out for a couple of bucks an hour and see how it benchmarks for your workload.

> a lot of the time will be spent in overhead from the cpu to the gpu

This is very true, especially if your data isn't very high-dimensional. In some of my early experiments with GPUs, it was actually slower than the CPU because of the overhead of moving each batch of data on and off the card.
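If you do benchmark, time the transfers as well as the kernel. Roughly like this in R, using the gpuR package (a minimal sketch; I'm going from memory on gpuR's API, so treat the exact function names as approximate):

    library(gpuR)   # OpenCL-backed GPU matrix types for R

    n <- 4000L
    A <- matrix(rnorm(n * n), n, n)

    # CPU baseline
    print(system.time(A %*% A))

    # GPU, timed end to end: host-to-device copy, the multiply itself,
    # and the device-to-host copy back. That round trip is the overhead.
    print(system.time({
      gA <- vclMatrix(A, type = "double")  # copy the matrix onto the card
      gB <- gA %*% gA                      # multiply runs on the GPU
      B  <- as.matrix(gB)                  # copy the result back to host
    }))

If the second timing isn't clearly better on a matrix this size, the card isn't going to help your workload.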

doomanx


Total Posts: 89
Joined: Jul 2018
 
Posted: 2020-09-17 14:54
I'm not convinced a GPU is going to help you unless you're doing embarrassingly parallel computations on the data. Better to invest the money in more RAM, NVMe SSDs, or some manpower to do the computations on disk/storage.

did you use VWAP or triple-reinforced GAN execution?

nikol


Total Posts: 1176
Joined: Jun 2005
 
Posted: 2020-09-17 15:30
Maybe this discussion paper can help:

https://link.springer.com/article/10.1007/s00778-019-00581-w

Aside: reading flat files is sequential, so I expect parallelization will not help (update: unless the data is indexed or containerized, as in HDF5, where you can read it in chunks; see the sketch below).
So, have a look at the article.
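For example, in R the Bioconductor rhdf5 package lets you read one chunk of rows at a time instead of the whole dataset (a sketch; the file name, dataset name, and sizes are made up for illustration):

    library(rhdf5)

    # hypothetical HDF5 file with a 2-D 'trades' dataset of n_rows rows
    n_rows     <- 10000000
    chunk_size <- 1000000

    for (start in seq(1, n_rows, by = chunk_size)) {
      rows  <- start:min(start + chunk_size - 1, n_rows)
      # index = list(rows, NULL) reads only those rows, all columns
      chunk <- h5read("ticks.h5", "trades", index = list(rows, NULL))
      # ... process the chunk, then let it be garbage-collected ...
    }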

EspressoLover


Total Posts: 446
Joined: Jan 2015
 
Posted: 2020-09-18 14:42
Have you exhausted the possibility of squeezing more efficiency out of the existing process? I'd double-check that you're using a parsing library optimized for large datasets. You mentioned using R; fread() is several orders of magnitude faster than read.csv(), and the difference is easy to measure yourself, as in the sketch below.
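(A sketch; "quotes.csv" stands in for whatever big flat file you have lying around.)

    library(data.table)

    # read.csv: single-threaded base-R parser
    print(system.time(df <- read.csv("quotes.csv")))

    # fread: memory-maps the file and parses it with multiple threads
    print(system.time(dt <- fread("quotes.csv")))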

On a shitty MacBook, I can parse and load CSVs at close to 1 GB per second. You can provision a 224-core machine for $2.50 an hour. Even avoiding any sort of multi-node map/reduce clustering, that's roughly 224 cores at ~0.8 GB/s each, or about 180 GB/s, so you should be able to parse a 1 petabyte dataset in about 90 minutes.

While benchmarks aren't perfect and everyone's workload is different, this is the fairest and most comprehensive comparison of the performance of different big data systems that I know of. It includes a lot of GPU-based benchmarks. Take a look. It's useful because it indicates how far your current process is from maximum efficiency, and how much gain you can expect from moving to an alternative paradigm.

Good questions outrank easy answers. -Paul Samuelson

jslade


Total Posts: 1221
Joined: Feb 2007
 
Posted: 2020-09-21 11:46
>The most demanding processes are mainly in the data loading from flat files, parsing, manipulating and databasing.

What EspressoLover said: use better tools. fread and data.table instead of read.csv/data.frame. Writing a fast CSV parser is a surprisingly rich computer-science problem; the fastest I know of are the ones in the Kx and jd databases.

Another thing: your computer is almost certainly going to be IO-bound if you're doing your manipulating and databasing properly. This means your M.2/NVMe drive and your motherboard choice (4 lanes of PCIe 3.x) are going to be vastly more important than what kind of GPU you have, even if you move the data onto the GPU for processing, which is generally a terrible idea for what you describe. There was a generation of GPU-oriented analytics database engines around 2016, when I was in the TSDB business. While they had their corner use cases, they were mostly useless for the obvious reason (maxing out the PCIe bus with the data cha-cha).

Finally, if you want a cheap machine with lots of cores for your embarrassingly parallel problems, buy a Threadripper; in R that part is nearly a one-liner, as in the sketch below. At least you don't need to compile separate object code and move data across the PCI bus. I picked one up with 64 threads, 256 GB of RAM, big M.2 drives, and a bunch more spinning disks for $6k. It's fast enough for anything I need. I might buy another one for deploys in the data center.
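(A sketch; the per-file job and the 'symbol' column are made up, and mclapply is fork-based, so Linux/macOS only.)

    library(parallel)
    library(data.table)

    # hypothetical per-file job: parse one flat file and aggregate it
    process_file <- function(path) {
      dt <- fread(path)
      dt[, .(rows = .N), by = symbol]  # assumes a 'symbol' column
    }

    files   <- list.files("ticks/", full.names = TRUE)
    # one forked worker per core; results come back as a list
    results <- mclapply(files, process_file, mc.cores = 64L)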

I'd love to fool around with a big GPU toaster some time, but I refuse to write a bunch of ephemeral C code for CUDA. I feel like in the 2020s there should be lots of good high-level FP GPU options with big libraries of useful code, rather than working with Nvidia's shitty PDP-11 assembler framework. Everything I've seen so far is either a hack or Python/torch/tensorflow spaghetti oriented towards dweeb learning. Even something like Stan on the GPU would be useful (I know, there's some hack which claims to do this; maybe I'll torture myself with it one day).

"Learning, n. The kind of ignorance distinguishing the studious."

rickyvic


Total Posts: 222
Joined: Jul 2013
 
Posted: 2020-09-25 14:57
I have looked into it; maybe with chunking I could make use of it.
I am going to fool around with it and report back.
Thanks for your input, all very useful.

"amicus Plato sed magis amica Veritas"

rickyvic


Total Posts: 222
Joined: Jul 2013
 
Posted: 2020-09-25 15:02
Regarding CPU code optimization, I think I have done quite a lot, but these routines still take some time. Not horrible, but I would like them to be quicker. I also use HDF5 for storage on SSD drives.
Everything else is quite vectorized, so it is probably hard to make much faster.


"amicus Plato sed magis amica Veritas"

agentq


Total Posts: 40
Joined: Jul 2008
 
Posted: 2020-09-25 15:10
https://neanderthal.uncomplicate.org/

Might be interesting, though it is Clojure-based.