Big Data and Deep Learning, a technology revolution in trading or yet another hype?
     

finanzmaster


Total Posts: 119
Joined: Feb 2011
 
Posted: 2016-10-23 11:33
My non-technical essay


Summary:

*BigData and DeepLearning are popular buzzwords nowadays. But the number of genuine success stories is relatively small.

*In trading, BigData technology is mostly associated with automatic analysis of news and of sentiment in social networks. But unless you are Google or Reuters, you will never be the one who gets the news first. Additionally, the market reaction to both news and sentiment is often vague and amorphous.

*Large deep neural networks closely resemble a human brain, which also has a lot of neurons, interconnected in many layers. But that does not mean a breakthrough to real artificial intelligence: all that glitters is not gold.

*On the positive side: trading is only a part of the financial world. BigData + DeepLearning likely has high potential in adjacent areas like risk profiling and credibility analysis.

So IMO there is more hype than opportunity. However, I would be happy to hear opposing opinions (esp. with concrete examples of success stories).

www.yetanotherquant.com - Knowledge rather than Hope: A Book for Retail Investors and Mathematical Finance Students

katastrofa


Total Posts: 357
Joined: Jul 2008
 
Posted: 2016-10-23 19:24
Big Data is important for insurance companies.

finanzmaster


Total Posts: 119
Joined: Feb 2011
 
Posted: 2016-10-23 19:32
Agreed for health insurance (first of all to prevent fraud).
For life and pension insurance it seems unlikely (at least I cannot imagine a use case).


www.yetanotherquant.com - Knowledge rather than Hope: A Book for Retail Investors and Mathematical Finance Students

katastrofa


Total Posts: 357
Joined: Jul 2008
 
Posted: 2016-10-23 19:52
Better estimates of survival probability, no?

finanzmaster


Total Posts: 119
Joined: Feb 2011
 
Posted: 2016-10-23 20:45
Well, I studied in detail how to generate Sterbetafeln (mortality tables).
Naively, one just needs to observe 1 000 000 (in the USA 100 000) persons from birth to death. Not big data.
The nuance is that longevity grows with time (better medical care and so on). So one needs to observe several cohorts (and then extrapolate the longevity growth).
But several cohorts are still not big data.

On the other hand, one might better estimate mortality risk from, say, social networks (and it is possible: e.g. a guy who posts photos with a bike has a higher probability of dying).
But it may turn out that the gain, compared to the cost, is minimal (if there is any at all).
Life insurance companies demand pretty exhaustive disclosure anyway (whether you are a smoker, a biker and so on).
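
To make the "not big data" point concrete, here is a minimal sketch of the bookkeeping behind a single-cohort mortality table: start from a cohort of a given size and record deaths at each age. The function name `life_table`, the field names, and the death counts are made up for illustration; real tables also need smoothing and the extrapolation of longevity growth across cohorts, which this sketch does not attempt.

```python
def life_table(cohort_size, deaths_by_age):
    """Build a simple cohort life table.

    cohort_size   -- number of persons observed from birth (e.g. 100 000)
    deaths_by_age -- deaths_by_age[x] is the number of deaths at age x
    """
    rows = []
    alive = cohort_size
    for age, deaths in enumerate(deaths_by_age):
        if deaths > alive:
            raise ValueError("more deaths than survivors at age %d" % age)
        # q: probability of dying at this age, given survival to it
        q = deaths / alive if alive else 0.0
        rows.append({
            "age": age,
            "alive": alive,                     # survivors at start of age
            "q": q,
            "survival": alive / cohort_size,    # probability of reaching this age
        })
        alive -= deaths
    return rows


# Hypothetical numbers, only to show the shape of the output.
for row in life_table(100_000, [500, 100, 50]):
    print(row)
```

The whole table is one row per age, so even several cohorts amount to a few thousand numbers.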


www.yetanotherquant.com - Knowledge rather than Hope: A Book for Retail Investors and Mathematical Finance Students

katastrofa


Total Posts: 357
Joined: Jul 2008
 
Posted: 2016-10-23 21:45
In insurance Big Data could lower rates for optimistic tweeters

chiral3
Founding Member

Total Posts: 4985
Joined: Mar 2004
 
Posted: 2016-10-23 22:18
There's a pretty big push in insurtech now. It's riffing on fintech but is limited more so by regulations. Much of it involves disintermediating via the front end. For instance, AIG, a reinsurer, and 2sigma are partnering because AIG can write product, the reinsurer's off-shore entity can house it, and 2sigma can skunkworks the digital front-end. The only novelty is the rate that they can underwrite business. The actual applications of DL, AI, robo, and BD are really more in the sales / distribution space, not running the businesses or the trading / investing side. At a minimum these companies need to pivot for the new demographic, who are amassing and managing savings and wealth in a very different way than previous generations. Capturing this wealth requires new methods. There is a decent amount of hype, though. Much is coming simply from fear of not seeing around the corner.

Nonius is Satoshi Nakamoto. 物の哀れ

katastrofa


Total Posts: 357
Joined: Jul 2008
 
Posted: 2016-10-23 22:20
Also in managing investments: http://www.byhiras.com

rftx713


Total Posts: 77
Joined: May 2016
 
Posted: 2016-10-23 22:45
>On the other hand if one can better estimate mortality risk from, say, social networks (and it is possible: a guy who e.g. puts photos with a bike has larger probability to die).


Well.... This seems to be a bit incongruent with how a portion of the population views things.

What I'm saying is - I would think, in the few minutes I've been reading this thread, that if we're crossing this bridge, we're also assuming away the problems that people have with giving insurers access to their DNA. That is, it seems that at least a statistically significant sample of the population would rather pay an extra $x per month than give somebody total insight into their bodies or personal lives.

Put more simply: Big Data could do a lot of cool things if the people from whom the data was sourced were totally cool with those cool things.

"Give us access to your Twitter, it might lower your premiums!" OK, but if I post something snarky or sarcastic in my snide way, which a large portion of the average population (particularly the average types that would be coding these things) tends not to catch and then takes seriously, will I suddenly find myself paying even more because of my tweets? Will people game the system and sell $5 scripts to post a steady stream of randomized, optimized optimism to lower premiums?

Why not use the license plate scanners that are now widespread to add another dataset to your auto insurance pricing? We need you to explain your recent visits to the hood now please, or just pay this small increase...

Why not use the gyroscope in your iPhone and its internet browsing history to tell how many times a day you jack off? I hear the more frequent, the less risk of heart disease.

Apologies if I come off as a bit flippant; I genuinely don't mean to, but re-reading my post I think I might to some. I would just be more interested in discussions about overcoming those problems. It's clear at this point that with x amount more data, you can do x more cool things with it, some of which carry over into pricing financial instruments. But as said above, many of the headwinds are regulatory, and for good reason in my book.

katastrofa


Total Posts: 357
Joined: Jul 2008
 
Posted: 2016-10-24 00:08
Well obviously the interests of the insurer and the insured are opposite here.

rftx713


Total Posts: 77
Joined: May 2016
 
Posted: 2016-10-24 02:19
Fair point, I guess my rant was pretty pointless as that has pretty much always been the case.

NeroTulip


Total Posts: 996
Joined: May 2004
 
Posted: 2016-10-24 05:18
"Well obviously the interests of the insurer and the insured are opposite here."

Well, there could be cases where everybody wins, i.e. when better pricing of risk encourages safer behavior. The insurer can say "ok, you can ride your motorcycle way too fast, but we'll have to price that into your premium", then the insured can make an informed decision of whether they want to take this risk or not. I guess if risk is properly priced, fewer people will engage in risky behavior, but some still will, because rock'n'roll. Everybody wins.

In many countries, there are heavy regulations that prevent discrimination in insurance, and that seems to be the tough part in insurtech. The interesting question in insurance regulation is how this is going to evolve. Insurers will probably lobby for using as much information as possible, but that is not necessarily in the interest of the insured. I find -and that's a personal opinion- the prospect of hearing "well, we just found out you were born with genetic mutation XYZ, so your premium has to go up 10x" quite unsettling. Seems to me that the whole point of insurance is to mutualise risks that we cannot control.

So that is where I would draw the line: insurers should be able to price risk that you can control (i.e. your choice of riding your motorcycle way too fast), but not what you cannot control (your DNA). The ethical/philosophical question becomes "what can you control?", and there's a grey area, larger than I'd like, but I think that is the whole debate.

I'd like to hear if you think this is sensible reasoning, and if this is where regulation is heading.

"Earth: some bacteria and basic life forms, no sign of intelligent life" (Message from a type III civilization probe sent to the solar system circa 2016)

katastrofa


Total Posts: 357
Joined: Jul 2008
 
Posted: 2016-10-24 08:32
Yes, often people who are well to do, healthy and equipped with a large social support network have a very different idea of "what you can control" than the poorer strata of society.

E.g. fast riding a cheap bike: what if it's the cheapest way of commuting to work for a father supporting a family?

goldorak


Total Posts: 979
Joined: Nov 2004
 
Posted: 2016-10-24 10:49
Big data and Clustering, a technology revolution in trading or yet another hype?

Big data and PCA, a technology revolution in trading or yet another hype?

Big data and Boosting, a technology revolution in trading or yet another hype?

Big data and Linear Regression, a technology revolution in trading or yet another hype?

and so on... every epoch has its own hot solution to the ultimate question of life, the universe and everything. With time it is just another tool in your toolbox.

I see the same hot trends in my kitchen.




If you are not living on the edge you are taking up too much space.

finanzmaster


Total Posts: 119
Joined: Feb 2011
 
Posted: 2016-10-25 21:50
>E.g. fast riding a cheap bike: what if it's the cheapest way of commuting
>to work for a father supporting a family?
Good point! And if a classification rule decides to classify this father as a bad risk I would call it not "an artificial intelligence" but rather "a natural stupidity".

>Big data and Linear Regression, a technology revolution in trading or yet another hype?
BTW, can a linear regression (least squares) algorithm be run under the MapReduce paradigm?! ;)


www.yetanotherquant.com - Knowledge rather than Hope: A Book for Retail Investors and Mathematical Finance Students

katastrofa


Total Posts: 357
Joined: Jul 2008
 
Posted: 2016-10-25 22:18
"And if a classification rule decides to classify this father as a bad risk I would call it not "an artificial intelligence" but rather "a natural stupidity"."

A great benefit of computer algorithms is that we can embed in them all our biases and pretend to be objective.

"BTW, can a linear regression (least square) algorithm be run under MapReduce paradigm"

It can, but it's not always a great idea. MapReduce is great for N-to-1 problems, but finance is dominated by N-to-M problems, with N, M >> 1.
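
For the least-squares question, a minimal sketch of why it fits the map/reduce pattern: each chunk of data can be boiled down to a handful of sums, and sums combine associatively. This is plain Python, not tied to Hadoop or any real MapReduce framework; the names `map_chunk`, `combine`, `solve`, `fit` and the toy data are made up for illustration, and it covers only one regressor plus an intercept. With several regressors the same idea should carry over, with each chunk emitting its own X'X and X'y.

```python
from functools import reduce


def map_chunk(chunk):
    """Map step: reduce one chunk of (x, y) pairs to its sufficient statistics."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return (n, sx, sy, sxx, sxy)


def combine(a, b):
    """Reduce step: statistics from two chunks simply add up."""
    return tuple(u + v for u, v in zip(a, b))


def solve(stats):
    """Solve the normal equations for y = intercept + slope * x."""
    n, sx, sy, sxx, sxy = stats
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x has no variance; slope is undefined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope


def fit(chunks):
    """Run map over all chunks, reduce the results, then solve once."""
    return solve(reduce(combine, map(map_chunk, chunks)))


# Toy data lying exactly on y = 2 + 3x, split into three uneven chunks.
data = [(x, 2 + 3 * x) for x in range(10)]
chunks = [data[:3], data[3:7], data[7:]]
print(fit(chunks))
```

Raw sums of squares can lose precision on badly scaled data, so treat this as an illustration of the N-to-1 structure rather than a production recipe.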

jslade


Total Posts: 1064
Joined: Feb 2007
 
Posted: 2016-10-25 23:18
People who talk about "big data" in public remind me of conversations about sex I had when I was 12. Everyone talks about it; very few people are actually doing it.

People talk about it like there is all kinds of information hiding around in your data that the favored magic box of the month (presently dweeb learning) will be able to derive actionable meaning from. It is pretty rare for this to be true.

Most of the time, "big data" just gets in the way. 99% of it is trash that turns into "regular sized data" when you parse the stupid log files you keep around out of laziness and poor engineering practices. Let's imagine a "big data" problem together: let's say, all the login attempts for a phone company over the course of a year, or maybe all the credit card transactions for a credit card company. You're looking for bogeys. In one case, unlabeled, in the other mostly labeled. The data set itself isn't particularly big once it is parsed into something sane in a columnar DB: terabytes, something that fits on one decently beefy computer. After feature creation (maybe blowing it out to 2000 features) and aggregation, you're left with something which is a couple of GB. If you know what you're doing, you can pare down your example set to tens of thousands of examples and get a decent classifier for what you're interested in. Or you can deal with the complexity of a classifier which can ingest GB of data and get pretty much the same answer.

As an example, you can look at VW (Vowpal Wabbit) doing matrix factorization on 100k samples, then more samples:
https://github.com/JohnLangford/vowpal_wabbit/wiki/Matrix-factorization-example
http://files.grouplens.org/papers/ml-10m-README.html
You won't get a very different answer.

Deep learning is probably good stuff in the hands of a master craftsman, and is arguably the only way to do stuff like classify German traffic signs, but I have yet to see a commercial application of it that justifies the hype being created by Google and Facebook. Meanwhile, it's really computationally expensive and not as useful as, say, gradient boosted decision trees.

I have a friend working on DL in trading at a startup, and I think it's a lost cause. Maybe you could come up with something interesting using a reinforcement learning framework, but you're probably as well off doing this using linear models. Magic box ideas are silly.

"Learning, n. The kind of ignorance distinguishing the studious."

chiral3
Founding Member

Total Posts: 4985
Joined: Mar 2004
 
Posted: 2016-10-25 23:51
I saw something similar in a ppt recently. It went something like "Big Data is like teenage sex: everyone is talking about it, nobody knows exactly what it is, and everyone thinks everyone else is doing it."

We've been performing dimensionality reduction for decades via regression, PCA, entropy methods, etc. I always get punchy when groups talk about tools instead of the problems being solved. It's indicative of math-envy.

Nonius is Satoshi Nakamoto. 物の哀れ

EspressoLover


Total Posts: 221
Joined: Jan 2015
 
Posted: 2016-10-26 00:58
@jslade

At the risk of rehashing our previous debate about the merits of deep learning... While I'm pretty skeptical of whether deep learning has any real application for trading, the general hype is still pretty justifiable. It's definitely arguable whether the deep-net victories in supervised learning represent a categorical improvement, or just a reflection of more invested effort and tuning.

But the killer app isn't classification, it's fantasizing. As quant financiers, we generally (with good reason) ignore the generative in favor of the discriminative. However it's hard to argue that deep-nets aren't a quantum leap forward when it comes to fantasizing highly structured data-sets. Not only are there significant improvements in the distributions, but sampling on deep nets is really easy.

In ten years, it's pretty feasible that some variant of recurrent-nets will be able to spit out mediocre sitcom episodes or formulaic pop songs on demand. That's a *really* big deal.

rftx713


Total Posts: 77
Joined: May 2016
 
Posted: 2016-10-26 03:57
>That's a *really* big deal.

Would you mind explaining how? No offense meant whatsoever, I really do enjoy reading your posts and to me it's clear you're much better informed about this than I am. However, ending on that note sounds quite like a Buzzfeed or Vox article. "You won't believe that ABC just XYZ - and that's a really big deal."

Anyways, to me that's a misunderstanding of the problem being solved (as per chiral3), although it's very interesting because I was thinking about this the other day and feel like I may have something to add.

To preface: I'm a huge fan of Soundcloud. On that service alone (not even mentioning YouTube), there are truly countless individuals who are just incredibly talented. Whatever tools they're using, methods they're using, whatever, they make fucking awesome music. But most people have never heard of them, and never will. Some stupidly good songs will get less than 1,000 likes.

I get the impression that the same could be said about screenwriters, or really any of the creative arts.

After watching some friends try in different ways to gain recognition for their creative work, I've noticed that the creative work is just one piece of many when it comes to creating work that people will actually want to "participate" in. For something like Drake, Skrillex, South Park, The Wire, Breaking Bad, Harry Potter, Lord of the Rings, whatever... I'm willing to bet the screenwriters or music producers weren't the major expense or difficulty (as shown by my Soundcloud example that making a quality product isn't the entire game). Getting the word out there, converting people to your "cause"/"image"/whatever... to me that's the real "problem" in the function of "fantasizing." That's why radio stations don't just flip the bird to the record labels and play awesome music they found by no-names on the internet. Most people wouldn't listen, despite the raw quality being more or less the same.

Without going on a long spiel, I think much of this comes down to appealing to people in a way they may not have even known they wanted to be appealed to before. Your music/screenwriting/etc. must do this, but the end product that the person hears or sees is really much more than the work alone. If you're correct, which I think that despite my objections you probably are, the world may soon be flooded with sitcoms and songs on the radio that were made by algos. But then I wouldn't be surprised whatsoever to see a backlash away from those products towards "art" that is, in some currently unpredictable way, definitively not algorithmic.

I hope that made some amount of sense. Apologies if I came off in any kind of way, just wanted to speculate for a bit.

edit: I realize you mentioned mediocre sitcoms and formulaic pop songs, which takes away from some of what I said, but I do think the point remains. Katy Perry's day-to-day issue isn't coming up with a new pop song that most people will enjoy and that can be played in most social settings without pissing too many people off too much. It's being "Katy Perry." So unless an algo can be an artist's manager, or other equivalents, I'm not sure it's that much of a game changer.

chiral3
Founding Member

Total Posts: 4985
Joined: Mar 2004
 
Posted: 2016-10-26 04:23
@rftx713: For some time, in other writings, I've been drawing parallels between modern consumption and Marshall McLuhan's famous phrase "The medium is the message". In modern times I think of Uber. Uber isn't about the driver. In most major cities (except London...) the driver is an over-educated kid punting his many start-ups who is not very good with directions. Uber is about the app. Full stop. It cracked me up to hear Travis Kalanick talking about autonomous cars years ago. When people think of Uber they think of the app. They think of pushing that button and seeing the cars on the map. The driver is a notch up in importance from the gas cap.

Several years ago the quant angle was - as an example, for music - to find songs that maximize hook appeal and maximize listener fatigue (fits a Heaviside-looking function). This results in the highest purchase-repurchase rates. There's been a shift in how we consume, though. Like the taxi driver, the player and the artist have been replaced. Of course this creates cross sell opportunities, abstracts currencies (via the endowment effect and availability heuristic), and generates massive trainable datasets.

Nonius is Satoshi Nakamoto. 物の哀れ

katastrofa


Total Posts: 357
Joined: Jul 2008
 
Posted: 2016-10-26 08:17
I don't think generating songs will ever be a major application of smart algos (dare I say AI?), because there are and will be a lot of humans who WANT to write songs and will do it for peanuts just so that they get a shot at fame and recognition. And this is before we even think about reaching for the artistic creations of other species, e.g. whales.

Accounting, however...

Maggette


Total Posts: 942
Joined: Jun 2007
 
Posted: 2016-10-26 08:50
"People who talk about "big data" in public remind me of conversations about sex I had when I was 12. Everyone talks about it; very few people are actually doing it. "

Couldn't agree more. I actually do work with the hyped big data technology stack (Spark, HBase, Hive, ELK, and some Flink) for a big customer. The hype drives business my way.

I think we do have to make a distinction here: trivial ETL-like stuff can be done and is done with "big data" technology. There it can add value and can replace expensive solutions.

But when people talk about "big data", most of the time they think of the predictive analytics part of the business. And here I have only experienced a couple of use cases (literally one couple: 2!!!!) that really added commercial value.

All the other stuff I worked on felt like, and was handled like, an R&D project.

IMHO big data is often a lot of noise....

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

jslade


Total Posts: 1064
Joined: Feb 2007
 
Posted: 2016-10-26 09:47
"In ten years, it's pretty feasible that some variant of recurrent-nets will be able to generate spit out mediocre sitcom episodes or formulaic pop songs on demand. That's a *really* big deal. "

You can already spit out formulaic pop songs using something as simple as LZW compression as the core predictor algorithm. In fact, you can barf out pretty convincing classical music this way. Video games already do this. This trick has existed since the 80s at least.
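
A rough sketch of what a compression-style predictor can look like, in case the idea is unfamiliar. This uses an LZ78-style phrase parse (the family LZW belongs to) to count which symbol follows which learned phrase, then samples from those counts. It is an assumption about the general approach, not the method any particular game or product uses; the names `build_lz_model` and `generate` and the toy note sequence are made up for illustration.

```python
import random
from collections import defaultdict


def build_lz_model(sequence):
    """Parse the sequence into LZ78-style phrases and count, for every
    phrase prefix (context), how often each symbol followed it."""
    counts = defaultdict(lambda: defaultdict(int))
    phrases = {()}          # the empty phrase is always known
    context = ()
    for symbol in sequence:
        counts[context][symbol] += 1
        extended = context + (symbol,)
        if extended in phrases:
            context = extended      # keep extending a known phrase
        else:
            phrases.add(extended)   # learn a new phrase, start over
            context = ()
    return counts


def generate(counts, length, rng):
    """Sample a new sequence by walking the learned contexts."""
    if () not in counts:
        raise ValueError("model is empty")
    output = []
    context = ()
    for _ in range(length):
        options = counts[context]
        symbols = sorted(options)
        weights = [options[s] for s in symbols]
        symbol = rng.choices(symbols, weights=weights, k=1)[0]
        output.append(symbol)
        context = context + (symbol,)
        if context not in counts:   # nothing was ever seen after this phrase
            context = ()
    return output


# Toy "melody" as note names; a real input would be far longer.
melody = list("CDECDEGFEDCDECDEG")
model = build_lz_model(melody)
print(" ".join(generate(model, 16, random.Random(0))))
```

The output only ever recombines fragments seen in the training sequence, which is roughly why this kind of trick suits formulaic material.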

You can't make dweeb learning barf out mediocre sitcom episodes, and won't be able to in 10 years. You should know better than to say something this preposterous.

I have a theory that the strength of claims about DL is inversely proportional to actual knowledge of the technology. The people who actually invented it (Hinton, LeCun, Bottou) are most emphatically not saying crazy things like this, even though there are decent motivations for them to hype it up. In fact, Yann has gone on the record that people should chill out with the crazy claims, rightly drawing the parallel with the first AI winter.

DL and "AI" are presently looking a lot like "nanotech" did in the 90s and early 00s. Where's my self replicating nanobots?

"Learning, n. The kind of ignorance distinguishing the studious."

pj


Total Posts: 3317
Joined: Jun 2004
 
Posted: 2016-10-26 10:19
> Where's my self replicating nanobots?
Running around the house.
Annoying daddy nanobot.

I saw a dead fish on the pavement and thought 'what did you expect? There's no water 'round here stupid, shoulda stayed where it was wet.'