Posted: 2012-01-18 14:18
just discovered this stuff in bloomberg bizweek...
has anyone here ever participated?

Posted: 2012-01-19 16:25
Yes, some interesting projects. Personally I use the various challenges as learning exercises. I have gained decent proficiency with R and with applying some machine learning to the provided datasets.

The big daddy of the awards is the Heritage Health Prize: about USD 3M if you can predict, with an error below a certain threshold, the days a member will spend in hospital given their claim history. I'm currently decently placed on the leaderboard; however, barring some magical breakthrough, getting into the top 15 will be very hard. Perhaps we should form an NP team.

Posted: 2012-01-19 17:27
I think that is the one I looked at a few weeks ago. As I recall, much of the data you would want for estimating the rates was missing, such as age, weight, smoker/non-smoker, blood pressure and such. To me it appeared to be looking for the best kitchen-sink regression, with repeated trials against the non-visible dataset determining the best fit.


Posted: 2012-01-20 08:34
hmm. i think i saw this a year ago and at the time there was a contest for predicting credit card defaults offering 4k or something, which i equated with slave labor. maybe i'm wrong and that was a different site, because for 3mm they definitely have my attention.

i don't like the idea of these cheap contests in general, as they seem intended to exploit naive people with the needed skills (grad students). i.e. a team of 5 qualified people working for 2 yrs on this could easily approach 3mm in salary. and the losers, who are basically just as good, wasted their time. and i presume the value of such a prediction for insurance companies exceeds 3mm by several orders of magnitude.

finally given the cannot-deny future of insurance in the US, I can only assume the 3mm contest is a prelude to precrime-meets-death-panels. black helicopters in 3 ... 2 ... 1 ...

ok cynic hat off ... i may hypocritically throw it into the ring just to play with some data :)


Posted: 2012-01-20 12:30
Check out :

This guy (one of the milestone prize winners) is very helpful: he has organized the data for input into a SQL database and has even very kindly put up some R code that uses GBMs to predict daysInHospital.

Just to make you aware of the project difficulty: this first-cut code gets you to an RMSLE of about .463. The top guys are around .453. So even after you throw all sorts of ensemble methods at it, you are still a ways away from the .40 RMSLE required to get the 3MM. Effectively this is a 500K contest, as I highly doubt (and the leaders tend to agree) that anyone will get to the level required to win the Grand Prize.
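
For anyone unfamiliar with the metric, here is a minimal Python sketch of RMSLE as it is usually defined: the square root of the mean squared difference between log(1 + predicted) and log(1 + actual). The function names and the sample numbers are made up for illustration; they are not taken from the contest data or from the R code mentioned above.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, a in zip(predicted, actual):
        # log1p(x) = log(1 + x), so zero days in hospital is handled cleanly
        diff = math.log1p(p) - math.log1p(a)
        total += diff * diff
    return math.sqrt(total / len(actual))


def best_constant(actual):
    """Single constant prediction that minimizes RMSLE on `actual`.

    In log space the minimizer of squared error is the mean, so we
    average log(1 + a) and map the result back with expm1.
    """
    mean_log = sum(math.log1p(a) for a in actual) / len(actual)
    return math.expm1(mean_log)


# Hypothetical days-in-hospital values: mostly zeros with a few long stays.
sample_actual = [0, 0, 0, 1, 0, 3, 0, 0, 15, 0]

if __name__ == "__main__":
    constant = best_constant(sample_actual)
    all_zero = [0.0] * len(sample_actual)
    all_const = [constant] * len(sample_actual)
    print("all-zero baseline RMSLE:", rmsle(all_zero, sample_actual))
    print("best-constant RMSLE:   ", rmsle(all_const, sample_actual))
```

Because the error is measured in log space, being off by one day on a zero-day member costs much more than being off by one day on a fifteen-day member, which is part of why small gains in the third decimal are so hard to come by.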

I have spent a decent amount of time on this already and it is definitely a catalyst to learn new things. So even though one might "waste" 500 hours on this, the knowledge gained could be quite useful in the long run.

Posted: 2012-01-20 14:48
The points made by quantz are excellent. This paper has some good proposals on how to improve the mechanism design of sites like kaggle.


Posted: 2012-01-23 19:25
All-pay auctions suck, regardless of whether the participants bid monetary amounts or spend their time.

Posted: 2017-03-10 19:29
A few days late, but Google is acquiring Kaggle

Posted: 2017-03-13 12:53
kaggle timed their pivot to energy-industry consulting (shale gas) perfectly with that industry's collapse. google acquihire i guess? 1000 xgboost models certainly aren't worth anything.


Posted: 2017-03-13 20:40
no inside color but should be either that or friends helping friends

also: lol @ s/script/kernel/
