Forums  > Software  > OneTick  
     
Page 2 of 2

Scotty


Total Posts: 721
Joined: Jun 2004
 
Posted: 2016-10-12 02:27
The other one worth looking at is Arctic from the Man AHL guys. Particularly for Python shops.

“Whatever you do, or dream you can, begin it. Boldness has genius and power and magic in it.”

Maggette


Total Posts: 943
Joined: Jun 2007
 
Posted: 2016-10-12 11:50
"I'd forgotten about this discussion. It seems bizarre now that kdb and onetick were "cutting edge" tools when this thread started - kdb had already been around for a long time too. Amazon certainly boosted the speed of change in a lot of industries. There are still people out there with more money than sense that continue to use kdb to this day, in fact I heard there's a bunch of clowns sticking it in its own cloud with expensive fixed-term contracts for access - the last fart of a dying corpse. Even MongoDB is struggling to keep it together now.

The simple reality is that we now have on-demand access to an infinitely scalable number of CPUs and GPUs coupled with any flavor of database technology you want.

Public cloud seems such an obvious idea in hindsight."

Hi,

I am not sure I agree with everything you said. I think some latency-sensitive work is just not manageable on AWS. I am working for a customer right now who has decided to "go cloud" with Hadoop, and in my opinion things are falling apart.

kdb+ is too expensive, no doubt. But I think the major reason it is dying is that it is absolutely ugly to develop applications with or in it.

I am excited about kerf (by an NP member).

Still, I do use AWS heavily. It is so easy to roll out your MongoDB or Hadoop stuff there, prototype in Spark, and explore data in Spark and Hive.

I think the cloud-based stuff is here to stay. But I do see niches where you are better off with a more classical solution.

@scotty:
really really nice. Didn't know about that one.

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

brainyoga


Total Posts: 164
Joined: Jun 2004
 
Posted: 2016-12-15 23:12
kdb+ still beats the pants off the competition, which is why it sells to those who desire performance.

Externally audited benchmarks, lead held by Kx

stac

open source competitors
and the more amusing hadoop

And although they don't do GPU out of the box yet, kdb+ integrates readily with KNL, beating benchmarks from GPU databases:
intel phi

Just look at how many other databases have risen and fallen in the 23-year life span of Kx so far. The share price of First Derivatives, which owns Kx, shows the demand for this tech is still very strong and expanding. It sounds natural for them, like others, to offer cloud solutions.

size does matter

BuzzMeeks


Total Posts: 9
Joined: Jul 2009
 
Posted: 2016-12-16 00:03
Nah.

Kdb+ has been the pre-eminent fully-baked solution, I doubt many here dispute that, but the writing certainly seems to be on the wall. Ridiculously expensive proprietary software is not to contemporary tastes - and the market that kdb+ was traditionally sold into back in its heyday has contracted massively. FD is, and always will be, a consultancy/PS shop first and foremost - they had been failing at software for years before they acquired kx (way too late). They have also tried kdb in the cloud/tick data in the cloud multiple times in the past with zero success (cloud meaning private cloud).

I really don't see kdb+ being relevant in 2020 unless FD can quickly get past the "innovator's dilemma" and accept cannibalisation of current revenues to be reborn as a public cloud solution.

That's probably not going to happen though so kdb will probably fade in the next few years. FD will probably still exist as a niche software consultancy for some time albeit with much reduced revenue and headcount.

jslade


Total Posts: 1070
Joined: Feb 2007
 
Posted: 2016-12-16 04:14
Ranty McJsladepants:

Something like Kx is always going to be faster than Spark. Kx effectively maps vector memory images to disk. If you ever looked at the preposterous conga dance that even the more modern Spark "columnar" stuff like Parquet files does to get the data to the CPU, well, Spark is an improvement on Hadoop, but that doesn't mean they know what they're doing. Spark appears to have been designed by a bunch of glue sniffing Berkeley graduate students. Ever see the code recent graduates write? It usually sucks butt. I don't know why people think it is different because it is publicly available and "peer reviewed." "Better than Hadoop" (which Spark absolutely is) is an extremely low bar.
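To make the "vector memory images mapped to disk" idea concrete, here is a minimal Python sketch of the general technique: a column is written as the raw bytes of an array of doubles, then mapped back and used as a vector with no parsing or decoding step. This is only an illustration using the standard library; it is not Kx's actual implementation, and the file name and helper functions are hypothetical.

```python
import mmap
import os
import tempfile
from array import array


def write_column(path, values):
    """Store a column as the raw in-memory image of an array of doubles."""
    with open(path, "wb") as f:
        array("d", values).tofile(f)


def map_column(path):
    """Map the file and view its bytes as doubles, without parsing or copying."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping outlives the file object.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return memoryview(mm).cast("d")


tmpdir = tempfile.mkdtemp()
price_path = os.path.join(tmpdir, "price.f64")  # hypothetical file name

write_column(price_path, [100.0, 100.5, 101.0, 100.75])  # made-up sample data
prices = map_column(price_path)

# The mapped bytes are already the vector; the OS pages them in on access.
mean_price = sum(prices) / len(prices)
```

Contrast this with a format that has to be decompressed and decoded into objects before the CPU can touch a single value.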

That said, it is a shrinking market. And at some point someone in the open source community will figure out the 1960s era design tricks that make Kx (or other adult TSDB; OneTick counts, sorta) so fast. Or HP will succeed with memristor computers making the whole concept irrelevant.


"Learning, n. The kind of ignorance distinguishing the studious."

BuzzMeeks


Total Posts: 9
Joined: Jul 2009
 
Posted: 2016-12-16 08:00
Not disputing Kx's speed or capabilities as a standalone solution - as I said previously, it's always been the best solution for tick data.

But I think it is very clear that almost everything will be run in the public cloud in the not-so-distant future, including trading engines (consider that CME announced it's building a matching engine in AWS the same day it sold its Aurora data centre last summer, and AWS announced FPGA EC2 instances a couple of weeks ago - just two indications of where we're headed). And, with that in mind, I think First Derivatives, as a listed plc, will be unable to adapt its commercial models to remain relevant with kdb+ as an on-demand service.

Hadoop (Spark) is aimed at a more generic enterprise market and will likely never fill kdb's shoes, but as electronic/digital markets shift into the cloud someone will fill that gap if FD don't.

Maggette


Total Posts: 943
Joined: Jun 2007
 
Posted: 2016-12-16 22:55
Hey everybody,

I do use Spark in production and even contributed to the project (minor contributions though). Other than the people on my team, I personally do not know many people in Germany who use it in production. And I totally understand why. I share (once again) jslade's view on this.

A few points
1) Spark is a moving target. Spark 2.0 has changed a lot from Spark 1.x... IMHO the DataFrame API was a bad idea from the beginning. The syntax is ugly and it is very limited (you can't use a proper datetime format in it, and so on). They moved in the right direction with Datasets. That said: you try to understand the inner workings of Spark (as I did), actually get comfortable with your crude understanding of the execution engine, and get quite fluent in the syntax of RDDs and SchemaRDDs (the former name for DataFrames)... and then everything changes (I am exaggerating a little here, but you get my point).

2) In my opinion they try too much. They have an API for Python (okay, I understand...), Scala (makes sense, Spark is written in Scala), Java (comes kind of naturally) and R (here it starts to get weird). They have a streaming framework (...just don't... use Flink!! or Storm), a graph processing engine (I love the idea and even used it) and of course a machine learning library (more on that later). I consider Spark SQL quite mature and a natural extension to the core library.

I don't think you can properly manage and develop a framework that pretends to do everything!

3) First let me say that I like the Spark community very much. There are many smart guys and gals working on the project. Way smarter than me. And I know that my next point is an unfair generalization: but I do think that the main contributors don't actually USE Spark in critical commercial applications! You can see that in the many missing features that are absolutely obvious to anybody who has implemented a machine learning application that does something other than "hello world" examples on Wiki data. For example, for a long time there wasn't a nice and easy way to save your MLlib models!!! Jesus, even SPSS can do this!

4) Sometimes it isn't that stable: for example, the framework fails to serialize something, the whole thing crashes, and you have to dig deep to find the problem and work around it. I do not feel comfortable committing myself to deadlines when using Spark. I did so and came away alive, but it wasn't always a great experience.

1), 2), 3) and 4) are bugging a couple of big potential users in Germany (non-financial). I talked to a leading software architect at a DAX company which spends a lot of money on fancy machine learning and big data stuff, and they don't use Spark in production any more. Just MapReduce. You have more control over that.

That's my point of view. Of course I might be wrong and there are a lot of people using it... and I just didn't run into them.

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

svisstack


Total Posts: 300
Joined: Feb 2014
 
Posted: 2016-12-20 18:13
What is wrong with Cassandra or InfluxDB?

http://cassandra.apache.org/
https://www.influxdata.com/time-series-platform/influxdb/

First one is legit.

Time well wasted.

jslade


Total Posts: 1070
Joined: Feb 2007
 
Posted: 2016-12-21 09:01
Cassandra: the cockroaches go in, but they never come out. Great write speed, assuming you're a JVM wizard. Forget about doing something useful with your stored data. How many pages of code to calculate a slow moving average?

Influx, I have no idea what it is used for, beyond burning VC and making charts for data centers.

Neither is as good as Spark, let alone Kx.
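For scale, the computation being asked about is tiny once the data is available as an in-memory vector. A minimal Python sketch of a trailing simple moving average, with a hypothetical function name and made-up prices; it says nothing about how Cassandra, Spark or kdb+ would express the same thing.

```python
def moving_average(values, window):
    """Trailing simple moving average; the first window-1 points use a shorter window."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just slid out of the window.
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


prices = [10.0, 11.0, 12.0, 13.0, 14.0]  # made-up sample data
smoothed = moving_average(prices, 3)
```

The running sum keeps the work at one addition and one subtraction per point, regardless of the window length.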

"Learning, n. The kind of ignorance distinguishing the studious."

Patrik
Founding Member

Total Posts: 1333
Joined: Mar 2004
 
Posted: 2016-12-21 13:03
InfluxDB may have improved, but when I tried it last year it was pretty stinky and unreliable fwiw.

Capital Structure Demolition LLC Radiation

xbsd


Total Posts: 1
Joined: Jul 2017
 
Posted: 2017-07-19 19:38
Agreed. I have been using kdb+ for 8+ years and have still not found anything that comes close. The new stuff (Redshift, Cassandra, Spark, MongoDB, etc.) is all fine, but even today a single-node kdb+ instance can outperform it (by multiples). I tried (most of) the other DBs from HP, Actian, IBM, etc., as well as other DB types (KV pairs, SSTables, columnar, in-memory, etc.), and the overhead is monumental compared to the simplicity of kdb+. Plus, you don't get the performance anyhow after all that effort. If you have come across any that are worth testing, please let me know.

Maggette


Total Posts: 943
Joined: Jun 2007
 
Posted: 2017-07-19 23:36
I doubt that the "new stuff" was created to outperform kdb+.

We recently worked on a "big data" problem that involved transforming some TB of deeply nested JSON files. Each transformation only touched one JSON document per file (no joining or grouping).
We were competing against an implementation in the Hadoop universe, built by consultants from one of the bigger consulting companies... and kicked the crap out of them using a shell script that started several Cython scripts in parallel.
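The reason this kind of job parallelizes so easily is that every file can be transformed on its own, with no shared state between workers. A minimal sketch of that pattern in plain Python (not Cython, and not the actual scripts from the project): the `flatten` transformation, the function names and the `(src, dst)` path pairs are all hypothetical stand-ins.

```python
import json
from concurrent.futures import ProcessPoolExecutor


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level with dotted keys (hypothetical transformation)."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def transform_file(src, dst):
    """Read one JSON document, transform it, and write the result; no shared state."""
    with open(src) as f:
        doc = json.load(f)
    with open(dst, "w") as f:
        json.dump(flatten(doc), f)
    return dst


def transform_all(pairs, workers=4):
    """Run transform_file over (src, dst) pairs in separate worker processes."""
    if not pairs:
        return []
    srcs, dsts = zip(*pairs)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transform_file, srcs, dsts))
```

When run as a script, `transform_all` should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.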

Still, I admit I am a Spark fan and an even bigger Flink fan. I just recently used Spark GraphX to generate features and was impressed by how flexible that stuff is.

I would love to add some kdb+-like tool to my skill set. I also like to think in vectors.
But in Germany I am not aware of a single company using it. I once tried to place kerf in a DAX company, but the whole project was descoped. Does anybody outside of finance use kdb+?

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

jslade


Total Posts: 1070
Joined: Feb 2007
 
Posted: 2017-08-02 04:57
KDB occasionally gets used in the valley when stuff like Redshift or Vertica falls down, or the CTO knows about KDB. It used to get used in bioinformatics (the old K3 list has some users in Pharma). They're trying for the IoT marketplace, but IoT doesn't really exist yet in a useful way. It should probably be used for smart grid stuff if it isn't yet.

"Learning, n. The kind of ignorance distinguishing the studious."

Maggette


Total Posts: 943
Joined: Jun 2007
 
Posted: 2017-08-03 00:51
Thx

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

svisstack


Total Posts: 300
Joined: Feb 2014
 
Posted: 2017-08-16 18:58
@xbsd: do you have any papers about these performance tests, or maybe some information about which queries were tested?

In general, if you are reading data, performance depends only on the database's disk seek time and read speed, so I do not think kdb+ can deliver any performance improvement without adding other things like smart memory caching or some kind of deduplication... or maybe you are talking about some aggregations that run faster on kdb+ and don't work in other databases. That would make sense, and I would like to read about it, but I don't have a starting point.

Time well wasted.

jslade


Total Posts: 1070
Joined: Feb 2007
 
Posted: 2017-08-17 02:24
There is no place to read about this, because the things that make Kx go are known as "trade secrets." I happen to know more or less how they do it: nobody in the open source world has a clue other than the J guys.

It's astounding and hilarious how clueless "coders" are about very basic computer science. It's nothing to do with "smart memory caching." It's all common sense.

Anyway, Art's a nice guy, even if the language is pretty hairy. Give him your money.

"Learning, n. The kind of ignorance distinguishing the studious."