Forums  > Software  > Time Series database: Reboot. Make a wish.  
     
Page 1 of 1

gillest


Total Posts: 3
Joined: Sep 2017
 
Posted: 2017-09-26 18:23
Hi Guys,

I work at quasardb; I want to be clear about that upfront.

We're currently building what we humbly think (hey, dare to dream!) could be a disruptor in the TS market: storing and processing time series at scale (intraday/historical).

I'll spare you the usual marketing ... but in a nutshell: we have tried to rethink "KDB" in modern terms while putting the "Big Data" stacks through a rolling mill. C++, clustered, low-level programming, scalable (see http://www.quasardb.net/-what-is-nosql-), transactional, simple (see https://doc.quasardb.net/2.1.0/), "focused" ...

The story is best told here: https://blog.quasardb.net/time-series-in-quasardb/

We recently released our Beta 2. There is still more work to complete, but we need "end user" feedback to make sure we're on the right track.

In other words: if we were to build the TS system of your dreams, what would it include from a "computing" angle?

We have built-in basic aggregations (https://doc.quasardb.net/2.1.0/api/time_series.html) that are cross-domain (IoT, finance, ...), but we are now looking at the ones every trader/quant will need to build what they need (1mn bars? ...).
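For concreteness, here is a sketch of the kind of aggregation traders mean by "1mn bar", done client-side with pandas. The tick data and column names are made up for illustration; quasardb's own aggregation API may expose this differently:

```python
# Hypothetical example: build 1-minute OHLC bars from raw ticks with pandas.
import pandas as pd

# A handful of invented ticks spanning two minutes of trading.
ticks = pd.DataFrame(
    {"price": [100.0, 100.5, 99.8, 100.2, 101.0, 100.7]},
    index=pd.to_datetime(
        [
            "2017-09-26 09:30:05",
            "2017-09-26 09:30:40",
            "2017-09-26 09:30:55",
            "2017-09-26 09:31:10",
            "2017-09-26 09:31:30",
            "2017-09-26 09:31:50",
        ]
    ),
)

# One bar per minute: open/high/low/close of the prices in that window.
bars = ticks["price"].resample("1min").ohlc()
print(bars)
```

A server-side "verb" for this would push the same per-window open/high/low/close computation down to where the data lives instead of shipping raw ticks to the client.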

Our idea is NOT to build another q, but rather to implement a grammar that lets any trader/quant build their model in Python, C++, .NET, Java, Node.js, R, Julia, ... while still leveraging "verbs" that execute on the system (natively x86-optimised and inherently distributed).

We also took the design principle that what goes in needs to get out, i.e. make sense of the data.
The system should manage the data lifecycle (number of concurrent versions, replicas, data bucketing, ...) transparently, and the end user should not have to manage that lifecycle when running a query. Want every datapoint for IBM since 1998? Up to you! Here they are, make sense of them ... or precompute some metrics (mean, average, ...) on the infrastructure.

Our client-side APIs are (and will stay) open source (see https://github.com/bureau14), while our core is not. We know this is not "fashionable", but we want to build a good company: fair, yet still making money to keep attracting talent and getting cool stuff done in due time :-)

I know this is a pretty open-ended question, and I'll carefully look at ALL the feedback you are kind enough to give us. I'm also happy to Skype to make it easy.

We're targeting the end of the year to have our system ready to be tested with STAC-M3, so we're implementing the basic set of functions we need for that on top of what's already in.

So, if you had a dream: what would you like this TS storage and processing engine to do?

Thanks !

Gilles



jslade


Total Posts: 1093
Joined: Feb 2007
 
Posted: 2017-09-27 20:32
The basic verbs given by Kx are pretty useful. Asof joins are important. Making your "group by" queries performant (almost nobody but Kx does this right) is important. By-day table sharding is important (this is really just syntax sugar in your parser).
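For readers unfamiliar with the term, an as-of join matches each row of one table with the most recent row at or before its timestamp in another. A minimal pandas illustration with invented sample data (kdb+'s `aj` works on the same principle; any quasardb equivalent is hypothetical here):

```python
# Hypothetical example: join each trade to the last quote at or before it.
import pandas as pd

# Both tables must be sorted on the join key.
quotes = pd.DataFrame(
    {
        "time": pd.to_datetime(["09:30:00", "09:30:02", "09:30:05"]),
        "bid": [99.9, 100.0, 100.1],
    }
)
trades = pd.DataFrame(
    {
        "time": pd.to_datetime(["09:30:01", "09:30:06"]),
        "price": [100.0, 100.2],
    }
)

# For each trade, pick the last quote whose timestamp <= the trade's.
joined = pd.merge_asof(trades, quotes, on="time")
print(joined)
```

The trade at 09:30:01 picks up the 09:30:00 quote; the one at 09:30:06 picks up the 09:30:05 quote.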

Scalable: probably not as important as you think. You only really get speedups when your cluster/parallel architecture exactly matches your query structure, or when you have petabyte problems (as opposed to gigabyte queries on a petabyte dataset) and thousands of machines, which is probably not what you're targeting.

Transactions/locking and that type of shit Postgres nerds worry about: usually not real important.


"Learning, n. The kind of ignorance distinguishing the studious."

Maggette


Total Posts: 964
Joined: Jun 2007
 
Posted: 2017-09-27 22:59
Nothing that I want... and after reading your blog post, pretty sure it's not what you want either... but if you want it to be commercially successful, you probably have to make sure that your db integrates well with the Apache stack. Most importantly, make sure you can serve as a sink and/or source for the oh-so-modern Kafka/Flink/Storm-based kappa architectures.


I came here and saw you and your people smiling, and told myself: Maggette, screw the small talk, better let your fists do the talking...

gillest


Total Posts: 3
Joined: Sep 2017
 
Posted: 2017-09-28 08:50
Understood.
We now have an interface for Spark (read/write), with more to come.
As for Kafka: it's on the roadmap, as is Solace. We are working with 3 different feed providers to make sure we have the market connectivity to ingest ticks.

Thanks for the feedback!
Gilles

gillest


Total Posts: 3
Joined: Sep 2017
 
Posted: 2017-09-28 08:55
To jslade:

Asof joins ... yep, on the list. We are doing some prep work now to get them done by the end of the year.

The way we bucket the data is not by day, as of today. We basically do transparent sharding with automatic rebalancing when adding/removing nodes.

As for scalability, we see it from two angles: the capacity to store (whatever the size and granularity) and the capacity to process (i.e. run aggregations on the data). We are aware of the data topology and sharding, hence the ability to run the right queries where the data resides.
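One common way to get transparent sharding with cheap rebalancing is consistent hashing. The sketch below is a generic illustration of that idea, not quasardb's actual placement algorithm; node and key names are invented:

```python
# Illustrative consistent-hash ring: adding a node moves only a fraction of keys.
import bisect
import hashlib


def _h(s: str) -> int:
    # Stable hash of a string onto the ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, vnodes=64):
        # Each node owns several "virtual" points on the ring to smooth load.
        self._points = sorted(
            (_h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or after hash(key).
        i = bisect.bisect(self._hashes, _h(key)) % len(self._hashes)
        return self._points[i][1]


keys = [f"series-{i}" for i in range(1000)]
before = Ring(["node-a", "node-b", "node-c"])
after = Ring(["node-a", "node-b", "node-c", "node-d"])

# Only keys that fall into node-d's new arcs change owner; the rest stay put.
moved = sum(before.node_for(k) != after.node_for(k) for k in keys)
print(f"{moved} of {len(keys)} keys moved")
```

With a plain `hash(key) % n_nodes` scheme, almost every key would move when a node joins; here only roughly a quarter do, which is what makes automatic rebalancing affordable.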

I'll ask Edouard (our CTO) to add relevant details.

Thanks for the feedback

Gilles