What does your process monitoring look like?

contango_and_cash


Total Posts: 124
Joined: Sep 2015
 
Posted: 2020-09-09 13:24
I, like most of you, have a lot of crap that runs every day.

Right now:
If something fails, I send a message on Telegram to an "error" channel that tells me I need to go look at stuff. The tasks range from trading to automation to ETL. My code is all Python.
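
For anyone wanting a concrete picture of this kind of setup, here is a minimal sketch, assuming a bot created through Telegram's Bot API and its sendMessage method. The token, chat id, and the names send_telegram and alert_on_failure are hypothetical placeholders, not the poster's actual code.

```python
import json
import urllib.parse
import urllib.request
from functools import wraps

# Hypothetical placeholders: substitute your own bot token and chat id.
BOT_TOKEN = "123456:REPLACE_ME"
ERROR_CHAT_ID = "-1000000000000"


def send_telegram(text, token=BOT_TOKEN, chat_id=ERROR_CHAT_ID):
    """POST a message to a chat via the Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    with urllib.request.urlopen(url, data=data, timeout=10) as resp:
        return json.load(resp)


def alert_on_failure(task_name, notify=send_telegram):
    """Decorator: report any exception from the wrapped task, then re-raise."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                notify(f"[{task_name}] failed: {type(exc).__name__}: {exc}")
                raise  # keep the original failure visible to the scheduler
        return wrapper
    return decorator
```

Decorating each entry point with @alert_on_failure("nightly-etl") keeps the alerting out of the task body. Passing a different notify callable makes it easy to swap Telegram for another channel.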

None of this is perfect. Curious to see what others have done.

EspressoLover


Total Posts: 458
Joined: Jan 2015
 
Posted: 2020-09-09 14:12
Went through a similar process to yours; the details of my journey are in the post linked below. I'm still using pretty much the same setup I ended up with in that post. If I could distill one piece of advice: pay the $10 a month for PagerDuty. It's way more reliable than rolling your own alerting service.

https://www.nuclearphynance.com/Show%20Post.aspx?PostIDKey=188059
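
As a rough illustration of what handing alerts to PagerDuty can look like from Python, here is a sketch assuming the PagerDuty Events API v2 "enqueue" endpoint and an integration (routing) key from a service you have configured. The function names are hypothetical, and this is not taken from the linked post.

```python
import json
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_trigger_event(routing_key, summary, source, severity="error"):
    """Build the JSON body for an Events API v2 'trigger' event."""
    return {
        "routing_key": routing_key,   # integration key of the target service
        "event_action": "trigger",
        "payload": {
            "summary": summary,       # short human-readable description
            "source": source,         # e.g. hostname or job name
            "severity": severity,     # critical, error, warning or info
        },
    }


def trigger_incident(routing_key, summary, source, severity="error"):
    """Send the event; PagerDuty then handles paging and escalation."""
    event = build_trigger_event(routing_key, summary, source, severity)
    req = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```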

Good questions outrank easy answers. -Paul Samuelson

contango_and_cash


Total Posts: 124
Joined: Sep 2015
 
Posted: 2020-09-11 13:48
Thanks EL

quantie


Total Posts: 903
Joined: Jun 2004
 
Posted: 2020-09-12 22:54
Well, I have all my processes post to Slack and Pushbullet.

But I have been looking at this more closely, and this one looks very interesting: Prefect.

Also, Drake appears to be a good way to go. Check these other tools: pipeline-tools

kloc


Total Posts: 40
Joined: May 2017
 
Posted: 2020-09-13 06:25
@quantie: If you've used Apache Airflow, how would you compare it with Prefect?

(I had a look at https://medium.com/the-prefect-blog/why-not-airflow-4cfa423299c4, but it is unfortunately just a peppy PR piece from the Prefect people... and a bit of a turnoff TBH.)

quantie


Total Posts: 903
Joined: Jun 2004
 
Posted: 2020-09-16 01:20
I have not looked at that but will task my intern to look into it.

prikolno


Total Posts: 74
Joined: Jul 2018
 
Posted: 2020-09-17 00:10
I've used Airflow.

I think the only real problem with Airflow is that it requires substantial DIY setup and has a large set of dependencies. Most of the basic requirements you'd need in an ETL scheduler don't just work out of the box and can't be configured from the UI:
- HA of the master node
- Log database (e.g. postgres)
- LDAP
- Version control of pipelines
- Docker configs
- Mesos/k8s/Celery for parallelization

The last item alone carries significant complexity. I don't think the setup of all these can be delegated to an intern.

After setup, the one thing that will kill the project is a lack of team buy-in. It will really only work if there's a critical mass of people using it and treating Airflow DAGs as a first-class unit of work. My classmate ran the team that wrote Airflow, and at Airbnb, they make quite a deliberate effort to write everything as an Airflow DAG. I tried helping a 1,000+ person company get into it and failed, because no one wanted to use it in spite of it already being set up and there being plenty of know-how, financial resources and management encouragement to do so.
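
To make "a DAG as the unit of work" concrete, here is a minimal sketch using only the Python standard library (graphlib, Python 3.9+). It is not Airflow's API; run_dag and the task names in the usage note are hypothetical, and the point is only that each job is declared as named tasks plus their dependencies rather than as one long script.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(tasks, deps):
    """Run zero-argument callables in dependency order.

    tasks: {name: callable}
    deps:  {name: set of task names that must finish first}
    """
    # Include tasks with no declared dependencies so nothing is skipped.
    graph = {name: deps.get(name, set()) for name in tasks}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        results[name] = tasks[name]()
    return results
```

For example, with tasks named "extract", "transform" and "load" and deps = {"transform": {"extract"}, "load": {"transform"}}, the three callables run in that order. A real scheduler adds retries, logging, and parallel execution on top of the same declaration.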

I think the team buy-in comes down to a matter of style rather than technical factors: what the rest of your infrastructure looks like, plus some philosophical elements. Airflow will do exactly what you tell it to do, but sometimes more structure is better for your team, and Pachyderm will encourage better hygiene for data provenance. If your team doesn't use Python, the adapters lose much of their value and you may find some other DAG metalanguage easier to adopt. If you already have k8s set up, then Argo is a pretty good choice for ETL automation. If you already have Jenkins CI, then it isn't bad to throw in Blue Ocean and use it as a scheduler. There are also a few cloud-native ones that are more suited to firms which are strictly on AWS; if your firm is like that and you want a simpler Airflow setup experience, you might consider Astronomer.

Back to the original question about Prefect: among the newer ETL tools, I think parallel execution is usually the missing or weakest feature. We also considered Prefect for that project and, if I recall correctly, it depended on k8s for this, which was a no-go. Rundeck is good here, as it clusters out of the box. I also remember Prefect was missing a few other critical features listed above - LDAP I think?

kloc


Total Posts: 40
Joined: May 2017
 
Posted: 2020-09-17 04:20
@prikolno: Thank you very much for your superb answer - this is a perfect example of why I love this site.

Note taken about Prefect. I've been able to use Airflow in several environments, including a small company (less than 30 people) with a non-technical crew and they are generally able to monitor the processes and resolve minor hiccups (or escalate the issues when needed). However, Airflow does have some rough edges so I started to wonder whether Prefect would make things smoother...

I guess I'll pass on Prefect and stick with Airflow.

Maggette


Total Posts: 1268
Joined: Jun 2007
 
Posted: 2020-09-17 11:46
Thx prikolno. Very interesting. Also wasn't aware of Astronomer.

Things worked out fine so far using Airflow. But it is "hands-on muddling through" via Jenkins jobs that call Python or Groovy scripts. I am in the position that I parallelise via jobs (hence each node in the DAG is a Spark or Dask job that runs distributed on x workers).
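
A rough sketch of that pattern, where the scheduler only sequences nodes and each node fans its own work out over workers. ThreadPoolExecutor is used here purely as a local stand-in for a Spark or Dask cluster, and distributed_node is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def distributed_node(func, partitions, workers=4):
    """One DAG node whose work is spread over `workers` workers.

    The thread pool is only a local stand-in (an assumption for this
    sketch); in the setup described above, this is where a Spark or
    Dask job would be submitted instead.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input partitions.
        return list(pool.map(func, partitions))
```

With this split, the scheduler itself never needs a parallel executor, which sidesteps the Mesos/k8s/Celery setup mentioned earlier in the thread.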



I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

quantie


Total Posts: 903
Joined: Jun 2004
 
Posted: 2020-09-20 02:30
Thanks prikolno