Mage rocks the best of both worlds, IMHO. Airflow is great, Mage is better. Anyone who has ever wrestled an Airflow DAG into production will find Mage an absolute breeze in comparison. Mage can integrate with pretty much everything; they just don't do the 'logo collection' part, which, oddly, seems to please a lot of folks. The war of the open data formats has really only just begun, and the ability to transform data and then kick off ML and/or 'AI' jobs while you choose your own method and _place_ to do that compute is 'the way' (no helmet required).
I've been using Mage for 6 months and had to add something with the newest pandas, which then needed the latest SQLAlchemy, which then broke everything.
I did a greenfield project last year and got down to these 4, and finally to Dagster vs Airflow. As you said, with Airflow being that 500 lb gorilla, you really need a strong use case to justify going with anything else. Even though I enjoyed Dagster's integration with dbt, Airflow's did not let me down and carried lower risk, given that we would self-host whichever one we chose.
Just gonna leave this here https://www.getorchestra.io/
Thanks for sharing detailed information.
Did you miss the legendary AWS Data Pipeline?
Also, you may recall my article, it resonates well: https://www.junaideffendi.com/p/my-data-pipeline-orchestrators-journey?r=cqjft
Great post, thanks!
Thoughts on how Airflow 3.0 remote execution might change these tradeoffs? Seems like that should let us do real work in native tasks.
🔥🔥🔥🔥