Why Declarative (Lakeflow) Pipelines Are the Future of Spark
get on board
Hey, it’s me … Dan; the one churning out an endless stream of crap, some people would say. I hope you’ve been enjoying the Podcast lately. I’ve got three more exciting interviews releasing soon, and a month of interviews booked. The future is bright.
If you are a CTO, Data Leader, or decision maker … I do offer consulting services, so message me here or on LinkedIn. Got some data devils you need tamed? I can help.
Abstraction layered on abstraction, upon another abstraction. Isn’t that the age-old story? One could argue that’s how modern programming languages have been built. There will always be a cadre of zealots in the corner decrying the death of some low-level “thing,” refusing to release their death grip on some bespoke technological and theological oracle made of bits and bytes.
Arguments can be made on both sides.
Who needs the low level when the high level will do the job (better)?
Knowing the low level makes you better at the rest.
Today, we march to the beating drum of yet another Apache Spark self-reinvention. I, for one, do not think it’s a bad thing when the powers that be recognize who their users are and bring them sweet and delicious offerings to make life easier.
One could say that this is the very thing that makes one feature die on the vine, and another flower.
Weep you may, but the age of Scala and RDDs has fallen; Python and now Declarative Pipelines rise.
Spark Declarative Pipelines: what and why
Today, I will do my best to unpack Spark Declarative Pipelines: what they are, and why they have been born into this troubled world. Are they just another mirage of empty promises to muddy the already dirty waters, or a new paradigm that brings data-pipeline glory to the teeming Cursor script kiddies?
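Before we dig in, here is what “declarative” actually looks like on the page. This is a minimal sketch, assuming the pyspark.pipelines decorator API from the Spark 4.1 / Lakeflow preview; the table names, paths, and columns are made up for illustration, and exact decorator signatures may shift between releases.

```python
# A minimal sketch of a Spark Declarative Pipeline, assuming the
# pyspark.pipelines API previewed for Spark 4.1 / Lakeflow.
# Dataset names, paths, and columns are illustrative, not real.
from pyspark import pipelines as dp
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# In this sketch we assume the pipeline runner provides an active
# session when it evaluates this file; we just grab it.
spark = SparkSession.getActiveSession()

@dp.materialized_view
def raw_orders():
    # Declare WHAT the dataset is; the engine decides how and when
    # to (re)build it.
    return spark.read.format("json").load("/data/orders/")

@dp.materialized_view
def daily_revenue():
    # Reading raw_orders by name is enough for the engine to infer
    # the dependency graph -- no hand-wired DAG, no orchestration code.
    return (
        spark.read.table("raw_orders")
        .groupBy(F.to_date("order_ts").alias("order_date"))
        .agg(F.sum("amount").alias("revenue"))
    )
```

Notice what’s missing: you never call these functions, never order them, never write a scheduler. A runner (the open-source proposal ships a spark-pipelines CLI for this) resolves the dependency graph and executes it. That gap between “describe the tables” and “babysit the jobs” is the whole pitch, and it’s what the rest of this piece unpacks.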