<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Data Engineering Central]]></title><description><![CDATA[Long Live the Data Engineer. No holds barred.]]></description><link>https://dataengineeringcentral.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!pIVQ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F880c179a-d4f4-4f41-a70c-48e557c48f38_256x256.png</url><title>Data Engineering Central</title><link>https://dataengineeringcentral.substack.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 23 Jun 2026 01:01:00 GMT</lastBuildDate><atom:link href="https://dataengineeringcentral.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[dataengineeringdude]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[dataengineeringcentral@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[dataengineeringcentral@substack.com]]></itunes:email><itunes:name><![CDATA[Daniel Beach]]></itunes:name></itunes:owner><itunes:author><![CDATA[Daniel Beach]]></itunes:author><googleplay:owner><![CDATA[dataengineeringcentral@substack.com]]></googleplay:owner><googleplay:email><![CDATA[dataengineeringcentral@substack.com]]></googleplay:email><googleplay:author><![CDATA[Daniel Beach]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Apache Datafusion Comet (Spark Accelerator)]]></title><description><![CDATA[any better than last time?]]></description><link>https://dataengineeringcentral.substack.com/p/apache-datafusion-comet-spark-accelerator</link><guid 
isPermaLink="false">https://dataengineeringcentral.substack.com/p/apache-datafusion-comet-spark-accelerator</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Mon, 22 Jun 2026 13:32:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xBCz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xBCz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xBCz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!xBCz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!xBCz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!xBCz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!xBCz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1271937,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xBCz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!xBCz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!xBCz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!xBCz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d2a9e5f-8f69-48da-9591-bc0febd92414_1280x720.png 
1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As I was winding my way through the various twists and turns of <a href="https://www.linkedin.com/in/danieldavidbeach/">LinkedIn</a> lately, strange it has become the least of the social media evils, by happenstance there popped a post about <a href="https://datafusion.apache.org/comet/#">Comet</a>, that Datafusion baby that&#8217;s been trying to take a corner out of Spark for a few years now.</p><p>I did <a href="https://dataengineeringcentral.substack.com/p/apache-datafusion-comet">poke a stick at Comet back in 2024</a>, but it was a little painful. 
You can read about that experience below if you&#8217;re interested.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dataengineeringcentral.substack.com/p/apache-datafusion-comet" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q19-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png 424w, https://substackcdn.com/image/fetch/$s_!q19-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png 848w, https://substackcdn.com/image/fetch/$s_!q19-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png 1272w, https://substackcdn.com/image/fetch/$s_!q19-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q19-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png" width="1456" height="546" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:546,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131640,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://dataengineeringcentral.substack.com/p/apache-datafusion-comet&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q19-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png 424w, https://substackcdn.com/image/fetch/$s_!q19-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png 848w, https://substackcdn.com/image/fetch/$s_!q19-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png 1272w, https://substackcdn.com/image/fetch/$s_!q19-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8355430b-f037-484a-8f1a-6e16b72e9086_2058x772.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>One would have to assume, with the increase in <a href="https://datafusion.apache.org/comet/#">Comet</a> chatter, that the project has gotten its sea legs under it and is hopefully a slightly better experience this time. I&#8217;m crossing my fingers at this point that I don&#8217;t have to build my own JAR this time.</p><blockquote><p>I hope one can simply download the correct pre-built JAR for the Spark version we are interested in. That would be a step in the right direction.</p></blockquote><p>Truth is, I do no &#8220;pre-work&#8221; when I sit down to write articles like this and explore or revisit tools. I endeavor to simply approach the problem space like the average engineer who interacts with some new &#8220;thing&#8221; for the first time. 
I want to encounter that same experience and relay it to you. <strong>Save you some trouble.</strong></p><p>I do not have particularly high hopes for what I might encounter on this foray into Mordor, based on past experience. Let me explain what I mean by this. There are two types of engineers, and of the tooling they build, in the world &#8230;</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">1. Smart engineers who build cool tools but lack critical 
developer-centric awareness.


2. Smart engineers who build cool tools and understand that 
developer-centric awareness is the key to success.</code></pre></div><p>This has been a problem for the 20 years I&#8217;ve been (attempting) to write code. To this day, it simply doesn&#8217;t change. Some tools and their ardent acolytes say &#8220;<em>Look, we are faster, come unto us.</em>&#8221; Yet, they never seem to crack the nut and become mainstream, and can&#8217;t figure out why.</p><blockquote><p>All the while, they continue to waste their resources on negativity, pulling down other <strong>people</strong> and <strong>tools</strong> in an effort to lift themselves up. <em>This has never worked, and never will, long term, in any sphere of life.</em></p></blockquote><p>I&#8217;ve been privy to what goes on behind the curtain, meeting people at Databricks and MotherDuck (DuckDB), who are creating and running the cr&#232;me de la cr&#232;me of data tooling. What makes them successful?</p><ul><li><p>They have the fastest tool &#8230; NO</p></li><li><p>They focus their energies on negativity and others &#8230; NO</p></li></ul><p>They simply have an almost irresistible set of qualities that is impossible to ignore or hide &#8230; or be pulled down in the mud by others.</p><ul><li><p>Positivity in all things</p></li><li><p>Love what they do and other people</p></li><li><p>Put Developer experience BEFORE everything else</p></li><li><p>Serve and give back to the data community</p></li></ul><p>So simple, and yet so hard for some folks to grasp. Why? I don&#8217;t know. 
It&#8217;s probably something inside them; unhappy people will always be bumbling along looking for people to suck down into their misery.</p><div><hr></div><h2>Let&#8217;s start what I&#8217;ve been avoiding</h2><p>Ok, at some point we have to get to what we are after today, namely figuring out if <a href="https://github.com/apache/datafusion-comet">Apache Datafusion Comet</a> has improved over the few years since we last visited it. One thing is for sure. In the Year of our Lord 2026, if someone is using Spark, they better be using Databricks. </p><p>I know there are still a few odd old-school curmudgeons banging around in the dark corners, BUT if you want to be taken seriously in the Spark world today, you must go where the users are. Databricks. Sorry. 
Not Sorry.</p><ul><li><p>Can we get Comet working on Databricks?</p></li></ul><p>First, we need a JAR.</p><p><a href="https://github.com/apache/datafusion-comet">The GitHub page for Comet</a> gives semi-clear (at best) instructions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://github.com/apache/datafusion-comet" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ub8v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png 424w, https://substackcdn.com/image/fetch/$s_!ub8v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png 848w, https://substackcdn.com/image/fetch/$s_!ub8v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png 1272w, https://substackcdn.com/image/fetch/$s_!ub8v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ub8v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png" width="1456" height="774" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:774,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:303149,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://github.com/apache/datafusion-comet&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ub8v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png 424w, https://substackcdn.com/image/fetch/$s_!ub8v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png 848w, https://substackcdn.com/image/fetch/$s_!ub8v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png 1272w, https://substackcdn.com/image/fetch/$s_!ub8v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfdd7640-6f5b-4c4b-b47a-c608d803bc67_1946x1034.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The <a href="https://datafusion.apache.org/comet/user-guide/latest/installation.html">installation guide</a> gives us more info, including what Spark Versions we are allowed to use with Comet.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://datafusion.apache.org/comet/user-guide/latest/installation.html" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5_oZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png 424w, 
https://substackcdn.com/image/fetch/$s_!5_oZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png 848w, https://substackcdn.com/image/fetch/$s_!5_oZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png 1272w, https://substackcdn.com/image/fetch/$s_!5_oZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5_oZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png" width="1456" height="597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:196900,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://datafusion.apache.org/comet/user-guide/latest/installation.html&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!5_oZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png 424w, https://substackcdn.com/image/fetch/$s_!5_oZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png 848w, https://substackcdn.com/image/fetch/$s_!5_oZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png 1272w, https://substackcdn.com/image/fetch/$s_!5_oZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b4793b-db02-419a-a300-7e55eb41f525_1946x798.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>So, we will have to pick a DBR version that matches something in here. Strangely, they say &#8220;<em><strong>Published jar files are only available for released versions, </strong></em>&#8221; which implies they have published, pre-built JARs for supported Spark versions, although there is no link or mention of where to find those JARs &#8230; one could assume Maven maybe?</p><ul><li><p>You would think that linking to the pre-built JARs, and being upfront about where they live for potential dev &#8220;customers&#8221;, would be a fairly obvious thing to do. Or not.</p></li></ul><p>Ok, let&#8217;s go find some JARs (hopefully), before falling back to building our own. Indeed, <a href="https://mvnrepository.com/artifact/org.apache.datafusion">they have a Maven repo</a> with JARs for the different Spark versions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://mvnrepository.com/artifact/org.apache.datafusion" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YmCU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png 424w, https://substackcdn.com/image/fetch/$s_!YmCU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png 848w, 
https://substackcdn.com/image/fetch/$s_!YmCU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png 1272w, https://substackcdn.com/image/fetch/$s_!YmCU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YmCU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png" width="1456" height="806" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:806,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:280979,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://mvnrepository.com/artifact/org.apache.datafusion&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YmCU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png 424w, 
https://substackcdn.com/image/fetch/$s_!YmCU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png 848w, https://substackcdn.com/image/fetch/$s_!YmCU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png 1272w, https://substackcdn.com/image/fetch/$s_!YmCU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17ce52fe-da2d-4a33-9d6d-9e40df76c21e_2134x1182.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We should be able to line this up with a <a href="https://docs.databricks.com/aws/en/release-notes/runtime/">Databricks DBR version</a> that fits our needs eh. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://docs.databricks.com/aws/en/release-notes/runtime/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aqUD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png 424w, https://substackcdn.com/image/fetch/$s_!aqUD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png 848w, https://substackcdn.com/image/fetch/$s_!aqUD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png 1272w, https://substackcdn.com/image/fetch/$s_!aqUD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aqUD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png" width="1456" height="775" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:775,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:340725,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://docs.databricks.com/aws/en/release-notes/runtime/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aqUD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png 424w, https://substackcdn.com/image/fetch/$s_!aqUD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png 848w, https://substackcdn.com/image/fetch/$s_!aqUD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png 1272w, https://substackcdn.com/image/fetch/$s_!aqUD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdffbed-80db-48f9-937e-e69af19b0bd8_2248x1196.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>Let&#8217;s pick Spark 4.0.</p></blockquote><p>So now we&#8217;ve got our <em><strong>comet-common-spark4.0_2.13-0.16.0.jar</strong></em>, which we can align with DBR 17.3 LTS. Next, we just need to gather the Spark configs that go along with this JAR; we can unwind what we need from the Comet examples given.</p><p>Based on what I can see, this is what we will need to add to our Databricks cluster config.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">--jars $COMET_JAR \
    --conf spark.driver.extraClassPath=$COMET_JAR \
    --conf spark.executor.extraClassPath=$COMET_JAR \
    --conf spark.plugins=org.apache.spark.CometPlugin \
    --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
    --conf spark.comet.explainFallback.enabled=true</code></pre></div><p>As well as, possibly:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">--driver-class-path spark/target/comet-spark-spark4.1_2.13-0.17.0-SNAPSHOT.jar
</code></pre></div><p>Also, if you want your brain to go numb, you can read through the <a href="https://datafusion.apache.org/comet/user-guide/latest/configs.html">Comet Configs located here</a>; it&#8217;s the next page that comes up in the Installation Instructions.</p><p>I&#8217;m assuming some of these are important and greatly affect how Comet performs, but God knows how one is simply supposed to scroll through all 1000 of them and know what to do. Your guess is as good as mine.</p><ul><li><p>Might be nice if they put together some sort of overview of the most important ones (things that should not be left at their defaults, or the most commonly used ones), along with a conceptual guide for how to tune them.</p></li></ul><p>To top it all off, <a href="https://datafusion.apache.org/comet/user-guide/latest/compatibility/index.html">there is a very large and complex &#8220;Compatibility&#8221; section</a> detailing what Comet can and cannot do, where it will fall back to Spark, etc. This one will give you a headache as well.</p><div class="pullquote"><p>Between that and the configs, you either need to have an ungodly amount of time on your hands, or be a core maintainer &#8230; or know someone who is &#8230; at that point you can probably get everything figured out.</p></div><p>If you meet someone behind the old oak tree at midnight, throw salt over your left shoulder, and say these words &#8230; &#8220;Ooga Booga&#8221; 3 times over. 
At that point you will know if the production Spark pipeline you have will benefit from Comet and how to tune the configs for optimal performance.</p><div><hr></div><h3>Tear* &#8230; let&#8217;s do this thing.</h3><p>Ok, let&#8217;s switch over to Databricks and get our cluster setup. First, let&#8217;s get this JAR available somewhere Databricks clusters can access it. 
A Volume is probably the easiest way to do this.</p><p>Here I have made a volume at <code>/Volumes/confessions/default/jars</code> and put our JAR up in there.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l836!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l836!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png 424w, https://substackcdn.com/image/fetch/$s_!l836!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png 848w, https://substackcdn.com/image/fetch/$s_!l836!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png 1272w, https://substackcdn.com/image/fetch/$s_!l836!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l836!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png" width="1456" height="736" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:736,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:140105,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l836!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png 424w, https://substackcdn.com/image/fetch/$s_!l836!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png 848w, https://substackcdn.com/image/fetch/$s_!l836!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png 1272w, https://substackcdn.com/image/fetch/$s_!l836!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa346dda8-e17d-49ef-a72b-630c91c54413_1848x934.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Next, let&#8217;s make a little init.sh script that can run on any cluster.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">#!/bin/bash
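# Cluster init script: copy the Comet JAR from the Unity Catalog volume
# into the local JAR directory on each node.
set -euo pipefail  # abort cluster startup if the copy fails

COMET_JAR="/Volumes/confessions/default/jars/comet-common-spark4.0_2.13-0.16.0.jar"
LOCAL_JAR="/databricks/jars/comet-common-spark4.0_2.13-0.16.0.jar"

cp "$COMET_JAR" "$LOCAL_JAR"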
chmod 644 "$LOCAL_JAR"</code></pre></div><p>Next, we can put that bash file into the same Databricks volume.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!b4Qf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!b4Qf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png 424w, https://substackcdn.com/image/fetch/$s_!b4Qf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png 848w, https://substackcdn.com/image/fetch/$s_!b4Qf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png 1272w, https://substackcdn.com/image/fetch/$s_!b4Qf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!b4Qf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png" width="1456" height="98" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:98,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18558,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!b4Qf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png 424w, https://substackcdn.com/image/fetch/$s_!b4Qf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png 848w, https://substackcdn.com/image/fetch/$s_!b4Qf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png 1272w, https://substackcdn.com/image/fetch/$s_!b4Qf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab0d7843-f6c0-42e3-8458-992f9dbcefcc_1694x114.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Ok, so let&#8217;s just use one of those dreaded All-Purpose clusters in Databricks, configure it with the init script, as well as the rest of our configs.</p><div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K4qJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K4qJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png 424w, https://substackcdn.com/image/fetch/$s_!K4qJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png 848w, https://substackcdn.com/image/fetch/$s_!K4qJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png 1272w, https://substackcdn.com/image/fetch/$s_!K4qJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K4qJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png" width="1456" height="655" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:655,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:134254,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K4qJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png 424w, https://substackcdn.com/image/fetch/$s_!K4qJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png 848w, https://substackcdn.com/image/fetch/$s_!K4qJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png 1272w, https://substackcdn.com/image/fetch/$s_!K4qJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb36b534f-ea1c-408d-b7ab-935e40eecf71_2022x910.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As well, we can add in that init script we made.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nnQw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nnQw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png 424w, 
https://substackcdn.com/image/fetch/$s_!nnQw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png 848w, https://substackcdn.com/image/fetch/$s_!nnQw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png 1272w, https://substackcdn.com/image/fetch/$s_!nnQw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nnQw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png" width="1456" height="414" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:414,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:83278,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!nnQw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png 424w, https://substackcdn.com/image/fetch/$s_!nnQw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png 848w, https://substackcdn.com/image/fetch/$s_!nnQw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png 1272w, https://substackcdn.com/image/fetch/$s_!nnQw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F956a8815-c9d8-4c69-b548-d6ae84a6a329_2116x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And finally, our Spark configs for Comet.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t3kD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t3kD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png 424w, https://substackcdn.com/image/fetch/$s_!t3kD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png 848w, https://substackcdn.com/image/fetch/$s_!t3kD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png 1272w, https://substackcdn.com/image/fetch/$s_!t3kD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!t3kD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png" width="1456" height="297" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:297,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:97320,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!t3kD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png 424w, https://substackcdn.com/image/fetch/$s_!t3kD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png 848w, https://substackcdn.com/image/fetch/$s_!t3kD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png 1272w, https://substackcdn.com/image/fetch/$s_!t3kD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa444b40-976d-450c-8070-82a0ca1c4bc5_2086x426.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>I have zero idea if this is going to work, just a shot in the dark. We will find out soon enough, maybe. Ok, next we need some data in s3 to munge around with. 
<a href="https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data#downloadingTheRawTestData">Let&#8217;s use the open-source Backblaze hard drive dataset.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data#downloadingTheRawTestData" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DJBk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png 424w, https://substackcdn.com/image/fetch/$s_!DJBk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png 848w, https://substackcdn.com/image/fetch/$s_!DJBk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png 1272w, https://substackcdn.com/image/fetch/$s_!DJBk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DJBk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png" width="1456" height="370" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:370,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:108365,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data#downloadingTheRawTestData&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DJBk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png 424w, https://substackcdn.com/image/fetch/$s_!DJBk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png 848w, https://substackcdn.com/image/fetch/$s_!DJBk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png 1272w, https://substackcdn.com/image/fetch/$s_!DJBk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20311177-c2af-42d2-8370-ab88ae8a6ca5_1808x460.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We can get all the 2025 data and use that. This is 365 files for about 43GB of CSV data in total. 
Not big data, but then again, most Spark pipelines picking up CSV files today are not munching on Big Data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qQcN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qQcN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png 424w, https://substackcdn.com/image/fetch/$s_!qQcN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png 848w, https://substackcdn.com/image/fetch/$s_!qQcN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png 1272w, https://substackcdn.com/image/fetch/$s_!qQcN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qQcN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png" width="1456" height="434" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:434,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:132864,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/200167715?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qQcN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png 424w, https://substackcdn.com/image/fetch/$s_!qQcN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png 848w, https://substackcdn.com/image/fetch/$s_!qQcN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png 1272w, https://substackcdn.com/image/fetch/$s_!qQcN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee52e8c9-8ef9-46d0-975f-b883c9a3517d_2270x676.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This will do for now. Let&#8217;s write some average Spark code that would represent the average Spark pipeline in production.</p>
      <p>
          <a href="https://dataengineeringcentral.substack.com/p/apache-datafusion-comet-spark-accelerator">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Review of Databricks Data + AI Summit 2026]]></title><description><![CDATA[from someone who wasn't there.]]></description><link>https://dataengineeringcentral.substack.com/p/review-of-databricks-data-ai-summit</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/review-of-databricks-data-ai-summit</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Fri, 19 Jun 2026 18:52:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!LnEr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LnEr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LnEr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!LnEr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!LnEr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LnEr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LnEr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cb329723-cf91-4789-8151-d87197833453_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2127943,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/202658374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LnEr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!LnEr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png 848w, 
https://substackcdn.com/image/fetch/$s_!LnEr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!LnEr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb329723-cf91-4789-8151-d87197833453_1672x941.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Well, it&#8217;s that time of year again. 
My feed has been full of a little bit of this, a little bit of that from the Databricks Data and AI Summit 2026. It&#8217;s kinda hard to miss, although this year, I must say, it seemed a little on the quiet side compared to other years. </p><blockquote><p>Who knows, 2026 has been a busy year in the world and all, ya know: wars, rumors of wars, layoffs, and generally it&#8217;s either the beginning of a new world or the end of our current one.</p></blockquote><p>Like any good hunter of technology, I keep my veteran ear to the ground, ruffling through the fluff and litter, trying to find out what is actually worth your time, and what is just another layer on the AI cake we are all too tired to choke down.</p><p>So I&#8217;m just going to give you my take on the important products released or announced at this year&#8217;s Databricks Data and AI Summit. I mean, I wasn&#8217;t there, so take me with a grain of salt.</p><p>Here&#8217;s my list, and I will then follow it up with my take on each one.</p><ul><li><p>Zach Wilson went after blowing a raspberry at them last year.</p></li><li><p>Lakehouse//RT aka &#8220;Reyden&#8221;</p></li><li><p>LTAP - OLAP + OLTP on a single copy of data in the lake</p></li></ul><p>Yeah, I&#8217;m giving a big ho-hum to the plethora of AI shiny rocks released, like Genie Ontology, Genie ZeroOps, Omnigent, Unity AI Gateway, etc. 
The world of Agents and AI is still in flux, no clear winners, everyone is throwing stuff at the wall to see what sticks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.reddit.com/r/databricks/comments/1u8g2by/databricks_just_dropped_genie_one_ontology_and/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GqD8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png 424w, https://substackcdn.com/image/fetch/$s_!GqD8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png 848w, https://substackcdn.com/image/fetch/$s_!GqD8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png 1272w, https://substackcdn.com/image/fetch/$s_!GqD8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GqD8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png" width="1456" height="617" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:185951,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.reddit.com/r/databricks/comments/1u8g2by/databricks_just_dropped_genie_one_ontology_and/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/202658374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GqD8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png 424w, https://substackcdn.com/image/fetch/$s_!GqD8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png 848w, https://substackcdn.com/image/fetch/$s_!GqD8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png 1272w, https://substackcdn.com/image/fetch/$s_!GqD8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76659e0-a8d2-4949-b981-3b7d49814497_1554x658.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For my part, I&#8217;m a big believer in Databricks; they have a propensity to fundamentally change technology and how we use it. Truly ground breaking technical solutions and products they bring to market. If you can&#8217;t admit that, then check yourself. That being said, I&#8217;m only reviewing the products or announcements that I think meet that criteria. 
<em><strong>Ground breaking and game changing.</strong></em></p><p>If you think I&#8217;m full of it about something, or I missed something big, drop a comment.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/review-of-databricks-data-ai-summit?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/review-of-databricks-data-ai-summit?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/review-of-databricks-data-ai-summit?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2>Zach showed up.</h2><p>Hey, I like a little spice in my life, the hot stuff keeps everyone on their toes. <a href="https://www.linkedin.com/posts/ritchievink_we-tried-to-reproduce-ops-post-and-couldnt-share-7466112259820433408-1DGU/">I regularly get on the wrong side of powerful people</a>, much to my joy. Nothing like free head space to bring me more followers.</p><p>Anywho, <a href="https://www.linkedin.com/posts/eczachly_i-wont-be-at-the-databricks-ai-summit-this-share-7337871019052789761-OaEr/">last year Zach was throwing rocks at Databricks</a>, as you can see below. Good Lord you gotta love seeing behind the curtain sometimes. 
We need our own data Netflix series on this sorta thing ya know?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/posts/eczachly_i-wont-be-at-the-databricks-ai-summit-this-share-7337871019052789761-OaEr/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5BS5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png 424w, https://substackcdn.com/image/fetch/$s_!5BS5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png 848w, https://substackcdn.com/image/fetch/$s_!5BS5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png 1272w, https://substackcdn.com/image/fetch/$s_!5BS5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5BS5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png" width="1104" height="852" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:852,&quot;width&quot;:1104,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:152232,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/posts/eczachly_i-wont-be-at-the-databricks-ai-summit-this-share-7337871019052789761-OaEr/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/202658374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5BS5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png 424w, https://substackcdn.com/image/fetch/$s_!5BS5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png 848w, https://substackcdn.com/image/fetch/$s_!5BS5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png 1272w, https://substackcdn.com/image/fetch/$s_!5BS5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f8781a-26ab-474b-b480-ce89f8cc31bf_1104x852.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 
pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>But Zach attended this year&#8217;s Summit, which I was glad to see. I think Zach is one of the smartest and hardest-working engineers I&#8217;ve ever seen, and I love Databricks because they make the best products. 
I&#8217;m glad they are getting along now.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/posts/eczachly_data-modeling-is-getting-even-more-important-activity-7473150390377742336-xMfB?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAACCGWeMBpzyTjCHWjac5iobhsbGE41GBhto" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dA1W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png 424w, https://substackcdn.com/image/fetch/$s_!dA1W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png 848w, https://substackcdn.com/image/fetch/$s_!dA1W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png 1272w, https://substackcdn.com/image/fetch/$s_!dA1W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dA1W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png" width="1104" height="970" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:970,&quot;width&quot;:1104,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:858143,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/posts/eczachly_data-modeling-is-getting-even-more-important-activity-7473150390377742336-xMfB?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAACCGWeMBpzyTjCHWjac5iobhsbGE41GBhto&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/202658374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dA1W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png 424w, https://substackcdn.com/image/fetch/$s_!dA1W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png 848w, https://substackcdn.com/image/fetch/$s_!dA1W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png 1272w, https://substackcdn.com/image/fetch/$s_!dA1W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba64e47f-8257-4f5a-9788-57d6494eaaa0_1104x970.png 1456w" sizes="100vw" 
loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Anywho, enough on that, next.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h1>Lakehouse//RT: Databricks Brings Real-Time Analytics to the Lakehouse</h1><p>This one caught my poor little ears as soon as the words dripped out onto LinkedIn. 
Again, all this stuff is new so who knows what the future holds or if anything will come of it, but it solves some major pain points we&#8217;ve been dealing with in a new and novel way.</p><ul><li><p>I&#8217;m excited for this one.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.databricks.com/blog/introducing-lakehousert-real-time-performance-unified-lakehouse" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GXQd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png 424w, https://substackcdn.com/image/fetch/$s_!GXQd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png 848w, https://substackcdn.com/image/fetch/$s_!GXQd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png 1272w, https://substackcdn.com/image/fetch/$s_!GXQd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GXQd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png" width="1456" height="381" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:381,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:135575,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.databricks.com/blog/introducing-lakehousert-real-time-performance-unified-lakehouse&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/202658374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GXQd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png 424w, https://substackcdn.com/image/fetch/$s_!GXQd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png 848w, https://substackcdn.com/image/fetch/$s_!GXQd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png 1272w, https://substackcdn.com/image/fetch/$s_!GXQd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f789d8d-8808-419b-aa8e-a0cc4ddf457c_1804x472.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>One of the biggest challenges in modern data architecture has been serving real-time applications from a data lake or Lakehouse. When we adopted technologies like Delta Lake and Apache Iceberg, aka file storage with ACID, we gained some things and lost others. Well, lost isn&#8217;t a good word. <em><strong>Delta Lake + Spark ain&#8217;t no Postgres + Python ya know??</strong></em></p><p>Traditional Lakehouses excel at ETL, analytics, machine learning, and business intelligence, but when teams need dashboards that refresh in milliseconds or applications serving thousands of users simultaneously, they often introduce an entirely separate serving database such as ClickHouse, Pinot, Druid, or Redis. 
</p><blockquote><p><em>That second system brings another copy of the data, another synchronization pipeline, another set of security policies, and another operational burden.</em></p></blockquote><p>It brings complexity and overhead. These technologies have always had a difficult time jibing.</p><p><a href="https://www.databricks.com/blog/introducing-lakehousert-real-time-performance-unified-lakehouse">Lakehouse//RT</a> is Databricks&#8217; attempt to eliminate that architecture altogether. Instead of exporting data into a specialized serving database, <a href="https://www.databricks.com/blog/introducing-lakehousert-real-time-performance-unified-lakehouse">Lakehouse//RT</a> delivers millisecond query performance directly against Delta Lake while keeping the data inside the governed Lakehouse.</p><div><hr></div><h3>What is Lakehouse//RT?</h3><p>Lakehouse//RT is a new real-time compute option designed specifically for workloads that require both <strong>very low latency</strong> and <strong>extremely high concurrency</strong>. 
It is powered by a new execution engine called <strong>Reyden</strong>, which Databricks says was built from the ground up for operational analytics, application serving, observability, dashboards, and AI agents.</p><blockquote><p>I&#8217;m still unsure if we are dealing with two things, or one &#8230; read this closely &#8230;</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4ABQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4ABQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png 424w, https://substackcdn.com/image/fetch/$s_!4ABQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png 848w, https://substackcdn.com/image/fetch/$s_!4ABQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png 1272w, https://substackcdn.com/image/fetch/$s_!4ABQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4ABQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png" width="1456" height="508" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:508,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:186460,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/202658374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!4ABQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png 424w, https://substackcdn.com/image/fetch/$s_!4ABQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png 848w, https://substackcdn.com/image/fetch/$s_!4ABQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png 1272w, https://substackcdn.com/image/fetch/$s_!4ABQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee117bd-42db-4165-b5ce-9f0170ebc60f_1600x558.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p>Powered by Reyden, a &#8220;new engine for realtime workloads&#8221;</p></li><li><p>Lakehouse//RT, &#8220;real time data warehouse&#8221;</p></li></ul><p>Are these two separate things you can use, or do you use them both at the same time? I don&#8217;t know. Time will tell.</p><p>Unlike traditional analytical warehouses that optimize for long-running reports, Reyden focuses on thousands of simultaneous interactive queries while maintaining consistent response times. 
According to Databricks, preview customers have seen:</p><ul><li><p><em>Up to <strong>16&#215; faster</strong> performance than existing real-time serving layers</em></p></li><li><p><em>Query latencies as low as <strong>10 milliseconds</strong></em></p></li><li><p><em>Sub-100 ms performance on much larger datasets</em></p></li><li><p><em>Around <strong>12,000 queries per second</strong> while maintaining low latency</em></p></li></ul><div><hr></div><h3>Eliminating the Serving Layer</h3><p>The biggest architectural shift isn&#8217;t simply that queries are faster; we&#8217;ve been hearing people say &#8220;<em><strong>My thing is faster</strong></em>&#8221; for a decade or more. The difference is that Databricks wants to remove an entire layer of infrastructure.</p><p>Today&#8217;s architecture often looks like this:</p><pre><code><code>Delta Lake
      &#9474;
   ETL / CDC
      &#9474;
ClickHouse / Pinot / Druid / Redis
      &#9474;
 Applications &amp; Dashboards</code></code></pre><p>With Lakehouse//RT, Databricks wants applications to query the Lakehouse directly:</p><pre><code><code>Delta Lake
      &#9474;
Lakehouse//RT
      &#9474;
 Applications
 Dashboards
 AI Agents</code></code></pre><p>That means:</p><ul><li><p>no duplicate storage</p></li><li><p>no synchronization pipelines</p></li><li><p>no additional serving clusters</p></li><li><p>no proprietary storage formats</p></li><li><p>no duplicated governance</p></li></ul><p>Everything continues to use Delta Lake and Unity Catalog. Some will argue that this isn&#8217;t new technology, and that Databricks saw the growth of <a href="https://clickhouse.com/cloud?utm_campaign=google-brand-na-tier-1&amp;utm_source=google&amp;utm_medium=paid-search&amp;utm_source=google.com&amp;utm_medium=paid_search&amp;utm_campaign=21862172345_169330245029&amp;utm_content=764403839947&amp;utm_term=clickhouse_g_c&amp;gad_source=1&amp;gad_campaignid=21862172345&amp;gbraid=0AAAAAocOPCbtFWZH1GWrTR98um9mtF-hk&amp;gclid=Cj0KCQjwrs7RBhDuARIsAIVfBD3ibg5Mjr8pE6n0Ek_rMzC-N0U9B8cnuYQz6yVM79VYMP_DkonGBf8aAjZnEALw_wcB">ClickHouse</a>, for example, and decided they needed to do something about that.</p><p>I agree, but I also think that Databricks providing this sort of extremely fast compute option is groundbreaking inside the Data Platform they provide. You simply CANNOT discount the reduction in complexity and code when you remove an entire serving layer.</p><div><hr></div><h3>Built for Operational Analytics</h3><p>Databricks positions Lakehouse//RT for workloads that have traditionally been difficult to run directly from a data lake:</p><ul><li><p><em>customer-facing SaaS applications</em></p></li><li><p><em>operational dashboards</em></p></li><li><p><em>observability platforms</em></p></li><li><p><em>security analytics</em></p></li><li><p><em>embedded analytics</em></p></li><li><p><em>AI agent retrieval</em></p></li><li><p><em>interactive business intelligence</em></p></li></ul><p>These are all scenarios where thousands of users, or AI agents, may issue queries simultaneously and expect responses in tens of milliseconds rather than seconds. 
It takes something traditionally done outside Databricks, or at a minimum with third-party tools, and brings it back inside the MotherShip.</p><blockquote><p>All the chickens coming home to roost, so to speak.</p></blockquote><div><hr></div><h4>Simpler Operations</h4><p>Lakehouse//RT also introduces a different compute model. Instead of selecting warehouse sizes manually, Databricks automatically determines the appropriate baseline compute. Rather than scaling by duplicating entire warehouse clusters, it incrementally adds or removes nodes as concurrency changes, aiming to improve utilization while reducing costs.</p><h4>Governance Doesn&#8217;t Change</h4><p>One of the more compelling aspects of Lakehouse//RT is that governance remains centralized. Since the data never leaves the Lakehouse:</p><ul><li><p>Unity Catalog permissions stay intact</p></li><li><p>security policies are defined once</p></li><li><p>business logic isn&#8217;t duplicated</p></li><li><p>data lineage remains consistent</p></li></ul><p>Organizations no longer need to recreate governance rules inside a separate serving database. Governance is indeed the next big topic in the world of data and AI. 
Security breaches every week, Claude 100X Engineers releasing buggy code left and right.</p><p>Can&#8217;t be too careful these days.</p><div><hr></div><h3>How It Fits Into Databricks&#8217; Bigger Picture</h3><p>Lakehouse//RT makes even more sense when viewed alongside Databricks&#8217; other announcements this year.</p><ul><li><p><strong>Lakebase</strong> provides a PostgreSQL-compatible operational database.</p></li><li><p><strong>LTAP</strong> unifies transactional and analytical storage on a single copy of data.</p></li><li><p><strong>Lakehouse//RT</strong> provides millisecond analytical serving directly from that same data.</p></li></ul><p>Taken together, these are Databricks&#8217; attempt to collapse what has historically been three separate systems:</p><ul><li><p>OLTP databases</p></li><li><p>analytical warehouses</p></li><li><p>real-time serving databases</p></li></ul><p>into a single platform built around Delta Lake, Unity Catalog, and specialized execution engines.</p><div><hr></div><h3>My Take on RT</h3><p>This feels like one of the most strategically important announcements from this year&#8217;s Data + AI Summit. 
The performance numbers are certainly impressive, but the bigger story is architectural simplification.</p><blockquote><p>For decades, data teams have accepted that customer-facing applications require a separate serving database. Lakehouse//RT challenges that assumption by making the lakehouse itself fast enough to serve those workloads. Will it catch on? I don&#8217;t know. We will all find out in a year, I guess.</p></blockquote><p>The remaining question is whether those benchmark results translate to the wide variety of real-world production environments that rely on ClickHouse, Pinot, Druid, Elasticsearch, and similar systems today. If they do, Lakehouse//RT could remove an entire category of infrastructure from many modern data platforms.</p><div><hr></div><h1>Lake Transactional/Analytical Processing Architecture</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.databricks.com/company/newsroom/press-releases/databricks-launches-ltap-first-lake-transactionalanalytical" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gqCA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png 424w, 
https://substackcdn.com/image/fetch/$s_!gqCA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png 848w, https://substackcdn.com/image/fetch/$s_!gqCA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png 1272w, https://substackcdn.com/image/fetch/$s_!gqCA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gqCA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png" width="1456" height="452" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:452,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:129315,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.databricks.com/company/newsroom/press-releases/databricks-launches-ltap-first-lake-transactionalanalytical&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/202658374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!gqCA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png 424w, https://substackcdn.com/image/fetch/$s_!gqCA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png 848w, https://substackcdn.com/image/fetch/$s_!gqCA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png 1272w, https://substackcdn.com/image/fetch/$s_!gqCA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69085442-3d2a-49d5-8b20-dbba50ec99f1_1804x560.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I&#8217;ve been waiting for this one since I was knee-high to a grasshopper. Jeez, <a href="https://www.forbes.com/sites/victordey/2026/06/16/databricks-ceo-says-hes-cracked-a-40-year-old-database-problem-with-ltap/">even Forbes is writing about this one</a>, what the heck do they know about anything??</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.forbes.com/sites/victordey/2026/06/16/databricks-ceo-says-hes-cracked-a-40-year-old-database-problem-with-ltap/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ks3m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png 424w, https://substackcdn.com/image/fetch/$s_!ks3m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png 848w, https://substackcdn.com/image/fetch/$s_!ks3m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png 1272w, https://substackcdn.com/image/fetch/$s_!ks3m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ks3m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png" width="1456" height="359" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:359,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:154815,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.forbes.com/sites/victordey/2026/06/16/databricks-ceo-says-hes-cracked-a-40-year-old-database-problem-with-ltap/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/202658374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ks3m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png 424w, https://substackcdn.com/image/fetch/$s_!ks3m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png 848w, https://substackcdn.com/image/fetch/$s_!ks3m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ks3m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8561070f-4b61-40ef-a163-46fa1c56cde5_2240x552.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>This is the one thing I wasn&#8217;t ready for, although it makes total sense. It was <strong><a href="https://www.databricks.com/company/newsroom/press-releases/databricks-launches-ltap-first-lake-transactionalanalytical">LTAP</a></strong>, short for <strong>Lake Transactional/Analytical Processing</strong>, and if you spend your days building data platforms instead of making keynote slides, this is the announcement that deserves your attention.</p><blockquote><p>The elevator pitch sounds almost suspiciously simple. Databricks wants operational databases and analytical workloads to operate on the <strong>same copy of data</strong>, eliminating CDC pipelines, ETL jobs, replicas, synchronization processes, and the collection of brittle plumbing that has somehow become accepted as &#8220;modern data architecture.&#8221;</p></blockquote><p>After reading both the press release and Ali Ghodsi&#8217;s interview explaining the thinking behind it, I don&#8217;t think this is really a story about ETL at all. It&#8217;s a story about removing a 40-year-old architectural assumption that everyone stopped questioning years ago.</p><pre><code><code>Today's architecture

Application
     &#9474;
 PostgreSQL
     &#9474;
 CDC / ETL
     &#9474;
 Data Lake
     &#9474;
 Analytics</code></code></pre><p>For decades we&#8217;ve accepted that applications belong in one database while analytics belong somewhere else, usually connected together by a growing pile of Kafka topics, replication jobs, Airflow DAGs, managed CDC services, and a Slack channel dedicated entirely to asking why yesterday&#8217;s pipeline failed again.</p><blockquote><p><em>The industry has spent years trying to make this architecture less painful instead of asking whether the architecture itself is the problem.</em></p></blockquote><p>Databricks thinks it is.</p><p>The obvious comparison is <a href="https://en.wikipedia.org/wiki/Hybrid_transactional/analytical_processing">HTAP</a>, which promised to unify transactional and analytical workloads years ago. The problem was that HTAP largely tried to shove both workloads into the same engine, which meant that eventually your dashboard and your checkout page were competing for the same resources.</p><p>Databricks is taking a different approach. Instead of building one engine that tries to do everything, they&#8217;re building <strong>one storage layer</strong> that multiple specialized engines can operate against.</p><div class="pullquote"><p>Lakebase handles PostgreSQL transactions, the Lakehouse handles analytics, Lakehouse//RT handles low-latency serving, and they all operate on the same governed copy of Delta or Iceberg data.</p></div><pre><code><code>LTAP

          Delta / Iceberg
                &#9474;
   &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
   &#9474;            &#9474;            &#9474;
Lakebase   Lakehouse   Lakehouse//RT
  OLTP        OLAP        Real Time</code></code></pre><p>That distinction is important because LTAP isn&#8217;t replacing PostgreSQL with Spark, and it isn&#8217;t asking Spark to become an operational database. It&#8217;s saying the engines can stay specialized while the storage becomes unified. That is a much bigger architectural shift than &#8220;we removed ETL.&#8221;</p><ul><li><p><em>The other piece that clicked for me came from Ghodsi&#8217;s explanation of why they&#8217;re doing this now. His argument is that the real pressure isn&#8217;t coming from human developers anymore. It&#8217;s coming from AI agents. Humans create applications relatively slowly. Agents don&#8217;t. </em></p></li></ul><p>They create databases, clone environments, test ideas, throw them away, and repeat the process constantly. Databricks claims roughly 80 percent of the databases on Lakebase are already being created by agents rather than people. </p><blockquote><p>Whether that number surprises you or makes you instinctively reach for a fact check, the direction is hard to argue with. Infrastructure built around dozens of data copies and endless synchronization simply doesn&#8217;t scale when your primary users become software instead of humans. </p></blockquote><p>The real takeaway here isn&#8217;t that Databricks found a clever way to remove another Airflow DAG. They&#8217;re trying to remove entire categories of infrastructure. No operational database replica feeding analytics. No warehouse copy that&#8217;s always fifteen minutes behind production. No &#8220;Zero ETL&#8221; marketing that quietly hides another synchronization service behind the curtain. </p><ul><li><p><em>One copy of the data, multiple engines reading and writing it, and governance living in one place.</em></p></li></ul><p>Will it work? That&#8217;s the billion dollar question, or perhaps the hundred billion dollar question given Databricks&#8217; valuation. 
Lakebase still has to prove it can be a production operational database at massive scale, and LTAP has to demonstrate that this elegant architecture survives contact with messy enterprise workloads. </p><p>But if Databricks pulls it off, we may eventually look back at maintaining separate OLTP and OLAP systems the same way we now look at nightly FTP jobs and hand-written shell scripts. Necessary once, painful always, and eventually replaced by something that made us wonder why we tolerated the old way for so long.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/review-of-databricks-data-ai-summit?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/review-of-databricks-data-ai-summit?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/review-of-databricks-data-ai-summit?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p></p>]]></content:encoded></item><item><title><![CDATA[The Future of the Lakehouse: Delta Lake, Rust, and Data Platforms at Scale]]></title><description><![CDATA[with Ethan Urbanski]]></description><link>https://dataengineeringcentral.substack.com/p/the-future-of-the-lakehouse-delta</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/the-future-of-the-lakehouse-delta</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Wed, 17 Jun 2026 13:35:53 GMT</pubDate><enclosure 
url="https://api.substack.com/feed/podcast/197940455/9f7ed8edab614c5757f3eef627935c6a.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>In this episode of the Data Engineering Central Podcast, I sit down with <a href="https://www.linkedin.com/in/ethanurbanski/">Ethan</a>, a maintainer of <a href="https://github.com/delta-io/delta-rs">delta-rs</a> and an expert in modern lakehouse architecture working in the pharmaceutical industry.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/in/ethanurbanski/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bHh-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png 424w, https://substackcdn.com/image/fetch/$s_!bHh-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png 848w, https://substackcdn.com/image/fetch/$s_!bHh-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png 1272w, https://substackcdn.com/image/fetch/$s_!bHh-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bHh-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png" width="1456" height="669" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:669,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1153993,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/ethanurbanski/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/197940455?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bHh-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png 424w, https://substackcdn.com/image/fetch/$s_!bHh-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png 848w, https://substackcdn.com/image/fetch/$s_!bHh-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png 1272w, https://substackcdn.com/image/fetch/$s_!bHh-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7660fdb-8411-420a-b1c2-762cfa7332bb_1646x756.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We discuss Ethan&#8217;s journey into tech and data engineering, the evolution of open table formats like Delta Lake and Apache Iceberg, and what it actually takes to build scalable enterprise data platforms in highly regulated environments like big pharma.</p><p>We also dive into:</p><ul><li><p>delta-rs and the future of Delta Lake outside Spark</p></li><li><p>Lakehouse architecture and open catalogs</p></li><li><p>Rust in the modern data ecosystem</p></li><li><p>Data platform governance and scalability</p></li><li><p>Enterprise analytics and infrastructure</p></li><li><p>The future of agentic analytics and AI-enabled data systems</p></li><li><p>Lessons learned building large-scale data platforms</p></li></ul><div class="captioned-image-container"><figure><a 
class="image-link image2 is-viewable-img" target="_blank" href="https://github.com/ethan-tyler" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DhxO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png 424w, https://substackcdn.com/image/fetch/$s_!DhxO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png 848w, https://substackcdn.com/image/fetch/$s_!DhxO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png 1272w, https://substackcdn.com/image/fetch/$s_!DhxO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DhxO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png" width="1456" height="565" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:565,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:871541,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://github.com/ethan-tyler&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/197940455?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DhxO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png 424w, https://substackcdn.com/image/fetch/$s_!DhxO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png 848w, https://substackcdn.com/image/fetch/$s_!DhxO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png 1272w, https://substackcdn.com/image/fetch/$s_!DhxO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6ef17ed-506a-431e-b59f-5cb7f353778d_2628x1020.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you&#8217;re interested in modern data engineering, open source infrastructure, lakehouses, or the future of analytics engineering, this is a great conversation.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/the-future-of-the-lakehouse-delta?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! 
This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/the-future-of-the-lakehouse-delta?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/the-future-of-the-lakehouse-delta?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[Escaping the Agentic Token Tax: Replacing Claude Code or Copilot with OpenCode]]></title><description><![CDATA[opencode + ollama for the win.]]></description><link>https://dataengineeringcentral.substack.com/p/escaping-the-token-tax-how-open-models</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/escaping-the-token-tax-how-open-models</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Thu, 11 Jun 2026 16:23:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!DMVS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DMVS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!DMVS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!DMVS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png 848w, https://substackcdn.com/image/fetch/$s_!DMVS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!DMVS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DMVS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3922449,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DMVS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!DMVS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png 848w, https://substackcdn.com/image/fetch/$s_!DMVS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!DMVS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359cf379-2212-492a-a6f1-aa5544efa75b_3840x2160.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;m going to tell your mom. I&#8217;m going to do it. You can&#8217;t stop me. And your Grandma, for that matter. I&#8217;m calling them right now, like now. You&#8217;re an addict; you have to stop. Someone has to stop you. Might as well be me.</p><blockquote><p><em>It started out harmless, didn&#8217;t it?</em></p></blockquote><p>But now things have changed; you&#8217;re hooked on The Vibes. They got their claws in you. Look at yourself. You should be ashamed. You sat high on your MacBook Tower all these years, God&#8217;s programming gift to humankind. Then those LLMs came along and you were enamored, Gastown and all, <strong>those tasty token morsels.</strong></p><p>More tokens, bro, just a few more tokens, and you will finally become Neo and enter the Matrix. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F2Vg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F2Vg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif 424w, https://substackcdn.com/image/fetch/$s_!F2Vg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif 848w, https://substackcdn.com/image/fetch/$s_!F2Vg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif 1272w, https://substackcdn.com/image/fetch/$s_!F2Vg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F2Vg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif" width="498" height="316" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:316,&quot;width&quot;:498,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:825763,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F2Vg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif 424w, https://substackcdn.com/image/fetch/$s_!F2Vg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif 848w, https://substackcdn.com/image/fetch/$s_!F2Vg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif 1272w, https://substackcdn.com/image/fetch/$s_!F2Vg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa49026f-d30b-4898-ac47-fba18d6a5a55_498x316.gif 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Ok, on a serious note, if I have one in my body, what&#8217;s the big deal? Why are the token addicts crying foul and howling at the moon? 
</p><ul><li><p>Well, the signs are there; people from <a href="https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tokens-agents/">Fortune</a> to LinkedIn are complaining about it.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tokens-agents/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!htT2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png 424w, https://substackcdn.com/image/fetch/$s_!htT2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png 848w, https://substackcdn.com/image/fetch/$s_!htT2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png 1272w, https://substackcdn.com/image/fetch/$s_!htT2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!htT2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png" width="1456" height="492" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:492,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:714374,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tokens-agents/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!htT2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png 424w, https://substackcdn.com/image/fetch/$s_!htT2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png 848w, https://substackcdn.com/image/fetch/$s_!htT2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png 1272w, https://substackcdn.com/image/fetch/$s_!htT2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa73062d4-078a-4ee0-923b-3a05c87d2012_2302x778.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex 
pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.reddit.com/r/ClaudeAI/comments/1sxcxge/github_copilot_9x_price_increase_for_claude_models/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_DZU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png 424w, 
https://substackcdn.com/image/fetch/$s_!_DZU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png 848w, https://substackcdn.com/image/fetch/$s_!_DZU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png 1272w, https://substackcdn.com/image/fetch/$s_!_DZU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_DZU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png" width="1456" height="570" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6c588fa7-c356-459a-b075-8e086495694e_1864x730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:570,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:251102,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.reddit.com/r/ClaudeAI/comments/1sxcxge/github_copilot_9x_price_increase_for_claude_models/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!_DZU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png 424w, https://substackcdn.com/image/fetch/$s_!_DZU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png 848w, https://substackcdn.com/image/fetch/$s_!_DZU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png 1272w, https://substackcdn.com/image/fetch/$s_!_DZU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c588fa7-c356-459a-b075-8e086495694e_1864x730.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What I&#8217;m still trying to figure out is how the collective &#8220;we&#8221; didn&#8217;t see this coming. Is there anything more classic in the marketing of drugs and AI, since the beginning of time, than to get &#8216;em hooked and then do the old bait and switch on pricing?</p><p>Heck, I don&#8217;t know if the pricing increases even matter. I mean, if the C-Suite wants to invest in AI, then who really knows whether the whole pricing strategy and increase are a big deal? <strong>It&#8217;s hard to turn the ship around once it has sailed.</strong></p><div><hr></div><p><em><strong>Thanks to <a href="http://www.delta.io/">Delta</a> for sponsoring this newsletter! I use Delta Lake daily, and I believe it represents the future of Data Engineering. Content like this would not be possible without their support. 
Check out <a href="http://www.delta.io/">their website</a> below.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="http://www.delta.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp" width="600" height="123" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:123,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4196,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:&quot;http://www.delta.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div><hr></div><p>If you live under a rock and don&#8217;t know what I&#8217;m talking about, basically &#8230;</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code 
class="language-plaintext">The era of heavily subsidized, flat-rate AI pricing has ended as both GitHub Copilot 
and Anthropic transition to token-based or usage-based billing. 
Users are facing significant bill increases and credit depletion due to the
 high token cost of long, autonomous coding tasks.</code></pre></div><p>There are probably multiple factors behind &#8220;why&#8221; folks are having second thoughts about token prices.</p><ul><li><p><em>Cost</em></p></li><li><p><em>Risk (general)</em></p></li><li><p><em>Lock-In</em></p></li><li><p><em>The future</em></p></li><li><p><em>Good judgement</em></p></li><li><p><em>Privacy</em></p></li><li><p><em>Freedom</em></p></li></ul><p>The same leaders who&#8217;ve been proud of escaping &#8220;vendor lock-in&#8221; and other sorts of evils are the ones who recently woke up to the news that their dev token costs/bills might go through the roof, and are now crying foul. <strong>Ironic.</strong></p><blockquote><p>I mean, hindsight is 20-20, but it wasn&#8217;t rocket science to figure out that something funny was going to happen at some point. Once you have the masses drinking from the AI teats, <em>money talks, ya know.</em></p></blockquote><p>Also, I&#8217;m not saying you have to abandon, or should abandon, tools like Copilot or Claude Code because you&#8217;re scared and overreacting. There are many obvious, viable ways to reduce token usage that require little to no effort.</p><ul><li><p><em>Review CLAUDE.md files and context</em></p></li><li><p><em>Implement tools like <a href="https://github.com/juliusbrussee/caveman">Caveman</a></em></p></li><li><p><em>Get better at prompting</em></p></li><li><p><em>Adjust context</em></p></li></ul><p>The truth is, I have great faith in the indelible human spirit. We find solutions to most problems, including AI token maxxing. Everyone has been sloppy because we were allowed to be. 
<strong>If we &#8220;have to&#8221; be more judicious, we can be.</strong></p><div><hr></div><h2>Open-source alternatives to break free from token prison.</h2><p>All that being said, in the classic white-hat hacker, open-source spirit that never dies &#8230; there is, and has been for a while, a strong undercurrent pushing for total control, total free control. The question is, can it be reasonably achieved?</p><ul><li><p>Can we find open-source alternatives we can run locally on our machines that deliver reasonable output and performance?</p></li></ul><p>Most people are not going to go buy a Mac mini just to run a model for themselves. They will just pay the money to the SaaS Lords and move on. There are also other important questions to ask.</p><ul><li><p><em>Once we&#8217;re used to Claude Code and Anthropic speeds (from prompt to result), can we achieve them locally with any setup, or anything even close?</em></p></li><li><p><em>We used to hand-roll code a few years ago. 
Can we be patient enough to wait 30 seconds to a minute for a response?</em></p></li></ul><p>Something tells me that in the Instagram and Amazon life we live, once we&#8217;ve tasted that token fruit upon our digital tongues, it&#8217;s hard to accept something else as &#8220;good enough,&#8221; even when it actually is good enough.</p><blockquote><p>Will the code or systems design output meet our expectations? Will it take 30 seconds longer than we want, making it feel like years by comparison? Are the setup and installation overly burdensome?</p></blockquote><p>As much as I would like to believe the best, the truth is humans are fairly predictable.</p><div><hr></div><h2>Starting with OpenCode.</h2><p>So, let&#8217;s start our possibly long and forlorn journey down the winding road of token freedom. Like <a href="https://en.wikipedia.org/wiki/Beowulf">Beowulf of old</a>, we seek new lands and are ready to battle new monsters. 
I don&#8217;t expect this adventure to be free from heartache, but I&#8217;m sure we will learn something along the way.</p><ul><li><p>First things first.</p></li></ul><p>I&#8217;m going to simplify my approach to this and break it up into two different logical pieces.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8ycN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8ycN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png 424w, https://substackcdn.com/image/fetch/$s_!8ycN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png 848w, https://substackcdn.com/image/fetch/$s_!8ycN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png 1272w, https://substackcdn.com/image/fetch/$s_!8ycN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8ycN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png" width="1456" height="611" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:611,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:86913,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8ycN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png 424w, https://substackcdn.com/image/fetch/$s_!8ycN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png 848w, https://substackcdn.com/image/fetch/$s_!8ycN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png 1272w, https://substackcdn.com/image/fetch/$s_!8ycN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818c925a-c90f-4b66-ac6e-8e1a19dd5f7f_1544x648.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Think about it like your Claude Code or GitHub CoPilot setup. You have some &#8220;agentic tool&#8221; on your machine that you probably interact with. 
Then, you are using some remote LLM model via some API from Anthropic, OpenAI, whatever&#8230; these two pieces combined allow you to Gastown your way to glory.</p><p>You can use what you want, but let&#8217;s pick the cream of the crop in open source, that is, <a href="https://opencode.ai/">OpenCode</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://opencode.ai/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JZxa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png 424w, https://substackcdn.com/image/fetch/$s_!JZxa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png 848w, https://substackcdn.com/image/fetch/$s_!JZxa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png 1272w, https://substackcdn.com/image/fetch/$s_!JZxa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JZxa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png" width="1456" height="506" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:506,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:126688,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://opencode.ai/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JZxa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png 424w, https://substackcdn.com/image/fetch/$s_!JZxa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png 848w, https://substackcdn.com/image/fetch/$s_!JZxa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png 1272w, https://substackcdn.com/image/fetch/$s_!JZxa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa037364-8edb-4e64-b77c-984a19751fdc_2122x738.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="pullquote"><p>&#8220;<strong>What is OpenCode? </strong>OpenCode is an open source agent that helps you write code in your terminal, IDE, or desktop.&#8221; - <a href="https://opencode.ai/">source</a></p></div><p>So, let&#8217;s get to it. Easy enough. 
<a href="https://opencode.ai/docs">Find installation instructions for yourself here.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3e5u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3e5u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png 424w, https://substackcdn.com/image/fetch/$s_!3e5u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png 848w, https://substackcdn.com/image/fetch/$s_!3e5u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png 1272w, https://substackcdn.com/image/fetch/$s_!3e5u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3e5u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png" width="1400" height="420" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:420,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:110971,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3e5u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png 424w, https://substackcdn.com/image/fetch/$s_!3e5u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png 848w, https://substackcdn.com/image/fetch/$s_!3e5u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png 1272w, https://substackcdn.com/image/fetch/$s_!3e5u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea10fd1d-5c83-445c-99ab-1768f59b70f9_1400x420.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The rest is smooth enough, you&#8217;ll probably wonder why you didn&#8217;t do this a year ago.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;cc7ac18d-a83b-43a2-bfa9-eae9d6396598&quot;,&quot;duration&quot;:null}"></div><p>Ok, so this is kind of deceptive up front. You&#8217;ve only fought half the battle at this point. 
By default, OpenCode will just look at your environment and pick up whatever model provider you&#8217;re probably already using, through things like <code>OPENAI_API_KEY</code> or <code>ANTHROPIC_API_KEY</code>.</p><blockquote><p>Sure, we are now using an open-source coding agent/tool, but if we are trying to break free from our token addictions&#8230; then we still need to find a small, runnable, local model to wire into OpenCode.</p></blockquote><p>If you want to understand more about using different LLMs with OpenCode, <a href="https://opencode.ai/docs/zen">you can read their docs on the subject here</a>.</p><div><hr></div><h2>Small language models for coding tasks.</h2><p>Enter the Rabbit Hole. What hole? The hole of the endless Reddit threads on what is the best &#8220;small language model&#8221; you can use locally for decent results. 
This is where personality traits and life outlook come into play.</p><blockquote><p>Everyone works on different tasks, cares about different things, and will find some models better than others for various reasons.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.reddit.com/r/LocalLLaMA/comments/1jn1njb/which_llms_are_the_best_and_opensource_for_code/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FG98!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png 424w, https://substackcdn.com/image/fetch/$s_!FG98!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png 848w, https://substackcdn.com/image/fetch/$s_!FG98!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png 1272w, https://substackcdn.com/image/fetch/$s_!FG98!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FG98!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png" width="1456" height="499" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:499,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:192226,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.reddit.com/r/LocalLLaMA/comments/1jn1njb/which_llms_are_the_best_and_opensource_for_code/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FG98!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png 424w, https://substackcdn.com/image/fetch/$s_!FG98!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png 848w, https://substackcdn.com/image/fetch/$s_!FG98!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png 1272w, https://substackcdn.com/image/fetch/$s_!FG98!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb19e6764-56a3-48ec-9d1a-20b5c5a95925_1828x626.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.reddit.com/r/LocalLLM/comments/1qt0qv7/best_small_model_for_coding/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2a09!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png 424w, 
https://substackcdn.com/image/fetch/$s_!2a09!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png 848w, https://substackcdn.com/image/fetch/$s_!2a09!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png 1272w, https://substackcdn.com/image/fetch/$s_!2a09!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2a09!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png" width="1456" height="518" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:518,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:153616,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.reddit.com/r/LocalLLM/comments/1qt0qv7/best_small_model_for_coding/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!2a09!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png 424w, https://substackcdn.com/image/fetch/$s_!2a09!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png 848w, https://substackcdn.com/image/fetch/$s_!2a09!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png 1272w, https://substackcdn.com/image/fetch/$s_!2a09!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d14f4e3-ff1c-45bb-bcd0-f4734e0a7bd8_1670x594.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;m not here to jump into this debate on which SLM is the best for coding. It changes all the time and will continue to change, hopefully getting better and better as time goes on. It&#8217;s simply hard to compete with companies like OpenAI or Anthropic, backed by the Deep State, Bigfoot, Aliens, and billions of dollars.</p><p>So, I want to find something that is &#8220;good enough&#8221; at basic coding tasks. The idea is that, in the real world, we could mix this setup with our Token Masters: start out with OpenCode and some SLM, get the grunt work done, and maybe polish the result with OpenAI or Anthropic.</p><ul><li><p>Back to the problem at hand, let&#8217;s pick an SLM and get it hooked up to our local OpenCode.</p></li></ul><p>Ok, so in a somewhat satirical twist of fate, the once <a href="https://www.youtube.com/watch?v=T5Tum7b9D3I">&#8220;don&#8217;t be evil&#8221; Google</a>, according to the internet, is going to save our bacon on this one.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.ibm.com/think/topics/google-gemma" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r1mw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png 424w, 
https://substackcdn.com/image/fetch/$s_!r1mw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png 848w, https://substackcdn.com/image/fetch/$s_!r1mw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png 1272w, https://substackcdn.com/image/fetch/$s_!r1mw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r1mw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png" width="1456" height="594" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:594,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:725645,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.ibm.com/think/topics/google-gemma&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!r1mw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png 424w, https://substackcdn.com/image/fetch/$s_!r1mw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png 848w, https://substackcdn.com/image/fetch/$s_!r1mw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png 1272w, https://substackcdn.com/image/fetch/$s_!r1mw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F982de129-f5f4-4739-9ee8-010c64f8f085_2040x832.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="pullquote"><p>&#8220;Gemma is Google&#8217;s family of free and open <a href="https://www.ibm.com/think/topics/small-language-models">small language models (SLMs)</a>. They&#8217;re built from the same technology as the <a href="https://www.ibm.com/think/topics/google-gemini">Gemini</a> family of <a href="https://www.ibm.com/topics/large-language-models">large language models (LLMs)</a> and are considered &#8220;lightweight&#8221; versions of Gemini.&#8221; - <a href="https://www.ibm.com/think/topics/google-gemma">source</a></p></div><p>It just so happens those little blighters fine-tuned a coding-specific version.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://ai.google.dev/gemma/docs/codegemma" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JlQh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png 424w, https://substackcdn.com/image/fetch/$s_!JlQh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png 848w, https://substackcdn.com/image/fetch/$s_!JlQh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png 1272w, 
https://substackcdn.com/image/fetch/$s_!JlQh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JlQh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png" width="1456" height="406" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:406,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:119300,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://ai.google.dev/gemma/docs/codegemma&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JlQh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png 424w, https://substackcdn.com/image/fetch/$s_!JlQh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png 848w, 
https://substackcdn.com/image/fetch/$s_!JlQh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png 1272w, https://substackcdn.com/image/fetch/$s_!JlQh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0aad83-9d3d-4573-baa9-99a70866cbfe_1824x508.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>So, I think we just give this <a href="https://ai.google.dev/gemma/docs/codegemma">CodeGemma 7B</a> a try, wire it up to OpenCode, and 
let &#8217;er rip.</p><p>This might seem a little strange, but the easiest way to get CodeGemma onto our local machine and running is to use Ollama. <a href="https://dataengineeringcentral.substack.com/p/run-llama-31-8b-locally-with-langchain?utm_source=publication-search">I&#8217;ve used Ollama plenty in the past</a>. <a href="https://ollama.com/download/mac">It&#8217;s easy to get.</a></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">ollama run codegemma</code></pre></div><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;5f6458d6-ab1c-4cda-bee1-2d0d0b767f2b&quot;,&quot;duration&quot;:null}"></div><p>So we have CodeGemma on our machine, thanks to Ollama. Don&#8217;t you love open source? Now we can hopefully configure our OpenCode to run CodeGemma and see whether we shed tears of joy or sorrow.</p><ul><li><p>Next, we need a little JSON config magic to point our OpenCode to our Ollama Gemma model. Mouthful.</p></li></ul><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;89486b98-7752-46ed-a26f-37a6accc4b31&quot;,&quot;duration&quot;:null}"></div><p>Next, we do a little vim&#8217;ing.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&gt;&gt; ~/.config/opencode/opencode.json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "codegemma:7b": {
          "name": "CodeGemma 7B"
        }
      }
    }
  },
  "model": "ollama/codegemma:7b"
}</code></pre></div><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;8ac2746c-a742-40a4-99d2-c06b916d2e11&quot;,&quot;duration&quot;:null}"></div><p>Now we can double-check what model OpenCode is using.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">opencode
/model</code></pre></div><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;d98c08b6-60eb-47b5-b1d4-9811fc434dbd&quot;,&quot;duration&quot;:null}"></div><div><hr></div><h2>Doing a thing.</h2><p>Well, we've done what we set out to do today. How do you feel? Freedom? Anarchy? Rebel? Rich? Wow, with a little bit of work and some swimming against the stream, we have unshackled ourselves from the token mongers. <em><strong>Better sleep with one eye open tonight.</strong></em></p><ul><li><p>I mean, the real question is &#8230; will it perform?</p></li></ul><p>Beauty is in the eye of the beholder. I have no idea; maybe it will be too slow or just produce horrible output. Maybe it can&#8217;t do specific Data Engineering tasks. 
I don&#8217;t know.</p><p>Let&#8217;s just do something simplistic.</p><blockquote><p>Have it read an <a href="https://divvy-tripdata.s3.amazonaws.com/index.html">open-source Divvy Bike Trip</a> CSV file, and use maybe DuckDB or Polars to do some analytics, maybe a simple groupBy and aggregate.</p></blockquote><p>Here we go.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-isf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-isf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png 424w, https://substackcdn.com/image/fetch/$s_!-isf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png 848w, https://substackcdn.com/image/fetch/$s_!-isf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png 1272w, https://substackcdn.com/image/fetch/$s_!-isf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-isf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png" width="1456" height="375" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:375,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:205852,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-isf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png 424w, https://substackcdn.com/image/fetch/$s_!-isf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png 848w, https://substackcdn.com/image/fetch/$s_!-isf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png 1272w, https://substackcdn.com/image/fetch/$s_!-isf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75c9853-cc5f-4ef4-982c-9c1a9d09422c_1708x440.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>So, we are learning something: one, we can fail immediately; two, apparently, not all free open-source models are created equal. I&#8217;m no savant of running local models, but apparently this CodeGemma doesn&#8217;t support tool calling, aka it doesn&#8217;t expose the function-calling interface that OpenCode needs for integration.</p><p>Maybe that is something we could figure out and tweak, but look, I&#8217;m approaching this from a caveman perspective. I want an easy setup that anyone can tackle.</p><ul><li><p><em>Anywho, I am corn-fed and Midwest-raised, and I don&#8217;t quit easily. On to the next model. 
<a href="https://ollama.com/library/qwen3-coder">Qwen, my love</a>.</em></p></li></ul><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">ollama pull qwen2.5-coder:7b-instruct</code></pre></div><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;70f5a28a-4164-4a09-9260-d9eab4c3f4d6&quot;,&quot;duration&quot;:null}"></div><p>Of course, we need to update that config file.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b-instruct": {
          "name": "Qwen 2.5 Coder 7B"
        }
      }
    }
  },
  "model": "ollama/qwen2.5-coder:7b-instruct"
}</code></pre></div><p>Ok, let&#8217;s try that data pipeline again with <a href="https://ollama.com/library/qwen3-coder">Qwen</a>.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Make a Python file that uses DuckDB to read the CSV file 
202605-divvy-tripdata.csv, count the number of trips per day, 
write results to another CSV file. So count the
 column ride_id and group by started_at cast to a date.</code></pre></div><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;7bfc520e-f461-44b6-9d80-4a90a9cafe70&quot;,&quot;duration&quot;:null}"></div><p>Then I left and had supper with the family. Why not? Isn&#8217;t that the point of Agentic Coding? You just tell the Agent to do a thing, leave and do other things, then come back later to check on that thing?</p><p>All I know is when I got back from supper, it still wasn&#8217;t done. Ain&#8217;t no Anthropic, told you so. Look, that little stinker is burning 30% of my CPU.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LQHz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LQHz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png 424w, https://substackcdn.com/image/fetch/$s_!LQHz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png 848w, https://substackcdn.com/image/fetch/$s_!LQHz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png 1272w, https://substackcdn.com/image/fetch/$s_!LQHz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!LQHz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png" width="1456" height="181" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:181,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:99874,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LQHz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png 424w, https://substackcdn.com/image/fetch/$s_!LQHz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png 848w, https://substackcdn.com/image/fetch/$s_!LQHz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png 1272w, https://substackcdn.com/image/fetch/$s_!LQHz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f538e3-489c-4760-ba80-deee8a874f6d_1654x206.png 
1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>I mean, I&#8217;m using what I consider a pretty standard MacBook.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cd36!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cd36!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png 424w, https://substackcdn.com/image/fetch/$s_!Cd36!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png 848w, https://substackcdn.com/image/fetch/$s_!Cd36!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png 1272w, https://substackcdn.com/image/fetch/$s_!Cd36!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cd36!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png" width="546" height="492" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:492,&quot;width&quot;:546,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:83394,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Cd36!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png 424w, https://substackcdn.com/image/fetch/$s_!Cd36!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png 848w, https://substackcdn.com/image/fetch/$s_!Cd36!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png 1272w, https://substackcdn.com/image/fetch/$s_!Cd36!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb979561d-71b8-4fdd-9e8b-b094b9c2b4b6_546x492.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What in Thor&#8217;s Beard? 33 Minutes for that task. That ain&#8217;t going to work, but I guess if you&#8217;re like Tiny Tim peddling the streets for tokens, time is of little importance.</p><ul><li><p>Also, what do I know about tools and OpenCode? It didn&#8217;t actually write the file. 
It just spit the Python out.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WQuJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WQuJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png 424w, https://substackcdn.com/image/fetch/$s_!WQuJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png 848w, https://substackcdn.com/image/fetch/$s_!WQuJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png 1272w, https://substackcdn.com/image/fetch/$s_!WQuJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WQuJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png" width="1456" height="451" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:451,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:456133,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WQuJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png 424w, https://substackcdn.com/image/fetch/$s_!WQuJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png 848w, https://substackcdn.com/image/fetch/$s_!WQuJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png 1272w, https://substackcdn.com/image/fetch/$s_!WQuJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf2bfeac-c54e-4824-814d-b8e6bc630d2f_1706x528.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Before I start complaining about how crap this all is, let&#8217;s see if this code actually works. We will put it into a Python file ourselves and run it. 
Clearly, its tool-calling ability &#8230; you can see it trying to write a Python file &#8230; is a little wonky; maybe that&#8217;s something you can get working?</p><p>Off the bat, I can see its Python skills are a little sub-par.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FzzU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FzzU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png 424w, https://substackcdn.com/image/fetch/$s_!FzzU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png 848w, https://substackcdn.com/image/fetch/$s_!FzzU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png 1272w, https://substackcdn.com/image/fetch/$s_!FzzU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FzzU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png" width="1400" height="1154" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1154,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:277593,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FzzU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png 424w, https://substackcdn.com/image/fetch/$s_!FzzU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png 848w, https://substackcdn.com/image/fetch/$s_!FzzU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png 1272w, https://substackcdn.com/image/fetch/$s_!FzzU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3cd57ff-e55b-42d2-9160-b7a86c99fc24_1400x1154.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What&#8217;s that saying your Grandma was always carrying on about? 
You get what you pay for?</p><ul><li><p>The SQL query uses the wrong quotes (" vs ')</p></li><li><p>A df column is referenced incorrectly; it needs [] in a few spots</p></li></ul><p>The fixed code (fixed by me, that is) looks like this; it runs great and outputs what I asked for.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dgur!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dgur!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png 424w, https://substackcdn.com/image/fetch/$s_!dgur!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png 848w, https://substackcdn.com/image/fetch/$s_!dgur!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png 1272w, https://substackcdn.com/image/fetch/$s_!dgur!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dgur!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png" width="1400" height="1154" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1154,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:276356,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dgur!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png 424w, https://substackcdn.com/image/fetch/$s_!dgur!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png 848w, https://substackcdn.com/image/fetch/$s_!dgur!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png 1272w, https://substackcdn.com/image/fetch/$s_!dgur!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff39b9b0-efb1-4cdd-9f79-4a14f87afb8b_1400x1154.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Results.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZJI_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZJI_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png 424w, 
https://substackcdn.com/image/fetch/$s_!ZJI_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png 848w, https://substackcdn.com/image/fetch/$s_!ZJI_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png 1272w, https://substackcdn.com/image/fetch/$s_!ZJI_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZJI_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png" width="1456" height="635" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:635,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:294339,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/201340170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!ZJI_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png 424w, https://substackcdn.com/image/fetch/$s_!ZJI_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png 848w, https://substackcdn.com/image/fetch/$s_!ZJI_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png 1272w, https://substackcdn.com/image/fetch/$s_!ZJI_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67cb5976-8d2e-4303-b149-427cac3d514d_1710x746.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Ok, so clearly, as long as my local model is mostly, well, only using CPU to solve these problems, this thing is going to crawl. Now you know why crypto bros have been ordering GPUs for years. Also, you now know why things like Claude Code, combined with some Anthropic model behind an API, are so compelling, and also why those companies&#8230;</p><ul><li><p><em>Hire smart people</em></p></li><li><p><em>Burn a lot of cash</em></p></li><li><p><em>Suck up all our groundwater into data centers</em></p></li><li><p><em>Use all our electricity</em></p></li></ul><p>Takes some of that there &#8220;<strong>gumption</strong>&#8221; like we call it around here, for you to send some massive context over the wire and across the country, and get back over that same wire &#8230; a correct answer.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>So &#8230; yeah &#8230;</h2><p>To be honest, I&#8217;m running out of gumption. Don&#8217;t tell anyone. Just give me my basket of nice hot tokens and let me snack on Claude Code. If you want me to reduce token costs, I will just sprinkle some caveman on it, improve my prompts, and cut out all the unnecessary context that is hidden here and there.</p><p>Give &#8217;em the money, give me my tokens.</p><blockquote><p><em>See, what did I tell you? 
I figured we could get something working, break free from the rest of the lemmings running over the AI cliff. Pretend we are rebels and run our own local model.</em></p></blockquote><p>Life isn&#8217;t a movie; you haven&#8217;t figured that out yet? No free lunch? Never heard of that either?</p><p>I&#8217;m sure there are 16 things I did wrong that I&#8217;ll be told about; the Reddit rabble will probably figure it out. All you have to do is meet them under the old oak tree at midnight and sign your name in blood with a stick. Who knows.</p><ul><li><p>More hardware, more GPU, better configs, I don&#8217;t know.</p></li></ul><p>Just Google it a little, and you will see what I mean. There are already a million YouTube videos and other articles telling you they have cracked the code for the perfect, snappy model that performs best.</p><p>Good luck.</p><p>This is something I want to return to, maybe in a Part 2: do a little experimenting and research, and see if we can find a model that gives good results on a decent-sized machine like this. Maybe it isn&#8217;t possible yet, who knows.</p><p>Yeah, I know we can solve lots of compute problems with more RAM, CPU, GPU &#8230; but I didn&#8217;t just want to see if it is possible to break free from the token tax (yes, it is possible); I wanted to see what it is like for the average Dev who wants to do that.</p><p>It appears not to be THAT easy. 
Yeah, I didn&#8217;t spend much time on it or on trying to solve it yet; that can come later. I just wanted to dip my toes in the water and see what the world had in store for me.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/escaping-the-token-tax-how-open-models/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/escaping-the-token-tax-how-open-models/comments"><span>Leave a comment</span></a></p><div id="youtube2-qte0OZx86Go" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;qte0OZx86Go&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/qte0OZx86Go?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div>]]></content:encoded></item><item><title><![CDATA[From Failure to AWS: What Actually Makes a Great Engineer]]></title><description><![CDATA[with Victor Moreno]]></description><link>https://dataengineeringcentral.substack.com/p/from-failure-to-aws-what-actually</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/from-failure-to-aws-what-actually</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Wed, 10 Jun 2026 12:15:10 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/196475381/77e74e2c2a0430018a6c74b43cf56aae.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>Victor Moreno went from failing out of a top CS program to becoming a senior engineer at AWS, and his story says a lot about what actually matters in software engineering today.</p><blockquote><p>In this 
conversation, we go deep into the reality behind the AI hype, what makes engineers valuable (<em>it&#8217;s not writing more code</em>), and why the future of the field looks very different from what most people think.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/in/thecodingteacher/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nP7g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png 424w, https://substackcdn.com/image/fetch/$s_!nP7g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png 848w, https://substackcdn.com/image/fetch/$s_!nP7g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png 1272w, https://substackcdn.com/image/fetch/$s_!nP7g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nP7g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png" width="1456" height="708" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:708,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1382662,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/thecodingteacher/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196475381?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nP7g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png 424w, https://substackcdn.com/image/fetch/$s_!nP7g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png 848w, https://substackcdn.com/image/fetch/$s_!nP7g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png 1272w, https://substackcdn.com/image/fetch/$s_!nP7g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a115ec3-35b0-4865-80a8-f541e5babfb5_1612x784.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We talk about the shift from coding to system thinking, why fundamentals matter more in the age of AI, and how junior engineers will need to evolve as tools like Claude and ChatGPT take over the &#8220;grunt work.&#8221;</p><p>Victor also shares hard-earned lessons from teaching, startups, consulting, and building systems at AWS, along with practical advice for engineers looking to stand out in a crowded, uncertain job market.</p><p>This is not a hype conversation. 
It&#8217;s a real look at where things are going.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/from-failure-to-aws-what-actually?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/from-failure-to-aws-what-actually?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/from-failure-to-aws-what-actually?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2>&#128273; What We Cover</h2><ul><li><p>Why AI is making fundamentals more important, not less</p></li><li><p>Why chasing promotions is the biggest mistake engineers make</p></li><li><p>How to actually become a high-impact engineer</p></li><li><p>Why doing more Jira tickets doesn&#8217;t matter</p></li><li><p>What&#8217;s broken about today&#8217;s interview process</p></li><li><p>The future of junior engineers in an AI world</p></li><li><p>Tactical vs strategic engineering (and why it matters)</p></li><li><p>Why most AI-generated code is still &#8220;low quality&#8221;</p></li><li><p>How to think about career growth in a weird job market</p></li></ul><div><hr></div><h2>&#128161; Key Takeaway</h2><p>The best engineers aren&#8217;t the ones writing the most code; they&#8217;re the ones who understand systems, think long-term, and can drive decisions.</p><div class="subscription-widget-wrap-editor" 
data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Data Engineering Central is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Delta Lake + DuckDB. Catalog Commits with Unity Catalog. Unlocking Concurrent Ingestion.]]></title><description><![CDATA[why isn't anyone talking about this?]]></description><link>https://dataengineeringcentral.substack.com/p/delta-lake-duckdb-catalog-commits</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/delta-lake-duckdb-catalog-commits</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Sun, 07 Jun 2026 14:03:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Wq3k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wq3k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wq3k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!Wq3k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!Wq3k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!Wq3k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wq3k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png" width="1280" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1043380,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wq3k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!Wq3k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!Wq3k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!Wq3k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb5a9ce3-1c73-4aba-b7d4-6ecd945f6656_1280x720.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="pullquote"><p><strong>UPDATE</strong>: Since I wrote this, the Databricks and DuckDB teams have fixed the errors I ran into. &#8220;We've since worked with the DuckDB team to land a fix. It's now available in the nightly extension builds via FORCE INSTALL and will also go out in the upcoming DuckDB patch release (ETA june 15th).&#8221;</p></div><p>Ok, sometimes I&#8217;m honestly amazed at which squeaky wheels get the grease and which don&#8217;t in the data community at large. <em>*Sigh*</em>. 
Well, I sort of get it, but as someone who loves the <a href="https://dataengineeringcentral.substack.com/p/the-single-node-rebellion?utm_source=publication-search">Single Node Rebellion</a>, it&#8217;s hard not to call my mom about this.</p><p>The world, and I, have embraced the Lake House as the data platform architecture of choice (<em>yes, it is an architecture</em>). With Delta Lake and Iceberg becoming one, <strong>we have all drunk from the chalice and are reluctant to put it down.</strong></p><p>Part of the problem, if you can even call it that, is that we (the collective we) have, as a side effect, become addicted to clusters and high-cost compute. Sure, Databricks Serverless has come along and is indeed welcome, and <a href="https://dataengineeringcentral.substack.com/p/polars-and-duckdb-release-unity-catalog?utm_source=publication-search">tools like DuckDB, Polars, and Daft have for some time supported Unity Catalog access to the Lake House</a>.</p><ul><li><p><em>But we don&#8217;t turn a blind eye to the complexity and issues this might create.</em></p></li></ul><p><a href="https://dataengineeringcentral.substack.com/p/aws-lambda-duckdb-and-delta-lake?utm_source=publication-search">In past years, I&#8217;ve been the dude pushing DuckDB to prod inside Lambdas and letting &#8216;er rip on Unity Catalog Delta Lake tables.</a> <strong>But ya gotta be careful when you&#8217;re trying to touch the sun</strong>. At the most basic level, Delta Lake and similar Lake House storage layers are all about transactions, and once that word appears, <strong>it&#8217;s easy to blow stuff up.</strong></p><p>How do you ensure you&#8217;re working on the right table version? 
How do you allow multiple writers, say DuckDB, to hit the same table and not FUBAR your production tables?</p><blockquote><p><em>That&#8217;s what we are going to talk about today.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://delta.io/blog/2026-05-06-delta-grows-up-writes-time-travel-and-unity-catalog/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YCrU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png 424w, https://substackcdn.com/image/fetch/$s_!YCrU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png 848w, https://substackcdn.com/image/fetch/$s_!YCrU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png 1272w, https://substackcdn.com/image/fetch/$s_!YCrU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YCrU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png" width="1456" height="460" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:460,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:112324,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://delta.io/blog/2026-05-06-delta-grows-up-writes-time-travel-and-unity-catalog/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YCrU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png 424w, https://substackcdn.com/image/fetch/$s_!YCrU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png 848w, https://substackcdn.com/image/fetch/$s_!YCrU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png 1272w, https://substackcdn.com/image/fetch/$s_!YCrU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b2ede4e-a2fa-4d0e-9348-f0ba0e670f21_1872x592.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/delta-lake-duckdb-catalog-commits?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! 
This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/delta-lake-duckdb-catalog-commits?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/delta-lake-duckdb-catalog-commits?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2>Le Probl&#232;me (the problem)</h2><p>I sorta mentioned the &#8220;problem&#8221; that has dogged Lake House architectures for the last few years. They have mostly been one-trick ponies, wonderful in their inception and in delivering data at scale, but poor at supporting real-world, multi-tool reads and writes.</p><blockquote><p>Sure, a variety of tools have had read access for a few years now, but that only gets a guy so far. 
The ever-reached-for gold star that unlocks bespoke, handcrafted, unlimited multi-engine access and interaction has mostly been an <a href="https://delta-io.github.io/delta-rs/usage/writing/writing-to-s3-with-locking-provider/">untouched relic used only by mythical delta-rs and DynamoDB creatures</a>.</p></blockquote><p>Only a chosen few have the courage to walk that path less traveled.</p><p>So, for the masses, we&#8217;ve had to stick to our Spark Clusters, Serverless, Databricks Connect, or whatever &#8230; to get the true read and, more importantly, write semantics <strong>needed to build most production pipelines.</strong></p><p>Sure, we&#8217;ve had <a href="https://dataengineeringcentral.substack.com/p/polars-and-duckdb-release-unity-catalog">DuckDB and Polars integrations into Unity Catalog</a> for the brave, but you were still playing with fire and banging rocks together.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dataengineeringcentral.substack.com/p/polars-and-duckdb-release-unity-catalog" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a7Fy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png 424w, https://substackcdn.com/image/fetch/$s_!a7Fy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png 848w, https://substackcdn.com/image/fetch/$s_!a7Fy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png 1272w, 
https://substackcdn.com/image/fetch/$s_!a7Fy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!a7Fy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png" width="1456" height="461" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:461,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:358873,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://dataengineeringcentral.substack.com/p/polars-and-duckdb-release-unity-catalog&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!a7Fy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png 424w, https://substackcdn.com/image/fetch/$s_!a7Fy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png 848w, 
https://substackcdn.com/image/fetch/$s_!a7Fy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png 1272w, https://substackcdn.com/image/fetch/$s_!a7Fy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f35b0d7-6203-490e-b93b-054b2199a6c0_1922x608.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What were we missing?</p><blockquote><p><strong>Simplistic concurrent writes via common data 
tooling.</strong></p></blockquote><p>Simple is good; it doesn&#8217;t have to be fancy. It just has to open up the Lake House architecture to be more &#8230;</p><ul><li><p>flexible</p></li><li><p>multi-engine</p></li><li><p>production-ready, with full support for all operations</p></li></ul><p>Is that so much to ask for?</p><div><hr></div><h2>Then the Lord said &#8230; &#8220;Let them bring forth Catalog Commits.&#8221;</h2><p>Let&#8217;s get to the matter at hand. Also, just because I&#8217;m singing the praises of Unity Catalog and Catalog Commits here doesn&#8217;t mean I won&#8217;t make some critical comments at some point. 
Ya&#8217; know I bring the heat at all times and in all seasons.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.databricks.com/blog/convergence-open-table-formats-and-open-catalogs-catalog-commits-generally-available" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1qjf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png 424w, https://substackcdn.com/image/fetch/$s_!1qjf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png 848w, https://substackcdn.com/image/fetch/$s_!1qjf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png 1272w, https://substackcdn.com/image/fetch/$s_!1qjf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1qjf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png" width="1456" height="597" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:236138,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.databricks.com/blog/convergence-open-table-formats-and-open-catalogs-catalog-commits-generally-available&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1qjf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png 424w, https://substackcdn.com/image/fetch/$s_!1qjf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png 848w, https://substackcdn.com/image/fetch/$s_!1qjf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png 1272w, https://substackcdn.com/image/fetch/$s_!1qjf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89f82730-29f8-41be-914a-b7861f1d0dbc_1922x788.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 
pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;m not going to give a ton of background here, just the basics, so everyone is on the same page about why Catalog Commits are a big deal. At the core of the problem, Delta Lake provides ACID capabilities directly on top of object storage. This makes the implementation simple, and not needing a Data Catalog simplifies the architecture.</p><p>But it also surfaces issues. 
Databricks sums them up well.</p><ul><li><p>&#8220;&#8230; <em>external engines writing to Delta tables directly in object storage cause catalog metadata, like schemas, to silently diverge from the actual table state.</em>&#8221;</p></li><li><p>&#8220;&#8230; <em>every engine, tool, and agent can access tables differently, resulting in fragmented table discovery, inconsistent auditing, and no standardized enforcement of row or column-level controls across systems.</em>&#8221;</p></li><li><p>&#8220;&#8230; <em>open lakehouse architectures historically have not supported atomic writes spanning multiple tables.</em>&#8221;</p></li></ul><p>The only real option to solve these issues was/is for Delta Lake to follow in Iceberg&#8217;s footsteps, for once, and move towards making <strong>a <a href="https://delta.io/blog/2026-02-02-delta-catalog-managed-tables/">Data Catalog the primary means by which someone or something interacts with the Lake House.</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://delta.io/blog/2026-02-02-delta-catalog-managed-tables/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w4YD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png 424w, https://substackcdn.com/image/fetch/$s_!w4YD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png 848w, https://substackcdn.com/image/fetch/$s_!w4YD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png 1272w, 
https://substackcdn.com/image/fetch/$s_!w4YD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w4YD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png" width="1456" height="467" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:467,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:151676,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://delta.io/blog/2026-02-02-delta-catalog-managed-tables/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w4YD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png 424w, https://substackcdn.com/image/fetch/$s_!w4YD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png 848w, 
https://substackcdn.com/image/fetch/$s_!w4YD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png 1272w, https://substackcdn.com/image/fetch/$s_!w4YD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e30a306-b62d-4574-a0cb-0cabcffedce2_1922x616.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Again, why is this a big deal? 
Well, because &#8220;we&#8221; have a whole new conceptual data-processing and pipeline architecture available now &#8230; for pretty much any and all Lake House ingestion and processing.</p><ul><li><p><em>Something we didn&#8217;t really have, or didn&#8217;t have much of before.</em></p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aHPa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aHPa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png 424w, https://substackcdn.com/image/fetch/$s_!aHPa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png 848w, https://substackcdn.com/image/fetch/$s_!aHPa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png 1272w, https://substackcdn.com/image/fetch/$s_!aHPa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aHPa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png" width="1456" height="835" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:835,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:229107,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aHPa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png 424w, https://substackcdn.com/image/fetch/$s_!aHPa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png 848w, https://substackcdn.com/image/fetch/$s_!aHPa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png 1272w, https://substackcdn.com/image/fetch/$s_!aHPa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dc0b19c-4539-4e89-8671-c83532408554_1922x1102.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I think I&#8217;ve been pontificating enough at this point, and I would like to simply show you, on my personal AWS and Databricks accounts, how we can think about changing the data processing paradigm in the Lake House now.</p><ul><li><p>Simplicity, don&#8217;t forget that part. What I&#8217;m most excited about is the simplicity of this architecture.</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Data Engineering Central is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>Using DuckDB + AWS Lambdas + Databricks Delta Lake</h2><p>Ok, so there is most likely no more basic and common data ingestion pattern on a Databricks Lake House platform than the simple ingestion of CSV files into some <a href="https://dataengineeringcentral.substack.com/p/medallion-architecture-truth-or-fiction">Medallion Architecture</a>.</p><p>Typically, this would be done with a Databricks Job, using Spark. Not a big deal, except for your compute bill at the end of the month, AND the simple fact that even processing a few hundred CSVs every day is simply a little Spark overkill. You don&#8217;t exactly need distributed compute to do that.</p><p>So, what if we could just have some AWS Lambdas with triggers watching some S3 bucket(s) where the CSV files land, and let DuckDB inside the Lambdas write that data ingestion into the Lake House concurrently and at scale, without any worries?</p><blockquote><p>Talk about simple. And fun.</p></blockquote><div><hr></div><h4>First, we need a Catalog Managed Delta Table.</h4><p>( <a href="https://github.com/danielbeach/duckdb_with_unity_catalog_commits">all code on GitHub</a> )</p><p>So, I&#8217;ve already got some Delta Lake tables banging around in my Databricks account. 
We will use the old and trusty&nbsp;<a href="https://divvy-tripdata.s3.amazonaws.com/index.html">Divvy Bike trips table and use some of those open-source datasets/CSVs</a>&nbsp;for our little project.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!30Eg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!30Eg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png 424w, https://substackcdn.com/image/fetch/$s_!30Eg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png 848w, https://substackcdn.com/image/fetch/$s_!30Eg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png 1272w, https://substackcdn.com/image/fetch/$s_!30Eg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!30Eg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png" width="1456" height="544" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:544,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:155866,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!30Eg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png 424w, https://substackcdn.com/image/fetch/$s_!30Eg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png 848w, https://substackcdn.com/image/fetch/$s_!30Eg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png 1272w, https://substackcdn.com/image/fetch/$s_!30Eg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe547141-fe0f-4e52-8972-ab6e2009d0e3_1992x744.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>First, let&#8217;s run a command to upgrade our nasty old Delta Lake so it can handle new-fangled stuff.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Avsn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Avsn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png 424w, 
https://substackcdn.com/image/fetch/$s_!Avsn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png 848w, https://substackcdn.com/image/fetch/$s_!Avsn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png 1272w, https://substackcdn.com/image/fetch/$s_!Avsn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Avsn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png" width="1456" height="279" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:279,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:75766,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Avsn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png 424w, https://substackcdn.com/image/fetch/$s_!Avsn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png 848w, https://substackcdn.com/image/fetch/$s_!Avsn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png 1272w, https://substackcdn.com/image/fetch/$s_!Avsn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882b96f7-04e1-4c84-aaaf-b1a977be20f2_2004x384.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Next, we need two things from our Catalog, whether that&#8217;s OSS Unity or Databricks Unity.</p><ul><li><p>Workspace URL</p></li><li><p>Personal access token (PAT)</p></li></ul><p>This is how DuckDB, or whatever engine you prefer, will authenticate to our Lake House from inside an AWS Lambda.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import os

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]</code></pre></div><p>Also, let&#8217;s build a Docker image from a Dockerfile and push it to AWS ECR so the Lambda can run from it.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;dockerfile&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-dockerfile">FROM --platform=linux/amd64 public.ecr.aws/lambda/python:3.12

RUN pip install --no-cache-dir uv &amp;&amp; \
    uv pip install --system --no-cache "duckdb"

RUN mkdir -p /opt/duckdb_extensions &amp;&amp; \
    python -c "import duckdb; c=duckdb.connect(config={'extension_directory':'/opt/duckdb_extensions'}); c.execute('INSTALL unity_catalog'); c.execute('INSTALL httpfs'); c.execute('INSTALL aws'); c.close()"

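# NOTE: build args and ENV values are stored in the image metadata and can be
# read back with `docker history` / `docker inspect`. Fine for a demo; for
# anything real, set these in the Lambda function configuration instead.
ARG DATABRICKS_HOST
ARG DATABRICKS_TOKEN

ENV DATABRICKS_HOST=${DATABRICKS_HOST}
ENV DATABRICKS_TOKEN=${DATABRICKS_TOKEN}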
ENV DUCKDB_EXT_DIR=/opt/duckdb_extensions

COPY main.py ${LAMBDA_TASK_ROOT}

CMD ["main.lambda_handler"]</code></pre></div><p>Nothing much there, plain and simple. The Lambda code itself is as uncomplicated as they come. This is the sort of thing we should all default to.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import os
import shutil
import logging
import duckdb
from urllib.parse import unquote_plus

logger = logging.getLogger()
logger.setLevel(logging.INFO)

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]
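_OPT_EXT_DIR = "/opt/duckdb_extensions"
DUCKDB_EXT_DIR = "/tmp/duckdb_extensions"

# The extensions are baked into the image under /opt, but /tmp is the only
# writable path inside Lambda, so copy them there once per cold start.
if not os.path.exists(DUCKDB_EXT_DIR):
    shutil.copytree(_OPT_EXT_DIR, DUCKDB_EXT_DIR)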


def lambda_handler(event, context):
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = unquote_plus(record["s3"]["object"]["key"])

    if not key.startswith("trips/"):
        logger.info("Skipping key outside trips/ prefix: %s", key)
        return {"statusCode": 200, "skipped": True}

    s3_path = f"s3://{bucket}/{key}"
    logger.info("Processing %s", s3_path)

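    conn = duckdb.connect(config={"extension_directory": DUCKDB_EXT_DIR})
    # delta is pulled from the core_nightly repository at runtime, so this
    # downloads on every invocation; the other extensions ship in the image.
    conn.execute("FORCE INSTALL delta FROM core_nightly")
    conn.execute("LOAD delta")
    conn.execute("LOAD unity_catalog")
    conn.execute("LOAD httpfs")
    conn.execute("LOAD aws")

    # credential_chain resolves AWS credentials the standard SDK way, which
    # inside Lambda means the function's execution role.
    conn.execute("CREATE SECRET (TYPE s3, PROVIDER credential_chain)")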

    conn.execute(f"""
        CREATE SECRET (
            TYPE     unity_catalog,
            TOKEN    '{DATABRICKS_TOKEN}',
            ENDPOINT '{DATABRICKS_HOST}'
        )
    """)
    conn.execute("""
        ATTACH 'confessions' AS uc_catalog
            (TYPE unity_catalog, DEFAULT_SCHEMA 'default')
    """)
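    # DuckDB autocommits unless a transaction is opened explicitly, and a bare
    # COMMIT with no active transaction raises an error, so open one first.
    conn.execute("BEGIN TRANSACTION")
    rows = conn.execute(f"""
        INSERT INTO uc_catalog.default.trips
        SELECT * FROM read_csv('{s3_path}')
    """).rowcount
    conn.execute("COMMIT")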
    conn.close()

    logger.info("Inserted %d rows from %s", rows, s3_path)
    return {"statusCode": 200, "rows_inserted": rows}
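

# --- Sketch, not part of the original handler ---
# The "Records" field of an S3 notification is a list, while lambda_handler
# above only reads the first entry. iter_trip_paths is a hypothetical helper
# (the name and the prefix default are assumptions for illustration) that
# yields an s3:// path for every record under the prefix, so the handler
# could loop over it instead of indexing [0].
def iter_trip_paths(event, prefix="trips/"):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in notifications (spaces arrive as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(prefix):
            yield f"s3://{bucket}/{key}"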
</code></pre></div><p>Then all we needed was the actual ECR image, the Lambda, the S3 Trigger, some files, and a sprinkle of testing. First, the ECR.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;3fdedfc2-98d5-4d04-b86e-5d6f54f96552&quot;,&quot;duration&quot;:null}"></div><p>Once we have built and pushed the ECR image, we can get Lambda up and running.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;1e8ef0b5-3a2a-4aa0-a381-a16abc683a6f&quot;,&quot;duration&quot;:null}"></div><p>After the S3 file trigger is attached to Lambda, it&#8217;s just pushing up some files.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;e445890e-eafb-4260-bcc4-ae9f90f45f0f&quot;,&quot;duration&quot;:null}"></div><p>Of course, you and I both knew it wasn&#8217;t going to work; come on, admit it. Hope springs eternal, they say. 
I did have my dreams set on this working, but this early in the game, I figured something would go wrong.</p><p>That&#8217;s just experience.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oszq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oszq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png 424w, https://substackcdn.com/image/fetch/$s_!oszq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png 848w, https://substackcdn.com/image/fetch/$s_!oszq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png 1272w, https://substackcdn.com/image/fetch/$s_!oszq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oszq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png" width="1133" height="415" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:415,&quot;width&quot;:1133,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:106914,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oszq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png 424w, https://substackcdn.com/image/fetch/$s_!oszq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png 848w, https://substackcdn.com/image/fetch/$s_!oszq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png 1272w, https://substackcdn.com/image/fetch/$s_!oszq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F375e474c-8111-4d1d-b74c-dfadd5295da0_1133x415.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I piddled around trying to figure out what&#8217;s going on, but clearly, we got some tom-foolery going on in the Delta Kernel. 
Funny, it seemed to work fine on OSS Unity on both the DuckDB and OSS Delta blogs, but who knows what they were exactly doing that I&#8217;m not.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://duckdb.org/2026/05/07/delta-uc-updates" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bSDs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png 424w, https://substackcdn.com/image/fetch/$s_!bSDs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png 848w, https://substackcdn.com/image/fetch/$s_!bSDs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png 1272w, https://substackcdn.com/image/fetch/$s_!bSDs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bSDs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png" width="886" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:886,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:79947,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://duckdb.org/2026/05/07/delta-uc-updates&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199350247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bSDs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png 424w, https://substackcdn.com/image/fetch/$s_!bSDs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png 848w, https://substackcdn.com/image/fetch/$s_!bSDs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png 1272w, https://substackcdn.com/image/fetch/$s_!bSDs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe3cedc4-6f70-4ee7-886f-be55de7af0ce_886x400.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Anywho, you and I at least get the picture of what&#8217;s coming. I&#8217;m sure, based on what I know about the Delta folk and DuckDB, it shouldn&#8217;t be long before things get a little bit more firmed up and wrinkles worked out.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/delta-lake-duckdb-catalog-commits?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! 
This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/delta-lake-duckdb-catalog-commits?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/delta-lake-duckdb-catalog-commits?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p></p><div id="youtube2-WI0knF42wHw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;WI0knF42wHw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/WI0knF42wHw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div>]]></content:encoded></item><item><title><![CDATA[How Real Data Engineers Think (Beyond Tools and Hype)]]></title><description><![CDATA[with &#8212; Yordan Ivanov]]></description><link>https://dataengineeringcentral.substack.com/p/how-real-data-engineers-think-beyond</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/how-real-data-engineers-think-beyond</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Wed, 03 Jun 2026 12:45:19 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/195935277/996418e8e5496ace211efda630148d18.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>In this episode of the Data Engineering Central Podcast, I sit down with <a href="https://www.linkedin.com/in/ivanovyordan/">Yordan Ivanov</a>, Head of Data Engineering at a growing fintech company, to talk through what it actually looks like to build and run real data 
platforms in production.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/in/ivanovyordan/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Il9w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png 424w, https://substackcdn.com/image/fetch/$s_!Il9w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png 848w, https://substackcdn.com/image/fetch/$s_!Il9w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png 1272w, https://substackcdn.com/image/fetch/$s_!Il9w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Il9w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png" width="1456" height="721" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:721,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:297960,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/ivanovyordan/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195935277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Il9w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png 424w, https://substackcdn.com/image/fetch/$s_!Il9w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png 848w, https://substackcdn.com/image/fetch/$s_!Il9w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png 1272w, https://substackcdn.com/image/fetch/$s_!Il9w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6769bdc1-262d-4927-8c60-72b4cd90e7ba_1576x780.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Yordan&#8217;s story starts much like mine: early programming, gaming, PHP, Linux servers. What makes this conversation interesting is how he evolved from a generalist software engineer into a data engineering leader without even realizing it at first.</p><blockquote><p>We spend a lot of time digging into what actually matters in modern data engineering, and it&#8217;s not the tools.</p></blockquote><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/how-real-data-engineers-think-beyond?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for 
reading Data Engineering Central! This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/how-real-data-engineers-think-beyond?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/how-real-data-engineers-think-beyond?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p>We talk about:</p><ul><li><p>Why the industry went too far into complexity and is now swinging back toward simplicity</p></li><li><p>The reality of running a data platform at scale (and why most teams waste time maintaining tools instead of delivering value)</p></li><li><p>How to think about migrations the right way without breaking everything</p></li><li><p>The difference between junior, mid, and senior engineers&#8212;and why ambiguity tolerance and impact matter more than coding ability</p></li><li><p>Why &#8220;perfect&#8221; engineering is a trap and how to actually ship work that matters</p></li></ul><p>We also get into AI, and Yordan has one of the more grounded takes you&#8217;ll hear right now. 
Most companies aren&#8217;t even close to ready for AI, and the idea that it&#8217;s replacing engineers anytime soon misses the bigger problem: messy data, unclear metrics, and weak foundations.</p><p><a href="https://www.datagibberish.com/">Check out Yordan&#8217;s Substack below!<br></a></p><div class="embedded-publication-wrap" data-attrs="{&quot;id&quot;:828483,&quot;name&quot;:&quot;Data Gibberish&quot;,&quot;logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!57pD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff67d08b-5df4-4292-a62e-921909a6ce52_1280x1280.png&quot;,&quot;base_url&quot;:&quot;https://www.datagibberish.com&quot;,&quot;hero_text&quot;:&quot;Data Gibberish turns experienced data professionals into well-rounded leaders. This is how you stop being overlooked, work on the problems that matter and get paid what you deserve. Nobody else teaches you this in data.&quot;,&quot;author_name&quot;:&quot;Yordan Ivanov&quot;,&quot;show_subscribe&quot;:true,&quot;logo_bg_color&quot;:&quot;#ffffff&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPublicationToDOMWithSubscribe"><div class="embedded-publication show-subscribe"><a class="embedded-publication-link-part" native="true" href="https://www.datagibberish.com?utm_source=substack&amp;utm_campaign=publication_embed&amp;utm_medium=web"><img class="embedded-publication-logo" src="https://substackcdn.com/image/fetch/$s_!57pD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff67d08b-5df4-4292-a62e-921909a6ce52_1280x1280.png" width="56" height="56" style="background-color: rgb(255, 255, 255);"><span class="embedded-publication-name">Data Gibberish</span><div class="embedded-publication-hero-text">Data Gibberish turns experienced data professionals into well-rounded leaders. 
This is how you stop being overlooked, work on the problems that matter and get paid what you deserve. Nobody else teaches you this in data.</div><div class="embedded-publication-author-name">By Yordan Ivanov</div></a><form class="embedded-publication-subscribe" method="GET" action="https://www.datagibberish.com/subscribe?"><input type="hidden" name="source" value="publication-embed"><input type="hidden" name="autoSubmit" value="true"><input type="email" class="email-input" name="email" placeholder="Type your email..."><input type="submit" class="button primary" value="Subscribe"></form></div></div><p>We also talk about:</p><ul><li><p>How AI is actually used on real teams today (not Twitter hype)</p></li><li><p>Why juniors with AI can be risky without strong processes</p></li><li><p>How to think about code reviews, testing, and slowing down when it matters</p></li></ul><blockquote><p>On top of that, we dig into content creation, Substack, and what it takes to stand out in a world full of generic AI-generated content. Yordan&#8217;s approach is simple: write from real experience or don&#8217;t write at all.</p></blockquote><p><em>This is one of those conversations that cuts through a lot of noise and gets back to fundamentals, how to think, how to build, and how to grow as an engineer in a rapidly changing space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Data Engineering Central is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Databricks Zerobus - Event Streams + Lake House (be gone Kafka)]]></title><description><![CDATA[it's always something you know]]></description><link>https://dataengineeringcentral.substack.com/p/databricks-zerobus-event-streams</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/databricks-zerobus-event-streams</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Mon, 01 Jun 2026 12:03:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Jegn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jegn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jegn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!Jegn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Jegn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Jegn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Jegn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2425550,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Jegn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Jegn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Jegn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Jegn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffcc6ca-00cd-4f0a-a3ca-093a23aa008f_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I hadn&#8217;t thought about it much lately, but depending on your point of view, Kafka is either at the height of its rise or on a slow downward spiral. <em>Maybe both?</em> We do live in the age of abstraction; businesses at large seem to be less willing to pay hordes of Platform Engineers to babysit complex architecture.</p><ul><li><p>But, with the rise of the Vibe Engineer, solving your streaming problems is only one god prompt away.</p></li></ul><p>The (<em>streaming</em>) complexity is seen as a chink in the old armor by many a SaaS vendor, Databricks included. That&#8217;s how the world turns. Someone sees a &#8220;problem&#8221; and says, &#8220;<em>Hey, I can do better. Come on over here, my friend, the water is fine.</em>&#8221;</p><p>Can you blame &#8216;em?</p><blockquote><p>Just like Spark getting its heels nipped at by annoying puppies (<em>DuckDB, Daft, Polars, etc.</em>), the streaming world has seen its share of <a href="https://www.arroyo.dev/">upstarts trying to make streaming easier and less complex</a>.</p></blockquote><p>Then along came a little birdie and whispered sweet nothings into my lonely ear, something mysterious and wonderful &#8230; words like &#8230; &#8220;<em><strong>Databricks &#8230; Streaming &#8230; gRPC &#8230; Delta Lake</strong></em>.&#8221; That was enough to pique my curiosity and make me want to find out more. 
Some people know my weakness(<em>es</em>).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://docs.databricks.com/aws/en/ingestion/zerobus-ingest" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r8Mn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png 424w, https://substackcdn.com/image/fetch/$s_!r8Mn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png 848w, https://substackcdn.com/image/fetch/$s_!r8Mn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png 1272w, https://substackcdn.com/image/fetch/$s_!r8Mn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r8Mn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png" width="1346" height="522" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:522,&quot;width&quot;:1346,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:132316,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://docs.databricks.com/aws/en/ingestion/zerobus-ingest&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r8Mn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png 424w, https://substackcdn.com/image/fetch/$s_!r8Mn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png 848w, https://substackcdn.com/image/fetch/$s_!r8Mn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png 1272w, https://substackcdn.com/image/fetch/$s_!r8Mn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F716c2db8-6d78-4bb4-8622-4d2e24406ce0_1346x522.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p><em><strong>Thanks to <a href="http://www.delta.io/">Delta</a> for sponsoring this newsletter! I use Delta Lake daily, and I believe it represents the future of Data Engineering. Content like this would not be possible without their support. 
Check out <a href="http://www.delta.io/">their website</a> below.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="http://www.delta.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp" width="600" height="123" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:123,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4196,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:&quot;http://www.delta.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div><hr></div><p>I want to take a gander at <a href="https://www.databricks.com/product/data-engineering/lakeflow-connect/zerobus-ingest">Zerobus from Databricks</a>, generally how we can think about it and compare it to streaming tech like Kafka, why it exists, and then the actual reality of trying to play 
with it.</p><ul><li><p>What is it</p></li><li><p>How do you use it</p></li><li><p>Real-life playtime</p></li></ul><p>The truth often lies somewhere in between what we believe. Once you read the documentation and try things for yourself, you may find that things aren&#8217;t what they seem, or maybe they are.</p><blockquote><p><strong>You can&#8217;t know until you put your hand to the plow.</strong></p></blockquote><div><hr></div><h2>Streaming + Lake House</h2><p>So, we need to have a talk about overhead and complexity. But we can also talk about the truth. Streaming and near-real-time systems are becoming increasingly mainstream in data culture. It also goes without saying that, in the current economic and business climate, not everyone is willing to overlook the costs and labor intensity of infrastructure or long-running streaming clusters.</p><blockquote><p><em>One can argue that Kafka is going nowhere, and that is true. Saying otherwise is like saying DuckDB or Polars will kill Spark. No, they won&#8217;t. They just eat at the edges. This is the relationship of Zerobus to Kafka. 
At least for now.</em></p></blockquote><p>The fact that Databricks probably spent an ungodly amount of time and money building Zerobus, when streaming has been around forever and the market is full of offerings &#8230; tells you something.</p><ul><li><p><strong>LakeHouse architecture is here to stay.</strong></p><ul><li><p>It&#8217;s the data layer of choice for modern orgs</p></li></ul></li><li><p><strong>Streaming to a LakeHouse is problematic.</strong></p><ul><li><p>Zerobus makes it easy?</p></li></ul></li></ul><p>With all that by way of introduction, let&#8217;s dive in.</p><div><hr></div><h2>What is Zerobus?</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://docs.databricks.com/aws/en/ingestion/zerobus-overview" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AIg-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png 424w, https://substackcdn.com/image/fetch/$s_!AIg-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png 848w, https://substackcdn.com/image/fetch/$s_!AIg-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png 1272w, https://substackcdn.com/image/fetch/$s_!AIg-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!AIg-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png" width="1456" height="508" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:508,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:172014,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://docs.databricks.com/aws/en/ingestion/zerobus-overview&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AIg-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png 424w, https://substackcdn.com/image/fetch/$s_!AIg-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png 848w, https://substackcdn.com/image/fetch/$s_!AIg-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png 1272w, 
https://substackcdn.com/image/fetch/$s_!AIg-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8e50eec-c170-4ad6-b462-0fe52460f442_1600x558.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This is an interesting take on streaming data, eh? If you come from a world where the streaming platform is handled by a legion of engineers, or a few zealots, and most of the time is spent on tuning, configuration, maintenance, upgrades, etc. &#8230;</p><blockquote><p>Well, then this probably sounds like black magic. 
One would assume, in Databricks fashion, <strong>that is the point.</strong> It&#8217;s rare that Databricks does something at 50% effort, or just does the next boring thing.</p></blockquote><p>You can expect any new product or feature Databricks introduces to try to flip the script and become a serious contender in whatever space it enters.</p><ul><li><p>No doubt there are plenty of large enterprise customers of Databricks who did nothing but complain about the complexity and brittleness of large-scale streaming into their Lake House (<em>probably a Delta-style lakehouse</em>).</p></li></ul><p>What better way to solve streaming in the context of a Lake House than&#8230;</p><ul><li><p><strong>API interface</strong></p><ul><li><p><em>gRPC</em></p></li><li><p><em>REST</em></p></li><li><p><em>OpenTelemetry</em></p></li></ul></li><li><p><strong>Serverless</strong></p></li><li><p><strong>Push Only</strong></p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://docs.databricks.com/aws/en/ingestion/zerobus-overview" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QjX4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png 424w, https://substackcdn.com/image/fetch/$s_!QjX4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png 848w, https://substackcdn.com/image/fetch/$s_!QjX4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png 1272w, 
https://substackcdn.com/image/fetch/$s_!QjX4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QjX4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png" width="1456" height="508" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:508,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:186421,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://docs.databricks.com/aws/en/ingestion/zerobus-overview&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QjX4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png 424w, https://substackcdn.com/image/fetch/$s_!QjX4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png 848w, 
https://substackcdn.com/image/fetch/$s_!QjX4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png 1272w, https://substackcdn.com/image/fetch/$s_!QjX4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12368ee1-5766-49d8-8412-fdb977a10e68_1600x558.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Again, this is about the simplicity of ingesting streaming data directly into a Lake House, without the need for expensive, complex 
third-party tools to operate and maintain.</p><p><em>Yet another simplification of the Modern Data Stack.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kSt8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kSt8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png 424w, https://substackcdn.com/image/fetch/$s_!kSt8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png 848w, https://substackcdn.com/image/fetch/$s_!kSt8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png 1272w, https://substackcdn.com/image/fetch/$s_!kSt8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kSt8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png" width="1456" height="799" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/738cc03c-85d1-4015-9334-16b436784aed_1693x929.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:799,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1540976,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kSt8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png 424w, https://substackcdn.com/image/fetch/$s_!kSt8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png 848w, https://substackcdn.com/image/fetch/$s_!kSt8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png 1272w, https://substackcdn.com/image/fetch/$s_!kSt8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F738cc03c-85d1-4015-9334-16b436784aed_1693x929.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><h2>Learn by doing.</h2><p>We could spend more time pontificating about the concepts and finer details of Zerobus, but I think it would be best to select an extremely simple use case, try it out, and discuss what we see as we go.</p><p>Before we begin, we need to take care of a few things.</p><ul><li><p>Identify our Zerobus Databricks endpoint.</p></li><li><p>Create a target Delta Lake table.</p></li><li><p>Set up Auth and 
permissions.</p></li><li><p>Choose a client SDK and write code.</p></li></ul><div><hr></div><h4>Define your endpoint.</h4><p>To get your Databricks Zerobus endpoint, you only need your Workspace ID and URL. Mine is something like this &#8230;</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">SERVER_ENDPOINT = "https://319592733000122.zerobus.us-west-2.cloud.databricks.com"  # Zerobus endpoint (Workspace ID + region)
DATABRICKS_WORKSPACE_URL = "https://dbc-9a64f31c-25b9.cloud.databricks.com/"  # Workspace URL</code></pre></div><p>Easy enough.</p><div><hr></div><h4>Define your service principal, OAuth secrets, and permissions.</h4><p>Next, you will need (or at least should use) a service principal with OAuth secrets and the proper permissions on the target table. Maybe something like this: <code>delta-streaming</code>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XsN2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XsN2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png 424w, https://substackcdn.com/image/fetch/$s_!XsN2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png 848w, https://substackcdn.com/image/fetch/$s_!XsN2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png 1272w, https://substackcdn.com/image/fetch/$s_!XsN2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XsN2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png" width="1456" height="489" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:489,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:85651,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XsN2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png 424w, https://substackcdn.com/image/fetch/$s_!XsN2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png 848w, https://substackcdn.com/image/fetch/$s_!XsN2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png 1272w, https://substackcdn.com/image/fetch/$s_!XsN2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca71364-1518-4052-8cde-5c6da5a81c0c_1602x538.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>&#8230; OAuth Secrets and Permissions &#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LQ6u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LQ6u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png 424w, 
https://substackcdn.com/image/fetch/$s_!LQ6u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png 848w, https://substackcdn.com/image/fetch/$s_!LQ6u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png 1272w, https://substackcdn.com/image/fetch/$s_!LQ6u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LQ6u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png" width="1456" height="308" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df74b131-7f76-4ec5-9433-19413edde48c_2044x432.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:308,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:129243,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!LQ6u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png 424w, https://substackcdn.com/image/fetch/$s_!LQ6u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png 848w, https://substackcdn.com/image/fetch/$s_!LQ6u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png 1272w, https://substackcdn.com/image/fetch/$s_!LQ6u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf74b131-7f76-4ec5-9433-19413edde48c_2044x432.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Now that we have our OAuth thingys, we have everything we need to start writing the code.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6apa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6apa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png 424w, https://substackcdn.com/image/fetch/$s_!6apa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png 848w, 
https://substackcdn.com/image/fetch/$s_!6apa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png 1272w, https://substackcdn.com/image/fetch/$s_!6apa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6apa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png" width="1456" height="241" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be754038-f185-4d2f-822a-23bd66254ec1_2044x338.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:241,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:91294,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6apa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png 424w, 
https://substackcdn.com/image/fetch/$s_!6apa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png 848w, https://substackcdn.com/image/fetch/$s_!6apa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png 1272w, https://substackcdn.com/image/fetch/$s_!6apa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe754038-f185-4d2f-822a-23bd66254ec1_2044x338.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div><hr></div><h4>Prep your Delta Table, or make one.</h4><p>We will use the <a href="https://divvy-tripdata.s3.amazonaws.com/index.html">Divvy Bike trips</a> open-source dataset, just because it&#8217;s easy and accessible. I&#8217;ve already set up a Delta Table for this dataset, so we'll use it as our target table.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1KHZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1KHZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png 424w, https://substackcdn.com/image/fetch/$s_!1KHZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png 848w, 
https://substackcdn.com/image/fetch/$s_!1KHZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png 1272w, https://substackcdn.com/image/fetch/$s_!1KHZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1KHZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png" width="1456" height="609" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:609,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:234102,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1KHZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png 424w, 
https://substackcdn.com/image/fetch/$s_!1KHZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png 848w, https://substackcdn.com/image/fetch/$s_!1KHZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png 1272w, https://substackcdn.com/image/fetch/$s_!1KHZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F008c1622-f3ec-4851-be84-6cff32092e13_2046x856.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><h4>Write the Zerobus Client.</h4><p>OK, now we are at the point where we can actually start using Zerobus. Databricks gives you a few different SDK options to interact with it, depending on whether you are a fake data engineer, a power user like me, or maybe more of a real data engineer, <a href="https://dataengineeringcentral.substack.com/p/scott-haines-on-the-future-of-data?utm_source=publication-search">like Scott</a>.</p><ul><li><p><em>Oh, I almost forgot that dreaded pip install.</em></p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZuW-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!ZuW-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png 424w, https://substackcdn.com/image/fetch/$s_!ZuW-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png 848w, https://substackcdn.com/image/fetch/$s_!ZuW-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png 1272w, https://substackcdn.com/image/fetch/$s_!ZuW-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZuW-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png" width="1456" height="275" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e7527059-02a8-472d-8e35-213bf1663662_2040x386.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:275,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:147060,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZuW-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png 424w, https://substackcdn.com/image/fetch/$s_!ZuW-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png 848w, https://substackcdn.com/image/fetch/$s_!ZuW-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png 1272w, https://substackcdn.com/image/fetch/$s_!ZuW-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7527059-02a8-472d-8e35-213bf1663662_2040x386.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>We are going to be combining Zerobus with PyArrow, the Python bindings for Apache Arrow, <a href="https://www.confessionsofadataguy.com/apache-arrow-as-data-interchange/">because Arrow is the next big thing</a>, you ninny.</p><p>Prep our Arrow schema.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import pyarrow as pa

schema = pa.schema([
    ("ride_id", pa.string()),
    ("rideable_type", pa.string()),
    ("started_at", pa.timestamp("ms")),
    ("ended_at", pa.timestamp("ms")),
    ("start_station_name", pa.string()),
    ("start_station_id", pa.string()),
    ("end_station_name", pa.string()),
    ("end_station_id", pa.string()),
    ("start_lat", pa.float64()),
    ("start_lng", pa.float64()),
    ("end_lat", pa.float64()),
    ("end_lng", pa.float64()),
    ("member_casual", pa.string()),
])</code></pre></div><p>Anywho, next we write the rest of the code.</p><ul><li><p>Create Stream</p></li><li><p>Read dataset</p></li><li><p>Push batches to Stream</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Uw2q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Uw2q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png 424w, https://substackcdn.com/image/fetch/$s_!Uw2q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png 848w, https://substackcdn.com/image/fetch/$s_!Uw2q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!Uw2q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Uw2q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png" width="1456" height="877" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:877,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:364297,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Uw2q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png 424w, https://substackcdn.com/image/fetch/$s_!Uw2q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png 848w, https://substackcdn.com/image/fetch/$s_!Uw2q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!Uw2q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0beb39ef-12f4-4cf7-8e9d-2eb5286cbef3_2046x1232.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Y&#8217;all knew as well as me it wasn&#8217;t going to work the first time. I never lie to you, my fair-weather friends. 
Call me Honest Abe.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O9He!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O9He!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png 424w, https://substackcdn.com/image/fetch/$s_!O9He!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png 848w, https://substackcdn.com/image/fetch/$s_!O9He!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png 1272w, https://substackcdn.com/image/fetch/$s_!O9He!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O9He!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png" width="1456" height="229" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:229,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:139006,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!O9He!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png 424w, https://substackcdn.com/image/fetch/$s_!O9He!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png 848w, https://substackcdn.com/image/fetch/$s_!O9He!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png 1272w, https://substackcdn.com/image/fetch/$s_!O9He!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd01dc173-ab95-4124-91f7-8ad2f8461d82_2038x320.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>I see no <a href="https://docs.databricks.com/aws/en/ingestion/zerobus-overview">mention in the docs</a> about not being able to use Serverless. 
Basically, I can&#8217;t even hit that Zerobus endpoint from my Notebook on Serverless. Not much to do besides attempt the same code on an All-Purpose cluster and see what&#8217;s crackin&#8217;.</p><ul><li><p><em>Ah ha! So we get farther when we don&#8217;t use Serverless. Maybe Serverless just isn&#8217;t an option, and they don&#8217;t tell you.</em></p></li></ul><p>Using an All-Purpose cluster, we get an actual error about our schema from the Zerobus SDK, so that&#8217;s a good sign.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xLfG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xLfG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png 424w, https://substackcdn.com/image/fetch/$s_!xLfG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png 848w, https://substackcdn.com/image/fetch/$s_!xLfG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png 1272w, https://substackcdn.com/image/fetch/$s_!xLfG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!xLfG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png" width="1456" height="360" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:360,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:250627,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xLfG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png 424w, https://substackcdn.com/image/fetch/$s_!xLfG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png 848w, https://substackcdn.com/image/fetch/$s_!xLfG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png 1272w, https://substackcdn.com/image/fetch/$s_!xLfG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75e36b64-e13d-404d-ac00-f74a3e20c825_2014x498.png 
1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>I got an easy fix for this, Sunny Jim. Just make all the fields STRING; we gotta break out our inner Junior Dev mindset here. Channel the vibes.</p><div class="callout-block" data-callout="true"><p>Ok, Ok, we are moving on down the Error list, this is a good sign. I can see messages that the Zerobus SDK created a new Arrow Flight stream connection to our table.</p></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gYyE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gYyE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png 424w, https://substackcdn.com/image/fetch/$s_!gYyE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png 848w, https://substackcdn.com/image/fetch/$s_!gYyE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png 1272w, https://substackcdn.com/image/fetch/$s_!gYyE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gYyE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png" width="1456" height="674" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:360601,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gYyE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png 424w, https://substackcdn.com/image/fetch/$s_!gYyE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png 848w, https://substackcdn.com/image/fetch/$s_!gYyE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png 1272w, https://substackcdn.com/image/fetch/$s_!gYyE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166cdafb-8ce3-406a-a41f-c87b2c850c82_2048x948.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Dang, add a little Arrow <code>batch.cast</code>, and we are off to the races! 
It works!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HbfP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HbfP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png 424w, https://substackcdn.com/image/fetch/$s_!HbfP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png 848w, https://substackcdn.com/image/fetch/$s_!HbfP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!HbfP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HbfP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png" width="1456" height="881" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:881,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:455713,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HbfP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png 424w, https://substackcdn.com/image/fetch/$s_!HbfP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png 848w, https://substackcdn.com/image/fetch/$s_!HbfP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!HbfP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48bab7cb-14f6-4149-b4cf-f48ac6a7b365_2036x1232.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Let&#8217;s take one of the 140,000 records we just streamed to our Delta Table with Zerobus and see if we can find it. 
<strong>Hot Dog, ain&#8217;t it a beauty?</strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VMYR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VMYR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png 424w, https://substackcdn.com/image/fetch/$s_!VMYR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png 848w, https://substackcdn.com/image/fetch/$s_!VMYR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png 1272w, https://substackcdn.com/image/fetch/$s_!VMYR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VMYR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png" width="1456" height="359" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:359,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:111043,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VMYR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png 424w, https://substackcdn.com/image/fetch/$s_!VMYR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png 848w, https://substackcdn.com/image/fetch/$s_!VMYR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png 1272w, https://substackcdn.com/image/fetch/$s_!VMYR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe650480-164a-4d5c-a7fa-c55e39ee58c9_2084x514.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div><hr></div><div class="captioned-button-wrap" 
data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/databricks-zerobus-event-streams?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/databricks-zerobus-event-streams?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/databricks-zerobus-event-streams?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><h4>Thinking about things.</h4><p>There are a few things I&#8217;m mildly curious about. Namely, you have to imagine that the number of parquet files created could get a little crazy, depending on your settings, so you&#8217;d better stay on top of that whole OPTIMIZE crap, or tune your streaming batches.</p><ul><li><p>Also, just in case you think I&#8217;m lying to you, here is the History of that table. 
You can see the Zerobus ingestion there.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XNOw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XNOw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png 424w, https://substackcdn.com/image/fetch/$s_!XNOw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png 848w, https://substackcdn.com/image/fetch/$s_!XNOw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png 1272w, https://substackcdn.com/image/fetch/$s_!XNOw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XNOw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png" width="1456" height="567" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:567,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:129566,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XNOw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png 424w, https://substackcdn.com/image/fetch/$s_!XNOw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png 848w, https://substackcdn.com/image/fetch/$s_!XNOw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png 1272w, https://substackcdn.com/image/fetch/$s_!XNOw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc431267e-65e1-4414-9a90-5cb3eb57804f_2084x812.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here we have 8 files in this Delta Table after the streaming job, but before OPTIMIZE.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ac4m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ac4m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png 424w, 
https://substackcdn.com/image/fetch/$s_!ac4m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png 848w, https://substackcdn.com/image/fetch/$s_!ac4m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png 1272w, https://substackcdn.com/image/fetch/$s_!ac4m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ac4m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png" width="1456" height="373" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:373,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:102465,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!ac4m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png 424w, https://substackcdn.com/image/fetch/$s_!ac4m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png 848w, https://substackcdn.com/image/fetch/$s_!ac4m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png 1272w, https://substackcdn.com/image/fetch/$s_!ac4m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5ee9af-7a47-460d-91d0-9109b48d5f3b_2084x534.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>After running OPTIMIZE, we are down to a single parquet file backing the Delta Table instead of 8. One can imagine how carefully one would have to manage real-time, high-volume streaming ingestion into a Lake House. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dIIs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dIIs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png 424w, https://substackcdn.com/image/fetch/$s_!dIIs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png 848w, https://substackcdn.com/image/fetch/$s_!dIIs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png 1272w, https://substackcdn.com/image/fetch/$s_!dIIs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!dIIs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png" width="1456" height="373" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:373,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96080,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dIIs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png 424w, https://substackcdn.com/image/fetch/$s_!dIIs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png 848w, https://substackcdn.com/image/fetch/$s_!dIIs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png 1272w, https://substackcdn.com/image/fetch/$s_!dIIs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf8ec2f-be2d-4b49-aa81-2fb7ecf3249d_2084x534.png 
1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p><em>Again, at production scale, many hundreds of TBs or more, this would require some serious thinking.</em></p></li></ul><div><hr></div><h3>Closing Comments.</h3><p>Now that we&#8217;ve done a little dirty work, I want to spend a minute talking about what we&#8217;ve learned and the questions we might have about streaming to the Lake House with Zerobus from Databricks.</p><p>Firstly, we must admit two things that have been happening at large in the data world &#8230;</p><ol><li><p><em>Streaming has become popular</em></p></li><li><p><em>The Lake House is the new 
data architecture of choice</em></p></li></ol><p>Also, we would all have to admit that streaming is probably one of the most complicated tasks one can tackle in Data Engineering. It&#8217;s a whole different animal than batch. Also, many companies and tools seem interested in simplifying streaming, bringing it to the masses, and lowering the barriers to entry.</p><p>Yeah, I know we just played with a toy example, but one has to admit, <strong>Zerobus is pretty slick. </strong></p><ul><li><p>No big infrastructure to maintain and tune</p></li><li><p>SDKs are simple and straightforward</p></li><li><p>Integration into the Lake House is a breeze</p></li></ul><p>Methinks the streaming complexity problem, which previously required bespoke, custom setups via Kafka and Spark Streaming, was just a little much for some folks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://delta.io/blog/write-kafka-stream-to-delta-lake/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9iWJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png 424w, https://substackcdn.com/image/fetch/$s_!9iWJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png 848w, https://substackcdn.com/image/fetch/$s_!9iWJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png 1272w, 
https://substackcdn.com/image/fetch/$s_!9iWJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9iWJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png" width="1456" height="369" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:369,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:113534,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://delta.io/blog/write-kafka-stream-to-delta-lake/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/176765943?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9iWJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png 424w, https://substackcdn.com/image/fetch/$s_!9iWJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png 848w, 
https://substackcdn.com/image/fetch/$s_!9iWJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png 1272w, https://substackcdn.com/image/fetch/$s_!9iWJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db7f441-d4a2-4421-bc60-897fe79a1d73_2012x510.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I mean, it does add a lot of complexity and breakpoints.</p><blockquote><p>Anywho, now we have a new and wonderful option in Zerobus 
if we are interested in Lake House streams.</p></blockquote><p>Based on our preliminary playing around, it appears to be a very serious contender. Heck, they even have support for Arrow! That was a pleasant surprise.</p><p><strong>If you have any experience streaming to a Lake House, drop a comment below and tell us about your setup!</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/databricks-zerobus-event-streams/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/databricks-zerobus-event-streams/comments"><span>Leave a comment</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Data, AI, and DuckDB]]></title><description><![CDATA[with Jacob Matson]]></description><link>https://dataengineeringcentral.substack.com/p/data-ai-and-duckdb</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/data-ai-and-duckdb</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Wed, 27 May 2026 12:03:48 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/195885885/1332c9e90aaa4bec32e264be1f9469e4.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>In this episode of the Data Engineering Central Podcast, I sit down with <a href="https://www.linkedin.com/in/jacobmatson/">Jacob Matson</a>, <a href="https://motherduck.com/authors/jacob-matson/">Developer Advocate at MotherDuck</a>, to unpack one of the most interesting shifts happening in data engineering right now.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/in/jacobmatson/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!omcW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png 424w, https://substackcdn.com/image/fetch/$s_!omcW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png 848w, https://substackcdn.com/image/fetch/$s_!omcW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png 1272w, https://substackcdn.com/image/fetch/$s_!omcW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!omcW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png" width="1456" height="510" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:510,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:262213,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/jacobmatson/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195885885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!omcW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png 424w, https://substackcdn.com/image/fetch/$s_!omcW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png 848w, https://substackcdn.com/image/fetch/$s_!omcW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png 1272w, https://substackcdn.com/image/fetch/$s_!omcW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd8f6d7e-3bbe-41c7-ae69-006f4dca2bcd_1586x556.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Jacob didn&#8217;t start in tech the way most people expect. He began in accounting, working with Excel and financial systems, before slowly realizing that the real problem he loved solving wasn&#8217;t finance, it was data pipelines. 
That path eventually led him deep into SQL Server, data warehousing, and ultimately to DuckDB, a tool that fundamentally changed how he thought about processing data.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/data-ai-and-duckdb?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/data-ai-and-duckdb?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><ul><li><p>What we get into is bigger than just tools, though.</p></li></ul><blockquote><p><em>We talk about why DuckDB exploded in popularity, what it gets right that traditional databases and even modern cloud warehouses struggle with, and why the industry may be swinging back toward simplicity after years of over-engineered &#8220;modern data stacks.&#8221;</em></p></blockquote><p>There&#8217;s a really interesting thread here around how engineers accidentally created too much complexity, and now tools like DuckDB are winning by removing it.</p><p>We also go deep on the evolution of the data stack itself. From SQL Server&#8217;s &#8220;everything in one box&#8221; model, to the unbundled chaos of the modern stack, and now back toward a more unified, simpler approach. 
Jacob shares how MotherDuck is thinking about that shift and where things are headed next.</p><ul><li><p>One of the more important parts of this conversation is around AI.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://github.com/matsonj" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5XS5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png 424w, https://substackcdn.com/image/fetch/$s_!5XS5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png 848w, https://substackcdn.com/image/fetch/$s_!5XS5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png 1272w, https://substackcdn.com/image/fetch/$s_!5XS5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5XS5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png" width="1456" height="603" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:603,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:802961,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://github.com/matsonj&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195885885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5XS5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png 424w, https://substackcdn.com/image/fetch/$s_!5XS5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png 848w, https://substackcdn.com/image/fetch/$s_!5XS5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png 1272w, https://substackcdn.com/image/fetch/$s_!5XS5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d58f7fd-fa30-4c1a-b7d0-6bc4b554714d_2750x1138.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There&#8217;s a strong argument here that AI doesn&#8217;t kill data engineering; it massively expands it. Instead of fewer queries being written, we may be heading toward a world where AI agents generate orders of magnitude more queries than humans ever could. 
That flips a lot of assumptions on their head, especially around things like data modeling, which suddenly becomes more important, not less.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/subscribe?"><span>Subscribe now</span></a></p><p>We also talk about:</p><ul><li><p>Why most Spark workloads are overkill</p></li><li><p>When single-node tools like DuckDB actually win</p></li><li><p>The real tradeoffs behind Lakehouse architectures</p></li><li><p>Why data modeling is still critical in an AI-driven world</p></li><li><p>How engineers should think about building in 2026 and beyond</p></li></ul><blockquote><p><strong>This is one of those conversations that helps you zoom out and see where things are actually going, not just what tools are trending this week.</strong></p></blockquote><p>If you&#8217;re building data platforms, experimenting with AI, or just trying to simplify your stack, this one is worth your time.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/data-ai-and-duckdb?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! 
This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/data-ai-and-duckdb?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/data-ai-and-duckdb?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[Benchmarking Vortex File Format ... vs Parquet, CSV ... vs DuckDB, Polars, Datafusion.]]></title><description><![CDATA[just because we want to make em' mad]]></description><link>https://dataengineeringcentral.substack.com/p/benchmarking-vortex-file-format-vs</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/benchmarking-vortex-file-format-vs</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Mon, 25 May 2026 19:49:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rCLU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rCLU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rCLU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png 424w, 
https://substackcdn.com/image/fetch/$s_!rCLU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png 848w, https://substackcdn.com/image/fetch/$s_!rCLU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png 1272w, https://substackcdn.com/image/fetch/$s_!rCLU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rCLU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png" width="1200" height="700" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:700,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38114,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!rCLU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png 424w, https://substackcdn.com/image/fetch/$s_!rCLU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png 848w, https://substackcdn.com/image/fetch/$s_!rCLU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png 1272w, https://substackcdn.com/image/fetch/$s_!rCLU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F035d92dc-992c-43b6-a683-b0aa4ab8a43d_1200x700.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If we had a dollar for every time someone came along to become the Apache Parquet killer, we would all be living on the side of a mountain tending to our alpacas. A boy can dream, can&#8217;t he?</p><p>Because it&#8217;s a holiday Monday, and I&#8217;ve been up since dawn running amok on the river, and now I&#8217;m lying on the porch with a cool breeze, it&#8217;s time we put our hand to the plow and turn up something interesting.</p><blockquote><p>Today, that will be the <a href="https://docs.vortex.dev/">Vortex file format</a>.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://docs.vortex.dev/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sCW9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png 424w, https://substackcdn.com/image/fetch/$s_!sCW9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png 848w, https://substackcdn.com/image/fetch/$s_!sCW9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png 1272w, 
https://substackcdn.com/image/fetch/$s_!sCW9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sCW9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png" width="1456" height="453" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:453,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:92930,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://docs.vortex.dev/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sCW9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png 424w, https://substackcdn.com/image/fetch/$s_!sCW9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png 848w, 
https://substackcdn.com/image/fetch/$s_!sCW9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png 1272w, https://substackcdn.com/image/fetch/$s_!sCW9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92d76888-b172-433a-a1a8-e413c5b9413a_1916x596.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="callout-block" data-callout="true"><ul><li><p>100x faster random access</p></li><li><p>10-20x faster scans</p></li><li><p>5x faster 
writes</p></li><li><p>Similar compression ratio (vs. Apache Parquet)</p></li></ul></div><p>At this point, half of you probably know more about Vortex than I do. I&#8217;ve heard the name rattle around here and there, but never touched it. So, I don&#8217;t expect much today, really just a little poke here and there, a little benchmark to bring the crazies out of the swamp, ya know, the usual stuff.</p><div><hr></div><h2>Vortex file format for ninnies.</h2><p>It isn&#8217;t really apparent to me, from simply poking around in the docs, what niche Vortex is chasing. I mean, if you read between the lines, and they are not hard lines to find, seeing words like &#8220;<em>&#8230; vs Parquet</em>&#8221; or &#8220;<em>compressed columnar data,</em>&#8221; heck, even a &#8220;<em>unlike Arrow &#8230;</em>&#8221; makes me sit up and take note.</p><ul><li><p><strong>Seems like these little buggers are taking potshots at everyone worth anything today.</strong></p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://github.com/vortex-data/vortex" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RVm2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png 424w, https://substackcdn.com/image/fetch/$s_!RVm2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png 848w, https://substackcdn.com/image/fetch/$s_!RVm2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png 1272w, 
https://substackcdn.com/image/fetch/$s_!RVm2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RVm2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png" width="1456" height="513" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:513,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:103731,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://github.com/vortex-data/vortex&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RVm2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png 424w, https://substackcdn.com/image/fetch/$s_!RVm2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png 848w, 
https://substackcdn.com/image/fetch/$s_!RVm2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png 1272w, https://substackcdn.com/image/fetch/$s_!RVm2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe909476b-37e7-4c1b-ac0f-c037672ec20c_1600x564.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The GitHub page actually brings it home a little better.</p><blockquote><p>&#8220;<em>Vortex is a next-generation columnar file format 
and toolkit designed for high-performance data processing. It is the fastest and most extensible format for building data systems backed by object storage.</em>&#8221;<br> - <a href="https://github.com/vortex-data/vortex">GitHub</a></p></blockquote><p>It also boasts some impressive features that stand out to me beyond what has already been mentioned.</p><ul><li><p>appears to be written in Rust</p></li><li><p>&#8220;Zero-copy compatibility with Apache Arrow&#8221;</p></li><li><p>Arrow, DataFusion, DuckDB, Spark, Pandas, Polars, &amp; more</p></li><li><p>Modeled after Apache DataFusion&#8217;s extensible approach</p></li></ul><p>So, we shall see, eh?</p><div><hr></div><h3>Python + Vortex and benchmarking performance.</h3><p>Well, if any tool worth its salt wants to make a go of the data community, it must have first-class Python support; otherwise, all is for naught. 
Rustaceans might cringe, but that&#8217;s simply the way of the world.</p><p>Let us kill those two birds with one stone.</p><ul><li><p>Try out the Python integrations with DuckDB, Polars, and DataFusion.</p></li><li><p>Benchmark those against each other, plus against CSV, Parquet, maybe Lance?</p></li></ul><blockquote><p><em>We can see what it&#8217;s like to use Vortex inside Python code, and see whether there is any indication that its performance is a thing or not.</em></p></blockquote><p>Remember, if you leave me nasty comments about scientific benchmarking, there is a high probability that I will make fun of you in front of the 35,000 people who follow me on this platform. Just saying. Take your chances.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NTOC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NTOC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png 424w, https://substackcdn.com/image/fetch/$s_!NTOC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png 848w, https://substackcdn.com/image/fetch/$s_!NTOC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png 1272w, 
https://substackcdn.com/image/fetch/$s_!NTOC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NTOC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png" width="1400" height="484" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:484,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:129861,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NTOC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png 424w, https://substackcdn.com/image/fetch/$s_!NTOC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png 848w, 
https://substackcdn.com/image/fetch/$s_!NTOC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png 1272w, https://substackcdn.com/image/fetch/$s_!NTOC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F684c1550-f267-4917-8ebc-4f0af91c678d_1400x484.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Next, let us grab some data to use. 
We will use the Backblaze open-source hard drive dataset about failure rates.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j_5f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png 424w, https://substackcdn.com/image/fetch/$s_!j_5f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png 848w, https://substackcdn.com/image/fetch/$s_!j_5f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png 1272w, https://substackcdn.com/image/fetch/$s_!j_5f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j_5f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png" width="1456" height="249" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:249,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58569,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j_5f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png 424w, https://substackcdn.com/image/fetch/$s_!j_5f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png 848w, https://substackcdn.com/image/fetch/$s_!j_5f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png 1272w, https://substackcdn.com/image/fetch/$s_!j_5f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551f0eb6-d14e-4036-b650-6f3e70c37359_1706x292.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Let&#8217;s take these two zip files, each containing two quarters of data, totaling about 
<strong>23.91 GB on disk</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ga_7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ga_7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png 424w, https://substackcdn.com/image/fetch/$s_!ga_7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png 848w, https://substackcdn.com/image/fetch/$s_!ga_7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png 1272w, https://substackcdn.com/image/fetch/$s_!ga_7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ga_7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png" width="1364" height="342" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:1364,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:84093,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ga_7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png 424w, https://substackcdn.com/image/fetch/$s_!ga_7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png 848w, https://substackcdn.com/image/fetch/$s_!ga_7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png 1272w, https://substackcdn.com/image/fetch/$s_!ga_7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec0e9c8d-026f-4b9c-88aa-71a51ce41b14_1364x342.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>This should be enough for our purposes.</p></li></ul><p>First, let&#8217;s run a simple DuckDB query on these raw CSV files to get a baseline, then we will convert to Vortex and test again.</p><ul><li><p><a href="https://github.com/danielbeach/benchmarkingVortex">All this code is on GitHub, you hobbit.</a></p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-2US!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!-2US!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png 424w, https://substackcdn.com/image/fetch/$s_!-2US!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png 848w, https://substackcdn.com/image/fetch/$s_!-2US!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!-2US!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-2US!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png" width="1400" height="1266" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1266,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:293326,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-2US!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png 424w, https://substackcdn.com/image/fetch/$s_!-2US!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png 848w, https://substackcdn.com/image/fetch/$s_!-2US!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!-2US!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ced6f2d-b930-4b37-bdee-3140648ecf74_1400x1266.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">Total failure days: 181
Runtime: 25.465s</code></pre></div><p>Ok, longer than I thought. Let&#8217;s run the exact same code on raw CSV with both Polars and DataFusion before we switch to Vortex, and then test Parquet and Lance as well.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3eNd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3eNd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png 424w, https://substackcdn.com/image/fetch/$s_!3eNd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png 848w, https://substackcdn.com/image/fetch/$s_!3eNd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png 1272w, https://substackcdn.com/image/fetch/$s_!3eNd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3eNd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png" width="1400" height="1228" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/add5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1228,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:288735,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3eNd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png 424w, https://substackcdn.com/image/fetch/$s_!3eNd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png 848w, https://substackcdn.com/image/fetch/$s_!3eNd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png 1272w, https://substackcdn.com/image/fetch/$s_!3eNd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadd5fe06-4fe5-4472-8807-dd041418e4ca_1400x1228.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>Oh &#8230; what do you flipping know, another one of those mysterious Polars failures that doesn&#8217;t exist, and you are anathema and a heretic for even saying it.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JfDm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!JfDm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png 424w, https://substackcdn.com/image/fetch/$s_!JfDm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png 848w, https://substackcdn.com/image/fetch/$s_!JfDm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png 1272w, https://substackcdn.com/image/fetch/$s_!JfDm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JfDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png" width="1456" height="989" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:989,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:333164,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JfDm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png 424w, https://substackcdn.com/image/fetch/$s_!JfDm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png 848w, https://substackcdn.com/image/fetch/$s_!JfDm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png 1272w, https://substackcdn.com/image/fetch/$s_!JfDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5a44f04-7028-4565-92d9-91bde77ab3e6_1934x1314.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I highly suggest <a href="https://dataengineeringcentral.substack.com/p/why-im-replacing-polars-with-duckdb">you go read this.</a> I also encourage you to Google Polars OOM issues yourself. Don&#8217;t let that powerful and slinking Nazgul scare you away; you will find plenty of evidence.
<em><strong>You can only hide the pea for so long.</strong></em></p><ul><li><p>This is why I had to rip Polars out of production: simply unpredictable and unreliable in an unacceptable way, for tasks that other tools handle with ease.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dataengineeringcentral.substack.com/p/why-im-replacing-polars-with-duckdb" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y2md!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png 424w, https://substackcdn.com/image/fetch/$s_!Y2md!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png 848w, https://substackcdn.com/image/fetch/$s_!Y2md!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png 1272w, https://substackcdn.com/image/fetch/$s_!Y2md!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y2md!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png" width="1456" height="456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:88831,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://dataengineeringcentral.substack.com/p/why-im-replacing-polars-with-duckdb&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y2md!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png 424w, https://substackcdn.com/image/fetch/$s_!Y2md!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png 848w, https://substackcdn.com/image/fetch/$s_!Y2md!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png 1272w, https://substackcdn.com/image/fetch/$s_!Y2md!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08e76509-56e1-405f-9baa-f90796ec5848_1884x590.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Anyway, stick Polars in the ditch, let&#8217;s move on to DataFusion.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WDx_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WDx_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png 424w, 
https://substackcdn.com/image/fetch/$s_!WDx_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png 848w, https://substackcdn.com/image/fetch/$s_!WDx_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png 1272w, https://substackcdn.com/image/fetch/$s_!WDx_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WDx_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png" width="1400" height="1564" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1564,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:374104,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!WDx_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png 424w, https://substackcdn.com/image/fetch/$s_!WDx_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png 848w, https://substackcdn.com/image/fetch/$s_!WDx_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png 1272w, https://substackcdn.com/image/fetch/$s_!WDx_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad3e5363-3d8b-40b8-a08f-bce9dae717ad_1400x1564.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Oh dang, DataFusion is fast, roughly 5x quicker than DuckDB&#8217;s 25.465s on the same raw CSV files.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Total failure days: 181
Runtime: 5.106s</code></pre></div><div><hr></div><h2>Ok, let&#8217;s move to Parquet files.</h2><p>So, let&#8217;s convert all the CSV files to Parquet and re-run those scripts with DuckDB, Polars, and DataFusion on them. See what&#8217;s cracken.</p><ul><li><p>184 CSVs &#8594; 20 Parquet files (<em>10 per quarter, last batch 2 files each</em>).</p></li></ul><p><a href="https://github.com/danielbeach/benchmarkingVortex">Again, all the code is on GitHub</a>, and we converted all our previous scripts to just read Parquet instead of raw CSV. You can go look at the scripts if you please. I&#8217;m just going to show results here.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">  - failures_by_day_duckdb.py &#8212; 0.125s
  - failures_by_day_datafusion.py &#8212; 0.370s
  - failures_by_day_polars.py &#8212; 0.193s</code></pre></div><p>Well, that is quite the difference, eh.</p><div><hr></div><h2>Ok, let&#8217;s move to Vortex files.</h2><p>So, let&#8217;s finally get to what we&#8217;ve been waiting for this whole time: will these packages really integrate well with Vortex, and will the performance be noticeably faster than Parquet?</p><ul><li><p>First, let&#8217;s convert from Parquet to Vortex with DuckDB.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zFMi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zFMi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png 424w, https://substackcdn.com/image/fetch/$s_!zFMi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png 848w, https://substackcdn.com/image/fetch/$s_!zFMi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png 1272w, https://substackcdn.com/image/fetch/$s_!zFMi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!zFMi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png" width="1400" height="1228" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1228,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:307174,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zFMi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png 424w, https://substackcdn.com/image/fetch/$s_!zFMi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png 848w, https://substackcdn.com/image/fetch/$s_!zFMi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png 1272w, 
https://substackcdn.com/image/fetch/$s_!zFMi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd284c16-7a41-4866-8573-2e572731f88c_1400x1228.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>With that done, let&#8217;s run the benchmarks with the same set of tools again for DuckDB, Polars, and Datafusion on Vortex. 
I&#8217;ll show the DuckDB example here; the rest are on GitHub, so I&#8217;ll only print the results for them.</p><p></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">  - vortex_scripts/failures_by_day_vortex.py &#8212; pure Vortex scan with filter pushdown, Runtime: 0.111s
  - vortex_scripts/failures_by_day_duckdb.py &#8212; DuckDB querying VortexDataset (PyArrow interface), Runtime: 0.201s
  - vortex_scripts/failures_by_day_polars.py &#8212; VortexFile.to_polars() LazyFrame, Runtime: 0.114s</code></pre></div><p>FYI, the integrations aren&#8217;t as solid as advertised. The DuckDB Vortex extension blows up with out-of-memory errors. <strong>OOM, just like Polars on the CSV files.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8VGS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8VGS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png 424w, https://substackcdn.com/image/fetch/$s_!8VGS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png 848w, https://substackcdn.com/image/fetch/$s_!8VGS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png 1272w, https://substackcdn.com/image/fetch/$s_!8VGS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8VGS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png" width="1400" height="1526" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1526,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:325033,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8VGS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png 424w, https://substackcdn.com/image/fetch/$s_!8VGS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png 848w, https://substackcdn.com/image/fetch/$s_!8VGS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png 1272w, https://substackcdn.com/image/fetch/$s_!8VGS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3c789ce-4136-488a-8ff6-5bf1d1c726e3_1400x1526.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WDD0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WDD0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png 424w, 
https://substackcdn.com/image/fetch/$s_!WDD0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png 848w, https://substackcdn.com/image/fetch/$s_!WDD0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png 1272w, https://substackcdn.com/image/fetch/$s_!WDD0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WDD0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png" width="1456" height="1071" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1071,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:936277,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!WDD0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png 424w, https://substackcdn.com/image/fetch/$s_!WDD0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png 848w, https://substackcdn.com/image/fetch/$s_!WDD0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png 1272w, https://substackcdn.com/image/fetch/$s_!WDD0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66d805e-da4b-445b-ab70-a02d7b462b3c_1938x1426.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Polars only seems to support reading one Vortex file at a time; I couldn&#8217;t get any globbing patterns to work. On top of that, we convert to PyArrow and Polars right away in that code.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mB61!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mB61!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png 424w, https://substackcdn.com/image/fetch/$s_!mB61!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png 848w, https://substackcdn.com/image/fetch/$s_!mB61!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png 1272w, https://substackcdn.com/image/fetch/$s_!mB61!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!mB61!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png" width="1400" height="1750" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1750,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:374434,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mB61!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png 424w, https://substackcdn.com/image/fetch/$s_!mB61!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png 848w, https://substackcdn.com/image/fetch/$s_!mB61!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png 1272w, 
https://substackcdn.com/image/fetch/$s_!mB61!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179c5efe-dd59-4e91-94bd-2c3a2be4b8c8_1400x1750.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Heck, maybe I&#8217;m doing stuff wrong, but the farther I get into the weeds with Vortex, the more I realize this might be early days yet. 
It seems there are still some problems to solve and wrinkles to iron out in the Python ecosystem.</p><ul><li><p>We could probably make DuckDB work with Vortex by doing what we did with Polars: converting to Arrow right away.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gFGW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gFGW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png 424w, https://substackcdn.com/image/fetch/$s_!gFGW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png 848w, https://substackcdn.com/image/fetch/$s_!gFGW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png 1272w, https://substackcdn.com/image/fetch/$s_!gFGW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gFGW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png" width="1400" height="2010" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2010,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:446022,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gFGW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png 424w, https://substackcdn.com/image/fetch/$s_!gFGW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png 848w, https://substackcdn.com/image/fetch/$s_!gFGW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png 1272w, https://substackcdn.com/image/fetch/$s_!gFGW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d50de8e-05ed-4f08-a066-02926a35e15e_1400x2010.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Blah, performance is fine, but that is some nasty-looking code. 
No fault of these third-party tools, just a little early in the lifecycle methinks.</p><div><hr></div><p>Anywho, here ya go.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zvdY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zvdY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png 424w, https://substackcdn.com/image/fetch/$s_!zvdY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png 848w, https://substackcdn.com/image/fetch/$s_!zvdY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png 1272w, https://substackcdn.com/image/fetch/$s_!zvdY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zvdY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png" width="1200" height="700" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:700,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38114,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/199213160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zvdY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png 424w, https://substackcdn.com/image/fetch/$s_!zvdY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png 848w, https://substackcdn.com/image/fetch/$s_!zvdY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png 1272w, https://substackcdn.com/image/fetch/$s_!zvdY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa240e3b7-163e-4f12-8946-4be09f3a9937_1200x700.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I didn&#8217;t learn a ton. Maybe the lift over Parquet is larger on massive datasets, but I'm not sure it&#8217;s worth the hassle of subpar Python framework integrations and ugly code, along with OOM issues when you try to read directories of Vortex files.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/benchmarking-vortex-file-format-vs?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! 
This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/benchmarking-vortex-file-format-vs?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/benchmarking-vortex-file-format-vs?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Why, I do declare!]]></title><description><![CDATA[Guest post from Anonymous Rust Dev]]></description><link>https://dataengineeringcentral.substack.com/p/why-i-do-declare</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/why-i-do-declare</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Fri, 22 May 2026 12:58:54 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!DUmj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DUmj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DUmj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png 424w, 
https://substackcdn.com/image/fetch/$s_!DUmj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!DUmj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!DUmj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DUmj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:984597,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195064346?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!DUmj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!DUmj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!DUmj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!DUmj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5929273-0128-4e4c-93ef-f09e98157d8d_1280x720.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="pullquote"><p>Back again for more, The Anonymous Rust Dev is here to talk Declarative Code.</p></div><p>You&#8217;ve seen it before. Don&#8217;t lie, I know you have. Java has the <a href="https://dev.java/learn/api/streams/map-filter-reduce/">Stream API</a>. .NET has <a href="https://learn.microsoft.com/en-us/dotnet/csharp/linq/">LINQ</a>. We&#8217;ve even talked about <a href="https://dataengineeringcentral.substack.com/p/lazy-is-ambitious">lazy evaluation</a> ourselves.</p><p>When we last looked at lazy evaluation, it was in the context of doing work only when it&#8217;s needed, rather than preemptively doing all the work and risking that much of it wasn&#8217;t needed. YAGNI.</p><p>All that said, these streaming models expose other benefits. And I, your friendly neighborhood Anonymous Rust Dev, am here to tease some of them out.</p><h2>Declarative code</h2><p>First, a shameless plug for our recent conversation around <a href="https://dataengineeringcentral.substack.com/p/why-declarative-pipelines-are-the">declarative pipelines</a>. In it, Dan looks at Spark and the SDP framework with examples.</p><blockquote><p><em>SDP isn&#8217;t the first time declarative code has appeared on the scene. In fact, traces of it can be found back in the coding antiquity &#8212;&nbsp;<a href="https://en.wikipedia.org/wiki/ML_(programming_language)">ML</a>&nbsp;originated in the early 1970s and is the precursor to many modern functional languages like Haskell or Scala. 
And I know my audience; SQL&#8217;s </em><code>SELECT</code><em> is another popular declarative grammar.</em></p></blockquote><p>It&#8217;s understandable how imperative code became the lingua franca of the programming world. For instance, just looking at C, one of its goals is to closely mirror what the underlying hardware is doing, and if you&#8217;ve ever cracked open assembly code, you&#8217;ll see imperative code at its most fundamental level. For those who don&#8217;t know assembly, I don&#8217;t plan to show you any code today, but it&#8217;s literally just a sequence of atomic steps like &#8220;add this to that&#8221; or &#8220;go to this location in code&#8221;.</p><ul><li><p>Procedural code is easier to understand when your scope is small. It&#8217;s very easy to reason about an addition statement. But when looking at large-scale problems, many tiny instructions start to turn into noise. Naturally, you can (and should) refactor your code to hide some of that. In a perfect world, if you&#8217;ve done a good job of refactoring and structuring your codebase, the code should flow like prose &#8212; or, to put it another way, it should tell a clear story.</p></li></ul><p>And <em>that</em> is where declarative programming comes in. Baked into its entire premise is the notion that, rather than giving you a series of do-this-then-do-that statements, it instead models the flow of an application, and more closely represents many problem domains.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/why-i-do-declare?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! 
This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/why-i-do-declare?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/why-i-do-declare?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2>Functional programming</h2><p>The &#8220;functional&#8221; in the name isn&#8217;t talking about whether or not it works, but rather the unit of work being employed. If <a href="https://youtu.be/QM1iUe6IofM">this video by Brian Will</a> doesn&#8217;t quickly sour you on OOP, I&#8217;m not sure how long you&#8217;ll be able to stick with me; the core premise I take from it is that OOP doesn&#8217;t really model how things work in the real world. Conversely, if you&#8217;ve ever seen a flowchart and understood it, you intuitively know functional programming.</p><p>Don&#8217;t get me wrong, OOP has its place, but it&#8217;s overused and employed for the wrong problems. If you&#8217;ve ever spent time in old-school Java and C#, you understand this better than most &#8212; everything is derived from Object, and if you want to execute behavior in your code, the recipe is:</p><ul><li><p><em>Have a thing</em></p></li><li><p><em>Have the thing do something to itself via methods</em></p></li></ul><p>Sounds passive to me. All of your code starts on the premise that it&#8217;s attached to a thing, and at some point, the &#8220;you must have a root thing to do all the stuff&#8221; narrative became so tiring and tedious that C# introduced syntactic sugar to alleviate it with their <a href="https://learn.microsoft.com/en-us/dotnet/csharp/tutorials/top-level-statements">&#8220;top-level statements&#8221;</a> concept. 
Now, you don&#8217;t need a bunch of namespace and class boilerplate to start doing stuff &#8212; it&#8217;s still there, lurking behind the scenes, but programs start to feel more like a series of actions and less like a bunch of objects that are being forced to stand in for behavior.</p><p>When functions are first-class citizens, behaviors are easier to express... and arguably, to understand. Where OOP inevitably drives you to use class structures and inheritance models to contain, protect, and manage application state, functional code tends to focus on &#8220;having data&#8221; and a means to transform or process it.</p><p>Now, that recipe becomes:</p><h2>Do stuff</h2><h3>In practice</h3><p>Relax, I&#8217;m not trying to convince everyone here to become a Scala dev. While not as elegant, Python has some constructs that make this somewhat ergonomic.</p><blockquote><p><em>First, I&#8217;ll introduce you to the notion of a &#8220;point-free&#8221; or <a href="https://en.wikipedia.org/wiki/Tacit_programming">tacit</a> style of programming. That Wikipedia link offers a <a href="https://en.wikipedia.org/wiki/Tacit_programming#Python">Python example</a> that illustrates how to &#8220;compose&#8221; a series of functions into a wrapping function.</em></p></blockquote><p>This should intuitively make some sense when you see it. If you&#8217;ve ever baked a cake, you understand the black-box idea of &#8220;baking a cake,&#8221; even without memorizing the individual steps. Of course, you realize there <em>are</em> in fact several atomic steps that must be executed (<em>e.g., procure eggs, store/refrigerate eggs, remove eggs from storage, break eggs, blend eggs, etc.</em>), but many of those details are composed into parent steps that are themselves composed into the recipe as a whole.</p><p>Functional programming follows the same spirit. 
I&#8217;m switching to TypeScript for this illustration, since it does a good job of being readable while still giving us type descriptions:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;7eccad10-fe41-4bd3-8aa8-cc2e27a5b651&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">interface Ingredient {
    name: string,
    quantity: number,
    units: 'each' | 'grams' | 'milliliters',
}

type Cake = unknown;  // We'll figure this out some other time

function bakeCake(ingredients: Array&lt;Ingredient&gt;): Cake {
    // TODO
}</code></pre></div><p>First, you can see I &#8220;baked&#8221; in some assumptions (sorry, couldn&#8217;t help myself) &#8212; namely, that you need ingredients as an input to produce the output Cake.</p><p>Want to see something cool? The function signature for a &#8220;good&#8221; (subjective measure) function should tell you something even without knowing the function or argument names: <code>(Ingredient[]) -&gt; Cake</code>. You intuitively know, even without being told what the function&#8217;s name is, what recipe is being executed:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;3bca7d32-59b5-4e1e-a9f4-e29335d161b9&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Scenario: (unnamed behavior happening here)
Given some ingredients
When we (insert function name here)
Then we produce a Cake</code></pre></div><p>BDD (e.g., <a href="https://cucumber.io/docs/gherkin/">Cucumber&#8217;s Gherkin</a>, as shown here) is an amazing way to test-drive systems, particularly those with an emergent design like the one we&#8217;re using here. We &#8220;know&#8221; there are some steps to producing a cake, and we can black-box that behavior in the meantime with a Scenario until we&#8217;ve had a chance to do some discovery and tease it out.</p><p>Returning to our <code>bakeCake</code> function, let&#8217;s try stubbing out some behavior:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;7e12b23b-8269-4324-81aa-1c694bfcd9ef&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">function prepare(ingredients: Array&lt;Ingredient&gt;): Array&lt;Ingredient&gt; {
    const processedIngredients: Array&lt;Ingredient&gt; = [];
    // TODO: take input ingredients, do stuff to them, produce some "prepared" ingredients
    
    return processedIngredients;
}

type Oven = unknown; // Also figure this out at a later date

function preheatOven(tempCelsius: number): Oven {
    let oven;
    // TODO: do some stuff to make our oven hot here
    
    return oven;
}

interface Food { /* TBD */ }

// Fleshing out Cake just a bit more:
interface Cake extends Food { /* TBD */ }

function cook(ingredients: Array&lt;Ingredient&gt;, oven: Oven): Food {
    let result;
    
    // ???
    
    return result;
}

function bakeCake(ingredients: Array&lt;Ingredient&gt;): Cake {
    const preparedIngredients = prepare(ingredients);
    const preheatedOven = preheatOven(175.0);
    const cookedFood = cook(preparedIngredients, preheatedOven);
    
    return cookedFood as Cake;
}</code></pre></div><p>Actually, if we&#8217;re being honest, the process of baking a cake doesn&#8217;t invent a new oven in the process; we&#8217;ve teased out a hidden requirement that makes me want to shuffle some stuff around to be more honest:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;be82c004-16f8-4605-80c1-cbc2e8850c26&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">function preheat(tempCelsius: number, oven: Oven): Oven {
    // do something to our input oven to make it the right amount of hot
    return oven;
}
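// Aside, a sketch only (not from the original post): if an oven were a plain
// record, preheating could return an updated copy instead of changing its
// input. `SketchOven` and `preheatSketch` are hypothetical names.
interface SketchOven { tempCelsius: number }

function preheatSketch(tempCelsius: number, oven: SketchOven): SketchOven {
    // Object spread copies the existing fields; the caller's oven is untouched.
    return { ...oven, tempCelsius };
}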

function bakeCake(ingredients: Array&lt;Ingredient&gt;, oven: Oven): Cake {
    const preparedIngredients = prepare(ingredients);
    const preheatedOven = preheat(175.0, oven);
    const cookedFood = cook(preparedIngredients, preheatedOven);

    return cookedFood as Cake;
}</code></pre></div><p>While still sketchy on the details, we can now see the emerging design of a cake-baking workflow. Also, I have enough in place to refactor with composition in mind (relying on type inference where possible):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;d651628b-862e-4d3f-8fd4-b8c94098fe38&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">const bakeCake = (ingredients: Array&lt;Ingredient&gt;, oven: Oven) =&gt; cook(
    prepare(ingredients),
    preheat(175.0, oven)
) as Cake;</code></pre></div><p>You&#8217;re free to prefer either the refactored version or the more imperatively styled variation that preceded it.</p><p>Either way, you have to admit that the relationship between input and output is clear enough in both versions to describe it in high-level terms (<em>e.g., via a flowchart</em>):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KbVe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KbVe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png 424w, https://substackcdn.com/image/fetch/$s_!KbVe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png 848w, https://substackcdn.com/image/fetch/$s_!KbVe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png 1272w, https://substackcdn.com/image/fetch/$s_!KbVe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KbVe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png" width="690" height="810" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:810,&quot;width&quot;:690,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:77692,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195064346?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KbVe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png 424w, https://substackcdn.com/image/fetch/$s_!KbVe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png 848w, https://substackcdn.com/image/fetch/$s_!KbVe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png 1272w, https://substackcdn.com/image/fetch/$s_!KbVe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90cd7fab-bebf-4564-a5b1-e78c1d47a1ab_690x810.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>Yeah, this is the first time we&#8217;re actually seeing the ingredients enumerated, but I couldn&#8217;t leave you hanging forever...</p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/subscribe?"><span>Subscribe now</span></a></p>
      <p>
          <a href="https://dataengineeringcentral.substack.com/p/why-i-do-declare">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Why I Left Facebook to Work for Myself]]></title><description><![CDATA[with Ben Rogojan - Seattle Data Guy]]></description><link>https://dataengineeringcentral.substack.com/p/why-i-left-facebook-to-work-for-myself</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/why-i-left-facebook-to-work-for-myself</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Wed, 20 May 2026 13:27:14 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/195404147/c7a1a2aeae88d6a3fd42c9141327d472.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>In this episode of the Data Engineering Central Podcast, I sit down with <a href="https://www.linkedin.com/in/benjaminrogojan/">Ben Rogojan</a> to talk about the <em>real</em> story behind data engineering careers, Big Tech, and what&#8217;s changing right now.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/in/benjaminrogojan/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oPIT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png 424w, https://substackcdn.com/image/fetch/$s_!oPIT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png 848w, https://substackcdn.com/image/fetch/$s_!oPIT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png 1272w, 
https://substackcdn.com/image/fetch/$s_!oPIT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oPIT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png" width="1456" height="731" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:731,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1019539,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/benjaminrogojan/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195404147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oPIT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png 424w, https://substackcdn.com/image/fetch/$s_!oPIT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png 848w, 
https://substackcdn.com/image/fetch/$s_!oPIT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png 1272w, https://substackcdn.com/image/fetch/$s_!oPIT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ec0841-d068-434d-a87c-3da95c84f3b3_1602x804.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Ben shares how he went from working in kitchens&#8230; to data engineering&#8230; to Facebook&#8230; and eventually walking away 
from it all to build his own consulting business.</p><p>And yeah, it wasn&#8217;t all glamorous.</p><blockquote><p>&#8220;I was making the same money as Facebook&#8230; and I hated my life.&#8221;</p></blockquote><p>We get into the stuff most people don&#8217;t talk about:</p><ul><li><p>What it&#8217;s actually like working in Big Tech</p></li><li><p>Why high-paying jobs can still burn you out</p></li><li><p>How he transitioned into consulting (and what people get wrong)</p></li><li><p>The reality of modern data stacks and tool sprawl</p></li><li><p>Whether data engineering is changing because of AI</p></li><li><p>Why fundamentals still matter more than ever</p></li></ul><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/why-i-left-facebook-to-work-for-myself?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! 
This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/why-i-left-facebook-to-work-for-myself?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/why-i-left-facebook-to-work-for-myself?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p>We also go deep on where the industry is heading:</p><ul><li><p>Is the &#8220;modern data stack&#8221; breaking down?</p></li><li><p>Are tools like DuckDB actually replacing warehouses?</p></li><li><p>Is data modeling dead&#8230; or just not trendy anymore?</p></li><li><p>What AI is really changing (and what it&#8217;s not)</p></li></ul><p>If you&#8217;re trying to break into data, grow your career, or figure out where things are headed&#8230; this is one of the more honest conversations you&#8217;ll hear.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Data Engineering Central is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p><a href="https://courses.technicalfreelanceracademy.com/courses/starting-6-7-figure-consulting">Ben also runs a course and community for those interested in getting into consulting.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://courses.technicalfreelanceracademy.com/courses/starting-6-7-figure-consulting" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pjnb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png 424w, https://substackcdn.com/image/fetch/$s_!Pjnb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png 848w, https://substackcdn.com/image/fetch/$s_!Pjnb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png 1272w, https://substackcdn.com/image/fetch/$s_!Pjnb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Pjnb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png" width="1456" height="638" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:638,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3125489,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://courses.technicalfreelanceracademy.com/courses/starting-6-7-figure-consulting&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195404147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pjnb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png 424w, https://substackcdn.com/image/fetch/$s_!Pjnb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png 848w, https://substackcdn.com/image/fetch/$s_!Pjnb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Pjnb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1043d7-e679-4476-b048-1d440af4ee97_2386x1046.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>]]></content:encoded></item><item><title><![CDATA[Spark. Postgres. Duplicates. 
Dang it.]]></title><description><![CDATA[a lesson in fundamentals]]></description><link>https://dataengineeringcentral.substack.com/p/spark-postgres-duplicates-dang-it</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/spark-postgres-duplicates-dang-it</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Mon, 18 May 2026 12:51:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NlDm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NlDm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NlDm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png 424w, https://substackcdn.com/image/fetch/$s_!NlDm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png 848w, https://substackcdn.com/image/fetch/$s_!NlDm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png 1272w, https://substackcdn.com/image/fetch/$s_!NlDm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NlDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png" width="1414" height="786" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1414,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:520978,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/197358825?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NlDm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png 424w, https://substackcdn.com/image/fetch/$s_!NlDm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png 848w, https://substackcdn.com/image/fetch/$s_!NlDm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png 1272w, 
https://substackcdn.com/image/fetch/$s_!NlDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b421b-d0ea-4744-ace4-6505f7a2c2e3_1414x786.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I recently came across the most classic Data Engineering problem of the last 50 years. <strong>Duplicate records.</strong> It was some &#8220;<em>old code</em>&#8221;, if you call 5-year-old Databricks Spark code old, which is like 15 years in Databricks years. 
To me, it was the perfect example of why it pays to take your time, a lesson in the most basic fundamentals of data.</p><blockquote><p><em>It&#8217;s not a particularly earth-shaking problem in and of itself, rather boring, but it&#8217;s a story that reinforces the age-old ideals.</em></p></blockquote><p>Since the <a href="https://www.confessionsofadataguy.com/2018/02/">days of my data youth</a>, heck, harkening back to my <a href="https://www.reddit.com/r/webdev/comments/18htpfi/anyone_else_miss_the_good_ole_lamp_days/">LAMP stack</a> years in college, data duplication has been an issue that somehow manages to creep into database tables of all shapes and sizes.</p><p>History is doomed to repeat itself; programming and logic errors, conundrums, and slipups are par for the course. I mean, those little blighter <strong>LLMs were literally trained on our collective sins</strong>, so what makes you think things will get any better?</p><p>That&#8217;s what I thought.</p><div><hr></div><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Thanks to Delta for sponsoring this newsletter! I use Delta Lake daily, 
and I believe it represents the future of Data Engineering. Content like this 
would not be possible without their support. Check out their website below.</code></pre></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="http://www.delta.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp" width="600" height="123" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:123,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4196,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:&quot;http://www.delta.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div><hr></div><h2>We start at the beginning.</h2><p>What is your life really like if at some point during the past 365 days you get a message from someone &#8220;on the business side&#8221; that so-and-so is showing duplicate records? 
A tale as old as time, this.</p><ul><li><p><em>First off, and before you throw a fit, hear me out:&nbsp;<strong>What is a duplicate record?</strong></em></p></li></ul><p>That is indeed NOT a stupid question. Didn&#8217;t your middle school teacher ever tell you that there is NO such thing as a stupid question? It&#8217;s true. <strong>When it comes to data, you simply cannot assume anything</strong>. Like anything &#8230; at all.</p><p>Sometimes duplicates are by design. It might have been a faulty assumption or a bad decision, <em>but by design in many cases</em>. One of the recurring fundamentals of data, when you encounter a new, or a new-to-you dataset, should be the question &#8230; &#8220;<em>What is the grain of this dataset?</em>&#8221;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UvZk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UvZk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png 424w, https://substackcdn.com/image/fetch/$s_!UvZk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png 848w, https://substackcdn.com/image/fetch/$s_!UvZk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UvZk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UvZk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png" width="1456" height="542" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:542,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:211326,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/197358825?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UvZk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png 424w, https://substackcdn.com/image/fetch/$s_!UvZk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png 848w, 
https://substackcdn.com/image/fetch/$s_!UvZk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png 1272w, https://substackcdn.com/image/fetch/$s_!UvZk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2a24833-233d-4cf6-8459-8c1ac549eba4_1600x596.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Before running off to solve any duplicate data problem or looking for a bug that might not exist, one should start from the ground 
floor. It&#8217;s just good table manners, and good for you as a technical person.</p><blockquote><p>It&#8217;s hard to solve a problem, especially a data problem involving duplicates, until you have solved the often tricky problem of <em><strong>figuring out what makes each record in the dataset unique. </strong></em>It&#8217;s never the same for any two datasets, or rarely is.</p></blockquote><p>Also, before we talk about the problem I encountered, I think we should walk through another great way to approach duplicate data, one that eases the stress of hunting down issues buried in a large, complex system.</p><p>We should separate any data system, or any duplicate data investigation, into three big boxes. These boxes can help us narrow down the scope of the problem and where to best unleash the Claude hounds to find it.</p><ol><li><p>Source</p></li><li><p>Transformation</p></li><li><p>Destination</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xQrP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xQrP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png 424w, https://substackcdn.com/image/fetch/$s_!xQrP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png 848w, 
https://substackcdn.com/image/fetch/$s_!xQrP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png 1272w, https://substackcdn.com/image/fetch/$s_!xQrP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xQrP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png" width="1456" height="837" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:837,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:140544,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/197358825?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xQrP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png 424w, 
https://substackcdn.com/image/fetch/$s_!xQrP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png 848w, https://substackcdn.com/image/fetch/$s_!xQrP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png 1272w, https://substackcdn.com/image/fetch/$s_!xQrP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe010de4-d3a1-445a-af10-3f09b504aad3_1476x848.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<ul><li><p>This may seem painfully obvious, but when we approach a problem like duplicate data in large, complex systems, finding the bug can be overwhelming. Being able to logically and technically pin down where a problem is, or isn&#8217;t, will make our jobs less stressful and our time to resolution much shorter.</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Back to the problem at hand.</h2><p>So, back to my boring story of trying to solve an intermittent duplicate data issue within a few short hours, so I could get back to the AI salt mines.</p><p>Here&#8217;s what I knew.</p><ol><li><p><em>Duplicate data issue showing up inconsistently in a Web UI.</em></p><ol><li><p><em>Some days there are duplicates, some days there are not.</em></p></li></ol></li><li><p><em>Pipeline from source to finish included &#8230;</em></p><ol><li><p><em>Delta Lake House</em></p></li><li><p><em>Databricks Spark transformation</em></p></li><li><p><em>Python script to push said data to Postgres.</em></p></li><li><p><em>Web App pulling data from Postgres.</em></p></li></ol></li></ol><p>Now, logically, the data problem could be anywhere, although the intermittent nature of the duplicate data would probably preclude problems in the Web App.</p><p>Also, when working in large data systems, the best you can do is control the controllables rather than blame others. 
My plan was simply to eliminate the possibility of duplicates in the data systems I CONTROL.</p><blockquote><p><em>A good rule of thumb for data, and life &#8230; control the controllables, don&#8217;t worry about the rest until you have to.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!b2TT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!b2TT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png 424w, https://substackcdn.com/image/fetch/$s_!b2TT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png 848w, https://substackcdn.com/image/fetch/$s_!b2TT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png 1272w, https://substackcdn.com/image/fetch/$s_!b2TT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!b2TT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png" width="1456" height="461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:461,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:221166,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/197358825?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!b2TT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png 424w, https://substackcdn.com/image/fetch/$s_!b2TT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png 848w, https://substackcdn.com/image/fetch/$s_!b2TT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png 1272w, https://substackcdn.com/image/fetch/$s_!b2TT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0821467-8399-4c8a-8de0-1dfb346d476d_2028x642.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>When I first heard the word &#8220;intermittent&#8221;, I had some suspicions. 
But, being the fundamentalist I am, I started with the Delta Lake Table.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UGMz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UGMz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png 424w, https://substackcdn.com/image/fetch/$s_!UGMz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png 848w, https://substackcdn.com/image/fetch/$s_!UGMz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png 1272w, https://substackcdn.com/image/fetch/$s_!UGMz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UGMz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png" width="193" height="299" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:299,&quot;width&quot;:193,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23799,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/197358825?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UGMz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png 424w, https://substackcdn.com/image/fetch/$s_!UGMz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png 848w, https://substackcdn.com/image/fetch/$s_!UGMz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png 1272w, https://substackcdn.com/image/fetch/$s_!UGMz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F755cde9a-0fb4-407c-9d1b-08d0d90c2715_193x299.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div>
      <p>
          <a href="https://dataengineeringcentral.substack.com/p/spark-postgres-duplicates-dang-it">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Academic → CTO: What Actually Matters in Data (Matthew Housley)]]></title><description><![CDATA[podcast interview]]></description><link>https://dataengineeringcentral.substack.com/p/academic-cto-what-actually-matters</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/academic-cto-what-actually-matters</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Wed, 13 May 2026 12:50:55 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/195398651/fab1f8efb81a9d0b9926ce0805ac5f7f.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>Most companies don&#8217;t have a tooling problem. They have a foundation problem.</p><p>In this episode, I sit down with <a href="https://www.linkedin.com/in/matt-housley/">Matthew Housley</a>, a famed co-author of Fundamentals of Data Engineering and former CTO of Ternary Data, to talk about what actually makes data teams successful and why so many organizations get it wrong despite having modern stacks, cloud platforms, and expensive dashboards.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/in/matt-housley/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qeIg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png 424w, https://substackcdn.com/image/fetch/$s_!qeIg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png 848w, 
https://substackcdn.com/image/fetch/$s_!qeIg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png 1272w, https://substackcdn.com/image/fetch/$s_!qeIg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qeIg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png" width="1456" height="511" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:511,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:988837,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/matt-housley/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195398651?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qeIg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png 424w, 
https://substackcdn.com/image/fetch/$s_!qeIg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png 848w, https://substackcdn.com/image/fetch/$s_!qeIg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png 1272w, https://substackcdn.com/image/fetch/$s_!qeIg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4610d501-0fb2-4b65-b64a-3edd03754846_1602x562.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p><em>Matthew&#8217;s path is a little different than most. He started in academia as a mathematics instructor before moving into industry as a data scientist at Overstock.com, and eventually leading data strategy and analytics as a CTO. That mix of academic rigor and real-world execution gives him a very clear perspective on where things break down.</em></p></li></ul><p>We get into the gap between data science and real business impact, why analytics foundations matter more than flashy models, and what companies consistently underestimate when building out data platforms. We also talk about what it actually looks like to transition from academia to industry, and how that shapes how you think about data problems at scale.</p><p>If you&#8217;ve ever felt like your data stack should be delivering more value than it is, this conversation will probably hit close to home.</p><div><hr></div><h2>&#9201;&#65039; Topics we cover:</h2><ul><li><p>Why most analytics efforts fail before they even start</p></li><li><p>The difference between &#8220;doing data&#8221; and delivering value</p></li><li><p>Data science vs data engineering vs analytics reality</p></li><li><p>Academic thinking vs industry execution</p></li><li><p>What CTOs actually care about when it comes to data</p></li><li><p>Building foundations that don&#8217;t fall apart six months later</p></li></ul><p></p>]]></content:encoded></item><item><title><![CDATA[Reducing PySpark Testing Suite Runtimes]]></title><description><![CDATA[with AI ...]]></description><link>https://dataengineeringcentral.substack.com/p/reducing-pyspark-testing-suite-runtimes</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/reducing-pyspark-testing-suite-runtimes</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Mon, 11 May 2026 19:43:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Vbif!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vbif!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vbif!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png 424w, 
https://substackcdn.com/image/fetch/$s_!Vbif!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!Vbif!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!Vbif!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vbif!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:615414,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/194697845?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Vbif!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!Vbif!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!Vbif!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!Vbif!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91443a31-fbdb-4a5e-bf48-606ea00a48d9_1280x720.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I like to take the middle road in life, not too hot, not too cold, right down the ol&#8217; center lane. When something like AI comes to gobble us all up, I just smile and keep doing my thing. We live in a world now filled with three types of people.</p><ul><li><p>AI Gluttons</p></li><li><p>Middle of the Road</p></li><li><p>AI Deniers</p></li></ul><p>You can guess which camp I'm sitting in.</p><blockquote><p>Once the cat is out of the bag, it&#8217;s hard to get it back in. AI is here to stay; it will change the face of software forever, slow, fast, bubble, no bubble. It&#8217;s a changing.</p></blockquote><p>So, if you have the luxury of being an AI Denier and never using it, well &#8230; good for you, although the next time you look for a job, you might get a surprise. I understand the argument about &#8220;<em><strong>Software as a Craft</strong></em>&#8221; and being the best at what you do. 
Such folk will always find a place.</p><p>But, indeed, AI is an innovation worthy of some use in the software world, methinks.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PiY4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PiY4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif 424w, https://substackcdn.com/image/fetch/$s_!PiY4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif 848w, https://substackcdn.com/image/fetch/$s_!PiY4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif 1272w, https://substackcdn.com/image/fetch/$s_!PiY4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PiY4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif" width="320" height="240" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84991e46-7970-4dac-9022-a04612f22f10_320x240.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:320,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:276672,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/194697845?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PiY4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif 424w, https://substackcdn.com/image/fetch/$s_!PiY4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif 848w, https://substackcdn.com/image/fetch/$s_!PiY4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif 1272w, https://substackcdn.com/image/fetch/$s_!PiY4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84991e46-7970-4dac-9022-a04612f22f10_320x240.gif 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><p>Well, there have always been Deniers and there always will be, and that leaves us with the AI Gluttons. Who knows who I&#8217;m talking about? 
It&#8217;s the same folk who, 5 years ago, thought their entire worth as someone who wrote software for a living was simply writing software.</p><ul><li><p><em>That was always a simplistic view of software and led (and still leads) to some of the worst teammates and code producers.</em></p></li></ul><p>What do these new AI Gluttons do?</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T1wP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T1wP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif 424w, https://substackcdn.com/image/fetch/$s_!T1wP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif 848w, https://substackcdn.com/image/fetch/$s_!T1wP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif 1272w, https://substackcdn.com/image/fetch/$s_!T1wP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T1wP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif" width="320" height="240" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b828391-812d-48ff-9202-421339e88b0a_300x225.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:225,&quot;width&quot;:300,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:303965,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/194697845?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T1wP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif 424w, https://substackcdn.com/image/fetch/$s_!T1wP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif 848w, https://substackcdn.com/image/fetch/$s_!T1wP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif 1272w, https://substackcdn.com/image/fetch/$s_!T1wP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b828391-812d-48ff-9202-421339e88b0a_300x225.gif 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>They say &#8230; give it all to me, baby, I want the whole thing. Every AI Agent, Gastown, every MCP server, and Claude Skill you can find on GitHub.</p>
      <p>
          <a href="https://dataengineeringcentral.substack.com/p/reducing-pyspark-testing-suite-runtimes">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Cognitive Overload of AI Development]]></title><description><![CDATA[just one more prompt]]></description><link>https://dataengineeringcentral.substack.com/p/the-cognitive-overload-of-ai-development</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/the-cognitive-overload-of-ai-development</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Sat, 09 May 2026 14:18:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tV-n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tV-n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tV-n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!tV-n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!tV-n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png 1272w, 
https://substackcdn.com/image/fetch/$s_!tV-n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tV-n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:624736,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196830088?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tV-n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!tV-n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png 848w, 
https://substackcdn.com/image/fetch/$s_!tV-n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!tV-n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa48b592c-66ae-41c6-b3f3-6f94c833b57a_1280x720.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Typically, when the <a href="https://hbr.org/2026/03/when-using-ai-leads-to-brain-fry">Harvard Business Review</a> publishes 
something, especially something techy, people tend to pay attention. Well, that is, unless it goes against the ultra-psyops-capitalism that drives most of the known world, in the form of extracting every useful drop of blood and life from the glassy-eyed masses who are too exhausted or addicted to the doom-scroll to look up for a minute.</p><blockquote><p>Hey, am I just the pot calling the kettle black, <a href="https://dataengineeringcentral.substack.com/spring50">while I sell you a 10-dollar-a-month subscription</a>? Maybe.</p></blockquote><p>Just when you thought the late-night-working, on-call-fearing, Slack-notification-twitching, deadline-anxiety-ridden Software Engineer was at the end of his or her rope, one JIRA ticket away from giving it all up for chicken farming &#8230; the world in all its maniacal plotting dropped AI square into the face of every developer.</p><div><hr></div><p><em><strong>Thanks to <a href="http://www.delta.io/">Delta</a> for sponsoring this newsletter! I use Delta Lake daily, and I believe it represents the future of Data Engineering. Content like this would not be possible without their support. 
Check out <a href="http://www.delta.io/">their website</a> below.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="http://www.delta.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp" width="600" height="123" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:123,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4196,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:&quot;http://www.delta.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div><hr></div><p>All we have experienced and learned to this point has been nothing, just training, really.</p><p>It was just preparation to be plugged into the actual matrix, the Gas Town crazed C-suite and CTOs, whose eyes shine bright with endless Claude-driven possibilities of Agents upon Agents, calling 
each other in endless token burning loops, while investors pour bags of money on their already rich heads.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://hbr.org/2026/03/when-using-ai-leads-to-brain-fry" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dZaB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png 424w, https://substackcdn.com/image/fetch/$s_!dZaB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png 848w, https://substackcdn.com/image/fetch/$s_!dZaB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png 1272w, https://substackcdn.com/image/fetch/$s_!dZaB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dZaB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png" width="1456" height="582" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f606e523-237e-4ded-9a86-90d0111770bf_2148x858.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:582,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:166311,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://hbr.org/2026/03/when-using-ai-leads-to-brain-fry&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196830088?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dZaB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png 424w, https://substackcdn.com/image/fetch/$s_!dZaB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png 848w, https://substackcdn.com/image/fetch/$s_!dZaB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png 1272w, https://substackcdn.com/image/fetch/$s_!dZaB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff606e523-237e-4ded-9a86-90d0111770bf_2148x858.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p><em>Every normal path of the software engineering life cycle has been turned upside down, short-circuited, smashed, and thrown out the window.</em></p></li></ul><p>In a blindly dizzy change of culture, the once clean-code-loving, JIRA-embracing, Agile acolytes have abandoned their first love for an Anthropic mistress, abandoning overnight 50 years&#8217; worth of collective experience and knowledge, which was stripped away, sucked up, and fed into the darkness of Foundational Models as training data.</p><blockquote><p>It wasn&#8217;t enough to simply turn a million programmers into token fools; once independent thinkers, we are all now sucking at the teat of subscription plans. 
We live in fear of being cut off from our new drug of choice, the token, the bringer of life and features.</p></blockquote><p>The outcome and human cost of these Agentic Coding tools are as obvious as they come.</p><div class="pullquote"><p>&#8220;We found that the phenomenon described in these posts&#8212;cognitive exhaustion from intensive oversight of AI agents&#8212;is both real and significant. We call it &#8220;AI brain fry,&#8221; which we define as <em>mental fatigue from excessive use or oversight of AI tools beyond one&#8217;s cognitive capacity.</em> Participants described a &#8220;buzzing&#8221; feeling or a mental fog with difficulty focusing, slower decision-making, and headaches. This AI-associated mental strain carries significant costs in the form of increased employee errors, decision fatigue, and intention to quit.&#8221;<br><a href="https://hbr.org/2026/03/when-using-ai-leads-to-brain-fry">- Harvard Business Review</a></p></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Data Engineering Central is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>What it means to program in the Age of AI.</h2><p>I feel I might have some insight here, having been on the giving and receiving end of the AI firehose of code. Now that we&#8217;ve been living in that world for a bit, both the upsides and downsides we are dealing with in this Brave New World are pretty obvious.</p><p>Look, my personal feelings on it all are irrelevant, as are yours. We are just little cogs in a big machine, washed along by the flood of professional life, victims of the rains of the tech culture at large. You may tell yourself you sit aloof, you and your neovim, but the truth is you&#8217;re doing what you&#8217;re doing because you&#8217;ve been influenced, or pushed places by necessity.</p><ul><li><p><em>So what does it mean to deal with AI in all its reckless glory?</em></p></li></ul><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">1. Senior+ Engineers have turned into glorified code reviewers.

2. It isn&#8217;t possible to keep up with the pace of AI output in a codebase.

3. You no longer understand the minutiae and details of an AI-generated codebase.

4. Your mind and soul are torn between doing the right thing (understanding and
 reviewing changes in depth) and meeting expectations.

5. Inexperienced devs and non-engineering stakeholders think they are smarter 
than they are (armed with Claude).

6. Bad designs and architecture are amplified.

7. You will get &#8220;burned out&#8221; faster.

8. You let &#8220;things slide&#8221; that you normally would not have.

9. We have to deal with more organizational and professional chaos and uncertainty.</code></pre></div><p>I think what it boils down to, at the end of the day, is the mental burden of the expectation that we use AI to move quickly and produce more, while still being held accountable for all negative outcomes and side effects of the software products we produce.</p><blockquote><p>It is a classic problem: <strong>being held responsible for something you can&#8217;t control in its entirety.</strong></p></blockquote><p>A little interwebs search will assure you that this topic is in the back of the minds of a lot of folks who know enough to be worried.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://news.ycombinator.com/item?id=46934404" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zjSl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png 424w, https://substackcdn.com/image/fetch/$s_!zjSl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png 848w, https://substackcdn.com/image/fetch/$s_!zjSl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png 1272w, https://substackcdn.com/image/fetch/$s_!zjSl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!zjSl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png" width="1456" height="277" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:277,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68138,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://news.ycombinator.com/item?id=46934404https://news.ycombinator.com/item?id=46934404&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196830088?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zjSl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png 424w, https://substackcdn.com/image/fetch/$s_!zjSl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png 848w, https://substackcdn.com/image/fetch/$s_!zjSl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png 1272w, 
https://substackcdn.com/image/fetch/$s_!zjSl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3161d2d7-84c2-492b-aa23-9a3f53ae4062_1544x294.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.reddit.com/r/technology/comments/1rsoqcy/ai_is_exhausting_workers_so_much_researchers_have/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AM8P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png 424w, https://substackcdn.com/image/fetch/$s_!AM8P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png 848w, https://substackcdn.com/image/fetch/$s_!AM8P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png 1272w, https://substackcdn.com/image/fetch/$s_!AM8P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AM8P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png" width="1456" height="666" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:666,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:697001,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.reddit.com/r/technology/comments/1rsoqcy/ai_is_exhausting_workers_so_much_researchers_have/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196830088?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AM8P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png 424w, https://substackcdn.com/image/fetch/$s_!AM8P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png 848w, https://substackcdn.com/image/fetch/$s_!AM8P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png 1272w, https://substackcdn.com/image/fetch/$s_!AM8P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5de5ed-953f-4a89-ae70-623856f8c952_1596x730.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substack.com/home/post/p-196220896" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M7bx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png 424w, https://substackcdn.com/image/fetch/$s_!M7bx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png 848w, 
https://substackcdn.com/image/fetch/$s_!M7bx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png 1272w, https://substackcdn.com/image/fetch/$s_!M7bx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M7bx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png" width="1456" height="781" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:781,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:640931,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://substack.com/home/post/p-196220896&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196830088?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!M7bx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png 424w, 
https://substackcdn.com/image/fetch/$s_!M7bx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png 848w, https://substackcdn.com/image/fetch/$s_!M7bx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png 1272w, https://substackcdn.com/image/fetch/$s_!M7bx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b3d123f-c6e7-4002-960a-58c53e052767_1528x820.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/the-cognitive-overload-of-ai-development?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/the-cognitive-overload-of-ai-development?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/the-cognitive-overload-of-ai-development?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2>Why kick against the goad?</h2><p>So what?</p><p>It&#8217;s all good and well to rage against the AI machine, but methinks that might be a fool&#8217;s errand. 
There probably isn&#8217;t much hope of making any gains against the flow of tech culture; it&#8217;s probably best to just find our way through it, if we can.</p><p>Now, more than ever, it&#8217;s important to do a few things to counteract AI burnout in your own life.</p><ul><li><p>Find hobbies and take time away from the computer.</p><ul><li><p><em>Go outside, exercise, and read.</em></p></li></ul></li><li><p>Treat AI coding like any other skill that&#8217;s important in the marketplace; don&#8217;t overemphasize it.</p></li><li><p>Remember why you fell in love with coding, and do those things yourself regularly.</p></li><li><p>Ignore the AI Doomers and the AI Groomers at the same time.</p><ul><li><p><em>Take the middle road.</em></p></li></ul></li></ul><p>I think a healthy dose of reality in all its forms is a great antivenom for a mind and body plagued by AI burnout. For me, this burnout creeps in slowly and takes over before I realize it.</p><blockquote><p>Quick context switching, massive code changes and PRs, the incredibly fast pace of development and new features, and worrying about quality, understandability, and best practices.</p></blockquote><p><strong>Slow yourself down.</strong></p><p>While those around you move and ship at breakneck speeds, you should slow down a little. Take time to think through large design decisions, architecture, and systems.</p><p>Think critically and slowly about that AI-generated PR. Just because it was spewed out in a day (<em>what once would have taken a week</em>) doesn&#8217;t mean the review should be rushed; spend an extra day or two really understanding the business context, what&#8217;s happening, and the hidden decisions being introduced.</p><div><hr></div><h2>Beat them at their own game.</h2><p>Don&#8217;t try to escape the game; it&#8217;s been forced upon you, and it&#8217;s probably not going anywhere. 
Be optimistic in your attitude, remember why you do what you do, and why you love it.</p><blockquote><p><em><strong>They think AI will make them a 10x Engineer with little work and no sacrifice. You know better.</strong></em></p></blockquote><p>Go deep, understand problems like no one else. Ship at their pace when needed, and slow down when it counts. Take your time. Find joy in the great outdoors, and live a healthy lifestyle. The code will flow long after you are gone, and was flowing long before you arrived. Remember that.</p><div class="poll-embed" data-attrs="{&quot;id&quot;:509713}" data-component-name="PollToDOM"></div><p>Please leave a comment and let me know how you view the AI revolution and how you are dealing with it yourself.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/the-cognitive-overload-of-ai-development/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/the-cognitive-overload-of-ai-development/comments"><span>Leave a comment</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[AI Isn’t Replacing Curious Developers]]></title><description><![CDATA[It&#8217;s Changing Who Wins (Neil Roberts)]]></description><link>https://dataengineeringcentral.substack.com/p/ai-isnt-replacing-developers</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/ai-isnt-replacing-developers</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Wed, 06 May 2026 13:39:45 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/195351410/b8335a6f25714561904e11cccac79990.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>AI isn&#8217;t just changing how we write code. 
It&#8217;s changing what it even means to build software.</p><p>In this episode of the Data Engineering Central Podcast, I sit down with <a href="https://www.linkedin.com/in/neilcolynroberts/">Neil Roberts</a> &#8212; a developer who&#8217;s been through every major wave of the web, from BASIC on an Atari to modern TypeScript, and now deep into LLMs and agentic workflows.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/in/neilcolynroberts/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q7p2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png 424w, https://substackcdn.com/image/fetch/$s_!Q7p2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png 848w, https://substackcdn.com/image/fetch/$s_!Q7p2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png 1272w, https://substackcdn.com/image/fetch/$s_!Q7p2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q7p2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png" width="1456" height="772" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:772,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:285592,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/neilcolynroberts/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195285177?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Q7p2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png 424w, https://substackcdn.com/image/fetch/$s_!Q7p2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png 848w, https://substackcdn.com/image/fetch/$s_!Q7p2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png 1272w, https://substackcdn.com/image/fetch/$s_!Q7p2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d847355-f794-4e19-8108-3f28e99979f9_1634x866.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is not another surface-level &#8220;AI will change everything&#8221; conversation. We get into what is actually happening right now, where it works, where it completely breaks, and what developers are getting wrong.</p><ul><li><p><em>We talk about why front-end and UX matter more than ever in an AI world, why most people misunderstand agents, and what real day-to-day workflows with LLMs actually look like. 
</em></p></li><li><p><em>There&#8217;s also a hard look at who benefits from AI, who falls behind, and whether we are quietly building fragile systems that we don&#8217;t fully understand.</em></p></li></ul><p>If you&#8217;re a developer trying to figure out where this is all going, this is one of those conversations worth paying attention to.</p><p>Expect to learn:</p><ul><li><p>Why AI is as much a UX problem as it is a backend problem</p></li><li><p>What &#8220;agents&#8221; actually mean in practice, not in demos</p></li><li><p>Where LLM workflows are useful today and where they fail hard</p></li><li><p>Whether junior developers should be worried or excited</p></li><li><p>How building apps changes when AI is part of the system</p></li><li><p>What developers should actually be doing right now to stay relevant</p></li></ul><p>Neil also has a podcast, <a href="https://podcasts.apple.com/us/podcast/the-skill-tree/id1884932498">The Skill Tree</a>, on AI and agentic-specific topics.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://podcasts.apple.com/us/podcast/the-skill-tree/id1884932498" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pfp-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png 424w, https://substackcdn.com/image/fetch/$s_!pfp-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png 848w, https://substackcdn.com/image/fetch/$s_!pfp-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png 1272w, https://substackcdn.com/image/fetch/$s_!pfp-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pfp-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png" width="1456" height="572" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:572,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:366840,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://podcasts.apple.com/us/podcast/the-skill-tree/id1884932498&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195285177?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!pfp-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png 424w, https://substackcdn.com/image/fetch/$s_!pfp-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png 848w, https://substackcdn.com/image/fetch/$s_!pfp-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png 1272w, https://substackcdn.com/image/fetch/$s_!pfp-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7d7d781-6f7e-4fcb-b4dd-c41e47494c7e_1756x690.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We also get into a bigger question most people are avoiding:</p><ul><li><p>Are we heading toward AI-assisted coding&#8230; or AI-orchestrated systems where developers become supervisors?</p></li><li><p>And maybe more importantly&#8230; which side of that shift do you want to be on?</p></li></ul><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/ai-isnt-replacing-developers?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! 
This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/ai-isnt-replacing-developers?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/ai-isnt-replacing-developers?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p></p>]]></content:encoded></item><item><title><![CDATA[The Age of Infra and Containers (AI, that is) ... and Humans?]]></title><description><![CDATA[times a changing]]></description><link>https://dataengineeringcentral.substack.com/p/the-age-of-infra-and-containers-ai</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/the-age-of-infra-and-containers-ai</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Mon, 04 May 2026 12:04:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4USV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4USV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4USV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png 424w, 
https://substackcdn.com/image/fetch/$s_!4USV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!4USV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!4USV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4USV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1000613,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196041141?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!4USV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!4USV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!4USV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!4USV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0d46f3b-0ac8-43f8-a88c-96974ea83cac_1280x720.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>For years, I&#8217;ve wondered at the intricate designs of Makefiles, venvs, Poetry, binaries, uv, YAML, and many other such spells that engineers through the ages have deemed necessary to provide a consistent runtime environment. Project to project, company to company, it all changes.</p><blockquote><p>It&#8217;s a mix of this and that; the goal is clear, the tools are not.</p></blockquote><p>You and I want an easy-to-use, continuous, and production-like environment to weave our spells and write our code. Well, we used to cast our code; Claudious does that now. It&#8217;s a question of reduced friction at the end of the day. We want to handle our software and data in a way that replicates as closely as possible the systems the code will crawl.</p><p>The times are changing, like it or not. The price tag to produce a line of code has come down a lot; it&#8217;s been cheapened. Yeah, I hear ya. There will always be people and places who are willing to pay for, and desire, the hand-wrought in a forge of blood and sweat &#8230; code.</p><p>So if the LLMs are spewing out mid-level drivel at an ever-increasing pace, what&#8217;s left for the old and well-worn data persons, like <em>me, us, you</em>? </p><div><hr></div><p style="text-align: center;"><strong>Listen, you scallywag, yeah, you!</strong> You think I&#8217;m over here recording podcasts, writing stuff, takin&#8217; a beat&#8217;n for you out of the goodness of my heart??! 
Ha!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/spring50&quot;,&quot;text&quot;:&quot;Get 50% Off for 1 Year!&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/spring50"><span>Get 50% Off for 1 Year!</span></a></p><p style="text-align: center;">Get your moldy wallet out, you heartless pirate. <a href="https://dataengineeringcentral.substack.com/spring50">50% off for a year</a>, throw a guy a bone.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UmsM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UmsM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif 424w, https://substackcdn.com/image/fetch/$s_!UmsM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif 848w, https://substackcdn.com/image/fetch/$s_!UmsM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif 1272w, https://substackcdn.com/image/fetch/$s_!UmsM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!UmsM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif" width="480" height="240" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:433978,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196041141?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UmsM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif 424w, https://substackcdn.com/image/fetch/$s_!UmsM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif 848w, https://substackcdn.com/image/fetch/$s_!UmsM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif 1272w, https://substackcdn.com/image/fetch/$s_!UmsM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98e89d67-a2ba-4355-a004-68c1f7b8337e_480x240.gif 1456w" 
sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div><hr></div><h3>After code &#8230; infrastructure and containers.</h3><p>It seems everyone is mourning the death of the programmer; maybe it&#8217;s too early to place the headstone and chisel out the date &#8230; I feel like a poor soul on the Batavia, harassed by storms and thrown ashore on <a href="https://en.wikipedia.org/wiki/Houtman_Abrolhos">Abrolhos</a>. <br></p><p>Except I&#8217;m a writer of code, some 397 years later, the ship is my life, the storm is AI, and the island is the world at large.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xgE0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xgE0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png 424w, https://substackcdn.com/image/fetch/$s_!xgE0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png 848w, https://substackcdn.com/image/fetch/$s_!xgE0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png 1272w, https://substackcdn.com/image/fetch/$s_!xgE0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!xgE0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png" width="1300" height="446" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:446,&quot;width&quot;:1300,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:120299,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196041141?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xgE0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png 424w, https://substackcdn.com/image/fetch/$s_!xgE0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png 848w, https://substackcdn.com/image/fetch/$s_!xgE0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png 1272w, https://substackcdn.com/image/fetch/$s_!xgE0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7fb8b3-2022-435a-ba3d-68eeef8f494f_1300x446.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I suspect it feels worse than it really is when you&#8217;re a helpless player in a cosmic theme. But then again, I&#8217;m currently gainfully employed and have not yet had to face the daunting task of throwing myself upon the shoals of LinkedIn&#8217;s job board.</p><p>AI has shifted the ground underneath our feet; methinks you can&#8217;t put the cat back in the box, even if you want to, and I&#8217;m not saying I do. 
Best to just trudge ahead and figure out the new world, feeling our way around in the dark, hoping someone or something doesn&#8217;t hack off our fingers.</p><blockquote><p><em>I have noticed some things, though. </em></p></blockquote><p>Things that give me hope for the future, and maybe a path to walk.</p><div><hr></div><h3>The end is not here &#8230; yet.</h3><p>Sure, looking into the crystal ball of the future to divine what comes next usually comes to naught. But I am willing to step into the breach for you and make some predictions. It&#8217;s a free world; I&#8217;m free to be wrong, and confident in assertions pulled from my own experience.</p><p>I have the benefit of having been a coder for decades, and now have the opportunity to use AI and LLMs to expand what I can do and the speed at which I do it, without necessarily losing skills or failing to learn new ones. 
Although I could see that happening over longer periods of time.</p><blockquote><p><em>At the day job, I build a lot of products and features from scratch, literally from a one-pager with an idea, produced by the C-suite or Product, to a delivered piece of software in production. This gives me unique insight into how AI is affecting the &#8220;build&#8221; process as a whole.</em></p></blockquote><p>I&#8217;m just going to jump straight into it; take what you will from it. The following are takeaways I have observed and experienced firsthand regarding the New Age of AI Software Development: building products from conception to reality.</p><ul><li><p>The entire production timeline has shrunk significantly.</p></li><li><p>&#8220;The Business&#8221; is more technical now (<em>they use AI</em>).</p></li><li><p>Architecture and systems design have now become the bottleneck.</p></li><li><p>Decision-making by people, humans, is also a new bottleneck.</p></li><li><p>Writing &#8220;the code&#8221; has become the easy part.</p></li><li><p>More time is spent on the infrastructure.</p></li><li><p>Quick, seamless CI/CD deployment cycles are becoming increasingly important.</p></li><li><p>Containerization is important.</p></li></ul><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="http://www.delta.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q7YT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png 424w, https://substackcdn.com/image/fetch/$s_!q7YT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png 848w, 
https://substackcdn.com/image/fetch/$s_!q7YT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png 1272w, https://substackcdn.com/image/fetch/$s_!q7YT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q7YT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png" width="1200" height="558" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:558,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:163070,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;http://www.delta.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/186919866?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!q7YT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png 424w, 
https://substackcdn.com/image/fetch/$s_!q7YT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png 848w, https://substackcdn.com/image/fetch/$s_!q7YT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png 1272w, https://substackcdn.com/image/fetch/$s_!q7YT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8899f27a-42e9-483b-b026-ced87091c6b4_1200x558.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="http://www.delta.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp" width="600" height="123" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:123,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4196,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:&quot;http://www.delta.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!wmd9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 424w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 848w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1272w, https://substackcdn.com/image/fetch/$s_!wmd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708be49f-dfaa-498f-a862-8e9810a5fc58_600x123.webp 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div><hr></div><p>The game has changed when it comes to developing and producing software products, and I don&#8217;t think in a bad way. In fact, it gives me hope for the future of software. We still need experienced programmers who are good at their job. 
But their job can no longer be just spewing code.</p><p>Claude can spew code, we don&#8217;t need code spewers, we need humans who are very technical, and can bridge the gap between some vibe-coded non-reality produced by a Product Manager, and a well-oiled system that can run a thing in production, long term.</p><blockquote><p>We could sum up the findings I listed above like this &#8230;</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vprh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vprh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png 424w, https://substackcdn.com/image/fetch/$s_!Vprh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png 848w, https://substackcdn.com/image/fetch/$s_!Vprh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png 1272w, https://substackcdn.com/image/fetch/$s_!Vprh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vprh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png" 
width="1174" height="640" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1174,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:99221,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196041141?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vprh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png 424w, https://substackcdn.com/image/fetch/$s_!Vprh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png 848w, https://substackcdn.com/image/fetch/$s_!Vprh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png 1272w, https://substackcdn.com/image/fetch/$s_!Vprh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb62aa2b2-d462-4582-9600-68f8fea9a27e_1174x640.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The code is arguably the easy part. Call it what you want: easier to produce, cheaper to write, more or less &#8230; code production, which WAS the bottleneck 5 years ago, has flip-flopped into the fastest part of a software project.</p><ul><li><p>Does this mean we no longer need programmers?</p></li></ul><p>No, in fact, I find it amplifies the value of the senior engineer who has built a set of skills over a long period of time, some technical and some non-technical, skills that aren&#8217;t easy to develop.</p><p>Let&#8217;s break down these three high-level findings about building end-to-end products in the age of AI.</p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/the-age-of-infra-and-containers-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/the-age-of-infra-and-containers-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><div><hr></div><h3>The Human Bottleneck</h3><p>What a strange way to think about the new bottlenecks of AI-assisted development workflows. It&#8217;s hard to forecast how it will all shake out in the future, but clearly, based on my experience with the full lifecycle of this <a href="https://en.wikipedia.org/wiki/Brave_New_World">Brave New World</a>, humans indeed still play a critical part.</p><p>Yeah, I know there is a broad spectrum of &#8220;AI usage&#8221; in the real world, ranging from never-touched-it to <a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04">Gas Town</a>. The vast majority of AI usage will fall squarely in the middle, of course. 
The noisiest folk online will be in one of the two extreme camps, the silent majority in the middle.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5znj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5znj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png 424w, https://substackcdn.com/image/fetch/$s_!5znj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png 848w, https://substackcdn.com/image/fetch/$s_!5znj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png 1272w, https://substackcdn.com/image/fetch/$s_!5znj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5znj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png" width="1204" height="762" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:762,&quot;width&quot;:1204,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:83723,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196041141?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5znj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png 424w, https://substackcdn.com/image/fetch/$s_!5znj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png 848w, https://substackcdn.com/image/fetch/$s_!5znj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png 1272w, https://substackcdn.com/image/fetch/$s_!5znj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a6feb34-c6d6-4f5b-853e-29c3e42bac1b_1204x762.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>My comments on this subject apply to the middle group. I can&#8217;t help people who bury their heads in the sand and think that there is no need to ever touch AI. 
I also don&#8217;t have much to say to folk who Gas Town every project, they will reap what they sow.</p><blockquote><p>That being said, I&#8217;ve noticed that humans are indeed the new bottleneck. Albeit a critically important part of the process, we are always the slow part.</p></blockquote><ul><li><p>Humans are still, in great part, the &#8220;idea factories.&#8221;</p></li><li><p>Humans are always procrastinating and indecisive</p></li><li><p>Most humans lack focus</p></li><li><p>Communication is always hard and always a problem</p></li></ul><p>Think about it this way: for those who have started using AI to assist with the generation of the &#8220;technical part&#8221; of Software projects, the bottleneck is no longer the technical work; it is now waiting for other humans to handle the above bullet points.</p><p>It wasn&#8217;t such a big deal in the past, when the C-suite and Product Managers were meeting, talking, ideating, researching, etc, because Engineering was happily over in the corner doing MVPs, POCs, and generally moving at a snail&#8217;s pace.</p><p>Now it has reversed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tkH4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tkH4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png 424w, https://substackcdn.com/image/fetch/$s_!tkH4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png 848w, 
https://substackcdn.com/image/fetch/$s_!tkH4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png 1272w, https://substackcdn.com/image/fetch/$s_!tkH4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tkH4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png" width="1454" height="712" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:712,&quot;width&quot;:1454,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:106207,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196041141?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tkH4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png 424w, 
https://substackcdn.com/image/fetch/$s_!tkH4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png 848w, https://substackcdn.com/image/fetch/$s_!tkH4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png 1272w, https://substackcdn.com/image/fetch/$s_!tkH4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4e3c22-fe99-4fd3-b80e-1c035766728d_1454x712.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The truth is, Engineering can, if they want, pump out MVPs, POCs, and near-production-ready systems like hotcakes. That&#8217;s not the long part anymore. If they are good, they spend a few days on architecture and systems design, and shovel out the technical implementation at a rapid pace, and iteration is even faster than before.</p><p>The bottlenecks are indeed more human-centric.</p><ul><li><p><em>We have to wait for the C-suite to come up with the next &#8220;big thing.&#8221;</em></p></li><li><p><em>We have to wait for Product (or whoever fills that role) and the C-suite to battle it out.</em></p></li><li><p><em>We have to wait for them to do their customer and market research, and validation.</em></p></li><li><p><em>Engineering leaders have to go back and forth with said people to hammer out the possibilities.</em></p></li><li><p><em>Engineering has to build and iterate, feeding back info into this human loop, waiting for updates and changes, ready to spew technology.</em></p></li></ul><p>It has simply become less of a process of waiting weeks or months for engineers to hammer out those bits and bytes. That has been shortened considerably. The long part is the waiting for humans to make decisions, research, ideate, communicate, etc.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/the-age-of-infra-and-containers-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineering Central! 
This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/p/the-age-of-infra-and-containers-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://dataengineeringcentral.substack.com/p/the-age-of-infra-and-containers-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2>Systems design, architecture, and infrastructure.</h2><p>The second piece of the software lifecycle puzzle that has become more important, something that doesn&#8217;t take more time than before, but because of the pace of code production, simply appears to be a larger part of the process, is systems design, architecture, and infrastructure. <em>All rolled into one.</em></p><blockquote><p>Truth be told, long before AI came along to write all your functions, the planning and design part of the software project was always the most critical. </p></blockquote><p>The only thing that has changed is that now it has 10x&#8217;d in importance.</p><p><strong>Why?</strong></p><p>Because those legions of Claude Engineers are highly susceptible to thinking less and doing more (code), on a whim and fancy, because they simply CAN move faster, and maybe even feel pressured to move faster.</p><p>It&#8217;s the new norm. 
In a bid to appear at the top of their game, on board and forward-thinking &#8230; they will just start to prompt their way into a solution before anyone has taken the human step of simply &#8230; <strong>thinking.</strong></p><ul><li><p>It&#8217;s clear some engineers think that putting Claude in &#8220;Plan Mode&#8221; is all they have to do.</p></li></ul><p>I mean, you can wing it if you want, just trust the LLM that&#8217;s able to predict the next best token &#8230; with the future of your Data Platform or Software project, but Lord be with you at the end of that road.</p><div class="pullquote"><p>Claude simply doesn&#8217;t get the fine nuances of your CTO. Claude doesn&#8217;t understand the last 5 years of your business's trajectory, problems, and people. Claude doesn&#8217;t understand the personality and tendencies of the individuals in your Engineering group.</p></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nJvy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nJvy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png 424w, https://substackcdn.com/image/fetch/$s_!nJvy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png 848w, https://substackcdn.com/image/fetch/$s_!nJvy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png 
1272w, https://substackcdn.com/image/fetch/$s_!nJvy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nJvy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png" width="1332" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1332,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:437011,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/196041141?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nJvy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png 424w, https://substackcdn.com/image/fetch/$s_!nJvy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png 848w, 
https://substackcdn.com/image/fetch/$s_!nJvy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png 1272w, https://substackcdn.com/image/fetch/$s_!nJvy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff042f6a7-38c8-49a5-9a39-59599dfc789a_1332x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What does this all mean in the end?</p><p>You should be the person spending time sitting and considering the 
&#8230;</p><ul><li><p>Systems Design</p></li><li><p>Architecture</p></li><li><p>Infrastructure</p></li></ul><p>Look, I don&#8217;t care if you talk to an LLM about the different options available to you. Pros and cons, what&#8217;s hot and what&#8217;s not. How does this fit into that? In fact, that&#8217;s a good idea; it helps you not miss the sharp and hidden edges.</p><blockquote><p>But you should be filtering these decisions through your humanness. You should think about the Engineers around you, what the goals and desires of your CTO are, and what other projects and priorities are coming down the pike in the near future from Product Management.</p></blockquote><p>To simply make technology decisions, or let the LLM make these technology decisions purely based on &#8230; some probability of the next right token &#8230; <strong>is insane.</strong></p><ul><li><p>What are the skills and traits of your engineers?</p></li><li><p>What are the budget and time constraints?</p></li><li><p>How does this fit into next year&#8217;s goals?</p></li><li><p>What meets the needs of the business?</p></li></ul><p><strong>Systems design, architecture, and infrastructure</strong>&nbsp;are the domain of smart engineers who care about their craft and have seen firsthand decades&#8217; worth of right and wrong decisions borne out in reality.</p><blockquote><p>I personally feel like AI simply makes someone 10x what they already were.</p></blockquote><p>Careless design and a lack of foresight have been the bane of many a system since time began. Nothing has changed; the problem has only accelerated with AI.</p><p>I brought up a few technical points and shifts I have experienced firsthand while working in a software lifecycle that includes AI. 
I want to bring this story home by emphasizing a few practical points and takeaways.</p><div><hr></div><h2>Practical Applications</h2><p>Again, you have listened to me rant long enough if you have made it this far, for which I commend you, but you might be asking, &#8220;<em>So what? Where does that leave us, me?</em>&#8221;</p><p>Good question.</p><p>Here are some practical technical takeaways for folks living inside an AI-driven software development lifecycle: things I have found have changed or become very important.</p><p>Have you been left wondering recently what your future might look like, whether you will have a job, and how you can differentiate yourself now that code is no longer the blocker? If I were you, <strong>I would use this list as a study guide. 
</strong></p><ul><li><p>More time is spent on infrastructure and systems design.</p></li><li><p>Quick, seamless CI/CD deployment cycles are becoming increasingly important.</p></li><li><p>Containerization is key.</p></li><li><p>DevOps is hot again.</p></li><li><p>Terraform &#8230; aka Infrastructure as Code, is non-negotiable.</p></li></ul><p>This might look like a disjointed list of things, but I think they are more related than they appear, owing to the fact that they stem from the fundamental shifts AI has wrought on a lot of software development lifecycles, not all, but a lot.</p><p>When code is produced in a fraction of the time, what was once an ancillary task becomes an important part. <strong>If someone produces an MVP in a matter of hours or days, then the movement of that software product into some infrastructure, on some computer, through some automated pipeline is now make or break.</strong></p><ul><li><p>Systems Design and Infrastructure are now the all-important stage on which the play is performed.</p></li><li><p>Deployment pipelines are the conduit through which the glut of new features and products is pushed into the world.</p></li><li><p>Containers are easy to use and make it quick to encapsulate and deploy the variety of software and tools we now swim in.</p></li><li><p>Infra as Code (IaC) is the bedrock upon which the castles of sand are built.</p></li></ul><p>Where once you found your value in the intricate lines of hand-crafted and artisanal functions with which you dazzled your peers, eventually LLMs will rip this from your withered fingers, <em>in part, not in whole, but in part.</em></p><p>Set yourself apart by going the extra mile: Systems Design and Architecture, DevOps and CI/CD, IaC, and Containers. An acute awareness that humans still have a role to play in this pipeline is a valuable skill to refine.</p><p>I urge you to be different. In this age of shipping code at lightning speed, you should slow down. 
<strong>Consider the business and human context in which you find yourself designing software solutions.</strong></p>]]></content:encoded></item><item><title><![CDATA[AI Is Changing Data Engineering Fast]]></title><description><![CDATA[Here&#8217;s What Actually Matters (Andreas Kretz)]]></description><link>https://dataengineeringcentral.substack.com/p/ai-is-changing-data-engineering-fast</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/ai-is-changing-data-engineering-fast</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Wed, 29 Apr 2026 11:11:14 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/194948005/d4721f8104d95cccf562e48ade181b78.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>In this episode of the Data Engineering Central Podcast, I sit down with <a href="https://www.linkedin.com/in/andreas-kretz/">Andreas Kretz</a> to break down what is really happening in the industry right now. 
We go far beyond surface-level AI hype and talk about how data engineering actually works in the real world, what skills still matter, and where most engineers are wasting time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://learndataengineering.com/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-oLY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png 424w, https://substackcdn.com/image/fetch/$s_!-oLY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png 848w, https://substackcdn.com/image/fetch/$s_!-oLY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png 1272w, https://substackcdn.com/image/fetch/$s_!-oLY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-oLY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png" width="1456" height="609" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:609,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:478404,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://learndataengineering.com/&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/194948005?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-oLY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png 424w, https://substackcdn.com/image/fetch/$s_!-oLY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png 848w, https://substackcdn.com/image/fetch/$s_!-oLY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png 1272w, https://substackcdn.com/image/fetch/$s_!-oLY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe6e106-10b6-4d92-8517-6fbade243dba_1582x662.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Andreas shares his full journey from industrial IoT and working at Bosch to building one of the largest data engineering education platforms in the world, training over 2,000 students and reaching more than 100,000 engineers globally. 
We get into what production data systems actually look like, why most learning paths are broken, and how AI is reshaping the role of the modern data engineer.</p><ul><li><p>We also dig into the uncomfortable truths. AI can write code, but it cannot replace thinking. Most engineers focus too much on tools and not enough on problem-solving, system design, and communication. That gap is only getting bigger.</p></li></ul><p>If you are trying to figure out how to stay relevant in data engineering, or you are just getting started and want to avoid years of wasted effort, this conversation will change how you think about your career.</p><div><hr></div><h3><strong>Today&#8217;s podcast is sponsored by <a href="http://estuary.dev/?utm_source=podcast_dec&amp;utm_medium=paid_audio&amp;utm_campaign=signups_spring_2026">Estuary</a>.</strong></h3><p>Without them, content like this isn&#8217;t possible. 
The best way to support this Newsletter is to check out what <strong><a href="http://estuary.dev/?utm_source=podcast_dec&amp;utm_medium=paid_audio&amp;utm_campaign=signups_spring_2026">Estuary</a></strong> has to offer and click the links below.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="http://estuary.dev/?utm_source=podcast_dec&amp;utm_medium=paid_audio&amp;utm_campaign=signups_spring_2026" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rU1J!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png 424w, https://substackcdn.com/image/fetch/$s_!rU1J!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png 848w, https://substackcdn.com/image/fetch/$s_!rU1J!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png 1272w, https://substackcdn.com/image/fetch/$s_!rU1J!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rU1J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png" width="1384" height="280" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:280,&quot;width&quot;:1384,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:89865,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;http://estuary.dev/?utm_source=podcast_dec&amp;utm_medium=paid_audio&amp;utm_campaign=signups_spring_2026&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/189674475?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!rU1J!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png 424w, https://substackcdn.com/image/fetch/$s_!rU1J!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png 848w, https://substackcdn.com/image/fetch/$s_!rU1J!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png 1272w, https://substackcdn.com/image/fetch/$s_!rU1J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a76be4-9417-4a2e-a637-2ca539c08b8c_1384x280.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3 style="text-align: 
center;"><strong>Build millisecond-latency, scalable, future-proof data pipelines in minutes.</strong></h3><p style="text-align: center;"><a href="http://estuary.dev/?utm_source=podcast_dec&amp;utm_medium=paid_audio&amp;utm_campaign=signups_spring_2026">Estuary is the Right-Time Data Platform that integrates all of the systems you use to produce, process, and consume data.</a> Also, providing best-in-class CDC (<em>Change Data Capture</em>).</p><p style="text-align: center;"><a href="http://estuary.dev/?utm_source=podcast_dec&amp;utm_medium=paid_audio&amp;utm_campaign=signups_spring_2026">Estuary</a> unifies today&#8217;s batch and streaming paradigms so that your systems, current and future, are synchronized around the same datasets, updating in milliseconds.</p><div><hr></div><p><strong>What we cover:</strong></p><ul><li><p>Why most data engineers are learning the wrong things</p></li><li><p>The shift from coding to problem-solving and system design</p></li><li><p>How AI is actually changing data engineering workflows</p></li><li><p>Why courses and tutorials are becoming less effective</p></li><li><p>The difference between real production systems and &#8220;toy projects.&#8221;</p></li><li><p>The future of data engineering jobs and whether AI will replace them</p></li><li><p>Why fundamentals still matter more than ever</p></li></ul><p>One of the biggest takeaways is simple. The tools will keep changing, but the problems stay the same. 
The engineers who win are those who understand systems, ask better questions, and connect business problems to real solutions.</p><div><hr></div><p><strong>Links:</strong></p><ul><li><p>Learn Data Engineering Academy: <a href="https://learndataengineering.com">https://learndataengineering.com</a></p></li><li><p><a href="https://www.linkedin.com/in/andreas-kretz/">Andreas Kretz on LinkedIn</a></p></li><li><p><a href="https://www.youtube.com/channel/UCY8mzqqGwl5_bTpBY9qLMAA">Andreas Kretz on YouTube</a></p></li><li><p>Sponsor: <a href="https://estuary.dev">https://estuary.dev</a></p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineeringcentral.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Data Engineering Central is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Angry AI rantings of an old git]]></title><description><![CDATA[Guest post from Anonymous Rust Dev]]></description><link>https://dataengineeringcentral.substack.com/p/angry-ai-rantings-of-an-old-git</link><guid isPermaLink="false">https://dataengineeringcentral.substack.com/p/angry-ai-rantings-of-an-old-git</guid><dc:creator><![CDATA[Daniel Beach]]></dc:creator><pubDate>Mon, 27 Apr 2026 12:09:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!c8-b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c8-b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c8-b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png 424w, 
https://substackcdn.com/image/fetch/$s_!c8-b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!c8-b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!c8-b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c8-b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:505070,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195061919?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!c8-b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!c8-b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!c8-b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!c8-b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56aa0e9-a554-48c4-8f78-19f14f46c26c_1280x720.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="pullquote"><p>It&#8217;s been a while since we&#8217;ve heard from that ol&#8217; stick in the mud, Anonymous Rust Dev.</p></div><h1>A luddite&#8217;s observations on AI</h1><p>Somehow, I still conduct my day job without relying on AI regularly. As your friendly neighborhood Anonymous Rust Dev, you might think I&#8217;ve found a home for it at this point, but thus far, I&#8217;m just as much of a cantankerous old git as the actual <a href="https://www.reddit.com/r/theprimeagen/comments/1ge5hwh/linus_torvalds_reckons_ai_is_90_marketing_and_10/">creator of Git</a>. </p><blockquote><p>I&#8217;ve had people wax poetic, tell me these incredible success stories, and yet they haven&#8217;t won me over.</p></blockquote><p>So, what&#8217;s holding me back? Am I just resisting the inexorable tides of change? What&#8217;s actually going on in my head? 
As I buck this trend and the peer pressure that goes along with it, am I just setting myself up for failure?</p><h2>The observations of a Millennial</h2><p>As a self-described Luddite, I think it is important to <a href="https://en.wikipedia.org/wiki/Luddite">remember the original Luddites</a>:</p><blockquote><p><em>The Luddites were members of a 19th-century movement of English textile workers who opposed the use of certain types of automated machinery due to concerns relating to worker pay and output quality.</em> <br>-<em>(<a href="https://commons.wikimedia.org/wiki/File:Luddite.jpg#/media/File:Luddite.jpg">Wikipedia</a>)</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://commons.wikimedia.org/wiki/File:Luddite.jpg#/media/File:Luddite.jpg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_bQ8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_bQ8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_bQ8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_bQ8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!_bQ8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg" width="1280" height="1690" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1690,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1035221,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:&quot;https://commons.wikimedia.org/wiki/File:Luddite.jpg#/media/File:Luddite.jpg&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195061919?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_bQ8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_bQ8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_bQ8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!_bQ8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a23b84-1a78-41cf-b31b-02a43c9e6226_1280x1690.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">By Unknown. 
195 years since publication, copyright extinguished - <a href="http://146.87.210.2/detail.aspx?parentpriref=110003456">Working Class Movement Library catalogue</a>, Public Domain, <a href="https://commons.wikimedia.org/w/index.php?curid=2603296">Link</a> (Wikipedia)</figcaption></figure></div><p>The Industrial Revolution may be the single greatest acceleration of societal change across history. We have made numerous advancements as a species, many of them game-changers, but access to industrial processes irrevocably changed how we function. The Luddites had good reason to fear the new paradigm &#8212; it was a fundamental attack on their way of life. And yet, they were on the leading edge of a societal evolution that would eventually come to introduce a new &#8220;middle&#8221; class and an era of prosperity for the Western world.</p><p><strong>...But, at what cost?</strong></p><p>The modern era is built on untold horrors arising from industrialism. 
Today, the idea of unions rubs many people the wrong way, but they are a product of a massive power imbalance that existed between workers and employers.</p><p>The likes of Carnegie, insanely wealthy and famous for his philanthropy, reached that point atop the <a href="https://en.wikipedia.org/wiki/Homestead_strike">backs of mistreated laborers</a>. It is often said that OSHA regulations are written in blood, and those who bristle at having to follow seemingly stupid rules should remember that, in their absence, employers could have their workers doing some really nasty and dangerous work, with little to no recourse.</p><blockquote><p><em>Honestly, I can&#8217;t begin to do this topic justice. I don&#8217;t think we&#8217;re ignorant of these horrors; anyone who&#8217;s played games in the BioShock or Fallout franchises, for instance, can see the [fictional] trajectories of worlds where the guard rails weren&#8217;t there.</em></p></blockquote><div><hr></div><h3>Old School</h3><p>Born in the early &#8217;80s myself, I&#8217;ve watched the world change at a rapid pace. My early memories are of things like the family&#8217;s Ford LTD, VCR decks, and the Nintendo Entertainment System; eventually, I&#8217;d even land in a classroom that had its own dedicated Apple IIe in the corner.</p><p>I&#8217;d read the works of futurists who said that one day every house would have a computer, and not long after, news broadcasts would do specials about how the Information Superhighway was coming to life and would soon touch our lives.</p><p>I don&#8217;t doubt that the generation before me was already clinging on for dear life. My grandmother could scarcely turn on the TV with the remote, and we&#8217;d be over at her house teaching her which one of the goofy symbols on the remote would start or stop her movie.
</p><div class="pullquote"><p>Meanwhile, I got to experience the explosion of growth in IT &#8212; computers became ubiquitous, dial-up would eventually give way to cable and DSL, internet chat would supplant phone conversations, and eventually lead to the phenomenon of SMS text messaging. And, of course, social media platforms would appear, fundamentally altering social dynamics and how information is broadcast and shared. That, coupled with the appearance of the smartphone, ensures we&#8217;re now perpetually tethered to the internet.</p></div><p>Society is still reeling, I think. We haven&#8217;t found our footing, and old people today struggle to discern fact from fiction when scrolling through their Facebook feeds. My generation, along for the described ride during our formative years, has just enough perspective and adaptability to keep up with it, if only barely.</p><p>And the newest generations are being thrust into a world of adult-oriented interfaces like touchscreens and voice recognition, or neatly digested information in YouTube Shorts that lets them bypass the learning process that earlier generations were forced to contend with, and I can&#8217;t help but wonder what it&#8217;s doing to hobble their developmental progress.</p><blockquote><p>Paradigm shifts have been a thing for the last couple of hundred years, but their frequency is increasing at an alarming rate. We&#8217;re being tossed about in ever choppier seas, and at some point, I expect the whole thing to capsize.</p></blockquote><p>And then, we see the appearance of generative AI. The pace at which this technology evolves and its impact on the general populace are at once exciting and frightening.</p><p>People who don&#8217;t truly understand the technology get to witness this magic genie that seemingly has an answer for everything, while the people who do understand it (insofar as any one of us truly can) know that it&#8217;s simply stringing together the works of yore. 
This tech lands at the worst possible time, when we&#8217;re still reeling from the collective shock of the pandemic and dealing with tumultuous sociopolitical upheaval and a deepening divide across multiple facets of society.</p><div><hr></div><h2>The observations of a freelancer</h2><p>For much of this, I watch from the sidelines. I have no business overlord forcing AI down my throat; I can only watch with fascination as many businesses, like Google or Facebook, attempt to mandate that their workers use AI in their day-to-day activities. 
For them, the tool is a shiny new hammer, and they&#8217;re going to find a nail eventually.</p><blockquote><p>It <a href="https://futurism.com/artificial-intelligence/ai-failing-boost-productivity">doesn&#8217;t seem to be going all that well, though...</a></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://futurism.com/artificial-intelligence/ai-failing-boost-productivity" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y8t9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png 424w, https://substackcdn.com/image/fetch/$s_!y8t9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png 848w, https://substackcdn.com/image/fetch/$s_!y8t9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png 1272w, https://substackcdn.com/image/fetch/$s_!y8t9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y8t9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png" width="1456" height="671" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:671,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1288353,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://futurism.com/artificial-intelligence/ai-failing-boost-productivity&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineeringcentral.substack.com/i/195061919?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y8t9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png 424w, https://substackcdn.com/image/fetch/$s_!y8t9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png 848w, https://substackcdn.com/image/fetch/$s_!y8t9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png 1272w, https://substackcdn.com/image/fetch/$s_!y8t9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63690352-45d3-4c16-b4ca-255d558c662d_2002x922.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Meanwhile, I stay in touch with old colleagues in the web development business. They&#8217;ve got it rough &#8212; one of the areas in which generative AI excels is in displacing creativity, and digital design studios are especially hard-hit. </p><p>Of course, we all <a href="https://www.businessinsider.com/coca-cola-ai-holiday-ad-glitches-highlight-ai-shortcomings-2025-11">know it&#8217;s soulless crap</a>, but it&#8217;s becoming increasingly difficult to discern the difference between the real and the fiction. </p><p>At this point, I honestly don&#8217;t know how the real Will Smith feels about spaghetti, and if he came out in a video showing and telling us his love for the stuff, I wouldn&#8217;t be able to tell you whether it was true. 
<a href="https://www.reddit.com/r/isthisAI/">r/isthisAI</a> gives a glimpse of the struggle to see through the haze.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qrwz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qrwz!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif 424w, https://substackcdn.com/image/fetch/$s_!Qrwz!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif 848w, https://substackcdn.com/image/fetch/$s_!Qrwz!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif 1272w, https://substackcdn.com/image/fetch/$s_!Qrwz!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qrwz!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif" width="480" height="240" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Mmmm, spaghetti&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Mmmm, spaghetti" title="Mmmm, spaghetti" srcset="https://substackcdn.com/image/fetch/$s_!Qrwz!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif 424w, https://substackcdn.com/image/fetch/$s_!Qrwz!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif 848w, https://substackcdn.com/image/fetch/$s_!Qrwz!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif 1272w, https://substackcdn.com/image/fetch/$s_!Qrwz!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c2a436-63f9-41ee-80c7-5a33967d3ca0_480x240.gif 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>...And personally, I find it very hard to tell anymore what&#8217;s fake and what&#8217;s authentic. I like to think I know what I&#8217;m looking for, but the cues are increasingly hard to detect. 
At least in my own generation, I feel we&#8217;ve been sufficiently primed to expect digital misinformation indistinguishable from fact; how much worse for the older generation, with their difficulties keeping up, or for the younger ones who haven&#8217;t seen the evolutionary path and are simply being dropped right into the midst of this?</p><blockquote><p><em>I think it&#8217;s all by design. We&#8217;re currently so inundated as a society with change that we can&#8217;t hope to keep up. It&#8217;s a functional <a href="https://en.wikipedia.org/wiki/Gish_gallop">Gish gallop</a> &#8212; we&#8217;re assaulted from all sides, and we can only devote so much energy to making sense of it all. Hidden in this assault, I think, are multiple concerted efforts:</em></p></blockquote><p><strong>First</strong>, businesses clearly see an opportunity to displace workers or readjust the power dynamic in their own favor. Unlike the fantasy universe of Wall-E, though, we probably can&#8217;t expect the automation to work in favor of the proletariat... Instead, the chasm between the haves and the have-nots will only widen. </p><ul><li><p>For a brief period during the pandemic, the IT worker was a highly coveted asset that businesses would fight over; now, big business sees a rare opportunity to flip the script and devalue those same workers by claiming AI can do their jobs. Whether that&#8217;s true or not, the narrative is being cemented, and soon we should count ourselves &#8220;lucky&#8221; if a business deigns to give us a job and the time of day to go with it.</p></li></ul><p><strong>Second</strong>, we&#8217;re being conditioned by AI vendors. Free-tier and low-cost AI services were a gateway drug, giving people a taste of what&#8217;s possible. 
As workers increasingly depend on AI to do the dirty work for them and consequently lose the skills or muscle memory associated with that work, there will one day be a rug-pull in which that subscription cost balloons rapidly, and people who got hooked will need to keep their drug dealers happy no matter the price. </p><ul><li><p>The vibe coding era will be quickly replaced by a &#8220;prompt specialist&#8221; one, where you can no longer just pick up the tech on a lark, but instead need to sell your soul for access to it.</p></li></ul><p><strong>Third</strong>, in a rather conspiratorial vein, I also see an emerging technocracy, where certain leaders in the tech space will have positioned themselves to profit from the chaos. I don&#8217;t need to name them; you probably already know who I mean, but they will have the keys to the kingdom. </p><ul><li><p>They will own the data centers, dictate policy, and be the final arbiters of right and wrong. It&#8217;s <a href="https://www.whitehouse.gov/presidential-actions/2025/12/eliminating-state-law-obstruction-of-national-artificial-intelligence-policy/">already happening</a>, and it will only intensify over time. 
There is still a shrinking window of opportunity to fight this, as the digital arms race collides with economic and <a href="https://www.youtube.com/watch?v=sL5pItTShys">infrastructure difficulties</a>, but if they emerge victorious, then we&#8217;ll be facing a grim future.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2pj2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2pj2!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!2pj2!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!2pj2!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!2pj2!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2pj2!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif" width="480" height="270" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:270,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The future&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The future" title="The future" srcset="https://substackcdn.com/image/fetch/$s_!2pj2!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!2pj2!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!2pj2!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!2pj2!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2997243b-af12-4bfc-9d27-25e27572843c_480x270.gif 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h2>The observations of a developer</h2><p>So, do I think it&#8217;s all horrible? No, quite the contrary &#8212; I see a place for <em>tooling</em> to emerge that can function as a force multiplier, for those who know what they&#8217;re looking at and how to accommodate it. 
Granted, anyone who depends on AI-generated code needs to question the ethics involved; those models aren&#8217;t above plagiarizing the works of those who came before.</p><p>The major dangers I see are:</p><ul><li><p>As already stated, there&#8217;s a hook of reliance and an atrophy of skills for people who depend too heavily on this stuff</p></li><li><p>The illusion that developers are easily replaced can create economic shock</p><ul><li><p><em>The idea that developers are easily replaced by automation will displace entry-level talent, and when they&#8217;re not given the opportunity for growth, the future talent pool will be effectively choked out</em></p></li><li><p><em>Talented engineers will be needed to come in and fix the mess when the dust settles, but their work will be much harder as they&#8217;re forced to comb through an accumulation of slop and codebases lacking a coherent vision</em></p></li></ul></li></ul>
      <p>
          <a href="https://dataengineeringcentral.substack.com/p/angry-ai-rantings-of-an-old-git">
              Read more
          </a>
      </p>
   ]]></content:encoded></item></channel></rss>