Sitemap - 2024 - Data Engineering Central
Delta Lake vs Apache Iceberg. The Lake House Squabble.
Why Is Everybody So Big on Zig?
AWS S3 Tables?! The Iceberg Cometh.
Databricks Raises Money - 55 Billion Dollar Valuation
Turkey Day Is Here - Black Friday Sale - 50% Off
Data Engineering Central Podcast - 04
10 billion row challenge. DuckDB vs Polars vs Daft.
DataFusion, My Swiss Army Knife
End of Year Engineering Planning for 2025
Apache Airflow vs Databricks Workflows
DuckDB inside Postgres (pg_duckdb) Exposed!
The Death of Primary and Foreign Keys?
What makes "smart" engineers so stupid.
Data Engineering Central Podcast - 03
Daft vs Spark (Databricks) for Delta Tables (Unity Catalog)
Small Engineering Changes (PR reviews)
Should you use DuckDB or Polars?
Data Engineering Central Podcast - 02
Maestro - Netflix Open Sources Workflow Tool
I used ChatGPT o1 to do PostgreSQL basics
Data Engineering Central Podcast
Lord Save Us, Not Another ETL Tool Please!
There are 3 Types of Data Engineers.
Databricks. Delta Lake. Table Versions. Polars. Insidious Features.
MLflow ... with Databricks. Thoughts and more.
NO EXCUSES! Answer the dang questions!
Realtime Streaming data from PostgreSQL to Delta Lake (Unity Catalog)
Using SQL with Python. The Ultimate Chad Stack.
The Rise of The Notebook Engineer
Deploying Spark Streaming with Delta on Kubernetes using Terraform
You're Doing Data Engineering Wrong.
Date and Time Manipulation with DuckDB
Kubernetes Sucks. Long Live K8s.
Ain't no room for AI (in my workflow)
Replace Databricks Spark Jobs (using Delta) with Polars
Snowflake is Dying on the Vine?
Data Validation for Data Engineers
DuckDB 1.0.0 - Let's Kick The Tires
When to Rust for Data Engineering ... and when NOT to.
Introduction to Daft ( ... vs Polars)
Real Life Example of the QuickSort Algo (Rust)
Premature Optimization is NOT the root of all evil?
I See Window Functions Everywhere
How Tech Debt, Databricks, and Spark UDFs ruined my weekend.
Cost Savings for Databricks Users
Why Analytics is a Lose Lose Game
Redshift vs Snowflake vs BigQuery vs Databricks vs ...
Transitioning to Senior Engineer
Delta Lake - Map and Array data types
Spark Connect - What is this madness?
How to Build an Open Source Python Package
Why Aren’t You Filtering More?
Default Values - Thoughts and More
Error Handling for Data Engineers
Microservices for Data Engineering
UDTFs (User-defined Table Functions) in PySpark.
Apple Pie. Angry People. Other News.
DuckDB vs Polars - Thunderdome.
New SQL Practice Problems - Free For Paid Subscribers
Unit Testing for Data Engineers
Batch vs Near-Realtime vs Streaming
Why DuckDB is losing to Polars
LLMs Part 2 - Fine Tuning OpenLLaMA
Introduction to Write-Audit-Publish Pattern
Data Warehouse Analytics - Latency