That’s right, you pasty little hacker. You. Yeah, you. Over there on a Friday afternoon when you should be skipping out of work early. Just putting in a few more lines of code. I mean it’s just a few more.
We just need more code. We can solve this problem, just give me some time to write the code. Did you see that code they wrote? That’s not how you should write that code. Aren’t you writing Rust yet? Golang? Hey, you heard about Zig? I heard it’s lit.
Is it really all about the code? Leetcode junkies snorting one more line of algo goodness into their already exploding brain. Why write one line when you can write five?
I mean gotta keep that GitHub profile screaming that green. Besides, what will your teammates think of you if you aren’t writing a ton of code? Lots of code = doing your job.
What mid Data Engineers are doing wrong.
Maybe this should be about what we are all doing wrong. I would say in all my years of Data Engineering probably the single most important lesson I’ve learned is …
… when to NOT write code.
No empty platitudes here, just truth. This is what all the mid-engineers get wrong. Yeah, you have to be a good programmer to reach Senior+ levels, and that requires writing lots of code over time and getting experience.
Nothing worse than a programmer who simply can’t program.
That doesn’t negate the fact that when you reach those Senior+ levels, the proverbial rug is pulled from under your programming feet. The world you live in gets flip-flopped.
You got your job by being a good programmer, better than average, probably better than those around you. So why the sudden change? Senior+ engineers spend most of their time …
Planning projects
Helping others solve problems
Upskilling and mentoring others
Architecture
Working with the business units (non-engineering)
Translating requirements and needs back to Engineering
You will note that nothing in the list involves programming (although it probably involves the assumption you’re a good programmer).
You’re not special.
I hate to break it to you. I don’t care what your mom told you. She lied. You’re not that special. If we posted your job on the interwebs, by tomorrow morning there would be 1000 applications from people who can all write SQL and Python.
Many of them might be below average … but guess what … you can teach someone how to become a better programmer!
So many people mistakenly think that being a better Data Engineer or coder is all about getting a higher rank on LeetCode. That simply isn’t the case. Anyone who programs consistently over time will become better. It’s a relatively easy skill to teach.
You know what isn’t that easy?
Being a team player
Being someone who can solve problems without hand-holding
Having excellent written and verbal communication skills
Being a constant learner
These skills are way harder to learn, hone, and teach than simply how to write SQL and Python. I mean we have ChatGPT now … it’s as good as most of the mid-Engineers.
If you’re on a team of Data Engineers, or trying to get hired onto a team, do you really want to spend all your time becoming the best programmer? Sure, might not be a bad idea. But, if you want to stand out, you need to build the skills that separate Senior+ Engineers from the rest.
The engineers that provide …
Business value
Cross-functional relationships
Upskilling for the team
We need more Engineers who think before they code.
This is how you’re doing Data Engineering wrong.
Instead of responding to every single problem with more code, immediately, we need more Data Engineers who are willing to stop for a moment … and think.
Good engineers understand that every single line of code written is a liability, something that can and will break. Something that is tech debt.
Majoring in Minors
There are lots of ways to major in minors, to walk down the path of ill-fated Data Platforms that are a spaghetti of tools, half-baked code bases, no tests, and general crap.
I mean, this is where most Data Engineers spend their time, majoring in minors, while the important stuff molders away in the corner.
Arguing about whether to use Python vs Scala vs Golang vs Rust
Not agreeing on whether to use subqueries or CTEs
Python or DataFrames vs SQL
Snowflake vs Databricks
AWS vs GCP
OOP vs imperative vs procedural vs functional
Kimball vs Inmon vs other data modeling
Managed service vs self-hosted
Data Warehouse vs Lake House vs Data Lake
In the age of marketing drivel pumped out in a never-ending stream of regurgitated old ideas, the SaaS companies you worship are releasing new “must-have” features every week. You don’t need them all to have a good Data Platform or to be a good Data Engineer.
In fact, there is nothing more telling than when folk bluster on about the brand new this or that, acting like it’s come to deliver us from our purgatory of data. How foolish. Did we not survive up until this point?
Don’t major in the minors like everyone else.
Worry about building Data Platforms and Data Products that are …
reliable
simple
set up with monitoring and logging
built on good development lifecycles
delivered on time
making customers (end users) happy
providing real value
cost-effective
Heck, I want you to be a good coder. I think there’s nothing wrong with spending time on LeetCode and honing your craft. We should all spend time becoming better writers of code. But being an excellent programmer requires more than good code, a lot more.
Getting these values swapped around is probably what keeps most mid-level and junior engineers from getting to the next level: simply refusing to accept the fact that writing code is not all that matters.
What good is your beautiful code if it’s not what the business wants or needs? What good is it if it costs too much? What good is it if it’s over-engineered? Instead of providing value, many engineers simply write as much code as possible and end up destroying the very thing they are supposed to curate, their Data Platforms.
So don’t do it.
You don’t need just one more line of code. It isn’t always the answer.