SQL: You Can't Be a Data Engineer Without It (Non-Negotiable)
Why SQL is Essential for Your Career
SQL is a long-term game. It’s easy to pick up, but hard to master.
I’m not a master by any means, but I know how to write good SQL — clean and efficient.
The problem is that any fool with an internet connection can figure out how to write a SELECT statement and press execute: the classic ‘hit and hope.’
I see these ‘hit and hopes’ all the time. They normally have a SELECT * in them and no WHERE clause. Noobs out there think that because the results came back, their SQL is good to go. Wrong.
SQL has a way of lulling you into a false sense of security, and then one day, bam! That SQL query (the one with the * and no WHERE clause) that was working so nicely doesn’t anymore. Your little query is now chugging through years of data, scanning the entire table, and the database is on its knees. Oops!
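To make the ‘hit and hope’ concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The orders table, its columns, and the row counts are all made up for illustration; the point is only the difference between grabbing everything and asking for what you need. Bear in mind that a WHERE clause on its own does not guarantee the engine reads less data. That depends on indexes, partitioning, or clustering, which vary by database.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, order_date TEXT, amount REAL, notes TEXT)"
)

# 1,000 fake orders spread across the years 2010 to 2024.
rows = [
    (i, f"cust_{i % 10}", f"{2010 + i % 15}-01-01", float(i), "x" * 100)
    for i in range(1, 1001)
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

# The 'hit and hope': every column, every row, every year.
hit_and_hope = conn.execute("SELECT * FROM orders").fetchall()

# Get what you need and get out: two columns, one year, parameterized filter.
targeted = conn.execute(
    "SELECT id, amount FROM orders WHERE order_date >= ?",
    ("2024-01-01",),
).fetchall()

print(len(hit_and_hope), "rows x", len(hit_and_hope[0]), "columns")
print(len(targeted), "rows x", len(targeted[0]), "columns")
```

Both queries ‘work’ in the sense that results come back. Only one of them still looks sensible when the table holds years of real data.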
If we throw some AI into the mix, then the line between good SQL and okay-ish SQL becomes even fuzzier, but the worms come out of the woodwork sooner or later. They always do.
It’s the very reason I love writing SQL. SQL has a way of keeping you honest and on your toes. Proof: when you do a DROP TABLE, there is no “are you sure?” It does what you told it to do, and that table is going bye-bye. Many a developer has strolled over to my desk, head in hands, saying they’ve dropped a table or deleted data they shouldn’t have.
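One habit that might save you that walk of shame, sketched here with Python's sqlite3 module: run destructive statements inside an explicit transaction and check the damage before you commit. The customers table, the count_rows helper, and the ‘I expected to delete exactly one row’ rule are assumptions for this example. Transaction behaviour also differs between engines (some auto-commit DDL such as DROP TABLE), so treat this as a sketch, not a guarantee.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so we control BEGIN / COMMIT / ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)",
    [("Ada",), ("Grace",), ("Edgar",)],
)


def count_rows(connection):
    """Hypothetical helper: how many rows are in customers right now?"""
    return connection.execute("SELECT COUNT(*) FROM customers").fetchone()[0]


expected_deletes = 1  # we only meant to remove one customer

conn.execute("BEGIN")
before = count_rows(conn)
conn.execute("DELETE FROM customers")  # oops: forgot the WHERE clause
deleted = before - count_rows(conn)

if deleted == expected_deletes:
    conn.execute("COMMIT")
    outcome = "committed"
else:
    # The database never asked "are you sure?", so we ask ourselves.
    conn.execute("ROLLBACK")
    outcome = "rolled back"

remaining = count_rows(conn)
print(outcome, "-", remaining, "rows still in customers")
```

The check is crude on purpose. The design choice is simply to put a moment of doubt between running the statement and making it permanent.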
It’s a tale as old as time.
A New World…
In this brave new cloud-based world we find ourselves in, the land of BigQuery, Snowflake, and Databricks, you need to have your head screwed on when it comes to clean and efficient SQL (I’m talking the get-what-you-need-and-get-out kind of SQL, aka cheap to run) or you WILL break the bank. These platforms bill you for the data you scan or the compute you burn, so a sloppy query costs real money every time it runs.
Breaking the bank means the people from finance will be shining spotlights on you and your team, and that equals danger. You want a spotlight on your team, but not because you and your queries are burning through the budget by the bucketload.
It’s easy, they say…
You can pick up the basics of SQL fairly quickly — most do. That is where they leave it (until something bad happens).
If you want to get good at SQL, I’m talking really good, then it takes time — lots of it.
If you want to know the right and wrong way to do something with SQL, then I’m afraid, again, it takes time and experience.
Most don’t have the time, which is why they write ‘slapdash SQL’ and why data out in the wild is poorly modeled and cobbled together, a mess like a makeshift raft barely staying afloat. People can’t be arsed to understand the ins and outs of it or to think long term. They write the code and move on.
Why Learn SQL?
As a Data Engineer, you NEED SQL. I would argue that learning SQL is one of the rare no-brainers in tech. There are too many benefits to learning it to ignore.
SQL is a sledgehammer: you’re not going to chug through petabytes of data without it. No chance. SQL is everywhere and not going anywhere either; it powers nearly every relational database and data warehouse out in the wide world.
The problem: data folks often forget SQL is a tool, one of many in this game. For you, Mr. Data Engineer, it’s something you need and something you should get good at, along with the other languages and tooling out there.
Learn it in conjunction with other things. Mastering SQL alone will not cut it, but getting good at it will help you play the long game and see the bigger picture. Don’t get sucked into the ‘I don’t need SQL’ or ‘it’s quick to learn’ nonsense.
How do you learn SQL?
What does ‘learn SQL’ even mean?
Is it CRUD operations? Or basic syntax?
Is it modeling data?
Is it being able to get data in or out of a database?
Is it performance tuning?
The idea of ‘learning SQL’ is stupid. I’ve been working with relational databases and SQL for the better part of 15 years, and I still don’t consider myself an expert.
SQL is vast, complex, and constantly evolving.
It’s daunting and big!
You will never know it all, and that’s okay. The goal should never be mastery (of anything); it should be continuous improvement. You can, however, get good at it by focusing on the areas that matter the most to your work (right now) and the company you find yourself in.
I will say this though: learn the fundamentals well, and then build on them. Most people learn the basics (when I say basics, I mean the bare minimum), then they draw a line under it and think they “know” SQL.
Remember, amateurs stop when they achieve something. The pros know that there is no end goal, just small steps that are launchpads to the next step.
Don’t stall — keep learning.
If you don’t know the basics of SQL well, you will never grasp the rest. The basics are your bread and butter, especially for a data engineer.
A data engineer who does not know the basics of SQL is, in my opinion, lost.
Understanding fundamental concepts like joins, indexing, and query optimization isn’t just ‘nice to have’ — it’s a must. These are the things you will rely on every single day.
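As a rough illustration of why indexing and query optimization matter, the sketch below asks SQLite how it plans to run the same query before and after adding an index. The events table, the plan helper, and the idx_events_user_id index are hypothetical names. The exact plan wording differs between SQLite versions and between database engines, so the code only looks at whether the plan is a scan or a search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, event_date TEXT)"
)
# 10,000 fake events for 100 users (illustration only).
conn.executemany(
    "INSERT INTO events (user_id, event_date) VALUES (?, ?)",
    [(i % 100, f"2024-01-{i % 28 + 1:02d}") for i in range(10_000)],
)
conn.commit()

query = "SELECT id, event_date FROM events WHERE user_id = ?"


def plan(connection, sql, params):
    """Return SQLite's query plan as one string (the detail column of each step)."""
    steps = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(step[3] for step in steps)


# Without an index on user_id, SQLite has to read the whole table.
plan_before = plan(conn, query, (42,))

# With an index, it can jump straight to the matching rows.
conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
plan_after = plan(conn, query, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The query text never changed, only the way the engine gets to the rows. Reading the plan before you ship a query is one of those basics that pays off every single day.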
If you want to design efficient data pipelines, model datasets, and have systems perform at scale, get good at the basics, and everything else will follow.
You need to know this stuff.