How Data Engineers Win (Discipline and Fundamentals)
The Real Things Every Data Engineer Should Care About
Data is a land of opportunity. It’s also the land of choice.
Too much choice, if you ask me. Too much choice leads to confusion. Too much choice breeds too many opinions. That is when common sense goes right out the window.
Example:
Airflow, Dagster, Prefect, Mage AI, Kubeflow, Flyte, and Luigi. This list reads like some kid’s favorite Pokémon.
Here’s the secret about most tools in Data:
They all do the same thing! Some do it better. Some have extra bells. Some have fancy whistles. Some are cheap (this is what people care about); most are expensive, like crazy expensive. Some are easier to use than others, and all have the world’s leading marketing teams trying to ram them down your throat the first chance they get.
End of the day, they all do what it says on the tin. Take the above list, for example. They all do the same thing: orchestrate. End of story.
Here’s another example: BigQuery vs. Databricks. Databricks vs. Snowflake. Snowflake vs. Redshift.
On and on it goes.
The Never-Ending Battles
If you scroll LinkedIn or Medium long enough, you’ll likely stumble onto an article explaining why this tool is better than that one or why you should use this one over that one.
Endless pros. Endless cons.
To these articles and posts, I have one thing to say: Who cares!
Unless you are the CTO or in some other position of power where you get to decide these things, which tool is best should be of no concern to you. Sure, if someone comes along and asks your opinion, then by all means share your views and throw your two cents in. Most of the time, however, you are walking into a job somewhat blind.
It sucks, but you have no say in what orchestration tool you get to use (they probably have one already).
What database system should they be using? Nope, sorry, it’s probably been set up, bedded in, and not going anywhere (not soon, that’s for sure).
Which ETL tool is the best? Again, it’s probably made itself at home already, and you are just a guest in the house.
You have no control over these choices. Make friends with whatever tools are available.
As my mother would say each dinnertime, ‘You get what you are given.’ This was as true at my childhood dinner table as it is in data. You get what you are given.
The Right Tool for the Job
People like to say, ‘use the right tool for the job.’ I agree, but most, if not all, of the time, you’re in a position where you have to make do with the tool you’ve got. Survivor style, learn to make do.
Why? Because most companies don’t have the budget to blow on the latest and greatest silver bullet.
Many times, I’ve looked over the fence at something like Snowflake or Databricks and wished for this or that little feature which would solve my problems, but companies are not going to run out and buy you Snowflake just to make you happy. You have to use the tools at your disposal.
So, instead of worrying about which is better, get to work and focus on the things that you have control over. For me, it’s working on the fundamentals and the discipline I care about.
It’s a Fundamentals Game
To get good at what you do, you have to know the basics. The millennials out there aren’t going to like this, but the only way to learn the basics and get good at them is to do the work.
I’m talking doing it every… single… day… Practice, practice, practice. Learning different types of data, understanding how data is organized, and grasping simple aspects of the job — this is what you will fall back on time and time again. The book Fundamentals of Data Engineering by Matt Housley and Joe Reis should be your go-to.
Then learn your SQL, your Python, your data modeling — all these things combined will lay some solid foundations for you. Throw some mistakes on that bonfire too while you’re at it; they will help you learn (fast).
The fundamentals of data are what will help you learn and master whatever tool lands on your doorstep. It will allow you to pick up tooling and projects with confidence. So, when you walk in the door of your new job and find something you’ve never used before plonked on your plate, it won’t scare you. Why? Because you have the fundamentals down.
My Fundamentals Story
Two years ago, I’d not worked on GCP, let alone touched BigQuery. Fast forward to the present day. I know the ins and outs of BigQuery and the GCP ecosystem. I’m not an expert, but I know how to wrangle data in it better than most. I know the right and the wrong way to get BigQuery to do its thing. Why? Because I took my experience and applied it to BigQuery. Simple. If I can do it, so can you. Fundamentals, folks. It’s all that matters.
The Art of Discipline
The sign of a great data professional is their ability to pick up new things fast with as little hand-holding or spoon-feeding as possible. To be a successful Data Engineer, you need to have significant patience, determination, and curiosity. The only way I know to do all that is by good old-fashioned discipline.
Discipline in learning.
Discipline in the way you work and the way you think.
Discipline in your habits and approach to projects and problems.
In my professional life, I have worked with every conceivable type of data professional, from masters of the craft to dead average or worse — lazy. The ones who were at the top of their games were not necessarily the smartest. What set them apart from most was discipline. They:
Did the basics right.
Showed up consistently.
Kept learning every day.
Approached tasks with a methodical mindset.
Maintained a high standard in every aspect of their work.
Discipline is not something you are born with. You learn it over time — slowly. Small daily disciplines in your work will stack up and, over time, snowball. Discipline will set you apart from 80% of people out there. You don’t have to be smarter than average or have attended five years of university to learn it. It’s earned every day you show up and grind it out. You’re not going to get anywhere without it.