5 Lessons from 5 Data Teams (Lessons for Data Engineers)
Common Sense Data Engineering Lessons Across Teams
I’ve been lucky in my career. Lucky in the sense that I’ve worked in a number of different industries: finance, tourism, fashion, media, and now music.
I’ve been in big teams, small teams, and the ones in between. I’ve even flown solo for a spell. I’ve been in the deep end and somehow stayed afloat. I’ve been in the shallow end where there was nothing to do but twiddle my thumbs and wonder, “Why am I even here?”
I’ve seen how companies use their data. I’ve seen how not to use it. I’ve seen how you should use it.
I’ve seen common sense go out the window more times than it should have. And I’ve learned, over and over again, that common sense isn’t all that common.
I’ve seen Excel run some of the leading companies in the world. A lot of Excel. All. Over. The. Shop. I’ve seen tables get dropped, databases corrupted, data lost, dashboards built, and very smart people literally pushing buttons hoping a restart would solve it all (it didn’t). I want to say I’ve seen it all, but I know something will come along that makes me raise my eyebrows or adds a few more frown lines to my forehead. Nothing surprises me anymore.
I’ve learned a ton from life in the trenches. I’ve slogged it out day in and day out. I still do, and I love it that way. After all these years, I’m of the opinion that in the tech world, it all boils down to three types of teams:
Those that make things happen.
Those that watch things happen.
Those that don’t know what’s happening.
Those are the main ones. Teams drift in and out of them over days, weeks, months, or years, but the good teams spot it happening and shift gears, fast. They hit the gas before bad habits become the norm and before anyone starts saying, “Oh, that’s just how we do things here.” Good teams have their systems and values down. They know it’s not about being perfect; it’s about being consistent.
Life working in data is a freaking circus. No joke. I’ve learned that experience is the real teacher, not the little tutorial you spent 10 dollars on. Here are the golden nuggets I’ve picked up from five teams I’ve been part of.
Lesson 1 — Priority is to prioritise
There is always something to do. Always. The business folks of the world will have something for your team to do, and if not, they will cook it up for you, no problem. Therefore, you will need your wits about you.
The more work you get, the more important it is to work out what is important. When I say important, I mean what brings value, makes the team look like rock stars, helps as many people as possible, or solves a real problem.
Having 1000 KPI dashboards and reports does not mean a company is being diligent and proactive. The truth is, the more dashboards and reports pile up, the blurrier it becomes what really matters.
Kill off projects and tasks that bring little or no value. Do this early. Do this often. Don’t hold back. One project or task can send you and your team to the stratosphere or off a cliff — choose wisely.
Lesson 2 — Stop the bus
What do I mean by stop the bus? If you’re looking for perfect, it will never come. You can always make that script better. You can always make that pipeline faster. You can always find some way to improve something, but eventually, you’re going to have to stop the bus and get off. You need to figure out when your stop is. You need to figure out when what you are doing is good enough.
I worked with a DBA once who would spend hours and hours changing this and that to try to performance tune a query to shave milliseconds off its execution time, only for the CTO to say, “Oh, that’s great, but this doesn’t need to be fast; it just needs to work. It was good enough as it was.” Ding ding. Stop the bus and get off. Don’t even board the bus until you know where you’re getting off. Remember, sooner or later, the bus you are on will drive off a cliff.
Lesson 3 — RTFM
Read the F*@£ing manual. Everyone knows this saying, yet no one seems to do it. Common sense again.
People will go to Google and Stack Overflow and try all this and that, but the official documentation is rarely the first place they look. The documentation is there for a reason: it tells you exactly how something works, its limitations, and its possibilities. It sounds almost too good to be true. The good people who built that tool took the time to write out how the thing works; it’s only right to have a read. If you do take the time to read the documentation, you have a pretty good chance of knowing what you are doing rather than making it up as you go along, winging it, or worse, going straight to ChatGPT.
Someone told me once that the sign of a great data professional is if they read the official documentation. Well, I’m not sure how true that is, but it makes sense to me. RTFM and read it often.
Lesson 4 — There is always a plan B
The plan you start out with will not be the plan you end with. This upsets people. People get mighty stubborn when the way they planned it to go doesn’t happen. Very similar to my two-year-old daughter when she can’t have her way.
Here is the truth of it: something will happen that will make you change course, maybe not a huge change but a change nonetheless. Always have a plan B or C or D. Be open to change and be adaptable. Plan A is good; it got you going, it got you moving, and most importantly, it got you started.
Never be afraid to change course if it feels like the right thing to do. No point banging your head against a brick wall for days and weeks because you can’t get plan A working. Change and adaptability are the name of the game as a data engineer. Who knows, plan B might just be the silver bullet you were looking for.
Lesson 5 — Complicated sucks
Complicated code is the code no one wants to touch.
This goes for all things data-related: databases, servers, dashboards, procedures, functions, etc.
Simple code, simple tasks, simple processes. These are the things that stand the test of time. They are easy to fix. Easy to manage. Easy to improve on. Complexity just slows you down and increases maintenance costs, for no benefit.
Yes, sometimes complex is the only way to go; that’s just how it is sometimes (I said sometimes). But I’ve seen data engineers take simple things and dress them up with all the bells and whistles: advanced language features when simple would suffice, undocumented features when tried and tested methods work, poor structure, and little or no naming conventions. Think alphabet soup in a blender.
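To make that concrete, here’s a small, made-up sketch; the data, names, and functions are invented purely for illustration. Both versions compute the same thing, but only one will still make sense to whoever inherits it.

```python
# Invented example: average order value per customer, written two ways.
orders = {
    "alice": [120.0, 80.0, None],  # None = a refunded or missing amount
    "bob": [None, None],
    "carol": [55.0],
}

# The "bells and whistles" version: terse names, everything crammed into one expression.
def aov(d):
    return {
        k: sum(x for x in v if x is not None) / len([x for x in v if x is not None])
        for k, v in d.items()
        if any(x is not None for x in v)
    }

# The simple version: clear names, one step at a time, easy to debug and extend.
def average_order_value(orders_by_customer):
    averages = {}
    for customer, amounts in orders_by_customer.items():
        valid_amounts = [a for a in amounts if a is not None]
        if valid_amounts:  # skip customers with no usable orders
            averages[customer] = sum(valid_amounts) / len(valid_amounts)
    return averages

print(aov(orders) == average_order_value(orders))  # True: same result, very different upkeep
```

Both return the same numbers today. The difference shows up six months from now, when someone else has to change them.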
If you can do it simple, do it simple. Avoid complexity at all costs. It will save you and your team time and effort and make life easier for all around you. If you have to relearn a codebase every time you touch it — there’s a problem.
Thanks for reading! I send this email weekly. If you would like to receive it, join other readers and subscribe below to get my latest articles.
👉 If you enjoyed reading this post and found it valuable, share it with someone who might benefit, or click the ❤️ button on this post so more people can discover it on Substack 🙏
🚀 Want to get in touch? Feel free to find me on LinkedIn.
🎯 Want to read more of my articles? Find me on Medium.
✍🏼 Write Content for Art of Data Engineering Publication.