I’ve enjoyed a much-needed two-month parental leave, and as I’m back to work on Friday, this is a good time to review my position and duties.
If you check the CARTO team page as of today, you’ll find that I am a Principal Data Engineer. It’s not such a common role. What does it mean? Let’s Google it!
- This Quora question is mostly answered with “it depends”, but you can distill that it’s an experienced person close to management (but not a manager) with some degree of independence.
- This other Quora response frames it in terms of rank, as a way to grow without management duties.
- Study.com talks about strategy, goals, expertise, and motivation.
- A Software Engineering StackExchange question focuses on architecture at a macro level.
At CARTO, it’s explained by contrast. A Tech Lead is a developer who evolves towards a management position, while a Principal is a purely technical role. When I switched from Tech Lead to Principal, it was summarized as dropping team management duties and taking on others related to product definition and experimentation.
What about the “Data” qualifier? It doesn’t mean being a Data Scientist (though we’re looking for some: check our open positions and you’ll see backend and data vacancies as well). Here, “Data” refers to our data strategy. In order to get great insights out of your input, you need valuable, ready-to-use data, and we want to make CARTO great in that regard.
So, we could say that it’s a technical role that must focus on the definition and implementation of solutions for data products.
What are the main challenges?
- Data quality: if you think that you can just download a CSV and add it to your analyses, you’re just wrong. There’s a lot of cleanup, normalization, and transformation involved, just to name a few of the tasks.
- Data volume: nowadays there’s a ton of data everywhere. A lot of noise as well, but also legitimate amounts. If you’re not careful enough, you can turn any simple task into a weeks-long nightmare for developers and users.
- Customer orientation: even if you have managed to get attractive, big data, you still need to understand your customers’ needs. Do they want an API? Bulk download? How does it relate to platform limits, pricing, and other non-functional requirements? Is the data meaningful for their use case?
- Communication: there are (too?) many stakeholders. Data fetching, owners, legal, scientists, integrators… You don’t just add a new data source.
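To make the data quality point a bit more concrete, here is a minimal sketch in Python of the kind of cleanup a downloaded CSV might need before it is usable. The column names (`city`, `population`), the sample rows, and the `clean_rows` helper are all made up for illustration; real pipelines involve far more than this.

```python
import csv
import io

# Hypothetical raw CSV, as it might arrive from a download.
RAW_CSV = """city,population
 Madrid ,3223000
madrid,3223000
Sevilla,n/a
,1000
Valencia,"791,413"
"""


def clean_rows(text):
    """Trim, normalize, and deduplicate rows; drop rows that cannot be repaired."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        # Normalization: strip stray whitespace and unify capitalization.
        city = (row["city"] or "").strip().title()
        # Transformation: remove thousands separators before parsing.
        population = (row["population"] or "").replace(",", "").strip()
        # Cleanup: skip rows with a missing name or a non-numeric value.
        if not city or not population.isdigit():
            continue
        # Deduplication: keep only the first occurrence of each city.
        if city in seen:
            continue
        seen.add(city)
        cleaned.append({"city": city, "population": int(population)})
    return cleaned


print(clean_rows(RAW_CSV))
```

Even in this toy case, five input rows become two usable ones, which is roughly why "just download a CSV" rarely works.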
It’s been my position for some time now, but for a number of very different reasons I feel like this Friday is a new beginning. The last few months have been very intense, with changes everywhere, new products, new proofs of concept, and experiments. I still don’t know what I’ll find in a couple of days, but I hope that the coming weeks lead to new projects, learning, and pushing the platform forward in a meaningful way. Let’s do it!