As a data engineer, I’ve always thought my role was to support analytics and machine learning use cases. But recently, I discovered that some data teams serve customer-facing data directly. This got me thinking: should data engineers really own online customer-facing data?
Analytics and machine learning work usually happens behind the scenes and can tolerate some error. But when I joined my current company, I found that another data team in my department serves customer-facing data directly. They write SQL, build pipelines on Airflow, and send the results to Kafka for display in the customer-facing app.
The use cases involve rewards distribution, where data correctness is critical. If a reward is delayed or wrong, customers will likely complain. This made me wonder: shouldn’t this be built with conventional software engineering, for example a backend service that calls APIs and performs the aggregation itself, to ensure higher reliability and correctness?
The Case for Data Engineers Owning Customer-Facing Data
There are real advantages to data engineers owning customer-facing data. We already know the data and the pipelines that produce it, so we are well placed to tune them for performance and reliability. We can also build in data quality and correctness checks, which are critical for customer-facing applications.
The Case Against Data Engineers Owning Customer-Facing Data
On the other hand, customer-facing data demands a level of reliability and correctness that pipelines built for analytics, where some error is tolerable, are rarely designed to meet. Data engineers may also have less experience with software development practices, which could lead to suboptimal solutions.
The Middle Ground
Perhaps the best approach is a hybrid model, where data engineers work closely with software developers to design and implement customer-facing data solutions. This way, data engineers can focus on ensuring data quality and correctness, while software developers can focus on building reliable and scalable applications.
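To make the split of responsibilities more concrete, here is a minimal sketch of what the data engineering side of a hybrid model might look like: a validation gate that checks each reward record before it is handed to whatever publishes it. Everything in it is an assumption for illustration. The `RewardRecord` fields, the one-hour freshness limit, and the `publish` callable (a stand-in for a real Kafka producer owned by the software team) are hypothetical and not taken from the team described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class RewardRecord:
    # Hypothetical shape of a reward record; real fields will differ.
    customer_id: str
    points: int
    computed_at: datetime  # timezone-aware timestamp of the pipeline run


def validate(record: RewardRecord, now: datetime,
             max_age: timedelta = timedelta(hours=1)) -> List[str]:
    """Return a list of problems; an empty list means the record is publishable."""
    problems = []
    if not record.customer_id:
        problems.append("missing customer_id")
    if record.points < 0:
        problems.append("negative points")
    if now - record.computed_at > max_age:
        problems.append("stale record")
    return problems


def publish_valid(
    records: Iterable[RewardRecord],
    publish: Callable[[RewardRecord], None],
    now: Optional[datetime] = None,
) -> List[Tuple[RewardRecord, List[str]]]:
    """Publish records that pass validation and return the rejected ones.

    `publish` is an assumed stand-in for the real producer call.
    """
    now = now or datetime.now(timezone.utc)
    rejected = []
    for record in records:
        problems = validate(record, now)
        if problems:
            # Hold back bad records instead of showing them to customers.
            rejected.append((record, problems))
        else:
            publish(record)
    return rejected
```

Calling `publish_valid(batch, producer_send)` would forward only the clean records and return the rest with their reasons, so the pipeline can alert on them rather than let a wrong or stale reward reach the app. Passing the publisher in as a plain callable keeps the quality checks testable without a running Kafka cluster, and leaves delivery guarantees to the team that owns the application.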
What do you think? Should data engineers own online customer-facing data, or is this a job for software developers? Share your thoughts in the comments!