Why the majority of data products fail!
A quick overview of the three main obstacles to developing a data product that makes a difference
Data products hold immense potential to transform businesses, but the harsh reality is that up to 80-85% of data science projects never make it into production, and of those that do, only a tiny fraction (around 8%) creates meaningful value for their organizations (Thomas, 2020). This staggering failure rate stems from three major issues: waste, misalignment with business goals, and challenges in scaling data science and analytics outputs (Atwal, 2020). Let’s dive into these challenges, explore their root causes, and discuss how organizations can overcome them.
1. The Problem of Waste during the Development of Data Products
In the context of data science, waste refers to non-value-adding activities that consume significant resources. Data scientists, analysts, and engineers face numerous barriers in their daily workflows, such as:
Poorly described data: Incomplete or unclear documentation about available datasets.
Access restrictions: Limited or slow access to necessary databases due to authentication hurdles and organizational silos.
Lack of standardization: Inconsistent processes and frameworks leading to inefficiencies.
Repetitive manual work: Tasks that could be automated, resulting in lower scalability.
Workarounds prone to failure: Temporary fixes often collapse under pressure.
These inefficiencies often lead to frustrated teams and subpar results. Worse still, hidden debt—legacy systems and outdated practices—further delays progress by making testing and deployment cumbersome.
2. Misalignment with Business Objectives
Even the most advanced machine learning model is useless if it doesn’t address a critical business need. Miscommunication between data scientists and business stakeholders is a common problem. Without a shared understanding of goals, models may be developed that fail to answer the right questions or provide actionable insights. In some cases, these models are never deployed; in others, they’re implemented but never used.
An agile approach, such as building Minimum Viable Products (MVPs), can help bridge this gap. MVPs allow businesses to test ideas early and adapt quickly to changing requirements, ensuring the data product delivers real value.
3. Challenges in Productionizing Data Products
A key determinant of success is the speed and ease of deployment for data products. Companies that can rapidly test and refine models—using methods like A/B testing—gain a competitive edge. However, lengthy lead times for productionizing data products create bottlenecks.
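To make the A/B testing idea concrete, here is a minimal sketch of how two model variants could be compared on a conversion metric. It uses a standard two-proportion z-test and only the Python standard library. The function name and the example counts are hypothetical and chosen purely for illustration; they are not taken from the cited sources.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of variants A and B.

    Illustrative helper (hypothetical name): returns the z statistic and the
    two-sided p-value of a pooled two-proportion z-test.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one observation")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled conversion rate under the null hypothesis of "no difference".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("conversion rate is 0% or 100% in both groups")

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up example: the current model (A) versus a candidate model (B).
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be due to chance alone. In practice, the sample size and the decision threshold would be agreed with business stakeholders before the test starts.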
Effective productionizing requires:
Robust infrastructure: Agile systems capable of handling small, iterative changes.
Real-time data: Fresh data is far more valuable than outdated or historical data, as it allows for proactive decision-making.
Reliable analytics: High-quality, actionable insights are critical to building trust with stakeholders.
Root Causes: Why These Challenges Persist
At the heart of these issues are two fundamental problems (Atwal, 2020):
Outdated Information Architectures
Many organizations still operate with 20th-century systems, relying on siloed databases, rigid security measures, and manual processes. These setups are ill-suited for modern data analytics, where interoperability, scalability, and speed are paramount.
Knowledge Gaps between Business and Data Professionals and Weak Organizational Support
Data scientists often lack the domain knowledge needed to align their work with business needs. Similarly, IT departments frequently fail to understand the tools and data access requirements of data teams. This disconnect creates delays, friction, and frustration. Furthermore, weak development skills among data scientists can lead to poor-quality, non-reproducible models that are difficult to productionize.
What is your experience in developing data products? Where do you see the main challenges?
Sources:
Atwal, H. (2020). Practical DataOps. Apress. https://doi.org/10.1007/978-1-4842-5104-1
Thomas. (2020). 10 reasons why data science projects fail. https://fastdatascience.com/why-do-data-science-projects-fail/