
Why 85% of AI and ML Projects Fail — How to Ensure Yours Succeeds

By Sal Cardozo, Senior Vice President, Data Analytics & AI

Traditional AI has been around for decades, but the recent launch of ChatGPT has put the spotlight back on it. From out-of-the-box copilot experiences to custom generative AI solutions built within Azure AI Studio, every organization is looking to integrate AI into its operations, and companies have invested heavily in AI and machine learning (ML). Yet, despite the hype and record levels of investment, most AI and ML projects struggle to get off the ground.

Unrealistic expectations are partly to blame: organizations often view AI as a magic bullet that can solve everything, even when AI may not be the best solution for every business challenge. More often than not, the failure lies not with the technology itself but with mistaken expectations of what AI can (or cannot) do.

Here are several other factors that impede AI projects:

6 Reasons Why AI/ML Initiatives Fail

  • Data Access

    The more data AI can ingest, the higher the accuracy. But insufficient, poor-quality data often keeps data scientists from developing robust AI solutions. Data scientists require full access to modeling data, yet a lack of governance around deployed workloads and misconfigured access policies create significant data security risks and inefficient use of resources. As a result, data scientists have to rely on smaller, less usable snapshots of the data instead of the entire data lake, which hurts model accuracy.

Next is the difficulty of obtaining sufficiently large data sets for training.

For example, let’s say you’re trying to predict the daily high and low temperatures for a given day in the United States. You provide data collected from all over the country but fail to include the day of the year or the latitude and longitude where each sample was collected. This data might support some general predictions, but the range would be too wide to be of much value, because any given reading may have been taken in the dead of winter or the peak of summer, in the far south or the far north.

The best a data scientist can do is give you the average high and low temperatures for the year across the entire country. Additional data such as humidity, precipitation, and wind speed would help, but it still would not be enough to pinpoint the exact location. The precise timing of sunrise and sunset might let the model infer the time of year, and with it the season. But without enough information, whether explicitly given or inferable, the model cannot reach useful accuracy for a given day and location.

AI projects fail when the inputs contain insufficient information to produce the desired outputs with a useful level of predictive accuracy. Even where sufficient data exists, it might not be available in a format that is quick to ingest. The lack of usable data has long been a problem for AI and ML projects.
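
As an illustration, a toy sketch like the one below can make that gap visible: it trains the same model twice on synthetic temperature data, once with only latitude and once with latitude plus day of year. The data, column names, and numbers are all made up for this example; the only point is the difference in error between the two feature sets.

    # Illustrative only: synthetic data showing how adding a day-of-year feature
    # improves a temperature model. All columns and values are fabricated.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 5_000
    day = rng.integers(1, 366, n)        # day of year
    lat = rng.uniform(25, 49, n)         # rough latitude span of the continental US
    # Synthetic "true" daily high: seasonal cycle + latitude effect + noise
    high_temp = (60 + 30 * np.sin(2 * np.pi * (day - 100) / 365)
                 - 1.2 * (lat - 37) + rng.normal(0, 5, n))
    df = pd.DataFrame({"day_of_year": day, "latitude": lat, "high_f": high_temp})

    for features in (["latitude"], ["day_of_year", "latitude"]):
        X_train, X_test, y_train, y_test = train_test_split(
            df[features], df["high_f"], random_state=0)
        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(X_train, y_train)
        mae = mean_absolute_error(y_test, model.predict(X_test))
        print(f"features={features}: mean absolute error = {mae:.1f} °F")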

To save time and costs in the long term, companies must first ask: do we have the data to begin with? Do we have the necessary permissions? Is it accessible? Is it in the format we want? Is it usable?
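
A quick, scripted readiness check can answer several of these questions before any modeling starts. The sketch below is a minimal example; the file path and expected column names are placeholders for whatever your own dataset looks like.

    # Minimal data-readiness check. The path and expected columns are placeholders.
    import pandas as pd

    EXPECTED_COLUMNS = ["customer_id", "signup_date", "region", "monthly_spend"]

    def readiness_report(path: str) -> None:
        df = pd.read_csv(path)
        if df.empty:
            print("Dataset is empty -- nothing to model on.")
            return

        # Is it in the format we want?
        missing_cols = [c for c in EXPECTED_COLUMNS if c not in df.columns]
        if missing_cols:
            print(f"Missing expected columns: {missing_cols}")

        # Is it usable? Report the share of missing values per column.
        missing_share = df.isna().mean().sort_values(ascending=False)
        print("Share of missing values per column:")
        print(missing_share[missing_share > 0].to_string())

        print(f"Rows: {len(df)}, columns: {len(df.columns)}")

    readiness_report("customer_snapshot.csv")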

  • Model Development and Deployment

    Several factors go into turning a pilot into a scalable deployment, and a whole suite of skills and resources is needed to make it happen. You may need to streamline code, bring in new technologies, push your AI or ML workloads to the edge rather than relying on a single data repository, hire new teams, and set up a data-labeling operation. You also have to iterate frequently and fail fast. To do that, you need an MLOps infrastructure, which the major cloud service providers now offer as managed platforms.
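
One concrete piece of that infrastructure is experiment tracking, so every fail-fast iteration leaves a comparable record. As a rough sketch, the loop below uses the open-source MLflow tracking API with a toy scikit-learn dataset; the experiment name and hyperparameter values are placeholders.

    # Minimal experiment-tracking loop with MLflow; dataset and settings are illustrative.
    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_diabetes
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split

    X, y = load_diabetes(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    mlflow.set_experiment("fail-fast-iterations")   # experiment name is a placeholder

    for learning_rate in (0.01, 0.05, 0.1):
        with mlflow.start_run():
            model = GradientBoostingRegressor(learning_rate=learning_rate, random_state=0)
            model.fit(X_train, y_train)
            mae = mean_absolute_error(y_test, model.predict(X_test))

            # Log each run so iterations can be compared and the best one promoted.
            mlflow.log_param("learning_rate", learning_rate)
            mlflow.log_metric("mae", mae)
            mlflow.sklearn.log_model(model, "model")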

  • Service Complexity

    Building a modern AI/ML platform from the ground up means acquiring the right technologies for production, integrating them into your stack, and testing them to ensure they work well together and at scale, all of which is time-consuming and resource-intensive. The result is long wait times for data science teams, who must also build and manage these complex environments themselves.

But when you use a platform like Microsoft’s Azure Machine Learning, delivered as a managed service with built-in AI infrastructure, you accelerate time to value. Industry-leading machine learning operations (MLOps), open-source interoperability, and integrated tools let you build, train, and deploy models faster and with greater confidence. Azure ML supports many types of machine learning projects, from classical ML to unsupervised learning and deep learning, and includes the tools you need to move from development to deployment.
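
As a rough illustration of what that looks like in practice, the sketch below submits a training script as a job with the Azure Machine Learning Python SDK (v2). The subscription, workspace, compute cluster, and environment names are placeholders for resources in your own workspace, and the exact curated-environment name may differ.

    # Submit a training script as an Azure ML job (SDK v2). All resource names are placeholders.
    from azure.ai.ml import MLClient, command
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        credential=DefaultAzureCredential(),
        subscription_id="<subscription-id>",
        resource_group_name="<resource-group>",
        workspace_name="<workspace-name>",
    )

    # Package the training code, environment, and compute target into one job definition.
    job = command(
        code="./src",                                  # folder containing train.py
        command="python train.py --epochs 10",
        environment="azureml:AzureML-sklearn-1.5@latest",  # curated environment; name may differ
        compute="cpu-cluster",                         # an existing compute cluster
        display_name="train-forecast-model",
    )

    returned_job = ml_client.jobs.create_or_update(job)
    print(f"Submitted job: {returned_job.name}")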

  • Data Governance

    The foundation of any AI is data, and a clear data strategy will help you leverage that data for AI. However, because AI technology has developed so quickly in the past few years, many organizations lack robust data governance frameworks to ensure their data is clean, well organized, and secure.

Here are some major challenges that impede data governance and security:

  • Lack of visibility: Enterprise data today often lives in unstructured formats, scattered across emails, cloud applications, and databases. Locating and organizing this data is arduous. Organizations must also identify sensitive information within this vast ocean of data to ensure it’s adequately protected.
  • Data quality and lineage concerns: 45% of the training data behind Google Bard (Google’s AI chatbot) reportedly comes from unverified sources, which calls AI’s veracity into question. Tracking the origin and lifecycle of data for verification becomes an additional challenge in data governance.
  • Data classification: Another challenge is categorizing data correctly to improve governance and security so you can implement better controls. For data governance to be successful, you must separate the signal from the noise surrounding data. However, the sheer volume of unstructured data can make that feel nearly impossible (see the sketch after this list).
  • Data mapping: Manual processes, complexity, and data silos have always complicated data mapping, and the rapid evolution of AI and ML has only added to that complexity.
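
To make data classification concrete, here is one lightweight approach: scan a table's text columns for values that match common patterns for sensitive identifiers. The patterns, column names, and sample data below are illustrative only, and a real deployment would lean on a governance tool such as Microsoft Purview rather than hand-rolled rules.

    # Toy sensitive-data scan: flag columns whose values match common PII patterns.
    # The patterns and sample data are illustrative placeholders.
    import re
    import pandas as pd

    PII_PATTERNS = {
        "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
        "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
        "phone": re.compile(r"^\+?\d[\d\s().-]{8,}$"),
    }

    def classify_columns(df: pd.DataFrame, sample_size: int = 100) -> dict:
        """Return a mapping of column name -> detected sensitive-data categories."""
        findings = {}
        for col in df.select_dtypes(include="object").columns:
            sample = df[col].dropna().astype(str).head(sample_size)
            matched = {
                label for label, pattern in PII_PATTERNS.items()
                if sample.map(lambda v: bool(pattern.match(v))).mean() > 0.5
            }
            if matched:
                findings[col] = sorted(matched)
        return findings

    customers = pd.DataFrame({
        "contact": ["ana@example.com", "luis@example.com"],
        "notes": ["prefers email", "call after 5pm"],
    })
    print(classify_columns(customers))   # {'contact': ['email']}
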
  • Rising Costs

    When organizations set up pilot projects in tightly controlled environments, it’s easy to underestimate the costs the project will incur once it’s deployed. For example, if AI is deployed via the cloud, the cost of every API call adds up, and usage is hard to predict. There are also costs associated with transitioning to the new technology, especially if the transition takes longer than expected. And then there is the size of the data sets: you must pay for storage and for every call against it, and for applications with storage replicated in multiple regions worldwide, you must pay for backups too.
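
A rough cost model built before deployment makes these line items visible early. The sketch below adds up API calls, storage, and worldwide backup replicas for a month; every price and volume in it is a made-up placeholder, not a figure from any provider's price list.

    # Back-of-the-envelope monthly cost estimate. All numbers are hypothetical placeholders.
    API_CALLS_PER_DAY = 50_000
    COST_PER_1K_CALLS = 0.40       # USD, assumed
    STORAGE_GB = 2_000
    COST_PER_GB_MONTH = 0.02       # USD, assumed
    BACKUP_REGIONS = 3             # worldwide replicas also incur storage cost

    api_cost = API_CALLS_PER_DAY * 30 / 1_000 * COST_PER_1K_CALLS
    storage_cost = STORAGE_GB * COST_PER_GB_MONTH * (1 + BACKUP_REGIONS)

    print(f"Estimated API cost per month:     ${api_cost:,.2f}")
    print(f"Estimated storage cost per month: ${storage_cost:,.2f}")
    print(f"Estimated total per month:        ${api_cost + storage_cost:,.2f}")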

  • AI Bias

    As organizations increasingly rely on AI systems, it is crucial to understand the ethical implications and the various forms of bias that can inadvertently creep in when integrating AI. We have to understand how the processes used to collect training data can influence AI models. Unintended biases can arise when the training dataset fails to accurately reflect the broader population the model is deployed to serve. For example, facial recognition models trained on a dataset that mirrors the demographics of AI developers may struggle when applied to individuals with more varied characteristics.
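
One way to catch this kind of sample bias early is to compare the demographic mix of the training data against the population the model is meant to serve. The sketch below does that with invented group labels and percentages; the reference distribution is an assumption you would replace with real figures.

    # Toy representation check: compare training-data group shares to a reference
    # population. Group names and percentages are illustrative placeholders.
    import pandas as pd

    train = pd.DataFrame({"age_group": ["18-29"] * 70 + ["30-49"] * 25 + ["50+"] * 5})

    # Assumed distribution of the population the model will actually serve.
    reference = pd.Series({"18-29": 0.30, "30-49": 0.40, "50+": 0.30})

    train_share = train["age_group"].value_counts(normalize=True)
    report = pd.DataFrame({"train_share": train_share, "reference_share": reference})
    report["ratio"] = report["train_share"] / report["reference_share"]

    # A ratio well below 1 means the group is underrepresented in the training data.
    print(report.sort_values("ratio"))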

These ethical considerations include issues like:

  • Transparency: Can users understand how the AI makes decisions?
  • Accountability: Is there a clear line of responsibility if something goes wrong?
  • Fairness: Does the AI treat all users and affected parties equitably?
  • Privacy: Are individuals’ data and personal information protected?

AI systems learn from data, and if that data reflects historical inequalities or biases, the AI can perpetuate or even exacerbate them. Some types of bias include:

  • Sample Bias: Occurs when the data used to train the AI does not accurately represent the broader population.
  • Prejudice Bias: Arises when the training data includes prejudicial assumptions, leading to discriminatory outcomes.
  • Measurement Bias: Occurs when the data collected does not accurately measure the real-world variables it’s supposed to represent.

Why It Matters:

  1. Trust and Reputation: Ethical missteps or biased outcomes can significantly damage your organization’s trust and reputation. Users and customers are increasingly aware of and concerned about these issues.
  2. Legal and Regulatory Compliance: Data protection, privacy, and AI regulations are becoming more stringent. Ethical AI practices are not just moral but also legal necessities.
  3. Effectiveness and Reliability: Biased or unethical AI systems can produce flawed results, leading to poor decisions and outcomes. Ensuring ethics and minimizing bias are critical to the effectiveness of your AI initiatives.

Blend “First Mile” and “Last Mile” Efforts

Companies will need to consider “first mile” efforts, that is, how to acquire and organize data, as well as the “last mile,” or how to integrate the output of AI models into frontline workflows. AI models are only as effective as a company’s ability to execute against them and deliver value. Understanding which use cases have the most potential to drive value, and which AI and analytical techniques should be deployed, is paramount. AI’s success is driven less by the technology itself than by a company’s skills, capabilities, and data. Future success is not just about what the technology can do; it’s about what you can do with it.

Next Steps

Embrace the AI journey with confidence, responsibility, and a forward-thinking mindset, and watch as it transforms your organization in ways you can only begin to imagine.

Ready to dive into AI? With over 25 years of experience in Microsoft technologies, we can support you from the ground up to make AI and ML an integral part of your advanced analytics strategy.

Contact us today to get started.