The Mirage of AI Without Data: Cloudera Reveals 82% of Companies Lack Full Governance

04/Jun/2026
by ForgeNEX

In the race to lead digital transformation, many companies have put the cart before the horse. Artificial intelligence (AI) has become the centerpiece of corporate strategies, but a recent study by Cloudera, "The Data Readiness Index: Understanding the Foundations for Successful AI", reveals an uncomfortable truth: only 18% of organizations have their data fully governed to drive AI. This means the remaining 82% are building on quicksand.


Juan Carlos Sánchez de la Fuente, regional vice president for Spain and Portugal at Cloudera, puts it bluntly: “The speed at which organizations are adopting artificial intelligence far exceeds the speed at which they are modernizing their data foundations.” In his view, many companies have placed AI at the center of their strategy but continue to operate with data distributed across multiple clouds, data centers, SaaS applications, and legacy systems. “And that is building on unstable ground: the quality of the models does not matter if the data feeding them is incomplete, fragmented, or cannot be audited.”

The Gap Between Perception and Reality

The study reveals a paradox: 85% of respondents claim to have a solid data strategy, but 79% acknowledge that their initiatives are limited by difficulties in access, preparation, and governance in distributed environments. According to Sánchez de la Fuente, this gap is one of the most relevant findings. “What we observe is that many organizations have advanced in defining the vision, but are still working on execution. Having a data plan approved by the board is very different from having data that is actually accessible, clean, integrated, and governed in production.”

In Spain, this gap has an additional dimension: the weight of technological legacy, especially in banking and the public sector. “These are sectors with complex architectures built over decades. Modernizing them requires time, investment, and sustained organizational will,” he points out. The result is that the gap between perception and reality delays AI with real impact: organizations believe they are ready when they are not.

Critical Sectors: Finance, Healthcare, and Public Administration

The report finds that only 9% of financial sector organizations and 13% of healthcare organizations have full data governance, while in public administration the figure reaches 20%. For Sánchez de la Fuente, this entails three types of risk: regulatory, operational, and reputational. “In environments like finance, regulations such as GDPR, DORA, and the European AI Act require levels of traceability and control that partial governance cannot guarantee. Sanctions are not hypothetical; they are a direct and growing consequence.”

On the operational side, decisions made by AI models in credit scoring, clinical triage, or fraud detection are only as reliable as the data supporting them. “Poorly governed data generates recommendations that are difficult to explain and decisions with real consequences for real people,” he warns. And the reputational damage from a breach or an erroneous automated decision is hard to repair.


Common Mistakes in AI Projects

One of the most frequent mistakes is investing in AI before data. “Organizations prioritize the model (which LLM to use, which cloud provider, which use case to present to the board) and leave for later the fundamental question: what data will we feed all this with? When that question comes late, the project arrives incomplete,” he notes.

Most AI projects do not fail due to algorithm problems, but because they work on incomplete, inconsistent, or hard-to-locate data. “AI amplifies the value of good data, but it also amplifies problems when the foundation is not ready.” Therefore, AI development must advance at the same pace as the data strategy. Without that alignment, organizations generate very high expectations and very limited results.

Priorities to Turn AI into a Competitive Advantage

Cloudera argues that the value of AI depends directly on data readiness. “The competitive advantage will not come from having access to more AI models than the competition, but from having better data than the competition,” says Sánchez de la Fuente. To achieve this, he proposes three priorities for the next twelve months:

1. Conduct a Real Inventory

Before any AI project, organizations need to know exactly what data they have, where it is, who can access it, and what its quality status is. “Without that inventory with full traceability, everything else is building on sand.”

2. Govern Before Scaling

It is not necessary to have 100% of data governed on day one, but a clear roadmap and a platform that allows governance to be extended progressively are essential. “Organizations that advance here in 2026 will have a structural advantage: their models will be more reliable, more auditable, and aligned with European regulation that will continue to tighten.”

3. Bet on Data Sovereignty

In an environment where AI is rapidly democratizing, proprietary data is the only asset that competitors cannot replicate. “Companies that manage to control, integrate, and activate 100% of their information will be the ones that transform AI into a real competitive advantage and not just another pilot project.”
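
As a rough illustration of the first two priorities, the sketch below models an inventory entry with the four questions the article lists (what data exists, where it is, who can access it, and what its quality status is) and computes the share of datasets already governed. It is a minimal sketch, not anything Cloudera prescribes: the class, field, and dataset names are all hypothetical, and a real inventory would normally live in a data catalog rather than in a Python list.

```python
from dataclasses import dataclass, field
from enum import Enum


class Quality(Enum):
    """Coarse quality status; the three levels are an assumption."""
    UNKNOWN = "unknown"
    POOR = "poor"
    GOOD = "good"


@dataclass
class DatasetEntry:
    """One row of a hypothetical data inventory."""
    name: str                      # what the data is
    location: str                  # where it lives
    readers: list[str] = field(default_factory=list)  # who can access it
    quality: Quality = Quality.UNKNOWN                # its quality status
    governed: bool = False         # already under governance policies?


def governed_share(inventory: list[DatasetEntry]) -> float:
    """Fraction of inventoried datasets that are governed (0.0 if empty)."""
    if not inventory:
        return 0.0
    return sum(entry.governed for entry in inventory) / len(inventory)


def ungoverned_backlog(inventory: list[DatasetEntry]) -> list[str]:
    """Names of datasets still to be brought under governance, in order."""
    return [entry.name for entry in inventory if not entry.governed]


# Hypothetical example data spread across the environments the article names.
inventory = [
    DatasetEntry("customer_accounts", "on-prem data center",
                 ["risk_team"], Quality.GOOD, governed=True),
    DatasetEntry("claims_history", "legacy system", ["claims_team"]),
    DatasetEntry("web_clickstream", "public cloud",
                 ["marketing"], Quality.POOR),
    DatasetEntry("hr_records", "SaaS application", ["hr"]),
]

if __name__ == "__main__":
    print(f"Governed share: {governed_share(inventory):.0%}")
    print("Backlog:", ungoverned_backlog(inventory))
```

Tracking the governed share over time is one simple way to follow a "govern before scaling" roadmap: the figure does not need to be 100% on day one, but it should rise as governance is extended.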


These priorities are only sustainable if they rest on an architecture that unifies access, governance, and the ability to run inference where the data lives, without having to move it or fragment control. In this sense, agentic AI and operations platforms become the most important layer of the company, as we already analyzed in a previous article.

Trust in AI will not depend solely on the quality of the models, but on the quality and control of the data that feeds them. As Sánchez de la Fuente aptly summarizes: “Models are increasingly accessible to everyone. The ability to feed them with proprietary, reliable, governed, and contextualized data cannot be bought in a marketplace.”

To delve deeper into how to secure AI workloads, we recommend our article on how to secure Kubernetes in the era of AI, as well as the step-by-step technical guide to implementing Generative AI in workflows.

Original source: ComputerWorld. Analysis and adaptation by ForgeNEX.
