When Exposed Data Derails AI Deployments: Managing the Fallout

Key points from ZDNET

Artificial intelligence can enhance efficiency and make data retrieval easier.
Some technology leaders have paused AI deployments over privacy and security concerns.
Decades-old information has resurfaced through AI queries.

For business professionals, agentic and generative AI tools have unlocked access to valuable data and fresh insights. Yet recent findings suggest this rapid access might come with unintended consequences. Seasoned leaders in enterprise AI deployments recently warned professionals against rushing into AI adoption without proper preparation.

The challenges encountered by these experts forced temporary pauses in AI initiatives designed to increase workplace productivity, as leadership teams re-evaluated what internal information might be inadvertently exposed. Speaking on a panel at the recent Veeam conference held in New York City, these executives stressed that AI itself was not the root cause of these issues. Both organizations represented by the panelists had amassed enormous quantities of data over time, with one needing to establish an entirely new data governance framework.

Steve MacIntyre, senior vice president at Fidelity Investments, explained how his 400,000-person organization discovered long-buried information — stored on SharePoint sites or network-attached storage systems — suddenly becoming accessible through AI queries. “This wasn’t really an AI issue,” he noted. “It was more about how AI’s speed and efficiency made it possible to locate unexpected content almost instantly.”

Wim Geurden, chief architect for enterprise technology at EY, described his firm’s challenge: determining who actually owned data across its worldwide network of independent member firms — data that was now surfacing through the company’s AI platform. “When we rolled out enterprise-wide search, all sorts of content began appearing in unexpected places where people could see it,” he explained.

“EY Global doesn’t actually own any of this data. Each individual member firm controls its own data. That’s exactly where the first concerns came up. What is all this content? How many SharePoint sites existed? We had multiple petabytes of stored data, and it was completely unmanaged. There were no lifecycle management practices for these SharePoint sites, and roughly half of them had no designated owner. We had no record of when they had last been accessed.”
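The kind of audit Geurden describes can be pictured as a pass over a site inventory that flags anything with no owner or no access history. The following is a minimal sketch under stated assumptions, not EY's actual process: the `SiteRecord` fields, the `needs_review` helper, the one-year idle threshold, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SiteRecord:
    """One row of a hypothetical site inventory export."""
    url: str
    owner: Optional[str]           # None when no owner is designated
    last_accessed: Optional[date]  # None when no access record exists


def needs_review(site: SiteRecord, today: date, max_idle_days: int = 365) -> list:
    """Return the reasons a site should be reviewed before AI search indexes it."""
    reasons = []
    if not site.owner:
        reasons.append("no designated owner")
    if site.last_accessed is None:
        reasons.append("no access record")
    elif (today - site.last_accessed).days > max_idle_days:
        reasons.append("idle beyond threshold")
    return reasons


# Illustrative data only; these sites do not exist.
inventory = [
    SiteRecord("https://example.invalid/sites/research", "alice", date(2025, 1, 10)),
    SiteRecord("https://example.invalid/sites/old-decks", None, None),
    SiteRecord("https://example.invalid/sites/archive", "bob", date(2019, 6, 1)),
]
today = date(2025, 6, 1)

# Map each flagged site to the reasons it was flagged.
flagged = {}
for site in inventory:
    reasons = needs_review(site, today)
    if reasons:
        flagged[site.url] = reasons
```

A report like `flagged` would give a governance team a starting list of sites to assign owners to, archive, or exclude from enterprise search.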

At Fidelity, previously hidden information was emerging from a massive archive of PowerPoint presentations and PDF documents. “We maintain decades of research notes at Fidelity, many in PDF format,” MacIntyre said. “We distributed a few Copilot licenses, and within just two days, our legal team contacted me reporting an AI problem. A team member had run a search, and the AI returned every PowerPoint file that was stored on SharePoint from years past.”

AI functions as an “incredibly powerful search engine operating at remarkable speed,” MacIntyre added. “It scans everything it can access and presents results in a clear, organized way. Everyone assumed we had an AI problem, but what it really revealed was a data security gap. The issue became immediately clear: we possessed vast amounts of unstructured data we had largely ignored, and then large language models arrived, transforming all of that overlooked data into something extremely valuable.”

Setting up protective measures

At EY, once AI began unlocking access to its extensive data repositories, the immediate priority was to “determine who actually owns each piece of data,” Geurden said. “Our next step was to shut everything down completely.” Access to the Copilot tool was restricted exclusively to licensed users.

The process of verifying data ownership involved identifying and categorizing all data across the EY enterprise, Geurden continued. Categories included labels such as “confidential” or “financial services.”

AI itself became a useful tool for helping classify the company’s unstructured knowledge repositories, Geurden explained, pointing out the impracticality of relying on manual labeling given a 25% yearly staff turnover rate.

However, classification must go well beyond basic top-level labels. “First, we need to understand exactly what data existed when the AI processed it,” Geurden said. “We need a complete historical record, including all previous versions.” Then, “we must go far beyond simply marking information as confidential. We need geographic restrictions, location-based labels, business-unit classifications, and links to our contracts, because we handle enormous volumes of client data with specific rules about what we can and cannot do with it.”
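To make the idea of labels that go beyond "confidential" concrete, here is a minimal sketch of a multi-dimensional label and an access check built on it. It is an illustration under assumptions, not EY's schema: the `DataLabel` and `Requester` types, their field names, and the `may_access` rule are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataLabel:
    """Hypothetical label carrying more than a top-level sensitivity tag."""
    sensitivity: str                   # e.g. "confidential"
    regions: frozenset                 # regions where the data may be used
    business_unit: str                 # e.g. "financial services"
    contract_id: Optional[str] = None  # link to a governing client contract
    version: int = 1                   # document version the AI processed


@dataclass(frozen=True)
class Requester:
    """Hypothetical description of who is asking for the data."""
    region: str
    business_unit: str
    contracts: frozenset = frozenset()  # contracts the requester is cleared for


def may_access(label: DataLabel, who: Requester) -> bool:
    """Allow access only if region, business unit, and contract all match."""
    if who.region not in label.regions:
        return False
    if who.business_unit != label.business_unit:
        return False
    if label.contract_id is not None and label.contract_id not in who.contracts:
        return False
    return True


label = DataLabel("confidential", frozenset({"EU"}), "financial services",
                  contract_id="C-001")
eu_cleared = Requester("EU", "financial services", frozenset({"C-001"}))
us_cleared = Requester("US", "financial services", frozenset({"C-001"}))
eu_uncleared = Requester("EU", "financial services")
```

In this sketch, `eu_cleared` passes all three checks, while `us_cleared` fails on geography and `eu_uncleared` fails on the contract link, which is the kind of distinction a single "confidential" tag cannot express.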

All of this metadata must be formally documented in contracts, he added: “That part is relatively straightforward. The harder part is encoding it into a technological framework. Right now, that process remains extremely complex and labor-intensive.”

Effective governance is the cornerstone of success across all aspects of these AI deployments, the executives emphasized. “We need to understand exactly what is being used,” said MacIntyre.

“This brings up the whole issue of shadow AI, shadow IT, and similar concerns — and it all comes back to endpoint data. We need to ensure our asset inventories are accurate. Are they properly aligned with the registered and approved use cases? That way, at minimum, we can confirm that if someone is working on a specific project, they should be using Claude, because it’s connected to an approved project that was authorized for that particular tool.”
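The alignment check MacIntyre describes, comparing observed tool usage with registered use cases, could be sketched roughly as follows. This is an assumption-laden illustration, not Fidelity's tooling: the `APPROVED_USE_CASES` registry, the project names, and the `find_unapproved` helper are hypothetical.

```python
# Hypothetical registry: project name -> set of AI tools approved for it.
APPROVED_USE_CASES = {
    "project-a": {"claude"},
    "project-b": {"copilot"},
}


def find_unapproved(observed, approved):
    """Return (project, tool) pairs that no registered use case covers."""
    return [
        (project, tool)
        for project, tool in observed
        if tool not in approved.get(project, set())
    ]


# Illustrative usage records, e.g. gathered from endpoint or asset inventories.
observed_usage = [
    ("project-a", "claude"),   # approved
    ("project-a", "copilot"),  # tool not approved for this project
    ("project-c", "claude"),   # project not registered at all
]

unapproved = find_unapproved(observed_usage, APPROVED_USE_CASES)
```

Anything left in `unapproved` is a candidate for shadow AI follow-up: either the registry is out of date or the tool is being used outside an authorized project.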

After that, “we need to consider what the right secure environment is for these agents to operate in,” MacIntyre continued. “How should they interact with foundational models? What kind of architecture do we build to channel all that activity into a single location that provides us with the right level of visibility and monitoring, so we can verify that agents and AI-powered applications are functioning as intended, or identify when they’re not?”

Another significant challenge — perhaps the most complex one facing digital leaders today — is defining agent identity, MacIntyre said: “How do you assign an identity to an AI agent? At that point, it essentially becomes an employee. But what happens if my agent only exists for a few seconds? It’s a genuinely fascinating problem, and I’m not convinced anyone has truly solved it yet.”
