The National Institutes of Health is turning to artificial intelligence to pull valuable insights from enormous volumes of health data — a move that could help the agency carry out research faster and provide new tools to assist healthcare providers.
Susan Gregurick, NIH’s associate director for data science, said during Federal News Network’s AI & Data Exchange 2026 that progress in AI is starting to reveal insights from data spread across separate systems.
“I think there’s just an incredible amount of excitement here,” Gregurick said. “I’ve noticed some truly promising trends in building AI technologies that can pull out hard-to-find data from clinical records and doctors’ notes.”
AI to accelerate public health responses
Central to NIH’s AI initiative is an effort to tear down the data silos that have slowed researchers’ responses to new public health threats.
NIH is working with the Energy Department and the National Cancer Institute to use AI to extract information from pathology reports. Through this effort, NIH is helping researchers gain a clearer picture of how COVID infections relate to cancer progression.
“The challenges we face really revolve around obtaining real-time health data,” Gregurick said. “It was extremely difficult to grasp the connection between people who may have had COVID and the impact it would have on cancer progression.”
“If you’re a cancer patient, your diagnosis comes from a pathology report that sits in a completely separate system from your electronic health record data. That program effectively pulls the key details from pathology reports and feeds them into our system for understanding cancer.”
Digging through massive datasets
Beyond pulling insights from individual records, NIH is also dealing with the enormous scale of its data environment. NIH holds roughly 440 petabytes of data spread across its three cloud service providers.
“This is truly a massive amount of data — getting a handle on all of it, discovering it, locating it, and using it in analytics. That’s something I’m deeply committed to and dedicating a lot of time to,” Gregurick said. “There’s an opportunity to leverage real-world data, such as wearables and survey data, to help us understand health outcomes. But challenges remain around the structure of that data. Simply converting it into standard formats from different devices is going to require considerable effort.”
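To make the format problem concrete, here is a minimal Python sketch of mapping heart-rate readings from two wearable devices onto one shared schema. The device layouts, field names, and units are hypothetical illustrations, not formats published by NIH or any vendor.

```python
from datetime import datetime, timezone

# Hypothetical raw readings; field names and units are assumptions for illustration.
DEVICE_A = {"ts": "2026-01-15T08:30:00+00:00", "hr_bpm": 72, "user": "p001"}
DEVICE_B = {
    "epoch_ms": 1768465800000,
    "heart_rate": {"value": 1.2, "unit": "Hz"},
    "pid": "p001",
}


def normalize_a(record):
    """Device A reports ISO 8601 timestamps and beats per minute."""
    return {
        "participant_id": record["user"],
        "timestamp": datetime.fromisoformat(record["ts"]).astimezone(timezone.utc),
        "heart_rate_bpm": float(record["hr_bpm"]),
    }


def normalize_b(record):
    """Device B reports epoch milliseconds and heart rate in hertz."""
    hr = record["heart_rate"]
    # Convert beats per second to beats per minute when the unit is hertz.
    bpm = hr["value"] * 60 if hr["unit"] == "Hz" else hr["value"]
    return {
        "participant_id": record["pid"],
        "timestamp": datetime.fromtimestamp(record["epoch_ms"] / 1000, tz=timezone.utc),
        "heart_rate_bpm": float(bpm),
    }


# Both readings now share the same keys, time zone, and units.
harmonized = [normalize_a(DEVICE_A), normalize_b(DEVICE_B)]
```

Even in this toy case, each device needs its own mapping for identifiers, timestamps, and units, which hints at why harmonizing data from many devices takes considerable effort.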
One effort to address those challenges is the agency’s Bridge to Artificial Intelligence (Bridge2AI) program, which is focused on building high-quality, AI-ready datasets.
“The entire purpose of this program is to produce AI-ready, high-quality, gold-standard data that can then be used with new AI models,” she said.
Through Bridge2AI, NIH is creating new flagship datasets and best practices for machine learning analysis. These datasets were gathered and processed with AI modeling in mind. One of the first Bridge2AI datasets addresses the prevalence of Type 2 diabetes in American Indian and Alaska Native populations.
“We know there’s a higher prevalence of Type 2 diabetes in those populations, and so building this data helps researchers understand why these communities face greater risk and how we can better treat them,” Gregurick said. “This is one of the advantages of collaborating across NIH — generating that data and then making it accessible to researchers working with populations such as American Indian, Alaska Native, or people from underserved communities across the country.”
Using AI to streamline back-office operations at NIH
AI is also taking on a larger role in NIH’s internal operations, especially in handling the roughly 20,000 grant applications the agency receives each year.
“We’re using AI to do things like take incoming grant applications and sort them into categories that match different study sections, so we don’t have to manually review the grants right away,” Gregurick said. “We can use a large language model to group these into the appropriate study section. And AI and large language models can also help us spot any potential conflicts that reviewers might have with a grant, so they aren’t assigned a conflicting application, and assist us in choosing reviewers as well.”
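As a rough illustration of the conflict screening Gregurick describes, the Python sketch below flags reviewers who share an institution with an applicant or have co-authored with them. It is a simple rule-based stand-in, not the language-model approach NIH uses, and every name, field, and rule in it is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Hypothetical record for an applicant or a reviewer."""
    name: str
    institution: str
    coauthors: set = field(default_factory=set)


def conflict_reasons(reviewer, applicant):
    """Return the reasons, if any, that a reviewer conflicts with an applicant."""
    reasons = []
    if reviewer.institution == applicant.institution:
        reasons.append("same institution")
    if applicant.name in reviewer.coauthors or reviewer.name in applicant.coauthors:
        reasons.append("co-authors")
    return reasons


def eligible_reviewers(reviewers, applicant):
    """Keep only reviewers with no detected conflict."""
    return [r for r in reviewers if not conflict_reasons(r, applicant)]


# Illustrative, made-up people.
applicant = Person("A. Rivera", "State University", {"B. Chen"})
reviewers = [
    Person("B. Chen", "Coastal Institute"),    # co-author of the applicant
    Person("C. Okafor", "State University"),   # same institution
    Person("D. Novak", "Mountain College"),    # no detected conflict
]
cleared = [r.name for r in eligible_reviewers(reviewers, applicant)]
```

A real screening system would draw on far richer signals, such as publication histories and funding ties, which is where a language model reading unstructured text could plausibly help.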
NIH is also investing in the next generation of AI talent through programs like AIM-AHEAD, a network of more than 10,000 AI and healthcare professionals.
“The entire goal of AIM-AHEAD is to make sure that younger researchers, new or experienced clinicians, and public health workers have access to and can receive training in artificial intelligence,” Gregurick said.
Collaborating with cloud providers
Industry partnerships are another key pillar of NIH’s AI strategy. Cloud providers have helped the agency expand its data infrastructure and offer training environments for researchers.
“The partnerships with AWS, Google, and Microsoft Azure have been outstanding for us in terms of moving all of our data into the cloud,” Gregurick said. “Each of the cloud service providers has developed a sandbox where students or investigators can come and test and compare different algorithms, access different datasets, or simply learn how to use the cloud to run much higher-throughput analytics pipelines.”
AI is also making new kinds of real-time analytics and visualization possible, which she said could reshape how health systems respond to emerging trends.
“If I wanted to know how many patients in the past 90 days have been diagnosed with lupus across our health network, having an AI-powered dashboard with visualizations and analytics that can actually help us understand and retrieve that data in real time and display it for us is a game changer when it comes to spotting broader trends,” Gregurick said.
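The question Gregurick poses reduces to a filtered count over diagnosis records. The Python sketch below shows that core query under simplifying assumptions: the record layout, field names, and sample data are hypothetical, and a real dashboard would query live clinical systems rather than an in-memory list.

```python
from datetime import date, timedelta

# Hypothetical diagnosis records; the fields and values are illustrative only.
RECORDS = [
    {"patient_id": "p1", "condition": "lupus", "diagnosed_on": date(2026, 3, 1)},
    {"patient_id": "p2", "condition": "lupus", "diagnosed_on": date(2025, 11, 15)},
    {"patient_id": "p3", "condition": "lupus", "diagnosed_on": date(2026, 1, 10)},
    {"patient_id": "p1", "condition": "lupus", "diagnosed_on": date(2026, 3, 20)},
    {"patient_id": "p4", "condition": "asthma", "diagnosed_on": date(2026, 2, 2)},
]


def count_recent_diagnoses(records, condition, as_of, window_days=90):
    """Count distinct patients diagnosed with a condition in the trailing window."""
    cutoff = as_of - timedelta(days=window_days)
    patients = {
        r["patient_id"]
        for r in records
        if r["condition"] == condition and cutoff <= r["diagnosed_on"] <= as_of
    }
    return len(patients)


lupus_count = count_recent_diagnoses(RECORDS, "lupus", as_of=date(2026, 3, 31))
```

Counting distinct patient IDs rather than rows keeps a patient with repeated entries from being counted twice. The hard part in practice is the step before this one: getting records from separate systems into one consistent, queryable form.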
Testing a health learning system in real time
The combination of AI, data integration, and cross-sector collaboration is laying the groundwork for a more data-driven health system.
“Working with startup companies or even larger ones to truly help us visualize these trends is giving us insight that we simply didn’t have before,” she said. “It’s allowing us to test what a health learning system centered on patient information looks like in real time.”
While data challenges persist, NIH views AI as essential to its mission going forward.
“Collecting and harmonizing health data to identify trends remains a grand challenge,” Gregurick said. “We still have a long way to go.”
But with AI increasingly woven into research, operations, and partnerships, she expects the agency can close that gap and deliver more precise insights into the nation’s most urgent health issues.
© 2026 Federal News Network. All rights reserved.