How AI can make infectious disease surveillance smarter, faster, and more useful

This is a guest essay for Healthbeat.

Public health agencies are under pressure to move faster, detect threats earlier, and make better decisions, even as their funding is cut and their authority reduced. While most public health agencies will have to do less with less, artificial intelligence systems provide an opportunity to maintain and possibly improve performance in one critical area: infectious disease surveillance.

When I led infectious disease programs for the New York City Health Department, I saw how even the best-resourced health departments rely on inefficient, error-prone systems. Disease surveillance often requires personnel to manually review lab reports, call health care providers, and write and revise code to clean, analyze, and visualize data.

AI has improved performance in industries where data are abundant and decisions need to be fast and accurate, such as finance and logistics. Public health surveillance systems are essentially large repositories of data about a community’s health. The same tools that improve performance in other fields can help health agencies save time, improve accuracy, and act faster.

What is disease surveillance and why does it matter?

Surveillance is how governments monitor the incidence and overall burden of disease in a population over time, detect outbreaks, inform policies, and assess whether prevention and control programs are working. Each state maintains a list of “notifiable” diseases that health care providers and laboratories are legally mandated to report.

With infectious diseases, most reporting occurs through laboratory-based surveillance. The system is supposed to work like this: A patient has symptoms and visits a health care provider; the provider collects a specimen from the patient and sends it to a clinical laboratory; the laboratory performs tests to identify a potential pathogen; then the laboratory reports those data to a state or local health agency. This is how health agencies collect data about everything from salmonella to HIV to Covid-19.

Monitoring laboratory-confirmed cases in this way gives public health agencies confidence that they are investigating cases of public health importance, not just people with symptoms of a disease that is less relevant to public health.

Here’s how artificial intelligence could improve public health

Health agencies ideally receive these reports electronically through systems that use standardized digital formatting and vocabulary. But many reports still come by fax or phone. More importantly, regardless of whether data arrive in digital or analog form, duplicate records must be adjudicated, errors corrected, and missing information (often about patient demographics and exposures) filled in.

AI can improve how data get collected and reported

Labs generate huge volumes of data. While some lab instruments automatically generate data in a format that can be readily reported to health agencies, technicians often must interpret and record results in another electronic system, and labs or health departments must process those data into a standardized format to merge them with disease surveillance databases.

AI-powered natural language processing can scan free-text lab reports and extract useful data: which pathogen was detected, which test was used, when it was done, what the result was, and what identifiable and reportable information is there about the patient (e.g., name, date of birth, home address, gender, race, ethnicity). That information can be automatically formatted to meet public health reporting requirements.
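
To make the extraction step concrete, here is a minimal sketch in Python. It uses simple pattern matching as a stand-in for a trained natural language processing model, and the report layout, field names, and `extract_fields` function are all hypothetical, invented for illustration. Real lab reports vary far more than this.

```python
import re
from typing import Optional

# Hypothetical patterns for one made-up report layout (an assumption for
# illustration). Each pattern captures the text after a labeled field.
FLAGS = re.IGNORECASE | re.MULTILINE
FIELD_PATTERNS = {
    "pathogen": re.compile(r"^[ \t]*organism(?: detected)?:[ \t]*(.+)$", FLAGS),
    "test": re.compile(r"^[ \t]*test(?: method)?:[ \t]*(.+)$", FLAGS),
    "collected": re.compile(r"^[ \t]*collected:[ \t]*(\d{4}-\d{2}-\d{2})", FLAGS),
    "result": re.compile(
        r"^[ \t]*result:[ \t]*(positive|negative|indeterminate)", FLAGS
    ),
}


def extract_fields(report_text: str) -> dict[str, Optional[str]]:
    """Pull labeled fields out of a free-text report; missing fields are None."""
    fields: dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(report_text)
        fields[name] = match.group(1).strip() if match else None
    return fields


SAMPLE_REPORT = """Specimen: stool
Organism detected: Salmonella enterica
Test method: culture
Collected: 2024-03-05
Result: Positive
"""

if __name__ == "__main__":
    print(extract_fields(SAMPLE_REPORT))
```

A field that cannot be found is returned as `None` rather than guessed, so a person can review incomplete reports before they are submitted.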

AI can also monitor data flowing from laboratory instruments and detect when a reportable disease appears. It can then automate the steps needed to report that infection without requiring a person to notice it, look it up, and enter data into a system manually.

Another major problem is compliance with reporting. Staff turnover, new software, or new instruments can result in cases not being reported. Health departments often discover these problems only after reviewing trends (e.g., why has this lab not reported any diseases in the past quarter?), then staff spend time reviewing data, contacting labs, and sometimes issuing official warnings. AI “agents” operated by health departments could continuously monitor reporting trends by facility, flag unusual drops in reporting, issue reminder emails, and even conduct calls to the laboratory facility – with a human supervising the system, but not necessarily conducting the activities.
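
The monitoring part of that idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a validated method: the function name, the eight-week baseline, and the 50% threshold are all hypothetical choices, and a real system would need to account for seasonality and small facilities.

```python
from statistics import mean


def flag_reporting_drops(weekly_counts, baseline_weeks=8, drop_fraction=0.5):
    """Return facilities whose latest weekly count fell well below baseline.

    weekly_counts maps a facility name to a list of weekly report counts,
    oldest first. The baseline is the mean of the baseline_weeks weeks
    before the latest week. Both thresholds are illustrative assumptions.
    """
    flagged = []
    for facility, counts in weekly_counts.items():
        if len(counts) <= baseline_weeks:
            continue  # not enough history to judge this facility
        baseline = mean(counts[-(baseline_weeks + 1):-1])
        latest = counts[-1]
        if baseline > 0 and latest < drop_fraction * baseline:
            flagged.append(facility)
    return sorted(flagged)


if __name__ == "__main__":
    history = {
        "Lab A": [10, 11, 9, 10, 12, 10, 9, 11, 2],   # sudden drop
        "Lab B": [10, 11, 9, 10, 12, 10, 9, 11, 10],  # steady
    }
    print(flag_reporting_drops(history))
```

The output is only a list of facilities to follow up with; whether to send a reminder or place a call remains a decision for the supervising staff.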

AI can help clean and connect the data

At a health agency, epidemiologists must merge surveillance data from many different sources, e.g., labs, hospitals, outpatient offices, and other facilities. AI algorithms help accelerate and improve identification and resolution of duplicates. Two reports might refer to the same patient but have different names or dates. Or one patient might have multiple test results for the same infection. AI can learn to match and consolidate these records, improving the accuracy of matching over time.
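
As a rough illustration of record matching, the Python sketch below compares names with a string-similarity score from the standard library and requires an exact match on date of birth. This is a fixed rule, not a system that learns over time, and the record fields, the `likely_same_patient` function, and the 0.85 threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity between two names, from 0.0 (unrelated) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def likely_same_patient(rec_a: dict, rec_b: dict, name_threshold: float = 0.85) -> bool:
    """Treat two records as one patient if birth dates match and names are close.

    The threshold is an illustrative assumption; a real system would tune it
    against records that staff have already reviewed.
    """
    same_dob = rec_a["dob"] == rec_b["dob"]
    similar_name = name_similarity(rec_a["name"], rec_b["name"]) >= name_threshold
    return same_dob and similar_name


if __name__ == "__main__":
    report_1 = {"name": "Jon Smith", "dob": "1980-04-12"}
    report_2 = {"name": "John Smith", "dob": "1980-04-12"}
    print(likely_same_patient(report_1, report_2))
```

Pairs that score near the threshold are the ones worth routing to a person for review rather than merging automatically.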

Most lab reports contain just a few pieces of information: name, age, sex, address, and the name of the doctor and facility that ordered the test. But to understand disease burden and identify outbreaks, public health investigators need much more: When did symptoms start? What was the patient exposed to? Did they travel? Were they hospitalized?

To get these details, epidemiologists must call patients or providers and search other agency databases. That takes time, staff, and effort. AI tools and agents can help by automating some of these processes. AI agents could send links to patients’ mobile phones or email for online surveys or chatbot interviews and, as they evolve, even interview patients by voice. If legally permitted in a state, AI agents can connect to hospital records, immunization registries, or death certificates to fill in missing demographic and outcome data.

AI can make sense of the data faster

One of the most promising uses of AI is to improve the analysis, visualization, and communication of surveillance data.

In well-resourced health agencies, epidemiologists run models to “nowcast” and “forecast” diseases. Most county health departments in the United States, however, lack epidemiologists with these skills. AI tools can be built to automate these processes, including merging surveillance data with other locally available data sources, such as emergency room visits, 911 calls, and pharmacy sales, to estimate current and future incidence under different interventions.
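
One simple form of nowcasting corrects recent case counts for reporting delay: if only a fraction of cases from the past few days have been reported so far, the observed count is scaled up by that fraction. The Python sketch below shows only this adjustment. The function name and the completeness fractions are hypothetical, and it does not cover forecasting or the merging of other data sources.

```python
def nowcast(observed_counts, expected_completeness):
    """Scale up recent counts to estimate how many cases will eventually be reported.

    observed_counts: cases reported so far for each recent day.
    expected_completeness: for each of those days, the fraction of cases that
    is typically reported by now (assumed to come from historical delay data).
    """
    if len(observed_counts) != len(expected_completeness):
        raise ValueError("inputs must have the same length")
    estimates = []
    for observed, fraction in zip(observed_counts, expected_completeness):
        if not 0 < fraction <= 1:
            raise ValueError("completeness must be greater than 0 and at most 1")
        estimates.append(observed / fraction)
    return estimates


if __name__ == "__main__":
    # Illustrative numbers only: the most recent day is assumed 25% complete.
    print(nowcast([48, 30, 10], [0.96, 0.6, 0.25]))
```

The estimate for the most recent days is the least certain, because it rests on the smallest share of reported cases.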

AI can also write custom summaries of the data for different audiences. Most surveillance reports are dense and difficult to read, written by epidemiologists for a broad audience that spans experts and laypeople. Generative AI tools can create summaries that are tailored to different technical levels and perspectives, such as those of policymakers, journalists, health care providers, or the public.

Privacy and security still matter

While surveillance data are generally exempt from federal health privacy laws like HIPAA, they are still governed by strict state and local laws.

AI tools used for public health surveillance must be designed to operate within these legal frameworks. They should not store or use identifiable information for training purposes. They must limit access to authorized personnel. And they should be run in secure, closed electronic environments.

AI may also help allow more data transparency while preserving confidentiality. One reason health agencies hesitate to release detailed surveillance data is the risk that individuals could be identified. AI tools could help automatically de-identify records and test whether patients can be re-identified once datasets are made public.
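
One common way to test re-identification risk is to count how many records share the same combination of indirectly identifying details, an idea known as k-anonymity. The Python sketch below is a minimal version of that check, not a complete privacy review; the field names, the `small_groups` function, and the choice of k are assumptions for illustration.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=5):
    """Return combinations of quasi-identifiers shared by fewer than k records.

    Any combination returned here describes a group small enough that a
    person in it might be identifiable. The value of k is an assumption;
    agencies set their own thresholds.
    """
    keys = [tuple(record[field] for field in quasi_identifiers) for record in records]
    counts = Counter(keys)
    return {key: n for key, n in counts.items() if n < k}


if __name__ == "__main__":
    cases = [
        {"zip": "10001", "age_group": "30-39"},
        {"zip": "10001", "age_group": "30-39"},
        {"zip": "10001", "age_group": "30-39"},
        {"zip": "10002", "age_group": "80-89"},
    ]
    print(small_groups(cases, ["zip", "age_group"], k=3))
```

Groups that fall below the threshold can then be suppressed or combined into broader categories before a dataset is released.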

What’s next

For AI to have an immediate impact, public health agencies need tools that are validated for data processing and analysis and can be easily adapted to the needs of epidemiologists. Partnerships between public health experts, AI developers, and policymakers will be essential to ensure these systems are incorporated and improve the speed, quality, and usability of public health surveillance systems, while maintaining robust security and privacy controls.

Dr. Jay K. Varma is a physician and epidemiologist. An expert in the prevention and control of infectious diseases, he has led epidemic responses, developed global and national policies, and implemented large-scale programs that saved hundreds of thousands of lives in Asia, Africa, and the United States.