National Health Research Institute · Public Health Research
Existing epidemiological surveillance systems relied on structured reporting: diagnosed cases submitted through formal channels. The institute had access to a broader dataset of anonymized EHR event streams from 2,400 clinical sites covering 18 million individuals. This data was being collected but analyzed only at an aggregate statistical level, with individual event-level reasoning applied to fewer than 5% of records. Early signals of emerging disease (patterns of symptom combinations, treatment-seeking behavior, and clinical test ordering) existed in the full event stream weeks before they appeared in diagnostic reports.
The institute applied LLM reasoning to 100% of the anonymized EHR event streams. The model was configured to identify population-level behavioral anomalies (changes in symptom patterns, clinical presentation clusters, and care-seeking behavior) that might indicate emerging disease activity.
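To make "population-level anomaly" concrete, here is a minimal sketch of one simple way to flag an unusual day in a stream of daily counts. It is a plain statistical stand-in (a trailing-baseline z-score), not the LLM-based reasoning described above, and the function name `flag_anomalies`, its parameters, and the sample counts are all hypothetical.

```python
from statistics import mean, stdev

def flag_anomalies(counts, baseline_window=14, threshold=3.0):
    """Return indices of days whose count exceeds the trailing baseline.

    A day is flagged when its count is more than `threshold` standard
    deviations above the mean of the preceding `baseline_window` days.
    Illustrative stand-in only; not the institute's method.
    """
    flagged = []
    for i in range(baseline_window, len(counts)):
        baseline = counts[i - baseline_window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline gives no usable scale; skip the day.
            continue
        z = (counts[i] - mu) / sigma
        if z > threshold:
            flagged.append(i)
    return flagged

# Hypothetical daily counts of one symptom combination in one region.
daily_counts = [10, 12, 11, 9, 10, 13, 11, 10, 12, 11, 9, 10, 12, 11, 30]
anomalous_days = flag_anomalies(daily_counts)
```

In this made-up series only the final day stands out against the two-week baseline. A real system would also need to account for seasonality, reporting delays, and site-level differences, none of which this sketch attempts.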
"The signals were in the data three weeks before they showed up in our surveillance system. We're not collecting new data — we're finally reading the data we already have."
Director of Epidemiological Surveillance
In retrospective validation against three historical disease emergence events, full EHR event coverage with LLM reasoning identified statistically significant anomaly signals an average of 21 days earlier than the formal reporting system. The signals were specific enough to indicate geographic clustering and demographic concentration, enabling pre-positioned public health response. The institute is now deploying this capability as a real-time early warning system.
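The lead-time figure is an average of per-event differences between the date a signal was first detected and the date the formal reporting system registered the event. The sketch below shows that arithmetic only. The event names and dates are hypothetical placeholders chosen for illustration, not the institute's validation data.

```python
from datetime import date
from statistics import mean

# Hypothetical (signal_date, formal_report_date) pairs. NOT real data.
events = {
    "event_a": (date(2020, 1, 6), date(2020, 1, 24)),
    "event_b": (date(2021, 3, 1), date(2021, 3, 22)),
    "event_c": (date(2022, 9, 5), date(2022, 9, 29)),
}

def lead_time_days(signal_date, report_date):
    """Days by which the early signal preceded the formal report."""
    return (report_date - signal_date).days

lead_times = {
    name: lead_time_days(signal, report)
    for name, (signal, report) in events.items()
}
average_lead = mean(lead_times.values())
```

With these placeholder dates the per-event lead times are 18, 21, and 24 days. An average over only three events is sensitive to any single one, so the per-event values are worth reporting alongside it.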