Johns Hopkins Researchers Validate Using Senior Risk Factors to Predict Utilization

“It is valid to use geriatric risk factors identified from electronic health record data as predictors for increased health care utilization for providers and health care organizations without access to claims data. Given the time lag of claims data, EHR data provides the advantage of real-time identification of patients with increased geriatric risk and health care needs for timely clinical interventions.”

The Center for Population Health Information Technology and the ACG System Team – part of the Department of Health Policy and Management at the Johns Hopkins Bloomberg School of Public Health – have been “putting lots of effort into mining the electronic medical record for predictive modeling,” reports Jonathan P. Weiner DrPH, Professor of Health Policy & Management and of Health Informatics; Director, Center for Population Health Information Technology (CPHIT); ACG System co-developer and Executive Director of Research. The latest result: “We recently published a breakthrough article in Medical Care presenting and evaluating the ACG System’s new expanded Geriatric Risk/Frailty Risk metrics for predictive modeling derived from both ‘structured’ and ‘free text’ EHRs.” The risk metric, he notes, has considerable relevancy for Medicare and disabled populations.

Here are additional details (as featured in Predictive Modeling News April 2018) from “Defining and Assessing Geriatric Risk Factors and Associated Health Care Utilization Among Older Adults Using Claims and Electronic Health Records”.

The researchers set out to “define and compare geriatric risk factors derivable from claims, structured EHRs and unstructured EHRs, and estimate the relationship between geriatric risk factors and health care utilization.” JHU “collaborated with the well-known Atrius medical group,” Weiner adds, which boasts more than 1 million patients in Massachusetts and constitutes the main physician group of the Harvard Pilgrim Health Plan. Atrius provided EMR data, both text and structured, he also notes, and claims for a cohort of about 25,000 Medicare Advantage patients for a multi-year period.

The team, the paper reports, “defined 10 individual geriatric risk factors and a summary geriatric risk index based on diagnosed conditions and pattern matching techniques applied to EHR free text.” Prevalence was estimated using claims, structured EHRs and structured and unstructured EHRs combined; “the association of geriatric risk index with any occurrence of hospitalizations, emergency department visits and nursing home visits was estimated using logistic regression adjusted for demographic and comorbidity covariates.” Here are excerpts from the results described in the paper:

  • The prevalence of geriatric risk factors increased after adding unstructured EHR data to structured EHRs compared with those derived from structured EHRs alone and claims alone.
  • Statistically significant association between geriatric risk index and health care utilization was found independent of demographic and comorbidity covariates.
  • The results demonstrate the feasibility and potential of using EHRs and claims for collecting new types of geriatric risk information that could augment the more commonly collected disease information to identify and move upstream the management of high-risk cases among older patients.
  • This is promising given that EHRs not only may serve as another rich clinical data source, but also hold the potential for real-time clinical decision support informed by both structured and unstructured data.
  • Future research is needed in considering incorporation of additional data, such as vital signs and lab test results, into geriatric risk factors, exploring severity for geriatric risk factors and refining text mining and natural language processing techniques for mining EHR free text.
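
To make the free-text step more concrete, here is a minimal sketch of how pattern matching over clinical notes might be combined with structured flags to compare prevalence and build a simple summary index. Everything in it is an assumption made for illustration: the regular expressions, the patient records and the function names are hypothetical, and they are not the study's actual patterns or the ACG System's definitions.

```python
import re

# Hypothetical patterns for illustration only; the study's actual
# patterns are not given in this article.
PATTERNS = {
    "walking_difficulty": re.compile(
        r"\b(difficulty walking|trouble walking|unsteady gait)\b", re.I),
    "falls": re.compile(r"\b(fell|recent fall|falls)\b", re.I),
    "weight_loss": re.compile(r"\b(weight loss|lost weight)\b", re.I),
    "lack_of_social_support": re.compile(
        r"\b(lives alone|no family support|socially isolated)\b", re.I),
}


def factors_from_text(note: str) -> set[str]:
    """Return the risk factors whose pattern appears in a free-text note."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(note)}


def prevalence(flags_by_patient: dict[str, set[str]], factor: str) -> float:
    """Share of patients flagged with the given factor (0.0 if no patients)."""
    if not flags_by_patient:
        return 0.0
    hits = sum(factor in flags for flags in flags_by_patient.values())
    return hits / len(flags_by_patient)


# Made-up records: structured flags (e.g. from diagnosis codes) plus one note.
patients = {
    "p1": {"structured": {"falls"},
           "note": "Patient fell at home last week. Lives alone."},
    "p2": {"structured": set(),
           "note": "Reports unsteady gait and recent weight loss."},
    "p3": {"structured": set(),
           "note": "Routine follow-up, no complaints."},
    "p4": {"structured": {"weight_loss"},
           "note": "Daughter assists with meals."},
}

structured_only = {pid: rec["structured"] for pid, rec in patients.items()}
combined = {
    pid: rec["structured"] | factors_from_text(rec["note"])
    for pid, rec in patients.items()
}

# A simple summary index: the number of distinct risk factors per patient.
risk_index = {pid: len(flags) for pid, flags in combined.items()}

for factor in PATTERNS:
    print(factor,
          prevalence(structured_only, factor),
          prevalence(combined, factor))
```

In this toy cohort, lack of social support never appears in the structured flags and only surfaces once the notes are searched, which mirrors the pattern the researchers describe. A production pipeline would also need to handle negation ("denies falls"), misspellings and note templates, which this sketch ignores.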

Predictive Modeling News talked to study authors Weiner, Hong J. Kan PhD MPP MA and Hadi Kharrazi MD PhD MHI, all faculty at the Johns Hopkins Bloomberg School of Public Health Center for Population Health IT and members of the Johns Hopkins ACG System Research and Development Team.

Predictive Modeling News:  Congratulations again! How big a deal is this? Is it a leap forward in what’s possible with existing data sources and new analytics technology?

Hong J. Kan PhD MPP MA, Hadi Kharrazi MD PhD MHI and Jonathan Weiner DrPH: Thanks for the kind words and for featuring our work! The main novelty of this study lies in its comparison of the added incremental predictive value of insurance claims, structured electronic health records and EHR free text when used to measure a wide range of risk factors relevant to geriatric populations. We hypothesized that not all risk information is equally recorded across the three data sources. Certain concepts, such as dementia, may be well-recorded using diagnosis codes in claims and structured EHRs, but other important factors, such as lack of social support, will more likely be captured in EHR free text based on the clinicians’ notes. Indeed, we found that when information derived from unstructured data using text mining techniques was combined with claims or EHR structured data, the prevalence of risk factors increased significantly.

For example, when EHR free text notes were mined, the prevalence of “walking difficulty” among our elderly cohort was 19.9% (a 2.5-fold increase over EHR structured data only); the prevalence of falls was 13.2% (a 2.5-fold increase); weight loss, 6.9% (a 1.3-fold increase); and dementia, 4.5% (a 1.4-fold increase). The largest change was seen in social support, which was negligibly recorded (less than 0.1%) in both structured EHRs and claims, but its prevalence increased to over 11% when the free text was searched. The results demonstrate the value of information from free text in EHRs, especially when information is not necessarily recorded using structured data, possibly due to the lack of financial incentives and/or existing ICD codes. In addition, the study shows that geriatric risk factors extracted from EHRs and claims independently predict future health care utilization in terms of hospitalizations, emergency department visits and nursing home use.
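
For readers who want to see what those fold increases imply, the short sketch below back-calculates the structured-only prevalence from each reported pair of figures. It assumes that an "X-fold increase" means the combined prevalence divided by the structured-only prevalence; the resulting baselines are implied values rather than numbers reported in the interview, and because the published figures are rounded they are approximate. The function names are hypothetical.

```python
def fold_change(with_text: float, structured_only: float) -> float:
    """Ratio of combined prevalence to structured-only prevalence."""
    if structured_only <= 0:
        raise ValueError("structured-only prevalence must be positive")
    return with_text / structured_only


def implied_baseline(with_text: float, fold: float) -> float:
    """Structured-only prevalence implied by a prevalence and a fold change."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return with_text / fold


# Figures quoted in the interview: (prevalence with free text in %, fold increase).
reported = {
    "walking difficulty": (19.9, 2.5),
    "falls": (13.2, 2.5),
    "weight loss": (6.9, 1.3),
    "dementia": (4.5, 1.4),
}

for factor, (with_text, fold) in reported.items():
    baseline = implied_baseline(with_text, fold)
    print(f"{factor}: roughly {baseline:.1f}% implied from structured data alone")
```

The same ratio cannot be computed meaningfully for social support, since the interview gives only bounds (less than 0.1% before, over 11% after) rather than point values.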

PMN: You mention that the metric “has considerable relevancy for Medicare and disabled populations.” Those are two pretty big markets. Will this be a blockbuster product? Are the new Geriatric Risk/Frailty Risk metrics available on the market now? How soon? What are the other main markets?

HJK, HK & JW: We know that frailty and other geriatric syndrome-related factors are clinical conditions prevalent in older adults and that the presence of these factors matters in many ways. Historically, “frailty” has been measured with surveys and in-person assessments, often requiring clinician administration, which limits their feasibility for large populations. We developed and refined a set of geriatric risk factors that can be extracted from existing electronically captured data, such as claims and EHRs. These risk factors can potentially be used to identify patients with high geriatric risk, including aspects of frailty, based on existing data for improved care management and population health management. This geriatric risk does add explanatory power above and beyond standard disease-oriented diagnostic risk adjustment/predictive modeling. Updated geriatric risk algorithms based on structured EHR and claims data are available in the next release of the Johns Hopkins ACG System. We expect NLP/text mining versions in the near future.

PMN: You mention that your findings are “in line with a previous study” and that “the ability to utilize free text in EHRs is an active research area.” What will the next big breakthrough be in research into integrating new data into new analytics machines?

HJK, HK & JW: We believe EHRs hold great promise for improving care at both the individual and population levels, especially as they near universal adoption and real-time data availability. Further standardization of data capture and improved interoperability of EHR systems will eventually allow EHRs to become a powerful source of data for clinical care management and population/public health management. We envision that there will be many breakthroughs along the way as EHRs become easier to access and analyze. Our research team at the Johns Hopkins School of Public Health Center for Population Health IT has a large research portfolio focusing on integrating EHRs and other novel data sources, including free-text/NLP, new types of clinical information, such as lab data and vital signs, and the all-important non-medical social determinants of health.

PMN: You note that your findings indicate an opportunity to “identify and move upstream the management of high-risk cases among older patients.” Do health care organizations have the other tools they need to do that? Can your findings be operationalized by most entities? Is inconsistency among EHR functionality and interoperability a problem doing so?

HJK, HK & JW: Geriatric risk factors are health status characteristics common in older adults that have the potential to be intervened upon and ameliorated if identified in a timely manner. Since our case identification algorithm is based on existing diagnosis data (ICD-9 or ICD-10 codes) from EHRs and claims, it can be readily operationalized by most provider and payer organizations. Our study shows that patients with multiple geriatric risk factors are particularly susceptible to future hospitalizations, ED visits and nursing home use. We recognize that there will be challenges in applying EHRs’ free text data, depending on specific EHR setup and tools used to extract frailty markers in a health care organization.
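
As a rough illustration of why diagnosis-code-based identification is easy to operationalize, the sketch below maps diagnosis codes to risk factors by prefix and lists patients with multiple factors. The code-to-factor mapping uses a handful of ICD-10-CM codes chosen for illustration and is not the ACG System's case definition; the threshold of two factors, the patient records and the function names are likewise assumptions.

```python
# Illustrative ICD-10-CM code prefixes; NOT the ACG System's case definitions.
CODE_PREFIXES = {
    "walking_difficulty": ("R26.2",),   # difficulty in walking
    "falls": ("R29.6", "W19"),          # repeated falls, unspecified fall
    "weight_loss": ("R63.4",),          # abnormal weight loss
    "dementia": ("F03", "G30"),         # unspecified dementia, Alzheimer's
}


def risk_factors_for(codes: list[str]) -> set[str]:
    """Return the risk factors matched by any of a patient's diagnosis codes."""
    found = set()
    for raw in codes:
        code = raw.strip().upper()
        for factor, prefixes in CODE_PREFIXES.items():
            # str.startswith accepts a tuple and matches any listed prefix.
            if code.startswith(prefixes):
                found.add(factor)
    return found


def high_risk_patients(diagnoses: dict[str, list[str]],
                       min_factors: int = 2) -> list[str]:
    """Patient IDs with at least min_factors distinct risk factors."""
    return sorted(
        pid for pid, codes in diagnoses.items()
        if len(risk_factors_for(codes)) >= min_factors
    )


# Made-up diagnosis lists drawn from claims or structured EHR fields.
diagnoses = {
    "p1": ["R26.2", "R29.6", "I10"],
    "p2": ["F03.90"],
    "p3": ["E11.9"],
    "p4": ["g30.9", "R63.4"],
}

print(high_risk_patients(diagnoses))
```

Because the lookup depends only on coded diagnoses, the same logic can run against claims or structured EHR extracts; the free-text markers the authors mention would need a separate extraction step whose difficulty depends on the local EHR setup.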

PMN: What’s your main message to colleagues reading this interview?

HJK, HK & JW: As their interoperability and standardization increase, EHRs, with their broad array of clinical data, will add important information to the mix. After careful assessment, such as that described in the study we published in the Medical Care journal, EHR-derived data should be fully integrated into predictive modeling and analytics activities. Moreover, in the future, we believe that claims and other administrative datasets, which are the mainstay of most activities today, will disappear and give way to these more clinically based health IT systems.

PMN: Any other comments?

HJK, HK & JW: At the CPHIT Center at Johns Hopkins and within the Johns Hopkins ACG System R&D team, we have a very large portfolio of related work, including two articles featured in past issues of this newsletter, as well as another recently published paper that describes additional details of our approach to using EHRs’ free text: Anzaldi LJ, Davison A, Boyd C, Leff B, Kharrazi H. “Comparing clinician descriptions of frailty and geriatric syndromes using electronic health records: A retrospective cohort study.” BMC Geriatrics. 2017; 17 (247): 1-7. As part of our research, we also are comparing the results of our digitally derived ACG Frailty Risk/Geriatric Risk measures to standardized frailty assessments completed in person. Also, two other papers are currently under review that will describe the technical details of our NLP approach as well as the case identification rate for geriatric syndromes using claims versus EHRs (both structured and free text).

For more information, see the published article: “Defining and Assessing Geriatric Risk Factors and Associated Health Care Utilization Among Older Adults Using Claims and Electronic Health Records.”
