Evaluating the performance of a predictive modeling approach to identifying members at high-risk of hospitalization

Published: September 10, 2019

Category: Bibliography

Authors: Chris Neely, Dawn Cantrell, Jack Holloway, Janet Chaisson, Jason Ouyang, Somesh Nigam, Tasha Bergeron, Vindell Washington, Xiaojing Yuan, Yuan Zhang

Countries: United States

Language: English

Types: Population Health, Utilization

Settings: Health Plan

Abstract

Aims

To evaluate the risk-of-hospitalization (ROH) models developed at Blue Cross Blue Shield of Louisiana (BCBSLA) and compare this approach to the DxCG risk-score algorithms utilized by many health plans.

Materials and Methods

Time zero for this study was December 31, 2016. BCBSLA members were eligible for study inclusion if they were fully insured; were aged 80 years or younger; and had continuous enrollment starting on or before June 1, 2016, through time zero. Up to 2 years of historical claims data preceding time zero were included per member for model development. Members were excluded if they had cancer or renal failure or were admitted for hospice. The Blue Cross ROH models were developed using (1) regularized logistic regression and (2) random decision forests (a tree ensemble learning classification method). All models were generated using Scikit-learn: Machine Learning in Python. Prognostic capabilities of DxCG risk-score algorithms were compared to those of the Blue Cross models.
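As a rough illustration of the modeling setup described above, the following sketch fits a regularized logistic regression and a random forest with scikit-learn and scores both by 10-fold cross-validated area under the ROC curve. It is not the authors' code: the synthetic data from `make_classification`, the feature count, the class balance, the L2 penalty, and all hyperparameters are assumptions made for illustration, since the abstract does not report them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def cross_validated_auc(X, y, n_splits=10, seed=0):
    """Return the mean cross-validated ROC AUC for two illustrative models."""
    models = {
        # LogisticRegression applies an L2 penalty by default; the penalty
        # type and strength (C) used in the study are not stated, so these
        # values are assumptions.
        "logistic_regression": make_pipeline(
            StandardScaler(),
            LogisticRegression(C=1.0, max_iter=1000),
        ),
        # Tree count is likewise an assumption.
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
    }
    # Stratified folds keep the share of hospitalized members similar
    # across folds, which matters when the outcome is rare.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
        for name, model in models.items()
    }


# Synthetic stand-in for member-level claims features (hypothetical data).
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=5,
    weights=[0.9, 0.1],  # roughly 10% positive class, chosen arbitrarily
    random_state=0,
)

auc_by_model = cross_validated_auc(X, y)
for name, auc in auc_by_model.items():
    print(f"{name}: mean 10-fold AUC = {auc:.3f}")
```

In practice the feature matrix would be built from each member's claims history before time zero, and the outcome would be a hospitalization after time zero.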

Results

When stratifying by the top 0.1% of members with the highest ROH, the Blue Cross logistic regression model had the highest area under the receiver operating characteristic curve (0.862) based on the result of 10-fold cross-validation. The Blue Cross random decision forests model had the highest positive predictive value (49.0%) and positive likelihood ratio (61.4), but sensitivity, specificity, negative predictive values, and negative likelihood ratios were similar across all four models.
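The metrics reported above all follow from a 2x2 confusion table once the highest-scoring fraction of members is flagged as high risk. The sketch below shows one way to compute them in plain Python. The function name `top_fraction_metrics`, the toy scores, and the 20% cutoff are hypothetical and chosen only to keep the arithmetic easy to follow; the study used a 0.1% cutoff on a much larger population.

```python
def _ratio(numerator, denominator):
    """Divide safely: infinity for x/0 with x > 0, NaN for 0/0."""
    if denominator == 0:
        return float("inf") if numerator > 0 else float("nan")
    return numerator / denominator


def top_fraction_metrics(scores, labels, fraction):
    """Flag the top `fraction` of members by score and summarize accuracy.

    `labels` holds 1 for members who were hospitalized and 0 otherwise.
    """
    n = len(scores)
    k = max(1, round(n * fraction))  # number of members flagged as high risk
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = ranked[:k]

    tp = sum(labels[i] for i in flagged)  # flagged and hospitalized
    fp = k - tp                           # flagged, not hospitalized
    fn = sum(labels) - tp                 # hospitalized but not flagged
    tn = n - k - fn                       # neither flagged nor hospitalized

    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        # Positive likelihood ratio: sensitivity / (1 - specificity)
        "lr_positive": _ratio(sensitivity, 1 - specificity),
        # Negative likelihood ratio: (1 - sensitivity) / specificity
        "lr_negative": _ratio(1 - sensitivity, specificity),
    }


# Toy example: 10 members, 2 of whom were hospitalized (made-up values).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = top_fraction_metrics(scores, labels, fraction=0.2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a very small flagged fraction, specificity and negative predictive value are dominated by the large unflagged group, which is one plausible reason those measures would look similar across models while positive predictive value and the positive likelihood ratio differ.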

Conclusions

The predictive performance of the Blue Cross models shows how member-specific, regional data can be used to accurately identify patients with a high ROH, which may allow healthcare workers to intervene earlier and subsequently reduce the healthcare burden for patients and providers.

admissions, hospitalizations, predictive modeling, machine learning, high risk, risk score