The Truven Health Blog

The latest healthcare topics from a trusted, proven, and unbiased source.

Truven Health Risk Model Ranks High in Actuarial Evaluation

By John Azzolini/Monday, November 28, 2016


Healthcare payers today are facing the complexities of reform, increased competition, and budget constraints — all while dealing with pressures to reduce costs and improve member health. Managing health risk has become a necessity. But to manage risk, payers must first understand their population. To do this well, they need reliable, robust risk and cost of care models.


Last month, the Society of Actuaries (SOA) released a study showing that Truven Health Analytics’ cost of care model outperformed other risk models in 18 out of 22 measures. SOA’s Accuracy of Claims-Based Risk Scoring Models compared health risk-scoring models, building on its previous studies with similar objectives (the most recent was in 2007). In the medical claims category (predictions based only on medical claims data), the current study showed that the Truven Health model ranked either first or second in 21 of the 22 measures. No other model came close to matching this performance. (See Table 1 for a summary of how Truven Health’s model ranked relative to the competition.)


How the SOA Evaluates Risk Models

The SOA evaluated Truven Health Analytics’ cost of care model against six others:


  • ACG® System
  • Chronic Illness & Disability Payment System and MedicaidRx
  • DxCG Intelligence
  • Milliman Advanced Risk Adjusters
  • Wakely Risk Assessment Model


The SOA assessed all models on their ability to predict costs using the Truven Health Marketscan® commercial claims dataset of 1 million members, and used three statistics to evaluate their accuracy: R-squared, mean absolute error, and predictive ratios. All three measure the difference between the predicted and the actual results. All models produced both concurrent and prospective cost predictions and were evaluated using both a capped data set (where patient costs were capped at $250,000) and a non-capped data set.
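For readers who want to see how these three statistics work, here is a minimal sketch in Python. It is not the SOA's code: the function names and the five-member sample are made up for illustration, and the formulas are the common textbook definitions, which may differ in detail from the ones used in the study. Only the $250,000 cap comes from the study description.

```python
from statistics import mean

CAP = 250_000  # cost cap mentioned in the study description


def cap_costs(costs, cap=CAP):
    """Limit each member's cost to the cap (the 'capped' data set)."""
    return [min(c, cap) for c in costs]


def r_squared(actual, predicted):
    """1 - SSE/SST: share of the variance in actual cost explained by the predictions."""
    avg = mean(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - avg) ** 2 for a in actual)
    return 1 - sse / sst


def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, ignoring its direction."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


def predictive_ratio(actual, predicted):
    """Total predicted cost divided by total actual cost; 1.0 means no overall bias."""
    return sum(predicted) / sum(actual)


# Made-up annual costs for five members (illustration only)
actual = [1_200, 300, 45_000, 8_000, 400_000]
predicted = [1_500, 500, 40_000, 9_000, 260_000]

capped = cap_costs(actual)
print("R-squared:", r_squared(capped, predicted))
print("Mean absolute error:", mean_absolute_error(capped, predicted))
print("Predictive ratio:", predictive_ratio(capped, predicted))
```

Running the same functions on `actual` instead of `capped` shows why the cap matters: a single very high cost member can dominate all three statistics.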


The SOA evaluated the models’ predictive ability using a number of scenarios (total medical costs, simulated random groups, condition-specific predictions, patient cost). In the simulated random group scenario, the SOA created groups of 1,000 and 10,000 patients to simulate the application of the model to subgroups of the population.
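The simulated random group idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SOA's procedure: the population is synthetic, the helper name `group_predictive_ratios` is hypothetical, and the number of groups drawn (50) is arbitrary. Only the group sizes of 1,000 and 10,000 come from the study description.

```python
import random


def group_predictive_ratios(actual, predicted, group_size, n_groups, seed=0):
    """Draw random groups of members and return each group's predictive ratio
    (total predicted cost / total actual cost)."""
    rng = random.Random(seed)
    members = range(len(actual))
    ratios = []
    for _ in range(n_groups):
        group = rng.sample(members, group_size)  # sampling without replacement
        total_actual = sum(actual[i] for i in group)
        total_predicted = sum(predicted[i] for i in group)
        ratios.append(total_predicted / total_actual)
    return ratios


# Synthetic population (assumption): skewed costs plus a noisy prediction
rng = random.Random(42)
actual = [rng.expovariate(1 / 5000) + 100 for _ in range(20_000)]
predicted = [a * rng.uniform(0.5, 1.5) for a in actual]

for size in (1_000, 10_000):
    ratios = group_predictive_ratios(actual, predicted, size, n_groups=50)
    # Average distance of the group ratios from the ideal value of 1.0
    spread = sum(abs(r - 1) for r in ratios) / len(ratios)
    print(f"groups of {size}: mean |ratio - 1| = {spread:.4f}")
```

Comparing the two printed values gives a feel for how much individual prediction errors average out when a model is applied to a larger subgroup.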


Table 1: How the Truven Health Cost of Care Model Performed

The Truven Health model ranked first or second for its ability to predict costs in 21 of the 22 measures studied.



Truven Health Model Ranking*

Mean Absolute Error

  • Total Medical Costs, Concurrent
  • Total Medical Costs, Prospective
  • Simulated Random Groups, Concurrent
  • Simulated Random Groups, Prospective

Predictive Ratios

  • Overall Condition Specific Prediction, Concurrent
  • Overall Condition Specific Prediction, Prospective
  • Very Low Cost Patients, Concurrent
  • Very Low Cost Patients, Prospective
  • Very High Cost Patients, Concurrent
  • Very High Cost Patients, Prospective

* Compared with six other models.

** Capped at $250,000


Why Risk Models Are Important to Payers

Risk modeling is a valuable tool for health plans and employers. It can provide insights into member utilization patterns and risk, which are vital for benefit planning, disease management and wellness program management, and member communications. It can also provide deep insights into provider performance and help determine appropriate reimbursement and premium rates. Such models are an integral part of a number of Truven Health databases and analytical tools. The SOA evaluation speaks to the high quality and reliability of the Truven Health solutions.

John Azzolini
Senior Consulting Scientist

How a Data Scientist Thinks about Risk Stratification

By Anne Fischer/Tuesday, October 25, 2016

“Risk.” It’s a word we hear every day in the healthcare industry. We want to avoid risk, we want to predict risk, we want to find patients who are high risk. We want to risk stratify populations (organize people into a set number of mutually exclusive tiers of increasing risk).

My recent blog posts have centered on the concept of Population Health. Clearly the idea of risk is particularly important in this world, where the goals are to keep well individuals healthy, avoid poor outcomes for those who are already sick, and minimize costs. Understanding, assessing, and predicting risk are all essential to this effort.

But what is “risk”? If you asked a physician, an insurer, and an average Joe on the street to describe “high risk” from a healthcare perspective, you would likely get very different answers. A physician might describe someone with high risk of developing a disease, high risk of a serious disease complication, or high risk of mortality. An insurer might describe someone at risk for a high amount of spending in the immediate future. The average Joe might describe someone at high risk of impairment or inability to function in daily life. Understanding the context-appropriate definition of risk is the first step toward building analytics to support risk analysis. And the appropriate definition always depends on the real-world application.

Even when the application is understood, there is still considerable work to be done to identify the appropriate data and characteristics that lead to poor outcomes. Consider a discharge nurse who sees hundreds of patients a month as they prepare to depart from the hospital. Most knowledgeable hospital staff are aware that the most experienced discharge nurses will be able to tell you, with a high degree of accuracy, who is likely to show up back in the hospital in the near future. Multiple studies have tried to quantify the drivers of this type of “nurse’s intuition”. How do they know?

In 1964, United States Supreme Court Justice Potter Stewart used the now-famous phrase “I know it when I see it” to describe his threshold test for obscenity in the case of Jacobellis v. Ohio. A discharge nurse might say much the same thing when asked to describe a patient at high risk for readmission: I know it when I see it. Characteristics such as illness burden, past behavior, social situation, self-care ability, and home support are often cited, but the reality is that it’s the entire picture, often with a bit of an ambiguous “gut feeling” thrown in for good measure.

So how does Data Science fit into this picture? Our challenge as Data Scientists is to turn “I know it when I see it” into a measurable mathematical formula, so that everyone “knows it” even without seeing it in person. This involves extensive experimentation with different data sources, variables, and modeling techniques, as well as building in the capability for models to evolve and learn over time. At Truven Health Analytics, my team is exclusively focused on developing and testing new models, using the various kinds of data that are readily available to us. In future blog posts, we’ll describe some of these models, including risk of developing diabetes and risk of admission. Truven Health, an IBM Company, is now positioned to move deeply into this space and develop these types of risk models by bringing together traditionally disparate data sources, clinical knowledge, and cutting-edge modeling techniques.

Anne Fischer
Senior Director, Advanced Analytics

IBM Watson Health and Truven Health Analytics: Joining forces to improve health outcomes and advance value-based care solutions

By Mike Boswood/Thursday, February 18, 2016

Today, IBM announced its plan to acquire Truven Health. Once the acquisition closes, Truven Health will help IBM continue to build an unparalleled array of healthcare capabilities to help improve health outcomes, control costs, and advance value-based care.

Upon completion of the acquisition, IBM’s health cloud will house one of the world’s largest and most diverse collections of health-related data and the ability to apply cognitive tools to obtain previously unavailable insights. Additionally, IBM is well positioned to scale globally and to build leading-edge solutions designed to help clients succeed in a value-based care environment.

The Truven Health team looks forward to combining our expertise with Watson Health. Why? Because it can catapult the industry forward to transform healthcare and improve lives.

You can find more information here. I look forward to working with Watson Health to deliver more value to our customers and, ultimately, to patients.

Mike Boswood
President & CEO

The Burden of Hepatitis C Infection

By Truven Staff/Thursday, May 14, 2015

Infection with hepatitis C virus (HCV) may lead to devastating health problems such as cirrhosis or cancer of the liver, which may develop decades after the initial infection. Although the incidence of HCV infection peaked in the late 1980s, roughly 3.2 million people in the United States today have a chronic infection. Given the long course of this disease, the medical consequences of HCV and related direct and indirect costs are continually rising. The estimated total nationwide cost associated with hospitalizations for HCV infection with advanced liver disease is $34.7 billion per year[1]. 

Clinical research demonstrates that the consequences of HCV can be mitigated with appropriate antiviral treatment. However, patient adherence is challenging due to lack of awareness and tolerability issues. We recently completed a study examining the medical costs and lost work productivity among patients diagnosed with and treated for HCV between 1997 and 2012. Our results show that patients with the shortest duration of treatment exhibited the highest post-treatment total and HCV-specific costs and lost productivity over time. Specifically, patients with the shortest duration of treatment had about 50% greater total health costs, double the HCV-specific costs, and 20% more short-term disability days. The data further show that, because cure rates are low, rates of re-treatment are high.

These results provide an important baseline for understanding the significant unmet needs of HCV patients treated with the older interferon/ribavirin regimens. They also point to the opportunity for newer treatments to facilitate appropriate adherence, improve patient health, quality of life, and work productivity, and reduce the need for re-treatment.

 This study and white paper are available at 

This study was funded by the Pharmaceutical Research and Manufacturers of America (PhRMA).


[1] Xu F, Tong X, Leidner AJ. Hospitalizations and costs associated with hepatitis C and advanced liver disease continue to increase. Health Aff (Millwood). 2014 Oct;33(10):1728-35.

Genomic Data for Oncology Research Available in Literature

By Kathleen Foley/Friday, May 30, 2014

Data. Most of us in research are data-hungry and data-greedy. When we can’t get our hands on a certain piece of data our eyes start roaming, looking for the nearest match, like a teenage boy raiding the fridge for the tenth time in a day. We will grab anything that resembles data, although we typically crave the hardcore, quantifiable, number-crunching data we’re used to, especially in cancer research. It’s funny, though, how in our hunger-driven craze, we can be blind to obvious sources of data.

Today, oncology researchers are all scrambling for genomic data. It’s the single most common data question I get from researchers. But genomic data are not yet available in most secondary data sources, such as administrative claims data or HIPAA-compliant electronic medical records. How then can we begin to explore the role of genomic information in cancer research? As Talia Foster points out in her opinion brief, Oncology Literature Reviews Reach a Tipping Point in Genomic Assessment, the literature is a readily available source that is prime for exploration.

The literature is a versatile source of information on genomic markers in cancer. It can be analyzed both qualitatively and quantitatively, and as Ms. Foster points out, it can address a variety of questions. Rather than wait for our typical sources of cancer data to fully incorporate genomic data, we can access the genomic literature today. By leveraging this powerful and rich source of data, we can begin to address many of the questions about the role of genomic assessment in diagnosis, prognosis, and treatment response, the prevalence of various mutations, and the real-world use of targeted agents. We can also begin to plan new research studies that will help us in our search to get the right treatments to the right patients at the right time. Are you looking for genomic data? Perhaps the time is right for you to think about the literature for your next data venture.

Kathleen Foley
Senior Director, Strategic Consulting (Life Sciences)