More Practical Metrics for Standardizing Health Outcomes in Effectiveness Research (Text Version)
Slide Presentation from the AHRQ 2009 Annual Conference
On September 15, 2009, John E. Ware, Jr. made this presentation at the 2009 Annual Conference.
Slide 1

More Practical Metrics for Standardizing Health Outcomes in Effectiveness Research
John E. Ware, Jr., PhD, Professor and Chief
Division of Measurement Sciences, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA
Track A - Patient Reported Outcome Measurement and Comparative Effectiveness Research to Reform: Achieving Health System Change
AHRQ 2009 Annual Conference, Bethesda MD September 13-16, 2009
Slide 2

What is the Relationship Between Health Care Expenditures & Outcomes?
Image: Line graph shows health outcome rising with expenditures for health care ($).
Slide 3

Health Insurance Experiment Revealed:
More Health Care is Not Always Better
Image: Line graph shows health outcome rising with expenditures for health care ($), then leveling off. The leveling off is described as "Flat of the Curve."
Slide 4

When the Same Outcome Costs More
Payers & Consumers: Want to Pay Less
Image: Line graph shows health outcome rising with expenditures for health care ($), then leveling off. On the section of the line that has leveled off are two bell curves and the captions, "Payers and Consumers: Want to Pay Less."
Slide 5

Who is Most Vulnerable with Aggressive Cost Containment?
- Health Insurance Experiment (HIE) (1974-1981)
Well, Well off, Young
Cost Containment
- Most Vulnerable in the MOS:
- Chronically Ill
- Elderly
- Poor
- Non-white
- Medical Outcomes Study (MOS) (1986-1990)
Expenditures for Health Care ($)
Slide 6

4-Year Physical Health Outcomes Favored FFS > HMO for Chronically-Ill Medicare in the MOS
Images: Two pie charts display the following statistics:
Fee for Service
- 63% Same
- 28% Worse
- 9% Better
HMO
- 54% Worse (the percentages better and worse would be only about 5% if due to measurement error)
- 37% Same
- 9% Better
Source: Ware, Bayliss, Rogers et al., JAMA 1996; 276:1039-1047
Slide 7

When Outcomes Vary at the Same Price
Image: Line graph shows health outcome rising with expenditures for health care ($), then leveling off. At the point where the line levels off is a bell curve perpendicular to the level line and the caption, "Payers & Consumers Want the Best Outcomes."
Slide 8

To Compare Health Care Effectiveness
We Need Health Outcomes "Rulers"
Image: Line graph shows health outcome rising with expenditures for health care ($), then leveling off. Health Outcome is divided into three sections:
- Better
- Same
- Worse
Slide 9

Continuum of Disease-specific and Generic Health Measures
| Clinical Markers | Specific Symptoms | Impact of Disease-specific Problems | Generic Functioning, Well-being and Evaluation |
|---|---|---|---|
| (1) | (2) | (3) | (4) |
Adapted from: Wilson and Cleary, JAMA 1995; Ware, Annual Rev. Pub. Health 1995
Slide 10

Continuum of Disease-specific and Generic Health Measures
| Spirometry | Shortness of Breath | | |
|---|---|---|---|
| Image: a woman is shown using a spirometer; the parts of the machine are labeled | Over the last 4 weeks I have had shortness of breath | | |
| Clinical Markers | Specific Symptoms | Impact of Disease-specific Problems | Generic Functioning, Well-being and Evaluation |
| (1) | (2) | (3) | (4) |
Adapted from: Wilson and Cleary, JAMA 1995; Ware, Annual Rev. Pub. Health 1995
Slide 11

Continuum of Disease-specific and Generic Health Measures
| Spirometry | Shortness of Breath | Respiratory-specific | |
|---|---|---|---|
| Image: a woman is shown using a spirometer; the parts of the machine are labeled | Over the last 4 weeks I have had shortness of breath | How much did your lung/respiratory problems limit your usual activities or enjoyment of everyday life? | |
| Clinical Markers | Specific Symptoms | Impact of Disease-specific Problems | Generic Functioning, Well-being and Evaluation |
| (1) | (2) | (3) | (4) |
Adapted from: Wilson and Cleary, JAMA 1995; Ware, Annual Rev. Pub. Health 1995
Slide 12

Continuum of Disease-specific and Generic Health Measures
| Spirometry | Shortness of Breath | Respiratory-specific | |
|---|---|---|---|
| Image: a woman is shown using a spirometer; the parts of the machine are labeled | Over the last 4 weeks I have had shortness of breath | How much did your lung/respiratory problems limit your usual activities or enjoyment of everyday life? | In general, would you say your health is... |
| Clinical Markers | Specific Symptoms | Impact of Disease-specific Problems | Generic Functioning, Well-being and Evaluation |
| (1) | (2) | (3) | (4) |
Adapted from: Wilson and Cleary, JAMA 1995; Ware, Annual Rev. Pub. Health 1995
Slide 13

There is More to the Continuum
Image: the table below is contained in the shape of an arrow pointing to the right.
| Clinical Markers | Specific Symptoms | Impact of Disease-specific Problems | Generic Functioning, Well-being and Evaluation |
|---|---|---|---|
| (1) | (2) | (3) | (4) |
Slide 14

Prediction and Risk Management: PROs are among the Best Predictors
Image: the table below is contained in the shape of an arrow pointing to the following text:
Future health
Inpatient expenditures
Outpatient expenditures
Job loss
Response to treatment
Return to work
Work productivity
Mortality
| Impact of Disease-specific Problems | Generic Functioning, Well-being and Evaluation |
|---|---|
| (3) | (4) |
Below the arrow is the following text: "Health-Related QOL (HR-QOL)."
Slide 15

What Do We Need for Comparative Effectiveness Research?
- Outcomes that matter to patients
- Practical measures
- Coverage of a wide range
- Greater precision
- Comparability of scores
- Ease of interpretation
Slide 16

Content of Widely-Used Patient-Reported Outcome Measures
| Concepts | Psychometric | Utility Related | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SIP | HIE | NHP | COOP | DUKE | MOS FWBP | MOS SF-36 | PROMIS | QWB | EURO-QOL | HUI | SF-6D | |
| Physical functioning | x | x | x | x | x | x | x | x | x | x | x | x |
| Social functioning | x | x | x | x | x | x | x | x | x | x | x | |
| Role functioning | x | x | x | x | x | x | x | x | x | x | x | |
| Psychological distress | x | x | x | x | x | x | x | x | x | x | x | |
| Health perceptions (general) | x | x | x | x | x | x | x | x | ||||
| Pain (bodily) | x | x | x | x | x | x | x | x | x | x | ||
| Energy/fatigue | x | x | x | x | x | x | x | x | x | x | ||
| Psychological well-being | x | x | x | |||||||||
| Sleep | x | x | x | |||||||||
| Cognitive functioning | x | x | x | |||||||||
| Quality of life | x | x | x | |||||||||
| Reported health transition | x | x | ||||||||||
SIP = Sickness Impact Profile (1976); HIE = Health Insurance Experiment surveys (1979); NHP = Nottingham Health Profile (1980); QLI = Quality of Life Index (1981); COOP = Dartmouth Function Charts (1987); MOS FWBP = MOS Functioning and Well-Being Profile (1992); MOS SF-36 = MOS 36-Item Short-Form Health Survey (1992); PROMIS = Patient Reported Outcomes Measurement Information System; QWB = Quality of Well-Being Scale (1973); EURO-QOL = European Quality of Life Index (1990); HUI = Health Utilities Index (1996); SF-6D = SF-36 Utility Index (Brazier, 2002)
Source: Adapted from Ware, 1995
Slide 17

What Do We Need for Comparative Effectiveness Research?
- Outcomes that matter to patients
- Practical measures
- Coverage of a wide range
- Greater precision
- Comparability of scores
- Ease of interpretation
Slide 18

What Do We Need for Comparative Effectiveness Research?
- Outcomes that matter to patients
- Practical measures
- Coverage of a wide range
- Greater precision
- Comparability of scores
- Ease of interpretation
Slide 19

A Practical Solution in 1999: Computerized Dynamic Health Assessment
Image: Graph showing that IRT/CAT will spawn a new generation of static tools.
Ware JE, Jr, et al. Med Care 2000;38:1173-82.
Slide 20

What Do We Need for Comparative Effectiveness Research?
- Outcomes that matter to patients
- Practical measures
- Coverage of a wide range
- Greater precision
- Comparability of scores
- Ease of interpretation
Slide 21

What Do We Need for Comparative Effectiveness Research?
- Outcomes that matter to patients
- Practical measures
- Coverage of a wide range
- Greater precision
- Comparability of scores
- Ease of interpretation
Slide 22

Practical Solution in 2000:
Cross-Calibration of Headache Pain Disability Measures
Theta (θ) [Best Possible Estimate]
| Scales | 20 | 30 | 40 | 50 | 60 | 70 |
|---|---|---|---|---|---|---|
| HDI ↑ | 16 | 43 | 73 | 91 | 98 | 100 |
| HIMQ ↓ | 74 | 53 | 31 | 17 | 8 | 2 |
| MIDAS ↓ | 58 | 28 | 5 | 1 | 0 | 0 |
| MSQ ↑ | 31 | 53 | 79 | 92 | 96 | 99 |
| DYNHA-5 (+) | 23 | 32 | 41 | 51 | 58 | 66 |
Note: Direction of scoring shown with arrows
Source: Ware, Bjorner & Kosinski, Medical Care 2000
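A cross-calibration table of this kind can act as a lookup between instruments: a score on one scale is mapped to theta, and theta is mapped to the expected score on another scale. The sketch below is only an illustration of that idea. It copies the six tabulated points from the slide and assumes straight-line interpolation between them, which is a simplification and not the IRT calibration used in the cited study. The function names are hypothetical.

```python
# Tabulated points copied from the slide: expected scale score at each theta.
THETA = [20, 30, 40, 50, 60, 70]
SCALES = {
    "HDI": [16, 43, 73, 91, 98, 100],
    "HIMQ": [74, 53, 31, 17, 8, 2],
    "MIDAS": [58, 28, 5, 1, 0, 0],
    "MSQ": [31, 53, 79, 92, 96, 99],
    "DYNHA-5": [23, 32, 41, 51, 58, 66],
}


def theta_to_score(scale, theta):
    """Expected score on `scale` at `theta` (assumed linear between points)."""
    ys = SCALES[scale]
    for i in range(len(THETA) - 1):
        t0, t1 = THETA[i], THETA[i + 1]
        if t0 <= theta <= t1:
            frac = (theta - t0) / (t1 - t0)
            return ys[i] + frac * (ys[i + 1] - ys[i])
    raise ValueError("theta is outside the tabulated range 20-70")


def score_to_theta(scale, score):
    """Theta implied by a score on `scale` (works for either scoring direction)."""
    ys = SCALES[scale]
    for i in range(len(ys) - 1):
        y0, y1 = ys[i], ys[i + 1]
        # Skip flat segments, where the score carries no information about theta.
        if y0 != y1 and min(y0, y1) <= score <= max(y0, y1):
            frac = (score - y0) / (y1 - y0)
            return THETA[i] + frac * (THETA[i + 1] - THETA[i])
    raise ValueError("score is outside the tabulated range for this scale")


def convert(score, source, target):
    """Translate a score from one scale to another by way of theta."""
    return theta_to_score(target, score_to_theta(source, score))


print(convert(73, "HDI", "MSQ"))
print(convert(82, "HDI", "DYNHA-5"))
```

The flat segments at the top of MIDAS (scores of 0 at theta 60 and 70) show why a scale with a floor or ceiling cannot be mapped back to a unique theta in that region.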
Slide 23

We Need the Health Equivalent of a Two-Sided Tape Measure
Image: A tape measure with centimeters on one side and inches on the other. A note reads, "52 centimeters = 20.5 inches."
And Public-Private Partnerships That Meet the Needs of Research and Business
Slide 24

What Do We Need for Comparative Effectiveness Research?
- Outcomes that matter to patients
- Practical measures
- Coverage of a wide range
- Greater precision
- Comparability of scores
- Ease of interpretation
Slide 25

PRO Validation Must be Comprehensive
Image: Five boxes contain the following text:
- Causes
- Gold Standard
- Consequences
- Measures In Question
- Other Measures & Methods
Arrows point from "Causes" to "Measures In Question" to "Consequences."
Adapted from: Ware JE, Jr. and Keller SD: Interpreting general health measures, in: Quality of Life and Pharmacoeconomics in Clinical Trials. Philadelphia, PA: Lippincott-Raven Publishers; 1995: Chapter 47.
Slide 26

What Do Differences in Treatment Effectiveness Mean?
Treatment
- 50% reduction in disease burden
- 33% reduction in hospitalization
- Substantial increase in work productivity
- Subsequent cost savings
Slide 27

Matching Methods to Applications:
"Choosing the Right Horse for the Course"
- Population monitoring
- Group-Level outcomes monitoring
- Patient-level measurement/management
Slide 28

Matching Methods to Applications
Image: Graph of matching methods to applications.
- Population monitoring: Single Item. Most Functionally Impaired: Noisy Individual Classification
- Group-Level Outcomes Monitoring: Multi-Item Scale
- Patient-Level Management: "Item Pool" (CAT Dynamic). Most Functionally Impaired: Very Accurate Individual Classification
Slide 29

Solutions
- Improved psychometrics (Item response theory—IRT)
- Computerized adaptive testing (CAT) software
- The Internet (and other connectivity)
Business Week. November 26, 2001.
Slide 30

First, Construct Better Metrics
- Comprehensive Item "Pools"
- IRT Cross Calibration of Items
1980 "PF Ruler" >75% @ Ceiling
1990 "PF Ruler" >30% @ Ceiling
2008 "PF Ruler" <3 % @ Ceiling
Note: Physical Functioning (PF)
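The "% @ Ceiling" figures describe the share of respondents who obtain the highest possible score, where the ruler can no longer distinguish among them. A minimal sketch of that calculation follows. The function name and the example scores are made up for illustration and are not data from the presentation.

```python
def percent_at_ceiling(scores, max_score):
    """Percentage of respondents scoring at (or above) the scale maximum."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_ceiling = sum(1 for s in scores if s >= max_score)
    return 100.0 * at_ceiling / len(scores)


# Made-up physical functioning scores on a 0-100 scale, for illustration only.
example_scores = [100, 100, 85, 100, 70, 100, 95, 100]
print(percent_at_ceiling(example_scores, 100))  # 5 of 8 respondents -> 62.5
```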
Slide 31

Precision Varies Across “Static” and Dynamic Forms and Across Score Levels
Image: Graph of Static and Dynamic Forms, across score levels.
Slide 32

2nd Solution, Assess Health Dynamically
CAT
Patient scores here
CAT = Computerized Adaptive Testing
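The general logic of a computerized adaptive test can be sketched in a few lines: estimate the score from the answers so far, pick the unasked item that is most informative at that estimate, and repeat. The example below is a minimal sketch under stated assumptions, not the software described in the presentation. It assumes a two-parameter logistic IRT model with yes/no items, a standard normal prior, and a hypothetical eight-item bank whose parameters are invented for illustration.

```python
import math

# Hypothetical item bank: name -> (discrimination a, difficulty b), 2PL model.
ITEM_BANK = {
    "item01": (1.2, -2.0),
    "item02": (1.0, -1.0),
    "item03": (1.5, -0.5),
    "item04": (1.3, 0.0),
    "item05": (1.1, 0.5),
    "item06": (1.4, 1.0),
    "item07": (1.0, 1.5),
    "item08": (1.2, 2.0),
}
GRID = [x / 10 for x in range(-40, 41)]  # theta grid from -4.0 to 4.0


def p_endorse(theta, a, b):
    """Probability of a 'yes' answer at theta under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def information(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_estimate(responses):
    """Posterior mean of theta on GRID, assuming a standard normal prior."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)
        for item, answer in responses:
            p = p_endorse(theta, *ITEM_BANK[item])
            w *= p if answer else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


def run_cat(answer_fn, n_items=5):
    """Administer n_items adaptively; answer_fn(item) returns True or False."""
    responses = []
    theta_hat = 0.0  # start at the prior mean
    for _ in range(n_items):
        asked = {item for item, _ in responses}
        remaining = [i for i in ITEM_BANK if i not in asked]
        # Choose the unasked item that is most informative at the current estimate.
        next_item = max(remaining, key=lambda i: information(theta_hat, *ITEM_BANK[i]))
        responses.append((next_item, answer_fn(next_item)))
        theta_hat = eap_estimate(responses)
    return theta_hat, responses


# A simulated respondent who says 'yes' to every item easier than theta = 1.8.
estimate, asked_items = run_cat(lambda item: ITEM_BANK[item][1] < 1.8)
print(round(estimate, 2), [item for item, _ in asked_items])
```

Because each question is chosen from the answers already given, two respondents at different health levels see different items, which is how a short adaptive form can cover a wide score range.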
Slide 33

What are the Advantages of Dynamic Assessments?
- More accurate risk screening
- Reliable enough to monitor individual outcomes
- Brevity of a short form: 90% reduction in respondent burden
- Elimination of "ceiling" & "floor" effects
- Can be administered using various data collection technologies
- Markedly reduced data collection costs
- Monitor data quality in real time
Slide 34

Performance of 5-item CAT Scores Confirmed in NIH-Sponsored Studies
Images: A series of 6 graphs from studies: Mental Health, Headache Disability, Pediatric Disability, Chronic Kidney Disease, Diabetes Impact, Post Acute Rehabilitation.
Slide 35

3rd Solution: The Internet
- www.amIhealthy.com
- www.asthmacontroltest.com
Reference (Headache Impact): Bayliss MS, Dewey JE, Cady R, et al. A study of the feasibility of Internet administration of a computerized health survey: The Headache Impact Test (HIT). Quality of Life Research 2003;12:953-961.
Reference (Asthma Control): Nathan RA, Sorkness CA, Kosinski M, et al. Development of the Asthma Control Test: A survey for assessing asthma control. Journal of Allergy and Clinical Immunology 2004;113:59-65.
Slide 36

Conclusions
- Patient-reported outcomes (PROs) are very useful
- Standardization of concepts & metrics is enabling comparisons across treatments & settings
- Increasing widespread use proves that more practical tools will be adopted
- Promising technological advances include: item response theory (IRT), computerized adaptive testing (CAT) and Internet-based data capture

