July 22, 2009: Morning Session (continued)
Transcript: First Meeting of the Subcommittee on Quality Measures for Children in Medicaid and Children's Health Insurance Programs
Jeffrey Schiff: And then the other, the last thing I just want to say in terms of logistics for the moment is I know there are a few people who came a little late and have not had a chance to introduce themselves and where they are from, and I'm looking at Lisa and Glenn and Marina, so if you would just introduce yourself, where you are from, and that will be great.
Lisa Simpson: Good morning. I'm Lisa Simpson. I'm the director of the Child Policy Research Center at Cincinnati Children's Hospital and at the University of Cincinnati. And like Marina, I feel that this is a very exciting day because it has been a long time coming.
Female Voice: Marina?
Marina Weiss: I'm Marina Weiss, and I'm with the March of Dimes and have been for a number of years. Before that, I was with the Clinton Administration, and before that, I worked on the Hill, particularly in the Senate, and I was very much involved in the development of the legislation that has brought us here today, partly as an outgrowth of a very long and rich tenure on the National Quality Forum (NQF) board, where several of us tried mightily, with some modest success, to move the agenda in the direction of pediatric measures. But resources were always short, and it just became obvious that in order to make this thing happen, we needed to go to the Hill, we needed to get some legislation, and we needed money. So I stepped down from the NQF board and have spent my last couple of years doing that, and I'm delighted to be here to see the fruition of this effort. So thank you.
Rita Mangione-Smith: I apologize. Glenn, you did introduce yourself; you switched chairs, so I got confused. Sorry. Okay. So, I'm going to start us off just very, very briefly, stating as straightforwardly as I can what our goals as a subcommittee are for these next 2 days.
Our main goal by the end of Thursday is to identify our preliminary core set of measures. It will not be the set that is our final recommendation. We do not expect that to happen until we have made you do a whole bunch more work between now and September and then even probably a little more after the September meeting. We literally will be using your expertise and needing your input and thoughts all the way until this is due on September 30th. So we do hope to have an initial preliminary set to share with the National Advisory Council on Friday, and we will tell you—Jeff is going to kind of go through our vision of how this process is going to work to get us all oriented in just a minute.
The other key thing we need to accomplish before we let you all go tomorrow is to make sure we are all on the same page as to what our process steps are between the end of tomorrow and the end of this effort, especially between now and our September meeting. Again, I know a lot of people cannot be at that meeting, so our intention is to make you work really hard online so we can get all your input even if you cannot be at the meeting. So I'm going to turn it over to Jeff, who is going to talk to you about the logic and flow as we see it working over the next 2 days.
Jeffrey Schiff: Thanks, and I know it is hard to be in the beginning of these meetings where we have to listen for a while, but I have broken this up into a few different areas just to sort of give context for this, and then we will do some questions really about the process and then hopefully move fairly quickly into the meat of this whole thing. I think one of the challenges that we have had as we had phone calls with the Centers for Medicare & Medicaid Services (CMS) and AHRQ and ourselves as chairs is to try to design a process that gets us to this goal as efficiently and quickly as possible, and so I want to talk about a couple of things.
The first is about the logic of the process as we have envisioned it, which is really reflected in the agenda. The second is to talk about just how the meeting will flow which is a little bit in the agenda. And I think the third thing is there are some process values that I think that we have articulated as we have gone forward that I just want to be able to say as we move along, so bear with me just for a few minutes as we talk about this.
I think one of the things that most of you probably know way better than I do (as someone who tries to implement quality measurement but is perhaps less involved in development) is that we actually talked about a logic model where we go from validity, whether or not our measures are valid so that we can use them, to feasibility, whether or not we can actually implement them and they are reliable and usable, and then to scoring importance, and we have said we need to flow in that direction. In that regard, what we really have (and you have already participated in this by your responses to the Delphi exercise) are the first steps of that: looking at validity and feasibility for measures we have already been able to identify.
So what we are really going to try to do today and tomorrow, in the flow of this meeting, is look at what we mean by validity and feasibility and importance. Our first discussion after this is really to reach some common understanding in this group about what we mean around validity and feasibility, and then to talk about measures and where measures score out on that.
We specifically want to spend our time talking about measures for which our scores were close. The ones that have dropped off we feel we should not spend time on (our time is very precious), and the ones that scored high we feel we should not spend time on because we have a fair amount of agreement. But as Rita will go through a little bit later, we have tried to figure out the ones where we are sitting on the fence a little bit. What we hope to do is work through validity and feasibility for the borderline ones, and then tomorrow look at importance for the ones that already scored high enough. So today, we are going to talk about the borderline ones for validity and feasibility. Tomorrow, we are going to talk about importance for the ones that scored high.
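[Editor's illustration: a minimal sketch, in Python, of the triage described above, bucketing measures by their panel scores from the Delphi exercise. The 1-to-9 rating scale and the cutoffs (median of 7 or higher kept, below 4 dropped, everything between flagged for discussion) are illustrative assumptions, not the subcommittee's actual thresholds.]

    from statistics import median

    def triage_measures(panel_scores):
        """panel_scores: dict mapping measure name -> list of panelist ratings (1-9)."""
        keep, discuss, drop = [], [], []
        for name, ratings in panel_scores.items():
            m = median(ratings)
            if m >= 7:
                keep.append(name)      # broad agreement: importance is discussed tomorrow
            elif m < 4:
                drop.append(name)      # scored low: no meeting time spent on it
            else:
                discuss.append(name)   # borderline: validity/feasibility discussed today
        return keep, discuss, drop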
What about some other categories? There are categories of measures that have not been identified yet, and then there are the ones we are going to talk about now, where we have not decided whether or not they meet the validity and feasibility criteria. We will either deal with them online over the course of the interval, or we will deal with them at the next meeting.
In this process, as we said, we would like to get some meat on the bones for Friday, so that when we meet with the National Advisory Council, we can say here are some things that we know we are going to value. We know that on Friday we will have an incomplete product, but we know that we have this process going. So as we have talked about this, we actually see this measurement process going forward in a bunch of different steps. There will be the relatively easy ones that we have already committed to around validity and feasibility; we can move on to importance. There will be others that we will have to talk about. There will be new ones that will be identified. So we will bring some of this on, I guess I would say, in waves: the first wave will be those that we have already validated, and the second and third waves will be those we identify and those we agree in this discussion have validity and feasibility.
So our agenda is built around that, and in the agenda, we have time for these discussions around what we mean and then time for the discussion around what is involved. I need to tell you, and I think Rita and Denise and I have talked about this, that this is going to be both an exciting process and a frustrating one. So we decided that we would apologize at the beginning for the frustration and for the fact that we in some ways are the timekeepers and the referees here, to make this happen so we get through what we need to get through. And so, if we seem like we are trying to adhere to a time schedule, it is because there is a lot to do.
The other thing to say then, and that sort of feeds the agenda, is how we do that. We think that by our next meeting in September, we will have moved this along a certain distance, and hopefully in September we will have a very significant issue to talk about, which Carolyn alluded to: what is the size of a core set? We know also, and I think it is one of the first things we will do after this conversation (it started with the very first question and the very last question), that there is the pressure we feel about measures that are not developed. So we will talk about that as well as we go along.
The last thing I want to say about this process, having talked about the logic and flow of the meeting and our agenda, is around some things that we have talked about that we value in this process. Probably the most important one is transparency: we really want to make sure that this process is transparent, and that as we develop the report that will go to the Secretary, we remain transparent. One example is that we have scored some things fairly high on validity, but the evidence is not overwhelming. So we want to be able both to say this is how we scored it based on expert opinion and to be honest about what level of evidence we have. That is one value.
The next value, I think, is that we believe these measures need to do more than feed another report on the status of programs and of States; there need to be measures that can be used for quality improvement, and that is part of the legislation. I think it is a core value we have to articulate that these measures are not just static reports on programs, but measures that actually move the program forward.
And then I think the last thing I will say about values, and this is where you will give us feedback, is that we want this to be a fair process, so we want to make sure that everyone is heard. I think there are people who have different things that they are more passionate about. We want to make sure that this is a participatory process, both for this committee and then—as this becomes more public and people have a chance to comment—that it is participatory for anyone who wishes to move along.
So that is some of the stuff about the process. It is a lot. I wanted to make sure people have a chance to comment on whether this makes sense. We as co-chairs have tried to make this work and tried to make sure this process gets us to where we need to go. It is no small task. We hope this makes sense. So if people have comments about the process, let's do that now.
Rita Mangione-Smith: I just have one final thing to add, and it is the reason I wrote this up on the board over there. As Jeff alluded to, you all bring to the table measures that you may know are in use by States but that are not on the list we sent you for the original Delphi process. As we go through the day, and actually I think this is one of our next steps as we get feedback from you, I want to start writing down any measures that have been missed.
There are some areas of passion where there is not a measure in use. There may be a measure in existence, but it is not in use, or there is just an area that is begging for measures and does not have them. We would like to take our brain bank and put those down and at least offer them up to the National Advisory Council as areas we identified that we cannot include in the core set, for the reasons that have been put forward, but that are certainly in need of development and potential future implementation. So that is what the right-hand side is for, and you can offer those up at any point during the day, and I promise I will go down and write them down. Okay?
Female Voice: I'm sorry.
Rita Mangione-Smith: One thing: can you just do this when you want to talk, so that way we know you want to speak up.
Jeffrey Schiff: I'll say one other thing about your nametags: if you could position them so I can see them, because I do not know everybody's name, and I'm kind of looking around in glasses.
Female Voice: Okay. So, I think the process is terrific, and I love the way you have organized the agenda and the materials ahead of time were very helpful and so on. But I would like to kind of offer up maybe a friendly amendment or at least a thought to keep in mind as we go forward. One of the very important endpoints for whatever information is captured as a consequence of our development of a core measure set will be the report that goes to Congress.
And I just want to underscore for folks around the table how very excited members were, members we worked with, on both sides of the aisle to move this process forward, and the word that kept coming up again and again and again was accountability. The idea being that members of Congress take some political risk in raising taxes needed to support some very expensive programs, Medicaid and CHIP. And it is very attractive to them that for the first time ever, they are looking forward to being able to say this is a sound investment.
We are accountable to the taxpayer for what we are doing here in the way of raising their taxes in order to pay for programs that are making a real difference for children. And we can show you in quantifiable ways that the outcomes that children are experiencing health-wise have been improved and are being maintained in cases where we are talking about children with chronic conditions and so forth in ways that we, as a nation, can be proud of. So I just want to raise the issue of thinking ahead about how the information is going to be used, and that is just one area. Obviously, there are some other things that are going to be done with the information, but this is a very important audience for all of us to be aware of.
Jeffrey Schiff: Thanks.
Rita Mangione-Smith: Paul? The red light goes on [cross-talking].
Paul Melinkovich: I think this is a very good structure. The one question I would ask is about the timing of the discussion about the size of the measure set. My understanding is that this is a voluntary process, so if we do come up with, say, a hundred great measures, getting the States to report a really good dataset, or to produce the kind of report we were just talking about, may be a challenge. Some discussion about the size will be important early on. If we said 30 might do, that would be significantly less work than talking about a hundred.
Rita Mangione-Smith: That is an excellent point, and it actually came up at breakfast this morning because we have a certain level of stress about that whole piece of it. And Carolyn actually brought up a really good point, and that is that we need to keep in mind that our group is advisory, and that presenting a larger set from which to choose may not be such a bad thing because we do not actually have the final word on what goes into the core set. We are making our expert recommendation: these are slam dunks, these are the ones that absolutely should go in, and here is an additional set that we think would really enrich it but that was not quite as firm as this core set. But I think you are right; I think we all have to be thinking a bit about what is the central set and what is the enriching set as we move forward.
Carolyn Clancy: One other note of reality here—every year or every so often for the National Healthcare Quality Report, we go out publicly looking for input on measures, and we always go out to Federal partners. We have a very strong group of collaborators there. No one ever, ever has said, "You know XYZ measure, not really telling us much so we have kind of maxed out there. It does not give us that much information." It is always more, more, and more. So I think this is a generic challenge of pent-up enthusiasm, and we will work through that.
Jeffrey Schiff: I just want to make one other comment on what Paul had to say. I think in developing this process, we felt like we had enough to do, at a practical level, in this meeting to get through this validity, feasibility, and importance work, and that we needed to let it sort out a little bit to see if there were breakpoints around importance in these sets. So I think we are perhaps putting off a big discussion, but that is the logic of why we did it.
Catherine Hess: A couple of quick things on the schedule leading up to when this has to go up on the Web for public comment and so forth. The NAC meets again in early November. Is there to be a post-September point, leading up to the early November meeting, for another report to the NAC, or no?
Rita Mangione-Smith: Yeah. I mean that is a possibility, whether the actual subcommittee will meet.
Catherine Hess: No, I was not so much thinking of meeting as just—is there another sort of point there at which a revised report to the NAC might be done?
Rita Mangione-Smith: Yes.
Catherine Hess: Okay.
Rita Mangione-Smith: At that point however, the report to the Secretary may be in clearance.
Catherine Hess: Okay.
Rita Mangione-Smith: In clearance process, internally.
Catherine Hess: Okay, all right. Well, okay. I wondered on a substantive note if we are all in agreement about what the age group is for "child" in Medicaid or CHIP versus "pediatric," and whether across all the States there is any helpful epidemiologic information about the children that are covered or enrolled as beneficiaries, and whether there is an age distribution that might guide us to focus on certain parts of a pediatric age group. I'm kind of operating on the assumption this is zero to 18, but I think some clarity on what "child" means, and whether we really mean that to be the same as "pediatric" and so forth, would be helpful.
The third thing, coming out of the measurement world: I like your formulation of validity and feasibility and importance, but the thing that is missing is reliability. Reliability puts a ceiling on validity, and you cannot stick it into feasibility; it is a very different kind of thing. I think for people who have to actually work with honest-to-God measurement and measure sets and so forth, we should keep in mind that reliability does put that cap on what could conceivably be validity.
The fourth thing that I wondered is whether there is any recourse to the National Quality Measures Clearinghouse information that might help us know what is in that Clearinghouse already and whether that would help in some selection in or out of certain kinds of measures.
Rita Mangione-Smith: I'll take the first couple of your points, and they are very, very well taken, Cathy. And one of the things that we are going to start doing at probably just about noontime is talking in detail about validity, feasibility, and what do we all agree we are talking about when we use those terms and what can we all as a group agree to are the important criteria to set for those because I think that informs all of the rest of this afternoon's and tomorrow's discussion. So that is one process that we absolutely are going to go through as a group to see that we are all kind of on the same page.
Rita Mangione-Smith: I do tend to (and probably I'm in error in doing so) lump reliability into feasibility. I think of it this way: if you have a feasible measure that has detailed specifications, then I can get unbiased, reliable information regardless of whether I'm in Texas or California or Minnesota, and I'm going to get the same answer using the same measure and the same specifications each time. But again, these are things that we as a group need to hash out, starting at about noon we hope. So that is that part. But on the Clearinghouse, I'm going to let Denise answer; you were going to comment about the Clearinghouse.
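[Editor's illustration: a minimal sketch, in Python, of the point Dr. Mangione-Smith is making: once a measure carries detailed specifications (denominator inclusion and exclusion criteria, numerator criteria), any State running the same logic over its own data should get a comparable rate. The measure, the age band, and the field names here are hypothetical, loosely patterned on a HEDIS-style ADHD follow-up measure rather than an actual specification.]

    def adhd_followup_rate(children):
        """children: list of dicts with 'age_years', 'new_adhd_rx', and
        'followup_within_30_days' keys; returns the measure rate, or None."""
        # Denominator: inclusion/exclusion criteria are fixed by the specification,
        # so Texas, California, and Minnesota all apply the same rules.
        denominator = [c for c in children
                       if 6 <= c["age_years"] <= 12 and c["new_adhd_rx"]]
        if not denominator:
            return None  # no eligible children; the rate is undefined
        # Numerator: eligible children who met the specified follow-up criterion.
        numerator = [c for c in denominator if c["followup_within_30_days"]]
        return len(numerator) / len(denominator)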
Denise Dougherty: Yes, we did go through the National Quality Measures Clearinghouse, and it is tagged; you can pull the things that are tagged with all the pediatric age groups, low birth weight included, and external oversight by Medicaid. So we did, with the help of Arielle Mishkin, who was with us for the summer, go through a long printout of the 189 measures. A lot of them are Outcome-Based Quality Improvement (OBQI) measures. We found that it was mostly overlap, and for some of them, we were not clear they actually were relevant to children or were used for children by CMS, and there were a couple of new ones. So we did go through all of those.
Rita Mangione-Smith: Glenn has a question.
Glenn Flores: We briefly went over this online, I think, but I'm still not entirely clear about one issue. I note that we set up this protocol of looking at current extant Medicaid measures, and I get concerned because I'm not sure everybody around the table would agree that the current measures, even if you pooled all the States' measures, are the ideal measures, or useful, practical, or feasible ones. And so I just get worried, and I know you have addressed it very slightly, but maybe we could talk about it a little bit more at some point: we need to have some room for creativity, both about the current measures and about how we can make them better quality improvement measures.
But also again, I think you have heard already from a lot of people by E-mail that there are a lot of measures that are not there, and they are feasible and they are important. And even if you can only come up with 50 measures, I bet you a lot of the ones that are already out there would be dropped in favor of new and what we feel are more relevant and useful measures. And I just hope conceptually at some point we can talk about that a little bit more.
Rita Mangione-Smith: So you are addressing a frustration that I think everybody in the group shares. When I first agreed to co-chair this group, one of the first things Denise said to me was, "I need to warn you." She said, "You are only going to be able to really think about and talk about measures that are already in use." And I kind of went, "Oh really?" Because I'm a measure developer, right? I like to think about where the gaps are and how we fill them in and make good, rigorous new measures. So it is a frustration I think we are all going to be dealing with.
I think your point is well taken: is there a way to think about how these measures get specified or implemented that could put in some of that creativity and make them better measures than they currently are? Because I think a lot of them are used in many different ways by many different States; there is not a lot of consistency. And that is where we will really be looking to CMS to say, "Okay, here is the core set. Now let's make sure that these measures get specified and implemented in a way where we can get the most meaningful information out of them possible." But I share your frustration. It is part of the reason we decided to make that right-hand list: there are feasible measures out there that have been developed but are not being used by States, so they are kind of out of our domain. That is a frustration, unfortunately, that I think we are kind of stuck with.
Glenn Flores: Just so we all understand, too, is the constraint because of the way the CHIPRA language is written, or because of the direction that AHRQ would like us to take? I'm a little fuzzy about why, in essence, we are recycling measures that are out there, hopefully modifying them. I guess there is still a lot of discomfort around the room because, if somebody said design the best set of quality measures, the first step would not necessarily be to ask what is out there that is not working, or might be working, or we do not know, but rather to ask what a group of experts from around the country might suggest would be useful and helpful.
Rita Mangione-Smith: Yeah. One of the things that got raised was the limits on State resources right now and their ability to build measurement systems that are not already in place. That is one of the issues we are grappling with: we have to come up with a core set that is implementable. So part of the rationale is that if States, or at least some States, are already using these measures, then that to us is some indication of feasibility.
Now, some of them, if you look at the charts we sent, are not being done by many States. The question is, are they not being done by many because they are the stretch measures, the better measures, the harder ones to do? Hopefully, if they become part of our core set, they will be done by more people or more States.
I know that is not a satisfying answer. I would love for us to be able to put the brain bank around this table and come up with the elite best set of measures we could think of, but that, unfortunately, is not what we have been charged with doing. I do not want to deflate everybody, but I think to a certain degree we have to keep that in mind.
Carolyn Clancy: I guess I want to just add two things. One is, thanks to the folks who acknowledged that this has been a long time coming. The legislation also includes the Institute of Medicine report, which I inadvertently failed to mention, and which I think will give us a sense of this best and brightest: if you were writing the script brand new today, what are the kinds of things, and where should we be measuring? So this is part of a journey.
I will say though that the progress we have had to date in quality is a glass half full and you can either look at that optimistically or from a different perspective, but not letting the perfect be the enemy of the good has been a kind of consistent theme that I think Lisa and others would recognize. But the implementable now is a very, very strong gravitational pull which is why I love this phrase about aspirational and grounded. What grounds me is actually hearing about which types of people States are laying off. So I do not want to make that too much of a pull and I do not want an aspirational thought, insight, experience to be left in this room or left in your brains and not shared with us, but the specific task right now is extremely pragmatic.
Jeffrey Schiff: I just wanted to add that I sometimes use a term at work, "radical incrementalism," which is I guess the way I look at this. We have to start on this road together and, Glenn, you are absolutely right, we are not going to finish. But if we all start together and we never go back, we will have made a big step, because we will not be going backwards or just kind of moving around.
Rita Mangione-Smith: Just one quick addition before we take the other questions that are out there. We do absolutely want people—we know we do not have the full universe of even what has been used, so please be sure during the course of this meeting to help us fill in that left-hand column of what is in use that we do not have up there because we know there are things out there that were not on that original list. So Lisa is next.
Lisa Simpson: Okay. It is sort of a three-part comment, and the parts are related very much to this discussion we are having right now. As I think back to really two of the defining features of Title IV of CHIPRA, the first one, in relation to quality, is that it extends the quality focus from CHIP to Medicaid, and that is really, really important. The second one is what Marina addressed earlier, and that is accountability, and the only way we get to accountability is comparability. We have to have comparable measures across all States, and since we are all going in that direction and that is the goal, I think we have to remind ourselves that many of the measures considered in use in these tables, measures that carry the same name, are being measured in very different ways by States. And States are going to have to invest in changing their infrastructure just to continue measuring with a perhaps bad measure, one that was the best available 10 years ago or the best available 5 years ago. Do we want to sort of codify that, or say there is actually a better measure out there? If you are going to invest in data infrastructure to reprogram, maybe we need to go in that direction. So I think it is important to remember that just because you picked one of these measures, it does not mean States can report it right now, because they may be measuring it very differently, so I think we have to keep that in mind.
What that leads me to next is that as we make these decisions and have these discussions around the table, there are obviously a dozen different ways of specifying any one of these labels, and we are voting on validity and reliability without knowing inclusion criteria, exclusion criteria, scoring, any of those things. So I guess I have a question for this committee: when we come to vote and move forward our proposed core set, are we going to be able to say it is the core set for ADHD measured this way, with these inclusion or exclusion criteria, so that we are clear on the specification of that measure? It is not just a measure concept, because what we are voting on right now are measure concepts.
Rita Mangione-Smith: Unfortunately, we have that information for the HEDIS [Healthcare Effectiveness Data and Information Set] measures, but for many of the measures, as I'm sure you encountered going through the list, we do not have that level of information. And I have pushed and asked a lot about that, because I know exactly what you are saying. How do I assess this? I do not even know what the denominator is; I do not know what the numerator is; I do not know, exactly as you said, the inclusions, the exclusions, all of that. Those are critical pieces of information that you need to honestly assess validity and feasibility, especially feasibility, I think, but my understanding is we have what we have.
Lisa Simpson: So then the next question is, the measure set that is published by the Secretary in January, it will include those specifications?
Denise Dougherty: Well, if we have them. One of the papers you will hear about tomorrow that we are doing under contract is an environmental scan, starting with State Web sites but also informational interviews and things, and we have asked that environmental scan to include finding out as much as they can about the measure specs and the populations used and whether they can be used for racial and ethnic disparity evaluation and things like that. That is just getting started but you will be hearing about that. So I think this goes on the list of what we would like to have by September to make a final recommendation, and tomorrow, you will hear about some of the work that is underway to get that kind of information. We just could not get it all done in time for this meeting.
Carolyn Clancy: I wanted to inject one other reality note here. Boy, I should just have a big pitcher of ice water right here. I have never seen a measure that did not have a constituency, okay? And so I hope that the State people will chime in on some of these, because we folks who think about science and all this kind of forget our track record in letting measures go: HEDIS, as far as I know, across all populations, has retired exactly one, because performance was so routinely excellent. Actually, I think we are still measuring it; it just does not count in how the plans are scored. No, I'm serious. I quote it all the time: beta blockers after MI [myocardial infarction], 25 years. Hey, we nailed it, really good stuff.
We may see issues that States think are very important, and I think that is going to be a very interesting qualitative judgment. Now, again, in the spirit of advising the Secretary, I think it might be helpful to say, "Gee, 34 States are still counting this," and then give the sense of the committee, or the scientific input from the committee, that this may not be the most helpful use of resources or is not necessarily future-oriented, so that she can make the best possible call.

