Evidence Points To ‘Gaming’ At Hospitals Subject To National Health Service Cleanliness Inspections
- Veronica Toffolutti is a postdoctoral researcher in health economics at the University of Oxford, in the United Kingdom.
- Martin McKee is a professor of European public health at the London School of Hygiene and Tropical Medicine, in the United Kingdom.
- David Stuckler is a professor of political economy and sociology at the University of Oxford.
Abstract
Inspections are a key way to monitor and ensure quality of care and maintain high standards in the National Health Service (NHS) in England. Yet there is a perception that inspections can be gamed. This can happen, for example, when staff members know that an inspection will soon take place. Using data for 205 NHS hospitals for the period 2011–14, we tested whether patients’ perceptions of cleanliness increased during periods when inspections occurred. Our results show that during the period within two months of an inspection, there was a significant elevation (2.5–11.0 percentage points) in the share of patients who reported “excellent” cleanliness. This association was consistent even after adjustment for secular time trends. The association was concentrated in hospitals that outsourced cleaning services and was not detected in those that used NHS cleaning services.
A prerequisite for a competitive market in health care, such as that established by the English Health and Social Care Act of 2012, is the existence of valid information on the performance of providers. This is necessary for informed purchasing of services. Yet as has often been noted, this can be difficult because, other than for certain easily standardized services, many aspects of health care are difficult to specify, 1 and there are strong incentives for opportunistic behavior, or gaming. 2 This can take many forms, such as changing behavior (for example, by avoiding complex cases) or changing how things are recorded (for example, by adding diagnostic codes to make patients appear more severely ill than they are). 3
As noted on several occasions by the UK House of Commons Public Accounts Committee, 4 one area of concern relates to external inspections of providers—such as those undertaken by the Care Quality Commission, one of a number of regulators in the National Health Service (NHS) in England. These concerns are echoed in the field of education, which has also seen a marked increase in inspections and where there have been many accounts of opportunistic behavior, 5 such as schools being warned about “unplanned” inspections or the temporary exclusion of disruptive students or those of low ability from testing 6–8 —or even changing the food provided in school cafeterias with the dubious intention of boosting students’ performance 9 (with questionable impacts on their health). 10,11
Hospital cleanliness has been high on the agenda of successive governments in the United Kingdom, reflecting a combination of appropriate concern about hospital-acquired infection and the exploitation of data by some media outlets. 12 Even though the media coverage of hospital cleanliness problems has diminished in intensity, it has not stopped. 13–17
Consequently, the NHS’s ten-year plan, launched in 2000, 18 established a series of “nation-wide clean-up campaigns” to improve cleanliness in hospitals. These involved “unannounced” inspections (although staff members were always given forty-eight hours’ notice), conducted over a period of up to one month by teams that initially comprised hospital staff members and patients, chosen at the level of the trust (trusts are English public organizations that operate one or more health care providers, including hospitals). However, a lack of patient volunteers meant that the teams subsequently consisted mostly of staff members.
From the outset, there has been concern about the potential for gaming of cleanliness inspections. It is widely believed that since staff members know when each inspection will happen, they are incentivized to make a special effort in the period leading up to it and then relax their standards after the inspection. This could be especially prominent in services that are outsourced to private contractors, given the risk of failing to obtain contract renewal should their performance receive poor scores in NHS inspections.
The true extent and consequences of gaming in the NHS are poorly understood, but there is enough evidence to raise concerns. Russell Mannion and Jeffrey Braithwaite found twenty distinct forms of dysfunctional responses to the NHS performance management regime. 19 Gwyn Bevan and Christopher Hood give examples of poor performance in areas that are not measured, hitting the target but missing the point, and ambiguities or fabrication of data. 20 Another review highlighted various abuses of health targets, including the creation of target-free zones, either physically (for example, placing patients awaiting hospital admission in temporary facilities in the hospital’s parking lot) or administratively (for example, establishing informal waiting lists to get on official waiting lists), and exploiting the opportunity to remove patients from waiting lists, if they declined an offer of admission, by making offers during holiday periods. 21 In addition, two studies found that financial incentives to physicians increased the likelihood that they would manipulate lists of patients by excluding those whose presence impeded their achievement of targets. 22,23
In these circumstances, it seems plausible that hospitals have incentives to game cleanliness inspections. Information we obtained from two acute trusts under freedom-of-information legislation revealed that they actually had between two and five months’ advance notice of such inspections.
In what we believe is the first study of its kind, we looked for evidence of possible gaming effects by taking advantage of a unique source of data that links patients’ perceptions of cleanliness with hospital inspection dates in the period 2011–14. Specifically, we tested whether patients gave hospitals’ cleanliness better ratings in the months leading up to an inspection than at other times, which would be consistent with the hypothesis that gaming does take place.
Study Data And Methods
We linked data on patients’ perceptions of cleanliness with dates of cleaning inspections for 205 English hospitals. All analyses were conducted at the hospital level. Patients’ assessments of hospital cleanliness were obtained from the Picker Institute NHS Patient Survey Programme. 24 Between June and August each year, each trust sends a questionnaire to 850 patients who have spent at least one night in a hospital operated by the trust. They are asked to report on their experiences at any time in the year, although in practice 93 percent of the reports describe experiences in this three-month period. All of the sampled patients are asked, “In your opinion, how clean was the hospital room or ward that you were in?” The possible answers are “very clean (excellent),” “fairly clean,” “not very clean,” and “not clean at all.”
We recoded the data by hospital and matched this information with the month in which the hospital had a cleanliness inspection, using data obtained from Patient Environment Action Teams for 2011–12 25 and Patient-Led Assessments of the Care Environment for 2013–14 26 (the name of the data source changed, but the data collection practices did not). We aggregated these data to determine the median percentage of patients rating cleanliness as “excellent” for each hospital by month and year. Additional data on hospital size and services provided were taken from the Estates Return Information Collection (a mandatory information collection from all NHS trusts) for the period 2011–14. 27
We matched data on the timing of cleanliness inspections and from the NHS Patient Surveys by calendar year. The data on hospital size were reported by fiscal year; we matched these data to calendar years. This was unlikely to confound the analysis since there is little temporal variation in numbers of hospital beds.
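The linkage just described can be sketched in a few lines of pandas. This is an illustrative reconstruction only: the file names, column labels, and coding of the “excellent” response below are placeholders rather than the variables used in the study, and the original analysis was performed in Stata.

```python
import pandas as pd

# Placeholder file and column names; the real extracts come from the Picker Institute
# survey, PEAT/PLACE inspection records, and the Estates Return Information Collection.
survey = pd.read_csv("picker_inpatient_survey.csv")      # one row per respondent
inspections = pd.read_csv("peat_place_inspections.csv")  # one row per hospital-year

# Flag respondents who rated the ward "very clean (excellent)" and compute the share of
# such responses for each hospital-month (the paper reports the median percentage).
survey["excellent"] = (survey["cleanliness_rating"] == "very clean (excellent)").astype(int)
monthly = (
    survey.groupby(["hospital_id", "year", "month"])["excellent"]
    .mean()
    .mul(100)
    .rename("pct_excellent")
    .reset_index()
)

# Attach the month in which each hospital was inspected, matched on hospital and
# calendar year as described in the text.
panel = monthly.merge(
    inspections[["hospital_id", "year", "inspection_month"]],
    on=["hospital_id", "year"],
    how="left",
)
panel["inspection_period"] = (panel["month"] == panel["inspection_month"]).astype(int)
panel["months_to_inspection"] = panel["month"] - panel["inspection_month"]
```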
Our initial sample included 492 English hospitals. Seventeen (3.46 percent) were excluded because they had no inpatient services. Another 270 (54.9 percent) were excluded because patients had not been surveyed. Thus, the final sample consisted of 205 English hospitals. We observed 145 hospitals each year, on average, for 6–7 months, and we had complete records for 907 hospital-months. Of these hospitals, 125 operated in-house NHS cleaning services, 76 hospitals contracted with private providers of cleaning services, and 4 used both NHS and private providers (these hospitals integrated outsourcing into a mixed public-private partnership). This information is displayed in a flow chart in online Appendix 1, Exhibit A1. 28 Exhibit 1 provides further information about the 205 hospitals.
Exhibit 1 Characteristics of the 205 hospitals in the study sample

| Characteristic | Median or mean | SD | Minimum | Maximum |
| --- | --- | --- | --- | --- |
| Median percent of patients rating cleanliness “excellent” | 72.1 | 11.4 | 25 | 100 |
| Number of beds | 637 | 493 | 5 | 2,257 |
| Average length-of-stay (days) | 6.07 | 1.56 | 2.4 | 14.2 |
| Multiservice hospitals a | 0.08 | 0.27 | 0 | 1 |
| Specialist hospitals a | 0.20 | 0.40 | 0 | 1 |
| Other hospitals | 0.72 | 0.45 | 0 | 1 |
| North of England a | 0.44 | 0.50 | 0 | 1 |
| Central England a | 0.27 | 0.44 | 0 | 1 |
| London a | 0.11 | 0.32 | 0 | 1 |
| South of England a | 0.18 | 0.39 | 0 | 1 |
| Number of hospitals observed for each month of inspection | 145 | 49.9 | 1 | 194 |
| Number of patients without missing data on hospital cleanliness survey per month | 205 | 150 | 1 | 552 |
Statistical Models
To investigate the association between month of inspection and perceived cleanliness, we used a regression discontinuity design. 29 (For further details, see Appendix 1, Exhibit A2.) 28
As shown in Appendix 1, Exhibit A3, 28 until 2012, the assessments tended to be concentrated between January and March, whereas after 2012 they tended to occur in the first six months of the year. The main coefficient of interest was β, which estimated the average change in the median perceived cleanliness of hospitals during inspection months. All models were estimated using Stata, version 13. Robust standard errors were clustered by hospital to reflect the nonindependence of sampling.
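The authors’ specification is detailed in Appendix 1, Exhibit A2, and was estimated in Stata 13. As a rough, hedged analogue only, the sketch below fits an ordinary least squares model with an inspection-month indicator, the running variable “months to inspection,” year effects, and standard errors clustered by hospital; the variable names follow the hypothetical panel constructed in the earlier sketch.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical hospital-month panel built as in the earlier sketch.
panel = pd.read_csv("hospital_month_panel.csv")
panel = panel.dropna(subset=["pct_excellent", "months_to_inspection"])

# beta on `inspection_period` approximates the average change in the share of
# "excellent" ratings during inspection months; errors clustered by hospital.
model = smf.ols(
    "pct_excellent ~ inspection_period + months_to_inspection + C(year)",
    data=panel,
).fit(cov_type="cluster", cov_kwds={"groups": panel["hospital_id"]})

print(model.params["inspection_period"])
print(model.conf_int().loc["inspection_period"])
```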
Limitations
As with all statistical modeling studies, our analysis had several limitations. First, we did not have the exact date when a patient was discharged, only the month. Thus, when we merged information at the hospital level, we could not investigate a possible gaming effect within a given month. This imprecision was likely to have produced conservative estimates of the magnitude of potential gaming behavior.
Second, our results suggest only modest effect sizes. However, even a modest increment in perceived cleanliness is sufficient for hospitals to avoid threats of an adverse assessment and the consequences that flow from it.
Third, a comprehensive longitudinal data set that tracks patients’ perceptions of cleanliness independently across all of the sites in the United Kingdom does not exist. Thus, in this initial assessment—to our knowledge, the first of its kind in the NHS—we took advantage of a large pooled data set to determine whether cleanliness increased in the months just before and during inspections and then reverted to its historical level after inspections. A limitation to this method is that it cannot identify individual hospitals that are gaming. However, it does point to characteristics, such as outsourcing cleaning services, that may render a hospital more likely to game.
Fourth, neither the month of inspection nor the number of patients responding to the questionnaire was uniformly distributed across hospitals. However, we took advantage of the available data to assess gaming effects.
Study Results
Association Of Inspection Months With Cleanliness
In the months leading up to an inspection, levels of cleanliness appeared to rise, followed by a drop after the inspection period ( Exhibit 2 ). When we compared inspection months with all other months, we found that on average, patients’ reports of excellent cleanliness were about 10 percentage points higher (81.5 percent, versus 71.9 percent in all other months; t-test: −3.73).
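The unadjusted comparison behind these figures can be reproduced, under the same placeholder names used in the earlier sketches, with a simple two-sample test; this is an illustration, not the authors’ exact calculation.

```python
import pandas as pd
from scipy import stats

panel = pd.read_csv("hospital_month_panel.csv")  # hypothetical panel from the earlier sketch

inspect = panel.loc[panel["inspection_period"] == 1, "pct_excellent"]
other = panel.loc[panel["inspection_period"] == 0, "pct_excellent"]

# Compare mean shares of "excellent" ratings in inspection months with all other months
# (the published comparison is roughly 81.5 percent versus 71.9 percent).
t_stat, p_value = stats.ttest_ind(other, inspect, equal_var=False)
print(inspect.mean() - other.mean(), t_stat, p_value)
```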
For example, at the Royal National Hospital for Rheumatic Diseases, there were inspections in June 2013 and May 2014. In the months before each inspection, patients’ perceptions of cleanliness were relatively constant ( Exhibit 3 ). Those perceptions increased in the inspection months and returned to their previous levels shortly afterward.
Our data were corroborated by other evidence. Responses to our freedom-of-information requests to hospitals about communication with cleaning staff in the months when inspections took place showed that those hospitals performed a series of detailed pre-inspection checks a few days before each inspection, which uncovered long-standing problems that were then addressed. (For an example, see Appendix 1, Exhibit A4.) 28
In the month when an inspection took place, the share of patients who rated their hospital’s cleanliness as “excellent” jumped by 7.78 percentage points (95% confidence interval: 2.75, 12.8; see Appendix 1, Exhibit A5). 28 (For further corroboration of our results, see the estimation coefficients of a distributed lag model in Appendix 1, Exhibit A6.) 28
Outsourced Or In-House Cleaning Services
We used a difference-in-differences model to test whether hospitals that privately contracted for cleaning services were more likely to exhibit gaming behavior, compared to those that provided cleaning in-house. 30 As shown in Appendix 1, Exhibit A5, higher cleanliness scores in inspection months were concentrated in hospitals that outsourced their cleaning services (11.0 percentage points; 95% CI: 5.15, 19.6), whereas there was no statistically detectable association between cleanliness scores and inspection months in hospitals that used in-house NHS cleaning services (2.68 percentage points; 95% CI: −3.52, 8.88). (For further corroboration of our results, see the estimation coefficients of a distributed lag model in Appendix 1, Exhibit A7.) 28 This finding is in line with a recent study that found a greater incidence of infection and evidence of poorer cleaning where cleaning was outsourced. 31
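Under the same naming assumptions as the earlier sketches, the difference-in-differences comparison can be approximated by interacting the inspection-month indicator with an outsourcing dummy; `outsourced` is a hypothetical variable equal to 1 for hospitals using private cleaning contractors.

```python
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("hospital_month_panel.csv")  # hypothetical panel from the earlier sketches

# The interaction term captures the additional inspection-month change for hospitals
# that outsource cleaning, relative to hospitals with in-house NHS cleaning.
did = smf.ols(
    "pct_excellent ~ inspection_period * outsourced + months_to_inspection + C(year)",
    data=panel,
).fit(cov_type="cluster", cov_kwds={"groups": panel["hospital_id"]})

print(did.params[["inspection_period", "inspection_period:outsourced"]])
```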
Within-Group Estimation
To test whether our results were driven by potential unobserved heterogeneity, we used a within-group estimation. Our results clearly show that switching from a non-inspection month to an inspection month led to an increase in reported cleanliness by about 2.54 percentage points (95% CI: 0.02, 5.06; see Appendix 1, Exhibit A5). 28
To further probe the temporal pattern of our results, we included a cubic term for “time to inspection.” The results were consistent with our main study findings (β: 2.86 percentage points; 95% CI: 0.06, 5.67).
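A hedged sketch of the within-group (hospital fixed effects) estimation with a cubic in time to inspection, again using the hypothetical variable names from the earlier sketches, might look like this:

```python
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("hospital_month_panel.csv")  # hypothetical panel from the earlier sketches

# Hospital dummies absorb time-invariant differences between hospitals; the cubic in
# months_to_inspection allows a smooth trend around the inspection date.
fe = smf.ols(
    "pct_excellent ~ inspection_period"
    " + months_to_inspection + I(months_to_inspection**2) + I(months_to_inspection**3)"
    " + C(hospital_id) + C(year)",
    data=panel,
).fit(cov_type="cluster", cov_kwds={"groups": panel["hospital_id"]})

print(fe.params["inspection_period"])
```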
Robustness Checks
We performed a series of robustness checks in order to better understand the effects of various factors on our results—that is, whether something other than an impending inspection could be driving the changes we observed. First, we adjusted for potential confounding factors, including hospital size, hospital complexity (that is, whether the hospital type was specialist, multiservice, or other), and time trends. (For the results of these checks, see Appendix 1, Exhibit A8.) 28
To identify whether these patterns were driven by a few outliers that exhibited extreme gaming, we removed 5 percent of our distribution (2.5 percent each from the bottom and the top of the distribution). This changed none of our results (see Appendix 1, Exhibit A9). 28
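Trimming the tails of the outcome distribution before re-estimation amounts to a one-line filter; the file and column names below are the same placeholders as in the earlier sketches.

```python
import pandas as pd

panel = pd.read_csv("hospital_month_panel.csv")  # hypothetical panel from the earlier sketches

# Drop the extreme 5 percent of cleanliness scores (2.5 percent from each tail).
lo, hi = panel["pct_excellent"].quantile([0.025, 0.975])
trimmed = panel[panel["pct_excellent"].between(lo, hi)]
```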
We further examined our results to see whether they were confounded by some areas’ having low numbers of respondents. We restricted our sample to areas with at least three hospitals that each had at least seventeen respondents. This removed 10 percent of the lower end of the distribution in terms of numbers of respondent patients for each month (for the results, see Appendix 1, Exhibit A10). 28 The results were consistent with our main findings, except that the within-group results were no longer significant.
To ensure that our results were not driven by structural differences between acute and specialist hospitals that may have affected the propensity to fall into the “treatment” group, we applied two different robustness tests (“treatment” here is defined as hospitals that were observed during inspection months or shortly before; hospitals observed at other times were placed in the “control” group—for the results, see Appendix 1, Exhibit A11.) 28 First, we used propensity score matching to better match treatment hospitals with control hospitals on their size and complexity. Specifically, we stratified hospitals by type (specialist, multiservice, or acute), and within each category, we matched treated hospitals with at least one control in terms of hospital size (we allowed control hospitals to be used more than once as a match). To ensure “goodness of fit,” we permitted matches for only those pairs of hospitals whose propensity scores differed by 0.01 or less. Second, we restricted our sample to specialist hospitals. In both cases, none of our results changed qualitatively.
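One way to implement the matching step, under the same placeholder names (`treated`, `hospital_type`, and `beds` are assumptions, not the authors’ variables), is a within-stratum nearest-neighbor match on the propensity score, with replacement and a 0.01 caliper:

```python
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("hospital_month_panel.csv")  # hypothetical panel from the earlier sketches

matched_pairs = []
# Stratify by hospital type, estimate a propensity score from hospital size, and match
# each treated hospital to its nearest control (with replacement), keeping only pairs
# whose scores differ by 0.01 or less.
for hosp_type, stratum in panel.groupby("hospital_type"):
    if stratum["treated"].nunique() < 2:
        continue  # need both treated and control hospitals in the stratum
    ps_model = smf.logit("treated ~ beds", data=stratum).fit(disp=0)
    stratum = stratum.assign(pscore=ps_model.predict(stratum))
    controls = stratum[stratum["treated"] == 0]
    for idx, row in stratum[stratum["treated"] == 1].iterrows():
        gaps = (controls["pscore"] - row["pscore"]).abs()
        if gaps.min() <= 0.01:  # caliper of 0.01
            matched_pairs.append((idx, gaps.idxmin()))
```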
Finally, as a so-called falsification test, we analyzed the pattern of patients’ ratings of hospitals’ food and hydration quality instead of cleanliness. Conceptually, cleaning and providing food are different services, and when the services are outsourced, the companies providing them also differ. This allowed us to test whether the effect is specific to cleaning, rather than reflecting a general disposition to outsource services. (For the results, see Appendix 1, Exhibit A12.) 28 It is worth noting that the sample size dropped because the assessment of food quality is available only at the trust level. However, this was a good test on conceptual and empirical grounds. Empirically, we observed no significant correlation at the trust level between scores for cleanliness and for food and hydration quality (ρ = 0.11).
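The trust-level correlation reported here is a single statistic; with a hypothetical trust-level extract (file and column names are placeholders) it would amount to:

```python
import pandas as pd

trust = pd.read_csv("trust_level_scores.csv")  # hypothetical trust-level extract

# Correlation between cleanliness and food/hydration scores at the trust level
# (the text reports rho = 0.11).
print(trust["cleanliness_score"].corr(trust["food_hydration_score"]))
```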
Discussion
NHS inspections are a core element of the performance management regime designed to ensure that hospitals maintain high standards of quality. This is especially important when services, including cleaning, are outsourced to private contractors to save money. Yet there is a perception that NHS inspections can be gamed. This can happen, for example, when staff members know that an inspection will soon take place.
By taking advantage of a unique data source, we were able to compare patients’ perceptions of cleanliness around the time of inspections. We found evidence consistent with gaming: In inspection months and for a short period before them, cleanliness appeared to improve, declining in subsequent months. This pattern was most prominent for hospitals that outsourced cleaning services to private contractors. This finding appears particularly relevant in light of a recent study that found that sites that outsourced cleaning services had significantly higher rates of methicillin-resistant Staphylococcus aureus (MRSA). 31
Our findings suggest that gaming may increase a hospital’s cleanliness score by 2.5–11.0 percentage points. This would often be sufficient to avoid the severe consequences of an adverse inspection report, which range from warnings to enforcement action by the Care Quality Commission or even restrictions on activity, and which have implications for the tenure of senior executives.
Our findings have obvious implications for policy, given the importance of hospital cleanliness in the fight against antimicrobial resistance. However, they also have implications for systems of regulation and inspection. One obvious question is whether inspections should be announced or unannounced. For example, our findings suggest that hospitals invested considerable resources in preparing for an inspection. Arguably, they should be investing those resources at all times. A recent systematic review asking whether announced and unannounced inspections led to different approaches to risk assessment found only three studies. 32 The authors concluded that unannounced inspections reduce the regulatory burden compared to announced ones, but there was no significant difference between the two in terms of outcomes.
Another question is the extent to which a system based on inspections is the best way of ensuring quality. A history of regulation in the English NHS described a series of shifts from trust-based professional regulation to detailed external inspection, followed by some rolling back. 33 Changes were often driven by events that revealed malfunctions in the system in place at the time, instead of making reforms based on evidence of the clear superiority of one among a series of alternative approaches.
While the characteristics of an ideal system, combining high standards with transparency, are easy to specify, they are more difficult to achieve in practice. However, one lesson is clear: In any regulatory system, it should be assumed that gaming will take place, and these systems should be designed in ways that minimize it.
ACKNOWLEDGMENTS
David Stuckler and Veronica Toffolutti are funded by the European Research Council (Grant No. 313590-HRES). Stuckler is also funded by the Wellcome Trust.
NOTES
- 1. Uncertainty and the welfare economics of medical care. Am Econ Rev. 1963;53(5):941–73.
- 2. Markets and hierarchies: some elementary considerations. Am Econ Rev. 1973;63(2):316–25.
- 3. DRG creep: a new hospital-acquired disease. N Engl J Med. 1981;304(26):1602–4.
- 4. House of Commons Committee of Public Accounts. Care Quality Commission: twelfth report of session 2015–16 [Internet]. London: Stationery Office; 2015 Dec 11 [cited 2016 Dec 28]. Available from: www.publications.parliament.uk/pa/cm201516/cmselect/cmpubacc/501/501.pdf
- 5. School inspections: can we trust Ofsted reports? CentrePiece [serial on the Internet]. 2011–12 winter [cited 2016 Dec 28]. Available from: http://cep.lse.ac.uk/pubs/download/cp358.pdf
- 6. Accountability, ability, and disability: gaming the system? In: Gronberg TF, Jansen DW, editors. Improving school accountability. Bingley (UK): Emerald Group Publishing Ltd.; 2006. p. 35–49.
- 7. Testing, crime, and punishment. J Public Econ. 2006;90(4–5):837–51.
- 8. Accountability, incentives, and behavior: the impact of high-stakes testing in the Chicago Public Schools. J Public Econ. 2005;89(5–6):761–96.
- 9. Food for thought: the effects of school accountability plans on school nutrition. J Public Econ. 2005;89(2–3):381–94.
- 10. School accountability laws and the consumption of psychostimulants. J Health Econ. 2011;30(2):355–72.
- 11. Adequate (or adipose?) yearly progress: assessing the effect of “No Child Left Behind” on children’s obesity [Internet]. Cambridge (MA): National Bureau of Economic Research; 2011 Mar [cited 2016 Dec 28]. (NBER Working Paper No. 16873). Available from: http://www.nber.org/papers/w16873.pdf
- 12. The “hospital superbug”: social representations of MRSA. Soc Sci Med. 2006;63(8):2141–52.
- 13. One in 16 pick up a bug in FILTHY hospitals. Daily Mail [serial on the Internet]. 2014 Apr 16 [cited 2016 Dec 28]. Available from: http://www.dailymail.co.uk/health/article-2606425/NICE-blames-staff-hygiene-dirty-equipment-thousands-deaths-One-16-pick-bug-FILTHY-hospitals.html
- 14. Daily Mail reporter. Hospital bugs “spread by use of detergent wet wipes to clean wards” according to first study of its kind. Daily Mail [serial on the Internet]. 2015 Jun 8 [cited 2016 Dec 28]. Available from: http://www.dailymail.co.uk/news/article-3116162/Hospital-bugs-spread-use-wet-wipes-clean-wards.html
- 15. Hospital patient so shocked at dirty ward she climbed out of bed to clean it herself. Daily Mail [serial on the Internet]. 2009 Jul 2 [cited 2016 Dec 28]. Available from: http://www.dailymail.co.uk/news/article-1196997/Hospital-patient-shocked-dirty-ward-climbed-bed-clean-herself.html
- 16. Press Association. Vulnerable patients “at high risk” of food poisoning. Daily Mail [serial on the Internet]. 2016 Nov 2 [cited 2016 Dec 28]. Available from: http://www.dailymail.co.uk/wires/pa/article-3895752/Vulnerable-patients-high-risk-food-poisoning.html
- 17. Dirty hospital equipment causes 11 people to become infected with bacteria that could lead to PNEUMONIA. Daily Mail [serial on the Internet]. 2015 Oct 8 [cited 2016 Dec 28]. Available from: http://www.dailymail.co.uk/health/article-3265031/Dirty-hospital-equipment-causes-11-people-infected-bacteria-lead-PNEUMONIA.html
- 18. Department of Health. The NHS plan: a plan for investment, a plan for reform [Internet]. Colegate: Stationery Office; 2000 Jul [cited 2016 Dec 28]. Available from: http://webarchive.nationalarchives.gov.uk/20130107105354/http://www.dh.gov.uk/prod_consum_dh/groups/dh_digitalassets/@dh/@en/@ps/documents/digitalasset/dh_118522.pdf
- 19. Unintended consequences of performance measurement in healthcare: 20 salutary lessons from the English National Health Service. Intern Med J. 2012;42(5):569–74.
- 20. What’s measured is what matters: targets and gaming in the English public health care system. Public Adm. 2006;84(3):517–38.
- 21. Wismar M, McKee M, Ernst K, Srivastava D, Busse R, editors. Health targets in Europe: learning from experience [Internet]. Copenhagen: European Observatory on Health Systems and Policies; 2008 [cited 2016 Dec 28]. Available from: http://www.euro.who.int/__data/assets/pdf_file/0008/98396/E91867.pdf
- 22. Doctor behaviour under a pay for performance contract: treating, cheating, and case finding? Econ J (Oxf). 2010;120(542):F129–56.
- 23. General practitioners’ reasons for removing patients from their lists: postal survey in England and Wales. BMJ. 2001;322(7295):1158–9.
- 24. Care Quality Commission, Picker Institute Europe. Acute Trusts: Adult Inpatients Survey. London: UK Data Service; 2010–14.
- 25. Health and Social Care Information Centre. Patient Environment Assessment Team (PEAT). London: Department of Health; 2010–14.
- 26. Health and Social Care Information Centre. Patient-Led Assessments of the Care Environment (PLACE), England. London: Department of Health; 2013–14.
- 27. Health and Social Care Information Centre. Estates Return Information Collection. London: Department of Health; 2010–14.
- 28. To access the Appendix, click on the Appendix link in the box to the right of the article online.
- 29. Identification and estimation of treatment effects with a regression-discontinuity design. Econometrica. 2001;69(1):201–9.
- 30. Minimum wages and employment: a case study of the fast food industry in New Jersey and Pennsylvania [Internet]. Cambridge (MA): National Bureau of Economic Research; 1993 Oct [cited 2016 Dec 28]. (NBER Working Paper No. 4509). Available from: http://www.nber.org/papers/w4509.pdf
- 31. Outsourcing cleaning services increases MRSA incidence: evidence from 126 English acute trusts. Soc Sci Med. 2016;174:64–69.
- 32. Unannounced, compared with announced inspections: a systematic review and exploratory study in nursing homes. Health Policy. 2013;111(3):311–9.
- 33. Changing paradigms of governance and regulation of quality of healthcare in England. Health Risk Soc. 2008;10(1):85–101.