To Subgroup or Not to Subgroup?

Far from being ‘stamp collecting’, as Ernest Rutherford is said to have claimed, classifying things is central to the scientific enterprise – imagine biology without the Linnaean taxonomy (multi-dimensional classification) of plants, animals and minerals (now plants, animals, fungi, protists, chromists, archaea and eubacteria kingdoms). Or medicine without its nosology. Classification has been the basis of all knowledge, and Rutherford was wrong – for example, astronomy is also built on a classification of stars and planets.

However, classification does not come free of problems. On the contrary, I call it the ‘central dilemma of epidemiology’. For a start, it is a human attempt to organise an underlying (latent) and often disorganised world. That is both its strength and its weakness. By organising the underlying complexity it allows abstractions to be made regarding the organising principles that underlie phenomena we observe about us. But the price we must pay is that we are often superimposing a classification over an underlying continuum. Thus, astronomical objects like Pluto can be ‘demoted’ and species change from one genus to another. Many health conditions do not fit neatly into one group or another, bearing features of both – think auto-immune disease and mental illness. Of course, any classification system is useful, insofar as it leads to new knowledge about underlying mechanisms and it is quite natural that the process is iterative, such that new classifications emerge – clades in biology rather than the original Linnaean family tree, for example.

But in the practice of epidemiology, groups and subgroups can be a problem, and not just because groups overlap or misclassification may occur. A problem also arises in the interpretation of observed differences between groups. On the one hand, we do not want to miss important subgroup differences in the effect of an exposure on an outcome. On the other hand, we also want to avoid spurious associations. There are many examples, especially in the context of treatment trials, of subgroup associations that were subsequently overturned.

The usual argument put forward to avoid spurious associations is that only subgroups specified in advance should be considered as a test of an hypothesis – all else is a fishing expedition, the results of which are to be down-weighted.

This is all very well but it just moves the problem from the analysis stage to the design stage. The corollaries are two-fold:

  1. Any subgroup must be selected on the basis of sound principles – there should be a theoretical model for an interaction between exposure and outcome. The statistical subgroup analysis is then designed to strengthen or weaken the credibility of the model. Note, the issue is the interaction between subgroup and outcome through the treatment effect. A direct effect on outcome is neither here nor there.
  2. Since precision is often low in a subgroup, and always lower than in the group as a whole, hypothesis tests are even less appropriate to subgroup effects than to the overall effect. Dichotomising the results into positive and null, and using this dichotomy to make a decision, is always stupid and is risible in a subgroup.
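The loss of precision in a subgroup can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes a two-arm comparison of means with equal arm sizes and a common standard deviation, and the sample sizes (1,000 per arm overall, 50 per arm in the subgroup) are invented for the example.

```python
import math

def ci_half_width(sd, n_per_arm, z=1.96):
    """Half-width of an approximate 95% confidence interval for a
    difference in means between two equal-sized arms (common SD assumed)."""
    standard_error = sd * math.sqrt(2.0 / n_per_arm)
    return z * standard_error

# Hypothetical trial: 1,000 per arm overall, 50 per arm in the subgroup.
overall = ci_half_width(sd=1.0, n_per_arm=1000)
subgroup = ci_half_width(sd=1.0, n_per_arm=50)

# The interval widens by the square root of the ratio of sample sizes.
ratio = subgroup / overall
print(f"overall ±{overall:.3f}, subgroup ±{subgroup:.3f}, ratio {ratio:.2f}")
```

Under these assumptions the subgroup interval is wider by the square root of 20, which is why an effect that is estimated reasonably well overall can be almost uninformative within a small subgroup.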

Some subgroups derive from an underlying, if latent, scale. Socio-economic groups, for example, or age. But others are irrevocably categorical. Gender, for example, or rural vs. urban residence. In the former situation – where the group is homologous (scalable) – a small subgroup is not a large problem, because the statistical model can look for a trend. The situation is more problematic when a small subgroup is not part of a homologous continuum. Any estimate from the small subgroup will be imprecise, and the smaller the subgroup, the greater the imprecision. Amalgamating it within a larger group makes sense on the basis that ‘it’s better to have a precise answer to a general problem than an imprecise answer to a specific one.’

But this logic breaks down if there is a sound theoretical reason to expect a different result in the small sub-group. Grouping trans people with male or female would be unsuitable for many purposes. In such a situation it is better to have an imprecise answer to a specific question.

Richard Lilford, ARC WM Director

Science Denial and the Importance of Engaging the Public with Science

A recent paper in JAMA, concerning science denial, tackles a problem of immense importance.[1] For us scientists, science denial negates our reason for being. Far more important though, is the effect on society. We need to think only of the vaccination fiasco. The JAMA paper used the difficulties that people with certain neurological conditions have with processing information as an analogy for the challenges that people with low scientific literacy have with interpreting complex graphs. Such difficulties leave room for false beliefs, including beliefs in conspiracy theories. While this analogy might shed light on neural mechanisms, there are far more important determinants of science denial in the population at large. One issue is the effect of education. Lack of educational attainment is consistently associated with science denial and the propensity to believe in conspiracy theories.[2]

Of course, this does not prove that improving science education would solve the problem. It may simply be the case that the cause of low educational achievement is also the cause of a predisposition to believe conspiracy theories. For example, low self-esteem or cognitive ability may be determinants of both low educational attainment and science denial. More likely, education plays a part, and both nature and nurture are involved. In that case, educational achievement conditional on early-life cognitive ability should correlate with resistance to conspiracy theories. We do not know whether this possibility has been examined.

Debunking misinformation with evidence or education is not enough. In responding to COVID-19, behavioural scientists were quick to point out that debunking could even lead to a backlash and increase the belief in misinformation. While the evidence on backlash is mixed, alternative approaches are still needed. One alternative is ‘pre-bunking’,[3] which is analogous to medical inoculation: people are exposed to a little bit of misinformation that activates their ability to critique it, but not so much misinformation as to be overwhelming. Web-based games like ‘Get Bad News’ apply this approach and are used by governments and schools to reduce people’s susceptibility to fake news. Reminding people before they engage with information to assess the accuracy of sources may also help.[4]

Yet, education, pre-bunking, and reminders are arguably ‘demand-side’ factors, which largely rely on the public selecting into engagement with science. These may be the very people least likely to denounce it. Given this, it is incumbent upon policymakers – and academics – to address the ‘supply-side’ factors, too. They must consider how to provide trustworthy, transparent, and accessible information, including to those with lower levels of education or cognitive ability. Sadly, this does not always happen; for example, little effort appears to have been directed towards testing some of the public health messaging about COVID-19 in the UK.[5] Confusing messaging can breed uncertainty, which is easily filled with simple but false information – including scientific information. Critiquing conspiracy theorists for their ‘bad science’ is unlikely to be persuasive. Instead, we advocate building trust in rigorous science.

Engaging the public with science is critically important; we can hardly think of a more important issue. Here at ARC West Midlands we take public engagement very seriously. We continuously seek opportunities to engage on science. In previous news blogs, we tested some of the government’s COVID-19 messaging ourselves,[6][7] and described our plans to use geospatially referenced maps to engage communities where COVID-19 infections are not under control.[8] We are engaging the public in numerous implementation science projects, including one based on mathematical modelling and another on the role of chance in decision-making. In all of these, development of the service, engagement with decision-makers, and with the public, go hand in hand.

Richard Lilford, ARC WM Director; Laura Kudrna, Research Fellow

References:

  1. Miller BL. Science Denial and COVID Conspiracy Theories: Potential Neurological Mechanisms and Possible Responses. JAMA. 2020.
  2. Van Prooijen J-W. Why Education Predicts Decreased Belief in Conspiracy Theories. Appl Cognit Psychol. 2016; 31(1).
  3. Van Bavel JJ, et al. Using social and behavioural science to support COVID-19 pandemic response. Nat Hum Behav. 2020; 4:460-71.
  4. Pennycook G, et al. Fighting COVID-19 misinformation on social media: Experimental evidence for a scalable accuracy-nudge intervention. Psychol Sci. 2020; 31(7):770-80.
  5. BBC News. Coronavirus: Minister defends ‘stay alert’ advice amid backlash. 10 May 2020.
  6. Kudrna L, Schmidtke KA. Changing the Message to Change the Response – Psychological Framing Effects During COVID-19. NIHR ARC West Midlands News Blog. 2020; 2(7): 7-9.
    See also our London School of Economics and Political Science blog.
  7. Schmidtke KA, Kudrna L. Speaking to Hearts Before Minds: Increasing Influenza Vaccine Uptake During COVID-19. NIHR ARC West Midlands News Blog. 2020; 2(10):9-11.
    See also our London School of Economics and Political Science blog.
  8. Lilford RJ, Watson S, Diggle P. The Land War in the Fight Against COVID-19. NIHR ARC West Midlands News Blog. 2020; 2(10):1-4.

Use of Causal Diagrams to Inform the Analysis of Observational Studies

Observational studies usually involve some sort of multi-variable analysis. To make sense of the association between an explanatory variable (E) and an outcome (O), it is necessary to control for confounders – age for example in clinical studies. A confounder (C) is a variable that is associated with both E and O. Indeed it is causal of E and O as shown by the direction of arrows in Figure 1.

Fig 1. Causal Diagram for a Confounder

A common error is to mistake a confounder for a mediator. If the variable lies on the causal pathway between E and O, then it is a Mediator – M in Figure 2.

Fig 2. Causal Diagram to Distinguish Between a Confounder (C) and a Mediator (M)

Failure to make this distinction, and hence adjusting for M, will reduce or remove the effect of E on the outcome. In a study of the effect of money spent on tobacco on lung cancer, it would be self-defeating to adjust for smoking! If we are interested in decomposing different causal pathways, then we should adapt the multivariable analysis to examine how much of the effect of E on O is explained by the putative mediator (M in Figure 2) – a structural equation model or ‘mediator’ analysis.
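A small simulation shows what happens when a mediator is adjusted for. This is a sketch under stated assumptions, not an analysis of real data: the variables are simulated, the relationships are linear, and the coefficients (0.8, 0.2 and 0.5) are arbitrary values chosen for illustration. The helper functions are hypothetical and use the closed-form least-squares solution for one and two predictors.

```python
import random

def cov(x, y):
    """Population covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def slope(y, x):
    """Coefficient on x from a simple regression of y on x."""
    return cov(x, y) / cov(x, x)

def adjusted_slope(y, x, z):
    """Coefficient on x from a regression of y on both x and z."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    return (cov(x, y) * szz - sxz * cov(z, y)) / (sxx * szz - sxz ** 2)

rng = random.Random(2020)  # fixed seed so the sketch is reproducible
n = 20_000

# Assumed causal structure: E -> M -> O, plus a direct path E -> O.
E = [rng.gauss(0, 1) for _ in range(n)]
M = [0.8 * e + rng.gauss(0, 1) for e in E]
O = [0.2 * e + 0.5 * m + rng.gauss(0, 1) for e, m in zip(E, M)]

total_effect = slope(O, E)               # should be near 0.2 + 0.8 * 0.5 = 0.6
direct_effect = adjusted_slope(O, E, M)  # should be near 0.2

print(f"unadjusted: {total_effect:.2f}, adjusted for M: {direct_effect:.2f}")
```

In this set-up the unadjusted coefficient recovers the total effect of E, while adjusting for M leaves only the direct path. Neither number is wrong; they answer different questions, which is why the causal model has to be settled before the adjustment set.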

There are some issues to consider:

  1. It may not be possible to say for certain whether a variable is a mediator or confounder and some variables may be both. Then try the analysis three ways: omit it, treat it as a confounder, or treat it as a mediator.
  2. It is hard to know which variables to include as confounders. A dataset was sent for analysis by 29 different teams of statisticians.[1] They came up with results that varied wildly, because they adjusted for different combinations of variables. The corollary is that choice of variables should not be left to statisticians – it turns on causal theory that distinguishes between variables that are likely to have arrows pointing from E to O via M, and those pointing from C to both E and O (Figure 2). Context matters!
  3. There may be an interaction between variables, such that the causal effect of one variable on E or O is amplified or attenuated in the presence of another. Four variables, each with four ‘levels’, yield 256 (4 × 4 × 4 × 4) possible combinations of levels. So, again, theory is needed to determine which variables to include in such interaction tests.

A variable may exist that is an independent cause of C or M (let’s call these C* and M*), as in Figure 3. There is no reason to adjust for these variables. Likewise, do not adjust for any variable that ‘precedes’ E, as also shown in Figure 3.

Fig 3. Variables That Cause Change in Other Variables

In this example, C* and M* are not causally linked to O, except through C and M respectively. But a situation may occur where such a link is possible. It is well known that maternal smoking is causally linked to both low birth-weight and to neonatal deaths, as per Figure 4. The theory is that smoking is toxic and leads to both a small baby and, via that pathway and other pathways, leads to neonatal death.

Fig 4. Causal Pathway for Smoking and Neonatal Deaths

If this analysis is conducted controlling for ‘small baby’, then smoking is associated with lower mortality – it appears protective. The obvious fault was to control for a variable on the causal pathway, as per Figure 2. But that would explain only why the association is reduced, not why it is reversed.

The explanation for the reversal lies in a putative third variable (perhaps a ‘genetic’ defect, G), which predisposes to both a small baby and neonatal death (Figure 5). Note that both E and G collide on M, and such a scenario leads to ‘collider bias’ – by controlling for one source of bias, the door is opened to another. It is well known that there may be unobserved (‘lurking’) confounders in any association. The same applies, of course, to a variable that might completely alter the meaning of an association once one has conditioned on another variable.

Fig 5. Collider Bias
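The reversal can be reproduced with a few lines of exact probability arithmetic. The sketch below is hypothetical throughout: the prevalence of G, the probabilities of a small baby and the mortality risks are invented numbers chosen only to show the mechanism, with E standing for maternal smoking, M for a small baby and G for the unobserved defect.

```python
from itertools import product

P_G = 0.02  # hypothetical prevalence of the unobserved defect G

def p_small(e, g):
    """P(small baby | smoking e, defect g); illustrative values only."""
    return {(0, 0): 0.05, (1, 0): 0.20, (0, 1): 0.80, (1, 1): 0.90}[(e, g)]

def p_death(e, m, g):
    """P(neonatal death); smoking and small size add a little risk, G a lot."""
    return 0.005 + 0.003 * e + 0.004 * m + 0.30 * g

def mortality(e, small=None):
    """P(death | smoking = e), optionally restricted to small (1) or
    normal-sized (0) babies, averaging over the unobserved G."""
    numerator = denominator = 0.0
    for g, m in product((0, 1), repeat=2):
        if small is not None and m != small:
            continue
        prob_g = P_G if g else 1 - P_G
        prob_m = p_small(e, g) if m else 1 - p_small(e, g)
        weight = prob_g * prob_m
        numerator += weight * p_death(e, m, g)
        denominator += weight
    return numerator / denominator

print("all babies:   non-smokers", round(mortality(0), 4), "smokers", round(mortality(1), 4))
print("small babies: non-smokers", round(mortality(0, small=1), 4), "smokers", round(mortality(1, small=1), 4))
```

With these made-up inputs smoking raises mortality across all babies, yet among small babies the smokers' infants fare better. The reason is that a small baby born to a non-smoker is more likely to be small because of G, which carries by far the largest risk.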

These analyses show that conducting a multivariable analysis is not, or rather should never be, an entirely data-driven / empirical exercise. Choices have to be made, such that the statistical model informs on, but does not determine, the causal model. For a brilliant example of extensive causal chains involving confounders, colliders and mediators, see an example from Andrew Forbes and colleagues.[2]

Note, we are not arguing against adjustment per se. It is an essential part of the analysis. We argue against adjusting without reference to a causal model.

Richard Lilford, ARC WM Director; Sam Watson, Senior Lecturer [With thanks to Peter Diggle (Lancaster University & Health Data Research UK) for comments.]


References:

  1. Silberzahn R, et al. Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results. Adv Method Pract Psychological Science. 2018; 1(3).
  2. Williamson EJ, et al. Introduction to causal diagrams for confounder selection. Respirology. 2014; 19(3): 303-11.

Speaking to Hearts Before Minds: Increasing Influenza Vaccine Uptake During COVID-19

In 2019, the UK health secretary Matt Hancock said that he is “open” to making vaccines compulsory, and Labour MP Paul Sweeney argued that failure to vaccinate children should be a “criminal offence”. But mandates are difficult to enforce, and punishments diminish public trust. In addition, people still opt out of mandatory policies, and effectiveness increases when people freely comply.[1] Instead of mandates, we advocate behavioural approaches that preserve individual freedom,[2] and agree with Professor Heidi Larson that additional emphasis should be placed on public perspectives when planning vaccine policies and programmes.[3]

Public health messaging about vaccines is particularly important in light of the COVID-19 pandemic. In April 2020, the United Kingdom’s ‘Vaccine Taskforce’ convened, and, in May 2020, the United States’ ‘Operation Warp Speed’ took off. This speed elicited optimism among some, but handed a megaphone to the anti-vaccination movement. Del Bigtree, founder of the Informed Consent Action Network, cautioned that, “You shouldn’t rush to create a product you can inject into perfectly healthy people without doing proper safety studies”. Here, identical factual information – a vaccine is being developed quickly – elicited reasoned responses that were both optimistic and pessimistic. However, intuitions come first and strategic reasoning comes second.[4] Where public health messages do not align with people’s automatic intuitions, factual and reasoned information may fall on deaf ears.

On September 21, we conducted an online experiment to determine if public health messages aligned with people’s political intuitions influenced their intentions to take up the influenza vaccine.[5] Influenza vaccinations have long been important, but are particularly important now in the context of COVID-19 because co-infection increases mortality rates.[6] We recruited 192 participants living in England, aged 50 years+, who had not already been vaccinated this season. Half of these participants identified as being affiliated with the Labour party, and half with the Conservative party. Participants viewed a message either aligned or unaligned with their automatic political intuitions (see Figures 1 and 2). Then they stated how much they agreed with a statement about their intentions to take up the influenza vaccine this season on a 7-point scale, where higher numbers indicated more positive intentions.

Fig 1. Left-Wing Message (aligned with Labour)
Fig 2. Right-Wing Message (aligned with Conservative)

Professor Jonathan Haidt describes the automatic intuitions we set out to influence as moral foundations.[4] Typically, people who identify as being more left-wing are most strongly influenced by their care and fairness intuitions (a desire to prevent harm to others and to ensure equality). In contrast, people who identify as being more right-wing are more strongly influenced by the remaining foundations: purity (a desire to avoid contaminants), authority (to preserve traditions), loyalty (to strengthen group bonds), and liberty (to preserve individual freedom).

Research conducted in the United States and Australia has already identified some of the foundations associated with parental vaccine hesitancy, and suggests that public health messages can be framed to increase parents’ intentions.[7,8] For example, a message designed to promote purity might say: Boost your child’s natural defenses against diseases! – Vaccinate! These proposals are a good start, but without evidence that they are likely to be effective, public health practitioners have little reason to prefer them to the messages developed in-house. The messages used in the present study were informed by messages used in a previous study that significantly altered people’s intentions to recycle.[9]

Our main prediction was that our left-wing message would increase Labour participants’ intentions, and our right-wing message would increase Conservative participants’ intentions. We did not find this. As shown in Figure 3, there was no substantial effect of the messages. One explanation is that the moral foundations used in our advertisements were not relevant in a UK context, which we plan to address in future work. We aim to conduct a general UK survey describing moral foundations in the population and use the survey results to inform a collaborative online workshop with public contributors and health specialists, which is in keeping with Professor Heidi Larson’s calls to involve public perspectives. This pilot study lays the groundwork for such future research.

Fig 3. Results of the study testing the effects of messages on vaccination intentions as measured by average agreement with the statement: “I intend to receive an influenza vaccination this season [2020/21].”

We asked people some follow up questions too. In a free-text box, participants were asked to explain their intentions to (or not to) vaccinate. Their explanations largely fell within five categories, which, in addition to their foundations, may have been influenced by the messages they read: Protect Self, Protect Others, Protect the NHS, Being Eligible/Invited, and Habits. We also asked questions about people’s intentions of taking up a COVID-19 vaccination and wearing a face mask. Similar to recent research,[10] people were more likely to express intentions to take up a future COVID-19 vaccination (72%) than the current influenza vaccination (65%). We suspect that these expressed intentions may be a bit optimistic. Indeed, most participants (89%) also expressed that they would wear a face mask in a store that did not require them to do so, which is higher than our casual observations at the grocery store around the time of the experiment (before additional penalties were introduced). Acquiescence bias may have led our participants to be agreeable in this survey, particularly as participants just saw messages promoting health-related behaviour. But this need not preclude identifying meaningful differences between randomised conditions. Our research team looks forward to better understanding the intuitive influences on vaccination behaviour.

Kelly Ann Schmidtke (Assistant Professor) and Laura Kudrna (Research Fellow)

References:

  1. Salmon DA, et al. Compulsory vaccination and conscientious or philosophical exemptions: past, present, and future. Lancet. 2006;367(9508):436-42.
  2. Sunstein C & Thaler R. Libertarian Paternalism. Am Econ Rev. 2003; 93(2): 175-9.
  3. Larson HJ et al. Addressing the vaccine confidence gap. Lancet. 2011;378:526-35.
  4. Haidt J. The righteous mind: why good people are divided by politics and religion. New York: Pantheon Books; 2012.
  5. U.S. National Library of Medicine. ClinicalTrials.gov Influenza 2020/2021. NCT04546854. 14 September 2020.
  6. Iacobucci G. Covid-19: Risk of death more than doubled in people who also had flu, English data show. BMJ. 2020;370:m3720.
  7. Amin AB, et al. Association of moral values with vaccine hesitancy. Nat Hum Behav. 2017;1(12):873-80.
  8. Rossen I, et al. Accepters, fence sitters, or rejecters: moral profiles of vaccination attitudes. Soc Sci Med. 2019;224(1):23-7.
  9. Kidwell B, et al. Getting Liberals and Conservatives to Go Green: Political Ideology and Congruent Appeals. J Cons Res. 2013; 40(2):350–67.
  10. Boseley S. Coronavirus: fifth of people likely to refuse Covid vaccine, UK survey finds. The Guardian. 24 September 2020.

The Land War in the Fight Against COVID-19

Gone are the days of thinking there is a quick fix to the COVID-19 pandemic. Another country-wide lockdown would reduce COVID-19 infection, but at the same time would damage the economy and pose a threat to other long-term health conditions, with disproportionate effects on the more disadvantaged groups in society. The Great Barrington Declaration – aiming for herd immunity while sequestering high-risk people – does not bear close examination.[1] Vaccination is not an automatic get out of jail card – we do not yet know when vaccination will be available at the required volume, nor what degree of protection it will confer. So, this is the land war. We must work on supply chains, procedures, detection and contact tracing, getting ever slicker at the operation. Personal protection, social distancing and graded lockdowns can all play a part, but only if they are accepted by the general public, who deserve clear explanations of when, where and why unwelcome restrictions will be imposed and what these restrictions are intended to achieve.

While central government has an obvious role to play, it has become clear that the battle must go local; and the more local the better. The risk of being hospitalised with COVID-19 in Birmingham varies dramatically across the various electoral wards, with the seven-day rolling rate of new cases (for week ending 14 October 2020) ranging from 43.8 per 100,000 in Nechells, to 825.8 in Selly Oak.[2] So, supported by the MRC, NIHR ARC West Midlands and our host hospital (University Hospitals Birmingham NHS Foundation Trust) we are developing a computer application to track the evolving pattern of the COVID-19 pandemic. We have developed software that uses geostatistical models to identify “hot spots”, however one defines them, across a broad space such as an urban conurbation or a country. Within such a space we identify localities at whatever scale is relevant for local decision-making and that the data can support. We can map rates of infection per unit of population in real time on these maps to show the current state of the epidemic and its direction of travel (see Example). These maps can direct decision-makers to specific localities where incidence is increasing rapidly and hence where urgent action is needed.
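For readers unfamiliar with the headline statistic, a seven-day rolling rate per 100,000 is simple to compute. The sketch below is illustrative only: the function name, the daily counts and the ward population are invented, and it shows a crude rate rather than the smoothed estimates a geostatistical model would produce.

```python
def rolling_rate_per_100k(daily_cases, population, window=7):
    """Cases over the most recent `window` days, per 100,000 residents."""
    if len(daily_cases) < window:
        raise ValueError("need at least one full window of daily counts")
    if population <= 0:
        raise ValueError("population must be positive")
    recent_total = sum(daily_cases[-window:])
    return recent_total * 100_000 / population

# Hypothetical ward: 28 new cases over the last seven days, 14,000 residents.
ward_daily_cases = [3, 5, 4, 6, 2, 5, 3]
ward_rate = rolling_rate_per_100k(ward_daily_cases, population=14_000)
print(f"{ward_rate:.1f} per 100,000")
```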

But there is a problem with policy action directed at small areas and particular communities – dictatorial edicts are likely to provoke resentment rather than effective action, especially when carried out at a very local level. It is one thing to place restrictions across a whole country or even a large city, but quite another to try to lock down an area such as Lady Pool in Birmingham or Chapel Town in Leeds. Indeed, the disease has highest incidence in BAME communities who may feel victimised or disenfranchised. Already only 18% of people fully comply with UK regulations regarding self-isolation.[3] So here we come to the second use of our application and the maps it produces.

We think that policy-makers should increasingly turn to local communities and ask them to be the architects, not recipients, of policy. In essence we are arguing for an ‘assets-based’ or ‘participatory’ approach based on ‘co-invention’. And here our application can help by providing scientific data at a local level in a form that can be easily assimilated. We are arguing at a local level for the type of thing that Prof Chris Whitty used at a national level in his Downing Street presentation with the Prime Minister and Chancellor (12 October 2020). There is evidence that populations relate well to local maps and they are sometimes used in qualitative research as a method to promote discussion among people.[4] The approach we are advocating here, of high-risk spatio-temporal identification, followed by case-area targeted intervention, has proven effective in limiting the spread of cholera outbreaks,[5] and we advocate a similar approach with respect to the COVID-19 pandemic.

We would be pleased to hear from news blog readers regarding:

  1. Your opinions and advice.
  2. Whether you would like to hear more or use the application when it is developed.
  3. Whether you have examples of similar initiatives elsewhere in the world.
  4. Whether you would like to collaborate.

You can contact us at ARCWM@warwick.ac.uk.

Richard Lilford, ARC WM Director; Sam Watson, Senior Lecturer; Peter Diggle, Distinguished Professor at Lancaster University

Example of Real-Time Surveillance of COVID-19

For this example we have aggregated the results to MSOA (middle-layer Super Output Area) level across the catchment area of University Hospitals Birmingham NHS Foundation Trust, although we have retained other areas of Birmingham to make the boundary of the city clear. One could aggregate to smaller or larger levels as needed. A case here is an admission to hospital for COVID-19.

We have produced these outputs as if we were working on March 26 2020 using data from the preceding two weeks. The first thing someone interested in tracking COVID-19 in the city might ask is what is the incidence of the disease that day?

There is a lot of variation across the different MSOAs, with one area standing out as being high (yellow area). The variation here could be explained by differences in demographics or socioeconomic status, and we might want to ask whether any differences are for unexpected reasons. We can break down the incidence into different components:

Where:

  • Expected is the number of cases we would expect that day from each area based on the size of its population.
  • Observed shows the relative risk in each area associated with observable characteristics (age, ethnicity, and deprivation). For example, suppose the average incidence across the city were one case per 10,000 person-days. An area with a larger proportion of older residents would have a higher risk; if this risk were double the average then it would have a relative risk of two.
  • Latent is the relative risk in each area due to unexplained factors or unobserved variables. Our area with more older people may have an expected incidence of two cases per 10,000 person-days (a ‘baseline’ of 1 per 10,000 person-days times a relative risk of two), but if we observe an average rate of four cases per 10,000 person-days, then there is an additional unexplained relative risk of 2.
  • Posterior SD is the posterior standard deviation, which indicates the uncertainty of the predictions.
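The worked numbers in these definitions imply that the components multiply together. The sketch below simply replays that arithmetic; the multiplicative form is our reading of the example rather than a statement of the fitted geostatistical model, and the function name is hypothetical.

```python
def latent_relative_risk(rate_seen, baseline_rate, observed_rr):
    """Relative risk left unexplained once the baseline rate and the risk
    associated with observed characteristics have been accounted for."""
    return rate_seen / (baseline_rate * observed_rr)

baseline_rate = 1.0  # cases per 10,000 person-days, city-wide average
observed_rr = 2.0    # relative risk from age, ethnicity and deprivation
rate_seen = 4.0      # cases per 10,000 person-days actually seen in the area

explained_rate = baseline_rate * observed_rr  # 2 per 10,000 person-days
latent_rr = latent_relative_risk(rate_seen, baseline_rate, observed_rr)

# Multiplying the components back together recovers the rate that was seen.
reconstructed_rate = baseline_rate * observed_rr * latent_rr
print(explained_rate, latent_rr, reconstructed_rate)
```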

So based on these plots the area with high incidence in the North of Birmingham would appear to be higher than we would expect based on the observed variables by a factor of 2 or 3. This may indicate the need for public health intervention. We might finally ask how this compares with previous days.

The next plot shows the incidence rate ratio, which here is the ratio of incidence compared to seven days prior for each area. A value of one indicates no change, two a doubling, and so forth. One can clearly see that it is above one, i.e. it is increasing, city-wide. The greatest relative increases are centred on the area we identified as being of high concern.
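As a minimal illustration of the incidence rate ratio, the sketch below divides each area's current rate by its rate seven days earlier. The area names and rates are invented, and a crude ratio like this is noisier than a model-based estimate, particularly where counts are small.

```python
def incidence_rate_ratio(rate_today, rate_week_ago):
    """Ratio of current incidence to incidence seven days earlier."""
    if rate_week_ago <= 0:
        raise ValueError("the comparison rate must be positive")
    return rate_today / rate_week_ago

# Hypothetical rates per 10,000 person-days: (today, seven days ago).
area_rates = {
    "Area A": (2.0, 2.0),
    "Area B": (4.0, 2.0),
    "Area C": (3.0, 1.0),
}

ratios = {area: incidence_rate_ratio(now, before)
          for area, (now, before) in area_rates.items()}
rising = [area for area, irr in ratios.items() if irr > 1]
print(ratios, rising)
```

A ratio of one means no change and two means a doubling, so in this invented example only Areas B and C would be flagged as rising.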



References:

  1. Alwan NA, et al. Scientific consensus on the COVID-19 pandemic: we need to act now. Lancet. 2020.
  2. Public Health England. Coronavirus (COVID-19) in the UK: Interactive Map. 19 October 2020.
  3. Smith LE, et al. Adherence to the test, trace and isolate system: results from a time series of 21 nationally representative surveys in the UK (the COVID-19 Rapid Survey of Adherence to Interventions and Responses [CORSAIR] study). MedRXiv. 2020. [Pre-print].
  4. Boschmann EE, Cubbon E. Sketch maps and qualitative GIS: Using cartographies of individual spatial narratives in geographic research. Professional Geographer. 2014;66(2):236-48.
  5. Ratnayake R, et al. Highly targeted spatiotemporal interventions against cholera epidemics, 2000-19: a scoping review. Lancet Infect Dis. 2020.

When Waiting is Not Enough

Healthcare is emerging from the immediate crisis response of COVID-19 into a hugely uncertain environment. One of the very few things of which we can be sure is significantly longer waiting times for elective procedures.

The Health Foundation recently published a report drawn from pre-COVID data,[1] which starkly portrayed the challenges around the 18 weeks Referral to Treatment target. The report estimated that the NHS needed to treat an additional 500,000 patients per year for the next four years to restore delivery of the target. Using data from NHS England following the first month of COVID-19 induced elective shutdown, Dr Rob Findlay noted a jump, both in the number of patients waiting over 52 weeks, and the average wait time for patients, which rose to 6 months.[2] These figures are likely to increase further in coming months. The article also noted that very few long-wait patients were treated. Longer wait patients should be de facto low clinical urgency, as it is this that has made them appropriate to wait.

There are two significant decision-making points for the treatment of patients on waiting lists: clinical urgency, which of course affects those near the start of their waiting time, and being in imminent danger of breaching a waiting time target, which necessarily affects those towards the end. Between these decision-making points at the start and end of the waiting list lies a huge volume of patients with little categorisation or prioritisation.

Herein lies a significant future challenge: as waiting times increase and a growing number of patients breach waiting time targets, how do you ensure that limited elective capacity is targeted towards those with greatest clinical need?

If NHS England and NHS Improvement do not relax waiting time restrictions, maximum wait times will continue to be an important decision-making point. This incentivises providers to make a trade-off and treat longer-waiting, but clinically less urgent, patients over shorter-waiting, but clinically more urgent, ones. This would be a difficult position to justify at the best of times; when resources are likely to be constrained, the policy is likely to do far more harm than good.

It is crucially important to use need as the basis for prioritising which patients to treat. A recent literature review described some of the efforts made around the start of the millennium to develop a more systematic and transparent approach to prioritisation based on need. This approach developed from the Western Canada Waiting List Project [3] and the New Zealand Priority Criteria Project.[4] These approaches were rigorously reviewed through a range of academic articles and evaluated well, showing both transparency and consistency of decision making and prioritisation. Importantly, they also carried strong public support when reviewed with focus groups.

These ‘point-count’ systems work by creating a scoring chart for each clinical condition, such as cataract surgery, major joint replacement or coronary bypass graft. They have also been successfully used and evaluated for topics such as the use of Magnetic Resonance Imaging (MRI) and children’s mental health. The scoring grid is unique to each clinical condition and is developed through consensus discussion with clinicians to balance a range of clinical and social factors. The objective is to prioritise for treatment those patients who will gain the most substantial benefit from intervention.
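
As a rough illustration of the mechanics, the sketch below applies a scoring grid to three patients and ranks them by total points. Everything in it is hypothetical: the criteria, the point values and the patients are invented for the example and are not taken from the Canadian or New Zealand tools.

```python
from dataclasses import dataclass

# Hypothetical scoring grid for a single condition (say, major joint replacement).
# Each criterion maps a clinician-assessed level to points. All values are invented.
JOINT_GRID = {
    "pain": {"none": 0, "mild": 5, "moderate": 12, "severe": 20},
    "functional_limitation": {"none": 0, "mild": 5, "moderate": 10, "severe": 20},
    "potential_for_progression": {"low": 0, "medium": 5, "high": 10},
}


@dataclass
class Patient:
    name: str
    assessment: dict  # criterion -> assessed level


def priority_score(patient, grid):
    """Add up the points awarded for each criterion in the grid."""
    return sum(levels[patient.assessment[criterion]]
               for criterion, levels in grid.items())


def prioritise(patients, grid):
    """Order patients from highest to lowest score."""
    return sorted(patients, key=lambda p: priority_score(p, grid), reverse=True)


patients = [
    Patient("A", {"pain": "mild", "functional_limitation": "moderate",
                  "potential_for_progression": "low"}),
    Patient("B", {"pain": "severe", "functional_limitation": "moderate",
                  "potential_for_progression": "high"}),
    Patient("C", {"pain": "moderate", "functional_limitation": "mild",
                  "potential_for_progression": "medium"}),
]

ranked = prioritise(patients, JOINT_GRID)
for p in ranked:
    print(p.name, priority_score(p, JOINT_GRID))
```

In a real system the grid itself, rather than the arithmetic, is the hard part: it is the product of the clinical consensus process described above.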

‘Point-count’ systems have translated successfully into several healthcare settings but not to the NHS. Often these types of changes are put into the ‘too difficult’ category as the resource required to implement them is seen to be greater than the benefit gained. However, we are moving to a different paradigm post COVID-19 where integrated care systems are more accountable to their population and a more objective and transparent decision-making process is desirable.

Think too of the benefits of a shared language of waiting lists. We should not forget that many non-clinical staff are involved in the booking and scheduling of elective patients. A common currency in which objective comparisons can be made on the likely benefit of surgery or intervention across clinical indications and specialties is highly appealing.

One of the most keenly debated elements of the development of these ‘point-count’ systems was what factors should be considered as part of the scoring criteria. Repeatedly the idea of including some reflection of how long a patient had waited was considered, and strongly rejected. Instead a measure of ‘potential for disease progression’ was included to ensure that those waiting for, say, a joint replacement procedure were not constantly usurped by patients with a more acute presentation. At the same time, leaving out time waited guards against the current system, in which those waiting longest receive priority at the potential expense of others who would derive greater clinical benefit.

So, as a policy directive there is a clear indication – the maintenance of the current maximum wait times will prioritise many clinically less urgent patients over more urgent cases. It remains to be seen whether the evidence base is substantial enough, and whether there is sufficient appetite within the NHS to revisit some of these clinical prioritisation approaches, but their use should be considered and their implementation would make a fascinating piece of research in the coming years.

Paul Bird, Head of Programme Delivery (Engagement), Richard Lilford, ARC WM Director

With thanks to Prof. Tim Hofer (University of Michigan) for discussion and input.


References:

  1. Charlesworth A, Watt T, Gardner T. Returning NHS waiting times to 18 weeks for routine treatment. The Health Foundation. 2020.
  2. Findlay R. Average waiting time for NHS operations hits six months thanks to covid. Health Serv J. 2020.
  3. Noseworthy TW, McGurran JJ, Hadorn DC, et al. Waiting for scheduled services in Canada: development of priority-setting scoring system. J Eval Clin Pract. 2003; 9(1): 23-31.
  4. Hadorn DC, Holmes AC. The New Zealand priority criteria project. Part 1: Overview. BMJ. 1997; 314: 131.

Recognising the rising tide in service delivery and health systems research

With rising demands and finite resources, health systems worldwide are under constant financial pressure. The US has been at the extreme end of high spending, with health expenditure accounting for 17% of its GDP in 2017 – compared with 9.8% for the UK and 8.7% for the OECD average.[1] Therefore, the imperative of containing healthcare cost is mounting in the US. Under the Affordable Care Act (ACA), alternative payment models (often known as value-based payments) have been widely introduced to replace the fee-for-service model.

A recent article in JAMA highlighted a paradox,[2] in which an apparent plateau in overall healthcare expenditure (at around 18% of US GDP) is contrasted with lack of significant success reported in individual evaluations of these alternative payment models. Why has health spending as a proportion of GDP plateaued when the interventions to reduce spending have been ineffective in doing so? The authors ruled out the explanation that the growth in GDP has outpaced the growth of health expenditures as the latter seems to be genuinely flattening. So how can this discrepancy be reconciled?

The authors offered three explanations:

  1. Anticipation of ACA-driven expansion of alternative payment models may have induced changes in the psychology and practice of clinicians and health care organisations, leading to curbs on spending irrespective of the introduction of alternative payment models.
  2. Primed by the above change in mindset, clinicians and health care organisations may have been influenced by their peers and emulated their practice. This would cause a wider spread of the change beyond the institutions where the alternative payment models were first introduced and evaluated (e.g. from within the Medicaid system to those covered by commercial insurers).
  3. Simultaneous introduction of a large number of alternative models in different places may have led to contamination of control groups in individual evaluations, where the control group chosen in one evaluation may be subject to the introduction of another alternative payment model.

Taken in the round, these explanations suggest a secular trend of system-wide changes (in this case cost containment), which may take various forms and be achieved through different means, but which are triggered by heightened awareness of the same issue and shared social pressure to tackle it across the board – what we have described as the ‘rising tide phenomenon’.[3] The phenomenon is by no means a rare occurrence in health services and systems research and so is well worth considering when a null finding is observed in a controlled study. The corollary is that when there is a rising tide, null findings do not disprove the potential effectiveness of the intervention being evaluated. A more nuanced interpretation taking into account the secular trend is required, as the authors of the aforementioned paper did.
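
A toy calculation may help show why a rising tide produces a null result in a controlled comparison. The sketch below uses a simple difference-in-differences estimate, chosen here purely for illustration (it is not the analysis used in the JAMA article), and all the spending-growth figures are invented.

```python
def controlled_estimate(pre_int, post_int, pre_ctl, post_ctl):
    """Difference-in-differences: change at intervention sites
    minus change at control sites."""
    return (post_int - pre_int) - (post_ctl - pre_ctl)


# Hypothetical annual growth in health spending (%), invented for illustration.
pre = 6.0
post_after_change = 4.0  # growth once cost-containing practice has been adopted

# No rising tide: only the intervention sites change their practice.
estimate_without_tide = controlled_estimate(pre, post_after_change, pre, pre)

# Rising tide: control sites change too (anticipation, emulating peers,
# or exposure to another payment model), so both arms fall together.
estimate_with_tide = controlled_estimate(pre, post_after_change,
                                         pre, post_after_change)

print("Estimate without a rising tide:", estimate_without_tide)  # -2.0
print("Estimate with a rising tide:   ", estimate_with_tide)     # 0.0
print("Change at intervention sites:  ", post_after_change - pre)  # -2.0 both times
```

In both scenarios spending growth at the intervention sites falls by the same amount; only the behaviour of the comparison group differs, and that alone turns an apparent effect into a null finding.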

Yen-Fu Chen, Associate Professor; Richard Lilford, ARC WM Director


References:

  1. Organisation for Economic Co-operation and Development. Health. 2020. Available at: https://stats.oecd.org/Index.aspx?ThemeTreeId=9
  2. Navathe AS, Boyle CW, Emanuel EJ. Alternative Payment Models—Victims of Their Own Success? JAMA. 2020; 324(3):237-8.
  3. Chen Y-F, Hemming K, Stevens AJ, Lilford RJ. Secular trends and evaluation of complex interventions: the rising tide phenomenon. BMJ Qual Saf. 2016; 25(5): 303-10.

Changing the Message to Change the Response – Psychological Framing Effects During COVID-19

The way in which a government communicates can shape people’s responses. Psychological and behavioural research reveals that the same objective information can elicit different responses when presented in different ways, an effect called ‘framing’.[1] For example, one study compared describing blood donations as either a way to “prevent a death” or “save a life”.[2] While preventing death and saving life are two sides of the same coin, “prevent a death” triggered more donations. These results are explained, at least in part, by a prevalent loss-aversion bias. As Kahneman and Tversky (1979) explain: losses loom larger than gains.[3] 

In 1981, Kahneman and Tversky asked people to imagine that the US was preparing for a disease outbreak that was expected to kill 600 people.[1] Participants were asked to choose between two government programmes. In one scenario, participants considered saving lives: given programme A, 200 lives would be saved; and given programme B, there was a 1/3 probability that 600 lives would be saved and 2/3 probability that no lives would be saved. While mathematically these programmes are equivalent, 72% preferred programme A (109/152 participants). A second group of participants considered preventing deaths: given programme C, 400 would die; and given programme D, there was a 1/3 probability that nobody would die and 2/3 probability that 600 people would die. This time, 78% chose programme D (121/155 participants). Flipping the vocabulary coin flipped people’s preferences.
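
The claim that the programmes are mathematically equivalent can be checked by computing the expected number of deaths under each one. The short sketch below does this with exact fractions; the function and variable names are ours, introduced only for this illustration.

```python
from fractions import Fraction

AT_RISK = 600  # people expected to die if nothing is done


def expected_deaths(outcomes):
    """outcomes is a list of (probability, number of deaths) pairs."""
    return sum(prob * deaths for prob, deaths in outcomes)


programmes = {
    # "Lives saved" frame, converted to deaths out of 600
    "A": [(Fraction(1), AT_RISK - 200)],                         # 200 saved for certain
    "B": [(Fraction(1, 3), AT_RISK - 600), (Fraction(2, 3), AT_RISK)],  # all saved or none saved
    # "Deaths" frame
    "C": [(Fraction(1), 400)],                                   # 400 die for certain
    "D": [(Fraction(1, 3), 0), (Fraction(2, 3), 600)],           # nobody dies or all die
}

for name, outcomes in programmes.items():
    print(name, expected_deaths(outcomes))
```

All four programmes give an expected 400 deaths (equivalently, 200 lives saved). A and C are the same certain outcome and B and D are the same gamble, so the reversal in preferences can only come from the wording.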

In March 2020, we set out to test whether these results would hold when applied to COVID-19. We created two scenarios with identical options to Kahneman and Tversky’s but changed the wording to be about COVID-19 and social/physical distancing. The study was ethically approved and in early July we invited UK participants via Prolific Academic to respond to a randomly allocated scenario. The data were collected in less than two hours. The pattern of results held – participants preferred programme A over B (21/30 = 70%) and D over C (23/30 = 77%). Interesting, but perhaps insufficient to inform the way messages are presented to the public to influence their more personal decisions, such as about visitors at home.

The UK government’s initial messaging strategy about personal decisions emphasised that people needed to stay home in order to “save lives”. A later campaign framed this differently, stressing that “people will die” if they go out. Does flipping the vocabulary coin here matter? We, and others, suspect that it does. There have been several opinion pieces on psychologically informed messaging,[4] although we are unaware of any published research results that have tested framing effects in the context of COVID-19.

We created six further personal scenarios. These scenarios varied across three situations and two frames. Participants were asked whether they would be willing to have a friend over (yes/no), attend a crowded work meeting (yes/no), and download a contact tracing app (yes/no). Each situation was framed in two ways – as about a choice to save lives or prevent deaths. An excerpt from the story about inviting a friend over is provided here: 

Imagine that the town of Pleasantville… is preparing for the outbreak of the Coronavirus (COVID-19), which is expected to kill 600 people. They decide to adopt a social/physical distancing programme to prevent the spread of COVID-19 that is expected to [save 200 lives / prevent 400 deaths]. Social/physical distancing is when people reduce social interaction to stop the spread of a disease, such as by working from home and avoiding gatherings in public spaces. Your good friend calls you and says they want to come over to discuss the announcement… 

What do you say to your friend? Yes, come over / No, don’t come over

If losses loom larger than gains in more personal scenarios, then we should expect messages framed as ‘preventing death’ to have stronger effects across situations. The pilot results are shown in Figure 1. There was no substantial effect of message framing, although the situation made some difference. Nobody was willing to let a friend visit their home, some people said they would attend a work meeting, and the majority would download a contact tracing app.


Fig 1: Results of the study testing framing effects about saving lives versus preventing death

What can explain these results? One possibility is social desirability bias. People may wish to appear as if they would take action to prevent COVID-19 spreading, even if they would not in everyday life. 

Timing may also matter. When we conducted our study, people may have been sufficiently fearful of the consequences of COVID-19 that they were willing to comply with guidelines and recommendations, regardless of the message framing. It is possible that earlier on in the pandemic, we would have found different results.  

Another explanation is that, unlike the government programme scenarios, the options in the more personal scenarios were not presented as a choice between a certain outcome and a probabilistic one. For the government programme scenarios, when the options were framed as saving lives, participants wanted to secure the safe-but-sure option. One participant explained their response by saying, “The 1/3 probability means the same 200 die but the [other] option appears to guarantee saved lives”. Conversely, when the options were framed as preventing deaths, people wanted to roll the proverbial dice. One participant explained that, “The overall odds are the same but the chance for no one dying is worthwhile”. In contrast, the risk regarding personal decisions is uncertain because many outcomes for COVID-19 are uncertain. It may be that loss aversion is more pronounced when people make policy choices between certain and probabilistic outcomes.

Our study only scratches the surface of possibilities for message testing. We wonder what research may have shown about alternatives to ‘Stay Alert’. Perhaps some of its criticisms could have been avoided, such as with messages to help manage the anxieties associated with the uncertainty of lifting a lockdown. Certainly, public messages can be efficiently tested before they are publicly disseminated – even during a crisis.

Laura Kudrna (Research Fellow) and Kelly Ann Schmidtke (Assistant Professor)


References:

  1. Tversky A, Kahneman D. The Framing of Decisions and the Psychology of Choice. Science. 1981; 211(4481): 453-8.
  2. Chou EY, Murnighan JK. Life or Death Decisions: Framing the Call for Help. PLoS ONE. 2013; 8(3): e57351.
  3. Kahneman D, Tversky A. Prospect Theory: An Analysis of Decision Under Risk. Econometrica. 1979; 47(2): 263-92.
  4. Halpern SD, Truog RD, Miller FG. Cognitive Bias and Public Health Policy During the COVID-19 Pandemic. JAMA. 2020.

N.B. This blog post has also been cross-posted at: blogs.lse.ac.uk/politicsandpolicy/changing-the-message-to-change-the-response-psychological-framing-effects-during-covid-19/

Walking Through the Digital Door: Video Consultations During COVID-19 and Beyond

The “NHS Long-Term Plan” (2019) is a five-year plan describing how NHS services should be redesigned for the next decade. This plan includes making better use of digital technologies, such as video consultations. While video consultations have potential advantages for patients and hospital systems,[1] they may make patients uncomfortable. If patients do not walk through the ‘digital door’ to attend a video consultation, then potential advantages cannot be realised. Likely the motto of “build it and they will come” is insufficient. Instead, we need to support patients so that they come the first time and return after that. 

What support patients need is, at least in part, an empirical question that we plan to address in a future study. One way to support attendance may be with the behavioural science principle of ‘defaults’ – people tend to ‘go with the flow’ of pre-set options.[2] Defaults have been used to influence organ donations by adding the word ‘don’t’ to an application, i.e. “If you want to be an organ donor, please check here,” vs. “If you don’t want to be an organ donor, please check here”. In a simulated study, 42% of people opted in to become organ donors given the original phrasing, and 82% did not opt out given the second.[3] In other words, the effective consent rate nearly doubled (82% versus 42%) simply by changing the default option. Until April 2020 England had an opt-in system with 38% of people having opted in to become organ donors. When England’s law changed to an opt-out system in May 2020 the assumed donor rate increased instantly. Time will tell how many people fill out the form to opt out, but the present authors expect the resultant donor rate to remain higher than 38%.

Defaults have been used to influence people’s behaviour in many contexts, e.g. how much money people save for retirement,[4] physicians’ medication use,[5] and purchases of healthy foods.[6] At least three psychological mechanisms are at play: endorsement (believing the proposed default is recommended), endowment (believing the default is normal), and ease (taking up the proposed default is simpler than refusing it).[7,8] Re-framing an invitation to attend an outpatient appointment from ‘in-person’ to ‘video’ creates a new default ‘endorsed’ mode of attendance that is ‘easier’ to accept than refuse. However, if a substantial number of patients refuse an invitation to attend a video consultation, this would suggest that more support is needed to garner people’s acceptance.

An ideal experimental test of the default effect on outpatient appointment attendance would occur in the field setting, similar to our work on influenza vaccination letters.[9] But (without tremendous follow-up efforts) this approach provides a limited ability to explore the barriers and facilitators that patients believe influence their choices. These beliefs undoubtedly influence whether patients attend. To explore how default options and beliefs influence whether patients accept an invitation to attend a video consultation, we will conduct a simulated study with patients from the site Prolific Academic. Prolific Academic contains thousands of people prepared to answer researchers’ questions who can be filtered on criteria such as health status, age, and education. Our research will utilise an online experiment with quantitative and qualitative items. We plan to compare our findings to real hospital data on video consultations before and after COVID-19, which may have provided the impetus for more patients to engage in digital healthcare.

Conversations with researchers across ARC WM’s themes and with public contributors suggest several barriers and facilitators to the uptake of video consultations. For instance, while the location of in-person consultations was obvious, video consultations require patients to make an additional choice about where they feel comfortable attending. Whether attending from home or work, new privacy concerns arise regarding what other people can overhear across physical and digital space. Our research will show how much such concerns matter to patients, and suggest what additional support should be offered within the invitation itself to increase patients’ attendance. If COVID-19 hasn’t provided the push that patients need to walk through the digital door, this research will help us understand why. Equally, if it has, we will be better equipped to sustain and expand the shift, and in so doing help realise the NHS Long-Term Plan.

Kelly Ann Schmidtke (Assistant Professor) and Laura Kudrna (Research Fellow)


References:

  1. Greenhalgh T, et al. Virtual Online Consultations: Advantages and Limitations (VOCAL) Study. BMJ Open 2016; 6: e009388. 
  2. Dolan P, et al. Influencing Behaviour: The Mindspace Way. J Econ Psychol. 2012; 33(1): 264-77.
  3. Johnson EJ, Goldstein D. Do Defaults Save Lives? Science. 2003; 302(5649): 1338-9. 
  4. Madrian BC, Shea DF. The Power of Suggestion: Inertia in 401(k) Participation and Savings Behaviour. Q J Econ. 2001; 116(4):1149–87. 
  5. Ansher C, et al. Better Medicine by Default. Med Decis Making. 2014; 34(2):147-58. 
  6. Peters J, et al. Using Healthy Defaults in Walt Disney World Restaurants to Improve Nutritional Choices. J Assoc Consum Res. 2016; 1(1): 92-103.
  7. Jachimowicz JM, et al. When and Why Defaults Influence Decisions: a Meta-Analysis of Default Effects. Behav Public Policy. 2019; 3(2): 159-86. 
  8. Dinner I, et al. Partitioning Default Effects: Why People Choose Not to Choose. J Exp Psychol Appl. 2011; 17(4): 332-41. 
  9. Schmidtke KA, et al. Randomised controlled trial of a theory-based intervention to prompt front-line staff to take up the seasonal influenza vaccine. BMJ Qual Saf. 2020; 29(3): 189-97.

The Holy Grail of Quality Measurement

Writing in JAMA, Austin and Kachalia argue for automation of quality measurements.[1] We ourselves have argued that the proliferation of routine quality measures is getting out of hand.[2]

The authors argue, as we have argued, that using quality measures to incentivise organisations is a blunt tool, subject to gaming. Far better is to use quality measures in real-time to prompt doctors to provide high quality care.

In fact, this is what computerised decision support offers. There is considerable empirical support for use of this type of decision tool. Working with Prof Aziz Sheikh and colleagues, NIHR ARC West Midlands has investigated decision support for prescribing [3] and we are now investigating its use in antibiotic stewardship.[4] We are entirely in support of the use of decision support to improve care in real-time.

However, we question the idea that the majority of healthcare can be guided by online decision support. Working with Prof Timothy Hofer in Michigan, ARC WM co-investigators have shown that the measurement of the quality of hospital care is extremely unreliable.[5] Kappa measures of agreement between reviewers were about 20%. This means that seven reviewers would be needed for each case note, to achieve a reliability of 80%.

That is to say, for much of medical care there is no agreed standard. Truly, the majority of medical care is more art than science.

We think that the time has arrived to abandon hubristic notions about standardising and quality assuring the generality of clinical care. Medicine is not like aviation. Commercial aviation is almost entirely computerised. Emergencies aside, the whole process can be guided algorithmically. Our paper in Milbank Quarterly shows quite clearly that this is not the case for medicine.[5]

Working with Prof Julian Bion, the ARC WM Director had an opportunity to audit numerous case notes from patients with sepsis.[6] The idea was to observe quality of care against a package of evidence-based criteria. Many of these criteria were based on actions that should be carried out within a specified time from diagnosis. The exercise proved almost impossible, since the point of diagnosis was ephemeral. In most cases there was no clear point to start the clock and the very diagnosis of sepsis had to be reverse-engineered from the time at which a sepsis-associated action took place! This exercise provided eloquent testimony to the judgemental, rather than rules-based, nature of much medical practice. We should use algorithmic decision support where clear rules exist, but we must stop pretending that the whole of medicine can be guided in this way. Perhaps we should just stand back a little, and accept some of the imperfections in our systems. Like a harm-free world, perfection will always lie beyond our grasp.[7]

Richard Lilford, ARC WM Director


References:

  1. Austin JM, Kachalia A. The State of Health Care Quality Measurement in the Era of COVID-19. The Importance of Doing Better. JAMA. 2020.
  2. Lilford RJ. Measuring Quality of Care. NIHR CLAHRC West Midlands News Blog. 21 April 2017.
  3. Yao GL, Novielli N, Manaseki-Holland S, et al. Evaluation of a predevelopment service delivery intervention: an application to improve clinical handovers. BMJ Qual Saf. 2012;21:i29-38.
  4. Usher Institute. ePrescribing-Based Antimicrobial Stewardship. 2020.
  5. Manaseki-Holland S, Lilford RJ, Te AP, et al. Ranking Hospitals Based on Preventable Hospital Death Rates: A Systematic Review With Implications for Both Direct Measurement and Indirect Measurement Through Standardized Mortality Rates. Milbank Q. 2019;97(1):228-84. 
  6. Lord JM, Midwinter MJ, Chen YF, et al. The systemic immune response to trauma: an overview of pathophysiology and treatment. Lancet. 2014;384(9952):1455-65.
  7. Meddings J, Saint S, Lilford RJ, Hofer TP. Targeting Zero Harm: A Stretch Goal That Risks Breaking the Spring. NEJM Catal Innov Care Del. 2020; 1(4).