
Policy Makers Should Use Evidence, But What Should They Do In an Evidence Vacuum?

There are two points of view concerning the obligations of policy makers when there is no direct evidence to guide them:

  1. It is wrong to take any action or intervene unless there is evidence to support your decision.
  2. A lack of evidence is neutral; it neither allows a decision-maker to intervene, nor does it sanction non-intervention.

Which is correct? Writing recently in the Lancet Respiratory Medicine, Feng et al. advocate the use of face masks in public to prevent the spread of COVID-19.[1] They say it is an asymmetrical choice: masks are unlikely to do harm and may do much good by preventing the spread of the disease from pre-symptomatic people to people who are unaffected.

The ARC WM Director sides with the ‘lack of evidence is neutral’ principle. In my opinion the argument that a policy maker should not intervene in the absence of direct evidence is flawed for a series of linked reasons:

  1. The obligation to use evidence when it exists does not entail the requirement to fail to act when there is no such evidence.
  2. Further, there is never a circumstance in which no relevant evidence is available. Granted, there may be no direct, comparative evidence, but this is not tantamount to no evidence at all.
  3. There can be no automatic supposition that the expected value of a proposed intervention is less than that of the status quo. That is to say, the balance of benefits, harms and costs may go either way when there is no incontrovertible comparative evidence. It is then a matter of judgment as to the relative probabilities of benefit and cost that must sit alongside values in determining the best course of action.
  4. The theoretical basis for decisions under uncertainty derives from expected utility theory, which reconciles probability and values/preferences.[2][3] Under this axiomatic theory, probability refers to the decision-maker’s degree of belief.
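The expected utility calculus in point 4 can be sketched in a few lines of code. The probabilities and utilities below are purely hypothetical assumptions, chosen only to show how a decision-maker’s degrees of belief combine with values to rank options:

```python
# Illustrative expected-utility comparison for a policy decision taken
# without direct comparative evidence. All probabilities and utilities
# are hypothetical assumptions, not estimates from any study.

def expected_utility(outcomes):
    """Sum of probability-weighted utilities over possible outcomes."""
    return sum(p * u for p, u in outcomes)

# Decision-maker's degrees of belief (subjective probabilities) and
# values (utilities on an arbitrary 0-100 scale) for each option.
recommend_masks = [
    (0.60, 90),  # masks reduce transmission: large benefit
    (0.40, 70),  # masks make little difference: minor inconvenience
]
status_quo = [
    (0.60, 40),  # transmission continues unchecked: large harm
    (0.40, 75),  # masks would have made little difference anyway
]

print(f"Recommend masks: {expected_utility(recommend_masks):.1f}")
print(f"Status quo:      {expected_utility(status_quo):.1f}")
```

Under these (invented) beliefs the mask recommendation has the higher expected utility; the point of the exercise is not the numbers but that the comparison can be made at all in the absence of direct comparative evidence.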

Of course, nothing written above should be misinterpreted to imply either that good evidence should not inform decisions or that policy makers have no obligation to try to collect evidence to better inform future decisions. Indeed, the mandate to collect and use evidence is now enshrined in law in many states in the USA and was a manifesto commitment for the current UK government.

The US state of Oregon is well known for ground-breaking policies. Right back in 2003 it passed legislation requiring evidence-based procurement of clinical services in the field of addictions, beginning in 2005.[4] By 2011, 75% of addiction services commissioned with public money had to be evidence-based.[5] Likewise, nearby Washington State passed a law in 2012 requiring policy makers to use empirically supported services for children’s health and welfare.[6]

The British government supports policy trials through four channels:

  1. Funding universities to carry out policy trials to inform the government’s programme. A good example is The Work and Health Unit (WHU) trial of an intervention to encourage small- and medium-sized enterprises (SMEs) to do more to promote employee health and welfare.[7] The WHU have sponsored ARC WM faculty, supported by the West Midlands Combined Authority and RAND Europe, to carry out a four-arm cluster randomised trial of 100 SMEs.[8]
  2. Funding external ‘what works’ centres, such as the Education Endowment Foundation, established in 2011 by The Sutton Trust with £125m funding from the Department for Education. This organisation has conducted a very large series of educational RCTs, in which England now leads the world, as recently described in our News Blog.[9]
  3. In-house trials conducted by individual government departments. I am a member of the Cabinet Office ‘What works trial advice panel’, which advises on in-house and externally commissioned trials (whatworks.blog.gov.uk/trial-advice-panel/). HMRC has conducted the largest-ever RCT of self-assessment tax schemes, for example, and the Environment Agency has recently conducted an RCT to tackle waste crime. I am currently part of a small group advising government departments on the design and evaluation of an intervention to help people who have recently become carers to adapt to their new circumstances without becoming depressed, and in some cases to continue to work.
  4. Funding academic centres, such as DHSC policy research centres.

ARC West Midlands will continue to promote local and international studies to provide evidence for evidence-based policy. We work very closely with policy makers and service managers so that our work addresses their immediate needs. We like to think of ourselves as pioneers in the fields of rapid-response and opportunistic research, and can cite a number of ongoing and recent examples, many covering public health and social care.

Richard Lilford, ARC WM Director; with thanks to Emily Power for contributions.


  1. Feng S, et al. Rational use of face masks in the COVID-19 pandemic. Lancet Resp Med. 2020.
  2. Thornton JG, Lilford RJ, Johnson N. Decision analysis in medicine. BMJ. 1992; 304: 1099-103. 
  3. Lilford RJ, Braunholtz D. The statistical basis of public policy: a paradigm shift is overdue. BMJ. 1996; 313: 603.
  4. Oregon Legislative Assembly. Human Service Issues: Health Care. Senate Bill 267. In: 2003 Summary of Legislation. Oregon: Legislative Fiscal Office; 2003. p59.
  5. Rieckmann T, et al. Employing Policy and Purchasing Levers to Increase the Use of Evidence-Based Practices in Community-Based Substance Abuse Treatment Settings: Reports from Single State Authorities. Eval Program Plann. 2011; 34(4): 366-74.
  6. Trupin E, Kerns S. Introduction to the Special Issue: Legislation Related to Children’s Evidence-Based Practice. Admin Policy Ment Health. 2017; 44(1): 1-5.
  7. Thrive at Work Wellbeing Programme Collaboration. Evaluation of a policy intervention to promote the health and wellbeing of workers in small and medium sized enterprises – a cluster randomised controlled trial. BMC Public Health. 2019; 19: 493.
  8. Lilford R, Russell S, Sutherland A. Thrive at Work Wellbeing Premium – Evaluation of a Cluster Randomised Controlled Trial. AEA RCT Registry. October 17 2018.
  9. Lilford RJ. UK Takes Over From the US as the Home of Trials of Educational Interventions. NIHR CLAHRC West Midlands News Blog. June 1 2018.

Guidelines on How to Change Services for the Better

Theory of Change

If you want to improve care at the front line against a standard (e.g. kindness to clients, implementing cancer treatment, etc.) then you have to intervene at the service level. The development of service interventions is stock-in-trade for service managers and clinicians; they are doing so all the time. But how should an intervention be developed? As you might expect, this is an immense subject, but there is broad agreement on the process, recently described by Wight et al. and detailed below.[1]

  1. Define and understand the problem.
  2. Identify things that might change.
  3. Come up with a causal change mechanism/theory of change.
  4. Identify how to deliver the change.
  5. Test and refine on a small scale.
  6. Roll out and evaluate (summative evaluation).

Well, that’s pretty basic, and fits well with the Medical Research Council guidance referred to in a previous CLAHRC West Midlands News Blog.[2]

Different Approaches

For a much more extensive discussion see a recent paper by Alicia O’Cathain, which compares different approaches.[3] In fact the approaches are not hermetically sealed from one another; many overlap in content, though the emphasis, of course, varies. Few fail to highlight the importance of involving service users in development and design. No one disputes that an intervention should be preceded by a “diagnosis of the causes of a developed problem.” Piloting before widespread application is widely supported, if not always adhered to. Some approaches (intervention mapping, for example) are more elaborate and formulaic than others. However, it is hard to insist on a one-size-fits-all approach. Having an explicit theory does not increase the probability of success, but it does make it easier to explain the intervention to others.

Behavioural Psychology

One way to obtain change is to mandate certain behaviours and to enforce compliance. Such coercion is often justified, but in the grey area of healthcare in general, and medical care in particular, few activities are governed by hard rules. Mandating correct clinical diagnosis, for example, does not make a lot of sense. So we are into more subtle methods to change behaviour. 

Some interventions are truly straightforward and do not require conscious behaviour change – certain engineering solutions, such as forced functions to prevent misconnecting anaesthetic gas pipes, for example. But most require those annoying creatures, human beings, to change their behaviour in some way. Perhaps the single greatest contribution to providing a framework comes from the development of the trans-theoretical model [4] and its further distillation in the form of the COM-B model.[5] These models are built up from analysis and categorisation of the myriad preceding psychological theories that seek to explicate behaviour change.

Thoughts from ARC WM

A recent article published by the Council for Allied Health Professions Research highlights Krysia Dziedzic’s top tips for implementation.[6] Krysia is part of our Long-term conditions theme and directs the Impact Accelerator Unit in the School of Primary, Community and Social Care at Keele University. Here I give my own tips for service change.

Some Frequently Flouted ‘Rules’ of Behaviour Change When Service Interventions are Designed and Implemented

Incentives (expectancy theory): Never use an incentive, positive or negative, when the people at whom it is targeted do not believe they can achieve it under their own volition.[7][8]
Even if an intervention is targeted at the frontline of operations, intervene also at ‘higher’ levels: In general, when intervening at the operational level, also activate higher levels, not only to liberate resources but also to create the right social environment, in line with social expectancy theory.[9][10]
Political work: Do not intervene when people are not expecting it, and when it may change patterns of work, without first doing political work to ‘win hearts and minds’. People might not oppose what you are attempting, but you need active support. It is worth considering compensating the first generation of losers, after Aneurin Bevan’s “I stuffed their mouths with gold” dictum.[11]
Be persistent, but also patient: Expect prolonged resistance if skill substitution or material disruption of work is involved.[12] Remember Elinor Ostrom’s emphasis on developing personal relations and providing lots of time for dialogue – ‘cheap talk’.[13] It also takes time for people in different roles to share the same intellectual map or ‘logics’.[12]
Piloting: Whenever possible, pilot interventions to iron out problems. If possible, alpha test them before they are rolled out. Incremental change is generally better than wholesale re-engineering of business processes, which carries greater risk.[14]
Involve service users in the design of interventions at all stages: Co-design not only makes sense, but is supported by experimental evidence.[15][16] The ARC WM approach is to involve public contributors simultaneously in intervention design and evaluation.
Address multiple barriers to implementation: Interventions are more likely to succeed if all material barriers are identified and addressed.[17] Frameworks, such as COM-B and the trans-theoretical model, can help identify ‘lurking’ barriers.
Seek risk-sharing agreements when purchasing equipment: Equipment often fails and repair can be very expensive because the vendor is in a monopoly position. Build in service contracts or even reimbursement by hours of trouble-free service.
Do not overload the intervention description: Be parsimonious by describing the essential features of a service intervention. Consider ‘essential’ and optional elements. Remember, if a compound intervention has n components, and the probability of successful implementation of each is p, then only a proportion p^n of recipients will get the complete bundle.[18]
Encourage innovation: Mentor front-line staff to be the architects of their own destiny, rather than prescribing solutions – try to be an ‘invisible leader’.
Always read the previous literature concerning the proposed intervention: Failure to do so is scientific and management malpractice. Yes, contexts vary, but not to the degree that systematic analysis of previous experience can be jettisoned.
Evaluations: Conduct (and distinguish between) intra-mural (formative) and extra-mural (summative) evaluations. The former are necessary to identify unanticipated problems and probe the limits of what may be achieved.[19][20] Intra-mural evaluations are an integral part of Plan-Do-Study-Act (PDSA) cycles, Total Quality Management (TQM), and so on.
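The parsimony rule above rests on simple arithmetic: if each of n components is implemented independently with probability p, only a proportion p^n of recipients receive the complete bundle. A short sketch (the values of p and n are illustrative):

```python
# Probability that a recipient gets a complete bundle of n components,
# assuming each component is implemented independently with probability p.
# The particular values of p and n below are illustrative only.

def complete_bundle_probability(p: float, n: int) -> float:
    """Return p**n, the chance that all n independent components succeed."""
    return p ** n

# Even highly reliable components compound into substantial failure rates:
# with p = 0.9, a 10-component bundle reaches only about a third of recipients.
for n in (1, 3, 5, 10):
    print(n, round(complete_bundle_probability(0.9, n), 3))
```

This is why a bundle description should be pared back to its essential components: every optional extra added to the definition shrinks the fraction of recipients counted as having received it in full.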

Richard Lilford, ARC WM Director; Krysia Dziedzic, Director of the Impact Accelerator Unit


  1. Wight D, Wimbush E, Jepson R, et al. Six steps in quality intervention development (6SQuID). J Epidemiol Community Health. 2016;70:520-525. 
  2. Craig P, Dieppe P, Macintyre S, Michie S, Nazareth I, Petticrew M. Developing and evaluating complex interventions: the new Medical Research Council guidance. BMJ. 2008; 337: a1655.
  3. O’Cathain A, Croot L, Duncan E, et al. Guidance on how to develop complex interventions to improve health and healthcare. BMJ Open. 2019;9:e029954.
  4. Prochaska JO, Velicer WF. The transtheoretical model of health behaviour change. Am J Health Promot. 1997; 12(1): 38-48.
  5. Michie S, van Stralen MM, West R. The behaviour change wheel: a new method for characterising and designing behaviour change interventions. Implement Sci. 2011;6:42.
  6. Swaithes L, Campbell L, Fowler-Davis S, Dziedzic K. Top Tips. Implementation for Impact. Council for Allied Health Professions Research. 2019.
  7. Lilford RJ. Financial Incentives for Providers of Health Care: The Baggage Handler and the Intensive Care Physician. NIHR CLAHRC West Midlands News Blog. 25 July 2014.
  8. Lilford RJ. Two Things to Remember About Human Nature When Designing Incentives. NIHR CLAHRC West Midlands News Blog. 27 January 2017.
  9. Lilford RJ. Monumental Study of Service Interventions to Drive up the Quality of Care in Low- and Middle- Income Countries. NIHR CLAHRC West Midlands News Blog. 19 October 2018.
  10. Ferlie E, & Shortell S. Improving the quality of health care in the United Kingdom and the United States: a framework for change. Milbank Quart. 2001; 79(2): 281-315.
  11. BBC News. Making Britain Better. 1 July 1998.
  12. Lilford RJ. How Theories Inform our Work in Service Delivery Practice and Research. NIHR CLAHRC West Midlands News Blog. 21 September 2018.
  13. Lilford RJ. Polycentric Organisations. NIHR CLAHRC West Midlands News Blog. 25 July 2014.
  14. Lilford RJ. Introducing Hospital IT Systems – Two Cautionary Tales. NIHR CLAHRC West Midlands News Blog. 4 August 2017.
  15. Lilford RJ, Skrybant M. Our CLAHRC’s Unique Approach to Public and Community Involvement Engagement and Participation (PCIEP). NIHR CLAHRC West Midlands News Blog. 24 August 2018.
  16. Hemming K, Haines TP, Chilton PJ, Girling AJ, Lilford RJ. The Stepped Wedge Cluster Randomised Trial: Rationale, Design, Analysis, and Reporting. BMJ. 2015; 350: h391.
  17. Lilford RJ. It Really Is Possible to Intervene to Reduce Teenage Pregnancy. NIHR CLAHRC West Midlands News Blog. 14 November 2014.
  18. Resar R, Griffin FA, Haraden C, Nolan TW. Using Care Bundles to Improve Health Care Quality. IHI Innovation Series white paper. Cambridge, Massachusetts: Institute for Healthcare Improvement; 2012.
  19. Lilford RJ, Foster J, Pringle M. Evaluating eHealth: How to Make Evaluation More Methodologically Robust. PLoS Med. 2009; 6(11): e1000186.
  20. Lilford RJ. The MRC Framework for Complex Interventions – The Blind Spot. NIHR CLAHRC West Midlands News Blog. 7 June 2019.

Measuring Things That Are Not Themselves Directly Observable

Much of science concerns concepts, not material entities. We talk easily and glibly about wealth, satisfaction, liberal democracy and metropolitan elites. But in science we need to quantify these types of thing. To paraphrase Galileo: if something is not measurable, make it so!

The ARC WM Director was made very aware of definitional and measurement issues while attending the African Research Collaboration on Sepsis research meeting in Dar es Salaam. The Collaboration is funded by an NIHR Global Health Group grant awarded to Jamie Rylance at the Malawi Liverpool Wellcome Research Centre. The meeting covered many fascinating topics. One recurring theme was how to define sepsis. Since 1991, three international conferences have been held to “define sepsis”; the most recent consensus statement (2016) was published in JAMA.[1]

Right off the bat there is a problem in reading the literature, as the challenge of the measurement task is often referred to as that of finding an operational definition, or worse, simply “a definition”. This is a problem because referring to the measurement task as “defining sepsis” can obscure the fact that there is currently a well-specified and seemingly widely accepted conceptual definition of sepsis from Sepsis-3, namely: “Sepsis should be defined as life-threatening organ dysfunction caused by a dysregulated host response to infection.”[1] But, as noted in the same publication, “There are, as yet, no simple and unambiguous clinical criteria or biological, imaging, or laboratory features that uniquely identify a septic patient.” So, to be clear, virtually all of the arguments and difficulties that have arisen after each consensus conference established a conceptual definition concern how to design a measurement procedure, including the selection of a population, a set of observable variables, and the mathematical model that combines them. This got us thinking about the measurement of scientific constructs.

A clearly defined conceptual entity that is not directly observable is often referred to as a latent construct or latent variable. Building on Bollen and Bauldry [2] and Hand,[3] three scenarios are possible when defining a measurement procedure for a latent construct, such as ‘sepsis’:

  1. Where a measurable reference category, or gold standard, for a latent construct exists, such as the molecular classification of intersex or the chemical classification of endocrine disorders. The reference category is then held to be the observable representation of the construct. Other, potentially more easily measured, observable features can be assessed directly as to how accurately and precisely they represent the construct through their relationship to the reference category.
  2. Where theory is “poorly formulated” with regard to how the latent construct exerts its effects, some observable features can be combined in what Hand called a “pragmatic measurement” procedure.[3] The resulting measurement is useful not because you understand what is going on, but only to the extent that it has some ability to predict an outcome of interest, as is the case with the concept of socioeconomic status, the histological grading of tumours, or the APACHE score of acute illness severity. In the absence of a model causally relating the construct to the observed features, the combination of the features into an index can only be said to summarise the observable features rather than represent the underlying construct. In turn, the index is actionable only because of its ability to predict. Finally, as the index is only a summary of observable features, its components cannot be changed or left out without changing the nature of what is being measured.
  3. Where there is a well-specified formal conceptual definition, the task is to identify a pool of exchangeable and observable features that theory would suggest are caused by the construct. By use of a statistical model that includes those observable features, the latent variable that causes them can then be identified. Yet the hypothesised causal relationship between the underlying construct and the observed effects requires a continuing effort to collect evidence supporting the argument that the observed effects are a valid representation of the underlying construct. The example here would be schizophrenia, where the American Psychiatric Association’s definition has allowed the science to proceed. A latent social construct (‘this is a schizophrenic’) is hypothesised to predict the observed clinical manifestations that can be measured. This measurement model is itself a theory that remains open to revision, or to being abandoned entirely, but which can still be employed as a useful tool.
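The third scenario – a latent variable recovered from the observable features it is hypothesised to cause – can be illustrated with a toy simulation. The model, loadings and noise level below are our own illustrative assumptions, not anything taken from the sepsis or schizophrenia literature:

```python
# Toy reflective measurement model: a latent variable causes several
# noisy observable indicators; the first principal component of the
# indicators recovers the latent score. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_indicators = 2000, 5

latent = rng.normal(size=n_subjects)             # unobservable construct
loadings = np.array([0.9, 0.8, 0.7, 0.8, 0.6])   # assumed causal effects
noise = rng.normal(size=(n_subjects, n_indicators))
observed = latent[:, None] * loadings + 0.5 * noise

# Estimate each subject's latent score from the observables alone,
# using the first principal component of the standardised indicators.
z = (observed - observed.mean(0)) / observed.std(0)
_, _, vt = np.linalg.svd(z, full_matrices=False)
scores = z @ vt[0]

# The estimated scores track the true latent variable closely
# (sign is arbitrary, hence the absolute value).
r = abs(np.corrcoef(scores, latent)[0, 1])
print(f"|correlation| with true latent variable: {r:.2f}")
```

The point of the sketch is that the measurement model itself embodies a theory: the loadings encode the hypothesis that the construct causes the indicators, and if that hypothesis fails, the index no longer represents the construct, exactly as argued above.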

In our opinion this ‘third way’ is appropriate for ‘sepsis’. The conceptual definition is not, and cannot be, perfect, but it is based on broad consensus. Once the conceptual definition has crystallised, science can proceed to develop one or more measurement procedures. These procedures may well need to be refined or changed in different settings of care. The research may one day yield a reference standard reflecting basic mechanisms; possibly this point is within reach in the case of schizophrenia, where genome-wide association studies have yielded stunning findings.[4] We think this is the approach the sepsis field should follow. It is more profitable than devoting endless effort to the holy grail of a reference standard for sepsis. It seems reasonable to accept the JAMA proposal for an operational measurement of the construct and, while using it, to continue to collect evidence that supports or refutes the theory represented in the measurement model.

Richard Lilford, ARC WM Director; Timothy Hofer, Professor of General Medicine


  1. Singer M, Deutschman CS, Seymour CW, et al. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016; 315(8): 801-10.
  2. Bollen KA, & Bauldry S. Three Cs in Measurement Models: Causal Indicators, Composite Indicators, and Covariates. Psychol Methods. 2011; 16(3): 265-84.
  3. Hand DJ. Measurement theory and practice: the world through quantification. London: Wiley-Blackwell; 2004.
  4. Lilford RJ. Psychiatry Comes of Age. NIHR CLAHRC West Midlands News Blog. 11 March 2016.

Scientific Writing

The ARC WM Director spends a lot of time writing: grant requests, research papers, reports and, yes, your News Blog. My work is submitted to collaborators and reviewers, while I in turn spend a lot of time reading and reviewing grants and research papers.

So a large part of my job concerns the written word. Writing well is not easy: I find writing hard after 40 years of academic work. Perhaps there are two things that an academic needs: good ideas and the ability to convey these ideas in the written word. Instructions for people making grant applications stress the importance of a clear and compelling narrative. The text must be easy to understand and follow. But what makes for clear English? I think there are two different aspects; the words used and the order, or flow, of the words.

Of these two dimensions, the words themselves and the way they are strung together, the latter strikes me as far more important. The words need to create a clear and compelling narrative; you need to tell a good story. Telling a good story turns on putting the thoughts down in the right order, including all the important ideas, and not digressing into issues that do not contribute to the storyline. It might sound odd for a scientist to draw a lesson from a nursery rhyme, but I often use ‘Little Red Riding Hood’ as an example. Everything that is needed to create suspense and provide meaning is included in the tale, while there is nothing included that does not need to be there. For all we know, Red Riding Hood might have met a hedgehog on her way to Granny’s house – it does not contribute to the storyline; leave it out.

Good writing reflects good thinking. Writing is like painting – an iterative process where a general idea takes form and is crystallised into a meaningful set of objects; ideas in the case of writing, brush strokes in the case of painting. Writing is the act of generating material and organising it in a coherent way.

One of my enduring frustrations concerns the way modern grant applications break up the storyline with their endless boxes to be completed; the people who produce these forms clearly do not read them in a cogent way. Take the application form on which I am currently engaged. It places the section ‘Why is this research needed now?’ – in other words the ‘background’ – after the ‘research design and methods’ section. It takes only a little thought to spot that the research design turns on the study question, which in turn turns on why the research is needed now. As for the current fashion for insisting on both a ‘Scientific Abstract’ and a ‘Plain English Summary’ – this rests on the outmoded notion that a good scientific abstract cannot be explicated in plain English. What rot!

Then we come to the words themselves, an issue that I think is subordinate to the question of how the words are strung together. There are two issues concerning word selection. First, the same word may mean different things to different people. Second, some words may simply lie outside a reader’s vocabulary. The first problem, different meanings for the same words, is much more problematic than the simple issue of vocabulary. The subject of service delivery research is bedevilled by a lack of consensus over the meaning of the words that define its essential constructs,[1] so much so that it has been described as a ‘tower of Babel’.[2] The only way to confront this problem is to refer to a framework into which the essential constructs can be fitted. Then the terms that might cause confusion can be explicated with reference to that framework. For example, the term ‘intervention’ can be very confusing in the context of service delivery research. Sometimes it refers to a clinical intervention, and at other times to a service intervention (designed, for example, to improve the uptake of a clinical intervention). In the recent call for ARC proposals I was seldom sure which of the two was being referred to. Reference to a simple, generic, causal framework for service-level interventions would have cleared up this confusion.[3]

The question of vocabulary is one that is often referred to by public and patient representatives. The meaning of a word may be obscured either because it is a term of art whose meaning is specific to a particular subject or discipline, or because the reader simply has not encountered the word in their general reading. In the former instance, the solution is simply to explain the term or provide a glossary. Words like ‘cluster’, ‘linear’, ‘sensitivity’ and ‘interaction’ have more precise meanings in quantitative science than they do in the vernacular.

It has become very fashionable to criticise the use of dictionary words that are seldom used in common parlance. People who use such words are often criticised for being elitist, and some people use software to identify and thus eliminate obscure words. It is true that many superb communicators, such as Bill Clinton, Tony Blair, John Major and Winston Churchill, avoided obscure words. Nevertheless, there are nuances of meaning and, as Wittgenstein argued, all language is an approximation.[4] So I do not think it is fair to argue that the use of less common words is necessarily a form of elitism or showing off. Sometimes it is an attempt to get as close as possible to what you want to say. Synonyms are approximations; they mean something similar but not exactly the same. To be solicitous is not quite the same as to be attentive. To besmirch is not identical to traduce. To extirpate is not quite the same as to remove. And isn’t egregious a better fit than disgusting in many contexts? In short, do not use less common words simply in order to show off. Equally, do not rush to judgment that the user of a word is trying to show off merely because the word is not widely used.

What is my take-home message? It is this: do not think that writing is easy. In fact, think of writing as a method – a method to help you organise your thoughts. How often have you set out to write a sentence with a clear idea of what you want to say, only to find that the sentence is hard to complete? The sentence is hard to complete because the thought was incomplete. The process of writing helps you to sharpen the underlying logic of what you are trying to say. It is in the very process of writing that your scientific argument takes form. Be prepared to find it hard, to tear up your previous drafts, to worry over the sentences that you use, and to seek constructive criticism. The term ‘writing’ does not describe what we are really doing when we write.

Richard Lilford, ARC WM Director


  1. Lilford RJ. Health Service and Delivery Research – A Subject of Multiple Meanings. NIHR CLAHRC West Midlands News Blog. 30 November 2018.
  2. McKibbon KA, Lokker C, Wilczynski NL, et al. A cross-sectional study of the number and frequency of terms used to refer to knowledge translation in a body of health literature in 2006: a Tower of Babel? Implementation Science. 2010; 5: 16.
  3. Lilford RJ, Chilton PJ, Hemming K, Girling AJ, Taylor CA, Barach P. Evaluating policy and service interventions: framework to guide selection and interpretation of study end points. BMJ. 2010; 341: c4413.
  4. Wittgenstein L. Philosophical Investigations. Oxford: Basil Blackwell Ltd; 1953.