Friday, 26 September 2014

Why most scientists don't take Susan Greenfield seriously


Three years ago I wrote an open letter to Susan Greenfield, asking her to please stop claiming there is a link between autism and use of digital media. It’s never pleasant criticizing a colleague, and since my earlier blogpost I’ve held back from further comment, hoping that she might refrain from making claims about autism, and/or that interest in her views would just die down. But now she's back, reiterating the claims in a new book and TV interview, and I can remain silent no longer.

Greenfield featured last week as the subject of a BBC interview in the series Hard Talk. The interviewer, Stephen Sackur, asked her specifically if she really believed her claims that exposure to modern digital media – the internet, video games, social media – was damaging to children’s development. Greenfield stressed that she did: although she herself had not done direct research on the internet/brain impact link, there was ample research to persuade her it was real. Specifically, she stated: “… in terms of the evidence, anyone is welcome to look at my website, and it’s been up there for the last year. There’s 500 peer-reviewed papers in support of the possible problematic effects.”

A fact-check on the “500 peer-reviewed papers”

So I took a look. The list can be downloaded from here: it’s not exactly a systematic review. I counted 395 distinct items, but only a small proportion are peer-reviewed papers that find evidence of adverse effects from digital technology. There are articles from the Daily Mail and reports by pressure groups. There are some weird things that seem to have found their way onto the list by accident, such as a report on the global tobacco epidemic, and another from the Department for Work and Pensions on differences in life expectancy for 20-, 50- and 80-year-olds. I must confess I did not read these cover to cover, but a link with 'mind change' was hard to see.

Of the 234 peer-reviewed papers, some are reports on internet trends that contain nothing about adverse consequences, some are straightforward studies of neuroplasticity that don’t feature the internet, and others are of uncertain relevance. Overall, 168 papers were concerned with effects of digital technology on behaviour and 15 with effects on the brain. Furthermore, a wide range of topics was included: internet addiction, Facebook and social relations, violent games and aggression, reading on screens vs books, cyberbullying, ‘brain training’ and benefits for visuospatial skills, and effects of multitasking on attention. I could only skim titles and a few abstracts, but I did not come away feeling there was overwhelming evidence of adverse consequences of these new technologies. Rather, the papers covered a mix of risks and benefits with varying quality of evidence. There is, for instance, a massive literature on Facebook’s influence on self-esteem and social networks, but much of it talks of benefits. The better studies also noted the difficulty of inferring causation from correlational data: for instance, it’s possible that an addictive attitude to a computer game is as much a consequence as a cause of problems with everyday life.

Greenfield’s specific contribution to this topic is to link it up with what we know about neuroplasticity, and she has speculated that attentional mechanisms may be disrupted by effects that games have on neurotransmitter levels, that empathy and social relationships can be damaged when computers/games take us away from interacting with people, and that too much focus on a two-dimensional screen may affect perceptual and cognitive development in children. This is all potentially important and a worthy topic for research, but is it reasonable, as she has done, to liken the threat to that posed by climate change? As Stephen Sackur pointed out, the evidence from neuroplasticity would indicate that if the brain changes in response to its environment, then we should be able to reverse an effect by a change in environment. I cannot resist also pointing out that if it is detrimental to perform socially-isolated activities with a two-dimensional surface rather than interacting with real people in a 3D world, then we should be discouraging children from reading books.

Digital media use as a risk factor for autism

My main concern is the topic that motivated me to write to Greenfield in the first place: autism. The arguments I put forward in 2011 still stand: it is simply irresponsible to indulge in scaremongering on the basis of scanty evidence, particularly when the case lacks logical consistency.

In the Hard Talk interview*, Greenfield attempted to clarify her position: “You have to be careful, because what I say is autistic spectrum disorder. That’s not the same as autism.” Yet this is no clarification at all, given that DSM-5, the latest edition of the diagnostic manual, states: “Individuals with a well-established DSM-IV diagnosis of autistic disorder, Asperger’s disorder, or pervasive developmental disorder not otherwise specified should be given the diagnosis of autism spectrum disorder (ASD).” Greenfield has had a few years to check her facts, yet seems to be under the impression that ASD is some kind of mild impairment like social gaucheness, quite distinct from a clinically significant condition.

In an interview in the Observer (see here**), Greenfield was challenged by the interviewer, Andrew Anthony, who mentioned my earlier plea to her to stop talking about autism. She replied that she was not alone in making the link and that there were published papers making the same case. She recommended that if I wanted to dissent, I should “slug it out” with the authors of those papers. That’s an invitation too good to resist, so I searched the list from her website to find any papers that mentioned autism. There were four (see reference list below):

We need not linger on the Hertz-Picciotto & Delwiche paper, because it focuses on changes in rates of autism diagnosis and does not mention internet use or screen time. The rise is a topic of considerable interest about which a great deal has been written, and numerous hypotheses have been put forward to explain it. Computer use is not generally seen as a plausible hypothesis because symptoms of ASD are typically evident by 2 years of age, long before children are introduced to computers. (Use of tablets with very young children is increasing, but would not have been a factor for the time period studied, 1990-2006).

The Finkenauer et al paper is a study of internet use and compulsive internet use in married couples, assessed using self-report questionnaires. Frequency of internet use was not related to autistic traits, but compulsive internet use was. The authors did not conclude that internet use causes autistic traits – that would be a bit weird in a sample of adults who grew up before the internet was widespread. Instead, they note that if you have autistic traits, there is an increased likelihood that internet use could become problematic. The paper is cautious in its conclusions and does not support Greenfield’s thesis that the internet is a risk factor for autism. On the contrary, it emphasises the possibility that people who develop an addictive relationship with the internet may differ from others in pre-existing personality traits.

So on to Waldman et al, who consider whether television causes autism. Yes, that’s right: this is not about internet use. It’s about the humble TV. The next thing to note is that this is an unpublished report, not a peer-reviewed paper. So I checked out the authors to see if they had published anything on this, and found a published paper with the intriguing title “Autism Prevalence and Precipitation Rates in California, Oregon, and Washington Counties”. Precipitation? Like, rainfall? Yup! The authors did a regression analysis and concluded that there was a statistically significant association between the amount of rainfall in a specific county and the frequency of autism diagnoses. They then went on to consider why this might be, and came up with an ingenious explanation: when it is wet, children can’t play outside. So they watch TV. And develop autism.

In the unpublished report, the theme is developed further, by linking rate of precipitation to household subscription to cable TV. The conclusion:

“Our precipitation tests indicate that just under forty percent of autism diagnoses in the three states studied is the result of television watching due to precipitation, while our cable tests indicate that approximately seventeen percent of the growth in autism in California and Pennsylvania during the 1970s and 1980s is due to the growth of cable television.”

One can only breathe a sigh of relief that no peer-reviewed journal appears to have been willing to publish this study.

But wait, there is one more study in the list provided by Greenfield. Will this be the clincher? It's by Maxson McDowell, a Jungian therapist who uses case descriptions to formulate a hypothesis that relates autism to “failure to acquire, or retain, the image of the mother’s eyes”. I was initially puzzled at the inclusion of this paper, because the published version blames non-maternal childcare rather than computers, but there is an updated version online which does make a kind of link – though again not with the internet: “The image-of-the-eyes hypotheses suggest that this increase [in autism diagnoses] may be due to the increased use, in early infancy, of non-maternal childcare including television and video.” So: no data, just anecdote and speculation designed to make working mothers feel it’s their fault that their child has autism.

Greenfield's research track record

Stephen Sackur asked Greenfield why, if she thought this topic so important, she hadn’t done research on this topic herself. She replied that as a neuroscientist, she couldn't do everything, that research costs money, and that if someone would like to give her some money, she could do such research.

But someone did give her some money. According to this website, in 2005 she received an award of $2 million from the Templeton Foundation to form the Oxford Centre for Science of the Mind, which is “dedicated to cutting-edge interdisciplinary work drawing on pharmacology, human anatomy, physiology, neuroscience, theology and philosophy”. A description of the research that would be done by the centre can be found here. Most scientists will have experienced failure to achieve all of the goals that they state in their grant proposals – there are numerous factors outside one's control that can mess up the best-laid plans. Nevertheless, the mismatch between what is promised on the website and evidence of achievement through publications is striking, and perhaps explains why further funding has apparently not been forthcoming.

One of the more surprising comments from Greenfield came when Sackur mentioned criticism of her claims by Ben Goldacre. “He’s not a scientist,” she retorted, “he’s a journalist”. Twitter went into a state of confusion, wondering whether this was a deliberate insult or pure ignorance. Goldacre himself tweeted: “My publication rate is not stellar, as a part time early career researcher transferring across from clinical medicine, but I think even my peer reviewed publication rate is better than Professor Greenfield's over the past year.”

This is an interesting point. The media repeatedly describe Greenfield as a “leading neuroscientist”, yet this is not how she is currently perceived among her peer group. In science, you establish your reputation by publishing in the peer-reviewed literature. A Web of Science search for the period 2010-2014 found thirteen papers in peer-reviewed journals authored or co-authored by Greenfield, ten of which reported new empirical data. This is not negligible, but for a five-year period, it is not stellar - and represents a substantial fall-off from her earlier productivity.

But quality is more important than quantity, and maybe, you think, her work is influential in the field. To check that out, I did a Web of Science search for papers published from a UK address between 2005 and 2014 with the topic specified as (Alzheimer* OR Parkinson’s OR neurodegener*) AND brain. (The * is a wildcard, so this will capture all words starting this way.) I used a 10-year period because citations (a rough measure of how influential the work is) take time to accrue. This yielded over 3,000 articles, which I rank-ordered by number of citations. The first paper authored by Greenfield was 956th in this list: “Non-hydrolytic functions of acetylcholinesterase - The significance of C-terminal peptides”, with 21 citations.
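For readers unfamiliar with the truncation wildcard, here is a rough sketch of what that topic query captures, translated into Python regular expressions. The example titles are invented for illustration; this is only an approximation of how Web of Science expands the terms, not its actual matching logic.

```python
import re

# "Alzheimer* OR Parkinson's OR neurodegener*" — the * is a right-truncation
# wildcard, so each term matches any word beginning with that stem.
topic = re.compile(r"\b(alzheimer\w*|parkinson\W?s|neurodegener\w*)", re.IGNORECASE)
brain = re.compile(r"\bbrain", re.IGNORECASE)

def matches_query(text):
    # Both halves of the AND must appear somewhere in the record
    return bool(topic.search(text)) and bool(brain.search(text))

# Invented titles, for illustration only:
print(matches_query("Neurodegenerative changes in the ageing brain"))  # True
print(matches_query("Parkinson's disease and brain stimulation"))      # True
print(matches_query("Dopamine signalling in the striatum"))            # False
```

So "neurodegener*" sweeps up neurodegeneration, neurodegenerative and so on, which is why the search casts such a wide net.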

Her reputation appears to be founded on two things: her earlier work in basic neuroscience in the 1980s and 1990s, which was well-cited, and her high profile as a public figure. Sadly, she now seems to be totally disconnected from mainstream science.

If Greenfield seriously believes in what she is saying, and internet use by children is causing major developmental difficulties, then this is a big deal. So why doesn’t she spend some time at IMFAR, the biggest international conference on autism (and autism spectrum disorder!) that there is? She could try presenting her ideas and see what feedback she gets. Better still, she could listen to other talks, get updated on current research in this area, and talk with people with autism/ASD and their families.

*For a transcript of the Hard Talk interview see here
**Thanks to Alan Rew for providing the link to this article


Finkenauer, C., Pollman, M.M.H., Begeer, S., & Kerkhof, P. (2012). Examining the link between autistic traits and compulsive Internet use in a non-clinical sample. Journal of Autism and Developmental Disorders, 42, 2252-2256. doi:10.1007/s10803-012-1465-4

Hertz-Picciotto, I., & Delwiche, L. (2009). The rise in autism and the role of age at diagnosis. Epidemiology, 20(1), 84-90. doi:10.1097/EDE.0b013e3181902d15.

McDowell, M. (2004). Autism, early narcissistic injury and self-organization: a role for the image of the mother's eyes? Journal of Analytical Psychology, 49(4), 495-519. doi:10.1111/j.0021-8774.2004.00481.x

Waldman, M., Nicholson, S., Adilov, N., & Williams, J. (2008). Autism prevalence and precipitation rates in California, Oregon, and Washington counties. Archives of Pediatrics & Adolescent Medicine, 162, 1026-1034.

Waldman, M., Nicholson, S., & Adilov, N. (2006). Does television cause autism? (Working Paper No. 12632). Cambridge, MA: National Bureau of Economic Research.

Bishopblog catalogue (updated 26th Sept 2014)


Those of you who follow this blog may have noticed a lack of thematic coherence. I write about whatever is exercising my mind at the time, which can range from technical aspects of statistics to the design of bathroom taps. I decided it might be helpful to introduce a bit of order into this chaotic melange, so here is a catalogue of posts by topic.

Language impairment, dyslexia and related disorders
The common childhood disorders that have been left out in the cold (1 Dec 2010) What's in a name? (18 Dec 2010) Neuroprognosis in dyslexia (22 Dec 2010) Where commercial and clinical interests collide: Auditory processing disorder (6 Mar 2011) Auditory processing disorder (30 Mar 2011) Special educational needs: will they be met by the Green paper proposals? (9 Apr 2011) Is poor parenting really to blame for children's school problems? (3 Jun 2011) Early intervention: what's not to like? (1 Sep 2011) Lies, damned lies and spin (15 Oct 2011) A message to the world (31 Oct 2011) Vitamins, genes and language (13 Nov 2011) Neuroscientific interventions for dyslexia: red flags (24 Feb 2012) Phonics screening: sense and sensibility (3 Apr 2012) What Chomsky doesn't get about child language (3 Sept 2012) Data from the phonics screen (1 Oct 2012) Auditory processing disorder: schisms and skirmishes (27 Oct 2012) High-impact journals (Action video games and dyslexia: critique) (10 Mar 2013) Overhyped genetic findings: the case of dyslexia (16 Jun 2013) The arcuate fasciculus and word learning (11 Aug 2013) Changing children's brains (17 Aug 2013) Raising awareness of language learning impairments (26 Sep 2013) Good and bad news on the phonics screen (5 Oct 2013) What is educational neuroscience? (25 Jan 2014) Parent talk and child language (17 Feb 2014) My thoughts on the dyslexia debate (20 Mar 2014) Labels for unexplained language difficulties in children (23 Aug 2014) International reading comparisons: is England really doing so poorly? (14 Sep 2014)

Autism
Autism diagnosis in cultural context (16 May 2011) Are our ‘gold standard’ autism diagnostic instruments fit for purpose? (30 May 2011) How common is autism? (7 Jun 2011) Autism and hypersystematising parents (21 Jun 2011) An open letter to Baroness Susan Greenfield (4 Aug 2011) Susan Greenfield and autistic spectrum disorder: was she misrepresented? (12 Aug 2011) Psychoanalytic treatment for autism: Interviews with French analysts (23 Jan 2012) The ‘autism epidemic’ and diagnostic substitution (4 Jun 2012) How wishful thinking is damaging Peta's cause (9 June 2014)

Developmental disorders/paediatrics
The hidden cost of neglected tropical diseases (25 Nov 2010) The National Children's Study: a view from across the pond (25 Jun 2011) The kids are all right in daycare (14 Sep 2011) Moderate drinking in pregnancy: toxic or benign? (21 Nov 2012) Changing the landscape of psychiatric research (11 May 2014)

Genetics
Where does the myth of a gene for things like intelligence come from? (9 Sep 2010) Genes for optimism, dyslexia and obesity and other mythical beasts (10 Sep 2010) The X and Y of sex differences (11 May 2011) Review of How Genes Influence Behaviour (5 Jun 2011) Getting genetic effect sizes in perspective (20 Apr 2012) Moderate drinking in pregnancy: toxic or benign? (21 Nov 2012) Genes, brains and lateralisation (22 Dec 2012) Genetic variation and neuroimaging (11 Jan 2013) Have we become slower and dumber? (15 May 2013) Overhyped genetic findings: the case of dyslexia (16 Jun 2013)

Neuroscience
Neuroprognosis in dyslexia (22 Dec 2010) Brain scans show that… (11 Jun 2011) Time for neuroimaging (and PNAS) to clean up its act (5 Mar 2012) Neuronal migration in language learning impairments (2 May 2012) Sharing of MRI datasets (6 May 2012) Genetic variation and neuroimaging (11 Jan 2013) The arcuate fasciculus and word learning (11 Aug 2013) Changing children's brains (17 Aug 2013) What is educational neuroscience? (25 Jan 2014) Changing the landscape of psychiatric research (11 May 2014)

Statistics
Book review: biography of Richard Doll (5 Jun 2010) Book review: the Invisible Gorilla (30 Jun 2010) The difference between p < .05 and a screening test (23 Jul 2010) Three ways to improve cognitive test scores without intervention (14 Aug 2010) A short nerdy post about the use of percentiles (13 Apr 2011) The joys of inventing data (5 Oct 2011) Getting genetic effect sizes in perspective (20 Apr 2012) Causal models of developmental disorders: the perils of correlational data (24 Jun 2012) Data from the phonics screen (1 Oct 2012) Moderate drinking in pregnancy: toxic or benign? (21 Nov 2012) Flaky chocolate and the New England Journal of Medicine (13 Nov 2012) Interpreting unexpected significant results (7 June 2013) Data analysis: Ten tips I wish I'd known earlier (18 Apr 2014) Data sharing: exciting but scary (26 May 2014) Percentages, quasi-statistics and bad arguments (21 July 2014)

Journalism/science communication
Orwellian prize for scientific misrepresentation (1 Jun 2010) Journalists and the 'scientific breakthrough' (13 Jun 2010) Science journal editors: a taxonomy (28 Sep 2010) Orwellian prize for journalistic misrepresentation: an update (29 Jan 2011) Academic publishing: why isn't psychology like physics? (26 Feb 2011) Scientific communication: the Comment option (25 May 2011) Accentuate the negative (26 Oct 2011) Publishers, psychological tests and greed (30 Dec 2011) Time for academics to withdraw free labour (7 Jan 2012) Novelty, interest and replicability (19 Jan 2012) 2011 Orwellian Prize for Journalistic Misrepresentation (29 Jan 2012) Time for neuroimaging (and PNAS) to clean up its act (5 Mar 2012) Communicating science in the age of the internet (13 Jul 2012) How to bury your academic writing (26 Aug 2012) High-impact journals: where newsworthiness trumps methodology (10 Mar 2013) Blogging as post-publication peer review (21 Mar 2013) A short rant about numbered journal references (5 Apr 2013) Schizophrenia and child abuse in the media (26 May 2013) Why we need pre-registration (6 Jul 2013) On the need for responsible reporting of research (10 Oct 2013) A New Year's letter to academic publishers (4 Jan 2014)

Social Media
A gentle introduction to Twitter for the apprehensive academic (14 Jun 2011) Your Twitter Profile: The Importance of Not Being Earnest (19 Nov 2011) Will I still be tweeting in 2013? (2 Jan 2012) Blogging in the service of science (10 Mar 2012) Blogging as post-publication peer review (21 Mar 2013) The impact of blogging on reputation (27 Dec 2013) WeSpeechies: A meeting point on Twitter (12 Apr 2014)

Academic life
An exciting day in the life of a scientist (24 Jun 2010) How our current reward structures have distorted and damaged science (6 Aug 2010) The challenge for science: speech by Colin Blakemore (14 Oct 2010) When ethics regulations have unethical consequences (14 Dec 2010) A day working from home (23 Dec 2010) Should we ration research grant applications? (8 Jan 2011) The one hour lecture (11 Mar 2011) The expansion of research regulators (20 Mar 2011) Should we ever fight lies with lies? (19 Jun 2011) How to survive in psychological research (13 Jul 2011) So you want to be a research assistant? (25 Aug 2011) NHS research ethics procedures: a modern-day Circumlocution Office (18 Dec 2011) The REF: a monster that sucks time and money from academic institutions (20 Mar 2012) The ultimate email auto-response (12 Apr 2012) Well, this should be easy…. (21 May 2012) Journal impact factors and REF2014 (19 Jan 2013) An alternative to REF2014 (26 Jan 2013) Postgraduate education: time for a rethink (9 Feb 2013) High-impact journals: where newsworthiness trumps methodology (10 Mar 2013) Ten things that can sink a grant proposal (19 Mar 2013) Blogging as post-publication peer review (21 Mar 2013) The academic backlog (9 May 2013) Research fraud: More scrutiny by administrators is not the answer (17 Jun 2013) Discussion meeting vs conference: in praise of slower science (21 Jun 2013) Why we need pre-registration (6 Jul 2013) Evaluate, evaluate, evaluate (12 Sep 2013) High time to revise the PhD thesis format (9 Oct 2013) The Matthew effect and REF2014 (15 Oct 2013) Pressures against cumulative research (9 Jan 2014) Why does so much research go unpublished? (12 Jan 2014) The University as big business: the case of King's College London (18 June 2014) Should vice-chancellors earn more than the prime minister? (12 July 2014) Replication and reputation: Whose career matters? (29 Aug 2014)

Celebrity scientists/quackery
Three ways to improve cognitive test scores without intervention (14 Aug 2010) What does it take to become a Fellow of the RSM? (24 Jul 2011) An open letter to Baroness Susan Greenfield (4 Aug 2011) Susan Greenfield and autistic spectrum disorder: was she misrepresented? (12 Aug 2011) How to become a celebrity scientific expert (12 Sep 2011) The kids are all right in daycare (14 Sep 2011) The weird world of US ethics regulation (25 Nov 2011) Pioneering treatment or quackery? How to decide (4 Dec 2011) Psychoanalytic treatment for autism: Interviews with French analysts (23 Jan 2012) Neuroscientific interventions for dyslexia: red flags (24 Feb 2012)

Women in science
Academic mobbing in cyberspace (30 May 2010) What works for women: some useful links (12 Jan 2011) The burqua ban: what's a liberal response (21 Apr 2011) C'mon sisters! Speak out! (28 Mar 2012) Psychology: where are all the men? (5 Nov 2012) Men! what you can do to improve the lot of women (25 Feb 2014) Should Rennard be reinstated? (1 June 2014)

Politics and Religion
Lies, damned lies and spin (15 Oct 2011) A letter to Nick Clegg from an ex liberal democrat (11 Mar 2012) BBC's 'extensive coverage' of the NHS bill (9 Apr 2012) Schoolgirls' health put at risk by Catholic view on vaccination (30 Jun 2012) A letter to Boris Johnson (30 Nov 2013) How the government spins a crisis (floods) (1 Jan 2014)

Humour and miscellaneous
Orwellian prize for scientific misrepresentation (1 Jun 2010) An exciting day in the life of a scientist (24 Jun 2010) Science journal editors: a taxonomy (28 Sep 2010) Parasites, pangolins and peer review (26 Nov 2010) A day working from home (23 Dec 2010) The one hour lecture (11 Mar 2011) The expansion of research regulators (20 Mar 2011) Scientific communication: the Comment option (25 May 2011) How to survive in psychological research (13 Jul 2011) Your Twitter Profile: The Importance of Not Being Earnest (19 Nov 2011) 2011 Orwellian Prize for Journalistic Misrepresentation (29 Jan 2012) The ultimate email auto-response (12 Apr 2012) Well, this should be easy…. (21 May 2012) The bewildering bathroom challenge (19 Jul 2012) Are Starbucks hiding their profits on the planet Vulcan? (15 Nov 2012) Forget the Tower of Hanoi (11 Apr 2013) How do you communicate with a communications company? (30 Mar 2014) Noah: A film review from 32,000 ft (28 July 2014)

Sunday, 14 September 2014

International reading comparisons: is England really doing so poorly?

I was surprised to see a piece in the Guardian stating that "England is one of the most unequal countries for children's reading levels, second in the EU only to Romania". This claim was made in an article about a new campaign, Read On, Get On, that was launched this week.

The campaign sounds great. A consortium of organizations and individuals have got together to address the problem of poor reading: the tail in the distribution of reading ability that stubbornly remains, despite efforts to reduce it. Poor readers are particularly likely to come from deprived backgrounds, and their disadvantage will be perpetuated, as they are at high risk of leaving school with few qualifications and dismal employment prospects. I was pleased to see that the campaign has recognized weak language skills in young children as an important predictor of later reading difficulties. The research evidence has been there for years (Kamhi & Catts, 2011), but it has taken ages to percolate into practice, and few teachers have any training in language development.

But! You knew there was a 'but' coming. It concerns the way the campaign has used evidence. They've mostly based what they say on the massive Progress in International Reading Literacy Study (PIRLS), and my impression is that they have exaggerated the negative in order to create a sense of urgency.

I took a look at the Read On Get On report. The language is emotive and all about blame: "The UK has a sorry history of educational inequality. For many children, this country provides enormous and rich opportunities. At the top end of our education system we rival the best in the world. But it has long been recognised that we let down too many children who are allowed to fall behind. Many of them are condemned to restricted horizons and limited opportunities." I was particularly interested in the international comparisons, with claims such as "The UK is one of the most unfair countries in the developed world."

So how were such conclusions reached? Read On, Get On commissioned the National Foundation for Educational Research (NFER) to compare levels of reading attainment in the UK with those of other developed countries, with a focus on children approaching the last year of primary schooling.

Given the negative tone of "letting down children", it was interesting to read that "In terms of its overall average performance, NFER’s research found England to be one of the best performing countries." I put that in bold because, somehow, it didn't make it into the Guardian, so is easy to miss. It is in any case dismissed by the NFER report in a sentence: "As a wealthy country with a good education system, that is to be expected."

The evidence of the parlous state of UK education came from consideration of the range of scores from best (95th percentile) to worst (5th percentile) for children in England. Now this is where I think it gets a bit dishonest. Suppose there were a massive improvement in scores for a subset of children, such that the mean and highest scores went up, but with the lowest scoring still doing poorly; presumably, the shrill voices would get even shriller, because the range would extend even further. This seems a tad unfair: yes, it makes sense to stress that the average attainment doesn't capture important things, and that a high average is not a cause for congratulation if it is associated with a long straggly tail of poor achievers. But if we want to focus on poor achievers, let's look at the proportion of children scoring at a low level, and not at some notional 'gap' between best and worst, which is then translated into 'years' to make it sound even more dramatic.
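The point can be illustrated with a toy simulation (invented scores, not real PIRLS data): lift only the top half of a distribution, and the 5th-to-95th percentile gap widens even though the proportion of genuinely poor scorers is untouched.

```python
import random
import statistics

random.seed(1)

# Hypothetical reading scores: simulate an "improvement" that lifts
# only the top half of the distribution.
before = [random.gauss(500, 70) for _ in range(10000)]
median = statistics.median(before)
after = [s + 40 if s > median else s for s in before]

def percentile(scores, p):
    s = sorted(scores)
    return s[int(p / 100 * (len(s) - 1))]

def spread(scores):
    # The 5th-to-95th percentile "gap" used in the NFER comparison
    return percentile(scores, 95) - percentile(scores, 5)

def prop_below(scores, cutoff=400):
    # Alternative metric: proportion of genuinely low scorers
    return sum(s < cutoff for s in scores) / len(scores)

print(spread(after) > spread(before))           # True: the gap has widened...
print(prop_below(after) == prop_below(before))  # True: ...yet the low tail is unchanged
```

By the gap metric, this country just got "more unequal"; by the proportion-of-low-scorers metric, nothing has changed for the struggling readers at all.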

The question is how England compares with other countries if we just look at the absolute level of the low score corresponding to the 5th percentile. The answer: not brilliant – 16th out of the 24 countries featured in the subset considered by the NFER survey. But, rather surprisingly, we find that the NFER survey excluded New Zealand and Australia, both of which did worse than England.

So do we notice anything about that? Well, in all three countries, children are learning English, a language widely recognized as creating difficulty for young readers because of the lack of consistent mapping between letters (orthography) and sounds (phonology). In fact, when looking for sources for this blogpost, I happened upon a report from an earlier tranche of PIRLS data, which examined this very topic by assigning an 'orthographic complexity' score to different languages. The authors found a correlation of .6 between the range of scores (5th to 95th percentile again, this time for 2003 data) and a measure of the complexity of the orthography. I applied their orthography rating scale to the 2011 PIRLS data and found that, once again, the range of reading scores was significantly related to orthography (r = .72), with the highest ranges for those countries where English was spoken – see Figure below. (NB it would be very interesting to extend this to include additional countries: I was limited to the languages with an orthographic rating from the earlier report.)
PIRLS 2011 data: range of reading attainment vs. orthographic complexity
International comparisons have their uses, and in this case they seem to suggest that a complex orthography widens the gap between the best and worst readers. However, they still need to be treated with caution. I haven't had time to delve into PIRLS in any detail, but just looking at how samples of children were selected, it is clear that criteria varied. In particular, there were differences from country to country in terms of whether they excluded children who were non-native speakers of the test language, and whether they included those with special educational needs. Romania, which had the most extreme range of scores between best and worst, excluded nobody. Finland, which tends to do well in these surveys, excluded "students with dyslexia or other severe linguistic disorders, intellectually disabled students, functionally disabled students, and students with limited proficiency in the assessment language." England excluded "students with significant special educational needs". Needless to say, all of these criteria are open to interpretation.

I'm not saying that the tail of the distribution is unimportant. Yes, of course, we need to do our best to ensure that all children are competent readers, as we know that poor literacy is a major handicap to a person's prospects for employment, education and prosperity. But let's stop beating ourselves over the head about this. Research indicates that the reasons for children's literacy problems are complex and will be influenced by the writing system they have to learn (Ziegler & Goswami, 2005) and constitutional factors (Asbury & Plomin, 2013), as well as by the home and school environment: we still have only a poor grasp of how these different factors interact. Until we gain a better understanding, we should of course put in our best efforts to help those children who are struggling. The enthusiasm and good intentions of those behind Read On, Get On are to be welcomed, but their spin on the PIRLS data is unhelpful in implying that only social factors are important.

Asbury, K., & Plomin, R. (2013). G is for genes: The impact of genetics on education and achievement. Chichester: Wiley Blackwell.

Kamhi, A. G., & Catts, H. W. (2011). Language and reading disabilities (3rd ed.). Allyn & Bacon.

Ziegler, J. C., & Goswami, U. (2005). Reading acquisition, developmental dyslexia, and skilled reading across languages: A psycholinguistic grain size theory. Psychological Bulletin, 131(1), 3-29. PMID: 15631549

Friday, 29 August 2014

Replication and reputation: Whose career matters?

Some people are really uncomfortable with the idea that psychology studies should be replicated. The most striking example is Jason Mitchell, Professor at Harvard University, who famously remarked in an essay that "unsuccessful experiments have no meaningful scientific value".

Hard on his heels now comes UCLA's Matthew Lieberman, who has published a piece in Edge on the replication crisis. Lieberman is careful to point out that he thinks we need replication. Indeed, he thinks no initial study should be taken at face value – it is, according to him, just a scientific anecdote, and we'll always need more data. He emphasises: "Anyone who says that replication isn't absolutely essential to the success of science is pretty crazy on that issue, as far as I'm concerned."

It seems that what he doesn't like, though, is how people are reporting their replication attempts, especially when they fail to confirm the initial finding. "There's a lot of stuff going on", he complains, "where there's now people making their careers out of trying to take down other people's careers". He goes on to say that replications aren't unbiased, that people often go into them trying to shoot down the original findings, and that this can lead to bad science:
"Making a public process of replication, and a group deciding who replicates what they replicate, only replicating the most counterintuitive findings, only replicating things that tend to be cheap and easy to replicate, tends to put a target on certain people's heads and not others. I don't think that's very good science that we, as a group, should sanction."
It's perhaps not surprising that a social neuroscientist should be interested in the social consequences of replication, but I would take issue with Lieberman's analysis. His depiction of the power of the non-replicators seems misguided. You do a replication to move up in your career? Seriously? Has Lieberman ever come across anyone who was offered a job because they failed to replicate someone else? Has he ever tried to publish a replication in a high-impact outlet? Give it a try and you'll soon be told it is not novel enough. Many of the most famous journals are notorious for turning down failures to replicate studies that they themselves published.  Lieberman is correct in noting that failures to replicate can get a lot of attention on Twitter, but a strong Twitter following is not going to recommend you to a hiring committee (and, btw, that Kardashian index paper was a parody).

Lieberman makes much of the career penalty for those whose work is not replicated. But anyone who has been following the literature on replication will be aware of just how common non-replication is (see e.g. Ioannidis, 2005). There are various possible reasons for this, and nobody with any sense would count it against someone if they do a well-conducted and adequately powered study that does not replicate. What does count against them is if they start putting forward implausible reasons why the replication must be wrong and they must be right. If they can show the replicators did a bad job, their reputation can only be enhanced. But they'll be in a weak position if their original study was not methodologically strong and should not have been submitted for publication without further evidence to support it. In other words, reputation and career prospects will, at the end of the day, come down to the scientific rigour of a person's research, not whether a particular result did or did not cross a threshold of p < .05.

The problem with failures to replicate is that they can arise for at least four reasons, and it can be hard to know which applies in an individual case. One reason, emphasized by Lieberman, is that the replicator may be incompetent or biased. But a positive feature of the group replication efforts that Lieberman so dislikes is that the methods and data are entirely open, allowing anyone who wants to evaluate them to do so – see for instance this example. Others have challenged replication failures on the grounds that there are crucial aspects of the methodology that only the original experimenter knows about. To them I recommend making all aspects of their methods explicit.

A second possibility is that a scientist does a well-designed study whose results don't replicate because all results are influenced by randomness – this could mean that the original effect was a false positive, or that the replication was a false negative. The truth of the matter will only be settled by more, rather than less, replication, but there is research showing that the odds are that an initial large effect will be smaller on replication, and may disappear altogether – the so-called Winner's Curse (Button et al., 2013).
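The Winner's Curse is easy to demonstrate with a quick simulation. Suppose the true effect is modest (d = 0.3) and studies are small (20 per group) – numbers I've chosen for illustration, not taken from Button et al. If we keep only the studies that reach p < .05, the effect sizes that survive are grossly inflated relative to the truth:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_d, n, n_studies = 0.3, 20, 10000  # modest true effect, small samples

effects, pvals = [], []
for _ in range(n_studies):
    # Two-sample study with a true standardized effect of 0.3
    a = rng.normal(true_d, 1, n)
    b = rng.normal(0, 1, n)
    t, p = stats.ttest_ind(a, b)
    # Cohen's d from the observed data
    d = (a.mean() - b.mean()) / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    effects.append(d)
    pvals.append(p)

effects, pvals = np.array(effects), np.array(pvals)
sig = pvals < .05  # the studies a journal would call 'positive'

print(f"Mean effect, all studies:        {effects.mean():.2f}")
print(f"Mean effect, 'significant' only: {effects[sig].mean():.2f}")
```

With these settings only a minority of studies come out significant, but those that do show a mean effect more than double the true one – exactly why a replication of a striking initial finding will usually show a smaller effect, even when everyone involved is competent and honest.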

The third reason why someone's work doesn't replicate is that they are a charlatan or fraudster who has learned that they can have a very successful career by telling lies. We all hope such people are very rare, and we all agree they should be stopped. Nobody would assume that someone falls into this category just because a study fails to replicate.

The fourth reason for lack of replication arises when researchers are badly trained and simply don't understand about probability theory, and so engage in various questionable research practices to tweak their data to arrive at something 'significant'. Although they are innocent of bad intentions, they stifle scientific progress by cluttering the field with nonreplicable results. Unfortunately, such practices are common and often not recognised as a problem, though there is growing awareness of the need to tackle them.
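One of the commonest such practices is 'optional stopping': peeking at the data as subjects accumulate and stopping as soon as p dips below .05. A simple simulation (with illustrative numbers of my own choosing) shows how badly this inflates the false positive rate even when there is no effect at all:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims = 5000
checkpoints = range(10, 101, 10)  # peek after every 10 subjects per group

def significant(peeking):
    # No true effect: both groups drawn from the same distribution
    a, b = rng.normal(0, 1, 100), rng.normal(0, 1, 100)
    if peeking:
        # Stop and declare success at the first checkpoint where p < .05
        return any(stats.ttest_ind(a[:n], b[:n]).pvalue < .05
                   for n in checkpoints)
    # Honest analysis: one test at the planned final sample size
    return stats.ttest_ind(a, b).pvalue < .05

honest = np.mean([significant(False) for _ in range(n_sims)])
hacked = np.mean([significant(True) for _ in range(n_sims)])
print(f"False positive rate, fixed n:           {honest:.3f}")
print(f"False positive rate, optional stopping: {hacked:.3f}")
```

The honest analysis produces 'significant' results about 5% of the time, as it should; the peeking analysis produces them several times as often. A researcher who does this without understanding why it is a problem will, quite innocently, fill the literature with findings that nobody can replicate.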

There are repeated references in Lieberman's article to people's careers: not just the people who do the replications ("trying to create a career out of a failure to replicate someone") but also the careers of those who aren't replicated ("When I got into the field it didn't seem like there were any career-threatening giant debates going on"). There is, however, another group whose careers we should consider: graduate students and postdocs who may try to build on published work only to find that the original results don't stand up. Publication of non-replicable findings leads to enormous waste in science and demoralization of the next generation. One reason why I take reproducibility initiatives seriously is because I've seen too many young people demoralized after finding that the exciting effect they want to investigate is actually an illusion.

While I can sympathize with Lieberman's plea for a friendlier and more cooperative tone to the debate, replication is now firmly on the agenda, and it is inevitable that there will be increasing numbers of cases of replication failure.

So suppose I conduct a methodologically sound study that fails to replicate a colleague's work. Should I hide my study away for fear of rocking the boat or damaging someone's career? Have a quiet word with the author of the original piece? Rather than holding back for fear of giving offence, it is vital that we make our data and methods public. For a great example of how to do this in a rigorous yet civilized fashion I recommend this blogpost by Betsy Levy Paluck.

In short, we need to develop a more mature understanding that the move towards more replication is not about making or breaking careers: it is about providing an opportunity to move science forward, improve our methodology and establish which results are reliable (Ioannidis, 2012). And this can only help the careers of those who come behind us.

Button, K., Ioannidis, J., Mokrysz, C., Nosek, B., Flint, J., Robinson, E., & Munafò, M. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(6), 365-376. DOI: 10.1038/nrn3475

Ioannidis, J. (2005). Contradicted and initially stronger effects in highly cited clinical research. JAMA, 294(2). DOI: 10.1001/jama.294.2.218

Ioannidis, J. (2012). Why science is not necessarily self-correcting. Perspectives on Psychological Science, 7(6), 645-654. DOI: 10.1177/1745691612464056