Why any researcher should start their career with a meta-analysis

If your ambition is to become a scientist and an expert in a specific research area, one path is more efficient than many others. The one that we think will make you an expert quickest is the writing of a meta-analysis. This path is very different from one involving primary research, but it will allow you to answer many more questions than you could conceivably answer with a single experiment. We will provide three reasons why you should take the meta-analysis path. Yet if we still do not convince you, we hope the lessons one of us (Alessandro) is learning from his explorations in meta-analysis thus far will still be of use to you.

1) Meta-analyses allow you to have a broad view of a phenomenon of interest

Have you ever tried to go to the top of a tower and look down? The view is much more complete from there; it allows you to have an overview that you wouldn’t have had from the ground. Doing an experimental study is kind of like looking from the ground: only the result of your own experiment will be visible to your eyes. Conducting a meta-analysis instead allows you to see other people’s experiments and approaches at once. Say for example that you are interested in studying how meditation can help reduce stress levels. If you conduct a randomized controlled trial you will only know about that specific treatment and only on one particular population of participants. By conducting a meta-analysis, instead, you hopefully get insight into whether 1) meditation is more effective on individuals with certain personality traits and 2) the effects of meditation can be extended to different populations, while you may also observe when 3) meditation is effective in reducing stress levels and when effects are null or small. 

image credit Timothy Chan

There is one observation from our own path that we can already share with you. As psychologists we should be interested in how different people respond to different manipulations. Does people’s anxiety in their attachments, for example, matter for whether or not they benefit from meditation? Or is biofeedback more effective for younger or older people? The fact of the matter is that psychologists often neglect to report detailed records of the populations they study. One of the recommendations that will surely make it into the meta-analysis that Alessandro is leading is that researchers need to keep detailed protocols (like we have recommended here). In that way, meta-analysts can start using this information across many studies.

2) Meta-analyses allow you to have information about the health of the literature of interest

A meta-analysis can teach important lessons even to those who have no intention of taking this path. It is no secret that many sciences have been hit with a replication crisis, as many replication studies have failed to obtain the same results as the original studies they sought to replicate (see, for psychology, Klein et al., 2018; Maxwell, Lau, & Howard, 2015; Open Science Collaboration, 2015). One likely reason for the replication crisis is publication bias (see e.g., Sutton, Duval, Tweedie, Abrams, & Jones, 2000). A primary goal of meta-analysis is thus to find out how bad publication bias in a given literature actually is.

In some fields, such as medicine, knowing the real effectiveness of a drug directly impacts people’s lives. However, because of publication bias, the risk-benefit ratio of particular types of drugs is not easy to estimate. As but one example, Turner et al. (2008) analyzed the effects of 12 antidepressant agents on over 12 thousand people, both in terms of the proportion of positive studies and the effect sizes associated with the use of these drugs. According to the published literature, 94% of the trials were positive. Yet after using techniques to account for publication bias, Turner et al. (2008) found that the percentage dropped to 51% and that the published literature had inflated the effect size by 32%.

Overestimating the effect of a drug has direct consequences for the choice of certain therapies, which in turn affects the health of a population (and we feel those consequences even more so now, in the midst of a health crisis). A meta-analytic approach can help us signal that there is a problem in a literature due to publication bias. Some think that meta-analysis can provide a corrected effect size by adjusting for publication bias. The jury might still be out on this, as others say that even “meta-analysis is fucked”. Even if meta-analyses cannot provide accurate effect sizes, they can provide a snapshot of the health of a particular research field (e.g., by pointing to how many results are positive and what researchers record). Based on this report of health, solutions (like Registered Reports) can be recommended to researchers in that field. It may well be that if meta-analysts do not do their work and provide recommendations, meta-analyses remain fucked for a long time to come.
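A minimal simulation sketch can illustrate why publication bias inflates effect sizes. It is an illustration only, not the method used by Turner et al. (2008): the true effect (d = 0.2), the sample size (20 participants per group), the number of studies, and the rule that only significantly positive studies get published are all assumptions chosen for the example, and the function names are hypothetical.

```python
import random
import statistics


def simulate_study(rng, true_d, n_per_group):
    """Simulate one two-group study; return (Cohen's d, t statistic)."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treated = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
    diff = statistics.mean(treated) - statistics.mean(control)
    # Pooled standard deviation for two groups of equal size
    pooled_sd = ((statistics.variance(treated) + statistics.variance(control)) / 2) ** 0.5
    d = diff / pooled_sd
    # With equal group sizes, t = d / sqrt(2 / n)
    t = d / (2 / n_per_group) ** 0.5
    return d, t


def simulate_literature(n_studies=2000, true_d=0.2, n_per_group=20, t_crit=2.024, seed=1):
    """Compare the mean effect across all studies with the 'published' subset.

    Assumption: a study is published only if it is significantly positive
    (t above the two-sided .05 critical value for df = 38, about 2.024).
    """
    rng = random.Random(seed)
    studies = [simulate_study(rng, true_d, n_per_group) for _ in range(n_studies)]
    all_d = [d for d, _ in studies]
    published_d = [d for d, t in studies if t > t_crit]
    return {
        "mean_d_all": statistics.mean(all_d),
        "mean_d_published": statistics.mean(published_d),
        "share_published": len(published_d) / n_studies,
    }


if __name__ == "__main__":
    for key, value in simulate_literature().items():
        print(f"{key}: {value:.2f}")
```

Under these assumptions, the mean effect across all simulated studies stays close to the true effect, while the mean of the "published" subset is far larger, because a small study can only reach significance when it happens to overestimate the effect.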

3) Meta-analysis allows you to acquire skills important for your future career as a scientist 

This recommendation is primarily for the starting PhD student. Stephen King famously said: “If you don’t have time to read, you don’t have the time (or the tools) to write. Simple as that”, and there is nothing more true. Reading numerous articles is the key to very quickly becoming a more efficient and faster writer. When I (Alessandro) started my meta-analysis, I may have been shell-shocked by the sheer quantity of what I had to read. But not only did my vocabulary quickly improve, I also encountered many different writing styles. It allowed me to integrate expert writers’ writing styles into mine. What also helps as a beginning PhD student is that conducting a meta-analysis has taught me the importance of good reporting practices and the limitations of a single study. We think, for example, that the psychological literature vastly underreports important information. We will try to contribute to changes and make protocols available for the researchers in our own literature, and I will try my best not to repeat the same errors.

Final considerations

We can certainly recommend walking the meta-analysis path. What we have learned so far is that scientists underreport and they need to create more detailed protocols to keep good records of their work. In addition – and we are stating the obvious here – meta-analyses confirm that publication bias is a considerable problem. Finally, the exercise of doing a meta-analysis is vital for any researcher: it improves one’s writing and the body of knowledge required for running solid experimental studies.

The path to becoming a better scientist is arduous. Conducting a careful meta-analysis is definitely one of the stages that could lead you to the top. We hope to have convinced you that, when you start your research career, a meta-analysis is a good path to walk to ensure that you become a careful observer of the phenomena you study.

This blog post was written by Alessandro Sparacio and Hans IJzerman.

Love in the times of COVID – help us support physically isolated people

In the past few weeks, a humanitarian, social, and economic disaster has been unfolding because of COVID-19. To stop the virus from spreading, people have been asked to engage in social distancing. Based on what we know so far, this is a wise decision and we encourage everyone to engage in social distancing, too. At the same time, we know that being socially isolated can have extremely adverse consequences that can even lead to death. How can we adhere to governmental regulations but also maintain social contact? It is time to act now and build smart solutions. We want to build a “relationship simulator” that will keep learning and will mobilise WhatsApp, Facebook, and Zoom communicators to protect our health in this time of crisis.

What is social distancing?

Social distancing means not being in the same place together or at the very least not in close proximity to each other. In Italy, at first, shop and restaurant owners were required to make sure their customers kept their distance, and now all shops and restaurants have been closed to make sure people stay at home. France, Norway, and Ireland have closed their entire education systems to keep people from gathering in one place. The data clearly support this decision, showing that social distancing can contain the virus.

But social distancing can also be dangerous and, in the very worst case scenario, can kill us. Research has convincingly demonstrated that people who are physically and socially isolated die more quickly. The link between health and social and physical isolation is even stronger than that between health and being obese or not, drinking six glasses of alcohol per day or not, exercising regularly or not, and equal to that of smoking sixteen cigarettes per day or not. In the current situation, social distancing can even make things worse: loneliness can drive people to ignore the recommendations to stay away from others, especially friends and family. So how can we avoid spreading the virus, while not creating another, unintended consequence?

The Relationship Simulator 

The first solution that comes to mind is to schedule frequent calls via social media. That is a good first step. But, as everyone who has Skyped with a loved one knows, it is not enough. Physical proximity is incredibly important. People live in societies and relate to each other for good reasons. Living together used to help us to more easily gather food, to cope with dangers and predators, and to keep each other warm. We can deal with all of these problems pretty well now without direct contact. But these problems all have had consequences for our psychological makeup: we still need to be with others to stay alive. Being socially connected is a basic biological need: the late John Cacioppo likened the feeling of loneliness to hunger. So what can we do to reduce our hunger for human contact when we need to socially distance?

Food gathering and dealing with predators are not our main concerns anymore. But we still co-regulate our temperature with others even though we have modern ways to keep us warm. We rarely think about body temperature regulation in a social context, but thermoregulation is inherently social. It is a major factor in determining why people enter and nurture social relationships. Our knowledge is based on diverse research findings: there is considerable neural overlap between social and thermoregulatory behaviors, people’s social networks protect them from the cold, and people respond in social ways to temperature fluctuations.

How the EmbrWave works

What we all intuitively understand is that touch is important for our physical and mental well-being and social thermoregulation is the reason why. That is why social distancing can have adverse consequences for our health. So how to resolve this conundrum – to keep both our distance and literal warmth of human contact? We hope to build a “relationship simulator” that can emulate the intimacy of touch (via temperature fluctuations) when people are distant from each other. In a first phase, we want to connect a device that can warm or cool a person (the EmbrWave) to programs like Facebook Messenger, WhatsApp, Skype, and Zoom so that people can warm and cool each other with just one click of a button during the call or conversation to simulate real-life social contact. In that first phase, we will also measure people’s temperature to see how they respond. In the next phase, we will be able to adjust the temperature manipulation through an algorithm, so that the temperature manipulation (through the EmbrWave and the sensors) can simulate intimacy automatically while people are far apart. 

Who/what do we need to make this happen?

This may seem like a distant dream, but we believe it is in reach and we can build it now. Our team can construct and evaluate the validity of what we measure through psychometric and quantitative modelling. To make this happen, we will need the following:

– Programmers who can connect the EmbrWave with programs like Facebook Messenger, WhatsApp, Skype, and Zoom.
– Programmers who can connect temperature sensors with the same types of programs.
– Safe data storage to store the data in and a server powerful enough to help conduct computations.
– Experts to help test a prototype that can learn during interactions.
– Experts on data privacy laws, to ensure we do not violate privacy laws while we collect the data.
– Additional data scientists to help data experts from our team to most accurately interpret and model the data.
– Scientists to help organize this project and conduct the necessary research.
– Thermoregulation experts to further test our sensors and replace them if necessary (we currently use the ISP131001 sensor).
– Core body temperature sensors to model the process (current – excellent – solutions like the GreenTEG body sensor are too expensive for our team and the individual user).
– Sensors and EmbrWaves being made available for different users: this costs a considerable amount of money. 

While we build the relationship simulator, these temperature sensors can also be used to quickly detect fevers and other problems with people’s health, so there will be other benefits of this system. To join our team or to contribute to our cause, please fill in this form.

This blog post was written by Anna Szabelska, EmbrLabs, and Hans IJzerman

Why African researchers should join the Psychological Science Accelerator

The goals of AfricArXiv include fostering community among African researchers, facilitating collaborations between African and non-African researchers, and raising the profile of African research on the international stage. These goals align with the goals of a different organization, the Psychological Science Accelerator (PSA). This post describes how these goals align and argues that joining the Psychological Science Accelerator will benefit members of the AfricArXiv research community through increased collaboration and resource access.

What is the Psychological Science Accelerator?

The PSA is a voluntary, globally distributed, democratic network of over 500 labs from over 70 countries on all six populated continents, including Africa. Psychology studies have traditionally been dominated by Western researchers studying Western participants (Rad, Martingano, & Ginges, 2018). One of the primary goals of the PSA is to help address this problem by expanding the range of researchers and participants in psychology research, thereby making psychology more representative of humanity.

This goal is consistent with the goals of AfricArXiv: addressing the lack of non-Western psychology researchers entails raising the profile of African psychology researchers and fostering collaborations between African and non-African researchers. In addition, the PSA in particular has an interest in expanding its network in Africa: although the PSA wishes to achieve representation on all continents, at last count only 1% of its 500 labs were from Africa.

How the PSA can benefit the African research community

The shared goal of the PSA and AfricArXiv is thus to recruit a group of African researchers to join the PSA and its programmes of internationally acclaimed research in psychological science. We are committed to raising the profile of members of the African research community.

Any psychology researcher can join the PSA at no cost. Member labs will have the opportunity to contribute to PSA governance, submit studies to be run through the PSA network of labs, and collaborate and earn authorship on projects involving hundreds of researchers from all over the world. PSA projects are very large in scale; the first global study run through its network (Jones et al., 2020) involved more than 100 labs from 41 countries, who collectively recruited over 11,000 participants.

The PSA generates a large amount of research communication, which can all be shared at no cost through AfricArXiv. The PSA datasets that involve African participants are available for free for secondary analysis. These datasets may be analyzed with a specifically African focus, and the resultant research can again be freely shared via AfricArXiv.

The specific benefits of PSA membership

The first step to obtaining the benefits of the PSA is to become a member by expressing an in-principle commitment to contribute to the PSA in one way or the other. Membership is free of charge.

Once you are a member, you gain access to the five following benefits:

  1. Free submission of proposals to run a large, multi-national project. The PSA accepts proposals for new studies to run through its network every year between June and August (you can see our 2019 call here). You too can submit a proposal. If your proposal is accepted during our peer review process, the PSA will help you recruit collaborators from its international network of 500 labs and provide support with all aspects of completing a large, multi-site study. You can then submit any research products that result from this process free of charge as a preprint on AfricArXiv.
  2. Join PSA projects. The PSA is currently running six multi-lab projects, one of which is actively recruiting collaborators. In the next two weeks, the PSA will accept a new wave of studies. As a collaborator on one of our studies, you can collect data or assist with statistical analysis, project management, or data management. If you join a project as a collaborator, you will earn authorship on the papers that result from the project (which can be freely shared via AfricArXiv). You can read about the studies that the PSA is currently working on here.
  3. Join the PSA’s editorial board. The PSA sends out calls for new study submissions on a yearly basis. Like grant agencies and journals, it needs people to serve as reviewers for these study submissions. You can indicate interest in serving as a reviewer when you become a PSA member. In return, you will be listed as a member of the PSA editorial board. You can add this editorial board membership to your website and CV.
  4. Join one of the PSA’s governance committees. The PSA’s policies and procedures are developed in its various committees. Opportunities regularly arise to join these committees. Serving on committees helps shape the direction of the PSA and puts researchers in touch with potential collaborators from all over the world. If you are interested in joining a committee, join the PSA newsletter and the PSA Slack workspace. We make announcements of new opportunities to join our committees on these outlets.
  5. Receive compensation to defray the costs of collaboration. We realize that international collaboration can be challenging and expensive, particularly for researchers at lower income institutions. The PSA is therefore providing financial resources to facilitate collaboration. At present, we have a small pool of member lab grants, small grants of $400 USD to help defray the costs of participating in a PSA research project. You can apply for a member lab grant here.


The PSA aims to foster collaboration on our large, multi-national and multi-lab projects. We believe these collaborations can yield tremendous benefits to African researchers. If you agree, you can join our network to gain access to a vibrant and international community of over 750 researchers from 548 labs in over 70 countries. We look forward to working with you.

This blog post was written by Adeyemi Adetula and Patrick Forscher and is cross-posted at AfricArXiv.

Science for Science Reformers

In November 2019, Tal Yarkoni set psychology Twitter ablaze with a fiery preprint, “The Generalizability Crisis” (Yarkoni, 2019). Written with direct, pungent language, the paper fired a direct salvo at the inappropriate breadth of claims in scientific psychology, arguing that the inferential statistics presented in papers are essentially meaningless due to their excessive breadth and the endless combinations of unmeasured confounders that plague psychology studies.

The paper is clear, carefully argued, and persuasive. You should read it. You probably have.

Yet there is something about the paper that bugs me. That feeling wormed its way into the back of my mind until it became a full-fledged concern. I agree that verbal claims in scientific articles are often, or even usually, hopelessly misaligned with their instantiations in experiments, such that the statistics in papers are practically useless as tests of the broader claim. In a world where claims are not refuted by future researchers, this represents a huge problem. That world characterizes much of psychology.

But the thing that bugs me is not so much the paper’s logic as (what I perceive to be) its theory of how to change scientist behavior. Whether Tal explicitly believes this theory or not, it’s one that I think is fairly common in efforts to reform science — and it’s a theory that I believe to be shared by many failed reform efforts. I will devote the remainder of this blog to addressing this theory and laying out the theory of change that I think is preferable.

A flawed theory of change: The scientist as a logician

The theory of change that I believe underlies Tal’s paper is something I will call the “scientist as logician” theory. Here is a somewhat simplified version of this theory:

  • Scientists are truth-seekers
  • Scientists use logic to develop the most efficient way of seeking truth
  • If a reformer uses logic to identify flaws in a scientist’s current truth-seeking process, then, as long as the logic is sound, that scientist will change their practices

Under the “scientist as logician” theory of change, the task of a putative reformer is to develop the most rigorously sound logic possible about why a new set of practices is better than an old set of practices. The more unassailable this logic, the more likely scientists are to adopt the new practices.

This theory of change is the one implicitly adopted by most academic papers on research methods. The “scientist as logician” theory is why, I think, most methods research focuses on accumulating unassailable evidence about which methods are optimal for a given set of problems: if scientists operate as logicians, then stronger evidence will lead to stronger adoption of those optimal practices.

This theory of change is also the one that arguably motivated many early reform efforts in psychology. Jacob Cohen wrote extensively and persuasively on why, based on considerations of statistical power, psychologists ought to use larger sample sizes (Cohen, 1962; Cohen, 1992). David Sears wrote extensively on the dangers of relying on samples of college sophomores for making inferences about humanity (Sears, 1986). But none of their arguments seemed to really have mattered. 

In all these cases, the logic that undergirds the arguments for better practice is nigh unassailable. The lack of adoption of their suggestions reveals stark limitations in the “scientist as logician” theory. The limited influence of methods papers is infamous (Borsboom, 2006), especially if the paper happens to point out critical flaws in a widely used and popular method (Bullock, Green, & Ha, 2010). Meanwhile, despite the highly persuasive arguments by Jacob Cohen, David Sears, and many other luminaries, statistical power has barely changed (Sedlmeier & Gigerenzer, 1992), nor has the composition of psychology samples (Rad, Martingano, & Ginges, 2018). It seems unlikely that scientists change their behavior purely on logical grounds.

A better theory of change: The scientist as human

I’ll call my alternative to the “scientist as logician” model the “scientist as human” model. A thumbnail sketch of this model is as follows:

  • Scientists are humans
  • Humans have goals (including truth and accuracy)
  • Humans are also embedded in social and political systems
  • Humans are sensitive to social and political imperatives
  • Reformers must attend to both human goals and the social and political imperatives to create lasting changes in human behavior

Under the “scientist as human” model, the goal of the putative reformer is to identify the social and political imperatives that might prevent scientists from engaging in a certain behavior. The reformer then works to align those imperatives with the desired behaviors.

Of course, for a desired behavior to occur, that behavior should be aligned with a person’s goals (though that is not always necessary). Here, however, reformers who want science to be more truthful are in luck: scientists overwhelmingly endorse normative systems that suggest they care about the accuracy of their science (Anderson et al., 2010). This also means, however, that if scientists are behaving in ways that appear irrational or destructive to science, that’s probably not because the scientists just haven’t been exposed to a strong enough logical argument. Rather, the behavior probably has more to do with the constellation of social and political imperatives in which the scientists are embedded.

This view, of the scientist as a member of human systems, is why, I think, the current open science movement has been effective where other efforts have failed. Due to the efforts of institutions like the Center for Open Science, many current reformers have a laser focus on changing the social and political conditions. The goal behind these changes is not to change people’s behavior directly, but to shift institutions to support people who already wish to use better research practices. This goal is a radical departure from the goals of people operating under the “scientist as logician” model.

Taking seriously the human-ness of the scientist

The argument I have made is not new. In fact, the argument is implicit in many of my favorite papers on science reform (e.g., Smaldino & McElreath, 2016). Yet I think many prospective reformers of science would be well-served in thinking through the implications of the “scientist as human” view.

While logic may help in identifying idealized models of the scientific process, reformers seeking to implement and sustain change must attend to social and political processes. This includes especially those social and political processes that affect career advancement, such as promotion criteria and granting schemes. However, this also includes thinking through the processes that affect how a potential reform will be taken up in the social and political environment, especially whether scientists will have the political ability to take collective action to take up particular reform. In other words, taking seriously scientists as humans means taking seriously the systems in which scientists participate.


  • Anderson, M. S., Ronning, E. A., De Vries, R., & Martinson, B. C. (2010). Extending the Mertonian Norms: Scientists’ Subscription to Norms of Research. The Journal of Higher Education, 81(3), 366–393. https://doi.org/10.1353/jhe.0.0095
  • Borsboom, D. (2006). The attack of the psychometricians. Psychometrika, 71(3), 425–440. https://doi.org/10.1007/s11336-006-1447-6
  • Bullock, J. G., Green, D. P., & Ha, S. E. (2010). Yes, but what’s the mechanism? (Don’t expect an easy answer). Journal of Personality and Social Psychology, 98(4), 550–558. https://doi.org/10.1037/a0018933
  • Cohen, J. (1962). The statistical power of abnormal-social psychological research: A review. The Journal of Abnormal and Social Psychology, 65(3), 145–153. https://doi.org/10.1037/h0045186
  • Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155–159. https://doi.org/10.1037/0033-2909.112.1.155
  • Rad, M. S., Martingano, A. J., & Ginges, J. (2018). Toward a psychology of Homo sapiens: Making psychological science more representative of the human population. Proceedings of the National Academy of Sciences, 115(45), 11401–11405. https://doi.org/10.1073/pnas.1721165115
  • Sears, D. O. (1986). College sophomores in the laboratory: Influences of a narrow data base on social psychology’s view of human nature. Journal of Personality and Social Psychology, 51(3), 515–530. https://doi.org/10.1037/0022-3514.51.3.515
  • Sedlmeier, P., & Gigerenzer, G. (1992). Do studies of statistical power have an effect on the power of studies? In Methodological issues & strategies in clinical research (pp. 389–406). American Psychological Association. https://doi.org/10.1037/10109-032
  • Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384. https://doi.org/10.1098/rsos.160384
  • Yarkoni, T. (2019). The Generalizability Crisis [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/jqw35

Examining whether science self-corrects using citations of replication studies

As scientists, we often hope that science self-corrects. But several researchers have suggested that the self-corrective nature of science is a myth (see e.g., Estes, 2012; Stroebe et al., 2012). If science is self-correcting, we should expect that, when a large replication study finds a result that is different from a smaller original study, the number of citations to the replication study ought to exceed, or at least be similar to, the number of citations to the original study. In this blog post, I examine this question in six “correction” studies in which I’ve been involved (I did not include any of the ManyLabs replication studies because they were so qualitatively different from the rest). This exercise is intended to provide yet another anecdote to generate a discussion about how we, as a discipline, approach self-correction and is by no means intended as a general conclusion about the field.

Sex differences in distress from infidelity in early adulthood and later life.
Shackelford and colleagues (2004) reported that men, compared to women, are more distressed by sexual than by emotional infidelity (total N = 446). The idea was that this effect generalizes from young adulthood to later adulthood, and this was taken as evidence for an evolutionary perspective. In our pre-registered replication studies (total N = 1,952) we did find the effect for people in early adulthood but not for later adulthood. In our replication study we also found that the disappearance of the effect was likely due to sociosexual orientation in the older adults that we sampled (in the Netherlands as opposed to the United States). In other words, the basic original effect seemed present, but the original conclusion was not supported.
How did the studies fare in terms of citations?
Original study (since 2014): 56 citations
Replication study (since 2014): 23 citations
Conclusion: little to no correction done (although perhaps it was not a conclusive non-replication given the ambiguity of the theoretical interpretation)

Does recalling moral behavior change the perception of brightness?
Banerjee, Chatterjee, and Sinha (2012; total N = 114) reported that recalling unethical behavior led participants to see the room as darker and to desire more light-emitting products (e.g., a flashlight) compared to recalling ethical behavior. In our pre-registered replication study (N = 1,178) we did not find the same effects.
How did the studies fare in terms of citations?

Original study (since 2014): 142 citations
Replication study (since 2014): 24 citations
Conclusion: correction clearly failed.

Physical warmth and perceptual focus: A replication of IJzerman and Semin (2009)
This replication is clearly suboptimal, as this was a self-replication. This study was conducted at the beginning of the replication crisis, so we wanted to self-replicate some of our work. In the original study (N = 39), we found that when people are in a warm condition, they focus more on perceptual relationships than on individual properties. In a higher-powered replication study (N = 128), we found the same effect (with a slightly different method to better avoid experimenter effects).
How did the studies fare in terms of citations?
Original study (since 2014): 323
Replication study (since 2014): 26
Conclusion: no correction needed (yet; but please someone other than us replicate this and the other studies, as these 2009 studies were all underpowered).

Perceptual effects of linguistic category priming
This was a particularly interesting case, as this paper was published after the first author, Diederik Stapel, was caught fabricating data. All but one of the (12) studies were conducted before he got caught (but we could never publish them due to the nature of the field at the time). In the original (now retracted) article, Stapel and Semin reported that priming abstract linguistic categories (adjectives) led to more global perceptual processing, whereas priming concrete linguistic categories (verbs) led to more local perceptual processing. In our replication, we could not find the same effect.
Note: technically, this study was not a replication, as the original studies were never conducted. After Stapel was caught, the manuscript was first submitted to the journal that originally published the effects. Their message was then that they would not accept replications. When we pointed out that these were not replications, the manuscript was rejected because we had found null effects. Thankfully, the times are clearly changing now.
How did the studies fare in terms of citations?
Original study (since 2015): 12
Replication study (since 2015): 3
Conclusion: correction failed (although citations slowed down significantly and some of the newer citations were about Stapel’s fraud).

Does distance from the equator predict self-control?
This is somewhat of an outlier in this list, as this is an empirical test of a hypothesis from a theoretical article. The hypothesis proposed in that article is that people who live further away from the equator have poorer self-control, and the authors suggested that this should be tested via data-driven methods. We were lucky enough to have a dataset (N = 1,537) to test this and took up the authors’ suggestion by using machine learning. In our commentary article, we were unable to find the effect (equator distance as a predictor of self-control was just a little bit less important than whether people spoke Serbian).
How did the studies fare in terms of citations?
Original article (since 2015): 57
Empirical test of the hypothesis (since 2015): 3
Conclusion: correction clearly failed (in fact, the original first author published a very similar article in 2018 and cited the article 6 times).

A demonstration of the Collaborative Replication and Education Project: replication attempts of the red-romance effect project
Elliot et al. (2010; N = 33) reported that women were more attracted to men when their photograph was presented with a red (vs. grey) border. Via one of my favorite initiatives that I have been involved in, the Collaborative Replications and Education Project, 9 student teams tried to replicate this finding via pre-registered replications and were not able to find the same effect (despite very high quality control and a larger sample with total N = 640).
How did the studies fare in terms of citations?

Original study (since 2019): 17
Replication study (since 2019): 8
Conclusion: correction failed.

Social Value Orientation and attachment
Van Lange et al. (1996; N Study 1 = 573; N Study 2 = 136) reported in two studies that people who are more secure in their attachment are also more prone to give more to other (fictitious) people in a coin distribution game. These original studies suffered from some problems. First, the reliabilities of the measurement instruments ranged between alpha = 0.46 and 0.68. Second, the somewhat more reliable scales (at alpha = 0.66 and 0.68) only produced marginal differences in a sample of 573 participants, when controlling for gender and after dropping items from the attachment scale (in addition, there were problems with the Dutch translation of one of the measures). In our replication study (N = 768), which we conducted with better measurement instruments and in the same country, we did not find the same effects.
How did the studies fare in terms of citations?

Original study (since 2019): 110
Replication study (since 2019): 8
Conclusion: correction clearly failed (this one is perhaps a bit more troubling, as 1) the replication covered 2 out of the 4 studies, and 2) the researchers from Many Labs 2 were also not able to replicate Study 3. Again, the first author was responsible for some (4) of the citations).

[EDIT March 8 2020]: Eiko Fried suggested I should plot the citations by year. If you want to download the data and R code, you can download them here.

A couple of observations:

  • 2020, of course, is not yet complete. I thus left it out of the graph, as including it may be misleading.
  • When plotting per year, it became apparent that the 2016 count for Banerjee et al. showed the “BBS effect”: the article was cited in a target article and received many ghost citations in Google Scholar for the published commentaries, which did not themselves cite the article. The citations for 2016 are thus inaccurate, but this does not take away from the overall conclusion.
  • Overall, there seemed to be no decline in citations.

Overall conclusion
Total for original studies (excluding 1 and 3): 338
Total for replication studies (excluding 1 and 3): 46
For the current set of studies, we clearly fail in correcting our beloved science. I suspect the same is true for other replication studies. I would love to hear more about the experiences of other replication authors, and I think it is time to start a discussion about how we can change these Questionable Citation Practices.
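As a rough check on the totals above, the per-study counts listed in this post can be added up in a few lines of Python. This is only a sketch: the dictionary keys are shorthand labels chosen here for illustration, and it assumes that "1 and 3" refer to the sexual jealousy replication and the self-replication, which is the reading under which the reported totals come out.

```python
# Citation counts (original, replication) as listed in this post.
# The labels are illustrative shorthand, not official study names.
citations = {
    "Shackelford et al. (sexual vs. emotional infidelity)": (56, 23),
    "Banerjee et al. (moral recall and brightness)": (142, 24),
    "IJzerman & Semin (warmth and perceptual focus)": (323, 26),
    "Stapel & Semin (linguistic category priming)": (12, 3),
    "Equator distance and self-control": (57, 3),
    "Elliot et al. (red-romance effect)": (17, 8),
    "Van Lange et al. (social value orientation)": (110, 8),
}

# Assumption: "excluding 1 and 3" means the first and third entries above.
excluded = {
    "Shackelford et al. (sexual vs. emotional infidelity)",
    "IJzerman & Semin (warmth and perceptual focus)",
}


def totals(counts, skip):
    """Sum original and replication citations over the studies not in `skip`."""
    kept = [pair for name, pair in counts.items() if name not in skip]
    original = sum(o for o, _ in kept)
    replication = sum(r for _, r in kept)
    return original, replication


original_total, replication_total = totals(citations, excluded)
print(f"Original studies: {original_total}")
print(f"Replication studies: {replication_total}")
print(f"Originals cited {original_total / replication_total:.1f} times as often")
```

Under these assumptions the originals are cited roughly seven times as often as the replications over the same windows.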

La société devrait exiger davantage des scientifiques : lettre ouverte à la population française

This blog post was originally written to appear in “Le Monde” and so was initially aimed at the French public. However, people from all countries can sign to show their support for the integration of open science into grants and hiring practices. The French version comes first, followed by the English version. If you want to sign the petition, please sign it with the name of the country where you live. If you want to petition your own scientific organizations/governments, then we will share the data of our signers per country upon request (corelab.grenoble@gmail.com).

L’étude scientifique des comportements humains fournit des connaissances pertinentes pour  chaque instant de notre vie. Ces connaissances peuvent être utilisées pour résoudre des problèmes sociétaux urgents et complexes, tels que la dépression, les discriminations et le changement climatique. Avant 2011, beaucoup de scientifiques pensaient que le processus de création des connaissances scientifiques était efficace. Nous étions complètement dans l’erreur. Plus important encore, notre domaine a découvert que même des chercheurs honnêtes pouvaient produire des connaissances non fiables. Il est donc temps d’appliquer ces réflexions à nos pratiques afin de changer radicalement la façon dont la science fonctionne.

En 2011, Daryl Bem, un psychologue reconnu, mit en évidence la capacité de l’être humain à voir dans le futur. La plupart des scientifiques s’accorderaient sur le caractère invraisemblable de ce résultat. En utilisant les critères de preuves partagées par de nombreuses disciplines, Bem trouva des preuves très solides en apparence et répliquées sur 9 expériences avec plus de 1000 participants. Des études ultérieures ont démontré de façon convaincante que l’affirmation de Bem était fausse. Les psychologues, en réalisant des réplications d’études originales dans des dizaines de laboratoires internationaux, ont découvert que cela ne se limite pas à ces résultats invraisemblables. Un membre de notre équipe a mené deux de ces projets, dans lesquels des affirmations sont testées sur plus de 15 000 participants. En rassemblant les résultats de trois de ces projets internationaux, seuls 27 de ces 51 effets préalablement rapportés dans la littérature scientifique ont pu être confirmés (et des problèmes similaires sont maintenant  détectés par des projets de réplication en biologie du cancer) .

Le point de vue des scientifiques (et pas seulement des psychologues) sur la robustesse des preuves scientifiques a drastiquement changé suite à publication de Joe Simmons et de ses collègues démontrant comment il est possible d’utiliser les statistiques pour prouver n’importe quelle idée scientifique, aussi absurde soit-elle. Sans vérification de leur travail et avec des méthodes routinières, les chercheurs peuvent trouver des preuves dans des données qui en réalité n’en contiennent pas. Or, ceci devrait être une préoccupation pour tous, puisque les connaissances des sciences comportementales sont importantes à l’échelle sociétale. 

Mais quels sont les problèmes ? Premièrement, il est difficile de vérifier l’intégrité des données et du matériel utilisé, car ils ne sont pas partagés librement et ouvertement. Lorsque des chercheurs ont demandé les données de 141 articles publiés dans de grandes revues de psychologie, ils ne les ont reçu que dans 27% des cas. De plus, les erreurs étaient plus fréquentes dans les articles dont les données n’étaient pas accessibles. Ensuite, la majorité du temps nous n’avons pas connaissance des échecs scientifiques ni même des hypothèses a priori des chercheurs. Dans la plupart des domaines scientifiques, seuls les succès des chercheurs sont publiés et leurs échecs partent à la poubelle. Imaginez que cela se passe de la même façon avec le sport : si l’Olympique de Marseille ne communiquait que ses victoires et cachait ses défaites, on pourrait penser (à tort) que c’est une excellente équipe. Nous ne tolérons pas cette approche dans le domaine sportif. Pourquoi devrions-nous la tolérer dans le domaine scientifique ?

Depuis la découverte de la fragilité de certains de leurs résultats, les psychologues ont prit les devants pour améliorer les pratiques scientifiques. À titre d’exemple, nous, membres du « Co-Re lab », au LIP/PC2S de l’Université Grenoble Alpes, avons fait de la transparence scientifique un standard. Nous partageons nos données dans les limites fixées par la loi. Afin de minimiser les erreurs statistiques nous réalisons une révision de nos codes. Enfin, nous faisons des pré-enregistrements ou des Registered Report qui permettent de déposer une idée ou d’obtenir une acceptation de publication par les revues avant la collecte des données. Cela assure la publication d’un résultat, même s’il n’est pas considéré comme un « succès ». Ces interventions permettent de réduire drastiquement la probabilité qu’un résultat insensé soit intégré dans la littérature.

Tous les chercheurs ne suivent pas cet exemple. Cela signifie qu’une partie de l’argent des impôts français finance une science dont l’intégrité des preuves qui soutiennent les affirmations ne peut être vérifiée, faute d’être ouvertement partagées. Plus spécifiquement, nous appelons à ce qui suit :

  •  Pour toute proposition de subvention (qu’elle repose sur une recherche exploratoire ou confirmatoire) adressée à tout organisme de financement, exiger un plan de gestion des données.
  • Pour toute proposition de subvention adressée à tout organisme de financement, rendre par défaut accessible ouvertement codes/matériel/données (à moins qu’il n’y ait une raison convaincante pour laquelle cela soit impossible, comme dans le cas de la protection de l’identité des participants)
  • Le gouvernement français devrait réserver des fonds dédiés à des chercheurs pour vérifier l’exactitude et l’intégrité des résultats scientifiques majeurs.
  • Les universités devraient accorder la priorité d’embauche et de promotion aux chercheurs qui rendent leur matériel, données, et codes accessibles ouvertement.

C’est à l’heure actuelle où la France investit dans la science et la recherche qu’il faut agir. Le premier ministre Édouard Philippe a annoncé en 2018 que 57 milliards d’euros seront dédiés à la recherche. Nous sommes certains qu’avec les changements que nous proposons, l’investissement français nous conduira à devenir des leaders mondiaux en sciences sociales. Plus important encore, cela conduira la science française à devenir crédible et surtout, utile socialement. Nous vous appelons à soutenir cette initiative et à devenir signataire pour une science ouverte française. Vous pouvez signer notre pétition ci-dessous. Veuillez signer avec votre nom, votre adresse e-mail et le pays dans lequel vous vivez.


Society should demand more from scientists: Open letter to the (French) public

The science of human behavior can generate knowledge that is relevant to every single moment of our lives. This knowledge can be deployed to address society’s most urgent and difficult problems — up to and including depression, discrimination, and climate change. Before 2011, many of us thought the process we used to create this scientific knowledge was working well. We were dead wrong. Most importantly, our field has discovered that even honest researchers can generate findings that are not reliable. It is therefore time to apply our insights to ourselves to drastically change the way science works. 

In 2011, a famous psychologist, Daryl Bem, used practices that were standard at the time to publish evidence that people can literally see the future. Most scientists would agree that this is an implausible result. Bem applied the standards of evidence that many sciences used at that time, and found seemingly solid evidence across 9 experiments and over 1,000 participants. Later studies have convincingly demonstrated that Bem’s claim was not true. By conducting studies that replicate original studies across dozens of international labs, psychologists have now discovered that the problem is not restricted to such implausible results. One of us led two of these projects, in which claims were examined in over 15,000 participants. When we take the evidence of three such international projects together, we could only confirm 27 out of the 51 effects that were previously reported in the scientific literature (and similar problems have now been detected through replication projects in cancer biology).

The view that scientists (and not only psychologists) had of the solidity of their evidence changed quite dramatically when Joe Simmons and his colleagues demonstrated how, as a researcher, you could use statistics to basically prove any nonsensical idea with scientific data. Unchecked, researchers are able to use fairly routine methods to find evidence in datasets where there is none. This should be a concern to everyone, as insights from behavioral science are important society-wide.

So what are some of the problems? One is the difficulty of even checking a study’s data and materials for integrity, because these data and materials are not openly and freely shared. Many labs regard data as proprietary. When researchers requested the data from 141 papers published in leading psychology journals, they received the data only 27% of the time. What is more, errors were more common in papers whose data were not shared. But we also often don’t know about researchers’ failures, nor do we know what their a priori plans were. Within most of the sciences, we only learn about successes, as researchers publicize their successes and leave their failures to rot on their hard drives. Imagine if we were to do the same for sports: if Olympique de Marseille only told us about the games that they won, hiding away the games that they lost, we would think, erroneously, that OM has a great team. We do not tolerate this approach in sports. Why should we tolerate it for science?

Since discovering that their findings are not always robust, psychologists have led the charge in improving scientific practices. For example, we members of the “Co-Re” lab at LIP/PC2S at Université Grenoble Alpes have made transparency in our science a default. We share data to the degree that it is legally permitted. To limit the occurrence of statistical errors we conduct code review prior to submitting to scientific journals. Finally, we do pre-registrations or registered reports, which is a way to deposit an idea or to obtain a publication acceptance by journals before data collection. This ensures the publication of a result, even when this is not considered a “success”. Because of all these interventions the chance of a nonsensical idea entering the literature becomes decidedly smaller. 

Not all researchers follow this example. This means that a lot of tax money (including French tax money) goes to science where the evidence that supports its claims cannot be checked for integrity because it is not openly shared. We strongly believe in the potential of psychological science to improve society. As such, we believe French tax money should go toward science (including psychological science) that has the highest chance of producing useful knowledge — in other words, science that is open.

Specifically, we call for the following:

  • For all grant proposals (whether they rely on exploratory or confirmatory research) to any funding agency, demand a data management plan.
  • For all grant proposals to any funding agency, make open code/materials/data the default (unless there is a compelling reason that this is impossible, such as in the case of protecting participants’ identity).
  • The French government should set aside dedicated funding for researchers to check the accuracy and integrity of major scientific findings.
  • Universities should prioritize hiring and promoting researchers who make their materials, data, and code openly available.

The time for change is now, because France is investing in science and research. The French prime minister Édouard Philippe announced in 2018 that 57 billion euros would go into investment and innovation. Importantly, Minister of Higher Education Frédérique Vidal has committed to making science open, so that the knowledge we generate is available to the taxpayer. We believe we can maximize this money’s return on investment for society by ensuring that these open principles also apply to the materials, data, and code generated by this money. Only with our proposed changes are we confident that the French investment will lead us to become world leaders in social science. More importantly, it will lead (French) science to become credible and, above all, socially useful. We call on you to support this initiative and to become a signatory for (French) open science. You can do so below.

Written by Patrick Forscher, Alessandro Sparacio, Rick Klein, Nick Brown, Mae Braud, Adrien Wittman, Olivier Dujols, Shanèle Payart, and Hans IJzerman.

Our department/labo will add a standard open science statement to all its job ads!

The Co-Re Lab is part of the Laboratoire Inter-universitaire de Psychologie Personnalité, Cognition, Changement Social (LIP/PC2S) at Université Grenoble Alpes. In France, “laboratoire” or “labo” (laboratory) is used for what researchers in the Anglo-Saxon world would call “department”. During our labo meeting yesterday one of the agenda points was to vote on the following statement:

« Une bonne connaissance et une volonté de mettre en œuvre des pratiques de science ouverte (au sens par exemple de pre- enregistrement, mise à disposition des données…) sont attendues, une adoption de ces pratiques déjà effective (lorsque le type de recherche le permet) sera en outre très appréciée »

This can be roughly translated as: “A good knowledge and the willingness to put in place open science practices (for example, pre-registration or sharing of data) are expected. It will be highly valued if one has already adopted these practices (when the research permits it).” The statement was adopted by an overwhelming majority. We at the Co-Re lab are thrilled that this statement will be communicated to future job candidates.

Many Labs 4: Failure to Replicate Mortality Salience Effect With and Without Original Author Involvement

December 10th, 2019. Richard Klein, Tilburg University; Christine Vitiello, University of Florida; Kate A. Ratliff, University of Florida. This is a repost from the Center for Open Science’s blog.

We present results from Many Labs 4, which was designed to investigate whether contact with original authors and other experts improved replication rates for a complex psychological paradigm. However, the project is largely uninformative on that point as, instead, we were unable to replicate the effect of mortality salience on worldview defense under any conditions.

Recent efforts to replicate findings in psychology have been disappointing. There is a general concern among many in the field that a large number of these null replications are because the original findings are false positives, the result of misinterpreting random noise in data as a true pattern or effect.

Plot summarizing results from each data collection site. Length of each bar indicates the 95% confidence interval, and thickness is scaled by the number of participants at that lab. Blue shading indicates an In House site, red shading indicates an Author Advised site. Diamonds indicate the aggregated result across that subset of labs. Plots for other exclusion sets are available on the OSF page (https://osf.io/xtg4u/).

But failures to replicate are inherently ambiguous and can result from any number of contextual or procedural factors. Aside from the possibility that the original is a false positive, it may instead be the case that some aspect of the original procedure does not generalize to other contexts or populations, or the procedure may have produced an effect at one point in time but those conditions no longer exist. Or, the phenomenon may not be sufficiently understood to predict when it will and will not occur (the so-called “hidden moderators” explanation).

Another explanation — often made informally — is that replicators simply lack the necessary expertise to conduct the replication properly. Maybe they botch the implementation of the study or miss critical theoretical considerations that, if corrected, would have led to successful replication. The current study was designed to test this question of researcher expertise by comparing results generated from a research protocol developed in consultation with the original authors to results generated from research protocols designed by replicators with little or no particular expertise in the specific research area. This study is the fourth in our line of “Many Labs” projects, in which we replicate the same findings across many labs around the world to investigate some aspect of replicability.

To look at the effects of original author involvement on replication, we first had to identify a target finding to replicate. Our goal was a finding that was likely to be generally replicable, but that might have substantial variation in replicability due to procedural details (e.g. a finding with strong support but that is thought to require “tricks of the trade” that non-experts might not know about). Most importantly, we had to find key authors or known experts who were willing to help us develop the materials. These goals often conflicted with one another.

We ultimately settled on Terror Management Theory (TMT) as a focus for our efforts. TMT broadly states that a major problem for humans is that we are aware of the inevitability of our own death; thus, we have built-in psychological mechanisms to shield us from being preoccupied with this thought. In consultation with those experts most associated with TMT, we chose Study 1 of Greenberg et al. (1994) for replication. The key finding was that, compared to a control group, U.S. participants who reflected on their own death were higher in worldview defense; that is, they reported a greater preference for an essay writer adopting a pro-U.S. argument than an essay writer adopting an anti-U.S. argument.

We recruited 21 labs across the U.S. to participate in the project. A randomly assigned half of these labs were told which study to replicate, but were prohibited from seeking expert advice (“In House” labs). The remaining half of the labs all followed a set procedure based on the original article, and incorporating modifications, advice, and informal tips gleaned from extensive back-and-forth with multiple original authors (“Author Advised” labs).* In all, the labs collected data from 2,200+ participants.

The goal was to compare the results from labs designing their own replication, essentially from scratch using the published method section, with the labs benefitting from expert guidance. One might expect that the latter labs would have a greater likelihood of replicating the mortality salience effect, or would yield larger effect sizes. However, contrary to our expectation, we found no differences between the In House and Author Advised labs because neither group successfully replicated the mortality salience effect. Across confirmatory and exploratory analyses we found little to no support for the effect of mortality salience on worldview defense at all.

In many respects, this was the worst possible outcome: if there is no effect, then we can’t really test the metascientific questions about researcher expertise that inspired the project in the first place. Instead, this project ends up being a meaningful datapoint for TMT itself. Despite our best efforts, and a high-powered, multi-lab investigation, we were unable to demonstrate an effect of mortality salience on worldview defense in a highly prototypical TMT design. This does not mean that the effect is not real, but it certainly raises doubts about the robustness of the effect. An ironic possibility is that our methods did not successfully capture the exact fine-grained expertise that we were trying to investigate. However, that itself would be an important finding: ideally, a researcher should be able to replicate a paradigm solely based on information provided in the article or other readily available sources. So, the fact that we were unable to do so, despite consulting with original authors and enlisting 21 labs that were all highly trained in psychology methods, is problematic.

From our perspective, a convincing demonstration of basic mortality salience effects is now necessary to have confidence in this area moving forward. It is indeed possible that mortality salience only influences worldview defense during certain political climates or around catastrophic events (e.g. national terrorist attacks), or that other factors explain this failed replication. A robust Registered Report-style study, where outcomes are predicted and analyses are specified in advance, would serve as a critical orienting datapoint to allow these questions to be explored.

Ultimately, because we failed to replicate the mortality salience effect, we cannot speak to whether (or the degree to which) original author involvement improves replication attempts.** Replication is a necessary but messy part of the scientific process, and as psychologists continue replication efforts it remains critical to understand the factors that influence replication success. And, it remains critical to question, and empirically test, our intuitions and assumptions about what might matter.

*At various points we refer to “original authors”. We had extensive communication with several authors of the Greenberg et al., 1994 piece, and others who have published TMT studies. However, that does not mean that all original authors endorsed each of these choices, or still agree with them today. We don’t want to put words in anyone’s mouth, and, indeed, at least one original author expressly indicated that they would not run the study given the timing of the data collection — September 2016 to May 2017, the period leading up to and following the election of Donald Trump as President of the United States. We took steps to address that concern, but none of this means the original authors “endorse” the work.

**Interested readers should also keep an eye out for Many Labs 5, which looks at similar issues. The Co-Re lab was involved in Many Labs 5 as well.

Many Labs 4: Failure to Replicate Mortality Salience Effect With and Without Original Author Involvement

Greenberg, J., Pyszczynski, T., Solomon, S., Simon, L., & Breus, M. (1994). Role of consciousness and accessibility of death-related thoughts in mortality salience effects. Journal of Personality and Social Psychology, 67(4), 627-637.

Older Announcements

Here are some of our older announcements.

05/2021 – Hans Rocha IJzerman’s book Heartwarming was featured in a recent NRC article!

05/2021 – Hans Rocha IJzerman’s book Heartwarming was featured in a recent Radiolab episode!

05/2021 – Our lab philosophy has been covered in the University newspaper of Free University, Amsterdam.

05/2021 – The Human Behavior & Evolution Society recommended Hans’ last book ‘Heartwarming‘.

05/2021 – Anna & Olivier taught a workshop on Exploratory Data Analysis for the Kurt Lewin Institute.

03/2021 – Hans IJzerman talked about his recent book ‘Heartwarming’ in the online show Pillowtalk.

03/2021 – Hans published a popular science article on social thermoregulation, in ‘Psychology Today’.

02/2021 – Hans & Patrick published an article on how difficult it is for science to provide practical advice, in ‘The Inquisitive Mind’.

02/2021 – Hans has been interviewed by embrlabs to talk about himself, his work, and embrlabs.

02/2021 – Hans & Olivier published an article on social thermoregulation, in ‘The Inquisitive Mind’.

02/2021 – Patrick Forscher is hosting an SPSP free-form Friday titled ‘Encouraging team-based science in psychology‘.

02/2021 – Hans served as a fact-checker for science news in the Dutch newspaper ‘De Volkskrant’.

02/2021 – IJzerman et al.’s Nature Human Behaviour paper was covered in the Dutch newspaper ‘De Volkskrant’.

02/2021 – Hans IJzerman talked to the ‘Next Big Idea Club‘ about his recent book ‘Heartwarming’.

02/2021 – We find the ‘hostile priming effect’ did not replicate across close (n = 2123) and conceptual (n = 2579) replications.

02/2021 – Hans IJzerman has been interviewed on KPCW radio to talk about his recent book ‘Heartwarming’.

12/2020 – Patrick Forscher talked to the BBC about the effectiveness of training in reducing unconscious bias.

11/2020 – Patrick & Alessandro taught a workshop on Meta-analysis for the Kurt Lewin Institute.

11/2020 – Our Lab Philosophy and our Research Templates have been awarded with a SIPS Commendation.

11/2020 – Our recent blogpost on African psychology has been awarded a SIPS commendation.

02/2020 – Society should demand more from scientists: We propose ways to improve the methodological quality of scientific work.

01/2020 – After investigators couldn’t repeat key findings, researchers are trying to establish what’s worth saving.

01/2020 – Patrick Forscher considers that this area of research is not yet ready for application in everyday life.

10/2019 – Patrick Forscher and Hans IJzerman were involved in a major grant proposal (Synergy) to promote the PSA and team science.

9/2019 – We’ve released the 2019 update to our lab philosophy/workflow. Key changes: Sprints, CRediT contributorship, code hygiene.

5/2019 – We’re organizing an EASP conference and Facebook+AdopteUnMec Hackathon in Annecy, France.

4/2019 – We’re giving a workshop on the Open Science Framework, and a deep dive on power analysis.

12/2018 – 135 labs around the world replicate 28 studies. Find minimal variation in results depending on the sample.

10/2018 – Developing a scale to measure individual differences in social thermoregulation.

9/2018 – We organized a workshop training scientists on doing reproducible, open science. Videos and materials available on the OSF.

9/2018 – International collaboration investigating the link between social environment and core body temperature.

8/2018 – Past experiences, and a current manipulation of physical temperature, affect thinking about our loved ones.