Today I joined the seminar on “The meaningfulness of effect sizes in the social and behavioural sciences in light of the reproducibility crisis” given by Thomas Schäfer, Professor of Quantitative Research Methods at MSB Medical School Berlin. Here’s the abstract of the seminar:
ABSTRACT: Effect sizes are the currency of the social and behavioral sciences. They quantify the results of a study to answer the research question and are used to calculate statistical power. They are also a central aspect when the evidence of a study–and thus, its practical usefulness–is to be evaluated. In these days, effect sizes are also used to evaluate the success of replication studies. However, the meaningfulness and usefulness of effect sizes hinges on a reliable framework that defines how the size of an effect is to be interpreted. This framework—helping define an effect as small, medium, or large—has been guided by the recommendations Jacob Cohen gave in his pioneering writings starting in 1962: Either compare an effect with the effects found in past research or use certain conventional benchmarks. The present analysis shows that neither of these recommendations is currently applicable. From past publications without pre-registration, 900 effects were randomly drawn and compared with some 100 effects from publications with pre-registration, revealing a large difference: Effects from the former were much larger than effects from the latter. That is, certain biases, such as publication bias or questionable research practices, have caused a dramatic inflation in published effects, making it difficult to compare an actual effect with the real population effects (as these are unknown). In addition, there were very large differences in the mean effects between psychological sub-disciplines and between different study designs, making it impossible to apply any global benchmarks. Many more pre-registered studies are needed in the future to derive a reliable picture of real population effects. Apart from that, it is outlined how we can arrive at more theory-driven criteria for the interpretation of effects.
Here are a few notes:
- We should try to use unstandardized effects and derive a sensible interpretation of the effect’s practical meaning in the context of the specific research area (see Baguley 2009).
- We need many more studies with pre-registration to re-animate the comparison approach of pre-registered vs. non-pre-registered studies in the near future.
- We should always report effect sizes whenever possible.
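
To make the first note concrete, here is a minimal sketch of the difference between an unstandardized effect (a raw mean difference, in the units of the measure) and a standardized one (Cohen's d with a pooled standard deviation). The scores and the function names are made up for illustration; they are not from the seminar.

```python
from math import sqrt
from statistics import mean, stdev


def raw_mean_difference(treatment, control):
    """Unstandardized effect: difference of group means, in the original units."""
    return mean(treatment) - mean(control)


def cohens_d(treatment, control):
    """Standardized effect: mean difference divided by the pooled sample SD."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)  # sample SDs (n - 1 denominator)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return raw_mean_difference(treatment, control) / pooled_sd


# Hypothetical scores on some 0-10 well-being scale (illustrative data only).
treatment = [6.0, 7.0, 8.0, 9.0, 10.0]
control = [5.0, 6.0, 7.0, 8.0, 9.0]

raw = raw_mean_difference(treatment, control)
d = cohens_d(treatment, control)
print(f"raw difference: {raw:.2f} scale points, Cohen's d: {d:.2f}")
```

The raw difference can be read directly ("one point on the scale") and judged against what matters in the research area, whereas d depends on how variable the sample happens to be.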
Feature image from Pexels – CC0 license.