Given the current situation with COVID-19, several people at my university's research center have started changing how they program their experiments, switching from lab-based to online studies. This raises new considerations, such as whether the timing of online experiments is as reliable as in the lab, whether participants are actually doing the task… and so on.
Here’s a paper reporting a mega-study that compared a range of lab-based and online experiment platforms:
PsyArXiv preprint, January 2020.
The timing mega-study: comparing a range of experiment generators, both lab-based and online
Bridges D., Pitiot A., MacAskill M. R., and Peirce J. W.
Abstract. Many researchers in the behavioral sciences depend on research software that presents stimuli, and records response times, with sub-millisecond precision. There are a large number of software packages with which to conduct these behavioural experiments and measure response times and performance of participants. Very little information is available, however, on what timing performance they achieve in practice. Here we report a wide-ranging study looking at the precision and accuracy of visual and auditory stimulus timing and response times, measured with a Black Box Toolkit. We compared a range of popular packages: PsychoPy, E-Prime®, NBS Presentation®, Psychophysics Toolbox, OpenSesame, Expyriment, Gorilla, jsPsych, Lab.js and Testable. Where possible, the packages were tested on Windows, MacOS, and Ubuntu, and in a range of browsers for the online studies, to try to identify common patterns in performance. Among the lab-based experiments, Psychtoolbox, PsychoPy, Presentation and E-Prime provided the best timing, all with mean precision under 1 millisecond across the visual, audio and response measures. OpenSesame had slightly less precision across the board, but most notably in audio stimuli and Expyriment had rather poor precision. Across operating systems, the pattern was that precision was generally very slightly better under Ubuntu than Windows, and that Mac OS was the worst, at least for visual stimuli, for all packages. Online studies did not deliver the same level of precision as lab-based systems, with slightly more variability in all measurements. That said, PsychoPy and Gorilla, broadly the best performers, were achieving very close to millisecond precision on a number of browser configurations. For response times (using a high-performance button box), most of the packages achieved precision at least under 10 ms in all browsers, with PsychoPy achieving a precision under 3.5 ms in all. 
There was considerable variability between operating systems and browsers, especially in audio-visual synchrony which is the least precise aspect of the browser-based experiments. Nonetheless, the data indicate that online methods can be suitable for a wide range of studies, with due thought about the sources of variability that result. The results, from over 110,000 trials, highlight the wide range of timing qualities that can occur even in these dedicated software packages for the task. We stress the importance of scientists making their own timing validation measurements for their own stimuli and computer configuration.
What a long abstract, right? To sum up, they found that:
- Among the lab-based experiments, Psychtoolbox, PsychoPy, Presentation, and E-Prime provided the best timing, all with mean precision under 1 millisecond across the visual, audio and response measures.
- Across operating systems, the pattern was that precision under Ubuntu was generally slightly better than Windows and that Mac OS was the worst.
- Online studies did not deliver the same level of precision as lab-based systems, with slightly more variability in all measurements.
- For response times (using a high-performance button box), most of the packages achieved precision at least under 10 ms in all browsers, with PsychoPy achieving a precision under 3.5 ms in all.
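The authors also stress that researchers should validate timing on their own hardware rather than trusting published benchmarks. A proper validation needs external equipment like the Black Box Toolkit they used, but as a first sanity check you can at least measure your software clock's resolution and the jitter of a nominal wait. Here is a minimal Python sketch of that idea (my own illustration, not code from the paper):

```python
# Quick software-side timing sanity check: how fine-grained is the clock,
# and how much does a nominal 1 ms wait actually vary on this machine?
# This does NOT replace hardware validation of stimulus/response timing.
import time
import statistics


def timer_resolution(n_samples: int = 10_000) -> float:
    """Smallest positive difference between consecutive perf_counter reads."""
    deltas = []
    last = time.perf_counter()
    for _ in range(n_samples):
        now = time.perf_counter()
        if now > last:
            deltas.append(now - last)
        last = now
    return min(deltas)


def sleep_jitter(target_s: float = 0.001, n_trials: int = 50):
    """Mean and SD of the measured duration of time.sleep(target_s)."""
    durations = []
    for _ in range(n_trials):
        t0 = time.perf_counter()
        time.sleep(target_s)
        durations.append(time.perf_counter() - t0)
    return statistics.mean(durations), statistics.stdev(durations)


if __name__ == "__main__":
    print(f"Timer resolution: {timer_resolution() * 1e6:.3f} µs")
    mean_s, sd_s = sleep_jitter()
    print(f"1 ms sleep: mean {mean_s * 1e3:.3f} ms, SD {sd_s * 1e3:.3f} ms")
```

On a typical desktop the SD of the sleep duration is well above zero, which is exactly the kind of variability the paper is warning about: software-level timing has overhead and jitter that only measurement on your own configuration can reveal.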
Feature image from Pexels – CC0 license.