Competitive Reaction Time Task

Publications: 130   |   Quantification Strategies: 157



What is the Competitive Reaction Time Task (CRTT)?

The Competitive Reaction Time Task, sometimes also called the Taylor Aggression Paradigm (TAP), is one of the most commonly used tests to purportedly measure aggressive behavior in a laboratory environment. In the CRTT, participants are led to believe they are playing a computerized reaction time game against another participant in an adjacent room. At the beginning of each round, both participants set the intensity (volume and/or duration) of a noise blast. Then, they have to react to a stimulus as quickly as possible by pressing a button, and the faster player is declared the winner. The loser of the round is then punished with a noise blast using the intensity settings chosen by the winner at the beginning of that round. These intensity settings are used as the measure of aggressive behavior.

How the CRTT is Flexible

While the CRTT ostensibly measures how much unpleasant, or even harmful, noise a participant is willing to administer to a nonexistent confederate, that amount of noise can be extracted as a measure in myriad ways using various combinations of volume and duration over one or more trials. There are currently 130 publications in which results are based on the CRTT, and together they report 157 different quantification strategies! Not all of these quantification strategies differ substantially from each other (e.g., average volume of 25 trials vs. average volume of 30 trials), and thus choosing one over the other would not be expected to change a study's results dramatically. In other cases, however, the differences are more apparent (e.g., the logarithmized product of volume and duration in the first trial vs. the average volume across all trials), and the specific choice is frequently not justified in the respective publications.
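To illustrate the point, here is a minimal sketch of how several quantification strategies can be computed from the same raw CRTT data. The trial values and the three strategies shown are hypothetical examples, not taken from any particular publication:

```python
import math

# Hypothetical raw data from one participant: per-trial volume settings
# (on a 1-10 scale) and noise durations (in milliseconds).
volumes = [5, 7, 3, 8, 6, 9, 4, 7, 5, 6]
durations = [500, 900, 300, 1000, 700, 950, 400, 850, 600, 700]

# Strategy A: mean volume across all trials
mean_volume = sum(volumes) / len(volumes)

# Strategy B: mean volume across the first five trials only
mean_volume_first5 = sum(volumes[:5]) / 5

# Strategy C: logarithmized product of volume and duration in the first trial
log_product_first = math.log(volumes[0] * durations[0])

print(mean_volume, mean_volume_first5, log_product_first)
```

Each strategy yields a different "aggression score" for the very same participant, which is exactly the flexibility at issue.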

This archive does not contain all variations of the CRTT, as some procedural differences are so substantial that their quantification strategies would be impossible to compare. For example, there are papers in which a hypothetical CRTT is reported (i.e., participants are given a description of the procedure and are then asked what volume/duration settings they would choose), while many others use electric shocks instead of noise blasts. All original research included here reported the CRTT as at least one round of a reaction time game in which the loser is punished with a noise blast set by the opponent. Their procedures may still differ in terms of, among other things, whether participants can set the volume and/or duration of the noise blast, whether there is a non-aggressive response option, the number of rounds played, or the opponent's behavior pattern (e.g., ambiguously or increasingly aggressive). Most importantly, this website documents differences in the quantification strategies used to reduce raw data to one or multiple scores for aggressive behavior. Note that grey literature (dissertations, conference presentations) is not yet included.

Why the Flexibility is Problematic

Given the number of different versions of the CRTT measure that can be extracted from its use in a study, it is very easy for a researcher to analyze several (or several dozen) versions of the CRTT outcome measures, running hypothesis tests with one version after another until one is found that produces the desired pattern of results. Because the measure has been used in several dozen different ways in the published research, often in multiple ways by the same authors (and sometimes even in multiple ways for different analyses within the same paper), it appears likely that selective reporting of results after exploring analyses with multiple versions of the CRTT is not uncommon. Even when multiple quantifications are reported in a paper, it is often unclear how to interpret the findings, given that in many cases no one quantification strategy has greater validity than the others.

The severity of the problem posed by multiple CRTT quantification strategies depends on their convergence. If all quantification strategies were perfectly correlated, switching between them could not lead to different outcomes in statistical analyses (and thus to different inferences). Estimating to what extent these quantification strategies converge is difficult without access to raw data. Researchers willing to share their data are welcome to reach out to me in order to establish a classification of similar and dissimilar quantification strategies.
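The kind of convergence check that raw data would permit can be sketched as follows. The participant data and the two strategies compared are entirely simulated and serve only to show the idea of correlating scores produced by different quantification strategies:

```python
# Pearson correlation, implemented directly to keep the sketch self-contained.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

# One row of simulated volume settings (25 trials) per participant.
participants = [
    [5, 7, 3, 8, 6] * 5,
    [2, 4, 6, 3, 5] * 5,
    [9, 8, 7, 9, 8] * 5,
    [1, 3, 2, 4, 2] * 5,
]

# Strategy A: mean volume over all 25 trials.
a = [sum(p) / len(p) for p in participants]
# Strategy B: volume on the first trial only.
b = [p[0] for p in participants]

print(pearson(a, b))
```

A correlation near 1 would suggest the two strategies are interchangeable; a low correlation would mean that choosing between them could change a study's conclusions.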

The practical implication of this methodological flexibility is that findings of studies in which the CRTT is used have to be considered and interpreted with extreme caution, particularly when detailed reasons for any modification are not provided. This, however, does not imply that the stimuli used in those studies do not actually induce or increase aggressiveness; any research looking into the causes of human aggression is of high relevance, as long as the results are reliable. Readers, including reviewers and editors of journals that publish empirical work on human aggression, need to be aware that current practices in the use of the CRTT might diminish the credibility and significance of laboratory research on aggression.

How the CRTT Could be Used Better

If one, or even a few, versions of the CRTT measure were established as most appropriate for research on aggression, and these versions were actually used consistently by researchers, concerns about post-hoc "cherry-picking" of measure versions to support hypotheses predicting effects of stimuli on aggression would be alleviated. Additionally, preregistration of analysis strategies would address many, if not all, of the flexibility concerns described here. Detailed recommendations are provided here.

All contents CC-BY Malte Elson (2016), Ruhr University Bochum; malte(dot)elson(at)rub(dot)de; @maltoesermalte