Management Science, Accepted October 2025
Author(s): Xinlan Emily Hu, Mark E. Whiting, Linnea Gandhi, Duncan J. Watts, and Abdullah Almaatouq
Research on teams spans many contexts, but integrating knowledge from heterogeneous sources is challenging because studies typically examine different tasks that cannot be directly compared. Most investigations involve teams working on just one or a handful of tasks, and researchers lack principled ways to quantify how similar or different these tasks are from one another. We address this challenge by introducing the “Task Space,” a multidimensional space in which tasks—and the distances between them—can be represented formally, and use it to create a “Task Map” of 102 crowd-annotated tasks from the published experimental literature. We then demonstrate the Task Space’s utility by performing an integrative experiment that addresses a fundamental question in team research: when do interacting groups outperform individuals? Our experiment samples 20 diverse tasks from the Task Map at three complexity levels and recruits 1,231 participants to work either individually or in groups of three or six (180 experimental conditions). We find striking heterogeneity in group advantage, with groups performing anywhere from three times worse to 60% better than the best individual working alone, depending on the task context. Critically, the Task Space makes this heterogeneity predictable: it significantly outperforms traditional typologies in predicting group advantage on unseen tasks. Our models also reveal theoretically meaningful interactions between task features; for example, group advantage on creative tasks depends on whether the answers are objectively verifiable. We conclude by arguing that the Task Space enables researchers to integrate findings across different experiments, thereby building cumulative knowledge about team performance.
Is a larger group more effective than a smaller group? Are groups more productive than individuals? Can AI enhance team performance?
The answers to many critical questions about teamwork depend on the task being performed. Consequently, researchers and practitioners alike need a practical way to quantify how similar or different two tasks are — and hence to understand how findings from one task might generalize to another.
The Task Space provides a framework to do exactly that. It allows researchers to represent any task as a point in a 24-dimensional space, where each dimension corresponds to a theoretically motivated feature of the task.
The Task Space opens up a range of potential applications, from integrating findings across experiments to predicting outcomes on unseen tasks.
As a starting point, we built a “map” of the space: a repository of 102 laboratory tasks from the interdisciplinary literature on group performance, rated along the 24 Task Space dimensions using a crowd annotation process.
The Task Map (a 102 × 24 matrix) can be downloaded here: task_map.csv.
The 24 dimensions are detailed here: 24_dimensions_clean.csv.
The 102 tasks and their descriptions are detailed here: 102_tasks_with_sources_clean.csv.
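Because tasks are points in a 24-dimensional space, distances between them are straightforward to compute. The sketch below illustrates the idea with a synthetic stand-in matrix (in practice you would load task_map.csv; its exact column layout is not assumed here):

```python
import numpy as np

# Hypothetical stand-in for the Task Map: 102 tasks (rows) rated along
# 24 Task Space dimensions (columns). Real values live in task_map.csv.
rng = np.random.default_rng(0)
task_map = rng.uniform(0, 1, size=(102, 24))

# Pairwise Euclidean distances between all tasks in the 24-D space.
diffs = task_map[:, None, :] - task_map[None, :, :]
distances = np.sqrt((diffs ** 2).sum(axis=-1))  # shape (102, 102)

# The task most similar to task 0 (excluding itself):
nearest = np.argsort(distances[0])[1]
```

Euclidean distance is just one reasonable choice; any metric over the 24 dimensions could be substituted.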
We also provide documentation on how to annotate new tasks, including the annotation rubric and the code used to generate the Task Space from raw annotations.
The questions displayed to raters can be found here: questions_displayed_to_raters.csv.
Resources for rater training and managing the annotation pipeline can be found here: rating pipelines/
The code used to generate the Task Space from raw annotations can be found here: generate_task_map_from_raw.Rmd.
Please use the interactive tools below to explore the data associated with our paper!
The following interactive visualizer allows you to explore the Task Map in two dimensions using PCA, and to conduct k-means clustering on the underlying 24-dimensional space. You can choose different values of k, and mouse over each dot to see the name of the task. You can use this tool to get an intuitive sense of the relationships between tasks.
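The visualizer's two steps — projecting to two dimensions with PCA and clustering in the full 24-dimensional space — can be reproduced offline. The sketch below uses a synthetic stand-in for the Task Map and a plain numpy implementation (SVD-based PCA and Lloyd's k-means), rather than any particular library:

```python
import numpy as np

# Synthetic stand-in for the 102 x 24 Task Map; the real matrix is in
# task_map.csv. Column meanings are not assumed here.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(102, 24))

# --- PCA to two dimensions (for plotting), via SVD of centered data ---
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
coords_2d = Xc @ Vt[:2].T  # shape (102, 2)

# --- k-means (Lloyd's algorithm) on the full 24-D space ---
def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        new = np.array([X[labels == j].mean(0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

labels = kmeans(X, k=5)  # k is a user choice, as in the visualizer
```

Note that, as in the interactive tool, clustering operates on the raw 24 dimensions while the 2-D coordinates serve only for display.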
In our paper, we demonstrate the use of the Task Space by conducting a large-scale integrative experiment measuring the phenomenon of group advantage — in which an interacting group outperforms individuals working alone. In our experiment, we define two types of group advantage: a strong advantage, in which the group outperforms the best individual working alone, and a weak advantage, in which the group outperforms the average individual.
Our experiment involved 20 tasks sampled from the Task Space, which were implemented at three levels of complexity (low, medium, and high) and completed by groups of two different sizes (3 and 6). In the following interactive panel, you can explore our data to see how group advantage varies across these experimental conditions: toggle between Strong/Weak Advantage, filter by complexity and group size, and hover to see task names.
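Reading the Strong/Weak toggle as "group beats the best individual" versus "group beats the average individual" (our interpretation), the two measures can be computed from scores on a single task. The numbers below are purely illustrative:

```python
import numpy as np

# Hypothetical scores on one task (higher = better); illustrative only.
individual_scores = np.array([0.42, 0.55, 0.61, 0.48, 0.70])
group_score = 0.75  # mean score of interacting groups on the same task

# Weak advantage: the group outperforms the average individual.
weak_advantage = group_score > individual_scores.mean()

# Strong advantage: the group outperforms the best individual.
strong_advantage = group_score > individual_scores.max()
```

A strong advantage implies a weak one, but not vice versa, which is why the two measures can diverge across the experimental conditions.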
A key takeaway is that group advantage is incredibly heterogeneous; there is no single answer to whether groups outperform individuals. Importantly, however, these differences are not random: the Task Space features explain 43% of the variance in group advantage, demonstrating that our framework can systematically account for variation in group outcomes like this one.
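The variance-explained figure comes from models that regress group advantage on the task features. The sketch below shows the general shape of such an analysis on fully synthetic data (simulated features and outcomes, ordinary least squares, in-sample R²); it is not the paper's actual model:

```python
import numpy as np

# Synthetic illustration: regress a simulated group-advantage outcome
# on 24 task features. All data here are fabricated for the sketch.
rng = np.random.default_rng(2)
features = rng.uniform(0, 1, size=(102, 24))   # tasks x dimensions
true_w = rng.normal(size=24)                   # fictitious "true" effects
advantage = features @ true_w + rng.normal(scale=0.1, size=102)

# Ordinary least squares fit and in-sample R^2.
coef, *_ = np.linalg.lstsq(features, advantage, rcond=None)
pred = features @ coef
r2 = 1 - ((advantage - pred) ** 2).sum() / ((advantage - advantage.mean()) ** 2).sum()
```

In practice, predictive claims about unseen tasks require held-out evaluation (e.g., leaving tasks out during fitting) rather than in-sample R².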
If you are interested in learning more, our data, code, and materials are available in our full reproduction package, hosted on GitHub (https://github.com/Watts-Lab/task-mapping).
The paper’s authors are listed below. For feedback, questions, or suggestions for new tasks and dimensions, please reach out to the Corresponding Authors.
We also acknowledge that this work was created with support from many other people, including research assistants at the University of Pennsylvania and the labor of Amazon Mechanical Turk workers.
This project is part of the group dynamics / integrative experiments research at the Computational Social Science Lab at Penn. You can learn more about our lab here.