Counterfactual reasoning requires predicting how alternative events, contrary to what actually happened, might have resulted in different outcomes. Although counterfactual reasoning is considered a necessary component of AI-complete systems, few resources have been developed for evaluating it in narratives.
In this paper, we propose Counterfactual Story Rewriting: given an original story and a counterfactual event, the task is to generate a revised story that is compatible with the counterfactual event while being minimally edited from the original story. Solving this task will require a deeper understanding of causal narrative chains and counterfactual invariance, as well as the integration of such story reasoning capabilities into conditional language models.
We present a new dataset of 29,849 counterfactual rewritings, each with the original story, a new “branch” the story could take, and a human-generated alternative rewriting. Additionally, we include 81,407 counterfactual “branches” without a rewritten storyline to support future work on semi-supervised or unsupervised approaches to counterfactual story rewriting.
Finally, we evaluate the counterfactual rewriting capacities of several competitive baselines based on pretrained language models, and assess whether common overlap-based and model-based automatic metrics for text generation correlate well with human scores for counterfactual rewriting.
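To make the task definition concrete, the following is a minimal sketch of how one instance (original story, counterfactual event, rewritten story) might be represented, together with a rough proxy for the "minimally edited" requirement. The class name, field names, the `token_overlap` helper, and the example story are all hypothetical illustrations, not the dataset's actual schema or the metrics evaluated in this paper. The similarity ratio comes from Python's standard `difflib` and captures only surface overlap, not compatibility with the counterfactual event.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List


@dataclass
class RewritingInstance:
    """Hypothetical container for one task instance (illustrative field names)."""
    original_story: List[str]    # sentences of the original story
    counterfactual_event: str    # the new "branch" the story could take
    rewritten_story: List[str]   # revised story compatible with the branch


def token_overlap(original: List[str], rewritten: List[str]) -> float:
    """Similarity ratio in [0, 1] over whitespace tokens; 1.0 means identical.

    A crude stand-in for "minimal edits": higher values mean the rewrite
    keeps more of the original wording.
    """
    a = " ".join(original).split()
    b = " ".join(rewritten).split()
    return SequenceMatcher(None, a, b).ratio()


# Made-up example, not drawn from the dataset.
example = RewritingInstance(
    original_story=[
        "Ana packed an umbrella.",
        "It rained all afternoon.",
        "She stayed dry on her walk home.",
    ],
    counterfactual_event="Ana forgot her umbrella.",
    rewritten_story=[
        "Ana forgot her umbrella.",
        "It rained all afternoon.",
        "She got soaked on her walk home.",
    ],
)

score = token_overlap(example.original_story, example.rewritten_story)
print(f"token overlap with the original story: {score:.2f}")
```

A high overlap score alone does not indicate a good rewrite: copying the original story verbatim would score 1.0 while ignoring the counterfactual event, which is one reason the correlation between automatic metrics and human scores needs to be assessed.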