A recent FHI project investigated whether AI systems can predict human deliberative judgments. Today’s AI systems are good at imitating quick, “intuitive” human judgments in areas including vision, speech recognition, and sentiment analysis. Yet some important decisions can’t be made quickly: they require careful thinking, research, and analysis. For example, a judge should not decide a case on a snap judgment. Likewise, deciding whether a news story is fake or misleading can require extensive research.
We collected a dataset of human deliberative judgments using a new app, ThinkAgain. The human participants had to answer Fermi estimation questions (“Is the weight of a blue whale < 50,000 kg?”) and also had to evaluate the veracity of statements by politicians (“When I was Governor, we cut the rate of growth in government” — Mitt Romney in 2012). Both domains are challenging, and humans give better answers when given more time for thinking and research.
Our paper compares a range of machine learning algorithms on the task of predicting human deliberative judgments. These include collaborative filtering algorithms, neural collaborative filtering, and Bayesian hierarchical regression. Our dataset turned out to have some important limitations (described in the paper), so additional work is needed to carry out some of the goals of this project.
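To make the collaborative-filtering framing concrete: each participant answers only some questions, giving a participant-by-question matrix of judgments with missing entries, and the task is to predict the unobserved cells. The sketch below is not the paper's code; it is a minimal logistic matrix-factorization baseline on synthetic data, with hypothetical names (`factorize`, the matrix `R`) chosen for illustration.

```python
# Minimal collaborative-filtering sketch (not the paper's code): predict a
# participant's missing binary judgment from a participant-by-question matrix.
# Entries are 1/0 judgments; NaN marks question pairs a participant skipped.
import numpy as np

rng = np.random.default_rng(0)

def factorize(R, k=2, steps=2000, lr=0.05, reg=0.01):
    """Fit participant and question embeddings by SGD on observed entries."""
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, k))  # participant embeddings
    V = 0.1 * rng.standard_normal((n_items, k))  # question embeddings
    obs = np.argwhere(~np.isnan(R))              # indices of observed cells
    for _ in range(steps):
        i, j = obs[rng.integers(len(obs))]       # sample one observed cell
        pred = 1 / (1 + np.exp(-U[i] @ V[j]))    # sigmoid for a 0/1 judgment
        err = R[i, j] - pred
        grad_u = err * V[j] - reg * U[i]
        grad_v = err * U[i] - reg * V[j]
        U[i] += lr * grad_u
        V[j] += lr * grad_v
    return U, V

# Synthetic judgment matrix: 4 participants x 3 questions, one missing entry.
R = np.array([[1, 0, 1],
              [1, 0, np.nan],
              [0, 1, 0],
              [1, 0, 1]], dtype=float)
U, V = factorize(R)
# Predicted probability for the missing judgment (participant 1, question 2).
p = 1 / (1 + np.exp(-U[1] @ V[2]))
print(round(float(p), 2))
```

The neural collaborative filtering and Bayesian hierarchical models in the paper replace the dot product with learned nonlinear interactions and pooled priors over participants, respectively, but the prediction target is the same matrix-completion setup.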
The paper is here.
The dataset and code are here.