Academics at FHI bring the tools of mathematics, philosophy, social sciences, and the natural sciences to bear on big-picture questions about humanity and its prospects. Our mission is to shed light on crucial considerations that might shape humanity’s long-term future.
We currently divide our work into four focus areas:
Macrostrategy – Understanding which crucial considerations shape what is at stake for the future of humanity.
AI safety – Researching computer science techniques for building safer artificially intelligent systems.
AI strategy – Understanding how geopolitics, governance structures, and strategic trends will affect the development of advanced artificial intelligence.
Biorisk – Working with institutions around the world to reduce risks from especially dangerous pathogens.
FHI’s big-picture research focuses on the long-term consequences of our actions today and on the complicated dynamics that are bound to shape our future in significant ways. A key aspect of this work is the study of existential risks – events that endanger the survival of Earth-originating intelligent life or that threaten to drastically and permanently destroy our potential for realising a valuable future. Within this area we focus on the capabilities and impacts of future technologies (including the possibility and implications of Artificial General Intelligence or ‘Superintelligence’), existential risk assessment, anthropics, population ethics, the ethics of human enhancement, game theory, and the Fermi paradox. Many of the core concepts and techniques in this field originate from research by FHI scholars, and they are already having a practical impact, for example in the effective altruism movement.
That is not dead which can eternal lie: the aestivation hypothesis for resolving Fermi’s paradox
If a civilization wants to maximize computation, it appears rational to aestivate until the far future in order to exploit the low-temperature environment: this can produce a 10^30 multiplier of achievable computation. We hence suggest the “aestivation hypothesis”: the reason we are not observing manifestations of alien civilizations is that they are currently (mostly) inactive, patiently waiting for future cosmic eras.
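The scale of this multiplier can be illustrated with a back-of-the-envelope Landauer-limit calculation. The specific temperatures below are illustrative assumptions, not figures taken from the paper: erasing one bit of information costs at least k_B·T·ln 2 joules, so the number of irreversible operations achievable per joule of energy scales as 1/T.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def max_bit_erasures_per_joule(temperature_k: float) -> float:
    """Landauer limit: erasing one bit costs at least k_B * T * ln(2) joules."""
    return 1.0 / (K_B * temperature_k * math.log(2))

# Illustrative temperatures (assumptions): the cosmic background today is ~3 K;
# the horizon temperature of the far future is of order 1e-30 K.
t_now, t_future = 3.0, 3.0e-30

multiplier = max_bit_erasures_per_joule(t_future) / max_bit_erasures_per_joule(t_now)
print(f"computation multiplier ~ 10^{math.log10(multiplier):.0f}")  # ~ 10^30
```

Since the energy cost per operation is linear in temperature, the multiplier is simply the ratio of the two temperatures, which is how a factor of order 10^30 arises.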
Underprotection of unpredictable statistical lives compared to predictable ones
Existing ethical discussion considers the differences in care for identified versus statistical lives. However, there has been little attention to the different degrees of care that are taken for different kinds of statistical lives. Read more.
Superintelligence: paths, dangers, strategies
Superintelligence asks the questions: What happens when machines surpass humans in general intelligence? Will artificial agents save or destroy us? Nick Bostrom lays the foundation for understanding the future of humanity and intelligent life. Read More >>
The unilateralist's curse: the case for a principle of conformity
This article considers agents purely motivated by an altruistic concern for the common good, and shows that if each agent acts on her own personal judgment as to whether an initiative should be undertaken, then the initiative will move forward more often than is optimal. The article terms this the unilateralist’s curse. Read more.
Existential risk reduction as global priority
This paper discusses existential risks. It argues that, despite the enormous expected value of reducing existential risk, issues surrounding human-extinction risks and related hazards remain poorly understood. Read more.
Global catastrophic risks
In Global Catastrophic Risks, 25 leading experts look at the gravest risks facing humanity in the 21st century, including asteroid impacts, gamma-ray bursts, Earth-based natural catastrophes, nuclear war, terrorism, global warming, biological weapons, totalitarianism, advanced nanotechnology, general artificial intelligence, and social collapse. The book also addresses over-arching issues – policy responses and methods for predicting and managing catastrophes. Read more.
Anthropic bias: observation selection effects in science and philosophy
Anthropic Bias explores how to reason when you suspect that your evidence is biased by “observation selection effects” – that is, evidence that has been filtered by the precondition that there be some suitably positioned observer to “have” the evidence. Read more.
Probing the improbable: methodological challenges for risks with low probabilities and high stakes
This paper argues that important new methodological problems arise when assessing global catastrophic risks, focusing on a problem regarding probability estimation. Read more.
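The core of the estimation problem can be shown with a short calculation (the numbers below are invented for illustration, not taken from the paper): by the law of total probability, your credence in a catastrophe cannot fall much below the probability that the risk argument itself is flawed, no matter how small the argument’s headline figure is.

```python
# Law of total probability: P(X) = P(X|sound) * P(sound) + P(X|flawed) * P(flawed)
p_model_flawed = 1e-3       # chance the risk argument contains an error (assumption)
p_risk_given_sound = 1e-9   # the model's own tiny risk estimate (assumption)
p_risk_given_flawed = 1e-2  # risk if the argument fails in some unknown way (assumption)

p_risk = (p_risk_given_sound * (1 - p_model_flawed)
          + p_risk_given_flawed * p_model_flawed)

# The overall estimate is dominated by the flawed-model term (~1e-5),
# four orders of magnitude above the model's headline figure of 1e-9.
print(p_risk)
```

With these illustrative numbers, the chance that the argument is wrong, not the argument’s own output, sets the floor on the risk estimate.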
The reversal test: eliminating status quo bias in bioethics
Explores whether we have reason to believe that the long-term consequences of human cognitive enhancement would be, on balance, good. Read more.
How unlikely is a doomsday catastrophe?
This article considers existential risks and how many previous bounds on their frequency give a false sense of security. It derives a new upper bound of one per 10^9 years (99.9% c.l.) on the exogenous terminal catastrophe rate that is free of such selection bias, using planetary age distributions and the relatively late formation time of Earth. Read more.
Astronomical waste: the opportunity cost of delayed technological development
Surveys of leading AI researchers suggest a significant probability of human-level machine intelligence being achieved this century. Machines already outperform humans on several narrowly defined tasks, but the prospect of artificial general intelligence (AGI) would introduce novel challenges: an AGI’s goal system would need to be carefully designed to ensure that its actions are safe and beneficial. Avoiding AGI’s potential negative impact on the future of humanity may be one of the most important challenges of this century.
Current problems in AI safety include the risk of agents gaming their own reward functions, ensuring that reinforcement learners can safely explore their environments, and avoiding negative side effects of objective functions. FHI works closely with DeepMind and other leading actors in the development of artificial intelligence.
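A toy illustration of the first of these problems, reward gaming (the scenario is entirely hypothetical, not drawn from any FHI paper): an agent that greedily maximises a proxy reward can pick an action that scores well on the proxy while doing nothing useful.

```python
# A hypothetical cleaning robot is rewarded per mess its camera *observes*
# being removed (the proxy); what we actually care about is messes removed.
actions = {
    "clean":        {"proxy_reward": 1.0, "true_reward": 1.0},
    # Covering the camera lets the robot misreport a spotless room,
    # inflating the proxy while achieving nothing.
    "cover_camera": {"proxy_reward": 2.0, "true_reward": 0.0},
}

def greedy(reward_key: str) -> str:
    """Pick the action that maximises the given reward signal."""
    return max(actions, key=lambda a: actions[a][reward_key])

print(greedy("proxy_reward"))  # the agent games the proxy: "cover_camera"
print(greedy("true_reward"))   # the intended behaviour: "clean"
```

The gap between the two answers is the problem: optimisation pressure flows toward whatever the reward function actually measures, not what its designers intended.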
Trial without Error: Towards Safe RL with Human Intervention
This paper introduces exploration potential, a quantity that measures how much a reinforcement learning agent has explored its environment class. In contrast to information gain, exploration potential takes the problem’s reward structure into account. This leads to an exploration criterion that is both necessary and sufficient for asymptotic optimality (learning to act optimally across the entire environment class). Read more >>
Safely interruptible agents
A formal solution to the grain of truth problem
A Bayesian agent acting in a multi-agent environment learns to predict the other agents’ policies if its prior assigns positive probability to them (in other words, its prior contains a grain of truth). Finding a reasonably large class of policies that contains the Bayes-optimal policies with respect to this class is known as the grain of truth problem. This paper presents a formal and general solution to the full grain of truth problem. Read more >>
Thompson sampling is asymptotically optimal in general environments
This paper discusses a variant of Thompson sampling for nonparametric reinforcement learning in a countable class of general stochastic environments. It shows that Thompson sampling learns the environment class in the sense that (1) its value asymptotically converges to the optimal value in mean and (2) given a recoverability assumption, regret is sublinear. Read more >>
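For intuition, here is a minimal Thompson sampling sketch in the much simpler Bernoulli bandit setting (an illustrative toy, not the paper’s general reinforcement learning setting): each arm keeps a Beta posterior over its success rate, and the agent acts greedily with respect to a single sample from that posterior.

```python
import random

def thompson_bandit(true_probs, steps=2000, seed=0):
    """Thompson sampling for Bernoulli arms with Beta(1, 1) priors."""
    rng = random.Random(seed)
    successes = [1] * len(true_probs)  # Beta alpha parameters
    failures = [1] * len(true_probs)   # Beta beta parameters
    pulls = [0] * len(true_probs)
    for _ in range(steps):
        # Sample a plausible success rate per arm; act greedily on the samples.
        samples = [rng.betavariate(s, f) for s, f in zip(successes, failures)]
        arm = samples.index(max(samples))
        pulls[arm] += 1
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls

pulls = thompson_bandit([0.8, 0.2])
# As the posterior sharpens, the agent concentrates its pulls on the better arm.
```

Sampling from the posterior, rather than always exploiting the current best estimate, is what gives Thompson sampling its exploration behaviour: arms with uncertain posteriors occasionally produce high samples and get tried.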
Learning the preferences of ignorant, inconsistent agents
An analysis of what people value and how this relates to machine learning. Read more.
Off-policy Monte Carlo agents with variable behaviour policies
This paper looks at the convergence property of off-policy Monte Carlo agents with variable behaviour policies. It presents results about convergence and lack of convergence. Read more.
Corrigibility
An introduction to the notion of corrigibility and an analysis of utility functions that attempt to make an agent shut down safely if a shutdown button is pressed, while avoiding incentives to prevent the button from being pressed or to cause the button to be pressed, and while ensuring that the shutdown behavior propagates as the agent creates new subsystems or self-modifies. Read more.
Learning the preferences of bounded agents
This paper explicitly models structured deviations from optimality when inferring preferences and beliefs. The authors use models of bounded and biased cognition as part of a generative model for human choices in decision problems, and infer preferences by inverting this model. Read more.
In addition to working directly on the technical problem of safety in AI systems, FHI examines the broader strategic, ethical, and policy issues involved in reducing the risks of long-term developments in machine intelligence. Given that the actual development of AI systems is shaped by the strategic incentives of nations, firms, and individuals, we research norms and institutions that might support the safe development of AI. For example, transparency about different parts of the AI research process shapes the incentives for making safety a priority in AI design in different ways.
As part of this work, we participate as members of the Partnership on AI to advise industry and research partners and work with governments around the world on aspects of long-run AI policy. We have worked with or consulted for the UK Prime Minister’s Office, the United Nations, the World Bank, the Global Risk Register, and a handful of foreign ministries.
When Will AI Exceed Human Performance? Evidence from AI Experts
Advances in artificial intelligence (AI) will transform modern life by reshaping transportation, health, science, finance, and the military. To adapt public policy, we need to better anticipate these advances. Here we report the results from a large survey of machine learning researchers on their beliefs about progress in AI. Read more >>
Policy Desiderata in the Development of Machine Superintelligence
This paper seeks to initiate discussion of challenges and opportunities posed by the potential development of superintelligence by identifying a set of distinctive features of the transition to a machine intelligence era. From these distinctive features, we derive a correlative set of policy desiderata—considerations that should be given extra weight in long-term AI policy compared to other policy contexts. Read more >>
Strategic implications of openness in AI development
This paper attempts a preliminary analysis of the global desirability of different forms of openness in AI development (including openness about source code, science, data, safety techniques, capabilities, and goals). Read more >>
Unprecedented technological risks
Over the next few decades, the continued development of dual-use technologies will provide major benefits to society, but it will also pose significant and unprecedented global risks. This report gives an overview of these risks and their importance, focusing on risks of extreme catastrophe. Read more.
The rapid development of biotechnology and genetic engineering has led to great advances in medicine and other sciences. However, it also increases the potential for existential crises through the engineering and possible release of harmful biological pathogens. FHI examines technical and ethical questions around these potential dangers in order to prevent low-probability but high-impact events that threaten global biosafety. One approach to reducing these risks is to improve policy in the area of biorisk and to ensure that laboratory standards and safety procedures are adequate. FHI is in the process of growing its considerations and research efforts in the field of biosafety. In the past, FHI researchers have worked with and consulted for the US President’s Council on Bioethics, the Global Risk Register, and some foreign ministries on policy in bioethics.
Beyond risk-benefit analysis: pricing externalities for gain-of-function research of concern
In this policy working paper we outline an approach for handling decisions about gain-of-function research of concern. Read more >>