Academics at FHI bring the tools of mathematics, philosophy, social sciences, and the natural sciences to bear on big-picture questions about humanity and its prospects. Our mission is to shed light on crucial considerations that might shape humanity’s long-term future.

We currently divide our work into four focus areas: Macrostrategy, AI Safety, the Center for the Governance of AI, and Biotechnology.


Macrostrategy

Investigating which crucial considerations are shaping what is at stake for the future of humanity

FHI’s big-picture research focuses on the long-term consequences of our actions today and the complicated dynamics that are bound to shape our future in significant ways. A key aspect of this is the study of existential risks – events that endanger the survival of Earth-originating, intelligent life or that threaten to drastically and permanently destroy our potential for realising a valuable future. Our focus within this area lies in the capabilities and impacts of future technologies (including the possibility and impact of Artificial General Intelligence or ‘Superintelligence’), existential risk assessment, anthropics, population ethics, human enhancement ethics, game theory, and consideration of the Fermi paradox. Many of the core concepts and techniques within this field originate from research by FHI scholars, and they are already having a practical impact, such as in the effective altruism movement.

Featured macrostrategy publications

That is not dead which can eternal lie: the aestivation hypothesis for resolving Fermi’s paradox (2017)

Anders Sandberg, Stuart Armstrong, Milan Cirkovic

If a civilization wants to maximize computation, it appears rational to aestivate until the far future in order to exploit the low-temperature environment: this can produce a 10^30 multiplier of achievable computation. We hence suggest the “aestivation hypothesis”: the reason we are not observing manifestations of alien civilizations is that they are currently (mostly) inactive, patiently waiting for future cosmic eras.

Underprotection of unpredictable statistical lives compared to predictable ones (2016)

Marc Lipsitch, Nicholas G. Evans, Owen Cotton-Barratt

Existing ethical discussion considers the differences in care for identified versus statistical lives. However, there has been little attention to the different degrees of care that are taken for different kinds of statistical lives.

Superintelligence (2014)

Nick Bostrom

Superintelligence asks the questions: What happens when machines surpass humans in general intelligence? Will artificial agents save or destroy us? Nick Bostrom lays the foundation for understanding the future of humanity and intelligent life.

The unilateralist’s curse: the case for a principle of conformity

Nick Bostrom, Thomas Douglas & Anders Sandberg

This article considers situations in which each of several agents, all purely motivated by an altruistic concern for the common good, is able to undertake an initiative unilaterally. It shows that if each agent acts on her own personal judgment as to whether the initiative should be undertaken, then the initiative will move forward more often than is optimal. This phenomenon is the unilateralist’s curse.
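The effect described in this abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' model: it assumes, purely for illustration, that the initiative has a fixed negative true value, that each agent sees that value plus independent Gaussian noise, and that the initiative goes ahead as soon as any single agent's estimate is positive. The function name and all parameters are hypothetical.

```python
import random


def mistaken_action_rate(n_agents, true_value=-1.0, noise_sd=1.0,
                         trials=20000, seed=0):
    """Estimate how often a harmful initiative is undertaken.

    Illustrative assumptions (not taken from the paper):
    - the initiative's true value is negative, so never acting is optimal;
    - each agent's estimate is the true value plus independent Gaussian noise;
    - the initiative is undertaken if at least one agent's estimate is positive.
    """
    rng = random.Random(seed)
    undertaken = 0
    for _ in range(trials):
        # One agent with an over-optimistic estimate is enough to act.
        if any(true_value + rng.gauss(0.0, noise_sd) > 0
               for _ in range(n_agents)):
            undertaken += 1
    return undertaken / trials


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, "agents:", round(mistaken_action_rate(n), 3))
```

Under these assumptions the rate of mistaken action rises with the number of agents, even though every agent is well-intentioned and individually unbiased. That is the sense in which unilateral action happens "more often than is optimal".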

Existential risk reduction as global priority

Nick Bostrom

This paper discusses existential risks. It argues that, despite the enormous expected value of reducing existential risk, issues surrounding human-extinction risks and related hazards remain poorly understood.

Global Catastrophic Risks

Nick Bostrom, Milan M. Cirkovic

In Global Catastrophic Risks, 25 leading experts look at the gravest risks facing humanity in the 21st century, including asteroid impacts, gamma-ray bursts, Earth-based natural catastrophes, nuclear war, terrorism, global warming, biological weapons, totalitarianism, advanced nanotechnology, general artificial intelligence, and social collapse. The book also addresses over-arching issues – policy responses and methods for predicting and managing catastrophes.

Anthropic bias

Nick Bostrom

Anthropic Bias explores how to reason when you suspect that your evidence is biased by “observation selection effects” – that is, evidence that has been filtered by the precondition that there be some suitably positioned observer to “have” the evidence.


Probing the improbable: methodological challenges for risks with low probabilities and high stakes

Toby Ord, Rafaela Hillerbrand, Anders Sandberg

This paper argues that important new methodological problems arise when assessing global catastrophic risks, and focuses on a problem regarding probability estimation.

The reversal test: eliminating status quo bias in bioethics

Nick Bostrom, Toby Ord

This paper explores whether we have reason to believe that the long-term consequences of human cognitive enhancement would be, on balance, good.


How unlikely is a doomsday catastrophe?

Max Tegmark, Nick Bostrom

This article considers existential risks and how many previous bounds on their frequency give a false sense of security. It derives a new upper bound of one per 10^9 years (99.9% c.l.) on the exogenous terminal catastrophe rate that is free of such selection bias, using planetary age distributions and the relatively late formation time of Earth.


Astronomical waste: the opportunity cost of delayed technological development

Nick Bostrom

This paper considers how, with very advanced technology, a very large population of people living happy lives could be sustained in the accessible region of the universe. It emphasizes that for every year that the development of such technologies and the colonization of the universe are delayed, there is an opportunity cost.


AI Safety

Researching computer science techniques for building safer artificially intelligent systems

Surveys of leading AI researchers suggest a significant probability of human-level machine intelligence being achieved this century. Machines already outperform humans on several narrowly defined tasks, but artificial general intelligence (AGI) would introduce novel challenges. An AGI’s goal system would need to be carefully designed to ensure that its actions are safe and beneficial. Avoiding AGI’s potential negative impact on the future of humanity may be one of the most important challenges of this century.

Current problems in AI safety include preventing AI systems from gaming their own reward functions, enabling reinforcement learners to explore their environments safely, and avoiding negative side effects of goal functions. FHI works closely with DeepMind and other leading actors in the development of artificial intelligence.

Featured AI safety publications

Trial without Error: Towards Safe RL with Human Intervention (2017)

William Saunders, Girish Sastry, Andreas Stuhlmüller, Owain Evans

How can AI systems learn safely in the real world? Self-driving cars have safety drivers, people who sit in the driver’s seat and constantly monitor the road, ready to take control if an accident looks imminent. Could reinforcement learning systems also learn safely by having a human overseer?


Exploration potential

Jan Leike

This paper introduces exploration potential, a quantity that measures how much a reinforcement learning agent has explored its environment class. In contrast to information gain, exploration potential takes the problem’s reward structure into account. This leads to an exploration criterion that is both necessary and sufficient for asymptotic optimality (learning to act optimally across the entire environment class).

Safely interruptible agents

Laurent Orseau, Stuart Armstrong

This paper provides a formal definition of safe interruptibility and exploits the off-policy learning property to prove that some agents, like Q-learning, are already safely interruptible, while others, like Sarsa, can easily be made so. It shows that even ideal, uncomputable reinforcement learning agents can be made safely interruptible.

A formal solution to the grain of truth problem

Jan Leike, Jessica Taylor, Benya Fallenstein

A Bayesian agent acting in a multi-agent environment learns to predict the other agents’ policies if its prior assigns positive probability to them (in other words, its prior contains a grain of truth). Finding a reasonably large class of policies that contains the Bayes-optimal policies with respect to this class is known as the grain of truth problem. This paper presents a formal and general solution to the full grain of truth problem.

Thompson sampling is asymptotically optimal in general environments

Jan Leike, Tor Lattimore, Laurent Orseau, Marcus Hutter

This paper discusses a variant of Thompson sampling for nonparametric reinforcement learning in countable classes of general stochastic environments. It shows that Thompson sampling learns the environment class in the sense that (1) asymptotically its value converges to the optimal value in mean and (2) given a recoverability assumption, regret is sublinear.


Learning the preferences of ignorant, inconsistent agents

Owain Evans, Andreas Stuhlmüller, Noah D. Goodman

An analysis of what people value and how this relates to machine learning.


Off-policy Monte Carlo agents with variable behaviour policies

Stuart Armstrong

This paper looks at the convergence properties of off-policy Monte Carlo agents with variable behaviour policies. It presents results about convergence and lack of convergence.



Corrigibility

Nate Soares, Benja Fallenstein, Eliezer Yudkowsky, Stuart Armstrong

An introduction to the notion of corrigibility and analysis of utility functions that attempt to make an agent shut down safely if a shutdown button is pressed, while avoiding incentives to prevent the button from being pressed or cause the button to be pressed, and while ensuring propagation of the shutdown behavior as it creates new subsystems or self-modifies.


Learning the preferences of bounded agents

Owain Evans, Andreas Stuhlmüller, Noah D. Goodman

This paper explicitly models structured deviations from optimality when inferring preferences and beliefs. It uses models of bounded and biased cognition as part of a generative model for human choices in decision problems, and infers preferences by inverting this model.


Center for the Governance of AI

Understanding how geopolitics, governance structures, and strategic trends will affect the development of advanced artificial intelligence

In addition to working directly on the technical problem of safety in AI systems, FHI examines the broader strategic, ethical, and policy issues involved in reducing the risks of long-term developments in machine intelligence. Given that the actual development of AI systems is shaped by the strategic incentives of nations, firms, and individuals, we research norms and institutions that might support the safe development of AI. For example, transparency about different parts of the AI research process can shape incentives for making safety a priority in AI design in different ways.

As part of this work, we participate as members of the Partnership on AI to advise industry and research partners and work with governments around the world on aspects of long-run AI policy. We have worked with or consulted for the UK Prime Minister’s Office, the United Nations, the World Bank, the Global Risk Register, and a handful of foreign ministries.

You can read more about the Center for the Governance of AI here.

Featured publications

When Will AI Exceed Human Performance? Evidence from AI Experts

Katja Grace, John Salvatier, Allan Dafoe, Baobao Zhang and Owain Evans

Advances in artificial intelligence (AI) will transform modern life by reshaping transportation, health, science, finance, and the military. To adapt public policy, we need to better anticipate these advances. Here we report the results from a large survey of machine learning researchers on their beliefs about progress in AI.

Policy Desiderata in the Development of Machine Superintelligence

Nick Bostrom, Allan Dafoe, Carrick Flynn

This paper seeks to initiate discussion of challenges and opportunities posed by the potential development of superintelligence by identifying a set of distinctive features of the transition to a machine intelligence era. From these distinctive features, we derive a correlative set of policy desiderata—considerations that should be given extra weight in long-term AI policy compared to other policy contexts.


Strategic implications of openness in AI development

Nick Bostrom

This paper attempts a preliminary analysis of the global desirability of different forms of openness in AI development (including openness about source code, science, data, safety techniques, capabilities, and goals).


Unprecedented technological risks

Over the next few decades, the continued development of dual-use technologies will provide major benefits to society. They will also pose significant and unprecedented global risks. This report gives an overview of these risks and their importance, focusing on risks of extreme catastrophe.


Biotechnology

Working with institutions around the world to reduce risks from especially dangerous pathogens

Rapid developments in biotechnology and genetic engineering will pose novel risks and opportunities for humanity in the decades to come. Arms races or proliferation with advanced bioweapons could pose existential risks to humanity, while advanced medical countermeasures could dramatically reduce these risks. Human enhancement technologies could radically change the human condition. FHI’s biotechnology research group conducts cutting-edge research on advanced biotechnology and its impacts on existential risk and the future of humanity. In addition to research, the group regularly advises policymakers: for example, FHI researchers have consulted with the US President’s Council on Bioethics, the US National Academy of Sciences, the Global Risk Register, and the UK Synthetic Biology Leadership Council, and have also served on the board of DARPA’s SafeGenes programme and directed iGEM’s safety and security system.

Featured biotechnology publications

Beyond risk-benefit analysis: pricing externalities for gain-of-function research of concern

Owen Cotton-Barratt, Sebastian Farquhar, Andrew Snyder-Beattie

In this policy working paper, we outline an approach for handling decisions about Gain of Function research of concern.

Human Agency and Global Catastrophic Biorisks

Piers Millett, Andrew Snyder-Beattie

Given that events such as the Black Death and the introduction of smallpox to the Americas have comprised some of the greatest catastrophes in human history, it is natural to examine the possibility of global catastrophic biological risks (GCBRs). In the particularly extreme case of human extinction or permanent collapse of human civilization, such GCBRs would jeopardize the very existence of many thousands of future generations. Does the category of GCBR merit special research effort?

Existential Risk and Cost-Effective Biosecurity

Piers Millett, Andrew Snyder-Beattie

This paper provides an overview of biotechnological extinction risk, makes some estimates for how severe the risks might be, and compares the cost-effectiveness of reducing these extinction-level risks with existing biosecurity work. The authors find that reducing human extinction risk can be more cost-effective than reducing smaller-scale risks, even when using conservative estimates. This suggests that the risks are not low enough to ignore and that more ought to be done to prevent the worst-case scenarios.

Embryo Selection for Cognitive Enhancement: Curiosity or game-changer?

Carl Shulman, Nick Bostrom

In this article, we analyze the feasibility, timescale, and possible societal impacts of embryo selection for cognitive enhancement. We find that embryo selection, on its own, may have significant (but likely not drastic) impacts over the next 50 years, though large effects could accumulate over multiple generations. However, there is a complementary technology – stem cell-derived gametes – which has been making rapid progress and which could amplify the impact of embryo selection, enabling very large changes if successfully applied to humans.