This post outlines activities at the Future of Humanity Institute during July, August and September 2016. This post is also being distributed to our newsletter mailing list, which you can subscribe to here.
Nick Bostrom is currently working on a paper on cooperation, legitimacy, and governance in AI development together with Carrick Flynn and Allan Dafoe, an FHI Research Associate and political scientist at Yale University.
Stuart Armstrong is working with Laurent Orseau at DeepMind on a sequel to Safely Interruptible Agents, which will cover situations where the agent cannot fully explore the environment due to interruptions and where we may wish to penalise a misbehaving agent. Stuart has also been delving into other problems, for example around cooperative inverse reinforcement learning and safe oracles.
Anders Sandberg submitted for publication a paper on the aestivation hypothesis. He has developed a preliminary protocol for managing information hazards in academic settings together with FHI intern Fiona Furnari. Anders drafted a tech report on speed limits of large-scale space settlement. He also spoke at a number of international workshops and conferences.
Toby Ord, Anders Sandberg and Eric Drexler have been finalising a paper on “Dissolving the Fermi Paradox”, and Toby is also preparing a report on the lessons from the making of the atomic bomb for publication in the Bulletin of the Atomic Scientists.
Eric Drexler has been exploring and preparing internal documents on strategies for safe development of high-level AI, and on potential applications of structured transparency to problems of existential risk and strategic stability. He helped to organize and lead an international workshop on atomically precise manufacturing hosted by the Centre for the Study of Existential Risk at Cambridge.
Owen Cotton-Barratt’s paper with Marc Lipsitch and Nick Evans, ‘Underprotection of Unpredictable Statistical Lives Compared to Predictable Ones’, was published in Risk Analysis. He helped run a collaborative workshop organized by the Centre for Effective Altruism and FHI to discuss existential risk policy with Finnish government agencies. He also acted as an instructor for EuroSPARC and ran several workshops at EA Global.
Owain Evans has been working with FHI interns David Krueger, David Abel and John Salvatier on two papers at the intersection of machine learning and AI safety. One of these, in collaboration with Jan Leike, studies a variation on reinforcement learning where the reward signal is not always available but must be requested at a cost. Owain also gave a talk at an event for VCs and AI companies alongside Demis Hassabis.
Jan Leike, alongside Owain Evans, is collaborating with researchers at Stanford University, DeepMind, OpenAI and Google Brain to address a reward signal problem within semi-supervised reinforcement learning. He is also working with Laurent Orseau at DeepMind on other projects.
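To give a flavour of the costly-reward setting mentioned above, here is a toy sketch of our own (not the paper's algorithm, and all names and parameters are illustrative): a two-armed bandit agent that must pay a fixed cost each time it wants to observe its reward, so it queries less often as time goes on and otherwise acts on its running estimates.

```python
import random

def costly_reward_bandit(true_means, steps=10000, query_cost=0.1, seed=0):
    """Toy illustration: rewards exist every step, but the agent only
    *sees* a reward when it pays query_cost to observe it."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)       # queried observations per arm
    estimates = [0.0] * len(true_means)  # empirical mean reward per arm
    total_return = 0.0
    for t in range(1, steps + 1):
        # epsilon-greedy arm choice based on current (partial) estimates
        if rng.random() < 0.1:
            arm = rng.randrange(len(true_means))
        else:
            arm = max(range(len(true_means)), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        total_return += reward
        # query probability decays: pay for fewer observations over time
        if rng.random() < 1.0 / (1.0 + 0.01 * t):
            total_return -= query_cost
            counts[arm] += 1
            estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, total_return

estimates, ret = costly_reward_bandit([0.3, 0.7])
```

The decaying query schedule here is an arbitrary choice for illustration; the research question is precisely how an agent should decide when the information is worth the cost.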
Piers Millett is developing a short review that links together issues around biological weapons, advances in biotechnology and existential risk. He is collaborating with colleagues at CSER on their Bioengineering Horizon Scanning exercise, and is refining his research agenda on pandemics, deliberate disease and the implications of biotechnology.
Miles Brundage is investigating the utility of formal modelling and simulation of different AI policy interventions and preparing a scenario planning workshop in which AI stakeholders will consider the robustness of different policies against social and technical uncertainties. He is working with Nick Bostrom on issues related to desiderata for governance of AI development and robustly beneficial early commitment mechanisms for AI developers.
In addition to his work with Nick Bostrom and Allan Dafoe, Carrick Flynn has been working on a project within space law, and has been contributing to the formation of the Global Politics of Artificial Intelligence Research Group with Allan Dafoe at Yale University.
FHI hosted researchers from DeepMind for the second of a program of monthly seminar meetings on technical AI safety. Laurent Orseau gave an overview presentation on his work on agent models, and Owain Evans led a discussion on Cooperative Inverse Reinforcement Learning, a recent AI Safety paper from Stuart Russell’s group.
Miles Brundage, Anders Sandberg and Andrew Snyder-Beattie attended the Symposium on Ethics of Autonomous Systems (SEAS), an IEEE event that began the process of defining standards for AI ethics. They also attended the 22nd European Conference on Artificial Intelligence.
Finally, FHI has hired Prof. William MacAskill, who will also continue in his role as CEO and trustee of the Centre for Effective Altruism. William will be working with Hilary Greaves to set up the Oxford Institute for Effective Altruism, a proposed new research centre within Oxford University. If successful, this research centre could become independent from both FHI and CEA, but would collaborate with both organisations, particularly with the new ‘research fundamentals’ team within CEA. Michelle Hutchinson at CEA is managing operational preparations for the Institute, with help from Jonathan Courtney, while Hilary Greaves at FHI is leading its research aspects. They have recruited Peter Singer and Derek Parfit as advisors. The hope is that the Institute will formally launch in mid-2017.
During Q3, FHI received a generous offer from Luke Ding to fund Prof. Hilary Greaves for four years from mid-2017, in the event that the proposed Oxford Institute for Effective Altruism is unable to raise academic funds for her position. Luke Ding also kindly offered to fund William MacAskill’s full salary for five years. The Oxford Institute for Effective Altruism team are currently preparing several academic grant applications, so we are not currently seeking additional donor funding for that project.
We are also in discussions with two other potential major donors regarding large donations, which have not yet been finalised. We hope to be able to announce these in the coming months.