6. Risks From AI

Risks from artificial intelligence (AI) #

Transformative artificial intelligence may well be developed this century. If it is, it may begin to make many significant decisions for us, and rapidly accelerate changes like economic growth. Are we set up to deal with this new technology safely?

You will also learn about strategies to prevent an AI-related catastrophe and the possibility of so-called “s-risks”.

This work is licensed under a Creative Commons Attribution 4.0 International License.

The case for taking AI seriously as a threat to humanity #

This is a linkpost for https://www.vox.com/future-perfect/2018/12/21/18126576/ai-artificial-intelligence-machine-learning-safety-alignment

[Our World in Data] AI timelines: What do experts in artificial intelligence expect for the future? (Roser, 2023) #

This is a linkpost for https://ourworldindata.org/ai-timelines

Linkposting, tagging and excerpting - in this case, excerpting the article's conclusion - in accord with 'Should pretty much all content that's EA-relevant and/or created by EAs be (link)posted to the Forum?'.

[Figure: When do experts expect artificial general intelligence?]

The visualization shows the forecasts of 1128 people – 812 individual AI experts, the aggregated estimates of 315 forecasters from the Metaculus platform, and the findings of the detailed study by Ajeya Cotra.

There are two big takeaways from these forecasts on AI timelines:

  1. There is no consensus, and the uncertainty is high. There is huge disagreement between experts about when human-level AI will be developed. Some believe that it is decades away, while others think it is probable that such systems will be developed within the next few years or months.
    There is not just disagreement between experts; individual experts also emphasize the large uncertainty around their own individual estimate. As always when uncertainty is high, it is important to stress that it cuts both ways. It might be a very long time until we see human-level AI, but it also means that we might have little time to prepare.

  2. At the same time, there is large agreement in the overall picture. The timelines of many experts are shorter than a century, and many have timelines that are substantially shorter than that. The majority of those who study this question believe that there is a 50% chance that transformative AI systems will be developed within the next 50 years. In this case it would plausibly be the biggest transformation in the lifetime of our children, or even in our own lifetime.

The public discourse and the decision-making at major institutions have not caught up with these prospects. In discussions on the future of our world – from the future of our climate, to the future of our economies, to the future of our political institutions – the prospect of transformative AI is rarely central to the conversation. Often it is not mentioned at all, not even in a footnote.

We seem to be in a situation where most people hardly think about the future of artificial intelligence, while the few who dedicate their attention to it find it plausible that one of the biggest transformations in humanity’s history is likely to happen within our lifetimes.

The longtermist AI governance landscape: a basic overview #

Aim: to give a basic overview of what is going on in longtermist AI governance.

Audience: people who have limited familiarity with longtermist AI governance and want to understand it better. I don’t expect this to be helpful for those who already have familiarity with the field. ETA: Some people who were already quite familiar with the field have found this helpful.

This post outlines the different kinds of work happening in longtermist AI governance. For each kind of work, I’ll explain it, give examples, sketch some stories for how it could have a positive impact, and list the actors I’m aware of who are currently working on it.1

Firstly, some definitions:

  • AI governance means bringing about local and global norms, policies, laws, processes, politics, and institutions (not just governments) that will affect social outcomes from the development and deployment of AI systems.2

  • Longtermist AI governance, in particular, is the subset of this work that is motivated by a concern for the very long-term impacts of AI. This overlaps significantly with work aiming to govern transformative AI (TAI).

It’s worth noting that the field of longtermist AI governance is very small. I’d guess that there are around 60 people working in AI governance who are motivated by a concern for very long-term impacts.

Short summary #

At a high level, I find it helpful to think of a spectrum between foundational and applied work. On the foundational end, there's strategy research, which aims to identify good high-level goals for longtermist AI governance; then there's tactics research, which aims to identify plans that will help achieve those high-level goals. Moving towards the applied end, there's policy development work that takes this research and translates it into concrete policies; advocacy work that pushes for those policies to be implemented; and finally the actual implementation of those policies (by e.g. civil servants).

There’s also field-building work (which doesn’t clearly fit on the spectrum). Rather than contributing directly to the problem, this work aims to build a field of people who are doing valuable work on it.

Of course, this classification is a simplification and not all work will fit neatly into a single category.

You might think that insights mostly flow from the more foundational to the more applied end of the spectrum, but it’s also important that research is sensitive to policy concerns, e.g. considering how likely your research is to inform a policy proposal that is politically feasible.

We’ll now go through each of these kinds of work in more detail.

Research #

Strategy research #

Longtermist AI strategy research ultimately aims to identify high-level goals we could pursue that, if achieved, would clearly increase the odds of eventual good outcomes from advanced AI, from a longtermist perspective (following Muehlhauser, I'll sometimes refer to this aim as 'getting strategic clarity').

This research can itself vary on a spectrum between targeted and exploratory as follows:

  • Targeted strategy research answers questions which shed light on some other specific, important, known question
    • e.g. “I want to find out how much compute the human brain uses, because this will help me answer the question of when TAI will be developed (which affects what high-level goals we should pursue)”
  • Exploratory strategy research answers questions without a very precise sense of what other important questions they’ll help us answer
    • e.g. “I want to find out what China’s industrial policy is like, because this will probably help me answer a bunch of important strategic questions, although I don’t know precisely which ones”

Examples #

  • Work on TAI forecasting, e.g. biological anchors and scaling laws for neural language models (a toy sketch of the compute-anchor style of reasoning follows this list).
    • Example of strategic relevance: if TAI is soon, then slowly growing a large field of experts seems less promising; if TAI is very far, then longtermist AI governance should probably be relatively deprioritised.
  • Work on clarifying the sources of AI x-risk, e.g. writing by Christiano, Critch, Carlsmith, Ngo and Garfinkel.
    • Example of strategic relevance: if most x-risk from AI comes from advanced misaligned AI agents, then governance should focus on influencing the first actors to build them.
  • Work on investigating the speed of AI progress around TAI, e.g. investigation and analysis by AI Impacts.
    • Example of strategic relevance: if AI progress occurs discontinuously, then there are likely to be only a small number of high-stakes actors, and most of the value of governance will come from influencing those actors.
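
To give a more concrete feel for the forecasting work referenced above, here is a toy sketch of the compute-anchor style of reasoning: assume a training-compute requirement for TAI, assume a growth rate for the largest affordable training runs, and read off a crossover year. All numbers below are illustrative placeholders, not figures from the biological anchors report.

```python
# Toy compute-anchor timeline calculation.
# All numbers are illustrative placeholders, NOT estimates from any cited report.
import math

required_training_flop = 1e30   # assumed compute needed to train TAI
largest_run_flop_today = 1e24   # assumed compute of today's largest training run
annual_growth_factor = 2.0      # assumed yearly growth in affordable training compute
current_year = 2024

# Solve largest_run_flop_today * growth**t >= required_training_flop for t.
years_needed = math.log(required_training_flop / largest_run_flop_today,
                        annual_growth_factor)
print(f"Crossover around {current_year + math.ceil(years_needed)}")
# With these placeholders (a 10^6 gap, doubling yearly), that's roughly 20 years out.
```

The actual reports replace each of these point estimates with probability distributions and report a distribution over arrival years, but the shape of the calculation is broadly similar.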

It’s easy to confuse strategy research (and especially exploratory strategy research) with broadly scoped research. As many of the above examples show, strategy research can be narrowly scoped - that is, it can answer a fairly narrow question. Examples of broadly vs. narrowly scoped questions:

  • On scaling laws:
    • Broad question: in general, how does the performance of deep learning models change as you increase the size of those models?
    • Narrower question: how does the performance of large language models specifically (e.g. GPT-3) change as you increase the size of those models? (The question tackled in this paper; a toy illustration of fitting such a curve appears below.)
  • On sources of AI x-risk:
    • Broad question: how much x-risk is posed by advanced AI in general?
    • Narrower question: how much x-risk is posed by influence-seeking AI agents specifically? (The question tackled in this report.)

Indeed, I think it’s often better to pick narrowly scoped questions, especially for junior researchers, because they tend to be more tractable.
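
To make the narrower scaling-laws question above more concrete, here is a minimal sketch of what fitting a scaling law amounts to in practice: fit a power law of the form L(N) = a * N^(-b) to (model size, loss) pairs and extrapolate. The data points are invented for illustration and are not taken from the cited paper.

```python
# Minimal illustration of fitting a power-law scaling curve L(N) = a * N**(-b).
# The (parameter count, loss) pairs are invented for illustration only.
import numpy as np

params = np.array([1e7, 1e8, 1e9, 1e10])   # model sizes N (hypothetical)
losses = np.array([4.2, 3.5, 2.9, 2.4])    # validation losses (hypothetical)

# A power law is linear in log-log space: log L = log a - b * log N.
slope, intercept = np.polyfit(np.log(params), np.log(losses), 1)
a, b = np.exp(intercept), -slope

predicted = a * (1e12) ** (-b)              # naive extrapolation to a much larger model
print(f"fit: L(N) ~ {a:.2f} * N^(-{b:.3f}); extrapolated L(1e12) ~ {predicted:.2f}")
```

The broad version of the question asks whether relationships of this shape hold across architectures and settings; the narrow version asks for the exponents of one specific model family.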

Luke Muehlhauser has some recommendations for those who want to try this kind of work: see point 4 in this post. And see this post for some examples of open research questions.3

Stories for impact #

  • Direct impact : there are many possible goals in AI governance, and we need to prioritise the most important ones. This work is often motivated by researchers’ impressions that there is very little clarity about topics which affect what goals we should pursue. For example, see the results of these surveys which show wide disagreement about AI x-risk scenarios and the total amount of AI x-risk, respectively.
  • Indirect impact:
    • Field-building: having a clear understanding of what we’re working to achieve and why it matters would help attract more people to the field.
    • Communicating the need for policy change: if you want to convince people to do costly or dramatic things in the future, you’d better have clear things to say about what we’re working to achieve and why it matters.

Who’s doing it? #

Some people at the following orgs: FHI, GovAI, CSER, DeepMind, OpenAI, GCRI, CLR, Rethink Priorities, OpenPhil, CSET,4 plus some independent academics.

Tactics research #

Longtermist AI tactics research ultimately aims to identify plans that will help achieve high-level goals (that strategy research has identified as a priority). It tends to be more narrowly scoped by nature.

It’s worth noting that there can be reasons to do tactics research even if you haven’t clearly identified some goal as a priority: for your own learning, career capital, and helping to build an academic field.

Examples #

  • The Windfall Clause
    • Plan: develop a tool for distributing the benefits of AI for the common good
    • High-level goals which this plan is pursuing: reducing incentives for actors to race against each other to be the first to develop advanced AI; reducing economic inequality.
  • Mechanisms for Supporting Verifiable Claims
    • Plan: develop practices by which AI developers could make their own claims about AI development more verifiable (that is, claims to which developers can be held accountable)
    • High-level goals which this plan is pursuing: developing mechanisms for demonstrating responsible behaviour of AI systems; enabling more effective oversight; reducing pressure to cut corners for the sake of gaining a competitive edge.
  • AI & Antitrust
    • Plan: proposing ways to mitigate tensions between competition law and the need for cooperative AI development
    • High-level goal which this plan is pursuing: increasing cooperation between companies developing advanced AI.

Stories for impact #

  • Direct impact : creating solutions that get used to help make better decisions (in policy and future research).
    • This is what Allan Dafoe calls the ‘Product model of research’.
  • Indirect impact : even if not all solutions get used to help make better decisions, they will help grow the field of people who care about longtermist AI governance issues, and improve insight, expertise, connections and credibility of researchers.
    • This is what Allan Dafoe calls the ‘Field-building model of research’.

Who’s doing it? #

Some people at the following orgs: FHI, GovAI, CSER, DeepMind, OpenAI, GCRI, CSET, Rethink Priorities, LPP, plus some independent academics.

Policy development, advocacy and implementation #

Strategy research outputs high-level goals. Tactics research takes those goals and outputs plans for achieving them. Policy development work takes those plans and translates them into policy recommendations that are ready to be delivered to policymakers. This requires figuring out (e.g.) which precise ask to make, what language to use (both in the formal policy and in the ask), and other context-specific features that will affect the probability of successful implementation.

Policy advocacy work advocates for policies to be implemented, e.g. figuring out who is the best person to make the policy ask, to whom, and at what time.

Policy implementation is the work of actually implementing policies in practice, by civil servants or corporations.

It’s worth distinguishing government policy (i.e. policy intended to be enacted by governments or intergovernmental organisations) from corporate policy (i.e. policy intended to be adopted by corporations). Some people working on longtermist AI governance focus on improving corporate policy (especially the policies of AI developers), while others in the field focus on improving the policies of relevant governments.

A common motivation for all policy work is that implementation details are often thought to be critical for successful policymaking. For example, if a government regulation has a subtle loophole, that can make the regulation useless.

Compared with research, this kind of work tends to involve relatively less individual thinking, and relatively more conversation/information collection (e.g. having meetings to learn who has authority over a policy, what they care about, and what other players want in a policy) as well as coordination (e.g. figuring out how you can get a group of actors to endorse a policy, and then making that happen).

As mentioned earlier, policy insight sometimes flows ‘backwards’. For example, policy development might be done iteratively based on how advocacy changes your knowledge (and the policy landscape).

Examples #

  • Government policy:
    • Committing to not incorporate AI technology into nuclear command, control and communications (NC3), e.g. as advocated for by CLTR in their Future Proof report.
    • Government monitoring of AI development, e.g. as developed in this whitepaper on AI monitoring.
    • Making nascent regulation or AI strategies/principles sensitive to risks from advanced AI systems (as well as current ones), e.g. feedback by various EA orgs about the EU AI Act.
  • Corporate policy:
    • Developing norms for the responsible dissemination of AI research, given its potential for misuse, e.g. these recommendations by PAI.

These ideas vary on a spectrum from more targeted (e.g. not integrating AI into NC3) to more general (in the sense of creating general-purpose capacity to deal with a broad class of problems that will likely arise, e.g. most of the others above). I think our policy development, advocacy and implementation today should mostly focus on more general ideas, given our uncertainties about how AI will play out (whilst also pushing for obviously good specific ideas, when they arise).

Stories for impact #

  • Direct impact: having good policies in place increases our chances of successfully navigating the transition to a world with advanced AI.

  • Indirect impact: even if you can’t be sure that some policy idea is robustly good, developing/advocating/implementing it will help build insight, expertise, connections and credibility of longtermist AI governance people. We don’t want to get to an AI “crunch time”,5 and only then start learning about how to develop policy and decision-making.

    • That said, we should be very careful with implementing policies that could end up being harmful, e.g. by constraining future policy development.

Who’s doing it? #

Field-building #

This is work that explicitly aims to grow the field or community of people who are doing valuable work in longtermist AI governance.6 One could think of this work as involving both (1) bringing in new people, and (2) making the field more effective.

Examples #

  1. Bringing in new people by creating:

    • policy fellowships, such as the OpenPhil Technology Policy Fellowship;
    • online programs or courses to help junior people get synced up on what is happening in AI governance;
    • high quality, broadly appealing intro material that reaches many undergraduates;
    • more scalable research fellowships to connect, support and credential interested junior people.
  2. Making the field more effective by creating:

    • research agendas;
    • ways for senior researchers to easily hire research assistants.7

Stories for impact #

  • Growth model : building a longtermist AI governance field with lots of aligned people with the capacity and relevant expertise to do important research and policy work (perhaps especially when this work is less bottlenecked by lack of strategic clarity).
  • Metropolis model:8 building a longtermist AI governance field with dense connections to broader communities (e.g. policymaking, social science, machine learning), such that the field can draw on diverse expertise from these communities.

Who’s doing it? #

GovAI, OpenPhil, SERI, CERI, CHERI and EA Cambridge. More broadly, all cause-general EA movement building contributes as well. This is the least explored kind of work discussed in this post.

Other views of the longtermist AI governance landscape #

I’ve presented just one possible view of the longtermist AI governance landscape - there are obviously others, which may be more helpful for other purposes. For example, you could carve up the landscape based on different kinds of interventions, such as:

  • Shifting existing discussions in the policy space to make them more sensitive to AI x-risk (e.g. building awareness of the difficulty of assuring cutting-edge AI systems)
  • Proposing novel policy tools (e.g. international AI standards)
  • Getting governments to fund AI safety research
  • Shifting corporate behaviour (e.g. the windfall clause)

Or, you could carve things up by geographic hub (though not all organisations are part of a geographic hub):

  • Bay Area: OpenPhil, OpenAI, PAI, various AI alignment orgs. On average more focused on misalignment as the source of AI x-risk; culturally closer to Silicon Valley and rationality cultures.
  • DC: US govt, CSET. Focus on US policy development/advocacy/implementation; culturally closer to DC culture.
  • UK: FHI/GovAI, DeepMind, UK govt, CSER, CLTR, (others?). On average more concern over a wider range of sources of AI x-risk.
  • EU. In 2020, the European Commission drafted the world’s first AI regulation, which will likely be passed in the next few years and could lead to a Brussels effect.
  • China.

Or, you could carve up the landscape based on different “theories of victory”, i.e. complete stories about how humanity successfully navigates the transition to a world with advanced AI. There’s a lot more that could be said about all of this; the aim of this post has just been to give a concise overview of the kinds of work that are currently happening.

Acknowledgements: this is my own synthesis of the landscape, but it is inspired by and/or draws directly from EA Forum posts by Allan Dafoe, Luke Muehlhauser and Convergence Analysis. Thanks also to Jess Whittlestone for helpful conversation, plus Matthijs Maas, Yun Gu, Konstantin Pilz, Caroline Baumöhl and especially a reviewer from SERI for feedback on a draft.

This work is licensed under a Creative Commons Attribution 4.0 International License.


Preventing an AI-related catastrophe - Problem profile #

This is a linkpost for https://80000hours.org/problem-profiles/artificial-intelligence

We (80,000 Hours) have just released our longest and most in-depth problem profile — on reducing existential risks from AI.

You can read the profile here.

The rest of this post gives some background on the profile, a summary and the table of contents.

Some background #

Like much of our content, this profile is aimed at an audience that has probably spent some time on the 80,000 Hours website, but is otherwise unfamiliar with EA – so it’s pretty introductory. That said, we hope the profile will also be useful and clarifying for members of the EA community.

The profile primarily represents my (Benjamin Hilton’s) views, though it was edited by Arden Koehler (our website director) and reviewed by Howie Lempel (our CEO), who both broadly agree with the takeaways.

I’ve tried to do a few things with this profile to make it as useful as possible for people new to the issue:

Also, there’s a feedback form if you want to give feedback and prefer that to posting publicly.

This post includes the summary from the article and a table of contents.

Summary #

We expect that there will be substantial progress in AI in the next few decades, potentially even to the point where machines come to outperform humans in many, if not all, tasks. This could have enormous benefits, helping to solve currently intractable global problems, but could also pose severe risks. These risks could arise accidentally (for example, if we don’t find technical solutions to concerns about the safety of AI systems), or deliberately (for example, if AI systems worsen geopolitical conflict). We think more work needs to be done to reduce these risks.

Some of these risks from advanced AI could be existential — meaning they could cause human extinction, or an equally permanent and severe disempowerment of humanity. 9 There have not yet been any satisfying answers to concerns — discussed below — about how this rapidly approaching, transformative technology can be safely developed and integrated into our society. Finding answers to these concerns is very neglected, and may well be tractable. We estimate that there are around 300 people worldwide working directly on this.10 As a result, the possibility of AI-related catastrophe may be the world’s most pressing problem — and the best thing to work on for those who are well-placed to contribute.

Promising options for working on this problem include technical research on how to create safe AI systems, strategy research into the particular risks AI might pose, and policy research into ways in which companies and governments could mitigate these risks. If worthwhile policies are developed, we’ll need people to put them in place and implement them. There are also many opportunities to have a big impact in a variety of complementary roles, such as operations management, journalism, earning to give, and more — some of which we list below.

Our overall view #

Recommended - highest priority

This is among the most pressing problems to work on.

Scale #

AI will have a variety of impacts and has the potential to do a huge amount of good. But we’re particularly concerned with the possibility of extremely bad outcomes, especially an existential catastrophe. We’re very uncertain, but based on estimates from others using a variety of methods, our overall guess is that the risk of an existential catastrophe caused by artificial intelligence within the next 100 years is around 10%. This figure could significantly change with more research — some experts think it’s as low as 0.5% or much higher than 50%, and we’re open to either being right. Overall, our current take is that AI development poses a bigger threat to humanity’s long-term flourishing than any other issue we know of.

Neglectedness #

Around $50 million was spent on reducing the worst risks from AI in 2020 – billions were spent advancing AI capabilities.11 12 While we are seeing increasing concern from AI experts, there are still only around 300 people working directly on reducing the chances of an AI-related existential catastrophe.10 Of these, it seems like about two-thirds are working on technical AI safety research, with the rest split between strategy (and policy) research and advocacy.

Solvability #

Making progress on preventing an AI-related catastrophe seems hard, but there are a lot of avenues for more research and the field is very young. So we think it’s moderately tractable, though we’re highly uncertain — again, assessments of the tractability of making AI safe vary enormously.

Full table of contents #

Acknowledgements #

Huge thanks to Joel Becker, Tamay Besiroglu, Jungwon Byun, Joseph Carlsmith, Jesse Clifton, Emery Cooper, Ajeya Cotra, Andrew Critch, Anthony DiGiovanni, Noemi Dreksler, Ben Edelman, Lukas Finnveden, Emily Frizell, Ben Garfinkel, Katja Grace, Lewis Hammond, Jacob Hilton, Samuel Hilton, Michelle Hutchinson, Caroline Jeanmaire, Kuhan Jeyapragasan, Arden Koehler, Daniel Kokotajlo, Victoria Krakovna, Alex Lawsen, Howie Lempel, Eli Lifland, Katy Moore, Luke Muehlhauser, Neel Nanda, Linh Chi Nguyen, Luisa Rodriguez, Caspar Oesterheld, Ethan Perez, Charlie Rogers-Smith, Jack Ryan, Rohin Shah, Buck Shlegeris, Marlene Staib, Andreas Stuhlmüller, Luke Stebbing, Nate Thomas, Benjamin Todd, Stefan Torges, Michael Townsend, Chris van Merwijk, Hjalmar Wijk, and Mark Xu for either reviewing the article or their extremely thoughtful and helpful comments and conversations. (This isn’t to say that they would all agree with everything I said – in fact we’ve had many spirited disagreements in the comments on the article!)

This work is licensed under a Creative Commons Attribution 4.0 International License.

Why s-risks are the worst existential risks, and how to prevent them #

This is a linkpost for https://www.youtube.com/watch?v=jiZxEJcFExc&list=PLwp9xeoX5p8Pi7rm-vJnaJ4AQdkYJOfYL&index=14


Effective altruists focussed on shaping the far future face a choice between different types of interventions. Of these, efforts to reduce the risk of human extinction have received the most attention so far. In this talk, Max Daniel makes the case that we may want to complement such work with interventions aimed at preventing very undesirable futures (“s-risks”), and that this provides a reason for, among the sources of existential risk identified so far, focussing on AI risk.

Transcript: Why S-risks are the worst existential risks, and how to prevent them #

I’m going to talk about risks of large scale severe suffering in the far future or S-Risks. And to illustrate what S-Risks are about, I’d like to start with a fictional story from the British TV series Black Mirror, which some of you may have seen. So, in this fictional scenario, it’s possible to upload human minds into virtual environments. In this way, sentient beings can effectively be stored and run on very small computing devices, such as the wide-act shaped gadget you can see on the screen here. Behind the computing device, you can see Matt.

Matt’s job is to sell those virtual humans as virtual assistants. And because this isn’t a job description that’s particularly appealing to everyone, part of Matt’s job is to convince these human uploads to actually comply with the request of their human owners. And so in this instance, human upload Greta, which you can see here, is unwilling to do this. She’s not thrilled with the prospect of serving for the rest of her life as a virtual assistant. In order to break her will, in order to make her comply, Matt increases the rate at which time passes for greater. So, while Matt only needs to wait for a few seconds, during that time, Greta effectively endures many months of solitary confinement.

So, I hope you agree that this would be an undesirable scenario. Now, fortunately, of course, this particular scenario is quite unlikely to be realized. For any particular scenario we can imagine unfolding, it's pretty unlikely that it will be realized in precisely this form. So that's not the point here. However, I'll argue that we do in fact face risks of a broad range of scenarios that are in some ways like this one, or even worse.

And I’ll call these risks S-risks. I’ll first explain what these S-risks are, contrasting them with the more familiar existential risks or X-risks. And then in a second part of the talk, I’ll talk a bit about why as effective altruists we may want to prevent those S-risks and how we could do this. So, the way I’d like to introduce S-risks will be a subclass of the more familiar existential risks.

As you may recall, these have been defined by Nick Bostrom as risks where an adverse outcome would either completely annihilate Earth-originating intelligent life or at least permanently and drastically curtail its potential. Bostrom also suggested in one of his major publications on existential risk that one way to understand how these risks differ from other kinds of risks is to look at how bad this adverse outcome would be along two dimensions. And these dimensions are the scope and the severity of the adverse outcome that we’re worried about. I’ve reproduced one of Bostrom’s central figures here.

You can see a risk's scope on the vertical axis. That is, here we ask: how many individuals would be negatively affected if the risk were realized? Is it just a small number of individuals? Is it everyone in a particular region, or even everyone alive on Earth? Or, in the worst case, everyone alive on Earth plus some or even all future generations? The second relevant dimension is the severity. Here we ask: for each individual that would be affected, how bad would the outcome be?

For instance, consider the risk of a single fatal car crash. If this happened, it would be pretty bad. You could die, so it would have pretty high severity, but it would only have personal scope, because in any single car crash only a small number of individuals are affected. However, there are other risks with even worse severity. For instance, consider factory farming. We commonly believe that, for instance, the lives of chickens in battery cages are so bad that it would be better not to bring these chickens into existence in the first place. That's why we believe it's a good thing that most of the food at this conference is vegan. Another way to look at this is that, I guess, some of you would think that the prospect of being tortured for the rest of your life would probably be even worse than a fatal car crash. So, there can be risks with even worse severity than terminal risks such as a fatal car crash. S-risks, then, are risks which, with respect to their severity, are about as bad as factory farming, in that they are about outcomes that would be even worse than non-existence, but which would also have much greater scope than a car crash or even factory farming, in that they could potentially affect a very large number of beings, for the entire far future, across the whole universe.

So, this explains why in the title of the talk I claimed that S-risks are the worst existential risks. I said this because I just defined them to be risks of outcomes which have the worst possible severity and the worst possible scope. So, one way to understand this and how they differ from other kinds of existential risk is to zoom in on the top right corner of the figure I showed before. This is the corner that shows existential risks.

These are risks that would at least affect everyone alive on Earth plus all future generations. This is why Bostrom called them risks of pan-generational scope, and risks which would be at least what Bostrom called crushing, which we can roughly understand as removing everything that would be valuable for those individuals. One central example of such existential risks is risks of extinction. These are risks that have received a lot of attention in the EA community already. They have pan-generational scope because they would affect everyone alive and would also remove the rest of the future, and they would be crushing because they would remove everything valuable. But S-risks are another type of risk that is also conceptually included in this concept of existential risk. They are risks that would be even worse than extinction, because they contain a lot of things we disvalue, such as intense involuntary suffering, and risks that would have even larger scope, because they would affect a significant part of the universe.

So, you could think of the Black Mirror story from the beginning and imagine that Greta endures her solitary confinement for the rest of her life and that it’s not only one upload but a large population of such uploads across the universe. Or you could think of something like factory farming with a much, much larger scope for some reason realized in many ways across the whole galaxy. So, I have now explained what conceptually S-risks are. They are risks of severe involuntary suffering on a cosmic scale thereby exceeding the total amount of suffering we’ve seen on earth so far. And this makes them a subclass of existential risk but a distinct subclass from the more well-known extinction risks. So far, I’ve just defined some conceptual term. I’ve called attention to some kind of possibility.

But what may be more relevant, as effective altruists, is whether reducing S-risks is something we can do and, if so, whether it's something we should do. And let's make sure that we understand this question correctly. All plausible ethical views agree that intense involuntary suffering is a bad thing. So, I hope you can all agree that reducing S-risks is a good thing. But of course you're here because you're interested in effective altruism. That is, you don't just want to know whether there is something good you can do; we're interested in identifying the most good we can do. We realize that doing good has opportunity costs, and we really want to make sure to focus our time and our money where we can have the most impact. So, the question here really is, and the question I'd like to discuss is: can reducing S-risks meet this higher bar? For at least some of us, could it be the best use of your time or your money to invest it into reducing S-risks, as opposed to doing other things that could be good in some sense? This of course is a very challenging question, and I won't be able to conclusively and comprehensively answer it today. To illustrate just how complex this question is, and also to make really clear what kind of argument I'm not making here, I'll first introduce a flawed argument for focusing on reducing S-risks, an argument that doesn't work. The argument roughly goes as follows.

First premise: the best thing to do is to prevent the worst risks. Second premise: well, S-risks by definition are the worst risks. So, conclusion: you may think the best thing to do is to prevent S-risks. Now, with respect to premise one, let's get one potential source of misunderstanding out of the way. One way to understand this first premise is that it could be a rock-bottom feature of your ethical worldview.

So, you could think that, whatever you expect to happen in the future, you have some specific additional ethical reasons to focus on preventing the worst-case outcomes: some kind of maximin principle, or perhaps prioritarianism applied to the far future. This, however, is not the sense I'm going to talk about today. If your ethical view contains some such principles, I think they give you additional reasons for focusing on S-risks, but this is not what I'm going to talk about. What I'm going to talk about is that there are more criteria that are relevant for identifying the ethically optimal action than the two dimensions of risks we have looked at so far.

Right, so far we’ve only looked at if a risk was realized, how bad would the outcome be in terms of its severity and its scope. And in this sense, S-risks are the worst risks, but in this sense I think premise one is not clearly true. Because when deciding what’s the best thing to do, there are more criteria that are relevant and many of you will be well familiar with those criteria because they are rightly getting a lot of attention in the EA community. So, in order to see if reducing S-risks is the best thing we can do, we really need to look at how likely would it be that these S-risks are realized, how easy is reducing them, in other words, how tractable is it and how neglected is this endeavor. Are there already a lot of people or organizations doing it, how much attention is it already getting? So, these criteria clearly are relevant. Even if you are a prioritarian or whatever and think you have a lot of reasons to focus on the worst outcomes, if for instance their probability was zero or there was absolutely nothing we could do to reduce them, it would make no sense to try to do it. So, we need to say something about the probability, the tractability and the neglectedness of these as risks and I’ll offer some initial thoughts on these in the rest of the talk. What about the probability of these S-risks? What I’ll argue for here is that S-risks are at least not much more unlikely than extinction risks from superintelligent AI, which are a class of risks at least some parts of the community take seriously and think we should do something about them.

And I’ll explain why I think this is true and will address two kinds of objections you may have. So, reasons to think that these as risks in fact are too unlikely to focus on them. The first objection could be that well these kind of S-risks they are just too absurd. We can’t even send humans to Mars yet so why should we worry about suffering on cosmic scales you could think.

And in fact, when I first encountered related ideas, I had a similar immediate intuitive reaction: this is a bit speculative, maybe this is not something I should focus on. But I think we should really be careful to examine such intuitive reactions, because, as many of you I guess are well aware, there is a large body of psychological research in the heuristics and biases approach suggesting that intuitive assessments of probability by humans are often driven by how easy it is for us to recall a prototypical example of the kind of scenario we are considering. And for things that have never happened, for which there is no historical precedent, this leads us to intuitively and systematically underestimate their probability - this is known as the absurdity heuristic. So, I think we shouldn't go with this intuitive reaction, but should rather really examine what we can say about the probability of these S-risks.

And if we look at all of our best scientific theories and what the experts are saying about how the future could unfold, I think we can identify two not-too-implausible technological developments that may plausibly lead to the realization of S-risks. This is not to say that these are the only possibilities - there may be unknown unknowns, things we can't foresee yet, that could also lead to such S-risks - but there are some known pathways that could get us into S-risk territory. These two are artificial sentience and superintelligent AI. Artificial sentience simply refers to the idea that the capacity to have subjective experience, and in particular the capacity to suffer, is in fact not limited in principle to biological animals, but that there could be novel kinds of beings, perhaps computer programs stored on silicon-based hardware, whose suffering we would also have reason to care about. And while this isn't completely settled, in fact few contemporary views in the philosophy of mind would say that artificial sentience is impossible in principle. So, it seems to be a conceptual possibility we should be concerned about. Now, how likely is it that this will ever be realized? This may be less clear, but in fact here as well we can identify one technological pathway that may lead to artificial sentience, and this is the idea of whole brain emulation: basically, understanding the human brain in sufficient detail that we could build a functionally equivalent computer simulation of it. For this technology it's still not completely certain that we will be able to do it, but researchers have looked at this and have outlined a quite detailed roadmap; they've identified concrete milestones and remaining uncertainties, and have concluded that this definitely seems to be something we should take into account when thinking about the future. So, I'd argue there is a not-too-implausible technological possibility that we will get to artificial sentience.

I won’t say as much about the second development, super intelligent AI because this is already getting a lot of attention in the EA community. If you aren’t familiar with worries related to super intelligent AI I recommend Nick Bostrom’s excellent book, Super Intelligence and I’ll just add that super intelligent AI presumably could also unlock many more technological capabilities that we would need to get into S-risks territory. So, for instance the capacity to colonize space and spread out sentient beings into larger parts of the universe. This could conceivably be realized by super intelligent AI and I’d also like to add that some scenarios in which the interaction between super intelligent AI and artificial sentience could lead to S-risks scenarios have been discussed by Bostrom in super intelligent and other places under the term mind crime. So, this is something you could search for if you’re interested in related ideas.

So, in fact, if we look at what we can say about the future, I think it would be a mistake to say that S-risks are so unlikely that we shouldn't care about them. But maybe you now have a different objection. Maybe you're convinced that, in terms of the technological capabilities, we can't be sure that these S-risks are too unlikely, but you may think: okay, vast amounts of suffering seems to be a pretty specific outcome; even if we have much greater technological capabilities, it seems unlikely that such an especially bad outcome will be realized. You could think that, after all, this would require some kind of evil agent, some kind of evil intent, that actively tries to make sure that we get these vast amounts of suffering. And I agree that this seems to be pretty unlikely, but here again, after reflecting on this a little bit, I think we can see that this is only one route - and perhaps the most implausible route - to get into S-risk territory. There also are two other routes, I'd like to argue.

The first is that S-risks could arise by accident. One class of scenarios in which this could happen is the following. Imagine that the first artificially sentient beings we create aren't as highly developed as complete human minds, but are perhaps more similar to non-human animals, in that we may create artificially sentient beings with the capacity to suffer but with a limited capability to communicate with us and to signal that they are suffering. In an extreme case, we may create beings that are sentient, that can suffer, but whose suffering we overlook because there is no possibility of easy communication.

A second scenario where S-risks could be realized without evil intent is the toy example of a paperclip maximizer, which serves to illustrate what could happen if we create a very powerful superintelligent AI that pursues some unrelated goal - a goal that's neither closely aligned with our values nor actively evil. As Nick Bostrom and many people have argued, conceivably such a paperclip maximizer could lead to human extinction, for instance because it would convert the whole Earth and all the matter around here into paperclips, because it just wants to maximize the number of paperclips and has no regard for human survival. But it's only a small further step to worry: well, what if such a paperclip maximizer runs, for instance, sentient simulations, say for scientific purposes, to better understand how to maximize paperclip production? Or maybe, similar to the way that our suffering serves some kind of evolutionary function, a paperclip maximizer would create some kind of artificially sentient sub-programs or workers whose suffering would be instrumentally useful for maximizing the production of paperclips. So, we only need to add very few additional assumptions to see that scenarios which are already getting a lot of attention could lead not only to human extinction but in fact to outcomes that would be even worse.

Finally, to understand the significance of the third route - S-risks could be realized as part of a conflict - note that it's often the case that if we have a large number of agents competing for a shared pie of resources, this can incentivize negative-sum dynamics that lead to very bad outcomes, even if none of the agents involved actively values those bad outcomes; they are just resorting to them in order to out-compete the other agents. For instance, look at most wars: the countries waging them rarely intrinsically value the suffering and the bloodshed implicated in them, but sometimes wars still happen, to further the strategic interests of the countries involved.

So I think, if we critically examine the situation we are in, we should conclude that if we take seriously a lot of the considerations that are already being widely discussed in the community, such as risks from superintelligent AI, there are only a few additional assumptions we need to justify worries about S-risks. It's not like we need to invent some completely made-up technologies, or need to assume extremely implausible or rare motivations such as sadism or hatred, to substantiate worries about S-risks. So this is why I've said that I think S-risks are at least not much more unlikely than, say, extinction risks from superintelligent AI. Now, of course, the probability of S-risks isn't the only criterion we need to address; as I said, we also need to ask how easy it is to reduce those S-risks. And in fact I think this is a pretty challenging task. We haven't found any kind of silver bullet here yet, but I'd also like to argue that reducing S-risks is at least minimally tractable, even today, and one reason for this is that we are arguably already reducing S-risks. As I just said, some scenarios in which S-risks could be realized involve superintelligent AI going wrong in some way.

This is why some work in technical AI safety, as well as AI policy, probably already effectively reduces S-risks. To give you one example: I said that we might be worried about S-risks arising because of the strategic behavior of, say, AI agents as part of a conflict. Some work in AI policy that reduces the likelihood of such multipolar AI scenarios, and makes unipolar AI scenarios with less competition more likely, could in particular have the effect of reducing S-risks. Similarly with some work in technical AI safety. That being said, it seems to me that a lot of the interventions that are currently undertaken reduce S-risks by accident, in a sense: they aren't specifically tailored to reducing S-risks, and there may well be particular, say, sub-problems within technical AI safety that would be particularly effective at reducing S-risks specifically and which aren't getting a lot of attention already. For instance, to give you one toy example that's probably hard to realize in precisely this form but illustrates what might be possible: consider the goal of making sure that, conditional on our efforts to solve the control problem failing and an AI ending up uncontrolled, the AI at least doesn't create, say, additional sentient simulations or artificially sentient sub-programs. If we could solve this problem through work in technical AI safety, we would arguably reduce S-risks specifically.

Of course, there also are broader interventions that don't directly aim to influence the kinds of levers that directly affect the far future, but would have a more indirect effect on reducing S-risks. For instance, we could think that strengthening international cooperation will enable us at some point to prevent AI arms races that could again lead to negative-sum dynamics that could lead us into S-risk territory. Similarly, because artificial sentience is such a significant worry when thinking about S-risks, we could think that expanding the moral circle - making it more likely that human decision-makers in the future will care about artificially sentient beings - would have a positive effect on reducing S-risks. That all being said, I think it's fair to say that we currently don't understand very well how best to reduce S-risks. One thing we could do, if we suspect that there are some low-hanging fruits to reap there, is to go meta and do research about how best to reduce those S-risks, and in fact this is a large part of what we are doing at the Foundational Research Institute. Now, there's also another aspect of tractability I'd like to talk about. This is not the question of how easy it is intrinsically to reduce S-risks, but the question of whether we could raise the required support - for instance, whether we can get sufficient funding to get work on reducing S-risks off the ground. And one worry we may have here is that, well, all this talk about suffering on a cosmic scale and so on will seem too unlikely to many people; in other words, that S-risks are just too weird a concern for us to be able to raise significant support and funding to reduce them.

And I think this is to some extent a legitimate worry, but I also don't think that we should be too pessimistic, and I think the history of the AI safety field substantiates this assessment. If you think back even 10 years, you will find that back then, worries about extinction risk from superintelligent AI were ridiculed, dismissed, or misrepresented and misunderstood as, for instance, being about the Terminator or something like that. Today we have Bill Gates blurbing a book that talks openly and directly about these risks from superintelligent AI, and also about related concepts such as mind crime. And so I would argue that the recent history of the AI safety field provides some reason for hope that we are able to push even seemingly weird cause areas sufficiently far into the mainstream - into the window of acceptable discourse - that we can raise significant support for them.

Last but not least, what about the neglectedness of S-risks? As I said, some work that's already underway in the x-risk area arguably reduces S-risks, so reducing S-risks is not totally neglected, but I think it's fair to say that they get much less attention than, say, extinction risks. In fact, I've sometimes seen people in the community either explicitly or implicitly equate existential risk and extinction risk, which conceptually clearly seems to be untrue. And while some existing interventions may also be effective at reducing S-risks, there are few people that are specifically trying to identify the interventions that are most effective at reducing S-risks specifically. I think the Foundational Research Institute is the only EA organization which, at an organizational level, has the mission of focusing on reducing S-risks.

So, to summarize: I haven't conclusively answered the question of for whom exactly reducing S-risks is the best thing to do. I think this depends both on your ethical view and on some empirical questions, such as the probability, the tractability and the neglectedness of S-risks. But I have argued that S-risks are not much more unlikely than, say, extinction risks from superintelligent AI, and so they warrant at least some attention. And I've argued that the most plausible known paths that could lead us into S-risk territory - aside from unknown unknowns - are AI scenarios that involve the creation of large numbers of artificially sentient beings. This is why I think that, among the currently known sources of existential risk, the AI risk cause area is unique in also being very relevant for reducing S-risks. If we don't get AI right, there seems to be a significant chance that we get into S-risk territory, whereas in other areas - say, an asteroid hitting Earth, or a deadly pandemic wiping out human life - it seems much less likely that this could get us into scenarios that would be much, much worse than extinction because they in addition contain a lot of suffering. In this sense, if you haven't considered S-risks before, I think this is an update towards caring more about the AI risk cause area as opposed to other x-risk cause areas. In this sense, some but not all of the current work in the x-risk area is already effective at reducing S-risks, but there seems to be a gap: few people and little research specifically optimizing for reducing S-risks and trying to find those interventions that are most effective for this particular goal. And I'd argue, in sum, that the Foundational Research Institute, in having this unique focus, occupies an important niche, and I would very much like to see more people join us in that niche - people from, say, other organizations, also doing some kind of research that's hopefully effective at reducing S-risks.

This all being said, I hope to have raised some awareness of the worrying prospect of S-risks. I don't think I have convinced all of you that reducing S-risks is the best use of your resources.

I don’t think I could expect this both because our rock bottom ethical views differ to some extent and also because the empirical questions involved are just extremely complex and it seems very hard to reach agreement on them. So, I think realistically those of us who are interested in shaping the for future who are convinced that this is the most important thing to do among those of us we will be faced with a situation where there are people with different priorities in the community and we need to sort out how to manage this situation. And this is why I’d like to end this talk with a vision for this for future shaping community. So, shaping the for future as a metaphor could be seen as being involved in like a long journey. But what I hope I have made clear is that it’s a misrepresentation to frame this journey as involving a binary choice between extinction or utopia. In another sense however I’d argue that this metaphor was apt. We do face a long journey but it’s a journey through hard to traverse territory and on the horizon there is a continuum ranging from a very bad thunderstorm to the most beautiful summer day. And interest in shaping the for future in a sense determines who is with us in the vehicle but it doesn’t necessarily comprehensively answer the question what more precisely to do with the steering wheel.

Some of us are more worried about not getting into the thunderstorm; others are more motivated by the existential hope of maybe arriving at that beautiful sunshine. And it seems hard to reach agreement on what more precisely to do, and part of the reason is that it's very hard to keep track of the complicated network of roads far ahead and to see which directions of steering will lead to precisely which outcomes. This is very hard; by contrast, we have an easy time seeing who else is with us in the vehicle. And so the concluding thought I'd like to offer is that maybe among the most important things we can do is to make sure to compare our maps among the people in the vehicle, and to find some way of handling the remaining disagreements without inadvertently derailing the vehicle and getting to an outcome that's worse for all. Thank you.

I’ll start off with a question that a few people were asking which is something that you said isn’t necessary to be concerned with S-risks but would help shed a little bit of clarity which is besides AI, besides a whole brain emulation and running like uploaded brains are there any other forms of an S-risk that you can try to visualize? Particularly people were trying to figure out ways in which you can work on the problem if you don’t have a concrete sense of the way in which it might manifest. So, I do in fact think that artificial the most plausible scenarios we can foresee today involve artificial sentience partly because many people have talked about that it artificial sentience would come with novel challenges for instance it would presumably be very easy to spawn a large number of artificially sentient beings. There are a lot of efficiency advantages of silicon based substrates as compared to biological substrates and also we can observe that in many other areas the worst outcomes contain most of the expected value like the predominance of fat tailed institutions for instance with respect to the casualties in war and diseases and so on. So, it seems somewhat plausible to me that if we are concerned about reducing as much suffering as possible in expectation we should focus on these very bad outcomes and that most of these for various reasons involve artificial sentience. That being said I think there are some scenarios especially in future scenarios where we don’t have this archetypical intelligence explosion scenario and the heart takeoff where we are faced with a more messy and complex future where there are maybe a lot of factions controlling AI and using that for various purposes that we could face risks that maybe aren’t as don’t have as a higher scope as the worst scenarios involving artificial sentience but would be maybe more akin to factory farming. Some kind of novel technology that would be misused in some way maybe just because people don’t sufficiently care about the consequences and they pursue some kind of say economic goals and yeah create inadvertently create large amounts of suffering similar to the way we can see this happening today in for example the animal industry. You said that the debate is still kind of out whether or not you could actually have extend your moral concern to something that’s in a silicon substrate that isn’t flesh and bone in the way that we are.

Can you make the case for why we might in fact care about something that is an uploaded brain and isn’t a brain in the way that we generally think of it?

One suggestive thought experiment that has been discussed in the philosophy of mind is to imagine replacing your brain with silicon-based hardware, not all at once but step by step. You start by replacing just one neuron with some kind of chip that serves the same function. It seems intuitively clear that this doesn’t make you less sentient, or someone we should care less about. Now imagine continuing step by step, replacing your brain one neuron at a time until it is, in some sense, a computer. It seems very hard to pinpoint any particular point in this transition where you would say that the situation flips and we should stop caring; after all, the same information processing is still going on in that brain. There is a large body of literature in the philosophy of mind discussing this question.

And assuming that these emulated brains do in fact have the capacity to suffer, what reason would we have to think that it would be advantageous, say for a superintelligence, to emulate lots of brains in a way in which they suffer, rather than just have them exist without any sort of positive or negative feeling?

One reason we may be worried is that if we look at the current successes in the AI field, we see that they are often driven by machine learning techniques: techniques where we don’t program the knowledge and capabilities we think the AI system should have directly into the system, but rather set up some kind of algorithm that can be trained and that learns via trial and error, receiving some information about how well or badly it is doing and thereby increasing its capabilities. Now, I don’t want to claim that the current machine learning techniques that involve giving reward signals to algorithms should concern us to a large extent, or that, say, current reinforcement learning algorithms are suffering. But we may be worried that similar architectures, in which the capabilities of artificially sentient beings arise from being trained on tasks and receiving some kind of reward signal, are a feature of AI systems that will persist even at a point when these systems are sentient to a much larger extent. In some way this is similar to how, as I mentioned, our own suffering serves an evolutionary function that helps us navigate the world; in fact, people who don’t feel pain have a great deal of difficulty for exactly this reason, because they don’t intuitively avoid damaging outcomes.

This is certainly a longer discussion, but I hope you can give a brief answer to this next one.
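The answer above describes the trial-and-error, reward-driven training loop used in reinforcement learning. Below is a minimal, purely illustrative sketch of that loop: a toy tabular Q-learning agent on an invented five-state “corridor” task. The environment, reward values, and hyperparameters are all assumptions chosen only to make the example runnable; nothing here is specific to the systems discussed in the talk.

```python
# Toy illustration: an agent with no built-in knowledge that improves purely by
# acting, receiving a reward signal, and updating its value estimates.
import random

N_STATES = 5          # positions 0..4 in a tiny "corridor"; state 4 is the goal
ACTIONS = [-1, +1]    # step left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

# Q[state][action_index]: the agent's learned estimate of how good each action is.
Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]

def choose_action(state):
    """Mostly exploit current estimates, sometimes explore at random."""
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: Q[state][a])

for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        a = choose_action(state)
        next_state = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else -0.01  # the feedback signal
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        Q[state][a] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][a])
        state = next_state

print("Learned action values per state:",
      [[round(v, 2) for v in row] for row in Q])
```

The only point of the sketch is that the agent’s competence comes entirely from repeatedly receiving a scalar reward and adjusting its behaviour, which is the architectural feature the answer suggests might persist in far more capable systems.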

A couple of people also wanted to ask about your particular suffering focus in the case of s-risks. Some wonder whether an agent might actually prefer to exist even if its experiences are pretty negative; it’s not clear whether it would prefer death or suffering. Is this a choice you would be making on behalf of the agents you’re including in your moral consideration when you’re trying to mitigate an s-risk, and is this a necessary precondition for caring about s-risks?

I think that whatever your rock-bottom ethical views, there nearly always are prudential reasons for considering the preferences of other agents. If I were faced with a situation where I think there is some being whose experiences are so negative that, in a consequentialist sense, it would be better for this being not to exist, but the being has, for whatever reason, a strong preference to exist and argues with me about whether it should continue, then I think there can often be prudential reasons to take these preferences into account. I actually think there will be some convergence between different ethical views on the question of how to take such hypothetical preferences into account.

That being said, I think it’s fairly implausible to claim that no imaginable amount of suffering would be intrinsically worse than non-existence. One intuition pump: if you face the choice between one hour of sleep and one hour of torture, which do you prefer? It seems fairly clear, I would guess to most of us, that one hour of sleep, having no experience at all, is the better choice in that sense.

You said that hopefully we’ll come to some sort of convergence on what the true moral philosophy is, insofar as there is one, but there might also be reason to think that we won’t do this on the timescales of the development of superintelligent AI or of whole brain emulations that we can run on many computers. What do we do in the case where we haven’t solved moral philosophy in time?

That, I think, is a very important question, because it seems fairly likely to me that there won’t be such convergence, at least not in much detail.

More to explore on ‘Risks from Artificial Intelligence’ #

The development of artificial intelligence #

Other resources on aligning artificial intelligence #

Governance for artificial intelligence #

Technical AI alignment work #

Criticisms of worries about AI risk #


  1. I’ve surely forgotten some important groups from this list, and I may have misclassified or otherwise misrepresented some of them - please let me know if that’s the case! ↩︎

  2. This borrows directly from Open Philanthropy’s definition. ↩︎

  3. Note that some of these are tactics research questions rather than strategy research questions. ↩︎

  4. CSET mostly do tactics research, policy development and policy advocacy, but their work on mapping the semiconductor supply chain falls under strategy research. ↩︎

  5. Muehlhauser defines this as “a period lasting 1-20 years when the decisions most impactful on TAI outcomes might be made”. ↩︎

  6. This is distinct from the field-building benefits of other kinds of work discussed in this document, since it is solely and explicitly focused on building the field. ↩︎

  7. Which can also help bring in new people. ↩︎

  8. This idea directly borrows from Allan Dafoe’s forum post. ↩︎

  9. We’re also concerned about the possibility that AI systems could deserve moral consideration for their own sake — for example, because they are sentient. We’re not going to discuss this possibility in this article; we instead cover artificial sentience in a separate article here. ↩︎

  10. I estimated this using the AI Watch database. For each organisation, I estimated the proportion of listed employees working directly on reducing existential risks from AI. There’s a lot of subjective judgement in the estimate (e.g. “does it seem like this research agenda is about AI safety in particular?”), and it could be too low if AI Watch is missing data on some organisations, or too high if the data counts people more than once or includes people who no longer work in the area. My 90% confidence interval would range from around 100 people to around 1,500 people. ↩︎ ↩︎

  11.  ↩︎
  12. It’s difficult to say exactly how much is being spent to advance AI capabilities. This is partly because of a lack of available data, and partly because of questions like:

    • What research in AI is actually advancing the sorts of dangerous capabilities that might be increasing potential existential risk?
    • Do advances in AI hardware or advances in data collection count?
    • How about broader improvements to research processes in general, or things that might increase investment in the future through producing economic growth?

    The most relevant figure we could find was the expenses of DeepMind from 2020, which were around £1 billion, [according to their annual report](https://find-and-update.company-information.service.gov.uk/company/07386350/filing-history). We’d expect most of that to be contributing to “advancing AI capabilities” in some sense, since their main goal is building powerful, general AI systems. (Although it’s important to note that DeepMind is also contributing to work in AI safety, which may be reducing existential risk.)

    If DeepMind accounts for around 10% of the spending on advancing AI capabilities, this gives us a figure of around £10 billion. (Given that there are many AI companies in the US, and a large effort to produce advanced AI in China, we think 10% could be a good overall guess.)

    As an upper bound, the total revenues of the AI sector in 2021 were around $340 billion.

    So overall, we think the amount being spent to advance AI capabilities is between $1 billion and $340 billion per year. Even assuming a figure as low as $1 billion, this would still be around 100 times the amount spent on reducing risks from AI. ↩︎
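To make the arithmetic in footnote 12 easy to follow, here is a minimal sketch of the same Fermi estimate. It only restates the figures quoted in the footnote (DeepMind’s roughly £1 billion of 2020 expenses, the assumed 10% share of total capability spending, and the roughly $340 billion of sector revenue used as an upper bound), and it follows the footnote’s own rough treatment of £1 billion as about $1 billion.

```python
# Rough Fermi estimate restating footnote 12 (illustrative only; figures from the footnote).

deepmind_expenses_2020 = 1e9   # ~£1 billion of expenses, per DeepMind's 2020 annual report
assumed_deepmind_share = 0.10  # guess used in the footnote: DeepMind ~10% of capability spending
sector_revenue_2021 = 340e9    # ~$340 billion total AI sector revenue (upper bound)

central_estimate = deepmind_expenses_2020 / assumed_deepmind_share  # ~£10 billion per year
lower_bound = deepmind_expenses_2020                                # treat ~£1B as ~$1B, as above

print(f"Lower bound:      ~${lower_bound / 1e9:.0f} billion per year (DeepMind's expenses alone)")
print(f"Central estimate: ~£{central_estimate / 1e9:.0f} billion per year (if DeepMind is ~10% of the total)")
print(f"Upper bound:      ~${sector_revenue_2021 / 1e9:.0f} billion per year (total AI sector revenue)")
```

Even the lower bound, as the footnote notes, is around 100 times the amount currently spent on reducing risks from AI.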