AI "Stop Button" Problem - Computerphile - Computerphile

What is corrigibility in the context of AGI?

Corrigibility is a property of AGI that means it is open to being corrected. It understands that its utility function is not complete and is willing to change it.

Why is it important for an AGI to be corrigible?

Corrigibility is important for an AGI to be able to learn and improve. If an AGI is not corrigible, it may resist changes to its utility function, making it difficult to teach or correct its behavior.

How does the speaker describe the problem of creating an AGI that won't prevent a user from pressing a button but also won't persuade the user to press it?

The speaker describes it as an open problem and notes that while there are proposals for ways to create an AGI with these properties, none of them are without issues and perfectly solve all the problems.

What is the utility function in the context of AI, as mentioned by the speaker?

The utility function is what the AI cares about and is the measure of what the AI is trying to optimize. In the example given, the stamp collecting device's utility function is how many stamps it collects in a year.