
Guaranteed Safe AI Architectures

Governance/Ethics Risk
Develop and implement AI architectures built around separable, auditable world models, where safety requirements can be specified in terms of the model's state space, and where each proposed AI output is accompanied by a proof that it does not take the system outside the safe region of that state space.
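The pattern above can be sketched in a few lines: a world model that predicts the effect of an action, a safety specification expressed as a predicate over the model's state space, and a gate that only releases outputs whose predicted successor state stays in the safe region. This is a minimal illustration, not a proposed implementation; all names are hypothetical, and a real guaranteed-safe system would replace the runtime check with a machine-checked proof of safety.

```python
# Illustrative sketch of the guaranteed-safe pattern described above.
# All names are hypothetical; a real system would attach a formal proof
# rather than a runtime check.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class State:
    """A point in the world model's state space (toy example)."""
    temperature: float


def world_model(state: State, action: float) -> State:
    """Separable, auditable model of how an action changes the state."""
    return State(temperature=state.temperature + action)


def is_safe(state: State) -> bool:
    """Safety specification expressed over the model's state space."""
    return 0.0 <= state.temperature <= 100.0


def gated_output(state: State, proposed_action: float) -> Optional[float]:
    """Release the proposed action only if the model predicts that the
    resulting state remains inside the safe region; otherwise reject."""
    if is_safe(world_model(state, proposed_action)):
        return proposed_action
    return None  # rejected: no safety certificate


print(gated_output(State(20.0), 5.0))    # within the safe region: accepted
print(gated_output(State(20.0), 500.0))  # leaves the safe region: rejected
```

The key architectural point is the separation: the world model and the safety predicate are independent, auditable components, so the safety argument does not depend on trusting the AI system that proposes the action.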

R&D Gaps (1)

The potential for AI systems to behave unpredictably or dangerously ("go rogue") is a critical concern, and safe, controllable AI architectures are essential for reliable operation. See also:
- https://www.lesswrong.com/posts/fAW6RXLKTLHC3WXkS/shallow-review-of-technical-ai-safety-2024
- h...