Guaranteed Safe AI Architectures
Develop and implement AI architectures with three properties: a separable, auditable world model; a safety specification expressed in terms of that model’s state space; and a proof accompanying each proposed AI output that the output does not leave the safe region of that state space.
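As a rough illustration of how these pieces could fit together, the sketch below uses a toy deterministic world model over integer states, a safety specification given as a set of safe states, and a "proof" that is simply a claimed state trajectory which a verifier replays and checks independently. Every name here (`WorldModel`, `SafetySpec`, `Proposal`, `verify`) is hypothetical and chosen for illustration. It is not drawn from the resources listed on this page, and real proposals in this area involve far richer models and proof systems.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

# Assumption: states and actions are plain integers in this toy setting.
State = int
Action = int


@dataclass(frozen=True)
class WorldModel:
    """Toy world model, kept separate from the AI so it can be audited."""
    step: Callable[[State, Action], State]  # deterministic transition


@dataclass(frozen=True)
class SafetySpec:
    """Safety stated purely in terms of the model's state space."""
    safe_states: FrozenSet[State]

    def holds(self, state: State) -> bool:
        return state in self.safe_states


@dataclass(frozen=True)
class Proposal:
    """A proposed output (action sequence) plus its certificate."""
    actions: Tuple[Action, ...]
    claimed_trajectory: Tuple[State, ...]  # start state, then one per action


def verify(model: WorldModel, spec: SafetySpec,
           start: State, proposal: Proposal) -> bool:
    """Replay the certificate against the model; accept only if every
    state matches the claim and stays inside the safe region."""
    if len(proposal.claimed_trajectory) != len(proposal.actions) + 1:
        return False
    if proposal.claimed_trajectory[0] != start or not spec.holds(start):
        return False
    state = start
    for action, claimed in zip(proposal.actions,
                               proposal.claimed_trajectory[1:]):
        state = model.step(state, action)
        if state != claimed or not spec.holds(state):
            return False
    return True


# Hypothetical example: a position on a line, safe region is 0..5.
model = WorldModel(step=lambda s, a: s + a)
spec = SafetySpec(safe_states=frozenset(range(0, 6)))

ok = Proposal(actions=(1, 1, 1), claimed_trajectory=(0, 1, 2, 3))
bad = Proposal(actions=(3, 3), claimed_trajectory=(0, 3, 6))

print(verify(model, spec, 0, ok))   # True
print(verify(model, spec, 0, bad))  # False
```

The point of the design is that the verifier trusts only the world model and the specification, never the proposer: an output is accepted only when its certificate checks out.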
Resources (2)
• ARIA Opportunity Space: Safeguarded AI (Funding Program)
• Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems (Whitepapers and Essays)
R&D Gaps (1)
AI systems may behave unpredictably or dangerously (“go rogue”). Architectures that keep AI systems safe and controllable are essential for reliable operation.
See also:
• https://www.lesswrong.com/posts/fAW6RXLKTLHC3WXkS/shallow-review-of-technical-ai-safety-2024
• h...