Understanding Neural Design Principles of Social Instincts
Study the neural basis of human social instincts to inform AI design, so that AI systems can safely interpret and emulate human social behavior.
Resources (3)
Intro to Brain-Like-AGI Safety, by Steven J. Byrnes
Research and Reviews
Reducing LLM deception at scale with self-other overlap fine-tuning
Research and Reviews
Intro to Brain-like-AGI Safety: Symbol grounding & human social instincts
Whitepapers and Essays
R&D Gaps (1)
AI systems may behave unpredictably or dangerously (“go rogue”), which is a critical concern. Safe, controllable AI architectures are essential for reliable operation.
See also:
• https://www.lesswrong.com/posts/fAW6RXLKTLHC3WXkS/shallow-review-of-technical-ai-safety-2024
• h...