MITTR | Artificial intelligence
OpenAI has trained its LLM to confess to bad behavior

2025-12-03
OpenAI sees confessions as one step toward that goal. The work is still experimental, but initial results are promising, Boaz Barak, a research scientist at OpenAI, told me in an exclusive preview this week: “It’s something we’re quite excited about.”