MITTR | Artificial intelligence

OpenAI has trained its LLM to confess to bad behavior


2025-12-03


OpenAI sees confessions as one step toward that goal. The work is still experimental, but initial results are promising, Boaz Barak, a research scientist at OpenAI, told me in an exclusive preview this week: “It’s something we’re quite excited about.”

