OpenAI published a new paper called "Monitoring Monitorability." It offers methods for detecting red flags in a model's reasoning. Those shouldn't be mistaken for silver bullet solutions, though. In ...
Results that may be inaccessible to you are currently showing.
Hide inaccessible results