OpenAI has announced a series of new mental health guardrails for ChatGPT, alongside the release of new open models and ahead of a GPT-5 update expected in the coming weeks.
The new guardrails are designed to change how the chatbot handles sensitive topics. ChatGPT will no longer give direct answers to high-stakes personal questions, such as requests for relationship advice. Instead, it will take a more facilitative role, asking questions to help users think through the issue themselves. The system will also monitor how long users stay engaged and prompt them to take breaks during prolonged, continuous sessions.
OpenAI is also developing capabilities for ChatGPT to detect signs of mental or emotional distress. When such signs are detected, the chatbot will direct users toward evidence-based resources for support. The implementation of these features follows multiple reports of individuals experiencing negative mental health outcomes after extensive interactions with AI chatbots.
According to OpenAI, the new guardrails were developed in collaboration with more than 90 physicians from over 30 countries, including specialists in psychiatry and pediatrics, who helped create custom evaluation methods for complex conversations. The company is also working with researchers to fine-tune its algorithms for detecting concerning user behavior, and it is establishing an advisory group of experts in mental health, youth development, and human-computer interaction to further strengthen safety.