ChatGPT currently unavailable
Incident Report for OpenAI
Postmortem

On November 8th, 2024, from 4:06 PM PT to 4:41 PM PT, a large portion of ChatGPT traffic failed with 503 response codes. For most of the incident, from 4:06 PM PT to 4:32 PM PT, this affected requests from the web client, OS-specific application clients, and iOS devices. The Assistants API also experienced errors.

The root cause was a change to the load-balancing configuration of a downstream service on which ChatGPT depends. This change activated a latent bug in the logic that applies the configuration, rapidly increasing the server workers' memory usage until all of them crashed.

Once we identified the root cause, we mitigated the outage by rolling back the configuration change. By 4:32 PM PT the errors had almost completely subsided, and service had fully recovered by 4:41 PM PT.

To keep the system safe in the short term, we have already done the following:

  • Locked the configuration in question.
  • Audited all places where the configuration is read.

In the coming weeks, we will significantly refactor our configuration delivery systems to prevent this class of outage from happening again:

  • Use property testing to verify that our configuration parsing libraries behave as expected across widely varying inputs, especially inputs that the library authors did not foresee.
  • Implement gradual rollout of configuration changes.
  • Create faster configuration rollback mechanisms.
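
As an illustration of the property-testing item above, here is a minimal sketch in Python. It is not the actual parser or test suite: `parse_lb_config`, `ConfigError`, `MAX_BACKENDS`, and the config shape (a `backends` list of `host`/`weight` entries) are all hypothetical names chosen for the example, and the hand-rolled random generator stands in for a dedicated property-testing library. The property checked is that, for arbitrary input, the parser either rejects the config cleanly or returns a bounded, well-formed result. Any other exception fails the check.

```python
import random
import string

MAX_BACKENDS = 1000  # hypothetical safety bound on accepted config size


class ConfigError(ValueError):
    """Raised when a config is rejected (hypothetical error type)."""


def parse_lb_config(raw: dict) -> list:
    """Hypothetical parser: validate a config and return (host, weight) pairs."""
    backends = raw.get("backends")
    if not isinstance(backends, list):
        raise ConfigError("backends must be a list")
    if len(backends) > MAX_BACKENDS:
        raise ConfigError("too many backends")
    parsed = []
    for entry in backends:
        if not isinstance(entry, dict):
            raise ConfigError("each backend must be a mapping")
        host, weight = entry.get("host"), entry.get("weight")
        if not isinstance(host, str) or not host:
            raise ConfigError("host must be a non-empty string")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(weight, bool) or not isinstance(weight, int):
            raise ConfigError("weight must be an integer")
        if not 1 <= weight <= 100:
            raise ConfigError("weight must be in 1..100")
        parsed.append((host, weight))
    return parsed


def random_value(rng: random.Random, depth: int = 0):
    """Generate an arbitrary, often malformed, config value."""
    makers = [
        lambda: None,
        lambda: rng.choice([True, False]),
        lambda: rng.randint(-10**6, 10**6),
        lambda: "".join(rng.choice(string.printable)
                        for _ in range(rng.randint(0, 8))),
    ]
    if depth < 3:  # cap nesting so generation always terminates
        keys = ["backends", "host", "weight", "unexpected"]
        makers.append(lambda: [random_value(rng, depth + 1)
                               for _ in range(rng.randint(0, 4))])
        makers.append(lambda: {rng.choice(keys): random_value(rng, depth + 1)
                               for _ in range(rng.randint(0, 3))})
    return rng.choice(makers)()


def check_parser_property(trials: int = 2000, seed: int = 0) -> int:
    """Return how many random configs were accepted; raise if the property fails."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(trials):
        raw = {"backends": random_value(rng)}
        try:
            result = parse_lb_config(raw)
        except ConfigError:
            continue  # a clean rejection satisfies the property
        # Property: anything accepted is bounded in size and well-formed.
        assert len(result) <= MAX_BACKENDS
        assert all(isinstance(h, str) and h and 1 <= w <= 100
                   for h, w in result)
        accepted += 1
    return accepted
```

Calling `check_parser_property()` runs the check with a fixed seed, so a failure is reproducible. In practice a property-testing library would add input shrinking and richer generators. The sketch only shows the shape of the invariant: reject cleanly or stay within known bounds.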

We know that extended API outages affect our customers’ products and business, and outages of this magnitude are particularly damaging. While we came up short here, we are committed to preventing such incidents in the future and improving our service reliability.

Posted Nov 15, 2024 - 13:54 PST

Resolved
This issue has now been resolved. Between 4:06pm and 4:30pm PT, ChatGPT was unavailable to all users. Access was restored to most users by 4:34, with a small number of customers still experiencing issues until 5pm.
Posted Nov 08, 2024 - 17:47 PST
Monitoring
A fix has been implemented and we are monitoring results. Most users should already see access to ChatGPT restored.
Posted Nov 08, 2024 - 17:17 PST
Update
We are continuing to work on the fix for our issue. Thank you for your patience.
Posted Nov 08, 2024 - 17:06 PST
Identified
We have identified the root cause of this issue, and are currently working to implement a fix.
Posted Nov 08, 2024 - 16:37 PST
Investigating
We are aware of an issue which has resulted in ChatGPT being unavailable. We are currently investigating and working to restore functionality as soon as possible.
Posted Nov 08, 2024 - 16:13 PST
This incident affected: ChatGPT.