No webhooks from Asana

Hi everyone,

Thanks again for your support this week. Our team has completed its investigation of Tuesday’s incident, and I want to close the loop for everyone here.

We’ve updated our status page with the following postmortem report:

Incident: An operation created a backlog of events too large to be processed within our timeout. The event processing job was rescheduled while still in the failing state, which left the corresponding workers stuck and severely delayed all event distribution. Full recovery to expected conditions took ~11 hours.

Impact: Events associated with this database were delayed. A subset of these events failed to be delivered because they aged out due to the delay. Of the events that were not delivered, only ~5-10% were customer events. No customer data was lost.

Moving forward: As a result of this incident, Asana is implementing changes to make our event distribution systems more resilient to cascading failures and high event volume.

Our metric considers a weighted average of uptime experienced by users at each data center. The number of minutes of downtime shown reflects this weighted average.

Feel free to reply here directly if you have more questions about the incident. Again, we appreciate your understanding as we continue to optimize the durability and stability of webhooks.

We’re planning about 30 minutes of scheduled maintenance at 9:00am PDT on Saturday, October 16th. I’m happy to share that we’re continuing to make progress on what we’ve committed to, and we just need a brief period of downtime as we improve our webhooks infrastructure. For more details, check out this post.

Thank you and have a great weekend, everyone!

Andrew
