GitHub went down for ~70 minutes yesterday. Interestingly, the root cause was not a database (the usual suspect), but an auth layer that was returning 401s. Although outages are not good, we as engineers can learn a thing or two from them. Here's a quick dissection...
So, about 15% of API traffic started getting "Unauthorized" responses for requests that were perfectly valid. The credentials were fine. But the 'infra' was lying. Here is the part that makes this interesting.
Every well-behaved HTTP client reauthenticates when it receives a 401. So thousands of apps did exactly what they were supposed to do - and that made things worse.
Every client that got a false 401 (the root cause of the 401s has not been mentioned yet) kicked off a token refresh, which piled more load onto an already struggling auth layer. Here is my key takeaway...
When a 401 comes back, we typically reauthenticate, and we should. But if we get 10 consecutive 401s on a token that was just refreshed, reauthenticating again is not the answer. That is a circuit-breaker moment - back off, raise an alert, and stop hammering the system.
Retrying blindly in an auth-failure loop could turn an incident into a full outage. So, this is something you can account for when building your next system :)
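A minimal sketch of that circuit-breaker idea, in Python. Everything named here (`AuthCircuitBreaker`, `request_with_reauth`, the `send` and `refresh` callables, the 60-second cooldown) is hypothetical and only for illustration; the threshold of 10 mirrors the number above and is not a recommendation from GitHub.

```python
import time
from typing import Callable, Optional


class AuthCircuitOpen(Exception):
    """Raised when reauthentication is paused because 401s keep coming back."""


class AuthCircuitBreaker:
    """Counts consecutive 401s and pauses reauthentication past a threshold."""

    def __init__(
        self,
        max_consecutive_401s: int = 10,   # assumed threshold
        cooldown_seconds: float = 60.0,   # assumed back-off window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_consecutive_401s = max_consecutive_401s
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._consecutive_401s = 0
        self._opened_at: Optional[float] = None

    def record_status(self, status_code: int) -> None:
        if status_code != 401:
            # Any non-401 response means auth is working again: reset.
            self._consecutive_401s = 0
            self._opened_at = None
            return
        self._consecutive_401s += 1
        if self._consecutive_401s >= self.max_consecutive_401s and self._opened_at is None:
            self._opened_at = self._clock()  # open the breaker

    def allow_reauth(self) -> bool:
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.cooldown_seconds:
            # Cooldown elapsed: close the breaker and allow a fresh attempt.
            self._opened_at = None
            self._consecutive_401s = 0
            return True
        return False


def request_with_reauth(
    send: Callable[[str], int],      # hypothetical: sends the request, returns the status code
    refresh: Callable[[], str],      # hypothetical: fetches a new token
    breaker: AuthCircuitBreaker,
    token: str,
) -> int:
    while True:
        status = send(token)
        breaker.record_status(status)
        if status != 401:
            return status
        if not breaker.allow_reauth():
            # Stop hammering the auth layer; the caller should alert and back off.
            raise AuthCircuitOpen("too many consecutive 401s after refresh")
        token = refresh()
```

The caller would catch `AuthCircuitOpen`, raise an alert, and wait out the cooldown before trying again. A production version would likely also add jitter and share the breaker state across workers, which this sketch leaves out.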
Hope this helps.
Worst travel experience ever with @FlixBus. The bus is delayed. The AC at our seats is not working properly. Within an hour of the journey starting, they had to call an electrician for a headlight problem, which is still not fixed more than an hour later.
📢 Exciting news! 🎉🎉
PyCon India 2025 is happening in Bengaluru from September 12-15!
Get ready for an incredible gathering of Python enthusiasts. 🌐✨
Stay tuned for more details! https://t.co/y2lR7bJb1M
#PyConIndia2025 #Bengaluru
Today, @akshayg96_ will share his experience with #LFXMentorship at #OSSummit Europe🎉
Hear firsthand how the @linuxfoundation Mentorship can help you build skills & gain the experience to make meaningful contributions to #OSS.
👋 in comments if you're attending.
LFX Mentorship program can help you gain the skills & experience necessary to make effective open source contributions.
At #OSSummit Europe, @akshayg96_ will be speaking about his experience with #LFXMentorship 🎉
Register here to catch his talk👇
https://t.co/VlaIY6XXks
Kubernetes communities are celebrating #KuberTENes across the globe 🌍 In #Pune, we are throwing a party 🎉 at the @infracloudio office 🏢
Join us to celebrate this milestone with like-minded folks over cake, coke, & fascinating war stories from #K8s veterans👇
https://t.co/Va3QgSxMB8
🚀 **Don't miss today's mentorship call!** 🚀
If you're drafting your first conference proposal or need any assistance with your proposal, this is the perfect opportunity for you. ✨
Join us today at 🕒 15:00 IST at [🔗 https://t.co/qKitSB1KfQ] 🎉
#PyConIndia2024 #Python 🐍
Guess who's going to @KubeCon_ 🤩
Paralus will make its presence felt at the CNCF lightning talks with a quick overview.
If you're attending, be there to support and cheer the project!
#KubeConEU2024 #KubeconParis #KubeCon #CloudNative
🥁 Exciting news! PyCon India 2024 is happening in Bengaluru from September 20-23! 📆 Join us for an unforgettable conference in the heart of India's tech hub! Calling for Talks and Workshop proposals: https://t.co/PfsP9UWJbH
#PyConIndia #Python #CFP
@KubeCon_ India is Happening!
🗓️ December 11-12, 2024
📌 India International Convention and Expo Centre, Dwarka, Delhi, India
While they mention an estimated 3000+ attendees, I get a feeling there will be many more!
Mark your calendars 🚀
#KubeConIndia #KubeCon
📢 Exciting news
Starting January 23rd, we'll be hosting regular #community meetings every 2nd and 4th Tuesday of the month.
Join us for lively discussions, project updates, and community engagement.
Details 👉🏻 https://t.co/Vym8cgstru
#Paralus #Kubernetes #CloudNative
The 1st Workshop on Programming for the Planet (PROPL) is coming up on January 20, 2024 in London! The schedule is available at https://t.co/st8CmL4McK. #POPL2024 @poplconf @tarides_