As co-chair of Cilium + eBPF Day with @lbernail, I'm excited that we are doing the opening and closing remarks together.
See you there? 🐝
https://t.co/kO5FglP53z
Every year I post a “best tech/papers of the year” list.
I try to write a detailed blog post, with a few lines on why I liked each item on the list or what I learned from it.
Here’s my list for 2023, if anyone is interested. I’ll have the blog post ready by early Jan.
The "Everything, Everywhere, All At Once" session from the KubeCon NA 2023 keynote has been posted. Great lessons learned, shared by @hemanthmalla, @lbernail, and the @datadoghq team.
https://t.co/f0VBAg5ZEM
this talk is instant canon. @datadog rocks up to kick off the day two #KubeConNA keynote with a post mortem of a recent 48 hour outage. this is the way. learn from system failure, share the knowledge, gain kudos and earn confidence. a proper tech talk.
#KubeCon day 2 keynote talk from Hemanth Malla & Laurent Bernaille about Datadog’s outage, in which they lost 60% of their #K8s nodes. The cause was an unattended upgrade of nodes that received a patch related to systemd. They created a node lifecycle automation platform to protect against this in the future.
Important lessons from Hemanth Malla & Laurent Bernaille about Datadog’s global outage: unexpected interactions arise organically over time if unchecked. #KubeCon
.@hemanthmalla + @lbernail are sharing how Datadog lost everything, everywhere, all at once - or more than 60% of its #Kubernetes nodes in less than an hour - and how it recovered and took steps to prevent another incident!
🐝 Watch this exciting session from eBPF Summit 2022
@lbernail spoke on All Your Queues Belong to Us: Debugging and Mitigating a Kernel Bug with eBPF
Register for eBPF Summit 2023 👇
https://t.co/Z3hQ28Yduz
Watch 👇
https://t.co/IakrXpAYim
In "A Deep Dive into the Platform-level Recovery" @lbernail picks up where he left off in the first post, and covers what it took to restore the @datadoghq platform in all affected regions in order to provide apps with enough compute capacity to recover.
https://t.co/5UghkwD3I6
@lbernail and I gave a talk about a kernel bug at OS Summit in Vancouver a few weeks back: All Your Queues Are Belong To Us. It's available to stream now: https://t.co/ubWxk0m9iw… - enjoy!
"A Deep Dive into the Platform-level Impact" by @lbernail is the first in a series of in-depth posts by engineers at @datadoghq about an incident earlier this year. We wanted to fully understand the "how", "why", and "what's next" before posting.
https://t.co/03GmgFeHcv
Slides are ready for our #OSSummit talk. If you're curious about Linux networking bugs and how to debug them (and in Vancouver) you should come: https://t.co/rEewG7pgiU