@Alfieri_Craig@GoogleOSS@AntithesisHQ I'd be thrilled to collaborate on putting etcd to the test. Let's connect and discuss how we can make this happen!
Kubernetes is going AI-first! 🚀 We’re proud to support the new Certified Kubernetes AI Conformance program, ensuring AI workloads remain portable and reliable through common, industry-wide standards.
Read more: https://t.co/OVhxOCmgKb #Kubernetes#AI
How is your team scaling AI today?
@danielepolencic I would appreciate proper research.
While it is not always easy to get accurate information, open source maintainers don't bite. At least not all of them.
Each community has its own chat and public meetings, where you can ask questions. They would appreciate feedback too.
@danielepolencic As a person that worked on 30k node GKE clusters based on etcd, I don't know what this refers to.
If you are interested in scalability topics by people working on the topic watch https://t.co/K7LIjfLIUX
We are also working on 15GB resource size on etcd https://t.co/bnsjF46PyV
@danielepolencic Again, you didn't understand the motivation. It's not project A vs project B. It's the fact that AWS journal and Spanner use Atomic clocks for concensus. That allows for clean horizontal scaling of distributed system, don't expect something like this in open source.
@danielepolencic Have you ever run etcd in production? Or are you collecting random complaints from the internet from people trying to use etcd with NFS?
Etcd is not perfect, but the fact that some people can struggle with 3 node K8s, while others run 30k without a problem is interesting.
@danielepolencic What have watch events to the leader? Totally unrelated.
Watch events are distributed totally outside raft concensus. Also API server demuxes the watch request so that there is just one watch per resource. So number of controllers has no impact on etcd.
https://t.co/BI2UtqtEWx
@danielepolencic Here we go again...
8GB is an old limit that was never revisited or updated. Most cloud providers run etcd with 20GB-30GB without much problem.
Usually it's Kubernetes that breaks when you put a lot of data in a single resource or namespace.
Read https://t.co/bnsjF46PyV
Learn how the etcd open source project overcame a significant knowledge drain and reliability issues by implementing deterministic simulation testing.
By @ha_joslyn, thanks to @AntithesisHQ, feat. @serathius https://t.co/xdL1EN0S9t
@indygupta Pretty sure this fact was well established over 7 years ago in multiple Reddit posts. Fact that you can doesn't automatically means that you should set limits for all your containers.
Maybe we should give the discussion a rest and just put it in official documentation? @thockin
Today our CEO, Will Wilson, announced a partnership with @CloudNativeFdn to provide free reliability testing for Graduated and Incubating projects. Stop by booth 457 to learn more. @etcdio maintainer @serathius is presenting at 12 today in the Project Pavilion about the work we did together to help advance etcd testing!
How do you make a foundational project like etcd truly robust? 🤔 @ha_joslyn sits down with @serathius (@Google) at #KubeCon to discuss how the etcd maintainer community used the challenge of DST to drive better, more rigorous testing.