Fairwinds provides Managed Kubernetes-as-a-Service & open source software to enable organizations to run & optimize mission-critical Kubernetes infrastructure
If every AI pod requests https://t.co/Ef6CEedj2q: 1 and uses a fraction of it, you are paying for idle silicon. The article shows when whole GPUs are fine, and when MIG or time slicing actually reduce waste without breaking latency:
https://t.co/5qe922nPHD
#AIInfra #Kubernetes
Running AI on Kubernetes is more than adding GPU nodes.
You need:
> GPU aware scheduling
> Patterns for training vs inference
> Guardrails so one model can’t break prod
This post digs into what AI ready Kubernetes actually requires:
https://t.co/cjTHgyJRGO
#K8s #AIInfrastructure
AWS Community Day | Midwest is June 24 in Indianapolis. Fairwinds’ Stevie Caldwell (SRE Tech Lead) + Andy Suderman (CTO) are speaking on running LLM inference on Kubernetes with EKS. Worth a look if you own platform/SRE.
https://t.co/XEaY70dCvw
#AWS #AWSCommunityDay #k8s #LLM
Next time you hit CrashLoopBackOff, skip the panic and random kubectl commands. We broke down exactly how to troubleshoot it methodically and efficiently so you can resolve it fast and get back to building.
https://t.co/iDeP0Ejih7
#Kubernetes #CloudNative #DevOps
Kubernetes has become the backbone of production: 82% of container users now run it in production and 94% are running, piloting, or evaluating it, according to the latest CNCF survey. The challenge is operating it well at scale.
https://t.co/VGl4SsOOHs
#Kubernetes #CloudNative
AI infrastructure conversations often end in the same place: the Kubernetes platform.
Teams want one place to run services, data pipelines, and GPU workloads with shared guardrails, not a separate AI stack.
Here’s how that’s playing out:
https://t.co/kogIPIFH7x
#k8s #AI #MLOps
One reason the Trivy attack worked: version tags were mutable. Attackers overwrote known-good binaries after the fact.
Immutable tags make that impossible. Pin to a version, it stays there.
Learn more ways to shrink your exposure:
https://t.co/pTwmD0GP24
#k8ssecurity
Dashboards show what your cluster is doing. This experiment asks what happens if you listen instead.
A Go controller, OSC, and SuperCollider turn Kubernetes events into sound so you can hear pod creates, deletes, and scaling.
https://t.co/Gf5wKTosec
#KubeCon #DevOps #SRE
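For readers curious what "sending an event over OSC" involves, here is a minimal sketch of packing an OSC 1.0 message by hand. It is written in Python for brevity (the project itself uses a Go controller), and the address `/k8s/pod` and the argument layout are assumptions for illustration, not the project's actual message format.

```python
import struct


def osc_string(value: str) -> bytes:
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    raw = value.encode("ascii") + b"\x00"
    return raw + b"\x00" * ((-len(raw)) % 4)


def osc_message(address: str, *args) -> bytes:
    """Build an OSC message from an address and int, float, or str arguments."""
    type_tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("bool arguments are not handled in this sketch")
        if isinstance(arg, int):
            type_tags += "i"
            payload += struct.pack(">i", arg)  # 32-bit big-endian integer
        elif isinstance(arg, float):
            type_tags += "f"
            payload += struct.pack(">f", arg)  # 32-bit big-endian float
        elif isinstance(arg, str):
            type_tags += "s"
            payload += osc_string(arg)
        else:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
    return osc_string(address) + osc_string(type_tags) + payload


# Hypothetical event: a pod was created and the deployment now has 3 replicas.
message = osc_message("/k8s/pod", "created", 3)
```

The resulting bytes can be sent as a single UDP datagram with `socket.sendto`; which host and port the SuperCollider side listens on depends on how it is configured.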
#Kubernetes was supposed to help with efficiency, yet the cloud bill keeps climbing. Labels, requests, idle nodes, and autoscaling policies decide who pays and how much. This post outlines the questions teams use to untangle spend (and more). Learn more:
https://t.co/PtQSUVMrEu
Kubernetes gives you graphs and logs. This project asks what happens if you turn events into sound instead.
A small controller watches pod and deployment events, sends them over OSC, and lets SuperCollider handle the rest.
Details:
https://t.co/oOHq25bh5X
#KubeCon #DevOps #Infra
Most GPU waste on Kubernetes is not that Kubernetes is “wrong for AI.” It is whole‑GPU allocation, no rightsizing, and idle GPU nodes. If you cannot see idle GPU hours by team, you are probably missing it. Details: https://t.co/AMhHFCaKNj
#Kubernetes #GPU #MLOps
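A rough sketch of what "idle GPU hours by team" could look like as arithmetic. The record shape, team names, and numbers below are hypothetical; in practice the inputs would come from whatever metrics and labels your cluster already exports.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class GpuAllocation:
    team: str               # hypothetical owner label
    gpus: int               # whole GPUs requested by the workload
    hours: float            # wall-clock hours the GPUs were held
    avg_utilization: float  # average GPU utilization, 0.0 to 1.0


def idle_gpu_hours_by_team(allocations):
    """Sum allocated-but-unused GPU hours per team."""
    idle = defaultdict(float)
    for a in allocations:
        idle[a.team] += a.gpus * a.hours * (1.0 - a.avg_utilization)
    return dict(idle)


# Made-up sample data for illustration only.
sample = [
    GpuAllocation("ml-research", gpus=4, hours=10.0, avg_utilization=0.25),
    GpuAllocation("search", gpus=1, hours=24.0, avg_utilization=0.5),
    GpuAllocation("ml-research", gpus=2, hours=5.0, avg_utilization=0.0),
]
report = idle_gpu_hours_by_team(sample)
```

With these made-up numbers, ml-research holds 40 idle GPU hours and search holds 12, which is the kind of per-team view the post argues you need.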
Bags packed? 🧳
Observability Summit North America kicks off TOMORROW, May 21 in Minneapolis. We can’t wait to see the community back together!
Final registration: https://t.co/N9evKgAIcv
#O11ySummit
Pods that look fine in dev but die under real traffic are often hitting bad memory requests and limits, not only bad code. This post walks through how teams diagnose OOMKilled workloads using real usage data and smart autoscaling.
More:
https://t.co/cYHI0vsp44
#Kubernetes #SRE
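One way to turn "real usage data" into memory settings is a simple percentile heuristic, sketched below. The percentile, the headroom factor, and the sample values are assumptions for illustration; this is not the method used by the linked post or by any particular tool.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_memory_mib(usage_mib, request_pct=90, limit_headroom=1.25):
    """Suggest a request near typical usage and a limit above the observed peak."""
    request = percentile(usage_mib, request_pct)
    limit = math.ceil(max(usage_mib) * limit_headroom)
    return {"request_mib": request, "limit_mib": limit}


# Hypothetical memory usage samples in MiB, including one traffic spike.
usage = [200, 210, 220, 230, 240, 250, 260, 270, 280, 400]
suggestion = suggest_memory_mib(usage)
```

The point of the sketch is that a limit set from dev-time usage (around 280 MiB here) would be exceeded by the 400 MiB spike, which is one common path to an OOMKilled pod.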
Tired of trial-and-error debugging for CrashLoopBackOff? Follow this systematic troubleshooting approach to find the root cause faster, get your pods stable, and stop wasting time on random fixes that don't work:
https://t.co/aQfkeuZ1os
#Kubernetes #SRE #PlatformEngineering
When Trivy was compromised, organizations with 1-hour CI/CD credentials had nothing to rotate.
The exposure window closed on its own. That's the power of short-lived credentials.
Learn more ways to shrink your exposure:
https://t.co/3m10GUSsKG
#Kubernetes #DevSecOps
CNCF reports that 66% of organizations using generative AI now run those workloads on Kubernetes, making the platform a key layer for GPU utilization, autoscaling, and cost control across training, data processing, and inference.
https://t.co/ia09q0v7NA
#AI #Kubernetes #GPU
How much momentum would you get back if Kubernetes upgrades stopped hijacking sprints?
This piece looks at upgrades as an economics problem, not just a technical chore:
https://t.co/bXOmaAGBHB
#Kubernetes #EngineeringLeadership #TechStrategy
Today's the day! Join us at 1pm EDT. Free, hands-on EKS workshop with Andy Suderman (Fairwinds CTO, CNCF Ambassador, creator of Goldilocks) and team.
Overview session followed by a hands-on lab. Limited spots — register now.
https://t.co/xji19ruClq
#EKS #Kubernetes
Happening today: EKS Fundamentals: How to Run and Scale Containerized Production Workloads.
Get hands‑on with Amazon EKS and learn how to run and scale production containers the right way.
Last‑minute sign‑up 👇
https://t.co/HvUQv76aR7
#AmazonEKS #Kubernetes #CloudNative
Running Kubernetes in production usually means chasing the same few issues: OOMKills, spend blowups, and fragile autoscaling. This post shares seven common questions SREs use to get to root cause faster. Learn more:
https://t.co/eiATLYQRSQ
#Kubernetes #SRE #DevOps