Praveen @krishprav23 - Twitter Profile

Praveen @krishprav23

4 days ago

@insanekrishnaa GitHub: https://t.co/9uBsHxEjKK Portfolio: https://t.co/ixwIGXcAHE LinkedIn: https://t.co/olSuDz63Z3

1

0

29

krishprav23 retweeted

Puneet Patwari

@system_monarch

6 days ago

LOAD BALANCING: Day 1/25

Your load balancer is lying to you.

It says traffic is "balanced." What it really means is: requests are being distributed. Whether they're distributed well is a completely different question.

I've seen a system with 4 servers where Server 3 was at 95% CPU while Server 1 sat at 12%. The load balancer said everything was fine.

A load balancer doesn't know what's happening inside your servers. It doesn't know one request takes 2ms and the next takes 12 seconds.

Your choice of algorithm, layer, and health check strategy is the difference between actual balance and the illusion of it.

Let me break it down. 🧵

9

398

29

284

71K
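The point in the load-balancing tweet above, that even distribution of requests is not the same as even load, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not anything from the thread itself: the workload (every fourth request takes 12 seconds, the rest 2 ms), the one-second arrival interval, and the names `simulate`, `round_robin`, and `least_connections` are all hypothetical.

```python
def simulate(durations, n_servers, interval, pick):
    """Replay requests arriving every `interval` seconds.

    Returns (busy_seconds, peak_in_flight), one entry per server.
    """
    in_flight = [[] for _ in range(n_servers)]  # finish times of open requests
    busy = [0.0] * n_servers
    peak = [0] * n_servers
    for i, duration in enumerate(durations):
        now = i * interval
        # Drop requests that have already finished on each server.
        in_flight = [[t for t in server if t > now] for server in in_flight]
        s = pick(i, in_flight)
        in_flight[s].append(now + duration)
        busy[s] += duration
        peak[s] = max(peak[s], len(in_flight[s]))
    return busy, peak


def round_robin(i, in_flight):
    # Ignores server state entirely: request i goes to server i mod n.
    return i % len(in_flight)


def least_connections(i, in_flight):
    # Picks the server with the fewest open requests right now.
    return min(range(len(in_flight)), key=lambda s: len(in_flight[s]))


# Hypothetical workload: every fourth request is slow (12 s), the rest take 2 ms.
durations = [0.002, 0.002, 0.002, 12.0] * 25

rr_busy, rr_peak = simulate(durations, 4, 1.0, round_robin)
lc_busy, lc_peak = simulate(durations, 4, 1.0, least_connections)

print("round robin       busy:", [round(b, 2) for b in rr_busy], "peak:", rr_peak)
print("least connections busy:", [round(b, 2) for b in lc_busy], "peak:", lc_peak)
```

In this contrived workload, round robin gives every server the same number of requests yet sends all of the slow ones to the fourth server, where they pile up. Least-connections keeps at most one open request per server, although it still does not equalize busy time, since it only sees connection counts and not request cost.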

krishprav23 retweeted

Thariq

@trq212

14 days ago

https://t.co/R6exTuF7P8

260

10K

1K

24K

3M

krishprav23 retweeted

Suni

@suni_code

13 days ago

Found the Best Resource to learn Harness Engineering. 😭 https://t.co/3eOEmMlfbv

34

2K

306

4K

129K

krishprav23 retweeted

Dan McAteer

@daniel_mac8

about 1 month ago

guys, I think memory gets cracked in 2026

then infinite context

and memory + infinite context

= continual learning

you can feel it in the air https://t.co/R884JGWYAK

76

1K

153

1K

137K

krishprav23 retweeted

Sebastian Raschka

@rasbt

about 1 month ago

New article: a visual tour of recent LLM architecture advances, from Gemma 4 to DeepSeek V4.

I focus on long-context efficiency tweaks like KV sharing, per-layer embeddings, layer-wise attention budgets, compressed attention, and mHC.

Link: https://t.co/KO81y3kTH7 https://t.co/wTx51QpQu4

45

2K

426

2K

123K

krishprav23 retweeted

ani

@anirudhbv_ce

about 1 month ago

We finally know why LLMs hallucinate. It's not the model. It's the geometry.

@OpenAI text-embedding-3-large: 91/3072 dimensions do real work.

@GeminiApp gemini-embedding-001: 80/3072 dimensions do real work.

~97% of your vector database is mathematically empty. Your RAG system is retrieving from noise.

@ashwingop and I present "The Geometry of Consolidation" - a proof that RAG compression has a hard floor no algorithm can beat, set by a single spectral number your embedding model cannot escape.

Every hallucination your RAG pipeline produces? This is why.

Paper + results: https://t.co/zut8pdoPbH

148

4K

458

4K

274K
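The tweet above does not say how "dimensions that do real work" is measured. One common spectral measure of effective dimensionality is the participation ratio of the covariance eigenvalues, (Σλ)² / Σλ². The sketch below assumes that measure purely for illustration; it may not be the quantity used in the paper, and the function name `participation_ratio` and the toy data are hypothetical. It uses the identities Σλ = tr(C) and Σλ² = ‖C‖²_F for a symmetric matrix C, so no eigendecomposition is needed.

```python
import random


def participation_ratio(vectors):
    """Effective number of dimensions: (sum of eigenvalues)^2 / sum of squared
    eigenvalues of the covariance matrix, computed as trace(C)^2 / ||C||_F^2."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    cov = [
        [sum(c[a] * c[b] for c in centered) / n for b in range(d)]
        for a in range(d)
    ]
    trace = sum(cov[a][a] for a in range(d))
    frob_sq = sum(cov[a][b] ** 2 for a in range(d) for b in range(d))
    return trace ** 2 / frob_sq


# Toy "embeddings": 8 nominal dimensions, but only the first 2 carry real variance.
random.seed(0)
toy = [
    [random.gauss(0, 1), random.gauss(0, 1)]
    + [random.gauss(0, 0.01) for _ in range(6)]
    for _ in range(2000)
]

print("nominal dimensions:", len(toy[0]))
print("participation ratio:", round(participation_ratio(toy), 2))
```

On the toy data the ratio comes out close to 2 even though each vector has 8 components, which is the kind of gap between nominal and effective dimension the tweet is describing.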

Praveen @krishprav23

about 1 month ago

@himanshutwtxs @theomkute @LocalDev_ Codex

0

16

krishprav23 retweeted

Alexander Whedon

@alex_whedon

about 1 month ago

Introducing SubQ - a major breakthrough in LLM intelligence.

It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:

- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus

Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do.

That's nearly 1,000x less compute and a new way for LLMs to scale.

1K

23K

3K

19K

13M

krishprav23 retweeted

DAIR.AI

@dair_ai

about 2 months ago

The Top AI Papers of the Week (April 19 - 26)

- Skill-RAG
- DeepSeek V4
- Autogenesis
- Attention to Mamba
- Stateless Decision Memory
- Self-Evolving Logic Synthesis
- Self-Generated World Knowledge

Read on for more:

8

315

58

284

50K

krishprav23 retweeted

Ben Lang

@benln

about 2 months ago

YC on how to build a company with AI from the ground up:

96

5K

549

11K

461K

krishprav23 retweeted

Daily Dose of Data Science

@DailyDoseOfDS_

about 2 months ago

Everyone is sleeping on this new OCR model!

- 85.9% (sota) on olmocr bench
- 90+ language support w/benchmarks
- 4B model (down from 9B)
- Full layout information
- Extracts + captions images and diagrams
- Strong handwriting, math, form, table support

100% open-source.

9

1K

164

2K

80K

krishprav23 retweeted

Atal

@ZabihullahAtal

about 2 months ago

Stanford just released a 1.5-hour lecture on “LLM Architecture.” This is exactly what systems engineers at Anthropic and OpenAI need to understand at a deep level. Give it some time. This might be the highest-ROI learning you do this month.

34

5K

693

10K

469K

krishprav23 retweeted

Kol Tregaskes

@koltregaskes

about 2 months ago

Ex-MIT researcher Isaak Freeman quits his PhD and drops the 50,000 H100 GPU roadmap to emulate a full human brain.

He mapped the entire path from 302-neuron worm to 86-billion-neuron human with connectomics costs now at 100 dollars per neuron and data acquisition via advanced microscopes as the only blocker left - digital humans just got a realistic timeline.

https://t.co/kGB5hOAQHC

69

3K

558

3K

183K

krishprav23 retweeted

Kye Gomez (swarms)

@KyeGomezB

about 2 months ago

Introducing OpenMythos

An open-source, first-principles theoretical reconstruction of Claude Mythos, implemented in PyTorch.

The architecture instantiates a looped transformer with a Mixture-of-Experts (MoE) routing mechanism, enabling iterative depth via weight sharing and conditional computation across experts.

My implementation explores the hypothesis that recursive application of a fixed parameterized block, coupled with sparse expert activation, can yield improved efficiency–performance tradeoffs and emergent multi-step reasoning.

Learn more ⬇️🧵

240

8K

1K

9K

2M

krishprav23 retweeted

Tanuj

@tanujDE3180

2 months ago

AI Production Skills That Pay ₹35L+ in India (2026) - 7+ YOE

> End-to-end RAG pipelines (evals, monitoring, cost controls)
> GPU/TPU orchestration at scale (autoscaling, spot vs on-demand trade-offs)
> Feature stores for low-latency, consistent inference
> Vector DB scaling + hybrid search (BM25 + embeddings)
> AI agent reliability (guardrails, retries, tool constraints)
> Hallucination mitigation (grounding, eval datasets, feedback loops)
> Multi-cloud resilience (failover, vendor abstraction)
> Cost-aware architecture (token optimization, caching layers)
> Real-time + batch pipeline design (when to use which)
> Observability for AI (not just logs - quality + drift)

11

146

19

527

591K

Praveen @krishprav23

2 months ago

Anthropic Fellows program: "Cross-Architecture Model Diffing with Crosscoders" (unsupervised discovery). They found cool stuff like CCP alignment features in Qwen models, American exceptionalism in Llama, and more. Full paper: https://t.co/NWWC4ratcz

0

51

krishprav23 retweeted

Anthropic

@AnthropicAI

2 months ago

New Anthropic Fellows Research: a new method for surfacing behavioral differences between AI models. We apply the “diff” principle from software development to compare open-weight AI models and identify features unique to each. Read more: https://t.co/VAsu2PSgCX

262

3K

350

1K

578K

krishprav23 retweeted