Kush Varshney कुश वार्ष्णेय

@krvarshney

I wrote a book (free pdf). Views are my own and don't necessarily represent IBM.

Chappaqua, NY

Joined March 2012

647 Following

3.2K Followers

8.4K Posts

Kush Varshney कुश वार्ष्णेय @krvarshney

1 day ago

Here's the paper, the framework, the seriousness. https://t.co/AIUSzmMh0b

Abhivardhan

@IndusThink

7 days ago

@AKLKO1977 "Advaita Vedanta approach to agentic AI" No wonder what it means. No white paper. No framework, just non serious stuff. Current Agents are unreliable. @AKLKO1977

https://t.co/gT6hFbLmgE

416

316

krvarshney retweeted

The Hindu

@the_hindu

3 days ago

The Nexbax AI Index proposes new AI evaluation metrics focused on real-world usability, cost, and accessibility for users in India and the Global South, challenging standard global benchmarks. ✍️Rashmi Patil https://t.co/q1EvuHQDNK

krvarshney retweeted

IBM Research @IBMResearch

9 days ago

Quantum startup @ParityQC demonstrates 52‑qubit quantum Fourier transform on IBM Heron processor, the largest to date: https://t.co/KUzbTxhp2b New “Parity Twine” method achieves record-setting performance by rethinking how quantum information is represented and propagated.

https://t.co/AsYVaV8oqi

krvarshney retweeted

Paul Schweigert @psschwei

13 days ago

A blog on how to get frontier-level results from small language models by using Mellea and Granite Libraries to do the heavy lifting: https://t.co/rfjBuJjfc8

488


Kush Varshney कुश वार्ष्णेय @krvarshney

29 days ago

@RishiBommasani @percyliang We used the open concept kitchen analogy in this video a few years ago: https://t.co/FfAsaR5FTt (start around 6:15) and have gotten positive feedback from viewers.

krvarshney retweeted

Javier Carnerero Cano @ccanojavi

about 1 month ago

📊 We also introduce VELI5, a new dataset with controlled factual errors + ground-truth fixes. This dataset has already been used to fine-tune state-of-the-art factuality guardrails such as Granite Guardian [https://t.co/t30KnYRs7P]. (3/4)

152

krvarshney retweeted

aipulsedaily

@aipulseda1ly

about 1 month ago

IBM just dropped Granite 4.1, their largest model release to date

Language, vision, speech, embeddings, and safety models all in one drop

The 8B instruct model reportedly matches their previous 32B MoE on instruction following and tool calling

Guardian 4.1 does risk and policy scoring with calibrated confidence levels instead of binary yes/no filtering, which is a smarter approach for enterprise deployment

All Apache 2.0, available on HuggingFace, Ollama, and watsonx

IBM is quietly building a full enterprise AI stack

https://t.co/KTarte7hkG

232

krvarshney retweeted

AqibAi

@Aqib__786Ai

about 1 month ago

IBM is clearly doubling down on a very specific lane here: practical, efficient, enterprise-ready models rather than chasing leaderboard dominance.

Granite 4.1 feels like a continuation of that philosophy, especially the 8B.

That 4M token usage vs 78M on Qwen is kind of wild. In real deployments, that translates directly into:
lower latency
dramatically lower cost
easier scaling for agent workflows

Which honestly matters more than raw benchmark scores for most companies.

The tradeoff is obvious though: you’re giving up peak intelligence. A 12 vs 15 score doesn’t sound huge, but in practice that gap can show up in:
reasoning depth
edge-case handling
coding reliability

So these aren’t “frontier competitors”; they’re workhorse models.

What’s arguably more important is the Apache 2.0 + openness push. That 61 Openness Index score puts IBM ahead of most “open-ish” players like Alibaba (Qwen) and Google (Gemma). For enterprises, that’s a big deal:
fewer licensing headaches
more control over deployment (on-prem / air-gapped)
easier compliance story

The positioning is pretty clear:
Granite 3B → edge / lightweight agents
Granite 8B → sweet spot (cost vs capability)
Granite 30B → heavier enterprise workloads where you still want efficiency

The most interesting signal here isn’t the scores; it’s the token efficiency trend. If models like this keep improving, the industry might shift from “bigger is better” to: “good enough intelligence, but 10–20x cheaper to run”

And that’s where adoption really explodes.

Curious part: if someone pairs Granite 8B with strong retrieval + tools, it could close a lot of that intelligence gap without losing its cost advantage. That’s probably the real play.

258

krvarshney retweeted

Artificial Analysis

@ArtificialAnlys

about 1 month ago

IBM has released three new non-reasoning Granite 4.1 models (30B, 8B, 3B) as open weights under Apache 2.0. All three are notably token-efficient relative to peer non-reasoning models, with the 8B standing out for its token efficiency relative to intelligence.

@IBM has released three new instruct models in the Granite 4.1 family: Granite 4.1 30B (15 on the Intelligence Index), Granite 4.1 8B (12), and Granite 4.1 3B (9). The release continues IBM's focus on small, efficient, and open models for enterprise and edge deployment, alongside the existing Granite 4.0 Nano family (1B and 350M variants released in October 2025). The Intelligence Index is the Artificial Analysis synthesis metric incorporating 10 evaluations covering agentic tasks, coding, and scientific reasoning.

Key benchmarking results:

➤ All three Granite 4.1 models score 61 on the Artificial Analysis Openness Index, standing out among peer open weights non-reasoning models. This is driven by full open weights under Apache 2.0 plus partial disclosures across pre-training data, post-training data, and training methodology. Granite 4.1 sits well above peers like Qwen3.5 (39), Gemma 4 (39) and GLM-4.7-Flash (44), and represents a meaningful improvement over the Granite 4.0 family (56), driven by stronger methodology disclosure. Olmo 3.1 and K2 Think V2 (both 89) remain leaders as the most ‘open’ models.

➤ Granite 4.1 8B uses just 4M output tokens to run the Intelligence Index. This is ~20x fewer than Qwen3.5 9B (78M tokens), ~3x fewer than Ministral 3 8B (13M), and ~2x fewer than Gemma 4 E4B (8M). The pattern holds across the family: Granite 4.1 30B uses 4.6M output tokens (vs 7M for Gemma 4 31B and 25M for Qwen3.5 27B), and Granite 4.1 3B uses 2.7M.

➤ Token efficiency comes at the cost of intelligence relative to peer non-reasoning models. Granite 4.1 30B (15) trails leading peers like Qwen3.5 27B (37) and Gemma 4 31B (32). Granite 4.1 8B (12) trails Ministral 3 8B (15) and Gemma 4 E4B (15). Granite 4.1 3B (9) trails Gemma 4 E2B (12).

➤ Granite 4.1 30B and 3B both gain on the Intelligence Index over their Granite 4.0 predecessors. Granite 4.1 30B (15) gains 4 points over Granite 4.0 H Small (32B / 9B active, 11), with the largest gains in tool use (τ²-Bench: 42% vs 17%) and agentic tasks (GDPval-AA: 493 vs 344 Elo). Granite 4.1 3B (9) gains 1 point over Granite 4.0 Micro (8).
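The token-efficiency ratios quoted for Granite 4.1 8B in the results above can be checked with simple arithmetic. The following is a minimal Python sketch that uses only the output-token counts stated in the post (in millions); the constant, dictionary, and function names are illustrative assumptions, not anything published by Artificial Analysis.

```python
# Output tokens (in millions) used to run the Intelligence Index,
# as quoted in the post above. All names here are illustrative only.
GRANITE_8B_TOKENS_M = 4.0

PEER_TOKENS_M = {
    "Qwen3.5 9B": 78.0,
    "Ministral 3 8B": 13.0,
    "Gemma 4 E4B": 8.0,
}


def token_ratio(peer_tokens_m: float, baseline_tokens_m: float) -> float:
    """Return how many times more output tokens the peer used than the baseline."""
    return peer_tokens_m / baseline_tokens_m


# Ratio of each peer's output tokens to Granite 4.1 8B's output tokens.
ratios = {
    name: token_ratio(tokens, GRANITE_8B_TOKENS_M)
    for name, tokens in PEER_TOKENS_M.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the output tokens of Granite 4.1 8B")
```

The unrounded ratios come out to 19.5, 3.25, and 2.0, which the post rounds to ~20x, ~3x, and ~2x.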

Other information:

➤ License: Apache 2.0 (open weights, permissive commercial use)

➤ Context window: 128K tokens

➤ Availability: Granite 4.1 8B is available via @WandB ($0.05/$0.1 per 1M input/output tokens) and @replicate. Weights for all three models are available via @huggingface.
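To put the quoted per-token prices in context, here is a rough Python sketch of a cost estimate. It assumes strictly linear pricing at the $0.05/$0.1 per 1M input/output token rates mentioned above and ignores any minimums, discounts, or other fees; the function name is hypothetical, and because the post does not state an input-token count, only the output side is computed.

```python
# Per-1M-token prices (USD) quoted in the post for Granite 4.1 8B.
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.10


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD, assuming linear per-token pricing (an assumption)."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Output side only: the 4M output tokens the post reports for the
# Intelligence Index run. The input-token count is not given in the
# post, so it is set to 0 here rather than guessed.
index_output_cost = estimate_cost_usd(input_tokens=0, output_tokens=4_000_000)
print(f"Output-token cost for 4M tokens: ${index_output_cost:.2f}")
```

Under these assumptions, the output tokens alone would cost about $0.40; a real bill would also include input tokens.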

237

24K

krvarshney retweeted

Keshav Ramji

@KeshavRamji

about 1 month ago

What if your language model could reason efficiently in an entirely new language? We introduce Abstract Chain-of-Thought, a new mechanism which allows language models to reason through a short sequence of reserved "abstract" tokens through reinforcement learning. It is as performant as verbalized CoT at a fraction of the cost, achieving major gains in inference-time efficiency.


132

887

krvarshney retweeted

Alex Bozarth @stbando

about 1 month ago

I've been working on an open source project called Mellea, and wrote a blog post about using it to automatically validate and fix Qiskit code generated by an LLM: https://t.co/xmibsKpaFm

218

Kush Varshney कुश वार्ष्णेय @krvarshney

2 months ago

krvarshney's tweet photo. https://t.co/mLcUOk4tDd

Kush Varshney कुश वार्ष्णेय @krvarshney

2 months ago

@percyliang Congratulations!

419

krvarshney retweeted

Saleh Afroogh

@AfrooghSaleh

3 months ago

🚨 Is Explainable AI (XAI) broken at its core? A landmark new study addresses this — and charts a path forward. 📄 Check it out: https://t.co/b9BgWPXrxL #ExplainableAI #XAI #ArtificialIntelligence #MachineLearning #ResponsibleAI #AIResearch #LLMs #DeepLearning

https://t.co/JDUyyraqHR

459

Kush Varshney कुश वार्ष्णेय @krvarshney

3 months ago

Nice to see this benchmark dataset on LLM-supported rare disease diagnosis and confirmation. paper: https://t.co/7NX4iBvaWf github: https://t.co/iSqvJvDite #healourskin #raredisease

193

Kush Varshney कुश वार्ष्णेय @krvarshney

3 months ago

@Timur_Yessenov Then we're on the same page. I also think that humans hold contradictory moral beliefs.

Kush Varshney कुश वार्ष्णेय @krvarshney

3 months ago

I disagree with the statement "we do not expect human beings to hold within themselves multiple different sets of moral beliefs and values" that appears in a paper about LLM moral reasoning that was published yesterday. https://t.co/W85aTjWDsb

193

krvarshney retweeted

Miriam Rateike @miriamrateike

4 months ago

We have extended our ICLR workshop deadline to Feb 5th! #AFAA2026 Submit your work on fairness across alignment & agentic AI systems. We also continue to accept broad work on fairness. CfP: https://t.co/eCObuMJsVF

453

krvarshney retweeted

Chappaqua Central School District @chappaqua_csd

4 months ago

4th graders welcomed RB parent Kush R. Varshney, an IBM Fellow who volunteered his time to explain how AI works—its benefits and pitfalls—with a tailored presentation featuring our school song and a Charlotte’s Web excerpt. Grateful for his generosity & expertise! #WeAreChappaqua

https://t.co/umH0LNdziB

216

krvarshney retweeted

Satyapriya Krishna @SatyaScribbles

4 months ago

Grateful to have co-hosted the Trusted AI Symposium yesterday. Left with so many new ideas from the posters, panels, and lectures. 🧠 Big thanks to our keynote speakers, panelists, and staff for driving the conversation on trust in AI.🤝 #TrustedAISymposium2026
