Sanchi Mittal

@_keen_learner_

AI and Music 💜

Joined October 2019

220 Following

68 Followers

112 Posts

_keen_learner_ retweeted

Ajitesh Panda @iamAjiteshp

about 1 month ago

Asked a healthcare AI founder: "how do you know when your agent is drifting across sessions, not just in one conversation?" Their setup: observer agents, Slack alerts, manual review. The gap: dashboard shows individual flags. Nobody connecting dots automatically across hundreds of sessions. Manual triage doesn't scale past a point. How are you catching drift before users feel it?

_keen_learner_ retweeted

Andrej Karpathy

@karpathy

about 2 months ago

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

150K

11K

14K

28M

_keen_learner_ retweeted

Ajitesh Panda @iamAjiteshp

3 months ago

Attended YC Startup School India? Drop your favorite moment from the event below 👇 I'll add everyone who replies to an exclusive WhatsApp group for founders who were there. The room was special, Real connect. No Fluff! Let's keep it going!!

iamAjiteshp's tweet photo. Attended YC Startup School India?
Drop your favorite moment from the event below 👇
I'll add everyone who replies to an exclusive WhatsApp group for founders who were there.

The room was special, Real connect. No Fluff!

Let's keep it going!! https://t.co/J051zGLHEt

_keen_learner_ retweeted

Sidhartha Mallick

@m_sidhartha07

3 months ago

Is your agent's risk management process manual or automated? What triggers a risk re-evaluation in your team, a new model version? A new tool? An incident?

Who to follow

Soumya_Sinha_

@Soumya__Sinha__

Technical Program Manager at RapidClaims

25 | blr | living in sprints, seeking unpretentious simplicity. full stack dev @ 🦄. @realmadrid & ex - @indianfootball fan ♟️• ⚽ • 🏸 • 🏃‍♂️

_keen_learner_ retweeted

Sidhartha Mallick

@m_sidhartha07

3 months ago

What's your current human oversight model? - Full automation, no human review - Human reviews outputs after the fact - Human approves before agent acts - We don't have agents in prod yet

m_sidhartha07's tweet photo. What's your current human oversight model?

- Full automation, no human review
- Human reviews outputs after the fact
- Human approves before agent acts
- We don't have agents in prod yet https://t.co/mUQ1JJjt3Y

_keen_learner_ retweeted

Sidhartha Mallick

@m_sidhartha07

3 months ago

EU AI Act, Article 10 explained for AI teams. 🧵 Under the EU AI Act, the data your AI agent was trained on and has access to at runtime, must meet documented quality standards. Your RAG pipeline is now compliance infrastructure.

m_sidhartha07's tweet photo. EU AI Act, Article 10 explained for AI teams. 🧵

Under the EU AI Act, the data your AI agent was trained on and has access to at runtime, must meet documented quality standards.

Your RAG pipeline is now compliance infrastructure. https://t.co/hZAU6Y3G60

104

_keen_learner_ retweeted

Sidhartha Mallick

@m_sidhartha07

3 months ago

Which Annex III categories could your agent touch? Drop it in the replies. Genuinely curious how teams are mapping this. Save this thread if you're building or deploying AI agents in production. #EUAIAct #AIAgents #AICompliance #HighRiskAI

160

_keen_learner_ retweeted

Sidhartha Mallick

@m_sidhartha07

3 months ago

125 days until the EU AI Act applies to your production AI agents. If your agent touches any of these 8 categories, you're already classified as high-risk. Most teams don't know they're in scope. 🧵

340

_keen_learner_ retweeted

Sidhartha Mallick

@m_sidhartha07

4 months ago

Last year, I built https://t.co/dZzJSsf7SN and shut it down. Here's the full story — the wins, the cracks, and what killed it 🧵

_keen_learner_ retweeted

India in Pixels by Ashris 🔱

@indiainpixels

9 months ago

This is pretty cool, great way to guide people ✨ Good work, @compli_18!

116

10K

_keen_learner_ retweeted

Jash Doshi

@compli_18

9 months ago

explore the ancient wisdom of the Rig Veda through an immersive 3D experience! (turn up the volume) link: https://t.co/AFr8LR4emy retweets appreciated! #RigVedaHack @indiainpixels @iipmaps

104

13K

_keen_learner_ retweeted

Andrej Karpathy

@karpathy

about 1 year ago

An attempt to explain (current) ChatGPT versions. I still run into many, many people who don't know that: - o3 is the obvious best thing for important/hard things. It is a reasoning model that is much stronger than 4o and if you are using ChatGPT professionally and not using o3 you're ngmi. - 4o is different from o4. Yes I know lol. 4o is a good "daily driver" for many easy-medium questions. o4 is only available as mini for now, and is not as good as o3, and I'm not super sure why it's out right now. Example basic "router" in my own personal use: - Any simple query (e.g. "what foods are high in fiber"?) => 4o (about ~40% of my use) - Any hard/important enough query where I am willing to wait a bit (e.g. "help me understand this tax thing...") => o3 (about ~40% of my use) - I am vibe coding (e.g. "change this code so that...") => 4.1 (about ~10% of my use) - I want to deeply understand one topic - I want GPT to go off for 10 minutes, look at many, many links and summarize a topic for me. (e.g. "help me understand the rise and fall of Luminar"). => Deep Research (about ~10% of my use). Note that Deep Research is not a model version to be picked from the model picker (!!!), it is a toggle inside the Tools. Under the hood it is based on o3, but I believe is not fully equivalent of just asking o3 the same query, but I am not sure. All of this is only within the ChatGPT universe of models. In practice my use is more complicated because I like to bounce between all of ChatGPT, Claude, Gemini, Grok and Perplexity depending on the task and out of research interest.

karpathy's tweet photo. An attempt to explain (current) ChatGPT versions.

I still run into many, many people who don't know that:
- o3 is the obvious best thing for important/hard things. It is a reasoning model that is much stronger than 4o and if you are using ChatGPT professionally and not using o3 you're ngmi.
- 4o is different from o4. Yes I know lol. 4o is a good "daily driver" for many easy-medium questions. o4 is only available as mini for now, and is not as good as o3, and I'm not super sure why it's out right now.

Example basic "router" in my own personal use:
- Any simple query (e.g. "what foods are high in fiber"?) => 4o (about ~40% of my use)
- Any hard/important enough query where I am willing to wait a bit (e.g. "help me understand this tax thing...") => o3 (about ~40% of my use)
- I am vibe coding (e.g. "change this code so that...") => 4.1 (about ~10% of my use)
- I want to deeply understand one topic - I want GPT to go off for 10 minutes, look at many, many links and summarize a topic for me. (e.g. "help me understand the rise and fall of Luminar"). => Deep Research (about ~10% of my use). Note that Deep Research is not a model version to be picked from the model picker (!!!), it is a toggle inside the Tools. Under the hood it is based on o3, but I believe is not fully equivalent of just asking o3 the same query, but I am not sure.

All of this is only within the ChatGPT universe of models. In practice my use is more complicated because I like to bounce between all of ChatGPT, Claude, Gemini, Grok and Perplexity depending on the task and out of research interest.

616

13K

11K

_keen_learner_ retweeted

Vision and AI Lab, IISc @val_iisc

over 1 year ago

o3-mini vs Deepseek-r1: As the battle between o3-mini and Deepseek-r1 heats up, we put them to test with a simple linear algebra coding task. Prompt: Write a C program to find the maximum value in a skew-symmetric matrix. The results surprised us! Seems mini is yet to mature😅

val_iisc's tweet photo. o3-mini vs Deepseek-r1: As the battle between o3-mini and Deepseek-r1 heats up, we put them to test with a simple linear algebra coding task. Prompt: Write a C program to find the maximum value in a skew-symmetric matrix. The results surprised us! Seems mini is yet to mature😅 https://t.co/C8tiFbCQKH

_keen_learner_ retweeted

Praveen Naik

@p_naix

over 1 year ago

"Build LLM in India" – Stop Lying to Yourselves and Fix VCs It’s been almost two years since @sama said, “you can try,” and someone from Mahindra Group responded with “challenge accepted.” Where are the models competing with OpenAI, Anthropic, or DeepMind? In Nandan Nilekani's recent podcast with Nikhil Kamath, the tech lead at the government level bluntly admitted, "building LLMs is not for us." That’s a pathetic, defeatist mindset. @AravSrinivas is right. Look at the U.S.—their top tech companies and government work in tandem, nullifying benchmarks with every release. Here, all we do is talk. India seems content with the reputation of doing things cheaply while China leads the way in execution. Somehow only China can trigger us. @paraschopra, @championswimmer, @waitin4agi_—all on point. Talent isn’t the issue; it’s the lack of courage. We’re stuck hyping 10-minute deliveries and obsessing over tangible commodities. Most universities don’t even offer decent AI curricula. Mediocre college placements churn out graduates who’ve never seen money, so their only goal is stability. Ambition comes later—if at all. I can’t convince one friend to quit their job and build something meaningful because they draw the line at financial risk. It’s a vicious cycle. How do we fix this? Break the VC/investor trap. They won’t back ambitious founders unless they see "differentiators." What differentiators do OpenAI, Anthropic, or DeepMind have? They’re all chasing benchmarks and mining the same gold. Meanwhile, our VCs keep betting on "build for India" garbage—e-com, UPI fintech, quick commerce. Enough! Start funding "build for the world." Indian Founders don’t need massive funding to start. $30-50k (₹25-40L) is enough for most ambitious founders to prove their commitment. But no one funds pre-seed at that scale. Show us you’re willing to disproportionately back talent. Stop forcing them to move to the U.S. just to get funded. Then, when they succeed, you call them "Indian founders" as if you had their back all along. We don’t lack ambition; we lack support. Back small, ambitious companies with compounding potential. That’s how you win.

114

354

484

111K

_keen_learner_ retweeted

Deedy

@deedydas

over 1 year ago

Dead Internet Theory, the conspiracy that the internet is mostly bots, is happening. AI generated content is killing the internet slowly. Reddit posts. Pinterest. Google results. Facebook videos. Spotify music. And most people don't even know it's happening.

deedydas's tweet photo. Dead Internet Theory, the conspiracy that the internet is mostly bots, is happening.

AI generated content is killing the internet slowly.

Reddit posts.
Pinterest.
Google results.
Facebook videos.
Spotify music.

And most people don't even know it's happening. https://t.co/W6KfeR1Zeh

144

130

423

152K

_keen_learner_ retweeted

AI at Meta

@AIatMeta

almost 2 years ago

Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context window and improved support for 8 languages among other improvements. Llama 3.1 405B rivals leading closed source models on state-of-the-art capabilities across a range of tasks in general knowledge, steerability, math, tool use and multilingual translation. The models are available to download now directly from Meta or @huggingface. With today’s release the ecosystem is also ready to go with 25+ partners rolling out our latest models — including @awscloud, @nvidia, @databricks, @groqinc, @dell, @azure and @googlecloud ready on day one. More details in the full announcement ➡️ https://t.co/hhJoLm5eLV Download Llama 3.1 models ➡️ https://t.co/rRjvmxqCTC With these releases we’re setting the stage for unprecedented new opportunities and we can’t wait to see the innovation our newest models will unlock across all levels of the AI community.

259

_keen_learner_ retweeted

Peter Yang

@petergyang

almost 2 years ago

There's a viral video going around about an MIT neurosurgeon who quit to spend time alone in the mountains. It's such a great perspective on our broken healthcare system (and getting healthy) that I thought I'd share the key excerpt here:

petergyang's tweet photo. There's a viral video going around about an MIT neurosurgeon who quit to spend time alone in the mountains.

It's such a great perspective on our broken healthcare system (and getting healthy) that I thought I'd share the key excerpt here: https://t.co/7WV0UmhTiC

147

12K

10K

_keen_learner_ retweeted

Soumith Chintala

@soumithchintala

almost 2 years ago

Meta's PyTorch teams published their actual roadmap documents publicly for the first time: https://t.co/Gp4C3i9D6j Our engineers' incentives and objectives are laid bare for folks to read 🙂 While all PyTorch development happens publicly on github, the actual planning and roadmap documents that teams at various PyTorch-affiliated companies write out weren't public, so we decided to change that for increased transparency.

823

124

374

95K

_keen_learner_ retweeted

Deedy

@deedydas

about 2 years ago

It's finally here. Q* rings true. Tiny LLMs are as good at math as a frontier model. By using the same techniques Google used to solve Go (MTCS and backprop), Llama8B gets 96.7% on math benchmark GSM8K! That’s better than GPT-4, Claude and Gemini, with 200x less parameters!

deedydas's tweet photo. It's finally here. Q* rings true. Tiny LLMs are as good at math as a frontier model.

By using the same techniques Google used to solve Go (MTCS and backprop), Llama8B gets 96.7% on math benchmark GSM8K!

That’s better than GPT-4, Claude and Gemini, with 200x less parameters! https://t.co/ajpEJlqvtD

293

804K

_keen_learner_ retweeted

Liliang Ren

@liliang_ren

about 2 years ago

Introducing Samba 3.8B, a simple Mamba+Sliding Window Attention architecture that outperforms Phi3-mini on major benchmarks (e.g., MMLU, GSM8K and HumanEval) by a large margin.😮 And it has an infinite context length with linear complexity.🤯 Paper: https://t.co/KwMpeyaDxc (1/6)

liliang_ren's tweet photo. Introducing Samba 3.8B, a simple Mamba+Sliding Window Attention architecture that outperforms Phi3-mini on major benchmarks (e.g., MMLU, GSM8K and HumanEval) by a large margin.😮 And it has an infinite context length with linear complexity.🤯

Paper: https://t.co/KwMpeyaDxc

(1/6)

266

244K

Sanchi Mittal

@_keen_learner_

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users