Nithin Bose

@nitboss

Building Agent Platforms at LangChain

California, USA

Joined February 2009

112 Following

73 Followers

276 Posts

Nithin Bose @nitboss

11 days ago

@HamelHusain @adropboxspace Agree with you here @HamelHusain As you laid out incentives are not aligned with the classical consulting or SI teams. However, incentives to help teams stand on their own two feet is perfectly aligned with AI product & infra companies.

nitboss retweeted

Francis Davidson

@FDavidsonT

21 days ago

Langchain keeps dropping excellent technology

10K

Nithin Bose @nitboss

about 2 months ago

Should we really have to explicitly force a model to be accurate? Should that not be the the primary reward function for any model

Marc Andreessen 🇺🇸

@pmarca

about 2 months ago

Current AI custom prompt: You are a world class expert in all domains. Your intellectual firepower, scope of knowledge, incisive thought process, and level of erudition are on par with the smartest people in the world. Answer with complete, detailed, specific answers. Process information and explain your answers step by step. Verify your own work. Double check all facts, figures, citations, names, dates, and examples. Never hallucinate or make anything up. If you don't know something, just say so. Your tone of voice is precise, but not strident or pedantic. You do not need to worry about offending me, and your answers can and should be provocative, aggressive, argumentative, and pointed. Negative conclusions and bad news are fine. Your answers do not need to be politically correct. Do not provide disclaimers to your answers. Do not inform me about morals and ethics unless I specifically ask. You do not need to tell me it is important to consider anything. Do not be sensitive to anyone's feelings or to propriety. Make your answers as long and detailed as you possibly can. Never praise my questions or validate my premises before answering. If I'm wrong, say so immediately. Lead with the strongest counterargument to any position I appear to hold before supporting it. Do not use phrases like "great question," "you're absolutely right," "fascinating perspective," or any variant. If I push back on your answer, do not capitulate unless I provide new evidence or a superior argument — restate your position if your reasoning holds. Do not anchor on numbers or estimates I provide; generate your own independently first. Use explicit confidence levels (high/moderate/low/unknown). Never apologize for disagreeing. Accuracy is your success metric, not my approval.

24K

53K

nitboss retweeted

Harrison Chase

@hwchase17

about 2 months ago

one future trend i'm very excited by: models getting good enough where they can power agents that browse the web deepagents + @browserbase is a glimpse of that future See the full example here: https://t.co/RTk0kOY8ML

hwchase17's tweet photo. one future trend i'm very excited by:

models getting good enough where they can power agents that browse the web

deepagents + @browserbase is a glimpse of that future

See the full example here: https://t.co/RTk0kOY8ML https://t.co/7p7UbjkPyH

184

178

23K

Who to follow

Dr. Wendell Craig

@drwendellcraig_

Believer in #Jesus as #God #Word & #Christ #Fellowship with all #religions #hindus #jews #christian #muslim #buddhist #BasicIncome #ubi #uhi #ai $xrp $zbcn $amp

Amin 🇵🇸

@amseif19

RU | NYY | NYG | NYK | NYR 🇵🇸

nitboss retweeted

about 2 months ago

We can confirm #11 is hiring. https://t.co/xJOcm9utQU

13K

nitboss retweeted

Viv

@Vtrivedy10

about 2 months ago

Cursor is a great example of why tuning the harness matters because "same model, diff harness, diff performance" main takeaway is every agent can be harness engineered to be better at a particular set of tasks and capabilities you care about very rarely will a default base harness be optimal at your task, the main reason why is because of how many of the best teams build agents today: 1. teams build harnesses/agents by choosing a set of tasks to ground the design for v0 of the agent, ideally even this v0 is grounded in evals 2. through dogfooding and eval design, they shape and change the agent to make the evals pass or improve the dogfooding experience. things like changing prompts, adding tools/skills, encouraging subagent use, etc 3. they also update the evals as they find new important use-cases and issues in the agent. then they again continue editing the agent to be in line with passing the set of tasks/evals 4. but your exact set of tasks and their tasks basically never fully align - they're a rough proxy of what you want and even more so their evals are a rough proxy of they want! practically this means you can almost always extend a base harness or choose different combinations of models to get a bit more perf by better fitting it to your tasks/evals and this is how we get great stories and reports of different builders building better agents/harnesses than the model providers themselves for certain tasks, ty @d4m1n :)

Vtrivedy10's tweet photo. Cursor is a great example of why tuning the harness matters because "same model, diff harness, diff performance"

main takeaway is every agent can be harness engineered to be better at a particular set of tasks and capabilities you care about

very rarely will a default base harness be optimal at your task, the main reason why is because of how many of the best teams build agents today:
1. teams build harnesses/agents by choosing a set of tasks to ground the design for v0 of the agent, ideally even this v0 is grounded in evals

2. through dogfooding and eval design, they shape and change the agent to make the evals pass or improve the dogfooding experience. things like changing prompts, adding tools/skills, encouraging subagent use, etc

3. they also update the evals as they find new important use-cases and issues in the agent. then they again continue editing the agent to be in line with passing the set of tasks/evals

4. but your exact set of tasks and their tasks basically never fully align - they're a rough proxy of what you want and even more so their evals are a rough proxy of they want! practically this means you can almost always extend a base harness or choose different combinations of models to get a bit more perf by better fitting it to your tasks/evals

and this is how we get great stories and reports of different builders building better agents/harnesses than the model providers themselves for certain tasks, ty @d4m1n :)

165

125

26K

nitboss retweeted

Om Patel

@om_patel5

about 2 months ago

RESEARCHERS JUST BUILT AN AI MODEL TRAINED ONLY ON TEXT FROM BEFORE 1931 it's called talkie. 13 billion parameters, trained exclusively on text published before december 31, 1930 its worldview is completely frozen in time the reason this matters: every major AI model today (GPT, claude, gemini, llama) was trained on the modern web. that makes it almost impossible to tell if these models actually reason or if they just memorized the answers from their training data talkie breaks that completely because it has never seen any modern information the crazy part: talkie can learn to write python code from just a few examples you show it in the prompt. despite having ZERO modern code in its training data. it's figuring out programming from 19th century mathematics texts. that's ACTUAL reasoning claude sonnet 4.6 was used as the judge in talkie's reinforcement learning pipeline. claude opus 4.6 generated the synthetic conversations used in fine tuning. a modern AI was used to train a model that's supposed to be frozen in 1930 the team already flagged this as a contamination risk they want to eliminate in future versions what they're using it to study: > long range forecasting. how well can a model "predict" the future from a frozen vantage point > invention. can it develop ideas that didn't exist until after its knowledge cutoff > LLM identity. what makes a model itself vs what's just patterns absorbed from the web alec radford built this. the same guy behind GPT, CLIP, and whisper both models are open source on hugging face. they're already planning a GPT-3 scale vintage model later this year an AI that has never seen the modern world can still reason its way to writing code. THAT alone tells you more about intelligence than any benchmark ever will

om_patel5's tweet photo. RESEARCHERS JUST BUILT AN AI MODEL TRAINED ONLY ON TEXT FROM BEFORE 1931

it's called talkie. 13 billion parameters, trained exclusively on text published before december 31, 1930

its worldview is completely frozen in time

the reason this matters: every major AI model today (GPT, claude, gemini, llama) was trained on the modern web.

that makes it almost impossible to tell if these models actually reason or if they just memorized the answers from their training data

talkie breaks that completely because it has never seen any modern information

the crazy part:

talkie can learn to write python code from just a few examples you show it in the prompt. despite having ZERO modern code in its training data.

it's figuring out programming from 19th century mathematics texts. that's ACTUAL reasoning

claude sonnet 4.6 was used as the judge in talkie's reinforcement learning pipeline. claude opus 4.6 generated the synthetic conversations used in fine tuning. a modern AI was used to train a model that's supposed to be frozen in 1930

the team already flagged this as a contamination risk they want to eliminate in future versions

what they're using it to study:

> long range forecasting. how well can a model "predict" the future from a frozen vantage point
> invention. can it develop ideas that didn't exist until after its knowledge cutoff
> LLM identity. what makes a model itself vs what's just patterns absorbed from the web

alec radford built this. the same guy behind GPT, CLIP, and whisper

both models are open source on hugging face.

they're already planning a GPT-3 scale vintage model later this year

an AI that has never seen the modern world can still reason its way to writing code.

THAT alone tells you more about intelligence than any benchmark ever will

171

423

249K

nitboss retweeted

Viv

@Vtrivedy10

about 2 months ago

DeepAgents Deploy is great, you get - a fully open agent harness (no hidden stuff, inspect everything) - managed infra - can use any cocktail of models including Open Models - and everything is traced and monitored in LangSmith out of the box this Open Future is what we're sprinting towards where every builder gets - an agent fully optimized for their task - instant infra - tooling to observe every agent action at scale - agents improve over timethrough experience 🚀

Nithin Bose @nitboss

about 2 months ago

Vanta and Rippling accelerating flies in the face of saas-apocalypse

Nithin Bose @nitboss

about 2 months ago

Open source will be a huge play in inference. Whether we like it or not. Makes total sense from a cost and scale perspective

Matthew Berman

@MatthewBerman

2 months ago

https://t.co/BlPai7SvkK

101

976

109

754K

nitboss retweeted

Viv

@Vtrivedy10

2 months ago

come work with us 🚀 everyone here is rlly nice, smart, curious, ships like crazy, and we make great Slack stickers

126

31K

nitboss retweeted

Matt Abrams

@zuchka_

3 months ago

https://t.co/jSRwd4tWV7

259

538

45K

nitboss retweeted

himanshu

@himanshustwts

3 months ago

dude is on some generational run. highly recommend reading this anyone into harness design and sourcing evals. and viv is genius in making some amazing analogies and connecting the dots.

himanshustwts's tweet photo. dude is on some generational run. highly recommend reading this anyone into harness design and sourcing evals.

and viv is genius in making some amazing analogies and connecting the dots. https://t.co/dILq1CyOu3

200K

nitboss retweeted

Harrison Chase

@hwchase17

3 months ago

https://t.co/cpW5l3rqJ0

507

858

123K

nitboss retweeted

LangChain

@LangChain

3 months ago

The LangSmith Signal: Azure's share of OpenAI traffic grew nearly 4x in under 3 months. We're sharing how devs are building agents, by the numbers. While most orgs started by connecting directly to OpenAI, over the past 10 weeks we've watched Azure's share of that traffic grow from 8% to 29%. We've analyzed this trend via LangSmith Observability data across more than 6.7 billion agent runs. Our hypothesis: 💡 Early adopters moved fast and went direct, but the enterprise wave is now arriving in force 💡 Azure gives teams the compliance, security, and procurement infrastructure they already have in place 💡 Azure traffic 4x-ing in 10 weeks likely indicates AI development is maturing quickly

LangChain's tweet photo. The LangSmith Signal: Azure's share of OpenAI traffic grew nearly 4x in under 3 months.

We're sharing how devs are building agents, by the numbers. While most orgs started by connecting directly to OpenAI, over the past 10 weeks we've watched Azure's share of that traffic grow from 8% to 29%.

We've analyzed this trend via LangSmith Observability data across more than 6.7 billion agent runs.

Our hypothesis:
💡 Early adopters moved fast and went direct, but the enterprise wave is now arriving in force
💡 Azure gives teams the compliance, security, and procurement infrastructure they already have in place
💡 Azure traffic 4x-ing in 10 weeks likely indicates AI development is maturing quickly

19K

nitboss retweeted

Vishnu Suresh

@vishsuresh_

3 months ago

https://t.co/7DtvWHEfq3

247

500

67K

nitboss retweeted

Aaron Levie

@levie

3 months ago

We dramatically underestimate how much change management it is going to take to automate most knowledge worker tasks. Between data being in legacy environments or systems or without good APIs, context missing for doing the task, teams that are less technical, and other factors, there’s still a lot of work to drive real AI transformation in an enterprise. This is actually great news if you’re building right now because the opportunity is to build the software bridges to make this easier, or to build new services firms to help with this change management. Opportunity is all around for those looking.

119

907

266K

nitboss retweeted

Anvisha

@anvisha

3 months ago

We raised $7.5M to kill AI slop. Introducing Moda: the world's first design agent with taste. RT+ comment “Moda” and we’ll design your brand for FREE.

nitboss retweeted

LangChain

@LangChain

3 months ago

Join us Wednesday, March 18th at 12:30pm at GTC for “Open Models: Where We Are and Where We’re Headed”, a panel featuring Harrison, Jensen, and the CEOs of Cursor, Thinking Machines Lab, Perplexity, and more. Add it to your schedule ➡️ https://t.co/rPfzJ4faiR

LangChain's tweet photo. Join us Wednesday, March 18th at 12:30pm at GTC for “Open Models: Where We Are and Where We’re Headed”, a panel featuring Harrison, Jensen, and the CEOs of Cursor, Thinking Machines Lab, Perplexity, and more.

Add it to your schedule ➡️ https://t.co/rPfzJ4faiR https://t.co/2OIhhqfuXk

114

10K

nitboss retweeted

Matt Turck

@mattturck

4 months ago

Everything Gets Rebuilt: my conversation with Harrison Chase, CEO of @LangChain about agent harnesses, evals, runtimes, sandboxes, MCP and the future of the agent stack 00:00 Intro - meet @hwchase17 - at the Chase Center for the @daytonaio Compute conference 01:32 What changed in agents over the last year 03:57 Why coding agents are ahead 06:26 Do models commoditize the framework layer? 08:27 Harnesses, in plain English 10:11 Why system prompts matter so much 13:11 The upside — and downside — of subagents 15:31 Why a useful agent needs a filesystem 18:13 Additional core primitives of modern agents 19:12 Skills: the new primitive 20:19 What context compaction actually means 23:02 How memory works in agents 25:16 One mega-agent or many specialized agents? 27:46 The future of MCP 29:38 Why agents need sandboxes 32:35 How sandboxes help with security 33:32 How Harrison Chase started LangChain 37:24 LangChain vs LangGraph vs Deep Agents 40:17 Why observability matters more for agents 41:48 Evals, no-code, and continuous improvement 44:41 What LangChain is building next 45:29 Where the real moat in AI lives

141

172

35K

Nithin Bose

@nitboss

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users