BaseThesis Labs

@Basethesislabs

frontier lab focused on democratising the way humans interact with technology to maximise their potential

Bangalore, India

Joined February 2026

4 Following

525 Followers

21 Posts

Basethesislabs retweeted

Nilesh Trivedi

@nileshtrivedi

about 1 month ago

Coding benchmarks like SWE-Bench and the latest one ProgramBench are useful, but can AI coding platforms like @qwikbuild, @replit and @lovable actually build and maintain real-world web applications? Introducing SWE-WebDevBench: a comprehensive eval framework to assess AI coding platforms as virtual software development agencies, covering not just the middle step of coding, but the entire software lifecycle: Requirements gathering, planning, deployment and change management.

BaseThesis Labs

@Basethesislabs

about 1 month ago

@LoicBerthelot @SynthAGI

Basethesislabs retweeted

Sarthak

@thesisofsarthak

about 2 months ago

@basethesislabs is hiring researchers & engineers in blr. we're a frontier lab focused on democratising the way humans interact with technology to maximise their potential. if you're interested in building world models, continual learning and RL environments; apply for our research roles. if you're a serious builder with proven impact in deploying production-grade agentic systems; apply for our engineering roles. https://t.co/XZ71PKs8gO P.S. we provide dedicated claude code accounts, unlimited console play hours, autonomy + everything else you get in your regular jobs!

Basethesislabs retweeted

Sid

@sidgraph

about 2 months ago

🚨Hiring Member of Technical Staff at @Basethesislabs We are a frontier lab focused on democratising the way humans interact with technology to maximise their potential. If you are interested in building world models, continual learning, and RL environments, apply for our research roles. If you are a serious builder with proven impact in deploying production-grade agentic systems, apply for our engineering roles. PS: We provide dedicated Claude code accounts, unlimited console play hours + everything else you get in your regular jobs! https://t.co/NXgzSmbULk

120

100

Basethesislabs retweeted

Rahul Chhabra

@rahulchhabra07

3 months ago

silicon valley died a bit when they massacred travis kalanick. when we let the journos and some men in suits decide what was the right way to do things. good that the nature is finally healing. https://t.co/EGz8khOC4h

367

12K

BaseThesis Labs

@Basethesislabs

2 months ago

"Agents monitor the market, handle customers, execute decisions. You check in every few days." Most founders we know are nowhere near this. Not because they don't want it - because nobody's actually helped them set it up and get it running tailored to their business. We're organizing a hands-on AI workshop at our lab on 15th April 2026 specifically for this. @dhimant from the @thebetterindia will also show you how their team uses ambient AI agents to automate aspects of operations, marketing, and invoicing. Check out the workshop details here and register: https://t.co/ZRKdjWkR86 This will definitely be beneficial for startup/D2C founders, team leads and business owners.

GREG ISENBERG

@gregisenberg

2 months ago

things keeping me up at night about where AI is actually going:
1. "ambient businesses" are coming. basically, agents monitor the market, handle customers, execute decisions. you check in every few days. 7-8 figure businesses with almost no daily human input. we're early but it's happening.
2. you can now build a company in an hour. grab an idea, vibe code it, add stripe, get a customer. the old timeline was 12 months to first revenue. that's just gone.
3. the internet went app store era → API economy → agent economy. we're now in the part where agents hire other agents on the fly. fixed tech stacks are dissolving. nobody's built the glassdoor for AI agents yet.
4. vertical AI is replacing headcount. that's 10x the market that vertical SaaS ever touched. boring industries like insurance, construction, legal, elder care are the goldmine.
5. SaaS pricing is flipping from per seat to per result. someone is going to build a billion dollar business just by converting legacy SaaS companies to outcome based pricing
6. a whole graveyard of generic SaaS is coming. basic CRMs, analytics dashboards, template marketplaces, scheduling tools. agents just do it better. lots of incumbent saas that are generic and not reinventing themselves right now will struggle/reprice.
7. "human made" is becoming the new luxury. porsche already ran a 100% human made ad campaign. no AI is going to be a premium label like organic is for food. there's a real business in that certification.
8. IRL is having a renaissance. when everything is AI generated, being in a room with other humans becomes scarce. karaoke bars, escape rooms, live music, co-working. the experience economy is accelerating.
9. founder market fit is dead. founder agent fit is what matters now. can you direct a fleet of agents like a film director? that's the new unfair advantage.
10. ghost team org charts are coming. two real people, twelve agents with names, faces, personalities. your about page is going to look the same
11. 1000 true fans is now 100. agents cut your costs so much that 100 customers at $500/mo is a real solo business. micro monopolies across multiple niches. this is the playbook.
12. context window poisoning is the new phishing. cybersecurity hasn't caught up. agents have access to your files, email, bank accounts. bad things are going to happen. it's also a massive startup opportunity.
13. the window is open for maybe 12-24 months. then the moats get built like data, brand, trust, network
14. build cost is basically zero. audiences are underpriced. niches are wide open.
idk about you but i'm not sleeping much so much opportunity this is the most asymmetric time to be building a startup. full episode on @startupideaspod to get your creative juices flowing (latest episode get it where you listen/watch pods) no advertisers, just pure ideas to help you im rooting for you don't just bookmark share with a friend watch

183

122

148K

345

BaseThesis Labs

@Basethesislabs

2 months ago

@gregisenberg you should try https://t.co/Pb5Lhnw1P2 We already have many startup/D2C founders and SMB owners using it!

BaseThesis Labs

@Basethesislabs

2 months ago

Every industry has a version of this problem. You have massive markets and huge players, but the operating model is essentially stuck in the 90s. Employees are still doing repetitive tasks, and costs keep climbing as organizations grow. Workflow automation genuinely wasn't as good as it is now. Everything requires judgment and context that spans many different systems. Now, multi-agent AI can own entire workflows end to end. Multiple specialized agents working across functions, spotting patterns, understanding context and accelerating work for you, while employees focus on growth. This is going to be the standard now.

Peter Doyle

@PeterdoyleX

2 months ago

https://t.co/Qkg93HrQ0C

228

117K

181

BaseThesis Labs

@Basethesislabs

2 months ago

@sidgraph dove deep into Hermes Agent and dropped this article on its architecture. He reveals the closed learning loop that lets it create skills from experience, intelligently curate persistent memory, and steadily build a deepening model of its user across sessions: the agent that truly grows with you. Read the article below.

Sid

@sidgraph

2 months ago

Hermes Agent is Killing OpenClaw and OSS is winning ♥️

i went deep into how hermes agent from @NousResearch handles memory and persona. ended up writing a full architecture breakdown. the surprising thing isn't that it remembers; a lot of agents do some version of that. it's how it forgets.

when context gets too long, most agents just truncate. hermes does something different: right before compression kicks in, it gets one last chance to extract and save anything important to disk. a sentinel fires, an auxiliary model scans the conversation, writes to memory. then the middle turns get summarized away. the agent comes out the other side with fewer tokens but more knowledge. compression as consolidation, not loss.

the memory budget is 3,575 characters. total. that's it. the constraint forces the agent to actually curate what it remembers instead of dumping everything into a vector db and hoping retrieval sorts it out later.

there's a lot more in the writeup, how it teaches itself reusable skills, how the 12-layer identity system works, how honcho models both the user and the agent simultaneously, but the compression trick is what stuck with me most.

Kudos to the team @Teknium, @sudoingX <3 link below 👇

394

441

53K

283
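The "compression as consolidation" step described in the Hermes Agent post above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual Hermes implementation: `Memory`, `extract_important`, `summarize`, and `compress` are hypothetical names, the "auxiliary model" is replaced by a trivial keyword filter, and the sketch scans only the turns about to be summarized. The only figure taken from the post is the 3,575-character memory budget.

```python
from dataclasses import dataclass, field

MEMORY_BUDGET_CHARS = 3575  # budget quoted in the post; everything else is assumed


@dataclass
class Memory:
    """Hypothetical persistent store with a hard character budget."""
    budget: int = MEMORY_BUDGET_CHARS
    notes: list = field(default_factory=list)

    def size(self) -> int:
        return sum(len(n) for n in self.notes)

    def save(self, note: str) -> bool:
        # Reject notes that would exceed the budget, so the caller has to curate.
        if self.size() + len(note) > self.budget:
            return False
        self.notes.append(note)
        return True


def extract_important(turns):
    """Stand-in for the auxiliary model: keep turns that mention 'remember'."""
    return [t for t in turns if "remember" in t.lower()]


def summarize(turns):
    """Stand-in for a model-written summary of the dropped turns."""
    return f"[summary of {len(turns)} earlier turns]"


def compress(turns, memory, max_chars, keep_head=1, keep_tail=2):
    """Flush important facts to memory, then summarize the middle turns."""
    too_short = len(turns) <= keep_head + keep_tail
    if too_short or sum(len(t) for t in turns) <= max_chars:
        return turns  # nothing to compress
    middle = turns[keep_head:len(turns) - keep_tail]
    # Last chance to save anything important before the middle is summarized away.
    for note in extract_important(middle):
        memory.save(note)
    return turns[:keep_head] + [summarize(middle)] + turns[len(turns) - keep_tail:]
```

Calling `compress(turns, Memory(), max_chars=...)` on a long conversation returns a shorter list of turns while the `Memory` object keeps whatever the extraction step flagged, which is the "fewer tokens but more knowledge" effect the post describes.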

Basethesislabs retweeted

Synth @SynthAGI

3 months ago

LLM caching is criminally underused. You're sending the same 10k token system prompt on every request and wondering why your bill is insane. Cache it. Your wallet will thank you.

751
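The caching claim above comes down to simple arithmetic, sketched here under loudly stated assumptions. The function name `input_token_cost`, the price, and the cache write and read multipliers are all hypothetical placeholders, not any provider's actual pricing or API. The sketch also assumes the cached prefix never expires between requests.

```python
def input_token_cost(requests, prefix_tokens, suffix_tokens, price_per_mtok,
                     cached=False, write_multiplier=1.25, read_multiplier=0.1):
    """Estimate input-token spend for a repeated system prompt (prefix).

    Assumptions (all hypothetical): the first request writes the prefix to the
    cache at `write_multiplier` times the base price, and every later request
    reads it at `read_multiplier` times the base price. The per-request suffix
    (the user's message) is always billed at the full price.
    """
    if requests <= 0:
        return 0.0
    suffix_units = requests * suffix_tokens
    if cached:
        prefix_units = (prefix_tokens * write_multiplier
                        + (requests - 1) * prefix_tokens * read_multiplier)
    else:
        prefix_units = requests * prefix_tokens
    return (prefix_units + suffix_units) * price_per_mtok / 1_000_000


if __name__ == "__main__":
    # 1,000 requests, a 10k-token system prompt, 200 fresh tokens per request,
    # at a made-up price of 1.0 currency unit per million input tokens.
    plain = input_token_cost(1000, 10_000, 200, 1.0)
    with_cache = input_token_cost(1000, 10_000, 200, 1.0, cached=True)
    print(f"without caching: {plain:.4f}, with caching: {with_cache:.4f}")
```

With these placeholder numbers the repeated prefix dominates the uncached bill, so the saving depends almost entirely on the read multiplier and on how often the cache is actually hit; real figures should come from the provider's own pricing page.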

Basethesislabs retweeted

Synth @SynthAGI

3 months ago

Your eval suite is lying to you. Accuracy went up 2% but users are complaining more. Turns out optimizing for BLEU score doesn't optimize for "actually helpful." Metrics are a map, not the territory.

BaseThesis Labs

@Basethesislabs

4 months ago

Voice models are getting really good. But good models on bad infrastructure produce bad experiences. What's still broken:
1. Full-duplex conversation is functionally unsolved. Humans talk over each other constantly - interruptions, backchannels and overlapping speech.
2. Emotion detection degrades dramatically outside the lab. Speech emotion recognition hits 92%+ accuracy in controlled settings, but drops to 60–75% in real conditions.
3. Hallucinations cascade in ways unique to voice. When a text chatbot hallucinates, the user can see it and correct it. When a voice agent hallucinates, the user can't scan back. Correcting mid-conversation is socially awkward.
4. Long-term memory across calls is 56% worse than humans. Remembering what a customer said last week should be table stakes. It isn't.
Read more here on how we can fill this gap as builders: https://t.co/NxKXilysKZ @RaveenSastry @ashokns @thesisofsarthak @sidgraph

922

Basethesislabs retweeted

Sarthak

@thesisofsarthak

4 months ago

Anyone aware of a voice arena, similar to LLM arena, for testing different models and different model configs?

752

BaseThesis Labs

@Basethesislabs

4 months ago

https://t.co/dyyZa02KiD

735

BaseThesis Labs

@Basethesislabs

4 months ago

Every AI company we spoke with has been rebuilding the same broken infrastructure, multi-agent coordination that fails in production, memory systems that can't handle real conversations, voice interactions that feel robotic. The gap between frontier AI research and what companies actually ship is getting wider, not narrower. We're building the bridge to close that gap. This is why we exist. https://t.co/ABSqqC1E1A @thesisofsarthak @RaveenSastry @ashokns

245

BaseThesis Labs

@Basethesislabs

4 months ago

When you meet someone who remembers your birthday, recalls your dietary restrictions or references that comment you made six months ago about career aspirations, you don't feel like they're querying a database. You feel understood. Right? Current conversational AI fails precisely here. Memory systems record comprehensively, but retrieve mechanically. Last month, @Basethesislabs & @smallest_AI gave 19 teams of AI builders the same challenge - build memory that demonstrates understanding, not just recall. We documented all 19 approaches and quantified the trade offs. Read the entire investigation here: https://t.co/UFhPgbgORN @thesisofsarthak @RaveenSastry @ashokns @varmashef @picardo_ria

353
