U G Murthy @murthyug - Twitter Profile

Pinned Tweet

U G Murthy

@murthyug

3 months ago

https://t.co/CO4vO6GKOi

0

2

0

92

U G Murthy

@murthyug

about 6 hours ago

@vadym_petryshyn It did it in the sandbox. The screenshots show the initial struggle of looking for data locally when it was needed in the sandbox, then the realisation that it had to be uploaded. Ultimately it does everything in the sandbox, and messages[] confirms this. https://t.co/QrQk50QlPw

0

1

0

7

U G Murthy

@murthyug

about 7 hours ago

Day 65: #BuildInPublic #AdaptiveAgent #AI The GAIA benchmark exposed a flaw in the sandbox tool in AdaptiveAgent: repeated creation of a sandbox instead of persisting one sandbox over the run. The sandbox is now persistent over the run, with failsafe closure. Here is a test result: https://t.co/59D8MkRcrn

1

2

0

15
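A minimal sketch of what "persistent over the run with failsafe closure" could look like, assuming a sandbox object with `execute` and `close` methods. `FakeSandbox`, `sandbox_for_run`, and `run_task` are hypothetical names for illustration, not AdaptiveAgent's actual API.

```python
from contextlib import contextmanager


class FakeSandbox:
    """Hypothetical stand-in for a real sandbox client."""

    created = 0  # counts how many sandboxes were ever created

    def __init__(self):
        FakeSandbox.created += 1
        self.closed = False

    def execute(self, code: str) -> str:
        if self.closed:
            raise RuntimeError("sandbox is closed")
        return f"ran: {code}"

    def close(self) -> None:
        self.closed = True


@contextmanager
def sandbox_for_run():
    """Create one sandbox for the whole run and always close it."""
    sandbox = FakeSandbox()
    try:
        yield sandbox
    finally:
        sandbox.close()  # failsafe: runs even if a step raises


def run_task(steps):
    """Run every step in the same sandbox instead of creating one per step."""
    outputs = []
    with sandbox_for_run() as sandbox:
        for step in steps:
            outputs.append(sandbox.execute(step))
    return outputs, sandbox
```

The `try`/`finally` inside the context manager is what makes the closure failsafe: the sandbox is released whether the run finishes normally or an exception escapes from a step.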

U G Murthy

@murthyug

1 day ago

@aater_ali You are right. In this case they are independent: there is no inter-agent communication.

1

0

16

U G Murthy

@murthyug

1 day ago

Day 64 #AdaptiveAgent #BuildInPublic #AI Early morning runs: listening to the @latentspacepod podcast triggered an idea for an agent swarm! Built it using @ampcode, which is flawless. I am now running the GAIA benchmark with multiple agents in parallel. https://t.co/sH9FQsMyBf

1

0

32

U G Murthy

@murthyug

1 day ago

#AdaptiveAgent #BuildInPublic #AI The end report/results are persisted after all agents complete their tasks, with an indication of success or failure. https://t.co/WvodBhFdbU

0

10

U G Murthy

@murthyug

1 day ago

#AdaptiveAgent #BuildInPublic #AI Each agent gets a unique color, as interleaved events can be hard to read.

1

0

12
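A rough sketch of the swarm idea from the posts above: independent agents running in parallel, each logging in its own color, with a success/failure result collected at the end. All names here (`run_agent`, `run_swarm`, the success rule) are assumptions for illustration. With more than six agents this palette would repeat, so a real version would need more colors.

```python
import asyncio
from itertools import cycle

# ANSI escape codes for six terminal colors
ANSI_COLORS = ["\033[31m", "\033[32m", "\033[33m", "\033[34m", "\033[35m", "\033[36m"]
RESET = "\033[0m"


def colorize(color: str, agent_id: str, message: str) -> str:
    """Wrap one event line in the agent's color."""
    return f"{color}[{agent_id}] {message}{RESET}"


async def run_agent(agent_id: str, task: str, color: str) -> dict:
    """Hypothetical agent: logs events and reports success or failure."""
    print(colorize(color, agent_id, f"started: {task!r}"))
    await asyncio.sleep(0)  # placeholder for the real agent work
    ok = bool(task.strip())  # assumed rule: an empty task counts as a failure
    print(colorize(color, agent_id, "succeeded" if ok else "failed"))
    return {"agent": agent_id, "task": task, "success": ok}


async def run_swarm(tasks: list) -> list:
    """Run one independent agent per task; no inter-agent communication."""
    colors = cycle(ANSI_COLORS)
    jobs = [run_agent(f"agent-{i}", t, next(colors)) for i, t in enumerate(tasks)]
    # gather returns results in task order once every agent has finished
    return await asyncio.gather(*jobs)
```

Calling `asyncio.run(run_swarm(["q1", "q2"]))` returns one result dict per agent, which could then be written out as the end report.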

U G Murthy

@murthyug

2 days ago

Day 63: #BuildInpublic #AdaptiveAgent #AI On one retry, I found the agent in an infinite search loop. The agent was made aware of this via a /steer command; it corrected itself and finally completed the task. The answer was wrong, but a good guess. https://t.co/tJJPi2DcGv

0

9

U G Murthy

@murthyug

2 days ago

Day 63: #BuildInpublic #AdaptiveAgent #AI Tests using the GAIA validation set continue. Yesterday was the first time I was able to test all 3 levels. Scores are improving, but there is work to do. See overall scores and failure analysis: https://t.co/l5J8rhm9wf

1

0

16

U G Murthy

@murthyug

2 days ago

Day 63: #BuildInpublic #AdaptiveAgent #AI Retrying failed runs: further analysis revealed that the agent is tenacious at search (DuckDuckGo). When results are poor, it tries again and again and fails on MAX_STEPS, which is 80 for the agent.

1

0

13
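One way to catch this failure mode earlier is to count repeated actions and stop before the step budget runs out. This is a sketch under assumptions: `run_with_loop_guard`, `next_action`, and `max_repeats` are made-up names, and only the value 80 for MAX_STEPS comes from the post.

```python
from collections import Counter

MAX_STEPS = 80  # step budget quoted in the post


def run_with_loop_guard(next_action, max_steps=MAX_STEPS, max_repeats=3):
    """Drive an agent loop, stopping early if the same action keeps repeating.

    next_action(step) is a hypothetical callback returning the agent's next
    action as a string, or None when the task is finished.
    """
    seen = Counter()
    for step in range(1, max_steps + 1):
        action = next_action(step)
        if action is None:
            return ("done", step)
        key = action.strip().lower()  # normalise so trivial variants match
        seen[key] += 1
        if seen[key] > max_repeats:
            # Same action issued too often: report it instead of burning steps
            return ("loop_detected", step)
    return ("max_steps_exceeded", max_steps)
```

A "loop_detected" result could be the point at which the agent is told about the loop, similar in spirit to the manual /steer intervention.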

murthyug retweeted

Shay Boloor

@StockSavvyShay

3 days ago

Jensen Huang says the AI PC reinvention is as big as the smartphone shift by calling it “a new line” and “a new beginning.” $NVDA and $MSFT unveiled RTX Spark which will be the world’s most powerful deskside AI supercomputer built to run next-gen AI agent workloads locally.

53

1K

131

161

192K

U G Murthy

@murthyug

3 days ago

Day 62: #BuildInPublic #AdaptiveAgent #AI I was 90% done on Day 30. Testing and refinements seem endless. It somehow feels like I am halfway! Reminds me of one of @JamesClear's 3 ideas, posted the week before last. https://t.co/hHbRkxzzkU

0

8

U G Murthy

@murthyug

3 days ago

@meshapi_ai How are you evaluating capabilities?

1

0

19

U G Murthy

@murthyug

9 days ago

Day 56 : #BuildInPublic #AdaptiveAgent #AI
Used adaptiveAgent feedback to improve tools (item 1) and search optimisation (item 3).

GAIA Benchmark scores improved from 62% -> 71% on the validation set. Hard failures are limited to modality and 'maxsteps' exceeded for some tasks. The latter should improve with smarter search and system prompt improvements.

0

18

U G Murthy

@murthyug

10 days ago

3/n #BuildingPublic #AdaptiveAgent #AI Next steps: improve tools and routing
- add ability to read .parquet files
- be smarter at web search; don't just look for the latest stuff
- modality-based agent routing

0

1

25

U G Murthy

@murthyug

10 days ago

1/n Day 55 : #BuildInPublic #AdaptiveAgent #AI Struggling to improve the GAIA benchmark score, and getting there slowly. Failure analysis has moved from manual to adaptiveAgent-driven analysis; structured output has enabled this. Past results : https://t.co/KIFByzd61x

U G Murthy

@murthyug

16 days ago

2/n Day 49: #AdaptiveAgent #BuildInPublic #AI https://t.co/rCoL3nnNAS Did dry runs using a validation set of 53 questions with two models, with @meshapi_ai as provider:
1. qwen/qwen3.5-27b - got 50% right
2. gpt-4o-min - got 19% right

2

1

0

78

1

47

U G Murthy

@murthyug

10 days ago

2/n Day 55: #BuildInPublic #AdaptiveAgent #AI
GAIA Benchmark : 50% -> 62% improved a bit
12 were hard failures - Tool issues
8 were wrong answers
Insights by adaptiveAgent on its own performance https://t.co/OHYlqGY8zZ

1

0

30
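The numbers in this post can be cross-checked, assuming the run used the same 53-question validation set mentioned in the Day 49 post (an assumption; the post itself does not give the total). The function name `accuracy` is illustrative.

```python
def accuracy(total: int, hard_failures: int, wrong_answers: int):
    """Return (correct count, accuracy as a whole-number percentage)."""
    correct = total - hard_failures - wrong_answers
    return correct, round(100 * correct / total)


# Assumed total of 53 questions; 12 hard failures and 8 wrong answers from the post
correct, pct = accuracy(total=53, hard_failures=12, wrong_answers=8)
print(f"{correct}/53 correct = {pct}%")
```

Under that assumption, 33 of 53 answers were correct, which rounds to the 62% reported.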

U G Murthy

@murthyug

10 days ago

@meshapi_ai @meshari_ai @github Happy to contribute but have no idea what this means. The good repos are always hard to find.

0

16

U G Murthy

@murthyug

16 days ago

@JacobSobolev @meshapi_ai My goal is to make it point to the shortcomings of the agents. So far it has been useful in pointing them out. Poor scores, I am sure, are a reflection of the agent's inability to do certain things.

0

6

U G Murthy

@murthyug

16 days ago

1/n Day 49: #AdaptiveAgent #BuildInPublic #AI The last couple of days have been hard. Agentic runs were failing; subtle issues surface as you push the boundaries. Got the testing infrastructure in place to run the GAIA benchmark for AdaptiveAgent.

1

0

15
