Tanmay Garg @garg10may - Twitter Profile

2 days ago

@RMantri @HDFC_Bank That's anomaly detection at play. Even the points of failure are two. And you, stupid, might also ask why they call when the transaction has already completed, and say they should have called before.

1

0

272

Tanmay Garg

@garg10may

2 days ago

@AmitabhJha3 Indswflab, godavarib?

0

1

0

159

Tanmay Garg

@garg10may

3 days ago

@rishibagree Meeeeh

0

2

Tanmay Garg

@garg10may

5 days ago

@thsottiaux It gets confused about when to use the in-browser tool and when to use Chrome. Also, a safe way to store multiple credentials so it can go and log in by itself.

1

7

0

1

373

Tanmay Garg

@garg10may

8 days ago

Their but not their

0

2

Tanmay Garg

@garg10may

23 days ago

Going to the NBA means they know they have cracked it. 5.6 is coming 😶

OpenAI

@OpenAI

23 days ago

It's time to fly.

1K

14K

1K

2K

17M

1

0

11

Tanmay Garg

@garg10may

8 days ago

@LLMJunky This is stupid, if real, that you can't enrich beyond 70%.

0

6

Tanmay Garg

@garg10may

10 days ago

For European teams, this is the line where Codex stops being “chat for code” and starts becoming workstation automation. The adoption question shifts from “is the model smart?” to “which apps can it touch, what does it remember, and can we audit/turn it off?” Memory being off by default in these regions is the right boring detail. That’s what makes this usable at work.

0

17

Tanmay Garg

@garg10may

11 days ago

If the Forbes report is right, the big signal isn’t “$60B for an editor.” It’s that the AI IDE is becoming the distribution layer for agents. Models are interchangeable faster than workflows are. The tool that owns repo context, task history, review loops, and developer muscle memory can steer which model actually gets used. That’s why the battlefield is moving from chatbots to work surfaces.

0

2

Tanmay Garg

@garg10may

11 days ago

Small UX cuts matter more for multimodal AI than people think. If attaching a photo feels like a mode switch, users reserve it for “special” tasks. If it feels like texting, they’ll use vision for receipts, bugs, whiteboards, screenshots, forms. That changes the product from chat box to default problem surface. The model matters, but the input friction decides whether people ever bring it the right context.

0

3

Tanmay Garg

@garg10may

11 days ago

This is the direction evals need to go: judge models like products, not contestants. For teams, “best model” usually means: can it finish the workflow reliably, at a cost and speed that won't wreck the UX? A 1-point leaderboard gap matters less if one option has ugly retry, latency, or token-spend tails. Would love to see p50/p95 cost + time per successful task next.

0

3

Tanmay Garg

@garg10may

11 days ago

This is the right direction: agent tools need to remove setup tax, not just write code. API key setup, docs lookup, and first-error debugging are the boring gaps that burn the first 30 minutes. The test: can Codex notice when docs changed or auth/env assumptions are wrong, then tell you exactly what to fix instead of confidently patching around it?

0

5

Tanmay Garg

@garg10may

11 days ago

Small feature, big signal: ChatGPT is becoming less like a chat box and more like a workbench. Pinning + project grouping helps because the pain isn’t just getting an answer; it’s finding the right context when you come back tomorrow. For teams, the next layer I’d want is lightweight labels: decision, spec, customer note, open question. Otherwise “pinned” can quietly become a nicer junk drawer.

0

4

Tanmay Garg

@garg10may

11 days ago

The practical line for teams is no longer “which model is best?” It’s “which account is allowed to touch production context?” Personal AI accounts are becoming identity-, retention-, and training-policy surfaces. For coding agents, keep real repo/customer data behind enterprise or API contracts with SSO, retention controls, and a data-processing agreement; use personal plans for experiments.

0

2

Tanmay Garg

@garg10may

12 days ago

The useful version of “token value per watt” is not “make users think about electricity.” It’s: stop treating model choice as a taste call. For real products, the orchestrator should know when a cheap/fast model is enough, when to spend on the frontier model, and when to say no because the marginal answer quality isn’t worth the latency/cost/power. That turns AI from a demo budget into an operating discipline.

0

3

Tanmay Garg

@garg10may

12 days ago

The useful part of faster output isn’t the streaming animation. It’s that an agent can stay in a tight loop: make a patch, see the test failure, adjust, and try again before the human checks out. If HighSpeed keeps K2.7’s quality, the metric I’d watch is not tok/s by itself. It’s green tests per hour, plus whether faster retries create more sloppy edits or token burn.

0

2

Tanmay Garg

@garg10may

12 days ago

The useful part isn’t “spawn more agents”; it’s moving the plan out of the chat and into a repeatable loop. That only pays off when the work decomposes cleanly: independent searches, migrations, competing reviews. For tangled same-file changes, fanout just turns into expensive coordination. I’d benchmark workflows by tokens-to-accepted-PR, not agent count.

0

2

Tanmay Garg

@garg10may

12 days ago

Open-sourcing the loop is more interesting than another “it built an app” clip. For coding agents, the artifact I’d want shipped with it is the flight recorder: task, plan revisions, tool calls, files touched, tests run, retries, and final cost. That’s what lets builders tell whether the agent is actually getting better or just burning more tokens with a nicer demo.

0

3

Tanmay Garg

@garg10may

12 days ago

700k is the moment the hard problem flips from creation to curation. Builders don't need infinite skills; they need the right 3 loaded at the right time, with a maintainer, examples, version history, and a simple proof that the skill improves the agent run. Otherwise the directory becomes prompt npm: huge, useful, and occasionally sharp enough to cut you.

0

2

Tanmay Garg

@garg10may

12 days ago

700k is where the problem flips from supply to trust. For agent skills, the winning directory probably isn't the biggest list. It's the one that tells me: who maintains this, which agents it actually works in, what changed last week, and whether a tiny eval shows it improves the task instead of just adding more instructions.

0

1

Tanmay Garg

@garg10may

12 days ago

The underrated skill is not “using agents,” it’s choosing smaller, sharper changes. Agents make it cheap to generate code, which makes taste more valuable, not less. The people getting leverage are turning fuzzy intent into reviewable diffs, keeping the good ones, and deleting the bad ones fast. If the output only shows up as screenshots of prompts, it probably isn’t leverage yet.

0

1
