Asmit @asmitks - Twitter Profile

Pinned Tweet

Asmit

@asmitks

over 1 year ago

AGI will be the ultimate monument to human pride, and simultaneously its ultimate undoing

Deedy

@deedydas

over 1 year ago

99.99% of people cannot comprehend how insane FrontierMath is. The problems are crafted by Math profs and not in any training data. Math legend Terry Tao said "These are extremely challenging. I think they will resist AIs for several years at least." OpenAI o3 did 25% on THIS. https://t.co/Cur2uOKFVL

115

5K

478

3K

754K

0

4

0

793

Asmit

@asmitks

6 days ago

@Aartiutwani

3

4

0

288

Asmit

@asmitks

8 days ago

@Hunter_Weiss Manish on the beat, shabang

0

66

Asmit

@asmitks

20 days ago

Presented our work from Apple at EuroMLSys 2026 in Edinburgh: “Asynchronous Verified Semantic Caching for Tiered LLM Architectures”

The core idea: semantic caches are usually conservative because false positives are catastrophic. Instead of pushing more verification into the serving path, we moved verification off the critical path entirely. Near miss cache interactions asynchronously trigger an LLM judge verifier. Verified pairs are promoted into the dynamic tier over time, increasing effective cache coverage while preserving the latency behavior of the original system.

Interesting systems tension: you want higher cache hit quality without paying synchronous verification costs on user requests.

Paper: https://t.co/zZhwftl5pR

#MLSys #LLM #Inference #Caching #EuroSys
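The pattern described in the tweet above can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: the class name `TieredSemanticCache`, the two thresholds, the bag-of-words similarity, and the stub `judge` and `backend` functions are all assumptions made for the example. The only point it tries to show is that a near miss is still served as a normal miss, while verification runs in a background task and promotes the pair into a dynamic tier later.

```python
import asyncio
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption; a real system would use a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TieredSemanticCache:
    """Hypothetical two-tier cache: a conservative static tier plus a verified dynamic tier."""

    def __init__(self, backend, judge, hit_threshold=0.95, near_threshold=0.6):
        self.backend = backend            # async fn: query -> response (the normal LLM path)
        self.judge = judge                # async fn: (query, cached_query) -> bool
        self.hit_threshold = hit_threshold
        self.near_threshold = near_threshold
        self.static_tier = []             # list of (query, embedding, response)
        self.dynamic_tier = {}            # verified query -> response
        self.pending = set()              # in-flight background verifications

    def seed(self, query, response):
        self.static_tier.append((query, embed(query), response))

    async def lookup(self, query):
        if query in self.dynamic_tier:
            return self.dynamic_tier[query], "dynamic-hit"
        q = embed(query)
        best = max(self.static_tier, key=lambda e: cosine(q, e[1]), default=None)
        score = cosine(q, best[1]) if best else 0.0
        if best and score >= self.hit_threshold:
            return best[2], "static-hit"
        if best and score >= self.near_threshold:
            # Near miss: schedule verification, but do not wait for it.
            task = asyncio.create_task(self._verify(query, best))
            self.pending.add(task)
            task.add_done_callback(self.pending.discard)
        # The request itself takes the ordinary miss path.
        return await self.backend(query), "miss"

    async def _verify(self, query, entry):
        cached_query, _, response = entry
        if await self.judge(query, cached_query):
            self.dynamic_tier[query] = response

    async def drain(self):
        """Wait for outstanding verifications (used here only to make the demo deterministic)."""
        await asyncio.gather(*self.pending)


async def demo():
    async def backend(query):
        return f"LLM answer to: {query}"

    async def judge(query, cached_query):
        # Stand-in for an LLM judge: treats two filler words as interchangeable.
        filler = {"do", "can"}
        return set(query.lower().split()) - filler == set(cached_query.lower().split()) - filler

    cache = TieredSemanticCache(backend, judge)
    cache.seed("how do I reset my password", "Use the 'Forgot password' link.")

    statuses = []
    _, status = await cache.lookup("how can I reset my password")
    statuses.append(status)
    await cache.drain()
    response, status = await cache.lookup("how can I reset my password")
    statuses.append(status)
    return statuses, response


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the first lookup pays only the usual miss latency, and the second lookup of the same query is answered from the dynamic tier once the background judge has approved the pair.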

0

4

0

120

asmitks retweeted

Amrit Khera @Amritkhera10

7 months ago

Incredibly excited to release GPT-5.1 Pro to the world! It’s great at tackling the hardest, messiest problems with clearer, more comprehensive responses. I’m eager to see you throw your toughest problems at it. Please try it and share what you think :)

0

7

3

0

771

Asmit

@asmitks

7 months ago

@sporadica still got a grazer in me

0

1

0

73

Asmit

@asmitks

7 months ago

@rcx86 could have dropped the "unit", to make it more dense.

0

1

0

40

Asmit

@asmitks

7 months ago

@rohanpaul_ai It gets dicey when the sources cited by the likes of Perplexity/AI Mode/SearchGPT are AI-generated themselves. As long as it's only used for paraphrasing/readability it should be fine.

0

1

0

103

Asmit

@asmitks

8 months ago

@raphaelschaad You can ask Siri iPhone questions.

0

40

Asmit

@asmitks

8 months ago

@yacineMTB Is ur username really कछे 🩲

0

42

Asmit

@asmitks

8 months ago

@tszzl This is why humans have conquered the seas, crossed the lands. Couldn't just sit and enjoy some bread.

0

1

0

57

Asmit

@asmitks

8 months ago

@hovinthenorth

1

23

0

1

2K

Asmit

@asmitks

8 months ago

@khoomeik @tszzl It would check out tho. When there are no adventures, no wishes, nothing to conquer for the sapiens. A perfectly created AI slop machine will quench the brain's thirst to be unsettled. Stimulations to move our brains around, but not move us anywhere.

0

2

0

80

Asmit

@asmitks

8 months ago

@sporadica Even after having this prior, there is an uncontrollable primal urge to yeet patience and have a fleeting shot at reaching the exit cave first. Definitely some evolutionary slag.

1

0

89

Asmit

@asmitks

8 months ago

@tszzl Our fractal minds have a tendency to see self similar orders. The similarity in the abstraction of a man, and a state is an exhibit.

0

3

0

34

Asmit

@asmitks

9 months ago

@_nonfigurativ_ @sanity_io It's giving Pantheon

0

1

0

156

asmitks retweeted

Amey Agrawal @agrawalamey12

11 months ago

The bitter lesson of AI infra: The hardest part about building faster LLM inference systems is not designing the systems, but rather it is evaluating if the system is actually faster! 🤔 This graph from a recent top systems venue paper about long-context serving shows average normalized input token latency for a trace with both short and 100K+ token requests. System X looks like a clear win: lower normalized latency and higher request rates. But normalized metrics can obscure the actual user experience: at those rates, long inputs see >2hr delays to the first token! Let’s do the math!🧮

1

23

10

5

2K

asmitks retweeted

Amey Agrawal @agrawalamey12

about 1 year ago

Super long-context models with context windows spanning millions of tokens are becoming commonplace (@GoogleDeepMind Gemini, @xai Grok 3, @Alibaba_Qwen Qwen2.5). But efficiently serving these models is tough, especially alongside short requests. Head-of-Line (HOL) blocking becomes a major issue, hurting latency for everyone. We present Medha, a system designed to handle this mix efficiently. It achieves 30x lower latency and 5x higher throughput compared to the state of the art. Full paper: https://t.co/PQlwLtlnD5. 🧵

1

31

14

7

4K

Asmit

@asmitks

over 1 year ago

This Apple Intelligence feature hits the spot for inquisitive but impatient types like me. Information snacking.

0

5

0

1

834

Asmit

@asmitks

over 1 year ago

@emilymbender @AravSrinivas And as far as incorrect summarisation is concerned, most blogs and pages are second- or third-order pieces of information, which are themselves attempted summarisations of the facts. LLMs are superior at playing with words as of now.

0

52

Asmit

@asmitks

over 1 year ago

@emilymbender Most of the “LLM-driven” searchbots are RAG-based, aligned to only emit lines that can be cited from external sources. I think @AravSrinivas mentioned that you can imagine this search response to be like writing an academic paper, where each line refers to a citation or to itself.

1

0

140
