Krish Modi

Verified account

@krishmodi404

@uwaterloo se, incoming @palantirtech, prev @bloomberg, @Huawei, ISEF

Sarnia ON

Joined February 2022

446 Following

682 Followers

211 Posts

Pinned Tweet

3 months ago

I made AgentIR: a scheduler for distributed LLM serving that makes agent workloads run much faster. 41.3% lower E2E latency, and up to 70% higher throughput

13

125

24

98

12K

krishmodi404 retweeted

10 days ago

I took OSDN, a brand-new linear-attention model that learns to tune its own memory updates as it reads (think AdaGrad for the architectures trying to replace the transformer), rebuilt it from scratch in pure C++ with my own autograd engine, and ran it on a $4 microcontroller to predict hypoglycemia 60 minutes before it hits. No PyTorch. No JAX. No TensorFlow. No ML library at all. Straight C++ standard library.

6

11

3

2

922

10 days ago

@Rishi_Shah99 this is so fucking cool

1

0

0

0

180

11 days ago

@Rishi_Shah99 thanks bro

0

0

0

0

15

12 days ago

launching AgentIR Blackbox https://t.co/P7GC38xM5w

an llm request router for agent systems

Blackbox finds which llm calls are on your workflow’s critical path, sends them to faster providers, and routes less urgent calls cheaper to maintain your selected cost-latency constraint

it uses your workflow stats and real-time provider latency profiles to reroute before throttling or slowdowns hit the full workflow

setup is simple too. connect your app, and blackbox handles the workflow annotations for you

use it for free!

16

63

14

19

8K

11 days ago

@bmptrsn @datacurve so tuff

1

1

0

0

138

12 days ago

@hamostaf04 thanks hamza!!

0

1

0

0

76

12 days ago

@CMNM50660490 thanks!

0

0

0

0

46

12 days ago

@pahu2353 @IKorovinsky thanks pahu

0

0

0

0

60

12 days ago

@VishnuSatish_ thanks!

0

0

0

0

46

12 days ago

@argupta_ thanks boss

0

1

0

0

88

12 days ago

@waterloo_intern thanks!

0

0

0

0

84

12 days ago

@sunniekapar thanks sunnie!

0

1

0

0

74

12 days ago

@IKorovinsky thanks ian

0

1

0

0

128

12 days ago

@notakki_ UI GOAT

0

1

0

0

129

12 days ago

@fahmi___omer thanks boss

0

0

0

0

114

12 days ago

@ishaandey_ thanks bro

0

1

0

0

113

12 days ago

@_wilsonchenn thanks bro

0

0

0

0

104

12 days ago

@alexabelonix thanks!

0

1

0

0

142

12 days ago

@_rajanagarwal THANK YOU

0

1

0

0

96

12 days ago

@_rajanagarwal thanks ragarwal

0

1

0

0

188

12 days ago

@shamsharoonn 🤩🤩

0

0

0

0

151
