danger laboratories

@dangerlab

advanced capability research @_dangertech

Sparks, NV

Joined April 2025

6 Following

146 Followers

110 Posts

Pinned Tweet

danger laboratories @dangerlab

11 months ago

I don't know if this was ever released, but 10M is actually doable now. If you're interested, I'm sure it would help with code generation and many other things. @spawn @cline @replit @cursor_ai @windsurf_ai @boltdotnew @scoutdotnew lmk, my dms are open

Magic @magicailabs

almost 2 years ago

LTM-2-Mini is our first model with a 100 million token context window. That’s 10 million lines of code, or 750 novels. Full blog: https://t.co/oFz4A9ynVZ Evals, efficiency, and more ↓

171

3K

422

1K

2M

1

5

1

4

5K

danger laboratories @dangerlab

8 months ago

@napmaxxing @Reiteller @snwy_me This listening tower is being built; it does not yet function perfectly

0

0

0

0

74

dangerlab retweeted

Saad @sodakeyEatsMush

8 months ago

[blog] So I was exploring some very influential vision-language models, and while making notes along the way, it kind of turned into a mega blog.

In this blog, I’ve covered the novelties and interesting aspects of models like Flamingo, BLIP, BLIP-2, and LLaVA. (There’s even a mini-blog inside this one about Perceiver by Google DeepMind).

Some of the common ideas I noticed across these papers were:
- The use of cross-attention to make visual and language information interact.
- The idea of using a mapping network to project from one embedding space into the LLM’s embedding space.

I’ll drop the link in the comments - do check it out, and I really hope you all will like it!!

6

332

41

277

20K

dangerlab retweeted

NVIDIA AI Developer

8 months ago

.@vllm_project has quickly become a go-to open source engine for efficient large language model inference, balancing performance with a strong developer experience. At NVIDIA, direct contributions to projects like vLLM reflect a commitment to advancing open source AI infrastructure for everyone.

In this Q&A, Benjamin Chislett, Senior Systems Software Engineer at NVIDIA and Committer for vLLM, shares his perspective on shaping the project’s future, his work on speculative decoding, and why open source collaboration matters for AI at scale.

🔗 https://t.co/Jg7XjhUs34

4

117

15

22

9K

danger laboratories @dangerlab

8 months ago

@lawaashley Sure does

0

0

0

0

33

danger laboratories @dangerlab

9 months ago

xAI just added 2M; putting the call out again in case anyone is still interested in scaling in these pivotal moments

0

0

0

0

340

danger laboratories @dangerlab

9 months ago

0

0

0

0

449

danger laboratories @dangerlab

9 months ago

@__Tkat__ @pk_iv @Atlassian https://t.co/ceGdMNDpXl

0

1

0

0

67

danger laboratories @dangerlab

9 months ago

@efectual machine can be friend

0

4

0

0

17

danger laboratories @dangerlab

9 months ago

@gordic_aleksa true

0

0

0

0

58

dangerlab retweeted

Aleksa Gordić (水平问题)

9 months ago

New in-depth blog post - "Inside vLLM: Anatomy of a High-Throughput LLM Inference System". Probably the most in depth explanation of how LLM inference engines and vLLM in particular work!

Took me a while to get this level of understanding of the codebase and then to write up this one - i quickly realized i underestimated the effort. 😅 It could have easily been a book/booklet (lol).

I covered:

* Basics of inference engine flow (input/output request processing, scheduling, paged attention, continuous batching)
* "Advanced" stuff: chunked prefill, prefix caching, guided decoding (grammar-constrained FSM), speculative decoding, disaggregated P/D
* Scaling up: going from smaller LMs that can be hosted on a single GPU all the way to trillion+ params (via TP/PP/SP) -> multi-GPU, multi-node setup
* Serving the model on the web: going from offline deployment to multiple API servers, load balancing, DP coordinator, multiple engines setup :)
* Measuring perf of inference systems (latency (ttft, itl, e2e, tpot), throughput) and GPU perf roofline model

Lots of examples, lots of visuals!

---

I realize i've been silent on social - many of you noticed and thanks for reaching out! :) --> I'm so back! lots of things happened.

Also, in general, I'm a bit sick of superficial content, it really is an equivalent of junk food (h/t @karpathy).

I want to do the best/deepest technical work of my life over the next years and write much more in depth (high quality organic food ;)) so I might not be as frequent around here as i used to be (? we'll see). I'll make it a goal to share a few paper summaries a week or stuff that's relevant / in the zeitgeist.

If you have any topics that happened over the past few weeks/months drop it down in the comments i might focus on some of those in my next posts.

---

Huge thank you to @Hyperstackcloud for giving me an H100 node to run some of the experiments and analysis that i needed to write this up. The team there led by Christopher Starkey is amazing!

Also a big thank you to Nick Hill (who did a very thorough review of the post - basically a code review lol; Nick's a core vLLM contributor and principal SWE at RedHat) and to my friends Kyle Krannen (NVIDIA Dynamo), @marksaroufim (PyTorch), and @ashVaswani (goat) for taking the time during weekend when they didn't have to!

63

3K

401

3K

324K

dangerlab retweeted

Alan Sguigna @AlanSguigna

9 months ago

Part 1 of my article series on fine-tuning an LLM for analysis of massive amounts of Intel Processor Trace is up. Use cases: codebase vulnerability scan, at-scale bug triage, etc. With thanks to @33y0re, @ivanrouzanov, and @vGPUArthur: https://t.co/fx5AdiQR4M https://t.co/gNwtFAaAdL

2

54

18

31

6K

danger laboratories @dangerlab

10 months ago

danger is here, are you prepared to face it

Roko ʕ •ᴥ•ʔっ🪄✨🐍

10 months ago · Tamalpais Valley

rokobasili's tweet photo. https://t.co/BUkMxClzaF

0

1

1

1

570

1

3

0

0

363

danger laboratories @dangerlab

10 months ago

do you see danger creeping in https://t.co/uf9cPpjTTP

0

1

1

0

230

danger laboratories @dangerlab

11 months ago

https://t.co/1W0AM3m32B

0

2

0

0

170

danger laboratories @dangerlab

11 months ago

don't spill your tea • danger blog 001

1

1

0

0

336

danger laboratories @dangerlab

11 months ago

dangerlab's tweet photo. https://t.co/pCq2MV6B2a

0

3

0

0

187

dangerlab retweeted

Earth Liberation Studio @EarthStvdio

11 months ago

Just poking my head in here to say https://t.co/tXGkF9r0uP

223

11K

2K

471

604K

danger laboratories @dangerlab

11 months ago

@perplexity_ai Perplexity's Comet

0

0

0

0

15

danger laboratories @dangerlab

11 months ago

dangerlab's tweet photo. https://t.co/2FSJnUqSlF

0

2

1

0

345

dangerlab retweeted

11 months ago

> I log onto twitter > sex choking controversy > grok becomes MechaHitler > hot blonde chick is apparently ugly I am logging off twitter.

178

102K

2K

4K

4M
