Ankit Prateek @ankit_appy - Twitter Profile

1 day ago

Share your inner interests. Like a child. (Just don’t lose it when you grow up.) And if you’ve already lost it, dig deep and find it again. “The most personal is the most creative.” Martin Scorsese

ankit_appy retweeted

Pavlo Molchanov

@PavloMolchanov

7 days ago

What if you could take three completely different model families… and distill them into one tiny model? 🤯 📜 Paper: https://t.co/K2iKD4xFvp MOPD (Multi-Teacher On-Policy Distillation) has become a standard procedure in post-training. We already distill multiple specialized variants of the same model into a single set of weights. But what if we could go further - and distill models from entirely different families? Turns out, it is possible. Today we’re releasing a paper on cross-tokenizer distillation - our first steps in this exciting direction. 📄 We distilled Qwen3-4B, Phi-4-Mini, and Llama-3B into Llama-3.2-1B. MMLU jumped from 32.05 → 46.32 when using multiple teachers. 📈 The team is now working on Nemo-RL integration so the community can try this method in their own settings. Plus, we are scaling experiments up. 🚀

Ankit Prateek @ankit_appy

6 days ago

@opencode MiMo-2.5

Ankit Prateek @ankit_appy

9 days ago

@MichaelHyatt @Teknium @Hermes_agentAI @gregisenberg @AlexFinn @NetworkChuck Install it on your main machine for daily use. It gets better the more you use it. Also install it on a machine that you'll keep running 24x7, for your cron jobs and for using it remotely via Telegram, Discord, or WhatsApp. Hermes is probably the best harness out there.

Ankit Prateek @ankit_appy

9 days ago

Complete global data-flow tracking down to external method bodies, zero brute-force prompt stuffing, and a cloud bill ~$2. Smart scaffolding > massive compute budgets. Lesson learned: Don't scan Chromium again. It's too good. 😂

Ankit Prateek @ankit_appy

9 days ago

~21h grinding a SAST run on Chromium's net/ alongside its full dependency graph. 283M total tokens (~163M DeepSeek + ~120M local LLM). The structural orchestration framework built for deep context caching hit a 97.8% cloud cache-hit rate (only ~3.5M cache misses).

ankit_appy's tweet photo: https://t.co/om5ObZxGhO
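A quick sanity check of the figures in the tweet above. This is only a sketch: it assumes the 97.8% cache-hit rate is measured against the ~163M cloud (DeepSeek) tokens rather than the full 283M, which the tweet does not state. All inputs are the tweet's rounded numbers.

```python
# Approximate figures quoted in the tweet.
cloud_tokens = 163_000_000   # ~163M DeepSeek tokens
local_tokens = 120_000_000   # ~120M local LLM tokens
cache_misses = 3_500_000     # ~3.5M cloud cache misses


def cache_hit_rate(total_tokens: int, misses: int) -> float:
    """Fraction of tokens that were served from the cache."""
    return 1.0 - misses / total_tokens


total_tokens = cloud_tokens + local_tokens

# Assumption: the hit rate is taken over cloud tokens only.
hit_rate = cache_hit_rate(cloud_tokens, cache_misses)

print(f"total tokens: {total_tokens / 1e6:.0f}M")
print(f"cloud cache-hit rate: {hit_rate:.2%}")
```

Under that assumption the rounded inputs give a hit rate within a fraction of a percentage point of the quoted 97.8%, so the numbers are at least self-consistent.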

Ankit Prateek @ankit_appy

12 days ago

Called it a discount. Turns out it was the price. #DeepSeek just made the 75% V4-Pro cut permanent. The token economics I wrote about 5 days ago aren't a promo window anymore – they're the floor. Resharing: https://t.co/ozuUytMPSI

ankit_appy's tweet photo: https://t.co/zH0dMyykg9

Ankit Prateek @ankit_appy

18 days ago

33 Million Tokens for $0.25 Just ran a full SAST scan against 1M+ lines of code for the price of a gumball. The secret? Hybrid Architecture + Context Caching. Master: DeepSeek V4-Flash (Orchestrator) Worker: llama.cpp (Local) Full deep dive on LinkedIn: [https://t.co/nQzJq0THP7]

ankit_appy's tweet photo: https://t.co/U5LdpzWTYm
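For scale, the headline figure works out to well under one cent per million tokens. The sketch below only divides the two numbers quoted in the tweet; it assumes the $0.25 is the whole bill for all 33M tokens and says nothing about how the cost splits between the cloud orchestrator and the local worker.

```python
def cost_per_million(total_cost_usd: float, total_tokens: int) -> float:
    """Blended cost in USD per one million tokens."""
    return total_cost_usd / (total_tokens / 1_000_000)


# Figures quoted in the tweet: $0.25 for 33 million tokens.
blended = cost_per_million(0.25, 33_000_000)
print(f"${blended:.4f} per 1M tokens")
```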

Ankit Prateek @ankit_appy

15 days ago

If your framework supports MTP, turn it on. It’s an uncompromised velocity multiplier for repository-wide code reviews and massive doc analysis. Full writeup: https://t.co/1Tx2M4DnMN Shoutout to @ggerganov & team for https://t.co/kvGqXzAlJD

Ankit Prateek @ankit_appy

15 days ago

Multi-Token Prediction (MTP) is a rare "free lunch" in LLM inference. Just finished benchmarking Qwen 3.6 27B on a single RTX 5090 using llama.cpp. At extreme context scales, MTP roughly DOUBLES generation throughput with 0 quality loss. The raw telemetry 👇 🧵

ankit_appy's tweet photo: https://t.co/X3IE2V8B3e

Ankit Prateek @ankit_appy

15 days ago

Is it lossy? No. At temp=0, MTP holds a strict veto layer. The core model brain validates every parallel guess before it hits your screen. You get cosmetic word changes from FP near-ties, but logical, code, & math accuracy remain 100% intact.
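The "veto" idea can be sketched with a toy greedy decoder. This is an illustration, not llama.cpp's implementation: `target_next` and `draft_next` are hypothetical stand-ins for the main model and the MTP/draft head, and the verification loop runs token by token here, whereas a real engine would typically check all drafted tokens in one batched forward pass.

```python
from typing import List

VOCAB = 50  # toy vocabulary size (assumption for this sketch)


def target_next(ctx: List[int]) -> int:
    """Hypothetical stand-in for the main model's greedy (temp=0) choice."""
    return (sum(ctx) * 31 + len(ctx)) % VOCAB


def draft_next(ctx: List[int]) -> int:
    """Hypothetical draft head: right most of the time, wrong at every 4th position."""
    guess = target_next(ctx)
    return guess if len(ctx) % 4 else (guess + 1) % VOCAB


def greedy_decode(prompt: List[int], n: int) -> List[int]:
    """Plain one-token-at-a-time greedy decoding with the main model."""
    out = list(prompt)
    for _ in range(n):
        out.append(target_next(out))
    return out[len(prompt):]


def speculative_decode(prompt: List[int], n: int, k: int = 3) -> List[int]:
    """Draft k tokens, keep the prefix the main model agrees with."""
    out = list(prompt)
    target_len = len(prompt) + n
    while len(out) < target_len:
        # 1. Cheaply draft k candidate tokens.
        draft: List[int] = []
        for _ in range(k):
            draft.append(draft_next(out + draft))
        # 2. Verify: accept drafted tokens until the first disagreement,
        #    then substitute the main model's own token (the veto).
        accepted: List[int] = []
        for tok in draft:
            expected = target_next(out + accepted)
            if tok != expected:
                accepted.append(expected)
                break
            accepted.append(tok)
        out.extend(accepted)
    return out[len(prompt):target_len]


prompt = [1, 2, 3]
assert speculative_decode(prompt, 20) == greedy_decode(prompt, 20)
```

Because every emitted token is either confirmed or replaced by the main model, the toy output matches plain greedy decoding exactly. The sketch does not model the floating-point near-ties mentioned in the tweet, which arise from numerical differences between batched and sequential computation.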

Ankit Prateek @ankit_appy

18 days ago

@r0ktech what are you talking about? second one is pure gold https://t.co/EB9zfCNcUs 😂

Ankit Prateek @ankit_appy

18 days ago

@KaiXCreator Ignorance. Outrage.

Ankit Prateek @ankit_appy

18 days ago

@neural_avb Thanks for sharing the insights. I gave up on opencode a while back after its timeout failed repeatedly while working with llama.cpp.
