Kirill @kshvidler - Twitter Profile

Kirill

@kshvidler

26 days ago

LETSGOOO

Claude

@claudeai

26 days ago

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

5K

105K

14K

22K

57M

0

18

Kirill

@kshvidler

3 months ago

😋

left curve dev

@leftcurvedev_

3 months ago

Okay this one is insane. A new 18B frankenstein model was just released on @huggingface — Beats the new Qwen3.6-35B-A3B on a 44-test suite despite requiring 12GB VRAM instead of 24GB 🤯 Runs on a SINGLE RTX 3060 (!) 🧠 Opus 4.6 & GLM-5.1 reasoning in one model ⚡️ 66+ tok/s stable on mid-range GPUs 🧪 Experimental, no additional training 🛠️ Perfect tool calling & agentic reasoning 📷 Fits on low hardware, any 12gb card 📚 GGUF size is 9.8GB (Q4_K_M) Another gift from Jackrong, adding both qwopus and glm-distilled qwen together was not on my bingo card. Truly seems like the sweet spot between 9B and 27B models right now. The ultimate model for 12-16GB VRAM owners? https://t.co/kqDUizh1pf

leftcurvedev_'s tweet photo. Okay this one is insane. A new 18B frankenstein model was just released on @huggingface — Beats the new Qwen3.6-35B-A3B on a 44-test suite despite requiring 12GB VRAM instead of 24GB 🤯

Runs on a SINGLE RTX 3060 (!)

🧠 Opus 4.6 & GLM-5.1 reasoning in one model
⚡️ 66+ tok/s stable on mid-range GPUs
🧪 Experimental, no additional training
🛠️ Perfect tool calling & agentic reasoning
📷 Fits on low hardware, any 12gb card
📚 GGUF size is 9.8GB (Q4_K_M)

Another gift from Jackrong, adding both qwopus and glm-distilled qwen together was not on my bingo card. Truly seems like the sweet spot between 9B and 27B models right now.

The ultimate model for 12-16GB VRAM owners?

https://t.co/kqDUizh1pf

65

2K

172

3K

175K

0

1

0

52

Kirill

@kshvidler

4 months ago

Making Googles most popular model smarter, cheaper and faster all at once is kinda of huge

Chubby♨️

@kimmonismus

4 months ago

Gemini 3.1 Flash lite released and its next level price-performance-ratio Gemini 3.1 Flash-Lite, its fastest and most cost-efficient Gemini 3 model yet, built for high-volume developer workloads. Priced at just $0.25 per 1M input tokens and $1.50 per 1M output tokens, it delivers 2.5x faster time to first token and 45% faster output speed than 2.5 Flash, while achieving a 1432 Elo score and up to 86.9% on GPQA Diamond benchmarks.

kimmonismus's tweet photo. Gemini 3.1 Flash lite released and its next level price-performance-ratio

Gemini 3.1 Flash-Lite, its fastest and most cost-efficient Gemini 3 model yet, built for high-volume developer workloads.

Priced at just $0.25 per 1M input tokens and $1.50 per 1M output tokens, it delivers 2.5x faster time to first token and 45% faster output speed than 2.5 Flash, while achieving a 1432 Elo score and up to 86.9% on GPQA Diamond benchmarks.

27

327

19

47

31K

0

16

Kirill

@kshvidler

4 months ago

People are falling for the hype once again for something that's just a FAISS wrapper, nothing new here...

Aakash Gupta

@aakashgupta

4 months ago

This tweet frames Zvec as a Pinecone killer. That framing obscures what Alibaba actually built. Zvec is an embedded, in-process library. No server. No network calls. No daemon. You import it like you import pandas. Pinecone and Weaviate will never see this in their competitive dashboards because it targets a completely different layer of the stack. The technical lineage matters here. Proxima has been running production vector search inside Alibaba for years. Taobao product search, Alipay face payments, Youku video recommendations, Alimama advertising. Billions of queries across systems where latency failures cost real revenue. What Alibaba’s Tongyi Lab did is strip that engine down, wrap it in a Python API, and release it as a library anyone can pip install. The benchmarks back it up. On VectorDBBench with the Cohere 10M dataset, Zvec hits 8,000+ QPS at comparable recall. That’s more than 2x ZillizCloud, the previous #1, with faster index build times. An embedded library matching or beating managed cloud services on raw throughput. This follows the SQLite playbook. SQLite opened an entirely new category of database usage by embedding directly inside applications. Software that would never have run a client-server database suddenly had access to SQL. Zvec is making the same bet for vector search: local RAG pipelines, desktop AI assistants, CLI tools, edge devices, anywhere spinning up a Pinecone cluster would be absurd. And that tells you where AI inference is heading. Every agent framework, every local assistant, every on-device RAG system needs vector search. The vast majority of those workloads run fine on an in-process library with 64 MB streaming writes and configurable memory limits. Pinecone raised $138M and just repositioned as a “knowledge platform.” Weaviate raised $50M. Their customers have SLAs, compliance requirements, and enterprise procurement cycles. Zvec is going after the 90% of developers who were overprovisioning infrastructure for workloads that fit inside a single process. The companies that should actually be nervous are Chroma and FAISS, the current defaults for lightweight local vector search, because Zvec just showed up with Alibaba-scale engineering in a pip-installable package.

23

319

28

348

42K

0

26

Who to follow

Beat saber 😳 | Peaked 26th global

MonsterWook

@wook_monster

That one canadian who played is BSWC 2020 | Peak: #24 global 🇦🇷🇨🇦

Kirill

@kshvidler

4 months ago

If your pitch needs urgency pricing, affiliate pressure, and “just install this code” energy, that is a trust exploit. Reposting this because it shows exactly what feels wrong here. LarryBrain MIGHT be legit, but right now it feels scammy: • aggressive “buy now before price jumps” messaging • huge affiliate incentives • FOMO-heavy marketing instead of technical proof • second-hand skill marketplace model where users can end up installing code they never audited From a security perspective, this is a dangerous combo: • remote skill download • local file writes • easy one-command install flow • weak user due diligence in practice That is not empowerment. That is how people accidentally run harmful code on their own machines. Call it growth hacking if you want. When the system is built around getting more people to buy in through affiliate incentives, people will call it pyramid-scheme behavior. Audit first. Install second. Always.

Oliver Henry

@oliverhenry

4 months ago

https://t.co/ohhXikdz3y

57

582

46

2K

423K

0

74

Kirill

@kshvidler

4 months ago

Imagine the possibilities...

Wildminder

@wildmindai

4 months ago

17,000 tokens per second!! Read that again! LLM is hard-wired directly into silicon. no HBM, no liquid cooling, just raw specialized hardware. 10x faster and 20x cheaper than a B200. the "waiting for the LLM to think" era is dead. Code generates at the speed of human thought. Transition from brute-force GPU clusters to actual AI appliances. https://t.co/Bf6DH7Q6Uf

wildmindai's tweet photo. 17,000 tokens per second!! Read that again!
LLM is hard-wired directly into silicon. no HBM, no liquid cooling, just raw specialized hardware. 10x faster and 20x cheaper than a B200.
the "waiting for the LLM to think" era is dead. Code generates at the speed of human thought.
Transition from brute-force GPU clusters to actual AI appliances.
https://t.co/Bf6DH7Q6Uf

382

7K

931

5K

2M

0

31

kshvidler retweeted

Remotion

@Remotion

6 months ago

Remotion now has Agent Skills - make videos just with Claude Code! $ npx skills add remotion-dev/skills This animation was created just by prompting 👇

783

21K

2K

37K

18M

Kirill

@kshvidler

6 months ago

Every solo founder I know, including myself, eventually hits this wall: building the thing is 20% of the battle. Distribution? That's the other 80%, and it's a jungle out there. Following the same 'get 100 followers, engage 10 people, comment on threads' playbook everyone shares? It's the fastest way to get lost in the noise. You end up blending in, not standing out. Your advantage isn't how hard you work; it's how different you are. I learned this going from data scientist to full-stack with AI tools, building Artera, now dartpost. What makes YOU uniquely you? That’s your distribution strategy. Seriously, what’s your take on this? Are you just following playbooks?

2

1

0

35

Kirill

@kshvidler

6 months ago

AI isn't here to replace software developers; it's here to promote us. Forget junior dev tasks. That's AI's job now. The game has changed. When I started as a data scientist, then learned backend/AWS and became full stack with AI's help, I quickly saw this shift. The real skill premium is now on system design, architecture, and knowing *what* to build and *why*. AI can write a brilliant Next.JS module, but it can't tell you if that module solves a real business problem, or how it integrates into a complex distributed system. My entire journey building #dartpost 🚀 on AWS and Supabase has been about me driving the architectural vision, and AI rapidly iterating on the low-level code. Humans drive ideas. AI builds the vehicle. Are you designing the vehicle or just turning wrenches?

0

61

Kirill

@kshvidler

6 months ago

I always hated ORMs, they hide complexity by sacrificing functionality, hiding behind "simple" syntax. I remember the Django ORM had issues with composite primary keys in postgres, which was beyond stupid to me at the time. We managed to work around it though, so it can work

Arpit Bhayani

@arpit_bhayani

6 months ago

ORMs are an anti-pattern once you start operating at scale. At scale, most move away from them as the abstraction that bridges the database and programming language never fully holds. At scale, by using ORM, you almost always end up missing native database optimizations. And if you are already writing raw queries to work around ORM limitations, you are actually better off using prepared statements directly. The ORM abstraction hides query complexity, yes, but it does make it easy to ship inefficient N+1 queries without realizing it. Debugging becomes painful at the worst possible time - during production incidents. ORMs tend to lag behind database features. Indexing strategies, query hints, and newer capabilities are either inconvenient to use (just look at prefetch in Django ORM) or simply unavailable. This makes you opt for the lowest common capability (as offered by ORM) instead of leveraging the database properly. At scale, explicit queries are always simpler, faster, and easier to optimize; the cognitive overhead of mapping objects to tables is often not worth it. To be fair, ORMs make sense for prototypes and for getting to market quickly. Use them when you are starting out or in the early phases, but have a plan to migrate away from them :) Easier said than done, though; been there!

149

1K

83

519

487K

0

55

Kirill

@kshvidler

8 months ago

@cgtwts Wonderful bait

0

34

Kirill

@kshvidler

8 months ago

I've worked in a team of two where I was the first guy and my coworker - the second. Both are important, perfectly balancing each others faults. Engineering is a coop game, not single player

Joe Skyler

@businessguy_1

8 months ago

There are two types of engineers: 1. Ship fast (speed) 2. Ship perfect (quality) Which one are you?

828

975

41

69

4M

0

2

0

96

Kirill

@kshvidler

8 months ago

@teni_thedev Appreciate the sentiment, but I'm mostly building for myself at this stage. Good luck on your journey though 💪💪

1

0

12

Kirill

@kshvidler

8 months ago

The “post 3x a day” advice is everywhere. I’m doing the opposite. I’m running an experiment: Only posting once a week, but making it the most high-signal content I can produce. No more filler just to feed the algorithm. I'm betting on quality over quantity. Let's see how it goes.

1

6

0

97

Kirill

@kshvidler

8 months ago

Tony Stark was a vibecoder

0

2

0

69

Kirill

@kshvidler

8 months ago

3. Focus on Impact, Not Just Activity Instead of saying "I built a new caching layer"... Say "I reduced API response time by 300ms, improving user experience and cutting our server costs." Always connect your technical work to a business outcome. This is how you demonstrate senior-level thinking.

1

0

42

Kirill

@kshvidler

8 months ago

Presenting is a critical skill for an engineer... ESPECIALLY for an engineer. Clear communication is what separates junior/mid-level engineers from the senior/staff-level ones. Here are 3 core principles to master presenting 🧵:

kshvidler's tweet photo. Presenting is a critical skill for an engineer...
ESPECIALLY for an engineer.

Clear communication is what separates junior/mid-level engineers from the senior/staff-level ones.

Here are 3 core principles to master presenting 🧵: https://t.co/UrbVjp6qRp

1

4

0

1

107

Kirill

@kshvidler

8 months ago

2. Structure the Narrative: Why → What → How Most engineers present in the reverse order. Flip it. Start with the WHY: What problem are you solving and why does it matter? Then the WHAT: What was your solution? Finally, the HOW: Briefly explain key technical decisions.

2

0

56

Kirill

@kshvidler

9 months ago

Looks like we will be getting INSANELY GOOD vlms in the next 3-6 minths

Liquid AI

@liquidai

9 months ago

Introducing our new tiny vision language model: LFM2-VL-3B 👀 > Expanded multilingual visual understanding: English, Japanese, French, Spanish, German, Italian, Portuguese, Arabic, Chinese, Korean > 51.8% on MM-IFEval (instruction following) > 71.4% on RealWorldQA (real-world understanding) > Excels in single and multi-image understanding and English OCR > Low object hallucination rate (POPE benchmark) Download below 👇

liquidai's tweet photo. Introducing our new tiny vision language model: LFM2-VL-3B 👀

> Expanded multilingual visual understanding: English, Japanese, French, Spanish, German, Italian, Portuguese, Arabic, Chinese, Korean
> 51.8% on MM-IFEval (instruction following)
> 71.4% on RealWorldQA (real-world understanding)
> Excels in single and multi-image understanding and English OCR
> Low object hallucination rate (POPE benchmark)

Download below 👇

15

376

57

147

58K

0

5

0

1

357

Kirill

@kshvidler

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users