Vinay Kumar @vinkrp - Twitter Profile

9 days ago

Check out the latest article in my newsletter: The model isn’t the risk. The workstation is. https://t.co/ZKEdCPK42m via @LinkedIn

0

6

Vinay Kumar @vinkrp

10 days ago

PROMPT INJECTION IS A FREE BUG BOUNTY PROGRAM FOR YOUR AI AGENTS

0

51

Vinay Kumar @vinkrp

10 days ago

When it comes to AI, 'rm -rf /' is the most feared command.

0

12

vinkrp retweeted

Claude

@claudeai

4 months ago

Introducing Claude Code Security, now in limited research preview. It scans codebases for vulnerabilities and suggests targeted software patches for human review, allowing teams to find and fix issues that traditional tools often miss. Learn more: https://t.co/n4SZ9EIklG

2K

49K

6K

25K

26M

Vinay Kumar @vinkrp

4 months ago

💸 $15,000 · ⏳ 7 days · 🤖 1 AI ➕ 🥇 YOUR IDEA
Here's the experiment: I'm handing Codex + TuringMind a $15K cloud + API budget and 7 days to ship a real product. From architecture to deployment, fully autonomous.
But here's the twist: 💪 you decide what gets built. Drop your craziest product idea below.
The winning pitch gets:
→ Built live, documented publicly
→ 1% equity in the resulting product
→ Credit as co-creator
The catch? The AI has to follow TuringMind's engineering rails: specs, tests, verification gates. No vibe coding allowed.

0

42

Vinay Kumar @vinkrp

4 months ago

Only Pain Point left right now ....

0

18

Vinay Kumar @vinkrp

4 months ago

Real Impact of Anthropic / Aardvark on Code Security :)

0

20

Vinay Kumar @vinkrp

5 months ago

We need a better code reviewer in the Age of LLM generated code. Meet TuringMind AI - A Code reviewer that understands your entire codebase, not just the diff. https://t.co/Nx6F0wdSht #turingmindai

0

60

Vinay Kumar @vinkrp

5 months ago

Day 2/30 of Building in Public: I let my AI roast its own source code. I just unlocked the shortest feedback loop in software engineering.
Normal product cycle: Build → Ship → Wait a week → Get feedback.
My cycle: Build → Review → Fix.
Without memory, I’d have to prompt it every time to ignore that file. With memory, it learned the context once and moved on.
Follow along for Day 3 - https://t.co/07evjUVshi

0

2

0

129

Vinay Kumar @vinkrp

5 months ago

AppSec is hard because we've spent a decade telling developers to "think like hackers" when their job is to "act like builders." We don't need developers to be security experts; we need security experts to build better developer tools that make it impossible (or very difficult) to do the wrong thing.

0

1

0

55

Vinay Kumar @vinkrp

5 months ago

Classic AppSec has decades of infrastructure. AI has raw capability but zero plumbing. Maybe one eats the other. Maybe they merge. Maybe I'm asking the wrong question. So I'm starting with tmind — a Claude Code Skill + Memory for the important stuff https://t.co/RftbAoTfJv

0

26

Vinay Kumar @vinkrp

5 months ago

Starting 2026 with an experiment.
Day 1 of building AI Code Reviewer infrastructure in public - no idea if this works.
What I’ve learned so far 👇
Claude catches things SAST tools miss. The stuff that actually matters.
#ClaudeCode #SAST #CodeReview

1

0

59

Vinay Kumar @vinkrp

5 months ago

But teams going all-in on AI review keep hitting the same walls:
🎯 Claude forgets everything between sessions
🎯 Same false positives, every single day
🎯 No audit trail (good luck with compliance)
🎯 Scaling gets expensive, fast
AI has raw capability, but no plumbing.

1

0

33

Vinay Kumar @vinkrp

6 months ago

Claude Code Skill to perform deep code reviews locally https://t.co/CbqMLe3pJh

0

25

vinkrp retweeted

Greg Brockman

@gdb

8 months ago

Introducing Aardvark, our agentic security researcher:

92

2K

154

588

286K

vinkrp retweeted

Andrej Karpathy

@karpathy

8 months ago

Excited to release new repo: nanochat! (it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.

It weighs ~8,000 lines of imo quite clean code to:

- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Efficient inference the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.

Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.

My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed). I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved.

Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.
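
Editor's note: the cost figures quoted in the tweet above can be sanity-checked with a little arithmetic. The sketch below is not from the nanochat repo. It assumes a flat rental rate of about $24 per hour for the 8XH100 node, a value inferred from the tweet's "~$1000 (~41.6 hours)" figure and not a quoted provider price. The names `ASSUMED_NODE_RATE_USD_PER_HOUR`, `training_cost`, and the run labels are hypothetical. At that rate, 4 hours comes to about $96 and 41.6 hours to about $998, which matches the "~$100" and "~$1000" in the tweet.

```python
# Assumption: a flat hourly rate for the 8xH100 node, chosen to match the
# tweet's "~$1000 (~41.6 hours)" figure. It is not a quoted provider price.
ASSUMED_NODE_RATE_USD_PER_HOUR = 24.0


def training_cost(hours: float, rate: float = ASSUMED_NODE_RATE_USD_PER_HOUR) -> float:
    """Return the rental cost in USD for `hours` of training at a flat hourly rate."""
    return hours * rate


# Training durations mentioned in the tweet; the labels are illustrative only.
RUNS = {
    "speedrun": 4.0,
    "surpasses GPT-2 CORE": 12.0,
    "depth 30 model": 24.0,
    "larger run": 41.6,
}

for label, hours in RUNS.items():
    print(f"{label}: {hours} h -> about ${training_cost(hours):,.0f}")
```

Passing a different `rate` shows how sensitive the totals are to the assumed node price.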

682

24K

3K

18K

6M

vinkrp retweeted

Greg Brockman

@gdb

10 months ago

hard to beat flow state at midnight while working on a worthy problem

276

4K

265

325

570K

Vinay Kumar @vinkrp

11 months ago

Check out the latest article in my newsletter: 🎉 Launching Driftcop – an open-source SAST Scanner for AI agent tooling! https://t.co/w9FFrQ5PK8 via @LinkedIn

0

26
