I’m building Videocrawl https://t.co/FO5LU7fNDw – an LLM-powered assistant for better learning from videos. We also provide an API to extract structured data from videos.
One of my favorite uses of Claude’s new create/edit files feature: generating synthetic PDFs.
For one project, I generated 10 synthetic regulatory amendment PDFs (add/replace/delete) to thoroughly test document-processing agents.
Great way to get unstuck and move forward.
Hi @RahulGandhi for the last 10 years, I’ve been telling my friends and colleagues that I truly believe — and hope — you will become the best Prime Minister India has ever had.
Just tried Gemini CLI with the Pro model — felt pretty dumb. Even after a detailed prompt with context, it kept asking, “tell me what you want to do.” Really disappointing.
I use Claude Code daily, and it is so much better.
@kislayverma@rahulj51 - Start in plan mode for all tasks (even small ones). Review the plan and provide feedback.
- Actively monitor the code being generated and give feedback during generation to steer it in the right direction.
- Include explicit solution details in the prompt.
3/3 Another example: how do you build production‑ready classifiers using LLMs? There are so many real topics I wish people would cover.
I want to learn if there are better approaches.
1/3 I watched many AI Engineer @aiDotEngineer talks over the last few days, but most lacked the practical or deep insights I was hoping for from a leading AI conference. Too many talks felt like fluff or marketing pitches.
2/3 Sidenote: I’m building a Regulatory Intelligence solution, and even seemingly simple tasks like extracting obligations from text are hard. LLMs don’t consistently extract the same obligations. I’ve found ways to do it reliably, but it took a lot of trial and error.
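A minimal sketch of one way to put a number on that inconsistency: run the same extraction several times and compare the resulting obligation sets. This is an illustration only, not the approach mentioned above; the function names, the normalization step, and the sample obligations are all hypothetical.

```python
from itertools import combinations


def normalize(obligation: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(obligation.lower().split())


def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def consistency(runs: list[list[str]]) -> float:
    """Mean pairwise Jaccard similarity across repeated extraction runs."""
    sets = [{normalize(o) for o in run} for run in runs]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical outputs from three runs of the same prompt on the same text.
runs = [
    ["Report breaches within 72 hours", "Retain records for 5 years"],
    ["report breaches within 72 hours", "Retain records for 5 years"],
    ["Report breaches within 72 hours"],
]
score = consistency(runs)
print(f"consistency: {score:.3f}")
```

A score of 1.0 means every run returned the same set. Exact string matching after normalization is a crude assumption; paraphrased obligations would need fuzzy or embedding-based matching.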
I am finding gpt-4.1-mini better at instruction following compared to gpt-4.0. We are building a system that extracts obligations from regulatory documents. gpt-4.1-mini is more accurate and complete compared to gpt-4.0.
I documented how I used @AnthropicAI Claude 3.7 Sonnet with Extended Thinking to build the screenshot feature in Videocrawl.
This post shows my workflow, how humans are still in the loop, and why I had to guide the model to use the Screen Capture API when it got things wrong.
Videocrawl (https://t.co/FO5LU7fNDw) is an LLM-powered experience for videos. We use @OpenAI models for our AI features. We upgraded from gpt-4o to gpt-4.1 and our transcript eval success rate has improved by 30%. Impressive work by the OpenAI team.
I’m building a couple of voice agents and was looking for ways to test them. Just watched Superdial’s Voice Agent Engineering talk. 3 useful tips:
1. Creating a fake phone number that plays an MP3 to test if the bot can interact with audio playback
2. Creating a simulated voice tree using different phone tree building tools to have the bot pseudo-navigate it
3. Using generative services to have the bot talk to another bot for testing conversations.
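The third tip can be sketched in text-only form, as a rough illustration rather than anything shown in the talk. Both bots here are plain Python callables; in a real setup each would sit behind speech-to-text and text-to-speech, and the tester would likely be an LLM. `run_conversation`, `scripted_tester`, and `echo_agent` are hypothetical names.

```python
from typing import Callable, Iterable

Bot = Callable[[str], str]  # takes the other side's last message, returns a reply


def run_conversation(agent: Bot, tester: Bot, opening: str,
                     max_turns: int = 6) -> list[tuple[str, str]]:
    """Alternate turns between the tester bot and the agent under test."""
    transcript = [("tester", opening)]
    message = opening
    for _ in range(max_turns):
        reply = agent(message)
        transcript.append(("agent", reply))
        if "goodbye" in reply.lower():  # simple end-of-call signal
            break
        message = tester(reply)
        transcript.append(("tester", message))
    return transcript


def scripted_tester(lines: Iterable[str]) -> Bot:
    """Tester that plays back fixed lines, then says goodbye."""
    remaining = iter(lines)

    def tester(_agent_reply: str) -> str:
        return next(remaining, "goodbye")

    return tester


def echo_agent(message: str) -> str:
    """Stand-in for the voice agent under test."""
    if "goodbye" in message.lower():
        return "Goodbye!"
    return f"You said: {message}"


transcript = run_conversation(echo_agent,
                              scripted_tester(["I need to reschedule"]),
                              "Hello")
for speaker, text in transcript:
    print(f"{speaker}: {text}")
```

Keeping the transcript as a list of (speaker, text) pairs makes it easy to assert on afterwards, for example that the agent ended the call or never repeated itself.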
Enjoyed watching the "Explain How Kubernetes Works With GPU Like I’m 5" talk. It covers how you can set up a home k8s GPU lab.
Video link: https://t.co/KedWOaFgfY