Vibe-coded a personal task tracker with a calendar, then added MCP and voice calls via @retellai. It's really cool when an agent actually calls you instead of sending silent notifications we never pay attention to. Check it out and grab it if you want: https://t.co/9A46GyTwjk
I spent the last 45 minutes trying to generate 5-second videos from scripts, some very specific and some less so, with Veo, @Kling_ai, @midjourney and @grok. Here are my thoughts:
1) Google Veo: total disaster.
2) Kling could only handle some parts.
3) Midjourney just ignores half of the prompts.
4) Grok is the winner. I didn't expect it, but it generated the most suitable video.
This paper builds an agentic LLM that can run the whole data science workflow by itself.
It is an 8B model that plans work, reads structured files, writes and runs code, checks results, and iterates.
Standard “workflow agents” break down on this kind of task because fixed scripts do not adapt well to long, multi-step jobs.
DeepAnalyze fixes that with 5 actions: Analyze, Understand, Code, Execute, and Answer, so the model can switch between thinking and doing.
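To make the five-action idea concrete, here is a minimal sketch of a loop that dispatches on the chosen action. It is not the paper's implementation: `scripted_policy` is a hypothetical stand-in for the model, `run_code` and `run_agent` are illustrative names, and the toy data is made up.

```python
from enum import Enum
import contextlib
import io


class Action(Enum):
    ANALYZE = "Analyze"
    UNDERSTAND = "Understand"
    CODE = "Code"
    EXECUTE = "Execute"
    ANSWER = "Answer"


def run_code(source: str) -> str:
    """Run a code string and capture what it prints (no sandboxing in this sketch)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue().strip()


def scripted_policy(history):
    """Hypothetical stand-in for the model: returns the next (action, content) pair."""
    script = [
        (Action.ANALYZE, "Plan: compute the mean of the sales column."),
        (Action.UNDERSTAND, "Data has one numeric column: sales = [10, 20, 30]."),
        (Action.CODE, "sales = [10, 20, 30]\nprint(sum(sales) / len(sales))"),
        (Action.EXECUTE, None),
        (Action.ANSWER, None),
    ]
    return script[len(history)]


def run_agent(policy, max_steps=10):
    """Loop until the policy chooses Answer, feeding execution output back in."""
    history = []
    last_code, last_output = "", ""
    for _ in range(max_steps):
        action, content = policy(history)
        if action is Action.CODE:
            last_code = content
        elif action is Action.EXECUTE:
            # Environment feedback: the output of the most recent code.
            content = last_output = run_code(last_code)
        elif action is Action.ANSWER:
            history.append((action, f"Mean sales: {last_output}"))
            break
        history.append((action, content))
    return history


if __name__ == "__main__":
    for action, content in run_agent(scripted_policy):
        print(f"[{action.value}] {content}")
```

In a real agent the policy would be the model itself, choosing each action from the full history, which is what lets it go back to Code after a bad Execute instead of following a fixed script.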
Training happens in 2 stages: first single skills are strengthened, then multi-skill reinforcement runs in live environments.
They also synthesize step-by-step trajectories so the model sees full examples of planning, coding, and using feedback.
Rewards are hybrid: simple checks like correctness and format, plus an LLM judge that scores report usefulness and clarity.
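A rough sketch of what a hybrid reward like this could look like. The weights, the 0 to 1 scale, and every name here (`rule_reward`, `hybrid_reward`, `toy_judge`) are assumptions for illustration, not the paper's formula; `toy_judge` is a trivial stand-in for a real LLM judge.

```python
from typing import Callable


def rule_reward(answer: str, expected: str) -> float:
    """Simple checks: 0.5 for a non-empty answer (format), 0.5 for an exact match."""
    format_ok = bool(answer.strip())
    correct = answer.strip() == expected.strip()
    return 0.5 * format_ok + 0.5 * correct


def hybrid_reward(
    answer: str,
    expected: str,
    report: str,
    judge: Callable[[str], float],
    judge_weight: float = 0.5,  # assumed weighting, not taken from the paper
) -> float:
    """Blend rule-based checks with a judge score clamped to [0, 1]."""
    judge_score = min(max(judge(report), 0.0), 1.0)
    return (1 - judge_weight) * rule_reward(answer, expected) + judge_weight * judge_score


def toy_judge(report: str) -> float:
    """Hypothetical stand-in for an LLM judge: longer reports score higher, capped at 1."""
    return min(len(report.split()) / 10, 1.0)


if __name__ == "__main__":
    report = "Mean sales is 20.0 across three rows"
    print(hybrid_reward("20.0", "20.0", report, toy_judge))
```

The point of the split is that the cheap checks anchor the reward to things that can be verified, while the judge covers qualities like clarity that rules cannot measure.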
The result is autonomous orchestration (choosing the next best action) and adaptive optimization (improving decisions from environment feedback).
Across many benchmarks, this 8B model beats most workflow agents and can produce analyst-grade research from raw structured data.
---
Paper – arxiv.org/abs/2510.16872
Paper Title: "DeepAnalyze: Agentic Large Language Models for Autonomous Data Science"