Andrea | 🇸🇪🇪🇸🇻🇪 @aicoding_ - Twitter User

aicoding_ retweeted

@xiangyue96

More than 1 year ago

Introducing Critique Fine-Tuning (CFT): a more effective SFT method for enhancing LLMs' reasoning abilities.
📄 Paper: https://t.co/oK4vCIMP7z
CFT is simple: instead of training models to directly answer questions, we train them to critique noisy answers.

What's fascinating is that while most approaches focus on using generative critique or reward models to provide feedback for policy models, these critique models can themselves serve as policy models: directly answering questions with stronger reasoning.

Interestingly, we also found that CFT saturates quickly: overtraining on critiques can even degrade problem-solving performance.

Work led by @YuboWang726 in collaboration with @WenhuChen

11

306

67

228

23K
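The CFT post above contrasts two training targets: the answer itself (standard SFT) versus a critique of a noisy answer. A minimal sketch of how one such training pair might be assembled is shown here. The field names, the prompt template, and the toy example are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class CritiqueExample:
    """One hypothetical CFT record: a question, a noisy answer, and a critique."""
    question: str
    noisy_answer: str
    critique: str


def to_sft_pair(ex: CritiqueExample) -> dict:
    """Build a prompt/completion pair whose target is the critique.

    Standard SFT would map question -> answer. Here the model sees the
    question plus a (possibly wrong) answer and learns to critique it.
    The template below is an assumption made for illustration.
    """
    prompt = (
        f"Question:\n{ex.question}\n\n"
        f"Candidate answer:\n{ex.noisy_answer}\n\n"
        "Critique the candidate answer."
    )
    return {"prompt": prompt, "completion": ex.critique}


example = CritiqueExample(
    question="What is 12 * 12?",
    noisy_answer="12 * 12 = 124",
    critique="Incorrect. 12 * 12 = 144; the candidate mis-multiplied.",
)
pair = to_sft_pair(example)
```

The resulting pair can go into any ordinary supervised fine-tuning pipeline; only the choice of target changes.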

aicoding_ retweeted

Unsloth AI

@UnslothAI

More than 1 year ago

Run DeepSeek-R1 (671B) locally on @OpenWebUI - Full Guide

No GPU required.
Using our 1.58-bit Dynamic GGUF and llama.cpp.

Tutorial: https://t.co/xaR9KpJzcj

16

839

173

791

68K

aicoding_ retweeted

ILIAS ISM

@illyism

More than 1 year ago

You don't need a reasoning model like R1 or o3, just use this .cursorrules with Claude Sonnet to add a thinking step, works 100x better. https://t.co/G68V3piHpx

80

5K

272

11K

558K

aicoding_ retweeted

Ivan Fioravanti ᯅ

@ivanfioravanti

More than 1 year ago

🔥 o3-mini-high beats deepseek r1 and o1-pro in a p5.js challenge! The o3-mini result is so good that it deserves a video of its own. deepseek r1 (bad result) and o1-pro (better) in comments below. Prompt in last comment. 1/4

69

1K

127

652

463K

aicoding_ retweeted

Dimitris Papailiopoulos

@DimitrisPapail

More than 1 year ago

Transformers can overcome easy-to-hard and length generalization challenges through recursive self-improvement.

Paper on arxiv coming on Monday.
Link to a talk I gave on this below 👇

Super excited about this work!

19

1K

139

901

167K

aicoding_ retweeted

Sam Altman

@sama

More than 1 year ago

o3-mini is out! smart, fast model. available in ChatGPT and API. it can search the web, and it shows its thinking. available to free-tier users! click the "reason" button. with ChatGPT plus, you can select "o3-mini-high", which thinks harder and gives better answers.

2K

26K

2K

3K

3M

aicoding_ retweeted

Seunghyun Seo

@SeunghyunSEO7

More than 1 year ago

what up guys, I made a one-page comparison of MHA and MLA from @deepseek_ai for those who skipped the DS-V2 paper.
pls correct me if I'm wrong. https://t.co/MVoAcOrNzB

4

361

47

320

39K

aicoding_ retweeted

LangChain

@LangChain

More than 1 year ago

📚🤖 Advanced RAG + Agents Cookbook

A comprehensive open-source guide delivering production-ready implementations of cutting-edge RAG techniques with AI agents. Built with LangChain and LangGraph, it features advanced implementations like Hybrid, Self, and ReAct RAG.

Learn more: https://t.co/pXkXMFFSYt

5

702

158

779

61K

aicoding_ retweeted

Andi Marafioti

@andimarafioti

More than 1 year ago

Fuck it, today we're open-sourcing the codebase used to train SmolVLM from scratch on 256 H100s🔥
Inspired by our team's effort to open-source DeepSeek's R1 training, we are releasing the training and evaluation code on top of the weights 🫡
Now you can train any of our SmolVLMs, or create your own custom VLMs!

34

1K

211

902

99K

aicoding_ retweeted

AK

@_akhaliq

More than 1 year ago

OpenAI o3-mini System Card

11

360

68

99

47K

aicoding_ retweeted

Han Xiao

@hxiao

More than 1 year ago

Letter-dropping physics comparison: o3-mini vs. deepseek-r1 vs. claude-3.5 in one-shot - which is the best?

Prompt: Create a JavaScript animation of falling letters with realistic physics. The letters should:
* Appear randomly at the top of the screen with varying sizes
* Fall under Earth's gravity (9.8 m/s²)
* Have collision detection based on their actual letter shapes
* Interact with other letters, ground, and screen boundaries
* Have density properties similar to water
* Dynamically adapt to screen size changes
* Display on a dark background

153

3K

253

2K

604K
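For context on what the falling-letters prompt above asks the models to produce, here is a minimal sketch of just the gravity and ground-contact part. It uses Python rather than the JavaScript the prompt requests, and it treats each letter as a point mass. The `restitution` value, the metre-based coordinates, and the `Letter` type are assumptions for illustration; collision by actual letter shape, letter-to-letter contact, and density are not modelled.

```python
from dataclasses import dataclass

GRAVITY = 9.8  # m/s^2, the value given in the prompt


@dataclass
class Letter:
    y: float          # height above the ground, in metres
    vy: float = 0.0   # vertical velocity, positive means upward


def step(letter: Letter, dt: float, restitution: float = 0.5) -> None:
    """Advance one letter by dt seconds using semi-implicit Euler.

    Velocity is updated first, then position. If the letter passes
    through the ground (y < 0) it is clamped to the ground and its
    velocity is reversed and damped by the assumed restitution factor.
    """
    letter.vy -= GRAVITY * dt
    letter.y += letter.vy * dt
    if letter.y < 0.0:
        letter.y = 0.0
        letter.vy = -letter.vy * restitution


letter = Letter(y=2.0)
for _ in range(1000):
    step(letter, dt=0.016)  # roughly one frame at 60 fps
```

A full answer to the prompt would also need shape-aware collision detection and resize handling, which is where the compared models tend to differ.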

aicoding_ retweeted

elvis

@omarsar0

More than 1 year ago

AI Agents for Computer Use

This report provides a comprehensive overview of the emerging field of instruction-based computer control, examining available agents – their taxonomy, development, and resources. https://t.co/pNFyewjee6

15

657

141

751

66K

aicoding_ retweeted

Gabriel Massadas

@G4brym

More than 1 year ago

Gemini 2.0 doesn’t get nearly enough credit. I just dumped all my workers-qb source code into it, hit it with a simple, humble prompt, and boom => it one-shotted the docs. Not just good docs, way better than what I had before, packed with examples. Kinda insane.

30

714

60

486

115K

aicoding_ retweeted

AK

@_akhaliq

More than 1 year ago

OpenAI o3-mini just one shotted this prompt: write a script for 100 bouncing yellow balls within a sphere, make sure to handle collision detection properly. make the sphere slowly rotate. make sure balls stays within the sphere. implement it in p5.js

137

4K

399

2K

815K
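The bouncing-balls prompt above hinges on one step: keeping each ball inside the sphere. A minimal sketch of that containment check follows, in Python rather than p5.js, assuming a sphere centred at the origin and a perfectly elastic bounce. The function name and the tuple representation are hypothetical; ball-to-ball collisions and the sphere's rotation are left out.

```python
import math


def keep_inside_sphere(pos, vel, ball_radius, sphere_radius):
    """Return (pos, vel) corrected so the ball stays inside the sphere.

    If the ball's surface pokes through the sphere wall, the ball is moved
    back onto the wall and the outward part of its velocity is reflected:
    v' = v - 2 (v . n) n, where n is the outward unit normal.
    """
    dist = math.sqrt(sum(c * c for c in pos))
    limit = sphere_radius - ball_radius  # farthest the centre may travel
    if dist <= limit or dist == 0.0:
        return pos, vel
    normal = tuple(c / dist for c in pos)
    pos = tuple(n * limit for n in normal)
    v_dot_n = sum(v * n for v, n in zip(vel, normal))
    if v_dot_n > 0:  # only reflect if the ball is still moving outward
        vel = tuple(v - 2 * v_dot_n * n for v, n in zip(vel, normal))
    return pos, vel


new_pos, new_vel = keep_inside_sphere(
    pos=(10.0, 0.0, 0.0), vel=(1.0, 2.0, 0.0),
    ball_radius=1.0, sphere_radius=10.0,
)
```

Reflection keeps the tangential velocity unchanged, so the ball slides along the wall direction it was already travelling in while its outward motion is reversed.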

aicoding_ retweeted

anton

@abacaj

More than 1 year ago

Finished an R1-style GRPO run on Qwen-2.5-0.5B (base model): it yields about +10 accuracy points on GSM8K. Literally just works. Base model scores 41.6% as reported in the Qwen paper vs ~51% with GRPO https://t.co/vGgAMX0DHK

41

1K

108

653

108K
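The GRPO post above does not show any code. As background, the core of GRPO is a group-relative advantage: several completions are sampled for the same prompt, and each reward is normalised against the group's mean and standard deviation. The sketch below illustrates only that step. The use of the population standard deviation, the `eps` term, and the 0/1 correctness rewards are assumptions, and this is not the author's training script.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalise rewards within one group of sampled completions.

    Each advantage is (reward - group mean) / (group std + eps), so
    completions that beat their siblings get a positive advantage and
    no separate value network is needed. eps guards against a zero std
    when every completion in the group scores the same.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one GSM8K-style question, scored 1 if correct.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

When every sample in a group gets the same reward, all advantages are zero and that prompt contributes no gradient signal.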

aicoding_ retweeted

Antaripa Saha

@doesdatmaksense

More than 1 year ago

people learning gpu programming, and especially triton, should check out liger kernel by linkedin

it was released last year and built on top of triton to provide pre-optimized, ready-to-use implementations of gpu optimization techniques specifically tailored for llm training https://t.co/YC4Epcw6dg

9

616

60

556

34K

aicoding_ retweeted

Caleb Peffer (Hiring!)

@CalebPeffer

More than 1 year ago

Excited to announce https://t.co/azlzx4Rrah A website that turns any website into a GET API with the @firecrawl /extract endpoint. Data on the web has never been more accessible! Thanks to @devdigest for starting this fabulous trend. Check out his GitHub repo below!

37

2K

193

4K

235K

aicoding_ retweeted

Lex Fridman

@lexfridman

More than 1 year ago

OpenAI o3-mini is a good model, but DeepSeek r1 has similar performance, is still cheaper, and reveals its reasoning. Better models will come (can't wait for o3pro), but the "DeepSeek moment" is real. I think it will still be remembered 5 years from now as a pivotal event in tech history, due in part to the geopolitical implications but for many other reasons too. All this is discussed in a 5-hour technical podcast I just recorded on the state of the AI industry. Out tomorrow (hopefully).

971

13K

1K

2K

2M

aicoding_ retweeted

Artificial Analysis

@ArtificialAnlys

More than 1 year ago

OpenAI’s o3-mini is here - a significant jump forward from o1-mini

Initial results (full benchmarking coming soon):
➤ Artificial Analysis Quality Index of 89, matching DeepSeek R1 and just below o1
➤ Cheaper - $1.1/$4.4 input/output pricing per million tokens, lower than many DeepSeek R1 APIs (higher than DeepSeek’s first party R1 API)
➤ Fast - similar speed to o1-mini at 170 tokens/s, although that means 2000 tokens of ‘thinking’ time will still take ~12 seconds

24

400

59

122

80K
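The figures in the Artificial Analysis post above can be checked with simple arithmetic. The sketch below uses only the numbers quoted there (170 tokens/s, $1.1 and $4.4 per million input and output tokens). The function names and the example request size are hypothetical, and counting thinking tokens as output tokens is an assumption.

```python
def thinking_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to generate `tokens` at a steady output speed."""
    return tokens / tokens_per_second


def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float = 1.1,
                     output_price_per_m: float = 4.4) -> float:
    """Cost of one request, with prices given per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# 2000 thinking tokens at 170 tokens/s, as in the post.
latency = thinking_seconds(2000, 170)

# A made-up request: 1000 prompt tokens and 2000 generated tokens.
cost = request_cost_usd(1000, 2000)
```

Here `latency` comes to about 11.8 seconds, consistent with the "~12 seconds" in the post, and the made-up request costs just under one US cent.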

aicoding_ retweeted

Carlos E. Perez

@IntuitMachine

More than 1 year ago

When working with o1/o3 models, I always have this feeling that I'm leaving a lot on the table with my prompting. Creating a long sequence of prompts for regular LLMs is good practice. This is because you don't want to overload what an LLM can process (or it'll lead to hallucinations). But Large Reasoning Models (LRMs) are different.

21

525

77

594

54K
