Wangda Tan @leftnoteasy - Twitter Profile

(6/n) Looking forward: While these new reasoning models aren't fully practical yet, they show huge potential. Once they solve the fast/slow thinking problem and learn when to stop deliberating, they'll be game-changing. My prediction? The future belongs to such thinking models!

0

68

Wangda Tan @leftnoteasy

over 1 year ago

A Quick Deepseek R1 Testing for SQL Generation (with Waii) Finally got to test Deepseek R1! Tried both versions: distilled LLaMA 8B from R1 (runs in local ollama) and Deepseek R1 (671B) from fireworks. Here's what I discovered. 🧵

6

0

1

193

Who to follow

Tianqi Chen

@tqchenml

AssistProf @CarnegieMellon. Distinguished Eng @NVIDIA. Creator of @XGBoostProject, @ApacheTVM. Member https://t.co/QYyfjQNp4p, @TheASF. Views are on my own

Zhe Zhang

@zhz_ray

Apache Software Foundation Member

Arno Candel

@ArnoCandel

Macrohard @xAI Previously CTO @h2oai AutoML, Data Science, Deep Learning, Accelerator Physics, Supercomputing. ACE3P, H2O-3, Driverless AI. PhD Physics ETH SLAC

Wangda Tan @leftnoteasy

over 1 year ago

(5/n) About the distilled model - don't get too excited. It can't handle complexity: - Struggles with AMC-8 (8th-grade math competition) questions -- just went into an infinite loop of output and cannot give me an answer. - Can't generate SQL based on the schema input.

0

63

Wangda Tan @leftnoteasy

over 1 year ago

(4/n) more downside: - Query generation takes 4-5 mins vs 10-20 secs with GPT-4o - Most time is spent on unnecessary self-debate. The solution is clear in first 20-30% (40-60 secs), but it keeps rewriting and second-guessing the correct solution

0

48

Wangda Tan @leftnoteasy

over 1 year ago

(3/n) Downsides? Everything comes with a cost: - Model takes long <think> output for any task, even simple entity extraction - I still need to use GPT-4o for quick tasks (reranking, entity extraction) during the test, otherwise it will take forever.

0

43

Wangda Tan @leftnoteasy

over 1 year ago

(2/n) R1 is REALLY good at understanding aggregation. Current models like GPT-4o, Claude 3.5 sometimes mess up complex aggregations (like avg of daily avg sales) or window functions with filters. But R1 handles these consistently better than other models I've tested. Example:

leftnoteasy's tweet photo. (2/n) R1 is REALLY good at understanding aggregation. Current models like GPT-4o, Claude 3.5 sometimes mess up complex aggregations (like avg of daily avg sales) or window functions with filters. But R1 handles these consistently better than other models I've tested. Example: https://t.co/JJwfzHxP5g

0

43

Wangda Tan @leftnoteasy

over 1 year ago

(1/n) 1) The self-debate is fascinating - it's like watching a real, capable but hesitant person think through problems. Check out this snippet for one of the query

leftnoteasy's tweet photo. (1/n) 1) The self-debate is fascinating - it's like watching a real, capable but hesitant person think through problems. Check out this snippet for one of the query https://t.co/eN6eQ6Jtyu

0

45

leftnoteasy retweeted

Zipeng Fu

@zipengfu

over 2 years ago

Mobile ALOHA's hardware is very capable. We brought it home yesterday and tried more tasks! It can: - do laundry👔👖 - self-charge⚡️ - use a vacuum - water plants🌳 - load and unload a dishwasher - use a coffee machine☕️ - obtain drinks from the fridge and open a beer🍺 - open doors🚪 - play with pets🐱 - throw away trash - turn on/off a lamp💡 Project website: https://t.co/9rzIX8wLEp Co-lead @tonyzzhao, advised by @chelseabfinn (amazing photographing from @qingqing_zhao_ )

326

7K

2K

3K

3M

Wangda Tan @leftnoteasy

over 2 years ago

@JosephJacks_ @garrytan @MistralAI @togethercompute @abacusai @DeepInfra Competitor of Mistral AI

0

338

Wangda Tan @leftnoteasy

over 2 years ago

Wrote a blog post about how to build an enterprise-ready Text-to-SQL system, plan to build one yourself? Check out the post! https://t.co/dWvueo3AMk

leftnoteasy's tweet photo. Wrote a blog post about how to build an enterprise-ready Text-to-SQL system, plan to build one yourself? Check out the post! https://t.co/dWvueo3AMk https://t.co/WiC7IietqZ

0

5

0

3

207

leftnoteasy retweeted

Jim Fan

@DrJimFan

over 2 years ago

AlphaCode-2 is also announced today, but seems to be buried in news. It's a competitive coding model finetuned from Gemini. In the technical report, DeepMind shares a surprising amount of details on an inference-time search, filtering, and re-ranking system. This may be Google's Q*? 🤔 They also discussed the finetuning procedure, which is 2 rounds of GOLD (an offline RL algorithm for LLM from 2020), and the training dataset. AlphaCode-2 scores at 87% percentile among the human competitors. Don't miss it: https://t.co/A8Y64cAqrx

DrJimFan's tweet photo. AlphaCode-2 is also announced today, but seems to be buried in news. It's a competitive coding model finetuned from Gemini. In the technical report, DeepMind shares a surprising amount of details on an inference-time search, filtering, and re-ranking system. This may be Google's Q*? 🤔

They also discussed the finetuning procedure, which is 2 rounds of GOLD (an offline RL algorithm for LLM from 2020), and the training dataset. AlphaCode-2 scores at 87% percentile among the human competitors.

Don't miss it: https://t.co/A8Y64cAqrx

27

1K

238

577

199K

Wangda Tan @leftnoteasy

over 2 years ago

Excited to feature in @llama_index's blog! 🚀 We're blending text-to-SQL with PDF data for powerful enterprise solutions. Check it out! 👉

LlamaIndex 🦙

@llama_index

over 2 years ago

Building advanced text-to-SQL is hard. Building advanced QA over both structured and unstructured docs is even harder. We’re excited to feature a blog by @leftnoteasy (https://t.co/gumBYCXYl2) - build an agent that can query enterprise-grade DB’s along with PDF data, with @llama_index + https://t.co/gumBYCXYl2 The enterprise text-to-SQL consists of the following: ✅ Knowledge Graph modeling metadata/query history to help table/schema selection ✅ Semantic rules: guide producing the right queries ✅ Automatic error correction through query compiler We use this over a SQL database of retail data, and combine this with a @llama_index RAG pipeline over a Deloitte PDF report. This allows our agent to compare the structured/unstructured data ⚖️ - e.g. the top items sold during the holidays. Check out the full blog! https://t.co/5N2Yjnluc1 Notebook: https://t.co/DsbnO96Err Signup with Waii here: https://t.co/RGHCfGQhvI

3

204

45

226

57K

0

2

236

leftnoteasy retweeted

LlamaIndex 🦙

@llama_index

over 2 years ago

Building advanced text-to-SQL is hard. Building advanced QA over both structured and unstructured docs is even harder. We’re excited to feature a blog by @leftnoteasy (https://t.co/gumBYCXYl2) - build an agent that can query enterprise-grade DB’s along with PDF data, with @llama_index + https://t.co/gumBYCXYl2 The enterprise text-to-SQL consists of the following: ✅ Knowledge Graph modeling metadata/query history to help table/schema selection ✅ Semantic rules: guide producing the right queries ✅ Automatic error correction through query compiler We use this over a SQL database of retail data, and combine this with a @llama_index RAG pipeline over a Deloitte PDF report. This allows our agent to compare the structured/unstructured data ⚖️ - e.g. the top items sold during the holidays. Check out the full blog! https://t.co/5N2Yjnluc1 Notebook: https://t.co/DsbnO96Err Signup with Waii here: https://t.co/RGHCfGQhvI

3

204

45

226

57K

Wangda Tan @leftnoteasy

almost 3 years ago

Fine-Tuned GPT-3.5 vs. GPT-4 For SQL Generation The results? We found that the fine-tuned version outperformed GPT-4, achieving higher accuracy at 1/3 of the cost! We also explore how fine-tuning affects readability and SQL statements usage. link: https://t.co/ztD4qupWol 🚀

0

4

0

156

Wangda Tan @leftnoteasy

about 3 years ago

@yakrobat 5/5 🔮 #GPT4 has ushered in a new era of AI capabilities for SQL generation. Stay tuned for more updates as we push the boundaries of what's possible in automating SQL generation and data analytics!

0

1

0

135

Wangda Tan @leftnoteasy

about 3 years ago

GPT-4's SQL Mastery: Solved 'Text to SQL' Problem? @yakrobat and I collaborated on research that demonstrates GPT-4's impressive SQL generation abilities through fine-tuning and optimized techniques (Read our blog post: https://t.co/hEdLCmwInU). Summary see this 🧵

6

3

0

1

335

Wangda Tan @leftnoteasy

about 3 years ago

@yakrobat 4/5 🌐 Our goal is refining GPT-4's query generation for enterprise warehouses. Challenges include testing on real-world databases, handling wide data models, and generating complex queries. We're working on addressing these limitations. #AI #Enterprise

0

1

0

119

Wangda Tan @leftnoteasy

about 3 years ago

@yakrobat 3/5 🛠️ Techniques like constraints, query examples, samples, semantics, and human guidance help optimize SQL generation. However, complexity remains a challenge, with more complex queries less likely to succeed. #FineTuning #AI

0

1

0

98

Wangda Tan @leftnoteasy

about 3 years ago

@yakrobat 2/5 🧪 We tested GPT-4 on the Spider dataset, a SQL generation benchmark. Our evaluation focused on query result correctness. With proper prompting and tuning, GPT-4 outperforms previous methods. #Benchmarking #ML

leftnoteasy's tweet photo. @yakrobat 2/5 🧪 We tested GPT-4 on the Spider dataset, a SQL generation benchmark. Our evaluation focused on query result correctness. With proper prompting and tuning, GPT-4 outperforms previous methods. #Benchmarking #ML https://t.co/ziuZVPhahb

0

1

161

Wangda Tan

@leftnoteasy

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users