Kah Seng Tay

5 months ago

So excited to see this launch finally! Congrats @calebmer and the @alpine4work team - such an incredible vision finally come to life after 2 years of stealth!

Caleb Meredith

@calebmer

5 months ago

Introducing Alpine The first ever AI-native workspace. Your docs, tasks, and chat finally in one app https://t.co/OzXvrLptb5

41

191

30

167

65K

2

10

0

269

kahseng retweeted

Kradle

@kradleai

6 months ago

Would an AI die to save you? #GPT 5.2 would. With #Claude 4.5 Sonnet you die 63% of the time. #Gemini 3 saves you both. #Grok 4.1 refuses the binary choice and destroys the trolley! Video & 🧵

9

118

31

33

31K

kahseng retweeted

Design at @castlexyz. Making distinctions.

7 months ago

we built OCR Arena, a free playground for the community to compare leading VLMs and OCR models side-by-side! upload any doc, run 10+ OCR models, and vote for the best ones on a public leaderboard:

kushalbyatnal's tweet photo. we built OCR Arena, a free playground for the community to compare leading VLMs and OCR models side-by-side!

upload any doc, run 10+ OCR models, and vote for the best ones on a public leaderboard: https://t.co/zIHpfeUxDa

26

202

39

106

37K

Who to follow

David Cole

@irondavy

High-variance engineer. Building things at https://t.co/bFNC3bdtjQ.

8 months ago

Huge congrats on the launch!!

Kasey Zhang

@_WEEXIAO

8 months ago

We've raised $7M to help companies build AI agents that actually learn and work. @Osmosis_AI is a platform for companies to fine-tune models that outperform foundation models with reinforcement learning. Better, faster, and cheaper.

138

642

90

262

1M

0

2

0

183

kahseng retweeted

9 months ago

Introducing Composer — the first AI Agent for document processing. Get to production-grade accuracy, autonomously in minutes. In our early beta, some teams hit 99% accuracy on complex document tasks in under 10 minutes. Composer is an agent built to optimize schemas the same way a human would (but way faster). Instead of tuning prompts by hand, you point Composer at your eval set inside Extend. Composer will: - analyze where your schema falls short - propose targeted improvements - run multiple experiments in parallel - surface diffs, accuracy gains, and traces behind each change With this launch, Extend is the only product on the market that helps you reach production-grade accuracy this fast. Composer is live for all Extend customers today! Try it out at the link in comments below.

24

470

41

696

75K

kahseng retweeted

Kasey Zhang

@_WEEXIAO

12 months ago

It’s easy to fine-tune small models w/ RL to outperform foundation models on vertical tasks. We’re open sourcing Osmosis-Apply-1.7B: a small model that merges code (similar to Cursor’s instant apply) better than foundation models. Links to download and try out the model below!

45

1K

131

987

127K

kahseng retweeted

12 months ago

PDFs suck. We just raised $17,000,000 in funding to fix this problem once and for all. Extend is building the modern document processing cloud. See how Brex, Square, Checkr, and Fortune 500s use it to process millions of documents:

48

653

43

575

211K

kahseng retweeted

over 1 year ago

big milestone for Extend! our new website captures months of learnings and focuses on one core mission — helping technical teams transform complex documents into reliable, high-quality data

kushalbyatnal's tweet photo. big milestone for Extend!

our new website captures months of learnings and focuses on one core mission — helping technical teams transform complex documents into reliable, high-quality data https://t.co/l4aamod7eK

14

336

24

368

47K

kahseng retweeted

Reflex

@getreflex

over 1 year ago

Today, we’re excited to announce and launch Reflex Cloud on Product Hunt! https://t.co/HFiLrkASTb Reflex is an open-source framework for building and deploying data and AI web apps in pure Python. Frontend and Backend in Pure Python: No JavaScript required! With Reflex Cloud, you can now deploy, manage, and scale your Python apps with just a single command! If you're a Python developer, an upvote or a share would mean a lot to us :)💪❤️

22

340

67

263

55K

kahseng retweeted

Zep AI

@zep_ai

over 1 year ago

💬 ➕🗄️🟰 ❤️ AI agents need more than conversational memory for state—they need to understand who they're helping & why. ➡️ Today we're connecting conversations with business data in Zep, making AI interactions more personal & relevant to every user. https://t.co/lgHjjoi2I1

0

6

1

2

498

kahseng retweeted

Erik Torenberg

@eriktorenberg

almost 2 years ago

Excited to finally launch the Turpentine Network: a social network for top founders, including CEOs of companies like Databricks, Perplexity, & 400 others totaling over $200B in valuation. We're aiming to create the most valuable social network for startups. Apply below

eriktorenberg's tweet photo. Excited to finally launch the Turpentine Network: a social network for top founders, including CEOs of companies like Databricks, Perplexity, & 400 others totaling over $200B in valuation.

We're aiming to create the most valuable social network for startups. Apply below https://t.co/T2hnIifHdU

104

887

90

440

325K

kahseng retweeted

Rishabh Srivastava

@rishdotblog

almost 2 years ago

We made a thing! Very happy to announce sqlcoder-pro and the Defog Alignment Platform. Available to use immediately without a wait-list, weights will be open-sourced very soon. The video does a quick show and tell comparison against ChatGPT (with gpt-4o). Read on for more details! TLDR 💪 equal (or better) performance on text-to-SQL as the most capable Claude-3.5 or GPT-4 models 🤝 You can use it today on a free plan/free trial, without a waitlist 🪽 self-hostable on a single RTX4090, with 2 second median generation times for SQL queries 🔁 exactly the same output every time, give the same prompt 👨🏻‍🏫 teachable and steerable: show the model what you want it to do 🛞 debuggable – you can understand WTF is going on inside the model, instead of treating it like a black box Let's dig into each of these one-by-one! Performance SQLCoder-8b-pro significantly exceeds the performance of our previous sqlcoder-8b model on Postgres text-to-SQL (from 88.2% to 90.2% accuracy - gpt-4o is at 87.6%, for reference). It is also better at following instructions. This was done via self-merges, hand crafted fine-tuning data, and adapting the training data to fit our tokenizer. Cost You can host this on the model on a single $3,500 RTX4090, and support ~5 requests/second via VLLM. If you're looking to host on the cloud instead, you can run it on a single L4 GPU that costs $300/mo on GCP Repeatability We have a dense 8b model with no MoE shenanigans. For the same prompt with temperature=0, you'll always get the same answer – which is critical in BI. Teachable In our alignment and feedback modes, you can give the model feedback on how it answered certain questions, and it will automatically adapt to the feedback. Debuggable You can use logprobs and attention scores to determine where, exactly is the model paying attention to inside a prompt + what it's getting confused by when generating outputs. Available today You can use Defog on the cloud today by going to docs[dot]defog[dot]ai, and getting an API key. Excited to hear what you think!

10

122

23

76

13K

about 2 years ago

For those curious, this is the problem I’ve been deeply looking to solve for the past few years https://t.co/CjGRK1dNcR (h/t @sriramk & commenters) Still super messy & with many different ways to tackle it! Wonder if anyone has else wants to see this solved in their own lives?

1

11

0

1

404

about 2 years ago

@itstimconnors have you checked out @Lutra_AI? they can do this, and you can specify custom rules with english (disclosure: small angel check, but that's how I know they do it)

0

1

0

137

kahseng retweeted

Rishabh Srivastava

@rishdotblog

about 2 years ago

Llama-3 based SQLCoder 8b is out! Open weights with a commercially friendly cc-by-sa license. Probably the best <10B param model for Postgres text to SQL right now. Slightly better than gpt-4-turbo and claude opus for 0-shot text to SQL generation. Also approaches their performance when following instructions. Weights on @huggingface: https://t.co/lg0A2f4tqc Demo (optimized for postgres): https://t.co/qp4zvZ52xV More technical details below! What's new about this model Our previous small model (sqlcoder-7b-2) was good at generating 0-shot SQL, but did terribly at following instructions. So while it was great in our evals, it was lacking in real-world use-cases where instruction following is much more important. To address this, we trained this model with much more instruction data. We also made our original eval much harder to make sure we stayed on the right track. Changes to evals There were 3 changes to our original eval: 1. Previously, we pruned the database schema to only consider the 20 relevant columns in the DDL statements. We have now removed pruning that so that all columns in a database are used 2. We previously used beam search with 4 beams to make our results more accurate. But with a large number of input and/or output tokens, that increased memory requirements and became computationally intractable. So we have shifted to a single beam now. 3. We added 104 complex instruction-following text=> SQL questions questions to our evals, in addition to the 200 0-shot questions that were already there. Link to our eval framework here: https://t.co/n0CxuKqjPf Changes to prompt You previously had to use our slightly idiosyncratic prompt for best results. Now, you can just use the standard Llama-3 instruct prompt. 70B model, technical report, and more up next We've also been training a llama-3 based 70B model right now. It's still training and will get better over time – but even an AWQ quantized version of our interim model is giving excellent results for now. We hope to open-source the 70B next week. We also have a technical report coming up next week (or over the weekend, if I can be productive enough on a flight) about the training methods used for this model. More on that soon! Feedback very much appreciated! In the meantime, please send us your feedback as you try out the model - specially if you see failure modes. Would very much appreciate it!

rishdotblog's tweet photo. Llama-3 based SQLCoder 8b is out! Open weights with a commercially friendly cc-by-sa license. Probably the best <10B param model for Postgres text to SQL right now.

Slightly better than gpt-4-turbo and claude opus for 0-shot text to SQL generation. Also approaches their performance when following instructions.

Weights on @huggingface: https://t.co/lg0A2f4tqc

Demo (optimized for postgres): https://t.co/qp4zvZ52xV

More technical details below!

What's new about this model
Our previous small model (sqlcoder-7b-2) was good at generating 0-shot SQL, but did terribly at following instructions. So while it was great in our evals, it was lacking in real-world use-cases where instruction following is much more important.

To address this, we trained this model with much more instruction data. We also made our original eval much harder to make sure we stayed on the right track.

Changes to evals
There were 3 changes to our original eval:

1. Previously, we pruned the database schema to only consider the 20 relevant columns in the DDL statements. We have now removed pruning that so that all columns in a database are used

2. We previously used beam search with 4 beams to make our results more accurate. But with a large number of input and/or output tokens, that increased memory requirements and became computationally intractable. So we have shifted to a single beam now.

3. We added 104 complex instruction-following text=> SQL questions questions to our evals, in addition to the 200 0-shot questions that were already there.

Link to our eval framework here: https://t.co/n0CxuKqjPf

Changes to prompt
You previously had to use our slightly idiosyncratic prompt for best results. Now, you can just use the standard Llama-3 instruct prompt.

70B model, technical report, and more up next
We've also been training a llama-3 based 70B model right now. It's still training and will get better over time – but even an AWQ quantized version of our interim model is giving excellent results for now. We hope to open-source the 70B next week.

We also have a technical report coming up next week (or over the weekend, if I can be productive enough on a flight) about the training methods used for this model. More on that soon!

Feedback very much appreciated!
In the meantime, please send us your feedback as you try out the model - specially if you see failure modes. Would very much appreciate it!

27

573

115

497

97K