Toran Billups @toranb - Twitter Profile

3 months ago

I'm /beyond/ excited to join the next cohort on AI Powered Search with @softwaredoug and @treygrainger 🥳 https://t.co/rhu41ueKmW

0

6

3

1

682

toranb retweeted

Upen

@upen946

over 1 year ago

Don’t build a product, build value. Don’t market your SaaS, market how it solves a problem. People care about results, not your product. If it’s valuable and you show them the benefits, they’ll happily buy it.

14

37

6

10

2K

toranb retweeted

Jo Kristian Bergum

@jobergum

over 1 year ago

On AI in enterprises: models come and go; the true competitive advantage lies not in which frontier models you use but in how effectively you can connect those models to your organization's knowledge.

3

54

9

13

5K

toranb retweeted

Jeremy Howard

@jeremyphoward

over 1 year ago

ModernBERT is available as a slot-in replacement for any BERT-like model, with both 139M param and 395M param sizes. It has a 8192 sequence length, is extremely efficient, is uniquely great at analyzing code, and much more. Read this for details: https://t.co/uI3FKXiDhZ

6

493

34

205

30K

Toran Billups @toranb

over 1 year ago

@jobergum what language is the talk in? I'm planning to translate it from mp3 so I can listen this weekend

1

0

178

Toran Billups @toranb

over 1 year ago

This blog post from the team @bitcrowd is an outstanding resource for those who want to leverage SOTA embeddings with bumblebee. Easily the highest value resource I've seen on the subject yet. This post in particular covers the path from zero to Jina v2 https://t.co/K9BsSj240l

0

5

0

3

574

toranb retweeted

Gary Bernhardt @garybernhardt

about 2 years ago

Me at 25: Tests should be 5ish lines! One assert per test! Me at 40: This test is 56 lines long with 11 asserts. If I broke it up, it would be 11 separate tests, ~5x as much code, multiple helper functions and `beforeEach`s to avoid duplication, and more difficult to read.

16

1K

70

125

87K

Toran Billups @toranb

almost 2 years ago

@jobergum haha, just when I hoped you /would/ get started on e-comm search 😆

0

192

Toran Billups @toranb

almost 2 years ago

@_philschmid what is the link for that paper?

1

0

533

Toran Billups @toranb

about 2 years ago

@_philschmid This list is awesome! I recently did a talk on my adventures with synthetic data and I would add that for generating DPO datasets you can derive a synthetic prompt from a good response and then use that synthetic prompt to generate the rejected response https://t.co/Yh4bifmRJa

0

8

0

14

965

toranb retweeted

Philipp Schmid

@_philschmid

about 2 years ago

Data is all we need! 💎 @Alignment Labs AI just released Buzz, an instruction dataset with 3.13 million rows and a total of 85 million conversations in single- and multiturns. 🤯 It comes in 3 configurations: Buzz (SFT), RLSTACK (RLHF), Select Stack (filtered SFT)

TL;DR:
💥 Curated, deduplicated, extended, and regenerated from 435 datasets
🧠 Training Llama 3 on it with Buzz-8b-Large
🌍 85 million conversational turns, including new and augmented data
⚖️ RLSTACK contains 1 million samples of DPO preference pairs
🥇 Select stack contains 1.5 million samples of the top-scoring response
🔄 intend to update and improve the dataset
🔓 Released under cc-by-4.0
🤗 Available on @huggingface

Kudos to the team at @alignment_lab and @HIVEDigitalTech for this release! I am looking forward to read and learn more about the creation process! 🤗

5

131

31

85

18K

toranb retweeted

Paraxial.io @paraxialio

about 2 years ago

New talk from @toranb, Adventures with Synthetic Data (lessons learned building a chatbot from my SMS dataset), presenting at the Denver Elixir Meetup! #myelixirstatus https://t.co/Rm3w1J5Klr

0

6

5

0

825

Toran Billups @toranb

about 2 years ago

My favorite podcast of 2024! @peterg021 absolutely levels the pod with such a unique blend of business and machine learning from his experience in product. Thanks for sharing in such detail, this content stretched me in a few dimensions 🤯

MLOps Community @mlopscommunity

about 2 years ago

Just wrapped up this super enlightening episode of the MLOps Community podcast featuring Peter Guagenti, a total tech guru who's really shaping the AI scene in software development. https://t.co/BSLh7aREoS

1

5

2

1

1K

0

1

0

473

Toran Billups @toranb

about 2 years ago

I had a blast with Gemma 7B this weekend using the latest bumblebee so I put together a single file example for those interested https://t.co/VZHdrNiZ73

1

12

2

1

829

Toran Billups @toranb

over 2 years ago

@yevkurtov I showed at the end of the video that you can use the f16 or quantized model from the command line. Are you asking about a specific inference platform perhaps?

1

0

30

Toran Billups @toranb

over 2 years ago

I had trouble converting Mistral 8B Pro to GGUF format recently so I recorded a short how-to for llama-cpp n00bs like myself. Check it out! 👇

1

2

0

4

502

Toran Billups @toranb

over 2 years ago

The next version of bumblebee is out and it's working great with Mistral 7B from HF using bf16 OOTB. It's great to see the platform moving forward with loads of improvements! https://t.co/wXdAKwMP4j

0

5

1

0

445

toranb retweeted

Charlie Holtz

@charlieholtz

over 2 years ago

Introducing YouTune: fine tune image models on YouTube videos.

> python tune.py <youtube-url>

• downloads video
• screenshots every 50 frames
• removes near duplicates
• fine tunes SDXL for you

https://t.co/j8VdErhpup

44

658

85

536

104K

Toran Billups @toranb

over 2 years ago

@edwarddonner @huggingface @wandb @AIatMeta @DigiDNA @JonKrohnLearns Thanks for sharing this fun idea! I've already got my dataset for this and taken 1 pass at instruction fine tuning but would love to see what prompt and dataset tweaks you made to get better performance 🤠

1

2

0

127

toranb retweeted

José Valim

@josevalim

over 2 years ago

Tomorrow marks 13 years since the first commit to the Elixir repo. And today we celebrate by announcing that Elixir is, officially, a gradually typed language: https://t.co/p92MkO69zC

33

2K

440

140

193K
