Stewart Whiting @stewhir - Twitter Profile

stewhir retweeted

about 2 years ago

If you're into GPU clusters, @imbue_ai has just published this very detailed log of how they built a 512-node GPU cluster from the ground up. https://t.co/nctLNs5oru

They open-sourced the tools they created: https://t.co/9abJSLqhl0

Thanks to @slippylolo for the heads up.

Added to: https://t.co/1QiV6zRkvj

P.S. It looks like they used FSDP/ZeRO to train a 70B model - Yay!

stewhir retweeted

Marc Randolph

@marcrandolph

about 2 years ago

Stop thinking you need an MBA. Stop thinking you need to be in Silicon Valley. Stop thinking you need investors. What you need is a tolerance for risk. You need passion. And above all else, you need an idea.

Stewart Whiting @stewhir

about 2 years ago

Garbage in, garbage out with LLMs. Knowledge has to come from somewhere 🤷

Peter Yang

@petergyang

about 2 years ago

Google AI overview suggests adding glue to get cheese to stick to pizza, and it turns out the source is an 11 year old Reddit comment from user F*cksmith 😂 https://t.co/uDPAbsAKeO

stewhir retweeted

Andrew McCalip

@andrewmccalip

almost 3 years ago

Meissner effect or bust: Day 8.5 We made the rocks

stewhir retweeted

Simon Menashy

@simonmenashy

almost 3 years ago

I penned some thoughts on AI: all the *layers* where we're seeing innovation, and what kinds of businesses I expect to see built.
- Foundational models
- Specialised models
- Vertical use cases
- The orchestration layer
- Supporting tools and enablers
https://t.co/AWGQdLohEg

stewhir retweeted

Amy Lewin @amyrlewin

almost 3 years ago

NEW: UK government unveils plan to direct £75bn from pension funds to startups. Hunt has nine of the UK's biggest private pension schemes already signed up. https://t.co/5fke5hOoZp

stewhir retweeted

Rishi Sunak

@RishiSunak

about 3 years ago

Great news that @a16z – one of the world’s leading tech investment firms, is opening a new base here in London.

Another huge vote of confidence in the UK as a place to build and grow tech businesses of the future. https://t.co/caP6nNXfow

stewhir retweeted

Tim Dettmers

@Tim_Dettmers

about 3 years ago

QLoRA: 4-bit finetuning of LLMs is here! With it comes Guanaco, a chatbot on a single GPU, achieving 99% ChatGPT performance on the Vicuna benchmark:

Paper: https://t.co/J3Xy195kDD
Code+Demo: https://t.co/SP2FsdXAn5
Samples: https://t.co/q2Nd9cxSrt
Colab: https://t.co/Q49m0IlJHD
https://t.co/UJcowpfhpH

stewhir retweeted

Aran Komatsuzaki

@arankomatsuzaki

about 3 years ago

QLoRA: Efficient Finetuning of Quantized LLMs

Presents an efficient finetuning approach that reduces memory usage enough to finetune a 65B model on a single 48GB GPU while preserving full 16-bit finetuning task performance.

repo: https://t.co/Szt1col8Sc
https://t.co/2jqKm7S505
abs: https://t.co/PHyrtVP4em

stewhir retweeted

Tim Dettmers

@Tim_Dettmers

about 3 years ago

The 4-bit bitsandbytes private beta is here! Our method, QLoRA, is integrated with the HF stack and supports all models. You can finetune a 65B model on a single 48 GB GPU. This beta will help us catch bugs and issues before our full release. Sign up: https://t.co/XBAQv76laa

stewhir retweeted

Simon Willison

@simonw

about 3 years ago

Leaked Google document: “We Have No Moat, And Neither Does OpenAI” The most interesting thing I've read recently about LLMs - a purportedly leaked document from a researcher at Google talking about the huge strategic impact open source models are having https://t.co/q2lsjTHKGS

stewhir retweeted

Tim Dettmers

@Tim_Dettmers

about 3 years ago

@karpathy Super excited to push this even further:
- Next week: bitsandbytes 4-bit closed beta that allows you to finetune 30B/65B LLaMA models on a single 24/48 GB GPU (no degradation vs full fine-tuning in 16-bit)
- Two weeks: Full release of code, paper, and a collection of 65B models

Stewart Whiting @stewhir

about 3 years ago

@JohnNosta @OpenAI The more the better, esp for complex tasks. Just launched today, https://t.co/nB03rbbvgW may be a helpful tool for scaling data generation and the fine tuning pipeline

Stewart Whiting @stewhir

about 3 years ago

@VishalGulati_ Biggest opportunity for value I see right now tho is open GPT-3.5+ level models which can run locally (whether self-trained or open base models). The OpenAI API pricing model is hindering all kinds of use cases (e.g. ambient, large batch processing, confidential data etc.)

Stewart Whiting @stewhir

about 3 years ago

@VishalGulati_ There’s prob scope for domain-specific adaptation during LLM training wrt bias, citation recognition etc. Main benefit to self LLM training I can see for medicine is for med device regs.

Stewart Whiting @stewhir

about 3 years ago

@VishalGulati_ I quite liked this talk by Peter Lee for outlining a few good use cases https://t.co/XbW4j67pYF

Stewart Whiting @stewhir

about 3 years ago

@VishalGulati_ Oh, if you haven’t already seen it - this book, due to be released shortly, may be interesting to help figure out where the value is https://t.co/Sqbw0ReYj9

stewhir retweeted