Patrick @patrick_physics - Twitter Profile

Patrick @patrick_physics

almost 3 years ago

me watching oppenheimer

0

2

0

120

Patrick @patrick_physics

almost 3 years ago

Who evals the evals?

0

70

patrick_physics retweeted

Yann LeCun

@ylecun

almost 3 years ago

This is huge: Llama-v2 is open source, with a license that authorizes commercial use! This is going to change the landscape of the LLM market. Llama-v2 is available on Microsoft Azure and will be available on AWS, Hugging Face and other providers. Pretrained and fine-tuned models are available with 7B, 13B and 70B parameters. Llama-2 website: https://t.co/PKrrXgHdem Llama-2 paper: https://t.co/aINNrXNhMb A number of personalities from industry and academia have endorsed our open source approach: https://t.co/N7HwgW9Suh

384

15K

3K

4K

4M

Patrick @patrick_physics

almost 3 years ago

@typedfemale model extraction

0

1

0

307

Patrick @patrick_physics

almost 3 years ago

BREAKING CHANGE: can we all calm down

eval @giant_hornet

almost 3 years ago

@ryxcommar What many people fail to recognize is that code is editable

0

1

0

4K

0

1

0

99

Patrick @patrick_physics

almost 3 years ago

@sh_reya @vboykis tired: attention is all you need wired: xgboost is all you need h/t @tunguz

0

64

Patrick @patrick_physics

almost 3 years ago

@sh_reya @vboykis this, 100%. rumours of the demise of data science are greatly exaggerated

0

1

0

1

137

Patrick @patrick_physics

almost 3 years ago

@Tim_Dettmers Thanks a lot for the hard work on the CUDA kernels. I'm curious if batch_size=1 is optimal for this method, or if in the future it may be possible to speed up batched inference at 4 bit also?

0

1

0

416

Patrick @patrick_physics

almost 3 years ago

Supersymmetry 2.0

OpenAI

@OpenAI

almost 3 years ago

We need new technical breakthroughs to steer and control AI systems much smarter than us. Our new Superalignment team aims to solve this problem within 4 years, and we’re dedicating 20% of the compute we've secured to date towards this problem. Join us! https://t.co/cfJMctmFNj

460

4K

684

611

3M

0

83

patrick_physics retweeted

(((ل()(ل() 'yoav))))👾

@yoavgo

about 3 years ago

there is a huge difference between knowledge and expertise. individual humans are not so great with acquiring and retaining knowledge, but are remarkable at developing deep expertise. language models seem to be the exact opposite. they are amazing in the knowledge part.

3

83

14

11K

Patrick @patrick_physics

about 3 years ago

*cries in bias-variance tradeoff*

0

95

Patrick @patrick_physics

about 3 years ago

@visakanv eggs

1

0

281

Patrick @patrick_physics

about 3 years ago

@chrisalbon AFAIK only important for deep neural networks, so I would keep the card as a good lesson for everything else. Would still love to see a card with an intuitive explanation for double descent

1

0

49

patrick_physics retweeted

merve

@mervenoyann

about 3 years ago

I think it's a horrible idea to ask for licenses to train models, this will reduce number of open-source models and will give big corporations an unfair competitive advantage to train closed-source models which will not be transparent at all and companies will have to sacrifice a lot of privacy since they don't own their models. Monopolies aren't good. If a system affects people, people have the right to know more about it (biases and so on), given we know companies are using these models in all kinds of bad ways (they even use it for candidate screening 😔)

9

119

14

5

15K

Patrick @patrick_physics

about 3 years ago

I get the desire to regulate but placing restrictions on open source orgs is not the right way. Whoever deploys the open source model should be responsible instead

Jeremy Howard

@jeremyphoward

about 3 years ago

"Any model made available in the EU, without first passing extensive, and expensive, licensing, would subject companies to massive fines of the greater of €20,000,000 or 4% of worldwide revenue. Opensource developers, and hosting services such as GitHub... would be liable"

82

1K

216

363

2M

0

81

Patrick @patrick_physics

about 3 years ago

@yacineMTB Does this also help get the right answers out of chatGPT?

0

1

0

33

Patrick @patrick_physics

about 3 years ago

@chrisalbon Looking for a cause?

0

20

Patrick @patrick_physics

about 3 years ago

@Tim_Dettmers @karpathy How is 4-bit fine tuning stable? I guess I will wait and see :)

0

344

Patrick @patrick_physics

about 3 years ago

open source is catching up to closed models quicker than expected, although still a few unknowns

Andrej Karpathy

@karpathy

about 3 years ago

Oops haven't tweeted too much recently; I'm mostly watching with interest the open source LLM ecosystem experiencing early signs of a cambrian explosion. Roughly speaking the story as of now:
1. Pretraining LLM base models remains very expensive. Think: supercomputer + months.
2. But finetuning LLMs is turning out to be very cheap and effective due to recent PEFT (parameter efficient training) techniques that work surprisingly well, e.g. LoRA / LLaMA-Adapter, and other awesome work, e.g. low precision as in bitsandbytes library. Think: few GPUs + day, even for very large models.
3. Therefore, the cambrian explosion, which requires wide reach and a lot of experimentation, is quite tractable due to (2), but only conditioned on (1).
4. The de facto OG release of (1) was Facebook's sorry Meta's LLaMA release - a very well executed high quality series of models from 7B all the way to 65B, trained nice and long, correctly ignoring the "Chinchilla trap". But LLaMA weights are research-only, been locked down behind forms, but have also awkwardly leaked all over the place... it's a bit messy.
5. In absence of an available and permissive (1), (2) cannot fully proceed. So there are a number of efforts on (1), under the banner "LLaMA but actually open", with e.g. current models from @togethercompute, @MosaicML ~matching the performance of the smallest (7B) LLaMA model, and @AiEleuther , @StabilityAI nearby.
For now, things are moving along (e.g. see the 10 chat finetuned models released last ~week, and projects like llama.cpp and friends) but a bit awkwardly due to LLaMA weights being open but not really but still. And most interestingly, a lot of questions of intuition remain to be resolved, e.g. especially around how well finetuned model work in practice, even at smaller scales.

145

6K

912

3K

1M

0

1

0

115

Patrick @patrick_physics

about 3 years ago

This happens if you tell python to f-off too much

Chris Albon

@chrisalbon

about 3 years ago

Them: "What did you accomplish today?" Me: "Well, the 'f' in the f-string was INSIDE the quotes instead of outside. So that was like... two hours."

16

619

38

9

81K

0

83
