Jesse Spencer-Smith @jbspence - Twitter Profile

jbspence retweeted

Vanderbilt Soccer

@VandySoccer

6 months ago

🚨 Calling all @VanderbiltU employees! Show your Vanderbilt ID for 2️⃣ 𝐅𝐑𝐄𝐄 tickets to our Elite Eight match against TCU Saturday at 6:30 p.m. Help us 𝐏𝐀𝐂𝐊 𝐓𝐇𝐄 𝐏𝐋𝐄𝐗‼️

0

121

20

0

7K

jbspence retweeted

Vanderbilt Soccer

@VandySoccer

6 months ago

𝐇𝐄𝐀𝐃𝐄𝐃 𝐓𝐎 𝐓𝐇𝐄 𝐄𝐋𝐈𝐓𝐄 𝐄𝐈𝐆𝐇𝐓‼️

18

558

137

5

63K

Jesse Spencer-Smith @jbspence

over 1 year ago

@AstroKPJ @VUDataScience @ChayanChirps @OpenAI @LIGO Outstanding work by the team!

1

2

0

115

jbspence retweeted

Jim Fan

@DrJimFan

about 2 years ago

The upcoming Llama-3-400B+ will mark the watershed moment when the community gains open-weight access to a GPT-4-class model. It will change the calculus for many research efforts and grassroots startups. I pulled the numbers on Claude 3 Opus, GPT-4-2024-04-09, and Gemini. Llama-3-400B is still training and will hopefully get even better in the next few months. There is so much research potential that can be unlocked with such a powerful backbone. Expecting a surge in builder energy across the ecosystem!

79

2K

408

668

872K

Jesse Spencer-Smith @jbspence

about 2 years ago

@VanderbiltU This is an exciting day for Vanderbilt!

0

4

0

335

jbspence retweeted

Jim Fan

@DrJimFan

over 2 years ago

Very nice analysis on long context vs RAG. I believe the way of the future will be “soft” methods that interpolate between pure retrieval and pure long context: some form of spreading neural activations across a giant unstructured database.

15

229

33

115

62K

Jesse Spencer-Smith @jbspence

over 2 years ago

@GilbGonzales @VanderbiltU @VU_Provost @VU_Chancellor @MHSVanderbilt @LGBTPolicyLabVU @vuartsci @GilbGonzales won a well-deserved award!

0

2

0

72

jbspence retweeted

SpaceX

@SpaceX

over 2 years ago

Watch Falcon Heavy launch the USSF-52 mission to orbit https://t.co/zrk4JcZ9Pt

34

298

719

18

38K

jbspence retweeted

abhishek

@abhi1thakur

almost 3 years ago

LLAMA-v2 training successfully on Google Colab's free version! "pip install autotrain-advanced" 💥 Yes, you can also use your local machine!

27

1K

234

966

219K

Jesse Spencer-Smith @jbspence

almost 3 years ago

@ylecun Thrilling news that Llama-v2 is open source! The variety of ways to train and run inference is going to simplify the lives of so many in academia and industry alike. So many possibilities! Where to start??

0

126

Jesse Spencer-Smith @jbspence

almost 3 years ago

@cwolferesearch Flash Attention v1 was groundbreaking, and v2 is another huge advancement. Attention is at the heart of transformers and is the key feature that gives them their remarkable capabilities. It’s also where the greatest challenge lies, as it’s responsible for the huge compute cost!

0

1

0

115

jbspence retweeted

Cameron R. Wolfe, Ph.D.

@cwolferesearch

almost 3 years ago

FlashAttention-2 was released today, which is 5-9X faster than vanilla attention and 2X faster than FlashAttention-v1. Given that many of the top open-source LLMs leverage FlashAttention, this is an important advancement that can make existing models much more efficient during both training and inference. Here’s what you need to know…

What is FlashAttention? In its canonical form, self-attention is an O(N^2) operation, where N is the length of the input sequence. Many proposals try to achieve an approximation of self-attention that runs in O(N) time, but they achieve no wall-clock speedup. FlashAttention reformulates attention in an IO-aware manner to achieve significant wall-clock speedups, as well as an improvement in memory efficiency.

Lots of LLMs use it. Given that FlashAttention is very fast (i.e., ~2-4X faster) compared to a vanilla implementation, many transformer-based applications have adopted it. For example, recent open-source LLMs (such as Falcon and MPT models) use FlashAttention, leading them to be very fast at inference time (and more efficient during pre-training). For example, Falcon-40B is 5X faster at performing inference than GPT-3.

FlashAttention-2. A new update to FlashAttention has just been made available, called FlashAttention-2. This variant is about twice as fast as the original FlashAttention, and 5-9X faster than the original attention implementation. This improvement impacts both training and inference speed. The sources of these improvements are summarized by the quote below.

“We (1) tweak the algorithm to reduce the number of non-matmul FLOPs (2) parallelize the attention computation, even for a single head, across different thread blocks to increase occupancy, and (3) within each thread block, distribute the work between warps to reduce communication through shared memory.” - from FlashAttention-2 paper

The only caveat. Currently, the FlashAttention-2 implementation does not apply to all GPUs. However, the GPUs to which it does apply see a significant benefit; e.g., a 225 TFLOPs/s training speed can be achieved on an A100 using FlashAttention-2.
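To make the "IO-aware" idea above a little more concrete, here is a minimal pure-Python sketch of the block-wise (online softmax) trick that lets attention be computed without ever holding all N scores at once. It is an illustration under stated assumptions, not the FlashAttention kernel itself: the real implementation is a fused GPU kernel, and the function names `naive_attention` and `tiled_attention` and the `block_size` parameter are hypothetical names chosen for this example. The sketch handles a single query vector for brevity.

```python
import math


def naive_attention(q, K, V):
    """Attention output for one query: softmax(q.K^T / sqrt(d)) V.

    Materializes all N scores at once (the memory-hungry baseline).
    """
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, V)) / total
            for j in range(len(V[0]))]


def tiled_attention(q, K, V, block_size=2):
    """Same result as naive_attention, but K/V are visited block by block.

    Only one block of scores exists at a time; a running max and running
    sum rescale the partial output as each new block arrives.
    """
    d = len(q)
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * len(V[0])  # unnormalized weighted sum of value rows

    for start in range(0, len(K), block_size):
        k_block = K[start:start + block_size]
        v_block = V[start:start + block_size]
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in k_block]

        new_max = max(running_max, max(scores))
        # Rescale earlier contributions to the new max (0.0 on first block).
        scale = math.exp(running_max - new_max)
        running_sum *= scale
        acc = [a * scale for a in acc]

        for s, v in zip(scores, v_block):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * vj for a, vj in zip(acc, v)]

        running_max = new_max

    return [a / running_sum for a in acc]
```

Both functions still do O(N) work per query (O(N^2) overall), which matches the tweet's point: the gain comes from exact attention with less memory traffic, not from an approximation with lower asymptotic cost.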

6

171

31

93

70K

Jesse Spencer-Smith @jbspence

almost 3 years ago

@Gradio This will greatly accelerate development of AI applications! Very excited to try this out!

0

1

0

260

jbspence retweeted

Vanderbilt University @VanderbiltU

almost 3 years ago

Scientists led by Vanderbilt astronomer Stephen Taylor have identified evidence of slowly undulating #GravitationalWaves passing through our galaxy. Learn more about VU researchers' contributions to the exciting @NANOGrav findings: https://t.co/ihJjTYB4PS

1

33

8

1

15K

jbspence retweeted

Karan Jani

@AstroKPJ

almost 3 years ago

Let the revolution begin #LILA 🌘

2

64

9

0

6K

jbspence retweeted

Vanderbilt School of Nursing @VanderbiltNurse

about 3 years ago

Assistant Prof Alvin Jeffery was accepted into a NIDA-sponsored entrepreneurial program at the intersection of #informatics, software development, #genetics and substance use disorder called L-SPRINT @babson College. @UCDavisHealth @NIDAnews

0

14

3

0

672

jbspence retweeted

Thomas Wolf

@Thom_Wolf

about 3 years ago

The license of the Falcon 40B model has just been changed to… Apache-2 which means that this model is now free for any usage including commercial use (and same for the 7B) 🎉

13

700

136

172

127K

jbspence retweeted

Binyuan Hui

@huybery

about 3 years ago

In-context learning as the mysterious ability in LMs. We propose ✨Deep-thinking✨ to boost ICL by iterative forward tuning. It is possible to tune LMs without backpropagation! 🤯 Paper: https://t.co/vX1BdqwOCe Gradio Demo: https://t.co/okQK7mxMgB

1

214

55

133

53K

jbspence retweeted

Tim Dettmers

@Tim_Dettmers

about 3 years ago

QLoRA: 4-bit finetuning of LLMs is here! With it comes Guanaco, a chatbot on a single GPU, achieving 99% ChatGPT performance on the Vicuna benchmark: Paper: https://t.co/J3Xy195kDD Code+Demo: https://t.co/SP2FsdXAn5 Samples: https://t.co/q2Nd9cxSrt Colab: https://t.co/Q49m0IlJHD

81

4K

903

2K

2M

jbspence retweeted

Together AI @togethercompute

about 3 years ago

The first RedPajama models are here! The 3B and 7B models are now available under Apache 2.0 license, including instruction-tuned and chat versions! This project demonstrates the power of the open-source AI community with many contributors ... 🧵 https://t.co/msO4afBQEK

16

836

212

377

518K
