Alexander Derve @AlexanderDerve - Twitter Profile

Alexander Derve

@AlexanderDerve

6 months ago

@AISafetyMemes amazing

0

97

Alexander Derve

@AlexanderDerve

6 months ago

@burkov Pretty insightful. I also think a lot of RL training is focused on straightforward tasks where there is a defined problem and solution. Meanwhile, a lot of these real-world tasks involve a ton of unknowns and require trial and error, and this process goes well outside the training.

0

2

0

223

AlexanderDerve retweeted

Michael Antonelli

@BullandBaird

6 months ago

No bigger lie

89

48K

3K

596

880K

Alexander Derve

@AlexanderDerve

7 months ago

@AskPerplexity Can confirm, my day was ruined

0

1

0

13

9 months ago

@NousResearch Why use Llama for the base?

0

2

0

568

Alexander Derve

@AlexanderDerve

about 1 year ago

@cuemewch If you have central AC but aren't getting enough cold air in your room I'd try out a register booster, they work pretty alright. https://t.co/2jMlIXOHAN

1

3

0

253

AlexanderDerve retweeted

💗 Lunie 💗 @Back2Batk

about 1 year ago

AI voices will never surpass the power of sentence mixing

307

69K

11K

13K

1M

AlexanderDerve retweeted

non aesthetic things

@PicturesFoIder

about 1 year ago

If the keyboard button was a person

192

34K

4K

5K

2M

Alexander Derve

@AlexanderDerve

about 1 year ago

@cocktailpeanut @deepbeepmeep @TXhunyuan Any preference between wan2.1 and HunyuanVideo?

1

0

344

AlexanderDerve retweeted

Massimo

@Rainmaker1973

about 1 year ago

The little joy of this dog who figured out how to have fun on an escalator. https://t.co/5qF0SZbQ5J

60

2K

131

132

178K

AlexanderDerve retweeted

Qwen

@Alibaba_Qwen

about 1 year ago

Today, we release QwQ-32B, our new reasoning model with only 32 billion parameters that rivals cutting-edge reasoning models, e.g., DeepSeek-R1. Blog: https://t.co/jpNEx0Ck8p HF: https://t.co/h91przQmoP ModelScope: https://t.co/p0ztmZpWIZ Demo: https://t.co/sxVVRFwunC Qwen Chat: https://t.co/bg4tAU1p74 This time, we investigate recipes for scaling RL and have achieved some impressive results based on our Qwen2.5-32B. We find that RL training can continuously improve performance, especially in math and coding, and we observe that the continuous scaling of RL can help a medium-size model achieve competitive performance against a gigantic MoE model. Feel free to chat with our new models and provide us feedback!