Alejandro @Alexrs95 - Twitter Profile

Alejandro @alexrs95

5 days ago

@rikelhood Congratulations! Great report, I'm really enjoying reading it!

0

1

0

21

Alejandro @alexrs95

about 2 months ago

@XMihura 4o-mini can be useful for some things: it's fast, cheap, and good at certain tasks. Not everything requires SOTA :)

1

0

145

Alejandro @alexrs95

about 2 months ago

@katomubirusteve @Cohere_Labs 🎉🔥

1

0

61

Alejandro @alexrs95

2 months ago

@sarahookr I think they missed a team building activity on a boat spinning around for a while. I’ve heard that can bond teams as well 🤣

0

45

alexrs95 retweeted

Cohere Labs

@Cohere_Labs

3 months ago

Very excited to support the release of Cohere Transcribe, available with a fully open license. One more step towards making breakthroughs accessible to people worldwide. 🌏

0

27

7

2K

alexrs95 retweeted

Pierre Richemond 🇪🇺

@TheOneKloud

3 months ago

Blog post: https://t.co/5rsVaSmuT2 Technical blog post: https://t.co/alfBpoE9eh Model card: https://t.co/QyYJos3KmB

2

35

10

1

2K

alexrs95 retweeted

Cohere Labs

@Cohere_Labs

3 months ago

4 Days Left to Join Expedition Tiny Aya! 🚀 Here’s what you need to know: 🌟 Join a team or pitch your own idea — Connect with like-minded researchers and mentors. 🤝 Get hands-on support — 20+ research mentors from Cohere and Cohere Labs are ready to guide your project. 💡 Explore 20+ project ideas — Spark your creativity and contribute to impactful open science. Learn more and join a team before March 8th: https://t.co/a13GRWyuo1

0

14

1

12

1K

Alejandro @alexrs95

4 months ago

@rasbt Thanks for sharing this Sebastian! If you have any questions, we are happy to answer and chat! 😄

0

7

0

661

alexrs95 retweeted

Sebastian Raschka

@rasbt

4 months ago

Tiny Aya reimplementation From Scratch!

Have been reading through the technical reports of the recent wave of open-weight LLM releases (more on that soon).
Tiny Aya (2 days ago) was a bit under the radar. Looks like a nice, small 3.35B model with strongest multilingual support of that size class. Great for on-device translation tasks.

Just did a from-scratch implementation here: https://t.co/6KEV0DfVQu

Architecture-wise, Tiny Aya is a classic decoder-style transformer with a few noteworthy modifications (besides the obvious ones like SwiGLU and Grouped Query Attention):

1. Parallel transformer blocks. A parallel transformer block computes attention and MLP from the same normalized input, then adds both to the residual in one step. I assume this is to reduce serial dependencies inside a layer to improve computational throughput.

2. Sliding window attention. Specifically, it uses a 3:1 local:global ratio similar to Arcee Trinity and Olmo 3. The window size is also 4096. Also, similar to Arcee, the sliding window layers use RoPE whereas the full attention layers use NoPE.

3. LayerNorm. Most architectures moved to RMSNorm as it's computationally a bit cheaper and performs well. Tiny Aya is keeping it more classic with a modified version of LayerNorm (the implementation here is like standard LayerNorm but without shift, i.e., bias, parameter).
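
The three modifications listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the Tiny Aya code or the linked reimplementation: every function name is hypothetical, the `eps` value is a guess, placing the global layer last in each group of four is an assumption (the tweet gives only the 3:1 ratio), and so is the exact window boundary.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def layer_norm_no_bias(x: Vector, weight: Vector, eps: float = 1e-5) -> Vector:
    """LayerNorm with a learned scale but no shift (bias) term.

    eps is an assumed value, not taken from the source.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [w * (v - mean) * inv_std for v, w in zip(x, weight)]


def parallel_block(
    x: Vector,
    weight: Vector,
    attn: Callable[[Vector], Vector],
    mlp: Callable[[Vector], Vector],
) -> Vector:
    """Attention and MLP read the same normalized input.

    Both outputs are added to the residual in a single step, so neither
    branch waits on the other.
    """
    h = layer_norm_no_bias(x, weight)
    a = attn(h)
    m = mlp(h)
    return [xi + ai + mi for xi, ai, mi in zip(x, a, m)]


def layer_schedule(num_layers: int, local_per_global: int = 3) -> List[Tuple[str, str]]:
    """Per-layer (attention type, position encoding) for a 3:1 local:global mix.

    Assumption: the global layer is the last one in each group of four.
    """
    schedule = []
    for i in range(num_layers):
        if (i + 1) % (local_per_global + 1) == 0:
            schedule.append(("global", "nope"))
        else:
            schedule.append(("sliding_window", "rope"))
    return schedule


def sliding_window_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Causal mask: query i may attend to key j when i - window < j <= i.

    Assumption: the window counts the current token.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]
```

For example, `layer_schedule(8)` marks layers 4 and 8 as global attention without positional encoding and the other six as sliding-window layers with RoPE, and `sliding_window_mask(seq_len, 4096)` corresponds to the window size mentioned in the tweet.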

22

1K

161

835

68K

Alejandro @alexrs95

4 months ago

@QGallouedec 🤗

0

129

alexrs95 retweeted

Quentin Gallouédec @QGallouedec

4 months ago

Cohere just released 🌿 Tiny Aya, a fully open-source 3B model that speaks 70+ languages. But there’s a catch: no agent capabilities. Is it hard to turn Tiny Aya into an agent? Here's a hands-on guide to train it and get the first massively multilingual open agent. Link below: https://t.co/PycRIqjIBa

3

42

6

22

5K

Alejandro @alexrs95

4 months ago

@TomGorse @pepicrft @SimonHoiberg Spain has many flaws and things to improve for sure, but for the sake of a healthy, constructive conversation, calling everything we find slightly annoying a “third world country” thing does not help. I guess you’ve never been to a third world country.

1

0

41

alexrs95 retweeted

Marzieh Fadaee @mziizm

4 months ago

We're hiring a Research Engineer who understands models at a deep technical level and is excited to take responsibility across the full lifecycle. If you're excited to join a small team driving research with real-world impact, we'd love to hear from you. https://t.co/2Jdf2GEWcL

15

488

35

382

45K

Alejandro @alexrs95

4 months ago

Tiny Aya can also run in your browser! 🌿🥖

Xenova

@xenovacom

4 months ago

Cohere Labs just released Tiny Aya on Hugging Face: a collection of 4 open-weight multilingual LLMs optimized for over 70 languages. At just 3.35B parameters, they are perfect for on-device use cases, and can even run 100% locally in your browser on WebGPU! Try out the demo! 👇

5

110

17

58

9K

1

18

2

4

2K

Alejandro @alexrs95

4 months ago

@SergioPaniego @Cohere_Labs Interesting example! Thank you very much! 🤗

1

2

0

1

92

alexrs95 retweeted

Sergio Paniego

@SergioPaniego

4 months ago

Tiny Aya 🌿 just dropped from @Cohere_Labs, a really powerful multilingual small model!

To celebrate, we cooked up fresh resources to train it for tool calling 🔧

> Free Google Colab guide
> Standalone training script https://t.co/HoQTjSG3e0

2

23

3

5

1K

Alejandro @alexrs95

4 months ago

@RicardoMonti9 @agcrnz @KaleighMentzer Really interesting work! 👏 And thanks for including Tiny Aya, that was fast! I’m curious why you chose tiny-aya-base and not one of the instruction-tuned models such as tiny-aya-global? Thanks!

1

4

0

117

Alejandro @alexrs95

4 months ago

@sarahookr Thank you! 💙

0

2

0

36

Alejandro @alexrs95

4 months ago

Working on Tiny Aya has been incredibly rewarding, fun, and insightful. I hope this tiny model can make a huge impact and bring AI to more people, in their native languages. Let's keep exploring ✨

Cohere Labs

@Cohere_Labs

4 months ago

Introducing ✨Tiny Aya✨, a family of massively multilingual small language models built to run where people actually are. Tiny Aya delivers strong multilingual performance in 70+ global languages in a 3.35B parameter model, efficient enough to run locally, even on a phone.

28

849

156

498

192K

4

34

7

1

1K

Alejandro @alexrs95

4 months ago

Check out our tech report! https://t.co/8pYXJI7Kgm

0

7

0

1

125
