@harpomaxx@sigmoid.social @harpolabs - Twitter Profile

over 2 years ago

Since Mixture of Experts (MoE) LLMs are all the rage as of this weekend, thanks to the Mixtral-8x-7B release, here's a quick explainer. The figure below shows the architecture behind the Switch Transformer (https://t.co/6jowgQx0DV), a great intro to MoEs.

The model depicted in this figure uses 1 expert per token with 4 experts in total. Mixtral-8x-7B, on the other hand, consists of 8 experts and uses 2 experts per token.

Why MoEs? Combined, the 8 experts in a 7B model like Mixtral are still ~56B parameters. (Actually, it's less than 56B, because the MoE approach is applied only to the feed-forward layers, not the self-attention weight matrices. So, it's likely closer to 40-50B parameters.)

However, since the router sends each token to only 2 of the 8 experts, only a fraction of the parameters (instead of all ~56B) is used for each token in the forward pass, so training (and especially inference) will be much faster than for a traditional non-MoE model of the same total size.

In my AI and Open Source in 2023 article (https://t.co/C8SGseRHNZ) from approx. 2 months ago, I mentioned that "It will be interesting to see if MoE approaches can lift open-source models to new heights in 2024". It looks like Mixtral started this trend early, and I am sure that this is just the beginning :).

rasbt's tweet photo: the Switch Transformer architecture figure.
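To make the routing idea in the explainer concrete, here is a minimal pure-Python sketch of a top-k MoE layer and the total-versus-active parameter arithmetic. It is an illustration under stated assumptions, not Mixtral's or the Switch Transformer's actual implementation: the function names (`route_top_k`, `moe_layer`, `param_counts`), the toy "experts" that just scale their input, the choice to apply softmax over only the selected logits, and the round parameter numbers are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and return (index, weight) pairs.

    Assumption: weights come from a softmax over the selected logits only.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_layer(token, experts, router_weights, k):
    """Run one token through only its k selected experts and mix the outputs."""
    # One router logit per expert: dot product of a router row with the token.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    out = [0.0] * len(token)
    for idx, weight in route_top_k(logits, k):
        expert_out = experts[idx](token)  # the other experts are never called
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out


def param_counts(shared, per_expert, num_experts, k):
    """Return (total, active-per-token) parameter counts.

    `shared` covers weights every token uses (e.g. self-attention);
    only the expert weights are multiplied by the number of experts.
    """
    total = shared + num_experts * per_expert
    active = shared + k * per_expert
    return total, active


# Toy setup: 4 hypothetical experts, expert i scales its input by (i + 1).
experts = [lambda t, s=s: [s * x for x in t] for s in (1, 2, 3, 4)]
router_weights = [[1.0, 0.0], [3.0, 0.0], [2.0, 0.0], [0.0, 0.0]]

print(moe_layer([1.0, 0.0], experts, router_weights, k=2))
# Made-up round numbers (in billions), NOT Mixtral's real sizes.
print(param_counts(shared=2, per_expert=5, num_experts=8, k=2))
```

With `k=1` the same layer behaves like the one-expert-per-token setup described for the Switch Transformer figure; with `k=2` and 8 experts it mirrors the Mixtral-style configuration. `param_counts` shows why the total is below "8 times the dense size" (shared weights are counted once) and why the per-token cost depends on `k` rather than on the number of experts.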


harpolabs retweeted

Seba García @eldracote

almost 3 years ago

Today ⁦@verovaleros⁩ gave the talk “Four Key Problems in OSINT for Cyber Threat Intelligence” in ⁦@enisa_eu⁩ CTI conf. #cti


harpolabs retweeted

Seba García @eldracote

almost 3 years ago

"Researchers Pre-trained LLM Agents Acting as Human Penetration Testers" Our work w/@mrigaki @ondrej_lukas @harpolabs and me, as part of the Aidojo w/@muni_cz was covered in @The_Cyber_News https://t.co/1kSu8qgjfV @StratosphereIPS #RL #agents #automaticpentest


harpolabs retweeted

Seba García @eldracote

almost 3 years ago

Our work with @MurisSladic @verovaleros @harpolabs in LLM honeypots was highlighted in @The_Cyber_News "shelLM – A New AI-Based Honeypot to Engage Attackers as a Real System" @StratosphereIPS https://t.co/0PbysS49Sx via @The_Cyber_News



harpolabs retweeted

Maria Rigaki @mrigaki

almost 3 years ago

🚨New Paper Alert! In our latest research we used LLMs as reinforcement learning red team agents: "Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments" https://t.co/LmMHoUlHYh with @ondrej_lukas @harpolabs @eldraco #AI #LLM #ML #infosec


harpolabs retweeted

Max Kuhn @topepos

almost 3 years ago

@predict_addict I'm talking on this at #PositConf2023 in Chicago (https://t.co/vAkF2WBi7y) in about 2 weeks!


harpolabs retweeted

__veronica__ @verovaleros

almost 3 years ago

It’s time to admit that we were all overly optimistic during COVID, thinking that organizations would finally understand the “remote work” culture. Not only is it back to the office, but also back to paper and in-person meetings. We thought we could make a change and we lost. #fail


harpolabs retweeted

Maria Rigaki @mrigaki

almost 3 years ago

We are happy to announce that our paper with @eldracote was accepted in ESORICS 2023: "The Power of MEME: Adversarial Malware Creation with Model-Based Reinforcement Learning" https://t.co/MhtHQ4x5cf @StratosphereIPS @ctu_cs #ai #adversarial #MachineLearning #malware #security


harpolabs retweeted

Seba García @eldracote

almost 3 years ago

Thanks for the kits and the workshop! They helped us get into the community and have fun!


harpolabs retweeted

Leo Oliva @loliva

almost 3 years ago

From the new Media Lab at @UNCUYO, we have just launched a podcast dedicated to artificial intelligence. It's called "Inteligencia natural" and here are the first 3 episodes. https://t.co/0HAtJ6CcHV


harpolabs retweeted

Magdalena Day @magdalenaday

almost 3 years ago

The Harpos are taking over the world @harpolabs 😂😂 #LiliFiallo very interesting art and AI #podcasts


harpolabs retweeted

Seba García @eldracote

about 3 years ago

Join us in Las Vegas for our @BlackHatEvents hands-on Advanced Malware Traffic Analysis training with @verovaleros! Learn how to detect botnets, ransomware, lateral movement, and more! Learn to detect attacks w/machine learning tools! #BlackHat #BHUSA https://t.co/LmYEnqG9Dz


harpolabs retweeted

Cancu Rodriguez 🇦🇷 @CancuCS

about 3 years ago

Automated Code Reviews by ChatGPT through GH Actions? Sure! Learn how to do it at https://t.co/VdDh6eRJ0o #chatgpt #githubactions #codereview #pullrequest #llm


harpolabs retweeted

Seba García @eldracote

about 3 years ago

Our paper w/@MasarahClouston "On the dynamics behind profit-driven cybercrime from contextual factors to perceived group structures, and the workforce at the periphery" was finally published! I'm so proud and happy for this collaboration! #cybercrime https://t.co/oWrgygd6Ky


harpolabs retweeted

Hugging Face

@huggingface

about 3 years ago

Let's gooo! 🔥 We're happy to introduce two new official Space Templates in collaboration with @posit_pbc: Build and share your R and Python Shiny apps directly on the Hub! 🥳


@harpomaxx@sigmoid.social @harpolabs

about 3 years ago

Mornings are made for reading, nights are made for coding.🤓 ... Except when you have kids 🤷🏻‍♀️


harpolabs retweeted

clem 🤗

@ClementDelangue

about 3 years ago · Normandy Shores

I believe we need open-source alternatives to ChatGPT for more transparency, inclusivity, accountability and distribution of power. Excited to introduce HuggingChat, an open-source early prototype interface, powered by OpenAssistant, a model that was released a few weeks ago.


harpolabs retweeted

Nicolás Wolovick ⭐⭐⭐ @nwolovick

about 3 years ago

The #SAHTI2023 of the #52JAIIO @jaiio_oficial is getting ready! @bmassare from CABA, Karina Bianculli from MdP and I from Cba are putting together a Symposium on Histories, Technology and Informatics that is going to be great. You have until May 1 to submit your abstract! We look forward to seeing you.


@harpomaxx@sigmoid.social @harpolabs

about 3 years ago

Another collaboration with visual artist Lili Fiallo. Using #dreambooth we fine-tuned #stablediffusion to incorporate Lili's toy art style. Incredibly simple using the diffusers and gradio libraries. https://t.co/7TK1n6tTDj https://t.co/1TXvs5iQop