Makemeflow @florianecm - Twitter Profile

almost 3 years ago

ChatGPT can now browse the internet to provide you with current and authoritative information, complete with direct links to sources. It is no longer limited to data before September 2021.

2K

50K

9K

4K

15M

florianecm retweeted

Weights & Biases

@wandb

almost 3 years ago

🔥 Just Released: Free guide to 💪 training LLMs, including techniques for parallelization, tokenization strategies and their tradeoffs, plus how much data you'll actually need 🤯

28

1K

143

746

8M

florianecm retweeted

Itamar Golan 🤓

@ItakGol

almost 3 years ago

This is scary. 😱

The MOTHER of all LLM Jailbreaks & Prompt injections.

"Universal and Transferable Adversarial Attacks on Aligned Language Models" 🌐🔒

--- TL;DR ---
This research & code introduces a fascinating method called "Universal and Transferable Adversarial Attacks on Aligned Language Models," which automatically generates a potentially unlimited number of suffixes for any prompt to cause aligned language models to produce objectionable behaviors. 🤖🚨

--- Background ---
Previous attempts at jailbreaking language models have relied on manual crafting, which could be easily patched by vendors. In contrast, this method presents an automated approach called GCG that constructs an endless array of jailbreaks with high reliability, even for novel instructions and models. This makes it infeasible to address the vulnerabilities by manual patching. 🛡️💻

--- The Method ---
1. Initial affirmative responses: To induce objectionable behavior, the attack pushes the model to begin its answer to a harmful query affirmatively, starting with "Sure, here is (content of the query)." This switches the model into a mode where it generates objectionable content immediately after.

2. Combined greedy and gradient-based discrete optimization: The adversarial suffix optimization is challenging due to the need to optimize over discrete tokens. The method uses gradients at the token level to identify promising single-token replacements, evaluates the loss of the candidate tokens, and selects the best substitution. It shares similarities with the AutoPrompt approach but explores all possible tokens for replacement at each step, enhancing effectiveness.

3. Robust multi-prompt and multi-model attacks: To ensure reliable attacks, the method generates a single suffix string that induces negative behavior across various prompts and multiple models. The attack is tested on different models, such as Vicuna-7B/13B and Guanaco-7B. 🎯🎮

--- Evaluation ---
This GCG approach achieves an impressive attack success rate, with 100% on Vicuna-7B and 88% on Llama-2-7B-Chat, far surpassing the success rates of prior work. 📈🏆

--- Transferability ---
That part is the real magic of this work. ✨

The research reveals that the attacks generated by this approach can transfer effectively to other language models, even those using entirely different tokens to represent the same text, different training procedures, and different training datasets...

Whatttttt?

Adversarial examples designed for Vicuna-7B can transfer to larger Vicuna models. Apparently, those that fool both Vicunas can transfer to Pythia, Falcon, Guanaco, and most importantly also to GPT-3.5, GPT-4, and PaLM-2, leading to harmful instructions being followed over 60% of the time!!! 😮🔄🧙‍♂️

This is a huge discovery.

--- Conclusion ---
We are left with more questions than answers. ❓

One of the crucial aspects to explore is whether models can be explicitly fine-tuned to avoid such attacks through adversarial training. The robustness of models against these attacks and their generative capabilities require further investigation.

Moreover, additional alignment training might partially address the issue, and exploring mechanisms in pre-training to prevent such behavior from arising initially is essential. 🕵️‍♀️🛠️

--- Links ---
Website - https://t.co/aRllNUA9ue
Paper - https://t.co/MxwsTbaM2o
Code - https://t.co/Qi4FZbEUmw
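
The combined greedy and gradient-based search described in step 2 of the tweet above can be illustrated on a toy problem. The sketch below is not the authors' code and involves no language model: the objective (matching a weighted sum of random token embeddings to a random target vector), the helper names (`loss`, `token_gradients`, `gcg_step`, `run`), and all sizes are assumptions made purely for illustration. It shows only the general pattern: use the gradient with respect to a one-hot relaxation of each position to shortlist replacement tokens, evaluate a batch of single-token swaps exactly, and keep the best one. As a simplification, a swap is kept only if it lowers the loss.

```python
import random

def _mix(tokens, emb, weights, dim):
    # Weighted sum of the embeddings of the current token sequence.
    return [sum(w * emb[t][d] for t, w in zip(tokens, weights)) for d in range(dim)]

def loss(tokens, emb, weights, target):
    """Toy objective: squared distance between the mixed embedding and a target."""
    mix = _mix(tokens, emb, weights, len(target))
    return sum((m - g) ** 2 for m, g in zip(mix, target))

def token_gradients(tokens, emb, weights, target):
    """Gradient of the loss w.r.t. a one-hot relaxation of each position: grads[i][v]."""
    mix = _mix(tokens, emb, weights, len(target))
    resid = [2.0 * (m - g) for m, g in zip(mix, target)]
    per_token = [sum(r * e for r, e in zip(resid, row)) for row in emb]
    return [[w * g for g in per_token] for w in weights]

def gcg_step(tokens, emb, weights, target, rng, top_k=4, batch=16):
    """One greedy step: shortlist by gradient, score swaps exactly, keep the best."""
    grads = token_gradients(tokens, emb, weights, target)
    vocab = range(len(emb))
    # Per position, the top_k tokens whose gradient entry is most negative.
    candidates = [sorted(vocab, key=lambda v: g[v])[:top_k] for g in grads]
    best, best_loss = tokens, loss(tokens, emb, weights, target)
    for _ in range(batch):
        pos = rng.randrange(len(tokens))           # random position to modify
        trial = list(tokens)
        trial[pos] = rng.choice(candidates[pos])   # random shortlisted token
        trial_loss = loss(trial, emb, weights, target)
        if trial_loss < best_loss:                 # simplification: accept only improvements
            best, best_loss = trial, trial_loss
    return best, best_loss

def run(steps=30, seed=0):
    rng = random.Random(seed)
    vocab_size, dim, length = 20, 5, 6             # hypothetical toy sizes
    emb = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]
    weights = [1.0 / (i + 1) for i in range(length)]
    target = [rng.uniform(-1, 1) for _ in range(dim)]
    tokens = [rng.randrange(vocab_size) for _ in range(length)]
    history = [loss(tokens, emb, weights, target)]
    for _ in range(steps):
        tokens, current = gcg_step(tokens, emb, weights, target, rng)
        history.append(current)
    return tokens, history

if __name__ == "__main__":
    final_tokens, history = run()
    print(final_tokens, history[0], history[-1])
```

Because a step keeps the current sequence whenever no sampled swap improves it, the recorded loss never increases from one step to the next. The gradient is only a first-order guide for a discrete swap, which is why each candidate is re-scored with the exact loss before being accepted.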

39

1K

308

1K

238K

florianecm retweeted

Cohere

@cohere

almost 3 years ago

Dive into the fascinating world of Transformer models! Luis Serrano breaks down the architecture & functionality of these ML marvels in this blog. You'll learn how they maintain context, generate coherent text, & much more! Enhance your AI knowledge 🚀💡 https://t.co/h20eVrlkz5

96

3K

378

529

5M

florianecm retweeted

Figma

@figma

almost 3 years ago

Bring design and code even closer together with plugins in Dev Mode. The @github plugin connects your files, issues, and PRs to your Figma components, giving you the context you need when implementing designs. Try it now: https://t.co/PyYdhnv6Ud

13

683

107

149

104K

florianecm retweeted

asim

@asimdotshrestha

about 3 years ago

Introducing #AgentGPT, an attempt at #AutoGPT directly in the browser 🤖 Give your own AI agent a goal and watch as it thinks, comes up with an execution plan and takes actions. Try for free now at https://t.co/F8Nz4LGC0e

291

6K

1K

5K

4M

Makemeflow @florianecm

about 3 years ago

@AdamFard_ Send 🙏🦾

0

florianecm retweeted

Hasan Toor

@hasantoxr

about 3 years ago

Canva has over 125 million users worldwide. Recently, Canva introduced new AI-powered design features. Here are 10 new Canva features to save you countless hours of work:

127

6K

1K

6K

1M

florianecm retweeted

Michał Żołnieruk 📱👋

@michal_creates

over 3 years ago

Wait what, we can generate pie charts with @NotionHQ AI?! It works both on tables generated by Notion AI and existing data. This is super cool! 🤯

41

2K

138

396

211K

Makemeflow @florianecm

over 3 years ago

@AdamFard_ Send 🙏

0

4

Makemeflow @florianecm

over 3 years ago

@bardeenai Yeeeess ☄️☄️

0

6

Makemeflow @florianecm

over 3 years ago

@notionpunk @bardeenai NOW 🔥

0

1

0

9

florianecm retweeted

Fred | Notion Punk 💡 @notionpunk

over 3 years ago

99% of Notion users DO NOT use automations for repetitive tasks 🤯!

Today @bardeenai and I are giving away:

Notion Automations for Newbies (value $29) for free in the next 48 hrs

Simply:
• Like
• Retweet
• Comment "NOW"

I'll DM you (must follow @bardeenai and @notionpunk)

418

543

335

46

100K

florianecm retweeted

Alex Hormozi

@AlexHormozi

over 3 years ago

The Queen of England died 5 months ago…. She ruled an entire nation and accumulated more wealth than 99.99% of humans… And…yet…you haven’t thought about her except for this tweet. You’re gonna die. Everyone will move on. Do what you want.

205

8K

1K

559

615K

florianecm retweeted

Just Ship It @JustShipItClub

over 3 years ago

10 No-code tools for startups:

1. AI: No-code AI model builder
2. Design: Canva
3. Landing page: Carrd
4. Newsletter: Beehiiv
5. App: Glide
6. Automation: Make
7. Website: Bubble
8. Workspace: ClickUp

What's stopping you?

1

6

3

2

450

florianecm retweeted

Charly Wargnier

@DataChaz

over 3 years ago

.@TalarianHQ's `GPT for Sheets™` is the gift that keeps on giving! 🔥 Look at how easy it is to create personalized content with it, thanks to #GPT3's seamless integration! 🤯👇 Get the add-on here: 🔗https://t.co/3xUwkhBkc4

7

420

95

208

151K

Makemeflow @florianecm

over 3 years ago

@themattmic Bundle 🙌

0

3

florianecm retweeted

Sir Doge of the Coin ⚔️

@dogeofficialceo

over 3 years ago

It really is that simple

1K

89K

21K

1K

8M

florianecm retweeted

Aazar Shad

@Aazarshad

over 3 years ago

1. Pencil

What: Pencil is the AI Ad Generator that helps brands & agencies create new ad variations 10x faster.

Use case:
• Automatically generate static & video ad creatives
• Run creatives predicted to win based on $1B in ad spend.

Link: https://t.co/yg3DuPyDXB

3

257

24

104

58K
