Precious Oluwafemi Sani @pfemis - Twitter Profile

Precious Oluwafemi Sani @pfemis

1 day ago

Just seeing this today. Curious to know if model cards were updated based on this 🤔

Joel Becker

@joel_bkr

3 months ago

new @METR_Evals research note from @whitfill_parker, @cherylwoooo, nate rush, and me. (chiefly parker!) we find that *half* of SWE-bench Verified solutions from Sonnet 3.5-to-4.5 generation AIs *which are graded as passing* are rejected by project maintainers.

https://t.co/OK995CqMud

21

540

53

159

186K

0

32

Precious Oluwafemi Sani @pfemis

1 day ago

@Brii_toe_knee Congratulations 🎊

1

0

126

Precious Oluwafemi Sani @pfemis

1 day ago

@dillonplunkett @MATSprogram Fingers crossed on this. Following you in the meantime

0

1

0

103

Precious Oluwafemi Sani @pfemis

1 day ago

@andyw_ais Can you share some of them? If possible, also share tips for getting into Astra, especially for those who have applied and been rejected.

0

2

0

210

Precious Oluwafemi Sani @pfemis

3 days ago

@celestepoasts Would like to tailor interp towards low-resource languages: why guardrails might fail and what happens when they do.

0

283

Precious Oluwafemi Sani @pfemis

4 days ago

@Badboy1B cqq

0

2K

Precious Oluwafemi Sani @pfemis

6 days ago

Legit!!

terminally onλine εngineer

@tekbog

7 days ago

people misunderstand what AI does: it accelerates everything, even incompetence

128

4K

478

255

127K

0

23

pfemis retweeted

Andy Wang @andyw_ais

8 days ago

People who want to get into AI safety should do more than just apply to fellowships. Applications are pretty random and might not reflect your actual abilities. Instead, be agentic! There are many paths to success that require only a bit more proactivity than applications. (1/4)

16

815

48

928

57K

Precious Oluwafemi Sani @pfemis

8 days ago

@andyw_ais Thank you @andyw_ais. This is the timely validation I needed. I am currently working on extending one paper and was wondering if I was on the right path.

0

2

0

1K

Precious Oluwafemi Sani @pfemis

14 days ago

Now this is it! Who takes responsibility for AI failures or mistakes?

Polymarket

@Polymarket

14 days ago

NEW: Pizza Hut franchisee sues for $100 million, claiming the company’s AI delivery system made deliveries up to 50% slower.

189

8K

904

536

2M

0

29

Precious Oluwafemi Sani @pfemis

14 days ago

We will be seeing many more instances like this. Organizations need to calibrate their risks before letting AI into the driver's seat. AI safety is here to stay.

Polymarket

@Polymarket

14 days ago

JUST IN: Starbucks retires AI inventory tool across North America after it reportedly miscounted & mislabeled store items.

713

22K

3K

2K

18M

0

18

Precious Oluwafemi Sani @pfemis

14 days ago

@Microsoft released two open-source tools to help agentic AI engineers build agents safely. This is huge and will shape the AI security and safety landscape for the next few months. The tools are called RAMPART and Clarity. https://t.co/Ebn4gVgOqr

https://t.co/F8hSBAzjcO

0

19

Precious Oluwafemi Sani @pfemis

14 days ago

Bad actors can smuggle malicious instructions right at the point where the model's faithfulness starts to collapse. LLM-as-a-judge handles the evaluation. Code is open source. Try it out. https://t.co/3yZK9WVsy8

0

1

0

15

Precious Oluwafemi Sani @pfemis

14 days ago

Built an AI safety-related project. LLMs degrade as context windows expand; this has been shown empirically, so it is not just theory. I built a tool to find the exact breaking point where local LLMs start hallucinating. You can run it on your local machine.

1

0

19

Precious Oluwafemi Sani @pfemis

14 days ago

You can plug in your own models too. Planning to expand it to more models and larger contexts when compute allows. There's also a red teaming angle here that I think is underexplored. As a model weakens at long contexts, that's a window for prompt injection attacks.

1

0

14

Precious Oluwafemi Sani @pfemis

15 days ago

You have a point but not at the current rate or with the very low barrier to entry we have now. People are legit vibe coding their way into effective attacks. Mini shai hulud is now open source and there are even monetary incentives ($1000 Monero) for successful breaches

Justin Elze

@HackingLZ

16 days ago

For those of you just now paying attention to cybersecurity, large companies got hacked before AI. Colonial Pipeline, SolarWinds, OPM, Kaseya, Aramco, Change Healthcare, Equifax, Target, Home Depot, TJX, etc

26

237

21

19

56K

0

1

0

74

Precious Oluwafemi Sani @pfemis

15 days ago

@techspence And that is slowly fading away as attention span has been tanking recently

1

0

11

Precious Oluwafemi Sani @pfemis

15 days ago

@gozkybrain4u It was a supply chain attack. Same people behind most of the attacks since March: TeamPCP. They did Trivy, litellm, Mistral AI and Grafana. And GitHub most recently.

0

1

0

150

Precious Oluwafemi Sani @pfemis

15 days ago

I guess the era of security by obscurity is fading or has already faded away. AI has removed the barrier to entry, and threat actors can even vibe code their way into breaching your critical systems. Now can we all give cybersecurity the budget and attention it deserves?

1

2

0

23

pfemis retweeted

Gabriel Odusanya @gabbytech01

16 days ago

Cybersecurity is now getting the attention it has always required. 2026 really made a big impact in cybersecurity.

3

17

3

0

450
