Josh @Maybejoshb - Twitter Profile

Josh

@Maybejoshb

about 4 hours ago

This actually made me laugh out loud just now lmaoo. The king is still asleep I guess 😆😆

Saad Jamal

@sadjamz_

about 8 hours ago

you can't make this up

6

399

4

11

21K

0

1

0

15

Maybejoshb retweeted

Przemek Chojecki | PC

@prz_chojecki

about 7 hours ago

I disproved a 50-year old math conjecture with AI.

With GPT-5.5 Pro I found counterexamples to Beneš Conjecture as well as related Shuffle-Exchange Conjecture in network theory. https://t.co/PpmoPcNcHQ

13

252

26

63

16K

Maybejoshb retweeted

Kasra

@kasrak

about 10 hours ago

I've joined @OpenAI to work on Codex

@ajambrosino and team have built a very good app! It's the first coding agent GUI that got me out of the terminal

Excited to help make it even better, especially as it goes beyond software engineers

Also delighted to get to work with old friends @gpeal8 @tarstarr again

93

969

14

87

40K

Maybejoshb retweeted

Firecrawl @firecrawl

about 11 hours ago

Introducing web-scale /monitor. An always-on search that watches the web & pings you or your agent the moment something comes online. Before /monitor only worked for single pages or websites, but now you can use that power on the entire web. Available today.

22

373

29

354

38K

Maybejoshb retweeted

matt palmer

@mattyp

1 day ago

You can embed this model practically anywhere - like a chrome extension

This is transformersjs + rampart for real-time PII removal in your browser (personal email still blurred)

You can see I'm toggling fields - the model is removing them from text almost instantly: zip, state, city. Since the model is local, data never leaves my device.

Rampart doesn't hide API keys, but regex can handle those

Aside from obvious enterprise implications, Rampart will be huge for privacy-forward individuals

I'm working on getting this live on the Chrome Web Store. The smol model era is here - next I want to see a native mac app 👀

Thanks to @xenovacom & @huggingface for the amazing libraries and @ndstudio for the awesome model

8

130

8

128

21K

Maybejoshb retweeted

Cloudflare @Cloudflare

about 14 hours ago

We're opening the waitlist for our Monetization Gateway, which will allow you to charge for any web page, dataset, API, or MCP tool behind Cloudflare. The charges will settle in stablecoins over the x402 open protocol. https://t.co/pvICtEIixj

99

2K

249

1K

308K

Josh

@Maybejoshb

1 day ago

From the Chief of Staff for DJT: https://t.co/iy38J5XX7V

Susie Wiles

@SusieWiles47

1 day ago

Under President Trump’s leadership the United States is the undisputed winner in the AI race. My gratitude to companies across industries who continue to work closely with the White House to implement the President’s EO: “Promoting Advanced AI Innovation and Security.” This includes excellent work around advanced model access and guardrail testing and security. The government and private sector have worked together in a way we have never seen before and this foundation of America First is unprecedented. Our shared priority remains: get the best tech deployed as quickly and safely as possible.

704

5K

696

200

653K

0

1

0

42

Josh

@Maybejoshb

1 day ago

Finally releasing Fable 5 and Mythos 5 for use. Should be available tomorrow!

“Anthropic has taken steps in close coordination with the U.S. government to address the risks associated with Claude Mythos 5 and Claude Fable 5. Among other things, Anthropic has agreed to proactively detect and address security risks associated with the models; to work diligently with the U.S. government on protocols and standards and releases for Mythos, Fable, and future models; and to inform the U.S. government of any malicious activity.”

“In light of these actions and commitments, as well as the Bureau of Industry and Security's evaluation of the diversion risks now presented by Claude Mythos 5 and Claude Fable 5, the controls in the June 12 letter are withdrawn.”

1

0

67

Maybejoshb retweeted

banteg

@banteg

1 day ago

noticed that this bench was run with safeguards turned off. on a production version it scores zero. same situation as a third-party eval finding that fable rejects 99.5% of prompts while anthropic posts self-reported scores achieved on a non-production config. this is unacceptable, and we should reject such deceitful evals.

12

258

18

21

16K

Josh

@Maybejoshb

1 day ago

@rasbt If I purchase using your manning link, is the ebook available immediately as well?

1

0

136

Josh

@Maybejoshb

1 day ago

@rasbt Congrats bro!

0

10

Maybejoshb retweeted

Sebastian Raschka

@rasbt

1 day ago

After 18 months of writing, coding, and experimenting, Build a Reasoning Model (From Scratch) is finally out!

My first copies just arrived! 📚

440 full-color pages. Inference scaling, reinforcement learning, and distillation from scratch. https://t.co/647ksI7sLc

249

6K

526

3K

462K

Maybejoshb retweeted

Andrew Ambrosino

@ajambrosino

1 day ago

what’s a little funny about the “GPT weak on frontend” discourse is that everything we ship in the codex app gets adopted by the entire industry within days or weeks, pixel for pixel

127

973

19

128

127K

Maybejoshb retweeted

OpenAI

@OpenAI

1 day ago

We’re introducing GeneBench-Pro, a research-level benchmark for a harder kind of AI progress: how well agents can navigate messy biological data, choose the right analysis path, and make judgment calls that real computational research depends on. https://t.co/AsilnnSxnE

281

4K

380

871

1M

Maybejoshb retweeted

OpenAI Developers

@OpenAIDevs

1 day ago

⚙️ We debugged a year’s worth of crashes in our data infrastructure and found one issue in the hardware and another that has been unnoticed in open-source code for 18 years. Here’s how we tracked them down: https://t.co/5c13Knw69o

49

1K

79

340

90K

Maybejoshb retweeted

Silen Naihin

@silennai

2 days ago

Excited to share that my paper "Size Doesn't Matter: Cosine-Scored Sparse Autoencoders" got accepted as a spotlight at ICML!

We propose cosine sparse auto-encoders (SAEs) which have
- 14.6% better top 1 sparse probing accuracy
- discover ~3x more features
- matched FVE and interpretability
- minimal recipe change

SAEs detect features via inner product, so a feature's activation scales with both its directional alignment and the input's norm

But sublayer normalization discards magnitude entirely, which means the encoder detects a quantity the model does not read!

A learned scalar parameter is free to recover inner product scoring but doesn't, showing that 74% of magnitude is noise

Github and paper below 👇

11

109

17

86

11K

Josh

@Maybejoshb

2 days ago

@thsottiaux @charliermarsh This team ships

0

14

Josh

@Maybejoshb

2 days ago

This team moved at lightning speed.

Tibo

@thsottiaux

2 days ago

Codex usage limits will be fully reset again in the next hour and we will credit one additional reset into your bank for your own usage over the next 24 hours.

We investigated reports that Codex usage was being consumed faster than expected. There wasn't one central issue, but a few smaller problems compounded for some users. Here's what we found and changed:

- Actual usage: Auto-review had become more proactive, another change was triggering more subagent work, and background suggestions could run twice or retry too frequently after failures. We reverted the changes and fixed suggestion scheduling, duplicate generation, and retry behavior. This should reduce unnecessary background token consumption while preserving the work users explicitly request.
- Usage reporting: Auto-review was incorrectly appearing as GPT‑5.4 usage, and failed or rate-limited requests were still shown as turns. Auto-review now appears as its own category, and only successful requests count toward the turn graphs. Rate-limited requests were never charged, but they were being displayed incorrectly.
- Immediate relief: We reset usage limits while rolling out the fixes, then shipped hotfixes across the CLI, desktop app, and usage backend.
- What to expect: New usage data should be clearer and actual consumption should be lower. Historical charts may still show auto-review under GPT‑5.4 because older turn data was not relabeled. Features that intentionally perform more work, such as /goal, subagents, and higher reasoning levels, will still naturally use more capacity.

All fixes are now deployed, and we've added more detailed monitoring so we can detect background-usage regressions sooner. We'll continue watching the results closely. Thank you for building and doing all sorts of things with Codex.

1K

7K

412

620

906K

0

26
