Nathan Axcan @AxcanNathan - Twitter Profile

Well, I agree re: MSFT, but if a comparatively broke open lab (DeepSeek) serves tokens 3-15x cheaper (83x cheaper if you count cache hits) at the same quality as an Anthropic model (and Anthropic, along with OAI, is supposed to have the highest-quality engineers in the West), then we should also assume the big labs' margins are even higher than the factors [3, 15, 83] suggest, and that's with credible people claiming DS still makes a profit. Not arguing re: bubble, not interested in that, but it seems to me a big stretch that all the geniuses at the big labs didn't manage to collect massive profits from their current offerings.

3

6

0

1

714
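
The margin argument in the post above can be made concrete with a little arithmetic. This is only a sketch under two assumptions that are not in the original post: both providers pay the same cost to serve a token, and the cheaper provider at least breaks even. The function name `implied_margin_floor` is made up for illustration. Under those assumptions, a provider charging k times more has a gross margin of at least 1 - 1/k.

```python
def implied_margin_floor(price_ratio: float) -> float:
    """Lower bound on gross margin for a provider charging `price_ratio` times
    a competitor's price.

    Assumptions (illustrative, not from the post): both providers have the
    same per-token serving cost, and the cheaper competitor at least breaks even.
    """
    if price_ratio < 1:
        raise ValueError("price_ratio must be >= 1")
    # The competitor's price is an upper bound on the shared cost, so
    # cost / expensive_price <= 1 / price_ratio.
    return 1.0 - 1.0 / price_ratio


# The three price gaps quoted in the post: 3x, 15x, and 83x with cache hits.
for ratio in (3, 15, 83):
    floor = implied_margin_floor(ratio)
    print(f"{ratio}x price gap -> gross margin of at least {floor:.1%}")
```

If the cheaper provider is actually profitable rather than merely breaking even, the real margin sits above this floor, which is the post's point.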

Nathan Axcan

@AxcanNathan

4 days ago

If they ran all the tools used when solving GDPval, it would represent real-life use, right? That’s how I would do it at least. Maybe they’re setting a new metric and that’s the important part now, which Artificialanalysis should pick up and turn into a realistic workload. Maybe this is all about KV weights sitting in GPU memory and taking up HBM. CPUs have to optimize for latency because the main cost driver of inference is GPU usage.

0

2

0

1

297

Nathan Axcan

@AxcanNathan

5 days ago

@teortaxesTex Dying laughing at ratatouille

0

1

0

209

Nathan Axcan

@AxcanNathan

5 days ago

@giffmana Because family member photos are not encrypted on disk, I believe. We should really go sleep, sheesh.

1

0

236

Nathan Axcan

@AxcanNathan

6 days ago

something LLMs are very well RL'd to do is scraping and data collection 👀

0

3

0

35

Nathan Axcan

@AxcanNathan

6 days ago

@giffmana I can’t use Immich for the whole family because we want privacy (parents etc.), so I’m considering https://t.co/d2te3tNko1

1

2

0

1

403

Nathan Axcan

@AxcanNathan

7 days ago

🚨 Breaking: next Muse model will be good at multimodal 🚨

Lucas Beyer (bl16)

@giffmana

7 days ago

When the weather is just as good as the experiment results:

13

205

1

6

25K

0

39

Nathan Axcan

@AxcanNathan

7 days ago

@giffmana need to see your smile there

0

1

0

114

Nathan Axcan

@AxcanNathan

8 days ago

The new trend is that nobody ends up investing as much into inference efficiency as the people who developed a given model (also due to RL), meaning they end up having the best inference code, meaning they end up having the cheapest API and best ability to serve a coding plan/sub. Assuming a large enough gap, open source and small model development becomes an advertising strategy. Best exemplified by DeepSeek at the moment.

0

1

0

24

Nathan Axcan

@AxcanNathan

8 days ago

@levelsio How did you do it so far

0

276

Nathan Axcan

@AxcanNathan

8 days ago

@Laz4rz Amazfit will probably have Qwen models doing better than that Google app pretty quickly, if they're good

0

1

0

99

Nathan Axcan

@AxcanNathan

8 days ago

@Laz4rz I think I liked Polar more because they don't require a subscription (everyone else does), but I'm not sure that's still true. If Amazfit turns out to be good, that's a game changer though.

1

0

96

Nathan Axcan

@AxcanNathan

8 days ago

https://t.co/vaWjpozKsr
- X API for access to posts
- AirTable for recording already-annotated posts
- Openrouter for LLM inference
As you can see, the values used for automated judgement are nicely shown in the prompts.
Much to be learned from effective truth-seeking scaffolds, an agent continuously RL learning from this process might serve as a reward-hacking-resistant verifier for models being currently trained.

0

48

Nathan Axcan

@AxcanNathan

8 days ago

World-class forecaster (and fellow Nathan) develops a harness that auto-researches context for X posts, where the optimization target is one of the best: perceived helpfulness (link below). Once an LLM is "helpful to the public", it can receive real-time human signal and thereby receive "RLHF for free". Worth studying.

Nathan 🔎

@NathanpmYoung

15 days ago

AI Note-writer Progress

Our note-writer (wholesome-raspberry-stilt) has written community notes on X with 47M views.

The cost per helpful note is about $7. You can look at our notes here: https://t.co/zRcusOyYry

8

26

2

6

7K

1

2

0

1

63
