@RealJoshHoward @CreatorML_ From what I’ve seen, frontier models like Opus and GPT don’t predict much better with the thumbnails included as inputs than without them. Could be that they need your signal processing first.
I’m considering open-sourcing an eval based on 4 years of trying to build AI that answers: ‘What should my YouTube audience watch next?’
At @CreatorML_ we built view predictors, but we always lacked a clean public benchmark.
Curious: how much better are Claude Opus/GPT-5/etc. out-of-the-box at this vs specialized models? Planning to test it properly.
Key question: Given channel history + new video idea (title/thumbnail/desc), how well can models predict relative performance (views, CTR, retention)?
Predicting “is 1 of 10” might be a good starting point, as opposed to raw views.
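A rough sketch of how the “1 of 10” label could be scored, in Python. Assumptions, not from any released benchmark: “1 of 10” is taken to mean the new video out-ranks the channel's 9 previous uploads on views at a comparable age, and the function names (`rank_among_recent`, `is_one_of_ten`, `accuracy`) are made up for illustration.

```python
from typing import Sequence


def rank_among_recent(new_views: int, previous_views: Sequence[int]) -> int:
    """Rank of the new video (1 = best) among itself and the previous videos.

    Ties are resolved in favour of the new video (an arbitrary choice).
    """
    return 1 + sum(1 for v in previous_views if v > new_views)


def is_one_of_ten(new_views: int, previous_views: Sequence[int]) -> bool:
    """True if the new video beats the channel's 9 most recent uploads.

    Assumes previous_views is ordered oldest to newest and that all counts
    were measured at a comparable video age.
    """
    return rank_among_recent(new_views, previous_views[-9:]) == 1


def accuracy(predictions: Sequence[bool], labels: Sequence[bool]) -> float:
    """Fraction of binary "1 of 10" predictions that match the labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)
```

Ranking against the channel's own recent uploads keeps big and small channels comparable, which raw view counts do not. A model that always answers “no” will still score high on accuracy here, so any real eval would need to report that base rate alongside it.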
how to build anything rn:
- get a hetzner, digitalocean, or hostinger vps
- host hermes on it
- add gbrain or implement your own memory vault using qmd + sql
- set up hermes with codex auth -> gpt-5.5 / no reasoning / fast mode
- install orca on your macbook and phone with tailscale to have a nice ide to work on both
- before starting any work, ask hermes to conduct deep research on the subject and save it to gbrain as source material for the project
- use the `/grill-me` skill or a similar prompt to uncover as many unknowns as possible. save results to memory too
- define/write clear evals for every project to determine whether a run was successful
- have hermes iterate over the project until all evals pass, saving all learnings to the vault along the way
- whenever it gets stuck, use memory + a new research or `/grill-me` session to unblock it
rinse and repeat until the work is done. pay attention to the process. develop a feeling for how long tasks should take and do not be afraid to stop a model mid session to ask for status and why it's taking so long.
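the "define evals, iterate until they all pass" steps could look roughly like this in python. pure sketch: `attempt` is a hypothetical stand-in for whatever actually kicks off an agent run, the evals are plain callables that return true/false, and `notes` is a placeholder for what you'd write to the memory vault. none of this is a real hermes or gbrain api.

```python
from typing import Callable, Dict, List, Tuple

Eval = Callable[[], bool]


def failing_evals(evals: Dict[str, Eval]) -> List[str]:
    """Names of the evals that currently fail."""
    return [name for name, check in evals.items() if not check()]


def iterate_until_pass(
    evals: Dict[str, Eval],
    attempt: Callable[[List[str]], None],
    max_rounds: int = 5,
) -> Tuple[bool, List[str]]:
    """Re-run `attempt` with the failing eval names until all evals pass.

    `attempt` is a hypothetical hook for one agent run. Returns
    (all_passed, notes), where notes records what failed in each round.
    The round cap is there so a stuck run stops and gets a human look.
    """
    notes: List[str] = []
    for round_no in range(1, max_rounds + 1):
        failed = failing_evals(evals)
        if not failed:
            return True, notes
        notes.append(f"round {round_no}: failing {', '.join(failed)}")
        attempt(failed)
    # Out of rounds: report whether the last attempt fixed things.
    return not failing_evals(evals), notes
```

the `max_rounds` cap is the code version of "do not be afraid to stop a model mid session": if it returns false, that's the cue to go back to research or a `/grill-me` session instead of burning more runs.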
@dexhorthy Yep. Codex 5.5 is still the best coder. But sometimes 4.8 surprises me in good ways. I still wouldn’t rely on Claude for anything mission-critical.
“People who are really serious about software should make their own hardware” —Alan Kay
People who are serious about AI should use specialized chips for inference.
We raised $15m to build the ASICs-first inference cloud.
We're betting big on alternatives to GPUs, and the result is that we are already 5-8x faster on most models.
Read more about General Compute on TechCrunch!
@FPuklowski @fastinference
https://t.co/xQyZmfVJPN