@darraghcurran You have so much that people are interested in, and maybe the Q&A votes and groupings could be a proxy for deep dive topics. I was thinking during the webinar that we could spend entire sessions just on skills, just on Shrek, just on adoption and metrics, just on security & complianc
HELLOOOOOO CRAWLERS!
Book 8, A Parade of Horribles is now available! We never expected to have so many crawlers reading this early. You are all so very spunky, and we really appreciate that. The ratings have never been higher!
Now get out there and Kill, Kill, Kill!
It's wild to think about how massive 1M token context windows in LLMs really are
That's roughly equivalent to:
- The complete works of Shakespeare
- 11 hours of audio
- A 5-minute session fixing some TypeScript issue
@ljharb @dscape @JunghwanNa8355 could you make the same argument about open source? it's open, you are inviting random PRs from people you don't know and haven't heard of, welcome to the internet.
@pvncher I've used Opus 4.7 to 500k-1M compaction, and I have definitely seen it misinterpret what I said. I end up writing a lot more "whoa, wait, I didn't mean that, I meant this" and having it backtrack
seems obvious but:
things that are changing rapidly:
1. context windows
2. intelligence / ability to reason within context
3. performance on any given benchmark
4. cost per token
things that are not changing much:
1. humans
2. human behavior, preferences, affinities
3. tools, integrations, infrastructure
4. single core cpu performance
therefore,
ngmi:
1. "i found this method to cut 15% context"
2. "our method improves retrieval performance 10% by using hybrid search"
3. "our finetuned model is cheaper than opus at this benchmark"
4. "our harness does this better because we invented this multi agent system"
5. "we're building a memory system"
6. "context graphs"
7. "we trained an in house specialized rl model to improve task performance in X benchmark at Y% cost reduction"
wagmi:
1. product/ui
2. customer acquisition
3. integrations
4. fast linting, ci, skills, feedback for agents
5. background agent infra to parallelize more work
6. speed up your agent verification loops
7. training your users, connecting to their systems and working with their data, meeting them where they are
Software horror: litellm PyPI supply chain attack.
Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, database passwords.
LiteLLM itself has 97 million downloads per month which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwnd. Same for any other large project that depended on litellm.
Afaict the poisoned version was up for less than ~1 hour. The attack had a bug which led to its discovery - Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker hadn't vibe coded this attack, it could have gone undetected for many days or weeks.
Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials that do get stolen in each attack can then be used to take over more accounts and compromise more packages.
Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've been increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
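To make the transitive exposure concrete, here is a minimal sketch that asks whether a package pulls in a given dependency anywhere in its tree. Everything in it is illustrative: `EXAMPLE_GRAPH` is a made-up graph (only the dspy -> litellm edge comes from the post above), and `installed_graph` is a hypothetical helper that reads the local environment through the standard library's `importlib.metadata` with deliberately crude requirement-name parsing.

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_graph() -> dict[str, set[str]]:
    """Build {package: direct dependencies} for the current environment.

    Crude on purpose: only the leading project name of each requirement
    string is kept, and extras/markers are ignored, so the result
    over-approximates what would really be installed.
    """
    graph: dict[str, set[str]] = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if not name:
            continue
        deps = set()
        for req in dist.requires or []:
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req)
            if match:
                deps.add(normalize(match.group(0)))
        graph[normalize(name)] = deps
    return graph


def depends_on(graph: dict[str, set[str]], root: str, target: str) -> bool:
    """Return True if `target` is reachable from `root` in the graph."""
    target = normalize(target)
    seen: set[str] = set()
    stack = [normalize(root)]
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        for dep in graph.get(pkg, ()):
            if dep == target:
                return True
            stack.append(dep)
    return False


# Hypothetical graph; only the dspy -> litellm edge is taken from the post.
EXAMPLE_GRAPH = {
    "my-app": {"some-mcp-plugin"},
    "some-mcp-plugin": {"litellm"},
    "dspy": {"litellm"},
    "litellm": set(),
    "unrelated-tool": {"requests"},
}
```

Calling `depends_on(installed_graph(), "some-package", "litellm")` would run the same check against whatever is installed locally. It only tells you about exposure, not whether a poisoned version was ever actually installed.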
@tobi The biggest win comes from using a regex rather than a manual scan of bytes in a while loop. With how many regex DoS vulnerabilities there are, I wonder if this approach is even warranted. I'd take 20 years of stable code that works and 3 microseconds more in compute
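A small sketch of the trade-off, for context. The actual code under discussion isn't shown here, so the task below (skipping leading whitespace in a byte string) is a made-up stand-in, and both function names are hypothetical. The regex is a single character class with no nested quantifiers or alternation, which is the shape that stays linear; patterns like `(a+)+$` are the usual regex DoS culprits.

```python
import re

WHITESPACE = b" \t\r\n"
# One character class, one quantifier: no nested repetition to backtrack over.
_LEADING_WS = re.compile(rb"[ \t\r\n]*")


def skip_ws_manual(data: bytes, start: int = 0) -> int:
    """Manual byte scan: return the index of the first non-whitespace byte."""
    i = start
    while i < len(data) and data[i] in WHITESPACE:
        i += 1
    return i


def skip_ws_regex(data: bytes, start: int = 0) -> int:
    """Same result via a regex; `*` always matches, so .end() is safe."""
    return _LEADING_WS.match(data, start).end()
```

Both return the same index for any input, so the choice comes down to speed versus how much you trust every future edit to the pattern to stay backtracking-free.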
1.25 billion tokens in 24 hours.
The Way of Kings by Brandon Sanderson is a 45-hour-long audiobook with 385,000 words, or about 515k tokens.
Assume output was 1/1000th of input; that means this guy generated 2.43 The Way of Kings in 24 hours. This is called vibe coding.
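The arithmetic, spelled out as a quick sketch. The numbers are the ones quoted above; the 1/1000 output share is the post's own assumption, not a measured value, and the function name is just for illustration.

```python
TOTAL_TOKENS = 1_250_000_000   # tokens processed in 24 hours
BOOK_WORDS = 385_000           # The Way of Kings, word count as quoted
BOOK_TOKENS = 515_000          # The Way of Kings, token estimate as quoted
OUTPUT_FRACTION = 1 / 1000     # assumed share of tokens that were output


def books_generated(total_tokens: int, output_fraction: float, book_tokens: int) -> float:
    """Output tokens expressed in multiples of one book's token count."""
    return total_tokens * output_fraction / book_tokens


books = books_generated(TOTAL_TOKENS, OUTPUT_FRACTION, BOOK_TOKENS)
tokens_per_word = BOOK_TOKENS / BOOK_WORDS

print(f"{books:.2f} books")                  # 2.43 books
print(f"{tokens_per_word:.2f} tokens/word")  # 1.34 tokens/word
```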
This is @RepoPrompt 2.0
A fully integrated agent that makes it seamless to use RP's powerful MCP tools, with a built-in oracle and context builder.
A first-class experience showcasing how much better and more efficient your agents can be with good context engineering tools.