5tok/s is actually my favourite, so I can read and follow in real time. Compared to getting a wall of text each time you ask a simple question.
https://t.co/KviTyNeDQL
Enterprises thought AI would cut costs. Instead, itβs created entirely new line items on their balance sheet β AI infrastructure, compute, tokens.
For the time being.
RAG (Retrieval-Augmented Generation) sounds like cutting-edge spatial intelligence, but itβs really just Google search for an Agent that has short-term memory loss.
On Cognitive Load: Familiarity is not the same as Simplicity when it comes to software engineering. Just because someone is familiar with a project it doesn't mean it is clear or simple.