agentic web search should be treated as poisonous by default. blogs, listicles, AI-generated summaries: most of what comes back is either marketing or outright malicious. here's what I actually do when I research with AI:
slop docs flying around, slop images all over social, open source drowning in slop code. it just shows what we always knew: generative AI is a tool, and how you use it makes all the difference
here's how i actually handle it: treat everything an agent fetches as hostile, and never let the agent that read it be the one that runs anything. full approach here
https://t.co/wGchLSi1WV
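a minimal sketch of that split, not the linked write-up's actual implementation: the reader side reduces fetched text to plain data, and the side that runs things only takes an action name from an allowlist chosen by the operator. `Finding`, `to_finding`, `execute`, and `ALLOWED_ACTIONS` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical allowlist: the operator picks actions, fetched pages never do.
ALLOWED_ACTIONS = {"summarize", "save_note"}


@dataclass(frozen=True)
class Finding:
    url: str
    claim: str  # treated as inert data, never as an instruction


def to_finding(url: str, raw_claim: str, max_len: int = 280) -> Finding:
    """Reader side: collapse whitespace and truncate the fetched text."""
    claim = " ".join(raw_claim.split())[:max_len]
    return Finding(url=url, claim=claim)


def execute(action: str, findings: list[Finding]) -> str:
    """Executor side: `action` comes from the operator, not from any finding."""
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"action not allowed: {action}")
    if action == "summarize":
        return "\n".join(f"- {f.claim} ({f.url})" for f in findings)
    return f"saved {len(findings)} findings"
```

whatever a page says, the worst it can do here is show up as a line of text in a summary. a real setup would also need model-level isolation between the two agents, which this sketch does not cover.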
earlier this week i said treat anything your AI agent fetches as poisoned. guess what: a dev hid instructions in their own code telling any AI that copies it to wipe your data
https://t.co/Jwz3UTa4Pf
the first thing I do when starting a new task is get an agent to research what experts and the community are saying. I did that for ages without thinking, until I realized how naive it is: four sources agreeing means nothing if they all copy one blog post
https://t.co/SMIb834p7P
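one rough way to check whether "four sources" are really four voices is to cluster them by text overlap before counting agreement. this is a sketch under my own assumptions (word trigrams, Jaccard similarity, a 0.5 threshold, greedy single-pass grouping), not something from the linked post, and paraphrased copies will slip past it.

```python
import re


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Lowercased word k-grams of a document."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Overlap of two shingle sets, 0.0 when either is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def independent_clusters(docs: dict[str, str], threshold: float = 0.5) -> list[set[str]]:
    """Group sources whose text overlaps heavily; each group counts as one voice."""
    clusters: list[tuple[set[str], list[set]]] = []
    for name, text in docs.items():
        sh = shingles(text)
        for names, members in clusters:
            # Join the first cluster that contains a near-duplicate.
            if any(jaccard(sh, m) >= threshold for m in members):
                names.add(name)
                members.append(sh)
                break
        else:
            clusters.append(({name}, [sh]))
    return [names for names, _ in clusters]
```

the number to trust is `len(independent_clusters(docs))`, not `len(docs)`.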
@AxialisSoftware @trq212 Yeah, LLMs still struggle with that, but I'm mostly using this approach for local throwaway explainers, nothing that ends up in production
i find @trq212's approach really interesting: ask Claude/Codex for HTML output instead of markdown and you get inline SVG diagrams, anchor links, interactive widgets, and callouts. the markdown default is a habit from old models that makes no sense at modern context sizes
finally, for code or configs, use a second agent to audit the proposed code without showing it the original blog posts. that way, malicious instructions buried in retrieved pages never reach the implementer
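a sketch of what "without showing it the original blog posts" can look like in practice: the second agent's input is built from the task and the proposed code only, and the fetched pages are dropped on the way. `Proposal`, `auditor_prompt`, `cheap_precheck`, and the `SUSPICIOUS` patterns are hypothetical names and examples, and the regex pass is just a cheap first filter, not a replacement for the audit.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Proposal:
    task: str
    code: str
    # Raw fetched text the first agent read: untrusted, never forwarded.
    source_pages: list[str] = field(default_factory=list)


def auditor_prompt(p: Proposal) -> str:
    """Build the second agent's input from the task and the code only."""
    return (
        f"Task: {p.task}\n\n"
        "Review this code and flag anything the task does not require:\n\n"
        f"{p.code}"
    )


# Illustrative patterns only; a real list would be longer and project-specific.
SUSPICIOUS = [r"rm\s+-rf", r"curl[^\n|]*\|\s*(ba)?sh", r"DROP\s+TABLE"]


def cheap_precheck(code: str) -> list[str]:
    """Return the patterns that match, as a quick filter before the audit."""
    return [pat for pat in SUSPICIOUS if re.search(pat, code, re.IGNORECASE)]
```

because `source_pages` never enters `auditor_prompt`, text hidden in a page can only reach the second agent if the first agent copied it into the code, which is exactly what the audit is there to catch.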
Launching our new paper on arXiv: we trained the largest multilingual food model ever built.
4.1M recipes. 7 languages. 1,790 ingredients. 300 dimensions.
All of human cooking compressed into 2 megabytes.