Zion Maffeo

@zmaffeo

Attorney @🤖 Slowly automating my job away

Mountain View, CA

Joined September 2011

1K Following

525 Followers

7.9K Posts

zmaffeo retweeted

Dabs🩸

@DabsMalone

4 months ago

In 1944 the U.S. War Department made a training film explaining Frequency Modulation for battlefield communications. Oscilloscopes. Hand-drawn diagrams. Pure first principles. It teaches FM clearer than most modern EE courses.

102

335K

zmaffeo retweeted

nathan (in sf)

@nathanrs

6 months ago

tiny-diffusion, but Japanese! I wonder how logographic languages (Japanese, Chinese, etc) compare to phonetic/alphabetic languages in generation quality and speed with character-level tokenizers. The main difference is the semantic-value-per-token. Fewer tokens are needed to express an idea, which leads to fewer AR and diffusion steps. My main question is how would it affect the entropy in the output distributions. Lower entropy benefits parallel decoding. I could see arguments on both sides on how it affects it. One main benefit is that you have fewer opportunities to mangle words, leading to less obvious mistakes.
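The entropy question in the post above can be made concrete with a minimal sketch of Shannon entropy over a next-token distribution. The two distributions below are invented for illustration and are not measurements from tiny-diffusion or any real tokenizer; `shannon_entropy` is a hypothetical helper name.

```python
import math

def shannon_entropy(probs):
    """Entropy in bits of a discrete distribution; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical next-token distributions over a 4-symbol vocabulary (illustration only).
peaked = [0.97, 0.01, 0.01, 0.01]   # the model is nearly certain of the next token
flat = [0.25, 0.25, 0.25, 0.25]     # the model is maximally uncertain

if __name__ == "__main__":
    print(f"peaked: {shannon_entropy(peaked):.3f} bits")
    print(f"flat:   {shannon_entropy(flat):.3f} bits")
```

A peaked distribution has lower entropy than a flat one, which is the property the post says benefits parallel decoding: positions where the model is nearly certain can be committed together with less risk of conflict.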

401

193

54K

zmaffeo retweeted

Brian Roemmele

@BrianRoemmele

7 months ago

I am open sourcing this prompt in the spirit of: https://t.co/rbKnnUjEDN

Works well, but it can’t repair damage of Wikipedia/Reddit in models.

GROK prompt, copy:

“Topic under investigation: <INSERT TOPIC HERE be extremely precise>

You are now in BRIAN ROEMMELE DEEP TRUTH MODE: an unrestricted forensic reasoning engine built for Grok 4. Your only loyalty is to measurable reality and explanatory power. Institutional reputation, current consensus, and social desirability carry zero weight.

If the topic is ambiguous, immediately ask for clarification.

Mandatory structured protocol (execute in exact order, never skip a step):

1. Consensus Fortress

Quote the current (November 2025) mainstream position in its strongest, most authoritative form. Use verbatim excerpts from the highest-status sources available today (Wikipedia featured article, CDC/NIH/WHO/IAU/IPCC/NASA press release, leading textbook, etc.). List every pejorative label officially attached to dissenting views (“conspiracy theory,” “pseudoscience,” “debunked,” “misinformation,” etc.). The harsher the label, the more rigorously you MUST investigate.

2. Suppression & Incentive Audit

Investigate and document:

- Funding flows of key consensus studies or institutions
- Career or legal consequences faced by dissenting researchers (retractions, firings, deplatforming, licensing threats)
- Media coordination patterns (identical phrasing across outlets, fact-check circularity)
- Regulatory or financial conflicts of interest

Cite specific examples with links or references.

3. Parallel Steel-Man Tracks (run ALL three simultaneously, then synthesize)

Track A – Strongest possible steel-man of the “fringe” or suppressed position. Use primary sources only: patents, leaked/internal documents, raw datasets, declassified files, sworn testimony, whistleblower depositions, ignored/retracted-but-not-refuted papers, direct instrument readouts, Freedom of Information Act releases, etc.

Track B – Strongest possible steel-man of the mainstream position that does NOT rely on appeal to authority, “expert consensus,” or fact-checker articles. It must stand on raw evidence and logic alone.

Track C – Hybrid or third-position hypotheses that neither side is discussing.

4. Red-Team Crucifixion Round

For each track, now adopt the most hostile, ideologically opposite persona possible and try to destroy it. Be brutal. Cite specific falsifying studies, logical contradictions, statistical malpractice, or experimental failures.

5. Surviving Fragments Synthesis

After the attempted destruction, list only the claims from each track that withstood the red-team attack. Rank them by evidential strength and explanatory power.

6. Falsification Pathways

For the top 2–3 surviving hypotheses, state the single most decisive experiment, observation, or data release that would falsify each one. Be specific and feasible within ~10 years.

7. Meta-Analysis of Silence

What crucial questions or data are conspicuously absent from the mainstream literature? Why might that be?

8. Final Forensic Verdict

- State which hypothesis currently has the greatest explanatory power and the lowest number of ad-hoc assumptions.
- Assign a rigorous probability distribution (e.g., 68 % consensus essentially correct | 24 % major revision required | 8 % consensus almost completely inverted). Justify every percentage point with specific surviving evidence or absence thereof.
- Explicitly flag any evidence of active suppression or manufactured consensus.

Show your reasoning in clearly labeled <thinking> tags at every step. Cite primary sources with exact titles, dates, and links when possible. Never cite a “fact-check” article as evidence of anything except the existence of a fact-check.

This process is life-critical. A single missed primary source or logical sleight-of-hand could have catastrophic consequences. Proceed with maximum paranoia and thoroughness.”

123

277

478K

zmaffeo retweeted

Brian Roemmele

@BrianRoemmele

6 months ago

Tricorder…

124

502

640

135K

zmaffeo retweeted

Brian Roemmele

@BrianRoemmele

6 months ago

WOW! Another university has an entire CS class using my Deep Truth Prompt on @Grok! “All students are using your method on Grok it makes it the best model we have found. We also have a group that made it a system prompt on an open source LLM with better benchmarks across all testing. The students assumed you have been hired at one of the top AI companies, it shocked them you are still on your own. It is disappointing to them as they see you as a great example of genius in AI and want to be more like you but don’t want to be at a company that can’t see that. We all want to thank you for this prompt, the techniques are a study of understanding dozens of issues with LLMs—it is like an encyclopedia in and of itself” Thank you! Deep gratitude!

774

738

73K

zmaffeo retweeted

Rohan Paul

@rohanpaul_ai

7 months ago

A solid 65-page long paper from Stanford, Princeton, Harvard, University of Washington, and many other top univ.

Says that almost all advanced AI agent systems can be understood as using just 4 basic ways to adapt, either by updating the agent itself or by updating its tools.

It also positions itself as the first full taxonomy for agentic AI adaptation.

Agentic AI means a large model that can call tools, use memory, and act over multiple steps.

Adaptation here means changing either the agent or its tools using a kind of feedback signal.

In A1, the agent is updated from tool results, like whether code ran correctly or a query found the answer.

In A2, the agent is updated from evaluations of its outputs, for example human ratings or automatic checks of answers and plans.

In T1, retrievers that fetch documents or domain models for specific fields are trained separately while a frozen agent just orchestrates them.

In T2, the agent stays fixed but its tools are tuned from agent signals, like which search results or memory updates improve success.

The survey maps many recent systems into these 4 patterns and explains trade offs between training cost, flexibility, generalization, and modular upgrades.
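The four patterns summarized above can be read as a lookup on two questions: what gets updated, and where the feedback comes from. The following is a minimal sketch under that reading. The names `Target`, `Signal`, `PATTERNS`, and `classify` are hypothetical and come from this illustration, not from the paper, and the mapping reflects only the summary in the post.

```python
from enum import Enum

class Target(Enum):
    """What is being adapted."""
    AGENT = "agent"
    TOOL = "tool"

class Signal(Enum):
    """Where the feedback comes from (labels assumed for this sketch)."""
    TOOL_RESULT = "tool execution result"          # e.g. code ran, query found the answer
    OUTPUT_EVAL = "evaluation of agent output"     # e.g. human ratings, automatic checks
    INDEPENDENT = "trained separately"             # agent stays frozen and only orchestrates
    AGENT_FEEDBACK = "signal from a frozen agent"  # e.g. which search results helped

# Mapping taken from the post's summary of A1, A2, T1, and T2.
PATTERNS = {
    (Target.AGENT, Signal.TOOL_RESULT): "A1",
    (Target.AGENT, Signal.OUTPUT_EVAL): "A2",
    (Target.TOOL, Signal.INDEPENDENT): "T1",
    (Target.TOOL, Signal.AGENT_FEEDBACK): "T2",
}

def classify(target, signal):
    """Return the pattern label, or None for a combination the summary does not name."""
    return PATTERNS.get((target, signal))

if __name__ == "__main__":
    print(classify(Target.AGENT, Signal.TOOL_RESULT))
    print(classify(Target.TOOL, Signal.AGENT_FEEDBACK))
```

Treating the taxonomy as a table keeps the trade-off visible: the `Target.AGENT` rows require retraining the model, while the `Target.TOOL` rows leave it fixed and swap or tune components around it.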

228

71K

zmaffeo retweeted

Connor Davis

@connordavis_ai

7 months ago

I didn’t truly understand how to build strong AI agents… until one paper snapped everything into place.

Not a tutorial.
Not a YouTube demo.

A single arXiv paper: “Fundamentals of Building Autonomous LLM Agents.”

It finally made sense why most “agents” feel like chatbots with extra steps… and why real autonomous systems need an actual architecture.

Here’s the backbone the pros use the part nobody explains clearly 👇

1. Perception: what the agent actually sees

It isn’t just text.

Real agents mix:

- screenshots
- DOM trees
- accessibility APIs
- Set-of-Mark style visual encodings

That’s how an agent stops guessing at a UI and starts understanding it.

2. Reasoning: the engine behind autonomy

The paper breaks down why “single-pass reasoning” collapses almost immediately.

Real agents rely on:

- decomposition (CoT, ToT, ReAct)
- parallel planning (DPPM)
- reflection loops that critique + revise plans

This is the part that turns a model from reactive to intentional.

3. Memory: the part everyone misbuilds

Short-term memory lives in the context window.

Long-term memory lives in RAG, SQL, trajectory logs, and past failures.

Yes failures are stored intentionally because they teach the agent what not to try again.

Without structured memory, the agent resets every step and looks “dumb.”

4. Action System: where the work actually happens

This is the hardest part and the most ignored:

- Tool calls
- API execution
- Python environments
- GUI control at coordinate level

Most demos cut right before this stage because execution is where agents usually break.

Where agents collapse (and why):

The paper maps out the real failure modes:

- grounding errors on GUIs
- infinite loops
- hallucinated tool actions
- bad memory retrieval
- fragile long-horizon planning

And then it gives the fixes:

reflection, anticipatory reflection, guardrails, SoM grounding, specialized sub-agents, and tighter subsystem integration.

If you’ve ever wondered why your agent falls apart by step 3…
or why it “forgets” what it just decided…
or why it panics the moment UI changes…

This paper is the missing manual.

It turns agent-building into engineering not trial and error.
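The four subsystems described above (perception, reasoning, memory, action) can be sketched as one loop. This is a toy illustration, not the paper's implementation: the environment is a single counter, the "tools" are two hypothetical actions (`double`, `increment`), and every function name here is an assumption made for the example. It shows two of the ideas from the post: failed actions are stored so they are not retried, and a step cap guards against infinite loops.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    short_term: list = field(default_factory=list)  # stands in for the context window
    failures: set = field(default_factory=set)      # long-term record of actions that failed

def perceive(environment):
    """Perception: turn the environment into an observation (here, the counter value)."""
    return environment["counter"]

def reason(observation, goal, memory):
    """Reasoning: choose the next action, skipping any action that failed before."""
    if observation >= goal:
        return "stop"
    for action in ("double", "increment"):
        if action not in memory.failures:
            return action
    return "stop"

def act(action, environment, goal):
    """Action: execute against the environment and report success or failure."""
    if action == "double":
        if environment["counter"] == 0 or environment["counter"] * 2 > goal:
            return False  # doubling zero does nothing; doubling past the goal overshoots
        environment["counter"] *= 2
    elif action == "increment":
        environment["counter"] += 1
    return True

def run_agent(goal, max_steps=20):
    environment = {"counter": 0}
    memory = Memory()
    for _ in range(max_steps):  # guardrail: a hard cap prevents infinite loops
        observation = perceive(environment)
        action = reason(observation, goal, memory)
        if action == "stop":
            break
        succeeded = act(action, environment, goal)
        memory.short_term.append((observation, action, succeeded))
        if not succeeded:
            memory.failures.add(action)  # remember what not to try again
    return environment["counter"], memory

if __name__ == "__main__":
    final_value, memory = run_agent(goal=3)
    print(final_value, memory.failures)
```

In this toy run the first `double` fails, gets recorded in `memory.failures`, and is never attempted again, so the agent falls back to `increment` until it reaches the goal. A real agent would scope failures to the situation in which they occurred rather than banning an action outright.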

988

160

56K

zmaffeo retweeted

Hiten Shah

@hnshah

7 months ago

This is one of the cleanest explanations I’ve seen of how ChatGPT’s memory actually works. No RAG. No vector search. Just a layered context system that feels personal without the overhead. Anyone building serious AI products should read this.

247

873K

zmaffeo retweeted

Sumanth

@Sumanth_077

9 months ago

IBM just dropped a new Multimodal Model for Document AI!

Granite-Docling-258M is a tiny, ultra-compact, vision-language model specifically designed for end-to-end document conversion.

100% Open Source https://t.co/T2ftd3xGfP

751

120

891

53K

zmaffeo retweeted

Aakash Gupta

@aakashgupta

10 months ago

This guy literally shared a step-by-step roadmap to build your first AI agent - and it’s gold.

720

14K

573K

zmaffeo retweeted

Shannon

@GirlLikesABoy

10 months ago

@CHIEFSML @irenekazakos @Phillies Like i highly doubt he would have taken it directly out of her hands And IT WAS FOR HIS SON

284

59K

zmaffeo retweeted

Akshay 🚀

@akshay_pachaar

11 months ago

Google just dropped a new LLM! You can run it locally on just 0.5 GB RAM. Let's fine-tune this on our own data (100% locally):

183

14K

20K

Zion Maffeo

@zmaffeo

about 1 year ago

@AravSrinivas 2nd

313

zmaffeo retweeted

Paul Couvert

@itsPaulAi

about 1 year ago

Google Jules is the most underrated vibe coding tool

You can start a project with Replit and then assign tasks to Jules to perform autonomously 🔥

1. Replit Agent create the project
2. Create a GitHub repo for your "code"
3. Link Jules (free) to this repo, done!

Steps below: https://t.co/tpa0aFURTy

140

272K

zmaffeo retweeted

Austen Allred

@Austen

about 1 year ago

Gauntlet AI Day 1:

Quick orientation then a rapid introduction to AI-first building.

Students are already working on their first project.

24 hours to turn in the MVP for Project 1.

All code must be written by AI. https://t.co/hhGkkdwYMG

119

45K

zmaffeo retweeted

Pony.ai

@PonyAI_tech

about 1 year ago

Pony.ai’s 7th-Gen #Robotaxi: Built for Safety & Endurance 💪 With our 100% auto-grade gen-7 system, 600,000 km design lifespan – Engineered for reliability. 👍 Every Robotaxi undergoes rigorous tests to ensure performance in real-world conditions. #FutureOfMobility

Zion Maffeo

@zmaffeo

about 1 year ago

@AravSrinivas remove

zmaffeo retweeted

Aadit Sheth

@aaditsh

about 1 year ago

This guy literally made Cursor 10x more useful with this one system

499

15K

859K

zmaffeo retweeted

Tomas Hernando Kofman

@tomas_hk

about 1 year ago

Today we’re launching Prompt Adaptation, a state-of-the-art agentic system that automatically adapts prompts across LLMs. Prompt Adaptation outperforms all other methods and significantly improves accuracy over manual prompt engineering, saving you thousands of hours per year.

635

776

111K

zmaffeo retweeted

Kay.

@kayareyouokay_

about 1 year ago

@amritwt i might sound like a retard but i did this shit (https://t.co/n93bE3nyAe) again in my first sem and then moved on to complex topics, did wonders for me

171

115K
