Justin Angel

@JustinAngel

AI/ML. Worked at Meta, Uber, Amazon, Apple, and Microsoft building apps, developer platforms, and hardware. Tweeting about LLM psychotherapy.

San Francisco, CA

Joined August 2008

397 Following

5.8K Followers

21.4K Posts

Pinned Tweet

Justin Angel

@JustinAngel

9 months ago

🚨 New Preprint: ΔAPT - Can we build an AI Therapist? https://t.co/xeSEALATHj LLMs are already powering AI psychotherapy tools (APTs), but are they clinically effective? This interdisciplinary review + frameworks maps architecture design choices to clinical outcomes. 🧵

318

115K

Justin Angel

@JustinAngel

about 20 hours ago

GPUs go brrrrrr when they dream of us

Justin Angel

@JustinAngel

about 20 hours ago

https://t.co/GPPN37DYeq

123

Justin Angel

@JustinAngel

2 days ago

During NEO @swyx casually dropped "you can make custom ribbons for @aiDotEngineer in SF". Challenge accepted.

11K

2 days ago

100%. I'm a very experiential learner as well: first concepts, then demos, then code, and math a distant last. I wrote a Build your own LLM course based on those preferences. Ended up creating a physical analog perceptron and online demos to teach those foundational concepts. https://t.co/7oY4fJsVhe

263

Justin Angel

@JustinAngel

2 days ago

Can’t access the article because it’s paywalled. Some objections to the methodology based on the appendix:

1. This seems like a random list of 228 words. Were the selection criteria biased? Prove they’re not biased. The “delve” thing always sounded mildly geographically biased since that word is popular in Nigerian English.

2. The research disqualified ~50% of potential papers (3 million papers out of 7 million). That leaves error margins many times the size of the set of articles containing the 228 words.

3. Doesn’t word usage go through a normal cycle of growth and decline? What does this paper actually prove besides that? And the regional distribution thereof?

(Would love to actually read it if you choose to share an accessible copy)

Justin Angel

@JustinAngel

12 days ago

@suchenzang “Least directed execution results” reads like people who don’t do anything. Sounds mean. The real impact of LLM psychosis is mostly felt by people who were dealt a really bad starting hand in life while being manipulated by tech-oligopolies to sustain DAU/revenue numbers.

234

Justin Angel

@JustinAngel

12 days ago

Live streaming the recording of the Build your own LLM workshop @ https://t.co/WMDZ3E1tqW (with breaks during the day)

15K

Justin Angel

@JustinAngel

14 days ago

@tenobrus Sex

Justin Angel

@JustinAngel

14 days ago

@suchenzang So I'm a professional baker, and I have an important question related to MI: why does this bread have two domes? I get three or four, but two just looks weird.

Justin Angel

@JustinAngel

14 days ago

Play around with the circuit yourself @ https://t.co/oStZdUr0Yq

Used in my "Build your own LLM" course to teach about perceptrons, neural nets, and backpropagation.

Heavily inspired by @welchlabs and @ProfTomYeh.

155

Justin Angel

@JustinAngel

14 days ago

I've solidified a neural net perceptron into a physical circuit. 100 billion of these is ChatGPT.

f(x) = wx + b = 1*1 + 1.5 = 2.5

output = weight * input + bias

You can change the input, weight and bias and see the output neuron update. Learning ML can be fun!
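The arithmetic in this tweet can be reproduced in a few lines. This is only a minimal sketch of a single linear neuron, not the circuit's actual implementation; the function name `perceptron` and its parameter names are made up for illustration.

```python
def perceptron(x: float, weight: float, bias: float) -> float:
    """Single linear neuron: output = weight * input + bias."""
    return weight * x + bias


# Values from the tweet: input = 1, weight = 1, bias = 1.5
print(perceptron(x=1.0, weight=1.0, bias=1.5))  # prints 2.5
```

Changing any of the three arguments plays the same role as adjusting the input, weight, or bias on the circuit.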

98K

Justin Angel

@JustinAngel

14 days ago

Also added a ReLU max(0,x) function to more closely resemble GPT-2.

f(x) = w*x+b = -1.5 * 2 + 2 = -1

When numbers go negative, the final output is still 0.

So max(-1, 0) -> the final output is 0.
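The same calculation with the ReLU step can be sketched as follows. The names `relu` and `neuron` are hypothetical, chosen for this example only, and the code is an illustration of the formula in the tweet rather than of the circuit itself.

```python
def relu(x: float) -> float:
    """ReLU activation: negative values are clamped to 0."""
    return max(0.0, x)


def neuron(x: float, weight: float, bias: float) -> float:
    """Linear step (weight * input + bias) followed by ReLU."""
    return relu(weight * x + bias)


# Values from the tweet: w = -1.5, x = 2, b = 2 gives a pre-activation of -1
print(neuron(x=2.0, weight=-1.5, bias=2.0))  # prints 0.0
```

With a positive pre-activation, for example the earlier 1*1 + 1.5, ReLU passes the value through unchanged.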

214

JustinAngel retweeted

Jessica Rumbelow

@JessicaRumbelow

17 days ago

Most unsupervised "feature discovery" in LLMs uses sparse auto-encoders, which work, and which have been scaled to millions of features on frontier-scale models, but which bundle two distinct commitments – a reconstruction loss and a sparsity loss over a fixed-size dictionary – into a single training objective. Those commitments make sense if your goal is reconstructive decomposition. They make less obvious sense if your aim is to find interpretable structure (directions? features?) in activation space, to retrieve representative examples, identify causal interventions, or measure how representations change across layers and inputs. It turns out a lot of that doesn't need the full SAE machinery.

Exemplar Partitioning (EP) uses leader-clustering (Hartigan, 1975!) to cover the activation manifold with observed exemplars at a calibrated resolution, resulting in a Voronoi partition of activation space that you can read like a feature dictionary. EP makes one streaming pass over the data until saturation (when no new exemplars form), and uses no backward passes or gradient descent. The animation above shows the algorithm – each new activation either joins an existing cell (close enough to an exemplar) or seeds a new one. It's extraordinarily simple and cheap.

On AxBench latent concept detection at Gemma-2-2B-it L20, EP reaches 0.881 mean AUROC across 500 concepts. That's within 0.03 of SAE-A (AxBench's strongest dictionary-based baseline), and +0.126 over the canonical GemmaScope 16k SAE leaderboard entry – with about 1,000× less build compute.

And you can do a lot of interesting stuff with the resulting dictionary! If you build it on a mix of harmful and benign prompts, one region absorbs most of the refusing prompts. Projecting held-out harmful prompts off that exemplar's direction collapses refusal from around 0.98 to around 0.02 – the same ballpark as dedicated refusal-direction work (Arditi et al., 2024).

If you build the EP dictionary to saturation on a corpus (e.g. the Pile), distance-to-nearest-exemplar becomes a graded measure of distribution shift, for free. Random-token-sequence activations sit measurably further out than Pile activations, and Bulgarian Wikipedia (under-represented in the Pile but not really OOD) sits between the two.

Because exemplars are real activations rather than learned decoder columns, you can match dictionaries across different models by their exemplars. If you match EP dictionaries from base vs instruction-tuned Gemma-2-2B, only a handful of regions survive as common, mostly general-purpose syntactic patterns. You can also see how the base model already represents "harmful" as a direction at earlier prompt positions, and instruction tuning pulls it forward to the final-token activation where the refusal decision is made.

The saturated size of a dictionary on a given input stream is itself a measurement of that stream's activation geometry at each layer. On the same model, the proportion of activation space dedicated to chat grows monotonically with depth, code is essentially flat across the network (and lives in a smaller area of activation space than chat does, at every layer), and math is non-monotonic, peaking in the middle.

EP and SAEs don't converge on the same features, aside from a shared core of about 20%. The two methods make different geometric commitments – SAEs to linear separability, EP to density.

The experiments I've done so far are small-scale and exploratory, and I have only tested on Gemma-2-2B. There's a huge amount of further work to be done (both in terms of improving the method and applying it to more tasks), some of which is discussed in the post and paper. If you are an interpretability researcher interested in developing this method please check out the github repo and get stuck in!

Post: https://t.co/xdzevM6bfo

Paper: https://t.co/9oFYOYjznv

Code: https://t.co/p0z6r8LIJr
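The join-or-seed step described in the thread can be illustrated with a toy leader-clustering pass. This is a rough sketch under stated assumptions, not the code from the linked repo: Euclidean distance, a fixed `radius` threshold, assignment to the nearest exemplar, and the names `leader_cluster` and `distance_to_nearest` are all choices made for this example, and the 2-D points stand in for real activations.

```python
import math


def leader_cluster(points, radius):
    """One streaming pass: each point joins its nearest exemplar if that
    exemplar is within `radius`, otherwise it becomes a new exemplar."""
    exemplars = []
    assignments = []
    for p in points:
        best, best_d = None, math.inf
        for i, e in enumerate(exemplars):
            d = math.dist(p, e)
            if d < best_d:
                best, best_d = i, d
        if best is not None and best_d <= radius:
            assignments.append(best)            # joins an existing cell
        else:
            exemplars.append(p)                 # seeds a new cell
            assignments.append(len(exemplars) - 1)
    return exemplars, assignments


def distance_to_nearest(exemplars, x):
    """Distance from x to its closest exemplar (larger = further from the data)."""
    return min(math.dist(x, e) for e in exemplars)


# Toy 2-D "activations" (assumed data, for illustration only)
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2), (0.0, 0.2)]
exemplars, assignments = leader_cluster(points, radius=1.0)
print(exemplars)    # [(0.0, 0.0), (5.0, 5.0)]
print(assignments)  # [0, 0, 1, 1, 0]
print(distance_to_nearest(exemplars, (5.0, 8.0)))  # 3.0
```

No gradients are involved: the exemplars are just observed points, and the distance to the nearest one gives a simple score for how far a new input sits from the data the dictionary was built on.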

Justin Angel

@JustinAngel

18 days ago

Light weekend reading, focused on overviews of pragmatic mechanistic interpretability. As opposed to the inactionable and impractical kind, I guess?

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability @ https://t.co/s7JfsQ6OY1

Practical Review of Mechanistic Interpretability @ https://t.co/62DXF7wpjr

204

JustinAngel retweeted

Jessica Rumbelow

@JessicaRumbelow

20 days ago

Voronoi partitions on activations reveal interpretable structure with orders of magnitude less compute than SAEs! Here is an introduction to a new interpretability method: https://t.co/lFZZJMmLi9

262

190

26K

JustinAngel retweeted

Justin Angel

@JustinAngel

21 days ago

@tdietterich @arxiv My guess is that this policy will be applied selectively depending on institutional privilege and personal notoriety. It'll end up as a tool for silencing unconnected individuals rather than promoting better scientific discourse. I aspire to be wrong.

191

35K

Justin Angel

@JustinAngel

21 days ago

Every meta-analysis and review on LLM writing detection says the technology doesn't work. How will you apply a biased technology in a fair and consistent way? "human detection accuracy varied widely but generally clustered around chance performance" (Ramos, 2026) @ https://t.co/kF2qtc4JF5
