Jack Friedson

@JackFriedson

building something new · prev infra/product eng @haizelabs, applied AI @datadog

New York

Joined November 2023

503 Following

89 Followers

163 Posts

Pinned Tweet

Jack Friedson

@JackFriedson

18 days ago

hotter take: ML is a product design problem
your ability to improve any ML-based product depends on the volume and quality of your data. if your product isn't purpose-built to capture as much of this data in as high quality a way as possible, you're gonna have a bad time

Viv

@Vtrivedy10

18 days ago

maybe hot take 👀
Improving Agents is a Data Mining Problem
Harness Engineering, Post-Training, Continual Learning...these all boil down to the same underlying substrate - Mining Agent Traces
1. I need to run my agents to collect Traces
2. Understand behaviors from Traces at scale
3. Filter data for "improvement"
4. Do an "improvement experiment"
There's a reason why every continual learning platform ends up looking like an observability platform. It's because Traces are the lifeblood of agent improvement and data is king 👑
The mechanism that we use to attempt improvement can vary - Harness Eng, SFT, etc. But without understanding the data agents produce, no algorithm will truly build better agents.
The holy grail of Agent Improvement is Continual Learning. Consistently mining data and integrating it into the agent definition over infinitely long time horizons.
Today, the easiest way to do that is to build an observability platform and constantly point agentic compute to understand the data that agents produce

JackFriedson retweeted

Mitchell Hashimoto

@mitchellh

about 16 hours ago

The problem with the "if it works who cares what the code looks like" mindset for agentic work is that it assumes the agent has a perfect understanding of "works." Realistically, things are underspecified, agents make bad assumptions, etc. To be fair, agents are pretty good at unit test coverage. They're pretty bad at designing human experiences (API, CLI flags, etc.), especially cohesive ones for future roadmap plans they may not have visibility into (unless your backlog is perfect and vision fully laid out, which I doubt). They're bad at knowing where performance matters and what type (CPU vs memory tradeoffs). They're bad at where compatibility matters and where it doesn't (and tend to err on the side of preserving it without further guidance). Etc. Unless you have this ALL specified, you can't possibly claim "it works" without taking a look and thinking about it.

211

651

107K

JackFriedson retweeted

Dean W. Ball

@deanwball

4 days ago

AmErIcAn DyNaMiSm

736

31K

JackFriedson retweeted

Tenobrus (→vibecamp)

@tenobrus

4 days ago

- i'm angry about this because i personally and for others want access to fable, and simultaneously believe anthropic's safeguards were sufficient and the US government badly misunderstood the information they were presented
- but in abstract this is in fact exactly what I want. it's heartening to see the USG treat artificial intelligence with the seriousness and immediacy it deserves. this kind of swift action is what might have a chance of saving us from unaligned RSI.
- but i also very much don't trust *this* government to handle this well, to take sane unilateral action, to chart any kind of correct path.
- and this escalates the global race enormously. this is as strong a signal as you can get to, not just China but the EU and even our closest allies, that the US will not be sharing this advantage. that if they want sovereignty they're going to have to fight for it
- obviously, that was always the case, and it was always going to happen eventually. but i don't think now was the time to send that signal. it would have been better to delay as long as possible.
very mixed feelings today

169

111K

Jack Friedson

@JackFriedson

6 days ago

@dbreunig if only there was some sort of analogous human system for making judgements about fault and accountability as related to a set of natural language rules…

107

Jack Friedson

@JackFriedson

6 days ago

one of the biggest blockers to truly automated software engineering is agents' inability to rigorously verify their work. this is exactly the problem @niteshiftdev solves, and I can't imagine a better team than Sajid and Conor to be working on it

Sajid Mehmood

@smehmood

6 days ago

We're launching @niteshiftdev – the full-stack cloud for coding agents Verification is the new bottleneck. Software teams can now define their dev environment and verification tools once. Then run any frontier agent in the cloud: Claude Code, Codex, or OpenCode

205

106

81K

JackFriedson retweeted

Tenobrus (→vibecamp)

@tenobrus

7 days ago

fuck man i'm about to enter full on llm-psychosis 18 hour workdays again fable is hyperaddictive

941

35K

JackFriedson retweeted

Dean W. Ball

@deanwball

6 days ago

My last observation re: Anthropic’s secret sabotage safety policy, is that it undermines actually good safety policy. How?
1. First, it is very plausible to describe this as anti-competitive behavior (even if you are maximally sympathetic to Anthropic here you must admit this), and it is behavior being justified in the name of AI safety. If you believe, as I and many Anthropic staff do, that it may end up being critically important to relax antitrust enforcement so that the frontier labs can cooperate and collaborate on some areas of AI safety, Anthropic just undermined the case for that in a large way.
2. Overall, this massively and profoundly raises the status of the argument that AI safety has been hype to justify monopolistic behavior by labs. I continue to believe AI safety is a real and serious issue that is growing in importance rather than diminishing. If you agree with me, this incident is a setback, maybe a serious one.
3. As I have observed elsewhere, Anthropic’s official corporate policy is structurally identical to the fact pattern alleged against them by the Department of War. I still think DoW acted both falsely and wrongly in that fight, but it is no longer possible to defend Anthropic with a full throat after this incident.
4. This raises the case for heavier handed regulations. Anthropic is making an awfully good case here that their products ought to be treated as utilities, and thus that their alignment practices should be a matter of public policy rather than private property. I am starkly opposed to this sort of state power grab, but Anthropic is doing more to justify it than anyone else.
5. Thus, significant damage has been done to a community and entire approach to AI governance. It was done unilaterally by Anthropic, likely motivated largely by self-interest and justified within the internal psychology of the firm through the lens of safety.
I suspect this is fixable in the economic and legal senses for Anthropic, but I fear the trust that has just been broken, and the goodwill extinguished, will take very much time to repair.

132

343

150K

JackFriedson retweeted

Nathan Lambert

@natolambert

7 days ago

Labs starting to pull up the ladders on the ability to diffuse AI was inevitable. Doing it without telling the user is misaligned.

187

298

288K

Jack Friedson

@JackFriedson

7 days ago

@JayaGup10 only read the first 8 words or so but 100% agree

234

Jack Friedson

@JackFriedson

7 days ago

every "coding is solved" argument I've heard relies on this same incredibly pedantic interpretation of the word "coding"
yes, obviously there is more to software eng than coding. but defining coding so narrowly that it excludes quality, maintainability, and performance (all of which require domain understanding) makes the argument at best specious, and at worst deliberately misleading

Boris Cherny

@bcherny

8 days ago

Coding is just one part of engineering. There’s also debugging, operating services, scaling up infrastructure, deciding what to optimize, setting up hardware and capacity, talking to users, product planning, etc. Coding is the easy part, everything else is not yet solved (but is also becoming increasingly automated).

429

130K

Jack Friedson

@JackFriedson

8 days ago

I think this is obviously the case, no? the hard problem has always been how to turn complex, subjective human preferences into measurable, hill-climbable objectives. RL is only a valuable tool if you've already solved that problem

Jack Friedson

@JackFriedson

9 days ago

@scottastevenson isn't this more of a semantic issue around the definition of "eval"? like I view this less as "evals don't work" and more "people suck at doing evals". if you're building a user-facing AI product, obviously you should be incorporating user interaction data into your evals

Jack Friedson

@JackFriedson

11 days ago

someone needs to invent a coding agent you can throw things at

Jack Friedson

@JackFriedson

11 days ago

@charliermarsh *handle

108

Jack Friedson

@JackFriedson

11 days ago

@charliermarsh why is everything a "handler" or a "projection"?? we are not writing a filesystem or using cqrs

671

JackFriedson retweeted

Charlie Marsh

@charliermarsh

11 days ago

Becoming radicalized against meaningless LLM-induced nouns in code and comments: seam, cut, slice, shard, glue, spine, lane, etc.

368

62K

Jack Friedson

@JackFriedson

11 days ago

> ask agent to review effect-ts docs for best-practices we should follow
> "we often call Effect.ignore without logging anything"
> "so true, please fix that"
> it adds TWELVE module-local helpers wrapping Effect.ignore
> open effect docs
> Effect.ignore takes a `log` parameter

Jack Friedson

@JackFriedson

11 days ago

@zeeg unfortunately the key to building good software is also the key to being perpetually unsatisfied

@AndrewCurran_

11 days ago

SpaceX's revenue could reach $3.4 trillion by 2040, according to analysis Morgan Stanley shared with investors yesterday. Goldman Sachs also made similar projections yesterday that it could hit $322 billion by 2030.

127

25K

Jack Friedson

@JackFriedson

12 days ago

@dbreunig if building "in distribution" software becomes 10x easier, presumably that results in some sort of mode collapse, right? like you'd expect the "distribution of generated software" to become very spiky around whatever things models are good at today?
