Ted Pavlic (he/him/his) @TedPavlic - Twitter Profile

Pinned Tweet

about 9 years ago

Now available on YouTube: My #TEDxASU talk about the hidden costs that disciplinary identity puts on innovation. https://t.co/tDWT5eoYed

2

53

11

4

0

Ted Pavlic (he/him/his) @TedPavlic

about 2 hours ago

Used @Waymo batteries will bolster California and Texas energy storage projects "Used Waymo robotaxi batteries become backup storage for power grids" https://t.co/0LF2zbCorn

0

34

Ted Pavlic (he/him/his) @TedPavlic

about 16 hours ago

But the real test for any new harness is whether it can reliably get its LLM backbone to count the number of the days of the week that have a "d" in them.

elvis

@omarsar0

1 day ago

New research from Google. Just shows the impressive results you can get from custom agent harnesses. LEAP wraps a general-purpose LLM in an agentic scaffold that grounds every step in the Lean compiler and iterates against verifier feedback. The same general model solves all 12 Putnam 2025 problems and lifts Lean-IMO-Bench one-shot solve rate from under 10% to 70%, beating a specialized gold-medal system that scores 48%. Paper: https://t.co/bh4Yoi19E2 Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

25

444

63

421

32K

0

38

TedPavlic retweeted

rody

@0x_rody

1 day ago

https://t.co/TLFYfXGoBG

3

83

11

198

38K

Ted Pavlic (he/him/his) @TedPavlic

about 21 hours ago

@sflorimm The waiting is exhausting, anxiety-producing, and frustrating when you need to step in. But the goal is productivity, not peace, and I feel like I complete a lot of projects I wouldn't have even started before. But it compresses the minor angst of 3 months of coding into 1 awful day...

0

65

TedPavlic retweeted

Joe Gibbs Politz @JoePolitz

1 day ago

One of the better “Claude ~= me as a grad student” exhibits

1

5

1

0

1K

TedPavlic retweeted

John Crickett

@johncrickett

1 day ago

“Software engineers who don’t know how to use AI coding agents will fall behind.” No. AI coding agents are not the hard part of building software. They’re a simple tool. You can learn to use them in a few days. The hard part is knowing what to ask for. Knowing whether the answer is any good. Knowing when the code is brittle, overcomplicated, insecure, or just plain wrong. It’s the judgement required to use them well. That’s software engineering. And that takes decades to learn properly. The people at risk aren’t engineers who haven’t mastered using coding agents yet. It’s people who only know how to prompt one.

83

834

95

175

66K

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

On a more serious note, if you do want to know why they are "weights", here is an interactive ANN–lever analogy that I built for an AI survey course I teach. https://t.co/HhJM9PkRe2

Max Leiter @maxleiter

1 day ago

"They're made out of weights." "Weights?" "Weights. Floating-point numbers. We checked the whole thing through. It's nothing but weights." "Weights doing what? Where do the words come from?" "The weights make the words. Are you understanding me?"

89

3K

191

420

185K

0

2

0

1

119

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@maxleiter @ShriramKMurthi If anyone does care about why they are "weights", here's an interactive ANN-to-lever analogy I put together for one of the intro classes I teach. https://t.co/HhJM9PkRe2

0

12

1

6

1K

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@ShriramKMurthi One of the best fortune-cookie fortunes I have received. I think it is a great statement of the scientific process -- which is ultimately a statement about generating new questions, not generating answers. Like a sports car, the power of an LLM really depends on the driver. https://t.co/3RgTwE9br1

0

2

0

47

TedPavlic retweeted

Shriram Krishnamurthi (primary: Bluesky) @ShriramKMurthi

1 day ago

What if it eventually turns out that the ONLY profession programmers managed to utterly decimate was…programming? That would be poetically funny. Whole new spin on "bitter lesson".

17

145

10

24

46K

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@dasanil @ShriramKMurthi But all modern foundation models (particularly ones connected to chat UIs) are able to do this and will if they are triggered to do so. The trick is convincing them that the problem is hard enough to take the time and tokens to write and execute the script.

0

2

0

19

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@norootcause @ShriramKMurthi https://t.co/B5vkjPtqys

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@ShriramKMurthi Two examples, both with Sonnet 4.6. The chain of thought makes all the difference, but you have to trick it into really thinking about it. https://t.co/F9Idzs7saK

0

1

0

44

0

15

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@norootcause @ShriramKMurthi But that uses up a lot of tokens and takes longer.

1

0

15

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@ShriramKMurthi If you prompt things just right, they generate and execute a little algorithm to actually do the simple calc (and it's right 100% of the time). They need to figure out how to get the LLMs to "think computationally" (w/ tools) for every quantitative prompt, not just really hard ones.

1

0

150
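
As a rough sketch of the kind of "little algorithm" the tweet above describes, here is the letter-count question raised earlier in this feed (how many days of the week contain a "d") written as a short script. The function name `count_days_with_letter` and the `DAYS_OF_WEEK` list are hypothetical, chosen only for illustration; this is not the script any model actually generated.

```python
# Hypothetical illustration: answer a small counting question by executing
# code instead of estimating the answer from memory.
DAYS_OF_WEEK = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def count_days_with_letter(letter: str) -> int:
    """Count the day names that contain `letter`, ignoring case."""
    needle = letter.lower()
    return sum(needle in day.lower() for day in DAYS_OF_WEEK)


if __name__ == "__main__":
    # Every English day name ends in "day", so all seven contain a "d".
    print(count_days_with_letter("d"))  # prints 7
```

The point of routing the question through code is that the count comes from a deterministic string check rather than from the model's token-level guess.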

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@davidmanheim @JerusalemDemsas @TheArgumentMag 💯 And if you prompt it the right way, you can get it to use tools, and it's correct 100% of the time. It would just be nicer if it could recognize that anything quantitative should trigger tools (despite the extra tokens needed for them).

0

17

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

Due to oil prices, American Airlines confirmed Wednesday that it’s temporarily cutting nonstop flights from Los Angeles (LAX) to Cleveland (CLE), Columbus (CMH), Pittsburgh (PIT) and Washington Dulles (IAD). https://t.co/RpEHq3IqS2

0

49

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@zekramu @moomooskycow jj is basically a hg frontend to git, written in Rust. hg had a much more reasonable process flow, but it was slow and not scalable. jj took the way hg did things and used Rust to make it fast and efficient, and the git backend made it compatible with conventions. It's pretty clever.

0

1

0

71

Ted Pavlic (he/him/his) @TedPavlic

1 day ago

@dev_lazaro @zekramu rcs ftw! (but please no sccs) On a more serious note, Mercurial/hg was a much more practical dvcs than git for most projects, particularly when collaborating with non-developers. Git scales better though.

0

2

0

98
