mark williams @markjwill - Twitter Profile

17 days ago

Are Berkeley Law faculty prohibited from using AI-detection tools, or merely warned that AI-detection tools may be inaccurate and biased? I'm curious what schools are doing about this, as I've only seen warnings. It's news to me that it was obvious in the 2000s that ML is biased. If that were accurate, a number of groundbreaking studies published starting around 2016 wouldn't have been such a big deal. Prohibiting AI-detection tools does not solve the bias problem. As @olawaleidowu_ pointed out, the concern is AI Police who are overconfident in their ability to identify AI-generated content. https://t.co/EXgoRyD6vm

1

8

2

1

2K

markjwill retweeted

Andrej Karpathy

@karpathy

21 days ago

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

8K

150K

11K

14K

27M

markjwill retweeted

John Nay

@johnjnay

about 1 month ago

we’ve developed proprietary benchmarks for legal reasoning maintained by attorneys at Norm. because Norm Law attorneys, while serving as outside counsel to hedge funds, PE firms, etc. deploy AI agents in their day-to-day work, we can uniquely build real legal AI benchmarking. we've been tracking frontier models across generations, and the trend is clear: models are improving substantially in their legal reasoning capabilities. the latest generation of models are nearly indistinguishable from each other in accurately answering legal questions. most models are increasingly consistent, with the most recent generation of frontier models reaching the same conclusion ~90% of the time. but for high-stakes legal work, even the best models on this benchmark reach a different answer often enough that, at scale, users receive contradictory answers to the same question every week. to integrate ai agents into high-stakes legal workflows, you need both (1) purpose-built systems that can constrain, verify, and govern AI reasoning automatically, and (2) the human overlay of expertise for live workflows fully intertwined/integrated w/ ai agents in a deliberate process.


6

43

5

66

5K

markjwill retweeted

Ed Walters @EJWalters

about 2 months ago

If you think clients buy documents and hours from law firms, this is bad news for lawyers. But if you believe that clients seek discernment, counseling, risk management, and advice from law firms, and documents are just artifacts, this is terrific news for lawyers and clients.

3

13

2

3

1K

markjwill retweeted

Dean W. Ball

@deanwball

3 months ago

Think about the power Hegseth is asserting here. He is claiming that the DoD can force all contractors to stop doing business of any kind with arbitrary other companies. In other words, every operating system vendor, every manufacturer of hardware, every hyperscaler, every type of firm the DoD contracts with—all their services and products can be denied to any economic actor at will by the Secretary of War. This is obviously a psychotic power grab. It is almost surely illegal, but the message it sends is that the United States Government is a completely unreliable partner for any kind of business. The damage done to our business environment is profound. No amount of deregulatory vibes sent by this administration matters compared to this arson.

530

13K

3K

1K

1M

markjwill retweeted

Dean W. Ball

@deanwball

4 months ago

The belief that AI is mostly fake remains shockingly common, including, as Derek points out, among journalists and takesmen, but also *people with AI in their job titles.* There are people *who do AI policy for a living* who think AI is mostly fake and don’t use it.

30

328

15

23

44K

markjwill retweeted

Rany Jazayerli @jazayerli

4 months ago

The clear winner of the first half: Patrick Mahomes. Y'all wanted someone other than the Chiefs in the Super Bowl. You reap what you sow.

516

5K

397

90

425K

markjwill retweeted

Ethan Mollick

@emollick

4 months ago

If you are considering taking a job offer, you may want to ask what your token budget will be.

85

2K

223

289

532K

markjwill retweeted

Kevin Roose

@kevinroose

4 months ago

don't worry guys, they're just stochastic parrots

66

1K

54

63

121K

markjwill retweeted

John Palmer

@johnpalmer

5 months ago

you kinda seem more like a Claude Cowork user (derogatory)

12

578

20

36

29K

markjwill retweeted

Anthony

@omgitsbirdman

5 months ago

Josh Allen always better than Pat until it's time to actually be better than Pat

91

16K

2K

320

449K

markjwill retweeted

Hammer and Nigel

@hammerandnigel

5 months ago

TRANSFER PORTAL UPDATE: (And if you get this joke, we could totally drink moonshine together and be friends)

617

10K

981

315

643K

markjwill retweeted

Ahmad

@TheAhmadOsman

5 months ago

me watching Claude Code write the code for me

139

11K

628

465

323K

markjwill retweeted

Chase Snyder

@ChasingSnyder

6 months ago

Here’s the specific 5 minutes everything changed at Arrowhead and it became Arrowhead. Everyone knows it when you go through history. Broncos/Chiefs, 1990. False start, Broncos. Elway goes back to the line again, realizes they’re screwed to even operate, and has to BEG for help.

25

1K

153

257

173K

markjwill retweeted

Idaho Football @VandalFootball

6 months ago

Flagship ✌️

4

134

26

1

14K

markjwill retweeted

Ethan Mollick

@emollick

6 months ago

I meet a lot of very smart AI critics who never seriously try to make AI work for them by spending a couple of hours with a frontier model. People can be (and should be & are) critical after realizing what AI can do, but experience leads to better-informed and sharper critiques.

43

520

34

49

35K

markjwill retweeted

Ethan Mollick

@emollick

11 months ago

The problem is not just the proliferation of devices that let you record people without their knowledge, but the fact that multimodal LLMs let you use recordings in ways that neither law nor society anticipated. Everyone has an easy way to mine hours of footage. No forgetting. https://t.co/Zfl5YPnyPR

25

230

22

40

28K

markjwill retweeted

Ethan Mollick

@emollick

11 months ago

So every major model is already exceeding or will soon exceed the EU's systemic risk FLOP limit when it comes into effect next year. https://t.co/QURBkar14v

25

361

44

109

48K

markjwill retweeted

Ethan Mollick

@emollick

12 months ago

Many firms built around the limitations & cost assumptions of GPT-3.5 class models, and are now stuck with complex solutions that are more expensive & worse than a reasoner without any scaffolding. You need to build solutions with an eye towards riding the cost/performance curve.

24

592

64

169

69K

markjwill retweeted

Ethan Mollick

@emollick

12 months ago

This study is being massively misinterpreted. College students who wrote an essay with LLM help engaged less with the essay & thus were less engaged when they (a total of 9 people) were asked to do similar work weeks later. LLMs do not rot your brain. Being lazy & not learning does.

87

2K

244

387

219K
