Bernard Koch (Bernie) @bernardjkoch - Twitter Profile

6 months ago

old-style benchmarks that measure narrow capabilities precisely, and qualitative assessments that are holistic but can't produce clear evidence of progress. Without better evaluation methods that bridge these perspectives, it's hard to gauge the true potential and limits of LLMs

Bernard Koch (Bernie) @bernardjkoch

6 months ago

Hi friends, I wanted to share a TIME op-ed David Peterson and I wrote about the history of evaluation in AI. https://t.co/6A9krRUtjF Core argument: The inability to evaluate AI's potential precipitated the bubble and winter of the 1980s. Today, we face a similar problem.

Bernard Koch (Bernie) @bernardjkoch

6 months ago

Many creative and process-based tasks we now seek to automate can’t be benchmarked. There is no "correct" PowerPoint or scientific hypothesis. As a result, new models are evaluated as much by "vibe tests" as concrete metrics. We're caught between two limited approaches:

bernardjkoch retweeted

Alex Hanna (اليكس حنٌا) @alexhanna

over 4 years ago

Wow! Honored that our paper with Bernie Koch, @cephaloponderer, and Jacob Foster won a best paper award at the NeurIPS Dataset and Benchmark track! So pleased that a sociology of science paper won such an honor at NeurIPS. https://t.co/oYlJNTKsrO

bernardjkoch retweeted

Bernard Koch (Bernie) @bernardjkoch

over 4 years ago

@pablogerbas Thanks for sharing beyond my loyal 13 followers Pablo. :) I'd just add that this literature is exciting because it provides interesting directions not just for heterogeneous effect estimation, but also CI with text, graphs, and images!

bernardjkoch retweeted

Pablo Geraldo B. @pablogerbas

over 4 years ago

And, to make things even better, the review paper is accompanied by a detailed tutorial in TensorFlow 2, so you can try it by yourself! https://t.co/U4HZEUgW0m

bernardjkoch retweeted

Pablo Geraldo B. @pablogerbas

over 4 years ago

The amazing @bernardjkoch just posted on Arxiv "Deep Learning of Potential Outcomes". If you're either familiar with causality but not with deep learning, or the other way around, this is a great place to start! https://t.co/CQBRPQ9yqQ
