Blair Johnson @oijna - Twitter Profile

7 months ago

There's this Copilot ad where the guy asks his computer how much thrust the Saturn V produced in hatchback cars (equivalent) and Copilot says 90.

0

29

Blair Johnson @oijna

about 1 year ago

@juliendelavande @huggingface How is power consumption estimated?

1

0

243

Blair Johnson @oijna

about 1 year ago

@kalomaze @findboundary You aren’t in bad shape. A new driver is backwards compatible with old CUDA toolkits. Torch doesn’t usually come packaged with nvcc, so it isn’t surprising that the compiler is out of date. Just look for a cudatoolkit-dev that matches your PyTorch CUDA version.
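A minimal sketch of the version check suggested above, assuming you already have the text printed by `nvcc --version` and the CUDA version string your PyTorch build reports (for example via `torch.version.cuda`). The helper names `nvcc_release` and `toolkit_matches` are hypothetical, and the sample string only illustrates the usual "release X.Y" line.

```python
import re


def nvcc_release(nvcc_output: str):
    """Return (major, minor) from the 'release X.Y' text in nvcc output, or None."""
    m = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(m.group(1)), int(m.group(2))) if m else None


def toolkit_matches(torch_cuda: str, nvcc_output: str) -> bool:
    """Compare a PyTorch CUDA version string such as '11.8' with the nvcc release."""
    parts = torch_cuda.split(".")
    wanted = (int(parts[0]), int(parts[1]))
    return nvcc_release(nvcc_output) == wanted


# Illustrative input only; in practice pass the real output of `nvcc --version`.
sample = "Cuda compilation tools, release 11.8, V11.8.89"
print(toolkit_matches("11.8", sample))
```

If the two versions differ, that is the cue to install a toolkit package matching the PyTorch CUDA version rather than to change the driver.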

0

12

0

3

2K

Blair Johnson @oijna

over 1 year ago

No notes.

0

1

0

158

Blair Johnson @oijna

over 1 year ago

My For You page was 37 Elon Musk tweets in a row today. I don’t follow him.

0

3

0

131

Blair Johnson @oijna

about 2 years ago

After dozens of visits, Google still ranks the JAX documentation below a pile of garbage about Jacksonville, local businesses, and the singer whenever I search “jax”.

0

126

Blair Johnson @oijna

over 2 years ago

@StasBekman We usually see a ~10°C delta between the “front” A100s on our SuperMicro nodes and the “back” A100s that breathe their exhaust.

1

0

41

Blair Johnson @oijna

over 2 years ago

Jax is so fast.

0

207

oijna retweeted

Mark Riedl @mark_riedl

over 2 years ago

You can apply for early pilot access to the National AI Research Resource (NAIRR): https://t.co/cGrX4As7eo NSF's Press Release: https://t.co/0w8vQR6yXB

0

11

2

7

3K

Blair Johnson @oijna

over 2 years ago

Just tested and confirmed that Copilot in Outlook is vulnerable to the unicode exploit.

0

1

0

132

Blair Johnson @oijna

over 2 years ago

@DrJimFan I’ll be interested to see what happens after the first big leak of one of these kinds of models. A company puts a lot of potentially valuable data into a single 25GB crown jewel, but that data is inherently noisy and unreliable to competitors without ground truth.

2

5

0

421

Blair Johnson @oijna

over 2 years ago

@felix_red_panda @artificialguybr It looks like you can get ERA5 data for free (as recent as 6 days ago) from the Climate Data Store (seems to be an EU public service).

0

2

0

65

Blair Johnson @oijna

over 2 years ago

@Karmedge These benchmarks are for very specific formal reasoning tasks which GPT-4 is probably not heavily trained on (“all cats are blue, tigers are cats, Richard is a tiger, is Richard blue?”). It’s not that surprising that a fine-tuned task-specific model can perform comparably.

0

1

0

349

Blair Johnson @oijna

over 2 years ago

The only F1 stat that matters.

0

1

0

233

Blair Johnson @oijna

almost 3 years ago

@jeremyphoward A bigger monitor and htop 😉

0

288

Blair Johnson @oijna

about 3 years ago

@TensorFlow Discovering that they need to read EXIF data for mobile phone images to be right-side up.

0

105

Blair Johnson @oijna

about 3 years ago

@OfirPress I also see a lot of people citing minor performance degradation on ARC/WinoGrande/HellaSwag/MMLU. Very few of the tasks in these benchmarks are longer than a few hundred tokens, let alone the many thousands of tokens in the context lengths that they're trying to evaluate.

1

0

154

oijna retweeted

Ofir Press

@OfirPress

about 3 years ago

@ggerganov @theemozilla @yacineMTB Watch out: models with different context window lengths aren't so easily comparable, and these results might be misleading. I talk about this here: https://t.co/erEt0XtFt9

3

41

3

17

4K

Blair Johnson @oijna

about 3 years ago

@theemozilla @teknium @ggerganov @yacineMTB How long are the benchmark tasks in these datasets? It would be great to rule out the possibility that the benchmarks aren’t really testing / don’t necessitate the full context.

0

1

0

557

Blair Johnson @oijna

about 3 years ago

This is the distribution of prompt+output lengths (in tokens) for the WizardLM 70k dataset. The examples >3k tokens seem to get there through listing lots of things, long junk code, and repetitive number sequences. Warrants filtering, and not helpful for evaluating long context.

oijna's tweet photo: https://t.co/BhIafwpo2c
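A rough sketch of how a length distribution like this could be computed and used for filtering. This is not the code behind the plot: the `prompt` and `output` field names are assumed, whitespace splitting stands in for a real tokenizer, and the 3,000-token cutoff is only taken from the ">3k" remark above.

```python
from collections import Counter


def token_length(example, tokenize=str.split):
    """Prompt + output length in tokens (whitespace split is a stand-in tokenizer)."""
    return len(tokenize(example["prompt"])) + len(tokenize(example["output"]))


def length_histogram(examples, bin_size=500, tokenize=str.split):
    """Count examples per length bin, keyed by the lower edge of each bin."""
    counts = Counter()
    for ex in examples:
        counts[(token_length(ex, tokenize) // bin_size) * bin_size] += 1
    return dict(sorted(counts.items()))


def filter_by_length(examples, max_tokens=3000, tokenize=str.split):
    """Keep only examples whose combined length is at most max_tokens."""
    return [ex for ex in examples if token_length(ex, tokenize) <= max_tokens]


# Toy data with hypothetical field names; the second entry mimics a repetitive outlier.
examples = [
    {"prompt": "a b c", "output": "d e"},
    {"prompt": "x " * 4000, "output": "y"},
]
print(length_histogram(examples))
print(len(filter_by_length(examples)))
```

Swapping `tokenize` for the target model's tokenizer would give counts comparable to the ones discussed in the tweet.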

0

4

0

196
