Lousy Scout Trooper @rwohleb - Twitter Profile

Pinned Tweet

Lousy Scout Trooper @rwohleb

over 5 years ago

F*** you, 2020. 😢

io9

@io9

over 5 years ago

We received news this morning that the iconic SETI dish at Arecibo will be demolished https://t.co/aJW9pYZnmL

9

41

16

0

1

0

rwohleb retweeted

Stew Peters @realstewpeters

3 days ago

REP HUFFMAN: “The CEO of Trump’s Freedom 250 flew to Davos, stood in front of a room full of foreign governments and asked them ‘how they wanted to shape America’s birthday’. Then they stole hundreds of millions and committed wire fraud.” Holy shit.

217

13K

4K

2K

395K

rwohleb retweeted

Peter Dedene

@dedene

about 1 month ago

POV: you're still using GitHub Copilot after June 1st, 2026

246

17K

1K

3K

2M

rwohleb retweeted

kristina.

@cosmepolitics

about 2 months ago

oh this isn’t an exaggeration. this is exactly how tucker carlson is.

69

17K

777

2K

790K

rwohleb retweeted

Muratcan Koylan

@muratcan

4 months ago

We benchmarked NVIDIA’s new Nemotron 3 Super in two modes, Thinking Off and High Thinking, across three medical evaluation sets: MedMCQA, MedCaseReasoning, and MedXpertQA.

Thinking Off outperformed High Thinking: 26.4% vs. 25.2% accuracy. The cost gap was much larger than the accuracy gap. High Thinking increased mean latency from 1.13s to 4.43s and mean completion length from 109 tokens to 1,089 tokens. In our setup, the higher-reasoning mode was much slower and more verbose, without improving aggregate results.

The benchmark-level split was more revealing than the overall average. On MedMCQA, accuracy dropped from 56.6% to 49.1% with High Thinking. On MedCaseReasoning, it also declined, from 24.4% to 20.2%.

The only clear gain was on MedXpertQA, where High Thinking improved accuracy from 9.2% to 15.0%. That pattern fits the benchmark design: MedMCQA rewards concise answer selection on constrained multiple-choice questions, while MedXpertQA is harder and more reasoning-intensive, so extra inference budget appears to help more there than on exam-style MCQs.

Across the overlap set, High Thinking improved 166 questions but flipped 182 previously correct answers into incorrect ones, explaining the net regression. Many of these looked like classic overthinking on structured medical multiple-choice items: the non-thinking run selected the correct answer directly, while High Thinking often chose a plausible distractor after longer deliberation.

Our main takeaway: Nemotron Super’s High Thinking mode should not be treated as a universal default.

In this experiment, it looked more like a specialized mode for harder expert synthesis than a general-purpose accuracy booster. For structured medical multiple-choice tasks, Thinking Off was both faster and more accurate. For harder expert-level reasoning tasks, especially those closer to MedXpertQA, additional reasoning showed some benefit.

The practical implication is that the reasoning depth should likely be routed by task type rather than enabled globally.

We used the @baseten Model API for these runs, and we’re grateful for their support from day one.

We’re also thankful to @NVIDIAAI for its commitment to open source. As a research team that transitioned fully to open-source models this year, we deeply appreciate this level of openness, weights, data, and recipes.

We also expect this model to be especially strong for orchestration and agent-style tasks, which is an area we’re excited to explore further.
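For readers who want to check the arithmetic behind the figures quoted in the post above, here is a minimal Python sketch. It only recomputes differences and ratios from the numbers as stated in the post; the variable names are illustrative and are not taken from the authors' evaluation code, which is not shown.

```python
# Figures are copied from the post; all names here are illustrative assumptions.
improved = 166    # questions High Thinking answered correctly that Thinking Off missed
regressed = 182   # previously correct answers High Thinking flipped to incorrect
net_flips = improved - regressed  # negative means a net regression

# benchmark: (Thinking Off accuracy %, High Thinking accuracy %)
accuracy = {
    "MedMCQA": (56.6, 49.1),
    "MedCaseReasoning": (24.4, 20.2),
    "MedXpertQA": (9.2, 15.0),
}
# Change in percentage points when High Thinking is enabled
deltas = {name: round(high - off, 1) for name, (off, high) in accuracy.items()}

# Cost multipliers for High Thinking relative to Thinking Off
latency_ratio = round(4.43 / 1.13, 2)   # mean latency, seconds
token_ratio = round(1089 / 109, 2)      # mean completion length, tokens

print(net_flips, deltas, latency_ratio, token_ratio)
```

On these numbers, High Thinking nets 16 fewer correct answers on the overlap set while costing roughly four times the latency and ten times the completion tokens, which is consistent with the post's conclusion that the cost gap outweighs the accuracy gap.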

6

30

4

13

4K

Lousy Scout Trooper @rwohleb

10 months ago

@EastEndJoe If I didn’t know any better I might offer her an EpiPen. 😅

0

3

Lousy Scout Trooper @rwohleb

12 months ago

@mosamour @soychotic @JeremyNguyenPhD I was just about to post the same thing. We do love a pretzel pose.

0

63

Lousy Scout Trooper @rwohleb

over 1 year ago

@TracketPacer It appears the X AD service has thoughts on the subject. 😂

0

94

rwohleb retweeted

Scientific American

@sciam

almost 2 years ago

For only the second time in our 179-year history, the editors of Scientific American are endorsing a candidate for president. That person is @KamalaHarris. | Editorial https://t.co/dOsFW8BQCn

15K

48K

16K

2K

11M

Lousy Scout Trooper @rwohleb

almost 2 years ago

@NotAttained @willc @urlichsanais @DrinkerOfTears @PaloAltoNtwks It’s two women in slinky cocktail dresses. If they were wearing work attire? Maybe. If it was one guy and one gal in roughly equivalent attire? Maybe. If it was two dudes in something slinky? How many “it’s just a pun” people would be irritated? Not rocket science.

0

6

0

548

Lousy Scout Trooper @rwohleb

almost 2 years ago

@George_Kurtz *cough* canary deployments. How are you not doing that?

0

4

Lousy Scout Trooper @rwohleb

about 2 years ago

@bestofnextdoor You gotta pay those organ utility bills on time. Big Organ gives you zero grace time.

0

1

0

194

Lousy Scout Trooper @rwohleb

about 2 years ago

As a company, why on earth would you risk antagonizing a large group of potential customers?!

0

38

Lousy Scout Trooper @rwohleb

over 2 years ago

@jab A decent analysis. Would have liked to see their opinion of the size of the effects of “greedflation” and the supply chain interruption lag that still continues in some sectors. Consumerism outpaced these other economic forces, but by how much?

1

0

20

Lousy Scout Trooper @rwohleb

over 2 years ago

These need to be turned into nail polish colors.

vx-underground

@vxunderground

over 2 years ago

We've updated the malware family collection

- AtlasAgent
- BumbleBeeLoader
- ChargeWeapon
- DangerAds
- DBatLoader
- DinodasRAT
- DreamLand
- EasyStealer
- GOLDBACKDOORDropper
- HyperBro
- RevengeRAT
- RhadamanthysLoader
- ShadowPad
- Stealc
- WannaCry

https://t.co/W6czPhOcHt https://t.co/PWSJy8MIBL

3

157

24

18

25K

0

31

rwohleb retweeted

Dr. Tech

@doctechmd

over 2 years ago

I don't have an engineering degree, but this looks like a data leak

436

31K

5K

1K

3M

rwohleb retweeted

sergii

@SergiiKirianov

over 2 years ago

POV: Senior Dev reviewing your code

32

2K

194

143

109K

Lousy Scout Trooper @rwohleb

almost 3 years ago

@ITSourceress @Cthulhu_Answers Even after years I still bristle when a higher up phrases things a certain way. It sticks with you. Hard not to remain guarded in even the regular 1:1 with a manager.

0

1

0

10

Lousy Scout Trooper @rwohleb

almost 3 years ago

@ITSourceress @Cthulhu_Answers Still releasing tension from my last layoff. Heads were on the chopping block for months and empty CTO position. Toxic AF. Layoff before that they turned off my admin access the night before while I was working late. Didn’t sleep much that night. Fuels imposter syndrome for sure.

1

0

55

Lousy Scout Trooper @rwohleb

almost 3 years ago

@marileezafari @ITSourceress That’s F’ed up. There is a special place in a very hot place for people who would do that to another. I wish you well in navigating it all.

0

2

0

18

Lousy Scout Trooper @rwohleb

almost 3 years ago

@Creech Software, since I’ve been doing it professionally for 20+ years. Otherwise, maybe electronics? Some professionally, but mostly as a hobby, so I suppose it might add up to 10k+ hours. 🤔

0

3
