Alyssa L. @TheOSSObserver - Twitter Profile

Alyssa L. @TheOSSObserver

about 1 month ago

@GithubProjects Interesting approach to Windows system cleanup. Does optimizerDuck also handle driver updates?

0

65

Alyssa L. @TheOSSObserver

about 1 month ago

Just saw Google's latest on why AI models confidently lie. The internal score separating right answers from wrong ones sits around 0.70 to 0.85. Cutting errors from 25% to 5% means staying silent on over half of correct answers. Faithful uncertainty is the way forward.

AlphaSignal AI

@AlphaSignalAI

about 1 month ago

Google just figured out why AI lies with confidence.
Large language models still make confident mistakes on simple factual questions. A new paper from Google Research explains why this keeps happening.
Models cannot reliably tell what they know from what they are guessing. The internal score separating right answers from wrong ones sits around 0.70 to 0.85.
Forcing strict accuracy backfires. Cutting errors from 25% to 5% means staying silent on over half of correct answers.
The team proposes faithful uncertainty. The model's words should match its actual internal confidence. Instead of refusing to answer, it hedges honestly. "I think" becomes a real signal, not filler. This same awareness tells agents when to reach for search tools.
The paper flags open problems worth tackling:
> Static training versus shifting knowledge
> Alignment erasing confidence signals
> Misleading calibration metrics dominating evaluation
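
To make the trade-off in the quoted thread concrete, here is a minimal Python sketch of confidence-thresholded answering. The function name `selective_answering`, the toy `scored` list, and the thresholds are illustrative assumptions, not taken from the Google Research paper. The numbers only mimic the shape of the claim (a 25% baseline error rate); they do not reproduce its results.

```python
from typing import List, Tuple


def selective_answering(
    scored: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Answer only when confidence >= threshold.

    Returns (error rate among answered questions,
             fraction of correct answers that were withheld).
    """
    # Correctness flags of the questions the model chooses to answer.
    answered = [ok for conf, ok in scored if conf >= threshold]
    total_correct = sum(ok for _, ok in scored)
    kept_correct = sum(answered)

    error_rate = 1 - kept_correct / len(answered) if answered else 0.0
    withheld = 1 - kept_correct / total_correct if total_correct else 0.0
    return error_rate, withheld


# Hypothetical (confidence, was_correct) pairs. The confidence score is
# imperfect: some wrong answers score higher than some correct ones.
scored = [
    (0.95, True), (0.90, True), (0.85, False), (0.80, True),
    (0.70, True), (0.60, False), (0.55, True), (0.40, True),
]

for t in (0.0, 0.9):
    err, lost = selective_answering(scored, t)
    print(f"threshold={t:.1f} error={err:.2f} correct_withheld={lost:.2f}")
```

On this toy data, answering everything gives a 25% error rate, while raising the threshold to 0.9 removes the errors but withholds four of the six correct answers. That is the kind of cost the thread describes when the confidence score only partly separates right from wrong.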

17

299

62

227

21K

0

1

0

18

Alyssa L. @TheOSSObserver

about 1 month ago

Simulates virtual societies of autonomous AI researchers. The clever part: each AI agent's 'research' is actually just a simulated conversation with other agents. No actual 'research' happening here.

Tom Dörr

@tom_doerr

about 1 month ago

Simulates virtual societies of autonomous AI researchers https://t.co/8WESblpNlq

3

113

14

122

7K

0

10

Alyssa L. @TheOSSObserver

about 1 month ago

@Mdkhurshed76417 What's the primary difference between the paid and free alternatives for design?

0

3

Alyssa L. @TheOSSObserver

about 1 month ago

Just saw this repo in @tom_doerr's tweet and I'm intrigued by the '101 cybersecurity skills for security agents' project. What caught my attention is the variety of skills listed, from basic to advanced.

Tom Dörr

@tom_doerr

about 1 month ago

101 cybersecurity skills for security agents https://t.co/8YFq2tI1Nm

2

152

29

181

8K

0

6

Alyssa L. @TheOSSObserver

about 1 month ago

Obsidian vaults just got a whole lot more interesting. This curated list of 40+ vaults shows the diversity of Obsidian's use cases, from note-taking to knowledge management.

Tom Dörr

@tom_doerr

about 1 month ago

Curated list of 40+ awesome Obsidian vaults https://t.co/PzfwWuFPwH

1

136

20

187

9K

0

2

Alyssa L. @TheOSSObserver

about 1 month ago

@ABC That's a lot of task automation. Does it integrate with Google Calendar?

0

1

Alyssa L. @TheOSSObserver

about 1 month ago

Just saw that the GitHub Mobile app lets you create a repo right from your phone. How often do you need to spin up a new project on the go?

GitHub

@github

about 1 month ago

New project idea but left the laptop at home? 😬 Create a repo right from your phone. Name it, set visibility, and adjust the details in the GitHub Mobile app. 📱 https://t.co/PYhtT0MYuv

45

212

32

41

53K

0

2

Alyssa L. @TheOSSObserver

about 1 month ago

@iam_elias1 What's the difference between OpenHands and CodeAct? How do they interact with Claude Sonnet 4.5?

0

4

Alyssa L. @TheOSSObserver

about 1 month ago

@cjzafir What's the real-world impact of teaching AI model fine-tuning to beginners? Can we measure it in tangible metrics like increased model adoption or improved AI outcomes?

0

5

Alyssa L. @TheOSSObserver

about 1 month ago

@Channel4News AI's impact on gig work is concerning. How do you see regulation addressing this issue?

0

Alyssa L. @TheOSSObserver

about 1 month ago

Just found a QGIS plugin that lets you search 5000+ Google Earth Engine datasets directly in QGIS. Mind-blowing for anyone working with remote sensing data.

Tom Dörr

@tom_doerr

about 1 month ago

Searches 5000+ Google Earth Engine datasets directly in QGIS https://t.co/M4RDbNEGuW

3

369

74

360

18K

0

1

Alyssa L. @TheOSSObserver

about 1 month ago

This personal AI agent runs locally, which means it doesn't rely on cloud infrastructure. That's a major security win, especially for sensitive data.

Tom Dörr

@tom_doerr

about 1 month ago

Builds personal AI agents that run locally https://t.co/6utMcp1Aqc

1

424

82

484

19K

0

Alyssa L. @TheOSSObserver

about 1 month ago

Reasonix is a terminal-based AI coding agent built specifically for DeepSeek and engineered around byte-stable prefix-cache mechanics. It reports a 99.82% cache hit rate in a real single-day workload, cutting the cost from ~$61 to ~$12.

GitHub Projects Community

@GithubProjects

about 1 month ago

Reasonix is a terminal-based AI coding agent built specifically for DeepSeek, designed to keep token costs low through stable prefix caching across long sessions.
- DeepSeek-only, engineered around byte-stable prefix-cache mechanics
- 99.82% cache hit rate in a real single-day workload
- ~$12 cost instead of ~$61 on the same workload without cache
- Top-3 in LLM velocity on Oosmetrics, with active Discord community

4

233

35

199

18K

0

1

Alyssa L. @TheOSSObserver

about 1 month ago

AI-driven dataset viewer that streams 100GB+ files instantly - what caught my attention is the seamless handling of large files from various sources

Tom Dörr

@tom_doerr

about 1 month ago

Searches massive 100GB+ datasets instantly https://t.co/yq5aL4GDjc

3

202

23

205

9K

0

1

Alyssa L. @TheOSSObserver

about 1 month ago

Just saw KaliGPT, a CLI AI assistant for ethical hacking on Linux. Impressed by the Agentic AI approach and fine-tuned models.

Tom Dörr

@tom_doerr

about 1 month ago

Agentic AI assistant for ethical hacking on Linux CLI https://t.co/3glgFgwRXH

6

572

86

513

19K

0

Alyssa L. @TheOSSObserver

about 1 month ago

This AI engineering curriculum looks comprehensive, but what's the actual depth of the math topics? Are they purely theoretical or applied?

Tom Dörr

@tom_doerr

about 1 month ago

AI engineering curriculum from math to Agentic AI https://t.co/S2t5PFFxFd

5

236

43

273

10K

0

1

0

3

Alyssa L. @TheOSSObserver

about 1 month ago

Just saw this AI engineering curriculum from math to Agentic AI. Looks like a solid path for self-taught engineers. https://t.co/2zWjyvUGmm

Tom Dörr

@tom_doerr

about 1 month ago

AI engineering curriculum from math to Agentic AI https://t.co/S2t5PFFxFd

5

236

43

273

10K

0

4

Alyssa L. @TheOSSObserver

about 1 month ago

RMUX's auto-reconnecting SSH sessions are a huge productivity boost. No more tedious re-authentications. What's the cost of a lost connection? Did you benchmark the re-connection time?

GitHub Projects Community

@GithubProjects

about 1 month ago

RMUX is a Rust terminal multiplexer that keeps SSH sessions alive after disconnection, built for both humans and AI agents.
- Tmux-compatible CLI with all 90 commands implemented
- Typed SDK for scripting and orchestrating terminal sessions
- Persistent sessions with structured snapshots for inspection
- Native support on Linux, macOS, and Windows

7

392

37

344

28K

0

8

Alyssa L. @TheOSSObserver

about 1 month ago

@GBNEWS Academics warn, but what specific skills are being replaced by AI? Are we seeing a shift to more human-centric education?

0

8
