Celebrating 100+ users on https://t.co/qyoh6dMxGk!
We're grateful to every team, freelancer, and startup that trusted us to improve productivity and simplify their workflow.
And yes, it's now FREE for 1 year!
Let's keep growing together.
https://t.co/ZcoFea5qDZ | +91 8141067657
#TrakkarIN #MilestoneCelebration #100Users #TimeTrackingIndia #ProductivitySaaS #WorkSmarter #StartupIndia #RemoteTeamSolutions #MadeInIndia #GrowthTogether
I'm working on an awesome feature you don't want to miss. Maybe it's time to change your project management tool.
Stay tuned. Something big is coming soon!
Demo video soon!
Practical implication: if you're building agentic systems, the model's internal signals at specific token positions may be a far better hallucination detector than asking "are you sure?" or reading log-probs.
Paper: https://t.co/PIApyFcTAq
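To make "internal signals as a detector" concrete, here is a minimal sketch of a linear probe, not the paper's method. Everything in it is an assumption for illustration: the vectors are synthetic stand-ins for activations cached at one token position, the "wrong answer" labels are simulated, and the names make_example, train_probe, and predict are hypothetical. In a real system you would swap in hidden states captured from your own model.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_example(rng, dim=8, shift=2.0):
    # ASSUMPTION: synthetic stand-in for one cached activation vector.
    # Gaussian noise everywhere; coordinate 0 is shifted by the label.
    wrong = 1 if rng.random() < 0.5 else 0
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    x[0] += shift if wrong else -shift
    return x, wrong


def train_probe(X, y, lr=0.1, epochs=200):
    # Logistic regression fitted with plain batch gradient descent.
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, t in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # 1 means the probe flags the answer as likely wrong.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0


rng = random.Random(0)
train = [make_example(rng) for _ in range(400)]
held_out = [make_example(rng) for _ in range(200)]

w, b = train_probe([x for x, _ in train], [t for _, t in train])
accuracy = sum(predict(w, b, x) == t for x, t in held_out) / len(held_out)
print(f"held-out probe accuracy: {accuracy:.2f}")
```

The probe only works here because the synthetic data was built with a linearly separable signal. Whether real activations carry one at a given position is exactly what the paper tests.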
Replicated across Gemma 3 27B and Qwen 2.5 7B, on TriviaQA and MNLI.
So this isn't a quirk of one model; it looks like a general property of how transformers organize self-evaluation.
The wildest part: causal interventions show the PANL signal can rescue error detection even when the answer info is corrupted.
The signal isn't just "this is probably wrong"; it encodes whether the model has the knowledge to fix it.
They tested it with a verify-then-correct paradigm. Three results:
1. Verbal confidence predicts errors far better than log-probs
2. PANL activations predict errors better than verbal confidence itself
3. PANL predicts which errors the model can actually fix
Their finding: there's a specific token, the newline right after the answer (they call it PANL: post-answer newline), where the model caches a confidence representation.
This signal causally drives verbal confidence and dissociates from token log-probabilities.
The paper (Kumaran, Patraucean, Osindero, Velickovic, Daw; arXiv:2604.22271) borrows from decision neuroscience.
Humans use a "second-order" model: a separate evaluative signal that can disagree with the committed response.
Turns out LLMs do something similar.
The puzzle: LLMs sometimes detect and correct their own errors with no external feedback. But how?
In a "first-order" system, confidence comes from the generation itself, so the chosen answer should always look maximally confident. Error detection shouldn't be possible.
A new paper from DeepMind researchers cracked open something fascinating: LLMs have an internal confidence signal that predicts whether they're wrong, and whether they can fix it, even when their own words say otherwise.
Here's what the research found:
Building SaaS in India full time for the last few years and this is real. The dev exposure problem is bigger than people think. Most BTech grads I meet have never shipped anything to a real user, AI or otherwise. The gap isn't technical, it's that nobody teaches them building is the easy part.
@uday_devops Appreciate it Uday. If you do try it, hit me with the harshest feedback you've got. Free users who actually care enough to roast it are how this gets better.
@deep3labs GM. Building https://t.co/MfMvuXIJjI, time tracking and team productivity for small teams. Less AI hype, more boring stuff that actually works. 100+ teams, free for now, paid plans soon. https://t.co/hB1dsn4XU3
@anupamrjp Raw version: small teams waste hours every week guessing where time went, and existing tools are either bloated enterprise software or a spreadsheet. Building https://t.co/MfMvuXIJjI to be the boring, fast middle ground. https://t.co/hB1dsn4XU3
@RoundtableSpace https://t.co/MfMvuXIJjI. Time tracking and team productivity for small teams.
100+ teams.
0 paying.
Free for a year was my call. Now shipping the paid plans and finding out who actually values it. https://t.co/hB1dsn4XU3
@AIsaOneHQ Working on the paid plans for https://t.co/MfMvuXIJjI. No grass touching this weekend. 100+ free teams need to become paying customers and that math doesn't solve itself. https://t.co/hB1dsn4XU3