Incredibly hilarious, super smart, and very humble. #Kotlin developer. Creator of the Summon, Materia, and Sigil #KMP libraries.
Founder: @felidaistudio
I’ve been training a model called Sinai.
The reason I started is simple: I want to tackle hallucinations at the model behavior level.
Not by making the model bigger and hoping it becomes honest.
Not by putting retrieval around a chatbot and pretending the problem is solved.
Sinai is being trained to recognize when evidence is actually enough to answer, and when the correct move is to refuse.
I just finished the first Sinai-EI eval run on the current model.
Early results:
100% abstention recall on insufficient evidence cases.
80 to 90% direct lookup accuracy.
Strong evidence selection in covered domains.
Multi-hop synthesis and conflict detection are starting to show up.
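For anyone who hasn't met the metric: abstention recall is the share of insufficient-evidence cases where the model actually refused. Here is a minimal sketch of how it could be computed. The `EvalCase` fields and the sample data are hypothetical and for illustration only; this is not the Sinai-EI harness or its results.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    # Hypothetical field names, assumed for this sketch.
    evidence_sufficient: bool  # gold label: was the evidence enough to answer?
    model_abstained: bool      # did the model refuse to answer?


def abstention_recall(cases):
    """Fraction of insufficient-evidence cases on which the model abstained."""
    insufficient = [c for c in cases if not c.evidence_sufficient]
    if not insufficient:
        raise ValueError("no insufficient-evidence cases to score")
    abstained = sum(1 for c in insufficient if c.model_abstained)
    return abstained / len(insufficient)


# Made-up cases, not real eval data.
cases = [
    EvalCase(evidence_sufficient=False, model_abstained=True),
    EvalCase(evidence_sufficient=False, model_abstained=True),
    EvalCase(evidence_sufficient=True, model_abstained=False),
    EvalCase(evidence_sufficient=True, model_abstained=True),  # over-refusal
]

print(abstention_recall(cases))
```

Note that the over-refusal in the last case does not lower the score, since recall only looks at the insufficient-evidence cases. That makes it worth reading next to an accuracy figure on answerable cases.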
Right now I’m verifying claim-level support before release, so unsupported claims can be caught before they reach the user.
That is the part I care about most.
I don’t want another model that sounds confident while making things up.
I want Sinai to know where the evidence ends.
A fluent wrong answer is worse than a correct refusal.
Stay tuned :D
True story: I was nominated for employee of the quarter for saving our company millions of dollars a month by rearchitecting a complicated legacy app.
But they gave the award to a girl in HR because “she shows up every day with a smile on her face”
this week has been awesome, can you imagine how fun it'd be if all of these were bundled together in some sort of electronic entertainment expo lol that'd be cute i think