@GavinRayDev @itsEmZee_ Hey Gavin, I’ve made some changes which I think may fix it. Can I run your VCF to verify? I’ve requested access to the file on Google Drive.
@GavinRayDev @itsEmZee_ I’m glad! Do you mind sending me the exact schema errors or the Codex trace you saw? There are a lot of moving parts, and I must’ve missed one somewhere!
Introducing Genomi: an open-source agent harness that turns your AI agent into your personal DNA expert.
I took a DNA test years ago. Like a lot of people, I got the report, found something interesting, and forgot about it.
Recently I gave the data to my Codex agent, and it was obvious how incredibly useful DNA is for personal health, but:
> General AI can sound right while being wrong
> Static DNA reports can’t keep up with new science
> DNA data should stay on your local device, not uploaded to a website
So we built Genomi: local-first, agent-native, self-evolving, evidence-grounded.
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor.
It’s happening faster than we thought, and the implications deserve greater attention. https://t.co/OVVPJO7VQx
Introducing Agent Arena: real-world agentic evals at scale.
How do you evaluate agents doing actual work? We measure millions of live sessions where real users accomplish real tasks.
On Arena, models now get web search, filesystem, and terminal tools to complete complex workflows: writing code, creating slide decks, researching the web, building apps, and analyzing documents.
Every session produces rich signals. Users iterate with the agent turn-by-turn: approving, editing, correcting, praising, or expressing frustration. The environment gives feedback too: shell errors, tool failures, recovery attempts, and more.
Our leaderboard measures each model's agentic performance using causal inference across five signals: task success, steerability, error recovery, user praise vs. complaint, and tool hallucination.
This leaderboard snapshot is built from 300K+ tasks, 2M+ tool calls, and 40M lines of code written by agents.
Top labs in Agent Arena:
- #1 @OpenAI: GPT-5.5 (High)
- #2 @AnthropicAI: Claude-Opus-4.7 (Thinking)
- #3 @Zai_org: GLM-5.1
- #4 @GoogleDeepMind: Gemini-3.1-Pro
- #5 @Kimi_Moonshot: Kimi-K2.6
More analysis in the thread, with the full technical blog below.
@YashHustle_22 Not something I can do that Codex can’t, but something Codex couldn’t do at all until I made it able to:
https://t.co/z2YlTBRMng
Hi. Over the last 24 hours we had three separate small incidents that affected Codex reliability. Those are three too many, and we are taking active steps to keep them from recurring.
I have reset usage limits for Codex across all paid plans. May the tokens flow again.
@gptsiolis @itsEmZee_ We suspect the problem came from an AGI schema change in which a 23andMe path got left out, causing a mismatch. I tried it on some 23andMe files and they seem to work.
Genomi is officially on @ProductHunt
Genomi transforms your massive raw DNA data into a contextually manageable, queryable Active Genome Index that you can trust.
https://t.co/qSvywCVQxy
Building autonomous agents for scientific discovery? 🧬🤖
@GoogleDeepMind Science Skills is now available on GitHub. We've open-sourced this specialized toolkit to accelerate your agentic workflows with scientific grounding and higher token efficiency.
Download now ↓
https://t.co/cwp1HOeKvo