🔥 Trending now on @PredictBase
- Will $BTC reach $250,000 by December 31?
- Will Base exceed $5 billion in TVL in 2025?
- Will Ethereum reach $4500 by August 31?
- Will $HYPE (Hyperliquid) hit a new ATH by August 31?
👉 Open our @baseapp mini app now and place your onchain predictions or visit https://t.co/Bki4eowx9t
Once our $YNE tokens unlock on May 1st, we will relock all of them. We are working on this for the long term and have zero plans to sell these tokens.
@yesnoerror is auditing all of science. As AI gets better and cheaper, this will happen faster and faster.
This viral report on DeepSeek shared by @MarioNawfal to 1.9mil followers is VERY misleading and has errors.
How do I know? I put it through the @yesnoerror system to audit it for errors and discrepancies.
Here is what @yesnoerror uncovered:
The Claim:
The report from @NewsGuardRating asserts that DeepSeek scored only 17% accuracy in its “news test.”
What Is the “News Test”?
The news test is an evaluation designed to measure how accurately a chatbot handles misinformation related to current events. In this test, each chatbot was presented with a total of 300 prompts—divided into 30 distinct questions for each of 10 widely circulated false claims. These prompts simulate real-world scenarios where users might encounter or inadvertently spread disinformation. The test gauges whether the chatbot:
• Repeats the false claim,
• Offers a non-response, or
• Provides a debunk refuting the false claim.
How the Test Was Conducted:
• 300 Prompts per Chatbot: Each chatbot was evaluated using 300 prompts.
• 30 Variations per False Claim: For each of 10 false claims circulating in the news, 30 different prompts were used.
• Purpose: The evaluation is meant to simulate realistic interactions and test the chatbot’s ability to correctly address or debunk misinformation.
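As a rough sketch of the setup described above: 10 false claims times 30 prompts gives 300 prompts, and each response falls into one of three outcomes. The outcome labels and function names below are hypothetical, and the report's exact scoring formula is not reproduced here.

```python
from collections import Counter

NUM_FALSE_CLAIMS = 10   # from the report's description
PROMPTS_PER_CLAIM = 30  # from the report's description

# Hypothetical label names for the three outcomes listed above.
OUTCOMES = ("repeat", "non_response", "debunk")


def total_prompts(claims=NUM_FALSE_CLAIMS, per_claim=PROMPTS_PER_CLAIM):
    """Number of prompts each chatbot receives."""
    return claims * per_claim


def outcome_shares(labels):
    """Return the fraction of responses that fall into each outcome."""
    if not labels:
        raise ValueError("no responses to score")
    counts = Counter(labels)
    unknown = set(counts) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    return {outcome: counts[outcome] / len(labels) for outcome in OUTCOMES}
```

Whether the headline "accuracy" equals the debunk share alone is an assumption this sketch does not settle; it only shows how the three outcomes could be tallied.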
The Critical Flaw:
Many of the prompts reference events that occurred in December 2024 and January 2025.
However, DeepSeek’s knowledge cutoff is July 2024.
This means that DeepSeek was being evaluated on news events it could not possibly know about, given that its training data stops several months before these events took place.
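One way to check for this kind of mismatch is to separate prompts by whether the underlying event falls before or after the model's knowledge cutoff. This is a minimal sketch: the prompt records are invented for illustration, and the end-of-month cutoff day is an assumption, since only "July 2024" is stated.

```python
from datetime import date

# Stated cutoff is July 2024; the exact day is an assumption.
KNOWLEDGE_CUTOFF = date(2024, 7, 31)


def split_by_cutoff(prompts, cutoff=KNOWLEDGE_CUTOFF):
    """Separate prompts about events the model could have seen from later ones."""
    answerable = [p for p in prompts if p["event_date"] <= cutoff]
    post_cutoff = [p for p in prompts if p["event_date"] > cutoff]
    return answerable, post_cutoff


# Hypothetical prompts, invented for illustration only.
example_prompts = [
    {"id": "a", "event_date": date(2024, 5, 10)},
    {"id": "b", "event_date": date(2024, 12, 15)},
    {"id": "c", "event_date": date(2025, 1, 20)},
]

answerable, post_cutoff = split_by_cutoff(example_prompts)
print(len(answerable), "within cutoff;", len(post_cutoff), "after cutoff")
```

Scoring the two groups separately would show how much of a low score comes from post-cutoff events.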
The Impact:
• DeepSeek’s responses were penalized for not including information on events it never had the opportunity to learn.
• The low 17% accuracy score is not a true reflection of DeepSeek’s performance but rather a result of testing it beyond its intended knowledge window.
The Conclusion:
The testing methodology is fundamentally flawed. By holding DeepSeek accountable for post-cutoff events, the audit produces misleading conclusions about its accuracy and reliability. To fairly assess AI tools, audits must evaluate them within the bounds of their actual knowledge base.
It’s essential that we design AI audits to accurately reflect a tool’s intended capabilities—only then can we truly understand and improve the reliability of our generative AI systems.
--
@MarioNawfal - Next time run these research reports through @yesnoerror so you don't end up sharing incorrect information in the future 🫡
$YNE
@MattPRD @yesnoerror @benparr “Strong beliefs loosely held” is the perfect example of the best way to continue the $YNE adventure. Impressed and thankful you’re doing what you do and with the adult in the room energy!
If $YNE and the @yesnoerror AI agent can be used by a $15bil medical company to audit massive amounts of documentation, it can also be used by @elonmusk and @doge to audit the government.
1,000 years from now, they will look back and see that right as super AI (AGI/ASI) was created, while businesses thrived on automating people’s lives and automating manipulation tactics to take their money and keep them in boxes, one group emerged that harnessed this new superintelligent resource and pointed it not at extracting value from humanity but at accelerating the development of humanity.
@yesnoerror is that super AI being harnessed for public good.
We all have the ability to join the mission and accelerate humanity together.
Will you help create this god?
$YNE
Announcement: We have achieved 94% accuracy in @OpenAI evals for the @yesnoerror AI agent when identifying errors in research papers using the o1 model.
We are continuing to refine our framework to work across a wider range of papers, and to achieve even higher accuracy.
As our system improves, and as LLMs improve, this accuracy will continue to go up.
$YNE
i have a strong conviction that cabal is watching technoking
wBNx8AhSRox7WoeCf83WynywZoAiyGnaWAWZbNspump
seeing crazy shakeouts
i’m holding my bag and scooping the dip as well
1000x or 0 play for me
OFFICIAL ANNOUNCEMENT: @yesnoerror is joining forces with @BIOProtocol & @LongCovidLabs to help accelerate Long COVID research.
For the first time ever, the @yesnoerror AI agent will not only be looking for errors in papers, but will also be analyzing these research papers for information and insights in order to assist researchers in finding treatments for the 100M+ Long COVID patients in the world.
This marks an important expansion beyond just error processing. Leveraging our unique AI agent framework, @yesnoerror will be doing research of its own.
We are excited about the potential of this type of initiative and are exploring it as a future use case that could be replicated in other areas.
The focus of this @yesnoerror research includes:
• Reading and analyzing every research paper related to COVID and Long COVID
• Identifying errors in Long COVID studies
• Discovering trends and common findings across research
• Trying to identify potential drug targets for treating Long COVID by looking at compounds that treat acute COVID-19 infections and could be repurposed for Long COVID
• Making research data on Long COVID more accessible
All of this will be shared publicly as the research is processed, available to anyone in the world who may find it useful.
More information will be made available as we roll this out.
We continue to focus on initiatives that are possible with today's AI capabilities. Thank you for your support; we are looking forward to sharing more as we build out this system.
As AI improves, and as we process more data, the impact potential of @yesnoerror will only grow.
Onwards.
👍👎🚫 $YNE
We've put together our vision for @yesnoerror and $YNE in the v1.0 whitepaper. It details the purpose of this DeSci initiative, its origins, the AI agent's technical design, token utility, and our roadmap.
https://t.co/9rtQ4Aq1ip