@mayukh_panja Yes, it's so closed that almost all ground-level discovery comes out of academia, to be used "free" of cost.
Also, there are no restrictions on putting your paper up on GitHub or a blog for the world to see and learn from.
7,000 false positives per square millimeter. The culprit was the lab gloves.
University of Michigan researchers just upended a core assumption in microplastics science. Latex and nitrile gloves, worn by the scientists doing the measuring, shed stearate particles that look chemically identical to polyethylene. Standard infrared and Raman instruments can't tell them apart. The gloves were counting as plastic.
Seven glove types tested. All contaminated. The cheapest fix: switch to cleanroom gloves, which dropped false positives to around 100 per mm² vs. 7,000.
The "credit card per week" headline (5 grams, WWF/Newcastle 2019) has separate problems. A 2022 re-analysis found severe methodological errors in the original estimate. Actual measured intake is likely 100x lower.
None of this means microplastics are harmless. Last month's data on brain accumulation still stands. But the numbers driving the panic may have been measuring the scientists, not the environment.
Science catching its own errors is exactly how it's supposed to work.
Hallucinated citations are highest on a social sciences preprint site - but, as I point out, part of the difference could reflect differences in moderation and publication stage, rather than something inherent to the social sciences themselves.
After seeing that Claude Mythos marketing turned out to be, as expected, a scam, I wanted to make a master list of tricks being used to market LLMs.
The master list includes statements made directly by company leadership or through the "organic marketing" of people on social media, along with an explanation of how each scam works. This is my first attempt, so it is likely incomplete.
The LLM Marketing Scams Master List v1:
"Two more weeks" - the models will be good enough someday soon to do what we claim.
"They're already good enough" - the models are already good enough to replace workers, but it hasn't happened yet because of x y z reasons.
"We just built God in the backroom, and no, you can't see it" - the models they built in private are actually capable of doing the things we have been waiting for, but they can't let us see them yet for x y z reasons.
"Actually they already have replaced jobs" - claims that the workers tech companies have laid off, citing AI as the reason, have already been replaced by current LLM tech, ignoring market conditions and past data on layoffs during such conditions.
"You just don't know how to use them as well as me" - the models are good enough, but esoteric prompt engineering is required to get these results, and no, I won't teach you.
"I built an app making big money with LLMs" - they claim they already have made startup companies, almost always SaaS companies, that are making them tons of money, but when you ask to see them, they won't show you.
"You aren't using the right model" - claims that you must be using the wrong model and need to use Open Claude 420b-parameter Gemini Plus Pro 6.9 with 4RealThisTime HomerSimpson agent mode enabled. Note that this will be used to attack every study on the effectiveness of LLMs, since studies take time to complete and publish, and new models are released faster than a study can be completed and published.
"You're falling behind" - claims that you need to use the bots now, even though they aren't good enough to fully automate any jobs, because otherwise, when the bots are good enough, you will lose your natural English skills required to prompt effectively.
"All these companies are using LLMs, so do you think you know better than they do?" - pointing to claims from large companies, all deeply invested in LLMs succeeding, that LLMs are being used effectively, despite no visible improvement in the speed and/or quality of those companies' output.
"The benchmark score went up" - claiming improvements on the benchmarking tests given to their latest model, despite the training being specifically tuned to improve on these tests, and then conflating better benchmark scores with actually being more able to automate jobs or drastically improve worker productivity.
"It can now count the letters in Strawberry/can now do things it famously couldn't do previously" - saying that it can now count the letters in Strawberry or instruct you on how to use a cup without a bottom, etc. is often done to suggest increased reasoning for the LLM, but often involves just hard coding an answer into the service.
"It has escaped our control" - saying that they cannot control the LLM, implying it is conscious or living to some degree, when really it just said words that it wasn't supposed to, or an agent used an app in a way the user's prompt didn't intend while next-token predicting.
"It's feeling sad/scared/happy/angry, suggesting it is conscious" - they ask the LLM how it is feeling, and it next-token predicts a response that includes an emotion felt by humans, since training data is from human conversations online.
"Costs are going down/the LLM service is profitable" - ignores training costs and capex for hardware, usually just referring to inference being profitable, which isn't even true in many cases. Training and capex are 95%+ of the total costs to serve the models.
Did I miss any?
ENG vs IND. Extreme game, never seen such a game where each and every ball mattered. It's a shame that Jacob Bethell was not the man of the match. The undefeated (run-out doesn't count) run machine.
@ismisemichelle_ This bus service actually goes to Dublin port, then you take the ferry to the UK. Then on the other side the Go-Ahead bus takes you to Palermo.