@tommy_bennett There are many more typos and wrong citations my code picked up (11 problematic cases before I stopped my automatic checking). I don't have time to check them manually because I have nothing to do with the case. But those on the case will have to.
@IanCutress Yes. As a concept, it is an obvious future direction, but it has proven to be very hard. They claim to have it in production and say they will release chips this year, with specific compute and energy efficiency numbers that can then be verified. If true, a new scaling law follows.
@fangshimin Even more complexity comes from how universities fill out the survey form differently. To know how much a university spends on research, one would have to look at its audited financial reports. The survey data could be more than 50% off, as in the case of JHU.
@fangshimin Maybe one way of accounting would include it, but if you do so, then you would have to add Lincoln Lab to MIT and SLAC to Stanford. If you remove all national labs, they are basically the same.
@fangshimin NSF survey data should not be used for rankings because they do not reflect how institutions actually spend on research. For example, the Johns Hopkins figure includes the Applied Physics Laboratory, a $2.5 billion defense research operation located on a separate campus.
@ladymissazira It is much harder to review AI-generated docs: because they seem polished and legit, it is easy to miss errors. Also, past experience in spotting typical associate errors is not useful.
@ladymissazira Tried a legal memo with it, then asked ChatGPT to find hallucinations in it. Sure enough, it found "one clear statutory error" in addition to some minor citation errors. But we can't trust ChatGPT either, so here comes the painful manual review.
@gordon_cassie Very true. It goes from the excitement of Claude Code handling an entire test project to fixing errors all the time and waiting on compaction. Same for legal work: a memo can look polished, but if the case analysis is wrong, correcting it is painful. Still very useful but not AGI.
@CuriousLuke93x @MaxJunestrand I am running my own local AI. No network connection. It works great. A few simple wrappers make it a bit easier, but they are not critical. Most work can be done by bare-metal local inference and local search.
@th_s4m0ht What is most needed is a hallucination insurance policy from legal tech companies. Malpractice insurance will probably explicitly exclude AI use. Law firms should never adopt any tech that does not come with such a warranty.
@matt_ambrogi @harvey In view of the number of cases where attorneys are facing sanctions, it seems to me that one benchmark matters above all others: the % of documents containing AI hallucinations. To emphasize this, other benchmarks should count very little.
@GavinSBaker TSMC's US expansion hinges on the Arizona experiment. Its success has been built on Taiwan's culture and ecosystem; scaling in the US will require a new model adapted to a different environment.
@dan_biderman Legal tech's fundamental problem is AI's fundamental problem. Task-specific patches help at the margins, but hallucination and unreliability can't be fixed downstream. It needs breakthroughs at the foundation model level.
@MarioNawfal Careful reposting. UCLA didn't revoke his degree, he's now at Columbia for his Master's, and AI use was allowed on that project per his LinkedIn. Colleges need to catch up, there are ways to permit AI and still ensure real learning and rigor.
@stokebuilder By legal community, I assume you mean lawyers and judges. They love open source. A few legal tech companies may not like it, but who cares. Open source may be a necessity for the legal field.