@BretBielema @Nike @JakeRosch Please don’t shelve these! They need to come back periodically, maybe as a Homecoming tradition? Too good to only flash once.
@jackwshepherd Also, many applications fit within an ecosystem / aren’t on an island. Understanding data ingress and egress isn’t easy and requires a lot of context; it isn’t just a coding gap.
The claim that GPT-4 performed at the 90th percentile on the Uniform Bar Exam appears misleading according to this new preprint.
It seems more accurate to say it performs in the 63rd or 68th percentile.
Some light discussion of model performance & evaluation in an AI session. Honestly, market needs a whole dedicated session on that topic. Not easy when dealing w/Generative, unfortunately. Isn’t the fun stuff, but thinking needs to evolve on that in Legal. #ILTACON2023
Speaker in the data science session just covered context windows and document segmentation. Really good to see. Need more of the sharp edges in GenAI pointed out to the industry, not just hocus pocus. #ILTACON2023
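For anyone who hasn't seen document segmentation up close, here is a minimal sketch of the idea, not what the speaker presented. It assumes whitespace-separated words as a rough stand-in for model tokens, and the `segment` function and its `window` / `overlap` parameters are hypothetical names chosen for illustration.

```python
def segment(words, window, overlap):
    """Split a list of words into chunks of at most `window` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears whole in at least one neighbour.
    Assumption: words approximate tokens; a real tokenizer counts differently.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    step = window - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + window])
        if start + window >= len(words):
            break  # the last chunk already reaches the end of the document
    return chunks


document = "one two three four five six seven eight nine ten".split()
for chunk in segment(document, window=4, overlap=1):
    print(" ".join(chunk))
```

The sharp edge: anything longer than the window has to be cut somewhere, and where you cut changes what the model can see at once.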
Retrieval Augmented Generation (RAG) isn’t a “domain specific LLM.” It’s a usage pattern around an LLM, injecting content into the prompts. The difference here matters a bit. Legal industry needs to unpack RAG a bit more. #ILTACON23
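To make the "usage pattern" point concrete, here is a toy sketch under stated assumptions. Retrieval is plain word overlap (real systems typically use embeddings or a search index), no model is called, and `tokenize`, `retrieve`, and `build_prompt` are hypothetical helper names. The point is only that the model stays the same and the retrieved text goes into the prompt.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, k=2):
    """Return the k passages sharing the most words with the question.

    Assumption: word overlap stands in for a real retriever.
    """
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(question, passages, k=2):
    """Inject the retrieved passages into the prompt text sent to an LLM."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


passages = [
    "The lease term is five years.",
    "Payment is due in thirty days.",
    "The office has a kitchen.",
]
print(build_prompt("What is the lease term?", passages, k=1))
```

Nothing here fine-tunes or specializes the model; swap the passages and the same LLM "knows" a different domain.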
Really solid analysis evaluating performance of Generative AI in our industry. We need more of this in Legal. Requires cross-discip. perspective, as displayed here. Can't rely on product companies and those selling services; need a more objective lens.👏
https://t.co/PfhQ8pJ4qH
Where would we be if 20% of the time spent thinking about what the future could look like because of Generative AI was spent rigorously testing use cases and validating approaches? Collective "Legal Science Fiction" skills are fairly refined, "Legal Science" interest fairly low.
@LeanLawStrategy Line of thinking here is right. Not dealing in magic; treat as science. Research / validation of the application of Gen AI to specific NLP tasks is hugely important. Fit will vary, and we need to ground that in better ways. Big knowledge gaps to close.
@inspiredcat That particular example is beyond the pale. But our industry is way behind in formal evaluation of information retrieval (and other language tasks) via Generative AI. “Use correctly” is a tougher challenge than most acknowledge, given nature of tech (e.g., context window limits).
Regardless, this highlights the need for a healthy ecosystem of researchers checking each others' work and rerunning evaluations. Tests need to be run and/or checked by multiple different teams.
@marclauritsen @Nicola_Shaver And sitting on calls with you and Ron Staudt while I was in law school reshaped my world. And that of others. The through-lines are pretty cool when you zoom out.
I read these “law firms are sitting on so much data” threads & wonder: Have you been in a law firm? Have you run queries against common data sources directly? Ever try mining the text sources you portray as easy pickings? Know what that even entails? “No” is too often the answer.