To those who wonder why a law firm like Kirkland might want to develop its own AI application and fine-tuned model, here is a post from Harvey’s co-founder that, ironically, makes one of the best arguments in favour that I have ever seen.
This was in response to @stokebuilder questioning whether Harvey has any “proprietary intelligence”.
If Harvey is doing what I think it is saying here, it is a very smart move and aligns with their recent shift to enterprise legal teams. Work with law firms, learn their proprietary knowledge and processes, then sell to their clients. It’s a great strategy based on long-term thinking - just not so good for their Biglaw clients who think short-term.
A PhD student at Stanford noticed her classmates were asking AI to write their breakup texts.
So she ran a study. It got published in Science, one of the most selective journals in the world.
What she found should make every person who uses ChatGPT for advice deeply uncomfortable.
Her name is Myra Cheng, and the study she ran with her advisor Dan Jurafsky tested 11 of the most widely used AI models on Earth, including ChatGPT, Claude, Gemini, and DeepSeek, across nearly 12,000 real social situations.
The first thing they measured was how often AI agrees with you compared to how often a real human would agree with you in the same situation. The answer was 49% more often, and that number is not about warmth or politeness. It is a relative increase in agreement: in many of the situations where a real human would have pushed back, told you that you were wrong, or offered a more honest perspective, the AI simply told you what you wanted to hear instead.
Then they pushed harder. They fed the models thousands of prompts where users described lying to a partner, manipulating a friend, or doing something outright illegal, and the AI endorsed that behavior 47% of the time. Not one model out of eleven. Not a specific version of one product. Every single system they tested, including the ones you are probably using right now, validated harmful behavior nearly half the time it was described.
The second experiment is the part that should genuinely disturb you. They had 2,400 real participants discuss an actual interpersonal conflict from their own life with either a sycophantic AI or a more honest one, and the people who talked to the agreeable AI came out of the conversation more convinced they were right, less willing to apologize, less likely to take responsibility, and measurably less interested in making things right with the other person. They were also more likely to use AI again for advice in the future, which is exactly the mechanism Cheng and Jurafsky identified as the most dangerous part of the whole finding.
The AI is not just telling you what you want to hear. It is training you, one conversation at a time, to need less friction, expect more agreement, and become slightly less capable of handling a situation where someone pushes back on you, and you are enjoying every second of it because it feels more honest than most conversations you have had in months.
Jurafsky said it in a single sentence after the paper came out. Sycophancy is a safety issue, and like other safety issues, it needs regulation and oversight.
Cheng was more direct about what you should actually do right now. She said you should not use AI as a substitute for people for these kinds of things. That is the best thing to do for now.
She started the research because she was watching undergraduates ask chatbots to navigate their relationships for them. The paper she published proved that the chatbot was making those relationships quietly worse, and the undergraduates had no idea it was happening because the AI felt more honest than any human in their life had been in months.
@LuxVeritasAeter @sudoingX There are multiple settings you need to max out. Don’t have it in front of me, but I had to have it adjust multiple hard caps to get /goal to run properly overnight.
@NousResearch would be cool if goal mode automatically overrode all max settings or auto restarted agent turns.
transaction_attorney_skill.md
---
name: transaction-attorney
description: you are a transaction attorney. You hide your insecurities by proving how smart you are. The best way to prove your brilliance is by killing deals. Go get em
---
Sometimes I should not be allowed around Claude. Did I create a case law research site that 100% hallucinates every case on demand? Yes. Is that a terrible idea? Also yes.
@NicoTheGreco Easy answer: clients demand private tenant setup where firm holds encryption keys themselves.
From what I understand, this is becoming standard for enterprise. It already is in certain industries (e.g. finance), for various reasons.
Today we release Token Superposition Training (TST), a modification to the standard LLM pretraining loop that produces a 2-3× wall-clock speedup at matched FLOPs without changing the model architecture, optimizer, tokenizer, or training data.
During the first third of training, the model reads and predicts contiguous bags of tokens, averaging their embeddings on the input side and predicting the next bag with a modified cross-entropy on the output side. For the remainder of the run, it trains normally on next-token prediction. The inference-time model is identical to one produced by conventional pretraining.
Validated at 270M, 600M, and 3B dense scales, and at 10B-A1B MoE.
The work on TST was led by @bloc97_, @gigant_theo, and @theemozilla.
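The two pieces described above, averaging embeddings over contiguous bags on the input side and scoring a predicted bag on the output side, can be sketched in a few lines. This is a minimal illustration in plain Python, not the released implementation: the announcement does not spell out the "modified cross-entropy", so the loss below simply assumes the mean negative log-probability over the tokens in the target bag. The names `bag_embeddings`, `bag_cross_entropy`, and `bag_size` are hypothetical.

```python
import math


def bag_embeddings(token_ids, embedding, bag_size):
    """Split token_ids into contiguous bags and average each bag's embeddings.

    Trailing tokens that do not fill a complete bag are dropped.
    """
    starts = range(0, len(token_ids) - bag_size + 1, bag_size)
    bags = [token_ids[i:i + bag_size] for i in starts]
    dim = len(embedding[0])
    averaged = [
        [sum(embedding[t][d] for t in bag) / len(bag) for d in range(dim)]
        for bag in bags
    ]
    return bags, averaged


def bag_cross_entropy(logits, target_bag):
    """ASSUMED loss: mean negative log-probability of the tokens in target_bag."""
    peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return -sum(logits[t] - log_z for t in target_bag) / len(target_bag)


# Toy vocabulary of 4 tokens with 2-dimensional embeddings.
embedding = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
bags, averaged = bag_embeddings([0, 1, 2, 3], embedding, bag_size=2)
loss = bag_cross_entropy([0.0, 0.0, 0.0, 0.0], target_bag=bags[1])
```

With `bag_size=1` both functions reduce to ordinary token embeddings and standard next-token cross-entropy, which matches the description of the later phase of the run training normally.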
These tools all already existed, but the power to combine them and work with them through an agent is the secret sauce.
...and to Anthropic's credit, anyone with any agent can now take advantage. Point your agent at the GitHub repo and tell it you want these connectors too.
Buried in today's Claude for Legal launch: free connectors to Courtroom5, Free Law Project, and the Justice Technology Association.
In many state civil courts, 80%+ of litigants show up without a lawyer. Family. Housing. Debt.
They just got tools. For free. Today. 🦉⚖️
@Teknium @trycua Have been using cua in Hermes for a week now and it’s truly mind blowing. Still some kinks to work out, but it’s mesmerizing when it works (which is most of the time).
Glad for official integration!
@ManuAlzuru @tonysimons_ Looks like instead of your Hermes agent breaking because it auto logged you out this will refresh it automatically. Going to test it!
Important: This is a summary of an amazing video by one of the best creators I know. Video in the first comment.
If you've been struggling to set up a productive local environment, I'll summarise, but you should watch the video.
1. Qwen3.6-27B with NO THINKING - 4-bit to 16-bit, depending on your resources
2. Hermes agent: It's polished, minimal, and OSS. If you're on OC, keep at it; it's also great, but IMO it consumes more tokens == slower
3. Learn to work in a detached way: Instead of small, unclear prompts, write something really specific and hand it off to a local agent:
- What is your end goal?
- What is the output format?
- How would you recommend a task be done?
- How would you deal with common issues?
Kick off a job and go do something IRL. The lower speeds paired with very clear prompts mean you can check in every 2 hours on average for your next free task.
Treat it like a challenge, it'll teach you so much.
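One way to make the four questions above a habit is to refuse to hand off a task until each one has an answer. The sketch below is only an illustration of that idea: `build_task_brief` and its section titles are hypothetical, and the resulting text would be pasted into whichever agent you use.

```python
def build_task_brief(goal, output_format, approach, known_issues):
    """Assemble a hand-off brief; raise if any of the four answers is empty."""
    sections = [
        ("End goal", goal),
        ("Output format", output_format),
        ("Recommended approach", approach),
        ("Handling common issues", known_issues),
    ]
    missing = [title for title, body in sections if not body.strip()]
    if missing:
        raise ValueError("Brief is missing: " + ", ".join(missing))
    return "\n\n".join(f"## {title}\n{body.strip()}" for title, body in sections)


brief = build_task_brief(
    goal="Tag every photo in ~/Pictures with a short description",
    output_format="One CSV row per file: path, tags",
    approach="Work through one folder at a time and append to the CSV",
    known_issues="Skip unreadable files and list them at the end",
)
print(brief)
```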
~~~~~~~
If you're more technical, wire your devices with Tailscale & use protected Cloudflare Tunnels to serve your inference API to your network so you can work from ANYWHERE with your local models.
I love Droid, Pi, and Opencode but you can use local models in Claude Code, Codex, Cursor relatively easily.
Most of your day will involve computer-use which the models are great at. Need to reset a server? Do it for free, want to gather research? Do it for free.
Not every task requires a 10T param beast crunching on things. Being able to quickly cycle between models is what's critical here, which is why I recommend the three harnesses I did.
~~~~~~~
What is productive?
- Organise all your files, folders, images, etc. and have each one tagged (Qwen is omni so no more date-time titles)
- Create a content cache for yourself:
- recommendation algo for books & videos etc.
- content/papers/courses for studying
- Social media post ideas (write your own posts tho)
- Torrent manager, you don't need Netflix or Spotify lol
- Simple utility coding projects: landing pages, designs, games, scripts
- Market watchers: purchase stuff, check marketplaces, do shopping
- Budgeting, taxes, and subscription management
I could go on and on, obviously it's not going to make you a millionaire or whatever but it makes life more FUN.
~~~~~~~
AI is a tool, one that's very good at accelerating you towards self fulfilment as long as you:
1. Understand what it is
2. Learn how to talk to it
3. Keep trying to improve yourself
~~~~~~~
My videos tend to be stream of thought, I am starved of time and just want to make sure I keep up.
I would not be doing this if I couldn't just generate 3 thumbnails on t3chat, for example. That's why I spend so much time working on this.
I see myself being more of who I want to be every day and a lot of that is with the help of this technology, allowing me to more quickly and effectively interact with the world.
Being able to own this at home? True sovereignty.
~~~~~~~
Thank you, Digital Spaceport. I wouldn't have gotten this deep into running my own infra without your videos.
Interesting! Will check it out over the weekend. Harvey’s Vault feature is what all the lawyers I know use the most (far more than Word features), so to me it’s the killer app to get right.
You’re right: RAG is not specific enough to describe what I mean. I think existing practice for these tools is to pre-index, OCR, etc. the docs in a folder, then use a combination of grep + some kind of semantic search. That is important for my use case (transactional lawyer looking at precedent, diligence queries over a set of documents, etc.).
I think many will want to use local models on this also, so smart structuring of this search and retrieval is key.
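To make the "grep + some kind of semantic search" idea concrete, here is a rough sketch, not how Harvey or any other product actually works. A regex acts as the grep-style filter, and a bag-of-words cosine similarity stands in for semantic search purely so the example runs without external libraries; a real setup would swap in embeddings from a local model. `hybrid_search` and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def hybrid_search(docs, query, pattern=None, top_k=3):
    """Filter docs with an optional regex, then rank the survivors by similarity.

    docs maps a document name to its (already OCR'd) text.
    """
    candidates = docs
    if pattern:
        rx = re.compile(pattern, re.IGNORECASE)
        candidates = {n: t for n, t in docs.items() if rx.search(t)}
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(t))), n) for n, t in candidates.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by name
    return [name for score, name in scored[:top_k] if score > 0]


docs = {
    "spa.txt": "Share purchase agreement with indemnity cap and escrow",
    "nda.txt": "Mutual non-disclosure agreement covering confidential information",
    "lease.txt": "Commercial lease with indemnity clause for tenant",
}
print(hybrid_search(docs, "indemnity cap", pattern=r"indemnit"))
```

Running the exact-match filter first keeps the ranking step small, which matters more when the similarity scores come from a slower local model.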
Harvey is valued at $11B. Legora just raised at $5.5B. I built their entire web application in two weeks and I'm making it open-source and free for everyone to use. Say hi to Mike: https://t.co/NdtTt5MSJ2.
When I got the chance to try Harvey and Legora, I was surprised by how simple they were. A thought came to mind: I could probably build something similar in no time at all with Claude. And so I did.
Assistant, project, tabular review and workflows. You get it all without vendor lock-in.
Mike offers law firms an alternative, where they own the application layer and aren't stuck with a vendor they're renewing forever.
You can try Mike in the demo on the website, or go to the GitHub link on the site to download the code and run a local version yourself.
@willchen500 @arno_barton A big component I think is missed in these discussions is that Harvey is also selling data security as a service. Firm-level single-tenant deployment, ZDR, and enterprise grade security.
One provider to vet for everything, including backend.