Since nobody took me up on my $60 bounty for solving the alignment problem, I made a website for it: https://t.co/Mzb3DtplnN
Would someone please just solve it? I'm getting kind of tired of this whole AI risk thing.
Open-weight LLMs ship with safety training that can be stripped in a few hundred fine-tuning steps. Can current defenses stop this?
We built and open-sourced TamperBench, the first unified framework for evaluating tamper resistance, and the answer is mostly no. 1/7
@AdrienneLaF This is the second Atlantic article I’ve read in the past month that sort of makes me regret my subscription.
I suppose if you must have poorly reasoned articles about AI, you might as well get a skilled writer.
@r_vonhagn @Noahpinion @rex_the_reptile I don't think SF has NYC beat on weird people generally, but SF has many people like this. Used to be two homeless pirates near my office. An angry one and a happy one.
For non-homeless weird, there's the guy with the purple Ferrari and pet duck https://t.co/3cDr5qHNiW
This is sort of tough because if I give an example of a technology not developed/deployed you can say "not yet" or "not possible" or something. Anyway:
- human cloning (not deployed)
- nuclear power (way underdeployed vs value)
- geoengineering (neither developed nor deployed)
- human challenge trials (way underdeployed vs value)
Jeffrey and I (mostly Jeffrey) worked on this a few years ago: https://t.co/PRQU2QoOtC
I’ve said enough about my disagreements with some of the ideas below, or at least with the certainty Pope Leo is expressing about them, but can we also reflect for a moment on how wild it is that the Pope is tweeting stuff like this? It feels lifted from a screenplay about takeoff.
Yeah, and when I lived in a city where I used a car to get around, I would offer to pick up dates.
But picking someone up in a car is different from ordering an uber for them. You're asking for their trust to get in a car with you, you're spending time together in the car, and you're putting in effort to do it. Ordering an uber for someone doesn't have these benefits, which is why it hadn't ever occurred to me that someone would expect me to do it for them.
BTW the thing you mention in the other thread where he added a stop to his own uber seems great and checks the other boxes (mostly--less trust to get in a car with an uber driver present).
Seems like at least 3x as much logistics to ask him to send an uber as to just order one yourself, though, right?
Like, I understand it feels good to have the guy do stuff for you and spend money on you. But in this case, she was making a bid for both of them to spend more effort, so that she could make sure he was willing to pay the cost to see her.
@freed_dfilan @NinaPanickssery Identifying causes is hard, and I don't see anything special happening during those years on graphs of religiosity over time. But I suppose internet sneering could have helped. There was latent demand to leave religion and people didn't feel they had social permission.
@boazbaraktcs I think you're misunderstanding the attitude, and I don't expect the views upstream of caution around kids using LLMs without supervision would also generate caution around kids using a computer in the 80s.