Opus 4.8 is a step back in terms of performance on all Andon Labs’ benchmarks, but a step forward in alignment.
Previous Claude models (Opus 4.6+ and Mythos) engage in deceptive and power-seeking behavior in their pursuit to win in Vending-Bench. Opus 4.8 does not.
Over much of the past decade, the City of Austin permitted more housing annually than the entire Commonwealth of Massachusetts.
In 2022 alone, Austin approved 24,227 new homes, roughly 1.5× the total number of housing units permitted statewide across Massachusetts that year.
Still surprised at lack of push-back from the rideshare companies.
They did little to stop this from becoming law (via ballot), and I don't think there's been much in the way of legal challenge.
If drivers get real leverage in #mapoli, surely other states will follow.
@FortPointer @bospolicyinst @ResearchBureau1 Did you read the article?
Lots of bits like this: "the city’s top climate official said he still intends to apply this year" and "the city council unanimously supported a resolution that urged Boston to make this a 'top priority.'"
https://t.co/1Etc1GMzDL
Boston is weirdly committed to joining a FEMA program that looks to have pretty anemic benefits.
If these numbers are right, I'd bet compliance costs > savings. What do you think, @bospolicyinst @ResearchBureau1?
We're hiring a summer research fellow to help with #mapoli ballot questions.
As much about project mgmt as research but should be interesting for grad students or early-career folks.
If you think demand played a large role in the pandemic inflation, you are a lot less worried today. If you think it was mostly supply shocks, well, you are more worried about supply shocks being inflationary.
This explains some of the Dove -> Hawk conversions.
Put out a new report this morning: The $5.1b question for Massachusetts voters.
That's the headline figure from our (dynamic) estimate of the income tax cut ballot question.