I've created a benchmark that measures agent performance on basic Power BI report tasks, showing that directly modifying PBIR files performs worse, costs more, and is (often) slower than using pbir-cli.
The tests evaluate 30 simple tasks like "create a card that shows the total invoice lines, using <tool or skill>", and a pass is when the agent does it correctly in 3/3 replicates. So if it succeeds only 1 or 2 times, it still fails.
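To make the pass rule concrete, here is a minimal sketch of strict 3/3 scoring. It assumes each run is logged as a (task, succeeded) pair; the function name, the task IDs, and the data are hypothetical illustrations, not the benchmark's actual harness or results.

```python
from collections import defaultdict


def strict_pass_rate(runs, replicates=3):
    """Score tasks under an all-replicates-must-pass rule.

    runs: iterable of (task_id, succeeded) pairs, one per replicate.
    Returns (per-task pass dict, fraction of tasks passed).
    """
    by_task = defaultdict(list)
    for task_id, succeeded in runs:
        by_task[task_id].append(bool(succeeded))

    # A task passes only if every expected replicate ran and succeeded.
    passed = {
        task_id: len(results) == replicates and all(results)
        for task_id, results in by_task.items()
    }
    rate = sum(passed.values()) / len(passed) if passed else 0.0
    return passed, rate


# Hypothetical example data: task A succeeds 3/3, task B only 2/3.
example_runs = [
    ("hypothetical_task_a", True),
    ("hypothetical_task_a", True),
    ("hypothetical_task_a", True),
    ("hypothetical_task_b", True),
    ("hypothetical_task_b", False),
    ("hypothetical_task_b", True),
]
passed, rate = strict_pass_rate(example_runs)
print(passed, rate)
```

Under this rule a task that succeeds 2 out of 3 times counts the same as one that never succeeds, so the score rewards reliability rather than occasional success.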
The results were quite interesting. A plugin with skills and small CLIs for validation didn't meaningfully increase performance (esp. not for Opus 4.8), but ballooned token cost and time. In contrast, the pbir-cli alone was sufficient to increase performance significantly on those tasks, and adding the skill made it faster and use fewer tokens. "Dumber" models took longer with the CLI because they spent more time exploring or iterating, and seem worse at using skills in general.
Obviously together with @maxanatsko I'm an author of the pbir-cli so it's appropriate to be skeptical of my benchmark that favors my own tool, but I did try to make these tests objective and fair.
Note this benchmark only evaluates performance as a function of "can it complete the task" not "do the report / visuals look nice". I'm building a design evaluation as well but it takes a lot longer since design is subjective and coming up with good, repeatable tests that aren't handwavy marketing bs (like "make the report from this image") is challenging.
@kurtbuhler So I asked Claude to restore it from the JSON data, and surprisingly, it restored the map, with all the data naming intact.
What a time to be a BI developer :D
@kurtbuhler Yesterday, I asked Claude to retrieve a lost SVG file that had been embedded via a synoptic map visual, which is now deprecated, unsupported, and won't open. I tried extracting the zip file to see if the SVG was stored separately, but it wasn't.
@kurtbuhler Agreed 100%! I think straightforward reports will move to Fabric data apps first, but I believe data apps will eventually replace Power BI. I’ve already started building more data web apps instead of Power BI reports because it’s significantly faster with AI agents.
While I think what Anthropic does is sad for the ecosystem, I wanna give Boris credit for doing what he can to soften the fallout.
Today's release will include some fixes for better cache use, to lower cost for API users.
If you are the top guy, you should do this:
exit ASAP, fly to an Asian country where you don’t speak the language, install Claude Code, go to a cafe for 10 hours every day, drink with locals at night, and do not enter America again until you are at $1m ARR
I don't think people understand the gravity of the situation as the UN is preparing for possible nuclear weapon use in Iran.
This is a picture of Tehran. For you uneducated, untraveled, never-served, warhawks licking your chops at the thought of bombing it. It's not some low population desert. There are families, children, family pets. Regular working class people with dreams. You're sick to want war.
Tehran is a city of nearly 10,000,000 people. Imagine Washington, Berlin, Paris, London, or any other city being bombed with nuclear weapons.
I gave up my diplomatic career to leak this information. I suspended my duties so as not to be part of or a witness to this crime against humanity, in an attempt to prevent a nuclear winter before it is too late.
Yesterday, nearly ten million people protested “No Kings” in the United States. The possibility of the use of nuclear weapons must be taken very seriously. It's dangerous. Act now. Spread this message worldwide. Take the streets. Protest for our humanity and future. Only the people can stop it. History will remember us.
The first things you want to buy once you are financially free are often the things you grew up without, but you can unfortunately not buy the most important ones directly: health, love, confidence, mental clarity, discipline, talent. Earning them is quite straightforward though:
US-Israel nuclear trap
Under immense global economic pressure and facing a domestic inflation crisis, President Trump, in coordination with Israel, made the fateful decision on March 21 to bomb the Iranian nuclear facility at Natanz.
Iran has interpreted this strike as a deliberate attempt by the coalition to trigger a catastrophic nuclear incident, a “shortcut” to victory through environmental and humanitarian disaster.
This bombardment comes only days after a projectile struck the vicinity of the Bushehr reactor, sparking widespread protests. Yesterday, Iran retaliated with a heavy strike on Dimona, home to Israel’s most critical nuclear reactor.
There is no doubt that Iran's most modern arsenal includes some precision-guided missiles capable of striking the Dimona reactor, and the reverse is equally true.
We are witnessing a terrifying escalation, with both nations trading threats of triggering a nuclear incident. This is the definition of wartime insanity.
The situation is further complicated by the fact that Iran has not been inspected for nearly nine months, more than enough time to have clandestinely developed a nuclear weapon.
Technically, Iran is already an ambiguous nuclear power; it possesses the chemical material, the technological means, the timeframe, and at least 4 to 6 models of modern dual-capable missiles designed to carry nuclear warheads.
What if this ends in a nuclear exchange? We must recognize the sheer madness this senseless war is driving us toward. At no point has the conflict de-escalated. On the contrary, the range and sophistication of Iranian missile launches continue to expand.
The coalition’s munitions are running low, yet Iran keeps firing. The war is becoming prohibitively expensive; inflation is surging in the U.S. and around the world; Gulf nations face billion-dollar losses; Asian allies are undersupplied; the Strait remains blocked; and the bombings have failed to produce the desired effect. With the war's popularity falling and elections approaching for Trump, and Netanyahu seeking a judicial reprieve, the pressure is mounting.
This pressure is fertile ground for nuclear escalation, the dangerous belief in a “magic bullet” that could end all problems in a few hours.
However, in Iran’s case, this may only invite a devastating retaliation from hardened silos using nuclear weapons that, at a basic level, could be assembled in a week.
By targeting Natanz, the coalition has already violated Article 56 of the Additional Protocol I to the Geneva Conventions, which forbids attacks on nuclear facilities, even when they are military objectives, due to the risk of “releasing dangerous forces.”
They are opening a nuclear door that is escalating faster than anticipated, and if not contained now, the fallout will be far greater than anyone predicted.
Join my Substack: https://t.co/tfm3rw40K6
And support my work here 👉🏻 PayPal: [email protected] pix: [email protected] Solana wallet: HoRmrU2wa2LKaD81N9DwV22rxX2e4QRfqn46uFtG1qok
🇦🇺An Australian tech founder with zero biology background sequenced his dog’s tumor DNA, then used ChatGPT and AlphaFold to design a custom mRNA cancer vaccine.
A month later, the tumors shrank by half.
And this is just the start of AI medicine.
this is actually insane
> be tech guy in australia
> adopt cancer riddled rescue dog, months to live
> not_going_to_give_you_up.mp4
> pay $3,000 to sequence her tumor DNA
> feed it to ChatGPT and AlphaFold
> zero background in biology
> identify mutated proteins, match them to drug targets
> design a custom mRNA cancer vaccine from scratch
> genomics professor is “gobsmacked” that some puppy lover did this on his own
> need ethics approval to administer it
> red tape takes longer than designing the vaccine
> 3 months, finally approved
> drive 10 hours to get rosie her first injection
> tumor halves
> coat gets glossy again
> dog is alive and happy
> professor: “if we can do this for a dog, why aren’t we rolling this out to humans?”
one man with a chatbot and $3,000 just outperformed the entire pharmaceutical discovery pipeline.
we are going to cure so many diseases.
I don't think people realize how good things are going to get
The goal of solopreneurship is to become unemployable.
So free, so selective, and so allergic to nonsense that no job could ever compete with what you've built for yourself.