I've created a benchmark that measures agent performance in basic Power BI report tasks, showing that directly modifying PBIR files is worse, more expensive, and (often) slower than using pbir-cli.
The tests evaluate 30 simple tasks like "create a card that shows the total invoice lines, using <tool or skill>". A task passes only when the agent completes it correctly in 3/3 replicates, so if it succeeds only 1 or 2 times, it still fails.
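To make the 3/3 rule concrete, here is a minimal sketch of how such a strict pass rate could be computed. The function name, the `(task_id, passed)` input shape, and the sample task names are assumptions for illustration, not the benchmark's actual harness.

```python
from collections import defaultdict


def strict_pass_rate(results, replicates=3):
    """Fraction of tasks that passed in every replicate.

    `results` is an iterable of (task_id, passed) tuples, one per run.
    This input shape is hypothetical; the real benchmark may store runs differently.
    """
    by_task = defaultdict(list)
    for task_id, passed in results:
        by_task[task_id].append(bool(passed))

    if not by_task:
        return 0.0

    # A task counts only if it has the expected number of runs and all succeeded.
    passed_tasks = [
        task for task, runs in by_task.items()
        if len(runs) == replicates and all(runs)
    ]
    return len(passed_tasks) / len(by_task)


# Hypothetical runs: "card" passes 3/3, "slicer" passes only 2/3 and so fails.
runs = [
    ("card", True), ("card", True), ("card", True),
    ("slicer", True), ("slicer", True), ("slicer", False),
]
print(strict_pass_rate(runs))  # 0.5
```

Scoring this way penalizes flaky behavior: a task the agent gets right two times out of three counts the same as one it never gets right.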
The results were quite interesting. A plugin with skills and small CLIs for validation didn't meaningfully increase performance (esp. not for Opus 4.8), but ballooned token cost and time. In contrast, the pbir-cli alone was sufficient to increase performance significantly on those tasks, and adding the skill made it faster and use fewer tokens. "Dumber" models took longer with the CLI because they spent more time exploring or iterating, and seem worse at using skills in general.
Obviously, I'm an author of the pbir-cli (together with @maxanatsko), so it's appropriate to be skeptical of a benchmark that favors my own tool, but I did try to make these tests objective and fair.
Note this benchmark only evaluates performance as a function of "can it complete the task" not "do the report / visuals look nice". I'm building a design evaluation as well but it takes a lot longer since design is subjective and coming up with good, repeatable tests that aren't handwavy marketing bs (like "make the report from this image") is challenging.
There's a new release of Power BI agentic development skills. New paginated report skill, updates for pbir-cli, and more. Also lots of useful stuff for Claude Code users (status lines, hooks, etc.) Check it out: https://t.co/7uAotDHabI
New plugins
- Paginated reports: Experimental paginated reports skill, quite highly requested. I don't use paginated reports much but try it out and LMK what you think.
- Custom visuals: I moved all the custom viz skills into a new plugin. New skill for custom pbiviz, as well as the MCP server from Microsoft.
- ETL: This is mostly copied out of the fabric cli skill. If you just want the ETL stuff from the Fabric CLI skill this is for you. Otherwise ignore it if you're content with the Fabric CLI skill (like me).
Changes
- Custom viz skills (svg, deneb etc) moved out of reports plugin
- Power BI Design skill is being deprecated (WIP) in preparation for a massive restructuring of the skills over the next 1-3 releases.
- pbi-cli skill updated for 26.25
- pbir-format skill (for those not using pbir-cli) updated to include many more validation criteria
- connect-pbid hooks made less "noisy"; connect-pbid skill also got a few enhancements
Friendly reminder that the Fabric CLI is amazing and you should try it.
Here’s the actual skill I use to set up my personal AI advisor.
I like to keep the SKILL md concise and clean so that I can share it with others. It tells Claude Code or Codex:
→ What role to play
→ Which context files to read
→ How to give advice
→ When to save new learnings
But the real magic with an advisor skill is the personal context you give it. That lives in files like:
→ plan md: Goals, principles, energy, life
→ learnings md: Patterns from past chats
→ eval md: A checklist for AI to give better advice
The more context your advisor has, the less generic the advice gets.
📌 Full tutorial: https://t.co/V5e2RCIinT
pbir-cli v0.9.25 is out. Lots of new things:
- Hot reload TMDL to Power BI Desktop
- Screenshot all pages at once
- Huge improvements to property discovery / validation
- New command groups for bulk font / color ops
New experimental commands
- Convert legacy reports to PBIR programmatically
- Migrate thin report measures to the model
- Query report usage metrics
Here’s my new tutorial on how to turn Codex or Claude Code into an AI advisor for life and career decisions.
It involves setting up an /advisor skill with 4 files:
→ SKILL md: How the advisor should behave
→ plan md: Your goals, principles, energy, etc
→ learnings md: Insights from past chats
→ eval md: A checklist AI runs before giving advice
This is my favorite skill because it knows my goals, gives me useful feedback, and gets better the more I talk to it.
📌 Watch now: https://t.co/CXw9rwD8Oa
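The four-file /advisor layout from the tutorial post above can be sketched as a small scaffold script. This is only an illustration: the `advisor` folder name, the dotted file names, and the placeholder contents are assumptions, and where Claude Code or Codex expects skills to live (and what a real SKILL file must contain) is not covered here.

```python
import tempfile
from pathlib import Path

# File names and one-line purposes, taken from the post; the ".md" spelling is assumed.
ADVISOR_FILES = {
    "SKILL.md": "How the advisor should behave",
    "plan.md": "Your goals, principles, energy, etc.",
    "learnings.md": "Insights from past chats",
    "eval.md": "A checklist the AI runs before giving advice",
}


def scaffold_advisor(root):
    """Create placeholder advisor files under <root>/advisor (hypothetical location).

    Existing files are left untouched. Returns the names of files created.
    """
    skill_dir = Path(root) / "advisor"
    skill_dir.mkdir(parents=True, exist_ok=True)

    created = []
    for name, purpose in ADVISOR_FILES.items():
        path = skill_dir / name
        if not path.exists():
            # Placeholder only: replace with your own instructions and context.
            path.write_text(f"<!-- {purpose} -->\n", encoding="utf-8")
            created.append(name)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing is written to a real project.
    with tempfile.TemporaryDirectory() as tmp:
        print(scaffold_advisor(tmp))
```

Skipping files that already exist matters here, since plan and learnings hold personal context that a re-run should never overwrite.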
Org Apps in #MicrosoftFabric are now generally available. You can now have multiple audiences and use CI/CD, and my personal favourite: it automatically propagates access to underlying items and semantic models. https://t.co/jjhDOE1RCS
Last week @AnthropicAI published an article about how they are having success with agents and analytics / BI.
The article had some useful insights and reminders that we highlight in this week's Tabular Editor blog.
Link (Tabular Editor Blog): https://t.co/kz4p4r2SJL
Original article (Anthropic): https://t.co/GAeFUC7W9x
A few highlights in a nutshell:
- Remember that success with BI / analytics is not a technical KPI / problem. Just because an agent is generating most of the queries and getting them correct doesn't mean it's bringing business value. User training and adoption should never be neglected - AI or no AI.
- The fundamentals are more important than ever. Reminder: If you get these fundamentals right, it's going to help you get success with BI across the board... with or without AI. This includes dimensional/semantic modelling, data quality, governance/oversight, and source control and automated testing with CI/CD. The tools we make at Tabular Editor (such as our new CLI) are laser-focused on this problem.
- We should be treating metadata, skills, and documentation as first-class citizens. This means that they are human-owned and curated, but also in source control and ideally also tested before use or distribution. Agents can help, but shouldn't create the context de novo.
- Some things we may need to re-think, such as co-location of artifacts in workspaces rather than separating them by item type, or telling self-service users to use PBIX + OneDrive if they want to use agents, instead of metadata-first formats.
@jglopes26 after i got nuked by the flu or whatever that was in may i've had to put off all these updates. hoping to get some good ones out soon, at a decent cadence again. The refresh and screenshot of pbir-cli + release of te-cli are both massive amplifiers here.
A preview of the next version of the pbir-cli, which can refresh and screenshot the Power BI canvas using the new Power BI desktop preview feature from Microsoft.
This makes the agent a lot more powerful, and the experience more fun / satisfying. View in full-screen!
The power-bi-agentic-development skills have been updated to the latest version.
A lot of new updates, fixes, improvements. Biggest is that connect-pbid and pbir-format can use the new APIs exposed by Power BI Desktop to refresh the canvas and get screenshots.
Other updates:
- Semantic model skill
- Improvements and fixes to pbir-format and connect-pbid
- Enhancements to various report skills
- Copied "te-cli" skill from TabularEditor/CLI into the tabular-editor plugin
Note:
- I plan to overhaul the skills significantly over the next weeks to make them easier to fork, template, and own. Part of this will involve skill routing but also strategies to instruct the agent more forcefully to maintain its own context based on interactions with you.
𝐍𝐨𝐛𝐨𝐝𝐲 𝐝𝐨𝐜𝐮𝐦𝐞𝐧𝐭𝐬 𝐭𝐡𝐞𝐢𝐫 𝐏𝐨𝐰𝐞𝐫 𝐁𝐈 𝐚𝐧𝐝 𝐅𝐚𝐛𝐫𝐢𝐜 𝐨𝐛𝐣𝐞𝐜𝐭𝐬. 𝐁𝐞𝐜𝐚𝐮𝐬𝐞 𝐢𝐭 𝐭𝐚𝐤𝐞𝐬 𝐟𝐨𝐫𝐞𝐯𝐞𝐫.
Until now.
I used 𝐆𝐢𝐭𝐇𝐮𝐛 𝐂𝐨𝐩𝐢𝐥𝐨𝐭 + 𝐅𝐚𝐛𝐫𝐢𝐜 𝐒𝐤𝐢𝐥𝐥𝐬 to auto-generate documentation for Microsoft Fabric and Power BI objects. The AI knows your environment. It understands Fabric. It writes the docs for you.
This changes the workflow completely.
Video in the comments 👇
#MicrosoftFabric #PowerBI #GitHubCopilot #FabricSkills #DataEngineering #PowerBIDeveloper #AITools #VSCode #FabricAnalytics #BusinessIntelligence #GitHubCopilotCLI #DataPlatform #PowerBITips #AIAutomation
This chart from Anthropic is useful, since Agent Teams and Workflows are both very new and very powerful (and token hungry).
On the other hand, maybe it doesn't matter, as many of the decisions about which approach to use are made by the AI itself, and it often uses them in combination.
I have not been using fabric data agents because they only provided data in tables or text. The cognitive load is just too big - it doesn't matter if it's correct or fast, I don't want a wall of text.
However it seems soon they will be able to render visuals. Huge improvement!
Data apps are more complex than power bi reports since they are all code. Some of this is offset by AI but not all.
That said, data apps work miles better with AI than Power BI does, since it's actual code. You get better results faster and without MacGyvering visuals or the model. When I tried making the same design in power bi vs data apps it took me 80% less time in the data app, and that was for something relatively simple.
I have said that data apps don't replace power bi reports, but the truth is that they do raise some uncomfortable questions about the future of reports. There are many consequences of this now being possible.
For instance, from now on, anytime someone demos a dashboard or shares a screenshot, you can't tell from looking at it whether it's a report or a fabric app unless that UI is visible or it's disclosed.
Further, if I share something for power bi, you need to know how to make it. Even with AI you can't replicate it without the config. In a data app, you give a screenshot to Claude and say "make this", and it works 98% of the time. The whole "economy" that's built up around power bi content has the potential to completely change because of that. But more importantly it means users don't have to learn the weird wizardry of power bi UI manipulation to make what they want - they can focus on real design.
For reporting where people need basics like subscriptions, export to Excel, etc., you should stick to power bi.
For scenarios where you want actual good visualization, though, it's not even close. The numbers favor data apps by a long shot. It is just a question of whether you can manage the step up in complexity, and what you will do if AI prices get too high down the road.
Regardless, I expect we will start seeing many cases where someone approaches their boss with something impressive they built in one hour, and decisions are made. It also has the same "five minutes to wow" of early power bi, but on steroids.