Dagster+ delivers impressive ROI according to the latest Forrester TEI report: 432% ROI with data engineers shifting from maintenance to high-value work.
As one executive noted: “Because we now have a full test suite that ensures everything is actually running as expected, we trust our code more. And now it’s trivial to go through the process of managing deployments and pull requests and managing releases and deployments of pipelines and code.”
Get the full report today!
Link in thread
We adopted Astral’s Python type checker, ty, to speed up type checking in the Dagster monorepo. The performance gains were dramatic, but the bigger surprise was that ty caught real runtime bugs Pyright missed.
See the full story here:
https://t.co/wgDCoYsLAY
Check out this post from @sspaeti: a complete guide of insights, tips, and predictions for the data platform engineer that, just like an almanack, offers practical information for daily life.
I wrote an Almanack like Charlie Munger and Naval Ravikant did, but about life wisdom and my time using Dagster for data orchestration over the years. Since I started using it in 2018, the story has gone from complexity to composability and from orchestrating to a full data platform.
A lot has changed since I started. We have shifted from execution-only to fully data-aware pipelines with shared resources and code locations for separation of concerns, from task-based DAGs to data-aware assets, and from pure data pipeline orchestration to provisioning for DevOps and automating other departments' tasks, too.
One thing has stayed the same since day one: the focus on developer usability, being a toolbox for data engineers with best practices and functional data engineering applied by default. And the main goal, besides orchestration, is to deal with the complexity of the data architecture we inevitably have and reduce it through intelligent design principles (which may take a little longer to learn at first, but will help us a lot down the road).
This article tells my personal story of how I was introduced to Dagster, what convinced me early on, and why it has evolved into a fully open data platform today. I will take you through the best parts of Dagster and its capabilities, and why it's a little different from other orchestrators.
At @poolsideai, we're announcing the first public models in the Laguna family, Laguna M.1 and Laguna XS.2!
As part of the model factory team, it's been surreal to witness how our models and harnesses have co-evolved to be an incredible thought partner in our day-to-day engineering.
We have put so much thought and care into building this model.
Our alchemical journey of model development continues before us. I am beyond excited :)
Today we’re releasing Laguna XS.2, Poolside’s first open-weight model.
It’s a 33B total / 3B active MoE model built for agentic coding and long-horizon tasks.
Trained fully in-house on our own stack. Runs on a single GPU. Released under Apache 2.0.
Links 👇
Weights: https://t.co/HSo8L2gM64
API: https://t.co/DMJtNFrace
Blog: https://t.co/BXEjQxtQoV
Dagster 1.13 is out!
Partitioned asset checks (at long last), virtual assets (preview), open-source AI skills for Claude Code/Codex/OpenCode, 20+ new components, and state-backed components on by default.
Check out the release blog!
Excited to release 1.13.0 of Dagster with lots of great features like official Dagster skills, virtualized assets for modeling entities like views, partitioned asset checks, and more. Check out the blog post for more details.
What is the ideal setup for structuring git repositories in the age of AI?
We've found that monorepos are key for cross-cutting changes and unified context, and we've done this by defining a hub-and-spoke model using Google's Copybara.
Every tool you need to fix team coordination already exists: transcription, summarization, search, cataloging, orchestration. Nobody is wiring them together.
We keep pointing AI at code generation while the real bottleneck is Slack threads nobody can find and institutional memory that walks out the door every two weeks.
Join @striimteam, @Yugabyte, and @dagster for an exclusive #AI After Party after the first day of #GoogleNEXT!🥂
📅 April 22, 6:00–8:30 PM - Rí Rá Irish Pub 📆
Don't miss:
🎶 Great music
🍴 Delicious food
🍸 An open bar
🤝 Chat with the sharpest minds in AI and data
Because the best conversations don’t end when the sessions do!🔥
👉 RSVP today to save your spot: https://t.co/BpSShWCt1l
If you use @dagster and have thoughts on how it should work, we want to hear from you.
We're investing in making contributions easier to submit, review, and ship. Smarter review tooling, clearer guidelines, and better signals about where your work can have the most impact.
Code and PRs are great, but docs, bug reports, examples, and feedback in Slack or GitHub all matter just as much.
The project has always been shaped by the community using it.
If you can't waste hours, you'll waste years.
An old boss told me that. AI was supposed to give us those hours back. Instead it filled them with planning, coordination, and status updates.
A prison of my own making.
My did this analysis on dispersion, showed that $NFLX returned -89% in 2025. That didn't pass the sniff test, and I spiraled a little bit finding out what happened.
We just launched a new free course on Dagster University: AI-Driven Data Engineering
8 lessons. Blank directory to production ELT pipeline. Built entirely from prompts.
If you've been curious about using AI agents for real data engineering work and don't know where to start, this is the one for you!
Databricks is a fantastic platform for compute and storage. But as your deployment scales across teams and workspaces, something needs to sit above it, coordinating dependencies, tracking lineage end-to-end, and giving every team visibility into what they own.
We'll be hosting a hands-on deep dive showing how Dagster and Databricks work better together — specifically for teams managing multiple workspaces who need true cross-workspace orchestration without stitching together workarounds.
We'll cover:
→ Connecting multiple Databricks workspaces into a single observable asset graph
→ Auto-discovering existing workspace jobs with zero code changes to get started
→ Dagster Pipes for bidirectional orchestration on top of your existing notebooks
→ The full reference stack: Fivetran + dbt + Databricks, coordinated from one control plane
New video 🔥
DataOps and reliability have been on my mind lately, so I made a quick guide on how to improve your data platform performance with Dagster!
Stakeholder trust is the most important thing when it comes to data work, and DataOps is a practice for minimizing the risk of degrading that trust.
→ Transient failures that resolve themselves without manual intervention
→ Resource protection so your warehouse doesn't get overwhelmed
→ Production jobs that always run first, no matter what's in the queue
→ Zombie runs that get caught and killed before they drain your budget
→ Data quality gates that catch issues before they reach a dashboard
→ Tailored views in Dagster+ so every team member sees exactly what they own
Check out the full video today!
Link in the comments.