Transformers are great for sequences, but most business-critical predictions (e.g. product sales, customer churn, ad CTR, in-hospital mortality) rely on highly-structured relational data where signal is scattered across rows, columns, linked tables and time.
Excited to finally share what I have been working on over the last year: a Foundation Model architecture which brings the power of Transformers to relational domains, enabling large-scale pretraining and zero-shot generalization in enterprise settings. 🧵1/n
Can reasoning models become overly reliant on chain-of-thought examples? 🤔
Our #ACL2026 work shows excessive CoT supervision is not always beneficial, and gives a recipe for tuning the CoT fraction to improve novel-task accuracy. 🧵
Website: https://t.co/hZmPCF6bue
Relational Foundation Models face a scaling problem: diverse training datasets are rarely public due to privacy constraints 🔒.
🚀 We are excited to introduce "PluRel": a framework that synthesizes diverse multi-table relational databases from scratch, unlocking scaling laws for RFMs. 🧵
Kudos to the amazing collaborators at @StanfordAILab, @Kumo_ai_team, and @SAP: @_rishabhranjan_ @VHudovernik @vijaypradwi @johanneshoffart @guestrin @jure
If you're at ICLR, come check out our poster for RelBench v2 (https://t.co/YuABahiFdi) at the DATA-FM (Data for Foundation Models) Workshop! Apr 26, Hall 203 A/B 🇧🇷
Although relational databases are everywhere, there is no equivalent of the public internet for pretraining Relational Foundation Models (RFMs). Excited to see RelBench bridging that gap, growing from 7 datasets in v1 to 88+ datasets in v2.
Deeply grateful to the numerous community contributions for helping RelBench serve as the central data repository for RFM research. ❤️
Thoroughly enjoyed the discussions on PluRel and Relational Foundation Models during the talk! Thanks to an amazing audience @tempgraph_rg
Slides: https://t.co/04P226ajil
Website: https://t.co/0JvREDwu9a
Github: https://t.co/FIVNrovEDT
Enjoyed presenting our ICLR 2026 work (Relational Transformer) at the TGL reading group today. Thanks for the insightful discussion!
Slides from today: https://t.co/xouFt7njSR
Paper: https://t.co/JXikptIiEz
Code, data, models: https://t.co/sg0jVLcuBq
This Thursday (Feb 19, 11am EST) at the reading group: Rishabh Ranjan (Stanford) presents Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data.
Paper & code: https://t.co/GIfqdgJpXU
Hope to see you there! The Zoom link is on the website.
Excited to talk about our recent work on Relational Transformers at the TGL Reading Group tomorrow. Please drop by on Feb 19, 11am EST (see https://t.co/hVqC7pCzOm for Zoom link).
Quite exciting work on synthetic data generation that for the first time demonstrates scaling laws for graph/relational foundation models.
Great work by @kvignesh1420, @_rishabhranjan_, @VHudovernik and our collaborators at @Kumo_ai_team and @SAP
@navneet_rabdiya Yes! We use a Hierarchical Stochastic Block Model (HSBM) to randomly sample bipartite graphs that capture realistic foreign-key to primary-key relationship patterns. Please check the paper for more details.
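To make the HSBM idea in the reply above concrete, here is a minimal sketch of sampling foreign-key links from a two-level block model. It is an illustration only, not PluRel's actual procedure: the function names, the two-level hierarchy, the uniform block assignment, and the affinity weights are all assumptions made for this example. Each child row picks exactly one parent row, so the result is a bipartite graph in which child rows have degree one.

```python
import random

def assign_blocks(n, n_super, n_sub, rng):
    """Give each of n rows a hypothetical (super_block, sub_block) label, uniformly at random."""
    return [(rng.randrange(n_super), rng.randrange(n_sub)) for _ in range(n)]

def affinity(child, parent, w_same_sub=8.0, w_same_super=2.0, w_other=1.0):
    """Unnormalised link weight: rows closer in the hierarchy are more likely to link.

    The three weights are assumed values chosen only for illustration.
    """
    if child == parent:          # same super-block and same sub-block
        return w_same_sub
    if child[0] == parent[0]:    # same super-block, different sub-block
        return w_same_super
    return w_other               # different super-blocks

def sample_foreign_keys(n_child, n_parent, n_super=2, n_sub=2, seed=0):
    """Return one parent index (the foreign key) for every child row."""
    rng = random.Random(seed)
    child_blocks = assign_blocks(n_child, n_super, n_sub, rng)
    parent_blocks = assign_blocks(n_parent, n_super, n_sub, rng)
    fks = []
    for cb in child_blocks:
        weights = [affinity(cb, pb) for pb in parent_blocks]
        fks.append(rng.choices(range(n_parent), weights=weights, k=1)[0])
    return fks

fks = sample_foreign_keys(n_child=50, n_parent=10, seed=1)
```

Raising the same-block weight relative to the others concentrates each child's links on parents in its own block, which is the kind of community structure a block model is meant to express.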
Synthetic data is critical for foundation models, even more so in relational and tabular domains where public data is scarce. Our new work shows how synthetic pretraining unlocks a whole new axis to scale up relational foundation models (RFMs)!
This was a super fun collaboration with @kvignesh1420, @VHudovernik, @vijaypradwi, @johanneshoffart, @guestrin and @jure.
Paper: https://t.co/4jzttapESf
Code, data, models: https://t.co/gd58JA9imo
🚀 Announcing RelBench V2, a major update to our benchmark for foundation models on relational data!
With V2, we are significantly expanding the benchmark’s scope to catalyze further research in Relational Deep Learning (RDL) and Relational Foundation Models (RFMs).
Key features:
🍺 4 new databases, spanning domains from e-commerce and beer reviews to scientific research and clinical healthcare.
🧩 40 new predictive tasks, including 28 autocomplete tasks, across new and existing databases.
🔌 External data integrations: 70+ datasets from CTU, 7 datasets from 4DBInfer, and your own data via SQL connector, all in RelBench format.
🛠️ Bug fixes and performance improvements.
🔥 Introducing autocomplete tasks: As opposed to forecasting tasks, autocomplete tasks predict existing columns in the database. We found that models need to deeply understand the relational context to autocomplete database fields, a critical capability that expands the scope of real-world RDL applications.
Learn more:
🌐 Website: https://t.co/G4OBtj0R92
💻 GitHub: https://t.co/99FBJK5kji
Huge thanks to @justingu32, @_rishabhranjan_, @jakub_peleska, @VHudovernik, @CKanatsoulis, @fengyuli607, Tang Haiming, Alistiq and everyone else who contributed to our GitHub for making this possible!
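The difference between forecasting and autocomplete tasks described in the announcement above can be sketched on a toy table. This is a plain-Python illustration under stated assumptions, not the RelBench API: the `reviews` rows, the column names, and both helper functions are hypothetical. It assumes a forecasting label is derived from rows after a cutoff date, while an autocomplete label is an existing column hidden from the input.

```python
from datetime import date

# Hypothetical toy table; column names are assumptions for this example.
reviews = [
    {"review_id": 1, "user_id": "u1", "rating": 4, "day": date(2024, 1, 5)},
    {"review_id": 2, "user_id": "u2", "rating": 2, "day": date(2024, 1, 20)},
    {"review_id": 3, "user_id": "u1", "rating": 5, "day": date(2024, 2, 10)},
    {"review_id": 4, "user_id": "u1", "rating": 3, "day": date(2024, 3, 1)},
]

def forecasting_examples(rows, cutoff):
    """Label is derived from the future: reviews each known user writes after the cutoff."""
    users = sorted({r["user_id"] for r in rows if r["day"] <= cutoff})
    return [
        (u, sum(1 for r in rows if r["user_id"] == u and r["day"] > cutoff))
        for u in users
    ]

def autocomplete_examples(rows, target_column):
    """Label is an existing column, removed from the row the model gets to see."""
    examples = []
    for r in rows:
        features = {k: v for k, v in r.items() if k != target_column}
        examples.append((features, r[target_column]))
    return examples

forecast = forecasting_examples(reviews, cutoff=date(2024, 1, 31))
autocomplete = autocomplete_examples(reviews, target_column="rating")
```

In the forecasting case the label never appears in the database as a stored value, whereas in the autocomplete case it is a stored field that the model must recover from the remaining columns and linked rows.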