Immanuel Trummer

Verified account

@ImmanuelTrummer

Database Prof at Cornell. I make data analysis more efficient and more user-friendly.

Ithaca, NY (USA)

Joined October 2017

57 Following

2.1K Followers

466 Posts

Pinned Tweet

Immanuel Trummer

@ImmanuelTrummer

9 months ago

🚗ThalamusDB counting pictures of red cars in the database:
- Semantic operators are described in natural language and evaluated via GPT-5
- Simply store paths to images or audio files in your database – ThalamusDB recognizes the file format and selects the right LLM
💾 Code: https://t.co/6l1gS8zZFA
📄 Website: https://t.co/33PAzsmI7T
#SemanticQueries #ApproximateProcessing #LLM #GPT5 #ThalamusDB

0

9

0

3

1K

Immanuel Trummer

@ImmanuelTrummer

8 days ago

✈️ Looking forward to an exciting #SIGMOD2026! 👇 Schedule below. 🇮🇳 See you soon in #Bengaluru! #Databases #LLMs #LanguageModels #DatabaseBenchmarking #QueryOptimization #QuantumComputing @lojil192574 @sigmod @SIGMODConf

Tweet photo: https://t.co/W5WRp1krHO

0

9

0

0

392

Immanuel Trummer

@ImmanuelTrummer

3 months ago

💡 Two arXiv papers published in recent days (one from us, one from TUD) reach the same conclusion: LLMs can now generate C++ code for SQL processing that outperforms classical database systems.
⚙️ Our code generator is based on Claude Code and exploits multiple agents working in parallel. Each agent performs tasks typically associated with different components in a #DBMS, such as workload analysis, query optimization, or physical design tuning.
📊 We compare to various classical #DBMS such as DuckDB, ClickHouse, Umbra, MonetDB, and PostgreSQL, finding that the agent-generated code is often significantly faster. Code generation costs are moderate (<$20), making the approach practical for frequently executed queries.
🤖 Analyzing generated code, we find that agents exploit various optimization techniques, including query-specific data structures, as well as low-level optimizations that are specific to the hardware cache hierarchy of our server.
📃 Paper: https://t.co/PBLC9XX5Tv
💾 Code: https://t.co/MvdH8lHT6F
🌐 Site: https://t.co/2WM2gedwbi
@lojil192574 #LLM #Databases #AI #DB


3

31

7

21

3K

Immanuel Trummer

@ImmanuelTrummer

7 months ago

@adwiteekk @freeCodeCamp Glad to hear it 😀

1

1

0

0

25


Immanuel Trummer

@ImmanuelTrummer

7 months ago

Happy to hear you liked it, @DanKornas!

0

4

0

4

1K

Immanuel Trummer

@ImmanuelTrummer

11 months ago

A demo of #ThalamusDB (#SIGMOD2023), introducing semantic filter operators. Users write SQL queries with natural language predicates on table columns containing 🖼️ images, 📃 text, or 🔊 sound files. These predicates are evaluated via #LLMs.
In the video (below), I'm querying for furniture ads with pictures showing "wooden tables". After entering my query, #ThalamusDB
1️⃣ performs data profiling and cost-based optimization,
2️⃣ shows the Pareto frontier of cost-quality tradeoffs,
3️⃣ updates bounds on query aggregates while processing.
#ThalamusDB is designed from the ground up for approximate processing, prioritizing data that maximally reduces approximation error per cost unit.
🪧 #SIGMOD2023 demo: https://t.co/hELlBtKRZb
📃 #SIGMOD2024 paper: https://t.co/jzfd4IUbGt
💾 Code repository: https://t.co/zhRn5xX4Km
@SaehanJo @sigmod #GPT4 #LanguageModel #MultimodalData @Cornell @CornellCIS

0

6

0

1

915

Immanuel Trummer

@ImmanuelTrummer

11 months ago

📢 All our posters & talks at #SIGMOD2025!
1️⃣ λ-Tune: using #LLMs to write configuration scripts for databases. 🪧 Poster: https://t.co/plBOp9tSsK 💬 Slides: https://t.co/LiXHMlzNBS @giannakourisv
2️⃣ SpareLLM: selecting #LLMs with optimal cost-quality tradeoffs 🪧 Poster: https://t.co/7ssHX4fhNU @SaehanJo
3️⃣ SQLBarber: generating custom benchmarks via #LLMs 🪧 Poster: https://t.co/B3R3wOVX18 @lojil192574
4️⃣ CEDAR: cost-efficient data-driven claim verification via #LLMs 🪧 Poster: https://t.co/NqfPw8Y6CH @Tharushi96
5️⃣ SwellDB: generating data on-the-fly during query processing by #LLMs 🪧 Poster: https://t.co/bYbByTEzZQ @giannakourisv
6️⃣ Query optimization for hybrid classical-quantum workflows 💬 Slides: https://t.co/h2ydJZb8zm
7️⃣ Quantum annealing for optimal data partitioning 💬 Slides: https://t.co/OPMI0e0sog
8️⃣ Panel "AI for Future Databases" with @tim_kraska, @adityagp, @feifei_initiald, @ailamaki, and #SurajitChaudhuri 💬 Slides: https://t.co/WvMwknlJLS
@SIGMODConf @sigmod @Cornell @CornellCIS

0

11

1

1

844

Immanuel Trummer

@ImmanuelTrummer

11 months ago

Really proud of my students — @SaehanJo, @giannakourisv, @lojil192574, and @Tharushi96 (left to right) — who each presented their latest work at #SIGMOD2025. Many thanks to the organizers for an amazing conference! @SIGMODConf @sigmod

Tweet photo: https://t.co/wIlgCCfeeP

1

13

1

1

740

Immanuel Trummer

@ImmanuelTrummer

12 months ago

🥳Looking forward to an amazing #SIGMOD2025 conference! Our schedule:
📃 Sunday, 15:00-17:30: Data partitioning with quantum and digital annealers
📃 Sunday, 15:00-17:30: Optimizing hybrid quantum-classical processing pipelines
📃 Tuesday, 10:30-11:30: SpareLLM - selecting LLMs with optimal cost-quality tradeoffs
🖥️ Tuesday, 11:30-13:00: Demonstrating SQLBarber - generating custom benchmarks via LLMs
🖥️ Tuesday, 11:30-13:00: Demonstrating SwellDB - generating data on-the-fly during query processing
📢 Tuesday, 16:30-18:00: Panel on AI for future databases with @TimKraska, @drfeifei, @adityagp, @ailamaki, and Surajit Chaudhuri
📃 Thursday, 10:30-11:30 & 16:30-18:00: λ-Tune - using LLMs to write configuration scripts for databases
🖥️ Thursday, 16:30-18:00: Demonstrating CEDAR - cost-efficient data-driven claim verification
@giannakourisv @Tharushi96 @SaehanJo @lojil192574 @SIGMODConf @sigmod #LLM #SQL #Database

0

13

1

0

708

Immanuel Trummer

@ImmanuelTrummer

12 months ago

💵Don't overpay when using #LLMs! Introducing our upcoming #SIGMOD2025 paper on #SpareLLM by @SaehanJo ... @sigmod @SIGMODConf #LanguageModel #GPT4 #ChatGPT #CostOptimization #Data @Cornell

0

11

0

1

577

Immanuel Trummer

@ImmanuelTrummer

12 months ago

Outperforming various baselines, including #QuantumAnnealers and classical optimization, for large problem instances with 1000 queries. #SQL #DB #Quantum #QueryOptimization

Tweet photo: https://t.co/OFBUW4AHVi

0

0

0

0

342

Immanuel Trummer

@ImmanuelTrummer

12 months ago

🥳Paper accepted at #SIGMOD2026! Our paper leverages #DigitalAnnealers (hardware accelerators for optimization) for #QueryOptimization. We scale up to large problem instances using 1⃣domain-specific problem decomposition and 2⃣pre/post-processing on classical machines. @sigmod

Tweet photo: https://t.co/ZMlzsmvQEP

1

14

1

4

841

Immanuel Trummer

@ImmanuelTrummer

about 1 year ago

Had a great time at the @dagstuhl seminar on Table Representation Learning! Lots of interesting discussions and future work directions. Many thanks to the organizers (@FrankRHutter @cbinnig @MadelonHulsebos @eisenjulian)! #TabularFoundationModel #AI #ML #DB

Tweet photo: https://t.co/1RvkoxGbVk

0

13

1

0

653

ImmanuelTrummer retweeted

Ibrahim Sabek @ibrahim_sabek

about 1 year ago

The submission deadline for the Q-Data Workshop has been extended by one week. The new submission deadline is April 27, 2025. @SIGMODConf #SIGMOD2025

0

4

1

1

621

Immanuel Trummer

@ImmanuelTrummer

about 1 year ago

@shctechnologies @ManningBooks @OpenAI @langchain @llama_index Thank you!

0

0

0

0

23

Immanuel Trummer

@ImmanuelTrummer

about 1 year ago

🥳I finished my book! 📘"Data Analysis with LLMs" shows how to analyze (📄text/🖼️image/🔊audio/📽️video/...) data with #LLMs and #Python! 🔗https://t.co/6tRALMGsGo @ManningBooks #LLM #GPT4 @OpenAI #SQL #GraphData #AgenticAI @langchain @llama_index #Multimodal #DataScience

Tweet photo: https://t.co/nXQKQQEyXN

2

23

1

3

793

Immanuel Trummer

@ImmanuelTrummer

about 1 year ago

The book is a hands-on introduction to #LLMs and #Multimodal #DataAnalysis, based on a few mini-projects. It covers the @OpenAI #Python library, #Prompting, #FewShotLearning, #FineTuning, #LLM #Agents, and recent #LLM frameworks like #LangChain and #LlamaIndex.

0

2

0

0

310

Immanuel Trummer

@ImmanuelTrummer

about 1 year ago

Well deserved 😀

Tweet photo: https://t.co/1FdtYqG3m9

0

7

0

0

371

Immanuel Trummer

@ImmanuelTrummer

about 1 year ago

🥳Many congrats to Dr. Saehan Jo! 🎓Saehan successfully defended his PhD thesis "Efficient Data Systems for Scalable Analysis with LLMs", introducing systems like #ThalamusDB and #SpareLLM that scale up processing with #LLMs to very large data sets! @SIGMODConf #Data #SQL #ML


1

13

0

1

783
