One of the coolest ClickHouse features: the query log.
When working with Postgres, I always missed having a table that shows how queries actually performed and how many resources they consumed.
Today, we use this to tag all queries with feature + tenant metadata to attribute infra cost.
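A minimal sketch of how such tagging could work, assuming the tags are passed as JSON through ClickHouse's `log_comment` query setting and read back from `system.query_log`. The helper name, the tag keys, and the 7-day window are illustrative assumptions, not our actual setup.

```python
import json


def build_log_comment(feature: str, tenant: str) -> str:
    """Serialize attribution tags for ClickHouse's `log_comment` setting.

    The keys "feature" and "tenant" are illustrative names.
    """
    return json.dumps({"feature": feature, "tenant": tenant}, sort_keys=True)


# Aggregates finished queries by the tags stored in system.query_log.log_comment.
# The 7-day window is an arbitrary choice for the example.
ATTRIBUTION_SQL = """
SELECT
    JSONExtractString(log_comment, 'feature') AS feature,
    JSONExtractString(log_comment, 'tenant') AS tenant,
    count() AS queries,
    sum(query_duration_ms) AS total_duration_ms,
    sum(read_bytes) AS total_read_bytes,
    max(memory_usage) AS peak_memory_bytes
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_date >= today() - 7
  AND log_comment != ''
GROUP BY feature, tenant
ORDER BY total_read_bytes DESC
"""

# Settings dict to send along with each application query, so the tag
# ends up in the query log row for that query.
settings = {"log_comment": build_log_comment("trace-list", "tenant-123")}
print(settings["log_comment"])
```

Most ClickHouse clients accept per-query settings, so the dict above can be attached to each query without changing the SQL itself.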
It's important to have a mix of signals when judging the quality of traces: human annotation, LLMs, and now deterministic code. Super excited to finally allow running small cloud functions on each observation ingested into Langfuse.
day 4 of langfuse launch week: code evaluators.
write a python or typescript `evaluate` function in the langfuse UI. attach it to live observations or an experiment. scores land natively next to your existing ones.
@wochinge demos below; https://t.co/jBHM8WeA5e
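A rough sketch of what a deterministic `evaluate` function could look like. The input shape (a dict with an `output` key) and the returned fields are assumptions for illustration; the real signature is whatever the Langfuse UI expects.

```python
import json
from typing import Any


def evaluate(observation: dict[str, Any]) -> dict[str, Any]:
    """Score 1.0 if the observation's output is (or parses as) a JSON object.

    The `observation` shape and the returned keys are hypothetical.
    """
    output = observation.get("output")
    try:
        # Strings are parsed; already-structured outputs are used as-is.
        parsed = json.loads(output) if isinstance(output, str) else output
    except json.JSONDecodeError:
        parsed = None

    is_object = isinstance(parsed, dict)
    return {
        "name": "is_json_object",
        "value": 1.0 if is_object else 0.0,
        "comment": "output is a JSON object" if is_object else "output is not a JSON object",
    }


print(evaluate({"output": '{"answer": 42}'})["value"])
```

A check like this is cheap and fully reproducible, which makes it a useful complement to human annotation and LLM-based scoring.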
This also shows me how valuable the @ClickHouseDB acquisition was!
Our team met at our Tokyo offsite with the engineer who built CH FTS to discuss scalability, access patterns, and potential performance bottlenecks.
This is one of our most powerful recent launches. As Langfuse is increasingly used by agents, FTS is the best way to help them find what they need in terabytes of user data. Thanks for the ship, @sum3rman!
day 3 of langfuse launch week: full-text search.
multi-GB scans drop from many seconds to sub-second on @ClickHouseDB's new text indexes. great work from @sum3rman.
available via UI and API.
more: https://t.co/jBHM8WeA5e
I listen to many podcasts about how to use AI to be a better engineer. But which are the best ones that go into depth on how to build agentic apps?
Langfuse already had the momentum: 19 of the Fortune 50, 27k GitHub stars, 59M monthly SDK installs, and an enterprise-grade platform for LLM engineering.
Together with ClickHouse, they now have fast OLAP, global support, flexible deployment … and yes, Alexey!
https://t.co/ua22RvNDWM
day 2 of langfuse launch week 5: langfuse agent skill.
bringing an agent to production is hard.
using the skill you can ask your coding agent to instrument your app, calibrate a judge, or set up evaluators.
@marliessophie demos below; https://t.co/jBHM8WeA5e
We should be able to operate all devtools natively and headlessly from Codex/Claude.
The UI is then the interface to revisit results regularly or change some finer settings.
day 1 of langfuse launch week 5: a github action that runs your langfuse experiments on every PR.
fails the workflow when scores drop below your threshold. posts pass/fail to the PR. every run is tracked in langfuse.
https://t.co/jBHM8WeA5e
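The gating idea can be sketched in a few lines. This is not the action's implementation, only an illustration of threshold-based pass/fail; the function name, score names, and the use of a simple mean are assumptions.

```python
from statistics import mean


def gate(
    scores: dict[str, list[float]],
    thresholds: dict[str, float],
) -> tuple[bool, list[str]]:
    """Compare the mean of each score against its minimum threshold.

    Returns (passed, failures), where failures holds one message per
    score that is missing or below its threshold.
    """
    failures: list[str] = []
    for name, minimum in thresholds.items():
        values = scores.get(name, [])
        if not values:
            failures.append(f"{name}: no scores recorded")
            continue
        average = mean(values)
        if average < minimum:
            failures.append(f"{name}: {average:.2f} < {minimum:.2f}")
    return (not failures, failures)


passed, failures = gate({"accuracy": [0.9, 0.7]}, {"accuracy": 0.75})
print("pass" if passed else "fail", failures)
```

In a CI job, exiting with a non-zero status when `passed` is false is what fails the workflow; the failure messages can be posted back to the PR.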
@langfuse launch week 5 starts monday.
one release per day, mon to fri. agents, evals, and some long-requested features.
we'll be demoing all new features at @ClickHouseDB Open House in San Francisco the same week. come say hi.
https://t.co/isf5XmqcVr
We're launching Langfuse Cloud in Japan today 🇯🇵
Hosted in ap-northeast-1 (Tokyo). The full @langfuse platform, now with the @clickhousedb team on the ground in Tokyo.
If you're building with LLMs in Japan: https://t.co/GRXG5vByws
+ follow @langfusejp for Japanese updates
@DanielLockyer @langfuse Here is a talk where I describe what is going on: https://t.co/nG6uC0cMiJ
We will launch this more broadly over the coming weeks. The new APIs are the first product on the new data model.
@mgill25 Love this. We can host in our office at @langfuse and are also happy to share insights about our system architecture. Do you want to chat about this?
@skylar_b_payne @langfuse @langfuse_dev Hi, I'm the CTO at Langfuse. Correct, the ClickHouse nodes should never scale to zero. Otherwise you won’t be able to ingest data.
The team is excited to watch Steffen launch his new Rust container, which he built overnight to fix some of the big data challenges we were not able to solve in our database.
Last day of @Langfuse Launch Week.
First, Schema Enforcement: guarantee a consistent data structure for all dataset items, making your experiments reliable.
Second, Dataset Folders: as your app matures, test datasets multiply. Easily organize them in folders.
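To illustrate what enforcing a schema on dataset items buys you, here is a minimal stand-alone sketch. It is not how Langfuse implements the feature; the schema format (field name to Python type) and the field names are assumptions for the example.

```python
from typing import Any

# Hypothetical schema: every dataset item needs these fields with these types.
SCHEMA: dict[str, type] = {"question": str, "expected_answer": str}


def validate_item(item: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the item conforms."""
    errors: list[str] = []
    for field, expected in schema.items():
        if field not in item:
            errors.append(f"missing field: {field}")
        elif not isinstance(item[field], expected):
            actual = type(item[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    return errors


print(validate_item({"question": "What is 2 + 2?", "expected_answer": "4"}, SCHEMA))
print(validate_item({"question": 1}, SCHEMA))
```

Rejecting malformed items at write time means an experiment never fails halfway through because one item has a different shape than the rest.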