Entity resolution is the step everyone underestimates. The graph stays clean only if "Acme", "Acme Inc", and "ACME" collapse to one node, and that's a typing/identity problem as much as an embedding one. I lean on a typed schema plus dedup on canonical keys. How are you setting the match threshold for merges?
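A rough sketch of what I mean by dedup on canonical keys, assuming a tiny made-up suffix list and a (type, normalized name) key. `LEGAL_SUFFIXES`, `canonical_key`, and `dedupe` are illustrative names, not from any library, and this only covers the exact-key part. Fuzzy matching with a threshold would sit on top of it.

```python
import re
import unicodedata

# Hypothetical suffix list; a real one would be longer and locale-aware.
LEGAL_SUFFIXES = {"inc", "incorporated", "llc", "ltd", "corp", "corporation", "co"}


def canonical_key(entity_type: str, name: str) -> tuple[str, str]:
    """Build a (type, normalized name) key so merges never cross entity types."""
    text = unicodedata.normalize("NFKC", name).casefold()
    tokens = re.findall(r"[a-z0-9]+", text)
    # Drop trailing legal suffixes, but never strip the name down to nothing.
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return (entity_type, " ".join(tokens))


def dedupe(mentions):
    """Group raw (type, name) mentions under their canonical key."""
    nodes = {}
    for entity_type, name in mentions:
        nodes.setdefault(canonical_key(entity_type, name), []).append(name)
    return nodes


mentions = [
    ("Company", "Acme"),
    ("Company", "Acme Inc"),
    ("Company", "ACME"),
    ("Person", "Acme"),  # same string, different type: must stay separate
]
nodes = dedupe(mentions)
```

The three company spellings land on one node, while the `Person` mention stays separate because the type is part of the key.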
@SidDegen I'm all in on graphs as well. HelixDB is interesting, but it's still another piece in the stack, so I fear it will hit the same adoption problem. What works: double down on SQL DBs (most already have graph, vector, FTS, etc.) and go specialized only when scale bites.
Yes, and there are two sides to it: clean graph data (entity resolution, canonicalization, temporal validity, etc.) and schema evolution.
Sometimes the answer is tooling. Sometimes the answer is to add more graph (annotations, more relationships, etc.). Sometimes the answer is to just keep rebuilding the graph off another primary store (graph as index).
@rohit_jsfreaky @daleverett @daltonmeon @damienhe @evokoa_ai Same thing I kept wishing for, so I built it. Graph traversal as a typed query layer over Postgres you already run, no separate service or extensions. https://t.co/VDNed1vaXZ
@HamelHusain That’s what I created Corpus for: open source, markdown for humans and agents, with versioning. Skill files, prompts, context, etc., but super lightweight. https://t.co/THeiyADEdi
@NathanFlurry And it has all the things you need built in: vector (sqlite-vec), fulltext search (FTS5), graphs (recursive CTEs). No fragile data sync pipelines, no switching to vector-db-of-the-month.
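To make the graph part concrete, here's a minimal sketch of a traversal with a recursive CTE using Python's built-in `sqlite3`. The `edges` table, its columns, and the `reachable` helper are assumptions for illustration. The depth bound is what keeps the cycle from running forever.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL);
    -- a -> b -> c -> a is a cycle; x -> y is a disconnected component
    INSERT INTO edges VALUES ('a','b'), ('b','c'), ('c','a'), ('c','d'), ('x','y');
    """
)


def reachable(conn, start, max_depth=5):
    """Return (node, shortest hop count) for every node reachable from start."""
    return conn.execute(
        """
        WITH RECURSIVE walk(node, depth) AS (
            SELECT ?, 0
            UNION
            SELECT e.dst, w.depth + 1
            FROM walk AS w
            JOIN edges AS e ON e.src = w.node
            WHERE w.depth < ?          -- bound the walk so cycles terminate
        )
        SELECT node, MIN(depth) AS hops
        FROM walk
        GROUP BY node
        ORDER BY hops, node
        """,
        (start, max_depth),
    ).fetchall()


rows = reachable(conn, "a")
```

Same database file, same transaction scope as the rest of your tables, nothing to sync.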
Caveat on the hallucination gap: binary per-answer judging rewards vagueness. Your NV16-C ("which API endpoints serve FM dashboard data?") answers with a module overview and names zero endpoints. Fewer claims → fewer flags. Claim-level grounding rate would separate trustworthy from noncommittal.
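A toy sketch of the difference, with made-up answers and claims (not your benchmark data). `Answer`, `binary_pass`, and `grounding` are hypothetical names. Reporting the claim count next to the rate is my own assumption about how to make "noncommittal" visible.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    name: str
    claims: list[str]                                 # atomic claims extracted from the answer
    supported: set[str] = field(default_factory=set)  # claims a judge found grounded


def binary_pass(answer: Answer) -> bool:
    """Per-answer judging: pass unless any claim is flagged."""
    return all(claim in answer.supported for claim in answer.claims)


def grounding(answer: Answer):
    """Claim-level view: (grounded, total, rate); rate is None with no claims."""
    total = len(answer.claims)
    grounded = sum(claim in answer.supported for claim in answer.claims)
    rate = grounded / total if total else None
    return grounded, total, rate


# Hypothetical example: one answer commits to nothing, one names four endpoints.
vague = Answer("vague", claims=[])
specific = Answer(
    "specific",
    claims=["GET /a", "GET /b", "GET /c", "GET /d"],
    supported={"GET /a", "GET /b", "GET /c"},
)
```

Binary judging ranks `vague` above `specific`. The claim-level view shows `vague` made zero checkable claims while `specific` got three of four right.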
B vs C is a false dichotomy. A "context core" is a materialized view over a graph: route → select a typed subgraph → compile to text. The variant you didn't test is D: cores generated from a graph, with per-claim provenance. Should get B's token economy with C's hallucination rate.
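Roughly what variant D could look like, as a sketch under my own assumptions: a flat edge list stands in for the graph, `ROUTES` maps a question type to relation types, and every compiled line carries the id of the source it came from. All names and sample facts here are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    rel: str
    dst: str
    source_id: str  # provenance: the record this fact was extracted from


# Hypothetical graph contents.
GRAPH = [
    Edge("orders-service", "EXPOSES", "GET /orders", "doc:1"),
    Edge("orders-service", "DEPENDS_ON", "postgres", "doc:2"),
    Edge("billing-service", "EXPOSES", "POST /invoices", "doc:3"),
]

# Route: question type -> relation types worth materializing.
ROUTES = {"endpoints": {"EXPOSES"}, "dependencies": {"DEPENDS_ON"}}


def compile_core(graph, route, entity):
    """Route -> select a typed subgraph -> compile to text, one claim per line."""
    rels = ROUTES[route]
    subgraph = [e for e in graph if e.src == entity and e.rel in rels]
    return "\n".join(f"{e.src} {e.rel} {e.dst} [{e.source_id}]" for e in subgraph)


core = compile_core(GRAPH, "endpoints", "orders-service")
```

The core stays small because only the routed subgraph is compiled, and each line can be checked against its source id.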
@AIHacksByMK @abdiisan I got tired of stitching together pieces so I've been putting it all in PostgreSQL or SQLite. Built a library to make this easy (graphs, vector, hybrid) and performant. https://t.co/VDNed1vaXZ
If graph latency really matters I use FalkorDB.
@alxshp Yup! So nice to have one DB and no sync pipelines between systems to babysit.
The missing piece was a nice DX to combine relational, vector, graph, FTS, and hybrid so I built TypeGraph (https://t.co/VDNed1vaXZ). Evaluating adding SQL/PGQ support for PG 19 now.
@patrickc I built Corpus to handle the markdown side. Real time collaboration, versioning (git model w/o git). MCP/CLI/API connectors for agents. If agents are just data then using OpenProse or a similar approach might get this 100% to what you're describing.
https://t.co/THeiyADEdi
@thomasgauvin @CloudflareDev Would love your view on Corpus (DOs to the max). Also on our agent platform arch, which is pretty much CF to the max (DOs, Workflows, Sandboxes, Dynamic Workers, Queues).
https://t.co/THeiyADEdi
@paudley @namedgraph Agree. Teams also seem to build a poor version of RDF instead of adopting it when they reach that threshold.
I've been bringing some of this into TypeGraph (https://t.co/VDNed1vaXZ). Big one is ontology (toying w/ idea: a KG w/o ontology is just graph-indexed entities, not a KG).
Both, but "social/business not tech" undersells it. The unfamiliarity is downstream of real complexity you take on: blank nodes, reification, IRI hygiene, a SPARQL endpoint to stand up and operate. That cost is continuous and you pay it whether or not the system ever cashes in on it.
"Future-proof" is the part I'd push on. It's a premium paid every day against a payoff that only lands if you actually hit interchange or heterogeneous integration at scale. When a system does, RDF is the right call and I'll reach for it. Short of that boundary, a property graph models the same domain at a fraction of the operating cost. I see far more teams paying the premium than collecting on it.
@namedgraph @paudley It's tooling most teams don't have and aren't familiar with, and it's typically overkill. Most of the time property graphs are sufficient and can be run in an existing DB (w/ recursive CTEs) or in a lighter-weight graph DB.
I keep coming back to thinking about how agent harnesses might be better in Lisp. Especially given the past few days' obsession with loops, which are Lisp's specialty. The trampoline points there without fully landing because the tail calls are a red herring. Depth-1 is a runtime policy on agent nesting, not stack growth, so even perfect TCO leaves you flattening into an outer loop.
I'm thinking the missing piece is delimited continuations. An agent firing a tool call and waiting is a continuation captured at the tool boundary: run until you need the world, suspend, resume. The principled version of the event loop you're hand-rolling. S-expressions are also a nicer compile target for model-generated plans than nested JSON: balanced parens, trivially repairable.
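Not Lisp, but a sketch of the shape in Python, where a generator plays the role of a one-shot continuation (an approximation, not full delimited continuations). `ToolCall`, `TOOLS`, `agent`, and `run` are made-up names, and the tools are stubs.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    args: dict


def agent(question):
    # Each `yield` is a tool boundary: the agent suspends here, and whatever
    # the driver sends back becomes the tool result when it resumes.
    hits = yield ToolCall("search", {"q": question})
    summary = yield ToolCall("summarize", {"docs": hits})
    return f"answer: {summary}"


# Stub tools standing in for "the world".
TOOLS = {
    "search": lambda q: [f"doc about {q}"],
    "summarize": lambda docs: f"{len(docs)} doc(s) read",
}


def run(gen):
    """Drive the agent: run until it needs the world, call the tool, resume."""
    try:
        call = next(gen)
        while True:
            result = TOOLS[call.name](**call.args)
            call = gen.send(result)
    except StopIteration as done:
        return done.value


final = run(agent("graphs"))
```

The agent body reads as straight-line code. The loop lives in the driver, which is the only place that touches tools.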
Continuations don't have to be in-memory, either. Back the suspend/resume with a durable log and it survives a crash. The price is a determinism constraint on the replayed span.
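A minimal sketch of that, assuming a JSON-lines file as the durable log and a generator as the suspended agent. `run`, `Crash`, and the `add` tool are hypothetical. On restart the driver replays logged results instead of re-running tools, and it refuses to continue if the replayed call doesn't match the log, which is the determinism constraint in practice.

```python
import json
import os
import tempfile


class Crash(Exception):
    """Simulated process death, for the demo only."""


def agent():
    a = yield ("add", [1, 2])
    b = yield ("add", [a, 10])
    return b


def run(make_agent, log_path, tools, crash_after=None):
    log = []
    if os.path.exists(log_path):
        with open(log_path) as f:
            log = [json.loads(line) for line in f]
    gen = make_agent()
    step = executed = 0
    try:
        call = next(gen)
        while True:
            name, args = call
            if step < len(log):
                # Replay: the agent must ask for the same call it asked for before.
                if log[step]["call"] != [name, args]:
                    raise RuntimeError("non-deterministic replay")
                result = log[step]["result"]
            else:
                if crash_after is not None and executed >= crash_after:
                    raise Crash()
                result = tools[name](*args)
                executed += 1
                with open(log_path, "a") as f:
                    f.write(json.dumps({"call": [name, args], "result": result}) + "\n")
                    f.flush()
                    os.fsync(f.fileno())  # make the entry durable before resuming
            step += 1
            call = gen.send(result)
    except StopIteration as done:
        return done.value


calls = []  # records real tool executions


def add(x, y):
    calls.append((x, y))
    return x + y


path = os.path.join(tempfile.mkdtemp(), "agent.log")
try:
    run(agent, path, {"add": add}, crash_after=1)  # dies before the second tool call
except Crash:
    pass
result = run(agent, path, {"add": add})  # fresh generator, resumes from the log
```

The second run rebuilds the agent from scratch, but the first tool call is answered from the log, so each tool still executes exactly once.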