@HamelHusain Cloudflare artifacts + https://t.co/Ik13OLRoT0 with agent-native skill authoring
Centralized - shared - instantly updated across the team (as it’s behind execute)
Works like a charm
‘Service-as-a-software’ is here...
We moved our entire company brain to GitHub and wired 25+ tools through MCPs.
Any one of our 20+ team members can now spin up a contextualized AI assistant in seconds.
The system has 5 layers:
1. Markdown company OS
↳ SOPs and campaign playbooks converted into .md files using research agents
↳ Most SOPs turned into agents that handle 70% of the task
↳ Output: 50+ actionable Claude skills
2. Context environment
↳ One Company OS GitHub repo propagated to every session via org-wide plugin
↳ Each client gets their own repo with Slack DMs, call transcripts, GDrive changes, and campaign data auto-synced through n8n
↳ Zero configuration needed per session
3. MCPs
↳ 25+ tools connected including InstantlyAI, HeyReach, Apollo, HubSpot, Slack, Notion, n8n, Supabase, Pinecone, Browserbase, Apify
↳ Not just research. Action through AI.
↳ We went from researching work to actually doing it
4. Self-improvement engines
↳ Pinecone database stores 1000s of LinkedIn posts and outbound campaigns with performance metrics
↳ Copywriting skills query this data to find winning formats to reuse
↳ Human corrections get fed back in so the system gets sharper over time
5. Operating principles
↳ Every repo has a safeguard file that prevents certain operations
↳ 100% AI outputs are not acceptable: everyone owns their work and every mistake
↳ Agent swarms split one task into 5-20 sub-agents when needed
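A minimal sketch of how a per-repo safeguard file (layer 5) could gate agent operations. The file format (one blocked glob pattern per line), the example patterns, and the names `load_blocked_patterns` and `is_allowed` are assumptions for illustration, not the actual file we ship.

```python
from fnmatch import fnmatch

# Hypothetical safeguard file contents: one blocked operation pattern per line.
SAFEGUARD_TEXT = """
# operations agents must never run
git push --force*
rm -rf *
hubspot.contacts.delete*
"""


def load_blocked_patterns(text: str) -> list[str]:
    """Parse non-empty, non-comment lines into glob patterns."""
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]


def is_allowed(operation: str, patterns: list[str]) -> bool:
    """Return False when the operation matches any blocked pattern."""
    return not any(fnmatch(operation, pattern) for pattern in patterns)


patterns = load_blocked_patterns(SAFEGUARD_TEXT)
print(is_allowed("git push origin main", patterns))
print(is_allowed("git push --force origin main", patterns))
```

The check runs before the agent acts, so a blocked operation never reaches the tool at all.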
Our goal is to become the most advanced AI-native services company for our niche (GTM).
@hungv47 @RhysSullivan yeah plugins are the way to go. so far, from self-hosting executor - we have learned that a plugin with a clinical master skill that points to a dynamic skills source inside executor works best. then you can update on the fly across all clients instantly.
Executor v1.5 is live!
Highlights:
- Sources -> Integrations
- Full GSuite support
- Run tools from the web
- Integrations now support multiple accounts
- Revamped product UI
- WIP Docker and Cloudflare deployable versions
Lot of love went into this one, hope you like it!
Policies are coming to https://t.co/Ik13OLRoT0
Being able to take an MCP like the @render MCP which only comes with a full-scope API key - and inject policy on code mode mcp patterns is amazing.
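A rough sketch of what injecting policy in front of a full-scope key can look like: every tool call passes through a rule check before it reaches the underlying MCP. This is not executor's API. The `Rule` shape, the `decide` and `guarded` helpers, and the `render.services.*` tool names are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Any, Callable


@dataclass(frozen=True)
class Rule:
    pattern: str  # glob over tool names
    effect: str   # "allow" or "deny"


def decide(tool: str, rules: list[Rule], default: str = "deny") -> str:
    """First matching rule wins; unmatched tools fall back to the default."""
    for rule in rules:
        if fnmatch(tool, rule.pattern):
            return rule.effect
    return default


def guarded(tool: str, call: Callable[..., Any], rules: list[Rule]) -> Callable[..., Any]:
    """Wrap a tool so the policy is checked before the underlying call runs."""
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        if decide(tool, rules) != "allow":
            raise PermissionError(f"policy blocks {tool}")
        return call(*args, **kwargs)
    return wrapper


# Hypothetical tool names; the real MCP's tool list may differ.
RULES = [
    Rule("render.services.delete*", "deny"),
    Rule("render.services.*", "allow"),
]

list_services = guarded("render.services.list", lambda: ["web", "worker"], RULES)
delete_service = guarded("render.services.delete", lambda name: None, RULES)
```

Deny-by-default means a tool added upstream later stays blocked until someone writes a rule for it.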
@RhysSullivan shipping all the good stuff
production grade agent coding is not becoming the future anymore.
it is the future.
what felt strange, new and somewhat frowned upon 8 months ago has shifted so fast …internally i see senior engineers shipping alongside ai in ways that begin to feel poetic 🥺
Context is like lifeblood to agentic systems. The quality of context is directly related to the quality of output.
Because context varies from business to business and changes over time, it is a really complex and individual problem to solve for.
At @railblocks we think about this problem from the perspective of ops engineering - building internal systems.
Here's a (simplified) example…
🗣️ “What you say” (the prompt)
“Create me a new email onboarding sequence for our trial users who haven’t upgraded after 7 days. Use this sequence logic, this copy, and embed it in these internal interfaces.”
👀 “What the model needs to see” (the context)
For that request to actually work, the model needs a ton of context that doesn't live in a single source of truth. It is scattered across source code, internal decisions, and external integrations. Some examples:
📦 Stored data
> where users and trials are stored
🔬 Schema & types
> how user models and subscription states are structured
🧠 Business rules
> trial period calculations and timezone handling logic
🔄 Workflows
> where email sequences are scheduled and triggered
🖥️ Interfaces
> campaign dashboard and email template components
🔗 Endpoints
> email service provider and user management APIs
🍀 Examples
> existing onboarding patterns and conversion sequences
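To make the categories above concrete, here is a minimal sketch of assembling context from several providers into one prompt. The provider functions, the snippets they return, and the table and field names are placeholders I made up for illustration; a real system would pull each one from code, a database, or an API.

```python
from typing import Callable

# Each provider returns a text snippet for one category of context.
# The snippets below are hypothetical placeholders, not real system facts.
ContextProvider = Callable[[str], str]

PROVIDERS: dict[str, ContextProvider] = {
    "Stored data": lambda request: "users and trials live in the `users` and `trials` tables",
    "Schema & types": lambda request: "Subscription.state is one of: trial, active, cancelled",
    "Business rules": lambda request: "trial end dates are computed in the user's timezone",
}


def build_prompt(request: str, providers: dict[str, ContextProvider]) -> str:
    """Render every context section first, then the request itself."""
    sections = [f"## {name}\n{provider(request)}" for name, provider in providers.items()]
    return "\n\n".join(["# Context", *sections, "# Request", request])


prompt = build_prompt("Create a new email onboarding sequence", PROVIDERS)
print(prompt)
```

The hard part is not the join. It is keeping each provider accurate as the business changes.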
Agents are getting smarter. They ingest more source code, call various tools including MCPs, and consult OpenAPI documentation. But the reality is that pulling all this together still feels fragmented.
I dream of a future where building internal systems becomes as simple as expressing intent. For that to happen, we need better solutions for surfacing the right context at the moment of creation...
you could check out the advanced filters on https://t.co/JBJBTtrHBG https://t.co/ehCdOnrdui
and possibly use that as a baseline inspiration, also from a more practical perspective for an implementation ..
also looking directly in Linear / Airtable how the UX is done can help ..
Luma is simple and beautiful but it’s a standalone product and its data is completely siloed from the rest of your stack.
Fixing this isn’t easy: their API is rudimentary and they don’t have webhooks 😫
That’s where @triggerdotdev and @attio come in. With a mix of cron jobs and the new App SDK from Attio, we are able to sync data across for less than $1/mo and poll more frequently than @zapier
Our clients can now manage their entire events lifecycle right from their CRM ✨
→ No more wrangling CSVs: all data syncs magically in the background
→ No more context switching: a simple, fully integrated experience where data already lives
→ 10x better UX & cheaper infra costs vs. similar systems
Demo in 🧵⤵
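Without webhooks, the sync comes down to polling on a schedule and only pushing what changed. A minimal sketch of that diffing step, in plain Python rather than the actual Trigger.dev or Attio SDKs; the record shape and the `fingerprint` and `diff_since_last_poll` helpers are assumptions for illustration.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Stable hash of a record so unchanged rows can be skipped."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


def diff_since_last_poll(records: list[dict], seen: dict[str, str]) -> list[dict]:
    """Return new or changed records and update the `seen` fingerprints in place."""
    changed = []
    for record in records:
        digest = fingerprint(record)
        if seen.get(record["id"]) != digest:
            seen[record["id"]] = digest
            changed.append(record)
    return changed


# Three simulated polls of the same (hypothetical) guest record.
seen: dict[str, str] = {}
first = diff_since_last_poll([{"id": "g1", "status": "invited"}], seen)
second = diff_since_last_poll([{"id": "g1", "status": "invited"}], seen)
third = diff_since_last_poll([{"id": "g1", "status": "checked_in"}], seen)
```

Only `first` and `third` would trigger a write to the CRM, which is what keeps frequent polling cheap.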
Why have Python and TypeScript become the go-to languages for LLMs?
Sure, it makes sense at face value. Both are hugely popular and more data to train on means more nuanced LLMs.
But this feels like an incomplete view of the world...?
The cost of generating code is trending towards zero. But what is the value of that code if we cannot assert that it is in fact reliable?
I think some of the hardest "AI code generation" problems to solve in the coming years relate to this question. A fundamental factor constraining code reliability is the set of properties of the underlying language in which the code is written. This is not the only factor, but it is an important one to consider. And it raises the question of what the ideal language / framework for LLMs would be if datasets were equal.
From that perspective, JavaScript and Python feel porous to me, and functional programming languages like Elixir feel solid and more promising.
Wouldn't an LLM want immutable data, no shared state, pattern-matching, recursion, and higher-order functions as part of its everyday reality?
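To make a few of those properties concrete, here is a small sketch of immutable data, no shared state, recursion, and higher-order functions. It is written in Python only to keep the example compact; the `Account` type and helper names are made up for illustration. Python offers these as opt-in conventions, whereas a language like Elixir makes them the default.

```python
from dataclasses import dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class Account:
    """Immutable record: fields cannot be reassigned after creation."""
    owner: str
    balance: int


def deposit(account: Account, amount: int) -> Account:
    """Return a new Account instead of mutating the one passed in."""
    return replace(account, balance=account.balance + amount)


def apply_all(account: Account, amounts: tuple[int, ...]) -> Account:
    """Higher-order style: fold `deposit` over the amounts with reduce."""
    return reduce(deposit, amounts, account)


def total(amounts: tuple[int, ...]) -> int:
    """Recursion instead of a loop with a mutable accumulator."""
    if not amounts:
        return 0
    head, *tail = amounts
    return head + total(tuple(tail))


start = Account(owner="ada", balance=100)
end = apply_all(start, (10, 20))
```

`start` is untouched after `apply_all`, so no other part of the program can be surprised by the change.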
All this popped into my mind today because I saw @chris_mccord just released https://t.co/k8fQJNJ5gv - which to me feels like a valiant effort for Elixir / Phoenix in the "race of relevance" in the age of AI code gen.
I'd love to see more discussions and long-term thinking around what languages would actually be most helpful for LLMs .. by design.
This is my favourite new #MCP stack..
MCP React Client = @Cloudflare use-mcp + @aisdk
MCP Server = serverless on @modal_labs
Took a few hours to spin up an internal tool that translates natural language to SQL to insights
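A stripped-down sketch of the natural language to SQL to insights flow, using only Python's built-in `sqlite3`. It does not use use-mcp, the AI SDK, or Modal. The `translate` function is a stub standing in for the model call, and the `accounts` table and its rows are hypothetical.

```python
import sqlite3


def translate(question: str) -> str:
    """Stand-in for the model call; a real tool would prompt an LLM with the schema."""
    return "SELECT plan, COUNT(*) AS users FROM accounts GROUP BY plan ORDER BY users DESC"


def run_readonly(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Refuse anything that is not a SELECT before touching the database."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


def summarize(rows: list[tuple]) -> str:
    """Turn the top row into a one-line insight."""
    top_plan, top_count = rows[0]
    return f"Most common plan: {top_plan} ({top_count} users)"


# Hypothetical in-memory data so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, plan TEXT)")
conn.executemany("INSERT INTO accounts (plan) VALUES (?)", [("free",), ("pro",), ("free",)])

rows = run_readonly(conn, translate("Which plan has the most users?"))
insight = summarize(rows)
print(insight)
```

The read-only guard matters more than the translation step, since generated SQL should never be trusted with writes.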