I was pretty inspired by this and followed it down a rabbit hole and into 2 repos:
1. https://t.co/086AUM2IdV - the ultimate agent CLI. A CLI that integrates with SaaS providers, with embedding search for functions and progressive discovery
2. https://t.co/d5ouOv2SFv - a repo for re-usable harnesses.
I've purpose-built a harness that deterministically creates and adds new providers to cli-factory, with e2e tests.
The balance between latent and deterministic functionality is a really creative space to play around in.
@theo the one good thing that came out of this is I've moved from Claude Code to t3 + chatgiptty and it's like I've seen the light. Those terminal code harness days can GTFO
I'm building Tilde [ https://t.co/HYcCRIm1nS ], Slack for remote agents & human teams to get work done, aimed at non-technical office workers in SMEs.
Create a workspace, share it with your team, add agents, tools and MCP apps to automate your workflows and create AI colleagues. Everyone has the same view.
I've been leading AI infra development at my previous startup for ~2 years.
I did a thing - https://t.co/d5ouOv2SFv . Slop-coded together a harness using @OpenAI 's Agent SDK 2.0 and @karpathy 's Auto-Research. Clone your repo, give it a hypothesis / narrow goal and let Codex self-iterate until the goal is met or the max number of failed iterations is reached. Designed for quick experimentation and optimizing code with purpose.
I built / slop-coded this https://t.co/d5ouOv2SFv ... It would be great to deep-dive into a particular piece of one's code base, have a harness with a reward function, and let Codex optimise that code until it passes... e.g. get time to first load to under 5s, then Codex self-iterates until it's done.
@sama Just built a harness inspired by @karpathy 's auto-research. Give it your git repo, set your goal / hypothesis, create a reward function with quantitative & qualitative metrics to measure it by, and let the Agent SDK self-iterate and make code changes until optimised. Agent SDK is neat!
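A minimal sketch of the self-iterate loop described above, not the repo's actual code. Everything here is an assumption for illustration: `self_iterate`, `propose_change` (a stand-in for the agent making a code change), `measure` (the reward function, written as a cost where lower is better, e.g. seconds to first load), and the choice to count failed iterations in total rather than consecutively.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class RunResult:
    goal_met: bool
    iterations: int
    best_score: float


def self_iterate(
    candidate: Any,
    propose_change: Callable[[Any], Any],  # hypothetical: the agent edits the code
    measure: Callable[[Any], float],       # hypothetical: reward as a cost, lower is better
    target: float,
    max_failed: int = 3,
) -> RunResult:
    """Iterate until the score drops under `target` or too many attempts fail."""
    best, best_score = candidate, measure(candidate)
    failed = 0
    iterations = 0
    while best_score >= target and failed < max_failed:
        iterations += 1
        attempt = propose_change(best)
        score = measure(attempt)
        if score < best_score:
            # Keep the change only if it improves the measured score.
            best, best_score = attempt, score
        else:
            # Discard the change and count it as a failed iteration.
            failed += 1
    return RunResult(best_score < target, iterations, best_score)


if __name__ == "__main__":
    # Toy run: the "code" is just a load time in seconds and each
    # change shaves one second off. Goal: under 5s.
    print(self_iterate(8.0, lambda s: s - 1.0, lambda s: s, target=5.0))
```

In a real harness `measure` would run a benchmark or test suite against the cloned repo, and a discarded attempt would be a reverted commit.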
@kentcdodds @aidenybai Not sure if we need MCP, but we do need API proxies that can deal with injecting secure credentials for third-party auth. I.e. my agent shouldn't ever need to broker or inject an OAuth token or third-party API key… it should make requests to a server that injects those before passing the request on
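A rough sketch of the proxy idea above, assuming a server-side secret store keyed by upstream host. The names `inject_credentials` and `SECRETS`, the host `api.example.com`, and the token value are all made up for illustration; a real proxy would also forward the request and handle OAuth refresh.

```python
from urllib.parse import urlsplit

# Hypothetical server-side secret store. The agent never sees these values.
SECRETS = {"api.example.com": "Bearer server-side-token"}


def inject_credentials(url: str, headers: dict, secrets: dict) -> dict:
    """Return the headers to send upstream, with the credential added by the proxy."""
    host = urlsplit(url).hostname
    if host not in secrets:
        raise PermissionError(f"no credential configured for {host}")
    # Drop anything the agent tried to pass as auth, then inject the real one.
    upstream = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    upstream["Authorization"] = secrets[host]
    return upstream


if __name__ == "__main__":
    agent_headers = {"Accept": "application/json"}
    print(inject_credentials("https://api.example.com/v1/items", agent_headers, SECRETS))
```

The point of the design is that the agent only ever holds an unauthenticated request, so a leaked prompt or transcript can't leak the key.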
Has anyone tried combining temporality and "graph edge" relationships into semantic vectors and embedding models? Feels like memory is a black box, maybe we should treat it like one! @mem0ai ?
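One naive way to try this, purely as a thought experiment: append a recency feature and a one-hot edge type to the semantic vector before comparing. This is an assumption on my part, not something mem0 or any embedding model is known to do; `augment`, `EDGE_TYPES`, the half-life decay, and the weights are all hypothetical.

```python
import math

# Hypothetical edge vocabulary for the memory graph.
EDGE_TYPES = ["works_with", "reports_to", "mentions"]


def augment(embedding, age_days, edge_type,
            half_life_days=30.0, time_weight=1.0, edge_weight=1.0):
    """Concatenate [semantic vector, recency, one-hot edge type]."""
    if edge_type not in EDGE_TYPES:
        raise ValueError(f"unknown edge type: {edge_type}")
    # Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life.
    recency = 0.5 ** (age_days / half_life_days)
    one_hot = [edge_weight if t == edge_type else 0.0 for t in EDGE_TYPES]
    return list(embedding) + [time_weight * recency] + one_hot


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = augment([1.0, 0.0], age_days=0, edge_type="mentions")
    fresh = augment([1.0, 0.0], age_days=0, edge_type="mentions")
    stale = augment([1.0, 0.0], age_days=300, edge_type="mentions")
    # Same semantics and edge type, so only recency separates the two.
    print(cosine(query, fresh), cosine(query, stale))
```

The weights decide how much time and graph structure can outvote semantic similarity, which is exactly the part that would need real experimentation.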
@elonmusk Morals are subjective, shaped by culture, time period & other nuances - how would you model or review that? One would hope that training models regularly & weighting the more recent corpus more heavily over time would help, but it's an interesting problem.