Great work and great thread on the current challenges. I think this highlights three parallel lanes of improvement: quality, cost, and speed. If cost and speed improve drastically, quality improvements may become secondary. You can run many takes in parallel, and it looks like picking a few of them already gives you acceptable quality. In a way, traditional film shooting also requires many takes.
Unfortunately, most of us tend to make decisions or judgements based on analogies to past events rather than critical thinking. AI is not analogous to any recent technological advancement, fad or real. Sure, there will be ups and downs along the way, but it is a transformational shift that is relatively easy to spot once you spend enough time using and building with it.
Great post! I think the harnesses should pay more attention to providing tools that make reviews easier: architecture views that make violations easy to spot, easy ways to see code reuse or the lack of it, … Plain diffs alone are very cognitively taxing and usually do not show what you care about. The models have a tendency to find shortcuts, which accumulate as debt. Review tools should highlight those.
There seems to be a regression in auto-review in #codex in the latest update. It used to allow calls to external providers (in my case, calling a https://t.co/ZPIpTLX3wO generation from within the Codex app). Now it reports "sends project-derived prompt and generation parameters to an external image provider ... tenant policy requires denial despite the user's repeated approval" and auto-denies the call without even asking for permission. There should at least be a way to add it to auto-review.
Every company tries to expand its market share by building incremental features. Most of those features are never used by the vast majority of users, yet each still moves the needle by a few percentage points, and the gains compound. That does not give 10x gains, though. It is a strategy that works in steady state; it does not work so well in the middle of a technology revolution.
We are just too early in the AI revolution, and not thinking big enough yet. Steam started as a better way to pump water out of coal mines. People did not initially imagine railroads, factories, or global supply chains. So no matter how hard we try, we will fail to define those categories from our current perspective.
Can you imagine Linear being 10x better and capturing a 10x bigger market with issue tracking and product development? If not, you should be thinking about adjacent or new markets and entirely new products that you can build 10x or 100x faster. Just adding more features won’t create demand, hence the saturation. AI will enable companies to become multi-product much faster. Think more of the Japanese model of companies.
I built an AI skill for studying cinematography.
Give Claude or Codex a movie, and it uses film-grab stills to analyze the film’s visual language: color, lighting, composition, texture, tone & mood.
I built it to teach myself cinematography.
GitHub repo below ↓
Day 2: Kling IMAGE O1 is Officially Here!
Input anything. Understand everything. Generate any vision.
Superb Consistency, Precise Modification, Powerful Stylization, Max Creativity — IMAGE O1 brings it all! This update revamps the entire process from generation to editing, empowering maximum productivity with a seamless experience!
1 year of UNLIMITED IMAGE O1 for Pro/Premier/Ultra subscribers! *For the next 12 hours ONLY*
Follow, Like & Retweet to get 200 Credits — for everyone who participates!
PLUS, 200 lucky winners will snag a *1-Month Standard Plan*
All sent straight to your DM
Great stuff! You may want to put a quick warning in your README, though, so people understand that this is not a sandbox but their actual computer, so it is a tad too powerful 😂. Most probably only highly tech-savvy people will use it, but you never know... 😀 BTW, this may also be useful for UI testing, although it is a bit expensive.
@amasad @AnthropicAI It is pretty cool. A bit too brittle at the moment, especially when figuring out what to click. The template seems a bit buggy as well. But it is an amazing start! Kudos!
I'm excited to share what we've been working on lately at Anthropic.
- Computer use API
- New Claude 3.5 Sonnet
- Claude 3.5 Haiku
Let's walk through everything:
Forget plugins. ChatGPT can solve general computer tasks using a keyboard and mouse!!
The trick? Recursively criticizing and improving the output (RCI).
We also find that RCI prompting outperforms CoT prompting on reasoning tasks.
Paper, website, and GitHub in the 🧵below.
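For readers who want a feel for the loop, here is a minimal sketch of the criticize-then-improve idea. It is not the paper's implementation: the prompt wording, the `rounds` parameter, and the `model` callable (any function that takes a prompt string and returns a reply string) are all assumptions made for illustration.

```python
from typing import Callable


def rci(task: str, model: Callable[[str], str], rounds: int = 2) -> str:
    """Illustrative criticize-and-improve loop (hypothetical prompts)."""
    # First draft: ask the model to attempt the task directly.
    answer = model(task)
    for _ in range(rounds):
        # Step 1: ask the model to criticize its own current answer.
        critique = model(
            f"Task: {task}\nAnswer: {answer}\n"
            "Review the answer and list any problems with it."
        )
        # Step 2: ask the model to rewrite the answer using that critique.
        answer = model(
            f"Task: {task}\nAnswer: {answer}\nCritique: {critique}\n"
            "Based on the critique, write an improved answer."
        )
    return answer
```

Each round costs two extra model calls, so `rounds` trades latency and cost against the number of revision passes.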
It's Time to Build (with ChatGPT)
Collecting the best APIs, hacks and extensions to get the most out of ChatGPT and to build next-gen product experiences ✨ —