@xubinrencs@kimmonismus OpenAI servers have been broken since Friday for users in Asia. See https://t.co/OTfcaVKg37 and the linked OpenCode thread, where many users report the same experience. It's an underlying fault of the HTTP API, not the app itself.
We’re training models wrong and it’s due to ChatGPT. Even the modern coding agents used daily still use message-based exchanges: they send messages to users, to themselves (CoT) and to tools, and receive messages in turn.
This bottlenecks even very intelligent agents to a single stream. The models cannot read while writing, cannot act while thinking and cannot think while processing information.
In our new paper, see below, we discuss LLMs with parallel streams. We show that multi-stream LLMs can …
🔵 Be created by instruction-tuning for the stream format
🔵 Simplify user and tool-use UX, removing many pain points with agents and chat models (such as having to interrupt the model to get a word in)
🔵 Be fast: they predict and read tokens in all streams in parallel in each forward pass, improving latency
🔵 Encode a separation of concerns more easily, improving security
🔵 Provide a legible form of parallel/cont. reasoning when given many internal streams. Even if the main CoT stream is accidentally pressured or too focused on a particular task to voice concerns, other internal streams can subvocalize concerns that would otherwise not be verbalized.
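To make the "read and write in every stream on each pass" idea concrete, here is a toy sketch. It is not the paper's method: `Stream`, `step`, and `toy_predict` are hypothetical names, and the "model" is a stand-in that only emits placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    """One named channel (e.g. user, thought, answer)."""
    name: str
    incoming: list = field(default_factory=list)  # tokens waiting to be read
    history: list = field(default_factory=list)   # tokens read or written so far


def step(streams, predict):
    """Advance every stream by one position in a single pass.

    A stream with pending input reads its next token; a stream without
    pending input writes whatever the hypothetical `predict` callback returns.
    """
    # Snapshot the state first so every stream sees the same context,
    # loosely mimicking one forward pass over all streams.
    context = {s.name: list(s.history) for s in streams}
    for s in streams:
        if s.incoming:
            s.history.append(s.incoming.pop(0))
        else:
            s.history.append(predict(s.name, context))


def toy_predict(name, context):
    # Stand-in for a model: a placeholder tagged with the total context size.
    total = sum(len(h) for h in context.values())
    return f"<{name}:{total}>"


streams = [
    Stream("user", incoming=["hi", "wait"]),
    Stream("thought"),
    Stream("answer"),
]
for _ in range(2):
    step(streams, toy_predict)

for s in streams:
    print(s.name, s.history)
```

The user stream keeps reading ("hi", then "wait") while the thought and answer streams keep writing, so nobody has to interrupt anybody. A message-based loop would instead finish one role's turn before starting the next.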
Does this sound related to a recent thinky post? :) Yes, but I don’t feel so bad about being outshipped by 23 hours by such a cool report on their side. I’ll link a 2nd thread below with a more direct comparison. I actually think both are complementary in interesting ways.
GPT 5.4 has a much better ability to delegate work to sub-agents compared to GPT 5.3.
Even within OpenCode, which has not been tuned for subagents as much as Codex has (now), GPT 5.4 is able to delegate work even at the planning stage! This is something GPT 5.3 would never do.
Overall, I'm still exploring new methods to review code! I think the promise of `jj` to have `jj edit` is much better than Git's experience of amending old commits, so I hope that GitHub supports jj better in the future.
Trying out new code review tools lately! On my list is using `jj`, from https://t.co/RsDVCTh60C. I'm particularly interested in comparing to a native Git approach of walking commit-by-commit and taking notes, then using an LLM to post the notes correctly on GitHub.
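A minimal sketch of the "walk commit-by-commit and take notes" half of that Git approach. `list_commits` and `notes_template` are hypothetical helper names and the notes layout is just an assumption; the only real tool used is `git rev-list --reverse`. Posting the notes to GitHub is left out.

```python
import subprocess


def list_commits(base, head="HEAD"):
    """Return the commit hashes in base..head, oldest first."""
    out = subprocess.run(
        ["git", "rev-list", "--reverse", f"{base}..{head}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return out.split()


def notes_template(commits):
    """Build a plain-text skeleton with one heading per commit to review."""
    lines = []
    for i, sha in enumerate(commits, start=1):
        lines.append(f"## {i}/{len(commits)} {sha[:12]}")
        lines.append("- ")  # empty bullet to fill in while reading the diff
    return "\n".join(lines)
```

Inside a checkout, `print(notes_template(list_commits("main")))` would give one section per commit on the branch, assuming the base branch is called `main`. Each section can then be filled in while reading `git show <sha>`.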
Related, but I am excited for GitHub to have an early preview of a native Stacked PR flow.
Until then, I've been following https://t.co/mTxqkMuhu9 which appears to have been a pretty good approach to help coworkers review my code.
@quant_arb the context management in Cursor is quite bad. Very wasteful of tokens. Price-wise, even if it were better, it cannot compete with a coding subscription by a big provider, even with the 2.5x credit multiplier that Cursor gives you for their $20 or other plans.
A former quant from Jump Trading sent me the source code of the bot
he said they fired him
for running it from the work server
when i opened the code i didn’t sleep til 4 am
gist of the code:
bot holds two websocket connections simultaneously:
direct feed from Binance spot orderbook and Polymarket CLOB API
BTC price on Binance updates every ~200ms
polymarket recalculates contracts with 3-7s lag
in this window bot calculates implied probability
compares spot divergence with contract price
over 8% - enters
btc does +0.4% on spot, up contract at 53¢
model probability - 87%
bot buys
in 5 mins contract resolves at $1
20-50% per trade 100+ times a day
$284,719 per month off $1,300 bankroll
this isn’t prediction
it’s latency arb between CEX and prediction market
no need for $50k colocation, a $20 vps and clean code suffice
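The entry rule in the thread can be sketched as below. Big assumption: "over 8%" is read here as model probability minus contract price, which the thread does not spell out. `edge`, `should_enter`, and `gross_return_if_yes` are hypothetical names, and fees, slippage and fill risk are ignored.

```python
def edge(model_prob, contract_price):
    """Gap between the model's probability and the contract price (both in 0..1)."""
    return model_prob - contract_price


def should_enter(model_prob, contract_price, threshold=0.08):
    """Enter only when the gap exceeds the threshold (8% in the thread)."""
    return edge(model_prob, contract_price) > threshold


def gross_return_if_yes(contract_price):
    """Gross return on one contract bought at contract_price that resolves at $1."""
    return (1.0 - contract_price) / contract_price


# Numbers from the thread: model probability 87%, contract at 53 cents.
print(f"edge: {edge(0.87, 0.53):.2f}")
print("enter:", should_enter(0.87, 0.53))
print(f"gross return if it resolves at $1: {gross_return_if_yes(0.53):.1%}")
```

With the thread's numbers the gap is 0.34, well over the 0.08 threshold, so this rule would enter. The sketch says nothing about whether the 3-7s lag or the fills are real.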
@devdotsasha Fun fact: it's faster to open the "Files changed" tab in a new tab than to open it in the current tab.
Absolutely banal that this is the case, but it's true.
https://t.co/ysePHyeStX