My vectorized sorting extravaganza is out and about.
A LOT of work went into this (referring to the javascript in the posts :)
Read, retweet, send scathing reviews, open issues, heat your house with it:
This goes to Eleven!
Pt. 1-3, code, nuget out:
https://t.co/U961XhZA3r
@magnus_trent @fardeentwt As IPO season approaches, they will have to kill any non-API plans for coding/work.
See the latest msft changes starting June 1st.
Now this is cute! Finally got hands on with the @Snapdragon X2 Elite powered Asus Ascent QN10 mini PC! Nice solid body, good selection of ports, compact form. This could definitely fit in my home lab.
Outside of antigravity, has anyone else (Claude/Codex/pi) implemented a plan "review" mode where I can highlight text in the plan file (.md file) and add specific comments pertaining to that part of the plan, then submit it all in one go, a bit like PR review, but for plans?
@insane_analyst so the positron story wrt lpddr5x, is what? That everyone is stupid and can't utilize HBM > 30% but they get lpddr6 to 90%?
Why would HBM utilisation be so low? That is just a hand-wavy claim I don't understand. Where is the real-world data that supports this?
@FUZxxl @zetalyrae Sing me the song of my people.
Jokes aside, that is a very clear place where data flow parallelism pulls ahead.
You tape over the unavoidable latency with tons of untethered (to each other, to warps) threads that are free to progress.
But yeah, different game altogether.
@FUZxxl @zetalyrae I also think it's not unique in that respect: many instructions fall into this "could've/should've" group, but CPU designers don't have clear insight into which optimizations would lead to greater use in the years after.
@FUZxxl @zetalyrae I don't think the problem is memory per se.
A modern arch with 3 load + 2 store ports, for example, is already structured for parallel accesses to L1, and even to the same line, every cycle.
The sadness surrounding gathers is the lack of determination to optimize them specifically.
@FUZxxl @zetalyrae Gather is the killer of joy, it runs so badly even when everything is cached in L1 in every impl I know.
Had they built it, we would come...
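For anyone following the gather thread, here is a rough scalar sketch of what a gather computes, nothing more: one load per lane from a data-dependent address. The helper name `gather_i32` and its signature are made up for illustration, not taken from any library or from the posts above, and the loop says nothing about how a given CPU actually executes its gather instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for the semantics of a gather: out[i] = base[idx[i]].
 * gather_i32 is a hypothetical helper for illustration only.
 * The caller must ensure every idx[i] is a valid index into base. */
void gather_i32(const int32_t *base, const int32_t *idx,
                int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Each element is an independent load; no load depends on another. */
        out[i] = base[idx[i]];
    }
}
```

Since the loads do not depend on each other, they are the kind of accesses that multiple load ports could in principle service side by side, which is the point made in the thread.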