TetraMaster

@TetraMasterIO

Crypto degenerate. Wizard.

Fairhaven, MA

Joined January 2014

160 Following

71 Followers

126 Posts

TetraMaster

@TetraMasterIO

7 months ago

@chainlink #smartcon

TetraMasterIO retweeted

armygirl

@armygir36701799

7 months ago

Sound, thought, visuals. All at once. Perfect.

132

867

856

101K

TetraMasterIO retweeted

Elon Musk

@elonmusk

7 months ago

Yes

74K

14M

TetraMasterIO retweeted

hedoήist

@hedo_ist

8 months ago

I see you in 4k

111

162

99K

TetraMasterIO retweeted

Will Cannon

@iamwillcannon

8 months ago

Easiest ways to get rich: •Sell men lust •Sell women beauty •Sell parents peace •Sell kids dreams •Sell the rich safety •Sell the broke hope Same game. Different packaging.

299

31K

13K

904K

TetraMasterIO retweeted

Dogan Ural

@doganuraldesign

8 months ago

Can you understand something you can't imagine?

358

11K

799

629K

TetraMasterIO retweeted

Làzaro

@lazaro45ive

8 months ago

I spent 30 days creating these 20 seconds

283

394

984

223K

TetraMasterIO retweeted

cinesthetic.

@TheCinesthetic

8 months ago

Imagine being in a theater in 1999 and seeing this for the first time.

80K

TetraMasterIO retweeted

Elon Musk

@elonmusk

8 months ago

Grok

67K

14M

TetraMasterIO retweeted

GREG ISENBERG

@gregisenberg

8 months ago

okay, let me get this straight…
1/ you can use sora 2 to create hollywood level videos in seconds to build audiences of millions
2/ you can deploy donotpay style ai agents that call comcast, cancel subscriptions, and renegotiate bills automatically
3/ you can clone your voice with elevenlabs in under 10 seconds and layer it on heygen avatars to create talking head videos without ever filming
4/ you can run open-source models like llama-3 or kimi k2 on your laptop using ollama or lm studio, no gpu cluster required
5/ you can spin up code agents on cursor or claude code that debug, write tests, and push to github while you’re offline
6/ you can record a loom and drop it into chatgpt to automatically research context, extract insights, prioritize the top ideas, and turn it into a polished blog post
7/ you can drag-and-drop zapier workflows and instantly plug in lindy agents to automate outreach, reporting, research, even legal docs
8/ you can drop a 300-page pdf into perplexity or chatgpt with code interpreter and get a structured memo in under a minute
9/ you can find validated startup ideas someone should build right now with @ideabrowser and then chat with agents to help you build it (i built this for you)
10/ you can spin up interactive mobile funnels in 5 mins with perspective
11/ you can generate personalized landing pages per visitor with mutiny ai or replo, tied to real-time crm data
12/ you can run nano banana (google flash 2.5 in google ai studio) or sora 2 to create 100 ad variations and a/b test them in meta ads manager before you spend a dollar
13/ you can query your data warehouse with chatgpt connected to bigquery or snowflake like it’s imessage
14/ you can use fyxer ai to respond to your emails for you
15/ you can train a custom gpt on your intercom or zendesk docs and have it resolve 70 percent of tickets without human touch
16/ you can scrape competitors with apify or bright data and generate GTM strategies in claude in hours instead of months
17/ you can deepfake yourself into 20 languages using heygen’s dubbing and synthesize global content libraries overnight
18/ you can run AI audits on quickbooks from your ai native brex with tools like flowcog and instantly surface tax write-offs your accountant missed
19/ you can auto-generate a week of social posts with notion ai or typefully that match brand tone, visuals, and cadence
20/ you can upload a csv to shopify magic and spin up a full e-commerce storefront with product descriptions, images, and ads pre-baked
21/ you can deploy voice agents through vapi that call leads, qualify them, update your crm, and hand you a booked call calendar
22/ you can build any personal software you’d like. don’t like your crm? no worries, build your own
23/ you can spin up entire media pipelines in runway, sora 2 and pika labs, from raw text to cinematic short-form content
24/ and way more. this just scratches the surface to get your creative juices flowing this weekend.
This is the greatest unlock for solo builders since the App Store. Enjoy it.

198

538

11K

446K

TetraMasterIO retweeted

Yann Toskov

@yannToskov

8 months ago

New Midjourney Ranked Profile unlocked with Kling 2.5 Turbo #Cyberpunk

302

13K

TetraMasterIO retweeted

S.A.V

@FilmsBySav

8 months ago

OG PRIME — Official Short Film The response I got from our live premiere in LA was unreal, and now I’m pumped to finally share OG PRIME with the rest of the world. Presented by @Kling_ai A @PhantomX_ai production Starring: The Legendary @MetaGear2600 Directed by: @FilmsBySav Sound by: JMR & @FilmsBySav This film means a lot to me. Hope it hits you too. 🤖♥️

265

105

48K

TetraMasterIO retweeted

Yegor

@yegormethod

8 months ago

met this guy in dubai who helps people "restructure wealth internationally"
drives a $3M pagani
has 7 passports
definitely not laundering money (😉)
his system for making money untouchable: (posting this from a VPN)
the "Dubai Stack":
- dubai company (0% tax)
- estonian holding (blockchain friendly)
- singapore trading (banking hub)
- swiss trust (privacy laws)
- cayman foundation (asset protection)
money flows like this:
revenue → dubai
dubai → estonia
estonia → singapore
singapore → switzerland
switzerland → cayman
cayman → your pocket
completely legal
completely untraceable
completely untaxable
he showed me a client:
$50M annual revenue
effective tax rate: 0.3%
saved $19M last year
government can't touch it
but here's the wild part:
he charges 10% of tax saved
makes $30M/year from 15 clients
works 4 months annually
lives on a yacht
dates instagram models
has tea with sheikhs
probably on 6 watchlists
his quote: "paying taxes is a choice only poor people don't know it's optional"
got raided once
they found nothing
everything's in jurisdictions that don't cooperate
levels to this shii

225

10K

464

11K

867K

TetraMasterIO retweeted

Gammichan

@gammichan

10 months ago

This is such a great old Saylor video and to me, it encompasses the essence of investing. It really is just about putting together a vision for where the puck is going that the mass market hasn't, and then having the conviction and patience to see it through. The beautiful thing about this is all the billion, trillion dollar funds with all their $1M/yr salary quants don't have any edge in doing that over you and I doing that. Everything Saylor mentions here was public knowledge, he just put it together in a way many didn't. No fancy math gets you this, just some research on the topic at hand and some high level reasoning capability to put it together correctly and project it forward. My best tradfi trade was buying AMD at $2 before it ran to $200. The thesis was that their new chips were competitive again for the first time in 10 years. That's it. Full stop. One sentence thesis for a 100x trade. Meanwhile every analyst on wall street was laughing at them, they had always lost to Intel and always would in their mind. You can have insane edge by simply understanding a product, making some correct assumptions about the future of that product, and having the discipline to hold until that future is realized. There is no formula for learning this, it simply takes experience and education. Read as much as you can about as many things as you can and push your understanding of the world as far as it can go. You'll find that you start making connections, start formulating visions for where things are going rather than where they are because you now have a solid base of knowledge and pattern recognition to project from. Maximize your capacity for reasoning.

112

563

125K

TetraMasterIO retweeted

TheValueist

@TheValueist

9 months ago

$NVDA NVIDIA’s Rubin CPX is a purpose-built GPU class for the prefill/context phase of inference on 1M+ token workloads, introduced alongside a disaggregated serving model that splits context and generation across different processors. The technical blog and press release specify that each Rubin CPX delivers 30 PFLOPS of NVFP4 compute, 128 GB of GDDR7, integrated video encode/decode, and 3x attention acceleration versus GB300 NVL72 baselines. NVIDIA positions CPX as a compute-dense, cost-optimized, monolithic-die part that complements generation-phase Rubin GPUs and Vera CPUs inside a rack-scale “Vera Rubin NVL144 CPX” system that integrates 144 CPX GPUs, 144 Rubin GPUs, and 36 Vera CPUs. At the rack level, NVIDIA cites 8 exaFLOPS of NVFP4, 100 TB of high-speed memory, and 1.7 PB/s aggregate memory bandwidth, claiming 7.5x the AI performance of GB300 NVL72. Availability is targeted for end-2026. These are aggressive, rack-level economics intended to monetize long-context inference (coding and video) with a stated ROI claim of 30x–50x, framed as $5B of “token revenue” per $100M CAPEX at scale; while promotional, the direction of travel is clear: NVIDIA wants to convert the rising cost/latency of extreme-context tokens into a systems-level upsell beyond today’s GB200/GB300. (NVIDIA Developer, NVIDIA Newsroom) The architectural rationale is sound. Context/prefill is token-parallel and compute-bound; generation is autoregressive and memory-bandwidth-bound. Co-locating both phases on the same GPU wastes resources under long inputs: compute saturates during prefill, then memory fabrics dominate during decode. NVIDIA’s disaggregated serving, orchestrated by the open-source Dynamo framework, routes requests, manages KV caches across nodes, and exploits hierarchical caching and data-movement paths to minimize recomputation and TTFT. 
NVIDIA has already demonstrated material per-GPU throughput gains from disaggregated serving in MLPerf v5.1 on GB200/GB300 for interactive scenarios, indicating that the scheduler, router, and comms layers can translate into real-world throughput per dollar improvements. Rubin CPX is, in effect, a specialized prefill engine dropped into this disaggregated topology to accelerate the dominant cost center for long contexts. The strategic bet is that system-level throughput gains from specialization plus NVFP4 quantization of weights and FP8 KV caches continue to compress cost/token faster than model growth inflates token counts. (NVIDIA Developer, NVIDIA Docs) Demand drivers center on 2 high-value domains: code and video. For code, agentic systems need to reason over entire repositories, dependency graphs, and historical context spanning many files and sessions—turning coding copilots into software collaborators tied to enterprise development environments. For video, NVIDIA highlights that 1 hour of content can map to roughly 1M tokens and requires tight coupling of long-context attention with video codecs; CPX integrates video encode/decode with long-context acceleration to collapse that pipeline into a single chip for the context stage. NVIDIA named early lighthouse customers and partners in coding and video to underscore near-term applicability. While not every workload needs 1M tokens, the share of prefill-heavy jobs is rising due to multi-step reasoning, RAG over larger corpora, and tool-using agents with persistent memory; that supports a dedicated context accelerator if and only if the serving stack can keep KV caches hot and route queries to minimize recomputation. (NVIDIA Newsroom, Network World) The most important economic change is the use of a monolithic die with GDDR7 rather than HBM for CPX. HBM scarcity and packaging constraints have been key supply-side throttles. 
A GDDR7-based, monolithic context accelerator can be cheaper to build at scale, easier to source, and less dependent on advanced 2.5D packaging, while still feeding NVFP4 math at high rates for attention/prefill. Because prefill is compute-dense and less sensitive to HBM-like bandwidth than generation-heavy decode, the GDDR7 trade-off is attractive in a disaggregated pipeline. If the 3x attention acceleration versus GB300 NVL72 holds in production and Dynamo consistently routes to hot caches, CPX should reduce cost per input token and TTFT for long contexts while increasing overall cluster utilization. The heavy bundling—Rubin CPX + Rubin GPUs + Vera CPUs + Quantum‑X800 InfiniBand or Spectrum‑XGS Ethernet + ConnectX‑9 SuperNICs + AI Enterprise/NIM microservices—simultaneously expands NVIDIA’s per-rack ASP and deepens software lock-in. The flip side is that the ROI math depends on sustained high utilization of prefill engines and on the frequency with which workloads truly need 0.5M–1M+ tokens versus algorithmic techniques that prune, compress, or stream context more efficiently. (NVIDIA Newsroom, NVIDIA Developer) From a competitive perspective, AMD’s 2024–2026 roadmap has focused on large HBM footprints per accelerator to keep more of the model and KV cache resident, e.g., MI325X with 256 GB HBM3e and 6 TB/s bandwidth, and next-gen MI350X with expanded low-precision data types and up to 288 GB HBM3e and 8 TB/s bandwidth. That approach is well-suited to decode-heavy and medium-context jobs and competes on cost/TCO where software stacks are mature (vLLM/ROCm) and for customers prioritizing memory capacity per device. NVIDIA’s counter with CPX is not “more HBM,” but a system-level division of labor that potentially scales better at 1M-token prefill. Google’s TPU v5e and similar price-optimized inference platforms emphasize cost/token via software and fabric optimization rather than a context-specialized chip. 
The main question through 2026 is whether hyperscalers lean into disaggregation plus specialization (NVIDIA’s thesis) or continue to extract cost efficiencies via large HBM devices and networking/fleet-level caches without adding a new processor class (AMD/TPU trajectory). Timelines matter: Rubin CPX availability is end-2026, leaving a multi-quarter window in which GB300 NVL72, AMD MI325X/MI350X, and TPUs set customer habits and software defaults. (AMD, Google Cloud, NVIDIA Newsroom) Investment implications for NVIDIA are significant. First, CPX increases the TAM and the “rack bill of materials” NVIDIA can capture per deployment by adding a new silicon type that is tightly integrated into an already sticky software stack (Dynamo, TensorRT‑LLM, NIM, AI Enterprise). Second, substituting GDDR7 for HBM on a large fraction of prefill compute shifts supply constraints away from HBM packaging and should improve shipment elasticity, smoothing revenue recognition and reducing lead-time risk for inference-heavy builds. Third, CPX’s alignment with long-context workloads in coding and video is strategically astute: these are enterprise monetization vectors with measurable productivity and content-generation outputs, supporting premium token pricing and higher willingness to pay for TTFT reductions and latency guarantees. Fourth, by tying CPX to Quantum‑X800/Spectrum‑XGS and ConnectX‑9, NVIDIA preserves and expands its networking attach. The primary near-term caveat is timing: with availability at end-2026, CPX is unlikely to impact FY26 materially; it is more of a FY27 revenue driver unless NVIDIA books early MGX platform commits. Given NVIDIA’s claim that a fully built NVL144 CPX rack can enable $5B revenue per $100M CAPEX, even partial realization of this thesis can create a second-wave inference CAPEX cycle focused on long-context and agentic systems. 
We would treat NVIDIA’s ROI multiple as directional rather than literal, but the mechanism—higher utilization and lower cost/token for large contexts—has strong first-principles support. (NVIDIA Newsroom) For AMD, CPX raises the performance bar specifically on long-context prefill. MI325X/MI350X will remain competitive on decode-heavy or medium-context inference and on generalized training/inference TCO, especially where ROCm stacks are mature and customers value memory-per-accelerator over a new processor category. However, if disaggregation plus CPX becomes the default design pattern for agents, coding copilots, and generative video, AMD may need to lean harder into software-level KV cache routing, memory pooling, and fabric optimizations to avoid ceding long-context economics to NVIDIA’s full-stack solution. AMD’s near-term advantage is availability: MI325X is shipping, MI350X is slated for 2025, so AMD can accrue share and relationships before CPX lands. The competitive watch item is whether customers report stable, high utilization under disaggregated serving and quantized KV caches on NVIDIA stacks; if they do, system-level TCO could tilt toward NVIDIA again in 2026–2027. (AMD) For the broader supply chain, CPX implies stronger demand for NVIDIA networking (InfiniBand and Spectrum‑XGS Ethernet) and for GDDR7 suppliers, somewhat de-risking the HBM bottleneck while still keeping HBM demand elevated for generation-phase Rubin GPUs. ODMs/OEMs with MGX capability and high-density rack integration stand to benefit. Competitors in KV cache externalization and memory pooling may see mixed effects: if Dynamo’s cache routing keeps hit rates high within the NVIDIA fabric, demand for external memory pools could be curbed; alternatively, for extreme contexts beyond 1M tokens, external DDR5 pools over 400G/800G Ethernet could remain complementary. 
The state of the art is moving quickly, and NVIDIA’s ability to bundle software and silicon gives it an edge in converting these plumbing choices into revenue. (Amazon Web Services, Inc.) Key risks and what to watch. 1) Software maturity and reproducibility: the MLPerf wins and blog demos need to translate to multi-tenant, production workloads with real SLAs, tail-latency controls, and dynamic load shaping. Evidence to monitor includes third-party MLPerf submissions and customer case studies showing per-rack TTFT, ITL, and cost/token under 1M-token inputs. 2) Utilization risk: the ROI math is ultrasensitive to context-heavy utilization. If algorithmic techniques—KV cache reuse, chunked prefill, retrieval compression, streaming attention, speculative decoding, and caching across sessions—meaningfully reduce context compute, the specialized CPX layer could be underutilized. 3) Availability and supply risk: end-2026 delivery leaves room for slippage; competitor platforms shipping in 2025–2026 can entrench tooling and developer mindshare. 4) Power and cooling: while CPX’s monolithic GDDR7 design likely has a simpler thermal profile than multi-stack HBM devices, overall rack power with 144 CPX + 144 Rubin + 36 Vera will be material; we need concrete power envelopes and site-level PUE assumptions before underwriting ROI at scale. 5) Pricing and bundling: how NVIDIA prices CPX relative to GB300-class systems will determine whether customers see CPX as accretive or duplicative CAPEX. Actionable positioning. Overweight NVIDIA on the thesis that CPX extends the company’s systems moat into long-context inference with a new silicon beachhead that is supply-friendlier than HBM-intensive designs, while reinforcing networking and software attach. Treat CPX as a 2026–2027 monetization driver layered atop ongoing Blackwell Ultra/NVL72 ramps in 2025–2026, not as a near-term catalyst. Accrue exposure to NVIDIA networking and select GDDR7 suppliers as second-order beneficiaries. 
Maintain constructive but selective exposure to AMD on availability and memory-capacity leadership through 2025–2026, but watch for customer traction specifically on long-context jobs; if NVIDIA’s disaggregation pattern becomes standard, consider a relative value tilt back toward NVIDIA-linked ecosystem names. For hyperscaler CAPEX, the read-through is a second wave of inference spend specifically for agents/coding/video; we expect buyers to pilot disaggregated pipelines in 2025 and begin CPX-inclusive rollouts in late 2026 where utilization models are convincing. (NVIDIA Developer, NVIDIA Newsroom) Bottom line, Rubin CPX is not just another accelerator; it is a system-level argument that the economics of 1M-token context are best solved by specialization and orchestration. If the claimed 3x attention acceleration, 7.5x rack-level uplift, and the software stack’s cache-aware routing hold in customer deployments, the prefill bottleneck becomes a monetizable product tier that expands NVIDIA’s per-rack revenue and strengthens full-stack differentiation. If, conversely, algorithmic and software advances shrink effective context or cache misses remain high in practice, CPX uptake could be slower and more concentrated in video and top-end coding agents. The burden of proof now shifts to early lighthouse users and third-party benchmarks to validate utilization, latency, and cost/token claims at scale. (NVIDIA Developer, NVIDIA Newsroom)
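
The rack-level numbers quoted in the post above can be sanity-checked with simple arithmetic. The sketch below is only a back-of-envelope check, not anything from NVIDIA or the post's author: every input is a figure the post itself cites, the function names are made up for illustration, and decimal units (1 exaFLOPS = 1,000 PFLOPS, 1 TB = 1,000 GB) are an assumption.

```python
# Back-of-envelope check of the rack-level figures quoted in the post.
# Inputs are the post's own numbers; nothing here is measured or official.

CPX_PER_RACK = 144        # Rubin CPX GPUs in a "Vera Rubin NVL144 CPX" rack
PFLOPS_PER_CPX = 30       # NVFP4 PFLOPS per Rubin CPX
GB_PER_CPX = 128          # GDDR7 capacity per Rubin CPX, in GB
RACK_EXAFLOPS = 8         # NVFP4 exaFLOPS cited for the whole rack

TOKEN_REVENUE_USD = 5e9   # "$5B of token revenue"
CAPEX_USD = 100e6         # "per $100M CAPEX"


def cpx_exaflops() -> float:
    """NVFP4 compute contributed by the CPX GPUs alone (decimal units assumed)."""
    return CPX_PER_RACK * PFLOPS_PER_CPX / 1000


def cpx_share_of_rack() -> float:
    """Fraction of the cited rack-level compute attributable to CPX GPUs."""
    return cpx_exaflops() / RACK_EXAFLOPS


def cpx_gddr7_tb() -> float:
    """Total GDDR7 across the CPX GPUs, in TB (decimal units assumed)."""
    return CPX_PER_RACK * GB_PER_CPX / 1000


def revenue_multiple() -> float:
    """Revenue-to-CAPEX multiple implied by the $5B per $100M framing."""
    return TOKEN_REVENUE_USD / CAPEX_USD


if __name__ == "__main__":
    print(f"CPX compute:  {cpx_exaflops():.2f} EFLOPS")
    print(f"CPX share:    {cpx_share_of_rack():.0%} of the cited rack total")
    print(f"CPX GDDR7:    {cpx_gddr7_tb():.1f} TB")
    print(f"Revenue/CAPEX: {revenue_multiple():.0f}x")
```

On these inputs the CPX GPUs account for 4.32 of the 8 cited exaFLOPS, a little over half, and the $5B per $100M framing corresponds to the 50x upper end of the stated 30x–50x range. The multiple is gross token revenue over CAPEX, so it says nothing about operating cost or utilization, which is the risk the post flags.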

TetraMasterIO retweeted

CAIN

@66cain92

over 1 year ago

@AMAZlNGNATURE Mountain Goats knows something humans don't know

436

28K

TetraMasterIO retweeted

フォリス

@follis_

over 1 year ago

We are so back

154

19K

TetraMasterIO retweeted

Justin Mihaly

@JustinMihaly

over 1 year ago

You've been lied to about Nicotine. All you've been told is how addictive & terrible it is for your health... But when used correctly? It's the ultimate health hack. Here are the secrets 99% of people don't know about nicotine (& how beneficial it can be):