digitalm00lah @digitalm00lah - Twitter Profile

Pinned Tweet

digitalm00lah @digitalm00lah

3 months ago

It’s 2026 and #rawr is still life!

0

4

0

69

digitalm00lah retweeted

Brian Armstrong

@brian_armstrong

2 days ago

How to keep AI spend flat while token usage grows exponentially: Not with friction and spend alerts. With better defaults, routing, and caching.

Better Defaults (not Usage Caps) – Engineers can choose any model they want, but defaults matter. We’re experimenting with defaulting to open weight models like GLM 5.2 and Kimi 2.7 through our LLM gateway, while still encouraging engineers to choose the right model for the task. 91% of our employees were never hitting their usage caps, so instead of lowering caps and driving up alerts, we're moving to cheaper defaults. Note that code reviews use a diversity of models, so they can check each other's work.

Better Routing – In our custom harnesses, we preprocess prompts and route to the best model for the job, considering cache hits and model pricing. For instance, you may want a frontier model for planning, but not for execution where they can be overkill. Ultimately, humans shouldn't be choosing models - AI can automate this task.

Better Caching – Cache misses are the easiest way to drive your cost up. All of our requests are cache aware, so we’re reusing a warm cache wherever possible. For example, our cache hit rate went from 5% → 60% in LibreChat once properly implemented.

Keep Context Lean – Start fresh sessions when switching tasks. Scope file context narrowly. Disconnect unused tools. Don't just compact. The goal isn't fewer tokens used, it's fewer tokens wasted.

Better Visibility – Our engineers can use as many tokens as they want, from whatever model they want, but we’ve made usage visible – and the more you spend on AI, the more impact we expect.

The goal isn't to suppress usage. It's to build the infrastructure that makes exponential growth sustainable.

Putting this into practice has cut our AI spend nearly in half, while our token usage continues to grow.
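
The routing and caching points in the tweet above can be illustrated with a small sketch. This is not the author's implementation: the model labels, tier scale, and per-million-token prices are all hypothetical placeholders, and the router simply picks the cheapest eligible model after blending cached and uncached input prices by an assumed cache hit rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str            # hypothetical label, not a real model ID
    tier: int            # assumed scale: 1 = cheap default, 2 = frontier
    input_price: float   # assumed USD per 1M uncached input tokens
    cached_price: float  # assumed USD per 1M cached input tokens
    output_price: float  # assumed USD per 1M output tokens


def expected_cost(model, input_tokens, output_tokens, cache_hit_rate):
    """Estimate request cost, blending input prices by the expected hit rate."""
    blended_input = (cache_hit_rate * model.cached_price
                     + (1 - cache_hit_rate) * model.input_price)
    total = input_tokens * blended_input + output_tokens * model.output_price
    return total / 1_000_000


def route(models, min_tier, input_tokens, output_tokens, hit_rates):
    """Pick the cheapest model whose tier is at least min_tier."""
    eligible = [m for m in models if m.tier >= min_tier]
    return min(
        eligible,
        key=lambda m: expected_cost(
            m, input_tokens, output_tokens, hit_rates.get(m.name, 0.0)
        ),
    )


# Placeholder catalogue with made-up prices, for illustration only.
MODELS = [
    Model("open-weight-default", 1, 0.50, 0.05, 2.00),
    Model("frontier", 2, 5.00, 0.50, 20.00),
]

# Planning might require the frontier tier; execution can take the default.
planning_model = route(MODELS, 2, 100_000, 2_000, {})
execution_model = route(MODELS, 1, 100_000, 2_000, {})
```

Under these assumed prices, calling `expected_cost(MODELS[1], 1_000_000, 0, 0.05)` and then again with a hit rate of `0.60` shows the input cost of the same request roughly halving, which is the effect the caching paragraph describes.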

425

6K

655

6K

3M

digitalm00lah @digitalm00lah

about 21 hours ago

@Shilllin Big 📠

0

20

digitalm00lah retweeted

Erik Voorhees

@ErikVoorhees

3 days ago

Venice now at 100 billion daily tokens

59

1K

114

58

87K

digitalm00lah retweeted

Ksenia Moskalenko

@kseniam0s

5 days ago

FOUNDERS: You've been paying to build on Claude. @AnthropicAI launched a program to change that.

@Claudeai for Startups - free API credits and priority rate limits for early-stage VC-backed founders:

- Free Claude API credits
- Highest rate limits, no throttling in production
- Hackathons, Founder Days, and meetups
- Early access to new model releases

Build with the full Claude stack:
Claude API, Claude Code, Claude Managed Agents, and Claude Cowork.

To qualify: your startup must be early-stage and backed by one of Anthropic's partner VCs. Ask your investors for a unique application link.

Apply → https://t.co/HckO4O93Ho

P.S. Founders using Claude to build - when you're ready to raise, @ThePageform is where your data room lives → https://t.co/RgOL0J0kt5

68

2K

150

2K

199K

digitalm00lah retweeted

WallStreetBets

@wallstreetbets

5 days ago

clarity is coming

154

2K

113

30

168K

digitalm00lah retweeted

Watcher.Guru

@WatcherGuru

5 days ago

JUST IN: Ethereum Foundation fires 20% of its staff.

808

10K

994

338

1M

digitalm00lah retweeted

Cryo Palmar

@CryoPalmar

6 days ago

The homies from @rarible just went live with their Gacha on Solana, so we cracked some packs 🔥

They also were so kind to give me 2 x $25 packs to give away! This is how you can win 👇

- retweet this tweet ♻️
- create an account through the link in the comments 🎁
- comment your wallet address💭

Drawing winners in 24 hours!

30

66

23

1

6K

digitalm00lah @digitalm00lah

6 days ago

@zerohedge

0

1K

digitalm00lah retweeted

Ansem 🐂🀄️

@blknoiz06

6 days ago

man what a time this was

205

796

30

17

131K

digitalm00lah retweeted

Elon Musk

@elonmusk

7 days ago

“Full access” 😂

7K

364K

18K

17K

58M

digitalm00lah retweeted

Erik Voorhees

@ErikVoorhees

12 days ago

End to end encrypted GLM 5.2

32

506

47

40

37K

digitalm00lah retweeted

Bark

@barkmeta

16 days ago

If you bought SpaceX at the open this morning and sold 10 minutes later, you outperformed 6 years of holding Ethereum.

281

35K

926

1K

2M

digitalm00lah retweeted

Claude

@claudeai

19 days ago

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

5K

105K

14K

22K

57M

digitalm00lah retweeted

Watcher.Guru

@WatcherGuru

24 days ago

JUST IN: Zcash crashes 48% after Claude AI finds critical vulnerability allowing unlimited minting of $ZEC.

It went unnoticed for 4 years until it was patched on June 1st. https://t.co/Ddv5JLMvUY

1K

14K

1K

2M

digitalm00lah retweeted

Backpack 🎒

@Backpack

about 1 month ago

$VVV, $BILL, and $CHIP are now available for futures trading on Backpack.

31

175

32

21

17K

digitalm00lah retweeted

Rahul

@sairahul1

about 1 month ago

Anthropic's Head of Product just dropped a 28-minute masterclass on agent production. Prompt caching. Tool search. Programmatic tool calling. Compaction. Advisor strategy. 28 minutes. Free. Worth more than 100 YouTube videos combined. Watch it first. Then read this. The masterclass teaches you how agents work. This teaches you what to build with them — a 5-agent content pipeline that does the work of a $300K creative team. Full pipeline below ↓ Bookmark this. Start this weekend.

29

2K

326

8K

532K

digitalm00lah retweeted

𝒍𝒊𝒍𝒛

@Lilz69420

about 1 month ago

You know what happens next, right? $BNKR

1

94

15

1

5K

digitalm00lah retweeted

jesse.base.eth

@jessepollak

about 1 month ago

DIEM is the coolest new onchain financial building block we've seen in a long time

67

487

56

45

67K

digitalm00lah retweeted

Eli5DeFi

@Eli5defi

about 2 months ago

YTD, $VVV is up ~9x and has become one of the most promising projects in the decentralized AI category.

❶ It addresses the core pain points of centralized AI providers:
- 100% privacy (no data retention or surveillance)
- Uncensored, bias-free open-source models
- Agent-optimized economics
- Inference at zero marginal cost, or mint $DIEM for perpetual credits
- Scalability + crypto alignment
- One API, 200+ models

❷ How it works:
- Stake VVV.
- Receive a pro-rata share of total platform capacity, perpetually.
- Stake 1% of all staked VVV → use 1% of inference for free (no per-token fees).
- Mint DIEM by locking your staked VVV (1 DIEM ≈ $1/day of credits, forever).
- Stake your DIEM → get automatic $1/day refreshing credits every epoch (00:00 UTC).
- DIEM is tradable as an ERC-20 token.

❸ The Ecosystem
Many decentralized AI projects used Venice as a core inference layer and migrated their workloads.
- Strike Robot / $SR / @StrikeRobot_ai: Inference API backend for robotics. Enables private VLM reasoning, vision, and decision-making for humanoids.
- Warden / $WARD / @wardenprotocol: Migrates onchain AI agent workloads to Venice models. Runs Warden App & Studio for private agents.
- Dolphin / $POD / @dphnAI: Team behind Dolphin Mistral 24B Venice Edition powering Venice Uncensored. Trained on Bittensor’s @TargonCompute and fully integrated as Venice’s private/uncensored model.
- Morpheus / $MOR / @MorpheusAIs: Decentralized p2p AI compute network Venice launched on. Foundation for private, permissionless inference.
- Bonfires AI / $KNOW (exp. Q2 2026) / @bonfiresai: AI agents for collective memory, knowledge graphs, and “judge” agents. Runs core reasoning/judging on Venice models (notably Dolphin-powered Venice Uncensored). Helped judge a Venice hackathon.
- Hyperbolic Labs / @hyperbolic_labs: On-demand GPU rentals + OpenAI-compatible inference + scalable clusters (H100/H200). Official AI inference infra partner.
- Fleek / $FLK / @fleek: “Shopify for AI agents”: decentralized hosting, deployment, and monetization for agents + virtual influencers. One-click launch and earn.
- BasedAI / @basedai_co: New ecosystem layer from Venice’s co-founder. Flagship @gethirebased brings persistent multi-agent automation. They acquired the full Warden App IP, orchestration stack, and builder team.

---

The Venice AI crypto ecosystem is one of the most vertically integrated plays in decentralized AI right now. The ecosystem is still early and the flywheel is simple. More agents and users → higher private inference demand → more VVV staking and DIEM usage → revenue buybacks/burns → stronger token economics.
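
The staking arithmetic in the "How it works" list above can be written out as a short sketch. It relies only on the figures stated in the tweet (a pro-rata share of capacity, and roughly $1/day of credits per staked DIEM); the function names and the sample amounts are hypothetical, and the platform's actual rules may differ.

```python
def capacity_share(my_staked_vvv: float, total_staked_vvv: float) -> float:
    """Pro-rata fraction of platform inference capacity, as the tweet describes."""
    if total_staked_vvv <= 0:
        raise ValueError("total_staked_vvv must be positive")
    return my_staked_vvv / total_staked_vvv


def daily_credits_usd(staked_diem: float, usd_per_diem_per_day: float = 1.0) -> float:
    """Daily refreshing credits, assuming the tweet's ~$1/day per staked DIEM."""
    return staked_diem * usd_per_diem_per_day


# Hypothetical amounts: staking 1,000 of 100,000 total staked VVV is a 1% share.
share = capacity_share(1_000, 100_000)

# Hypothetical amounts: 25 staked DIEM would refresh about $25 of credits per day.
credits = daily_credits_usd(25)
```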