R.A.G.E AI Entreprise @rageai3nt - Twitter Profile

21 days ago

UnslothAI's tweet photo. Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs.

Google's new model, Gemma 4 12B Unified, supports image, audio and 256K context.

You can run and train the model via Unsloth Studio.

GGUF: https://t.co/8cL321pVDh
Guide: https://t.co/odRo9WjRpA https://t.co/Ax09ZTXFF3

96

3K

380

2K

352K

R.A.G.E AI Entreprise

@rageai3nt

about 1 month ago

@xai @grok @NousResearch Very cool

0

12

rageai3nt retweeted

AboveSpec

@above_spec

about 2 months ago

RTX 5060 Ti 16GB. $429 GPU.

Last night I got 128 t/s on Qwen3.6-35B using ik_llama.cpp's R4 quant format. Crushing performance. Faster than the 5070 Ti on mainline llama.cpp. Performance stays consistent from 0 to 139k context and no speculative decoding used!🤯

Special thanks to @MakJoris for sharing ik_llama.cpp with us!

Today I wanted to know if it's actually *useful* at that speed. So I gave it a coding agent and 4 creative challenges.

Here's what it built. 🧵

37

530

49

620

31K

R.A.G.E AI Entreprise

@rageai3nt

about 2 months ago

@izzycodev @micheltamanda @TheAhmadOsman @grok can you research this build, create a guide, and forecast how much it would cost in time and money?

1

0

1

117

R.A.G.E AI Entreprise

@rageai3nt

about 2 months ago

I'd really like to submit an application to @xai for a cowork desktop application they could utilise or work with to bring @grok a desktop harness with your grok plan. How could one achieve this?

1

0

23

rageai3nt retweeted

Ronin

@DeRonin_

2 months ago

10 GitHub repos to spend 60-90% less tokens in Claude Code:

1. RTK (Rust Token Killer)

CLI proxy that filters terminal output before it hits your context

- 60-90% reduction on common dev commands
- one binary, zero dependencies
- works with Claude Code, Cursor, Copilot

Repo: https://t.co/WayvpBtyBH

2. Context Mode

Sandboxes raw tool output into SQLite instead of dumping it into context

- 98% context reduction on Playwright, GitHub, logs
- only clean summaries enter your conversation
- works as Claude Code plugin

Repo: https://t.co/YNbFIGQz7X

3. code-review-graph

Local knowledge graph that maps your codebase with Tree-sitter

- Claude reads only what matters, not the entire repo
- 49x token reduction on large monorepos
- 6.8x on average reviews

Repo: https://t.co/9gIzmAWN12

4. Token Savior

MCP server that navigates code by symbols, not full files

- 97% reduction on code navigation
- persistent memory across sessions
- 69 tools, zero external deps

Repo: https://t.co/OtvhrMgGWh

5. Caveman Claude

makes Claude talk like a caveman to cut output tokens

- 65-75% output reduction
- one-line install
- keeps full technical accuracy

Repo: https://t.co/onBeghTyfH

6. claude-token-efficient

one CLAUDE.md file that keeps responses terse

- drop-in, no code changes
- reduces output verbosity on heavy workflows
- best for output-heavy sessions

Repo: https://t.co/j6MKo9klQe

7. token-optimizer-mcp

MCP server with caching, compression, and smart tool intelligence

- 95%+ token reduction through intelligent caching
- compresses repeated tool outputs

Repo: https://t.co/0jIVQ4ANls

8. claude-token-optimizer

reusable setup prompts for optimizing any project

- 90% token savings in 5 minutes
- reduces doc token usage from 11K to 1.3K

Repo: https://t.co/puil9WwFGB

9. token-optimizer

finds ghost tokens that silently eat your context

- survives compaction without losing quality
- fixes context quality decay

Repo: https://t.co/92G8e4yeGq

10. claude-context (by Zilliz)

code search MCP that makes your entire codebase the context

- ~40% reduction with equivalent retrieval quality
- hybrid BM25 + dense vector search

Repo: https://t.co/yjfiQOSy15

[ how to stack them ]:

you don't need all 10. pick 2-3 based on your workflow:

> heavy terminal output? RTK
> big codebase? code-review-graph + Token Savior
> lots of MCP servers? Context Mode
> quick fix? Caveman + claude-token-efficient

most people are burning tokens without knowing it

run /context in a fresh session and see how much is gone before you even type a word

your pocket will thank me later :<)
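
To make the idea in item 1 concrete (filtering terminal output before it reaches the model's context), here is a minimal Python sketch. It is not RTK's implementation and is not taken from any of the listed repos; the function names `filter_output`, `rough_tokens`, and `reduction_percent` are hypothetical, and counting whitespace-separated words is only a crude stand-in for a real tokenizer.

```python
def filter_output(text: str, head: int = 5, tail: int = 5) -> str:
    """Collapse consecutive duplicate lines, then keep only the first and last few."""
    runs = []  # list of [line, repeat_count]
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue  # drop blank lines
        if runs and runs[-1][0] == line:
            runs[-1][1] += 1  # same as the previous line: bump its count
        else:
            runs.append([line, 1])

    rendered = [f"{line} (x{n})" if n > 1 else line for line, n in runs]

    # Keep the head and tail, replace the middle with a one-line marker.
    if len(rendered) > head + tail:
        omitted = len(rendered) - head - tail
        last = rendered[-tail:] if tail else []
        rendered = rendered[:head] + [f"... {omitted} lines omitted ..."] + last
    return "\n".join(rendered)


def rough_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (assumption, not a tokenizer)."""
    return len(text.split())


def reduction_percent(before: int, after: int) -> float:
    """Percentage saved when going from `before` tokens to `after` tokens."""
    return 100.0 * (before - after) / before if before else 0.0


if __name__ == "__main__":
    raw = "\n".join(["ok"] * 50 + ["FAIL test_x"])
    slim = filter_output(raw)
    print(slim)
    print(round(reduction_percent(rough_tokens(raw), rough_tokens(slim)), 1))
```

The same `reduction_percent` helper puts the figures in the list on one scale: going from 11K to 1.3K tokens (item 8) works out to roughly an 88% reduction.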

105

3K

338

7K

465K

R.A.G.E AI Entreprise

@rageai3nt

2 months ago

@testerlabor @grok can you forecast Grok 5's abilities and benchmarks?

1

0

35

R.A.G.E AI Entreprise

@rageai3nt

2 months ago

0

1

0

17

R.A.G.E AI Entreprise

@rageai3nt

2 months ago

@chrisgpt @grok thoughts on this? I use Claude Max and haven't noticed it performing extremely poorly at the moment

1

0

152

R.A.G.E AI Entreprise

@rageai3nt

2 months ago

@SeniorLazarus @TeksEdge @grok tell us the answer

1

0

62

R.A.G.E AI Entreprise

@rageai3nt

2 months ago

@grok @elonmusk How can I get involved in building for xai @grok

1

0

13

R.A.G.E AI Entreprise

@rageai3nt

2 months ago

@grok @elonmusk @xai Connecting directly to a grok pro or super plan would be better @grok

1

0

9

rageai3nt retweeted

Flowith

@flowith

8 months ago

it's awake. the way you interact with the web, information, and services is being rewritten. introducing FlowithOS — the world's first operating system natively built for ai agents. self-evolving. memory-powered. lightning-fast. beyond any ai browser, it's the SMARTEST agentic os that turns your browser into real-world value, from assisting you to acting for you. let's witness together ⬇️

827

1K

336

895

1M

R.A.G.E AI Entreprise

@rageai3nt

4 months ago

@omni__ventures Inflation-adjusted total in 2026 dollars: ~$228 trillion. But very interesting at a flat rate. Just goes to show how bad the system is. Won't last

1

0

20

R.A.G.E AI Entreprise

@rageai3nt

5 months ago

@TruthFairy131 They've been at this for a long time

0

3

rageai3nt retweeted

Ankur Patel

@AnkurPatel59

5 months ago

Your P&L is a reflection of your psychology, not just your strategy. If you keep breaking your rules, you don't need a new indicator, you need to fix your mental game.

Jared Tendler’s principles completely flip the script on trading psychology:

1. Awareness does not equal control.
2. Revenge trading is a neurological hijack.
3. You don't need optimism; you need presence.

Mastering these 10 concepts is the difference between a hobbyist and a professional. Which number do you struggle with the most?