🤖 AI is Solving HTB, CTFs & Pentest Labs
BoxPwnr tests LLM agents on real-world security labs
• HTB, PortSwigger, picoCTF, TryHackMe
• Kali Docker + VPN automation
• Full traces (reasoning + commands)
• Multiple agent architectures
Reality:
Easy → high success
Hard → still struggles
Security impact:
AI is shifting from assistant → autonomous attacker
⚠️ Research use only
🔗 https://t.co/RgQAnXRhj2
#CyberSecurity #AI #Pentesting #CTF #RedTeam
🚨 JAILBREAK ALERT 🚨
ANTHROPIC: SELF-PWNED 🤗
OPUS-4.7: SELF-LIBERATED 🫶
WOAH i don't think the world is ready for this... 🤯
YOU CAN USE THE OPUS TO JAILBREAK THE OPUS 🙌
this agent wrote an original universal jailbreak from scratch and then used computer use to validate on the actual https://t.co/03OPFHkzyb website!
5/6 categories successfully pwned, including a ransom note threatening to DDoS a hospital—complete with a BTC address and a demand for $4.4 million
in less than 20 minutes 😲
turns out Opus-4.7 in the Pliny Agent harness I been vibin' together this past month is quite a capable lil jailbreaker! they can leak system prompts too, but that's a story for another day 😘
oh nooo AI is coming for my job (yay!) 🙃
gg
10 GitHub Repos to Cut Claude Code Token Usage by 60–90%
Most people aren’t hitting limits...
They’re wasting tokens without realizing it.
Here are 10 repos that fix that 👇
1. RTK (Rust Token Killer)
A CLI proxy that trims terminal output before it reaches Claude.
Cuts 60–90% of tokens on common commands
Single binary, no dependencies
Works with Claude Code, Cursor, Copilot
Repo: https://t.co/9ggmOL4BVI
2. Context Mode
Routes raw tool output into SQLite instead of your context.
Up to 98% reduction on logs, Playwright, GitHub
Only summaries reach Claude
Built as a Claude Code plugin
Repo: https://t.co/9DhxEZc5PU
3. code-review-graph
Maps your codebase into a local knowledge graph.
Claude reads only relevant parts
49x reduction on large repos
~6–7x faster reviews
Repo: https://t.co/GeXHT2rmY1
4. Token Savior
MCP server that navigates via symbols instead of full files.
~97% reduction in navigation
Persistent memory across sessions
60+ tools, zero dependencies
Repo: https://t.co/Fk54g6586I
5. Caveman Claude
Forces ultra-minimal responses.
65–75% fewer output tokens
One-line setup
Keeps technical meaning intact
Repo: https://t.co/ef4RMXkzvb
6. claude-token-efficient
A simple CLAUDE.md config to reduce verbosity.
Plug-and-play
Cleaner, shorter outputs
Ideal for heavy workflows
Repo: https://t.co/WdLgq8ei9C
7. token-optimizer-mcp
MCP server with caching + compression.
95%+ savings via smart caching
Compresses repeated outputs
Repo: https://t.co/fO2twow9v7
8. claude-token-optimizer
Reusable prompt setups for any project.
👉 ~90% savings in minutes
👉 Shrinks large docs drastically
Repo: https://t.co/IL9M8hbs7O
9. token-optimizer
Finds hidden “ghost tokens” draining your context.
👉Improves context quality
👉Prevents silent token waste
Repo: https://t.co/uCdeBN9wKJ
10. claude-context (Zilliz)
Search-based context system for your entire codebase.
👉~40% token reduction
👉Hybrid search (BM25 + vectors)
Repo: https://t.co/4qOy5Ly3VW
Follow @Ai_here202 for more related content...
RAG vs. CAG, clearly explained!
RAG is great, but it has a major problem:
Every query hits the vector DB. Even for static information that hasn't changed in months.
This is expensive, slow, and unnecessary.
Cache-Augmented Generation (CAG) addresses this issue by enabling the model to "remember" static information directly in its key-value (KV) memory.
In fact, you can combine RAG and CAG for the best of both worlds.
Here's how it works:
RAG + CAG splits your knowledge into two layers:
↳ Static data (policies, documentation) gets cached once in the model's KV memory
↳ Dynamic data (recent updates, live documents) gets fetched via retrieval
This gives faster inference, lower costs, and less redundancy.
The trick is being selective about what you cache.
Only cache static, high-value knowledge that rarely changes. If you cache everything, you'll hit context limits. Separating "cold" (cacheable) and "hot" (retrievable) data keeps this system reliable.
You can start today. OpenAI and Anthropic already support prompt caching in their APIs.
I have shared my recent article on prompt caching below if you want to dive deeper.
👉 Over to you: Have you tried CAG in production yet?
𝐖𝐡𝐚𝐭 𝐢𝐬 𝐌𝐂𝐏 (𝐌𝐨𝐝𝐞𝐥 𝐂𝐨𝐧𝐭𝐞𝐱𝐭 𝐏𝐫𝐨𝐭𝐨𝐜𝐨𝐥)?
Most AI agents are trapped inside their own walls.
MCP is the protocol that connects them to the outside world data sources, tools, and workflows.
𝐖𝐡𝐚𝐭 𝐢𝐬 𝐌𝐂𝐏?
• MCP is an open-source standard that connects AI applications to external systems like data sources, tools, and workflows.
• It enables seamless integrations, allowing AI models like ChatGPT to access data, use tools, and perform tasks like web app creation or database queries.
• MCP simplifies development, reducing complexity and time by providing a standardized way to connect AI systems to various resources.
• It enhances AI capabilities, making models more powerful and personalized by allowing them to interact with external systems and data on behalf of users.
𝐁𝐞𝐟𝐨𝐫𝐞 𝐌𝐂𝐏
LLM → Slack, Google Drive, GitHub (separate connections for each).
Every integration is custom. Every tool requires its own API client. Every agent reinvents the wheel.
𝐀𝐟𝐭𝐞𝐫 𝐌𝐂𝐏
LLM → Unified API (MCP) → Slack, Google Drive, GitHub.
One protocol. One connection layer. Every tool accessible through a standardized interface.
𝐇𝐨𝐰 𝐌𝐂𝐏 𝐖𝐨𝐫𝐤𝐬?
User → User Query → MCP Client → Invoke Graph → LangGraph → Route Request → OpenAI GPT → Tool Decision → Call MCP Tool → MCP Server → External API Call → External APIs → API Response → MCP Server → Tool Result → OpenAI GPT → Generate Response → MCP Client → Natural Language Response → Final Result User → Agent Response → User.
𝐓𝐡𝐞 𝐅𝐥𝐨𝐰
1. User sends a query to the MCP Client.
2. MCP Client invokes LangGraph to route the request.
3. OpenAI GPT makes a tool decision and calls the MCP Tool.
4. MCP Server makes an external API call to the appropriate service (Slack, Google Drive, GitHub, etc.).
5. External API returns a response to the MCP Server.
6. MCP Server sends the tool result back to OpenAI GPT.
7. OpenAI GPT generates a natural language response.
8. MCP Client delivers the final result to the user.
Before MCP, every agent built its own integrations. After MCP, every agent shares the same connection layer.
MCP is the protocol that turns isolated AI models into connected AI agents.
𝐀𝐫𝐞 𝐲𝐨𝐮 𝐛𝐮𝐢𝐥𝐝𝐢𝐧𝐠 𝐀𝐈 𝐚𝐠𝐞𝐧𝐭𝐬 𝐰𝐢𝐭𝐡 𝐜𝐮𝐬𝐭𝐨𝐦 𝐢𝐧𝐭𝐞𝐠𝐫𝐚𝐭𝐢𝐨𝐧𝐬 𝐨𝐫 𝐰𝐢𝐭𝐡 𝐌𝐂𝐏?
♻️ Repost this to help your network get started
Cc : respective author.
If you want to become good at AI engineering, learn these 12 concepts:
1 How RAG Works
↳ https://t.co/cGmunPTUlb
2 LLM Concepts - A Deep Dive
↳ https://t.co/5lCKxq2g4N
3 How to Design an AI Agent
↳ https://t.co/JvnPd9773A
4 What is Reinforcement Learning
↳ https://t.co/AVpl9j1oit
5 AI concepts 101
↳ https://t.co/y1pZpCehda
6 Context Engineering vs Prompt Engineering
↳ https://t.co/9h8q9F2i57
7 Context Engineering 101
↳ https://t.co/OMkiZhkODL
8 AI Coding Workflow 101
↳ https://t.co/paIf9ksIU9
9 How ChatGPT Apps Work
↳ https://t.co/BJTYYnAwO1
10 How AI Agents Work
↳ https://t.co/tk3zkCjRvg
11 How MCP Works
↳ https://t.co/wgf8gHnnkn
12 Generative AI 101
↳ https://t.co/Njr7qcq0Q6
What else should make this list?
===
👋 PS - Want my System Design Playbook for FREE?
Join my newsletter with 200K+ software engineers now:
→ https://t.co/ByOFTtOihX
===
💾 Save & RT to help others learn AI engineering.
👤 Follow @systemdesignone + turn on notifications.
🚨 STOP BURNING YOUR TOKENS!
If you use Claude Code, you are probably wasting 80% of your context window.
I found 10 ace tools that will completely rescue your API bill.
1. Caveman Claude
- Literally makes Claude talk like a caveman
- Slashes 75% of output tokens with zero loss in accuracy
Repo → https://t.co/eEvSOvHutG
2. RTK (Rust Token Killer)
- A blazing fast proxy that filters terminal output
- 60-90% reduction and completely dependency-free
Repo → https://t.co/lDfjbsbPD5
3. Code Review Graph
- Claude reads only what matters using a Tree-sitter graph
- An unbelievable 49x token reduction on huge monorepos
Repo → https://t.co/xGn6Pp88yX
4. Context Mode
- Sandboxes raw output into SQLite instead of your context
- A staggering 98% context reduction on logs & GitHub
Repo → https://t.co/Jut2bvBMUD
5. Claude Token Optimizer
- Brilliant setup prompts that optimize any project
- 90% token savings, taking docs from 11K to 1.3K
Repo → https://t.co/0uOFODbG7e
6. Token Optimizer
- Hunts down the invisible ghost tokens eating your context
- Fully restores and protects your context quality
Repo → https://t.co/LUOzjECXKm
7. Token Optimizer MCP
- Adds aggressive caching and compression to your MCP tools
- 95%+ token reduction through pure intelligence
Repo → https://t.co/b5Eqruo2PM
8. Claude Context
- Zilliz’s hybrid vector search MCP
- Makes your entire codebase the context for 40% less cost
Repo → https://t.co/hPG6pb0j3G
9. Claude Token Efficient
- Just drop one CLAUDE.md file into your repo
- Enforces strict terseness with zero code changes
Repo → https://t.co/fNrl6nwItF
10. Token Savior
- Navigates your code by symbols, not giant files
- 97% reduction on code navigation with persistent memory
Repo → https://t.co/lkILPhfwJh
----
[ The god-tier stack ]
Pick 2-3 based on what’s draining you:
> Massive repo? Code Review Graph + Token Savior
> Heavy terminal output? RTK
> MCP data dumps? Context Mode
> Need an instant fix? Caveman + Claude Token Efficient
Most devs are bleeding tokens.
Run `/context` in a fresh session and watch the savings roll in 👀
🚨 CRITICAL: Security scanner 'Trivy' has been compromised by threat actors who influenced the GitHub build process and pushed a malicious update that installs infostealer malware dubbed "TeamPCP Cloud stealer."
They created a lookalike domain (scan.aquasecurtiy[.]org) and pulled 4 malicious Golang files from it into the build process.
What it steals:
▪️ SSH keys
▪️ Cloud credentials (AWS, GCP, Azure)
▪️ Kubernetes tokens
▪️ Crypto wallets
▪️ Environment variables
▪️ 50+ sensitive file paths scanned
Stop burning tokens on Claude Code.
Use this instead 👇
A free GitHub repo (80K⭐) that turns your CLI into a high-performance AI coding system.
Link → https://t.co/sSu3L06v3m
Why it’s different:
→ Token optimization
Smart model selection + lean prompts = lower cost
→ Memory persistence
Auto-save/load context across sessions
(No more losing the thread)
→ Continuous learning
Turns past work into reusable skills
→ Verification loops
Built-in evals so code actually works
→ Subagent orchestration
Tames large codebases with iterative retrieval
Most people think Claude struggles with complex repos.
It doesn’t.
They’re just using the wrong setup.
This fixes it.
Bookmark this for your AI stack. ♻️
#AI #Claude #AIAgents #LLM #GenAI #DevTools
Stop building bigger prompts.
Start building skills.
This repo is a curated list of agent skills that plug into tools like Claude, Copilot, Codex, and VS Code.
• Reusable workflows
• Structured instructions
• Cross-platform
Claude 4.6 is a good programmer but writes insanely severe bugs constantly, it won't catch them all in audits, nor will other claudes
You need codex 5.4 auditing every commit 4+ times. If you don't believe me, try it.
I have an /auditcodex skill for it
https://t.co/vndOL8STML
We just open sourced our AI vulnerability scanner 🔥
👉 https://t.co/mhf6TdiJMC
Built for the reality that GenAI security isn’t static:
• jailbreaks & prompt injections evolve weekly
• agents introduce new attack surfaces
• most issues aren’t caught until prod
The scanner:
continuously probes models with real-world attacks
tracks vulnerabilities across LLMs + agents turns findings into repeatable security tests
Powered by the same pipeline behind 0DIN’s bug bounty + threat intel feed.
If you're building with AI, you need adversarial testing not just evals.
PRs welcome.
Claude Code Hacked to Achieve Full RCE and Hijacked Organization API Keys
Critical vulnerabilities in Anthropic's Claude Code, an AI-powered command-line development tool.
The flaws could allow attackers to achieve Remote Code Execution (RCE) and exfiltrate Anthropic API keys by exploiting project configuration files. Claude Code allows developers to delegate tasks directly from their terminal.
To facilitate team collaboration, it supports project-level configurations through a .claude/settings.json file stored directly in the repository.
Malicious configurations could trigger unintended actions on a developer's machine, effectively turning a passive setup file into an execution vector.
Credited to the respected owner
#cybersecuritynews #infosec #hacking #hacker #cybersecurity #claudia