Nibaldo @CyberMath4 - Twitter Profile

CyberMath4 retweeted

Classic arts & Wallpapers 🥷

@Classisco123

about 2 months ago

Dragon Ball Z | Tap Hold Load 4k

1

1K

121

128

39K

CyberMath4 retweeted

MOON LOVER

@M1ONLOVER

about 2 months ago

Full Moon

47

25K

3K

1K

268K

CyberMath4 retweeted

Sebastian Raschka

@rasbt

3 months ago

I (finally) put together a new LLM Architecture Gallery that collects the architecture figures all in one place! https://t.co/NO7z6XSRHS https://t.co/X41FrK4i94

202

8K

1K

8K

733K

CyberMath4 retweeted

𝘾𝙖𝙧𝙣𝙖𝙫𝙖𝙡𝙖𝙙𝙞𝙘𝙩𝙤𝙨 @Carnavaladictos

3 months ago

Last cuarteta of the chirigota Ssshhhhh @ChirigotaBizc8

0

198

36

14

8K

CyberMath4 retweeted

4 months ago

Oh, oh, oh, when February comes ❤️‍🩹 @ChirigotaBizc8

2

1K

155

90

32K

CyberMath4 retweeted

C҉r҉a҉z҉y҉ ҉3҉x҉4҉ 🎭 @crazy3x4

5 months ago

And you, why do you like carnival?? Well... 👇

1

567

162

67

29K

CyberMath4 retweeted

Oscar Arias

@OACerebro

5 months ago

Enjoy! https://t.co/RKCakL3TQR

33

6K

1K

4K

250K

CyberMath4 retweeted

MiniMax (official) @MiniMax_AI

5 months ago

MiniMax M2.1 is officially live🚀 Built for real-world coding and AI-native organizations — from vibe builds to serious workflows. A SOTA 10B-activated OSS coding & agent model, scoring 72.5% on SWE-multilingual and 88.6% on our newly open-sourced VIBE-bench, exceeding leading closed-source models like Gemini 3 Pro and Claude 4.5 Sonnet. The most powerful OSS model for the agentic era is here.

116

2K

272

1K

1M

CyberMath4 retweeted

Chao Huang

@huang_chao4969

6 months ago

🚀 Paper2Slides is now open source! Transform research papers & technical reports into professional presentations with ONE click!

We've generated stunning presentation slides from the latest DeepSeek V3.2 paper in diverse styles - check them out and share your feedback!

🔥 Core Features:
- 📄 Multi-format support - PDF, Word, Excel, PowerPoint & more
- 🎯 Smart content understanding - Captures key insights, figures, formulas, tables & data points.
- 🎨 Custom styling - Professional themes with full personalization.
- ⚡ Lightning fast - High-quality PPT generation in minutes.

GitHub: https://t.co/zNxlFifDU3

Never build slides from scratch again! ✨ Come play with it ⭐!

#Paper2Slides #AIPPT

38

2K

261

2K

467K

CyberMath4 retweeted

Google for Developers

@googledevs

7 months ago

📈 Metrax is the evaluation metrics library for JAX. It's fast, scales well, and fully open source.

Learn how it works in your training loop → https://t.co/7CqvDPgImE https://t.co/YaXxufymNM

4

138

21

37

20K

CyberMath4 retweeted

NotebookLM

@NotebookLM

7 months ago

We spent all night wondering what you were "actually" waiting for, and we finally have it: Images as sources!!! Whether it's a photo of handwritten notes, a screenshot of a textbook or graphs on a web page, @NotebookLM can synthesize the information and produce outputs from it.

317

6K

600

2K

982K

CyberMath4 retweeted

Google for Developers

@googledevs

7 months ago

Google Colab is officially coming to @code! ⚡️ You can now connect VS Code notebooks directly to @GoogleColab runtimes. Get the best of both worlds: the editor you love, powered by the compute (GPUs/TPUs) you need. → https://t.co/prgImNfEd2

118

5K

751

2K

907K

CyberMath4 retweeted

Deedy

@deedydas

7 months ago

DeepSeek-OCR is the best OCR ever.

It parses this extremely hard to read handwritten letter written by mathematician Ramanujan in 1913 with a frightening degree of accuracy.

Not perfect, but beats former best dots ocr. Bonus points if you can spot the errors.

Try it here: https://t.co/LP5UQBpc3y

141

5K

522

2K

707K

CyberMath4 retweeted

Ray Fernando

@RayFernando1337

8 months ago

This is the JPEG moment for AI.

Optical compression doesn't just make context cheaper. It makes AI memory architectures viable.

Training data bottlenecks? Solved.
- 200k pages/day on ONE GPU
- 33M pages/day on 20 nodes
- Every multimodal model is data-constrained. Not anymore.

Agent memory problem? Solved.
- The #1 blocker: agents forget
- Progressive compression = natural forgetting curve
- Agents can now run indefinitely without context collapse

RAG might be obsolete.
- Why chunk and retrieve if you can compress entire libraries into context?
- A 10,000-page corpus = 10M text tokens OR 1M vision tokens
- You just fit the whole thing in context

Multimodal training data generation: 10x more efficient
- If you're OpenAI/Anthropic/Google and you DON'T integrate this, you're 10x slower
- This is a Pareto improvement: better AND faster

Real-time AI becomes economically viable
- Live document analysis
- Streaming OCR for accessibility
- Real-time translation with visual context
- All were too expensive. Not anymore.
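
The figures quoted in the post can be sanity-checked with a little arithmetic. The sketch below only divides the post's own numbers; the derived per-page token counts, the compression ratio, and the implied GPUs per node are inferences from those numbers, not measurements, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the numbers quoted in the post.
# Everything below is derived from the post's own figures (assumed accurate).

PAGES = 10_000
TEXT_TOKENS = 10_000_000     # "10M text tokens"
VISION_TOKENS = 1_000_000    # "1M vision tokens"

# Implied tokens per page in each representation.
text_tokens_per_page = TEXT_TOKENS / PAGES
vision_tokens_per_page = VISION_TOKENS / PAGES

# Implied compression ratio (text tokens per vision token).
compression_ratio = TEXT_TOKENS / VISION_TOKENS

# Throughput: "33M pages/day on 20 nodes" and "200k pages/day on ONE GPU".
pages_per_node_per_day = 33_000_000 / 20
# Assumption: every GPU sustains the single-GPU rate, so this is the
# number of GPUs per node the two figures would imply together.
implied_gpus_per_node = pages_per_node_per_day / 200_000

print(text_tokens_per_page, vision_tokens_per_page, compression_ratio)
print(pages_per_node_per_day, implied_gpus_per_node)
```

Read this way, the "10M vs 1M" claim amounts to a 10x token reduction at roughly 1,000 text tokens per page, and the cluster figure is consistent with about eight GPUs per node.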

104

6K

675

5K

502K

CyberMath4 retweeted

elvis

@omarsar0

8 months ago

Agentic Context Engineering

Great paper on agentic context engineering.

The recipe:

Treat your system prompts and agent memory as a living playbook.

Log trajectories, reflect to extract actionable bullets (strategies, tool schemas, failure modes), then merge as append-only deltas with periodic semantic de-dupe.

Use execution signals and unit tests as supervision. Start offline to warm up a seed playbook, then continue online to self-improve.

On AppWorld, ACE consistently beats strong baselines in both offline and online adaptation. Example: ReAct+ACE (offline) lifts average score to 59.4% vs 46.0–46.4% for ICL/GEPA. Online, ReAct+ACE reaches 59.5% vs 51.9% for Dynamic Cheatsheet.

Paper: https://t.co/AZRZe0axlI
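
The "append-only deltas with periodic de-dupe" step of the recipe can be pictured with a minimal sketch. This is not the paper's implementation: the `Playbook` class and its method names are hypothetical, and the de-dupe here compares normalized strings as a simple stand-in for the semantic de-duplication the post mentions (which would more plausibly compare embeddings).

```python
from dataclasses import dataclass, field


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different bullets match."""
    return " ".join(text.lower().split())


@dataclass
class Playbook:
    """Hypothetical playbook: a growing list of actionable bullets."""
    bullets: list = field(default_factory=list)

    def append_delta(self, new_bullets):
        # Append-only merge: existing bullets are never rewritten in place.
        self.bullets.extend(new_bullets)

    def dedupe(self):
        # Periodic cleanup. Stand-in for semantic de-dupe: keeps the first
        # occurrence of each bullet after string normalization.
        seen = set()
        kept = []
        for bullet in self.bullets:
            key = _normalize(bullet)
            if key not in seen:
                seen.add(key)
                kept.append(bullet)
        self.bullets = kept


playbook = Playbook()
playbook.append_delta(["Retry the API call on timeout.",
                       "Check the tool schema before calling."])
playbook.append_delta(["retry the API call  on timeout."])  # near-duplicate
playbook.dedupe()
print(playbook.bullets)
```

Keeping merges append-only and pushing cleanup into a separate pass means a bad reflection step can add noise but cannot silently overwrite earlier bullets.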

31

1K

179

1K

259K

CyberMath4 retweeted

DeepSeek

@deepseek_ai

8 months ago

🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model! ✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention(DSA) for faster, more efficient training & inference on long context. 👉 Now live on App, Web, and API. 💰 API prices cut by 50%+! 1/n

323

7K

888

1K

1M

CyberMath4 retweeted

Qwen

@Alibaba_Qwen

8 months ago

🚀 Introducing Qwen3-LiveTranslate-Flash — Real‑Time Multimodal Interpretation — See It, Hear It, Speak It!

🌐 Wide language coverage — Understands 18 languages & 6 dialects, speaks 10 languages.
👁️ Vision‑Enhanced Comprehension — Reads lips, gestures, on‑screen text and real‑world entities, robust in noise.
⚡ 3s Latency — Near‑instant real-time translation.
🎯 Lossless Interpretation — Offline‑level translation accuracy.
🎙️ Natural, expressive voices.

Connect with the world face-to-face, in real time.
🔗 Blog: https://t.co/EkoJjsU6E7
💻 Dashscope: https://t.co/7wmHANamLY
🤗 Demo: https://t.co/Evfwaj66Z6

41

1K

205

506

105K

CyberMath4 retweeted

DailyPapers

@HuggingPapers

9 months ago

Microsoft introduces Latent Zoning Network (LZN)

A unified principle for generative modeling, representation learning, and classification. LZN uses a shared Gaussian latent space and modular encoders/decoders to tackle all three core ML problems at once! https://t.co/HWzphbuexB

4

219

29

118

20K

CyberMath4 retweeted

Qwen

@Alibaba_Qwen

9 months ago

🚀 Introducing Qwen3-Next-80B-A3B — the FUTURE of efficient LLMs is here!

🔹 80B params, but only 3B activated per token → 10x cheaper training, 10x faster inference than Qwen3-32B.(esp. @ 32K+ context!)
🔹Hybrid Architecture: Gated DeltaNet + Gated Attention → best of speed & recall
🔹 Ultra-sparse MoE: 512 experts, 10 routed + 1 shared
🔹 Multi-Token Prediction → turbo-charged speculative decoding
🔹 Beats Qwen3-32B in perf, rivals Qwen3-235B in reasoning & long-context

🧠 Qwen3-Next-80B-A3B-Instruct approaches our 235B flagship.
🧠 Qwen3-Next-80B-A3B-Thinking outperforms Gemini-2.5-Flash-Thinking.

Try it now: https://t.co/V7RmqMaVNZ
Blog: https://t.co/qhzjBv6dEH
Huggingface: https://t.co/zHHNBB2l5X
ModelScope: https://t.co/mld9lp8QjK
Kaggle: https://t.co/GeTStgaMlu
Alibaba Cloud API: https://t.co/RdmUF5m6JA
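
To make "512 experts, 10 routed + 1 shared" concrete, here is a generic top-k routing sketch in plain Python. Only the three constants come from the post. The `route` function, the random router logits, and the choice to softmax-normalize over just the selected experts are illustrative assumptions about how a sparse MoE router can work, not a description of Qwen3-Next's actual code.

```python
import heapq
import math
import random

NUM_EXPERTS = 512  # from the post
TOP_K = 10         # routed experts per token, from the post
NUM_SHARED = 1     # always-active shared expert, from the post


def route(router_logits, top_k=TOP_K):
    """Pick the top_k experts and return (expert_index, weight) pairs.

    Weights are a softmax over the selected logits only (an assumption;
    real models differ in how they normalize gate scores).
    """
    top = heapq.nlargest(top_k, enumerate(router_logits), key=lambda p: p[1])
    max_logit = top[0][1]  # nlargest returns results in descending order
    exps = [(idx, math.exp(logit - max_logit)) for idx, logit in top]
    total = sum(w for _, w in exps)
    return [(idx, w / total) for idx, w in exps]


# Hypothetical router output for one token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route(logits)
active_fraction = (TOP_K + NUM_SHARED) / NUM_EXPERTS

print(len(selected), "routed experts plus", NUM_SHARED, "shared")
print(f"fraction of experts active per token: {active_fraction:.4f}")
```

With these constants only 11 of 512 experts run for any given token, which is the mechanism behind the "80B params, but only 3B activated" headline.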

171

4K

681

2K

931K

CyberMath4 retweeted

Sebastian Raschka

@rasbt

9 months ago

Updated & turned my Big LLM Architecture Comparison article into a narrated video lecture.

The 11 LLM architectures covered in this video:

1. DeepSeek V3/R1
2. OLMo 2
3. Gemma 3
4. Mistral Small 3.1
5. Llama 4
6. Qwen3
7. SmolLM3
8. Kimi 2
9. GPT-OSS
10. Grok 2.5
11. GLM-4.5

41

3K

495

3K

199K
