Cedric Lejeune

@CLVPTY

AI transformation at HOLLI/TREKLAIR "Every data tells a story".

France

Joined August 2015

1.6K Following

366 Followers

2K Posts

Cedric Lejeune @CLVPTY

about 6 hours ago

Breakfast / sudo apt-get update / hermes update

CLVPTY retweeted

Victor M

@victormustar

about 22 hours ago

Before the week ends, let's acknowledge one of the most INSANE weeks ever for open AI, with 25+ notable open-weight drops across every modality:

🧠 LLMs
→ NVIDIA Nemotron 3 Ultra: 550B hybrid Mamba-MoE, only 55B active, 1M context, MMLU 89.1. NVFP4 variant claims ~5x throughput on Blackwell. First openly-weighted 550B hybrid Mamba-Transformer, closing the gap with frontier closed models.
→ Google Gemma 4 12B: fully open dense any-to-any (text/image/audio/video), 256k context, encoder-free, 140+ languages, AIME 2026 at 77.5. Shipped with a 23-checkpoint QAT wave (mobile ONNX + MLX). Most deployable model of the week.
→ StepFun Step-3.7-Flash: 198B sparse MoE VLM, ~11B active, SWE-Bench PRO 56.3. Apache 2.0.
→ Liquid AI LFM2.5-8B-A1B: edge MoE, just 1.5B active, 128k ctx, MATH500 88.8, MLX-ready. Best on-device option this week.
→ JetBrains Mellum2-12B-A2.5B-Thinking: their first open MoE, near-Qwen3-14B coding at 2.5B active. Apache 2.0.

🎨 Image gen (the surprise of the week)
→ Ideogram 4: their FIRST-EVER open weights. 9.3B flow-matching DiT trained from scratch. #2 overall behind GPT Image 2, top open-weight model on Design Arena + LMArena. Strongest open checkpoint for text-rich images, full stop. It has taste. Still can't believe this is open weights.

🔊 Audio & Speech (a breakout week for open TTS, 4 labs shipped)
→ Boson Higgs Audio v3 4B: 102 languages, 21 emotions, singing/whispering/shouting, sub-second TTFA.
→ RedNote dots.tts: the only fully continuous (no codec) open TTS pipeline, Apache 2.0.
→ Google Magenta RealTime 2: real-time music gen, <200ms latency, text+audio+MIDI. multimodalart ported it to PyTorch within hours with live ZeroGPU demos.
→ NVIDIA Nemotron-3.5 ASR: 600M streaming, 17x more concurrent streams vs Parakeet RNNT 1.1B.

👁️ Vision & VLMs
→ PaddleOCR-VL-1.6: SOTA document parsing at 1B params, Apache 2.0.
→ Baidu NAVA: 6.3B joint audio-video gen, best-in-class A/V sync, Apache 2.0.

🎬 Video, 3D & World Models
→ NVIDIA Cosmos3-Super: 64B omnimodal world model coupling action trajectories with video+audio gen, for Physical AI.
→ JD JoyAI-Echo: up to 5-min multi-shot text-to-video on LTX-2.3.
→ ByteDance Bernini-R + VAST TripoSplat (single-image-to-3D Gaussian splats, MIT).

245

297K

CLVPTY retweeted

Hermes Agent Tips

@HermesAgentTips

3 days ago

Here are the top 5 most cost-efficient models to run on Hermes Agent:

1. MiMo-V2.5
2. DeepSeek V4 Flash (Max)
3. MiMo-V2-Flash (Feb 2026)
4. DeepSeek V4 Flash (High)
5. Hy3-preview

https://t.co/OcpGWDiZh5

930

764

46K

Cedric Lejeune @CLVPTY

2 days ago

Rebuilding my Hermes Agent setup on a new VPS. In the end I skipped the Docker setup; I would probably go for a second VPS if I had to. Each profile has a gateway with Telegram for direct "management" discussions, then, depending on the job, WhatsApp/Signal. Pretty sweet.

CLVPTY retweeted

Unsloth AI

@UnslothAI

3 days ago

Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs.

Google's new model, Gemma 4 12B Unified, supports image, audio and 256K context.

You can run and train the model via Unsloth Studio.

GGUF: https://t.co/8cL321pVDh
Guide: https://t.co/odRo9WjRpA
https://t.co/Ax09ZTXFF3

381

340K

Cedric Lejeune @CLVPTY

2 days ago

In any case, the bits and pixels are words structured in a grammar. Everything is representation and interpretation.

a16z @a16z

3 days ago

World Labs CEO Dr. Fei-Fei Li: "The world is not made of words."

"Language models have given machines an extraordinary command of concepts, vocabulary, and reasoning, but the physical world, virtual or real, runs on a different substrate."

"Where language models learn the statistical structure of text, world models learn the statistical structure of space and time: how light falls on a surface, how a garden looks from an angle no camera has captured, how objects respond to force and follow the laws of physics."

"Language gave machines a way to talk about that world. World models are how machines will finally come to understand, imagine, reason and interact with it."

Full piece: https://t.co/C9qOJg5wuc

217

684

Cedric Lejeune @CLVPTY

2 days ago

@a16z In any case, the bits and pixels are words structured in a grammar. Everything is representation and interpretation.

Cedric Lejeune @CLVPTY

4 days ago

LabX powa !

Supersocks

@iamsupersocks

5 days ago

Join us this Thursday at 7 pm on Twitch for a live stream on (agentic) AI, SEO, and the second brain with our guest: @micka_dore.

Just yesterday I was saying it again: the richness of X lies in the encounters and connections you make there. Mickaël, alias Optimike, is one of those one-of-a-kind people who impress. So we decided to host a live stream to show you what he is working on right now and share his vision.

An Obsidian user for more than three years, he has gradually turned it into a true second brain. Since the arrival of LLMs and the rise of GPT, he has also used it as the foundation for his AI agents. For my part, it has become one of the tools I use most today, alongside Hermes and Codex, which says a lot about its potential.

We will talk about key skills to develop, Markdown (especially skills), personal organization, knowledge management, and also what can concretely be done with SEO in 2026.

We will also discuss Lab X, an initiative launched a few months ago that now brings together more than 100 active members. It is a space where we share our ideas and field feedback, and where we try to build and collaborate together when the opportunity arises.

See you Thursday at 7 pm.

326

CLVPTY retweeted

Unsloth AI

@UnslothAI

5 days ago

We made a guide on using MCP with local LLMs.

Connect Qwen3.6 and Gemma 4 for controlled access to tools, files, APIs, enabling private automated workflows.

Learn to use OAuth, Exa, Context7, Hugging Face & more.

Guide: https://t.co/bkgK1ikP9i
GitHub: https://t.co/aZWYAtakBP
https://t.co/Qkw1ivZCjt

991

148

950

38K

Cedric Lejeune @CLVPTY

6 days ago

@cjzafir Have you tried MiniCPM5 yet?

374

CLVPTY retweeted

Hermes Agent Tips

@HermesAgentTips

6 days ago

Your AI agent should never see your real API key.

A lot of people still don't know this, but you can use the iron proxy credential injection firewall. It's already built into Hermes Agent; you just have to turn it on:

hermes egress install
hermes egress setup

Or just ask Hermes Agent to turn it on.

289

493

16K

CLVPTY retweeted

MiniMax (official) @MiniMax_AI

6 days ago

Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities

- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero

API: https://t.co/fHRdSV7BwZ
Token Plan: https://t.co/BDCycxepZw
🚀New! MiniMax Code: https://t.co/GvB4YiB6Ul

Weights & Tech Report in ~10 Days

542

10K

CLVPTY retweeted

Teknium 🪽

@Teknium

7 days ago

New free model for your Hermes Agent - StepFun's new Step 3.7 Flash! This time we have quite a long-term guarantee, so you can rest easy it won't disappear on short notice! Sign up for a free account to access at https://t.co/tMAQFkegul

417

138

406K

CLVPTY retweeted

Teknium 🪽

@Teknium

8 days ago

Now in Hermes Agent, you can load an arbitrary number of tools, from MCPs to plugins, and get access to them all, with no compromise to your context! Anytime you load more than 10% of your context window in tools from MCPs or plugins, this feature will activate automatically! hermes update to get access now 😎

371

280K

Cedric Lejeune @CLVPTY

9 days ago

@BetterCallMedhi One could imagine a model-to-silicon strategy for industry; we have people in France who know how to do it https://t.co/rWq7z26z0t

CLVPTY retweeted

Liquid AI

@liquidai

9 days ago

Today, we're releasing LFM2.5-8B-A1B, a device-optimized model designed to power real-life applications on phones, laptops, PCs, robots, and fast & lightweight server-side use-cases.

> 8B MoE, 1.5B active
> Expanded 128K context
> LFM2.5 flagship hybrid MoE architecture
> Trained on 38T tokens + large-scale RL
> fast, reliable tool calling, punching above its weight, comparable to models with up to 4x its size
> customizable on a single GPU for any specialized task
> LFM2 open-weight license

🧵

139

505

CLVPTY retweeted

λL-D1 | AI for Buzzer 🍉

@F2aldi

10 days ago

Free Models Updated on 28 May 2026

OpenCode
- Deepseek v4 Flash (Recommended to try)
- MiMo v2.5 (Recommended to try, multimodal)
- Nemotron 3 Super

Best 1-10$
- Opencode Go (Kimi, MiniMax, Qwen, MiMo)
- Command Code Go (Deepseek, Qwen, MiMo)
- Deepseek PayPer API
- MiMo Lite (MiMo Family)

821

567

52K

Cedric Lejeune @CLVPTY

10 days ago

@gzlin MiniMax also has an edge with their business model based on stream (no monthly limit) and speed.

CLVPTY retweeted

Nous Research

@NousResearch

10 days ago

Hermes Agent now has a built-in MCP Catalog

108

190

Cedric Lejeune @CLVPTY

10 days ago

Chinese models have an edge because they are optimized for low operational cost by design. They may also scale more easily because they will rely less and less on @nvidia hardware, which is designed for a brute-force approach and will run into issues with the level of power in racks.

Fuli Luo

@_LuoFuli

10 days ago

Behind the MiMo API Price Reduction:

The deepest price cut, up to 99%, is for Input (Cache Hit). The core reason is our inference framework now supports hierarchical KV cache optimization for SWA. Production inference engine tests show this optimization increases cached token capacity by 5x, equivalent to an 80% reduction in caching costs. Combined with Cache Read Overlap among multiple Full Attention modules in the Hybrid model, actual costs are further reduced.

Prices for Input (Cache Miss) and Output are also reduced by 60%-80%. This mainly benefits from the extreme 1:7 Full:SWA sparsity ratio brought by the model architecture (the prefill compute of the 70-layer MiMo-V2.5-Pro roughly equals a 10-layer GQA model). This kept our original inference costs well below the industry average, naturally leaving a 2x-3x profit margin in pricing. This price adjustment simply reflects our decision to pass these structural cost efficiencies directly to developers.

Operating at these newly reduced API prices, our production inference engine is running at near full capacity, and we can still essentially break even.

We previously advised LLM companies not to "blindly cut prices" precisely because very few model architectures and inference optimizations can keep API costs from running at a loss. If more architectures that save compute and KV cache emerge, along with better inference infra to drive down API costs, this will form an excellent virtuous cycle in the industry.

More crucially, affordable, high-performance model APIs will drive real, sustained, and at-scale inference demand. This upstream demand pulls forward the development of the entire AI infrastructure chain (including chips, servers, optical transceivers, PCBs, liquid cooling, power, energy storage, and data centers), serving as a strategic fulcrum for a systemic revaluation of AI hardware.

In the long run, this injects more affordable and accessible compute into both training and inference pipelines, accelerating the parallel evolution of global AGI across multiple regions and technical routes.

For more technical details, we will release a detailed blog post later.
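
A quick check of the arithmetic in the post above: if the same cache budget holds 5x as many tokens, the cost per cached token falls to one fifth, which is the quoted 80% reduction. The sketch below is only an illustration of that relationship; the function name is hypothetical, and it assumes caching cost scales linearly with the storage used per token, which the post does not state explicitly.

```python
def caching_cost_reduction(capacity_multiplier: float) -> float:
    """Fractional drop in cost per cached token when the same cache
    budget holds `capacity_multiplier` times as many tokens.

    Assumption (not from the post): cost is proportional to the
    storage consumed per cached token.
    """
    if capacity_multiplier <= 0:
        raise ValueError("capacity_multiplier must be positive")
    return 1.0 - 1.0 / capacity_multiplier


# The 5x capacity figure quoted in the post.
print(f"{caching_cost_reduction(5):.0%}")  # 80%
```

Under the same assumption, a 2x capacity gain would correspond to a 50% reduction, so the savings flatten out as the multiplier grows.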

154

188

440

186K
