Owais @alshell7 - Twitter Profile

Pinned Tweet

11 months ago

🚀 Hot drop from @JinaAI_ !
Jina Embeddings v4🌍✨

Why it's 🔥:
✅ Superior text retrieval performance
✅ Multimodal & multilingual support
✅ Lightweight & FAST (thanks to GGUF!)

👉 Grab it: https://t.co/XW0uEByAA8

📰 Deep dive: https://t.co/1LuwElQd6j

1

2

0

425

Owais

@alshell7

about 12 hours ago

@Prince_Canuma Absolutely amazing!

1

0

78

Owais

@alshell7

3 days ago

@NousResearch At least you guys think GUI & TUI both are needed. Thanks

0

8

alshell7 retweeted

Evis Drenova

@evisdrenova

5 days ago

idk who created this website, but it's awesome and i love it. https://t.co/gktjwcY1Gj

33

3K

247

5K

234K


8 days ago

A bubble: Internet > https://t.co/NQTyyCZ5XK Bubbles: ChatGPT > Vibe Coding > OpenClaw > Mac Mini > Claude Currently gated by: AI Psychosis PS: "Attention Is All You Need"

0

1

0

59

alshell7 retweeted

Liquid AI

@liquidai

10 days ago

Today, we're releasing LFM2.5-8B-A1B, a device-optimized model designed to power real-life applications on phones, laptops, PCs, robots, and fast & lightweight server-side use-cases.

> 8B MoE, 1.5B active
> Expanded 128K context
> LFM2.5 flagship hybrid MoE architecture
> Trained on 38T tokens + large-scale RL
> fast, reliable tool calling, punching above its weight, comparable to models with up to 4x its size
> customizable on a single GPU for any specialized task
> LFM2 open-weight license

🧵

139

4K

506

3K

1M

alshell7 retweeted

Alibaba Group @AlibabaGroup

17 days ago

Qwen3.7-Max is live! 🚀

Introducing the latest proprietary model, built for advanced agentic coding, complex reasoning, and long-horizon execution. It’s here to transform how we approach complex tasks. https://t.co/hoxURmOboB

72

2K

241

569

3M

alshell7 retweeted

steven

@Tu7uruu

19 days ago

Just added in smol-audio: a reproducible NeuTTS Nano recipe for new languages. Want a TTS model in your own language?
- NeuCodec dataset encoding
- fine-tuning CLI + notebook
- Inference script
Change the phonemizer + dataset and train your own TTS model. The Italian example includes a YODAS/Granary config with 300k samples (around 1000 hours of audio).

4

68

7

52

4K

alshell7 retweeted

Shubham Sharma

@HappyyPablo

18 days ago

open sourcing Marlin-2B 🐟 a tiny VLM to extract structured information from videos Marlin is finetuned for two questions devs want to ask in their videos: what is happening, and when? Best open model in its weight class, competitive with Gemini-2.5-flash at only 2B params 🧵

135

5K

523

5K

305K

alshell7 retweeted

Sapient Intelligence @Sapient_Int

19 days ago

Introducing HRM-Text. An ultra-lean 1B-parameter reasoning language model designed to deliver strong general performance with a fraction of the data, compute, and infrastructure. Trained on just 40B structured tokens, HRM-Text achieves competitive performance while using ~1/1000 of the training data of comparable models. The kicker? The full model trains in roughly one day on a $1,000 budget. This opens the door to a new generation of AI that is powerful, accessible, and radically easier to adapt. Theories and research concepts once deemed too expensive to test are officially back in the game. Sapient Intelligence invites you to help us shape a new paradigm for general intelligence.

160

3K

269

2K

506K

alshell7 retweeted

Yuchen Jin

@Yuchenj_UW

20 days ago

GPU shortage is worse than ever.

H100s cost more today than they did 3 years ago, and you cannot get them on-demand.

The big AI labs have locked up most of the supply for years. I’m worried university researchers and individual developers simply won’t be able to get GPUs. https://t.co/XW12t3ZcTv

117

2K

105

291

257K

Owais

@alshell7

24 days ago

How I wish @CloudflareDev would add integration-as-a-service to its developer ecosystem. I mean, why not? Shipping integrations today is a pain. Of course, open-source solutions like Nango exist, but when you need scalability and robustness while staying cost-effective, Cloudflare has been the choice. I wish they would think about this sooner rather than later. @Cloudflare @dok2001

0

20

alshell7 retweeted

Nous Research

@NousResearch

24 days ago

Today we release Token Superposition Training (TST), a modification to the standard LLM pretraining loop that produces a 2-3× wall-clock speedup at matched FLOPs without changing the model architecture, optimizer, tokenizer, or training data.

During the first third of training, the model reads and predicts contiguous bags of tokens, averaging their embeddings on the input side and predicting the next bag with a modified cross-entropy on the output side. For the remainder of the run, it trains normally on next-token prediction. The inference-time model is identical to one produced by conventional pretraining.

Validated at 270M, 600M, and 3B dense scales, and at 10B-A1B MoE.

The work on TST was led by @bloc97_, @gigant_theo, and @theemozilla.

151

4K

419

2K

447K

alshell7 retweeted

AJ

@ItsmeAjayKV

28 days ago

Me starting with LLMs: "bigger GPU, more VRAM = faster inference"

Me now:
- VRAM bandwidth
- KV cache behaviour
- memory latency
- cache locality
- PCIe bottlenecks
- kernel efficiency
- quantization tradeoffs
- memory movement

Modern AI inference is basically systems engineering disguised as matrix multiplication.

16

444

41

237

19K

alshell7 retweeted

Ant Ling

@AntLingAGI

29 days ago

We are launching Ring-2.6-1T, a trillion-parameter flagship thinking model engineered for real-world complex tasks and production env: 🚀
- Adjustable Thinking Effort: dynamic compute mechanism to flexibly balance cognitive depth, token cost, and execution speed;
- Agent-Optimized: Built for high-frequency workflows, delivering rapid multi-step execution and tool orchestration with SOTA stability;
- Deep Thinking: Unlocks the model's maximum capability ceiling for rigorous mathematical logic and scientific research;

72

707

101

210

3M

alshell7 retweeted

Zecheng Zhang

@zechengzh

about 1 month ago

Introducing Mirage, a unified virtual filesystem for AI agents!

6 weeks. 1.1M+ lines of code. We rewrote bash from the ground up so cat, grep, head, and pipes work across heterogeneous services. S3, Google Drive, Slack, Gmail, GitHub, Linear, Notion, Postgres, MongoDB, SSH, and more, all mounted side-by-side as one filesystem.

Bash that AI agents already know works on every format! cat, grep, head, and wc parse .parquet, .csv, .json, .h5, even .wav! One pipe can stitch S3, Drive, GitHub, Slack, and Linear together, same Unix semantics throughout.

Workspaces are versioned too. Snapshot, clone, and roll back the whole thing with one API call. A two-layer cache turns repeated reads into local lookups, so agent loops stay fast and cheap.

Drop a Workspace into FastAPI, Express, or a browser app. Wire it into OpenAI Agents SDK, Vercel AI SDK, LangChain, Mastra, or Pi. Run it alongside Claude Code and Codex.

Site: https://t.co/zo1orc2wA9
GitHub: https://t.co/zeRAKri7I9

#AIAgents #OpenSource #AgenticAI #Strukto #Filesystem #VFS

170

3K

339

5K

618K

alshell7 retweeted

LightSeek Foundation

@lightseekorg

about 1 month ago

Introducing TokenSpeed, a speed-of-light LLM inference engine.

> TensorRT LLM level performance
> vLLM level usability
> Built by a lean and mission-driven team in two months
> MIT license, open-source

https://t.co/MJzhCEg7m8

https://t.co/anhoETwwS9 https://t.co/BWn4Me62x7

44

1K

127

926

2M

alshell7 retweeted

Alexander Whedon

@alex_whedon

about 1 month ago

Introducing SubQ - a major breakthrough in LLM intelligence.

It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus

Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do.

That's nearly 1,000x less compute and a new way for LLMs to scale.

1K

23K

3K

19K

13M

alshell7 retweeted

Poolside

@poolsideai

about 1 month ago

Today we’re releasing Laguna XS.2, Poolside’s first open-weight model.
It’s a 33B total / 3B active MoE model built for agentic coding and long-horizon tasks.
Trained fully in-house on our own stack. Runs on a single GPU. Released under Apache 2.0.
Links 👇
Weights: https://t.co/HSo8L2gM64
API: https://t.co/DMJtNFrace
Blog: https://t.co/BXEjQxtQoV

44

808

144

379

274K

Owais

@alshell7

about 1 month ago

@brrrkyle @Suyash151504 @TheAhmadOsman Hey this is cool! Something that I was looking for today. A visual way of learning. Is it still under progress?

1

0

28

Owais

@alshell7

about 1 month ago

@TheAhmadOsman Hey @TheAhmadOsman, are there any good tutorials or a starting point for learning kernels better?

0

1

0

415
