jelveh @jelveh - Twitter Profile

jelveh

@jelveh

1 day ago

Impressive!!

lifcc

@mylifcc

2 days ago

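Google is pushing hard again. Yesterday Google dropped QAT models for the Gemma4 series, which let a 16GB Mac run a 256k-context agent. Some people said earlier that deploying Gemma 4 12B locally with 30k context was not much use. I tried Google QAT, and adding just 1.5GB of memory is enough to open up 256k context! Left: regular Q4_K_M @ 32K. Right: Google QAT Q4_0 @ 256K. Same Mac, same prompt, two llama-server instances running at the same time, with a live comparison of memory, CPU, and tok/s.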

jelveh

@jelveh

2 days ago

Every demo is a banger @PINTO03091 ! 🎉🎉🎉

Super PINTO

@PINTO03091

2 days ago

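Testing Wholebody49 on a sample from the 1993 ultra-low-quality NHK video gallery. The video is licensed for modification and redistribution. Terrible artifacts, fast human motion, a large number of people, and the second half of the video is far too dark. The Simple Tracker implementation is extremely crude, so the unstable mask colors are expected and I am ignoring them.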

jelveh

@jelveh

2 days ago

“Scientific discovery requires that the search space itself changes” With discovery, new languages emerge. Amazing work!

Markus J. Buehler

@ProfBuehlerMIT

3 days ago

We've made a breakthrough in self-evolving AI scientists moving from "search" to "principled discovery": Scientific discovery requires that the search space itself changes, and an AI scientist must perceive this shift without intervention. We built an AI that achieves this for the first time with the ability to discover the scientific vocabulary it reasons in. Evidence, tools, artifacts, verifiers, failures & claims become typed provenance. We show three distinct modalities: 1) retrieval, adding known objects; 2) search, exploring a fixed schema; and critically: 3) discovery, a verified regime transition. We solve the open-endedness evaluation problem by lifting agentic workflows into a typed copresheaf and proving, via a Kan obstruction, that true discovery is not unbounded generation but a verifiable schema expansion: old evidence is transported by Left Kan extension, and genuine novelty is mathematically quantified by the pointwise residual beyond the transported image - separating discovery from mere search and making novelty objective and measurable rather than a subjective judgment or benchmark delta. Our AI scientist is built in a way that does not pre-conceive the approach it chooses; instead, we endow the system with formal power to adapt, evolve, and reason from first principles. Case studies include: 1⃣Builder/Breaker model that discovers mode-conditioned compliance in proteins; 2⃣CategoryScienceClaw that finds anisotropic fiber-network stiffness rules. Great work in collaboration with my graduate student @fwang108_ @MITdeptofBE F.Y. Wang & M.J. Buehler, Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence, arXiv:2606.01444, 2026

jelveh

@jelveh

4 days ago

The crystal cpu sci-fi people were right.

Object Zero

@Object_Zero_

5 days ago

Single Crystal CVD Diamond

Have no doubt, you are at the dawn of an industrial revolution. There is a string of breakthroughs happening throughout upstream industries that all compound.

Diamond manufacturing is now able to produce CPU-size single-crystal wafers.

Currently these are marketed as heat spreaders because they have a thermal conductivity of 2,200 W/mK, which means they move heat incredibly effectively.

However, that somewhat misses the wood for the trees…

Diamond has physical and electrical properties that exceed traditional silicon, making it uniquely suited for high demand applications.

Thermal Conductivity: Heat is the enemy of electronics. Diamond conducts heat better than almost any other known material, about 5 times better than copper and over 10 times better than silicon.

A diamond chip can act as its own heat sink.

Ultra Wide Bandgap: Diamond can handle massive amounts of voltage and operate at incredibly high temperatures without electrical breakdown.

This makes it perfect for high power applications like electric vehicle inverters, power grids, and aerospace technologies.

High Frequencies: Electrons move very quickly through diamond, allowing chips to operate at much higher frequencies, which is ideal for advanced telecommunications and radar.

Radiation Hardness: Diamond is incredibly resilient to radiation, making diamond based chips ideal for satellites, space exploration, and nuclear facilities.

To make a material act as a semiconductor, you have to "dope" it. To do this you inject impurities into the crystal lattice to create a positive (p-type) or negative (n-type) charge.

Diamond's atomic structure is so tightly packed that forcing other elements into it is hard. While p-type doping (with boron) has been figured out, reliable n-type doping (with phosphorus) remains a massive hurdle.

Theoretical ceilings

Band gap
Silicon wafer = 1.1 eV
Diamond CVD wafer = 5.5 eV

Clock speed
Silicon wafer = 5-6 GHz clock wall
Diamond CVD wafer = 1-2 THz clock wall

Max Running Temp
Silicon wafer = 150°C
Diamond CVD wafer = 1,000°C

Whilst we etch silicon with photolithography and Extreme UV light, this doesn’t really work with chemically inert diamond.

Diamond CVD is currently etched with oxygen plasma etching, but this lacks the precision of EUV.

However, we can etch diamond to extreme precision with electron projection lithography. EPL was invented in the 90s by Bell Labs, IBM and Nikon but abandoned as it was harder than EUV.

Electrons repel each other, so the beam blurs too readily.

What if we built a femto electron beam?
What if we built it to the extreme, such that it was a ‘single electron’ pulse?

What if we built a microscopic "bed of nails" containing millions of nanoscale tungsten or silicon tips (photocathodes)? You shine a massive, highly complex femtosecond laser system across the entire array.

Every time the laser pulses, millions of tiny tips each fire a single, perfectly straight electron at the exact same time.

Turns out, research teams at the likes of MIT and Stanford are currently experimenting with exactly this: laser driven nanotip electron emitters.

Pair that tool with Diamond CVD substrate tech and we approach the material limits of both semiconductors and nanotechnology.

Would require asynchronous logic to escape fatal clock skew and operate at full capability.

But I think I will live to see it.

4 days ago

Great write up - thank you!!!

Alok

@analogalok

5 days ago

i just ran Google's brand new Unsloth Gemma4 12B dense GGUF on my RTX 4060 using llama.cpp + CUDA 13.2

21 tokens per second. on a budget consumer GPU. locally. no API. no cloud. no subscription.

and the benchmarks are absolutely cooked

# first let's talk architecture because this is genuinely different

every multimodal model you've used has a frozen vision encoder + frozen audio encoder + LLM backbone glued together

Gemma 4 12B is different

it's a single decoder only transformer. that's it.

vision? raw 48×48 pixel patches → one matmul → projected directly into the LLM
audio? raw 16kHz signal sliced into 40ms frames → linear projection → same LLM input space

no encoder tax. no latency penalty. no fragmented memory

to put the encoder savings in perspective:

old Gemma 4 26B approach:
- 550M param vision encoder (frozen)
- 300M param audio encoder (frozen)
- LLM backbone

Gemma 4 12B:
- 35M param vision embedder (a single matmul)
- no audio encoder at all
- LLM backbone handles EVERYTHING

550M → 35M for vision alone. that's a 15x reduction

this is why the gemma-4-12b-it-Q4_K_M.gguf is just 6.6 GB!!! and it has 256K native context

# Benchmarks:
AIME 2026 (math olympiad): 77.5%
GPQA Diamond (expert science): 78.8%
LiveCodeBench v6 (real code): 72%
Codeforces ELO: 1659
MMLU Pro: 77.2%
MATH-Vision: 79.7%
BigBench Extra Hard: 53%

inference → llama.cpp, LM Studio, vLLM, SGLang

llamacpp flags:
-m "gemma-4-12b-it-Q4_K_M.gguf" -ngl 99 -c 8000 -v --port 8080

Available on huggingface now! Link below
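
The "raw patches → one matmul" and "40ms frames → linear projection" steps described in the post can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names (`patchify`, `frame_audio`, `project`), the grayscale image, the toy embedding width of 8, and the random weights are all made up for the example and are not taken from Gemma or any library. Only the 48-pixel patch size, the 16 kHz sample rate, and the 40 ms frame length come from the post.

```python
import random

PATCH = 48            # patch side in pixels (from the post)
SAMPLE_RATE = 16_000  # audio sample rate in Hz (from the post)
FRAME_MS = 40         # audio frame length in milliseconds (from the post)


def patchify(image, patch=PATCH):
    """Split an H x W grayscale image (list of rows) into flattened patch vectors."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch) for c in range(patch)])
    return patches


def frame_audio(samples, sample_rate=SAMPLE_RATE, frame_ms=FRAME_MS):
    """Slice a mono signal into non-overlapping frames of frame_ms milliseconds."""
    size = sample_rate * frame_ms // 1000  # 640 samples per frame at 16 kHz / 40 ms
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def project(vectors, weight_rows):
    """One matrix multiply: each output entry is a dot product with one weight row."""
    return [[sum(x * w for x, w in zip(vec, row)) for row in weight_rows]
            for vec in vectors]


# Toy usage: a blank 96x96 image and one second of silence (illustrative inputs).
rng = random.Random(0)
embed_dim = 8  # hypothetical embedding width, far smaller than a real model
weights = [[rng.uniform(-1, 1) for _ in range(PATCH * PATCH)] for _ in range(embed_dim)]

image_tokens = project(patchify([[0.0] * 96 for _ in range(96)]), weights)
print(len(image_tokens), len(image_tokens[0]))  # 4 8

frames = frame_audio([0.0] * SAMPLE_RATE)
print(len(frames), len(frames[0]))  # 25 640
```

The point of the sketch is the shape bookkeeping: a 96×96 image yields four 48×48 patches, one second of 16 kHz audio yields 25 frames of 640 samples, and a single projection maps either kind of vector into the same token space.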

jelveh

@jelveh

4 days ago

Ohh like this!!

stevibe

@stevibe

5 days ago

I explored a further possibility with local models: Qwen3.6 35B A3B + NVIDIA LocateAnything-3B as a local Computer Use agent (proof of concept). In the demo, I asked it to switch my Mac to light mode. It did. Then back to dark. Did that too — finding the right toggle in System Settings, clicking it, and verifying the change itself. It's fully screenshot-based, so no Accessibility API needed. If it's on screen, the agent can see it and act on it. This runs entirely on your own hardware — private, local, built from two small open models.

jelveh

@jelveh

5 days ago

Thanks - on my list to try! 🙏

David Hendrickson

@TeksEdge

6 days ago

🌞This is big Local AI news! A new open-source Computer-Use LLM has just launched.

Holo 3.1 is H Company’s (🇫🇷) new local computer-use agent model that beats Qwen3.5-397B, Kimi-K2.5, and Sonnet 4.6!

Since it is built for local deployment →
⬩ Runs fully on your machine (MacBook, Windows PC, DGX Spark, RTX Spark)
⬩ Based on Qwen architecture, specialized for GUI understanding & computer control
⬩ Optimized checkpoints: NVFP4, FP8 & Q4 GGUF (0.8B to 35B sizes)
⬩ Strong gains: 79.3% on AndroidWorld benchmark (35B model)

💻 Comparison to Qwen3.5:
Holo 3.1 is fine-tuned specifically for computer-use agents (screen understanding, planning, clicking, navigation). Better at real GUI tasks than general-purpose Qwen3.5, especially when running locally.⚡

jelveh

@jelveh

6 days ago

Love this approach!! Really opened up something for me mentally!

stevibe

@stevibe

6 days ago

Qwen3.6 35B A3B can't fill out a paper form on its own. But give it NVIDIA's LocateAnything-3B (the #1 trending model on HuggingFace) as its eyes, and the two small models get it done together.

(The test: place each element at the right pixel position on a blank form image, not type into a field.)

Setup:
> Qwen is the brain (main model), LocateAnything is the eyes (helper model acting as a tool).
> I gave Qwen a new tool: ask "where's the email field?" and LocateAnything returns the exact x, y, width, height.
> The blue boxes on the screen are its detections. Look how tight they are: it nails every field.

Result:
> Qwen3.6 35B A3B + LocateAnything-3B: form completed, all info correct.
> Name, DOB, ID, gender, marital status, nationality, email, phone, address, postal code: all landed in the right field areas.
> Character-box alignment still a touch loose, but every value is where it belongs.
> 9m10s, 224.5k input, 24.3k output, 21 turns.

Why it matters:
> Qwen alone can't finish this test. Bolt on a 3B model that does exactly one thing (locate) and suddenly it can.
> A combination of small models can do the work of a single large one.
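
The brain-plus-eyes setup above can be sketched as a tool call that returns a bounding box. This is a rough illustration, not the author's code: `Box`, `locate`, and `fill_form` are hypothetical names, and the locator is a lookup table standing in for the helper model, since the post does not show how either model is actually invoked. Only the returned shape (x, y, width, height) comes from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Bounding box in pixels, the shape the post says the locator returns."""
    x: int
    y: int
    width: int
    height: int

    def center(self):
        """Pixel at the middle of the box, a simple place to anchor text."""
        return (self.x + self.width // 2, self.y + self.height // 2)


def locate(field, detections):
    """Hypothetical stand-in for the locator tool: look up a field's box.

    A real helper model would take a screenshot and a question such as
    "where's the email field?"; here the detections are supplied directly.
    """
    if field not in detections:
        raise LookupError(f"no detection for field: {field}")
    return detections[field]


def fill_form(values, detections):
    """For each value, ask the locator where the field is and plan a placement."""
    placements = {}
    for field, text in values.items():
        box = locate(field, detections)
        placements[field] = (box.center(), text)
    return placements


# Toy usage with made-up coordinates and values.
detections = {"email": Box(100, 200, 300, 40), "phone": Box(100, 260, 300, 40)}
plan = fill_form({"email": "a@example.com", "phone": "555-0100"}, detections)
print(plan["email"])  # ((250, 220), 'a@example.com')
```

The design point matches the post: the main model never needs to estimate pixel coordinates itself. It decides what goes where, and the narrow helper answers only the "where" question.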

jelveh

@jelveh

6 days ago

Kind of like what we do! Love this work!

Lukas Aichberger

@aichberger

7 days ago

We unlocked the working memory of LLMs 💥 Reasoning in Memory (RiM) replaces autoregressive "thinking out loud" with fixed memory blocks that form a task-specific workspace for latent reasoning. The key idea is simple: reasoning should happen inside the LLM, not in its output!

jelveh

@jelveh

7 days ago

Amazing folks!!! Fantastic engineers that create a real difference in the world! 🚀🚀🚀

Prince Canuma

@Prince_Canuma

7 days ago

Today we're shipping our biggest MLX-VLM release yet: v0.6.0 ...and we are raising 💸

This one's about turning your Apple devices into real local agent machines. From your desk to your pocket.

What's new:
⚡ Speculative decoding everywhere: Gemma 4 EAGLE3 + DFlash, Qwen MTP, DeepSeek V4 MTP. Faster tokens, less waiting.
🤖 Agent-ready server: native Anthropic /v1/messages API, stateful /v1/responses, tool calls, Codex context budgets. Plug Claude Code & Codex straight into local models.
👁️ New models galore: DeepSeek V4, ZAYA1-VL, MiniCPM-V 4.6, LFM2 MoE, Step-3.7 Flash, Laguna + more.
🎨 Image gen & editing: FLUX.2 (base + klein), PrismML Bonsai.
🔊 Audio in: Qwen3 Omni, Gemma 4 audio, base64 chat audio.
🧮 TurboQuant KV cache: RHT-correct fast paths for leaner memory.
📦 Modular server, better metrics, cleaner streaming.

Run real agents on the hardware already in your hands.

Github: https://t.co/1T06ur6LU5

jelveh

@jelveh

7 days ago

Not saying I fully agree, but this is a great piece of writing on what we are all experiencing in one way or another

Mario Zechner

@badlogicgames

8 days ago

recommended reading. > On that last point, this technology is horrific for attention https://t.co/Dv823d9qK2

jelveh

@jelveh

8 days ago

Fascinating!

恒星

@vintcessun

9 days ago

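This paper finally explains clearly why AI learns more slowly than humans: the problem is not the amount of data but the learning objective. Starting from sample complexity theory, it proves that predicting the model's own latent representations has an exponential data-efficiency advantage over predicting raw tokens: on PCFG data, token-level SSL needs Ω(exp(L)) samples, while latent prediction needs only O(log L). This is the first theoretical explanation of why latent-space methods such as data2vec and JEPA are efficient, and it also suggests that explicit multi-scale stacking of the H-JEPA kind may be redundant. However, the theory is limited to compositionally structured data and still needs to be validated on unstructured or non-hierarchical data. https://t.co/RgcdRUAstg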
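
To get a feel for the gap between the two bounds quoted above, the growth rates can be tabulated directly. This is a loose illustration only: it drops every constant factor hidden by the Ω and O notation, treats L as a plain number because the post does not define it further, and uses function names invented for this example. The values are growth curves, not sample counts reported by the paper.

```python
import math


def token_level_growth(L):
    """Growth term of the token-level lower bound, exp(L), constants dropped."""
    return math.exp(L)


def latent_growth(L):
    """Growth term of the latent-prediction upper bound, log(L), constants dropped."""
    return math.log(L)


# Compare how fast the two terms grow as L doubles.
for L in (4, 8, 16, 32):
    print(f"L={L:>2}  exp(L)={token_level_growth(L):.3e}  log(L)={latent_growth(L):.2f}")
```

Each doubling of L squares the exponential term while adding only a constant (log 2) to the logarithmic one, which is the sense in which the post calls the advantage exponential.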

jelveh

@jelveh

10 days ago

🤯🤯🤯

Bluntly Put Philosopher (BPP)

@SocraticScribe

10 days ago

A drone flying with no battery tether sounds impossible until you see it. GuRu wirelessly transmitted power through the air to directly run an untethered drone from 30 feet away and has kept one flying for 96 hours straight. Power beaming is moving out of science fiction fast.

jelveh

@jelveh

10 days ago

😂 cross pollination!!

Jun Song

@jun_song

10 days ago

Opus 4.8 distilled Alibaba Qwen 😂 The table has turned to Open Source AI

jelveh

@jelveh

10 days ago

Undervalued tweet of the decade.

🌌 Observer of Suns

@ObserverSuns

almost 4 years ago

“oh, every morning at 1:00am our language model regenerates the whole codebase from scratch based on the current requirements document. it’s more reliable than trying to make incremental edits”

jelveh

@jelveh

10 days ago

Wow, hope folks are safe and let’s rebuild!

NSF - NASASpaceflight.com

@NASASpaceflight

10 days ago

Blue Origin's New Glenn just blew up at LC-36 while attempting to Static Fire ahead of NG-4. https://t.co/tANS0dWyIH

jelveh

@jelveh

10 days ago

Wow, can’t wait to play with this! Thanks for building this!

Adithya S K

@adithya_s_k

11 days ago

Introducing Repo2RLEnv

Turn any repository into runnable, verifiable coding environments built from real PRs and commits for coding-agent evaluation or RL training

> uv pip install repo2rlenv

jelveh

@jelveh

10 days ago

Congrats!!

Lucas Maes

@lucasmaes_

11 days ago

Would you like to join the research effort on JEPA and World Models easily? After a full year of hard work, we’re excited to finally release stable-worldmodel: an open-source, scalable platform built to accelerate JEPA & World Model research! 📄: https://t.co/gnxGvens5A