Kenji Honda @kenjinoodle - Twitter Profile

about 5 hours ago

How can AI predict event sequences over time?

A new survey by researchers from Renmin University, Guangdong University of Technology, and Southeast University reviews Temporal Point Processes using Bayesian, neural, and LLM approaches.

They find that neural and LLM models outperform traditional statistical methods in flexibility and accuracy across finance, health, and social media.

Advances in Temporal Point Processes: Bayesian, Neural, and LLM Approaches

Paper: https://t.co/uOQ5m8sfrk

Our report: https://t.co/Qa7EKrOom9

📬 #PapersAccepted by Jiqizhixin

1

8

4

6

826

kenjinoodle retweeted

Jesse Green @JesseGreenc1de

about 2 hours ago

$CBRS
Reports say OpenAI is running its 5.6 Sol Frontier model on Cerebras hardware, hitting throughput speeds near 750 tokens per second. That level of output is absolutely staggering. https://t.co/OTVB85EKM0

0

4

0

65

kenjinoodle retweeted

DailyPapers

@HuggingPapers

about 6 hours ago

BioMatrix is the first biological foundation model to natively read and generate sequences, structures, and language

A single decoder-only architecture maps molecules and proteins into one shared token space. Trained on 304B tokens, it achieves SOTA on 77 of 80 tasks. https://t.co/O63OrViI5J

1

39

5

18

2K

kenjinoodle retweeted

Rohan Paul

@rohanpaul_ai

about 2 hours ago

A crazy blog.

Chinese developers are buying Claude access through gray-market API transfer stations that can sell tokens at 5% to 10% of official prices while hiding the real user from Anthropic.

A transfer station is a middle server that takes a user’s prompt, sends it to Claude through overseas accounts, returns the answer, and collects payment through WeChat or Alipay.

The transfer station collects many Claude accounts through free credits, discounted accounts, shared subscriptions, overseas payment workarounds, fake verification, or sometimes stolen-card accounts.

It connects all those accounts behind one proxy, so Chinese users do not talk to Anthropic directly and only pay the proxy in RMB.

The cheap price comes from account farming, free-credit abuse, resale of unused quota, subscription splitting, possible stolen cards, and a darker trade where user prompts and outputs become training data.

So the price is hugely cheap not because Anthropic is giving a discount; it is cheap because the transfer station lowers its own cost and creates extra hidden revenue.

The user thinks they are buying cheap inference, but the proxy may swap Opus for weaker models, inflate token use, or store private code, tool calls, reasoning traces, and business data.

The proxy may store user prompts, code, outputs, and tool traces, then sell or reuse that data for model training.

This breaks a core assumption behind KYC, account bans, and abuse monitoring: the AI company sees the proxy, not the real person, so banning one account leaves the upstream supply chain alive.

11

52

12

33

6K

kenjinoodle retweeted

Rohan Paul

@rohanpaul_ai

about 2 hours ago

FT: Apple is asking Washington for permission to buy DRAM from CXMT, a blacklisted Chinese supplier, because AI server demand has made ordinary device memory painfully expensive.

DRAM is the short-term working memory inside iPhones, Macs, and iPads, while HBM is a stacked, faster version used in AI accelerators, so the AI buildout is pulling factory capacity toward servers and away from consumer gadgets.

Apple’s problem is supplier pressure, because it mainly depends on Micron, Samsung, and SK Hynix, while CXMT could add cheaper supply from China’s state-backed memory push.

But CXMT sits on the Pentagon’s Chinese Military Company list, which does not block Apple purchases by itself but signals national-security concern and could become far more serious if Commerce adds CXMT to the Entity List.

Apple’s $263B market-value loss was triggered by the memory-cost pressure that forced MacBook and iPad price hikes, showing how AI infrastructure demand is now raising the cost base of everyday consumer devices.

4

20

8

4

4K

kenjinoodle retweeted

fly51fly @fly51fly

about 22 hours ago

[CV] Unlimited OCR Works Y Yin, H Liu, YY, Q Xie… [Baidu Inc.] (2026) https://t.co/aQaQGvdSbv

0

13

7

12

2K

kenjinoodle retweeted

Rachel Thomas

@math_rachel

about 15 hours ago

"Long-term innovation depends not only on optimization for current objectives, but on the continued viability of ideas whose value is not yet legible" Thought-provoking ICML paper on limitations & risks of over-relying on AI benchmarks 1/

https://t.co/dNmvRXtsis

2

36

3

18

4K

kenjinoodle retweeted

Vuk Rosić 武克

@VukRosic99

about 13 hours ago

FlashAttention-4 just changed the game!

The problem: Blackwell scaled the matrix-multiply units way up, but the units that move shared memory and compute exponentials barely moved. So the old attention kernel now spends its time waiting on the parts that didn't get faster.

FlashAttention-4 rebalances around that with 3 tricks:

1. Overlap the matmul and the softmax so neither waits.
2. Compute the exponential in software, not on the slow dedicated unit.
3. Skip the rescaling you don't need.

I made a short visual breakdown - one diagram per trick. Swipe through. 👇

---

paper - https://t.co/uBn414Wd7H

Today's live: build an LLM from one prompt, then set up an autonomous research loop. Join 👉 https://t.co/6nocqbVceu
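
To make trick 3 a little more concrete, here is a plain-Python sketch of the streaming ("online") softmax that FlashAttention-style kernels are generally built on. It rescales the running accumulators only when the running maximum actually changes, which is one simple reading of "skip the rescaling you don't need". The exact skipping rule FlashAttention-4 uses is not stated in the tweet, so that part is an assumption, and all names here are illustrative rather than taken from the real kernel.

```python
import math

def online_softmax_weighted_sum(scores, values, block=2):
    """Compute sum_i softmax(scores)_i * values_i one block at a time."""
    running_max = -math.inf
    denom = 0.0      # running sum of exp(score - running_max)
    acc = 0.0        # running weighted sum of values, scaled like denom
    rescales = 0     # how many times the accumulators were rescaled
    for start in range(0, len(scores), block):
        s_blk = scores[start:start + block]
        v_blk = values[start:start + block]
        new_max = max(running_max, max(s_blk))
        if new_max > running_max:
            # Assumed skipping rule: rescale only when the max really moved
            # and there is something accumulated to rescale.
            if denom > 0.0:
                scale = math.exp(running_max - new_max)
                denom *= scale
                acc *= scale
                rescales += 1
            running_max = new_max
        for s, v in zip(s_blk, v_blk):
            p = math.exp(s - running_max)
            denom += p
            acc += p * v
    return acc / denom, rescales

def reference(scores, values):
    """Ordinary two-pass softmax-weighted sum, used as a check."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    return sum(e * v for e, v in zip(exps, values)) / sum(exps)

scores = [3.0, 1.0, 2.0, 0.5, 4.0, 1.5]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
streamed, rescales = online_softmax_weighted_sum(scores, values)
```

With these toy inputs the second block does not raise the maximum, so its rescale is skipped, and the streamed result still matches the two-pass reference.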

2

97

17

80

6K

kenjinoodle retweeted

Shubham Saboo

@Saboo_Shubham_

about 18 hours ago

UI/UX for building org-level Agent Harness. Bring your model, own the runtime, and wire it with your tools. 100% Opensource.

13

165

24

254

27K

kenjinoodle retweeted

Mark Kretschmann

@mark_k

about 12 hours ago

Seems like the release of the Grok 1.5T / Cursor Composer 3 model is imminent, as the version number has been removed from the menus. This always happens shortly before a release from @xai. 🔥 https://t.co/aMHAPuQnfY

30

617

32

51

26K

kenjinoodle retweeted

Fillipe Cordeiro

@fillipecordeiro

about 23 hours ago

My current AI coding workflow:

@claudeai → @OpenAICodexCli → @grok

- Claude: Planning & architecture
- Codex CLI: Actual building (edits files, runs commands, ships)
- Grok: Honest review + catches what the others miss

Each tool has different strengths. Using them in sequence > using any single one.

What's your current stack?

8

23

7

0

492

kenjinoodle retweeted

DAIR.AI

@dair_ai

1 day ago

NEW paper from NVIDIA.

(bookmark it)

Speed-of-light performance analysis tells you the theoretical floor of a workload, but teams still derive it by hand and freeze it. SOLAR automates the whole thing straight from PyTorch or JAX source.

An LLM frontend translates arbitrary code into an executable Affine Loop IR, validated by output comparison, then a deterministic pass lifts it into an einsum graph, and an analytical backend computes the bounds.

The model is confined to translation, so the actual bound math stays deterministic.

Across KernelBench, Flax models, and robotics workloads, they report zero observed SOL violations.

Paper: https://t.co/KXgsPxcSnY

Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c
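
As a rough illustration of what a speed-of-light bound means, the sketch below applies the familiar roofline-style estimate to a single matrix multiply: the runtime floor is the larger of the time needed for the arithmetic and the time needed to move the data. This is not SOLAR's pipeline, and the function name and the hardware peak numbers are hypothetical placeholders chosen only for the example.

```python
def matmul_sol_seconds(m, n, k, bytes_per_elem, peak_flops, peak_bw):
    """Roofline-style lower bound on runtime for C[m,n] = A[m,k] @ B[k,n]."""
    flops = 2 * m * n * k                                # one multiply and one add per term
    traffic = bytes_per_elem * (m * k + k * n + m * n)   # read A and B once, write C once
    compute_time = flops / peak_flops
    memory_time = traffic / peak_bw
    return max(compute_time, memory_time)

# Hypothetical accelerator: 100 TFLOP/s of compute, 1 TB/s of memory bandwidth.
PEAK_FLOPS = 100e12
PEAK_BW = 1e12

# A large square matmul is limited by compute under these assumptions.
big = matmul_sol_seconds(4096, 4096, 4096, 2, PEAK_FLOPS, PEAK_BW)
# A matrix-vector-shaped product is limited by memory traffic instead.
skinny = matmul_sol_seconds(1, 4096, 4096, 2, PEAK_FLOPS, PEAK_BW)
```

A measured runtime below such a bound would mean the model of the workload or the hardware is wrong, which is presumably what a "SOL violation" refers to.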

11

160

26

118

13K

kenjinoodle retweeted

Luiza Jarovsky, PhD

@LuizaJarovsky

about 22 hours ago

Now you even need to be a time traveler to apply for a job.

5

42

8

3

3K

kenjinoodle retweeted

Amit Shekhar

@amitiitbhu

1 day ago

Q × Kᵀ tells the model how relevant every word is to every other word. Softmax turns that into probabilities. V delivers the actual content. One formula. Three steps. The entire foundation of modern AI.
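
A minimal sketch of those three steps in plain Python, assuming a single attention head. The division by the square root of the key dimension comes from the standard scaled dot-product formulation and is not mentioned in the tweet; the function names and toy matrices are illustrative.

```python
import math

def softmax(row):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(row)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Single-head scaled dot-product attention on lists of row vectors."""
    d = len(K[0])
    out = []
    for q in Q:
        # Step 1: relevance of this query to every key (one row of Q x K^T).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        # Step 2: softmax turns the scores into weights.
        weights = softmax(scores)
        # Step 3: the weights mix the value rows into the output.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
result = attention(Q, K, V)
```

Each output row is a blend of the value rows, weighted toward the value whose key best matches the query.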

3

83

21

107

8K

kenjinoodle retweeted

Aurum 🔫😼

@Anand_naraya

2 days ago

Today I pushed a big step forward on the startup research agent I’ve been building.

The idea is simple: you enter a niche, and the agent helps turn it into a real startup direction.

What it does right now:
- scans competitors
- critiques weak ideas
- ranks the strongest opportunity
- generates landing page copy
- suggests a build plan
- recommends a tech stack

What I’m liking most is that it feels less like a chatbot and more like a workflow. It doesn’t just answer. It investigates, compares, scores, and narrows.

That shift is the real lesson for me: an agent becomes useful when it can follow a process, not just produce output.

We’re still in MVP mode, but the shape is becoming clear.

The next improvements I want are:
- stronger critique logic
- cleaner memory structure
- better idea scoring
- more consistent output across niches
- a more polished public launch experience

Small build. Big learning. And a lot more to ship.

4

17

5

1

623

kenjinoodle retweeted

Changdae Oh ✈️ ACL 2026

@Changdae_Oh

2 days ago

Outcome reward models: cheap, but vulnerable to spurious shortcuts 😣

Process reward models (PRMs): robust, but too expensive to build from scratch 😫

What if you could get a ready-to-use PRM right after any RL post-training?

Introducing 'Progress Advantage' 🧵

7

130

19

133

18K

kenjinoodle retweeted

Shubham Saboo

@Saboo_Shubham_

1 day ago

Claude Tag but 100% Opensource.

Bring your own model, own the runtime, and wire it to your own tools.

Supports
- Generative UI
- Streaming replies
- Human in the Loop approvals

https://t.co/aPPUKp36yF

15

205

32

256

30K

kenjinoodle retweeted

Daniel Han

@danielhanchen

1 day ago

DeepSeek just released DSpark for V4 Flash & Pro, a new speculative decoding method boosting throughput by 51% to 400%!

DS also showed DSpark works well for other models like Gemma & Qwen

Github: https://t.co/EGVYpc1kcK
Paper: https://t.co/TaBMRVlaW9
HF: https://t.co/289jVU2pxh https://t.co/GC31XiVjSK

89

3K

455

2K

346K

kenjinoodle retweeted

Fazl Barez @FazlBarez

2 days ago

This paper will be talked about for years to come. V important!

There are Futures benchmark driven AI cannot see!

led by Sobhan (my fellow) and @Avameanssong w/@kalsbskk81826 Ali, Fateme, @sanmikoyejo, @philiptorr, @yong_suk_lee, @joelbot3000 @NorvigPeter and @random_walker https://t.co/ehBGK8dfsT

4

103

17

109

31K

kenjinoodle retweeted

Google Research

@GoogleResearch

2 days ago

Today on the blog we introduce a method to retrofit Multi-Token Prediction onto frozen production models, accelerating on-device inference without the inefficiencies of separate drafters.

Learn more → https://t.co/9Tq8hosoxS https://t.co/JYQwQkROKa

20

864

96

524

63K
