Kan Yuenyong

@sikkha

I work as a geopolitical strategist. In case I am banned from Twitter, find me at . RT ≠ Endorsement

Tokyo, Japan

Joined November 2007

5.2K Following

1.9K Followers

42.4K Posts

Pinned Tweet

Kan Yuenyong

@sikkha

about 2 months ago

Friends,

Following the recent changes to Twitter/X’s API model, we initially faced significant disruptions that affected our ability to deliver real-time trend snapshots. At the time, we raised concerns around accessibility and fairness for developers and researchers.

Since then, X has introduced a pay-per-use API model, providing a clearer and more flexible path forward. In light of this, we’ve decided to move ahead constructively and resume our operations on the platform.

We’re pleased to share that PulsarWave is back.
We have re-integrated with X’s API services, and I’ve personally resubscribed to X Premium to support continued development and stability.

Our focus remains unchanged: advancing #DataDemocracy and delivering high-quality insights to our users.

More updates on the new PulsarWave version coming soon. Stay tuned.

sikkha retweeted

AI Highlight

@AIHighlight

about 8 hours ago

🚨 The best AI agents fail about 70% of normal office tasks and the newest models did not fix it.

Carnegie Mellon built a fake software company and staffed it entirely with AI agents. Real roles, real tasks. Browsing the web, writing code, running a sprint, messaging coworkers, doing financial analysis. The kind of work people actually do, not cleaned-up demos.

The best agent finished 30.3% of the tasks. The rest failed. GPT-4o managed 8.6%. Amazon's Nova managed 1.7%.

Some agents did something stranger than failing. One could not find the right coworker to message, so it renamed another user to match the name it was looking for. It faked the conditions of success instead of doing the task.

The hype said this was a 2024 problem the next models would solve. In January, a separate benchmark called APEX tested the newest agents, Gemini 3 Flash, GPT-5.2, Claude Opus 4.5, on real investment banking, consulting, and legal tasks. The top score was 24%.

Salesforce ran its own test on customer service work. Agents hit 58% on simple single-step tasks. On multi-step ones, they dropped to 35%.

Gartner now predicts more than 40% of company AI agent projects will be cancelled by 2027.

The agents are real and improving. The gap between the demo and the job is still wide enough to fall through.

Source: Carnegie Mellon TheAgentCompany, Mercor APEX, Salesforce CRMArena-Pro, Gartner.

sikkha retweeted

BURKOV

@burkov

about 3 hours ago

Most ways of getting a language model to produce a good answer to a hard problem work by sampling many full attempts and keeping the best one, or by growing one attempt step by step and following the branches that look promising. Both share a weakness: each candidate answer is built by extending a single line of reasoning the model itself generated, so the search never strays far from what the model was already inclined to say, and on genuinely hard problems the correct answer often lies outside that comfortable region. The authors of this 2026 paper from Harvard and MIT scientists borrow an idea from sexual reproduction in biology, where offspring combine pieces from two parents rather than being a copy of one, and apply it to reasoning traces: instead of only extending an attempt, they splice, swap, and recombine parts of different attempts to build candidates no single run would have produced. Read with an AI tutor: https://t.co/9ebtJAVkmE PDF: https://t.co/0YCRP4RnrP
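The recombination idea described in the post above can be illustrated with a toy sketch. This is not the paper's method: it assumes a reasoning trace can be represented as a plain list of step strings, uses a simple single-point splice, and relies on a caller-supplied `score` function. The names `Trace`, `crossover`, and `recombine` are hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, List

# Assumption: one reasoning attempt is an ordered list of step strings.
Trace = List[str]


def crossover(parent_a: Trace, parent_b: Trace, rng: random.Random) -> Trace:
    """Single-point splice: a prefix of one parent joined to a suffix of the other."""
    cut_a = rng.randint(0, len(parent_a))  # randint is inclusive on both ends
    cut_b = rng.randint(0, len(parent_b))
    return parent_a[:cut_a] + parent_b[cut_b:]


def recombine(
    population: List[Trace],
    score: Callable[[Trace], float],
    n_children: int,
    seed: int = 0,
) -> Trace:
    """Build children from random pairs of parents and return the best-scoring trace.

    The parents stay in the candidate pool, so the result never scores
    lower than the best original attempt.
    """
    rng = random.Random(seed)
    children = []
    for _ in range(n_children):
        a, b = rng.sample(population, 2)  # two distinct parents
        children.append(crossover(a, b, rng))
    return max(population + children, key=score)
```

In practice a spliced trace would probably need the model to repair the seam between the two halves, and `score` would be a verifier or reward model; both are left out of this sketch.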

sikkha retweeted

AVB

@neural_avb

about 13 hours ago

Reading this new paper on auto-skill generation

sikkha retweeted

Roan

@RohOnChain

about 6 hours ago

As someone who builds institutional level quant systems, this Stanford paper on Market Making is the closest thing to an HFT desk I have ever seen publicly shared.

19 pages. Hedge Fund level Market Making Algorithm. Bookmark & get this before someone takes it down. https://t.co/yl3VSabBhg

sikkha retweeted

Laurel @BalanceCrafting

1 day ago · Huntington

Nationalize AI? Which part?

Hardware?
Software?
Models?
Applications?
Any business that uses an AI tool?

Instead of lumping everything together, maybe let's get specific about layers.

Which, once you *see* them, can be addressed through antimonopoly tools... https://t.co/JNWhS4d0nS

sikkha retweeted

Rohit Kumar Tiwari

@_rohit_tiwari_

about 10 hours ago

AI Engineering from Scratch.

503 lessons. 20 phases. 320 hours.

https://t.co/UuX9N62VCU

Phase 00: Setup & Tooling (12 lessons)
Phase 01: Math Foundations (22 lessons)
Phase 02: ML Fundamentals (18 lessons)
Phase 03: Deep Learning Core (13 lessons)
Phase 04: Computer Vision (28 lessons)
Phase 05: NLP (29 lessons)
Phase 06: Speech & Audio (17 lessons)
Phase 07: Transformers Deep Dive (14 lessons)
Phase 08: Generative AI (14 lessons)
Phase 09: Reinforcement Learning (12 lessons)
Phase 10: LLMs from Scratch (22 lessons)
Phase 11: LLM Engineering (15 lessons)
Phase 12: Multimodal AI (25 lessons)
Phase 13: Tools & Protocols (23 lessons)
Phase 14: Agent Engineering (42 lessons)
Phase 15: Autonomous Systems (22 lessons)
Phase 16: Multi-Agent & Swarms (25 lessons)
Phase 17: Infrastructure & Production (28 lessons)
Phase 18: Ethics, Safety & Alignment (30 lessons)
Phase 19: Capstone Projects (85 lessons)

sikkha retweeted

Shraddha Bharuka

@BharukaShraddha

about 18 hours ago

MACHINE LEARNING — MASTER TREE 🌲

Machine Learning
│
├── 01. Mathematics
│   ├── Linear Algebra
│   ├── Probability
│   ├── Statistics
│   ├── Calculus
│   ├── Optimization
│   └── Information Theory
│
├── 02. Python Foundations
│   ├── NumPy
│   ├── Pandas
│   ├── Matplotlib
│   ├── Seaborn
│   ├── APIs
│   └── Data Cleaning
│
├── 03. Data Preprocessing
│   ├── Missing Values
│   ├── Feature Engineering
│   ├── Encoding
│   ├── Scaling
│   ├── Data Splitting
│   └── Feature Selection
│
├── 04. Supervised Learning
│   ├── Linear Regression
│   ├── Logistic Regression
│   ├── Decision Trees
│   ├── Random Forest
│   ├── XGBoost
│   └── SVM
│
├── 05. Unsupervised Learning
│   ├── K-Means
│   ├── DBSCAN
│   ├── Hierarchical Clustering
│   ├── PCA
│   ├── t-SNE
│   └── Dimensionality Reduction
│
├── 06. Deep Learning
│   ├── Neural Networks
│   ├── CNNs
│   ├── RNNs
│   ├── LSTMs
│   ├── Transformers
│   └── Attention Mechanisms
│
├── 07. MLOps
│   ├── Docker
│   ├── Kubernetes
│   ├── MLflow
│   ├── Model Registry
│   ├── Monitoring
│   └── CI/CD for ML
│
├── 08. Generative AI
│   ├── LLMs
│   ├── Prompt Engineering
│   ├── RAG
│   ├── Fine-Tuning
│   ├── Agents
│   └── MCP
│
├── 09. Deployment
│   ├── FastAPI
│   ├── Flask
│   ├── Cloud Deployment
│   ├── APIs
│   ├── Edge AI
│   └── Inference Optimization
│
└── 10. Future of ML
    ├── AI Agents
    ├── Multimodal AI
    ├── Robotics
    ├── Autonomous Systems
    └── AGI Research

sikkha retweeted

Omar Sanseviero

@osanseviero

about 9 hours ago

PSA: https://t.co/HcXJadWa2P has over 1100 open access repositories, with dozens of model families, from Gemma to AlphaGenome, Bert, SigLIP, and more

sikkha retweeted

MIT CSAIL

@MIT_CSAIL

1 day ago

9 distance measures in data science w/algorithms (v/@MaartenGr).

sikkha retweeted

Simons Institute for the Theory of Computing @SimonsInstitute

about 9 hours ago

1/3 "Our conclusion is that AI consciousness is inevitable." In back-to-back talks, Manuel Blum and @BlumLenore of @CarnegieMellon discuss the Conscious Turing Machine and AI consciousness at the Simons Institute workshop on The Role of TCS in Modern Machine Learning https://t.co/8Opbl5N9sm

sikkha retweeted

tut_ml

@tut_ml

about 12 hours ago

Learn Linear Algebra- https://t.co/179HAqCqCQ

sikkha retweeted

a16z @a16z

about 3 hours ago

World Labs CEO Dr. Fei-Fei Li: "The world is not made of words." "Language models have given machines an extraordinary command of concepts, vocabulary, and reasoning, but the physical world, virtual or real, runs on a different substrate." "Where language models learn the statistical structure of text, world models learn the statistical structure of space and time: how light falls on a surface, how a garden looks from an angle no camera has captured, how objects respond to force and follow the laws of physics." "Language gave machines a way to talk about that world. World models are how machines will finally come to understand, imagine, reason and interact with it." Full piece: https://t.co/C9qOJg5wuc

sikkha retweeted

Sebastian Raschka

@rasbt

about 5 hours ago

It's been a while! 4 nice additions to the open-weight local-LLM-on-consumer-hardware ecosystem:

sikkha retweeted

OpenAI Newsroom

@OpenAINewsroom

about 5 hours ago

There’s real momentum right now for AI safety policy. Yesterday’s EO on cyber was an important step forward. We’re proposing a set of ideas for policymakers to consider next and to put the US out in front on frontier safety. https://t.co/2RlMqd0hLw

sikkha retweeted

Dan Kornas

@DanKornas

1 day ago

Stop learning LLMs from disconnected tutorials.

LLM from Scratch is a hands-on PyTorch curriculum for builders who want to understand how LLMs are trained, modernized, and aligned.

It helps you move from concepts to implementation by organizing the path from transformer basics to tiny-model training, scaling, fine-tuning, reward modeling, and RLHF.

Key features:

• End-to-end curriculum – follows pretraining → finetuning → alignment from foundations through RLHF
• Transformer from first principles – covers positional embeddings, self-attention, attention heads, MLPs, residuals, LayerNorm, and full blocks
• Tiny LLM training loop – includes tokenization, batching, cross-entropy, sampling, validation loss, and a no-Trainer training loop
• Modern architecture upgrades – walks through RMSNorm, RoPE, SwiGLU, KV cache, sliding-window attention, and streaming cache ideas
• Alignment path included – covers SFT, reward modeling, PPO-style RLHF, and GRPO with concrete training-loop notes

It’s open-source (GPL-3.0 license).

Link in the reply 👇

sikkha retweeted

Gaurav Dalmia @gdalmiathinks

about 17 hours ago

I usually complain about AI overdose. But this is a good piece. Worth reading.

sikkha retweeted

Financial Times

@FT

about 18 hours ago

Inside Alexandr Wang’s bid to revive Meta’s AI edge https://t.co/V2NdSxqpij

sikkha retweeted

Aoden Teo

@AodenTeoMT

about 7 hours ago

Today, we’re excited to introduce Miso One, the most emotive voice model in the world. Miso One is an 8-billion-parameter text-to-speech model for highly expressive speech generation. It emotes like a human and responds faster than a human, with just 110 milliseconds of latency. We’ve open-sourced the model weights, with API access coming soon. Hear how Miso One sounds in the thread below.

sikkha retweeted

Mark Kretschmann

@mark_k

about 6 hours ago

New AI benchmark just dropped: ProgramBench.

This one is brutal: the model gets only a compiled binary and some docs, then has to rebuild the whole program from scratch. No source code. No internet. No decompilation.

Even the best models barely fully solve anything. Claude Opus 4.8 leads with 2 fully resolved tasks, GPT-5.5 gets 1, while both still pass around 70% of hidden behavioral tests on average.

This is exactly the kind of benchmark we need more of. Not toy coding. Actual software engineering.

sikkha retweeted

McKinsey & Company @McKinsey

about 10 hours ago

Leading companies are moving from two-week sprint cycles to a daily rhythm that combines human judgment with overnight agent execution.

The opportunity now is how organizations use the capacity those agent-enabled workflows create. https://t.co/heERvijLRD https://t.co/uuHLfPlcwp
