🅳🅾︎🅼🅴 @dome_cs - Twitter Profile

🅳🅾︎🅼🅴 @dome_cs

9 days ago

0

1

dome_cs retweeted

Sakana AI

@SakanaAILabs

28 days ago

Introducing DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation https://t.co/c9AvsRKybj

What if we didn’t have to hold an entire neural network in memory to train it?

Standard neural net training optimizes all parameters jointly. As a result, the memory required during training grows linearly with the depth of the network.

In our #ICLR2026 paper, we propose DiffusionBlocks, a principled framework to train networks one block at a time, drastically reducing memory requirements while matching end-to-end performance.

With DiffusionBlocks, we split the network into blocks and train them one at a time, so you only need memory for a single block.

How? We explicitly assign each block a role: to move the representation a little closer to the target than the block before it did. That role turns out to be precisely what a diffusion model does, step by step. Each block only needs to optimize its own objective and can be trained independently.

We validated this across five different architectures:

• ViT
• DiT
• Masked diffusion
• Autoregressive transformers
• Recurrent-depth transformers

In each case, performance is competitive with end-to-end training while using a fraction of the memory.

This perspective also extends naturally to recurrent-depth (Looped) transformers, which apply the same network iteratively and normally require expensive backpropagation through time (BPTT). Viewed through DiffusionBlocks, we can replace those multiple iterations with a single forward pass during training.

Read our paper and code to learn more.

Paper: https://t.co/CRj96VGYQn
GitHub: https://t.co/eNW0K9Xh8E 🐟
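
To make the memory claim in the tweet above concrete, here is a back-of-the-envelope sketch. It is not taken from the DiffusionBlocks paper or code. It assumes activation memory held for backpropagation scales linearly with the number of layers trained together, and the layer count, block count, and per-layer cost are made-up illustrative numbers.

```python
def activation_memory_gb(num_layers: int, gb_per_layer: float) -> float:
    """Activation memory kept for backprop, assumed proportional to the
    number of layers trained jointly (a simplifying assumption)."""
    return num_layers * gb_per_layer


def blockwise_peak_memory_gb(num_layers: int, num_blocks: int, gb_per_layer: float) -> float:
    """Peak activation memory when only one block is trained at a time."""
    layers_per_block = -(-num_layers // num_blocks)  # ceiling division
    return activation_memory_gb(layers_per_block, gb_per_layer)


# Hypothetical network: 24 layers, 0.5 GB of activations per layer, 4 blocks.
end_to_end = activation_memory_gb(24, 0.5)
blockwise = blockwise_peak_memory_gb(24, 4, 0.5)
print(f"end-to-end: {end_to_end} GB, block-wise peak: {blockwise} GB")
```

Under these assumptions the peak drops by roughly the number of blocks. Real savings depend on parameters, optimizer state, and implementation details that this sketch ignores.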

56

2K

366

2K

872K

dome_cs retweeted

Roan

@RohOnChain

about 2 months ago

Anthropic pays $750,000+ a year for engineers who can build LLM architectures from scratch. Stanford taught the entire thing in a 1-hour lecture & released it for free. Bookmark & watch this today before someone takes it down.

116

10K

2K

21K

3M

dome_cs retweeted

Ralph Sueppel

@macro_synergy

about 2 months ago

"Getting the Target Right in Return Prediction": "Transforming the target from raw to standardized or rank-based returns nearly triples predictive accuracy and doubles portfolio returns [based on machine learning]." https://t.co/EBdJHkmV60

macro_synergy's tweet photo: https://t.co/BUF8LiAh05
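
For readers unfamiliar with the target transformations named in the quote above, here is a minimal sketch of the two ideas on one cross-section of returns. The tweet does not say how the paper standardizes or ranks (cross-sectionally, by volatility, or otherwise), so the z-score and scaled-rank definitions here are assumptions, and the return values are invented.

```python
from statistics import mean, pstdev


def standardize(returns):
    """Z-score each return against the cross-section (assumed definition)."""
    mu = mean(returns)
    sigma = pstdev(returns)
    return [(r - mu) / sigma for r in returns]


def rank_transform(returns):
    """Replace each return by its rank, scaled to [-0.5, 0.5].
    Ties are not handled specially in this sketch."""
    n = len(returns)
    order = sorted(range(n), key=lambda i: returns[i])
    ranks = [0.0] * n
    for rank, i in enumerate(order):
        ranks[i] = rank / (n - 1) - 0.5
    return ranks


# Hypothetical one-period returns for five assets, including one outlier.
raw = [0.02, -0.01, 0.15, 0.00, -0.30]
z_scores = standardize(raw)
ranks = rank_transform(raw)
print(z_scores)
print(ranks)
```

The rank target caps the influence of the -0.30 outlier, while the standardized target only rescales it.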

1

74

10

115

7K

dome_cs retweeted

Symplectic.Research

@QuantSymplectic

about 2 months ago

Black-Scholes is wrong almost everywhere.
And yet, it’s still the language of options markets.

The reason: It’s the flat limit of a curved geometric pricing space.

The volatility smile? That’s the curvature.

Below we see where markets actually live in that space.
Preprint: https://t.co/uD7NVupSsx

29

803

87

897

59K

dome_cs retweeted

Peter - Cracking Markets

@SystematicPeter

2 months ago

A systematic portfolio does not have to be complicated.

If I were starting from scratch, I would not begin with 25 exotic strategies and endless optimization.

I would start with a few simple, different return drivers:

- stock momentum
- slow long mean reversion on stocks
- faster long mean reversion on stocks
- simple intraday system on indices

The goal is not to find one perfect system.

The goal is to combine simple systems that behave differently, make money in different market conditions, and reduce dependence on any single edge.

This chart shows the main strategies in my own portfolio applied to a smaller account, with slippage and commissions included.

Simple ideas.
Different behavior.
One systematic portfolio.

Deep dive with detailed statistics, updated daily:
https://t.co/QqtucCENtB

5

110

14

158

8K

dome_cs retweeted

Symplectic.Research

@QuantSymplectic

2 months ago

As a grad student working on Hamiltonian systems in General Relativity, I often wondered what the phase-plane approach from dynamical systems theory could tell us about markets.

Today I submitted the third paper in that answer:
Information Geometry of Market Dynamics: A Pareto Frontier from Contact Geometry.

Preprint: https://t.co/NXpR7VxFlw
Code (Zenodo): https://t.co/uppcCQwoo1

8

43

11

46

10K

dome_cs retweeted

Valeriy M., PhD, MBA, CQF

@predict_addict

2 months ago

Solid mathematical ideas almost always outperform contrived engineering tricks.

For years deep learning has been dominated by increasingly complex architectural hacks: CNN blocks, attention layers, channel mixers, residual pathways, normalization stacks.

Every few years a new architecture is announced as if it were a revolution.

One of the most famous examples was Kaiming He and Residual Networks (ResNet). At the time he was paraded around the AI world like a celebrity because residual connections supposedly “solved” deep learning.

But these were largely engineering patches.

Now something much more interesting appeared.

A new architecture called CliffordNet returns to mathematics, specifically Clifford Algebra, developed in the 19th century by William Kingdon Clifford.

Instead of stacking arbitrary modules, the model is built around the geometric product

uv = u·v + u∧v

A single algebraic operation that simultaneously captures inner product structure and geometric interactions.

In other words: the math already contains the interaction mechanism.

No attention blocks.
No mixer layers.
No architectural spaghetti.

The result:

• 77.82% accuracy on CIFAR-100 with only 1.4M parameters
• roughly 8× fewer parameters than ResNet-18

And with strict O(N) complexity.

The paper even suggests that once geometric interactions are modeled correctly, feed-forward networks become largely redundant.

A good reminder for the AI community.

Engineering tricks can dominate for years.

But eventually mathematics shows up and deletes half the architecture.

Paper:
https://t.co/QIkCCO1tYs

19th century geometry just walked into computer vision.
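
The identity uv = u·v + u∧v quoted in the tweet can be checked by hand for two plain 2D vectors. The sketch below only illustrates that algebraic identity and is not CliffordNet's layer or code. The function name and the choice to return the result as a (scalar, bivector coefficient) pair are assumptions made for illustration.

```python
def geometric_product_2d(u, v):
    """Geometric product of two 2D vectors.

    Returns (scalar_part, bivector_part), where the scalar part is the
    inner product u.v and the bivector part is the coefficient of e1^e2
    in the wedge product u^v.
    """
    dot = u[0] * v[0] + u[1] * v[1]
    wedge = u[0] * v[1] - u[1] * v[0]
    return dot, wedge


# Orthogonal unit vectors: no inner-product part, pure bivector.
print(geometric_product_2d((1, 0), (0, 1)))
# Parallel vectors: pure scalar, the wedge part vanishes.
print(geometric_product_2d((2, 0), (3, 0)))
# General case: both parts are present.
print(geometric_product_2d((1, 2), (3, 4)))
```

Swapping the arguments keeps the scalar part and flips the sign of the bivector part. That is how one product carries both the symmetric and the antisymmetric interaction.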

25

905

127

965

84K

dome_cs retweeted

Quantocracy @Quantocracy

2 months ago

The AutoTune filter [Financial Hacker] https://t.co/z29ztbes9U

0

3

2

7

988

dome_cs retweeted

Lianghui Zhu @lianghui_zhu

2 months ago

For a decade, we've made models wider and deeper, but we've barely changed how layers *talk* to each other.

Since ResNet's `x + F(x)` in 2015, the depth residual has been the only highway for inter-layer communication.

It's time to upgrade the staircase. 🧵 https://t.co/KIvzN4w9dT
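
As a reference point for the `x + F(x)` pattern mentioned above, here is a minimal pure-Python sketch of a residual connection and a stack of such layers. It shows only the standard ResNet-style skip, not the method proposed in the thread. The helper names and the toy layer function are hypothetical.

```python
def residual(x, f):
    """Apply one residual layer: return x + F(x), elementwise."""
    fx = f(x)
    return [xi + fi for xi, fi in zip(x, fx)]


def stack(x, layers):
    """Chain residual layers; each layer only sees the running sum."""
    for f in layers:
        x = residual(x, f)
    return x


def half(x):
    """Toy stand-in for a learned layer F: scale every element by 0.5."""
    return [0.5 * xi for xi in x]


def zero(x):
    """A layer that contributes nothing, so the input passes through."""
    return [0.0 for _ in x]


print(stack([1.0, 2.0], [half, half]))
print(stack([1.0, 2.0], [zero, zero]))
```

Each layer communicates with later ones only through the accumulated sum.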

18

2K

238

2K

189K

dome_cs retweeted

XO Labs

@xolabs_

2 months ago

https://t.co/v155PpGfgJ

2

122

11

265

40K

dome_cs retweeted

Gappy (Giuseppe Paleologo)

@__paleologo

2 months ago

Just a reminder that you can do unintuitive things with ordinary differential equations. Oldie but goldie paper.

11

1K

96

789

42K

dome_cs retweeted

Liquidity Goblin @liquiditygoblin

2 months ago

when fitting curves on short dated options, do you fit to the bid / ask respectively and then take the mid of the vols? or do you take the mid of the prices and then take the vol of that?

what if neither will give you the full picture?

a thread on fitting 0dte curves🧵

1/11 https://t.co/bn2JH6OBzT
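
The two options in the question above can be compared numerically. This is a rough sketch, not the thread's method. It assumes a European call priced with Black-Scholes at a zero rate and inverts it by bisection, and the spot, strike, expiry, and quotes are invented numbers.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, t, vol, rate=0.0):
    """Black-Scholes price of a European call."""
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)


def implied_vol(price, spot, strike, t, rate=0.0, lo=1e-4, hi=5.0):
    """Invert bs_call by bisection; the call price is increasing in vol."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, t, mid, rate) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical one-day, slightly out-of-the-money call with a wide market.
spot, strike, t = 100.0, 101.0, 1.0 / 365.0
bid, ask = 0.10, 0.20

vol_of_mid_price = implied_vol(0.5 * (bid + ask), spot, strike, t)
mid_of_vols = 0.5 * (implied_vol(bid, spot, strike, t) + implied_vol(ask, spot, strike, t))
print(f"vol of mid price: {vol_of_mid_price:.4f}")
print(f"mid of vols:      {mid_of_vols:.4f}")
```

Because price is a nonlinear function of volatility, the two midpoints generally differ, and the gap widens as the bid/ask spread grows relative to the option's price.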

14

258

12

409

28K

dome_cs retweeted

机器之心 JIQIZHIXIN

@jiqizhixin

2 months ago

Huge! Recurrent neural networks could match Transformer memory without the quadratic burden!

Ali Behrouz from Google and colleagues have cracked it!

They present Memory Caching (MC), a simple yet powerful method that lets RNNs store "memory checkpoints" of their internal states. This allows their effective memory to grow with context, offering a flexible trade-off between speed and recall.

MC dramatically enhances recurrent models in language modeling and long-context understanding. It significantly closes the performance gap with Transformers on recall tasks and outperforms existing state-of-the-art recurrent models.

14

461

87

394

37K

dome_cs retweeted

annie @_annieversary

2 months ago

what the Fuck https://t.co/jSWRRiEBQ5

188

8K

792

5K

1M

dome_cs retweeted

Roan

@RohOnChain

3 months ago

This is the EXACT 12-step methodology Institutional quant desks use to win every single trade.

Bookmark & run it through your stack or just pass it directly to your AI coding agent. Most people never reach this layer in their entire lifetime.

Full breakdown in article below. https://t.co/wDPXi8lZVD

21

1K

135

2K

166K

🅳🅾︎🅼🅴 @dome_cs

3 months ago

@plus_vision_div I would be very excited to see a product like this someday. Thank you for your great work!

0

18

🅳🅾︎🅼🅴 @dome_cs

3 months ago

@plus_vision_div I noticed you also make whiteboards, so I wanted to ask—do you have any plans to bring this magnetic writing technology to whiteboards in the future? I think it could solve common issues like dried-out markers and stains from unwiped ink, while also being more eco-friendly.

1

0

21

dome_cs retweeted

Google Research

@GoogleResearch

3 months ago

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: https://t.co/CDSQ8HpZoc

1K

39K

6K

22K

19M
