The Underfitted

Verified account

@SimplySigmoid

jr AI Engineer | Transformers × Financial Data | CS Student

Joined January 2022

150 Following

504 Followers

1.9K Posts

Pinned Tweet

The Underfitted

3 months ago

"The model will converge anyway" - a compelling argument, and a costly misconception.

Two models. Same everything.

Model A, step 0: loss = 3.29 → starts learning immediately.

Model B, step 0: loss = 27 → spends the first 9,000 steps just getting back to where A started.

Model B didn't train for 10k steps. It trained for ~800.

The rest was debt repayment.

A bad initialization doesn't slow you down. It steals your training budget, silently, one "optimization" step at a time.
🧵
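
A rough way to see where a number like 3.29 can come from: a softmax classifier that predicts every class with equal probability has cross-entropy ln(C), and ln(27) ≈ 3.296. The 27-class vocabulary below is an assumption (the post does not state it), and the Gaussian logits are a stand-in for a freshly initialized output layer, not the author's actual models. The sketch only illustrates that larger initial logits push the step-0 loss well above the uniform baseline.

```python
import math
import random


def cross_entropy(logits, target):
    """Cross-entropy of one example, using a numerically stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def mean_initial_loss(vocab_size, logit_scale, samples=2000, seed=0):
    """Average step-0 loss when logits are drawn from N(0, logit_scale^2).

    Hypothetical stand-in for an untrained output layer: small logit_scale
    mimics a careful init, large logit_scale mimics a careless one.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        logits = [rng.gauss(0.0, logit_scale) for _ in range(vocab_size)]
        target = rng.randrange(vocab_size)
        total += cross_entropy(logits, target)
    return total / samples


VOCAB_SIZE = 27  # assumption: not stated in the post

uniform_baseline = math.log(VOCAB_SIZE)        # loss of a perfectly uniform guess
small = mean_initial_loss(VOCAB_SIZE, 0.01)    # near-zero logits at init
large = mean_initial_loss(VOCAB_SIZE, 10.0)    # oversized logits at init

print(f"uniform baseline: {uniform_baseline:.4f}")
print(f"small-logit init: {small:.4f}")
print(f"large-logit init: {large:.4f}")
```

Comparing the step-0 loss against ln(C) before launching a long run is a cheap sanity check: a value far above the baseline suggests the network starts out confidently wrong.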

The Underfitted

3 months ago

@CodeEdison if it were that simple https://t.co/7rLYbqBshk

The Underfitted

3 months ago

@romerojr__ @claudeai Appreciate!

The Underfitted

3 months ago

@tetsuoai king https://t.co/PKcBlrcMOf

The Underfitted

3 months ago

@vivoplt Also don't forget about this gem https://t.co/4cGN4uJNHd

The Underfitted

3 months ago

@BTechWalaBanda @karthikponna19 help with what?

The Underfitted

3 months ago

@RoundtableSpace https://t.co/NSN01CC0PT

The Underfitted

3 months ago

@vivoplt having a good foundation is true leverage in this AI era https://t.co/YkTI4AuzJj

SimplySigmoid retweeted

The Underfitted

3 months ago

It's 6PM on a Saturday.

Karpathy on screen. Handwritten notes on the desk. VS Code open with makemore_from_scratch.

No tutorial. No shortcut. Just activation functions, neuron flow through layers, and the slow satisfaction of actually understanding what's happening inside the network.

Week by week.
Layer by layer.

The Underfitted

3 months ago

@Shivam25mishra you need a solid foundation to build your house https://t.co/PdLnMdo0gm

The Underfitted

3 months ago

@Shruti_0810 don't forget about this GEM https://t.co/59oTeu4Ypj

The Underfitted

3 months ago

@VadimStrizheus that's what a single prompt could do https://t.co/csvv0ApJ4g

The Underfitted

3 months ago

@quantscience_ if it were that simple https://t.co/yHaQ7n3Iet

The Underfitted

3 months ago

@anthdm quants era incoming https://t.co/Yns2ir13TF

The Underfitted

3 months ago

@elonmusk better than my automated system https://t.co/7J92cqUYzg

The Underfitted

3 months ago

@bigaiguy a comprehensive transformer breakdown by Claude

The Underfitted

3 months ago

@nalinrajput23 math https://t.co/03R1wQ92Uq
