I use AI every single day, so I'm definitely not against it. I think it's one of the most powerful tools humans have ever created, but I do think there's a big difference between using AI to help you think better and using AI so you don't have to think at all.
Especially with mathematics, I don't think the point is only to get the answer as fast as possible. A big part of the value is what happens to your mind while you try to understand something. When you struggle with a problem, make mistakes, fix your reasoning, and slowly build intuition, you're not wasting time. You're training your brain.
Mathematics is the single best subject you can study if you want to develop real problem solving ability.
It trains you to think analytically, break problems into smaller parts, and build solutions step by step with complete precision. Unlike most fields, math doesn't reward surface-level understanding. It forces depth.
The specific problems don't need to be directly relevant to the real world. The value comes from the way your brain adapts. You become better at handling complexity, making decisions, and solving unfamiliar problems.
That's why mathematicians and physicists are so valuable in industries like quantitative finance: they have already trained the exact skill that matters the most, which is how to think.
math doesn't work in one pass
every topic in mathematics sits at multiple depths. the first pass shows you the surface. the second shows you what the surface was hiding. the third shows you what the second was building toward. you can revisit the same subject four or five times in your life and find something new every time
take linear algebra. first pass was strang. computational, just vectors and matrices and what a determinant actually does. second pass was proof-based. vector spaces. linear maps. the abstract structure underneath the computations. third pass was roman's advanced linear algebra. proofs got harder, theorems got deeper, things from the earlier passes opened up in ways i didn't see coming. and i'm still not done
every pass through a math subject teaches you something you couldn't have learned the first time. the proofs that confused you start to feel obvious. the theorems you memorized start to feel inevitable.
find the layer that fits where you are right now. don't try to absorb everything on the first read. forgetting a theorem isn't failure. it's the cost of choosing depth over memorization
the point isn't to remember everything. it's to keep coming back until each new pass makes the last one feel obvious
this book rewired my brain.
Øksendal's book on SDEs. heavy. theoretical. it'll take you weeks, maybe months. but it's the cleanest path i've found from measure theory into real stochastic analysis.
Brownian motion gets built properly. then the itô integral. then itô's formula. then diffusions. each one earned from what came before.
the funny part is the subtitle says "with applications." it isn't really an applications book. you spend chapters on martingales before you see a single financial quantity. semigroups and generators sit at the core. optimal stopping shows up before you ever price an option. black-scholes is on the last few pages and almost incidental.
worth every hour. you finish it and stochastic processes stop feeling like magic. they feel like geometry.
draw any loop on a circle. that loop is secretly an integer.
take a circle, pick a point, and draw a loop that starts and ends there. you can describe the whole loop by how many times it wraps around on its way back. once clockwise is +1. twice is +2. counterclockwise gives negatives. no wrap is 0.
here's the move. in topology, two loops are "the same" if you can continuously deform one into the other while keeping the start and end point fixed. and once you allow that, everything about the loop washes out except the winding number. how wiggly it is, where it bunches up, none of it survives. only the wrap count.
so the set of essentially different loops on a circle is just the integers. an infinite topological universe collapses to ℤ.
this is the fundamental group of the circle. π_1(S¹) ≅ ℤ. one of the cleanest theorems in topology, because it takes a continuous question (which loops can deform into which) and gives a purely discrete answer (count the wraps).
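a minimal sketch of the wrap count, assuming the loop is handed over as a list of sampled angles in radians, measured in whichever direction you decided counts as positive, and sampled finely enough that consecutive samples sit less than half a turn apart. `winding_number`, `sample`, and the example loops are hypothetical names for illustration, not from any library.

```python
import math

def winding_number(angles):
    """Net number of wraps of a closed loop on the circle.

    `angles` are samples of the loop's angle in radians. The loop is
    assumed to close up, so the last sample connects back to the first.
    """
    total = 0.0
    n = len(angles)
    for i in range(n):
        step = angles[(i + 1) % n] - angles[i]
        # Reduce each step to [-pi, pi) so we follow the short way around.
        step = (step + math.pi) % (2 * math.pi) - math.pi
        total += step
    return round(total / (2 * math.pi))

def sample(theta, n=400):
    """Sample an angle function theta(t) at t = 0, 1/n, ..., (n-1)/n."""
    return [theta(i / n) for i in range(n)]

# Two loops that wrap twice: one plain, one wiggly with some backtracking.
two_wraps_plain = sample(lambda t: 4 * math.pi * t)
two_wraps_wiggly = sample(
    lambda t: 4 * math.pi * t + 0.5 * math.sin(10 * math.pi * t)
)
# A loop that wanders out and comes back without wrapping.
no_wrap = sample(lambda t: math.sin(2 * math.pi * t))
# A loop that wraps once in the negative direction.
one_wrap_back = sample(lambda t: -2 * math.pi * t)

print(winding_number(two_wraps_plain))   # 2
print(winding_number(two_wraps_wiggly))  # 2
print(winding_number(no_wrap))           # 0
print(winding_number(one_wrap_back))     # -1
```

the plain and wiggly loops land on the same integer, which is the point: the wiggles don't survive, only the wrap count does.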
Yeah, it blows my mind how the Radon-Nikodym derivative dμ/dν shows up directly in mathematical finance. It's the object behind every change of measure in derivatives pricing, and Girsanov's theorem is essentially the RN derivative applied to Brownian motion.
One of those theorems where you prove it as a pure measure theory abstraction and then watch it become the foundation of an entire applied field.
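A rough numerical sketch of that reweighting, in the simplest case I can think of: a constant drift theta, looking only at the terminal value W_T. Under the original measure P, W_T is N(0, T). Weighting each sample by Z = exp(theta * W_T - theta^2 * T / 2), which plays the role of dQ/dP here, should make W_T look like it has mean theta * T. The function name `reweighted_mean` and the parameter values are assumptions for illustration only.

```python
import math
import random

def reweighted_mean(theta, T, n_samples, seed=0):
    """Estimate E_Q[W_T] as E_P[Z * W_T] by plain Monte Carlo.

    Samples W_T under P (mean 0, variance T) and reweights each sample
    by the density Z = exp(theta * W_T - 0.5 * theta**2 * T).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        w = rng.gauss(0.0, math.sqrt(T))
        z = math.exp(theta * w - 0.5 * theta ** 2 * T)
        total += z * w
    return total / n_samples

theta, T = 0.5, 1.0
estimate = reweighted_mean(theta, T, n_samples=200_000)
print(estimate, "should be close to", theta * T)
```

No sample is ever drawn under the new measure. The drift shows up purely through the weights, which is what a change of measure means in practice.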
a sequence of functions can converge to f in Lp without converging to f at a single point.
classic example: the typewriter sequence. bumps that slide across [0, 1] with shrinking width but constant height. the L¹ norm collapses to 0. but at any fixed x, f_k(x) keeps flipping between 0 and 1 forever.
Lp convergence isn't about points. it's about the norm collapsing. the topology forgets local behavior and only remembers integrals.
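a small sketch of the typewriter sequence, assuming the usual dyadic indexing: write k = 2^n + j with 0 <= j < 2^n and let f_k be the indicator of [j/2^n, (j+1)/2^n]. the function names `typewriter` and `l1_norm` are made up for this example.

```python
def typewriter(k, x):
    """Value at x of the k-th typewriter function, for k >= 1.

    With k = 2**n + j and 0 <= j < 2**n, f_k is the indicator of the
    interval [j / 2**n, (j + 1) / 2**n].
    """
    n = k.bit_length() - 1
    j = k - (1 << n)
    return 1.0 if j / 2 ** n <= x <= (j + 1) / 2 ** n else 0.0

def l1_norm(k):
    """L^1 norm of f_k on [0, 1]: the width of its bump, 2**-n."""
    n = k.bit_length() - 1
    return 2.0 ** -n

x = 0.3
values = [typewriter(k, x) for k in range(1, 2 ** 10)]

# The norms shrink to 0 along the sequence.
print([l1_norm(k) for k in (1, 2, 4, 8, 16, 1024)])

# At the fixed point x, every dyadic level hits 1 once and is 0 otherwise,
# so the values never settle down.
hits_per_level = [
    sum(values[2 ** n - 1 : 2 ** (n + 1) - 1]) for n in range(10)
]
print(hits_per_level)
```

every sweep across [0, 1] passes over x exactly once, so f_k(x) returns to 1 infinitely often while the bump width, and with it the norm, keeps halving.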
what's actually satisfying is how much theory shows up the moment you ask why.
the unit ball of Lp isn't compact in the norm topology. that's a functional analysis fact, and it's why a bounded sequence in Lp doesn't have to have a norm convergent subsequence. pointwise behavior is a separate question, and there the riesz subsequence theorem rescues something: given any Lp convergent sequence, there's a subsequence that converges almost everywhere. a faint trace of compactness restored by measure theory.
and then banach-alaoglu shows up to finish the job. norm compactness fails, but for 1 < p ≤ ∞, where Lp is the dual of another banach space, the closed unit ball is compact in the weak-* topology. you lose pointwise control. you keep geometric control. just in a weaker form.
real analysis, measure theory, functional analysis, topology, and a piece of geometry, all hiding inside a single statement: f_k → f in Lp.
one definition. five subjects answering it. that's why functional analysis feels like the spine of modern math.
the martingale approach is beautiful because it makes dμ/dν literal. you build h_n on finer and finer partitions, doob's convergence hands you the limit, and the radon-nikodym derivative falls out as the limit of μ(A_n)/ν(A_n) over shrinking cells. that's the construction i'd use if i were teaching probability.
the riesz proof i used is slicker in the abstract setting but you never get to see what dμ/dν actually 'is'. martingales need a clean partition structure. riesz works for any sigma-finite measures without needing one. riesz when you want pure generality, martingales when you want to see what the derivative actually is.
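a toy version of the shrinking-cells picture, under assumptions i'm choosing for the example: ν is lebesgue measure on [0, 1), μ has density 2x against it, and the partitions are the dyadic intervals. `mu`, `nu`, and `h_n` are hypothetical names. the ratio over the cell containing x should close in on 2x.

```python
import math

def mu(a, b):
    """Assumed example measure: mu([a, b)) for the density 2x on [0, 1)."""
    return b * b - a * a

def nu(a, b):
    """Lebesgue measure of [a, b)."""
    return b - a

def h_n(x, n):
    """Ratio mu(A_n) / nu(A_n) over the level-n dyadic cell A_n containing x."""
    cells = 2 ** n
    j = math.floor(x * cells)
    a, b = j / cells, (j + 1) / cells
    return mu(a, b) / nu(a, b)

x = 0.3
for n in (1, 2, 4, 8, 16, 20):
    print(n, h_n(x, n))  # approaches 2 * x = 0.6 as the cells shrink
```

here the limit can be read off by hand, since (b² - a²)/(b - a) = a + b and both endpoints squeeze onto x. the martingale convergence theorem is what guarantees a limit when no formula is available.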
spent today proving radon-nikodym.
if one measure is absolutely continuous with respect to another, there's a function h that links them. integrating against h recovers the first from the second. it's the formal way to say "this measure is just that one, reweighted".
without this, conditional expectation isn't a real mathematical object. probability collapses.
the beautiful part is the move you don't expect. you don't construct h directly. you combine both measures into a sum, work in the hilbert space L^2 over that sum, and define a linear functional that integrates against the first measure. riesz representation hands you a function g. a few indicator tricks show 0 <= g < 1 almost everywhere. and the h you wanted falls out as g/(1-g).
you never build it. the hilbert space geometry summons it.
probability rests on a riesz argument applied to the right space.
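the steps above, written out as a sketch. assumptions i'm adding: μ and ν are finite measures on a space X with μ ≪ ν (the sigma-finite case is a standard patching argument on top), and λ = μ + ν is my name for the sum.

```latex
\begin{align*}
&\text{Setup: } \lambda = \mu + \nu, \qquad
  \varphi(f) = \int f \, d\mu \quad \text{for } f \in L^2(\lambda). \\
&\text{Boundedness: } |\varphi(f)|
  \le \mu(X)^{1/2} \, \|f\|_{L^2(\mu)}
  \le \mu(X)^{1/2} \, \|f\|_{L^2(\lambda)}. \\
&\text{Riesz: there is } g \in L^2(\lambda) \text{ with }
  \int f \, d\mu = \int f g \, d\lambda
  \quad \text{for all } f \in L^2(\lambda). \\
&\text{Rearranged: } \int f (1 - g) \, d\mu = \int f g \, d\nu. \\
&\text{Take } f = \mathbf{1}_{\{g < 0\}}: \quad
  0 \le \mu(\{g < 0\}) = \int_{\{g < 0\}} g \, d\lambda \le 0
  \ \Longrightarrow\ \lambda(\{g < 0\}) = 0. \\
&\text{Take } f = \mathbf{1}_{\{g \ge 1\}}: \quad
  0 \ge \int_{\{g \ge 1\}} (1 - g) \, d\mu
  = \int_{\{g \ge 1\}} g \, d\nu \ge \nu(\{g \ge 1\})
  \ \Longrightarrow\ \nu(\{g \ge 1\}) = 0, \\
&\qquad \text{and then } \mu(\{g \ge 1\}) = 0 \text{ because } \mu \ll \nu. \\
&\text{So } 0 \le g < 1 \ \ \lambda\text{-a.e. Take }
  f = \frac{\mathbf{1}_A}{1 - g}
  \text{ (via truncation and monotone convergence):} \\
&\qquad \mu(A) = \int_A \frac{g}{1 - g} \, d\nu,
  \qquad h = \frac{g}{1 - g}.
\end{align*}
```

absolute continuity is used exactly once, to carry the ν-null set {g ≥ 1} over to μ. everything else is hilbert space bookkeeping.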
P(A) = 1 doesn't mean A is guaranteed.
if you pick a number uniformly between 0 and 1, the probability it's irrational is 1, and the probability it's rational is 0. but it's still possible to pick a rational. it just happens with probability 0.
probability 1 doesn't mean certain. it means "almost surely". the exceptions exist, they just live on a set of measure zero, which probability theory treats as negligible.
even certainty comes with a margin of measure 0.
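the one-line computation behind the rational example, assuming X is uniform on [0, 1]: each single point has probability 0, the rationals are countable, and countable additivity does the rest.

```latex
\[
P\bigl(X \in \mathbb{Q} \cap [0,1]\bigr)
  = \sum_{q \in \mathbb{Q} \cap [0,1]} P(X = q)
  = \sum_{q \in \mathbb{Q} \cap [0,1]} 0
  = 0,
\qquad
P\bigl(X \notin \mathbb{Q}\bigr) = 1 - 0 = 1.
\]
```

the same sum is not available for the irrationals, because they are uncountable and additivity only covers countable unions.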