Every published Swiss court decision, resolved to its 6.5M citations and 11.3M statute references. Trilingual. Daily updates. Parquet on HF. Built so LLMs stop guessing and start citing. https://t.co/RTsUzUtTkI #openaccess #opendata #opengovernment
The internet runs on a coincidence of atomic physics. Erbium emits light at around 1,550 nanometers. Silica glass fiber loses the least signal at around 1,550 nanometers. One is a quantum property of a rare earth element, the other is an optical property of melted sand. They have nothing to do with each other. It is pure luck. Before erbium-doped fiber amplifiers, every undersea signal had to be converted from light to electricity and back every 50 kilometers. Each conversion degraded the signal and capped bandwidth. Erbium removed that cap. An erbium amplifier sitting on the floor of the ocean boosts signal power about 1,000 times and runs for decades without maintenance. 99% of intercontinental data moves through glass strands no thicker than a human hair, amplified by a rare earth element that just so happens to emit at the right wavelength. And erbium isn't even the strangest one.
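For a sense of scale, the 1,000-times figure can be put in decibels, the unit fiber engineers use. The sketch below is only back-of-the-envelope arithmetic: the 0.2 dB/km loss is an assumed, typical textbook value for silica fiber near 1,550 nm, not a number from the post, and the function names are made up for illustration.

```python
import math


def power_ratio_to_db(ratio: float) -> float:
    """Convert a power ratio (e.g. 1000x) to decibels."""
    return 10.0 * math.log10(ratio)


def db_to_power_ratio(db: float) -> float:
    """Convert decibels back to a power ratio."""
    return 10.0 ** (db / 10.0)


# ASSUMPTION: typical attenuation of silica fiber near 1,550 nm.
ASSUMED_LOSS_DB_PER_KM = 0.2

# The post's figure: an amplifier boosting signal power 1,000 times.
gain_db = power_ratio_to_db(1000.0)

# Length of fiber whose loss that gain would cancel out.
span_km = gain_db / ASSUMED_LOSS_DB_PER_KM

# Loss over the 50 km spacing mentioned in the post, in dB and as a ratio.
loss_50km_db = 50.0 * ASSUMED_LOSS_DB_PER_KM
loss_50km_ratio = db_to_power_ratio(loss_50km_db)

print(f"gain: {gain_db:.1f} dB")
print(f"fiber span offset by that gain: {span_km:.0f} km")
print(f"loss over 50 km: {loss_50km_db:.1f} dB ({loss_50km_ratio:.0f}x weaker)")
```

Under that assumed loss figure, a 1,000-times boost is about 30 dB, which would offset roughly 150 km of fiber, while 50 km costs only about a factor of 10 in power.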
I'm rebuilding AlphaFold2 from scratch in pure PyTorch.
No frameworks on top of PyTorch. No copy-paste from DeepMind's repo. Just nn.Linear, einsum, and the 60-page supplementary paper.
The project is called minAlphaFold2, inspired by Karpathy's minGPT. The idea is simple: AlphaFold2 is one of the most important neural networks ever built, and there should be a version of it that a single person can sit down and read end-to-end in an afternoon.
Where it stands today:
- ~3,500 lines across 9 modules
- Full forward pass works: input embedding → Evoformer → Structure Module → all-atom 3D coordinates
- Every loss function from the paper (FAPE, torsion angles, pLDDT, distogram, structural violations)
- Recycling, templates, extra MSA stack, ensemble averaging — all implemented
- 50 tests passing
- Every module maps 1-to-1 to a numbered algorithm in the AF2 supplement
The Structure Module was the most satisfying part to build. Invariant Point Attention is genuinely beautiful: it does attention in 3D space using local reference frames, so the attention itself is invariant to global rotations and translations and the Structure Module as a whole is SE(3)-equivariant. The math fits in about 150 lines of PyTorch.
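To make the local-frames idea concrete, here is a minimal sketch of just the point-distance term that Invariant Point Attention feeds into its attention logits. It is not code from the minAlphaFold2 repo: it uses plain Python lists instead of PyTorch tensors so it runs with no dependencies, all function names are hypothetical, and it leaves out the learned projections, per-head weights, and the scalar and pair-bias terms. The only thing it shows is that the squared distance between a query point and a key point, each expressed in its residue's local frame, does not change when every frame is moved by the same global rotation and translation.

```python
import math


def rot_z(theta):
    """3x3 rotation about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(theta):
    """3x3 rotation about the x axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def apply_frame(frame, point):
    """Map a point from a residue's local frame to global coordinates."""
    rot, trans = frame
    rotated = matvec(rot, point)
    return [rotated[i] + trans[i] for i in range(3)]


def compose(outer, inner):
    """Frame equivalent to applying `inner` first, then `outer`."""
    r_o, t_o = outer
    r_i, t_i = inner
    moved = matvec(r_o, t_i)
    return (matmul(r_o, r_i), [moved[i] + t_o[i] for i in range(3)])


def squared_distance(frame_i, q_local, frame_j, k_local):
    """Squared global distance between a query point and a key point."""
    q = apply_frame(frame_i, q_local)
    k = apply_frame(frame_j, k_local)
    return sum((a - b) ** 2 for a, b in zip(q, k))


# Two residues with arbitrary frames and arbitrary local points.
frame_i = (rot_z(0.3), [1.0, 2.0, 3.0])
frame_j = (rot_x(1.1), [-2.0, 0.5, 4.0])
q_local = [0.5, -1.0, 2.0]
k_local = [1.5, 0.25, -0.75]

# An arbitrary global rigid motion applied to every frame.
global_move = (matmul(rot_z(0.7), rot_x(-0.4)), [10.0, -5.0, 2.0])

before = squared_distance(frame_i, q_local, frame_j, k_local)
after = squared_distance(compose(global_move, frame_i), q_local,
                         compose(global_move, frame_j), k_local)

print(math.isclose(before, after, rel_tol=1e-9))
```

Because the logits depend on the frames only through distances like this one, moving the whole protein leaves the attention weights unchanged. A tensor version would batch the same computation over residues, heads, and points.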
What's next:
- Build the data pipeline (PDB structures + MSA features)
- Write the training loop
- Train on a small set of proteins and see what happens
The repo is public. If you've ever wanted to understand how AlphaFold2 actually works at the level of individual tensor operations, this is meant for you.
Repo: https://t.co/k25vl5th1y
@mxschons what are you optimizing for here? more data doesn't necessarily mean a better outcome. how do you find the most fruitful balance between context, focus and goals for any given situation?
@limosalapponica @hanno_sauer I read it as saying essentially introspection is all good but change requires action. My point is more that the OP's post is perhaps equally unhelpful. Meet folks where they are. We're not getting anywhere if we don't.
I might agree that it's largely trivial (and verbose), though the deeper point I see in it is that we're ripe for another Copernican revolution, namely a better and more universal appreciation of how the human mind works and how its mechanics contribute to conflict, and that perhaps fruitful social cooperation requires physical action (mean tweeting doesn't count).
At the descriptive level, perhaps, depending on the granularity of the use of language, but this doesn’t make the conflicting perspectives of a philosophical problem from a human point of view go away. Though not every situation that is commonly described as a philosophical problem really is one.