Guys, I beg you to understand that the long-term AI frontier is about statistics / information theory and physics. It is NOT about linear algebra.
Dense matrices are the best approach we have right now because TPU/GPU hardware is built for them and because the math is robust and general-purpose.
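However, the dense approach is inefficient: it spends compute and energy on every weight, including the ones that contribute almost nothing.
And as we optimize towards more faithful representations of intelligence, sparse networks will dominate the intelligence / energy frontier.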
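To make the dense vs. sparse point concrete, here is a minimal sketch that counts multiply-adds for the same matrix-vector product done both ways. The 4x4 matrix, the helper names, and the use of operation count as a stand-in for energy are all assumptions for illustration. On real GPUs/TPUs the dense version often still wins today, which is exactly the hardware point above.

```python
def dense_matvec(W, x):
    """Multiply every weight by its input, zeros included."""
    ops = 0
    y = []
    for row in W:
        acc = 0.0
        for w, xi in zip(row, x):
            acc += w * xi
            ops += 1
        y.append(acc)
    return y, ops


def to_sparse(W):
    """Keep only (column, weight) pairs for nonzero weights."""
    return [[(j, w) for j, w in enumerate(row) if w != 0.0] for row in W]


def sparse_matvec(S, x):
    """Touch only the stored nonzero weights."""
    ops = 0
    y = []
    for row in S:
        acc = 0.0
        for j, w in row:
            acc += w * x[j]
            ops += 1
        y.append(acc)
    return y, ops


# Hypothetical weight matrix with one nonzero entry per row.
W = [[0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, -1.0],
     [3.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.0]]
x = [1.0, 2.0, 3.0, 4.0]

y_dense, dense_ops = dense_matvec(W, x)
y_sparse, sparse_ops = sparse_matvec(to_sparse(W), x)
print(y_dense == y_sparse, dense_ops, sparse_ops)
```

Both versions give the same answer, but the sparse one only does work where there is actual information stored.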
The most important thing is not to be a super physicist / information theorist, because only I can be so awesome after all, but to be able to think generally in these terms from first principles. You need to be able to think CONCEPTUALLY in statistics. You need to understand that these matrices are just encoding the information needed to sample from probability distributions.
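Here is a rough sketch of what "matrices encode the information to sample probability distributions" can mean in practice. The weights `W`, the input `x`, and the three-outcome setup are made up for illustration, and softmax is just one common way to turn scores into probabilities, not the only one.

```python
import math
import random


def matvec(W, x):
    """Plain matrix-vector product: one score (logit) per row of W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(logits):
    """Turn arbitrary scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical model with 3 possible outcomes and a 2-dimensional input.
W = [[2.0, 0.0],
     [0.0, 1.0],
     [-1.0, -1.0]]
x = [1.0, 0.5]

# The matrix does not "answer" anything; it parameterizes a distribution.
probs = softmax(matvec(W, x))

# Sampling from that distribution is the actual output step.
rng = random.Random(0)
samples = rng.choices(range(len(probs)), weights=probs, k=10_000)
freqs = [samples.count(i) / len(samples) for i in range(len(probs))]
print(probs)
print(freqs)
```

The empirical frequencies should land close to `probs`. The linear algebra is the storage format; the statistics is what is being stored.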
The 21st century will be the century of information and statistics. Please develop intuition around these ideas.