direct yourself towards learning rather than discovery, and obsolescence is irrelevant. if it never comes, you won't have lost anything; learning and discovery are identical at the frontier
@andimgladofit @carlesgelada separately, the JS post assumes translation invariance s.t. "only relative positions are observable", which is also not the case here. hence why the screenshot i gave mentions "being able to attend" to relative positions, i.e. relative position is not the only information present
@andimgladofit @carlesgelada the linearity assumed by the JS post is that the map from query to (position encoded query) is linear for fixed position. say q=Wx for x the input embedding; then the encoding in question takes q to W(x+p)=q+Wp, i.e. affine in q, not linear
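a quick numerical sanity check of the q -> q+Wp point, as a minimal sketch: W, p, q1, q2 and the helper names below are made-up toy values, not taken from the JS post or from any real model. it only checks that the map fails additivity and doesn't fix zero.

```python
# toy check that q -> q + Wp is affine, not linear, in q.
# all values and helper names here are hypothetical, chosen for illustration.

def matvec(W, v):
    # plain matrix-vector product on nested lists
    return [sum(w * vi for w, vi in zip(row, v)) for row in W]

def add(a, b):
    # elementwise vector sum
    return [ai + bi for ai, bi in zip(a, b)]

W = [[1.0, 2.0], [0.0, 1.0]]  # query projection (assumed)
p = [0.5, -1.0]               # position vector for one fixed position (assumed)

def encode(q):
    # additive encoding viewed as a map on q: q = Wx  ->  W(x + p) = q + Wp
    return add(q, matvec(W, p))

q1 = [1.0, 0.0]
q2 = [0.0, 3.0]

lhs = encode(add(q1, q2))          # f(q1 + q2)
rhs = add(encode(q1), encode(q2))  # f(q1) + f(q2), picks up Wp twice

additive = lhs == rhs                          # a linear map would give True
zero_fixed = encode([0.0, 0.0]) == [0.0, 0.0]  # a linear map sends 0 to 0

print(lhs, rhs, additive, zero_fixed)
```

the two sides differ by exactly one extra copy of Wp, which is the offset term that makes the map affine rather than linear.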
@andimgladofit @carlesgelada ?? the original "attention is all you need" encoding adds the position vector to the input embeddings; this just is not a linear operation on anything at all: (a+b)+p != (a+p) + (b+p)
@andimgladofit @carlesgelada this blogpost's conclusion doesn't encompass the positional encoding in question, because it isn't a linear operation on q or k (which is one of the post's assumptions)
yud's explanation is actually what the original paper says directly:
@kalomaze i mean, wasn't something of similar difficulty to this rust idea already shown with Gemini 1.5 translating Kalamang? do you have something other than ICL in mind?
@norvid_studies assuming this is referring to the sherlock quote "When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth."
i might have to add "sherlock holmes solutions" to my vocabulary...
@ZyMazza i disagree, "before society was trending towards every possible future" is describing either the pre-paradigm or the crisis period; collapse into a single linear narrative describes pedagogy after the paradigm has been accepted. this is all discussed in structure
@norvid_studies none off the top of my head lol but you can think of a simplicial complex as similar to the polygon approximations of surfaces you see in 3d modeling (just with triangles instead of quads)
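to make the mesh analogy concrete, here's a minimal sketch of a triangle mesh stored as a simplicial complex. the hollow tetrahedron and the names `triangles`, `closure`, `counts` are just an assumed toy example; the only real rule being shown is that the complex has to contain every face (edge, vertex) of each triangle.

```python
from itertools import combinations

# toy mesh (assumed example): the four triangular faces of a hollow tetrahedron,
# each triangle given as a tuple of vertex ids
triangles = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

def closure(simplices):
    # a simplicial complex must contain every face of each of its simplices,
    # so collect all non-empty vertex subsets of every triangle
    faces = set()
    for s in simplices:
        for k in range(1, len(s) + 1):
            faces.update(combinations(sorted(s), k))
    return faces

complex_ = closure(triangles)

# count simplices by dimension: 0 = vertices, 1 = edges, 2 = triangles
counts = {d: sum(1 for s in complex_ if len(s) == d + 1) for d in range(3)}

# euler characteristic V - E + F
euler = counts[0] - counts[1] + counts[2]

print(counts, euler)
```

the same vertex/edge/face bookkeeping is what a 3d modeling mesh carries around, just restricted to triangles.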