DFlash for Gemma 4: Up to 6x Faster. ⚡⚡
Great to see MTP land natively in Gemma 4 today. If you want to push it further, try DFlash: open source, same quality, more speed!!
https://t.co/wKcRoibuOB
Today https://t.co/jFknDoasSy joins Hugging Face
Together we will continue to build ggml, make llama.cpp more accessible and empower the open-source community. Our joint mission is to make local AI easy and efficient to use by everyone on their own hardware.
In today's episode of programming horror...
In the Python docs for random.seed(), we're told
"If a is an int, it is used directly." [1]
But if you seed with 3 or -3, you actually get the exact same rng object, producing the same streams. (TIL). In nanochat I was using the sign as a (what I thought was) clever way to get different rng sequences for train/test splits. Hence gnarly bug because now train=test.
I found the CPython code responsible in cpython/Modules/_randommodule.c [2], where on line 321 we see in a comment:
"This algorithm relies on the number being unsigned. So: if the arg is a PyLong, use its absolute value." followed by
n = PyNumber_Absolute(arg);
which explicitly calls abs() on your seed to make it positive, discarding the sign bit.
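The collision is easy to check directly. A minimal sketch, assuming CPython's random module; the helper name first_draws is made up for illustration:

```python
import random

def first_draws(seed, n=5):
    """Return the first n floats from a fresh generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# On CPython an int seed is reduced to its absolute value before seeding,
# so a seed and its negation give the same stream.
print(first_draws(3) == first_draws(-3))
```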
But this comment is actually wrong/misleading too. Under the hood, Python uses the Mersenne Twister MT19937 algorithm, which in the general case has 19937 bits of (non-zero) state. Python takes your int (or other object) and "spreads out" that information across these bits. In principle, the sign bit could have been used to augment the state bits. There is nothing about the algorithm that "relies on the number being unsigned". A decision was made to not incorporate the sign bit (which imo was a mistake). One trivial example could have been to map n -> 2*abs(n) + int(n < 0).
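As a user-side workaround, the same mapping can be applied before seeding. This is only a sketch of the n -> 2*abs(n) + int(n < 0) idea from above, not anything CPython does; signed_seed is a hypothetical helper name:

```python
import random

def signed_seed(n):
    """Fold the sign into the magnitude (hypothetical helper, not a stdlib API).

    Non-negative ints map to even numbers and negative ints to odd numbers,
    so n and -n never land on the same non-negative seed.
    """
    return 2 * abs(n) + int(n < 0)

# 3 -> 6 and -3 -> 7, so the two generators are seeded differently.
train_rng = random.Random(signed_seed(3))
test_rng = random.Random(signed_seed(-3))
```

This only guarantees that the seeds handed to random.Random differ; as with any pair of distinct seeds, the docs make no formal promise about the resulting streams.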
Finally this leads us to the contract of Python's random, which is also not fully spelled out in the docs. The contract that is mentioned is that:
same seed => same sequence.
But no guarantee is made that different seeds produce different sequences. So in principle, Python makes no promises that e.g. seed(5) and seed(6) are different rng streams. (Though this is quite commonly implicitly assumed in many applications.) Indeed, we see that seed(5) and seed(-5) are identical streams. And you should probably not use them to separate your train/test behaviors in machine learning. One of the more amusing programming horror footguns I've encountered recently. We'll see you in the next episode.
[1] https://t.co/srv1ZBlDsi
[2] https://t.co/qpnKdvfVNS
Six Dragons Fly Again for the Web: An Interactive Web Application for Everyone
@ISMIRConf
I developed an interactive web demo for the Six Dragons project that can create an ensemble of Korean court music using our project's model. 🎵
Implementation:
https://t.co/JpRNbhPZIr
Oasis
A Universe in a Transformer
Oasis is an interactive world model developed by Decart and Etched. Based on diffusion transformers, Oasis takes in user keyboard input and generates gameplay in an autoregressive manner.
What is the performance limit when scaling LLM inference? Sky's the limit.
We have mathematically proven that transformers can solve any problem, provided they are allowed to generate as many intermediate reasoning tokens as needed. Remarkably, constant depth is sufficient.
https://t.co/HO2seV73KT (ICLR 2024)
Critical flaw found in #Docker Engine allows attackers to bypass authorization plugins (AuthZ) - CVE-2024-41110, CVSS score 10.0.
This #vulnerability can lead to severe privilege escalation, affecting numerous Docker versions.
Details here: https://t.co/n04jOEAvEf
#DevOps
FuriosaAI's research paper "TCP: A Tensor Contraction Processor for AI Workloads" has been accepted for publication by the International Symposium on Computer Architecture (@ISCAConfOrg), the premier forum for new ideas in silicon design. https://t.co/v9F8Pd4ngc (1/5)
Scarlett Johansson is taking legal action against OpenAI after they 'copied' her voice for GPT-4o.
According to the actress, Sam Altman tried to hire her voice last September, she said no, and Sam cloned and used her voice anyway.
Insane if true.
https://t.co/ibsy2y8yTa