A 5-million-circuit quantum dataset for training ML models in the NISQ era
Quantum computing today operates in the so-called NISQ (noisy intermediate-scale quantum) regime: small numbers of noisy qubits, limited circuit depth, and hardware errors that make even simple computations difficult to reproduce. In this setting, the fundamental object researchers manipulate is the quantum circuit, a structured sequence of gates that encodes a computation. Yet, unlike images, text, or molecules, circuits have had no large public dataset to train ML models on. Without such a resource, applying modern deep learning to questions like circuit classification, noise diagnosis, or program synthesis has remained largely anecdotal.
Leonardo Placidi and coauthors address this gap with MNISQ, the first large-scale dataset built specifically for ML on quantum circuits in the NISQ era. It contains 4.95 million circuits of 10 qubits each, with up to 100 two-qubit gates, generated by quantum-encoding classical datasets such as MNIST. The circuits are released in two complementary formats: the actual quantum circuits for quantum ML pipelines, and their textual representation as QASM (Quantum Assembly Language) files for classical sequence models.
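To make the QASM format concrete, here is a minimal Python sketch that reads a circuit as plain text and reports its qubit count and number of two-qubit gates. The 3-qubit snippet and the helper name `summarize_qasm` are hypothetical, written for illustration only; they are not taken from the MNISQ files or the authors' tooling.

```python
import re

# Hypothetical 3-qubit OpenQASM 2.0 snippet, for illustration only
# (not taken from the MNISQ dataset).
SAMPLE_QASM = """OPENQASM 2.0;
include "qelib1.inc";
qreg q[3];
h q[0];
cx q[0],q[1];
u3(0.1,0.2,0.3) q[2];
cx q[1],q[2];
"""

# Statements that are not gate applications and should be skipped.
NON_GATE_PREFIXES = ("OPENQASM", "include", "creg", "measure", "barrier")


def summarize_qasm(text):
    """Return (number of qubits, number of two-qubit gate statements)."""
    n_qubits = 0
    n_two_qubit = 0
    for stmt in text.split(";"):
        stmt = stmt.strip()
        if not stmt or stmt.startswith(NON_GATE_PREFIXES):
            continue
        reg = re.match(r"qreg\s+\w+\[(\d+)\]", stmt)
        if reg:
            n_qubits += int(reg.group(1))
            continue
        # Operands look like q[0]; exactly two of them means a two-qubit gate.
        operands = re.findall(r"\w+\[\d+\]", stmt)
        if len(operands) == 2:
            n_two_qubit += 1
    return n_qubits, n_two_qubit


print(summarize_qasm(SAMPLE_QASM))
```

Because the file is ordinary text, the same statements can be inspected, filtered, or fed to a sequence model without running a quantum simulator.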
This dual format enables two benchmarks in parallel. On the quantum side, the authors use quantum kernel methods to classify the circuits, reaching up to 97% accuracy in multiclass tasks, and they also characterize how realistic hardware noise affects performance, giving the community a clean testbed for error-mitigation research. On the classical side, they treat QASM files as a programming-style language and apply NLP models including S4, Transformer, and LSTM. The S4 architecture reaches 77% accuracy, climbing to 81% with data augmentation, showing that modern sequence models can extract meaningful structure from quantum programs without ever simulating them.
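As a rough illustration of the quantum-kernel idea, the sketch below computes a fidelity kernel, k(psi, phi) = |<psi|phi>|^2, between statevectors in plain Python. This is one common choice of quantum kernel and is an assumption here: the text above does not say which kernel the authors use, and the toy single-qubit states stand in for the 10-qubit states that MNISQ circuits would prepare.

```python
import math


def fidelity_kernel(psi, phi):
    """k(psi, phi) = |<psi|phi>|^2 for two normalized statevectors."""
    overlap = sum(a.conjugate() * b for a, b in zip(psi, phi))
    return abs(overlap) ** 2


def gram_matrix(states):
    """Pairwise kernel values, usable as a precomputed kernel matrix."""
    return [[fidelity_kernel(s, t) for t in states] for s in states]


# Toy single-qubit states (assumed for illustration, not MNISQ data).
zero = [1 + 0j, 0j]
one = [0j, 1 + 0j]
plus = [1 / math.sqrt(2) + 0j, 1 / math.sqrt(2) + 0j]

for row in gram_matrix([zero, one, plus]):
    print([round(value, 3) for value in row])
```

Identical states give a kernel value of 1 and orthogonal states give 0, so the resulting matrix can be handed to a classical kernel classifier such as a support vector machine that accepts a precomputed kernel.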
MNISQ provides the kind of standardized benchmark that has been missing in the quantum software stack. In drug discovery, materials design, and energy research, where quantum-classical pipelines are being prototyped, having a public corpus of circuits enables more systematic algorithm selection, noise modeling, and integration between quantum hardware teams and ML scientists.
Paper: Placidi et al., Scientific Data (2026) — CC BY 4.0 | https://t.co/IUBpbu9ziy
In this new article, @TumaBitcoiner delves into one of the latest pieces of work from @blksresearch, a post-quantum signature scheme called SHRINCS.
Developed by researchers Jonas Nick (@n1ckler) and Mikhail Kudinov (@kudinov_mikhail), the scheme combines stateful and stateless algorithms to provide smaller post-quantum signatures with static backups.