Videet Mehta, Liming Wang, Hilde Kuehne, Rogerio Feris, James R. Glass, M. Jehanzeb Mirza, "CALM: Class-Conditional Sparse Attention Vectors for Large Audio-Language Models," https://t.co/aHej47hWn1
We achieved a 4x speedup over traditional diffusion policies and a ~200% improvement in survival steps, using an extremely low number of function evaluations (~5). Guidance control is also better.