LFM2.5-Audio-1.5B-JP is our first Japanese audio model.
Speak to it in Japanese. It responds in Japanese speech. One model, no separate ASR/TTS components.
> First end-to-end general-purpose audio model at this scale with Japanese support
> 1.5B params, outperforms J-Moshi (a ~7B model)
> Competitive with Qwen2.5-Omni-3B (a ~5B model)
> Base model, designed for fine-tuning on specific use cases
(2/n)