Called CSM-1B, the model generates “RVQ audio codes” from text and audio inputs, according to Sesame’s description on the AI dev platform Hugging Face. RVQ refers to “residual vector ...