News

Speech LLMs use speech embeddings as the prompt to a Large Language Model (LLM) and generate human-readable text for the speech signal in an autoregressive manner. Teacher forcing is a common approach ...
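
As a rough illustration of the two ideas named above, the toy sketch below contrasts autoregressive decoding, where each step is conditioned on the model's own earlier outputs, with teacher forcing, where each step is conditioned on the reference text instead. Everything in it is an assumption made for illustration: `toy_next_token` is a hypothetical deterministic stand-in for an LLM step, the integer list stands in for speech embeddings, and `EOS` is a made-up end-of-sequence id. It is not the method of any particular system.

```python
from typing import List, Sequence

EOS = 0  # hypothetical end-of-sequence token id


def toy_next_token(speech_prompt: Sequence[int], prefix: Sequence[int]) -> int:
    """Hypothetical stand-in for one LLM step (not a real model).

    The next token depends on the prompt frame at the current position
    and on the previous token in the prefix.
    """
    step = len(prefix)
    if step >= len(speech_prompt):
        return EOS
    last = prefix[-1] if prefix else 0
    return 1 + (speech_prompt[step] + last) % 5


def decode_autoregressive(speech_prompt: Sequence[int], max_len: int = 10) -> List[int]:
    """Free-running decoding: each step sees the model's own earlier outputs."""
    out: List[int] = []
    for _ in range(max_len):
        tok = toy_next_token(speech_prompt, out)
        if tok == EOS:
            break
        out.append(tok)
    return out


def predict_teacher_forced(speech_prompt: Sequence[int], reference: Sequence[int]) -> List[int]:
    """Teacher forcing: step t sees reference[:t], never the model's own outputs."""
    return [toy_next_token(speech_prompt, reference[:t]) for t in range(len(reference))]


prompt = [1, 2, 3]  # stand-in for a sequence of speech embeddings
print(decode_autoregressive(prompt))              # [2, 5, 4]
print(predict_teacher_forced(prompt, [2, 1, 4]))  # [2, 5, 5]
```

In the second call the reference differs from what the toy model would have produced on its own, so the predictions at later steps are conditioned on a different history than in free-running decoding. That difference between the two conditioning regimes is what the sketch is meant to show.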