The paper in question: https://arxiv.org/pdf/2310.16834
Video: https://www.youtube.com/watch?v=K_9wQ6LZNpI
Some modern LLMs based on diffusion: