Sequence Modeling

Cracking Long Sequences: How Mamba Outperforms Transformers

Abstract

This talk demystifies how Mamba models efficiently handle extremely long sequences while outperforming Transformers in speed, scalability, and memory usage. We break down the core ideas behind selective state-space models, why they excel where attention struggles, and what this shift means for the future of sequence modeling.

Speaker Bio

Younis is an NLP Research Engineer II and a Computer Science Master's student specializing in multimodal AI systems and advanced speech modeling, with research spanning conversational analysis, interaction modeling, and real-time systems.
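
The abstract's claim about selective state-space models can be made concrete with a minimal numpy sketch of the underlying recurrence. This is an illustrative toy, not the Mamba implementation: the function name selective_ssm_scan, the parameter shapes, and the simplified discretization are all assumptions chosen for readability. What it shows is the core idea the talk previews: the step size, B, and C depend on the current input (the "selective" part), and the state is updated once per token, so memory stays constant and compute grows linearly with sequence length, unlike attention's quadratic score matrix.

# Minimal, illustrative sketch of a selective state-space recurrence.
# Not the official Mamba kernel; all names and shapes here are assumptions.
import numpy as np

def selective_ssm_scan(x, A, W_B, W_C, W_dt):
    """Run a toy input-dependent (selective) SSM over a sequence.

    x    : (T, D) input sequence
    A    : (D, N) state-transition parameters (kept negative for stability)
    W_B, W_C, W_dt : projections making B, C, and the step size depend
                     on the current input -- the "selective" mechanism.
    """
    T, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))          # hidden state carried across time steps
    y = np.zeros((T, D))
    for t in range(T):            # one update per token: linear in T
        dt = np.log1p(np.exp(x[t] @ W_dt))       # softplus step size, (D,)
        B = x[t] @ W_B                           # input-dependent B, (N,)
        C = x[t] @ W_C                           # input-dependent C, (N,)
        A_bar = np.exp(dt[:, None] * A)          # discretized A, (D, N)
        h = A_bar * h + (dt[:, None] * B[None, :]) * x[t][:, None]
        y[t] = h @ C                             # read out the state, (D,)
    return y

# Toy usage: the state h is (D, N) regardless of sequence length T,
# whereas attention would build a T x T score matrix.
T, D, N = 1024, 16, 8
rng = np.random.default_rng(0)
y = selective_ssm_scan(rng.standard_normal((T, D)),
                       -np.abs(rng.standard_normal((D, N))),
                       rng.standard_normal((D, N)),
                       rng.standard_normal((D, N)),
                       rng.standard_normal((D, D)))
print(y.shape)  # (1024, 16)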