This is the most beautiful AI research I’ve read this year! In standard AI models (like traditional Transformers), words in a sentence figure out their meaning by looking at the other words around them. This is called Self-Attention. But standard self-attention has a quirk: when a word looks around for context, it often ends up […]
