In a recent interview, Nikolay Savinov of DeepMind explained that when a model is fed many tokens, it has to distribute its attention across all of them.

The article "Deepmind expert says trimming documents improves accuracy despite large context windows" appeared first on THE DECODER. [...]
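
The point about attention being spread across all tokens can be illustrated with a toy calculation. The sketch below is not taken from the interview or the article. It assumes a single softmax attention step in which one relevant token gets a fixed higher score and every other token gets the same lower score. The function names and the score values are hypothetical, chosen only for illustration. Because softmax weights always sum to 1, the share left for the relevant token shrinks as more tokens are added.

```python
import math


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_on_relevant(n_tokens, relevant_score=5.0, distractor_score=0.0):
    """Attention weight on one relevant token among n_tokens in total.

    Assumption (illustrative only): one token scores relevant_score and
    the remaining n_tokens - 1 tokens all score distractor_score.
    """
    scores = [relevant_score] + [distractor_score] * (n_tokens - 1)
    return softmax(scores)[0]


if __name__ == "__main__":
    # The weight on the relevant token falls as the context grows.
    for n in (10, 1_000, 100_000):
        print(f"{n:>7} tokens -> weight on relevant token: {weight_on_relevant(n):.4f}")
```

Real models use many attention heads and learned scores, so this shows only the normalization effect, not actual model behavior. Within the toy setup, trimming irrelevant tokens raises the weight available to the relevant one.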