A new technical paper titled “Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention” was published by researchers at DeepSeek, Peking University, and the University of Washington.