DeepSeek launches NSA for ultra-fast long-context training and inference

2025-02-18 16:37:45

ChainCatcher news: according to Jin10, DeepSeek has launched NSA (Native Sparse Attention).

DeepSeek says that NSA is a hardware-aligned, natively trainable sparse attention mechanism designed for ultra-fast long-context training and inference. With a design optimized for modern hardware, NSA speeds up inference and reduces pre-training costs without compromising performance.

On general benchmarks, long-context tasks, and instruction-based reasoning, DeepSeek reports that NSA matches or outperforms full attention models.
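
For readers unfamiliar with the terms, the sketch below contrasts full attention with one simple form of sparse attention. It is a generic illustration, not DeepSeek's NSA algorithm: the top-k selection rule, the function names, and the `top_k` parameter are all assumptions made for this example. The article does not describe how NSA chooses which tokens to attend to.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def full_attention(q, keys, values):
    # One query attends to every key, so cost grows with context length.
    scale = 1.0 / math.sqrt(len(q))
    weights = softmax([dot(q, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def topk_sparse_attention(q, keys, values, top_k):
    # Hypothetical selection rule: keep only the top_k highest-scoring keys.
    # This toy version still scores every key, so it saves no compute;
    # a practical scheme would pick candidates more cheaply.
    scale = 1.0 / math.sqrt(len(q))
    scores = [dot(q, k) * scale for k in keys]
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:top_k]
    keep.sort()  # restore original token order
    weights = softmax([scores[i] for i in keep])
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, keep)) for d in range(dim)]

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [3.0, 1.0]

dense = full_attention(query, keys, values)
sparse_all = topk_sparse_attention(query, keys, values, top_k=3)
sparse_one = topk_sparse_attention(query, keys, values, top_k=1)
```

When `top_k` equals the number of keys, the sparse result equals the full-attention result. With a smaller `top_k`, the softmax and the weighted sum cover fewer tokens, which is the general idea behind sparse attention on long contexts.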
