
DeepSeek launches NSA for ultra-fast long-context training and inference

2025-02-18 16:37:45

ChainCatcher reports, citing Jin10, that DeepSeek has launched NSA (Native Sparse Attention).

DeepSeek describes NSA as a hardware-aligned, natively trainable sparse attention mechanism designed for ultra-fast long-context training and inference. Because its design is optimized for modern hardware, DeepSeek says NSA speeds up inference and lowers pre-training costs without compromising performance.

According to DeepSeek, on general benchmarks, long-context tasks, and instruction-based reasoning, NSA performs on par with or better than full-attention models.
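
For readers unfamiliar with the term, the general idea behind sparse attention can be sketched as follows. This is a minimal, generic illustration and not DeepSeek's NSA algorithm: the function names (`attend`, `block_sparse_attend`), the block-scoring rule (dot product with a block's mean key), and the parameters `block_size` and `top_blocks` are all assumptions made for this example. Full attention lets a query look at every key; the sparse variant looks only at a few selected blocks of keys, which is where the compute savings on long contexts would come from.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Full scaled dot-product attention of one query over all given keys."""
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def block_sparse_attend(query, keys, values, block_size, top_blocks):
    """Hypothetical sparse variant: attend only to the best-matching key blocks.

    Each block is scored by the dot product of the query with the block's
    mean key (an assumption for this sketch), and only the `top_blocks`
    highest-scoring blocks take part in the softmax.
    """
    starts = list(range(0, len(keys), block_size))

    def block_score(start):
        block = keys[start:start + block_size]
        mean_key = [sum(col) / len(block) for col in zip(*block)]
        return dot(query, mean_key)

    ranked = sorted(starts, key=block_score, reverse=True)[:top_blocks]
    chosen = sorted(ranked)  # restore original token order
    idx = [i for s in chosen for i in range(s, min(s + block_size, len(keys)))]
    return attend(query, [keys[i] for i in idx], [values[i] for i in idx])
```

With `top_blocks` equal to the total number of blocks, the sparse function reduces to full attention; with a smaller value, each query touches only `top_blocks * block_size` keys instead of the whole context.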
