DeepSeek launches NSA for ultra-fast long-context training and inference

2025-02-18 16:37:45

ChainCatcher news: according to Jin10, DeepSeek has launched NSA (Native Sparse Attention).

DeepSeek describes NSA as a hardware-aligned, natively trainable sparse attention mechanism designed for ultra-fast long-context training and inference. According to DeepSeek, the design is optimized for modern hardware, which speeds up inference and reduces pre-training costs without compromising performance.

On general benchmarks, long-context tasks, and instruction-based reasoning, DeepSeek reports that NSA performs comparably to, or better than, full-attention models.
