Tencent’s tech team has optimized DeepSeek’s open-source DeepEP communication framework, boosting its performance across different network environments, according to the Chinese AI startup. Testing showed a 100% improvement on RoCE networks and a 30% gain on InfiniBand (IB), offering a more efficient option for AI model training. On GitHub, DeepSeek acknowledged that the Chinese tech giant’s contribution had led to a “huge speedup.” DeepEP is a communication library tailored for mixture-of-experts (MoE) models and expert parallelism (EP), providing high-throughput, low-latency GPU kernels and supporting low-precision computing, including FP8. Tencent’s Starlink Networking team identified two main bottlenecks: underutilized dual-port NIC bandwidth and CPU control latency. Targeted optimizations of both produced the gains on RoCE and IB. The enhanced framework is now fully open source and has been deployed in training Tencent’s Hunyuan large model, where it has shown strong versatility in environments built on Tencent’s Starlink and H20 servers, Chinese tech media outlet iThome reported. [iThome, in Chinese]