SG Talk

Full Version: DeepSeek goes 100% Chinese for AI inference using Huawei's Ascend 910x accelerators
The comments at the end are quite interesting.

https://www.yahoo.com/tech/beginning-end...00522.html
Yahoo and SCMP are so quick to also turn around and sing the praises of China? How opportunistic.

What are the er mao (二毛) supposed to do now?
The er mao can fend for themselves.
It seems that by removing the middle CUDA layer they gain efficiency and make the Huawei chip perform even better than
NVIDIA.
India has so many programmers, how come they aren't able to do that? Looks like they are not too smart
They will need lots more chips, as the 910 delivers only 60 percent of Nvidia H100 inference performance.

https://www.tomshardware.com/tech-indust...erformance
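
As rough arithmetic on the 60 percent figure quoted above, here is a minimal sketch of how many 910-class chips it would take to match a given number of H100s. It assumes throughput scales linearly with chip count, which real clusters will not achieve exactly, and the function name `chips_needed` is made up for illustration.

```python
def chips_needed(h100_count: int, relative_perf_percent: int = 60) -> int:
    """Estimate chips needed to match `h100_count` H100s.

    Assumption: each chip delivers `relative_perf_percent` percent of one
    H100's inference throughput and performance scales linearly.
    """
    if h100_count < 0 or relative_perf_percent <= 0:
        raise ValueError("counts must be non-negative and percent positive")
    # Ceiling division done in integer arithmetic to avoid float rounding.
    return -(-h100_count * 100 // relative_perf_percent)


if __name__ == "__main__":
    # Under the linear-scaling assumption this is about 1.67x the H100 count.
    print(chips_needed(1000))
```

So "lots more chips" works out to roughly two thirds more under these assumptions, before counting any interconnect or software overhead.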
(11-02-2025, 01:48 PM)WhatDoYouThink! Wrote: [ -> ]India has so many programmers, how come they aren't able to do that? Looks like they are not too smart



Their programmers graduated from dubious universities and were taught by dubious half-headed profs, who were similarly taught
(11-02-2025, 01:35 PM)sgbuffett Wrote: [ -> ]It seems that by removing the middle CUDA layer they gain efficiency and make the Huawei chip perform even better than
NVIDIA.

Nvidia wants to lock users into its CUDA app layer. DeepSeek did a machine-level programming bypass and increased efficiency.

But that level of expertise is not easy. Doubt SG has it.

Looks like the algorithm is the key, and hardware matters to a certain extent but is not the main key.
Don't worry too much. DeepSeek and Huawei will catch up very fast
(11-02-2025, 01:52 PM)Niubee Wrote: [ -> ]Nvidia wants to lock users into its CUDA app layer. DeepSeek did a machine-level programming bypass and increased efficiency.

But that level of expertise is not easy. Doubt SG has it.

Looks like the algorithm is the key, and hardware matters to a certain extent but is not the main key.

I have tried embedded assembly code to speed up access.
Doable.
(11-02-2025, 01:52 PM)Niubee Wrote: [ -> ]Nvidia wants to lock users into its CUDA app layer. DeepSeek did a machine-level programming bypass and increased efficiency.

But that level of expertise is not easy. Doubt SG has it.

Looks like the algorithm is the key, and hardware matters to a certain extent but is not the main key.

SG employs a lot of cheap CECA!

How to have it?

My son also lost his job recently because his company outsourced to CECA! Rotfl
(11-02-2025, 01:59 PM)sgbuffett Wrote: [ -> ]I have tried embedded assembly code to speed up access.
Doable.

DeepSeek uses assembly-like PTX programming.
(11-02-2025, 01:48 PM)teaserteam Wrote: [ -> ]They will need lots more chips, as the 910 delivers only 60 percent of Nvidia H100 inference performance.

https://www.tomshardware.com/tech-indust...erformance

Google search gave me this:

When comparing the Huawei Ascend 910C to Nvidia's AI chips, the Ascend 910C is positioned as a strong competitor, particularly in the Chinese market, claiming to offer comparable or even superior performance in certain AI tasks, though experts generally acknowledge that Nvidia still holds the overall market lead in terms of technology and wider adoption.