Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124

Today, DeepSeek is one of the only leading AI companies in China that does not depend on funding from tech giants like Baidu, Alibaba, or ByteDance.
According to Liang, when he put together DeepSeek’s research team, he was not looking for experienced engineers to build a consumer-facing product. Instead, he focused on PhD students from Chinese universities, including Peking University and Tsinghua University, who were eager to prove themselves. Many had been published in top journals and had won awards at international academic conferences, but they lacked industry experience, according to the Chinese technology publication QbitAI.
“Our main technical positions are mostly filled by people who have graduated in the past one or two years,” Liang told 36Kr in 2023. The recruitment strategy helped create a collaborative company culture in which people were free to use ample computing resources to pursue unorthodox research projects. This is a starkly different way of operating from established internet companies in China, where teams often compete for resources. (A recent example: ByteDance accused a former intern, the winner of a prestigious academic award, of sabotaging his colleagues’ work in order to hoard more computing resources for his team.)
Liang says students may be better suited to high-investment, low-profit research. “Most people, when they are young, can devote themselves completely to a mission without utilitarian considerations,” he explained. His pitch to prospective hires is that DeepSeek was created to “solve the most difficult questions in the world.”
Experts note that these young researchers were almost entirely educated in China. “This younger generation also embodies a sense of patriotism, especially as they navigate US restrictions and choke points in critical hardware and software technologies,” explains Zhang. “Their determination to overcome these obstacles reflects not only personal ambition but also a broader commitment to advancing China’s position as a global innovation leader.”
In October 2022, the US government began putting together export controls that severely restricted Chinese AI companies’ access to cutting-edge chips like Nvidia’s H100. The move presented a problem for DeepSeek. The firm had started out with a stockpile of 10,000 A100s, but it needed more to compete with companies like OpenAI and Meta. “The problem we are facing has never been funding, but the export controls on advanced chips,” Liang told 36Kr in a second interview in 2024.
DeepSeek had to come up with more efficient methods to train its models. “They optimized their model architecture using a battery of engineering tricks: custom communication schemes between chips, reducing the size of fields to save memory, and innovative use of the mix-of-models approach,” says Wendy Chang, a software engineer turned policy analyst at the Mercator Institute for China Studies. “Many of these approaches are not new ideas, but successfully combining them to produce a cutting-edge model is a remarkable achievement.”
DeepSeek has also made significant progress on Multi-Head Latent Attention (MLA) and Mixture-of-Experts, two technical designs that make DeepSeek models more cost-effective by requiring fewer computing resources to train. In fact, DeepSeek’s latest model is so efficient that it required one-tenth of the computing power of Meta’s comparable Llama model to train, according to the research institute Epoch AI.
DeepSeek’s willingness to share these innovations with the public has earned it considerable goodwill in the global AI research community. For many Chinese AI companies, developing open source models is the only way to catch up with their Western counterparts, because it attracts more users and contributors, which in turn helps the models grow. “They have now shown that cutting-edge models can be built using less, though still a lot of, money and that the current norms of model-building leave a lot of room for optimization,” said Chang. “We are sure to see many more attempts in this direction going forward.”
The news could spell trouble for current US export controls, which focus on creating computing resource bottlenecks. “Existing estimates of how much AI computing power China has, and what they can achieve with it, could be upended,” Chang says.