ChinaTech #4 China's Open Source Ecosystem
... Early days of a rising industry
This new segment by Shobhankita Reddy is your go-to newsletter for updates and perspectives on China’s tech ecosystem. This edition seeks to understand the rise of China’s open source ecosystem amidst increasing international politicisation.
In a July 2024 interview, DeepSeek CEO Liang Wenfeng discussed the company's decision to open source its work at length.
"In the face of disruptive technologies, moats created by closed source are temporary. Even OpenAI's closed source approach can't prevent others from catching up. So we anchor our value in our team — our colleagues grow through this process, accumulate know-how, and form an organization and culture capable of innovation. That's our moat. In fact, open source is more of a cultural behavior than a commercial one, and contributing to it earns us respect."
He also explained the company's strong focus on research and development (R&D) and its decision to go against the grain. High-Flyer, DeepSeek's parent company, reportedly reinvests 70% of its annual revenues into R&D, and DeepSeek chose to rethink model architecture and hardware optimizations, whereas other Chinese Big Tech firms chose only to adapt open source models for Chinese applications.
Railing against Chinese freeriding, he said:
"Americans excel at 0-to-1 technical innovation, while Chinese excel at 1-to-10 application innovation.
….
In the past 30+ years of the IT wave, we basically didn't participate in real technological innovation. We're used to Moore's Law falling out of the sky, lying at home waiting 18 months for better hardware and software to emerge. That's how the Scaling Law is being treated."
Liang is part of a generation of Chinese entrepreneurs, currently in their late 30s and early 40s, who were in university in the mid-2000s as the Linux Foundation's outreach efforts in China gained momentum and as Chinese developers were increasingly participating in international projects.
This crop of developers, acutely aware of the Western accusations of reverse engineering and espionage that Chinese products face, found nationalistic pride and validation of their engineering skills in their zeal for open source. This gained further momentum from the Chinese state's support for open source through multiple alliances and foundations.
In fact, China's beginnings in open source were state-led. In 1999, the Chinese Academy of Sciences led a government-funded Red Flag Linux project to replace MS Windows and reduce Chinese dependence on US technology. While this had mixed results, several policy documents have promoted the development of open source software and hardware over the years.
Crudely put, even China's rise as the factory of the world can be traced to an open culture of sharing and collaboration, with little regard for intellectual property. Benefitting from the special economic zone status granted under the Deng Xiaoping administration, Shenzhen, now called the Silicon Valley of the East, developed manufacturing prowess in 'shanzhai', a term originally used for counterfeit electronic products sold under brand names such as 'Nokir' and 'Samsing'. This was enabled by a regulatory grey zone. Eventually, 'shanzhai' paved the way for an indigenous, bottom-up, sophisticated ecosystem in which Chinese companies moved away from copying international firms to copying each other, and then to establishing themselves as innovators to be reckoned with.
The 2010s saw several Chinese Big Tech companies sponsor international open source communities such as the Linux Foundation and the Apache Software Foundation. Chinese developers, surpassed in number on GitHub by Indian developers only in 2021, are a major contributing presence on the platform. Since 2020, when several tech firms open sourced the tools and frameworks they used in the fight against the COVID-19 pandemic, Chinese Big Tech firms have continued to open source their projects in a bid to build brand awareness and hire the best talent. Open source software is also prevalent in legacy and low-tech industries in China, where it serves as a way to reduce IT costs.
However, this is not without tension. The Chinese government understands that open source enables access to critical foreign software and hardware technology. But the very culture of open source does not align with its need for state control, crackdowns on dissent and internet censorship, all of which have intensified over the past decade. These internal contradictions, combined with the international politicization of open source, have led to a complex situation in recent years.
Save for occasional state interventions, such as the temporary ban on GitHub in 2013, open source communities in China have largely had free rein. This stood in sharp contrast to social media platforms, which faced preemptive controls and regulation from the get-go, but the situation has changed fast.
In 2018, as a response to US export controls on chips, China established a RISC-V Industry Consortium to promote the adoption of open source chip architecture. In 2019, Huawei was added to the US entity list and lost access to Google's Android license for its devices. The same year, GitHub blocked access to its services for users in Iran and other countries that faced US sanctions, and access to MATLAB was blocked for a few Chinese universities. These events prompted the development of HarmonyOS, Huawei's own operating system, and the creation of China's first open source foundation, the OpenAtom Foundation.
The 13th Five-Year National Informatization Plan (2016-2020) is noteworthy for its push to develop “open source R&D centers”, particularly in the fields of AI and cloud computing. The 14th Five-Year Plan encourages the open sourcing of code, algorithms, hardware design and application services in AI, high-end chips, operating systems and other key fields.
While several open source hosting platforms exist in China, they are not yet powerful alternatives to their Western counterparts. Gitee, funded by the Ministry of Industry and Information Technology in 2020, is the most promising, even though it trails GitHub by a wide margin on all metrics related to users and engagement.
Chinese firms have also been building alternatives to US-origin packages and frameworks for machine learning, databases, and application services. While still nascent in their adoption and usability, they are likely to benefit from scale and quick feedback cycles from a thriving industry. For example, these software packages are being piloted and used by electric vehicle makers. It should not be long before they improve significantly in latency and other critical functionality, bolstered by local talent cliques, and eventually permeate global markets.
DeepSeek is a paradigm shift for the global AI landscape, away from the brute-force pre-training on internet-scale data costing billions of dollars that generative AI was previously thought to require. It is an important signal and case study highlighting the developments emerging from China's open source ecosystem. It is in this light that the rise of open source in China needs closer attention.
China already has a thriving open source ecosystem. It may only be a matter of time before it has an indigenous and thriving open source ecosystem.