Asterisk: To start, could you say a little bit about who you are and what you do?
Jeffrey Ding: I’m an assistant professor at George Washington University’s political science department, where I research emerging technologies and international relations. I also publish a weekly China AI newsletter that features translations of writings by Chinese scholars and bloggers on AI-related topics.
A: It’s a great newsletter. And in it you like to correct misconceptions that American scholars have about AI in China. So I’m curious: Right now, what’s the most annoying misconception you see in that area?
J: I think one of the consistent misconceptions is the overestimation of China’s AI capabilities. Part of this stems from the July 2017 national development plan, in which China elevated AI to be a strategic priority. A lot of Western observers just assumed that meant China was a leader in this space. Many prominent voices called it a Sputnik moment — a wake-up call that the U.S. was falling behind in strategic technology.
Overestimation also extends to recent developments in large language models. This happens every time a new Chinese model is released, like Wu Dao 2.0 from a year or so ago: We all thought this was a symbol of bigger, stronger, faster AI from China. And now nobody talks about Wu Dao 2.0. No paper was released. There’s no public-facing API. Even the leading competitor to ChatGPT, Baidu’s Ernie Bot, does not measure up on a lot of different natural language processing benchmarks. So that would be the number one misconception: this tendency to overhype developments in China.
A: The sense that I get from hearing people talk about these models is that they’re closer to maybe a year behind leading American models than, say, five years or 10 years behind. Is that right?
J: Yeah. I recently co-wrote a report on trends in these large language models. If you track when GPT-3 was released and when Chinese labs were able to put out alternatives that performed as capably on different benchmarks, it was about one and a half to two years later.
A: Naively, I’d expect if Chinese labs were clearly capable of building and training these models, and doing it fairly quickly, eventually they’d produce a model that’s comparable. Why isn’t that happening?
J: There’s a lot more freedom to experiment and push the technological frontier at labs like OpenAI and DeepMind. These are very unique entities. They aren’t restricted by needing to meet anything like key performance indicators or other commercial drivers. The best labs in China, by contrast — Alibaba DAMO Academy, Tencent — have to meet KPIs for making money. There’s more leeway and more runway for companies like OpenAI and DeepMind to invest in pushing forward the technological frontier. So it makes sense that Chinese labs invest those resources, that talent, and that time into developing something similar to GPT-3 only once that trajectory has already been established.
A: Do you think that’s going to change as there’s a more widespread awareness of scaling laws — that it seems like you can reliably put in more compute and get out a more powerful model?
J: Maybe. I’m actually not completely bought in on the “scaling dominates everything” argument. The difference between GPT-3 and ChatGPT was not necessarily a difference of scaling. It was an advance called InstructGPT, which used human input to make the models better and less toxic. That was the type of innovation that was actually missing. When we reviewed 26 different large language models from different labs in China’s ecosystem, I did not see anything like InstructGPT. So I think it’s not just about the resources. It’s also these conceptual and engineering innovations.
A: Just to explain, these models are now trained using a technique called reinforcement learning from human feedback (RLHF), where humans provide input that’s used to train the model to give responses more like the ones that the humans said were good. Is that technique not commonly used in China?
J: I have not seen a paper published where RLHF was used to train a large language model. The report only covers 2020 to 2022, so I might have missed something. But yeah, I think the general principle seems obvious and seems like it would be easy to implement. I wouldn’t be surprised if actually there’s a lot of engineering-related tacit knowledge involved with doing something like InstructGPT. That’s actually very hard to discern from just reading the arXiv paper.
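For readers unfamiliar with the mechanics, the core idea behind RLHF can be sketched in a few lines. This is a deliberately minimal toy — the responses and preference data below are hypothetical, and real systems train neural reward models and policies rather than per-response scalars — but it shows the two steps: fit a reward model to pairwise human preferences, then shift the policy toward high-reward outputs.

```python
import math

# Toy sketch of the RLHF loop (hypothetical data; real systems train neural
# reward models and policies, not per-response scalars).
responses = ["helpful answer", "rude answer", "off-topic answer"]

# 1) Hypothetical human preference data: (preferred_index, rejected_index).
preferences = [(0, 1), (0, 2), (2, 1)]

# 2) Fit one scalar reward per response with a Bradley-Terry logistic loss:
#    the reward model should score human-preferred responses higher.
rewards = [0.0, 0.0, 0.0]
lr = 0.5
for _ in range(200):
    for win, lose in preferences:
        # Probability the reward model currently thinks `win` beats `lose`.
        p = 1.0 / (1.0 + math.exp(rewards[lose] - rewards[win]))
        # Gradient step widens the margin rewards[win] - rewards[lose].
        rewards[win] += lr * (1.0 - p)
        rewards[lose] -= lr * (1.0 - p)

# 3) Re-weight a uniform "policy" toward high-reward responses (softmax).
z = sum(math.exp(r) for r in rewards)
policy = [math.exp(r) / z for r in rewards]

print(responses[policy.index(max(policy))])  # → helpful answer
```

The engineering tacit knowledge Ding alludes to lives in scaling this loop to real models and real annotator pipelines, not in the principle itself.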
A: Back in 2018, you wrote a report called Deciphering China’s AI Dream. Another misconception you were trying to correct in that report was that China must have a very centralized, top-down policy on AI, when in fact there are a lot of different bureaucratic, local, and corporate interests that all cut against each other. I’m curious if in observing the past five years you’ve seen more of a push toward centralization, or if it’s still pretty diffuse.
J: It’s a great question, and it’s been top of mind because recently China has implemented some reforms to the Ministry of Science and Technology (MOST), which elevated it to a higher level in terms of guiding the overall direction of science and tech policy. I think one could read that as a driving force toward more centralization, and that’s tied to concerns about supply cutoffs and issues with foreign technology dependency. At the same time, what I found very interesting about that reform and reorganization was that they also took away some of MOST’s responsibilities. And one of those key responsibilities was overseeing grant management of big science and technology grants.
This has been a long-standing debate about science and technology policy in China: whether the grants should be managed by bureaucrats or managed and overseen through a more bottom-up process where scientists get more input. So you could read that part of the reorganization as actually decentralizing the grant management process and giving more power back to the scientists. I still think that the push and pull is going to exist, and we see that reflected through the recent MOST reorganization.
A: In April, China released a new set of regulations on generative AI. I’ve seen a lot of discussion on how restrictive these rules are for what companies can put into their training data, and how it could cripple Chinese labs. What’s your take on that?
J: It’s important to note that these are draft regulations — and often the draft gets significantly revised or softened. We saw that with data localization requirements in the cybersecurity law a few years ago. As to the specific training data provisions, I believe you’re referencing the requirements about not using any training data that has personal information attached to it —
A: And also ensuring that the data be accurate and objective, and doesn’t infringe on intellectual property, etc.
J: Right. If that were enforced strictly to the letter, it would definitely impose certain constraints. I think what’s more likely to happen is that those regulations will be lessened.
What I think won’t change, though — and I think this will continue to pose a barrier to Chinese companies in this space — are the Chinese government’s concerns with internet content providers, especially those providers who have “public opinion properties” and have “social mobilization capacities.” Those are terms of art used by the Chinese government. Part of the reason China’s internet is so censored is that they put the onus on companies to control their content so it’s not politically sensitive. And so to apply that burden not just to WeChat or Baidu search results, but to something like Ernie Bot or another LLM would make it very hard for Chinese companies to meet those requirements.
A: When OpenAI trains ChatGPT to not say something racist or hallucinate, the thing they’re using is RLHF, or something like it. And if Chinese labs don’t use those techniques, I can see how it would be extremely difficult for them to make sure that Ernie Bot doesn’t start talking about Tiananmen Square.
J: Right. And you can’t ensure that the pre-output censorship that happens in the training process is going to be perfect. They would have to implement some sort of post-model-output censorship stage that OpenAI doesn’t have to implement. That’s a huge burden. What companies with LLMs might do instead is optimize for business-facing applications that don’t have public opinion properties or social-mobilization capacities.
A: So instead of a proliferation of chatbots, China would see business-facing applications that are mostly invisible to the average consumer.
J: Yes, exactly.
A: Talking about how this might diffuse through the economy brings us to another point. You recently wrote a paper breaking down what makes countries competitive technologically, and you drew a difference between nations that lead in innovation and nations that lead in diffusion. Could you summarize that argument?
J: This paper focused on how countries leverage new science and technology advances to sustain higher economic and productivity growth, which historically has been a key step in the rise and fall of great powers. Britain, for example, established productivity leadership and then translated that economic power into military and geopolitical influence after the first industrial revolution. My argument is that when we measure national scientific and technological capabilities, we overweight innovation capacity or other metrics that are closely tied to a country’s ability to pioneer new initial advances. And we underweight diffusion capacity, which is a country’s ability to diffuse, spread, and embed these advances in productive processes across the whole economy.
A: You have a great example of this: In the late 19th century, the United States was pretty weak in innovation capacity. There were not a lot of new innovations coming out of the U.S., but it was very, very good at taking the advances coming from Europe — chemical engineering, among others — and integrating them into industry. Another example is that after World War II, the Soviet Union was quite strong in innovation, with many great scientists, but as a country it struggled to diffuse those advances through its economy. And, of course, we can see how those two situations played out.
J: Exactly. So the idea here is if you only rely on innovation capacity metrics, you arrive at misleading assessments of a country’s ability to sustain growth in the future.
A: And your argument is that U.S. commentators are too focused on China’s innovation capacity, and ignore the fact that China is much weaker on diffusion.
J: Yeah. Right now, there’s a lot of discussion among U.S. policymakers about how the U.S. will soon face this innovation deficit, as framed through indicators like R&D spending, total patents and publications, and high-end STEM talent. I wanted to look more closely at what the diffusion capacity indicators would say. So I looked across fields related to AI: information and communication technologies, cloud computing, and even more basic metrics like household access to computers.
I found that on a lot of these different indicators, China’s diffusion capacity was much, much lower than its relative innovation capacity. If you compare indicators of innovation capacity, such as total R&D spending of its top three companies, or the rankings of its top three universities, China scores extremely high. But when you look at indicators of diffusion capacity — the adoption rate of different information and communications technologies across businesses, or how close and strong the linkages between academia and industry are — China ranked as a middling science and technology power.
A: There are some information technologies that are very widespread in China, like digital cash. Do you have a sense of why, say, WeChat could take over everything so quickly, while for other technologies, like cloud computing or industrial robotics, there seem to be much deeper barriers?
J: In some of these areas, such as financial payments, there’s just more opportunity for leapfrogging legacy systems. The reason for the fast diffusion in digital payment technologies is that there weren’t firmly established legacy methods of credit card payments. Another example is high-speed rail: The government invested heavily in infrastructure, and China became a forerunner in adopting high-speed rail technology at scale.
But when it comes to technologies that have an outsize impact on productivity growth — cloud computing, industrial robotics, industrial software — the ability to leapfrog legacy systems doesn’t apply. China will need to invest in earlier generations of the technology and accumulate expertise in a more gradual way. And so in those industries, China will struggle with its diffusion capacity.
A: So circling back to AI, it doesn’t seem obvious which bucket it falls into.
J: There’s a couple of ways to think about it. Another quirk about some of these technologies that China has been able to diffuse at scale is that they are consumer facing. They don’t require a lot of complementary skills and technologies to adopt. You don’t need a wide pool of talent to ensure the spread of a digital payment technology across an entire country. But you absolutely do when it comes to industrial robotics, software, and cloud computing.
That is one of the factors that makes me think that China’s diffusion capacity in AI will follow the same trends that we’ve seen in cloud computing and industrial robotics. I’ve looked at different metrics to compare different countries’ abilities to train average AI engineers. I testified before the U.S.-China Economic and Security Review Commission recently and presented data on the number of universities in both the U.S. and China that have at least one researcher that has published in a top AI conference. And I believe about 100 universities in China met that very low baseline and about 400 universities in the U.S. surpassed that baseline. I’m only talking here about the talent necessary to fine-tune a large language model that’s already been trained and apply it to a specific task.
A: Which is not nearly as technically complicated as training it in the first place.
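To illustrate the gap in difficulty between pretraining and task-specific fine-tuning, here is a minimal toy sketch (everything in it is hypothetical): the “pretrained” feature extractor stays frozen, standing in for a model someone else already trained at great expense, and only a small classification head is trained on a handful of labeled examples.

```python
import math

# Toy sketch of fine-tuning on top of a frozen "pretrained" model
# (hypothetical features and data).
def features(x):
    # Imagine these features came out of large-scale pretraining; they are
    # frozen, so fine-tuning never touches this function.
    return [x, x * x, 1.0]

# Tiny task-specific dataset: classify whether x is above 2.
data = [(0.0, 0), (1.0, 0), (3.0, 1), (4.0, 1)]

# Fine-tuning: a logistic-regression head over the frozen features.
w = [0.0, 0.0, 0.0]
lr = 0.1
for _ in range(500):
    for x, y in data:
        f = features(x)
        p = 1.0 / (1.0 + math.exp(-sum(wi * fi for wi, fi in zip(w, f))))
        for i in range(3):
            w[i] += lr * (y - p) * f[i]  # gradient step on the head only

def predict(x):
    return int(sum(wi * fi for wi, fi in zip(w, features(x))) > 0)

print([predict(x) for x, _ in data])  # → [0, 0, 1, 1]
```

The head has three parameters and trains in milliseconds; the expensive, expertise-intensive part all happened upstream, which is why a far shallower talent pool suffices for this kind of application work.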
A: And I can also imagine that if, as you said, these models aren’t deployed as consumer-facing software because of censorship concerns, that would slow diffusion as well.
J: Yeah. And open source techniques which allow for faster diffusion — where capabilities and issues are effectively crowdsourced — aren’t available to Chinese companies right now.
A: Many of the tools that are used to make AI are open source, like PyTorch and TensorFlow. Are these same tools used in China, or is it mostly domestic equivalents?
J: The vast majority of researchers use PyTorch, TensorFlow, and other U.S.-developed open source frameworks in their development work. This is based on survey data of researchers and students in the computer vision field, where, as of 2021, only about 3% of respondents said that they would use domestic alternatives such as Baidu’s PaddlePaddle or Huawei’s MindSpore. Recently there’s been a lot more attention on trying to build a domestic full-stack software and hardware system for developing models. So Huawei’s Pangu was trained using its MindSpore framework and Huawei Ascend chips.
A: So, what can a country do to increase diffusion capacity? It seems like it’s much easier to spend a lot of money on R&D than it is to get everyone to adopt slightly better industrial processes.
J: There are a lot of different factors.
One bucket is decentralization, which often correlates with higher diffusion capacity in science and tech. Instead of picking winners and locking in a particular trajectory, a decentralized ecosystem enables diffusion from the bottom up because the most successful trajectory is allowed to emerge.
Another factor is human capital. The Chinese government has been very good about hitting R&D targets because that’s relatively easy to do — they can just mandate spending in different areas. But the government has been much less successful in investing in more widespread technical education. Investing in community colleges and vocational training opportunities, for example, can raise the average level of engineering talent — that’s the kind of policy that promotes more diffusion capacity.
And then the third bucket would be a bunch of random factors that affect diffusion capacity: the latecomer advantage of being able to leapfrog legacy systems, whether a country has a standardized language, the strength of communication channels, culture. I think it’s very hard to pin down just a few factors that would affect diffusion capacity.
A: I want to turn to AI safety. You’ve written about how, contrary to what some Western commentators seem to think, Chinese AI researchers are concerned about long-term risk and dangers from AI development. How closely does that conversation track the Western conversation?
J: This is work that needs to be done on a more systematic basis, but we looked at 20 or so different large language models and found that about half of them had a section devoted to ethical, governance, and safety-related issues. It seemed like the focus was mostly on issues of bias, fairness, and toxic content rather than concerns about artificial general intelligence, for instance.
That’s not to say that there aren’t researchers who are discussing AGI-related concerns. For example, a previous issue of my newsletter featured writings by Nanjing University professor Zhou Zhihua 周志华, who leads one of the top teams in China. He talks about how researchers should not even touch strong AI or a close equivalent to artificial general intelligence. And this was published in the China Computer Federation publication, which features writings from leading computer scientists in China. Those discussions are happening. But I would say discussions on AGI and long-term AI safety issues are not as robust and deep in China as compared to Western countries.
A: Is your sense that the bias-and-fairness debate is happening in response to the Western debate over the same issues, or are those issues arising independently?
J: I think a lot of the bias and fairness concerns are coming from the diffusion of norms from Western organizations.
J: But on other issues, like privacy, the conversation is being driven by the concerns of the Chinese public and a growing backlash to the intrusiveness of AI applications. One of the most important tipping points that we almost never talk about is Chinese delivery drivers. There was a big investigative report about the constraints imposed on delivery drivers by algorithms that calculated how much time they had to meet their delivery requirements. It shone a spotlight on algorithms and the huge role they play in manipulating people’s lives.
A: It’s interesting to hear that privacy is such an organic concern — at least as an American with the stereotype of China as a state with no digital privacy.
J: Privacy concerns look a little different in China, where more of the focus is on the instrumental benefits of privacy — how to prevent someone from hacking into your bank account and stealing all of your money, for instance — rather than privacy as this intrinsic civil right that serves as a check against the worst abuses of government.
But among Chinese academics who might have more of a protected position to say certain things, there’s a fair amount who do talk about the need for privacy as more of a civil right or to check against government abuse. I've translated work by scholars such as Tsinghua professor Lao Dongyan 劳东燕, who’s criticized the use of facial recognition in the Beijing metro system. And there’s actually also been a lot of pushback against the continuing use of QR health codes as COVID control winds down. Oftentimes we only see the surveillance state growing in its reach, but I think this is an example where the surveillance state has been curtailed. So it is a more nuanced picture than just Orwellian, authoritarian government with complete control.