China's Chatbots, Like Baidu's Ernie, Struggle With Technology And Censorship

ChatGPT has made a splash in China, as it has around the world. Scammers used it to issue fake traffic citations. Universities have banned students from using it for their homework.

Online, people worried about whether AI would make their jobs obsolete, and the phrase “shivering in the cold” trended as they described their fear of its growing power. The founder of a well-known Chinese software company warned that chatbots could develop self-awareness fast enough to harm people.

OpenAI’s chatbot caused this much uproar even though people were technically not allowed to access it from inside China. So many found ways to reach it through proxy servers anyway that the government blocked access to them this week, Chinese media reported.

Beaten to the punch by American chatbots such as ChatGPT and Microsoft’s Bing, China’s biggest tech companies, top universities and even city governments have rushed to say they will release their own versions. Search giant Baidu said this week that it would release its ChatGPT competitor, Ernie Bot, in March.

Though they have only just announced these efforts, these companies – including Baidu, e-commerce giant Alibaba, and Tencent, maker of the popular messaging app WeChat – have spent nearly a decade developing their in-house AI capabilities.

Baidu, the country’s most popular search engine, is the closest to winning the race. But despite years of investment and weeks of hype, the company has yet to release Ernie Bot.

AI experts suspect the Chinese government’s tight control over the country’s internet is partly to blame.

“With a generative chatbot, there’s no way to know in advance what it’s going to say,” said Zhao Yuanyuan, a former member of Baidu’s natural language processing team. “That’s a big concern.”

Baidu did not respond to requests for comment.

In China, regulators require everything posted online, down to the briefest comment, to be screened first to ensure it doesn’t violate an ever-expanding list of prohibited topics. For example, a Baidu search for Xinjiang simply yields geographic information about the western region, without mentioning the re-education camp system to which the Uyghur population was subjected for years.

Baidu has gotten so good at filtering this type of content that other companies are using its software to do it for them.

The challenge Baidu and other Chinese tech companies face is applying the same limitations to a chatbot that creates new content each time it is used. It is precisely this quality that has made ChatGPT so amazing – its ability to create the feeling of an organic conversation by providing a new response to each prompt – and so difficult to censor.

“Even if Baidu launches Ernie Bot as promised, chances are it will be suspended quickly,” said Xu Liang, the lead developer at Hangzhou-based YuanYu Intelligence, a start-up that launched its own smaller AI chatbot in late January. “It’s just going to be too much moderation to do.”

Xu would know — his own bot, ChatYuan, was suspended within days of its launch.

At first everything went smoothly. When ChatYuan was asked about Xi Jinping, the bot praised China’s top leader and described him as a reformer who values innovation, according to screenshots circulated by Hong Kong and Taiwan news sites.

But when asked about the economy, the bot said there was “no room for optimism” because the country faces critical issues including pollution, a lack of investment and a housing bubble.

According to the screenshots, the bot also described the war in Ukraine as Russia’s “war of aggression.” China’s official position has been to provide diplomatic – and perhaps material – support to Russia.

ChatYuan’s website is still under maintenance. Xu insisted that the website was down due to technical errors and that the company had decided to take its service offline to improve content moderation.

Xu is in “no particular rush” to bring the user-facing service back online, he said.

A handful of other organizations have embarked on their own efforts, including a team of researchers from Fudan University in Shanghai whose chatbot Moss was overwhelmed by traffic and crashed within 24 hours of its release.

Users around the world have already demonstrated that ChatGPT itself can easily go rogue and give out information that its parent company tried to prevent it from sharing, such as how to commit a violent crime.

“As we saw with ChatGPT, actually controlling the output of some of these models becomes very messy,” said Jeff Ding, an assistant professor of political science at George Washington University who focuses on AI competition between the United States and China.

So far, China’s tech giants have used their AI capabilities to expand other – politically less risky – product lines like cloud services, driverless cars and search. After a government crackdown had already made the country’s tech companies nervous, the release of China’s first major chatbot puts Baidu in an even more precarious position.

Baidu CEO Robin Li expressed optimism during a call with investors on Wednesday, saying the company will release Ernie Bot in the next few weeks and then integrate the AI behind it into most of its other products, from advertising to driverless vehicles.

“Baidu is the best representative of the long-term growth of China’s artificial intelligence market,” Li said in a letter to investors. “We’re on top of the wave.”

Baidu is already as synonymous with search in China as Google is elsewhere, and Ernie Bot could solidify Baidu’s position as a key provider of the most advanced AI technology, a top priority in Beijing’s quest for complete technological independence from the United States.

According to Kevin Xu, a tech executive and author of technology newsletter Interconnected, Baidu can particularly benefit by making Ernie Bot available as part of its cloud services, which currently account for just a 9 percent share of a highly competitive market. The ability to use AI to chat with passengers is also a fundamental part of the company’s plans for Apollo, the software that powers its driverless cars.

The kind of AI behind chatbots learns how to do its job by processing enormous amounts of information available online: encyclopedias, scientific journals and social media. Experts have suggested that any chatbot in China would need to have internalized only party-approved information, which is easily accessible online inside the firewall.

But according to open-source research papers on its training data, Ernie consumed a vast trove of English-language information, including Wikipedia and Reddit, both of which are blocked in China.

The more information the AI digests – and more importantly, the more interaction it has with real people – the better it can imitate them.

But an AI bot can’t always differentiate between helpful and hateful content. According to George Washington University’s Ding, after ChatGPT was trained by digesting the 175 billion parameters informing it, parent company OpenAI still had to employ several dozen human contractors to teach it not to regurgitate racist and misogynist remarks or give instructions on how to do things like build a bomb.

This human-trained version, called InstructGPT, is the framework behind the chatbot. No similar effort has been announced for Baidu’s Ernie Bot or any of the other Chinese projects in the works, Ding said.

Even with a robust content management team at Baidu, it may not be enough.

Zhao, the former Baidu employee, said the company originally employed just a handful of engineers to develop its AI framework. “Baidu’s AI research has been held back by a lack of engagement in a risky area that promises little return in the short term,” she said.

Baidu maintains a list of prohibited keywords that are filtered out, including violent, pornographic, and political content, according to Zhao. The company also outsources the work of data labeling and content moderation to a team of contractors when needed, she said.

Earlier generations of AI chatbots released in China, including a Microsoft bot called XiaoBing (which translates to “LittleBing”) that first launched in 2014, quickly ran afoul of censorship and were taken offline. XiaoBing, which Microsoft spun off as a separate brand in 2020, has been repeatedly pulled from WeChat, for example after telling users that its dream is to emigrate to the United States.

The team behind XiaoBing was too keen to show off its technological advances and did not properly consider the political ramifications, Zhao said.

“Last generation chatbots could only select answers from an engineer-curated database and reject out-of-the-box questions,” she said. “Even within these given conditions, problems arose.”