
Baidu’s AI Image Generation Is a Literal Meme Machine

By Weilin Li
Mar. 28, 2023 updated 10:20

In recent days, Baidu's AI chatbot and image generator, ERNIE Bot (Wenxin Yiyan in Chinese), has produced numerous hilarious, flawed images. Most of the errors involve Chinese idioms and dish names, some of which contain animal names without referring to actual animals.

Meanwhile, users have found that some of the strange images result from machine translation. Specifically, they believe Baidu translates prompts from Chinese into English before generating images, rather than generating images directly from the Chinese prompts.

"Bus" and "mouse" resulted in the generation of inaccurate images on ERNIE Bot.

First, a widely circulated image involves the term "bus," which in computing refers to an internal structure that acts as a shared channel for transmitting information.

In English, this term shares its name with a type of vehicle; in Chinese, the two meanings are expressed by separate words. Yet when the Chinese computer term is used as a prompt, ERNIE Bot generates an image of the vehicle, which users attribute to the prompt passing through English first.

Similarly, when the Chinese term for a "computer mouse" is used in the prompt, ERNIE Bot generates an image of a mouse, the animal.

ERNIE Bot separated the characters for "old," "lady" and "cake" when referring to the pastry Lao Po Bing.

Online users then discovered that Baidu's AI also rendered a series of dish names very inaccurately.

Lao Po Bing is a type of Chinese pastry. Lao Po means "wife," but taken separately the two characters mean "old lady" in Chinese. The results suggest that ERNIE Bot split Lao Po literally and depicted an old lady and a cake.

When translating a pork meatball dish, ERNIE Bot generated a misleading image of a red, flaming lion's head.

Hong Shao Shi Zi Tou is a classic Chinese dish that consists of large pork meatballs braised in a gravy along with vegetables. Although Shi Zi Tou translates to “lion’s head,” the name actually refers to the meatballs' round shape rather than any actual lion parts. However, ERNIE Bot made the mistake of translating the name literally, resulting in a misleading image of a red, flaming lion's head.

ERNIE Bot's interpretation of San Bei Ji is literal, showing three chicken-cup hybrids.

The dish San Bei Ji, which translates to "three cup chicken," typically consists of chicken braised in soy sauce, sesame oil, and rice wine. However, ERNIE Bot's interpretation is quite literal, showing three chicken-cup hybrids.

ERNIE Bot's image interpretation of "monkey year, horse month."

Hou Nian Ma Yue is a Chinese idiom that literally means "monkey year, horse month." It is used to describe an uncertain or unpredictable time in the future, similar to the English idiom "sometime in the never-never."

The phrase comes from the Chinese zodiac, where each year and month is associated with an animal. ERNIE Bot's image, however, appears to render the phrase literally, showing a monkey, a horse, and the moon as separate elements (the character Yue means both "month" and "moon").

ERNIE Bot has depicted a smartphone and an apple separately to represent the iPhone.

The iPhone, a line of smartphones created by Apple Inc., is another instance of misinterpretation by the AI.

Baidu's latest product, launched in pursuit of ChatGPT's success, has sparked heated discussion among users and fans about these "off-topic" images, with some saying they are not at all worried about losing their jobs to AI.

On March 23rd, Baidu issued a statement in response to the feedback on ERNIE Bot's text-to-image function. The company clarified that the tool's image generation abilities come from its self-developed text-to-image model, ERNIE-ViLG.

The statement stressed that ERNIE Bot is still in the developmental phase. "ERNIE Bot is constantly learning and growing as everyone uses it. Please have some confidence and give our self-developed technology and products some time. Do not spread rumors, and we hope that ERNIE Bot can bring more joy to everyone."