Meet Dall-E, the artificial intelligence that is a meme factory
Technology mixes language and images, helps graphic artists, but can accelerate misinformation
At OpenAI, one of the world’s most ambitious artificial intelligence (AI) labs, researchers have developed technology that lets people create digital images simply by describing what they want to see.
They call it Dall-E, a nod both to “WALL-E,” the 2008 animated film about an autonomous robot, and to the surrealist painter Salvador Dalí.
Alex Nichol, one of the researchers on the system, demonstrated how it works: When he asked for “an avocado-shaped teapot” by typing those words, the system created ten different images of an avocado-green teapot, some with pits and some without.
The idea is that the tool will provide graphic artists with new shortcuts and new ideas in the production of digital images.
For many experts, however, Dall-E causes concern. As this kind of technology improves, they say, it could help spread misinformation on the internet, fueling the kind of online campaign that may have influenced the 2016 US presidential election.
“You can use it for good things, but you can certainly use it for all sorts of crazy and worrisome applications, including ‘deepfakes,’” like misleading photos and videos, said Subbarao Kambhampati, a professor of computer science at Arizona State University.
There is already misinformation online, but the concern is that technology like Dall-E could take it to new levels, says Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence, an AI lab in Seattle. “We can forge text. We can put text in someone’s voice. And we can forge images and videos,” he said.
To try to avoid this risk, OpenAI does not allow outsiders to use Dall-E on their own and puts a watermark in the corner of each image it generates.
But there are other risks. Because they learn their skills from huge sets of text, images and other online data that can include bias, AI systems can also generate biased content, for example against women or Black people. The creation of pornography and hate speech, and the fueling of trolling campaigns, are also concerns.
Understand how Dall-E, which is based on a neural network, works
About five years ago, the world’s leading AI labs built systems capable of identifying objects in digital images and even generating images on their own, such as flowers, dogs, cars and faces.
A few years later, they built systems that do the same with written language, summarizing articles, answering questions, generating tweets and even writing blog posts.
Dall-E is what AI researchers call a neural network: a mathematical system loosely based on the brain’s network of neurons. It is the technology that recognizes commands spoken on smartphones and identifies the presence of pedestrians for self-driving cars.
A neural network learns skills by analyzing large amounts of data. By identifying patterns in thousands of avocado photos, for example, it learns to recognize an avocado.
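The learning process described above can be illustrated with a toy example. This is not Dall-E’s actual code — just a single artificial “neuron,” the simplest building block of a neural network, learning to separate made-up examples by repeated exposure; the data, features and labels are all invented for illustration.

```python
import random

# Toy sketch (not Dall-E's real model): one artificial neuron learns to
# tell two kinds of examples apart. Each "image" is reduced to two
# invented numeric features (say, greenness and roundness).
random.seed(0)

# Fabricated training data: (features, label) pairs.
# Label 1 = "avocado-like", label 0 = "not avocado-like".
data = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.1, 0.2], 0), ([0.2, 0.1], 0)]

weights = [0.0, 0.0]
bias = 0.0

def predict(x):
    # The neuron fires (outputs 1) if the weighted sum is positive.
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

# Repeated exposure to the examples nudges the weights toward the pattern.
for _ in range(20):
    for x, label in data:
        error = label - predict(x)
        weights = [w + 0.1 * error * xi for w, xi in zip(weights, x)]
        bias += 0.1 * error

print([predict(x) for x, _ in data])
```

Real neural networks work on the same principle, but with millions of neurons and far richer data.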
Dall-E looks for patterns by analyzing millions of digital images and the captions that describe them. In this way, it learns to recognize the links between images and words.
When someone describes an image to Dall-E, it generates a set of features, like the line on the edge of a trumpet or the curve in the ear of a teddy bear.
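The idea of turning a caption into “features” can be sketched in a few lines. This is a deliberately crude stand-in, not Dall-E’s representation: real systems learn these numbers from data, while here they are derived mechanically from the words, just to show that a sentence can become a short list of numbers a second model can work with.

```python
# Toy sketch (not Dall-E's actual representation): convert a caption
# into a small list of numbers ("features"). A second model could then
# turn those numbers into an image.
def caption_to_features(caption, size=4):
    features = [0.0] * size
    for word in caption.lower().split():
        # Derive a stable number from each word's character codes
        # (Python's built-in hash() varies between runs).
        code = sum(ord(c) for c in word)
        features[code % size] += 1.0
    return features

print(caption_to_features("an avocado shaped teapot"))
```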
Then a second neural network, called a diffusion model, creates the image. The latest version of Dall-E, revealed in April, generates high-resolution images that in many cases look like photos.
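The diffusion idea — starting from pure noise and refining it step by step — can be caricatured like this. A real diffusion model learns how to remove noise from training data; in this hand-made sketch the “image” is just four invented pixel values, and each step simply pulls the noisy pixels a little closer to the target.

```python
import random

# Toy sketch of diffusion (not Dall-E's real model): begin with random
# noise and refine it over many small denoising steps until a target
# "image" (four made-up pixel values) emerges.
random.seed(42)

target = [0.2, 0.9, 0.5, 0.7]              # the "image" the features describe
image = [random.random() for _ in target]  # start from pure noise

for step in range(50):
    # Each step removes a little noise, moving pixels toward the target.
    image = [p + 0.2 * (t - p) for p, t in zip(image, target)]

print([round(p, 2) for p in image])
```

After 50 steps, almost none of the original noise remains, and the output matches the target to two decimal places.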
While Dall-E often fails to understand what someone has described and sometimes produces distorted images, OpenAI continues to improve the technology.
Researchers can often refine a neural network’s abilities by feeding it even greater amounts of data.