28 Sep DALL-E: the algorithm that generates images from text descriptions
Developed by OpenAI and presented on 5 January 2021, DALL-E marks an extraordinary turning point in the evolution of Artificial Intelligence. Let’s see why.
What is it and how does it work?
If its name rings a bell, that is no coincidence: DALL-E is a tribute to the artist Salvador Dalí and Pixar’s robot, WALL-E.
Capable of producing images from text descriptions, this system is a 12-billion-parameter variant of the natural language processing model GPT-3. Whereas GPT-3 generates text that is often indistinguishable from text written by a human, DALL-E generates images that interpret a text description.
Trained on text-image pairs, DALL-E learns a graphical representation of the individual terms in a description. It can then combine concepts and modify existing images, so it can be asked to satisfy all sorts of whims, such as “rocket with a tutu”, which produces the following images:
In some cases, if very specific requests are made, DALL-E tends to forget or swap certain elements.
For example, “A big dog with a blue sweatshirt, green sunglasses and black headphones” gives the following results:
DALL-E was not created by OpenAI researchers with a specific goal in mind, but could certainly be an ideal system for creative fields such as interior design and fashion, where there is a need for visual imagination and new inspiration.
April 2022: introducing DALL-E 2
A new version of this system was introduced five months ago. While DALL-E could already produce accurate images, DALL-E 2 creates even more realistic images from a natural language description, at four times the resolution.
The amazing thing about DALL-E 2 is that it can expand images beyond the original canvas, taking into account shadows, reflections and textures. It uses a process called “diffusion”, which starts with a pattern of random dots and gradually alters that pattern towards an image as it recognises specific aspects of that image.
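To get a feel for the idea of diffusion, here is a minimal toy sketch in Python. It is not how DALL-E 2 is implemented: the real system uses a trained neural network, guided by the text description, to decide how to change the pattern at each step. In this sketch the "denoiser" simply cheats by knowing the target in advance, and the names `toy_reverse_diffusion` and `mean_abs_error`, the step count and the constants are all assumptions made for illustration.

```python
import random


def toy_reverse_diffusion(target, steps=50, seed=0):
    """Gradually turn random noise into `target`, a list of pixel values."""
    rng = random.Random(seed)
    # Start from pure noise: one random value per pixel.
    image = [rng.gauss(0.0, 1.0) for _ in target]
    history = [list(image)]
    for step in range(steps):
        # The amount of re-injected noise shrinks to zero by the last step.
        noise_scale = 1.0 - (step + 1) / steps
        new_image = []
        for pixel, goal in zip(image, target):
            # Stand-in for a learned denoiser: nudge the pixel towards the goal.
            denoised = pixel + 0.2 * (goal - pixel)
            # Add back a little randomness, less on every step.
            new_image.append(denoised + 0.05 * noise_scale * rng.gauss(0.0, 1.0))
        image = new_image
        history.append(list(image))
    return image, history


def mean_abs_error(image, target):
    """Average distance between the current pattern and the target."""
    return sum(abs(a - b) for a, b in zip(image, target)) / len(target)


# A tiny one-dimensional "image": a brightness ramp up and back down.
target = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]
final, history = toy_reverse_diffusion(target)
print("error at the start:", round(mean_abs_error(history[0], target), 3))
print("error at the end:  ", round(mean_abs_error(final, target), 3))
```

Running it shows the error shrinking as the random pattern is pulled, step by step, towards the target. The many small steps, each removing a bit of noise, are what this toy shares with the process described above.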
OpenAI’s stated goal is to enable people to express themselves creatively. DALL-E 2 also helps us understand how advanced AI systems see and perceive our world, which OpenAI considers critical to its mission of creating AI that benefits us all.
Both systems are the result of the continuous evolution of Artificial Intelligence. What other innovations will we see emerge and develop as the years go by?