DALL-E: A new type of image has been born, coming from the deepest layers of artificial intelligence (AI). But the history of photography is littered with “ghosts”.
The image on the right shows a person who does not exist, one that anyone can generate on this website. Such faces are the ghosts of a type of generative artificial intelligence that is also responsible for other creations, such as the famous deepfakes.
In the 21st century we are witnessing the birth of a new type of image that comes from the deepest layers of artificial intelligence (AI).
From the mental images that the first humans formed around speech and fire to today's automated ones, it has been a long technological road full of incredible moments.
Capturing eclipses in a dark box
The firmament was always a fascinating setting, giving rise to both mythological and scientific interpretations.
Centuries ago, some scholars thought that, instead of projecting their imagination onto the firmament, it might be more interesting to project its light into a dark chamber and study it. This is how the first images of the firmament began to be formed. It was one of the uses of the camera obscura, in which light entering through a small hole projects the outside scene, for example a solar eclipse, onto one of the inner walls.
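The geometry of the small hole can be sketched in a few lines. This is a minimal illustration under an idealized assumption (a hole so small that a single ray passes through for each point of the scene); the function name, the coordinates and the distances are hypothetical and chosen only for the example.

```python
def project_through_pinhole(point, wall_distance):
    """Project a scene point onto the back wall of a camera obscura.

    The hole sits at the origin, the scene lies in front of it (z > 0) and
    the wall lies `wall_distance` behind it. Units are arbitrary but consistent.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("the point must be in front of the hole (z > 0)")
    scale = wall_distance / z
    # The minus signs express the inversion: rays cross at the hole,
    # so the projected image appears upside down and mirrored.
    return (-x * scale, -y * scale)


# Hypothetical scene: a point 1 m to the side and 2 m above the axis,
# 10 m away from a box that is 0.5 m deep.
scene_point = (1.0, 2.0, 10.0)
image_point = project_through_pinhole(scene_point, wall_distance=0.5)
print(image_point)
```

The farther away the object, the smaller its projection, which is why something as distant as the Sun fits comfortably on the wall of a small box.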
The first ghosts in the 19th century
Capturing phenomena from our reality is all very well, but creating our own images and projecting them was going to be a memorable spectacle.
The magic lantern was a type of projector originally made up of a candle, a mirror and a cylinder with a lens to concentrate the light. The images projected through it ranged from bucolic landscapes to specters and demons, giving rise to a genre known as phantasmagoria.
The use of smoke and other effects such as moving the projector itself made it possible to terrify audiences in the 19th century.
Electrical impulses: the first image on TV
In the 1920s, a grayscale portrait with 32 lines of resolution, shown at 5 frames per second, went down in history as the first television image.
The challenge was to capture the scene and, for the first time, transmit it from one place to another, as radio already did with sound. Before transmission, the scene had to be scanned in an orderly manner, and the changes in light converted into variations in electrical current. This principle of electromechanical image generation can be seen in this video.
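The idea of scanning a scene in order and turning it into a single varying signal can be sketched as follows. This is only an illustration, not a model of the original electromechanical equipment: the 32 lines and 5 frames per second come from the text, while the number of samples per line, the synthetic test frame and the function names are assumptions made for the example.

```python
def scan_frame(frame):
    """Scan a frame line by line, left to right, into one 1D signal."""
    signal = []
    for line in frame:            # lines are read top to bottom
        for brightness in line:   # each line is read left to right
            signal.append(brightness)  # brightness stands in for the current
    return signal


def rebuild_frame(signal, samples_per_line):
    """Receiver side: cut the signal back into lines, in the same order."""
    return [signal[i:i + samples_per_line]
            for i in range(0, len(signal), samples_per_line)]


LINES = 32              # lines per frame, as stated in the text
FRAMES_PER_SECOND = 5   # frames per second, as stated in the text
SAMPLES_PER_LINE = 8    # hypothetical value, chosen only for this sketch

# Synthetic frame: a diagonal gradient from dark (0.0) to bright (1.0).
frame = [[(row + col) / (LINES + SAMPLES_PER_LINE - 2)
          for col in range(SAMPLES_PER_LINE)]
         for row in range(LINES)]

signal = scan_frame(frame)
samples_per_second = len(signal) * FRAMES_PER_SECOND
received = rebuild_frame(signal, SAMPLES_PER_LINE)
print(len(signal), samples_per_second, received == frame)
```

Sender and receiver only need to agree on the scanning order and the line length; everything else travels as a single stream of values, just as sound does on the radio.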
A baby was the first digital image
In the middle of the 20th century, the first digital image arrived, captured by a scanner. At that time there was not even an internet open to the general public.
This device made it possible to capture the variations in intensity of a photograph and record them much more precisely by encoding them in individual cells, the pixels.
Once converted into a numerical matrix, the image is ready to be encoded, mixed, compressed, stored and studied using the tools provided by the digital revolution.
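What "converted into a numerical matrix" means can be shown with a toy example. The sketch below assumes brightness readings between 0.0 (black) and 1.0 (white) and a hypothetical 3×3 photograph; the number of levels and the function names are illustrative choices, not a description of the original scanner.

```python
def quantize(intensity, levels):
    """Map a brightness in [0.0, 1.0] to an integer code in 0..levels-1."""
    clamped = min(max(intensity, 0.0), 1.0)
    return min(int(clamped * levels), levels - 1)


def digitize(photo, levels=2):
    """Turn a grid of brightness readings into a matrix of pixel codes."""
    return [[quantize(value, levels) for value in row] for row in photo]


def negative(pixels, levels=2):
    """Once the image is numbers, editing it is plain arithmetic."""
    return [[levels - 1 - p for p in row] for row in pixels]


# Hypothetical brightness readings from a tiny 3x3 photograph.
photo = [
    [0.05, 0.40, 0.90],
    [0.30, 0.75, 0.60],
    [0.95, 0.10, 0.50],
]

pixels = digitize(photo, levels=2)   # two levels: each pixel is 0 or 1
print(pixels)                        # [[0, 0, 1], [0, 1, 1], [1, 0, 1]]
print(negative(pixels, levels=2))    # [[1, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Raising `levels` gives finer shades of gray at the cost of more data per pixel, which is the trade-off behind every later image format.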
Hallucinating with artificial intelligence
If we can recognize a face, it is because deep down all faces are similar or have common elements. There are positions in the photograph of a face where certain pixel values – such as those that define the lips, nose, etc. – are more likely than others.
Modeling a face by training on thousands of faces is one of the milestones of machine learning, a branch of artificial intelligence. What is even more interesting is manipulating that learned representation, enhancing some characteristics over others.
The results are aberrations such as the digital hallucinations of Google's DeepDream algorithm, or the faces of non-existent people (like the one shown at the beginning of the article), which are obtained by sampling new images from the generic face model. Now, indeed, the resulting images are true ghosts.
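The idea that some pixel values are more likely than others at each position, and that new images can be sampled from such a model, can be illustrated with a deliberately crude sketch. It treats every pixel independently and describes it with a mean and a spread, which is a drastic simplification and not the method used by the deep networks mentioned above; the 3×3 "faces", the function names and the random seed are all hypothetical.

```python
import random
from statistics import mean, pstdev


def fit_pixel_model(images):
    """Estimate, for every pixel position, the mean and spread of its values."""
    height, width = len(images[0]), len(images[0][0])
    model = []
    for r in range(height):
        row = []
        for c in range(width):
            values = [img[r][c] for img in images]
            row.append((mean(values), pstdev(values)))
        model.append(row)
    return model


def sample_image(model, rng):
    """Draw a new image by sampling each pixel from its own distribution."""
    return [[min(max(rng.gauss(mu, sigma), 0.0), 1.0) for mu, sigma in row]
            for row in model]


# Hypothetical training set: tiny 3x3 "faces" with dark eyes in the top
# corners and a dark mouth at the bottom center (0.0 = black, 1.0 = white).
faces = [
    [[0.1, 0.9, 0.1], [0.9, 0.8, 0.9], [0.9, 0.2, 0.9]],
    [[0.2, 0.8, 0.2], [0.8, 0.9, 0.8], [0.8, 0.1, 0.8]],
    [[0.0, 1.0, 0.0], [1.0, 0.7, 1.0], [1.0, 0.3, 1.0]],
]

model = fit_pixel_model(faces)
ghost = sample_image(model, random.Random(0))  # fixed seed for repeatability
for row in ghost:
    print([round(value, 2) for value in row])
```

The sampled image matches none of the training faces, yet it keeps dark eyes and a dark mouth in the expected places. Real generative models also capture how pixels depend on one another, which is what makes their ghosts look convincing.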
The disturbing resurrection of Salvador Dalí
Hallucination is a way of demonstrating that we can force and distort the latent representations of a model at will, rather like a supersurgeon who can remove a donor’s face and transplant it onto a recipient.
Deepfakes have many applications, some creative and some unethical. One that has been explored for a few years is recreating famous or historical figures.
The folk singer Lola Flores, recreated for an advertisement, was the first popular “resurrection”. Even more disturbing is the one made by the Salvador Dalí Museum in the city of Saint Petersburg (Florida). Using archival material and interviews, they recreated the facial expressions and voice of the famous surrealist painter. What does it feel like to see a historical figure “come back to life”?
Generating images with a text command
Capturing the representation of a face by exposing a model to thousands of faces is already a milestone. But if we can also capture the relationship between different kinds of input, such as images combined with text, we will have gone even further.
This type of multimodal representation can be achieved with AI architectures such as the DALL-E 2 system. The name DALL-E is a portmanteau of WALL-E, the famous Disney movie, and the painter Salvador Dalí. Social networks have already been flooded with its creations. What can it draw? Almost anything. The only limit is the imagination. For example, a medieval man sitting at a computer.
It is already possible to try, for free, DALL-E mini, a model inspired by DALL-E that gives a feel for how closely the textual input is associated with the visual result. The result varies depending on the words used and the order in which they appear in the text.
What positive and negative consequences might this kind of realistic AI image generation have for society? Will we get used to living surrounded by ghosts?
Arturo Fuentes Calle, Associate Professor, Polytechnic University of Catalonia – BarcelonaTech