We live in the age of hypervisibility. The problem with visibility is that 19th-century phenomenology made it clear that seeing is nothing close to an objective recording of external reality, but rather a subjective experience grounded in the observer's body (Crary 1988, 3–4). The contemporary development of imaging technology, coupled with today's understanding of vision, necessitates not only going beyond the visual but ultimately denying it.
Removing the visual may sound exaggerated, but it is necessary to escape the contemporary hierarchies, power structures, class associations, and layers of meaning induced by daily imagery. We must escape the visual to recalibrate our relationship to vision and build a new visual culture. As Hitchcock demonstrated in the famous shower stabbing scene of his 1960 film Psycho, the most intense experience comes from what we do not see – from what unfolds in our heads. So, what is there to see?
“In Defense of the Poor Image,” the essay by the German filmmaker, artist, and writer Hito Steyerl, will serve as the starting point for this argument. Steyerl’s artworks and writing explore media, technology, and the global economy of images. She further investigates the relationship between visibility and invisibility through media (for example, in How Not to Be Seen: A Fucking Didactic Educational .MOV File, 2013) and discusses how (digital) images travel around the globe via the systems we have built.
In her 2009 essay, she creates a framework for the digital economy of low-quality images, which she refers to as “poor” images. These images are the antithesis of the glossy, high-resolution, bourgeois image. “Poor” images are low resolution, “compressed, reproduced, ripped, remixed, as well as copied and pasted into other channels of distribution” (Steyerl 2009). For her, they comprise the proletarian version of images. Only such images make it through slow internet connections to exist outside the gate-kept, highly polished environments of cinema and art galleries. High-res versus low-res: the value of the latter is defined by the speed at which it travels.
She argues that the poor image is a response to the neoliberalization of culture. Experiences gate-kept by the cost of attendance (cinema blockbusters) or by artificial scarcity (fine art pieces) are smuggled out and shared within an underground culture. This subculture subverts today's highly commercialized image production by spreading secret phone recordings of movies and artificially scarce artworks, bringing them “onto the street.” Steyerl also draws a link to the film manifesto For an Imperfect Cinema by Julio García Espinosa, which argues that “imperfect” cinema is more “real” and thus closer to the audience: it blurs the distinction between consumer and producer, audience and author, and escapes the class reality conveyed through commercially produced movies. This “low-tech” and mass film production would become an “art for the people” (Espinosa 1969).
Steyerl then explains that fine art has, in response, also reappropriated this reality into its discourse and that many conceptual artists have deliberately produced low-quality images and artifacts.
Big Blur by MSCHF (2020), sold for over $40,000 through Sotheby's. © Sotheby's
She concludes, “The history of conceptual art describes this dematerialization of the art object first as a resistant move against the fetish value of visibility,” which will serve as our starting point. While she acknowledges that the poor image is already a step towards dematerialization, she still places it midway along that path. I propose to follow through – to go beyond the (post-)visual and deny it – to subvert the commodification of art through the denial of its visual presence.
Hito Steyerl has made an extensive contribution to the economics of digital images, offering a perspective on their value based on the concept of “quality.” However, a lot has changed in the past 15 years: Internet speeds have increased drastically, meme culture has exploded, and generative imagery has made quantum leaps. All this calls for a re-evaluation:
First, the average internet speed increased by more than ten times between 2009 and 2024 (Fox, n.d.). Back in 2009, we witnessed the early days of LTE/4G – the first time mobile video streaming was technically feasible at a consumer level. Today, alongside landline connections, mobile internet is supported by cell towers, public Wi-Fi, and satellites, sustaining constantly high mobile internet speeds. The argument that image quality is limited by bandwidth is obsolete (in the Western hemisphere).
Today’s primary image consumption platform, Instagram, has over 2 billion active users and uses about 400–600 megabytes per hour of scrolling (Grcar 2024). Further, these platforms have standardized image sizes on their networks to 1024 x 1024 pixels. The advantage of low resolution over high resolution has diminished to a contest between images with higher and lower compression rates (a plain-color image is smaller than an image of a gradient). There may be a marginal benefit in sharing images that compress more efficiently, resulting in fewer users exceeding their data plans, but this effect, if it exists, is likely negligible. In any case, today's internet speeds have reduced the aesthetics of the “poor” image to a personal choice of style.
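The compression claim above is easy to check empirically. Below is a minimal sketch, assuming the Pillow imaging library, that encodes a flat-color square and a gradient of the same dimensions as JPEGs and compares the resulting byte counts; the exact numbers depend on the encoder and quality settings.

```python
# Minimal sketch (assumes Pillow): compare the encoded size of a flat-color
# image with that of a gradient at the platform-standard 1024 x 1024 pixels.
import io
from PIL import Image

SIZE = 1024

def jpeg_bytes(img: Image.Image, quality: int = 80) -> int:
    """Encode an image to JPEG in memory and return its size in bytes."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return buf.getbuffer().nbytes

# Plain color: every pixel is identical, so the encoder has almost nothing to store.
flat = Image.new("RGB", (SIZE, SIZE), (200, 30, 30))

# Horizontal gradient: pixel values change across the image, so more data survives compression.
gradient = Image.new("RGB", (SIZE, SIZE))
gradient.putdata([(x * 255 // SIZE, 0, 0) for y in range(SIZE) for x in range(SIZE)])

print("flat color:", jpeg_bytes(flat), "bytes")
print("gradient:  ", jpeg_bytes(gradient), "bytes")
```

Whatever the exact byte counts, the gap is tiny relative to today's bandwidths – which is the point.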
The culmination of this style is the so-called “Deep Fried Meme,” which has been documented since about 2015 (“Deep Fried Memes” 2017). Steyerl in many ways predicted what was confirmed on November 21st, 2016, when Urban Dictionary user memegod420 defined Deep Fried Memes as: “When a high-quality meme is screen shot, reposted and re-filtered so many times over that it has a yellowish, low quality resolution and looks like it was deep fried.”
Original Deep Fried Meme March 24th, 2015 by paparoachscarsmp3 via Tumblr
In parallel with increasing internet speeds, we have also witnessed the introduction of a new generation of visual generative machine-learning algorithms. Services such as Midjourney, OpenAI's DALL-E, and Google's Imagen allow users to create images directly from a text prompt. These generative models for image synthesis are Libraries of Babel for visual culture, trained on as many pictures as these companies could accumulate. Here is what these models can do (a short code sketch of both directions follows the examples below):
1. Transformation from low quality to high quality - “Upscaling”
Left: GTA 4 PC: 4K Max Settings Free Roam Gameplay | RTX 3080 (No Mods) by YouTube channel ll-MK-ll. Right: GTA IV Real Life Photorealistic Graphics | Runway Gen 3 AI Remaster by YouTube channel KaiKaiTheAIGuy
Left: THE GREATEST OFFROAD TRIP!!! GTA Online by YouTube channel hellaflush. Right: GTA 5 but with Generative AI Real Life Graphics! by YouTube channel DubStepZz
2. Transformation from high quality to low quality - “Downscaling”
PlayStation Aesthetic by TikTok user braeden.c99
The additional text prompt for the above transformation with Midjourney reads: “pixelated glitch art of close-up of {subject}, ps1 playstation psx gamecube game radioactive dreams screencapture, bryce 3d --style ddCHhSumaNyOrL1Q”, which was shared by Reddit user Wear_A_Damn_Helmet on r/midjourney.
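To make both directions concrete, here is a minimal sketch using the open-source diffusers library. The model checkpoints, the input file name, and the prompts are assumptions for illustration; Midjourney and Runway run behind closed APIs, so this is not their actual pipeline.

```python
# Minimal sketch of both directions with open-source models (assumed checkpoints,
# hypothetical input file); not the pipeline behind Midjourney or Runway.
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline, StableDiffusionImg2ImgPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1. Low quality -> high quality ("upscaling").
upscaler = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler"
).to(device)
low_res = Image.open("gta_lowres_frame.png").convert("RGB")  # hypothetical screenshot
sharp = upscaler(prompt="photorealistic city street", image=low_res).images[0]

# 2. High quality -> low quality ("downscaling" as a deliberate style).
stylizer = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
).to(device)
degraded = stylizer(
    prompt="pixelated glitch art, ps1 playstation psx screencapture",  # echoes the prompt quoted above
    image=sharp,
    strength=0.6,  # how far the model may drift from the input image
).images[0]

sharp.save("upscaled.png")
degraded.save("ps1_style.png")
```

Either direction is a single call; the “quality” of the result becomes just another parameter.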
Image quality is no longer definitive, as we can now digitally alter it to our liking with ease. Currently, this requires above-average processing power. However, according to Moore's Law, it is only a matter of time before all our devices, and eventually augmented reality contact lenses, are equipped with this functionality natively and in real time. TikTok and Snapchat have already successfully demonstrated how to run many of these machine-learning models on entry-level smartphones.
From theorists such as Susan Sontag and Roland Barthes, we understand that all photographs – whether manipulated or not – represent interpretations of reality rather than objective truths. Their creation and reception shape their meaning, and, as Steyerl suggests, their meaning and class association are additionally defined through their material quality. The “rich” image represents the bourgeoisie, neo-capitalism, and the high commercialization of culture. The “poor” image is the image of the proletariat: squeezed through slow data pipelines, viewed on small, low-quality digital screens, pasted around, cropped, and finally shredded through multiple layers of compression.
To contextualize Steyerl’s argumentation: these layers of manipulation, as demonstrated in the category of the deep-fried meme, are layers of stacked hermeneutics (Gadamer 1960) and gain a richness of their own that the originals do not have. This is generally how meme culture works, and the post-ironic stage has gone meta to the point where “no” meaning has become “the” meaning – a meme that obscures its meaning.
Image by Overall-Estate1349 posted to r/RedditForGrownups
The "Surreal Era" and "Post-Irony" memes are the most visually degraded but heaviest in meaning. So what comes after total degradation? The negative? The invisible?
Steyerl argues that the low-quality image is the image of the proletariat. Today's technology, as we have seen – especially text-to-image and style-transfer ML models – can translate between arbitrary image features. These models upscale from low-quality to high-quality images and vice versa. Interestingly, the images produced by these models are represented within a so-called latent space, a multi-dimensional abstraction of all potential images within that model. This opens the possibility (as demonstrated in countless tech demos and artworks) of “walking” along vectors in this space, which appears as a morph between different images. Here is an example of how handwritten digits are ordered in such a latent space projected onto a two-dimensional image:
Image from the Article Latent Space Visualization by Julien Despois for HackerNoon.
When we look at faces, this multi-dimensional space becomes clearer. One can interpolate between two faces:
Image from the Article Latent Space Visualization by Julien Despois for HackerNoon.
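A minimal sketch of such a “walk”, assuming an already trained autoencoder-style model that exposes an encode and a decode function (both names are placeholders): encode two images, interpolate linearly between their latent vectors, and decode each intermediate point back into an image.

```python
# Sketch of a latent-space "walk" between two images; `model`, encode(), and
# decode() are placeholders for a trained autoencoder-like network.
import numpy as np

def latent_walk(model, image_a, image_b, steps: int = 8):
    """Return a sequence of frames morphing from image_a to image_b."""
    z_a = model.encode(image_a)  # map each image to its latent vector
    z_b = model.encode(image_b)
    frames = []
    for t in np.linspace(0.0, 1.0, steps):
        z_t = (1.0 - t) * z_a + t * z_b   # straight line between the two points
        frames.append(model.decode(z_t))  # project the point back into image space
    return frames
```

Frames along the line tend to remain plausible images because the decoder has only learned to produce outputs resembling its training data – which is what makes the space feel like a map of all possible pictures.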
At this moment, it is worth mentioning that the word “meme” was coined by Richard Dawkins in his 1976 book The Selfish Gene. There, he explains how evolution “walks” along potential versions of DNA, “discovering” viable living outcomes out of infinite possibilities. Neo-Darwinists like him should be approached with caution regarding their influence on humanism, but there is a clear analogy between latent space and DNA as a space of possibilities: both produce viable outcomes for their respective environments.
Therefore, if we expect these machine-learning models to be capable of representing all possible images, we will be able to morph fully from any state to any other. High quality, low quality, deep-fried, oil painting, sci-fi – if one can name it, one can create it.
At this point, the class representation of “rich” and “poor” images collapses. GTA's high-quality rendering (see above) no longer requires a high-end computer. We simply run GTA at its lowest quality and overlay an upscaler to make it appear as if it were the output of a high-end machine.
The class structure induced by image quality collapses. On the one hand, this is good because it removes the divide: anyone can now easily create blockbuster-style movies. On the other hand, it renders visual culture meaningless. Imagine the skill of curating images on mood boards being reduced to choosing the correct image transformer.
Steyerl argues at another point that low-quality images from news outlets convey urgency, or reality “in the making.” In the future, these upscalers will run in real time, augmenting such footage to blockbuster quality – and stripping it of that undertone entirely.
If our visual experience and consumption of images do undergo such a radical transformation, we come full circle, so to speak, with the theory of photography and the phenomenological theory of seeing: we will have to acknowledge that images do not show anything.
In Camera Lucida, Barthes discusses how photographs can be read from the “left” or the “right,” arguing that no photograph has ever convinced anyone or punished a lie. He suggests that shocking photographs are structurally insignificant; they carry no value or knowledge – the more direct the trauma, the more difficult the connotation (Barthes 1982).
Therefore, the meaning of the image remains primarily up to the reader. I would take this as an argument against Steyerl: an image's positioning has never been about visual “quality” but about the audience's (sub)conscious choice of what to believe.
One thing we can take away from modern theory is that everything is far more complicated and interconnected. An image by itself does not represent anything, and if we want to increase representation, we have to move toward a more relational mode of perception. In particular, we need to reevaluate the idea that “a picture is worth a thousand words,” which no longer makes sense. The relationship between the description of an image and the image it describes has changed. Which is the signifier, and which is the symbol? From an “AI” perspective, the prompt (the description) points to the image; it is not the image that points to the description (prompt). This is interesting because, despite the prevalence of visual communication, ideas and visuals are still essentially conveyed through words. We speak in images only at the second level of interpretation.
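In code, the asymmetry is easy to see: a text-to-image pipeline takes the description as its input and returns a picture, while going the other way only yields a lossy caption, never the original prompt. A minimal sketch, assuming the open-source diffusers and transformers libraries and the public checkpoints named below (illustrative choices, not the services discussed above):

```python
# Sketch of the two directions between words and images; the checkpoints are
# illustrative open-source choices, not the services mentioned in the text.
import torch
from diffusers import StableDiffusionPipeline
from transformers import BlipForConditionalGeneration, BlipProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Description -> image: the prompt is the input that produces the picture.
prompt = "a crowded street photographed on a cheap phone"
generator = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
picture = generator(prompt).images[0]

# Image -> description: only a rough caption can be recovered, never the prompt itself.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base").to(device)
inputs = processor(images=picture, return_tensors="pt").to(device)
caption = processor.decode(captioner.generate(**inputs)[0], skip_special_tokens=True)

print("prompt: ", prompt)
print("caption:", caption)  # a paraphrase at best
```

The description generates the image; the image can only be described again, in someone (or something) else's words.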
Soon, visual culture will be rendered and created in real time. If we care to save its value, we need to go beyond the image in the classical sense.
What would that imply? It reminds me of a website I developed during my early design studies. The site did not have any images on the landing page; an image could only mislead the audience in choosing a project to look at. So instead, each project was introduced by a short description of what it would be – similar to the standard practice of writing image descriptions for visually impaired users. I simply set that as the default.
Website by Vinzenz Aubry (2015), omitting preview images that cannot represent interactive projects.
Our visual culture is becoming: it is in constant flux, and technological advances will make it adapt in real time. So, I see two possibilities:
Philosophers from Heraclitus to Dewey, Gadamer, and Deleuze have told us that perception is in flux. So why have we held on to the belief in static images?
Right now, the only way to escape commercialized culture is to deny the image, become invisible, and shift to other senses. The visual is too loaded to be of any use. When was the last time one smelled or touched an ad? For now, this is more of a thought experiment: our society is based on images, so only small experiments can succeed in denying the visual.
When everything is generated in real time, we will lose the collective experience of our visual culture – think of it as visual filter bubbles. The big question is who will hold power over this fluid reality of media. The 2024 U.S. presidential election has shown that an oligarch like Elon Musk has no problem adjusting the social media algorithms of the platforms he controls to favor the visibility of conservative imagery, which directly shapes the visual reality we consume and ultimately manifest. To return to Steyerl and her film "How Not to Be Seen": as long as capitalism reigns over our visual culture, the only way out is for images to become invisible – invisibility as a privilege, a sign of quality. Only from these shadows can we prepare a real media of the flux.
Crary, Jonathan. 1988. “Techniques of the Observer.” October 45:3–35. https://doi.org/10.2307/779041.
Steyerl, Hito. 2009. “In Defense of the Poor Image.” e-flux Journal, no. 10. https://www.e-flux.com/journal/10/61362/in-defense-of-the-poor-image/.
Espinosa, Julio García. 1969. “For an Imperfect Cinema.” Translated by Julianne Burton. https://www.ejumpcut.org/archive/onlinessays/JC20folder/ImperfectCinema.html.
Fox, Will. n.d. “Global Average Internet Speed, 1990-2050.” Accessed December 12, 2024. https://www.futuretimeline.net/data-trends/2050-future-internet-speed-predictions.htm.
Grcar, Anja. 2024. “How Much Data Does Social Media Use (TikTok, Instagram, YouTube, Meta, X).” Race Communications. October 11, 2024. https://race.com/blog/how-much-data-does-social-media-use-tiktok-instagram-youtube-meta-x/.
“Deep Fried Memes.” 2017. Know Your Meme. February 16, 2017. https://knowyourmeme.com/memes/deep-fried-memes.
Gadamer, Hans-Georg. 1960. Truth and Method. 2nd rev. ed. New York: Continuum.
Barthes, Roland. 1982. Camera Lucida: Reflections on Photography. New York: Hill and Wang.