
Lost in Compression: Models of Authorship in Generative AI
NICOLAS MALEVÉ
Aarhus University, DENMARK
Abstract
Recent developments of image generators have introduced a new point of contention in the already contested field of artificial intelligence: the ownership of images. In 2023 Getty Images sued the company Stability AI, accusing it of illegally appropriating photographs for the purpose of training its models. Analysing image generators and stock agencies as probabilistic systems, this text argues that their significant difference lies in their model of appropriation. Where Stability AI proceeds through direct appropriation, the stock agency proceeds through contractual appropriation using its dominant position in the market. The article discusses Getty Images’ release of an image generator trained on its own image collection and critically reflects on the stock agency’s attempt to insert its contractual engine in the core of the generative AI technology, making copyright the regulating principle of the relations of ownership of the system and the principle that constrains the range of images it can produce.
Keywords
Copyright, generative AI, appropriation, Stable Diffusion, Getty Images, stock photography
Introduction
The machine learning revolution in computer vision of the last decade produced major breakthroughs in object detection and image understanding. These breakthroughs elicited a vast corpus of criticism centred on the problem of representational bias. Activists, artists and academics from various backgrounds revealed how detection and classification algorithms reproduce and exacerbate discrimination, racism and sexism (Buolamwini, 2017; Fabbrizzi et al., 2022; Paglen and Crawford, 2019). Whilst these problems have not gone away in generative AI (Offert and Phan, 2022), the recent development of image generators such as DALL-E and Stable Diffusion has introduced another point of contention. If the main problem of detection algorithms was framed as one of representation, the issue of ownership in image generators has extended the range of concerns about the consequences of AI development.
Now, the question “who owns what?” haunts the field of machine vision as much as the question “who is represented and how?”.
Far from being speculative, this question has provoked a series of attempts to protect AI‑generated work as well as an onslaught of litigation against AI systems for copyright infringement and data theft in image and text generation. There are two distinct areas where copyright law plays a role in the current legal disputes: the copyright of the generated works and the ownership of the images used in the training sets. The former has been debated in court, and jurisprudence tends to deny authorship to machines. As the jurisprudence gradually develops, mainly in the United States, judges restate again and again that only humans can be legitimate copyright owners. AI researcher Steven Thaler's application is a case in point (Solomon, 2023). Thaler submitted the work ‘A Recent Entrance to Paradise’, an image of a train entering a tunnel in the countryside, for copyright registration. He declared that the work had been generated algorithmically and autonomously by software he had written, the Creativity Machine. The United States Copyright Office (USCO) refused his application on the grounds that the intervention of a human author in the process was a sine qua non condition for copyright protection of creative works (Millemann, 2022). In a majority of cases, the generated work is also manipulated by humans, which raises the difficult question of the division of creative labour between machines and humans. Judges go to great lengths to establish the dividing line between human and AI contributions, as comic book artist Kristina Kashtanova recently discovered (Wolfson, 2023). Kashtanova, author of the graphic novel Zarya of the Dawn, made extensive use of the image generation tool Midjourney in her work. The USCO only granted limited copyright protection to the novel, arguing that she did not exercise enough control over the generated images since Midjourney could not produce predictable results.
The USCO concluded that there was too much distance between her intention (as formalised in her prompts) and the machine generated output. The Copyright Office granted registration for the text and the arrangement of images but excluded the images from protection (Analla and Jonnavithula, 2023).
By contrast to the relative speed with which judges have denied authorship to machines, the question of the inclusion of creative works in machine vision training sets is only now starting to be tried in court, and a growing number of artists and creators are making their voices heard in the press (Castro, 2023), in public forums (Kanaan, 2018; Xiang, 2022) or even in legal forums such as the notice of inquiry on copyright and artificial intelligence issued by the USCO (US Copyright Office, 2023). With a staggering 15 billion synthetic images generated in a single year (Valyaeva, 2023), pressure is mounting on the courts to clarify the rules of the game. The question of training data ownership has emerged as a site of resistance, and the reasons to oppose AI through copyright are multiple. There is an economic argument (artists are being robbed) as well as an aesthetic one (AI is degrading creativity). Leading legal scholars such as Pamela Samuelson (2023) see the copyright battle as an issue that could halt the development of AI. Even if copyright is only a bump in the road in the long term, as Barack Obama recently proposed (Patel, 2023b), it might be the only bump that slows AI’s seemingly irrepressible progress.
In this text, I will concentrate on the question of property pertaining to training images to reflect on the models of ownership that increasingly define the ontology of the photograph in the world of computer vision and image generation. To do this, I will analyse the ambivalent relations between stock photography giant Getty Images and the technology of image generation. Getty, the owner of the copyright of millions of photographs, filed a complaint against Stability AI, the company that architected the financing of the generative AI ecosystem Stable Diffusion (Getty Images (US), Inc. v. Stability AI, Inc., 2023). In the complaint, the stock photo agency attempts to demonstrate that its catalogue has been systematically appropriated by Stability AI to make financial gains. In line with Getty Images’ decision to ban AI-generated content from its platform (Rose, 2022), the complaint contrasts the stock photo agency with image generators. Emphasising Getty’s quality control, the complaint differentiates the agency’s practices from those of Stable Diffusion and documents the various forms of misappropriation of its content as well as the ensuing degradation of its corporate image. The plaintiffs emphasise Getty’s treatment of the photograph as an object of careful curation that secures referentiality, and a coherent relation between image and metadata. In Getty’s collections, images are contractual objects subject to authorship and ownership. In contrast, Stability AI reduces images to mere data available for extraction that produces grotesque representations. The plaintiffs contrast Getty’s curatorial approach with Stability’s embrace of a model of compression and distortion that operates without any concern for legitimate ownership of images. Whilst the lawsuit presents a well-defined opposition, the evolution of Getty’s business model suggests otherwise. 
In parallel to the lawsuit, Getty Images announced the launch of its own image generation platform in partnership with NVIDIA (Getty Images, 2023). Here the stock agency valorises the potential of image generation and, rather than oppose the two approaches, tries to synthesise them. With its service of image generation, Getty Images attempts to reconcile its model of authorship with the new possibilities offered by image models.
By analysing these attempts to differentiate and reconcile technological models and models of property, this article explores the fraught relations of ownership and compression in the photographic elaboration of computer vision. It investigates the political implications of an ontology of the networked image centred on its model of ownership. In that perspective, discussing the politics of appropriation of stock photography sheds light on the politics of appropriation of AI systems. To do this requires engaging with the increasing importance of legal mediation in the definition of the generative image, and with its relevance for the evolution of future image factories as well as for AI more generally. The conflict between stock photography and image generators brings photography to the intersection of the trajectories of the visual, technology and law.
Getty Images vs Stability AI: Anatomy of a complaint
To introduce the complex relations Getty Images entertains with generative AI, I will begin with Getty’s attempt to stop Stability AI’s alleged misappropriation of its collection. In the world of generative AI, the constellation of companies and NGOs behind Stable Diffusion has attracted a large share of legal challenges. To establish the specific claims of Getty Images, I will analyse the demand for a jury trial against the AI software company Stability AI. Founded by former hedge fund manager Emad Mostaque, Stability AI owns the Dream Studio[1] and ClipDrop[2] platforms, which offer users access to the latest Stable Diffusion models under different pricing schemes. The jury demand is a highly instructive document in that it lays out the core arguments entitling the plaintiff to a jury trial. The document, filed at the US District Court of Delaware, shows an early attempt to juridically address the problem of appropriation at the heart of the AI industry and its conflict with the economic model of the stock photography industry.
The first paragraph of the document sets the tone of the complaint (Getty Images (US), Inc. v. Stability AI, Inc., 2023: 1):
This case arises from Stability AI’s brazen infringement of Getty Images’ intellectual property on a staggering scale. Upon information and belief, Stability AI has copied more than 12 million photographs from Getty Images’ collection, along with the associated captions and metadata, without permission from or compensation to Getty Images, as part of its efforts to build a competing business. As part of its unlawful scheme, Stability AI has removed or altered Getty Images’ copyright management information, provided false copyright management information, and infringed Getty Images’ famous trademarks.
As in many other recent court cases against generative systems (Andersen et al v. Stability AI Ltd. et al, 2023), the complaint stresses the importance of the input part of the process of generation. The dataset used to train image systems is the main site of the dispute. In this case, the target is the LAION-Aesthetics dataset, a collection of 600 million image-text pairs commissioned by Stability AI to train their diffusion model (Baio, 2022). For years, the datasets used in image generation were kept secret; things changed when Stability AI opted for an open-source model of development. As the datasets, the models and the code used by the company are available to anyone, the company becomes more vulnerable to critical inspection and legal offensives, as it cannot hide the resources it is using. Further, the composition of Stability AI’s datasets can now be exposed by custom-made search engines such as Mat Dryhurst and Holly Herndon’s haveibeentrained.[3]

Figure 1: Screen capture of the query “getty images” in the haveibeentrained search engine, November 2023.
The page shows images scraped from the site alongside images owned by Getty copied by internet users on other websites. There is little doubt that images from Getty have been included in LAION and used for training Stable Diffusion.
Datasets contain more than images. For instance, LAION-Aesthetics contains text-image pairs combined with other metrics. Such metrics include the likelihood of the presence of nudity or violence,[4] the likelihood of the presence of a watermark and an aesthetic score.[5] Curatorial decisions encompass the selection of images and the selection of criteria to describe images. In the case of LAION, no mention of authorship is included. Image descriptions proceed directly from the texts automatically extracted from the images’ alt tags (Baio, 2022). The process is a mix of automated scraping (Lavigne, 2023) and algorithmic evaluation.[6] In this process of scraping, no permission has been asked of the photographers or image-makers, nor of the website owners.[7]
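The structure of such a record, and the purely algorithmic character of the curation, can be made concrete with a short sketch. The field names and thresholds below are illustrative assumptions loosely modelled on LAION's published metrics, not the exact schema; note that nothing in the record identifies an author or a rights holder.

```python
# A minimal sketch of a LAION-style dataset row and its threshold-based
# "curation". Field names (caption, pwatermark, punsafe, aesthetic) are
# illustrative assumptions, not the exact LAION schema.
from dataclasses import dataclass

@dataclass
class ImageRecord:
    url: str           # where the image was scraped from
    caption: str       # text lifted from the page's alt attribute
    pwatermark: float  # estimated probability of a watermark
    punsafe: float     # estimated probability of unsafe content
    aesthetic: float   # score from an automated aesthetic predictor

def keep(record: ImageRecord) -> bool:
    """Curation reduced to thresholds: no human review, no authorship field."""
    return (record.aesthetic >= 5.0
            and record.pwatermark < 0.5
            and record.punsafe < 0.5)

rows = [
    ImageRecord("https://example.com/a.jpg", "bride holding a red rose", 0.1, 0.0, 6.2),
    ImageRecord("https://example.com/b.jpg", "stock photo with watermark", 0.9, 0.0, 5.5),
]
print([r.url for r in rows if keep(r)])  # only the first record survives
```

The second record is excluded not because anyone objected to its use, but because a classifier estimated a watermark was probably present; ownership enters the pipeline only as a statistical nuisance to be filtered out.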
This form of appropriation, the lawyers claim, is key to Stable Diffusion’s business model and allows it to unfairly compete with the commercial model of stock photography agencies. Stock photography agencies license the contents of their catalogue according to different forms of use and pricing schemes to corporate clients, broadcast media or news sites. With a history of large-scale acquisitions, giants such as Getty have acquired dominant shares of the market and are in a position to negotiate the rights of the photos they acquire at low value. As photography scholar Paul Frosh (2013) claimed, these image factories are contractual engines. In ‘Beyond the Image Bank: Digital Commercial Photography’, he describes the stock agency as a “Leviathan of the image” (Frosh, 2013: 131):
[…] a single, gigantic creature that encompasses and contains individual photographs, contractually appropriating their rights, purposes and capacities in a manner similar to the relationship proposed by Hobbes between human individuals and the state […]
From this perspective, the complaint document holds two models of appropriation in opposition. One is a model of contractual appropriation, the Getty model, where every transaction is formalised and financially compensated (however poorly). The other is a raw model of appropriation through scraping without financial compensation. In this opposition, the legal definition of copyright has a significant bearing. A relevant characteristic of copyright law concerns the intangible dimension of creative work. In theory, copyright protects the work of the human spirit. However, in copyright law, the relation between labour and creative work is hard to define, as protection is not granted according to the amount of effort that was put into the work,[8] but is still justified as the protection of an investment (to protect from those who want to “reap where they have not sown” (Rahmatian, 2013: 12)). Further, this investment is only protected when it manifests itself in a tangible form: copyright protects expressions, not mere ideas. In Getty’s contractual model, self-contained images are essential assets which are considered as the expressions of their authors. In the stock photography model of appropriation, copyright also supports its transactional model. A company like Getty doesn’t sell images directly; it sells conditional access to images. Getty buys the rights over an image and sells licences to its clients. Permission is the product, and the tangible character of the product is what legitimises the transaction. Here the photograph is intimately bound to the legal framework of ownership both in the definition of its object and its transactional dynamics.
By contrast, Stability AI orchestrates the development of Stable Diffusion, an open-source model, and commissions datasets such as LAION that are made available to developers and users free of charge. And Stability AI doesn’t sell individual images, either. However, as the plaintiffs insist (Getty Images (US), Inc. v. Stability AI, Inc., 2023: 3), it licenses access to its technical infrastructure to run the models through an outlet such as Dream Studio. In Dream Studio, users buy packages that allow them to spend credits on the generation of images.[9] There is no one-to-one correlation between credits and images. Credits are related to computational resources and only indirectly to visual output. The countless discussions in internet forums regarding the cost of image generation testify to the fact that individual images are mere moments in a process of calculation. Here expression does not matter as much as the underlying statistical tendencies that make up configurations of pixels. In short, one company licenses the use of individual images through its website. The other licenses the use of the model through its platform. Stability AI’s business model rests on the valorisation of a process where expressions are considered secondary (in theory, prompts can generate countless images). In opposition to expression, which relies on individual images, is compression. By compression, computer vision scientists mean the reduction of the dominant patterns to statistical features. The best example of compression is the trained model in which all potentially generated images are virtually contained. The model is not the literal equivalent of a database of individual images. According to Mostaque, it compresses the “knowledge of over 100 terabytes of images” (in Wiggers, 2022). It consists of a matrix from which new images can be generated. As such, Stable Diffusion’s model defies the categories of copyright law as it digests images in an intangible form. 
The problem for Getty is not just that Stability AI captures images without negotiation. It is also that it threatens the foundation on which property rights over images can be claimed: their tangibility.
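The scale of the compression at stake in Mostaque's claim can be gauged with a back-of-envelope calculation. The 100 terabyte figure is his own, quoted above; the checkpoint size is an assumption on my part, in the range typical of publicly released Stable Diffusion v1 models.

```python
# Rough arithmetic behind the "compression" claim. The 100 TB figure is
# Mostaque's; the ~4 GB checkpoint size is an assumption (typical of
# public Stable Diffusion v1 releases), so the ratio is only indicative.
training_bytes = 100 * 10**12  # ~100 terabytes of source images
model_bytes = 4 * 10**9        # ~4 GB trained checkpoint
ratio = training_bytes / model_bytes
print(f"compression ratio: {ratio:,.0f}:1")
```

On these assumptions the model is some four orders of magnitude smaller than its training data: whatever the checkpoint retains, it cannot be a literal archive of tangible images, which is precisely what unsettles copyright's categories.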
Origin or result? The vagaries of visual comparison
To prove the irreducible importance of tangible images in the process of generation, the plaintiffs have to demonstrate that the capabilities of Stable Diffusion are causally linked to the appropriation of their assets. To make that argument, the document exhibits a series of images showing that Getty’s images have a significant impact on the software’s outputs. The visual indictment begins with a screenshot of Getty’s website that underlines the professionalism and quality of the catalogue. At the centre of the screenshot, we see a close-up of a man’s hand placing a wedding ring on his future wife’s finger. In her other hand, the woman holds a red rose whose presence is explained in the caption: the couple are wed during a group Valentine’s Day wedding (Getty Images (US), Inc. v. Stability AI, Inc., 2023: 6). Ironically, the institution of the heterosexual couple is called upon to support a platform that rests on contractual transactions. The choice of an aesthetic celebration of the matrimonial contract, with the hands of the newlyweds in full display, is also a critique of generated images: hands are a visual feature that generative systems notably struggle to resolve.
As their indictment unfolds, the plaintiffs make use of visually striking comparisons. They juxtapose stock images with AI-generated outputs. AI images are carefully selected for their similarity with Getty’s products and for their grotesque character. For instance, one of these comparisons shows two images of soccer players side by side in an acrobatic pose (Getty Images (US), Inc. v. Stability AI, Inc., 2023: 18). In the stock photograph, the composition is well balanced, and the axes formed by the players’ arms and legs convey a sense of choreographic harmony. The players are engaged in a harsh competition, but they evoke two dancers executing a perfectly rehearsed ballet figure. By contrast, in the generated image, the players look like ill-conceived mannequins and the composition seems arbitrary. Both images feature a version of Getty’s logo. In the stock image, the watermark is visible and situated at the centre of the image. In the AI image, it is distorted and barely legible. For the plaintiffs, this juxtaposition establishes that Stable Diffusion’s images are damaging the brand by stamping Getty’s distorted logo over low-quality images.
This demonstration brings forth a new dilemma. If the lawyers are able to demonstrate a likeness between AI-generated images and Getty’s, they have trouble demonstrating that this likeness can be detrimental. Who would buy the “grotesque” algorithmic rendering for the cover of a sports magazine? Would ill-conceived mannequins really replace the spectacular ballet of professional soccer players on glossy magazine covers? On the other hand, if the plaintiffs show that Stable Diffusion is able to generate convincing images, the danger is that the judge might see Getty’s images as so banal and generic that they hardly deserve protection. If Stable Diffusion is only able to passively reproduce what it learned from the training data, it might suggest that the training data is itself already regular, predictable and lacking in singularity. A sense of anxiety transpires through these comparisons. Generated images are, at the same time, too close for comfort and too different to be considered passive copies. There is an uncanny resemblance that the document attempts to exorcise. Even more problematically, the plaintiffs can’t seem to decide whether Stable Diffusion is guilty of forgery (creating an image that can pass for an image from Getty) or of plagiarism (mechanically reproducing an image from the Getty collection whilst pretending it has been autonomously generated by Stable Diffusion). This confusion between origin and result is symptomatic of an understanding of the image that leaves no room for the collective source of all creations. As copyright considers creative work as an entity abstracted from the relations from which it emerges, it generates a reference anxiety that constantly troubles the relation between the alleged original and the alleged copy. Before delving deeper into the nuances of this reference anxiety, it is worth considering critically the other assumptions underlying the opposition drawn out in the complaint.
Image banks as probabilistic systems
First, I must examine the idea that the catalogue of stock agencies stands out from mass image production and that its quality makes it inherently different from it. One might venture, on the contrary, that the reason why stock imagery is important to generative AI is not because it is exclusive or authentic but because it is background and redundant. By repeating the tendencies present in the larger image space, it is nothing more than generic and this generic character is what matters to generative systems. As Frosh puts it, stock photographs are being looked through rather than looked at. They are the “background noise of consumer cultures” (Frosh, 2002: 191). And as computer vision experts insist, singular images have little influence on the learning process; only converging patterns get registered as tendencies (Newhauser, 2023). Getty’s images are more than just generic, they are quintessentially so. If there is an undeniable professionalism in stock imagery or at least a know-how, it is parasitic on the banal and the repetitive.
Further, we need to be critical towards a clear-cut opposition between synthetic images and photographs. Conflating machine learning imaging systems and probability obscures the fact that the relation between photography and probability has a long history.[10] Furthermore, we might arrive at different conclusions if we consider imaging systems, rather than concentrating our attention only on images themselves. As Frosh (2013: 136) already observed a decade ago:
The generic image’s parsimonious plurality of meanings is in effect an anticipated range of message outcomes. These outcomes are dictated by the overall mode of commercial photographic production, where photographers utilize the success of other images and image categories to gauge the possible interpretations of cultural intermediaries (stock photography agencies), who in turn estimate the possible interpretations of other cultural intermediaries (their clients: advertisers, marketers, etc.), who in turn anticipate the possible interpretations of consumers. The stock photograph is thus a deliberately probabilistic entity within an overall system of multiple calculation and anticipation.
If we follow Frosh, generative AI models are not the only probabilistic systems. The image factories of stock photography are probabilistic, too. Even if they use different modalities and material infrastructures, stock agencies and generative AI are both dedicated to making predictions on the basis of redundancies and differences, whether by gauging the value of a variation on a dominant cultural trope or the variability of a statistical pattern.
In general, the line of argument developed in the legal complaint draws on an idealised vision of stock photography (high level of quality, exceptional content) and ignores its machinic mode of operation. But the crucial difference doesn’t consist in the technology or the quality of image production. It lies in their structure of ownership. The originality of the image ontology of the stock agency is how it internalises the mechanism of copyright. Copyright law is not merely called to the rescue when the company is threatened. As I have already noted, copyright helps define the photographic object for the stock company and it supports its transactional dynamics. As we will see, this goes even deeper than a practicality to manage the flow of images. It is the mechanism through which the agency regulates its space of possibilities and devises its strategies.
To get to this, the opposition between image banks and generators needs to be contested even further. Thanks to Frosh’s foundational work, we can establish that image banks are probabilistic engines. But more than that, there is a likely scenario in which they become generative systems. For that, we need to take a step back and consider them beyond their current configuration. Estelle Blaschke’s (2009) research can help us see them in another light. Blaschke’s work shows how image archives, and later image banks, were conceived as search engines for existing images. However, based on her close reading of the material she unearthed from the archives of photography history, one can also see the signs of a prefiguration of search engines for images that did not yet exist. In that context, the way image bank pioneer Otto Bettmann described what he called a “pictorial futurama” is revealing (in Blaschke, 2009):
Ideally, picture retrieval should work in the following manner (and perhaps one day it will): The picture user in search of “Melba eating Melba toast” will teletype his coded request to an electronic picture research pool. After a few minutes’ wait, a Western Union messenger will arrive with a fat envelope containing pictures of Melba eating Melba toast, dry, buttered or with marmalade! Only a digit here and there has to be changed should the request happen to be for “Thomas Jefferson eating spaghetti” or a reproduction of Leonardo da Vinci’s “Mona Lisa” … This Pictorial Futurama is not offered facetiously. We are getting there … To help in such pursuits and to speed up the retrieval of pictures – the right pictures – The Bettmann Archive has developed a visual index.
Seen from today’s perspective, an utterance such as “Thomas Jefferson eating spaghetti” is not a query for a database of existing images. It sounds to our contemporary ears like a prompt for an image generator. Having photographers scour the world in quest of every potential image is just an impractical way of achieving this goal. What Bettmann tells us through his pictorial futurama is that his ultimate goal is to let users search the space of human imagination, not merely browse a catalogue of existing images. For one of the pioneers of image banks, their ultimate function is not to search but to elicit images into existence through written requests.
Embracing image generation
With these considerations in mind, can the image factory, probabilistic in nature, be seen as a proto-generative system? Can a fusion of image banks and generative systems be imagined? The system of ownership relying so crucially on the licensing of tangible objects seems an obstacle that cannot be circumvented. What would it take for this difference in structures of ownership to be accommodated? What would a probabilistic system with a contractual structure of ownership look like?
These questions are not mere speculations. Interestingly, Getty Images itself tried to answer them recently. In partnership with NVIDIA, Getty released a generative AI tool marketed as “commercially safe” (Getty Images, 2023). At first glance, Getty’s own image generator offers an interface that resembles other existing image generation software. However, instead of relying on billions of images scraped from the web, the generator is trained exclusively on Getty’s own content. The existence of this AI generator required the extension of Getty’s contractual strategy. Every photographer and creative who contributes is remunerated in proportion to their contribution to the datasets used to train the system as well as in proportion to the revenues generated by the AI. As Getty’s CEO Craig Peters puts it, this formula is a kind of “proxy for quality and quantity” (Patel, 2023a). Additionally, the company looked into models to automatically calculate the attribution ratio at the pixel level. This form of nano-ownership might represent a bleak prospect for photographers if the project is successful: their medium-term prospect would be to produce images for AI with a status comparable to that of the micro-workers who annotate images on Amazon Mechanical Turk. They would be turned into “functionaries of the camera” (Flusser, 2012) feeding the newly born generator, with gains amounting to an infinitesimal fraction of the global benefits.
In the image factory, copyright is a means to redistribute financial gains and ratify the division of labour. But the role of copyright in Getty’s image generator goes beyond a legal mechanism of appropriation; it defines the perspective of the system. The selling argument of the new image generation service is the quality of the generated images and, more importantly, a “worry-free” and “commercially safe” output. In this context, safety means protection from copyright-related lawsuits. To deliver on its promise of safety, Getty’s algorithm is trained to reproduce the regularities of a training set that has been purged of all the categories and representations that could cause trouble for its user. In other words, it is trained to avoid all the zones of the image space over which a legal entity may lay claim. As Peters explains in an interview with The Verge (Patel, 2023a):
If you have an image and it produces an image of a third-party brand or somebody of name and likeness like [Travis] Kelce or [Taylor] Swift, that’s a problem. But there’s much more nuanced problems in intellectual property, like showing an image of the Empire State Building. You could actually get sued for that. Tattoos are copyrighted. So, fireworks can actually be copyrighted. That smiley firework that shows up? Grucci Brothers actually own that copyright. So, there’s a lot of things that we baked in here to make sure that our customers could use this and be absolutely safe. And then we actually put our indemnification around that so that if there are any issues, which we’re confident there won’t be, we’ll stand behind that.
As I suggested above, every work protected by copyright inherits a troubled relation to reference. In comparison to Stable Diffusion’s position regarding reference, which ignores any potential tension, Getty’s is highly defensive. Getty relies on the mechanism of copyright to remove any ambiguity about the legal persona who is entitled to ownership over an image. Copyright establishes a reference between a bounded object, the image (as opposed to an intangible imaginary/image space), and a legal subject, the author. The complaint document already insisted on this point. It juxtaposes the legal acquisition of photographs with the scraping of photographs on the web. The legitimacy of the contractual acquisition of photographs relies on Getty’s supposed ability to guarantee that the acquired photographs are free of any infringement issues. Getty puts in a lot of effort (or says it does) to ban from its repertoire images that infringe the copyright of others (e.g., a photograph of a firework owned by Grucci Brothers), using the strict criteria specified by jurisdictions such as the United States. Essentially, authorship legitimates the stock photography enterprise and functions as the building block of its whole edifice.
On closer inspection, however, it can be said that authorship itself is an authorised form of scraping. A large critical corpus of work addressing the problem of authorship and copyright, ranging from Barthes’ (1977) and Foucault’s (2021) critiques of the author to commons-based activists (Lessig, 2004; Niederberger et al., 2021), Marxist legal scholars and sociologists (Edelman, 1979; Lazzarato, 2006) and free software advocates (Benkler, 2011; Stallman, 2002), has emphasised that nobody creates in isolation and that an authored work is always to some degree indebted to a collective. In that perspective, the attribution of a work as an exclusive privative expression is a means of tearing it from the cultural web of relations that made it possible. Authorship becomes a means to abstract images from their communal origin (Mugrefya and Snelting, 2022). The attribution of a work to a subject obscures the roots that tie it to a shared pre-individual milieu. Further, this logic of extraction from milieu to individual anticipates a series of transfers of property, as copyright’s mechanism is deeply transactional, especially in its Anglo-Saxon understanding. If individual property can be claimed over what is indebted to a collective process, it is also transferable to others. Images that originate from a complex set of cultural undercurrents are given titles of property, and these titles of property can then be traded. After being given form, author and property, images join the circuit of capital. In this sense, Getty has its own scrapers: the photographers and artists of its catalogue. As their work becomes their legal property, Getty is able to sell exclusive rights over their images to its clients.
Yet, as we have seen, this process of exclusion is never fully achieved; the reference is always more complex and the author as a legal entity never fully represses their communal inscription. Authorship always comes along with a reference anxiety and the frame of the photo never fully isolates it from the intangible ground from which it emerges. There is something beyond the individual that always “delays and belabours the finished version of” any work, as Cristina Rivera Garza (2020: 48) wrote. Frames never fully contain, never fully limit or circumscribe. They never fully stabilise. The incomplete insulation of the legal subject from its collective entanglements and the porosity of the frame always threaten to let cultural antagonisms and tensions infiltrate the company’s products. Therefore, the parallel processes of screening for potentially harmful subjects and identifying copyright infringements participate in the same logic.
Stock photography is a practice that reflects in a generic manner the cultural undercurrents it feeds off and attempts perhaps more than any other practice to detach its representations from them. Decontextualisation is its operative motto. Detachment is the precondition for circulation and trade. The detachment of the photograph from its communal origin is only a starting point. Its contents must be detachable, too. As Frosh (2013) remarked, pointing to the white background that was for a time typical of stock imagery, the added value of stock photos was the ease with which they could be decontextualised and reconfigured in further image composites. Images must be “free” from context. An image of workers on a construction site can be used as a background for an ad that sells insurance, a poster that announces investments in public infrastructure or a news article that discusses the raising of the retirement age. The hygienic style so characteristic of image banks testifies to this intense investment in clearing rights and cleaning representations. And if this process proves insufficient, the company provides indemnification to its clients should an image cause legal trouble. Safe images, trouble-free images, are images protected from the inherent dangers of an unstable reference, the never-fully-exorcised intangible dimension of the image, its shadow of formlessness. Getty protects itself from the reference and its subterranean social antagonism. However, for Getty, reference anxiety is more than a defensive strategy. It is integral to its business strategy and its aggressive conquest of the market. The company protects itself from the reference and sells this protection to others. Getty licenses images and along with them offers insurance against litigation. The agency provides uncapped indemnification should a copyright lawsuit arise because an image generated through Getty’s platform infringes someone’s copyright.
Further, it weaponises the threat of the reference against its competitors, such as Stable Diffusion. Used to attack competitors and motivate consumers, copyright is the stick and the carrot.
Conclusion
This analysis of the ambivalent attitude of Getty demonstrates that the staged opposition between photography and AI must be nuanced. In parallel to a categorical opposition, there are potential convergences. In fact, there is a possibility for traditional actors such as Getty to co-opt AI generation software. In this scenario, the central problem is not strictly technological. It pertains to the model of ownership to be built into these systems. In this perspective, the current complaints against AI systems should not be understood as a defence against them but as a step towards a takeover of the technology. This takeover would entail a profound transformation of generative AI’s current system of ownership whilst leaving its computational infrastructure intact. It would constitute an evolution of stock agencies rather than a rupture, as they are by nature probabilistic. In this scenario, copyright would be a pragmatic instrument for regulating the work of photographers turned into functionaries of the algorithm, not because their work is scraped automatically, but because it can be bargained as training data. For the stock agency, copyright would function as a weapon with which to attack competitors. For their clients, it would function as a threat the stock agency offers protection from. And fundamentally, copyright would be the arbiter of the regularities learned by the system, defining what can be shown and to what extent, what can be named and what can’t. An image space with anonymous celebrities, buildings without architects, blurry tattoos and no fireworks.
Deep down, we are confronted with image ontologies that have more in common than it first seems. At one level, everything seems to contrast them. For Stable Diffusion, an image doesn’t have an owner; it is up for grabs for those who have the means to download it. Its frame is a temporary container of statistical regularities which traverse a whole collection. For Getty, an image is an object of licensing and permissions, fixed in a contractual frame, tied to a juridical persona, the property of an individual author and valorised as a vector of financial transaction. In one case the image is considered as lacking any singularity; in the other, it is considered as lacking any collective attachment. But this opposition rests on a shared assumption. In one case the image belongs to no one, in the other it only belongs to an individual: both systems repress the communal ground of images. Both systems are products of appropriationism, albeit under different guises: one, Stable Diffusion, simply dispenses with the cultural environments it plunders; the other, Getty Images, invests the frame of the tangible work to achieve a closure from the cultural milieu it feeds off. The repression of the communal attachments of visual culture has consequences for both systems. While scraping provokes endless controversies about theft and copyright infringement, legal appropriation generates an endless reference anxiety. Therefore, if we critique the politics of appropriation of AI systems, we must be careful about the implications of grounding our criticism on the basis that it infringes photographers’ copyrights. Such critique might address the particular problems of a method of appropriation, scraping, but does not address the larger question of appropriation itself.
Finally, there is a silver lining to these legal controversies. Since their inception, generative systems have seemed bound to large-scale appropriation. However, for the reasons discussed above, the potential alliance of generative systems and legal forms of appropriation, such as Getty’s, demonstrates that new articulations between datasets and algorithms can be sought and experimented with. If a more powerful Leviathan of the image doesn’t sound particularly desirable, the confrontation of the stock image factory and the generative platform at least opens up a speculative perspective: can the disassociation of the computational capabilities and the ownership model of generative AI suggest other associations? Can their computational infrastructures be inflected by more generous, imaginative and respectful models of communal attachments? If generative AI is not deterministically tied to careless large-scale scraping, maybe the contractual engines of the image factory are not the only alternative.
References
Analla, T. (2023) ‘Zarya of the Dawn: How AI is Changing the Landscape of Copyright Protection’, A. Jonnavithula (ed.) Jolt Digest, 6 March. Available at: https://jolt.law.harvard.edu/digest/zarya-of-the-dawn-how-ai-is-changing-the-landscape-of-copyright-protection (Accessed: 29 April 2024).
Andersen et al v. Stability AI Ltd. et al (2023) No. 3:2023cv00201 (US District Court for the Northern District of California). Available at: https://dockets.justia.com/docket/california/candce/3:2023cv00201/407208 (Accessed: 29 April 2024).
Baio, A. (2022) ‘Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion’s Image Generator’, Waxy, 30 August. Available at: https://waxy.org/2022/08/exploring-12-million-of-the-images-used-to-train-stable-diffusions-image-generator/ (Accessed: 29 April 2024).
Barthes, R. (1977) ‘The Death of the Author’, in S. Heath (ed.) Image, Music, Text. London: Fontana, pp.142-148.
Benkler, Y. (2011) ‘The Unselfish Gene’, Harvard Business Review, July–August. Watertown: Harvard Business School Publishing Corporation.
Blaschke, E. (2009) ‘From the Picture Archive to the Image Bank’, Études photographiques 24(November). Available at: http://journals.openedition.org/etudesphotographiques/3435 (Accessed: 21 December 2023).
Buolamwini, J. A. (2017) Gender Shades: Intersectional Phenotypic and Demographic Evaluation of Face Datasets and Gender Classifiers (Thesis), Massachusetts Institute of Technology.
Castro, D. (2023) ‘Critics of Generative AI Are Worrying About the Wrong IP Issues’, Center for Data and Innovation, 20 March. Available at: https://datainnovation.org/2023/03/critics-of-generative-ai-are-worrying-about-the-wrong-ip-issues/ (Accessed: 29 April 2024).
Edelman, B. (1979) Ownership of the Image: Elements for a Marxist Theory of Law. London: Routledge and Kegan Paul.
Fabbrizzi, S., S. Papadopoulos, E. Ntoutsi and I. Kompatsiaris (2022) ‘A Survey on Bias in Visual Datasets’, arXiv, 23 June. Available at: http://arxiv.org/abs/2107.07919 (Accessed: 4 September 2022).
Flusser, V. (2012) Towards a Philosophy of Photography. London: Reaktion.
Foucault, M. (2021) ‘What is an Author?’, in D. F. Bouchard (ed.) Language, Counter-Memory, Practice. Ithaca, NY: Cornell University Press, pp.113-138.
Frosh, P. (2002) ‘Rhetorics of the Overlooked: On the communicative modes of stock advertising images’, Journal of Consumer Culture 2(2): 171-196.
Frosh, P. (2013) ‘Beyond the Image Bank: Digital Commercial Photography’, in M. Lister (ed.) The Photographic Image in Digital Culture. London: Routledge, pp.131-148.
Getty Images (2023) ‘Getty Images Launches Commercially Safe Generative AI Offering’, Getty Images, 25 September. Available at: https://newsroom.gettyimages.com/en/getty-images/getty-images-launches-commercially-safe-generative-ai-offering (Accessed: 29 April 2024).
Getty Images (US), Inc. v. Stability AI, Inc. (2023), No. 1:23-cv-00135-UNA (US District Court for the District of Delaware). Available at: https://docs.justia.com/cases/federal/district-courts/delaware/dedce/1:2023cv00135/81407/1 (Accessed: 29 April 2024).
Kanaan, Y. (2018) ‘A Future Dystopia: Will Human Artists Ever be Replaced by Artificial Intelligence?’, artmejo, 28 October. Available at: https://artmejo.com/a-future-dystopia-will-human-artists-ever-be-replaced-by-artificial-intelligence/ (Accessed: 29 April 2024).
Lavigne, S. (2023) ‘Scrapism: A Manifesto’, Critical AI 1(1–2). DOI:10.1215/2834703X-10734046.
Lazzarato, M. (2006) ‘Construction of Cultural Labour Market’, European Institute for Progressive Cultural Policies, November. Available at: https://eipcp.net/policies/cci/lazzarato/en.html (Accessed: 29 April 2024).
Lessig, L. (2004) Free Culture: The Nature and Future of Creativity. New York: Penguin Books.
Millemann, A. (2022) ‘Is Machine-Made Art Copyrightable?’, The IP Law Blog, 3 March. Available at: https://www.theiplawblog.com/2022/03/articles/copyright-law/is-machine-made-art-copyrightable/ (Accessed: 29 April 2024).
Mugrefya, É. and F. Snelting (2022) ‘Collectively Setting Conditions for Re-Use’, March, 3 January. Available at: https://march.international/collectively-setting-conditions-for-re-use/ (Accessed: 29 April 2024).
Newhauser, M. (2023) ‘The two models fueling generative AI products: Transformers and diffusion models’, GPTech, 13 July. Available at: https://www.gptechblog.com/generative-ai-models-transformers-diffusion-models/ (Accessed: 29 April 2024).
Niederberger, S., C. Sollfrank, and F. Stalder (eds.) (2021) Aesthetics of the Commons. Zurich: Diaphanes.
Offert, F. and T. Phan (2022) ‘A Sign That Spells: DALL-E 2, Invisual Images and The Racial Politics of Feature Space’, arXiv. Available at: https://arxiv.org/abs/2211.06323 (Accessed: 29 April 2024).
Origin Ltd. (2013) ‘Sweat of the brow doctrine’, Technology and IP Law Glossary, 14 June. Available at: https://www.ipglossary.com/glossary/sweat-of-the-brow-doctrine/ (Accessed: 29 April 2024).
Paglen, T. and K. Crawford (2019) ‘Excavating AI: the Politics of Images in Machine Learning Training Sets’, Excavating AI, 19 September. Available at: https://www.excavating.ai/ (Accessed: 29 April 2024).
Palpatine, V. (2023) ‘Dream studio is way too expensive.’, Reddit. Available at: https://www.reddit.com/r/StableDiffusion/comments/wumjpo/dream_studio_is_way_too_expensive/ (Accessed: 29 April 2024).
Patel, N. (2023a) ‘Getty Images CEO Craig Peters has a plan to defend photography from AI’, The Verge, 10 May. Available at: https://www.theverge.com/23903700/getty-images-craig-peters-generative-ai-images-disinformation-payment (Accessed: 29 April 2024).
Patel, N. (2023b) ‘Barack Obama on AI, free speech, and the future of the internet’, The Verge, 11 July. Available at: https://www.theverge.com/23948871/barack-obama-ai-regulation-free-speech-first-amendment-decoder-interview (Accessed: 29 April 2024).
Rahmatian, A. (2013) ‘Originality in UK Copyright Law: The Old “Skill and Labour” Doctrine Under Pressure’, IIC – International Review of Intellectual Property and Competition Law 44(1): 4-34. DOI:10.1007/s40319-012-0003-4.
Rivera Garza, C. (2020) The Restless Dead: Necrowriting and Disappropriation. La Vergne: Vanderbilt University Press.
Rose, J. (2022) ‘Shutterstock is Removing AI-Generated Images’, Motherboard, 19 December. Available at: https://www.vice.com/en/article/v7vzpj/shutterstock-is-removing-ai-generated-images (Accessed: 29 April 2024).
Samuelson, P. (2023) ‘Legal Challenges to Generative AI, Part II: Deliberating on inconclusive AI-generated policy questions.’, Communications of the ACM 66(11): 16-19.
Sekula, A. (1986) ‘The Body and the Archive’, October 39: 3-64. DOI:10.2307/778312.
Solomon, T. (2023) ‘US Judge Rules AI-Generated Art Not Protected by Copyright Law’, ARTnews, 21 August. Available at: https://www.artnews.com/art-news/news/us-judge-rules-ai-generated-art-is-not-protected-by-copyright-law-1234677410/ (Accessed: 29 April 2024).
Stallman, R. M. (2002) Free Software, Free Society: Selected Essays of Richard M. Stallman (ed. J. Gay). Boston: GNU Press.
US Copyright Office (2023) ‘Artificial Intelligence Study’, copyright.gov, 20 August. Available at: https://copyright.gov/policy/artificial-intelligence/ (Accessed: 29 April 2024).
Valyaeva, A. (2023) ‘AI Has Already Created As Many Images As Photographers Have Taken in 150 Years. Statistics for 2023’, Everypixel Journal, 15 August. Available at: https://journal.everypixel.com/ai-image-statistics (Accessed: 29 April 2024).
Wiggers, K. (2022) ‘This startup is setting a DALL-E 2-like AI free, consequences be damned’, TechCrunch, 12 August. Available at: https://techcrunch.com/2022/08/12/a-startup-wants-to-democratize-the-tech-behind-dall-e-2-consequences-be-damned/ (Accessed: 29 April 2024).
Wolfson, S. (2023) ‘Zarya of the Dawn: US Copyright Office Affirms Limits on Copyright of AI Outputs’, Creative Commons, 27 February. Available at: https://creativecommons.org/2023/02/27/zarya-of-the-dawn-us-copyright-office-affirms-limits-on-copyright-of-ai-outputs/ (Accessed: 29 April 2024).
Xiang, C. (2022) ‘Artists Are Revolting Against AI Art on ArtStation’, Motherboard, 14 December. Available at: https://www.vice.com/en/article/ake9me/artists-are-revolt-against-ai-art-on-artstation (Accessed: 29 April 2024).
Notes
[1] https://beta.dreamstudio.ai/generate
[3] See https://haveibeentrained.com
[4] In which case the image is considered not safe for work, or NSFW.
[5] An integer between 1 and 10 that corresponds to the degree to which annotators found the image aesthetically pleasing.
[6] It also includes indirect human labelling as the algorithms that evaluate the aesthetic score of the images have themselves been trained on datasets labelled by humans. One could therefore say that the process includes a form of human annotation “by proxy”.
[7] The lawyers insist that the terms and conditions of Getty’s websites expressly prohibit downloading their contents and their use for data mining without permission and that this prohibition includes photos and videos as well as metadata, descriptions and keywords (Getty Images (US), Inc. v. Stability AI, Inc., 2023: 11).
[8] The so-called “sweat of the brow” doctrine granted protection to a creative work if the author was able to demonstrate effort. This doctrine is now deprecated, as the author has to prove a form of originality in the creation of the work for which they seek protection (Origin Ltd, 2013).
[9] At the time the complaint was filed, an image was worth on average US$0.024 (Palpatine, 2023).
[10] See in particular Allan Sekula’s analysis of Galton and Quetelet in ‘The Body and the Archive’ (Sekula, 1986).
Nicolas Malevé is an artist, programmer and data activist living in Aarhus, Denmark. He is currently a post-doctoral researcher at the School of Communication and Culture at Aarhus University, supported by a grant from the Novo Nordisk Foundation (NNF21OC0068539).
Email: maleven@cc.au.dk

