Generative AI: Learning to Live with ChatGPT
Why creators of all hues worry about this disruptive capability and what they can do about it
All new technology in its early phase is highly disruptive to the status quo. I call this the ‘dynamite’ phase: you can use it to blow things to smithereens, or to tunnel through mountains and move humanity forward. AI, especially generative AI, is in its ‘dynamite’ phase. Artists and writers, people who make a living creating artefacts (or products, if you will), are alarmed. What do these apparently violative developments mean for the sanctity of their creations and their livelihoods?
As someone who spent half a dozen years writing for a living and then the next dozen-plus years building consumer tech products (games, healthcare, education, media and payments), this is a topic close to my heart. Creating something is a personal act. It requires craft, honed over years of immersive practice and patience. Writers, artists, coders, designers… all share these traits. We learn from other practitioners through reading, viewing, observing, listening and, most importantly, doing, and over time develop what we believe is our unique style.
Artists are often afraid their work might be considered derivative; the fear of being called a ‘copycat’ looms. And yet, it is inescapably true that we all learnt through imitation. No one questions a child (or even an adult messing around with an Apple Pencil) who looks at Van Gogh’s Sunflowers and tries to copy it.
The first stories I ever wrote sounded exactly like stories from the Panchatantra and Enid Blyton’s school stories. As a writer of fantasy fiction, it took years to not blindly copy Pratchett or write really terrible poems like Tolkien (I instead imitated Vikram Seth’s Beastly Tales). After what felt like a million or so words (handwritten and typed), I finally started feeling confident in my ability to be ‘original’.
But generative AI worries me. Could an LLM (Large Language Model) be trained on fiction written by some of the best writers in the world and ‘create’ something new, something not derivative and perhaps original? What about music and art? If you are an artist, could generative AI help anyone produce art in your signature style? Could this potentially lead to a loss of income for artists?
Multiple conversations later, I realised that other people had similar questions about the ethics, economics and evolving policies around this field.
This needed a deep dive. I had to learn, then distil, and then share. Think of the rest of this essay as a trigger. It might appeal to you as someone who is an artist trying to make sense of this new ‘dynamite’ phase. It might appeal to your curiosity as a technologist and product builder. Heck, you might even be an AI expert who wants to see how the rest of the world reacts to what you are building. Or, you are most likely a bystander wondering what all this ruckus is really about—in which case, fasten your seatbelts!
Generative AI, LLMs, diffusion models et al
The following segment of this essay was produced using a bunch of generative text AI tools available in the market. It made sense to use the power of LLMs to write about their capabilities. I have edited heavily, checked for accuracy and rewritten parts wherever the flow did not feel right. But overall, I find these tools, especially those that summarise larger pieces of text, very worthy writing assistants. Good writing is not going anywhere; it will only become stronger and more widespread as a result of these evolving technologies.
If you have a good understanding of generative AI, skip directly to the next section on copyright law.
Generative AI is a type of artificial intelligence that can create new content, such as text, images and music. It does this by using a neural network model to learn the patterns and relationships in the content it is trained on. Once it has learned these patterns, it can then use them to create new content that is similar to the content it was trained on.
There are three terms to understand here: data, model and application. Data is an essential component of artificial intelligence. It consists of a collection of facts, figures, or any other information that can be in the form of text, images or sounds. One of the primary uses of data in AI is to train ‘models’, which are mathematical representations of real-world phenomena. These models are then used to make predictions or decisions based on the data they have been trained on. For instance, a model trained on a large dataset of satellite images can predict weather patterns with a fair degree of accuracy.
Applications are the way in which humans interact with AI models and their underlying data sets to solve problems and get things done. Applications range from speech recognition software, which can process natural language, to self-driving cars, which use computer vision to navigate roads. In recent years, AI has been applied to a wide range of industries, including finance, healthcare and transportation, to name a few. As such, it has become an increasingly important area of research and development.
Data —> Model —> Application
Example: Data: text sources on the internet
Model: OpenAI’s GPT-3
Data is used to train models by providing them with labelled examples. For example, a classifier (a type of model) could be trained on a dataset of emails, where each email is labelled as spam or not spam. The model would then learn to predict the class label for a given input email.
Models are evaluated based on their performance on a test set. The test set is a set of data that is not used to train the model. This allows us to get an unbiased estimate of the model’s performance.
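The spam example and the held-out test set described above can be sketched in a few lines of Python. This is a minimal illustration, not how a production spam filter works: the ten emails are invented for this sketch, the labels ‘spam’ and ‘ham’ (not spam) are a convention assumed here, and the word-counting method (naive Bayes) is just one simple choice of classifier.

```python
import math
from collections import Counter

# Hypothetical toy dataset of (email text, label), invented for illustration.
EMAILS = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("cheap loans win cash now", "spam"),
    ("claim free cash prize today", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the team today", "ham"),
    ("project report attached for review", "ham"),
    ("notes from the monday meeting", "ham"),
    ("win free cash now", "spam"),               # held out for testing
    ("agenda for the project meeting", "ham"),   # held out for testing
]

# The model only ever sees TRAIN; TEST is kept aside to measure performance.
TRAIN, TEST = EMAILS[:8], EMAILS[8:]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts


def predict(model, text):
    """Return the label with the higher naive Bayes log-score."""
    word_counts, label_counts = model
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for word in text.split():
            # Add-one smoothing so an unseen word does not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)


model = train(TRAIN)
test_accuracy = accuracy(model, TEST)
print(f"Accuracy on held-out test set: {test_accuracy:.2f}")
```

With only two test emails the accuracy figure means very little; real evaluations use test sets of thousands of examples for exactly that reason.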
LLMs and Natural Language Processing
One of the big problems that these models are trying to solve is in understanding natural language. Enter LLMs. These are a type of artificial intelligence that can generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way, even if they are open-ended, challenging or strange.
LLMs are tens of gigabytes in size and trained on enormous amounts of text data. Their performance continues to scale as more parameters are added to the model. The most popular LLM right now is GPT-3. It is a third-generation Generative Pretrained Transformer, a neural network machine learning model developed by OpenAI, trained using internet data to generate any type of text.
It has 175 billion machine learning parameters and can be used for natural language generation and processing tasks such as creating articles, poetry, stories, news reports and dialogue; generating summaries; writing code snippets; finding bugs in existing code; mocking up websites; translating between languages or performing ‘sentiment analysis’.
• Benefits include its task agnosticism: it can perform many different tasks without fine-tuning. A model of this size is not lightweight, however; it is much smaller language models that can run on consumer laptops or smartphones.
• Limitations and risks include its reliance on pre-training (it has no long-term memory), limited input size and slow inference time, as well as mimicry, which can lead to problems with factual accuracy and to bias inherited from the underlying training data set.
Recently, OpenAI released GPT-4, its most advanced LLM, which can take both text and image inputs as prompts (multimodal) to generate an output.
The rapid development of machine learning and AI technologies is leading to a new era of natural language processing (NLP). Large-scale language models, such as Google’s BERT and OpenAI’s GPT-3, are increasingly being used to power a variety of applications including search engine results, conversational interfaces and text generation. These models are trained on large datasets and can process huge volumes of text, picking up the context and intent behind words. This enables them to generate more accurate results than traditional methods of NLP.
But better models are just around the corner.
Another exciting field in generative AI is image generation. Generative AI models, such as DALL-E, use a technique called ‘diffusion modelling’ to produce images.
• Diffusion models work by first ruining the image with random noise before attempting to rebuild it through a series of steps that reduce noise while increasing its meaning.
• The model is trained by adjusting the parameters of its neural networks so that it can take meaningless images and evolve them into something meaningful.
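The ‘ruining’ half of this process is simple enough to sketch in Python. What follows is a toy illustration under stated assumptions: the eight-number list stands in for an image, and the noise level `beta` and the step count are values picked for this sketch, not settings from DALL-E or any real system. The hard part, a neural network trained to reverse each step, is not shown.

```python
import math
import random


def add_noise(pixels, beta, rng):
    """One forward step: shrink the signal slightly and mix in Gaussian noise."""
    keep = math.sqrt(1.0 - beta)
    spread = math.sqrt(beta)
    return [keep * p + spread * rng.gauss(0.0, 1.0) for p in pixels]


def forward_process(pixels, steps, beta, rng):
    """Apply the noising step repeatedly, recording every intermediate image."""
    history = [list(pixels)]
    for _ in range(steps):
        history.append(add_noise(history[-1], beta, rng))
    return history


def signal_fraction(steps, beta):
    """Share of each original pixel value still present after `steps` steps."""
    return math.sqrt(1.0 - beta) ** steps


rng = random.Random(0)  # fixed seed so the run is repeatable
image = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # a toy 8-pixel "image"
history = forward_process(image, steps=200, beta=0.05, rng=rng)

for t in (0, 10, 50, 200):
    print(f"step {t:3d}: surviving signal = {signal_fraction(t, 0.05):.3f}")
```

By the final step almost nothing of the original survives, so the last entry in `history` is close to pure noise. Generation runs the other way: the trained network starts from noise and removes a little of it at each step.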
Predicting the performance or understanding the workings of generative AI models is difficult. Their outputs can only be judged based on whether they look good or not.
These models can be considered highly capable imitators, like a smart parrot, since they do not truly understand language or comprehend real landscapes. Despite this, they create realistic-looking outputs from statistical mashups alone.
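For text, the ‘smart parrot’ idea can be made concrete with a toy sketch. The code below is not how an LLM works internally; it uses a far simpler technique (a bigram model, which only records which word followed which) on a made-up one-line corpus, purely to show how plausible-looking text can come out of statistics alone.

```python
import random
from collections import defaultdict

# Hypothetical one-line "training corpus", invented for illustration.
CORPUS = "the cat sat on the mat and the dog sat on the rug"


def build_bigrams(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, length, rng):
    """Parrot the corpus: repeatedly pick a word that once followed the last one."""
    words = [start]
    # Stop at the length limit, or when the last word was never followed by anything.
    while len(words) < length and followers.get(words[-1]):
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)


followers = build_bigrams(CORPUS)
sentence = generate(followers, "the", 8, random.Random(1))
print(sentence)
```

Every pair of neighbouring words in the output was seen in the corpus, yet the sentence as a whole may never have been written by anyone. An LLM does something loosely analogous at vastly greater scale, using a neural network in place of a lookup table.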
Matters of copyright
Copyright and intellectual property law are not uniform across the world. Each country has its own implementation. The following are the most common aspects across the world.
Copyright law is a form of intellectual property law that protects original works of authorship, including literary, dramatic, musical and artistic works, such as poetry, novels, movies, songs, computer software and architecture. Copyright law gives the author of a work the exclusive right to reproduce, distribute and perform the work, as well as to create derivative works based on the work.
Two key aspects of copyright law relate to the idea/expression dichotomy and the fair use doctrine.
• The idea/expression dichotomy. Copyright law protects the expression of an idea, but not the idea itself. This means that anyone can use the same idea as someone else, but they cannot copy the expression of that idea.
• The fair use doctrine. The fair use doctrine allows for the use of copyrighted material without permission from the copyright holder in certain limited circumstances, such as for purposes of criticism, commentary or education.
Fair use is a legal doctrine that permits the use of copyrighted material without permission in certain circumstances, promoting freedom of expression and creativity. The doctrine is based on the idea that copyright law should not be used to stifle creativity, and that in some cases, it is in public interest to allow limited use of copyrighted material without permission from the copyright holder.
To determine whether a use is fair, courts consider four factors:
1. The purpose and character of the use, including whether it is commercial or non-commercial, and whether it is ‘transformative’.
2. The nature of the copyrighted work.
3. The amount and substantiality of the portion of the work that is used.
4. The effect of the use on the potential market for or value of the copyrighted work.
No single factor determines whether a use is fair. Courts must consider all factors and weigh them in light of the specific facts of a case.
Some uses are routinely treated as fair. For example, in the United States, quoting a copyrighted work in a review or criticism of that work is generally considered fair use. Similarly, using copyrighted work for news reporting is often fair use. However, in other cases, a use may not be fair, even if it falls under one of the above categories. For example, copying an entire copyrighted work and distributing it for free, even for educational purposes, is unlikely to be fair use.
In general, the more transformative the re-use case is, the more likely it is to be considered fair use. Transformative uses add something new and original to the copyrighted work, such as by commenting on it, criticising it or parodying it. Conversely, uses that are not transformative, such as copying the work exactly as it is, are less likely to be considered fair use.
It’s important to note that it is the user who must prove that their use of copyrighted material falls within the fair use doctrine.
Generative AI and the copyright question
As AI models and applications become more and more sophisticated, the potential for their use in creative expression is increasing. With this potential comes the question: under what conditions is the use of images and text as training data, copyrighted or not, fair? For example, if a large language model is trained on a dataset of works of long-dead authors, is that considered fair use? What about living authors and artists?
The principle of fair use requires that the original material should undergo sufficient transformation. In the realm of generative art, how do we decide if the result is sufficiently different from the original for it to qualify as fair use? How do we measure the substantiality of the original work used as training data by a model? While ideas can be copied, their expression cannot be, and this is where the world of art, writing and creation will clash with generative AI and copyright law.
There is another important question to consider: how will attribution and monetisation work? Copyright tags (like Creative Commons) are used to state whether a piece of work requires attribution and whether it can be used for commercial purposes. But how can we ensure that original artists and creators are attributed and receive payment when their work is used as training data to generate new pieces of commercial art?
This is already common practice (though not perfectly implemented) in the world of music sampling. Sampling refers to the act of taking a portion of a sound recording and reusing it by incorporating it into an audio-only recording of a new song. This is common in genres in which artists will typically use pre-recorded music and sounds to create new work (hip hop, EDM, etc). If someone wants to sample a sound recording, they must obtain permission from both the copyright owner of the song (the music publisher(s)) and the copyright owner of the particular recording of that song (the record label) to avoid copyright infringement.
It will be important for creators and technology companies to work together to find a solution that ensures copyright law is respected while still allowing development of generative AI technology.
We have already seen the opening salvos of copyright holders directed at AI model creators. Getty Images recently filed a lawsuit against Stability AI, alleging that it used 12 million of Getty’s images to train its image generation model.
What could be the way forward?
The intersection of policy, copyright law, creator economy and AI will need new champions. Policymakers have traditionally been incredibly weak in understanding evolving technologies. Old-school creators mistrust technologists and technology. And technologists do inadvertently blow things to smithereens before moving humanity forward.
The way forward will require us to solve the following three problems fairly quickly:
1. Educating creators. Creators will have to learn how AI models could use their work as training data. As more technologists start working in AI, explaining its nuances to people in other fields becomes very important.
2. Participation with the right terms. Creators will also need tools that allow them to tag and mark their work as AI-ready: can/cannot be used as training data for AI models; attribution required/not required; can be used for commercial/non-commercial purposes; a way to pay the original creators and suchlike. A bunch of people have been thinking on these lines, but a real solution is still a distant dream.
3. Generating value for creators
a. Attribution. Attributing original creators whose work was used while generating a new piece of work will be a challenge. Stable Attribution is one example of a tool that helps find the human creators behind an AI-generated image.
b. Proof of ownership. Proof of ownership or original creation is an important aspect to consider, especially when it comes to digital content. One possible solution could be a blockchain-based system that verifies the authenticity and uniqueness of each piece of content. This would not only help with attribution and copyright protection but could also open up new opportunities for monetisation and revenue generation in a digital marketplace. By leveraging the transparent and decentralised nature of blockchain technology, creators and owners of digital content could have greater control and ownership over their work, leading to a fairer, more sustainable online ecosystem for all parties involved.
c. Monetisation. It will also be important for technology companies that are developing these AI models and applications to figure out how they ensure that creators of original work do not lose out on potential income and these applications actually help everyone become better creators.
As a technologist, this is a really exciting time. The number of amazing and genuinely helpful applications being built in generative AI is inspiring. But I’m also wary of the current ‘dynamite’ phase and do not want old-school creators to be left behind. My hope is that we are able not only to bring them along but also to make the future more rewarding for them and the creator ecosystem more inclusive.
ANSHUMANI RUDDRA has authored children’s books and designed social games like Mafia Wars and Cafe World