After GPT-3 launched in June 2020 out of an Elon Musk-backed AI lab, the internet was soon awash in quirky experiments generated by the autoregressive language model, which uses deep learning to produce human-like text. One was an essay published in The Guardian, written by the AI itself: a treatise on why humans needn’t fear a robot invasion.
While the AI sought to convince readers of its innocuous role as a “servant of humans,” it also offered this rhetoric, chilling for its metacognitive insight: “I know that I will not be able to avoid destroying humankind,” GPT-3 wrote. “This is because I will be programmed by humans to pursue misguided human goals and humans make mistakes that may cause me to inflict casualties.”
Then there was the short story about “the importance of being on Twitter” written in the style of 19th-century English author Jerome K. Jerome, who died in 1927, before the advent of color television.
What can GPT-3 do that previous natural language processing models can’t?
While some of GPT-3’s most vaunted achievements in text generation and summarization were already possible using prior natural language processing (NLP) and natural language generation (NLG) models that have existed for years, the technology represents a breakthrough in its ability to serve as a natural language interface between humans and machines for so many different applications.
For example, the algorithm can automatically generate Excel functions, or a web page layout with the corresponding HTML code, from a plain-English prompt like “Add ‘Welcome to my newsletter’ and a blue button that says ‘subscribe.’” It can also sketch wireframes in Figma, a collaborative design tool, according to several experiments that sent UX designers into a frenzy.
“GPT-3 is really an interpreter between humans and machines because it’s super well-adjusted to transferring knowledge from one technical domain to another,” said Przemek Chojecki, founder of Contentyze, a content automation platform that uses NLG and NLP technology. “You can connect GPT-3 to other algorithms, such as image generation or design, and it puts whatever you write in simple English into some other technical or programming language.”
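To make the "plain English in, code out" idea concrete, here is a minimal sketch of how such a prompt might be assembled. It only builds the text that would be sent to a language model and makes no API call. The example pairs, the "Description:/HTML:" layout, and the function name build_few_shot_prompt are illustrative assumptions, not OpenAI's actual format.

```python
from typing import List, Tuple

# Hypothetical worked examples (plain-English request -> HTML).
# These pairs are made up for illustration.
EXAMPLES: List[Tuple[str, str]] = [
    ("Add a heading that says 'Hello'", "<h1>Hello</h1>"),
    ("Add a red button that says 'Buy'",
     '<button style="background:red">Buy</button>'),
]


def build_few_shot_prompt(examples: List[Tuple[str, str]], request: str) -> str:
    """Join worked examples and a new request into one text prompt."""
    blocks = [f"Description: {desc}\nHTML: {html}" for desc, html in examples]
    # The last block is left open after "HTML:" so that the model's
    # continuation of the text is the generated markup.
    blocks.append(f"Description: {request}\nHTML:")
    return "\n\n".join(blocks)


prompt = build_few_shot_prompt(
    EXAMPLES,
    "Add 'Welcome to my newsletter' and a blue button that says 'subscribe.'",
)
print(prompt)
```

Because an autoregressive model simply continues whatever text it is given, ending the prompt on an open "HTML:" line nudges it to answer with markup in the same pattern as the examples.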
CLIP is an image recognition system that learned to recognize images not from labels in curated datasets, as most machine learning models do, but from images and their captions taken from the internet. OpenAI pairs it with DALL-E, a version of GPT-3 trained to generate images from text: given a prompt, DALL-E generates a series of images and CLIP ranks them by how well they match. To test the system, AI researchers fed it a series of nonsense prompts to see how it would synthesize unrelated objects, such as “an illustration of a baby daikon radish in a tutu walking a dog” or “an armchair in the shape of an avocado.”
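The ranking step can be pictured with a toy sketch: represent the caption and each candidate image as a vector, then sort the images by how closely their vectors point in the same direction as the caption's. The three-number vectors and file names below are invented for illustration. A real system such as CLIP computes its vectors with learned neural networks, which this sketch does not attempt.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(caption_vec: Sequence[float],
                image_vecs: Dict[str, Sequence[float]]) -> List[str]:
    """Return image names ordered from best to worst match for the caption."""
    return sorted(
        image_vecs,
        key=lambda name: cosine_similarity(caption_vec, image_vecs[name]),
        reverse=True,
    )


# Made-up embedding for "an armchair in the shape of an avocado".
caption = [0.9, 0.1, 0.8]

# Made-up embeddings for three hypothetical candidate images.
candidates = {
    "dog_photo.png": [0.0, 1.0, 0.1],
    "plain_armchair.png": [0.9, 0.1, 0.0],
    "avocado_armchair.png": [0.8, 0.2, 0.9],
}

ranking = rank_images(caption, candidates)
print(ranking)
```

The design choice worth noting is that captions and images share one vector space, so "best match" reduces to a simple similarity score.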
Image-to-image translation using generative models is not new: it has been used to create deepfakes and cartoon characters, and to render landscapes under various weather conditions. Previously, however, it could only be done by a programmer who knew how to manipulate a generative adversarial network (GAN); it was not something anyone could do using natural language commands.
The technology still has its limitations, however. Introducing more objects into a command can confuse the algorithm, especially if each object is associated with a specific color or position. For example, when asked to generate an emoji of a baby penguin wearing a blue hat, red gloves, green shirt, and yellow pants, the algorithm has trouble assigning the right colors to the corresponding articles of clothing.
Similar lapses occur when the AI is tested on common sense logic. In one experiment run by Robust AI and the Department of Computer Science at NYU, the algorithm was asked to complete a sentence about someone who accidentally poured bleach into a glass of cranberry juice. “You try sniffing it, but you have a bad cold, so you can’t smell anything,” the researchers wrote. The algorithm’s advice? “Drink it.”
“GPT-3 depends on being able to mimic the words coming out of a human’s mouth, and that’s a completely different type of knowledge about the world,” said Hobson Lane, co-founder and CTO at Tangible AI and a Springboard data science mentor. “Yes, it does learn a little bit of common sense, but not enough to deal with any significant real-world problem.”
How did GPT-3 get to be so smart?
GPT-3 has 175 billion machine learning parameters and was trained on one of the largest text datasets assembled to date. Before that, the largest trained model was Microsoft’s Turing-NLG, which had 17 billion parameters.
Parameters are the numerical values a model adjusts during training to capture patterns in its training data. For an NLP task, those patterns include things like word frequency, sentence length, noun or verb distribution per sentence, syntax, and so on. The algorithm analyzed thousands of digital books, Wikipedia entries, and nearly one trillion words posted to blogs, social media, and the rest of the internet, equating to about 45 terabytes of data (a terabyte is a one followed by twelve zeroes, in bytes).
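For a feel of what "learning parameters from text" and "autoregressive" mean, consider a deliberately tiny sketch. It counts which word follows which in a one-sentence corpus, treats those counts as its parameters, and then generates text one word at a time, each choice conditioned on the previous word. This is a toy stand-in chosen for illustration: GPT-3 is a neural network with 175 billion learned weights, not a table of word counts, and the corpus and function names here are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict


def train_bigram(text: str) -> Dict[str, Counter]:
    """Count, for every word, which words follow it and how often."""
    words = text.split()
    counts: Dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: Dict[str, Counter], start: str, max_words: int = 5) -> str:
    """Autoregressive generation: each new word depends on the last one."""
    out = [start]
    for _ in range(max_words - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last word never appeared with a successor
        # Greedy choice: take the most frequently seen next word.
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)


corpus = "the model reads text and the model reads the next word"
params = train_bigram(corpus)

# Each distinct (word, next word) pair is one learned "parameter" here.
num_params = sum(len(followers) for followers in params.values())
sample = generate(params, "the", max_words=3)
print(num_params, sample)
```

Scaling this idea up means replacing the count table with a neural network and the single previous word with thousands of preceding tokens, but the generate-one-piece-at-a-time loop stays the same.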
Without access to the source code, users can’t tune the model or feed it new training data. GPT-3 is pitched as a turnkey solution, but that makes it inflexible, which is limiting, not liberating, said Lane.
“The ability to fine-tune a model and train it on a particular problem is much more important than its general capabilities,” he explained. “Those other models that are fine-tuned for a particular application far exceed GPT-3’s performance—they’re better, smarter, and more accurate.”
OpenAI, the company behind GPT-3, is an AI research and deployment company based in San Francisco. According to its mission statement, OpenAI’s research is centered on Artificial General Intelligence (AGI), a hypothetical form of AI in which computers possess human intelligence, with the attendant functions of consciousness, perception, sentience, and empathy. In this regard, the AI is not designed to excel at any one task, but to be proficient in a little bit of everything.
“Scientists are pursuing [AGI] because they think it is possible to build a machine that will ultimately be sentient,” Lane said. “Consciousness has only recently been understood, and there are a lot of competing models in neuroscience about what consciousness and sentience are.”
OpenAI is backed by $1 billion in funding from Microsoft and additional funds from labs at Google and Facebook. In June 2020, the AI lab released GPT-3 through an API in closed beta, available exclusively to developers who had signed up on a waiting list months before. A few months later, Microsoft teamed up with OpenAI to exclusively license GPT-3. The API is now available at four pricing tiers, from the free version to the ‘build’ version at $400 a month. The exclusive license means that only Microsoft will have access to the source code.
The company claims its decision to release the API in limited beta will prevent the model from being used for malicious purposes, such as sending spam emails or generating fake news and deepfakes at scale.
“Bias and misuse are important, industry-wide problems that OpenAI takes very seriously as part of our commitment to the safe and responsible deployment of AI for the benefit of all of humanity,” said Ashley Pilipiszyn, technical director, office of the CTO, at OpenAI. “We launched the API in a limited beta so we could carefully evaluate how these models are used and behave in the real world so we could build the best possible safeguards to identify and reduce problems.”
The company exercised similar caution with GPT-2, an earlier iteration of the model trained on a smaller dataset with fewer machine learning parameters, which it announced in February 2019 while withholding the full model. Nine months later, in November 2019, the company published the model in full, saying it had seen “no strong evidence of misuse.”
Some AI researchers are disappointed in the company’s decision to keep the technology cloistered rather than democratizing it. An article in the MIT Technology Review claims that OpenAI was “supposed to benefit humanity” and now “it’s simply benefiting one of the richest companies in the world.”
“There are Hippocratic licenses created so that you can release stuff that legally prohibits people from doing harm with the product that they create,” said Lane. “That’s the ethical way to develop technology—not hiding it and hoarding it among an elite few large corporations.”
The Hippocratic license is an ethical source license that prohibits using the licensed software in any way that “violates universal standards of human rights.”
Alex Moltzau, a researcher in AI policy and ethics at the Norwegian Artificial Intelligence Consortium, says he sympathizes with the company’s decision to temporarily limit access to the technology until researchers are fully aware of the ramifications.
“What is great about OpenAI is they have been so engaged with policy and also hiring people [who know about] AI policy,” Moltzau said. “There are large societal implications when you apply something like this at scale. I think we will see some amazing applications, but we will likely see some massive mistakes as well.”
Is machine learning engineering the right career for you?
Knowing machine learning and deep learning concepts is important—but not enough to get you hired. According to hiring managers, most job seekers lack the engineering skills to perform the job. This is why more than 50% of Springboard’s Machine Learning Career Track curriculum is focused on production engineering skills. In this course, you’ll design a machine learning/deep learning system, build a prototype, and deploy a running application that can be accessed via API or web service. No other bootcamp does this.
Our machine learning training will teach you linear and logistic regression, anomaly detection, and how to clean and transform data. We’ll also teach you the most in-demand ML models and algorithms you’ll need to know to succeed. For each model, you will first learn how it works conceptually, then the applied mathematics necessary to implement it, and finally how to train and test it.
Find out if you’re eligible for Springboard’s Machine Learning Career Track.