OpenAI Releases Update of GPT-2 Unsupervised Language Model
- By Becky Nagel
- November 5, 2019
San Francisco, Calif.-based OpenAI on Tuesday released GPT-2 (1.5B), the "final model" release of this version of its popular large-scale unsupervised language model.
According to OpenAI, its GPT project "generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering and summarization -- all without task-specific training."
GPT-2 was trained by the non-profit organization "simply to predict the next word in 40GB of Internet text." However, when OpenAI first announced the model in February 2019, there were some caveats:
"Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with."
The GPT-2 language model has 1.5 billion parameters and was trained on a dataset consisting of 8 million Web pages. According to OpenAI, because of the broad range and types of text the dataset covers, the model has some interesting capabilities, including the ability to perform question answering and reading comprehension without task-specific training data, and to generate conditional text samples "of unprecedented quality."
"The model is chameleon-like -- it adapts to the style and content of the conditioning text," OpenAI wrote about the project. "This allows the user to generate realistic and coherent continuations about a topic of their choosing."
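For readers who want a feel for what "predict the next word" and "conditioning text" mean in practice, the idea can be sketched with a toy example. This is not OpenAI's code and bears no resemblance to GPT-2's actual architecture, which is a large neural network; it is a minimal word-pair counter, with function names and a sample corpus invented here purely for illustration, that continues a prompt by repeatedly picking the word that most often followed the previous one in its training text.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def continue_text(counts, prompt, max_words=5):
    """Greedily extend the prompt with the most frequent next word."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never seen with a successor
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical toy corpus; GPT-2 was trained on roughly 40GB of Internet text.
corpus = "the cat sat on the mat . the cat ate the fish . the cat sat down ."
counts = train_bigram(corpus)
print(continue_text(counts, "the", max_words=2))  # prints: the cat sat
```

The continuation depends entirely on the prompt and on the text the counter has seen, which is the same basic relationship, at a vastly smaller scale, that OpenAI describes between GPT-2 and its conditioning text.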
The 1.5B version of GPT-2 released Tuesday is the largest in the series, and OpenAI is also making code and model weights available "to facilitate detection of outputs of GPT-2 models."
A paper detailing the release can be found here.
An independent tutorial on working with the code can be found here.