OpenAI Releases Update of GPT-2 Unsupervised Language Model

San Francisco, Calif.-based OpenAI on Tuesday released GPT-2 (1.5B), the "final model" release of this version of its popular large-scale unsupervised language model.

According to OpenAI, its GPT project "generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering and summarization -- all without task-specific training."

This new GPT-2 was trained by the non-profit organization "simply to predict the next word in 40GB of Internet text." However, there were some caveats:

"Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with."

The GPT-2 language model has 1.5 billion parameters and is trained on a dataset consisting of 8 million Web pages. According to OpenAI, because of the broad range and types of text it covers, it has some interesting capabilities, including the ability to perform question answering and reading comprehension without task-specific training data, and to generate conditional text samples "of unprecedented quality."
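The training objective described above -- predicting the next word given the text so far -- can be illustrated with a toy sketch. To be clear, this is not OpenAI's code: GPT-2 learns this objective with a 1.5 billion-parameter Transformer over 40GB of text, while the sketch below substitutes simple bigram counts over a tiny made-up corpus just to show the idea.

```python
from collections import Counter, defaultdict

# Toy illustration of the "predict the next word" objective.
# GPT-2 learns this with a large Transformer; here we stand in
# simple bigram counts over a tiny hypothetical corpus instead.
corpus = "the model predicts the next word given the previous word".split()

# For each word, count which words follow it and how often.
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word after `word` in the toy corpus."""
    followers = next_counts.get(word)
    return followers.most_common(1)[0][0] if followers else None

print(predict_next("next"))   # prints "word"
print(predict_next("word"))   # prints "given"
```

The real model differs in scale and mechanism, not in the objective: instead of counting pairs, it conditions on the entire preceding context, which is what lets it continue a prompt coherently over whole paragraphs.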

"The model is chameleon-like -- it adapts to the style and content of the conditioning text," PureAI wrote about the project. "This allows the user to generate realistic and coherent continuations about a topic of their choosing."

The 1.5B final model version of GPT-2 released Tuesday is the largest version, and offers code and model weights "to facilitate detection of outputs of GPT-2 models."

A paper detailing the release can be found here.

An independent tutorial on working with the code can be found here.

About the Author

Becky Nagel is the vice president of Web & Digital Strategy for 1105's Converge360 Group, where she oversees the front-end Web team and deals with all aspects of digital strategy. She also serves as executive editor of the group's media Web sites, and you'll even find her byline on the group's newest site for enterprise developers working with AI. She recently gave a talk at a leading technical publishers' conference about how changes in Web technology may impact publishers' bottom lines. Follow her on Twitter @beckynagel.