OpenAI Releases Update of GPT-2 Unsupervised Language Model

San Fransisco, Calif.-based OpenAI on Tuesday released GPT-2 (1.5B), the "final model" release of this version of its popular large-scale unsupervised language model.

According to OpenAI, its GPT project "generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering and summarization -- all without task-specific training."

This new GPT 2 was trained by the non-profit organization to "simply to predict the next word in 40GB of Internet text." However, there were some caveats:

"Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with.

The GPT-2 language has 1.5 billion parameters and is trained on a dataset consisting of 8 million Web pages. According to OpenAI, because of the broad range and types of text it covers, it has some interesting capabilities, including the ability to perform question answering and reading comprehension without task-specific training data, and generate conditional text samples "of unprecedented quality."

"The model is chameleon-like -- it adapts to the style and content of the conditioning text," PureAI wrote about the project. " This allows the user to generate realistic and coherent continuations about a topic of their choosing."

The 1.5B final model version of GPT-2 released Tuesday is the largest version, and offers code and model weights "to facilitate detection of outputs of GPT-2 models."

A paper detailing the release can be found here.

An independent tutorial on working with the code can be found here.

About the Author

Becky Nagel is the vice president of Web & Digital Strategy for 1105's Converge360 Group, where she oversees the front-end Web team and deals with all aspects of digital projects at the company, including launching and running the group's popular virtual summit and Coffee talk series . She an experienced tech journalist (20 years), and before her current position, was the editorial director of the group's sites. A few years ago she gave a talk at a leading technical publishers conference about how changes in Web browser technology would impact online advertising for publishers. Follow her on twitter @beckynagel.