Smart Text Generator Unnerves OpenAI Group, Promises More Testing
OpenAI researchers just wanted to train next-generation text-generation software to predict the next word in a sentence. The text generator did far better than that, adapting to the writing styles it was fed and even excelling at translation and summarization, tasks it wasn’t designed to do.
It did so well that the humans in charge decided it was not a good idea to unleash the little beast on the unsuspecting world.
Researchers at the non-profit AI research group OpenAI got more than they bargained for when teaching the software to predict the next word in a sentence. It was so good at mimicking human writing that they’ve decided to pump the brakes on the research while they test it further and explore the damage it could do.
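The training objective the researchers describe, predicting the next word in a sentence, can be sketched in miniature with a toy bigram counter. This is a drastic simplification for illustration only, not OpenAI's actual method; GPT-2 is a large neural network, and the corpus and function names below are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words tend to follow it."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Tiny made-up corpus, echoing the unicorn example below.
corpus = ("the unicorns spoke perfect english and the unicorns lived "
          "in a remote valley in the andes")
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # "unicorns" is the most common follower
```

GPT-2 does the same job with billions of learned parameters instead of raw counts, which is what lets it carry style and topic across whole paragraphs rather than one word at a time.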
Elon Musk has been clear that he believes artificial intelligence is the “biggest existential threat” to humanity. Musk is one of the primary funders of OpenAI, and its researchers appear to share his concerns about opening a Pandora’s box of trouble.
A story by Rhett Jones posted on Gizmodo, citing The Guardian, explained as simply as possible why the generator is just too good. Engineers first pulled 40 GB of data from over 8 million web pages just to train the software, called GPT-2. The next step was to feed it a small amount of text, such as a Hemingway paragraph or a Poe opening line. GPT-2 took off from there.
Here is an example of a first-paragraph input:
In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.
In just a few seconds, it had analyzed the whimsical piece and then produced nine paragraphs that fit the storyline perfectly:
The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science.
Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved.
Dr. Jorge Pérez, an evolutionary biologist from the University of La Paz, and several companions, were exploring the Andes Mountains when they found a small valley, with no other animals or humans. Pérez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow.
The Guardian was able to take the software for a spin and tried out the first line of George Orwell’s Nineteen Eighty-Four: “It was a bright cold day in April, and the clocks were striking thirteen.” The program picked up on the tone of the selection and proceeded with some dystopian science fiction of its own:
“I was in my car on my way to a new job in Seattle. I put the gas in, put the key in, and then I let it run. I just imagined what the day would be like. A hundred years from now. In 2045, I was a teacher in some school in a poor part of rural China. I started with Chinese history and history of science.”
It took GPT-2 seconds to write about a dystopian world where everything is doublespeak and completely nuts. That makes the software far more advanced than earlier versions. Dario Amodei, OpenAI’s research director, said the size made it unique. The models “were 12 times bigger, and the dataset was 15 times bigger and much broader” than prior AI models. It was trained on a dataset containing about 10 million articles, selected from Reddit links with more than three votes. The vast amount of text, 40 GB, is enough to store about 35,000 copies of Moby Dick.
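The Moby Dick comparison is easy to sanity-check with back-of-the-envelope arithmetic. The per-copy size is an assumption here: a plain-text copy of the novel runs to roughly 1.2 MB.

```python
# Rough check of the article's comparison.
dataset_bytes = 40 * 10**9       # 40 GB of training text
moby_dick_bytes = 1.2 * 10**6    # ~1.2 MB per plain-text copy (assumption)
copies = dataset_bytes / moby_dick_bytes
print(round(copies))             # roughly 33,000 copies
```

That lands in the low-30,000s, consistent with the article's "about 35,000" figure.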
The powers that be at OpenAI decided to release a far less powerful version for now.