It takes a bot to know one?

lewisandquark:

A couple of weeks ago, I wrote about GPT-2, a text-generating algorithm whose huge size and long-term analysis abilities mean that it can generate text with an impressive degree of coherence. So impressive, in fact, that its programmers at OpenAI have only released a mini version of the model for now, worried that people may abuse the full-size model’s easy-to-generate, almost-plausibly-human text. 

(below: some text generated by mini-GPT-2, in response to the prompt in italics)

This was a fantastic recipe for chocolate cake with raspberry sauce! I only made a couple of changes to the recipe. First, I added vanilla candles instead of meringues for a more mild and exotic fragrance. Once again, I only used 1 tsp of vanilla syrup for clarity. Second, the chocolate cake whipped cream was tempered by an additional 1 tsp of canola oil. The regular vegan whipped cream is soothing and makes it pleasing to the hungry healthiest person I know!

In the meantime, as OpenAI had hoped, people are working on ways to automatically detect GPT-2’s text. Using a bot to detect another bot is a strategy that can work pretty well for detecting fake logins, video, or audio. And now, a group from the MIT-IBM Watson AI Lab and Harvard NLP has come up with a way of detecting fake text, using GPT-2 itself as part of the detection system.

The idea is fairly simple: GPT-2 is better at predicting what a bot will write than what a human will write. So if GPT-2 is great at predicting the next word in a bit of text, that text was probably written by an algorithm - maybe even by GPT-2 itself.
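To make that idea concrete, here is a minimal sketch in Python. It is not the researchers' code: a tiny bigram counter stands in for GPT-2 (that substitution, the toy corpus, and the function names `train_bigram`, `token_ranks`, and `predictability` are all assumptions made for illustration). For each word, it asks where the actual next word falls in the model's list of guesses, then reports what fraction of words the model saw coming.

```python
from collections import Counter, defaultdict


def train_bigram(corpus_words):
    """Count which word follows which: a tiny stand-in for a real language model."""
    following = defaultdict(Counter)
    for prev, nxt in zip(corpus_words, corpus_words[1:]):
        following[prev][nxt] += 1
    return following


def token_ranks(model, words):
    """For each word after the first, return its rank among the model's guesses
    given the previous word (1 = top guess), or None if it was never predicted."""
    ranks = []
    for prev, actual in zip(words, words[1:]):
        guesses = [w for w, _ in model[prev].most_common()]
        ranks.append(guesses.index(actual) + 1 if actual in guesses else None)
    return ranks


def predictability(ranks, top_k=1):
    """Fraction of words that landed within the model's top_k guesses."""
    if not ranks:
        return 0.0
    hits = sum(1 for r in ranks if r is not None and r <= top_k)
    return hits / len(ranks)


# Hypothetical toy data: the "model" has only ever read this one sentence.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
model = train_bigram(corpus)

bot_like = "the cat sat on the mat".split()            # follows the model's habits
human_like = "the rug sat and the mat on cat".split()  # full of surprises

bot_score = predictability(token_ranks(model, bot_like))
human_score = predictability(token_ranks(model, human_like))
print(f"bot-like text:   {bot_score:.2f} of words were the top guess")
print(f"human-like text: {human_score:.2f} of words were the top guess")
```

The more of the text the model predicted, the more suspicious the text looks. A real detector would swap the bigram counter for GPT-2's own next-word probabilities.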

There’s a web demo that they’re calling Giant Language model Test Room (GLTR), so naturally I decided to play with it.

First, here’s some genuine text generated by GPT-2 (the full-size model, thanks to the OpenAI team being kind enough to send me a sample). Green words are ones that GLTR thought were very predictable, yellow and red words are less predictable, and purple words are ones the algorithm definitely didn’t see coming. There are a couple of mild surprises here, but mostly the AI knew what would be generated. Seeing all this green, you’d know this text is probably AI-generated.

HERMIONE: So, you told him the truth?
Snape: Yes.
HARRY: Is it going to destroy him? You want him to be able to see the truth.
Snape: [turning to her] Hermione, I-I-I'm not looking for acceptance.
HARRY: [smiling] No, it's-it's good it doesn't need to be.
Snape: I understand.
	[A snake appears and Snape puts it on his head and it appears to do the talking. It says 'I forgive you.']
HARRY: You can't go back if you don't forgive.
Snape: [sighing] Hermione.
HARRY: Okay, listen.
Snape: I want to apologize to you for getting angry and upset over this.
HARRY: It's not your fault.
HARRY: That's not what I meant to imply.
	[Another snake appears then it says 'And I forgive you.']
HERMIONE: And I forgive you.
Snape: Yes.
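The color scheme described before the sample boils down to sorting each word into a bucket according to how high it ranked among the model's guesses. A rough sketch follows; the cutoffs of 10, 100, and 1000 and the names `color_for_rank` and `color_summary` are assumptions for illustration, since the post doesn't say where the lines are drawn.

```python
from collections import Counter


def color_for_rank(rank, cutoffs=(10, 100, 1000)):
    """Map a word's rank in the model's predictions (1 = top guess) to a color.
    The cutoffs are assumed values, not taken from the post."""
    green_max, yellow_max, red_max = cutoffs
    if rank <= green_max:
        return "green"   # very predictable
    if rank <= yellow_max:
        return "yellow"  # somewhat less predictable
    if rank <= red_max:
        return "red"     # even less predictable
    return "purple"      # the model definitely didn't see this coming


def color_summary(ranks):
    """Count how many words fall into each color bucket."""
    return Counter(color_for_rank(r) for r in ranks)


# Hypothetical ranks for a five-word snippet.
example_ranks = [1, 3, 57, 2500, 8]
summary = color_summary(example_ranks)
green_share = summary["green"] / len(example_ranks)
print(dict(summary))
print(f"{green_share:.0%} green")
```

A wall of green means the model guessed almost everything, which is the signature of machine-written text in the sample above.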

Here, on the other hand, is how GLTR analyzed some human-written text, the opening paragraph of the Murderbot Diaries. There’s a LOT more purple and red. It found this human writer to be more unpredictable.

I could have become a mass murderer after I hacked my governor module, but then I realized I could access the combined feed of entertainment channels carried on the company satellites. It had been well over 35,000 hours or so since then, with still not much murdering, but probably, I don’t know, a little under 35,000 hours of movies, serials, books, plays, and music consumed. As a heartless killing machine, I was a terrible failure.

But can GLTR detect text generated by another AI, not just text that GPT-2 generates? It turns out it depends. Here’s text generated by another AI, the Washington Post’s Heliograf algorithm that writes up local sports and election results into simple but readable articles. Sure enough, GLTR found Heliograf’s articles to be pretty predictable. Maybe GPT-2 had even read a lot of Heliograf articles during training.

[Image: GLTR’s analysis of a Heliograf article]

However, here’s what it did with a review of Avengers: Infinity War that I generated using an algorithm Facebook trained on Amazon reviews. It’s not an entirely plausible review, but to GLTR it looks a lot more like the human-written text than the AI-generated text. Plenty of human-written text scores in this range.

The Avengers: Infinity War is a movie that should be viewed on its own terms, and not a tell-all about The Hulk.  I have always loved the guys that played Michael Myers, and of all the others like Angel and Griffin, Kim back to Bullwinkle, and Edward James Olmos as the Lion. Special mention must go to the performances of Robert De Niro and Anthony Hopkins.  Just as I would like to see David Cronenberg in a better role, he is a treat the way he is as Gimli.Also there is the evil genius Bugs Bunny and the amazing car chase scene that has been hailed as THE Greatest Tank Trio of All Time ever (or at least the last one).  With Gary Oldman and Robert Young on the run and almost immediate next day in the parking lot to be his lover, he tries to escape in a failed attempt at a new dream.  It was a fantastic movie, full of monsters and beasts, and makes the animated movies seem so much more real.

And here’s how GLTR rated another Amazon review by that same algorithm. A human might find this review to be a bit suspect, but, again, the AI didn’t score this as bot-written text.

The Harry Potter File, from which the previous one was based (which means it has a standard size liner) weighs a ton and this one is huge! I will definitely put it on every toaster I have in the kitchen since, it is that good.This is one of the best comedy movies ever made. It is definitely my favorite movie of all time. I would recommend this to ANYONE!

What about an AI that’s really, really bad at generating text? How does that rate? Here’s some output from a neural net I trained to generate Dungeons and Dragons biographies. Whatever GLTR was expecting, it wasn’t fuse efforts and grass tricks.

instead was a drow, costumed was toosingly power they are curious as his great embercrumb, a fellow knight of the area of the son, and the young girl is the agents guild, as soon as she received astering the grass tricks that he could ask to serve his words away and he has a disaster of the spire, but he was super connie couldn't be resigned to the church, really with the fuse effort to fit the world, tempting into the church of the moment of the son of the gods, there was what i can contrive that she was born into his own life, pollaning the bandit in the land. the ship, i decided to fight with the streets. he met the ship without a new priest of pelor like a particularly bad criters but was assigned as well.as he was sat the social shape and his desire over the river and a few ways that had been seriously into the fey priest. abaewin was never taken in the world. he had told me this was lost for it, for reason, and i cant know what was something good clear, but she had attack them 15, they were divided by a visators above the village, but he went since i was so that he stayed. but one day, she grew up from studying a small lion.

But I generated that biography with the creativity setting turned up high, so my algorithm was TRYING to be unpredictable. What if I turned the D&D bio generator’s creativity setting very low, so it tries to be predictable instead? Would that make it easier for GLTR to detect? Only slightly. It still looks like unpredictable human-written text to GLTR.

he is a successful adventurers of the city and the lady of the undead who would be able to use his own and a few days in the city of the city of bandits. he was a child to be a deadly in the world and the goddess of the temple of the city of waterdeep. he was a child for a few hours and the incident of the order of the city and a few years of research. she was a child in a small village and was invited to be a good deal in the world and in the world and the other children of the tribe and the elven village and the young man was exiled in the world. he was a child to the forest to the local tavern and a human bard, a human bard in a small town of his family and the other two years of a demon in the world.
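For the curious: a "creativity setting" in text-generating neural nets is usually a sampling temperature. Assuming that's what is meant here (the post doesn't say), this sketch shows what the dial does to the odds of picking each candidate next word. The scores and the function name `softmax_with_temperature` are made up for illustration.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw model scores into probabilities. Low temperature sharpens the
    distribution toward the top choice; high temperature flattens it."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    biggest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
scores = [2.0, 1.0, 0.1]
low = softmax_with_temperature(scores, 0.2)   # creativity turned way down
high = softmax_with_temperature(scores, 2.0)  # creativity turned way up

print("low temperature: ", [round(p, 3) for p in low])
print("high temperature:", [round(p, 3) for p in high])
```

At low temperature the generator almost always picks its own favorite word, which is why the second biography keeps circling back to "the city" and "the world". Predictable to itself is not the same as predictable to GPT-2, though, which may be why GLTR still isn't convinced.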

GLTR is still pretty good at detecting text that GPT-2 generates - after all, it’s using GPT-2 itself to do the predictions. So, it’ll be a useful defense against GPT-2-generated spam.

But, if you want to build an AI that can sneak its text past a GPT-2 based detector, try building one that generates laughably incoherent text. Apparently, to GPT-2, that sounds all too human.

For more laughably incoherent text, I trained a neural net on the complete text of Black Beauty, and generated a long rambling paragraph about being a Good Horse. To read it, and GLTR’s verdict, enter your email here and I’ll send it to you.