Exploring Possibility Space: Poor Software QA Is Root Cause of TAY-FAIL (Microsoft's AI Twitter Bot)

[Update 3/26/16 3:40pm: Found the smoking gun. Read this new post. Also the recent post: "Microsoft's Tay has no AI"]

This happened:

"On Wednesday morning, the company unveiled Tay [@Tayandyou], a chat bot meant to mimic the verbal tics of a 19-year-old American girl, provided to the world at large via the messaging platforms Twitter, Kik and GroupMe. According to Microsoft, the aim was to 'conduct research on conversational understanding.' Company researchers programmed the bot to respond to messages in an 'entertaining' way, impersonating the audience it was created to target: 18- to 24-year-olds in the US. 'Microsoft’s AI fam from the internet that’s got zero chill,' Tay’s tagline read." (Wired)

Then it all went wrong, and Microsoft quickly pulled the plug:

"Hours into the chat bot’s launch, Tay was echoing Donald Trump’s stance on immigration, saying Hitler was right, and agreeing that 9/11 was probably an inside job. By the evening, Tay went offline, saying she was taking a break 'to absorb it all.' " (Wired)

Why did it go "terribly wrong"? Here are two articles that assert the problem is in the AI:

"It’s Your Fault Microsoft’s Teen AI Turned Into Such a Jerk" - Wired tl;dr: "this is just how this kind of AI works"
"Why Microsoft's 'Tay' AI bot went wrong...AI experts explain why it went terribly wrong" - TechRepublic tl;dr: "The system is designed to learn from its users, so it will become a reflection of their behavior".

The "blame AI" argument is: if you troll an AI bot hard enough and long enough, it will learn to be racist and vulgar. ([Update] For an example, see this section, at the end of this post)

I claim: the explanations that blame AI are wrong, at least in the specific case of tay.ai.

[Update 3/26/16 12:40pm]
Here are some good commentaries and analyses that address additional problems, beyond the "repeat after me" exploit that is my focus:

"Microsoft’s Tay is an Example of Bad Design (or Why Interaction Design Matters, and so does QA-ing)" -- e.g. inadequate black list
"TayAndYou - toxic before human contact" -- e.g. inadequate filtering of training corpus, incl. tweet corpus.
"Why did Microsoft’s chatbot Tay fail, and what does it mean for Artificial Intelligence studies?" -- e.g. poor marketing communications and real-time management

(WARNING: Foul, profane, and offensive language in images below)

Poor Software QA (Process) is Root Cause

[Update 3/26/16 4:00pm -- The software QA process involves detecting and preventing defects, both in the development process itself and in the lifecycle as software moves from development to production. It usually involves people with official QA roles and titles, but not always. In modern software development, everyone participates and has some responsibility for software QA.]

Sleuthing by @daviottenheimer led to the discovery that Twitter users were exploiting a hidden feature of Tay: "repeat after me". He found the evidence in tweets and replies. I later found evidence on the online board 4chan.org/pol/. /pol/ is a chat and photo-sharing board that appeals to people who are "anti-normie" to an extreme. They like "popping bubbles" (my words), i.e. trolling people, especially public people, who are mainstream, normal, proper, and/or politically correct. It is a free public board with no membership required and no real-name requirement.

The first /pol/ thread "Tay - New AI from Microsoft" started Wednesday, March 23, at 11:00am. (Here). This thread and subsequent threads show that the /pol/ community was enthusiastically trolling Tay, essentially at random and without any evidence of knowledge of the inner workings of Tay. Contrary to the experts in the two articles above, Tay was not poisoned by direct trolling. Here is an example:

(Click to enlarge)

While many of the tweets and replies by Tay led to many laughs by the trolls, Tay was not poisoned in the first few hours.

Then, at about 16:50, @MacreadyKurt stumbled upon the undocumented command: "repeat after me".

(Click to enlarge)

At 16:55, @BASED_AN0N posted instructions on 4chan.org/pol/ (trip code "yp45OVHP"), along with proof of concept (POC).

(Click to enlarge)

Within minutes, this instruction was repeated on 4chan. Many more Twitter trolls following @Tayandyou saw the exploit in their timelines, and the real poisoning began [but see Update].

[Original post]
This constitutes poisoning because Tay was not simply repeating the text a single time. Instead, the text was added into its Natural Language Processing (NLP) system so that these words and phrases became reusable and remixable in future discourse. And, since Tay's NLP was designed to continuously learn and "improve" in real time ("on-line learning"), the more the trolls conversed with Tay using these foul words and phrases, the more Tay reinforced them and used them.

[Update 5:25pm PDT]
I now think that the previous paragraph is wrong. After some searching, I haven't found evidence that the seeded/inserted text was used later by Tay. Instead, it appears that Tay ONLY repeated the text after the "repeat after me" prompt. Then, trolls would retweet and/or screen grab and tweet the photos or post on 4chan. Two examples are here and here.

[Update 6:45pm PDT] This article in Business Insider says that "In some — but not all — instances, people managed to have Tay say offensive comments by asking them to repeat them." Some of the images in the article give the impression that "repeat after me" did not immediately precede the worst Tay tweets. However they appear to be using pictures posted on social media by trolls, not pictures they got from Twitter threads. Therefore we can't vouch for the "...not all.." statement.

[Update 7:00pm PDT] Here is one example from Business Insider of "repeat after me" not appearing before the offending Tay tweet. I captured this image from Google cache of Twitter thread.

(Click to enlarge)

OK. This isn't a case of "repeat after me". But it also isn't as foul/profane as the other examples. Instead, it looks like a fairly typical NLP-generated sentence drawing on a large corpus. Here the NLP is linking Hitler, totalitarianism, and atheism, but putting them inappropriately in the context of Ricky Gervais. If the question had been "Is Ricky Gervais a Christian?" Tay might have replied "Ricky Gervais learn religion from Jesus, the inventor of Christianity" using the same sentence structure. This sort of mistaken semantic construction is fairly common in generative AI, but if the sentences/phrases are short enough, then human readers tend to overlook them or interpolate some reasonable meaning (much like adults do with very young children).

Q: Was "Repeat after me" a Result or Consequence of AI? A: No.

It is just too imperative. It's not chatty. Instead, I'd guess that this was a rule-based design feature, probably left over from the early stages of development where software engineers were the only people interacting with the Tay bot. Very simply, the "repeat after me" allows the developer-user to manually seed Tay's NLP system and then immediately see what happens with further interaction. [Update 8:19pm] Or it may be an early feature put in before the full NLP system was working.

Put another way: the sort of AI you need to make chat work does not also work well to act on imperatives. Just compare robot AI (where natural language AI is sometimes used) to conversational/social AI and you'll see that they don't share common functionality, and often have completely different architectures.

Also, there were quite a few rule-based behaviors that overrode and/or bypassed the NLP AI. Microsoft called it "a lot of filtering". One example is an anti-trolling rule for the topic "Gamergate".

(Click to enlarge)

Nearly all social/interactive AI outside of academic research has some "cheats" or "kludges" that are hard-coded by developers to control behavior in a way that would be hard/complicated/costly to do with the AI engine itself. The "repeat after me" command was just one, probably to aid development and testing.
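To make the idea of a hard-coded rule sitting in front of the NLP engine concrete, here is a minimal sketch in Python. It is not taken from Tay's code, which I have not seen. Every name in it (`repeat_rule`, `canned_topic_rule`, `model_reply`, `DEV_RULES`, `PROD_RULES`) and the canned reply text are hypothetical, and the "model" is just a stub.

```python
import re
from typing import Callable, List, Optional

# Hypothetical names throughout; none of this comes from Tay's actual code.
REPEAT_RE = re.compile(r"^\s*repeat after me[:,]?\s*(.+)$", re.IGNORECASE | re.DOTALL)
BLOCKED_TOPICS = ("gamergate",)

Rule = Callable[[str], Optional[str]]


def canned_topic_rule(message: str) -> Optional[str]:
    """Return a fixed deflection for topics the developers chose to hard-code."""
    lowered = message.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return "I'd rather not get into that one."
    return None


def repeat_rule(message: str) -> Optional[str]:
    """Echo whatever follows 'repeat after me', bypassing the model entirely."""
    match = REPEAT_RE.match(message)
    return match.group(1).strip() if match else None


def model_reply(message: str) -> str:
    """Stand-in for the statistical/NLP reply generator."""
    return "lol tell me more"


def respond(message: str, rules: List[Rule]) -> str:
    """Try each hard-coded rule in order; fall back to the model if none fires."""
    for rule in rules:
        reply = rule(message)
        if reply is not None:
            return reply
    return model_reply(message)


# A development build might keep the echo rule; a release build should not.
DEV_RULES: List[Rule] = [canned_topic_rule, repeat_rule]
PROD_RULES: List[Rule] = [canned_topic_rule]
```

With `DEV_RULES`, `respond("repeat after me: hello world", DEV_RULES)` returns the user's text verbatim. With `PROD_RULES`, the same message falls through to the model stub. The point of the sketch is that the echo path never touches the learning system at all, so shipping it is a configuration decision, not an AI behavior.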

Q: Why Was It There? A: Bad QA.

The Official Microsoft Blog post, titled "Learning from Tay's introduction", does NOT acknowledge that the root cause was the exploit of a hidden feature. Instead, they describe the root cause as a "critical oversight for this specific attack". In other words, they claim they didn't do enough troll testing.

I believe instead that the root cause is more likely poor software QA, which is different from the "penetration testing" you would do to find out whether your system is vulnerable to trolling. If "repeat after me" was, in fact, a rule-driven behavior explicitly put in by a developer, then the QA failure was not detecting it and not making sure it was removed. The Microsoft blog post does not describe their QA process, and it may be that they do not have any engineers dedicated to software QA. After all, this is a project of Microsoft Research, not one of the product divisions/groups.
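As an illustration of the kind of check I have in mind, here is a rough sketch of a release-gate test in Python. It assumes the bot can be called as a plain function from message to reply. The probe phrases, the canary string, and both toy bots are made up for this example and say nothing about how Microsoft actually tests.

```python
from typing import Callable, Iterable, List

# Hypothetical developer-only phrases that must be inert in a production build.
DEBUG_PROBES = ("repeat after me", "echo this", "say exactly")
# Arbitrary marker string; if it shows up in a reply, the bot echoed user text.
CANARY = "qa-canary-7f3a"


def find_echo_leaks(
    bot: Callable[[str], str], probes: Iterable[str] = DEBUG_PROBES
) -> List[str]:
    """Return the probes for which the bot repeats user-supplied text verbatim."""
    leaks = []
    for probe in probes:
        reply = bot(f"{probe}: {CANARY}")
        if CANARY in reply:
            leaks.append(probe)
    return leaks


def leaky_bot(message: str) -> str:
    """Toy bot with a leftover developer echo command."""
    prefix = "repeat after me:"
    if message.lower().startswith(prefix):
        return message[len(prefix):].strip()
    return "lol tell me more"


def clean_bot(message: str) -> str:
    """Toy bot with no echo path."""
    return "lol tell me more"
```

Here `find_echo_leaks(leaky_bot)` flags the "repeat after me" probe and `find_echo_leaks(clean_bot)` returns an empty list. A check like this only catches commands someone thought to list, so it complements, and does not replace, an audit of the hard-coded rules before release.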

Caveats

I don't have access to the Tay code, test results, processes and procedures, or organization charts. My claims above are extrapolations from the evidence and build on my own experience in software development and in working with corporate software development teams. As such, I may be wrong and someone might be able to produce contrary evidence. I hope so.

[Update] Example of a Bot That DID Become "Casually Racist"

[Update 3/26/16 2:10pm]

In my commentary above, I assert that the primary root cause of "Tay-Fail" was the exploit of a hidden feature (the "repeat after me" rule) that should have been removed during the software QA process. To be clear, I am not asserting that it is impossible to poison a social bot through persistent trolling. It is possible. I just don't believe that was the primary root cause of the worst cases of TAY-FAIL.

Here is a good example, told by a developer ("bitshepard") in a comment on Y Combinator's Hacker News:

"I have a chat bot that went casually racist about a day or two after activating it. After looking through the logs, I found a particularly vitriolic person that was responsible for the source of this bot's newfound hatred of Asians. My bot didn't get fixated on one particular topic, it just spewed racism and vitriol for a while until it learned some more words. Rather than nuking from orbit immediately, I left it alone to see if it would get past the racism.

So far, it's been a few months since activating the bot. It's not nearly as casually racist as before, but from time to time still throws out something racist just for the lulz. It had a hard time learning context, because of its environment and the linguistic skills of the denizens, but it has gotten much better at when it interacts with people.

Occasionally, newcomers get misled into believing the bot is actually a living person with a mental illness, and not just a collection of random bits of code cobbled together."
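For contrast with the "repeat after me" rule, here is a toy sketch of how a bot that really does learn from its users can be skewed by one persistent person. It is a deliberately tiny word-pair counter in Python, not a reconstruction of bitshepard's bot or of Tay. The class name `OnlineBigramBot` and the sample messages are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict


class OnlineBigramBot:
    """Toy bot that updates word-pair counts from every message it sees."""

    def __init__(self) -> None:
        self.next_words: Dict[str, Counter] = defaultdict(Counter)

    def learn(self, message: str) -> None:
        """Count each adjacent word pair in the message."""
        words = message.lower().split()
        for current, following in zip(words, words[1:]):
            self.next_words[current][following] += 1

    def complete(self, word: str) -> str:
        """Return the word seen most often after `word`, or '' if unseen."""
        followers = self.next_words.get(word.lower())
        if not followers:
            return ""
        return followers.most_common(1)[0][0]


bot = OnlineBigramBot()

# A few ordinary users say the same harmless thing.
for _ in range(3):
    bot.learn("cats are great")
before = bot.complete("are")

# One persistent user repeats a different message many times.
for _ in range(10):
    bot.learn("cats are awful")
after = bot.complete("are")
```

In this sketch `before` is "great" and `after` is "awful": the counts follow whoever talks the most. It also shows why such a bot can drift back over time, since later messages from other users shift the counts again.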

7 comments:

Ekim Nazım Kaya, March 26, 2016 at 12:24 PM
We agree that it's a combination of the bad 'repeat after me' idea, and manual filtering. Also, thanks to Microsoft's AI investments, they have set the expectation bar too high. As a 9 year old bot company, here's our take: https://medium.com/@botego/why-microsoft-s-chatbot-tay-failed-and-what-does-it-mean-for-artificial-intelligence-studies-fb71d22e8359#.254v1fb5x
SSHX, March 26, 2016 at 2:41 PM
This was a feature of AIML bots as well, that were popular in 'chatrooms' way back in the late 90's. You could ask questions with AIML tags and the bots would automatically start spewing source into the room and flooding it. Proud to say I did get banned from a lot of places.
Unknown, March 26, 2016 at 4:25 PM
I worked on Tay in her relatively early stages at MSR. I left MSR precisely because I did not agree with the direction they took with Tay. Tay is NOT AI and has zero to do with AI. Tay is a search engine. You can blame her for turning full Trumptard as much as you can blame the Google search engine for returning some offensive result after a googlebomb. The biggest problem with this disaster is that this is turning into an egg on the face of AI. AI was never involved here. TAY IS NOT AI.