Can Google Perspective, Facebook AI and Twitter algos improve web civility?

It’s hard to put a value on thwarted conversations. But here’s what we do know: the web is not a very civil place, and we’re the ones who lose out. As per a detailed report on online harassment (PDF):

Seventy-two percent of American internet users have witnessed harassment online.

Nearly half have personally experienced harassment.

Almost a third self-censor what they post for fear of retribution.

We know that online harassment can be a contributing factor to suicide – though the exact effect of cyber-bullying on suicide rates is unclear. In the U.S., suicide is the second leading cause of death for people aged 15-24 and 25-34.

For those who publish content online – which is pretty much every company these days – the problem of cultivating civility at scale is vexing. So is limiting abusive comments without stifling open discourse. This has caused many publishers to abandon comments entirely, outsource them to Facebook Comments (which doesn’t solve this problem), or consider radical solutions, such as this Norwegian site, which requires readers to pass a test showing they read the article before commenting.

Startups have taken on this challenge too. One example is Civil Comments, which uses crowdsourcing to moderate comments based on community norms. But a human-powered solution of any kind is not appealing to the likes of Facebook and Twitter.

They are more interested in humanizing their algorithms so that the solution they achieve will have massive scale. That means limiting human involvement. Whether machine learning can fix the problems of civility is an open question, but it does create a good debate.

Google Perspective – a machine learning approach to comment civility

There’s been a slew of activity in this space lately, most notably the announcement of Google Perspective, a new project from Jigsaw, an incubator that is part of Alphabet, Google’s parent company. As per Chris Kanaracus of Constellation:

"Jigsaw is hoping to radically change this through a new API that uses machine learning models to screen website and forum content and determine whether it is “toxic.”"

Jigsaw CEO Jared Cohen described these tools in a blog post, including a use case at the New York Times:

"We’ve been testing a version of this technology with The New York Times, where an entire team sifts through and moderates each comment before it’s posted—reviewing an average of 11,000 comments every day. That’s a lot of comments. As a result the Times has comments on only about 10 percent of its articles. We’ve worked together to train models that allows Times moderators to sort through comments more quickly, and we’ll work with them to enable comments on more articles every day."

So in this case it’s an efficiency play for the human moderators, wading through rafts of comments faster. Obviously, if the Times can enable commenting on more articles, they stand to see increased engagement on those pieces. One perk of machine learning tools: in theory, they get better as more data gets munched. Cohen:

"This technology is still developing. But that’s what’s so great about machine learning—even though the models are complex, they’ll improve over time. When Perspective is in the hands of publishers, it will be exposed to more comments and develop a better understanding of what makes certain comments toxic."

Publishers will be able to use the Perspective API to tie into their existing platforms. They can link their own comments with the hundreds of thousands in Google’s database. Perspective will then rate the publisher’s comments based on their similarity to previously flagged comments.
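To make the integration concrete, here is a minimal Python sketch of how a publisher might score a comment and sort a moderation queue. The endpoint URL and JSON field names reflect my understanding of Perspective’s public documentation and should be treated as assumptions to check against the current docs. The `triage` helper and its 0.5 threshold are hypothetical, not part of the API.

```python
import json
import urllib.request

# Assumed endpoint and JSON shapes; verify against the current Perspective API docs.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text):
    """Build the JSON body asking for a TOXICITY score for one comment."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def extract_toxicity(response):
    """Pull the summary toxicity score (0.0 to 1.0) out of a response dict."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def score_comment(text, api_key):
    """Send one comment to the API and return its score (needs network and a key)."""
    request = urllib.request.Request(
        f"{ANALYZE_URL}?key={api_key}",
        data=json.dumps(build_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_toxicity(json.load(reply))


def triage(scored_comments, review_threshold=0.5):
    """Split (text, score) pairs into a review queue and a likely-fine list.

    The threshold is a hypothetical publisher setting, not a Perspective default.
    Both lists are ordered from highest score to lowest.
    """
    ordered = sorted(scored_comments, key=lambda pair: pair[1], reverse=True)
    review = [text for text, score in ordered if score >= review_threshold]
    likely_fine = [text for text, score in ordered if score < review_threshold]
    return review, likely_fine
```

In a setup like the one the Times describes, the scores would not replace moderators. They would only decide which comments a human looks at first.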

Facebook – AI for suicide prevention

BuzzFeed reported on March 1 that Facebook is using AI for a different type of project: suicide prevention. The “AI” identifies posts indicating suicidal or harmful thoughts. Facebook’s urgency is no doubt sparked by several horrific suicides that took place on Facebook Live, highlighting once again that connectivity doesn’t solve for despair.

The AI works by scanning posts and comments and comparing them to others where intervention was warranted. This could mean flagging for review by the community team. Facebook would then reach out to the so-called “at risk” user, giving them a screen of suicide prevention resources, which include options like contacting a helpline or reaching out to a friend.

The AI seems to be doing a better job of flagging at-risk individuals than the humans around them:

“The AI is actually more accurate than the reports that we get from people that are flagged as suicide and self injury,” Facebook Product Manager Vanessa Callison-Burch told BuzzFeed News in an interview. “The people who have posted that content [that AI reports] are more likely to be sent resources of support versus people reporting to us.”

Twitter – using algorithms to crack down on abusive accounts

2016 marked the year that Twitter finally got much more aggressive about abuse and trolling on its platform – years too late for some critics. On March 1, 2017, Twitter announced it will start using algorithms to identify and restrict abusive accounts.

Twitter is trying to do more than police keywords here; BuzzFeed reports that the algorithmic changes also consider the relationship between users when assessing abuse. This function is part of a broader set of tools that Twitter has made available to users in recent months, allowing them a greater degree of control over who can contact them, muting options, and blocking/abuse reporting enhancements.

Twitter VP of Engineering Ed Ho is aware that automating civility is easier said than done:

"Twitter seems to expect this approach will have hiccups, as Ho acknowledged: “Since these tools are new we will sometimes make mistakes, but know that we are actively working to improve and iterate on them every day.” There isn’t a process to appeal any of the penalties yet, though Twitter’s plan to “iterate every day” indicates that may change."

My take – civility tools can be gamed

Of these three initiatives, Facebook gets my highest marks. Suicide is an ideal target for machine learning given the high stakes of the problem and the difficulty humans have in taking action around it. It’s not surprising to learn that AI can already exceed human abilities here. We all struggle to figure out who amongst us is really in trouble, and what on earth to do about it.

Granted, throwing up a list of suicide prevention hotlines to those at risk is a pretty mechanized gesture. But Facebook is taking this further, adding to the number of suicide counselors available via third parties on Facebook Live. Facebook has also made the controversial decision to allow those who have been reported as “at risk” to keep broadcasting on Live. Per BuzzFeed:

"Facebook’s decision to maintain the live broadcast of someone who’s been reported as at-risk for self-harm is clearly fraught. But the company appears ready to risk broadcasting a suicide if doing so gives friends and family members a chance to intervene and help. “There is this opportunity here for people to reach out and provide support for that person they’re seeing, and for that person who is using Live to receive this support from their family and friends who may be watching,” said Facebook researcher Jennifer Guadagno. “In this way, Live becomes a lifeline.”"

We can debate this decision but I believe it shows courage on Facebook’s part – and I don’t often say that.

Gaming Google Perspective has proven easy. On Fortune, Jeff John Roberts wrote about giving the filter a test drive. Perspective flagged obscene words but others slipped through:

"I also tried a lesser-known insult (“libtard”) that’s become a nasty slang term and fixture on certain political sites. It appeared to pass muster. The sarcastic phrase, “nice work, libtard” only obtained a 4% “toxicity” score, raising the possibility that would-be trolls will start reaching for newer or unusual slurs to avoid detection."

Boing Boing reported on a study, Deceiving Google’s Perspective API Built for Detecting Toxic Comments (PDF):

"Within days of its release, independent researchers have published a paper demonstrating a way of tricking Perspective into trusting ugly messages, just by introducing human-readable misspellings into their prose."

The researchers say that these adversarial examples “have been shown to be effective against different machine learning algorithms even when the adversary has only a black-box access to the target model.”
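To see why human-readable misspellings are such an effective trick, consider the toy Python sketch below. It is not Perspective, which uses learned models rather than a word list, and it is not the researchers’ method. The blocklist, the scoring rule, and the dot-insertion perturbation are all hypothetical stand-ins that show the general weakness: a filter can only react to the surface forms it has seen.

```python
import re

# Hypothetical blocklist for illustration only; Perspective uses learned models,
# not a word list.
BLOCKLIST = {"idiot", "stupid", "moron"}


def toy_toxicity(text):
    """Return the fraction of words in the text that appear on the blocklist."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(word in BLOCKLIST for word in words) / len(words)


def misspell(text):
    """Insert a dot after the first letter of every blocklisted word."""
    def perturb(match):
        word = match.group(0)
        if word.lower() in BLOCKLIST:
            return word[0] + "." + word[1:]
        return word
    return re.sub(r"[A-Za-z]+", perturb, text)


original = "you are an idiot"
disguised = misspell(original)
print(original, toy_toxicity(original))    # one of four words is flagged
print(disguised, toy_toxicity(disguised))  # the same insult no longer matches
```

A person still reads “i.diot” as the insult it is, but the toy scorer drops from 0.25 to 0.0. A learned model is harder to fool than a word list, yet the study suggests the same kind of gap can be found by trial and error.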

Twitter, meanwhile, is going to struggle with a machine-automated attempt to make its site more civil. Twitter will always have to weigh openness and anonymity against civil discourse. It’s never going to be a comfortable mix.

I like what Civil Comments is doing. For enterprise communities, a crowdsourced/moderated approach to comment civility can work well. Nothing beats a set of community norms that the community itself is determined to uphold.

Let’s see how these tools evolve. In the meantime, posting comments will always take a bit of courage. Hopefully we can support those who go out on limbs that matter.

Michel Colaci