Detecting The Fakes In Online Misinformation

Fake news (feik nju:z), noun: false, often sensational, information disseminated under the guise of news reporting.

In 2017, Collins Dictionary (published by HarperCollins) named “fake news” its word of the year, ahead of other shortlisted words that were equally influenced by politics. Not only was the term consistently in the news, it also seemed to be the most favoured phrase on President Donald Trump’s Twitter feed.

Fake news, or false propaganda, has existed since the days of yore. But only recently, with the birth of the internet and the emergence of social media, has it had far-reaching implications. According to a recent Telegraph report, fake news is “... now seen as one of the greatest threats to democracy, free debate and the Western order”.

While theories abound about the probable causes of the rise in such untruths (from the ‘breeding ground’ of the US election to the backyard of the social media giants themselves, Facebook and Twitter), the reality is that fake news stories are a rising epidemic.

Finding the Fakes

There are plenty of tips on telling right from wrong online. Facebook has a useful list that includes checking the sources, looking at the URL and watching for unusual formatting. Tips from other sources include checking the publisher’s credibility and visiting fact-checking websites such as Snopes.com, FactCheck.org and the International Fact-Checking Network (IFCN), among others. But the proliferation of misinformation in the everyday media around us has made it challenging to identify trustworthy news sources (despite the tips!), necessitating computational tools to determine the reliability of online content.

Best Paper Award

A group of researchers from IIIT-H, Rajat Singh, Nurendra Chowdhary, Ishita Bandlish and Prof Manish Shrivastava of the Language Technologies Research Centre and the Kohli Centre on Intelligent Systems, has proposed a novel approach to detecting fake news on websites through automated processes. In their paper titled “Neural Network Architecture for Credibility Assessment of Textual Claims”, they write that fabricated news articles with political or financial agendas shape public opinion in ways that affect society as a whole, making this a very serious problem. Their approach, named Credibility Outcome (CREDO), aims to score the credibility of an article in an open-domain setting.

The paper won the Best Paper Award (First Place) at the International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), held in Hanoi, Vietnam, in March. As per the conference website, the award is given by the Award Committee, taking into account the following criteria: novelty, originality and importance of the reported work, and the overall quality of the paper.

CREDO

Previous and related research has covered automatic fact checking, rumour detection, sentiment analysis, semantic similarity and credibility analysis. An approach that models writing style or linguistic features should also take into account information about the author and the website, which the approaches so far have not done.

The IIIT-H team has built a tool called CREDO, which includes the following modules: keyword extraction, document retrieval, author credibility scores, website and author trust scores, semantic similarity and sentiment analysis.

They conducted experiments on Snopes to demonstrate the effectiveness of their modules compared to previous approaches to the problem. Based on the results of their testing, they concluded that the most important module is semantic similarity, followed by sentiment analysis, and then the author and website scores. The experiments also show that the choice of classifier plays a major role in the output. The researchers concluded that Support Vector Machines (SVM) with an RBF kernel and a neural network architecture (Multilayer Perceptron) give the best results for the problem.
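To give a rough sense of what a semantic similarity module and its place in such a pipeline might look like, here is a minimal Python sketch. It is not the authors' implementation: the word-count cosine similarity is a simple stand-in for whatever similarity measure CREDO actually uses, the function names (tokenize, cosine_similarity, build_feature_vector) are hypothetical, and the author and website scores are assumed to be supplied by other modules. Sentiment analysis is left out for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts.

    Assumption: bag-of-words overlap is only a crude proxy for the
    semantic similarity module described in the article.
    """
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared_words = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared_words)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no similarity to anything
    return dot / (norm_a * norm_b)


def build_feature_vector(claim, retrieved_docs, author_score, website_score):
    """Combine per-module scores into one feature list for a classifier.

    Hypothetical layout: [best similarity, author score, website score].
    """
    similarities = [cosine_similarity(claim, doc) for doc in retrieved_docs]
    best_similarity = max(similarities, default=0.0)
    return [best_similarity, author_score, website_score]
```

A feature vector like this would then be handed to a classifier. The article reports that an SVM with an RBF kernel or a Multilayer Perceptron worked best, and either could be trained on such vectors with a standard machine learning library.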

Future Refinements

The tool was built for credibility analysis of unstructured text articles in an open-domain setting. The plan is to extend and apply this system to domains like social media, and enhance CREDO with more sophisticated features like writing styles and so on.

 

 

Sarita Chebbi has been a stay-at-home mom of two for the longest time. When she’s not painstakingly documenting heirloom recipes from her mom into a tattered diary, she can be found on the yoga mat, trying to impress her family with her pretzel poses. She also likes to nit-pick on the written word.
