How to develop a disinformation analysis platform using NLP, Social Network Analysis and conversational agent technologies?

Nowadays, the Internet and social media are flooded daily with disinformation, misinformation and fake news. Social network users are overstimulated by an excess of information, and that overload is exploited to spread false ideas and fake news. This results in the creation of new "truths" and the manipulation of populations around the world. In the end, disinformation and fake news are a form of propaganda that polarizes public opinion, promotes extremism and hate speech, and ultimately manipulates democracies. There are multiple examples in different fields where these techniques are applied: in politics, the Facebook-Cambridge Analytica data scandal, the Brazilian elections and the Brexit referendum; in public health, the fake news related to COVID-19; and in the financial sector, cryptocurrencies and NFTs.

Figure 1. Fake news and propaganda.

For these reasons, in the LieSense project we have designed a platform to capture, contrast and visualize the online dissemination of fake news and disinformation among different communities. This is achieved through tracking tools, NLP and SNA techniques, and visualization, with a chatbot serving as the user interface. These tools will integrate data from different media and social media sources, as well as objective information from fact-checking efforts. The next section explains the initial architecture designed for LieSense.

Architecture Overview

Microservice architectures are very versatile and allow for very loose coupling of functionality between the elements of the platform. In LieSense, each microservice belongs to one of several groups of services according to its purpose: Data Collection, Natural Language Processing, Social Network Analysis, Storage, Data Visualization and Virtual Assistant (chatbot).

Figure 2. Architecture Overview

Moreover, an additional component, the LieSense Middleware, will be introduced to provide common features, tie all the other services together and supply data processing logic. The LieSense Middleware is a software layer that connects the different components, and it serves two main purposes. On the one hand, it provides interoperability between services, together with features such as distribution of functionality, scalability, load balancing and fault tolerance. These functionalities fall into several categories: application-specific routing, information exchange, management and support. The application-specific features deliver services for different parts of the applications, while the information-exchange features deal mainly with information management. On the other hand, the middleware also orchestrates operations within the platform by running them as orchestrated tasks (i.e., pipelines).
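To make the idea of an orchestrated task more concrete, here is a minimal sketch of a pipeline that passes a document through a fixed sequence of steps. It is only an illustration: the `Pipeline` class, the step functions and the dictionary-based document are hypothetical names chosen for this example, and in a real deployment each step would be a call to a separate microservice rather than a local function.

```python
from typing import Any, Callable, Dict, List

# A document flowing through the platform, modeled as a plain dict (assumption).
Document = Dict[str, Any]
Step = Callable[[Document], Document]


class Pipeline:
    """Runs steps in order; each step stands in for a call to a microservice."""

    def __init__(self, name: str, steps: List[Step]) -> None:
        self.name = name
        self.steps = steps

    def run(self, doc: Document) -> Document:
        for step in self.steps:
            doc = step(doc)  # the output of one step is the input of the next
        return doc


def collect(doc: Document) -> Document:
    # Hypothetical data collection step: normalize the raw text.
    return {**doc, "text": doc["raw"].strip()}


def analyze(doc: Document) -> Document:
    # Hypothetical NLP step: a token count stands in for real analysis.
    return {**doc, "n_tokens": len(doc["text"].split())}


def store(doc: Document) -> Document:
    # Hypothetical storage step: only marks the document as stored.
    return {**doc, "stored": True}


pipeline = Pipeline("ingest", [collect, analyze, store])
result = pipeline.run({"raw": "  Example claim to check  "})
print(result)
```

Keeping each step as an independent function with the same input and output type is what allows steps to be reordered or replaced without touching the others, which mirrors the loose coupling described above.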

First of all, the data collection microservice uses scraping techniques and APIs to obtain data from structured and unstructured data sources and converts these data into linked data formats. Linked data enables both the reuse of ingestion modules and interoperability, and it models the data in a uniform schema for processing.
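As an illustration of the conversion into linked data, the sketch below maps a scraped post onto a JSON-LD document. The input field names (`url`, `text`, `user`, `created_at`) and the function `to_jsonld` are hypothetical, and the choice of the schema.org vocabulary is an assumption made for this example; the vocabulary actually used in LieSense is not specified here.

```python
import json
from typing import Any, Dict


def to_jsonld(post: Dict[str, Any]) -> Dict[str, Any]:
    """Map a scraped post (hypothetical field names) onto a JSON-LD document."""
    return {
        "@context": "https://schema.org",  # assumed vocabulary
        "@type": "SocialMediaPosting",
        "@id": post["url"],
        "articleBody": post["text"],
        "author": {"@type": "Person", "name": post["user"]},
        "datePublished": post["created_at"],
    }


# Made-up example record, as it might come out of a scraper or an API client.
raw_post = {
    "url": "https://example.org/posts/1",
    "text": "Example claim to check",
    "user": "example_user",
    "created_at": "2022-01-01T12:00:00Z",
}

linked = to_jsonld(raw_post)
print(json.dumps(linked, indent=2))
```

Because every source is mapped onto the same schema, the downstream services only need to understand one document shape, regardless of where the data came from.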

Secondly, the NLP microservices gather various analysis tasks that enrich the input data with sentiment and psychological analysis or moral value estimation. Furthermore, to assist in the fact-checking process, they perform entity recognition, stylometry and automatic summarization.

Thirdly, the network analysis microservices cover the fields of Social Network Analysis (SNA) and Agent-Based Social Simulation. They are responsible for the analysis, modeling and simulation of the dissemination of disinformation campaigns in social media.
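To give a flavor of what simulating dissemination can look like, the sketch below runs an independent cascade model over a small follower graph. This is an assumption made for illustration only: no particular diffusion model is named for LieSense, and the graph, the function `independent_cascade` and its parameters are hypothetical.

```python
import random
from typing import Dict, List, Set


def independent_cascade(
    graph: Dict[str, List[str]],
    seeds: List[str],
    p: float,
    rng: random.Random,
) -> Set[str]:
    """Return the set of users reached when each newly reached user gets
    one chance to pass the message to each follower with probability p."""
    active: Set[str] = set(seeds)
    frontier: List[str] = list(seeds)
    while frontier:
        next_frontier: List[str] = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # Each (node, neighbor) attempt happens only once.
                if neighbor not in active and rng.random() < p:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


# Toy graph: an edge u -> v means that v sees what u shares.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}

rng = random.Random(42)  # fixed seed so that runs are repeatable
reached = independent_cascade(graph, ["a"], p=0.5, rng=rng)
print(sorted(reached))
```

Running the simulation many times with different seed users and values of `p` gives an estimate of how far a message tends to spread, which is the kind of question an agent-based simulation of a disinformation campaign tries to answer.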

The data visualization microservices aggregate the analysis results and present them through easy-to-use dashboards for users and fact-checking professionals. In addition, the storage microservices save the data in NoSQL and RDF databases.

Finally, the virtual assistant microservice lets users interact with the chatbot through a messaging channel to obtain more information about dubious content, report a case of disinformation, and provide feedback to the system.
