What do Sentimenti tools do? Dr Jan Kocoń explains

Dr Jan Kocoń is a natural language engineer. He is responsible for the machine learning behind SentiTool, our tool for analyzing emotions in text. He coordinates the work of the linguistics team, integrates the individual components of the tool, and works closely with the IT team.

When you tell someone about Sentimenti and our tools, what do you say first?

Sentimenti is a project in which we analyze emotions in text. Unlike competing solutions, which recognize only the polarity of a text (positive, neutral, negative), our tools are able to understand the text: they assign specific meanings to the words in it, along with the emotions people feel about those meanings. These emotions, in turn, form the knowledge base for a machine learning mechanism that automatically recognizes emotions at the level of sentences and of the whole text.

What does it mean that we analyze emotions in text?

In the research carried out in the project we adapted the Plutchik model, which includes eight basic emotions: joy, sadness, trust, disgust, anticipation, fear, surprise and anger. We are able to estimate the extent to which each of these emotions is expressed in a text.

How do we know what emotions people feel?

The knowledge base behind our project includes more than 30,000 word meanings, which 20,000 unique respondents rate for polarity and emotions. We talk about “meanings” rather than “words” because words are ambiguous: for example, “dark” means something different in “dark blue” than in “dark people”, and only in the latter case does it carry emotion. Each meaning will ultimately receive 50 ratings from different people. This lets us know what feelings particular meanings evoke in a text. However, the emotion of a text is not a simple sum of the emotions assigned to the meanings in it...
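As a rough illustration of how many respondents' ratings for one meaning could be combined into a single emotion profile, here is a minimal sketch. It assumes a plain arithmetic mean, a 0 to 4 rating scale, and invented ratings for a hypothetical sense of “dark”; none of these details come from the project itself. It covers only the per-meaning step and, as noted above, does not by itself give the emotion of a whole text.

```python
from statistics import mean

# The eight basic emotions of the Plutchik model (standard English names).
EMOTIONS = ("joy", "sadness", "trust", "disgust",
            "anticipation", "fear", "surprise", "anger")


def emotion_profile(ratings):
    """Average respondents' ratings into one emotion profile for a meaning.

    `ratings` is a list of dicts, one per respondent, mapping an emotion
    name to a score. The 0-4 scale used below is an assumption made for
    this sketch. Emotions a respondent did not rate count as 0.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    return {e: mean(r.get(e, 0) for r in ratings) for e in EMOTIONS}


# Hypothetical ratings from three respondents for an emotionally
# loaded sense of the word "dark" (invented data, for illustration only).
dark_loaded_sense = [
    {"fear": 3, "sadness": 2},
    {"fear": 4, "sadness": 2},
    {"fear": 2, "sadness": 2},
]

profile = emotion_profile(dark_loaded_sense)
print(profile["fear"], profile["sadness"], profile["joy"])
```

With 50 ratings per meaning, as planned in the project, the same function would simply receive a longer list.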

What else makes the tools for analyzing emotions in text work?

Two things come to our aid. The first is our huge database of opinions with associated polarity labels, drawn from many domains: travel, medicine, products and others. We have more than 10 million such texts, which is an excellent source of information about the author’s general attitude. However, to find out what emotions a given text evokes in the reader, we also conduct our own research, analogous to the research on single meanings. This time the subject of the studies is whole texts: respondents assign basic emotions to them, exactly as they do for word meanings.

The second pillar of our tool is a combination of many machine learning methods. Experts in natural language processing provide us with tools for analyzing text at the syntactic and semantic levels. They also create rules for analyzing meanings in context, covering phenomena such as negation, conjecture, and the weakening or strengthening of polarity. These rules support automatic methods, such as deep neural networks, which are used to draw the right conclusions about the emotions in a text.
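To make the idea of context rules more concrete, the sketch below shows one very simple way negation and strengthening or weakening words could adjust the polarity of the next scored word. The lexicon, the word lists, the multipliers and the function name `score_polarity` are all hypothetical, and the real rules written by the project's linguists are certainly richer than this.

```python
# Hypothetical mini-lexicon: word -> polarity in [-1, 1].
LEXICON = {"good": 1.0, "bad": -1.0, "happy": 1.0, "sad": -1.0}
# Hypothetical rule vocabulary (assumptions made for this sketch).
NEGATIONS = {"not", "never", "no"}
MODIFIERS = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}


def score_polarity(text):
    """Sum word polarities, applying negation and intensity rules.

    A negation flips the sign and a modifier rescales the polarity of
    the next lexicon word. Pending rules carry over unknown words and
    are reset once a lexicon word has been scored.
    """
    total = 0.0
    scale = 1.0
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATIONS:
            negate = not negate
        elif word in MODIFIERS:
            scale *= MODIFIERS[word]
        elif word in LEXICON:
            value = LEXICON[word] * scale
            total += -value if negate else value
            scale, negate = 1.0, False
    return total


print(score_polarity("The hotel was not very good."))  # -1.5
```

In a system like the one described, the output of such rules would be one extra signal for the learned models, not a replacement for them.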

What do you think automatic emotion analysis can be useful for?

Ultimately, I see many applications for our tools. The first is the market for advertisements displayed alongside web articles, where ads can be matched to the emotions that the article evokes in readers. For example, a sad text could carry an advertisement for an insurance company, and a joyful text an advertisement for a trip. Another area is brand monitoring, i.e. analyzing how customers write online about a company and its products, and what emotions accompany those opinions. Other interesting areas include sorting customers’ e-mail complaints by the emotions they contain, detecting conflicts in employee correspondence, detecting crises in social media, and even the possibility of diagnosing mental illnesses. The potential is really huge.
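The advertising example can be sketched in a few lines: given an emotion profile for an article, pick the dominant emotion and look up an ad category. The mapping, the category names and the function `pick_ad` are hypothetical and only mirror the two examples mentioned above (sad text and insurance, joyful text and travel).

```python
# Hypothetical mapping from dominant emotion to ad category.
AD_FOR_EMOTION = {"sadness": "insurance", "joy": "travel"}


def pick_ad(profile, default="generic"):
    """Return an ad category for the strongest emotion in `profile`.

    `profile` maps emotion names to scores; emotions without a
    mapped category fall back to `default`.
    """
    if not profile:
        return default
    dominant = max(profile, key=profile.get)
    return AD_FOR_EMOTION.get(dominant, default)


print(pick_ad({"joy": 0.7, "sadness": 0.1}))  # travel
```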

What else do you plan to do in Sentimenti?

So far, we have a prototype that performs a simple analysis of text at the level of meanings, together with a polarity analysis based on our large collection of opinions. Currently, in the Sentimenti team in Wrocław, I am leading the construction of a machine learning mechanism that will combine information from the knowledge base of meanings with information from the natural language processing pipeline. We are constantly receiving new data about the feelings of people reading the texts, and this data forms our training set. The more data we have, the better the quality of the tool.