Finding the sentiment of online conversations
One of the most important aspects of online conversations is the sentiment of what the author is saying. Are they positive about you, negative or apathetic? The difference is vitally important, but very hard to determine due to the complexity of language.
Let's look at what I mean by complexity of language. Most services out there look at a post and try to identify what is being said by scanning the full range of words it contains. They have lists of positive words like "great", "awesome", "l33t" (for the hacker crowd) as well as negative words like "sucks", "terrible", etc. If neither group of words is found, the post is considered neutral.
I'm sure you can see the error in this. A post could be negative overall, but avoid these words. It could also use one negative word, but be positive overall. What is needed is true contextual language processing (which is expensive and requires a lot of development).
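To make the problem concrete, here is a minimal sketch of the word-list approach described above. The word lists, the function name `keyword_sentiment`, and the tie-breaking rule are my own assumptions for illustration; this is not the code of any service mentioned in this post.

```python
import re

# Hypothetical word lists for illustration; a real service would use far larger ones.
POSITIVE_WORDS = {"great", "awesome", "l33t"}
NEGATIVE_WORDS = {"sucks", "terrible"}


def keyword_sentiment(post: str) -> str:
    """Label a post by counting words from the two lists."""
    words = re.findall(r"[a-z0-9']+", post.lower())
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    # Assumption: a tie, including no listed words at all, counts as neutral.
    return "neutral"


# A clearly negative post that avoids every listed word.
print(keyword_sentiment("I returned it after two days and will never buy from them again."))
# prints "neutral"

# A positive post that happens to use one negative word.
print(keyword_sentiment("The old version sucks compared to this one, which I can't stop using."))
# prints "negative"
```

Both posts are mislabeled, and adding more words to the lists would not help, because the mistake comes from ignoring context rather than from missing vocabulary.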
Here are a few examples of sentiment analysis.
Collective Intellect is a social media monitoring solution that we work with. Part of their analysis is of language within conversations and the sentiment that is displayed there. The sentiment is then tracked over time and can be a key metric in the success of a campaign. Their formula for extracting the sentiment is not publicly accessible so I am not sure how they calculate it.
Summize is a Twitter search engine. In their labs section is a sentiment analyzer that lets you enter a keyword and get the real-time sentiment. If you play with it for a while you will see some issues, as I found out when I sent this link out on Twitter.



Here is a sample of the output for the term "marketing".

Another service that uses Twitter as the basis for an engaging experience around sentiment is Twistori. Twistori takes a few key terms like "love", "hate", "feel" and "wish" and creates a dynamic timeline based on the use of those terms. It's very cool to watch the service extract the terms, and after a few minutes you see how difficult it is to get sentiment right.

So, do you look at the sentiment of online conversations? At this point, there is still no better filter than to read back through a blogger's posts to get their real feeling. Technology is evolving quickly, but so is language.
How are you tracking sentiment online? Is there a tool that I missed? Let me know!
Technorati Tags: conversations, marketing, Matt Dickman, search, social media, Techno//Marketer, Twitter, sentiment







I think Visible Technologies and Radian 6 also monitor sentiment, similar to Collective Intellect.
Given the ease of use and cost, Radian 6 seems to be the industry's favorite.
I never considered Twistori as a sentiment measuring tool before. I used to think of it as a novelty. That's a good illustration of the difficulty.
Posted by: Warren Sukernek | Monday, June 23, 2008 at 11:42 PM
Warren -- Thanks for the comment! I'll have to check out Visible as well (Radian 6 is on my radar).
Posted by: Matt Dickman | Monday, June 23, 2008 at 11:46 PM
Hi Matt,
You are right that it is difficult. In most discussions, I hear users of automated sentiment analysis technology who are disappointed with the accuracy of current systems. Human language is complex and changing (ex: "wicked" is now positive, mostly). Measuring sentiment relative to a topic is even more difficult, as your colleague Luke pointed out - for example, "The new Corvette is fast" versus "Global warming is happening fast". Is "fast" positive or negative? It depends. Some technologies approach this problem through NLP and others through statistical classification requiring the system to be trained for each topic and retrained on errors. The error rates are still high, but there is lots of interesting work taking place in this space.
You are right, however, that "there is still no better filter than to read back through a blogger's posts to get their real feeling...". If a brand's goal is to truly participate in the conversation online, then we will take the time to read what people are saying.
There is another interesting filter to consider: actions. The powerful thing about social media is that it isn't just words. It's words + actions. As users interact, comment, bookmark, link, vote or tweet, they are leaving digital breadcrumbs behind that tell us a whole lot about the sentiment towards a piece of content or an idea. We can measure the dynamics of conversations.
So while a post may be positive or negative, did its readers find it important or was the post quickly forgotten? Was there a broad engagement around the topic? Did the idea propagate? Is the author becoming influential on your topic as measured by others' reactions?
All of these conversational dynamics are measurable and give you lots of additional ways to sort content and measure the impact of your online marketing efforts.
Good post,
Marcel
twitter: @lebrun
Posted by: Marcel LeBrun | Tuesday, June 24, 2008 at 02:31 AM
Matt,
Sentiment analysis of online conversations is one of the great challenges in this business, which is why we combine trained experts (real people) and the best language processing technology available today to provide companies with insights from individual posts, subsequent comments and overall sentiment for entire threaded conversations. Many firms in the social media business don't do automatic sentiment analysis because it's not easy.
Posted by: Mike Spataro | Tuesday, June 24, 2008 at 10:01 AM
Hi Matt,
Per your question of: "Their (Collective Intellect) formula for extracting the sentiment is not publicly accessible so I am not sure how they calculate it."
CI would like to make public how sentiment is currently calculated.
Sentiment = The Collective Intellect tonality algorithm performs linguistic and statistical analysis on each post to determine its overall sentiment (positive, negative, or neutral). The CI algorithm is designed to operate effectively across a broad range of domain areas. The algorithm has been tested on several standard sentiment datasets (such as movie reviews) and consistently performs at a level close to human inter-rater accuracy.
And if you are wondering what human inter-rater accuracy is, here is that definition.
Human inter-rater accuracy (really correlation) is a measure of how well a set of independent raters correlate on rating a test set. For example, for a particular test set, if the sentiment ratings of a set of 4 human raters only correlate at 70% (on average) then that is the best accuracy level any automated system could hope to attain on that test set.
Posted by: Michael Conti | Tuesday, June 24, 2008 at 02:44 PM
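The inter-rater figure described in the comment above can be sketched in a few lines. This is only an illustration under my own assumptions: it uses simple pairwise percent agreement rather than a true correlation or a chance-corrected statistic, and the function name and the ratings are made up. It is not Collective Intellect's method.

```python
from itertools import combinations


def average_pairwise_agreement(ratings_by_rater):
    """Average, over every pair of raters, the share of posts they labeled the same."""
    pair_scores = []
    for first, second in combinations(ratings_by_rater, 2):
        matches = sum(a == b for a, b in zip(first, second))
        pair_scores.append(matches / len(first))
    return sum(pair_scores) / len(pair_scores)


# Made-up labels from 4 human raters for the same 5 posts.
ratings = [
    ["pos", "neg", "neu", "pos", "neg"],
    ["pos", "neg", "pos", "pos", "neg"],
    ["pos", "neu", "neu", "pos", "neg"],
    ["neg", "neg", "neu", "pos", "neu"],
]

print(f"{average_pairwise_agreement(ratings):.0%}")  # prints "60%"
```

If the human raters agree with each other only 60% of the time, an automated system scored against their labels has a similarly limited ceiling, which is the point the comment makes.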
Thanks for the link to that Twistori site - really cool!
Posted by: Adam Singer | Tuesday, June 24, 2008 at 03:13 PM