
Monday, June 23, 2008

Finding the sentiment of online conversations

One of the most important aspects of online conversations is the sentiment of what the author is saying. Are they positive about you, negative or apathetic? The difference is vitally important, but very hard to determine due to the complexity of language.

Let's look at what I mean by complexity of language. Most services out there take a post and try to identify what is being said by scanning the full range of words in it. They keep lists of positive words like "great", "awesome" and "l33t" (for the hacker crowd), as well as negative words like "sucks" and "terrible". If no word from either list is found, the post is considered neutral.

I'm sure you can see the error in this. A post could be negative overall, but avoid these words. It could also use one negative word, but be positive overall. What is needed is true contextual language processing (which is expensive and requires a lot of development).
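To make the problem concrete, here is a minimal sketch of the word-list approach in Python. The word lists, the function name and the sample posts are my own illustration, not any vendor's actual implementation, and treating a tie between positive and negative hits as neutral is an assumption.

```python
import re

# Hypothetical word lists for illustration; a real service would use far larger ones.
POSITIVE = {"great", "awesome", "l33t"}
NEGATIVE = {"sucks", "terrible"}

def word_list_sentiment(post: str) -> str:
    """Label a post by counting hits against fixed positive and negative word lists."""
    words = re.findall(r"[a-z0-9']+", post.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    # No hits at all, or a tie (assumption), counts as neutral.
    return "neutral"

# A negative post that avoids every listed word is scored neutral.
print(word_list_sentiment("I returned it after two days and bought a competitor's product."))
# A positive post that happens to use one negative word is scored negative.
print(word_list_sentiment("I thought it would be terrible, but I could not put it down."))
```

Both sample posts are misread, which is exactly the failure described above: the method only sees individual words, never the context around them.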

Here are a few examples of sentiment analysis.

Collective Intellect is a social media monitoring solution that we work with. Part of their analysis covers the language within conversations and the sentiment displayed there. The sentiment is then tracked over time and can be a key metric in the success of a campaign. Their formula for extracting the sentiment is not publicly accessible so I am not sure how they calculate it.

Summize is a Twitter search engine. In their labs section is a sentiment analyzer that lets you enter a keyword and get the real-time sentiment. If you play with this for a while you will see some issues, as I found out when I sent this link out on Twitter.

Picture 29.png
Picture 30.png
Picture 31.png
Picture 32.png
*Note that Luke works with me here in Cleveland.

Here is a sample of the output for the term "marketing".

Picture 27.png

Another service that uses Twitter as the basis for an engaging experience around sentiment is Twistori. Twistori takes a few key terms like "love", "hate", "feel" and "wish" and creates a dynamic timeline based on the use of those terms. It's very cool to watch the service extract the terms, and after a few minutes you see how difficult it is to get sentiment right.

Picture 28.png

So, do you look at the sentiment of online conversations? There is still no better filter than to read back through a blogger's posts to get their real feeling at this point. Technology is evolving quickly, but so is language.

How are you tracking sentiment online? Is there a tool that I missed? Let me know!


Comments

I think Visible Technologies and Radian 6 also monitor sentiment, similar to Collective Intellect.
Given the ease of use and cost, Radian 6 seems to be the industry's favorite.

I never considered Twistori as a sentiment measuring tool before. I used to think of it as a novelty. It's a good illustration of the difficulty.

Warren -- Thanks for the comment! I'll have to check out Visible as well (Radian 6 is on my radar).

Hi Matt,

You are right that it is difficult. In most discussions, I hear from users of automated sentiment analysis technology who are disappointed with the accuracy of current systems. Human language is complex and changing (e.g., "wicked" is now positive, mostly). Measuring sentiment relative to a topic is even more difficult, as your colleague Luke pointed out. For example, "The new Corvette is fast" versus "Global warming is happening fast". Is "fast" positive or negative? It depends. Some technologies approach this problem through NLP and others through statistical classification, which requires the system to be trained for each topic and retrained on errors. The error rates are still high, but there is lots of interesting work taking place in this space.
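As a rough illustration of the "trained for each topic" idea, here is a minimal sketch of a tiny Naive Bayes classifier that keeps separate word statistics per topic. The class name, the topics and the four training sentences are all hypothetical, and a real system would need far more data; this is not how any particular vendor does it.

```python
import math
from collections import Counter, defaultdict

class TopicSentimentClassifier:
    """Tiny multinomial Naive Bayes, trained separately for each topic."""

    def __init__(self):
        # topic -> label -> word counts
        self.word_counts = defaultdict(lambda: defaultdict(Counter))
        # topic -> label -> number of training posts
        self.doc_counts = defaultdict(Counter)

    def train(self, topic, text, label):
        self.word_counts[topic][label].update(text.lower().split())
        self.doc_counts[topic][label] += 1

    def classify(self, topic, text):
        labels = self.doc_counts[topic]
        total_docs = sum(labels.values())
        vocab = {w for counts in self.word_counts[topic].values() for w in counts}
        best_label, best_score = None, -math.inf
        for label, n_docs in labels.items():
            counts = self.word_counts[topic][label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                # Add-one smoothing so an unseen word does not rule out a label.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training data: the same word, "fast", under two topics.
clf = TopicSentimentClassifier()
clf.train("cars", "the new corvette is fast", "positive")
clf.train("cars", "the engine is slow and noisy", "negative")
clf.train("climate", "global warming is happening fast", "negative")
clf.train("climate", "emissions fell this year", "positive")

print(clf.classify("cars", "this car is fast"))
print(clf.classify("climate", "the ice is melting fast"))
```

With this toy data, "fast" pulls a post toward positive under the cars topic and toward negative under the climate topic. Retraining on errors would simply mean calling train again with the corrected label.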

You are right, however, that "there is still no better filter than to read back through a blogger's posts to get their real feeling...". If a brand's goal is to truly participate in the conversation online, then we will take the time to read what people are saying.

There is another interesting filter to consider: actions. The powerful thing about social media is that it isn't just words. It's words + actions. As users interact, comment, bookmark, link, vote or tweet, they are leaving digital breadcrumbs behind that tell us a whole lot about the sentiment towards a piece of content or an idea. We can measure the dynamics of conversations.

So while a post may be positive or negative, did its readers find it important or was the post quickly forgotten? Was there a broad engagement around the topic? Did the idea propagate? Is the author becoming influential on your topic as measured by others' reactions?

All of these conversational dynamics are measurable and give you lots of additional ways to sort content and measure the impact of your online marketing efforts.

Good post,
Marcel
twitter: @lebrun

Matt,

Sentiment analysis of online conversations is one of the great challenges in this business. That is why we combine trained experts (real people) and the best language processing technology available today to provide companies with insights from individual posts, subsequent comments and overall sentiment for entire threaded conversations. Many firms in the social media business don't do automatic sentiment analysis because it's not easy.

Hi Matt,
Per your question of: "Their (Collective Intellect) formula for extracting the sentiment is not publicly accessible so I am not sure how they calculate it."

CI would like to make public how sentiment is currently calculated.

Sentiment = The Collective Intellect tonality algorithm performs linguistic and statistical analysis on each post to determine its overall sentiment (positive, negative, or neutral). The CI algorithm is designed to operate effectively across a broad range of domain areas. The algorithm has been tested on several standard sentiment datasets (such as movie reviews) and consistently performs at a level close to human inter-rater accuracy.

And if you are wondering what human inter-rater accuracy is, here is that definition.

Human inter-rater accuracy (really correlation) is a measure of how well a set of independent raters correlate on rating a test set. For example, for a particular test set, if the sentiment ratings of a set of 4 human raters only correlate at 70% (on average) then that is the best accuracy level any automated system could hope to attain on that test set.
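One simple way to put a number on this is average pairwise agreement: for every pair of raters, count the share of posts they labeled the same way, then average across pairs. The sketch below assumes that definition, which may differ from the measure CI actually uses, and the ratings are made up for illustration. Chance-corrected statistics such as Cohen's kappa are a common alternative.

```python
from itertools import combinations

def mean_pairwise_agreement(ratings):
    """Average fraction of items on which each pair of raters gives the same label.

    `ratings` is a list of per-rater label lists, all of the same length.
    """
    pair_scores = []
    for a, b in combinations(ratings, 2):
        matches = sum(x == y for x, y in zip(a, b))
        pair_scores.append(matches / len(a))
    return sum(pair_scores) / len(pair_scores)

# Hypothetical labels from 4 human raters on the same 5 posts.
raters = [
    ["pos", "neg", "neu", "pos", "neg"],
    ["pos", "neg", "pos", "pos", "neg"],
    ["pos", "neu", "neu", "pos", "neg"],
    ["neg", "neg", "neu", "pos", "neg"],
]
print(f"{mean_pairwise_agreement(raters):.0%}")  # prints 70%
```

In this made-up test set the four raters agree 70% of the time on average, so by the reasoning above an automated system scored against them could not be expected to do better than about 70%.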

Thanks for the link to that Twistori site - really cool!

about me

  • Matt Dickman is a blogger, speaker and technology evangelist working as SVP, Digital Marketing at Fleishman-Hillard.

    This is his personal blog and the thoughts and opinions expressed here are his and do not necessarily represent the views of his employer or its clients.
