During my presentation “A whistlestop tour of SSIS addins” yesterday at SQLBits, I demonstrated how Jessica Moss & Andy Leonard’s Twitter Task could be used in conjunction with SSIS’s Term Extraction component to discover what people might be saying in their replies to a Twitter user.
The Term Extraction component is a little-known and rarely used component in SSIS’s bag of tricks, but it’s rather interesting and can be quite useful too. It uses data mining algorithms to extract nouns and/or phrases from the text that is passed through it and then gives each term or phrase a score based on how frequently it occurs. Here’s another description from the documentation:
The Term Extraction transformation extracts terms from text in a transformation input column, and then writes the terms to a transformation output column. The transformation works only with English text and it uses its own English dictionary and linguistic information about English.
You can use the Term Extraction transformation to discover the content of a data set. For example, text that contains e-mail messages may provide useful feedback about products, so that you could use the Term Extraction transformation to extract the topics of discussion in the messages, as a way of analyzing the feedback.
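To make the idea of frequency-based scoring a bit more concrete, here is a rough Python sketch. It is not how the Term Extraction transformation is implemented: it simply counts words, does no noun or phrase detection, and the stop-word list, the `score_terms` function and the sample tweets are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only. The real
# transformation relies on its own English dictionary instead.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in",
              "is", "at", "for", "on", "was", "i", "it"}


def score_terms(texts, min_frequency=2):
    """Score each non-stop word by how often it occurs across all texts."""
    counts = Counter()
    for text in texts:
        # Lower-case the text and split it into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    # Keep only terms seen at least min_frequency times, most frequent first.
    return [(term, n) for term, n in counts.most_common() if n >= min_frequency]


# Made-up sample tweets, not real search results.
tweets = [
    "Great session at sqlbits today",
    "Heading to sqlbits in november",
    "The november sqlbits session was great",
]

for term, score in score_terms(tweets):
    print(term, score)
```

The `min_frequency` cut-off in this sketch is only there to drop terms that appear once, so that the recurring topics stand out.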
There appeared to be quite a bit of interest in what I demoed, although one question from the audience, which I had anticipated, was “Can the Twitter Task perform searches on Twitter?” The answer, currently, is no.
Hence, I’ve put together a package that uses a script component to do exactly that – queries Twitter for tweets containing a given search term. To use it you simply enter whatever you want to search for in the SearchTerm variable:
and execute the package. Here are the results from the Term Extraction component after doing a search on Twitter for “sqlbits” and “november”:
Download the package from here: http://cid-550f681dad532637.skydrive.live.com/browse.aspx/Public/BlogShare/20091122