Last year, three researchers from the computer science department at Stony Brook University discovered that they could predict the success of literary works using statistical and lexical analysis. The results, published in the paper, “Success with Style: Using Writing Style to Predict the Success of Novels,” claimed an impressive success rate of 84 percent for literature and 89 percent for movie scripts.
That’s great for the literary arts, but could this work have relevance to other businesses, wondered Mukunda Krishnaswamy, founder and CTO of MimosaSoft. Specifically, could this type of analysis be applied to website or marketing copy to predict its effectiveness? Or, could it be coupled with sentiment analysis to enhance a company’s understanding of how its customers feel about its products?
“Engagement with customers is the primary challenge that marketers face,” said Krishnaswamy, citing an ExactTarget study in which 65 percent of marketers make that claim. Extending what they learned from the Stony Brook research, the MimosaSoft team built MimoLex, a tool that performs lexical analysis of marketing collateral and helps marketers listen to customer feedback from a number of channels, including social media sites.
“MimoLex helps you emotionally connect with your customers, and it allows you to listen better and to respond to issues in a more timely manner,” said Krishnaswamy.
While lexical analysis sounds similar to sentiment analysis, there are some important distinctions, Krishnaswamy said.
“Lexical analysis breaks down the textual content into verbs, nouns, compounds, etc. [We use it] to look for correlations with engagement metrics such as time spent on a page or percentage of page exits.” He said that MimoLex uses a process similar to that used by the researchers at Stony Brook.
Sentiment analysis, on the other hand, extracts tokens (phrases or words) from text and sorts them into sentiment categories (for example, strong positive, weak positive, neutral, weak negative, strong negative, profanities, major problems, and minor problems).
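To make the categorization step concrete, here is a minimal Python sketch of dictionary-based sentiment counting. It is not MimoLex's implementation: the `LEXICON` entries and the `categorize` function are hypothetical, and the sketch handles only single words, whereas the tool described here also extracts phrases.

```python
import re
from collections import Counter

# Hypothetical lexicon mapping a word to a sentiment category.
# The entries are illustrative only and are not taken from MimoLex.
LEXICON = {
    "love": "strong positive",
    "good": "weak positive",
    "slow": "weak negative",
    "terrible": "strong negative",
    "crashes": "major problems",
    "typo": "minor problems",
}

def categorize(text):
    """Count tokens per sentiment category; unknown words count as neutral."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(LEXICON.get(token, "neutral") for token in tokens)

counts = categorize("I love the product, but checkout is slow and the app crashes.")
for category, count in counts.most_common():
    print(f"{category}: {count}")
```

Dividing each count by the total number of tokens would give per-category percentages of the kind quoted in the article.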
“MimoLex uses the sentiment counts to look for relationships with the engagement metrics,” said Krishnaswamy. “For example, a page with 15 percent strong positive sentiments might get a 30 percent exit rate, whereas a page with 7 percent strong positive sentiments might get a 65 percent exit rate.”
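One simple way to look for such a relationship is a correlation coefficient between the share of strong positive sentiment and the exit rate. The sketch below is an assumption about how that could be done, not a description of MimoLex's method. The first two data points are the ones Krishnaswamy cites; the other three are invented purely so the example has enough rows to compute on.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# (strong positive %, exit rate %) per page.
# Rows 1-2 come from the article; rows 3-5 are hypothetical filler.
pages = [(15, 30), (7, 65), (12, 40), (3, 80), (10, 50)]
strong_positive = [p[0] for p in pages]
exit_rate = [p[1] for p in pages]

r = pearson(strong_positive, exit_rate)
print(f"correlation: {r:.2f}")
```

A value near -1 on real data would suggest that pages with more strongly positive language tend to lose fewer visitors, though correlation alone would not show that the wording causes the difference.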
Built specifically for the SAP HANA platform, MimoLex is what the company calls a template application (tApp), which is designed to work out of the box. The solution is currently focused on helping businesses in the e-commerce, publishing, and hospitality industries. It uses natural-language processing to analyze text, and HANA’s in-memory platform allows that analysis to occur in real time. To use MimoLex, you install it and point it to your Google Analytics page. It will then analyze your web pages and crawl web sources connected to your Google Analytics setup. You also can ask it to evaluate other digital sources individually. Initial evaluations are based on a pre-built lexical dictionary in which each entry is flagged as having a negative or positive connotation.
The results are shown in a report similar to the one at left. For example, the top-level report might identify 50 pages with potential problems. You can then drill down to an individual page and focus on each flagged problem area.
“Some could be false,” said Krishnaswamy, “such as the use of a word like ‘nerd.’ It could be a problem in some contexts, but not others.” You can make the judgment to change or ignore the issues, and you can compare negative pages with positive pages to identify differences. If you make a change, you can run the analysis again to quickly see how the revised page compares to pages rated positively with low exit rates.
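The flag-then-review loop can be sketched in a few lines of Python. Everything here is hypothetical: the `CONNOTATION` dictionary, the `flag_terms` function, and the sample page text are illustrations of the idea, not MimoLex's dictionary or API. The `ignore` set stands in for the reviewer's judgment that a flagged word is acceptable in context.

```python
import re

# Hypothetical connotation dictionary; each entry is flagged negative or positive.
CONNOTATION = {
    "nerd": "negative",
    "dirty": "negative",
    "easy": "positive",
    "reliable": "positive",
}

def flag_terms(text, ignore=frozenset()):
    """Return negative-connotation words found in text, minus reviewer-ignored ones."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sorted(
        {t for t in tokens if CONNOTATION.get(t) == "negative" and t not in ignore}
    )

page = "Our nerd squad cleans dirty data with an easy, reliable workflow."

flags_all = flag_terms(page)                         # first pass: everything flagged
flags_reviewed = flag_terms(page, ignore={"nerd"})   # after judging "nerd" harmless here

print(flags_all)
print(flags_reviewed)
```

Rerunning the same function on a revised page gives the quick before-and-after comparison described above.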
An early customer, Winshuttle, is using MimoLex to compare the sentiment value of the content pages on its website. In particular, the company is using it as an analysis tool for its bloggers.
“[MimoLex] gives more insight into styling and sentiments expressed by different content authors,” said Jim O’Farrell, Winshuttle’s director of product marketing. “Coming up with a consistent theme and sentiment across all pages is the unique value that this tool provides for us.”
One issue that MimoLex identified for Winshuttle was that a concept the company had introduced on its website, “dirty data,” had developed strongly negative connotations because of the way content authors were using it. “We steered them away [from the phrase] and toward using data quality management phraseology,” said O’Farrell.
In its current release, MimoLex performs a limited set of functions—a tradeoff for ease of use. However, MimosaSoft is planning to enhance the feature set with a new release this summer, including real-time competitor comparison, dictionary customization, and more user-friendly reports.
Michael Nadeau is the founding publisher of Data Informed and is currently a content consultant. You may reach him by email at email@example.com.