Track 1 – Text Mining/Big Data: Merging Two Worlds

Track 2 – Enterprise Text Analytics: Applications & Tools

Wednesday, October 3, 2012

7:30-8:45am

Registration & Breakfast


8:45-9:30am

Welcome Remarks

Keynote
Future Directions in Text Analytics

Text Analytics is one of those phrases that mean different things to different people. And this is a good sign, a sign that the field is still young and growing with myriad influences and directions. Text Analytics World is also changing and growing to cover the full spectrum of this variegated world of text analytics.

In this welcoming keynote, Tom Reamy looks across the whole range of text analytics technologies, techniques, and applications to explore the latest ideas in Text Mining, Big Data, Social Media and Customer Intelligence, Enterprise Applications, Intelligence Applications, and new models of knowledge and categorization. We will look at questions such as:

  • What are the major trends in the different areas of text analytics?
  • What are the problems and issues that are slowing down text analytics?
  • What new technologies and developments have the potential to dramatically advance text analytics?
  • Are the next few years going to be evolutionary or revolutionary?
  • Will companies actually figure out how to get real business value from social media?

The talk will then look at the most difficult but most promising trend in text analytics – how do you put all these pieces together? Because it is only by integrating the various aspects of text analytics that its full potential can be unlocked.

  • What is the best way to add text to Big Data to make it even bigger?
  • How do we add intelligence and depth to Social Media?
  • How do we build new Enterprise Applications that incorporate the 90% of information that is unstructured?
  • What kinds of knowledge models will we need to tie all these disparate fields together?
  • And, lastly, how do we integrate text analytics and predictive analytics?

We don’t have all the questions yet, much less the answers, but it is time to start thinking and talking about these big questions if text analytics is going to fulfill its promise.

Speaker: Tom Reamy, Chief Knowledge Architect, KAPS Group




9:35-9:45am

Sponsored by: Expert System, Smart Logic

Elevator Pitch



9:45-10:15am

Networking / Exhibit Hall



10:15-11:00am

All Levels
Track 1: Text Mining and Predictive Analytics
Social Media Intelligence: Text and Network Mining combined

This paper combines text mining and network mining, revealing new heterogeneous insights in social media that are not detectable using either technique alone. Sentiment analysis of online forum posts, together with reference structures from the quotation network, allows the detection of individuals’ sentiment as well as their relative influence in the underlying discussion forum.

Originally performed for Deutsche Telekom and The Economist, the techniques and methods presented here use public data to demonstrate the approach for social media. A conclusion is drawn about the relevance and practicalities of this new approach along with a recommendation for next steps.

Speaker: Rosaria Silipo, Data Mining Consultant, DMR



10:15-11:00am

Expert/Practitioners level
Track 2: Enterprise Text Analytics Applications
Harnessing the power of text analytics to drive human capital

Human capital analysts have long relied on structured data from employees to better understand what drives their performance (e.g., surveys, HRIS). As text analytic resources have become more powerful and accessible, unstructured data from employees can now be better leveraged to drive employee performance, understand the employee experience, make human capital decisions, and improve business outcomes. This session will focus on how this growing area can be used to help companies better understand drivers of employee engagement and behavior as well as to better assess company culture, cultural alignment, and impact of cultural alignment on company performance.

Speaker: David Youssefnia, President, Critical Metrics, LLC

Speaker: Dr. Charles Scherbaum, Director of Research and Analytics, Critical Metrics, LLC



11:00-11:15am

Break



11:15-12:00pm

All Levels
Track 1: Text Mining and Predictive Analytics
Contextually Augmented Predictive Models for Call Center Next Best Action

Using predictive analytics in real time to make call-center agents effective at problem solving, selling, and customer service is one of the next frontiers for CRM. The Next Best Action for Call Centers solution puts predictive analytics directly in the hands of call-center agents. Over and above propensity models based on historical data, we exploit the invaluable data source of agent-customer conversations to gather information about customer context. This helps backend predictive models to better score and rank product offers for selling to customers. This case study shows a 25% boost in revenue for a global bank’s call center in a cross-sell/up-sell deployment.

Speaker: Shantanu Godbole, Analytics Architect, IBM GPS CRM



11:15-12:00pm

Expert/Practitioners level
Track 2: Superior Search: How Text Analytics Improves Search, and the Resulting Applications

Text analytics and Search are kissing cousins, and will always be intertwined. A range of applications, both traditional and new, result from combining the two sets of technology. This session will cover:

  • A survey of text analytics and search technology: how they are similar and where they’re not
  • Highlights of how text analytics is used within search engines
  • Example applications combining text analytics and search technologies

Speaker: Jeff Fried, CTO and VP Engineering, BA Insight



12:00-1:00pm

Lunch



1:00-1:45pm

All Levels
Track 1: Text Mining and Predictive Analytics
Predictive Coding in E-Discovery

An expert panel discussion of the text mining and predictive analytics foundation, evolution and current applications of predictive text mining in the litigation arena as a method to bring down spiralling document review costs. Predictive coding is currently being reviewed by several courts in the U.S. and is a very hot topic in litigation circles. Panel to consist of presenter, another legal expert and 1 to 2 legal technology experts.

Speaker: Gerard Britton, DOAR Litigation Consulting LLC

Speaker: Dr. Herb Roitblat, Co-Founder, OrcaTec

Speaker: Eoin Beirne, Executive Managing Director, H5

Speaker: Martha Harrison, Ropes & Gray



1:00-1:45pm

Expert/Practitioners level
Track 2: Text Analytics / Social Media Tools
Selecting the Right Social Media Monitoring Tools

Organizations across the world are increasingly using Social Media for many strategic objectives. Governments want to monitor their political opponents and propagate their policies while businesses want to see how their brand is performing vis-a-vis their competitors. More sophisticated organizations use it for reputation management, IP protection, sales and marketing, customer service and engaging with influencers. All this requires you to create a framework for monitoring Social Media channels like Twitter, Facebook and others. In this session, we will look at some of the issues related to the field of Social Media Monitoring. This will be a fast paced and to-the-point session based on Real Story Group’s Digital Marketing research stream. We will cover:

– Social Media Monitoring and Intelligence – Basics
– Understanding the marketplace – Scenarios, Categories and Tools
– Selecting the right tools for your needs

Speaker: Jarrod Gingras, Director, Advisory Services, Real Story Group



1:45-2:00pm

Break



2:00-2:45pm

All Levels
Track 1: Text Mining and Predictive Analytics
User-Adaptive and Guiding R&D Planning System Empowered by Text Mining

This session presents an R&D planning support system, InSciTe Adaptive, which aims to support decision-making processes, with a particular focus on R&D strategy planning. The system combines text mining and Semantic Web technologies to extract technical terms and complex relations among technologies, organizations, and products from technical documents and news articles, and to integrate those heterogeneous sources interoperably. It identifies and analyzes core and emerging technologies and suggests customized R&D strategies based on process flows designed by scenarios. In addition, it identifies user interest in real time and helps users achieve their goals in a user-adaptive, guiding way.

Speaker: Seungwoo Lee, Senior Researcher, Korea Institute of Science and Technology Information



2:45-3:15pm

Networking / Exhibit Hall



3:15-4:00pm

All Levels
Track 1: Big Data and Text Analytics
Text Analytics on Two Million Documents: A Case Study

To make sense of big data, we must know what each document is about and what terms and entities it contains. NLP techniques are processing-intensive, so correct set-up is key when working with big unstructured data.

We will describe an experiment in which nearly two million publications from CiteSeer were loaded onto Amazon Elastic Compute Cloud and each publication was processed using a keyword extraction API. We cover lessons learned in setting up a cloud environment for such a large dataset, provide an overview of keyword extraction techniques, and explain which ones can scale to handle Big Data.

Speaker: Alyona Medelyan, Research & Development, Pingar



3:15-4:00pm

Expert/Practitioners level
Track 2: Enterprise Text Analytics Applications
The Semantic Value of Textual Domain Representations

The success of business efforts to acquire and transfer relevant expert knowledge requires an improved understanding of how human cognitive factors may influence how users interpret textual representations of work-related domains. When discussing human factors, important distinctions and similarities between fundamental aspects of cognitive psychology and HCI will be presented. The concepts of semantic value and cognitive alignment will be discussed along with their relevance to various types of domain descriptions. The presentation will then explore key research results, including the results of the presenter’s 2011 comparative study of task domain representations, which offer new perspectives about textual content and textually-based domain descriptions. Finally, a typical business case will be used to demonstrate value creation opportunities resulting from using domain representations to create better cognitive alignment between individuals and groups. The research results and other concepts should be useful for executives, managers, and consultants looking for new ways to make knowledge transfer solutions more effective.

Speaker: Mark Riddell, President, The Marrell Group, LLC



4:00-4:15pm

Break



4:15-5:00pm

Expert/Practitioners level
Track 2: Enterprise Text Analytics Applications
Social Semantics for an Effective Enterprise

Automatic generation of semantic markup is an exciting reality in information and knowledge management. Various approaches offer statistical or rule-based solutions, including components for entity recognition, syntactic parsing, natural language processing, and so forth. The true value of these solutions is that they provide the scaffolding for unstructured data. In an enterprise environment, this value is increased immensely with human/computer interactions. As content owners and creators are brought into the process of evaluating the representation of their data, the interactive capacity and ROI of text analytics are solidified. In this presentation, Sarah will discuss the planning, development, and maintenance stages of text analytics in reference to a semantic system.

Speaker: Sarah Ann Berndt, Taxonomist, Johnson Space Center




5:05-5:15pm

Break



5:15-6:00pm

Beyond Bag of Words: Taking Statistical Text Mining to the Next Level

Statistical text mining approaches typically treat text as an unordered “bag of words”, yet humans use context for understanding. We describe a group of techniques for incorporating context into statistical text mining approaches and demonstrate the success of these approaches on real-world problems including churn prediction, survey analysis, and fraud detection.

Speaker: Dr. John Elder, CEO and Founder, Elder Research, Inc.

Speaker: Dr. Andrew Fast, Director of Research, Elder Research, Inc.




6:00-7:30pm

Reception



Track 1 – Social Media and Text Analytics

Track 2 – Text Analytics – Techniques / How to Build

Thursday, October 4, 2012

8:00-8:55am

Registration & Breakfast


8:55-9:00am

Welcome Remarks – Tom Reamy, Chair


9:00-9:45am

Keynote
Unified Access to Enterprise Information

The combination of too much information and increasing IT complexity makes it difficult for businesses and IT departments alike to understand and react to customers, trends, or competition in a timely manner. Information is scattered across transactional systems, email archives, call center records, social media, and the Internet. Gathering and compiling it quickly, and then analyzing it and acting on it, has become a daunting task. What is needed is a single point of information access and management that can quickly gather, process, find, and analyze information from all sources, structured or unstructured. Unified information access systems, which combine the features of BI and search software, are now beginning to supplant these earlier systems. Unified information access platforms offer a new system architecture to:

  • Provide a single point of access to multiple sources and types of structured and unstructured information
  • Combine structured data and data operators with text and semistructured operations and analytics within a single architecture, which is not dependent on predefined schemas or federated models (These platforms typically include tools for semantic understanding, including fuzzy matching and a range of search and text analytics routines, as well as structured data and analytics operations.)
  • Interpret queries appropriately for each type of data, then merge and analyze the results to uncover relationships across sources (e.g., bringing together all the transactions, interactions, and documents that mention a particular customer)
  • Provide appropriate tools to prepare, merge, analyze, and present information from multiple sources
  • Scale to petabytes of information or billions of key value pairs, offering incremental updates in near real time

These new information platforms and the applications that are built on them, called InfoApps, depend heavily on text analytics to unite data and content, and to understand the meaning of the information.

Sue Feldman, IDC’s VP for search and discovery technologies, will discuss these new information access platforms, their benefits, and the effects that they will have on enterprises in the next five years.

Speaker: Sue Feldman, Research Vice President, Search and Discovery Technologies, IDC



9:45-10:15am

Networking / Exhibit Hall



10:15-11:00am

All Levels
Track 1: Social Media and Beyond
Beyond Sentiment Hype: Using Conversation Context for Accurate Discovery

Most use of sentiment analysis in social media to date has been extremely limited. Analytics with dashboards full of traffic light symbols gloss only the most obvious features of social conversations, often obscuring the real reactions and trends which move opinion about a product or a company. In this session, we will discuss the causes, both technical and human, behind the failure of early sentiment approaches. We will introduce the technologies and practices for advanced conversation analytics, and show how understanding tone and context for commentary provides a far more accurate analytical frame for decision-making around social media.

Speaker: Hadley Reynolds, Principal Analyst, Next Era Research



10:15-11:00am

Expert/Practitioners level
Track 2: Text Analytics: Classification
Classifying content using context-based language

MCT’s ontology-based content-categorization system searches unstructured text and returns results based on context-rich evidence found in the article. The system determines the aboutness of an article instead of searching for single mentions of search terms. Our ontology can define an article in multiple ways. We add metadata for topics, companies, people, and geography and can return all of it to users. It leverages the meaning of words in context to distinguish between closely related topics and concepts using everyday language. Automatic categorization is accurate 80 percent of the time. I’ll talk about how we’re building the ontology and the best practices we have learned.

Speaker: Evelyn Kent, MCT SmartContent Product Manager, MCT SmartContent



11:15-12:00pm

All Levels
Track 1: Social Media and Beyond
Social Media Research Integration is the New Norm

Since its advent as a significant platform for consumer activity, social media has been lauded as either the saving grace or the Achilles heel of many organizations. While the merits of the many social media applications will continue to be debated, it’s a simple fact that the social channel has become a mainstay in Voice of the Customer (VOC) programs. This session will go beyond identifying the importance of incorporating social intelligence to explain the processes, tools, and applications for its use in an organization. Three recent case studies will further demonstrate these points.

Speaker: Jessica Hogan, Senior Manager, Consumer Insights & Strategy, J.D. Power and Associates



11:15-12:00pm

Expert/Practitioners level
Track 2: Text Analytics: Classification
Case Study – Text Analytics in its Automated Classification Process

Heather Edwards will present a case study on AP’s use of text analytics in its automated classification process. This session will provide an overview of the AP’s classification systems including vocabulary development processes, use of semantic rules for classification, the classification testing process and QA environment, and the ways users interact with classification. Heather will focus on challenges specific to classification of news content, in contrast to enterprise content.

Speaker: Heather Edwards, Taxonomy Consultant, AP



12:00-1:00pm

Lunch



1:00-1:45pm

All Levels
Track 1: Voice of the Customer
Next Generation Customer Surveys with Text Analytics

Spencer Morris will lead this session where you will learn

  • to use text analytics and open-ended comments to explain the “why behind the what” and fill in the gaps in survey design
  • how world-class service organizations use text analytics to supercharge their feedback programs

You’ll also find out how

  • text analytics is a disruptive technology for customer feedback programs
  • to leverage the right technology to take action at both the customer level and the brand level

Speaker: Spencer Morris, Director, Text Analytics, Mindshare



1:00-1:45pm

Expert/Practitioners level
Track 2: Text Analytics Issues: Languages
Crossing the Language Chasm: Extracting Information from Foreign-Language Text

A global marketplace and rapidly growing Big Data resources have created new expectations for text analytics. Identifying the subject matter of text isn’t enough; the pressure is on to identify sentiment, to find significance in tiny snips of text, to mine massive quantities of text in real time, and to *analyze text written in languages that the analyst cannot read.*

Speaker: Meta Brown, Independent Consultant




1:45-2:00pm

Break



2:00-2:45pm

All Levels
Track 1: Voice of the Customer
Beyond the Basic Approach: Implementing Text Analytics Solutions for Enterprise Voice of the Customer Programs

This session covers rule-based and statistical techniques used by Maritz Research to implement text analytics. Topics include categorizing and toning comments, including real world examples from enterprise customers.

Speaker: Brion Scheidel, Director, Text Analytics, Maritz Research

Speaker: Kurt Pflughoeft, Director, Marketing Science, Maritz Research



2:00-2:45pm

Expert/Practitioners level
Track 2: Text Analytics Issues: Languages
Using Natural Language Processing to Take the Pulse of the Spanish Telecoms Market

As social media has grown over the last few years, there has been increasing interest in using text analytics to tap into this information for marketing strategies or customer service improvement. Text analytics, when based on language technologies featuring syntactic and semantic processing, can go beyond entity and sentiment polarity extraction. We will present a case study focusing on event recognition in Spanish tweets in the domain of the telecoms market. We will show how language technology has a significant role to play in the comparison of brands and in the detection of users’ buying intents and their wish lists of ideal features or services, in topic classification, problem detection, and even the geolocation of service failures.

Speaker: Antonio Valderrabanos, CEO, BiText



2:45-3:15pm

Networking / Exhibit Hall



3:15-4:00pm

All Levels
Track 1: Social Media: Semantics & Themes
Automated Social Media Theme Extraction

This session will examine how text analytics can be used to answer specific questions about topics of conversations in social media. Many organizations increasingly rely on social media content for information that will guide decisions and operations. However, they often use keyword searches to attempt to filter and analyze the massive amounts of unstructured text data generated by social media every day. Relying on keywords frequently misses crucial pieces of data. Using analytics capabilities, we demonstrate how to quickly extract key topics from large volumes of unstructured text, group documents based on theme similarity, and identify useful and relevant information.

Speaker: Scott Oliveira, Software Developer, Booz Allen Hamilton



3:15-4:00pm

Expert/Practitioners level
Track 2: Text Analytics and Taxonomy
Taxonomy Terminology Traceability and Text Analytics

Step 1 in taxonomy building is the content audit to gather appropriate terms. Traditionally, the taxonomist would manually review content titles: browsing the terminology via the single most meaningful summary of the content; creating mental maps of the field; and getting a general “ear” for the language used. What if, instead, you gave the taxonomist a list of extracted terms with word counts and said, “Start with this”? Listen in for lessons learned across multiple projects using text analytics to move taxonomy building forward in the era of big data and the semantic web.

Speaker: Rena Morse, Director of Semantics, Silverchair Information Systems

Speaker: Edee Edwards, Taxonomy Development Manager, Silverchair Information Systems



4:00-4:15pm

Break



4:15-5:00pm

All Levels
Track 1: Social Media: Semantics & Themes
Semantics and the Social Media

Big Data and Semantics are all the buzz, and we know that combining the two will provide great value, but do we understand how to best take advantage of the combination?

This session will take the ambiguity out of semantics, provide a demonstration of a real semantic engine and offer examples of how other organizations have deployed this advanced technology to address and leverage their big data problems.

Speaker: Bryan Bell, Vice President, Enterprise Solutions, Expert System



4:15-5:00pm

Expert/Practitioners level
Track 2: Text Analytics and Taxonomy
Taxonomies for Text Analytics and Auto-Indexing

Auto-indexing may or may not make use of text analytics and may or may not make use of taxonomies, but taxonomies can greatly support and enhance the indexing results. This primer session gives an overview of the different auto-indexing technologies of information extraction and auto-categorization, and explains the different ways that taxonomies can support auto-indexing, whether the taxonomies are displayed to the user or not.

Speaker: Heather Hedden, Taxonomy Consultant, Hedden Information Management


4:15-5:00pm

Expert/Practitioners level
Track 2: Text Analytics and Taxonomy
How taxonomies and facets bring end-users closer to big data

Taxonomies are effective when they are tailored for the particular data-set, are up to date, and make sense to users. As data grows big and evolves fast we need to explore new sustainable ways for generating and maintaining taxonomies, and we need to understand how to do this well. In this talk I will give an overview of traditional, manual, expert-defined taxonomies, user generated taxonomies (or folksonomies), and automatically generated taxonomies. I will then present applications of taxonomies that work and do not work based on a number of user studies.

Speaker: Anna Divoli, Senior Software Researcher, Pingar



5:05-5:50pm

Panel

Speaker: Tom Anderson



5:50-6:00pm

Close / Wrap Up


Post-Conference Workshop: Friday, October 5, 2012

Making Text Mining Work: Practical Methods and Solutions


Instructor: Dr. Andrew Fast, Director of Research