Sunday, October 16, 2011

 
Half-day Workshop
Room: Madison

R Bootcamp

  • Workshop starts at 1:00pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 5:00pm

Instructor: Max Kuhn, Director, Nonclinical Statistics, Pfizer



 

Monday, October 17, 2011

 
Full-day Workshop
Room: Madison

R for Predictive Modeling: A Hands-On Introduction

  • Workshop starts at 9:00am
  • Morning Coffee Break at 10:30am – 11:00am
  • Lunch provided at 12:30 – 1:15pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 4:30pm

Instructor: Max Kuhn, Director, Nonclinical Statistics, Pfizer



 
Full-day Workshop
Room: Clinton

Predictive Analytics: Fundamentals and Use Cases

  • Workshop starts at 9:00am
  • Morning Coffee Break at 10:30am – 11:00am
  • Lunch provided at 12:30 – 1:15pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 4:30pm

Instructors: Piyanka Jain, CEO, Aryng.com, & Puneet Sharma, Senior Manager, PayPal




 

Tuesday, October 18, 2011

 
Full-day Workshop
Room: Madison

Driving Enterprise Decisions with Business Analytics

  • Workshop starts at 9:00am
  • Morning Coffee Break at 10:30am – 11:00am
  • Lunch provided at 12:30 – 1:15pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 4:30pm

Instructor: James Taylor, CEO, Decision Management Solutions



 
Full-day Workshop
Room: Clinton

Angoss

Hands-On Business Analytics: Insights to Impact

  • Workshop starts at 9:00am
  • Morning Coffee Break at 10:30am – 11:00am
  • Lunch provided at 12:30 – 1:15pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 4:30pm

Instructors: Piyanka Jain, CEO, Aryng.com, & Puneet Sharma, Senior Manager, PayPal



 
Full-day Workshop
Room: Trianon

Hands-On Predictive Analytics with SAS Enterprise Miner

  • Workshop starts at 9:00am
  • Morning Coffee Break at 10:30am – 11:00am
  • Lunch provided at 12:30 – 1:15pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 4:30pm

Instructor: Dean Abbott, President, Abbott Analytics




 

DAY 1: Wednesday, October 19, 2011
TAW Master of Ceremonies: Andrew Fast, Elder Research

 
10:00am-7:30pm

Exhibit Hall Open


 
7:30am-9:00am

Registration & Breakfast


 

9:00am-9:45am
Room: Regent

Keynote
Decision Support, Risk Profiling, and more
Multiple Case Studies: U.S. DoD, U.S. DHS, SSA
Text Mining: Lessons Learned

Text Mining is the “Wild West” of data mining and predictive analytics – the potential for gain is huge, the capability claims are often tall tales, and the “land rush” for leadership is very much a race.

In solving unstructured (text) analysis challenges, we found that principles from inductive modeling – learning relationships from labeled cases – have great power to enhance text mining. Dr. Elder will highlight key technical breakthroughs discovered while working on projects for leading government agencies, including:

  • Prioritizing searches for the Dept. of Homeland Security
  • Quick decisions for Social Security Admin. disability
  • Document discovery for the Dept. of Defense
  • Disease discovery for the Dept. of Homeland Security
  • Risk profiling for the Dept. of Defense

Dr. Elder will summarize, from these (and commercial) deployment experiences, the factors essential to a successful text mining project.

Speaker: John Elder, CEO & Founder, Elder Research, Inc.



 

9:45am-10:05am
Room: Regent

IBM

Platinum Sponsor Presentation
Making Chatter Matter: Monetizing Social Media through Analytics


Listening in on customers’ social media interactions is essential today. But it’s not enough. Gaining business value from those interactions requires separating the insight from the noise. The key to doing so is analytics. Organizations that treat social media as a silo instead of an integrated communication channel are missing an opportunity to use social media to deliver business value. Using advanced analytics, companies can track social interactions and consumer sentiment and combine that information with data from other sources. They can then use it within existing processes to improve go-to-market and customer engagement strategies, as well as predict customer behavior. It’s time to get past the “gee whiz” of social media and make money through social channels by applying advanced analytics to predict customer behavior. Attend this session and learn how to:

  • Move from social as a shiny object to social as an integrated element of a firm’s customer interaction strategy
  • Embed social media analytics into existing processes, as well as apply predictive analytics to acquire deeper insight into potential customer behavior
  • Improve go-to-market and customer engagement strategies based on insight uncovered by integrating social media information with data from other channels
  • Cost-effectively engage customers from awareness through loyalty
  • Determine which tools, processes, and staff to harness to create an aggregate view of the customer and make sense of social buzz beyond just being interesting chatter.

Speaker: Colin Shearer, WW Industry Solutions Leader, IBM



 

10:05am-10:10am
Room: Regent

SAS

Gold Sponsor Presentation
Groundtruthing Your Business with Text Analytics

In the uncharted territory of the unstructured, it can be difficult to convey the practical value that text analytics provides, especially if you are considering technologies for the first time. This brief talk will introduce a way to frame the benefits of this exciting field, providing examples that helped guide other organizations to find their way in the text frontier.

Speaker: Fiona McNeill, Global Product Marketing Manager of Text Analytics, SAS



 

Sybase

10:10am-10:15am
Room: Regent

Gold Sponsor Presentation
Unstructured Data Analytics in Sybase IQ

Power companies today need to leverage unstructured text data alongside regular data to gain significant insights into their business, identify emerging trends, and proactively respond to opportunities or potential risks. Sybase IQ is a market-leading analytics server enabling organizations to perform deep analysis of massive amounts of data, accessed by hundreds of users requiring answers in real-time. Sybase IQ provides the means to store and retrieve unstructured data objects as part of the same repository as transactional or analytical data. Unstructured data may include images, maps, documents, audio, video, and XML files. Sybase IQ can manage individual unstructured data objects containing terabytes or even petabytes of data, as needed. By bringing relational and unstructured data together into a single location, Sybase IQ enables an organization to access both types of data using the same application, and the same interface.

Speaker: David Wiseman, Director of Business Development, Sybase, An SAP Company



 
10:15am-10:45am

Break / Exhibits


 
10:45am-11:05am
Room: Regent

Social Media; Knowledge Discovery
Case Study: Socialmediatoday.com
Mining for Social Media Buzzwords

This session will show how Posts in Blogs and News Sites along with the number of retweets and Facebook ‘likes’ associated with each post provide us with a metric on how interesting a post was. Using Text Analytics, thousands of Blog posts, along with their author and post title are used to identify appealing and promising subjects and concepts in the social media space.

Speaker: Themos Kalafatis, Independent Consultant



 
11:10am-11:30am
Room: Regent

Knowledge Discovery
Case Study: Bundle.com
New Insights from “Big Legacy Data”: The Role of Text Analytics at Bundle.com

For decades, credit card transactions have generated mountains of data about consumer spending habits, but the data formats were designed for archiving and reporting rather than for data mining and pattern discovery. In particular, the merchant’s name is embedded in a text field, which also contains other information, without any standard format.

Bundle.com is a new startup that is building a business on the extraction of value from this legacy data source. This case study will show how text analytics are being used to robustly identify merchants in the dataset as a first crucial step in the extraction of powerful insights about consumer spending behavior.

Speakers: Alexander Hasha, Lead Data Scientist, Bundle.com & Jaime Fitzgerald, Founder & President, Fitzgerald Analytics



 

IBM

11:35am-11:55am
Room: Regent

Lab Session: Live Topical Demo
Bank On It! Use Publicly Available Social Media Data to Strengthen Customer Relationships and Grow Your Business

Attend this session and learn how management consultants at Beyond the Arc are helping financial institutions extend customer experience efforts by providing actionable insights from consumer generated, social media data.

Their secret? Integrating and then analyzing publicly available social media data – including unstructured text data. Using IBM SPSS text analysis solutions, Beyond the Arc now has the power to unlock the value of data sources underutilized by the bank – such as Facebook comments, Twitter messages and even location-based data such as Foursquare comments. Discover how text analytics enables you to use diverse social media feedback channels to attract and retain customers, identify fast-moving emerging issues, and build community amongst your customers.

Attend this session and learn how to:

  • Benchmark your social media efforts
  • Acquire new customers from your social media programs
  • Build community amongst your clients that leads to increased engagement and stronger customer relationships
  • Understand how to identify fast-moving emerging issues

Speaker: Steven Ramirez, CEO, Beyond the Arc



 
12:00pm-12:20pm
Room: Regent

Knowledge Discovery
Case Study: Snap-on (tool mfg) & Mitchell1 (auto repair s/w)
Creating a $1 Million Data Source from Free Text

With information coming in from 9,200+ sources, represented by over 500 million transaction records of free text, understanding the repair history of a vehicle was not easy. Using text analytics to match repair and transaction orders we were able to create a robust data source AND a standardized component list of over 27,000 parts that make up every vehicle. In this session we will discuss the steps performed to transform over 500 million records into the most complete set of auto repair data in the nation.

Speaker: Michael Pooley, President & General Manager, Mitchell1



 
12:20pm-12:35pm
Room: Regent

Lightning Round of 2-minute Vendor Presentations



 
12:35pm-1:35pm

Birds of a Feather Lunch / Exhibits


 

Track sponsored by:
Sybase

1:35pm-1:55pm
Room: Regent

Social Media and Event/Trend Analysis
Predicting Real-World Occurrences via Social Web Analysis

Speaker: Rishab Aiyer Ghosh, Co-Founder & Vice President of Research, Topsy Labs



 
2:00pm-2:20pm
Room: Regent

Survey Analysis & Student Attrition
Case Study: Florida State College at Jacksonville
Who Left and Why? Using Text Mining to Better Understand College Course Withdrawals

With the steady rise and continued growth of predictive analytics and data mining in the field of education most emphasis has been on classical mining techniques and procedures involving structured sources. Yet credible estimates indicate that well over half of the information in most organizations is stored in various unstructured forms especially those involving text-based sources and documents. This case study describes a Natural Language Processing (NLP) text mining analysis of student comments explaining rationales and reasons for college course withdrawal. The textual data used in the study were drawn from open-ended, verbatim, student comments at Florida State College at Jacksonville, a large, diverse, multi-campus institution with an annual (2009-2010) student enrollment of over 84,000. A practical overview of the methods used as well as implications for practice are presented and discussed.

Speaker: Greg Michalski, Director of Student Analytics & Research, Florida State College at Jacksonville



 
2:35pm-3:10pm
Room: Grand Ballroom

Keynote
Building Watson – An Overview of the DeepQA Project

Computer systems that can directly and accurately answer people’s questions over a broad domain of human knowledge have been envisioned by scientists and writers since the advent of computers themselves. Open domain question answering holds tremendous promise for facilitating informed decision making over vast volumes of natural language content. Applications in business intelligence, healthcare, customer support, enterprise knowledge management, social computing, science and government could all benefit from computer systems capable of deeper language understanding. The DeepQA project is aimed at exploring how advancing and integrating Natural Language Processing (NLP), Information Retrieval (IR), Machine Learning (ML), Knowledge Representation and Reasoning (KR&R) and massively parallel computation can greatly advance the science and application of automatic Question Answering. An exciting proof-point in this challenge was developing a computer system that could successfully compete against top human players at the Jeopardy! quiz show (www.jeopardy.com).

Attaining champion-level performance at Jeopardy! requires a computer to rapidly and accurately answer rich open-domain questions, and to predict its own performance on any given question. The system must deliver high degrees of precision and confidence over a very broad range of knowledge and natural language content with a 3-second response time. To do this, the DeepQA team advanced a broad array of NLP techniques to find, generate, evidence and analyze many competing hypotheses over large volumes of natural language content to build Watson (www.ibmwatson.com). An important contributor to Watson’s success is its ability to automatically learn and combine accurate confidences across a wide array of algorithms and over different dimensions of evidence. Watson produced accurate confidences to know when to “buzz in” against its competitors and how much to bet. High precision and accurate confidence computations are critical for real business settings where helping users focus on the right content sooner and with greater confidence can make all the difference. The need for speed and high precision demands a massively parallel computing platform capable of generating, evaluating and combining thousands of hypotheses and their associated evidence. This talk will introduce the audience to the Jeopardy! Challenge, explain how Watson was built on DeepQA to ultimately defeat the two most celebrated human Jeopardy! champions of all time, and discuss applications of the Watson technology beyond Jeopardy!, in areas such as healthcare.

Speaker: David Gondek, Ph.D., IBM Technical Lead, Watson Knowledge Capture and Learning and Healthcare Adaptation, IBM Research



 
3:10pm-3:30pm
Room: Grand Ballroom

IBM

Diamond Sponsor Presentation
The Analytical Revolution

The algorithms at the heart of predictive analytics have been around for years – in some cases for decades. But now, as we see predictive analytics move to the mainstream and become a competitive necessity for organizations in all industries, the most crucial challenges are to ensure that results can be delivered to where they can make a direct impact on outcomes and business performance, and that the application of analytics can be scaled to the most demanding enterprise requirements.

This session will look at the obstacles to successfully applying analysis at the enterprise level, and how today’s approaches and technologies can enable the true “industrialization” of predictive analytics.

Speaker: Colin Shearer, WW Industry Solutions Leader, IBM



 
3:30pm-3:40pm
Room: Grand Ballroom

Industry Trends: 2011 Data Miner Survey Results: Highlights

Do you want to know the views, actions, and opinions of the data mining community? Each year, Rexer Analytics conducts a global survey of data miners to find out. This year at TAW we unveil the results of our 5th Annual Data Miner Survey.

This session will present the research highlights, such as:

  • Demand for data mining
  • Open-source data mining software: usage trends
  • Data visualization
  • Text mining trends
  • Measurement of analytic project performance/success

The full Summary Report will then be immediately available online to all TAW attendees.

Speaker: Karl Rexer, Ph.D., President, Rexer Analytics



 
3:40pm-4:15pm

Break / Exhibits


 
4:15pm-4:35pm
Room: Regent

Customer Support
Case Study: A Fortune 500 global technology company
Rules Rule: Inductive Business-Rule Discovery in Text Mining

Text mining remains at the leading edge (rather than the mainstream) of analytics for the corporate world, largely because of the complexities associated with how language is used. Words and phrases in a corporate lexicon can be used ambiguously, inconsistently, and incorrectly, making it difficult but not impossible for a human to understand. However, for predictive analytics, these ambiguities must be overcome so that algorithms can be applied consistently to historic data.

Call center data is no exception to these problems. A Fortune 500 global technology company applied text mining to their help desk calls related to the repair of supported devices. Complexities included the usual text ambiguities and spelling errors, but also included variable English terms and abbreviations as used in foreign countries. By incorporating a combination of manual text extraction by domain experts with automated machine learning using decision trees, an operational system of business rules was developed that exceeds specifications for profitable identification of parts needed for repairs.

Speaker: Dean Abbott, President, Abbott Analytics



 

Track sponsored by:
Sybase

4:40pm-5:00pm
Room: Regent

Thought Leader
It’s 8:00am: Do You Know Where Your Data Is?

This presentation will examine the use of text analysis in enabling organizations to gain a holistic view of their business and maintain compliance through the use of information management tools. It will show examples of how organizations have accurately pinpointed data entering and leaving the organization, built information governance frameworks and blended structured and unstructured data to provide actionable intelligence across the business.

Speaker: Nick Patience, Research Director, Information Management, The 451 Group



 
5:00pm-5:20pm

Break / Exhibits


 
5:25pm-5:45pm
Room: Regent

Survey Analysis & Churn Risk Detection
Case Study: PayPal
Identifying Customers Who Expressed Intend-to-Churn or Defect from Large Number of Surveyed Verbatim

How can customers’ intent to churn or defect be detected without having to read a large volume of verbatim customer feedback? In this case study, we’ll show you how we use a combination of human-classified verbatims (labeled at-risk or not-at-risk) and query-based text search to assemble a set of ‘at-risk’ verbatims. We then use it as a training set for a supervised learning algorithm that classifies a large set of customer verbatims as at-risk or not-at-risk, so that actions can be taken to prevent churn and learn from the feedback.
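As a rough illustration of the workflow described in this abstract, the sketch below combines a few hand-labeled comments with comments found by phrase search, then trains a small classifier on the result. The abstract does not name the learning algorithm, so multinomial Naive Bayes is an assumption here, and the sample comments, query phrases, and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def query_label(comments, queries):
    """Query-based search: keep comments that contain any query phrase."""
    return [c for c in comments if any(q in c.lower() for q in queries)]


def train_naive_bayes(labeled):
    """Count words per label from a list of (text, label) pairs."""
    word_counts, doc_counts, vocab = {}, Counter(), set()
    for text, label in labeled:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical data: a few human-labeled comments plus unlabeled ones.
human_labeled = [
    ("I am closing my account because of the fees", "at-risk"),
    ("Thinking about switching to another service", "at-risk"),
    ("Great service, very easy to use", "not-at-risk"),
    ("Payment went through quickly, thanks", "not-at-risk"),
]
unlabeled = [
    "I will cancel my account if this happens again",
    "Love how easy it is to send money",
]
queries = ["cancel my account", "closing my account"]

# Comments matched by the search are added to the at-risk training set.
found = query_label(unlabeled, queries)
training = human_labeled + [(c, "at-risk") for c in found]
model = train_naive_bayes(training)
```

Calling `predict(model, some_comment)` then returns either `"at-risk"` or `"not-at-risk"`. A production system would need far more training data and a held-out evaluation set.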

Speaker: Han Sheong Lai, Director of Operational Excellence & VOC, PayPal



 

Track sponsored by:
Sybase

5:50pm-6:10pm

Room: Regent
Financial Indicators from Social Media
Social Media Analysis for Market Prediction: Collective Mood States and the Wisdom of Crowds

Hundreds of millions of individuals are now connected to online social networking services which are becoming an increasingly important medium for the exchange of personal as well as public information. In fact, more than 150 million tweets, each consisting of a 140-character update by an individual user, are posted on Twitter on a daily basis. Facebook now claims more than 500 million users worldwide who generate personal status updates and other online content by the millions every day. The streams of user-generated information, referred to as “social media feeds”, may contain valuable, real-time information on the public’s opinions, activities, and mood states. In fact, advances in natural language processing and sentiment tracking algorithms now enable us to leverage computational approaches to efficiently mine the wealth of information in these social media feeds.

In this session I will provide an introduction to the basic principles of online social networking environments and the resulting social media feeds that are generated by their millions of users. Subsequently I will provide an overview of existing text analysis approaches that have been used to extract indicators of social opinion and sentiment from these feeds. A number of recent results demonstrate the value of such analytics to gauge, among many other things, “national happiness” and consumer sentiment towards particular brands and products. In some cases it has even been demonstrated that social media feeds may contain predictive information with regard to a variety of socio-cultural indicators, such as box office receipts, product adoption and even the stock markets. In the latter half of my presentation I will outline our own research on the subject of stock market prediction. My team and I have analyzed large-scale Twitter data to yield accurate measurements of the public’s mood state, which in turn have been shown to contain predictive information with regard to the Dow Jones Industrial Average. In addition we have performed an analysis of longitudinal changes in individual user sentiment over hundreds of thousands of Twitter users to study the effects of social networking relations on evolving user mood states.

Speaker: Johan Bollen, Associate Professor, School of Informatics and Computing, Center for Complex Networks and System Research, Indiana University



 
6:10pm-7:30pm

Reception / Exhibits


Sponsored by:
JMP

 
7:30pm-10:00pm
Room: Gramercy

Local Group Meeting
NYC Predictive Analytics

Bayesians, Frequentists, and Big Data: Musings on Statistics in the 21st Century

This talk will touch upon topics in data analysis, statistics, and computing relating to modern massive data challenges. How do classical theories in statistical inference and asymptotics translate into statistical practice in the modern world? What role should complex Bayesian procedures and other cutting-edge methodologies have in the data analyst toolkit? Computationally, how can we manage the data deluge and how is statistical software evolving? What are the implications for the data analyst? What are the dangers posed by addressing these very questions? I’ll suggest possible answers to some of these questions, and hope to spur further debate by posing others.

NYC Predictive Analytics is a non-profit professional group that meets monthly to discuss diverse topics in predictive analytics and applied machine learning. We are a group 1000+ members strong comprised of analysts, computer scientists, engineers, executives, entrepreneurs and students with a deep interest in these fields & related technologies.


Speaker: John W. Emerson, Associate Professor, Department of Statistics, Yale University



 

DAY 2: Thursday, October 20, 2011
TAW Master of Ceremonies: Andrew Fast, Elder Research

 
10:00am-4:30pm

Exhibit Hall Open


 
8:00am-9:00am

Registration & Breakfast


 
9:00am-9:55am
Room: Regent

Expert Panel:
Text Analytics Hits the Mainstream

Text analytics has taken off, across industry sectors and across application areas that include decision support, sentiment analysis, fraud detection, survey analysis and beyond. Where exactly are we in the process of crossing the chasm toward pervasive deployment, and how can we ensure progress keeps up the pace and stays on target?

This panel of leading experts will address:

  • How much of text analytics’ potential has been fully realized?
  • Where are the outstanding opportunities with greatest potential?
  • What are the greatest challenges faced by the industry in achieving wide-scale adoption?
  • How are these challenges best overcome?

Speakers: John Elder, CEO & Founder, Elder Research, Inc., Tim Daciuk, Business Development Manager, Advanced Analytics, IBM & Richard Foley, Worldwide Product Marketing Manager & Strategist, SAS Text Analyst



 
10:00am-10:20am
Room: Regent

Advanced Techniques
Speaker from: Google
Rich Prior Knowledge in Learning for Text Analysis

We possess a wealth of prior knowledge about most prediction problems, and particularly so for many of the fundamental tasks in text analysis. Unfortunately, it is often difficult to make use of this type of information during learning.

This session describes how to encode side information about output variables, and how to leverage this encoding and an unannotated corpus during learning. We describe research examples in applying machine learning with side information to several problems. Prior knowledge used in these applications ranges from structural information that cannot be efficiently encoded in the model, to knowledge about the approximate expectations of some features, to knowledge of some incomplete and noisy labelings.

Speaker: Kuzman Ganchev, Research Scientist, Google



 
10:25am-10:55am

Break / Exhibits


 
10:55am-11:15am
Room: Regent

Insurance
Case Study: Accident Fund
Using Text Analytics to Accurately Segment Workers’ Compensation Injuries

Workers’ compensation insurers can use predictive models to identify high-cost injuries early on for triage and cost prevention. At Accident Fund, we were interested in the insured’s secondary conditions (i.e., co-morbidities) such as obesity and diabetes as model variables because these factors can significantly influence overall outcome. Co-morbidities are traditionally captured in adjuster notes; thus we applied text analytics to generate co-morbidity indicators. In this case study we will present lessons learned from handling challenges presented by freeform adjuster notes, which include:

  • Developing synonyms
  • Handling negation logic
  • Handling large data quantities
  • Validating the performance of the algorithm
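To make the synonym and negation challenges listed above concrete, here is a minimal sketch of a rule-based co-morbidity flagger. It is not the method used at Accident Fund, which the abstract does not detail; the synonym lists, negation cues, and the three-token look-back window are illustrative assumptions, and the sample notes are invented.

```python
import re

# Hypothetical synonym lists; a real project would build these with domain experts.
SYNONYMS = {
    "diabetes": ["diabetes", "diabetic"],
    "obesity": ["obesity", "obese"],
}
# Map every synonym back to its condition for quick lookup.
TERM_TO_CONDITION = {
    term: condition for condition, terms in SYNONYMS.items() for term in terms
}
# Assumed negation cues and look-back window (in tokens).
NEGATION_CUES = {"no", "not", "denies", "denied", "without", "negative"}
NEGATION_WINDOW = 3


def comorbidity_flags(note):
    """Return {condition: bool}, skipping mentions preceded by a negation cue."""
    tokens = re.findall(r"[a-z0-9]+", note.lower())
    flags = {condition: False for condition in SYNONYMS}
    for i, tok in enumerate(tokens):
        condition = TERM_TO_CONDITION.get(tok)
        if condition is None:
            continue
        window = tokens[max(0, i - NEGATION_WINDOW):i]
        if not NEGATION_CUES.intersection(window):
            flags[condition] = True
    return flags


positive_note = comorbidity_flags("Claimant is diabetic and obese.")
negated_note = comorbidity_flags("Pt denies diabetes; no history of obesity.")
```

A fixed token window is a crude stand-in for real negation logic: it ignores sentence boundaries and cues that follow the term, which is one reason validating against hand-reviewed notes matters.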

Speaker: Zubair Shams, Director of Product Development, Accident Fund Insurance Company of America



 
11:20am-11:40am
Room: Regent

Crowdsourcing Text Analytics
Case Study: Wikipedia
The Big Prize: Using Competitions for Advanced Text Analytics

Predictive modeling competitions represent a powerful leap forward in the accuracy of predictive analytics and text mining. As companies and researchers are fast discovering, competitions are advancing the state of the art in a wide range of fields. This session will focus not simply on the mechanics of data prediction competitions, but on why they work so effectively. The talk will reference a competition for Wikipedia, which includes a text analytics component.

Speaker: Anthony Goldbloom, Chief Executive Officer, Kaggle



 
11:45am-12:05pm
Room: Regent
SAS
Lab Session: Live Topical Demo
Quantifying Impressions with SAS Text Analytics

Join us for this lab session to see first-hand how Humana has used SAS text mining to improve call center classification and customer web content. Greg Hayworth, Data Analytics Scientist with Humana, will demonstrate how Humana has been successfully using text mining to address the common problem of agent selection defaults, and will describe how the analysis results were applied to:

  • Reduce their number of call center inquiries, thereby reducing costs;
  • Improve web content.

Speaker: Greg Hayworth, Data Analytics Scientist, Humana Inc.



 
12:10pm-12:30pm
Room: Regent

Sentiment Analysis
Beyond Sentiment: Predicting Review Helpfulness by Automatic Classification of Competence, Integrity and Benevolence

Sentiment analysis only goes so far – the value of customer reviews and feedback comes also from the integrity of and competence behind them. This session covers a method to build classifiers that could reliably estimate benevolence, integrity and competence over 50,000 Yahoo! reviews. This session details how the 3 categories were transformed into logistic regression classifiers using industry best practices involving:

  • Creation of an annotation standard
  • Conducting inter-annotator agreement studies
  • Active learning driven creation of training data for logistic regression classifiers
  • Final evaluation of classifiers with held out data which served as an accuracy estimate for the research effort.
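Inter-annotator agreement studies like the one mentioned in the list above are often summarized with a chance-corrected statistic. The abstract does not say which measure the presenters used; the sketch below assumes Cohen's kappa for two annotators, and the label lists are made up for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labeled at random
    # according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations: 1 = review shows "integrity", 0 = it does not.
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 3 of 4 items (0.75) while chance agreement is 0.5, so kappa is 0.5. Low values are typically a signal to refine the annotation standard before collecting more training data.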

This difficult problem was approached from both researcher and technology provider perspectives, as reflected by the co-presenter. Attendees will understand the basics of creation of machine learning systems and the importance of incremental refinement of annotation standards.

Speakers: Breck Baldwin, President, LingPipe & Dezhi Yin, Ph.D. Candidate, Georgia Tech Business School



 
12:30pm-1:30pm

Birds of a Feather Lunch / Exhibits


 
1:30pm-2:15pm
Room: Grand Ballroom

Keynote
Everyday Analytics: Making Leading Edge Pervasive

As hype increases about the value and quantities of data, the world is beginning to understand the potential of analytics. One doesn’t need to be a statistician to get excited – these conversations are no longer just about algorithms. Consumers are directly impacted in their daily lives, as even their running shoes can give them analytical feedback on their workout. In this session, Thomas H. Davenport of the International Institute for Analytics will discuss everyday innovations in analytics and how, as we learn to harness insight from data, our world may change.

Speaker: Thomas Davenport, President’s Distinguished Professor, Babson College; Author, Competing on Analytics: The New Science of Winning



 
2:15pm-3:00pm

Break / Exhibits


 
3:00pm-3:20pm
Room: Regent

Customer Support
Case Study: Citibank
Analyzing and Scoring Customer Problem Descriptions through Multi-focal Learning

We present a case study on the analysis, development and deployment of automatic problem categorization based on the text descriptions of problems provided by customers. For the same type of problem encountered in a customer service center, the description can differ depending on whether it comes from an experienced or an inexperienced customer. It is necessary to identify focal groups through a learning algorithm, because experienced customers provide precise descriptions while inexperienced customers provide diverse descriptions. We validate the newly developed multi-focal learning algorithm by demonstrating a significant boost in performance compared to existing classification algorithms for customer service classifications.

Speakers: Ramendra Sahoo, Senior Vice President of Risk Technology, Citibank & Hui Xiong, Associate Professor, Rutgers Business School



 
3:25pm-3:45pm
Room: Regent

Customer Support & Sentiment Analysis
Case Study: Amdocs
Customer Support and Product Sentiment for a Leading Telecom

This case study presents the results of a recent project for a Telecom service provider to develop a number of capabilities: what worked well and what did not. We will cover how one software solution was selected over others through an extended Proof of Concept that also formed the basis of two applications. The first application analyzed customer support reps' hastily written notes to determine the customer's motivation for the call, what special problems had been dealt with, and what steps were required to resolve the issue. Our derived solution employed categorization and entity extraction methods. The second application mined social media both for sentiment about the range of products and services that the Telecom offered and for potential problems and potential solutions to those problems.

Speaker: Tom Reamy, Chief Knowledge Architect, KAPS Group



 
3:45pm-4:30pm

Break / Exhibits


 
4:30pm-4:50pm
Room: Regent

Parallelized Analysis & Topic Discovery
Case Study: Intuit
Babies on Elephants: Hadoop and Text Analytics

Several million customers use Turbo Tax Online every season. When they have a particular information need during the experience, they search Turbo Tax knowledge base articles or posts from a community of Turbo Tax users, or they arrive at Turbo Tax landing pages from queries on Web search engines. Millions of such queries are collected every season. In this case study we describe how we harnessed these unstructured text queries to discover "emerging" or baby topics, which are customer issues growing more rapidly than the average population of topics, and illustrate the benefits with examples where topics were discovered several weeks before they appeared in other voice of customer channels. In particular, we will describe how we combined open source natural language processing packages with Hadoop to analyze large volumes of queries on a daily basis.

The key takeaways from this presentation are:

  • How voice of customer data can be mined to derive early predictors of issues
  • How automated topic detection can be applied not only to customer support but also to marketing analytics
  • How text analytics can be effectively combined with Hadoop for mining large data sets

Speaker: Saikat Mukherjee, Ph.D., Senior Data Scientist, Intuit Inc.



 
4:55pm-5:15pm
Room: Regent

Thought Leadership
The Seven Different Types of Text Mining and the Five Questions that Reveal the Right Approach

“Text Mining” can still mean different things to different practitioners, as the rapidly growing field encompasses a wide range of technologies and applications. We find there to be seven distinct practice areas, such as information retrieval and document classification, which have very different goals, terminology, and technology. Which is the best to employ for your goals, given your data? We describe a simple decision tree, with five questions, that can lead you to the resources best suited for your task. To illustrate, we “drop the TAW program down the tree” for a visual clustering of the talks from these two days, and reveal where the conference has been strongest.

Speaker: Andrew Fast, Ph.D., Director of Research, Elder Research, Inc.


 


Friday, October 21, 2011

 
Full-day Workshop
Room: Off-Site

IBM

Hands-On Introduction to Text Analytics with IBM SPSS
Click here for the detailed workshop description

  • Workshop starts at 9:00am
  • Morning Coffee Break at 10:30am – 11:00am
  • Lunch provided at 12:30 – 1:15pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 4:30pm

Instructor: Tim Daciuk, Business Development Manager, Advanced Analytics, IBM



 
Full-day Workshop
Room: Nassau B

The Best and the Worst of Predictive Analytics: Predictive Modeling Methods and Common Data Mining Mistakes
Click here for the detailed workshop description

  • Workshop starts at 9:00am
  • First AM Break from 10:00 – 10:15
  • Second AM Break from 11:15 – 11:30
  • Lunch from 12:30 – 1:15pm
  • First PM Break: 2:00 – 2:15
  • Second PM Break: 3:15 – 3:30
  • Workshop ends at 4:30

Instructor: John Elder, CEO & Founder, Elder Research, Inc.



 
Full-day Workshop
Room: Nassau A

TIBCO

Deploying User-Friendly Predictive Analytics: Delivering Results to Business Users with Interactive Applications
Click here for the detailed workshop description

  • Workshop starts at 9:00am
  • Morning Coffee Break at 10:30am – 11:00am
  • Lunch provided at 12:30 – 1:15pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 4:30pm

Instructor: Jeff Mergler, Lead Statistical Applications Trainer, TIBCO Spotfire


