Jonathan Rosen
Lecture Notes - Prof. Narahari
 
 

-Data Mining & Personalization-

 

Table of Contents

Predictive Modeling
Data Mining
Why Data Mine?
CRM (Customer Relationship Management)
CRM Goal

Personalization in E-commerce

Personalization Processes Chain

B2C Objectives

Impact of Personalization

Privacy Issues

Diagram of the Personalization Process

Knowing the Customer

Broadvision.com

How does this all work?

Knowledge Discovery in Databases (KDD)

Data Mining Problems

Data Mining Tasks

 

 

 

Predictive Modeling

·Predictive modeling helps an organization predict what one of its users will do next. Multiple actions by the user are considered in determining the likely outcome.
·Microstrategy (http://www.microstrategy.com) is a company well known for its software suite, which helps companies employ predictive modeling through data mining.
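
As a rough illustration of predicting a user's next action, here is a minimal sketch that counts which action tends to follow which. The session data and the function names are made up for this example, and the model looks only at the most recent action, which is a simplification of the "multiple actions" idea above.

```python
from collections import Counter, defaultdict

# Hypothetical sessions: each is the ordered list of actions one user took
sessions = [
    ["search", "view", "add_to_cart", "checkout"],
    ["search", "view", "view", "leave"],
    ["view", "add_to_cart", "checkout"],
]

def train(sessions):
    """Count how often each action is followed by each next action."""
    transitions = defaultdict(Counter)
    for actions in sessions:
        for current, following in zip(actions, actions[1:]):
            transitions[current][following] += 1
    return transitions

def predict_next(action, transitions):
    """Return the most frequently observed next action, or None if unseen."""
    if action not in transitions:
        return None
    return transitions[action].most_common(1)[0][0]

model = train(sessions)
print(predict_next("view", model))
```

A real predictive model would use far more history and more features per user; the point here is only that past behavior is turned into counts that drive a prediction.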

Data Mining

·Data mining is “the exploration and analysis, by automatic or semiautomatic means, of large quantities of data in order to discover meaningful patterns and rules.”
·Companies use data mining in order to determine what their customers will do next.
·Data mining is not limited to the web.
·Companies like Structure (a men’s clothing vendor) use data mining on their store credit cards to see how much and when their customers purchase. They then use that information to make incentive offers to customers based on their buying patterns.
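
The store credit card example can be sketched as a small aggregation over transactions. The records, field layout, and function name below are assumptions made for illustration, not data or code from any real vendor.

```python
from collections import defaultdict
from datetime import date

# Hypothetical store-card transactions: (customer_id, purchase_date, amount)
transactions = [
    ("c1", date(2000, 1, 5), 40.0),
    ("c1", date(2000, 2, 4), 60.0),
    ("c1", date(2000, 3, 5), 50.0),
    ("c2", date(2000, 1, 20), 200.0),
]

def purchase_patterns(rows):
    """Summarize how much and how often each customer buys."""
    by_customer = defaultdict(list)
    for customer_id, day, amount in rows:
        by_customer[customer_id].append((day, amount))

    summary = {}
    for customer_id, purchases in by_customer.items():
        purchases.sort()  # order by purchase date
        days = [d for d, _ in purchases]
        gaps = [(b - a).days for a, b in zip(days, days[1:])]
        summary[customer_id] = {
            "visits": len(purchases),
            "total_spent": sum(a for _, a in purchases),
            # Average gap is undefined for a single purchase
            "avg_days_between": sum(gaps) / len(gaps) if gaps else None,
        }
    return summary

patterns = purchase_patterns(transactions)
```

A customer who buys roughly every 30 days could, for instance, be sent an offer shortly before the next expected visit.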

Why Data Mine?

·Customer service has become more important to the consumer
·Price differentiation is no longer enough
·Data mining is a current trend in business
·Know more about the consumer, to better serve the consumer
·The resulting customer intelligence can be used as leverage, including in B2B marketplaces

CRM (Customer Relationship Management)

·Getting a new customer today costs more than keeping one
·This is directly related to the high acquisition costs associated with today’s market. One source contends that it costs five times as much to acquire a new customer as to keep an existing one (Peppers & Rogers), although the actual figure varies widely with the market and point in time
·By getting to know a customer better, one can serve them better and make more money
·An example of this might occur when Customer A has been buying grass seed from Company X for 3 years. A good CRM mechanism would determine that Customer A might also be interested in buying other lawn care products such as fertilizer, garden hose, etc.
·A 5% increase in retention reduces overhead by up to 10%
·CRM allows for increasingly sophisticated mass-market offers through better segmentation
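
The grass seed example above can be sketched as a simple lookup from products already bought to related products. The mapping and the function name are hypothetical; a real CRM system would derive such relationships from data instead of a hand-written table.

```python
# Hypothetical mapping from a product to related products
RELATED = {
    "grass seed": ["fertilizer", "garden hose"],
}

def cross_sell(purchase_history, related=RELATED):
    """Suggest related products the customer has not bought yet."""
    owned = set(purchase_history)
    suggestions = []
    for item in purchase_history:
        for candidate in related.get(item, []):
            if candidate not in owned and candidate not in suggestions:
                suggestions.append(candidate)
    return suggestions

# Customer A has bought grass seed repeatedly and already owns a hose
print(cross_sell(["grass seed", "grass seed", "garden hose"]))
```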

CRM Goal

·Define the customer segments
·Carefully address legal and ethical concerns
·Lights-out (fully automated) execution against segments
·Attribution & evaluation of responses

Personalization in E-commerce

·Positive:
·Easier to personalize by gathering aggregate data
·Can literally ‘follow’ a customer around a digital store (which is not practical in a brick-and-mortar store)
·Much easier to experiment with the placement of goods, etc. in a digital store than in a brick-and-mortar store
·Negative:
·Web based shopping is not a proven better way to sell items

·Concerns over how much people purchase higher margin items or ‘touchy feely’ things have surfaced.

·Difficult to differentiate prices against geography

·E.g., how do you tell someone in Kansas that an item costs $X and someone in New York that it costs $Y? The difference may be justified by shipping costs, but the consumer may not see it that way.

Personalization Processes Chain

B2C Objectives

·KNOW THE CUSTOMER
·Gather data during registration
·Use cookies to remember the customer
·Possibly even show the customer a specialized storefront that reflects their interests when they visit the site, based on the cookie
·Find out what the customer wants and deliver
·Use questionnaires

·Collect data, click streams, etc.

·Review histories (visits, orders, what they look at, etc.)

·Amazon.com (http://www.amazon.com) has excelled in the area of following customers around to see what they look at and what else they might be interested in

·Amazon automatically generates a “My Page” which has items listed on it based on your past purchases, searches and items you have viewed

·Deliver:

·Customized promotions

·Customized products (!!)

·E.g., sell someone who likes action movies a special DVD box set designed for them, featuring their favorite action stars
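
One way to picture "collect click streams, then show a specialized storefront" is the sketch below. The event format, the category names, and both function names are assumptions for illustration; it is not a description of how Amazon or any other site works.

```python
from collections import Counter

# Hypothetical click-stream events: (customer_id, category_viewed)
clicks = [
    ("c1", "action movies"),
    ("c1", "action movies"),
    ("c1", "cookbooks"),
    ("c2", "gardening"),
]

def build_profiles(events):
    """Count category views per customer."""
    profiles = {}
    for customer_id, category in events:
        profiles.setdefault(customer_id, Counter())[category] += 1
    return profiles

def storefront_for(customer_id, profiles, default=("bestsellers",), top_n=2):
    """Return the categories to feature; unknown visitors get the default."""
    profile = profiles.get(customer_id)
    if not profile:
        return list(default)
    return [category for category, _ in profile.most_common(top_n)]

profiles = build_profiles(clicks)
print(storefront_for("c1", profiles))
```

The customer ID would typically come from a cookie or a login, and a first-time visitor simply sees the default storefront.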

Impact of Personalization

·Able to learn more about customers
·Data collection happens invisibly, in a way that could not take place in a physical store (especially for aggregate click-stream data)
·Helps to decide how to improve the system

Privacy Issues

·Large numbers of customers are concerned about what data is collected about them and who sees it.
·DoubleClick was an especially prominent example of this issue. You can read more about the DoubleClick scandal at: http://www.thestandard.com/article/0,1902,9480,00.html
·Customers give more info to a trusted site.
·Making one’s site secure and trustworthy is key to collecting accurate data about consumers
·Untrusting consumers may be unwilling to give their personal information to a site, or may even enter false information.
·Sites must have clear privacy statements

·Third-party evaluators such as the Better Business Bureau (BBB) exist to evaluate sites for a fee and will assure consumers of the site’s integrity. (http://www.bbb.com)

·BBB.com is geared more towards small businesses

·TRUSTe.com provides a similar service and has better internet brand recognition, but because of its cost is geared more towards larger businesses. (http://www.truste.com)

Diagram of the Personalization Process

Process:
1.Get to know the customer through the gathering of data
2.Create a profile for the user, extrapolate from past histories

3.Segment the user based on his profile

4.Extrapolate predictions based on the user’s data, as well as the data of peers in the user’s segment

5.Deliver customized content, offers, etc. to the consumer

6.Allow the consumer to log in/access the customized content directly.

·Examples: send users to categories they frequently visit.

·Make sure the user can still easily access the entire site
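
The process steps above can be sketched end to end in a few lines. Everything here (the purchase histories, the product-to-segment table, and the function names) is hypothetical and chosen only to keep the example short.

```python
from collections import Counter

# Step 1: hypothetical purchase histories gathered per user
histories = {
    "ann": ["tent", "sleeping bag"],
    "bob": ["tent", "camp stove"],
    "cat": ["novel", "bookmark"],
}

# Hypothetical, hand-written mapping from product to segment
SEGMENT_OF = {
    "tent": "outdoors", "sleeping bag": "outdoors", "camp stove": "outdoors",
    "novel": "reading", "bookmark": "reading",
}

def segment(history):
    """Steps 2-3: profile the user by counting segments, pick the most common."""
    counts = Counter(SEGMENT_OF[item] for item in history if item in SEGMENT_OF)
    return counts.most_common(1)[0][0] if counts else "general"

def recommend(user, histories):
    """Steps 4-5: suggest items bought by same-segment peers but not by the user."""
    user_segment = segment(histories[user])
    owned = set(histories[user])
    peer_items = Counter()
    for other, history in histories.items():
        if other != user and segment(history) == user_segment:
            peer_items.update(item for item in history if item not in owned)
    return [item for item, _ in peer_items.most_common()]

print(recommend("ann", histories))
```

Step 6 (letting the consumer reach the customized content directly) would sit in the site's front end and is not shown.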

Knowing the Customer

·Cookies present a problem because many customers do not trust them
·Also, when a cookie is given to a user who shares a terminal, there is no way to make sure that the next visitor from that terminal is the same user; i.e., cookies are machine specific, not user specific.
·The workaround is secure logins
·OPS: The Open Profiling Standard is a proposed standard for how Web users can control the personal information they share with Web sites. OPS has a dual purpose: (1) to allow Web sites to personalize their pages for the individual user and (2) to allow users to control how much personal information they want to share with a Web site. OPS was proposed to the Platform for Privacy Preferences Project (P3P) of the World Wide Web Consortium (W3C) in 1997 by Netscape Communications (now part of America Online), Firefly Network, and VeriSign. (copied from WhatIS.com; to read more about OPS go to: http://whatis.techtarget.com/definition/0,289893,sid9_gci214208,00.html)
·Manage customers, not products. Stay attentive to the customer’s needs and try to build a relationship with the actual customer.

·A Microstrategy (http://www.microstrategy.com) white paper states that 66% of all information traded between consumers and businesses will be non-commercial in nature. This means that by nurturing the relationship between the business and the consumer (not always forcing products down their throat) businesses will be able to excel.
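
The difference between machine-specific cookies and user-specific logins can be shown with a small sketch. The identifiers and function names are invented for this example; it only illustrates keying a profile by login when one is available.

```python
def profile_key(cookie_id, login_id=None):
    """Prefer the authenticated login; fall back to the machine-level cookie."""
    if login_id is not None:
        return ("user", login_id)
    return ("machine", cookie_id)

profiles = {}

def record_view(category, cookie_id, login_id=None):
    """Append a viewed category to whichever profile the visit maps to."""
    key = profile_key(cookie_id, login_id)
    profiles.setdefault(key, []).append(category)

# Two people share one terminal, so they share one cookie
record_view("toys", cookie_id="abc123", login_id="parent")
record_view("video games", cookie_id="abc123", login_id="teen")
record_view("news", cookie_id="abc123")  # anonymous visit, cookie only
```

With logins the two users get separate profiles even though the cookie is the same; the anonymous visit can only be attributed to the machine.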

Broadvision.com

·Broadvision develops and delivers an integrated suite of packaged applications for personalized enterprise portals. Global enterprises and government entities use these applications to sell, buy, and exchange information over the web and on wireless devices. The BroadVision e-commerce application suite enables companies to become more competitive and profitable by establishing and sustaining high-yield relationships with customers, suppliers, and employees. (copied from http://www.broadvision.com)
·Software allows for the intuitive management and collection of data on customers on a given web site. Includes real-time applications. Essentially makes relating data to other data much easier for users of the system.

How does this all work?

·Data mining applications use methods including decision trees, neural networks, and various combinations of business rules to target a specific customer.
·Through the definition of the business rules, a certain customer can be targeted (e.g. an 18-year-old man with a fast computer for computer games). Then, specific marketing can be used to maximize the value of that customer to the company.
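
Business rules of this kind can be sketched as named predicates applied to customer records. The records, the thresholds (age 25, 500 MHz, 300 MHz), and the segment names are assumptions for illustration only.

```python
# Hypothetical customer records
customers = [
    {"id": "c1", "age": 18, "cpu_mhz": 800},
    {"id": "c2", "age": 45, "cpu_mhz": 300},
    {"id": "c3", "age": 19, "cpu_mhz": 233},
]

# Each business rule is a segment name plus a predicate over one customer
RULES = [
    ("pc gamer", lambda c: c["age"] <= 25 and c["cpu_mhz"] >= 500),
    ("budget upgrader", lambda c: c["cpu_mhz"] < 300),
]

def targets(customer, rules=RULES):
    """Return the names of all rules the customer satisfies."""
    return [name for name, matches in rules if matches(customer)]

for customer in customers:
    print(customer["id"], targets(customer))
```

A decision tree or neural network would learn comparable conditions from data, whereas here they are written by hand.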

Knowledge Discovery in Databases (KDD)

·It is the process of identifying valid, novel, potentially useful, and understandable patterns in data (Fayyad, Piatetsky-Shapiro, and Smyth)
·It involves data preparation, pattern extraction, knowledge evaluation, and refinement, in iteration (Narahari)
·This is essentially a data mining process which results in business knowledge. Specific business rules and algorithms are applied to extract data. However, the data has to be significantly cleaned before it can actually be mined; this is generally due to the very raw nature of the data collected. It has to be filtered to remove outliers and extraneous data, and put into a uniform, manageable state before it can be mined.
·Process:
1.Select Data
2.Data Cleansing and Pre-processing (80-90% of time spent on this process)

3.Data Mining

4.Results interpretation

5.Implementation
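
The cleansing and pre-processing step can be sketched as follows. The raw values and the outlier rule (drop anything more than five times the median) are assumptions picked for this example; real projects choose rules that suit their data.

```python
import statistics

# Hypothetical raw order amounts as collected: strings, blanks, one entry error
raw_amounts = ["19.99", " 25.00", "", "22.50", "n/a", "21.00", "9999.00"]

def to_number(text):
    """Return a float, or None when the field is blank or not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None

def remove_outliers(values, max_ratio=5.0):
    """Drop values more than max_ratio times the median (a crude, assumed rule)."""
    median = statistics.median(values)
    return [v for v in values if v <= max_ratio * median]

parsed = [to_number(t) for t in raw_amounts]       # put data in a uniform state
valid = [v for v in parsed if v is not None]       # remove extraneous entries
clean = remove_outliers(valid)                     # remove outliers
print(clean)
```

Only the cleaned list would be passed on to the data mining step.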

Data Mining Problems

·There are inherent problems any time you segment a group. Thus, when one begins creating numerous segments automatically, there are bound to be some segmentations made which should not have been. Essentially, it is very difficult to fit everyone into certain categories. Some people may fit better in more than one category, or even a category that does not exist.
·The collection of extraneous data can also lead to improper segment assignment. For example, if someone bought diapers and lots of other baby products at the supermarket because they had family visiting, and as a result was segmented as a customer with an interest in baby products, the segmentation would be faulty.
·Similarly, some product associations are faulty. Take the beer and diapers example discussed in class: a man goes to the store to buy diapers and decides to pick up a 6-pack. If this happens numerous times, it would show the vendor (at first glance) a strong trend toward a product association between beer and diapers. However, common sense says this is coincidental and these are not co-marketable products. Thus, not all relationships in data are what they appear to be.
·Further, forecasting is troublesome, as conditions can change very quickly. If mined data shows an increasing trend in the purchase of suntan lotion, that does not mean the trend will continue to grow through the year. Controls must be placed on the process to let the system know that the lotion is a seasonal item.
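
One way to check whether a pairing such as beer and diapers is more than coincidence is to compute the standard association-rule measures support, confidence, and lift. These measures are not named in the notes, and the baskets below are invented; this is a sketch of the idea only.

```python
# Hypothetical market baskets
baskets = [
    {"diapers", "beer"},
    {"diapers", "beer", "wipes"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"milk"},
]

def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

def lift(antecedent, consequent, baskets):
    """Confidence divided by the consequent's baseline support."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)

print(support({"diapers", "beer"}, baskets))
print(lift({"diapers"}, {"beer"}, baskets))
```

A lift close to 1 means the two items appear together about as often as chance alone would predict, which is a hint that the association should not drive co-marketing decisions.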

Data Mining Tasks

·Intelligent data mining suites can make fair predictions. By mining historical data, along with current data, it is possible to make reasonable forecasts as to how many of a given product a vendor can expect to sell on a given day.
·Further mining the data to determine what the drivers of purchase volume are can be beneficial to companies looking to increase sales.
·Data mining could be used to show trends as to which products have spikes in demand together. Then, these products could be co-marketed to increase overall sale volume for each item.
·Thus, determining which products go together, both naturally and by consumer demand, is an inherent ability of good data mining.
·The process of grouping like things together is called affinity grouping. It is directed by human involvement.
·Affinity grouping has an inherent problem, illustrated by the earlier example of grouping diapers and beer.

·Cluster Grouping differs because it is not a directed task. Clusters are formed of products and customers with similar demand patterns.

·This method of grouping is preferred by many experts.
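
Undirected cluster grouping can be illustrated with a small k-means sketch. The notes do not name a specific algorithm, so k-means is an assumption here, as are the customer data and the starting centroids (fixed so the result is repeatable).

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared distance)
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return clusters, centroids

# Hypothetical customers as (summer purchases, winter purchases)
customers = [(9, 1), (8, 2), (10, 0), (1, 9), (2, 8), (0, 10)]
clusters, centers = kmeans(customers, centroids=[(9, 1), (1, 9)])
print(clusters)
```

No one tells the algorithm what the groups mean; it only finds customers with similar demand patterns, and a person interprets the result afterwards.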