Building Better Customer Data Profiles with Big Data Technologies

December 10, 2012 5:38 pm

Elliott Cordo of Caserta Concepts

“Know thy customer” should be the number one strategic initiative for any organization. In today’s world, marketing and all other customer interactions need to be designed to fit the individual customer. This micro-targeting approach is probably best recognized in the large online retailers, whose content and interactions we now expect to be relevant and tuned to our behavior. To achieve this, an organization needs a customer profile that encompasses engagement, preferences, sentiment and other behavioral aspects.

Let’s clarify a few of these terms:

Customer engagement is a measure of how an organization interacts with its customers across all available channels. A successful engagement metric must go beyond tracking purchases and include as many touch points as possible, including email (especially whether customers actually open it), website usage and call center interactions.

Preferences are what customers have shown us they like. This information can come from onboarding surveys and from usage and behavior patterns, including the items or services they buy or interact with (shopping carts, wish lists).

Sentiment is what customers think about the brand.  This can come from surveys, customer relationship management (CRM) interactions and social media.
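To make these three terms concrete, here is a minimal Python sketch of a profile that accumulates them from a stream of customer events. The event fields (`customer_id`, `channel`, `item`, `sentiment`), the sentiment scale and the `CustomerProfile` class are hypothetical names chosen for illustration, not part of any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CustomerProfile:
    """Hypothetical profile holding engagement, preferences and sentiment."""
    customer_id: str
    # Engagement: number of touch points seen per channel.
    touch_points: dict = field(default_factory=lambda: defaultdict(int))
    # Preferences: items the customer bought or interacted with.
    preferences: set = field(default_factory=set)
    # Sentiment: scores on an assumed scale from -1.0 to 1.0.
    sentiment_scores: list = field(default_factory=list)

    @property
    def engagement_channels(self):
        return sorted(self.touch_points)

    @property
    def average_sentiment(self):
        if not self.sentiment_scores:
            return None  # no sentiment signal observed yet
        return sum(self.sentiment_scores) / len(self.sentiment_scores)


def build_profiles(events):
    """Fold a sequence of event dicts into one profile per customer."""
    profiles = {}
    for event in events:
        cid = event["customer_id"]
        if cid not in profiles:
            profiles[cid] = CustomerProfile(cid)
        profile = profiles[cid]
        profile.touch_points[event["channel"]] += 1
        if "item" in event:
            profile.preferences.add(event["item"])
        if "sentiment" in event:
            profile.sentiment_scores.append(event["sentiment"])
    return profiles


# Illustrative sample data, not real customer records.
events = [
    {"customer_id": "c1", "channel": "email_open"},
    {"customer_id": "c1", "channel": "web", "item": "running shoes"},
    {"customer_id": "c1", "channel": "survey", "sentiment": 0.5},
    {"customer_id": "c1", "channel": "call_center", "sentiment": -1.0},
    {"customer_id": "c2", "channel": "purchase", "item": "tent"},
]
profiles = build_profiles(events)
```

Note that the sketch counts every channel, not only purchases, which is the point made above about engagement going beyond transactions.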

Such a profile should draw on as many customer activities as possible to ensure a holistic picture. Luckily, with the continuous and sometimes explosive growth of data generated by transactional systems, Web traffic and social media, building such a rich profile is achievable. However, this data, which exhibits the volume, variety and velocity of big data, poses a problem: how in the world do you process and integrate it?

Traditional Approach
Traditionally, organizations that build a customer profile have done so as a data warehousing initiative. Most established data warehouses have gone through major master data management and customer data quality initiatives to generate a distinct and accurate customer record, typically referred to as a conformed customer dimension. It is likely that the warehouse also contains structured facts about point-of-sale transactions, clickstream and CRM interactions. This foundation can then be used to form aggregates and segmentations, creating a customer profile. The organization then has a reusable and well-defined tool for repeatable analysis and successful targeting of its customers, which is a beautiful thing.

However, there are three weaknesses with this traditional approach:

1. Data availability. Data warehouses are built in stages, and there is always a list of subject areas waiting to be added. Typically, a steering committee made up of marketing, IT and other stakeholders decides which areas are built and in what order, in a constant cycle of evaluating business value, cost and feasibility. Survey data, for instance, might be one of those items lurking at the bottom of the value and feasibility list, at least when compared to retail data. However, it could be one of the most powerful tools for understanding a customer’s engagement and sentiment.

2. Data volume. There must always be a conscious decision about what data to store in the data warehouse and how long to retain it. Typically, these decisions are made by IT based on infrastructure costs, and the resulting filtering and pruning of customer data can stand in the way of getting the full picture.

3. Unstructured data. Data of this type is very difficult to model in a relational database, and even more so in a data warehouse. When it is structured and conformed, much of its original value can be lost; in general, a good amount of the raw data is discarded as a tradeoff for storing it in a more usable form. Additionally, much of the unstructured world is composed of text. Analyzing and gaining insight from text in the relational world is quite challenging, and the toolset is relatively thin, especially when dealing with high volumes.

Emerging Technologies to Address the Problem
A group of emerging technologies that fall under the term “Big Data” provides a framework that makes a “complete” and adaptable customer profile much more feasible. The biggest enabling factor is that these technologies can accept and process any type of data, from relational data in OLTP systems, to structured data in the enterprise data warehouse, to free-text documents, help desk and CRM call logs, social media and even video. Removing the conventional bounds of the ETL (extract, transform and load) and data modeling steps required by traditional data warehousing allows organizations to generate customer profiles in a more agile and constraint-free manner.

The large-scale storage capabilities of the Hadoop Distributed File System (HDFS) allow organizations to retain everything. There is no need for timescale-based pruning or pre-filtering of data based on perceived value. That which might not seem useful to fulfill current analytical requirements may indeed be significant from a customer perspective. For example, timeouts and page errors can be discarded for clickstream analysis but may represent significant negative touch points for an end user.
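The clickstream example can be illustrated with a small Python sketch. It assumes the raw events were retained in full and carry a hypothetical `status` field; the field names and status values are illustrative assumptions, not a real log format. The same raw data then supports two different views.

```python
from collections import Counter

# Assumed status values that mark a bad experience for the customer.
NEGATIVE_STATUSES = {"timeout", "error"}

# Illustrative raw events, kept unfiltered as they might sit in HDFS.
raw_events = [
    {"customer_id": "c1", "page": "/cart", "status": "ok"},
    {"customer_id": "c1", "page": "/checkout", "status": "timeout"},
    {"customer_id": "c1", "page": "/checkout", "status": "error"},
    {"customer_id": "c2", "page": "/home", "status": "ok"},
]


def successful_page_views(events):
    """Clickstream view: failed requests are noise and are dropped."""
    return [e for e in events if e["status"] == "ok"]


def negative_touch_points(events):
    """Customer view: the same failures counted per customer."""
    return Counter(
        e["customer_id"] for e in events if e["status"] in NEGATIVE_STATUSES
    )
```

Had the failed requests been pruned before storage, only the first view would be possible; the count of bad experiences for customer `c1` could not be recovered later.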

Machine learning and sentiment analysis can be used to process unstructured data, text and video. Working with these tools allows an enterprise to develop customer models and behavioral profiles that may not have been easily identified in the past.

Taken together, the emerging technologies address the weaknesses in the traditional data warehousing approach. At a high level, the right approach will consist of three components:

1. Extraction: This is the initial stage of collecting and delivering source data. Toolsets may combine traditional enterprise ETL tools that have built big data connectors with native big data tools such as Sqoop and Flume.

2. Processing and transformation: Processing the data into actionable information can be accomplished with any mix of native MapReduce, higher-level languages such as Pig or Hive, and an evolving set of agile and user-friendly ETL tools such as Datameer. Machine learning libraries such as Mahout may be employed to create clusters or segments based on a library of advanced algorithms.

3. Delivery: Depending on the delivery requirements, the customer profile may exist as an ETL endpoint, be instantiated in a big data database such as HBase or Cassandra, or be fed back to a relational database or enterprise data warehouse.
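The three components above can be sketched in plain Python. This is only an in-memory illustration of the flow under stated assumptions: the functions `extract`, `map_record`, `reduce_by_customer` and `deliver` and the sample sources are hypothetical stand-ins, and none of them uses the actual APIs of Sqoop, Flume, MapReduce, Hive or HBase.

```python
import json
from collections import defaultdict


def extract(sources):
    """Extraction: pull raw records from each source, tagged with origin."""
    for source_name, records in sources.items():
        for record in records:
            yield {"source": source_name, **record}


def map_record(record):
    """Map step: emit a (customer_id, source) pair for each raw record."""
    return record["customer_id"], record["source"]


def reduce_by_customer(pairs):
    """Reduce step: count touch points per customer and source."""
    counts = defaultdict(lambda: defaultdict(int))
    for customer_id, source in pairs:
        counts[customer_id][source] += 1
    return {cid: dict(by_source) for cid, by_source in counts.items()}


def deliver(profiles):
    """Delivery: serialize each profile under its customer key, as it
    might be written to a key-value store or fed back to a warehouse."""
    return {cid: json.dumps(p, sort_keys=True) for cid, p in profiles.items()}


# Illustrative sources standing in for point-of-sale, Web and CRM data.
sources = {
    "pos": [{"customer_id": "c1", "amount": 20.0}],
    "clickstream": [
        {"customer_id": "c1", "page": "/home"},
        {"customer_id": "c2", "page": "/cart"},
    ],
    "crm": [{"customer_id": "c1", "note": "asked about returns"}],
}

profiles = reduce_by_customer(map_record(r) for r in extract(sources))
delivered = deliver(profiles)
```

Keeping the stages as separate functions mirrors the separation described above: any one stage could be swapped for a real tool without changing the others.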

A Strategic Asset
As with a more traditional approach, managing customer data quality is a requirement for creating customer profiles that are useful in analysis and for building valuable customer relationships. A well-constructed customer data strategy is a strategic asset; although an organization’s strategy will evolve once the program is underway, a solid starting point is key.

The customer profile provides a very compelling use case for big data. Being able to access and aggregate so many little pieces of data related to customer engagement will surely help bring the big picture into focus.

Elliott Cordo is a principal consultant at Caserta Concepts, a consulting and technology services firm that specializes in data warehousing, business intelligence and big data analytics in the New York area.
