May 04, 2007

Segmentation is more than a way to classify the market. It acts as a strategic framework for understanding, marketing to, and reaching customers and prospects. Taken to its fullest potential, segmentation provides the organizing principle—the lingua franca—for the way the entire organization thinks about its customers and prospects.

When Claritas launched the industry’s first segmentation system, PRIZM, the original 40 clusters were produced by analyzing detailed data from the 1970 Census and updated demographic projections, and by evaluating millions of pieces of information about consumer purchasing habits. After its introduction in the 1970s, PRIZM was updated in the 1980s and completely rebuilt in the 1990s, capitalizing on each release of the decennial Census data. As the Census Bureau prepares to release the last of the 2000 Census data this summer, Claritas is gearing up to repeat the process it has performed every decade since it pioneered geodemographic segmentation in 1976.

Building a segmentation system sounds deceptively simple, with three primary steps:

- Isolate key factors that distinguish between key behaviors;
- Create segments based upon these key factors; and
- Validate the new segments using a variety of important behaviors.

A statistical factor analysis of neighborhood-level data can identify the key factors that account for the most statistical variance between neighborhoods across the demographic variables. In addition to Census data, Claritas makes use of its proprietary annual Demographic Update, which combines Census data with multiple data sources, including a large number of small-area estimates from around the country. In 1980, 34 key factors ranging from household income to housing density accounted for 87% of the difference between neighborhoods.
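
To make the idea of "variance accounted for" concrete, here is a minimal sketch in Python. It is not Claritas's procedure: it substitutes principal components of a correlation matrix for a full factor analysis, and the four variables and eight neighborhoods are invented for illustration. The names `NEIGHBORHOODS`, `largest_eigenvalues` and `explained_share` are hypothetical.

```python
import math
import random

# Hypothetical neighborhood rows: income ($000), % owner-occupied,
# housing units per acre, median age. Invented for illustration only.
NEIGHBORHOODS = [
    [95, 82, 2.1, 44], [88, 78, 2.5, 47], [42, 35, 18.0, 29], [38, 30, 22.0, 31],
    [61, 64, 6.0, 38], [57, 58, 7.5, 52], [33, 25, 25.0, 27], [72, 70, 4.0, 41],
]

def standardize(rows):
    """Scale every column to mean 0 and standard deviation 1."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(c) / n for c in columns]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) for c, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(z):
    """Correlation matrix of already-standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]

def largest_eigenvalues(matrix, k, iters=500, seed=0):
    """Power iteration with deflation: estimates of the k largest eigenvalues."""
    p = len(matrix)
    m = [row[:] for row in matrix]
    rng = random.Random(seed)
    values = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(p)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        for _step in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(p)) for i in range(p))
        values.append(lam)
        # Remove the component just found so the next pass finds the next one.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return values

def explained_share(rows, k):
    """Share of total variance carried by each of the top k components."""
    corr = correlation_matrix(standardize(rows))
    return [value / len(corr) for value in largest_eigenvalues(corr, k)]

shares = explained_share(NEIGHBORHOODS, 2)
print([round(s, 3) for s in shares], "cumulative:", round(sum(shares), 3))
```

The cumulative share is the toy counterpart of a figure such as the 87% quoted above: it says how much of the difference between neighborhoods the retained factors capture, and says nothing yet about whether that variance matters for the target behaviors.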

The challenge of identifying key factors is to ensure that the variance they explain is meaningful to the behaviors that were determined a priori to be important to the goals of the segmentation system. A fundamental mistake is to jump into the analysis without a clear definition of which behaviors will serve as the yardstick for assessing the performance of the system in general and the key factors’ performance in particular.

Techniques for creating segments

Once the key factors have been pinpointed, various clustering algorithms can be used to define the segments. One of the more popular methods has been to use a ‘k-means’-like procedure. These techniques produce a specified (k) number of segments clustered around points (or “means” of the clusters). The general procedure of these techniques is to iterate through the observations, successively moving observations from one segment to another, to generate a solution that minimizes the within-cluster variation while maximizing the differences between clusters. The final number of clusters is a compromise between precision and practicality. The 1970, 1980 and 1990 PRIZM systems were all created using variants of this technique.
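
The procedure just described can be sketched with the textbook version of k-means (Lloyd's algorithm). This is an illustration only, not the variant used for any PRIZM release; the factor scores in `POINTS` are invented, and `k_means` and `within_cluster_ss` are hypothetical helper names.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, max_iters=100, seed=0):
    """Assign each point to the nearest center, recompute centers, repeat."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iters):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # no observation moved, so we have converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def within_cluster_ss(points, labels, centers):
    """Total within-cluster variation that the procedure tries to minimize."""
    return sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))

# Invented factor scores for six neighborhoods (two factors each).
POINTS = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (9.0, 9.0), (10.0, 10.0), (9.5, 11.0)]
labels, centers = k_means(POINTS, k=2)
print(labels, round(within_cluster_ss(POINTS, labels, centers), 2))
```

Rerunning with larger values of k will generally lower the within-cluster variation, which is why the final number of clusters has to be chosen on practical grounds rather than by this measure alone.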

Researchers who use these clustering algorithms must be alert to an assumption built into grouping all neighborhoods in a single analysis: that the dimensions important for defining one segment are equally important for all segments. Depending upon the behavioral goals, this may not be the case.

More recently, multivariate divisive partitioning techniques have come into use. These techniques focus directly on creating segments that maximize the differences along key behaviors using demographic and other characteristics. The idea is to successively split segments into two using a characteristic that maximizes the difference between the resulting segments simultaneously across all behaviors. Since the variables that are used can vary at each step, these techniques avoid the problem of using the same dimensions for each segment. Claritas now uses its own custom software that has evolved from these techniques to allow even closer integration of behavioral goals and segment definition.
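
A generic divisive scheme of this kind might look like the sketch below. It is not Claritas's custom software: the split criterion (between-group sum of squares, summed over all behaviors), the `min_size` floor, the record layout and the sample data are all assumptions made for illustration, and the behaviors are assumed to be on a comparable scale.

```python
def mean(values):
    return sum(values) / len(values)

def split_gain(left, right, n_behaviors):
    """Between-group sum of squares, added up across every behavior."""
    weight = len(left) * len(right) / (len(left) + len(right))
    return sum(
        weight * (mean([r["behaviors"][b] for r in left])
                  - mean([r["behaviors"][b] for r in right])) ** 2
        for b in range(n_behaviors)
    )

def best_split(segment, features, n_behaviors, min_size=2):
    """Try every feature and threshold; return the split with the largest gain."""
    best = None  # (gain, feature, threshold, left, right)
    for f in features:
        for threshold in sorted({r[f] for r in segment})[:-1]:
            left = [r for r in segment if r[f] <= threshold]
            right = [r for r in segment if r[f] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue
            gain = split_gain(left, right, n_behaviors)
            if best is None or gain > best[0]:
                best = (gain, f, threshold, left, right)
    return best

def divisive_partition(records, features, n_behaviors, n_segments):
    """Repeatedly split whichever segment offers the largest behavioral gain."""
    segments = [records]
    while len(segments) < n_segments:
        candidates = [(best_split(s, features, n_behaviors), i)
                      for i, s in enumerate(segments)]
        candidates = [(b, i) for b, i in candidates if b is not None]
        if not candidates:
            break
        best, i = max(candidates, key=lambda c: c[0][0])
        segments[i:i + 1] = [best[3], best[4]]
    return segments

# Invented records: two behaviors expressed as rates between 0 and 1.
RECORDS = [
    {"income": 30, "age": 25, "behaviors": [0.1, 0.1]},
    {"income": 32, "age": 60, "behaviors": [0.1, 0.1]},
    {"income": 35, "age": 30, "behaviors": [0.1, 0.1]},
    {"income": 38, "age": 55, "behaviors": [0.1, 0.1]},
    {"income": 80, "age": 28, "behaviors": [0.9, 0.8]},
    {"income": 85, "age": 30, "behaviors": [0.9, 0.8]},
    {"income": 90, "age": 58, "behaviors": [0.9, 0.2]},
    {"income": 95, "age": 62, "behaviors": [0.9, 0.2]},
]
for seg in divisive_partition(RECORDS, ["income", "age"], 2, 3):
    print([(r["income"], r["age"]) for r in seg])
```

In this toy data the first split falls on income, while the later split inside the high-income group is driven by the second behavior, which varies with age there. The low-income group is never split further because its behaviors do not differ, which illustrates how the defining dimensions can change from one segment to the next.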

The final and most important step is to validate the segments across the complete range of targeted behaviors. This step has increased in importance with each new version of PRIZM. Each segment must be carefully evaluated to ensure that what it represents is stable and interesting, in both demographic and behavioral terms. If either the stability or interest test can’t be met, the segments are recombined and then repartitioned. As with k-means-type clustering algorithms, the size and number of the segments must be balanced against the unique characteristics each segment brings to the overall system.
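
One simple way to picture such a check is sketched below. It is an assumption-laden illustration, not the PRIZM validation process: it uses the common marketing convention of an index where 100 equals the overall average, treats segment size as a crude proxy for stability, and the thresholds `min_households` and `min_index_gap`, the segment names and the rates are all hypothetical.

```python
def behavior_index(segment_rate, overall_rate):
    """Index of 100 means the segment behaves exactly like the overall average."""
    return 100.0 * segment_rate / overall_rate

def validate_segments(segments, min_households=1000, min_index_gap=20.0):
    """segments: {name: {"households": int, "rates": {behavior: rate}}}"""
    total = sum(s["households"] for s in segments.values())
    behaviors = list(next(iter(segments.values()))["rates"])
    # Household-weighted overall rate for every behavior.
    overall = {
        b: sum(s["households"] * s["rates"][b] for s in segments.values()) / total
        for b in behaviors
    }
    report = {}
    for name, s in segments.items():
        indices = {b: behavior_index(s["rates"][b], overall[b]) for b in behaviors}
        report[name] = {
            "indices": indices,
            # Crude stand-in for stability: enough households to measure reliably.
            "stable": s["households"] >= min_households,
            # Interesting: at least one behavior departs clearly from average.
            "interesting": any(abs(i - 100.0) >= min_index_gap for i in indices.values()),
        }
    return report

# Invented segments and purchase rates.
SEGMENTS = {
    "A": {"households": 5000, "rates": {"owns_boat": 0.10}},
    "B": {"households": 5000, "rates": {"owns_boat": 0.02}},
    "C": {"households": 200, "rates": {"owns_boat": 0.06}},
}
for name, result in validate_segments(SEGMENTS).items():
    print(name, result["stable"], result["interesting"])
```

A segment that fails either flag would be a candidate for recombining and repartitioning. In practice the test would run across the complete range of targeted behaviors rather than a single one.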

Levels of analysis

Over the course of the 1990s, marketers have increasingly sought to capture finer differences within households in the same neighborhood. While the ultimate segmentation system (more than 107 million clusters, one for each household in the US) could deliver the perfect precision of one-to-one marketing, the logistics of managing every household as unique are more than any marketing system can bear. Trading away some of this exactness buys the expediency of household-level segmentation.

Household-level segmentation systems, such as P$YCLE, are constructed using the same basic process as any segmentation system. The critical difference is that the key factors being isolated come from household-level data inputs, as compared to neighborhood-level inputs for a cluster system like PRIZM. Using household-level data permits marketers to determine finer distinctions in behaviors within a given neighborhood while maintaining the advantage of a manageable number of “buckets” into which households are classified.

The gain of increased precision from using household-level inputs comes at the cost of collecting, purchasing, storing and manipulating considerably more pieces of data than with a neighborhood-level system.

Regardless of the level of analysis, the final segmentation system should yield segments that each represent a unique piece of the demographic and consumer behavior portrait of American lifestyles.

By David Miller
Senior Vice President/Data Research & Development
Claritas Inc.
