rd4_4
Rev Feb 2, 1998
Babson College
F.W. Olin Graduate School of Management 

  Chicago Fashion Study I:
Using MiniTab to Help Analyze the Data
Some Examples
 

The purpose of this memo is to take you through a sequence of analyses that show how to use MiniTab to explore the Chicago Fashion data set in search of Segments.  The main sections are:

TALLY
CROSS_TABULATION
ROW PERCENTS: DESIGNERS
COLUMN PERCENTS: DESIGNERS
OVERALL DISTRIBUTION: DESIGNERS
DESIGNERS AND INCOME
INCOME AND STORE SHOPPED
PRICES AND STORES
PRICES AND THE DEMAND CURVE
PRICES AND STORES 2
PRICES AND SALES
SALE/NO SALE DEMAND CURVES
SALES AND STORES

TALLY

Suppose you begin by noticing that the data set contains a number of lifestyle variables. You decide to explore them. The first one "DESIGNERS" looks interesting so you explore it by clicking on stats/tally and selecting designers and a number of outputs including counts and cumulative percents. You get this Table:

This tabulation shows you that 460 respondents answered "1", meaning that they did not often buy designer clothes. You then decide to look at the responses for "LIFE_FSH" to see whether they agree:

This table shows that there appears to be more agreement with this statement -- more respondents give it a "6" rating.
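If you want to check a tally outside MiniTab, the same counts and cumulative percents can be sketched in a few lines of Python. This is only an illustration: the ratings below are made up, and the 1-to-6 scale is an assumption based on the responses mentioned above.

```python
from collections import Counter

# Hypothetical DESIGNERS ratings (assumed 1-6 scale); not the real survey data.
designers = [1, 1, 1, 2, 3, 3, 4, 5, 6, 6]

counts = Counter(designers)
total = len(designers)

tally = []          # rows of (value, count, percent, cumulative percent)
cumulative = 0
for value in sorted(counts):
    cumulative += counts[value]
    tally.append((value, counts[value],
                  100 * counts[value] / total,
                  100 * cumulative / total))

for value, count, pct, cum_pct in tally:
    print(f"{value:>5} {count:>5} {pct:>7.1f} {cum_pct:>7.1f}")
```

The last cumulative percent should always be 100, which is a quick check that no responses were dropped.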

CROSS_TABULATION

You then wonder whether the same respondents give high ratings to both questions -- so you cross-tabulate them:

ROW PERCENTS: DESIGNERS

It looks as though there is some tendency for those who say "1" to "DESIGNERS" to also say "1" to "LIFE_FSH". To check this further, you decide to look at the same data -- but to use "ROW PERCENTS" to see whether the rows are similar:

This table makes it very clear that there are big differences. The top row has a lot of respondents to the "left" and the bottom row has a lot to the "right". This confirms that there is a relationship between these variables.

COLUMN PERCENTS: DESIGNERS

Similarly, you decide to check for column-to-column differences:

Again, you see evidence of a strong relationship.

OVERALL DISTRIBUTION: DESIGNERS

You also want to know whether there are a lot of respondents in any "cluster" within the overall cross tabulation -- so you ask for the same cross-tabulation with percentages based on the whole sample:

You see that there are a couple of areas of the table with a lot of respondents. These could be "target" clusters.
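The three kinds of percentages used above differ only in the total each cell is divided by: its row total, its column total, or the whole sample. A minimal Python sketch, using made-up rating pairs rather than the survey data, makes the difference concrete.

```python
from collections import Counter

# Hypothetical (DESIGNERS, LIFE_FSH) rating pairs; not the real survey data.
pairs = [(1, 1), (1, 1), (1, 2), (2, 1), (2, 2), (6, 5), (6, 6), (6, 6)]

cells = Counter(pairs)                       # count in each (row, column) cell
row_totals = Counter(r for r, _ in pairs)    # respondents per DESIGNERS rating
col_totals = Counter(c for _, c in pairs)    # respondents per LIFE_FSH rating
n = len(pairs)

def percents(base):
    """Percent of each cell relative to base(row, column)."""
    return {(r, c): 100 * k / base(r, c) for (r, c), k in cells.items()}

row_pct = percents(lambda r, c: row_totals[r])   # each row sums to 100
col_pct = percents(lambda r, c: col_totals[c])   # each column sums to 100
all_pct = percents(lambda r, c: n)               # whole table sums to 100

print(row_pct[(1, 1)], col_pct[(1, 1)], all_pct[(1, 1)])
```

Row percents answer "of those who said 1 to DESIGNERS, how many said 1 to LIFE_FSH?", while overall percents show how large each cell is as a share of everyone, which is what matters when you are sizing a possible target cluster.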

DESIGNERS AND INCOME

It then occurs to you that interest in DESIGNERS may well be related to income, so you decide to look at this relationship. You decide to cross-tabulate these variables and to ask for column percents.  This is the table you get:

It shows that there is some relationship -- but not nearly as strong as that between DESIGNERS and LIFE_FSH.  (Note that the format of this Table is a little different from the one above.  That is because I "dropped" the data into EXCEL to make it easier to format.  There is an EXCEL Wizard to help do this that is quite intuitive to use.)

INCOME AND STORE SHOPPED

You then decide you need to see whether customers at different income levels also shop at different stores.  You focus first on Jeans, so you decide to cross-tabulate INCOME by BY_JEANS.


 
You see that there are large store-to-store differences in INCOME.  But the Table is very large, so you decide to eliminate the stores with fewer shoppers and to keep just the top 7.  To do so you re-code the store data to keep just the most popular stores.  You do this by re-coding the others to "*" as follows:

Now the Table is much smaller and it is quite clear that shoppers at some stores are much more interested in DESIGNERS than shoppers at other stores:
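The re-coding step described above (keep the most popular stores, turn the rest into "*") can be sketched in Python as follows. The store codes are invented for illustration, and the helper name keep_top_stores is hypothetical.

```python
from collections import Counter

# Hypothetical BY_JEANS store codes; not the real survey data.
by_jeans = ["A", "A", "A", "B", "B", "C", "C", "D", "E"]

def keep_top_stores(stores, top_n=7):
    """Re-code every store outside the top_n most popular to '*'."""
    popular = {store for store, _ in Counter(stores).most_common(top_n)}
    return [s if s in popular else "*" for s in stores]

# The memo keeps the top 7; top_n=3 is used here only because the toy list is short.
recoded = keep_top_stores(by_jeans, top_n=3)
print(recoded)
```

Note that when two stores tie for the last place, which one is kept depends on the order of the data, so it is worth looking at the counts before choosing the cut-off.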

PRICES AND STORES

You then decide to look at whether the prices customers paid also vary from store to store.  To do so, you look at the range of prices paid for Jeans (P_JEANS) and see that there is a very wide range -- from under $10 to $145.  So you decide to re-code these data as follows:

There are not enough spaces in the re-code screen to do the whole re-code at once, so you do the lower prices first -- into column 77 -- which is the next free column.  Note that any data not re-coded in each step is kept as it was.  So you can then re-code the rest of the data into the same column 77.

Now you can do a table of prices that is much more condensed:

This Table shows how purchases vary with price.  Most of the sales are at the middle price levels.
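Re-coding prices into bands amounts to assigning each price to the interval it falls in and then tallying the intervals. Here is a minimal Python sketch. The prices and the band edges are assumptions chosen for illustration; they are not the bands used in the memo.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical P_JEANS prices in dollars; not the real survey data.
p_jeans = [8.0, 19.99, 25.0, 32.5, 38.0, 45.0, 60.0, 145.0]

# Assumed band edges; each band includes its lower edge.
edges = [20, 40, 60]
labels = ["under $20", "$20-$39.99", "$40-$59.99", "$60 and over"]

def price_band(price):
    """Map a price to the label of the band it falls in."""
    return labels[bisect_right(edges, price)]

bands = Counter(price_band(p) for p in p_jeans)
for label in labels:
    print(f"{label:<14} {bands[label]}")
```

Deciding up front whether a price exactly on an edge (here $60) belongs to the higher or the lower band avoids counting it in neither or in both.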

PRICES AND THE DEMAND CURVE

It then occurs to you that you can use these data -- with a little re-configuration -- to draw a demand curve.  Instead of cumulating sales from low prices to high, you want to cumulate them the other way.  So you go back to your EXCEL spreadsheet and sort in this order and create this Table:

This looks interesting -- so you decide to make a graph of it in EXCEL CHART:
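The reverse cumulation behind the demand curve can also be written out directly. In this sketch the price bands and counts are invented. The idea is simply that the quantity demanded at a given price is the number of buyers who paid that price or more.

```python
# Hypothetical purchases per price band (band lower bound in dollars -> buyers).
sales_by_price = {10: 40, 20: 120, 30: 150, 40: 90, 60: 30}

demand = []        # (price, number of buyers who paid at least this price)
running = 0
for price in sorted(sales_by_price, reverse=True):   # cumulate from high price to low
    running += sales_by_price[price]
    demand.append((price, running))
demand.reverse()   # list low prices first, as on a chart axis

for price, buyers in demand:
    print(f"${price:>3}  {buyers}")
```

Plotting price against the cumulated count gives the downward-sloping curve: at the lowest price everyone in the sample is counted as a buyer, and at the highest only those who paid the top price remain.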

PRICES AND STORES 2

You wonder whether prices paid at different stores are also different -- so you decide to cross-tabulate P_JEANS by BY_JEANS:



You also get the average price in each store by asking for "SUMMARIES" and selecting C77, which is P_JEANS re-coded.  There are large store-to-store differences.
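The store-by-store average is a grouped mean: collect the prices paid at each store, then average each group. A small Python sketch with made-up purchase records:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (BY_JEANS store, P_JEANS price) records; not the real survey data.
purchases = [("A", 30.0), ("A", 50.0), ("B", 20.0), ("B", 24.0), ("B", 16.0)]

prices_by_store = defaultdict(list)
for store, price in purchases:
    prices_by_store[store].append(price)

avg_price = {store: mean(prices) for store, prices in prices_by_store.items()}
for store in sorted(avg_price):
    print(f"{store}: ${avg_price[store]:.2f} from {len(prices_by_store[store])} purchases")
```

Printing the number of purchases next to each average is a useful habit, since an average based on only a handful of shoppers can look very different from the rest for no real reason.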

PRICES AND SALES

You also notice that some customers bought on sale.  You wonder whether they paid lower prices.  So you cross-tabulate P_JEANS by SL_JEANS:

SALE/NO SALE DEMAND CURVES

Those who bought on sale paid much less.  You wonder whether the demand curves for these groups may be different -- so you decide to graph them:

Clearly there are significant differences.
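To compare two groups of different sizes, it helps to express each demand curve as a share of that group's buyers rather than as a raw count. The following Python sketch does this for invented sale and no-sale records. The price levels are arbitrary assumptions.

```python
# Hypothetical (price paid, bought on sale?) records; not the real survey data.
records = [(15, True), (20, True), (25, True),
           (30, False), (45, False), (60, False), (25, False)]

def demand_curve(prices, levels):
    """For each price level, the share of buyers who paid at least that much."""
    return [(lvl, sum(p >= lvl for p in prices) / len(prices)) for lvl in levels]

levels = [10, 20, 30, 40, 50, 60]   # assumed price levels for the chart axis
sale = demand_curve([p for p, on_sale in records if on_sale], levels)
no_sale = demand_curve([p for p, on_sale in records if not on_sale], levels)

for (lvl, s), (_, ns) in zip(sale, no_sale):
    print(f"${lvl:>3}  sale {s:.2f}  no sale {ns:.2f}")
```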

SALES AND STORES

You also wonder whether there are store-to-store differences in the impact of sales on average price paid.  So you decide to do a triple-cross-tabulation as follows:


 
 The resulting tables show significant store-to-store differences. Try it for yourself and see.
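One way to read a three-way table of this kind is to compute the average price for each store and sale status, then take the difference within each store. The Python sketch below uses invented records, and the "sale"/"no sale" labels are assumptions about how SL_JEANS is coded.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (BY_JEANS, SL_JEANS, P_JEANS) records; not the real survey data.
rows = [("A", "sale", 20.0), ("A", "sale", 30.0), ("A", "no sale", 45.0),
        ("B", "sale", 18.0), ("B", "no sale", 22.0), ("B", "no sale", 26.0)]

cells = defaultdict(list)            # prices grouped by (store, sale status)
for store, status, price in rows:
    cells[(store, status)].append(price)

avg = {key: mean(prices) for key, prices in cells.items()}

# Average saving from buying on sale, store by store.
stores = sorted({store for store, _, _ in rows})
discount = {s: avg[(s, "no sale")] - avg[(s, "sale")] for s in stores}
print(discount)
```

A store with no sale purchases (or only sale purchases) would have an empty cell, so with real data you would need to skip such stores before taking the difference.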
