Conjoint Analysis

Author: Kemjika Ananaba

In this Project, we will learn about conjoint analysis and how it is applied when introducing a new product to the market. We carry out a rank-based conjoint analysis using multiple linear regression. We encourage you to create your own Jupyter Notebook and follow along. You can also download this notebook, together with any accompanying data, from the Notebooks and Data GitHub Repository. Alternatively, if you do not have Python or Jupyter Notebook installed yet, you may experiment with a virtual Notebook by launching Binder or Syzygy below (learn more about these two tools in the Resource tab).

Launch Syzygy (UBC)

Launch Syzygy (Google)

Launch Binder

Business Problem

This Project evaluates market research for a new brand of beer. The simulated dataset captures three attributes that describe the new beer: Price Point, After Taste, and Calorie Level. Price Point and Calorie Level are recorded as discrete levels rather than continuous values. Rank-based conjoint analysis answers the following questions:

How important are certain features of a product to consumers?
What trade-offs are consumers willing to make with regard to the product features?

Rank Based Conjoint Analysis

There are various ways to perform conjoint analysis. Here, we will focus on rank-based conjoint analysis. A Rank-Based Conjoint (RBC) study exposes respondents to a set of product profiles, which all share the same attributes but at varying levels, and asks them to rank the profiles from best to worst.

Data

The dataset aggregates the responses of several survey respondents. There are two columns: Rank and Stimulus. The Rank column shows how each of the 18 combinations of the three attributes is ranked, and the Stimulus column codes these product combinations using the following categories:

Attribute/Levels

“A” is Price Point with three levels:
1. $6
2. $5
3. $4
“B” is After Taste with two levels:
1. Strong
2. Mild
“C” is Calorie Level with three levels:
1. Full
2. Regular
3. Low
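
The 18 product profiles are the full factorial of these levels (3 × 2 × 3). As a minimal sketch, assuming each Stimulus code is formed by concatenating one level code per attribute in A, B, C order, the full set of codes can be enumerated as follows. The names levels and profiles are illustrative and do not appear in the notebook.

```python
from itertools import product

# Level codes per attribute, following the A/B/C labelling above
levels = {
    "A": ["A1", "A2", "A3"],  # Price Point: $6, $5, $4
    "B": ["B1", "B2"],        # After Taste: Strong, Mild
    "C": ["C1", "C2", "C3"],  # Calorie Level: Full, Regular, Low
}

# Full factorial: every combination of one level per attribute
profiles = ["".join(combo) for combo in product(*levels.values())]

print(len(profiles))   # number of product profiles
print(profiles[:4])    # first few stimulus codes
```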

import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
%matplotlib inline

Data = pd.read_csv("nb0015_data/ConjointInput.csv", sep = ";")
Data.head(4)

	Stimulus	Rank
0	A1B1C1	2
1	A1B1C2	3
2	A1B1C3	1
3	A1B2C1	5

Data Manipulation

The Stimulus column is a bit cryptic, so some data manipulation is needed here. In total, there are eight attribute levels across the three attributes (3 + 2 + 3) and 18 possible combinations (3 × 2 × 3).

# Create an 18-row by 9-column DataFrame of zeros: one column for Rank
# and one dummy column per attribute level
ConjointDummyDF = pd.DataFrame(np.zeros((18, 9)),
                               columns=["Rank", "A1", "A2", "A3",
                                        "B1", "B2",
                                        "C1", "C2", "C3"])

# Convert the data table to a DataFrame with dummy variables
# Transfer the Rank column
ConjointDummyDF.Rank = Data.Rank
# For each stimulus, set the three matching level columns to 1
for index, row in Data.iterrows():
    stimulus = row["Stimulus"]
    # Split a code such as "A1B2C3" into "A1", "B2" and "C3"
    stimuli1, stimuli2, stimuli3 = stimulus[:2], stimulus[2:4], stimulus[4:6]
    ConjointDummyDF.loc[index, [stimuli1, stimuli2, stimuli3]] = 1

ConjointDummyDF.head()

	Rank	A1	B1	B2	C1	C2	C3
0	2	1.0	1.0	0.0	1.0	0.0	0.0
1	3	1.0	1.0	0.0	0.0	1.0	0.0
2	1	1.0	1.0	0.0	0.0	0.0	1.0
3	5	1.0	0.0	1.0	1.0	0.0	0.0
4	6	1.0	0.0	1.0	0.0	1.0	0.0
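
The same dummy coding can also be produced without pre-allocating a table of zeros. The following is a minimal sketch, assuming every Stimulus code is a sequence of letter-digit tokens; it uses only the four rows shown in the Data.head(4) output, and the names level_codes, tokens, dummies and dummy_df are hypothetical.

```python
import pandas as pd

# Four rows copied from the Data.head(4) output above
data = pd.DataFrame({"Stimulus": ["A1B1C1", "A1B1C2", "A1B1C3", "A1B2C1"],
                     "Rank": [2, 3, 1, 5]})
level_codes = ["A1", "A2", "A3", "B1", "B2", "C1", "C2", "C3"]

# Split each code into its letter-digit tokens, e.g. "A1B2C1" -> ["A1", "B2", "C1"]
tokens = data["Stimulus"].str.findall(r"[A-C]\d")

# One indicator column per level: 1.0 if the level appears in the stimulus
dummies = pd.DataFrame(
    [[float(code in row_tokens) for code in level_codes] for row_tokens in tokens],
    columns=level_codes,
    index=data.index,
)
dummy_df = pd.concat([data[["Rank"]], dummies], axis=1)
print(dummy_df)
```

Each row of dummies contains exactly three ones, one per attribute, which is a quick sanity check on the parsing.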

# Map the coded column names to descriptive names
fullNames = {"Rank": "Rank",
             "A1": "$6", "A2": "$5", "A3": "$4",
             "B1": "Strong After Taste", "B2": "Mild After Taste",
             "C1": "Full Calories", "C2": "Regular Calories", "C3": "Low Calories"}
# Rename the columns
ConjointDummyDF.rename(columns=fullNames, inplace=True)
# Confirm that the column names have changed
ConjointDummyDF.head()

	Rank	$6	Strong After Taste	Mild After Taste	Full Calories	Regular Calories	Low Calories
0	2	1.0	1.0	0.0	1.0	0.0	0.0
1	3	1.0	1.0	0.0	0.0	1.0	0.0
2	1	1.0	1.0	0.0	0.0	0.0	1.0
3	5	1.0	0.0	1.0	1.0	0.0	0.0
4	6	1.0	0.0	1.0	0.0	1.0	0.0

Linear Regression

Conjoint analysis works by observing how respondents’ preferences change as one systematically varies the product features. It examines how they trade off different aspects of the product, weighing options that have a mix of more desirable and less desirable qualities. These observations allow us to estimate, through linear regression, the part-worth of each of the eight levels across the three product attributes.

Multiple linear regression attempts to model the relationship between multiple explanatory variables, such as the product features, and a response variable, the product rank, by fitting a linear equation to the observed data.
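
Written out for this design, the fitted equation has one indicator (dummy) variable per attribute level. The symbols below are notation introduced here for illustration and are not taken from the notebook: x_{i,l} is 1 if profile i has level l and 0 otherwise, beta_l is the part-worth of level l, beta_0 is the constant, and epsilon_i is the residual.

```latex
\mathrm{Rank}_i = \beta_0
  + \sum_{\ell \in \{A1, A2, A3\}} \beta_\ell \, x_{i,\ell}
  + \sum_{\ell \in \{B1, B2\}} \beta_\ell \, x_{i,\ell}
  + \sum_{\ell \in \{C1, C2, C3\}} \beta_\ell \, x_{i,\ell}
  + \varepsilon_i
```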

# Assign all columns apart from Rank to X, the explanatory variables
X = ConjointDummyDF[["$6", "$5", "$4", "Strong After Taste", "Mild After Taste",
                     "Full Calories", "Regular Calories", "Low Calories"]]

# Add a constant (intercept) term to serve as a benchmark
X = sm.add_constant(X)
# Assign Rank to Y, the response variable
Y = ConjointDummyDF.Rank
# Fit the linear regression model
linearRegression = sm.OLS(Y, X).fit()

# Collect the regression output: part-worths (coefficients) and p-values
df_res = pd.DataFrame({
    'feature': linearRegression.params.keys(),
    'part_worth': linearRegression.params.values,
    'pval': linearRegression.pvalues,
})
df_res

	feature	part_worth	pval
const	const	4.384615	1.931004e-185
$6	$6	-4.538462	9.384234e-180
$5	$5	1.461538	7.543620e-174
$4	$4	7.461538	2.406369e-182
Strong After Taste	Strong After Taste	0.692308	1.175104e-171
Mild After Taste	Mild After Taste	3.692308	2.218664e-180
Full Calories	Full Calories	1.461538	7.543620e-174
Regular Calories	Regular Calories	2.461538	1.448182e-176
Low Calories	Low Calories	0.461538	7.670223e-168
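
One caveat when reading this output: the dummy columns of each attribute sum to one, so together with the constant the nine regressors are linearly dependent and the individual coefficients are not uniquely determined. statsmodels fits OLS with a pseudoinverse by default, which returns the minimum-norm solution. Differences between levels of the same attribute are uniquely determined, however, and these are what the attribute distances rely on. The sketch below illustrates this with NumPy on synthetic, perfectly additive ranks chosen to be consistent with the rows shown earlier; the level contributions are an assumption made for illustration, not the survey file itself.

```python
import numpy as np
from itertools import product

# Synthetic additive contributions (illustrative assumption, not the survey data)
price = {"A1": 0, "A2": 6, "A3": 12}
taste = {"B1": 0, "B2": 3}
calories = {"C1": 1, "C2": 2, "C3": 0}
cols = list(price) + list(taste) + list(calories)

rows, ranks = [], []
for a, b, c in product(price, taste, calories):
    # Constant term followed by one indicator per level
    rows.append([1.0] + [float(col in (a, b, c)) for col in cols])
    ranks.append(1 + price[a] + taste[b] + calories[c])
X, y = np.array(rows), np.array(ranks, dtype=float)

# The design has 9 columns but fewer independent directions
print(np.linalg.matrix_rank(X))

# Minimum-norm least-squares solution via the pseudoinverse
beta = dict(zip(["const"] + cols, np.linalg.pinv(X) @ y))

# Reference coding: drop one level per attribute (A1, B1, C1 as baselines)
keep = [0] + [1 + cols.index(c) for c in ["A2", "A3", "B2", "C2", "C3"]]
beta_ref, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)

# Within-attribute differences agree between the two codings
print(beta["A3"] - beta["A1"], beta_ref[2])   # price: A3 versus A1
print(beta["B2"] - beta["B1"], beta_ref[3])   # taste: B2 versus B1
```

Dropping one baseline level per attribute is a common way to make the coefficients unique; each remaining coefficient then reads directly as a difference from its baseline.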

Part Worth or Relative Utility

Central to the theory of conjoint analysis is the concept of product utility. Utility is a latent variable that reflects how desirable or valuable an object is in the mind of the respondent. The utility of a product is assessed from the value of its parts (part-worth). Conjoint analysis examines consumers’ responses to product ratings, rankings, or choices to estimate the part-worth of the various levels of each attribute of a product. Utility is not an absolute unit of measure; only relative values or differences in utilities matter.

We therefore focus on the coefficients in the multiple linear regression output because these represent the average part-worths, or relative utility scores, of the attribute levels. The higher the coefficient of a level, the higher its relative utility; this reading treats a larger Rank value as a more preferred profile. The eight levels are spread across three attributes. The $4 Price Point (A3) has the highest part-worth at 7.46; the second highest belongs to the Mild After Taste (B2), at 3.69; and the Regular Calorie Level (C2) comes third, at 2.46. Each of these is also the top level within its own attribute.

What would be the optimal product bundle?

# Normalize values for each feature for the pie chart
raw = [7.46,3.69,2.46]
norm = [float(i)/sum(raw) for i in raw]
norm

[0.5481263776634827, 0.27112417340191036, 0.1807494489346069]
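
The three values in raw are the largest part-worth within each attribute. As a rough sketch of how the winning bundle can be found programmatically, the snippet below sums the part-worths (rounded values copied from the regression output) for each of the 18 profiles and picks the highest total. The names part_worths, total_utility and best are hypothetical helpers, not part of the notebook.

```python
from itertools import product

const = 4.384615
part_worths = {
    "Price": {"$6": -4.538462, "$5": 1.461538, "$4": 7.461538},
    "After Taste": {"Strong After Taste": 0.692308, "Mild After Taste": 3.692308},
    "Calories": {"Full Calories": 1.461538, "Regular Calories": 2.461538,
                 "Low Calories": 0.461538},
}

def total_utility(profile):
    """Predicted rank of a profile: constant plus one part-worth per attribute."""
    return const + sum(part_worths[attr][lvl]
                       for attr, lvl in zip(part_worths, profile))

# Enumerate every combination of one level per attribute
profiles = list(product(*(levels.keys() for levels in part_worths.values())))
best = max(profiles, key=total_utility)
print(best, round(total_utility(best), 2))
```

Because the model is additive, the best profile is simply the combination of the highest part-worth level from each attribute, so the exhaustive search and the per-attribute maximum agree.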

# Graph our winning product features

labels = '$4 Price', 'Mild After Taste', 'Regular Calories'
# Using the normalized values above, expressed as percentages
sizes = [54.8, 27.1, 18.1]
colors = ['yellowgreen', 'mediumpurple', 'lightskyblue']
explode = (0.1, 0, 0)
plt.pie(sizes,
        explode=explode,
        labels=labels,
        colors=colors,
        autopct='%1.1f%%',
        shadow=True,
        startangle=70
        )

plt.axis('equal')
plt.title('Ideal Product Based on Responses')
plt.show()

Relative Importance of the Product Attributes

The part-worths are used to derive the importance and the relative importance of a product’s attributes. An attribute’s importance is the difference between the highest and lowest part-worths among its levels. The relative importance of an attribute is its share of the total importance across all attributes.

If the distance between the utility levels of an attribute is large, then that attribute will have a larger bearing on the respondents’ choice of product than an attribute where the distance is smaller. The distance, therefore, reflects the importance of the attribute in determining consumer preferences.

Distance = max(part-worth) - min(part-worth)
Weight of attribute (relative importance) = attribute distance / sum(attribute distances)

# Attribute importances

price = ['$4', '$5', '$6']
Taste = ['Strong After Taste', 'Mild After Taste']
Calories = ['Full Calories', 'Regular Calories', 'Low Calories']

# Function to calculate the distance (range of part-worths) for one attribute
def distance_cal(lista):
    newlist = []
    for item in lista:
        # .iloc[0] selects the single matching row by position
        x = df_res.part_worth[df_res['feature'] == item].iloc[0]
        newlist.append(x)
    return max(newlist) - min(newlist)

# List of level lists used to calculate each attribute's importance and weight
attributes = [price, Taste, Calories]

# List of attribute names
attributename = ['Price', 'Taste', 'Calories']
# Sum of the attribute distances
sum_dist = distance_cal(price) + distance_cal(Taste) + distance_cal(Calories)

# Print the distance and weight for each attribute
for name, item in zip(attributename, attributes):
    print("\n Attribute : ", name)
    print("\n Distance : ", distance_cal(item))
    print("\n Importance %: ", distance_cal(item) / sum_dist)
    print("-----------------------")

 Attribute :  Price

 Distance :  11.999999999999993

 Importance %:  0.7058823529411763
-----------------------

 Attribute :  Taste

 Distance :  3.0000000000000013

 Importance %:  0.17647058823529427
-----------------------

 Attribute :  Calories

 Distance :  1.9999999999999987

 Importance %:  0.11764705882352938
-----------------------

# Plotting the importance
sizes = [distance_cal(price)/sum_dist, distance_cal(Taste)/sum_dist, distance_cal(Calories)/sum_dist]
colors = ['yellowgreen', 'mediumpurple', 'lightskyblue']
explode = (0.1, 0, 0)
plt.pie(sizes,
        explode=explode,
        labels=attributename,
        colors=colors,
        autopct='%1.1f%%',
        startangle=70
        )

plt.axis('equal')
plt.title('Attribute Importance')
plt.show()

Trade-off Analysis

Product developers are constantly faced with trade-offs. For instance, changing the beer from a Strong After Taste (B1) to a Mild After Taste (B2) would raise the price because the beer requires additional resources. Whether this would increase demand can be determined by examining the trade-offs that consumers are willing to make between a more preferred After Taste and a less desirable Price Point. The figure below illustrates the attribute part-worths that are considered in the trade-off analysis.

f, ax = plt.subplots(figsize=(14, 8))
plt.title('Attribute Part Worth')
pwu = df_res['part_worth']

xbar = np.arange(len(pwu))
my_colors = ['black','blue','blue','blue','red','red','yellow','yellow','yellow']
plt.barh(xbar, pwu,color=my_colors)
plt.yticks(xbar, labels=df_res['feature'])
plt.show()
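
As a worked example of such a trade-off, the sketch below compares the part-worth gained by moving from Strong to Mild After Taste with the part-worth lost by moving up one price level. It uses rounded coefficients copied from the regression output and follows this Project's reading that a higher part-worth means a more preferred level; the function name net_change is hypothetical.

```python
# Rounded part-worths copied from the regression output
part_worth = {
    "$6": -4.538462, "$5": 1.461538, "$4": 7.461538,
    "Strong After Taste": 0.692308, "Mild After Taste": 3.692308,
}

def net_change(old_levels, new_levels):
    """Change in summed part-worth when a profile moves from old to new levels."""
    gained = sum(part_worth[level] for level in new_levels)
    given_up = sum(part_worth[level] for level in old_levels)
    return gained - given_up

# Milder after taste, but one price step higher ($5 -> $6)
change = net_change(["Strong After Taste", "$5"], ["Mild After Taste", "$6"])
print(round(change, 2))
```

Under these estimates the taste improvement adds about 3 points while the price step costs about 6, so the net change is negative: a one-level price increase would outweigh the milder taste.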

Knowledge of the relative importance of various attributes can assist in marketing and advertising decisions. Other factors being equal, we should devote greater attention and resources to improving the product attributes that matter most to target consumers.

Final Conclusion

Conjoint analysis is a marketing research technique designed to help managers determine the preferences of customers and potential customers. In particular, it seeks to determine how consumers value the different attributes that make up a product and the trade-offs they are willing to make among the product’s different attributes. As such, conjoint analysis is best suited for products with very tangible attributes that can be easily described or quantified.

