Intro
Fundamental marketing theory
- STP - Segmentation, Targeting, Positioning
- Marketing mix
- Price elasticity
- Neural networks
Customer Analytics
The first part of the course focuses on how to perform customer segmentation, using a hands-on approach. It applies hierarchical and flat clustering techniques to divide customers into groups, uses PCA to reduce the dimensionality of the problem, and combines PCA with K-means for an even more refined customer segmentation (a sketch of that combination follows below)
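A minimal sketch of that PCA-plus-K-means combination, assuming the df_segmentation dataset used throughout these notes; the 3 components and 4 clusters are illustrative choices, not the course's:
# Standardize the features, reduce their dimensionality with PCA, then cluster.
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
segmentation_std = StandardScaler().fit_transform(df_segmentation)
# Keep 3 principal components (in practice chosen from the explained variance).
scores_pca = PCA(n_components = 3).fit_transform(segmentation_std)
# Cluster in the reduced space; 4 clusters is likewise an illustrative choice.
labels = KMeans(n_clusters = 4, random_state = 42).fit_predict(scores_pca)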
Purchase Analytics
The second part of the course explores both the descriptive and predictive analysis of customers' purchase behavior, including models for purchase incidence, brand choice, and purchase quantity. It also covers the application of state-of-the-art deep learning techniques to make predictions on real-world data
STP Framework
STP is a fundamental marketing framework. It can be applied to all areas of business and marketing activities
- Segmentation
- Targeting
- Positioning
Segmentation
- The process of dividing a population of customers into groups that share similar characteristics
- Observations within the same group would have comparable purchasing behavior
- Observations within the same group would respond similarly to different marketing activities
- Divide customers into segments based on age, income, gender
- Consumer data
  - Demographic
  - Geographic
  - Psychographic
  - Behavioral
    - Purchase frequency
    - Time of purchase
    - Purchase quantity
    - Product ratings
Targeting
- The process of evaluating potential profits from each segment and deciding which segments to focus on
- Selecting ways to promote your products; you can target one segment on TV and another online
- Examining customers' perceptions (involves psychology and usually budget constraints)
- Consideration factors
  - Segment size
  - Expected growth
  - Competitors' offerings
Positioning
- What product characteristics do the customers from a certain segment need?
- Shows how a product should be presented to the customers and through what channel
- In fact, this process is so important that it has a framework of its own, called the Marketing Mix
Marketing Mix
Develop the best product or service and offer it at the right price through the right channels
- Purchase probability
- Brand choice probability
- Purchase quantity
4 Ps of Marketing
- Product
- Product features; Branding; Packaging
- Price
- Product cost; Long term price changes
- Promotion
- Price reduction, display and feature
- Place
- Distribution: intensive, selective, exclusive
Physical and Online Retailers
Price elasticity is the percentage change in an economic outcome of interest in response to a 1% change in a respective price
It is expected that units sold of a brand would increase if the brand's unit price decreases, and would also increase if a competitor brand's unit price increases
- Price elasticity of purchase probability
- Price elasticity of brand choice probability
- Price elasticity of purchase quantity
Pearson Correlation
- Linear dependency between variables
df_segmentation.corr()
- Ranges from -1 to 1
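A small sketch of inspecting these linear dependencies visually; seaborn is an assumption here, as the notes don't show the plotting code:
import matplotlib.pyplot as plt
import seaborn as sns
# Compute the Pearson correlation matrix and display it as a heatmap.
corr = df_segmentation.corr()
plt.figure(figsize = (12, 9))
sns.heatmap(corr, annot = True, cmap = 'RdBu', vmin = -1, vmax = 1)
plt.title('Correlation Heatmap')
plt.show()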
Standardization
To standardize the features, we'll use the StandardScaler module.
# Import the scaler used to standardize the features.
from sklearn.preprocessing import StandardScaler
# Standardize the data, so that all features have equal weight. This is important for modelling.
# Otherwise, in our case, Income would be considered much more important than Education, for instance.
# We do not know if this is the case, so we would not like to introduce this to our model.
# This is what is also referred to as bias.
scaler = StandardScaler()
segmentation_std = scaler.fit_transform(df_segmentation)
Types of Clustering
- Hierarchical
- Flat
- Flat clustering is much faster than hierarchical clustering
Hierarchical Clustering
- Example: the taxonomy of the animal kingdom
- Types of hierarchical clustering
  - Divisive (top-down)
  - Agglomerative (bottom-up; easier to solve mathematically)
  - Both should reach the same results
- How do we measure the distances between observations?
- Euclidean distance
- Manhattan distance
- Maximum distance
- How do we measure the distance between clusters?
  - Ward method: (|A - B|^2) / n
# Hierarchical clustering with the SciPy library. We'll use the dendrogram and linkage modules.
from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pyplot as plt
# Perform hierarchical clustering. The results are returned as a linkage matrix.
hier_clust = linkage(segmentation_std, method = 'ward')
# We plot the results from the hierarchical clustering using a dendrogram.
# We truncate the dendrogram for better readability. The level p shows only the last p merged clusters.
# We also omit showing the labels for each point.
plt.figure(figsize = (12,9))
plt.title('Hierarchical Clustering Dendrogram')
plt.xlabel('Observations')
plt.ylabel('Distance')
dendrogram(hier_clust,
           truncate_mode = 'level',
           p = 5,
           show_leaf_counts = False,
           no_labels = True)
plt.show()
Finding the number of clusters
- We need to find a horizontal line on the dendrogram at which to cut
- Rule of thumb: find the longest vertical line that is not intercepted by any horizontal line of the dendrogram; that is where we should make the cut (the fcluster sketch below turns the chosen cut into cluster labels)
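Once the cut point is chosen, SciPy's fcluster can turn the linkage matrix into cluster labels; the choice of 4 clusters below is an assumption for illustration:
from scipy.cluster.hierarchy import fcluster
# Cut the dendrogram so that at most 4 flat clusters remain.
cluster_labels = fcluster(hier_clust, t = 4, criterion = 'maxclust')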
K-means Clustering
- The most commonly used clustering method, because of its simplicity
- K-means is well suited to segmentation data
- Choose the number of clusters we want to identify (the K in K-means is the number of clusters)
- Specify the cluster seeds
- Assign each point to a cluster, based on the squared Euclidean distance to the seeds
- Calculate the centroids (geometrical centers) of the clusters
- Repeat the last two steps until the centroids stop changing
Problems
- The squared Euclidean distance is quite sensitive to outliers (solved by k-median clustering)
- The number of clusters must be chosen beforehand
- K-means enforces spherical clusters
Within-Cluster Sum of Squares (WCSS) is used to determine the best clustering solution
Choosing the number of clusters - the elbow method (sketched below)
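A sketch of the elbow method, reusing the segmentation_std array from the standardization step; the 1-10 cluster range is an illustrative choice:
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
# Fit K-means for each candidate number of clusters and record the WCSS,
# which scikit-learn exposes as the inertia_ attribute.
wcss = []
for k in range(1, 11):
    kmeans = KMeans(n_clusters = k, init = 'k-means++', random_state = 42)
    kmeans.fit(segmentation_std)
    wcss.append(kmeans.inertia_)
# Plot WCSS against the number of clusters and look for the 'elbow'.
plt.figure(figsize = (10, 8))
plt.plot(range(1, 11), wcss, marker = 'o', linestyle = '--')
plt.xlabel('Number of Clusters')
plt.ylabel('WCSS')
plt.title('K-means Clustering: The Elbow Method')
plt.show()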
Purchase Analytics
Price elasticity of purchase probability
% change in purchase probability in response to a 1% change in price
We have a product, which costs $2.40. We have obtained the price elasticity of purchase probability to be -0.79. The customer at this price point is: inelastic
We have a product, which costs $3.50. We have obtained the price elasticity of purchase probability to be -2.1. The customer at this point is: elastic
We have a product, which costs $2.00. Using a logistic regression model, we have obtained the coefficient for price to be -2.3. In addition, we have estimated the purchase probability at this price point to be 0.3. Using the price elasticity formula from the lecture, E = beta * price * (1 - Pr(purchase)): -2.3 * 2.00 * (1 - 0.3) = -3.22 (worked through in the sketch below)
A company has a product with a price of 2.3 dollars on the market. They have estimated that the price elasticity of purchase probability at this price point is -0.9. If you are an adviser for the marketing team of the company what would you recommend: The company should increase the price of the product, without fear of losing too many customers.
The purchase probability of a client at a price point is 0.7, while their price elasticity is -1.56. If we raise the price of the product by 1%, the new probability of purchase of the client will be: 68.44% (70% minus 1.56 percentage points)
The purchase probability of a client at $2.56 is 0.7, while their price elasticity is -1.22. If we lower the price of the product by $0.02 (a 0.78% decrease), the new probability of purchase of the client will be: 70.95% (70% plus 1.22 * 0.78 ≈ 0.95 percentage points)
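A small sketch reproducing the arithmetic in the examples above; the function names are hypothetical, and the percentage-point adjustment mirrors the convention used in these answers:
def price_elasticity(beta, price, purchase_prob):
    # % change in purchase probability per 1% change in price,
    # derived from the logistic purchase-probability model.
    return beta * price * (1 - purchase_prob)

def adjusted_probability(purchase_prob, elasticity, pct_price_change):
    # New purchase probability after a given % price change,
    # applying elasticity * % change as a percentage-point shift.
    return purchase_prob + (elasticity * pct_price_change) / 100

print(price_elasticity(-2.3, 2.00, 0.3))                      # -3.22
print(adjusted_probability(0.7, -1.56, 1.0))                  # 0.6844
print(adjusted_probability(0.7, -1.22, -0.02 / 2.56 * 100))   # ~0.7095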
Multinomial Logistic Regression - for a multiclass scenario, such as brand choice
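A minimal scikit-learn sketch of a multinomial model, e.g. for choosing among several brands; X (features such as brand prices) and y (chosen-brand labels) are placeholder names, not variables from the course:
from sklearn.linear_model import LogisticRegression
# Multinomial logistic regression: predict_proba returns one probability per class (brand).
model = LogisticRegression(multi_class = 'multinomial', solver = 'lbfgs')
model.fit(X, y)
brand_probabilities = model.predict_proba(X)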
Deep Learning
Training the deep learning model
- Keras was built into the core of TensorFlow 2.0, so most model-building methods are referenced as Keras methods rather than 'TensorFlow' methods
- tf.keras.Sequential(layers) lays down the model (used to 'stack' layers)
- tf.keras.layers.Dense(output layer size, activation function) takes the inputs provided by the previous layer, calculates their dot product with the weights, and adds the bias; it then applies the activation function
- Activation functions
  - relu
  - sigmoid
  - tanh
  - softmax
- model.compile(optimizer, loss) configures the model for training
- Optimizers
  - Adaptive Moment Estimation (ADAM)
- loss
  - sparse_categorical_crossentropy: accepts integer targets and handles them as if they were one-hot encoded, so no manual one-hot encoding is needed
- tf.keras.callbacks.EarlyStopping(patience) ensures that the model stops training after the validation loss has started increasing. With patience = 0 it stops right after the first increase in validation loss; here patience = 2 is used
- fit() trains the model for a fixed number of epochs
- evaluate() returns the loss value and metrics values for the model in 'test mode'
- predict() generates output predictions (e.g. class probabilities) for the input samples
- argmax (arguments of the maxima) returns the position of the largest value; used to turn predicted class probabilities into a class label
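A sketch tying these pieces together; the layer sizes, epochs, and the train/validation/test arrays are assumptions for illustration, not the course's exact setup:
import numpy as np
import tensorflow as tf
# Stack dense layers; the final softmax layer outputs one probability per class.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation = 'relu'),
    tf.keras.layers.Dense(64, activation = 'relu'),
    tf.keras.layers.Dense(2, activation = 'softmax')
])
# sparse_categorical_crossentropy lets us pass integer targets directly.
model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
# Stop training once the validation loss has increased twice in a row.
early_stopping = tf.keras.callbacks.EarlyStopping(patience = 2)
model.fit(train_inputs, train_targets,
          epochs = 100,
          callbacks = [early_stopping],
          validation_data = (validation_inputs, validation_targets),
          verbose = 2)
# Evaluate on held-out data: returns the loss and the metrics defined in compile().
test_loss, test_accuracy = model.evaluate(test_inputs, test_targets)
# Convert predicted class probabilities into class labels with argmax.
predicted_classes = np.argmax(model.predict(test_inputs), axis = 1)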