Understanding customer behavior is key for modern businesses seeking to personalize their marketing and improve customer engagement. In this project, I analyzed a customer personality dataset and used K-Means clustering to segment the customer base into two distinct groups. The two segments differ in income, spending, and purchase-channel activity, and each points to a different marketing strategy.

Dataset Overview

The dataset contains information on 2,240 customers, including:

  • Demographics: Age, income, education, marital status, family size
  • Spending patterns: Annual spending on wine, meat, fruits, fish, sweets, and gold products
  • Purchase behavior: Number of purchases through web, catalog, and store
  • Customer lifecycle: Days since last purchase and days since registration

Data Preparation

To prepare the data for clustering, I performed several steps:

  • Removed rows with missing income data
  • Created new features like:
    • Age (based on year of birth)
    • Total Spending (sum of product-related expenses)
    • Family Size (number of children and teenagers)
    • Customer Tenure (based on registration date)
  • Encoded categorical variables (education level, marital status)
  • Removed outliers in income and age
  • Standardized numerical variables for fair distance calculations
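The preparation steps above can be sketched in Python with pandas and scikit-learn. This is a minimal illustration, not the notebook's actual code: the column names (`Year_Birth`, `Income`, `Kidhome`, `Teenhome`, `Dt_Customer`, the `Mnt*` spending columns, `Education`, `Marital_Status`), the reference year, the outlier thresholds, and the five-row sample table are all assumptions made for the example.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Assumed names of the categorical columns; adjust to the real dataset.
CATEGORICAL = ["Education", "Marital_Status"]


def prepare_features(df, reference_year=2024, max_age=90, max_income=200_000):
    """Return a cleaned, encoded, standardized feature table.

    reference_year, max_age and max_income are illustrative assumptions.
    """
    # 1. Remove rows with missing income.
    df = df.dropna(subset=["Income"]).copy()

    # 2. Derive new features.
    spend_cols = [c for c in df.columns if c.startswith("Mnt")]
    df["Age"] = reference_year - df["Year_Birth"]
    df["Total_Spending"] = df[spend_cols].sum(axis=1)
    df["Family_Size"] = df["Kidhome"] + df["Teenhome"]
    signup = pd.to_datetime(df["Dt_Customer"], format="%Y-%m-%d")
    df["Tenure_Days"] = (signup.max() - signup).dt.days

    # 3. Remove outliers in age and income with simple cut-offs.
    df = df[(df["Age"] <= max_age) & (df["Income"] <= max_income)]

    # 4. Standardize the numeric columns (zero mean, unit variance).
    numeric = df.drop(columns=CATEGORICAL + ["Year_Birth", "Dt_Customer"])
    scaled = pd.DataFrame(
        StandardScaler().fit_transform(numeric),
        columns=numeric.columns,
        index=numeric.index,
    )

    # 5. One-hot encode the categorical columns.
    dummies = pd.get_dummies(df[CATEGORICAL], columns=CATEGORICAL, dtype=float)
    return pd.concat([scaled, dummies], axis=1)


# Made-up sample rows, only to show the function running end to end.
raw = pd.DataFrame({
    "Year_Birth": [1970, 1985, 1990, 1900, 1978],
    "Income": [58000.0, None, 32000.0, 40000.0, 75000.0],
    "Kidhome": [0, 1, 1, 0, 0],
    "Teenhome": [1, 0, 0, 0, 1],
    "Dt_Customer": ["2013-01-15", "2014-03-02", "2012-11-20",
                    "2013-07-07", "2014-06-01"],
    "MntWines": [300, 20, 15, 60, 800],
    "MntMeatProducts": [150, 10, 25, 30, 400],
    "Education": ["Graduation", "PhD", "Basic", "Master", "PhD"],
    "Marital_Status": ["Married", "Single", "Single", "Married", "Together"],
})

features = prepare_features(raw)
print(features.shape)
```

Standardizing before clustering matters because K-Means relies on Euclidean distance, so an unscaled column such as income would otherwise dominate every other feature.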

K-Means Clustering

Using PCA to reduce the feature space, I applied K-Means clustering and evaluated different values of K using:

  • Elbow method (to find the K beyond which inertia stops dropping sharply)
  • Silhouette score
  • Davies–Bouldin index

These metrics suggested that K=2 was the most appropriate. The detailed technical implementation is available in the Kaggle Notebook.
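For readers who want to see how such a comparison works, here is a rough sketch using scikit-learn. It is not the notebook's code: the number of PCA components, the range of K, the random seed, and the synthetic two-blob data from `make_blobs` are assumptions chosen so the example runs on its own.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import davies_bouldin_score, silhouette_score


def evaluate_k(X, k_values=range(2, 7), n_components=3, seed=42):
    """Fit K-Means on PCA-reduced data and collect three metrics per K."""
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(X)
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(reduced)
        results[k] = {
            # Within-cluster sum of squares; plotted against K for the elbow.
            "inertia": model.inertia_,
            # Ranges from -1 to 1; higher means better-separated clusters.
            "silhouette": silhouette_score(reduced, model.labels_),
            # Non-negative; lower means more compact, distinct clusters.
            "davies_bouldin": davies_bouldin_score(reduced, model.labels_),
        }
    return results


# Synthetic stand-in for the standardized customer features (assumption).
X_demo, _ = make_blobs(
    n_samples=300,
    centers=[[-5] * 6, [5] * 6],
    cluster_std=1.0,
    random_state=0,
)

scores = evaluate_k(X_demo)
best_k = max(scores, key=lambda k: scores[k]["silhouette"])

for k, metrics in scores.items():
    print(k, {name: round(value, 3) for name, value in metrics.items()})
print("K with the highest silhouette score:", best_k)
```

The three metrics do not always agree, so it helps to read them together: look for the K where the inertia curve bends, the silhouette score peaks, and the Davies–Bouldin index is lowest.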

Cluster Summary

After analyzing the two customer clusters, here’s what I found:

  • Cluster 0 – High-Value Customers
    • Higher income and spending
    • Active across multiple purchase channels (web, store, catalog)
    • Likely to purchase luxury or high-end items like wine and gold
  • Cluster 1 – Budget-Conscious Customers
    • Lower income and total spending
    • Fewer purchases across all channels
    • More focused on daily essentials
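A summary like this typically comes from averaging each feature within each cluster. The sketch below shows one way to build such a profile with pandas. The `Cluster` column name, the feature names, and every number in the toy table are hypothetical and are not results from the dataset.

```python
import pandas as pd


def profile_clusters(df, label_col="Cluster"):
    """Return per-cluster feature means plus the number of customers."""
    grouped = df.groupby(label_col)
    profile = grouped.mean(numeric_only=True)
    profile["Customers"] = grouped.size()
    return profile.round(1)


# Toy table with invented values, only to demonstrate the function.
toy = pd.DataFrame({
    "Cluster": [0, 0, 1, 1, 1],
    "Income": [82000, 76000, 31000, 28000, 35000],
    "Total_Spending": [1400, 1250, 120, 90, 200],
    "Web_Purchases": [7, 6, 2, 1, 3],
})

profile = profile_clusters(toy)
print(profile)
```

Comparing the rows of the resulting table side by side is what supports labels such as "high-value" or "budget-conscious".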

Business Recommendations

Based on the segmentation, companies can adopt tailored strategies:

  • For Cluster 0: Offer premium memberships, personalized luxury bundles, and loyalty programs.
  • For Cluster 1: Focus on value-driven promotions, bundle essential products, and use discount incentives.

Coming Soon: Power BI Dashboard

I’m currently developing a Power BI dashboard to visualize the clustering results and allow interactive exploration of customer segments. It will help stakeholders easily identify key patterns across age, income, and purchase behavior. I’ll share the link once it’s ready!
