Uncovering Pricing and Behavioural Patterns in Online Apparel: A Data Mining and Machine Learning Approach Using Clickstream Data
- This project investigates customer behavior and pricing dynamics in online apparel sales, using a real-world clickstream dataset from a European e-commerce platform. Through a structured process of exploratory data analysis and predictive modeling, the study explores how product visuals, attributes, and browsing behavior influence both purchasing patterns and pricing outcomes. Random Forest models were used for two tasks: regression to predict product prices, and classification to assign items to budget or premium tiers. The outputs of the two models were then combined to detect potential pricing–perception mismatches, where a product appears over- or underpriced relative to how its features suggest it is perceived. This approach highlights products that may benefit from a review of their pricing or presentation strategy. The analysis shows that products with frontal model photography tend to perform better in both pricing and sales, while black and blue items generate the highest revenue. These findings lead to clear business recommendations for pricing, visual merchandising, and product positioning. The project demonstrates how machine learning can be applied not only to forecast outcomes but also to guide practical, data-driven decisions in e-commerce.
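
The mismatch step described above can be illustrated with a minimal sketch. The project text does not state the exact rule used to combine the two model outputs, so everything here is an assumption made for illustration: the `Product` fields, the `flag_mismatch` helper, the 15% tolerance, and the decision rule itself. The sketch takes the predicted price and predicted tier as given (for example, from a Random Forest regressor and classifier) and flags a product only when the price gap and the predicted tier point in opposite directions.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical record layout; field names are illustrative only.
    product_id: str
    actual_price: float
    predicted_price: float  # regressor output; assumed to be > 0
    predicted_tier: str     # classifier output: "budget" or "premium"


def flag_mismatch(p: Product, tolerance: float = 0.15) -> str:
    """Return 'overpriced', 'underpriced', or 'consistent' for one product.

    Assumed rule: compare the listed price with the predicted price and
    flag the product only if the predicted tier disagrees with the gap.
    """
    # Relative gap between the listed price and the feature-based estimate.
    gap = (p.actual_price - p.predicted_price) / p.predicted_price
    if gap > tolerance and p.predicted_tier == "budget":
        return "overpriced"   # priced high, but features look budget
    if gap < -tolerance and p.predicted_tier == "premium":
        return "underpriced"  # priced low, but features look premium
    return "consistent"


# Made-up example values, not results from the project.
products = [
    Product("A1", actual_price=60.0, predicted_price=40.0, predicted_tier="budget"),
    Product("B2", actual_price=30.0, predicted_price=50.0, predicted_tier="premium"),
    Product("C3", actual_price=45.0, predicted_price=44.0, predicted_tier="premium"),
]
flags = {p.product_id: flag_mismatch(p) for p in products}
print(flags)
```

Requiring both signals to disagree keeps the flag list short; a looser rule based on the price gap alone would surface more candidates for review at the cost of more false alarms.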