Market basket analysis python

Market Basket Analysis Concepts and Examples in Python

Have you ever noticed that when you buy a certain product, there are usually other products that are often purchased together? For example, when someone buys bread, he might also buy butter or jam. Market basket analysis is a method used to identify product purchasing patterns that often occur together.

In this article, we will discuss the concept of market basket analysis, examples of its application in data mining, how to perform market basket analysis using Python, and how to create data visualizations using Tableau.

What is Market Basket Analysis and How Does the Concept Work?

Market basket analysis is a data analysis technique used to identify the relationship between products purchased together by customers. The main objective of this analysis is to discover hidden relationships among purchased products and to use this information to improve sales and marketing strategies.

The basic concept of market basket analysis is “lift”. Lift is the ratio between the probability that two products will be purchased together and the probability that they will be purchased at random. If lift is 1, then the two products are likely to be purchased at random. However, if the lift is greater than 1, then the two products are likely to be purchased together.

Читайте также:  Html templates online shop

Example of Market Basket Analysis in Data Mining

  1. Data Preprocessing
    The first step in market basket analysis is preparing the data. You must ensure that the data you use is free from errors and duplications, and that it is properly encoded. Next, you need to convert the data into a transaction format, which is a list of all products purchased by a customer in a single transaction.
  2. Frequent Itemset Mining
    After the data is converted into a transaction format, the next step is to find frequent itemsets, namely sets of products that are frequently purchased together. Frequent itemsets are obtained using algorithms such as Apriori or FP-Growth.
  3. Association Rule Mining
    After getting the frequent itemset, the next step is to find the association rule, namely the relationship between products purchased together. The association rule is obtained by calculating the lift between every two products in the frequent itemset.
  4. Interpretation of Results
    After getting the association rule, the final step is to interpret the results. You can use this information to improve your sales and marketing strategy. Suppose you find that people who buy bread also tend to buy butter, then you can place butter near the bread to increase the likelihood of a joint purchase.

How to Use Python to Analyze Market Baskets

Python is one of the popular programming languages for data analysis. In this section, we will cover how to use Python to perform a market basket analysis. why python is it a good language to learn?

1. Data Preprocessing

First, you need to import the required libraries, such as pandas and numpy. Next, you need to read the data from the CSV file or database and prepare the data.

import pandas as pd import numpy as np # read data from CSV file df = pd.read_csv('data.csv') # remove unused data df = df.drop(['id'], axis=1) # converting data into a transactional format transactions = [] for i in range(len(df)): transactions.append([str(df.values[i,j]) for j in range(len(df.columns))])

2. Frequent Itemset Mining

Next, you need to import a library to use the Apriori or FP-Growth algorithms to find frequent itemsets.

from mlxtend.preprocessing import TransactionEncoder from mlxtend.frequent_patterns import apriori, fpgrowth # convert data to boolean format te = TransactionEncoder() te_ary = te.fit_transform(transactions) df = pd.DataFrame(te_ary, columns=te.columns_) # search for frequent itemsets with the Apriori algorithm frequent_itemsets = apriori(df, min_support=0.01, use_colnames=True) # looking for frequent itemsets with the FP-Growth algorithm frequent_itemsets = fpgrowth(df, min_support=0.01, use_colnames=True) 

3. Association Rule Mining

After getting the frequent itemset, you can use the mlxtend library to find the association rule.

from mlxtend.frequent_patterns import association_rules # find association rule with lift metric rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)

4. Interpretation of Results

After getting the association rules, you can use pandas to interpret the results.

# sort the association rule based on the lift value rules = rules.sort_values(['lift'], ascending=False) # displays the association rule with the largest lift print(rules.head()) 

Creating a Market Basket Analysis Visualization with Tableau

  1. Import Data
    First, you need to import product purchase data into Tableau.
  2. Create Frequent Itemsets
    Next, you can use the “Create Sets” feature to create frequent itemsets. Select the product column and select “Create Sets”. Select “Use All” to use all products, then set the support limit to search for frequent itemsets.
  3. Create Association Rules
    After getting the frequent itemset, you can use the “Create Combined Sets” to create an association rule. Select two frequent itemsets and set the lift limit to find the association rule.
  4. Create Visualization
    After getting the association rules, you can create visualizations using the “Visualizations” feature in Tableau. For example, you can create a scatter plot to display association rules based on support and lift values.

In conclusion, Market basket analysis is a useful data analysis technique for finding related product purchasing patterns. By using this technique, you can improve your sales and marketing strategy.

Python and Tableau are two popular tools for doing market analysis. In this article, we’ve covered how to use Python and Tableau to perform market analysis and create a visualization of the results.

In carrying out basket analysis, you need to carry out several stages such as data preprocessing, frequent itemset mining, association rule mining, and interpretation of results. Once you get results, you can take action to improve your sales and marketing strategy.

We hope this article will be useful for those of you who are interested in market basket analysis. Don’t forget to try it yourself using Python or Tableau to do market basket analysis on your data.

Источник

How to Perform Market Basket Analysis in Python

Market basket analysis is a powerful data science application that improves user experience and encourages purchases, which adds direct business value to companies.

In the past, marketers would often use their intuition when creating product combinations and building marketing strategies. Now that organizations are able to collect and store more data than ever before, they use their findings to target customers and increase sales. They hire data scientists and analysts in marketing teams to make these decisions instead.

In this article, I will explain some of the theory behind market basket analysis and show you how to implement it in Python.

Table of Contents

What Is Market Basket Analysis?

Market basket analysis is used by companies to identify items that are frequently purchased together. Notice, when you visit the grocery store, how baby formula and diapers are always sold in the same aisle. Similarly, bread, butter, and jam are all placed near each other so that customers can easily purchase them together. The technique uncovers hidden correlations that cannot be identified by the human eye by using a set of statistical rules to identify product combinations that occur frequently in transactions.

Apart from market basket analysis, other popular applications of data science in marketing include churn prediction, sentiment analysis, customer segmentation, and recommendation systems.

How Does Market Basket Analysis Work?

Market basket analysis is frequently used by restaurants, retail stores, and online shopping platforms to encourage customers to make more purchases in a single visit. This is a use-case of data science in marketing that increases company sales and drives business growth and commonly utilizes the Apriori algorithm.

What is the Apriori Algorithm?

The Apriori algorithm is the most common technique for performing market basket analysis.

It is used for association rule mining, which is a rule-based process used to identify correlations between items purchased by users.

How Does the Apriori Algorithm Work?

Let’s explore the process through an example of items most frequently bought together in a given store:

An example of the Apriori algorithm for market basket analysis as three subsets of item combinations purchased in a grocery store. There is a shopping cart on the right side with the three items hovering above - cereal, popcorn, and milk. On the left side are the three purchase combinations - popcorn and milk, popcorn and cereal, or milk and cereal.

Most store customers have purchased popcorn, milk, and cereal together. Therefore, is a frequent itemset as it appears in a majority of purchases. So, if a person grabs popcorn and milk, they will also be recommended cereal.

According to the Apriori algorithm, a subset of the frequent itemset is also frequent. Since is a frequent itemset, this means that , , and are also frequent. Due to this, if a customer only goes for popcorn, they will be recommended both milk and cereal as well.

What Are the Components of the Apriori Algorithm?

The Apriori algorithm has three main components:

You can think of these as metrics that evaluate the relevance and popularity of each item combination.

Let’s illustrate. The baskets below contain items purchased by four customers at a grocery store:

An example of the Apriori algorithm for market basket analysis with four baskets featuring four different types of item combinations.

Here is a tabular representation of this purchase data:

Источник

Оцените статью