Sales Analysis of a Supermarket using pandas
Introduction
Supermarkets supply our daily household goods, groceries and gift items. In 2019, the global retail market generated sales of nearly 25 trillion U.S. dollars, with a forecast to reach close to 27 trillion U.S. dollars by 2022 (Statista, 2022).
Business Task
This analysis uses the dataset provided to answer key business questions. The answers are meant to drive actionable insights.
What products have brought in the most revenue?
What category of products is the most profitable?
What is the peak month for sales?
Is there a yearly growth for the business?
Which customer is the most valuable? Most used payment channel?
Data Source
The dataset for this analysis was obtained from GitHub. The data is stored in CSV format and is structured, organized in rows and columns.
Data Cleaning and Manipulation
Python is the tool I have chosen for this project. The pandas library provides efficient tools for cleaning data and for quick visualizations that surface insights fast.
The dataset was obtained from Kaggle. The CSV file contains 51,290 rows and 21 columns. The columns are order_id, order_date, ship_date, ship_mode, customer_name, segment, state, country, market, region, product_id, category, sub_category, product_name, sales, quantity, discount, profit, shipping_cost, order_priority and year. There are no null values.
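The row, column and null-value checks described above can be sketched roughly as follows. This is only a minimal illustration: the few rows in `sample_csv` are made up and stand in for the real file, and only a handful of the 21 columns are included.

```python
import io

import pandas as pd

# Made-up sample standing in for the real CSV file (illustrative values only).
sample_csv = io.StringIO(
    "order_id,order_date,category,sales,profit\n"
    "A-1,2011-01-03,Technology,120.0,30.0\n"
    "A-2,2011-02-14,Furniture,80.0,-5.0\n"
    "A-3,2012-12-24,Office Supplies,15.5,4.5\n"
)

# With the real data, a file path would be passed instead of sample_csv.
df = pd.read_csv(sample_csv, parse_dates=["order_date"])

# Number of rows and columns.
n_rows, n_cols = df.shape

# Null values per column, and the overall total.
null_counts = df.isnull().sum()
total_nulls = int(null_counts.sum())

print(f"{n_rows} rows, {n_cols} columns, {total_nulls} null values")
```

On the real dataset, the same calls should report the 51,290 rows, 21 columns and zero nulls mentioned above.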
In every year, December is the month with the most sales (with the exception of 2014, but my guess is that the full sales figures were not yet available, since December 2014 is the last month in the dataset). The ‘ember’ months (September to December) generally see a high volume of sales, most likely because of the holiday season.
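One way to find the peak month per year is to total sales by year and month and then pick the largest month in each year. The sketch below uses made-up numbers, and the `month` column is a helper I derive from order_date for illustration; it is not one of the dataset's columns.

```python
import pandas as pd

# Made-up orders (illustrative values only).
df = pd.DataFrame({
    "order_date": pd.to_datetime(
        ["2011-11-05", "2011-12-20", "2011-12-24", "2012-06-10", "2012-12-01"]
    ),
    "sales": [100.0, 250.0, 150.0, 300.0, 500.0],
})

# Derive year and month from the order date.
df["year"] = df["order_date"].dt.year
df["month"] = df["order_date"].dt.month

# One row per year, one column per month, each cell holding total sales.
monthly_sales = df.pivot_table(
    index="year", columns="month", values="sales", aggfunc="sum", fill_value=0
)

# For each year, the month (column label) with the highest total sales.
peak_month = monthly_sales.idxmax(axis=1)

print(monthly_sales)
print(peak_month)
```

Plotting `monthly_sales` as a line or bar chart is a quick way to see the end-of-year climb.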
The category with the most sales is Technology, with 4.7 million dollars in sales, followed by Furniture. Office Supplies has the lowest sales, at 3.8 million dollars.
Interestingly, Technology is also the most profitable category, with 663,778 dollars in profit. Furniture has the second-highest sales, as shown above, but it is the least profitable category, with 286,782 dollars in profit. Office Supplies is the second most profitable category, with a whopping 518,472 dollars in profit. This is largely due to the sheer number of Office Supplies transactions.
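The sales versus profit comparison by category can be reproduced with a single groupby. This is a minimal sketch on made-up rows; the `profit_margin` column is an extra I add for illustration, and counting rows as transactions is my assumption.

```python
import pandas as pd

# Made-up order lines (illustrative values only).
df = pd.DataFrame({
    "category": [
        "Technology", "Furniture", "Office Supplies",
        "Office Supplies", "Technology", "Furniture",
    ],
    "sales": [500.0, 400.0, 60.0, 40.0, 300.0, 300.0],
    "profit": [90.0, 10.0, 25.0, 20.0, 60.0, -5.0],
})

# Total sales, total profit and number of rows per category.
by_category = df.groupby("category").agg(
    total_sales=("sales", "sum"),
    total_profit=("profit", "sum"),
    transactions=("sales", "count"),
)

# Profit as a share of sales, to compare categories of different size.
by_category["profit_margin"] = by_category["total_profit"] / by_category["total_sales"]

# Most profitable category first.
by_category = by_category.sort_values("total_profit", ascending=False)

print(by_category)
```

Sorting by total_sales instead of total_profit gives the sales ranking, which makes it easy to spot a category that sells well but earns little.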
The chart below shows the customers with the most transactions at the supermarket. This count can be misleading, however, because multiple customers can share the same first name and surname. The problem can be avoided if, in subsequent years, each customer is given a unique ID.
Having the most transactions does not necessarily translate to the highest sales. The customers who have spent the most are shown below.
As can be seen above, only Bart Watters appears in both the top 10 by transactions and the top 10 by sales.
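The two customer rankings, and the overlap between them, can be computed along these lines. The customer names and amounts below are made up, `top_n` is set to 2 only to keep the example small (the analysis uses 10), and counting each row as one transaction is my assumption.

```python
import pandas as pd

# Made-up orders (names and values are illustrative only).
df = pd.DataFrame({
    "order_id": ["O1", "O2", "O3", "O4", "O5", "O6"],
    "customer_name": ["Ann Lee", "Ann Lee", "Ann Lee", "Bo Chan", "Cy Diaz", "Cy Diaz"],
    "sales": [10.0, 20.0, 15.0, 500.0, 200.0, 150.0],
})

# Number of transactions and total amount spent per customer name.
per_customer = df.groupby("customer_name").agg(
    transactions=("order_id", "count"),
    total_sales=("sales", "sum"),
)

top_n = 2  # the analysis uses the top 10

top_by_transactions = per_customer["transactions"].nlargest(top_n)
top_by_sales = per_customer["total_sales"].nlargest(top_n)

# Customers that appear in both rankings.
in_both = sorted(set(top_by_transactions.index) & set(top_by_sales.index))

print(top_by_transactions)
print(top_by_sales)
print(in_both)
```

Because the grouping key is customer_name, two different people with the same name are merged into one row, which is the limitation noted above.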
There is year-over-year growth in both the number of transactions and total sales.
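Yearly growth can be checked by totalling transactions and sales per year and looking at the percentage change from one year to the next. A minimal sketch with made-up numbers, where `sales_growth_pct` is a helper column I introduce for illustration:

```python
import pandas as pd

# Made-up orders (illustrative values only).
df = pd.DataFrame({
    "year": [2011, 2011, 2012, 2012, 2012, 2013, 2013, 2013, 2013],
    "sales": [100.0, 100.0, 100.0, 100.0, 50.0, 100.0, 100.0, 100.0, 100.0],
})

# Number of transactions (rows) and total sales per year.
yearly = df.groupby("year").agg(
    transactions=("sales", "count"),
    total_sales=("sales", "sum"),
)

# Percentage change in total sales compared with the previous year.
# The first year has no previous year, so its value is NaN.
yearly["sales_growth_pct"] = yearly["total_sales"].pct_change() * 100

print(yearly)
```

A positive value in every row after the first means sales grew each year.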
CONCLUSIONS/RECOMMENDATIONS
Most sales take place during the holiday season, with December being the peak month. There is also a sales spike in June, during the summer sales.
The Technology category brings in the most sales. The top 4 products by sales are all in the Technology category, and most profits also come from it.
The number of transactions increases every year, and total sales increase along with it.