Heroes Of Pymoli Data Analysis

Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).
Our peak age demographic is 20-24 (44.8%), followed by 15-19 (18.6%) and 25-29 (13.4%).

# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
df = pd.read_csv(file_to_load)

df.head(10)

Player Count

# Calculate the Number of Unique Players
player_demographics = df.loc[:, ["Gender", "SN", "Age"]]
player_demographics = player_demographics.drop_duplicates()
num_players = player_demographics["SN"].nunique()   # each screen name counts once

# Display the total number of players
pd.DataFrame({"Total Players": [num_players]})
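The reason for de-duplicating before counting can be seen on a tiny table. The sketch below uses a hypothetical four-row sample (the screen name `Fury12` and the repeat purchase are invented for illustration) and counts distinct `SN` values directly with `Series.nunique()`.

```python
import pandas as pd

# Hypothetical miniature of the purchase data: one row per purchase.
purchases = pd.DataFrame({
    "SN": ["Lisim78", "Lisim78", "Yalae81", "Fury12"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Age": [20, 20, 22, 31],
})

# A player who buys twice appears in two rows, so count distinct screen names
# rather than rows.
num_players = purchases["SN"].nunique()
player_count = pd.DataFrame({"Total Players": [num_players]})
print(player_count)
```

Counting rows here would report four players; counting distinct screen names reports three.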

Purchasing Analysis (Total)

Run basic calculations to obtain number of unique items, average price, etc.
Create a summary data frame to hold the results
Optional: give the displayed data cleaner formatting
Display the summary data frame

unique_item_count = df['Item ID'].nunique()
average_price_of_items = df['Price'].mean()
count_of_purchases = len(df)
price_sum = df['Price'].sum()

# Format currency with two decimals so float noise never reaches the display
summary_dataframe = pd.DataFrame({
    'Number of Unique Items': [unique_item_count],
    'Average Price': [f'${average_price_of_items:,.2f}'],
    'Number of Purchases': [count_of_purchases],
    'Total Revenue': [f'${price_sum:,.2f}']
})

summary_dataframe
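Turning prices into strings makes the summary hard to reuse in later arithmetic. One possible alternative, sketched here on a hypothetical four-row sample, keeps the summary numeric and formats only a display copy.

```python
import pandas as pd

# Hypothetical sample rows; only the columns the summary needs are included.
purchases = pd.DataFrame({
    "Item ID": [108, 143, 92, 108],
    "Price": [3.53, 1.56, 4.88, 3.53],
})

# Numeric summary: safe to reuse in further calculations.
summary = pd.DataFrame({
    "Number of Unique Items": [purchases["Item ID"].nunique()],
    "Average Price": [purchases["Price"].mean()],
    "Number of Purchases": [len(purchases)],
    "Total Revenue": [purchases["Price"].sum()],
})

# Display copy: currency columns rendered as strings with two decimals.
display_summary = summary.copy()
for column in ["Average Price", "Total Revenue"]:
    display_summary[column] = display_summary[column].map("${:,.2f}".format)
print(display_summary)
```

The `{:,.2f}` format also pads trailing zeros, which plain `str()` on a float does not.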

Gender Demographics

Percentage and Count of Male Players
Percentage and Count of Female Players
Percentage and Count of Other / Non-Disclosed

df_gender1 = df[['Gender','SN']].drop_duplicates(subset = 'SN')

gender_count = df_gender1['Gender'].value_counts()                    # players per gender
gender_percent = df_gender1['Gender'].value_counts(normalize=True)    # share of players per gender

gender_count_df = pd.DataFrame(gender_count)
gender_percent_df = round(pd.DataFrame(gender_percent) * 100, 2)

gender_summary_df = gender_count_df.merge(gender_percent_df, left_index = True, right_index = True)
gender_summary_df.columns = ['Total Count', 'Percentage of Players']
gender_summary_df
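As an illustration of the count-and-percentage step, the following sketch builds the same two-column table with `pd.concat` instead of a merge. The five-row sample and its screen names are hypothetical.

```python
import pandas as pd

# Hypothetical sample: player "a1" made two purchases.
players = pd.DataFrame({
    "SN": ["a1", "b2", "c3", "d4", "a1"],
    "Gender": ["Male", "Male", "Female", "Male", "Male"],
})

# Keep one row per player so repeat buyers are not over-counted.
unique_players = players.drop_duplicates(subset="SN")

counts = unique_players["Gender"].value_counts()
percentages = (unique_players["Gender"].value_counts(normalize=True) * 100).round(2)

# Both Series share the gender index, so they line up side by side;
# `keys` supplies the column names.
gender_summary = pd.concat(
    [counts, percentages],
    axis=1,
    keys=["Total Count", "Percentage of Players"],
)
print(gender_summary)
```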

Purchasing Analysis (Gender)

Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender
Create a summary data frame to hold the results
Optional: give the displayed data cleaner formatting
Display the summary data frame

df_gender_2 = df.groupby('Gender')

purchase_count = df_gender_2['Purchase ID'].count()
avg_purchase_price = round(df_gender_2['Price'].mean(), 2)
total_purchase_value = round(df_gender_2['Price'].sum(), 2)
# Divide by unique players per gender (gender_count from Gender Demographics), not by purchase count
purchase_value_per_gender = round(total_purchase_value / gender_count, 2)

summary_dataframe2 = pd.DataFrame([purchase_count, avg_purchase_price, total_purchase_value, purchase_value_per_gender])
summary2 = summary_dataframe2.T
summary2.columns = ['Purchase Count', 'Average Purchase Price', 'Total Purchase Value', 'Avg Total Purchase per Person']
summary2
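The per-gender table can also be built in one pass with named aggregation, which avoids transposing a list of Series. This is a minimal sketch on a hypothetical five-row sample, not the notebook's own method; the per-person figure divides by unique players, as above.

```python
import pandas as pd

# Hypothetical sample: players "a1" and "c3" each bought twice.
purchases = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4],
    "SN": ["a1", "a1", "b2", "c3", "c3"],
    "Gender": ["Male", "Male", "Female", "Male", "Male"],
    "Price": [1.00, 2.00, 4.00, 3.00, 2.00],
})

# Named aggregation: output column -> (input column, aggregation).
by_gender = purchases.groupby("Gender").agg(**{
    "Purchase Count": ("Purchase ID", "count"),
    "Average Purchase Price": ("Price", "mean"),
    "Total Purchase Value": ("Price", "sum"),
})

# Spending per person uses the number of distinct players, not purchases.
players_per_gender = purchases.groupby("Gender")["SN"].nunique()
by_gender["Avg Total Purchase per Person"] = (
    by_gender["Total Purchase Value"] / players_per_gender
)
by_gender = by_gender.round(2)
print(by_gender)
```

In this sample the two male players made four purchases totalling 8.00, so the average purchase price is 2.00 while the average total per person is 4.00.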

Age Demographics

Establish bins for ages
Categorize the existing players using the age bins. Hint: use pd.cut()
Calculate the numbers and percentages by age group
Create a summary data frame to hold the results
Optional: round the percentage column to two decimal points
Display Age Demographics Table

bins = [0, 9, 14, 19, 24, 29, 34, 39, 150]
bin_labels = ['<10', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39', '40+']

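df["Age Group"] = pd.cut(df["Age"], bins, labels=bin_labels)
df_age1 = df[['Age Group','SN']].drop_duplicates(subset = 'SN')

# observed=False keeps every age bin in the result, including empty ones
age_demographics_summary = df_age1.groupby("Age Group", observed=False).count()
age_counts = age_demographics_summary['SN']
# Use the computed player total instead of a hard-coded 576
age_demographics_percentages = round(age_counts / age_counts.sum() * 100, 2)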
age_demographics_percentages

summary_dataframe3 = pd.DataFrame([age_counts, age_demographics_percentages])

summary_data = summary_dataframe3.T

summary_data.columns = ['Total Count', 'Percentage of Players']

summary_data.head()
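To make the binning concrete, the sketch below runs `pd.cut` with the same `bins` and `bin_labels` on a handful of hypothetical ages. By default each bin includes its right edge and excludes its left edge, so 9 falls in `<10` and 10 falls in `10-14`.

```python
import pandas as pd

bins = [0, 9, 14, 19, 24, 29, 34, 39, 150]
bin_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Hypothetical ages chosen to sit on or near the bin edges.
ages = pd.Series([7, 9, 10, 24, 25, 40])

# Intervals are (0, 9], (9, 14], ..., (39, 150].
age_groups = pd.cut(ages, bins, labels=bin_labels)
print(age_groups.tolist())
```

Because the first interval is open on the left, an age of exactly 0 would not be assigned a group unless `include_lowest=True` is passed.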

Purchasing Analysis (Age)

Bin the purchase_data data frame by age
Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below
Create a summary data frame to hold the results
Optional: give the displayed data cleaner formatting
Display the summary data frame

bins = [0, 9, 14, 19, 24, 29, 34, 39, 150]
bin_labels = ['<10', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39', '40+']

df["Age Group"] = pd.cut(df["Age"], bins, labels=bin_labels)
less_rows = df[['Age Group','SN', 'Price']]

less_rows_grouped = less_rows.groupby('Age Group', observed=False)
purchase_counts = less_rows_grouped['Price'].count()
average_prices = round(less_rows_grouped['Price'].mean(),2)
total_spent = round(less_rows_grouped['Price'].sum(), 2)
spending_per_person = round(total_spent/age_counts,2)


summary4 = pd.DataFrame([purchase_counts, average_prices, total_spent, spending_per_person])
summary4b = summary4.T
summary4b.columns = ['Purchase Count', 'Average Purchase Price', 'Total Purchase Value', 'Avg Total Purchase per Person']
summary4b.head()

Top Spenders

Run basic calculations to obtain the results in the table below
Create a summary data frame to hold the results
Sort the total purchase value column in descending order
Optional: give the displayed data cleaner formatting
Display a preview of the summary data frame

df_sn_2 = df.groupby('SN')
purchase_counts = df_sn_2['Purchase ID'].count()
average_spending = round(df_sn_2['Price'].mean(),2)
total_purchase = round(df_sn_2['Price'].sum(),2)

summary6 = pd.DataFrame([purchase_counts, average_spending, total_purchase])
summary7 = summary6.T
summary7.columns = ['Purchase Count', 'Average Purchase Price', 'Total Purchase Value']
summary7.sort_values('Total Purchase Value', ascending=False).reset_index().head()
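Stacking Series into a data frame and transposing it upcasts the purchase counts to floats (hence values such as 5.0 in the output). A possible alternative, sketched on a hypothetical six-row sample, aggregates the `Price` column directly so the count stays an integer.

```python
import pandas as pd

# Hypothetical sample: "a1" bought three times, "b2" twice, "c3" once.
purchases = pd.DataFrame({
    "SN": ["a1", "b2", "a1", "c3", "b2", "a1"],
    "Price": [4.00, 1.50, 3.00, 2.00, 2.50, 1.00],
})

# One aggregation call yields count, mean and sum per player.
spenders = purchases.groupby("SN")["Price"].agg(["count", "mean", "sum"]).round(2)
spenders.columns = ["Purchase Count", "Average Purchase Price", "Total Purchase Value"]

# Highest total spend first; keep the top two for the preview.
top_spenders = spenders.sort_values("Total Purchase Value", ascending=False).head(2)
print(top_spenders)
```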

Most Popular Items

Retrieve the Item ID, Item Name, and Item Price columns
Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value
Create a summary data frame to hold the results
Sort the purchase count column in descending order
Optional: give the displayed data cleaner formatting
Display a preview of the summary data frame

df_sn_3 = df.groupby(['Item ID', 'Item Name'])
purchase_counts2 = df_sn_3['Purchase ID'].count()
average_spending2 = round(df_sn_3['Price'].mean(),2)
total_purchase2 = round(df_sn_3['Price'].sum(),2)

# Use a new name here so the Top Spenders table (summary7) is not overwritten
item_summary = pd.DataFrame([purchase_counts2, average_spending2, total_purchase2])
summary8 = item_summary.T
summary8.columns = ['Purchase Count', 'Item Price', 'Total Purchase Value']
summary9 = summary8.sort_values('Purchase Count', ascending=False)
summary9.head()

Most Profitable Items

Sort the above table by total purchase value in descending order
Optional: give the displayed data cleaner formatting
Display a preview of the data frame

summary8.sort_values('Total Purchase Value', ascending=False).head()
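The most popular item and the most profitable item need not be the same, because a cheap item can sell often. The sketch below shows this with one grouped table sorted two ways. The item names and prices are taken from the data, but the purchase counts are hypothetical.

```python
import pandas as pd

# Hypothetical purchase counts: the cheap item sells most often.
purchases = pd.DataFrame({
    "Item ID": [19, 19, 19, 82, 82, 178],
    "Item Name": [
        "Pursuit, Cudgel of Necromancy",
        "Pursuit, Cudgel of Necromancy",
        "Pursuit, Cudgel of Necromancy",
        "Nirvana",
        "Nirvana",
        "Oathbreaker, Last Hope of the Breaking Storm",
    ],
    "Price": [1.02, 1.02, 1.02, 4.90, 4.90, 4.23],
})

# Grouping by both keys gives a (Item ID, Item Name) MultiIndex.
items = purchases.groupby(["Item ID", "Item Name"])["Price"].agg(["count", "first", "sum"])
items.columns = ["Purchase Count", "Item Price", "Total Purchase Value"]
items = items.round(2)

most_popular = items.sort_values("Purchase Count", ascending=False)
most_profitable = items.sort_values("Total Purchase Value", ascending=False)
print(most_popular.head())
print(most_profitable.head())
```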

	Purchase ID	SN	Age	Gender	Item ID	Item Name	Price
0	0	Lisim78	20	Male	108	Extraction, Quickblade Of Trembling Hands	3.53
1	1	Lisovynya38	40	Male	143	Frenzied Scimitar	1.56
2	2	Ithergue48	24	Male	92	Final Critic	4.88
3	3	Chamassasya86	24	Male	100	Blindscythe	3.27
4	4	Iskosia90	23	Male	131	Fury	1.44
5	5	Yalae81	22	Male	81	Dreamkiss	3.61
6	6	Itheria73	36	Male	169	Interrogator, Blood Blade of the Queen	2.18
7	7	Iskjaskst81	20	Male	162	Abyssal Shard	2.67
8	8	Undjask33	22	Male	21	Souleater	1.10
9	9	Chanosian48	35	Other / Non-Disclosed	136	Ghastly Adamantite Protector	3.58

Gender	Purchase Count	Average Purchase Price	Total Purchase Value	Avg Total Purchase per Person
Female	113.0	3.20	361.94	4.47
Male	652.0	3.02	1967.64	4.07
Other / Non-Disclosed	15.0	3.35	50.19	4.56

Age Group	Total Count	Percentage of Players
<10	17.0	2.95
10-14	22.0	3.82
15-19	107.0	18.58
20-24	258.0	44.79
25-29	77.0	13.37

Age Group	Purchase Count	Average Purchase Price	Total Purchase Value	Avg Total Purchase per Person
<10	23.0	3.35	77.13	4.54
10-14	28.0	2.96	82.78	3.76
15-19	136.0	3.04	412.89	3.86
20-24	365.0	3.05	1114.06	4.32
25-29	101.0	2.90	293.00	3.81

	SN	Purchase Count	Average Purchase Price	Total Purchase Value
0	Lisosia93	5.0	3.79	18.96
1	Idastidru52	4.0	3.86	15.45
2	Chamjask73	3.0	4.61	13.83
3	Iral74	4.0	3.40	13.62
4	Iskadarya95	3.0	4.37	13.10

Item ID	Item Name	Purchase Count	Item Price	Total Purchase Value
178	Oathbreaker, Last Hope of the Breaking Storm	12.0	4.23	50.76
145	Fiery Glass Crusader	9.0	4.58	41.22
108	Extraction, Quickblade Of Trembling Hands	9.0	3.53	31.77
82	Nirvana	9.0	4.90	44.10
19	Pursuit, Cudgel of Necromancy	8.0	1.02	8.16