By: Aayush Srivastava
The data for this analysis was collected through an uninvigilated survey, which may have introduced some bias and unreliability. The author(s) are not responsible for any incorrect data or fallacies.
The data was collected through a simple MCQ test based on material generally taught in schools. Each respondent was also asked for some basic details: gender, age, nature and past academic record. The data was then sorted into smaller frames on the basis of these characteristics, for example age groups. The average score of each group was then calculated, and the group with the highest average was deemed the winner in its category.
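The procedure described above can be sketched on a small made-up table. The column names `age_group` and `score` and all values below are hypothetical placeholders, not the survey's real columns or responses; this is only a minimal illustration of the split, average and compare idea using pandas `groupby`.

```python
import pandas as pd

# Hypothetical toy data; NOT the survey responses.
toy = pd.DataFrame({
    'age_group': ['Below 20', 'Below 20', '20-30', '20-30'],
    'score': [4, 6, 7, 9],
})

# Average score of each group.
group_means = toy.groupby('age_group')['score'].mean()

# The group with the highest average is the "winner" of the category.
winner = group_means.idxmax()
print(group_means)
print('Winner:', winner)
```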
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
mdata=pd.read_csv('/home/aayush/Projects/Python/Do people remeber what they learned in school/data.csv')
mdata.head()
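def calc_avg(scores):
    '''Calculate the average score.'''
    total=0  # named total to avoid shadowing the built-in sum()
    for i in scores:
        # the score is the first whitespace-separated token of each entry
        total+=int(i.split()[0])
    return total/len(scores)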
def split_data(fra,col,val):
    '''Splits main data frame into smaller frames.'''
    # keep only the rows whose value in column col equals val
    return fra.loc[fra[col]==val]
def better(pos,*f):
    '''Finds the better scoring group.'''
    best=[0,0]  # [highest average so far, index of that group]
    for cnt,i in enumerate(f):
        avg=calc_avg(i['Score'])  # compute the average once per group
        print('Average score of '+str(pos[cnt])+' is:'+str(avg))
        if avg>best[0]:
            best[0]=avg
            best[1]=cnt
    return pos[best[1]]
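As a possible alternative to the explicit loop in calc_avg, the same average can be computed with pandas string methods. This sketch assumes, as calc_avg does, that each Score entry begins with the numeric score followed by whitespace. The name `calc_avg_vectorized` is hypothetical, and the sample strings are made up for illustration in an assumed `<score> / <total>` style.

```python
import pandas as pd

def calc_avg_vectorized(scores):
    '''Average of the leading integer in each score string.'''
    # Split on whitespace, take the first token, convert to int, then average.
    return scores.str.split().str[0].astype(int).mean()

# Made-up example entries; the real format of the Score column is an assumption.
sample = pd.Series(['7 / 10', '5 / 10', '9 / 10'])
print(calc_avg_vectorized(sample))
```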
The data is split into two groups: the age group '20-30', represented by data_20_30, and the group 'Below 20', represented by data_20.
age_col='Which age group you belong to?'
data_20_30=split_data(mdata,age_col,'20-30')
data_20=split_data(mdata,age_col,'Below 20')
p=['Below 20','20-30']
better(p,data_20,data_20_30)
From the above results we can say that people in age group 20-30 performed better than people younger than 20.
The data is split into two groups: 'Male', represented by data_male, and 'Female', represented by data_female.
gender_col='Which one of the following best describes you?'
data_male=split_data(mdata,gender_col,'Male')
data_female=split_data(mdata,gender_col,'Female')
p=['Male','Female']
better(p,data_male,data_female)
From the above results we can say that females performed better than males.
We split the data into three groups: Introverts, Extroverts and Ambiverts.
nature_col='Which one of the following best describes you?.1'
data_intro=split_data(mdata,nature_col,'Introvert')
data_extro=split_data(mdata,nature_col,'Extrovert')
data_ambi=split_data(mdata,nature_col,'Ambivert')
p=['Introvert','Extrovert','Ambivert']
better(p,data_intro,data_extro,data_ambi)
From the results we infer that Extroverts scored higher than the others.
Four groups are made of people with a Below Average, Average, Above Average and Extraordinary academic record.
record_col='How good were you in school?'
data_below_avg=split_data(mdata,record_col,'Below Average')
data_avg=split_data(mdata,record_col,'Average')
data_above_avg=split_data(mdata,record_col,'Above Average')
data_extra=split_data(mdata,record_col,'Extraordinary')
p=['Below Average','Average','Above Average','Extraordinary']
better(p,data_below_avg,data_avg,data_above_avg,data_extra)
Hence we infer that people with an Above Average academic record performed better than the others.
Adding a little complexity to the analysis so far, we split the data into six groups: three for men of different behaviours and three for women.
nature_col='Which one of the following best describes you?.1'
data_men_intro=split_data(data_male,nature_col,'Introvert')
data_men_extro=split_data(data_male,nature_col,'Extrovert')
data_men_ambi=split_data(data_male,nature_col,'Ambivert')
data_wom_intro=split_data(data_female,nature_col,'Introvert')
data_wom_extro=split_data(data_female,nature_col,'Extrovert')
data_wom_ambi=split_data(data_female,nature_col,'Ambivert')
p=['Male-Introvert','Male-Extrovert','Male-Ambivert','Female-Introvert','Female-Extrovert','Female-Ambivert']
better(p,data_men_intro,data_men_extro,data_men_ambi,data_wom_intro,data_wom_extro,data_wom_ambi)
Hence we infer that Introvert females perform better than other groups.
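A two-characteristic split like the one above could also be expressed as a single `groupby` on both columns. The sketch below uses a made-up table with hypothetical column names (`gender`, `nature`, `score`), not the survey's columns or responses. It also reports how many respondents fall into each combination, which may help in judging how much weight a group's average can bear.

```python
import pandas as pd

# Hypothetical toy data; NOT the survey responses.
toy = pd.DataFrame({
    'gender': ['Male', 'Male', 'Female', 'Female', 'Female'],
    'nature': ['Introvert', 'Extrovert', 'Introvert', 'Introvert', 'Ambivert'],
    'score': [5, 6, 8, 6, 4],
})

# Mean score and number of respondents for every gender/nature combination.
summary = toy.groupby(['gender', 'nature'])['score'].agg(['mean', 'count'])
print(summary)

# Label (a tuple) of the combination with the highest mean.
best_group = summary['mean'].idxmax()
print('Best group:', best_group)
```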
From the above analysis we can infer that: