Summary

I was given a 63 MB CSV dataset with three columns, only one of which mattered for this task. There was no missing data. Each value in that column was the name of a candidate. Using Python without Pandas, I was asked to produce output like the following.

Output example

The solution builds two dictionaries. A for-loop inside a with block tallies the votes for each candidate, a dictionary comprehension converts those tallies to percentages, and max() picks the candidate with the most votes. The script then writes the results to a text file.
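
The tallying step can be tried on a small in-memory sample before running it against the full file. This is only a sketch: the sample rows, the header names, and the helper name tally_votes are made up for illustration. The only things taken from the solution are the column position (candidate name in the third column) and the single row skipped at the top.

```python
import csv
import io

# Hypothetical sample standing in for election_data.csv:
# one header row followed by five made-up votes.
SAMPLE = """Voter ID,County,Candidate
1,A,Khan
2,B,Correy
3,A,Khan
4,C,Li
5,B,Khan
"""


def tally_votes(csvfile, column=2):
    """Count how often each name appears in the given column."""
    reader = csv.reader(csvfile)
    next(reader)  # skip the header row so it is not counted as a vote
    counts = {}
    for row in reader:
        # get() returns 0 the first time a name is seen
        counts[row[column]] = counts.get(row[column], 0) + 1
    return counts


with io.StringIO(SAMPLE) as csvfile:
    votes = tally_votes(csvfile)

print(votes)  # {'Khan': 3, 'Correy': 1, 'Li': 1}
```

Because the function takes any file-like object, the same call should work with a real file opened via open(csvpath, newline='').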


Solution



print("PyPollChallenge")

import os
import csv
import operator
csvpath = os.path.join('election_data.csv')

# Initialize my empty dictionaries, which will be populated via a for-loop.
# The first dictionary houses the votes, and the second one is the percentage conversion of that same information.

competitors = {}
competitors2 = {}


with open(csvpath, newline='') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',')

    # next() consumes the first row, which is treated as the header,
    # so it is not counted as a vote.
    csv_header = next(csvreader)

    # Tally the votes per candidate (third column). dict.get returns 0 when
    # the name has no key yet, so the first vote creates the key and every
    # later vote adds one to its count.
    for row in csvreader:
        competitors[row[2]] = competitors.get(row[2], 0) + 1

# Convert each vote count to a percentage of the total.
# The total stays an integer so it prints without a trailing ".0".

total_count = sum(competitors.values())
competitors2 = {k: v / total_count * 100 for k, v in competitors.items()}


winner = max(competitors.items(), key=operator.itemgetter(1))[0]
winner_print = 'Winner: ' + str(winner)
total_votes_print = 'Total Votes: ' + str(total_count)


# Format each line as "Name: percentage% (votes)" with three decimal places.
khan_votes = f"Khan: {competitors2['Khan']:.3f}% ({competitors['Khan']})"
correy_votes = f"Correy: {competitors2['Correy']:.3f}% ({competitors['Correy']})"
li_votes = f"Li: {competitors2['Li']:.3f}% ({competitors['Li']})"
oTooley_votes = f"O'Tooley: {competitors2[chr(39).join(['O', 'Tooley'])]:.3f}% ({competitors[chr(39).join(['O', 'Tooley'])]})"



cache = []
cache.append("Election Results")
cache.append("-------------------------")
cache.append("Total Votes: " + str(int(total_count)))
cache.append("-------------------------")
cache.append(khan_votes)
cache.append(correy_votes)
cache.append(li_votes)
cache.append(oTooley_votes)
cache.append("-------------------------")
cache.append("Winner: " + str(winner))
cache.append("-------------------------")

# Print the report one line at a time rather than as a Python list.
print("\n".join(cache))

with open('output2.txt', 'w') as output:
    for line in cache:
        output.write(line + "\n")
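
The report lines above name each candidate by hand. As a possible variation, the same lines could be built in a loop so the script does not depend on knowing the names in advance. This is a sketch, not the submitted solution: the function name build_report and the small vote counts are invented for illustration.

```python
def build_report(counts):
    """Return the report as a list of lines, one entry per candidate."""
    total = sum(counts.values())
    rule = "-------------------------"
    lines = ["Election Results", rule, f"Total Votes: {total}", rule]
    for name, votes in counts.items():
        # Share of the total as a percentage, shown with three decimals
        percent = votes / total * 100
        lines.append(f"{name}: {percent:.3f}% ({votes})")
    # max() over the keys, ranked by their vote count
    winner = max(counts, key=counts.get)
    lines += [rule, f"Winner: {winner}", rule]
    return lines


# Made-up counts, not the real election totals.
sample_counts = {"Khan": 6, "Correy": 2, "Li": 1, "O'Tooley": 1}
report = build_report(sample_counts)
print("\n".join(report))
```

The returned list can be written to a file with the same loop used for cache above.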