Programming for Everybody: Assignment 07.2 Files - edorlando07/datasciencecoursera GitHub Wiki
### Python Data Structures
7.2 Write a program that prompts for a file name, then opens that file and reads through the file, looking for lines of the form:
X-DSPAM-Confidence: 0.8475
Count these lines, extract the floating-point value from each one, compute the average of those values, and produce the output shown below. Do not use the sum() function or a variable named sum in your solution. You can download the sample data at http://www.pythonlearn.com/code/mbox-short.txt. When testing, enter mbox-short.txt as the file name.
A sample of the file structure is listed below:
Received: (from apache@localhost)
by nakamura.uits.iupui.edu (8.12.11.20060308/8.12.11/Submit) id m05ECIaH010327
for [email protected]; Sat, 5 Jan 2008 09:12:18 -0500
Date: Sat, 5 Jan 2008 09:12:18 -0500
X-Authentication-Warning: nakamura.uits.iupui.edu: apache set sender to [email protected] using -f
To: [email protected]
From: [email protected]
Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/content-impl/impl/src/java/org/sakaiproject/content/impl
X-Content-Type-Outer-Envelope: text/plain; charset=UTF-8
X-Content-Type-Message-Body: text/plain; charset=UTF-8
Content-Type: text/plain; charset=UTF-8
X-DSPAM-Result: Innocent
X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0000
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772
Author: [email protected]
Date: 2008-01-05 09:12:07 -0500 (Sat, 05 Jan 2008)
New Revision: 39772
Modified:
content/branches/sakai_2-5-x/content-impl/impl/src/java/org/sakaiproject/content/impl/ContentServiceSqlOracle.java
content/branches/sakai_2-5-x/content-impl/impl/src/java/org/sakaiproject/content/impl/DbContentService.java
Log:
SAK-12501 merge to 2-5-x: r39622, r39624:5, r39632:3 (resolve conflict from differing linebreaks for r39622)
----------------------
This automatic notification message was sent by Sakai Collab (https://collab.sakaiproject.org/portal) from the Source site.
You can modify how you receive notifications at My Workspace > Preferences.
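Of all the headers in the excerpt above, only the X-DSPAM-Confidence line matters for this assignment. As a rough sketch of how the number can be pulled out of such a line, the snippet below uses `str.partition` instead of finding the colon and slicing. The helper name `parse_confidence` is hypothetical and is not part of the assignment.

```python
def parse_confidence(line):
    """Return the float after 'X-DSPAM-Confidence:', or None for any other line."""
    prefix = "X-DSPAM-Confidence:"
    if not line.startswith(prefix):
        return None
    # partition() splits on the first colon; the text after it holds the number
    _, _, value = line.partition(":")
    # float() ignores the leading space and any trailing newline
    return float(value)


# Four header lines copied from the excerpt above
sample = [
    "X-DSPAM-Result: Innocent",
    "X-DSPAM-Processed: Sat Jan 5 09:14:16 2008",
    "X-DSPAM-Confidence: 0.8475",
    "X-DSPAM-Probability: 0.0000",
]
for line in sample:
    print("%s -> %s" % (line, parse_confidence(line)))
```

Only the third line yields a number; the other three return None because they do not start with the exact prefix.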
The solution code, written for Python 2, is listed below:
fname = raw_input("Enter file name: ")
fh = open(fname)
count = 0
total = 0
for line in fh:
    line = line.rstrip()
    if not line.startswith("X-DSPAM-Confidence:"):
        continue
    atpos = line.find(':')
    # Take everything after the colon; float() ignores the leading space
    value = float(line[atpos + 1:])
    count = count + 1      # number of matching lines
    total = total + value  # running total of the confidence values
fh.close()
average = total / count
print "Average spam confidence: " + str(average)
Running the solution on the sample file produces the following output:
Enter file name: mbox-short.txt
Average spam confidence: 0.750718518519
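The solution relies on `raw_input` and the `print` statement, which exist only in Python 2. The sketch below shows one way the same logic might look in Python 3. The function name `average_confidence`, the guard for zero matching lines, and the second confidence value in `demo` are all additions for illustration, not part of the original solution. The 12 digits in the output above come from Python 2's `str()`, which rounds floats to 12 significant digits. Python 3 prints more digits by default, so the sketch uses `format(..., ".12g")` to match that style.

```python
def average_confidence(lines):
    """Average the values on 'X-DSPAM-Confidence:' lines without using sum()."""
    count = 0
    total = 0.0
    for line in lines:
        line = line.rstrip()
        if not line.startswith("X-DSPAM-Confidence:"):
            continue
        # Everything after the first colon is the number
        value = float(line[line.find(":") + 1:])
        count += 1
        total += value
    # Guard added for this sketch: avoid dividing by zero when nothing matches
    return total / count if count else None


# In-memory stand-in for a file; the second confidence value is made up
demo = [
    "X-DSPAM-Confidence: 0.8475\n",
    "X-DSPAM-Probability: 0.0000\n",
    "X-DSPAM-Confidence: 0.6178\n",
]
average = average_confidence(demo)
# ".12g" keeps 12 significant digits, like Python 2's str() on a float
print("Average spam confidence: " + format(average, ".12g"))
```

For the real file, pass the open file object in place of `demo`, with the file name read via `input()` rather than `raw_input()`.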