Write code using find() and string slicing to extract the number at the end of the line below. Convert the extracted value to a floating point number and print it out.
text = "X-DSPAM-Confidence: 0.8475"
colon_pos = text.find(':')  # index of the colon
after_colon = text[colon_pos + 1:]  # text after the colon, leading space included
number = after_colon.strip()  # drop surrounding whitespace
print(float(number))
# float() also ignores surrounding whitespace, so strip() is optional
ans = float(after_colon)
print(ans)
# Alternative using split() and join()
num_parts = after_colon.split()  # list holding the number as a string
num = "".join(num_parts)  # join the list back into a single string
print(num)
num_f = float(num)
print(num_f)
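The find-and-slice steps above can also be wrapped in a small function. This is only a sketch, not part of the exercise: the name extract_confidence is hypothetical, and returning None for a line without a colon is an assumption about how that case should be handled (find() returns -1 when the substring is absent).

```python
def extract_confidence(line):
    """Return the number after the first colon as a float, or None if there is no colon."""
    colon_pos = line.find(':')
    if colon_pos == -1:
        # find() returns -1 when ':' does not occur in the line
        return None
    # float() ignores surrounding whitespace, so no strip() is needed
    return float(line[colon_pos + 1:])


print(extract_confidence("X-DSPAM-Confidence: 0.8475"))
print(extract_confidence("no colon here"))
```

Without the -1 check, a line with no colon would be sliced from index 0 and float() would raise a ValueError on the whole line.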
=============================================================================================
Write a program that prompts for a file name, then opens that file, reads through it, and prints the contents of the file in upper case. Use the file words.txt to produce the output below.
When testing, enter words.txt as the file name.
fname = input('Enter the file name: ')
fhandle = open(fname)
for line in fhandle:
    # rstrip() removes the trailing newline so print() does not add a blank line
    print(line.rstrip().upper())
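The solution above never closes the file. One possible variant, sketched here under assumptions, opens the file in a with block so that it is closed automatically. The function name upper_lines is hypothetical, and the temporary file with made-up contents only stands in for words.txt so that the sketch runs on its own.

```python
import os
import tempfile


def upper_lines(path):
    """Return the lines of the file at path, upper-cased and without trailing whitespace."""
    with open(path) as fhandle:  # the file is closed when the block exits
        return [line.rstrip().upper() for line in fhandle]


# Stand-in for words.txt: a temporary file with made-up contents
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("hello world\nsecond line\n")
    sample_path = tmp.name

result = upper_lines(sample_path)
os.remove(sample_path)  # clean up the stand-in file

for line in result:
    print(line)
```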
Write a program that prompts for a file name, then opens that file and reads through the file, looking for lines of the form:
X-DSPAM-Confidence: 0.8475
Count these lines, extract the floating point value from each one, compute the average of those values, and produce the output shown below. Do not use the sum() function or a variable named sum in your solution.
When testing, enter mbox-short.txt as the file name.
fname = input('Enter the file name: ')
fhandle = open(fname)
count = 0
total = 0
for line in fhandle:
    if 'X-DSPAM-Confidence:' not in line:
        continue
    colon_pos = line.find(':')
    num = float(line[colon_pos + 1:])  # float() ignores the leading space and the newline
    count = count + 1
    total = total + num
avg = total / count
print('Average spam confidence:', avg)
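The solution above divides by count, which raises ZeroDivisionError if the file has no matching lines. A minimal sketch of a guarded version follows. The function name average_confidence, the choice to return None for an empty result, and the sample lines are all assumptions made for illustration. It also matches with startswith() rather than in, which is slightly stricter.

```python
def average_confidence(lines):
    """Average the values on 'X-DSPAM-Confidence:' lines, or return None if there are none."""
    count = 0
    total = 0.0
    for line in lines:
        if not line.startswith('X-DSPAM-Confidence:'):
            continue
        colon_pos = line.find(':')
        total = total + float(line[colon_pos + 1:])
        count = count + 1
    if count == 0:
        # no matching lines, so there is no average to report
        return None
    return total / count


# Made-up sample lines standing in for the contents of a mail file
sample = [
    "X-DSPAM-Confidence: 0.5\n",
    "Subject: hello\n",
    "X-DSPAM-Confidence: 1.0\n",
]
print('Average spam confidence:', average_confidence(sample))
```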
===============================================================================================
Open the file romeo.txt and read it line by line. For each line, split the line into a list of words using the split() method. The program should build a list of words. For each word on each line check to see if the word is already in the list and if not append it to the list. When the program completes, sort and print the resulting words in alphabetical order.
fhandle = open('romeo.txt')
lst = list()
for line in fhandle:
    words = line.split()
    for word in words:
        # append only words that are not already in the list
        if word not in lst:
            lst.append(word)
lst.sort()
print(lst)
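The exercise asks for a list with a membership check. For comparison, the same result can be sketched with a set, which discards duplicates on its own. This is a swapped-in technique, not what the exercise prescribes. The function name unique_sorted_words and the sample lines are made up for illustration.

```python
def unique_sorted_words(lines):
    """Collect each distinct word once and return the words in sorted order."""
    seen = set()
    for line in lines:
        # update() adds every word from the line; duplicates are ignored
        seen.update(line.split())
    return sorted(seen)


# Made-up sample lines, not the contents of romeo.txt
sample = ["the quick fox\n", "the lazy dog\n", "quick Dog\n"]
print(unique_sorted_words(sample))
```

As with lst.sort() above, the ordering is case-sensitive, so capitalized words sort before lowercase ones.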
Open the file mbox-short.txt and read it line by line. When you find a line that starts with 'From ' like the following line:
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
You will parse the From line using split() and print out the second word in the line (i.e. the entire address of the person who sent the message). Then print out a count at the end.
Hint: make sure not to include the lines that start with 'From:'.
fname = input('Enter the file name: ')
fhandle = open(fname)
count = 0
for line in fhandle:
    # the trailing space in 'From ' excludes lines that start with 'From:'
    if not line.startswith('From '):
        continue
    words = line.split()
    print(words[1])
    count = count + 1
print('There were', count, 'lines in the file with From as the first word')
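The hint about 'From:' lines can be checked on a few lines held in memory. The sketch below assumes a hypothetical helper named senders. The first sample line is the one quoted in the exercise, and the other two are made up.

```python
def senders(lines):
    """Return the second word of every line that starts with 'From ' (note the space)."""
    result = []
    for line in lines:
        # 'From:' header lines fail this test because of the trailing space
        if not line.startswith('From '):
            continue
        words = line.split()
        if len(words) > 1:  # skip a malformed line with no address
            result.append(words[1])
    return result


sample = [
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n",
    "From: stephen.marquard@uct.ac.za\n",
    "Subject: test\n",
]
found = senders(sample)
for address in found:
    print(address)
print('There were', len(found), 'lines in the file with From as the first word')
```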
==============================================================================================
Write a program to read through the mbox-short.txt and figure out who has sent the greatest number of mail messages. The program looks for 'From ' lines and takes the second word of those lines as the person who sent the mail. The program creates a Python dictionary that maps the sender's mail address to a count of the number of times they appear in the file. After the dictionary is produced, the program reads through the dictionary using a maximum loop to find the most prolific committer.
fname = input('Enter the file name: ')
fhandle = open(fname)
reg_mailer = dict()  # maps each sender address to its message count
for line in fhandle:
    # the trailing space in 'From ' excludes lines that start with 'From:'
    if not line.startswith('From '):
        continue
    words = line.split()
    mail = words[1]
    reg_mailer[mail] = reg_mailer.get(mail, 0) + 1
# print every sender whose count equals the maximum
a = max(reg_mailer.values())
for key, value in reg_mailer.items():
    if value == a:
        print(key, a)
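The exercise text mentions a maximum loop, while the solution above relies on the built-in max(). A sketch of the explicit loop is given here for comparison. The function name most_frequent and the sample counts are hypothetical. On a tie, this version keeps the first key it encounters.

```python
def most_frequent(counts):
    """Return (key, count) for the largest count in the dictionary, or (None, None) if empty."""
    best_key = None
    best_count = None
    for key, count in counts.items():
        # the first item always wins; later items must be strictly larger
        if best_count is None or count > best_count:
            best_key = key
            best_count = count
    return best_key, best_count


# Made-up counts for illustration
sample_counts = {'a@example.com': 2, 'b@example.com': 5, 'c@example.com': 3}
top_sender, top_count = most_frequent(sample_counts)
print(top_sender, top_count)
```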
===============================================================================================
Write a program to read through the mbox-short.txt and figure out the distribution by hour of the day for each of the messages.
You can pull the hour out from the 'From ' line by finding the time and then splitting the string a second time using a colon.
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Once you have accumulated the counts for each hour, print out the counts, sorted by hour as shown below.
fname = input('Enter the file name: ')
fhandle = open(fname)
hour_counts = dict()  # maps an hour string such as '09' to its message count
for line in fhandle:
    # the trailing space in 'From ' excludes lines that start with 'From:'
    if not line.startswith('From '):
        continue
    words = line.split()
    time = words[5]  # for example '09:14:16'
    hour = time.split(':')[0]
    hour_counts[hour] = hour_counts.get(hour, 0) + 1
for k, v in sorted(hour_counts.items()):
    print(k, v)
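The two splits described in the exercise (first on whitespace, then on the colon) can be tried on lines held in memory. In this sketch the function name hour_distribution is hypothetical, and every sample line except the first, which is quoted from the exercise, is made up.

```python
def hour_distribution(lines):
    """Return (hour, count) pairs sorted by hour for lines that start with 'From '."""
    counts = {}
    for line in lines:
        if not line.startswith('From '):
            continue
        words = line.split()
        if len(words) < 6:
            # skip a malformed line with no time field
            continue
        hour = words[5].split(':')[0]  # '09:14:16' becomes '09'
        counts[hour] = counts.get(hour, 0) + 1
    return sorted(counts.items())


sample = [
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n",
    "From a@example.com Fri Jan 4 18:10:48 2008\n",
    "From b@example.com Fri Jan 4 09:02:00 2008\n",
    "From: a@example.com\n",
]
for hour, count in hour_distribution(sample):
    print(hour, count)
```

Sorting the hour strings works because they are zero-padded to two digits, so string order matches numeric order.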
==============================================================================================