Writing a Python program that reads a text file and works out the distribution of messages by hour of the day

Hi guys,

Hope you are doing great and keeping safe. I am writing a blog post after a long time, and this time it's a bit different. I moved from Singapore to Milan and could not find the time to write. We will go through a program I came across while reading a book, for which I could not find a satisfactory answer on Stack Overflow or GitHub. So I decided to publish my own version, taking advantage of the blog I have 🙂. In case anyone needs help, it is a good alternative to both of the sources mentioned above.

We need to count how many times each hour of the day appears and keep a counter for it. Every message in the file has a line like the example below, which includes the time the email was sent.

From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2020

The actual question asks us to write a program that accepts a text file as input, goes through that file, finds the hour at which each email was sent, and then counts the messages per hour using dictionaries, tuples and lists.
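Before the full program, here is a minimal sketch of how the hour can be pulled out of a single line. It assumes every line of interest has the same layout as the example above, with the time as the sixth whitespace-separated field. The helper name hour_of is my own and is not part of the exercise.

```python
# Sample line taken from the example above.
sample = "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2020"


def hour_of(line):
    """Return the hour part of a 'From ' line (hypothetical helper name)."""
    fields = line.split()        # split() with no argument collapses repeated spaces
    time_field = fields[5]       # assumed position of the time, e.g. '09:14:16'
    return time_field.split(":")[0]


print(hour_of(sample))  # prints 09
```

Note that the hour comes back as a string ("09"), not a number. That is fine here, because zero-padded hours sort correctly as text.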

fname = input("Enter a file name: ")   # Ask the user for the name of the text file
if len(fname) < 1:                     # If no file name is entered, fall back to a default file name
    fname = "mbox-short.txt"
op = open(fname)                       # Open the text file with the built-in open() function

lis = []   # Empty list, used later to hold the (hour, count) tuples
dic = {}   # Empty dictionary, used to count the messages per hour

# We now run a for loop over each line in the text file. We want to ignore the "From:" lines, which have a colon at the end and are not the ones the exercise asks for.
for line in op:
    if line.startswith('From:'):
        continue

    # The second branch holds the main logic. It picks up lines starting with "From" and splits each one into fields. Index 5 gives the time in hours, minutes and seconds. We split the time again on ":" and take index 0, which is the hour.

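    elif line.startswith('From'):
        x = line.split()     # Split the line into whitespace-separated fields
        y = x[5]             # The sixth field is the time, e.g. 09:14:16
        z = y.split(':')     # Split the time into hours, minutes and seconds
        m = z[0]             # The hour is the first piece
        #print(m)

        dic[m] = dic.get(m, 0) + 1   # get() returns 0 for an hour not seen yet, so the dictionary works as a counter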

#print(dic)                # You can test your code up to here by running this print command.

for k, v in dic.items():   # Loop through the dictionary and build a tuple (newtu) for each hour.
    #print(k,v)
    newtu = (k, v)
    lis.append(newtu)      # Add each tuple to the list we created earlier.
lis = sorted(lis)          # Sort the list, because the results are needed in order of hour.

#print(lis)

for k, v in lis:           # Loop through the sorted list and print the results in the required format.
    print(k, v)
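For comparison, here is a shorter sketch of the same idea using collections.Counter from the standard library. This is an alternative I am adding for illustration, not what the exercise asks for, since the exercise wants dictionaries, tuples and lists. The function name count_hours is my own, and apart from the first entry the sample lines are made up for the demo.

```python
from collections import Counter


def count_hours(lines):
    """Count messages per hour from an iterable of lines (hypothetical helper name)."""
    hours = Counter()
    for line in lines:
        # 'From ' with a trailing space skips the 'From:' header lines.
        if not line.startswith("From "):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # skip lines that do not have a time field
        hours[fields[5].split(":")[0]] += 1
    return hours


# First line is the example from this post; the others are made up for the demo.
sample_lines = [
    "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2020",
    "From: stephen.marquard@uct.ac.za",
    "From someone@example.com Sat Jan  5 09:30:00 2020",
    "From someone@example.com Sat Jan  5 18:10:48 2020",
]

for hour, count in sorted(count_hours(sample_lines).items()):
    print(hour, count)
```

To run it on a real file, you could pass an open file instead of the list, for example inside a with open(fname) as handle: block, which also closes the file for you when the block ends.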
 

I hope this explanation helps you understand the problem.
