Assignment 10 - ldkvd/CS101L GitHub Wiki
Word Count
In this exam, you will ask the user for a text file to read. Read all the words in the file and output a count of the words that are used the most. Only words longer than 3 characters are counted.
- Remove any punctuation from the beginning and end of each word. For example, in the previous sentence "word." should be counted as "word", not as "word" followed by a period.
- Output the 10 most frequently used words, with the most frequent at the top. Exclude all words of 3 characters or fewer.
- Output the number of words that appear only once. (Only words longer than 3 characters)
- Output how many unique words there are. (Only words more than 3 characters)
- Recover gracefully from the user providing an invalid file.
- You may not finish it all, but work out as much as you can in an orderly manner. Plan your time efficiently.
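The punctuation and length requirements above can be sketched as follows. This is a minimal illustration, not the graded solution: the helper name clean_words and its min_length parameter are hypothetical, and lower-casing the words is an assumption the assignment does not state. Note that string.punctuation covers ASCII punctuation only, so typographic quotes such as ’ would need to be added separately.

```python
import string


def remove_pun(word):
    """Strip punctuation from both ends of a word; inner characters are kept."""
    return word.strip(string.punctuation)


def clean_words(text, min_length=4):
    """Hypothetical helper: clean every word and keep those longer than 3 characters."""
    # Lower-casing is an assumption so that "Words" and "words" count together.
    cleaned = (remove_pun(word).lower() for word in text.split())
    return [word for word in cleaned if len(word) >= min_length]


sample = "Words, words... and more WORDS. Don't stop!"
print(clean_words(sample))
# ['words', 'words', 'more', 'words', "don't", 'stop']
```

Because str.strip() only touches the ends of the string, the apostrophe inside "Don't" survives while the trailing "!" and "." are removed.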
Answer:
The image above is source code 1/4.
The image above is source code 2/4.
The image above is source code 3/4.
The image above is source code 4/4.
The image above is the output.
In this code, multiple functions are defined and called. Functions let the same piece of code run multiple times, which reduces clutter, complexity, and duplication. They break a large program down into smaller, easier-to-manage components: we can call one function rather than writing the same code over and over. Within some of the functions, try and except blocks are used to handle possible errors.
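As an illustration of a function that uses try and except to recover from an invalid file, here is a minimal sketch. It is not the code from the images: read_words is a hypothetical helper, and the ask parameter on open_file is an assumption added so the prompt can be replaced when testing.

```python
def read_words(filename):
    """Return the file's whitespace-separated words, or None if it cannot be read."""
    try:
        with open(filename, encoding="utf-8") as handle:
            return handle.read().split()
    except (OSError, UnicodeDecodeError) as error:
        # OSError covers missing files, directories, and permission problems;
        # UnicodeDecodeError covers files that are not valid UTF-8 text.
        print(f"Could not read {filename!r}: {error}")
        return None


def open_file(ask=input):
    """Keep prompting until the user supplies a readable text file."""
    while True:
        words = read_words(ask("Enter the name of a text file: "))
        if words is not None:
            return words
```

Calling open_file() with no arguments prompts on the console. Returning None from read_words instead of letting the exception escape is one possible design; it keeps the retry loop free of error-handling code.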
In the main program, the function open_file() is called to ask the user for a text file to read. The function splits the file's text into a list of words and returns that list. A list comprehension then calls remove_pun() on each word in the list, which returns the word with its punctuation removed. A for loop iterates through the list and removes the words that are too short to count (the assignment excludes words of three characters or fewer).

Next, get_word_count() is called. It counts the words and returns a dictionary with each word as the key and the number of times it appears in the text file as the value. Then get_top_ten() is called, which sorts the dictionary and prints the ten most frequent words and their counts in a fixed format. The function get_occur_once() returns the number of words that occur only once in the text file, and a message is printed displaying that total. Finally, get_unqiue() returns the number of unique words in the text file, and a message is printed displaying that total.
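The counting steps described above could look roughly like the sketch below. It is an approximation rather than the code in the images: it uses collections.Counter (a dict subclass) in place of a hand-built dictionary, get_top_ten returns the pairs instead of printing them, ties are broken alphabetically by assumption, and the last function is spelled get_unique here.

```python
from collections import Counter


def get_word_count(words):
    """Map each word to the number of times it appears."""
    return Counter(words)


def get_top_ten(counts):
    """Return up to ten (word, count) pairs, most frequent first."""
    # Sort by descending count; ties fall back to alphabetical order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:10]


def get_occur_once(counts):
    """Return how many words appear exactly once."""
    return sum(1 for count in counts.values() if count == 1)


def get_unique(counts):
    """Return how many distinct words there are."""
    return len(counts)


# The words are assumed to be cleaned and length-filtered already.
words = ["that", "with", "that", "word", "that", "with", "count"]
counts = get_word_count(words)
for word, count in get_top_ten(counts):
    print(f"{word:<10}{count:>5}")
print("Words used only once:", get_occur_once(counts))
print("Unique words:", get_unique(counts))
```

For the sample list, "that" appears three times and "with" twice, so they lead the ranking; "count" and "word" each appear once, giving 2 single-use words out of 4 unique words.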