Progress report 2 - ShelinaGobardhan/FOTD-Portfolio- GitHub Wiki
We started this project by creating frequency lists of all the words in the book and in the movie script, using this code:
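# Book: put every word on its own line, sort, then count identical words.
# Note: uniq -c writes "count word" per line, not comma-separated values.
tr -sc '[A-Z][a-z]' '[\012*]' < azkabanbook.txt |
sort |
uniq -c > azkabanbook.csv
# Movie script: same pipeline.
tr -sc '[A-Z][a-z]' '[\012*]' < azkabanscript.txt |
sort |
uniq -c > azkabanscript.csv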
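To make the steps of the pipeline above easier to follow, here is a minimal sketch that runs it on a one-line sample instead of the real files. The function name `word_freq` and the final `sort -rn` (most frequent words first) are our own additions for illustration, not part of the commands we actually ran, and `'A-Za-z' '\n'` is just a shorter spelling of the same character sets.

```shell
# Hypothetical helper (not part of the original commands):
# reads text on stdin and prints "count word" lines, most frequent first.
word_freq() {
  tr -sc 'A-Za-z' '\n' |  # replace every run of non-letters with one newline
    sort |                # bring identical words next to each other
    uniq -c |             # collapse duplicates and prefix each word with its count
    sort -rn              # assumption: order by count, highest first
}

# Small inline sample, so the sketch runs without the book or script file.
printf 'the cat and the hat\n' | word_freq
```

Note that upper and lower case are kept apart here, just as in the original commands, so "The" and "the" would be counted as two different words.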
We also have to take into account that the movie script announces the name of each speaking character, followed by a ':', for example Harry: “bla, bla”. These names would otherwise be counted as words in the frequency list. That is why we created two versions of the movie script: one with the speaker names and one without.
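One way the version without speaker names could be produced is sketched below. This is not necessarily how we made our file: the function name `strip_speakers` and the output filename in the comment are hypothetical, and the sketch assumes that every speaker label stands at the start of a line and contains only letters, spaces, dots, or apostrophes before the ':'.

```shell
# Hypothetical helper: delete a leading "Name:" label from each line on stdin.
# Lines without such a label are passed through unchanged.
strip_speakers() {
  sed -E "s/^[[:space:]]*[A-Za-z][A-Za-z .']*:[[:space:]]*//"
}

# Small inline sample, so the sketch runs without the real script file.
printf 'Harry: Expecto Patronum!\nProfessor Lupin: Well done, Harry.\n' | strip_speakers

# On the real file it could be used like this (output filename is an assumption):
#   strip_speakers < azkabanscript.txt > azkabanscript_nonames.txt
```

A limitation of this approach is that a dialogue line that itself begins with a few words followed by a colon would also lose that part, so the result should be checked by hand.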