NLP Suite Release History - NLP-Suite/NLP-Suite GitHub Wiki

Open the download page to download and install the current release of the freeware, open-source NLP Suite package.

Release Version Date Changes
4.9.6 7/15/2024 1. fixed function name bug for vocabulary analysis in style_analysis_main for NLTK unusual words; 2. added the options of turning ON or OFF all reminders for ALL GUIs; 3. added the TIPS file TIPS_NLP_Multi-Word Expressions (MWE) & Light Verb Constructions (LVC).pdf to the SVO GUI.
4.9.5 7/15/2024 1. fixed bugs in Stanford_CoreNLP_util when processing sentence splitter, and NER annotators from parsers_annotators_main; 2. fixed bug in GIS in processing dates for Google Earth Pro; 3. continued development of DB_PCACE_data_analyzer functions and added the TIPS file.
4.9.4 7/6/2024 1. fixed a filename bug in Stanza SVO; 2. increased the accuracy of SVO extraction by adding str(full_name.split()) in the function def replace_words_with_full_names (otherwise names such as Chiang Kai-shek would not be replaced and would keep shek); 3. continued to improve the functions in the GUI DB_PCACE_data_analyzer.
4.9.3 7/3/2024 1. fixed bugs in the NGrams_CoOccurrences_main GUI to compute n-grams; 2. fixed bugs in the DB_PCACE_data_analyzer_main GUI.
4.9.2 7/1/2024 1. rewrote the Stanza_functions_util; 2. rewrote the SVO functions to account for multi-word expressions for PERSON, ORGANIZATION, and LOCATION, which can be social actors; 3. rewrote the visualization functions for the CoNLL table analyzer.
4.9.1 6/24/2024 1. improved the Sankey charts; 2. improved the Treemap charts; 3. added the bubble charts; 4. added separate TIPS files for each available specialized chart; 5. fixed display bugs in the SVO for Sankey charts; 6. fixed a bug in the display of dates in the description field for Google Earth Pro; 7. rewrote the co-occurrences and word searches greatly reducing the time of execution (from 3 hours on the Chinese Government Work Reports searches to 10 minutes) and adding new options (e.g., co-occurrences within sentences).
4.9.0 4/26/2024 1. exported the "feats" field for the Stanza parser and POS annotator. feats contains valuable information on verb Mood (e.g., indicative, imperative, subjunctive) not available in Stanford CoreNLP.
4.8.9 4/22/2024 1. added the case sensitive checkbox for case sensitive searches for categorical data in data_visualization_1_main.py; 2. improved the user messages for Sunburst and treemap options; 3. fixed several bugs arising from NUL values in the data.
4.8.8 4/20/2024 1. Fixed a bug in the chart for abstract/concreteness scores in the style analysis GUI; 2. fixed a bug in the output of Punctuation as figures of pathos (?!) in the style analysis GUI.
4.8.7 4/19/2024 1. fixed a potential bug with the display of output files in the WordNet GUI.
4.8.6 4/19/2024 1. added a checkbox to the search_byWord_main to allow the option of searching for search words CO-OCCURRING in the same sentence/document; 2. edited the TIPS files for style_analysis, function_words_analysis, and verb_analysis.
4.8.5 4/17/2024 1. made the output of CoreNLP, spaCy, and Stanza SVO completely uniform.
4.8.4 4/16/2024 1. Fixed bug in Stanza SVO; 2. made the output of spaCy and Stanza SVO uniform.
4.8.3 4/15/2024 1. Fixed the NLTK download bugs; 2. Fixed display of NER values for Stanza in NER_main GUI; 3. Fixed bugs in SVO extraction for spaCy (Stanza still giving problems).
4.8.2 4/14/2024 1. Fixed a number of minor bugs in the sentence_analysis_main GUI.
4.8.1 4/11/2024 1. fixed a bug in sentence complexity for missing parameter.
4.8.0 4/7/2024 1. temporarily disconnected GIS mapping when running SVO OpenIE.
4.7.9 4/6/2024 1. fixed a display bug in the index of Processing location (e.g., 171/341 for geocoding: ...); 2. fixed a missing display in Google Earth Pro of sentence and document.
4.7.8 4/4/2024 1. Made the handling of mwe (multi-word expressions) uniform in spaCy and Stanza for NER and depparse; 2. added the package check for 'bertopic' in the topic_modeling_main.py to alert the user when the package is not available.
4.7.7 4/3/2024 1. added BERT topic modeling; 2. simplified the references to available languages for parsers and annotators; 3. fixed a bug in NER mwe (multi-word expression) for Stanza.
4.7.6 4/1/2024 1. re-instated the ability of using different languages for parsers and annotators adding several checks of compatibility; 2. replaced the current WSI (Word Sense Induction) algorithms with new ones.
4.7.5 3/23/2024 1. added videos to several GUIs; 2. added the TIPS files to the GIS_Google_Earth_main GUI.
4.7.4 3/17/2024 1. added the BERT topic modeling option to the MALLET and Gensim topic modeling options; 2. fixed a bug in MALLET heatmap.
4.7.3 3/6/2024 1. added more search options to the file search ALL options GUI; 2. fixed several data-dependent bugs by trapping errors; 3. added options to the BERT word sense induction.
4.7.2 3/1/2024 1. added headers to MALLET output csv files; 2. fixed bugs with docx to txt converter.
4.7.1 2/20/2024 1. Fixed bugs with word embeddings and word sense induction algorithms.
4.7.0 2/16/2024 1. fixed bugs and improved the Word Sense Induction algorithms (via BERT).
4.6.9 2/11/2024 1. fixed a bug in the display of headers in file_search_byWord; 2. added the lemmatize option in the file_search_byWord search.
4.6.8 2/7/2024 1. fixed bug in NGrams_CoOccurrences_main for Case sensitive option.
4.6.7 2/6/2024 1. fixed bug in file_search_byWord_main when running the case sensitive/insensitive option.
4.6.6 2/5/2024 1. added a vocab csv output file when computing 1-grams (charts to be added soon); 2. fixed potential bug with Stanza not installed in NLP_menu_main when selecting Setup default NLP parsers & annotators...; 3. fixed a bug in data_visualization_main_1 with the Treemap option.
4.6.5 2/2/2024 1. fixed bug in NLP_menu_main for Co-Occurrences and N-grams options as not available.
4.6.4 1/31/2024 1. fixed a bug in the merge_main.py with wrong number of parameters; 2. fixed bug with missing TIPS; 3. fixed a bug with N-grams option for determiners.
4.6.3 1/29/2024 1. fixed a bug in a function call in the file_checker_converter_cleaner_main due to wrong number of parameters passed; 2. changed the requirements.txt files setting specific package versions to avoid dependency errors (wordcloud, Pillow, stanza).
4.6.2 1/26/2024 1. fixed a bug by changing several lines of code from 1 to 0 (e.g., a['Sort order'][0]) in the function def getFileList().
4.6.1 1/25/2024 1. fixed potential bug in importing stanza; 2. fixed bug in computing document(s) statistics.
4.6.0 1/24/2024 1. fixed a bug in the NLP_menu_main when selecting the Co-Occurrences VIEWER and N-Grams VIEWER; 2. removed the unnamed column in n-grams csv output thus removing a bug in the Search word(s) option in the N-grams Co-Occurrences GUI.
4.5.9 1/22/2024 1. removed the txt version of readme files and left md versions only; 2. changed a test for internet connection from Bing to Google.
4.5.8 1/17/2024 1. edited the HELP? and hover-over info for normalization and data transformation. Both options are now available. 2. added Mac and Windows readme beta versions to help with installations.
4.5.7 1/13/2024 1. introduced the normalization option (i.e., dividing results by document size) and data transformation (e.g., log) options for plotting; 2. fixed bugs in the computing of hapax values from Style analysis GUI; 3. fixed wrong warning in SVO GUI for gender and quote annotators; 4. added Sankey charts to SVO output.
4.5.6 11/27/2023 1. fixed a bug in the creation of a wordcloud with an empty input file; 2. added an Organizations column in the SVO output; 3. improved Stanford CoreNLP SVO using entitymentions values for Subjects and Objects; 4. introduced a new GUI for a pipeline of data quality algorithms (file_checker_pre_processing_pipeline_main.py).
4.5.5 11/22/2023 1. fixed a bug in the export of the NER dataframe in Stanford_CoreNLP_util that prevented the visualization of NER charts.
4.5.4 11/22/2023 1. in the SVO wordclouds visualization, set the collocations parameter to False to avoid potential repetition of the same words; 2. added new options to the dropdown menus in NLP_menu_main for statistical tools of textual analysis; 3. fixed chart display bugs in the computation of clause, noun, verb, function words in the CoNLL table analyzer; 4. added computation of overall noun and verb lists and frequencies in the CoNLL table analyzer.
4.5.3 11/21/2023 1. fixed a bug with wordcloud visualization when using a csv file in input; 2. extended the 'Different colors by POS tags' option in wordclouds to include proper nouns; 3. added a checkbox to compute corpus statistics by POS (Part of Speech) tag value in the statistics_txt_main GUI; 4. made uniform the handling of 'GUIs available for more options' in all GUIs that rely on the checkbox and menu.
4.5.2 11/19/2023 1. fixed minor bugs in the search functions; 2. added the heatmap for topic composition and keys to MALLET topic modeling; 3. added the script charts_matplotlib_seaborn_util to the NLP Suite.
4.5.1 11/17/2023 1. added CLOSE button to coref results and passed the right file back to SVO when doing manual coreferencing.
4.5.0 11/16/2023 1. fixed bug in verb analysis in the CoNLL table analyzer; 2. improved the layout of the NGrams_CoOccurrences_main GUI; 3. improved the user interaction for NGrams_CoOccurrences_main GUI; 4. fixed a bug in SVO for GIS information; 5. improved the display of GIS information in SVO to take into account multi-word locations; 6. improved the handling of NER PERSON in SVO; 7. changed the default display of wordcloud setting collocation to False.
4.4.9 11/12/2023 1. fixed a bug with Plotly capitalized.
4.4.8 11/12/2023 1. added a check on the number of variables required by the various Excel/plotly charts; 2. improved the layout of the data_visualization_2_main GUI; 3. improved the user interaction in dealing with charts with improved dropdown menus.
4.4.7 11/11/2023 1. fixed various bugs in the search functions; 2. made the names of output directories uniform for searches of various types (word, sentence, n-grams); 3. added the option of searching for co-occurring words within a sentence (rather than a document) in NGrams_CoOccurrences_main; 4. added the option of filtering data when drawing sunburst or treemap charts; 5. added the option of computing and visualizing WordNet aggregated lemma values for POS nouns and POS verbs in the CoNLL_table_analyzer; 6. added the Excel/Plotly option in data_visualization_2_main GUI (TO BE COMPLETED); 7. added a Sankey chart to the gender annotator; 8. removed the legend from the Excel charts when only 1 series is processed.
4.4.6 11/4/2023 1. added the option of creating Excel/Plotly charts in the data_visualization_2_main GUI; 2. fixed bug in N-grams VIEWER GUI.
4.4.5 11/4/2023 1. added Python folium pin map and heatmap to the GIS_pipeline; 2. made the layout of several GUIs uniform when further GUI options are suggested, using a dropdown menu instead of a button.
4.4.4 11/2/2023 1. added the option of users selecting the type of chart to be displayed (bar, line, radar); 2. fixed bugs with Plotly charts; 3. fixed bugs in ngrams and file searches when users select the wrong file/search word.
4.4.3 11/1/2023 1. fixed bugs in N-grams search function with Sankey visualization; 2. improved the layout of the style analysis GUI; 3. completed the development work of the file search by word with wordclouds visualization; 4. completed the development work of the n-grams search with wordclouds visualization.
4.4.2 10/30/2023 1. moved the corpus statistics under the text statistics GUI and out of style analysis.
4.4.1 10/29/2023 1. completed the search function for N-grams; 2. added wordcloud display in the word search functions; 3. fixed a bug in pulling up N-grams from the NLP_menu_main GUI.
4.4.0 10/29/2023 1. improved the n-grams scripts; 2. organized N-grams output by n-gram number; 3. added a line chart to the n-gram search for 1-grams; 4. corrected the sort in the function get_data_to_be_plotted_with_counts leaving sorting in the order of the x-axis values.
4.3.9 10/28/2023 1. improved the n-grams scripts; 2. added a line chart to the n-gram search for 1-grams; 3. corrected the sort in the function get_data_to_be_plotted_with_counts leaving sorting in the order of the x-axis values.
4.3.8 10/27/2023 1. nominalization creates a line chart with values by date if the files embed a date.
4.3.7 10/27/2023 1. completed the work on nominalization.
4.3.6 10/26/2023 1. completed the work on generalizing the special charts to process any number and types of csv file fields; 2. added several warnings to pandas 2.0.2 incompatible with Gensim, charts, etc.; 3. fixed minor bugs of variable names; 4. continued the work on nominalization (a time chart to be added).
4.3.5 10/25/2023 1. completed the work on generalizing the special charts to process any number and types of csv file fields.
4.3.4 10/24/2023 1. improved both the nominalization algorithms to account for typical nominalized verb ending (e.g., ion, ment) and the nominalization TIPS file.
4.3.3 10/24/2023 1. prepared the data_visualization_1_main.py GUI to process multiple selections of fields/search values. TO BE CONTINUED; 2. added Sankey charts to the n-grams search in NGrams_CoOccurrences_main GUI; 3. fixed an Excel chart display with MALLET topic modelling; 4. fixed a bug in style_analysis_main for the hapax_words variable not assigned; 5. added the colormap chart option to data_visualization_1_main; 6. added Sankey charts to coreference output.
4.3.2 10/23/2023 1. prepared the data_visualization_1_main.py GUI to process multiple selections of fields/search values. TO BE CONTINUED.
4.3.1 10/22/2023 1. added SVO_main.py.
4.3.0 10/22/2023 1. added the N-grams search option in the NGrams_CoOccurrences_main GUI; 2. added the code to deal with multi-word NER values (e.g., PERSON, COUNTRY) split in two records by CoreNLP; 3. added back SVO_main.py accidentally deleted.
4.2.9 10/21/2023 1. reorganized the style_analysis and NGrams_CoOccurrences GUI; 2. in the CoNLL_table_analyzer and file_search_byWord GUIs added the options of searching for a word and extracting neighboring words (TO BE COMPLETED); 3. improved the efficiency of the ngrams algorithms.
4.2.8 10/19/2023 1. improved the Ngrams_CoOccurrences_Viewer functions.
4.2.7 10/18/2023 1. improved the display of output from the CoNLL_table_analyzer.
4.2.6 10/18/2023 1. completed the n-grams algorithms with all options implemented; 2. fixed bugs in the n-grams/co-occurrence VIEWER algorithms; 3. added error trapping in the knowledge_graphs_WordNet_main; 4. added a chart of countries found by the geocoder used for GIS.
4.2.5 10/17/2023 1. fixed a bug in the N-grams VIEWER when splitting multiple-word search words (e.g., Hong Kong).
4.2.4 10/16/2023 1. completed the development of GIS pipeline; 2. rewrote the n-grams algorithms leading to greater efficiency (15 minutes instead of 5 hours for the CGWR corpus).
4.2.3 10/15/2023 1. prepared the work for multi-name locations (e.g., United States, US); 2. set the default Google Earth Pro pin color to red.
4.2.2 10/12/2023 1. prepared the work for multi-word locations incorrectly tagged by CoreNLP NER annotator; 2. added Future to verb tenses in the CoNLL_verb_analysis_util.
4.2.1 10/12/2023 1. added M.K.A. Halliday's high-value, median-value, and low-value classification of modal verbs in CoNLL_verb_analysis_util; 2. added a wordcloud display to the output of a CoNLL_table_search_util.py.
4.2.0 10/11/2023 1. In the Style analysis GUI, modified the Vocabulary analysis option for Short words (<4 characters) to compute, instead, the word length for all words in a corpus; 2. fixed bug in wordclouds_util word_str referenced before assignment.
4.1.9 10/9/2023 1. fixed a bug in the WordNet GUI with the column header 'Word' (expected 'Term') leading to Key error.
4.1.8 10/8/2023 1. fixed a config filename bug in coreference_main.py.
4.1.7 10/7/2023 1. fixed bugs in the export of files for the SVO algorithm.
4.1.6 10/6/2023 1. completed the rewriting of the N-grams Viewer (Co-Occurrences Viewer still does not process date information).
4.1.5 10/5/2023 1. fixed a bug with the opening of coreference files.
4.1.4 10/4/2023 1. fixed a bug with Gephi; 2. fixed bugs in the function statistics_csv_util.compute_csv_column_frequencies with no groups; 3. fixed the wrong display of Infinitive verbs frequency.
4.1.3 10/3/2023 1. added a series of checks in the data visualization 2 GUI for Boxplots; 2. added two updated TIPS files for data visualization; 3. added a try/except for the Sankey charts.
4.1.2 10/1/2023 1. made further improvements to the GIS algorithms.
4.1.1 10/1/2023 1. fixed bugs in the GIS algorithms.
4.1.0 9/30/2023 1. fixed bugs in the creation of charts for document statistics and other charts; 2. improved the output layout for sentiment analysis algorithms.
4.0.9 9/29/2023 1. fixed a display problem with charts by group columns (e.g., By Document).
4.0.8 9/29/2023 1. fixed an inconsequential bug in the export of sentiment analysis scores for Stanza; 2. fixed a bug in the export of charts for sentiment analysis scores.
4.0.7 9/28/2023 1. Further improved error trapping for the sunburst visualization option; 2. further improved the layout design of the DB_PCACE_data_analyzer_main GUI.
4.0.6 9/27/2023 1. fixed all bugs in the data_visualization_1_main.py GUI; 2. fixed bugs with Gephi_util.py.
4.0.5 9/27/2023 1. trapped user errors in Sunburst and Treemap.
4.0.4 9/26/2023 1. revised the data_visualization_1 and data_visualization_2 scripts.
4.0.3 9/25/2023 1. improved STEP2 for Mac; 2. fixed a display overlap between Document and Sentence in GIS description field; 3. added 2 TIPS files; 4. fixed a display overlap in all GUIs with the Release version.
4.0.2 9/24/2023 1. Improved ?HELP and TIPS to remind the user that clausal tags are available ONLY when using Stanford CoreNLP PCFG parser (the nn parser does NOT produce clausal tags).
4.0.1 9/23/2023 1. Fixed an error with the reading of files when a specific sort order is specified.
4.0.0 9/22/2023 1. Fixed a minor issue in the parsers_annotators_main with the CoNLL_table_analyzer checkbox state (normal/disabled); 2. made the Mac & Windows width uniform for OK + Reset Show buttons in all GUIs.
3.9.9 9/21/2023 1. made the width of Open file/dictionary buttons uniform in all GUIs; 2. improved the layout of the html_gender_annotator GUI; 3. fixed a bug with language selection in Stanza.
3.9.8 9/18/2023 1. added a series of checks to the NLP_setup_package_language_main.py to avoid setup errors; 2. Added the POTUS_webscraper based on beautifulsoup to extract POTUS speeches; 3. fixed bugs in the creation of Excel charts; 4. fixed bugs in TIPS filenames.
3.9.7 9/12/2023 1. fixed a bug when opening the CoNLL table analyzer from the parser GUI; 2. fixed a potential bug in the saving of NLP_default_IO_config.csv; 3. improved the user-interface errors when reading corpus files with wrong I/O information.
3.9.6 9/10/2023 1. improved the user interaction when there are missing IO values; 2. removed the line import RF_charts_treemaper_util from data_visualization util; 3. fixed a bug in file_search_byWord_util.
3.9.5 9/8/2023 1. improved the display of menu options in NLP_menu_main GUI to avoid multiple selections.
3.9.4 9/7/2023 1. fixed a bug in wordclouds_main GUI with wrong filename; 2. fixed a bug with Mac external software installation.
3.9.3 9/1/2023 1. Improved the display of information when clicking on a Google Earth Pro pin in GIS_main; 2. improved the user interface in the functions behind the NLP_setup_external_software_main.py GUI; 3. fixed a filename bug in opening the config file NLP_default_package_language_config.csv; 4. fixed a bug in GUI-specific I/O configuration not updating.
3.9.2 8/28/2023 1. removed config directory from git push.
3.9.1 8/27/2023 1. Fixed a bug in NLP_menu_main when selecting the option "Sample corpus (ALL options GUI)" under Pre-processing tools; 2. improved the display of user messages about the RUN button with wrong/missing IO information; 3. fixed a bug of missing file for timechart in data_visualization_main; 4. added TIPS file for specialized visuals to the data_visualization GUI.
3.9.0 8/23/2023 1. Edited the GIS scripts to avoid asking for Google geocoder API when using Nominatim as geocoder; 2. edited GUI_util to replace Image.ANTIALIAS, no longer supported, with Image.LANCZOS.
3.8.8 8/13/2023 Release 3.8.8 reflects a large number of bug fixes and improvements, the result of summer 2023 work on the NLP Suite.
3.8.7 6/4/2023 1. rewrote the SVO_util functions, fixing bugs with spaCy and Stanza SVO; 2. made the SVO GUI more user-friendly; 3. fixed bug in style analysis GUI; 4. added the chart visualization to BERT NER; 5. made the output layout of spaCy NER uniform with all other NER packages; 6. fixed a bug in reminders; 7. fixed bugs in the opening of output files in the CoNLL_table_analyzer_main GUI; 8. fixed bugs in the statistics_csv_main GUIs.
3.8.6 5/11/2023 1. added videos to several GUIs; 2. improved the use of US social security first names databases in the html_annotator_gender GUI.
3.8.5 5/5/2023 1. fixed a bug in Co-Occurrence VIEWER with a date option selected.
3.8.4 5/5/2023 1. added the boxplot and comparative bar chart options to the data_visualization_basic GUI; 2. consolidated all charts in a single charts_util; 3. fixed a bug in Sankey chart when using 2 variables only.
3.8.3 5/4/2023 1. improved the chart output format of the CoNLL_table_analyzer GUI; 2. improved the N-Grams/Co-Occurrences VIEWER GUI.
3.8.2 5/3/2023 1. improved the gender algorithms with Social Security data.
3.8.1 4/28/2023 1. fixed bug for missing labels_x_indented_indented_coordinate for Mac in GUI_IO_util.
3.8.0 4/27/2023 1. improved the I/O options in the GIS GUI.
3.7.9 4/26/2023 1. fixed a __ in output filenames of nominalization files; 2. added the opening of output chart files to MALLET output; 3. fixed a bug in filesToOpen in SVO.
3.7.8 4/25/2023 1. Fixed a layout problem with one of the nominalization output files when processing multiple input files.
3.7.7 4/22/2023 1. Improved the visualization of Python wordclouds for stopwords.
3.7.6 4/21/2023 1. replaced error_bad_lines=False with on_bad_lines='skip' to avoid an error with Pandas release 2.0.
3.7.5 4/21/2023 1. fixed a bug with undefined numFiles in def compute_sentence_complexity; 2. fixed a bug in def compute_sentence_complexity for Pandas append (concat in Pandas 2.0).
3.7.4 4/20/2023 1. fixed a bug with Gensim Word2Vec; 2. fixed bugs in both gender annotator and annotator GUIs; 3. fixed a bug in Word Sense Induction.
3.7.3 4/16/2023 1. added the option of using geocoded data in GIS_main and GIS_Google_Earth_Pro_main; 2. added the option in GIS_Google_Earth_Pro_main of creating heat maps via Google Maps; 3. trapped a potential error with Word sense induction in Word2Vec_main if the words to be analysed are too infrequent for k-means; 4. trapped a potential error with wordclouds displays.
3.7.2 4/13/2023 1. Fixed a bug in Word2Vec GUI for Word sense induction.
3.7.1 4/132/2023 1. improved SVO GUI layout.
3.7.0 4/12/2023 1. improved user messages in the shape of stories GUI; 2. added file opening in Word2Vec GUI; 3. edited the layout of the SVO GUI.
3.6.9 4/10/2023 1. improved the GUI layout for GIS_main, SVO_main, CoNLL_table_analyzer, NER_main; 2. fixed a bug in spell_checker functions; 3. fixed a bug in timeline function in visualization_main GUI; 4. fixed a bug in the file_search_byWord_main for -K and +K; 5. added sort options in NLP_setup_IO_main for files read into the NLP Suite and extended the option to all NLP functions; 6. added a new data_visualization_basic_main GUI for boxplots and multiple bar charts (to be completed).
3.6.8 3/23/2023 1. removed the timeout option for Stanford CoreNLP; 2. improved the messages for dates embedded in the filename in the ?HELP and Read Me buttons of several GUIs.
3.6.7 3/22/2023 1. export the Stanford CoreNLP error file to the subdirectory of the output directory rather than the main output directory; 2. fixed a bug in GIS_pipeline_util for wrong check of empty dataframe.
3.6.6 3/19/2023 1. standardized the export of BERT sentiment analysis output to a sentiment_BERT subdirectory following the new NLP Suite standard for output; 2. added the option of changing the installation directory of an external software in NLP_setup_external_software_main; 3. fixed a bug in Stanford_CoreNLP_coreference due to a new timeout parameter.
3.6.5 3/17/2023 1. fixed a bug in the sentiment_analysis GUI when opening the shape of stories GUI; 2. improved the Stanford_CoreNLP_util; 3. fixed a problem in the NLP_setup GUIs when users close the GUIs with the top-left-most red button in Mac or the top-right-most X button in Windows instead of using the CLOSE button that would run special functions.
3.6.4 2/28/2023 1. improved the user interface in the test for Mac M1 & M2 chips in NLP_menu_main; 2. added hover-over effects to the knowledge_graphs_WordNet_main GUI; 3. added a check for GUI-specific config files in all the scripts that require a specific type of input (e.g., a csv file); when the GUI-specific config file is present, the I/O configuration is automatically set to the GUI-specific option.
3.6.3 2/27/2023 1. edited the test for Mac M1 & M2 chips (tensorflow-metal & tensorflow.macos) with instructions on what to do in case of errors with tensorflow.
3.6.2 2/26/2023 1. rewrote the filter and lemma functions for SVO in SVO_util.
3.6.1 2/24/2023 1. Added the option of setting the timeout limit in the NLP_setup_package_language_main GUI for Stanford CoreNLP to speed up processing; 2. added a test in NLP_menu_main for potential error with Anaconda installation with Mac M1 & M2 chips.
3.6.0 2/23/2023 1. Fixed a bug in the parser_annotator_main GUI when selecting to run a parser with the CoNLL table analyzer checkbox ticked.
3.5.9 2/23/2023 1. Fixed the RUN button missing in the narrative_analysis GUI; 2. fixed a bug with keyword 'coref' in Stanford_CoreNLP_util.
3.5.8 2/19/2023 1. Fixed a checkbox state in coreference_main GUI always set to disabled even when a single file is in INPUT.
3.5.7 2/16/2023 1. fixed a bug when running the N-grams viewer from the CORPUS Analysis tools.
3.5.6 2/15/2023 1. fixed LANCZOS deprecated warning; 2. fixed bug in Google_Earth_main.
3.5.5 2/14/2023 1. fixed bugs in processing different options in GIS_Google_Earth_main.
3.5.4 2/13/2023 1. Improved the layout of the parsers_annotators_main GUI; 2. fixed a bug with the Quote/dialogue annotator (via CoreNLP) in the parsers_annotators_main.
3.5.3 2/13/2023 1. completed the TIPS file for the NLP_welcome_main GUI; 2. improved the local & GitHub release version displays.
3.5.2 2/12/2023 1. fixed variable name change open_setup_external_software_button.
3.5.1 2/12/2023 1. added a TIPS file for the NLP_welcome_main GUI, linking humanists' and social scientists' questions to NLP Suite tools; 2. added code to avoid warning the user of updating the NLP Suite every time you close a GUI when multiple GUIs are open; warning now only on the last CLOSE; 3. improved the checks for the 3 setup options in the NLP_menu_main GUI.
3.5.0 2/9/2023 1. Fixed bug in the external software display; 2. Fixed a bug in knowledge_graph_WordNet_main to check and download the NLTK wordnet resource.
3.4.9 2/7/2023 1. Further improved the external software functions; 2. changed a reminder to timed_alert from showWarning to avoid stopping functions; 3. completed the NER_main GUI; 4. connected all functions using external software to the new function external_software_install in IO_libraries_util.
3.4.8 2/6/2023 1. fixed error with + + in IO_libraries_util; 2. added user message that SVO works correctly for the English language only.
3.4.7 2/6/2023 1. Fixed bug in NLP_setup_package_language_main where Stanford CoreNLP is always selected, regardless of user selection.
3.4.6 2/5/2023 1. Fixed bug in the functions for external software installation and completely rewrote the functions for setting up external software; 2. added a dropdown menu of data tools in all the GUIs; 3. completed the Sankey plot; 4. fixed bug in filename in file_spell_checker; 5. added the 'Processing file' line in terminal for Co-occurrences; 6. added the function to sample a corpus by string values contained in the filename; 7. fixed bug with filenames in CoNLL_table_analyzer; 8. fixed bug in opening files.
3.4.5 1/22/2023 1. Improved the Mac layout of the data_visualization GUI; 2. added a check to prevent a question in external software if Java already installed.
3.4.4 1/22/2023 1. completed the data_visualization_main GUI and connected all available options (Sankey option still under development).
3.4.3 1/21/2023 1. Improved the GUI_util to avoid GitHub Issue 983 to no avail for Mac; Issue still Open; 2. Prepared the data_visualization_main GUI to be connected to the Plotly charts util.
3.4.2 1/20/2023 1. Added the option of watching videos in the four fundamental GUIs: NLP_menu_main.py, NLP_setup_package_language_main.py, NLP_setup_IO_main.py, NLP_setup_external_software_main.py; 2. reinstated STEP3 as an executable file.
3.4.1 1/18/2023 1. implemented the sampling of a corpus by search words (file_search_byWord_main); 2. prepared the GUI sample_data_main (to be completed).
3.4.0 1/16/2023 1. rewrote requirements.txt commenting all blank lines to avoid installation errors; 2. rewrote STEP2 for Mac & Windows to avoid exiting if the installation fails for a specific module in requirements; 3. edited the access to YouTube videos as items in Playlist.
3.9.9 1/14/2023 1. fixed bugs in the call to the function getGoogleAPIkey; 2. fixed bugs in the call to the GIS TIPS files in SVO_main; 3. started initializing the DB_PCACE_analyser main.
3.9.8 1/14/2023 1. Edited STEP2 installation script to process requirements.txt one package at a time continuing to the end if a package installation fails; 2. edited the SVO wordclouds algorithm to visualize word frequencies for the same word differently for S, V, and O.
3.9.7 1/13/2023 1. Prepared the data_visualization_main for new plot options; 2. eliminated a potential error in STEP2; 3. changed the deprecated Image.ANTIALIAS to Image.LANCZOS.
3.9.6 1/12/2023 1. Added the Open TIPS file button to the NLP_welcome_main GUI to prepare for an opening TIPS file.
3.9.5 1/12/2023 1. Added error trapping to YouTube video lookups in case of programming error with the video url.
3.9.4 1/12/2023 1. Implemented a webbrowser approach to YouTube videos instead of vlc and pafy.
3.9.3 1/11/2023 1. Fixed the layout of the parsers_annotators_main.
3.9.2 1/11/2023 1. Connected the Social Security option for names for states and year in the html_gender_annotator GUI; 2. disconnected temporarily all the video displays since vlc and pafy are creating multiple problems.
3.9.1 1/11/2023 1. changed vlc to python-vlc in requirements.txt to avoid STEP2 error; 2. added the handling of vlc in IO_libraries_util; 3. removed the videos folder from the NLP Suite, now stored on YouTube.
3.9.0 1/10/2023 1. Removed a circular reference in videos_util.py.
3.8.9 1/10/2023 1. Added videos and YouTube videos handler.
3.8.8 12/27/2022 1. Fixed a bug in NLP_setup_package_language_main; 2. rearranged import calls to a. avoid double tkinter screen; b. avoid displaying user messages before the complete layout of a GUI.
3.8.7 12/26/2022 1. Completed the rewriting of the functions to enable/disable the RUN button in all GUIs.
3.8.6 12/25/2022 1. Completed the rewriting of the functions (and GUI layout) for the 3 NLP_setup options (IO, NLP packages, external software).
3.8.5 12/23/2022 1. Added Java to the list of external software in NLP_setup_external_software_main.
3.8.4 12/22/2022 1. added hover-over effects to several GUIs; 2. improved the SVO algorithms options; 3. added a TIPS file for date embedded in filename; 4. added different K values for beginning and end in repetition finder in CoNLL_table_analyzer; 5. added the interactive timeline chart in data_visualization (to be continued).
3.8.3 12/20/2022 1. improved the user friendliness of NLP_setup_package_language_main GUI.
3.8.2 12/19/2022 1. eliminated a repeated entry in the dropdown menu of the CoNLL_table_analyzer; 2. added output txt files in the split by Begin-Middle_End option of the file_splitter_main (to be continued).
3.8.1 12/18/2022 1. Added the option of using a single click to select or double click to expand in the ontology class widget in the knowledge_graph_DBpedia_YAGO_main; 2. added hover-over effects to several GUIs.
3.8.0 12/16/2022 1. added the split file option by K sentences (BME) (to be completed for split txt files); 2. set up knowledge_graphs_DBpedia_YAGO_main to accept several ontology classes w/o color selection (defaulting to blue) in preparation of omitting HTML output (to be completed); 3. added BERT (English model) and BERT (Multilingual model) options for sentiment analysis; 4. moved the import of some packages to the line where the package is run so that a GUI can still be displayed in full.
3.7.9 12/14/2022 1. added the stopword option and csv file input option to BERT Word2Vec; 2. improved the layout and user messages of the knowledge_graphs_DBpedia_YAGO_main GUI.
3.7.8 12/10/2022 1. fixed potential problems from a variable name change; 2. improved the layout of the wordclouds GUI.
3.7.7 12/09/2022 1. added the button widget "Manipulate & visualize csv data" to all GUIs; 2. fixed the style_analysis GUI layout for Mac and Windows.
3.7.6 12/08/2022 1. added hover-over info in some GUIs.
3.7.5 12/06/2022 1. consolidated MALLET & Gensim topic modeling into a single GUI; 2. added hover-over effects in wordclouds GUI.
3.7.4 12/06/2022 1. Expanded the visualization options in DB_PCACE_data_analyzer_main.py GUI; 2. added the chart-type widget to all GUIs in preparation of allowing users to choose their preferred chart option; 3. added user reminders in terminal/command line for Word2Vec; 4. renamed all filenames for Word2Vec to improve user friendliness; 5. improved the Word2Vec GUI layout; 6. added a reminder about TensorFlow potential error when running on Mac with M1 or M2 chip; 7. added the option of NOT plotting Word2Vec vector space which may be computationally very draining, particularly for BERT; 8. extensively rewrote the setup functions for external software.
3.7.3 11/21/2022 1. Fixed a bug in calling BERT Word2Vec.
3.7.2 11/21/2022 1. Fixed a bug in reminders when running the parsers_annotators GUI; 2. Completed BERT Word2Vec with all distance measures calculated.
3.7.1 11/20/2022 1. Added the option of NOT splitting documents in Sunburster; 2. in Word2Vec added csv files for a. the 2-dimensional Euclidean distance between words; b. the n-dimensional Euclidean distance between words; c. the cosine distance between words; 3. added the option of computing distances and for how many top words; 4. added distance measure to BERT Word2Vec.
3.7.0 11/19/2022 1. Added the option of NOT splitting documents in Sunburster; 2. in Word2Vec added csv files for a. the 2-dimensional Euclidean distance between words; b. the n-dimensional Euclidean distance between words; c. the cosine distance between words; 3. added the processing of dates embedded in the filename to create dynamic Google Earth Pro maps.
3.6.9 11/18/2022 1. Fixed a bug in BERT_util with importing TransformerSummarizer; 2. Fixed a warning message in NLP_menu_main GUI with the selection of Word embeddings (Word2Vec) (via BERT) warning user that option is not available yet.
3.6.8 11/18/2022 1. Improved Sunburster interactive chart; 2. completed the Gensim Word2Vec script.
3.6.7 11/17/2022 1. Improved the output of SVO; 2. improved Gensim Word2Vec performance; 3. improved BERT word embeddings; 4. finalized the visualization GUI options for Sunburster.
3.6.6 11/14/2022 1. Designed the DB_PCACE_data_analyzer_main.py for analyzing PC-ACE data; 2. fixed a bug in Stanza sentiment analysis.
3.6.5 11/12/2022 1. Redesigned the visualization_main in preparation of interactive output; 2. Fixed a spaCy SVO bug with a last sentence w/o a final period.
3.6.4 11/10/2022 1. Fixed a potential bug with Google Earth in spaCy and Stanza; 2. SVO for spaCy and Stanza completed.
3.6.3 11/09/2022 1. Fixed a repetition of normalized dates in CoreNLP SVO.
3.6.2 11/09/2022 1. added the Filter subdirectory to SVO; 2. added the normalized-date subdirectory to SVO; 3. fixed a bug with sentiment_analysis_SentiWordNet_util.py; 4. fixed bug in opening NER (GUI) from parsers_annotators_main; 5. fixed a bug in Ngrams_CoOccurrences VIEWER for languages other than English; 6. fixed a bug in SVO spaCy and Stanza; 7. exported the Date type in SVO; 8. completed the visualization of BERT word embeddings.
3.6.1 11/06/2022 1. Minimized the number of SVO annotators to the required ones (coref, gender, quote); 2. improved the speed performance of Stanford CoreNLP; 3. improved the user-interaction of wordcloud_main when using csv files for input; 4. fixed a bug in computing statistical measures (e.g., skewness, kurtosis) and charts; 5. started adding timing warnings for algorithms; 6. removed the script NLP_setup_download_nltk_stanza.py from src and from STEP2 Mac and Windows installation files; 7. fixed a bug in sentiWordNet for sentiment analysis; 8. standardized handling of normalized dates in SVO, OpenIE, and Normalized date annotator and added chart visualization; 9. most Mac GUIs should now parallel the Windows layout.
3.6.0 11/04/2022 1. Added the field Multi-word Expression for NER tags based on BIEOS for spaCy and Stanza; 2. improved the user-interaction in the visualization_main GUI when visualizing network graphs.
3.5.9 11/03/2022 1. Fixed bug for missing MAC widgets; 2. fixed bug for nltk tokenizer replacing it with Stanza in Word2Vec.
3.5.8 11/03/2022 1. Changed all IO_files_util.make_output_subdirectory to silent=True to avoid stopping the code waiting for a user answer; 2. fixed bug in selecting in NLP_menu_main the option 'Parsers & annotators (BERT, CoreNLP, spaCy, Stanza)'.
3.5.7 11/02/2022 1. Fixed bugs in the shape of stories algorithms; 2. fixed bug with pandas on_bad_lines.
3.5.6 11/02/2022 1. Continued to improve the GUI layout for Mac; 2. fixed a bug with missing open_file_directory_button_width in Mac.
3.5.5 11/02/2022 1. Continued to improve the GUI layout for Mac.
3.5.4 11/02/2022 1. Continued to improve the GUI layout for Mac.
3.5.3 11/01/2022 1. Fixed a bug in KML functions; the GIS pipeline should all be working correctly now; 2. added reminder about spaCy not processing specific annotators but always the parser; 3. fixed display of 'NLP package for basic functions' in the NLP_setup_package_language GUI; 4. continued to edit GUIs in preparation for MAC layout.
3.5.2 10/30/2022 1. Re-added visuals to spaCy and Stanza lost after restructuring subdir; 2. continued to edit GUIs for optimal Mac layout.
3.5.1 10/30/2022 1. Uniformed output subdir name and filenames for spaCy and Stanza; 2. changed the coreference GUI adding spaCy and BERT as coreference package options, besides Stanford CoreNLP (spaCy and BERT options soon to come); 3. changed environments.txt to always install the latest spaCy, Stanza, pandas.
3.5.0 10/28/2022 1. Added Started running and Finished running for spaCy and Stanza with annotator; 2. changed the NER GUI to add BERT, spaCy, Stanza as options; 3. improved the display of NLP_package_annotators GUI and correctly saved selected options; 4. added BERT Word2Vec to Word2Vec GUI.
3.4.9 10/28/2022 1. Fixed bug in the selection of inputFilename followed by input_main_dir_path; added inputFilename.set('').
3.4.8 10/28/2022 1. Removed the Yes/No reminder question about Json when running Stanford CoreNLP annotators and moved the option as a widget in NLP_setup_package_language_main.py; 2. completed the algorithm and visualization for SVO with spaCy; 3. fixed a bug in Ngrams VIEWER when aggregation is different from 'year.'; 4. fixed a bug in Gensim Word2Vec importing nltk.data.
3.4.7 10/27/2022 1. Removed the Yes/No reminder question about Json when running Stanford CoreNLP annotators and moved the option as a widget in NLP_setup_package_language_main.py; 2. completed the algorithm and visualization for SVO with spaCy.
3.4.6 10/26/2022 1. Fixed a potential bug in checking for the correct folder for Stanford CoreNLP.
3.4.5 10/26/2022 1. Uniformed the sentiment analysis output from spaCy and Stanza to BERT and CoreNLP.
3.4.4 10/25/2022 1. Improved user messages in NGrams_CoOccurrences_VIEWER GUI; 2. fixed a bug when opening a file/directory containing a date in NLP_IO_setup_main; 3. fixed a bug in html_annotator_gender_main when selecting non-supported options and improved the GUI; 4. added charts for normalized date for CoreNLP SVO & OpenIE; 5. fixed bug in Stanza due to nan NER value; 6. added GIS output to Stanza; 7. restricted the display of user warning for missing external software only when the software is needed; 8. removed SENNA from SVO options.
3.4.3 10/21/2022 1. fixed a bug in SVO when using filters; 2. improved the layout of OK warnings on MAC.
3.4.2 10/21/2022 1. Generalized the use of dates embedded in filenames not in the format 'mm-dd-yyyy'; this can be used for instance in N-grams viewer.
3.4.1 10/21/2022 1. fixed an index bug in spaCy SVO with faulty input text; 2. fixed bug in CoNLL_table_analyzer; 3. fixed bug in N-Grams Co-occurrences VIEWER; 4. fixed bugs in Gensim and Mallet topic modeling; 5. simplified display of subject, verb, object social actor and actions files in SVO.
3.4.0 10/20/2022 1. Improved the display of IO configuration options in NLP_IO_setup; 2. fixed a display issue in the IO configuration text box to display the correct hover-over date information; 3. fixed CoNLL_table_analyzer bug; 4. fixed coref bug and SVO; 5. fixed parsers_annotators bug.
3.3.9 10/19/2022 1. Uniformed SENNA SVO output to the standard output format of all SVO packages; 2. fixed a bug in STEP2-install_NLP-Suite.ps1 for Windows setup.
3.3.8 10/19/2022 1. Uniformed output directories in SVO for all packages (except SENNA; to come).
3.3.7 10/18/2022 1. Fixed bug in CLOSE in NLP_welcome_main; 2. Fixed bugs in Mac in parsers_annotators in opening GUIs; 3. Fixed bugs in opening empty config files; 4. changed all calls to GUI_util.GUI_top(...
3.3.6 10/18/2022 1. Fixed a bug in the selection of old csv config files; 2. fixed a bug in parsers_annotators; 3. added a warning in parsers_annotators when both parser and annotator are run; 4. improved the display of Yes/No reminders for Windows.
3.3.5 10/17/2022 1. Reorganized GUIs to simplify their layout (for Stanford CoreNLP memory etc. options and date options); 2. started fixing the layout of GUIs for Mac; 3. fixed bugs in narrative_analysis GUI (missing variable); 4. fixed bug in html_annotator_gender (import with wrong filename); 5. improved the message display of the wordclouds GUI.
3.3.4 10/6/2022 1. further edited the layout of the CoNLL table analyzer; 2. improved the timeout automatic closing of OK message displays; 3. added a button in all GUIs to eventually enable opening a pop-up text widget where to paste text for quick temporary processing of text w/o dealing with I/O configuration options; 4. fixed a bug in the Coreference pronominal resolution GUI; 5. fixed a bug in the setting of manual editing coreferenced document (normal/disabled) when changing the IO configuration between directory and filename; 6. improved the handling of output files in subdirectories; 7. added TIPS files; 8. added the objective/subjective function in style analysis; 9. added the K sentences repetition finder in style analysis; 10. improved the SVO pipeline; 11. improved the layout of parsers_annotators GUI; 12. Started redesigning the Data Manager GUI.
3.3.3 9/30/2022 1. Changed all reminders to timed Yes/No questions; 2. added countdown timer to message displays; 3. added saving output files in special subfolders of the output folder; 4. improved the layout of the CoNLL_table_analyzer; 5. added visualization for CoNLL table search and K sentences; 6. improved the layout of all NLP Suite tools in the NLP main menu page.
3.3.2 9/23/2022 1. Improved the efficiency of geocoding location data by a factor of 12 (from 3 hours to 15 minutes with 25,000 locations); 2. opened Excel charts directly on the Chart worksheet, rather than on the Data worksheet; 3. fixed bug in opening CoreNLP, spaCy, and Stanza output with charts; 4. improved the user interaction in handling of the output subdirectories by NLP tool; 5. fixed several bugs in the visualization of charts due to the change in variable name columns_to_be_plotted.
3.3.1 9/22/2022 1. fixed a bug with sentiment analysis in Stanza; 2. added the option of not exporting Json files with Stanford CoreNLP annotators; 3. added the visualization of all charts to spaCy and Stanza.
3.3.0 9/21/2022 1. fixed a bug when processing charts by document for alphabetic variables that need counting.
3.2.9 9/21/2022 1. generalized the preparation of charts in the function visualize_charts with X-Axis and Y-Axis options; 2. added nn/PCFG to output filenames of CoreNLP parser.
3.2.8 9/20/2022 1. converted several file handling functions from csv to pandas; 2. fixed a bug in processing a locations file in GIS_main; 3. added charts to n-grams; 4. fixed a bug in CoNLL_table_analyzer (although results are still not 100% accurate); 5. added the output subdirectory to many scripts to consolidate output; 6. prepared several functions for a faster approach to GIS.
3.2.7 9/19/2022 1. fixed a bug when saving dataframes with no header using df_to_csv in IO_csv_util.
3.2.6 9/18/2022 1. fixed a bug when hovering over non optionmenu widgets; 2. disabled all options in CoNLL_table_analyzer when a non CoNLL table file is selected in input; 3. fixed bug in GIS geocoding; 4. fixed bugs in the CoNLL table search and visualize charts.
3.2.5 9/18/2022 1. fixed a bug when hovering over non optionmenu widgets; 2. disabled all options in CoNLL_table_analyzer when a non CoNLL table file is selected in input.
3.2.4 9/18/2022 1. Improved geocoding; 2. added warning for user that collocations are not available when searching the CoNLL table in the CoNLL_table_analyzer; 3. improved display of charts when plotting with multiple series on a chart; 4. added plots of coreference resolution output; 5. added plots of found/not-found GIS output.
3.2.3 9/16/2022 1. made edits to some of the scripts in the GIS pipeline to improve nominatim geocoding of locations and export non-distinct non-geocoded locations; 2. added the visualization of BERT sentiment analysis; 3. fixed a bug in html_annotator_gender; 4. fixed a bug in returning from NLP_setup_package_language not updating the hover-over information; 5. improved the tkinter visualization of widgets (adding light green for labels and disabled widgets).
3.2.2 9/14/2022 1. added BERT to the list of sentiment analysis options; 2. made a minor change to the NLP_menu_main which seems to cause problems to some users.
3.2.1 9/14/2022 1. added a reminder in CoNLL Table analyzer for blank clause tag values when using a CoNLL table produced by Stanford CoreNLP neural network parser; 2. added reminders in Stanford_CoreNLP_coreference and Stanford_CoreNLP_NER that, for now, the algorithms behind those GUIs are based only on Stanford CoreNLP (and not on spaCy or Stanza); 3. fixed a bug in sentiment analysis with SentiWordNet; 4. fixed a bug with message not defined when checking Java installation; 5. forced user to select a language when setting up parser and language options.
3.2.0 9/14/2022 1. commented out (again) computing Excel charts by sentence index until function efficiency is improved; 2. added kaleido (used by plotLy) to requirements and added the check for its installation in charts_plotly_util.
3.1.9 9/14/2022 1. fixed several bugs with the CoNLL table analyzer; 2. fixed several bugs with variable-name changes in the html dictionary annotator (work to be completed).
3.1.8 9/13/2022 1. updated the hover-over effects on GUI widgets.
3.1.7 9/12/2022 1. added hover-over info in the NLP_setup_package_language_main.py GUI and improved user messages; 2. added hover-over info in NLP_welcome_main.py; 3. made several layout improvements for Mac; 4. uniformed the CoNLL table layout for all three packages (spaCy, Stanford CoreNLP, Stanza).
3.1.6 9/11/2022 1. Fixed bug in closing the NLP Suite from the welcome GUI; 2. fixed a bug in NLP_parsers_annotators_main with spaCy as the selected package.
3.1.5 9/11/2022 1. Fixed accidental error in NLP_parsers_annotators_main.
3.1.4 9/11/2022 1. added hover-over information to various GUI widgets; 2. improved the automatic update of the NLP Suite upon closing a GUI.
3.1.3 9/10/2022 1. fixed a bug on CLOSE for automatic update.
3.1.2 9/10/2022 1. changed the output names when extracting sentences in a search (extract to extract_with_searchword and extract_minus to extract_wo_searchword); 2. added code lines to process Continents correctly in Nominatim since CoreNLP NER often tags continents incorrectly and, as a result, Nominatim geocodes them incorrectly; 3. improved user interaction with GIS_main; 4. improved user interaction with SVO_main; 5. completed the SVO extraction with spaCy and Stanza; 6. fixed a bug in the dropdown menu for parsers in NLP_parsers_annotators_main.py.
3.1.1 9/7/2022 1. moved requirements.txt to src; 2. passed CoreNLP NER tags to Nominatim to improve geocoding; 3. added NER tags to GIS output files; 4. updated all ?HELP message in the file search by word and n-grams/co-occurrence VIEWER functions.
3.1.0 9/7/2022 1. improved the word search algorithms, although a. efficiency is still a problem; b. partial matches are not carried out.
3.0.9 9/7/2022 1. added check for pygit2 module installation; 2. fixed a bug in GIS_main.py with read_csv; 3. added the option to process locations in Nominatim with the NER tag values exported by the selected NER package (spaCy, CoreNLP, Stanza).
3.0.8 9/4/2022 1. fixed a bug in charts_util when processing a single field.
3.0.7 9/4/2022 1. fixed bug in SVO with date extracted; 2. fixed a bug in NLP_parsers_annotators_main.py; 3. fixed several bugs in computation of ngrams from style analysis GUI; 4. added language list to spaCy; 5. added LOCATION, PERSON, TIME to spaCy and Stanza SVO.
3.0.6 9/2/2022 1. added the option of producing NO charts, whether Excel or plotLy; 2. added encoding='utf-8' and error trapping (errors='ignore' or error_bad_lines=False) in every script where missing; 3. fixed several bugs due to filename changes.
3.0.5 9/1/2022 1. fixed the Open GUI bugs in Style Analysis.
3.0.4 8/30/2022 1. fixed bugs due to the append and extend in filesToOpen; 2. fixed various bugs behind the style_analysis_main.py GUI.
3.0.3 8/27/2022 1. fixed bug in software installation; 2. fixed bug in Stanford_CoreNLP encoding error.
3.0.2 8/26/2022 1. fixed bug in displaying reminders; 2. fixed bug in charts_Excel_main.
3.0.1 8/26/2022 1. fixed minor bugs; 2. Nearly completed the transition to spaCy and Stanza, in addition to Stanford CoreNLP, as the main packages; 3. Added a Setup dropdown menu in every GUI; 4. Uniformed the behavior of all setup buttons in the NLP_main_menu.
3.0.0 7/9/2022 1. Fixed a number of minor issues; 2. rewrote the search by word function; 3. rewrote the n-grams function; 4. fixed bugs in the computation of pronouns numbers in coreference; 5. improved the style_analysis options; 6. improved Stanza; 7. Added several data cleaning options.
2.9.9 6/28/2022 1. Fixed a number of minor issues and generally improved the user interaction; 2. added hover-over effects to all GUIs; 3. Added charts to most scripts; 4. Added new types of charts (bar for the selected column, by Document, by Sentence index, and descriptive statistics); 5. added the plotLy option for charts, besides Excel; 6. improved the compute_csv_column_statistics and related sub-functions (compute_csv_column_statistics_NoGroupBy and compute_csv_column_statistics_groupBy); 7. Improved the what's in your corpus GUI with many more options; 8. Added makeDir to all scripts that produce a large number of files; 9. Added Stanza with its own GUI and set of annotators.
2.9.8 5/15/2022 1. Completed the hover-over effect for widgets with no text label in all GUIs (e.g., the open file/directory buttons).
2.9.7 5/14/2022 1. created test_hover_over file; 2. set up the Stanford CoreNLP language pack.
2.9.6 5/10/2022 1. replaced the hedonometer.json file which had gotten corrupted; 2. further extended the hover-over effect in the GUIs to all widgets; 3. fixed repeated messages with missing IO values.
2.9.5 5/9/2022 1. Added hover-over effects to all widgets in all GUIs; 2. improved the performance of the computation of the sentence index.
2.9.4 4/27/2022 1. Fixed the SpaCy bug for language detection.
2.9.3 4/25/2022 1. Added a new TIPS file TIPS_NLP_Excel smoothing data series.pdf and added the option to all GUIs that could use this TIPS; 2. Added a series of TIPS files for the Shape of Stories GUI.
2.9.2 4/23/2022 1. Fixed all readability measures in the Style Analysis GUI.
2.9.1 4/22/2022 1. Cleaned the output of text readability adding a line chart for all readability measures; 2. excluded the use of GPU in Stanza for sentence complexity, leading to error.
2.9.0 4/20/2022 1. Temporarily disconnected the computation of missing Sentence index in Excel line charts.
2.8.9 4/18/2022 1. Embedded HTML dictionary annotated files in <@# #@> so that they can be split with the file splitter if so needed; 2. Uniformed the embedded symbols <@# and #@> for all scripts; 3. added a reminder that the CoreNLP neural network parser does not produce clause tags; 4. fixed a bug in CoNLL_table_clause_analysis_util.
2.8.8 4/16/2022 1. Extended the plots by sentence index to all CoNLL table analyzer functions; 2. Fixed a problem with Clause Tags with the neural network parser, since these are not available.
2.8.7 4/12/2022 1. Fixed ?HELP and bugs in the sentence_analysis GUI; 2. Fixed ?HELP and bugs in the style_analysis GUI.
2.8.6 4/9/2022 1. Added the sample data option to the "Data & Files Handling Tools" in main menu; 2. in DBpedia produced a single merged csv file in output to avoid adding files to an already potentially long list of output files.
2.8.5 4/8/2022 1. Improved the sample data functions to avoid overwriting the output directory.
2.8.4 4/7/2022 1. Edited the DBpedia & YAGO scripts; 2. Added the csv file splitter by Document ID; 3. Added the function to sample the corpus by the Documents listed in a csv file.
2.8.3 3/26/2022 1. Commented out a line for Excel line charts leading to errors; 2. Added the TIPS on The Word of Emotions and Sentiments and corrected the TIPS Sentiment analysis.
2.8.2 3/22/2022 1. Prepared all Stanford CoreNLP GUIs for the selection of one of the supported languages; 2. added an overall GUI for the study of emotions/sentiments.
2.8.1 3/20/2022 1. Fixed a display bug when computing the sentence table in the CoNLL table analyzer; 2. Fixed an Excel bug when producing a line chart; 3. added the Python sentence complexity algorithm based on Stanza.
2.8.0 3/15/2022 1. Added the TIPS file TIPS_NLP_Stanford CoreNLP performance and accuracy.pdf; 2. added reminders for default visualization options in GIS_main and SVO_main.
2.7.9 3/15/2022 1. Completed the work of renaming S, V, and O to Subject (S), Verb (V), and Object (O) in the SVO extractor functions; 2. Standardized further the output from SVO CoreNLP & SENNA.
2.7.8 3/13/2022 1. standardized the csv output fields with Sentence ID, Sentence, Sentence ID, Document ID, and Document as the last fields in the files; 2. fixed bug in NGrams_CoOccurrences_Viewer_util.
2.7.7 3/2/2022 1. added a GUI for Search (ALL options) in NLP menu under various dropdown menus; 2. fixed the display of manual coreference; 3. added nominative and accusative/objective pronouns to the social-actor-list.csv to avoid being filtered out in SVO.
2.7.6 3/1/2022 1. added a GUI for Search (ALL options).
2.7.5 3/1/2022 1. added reflexive pronouns to the CoreNLP coreference annotator; 2. changed the list of pronouns in Stanford_CoreNLP_coreference_util; 3. improved user interface (reminders, TIPS, ?HELP) for coref GUI; 4. added the extra line for opening files and drawing charts in Stanford_CoreNLP_coref_main; 5. added Excel chart of frequency of pronouns to CoreNLP coref annotator; 6. uniformed the calls to check_pronouns for all algorithms that require it (parser, coref, SVO); 7. fixed the normalizedNER bug for OpenIE; 8. fixed the bug with the coref-SVO pipeline.
2.7.4 2/27/2022 1. greatly improved the SVO GUI with respect to SENNA (although the real problem is how slow the function convert_to_svo is).
2.7.3 2/27/2022 1. added a check for file None in the function OpenOutputFiles to avoid code break; 2. added all three types of pronouns to coref results; 3. in coref added a test that the antecedent does not already have 's (Mary's) to avoid ending up with double 's (Mary's's); 4. removed a line forgotten there for testing purposes.
2.7.2 2/26/2022 1. In SVO added the TIPS file TIPS_NLP_SVO SENNA.pdf; 2. Fixed a bug with the Stanford_CoreNLP_NER_main (wrong indentation); 3. in GIS_main and Google_Earth_Pro_main added the option of using the csv output of locations from Stanford_CoreNLP_NER_main.
2.7.1 2/26/2022 1. Improved TIPS, ?HELP, and README for three different Stanford CoreNLP scripts: NER annotator; coreference annotator; SVO extractor.
2.7.0 2/26/2022 1. Fixed SVO bug GitHub Issue #644 Cannot finished SVO pipeline.
2.6.9 2/25/2022 1. Edited SVO to avoid a bug in new Python Wordcloud.
2.6.8 2/25/2022 1. removed WordNet temporary files from SVO output; 2. in Python wordclouds added the options of selecting font and maximum number of words; 3. fixed bugs in GIS scripts when using a CoNLL table as input; 4. added the visualization of coreferenced pronouns and non-coreferenced pronouns in different colors in the output HTML file.
2.6.7 2/23/2022 1. fixed a bug in Stanford CoreNLP coreference manual resolution; 2. added a check for MacOS shell bash vs zsh; 3. in SVO, added a WordNet aggregate up for nouns; 4. fixed a temporary bug in CoNLL table analyzer for verbs; 5. improved the functions behind the CLOSE button.
2.6.6 2/22/2022 1. added a button in the GIS_main GUI to visualize the Google API keys, in case they need to be changed; 2. moved the gender and quote annotator in SVO_main down in the GUI; 3. improved the user interface in the NLP_setup_update_util.
2.6.4-5 2/21-22/2022 1. added the option of 3-D visualization for Word2Vec; 2. added the script charts_plotly_util to eventually provide another chart option to users, besides Excel; 3. renamed Excel scripts to be prefaced with charts_ and changed all calling scripts.
2.6.3 2/20/2022 1. re-imported the KML functions optimized for speed, lost after the Feb 9 PyCharm crash.
2.6.2 2/20/2022 1. added a widget for font selection for Python wordcloud; currently the option is disabled; 2. removed code for NER locations extractor in Stanford_CoreNLP_annotator_util that was creating more problems than solutions.
2.6.1 2/19/2022 1. Added the option of opening the CoNLL table analyzer from Stanford CoreNLP NER GUI; 2. started transitioning from NLTK sentence splitter, tokenizer, and lemmatizer to Stanza, editing tens of files affected by the change; 3. fixed a bug in NLP_setup_update_util with the display of the Git error raised; 4. reset Word2Vec files to correct version after PyCharm crash.
2.6.0 2/16/2022 1. added a check for current Stanford CoreNLP available on GitHub; 2. added the option of changing a currently installed external software; 3. Edited the Read Me message of the Stanford CoreNLP NER GUI as an input CoNLL table is not allowed; 4. minor edits.
2.5.9 2/15/2022 1. Extended to the NLP_welcome_main GUI the call for update on CLOSE; 2. various edits.
2.5.6-8 2/15/2022 1. Added IO_files_util and jar files to Git.
2.5.5 2/15/2022 1. Edited some of the TIPS about installing Java/JDK; 2. improved the output of the K-sentences analyzer behind the CoNLL table analyzer GUI.
2.5.4 2/13/2022 1. Fixed bugs in NLP_setup_update; 2. added a general function to open url websites and changed all functions that opened a site; 3. removed the JDK requirement for Stanford CoreNLP that no longer seems to create problems; 4. added the gender/quote annotator for OpenIE; 5. fixed a bug in pronoun counts for "I" in CoreNLP_annotator; 6. rewrote the readme files for Mac and Windows; 7. minor edits.
2.5.3 2/10/2022 1. added the gender HTML annotator in SVO; 2. edited the check for Java installation to specifically check for Java JDK for Stanford CoreNLP; 3. finalized the help messages and TIPS for Word2Vec.
2.5.2 2/9/2022 1. Edited various files of NLP_setup type; 2. Completed the Word2Vec script.
2.5.1 2/7/2022 1. The warning of release version is now displayed when the GUI has been completely displayed; 2. added the TIPS file TIPS_NLP_English Language Benchmarks.pdf to various scripts; 3. fixed the bug in the display of Excel charts with hover-over effects (.xlsm files).
2.5.0 2/7/2022 1. restored barchartsample.xlsm under lib that got corrupted, causing Excel .xlsm files to crash.
2.4.9 2/6/2022 1. improved the efficiency of kml functions bringing processing speed down to 9 seconds from 20 minutes on the POTUS ina speeches; 2. fixed a bug in MALLET topic modeling; 3. prefixed NLP_setup_ to the download, update, and shortcut filenames to conform to NLP Suite script standard of bunching together scripts by their function for easy recognition; 4. added the CLOSE button in the NLP Suite welcome GUI; 5. minor editing issues.
2.4.8 2/5/2022 1. Moved the update NLP Suite reminder to GUI_util to display the message when users hit the CLOSE button from any GUI; 2. renamed the Mac and Windows files NLP_environment_shortcut_add to STEP3-NLP-environment to simplify installation instructions; 3. added a new TIPS file TIPS_NLP_Anaconda NLP environment pip.pdf; 4. rewrote the NLP Suite GitHub pages to simplify first-time users' experience.
2.4.7 2/3/2022 1. Added the gender and speaker options in SVO; 2. improved the user interaction when running STEP2, as Git is required.
2.4.6 2/3/2022 1. New display of Release version on every GUI to include the version available on GitHub; 2. Added a user warning in update_util that the NLP Suite was updated upon closing.
2.4.5 2/2/2022 1. Fixed yet another bug with external software setup in Mac...
2.4.4 2/2/2022 1. And yet another bug with external software setup...
2.4.3 2/2/2022 1. Fixed a bug in the installation of external software.
2.4.2 2/2/2022 1. Fixed bugs in external software setup for Gephi and Google Earth Pro for Mac.
2.4.1 2/2/2022 1. Further improved the external software setup.
2.4.0 2/1/2022 1. Fixed a bug of utf-8 compliance when writing output files in the Python WordClouds.
2.3.9 2/1/2022 1. Further improved External software setup for Mac & Windows; 2. Fixed a bug in Python WordClouds due to a variable name error (bg_img).
2.3.8 2/1/2022 1. Further improved External software setup for Windows.
2.3.7 1/30/2022 1. Improved I/O setup; 2. Improved user interaction in external software setup.
2.3.6 1/29/2022 1. Fixed a bug in Stanford_CoreNLP_main when using the * option for CoreNLP annotators; 2. Fixed a bug in whats_in_your_corpus_main when using the * option for What else in your corpus option.
2.3.5 1/28/2022 1. Added the new K-sentences analyzer for the CoNLL table in CoNLL_table_analyzer_main.
2.3.4 1/27/2022 1. Fixed a bug in wordclouds_main with the Python WordCloud when selecting Lemmas, Stopwords, Punctuation, Lowercase.
2.3.3 1/16/2022 1. Updated the release version message to reflect the fact that release versions are now updated automatically; 2. further improved the download and installation of external software; 3. added the option of using a full image, rather than just image contours, for the Python WordCloud package; 4. edited the output filename for the Python WordCloud algorithm to reflect the NLP Suite filenaming criteria; 5. added Gensim 4.0 reminder in the main topic_modeling_gensim_main GUI.
2.3.2 1/21/2022 1. Trapped the wordclouds Wordle error for inactive website; 2. added a button in wordclouds_main to open the png image file.
2.3.1 1/20/2022 1. Improved the What's in your corpus GUI; 2. improved the wordclouds GUI adding the option of creating an image via removebp to be used by Muller's Python wordcloud.
2.3.0 1/18/2022 1. Added a reminder for WordNet download blocked in Chrome; 2. added several user warnings for software download & installation; 3. added a check and download link to Java JDK when installing Stanford CoreNLP; 4. added a check and download link to Microsoft Visual Studio C++ when installing SENNA for Windows machines; 5. completed the merge function in data_manager_main; 6. fixed the bug of filename error in Stanford_CoreNLP_annotator.
2.2.9 1/16/2022 1. Fixed a non-ASCII " issue in STEP2-install_NLP-Suite.ps1 for Windows; 2. extended to Stanford CoreNLP parser the display of a chart of personal pronouns and a reminder to the user to run the CoreNLP coref annotator previously introduced for SVO (release 2.2.7).
2.2.8 1/15/2022 1. Added the auto_update option in STEP2 for Windows and eliminated the manual update for both Mac and Windows; 2. Improved the GUI display of CoNLL_table_analyzer for Mac & Windows.
2.2.7 1/14/2022 1. Changed the menu fields in CoNLL_table_analyzer from tk.menu to ttk.combobox to allow much faster search of POS and Deprel tags; 2. in SVO added the display of a chart of personal pronouns and a reminder to the user to run the CoreNLP coref annotator.
2.2.6 1/12/2022 1. Added the option of running the CoreNLP quote annotator with either double or single quotes; 2. added the display of existing config options in IO_setup_main.
2.2.5 1/9/2022 1. Edited the STEP2-install_NLP-Suite file in setup eliminating quotes that were causing problems; 2. Implemented the new Gephi functions in visualization_main.
2.2.4 1/9/2022 1. Rewrote the CoNLL table search functions to account correctly for enhanced dependencies; 2. Added a reminder for verb modality and fixed a bug when computing verbs and function words from the style_analysis GUI.
2.2.3 1/8/2022 1. Created new GUIs for knowledge graphs (DBpedia, YAGO, WordNet) and for various types of HTML annotators that will produce HTML files in output from txt file(s) in input. Several files were renamed to reflect the changes.
2.2.2 1/6/2022 1. Added reminders to the CoNLL table analyzer and to the SVO algorithms.
2.2.1 1/5/2022 1. Corrected the GUI size of IO_setup_main.py; 2. Added a ; to each PERSON & NER normalized date in the SVO output so that they can be processed separately; 3. changed the filter/lemmatize functions for SVO to make them much faster.
2.2.0 1/4/2022 1. When processing coref with an input directory, the filename is correctly processed; 2. The coref function displays the number of pronouns processed and pronouns coreferenced.
2.1.9 12/28/2021 1. Added the processing of locations in SVO with OpenIE; 2. Added an automatic NLP Suite update (update_util) thus eliminating the scripts update_auto_NLP-Suite.command and update_auto_NLP-Suite.bat; 3. fixed the bug of Coref + SVO with an input directory.
2.1.8 12/21/2021 1. Standardized all config filenames in all _main scripts.
2.1.7 12/20/2021 1. Fixed the bug in the CoreNLP_annotator processing of filenames with embedded dates so as to make possible dynamic Google Earth Pro maps; 2. added extensive help messages to the various GIS scripts that can process dates embedded in filenames.
2.1.6 12/16/2021 1. Improved the visual display of the NLP_welcome_main and NLP_menu_main GUIs; 2. added reminder on default number of topics for MALLET and Gensim topic modelling.
2.1.5 12/14/2021 1. Completely rewrote the approach to config files, from txt to csv, thus modifying every single _main script.
2.1.4 12/01/2021 1. Further improved the help messages for I/O set up and software setup; 2. Banned the possibility of setting up the output files directory inside the NLP Suite directory.
2.1.3 12/01/2021 1. Completed the GUI displays, HELP, and Reminders to give users control over the I/O config files; 2. Completed the pipeline to open the CoNLL table analyzer GUI after running the CoreNLP parser; 3. Completed the pipeline to open the Shape of stories GUI after running the Stanford CoreNLP Sentiment Analysis algorithm in the Sentiment Analysis GUI.
2.1.2 11/30/2021 1. Fixed 'startTime' not defined error in whats_in_your_corpus_main.py; 2. Edited the NLP_menu_main GUI for default I/O and external software to give users greater control of these options.
2.1.1 11/29/2021 1. Edited all output filenames in SVO; 2. Edited the output filename in the extract data_manager_util function; 3. Fixed a bug in SVO when calling the GIS_pipeline; 4. added reminder in SVO when no records are produced; 5. added videos to the NLP_menu_main.
2.1.0 11/23/2021 1. Added an interactive plot to Word2Vec results, although a simple one for now.
2.0.9 11/23/2021 1. Completed the work on the CoNLL table analyzer to include verb modality.
2.0.8 11/22/2021 1. Rewrote all the CoNLL analyzer functions (except verb modality) for speed.
2.0.7 11/21/2021 1. Simplified the Nominalization script, to avoid a huge output file when processing large corpora.
2.0.6 11/20/2021 1. Fixed a potential infinite loop in the function process_json_ner of Stanford_CoreNLP_annotator; 2. added a TIPS file "Geocoding: How to Improve Nominatim" to all scripts that geocode (GIS_main, Google_Earth_Pro_main, SVO_main).
2.0.5 11/19/2021 1. added wn to the list of packages needed by nominalization as a dependency of pywsd.
2.0.4 11/17/2021 1. Further improved the messages in the shape of stories GUI; 2. added a newer release of Gensim Word2Vec; 3. changed the output of the Stanford CoreNLP parser from CoNLL-X to CoNLL-U; 4. fixed a bug on the sentiment analysis GUI; 5. improved the processing of location names in Stanford_CoreNLP_annotator_util for cities followed by states and country, and states followed by country.
2.0.3 11/15/2021 1. Completed the help messages for ?HELP and ReadMe buttons in the shape of stories GUI.
2.0.2 11/15/2021 1. Ooops... fixed a bug in the checking of release history.
2.0.1 11/15/2021 1. Improved the warning messages in the shape of stories; 2. fixed a bug in the checking of release history.
1.9.9 11/14/2021 1. Changed the display of the sentiment analysis GUI; 2. added the automatic calculation of WordNet verb categories without auxiliaries so as not to distort the stative and possessive categories; 3. completed the n_gram/co-occ viewer; 4. added greater precision to the GIS pipeline for preferred locations; 5. fixed a bug in shape of stories due to an indentation problem.
1.9.8 11/7/2021 1. Completed the merge option in the data_manager algorithm; 2. further improvements to the GUI layouts.
1.9.7 11/6/2021 1. Improved some of the GUI layouts; 2. improved the data_manager files (merge option to be completed); 3. added the option of exporting csv fields to a text file in the data manager; 4. fixed a bug with the extract in data manager when the <> and = operators are used in the WHERE clause; 5. added a popup dropdown menu widget.
1.9.6 11/3/2021 1. Edited all scripts for the popup messages - Started running, Finished running - moving them to command line/prompt; 2. improved the YAGO algorithm in opening output files.
1.9.5 11/2/2021 1. fixed a bug in the NER extractor always missing the last record in an NER list; 2. fixed all GUI width sizes for Mac and Windows.
1.9.4 11/1/2021 1. added the SSL certificate check to YAGO; 2. added the hyperlinks to the YAGO output; 3. Eliminated the time estimate in YAGO when processing a directory, since it would make an already long algorithm even longer.
1.9.3 10/31/2021 1. Adjusted the ontology class dropdown menu size for Mac and Windows to avoid overwriting the menu.
1.9.2 10/31/2021 1. Trapped a number of errors in the DBpedia script.
1.9.1 10/31/2021 1. Rewrote the DBpedia util script using SPARQL instead of curl and eliminating an annoying repeat of a confirmation slider for every file processed.
1.9.0 10/29/2021 1. Changed the dropdown menus in NLP_main to comboboxes to speed up searches; 2. redesigned the dropdown menu in the annotator GUI (the ontology classes dropdown menus on a Mac would be hidden).
1.8.9 10/28/2021 1. Fixed a bug in computing time taken by any script (due to a spelling error); for algorithms taking more than an hour, the code would break.
1.8.8 10/28/2021 1. Fixed a bug in the selection of external software; 2. implemented a dropdown menu for DBpedia and YAGO in annotator_main to speed up search.
1.8.7 10/27/2021 1. standardized all scripts that extract spatial information to include the NER tag LOCATION; 2. Added the new scripts to compute geographic distances; 3. Added a timeout loop for Nominatim geocoding; 4. doubled the speed of executing the KML function for Google Earth Pro maps; 5. added the files and functions to handle videos.
1.8.6 10/24/2021 1. changed the window width of many GUIs to display properly the open file/folder buttons; 2. Skipped processing blank Objects in Gephi in the SVO script; 3. Added the lemmatizing option to SVO; 4. added the display of total time a function/script takes to run; 5. added functions to IO_csv_util.
1.8.5 10/22/2021 1. added stanza.download('en') to STEP2 installation; 2. added Python n-grams/co-occurrences viewer; 3. rewrote the search functions in file_search_byWord_util; 4. added stricter control on menu choices in NGrams_CoOccurrences_Viewer_main and in file_search_byWord_main; 5. fixed a bug in filename when using CoreNLP coref; 6. Fixed an inconsequential display repetition in CoreNLP OpenIE; 7. added a csv file output for coreferenced pronouns; 8. Added the LOCATION NER tag to GIS_main and the output location file to the files to open; 9. Removed the large dependenSee.jar file to visualize parse trees and substituted it with an internal Python function based on nltk/spaCy tree visualization; 10. Added two buttons in the display of I/O configuration to open an input file/directory and output directory.
1.8.4 10/18/2021 1. In NLP_menu_main converted the request for external software to a question; 2. Improved some of the reminders; 3. changed all calls to GUI_bottom in all _main scripts; 4. Edited the checkboxes to automatically open ALL output files and Excel charts.
1.8.3 10/17/2021 1. Added a popup entry widget for Google geocoder API and Google Maps API to allow entering a key when running the GIS pipeline from SVO_main; removed the Google API entry widget from GIS_main; 2. Added checks for Google Earth Pro and Gephi installation; 3. Added a sentence splitter to Stanford_CoreNLP_annotator_util.py; 4. Fixed a bug with Gephi output being deleted in SVO; 5. Sped up the processing of Google Earth Pro kml files.
1.8.2 10/15/2021 1. Fixed a bug in SVO GUI to allow for manual coreference of single files; 2. Improved the Excel chart display; 3. Improved output display in file_manager_main.py; 4. Added the Quote annotator to the CoreNLP annotators in Stanford_CoreNLP_main; 5. Added several Excel charts in the Stanford_CoreNLP_annotator_util for every type of annotator; 6. Added Gephi and Google Earth Pro to list of external software in IO_libraries_util; 7. added the check for stanza in wordclouds_util.
1.8.1 10/13/2021 1. Corrected the wordclouds_util.py to deal with results of SVO script; 2. In the statistics_txt_util split the Frequency of sentences and words into two different charts to avoid masking due to the different scales.
1.8.0 10/13/2021 1. Changed O/A to O in the SVO scripts; 2. Added split option for a merged coreferenced file; 3. Edited the split function for greater user information.
1.7.9 10/12/2021 1. Setup stringent checks to the GIS_main pipeline; 2. Edited SVO and GIS scripts to deal with multiple locations listed in SVO output for the same sentence; 3. Added the option of splitting filenames by embedded items in the file_manager_main; 4. Added a CoreNLP coreference GUI to allow for manual coreference of previously manually coreferenced files; 5. rewrote the readme files in txt format rather than md format.
1.7.8 10/7/2021 1. Rewrote the main GIS scripts; 2. Improved the display of the CoreNLP quote annotator, to include speaker and sentence ID; 3. Simplified the code of CoreNLP_main.
1.7.7 10/4/2021 1. Fixed a bug in the gender annotator GUI when annotating first names by dictionary values; 2. added more reminders for various options in Stanford CoreNLP.
1.7.6 10/3/2021 Fixed a bug in GIS Google Earth Pro GUI due to the wrong order of arguments.
1.7.5 10/3/2021 1. Fixed a bug in the display of Sentence ID in the NER annotator; 2. Fixed a bug in GIS script caused by input csv files with blank rows (blank Document ID field).
1.7.4 10/3/2021 1. Added a Visualization GUI with the options of visualizing network graphs via Gephi (to be completed); 2. Fixed a bug in IO_CoNLL_util for sent_str = sent_str + " " + str(row[1]).
1.7.3 10/3/2021 1. Completed the work for full stop . added to paragraphs missing end-of-paragraph punctuation; 2. Completed the work for removing blank lines from documents; 3. allowed the use of a single input file when computing n-grams in the VIEWER GUI; 4. dictionary annotate function in annotator_gender_dictionary script must go through CoreNLP_annotator for NER; the new approach needs to be completed!; 5. Added a reminder in WordNet_main about the stative and possession aggregated categories; 6. Added a reminder that the CoreNLP Server is shutting down; 7. Added a check for empty txt files; 8. Added the export of a CoreNLP json file as a txt file; 9. Fixed a misspelling error NER_oututFilename in GIS_main; 10. edited all setup_Mac and setup_Windows bat/command files to make them more user friendly.
1.7.2 9/28/2021 Fixed missing 'and' in if statement.
1.7.1 9/28/2021 1. Added a single widget for pre-processing tools in all Stanford CoreNLP GUIs; 2. Added widgets to all Stanford CoreNLP GUIs to allow user control of document size and max sentence length; 3. Edited the Stanford_CoreNLP_annotator to process the new parameters; 4. Added a check for sentence length in all the annotators of the Stanford_CoreNLP_annotator_util and remind the user when sentence length exceeds 100 words/tokens; 5. Modified the GIS GUI to simplify geocoding and mapping; 6. A backup files option was added to all algorithms that modify original corpus files (e.g., ASCII converter).
1.7.0 9/26/2021 1. Added a test for Java 64-Bits; 2. Completely restructured the WordNet GUI to make it far more user friendly; 3. Updated the WordNet TIPS.
1.6.9 9/25/2021 1. Added test for memory size; 2. Improved the user interface with memory issues with new reminders TIPS and automatic checks; 3. Added the function to compute sentence length for files and added the widget to the Stanford CoreNLP and SVO scripts; 4. Improved I/O in DB_SQL_main.
1.6.8 9/23/2021 1. Fixed minor file naming issues when running SVO with all options ticked; 2. Added a check for internet connection when checking the latest release of the NLP Suite on GitHub.
1.6.7 9/23/2021 1. Fixed circular import; 2. Fixed coref file naming problem.
1.6.6 9/22/2021 1. Added the option of opening the wiki page on what is new in a new release of the NLP Suite when firing up the NLP Suite; 2. Added the utf_8 and % check options in Stanford_CoreNLP_main; 3. Added the % sign check (and replace) in Stanford_CoreNLP_annotator_util to avoid breaking the annotator's code; 4. Added reminders to improve the user interface.
1.6.5 9/22/2021 1. Added further checks on Java and Stanford CoreNLP; 2. Added reminders to SVO and Stanford CoreNLP GUIs; 3. Changed file type of SQL queries saved and imported in DB_SQL_main from .txt to .sql.
1.6.4 9/21/2021 1. Changed the CoreNLP annotator for OpenIE output; 2. Changed the release version check function.
1.6.3 9/20/2021 Added a check and reminder for Java JDK 8 when running Stanford CoreNLP annotators.
1.6.2 9/19/2021 Completely rewrote the Python wordclouds GUI with many user options.
1.6.1 9/16/2021 Improved the release version user message.
1.6.0 9/15/2021 1. Added files under lib/OpenIE for SVO extraction; 2. Added the release_version.txt file under lib and the relative code in GUI_util to read the version_number in order to allow GitHub to check the version_number the user is using and compare it against the version on GitHub; 3. completed the redesign of reminders_util to take the messages of all reminders from reminders_util and to allow automatic rewriting of reminders messages that differ from the ones listed in reminders_util. 4. Added a version number check that gives a warning when the software version is not the latest. 5. Fixed a few issues in STEP2 installation. 6. The shortcut is now NLP instead of nlp (Mac users will have to run remove/add shortcut again).
1.5.9 9/12/2021 1. Added GitHub image files; 2. added error trapping for Gensim 4.0 version for Mallet and added a reminder.
1.5.8 9/10/2021 Fixed a permission error bug in setup executables.
1.5.7 9/9/2021 Added a check for Anaconda3 in STEP2 of installation for Mac to avoid an installation error when Anaconda3 is already installed.
1.5.6 9/9/2021 1. Added TIPS files to several scripts; 2. changed location and names of image files displayed in GitHub; 3. added reminder in wordcloud; 4. improved the installation files.
1.5.5 9/6/2021 1. New TIPS files added; 2. improved the pip install warning; 3. changed file names in Mac setup folder to match the Windows file names.
1.5.4 9/3/2021 Improved the NLP_menu_main, displaying a checkbox to visualize whether all I/O options and external software have been setup.
1.5.3 9/2/2021 1. Further improved the handling of I/O configurations and related user messages. 2. Improved ?HELP commands for the DB_SQL_main GUI.
1.5.2 9/1/2021 Further improved the handling of I/O configurations and related user messages.
1.5.1 8/30/2021 1. Improved the user interface of I/O configuration options for first-time users; added new reminders for all GUIs and for IO_setup_util. 2. Fixed a bug in saving the Stanford CoreNLP coreferenced files.
1.5.0 8/7/2021 BETWEEN RELEASE 1.3.9 AND THE CURRENT 1.5.0 THERE HAVE BEEN VARIOUS RELEASES TO IMPROVE A NUMBER OF MINOR ISSUES. WE HAVE NOT KEPT TRACK OF EACH CHANGE. SORRY! THE CURRENT 1.5.0 RELEASE INCLUDES THE FOLLOWING CHANGES. 1. New approach to I/O setup, in NLP_menu_main and all GUIs; 2. Completed the work for the generalized CoreNLP annotator; 3. Added several new TIPS and edited others; 4. Improved a large number of different scripts; 5. Added new approach to reminders to allow the same reminder to be used for multiple GUIs; 7. Added new reminders; 8. Fixed bug in shape of stories due to wrong filename; 9. Added auto install/update/read scripts for Mac and Windows; 10. Updated Readme file; 11. Updated Wiki; 12. Improved the OpenIE SVO extractor to take into account more complex clausal structures; 13. Extended to SENNA the OpenIE approach to SVO extraction; 14. Improved the CoreNLP pronominal coreference resolution; 15. Improved the manual edit of CoreNLP pronominal coreference resolution results.
1.3.9 5/8/2021 1. Trapped Excel bug when the number of rows in input file exceeds the maximum number allowed by Excel; 2. Trapped pandas bug in sentence complexity; 3. Improved the WordNet GUI; 4. Improved the code behind SVO; 5. Improved the code behind Shape of Stories.
1.3.8 4/20/2021 1. Fixed nominalization Excel chart bug; 2. Fixed sentence complexity Excel chart bug; 3. Fixed text readability Excel chart bug; 4. Fixed Record ID CoreNLP generalized annotator bug (leading to errors in the CoNLL table analyzer); 5. Fixed an SVO SENNA bug; 6. Improved the code of SVO script; 7. Improved Shape of Stories user interactions.