User Documentation - Pastafarians/linguine GitHub Wiki

NOTE: Please see this page for Linguine's current version.

Logging in

  • You must be logged into Linguine to access any pages other than the home page.
  • To log into Linguine, enter your RIT username and password in the appropriate fields in the upper right corner of the home page.
  • In most cases, your username is your email address with the @rit.edu removed.
  • When you have successfully logged in, the username and password fields will be replaced by the RIT Tiger logo and your first name.
  • Linguine uses RIT's central authentication system, so you cannot authenticate to a specific department domain.
  • If you are already logged in and wish to switch users, click the Tiger logo and select Logout.
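
The username rule above can be illustrated with a short Python sketch. The function name and the sample address are hypothetical and are not part of Linguine. Because the rule only holds "in most cases", the sketch returns its input unchanged when the @rit.edu suffix is absent.

```python
def username_from_email(email: str) -> str:
    """Strip a trailing '@rit.edu' from an email address, if present.

    Hypothetical helper for illustration only; Linguine itself simply
    asks for the username in the login field.
    """
    suffix = "@rit.edu"
    email = email.strip()
    # Compare case-insensitively, but keep the original casing of the name.
    if email.lower().endswith(suffix):
        return email[: -len(suffix)]
    return email


# 'abc1234' is a made-up example username.
print(username_from_email("abc1234@rit.edu"))
```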

Corpora

The Corpora page allows you to view and modify text passages that have been uploaded to Linguine for the purpose of analyzing them. These passages are associated with your Linguine account and will not appear for other users. The total amount of disk space allocated for your corpora and your current usage are shown in the bottom right. The first time you log in, several sample passages will be automatically added to your account. You may remove these passages if you wish.

  • To view a corpus, click on its title. You can return to the list by clicking the Back button in the upper right corner.
  • To delete a corpus, first click on it to view the passage, then click the trash can icon in the upper right corner.
  • You may add tags to your corpora for ease of searching and organization. Click the blue + sign below the corpus title, enter the tag name, and click Add.
  • To remove an existing tag, click the small X on the tag label.

Uploading a new corpus

  1. To upload a new corpus from your local computer, click the Create Corpus button in the upper left corner of the page.
  2. You will be prompted for the Title and file to be uploaded.
  3. Enter the title as you wish it to appear in your Corpora page.
  4. Click Choose a File to open your browser's file chooser. You may then navigate your computer's local drive and select the file you wish to upload.
  5. The file name should appear in the File Upload field.
  6. Click Create Corpus to confirm your selection.

Analysis

The Analysis page shows a list of all the analyses you have performed and allows you to create new analyses. Like your corpora, these analyses are specific to your Linguine account and will not show up for other users. Each analysis shows the name given to that analysis, the analysis type, any tokenization or cleanup operations performed, and the date and time the analysis was started. If an analysis is marked 'Incomplete', it has not been completed by Linguine yet and will show an ETA indicating when it will most likely be complete.

Creating a new analysis

To start creating a new analysis, click the blue Create Analysis button in the upper left corner. You will be brought to the Select Analysis Type tab.

Selecting an analysis type

This tab lists all analysis operations currently available within Linguine. If an analysis type is labeled (Stanford CoreNLP) or (Curator), the analysis uses that external tool, running in the RIT NLP server environment. Otherwise, the operation uses libraries built into Linguine itself.

Click on the analysis type you wish to perform to select it. You can then either click the blue Next button or the Select Corpora tab to proceed.

Selecting the corpora

This tab will list all corpora uploaded to your account. Click on the corpus you wish to perform the analysis on. Most analyses can only be performed on a single corpus at a time. If the analysis does allow multiple documents, you will be able to select more than one corpus. If you wish to deselect a corpus, click on it again.

Once you have selected the corpus, you can either click the blue Next button or the Select Preprocessing Operations tab to proceed. If you made a mistake earlier, you can return to the Select Analysis Type tab by clicking the blue Previous button.

Selecting preprocessing operations

This tab lists all preprocessing operations available for the analysis type you selected. These are only applicable to analyses that use Linguine's libraries. If no options appear, it is because the operation uses an external system that handles preprocessing automatically.

Multiple preprocessing operations can be selected for a single analysis. Once you have selected the preprocessing operations you wish to perform, you may optionally enter a name in the Analysis Name field; this is the name that will appear in the list of completed analyses. If you do not enter a custom name, a default name based on the analysis type will be provided. Click Create Analysis to begin the operation. You will be returned to the main Analysis page.

If the analysis you are performing requires a preprocessing step (typically tokenization), you will not be able to create the analysis until an appropriate preprocessing operation has been selected.
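
As a rough illustration of how several preprocessing operations can be chained after a required tokenization step, here is a minimal Python sketch. It is not Linguine's implementation: the function names, the regular expression, and the small stopword set are all assumptions chosen for the example, and the operations actually offered depend on the analysis type you selected.

```python
import re

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = frozenset({"the", "a", "an", "of", "and"})


def tokenize(text):
    """Split text into word tokens (the required preprocessing step)."""
    return re.findall(r"[A-Za-z0-9']+", text)


def lowercase(tokens):
    """Optional cleanup: fold every token to lower case."""
    return [token.lower() for token in tokens]


def remove_stopwords(tokens):
    """Optional cleanup: drop very common words."""
    return [token for token in tokens if token not in STOPWORDS]


def preprocess(text, operations):
    """Tokenize first, then apply each selected operation in order."""
    tokens = tokenize(text)
    for operation in operations:
        tokens = operation(tokens)
    return tokens


print(preprocess("The cat and the hat", [lowercase, remove_stopwords]))
```

The order of the selected operations matters in this sketch: the stopword filter compares against lower-case words, so it only catches "The" if lowercasing runs first.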

Viewing an analysis

You can view the results of a completed analysis by clicking on it in the list. To return to the list, click the Back button in the upper right corner. To delete the results of an analysis, first click the analysis, then click the trash can icon in the upper right corner.

Visualization

When the analysis view opens, it will default to the Visualization tab. The visualization used will differ depending on the analysis type.

  • Term Frequency Analysis: The results will be shown as a word cloud. The size of each term is proportional to its frequency within the corpus. Hover over a word with your mouse cursor to view its frequency.
  • Part of Speech Tagging: The results will be shown as a parse tree. The part of speech for each word or phrase appears in the node above it. The parse tree will be generated for only one sentence at a time. If your corpus contains multiple sentences, click the Select Sentence dropdown to select a sentence to display.
  • Sentiment Analysis: The results will be shown as a parse tree, similar to Part of Speech Tagging. The sentiment for each branch of the parse tree will appear above the node. The sentiment ranges from very negative (--) to very positive (++). A blank sentiment indicates it is neutral. Like Part of Speech Tagging, only one sentence can be displayed at a time.
  • Named Entity Recognition: The results will show the text with all identified named entities highlighted. The color of the highlight indicates the type of entity. Hovering over an entity with the mouse cursor will also show its type.
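
To make the word cloud description concrete, the following Python sketch counts term frequencies and derives a relative size for each term. It is an illustration under assumptions, not Linguine's code: the tokenization rule and both function names are hypothetical, and the real visualization may scale sizes differently.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Count how often each lower-cased word occurs in the text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


def relative_sizes(frequencies):
    """Scale each count by the largest count, so the top term gets 1.0."""
    if not frequencies:
        return {}
    largest = max(frequencies.values())
    return {term: count / largest for term, count in frequencies.items()}


frequencies = term_frequencies("the cat saw the other cat")
for term, size in relative_sizes(frequencies).items():
    print(f"{term}: count={frequencies[term]}, relative size={size:.2f}")
```

In this sketch a term that appears half as often as the most frequent term gets half its size, which is what "proportional to its frequency" means above.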

Default View

Clicking the Default View tab shows the raw data returned from the Linguine server. This may be useful if you are looking for detailed results for a specific token or need information not apparent in the visualization. The data is displayed in a collapsible list, organized by sentence. You may also export the results as a JSON file by clicking the blue Download JSON button at the bottom of the page.
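
Once downloaded, the JSON file can be inspected with any JSON-aware tool. The Python sketch below loads a file and prints an outline of its structure. The layout of Linguine's export is not documented here, so the sketch makes no assumptions about key names and only reports what it finds. The function names are hypothetical.

```python
import json


def load_results(path):
    """Read a downloaded analysis file and return the parsed JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def outline(data, max_depth=2, _depth=0):
    """Return lines describing the shape of parsed JSON.

    Only the first item of each list is followed, on the assumption
    that list items share one layout.
    """
    indent = "  " * _depth
    if isinstance(data, dict):
        lines = [f"{indent}object with keys: {sorted(data)}"]
        children = list(data.values())
    elif isinstance(data, list):
        lines = [f"{indent}list of {len(data)} items"]
        children = data[:1]
    else:
        return [f"{indent}{type(data).__name__}"]
    if _depth < max_depth:
        for child in children:
            lines.extend(outline(child, max_depth, _depth + 1))
    return lines


# Small in-memory stand-in for a parsed file; real exports will differ.
sample = [{"a": 1}]
print("\n".join(outline(sample)))
```

To apply this to a real export, pass the location of your downloaded file, for example `outline(load_results("analysis.json"))`, where `analysis.json` is a placeholder file name.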
