Shiny App Users' Manual - Yolanda128/wcma_digital_collection GitHub Wiki

Motivation

The Williams College Museum of Art (WCMA) contains over 10,000 pieces of various styles, time periods, and mediums. Thanks to WCMA’s Mellon Digital Project, images of the pieces in the collection have been uploaded to the internet, allowing students, faculty, researchers, artists, and many others to easily access metadata and thumbnail images for the entire WCMA collection. However, for the average appreciator of art, it may be difficult to look through such a large collection for the pieces that they like. Our goal in creating this Shiny app is therefore to provide a tool that lets the user specify their preferences (color, style, type) and then recommends pieces from the WCMA collection based on those preferences. This way, a user can ideally identify the pieces that they like without having to look through the entire collection.

How To Use

Our Shiny app seeks to recommend pieces in the WCMA collection based on the user’s preferences. First, the user is asked to select a type of art, which includes the physical medium, such as drawing, painting, or photos, as well as cultural origin, such as Eastern, African, or Pacific. Next, the user is asked to pick a color scheme, which includes 9 primary colors as well as a “mixed colors” option. The mixed colors option means that no single color dominates the image, so the image can be considered diverse in color. Finally, the user is asked to pick a style, each of which corresponds to a subgroup of images from the collection that we grouped by style. After designating these preferences, the user clicks the “Recommend A Piece To Me” button, and the Shiny app recommends a piece from the WCMA collection based on those preferences. After receiving the first recommendation, the user can optionally click the “Works By The Same Maker” button as many times as they want to view other pieces by the same maker, in the hope of finding even more artworks that appeal to them.

Behind the Scenes

We uploaded a dataset to the Shiny app beforehand that contains the names of the images we are working with and three grouping variables for each image: the type, the primary color, and the style. The first variable was borrowed from the WCMA metadata, and the latter two were created by us. The Style and Color sections below give more detail on how we clustered the images by their style and primary color.

Style

To determine the style of an image, we decided to look at three features related to the image’s HSV values: value mean, saturation mean, and warmth, defined below.

* Value Mean: Measures lightness/darkness. The average of an image’s value across all of its pixels gives us information on the overall brightness of an image relative to other images.
* Saturation Mean: Measures the intensity of a color. The mean of an image’s saturation across all of its pixels gives us information on the overall color intensity of an image relative to other images.
* Warmth: Measures the proportion of warm-colored pixels in a given image. A warm-colored pixel is defined as a pixel whose hue is less than or equal to 90 or between 330 and 360.
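The three features can be sketched in a few lines. This is a minimal illustration in Python rather than the app's own R code, and it assumes an image is already available as a list of RGB pixels in the 0-255 range; the function name `style_features` is hypothetical.

```python
import colorsys

def style_features(rgb_pixels):
    """Return (value_mean, saturation_mean, warmth) for a list of
    (r, g, b) pixels with channels in 0-255."""
    n = len(rgb_pixels)
    value_sum = 0.0
    saturation_sum = 0.0
    warm_count = 0
    for r, g, b in rgb_pixels:
        # colorsys expects channels in [0, 1] and returns hue in [0, 1).
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hue_degrees = h * 360
        value_sum += v
        saturation_sum += s
        # Warm pixel, per the definition above: hue <= 90 or in 330-360.
        if hue_degrees <= 90 or hue_degrees >= 330:
            warm_count += 1
    return value_sum / n, saturation_sum / n, warm_count / n

# Four made-up pixels: red, blue, yellow, dark blue.
pixels = [(255, 0, 0), (0, 0, 255), (255, 255, 0), (0, 0, 128)]
value_mean, saturation_mean, warmth = style_features(pixels)
```

Note that `colorsys` reports a hue of 0 for gray, black, and white pixels, so this sketch would count them as warm; how the original analysis treated such pixels is not stated.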

We measured the dissimilarity between two images with the Manhattan distance, which is the sum of the absolute differences between the two images’ values on each feature. The feature values were standardized so that each feature contributes to the distance on the same scale. Then, we used the k-means clustering algorithm to group the images that are least distant from each other (and therefore most similar to each other) into the same clusters. A silhouette plot helped us choose k, the number of clusters to divide the images into. There is, however, no generally agreed-upon method for choosing the optimal k. After consulting the silhouette plot and making composites for several different values of k, we decided it was best to use k=4. The image shown for each style is a composite of all the images in the group corresponding to that style.
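The standardization and distance steps can be illustrated as follows. This is a sketch in Python, not the project's R code: the feature rows are made-up numbers, the helper names are hypothetical, and it assumes the population standard deviation (R's `scale` uses the sample version) and that every feature varies across images. The clustering step itself is not shown.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each feature column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]  # assumes no column is constant
    return [
        tuple((x - m) / sd for x, m, sd in zip(row, means, sds))
        for row in rows
    ]

def manhattan(a, b):
    """Sum of absolute feature-wise differences between two images."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Made-up (value mean, saturation mean, warmth) rows for three images.
features = [(0.2, 0.1, 0.9), (0.4, 0.3, 0.5), (0.9, 0.8, 0.1)]
z = standardize(features)
d01 = manhattan(z[0], z[1])  # image 0 vs image 1
d02 = manhattan(z[0], z[2])  # image 0 vs image 2
```

With these numbers, image 0 is closer to image 1 than to image 2, so a distance-based clustering would tend to group the first two together.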

Color

As for color, we compressed the images into a reasonable number of pixels and assigned each pixel a color based on pre-specified conditions. Then, we looked at the distribution of pixel colors and grouped the images by the primary color in the image. If 40 percent or more of the pixels are of one color, then we put the image in the group corresponding to that color. Otherwise, we put it into a group called “mixed colors”, which means that the image is diverse in color. The image shown for each color is a composite of all the images in that color group. For example, the image corresponding to “purple” is a composite of all the images in the purple group.
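The 40 percent rule can be sketched as below. This Python illustration assumes each pixel has already been assigned a color name (the conditions used for that assignment are not reproduced here), and the function name `color_group` is hypothetical.

```python
from collections import Counter

def color_group(pixel_colors, threshold=0.40):
    """Return the dominant color if it covers at least `threshold`
    of the pixels, otherwise "mixed colors"."""
    counts = Counter(pixel_colors)
    color, count = counts.most_common(1)[0]
    if count / len(pixel_colors) >= threshold:
        return color
    return "mixed colors"

# Made-up pixel labels for two small images.
mostly_purple = ["purple"] * 5 + ["red"] * 3 + ["blue"] * 2
no_dominant = ["red"] * 3 + ["blue"] * 3 + ["green"] * 2 + ["yellow"] * 2

group_a = color_group(mostly_purple)
group_b = color_group(no_dominant)
```

Here the first image is half purple and lands in the purple group, while the second has no color above 30 percent and lands in "mixed colors". If two colors each reached 40 percent, this sketch would return whichever `Counter` lists first; the source does not say how such ties were handled.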

Making Recommendations

Each of the three inputs (style, color, type) filters out pieces from the collection, so that after all three preferences are designated, we are left with a subset of pieces from the collection. When the first button ("Recommend A Piece To Me") is clicked, an image (image A) is randomly chosen from that filtered subset and displayed. If there is more than one piece in the subset, the displayed image can change with each click of the first button. Each time the second button ("Works By The Same Maker") is clicked, one piece is randomly selected from the set of images that share the same maker as image A and displayed. Again, if there is more than one piece in that set, the displayed image can change with each click of the second button. To help the user understand what each feature/input means, we included tabs at the top of the page that explain exactly how we divided the collection into subsets based on a given feature/input.
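The two-button logic can be sketched as follows. This is a Python illustration rather than the app's Shiny server code; the records, field names, and function names are hypothetical, and the sketch assumes that the second button excludes image A itself and draws from the whole collection rather than the filtered subset.

```python
import random

# Hypothetical records standing in for the uploaded dataset.
collection = [
    {"id": "A1", "type": "painting", "color": "blue", "style": 1, "maker": "Maker X"},
    {"id": "A2", "type": "painting", "color": "blue", "style": 1, "maker": "Maker Y"},
    {"id": "A3", "type": "drawing", "color": "red", "style": 2, "maker": "Maker X"},
    {"id": "A4", "type": "photo", "color": "blue", "style": 3, "maker": "Maker X"},
]

def recommend(pieces, art_type, color, style, rng=random):
    """First button: random piece matching all three preferences."""
    subset = [
        p for p in pieces
        if p["type"] == art_type and p["color"] == color and p["style"] == style
    ]
    return rng.choice(subset) if subset else None

def same_maker(pieces, piece, rng=random):
    """Second button: random other piece with the same maker label."""
    candidates = [
        p for p in pieces
        if p["maker"] == piece["maker"] and p["id"] != piece["id"]
    ]
    return rng.choice(candidates) if candidates else None

image_a = recommend(collection, "painting", "blue", 1)
another = same_maker(collection, image_a)  # None if the maker has no other piece
```

Both functions return `None` when nothing qualifies, which a real interface would need to turn into a message for the user.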

Limitations

There are a few limitations to our project. First, the “maker” category that we found in the metadata can be vague and is not necessarily the name of an artist. Some values for “maker” include “Mesopotamian”, “Anonymous (Italian)”, and “Roman?”. These tags do not provide much information about the actual maker of the art, and two pieces with the same maker label are likely to have been made by two different people. Furthermore, some classifications for the “type” of art that we found in the metadata are not mutually exclusive. We did our best to reclassify some of the images through the use of regular expressions. For example, an image classified under “WCMA Prendergast” was reclassified into one of the other categories (painting, drawing, photo, prints) based on the description of the work. Nevertheless, our reclassification method is not 100 percent accurate. Careful, manual reclassification by someone with art expertise would be required to attain better accuracy.
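A regular-expression reclassification of this kind might look like the sketch below. It is written in Python for illustration only, and the keyword patterns are invented: the expressions actually used in the project are not given in this manual.

```python
import re

# Hypothetical keyword patterns, checked in order; not the project's own.
PATTERNS = [
    ("painting", re.compile(r"\b(oil|acrylic|tempera)\b", re.IGNORECASE)),
    ("drawing", re.compile(r"\b(pencil|graphite|charcoal|ink)\b", re.IGNORECASE)),
    ("photo", re.compile(r"\b(photograph|gelatin silver)\b", re.IGNORECASE)),
    ("prints", re.compile(r"\b(monotype|etching|lithograph|woodcut)\b", re.IGNORECASE)),
]

def reclassify(original_type, description):
    """Map a "WCMA Prendergast" piece to a medium based on its description.
    Other types, and descriptions with no keyword match, are left unchanged."""
    if original_type != "WCMA Prendergast":
        return original_type
    for new_type, pattern in PATTERNS:
        if pattern.search(description):
            return new_type
    return original_type

example = reclassify("WCMA Prendergast", "Oil on canvas")
```

The sketch also shows why such a method falls short of full accuracy: a description that mentions several media is assigned to whichever pattern matches first, and a description with none of the keywords is not reclassified at all.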

Ethical Considerations

As statisticians, it is important to consider the potential ramifications of our work when working with any kind of data. Therefore, while our work with the WCMA art collection seems relatively benign compared to, for example, a project related to college admissions decisions, it is still important to think about the consequences of our work so that we can identify problems to improve upon. For this project, we can think of three such ethical considerations.

First, in our attempt to streamline the digital museum-visiting process and directly give the user recommendations based on their specified preferences, we could be depriving the viewer of the "surprise" element of the museum experience. Because the viewer will only see pieces that correspond to their preferences, the Shiny app does not necessarily give them a chance to be exposed to pieces that would change their minds about those preferences.

Second, part of the collection was collected during the colonial period, potentially through ethically questionable means. Through our application, we may therefore be recommending pieces that should never have belonged to the museum in the first place, and possibly sending users to the museum to view the art without their knowing how it was acquired. Thus, there is a possibility that we are perpetuating certain injustices surrounding the pieces themselves by recommending them to users. One solution to this problem would be to find out how each piece was acquired and display that information to the user.

Lastly, art is something that should be available to anyone who would like to appreciate it, but digitizing the pieces into images limits the audience to those with sight. Because our application is visually based, it excludes those who cannot enjoy the artworks visually. Blind users who would like to experience visual art and get recommendations based on their preferences would not be able to use our application at all. Fortunately, there are R packages that allow us to transform digital images into sound, which could make digital art such as the collection we are working with accessible to blind users. However, the transformation from image to sound is still a developing technology and thus cannot make an image “sound” exactly as it should be heard. Hopefully, with further advances in technology, we can one day make the experience of art widely accessible to people who have limited vision.