ICP 6 - PallaviArikatla/Big-Data-Programming GitHub Wiki

OBJECTIVE: TO PERFORM QUERIES USING APACHE SOLR

For the Music dataset:

Steps:

solrctl instancedir --generate /tmp/music1

Review the dataset you are working with and change the field definitions in the schema so that they match the IDs (column names) in the dataset.

Type the following command:

gedit /tmp/music1/conf/schema.xml

This opens the schema file in an editor. Edit the field definitions so that they match the fields in the dataset.
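
As a rough illustration of the kind of entries that go into schema.xml, the sketch below prints one field definition per dataset column. The column names (id, artist, title, year) are hypothetical placeholders, not the real Music dataset columns, and the field types (string, text_general, int) are assumed to be defined in the generated schema.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical columns of the Music dataset mapped to Solr field types.
# Replace both the names and the types with those of the real dataset.
COLUMNS = {
    "id": "string",
    "artist": "text_general",
    "title": "text_general",
    "year": "int",
}

def field_line(name, field_type):
    """Return one <field/> definition suitable for pasting into schema.xml."""
    return (
        f"<field name={quoteattr(name)} type={quoteattr(field_type)} "
        'indexed="true" stored="true"/>'
    )

# One definition per column, in the order the columns are listed above.
schema_lines = [field_line(name, ftype) for name, ftype in COLUMNS.items()]
print("\n".join(schema_lines))
```

The printed lines would be placed inside the fields section of schema.xml, replacing the template fields that the dataset does not use.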

Next, create the instance directory and then the collection using the following commands:

solrctl instancedir --create music1 /tmp/music1

solrctl collection --create music1

The collection is then created and becomes available in the Solr interface. Select the dataset file and upload it through the interface.
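
The upload can also be done over HTTP instead of through the interface. The sketch below only builds the request and does not send it. It assumes Solr listens on localhost:8983 and that the dataset is a CSV file; the sample rows and column names are hypothetical.

```python
import urllib.request

def build_csv_upload(collection, csv_bytes, base_url="http://localhost:8983/solr"):
    """Build (but do not send) a POST request that indexes CSV data."""
    # commit=true makes the documents searchable as soon as the upload finishes.
    url = f"{base_url}/{collection}/update?commit=true"
    return urllib.request.Request(
        url,
        data=csv_bytes,
        headers={"Content-Type": "application/csv"},
        method="POST",
    )

# Hypothetical sample standing in for the contents of the real dataset file.
sample = b"id,artist,title\n1,Example Artist,Example Title\n"
request = build_csv_upload("music1", sample)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```
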
Now run the following queries:
Keyword matching: The search returns the documents that contain the given keyword in the specified field (field:keyword).
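
A keyword query can be sketched as a URL for the collection's select handler. The field name title, the keyword love, and the host localhost:8983 are assumptions made for illustration only.

```python
from urllib.parse import urlencode

def keyword_query(field, keyword):
    """Build a field:keyword query string."""
    return f"{field}:{keyword}"

def select_url(collection, q, base_url="http://localhost:8983/solr"):
    """Build the URL of a select request that returns JSON."""
    return f"{base_url}/{collection}/select?{urlencode({'q': q, 'wt': 'json'})}"

# Hypothetical field and keyword; use a field that exists in schema.xml.
q = keyword_query("title", "love")
url = select_url("music1", q)
print(url)
```

The same query string (title:love) can be typed into the q box of the query page in the Solr interface.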

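Wildcard matching: The search matches terms against a pattern, where * stands for any sequence of characters and ? stands for a single character, and returns the documents that contain a matching term.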
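
To show which terms a wildcard pattern would select, the sketch below builds the query string and emulates the * and ? semantics locally with Python's fnmatch module. This is only an approximation of Solr's matching (fnmatch also gives special meaning to square brackets), and the field name and term list are hypothetical.

```python
from fnmatch import fnmatchcase

def wildcard_query(field, pattern):
    """Build a field:pattern query string, e.g. title:lov*."""
    return f"{field}:{pattern}"

def matching_terms(pattern, terms):
    """Local emulation: keep the terms that the wildcard pattern matches."""
    return [term for term in terms if fnmatchcase(term, pattern)]

# Hypothetical indexed terms, used only to illustrate the pattern semantics.
terms = ["love", "lover", "live", "leave"]

print(wildcard_query("title", "lov*"))   # query string to send to Solr
print(matching_terms("lov*", terms))     # * matches any run of characters
print(matching_terms("l?ve", terms))     # ? matches exactly one character
```
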

Boost: A boost factor attached to a term (term^factor) raises the relevance of matches on that term, so documents that match it rank higher in the results.
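
A boosted query can be sketched as follows. Here a match in the title field is weighted three times as much as a match in the artist field. Both field names, the keyword, and the host are assumptions for illustration.

```python
from urllib.parse import urlencode

def boosted_query(keyword, boosts):
    """Search the keyword in several fields, each with its own boost factor."""
    return " ".join(
        f"{field}:{keyword}^{factor}" for field, factor in boosts.items()
    )

# Hypothetical fields: title matches count three times as much as artist matches.
q = boosted_query("love", {"title": 3, "artist": 1})
url = "http://localhost:8983/solr/music1/select?" + urlencode({"q": q, "wt": "json"})
print(q)
print(url)
```
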

For the Books dataset:

Use the following commands:

solrctl instancedir --generate /tmp/books

gedit /tmp/books/conf/schema.xml

solrctl instancedir --create books /tmp/books

solrctl collection --create books

Proximity: The search returns the documents in which the quoted words occur within the given distance of each other ("word1 word2"~distance).
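
A proximity query can be sketched in the same way. The field name title, the words data and mining, the distance 2, and the host localhost:8983 are hypothetical values chosen for illustration.

```python
from urllib.parse import urlencode

def proximity_query(field, words, distance):
    """Build a query such as title:"data mining"~2."""
    phrase = " ".join(words)
    return f'{field}:"{phrase}"~{distance}'

# Hypothetical field and words; the distance is counted in word positions.
q = proximity_query("title", ["data", "mining"], 2)
url = "http://localhost:8983/solr/books/select?" + urlencode({"q": q, "wt": "json"})
print(q)
print(url)
```

A larger distance loosens the match, so more documents are returned.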