ICP 6 - PallaviArikatla/Big-Data-Programming GitHub Wiki
OBJECTIVE: TO PERFORM QUERIES USING APACHE SOLR
QUESTION 1:
For the Music dataset.
Steps:
- Generate a configuration template for the music1 instance directory with the following command:
 
solrctl instancedir --generate /tmp/music1
- Review the dataset and edit the field definitions in the schema so that they match the field names (IDs) used in the dataset.
 
Open the schema file with the following command:
gedit /tmp/music1/conf/schema.xml
This opens schema.xml in the editor; edit the fields there to match the dataset.
- Create the instance directory and then the collection with the following commands:
 
solrctl instancedir --create music1 /tmp/music1
solrctl collection --create music1
- The new collection now appears in the Solr admin interface. Select the dataset file and upload it there.
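The upload can also be done from the command line instead of the admin interface. The following is a minimal sketch, not the method used in this exercise. It assumes the dataset is a CSV file, that Solr listens on localhost:8983 (override with SOLR_URL), and a hypothetical file name music.csv; none of these are stated above.

```shell
#!/bin/sh
# Assumption: Solr is reachable at this base URL; override via the environment.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Print the update endpoint for a collection; commit=true makes the
# uploaded documents searchable as soon as the request completes.
update_url() {
  printf '%s/%s/update?commit=true\n' "$SOLR_URL" "$1"
}

# Post a CSV file ($2) to the collection ($1). The first line of the CSV
# is expected to hold field names that exist in schema.xml.
upload_csv() {
  curl -sS "$(update_url "$1")" \
    -H 'Content-Type: application/csv' \
    --data-binary "@$2"
}

# Example call (hypothetical file name):
#   upload_csv music1 music.csv
```

If the dataset is in JSON rather than CSV, the same endpoint accepts it with the content type changed to application/json.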
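- Now run the following queries:
 - Keyword matching: searches the given field for documents that contain exactly the specified keyword.
 - Wildcard matching: matches terms against a pattern, where an asterisk (*) stands for any sequence of characters and a question mark (?) for a single character.
 - Range: returns documents whose field value lies within the specified range, written as [low TO high].
 - Boost: raises the relevance score of documents that match the boosted term, according to the boost factor given after the ^ sign.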
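The four query types above can be sketched as expressions for the q parameter. This is a minimal illustration in the standard Solr query syntax, not the exact queries run in this exercise. The field names (artist, title, year) and search terms are hypothetical, since the schema of the Music dataset is not shown here, and the base URL is an assumption.

```shell
#!/bin/sh
# Assumption: Solr is reachable at this base URL; override via the environment.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Each helper prints one query expression for the q parameter.
keyword_q()  { printf '%s:%s\n' "$1" "$2"; }               # field:term
wildcard_q() { printf '%s:%s*\n' "$1" "$2"; }              # field:prefix*
range_q()    { printf '%s:[%s TO %s]\n' "$1" "$2" "$3"; }  # field:[low TO high]
boost_q()    { printf '%s:%s^%s\n' "$1" "$2" "$3"; }       # field:term^factor

# Send a query ($2) to a collection ($1); curl URL-encodes the parameters.
solr_select() {
  curl -sS -G "$SOLR_URL/$1/select" \
    --data-urlencode "q=$2" \
    --data-urlencode "wt=json"
}

# Example calls (hypothetical field names and values):
#   solr_select music1 "$(keyword_q artist Beatles)"
#   solr_select music1 "$(wildcard_q artist Bea)"
#   solr_select music1 "$(range_q year 2000 2010)"
#   solr_select music1 "$(boost_q title love 2) OR $(keyword_q artist love)"
```

The same expressions can be pasted into the q box of the Query page in the Solr admin interface. A boost only changes the ordering when it is combined with other clauses, as in the last example.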
 
QUESTION 2:
For the Books dataset.
- Follow the same procedure as above to create the second instance.
 
Use the following commands:
solrctl instancedir --generate /tmp/books
gedit /tmp/books/conf/schema.xml
solrctl instancedir --create books /tmp/books
solrctl collection --create books
- In the gedit step, edit the schema so that its fields match the dataset.
 
- Run the following queries in Solr:
 - Keyword matching: searches the given field for documents that contain exactly the specified keyword.
 - Proximity: returns documents in which the given words occur within the specified number of positions of each other.
 - Sort: orders the results by the field given in the sort parameter, in ascending or descending order.
 - Fuzzy logic: returns documents that contain terms similar to the search term, which tolerates small spelling differences.
 - AND logic: returns only the documents that satisfy both conditions of the query.
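The proximity, fuzzy, AND and sort queries above can be sketched in the same way. This is a minimal illustration in the standard Solr query syntax, not the exact queries run in this exercise. The field names (title, author, cat, price) and search terms are hypothetical, since the schema of the Books dataset is not shown here, and the base URL is an assumption.

```shell
#!/bin/sh
# Assumption: Solr is reachable at this base URL; override via the environment.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Each helper prints one query expression for the q parameter.
proximity_q() { printf '%s:"%s %s"~%s\n' "$1" "$2" "$3" "$4"; }  # field:"a b"~distance
fuzzy_q()     { printf '%s:%s~%s\n' "$1" "$2" "$3"; }            # field:term~edits
and_q()       { printf '%s AND %s\n' "$1" "$2"; }                # clause AND clause

# Send a query ($2) to a collection ($1), sorted by $3 (e.g. "price asc").
# Sorting is a separate request parameter, not part of the q expression.
solr_select_sorted() {
  curl -sS -G "$SOLR_URL/$1/select" \
    --data-urlencode "q=$2" \
    --data-urlencode "sort=$3" \
    --data-urlencode "wt=json"
}

# Example calls (hypothetical field names and values):
#   solr_select_sorted books "$(proximity_q title game thrones 2)" "score desc"
#   solr_select_sorted books "$(fuzzy_q author Martn 1)" "score desc"
#   solr_select_sorted books "$(and_q cat:book author:Martin)" "price asc"
```

In the admin interface, the sort specification goes into the separate sort box of the Query page rather than into q.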
 
VIDEO LINK: http://youtu.be/Ln1pxWMorLI?hd=1