GettingStartedFromABinaryDistribution - adjam/solrmarc GitHub Wiki
This getting started guide describes the SolrMarc binary distribution for Blacklight and VuFind, defines its software dependencies, and guides the user through using the SolrMarc binary to index a file of MARC records.
### About SolrMarc ###
SolrMarc is a utility that reads in MARC records from a file, extracts information from various fields as specified in an indexing configuration script, and adds that information to a specified SOLR index.
SolrMarc provides a rich set of techniques for mapping the tags, fields, and subfields contained in the MARC record to the fields you wish to include in your SOLR index record, but it also allows the creation of custom index functions if you cannot achieve what you require using the predefined mapping techniques.
Currently, SolrMarc is configured to work with Blacklight and VuFind.
NOTE: If you anticipate a need for custom indexing functions, you will need to download the SolrMarc source code and build the package using the instructions in the GettingStartedFromASourceDistribution document on this wiki. However, with the recent addition of support for custom indexing scripts, you can now accomplish anything a custom indexing function can do by writing a custom script that is interpreted at runtime.
### Software Dependencies ###
SolrMarc requires the Java Runtime Environment (JRE) version 1.5 or newer, or the Java Development Kit (JDK) version 1.5 or newer.
To check the version of your installed Java, type the following at the command prompt:
java -version
To download a suitable version of Java, go to http://java.sun.com/javase/downloads/index.jsp
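The version check above can also be scripted. The following is a minimal sketch, not part of the SolrMarc distribution: `java_major_version` is a hypothetical helper name, and it assumes the first line of the `java -version` output has the usual `... version "x.y.z"` form (the JVM prints that banner on standard error).

```shell
# Hypothetical helper: extract the major version from a `java -version` banner line.
# Handles both the old "1.5.0_22" scheme and the newer "11.0.2" scheme.
java_major_version() {
  local ver
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;;  # 1.5.0_22 -> 5
    *)   printf '%s\n' "${ver%%.*}" ;;                  # 11.0.2   -> 11
  esac
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr, so redirect it before reading the first line.
  banner=$(java -version 2>&1 | head -n 1)
  major=$(java_major_version "$banner")
  if [ "${major:-0}" -ge 5 ] 2>/dev/null; then
    echo "Java $major found: OK"
  else
    echo "Java 1.5 or newer is required (found: $banner)"
  fi
else
  echo "java was not found on the PATH"
fi
```

If the banner cannot be parsed, the script reports the raw line instead of guessing.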
### Downloading the SolrMarc Binary Distribution ###
To get the binary distribution, download SolrMarc_GenericBlacklight_Binary_Unix.tar.gz, SolrMarc_GenericBlacklight_Binary_PC.zip, SolrMarc_GenericVuFind_Binary_Unix.tar.gz, or SolrMarc_GenericVuFind_Binary_PC.zip from the project's downloads page. The binary distribution is a simpler option than building from source. It is delivered as a single large .zip or .tar.gz file. When unpacked, it creates a directory containing a large SolrMarc.jar file, a number of .properties files, and a couple of subdirectories.
The only difference between the Unix and PC versions of each binary distribution is that the Unix version contains a number of bash shell scripts for running the SolrMarc indexer and its associated utility programs, whereas the PC version contains batch files that perform the same tasks.
### Unpacking the Binary Distribution ###
Create a directory and copy the SolrMarc distribution you just downloaded into it. You can do this anywhere, but for this example let's create a directory called `indexer` at the top level of either your Blacklight or your VuFind directory. Now unpack the distribution file. On Unix, the command is _tar zxvf filename_. On Windows, you can run WinZip or a similar program.
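The Unix steps above can be combined into one small function. This is only a sketch: `unpack_dist` is a hypothetical name, and the archive file name in the usage comment is one of the downloads listed earlier. Whether the archive adds its own top-level directory is worth checking first with `tar ztf filename`.

```shell
# Hypothetical helper: create the target directory and unpack the archive into it.
# $1: path to the downloaded .tar.gz file, $2: directory to unpack into.
unpack_dist() {
  archive=$1
  dest=$2
  mkdir -p "$dest" && tar zxvf "$archive" -C "$dest"
}

# Example, run from the top level of your Blacklight or VuFind directory:
#   unpack_dist SolrMarc_GenericBlacklight_Binary_Unix.tar.gz indexer
```

Using `-C` unpacks straight into the target directory, so the archive does not have to be copied there first.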
### Configuring SolrMarc ###
SolrMarc uses a series of Java properties files for its configuration, and these are placed in the directory where you unpacked the binary distribution. Some of the values in these properties files **may** need to be set before you will be able to run SolrMarc to produce an index for your VuFind or Blacklight installation.
As distributed, the binary releases of SolrMarc are configured to point at and write to the solr directory inside the jetty directory that is unpacked with the rest of SolrMarc. If you want to use a different Solr installation that you already have elsewhere on your system, you will need to modify the config.properties file.
### Running SolrMarc ###
You will then be ready to index MARC records into the Solr index used by your implementation of Blacklight or VuFind with the following command:
indexer/indexfile /path/to/marcrecords.mrc
The command displays informational messages and warnings while it processes the MARC records.
To index the sample records included in the test_data directory of the demo distribution, use the following commands:
indexer/indexfile ./test_data/test_data.utf8.mrc
indexer/indexfile ./test_data/lc_records.utf8.mrc
or to index both at one time:
cat ./test_data/*.mrc | indexer/indexfile
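If you would rather index many files one at a time and see which file a problem came from, a loop is an alternative to `cat`. The sketch below makes two assumptions: `index_all` is a hypothetical helper name, and the indexing script is assumed to return a non-zero exit status on failure, which this guide does not confirm.

```shell
# Hypothetical helper: run an indexing command once per MARC file.
# $1: the indexing command (for example indexer/indexfile); remaining args: MARC files.
index_all() {
  indexer=$1
  shift
  for f in "$@"; do
    # Skip names that do not exist (for example an unmatched glob pattern).
    [ -f "$f" ] || { echo "skipping missing file: $f" >&2; continue; }
    echo "indexing $f"
    # Assumption: the command exits non-zero when indexing fails.
    "$indexer" "$f" || { echo "indexer failed on $f" >&2; return 1; }
  done
}

# Example:
#   index_all indexer/indexfile ./test_data/*.mrc
```

Stopping at the first failure keeps the offending file name at the end of the output.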
### Changing your indexing options ###
Chances are you aren't going to want to index your own data exactly the way we have things set up for the demo application. Here's how to start making changes to the index mappings.
Go to the `indexer` directory where you unpacked SolrMarc. The properties files that control how SolrMarc runs and which fields are added to the Solr index sit right next to the SolrMarc.jar file, which contains all of the code for running SolrMarc.
#### demo\_config.properties ####
The main configuration file is named demo\_config.properties. A Blacklight example of it is shown below.
```
# Properties for the Java import program
# for more documentation, see
#  http://code.google.com/p/solrmarc/wiki/ConfiguringSolrMarc

solrmarc.solr.war.path=jetty/webapps/solr.war
solrmarc.custom.jar.path=
solr.indexer.properties = demo_index.properties, demo_local_index.properties
solr.indexer = org.solrmarc.index.SolrIndexer
marc_permissive = true
solr.path = jetty/solr
solr.data.dir = jetty/solr/data
solr.hosturl = http://localhost:8983/solr
marc.default_encoding = MARC8
marc.to_utf_8 = true
marc.include_errors = false
```
Depending on your local MARC records, you might want to change the default encoding or other values. If you need a lot of customization, it is probably better to build a custom distribution from source. For instructions on building a custom distribution, please see the [GettingStarted](GettingStarted.md) doc on this wiki.
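Before changing a value such as the default encoding, it can help to confirm what the file currently sets. The following sketch is not a SolrMarc tool: `get_prop` is a hypothetical helper that reads simple `key=value` or `key = value` lines only, and the path in the usage comment assumes the `indexer` directory used in this guide.

```shell
# Hypothetical helper: print the value of one key from a Java-style properties file.
# Only plain "key=value" / "key = value" lines are handled (no line continuations).
# $1: properties file, $2: key name.
get_prop() {
  file=$1
  # Escape dots so a key like marc.default_encoding is matched literally.
  key=$(printf '%s' "$2" | sed 's/\./\\./g')
  # Strip "key =" from matching lines and keep the last definition found.
  sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Example:
#   get_prop indexer/demo_config.properties marc.default_encoding
```

Run against the example file shown above, the usage line would report the `marc.default_encoding` setting, which is `MARC8` in that example.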
#### demo\_index.properties ####
The configuration file that handles all of the mappings from MARC to Solr is `demo_index.properties`. The `demo_index.properties` that is configured to work with the Blacklight demo application looks like this:
```
# for more information on solrmarc mappings,
#  see http://blacklight.rubyforge.org/ DEMO_README file

id = 001, first
marc_display = FullRecordAsMARC
text = custom, getAllSearchableFields(100, 900)

language_facet = 008[35-37]:041a:041d, language_map.properties

format = 000[6-7]:000[6]:007[0], (map.format), first
isbn_t = 020a, (pattern_map.isbn_clean)
material_type_display = custom, removeTrailingPunct(300aa)

title_t = custom, getLinkedFieldCombined(245a)
title_display = custom, removeTrailingPunct(245a)
title_vern_display = custom, getLinkedField(245a)

subtitle_t = custom, getLinkedFieldCombined(245b)
subtitle_display = custom, removeTrailingPunct(245b)
subtitle_vern_display = custom, getLinkedField(245b)

title_addl_t = custom, getLinkedFieldCombined(245abnps:130[a-z]:240[a-gk-s]:210ab:222ab:242abnp:243[a-gk-s]:246[a-gnp]:247[a-gnp])
title_added_entry_t = custom, getLinkedFieldCombined(700[gk-pr-t]:710[fgk-t]:711fgklnpst:730[a-gk-t]:740anp)
title_series_t = custom, getLinkedFieldCombined(440anpv:490av)
title_sort = custom, getSortableTitle

author_t = custom, getLinkedFieldCombined(100abcegqu:110abcdegnu:111acdegjnqu)
author_addl_t = custom, getLinkedFieldCombined(700abcegqu:710abcdegnu:711acdegjnqu)
author_display = custom, removeTrailingPunct(100abcdq:110[a-z]:111[a-z])
author_vern_display = custom, getLinkedField(100abcdq:110[a-z]:111[a-z])
author_sort = custom, getSortableAuthor

subject_t = custom, getLinkedFieldCombined(600[a-u]:610[a-u]:611[a-u]:630[a-t]:650[a-e]:651ae:653aa:654[a-e]:655[a-c])
subject_addl_t = custom, getLinkedFieldCombined(600[v-z]:610[v-z]:611[v-z]:630[v-z]:650[v-z]:651[v-z]:654[v-z]:655[v-z])
subject_topic_facet = custom, removeTrailingPunct(600abcdq:610ab:611ab:630aa:650aa:653aa:654ab:655ab)
subject_era_facet = custom, removeTrailingPunct(650y:651y:654y:655y)
subject_geo_facet = custom, removeTrailingPunct(651a:650z)

published_display = custom, removeTrailingPunct(260a)
published_vern_display = custom, getLinkedField(260a)

pub_date = custom, getDate

lc_callnum_display = 050ab, first
lc_1letter_facet = 050a[0], callnumber_map.properties, first
lc_alpha_facet = 050a, (pattern_map.lc_alpha), first
lc_b4cutter_facet = 050a, first

url_fulltext_display = custom, getFullTextUrls
url_suppl_display = custom, getSupplUrls
```
See [ConfiguringSolrMarc](ConfiguringSolrMarc.md) for more information about configuration options for these files.
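To make one of the mappings above concrete, consider the `008[35-37]` part of the `language_facet` line. Assuming it means "characters 35 through 37 of control field 008, counted from zero" (the positions MARC 21 assigns to the language code), the effect can be imitated in the shell. This is an illustration only, not SolrMarc's code: `field_positions` is a hypothetical helper and the 008 value is synthetic sample data.

```shell
# Hypothetical helper: print characters start..end (zero-based, inclusive)
# of a fixed-length field value.
field_positions() {
  value=$1
  start=$2
  end=$3
  # cut counts from 1, so shift both bounds by one.
  printf '%s\n' "$value" | cut -c "$((start + 1))-$((end + 1))"
}

# Synthetic 40-character 008 value: the first part is padded to 35 characters,
# so "eng" lands in positions 35-37.
f008=$(printf '%-35s%s' '850423s1985    nyu' 'eng d')

field_positions "$f008" 35 37   # prints: eng
```

In the mapping shown above, the second part of the `language_facet` line names `language_map.properties`, the map file applied to the extracted code.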