Command Line Usage - NatLibFi/RecordManager GitHub Wiki

Command Line Reference

**N.B. This reference is for the 2.x version of RecordManager. Older versions have several separate entry points for the commands. See older revisions of this page for version 1.x.**

This reference explains the most important commands.

All command line functions are executed with the ./console command as the entry point. Run the program without parameters for a list of valid commands. To show a detailed description of a command, use the help command, e.g. ./console help solr:update-index. Note that command names can be abbreviated as long as the abbreviation is unambiguous, e.g. ./console so:up.

All the command line programs should report their activities and end with a summary. If a program ends abruptly without an error or success message, check the PHP error log, or rerun the program with the --verbose parameter for more detailed output on its activity. Some commands also increase output verbosity according to the verbosity level; try -vvv for maximum verbosity.

Configuration parameters can be changed on the fly with the --config parameter, e.g. ./console --config.Solr.update_url=http://localhost:8983/solr/update

Using the --lock parameter ensures that a command scheduled e.g. with cron doesn't get run multiple times in parallel. With --lock you could e.g. schedule harvesting to run "continuously", i.e. once per minute, and it would only start if the previous process has completed.
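The scheduling described above can be sketched as a small wrapper script. This is a minimal illustration, not part of RecordManager: the script name harvest-cron.sh, the CONSOLE variable, and the installation path in the crontab comment are assumptions; only records:harvest and --lock come from this reference.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as harvest-cron.sh and scheduled once per
# minute with a crontab entry such as (the path is an assumption):
#   * * * * * cd /path/to/RecordManager && ./harvest-cron.sh

# Path to the console entry point; override CONSOLE if it lives elsewhere.
CONSOLE="${CONSOLE:-./console}"

harvest_locked() {
    # With --lock, this run only starts if the previous one has completed.
    "$CONSOLE" records:harvest --lock
}

# Only harvest when the entry point is actually present and executable.
if [ -x "$CONSOLE" ]; then
    harvest_locked
fi
```

Setting CONSOLE=echo makes the wrapper print the arguments instead of running them, which is a cheap way to check it before scheduling.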

Import

The records:import command can be used to import files containing metadata records into the RecordManager database. For typical continuous use scenarios, harvesting records (see below) is recommended instead of manual imports.

Harvest

The records:harvest command can be used to harvest metadata records from different data sources using a protocol suitable for automatic harvesting (e.g. OAI-PMH). This is recommended for typical usage scenarios where continuous (incremental) updates from the data sources are needed.

It is recommended to automate harvesting at suitable intervals e.g. with cron.

Deduplication

When deduplication is enabled for data sources, the records need to go through the deduplication process, which is run with the records:deduplicate command. This command goes through all records marked for deduplication and finds duplicates using so-called deduplication keys. See Deduplication for more information.

Solr Index Update

Added, updated and deleted records are sent to the Solr index with the solr:update-index command. This process should typically be scheduled to run regularly.

The command tracks the time of the previous update and incrementally sends any changes to Solr. This includes any changes made by the deduplication process. However, a full, manual update is required when mappings or other indexing rules are changed. This can be accomplished with the ./console solr:update-index --all command.
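One way to chain the commands described so far is a wrapper that runs harvesting, deduplication, and the Solr update in sequence and stops at the first failure. This is a sketch under assumptions: the update_pipeline function and the CONSOLE variable are hypothetical names, and it assumes each command is acceptable to run without further options in your setup.

```shell
#!/bin/sh
# Hypothetical pipeline wrapper; not part of RecordManager itself.

# Path to the console entry point; override CONSOLE if it lives elsewhere.
CONSOLE="${CONSOLE:-./console}"

update_pipeline() {
    # Each step runs only if the previous one succeeded.
    # Extra arguments (e.g. --all) are passed to the Solr update step only.
    "$CONSOLE" records:harvest &&
    "$CONSOLE" records:deduplicate &&
    "$CONSOLE" solr:update-index "$@"
}

# Only run when the entry point is actually present and executable.
if [ -x "$CONSOLE" ]; then
    update_pipeline "$@"
fi
```

Any arguments are passed on to the Solr step, so update_pipeline --all would perform the full update mentioned above after a change to mappings or indexing rules.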

Renormalization

Records may need to be renormalized, e.g. if the normalization XSLT is changed or deduplication is enabled or disabled. This can be accomplished with the records:renormalize command.

Export

The records:export command can be used to export data from the RecordManager database.

xpath Parameter Examples

To export MARC records that have a 740 field:

./console records:export --xpath="//datafield[@tag='740']" field740.xml

To match the field contents:

./console records:export --xpath="//datafield[@tag='740']/subfield[@code='a' and text()='In the land of the crane']" crane.xml
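XPath expressions contain brackets, quotes, and sometimes spaces, all of which the shell interprets before ./console sees them. The sketch below uses a hypothetical helper, show_args, as a stand-in for the program to illustrate why wrapping the whole --xpath argument in double quotes is the safer habit; it does not call RecordManager.

```shell
#!/bin/sh
# show_args is a hypothetical stand-in for ./console: it prints each argument
# it receives on its own line, exactly as the shell passed it.
show_args() {
    printf '%s\n' "$@"
}

# Unquoted: the shell removes the single quotes before the program runs, and
# the square brackets are also subject to filename pattern matching.
show_args --xpath=//datafield[@tag='740']

# Double-quoted: the expression arrives unchanged, inner single quotes included.
show_args "--xpath=//datafield[@tag='740']"

# Double-quoted with spaces in the value: still delivered as one argument.
show_args "--xpath=//subfield[@code='a' and text()='In the land of the crane']"
```

Comparing the first two output lines shows the difference: only the quoted form preserves the string literal '740' as written.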

Data Source Configuration Changes

There are several commands in the sources category for manipulating the datasources.ini configuration file programmatically. This can be useful for mass updates etc.