Command Line Usage - NatLibFi/RecordManager GitHub Wiki
Command Line Reference
**N.B. This reference is for the 2.x version of RecordManager. Older versions have several separate entry points for the commands. See older revisions of this page for version 1.x.**
This reference explains the most important commands.
All command line functions are executed with the ./console command as the entry point. Run the program without parameters for a list of valid commands. To show a detailed description of a command, use the help command, e.g. ./console help solr:update-index. Note that commands can be abbreviated as long as the abbreviation is unambiguous, e.g. ./console so:up.
All the command line programs should report their activities and end with a summary. If a program ends abruptly without an error or success message, check the PHP error log for errors, or run the program with the --verbose parameter to see more detailed output on its activity. Some commands also increase output verbosity according to the verbosity level; try -vvv for maximum verbosity.
Configuration parameters can be changed on the fly with the --config parameter, e.g.
./console --config.Solr.update_url=http://localhost:8983/solr/update
Using the --lock parameter ensures that a command scheduled e.g. with cron is not run multiple times in parallel. With --lock you could, for example, schedule harvesting to run "continuously", i.e. once per minute, and it would only start if the previous process has completed.
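A minimal sketch of a wrapper that a cron job could call to run harvesting with --lock. The CONSOLE variable and the harvest_once function name are assumptions made for illustration, not part of RecordManager, and the placement of --lock after the command name is also assumed:

```shell
#!/bin/sh
# Hypothetical helper for scheduled harvesting.
# CONSOLE is an assumed variable; point it at your RecordManager console script.
CONSOLE="${CONSOLE:-./console}"

harvest_once() {
    # --lock is meant to keep a new run from starting while a previous one is still active.
    "$CONSOLE" records:harvest --lock
}

# Example crontab entry (paths are placeholders), running the wrapper once per minute:
#   * * * * * cd /path/to/recordmanager && . ./harvest.sh && harvest_once
```

The function only defines the call; nothing runs until harvest_once is invoked, so the file can be sourced safely from other scripts.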
Import
The records:import command can be used to import files containing metadata records into the RecordManager database. For typical continuous use scenarios, harvesting records (see below) is recommended instead of manual imports.
Harvest
The records:harvest command can be used to harvest metadata records from different data sources using a protocol suitable for automatic harvesting (e.g. OAI-PMH). This is recommended for typical usage scenarios where continuous (incremental) updates from the data sources are needed.
It is recommended to automate harvesting at suitable intervals e.g. with cron.
Deduplication
When deduplication is enabled for data sources, the records need to go through the deduplication process, which is run with the records:deduplicate command. This command goes through all records marked for deduplication and finds duplicates using so-called deduplication keys. See Deduplication for more information.
Solr Index Update
Added, updated and deleted records are sent to the Solr index with the solr:update-index command. This process should typically be scheduled to run regularly.
The command tracks the time of the previous update and incrementally sends any changes to Solr. This includes any changes made by the deduplication process. However, a full, manual update is required when mappings or other indexing rules are changed. This can be accomplished with the ./console solr:update-index --all command.
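The two update modes described above can be sketched as a small shell helper. The update_index function, its mode argument, and the CONSOLE variable are hypothetical names introduced here; only solr:update-index and --all come from this reference:

```shell
#!/bin/sh
# Hypothetical helper that selects between incremental and full index updates.
# CONSOLE is an assumed variable; point it at your RecordManager console script.
CONSOLE="${CONSOLE:-./console}"

update_index() {
    mode="${1:-incremental}"
    case "$mode" in
        # Default: send only the changes made since the previous update.
        incremental) "$CONSOLE" solr:update-index ;;
        # Use after changing mappings or other indexing rules.
        full)        "$CONSOLE" solr:update-index --all ;;
        *)
            echo "usage: update_index [incremental|full]" >&2
            return 2
            ;;
    esac
}
```

A scheduled job would typically call update_index with no argument, while update_index full would be run by hand after a configuration change.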
Renormalization
Records may need to be renormalized, e.g. if the normalization XSLT is changed or deduplication is enabled or disabled. This can be accomplished with the records:renormalize command.
Export
The records:export command can be used to export data from the RecordManager database.
xpath Parameter Examples
To export MARC records that have a 740 field:
./console records:export --xpath="//datafield[@tag='740']" field740.xml
To match the field contents:
./console records:export --xpath="//datafield[@tag='740']/subfield[@code='a' and text()='In the land of the crane']" crane.xml
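When the XPath expression is built in a script, quoting matters: the shell must pass the single quotes and square brackets through to the command unchanged. The following is a sketch under stated assumptions; export_by_tag and CONSOLE are hypothetical names, and the three-digit check assumes purely numeric MARC tags:

```shell
#!/bin/sh
# Hypothetical helper that exports records containing a given MARC field.
# CONSOLE is an assumed variable; point it at your RecordManager console script.
CONSOLE="${CONSOLE:-./console}"

export_by_tag() {
    tag="$1"
    outfile="$2"
    # Accept only three-digit tags so nothing unexpected ends up inside the XPath expression.
    case "$tag" in
        [0-9][0-9][0-9]) ;;
        *)
            echo "usage: export_by_tag <3-digit tag> <output file>" >&2
            return 2
            ;;
    esac
    # Double quotes let the shell expand $tag while keeping the single quotes
    # and brackets literal, so the command receives the expression intact.
    "$CONSOLE" records:export --xpath="//datafield[@tag='${tag}']" "$outfile"
}
```

For example, export_by_tag 740 field740.xml would issue the same command as the first example above.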
Data Source Configuration Changes
There are several commands in the sources category for manipulating the datasources.ini configuration file programmatically. This can be useful for mass updates etc.