Polonius Reader - rail5/polonius GitHub Wiki
% polonius-reader(1) Version 1.0 | Manual for the Polonius Reader
NAME
polonius-reader - outputs a selected portion of the contents of a file
SYNOPSIS
polonius-reader ./file
polonius-reader ./file --start 0 --length 10
polonius-reader ./file --search "hello world" --output-pos
OPTIONS
OVERVIEW
-i / --input
Specify file to read
The file can also be specified without the '-i' option
-s / --start
Specify start position (character number in the file)
Defaults to 0 if not specified
-l / --length
Specify how many bytes to read
Defaults to full length of the file if not specified
-b / --block-size
Specify the maximum amount of data we can load from the file into memory at any given time
Defaults to 10 kilobytes if not specified
-f / --find / --search
Search for a string in the file
Returns nothing (blank) if no matches were found
If a start position is given with -s / --start, the program will only search for matches after that start position
If a length is given with -l / --length, the program will only search for matches within that range from the start position
-p / --output-pos
Output the start and end positions, rather than the actual text
Will output in the (space-delimited) format: start end
For example: 0 5
If used with searches, this will output the start and end position of the found match. If used outside of searches, this will output the start and end position of the file read
-c / --special-chars
Parse escaped character sequences in search queries (`\n`, `\t`, `\\`, and `\x00` through `\xFF`)
-e / --regex
Interpret search query as a regular expression
-V / --version
Print version number
-h / --help
Display help message
START POSITION
Specifying the "start position" tells Polonius to skip over the beginning of the file and only start reading at the position you specify (specified in characters)
The start position can be specified with the -s option. For example: polonius-reader ./file -s 50
If the start position is not specified, it defaults to position 0 (the beginning of the file)
READ LENGTH
Specifying the "read length" tells Polonius to only read X number of characters from the start position
The read length can be specified with the -l option. For example: polonius-reader ./file -s 50 -l 10
If the read length is not specified, Polonius will read until the end of the file
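As a rough illustration of how the start position and read length combine, here is a minimal Python sketch. It is not Polonius's own code: the function name `read_range` is hypothetical, and it assumes that positions are zero-based byte offsets.

```python
def read_range(path, start=0, length=None):
    """Return `length` bytes of the file beginning at offset `start`.

    Assumptions (not taken from the Polonius source): offsets are
    zero-based byte positions, and a missing length means
    "read until the end of the file".
    """
    with open(path, "rb") as f:
        f.seek(start)  # skip over the beginning of the file
        return f.read() if length is None else f.read(length)
```

Under these assumptions, `read_range("./file", 50, 10)` mirrors what `polonius-reader ./file -s 50 -l 10` is described as doing.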
SEARCH
Polonius can search a file for a specific string
A search can be done with the -f option. For example: polonius-reader ./file -f "hello world"
If this is combined with Start Positions or Read Lengths, the search will happen only within those boundaries. For example, polonius-reader ./file -f "hello" -s 500 -l 200 will search for the string "hello" only between character #500 and character #700
This can also be combined with the "Output Positions" (-p) option. If the "Output Positions" flag is set with -p, Polonius will tell you where it found a match, in the space-delimited format startposition endposition (for example: 510 515). By default, without the "Output Positions" flag, Polonius will output the match itself.
The search function will output either:
- The found match
- The position of the found match (if -p is specified)
- Nothing (blank) if no match was found
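The following Python sketch shows one way a bounded, block-by-block search of this kind could work. It is an illustration only, not Polonius's actual implementation: the name `find_in_file` is hypothetical, positions are assumed to be byte offsets, and the end position is assumed to be exclusive (consistent with the example `510 515` for the five-character string "hello").

```python
import os

def find_in_file(path, needle, start=0, length=None, block_size=10 * 1024):
    """Return (start, end) of the first match of `needle` (bytes), or None.

    Assumptions: positions are byte offsets, `end` is exclusive, and the
    search window is [start, start + length).
    """
    if not needle:
        return None
    size = os.path.getsize(path)
    stop = size if length is None else min(size, start + length)
    overlap = len(needle) - 1            # bytes carried over between blocks
    step = max(block_size, len(needle))  # a block must be able to hold the needle
    tail = b""
    pos = start                          # file offset of the next unread byte
    with open(path, "rb") as f:
        f.seek(start)
        while pos < stop:
            chunk = f.read(min(step, stop - pos))
            if not chunk:
                break
            window = tail + chunk        # lets a match straddle two blocks
            hit = window.find(needle)
            if hit != -1:
                match_start = pos - len(tail) + hit
                return match_start, match_start + len(needle)
            tail = window[-overlap:] if overlap else b""
            pos += len(chunk)
    return None
```

Only about one block (plus a few carried-over bytes) is held in memory at a time, which is the general idea behind searching a file much larger than the available memory. A return value of `None` corresponds to the blank output described above.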
This search function is also fast. Here is an example that was run on my laptop:
- A 2.5GB file was created using randomtext
- The string "hello world" was inserted approximately 2.4GB in (right near the end of the file)
- The following commands were run through the Bash time utility:
  - polonius-reader ./big-file -f "hello world"
  - grep -o "hello world" ./big-file
- Here was the result of the Polonius command:
  hello world
  real 0m1.874s
  user 0m0.862s
  sys 0m0.980s
- Here was the result of the grep command:
  grep: memory exhausted
  real 0m9.696s
  user 0m3.112s
  sys 0m4.500s
REGEX SEARCH
A normal search can be made into a regex search by passing the -e option. For example: polonius-reader ./file -f "[a-z]+[0-9]{2}" -e
All of the above about normal searches applies also to regex searches. Regex searches, however, are significantly slower than normal searches.
Polonius is not capable of finding regex matches which are larger than the block size (default 10KB if unspecified).
BLOCK SIZE
Specifying the "Block Size" tells Polonius how much data from the file we're willing to load into memory at once.
The default value (if unspecified) is 10 kilobytes
The block size can be specified with the -b option, in the formats:
1. `-b 15` (This would set the block size to 15 bytes)
2. `-b 16K` (This would set the block size to 16 kilobytes)
3. `-b 17M` (This would set the block size to 17 megabytes)
And of course, the example numbers '15', '16', and '17' can be swapped for any arbitrary number
This option is common to both polonius-reader and polonius-editor
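To make the size formats above concrete, here is a small Python sketch of how such a value could be converted to a byte count. The function name `parse_block_size` is hypothetical, and the sketch assumes that a kilobyte is 1024 bytes and a megabyte is 1024 * 1024 bytes; this manual does not say which convention Polonius uses.

```python
def parse_block_size(text):
    """Convert a block-size argument such as '15', '16K' or '17M' to bytes.

    Assumption: K means 1024 bytes and M means 1024 * 1024 bytes.
    Only the uppercase suffixes shown in the manual are accepted here.
    """
    multipliers = {"K": 1024, "M": 1024 * 1024}
    text = text.strip()
    suffix = text[-1:]
    if suffix in multipliers:
        number, factor = text[:-1], multipliers[suffix]
    else:
        number, factor = text, 1
    if not number.isdigit():
        raise ValueError(f"invalid block size: {text!r}")
    value = int(number) * factor
    if value <= 0:
        raise ValueError("block size must be greater than zero")
    return value
```

Under these assumptions, `-b 16K` would correspond to 16384 bytes.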
OUTPUT POSITIONS
Setting the "Output Positions" flag tells Polonius not to output the actual content of the file, but instead to report the start and end positions of the content that it would otherwise have output.
The flag can be set with the -p option. Polonius will output the positions in the space-delimited format startposition endposition, for example: 10 15
This is mainly useful in two scenarios:
1. Searches
When searching for a string, often we don't just want to know *whether* a match was found, but also *where* it was found
2. Determining the length of a file
If polonius-reader is run with **no extra arguments given**, it will output the entire contents of a file.
In this case, if you set the *-p* flag, it will output something like `0 700`, where *700* is the number of characters in the file
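When the positions are consumed by a script, the space-delimited output has to be split back into two numbers. The Python sketch below shows one way to do that; the function name `parse_positions` is hypothetical, and it assumes the captured output is either blank (no match) or exactly two integers.

```python
def parse_positions(output):
    """Parse 'start end' output into a (start, end) tuple of ints.

    Returns None for blank output, which the manual describes as the
    result of a search with no match.
    """
    fields = output.split()
    if not fields:
        return None
    if len(fields) != 2:
        raise ValueError(f"expected 'start end', got: {output!r}")
    return int(fields[0]), int(fields[1])
```

For the file-length case described above, an output of `0 700` parses to `(0, 700)`, and the second number is the character count of the file.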
SPECIAL CHARACTERS
Setting the "special characters" flag tells Polonius to parse escaped character sequences in search queries. Polonius will parse `\n`, `\t`, `\\`, and `\x00` through `\xFF`.
The special characters flag can be set with the -c option.
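As an illustration of what parsing these sequences means, the Python sketch below expands the four kinds of escape listed above into raw bytes. It is not taken from the Polonius source: the name `parse_special_chars` is hypothetical, and leaving unrecognized sequences (such as `\q`) untouched is an assumption.

```python
import re

# Matches a backslash followed by n, t, another backslash, or x plus two hex digits.
_ESCAPE = re.compile(rb"\\(n|t|\\|x[0-9A-Fa-f]{2})")

def parse_special_chars(query):
    """Expand \\n, \\t, \\\\ and \\x00..\\xFF in a search query (bytes)."""
    def expand(match):
        token = match.group(1)
        if token == b"n":
            return b"\n"
        if token == b"t":
            return b"\t"
        if token == b"\\":
            return b"\\"
        return bytes([int(token[1:], 16)])  # \xHH -> single byte
    return _ESCAPE.sub(expand, query)
```

With this sketch, the query `line one\nline two` becomes a string containing a real newline, and `\x00` becomes a single NUL byte, which is otherwise awkward to pass on a command line.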