Commands and Options - ari99/sy-s3 GitHub Wiki

The available commands and options are explained in the properties file.

From jc.properties:

### localhost testing ###############
#### aws access key
#access_key=accessKey1
#### aws secret key
#secret_key=verySecretKey1
#### aws s3 endpoint url or ip: http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
#endpoint=0.0.0.0
#### aws s3 endpoint port -required otherwise the default 8000 is used
#port=8000
#### currently scality is setup not to use ssl
#### bucket string used in getting a single object or listing objects. Bucket is required for 
#### list_objects but not get_objects because get_objects reads from the db.
#bucket=mybucket
#### Add auth headers boolean. Auth headers aren't needed for public datasets.
#auth=true
#### Connect to s3 using ssl
#ssl=false
################# end localhost testing ############

############## public dataset testing ########## 
#### https://docs.opendata.aws/aft-vbi-pds/readme.html
#### aws access key
access_key=
#### aws secret key
secret_key=
#### aws s3 endpoint url or ip: http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
endpoint=s3.amazonaws.com
#### aws s3 endpoint port 
port=443
#### bucket string used in getting a single object or listing objects
bucket=aft-vbi-pds
#### Add auth headers
auth=false
#### Connect to s3 using ssl
ssl=true
####### end public dataset testing #########


###################  Available commands #########################
#### The available commands are: 
#### list_objects_v1 #list_objects commands will append to db lists, not replace
#### list_objects_v2 # There are two versions available from the AWS API. The list_objects
					 # Use the list_objects commands to retrieve a the list of object in S3.
					 # The objects information are saved in the initial_objects and 
					 # objects_to_dl sets in the db
#### get_single_object # This will not change the db. Downloads a single S3 object. 
					   # Uses object_key and bucket. 
#### get_objects ## Download objects in "objects_to_dl" in MapDB only. 
				#   Can use read_db_start to start at a certain point in the db set.
				#   Can use "pattern" to match a pattern for which objects to dl.
#### get_all_names # Lists all lists/sets/tables in the MapDB db
#### delete_fix_db # Sometimes the MapDB gets corrupted so you can delete and fix it using this command.
#### read_all_sets # Read (print) all sets in the db. Read specific sets below.
#### read_initial_objects 
#### read_objects_to_dl 
#### read_dled_objects
#### clear_all_sets # Clear all db sets. Clear specific sets below.
#### clear_initial_objects
#### clear_objects_to_dl
#### clear_dled_objects
#### list_buckets # this command only works with non-public datasets :
################  -"Anonymous requests cannot list buckets, " 
################  -https://docs.aws.amazon.com/AmazonS3/latest/API/RESTServiceGET.html
########################## Available Commands end ###################

#################### Command examples #####################
#### prop_commands stands for properties file commands.
#### Can run one or more than one command sequentially. Separate them by comma.
# Download objects left to download in the db
#prop_commands=get_objects
# Print all sets in the db
#prop_commands=read_all_sets 
#prop_commands=get_all_names
# Clear the db and confirm by printing the sets to the terminal
#prop_commands=clear_all_sets, read_all_sets
# Print sets, download objects, and confirm all the objects have moved from the objects_to_dl set to the downloaded set
#prop_commands=read_all_sets,get_objects,read_all_sets
# Clear sets, confirm clear, fill sets, print sets, download objects, print sets
#prop_commands= clear_all_sets,read_all_sets, list_objects_v1, read_all_sets, get_objects, read_all_sets
# Fix db corruption
#prop_commands=delete_fix_db
# Clear sets, confirm clear, fill sets, print sets, download objects, print sets
#prop_commands= clear_all_sets,read_all_sets, list_objects_v2, read_all_sets, get_objects, read_all_sets
# Download objects, and confirm all the objects have moved from the objects_to_dl set to the downloaded set
#prop_commands = get_objects, read_all_sets
# Download a single object from S3 and print all sets in the db
#prop_commands=get_single_object,read_all_sets
# List account's buckets
#prop_commands=list_buckets
# clear db, get list of objects and save them to db, print db
#prop_commands=clear_all_sets, list_objects_v1, read_all_sets
# clear db, get list of objects and save them to db, print db
prop_commands=clear_all_sets,list_objects_v2,read_all_sets
# get list of objects and save them to db, print db, download objects
#prop_commands= list_objects_v1, read_all_sets, get_objects
######################### Command examples end ###################



#### Name of db to store data. Will create a file locally with this name.
objects_db=objectsDb.db

###CliOptions.java has the following defaults for the DB set names. Can uncomment bellow and change if wanted.
#initial_objects_name=initial_objects
#objects_to_dl_name=objects_to_dl
#dled_objects_name=dled_objects




###s3 key used only in get single object(get_single_object command). Used in combination with "bucket" parameter above.
#object_key=b/9.txt
object_key=bin-images/00291.jpg
#### "pattern" is used to decide whether bucket/path combination json will be added to the list of objects saved in the db
#### when running listobjectsv1 or listobjectsv2 .
#### Uses this pattern matcher: 
#### http://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/util/AntPathMatcher.html
#### This pattern will be trimmed.
#pattern=**/*1.jpg

#### The maximum number of keys returned in the response body. Used in Listobjectv1 and v2.
#### sent to aws in http request. Default api max keys is 1000
max_keys=10



#### Use delimeter to limit the results to only within the prefix directory. Delimeter and prefix
####   are used in list_objects_v1 and list_objects_v2.
#delimiter=/
#prefix=bin-images/0001
#### When delimiter is used with prefix, it will only search one level deep.
#### For example, without delimiter , and prefix=l/ , this is returned: " adding to db: mybucket/l/3/2.txt".  
#### Only full keys are saved to the db with prefix and no delimiter.
#### With delimiter in addition to prefix, this is returned:  adding to db: adding to db: mybucket/l/3/ . 
#### Partial key paths up to the first occurrence of delimiter after prefix are saved to the db when you 
#### use both delimiter and prefix.
##### ONLY DELIMITER: returns all results up to that delimiter from the start of the bucket
##### ONLY PREFIX: only returns keys with that prefix
##### BOTH: only return objects with the prefix and up to the first occurrence of the delimiter after the prefix

#### Number of verticles to call before each thread.sleep. This is only used in get_objects command.
num_verticles=50

#### Only used in get_objects command. Amount of milliseconds to sleep every num_verticles.
sleep_time=20000


#### max_requests_list_objects is used by list_objects_v1, list_objects_v2
####  It is checked before continuing after a truncated request.
max_requests_list_objects=2
#### Used by get_objects command, checked before creating a get_object verticle. Will run one more than value.
max_requests_get_objects=10

#### list_objects_v2 can use continuation-token or start-after parameter in the request. 
#### list_objects_v1 only has "marker".
#### Used in list_objects_v2 and list_objects_v1. Key (objectkey in db) to start returning results after. AWS api param. 
#### Passed to AWS API as "start-after" param in list_objects_v2 and "marker" param in list_objects_v1.
#### If continuation token is present in the request this parameter is ignored is list_objects_v2. 
#### Does not include the value.
#start_after=bin-images/00271.jpg
#start_after=a/1/j.txt

#### read_db_start is used by db reader to output data in db after the value.
#### It is also used in get_objects to read from the db the objects after the value,
#### and then only grab those from S3. It does not include the value.
# For example: read_db_start={"bucket":"mybucket","path":"mybucket/b/8.txt","objectkey":"b/8.txt"}
#read_db_start={"bucket":"mybucket","path":"mybucket/b/8.txt","objectkey":"b/8.txt"}