Config Documentation - GalSim-developers/GalSim GitHub Wiki

This documentation is for the GalSim 2.x release line. (Features that are new in some version will be indicated as such.)

Overview
Top Level Fields

psf
gal
stamp
image
input
output

Special Fields

modules
eval_variables
template

The galsim Executable

Changing or adding parameters
Splitting up a config job
Other command line options

Overview

The basic configuration method is to use a dictionary which can be parsed in Python. Within that structure, each field can either be a value, another dictionary which is then further parsed, or occasionally a list of items (which can be either values or dictionaries). The hierarchy can go as deep as necessary.

Our example config files are all yaml files, which are read using the executable galsim. This is a nice format for config files, but it is not required. Anything that can represent a dictionary will do. For example, the executable galsim also reads in and processes json-style config files if you prefer.

If you would like a tutorial that goes through typical uses of the config files, there is a series of demo config files in the GalSim examples directory. See Tutorials for more information. This documentation is meant to be more of a reference once you already have the basic idea of how the config files generally work.

For a concrete example of what a config file looks like, here is demo1.yaml (the first file in the aforementioned tutorial) stripped of most of the comments to make it easier to see the essence of the structure:

gal :
    type : Gaussian
    sigma : 2  # arcsec
    flux : 1.e5  # total counts in all pixels

psf :
    type : Gaussian
    sigma : 1  # arcsec

image :
    pixel_scale : 0.2  # arcsec / pixel
    noise :
        type : Gaussian
        sigma : 30  # standard deviation of the counts in each pixel

output :
    dir : output_yaml
    file_name : demo1.fits

This file defines a dictionary, which in python would look like

config = {
    'gal' : {
        'type' : 'Gaussian',
        'sigma' : 2.,
        'flux' : 1.e5
    },
    'psf' : {
        'type' : 'Gaussian',
        'sigma' : 1.
    },
    'image' : {
        'pixel_scale' : 0.2,
        'noise' : {
            'type' : 'Gaussian',
            'sigma' : 30.
        }
    },
    'output' : {
        'dir' : 'output_yaml',
        'file_name' : 'demo1.fits'
    }
}

As you can see, there are several top level fields (gal, psf, image, and output) that define various aspects of the simulation. There are others as well that we will describe below, but most simulations will want to include at least these four.
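Because anything that can represent a dictionary will do, the same structure could equally be written as JSON. The following is a minimal sketch using only Python's standard json module. The JSON text is an illustrative transcription of the YAML above, written for this example, and the variable name demo1_json is an assumption.

```python
import json

# Illustrative JSON transcription of the demo1 configuration shown above.
demo1_json = """
{
    "gal":    {"type": "Gaussian", "sigma": 2, "flux": 1.0e5},
    "psf":    {"type": "Gaussian", "sigma": 1},
    "image":  {"pixel_scale": 0.2,
               "noise": {"type": "Gaussian", "sigma": 30}},
    "output": {"dir": "output_yaml", "file_name": "demo1.fits"}
}
"""

# Parsing the text yields an ordinary nested Python dictionary.
config = json.loads(demo1_json)

# Nested fields are reached by ordinary dictionary indexing.
print(sorted(config))
print(config["image"]["noise"]["sigma"])
```

Whichever file format is used, the result after parsing is the same kind of nested dictionary, which is why the rest of this page can describe fields without reference to YAML or JSON.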

Most fields have a type item that defines what the other items in the field mean. (The image and output fields here have implicit types Single and Fits, which are the default, so may be omitted.) For instance, a Gaussian surface brightness profile is defined by the parameters sigma and flux.

Most types have some optional items that take reasonable defaults if you omit them. E.g. the flux is not relevant for a PSF, so it may be omitted in the psf field, in which case the default of flux=1 is used.

Top Level Fields

At the top level, there are 6 basic fields:

psf defines what kind of PSF profile to use.
gal defines what kind of galaxy profile to use.
stamp defines parameters related to building the postage stamp image of each object.
image defines parameters related to the full images to be drawn.
input defines any necessary input files or things that need some kind of initialization.
output defines the names and format of the output files.

None of these are technically required, although it is an error to have neither psf nor gal. (If you don't want to draw anything but noise, you need to let GalSim know that this is intentional by using type: None for one of these.) But the most common usage would be to use psf, gal, image and output. It is not uncommon for there to be no input files, so you will often omit the input field. And sometimes you will omit the gal field to draw an image with just stars. Most simulations will use the default stamp type (called 'Basic'), which involves drawing a galaxy convolved by a PSF (or just a PSF image if gal is omitted) on each postage stamp, so this field will very often be omitted as well.

We will go through each one in turn. As we do, some values will be called float_value, int_value, etc. These can either be a value directly (e.g. float_value could just be 1.5), or they can be a dict that describes how the value should be generated each time (e.g. a random number or a value read from an input catalog). See Config Values for more information about how to specify these values.
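To make the two forms of a float_value concrete, here is a toy sketch in plain Python. The function get_value is hypothetical and is not how GalSim evaluates values. It only illustrates the idea that a direct value is used as is, while a dict is a recipe that is evaluated again each time. Treating Random as a uniform draw between min and max is an assumption of this sketch.

```python
import random

def get_value(spec, rng):
    """Toy evaluator: return spec itself, or generate a value from a dict."""
    if not isinstance(spec, dict):
        return spec                      # a direct value such as 1.5
    if spec["type"] == "Random":
        # Assumed here: a uniform draw between min and max on every call.
        return rng.uniform(spec["min"], spec["max"])
    raise ValueError("unsupported value type: %r" % spec["type"])

rng = random.Random(1234)

# A direct value is returned unchanged.
direct = get_value(1.5, rng)

# A dict is evaluated anew each time it is needed.
drawn = [get_value({"type": "Random", "min": 1, "max": 4}, rng) for _ in range(3)]
print(direct, drawn)
```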

In addition, each value will have one of (required) or (optional) or (default = something) to indicate whether the item is required or if there is some sensible default value. The (optional) tag usually means that the action in question will not be done at all, rather than done using some default value. Also, sometimes no item is individually required, but one of several is.

psf

The psf field defines the profile of the point-spread function (PSF). Any object type is allowed for the psf type, although some types are obviously more appropriate to use as a PSF than others. See Config Objects for a list of all the available object types.

If this field is omitted, the PSF will effectively be a delta function. I.e. the ideal galaxy surface brightness profiles will be drawn directly on the image without any convolution.

gal

The gal field defines the profile of the galaxy. As for the psf field, any object type is allowed for the gal type, although some types are obviously more appropriate to use as a galaxy than others. See Config Objects for a list of all the available object types.

Technically, the gal field is not fundamental; its usage is defined by the stamp type. One could for instance define a stamp type that looked for a gal_set field instead that might give a list of galaxies to draw onto a single stamp. However, all of the stamp types defined natively in GalSim use the gal field, so it will be used by most users of the code.

If this field is omitted, the default stamp type = 'Basic' will draw the PSF surface brightness profiles directly according to the psf field. Other stamp types may require this field or may require some other field instead.

stamp

The stamp field defines the relevant properties and parameters of the stamp-building process. See Config Stamp for a list of all the available stamp types.

This field is often omitted, in which case the 'Basic' stamp type will be assumed.

image

The image field defines the relevant properties and parameters of the full image-building process. See Config Image for a list of all the available image types.

If this field is omitted, the 'Single' image type will be assumed.

input

The input field indicates where to find any files that you want to use in building the images or how to set up any objects that require initialization. See Config Input for a list of all the available input types.

This field is only required if you use object types or value types that use an input object. Such types will indicate this requirement in their descriptions.

output

The output field indicates where to write the output files and what kind of output format they should have. See Config Output for a list of all the available output types.

Special Fields

There are a few other top level fields that act more in a support role, rather than being part of the main processing.

modules

Almost all aspects of the file building can be customized by the user if the existing GalSim types do not do precisely what you need. How to do this is described in the pages about each of the different top-level fields. In all cases, you need to tell GalSim what Python modules to load at the start of processing to get the implementations of your custom types. That is what this field is for.

The modules field should contain a list of modules that GalSim should import before processing the rest of the config file. These modules can be either in the current directory where you are running the code or installed in your Python distro. (Or technically, they need to be located in a directory in sys.path.)
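The import step described above can be pictured with a short sketch based on the standard importlib module. The helper import_modules is hypothetical and is not GalSim's own implementation. Standard-library module names stand in for user modules so that the example runs anywhere.

```python
import importlib

def import_modules(config):
    """Import every module named in config['modules'], if the field exists."""
    loaded = []
    for name in config.get("modules", []):
        # import_module searches the directories listed in sys.path.
        loaded.append(importlib.import_module(name))
    return loaded

# Standard-library names stand in for user modules in this example.
config = {"modules": ["json", "math"]}
loaded = import_modules(config)
print([m.__name__ for m in loaded])
```

A module that is not found in any directory on sys.path would raise ImportError here, which is the practical meaning of the location requirement above.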

See examples/des/meds.yaml, examples/des/blendset.yaml, and examples/great3/cgc.yaml for some examples of this field.

eval_variables

Sometimes it is useful to define some configuration parameters right at the top of the config file, where they are easy to see, even though they are only used farther down in the file. Or sometimes there are calculations that are needed by several different values in the config file, which you only want to calculate once.

You can put such values in a top-level eval_variables field. They work just like variables that you define for 'Eval' items, but they can be placed separately from those evaluations.

See examples/demo11.yaml, examples/des/draw_psf.yaml, and examples/great3/cgc.yaml for examples of this field.

template

This feature directs the config processing to first load in some other file (or a specific field within that file) and then possibly modify some components of that dict.

To load in some other config file named config.yaml, you would write

template: config.yaml

If you only want to load a particular field from that file, say the image field, you could write

template: config.yaml:image

The template field may appear anywhere in the config file. Wherever it appears, the contents of the other file will be a starting point for that part of the current config dict, but you can replace or add values by specifying new values for some of the fields. Fields that are not at the top level are specified using a . to proceed down the levels of the config hierarchy. e.g. image.noise.sky_level refers to config['image']['noise']['sky_level'].
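The dotted notation can be read as a path through the nested dictionary. The sketch below shows one way such a path could be followed in plain Python. The helper get_by_dotted_key and the sample values are hypothetical. Numeric parts are treated as list indices, as in items.0.

```python
def get_by_dotted_key(config, key):
    """Follow a dotted key such as 'image.noise.sky_level' through dicts and lists."""
    node = config
    for part in key.split("."):
        if isinstance(node, list):
            node = node[int(part)]   # e.g. the '0' in 'items.0'
        else:
            node = node[part]
    return node

# Made-up values, only to have something to look up.
config = {
    "image": {"noise": {"sky_level": 100.0}},
    "gal": {"type": "Sum",
            "items": [{"type": "DeVaucouleurs"}, {"type": "Exponential"}]},
}

sky = get_by_dotted_key(config, "image.noise.sky_level")
second = get_by_dotted_key(config, "gal.items.1.type")
print(sky, second)
```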

For example, if you have a simulation defined in my_sim.yaml, and you want to make another simulation that is identical, except you want Sersic galaxies instead of Exponential galaxies say, you could write a new file that looks something like this

template : my_sim.yaml
gal: 
    type : Sersic
    n : { type : Random, min : 1, max: 4 }
    half_light_radius : 
        template : my_sim.yaml:gal.half_light_radius
    flux : 1000
output.dir : sersic_sim

This will load in the file my_sim.yaml first, then replace the whole config['gal'] field as well as config['output']['dir'] (leaving the rest of config['output'] unchanged). The new config['gal'] field will use the same half_light_radius specification from the other file (which might be some complicated random variate that you did not want to duplicate here).
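The replace-or-add behavior just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GalSim's template code: apply_overrides is a hypothetical helper, the contents of my_sim are made up, list indices are not handled, and the nested template for half_light_radius is replaced by a literal value to keep the sketch short.

```python
import copy

def apply_overrides(base, overrides):
    """Return a copy of base with each (possibly dotted) key in overrides replaced."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if key == "template":
            continue                      # names the source; not a field to copy
        *parents, last = key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[last] = value                # replaces the whole field at that level
    return result

# Stand-in for the contents of my_sim.yaml (values are made up for this example).
my_sim = {
    "gal": {"type": "Exponential", "half_light_radius": 1.2, "flux": 500},
    "output": {"dir": "my_sim", "file_name": "sim.fits"},
}

# Stand-in for the new file: a whole-field replacement plus one dotted key.
new_fields = {
    "template": "my_sim.yaml",
    "gal": {"type": "Sersic", "n": 2.5, "half_light_radius": 1.2, "flux": 1000},
    "output.dir": "sersic_sim",
}

new_sim = apply_overrides(my_sim, new_fields)
print(new_sim["gal"]["type"], new_sim["output"])
```

The point to notice is the difference between the two kinds of key: gal replaces the entire field, while output.dir changes one item and leaves output.file_name as it was in the template.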

If the template field is not at the top level of the config dict, the adjustments should be made relative to that level of the dictionary:

psf :
    template: cgc.yaml:psf
    index_key : obj_num
    items.0.ellip.e.max : 0.05
    items.1.nstruts : 1
    items.1.strut_angle : { type : Random }

Note that the modifications do not start with psf., since the template processing is being done within the psf field.

Finally, if you want to use a different field from the current config dict as a template, you can use the colon notation without the file. E.g. to have a bulge plus disk that have the same kinds of parameters, except that the overall types are DeVaucouleurs and Exponential respectively, you could do:

gal:
    type: Sum
    items:
        -
            type: DeVaucouleurs
            half_light_radius: { type: Random, min: 0.2, max: 0.8 }
            flux: { type: Random, min: 100, max: 1000 }
            ellip:
                type: Eta1Eta2
                eta1: { type: RandomGaussian, sigma: 0.2 }
                eta2: { type: RandomGaussian, sigma: 0.2 }
        -
            template: :gal.items.0
            type: Exponential

This would generate different values for the size, flux, and shape of each component. But the way those numbers are drawn would be the same for each.

See examples/great3/rgc.yaml and examples/great3/cgc_psf.yaml for examples of this feature.

The galsim Executable

The normal way to run a GalSim simulation using a config file is galsim config.yaml, where config.yaml is the name of the config file to be parsed. For instance, to run demo1 (given above), you would type

galsim demo1.yaml

Changing or adding parameters

Sometimes it is convenient to be able to change some of the configuration parameters from the command line, rather than edit the config file. For instance, you might want to make a number of simulations which are nearly identical but differ in one or two specific attributes.

To enable this, you can provide the changed (or new) parameters on the command line after the name of the config file. E.g. to make several simulations that are identical except for the flux of the galaxy and the output file, one could do:

galsim demo1.yaml gal.flux=1.e4 output.file_name=demo1_1e4.fits
galsim demo1.yaml gal.flux=2.e4 output.file_name=demo1_2e4.fits
galsim demo1.yaml gal.flux=3.e4 output.file_name=demo1_3e4.fits
galsim demo1.yaml gal.flux=4.e4 output.file_name=demo1_4e4.fits

Notice that the . is used to separate levels within the config hierarchy. So gal.flux represents config['gal']['flux'].
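As a rough illustration of how a key=value argument maps onto the hierarchy, the sketch below splits each argument into a list of keys and a value. The helper parse_override is hypothetical, and using ast.literal_eval to interpret the value is an assumption of this sketch. The galsim executable may interpret values differently.

```python
import ast

def parse_override(arg):
    """Split 'gal.flux=1.e4' into a list of keys and a Python value."""
    key, _, text = arg.partition("=")
    try:
        value = ast.literal_eval(text)    # numbers, quoted strings, lists, ...
    except (ValueError, SyntaxError):
        value = text                      # anything else is kept as a plain string
    return key.split("."), value

args = ["gal.flux=1.e4", "output.file_name=demo1_1e4.fits"]
parsed = [parse_override(a) for a in args]
for keys, value in parsed:
    # Each list of keys names one step per level of the config hierarchy.
    print(keys, repr(value))
```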

Splitting up a config job

For large simulations, one will typically want to split the job up into multiple smaller jobs, each of which can be run on a single node or core. The natural way to split this up is by parceling some number of output files into each sub-job. We make this splitting very easy using the command line options -n and -j. The total number of jobs you want should be given with -n, and each separate job should be given a different -j. So to divide a run across 5 machines, you would run one of the following commands on each of the 5 different machines (or more typically send these 5 commands as jobs in a queue system).

galsim config.yaml -n 5 -j 1
galsim config.yaml -n 5 -j 2
galsim config.yaml -n 5 -j 3
galsim config.yaml -n 5 -j 4
galsim config.yaml -n 5 -j 5
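To see how output files might be parceled out among jobs, here is a minimal sketch of one simple contiguous split. The function files_for_job is hypothetical, and the exact assignment of files to jobs used by GalSim may differ. The sketch only shows that every file ends up in exactly one job.

```python
def files_for_job(nfiles, njobs, job):
    """Return the range of file numbers handled by job (numbered 1..njobs)."""
    if not 1 <= job <= njobs:
        raise ValueError("job must be in [1, njobs]")
    # Integer arithmetic spreads any remainder across the jobs.
    start = nfiles * (job - 1) // njobs
    end = nfiles * job // njobs
    return range(start, end)

nfiles, njobs = 12, 5
chunks = [list(files_for_job(nfiles, njobs, j)) for j in range(1, njobs + 1)]
for j, chunk in enumerate(chunks, start=1):
    print("job", j, "builds files", chunk)
```

With this scheme the job sizes differ by at most one file, so no single machine is left with a much larger share of the work.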

Other command line options

There are a few other command line options that we describe here for completeness.

-h or --help gives the help message. This is really the definitive information about the galsim executable, so if that message disagrees with anything here, you should trust that information over what is written here.
-v {0,1,2,3} or --verbosity {0,1,2,3} sets how verbose the logging output should be. The default is -v 1, which provides some modest amount of output about each file being built. -v 2 gives more information about the progress within each output file, including one line of information about each object that is drawn. -v 3 (debug mode) gives a lot of output and should be reserved for diagnosing runtime problems. -v 0 turns off all logging output except for error messages.
-l LOG_FILE or --log_file LOG_FILE gives a file name for writing the logging output. If omitted, the default is to write to stdout.
-f {yaml,json} or --file_type {yaml,json} defines what type of configuration file to parse. The default is to determine this from the file name extension, so it is not normally needed, but if you have non-standard file names, you might need to set this.
-m MODULE or --module MODULE gives a Python module to import before parsing the config file. This has been superseded by the modules top level field, which is normally more convenient. However, this option is still allowed for backwards compatibility.
-p or --profile turns on profiling information that gets output at the end of the run (or when multi-processing, at the end of execution of a process). This can be useful for diagnosing where a simulation is spending most of its computation time.
-n NJOBS or --njobs NJOBS sets the total number of jobs that this run is a part of. Used in conjunction with -j (--job).
-j JOB or --job JOB sets the job number for this particular run. Must be in [1,njobs]. Used in conjunction with -n (--njobs).
-x or --except_abort aborts the whole job whenever any file raises an exception rather than continuing on. (new in version 1.5)
--version shows the version of GalSim.
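For readers who think in terms of argparse, the options listed above could be mirrored roughly as follows. This is a sketch of the documented interface only, not the actual galsim script: the program name galsim-sketch, the positional argument names, and the defaults of 1 for njobs and job are assumptions. As noted above, galsim --help remains the definitive reference.

```python
import argparse

def build_parser():
    """Build a parser that mirrors the options documented above (sketch only)."""
    parser = argparse.ArgumentParser(prog="galsim-sketch")
    parser.add_argument("config_file")
    parser.add_argument("variables", nargs="*", help="key=value overrides")
    parser.add_argument("-v", "--verbosity", type=int, choices=[0, 1, 2, 3], default=1)
    parser.add_argument("-l", "--log_file", default=None)   # None: log to stdout
    parser.add_argument("-f", "--file_type", choices=["yaml", "json"], default=None)
    parser.add_argument("-m", "--module", action="append", default=[])
    parser.add_argument("-p", "--profile", action="store_true")
    parser.add_argument("-n", "--njobs", type=int, default=1)
    parser.add_argument("-j", "--job", type=int, default=1)
    parser.add_argument("-x", "--except_abort", action="store_true")
    parser.add_argument("--version", action="version", version="sketch")
    return parser

# Overrides are placed directly after the config file name, as in the examples above.
args = build_parser().parse_args(
    ["config.yaml", "gal.flux=1.e4", "-n", "5", "-j", "2", "-v", "2"])

# The job number must lie in [1, njobs].
if not 1 <= args.job <= args.njobs:
    raise SystemExit("job must be in [1, njobs]")

print(args.config_file, args.variables, args.njobs, args.job, args.verbosity)
```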