FileNames - EranOfek/AstroPack GitHub Wiki
Class Hierarchy: Base -> Component -> FileNames
FileNames is a container class for image and astronomical data products file names. This class supports the LAST/ULTRASAT file name convention described in Ofek et al. (2023). The idea is to use the same file name convention for all file types.
The file names are unique and are built by concatenating several strings separated by underscores ("_"). The rationale for this convention is that all file names share the same structure and can be parsed by a single routine that uses simple regular expressions.
The file name format is therefore:
<ProjName>_YYYYMMDD.HHMMSS.FFF_<filter>_<FieldID>_<counter>_<CCDID>_<CropID>_<type>_<level>.<sublevel>_<product>_<version>.<FileType>
For example: USAT_20210909.123456.789_clear_M31_001_2_12_sci_raw_Image_ver1.fits
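As a rough illustration of the format, the following MATLAB sketch assembles the example name from its parts using only built-in functions. It does not call the FileNames class, and the variable names (Parts, FileType, Name) are chosen here for illustration only.

```matlab
% Illustrative sketch only: assemble a file name by joining the
% sub-strings with underscores. This does not call the FileNames class.
Parts = {'USAT', ...                  % <ProjName>
         '20210909.123456.789', ...   % YYYYMMDD.HHMMSS.FFF
         'clear', ...                 % <filter>
         'M31', ...                   % <FieldID>
         '001', ...                   % <counter>
         '2', ...                     % <CCDID>
         '12', ...                    % <CropID>
         'sci', ...                   % <type>
         'raw', ...                   % <level>
         'Image', ...                 % <product>
         'ver1'};                     % <version>
FileType = 'fits';

% strjoin inserts the delimiter between consecutive elements,
% so an empty element would produce two successive underscores
Name = [strjoin(Parts, '_'), '.', FileType];
disp(Name)
```

Replacing any element of Parts with an empty string ('') leaves its position in the name intact, which keeps the field order fixed.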
If a sub-string is not relevant to the file, it appears as an empty string. In this case, two (or more) successive underscores (e.g., "__") appear in the file name. The strings, in order of appearance in the file name, are:
- Project Name - Project/telescope name. For LAST we use LAST.<Node>.<Mount>.<Camera>, where Node is the node index (1 for the first node in Neot-Smadar, Israel), Mount is the telescope-mount index (e.g., 1 to 12), and Camera is the index of the camera on the mount (e.g., 1 to 4).
- Time - UTC date and time in format YYYYMMDD.HHMMSS.FFF.
- Filter - Filter name (e.g., 'clear').
- Field - The field ID. For LAST the field ID is a string of the format ddd+dd, indicating the RA and Dec in decimal degrees.
- Counter - Image counter. For LAST the image counter is usually 1 to 20, indicating the index of the image in the sequence of 20 exposures in a visit.
- CCDID - Detector ID. For LAST this is always 1.
- CropID - Index of the sub-image (1 to 24). 0 or an empty string is reserved for the full image.
- Type - One of the following (self-explanatory) image types: bias, dark, flat, domeflat, twflat, skyflat, fringe, focus, sci, wave, test.
- Level - One of the following strings describing the level of processing:
  - raw - A raw image.
  - proc - A single processed image.
  - coadd - The coadd image of the visit or any other sequence of images.
  - ref - A reference image.
  - calib - A calibration image.
- Product - One of the following keywords describing the file content:
  - Image - Image data.
  - Back - Background image.
  - Var - Variance image (typically background variance only).
  - Mask - A bit mask image.
  - PSF - A PSF image, or cube.
  - Cat - A catalog of sources detected in the image.
  - MergedMat - Merged matrices of sources detected in multiple epochs of the same field.
  - Asteroid - A file containing information on asteroids detected in the image.
  - Evt - A photon event file.
  - Spec - A spectrum.
- Version - Processing version.
File type extensions are typically fits, hdf5, or mat.
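Because every name has the same structure, it can be taken apart with a few built-in calls. The sketch below is a minimal illustration and is not the parser used by the FileNames class; the struct S and the list FieldNames are introduced here for illustration. The optional `.<sublevel>` part of the level is left unsplit.

```matlab
% Illustrative sketch only: split a file name into its sub-strings
% using built-in functions (not the FileNames class itself).
Name = 'LAST.01.03.02_20230626.171715.051_clear_____twflat_proc_Image_1.fits';

% Separate the file-type extension (text after the last dot)
[~, Base, Ext] = fileparts(Name);

% strsplit collapses repeated delimiters by default; disable that
% so the empty fields between successive underscores are kept
Parts = strsplit(Base, '_', 'CollapseDelimiters', false);

% Field names in their order of appearance in the file name
FieldNames = {'ProjName','Time','Filter','FieldID','Counter','CCDID', ...
              'CropID','Type','Level','Product','Version'};

% Build a scalar struct with one field per sub-string
S = cell2struct(Parts(:), FieldNames(:), 1);
S.FileType = Ext(2:end);   % drop the leading dot

disp(S)
```

In this example FieldID, Counter, CCDID, and CropID come out as empty strings, matching the four empty positions between "clear" and "twflat".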
The FileNames class is also responsible for generating the names of the directories in which the files are stored. The file path takes one of the following forms:
<BasePath>/new
<BasePath>/calib
<BasePath>/failed
<BasePath>/<YYYY>/<MM>/<DD>/raw
<BasePath>/<YYYY>/<MM>/<DD>/proc/
<BasePath>/<YYYY>/<MM>/<DD>/proc/<SubDir>
Here <SubDir> is typically a numerical index. It is used in LAST because the number of processed files from a single visit can be larger than 1500. To reduce the number of files in a directory, the data of each visit are stored in a separate directory.
The date directory changes at local noon rather than at midnight. To determine local noon, the time zone is stored as a property of the FileNames class.
BasePath can include the project name - for example: /last2/data/archive/LAST.01.01.02
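The noon rollover and the path layout can be sketched with built-in date functions. This is a minimal sketch under two assumptions that are not stated in the text above: that an image taken before local noon is filed under the previous calendar date, and that TimeZone counts hours east of UTC. The actual FileNames behavior may differ. The values of UTC and SubDir are hypothetical.

```matlab
% Illustrative sketch only. ASSUMPTIONS: an image taken before local noon
% is filed under the previous calendar date, and TimeZone is in hours
% east of UTC. The actual FileNames behavior may differ.
UTC      = datetime(2023, 6, 27, 3, 0, 0);     % hypothetical image time (UTC)
TimeZone = 2;                                  % [hr]
BasePath = '/last2/data/archive/LAST.01.01.02';
SubDir   = '1';                                % hypothetical visit index

% Shift to local time, then back by 12 hours so the date changes at noon
Shifted = UTC + hours(TimeZone) - hours(12);
DateDir = sprintf('%04d/%02d/%02d', year(Shifted), month(Shifted), day(Shifted));

% Path layout: <BasePath>/<YYYY>/<MM>/<DD>/proc/<SubDir>
Path = strjoin({BasePath, DateDir, 'proc', SubDir}, '/');
disp(Path)
```

Under these assumptions, an image taken at 03:00 UTC on 27 June (05:00 local time) is stored under the 2023/06/26 directory, together with the rest of that night.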
Multiple image names can be stored in a single element of the object. In this case, each property is a vector or a cell array of strings. Note that common values (e.g., the project name) can be given as scalars. FileNames can also be an array of objects, which is useful in some cases (e.g., for dividing images into groups by some properties).
The FileNames class contains the following properties:
- ProjName
- Time - Either a vector of JD (one per image), or a cell array of dates.
- Filter
- FieldID
- Counter
- CCDID
- CropID
- Type
- Level
- Product
- Version
- FileType
- FullPath - If provided (not empty), full path will override the automatic path generation.
- BasePath - Base path.
- BasePathIncludeProjName - A logical indicating if the BasePath includes the project name.
- SubDir
- TimeZone - Default is 2 [hr].
The generateFromFileName method can be used in two modes:
To generate a FileNames object of files that exist (with wild cards) in the current directory, you can use
FN=FileNames.generateFromFileName('LAST*.fits');
However, if you want to run it on a list of file names that do not exist (no wild cards are allowed), then you have to use a cell array input:
FN=FileNames.generateFromFileName({'LAST.01.03.02_20230626.171715.051_clear_____twflat_proc_Image_1.fits'});
Given a file name, you can get the value of one of the sub-strings using:
% Get the value of ProjName
FileNames.getValFromFileName('LAST.01.02.01_20221229.212126.937_clear_050+09_050_001_001_sci_raw_Image_1.fits','ProjName');
The following methods generate file names and paths:
- genFile - Generate file names.
- genPath - Generate file paths.
- genFull - Generate full paths.
- nextSubDir - Find the next numerical SubDir, by increasing the largest SubDir number by 1.
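The rule described for nextSubDir (largest existing number plus one) can be illustrated with built-in functions. This is a sketch of the idea only, not the nextSubDir implementation; the directory names in Existing are hypothetical, and starting at 1 when no numbered directory exists is an assumption.

```matlab
% Illustrative sketch only (not the nextSubDir implementation):
% given the names of existing sub-directories, pick the next numeric one.
Existing = {'1', '2', '7', 'tmp'};   % hypothetical directory names

Num = str2double(Existing);          % non-numeric names become NaN
Num = Num(~isnan(Num));              % keep only the numeric names

if isempty(Num)
    Next = 1;                        % ASSUMPTION: start counting at 1
else
    Next = max(Num) + 1;             % largest existing index plus one
end
NextSubDir = sprintf('%d', Next);
disp(NextSubDir)
```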
FN = FileNames.generateFromFileName('LAST*.fits');
% The output is a cell array of file names:
List = genFile(FN)
% the same as before, but put 'proc' in the Level field:
List = genFile(FN,[], 'Level','proc');
% The second input argument can be used for selected file indices - e.g., return the 3rd file name:
List = genFile(FN,3, 'Level','proc');
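% Generate the directory paths for all files:
Dir = genPath(FN)
% Path of the first file, without appending SubDir:
Dir = genPath(FN, 1, 'AddSubDir',false)
% Use a different BasePath:
Dir = genPath(FN, [], 'BasePath','/home/eran/archive');
% FullPath overrides the automatic path generation:
Dir = genPath(FN, 1, 'FullPath','/home/eran/archive/LAST');
% Full file names (path and file name):
Full = genFull(FN);
Full = genFull(FN, 2, 'Product','Mask', 'FullPath','/home/eran/archive/LAST');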
The following utility methods are available:
For validation:
- validateType
- validateLevel
- validateProduct
- validate
For time and JD:
- julday - get JD
- jd2str - convert JD to string
- validTimes - Return a vector of logicals indicating whether each Time entry is valid.
Read from and write to header:
- readFromHeader - Attempt to read properties from AstroImage image header.
- writeToHeader - Write properties to AstroImage image header.
- updateForAstroImage - Update a FileNames object using the headers of AstroImage.
Sort and select:
- reorderEntries - Reorder/select all the entries in FileNames object.
- sortByJD - Sort entries in FileNames object by JD.
Path related:
- getDateDir - Return date directory name from file name properties
Selection and grouping:
- getProp - get property (all or single by index).
- sunAlt - Calculate Sun Altitude for images in FileNames object.
- selectBy - Select entries whose property value equals some value or lies in some range.
- groupByCounter - Group entries according to running counter groups.
- groupByTimeGaps - Group FileNames images by groups separated by some time gaps.
- selectLastJD - Return index of image with largest JD.
- findFirstLast - find image, of some product type, with latest/earliest JD
General tools:
- nfiles - Return number of files in a FileNames object.
- moveImages - move/delete images specified by FileNames object.
- updateIfNotEmpty - Update FileNames properties if provided and not empty.
FN = FileNames.generateFromFileName('LAST*.fits');
% Get vector of JD of all images
getProp(FN, 'Time')
% Get the JD for the 3rd image
getProp(FN, 'Time',3)
% get the product
getProp(FN, 'Product')
% Select FileNames with Product=Image and do not create a new object:
NFN = selectBy(FN, 'Product','Image', 'CreateNewObj',false)
% Select all images for which Level is not equal to raw (create a new object):
NFN = selectBy(FN, 'Level','raw', 'SelectNotVal',true)
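The idea behind grouping by time gaps (see groupByTimeGaps in the list above) can be illustrated on a plain vector of JD values. This is a sketch of the general technique, not the groupByTimeGaps implementation; the JD values and the 120 s threshold are hypothetical.

```matlab
% Illustrative sketch only (not the groupByTimeGaps implementation):
% assign a group index to each image, starting a new group whenever
% the gap to the previous image exceeds a threshold.
JD     = 2460122 + [0, 20, 40, 600, 620, 2000] ./ 86400;  % hypothetical times [JD]
MaxGap = 120 ./ 86400;                                     % hypothetical threshold: 120 s in days

JD = sort(JD);                          % gaps are defined on time-ordered images
NewGroup = [true, diff(JD) > MaxGap];   % the first image always opens a group
GroupInd = cumsum(NewGroup);            % running group index per image
disp(GroupInd)
```

Here the first three images fall in group 1, the next two in group 2, and the last image forms group 3.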