Filter_EN - CCSEPBVR/CS-IS-PBVR GitHub Wiki

Filter Programs

The filter program partitions volume data stored on storage media so that particle generation can be processed in parallel. It is a preprocessing program that runs independently of PBVR. The region partitioning model uses an octree structure, and after filtering, the divided sub-volume data is output as files.

How to start

To start the filter, specify the name of a parameter file immediately after the execution command (filter). The parameters described in that file are read and the filter is executed. If the parameter file name is omitted, or if the named file does not exist, an error occurs and the filter does not start.

(How to start with MPI + OpenMP; replace N with the number of MPI processes)
$ mpiexec -n N filter param.txt

(How to start with OpenMP only)
$ filter param.txt

In both of the above cases, the number of OpenMP threads is specified by the environment variable OMP_NUM_THREADS.
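The two launch modes above can be wrapped in a small helper. The following is a minimal sketch, assuming `filter` and `mpiexec` are on the PATH; the function name `build_filter_command` and its arguments are hypothetical and not part of the filter itself.

```python
import os


def build_filter_command(param_file, n_procs=None, omp_threads=1):
    """Return (argv, env) for launching the filter.

    n_procs=None selects the OpenMP-only form; otherwise the
    MPI + OpenMP form via mpiexec is built.
    """
    env = dict(os.environ)
    # The thread count is passed through OMP_NUM_THREADS in both modes.
    env["OMP_NUM_THREADS"] = str(omp_threads)
    argv = ["filter", param_file]
    if n_procs is not None:
        argv = ["mpiexec", "-n", str(n_procs)] + argv
    return argv, env


argv, env = build_filter_command("param.txt", n_procs=4, omp_threads=8)
print(" ".join(argv))  # mpiexec -n 4 filter param.txt
# To actually launch: subprocess.run(argv, env=env, check=True)
```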

Starting the filter for VTK data

To start the VTK filter, set the environment variables for your platform as described below.

  • Linux
    Set the environment variables as follows:
export LD_LIBRARY_PATH=${VTK_LIB_PATH}:$LD_LIBRARY_PATH
  • Mac
    Set the environment variables as follows:
export DYLD_LIBRARY_PATH=${VTK_LIB_PATH}:$DYLD_LIBRARY_PATH
  • Windows
    Go to [Control Panel]->[System]->[Advanced]-> [Environment Variables] and set the following environment variables.
| Variable | Value |
|---|---|
| Path | C:\Program Files\VTK 6.3.0\bin |

The value specifies the bin directory under the VTK installation directory.

File Format

The input and output file formats of the filter are described below. All binary output files are single-precision, have no headers or footers, and are little-endian. Three output file formats are available: the SPLIT format (also known as the KVSML format), the subvolume aggregation format, and the step aggregation format, as shown in the figure below. The SPLIT format generates a separate file for every step and subvolume. With this method, however, the number of files grows explosively as the depth of the octree increases. To avoid this, the subvolume aggregation format aggregates files along the time direction for each subvolume, and the step aggregation format aggregates files along the spatial direction for each step. Each file format is described in detail below.

[Figure: workload]

Input Data Format

The input data that can be processed by the filter program is as follows.

  1. AVSFLD binary data *1
  2. AVSUCD ASCII and binary data *1
  3. STL binary data *2
  4. PLOT3D binary data *3
  5. VTK Legacy binary data *4

*1. For more information on the AVS data format, refer to the website. For AVSUCD, only the data format is supported; the geom and data_geom formats are not supported. 2D and 3D element types are supported, and mixed elements can also be used.
*2. For details on the STL data format, refer to the website.
*3. For details on the PLOT3D data format, refer to the website, etc.
*4. For details on the VTK data format, refer to the website, etc. VTK Structured Points, VTK Structured Grid, VTK Rectilinear Grid, VTK Unstructured Grid, and VTK Polygon Data are supported.

Endian

The binary files handled by the filter program are unified in little-endian. On a little-endian machine, no special processing is required for endianness, but on a big-endian machine, it is necessary to output the visualization data in little-endian or convert it to little-endian before executing the filter program.
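As an illustration of the conversion step, the sketch below reverses the byte order of a buffer made up entirely of 4-byte values (single-precision floats or 4-byte integers). That layout is an assumption; real input files may mix record sizes, in which case each field has to be swapped according to its own width. The function name `swap_4byte_words` is hypothetical.

```python
import struct


def swap_4byte_words(data: bytes) -> bytes:
    """Reverse the byte order of every 4-byte word in data."""
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


big = struct.pack(">2f", 1.5, -2.0)      # big-endian input
little = swap_4byte_words(big)           # little-endian output
print(struct.unpack("<2f", little))      # (1.5, -2.0)
```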

Filter Output Information File (.pfi)

The filter program outputs a pfi file that describes the sub-volume's metadata. The pfi file is in binary format and consists of the following information.

  • Total number of nodes (int)
  • Total number of elements (int)
  • Element type (int) *1
  • File type (int) *2
  • Number of files (int) *3
  • Number of components (int)
  • Start step (int)
  • End step (int)
  • Number of sub-volumes (int) *4
  • Minimum value on the X-axis in the overall 3D space (float)
  • Minimum value on the Y-axis in the overall 3D space (float)
  • Minimum value on the Z-axis in the overall 3D space (float)
  • Maximum value on the X-axis in the overall 3D space (float)
  • Maximum value on the Y-axis in the overall 3D space (float)
  • Maximum value on the Z-axis in the overall 3D space (float)
  • Number of nodes in sub-volume 1 (int)
  • Number of nodes in sub-volume 2 (int)
  • Number of nodes in sub-volume 3 (int)
  • Number of nodes in sub-volume n (int)
  • Number of elements in sub-volume 1 (int)
  • Number of elements in sub-volume 2 (int)
  • Number of elements in sub-volume 3 (int)
  • Number of elements in sub-volume n (int)
  • Minimum value on the X-axis in sub-volume 1 (float)
  • Minimum value on the Y-axis in sub-volume 1 (float)
  • Minimum value on the Z-axis in sub-volume 1 (float)
  • Maximum value on the X-axis in sub-volume 1 (float)
  • Maximum value on the Y-axis in sub-volume 1 (float)
  • Maximum value on the Z-axis in sub-volume 1 (float)
  • Minimum value on the X-axis in sub-volume 2 (float)
  • Minimum value on the Y-axis in sub-volume 2 (float)
  • Minimum value on the Z-axis in sub-volume 2 (float)
  • Maximum value on the X-axis in sub-volume 2 (float)
  • Maximum value on the Y-axis in sub-volume 2 (float)
  • Maximum value on the Z-axis in sub-volume 2 (float)
  • Minimum value on the X-axis in sub-volume n (float)
  • Minimum value on the Y-axis in sub-volume n (float)
  • Minimum value on the Z-axis in sub-volume n (float)
  • Maximum value on the X-axis in sub-volume n (float)
  • Maximum value on the Y-axis in sub-volume n (float)
  • Maximum value on the Z-axis in sub-volume n (float)
  • Minimum value of component 1 at step 1
  • Maximum value of component 1 at step 1
  • Minimum value of component 2 at step 1
  • Maximum value of component 2 at step 1
  • Minimum value of component N at step 1
  • Maximum value of component N at step 1
  • Minimum value of component 1 at step m
  • Maximum value of component 1 at step m
  • Minimum value of component 2 at step m
  • Maximum value of component 2 at step m
  • Minimum value of component N at step m
  • Maximum value of component N at step m

*1. For the definition of element types, see here.
*2. The file types are in the following three formats.

  • 0 : SPLIT format
  • 1 : Subvolume aggregation format
  • 2 : Step aggregation format

*3. Indicates the total number of files when the file type is "Subvolume aggregation format".
*4. The number of sub-volumes is $8^\texttt{n\_layer}$.

  • n_layer=0 : 1
  • n_layer=1 : 8
  • n_layer=2 : 64
  • n_layer=3 : 512
  • n_layer=4 : 4,096
  • n_layer=5 : 32,768
  • n_layer=6 : 262,144
  • n_layer=7 : 2,097,152
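The field order listed above can be read with Python's `struct` module. This is a minimal sketch, not the PBVR reader: it assumes 4-byte little-endian `int` and `float` values with no padding, and it stops before the per-step component minimum and maximum values because their type is not stated in the list. The function name `parse_pfi_header` and the dictionary keys are hypothetical.

```python
import struct


def parse_pfi_header(data: bytes) -> dict:
    """Parse the leading part of a .pfi file held in memory."""
    offset = 0

    def read(fmt):
        nonlocal offset
        fmt = "<" + fmt  # little-endian, no alignment padding
        values = struct.unpack_from(fmt, data, offset)
        offset += struct.calcsize(fmt)
        return values

    (n_nodes, n_elements, element_type, file_type, n_files,
     n_components, start_step, end_step, n_subvolumes) = read("9i")
    bounds = read("6f")  # min x, y, z then max x, y, z
    nodes = read(f"{n_subvolumes}i")
    elements = read(f"{n_subvolumes}i")
    sub_bounds = [read("6f") for _ in range(n_subvolumes)]
    return {
        "n_nodes": n_nodes, "n_elements": n_elements,
        "element_type": element_type, "file_type": file_type,
        "n_files": n_files, "n_components": n_components,
        "start_step": start_step, "end_step": end_step,
        "n_subvolumes": n_subvolumes, "bounds": bounds,
        "nodes_per_subvolume": nodes,
        "elements_per_subvolume": elements,
        "subvolume_bounds": sub_bounds,
        # Offset of the component min/max section, left unparsed here.
        "rest_offset": offset,
    }
```

A file can be passed in with `parse_pfi_header(open(path, "rb").read())`, where `path` is the location of the .pfi file.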

SPLIT File Format

The element configuration file and the node coordinate file are divided into subvolume units. KVSML files and component files are divided into subvolume and step units. The total number of files is

Number of subvolumes * 2 + Number of subvolumes * Number of steps * 2

(Example: n_layer=7 and 100 steps: 423,624,704 files)
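The file count for the SPLIT format can be checked with a few lines of Python. This is a small sketch of the formula above; the function name `split_file_count` is hypothetical.

```python
def split_file_count(n_layer: int, n_steps: int) -> int:
    """Total number of files written in the SPLIT format."""
    n_subvolumes = 8 ** n_layer
    per_subvolume = n_subvolumes * 2            # element configuration + node coordinates
    per_step = n_subvolumes * n_steps * 2       # KVSML + component files
    return per_subvolume + per_step


print(split_file_count(7, 100))  # 423624704
```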

File Structure

| File Name | Description |
|---|---|
| prefix_XXXXX_YYYYYYY_ZZZZZZZ.kvsml | KVSML file (ASCII format) |
| prefix_YYYYYYY_ZZZZZZZ_connect.dat | Element configuration file (binary format) |
| prefix_YYYYYYY_ZZZZZZZ_coord.dat | Node coordinate file (binary format) |
| prefix_XXXXX_YYYYYYY_ZZZZZZZ_value.dat | Component file (binary format) |

  • XXXXX : Step number (5-digit number)
  • YYYYYYY : Subvolume number (7-digit number)
  • ZZZZZZZ : Total number of subvolumes (7-digit number)
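The naming pattern can be sketched as follows. This is an illustration only: it assumes the numeric fields are zero-padded decimal numbers and that the prefix is joined with an underscore exactly as in the table; the numbering base of steps and subvolumes is not stated here, so the values passed in are placeholders. The function name `split_file_names` is hypothetical.

```python
def split_file_names(prefix, step, subvolume, total_subvolumes):
    """Build the four SPLIT-format file names for one step and subvolume."""
    sub = f"{subvolume:07d}_{total_subvolumes:07d}"   # YYYYYYY_ZZZZZZZ
    stp = f"{step:05d}"                               # XXXXX
    return {
        "kvsml": f"{prefix}_{stp}_{sub}.kvsml",
        "connect": f"{prefix}_{sub}_connect.dat",
        "coord": f"{prefix}_{sub}_coord.dat",
        "value": f"{prefix}_{stp}_{sub}_value.dat",
    }


names = split_file_names("case0", 1, 1, 8)
print(names["kvsml"])  # case0_00001_0000001_0000008.kvsml
```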

Subvolume Aggregation File Format

The element configuration, node coordinates, and components of all steps are stored per subvolume. Information for multiple subvolumes can also be aggregated into one file if so specified. The total number of files is at most the number of subvolumes. (n_layer=7 : 2,097,152 files)

File Structure

prefix_YYYYYYY_ZZZZZZZ.dat ...... (binary format)

  • prefix : Prefix
  • YYYYYYY : File number (7-digit number)
  • ZZZZZZZ : Total number of files (7-digit number)

Step Aggregation File Format

The element configuration file and the node coordinate file are each a single file (all subvolumes are aggregated into one file), and the component files are divided into step units. The total number of files is the number of steps + 2.

File Structure

| File Name | Description |
|---|---|
| prefix_connect.dat | Element configuration file (binary format) |
| prefix_coord.dat | Node coordinate file (binary format) |
| prefix_XXXXX_value.dat | Component file (binary format) |

  • prefix : Prefix
  • XXXXX : Step number (5-digit number)

Parameter file

The parameter file is an ASCII file whose format is shared by the standard filter (for AVSFLD/UCD, PLOT3D, STL) and the VTK filter. When a file name is given as an argument to the filter, that file is interpreted as the input parameter file and the parameters are set from it.

| Parameter Name | Description | Default Value | Remarks |
|---|---|---|---|
| in_dir | Input file directory | "." | Input file directory path *1 |
| field_file | AVSFLD file name | - | *2, *3, *4 |
| stl_binary_file | STL file name | - | *2 |
| plot3d_config_file | PLOT3D configuration file name | - | *2, *3 |
| vtk_file | VTK file name | - | *2, *3, *5 |
| vtk_in_prefix | VTK time-series file name prefix | - | *2, *3, *5 |
| vtk_in_suffix | VTK time-series file name suffix | - | *2, *3, *5 |
| ucd_inp | AVSUCD file name | - | ASCII format *2 |
| in_prefix | Prefix of time-series AVSUCD data files | - | Binary format *2 |
| in_suffix | Suffix of time-series AVSUCD data files | - | Binary format *2 |
| format | Step number format for time-series data | "%05d" | |
| out_dir | Output file directory | "." | Output file directory path *1 |
| out_prefix | Output file name prefix | "output." | |
| start_step | Start step number | 1 | *6 |
| end_step | End step number | 1 | *6 |
| n_layer | Number of octree division levels | 0 | Integer from 0 to 7 |
| output_type | Output file format | 0 | 0: SPLIT format; 1: Subvolume aggregation; 2: Step aggregation; any other value: no output |
| file_number | Number of output files | 0 | Integer greater than or equal to 0; 0 means the number of subvolumes (subvolume aggregation files) |
| mpi_volume_div | Number of subvolume divisions | 1 | *7 |
| mpi_step_div | Number of step divisions | 1 | *7 |
| mpi_div | Division axis in MPI parallel processing | 2 | 0: Determined by the number of step divisions and the number of subvolume divisions (mpi_volume_div and mpi_step_div are specified); 1: Priority on the subvolume division axis; 2: Priority on the step division axis |
| multi_elem_type | Flag for unstructured grid data with mixed elements | 0 | 0: Consists of only one type of element; 1: Consists of multiple types of elements |
| temp_delete | Whether to delete temporary files when processing mixed-element data | 1 | 0: Keep temporary files; 1: Delete temporary files |

*1. The directory can be specified as an absolute or relative path, but ~ (tilde) cannot be used.
*2. Specify only one of the following: field_file, stl_binary_file, plot3d_config_file, vtk_file, vtk_in_prefix (with vtk_in_suffix), ucd_inp, or in_prefix (with in_suffix).
*3. If the input data is a structured grid, it is output as first-order hexahedral elements (unstructured grid) in 3D and as first-order quadrilateral elements (unstructured grid) in 2D.
*4. Only the parameters nstep, ndim, dim1, dim2, dim3, veclen, coord[123], and variable are referenced.
*5. For the VTK Legacy format, the program automatically detects which of the five data formats is used (VTK Structured Points, VTK Structured Grid, VTK Rectilinear Grid, VTK Unstructured Grid, and VTK Polygonal Data).
*6. Specify only for time-series data.
*7. If mpi_volume_div and mpi_step_div are specified, an error occurs if their product does not match the number of processes.
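Some of the rules above can be checked before submitting a job. The sketch below assumes the `key=value` syntax with `#` comment lines seen in the sample parameter files on this page; the filter's own parser may accept or reject other forms. The names `parse_params`, `check_params`, and `INPUT_KEYS` are hypothetical, and the list of input parameters is taken from the entries marked with note 2 in the table.

```python
# Parameters of which exactly one is expected (note 2); an assumption
# based on the parameter table, not on the filter source code.
INPUT_KEYS = ("field_file", "stl_binary_file", "plot3d_config_file",
              "vtk_file", "vtk_in_prefix", "ucd_inp", "in_prefix")


def parse_params(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        params[key.strip()] = value.strip()
    return params


def check_params(params, n_procs):
    """Return a list of problems found; an empty list means none."""
    problems = []
    chosen = [k for k in INPUT_KEYS if k in params]
    if len(chosen) != 1:
        problems.append(f"expected exactly one input parameter, got {chosen}")
    if not 0 <= int(params.get("n_layer", 0)) <= 7:
        problems.append("n_layer must be between 0 and 7")
    if "mpi_volume_div" in params and "mpi_step_div" in params:
        product = int(params["mpi_volume_div"]) * int(params["mpi_step_div"])
        if product != n_procs:
            problems.append(
                f"mpi_volume_div * mpi_step_div = {product}, "
                f"but {n_procs} processes were requested")
    return problems
```

For example, a file containing `field_file=pd3d.fld`, `mpi_volume_div=8`, and `mpi_step_div=8` passes the check for 64 processes and reports one problem for 32.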

PLOT3D Configuration File

The PLOT3D configuration file describes the file format of the PLOT3D data. Here, usebytecount is set to 1 for Fortran binaries and 0 for C binaries.

| Parameter Name | Description | Default Value |
|---|---|---|
| coordinate_file_prefix | Coordinate file name prefix | - |
| coordinate_file_suffix | Coordinate file name suffix | - |
| coordinate_mode_precision | Precision (float \| double) | double |
| coordinate_mode_usebytecount | 1 for true, 0 for false | true |
| coordinate_mode_endian | Endian (little \| big) | little |
| coordinate_mode_iblanks | 1 for true, 0 for false | false |
| solution_file_prefix | Solution file name prefix | - |
| solution_file_suffix | Solution file name suffix | - |
| solution_mode_precision | Precision (float \| double) | double |
| solution_mode_usebytecount | 1 for true, 0 for false | true |
| solution_mode_endian | Endian (little \| big) | little |
| function_file_prefix | Function file name prefix | - |
| function_file_suffix | Function file name suffix | - |
| function_mode_precision | Precision (float \| double) | double |
| function_mode_usebytecount | 1 for true, 0 for false | true |
| function_mode_endian | Endian (little \| big) | little |

MPI Parallel Processing

This section explains how the number of divisions is determined in MPI parallel processing. The following examples consider processing data with 50 steps * 8 subvolumes.

  1. Step splitting axis priority

    • If the number of processes is less than or equal to the number of steps, the steps are divided among the processes. Example: when starting with 8 processes, each process handles 6 steps * 8 subvolumes or 7 steps * 8 subvolumes.
    • If the number of processes is greater than the number of steps, the number of processes used for filtering is an integer multiple of the number of steps, and the subvolumes are also divided. Example: when starting with 128 processes, 50 * 2 = 100 processes are used (28 processes remain unused), and each process handles 1 step * 4 subvolumes.
  2. Subvolume splitting axis priority

    • If the number of processes is less than or equal to the number of subvolumes, all subvolumes are divided by the number of processes. Example) When starting with 8 processes, the area responsible for each process is 50 steps * 1 subvolume.
    • If the number of processes is greater than the number of subvolumes, the number of processes used for filtering is an integer multiple of the number of subvolumes, and the steps are also divided. Example: when starting with 128 processes, 8 * 16 = 128 processes are used (0 processes remain unused), and each process handles 3 steps * 1 subvolume or 4 steps * 1 subvolume, since 50 steps are divided into 16 groups.
  3. User-Specified Splitting

    • When the numbers of divisions are specified in the parameter file, an error occurs if their product does not match the number of processes.
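The two priority rules can be sketched in Python as follows. This is an approximation inferred from the examples above, not the filter's actual algorithm: in particular, how the filter behaves when the division count does not evenly divide the other axis is an assumption. The function names `plan_divisions` and `share_sizes` are hypothetical.

```python
def plan_divisions(n_procs, n_steps, n_subvolumes, priority="step"):
    """Return (step_div, volume_div, used_procs) for the given priority."""
    if priority == "step":
        if n_procs <= n_steps:
            step_div, volume_div = n_procs, 1
        else:
            step_div = n_steps
            volume_div = min(n_procs // n_steps, n_subvolumes)
    elif priority == "volume":
        if n_procs <= n_subvolumes:
            step_div, volume_div = 1, n_procs
        else:
            volume_div = n_subvolumes
            step_div = min(n_procs // n_subvolumes, n_steps)
    else:
        raise ValueError("priority must be 'step' or 'volume'")
    return step_div, volume_div, step_div * volume_div


def share_sizes(total, parts):
    """Sizes that occur when total items are spread over parts groups."""
    base, extra = divmod(total, parts)
    return sorted({base, base + 1} if extra else {base})


print(plan_divisions(8, 50, 8, "step"))    # (8, 1, 8)
print(share_sizes(50, 8))                  # [6, 7]
print(plan_divisions(128, 50, 8, "step"))  # (50, 2, 100)
```

With step priority and 8 processes, each process gets 6 or 7 of the 50 steps; with 128 processes, 100 are used and 28 are left over, matching the first example above.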

How to run in a staging environment

This section describes how to run the filter in a staging environment on a supercomputer. The filter program must be started with the settings in the parameter file kept consistent with the file transfer (staging) settings. In addition, depending on the output format of the filter program, multiple processes may write to a single file. In that case, a shared directory that can be accessed by multiple processes must be specified as the output destination.

Execution shell script and parameter file

```
#!/bin/bash -x
#
#PJM --rsc-list "elapse=01:00:00"
#PJM --rsc-list "node=64"
#PJM --rsc-list "rscgrp=small"
#PJM --stg-transfiles all
#PJM --mpi "proc=64"
#PJM --mpi "use-rankdir"
#PJM --stgin  "rank=* ./filter             %r:./"		………… ①
#PJM --stgin  "rank=* ./param.txt          %r:./"		………… ②
#PJM --stgin  "rank=0 /data/ucd/ucd*.dat    0:../"		………… ③
#PJM --stgout "rank=* %r:../output*.dat        ./"		………… ④
#PJM --stgout "rank=* %r:./pbvr_filter.*      ./LOG/"	………… ⑤
#PJM -S

. /work/system/Env_base

export PARALLEL=8
export OMP_NUM_THREADS=8

mpiexec -n 64 lpgparm -p 4MB -s 4MB -d 4MB -h 4MB -t 4MB filter param.txt  ……… ⑥
```
  1. Transfer the executable module to the rank directory of each process.

  2. Transfer the parameter file to the rank directory of each process.

  3. Transfer the input data to a shared directory (the file specified in the parameter file)

  4. Transfer output data from a shared directory to a local directory

  5. Transfer log and error files from the rank directory to the local directory

  6. Start the executable module in the rank directory of each process using the parameter file in the rank directory of each process as an argument.

    #
    in_dir=../			………… ⑦
    field_file=pd3d.fld
    out_prefix=case0
    out_dir=./			………… ⑧
    file_type=0			………… ⑨
    n_layer=3
    start_step=0
    end_step=511
    
  7. Specify the path of the input data (a relative path in the staging environment; in the above example, the input data is read from the shared directory)

  8. Specify the path of the output data (a relative path in the staging environment; in the above example, output goes to the rank directory of each process)

  9. Specify the file output format (in the above example, SPLIT format)

I/O Files and Directories

The relationship between the input and output files and directories of the filter program in the staging environment is shown below. There is no problem if the I/O destination is specified as a shared directory (except for log and error files). As for the output data, only SPLIT format can be output to the rank directory. Other file formats require the use of shared directories for data aggregation.

IO Type Rank Directory Shared Directory
Input Parameter File ○*1
Input Input Data ○*2
Output Output Data (SPLIT Format)
Output Output Data (Step Aggregation Format) ×
Output Output Data (Subvolume Aggregation Format) ×
Output Logs & Error Files ○*3 ×

*1. The parameter file is input only by rank 0.
*2. Only when all input files can be transferred to the rank directories of all processes.
*3. The output destination is fixed to the rank directory.

Running Unstructured Grids with Mixed Element Types

For unstructured grid volume data that contains multiple element types, the filter program first generates UCD binary data split by element type, then divides the UCD binary data of each single element type into subvolumes, and outputs files that serve as input to the PBVR server.
By setting the parameter "multi_element_type" to 1 in the parameter file as shown below, the filter program outputs subvolume files divided by element type.

#
in_dir=.
in_prefix=MULTI
in_suffix=.dat
out_dir=.
out_prefix=div
out_prefix=.dat
format=%03d
start_step=1
end_step=20
multi_element_type=1

The files output for each element type are named by adding the two-digit element type code to the beginning of the file name. The element names and element type codes are listed below.

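| Element Name | Element Type Code |
|---|---|
| Triangle | 2 |
| Quadrilateral | 3 |
| Tetrahedron | 4 |
| Pyramid | 5 |
| Prism | 6 |
| Hexahedron | 7 |
| Second-order triangle | 9 |
| Second-order quadrilateral | 10 |
| Second-order tetrahedron | 11 |
| Second-order hexahedron | 14 |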

For volume data composed of first-order tetrahedral elements and second-order tetrahedral elements, the above parameter file produces the following output file names.

Original Mixed Data
MULTI001.dat
MULTI002.dat
MULTI003.dat
MULTI004.dat
MULTI005.dat
...
MULTI020.dat

↓ Split

| First-order tetrahedral data | Second-order tetrahedral data |
|---|---|
| 04-div001_~ | 11-div001_~ |
| 04-div002_~ | 11-div002_~ |
| 04-div003_~ | 11-div003_~ |
| 04-div004_~ | 11-div004_~ |
| 04-div005_~ | 11-div005_~ |
| ... | ... |
| 04-div020_~ | 11-div020_~ |
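The leading part of these names can be reproduced with a short sketch. It assumes a three-digit step format (`%03d`, matching the `001` to `020` numbering shown) and covers only the part of the name before the elided `_~` remainder. The dictionary keys are informal English labels chosen for this example, and the function name `per_type_stem` is hypothetical.

```python
# Element type codes from the table above; the keys are informal labels.
ELEMENT_TYPE_CODES = {
    "triangle": 2, "quadrilateral": 3, "tetrahedron": 4, "pyramid": 5,
    "prism": 6, "hexahedron": 7, "triangle2": 9, "quadrilateral2": 10,
    "tetrahedron2": 11, "hexahedron2": 14,
}


def per_type_stem(element, out_prefix, step, step_format="%03d"):
    """Two-digit element code, a hyphen, the output prefix, then the step."""
    code = ELEMENT_TYPE_CODES[element]
    return f"{code:02d}-{out_prefix}{step_format % step}"


print(per_type_stem("tetrahedron", "div", 1))    # 04-div001
print(per_type_stem("tetrahedron2", "div", 20))  # 11-div020
```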