Template Format Specification - ugcs/GeoHammer GitHub Wiki

CSV templates define how to parse and interpret CSV data files from various survey instruments. Templates are YAML files that specify file format parameters, data mapping, and parsing rules. When opening a CSV file, GeoHammer automatically selects the appropriate template by matching the file's header against template regex patterns.

Template Location

Templates must be placed in the templates/ directory relative to the GeoHammer executable. The application loads all .yaml files from this directory on startup and monitors the directory for changes.

Template Selection

Templates apply to files with the extensions .csv, .asc, and .txt that contain separator-delimited (for example, comma-separated) or fixed-width data columns.

When opening a CSV file, GeoHammer:

  1. Reads the first 50 lines of the file
  2. Tests each template's match-regex against this content
  3. Uses the first matching template for parsing
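
The selection steps above can be sketched in Python. This is an illustration only, not GeoHammer's actual code: the function name select_template, the in-memory template dictionaries, and the 'gpr-positions' template are hypothetical, and the sketch assumes the regex is searched in multi-line mode rather than matched against the whole text.

```python
import re
from typing import List, Optional


def select_template(file_text: str, templates: List[dict], max_lines: int = 50) -> Optional[dict]:
    """Return the first template whose match-regex is found in the leading lines."""
    head = "\n".join(file_text.splitlines()[:max_lines])
    for template in templates:
        # Assumption: '^' anchors at the start of any line (MULTILINE search).
        if re.search(template["match-regex"], head, flags=re.MULTILINE):
            return template
    return None


# Hypothetical templates, listed in the order they would be tested.
templates = [
    {"code": "gpr-positions", "match-regex": r"^.*Trace,Latitude,Longitude.*"},
    {"code": "magnetometer", "match-regex": r"^.*Latitude,Longitude,Date,Time,MagField.*"},
]

sample = (
    "# exported by logger\n"
    "Latitude,Longitude,Date,Time,MagField\n"
    "50.1,14.2,2024/03/15,14:30:25.123,48211.552\n"
)

selected = select_template(sample, templates)
print(selected["code"] if selected else "no template matched")
```

Because the first match wins, a broad match-regex placed early can shadow a more specific template, so patterns are best kept as specific as the file header allows.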

Template Structure

name: 'Template Display Name'
code: 'template-code'
file-type: CSV
match-regex: >-
  ^.*Latitude,Longitude,Date,Time,MagField.*
file-format:
  has-header: true
  comment-prefix: '#'
  decimal-separator: '.'
  separator: ','
data-mapping:
  # Position fields (required)
  latitude:
    header: 'Latitude'
  longitude:
    header: 'Longitude'
  # Time fields (required)
  date:
    header: 'Date'
    format: 'yyyy/MM/dd'
  time:
    header: 'Time'
    format: 'H:mm:ss.fff'
  # Data fields
  data-values:
    - header: 'MagField'
      semantic: 'TMI'
      units: 'nT'
      decimals: 3

Header Section

| Field | Required | Description |
|---|---|---|
| name | Yes | Template display name |
| code | Yes | Unique identifier for the template |
| file-type | Yes | File type: CSV, ColumnsFixedWidth, or Segy |
| match-regex | Yes | Multi-line regex to identify compatible files |

File Format Section (file-format)

Comma-Separated Format

| Field | Required | Description |
|---|---|---|
| has-header | Yes | true if file contains column headers |
| separator | Yes | Column separator: ',', '\t', ';', etc. |
| comment-prefix | Yes | Comment line prefix: '#', '%', '//' |
| decimal-separator | Yes | Decimal separator: '.' or ',' |
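
To show how these four settings interact, here is a minimal Python sketch of a reader for a semicolon-separated file with '%' comments and decimal commas. The helper names parse_delimited and to_float are hypothetical, quoted fields are not handled, and GeoHammer's own parser may behave differently.

```python
def parse_delimited(text, separator=",", comment_prefix="#", has_header=True):
    """Split text into (header, rows), skipping blank and comment lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(comment_prefix):
            continue
        rows.append([cell.strip() for cell in line.split(separator)])
    header = rows.pop(0) if has_header and rows else None
    return header, rows


def to_float(cell, decimal_separator="."):
    """Convert a numeric cell that may use ',' as its decimal separator."""
    return float(cell.replace(decimal_separator, "."))


sample = "% logger v2\nLat;Lon;MagField\n50,1234;14,5678;48211,5\n"
header, rows = parse_delimited(sample, separator=";", comment_prefix="%")
values = [to_float(cell, decimal_separator=",") for cell in rows[0]]
print(header, values)
```

Note that a decimal separator of ',' only works when the column separator is something else, such as ';' or a tab.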

Fixed-Width Format

| Field | Required | Description |
|---|---|---|
| column-lengths | Yes | Array of column widths in characters |
| comment-prefix | Yes | Comment line prefix: '#', '%', '//' |
| decimal-separator | Yes | Decimal separator: '.' or ',' |
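
As an illustration of what column-lengths describes, the following Python sketch cuts one line into fields at the cumulative column widths. The function name split_fixed_width and the sample widths are assumptions made for this example, and trimming the padding of each field is a choice of the sketch.

```python
from itertools import accumulate


def split_fixed_width(line, column_lengths):
    """Cut a line into fields at the cumulative column widths and trim padding."""
    ends = list(accumulate(column_lengths))
    starts = [0] + ends[:-1]
    return [line[start:end].strip() for start, end in zip(starts, ends)]


# Hypothetical layout: two 10-character columns followed by a 9-character column.
column_lengths = [10, 10, 9]
line = "   50.1234   14.5678 48211.55"

fields = split_fixed_width(line, column_lengths)
print(fields)
```

In this sketch a line shorter than the declared widths yields empty strings for the missing columns instead of raising an error.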

Data Mapping Section (data-mapping)

Required Position Fields

latitude:
  header: 'Lat'        # Column header name (if has-header: true)
  # OR
  index: 0             # Column index (if has-header: false)

longitude:
  header: 'Lon'        # Column header name
  # OR  
  index: 1             # Column index
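
The header/index choice can be pictured as a small lookup. The Python sketch below is an assumption about the behaviour, not GeoHammer's code: resolve_column is a hypothetical helper that uses the header name when a header row exists and otherwise falls back to the index.

```python
def resolve_column(field, header_row):
    """Return the zero-based column position for a data-mapping field."""
    if header_row is not None and "header" in field:
        # Exact, case-sensitive comparison; raises ValueError if the name is absent.
        return header_row.index(field["header"])
    return field["index"]


header_row = ["Lat", "Lon", "Alt"]
print(resolve_column({"header": "Lon"}, header_row))  # position found by name
print(resolve_column({"index": 0}, None))             # position given directly
```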

Optional Position Fields

altitude:
  header: 'Alt'        # Altitude/elevation column
  
trace-number:
  header: 'Trace'      # Trace number

Date and Time Fields

Separate date and time columns:

date:
  header: 'Date'
  format: 'yyyy/MM/dd'           # Java DateTimeFormatter pattern
  
time:
  header: 'Time'  
  format: 'HH:mm:ss.fff'         # Java DateTimeFormatter pattern
  type: UTC                      # Optional: UTC or GPST

Combined date-time column:

date-time:
  header: 'DateTime'
  format: 'yyyy-MM-dd[ ]HH:mm:ss[.fff]'  # Brackets indicate optional parts
  type: UTC                              # Optional: UTC or GPST

Timestamp column:

timestamp:
  header: 'Timestamp'            # Unix timestamp in milliseconds
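
For reference, a millisecond Unix timestamp converts to a calendar date and time as in the Python sketch below. The helper name from_unix_millis is hypothetical, and the result is expressed in UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_unix_millis(value: str) -> datetime:
    """Convert a millisecond Unix timestamp (as read from a cell) to a UTC datetime."""
    # Integer milliseconds are added as a timedelta to avoid float rounding.
    return EPOCH + timedelta(milliseconds=int(value))


print(from_unix_millis("1710513025123").isoformat())
```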

Date from file name:

date:
  source: FileName               # Extract date from filename
  regex: '\d{4}-\d{2}-\d{2}'     # Regex to find date in filename
  formats:                       # Multiple possible formats
    - 'yyyy-MM-dd'
    - 'yyyyMMdd'
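
The file-name lookup can be sketched in Python as a regex search followed by trying each format in order. The helper date_from_file_name, the sample file name, and the STRPTIME table that maps the template patterns to Python directives are all assumptions made for this illustration.

```python
import re
from datetime import date, datetime
from typing import List, Optional

# Assumed translation of the two template patterns into strptime directives.
STRPTIME = {"yyyy-MM-dd": "%Y-%m-%d", "yyyyMMdd": "%Y%m%d"}


def date_from_file_name(file_name: str, regex: str, formats: List[str]) -> Optional[date]:
    """Find the date text with the regex, then parse it with the first format that fits."""
    match = re.search(regex, file_name)
    if match is None:
        return None
    for pattern in formats:
        try:
            return datetime.strptime(match.group(0), STRPTIME[pattern]).date()
        except ValueError:
            continue
    return None


found = date_from_file_name(
    "survey_2024-03-15_line4.csv", r"\d{4}-\d{2}-\d{2}", ["yyyy-MM-dd", "yyyyMMdd"]
)
print(found)
```

With the regex shown, only dashed dates are ever captured, so the second format would come into play only if the regex were widened to accept undashed dates as well.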

SGY Traces Section (sgy-traces)

The elevation profile for SGY files can be loaded from external positional files. Use the sgy-traces section to tell GeoHammer how to map data from the positional file to SGY traces. Columns listed in sgy-traces should also be added to the data-values section.

sgy-traces:
  - header: "ECHO:Trace Hi"     # Column header with the SGY trace index
  - header: "ECHO:Trace Lo"     # Another column where the trace index can be stored
  - header: "GPR:Trace"

Data Values Section (data-values)

data-values:
  - header: 'MagField'           # Column header
    semantic: 'TMI'              # Data type identifier
    units: 'nT'                  # Units for display
    decimals: 3                  # Decimal places to display (default 2)
  - header: 'Altitude'
    semantic: 'Altitude'
    units: "m"
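
The effect of decimals and units on display can be pictured with the Python sketch below. The helper format_value is hypothetical, and the rounding GeoHammer applies may differ from Python's.

```python
def format_value(value: float, decimals: int = 2, units: str = "") -> str:
    """Render a value with a fixed number of decimal places and optional units."""
    text = f"{value:.{decimals}f}"
    return f"{text} {units}".strip()


print(format_value(48211.5517, decimals=3, units="nT"))
print(format_value(102.4, units="m"))  # falls back to the default of 2 decimals
```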

Common Semantic Types:

  • TMI - Total Magnetic Intensity
  • TMI_anomaly - Magnetic anomaly
  • Bx, By, Bz - Magnetic field components
  • Altitude - Height above sea level
  • Altitude AGL - Height above ground level
  • Line - Survey line identifier

Date-Time Format Patterns

Common Format Patterns

| Pattern | Example | Description |
|---|---|---|
| yyyy/MM/dd | 2024/03/15 | Year/Month/Day |
| dd-MMM-yyyy | 15-Mar-2024 | Day-Month-Year |
| MM/dd/yyyy | 03/15/2024 | US format |
| HH:mm:ss.fff | 14:30:25.123 | 24-hour time with milliseconds |
| HH:mm:ss.ff | 14:30:25.12 | 24-hour time with centiseconds |
| yyyy-MM-dd[ ]HH:mm:ss[.fff] | 2024-03-15 14:30:25.123 | ISO format with optional milliseconds |

Format Pattern Symbols

| Symbol | Meaning | Examples |
|---|---|---|
| y | Year | yy=24, yyyy=2024 |
| M | Month | M=3, MM=03, MMM=Mar |
| d | Day | d=5, dd=05 |
| H | Hour (0-23) | H=9, HH=09 |
| h | Hour (1-12) | h=9, hh=09 |
| m | Minute | m=5, mm=05 |
| s | Second | s=5, ss=05 |
| f | Fraction of second | f=1, fff=123 |
| [ ] | Optional literal space | Space that may or may not be present |
| [.fff] | Optional pattern | Pattern that may or may not be present |
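
To make the symbols and optional brackets concrete, the Python sketch below expands every bracketed section into a "present" and an "absent" variant, translates the symbols into strptime directives, and tries the variants in turn. It only mimics the behaviour described here: the helper names are hypothetical, the symbol mapping is an assumption, and any letter in the pattern that matches a symbol is treated as a symbol.

```python
import re
from datetime import datetime
from itertools import product

TOKEN = re.compile(r"y+|M+|d+|H+|h+|m+|s+|f+")
SIMPLE = {"d": "%d", "H": "%H", "h": "%I", "m": "%M", "s": "%S", "f": "%f"}


def to_strptime(pattern: str) -> str:
    """Translate template symbols (yyyy, MM, fff, ...) into strptime directives."""
    def replace(match):
        token = match.group(0)
        if token[0] == "y":
            return "%y" if len(token) == 2 else "%Y"
        if token[0] == "M":
            return "%b" if len(token) >= 3 else "%m"
        return SIMPLE[token[0]]
    return TOKEN.sub(replace, pattern)


def expand_optional(pattern: str) -> list:
    """Return every variant of the pattern with each [..] section kept or dropped."""
    parts = re.split(r"\[([^\]]*)\]", pattern)  # odd positions are optional sections
    choices = [[part] if i % 2 == 0 else [part, ""] for i, part in enumerate(parts)]
    return ["".join(combo) for combo in product(*choices)]


def parse_with_pattern(text: str, pattern: str) -> datetime:
    """Parse text with the first variant of the pattern that fits."""
    for variant in expand_optional(pattern):
        try:
            return datetime.strptime(text, to_strptime(variant))
        except ValueError:
            continue
    raise ValueError(f"{text!r} does not match {pattern!r}")


PATTERN = "yyyy-MM-dd[ ]HH:mm:ss[.fff]"
print(parse_with_pattern("2024-03-15 14:30:25.123", PATTERN))
print(parse_with_pattern("2024-03-15 14:30:25", PATTERN))
```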

Multiple Format Support

For files with varying date formats:

date:
  header: 'Date'
  formats:
    - 'yyyy-MM-dd'
    - 'dd/MM/yyyy'
    - 'MMM dd, yyyy'

Data Validation

Apply quality control filters:

data-validation: '{MagValid} == 1'                   # Single condition
# OR
data-validation: '{SNR} > 30 && {SD} < 100'          # Multiple conditions

Reference any data column using {ColumnName} syntax.
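
A minimal Python sketch of how such an expression could be checked against one row is shown below. It is deliberately limited: row_is_valid is a hypothetical helper that supports only numeric comparisons joined by &&, and it assumes that a row failing the expression is rejected. The full expression syntax accepted by GeoHammer may be richer.

```python
import operator
import re

OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}
# One clause looks like: {Column} <operator> <number>
CLAUSE = re.compile(r"^\s*\{([^}]+)\}\s*(==|!=|>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")


def row_is_valid(expression: str, row: dict) -> bool:
    """Return True when every &&-joined clause holds for the row's values."""
    for clause in expression.split("&&"):
        match = CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        column, op, literal = match.groups()
        if not OPS[op](float(row[column]), float(literal)):
            return False
    return True


print(row_is_valid("{SNR} > 30 && {SD} < 100", {"SNR": "42.5", "SD": "12"}))
print(row_is_valid("{SNR} > 30 && {SD} < 100", {"SNR": "18", "SD": "12"}))
```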

Skip Lines

Skip file content until a pattern is found:

skip-lines-to:
  match-regex: '^Data.*'        # Skip until line starting with "Data"
  skip-matched-line: true       # Also skip the matched line
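
The effect of the two settings can be sketched in Python as follows. The function skip_lines_to is a hypothetical helper, and returning nothing when the pattern never appears is an assumption of this sketch.

```python
import re


def skip_lines_to(lines, match_regex, skip_matched_line=True):
    """Drop leading lines up to the first line that matches the regex."""
    pattern = re.compile(match_regex)
    for i, line in enumerate(lines):
        if pattern.search(line):
            return lines[i + 1:] if skip_matched_line else lines[i:]
    return []  # assumption: no marker line means nothing is kept


lines = ["Instrument: X", "Operator: Y", "Data section", "Lat,Lon", "50.1,14.2"]
print(skip_lines_to(lines, r"^Data.*"))
print(skip_lines_to(lines, r"^Data.*", skip_matched_line=False))
```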

Regex Column Extraction

Extract values using regex patterns:

latitude:
  regex: 'Lat:\s*([+-]?\d+\.\d+)'    # Extract from formatted text
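
In Python terms, the extraction amounts to a regex search that returns the first capture group. In the sketch below, extract_value is a hypothetical helper and the sample text is invented; whether GeoHammer applies the regex to a single cell or to the whole line is not specified here.

```python
import re


def extract_value(text: str, regex: str):
    """Return the first capture group of the match, or the whole match if there is none."""
    match = re.search(regex, text)
    if match is None:
        return None
    return match.group(1) if match.re.groups else match.group(0)


raw = "Lat: 50.123456 Lon: 14.654321"
latitude = extract_value(raw, r"Lat:\s*([+-]?\d+\.\d+)")
print(latitude)
```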

Troubleshooting

Template Not Matching

  • Check match-regex against actual file headers
  • Ensure regex accounts for whitespace and special characters
  • Test regex with first 50 lines of the file

Date Parsing Issues

  • Verify format pattern matches exact date/time format
  • Use optional patterns [.fff] for variable precision
  • Consider multiple formats for input with varying date or time formats

Column Mapping Errors

  • Ensure header names match exactly (case-sensitive)
  • Check for extra spaces or special characters
  • Use column indexes for files without headers