Relabel Residue Numbers or Segment IDs - k-ngo/CATMD GitHub Wiki

Relabel Residue Numbers or Segment IDs

Overview and Methodology

What It Does

This tool rewrites residue numbers (resids) or segment identifiers (segids) for a selected group of atoms in a molecular dynamics trajectory. It can renumber the selected residues sequentially from a chosen starting value, or assign a single new segid to every atom in the selection. The result is written to a new trajectory, which keeps labels consistent and simplifies downstream analysis and visualization.

How It Works

  • Objective: Standardize or customize atom labeling in trajectories by modifying residue numbers or segment identifiers.

  • Process:

    • Residue Renumbering: Assigns new sequential residue numbers starting from a user-defined value.
    • Segment ID Assignment: Applies a new segid to all atoms in a given selection.
    • Frame Iteration: Applies changes frame-by-frame and writes out a new trajectory.
  • Use Cases:

    • Resetting residue numbering in trimmed or mutated protein segments.
    • Harmonizing segment IDs for multi-component systems.
    • Preparing trajectories for docking, visualization, or export to other tools.
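
The two relabeling operations listed under Process can be sketched in plain Python. This is an illustration only, not CATMD's implementation: the `Atom` record and both function names are hypothetical, and the sketch assumes that "sequential" means each distinct residue in the selection receives the next consecutive number, so gaps in the original numbering are closed.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Atom:
    """Hypothetical stand-in for one atom's labels."""
    name: str
    resid: int
    segid: str


def renumber_residues(atoms: List[Atom], start_resid: int = 1) -> List[Atom]:
    """Number distinct (segid, resid) pairs consecutively, in order of first appearance."""
    mapping: Dict[Tuple[str, int], int] = {}
    relabeled = []
    for atom in atoms:
        key = (atom.segid, atom.resid)
        if key not in mapping:
            # Assumption: gaps in the original numbering are closed.
            mapping[key] = start_resid + len(mapping)
        relabeled.append(replace(atom, resid=mapping[key]))
    return relabeled


def assign_segid(atoms: List[Atom], new_segid: str) -> List[Atom]:
    """Give every atom in the selection the same segment ID."""
    return [replace(atom, segid=new_segid) for atom in atoms]


selection = [
    Atom("N", 215, "DOM"),
    Atom("CA", 215, "DOM"),
    Atom("N", 216, "DOM"),
    Atom("N", 220, "DOM"),  # residues 217-219 are missing from the selection
]
print([a.resid for a in renumber_residues(selection, start_resid=1)])  # [1, 1, 2, 3]
print({a.segid for a in assign_segid(selection, "TOX")})  # {'TOX'}
```

Atoms that share a residue keep a shared number, which is why the mapping is keyed on the original residue rather than on the atom index.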

Configuration and Inputs

Prerequisites

  • Requires a loaded trajectory.
  • Atom selection must be valid and non-empty.

Key Configuration Options

  • System Definition:

    • system_sel: Atom selection string for the group to relabel (e.g., protein, segid TOX).
  • Relabeling Type:

    • 'Residue Numbers': Renumber residues sequentially from a starting value.
    • 'Segment IDs': Assign a new segid to all atoms in the selection.
  • Residue Renumbering:

    • start_resid: Starting number for renumbering (e.g., 1).
  • Segment ID Assignment:

    • new_segid: Custom segid string to assign (e.g., TOX, PROT, LIG).
  • Frame Range:

    • begin_frame, end_frame, step: Define the subset of frames to process. Setting end_frame = -1 processes every frame from begin_frame to the end of the trajectory.
  • Output Settings:

    • output_traj: Output filename (e.g., relabeled.dcd).
    • load_output_traj: Whether to automatically load the new trajectory for further analysis.
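
As a rough illustration of how the frame range options combine, the sketch below turns begin_frame, end_frame, and step into the list of frame indices to process. The function `resolve_frames` is hypothetical, and two details are assumptions rather than documented CATMD behavior: frames are indexed from 0, and end_frame is treated as an exclusive upper bound.

```python
def resolve_frames(n_frames: int, begin_frame: int = 0,
                   end_frame: int = -1, step: int = 1) -> range:
    """Return the frame indices selected by begin_frame, end_frame, and step."""
    if step < 1:
        raise ValueError("step must be at least 1")
    if not 0 <= begin_frame < n_frames:
        raise ValueError("begin_frame is outside the trajectory")
    # end_frame = -1 means "through the last frame".
    # Assumption: any other end_frame is an exclusive upper bound.
    stop = n_frames if end_frame == -1 else min(end_frame, n_frames)
    if stop <= begin_frame:
        raise ValueError("frame range is empty")
    return range(begin_frame, stop, step)


frames = resolve_frames(n_frames=1000, begin_frame=100, end_frame=-1, step=50)
print(len(frames), frames[0], frames[-1])  # 18 100 950
```

Computing the index list up front makes it easy to see how many frames a given step will write before starting a long run.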

Output

  • Relabeled Trajectory:

    • Saved to: relabeled.dcd (or custom name defined in output_traj).
  • Console Output:

    • Summary of selections and relabeling type.
    • Confirmation of new residue numbering or segid assignment.
    • Total number of frames processed.

Example Scenarios

Renumbering a Truncated Domain

  • Scenario: A protein domain is extracted from a larger complex.
  • Method: relabel_type='Residue Numbers', system_sel='segid DOM', start_resid=1.
  • Outcome: Domain residues are numbered consecutively from 1, with no gaps or leftover offsets from the original complex.

Reassigning Segment IDs

  • Scenario: Multiple chains need distinct segids for clear selection.
  • Method: relabel_type='Segment IDs', system_sel='chain A', new_segid='CHAINA'.
  • Outcome: Chain A atoms now belong to a unique segment for easy identification.

Ligand Relabeling

  • Scenario: A ligand imported from a docking run carries an inconsistent segid.
  • Method: system_sel='resname LIG', relabel_type='Segment IDs', new_segid='LIGAND'.
  • Outcome: Ligand appears under a unified segment label, compatible with viewer selection syntax.
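
For side-by-side comparison, the three scenarios can be written as settings dictionaries. The dictionary form and the helper `missing_option` are purely illustrative, since this page does not show how CATMD receives its options. The option names and values are taken from the scenarios above, with relabel_type named as in the ligand example.

```python
from typing import Dict, Optional

scenarios: Dict[str, dict] = {
    "truncated_domain": {
        "system_sel": "segid DOM",
        "relabel_type": "Residue Numbers",
        "start_resid": 1,
    },
    "chain_a": {
        "system_sel": "chain A",
        "relabel_type": "Segment IDs",
        "new_segid": "CHAINA",
    },
    "ligand": {
        "system_sel": "resname LIG",
        "relabel_type": "Segment IDs",
        "new_segid": "LIGAND",
    },
}

# Each relabeling type depends on exactly one extra option.
REQUIRED_OPTION = {"Residue Numbers": "start_resid", "Segment IDs": "new_segid"}


def missing_option(settings: dict) -> Optional[str]:
    """Return the option the chosen relabel_type needs but lacks, or None."""
    needed = REQUIRED_OPTION[settings["relabel_type"]]
    return None if needed in settings else needed


for name, settings in scenarios.items():
    print(name, "->", missing_option(settings) or "complete")
```

Only the option that matches the chosen relabeling type is needed in each case: start_resid for 'Residue Numbers', new_segid for 'Segment IDs'.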

Usage Tips

  • Selection Validation:

    • Use concise and accurate selection strings (e.g., segid TOX and resid 5-30).
    • Avoid empty selections, which will raise an error.
  • Start Residue Number:

    • Use 1 for fresh numbering, or set start_resid to the first residue number of the original system to preserve its offset.
  • Segid Conventions:

    • Use short, descriptive segids (≤4 characters is common practice).
  • Performance:

    • For long trajectories, process a subset of frames with step > 1 during test runs.
  • Chaining Outputs:

    • Set load_output_traj=True if subsequent tools in CATMD should use the newly relabeled file without manual loading.
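
The selection and segid tips can be checked before a run with a small helper. The sketch below is hypothetical and independent of CATMD: `preflight` takes the number of atoms the selection matched (obtained however the selection is evaluated) and returns advisory notes. It rejects an empty selection and flags segids longer than the four characters mentioned above.

```python
from typing import List, Optional


def preflight(n_selected: int, new_segid: Optional[str] = None) -> List[str]:
    """Raise on unusable settings; return advisory notes for unusual ones."""
    if n_selected == 0:
        raise ValueError("selection matched no atoms")
    notes: List[str] = []
    if new_segid is not None:
        if not new_segid.strip():
            raise ValueError("new_segid must not be blank")
        if len(new_segid) > 4:
            # Advisory only: longer segids are allowed here, just less conventional.
            notes.append(
                f"segid {new_segid!r} has {len(new_segid)} characters; "
                "4 or fewer is common practice"
            )
    return notes


print(preflight(12, new_segid="TOX"))
print(preflight(12, new_segid="LIGAND"))
```

Treating the length check as a note rather than an error leaves longer segids such as LIGAND usable while still pointing out the convention.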