How to: get the absolute path to a data file - gmlc-dispatches/dispatches GitHub Wiki

For data files located in a Python package

  • The parent directory must be a Python package, i.e. it must contain a (possibly empty) file named __init__.py

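from pathlib import Path

try:
    # Python 3.7+: importlib.resources is part of the standard library
    from importlib import resources
except ImportError:
    # older Python versions: use the importlib_resources backport package
    import importlib_resources as resources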


def main():
    ...
    with resources.path("dispatches.models.fossil_case.ultra_supercritical_plant.storage", "initialized_integrated_storage_usc.json") as p:
        path_to_file = Path(p).resolve()
    assert path_to_file.is_file()
    use_data(path_to_file)  # use str(path_to_file) if a string object is required
    ...
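
On Python 3.9 and later, importlib.resources also provides files() and as_file(), which can express the same lookup. The following is a minimal sketch rather than the project's own helper: get_data_file_path is a hypothetical name, and the sketch assumes the package is installed as a regular directory on disk (not inside a zip archive).

```python
from importlib import resources
from pathlib import Path


def get_data_file_path(package: str, filename: str) -> Path:
    """Return the absolute path to `filename` inside the regular package `package`.

    Hypothetical helper for illustration; requires Python 3.9+.
    """
    # files() returns a Traversable for the package directory
    resource = resources.files(package) / filename
    # as_file() yields a real filesystem path; for a package stored on disk
    # this is the file itself, so the path stays valid after the block exits
    with resources.as_file(resource) as p:
        path = Path(p).resolve()
    # Fails if the file is missing, or if the package was not on disk and the
    # temporary copy made by as_file() has already been cleaned up
    if not path.is_file():
        raise FileNotFoundError(path)
    return path
```

For example, get_data_file_path("dispatches.models.fossil_case.ultra_supercritical_plant.storage", "initialized_integrated_storage_usc.json") would be expected to return the same path as the snippet above, provided the storage directory contains an __init__.py file.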

For data files not in a Python package

If the data file is in a directory that contains .py files but no __init__.py file, the method above (based on importlib.resources) cannot be used, because the directory is treated as a "namespace package" rather than a regular Python package.

Instead, use the snippet below in the Python code where the data file is needed. In this example:

  • The data file is initialized_integrated_storage_usc.json
  • The directory where the file is located is dispatches/models/fossil_case/ultra_supercritical_plant/storage
  • The corresponding module import path for the directory is dispatches.models.fossil_case.ultra_supercritical_plant.storage

from pathlib import Path

# import the module corresponding to the directory/namespace package
from dispatches.models.fossil_case.ultra_supercritical_plant import storage

# create a pathlib.Path object pointing to the directory
# (note: _path is a private attribute of the namespace package's __path__ object)
dir_path = Path(storage.__path__._path[0]).resolve()
# with Python 3.8 and later, the following shorter syntax can be used instead
# dir_path = Path(storage.__path__[0]).resolve()
assert dir_path.is_dir()

# create a pathlib.Path object pointing to the file
data_file_path = dir_path / "initialized_integrated_storage_usc.json"
assert data_file_path.is_absolute()
assert data_file_path.is_file()

# you can now use data_file_path or str(data_file_path) as needed
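
The directory lookup above can also be wrapped in a small function that avoids the private _path attribute by iterating over __path__. This is only a sketch: package_dir is a hypothetical helper name, and it assumes that __path__ is iterable for both regular and namespace packages on the Python versions in use.

```python
import importlib
from pathlib import Path


def package_dir(import_path: str) -> Path:
    """Return the absolute path of the first directory backing a package.

    Hypothetical helper for illustration; `import_path` is the dotted module
    path of a regular or namespace package.
    """
    module = importlib.import_module(import_path)
    # __path__ is a plain list for regular packages and a list-like object
    # for namespace packages; taking the first item via iteration avoids
    # relying on indexing or on private attributes
    first_entry = next(iter(module.__path__))
    return Path(first_entry).resolve()
```

With this helper, the file path from the example would be built as package_dir("dispatches.models.fossil_case.ultra_supercritical_plant.storage") / "initialized_integrated_storage_usc.json".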

Example: data file in a temporary directory

Here, the data file is located inside a directory that is created automatically at runtime and cannot be accessed as a Python module (reliably, or at all).

In this case, use the first non-temporary base directory as an "anchor" to build the file path:

from pathlib import Path

# import the module corresponding to the first non-temporary directory/namespace package
from dispatches.models.nuclear_case import flowsheets

base_dir_path = Path(flowsheets.__path__._path[0]).resolve()
temp_dir_path = base_dir_path / "bidding_plugin_test_multiperiod_nuclear"
data_file_path = temp_dir_path / "tracker_detail.csv"

assert data_file_path.is_file()
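
Because the temporary directory only exists after the code that creates it has run, a bare assert gives little information when the file is missing. The sketch below shows one possible way to build the path from an anchor and fail with a clearer message. find_runtime_file is a hypothetical helper name, and the demonstration uses a throwaway directory from tempfile as a stand-in for the real anchor directory.

```python
import tempfile
from pathlib import Path


def find_runtime_file(base_dir, *relative_parts) -> Path:
    """Build an absolute path below `base_dir` and check that the file exists.

    Hypothetical helper for illustration only.
    """
    path = Path(base_dir).resolve().joinpath(*relative_parts)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; has the code that creates it been run yet?"
        )
    return path


# Demonstration: a throwaway directory stands in for the anchor directory
# (e.g. the directory of dispatches.models.nuclear_case.flowsheets)
with tempfile.TemporaryDirectory() as anchor:
    run_dir = Path(anchor) / "bidding_plugin_test_multiperiod_nuclear"
    run_dir.mkdir()
    (run_dir / "tracker_detail.csv").write_text("placeholder\n")

    found = find_runtime_file(
        anchor, "bidding_plugin_test_multiperiod_nuclear", "tracker_detail.csv"
    )
    found_name = found.name
```

In real use, the first argument would be the anchor directory computed from the package, such as base_dir_path in the snippet above.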