Deprecated: Packaging XCIST using PyInstaller

Introduction

We would like it to be as easy as possible for users to install XCIST. For this reason, we have been investigating the use of PyInstaller, which "freezes (packages) Python applications into standalone executables" under all common operating systems.

For the most part, PyInstaller seems to work well. However, we have identified a few issues and limitations in using it with XCIST, which are described below.

Generating a compiled package

As an early example, the GetMu application has been packaged using PyInstaller. A few lines of Python code, related to the ctypes library, need to be modified as described below. PyInstaller needs to be installed as described in its documentation. Then the packaged version of GetMu can be generated by simply running this command in its top-level directory:

pyinstaller --add-data="materials;materials" run_getMu.py

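The --add-data argument copies the material data files into the compiled package. The ";" separating source and destination is the Windows form; on Linux and macOS the separator is ":".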

Cautions

PyInstaller has its own methods to trace Python dependencies to determine what needs to be packaged. In our experience, the generated package may not be correct if the developer has multiple versions of Python installed on the same machine. We have found the safest approach is to ensure that only one version of Python is installed. It may also be possible to have multiple Python virtual environments and to work within the scope of one of those environments, but we have not tried that.
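One way to confirm which interpreter a build will use is to print its location and version before running PyInstaller. The following is a minimal sketch; the function name describe_interpreter is hypothetical and is not part of XCIST or PyInstaller.

```python
import sys


def describe_interpreter():
    """Collect basic facts about the Python interpreter running this script."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # In a virtual environment, sys.prefix points at the environment
        # while sys.base_prefix points at the base installation.
        "in_virtual_env": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Invoking PyInstaller as "python -m PyInstaller" with that same interpreter avoids picking up a pyinstaller script that belongs to a different installation.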

Troubleshooting PyInstaller builds is tricky. Error messages are usually unclear or missing; when things are not configured properly, the application typically just freezes. The complication is that multiple directories are involved, some in "hidden" locations, and the internal search paths must be correct. When PyInstaller builds a "onefile" executable, that file is expanded at run time into a temporary directory with a path something like:

C:\Users\user\AppData\Local\Temp\_MEIxxxxx (where the x's are digits)

When we work with a Python package that is built using pip, the files for that package have a path something like:

C:\Users\user\AppData\Local\Programs\Python\Python37\Lib\site-packages\xcist

It is simpler to work with the portable code base rather than the install base, because that involves one hidden folder instead of two.
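Because clear error messages are rare, it can help to have the application report the directories it is actually using. The sketch below is one possible diagnostic, not code taken from XCIST; the function name runtime_paths is hypothetical.

```python
import os
import sys


def runtime_paths():
    """Return the locations that matter when debugging a frozen application."""
    return {
        # Set by the PyInstaller bootloader; absent in a normal interpreter.
        "bundle_dir": getattr(sys, "_MEIPASS", None),
        "executable": sys.executable,
        "working_dir": os.getcwd(),
        "search_path": list(sys.path),
    }


paths = runtime_paths()
for name in ("bundle_dir", "executable", "working_dir"):
    print(f"{name}: {paths[name]}")
for entry in paths["search_path"]:
    print(f"sys.path entry: {entry}")
```

Printing these values at startup, both from the packaged executable and from a plain Python run, makes it easier to see which hidden folder a failing import or file lookup is searching.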

File / Folder Organization

There are two alternative implementations: portable and install.

Portable Version

To successfully build a PyInstaller executable, the following changes have been made:

  • All Python files have been moved from the xcist subfolder to the root level.
  • All Python files have been moved from the examples subfolder to the root level.
  • All compiled libraries have been moved from the xcist/lib subfolder to the root level.
  • In run_GetMu.py, the absolute path (rather than the relative path) to the xcist folder is now added to sys.path. The xcist folder is also now a sub-folder rather than a "peer" folder.
  • In CommonTools.py:
    • Modified the myPath.main variable in get_path() to point up one level.
    • Modified load_C_lib() to load binaries at the top level using literal file names (not variables). See the note below on ctypes.

The PyInstaller executable is then successfully created using the command:

pyinstaller --add-data=".;." --add-data="xcist;xcist" --distpath=run_GetMu --onefile --console run_GetMu.py

For convenience the above command has been put in the shell script package_run_GetMu.sh.

After PyInstaller has successfully run, the executable will be in the run_GetMu subfolder, or whatever argument is given to --distpath.

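The --console argument (PyInstaller's default) opens a console window for standard input and output, so the executable can be launched by double-clicking and still show its output.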

The other example, sample_axial_scan.py, is probably more representative of most XCIST applications. If the PyInstaller "--onefile" option is used, the root directory is in a "secret" location chosen by PyInstaller. To run this example, the following changes were required:

  • Code was added to sample_axial_scan.py to determine whether or not the application is running within a PyInstaller package, and based on that, what the root directory is (so it is no longer a secret).
  • The absolute path rather than relative path to the xcist folder is added to sys.path, using the root directory determined above.
  • The absolute path to the configuration file default.cfg is specified, again using the root directory determined above.

Note this code fragment:

import os
import sys

if hasattr(sys, '_MEIPASS'):
    # Running in a package created by PyInstaller.
    print("Running from PyInstaller package.")
    root_dir = sys._MEIPASS
else:
    # Not running in a PyInstaller package.
    print("Not running from PyInstaller package.")
    root_dir = os.getcwd()
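The root directory found by this check can then be used to build the absolute paths mentioned in the list above. The sketch below is illustrative only: the helper names find_root_dir and resource_path are hypothetical, and placing default.cfg directly under the root directory is an assumption.

```python
import os
import sys


def find_root_dir():
    """Return the PyInstaller bundle directory, or the working directory otherwise."""
    return getattr(sys, "_MEIPASS", os.getcwd())


def resource_path(*parts):
    """Build an absolute path below the root directory."""
    return os.path.abspath(os.path.join(find_root_dir(), *parts))


# Make the xcist folder importable regardless of where the executable was launched.
xcist_dir = resource_path("xcist")
if xcist_dir not in sys.path:
    sys.path.insert(0, xcist_dir)

# Assumed location: default.cfg directly under the root directory.
cfg_path = resource_path("default.cfg")
print(cfg_path)
```

Routing every file lookup through one helper keeps the packaged and unpackaged runs on the same code path, which limits the number of places where a wrong search path can hide.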

The PyInstaller executable is then successfully created using the command:

pyinstaller --add-data=".;." --add-data="xcist;xcist" --hidden-import="scipy.interpolate" --distpath=sample_axial_scan --onefile --console sample_axial_scan.py

For convenience the above command has been put in the shell script package_sample_axial_scan.sh.

After PyInstaller has successfully run, the executable will be in the sample_axial_scan subfolder, or whatever argument is given to --distpath.
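The commands above use ";" inside --add-data, which is the Windows separator; the separator is platform specific (os.pathsep). As a possible cross-platform alternative to the shell script, the same build could be driven from Python through PyInstaller.__main__.run, which accepts the command-line arguments as a list. This is a sketch rather than the project's actual packaging script, and the function names add_data, build_args and package are hypothetical.

```python
import os


def add_data(src, dest):
    """Format an --add-data argument with the separator for this platform."""
    return f"--add-data={src}{os.pathsep}{dest}"


def build_args(script="sample_axial_scan.py", distpath="sample_axial_scan"):
    """Assemble the same options as the command above."""
    return [
        add_data(".", "."),
        add_data("xcist", "xcist"),
        "--hidden-import=scipy.interpolate",
        f"--distpath={distpath}",
        "--onefile",
        "--console",
        script,
    ]


def package():
    """Run PyInstaller in-process; requires PyInstaller to be installed."""
    import PyInstaller.__main__
    PyInstaller.__main__.run(build_args())


# Show the arguments without starting a build.
print(" ".join(build_args()))
```

Calling package() from the top-level directory would start the build; it is left uncalled here so that the block only prints the arguments it would pass.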

Install Version

Steps to reorganize the code to work in PyInstaller:

  • Moved all compiled binaries from lib folder to xcist folder, because the binaries must be loaded into ctypes without a path name, and the loading occurs in CommonTools.py, which is in the xcist folder.
  • Modified setup.py to include the *.dll and *.so binary files from the xcist folder.
  • Modified CommonTools.py as was done for the Portable Version to find the correct path when running in a PyInstaller package.
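The setup.py change in the list above might look roughly like the following. This is a sketch using the setuptools package_data option, not a copy of the project's setup.py; the distribution name and version shown are placeholders.

```python
# Sketch of the relevant setup.py options; name and version are placeholders.
SETUP_KWARGS = {
    "name": "xcist",
    "version": "0.0.0",  # placeholder, not the real version
    "packages": ["xcist"],
    # Ship the compiled libraries that sit next to CommonTools.py.
    "package_data": {"xcist": ["*.dll", "*.so"]},
}


def main():
    """Call setuptools with the options above, as a setup.py would."""
    from setuptools import setup
    setup(**SETUP_KWARGS)
```

In a real setup.py, main() would be called at module level; it is left uncalled here so the block can be read and imported without triggering a build.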

run_GetMu.py is built into an executable using this command:

pyinstaller --add-data="../xcist;xcist" --distpath=run_GetMu --onefile --console run_GetMu.py

As of 8/14/20, this build does not run without errors. It was possible to get it to work when the binary files were kept at the top level, outside the installed package, although one could question the value of the installed-package approach if it does not include all needed files.

clean.sh

For convenience in development, a file called clean.sh was added that contains:

#!/usr/bin/bash
rm -rf build
rm -rf dist
rm -rf run_GetMu
rm -rf sample_axial_scan
rm -f *.spec

This file cleans up the artifacts generated by PyInstaller, including the executable itself.
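Since clean.sh needs a bash shell, a Python equivalent may be convenient on Windows machines without one. The following sketch removes the same targets; the function name clean and the constant ARTIFACT_DIRS are hypothetical.

```python
import glob
import os
import shutil

# Same targets as clean.sh.
ARTIFACT_DIRS = ("build", "dist", "run_GetMu", "sample_axial_scan")


def clean(root="."):
    """Delete PyInstaller output under root; return the paths that were removed."""
    removed = []
    for name in ARTIFACT_DIRS:
        path = os.path.join(root, name)
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
    for spec in glob.glob(os.path.join(root, "*.spec")):
        os.remove(spec)
        removed.append(spec)
    return removed
```

Calling clean() from the project's top-level directory deletes the artifacts; missing folders are skipped rather than treated as errors.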

Retired Risks

We have successfully demonstrated that:

  • PyInstaller can package a Python executable and its Python dependencies.
  • PyInstaller can package dependent DLLs. (GetMu uses several standard C libraries.)
  • PyInstaller can package compiled DLLs integrated using ctypes, with some limitations described below.
  • PyInstaller can package required data files.

Limitations

Use of __import__(variable)

PyInstaller dependency tracking is not able to traverse imports performed in this way. XCIST does this in feval() in CommonTools.py.

We get around this in two ways:

  • The PyInstaller command line argument --add-data=".;." copies all Python files into the executable whether or not PyInstaller can determine that they are needed. (Note that this depends on all Python files being in the root level directory. Having Python files in a subdirectory resulted in other PyInstaller issues.)
  • PyInstaller command line arguments such as --hidden-import="scipy.interpolate" force needed Python libraries to be included in the executable whether or not PyInstaller can determine that they are needed.
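To illustrate why this kind of import is invisible to dependency tracking, here is a small stand-in for a function that imports by name. It is not XCIST's feval(): the name call_by_name is hypothetical, and it uses importlib.import_module rather than __import__, although the limitation is the same.

```python
import importlib


def call_by_name(qualified_name, *args):
    """Import 'package.module.function' from a string and call it."""
    module_name, func_name = qualified_name.rsplit(".", 1)
    # The module name is only known at run time, so a static scan of the
    # source (as PyInstaller performs) cannot tell which module is needed.
    module = importlib.import_module(module_name)
    return getattr(module, func_name)(*args)


print(call_by_name("math.sqrt", 9.0))  # 3.0
```

Nothing in this source names the math module in an import statement, which is why a packaged build needs the module to be supplied some other way, such as the two workarounds listed above.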

Use of ctypes

PyInstaller attempts to automatically trace dependent files in a Python application and create a convenient distribution package or single executable file. However, there are some limitations when PyInstaller searches for dependencies in a Python application that uses ctypes, as XCIST does.

  • Included compiled libraries must be referenced by bare filenames (i.e. no leading paths).
  • Included libraries must be represented by literal strings, not variables.

This can be handled by code such as this, which loads the compiled CatSim library for either Windows or Linux:

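import os
from ctypes import cdll

# Bare, literal file names so that PyInstaller can detect the libraries.
if os.name == "nt":
    c = cdll.LoadLibrary("libcatsim64.dll")
else:
    c = cdll.LoadLibrary("libcatsim.so")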

The CatSim library files must be at the same level as the Python code, rather than in a "lib" sub-directory, so that a path need not be part of the name.
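Because a failed load otherwise gives little information, the load can be wrapped so that it reports what went wrong. The sketch below assumes the same two file names as above and keeps them as bare literal strings, so they should remain detectable by PyInstaller; the function name load_catsim is hypothetical.

```python
import os
import sys
from ctypes import cdll


def load_catsim():
    """Return the loaded CatSim library, or None after printing a diagnostic."""
    try:
        if os.name == "nt":
            return cdll.LoadLibrary("libcatsim64.dll")
        return cdll.LoadLibrary("libcatsim.so")
    except OSError as err:
        # ctypes raises OSError when the file is missing or cannot be loaded.
        print(f"Could not load CatSim library: {err}", file=sys.stderr)
        print(f"Working directory: {os.getcwd()}", file=sys.stderr)
        return None


c = load_catsim()
```

Returning None instead of letting the exception propagate is a design choice for this sketch; an application that cannot run without the library may prefer to exit with the message instead.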

Single file vs. single folder

By default, PyInstaller generates a single folder which contains an executable along with its dependencies. PyInstaller also has an option for generating a single file executable which contains compressed copies of its dependencies within that file. There are tradeoffs between the single folder and single file approaches.

Benefits of single folder:

  • Runs faster than single file because single file has many dependent files compressed within, and these need to be expanded before running. As an example, running sample_axial_scan from the single packaged file takes about 30 seconds to start up.
  • Easier to update. Often a code change will only require updating the executable file while the dependencies are the same.
  • Easier to debug issues.
  • No leftover temporary files. A single-file executable expands all the files that would be in the single-folder package into a hidden location. If the executable exits normally, the hidden files are cleaned up. However, if execution fails, the hidden files may be left behind, invisibly taking up disk space.

Benefits of single file:

  • Simpler for users. There may be many files in the "single folder" that should not be touched by the user, so the package may seem cluttered.
  • Less vulnerable to users accidentally breaking the application by moving or deleting one of the files.

Note the point above under Cautions about having only a single version of Python installed. In our experience, the single file option often does not work otherwise.

Unknown hardware / OS dependency

In packaging convenient executables for users, there will need to be separate executables for Windows, Linux, and Mac.

We have successfully run GetMu and the compiled CatSim library, and packaged it using PyInstaller as described above, on a couple of models of Dell laptops including a Dell Precision 5530. However, we get errors when running on an HP Z840. The operating system is identical between the two cases: Windows 10 Pro, 64-bit, version 1809. We have not yet identified the cause of this difference.

Update 8/14/20: This is resolved. The issue was likely the wrong version of Python (32-bit rather than 64-bit) installed on the machine that was not working.
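A 32-bit process cannot load a 64-bit library, so it may be worth checking the interpreter's word size on any machine where a build misbehaves. This is a minimal sketch of such a check; the function name interpreter_bits is hypothetical.

```python
import platform
import struct


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of the running Python."""
    return struct.calcsize("P") * 8


print(f"Python is {interpreter_bits()}-bit on {platform.machine()}")
```

The check reports the interpreter itself rather than the operating system, which matters because a 32-bit Python can be installed on 64-bit Windows.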

Compatibility with newest numpy

Our early experiments with PyInstaller showed that PyInstaller would give errors when attempting to load the newest version of numpy, but that the prior version of numpy worked fine. We need to revisit whether this gets fixed in new releases of PyInstaller.

Update 8/14/20: The newest numpy is working now. The fix was due to an update to either numpy or PyInstaller, or the original error may have been caused by having multiple Python versions installed.

Unknowns

Applications having a GUI

We have not yet experimented with packaging a Python application having a GUI using PyInstaller.

Licensing of dependent libraries

We need to determine whether we can redistribute the libraries that XCIST uses. If there are any restrictions, we will need to create a strategy for distributing XCIST without those libraries and instructing users on how they can obtain the needed libraries and build the integrated package.