PHP API - skizzerz/python-sandbox GitHub Wiki

This page details the API of the reference PHP parent implementation. The reference implementation abstracts away the RPC communication layer between the parent and child so that you only need to focus on adding your own application functionality. It can be found in the PythonSandbox directory.

First steps

To run, the reference implementation requires an autoloader that is capable of loading files from the PythonSandbox directory. Furthermore, the PythonSandbox/Constants.php file must be loaded explicitly. The following is one way to accomplish this; the exact implementation is left to the application in case it wishes to integrate loading more closely with its own processes.

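// Constants are not autoloadable, so this file must be loaded explicitly.
require_once 'PythonSandbox/Constants.php';

// Map a namespaced class name such as PythonSandbox\Sandbox to
// <this directory>/PythonSandbox/Sandbox.php and load it if it exists.
spl_autoload_register(function ($class) {
    $file = __DIR__ . '/' . str_replace('\\', '/', $class) . '.php';
    if (file_exists($file)) {
        require $file;
    }
});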

Application

The application must define a class that extends PythonSandbox\Application and implements its abstract methods. All of the Application methods are detailed below; some have default implementations and do not need to be overridden if those defaults are acceptable.

public function getPythonVersion() -> string

This function returns the Python version string, which is used to locate the Python libraries in the filesystem. For example, it should return 'python3.5' if the Python libraries are located at /usr/lib64/python3.5, or 'python3.4' if they are located at /usr/lib64/python3.4.

public function getPythonBasePath() -> string

This function returns the filesystem path under which Python is installed. For example, if loading the system Python from /usr/bin/python3 and /usr/lib64/python3.5, this should return "/usr". If loading a virtualenv, this should point to the root of the virtualenv. This must be an absolute path.

public function getSandboxBasePath() -> string

This function returns the filesystem path of the directory that contains the sandbox and libsbpreload.so binaries as well as the sandbox's lib directory. This must be an absolute path.
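As a rough illustration of the three path-related methods above, the following sketch shows a minimal subclass. The class name MyApplication and every path in it are placeholders, and it assumes that these three methods are the only abstract ones; if the base class declares others, they would need to be implemented as well.

```php
<?php
// Hypothetical application class; the name and all paths are placeholders.
class MyApplication extends PythonSandbox\Application
{
    public function getPythonVersion(): string
    {
        // Matches a library directory such as /usr/lib64/python3.5.
        return 'python3.5';
    }

    public function getPythonBasePath(): string
    {
        // System Python installed under /usr; must be an absolute path.
        return '/usr';
    }

    public function getSandboxBasePath(): string
    {
        // Assumed install location of the sandbox binary, libsbpreload.so,
        // and the sandbox's lib directory; must be an absolute path.
        return '/opt/python-sandbox';
    }
}
```

For a virtualenv, getPythonBasePath() would instead return the root of the virtualenv.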

public function getInitScriptNode(PythonSandbox\VirtualFS $fs, string $fileName) -> PythonSandbox\Node

This function returns the Node containing the "init.py" script used to run application-specific initialization, for example a PythonSandbox\RealFile instance or a PythonSandbox\VirtualFile instance. The $fs and $fileName parameters should be passed to the node constructors as-is. $fileName will always be the string "init.py", but this should be considered an implementation detail, as filenames may change in the future should the child RPC API change. The default implementation returns a PythonSandbox\VirtualFile that does not contain any Python code, which can be useful if the application has no specific initialization it wishes to perform.

public function getUserScriptNode(PythonSandbox\VirtualFS $fs, string $fileName) -> PythonSandbox\Node

This function returns the Node containing the "main.py" script used to run user code, for example a PythonSandbox\RealFile instance or a PythonSandbox\VirtualFile instance. The $fs and $fileName parameters should be passed to the node constructors as-is. $fileName will always be the string "main.py", but this should be considered an implementation detail, as filenames may change in the future should the child RPC API change. The default implementation returns a PythonSandbox\VirtualFile that does not contain any Python code; this is not particularly useful, so this method should generally be overridden.
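The two script-node methods might be overridden along the following lines. This is only a sketch: the class name and host paths are hypothetical, the class is declared abstract because it leaves the other required methods out, and the three-argument RealFile constructor (filesystem, name, real path) is taken from the initializeFilesystem example on this page.

```php
<?php
// Hypothetical partial application; a concrete subclass still has to
// implement the remaining abstract Application methods.
abstract class FileBackedApplication extends PythonSandbox\Application
{
    public function getInitScriptNode(PythonSandbox\VirtualFS $fs, string $fileName): PythonSandbox\Node
    {
        // Pass $fs and $fileName through unchanged; only the host path is ours.
        return new PythonSandbox\RealFile($fs, $fileName, '/srv/myapp/sandbox/init.py');
    }

    public function getUserScriptNode(PythonSandbox\VirtualFS $fs, string $fileName): PythonSandbox\Node
    {
        // The user's code is assumed to have been written to this host file.
        return new PythonSandbox\RealFile($fs, $fileName, '/srv/myapp/uploads/user_script.py');
    }
}
```

Backing the nodes with real files keeps the example within what this page documents; a VirtualFile could be used instead when the script only exists in memory, but its constructor arguments are not described here.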

public function getConfigurationInstance() -> PythonSandbox\Configuration

This function returns a Configuration object (or a subclass thereof) used to obtain sandbox configuration variables (detailed further below). The default implementation returns PythonSandbox\Configuration::singleton(); however, the application may wish to return a subclass in order to integrate the Configuration object with its own configuration.

public function getFilesystemInstance() -> PythonSandbox\VirtualFS

This function returns a VirtualFS object or a subclass thereof, and is called exactly once during the Sandbox initialization process. The default implementation is likely sufficient, unless the application wishes to greatly override how VirtualFS operates. Even with the default implementation, the application has multiple opportunities to further customize the virtualized filesystem.

public function initializeFilesystem(PythonSandbox\VirtualFS $fs) -> void

This function is called by the VirtualFS constructor to allow the application to extend the virtualized filesystem with its own custom files and directories. To extend the filesystem, a reference to the root node must be obtained like so: $root = $fs->getRoot();. $root should be treated as a nested array-like object in order to extend or overwrite files and directories. For example, $root['usr']['bin'][] = new PythonSandbox\RealFile($fs, 'myexe', '/real/path/to/myexe'); would add a /usr/bin/myexe file that, when read, retrieves the contents of /real/path/to/myexe. The default implementation does nothing.
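A possible initializeFilesystem override, built directly on the array-style access described above, is sketched here. The class name, file names, and host paths are all hypothetical, and the class is abstract because the other Application methods are omitted.

```php
<?php
// Hypothetical partial application that exposes a few host files
// inside the sandbox's virtual /usr/bin directory.
abstract class ToolsApplication extends PythonSandbox\Application
{
    public function initializeFilesystem(PythonSandbox\VirtualFS $fs): void
    {
        $root = $fs->getRoot();

        // Virtual file name => real path on the host (placeholders).
        $tools = [
            'myexe'    => '/real/path/to/myexe',
            'myhelper' => '/real/path/to/myhelper',
        ];

        foreach ($tools as $name => $realPath) {
            // Appending with [] adds a new node to the /usr/bin directory.
            $root['usr']['bin'][] = new PythonSandbox\RealFile($fs, $name, $realPath);
        }
    }
}
```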

public function getLibraryPaths() -> array

This function returns an array of virtualized pathnames that should be added to the Python library search path. This can be useful to inject application libraries into Python and make them easy to access via the import statement. The default implementation returns an empty array.
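As a small sketch of getLibraryPaths, the override below adds one extra directory to the search path. The class name and the directory /usr/lib/myapp are hypothetical; the directory is a path inside the virtualized filesystem and is assumed to have been populated elsewhere, for example in initializeFilesystem.

```php
<?php
// Hypothetical partial application; other Application methods are omitted.
abstract class LibraryApplication extends PythonSandbox\Application
{
    public function getLibraryPaths(): array
    {
        // Virtual (not host) paths that Python should search for imports.
        return ['/usr/lib/myapp'];
    }
}
```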

public function initializeSandbox(PythonSandbox\Sandbox $sb) -> void

This function is called at the end of the Sandbox constructor and can be used to further modify the sandbox, for example by calling $sb->setenv() to modify what environment variables are available to the child process.
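The following sketch shows what an initializeSandbox override could look like. Note the assumption: this page names $sb->setenv() but does not give its signature, so the (name, value) argument order used here is a guess and should be checked against the Sandbox class. The class name and variable names are placeholders.

```php
<?php
// Hypothetical partial application; other Application methods are omitted.
abstract class EnvAwareApplication extends PythonSandbox\Application
{
    public function initializeSandbox(PythonSandbox\Sandbox $sb): void
    {
        // ASSUMPTION: setenv() takes a variable name followed by its value.
        $sb->setenv('LANG', 'C.UTF-8');
        $sb->setenv('MYAPP_MODE', 'sandboxed');
    }
}
```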

ApplicationHandler

The application must also either define a class named ApplicationHandler in the global namespace or change the RPCHandlers configuration setting so that a different handler class is used for NS_APP. Methods on this handler are called whenever an RPC in the application namespace arrives from the child. Each method should return the value to pass back to the sandbox on success, or throw a PythonSandbox\RPCException on failure. Depending on the code specified in the RPCException, this either causes the sandbox to exit or raises an exception in Python. See the Raising exceptions section of the API Reference page for more details. The reference PHP implementation provides constants that refer to the various Python exceptions, such as PythonSandbox\ImportError and PythonSandbox\KeyError.

The ApplicationHandler constructor is passed a single parameter, which is the PythonSandbox\Sandbox instance for the currently-running sandbox.
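A handler skeleton might look like the sketch below. The constructor follows the description above. The greet method is purely hypothetical: it assumes that an application-namespace RPC named "greet" is dispatched to a method of the same name with its arguments passed positionally, which this page does not spell out.

```php
<?php
// Declared in the global namespace so that the default RPCHandlers
// entry for NS_APP ('ApplicationHandler') resolves to this class.
class ApplicationHandler
{
    /** @var PythonSandbox\Sandbox The currently running sandbox. */
    private $sandbox;

    public function __construct(PythonSandbox\Sandbox $sandbox)
    {
        $this->sandbox = $sandbox;
    }

    // Hypothetical RPC method; the name-based dispatch is an assumption.
    public function greet($name)
    {
        // The return value is what gets passed back to the sandbox.
        return "Hello, $name!";
    }
}
```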

Running the sandbox

To run the sandbox, construct a new instance of your Application class (the code below assumes you named it Application) and pass it to Sandbox::runNewSandbox(). This call blocks until the sandbox finishes and returns the sandbox's exit code.

$exitCode = PythonSandbox\Sandbox::runNewSandbox(new Application());

RPCException

If the application handler wishes to indicate an error, it should throw an RPCException. Its constructor takes the following form:

throw new PythonSandbox\RPCException(int $code, mixed $data[, int $errno]);

In the vast majority of cases, $data is the string message. However, for certain exception codes, it may need to be an associative array with certain keys; consult the child API reference for more details. $errno is only used if throwing an OSError.
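As an illustration of the common case where $data is a plain message, the helper below throws an RPCException that should surface as a KeyError in Python. The function name and the settings array are hypothetical; PythonSandbox\KeyError is one of the exception constants mentioned on this page.

```php
<?php
// Hypothetical helper that an application handler method could call.
function lookupSetting(array $settings, string $name)
{
    if (!array_key_exists($name, $settings)) {
        // $code selects the Python exception; $data is the message string.
        throw new PythonSandbox\RPCException(
            PythonSandbox\KeyError,
            "unknown setting: $name"
        );
    }

    return $settings[$name];
}
```

The third $errno argument is left out here because it only applies when throwing an OSError.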

SandboxException

If the application handler wishes to exit the sandbox immediately without raising an error inside it, it can throw a SandboxException. These exceptions are not propagated to the application itself; instead, they signal that the sandbox should be terminated immediately (performing an unclean shutdown if need be).

Configuration

The Configuration class exposes a simple API to retrieve and set configuration variables:

public function exists(string $key) -> bool
public function get(string $key, mixed $default = null) -> mixed
public function set(string $key, mixed $value) -> mixed

The exists function tests whether a configuration key exists. The get function returns the value of that key, or $default if the key does not exist. The set function sets the key to the given value and returns the previous value.
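The three calls could be combined roughly as follows. The sketch uses the default Configuration singleton and the MaxReadLength setting listed further down this page; the value 4096 is arbitrary.

```php
<?php
$config = PythonSandbox\Configuration::singleton();

// get() falls back to the supplied default when the key is missing.
$maxRead = $config->get('MaxReadLength', 8192);

// set() stores the new value and hands back the one it replaced.
$previous = $config->set('MaxReadLength', 4096);

// exists() only reports whether the key is present.
if ($config->exists('MaxReadLength')) {
    // Restore the earlier value once the temporary override is done.
    $config->set('MaxReadLength', $previous);
}
```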

Configuration settings

The following configuration settings are recognized by the sandbox:

  • MaxFDs (int): The maximum number of open file descriptors the virtualized filesystem will open. If it is requested to open more than this many files at once, it returns EMFILE. The default value is 64.
  • MaxReadLength (int): The maximum number of bytes that can be read in a single read() call. If a read is requested larger than this, only this many bytes will be returned (the call will still be successful). The default value is 8192.
  • MemoryLimit (int): The memory limit for the sandbox in bytes. If 0, the sandbox's hardcoded default of 200 MiB is used. This limit applies to the address space of the sandbox and not the resident set (i.e. total memory available rather than memory currently in use). The default value is 0.
  • CPULimit (int): The CPU limit for the sandbox in seconds. If 0, the sandbox's hardcoded default of 5 seconds is used. The default value is 0.
  • RPCHandlers (array): An associative array mapping RPC namespaces to the class name equipped to handle calls from that namespace. The default value is [NS_SYS => 'PythonSandbox\SyscallHandler', NS_SB => 'PythonSandbox\SandboxHandler', NS_APP => 'ApplicationHandler'].
  • AllowedPythonLibs (array): An array used as a whitelist of the libraries Python is allowed to load. The default allows access to all .py and .so files.
  • AllowedSystemLibs (array): An array used as a whitelist of the system libraries that may be loaded (the libraries present in /lib, /lib64, /usr/lib, and /usr/lib64). The default allows access to all .so files and their versioned variants (e.g. .so.1, .so.2, and so on). The default does not allow access to files with minor versions, such as .so.1.0; if these are required, the application should modify this configuration variable.
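For instance, tighter resource limits could be applied as in the sketch below. It assumes that the application keeps the default getConfigurationInstance() (which returns the singleton) and that the settings are read when the sandbox is created, so the calls are made before Sandbox::runNewSandbox(). The specific values are arbitrary.

```php
<?php
$config = PythonSandbox\Configuration::singleton();

// 64 MiB of address space instead of the sandbox's built-in 200 MiB.
$config->set('MemoryLimit', 64 * 1024 * 1024);

// 2 seconds of CPU time instead of the built-in 5 seconds.
$config->set('CPULimit', 2);

// Allow at most 16 simultaneously open virtual file descriptors.
$config->set('MaxFDs', 16);
```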