CellProfiler as a Python package - CellProfiler/CellProfiler GitHub Wiki

Currently, running CellProfiler as a Python package is not recommended. Reasons for this include:

  • If you intend to run an existing pipeline, you should just use the existing headless run tools, which can be found here: https://github.com/CellProfiler/CellProfiler/wiki/Getting-started-using-CellProfiler-from-the-command-line
  • The CellProfiler GUI lets you change many interacting settings and immediately visualize the result. Editing a pipeline within a Python script removes this visual feedback and can lead to low-quality image analysis.
  • If you are interested in a particular module, it is usually better to look at the underlying code that the module uses, which typically calls functions from scikit-image or centrosome. Extracting the functional code you are interested in will be much simpler than trying to run a pipeline within a Python environment.
    • CellProfiler is typically just a wrapper around these functions, offering some additional logic.

Update 14th September 2022

That said, we are currently working on CellProfiler Library, which lets CellProfiler modules and image and object processing functions be used directly, without requiring the CellProfiler GUI, a Java VM, or a loaded pipeline. It will handle much of the image analysis logic in a way that is familiar to CellProfiler users. The modules added so far can be found in CellProfiler/cellprofiler/library/, and we are gradually adding more.

Current method for running CellProfiler as a Python package

Installation

If you would still like to try using CellProfiler from a Python environment (e.g. a Python interpreter, Jupyter Notebook, or a Python package), please follow the instructions below.

Note that several of the commands below require code from the CellProfiler/core repository.

Prerequisites

In order to install CellProfiler, the following prerequisites have to be fulfilled.

Installing

Once all prerequisites are installed, the CellProfiler Python integration can be installed with

pip install CellProfiler

Quick start

CellProfiler can run existing pipelines or single modules. The following sections provide a small example of each.

CellProfiler pipelines

The CellProfiler Python package can import complete pipelines that were built using the CellProfiler GUI. This example shows how the Fruit Fly cells pipeline can be imported into the Python package, how settings can be changed, and how output can be accessed.

Setup environment

import cellprofiler_core.pipeline
import cellprofiler_core.preferences
import cellprofiler_core.utilities.java
import pathlib

cellprofiler_core.preferences.set_headless()

Start the Java VM

Since some CellProfiler modules rely on Java integrations, the Java VM needs to be started so that the required sources can be found. The VM can be started with

cellprofiler_core.utilities.java.start_java()

Open a pipeline

After downloading the Fruit Fly cells pipeline, it can be loaded with

pipeline = cellprofiler_core.pipeline.Pipeline()
pipeline.load("ExampleFly.cppipe")

Set the default output directory

The default output directory of the CellProfiler Python package is C:\Users\USERNAME.

In the Python integration, the default output can be configured in the preferences. Here, the directory is set to the folder Output in the current working directory (ensure that the folder exists).

current_dir = pathlib.Path().absolute()
# Join with pathlib so the path separator is correct on every operating system
cellprofiler_core.preferences.set_default_output_directory(str(current_dir / "Output"))
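
Because the Output folder has to exist before the pipeline writes to it, it can be convenient to create it from the script itself. The following is a minimal sketch that uses only the standard library; the helper name ensure_output_dir is hypothetical and not part of CellProfiler.

```python
import pathlib


def ensure_output_dir(base_dir, name="Output"):
    """Create base_dir/name if it is missing and return it as an absolute path string."""
    output_dir = pathlib.Path(base_dir).absolute() / name
    # parents=True creates missing parent folders; exist_ok=True makes reruns harmless.
    output_dir.mkdir(parents=True, exist_ok=True)
    return str(output_dir)
```

The returned string could then be passed to cellprofiler_core.preferences.set_default_output_directory, for example with ensure_output_dir(pathlib.Path()).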

Loading the images

Before running the pipeline, the images need to be loaded. Here, we load all TIF images which are in the Images folder of the current working directory.

file_list = list(pathlib.Path('.').absolute().glob('Images/*.TIF'))
files = [file.as_uri() for file in file_list]
pipeline.read_file_list(files)
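
The glob pattern above matches only files ending in exactly .TIF, which can miss files named .tif or .tiff on case-sensitive file systems. As a hedged alternative, the sketch below collects matching files regardless of case and converts them to file URIs. The helper name collect_image_uris and its default suffixes are assumptions made for illustration.

```python
import pathlib


def collect_image_uris(folder, suffixes=(".tif", ".tiff")):
    """Return sorted file:// URIs for files in folder whose suffix matches, ignoring case."""
    folder = pathlib.Path(folder).absolute()
    wanted = {suffix.lower() for suffix in suffixes}
    matches = [
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    ]
    # as_uri() requires absolute paths, which is why folder was made absolute above.
    return [path.as_uri() for path in sorted(matches)]
```

The resulting list has the same shape as files above, so it could be handed to pipeline.read_file_list in the same way, for example with collect_image_uris("Images").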

Running the pipeline

We can run the pipeline with

output_measurements = pipeline.run()

Reading the output

After running the pipeline, all output attributes can be retrieved with output_measurements.get_measurement_columns().

The values of a specific measurement, for example the X-Center of the Cells object, can then be accessed with output_measurements.get_measurement('Cells', 'AreaShape_Center_X').
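
To get an overview of what was measured, the columns can be grouped by object name. The sketch below assumes that each column is a tuple or list whose first two entries are the object name and the feature name; that shape is an assumption and should be checked against the actual return value of get_measurement_columns(). The helper name group_columns_by_object is hypothetical.

```python
from collections import defaultdict


def group_columns_by_object(columns):
    """Map each object name to the sorted list of its feature names."""
    grouped = defaultdict(list)
    for column in columns:
        # Assumption: the first two entries are (object_name, feature_name);
        # any further entries (such as a data type) are ignored.
        object_name, feature_name = column[0], column[1]
        grouped[object_name].append(feature_name)
    return {name: sorted(features) for name, features in grouped.items()}
```

Under that assumption, group_columns_by_object(output_measurements.get_measurement_columns()) would give a dictionary that can be looped over to call get_measurement for each object and feature.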

Stopping the Java VM

Finally, to ensure that the program terminates, we need to stop the Java VM.

cellprofiler_core.utilities.java.stop_java()
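
If the pipeline raises an exception, the stop call above is never reached and the process may hang. One way to guard against this is a small context manager that always runs the stop function. This is a sketch only: managed_vm is a hypothetical helper, and it takes the start and stop functions as arguments so that it does not depend on CellProfiler itself.

```python
import contextlib


@contextlib.contextmanager
def managed_vm(start, stop):
    """Call start() on entry and always call stop() on exit, even after an error."""
    start()
    try:
        yield
    finally:
        # Runs whether the body finished normally or raised an exception.
        stop()
```

In this workflow, the body of a with managed_vm(cellprofiler_core.utilities.java.start_java, cellprofiler_core.utilities.java.stop_java): block would contain the call to pipeline.run().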

Changing modules in a pipeline

You can run pipeline.modules() to fetch a list of all modules in the current pipeline. For this example workflow you can use pipeline.modules()[7] to access the IdentifyPrimaryObjects Module.

You can print the settings of a given module with

for setting in pipeline.modules()[7].settings():
    print(setting.to_dict())

You can change the value of the Threshold smoothing scale with

pipeline.modules()[7].setting(22).set_value(1.5)
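
Hard-coded positions such as 7 and 22 break as soon as a module or setting is added or reordered. A lookup by name is more robust. The sketch below assumes that each module exposes a module_name attribute and each setting a text attribute holding its label; both are assumptions about the CellProfiler objects, and the helper names find_module and find_setting are hypothetical.

```python
def find_module(modules, module_name):
    """Return the first module whose module_name attribute equals module_name."""
    for module in modules:
        # Assumption: modules expose their name as the attribute module_name.
        if getattr(module, "module_name", None) == module_name:
            return module
    raise LookupError(f"No module named {module_name!r}")


def find_setting(module, text):
    """Return the first setting of module whose text attribute equals text."""
    for setting in module.settings():
        # Assumption: settings expose their label as the attribute text.
        if getattr(setting, "text", None) == text:
            return setting
    raise LookupError(f"No setting labelled {text!r}")
```

With these helpers, the module could be fetched with find_module(pipeline.modules(), "IdentifyPrimaryObjects"). The exact label to pass to find_setting should be taken from the output of the print loop above rather than guessed.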

Working with single modules

It is also possible to generate pipelines from scratch and to configure and run individual modules.

Create an ImageSetList instance and get an ImageSet instance

import cellprofiler_core.image

image_set_list = cellprofiler_core.image.ImageSetList()

image_set = image_set_list.get_image_set(0)

Create Image instances, name and add them to the ImageSet instance

import skimage.data
import skimage.io
x = skimage.data.camera()

image_x = cellprofiler_core.image.Image(x)

image_set.add("x", image_x)

skimage.io.imshow(image_set.get_image("x").pixel_data)

Create an ObjectSet instance, name and add an Objects instance

import cellprofiler_core.object

object_set = cellprofiler_core.object.ObjectSet()

objects  = cellprofiler_core.object.Objects()

object_set.add_objects(objects, "example")

Create a Measurements instance

import cellprofiler_core.measurement

measurements = cellprofiler_core.measurement.Measurements()

Create a Module instance

import cellprofiler.modules.gaussianfilter

module = cellprofiler.modules.gaussianfilter.GaussianFilter()

Configure a Module

module.x_name.value = "x"

module.y_name.value = "y"

module.sigma.value = 4

Create a Workspace instance

import cellprofiler_core.workspace

workspace = cellprofiler_core.workspace.Workspace(
    pipeline,
    module,
    image_set,
    object_set,
    measurements,
    image_set_list,
)

Run a single Module

module.run(workspace)

# Fetch the filtered image that the module stored under the name "y"
response = image_set.get_image("y")

skimage.io.imshow(response.pixel_data)

Add Module to pipeline

When creating a pipeline from scratch, modules can be added and then executed as shown in the section "CellProfiler pipelines":

pipeline.add_module(module)