Python integration - isir/greta GitHub Wiki

How to launch

To prevent Python environment contamination, most Greta modules use an independent conda environment for Python integration. To call programs in the conda environment, we recommend using a Windows batch file from the Java platform.

When you call command-line programs (e.g., .py files, .bat files), you can use ProcessBuilder in Java 8 as follows:

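        try {
            // proc is assumed to be a Process variable declared by the caller
            proc = new ProcessBuilder("python", "Common\\Data\\TurnManagement\\check_env.py")
                    .redirectErrorStream(true)                       // merge stderr into stdout
                    .redirectOutput(ProcessBuilder.Redirect.INHERIT) // print to the Java console
                    .start();
            proc.waitFor(); // block until the Python program exits
        } catch (Exception e) {
            e.printStackTrace();
        }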

In this example, the Python program check_env.py is called to check whether the conda environment exists.

The standard output and standard error of the program are merged and redirected to the standard output of the main Java program (usually displayed in the NetBeans output panel). With this method, you cannot capture the program's output text for use later in your program.

Calling proc.waitFor() blocks the calling Java thread until the Python program ends.
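proc.waitFor() also returns the exit code of the child process, which the snippet above discards. The following is a minimal sketch of a helper that keeps it, so the caller can tell whether the script succeeded. The class and method names (CommandRunner, runAndWait) are hypothetical and not part of Greta.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Greta: runs a command and reports its exit code.
final class CommandRunner {

    /** Runs the command, forwards its output to the Java console, and blocks until it exits. */
    static int runAndWait(List<String> command) throws IOException, InterruptedException {
        Process proc = new ProcessBuilder(command)
                .redirectErrorStream(true)                       // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.INHERIT) // show output in the Java console
                .start();
        return proc.waitFor(); // exit code; 0 conventionally means success
    }

    public static void main(String[] args) {
        try {
            // Same command as in the example above
            int exitCode = runAndWait(
                    Arrays.asList("python", "Common\\Data\\TurnManagement\\check_env.py"));
            if (exitCode != 0) {
                System.err.println("check_env.py failed with exit code " + exitCode);
            }
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
    }
}
```

Letting start() throw instead of swallowing the exception makes a missing python executable visible to the caller right away.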

How to install the conda environment

Since Greta uses conda-based Python environment management, you also need to provide an installation script in .bat format.


    System.out.println(".init_TurnManagement_server(): TurnManagement, installing python environment...");
    try{
        server_process = new ProcessBuilder("Common\\Data\\TurnManagement\\init_env.bat").redirectErrorStream(true).redirectOutput(ProcessBuilder.Redirect.INHERIT).start();
        server_process.waitFor();
    } catch (Exception e){
        e.printStackTrace();
    }

If the environment does not exist, the conda installation batch file is called through ProcessBuilder.
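One way to decide whether the installation is needed is to parse the text printed by `conda env list`. The sketch below is only an illustration of that idea and is not necessarily what check_env.py does. The class name CondaEnvCheck and the sample output are assumptions made for this example.

```java
// Hypothetical helper, not part of Greta: checks for an environment name
// in the text printed by "conda env list".
final class CondaEnvCheck {

    /** Returns true if envName appears as the first column of a non-comment line. */
    static boolean envExists(String condaEnvListOutput, String envName) {
        for (String line : condaEnvListOutput.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and the header comments
            }
            String firstColumn = trimmed.split("\\s+")[0];
            if (firstColumn.equals(envName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Sample output written by hand for this example (paths are made up)
        String sample = "# conda environments:\n"
                + "#\n"
                + "base                  *  C:\\miniconda3\n"
                + "py311_vad                C:\\miniconda3\\envs\\py311_vad\n";
        System.out.println(envExists(sample, "py311_vad")); // prints true
        System.out.println(envExists(sample, "py311_vap")); // prints false
    }
}
```

The text passed to envExists can be obtained with any of the output-capturing techniques described on this page.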

The content of init_env.bat is as follows.

echo ###########################
echo Start installing py311_vad
echo ###########################

call conda create -n py311_vad python==3.11 -y
call conda activate py311_vad
cd /d %~dp0
call pip install -r requirements_vad.txt

echo ###########################
echo End installing py311_vad
echo ###########################

echo ###########################
echo Start installing py311_vap
echo ###########################

call conda create -n py311_vap python==3.11 -y
call conda activate py311_vap
call cd /d %~dp0

call conda install cudatoolkit=11.8 dlib ffmpeg numpy=1.26 matplotlib opencv=4.11 -y

if errorlevel 1 goto error

call pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
call pip install -r env/requirements_vap.txt

goto end

:error
echo installation without cuda
call pip install torch torchvision torchaudio
call pip install -r env/requirements_vap.txt

goto end

:end
echo ###########################
echo End installing py311_vap
echo ###########################

This example installs the environments using conda and pip commands. It also includes optional error handling for computers without GPUs: if the CUDA-enabled conda install fails, PyTorch is installed without CUDA.

How to get standard output

1. At the end of the program

        try{
            proc = new ProcessBuilder("python", "Common\\Data\\TurnManagement\\check_env.py").redirectErrorStream(true).start();
        } catch (Exception e){
           e.printStackTrace();
        }
        inputStream = proc.getInputStream();
        result = new BufferedReader(
                new InputStreamReader(inputStream, StandardCharsets.UTF_8))
                .lines()
                .collect(Collectors.joining("\n")
                );
        System.out.println(".init_TurnManagement_server(): TurnManagement, python env exist: " + result);

In this example, the Python program check_env.py is called to check whether the conda environment exists.

The standard output of the program is captured as an InputStream object, which is read by a BufferedReader. Because the stream is read until it is closed, you get the output only when the program has finished.

For more information, please refer to auxiliary/TurnManagement/src/greta/auxiliary/TurnManagement/TurnManagement.java.
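In the snippet above, proc is used after the try block, so a failed start() would lead to a NullPointerException on proc.getInputStream(). A minimal sketch that keeps everything in one method and closes the reader is shown below. The names OutputCapture and runAndCapture are hypothetical, and UTF-8 is assumed as the output encoding of the Python program.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Greta: returns the whole output of a command.
final class OutputCapture {

    /** Runs the command and returns its merged stdout/stderr once it has finished. */
    static String runAndCapture(List<String> command) throws IOException, InterruptedException {
        Process proc = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout
                .start();
        String output;
        // try-with-resources closes the reader (and the underlying stream) for us
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(proc.getInputStream(), StandardCharsets.UTF_8))) {
            output = reader.lines().collect(Collectors.joining("\n"));
        }
        proc.waitFor(); // the stream is already closed here, so this returns quickly
        return output;
    }

    public static void main(String[] args) {
        try {
            String result = runAndCapture(
                    Arrays.asList("python", "Common\\Data\\TurnManagement\\check_env.py"));
            System.out.println("python env exist: " + result);
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
    }
}
```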

2. On the fly

You can also capture the output on the fly.

    try {
        String[] cmd = {
                "cmd.exe","/C","conda","activate","greta_deepgram","&&","python","-u",
            System.getProperty("user.dir")+"\\Common\\Data\\DeepASR\\DeepGram\\DeepGram.py",
        };
        Runtime rt = Runtime.getRuntime();
        System.out.println("command:"+cmd[0]+" "+cmd[1]+" "+cmd[2]);

        Process proc = rt.exec(cmd);

        BufferedReader stdInput = new BufferedReader(new InputStreamReader(proc.getInputStream(), "ISO-8859-1"));
        BufferedReader stdError = new BufferedReader(new InputStreamReader(proc.getErrorStream()));

        // Read the output from the command
        System.out.println("Here is the standard output of the command:\n");
        String s = null;
        String pythonOut = null;

        while ((pythonOut = stdInput.readLine()) != null) {

            System.out.println(pythonOut);

        }

        // Read any errors from the attempted command
        System.out.println("Here is the standard error of the command (if any):\n");
        while ((s = stdError.readLine()) != null) {
            System.out.println(s);
        }   
    } catch (IOException ex) {
        Logger.getLogger(DeepASRFrame.class.getName()).log(Level.SEVERE, null, ex);
    }

In this example, the standard output of DeepGram.py is captured in pythonOut on the fly (line by line) within the while loop.

Unlike the previous examples, this one calls Runtime.exec() to create the Python process, as another way to launch a program (it should be possible to replace it with ProcessBuilder).

For more information, please refer to auxiliary/DeepASR/src/greta/auxiliary/deepasr/DeepGramFrame.java.
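As a rough sketch of the ProcessBuilder variant mentioned above, the following helper passes each output line to a callback as soon as it arrives. The names OutputStreamer and streamLines are hypothetical, and UTF-8 is an assumption (the example above decodes standard output as ISO-8859-1). Standard error is merged into standard output here, so a single loop reads both.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Greta: streams the output of a command line by line.
final class OutputStreamer {

    /** Calls onLine for every output line while the command runs; returns the exit code. */
    static int streamLines(List<String> command, Consumer<String> onLine)
            throws IOException, InterruptedException {
        Process proc = new ProcessBuilder(command)
                .redirectErrorStream(true) // one stream to read instead of two
                .start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(proc.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                onLine.accept(line); // handle each line as soon as it is available
            }
        }
        return proc.waitFor();
    }

    public static void main(String[] args) {
        // Same command as in the Runtime.exec() example; "-u" disables Python's output buffering
        List<String> cmd = Arrays.asList(
                "cmd.exe", "/C", "conda", "activate", "greta_deepgram", "&&", "python", "-u",
                System.getProperty("user.dir") + "\\Common\\Data\\DeepASR\\DeepGram\\DeepGram.py");
        try {
            int exitCode = streamLines(cmd, System.out::println);
            System.out.println("exit code: " + exitCode);
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
    }
}
```

Since streamLines blocks until the Python program ends, a long-running script would normally be started from its own Java thread.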

How to ensure termination of a sub-process/sub-thread written in Python

In some cases (e.g., running the Python process in an infinite loop), you need to terminate the process explicitly. If you encounter zombie processes because of this, consider using the following methods.

1. Shutdown hook

    Thread server_shutdownHook = new Thread() {
        public void run() {
            Process process;
            try {
                process = new ProcessBuilder("Common\\Data\\TurnManagement\\kill_server.bat", "5961").redirectErrorStream(true).redirectOutput(ProcessBuilder.Redirect.INHERIT).start();
                process.waitFor();
            } catch (IOException ex) {
                Logger.getLogger(TurnManagement.class.getName()).log(Level.SEVERE, null, ex);
            } catch (InterruptedException ex) {
                Logger.getLogger(TurnManagement.class.getName()).log(Level.SEVERE, null, ex);
            }
        }

    };
    Runtime.getRuntime().addShutdownHook(server_shutdownHook);

This example tries to send a kill signal to the Python server listening on port 5961 by running kill_server.bat. Runtime.getRuntime().addShutdownHook(server_shutdownHook) registers the shutdown hook, which runs when the main Java program exits.
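When the Java code still holds the Process object of the child, a shutdown hook can also stop it directly. The sketch below is an alternative illustration, not the approach used in TurnManagement.java. The names ChildProcessGuard, stop, and registerShutdownHook are hypothetical. Note that destroying a cmd.exe or .bat process does not necessarily end the Python process it launched, which is one reason to kill the server through a dedicated script as above.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Greta: stops a child process when the JVM exits.
final class ChildProcessGuard {

    /**
     * Asks the process to stop, then kills it forcibly if it is still alive
     * after the grace period. Returns true if the process is no longer running.
     */
    static boolean stop(Process proc, long grace, TimeUnit unit) throws InterruptedException {
        if (proc.isAlive()) {
            proc.destroy();                      // polite request first
            if (!proc.waitFor(grace, unit)) {
                proc.destroyForcibly();          // escalate if it did not exit in time
                proc.waitFor(grace, unit);
            }
        }
        return !proc.isAlive();
    }

    /** Registers a shutdown hook that stops the given process; returns the hook thread. */
    static Thread registerShutdownHook(final Process proc) {
        Thread hook = new Thread(() -> {
            try {
                stop(proc, 2, TimeUnit.SECONDS);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
            }
        });
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

A typical use would be to call ChildProcessGuard.registerShutdownHook(proc) right after ProcessBuilder.start() returns.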

2. Timeout sub-process in Python

Another possibility is to use a timeout sub-process.


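import os
import socket
import time
from multiprocessing import Process
# wait() used below is assumed to be a helper defined elsewhere (e.g., in func_util.py)

def main():

    # mainloop_timeout_looper is assumed to be a Timeout_looper created beforehand
    # Main loop
    while True:

        if not mainloop_timeout_looper.is_started():
            mainloop_timeout_looper.start_count()
        if not mainloop_timeout_looper.is_alive():
            # os._exit(0)
            break
        mainloop_timeout_looper.reset()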

def timer_loop(limit, port):
    
    hostname = socket.gethostname()
    client = socket.socket()
    client.connect((hostname, port))
    
    _ = client.recv(1024)
    client.settimeout(0.01)
    

    s_time = time.perf_counter()
    
    # print("Waiting connection to socket inspecter...", end = '', flush=True)
    inspect_socket = socket.socket()
    inspect_socket.bind((socket.gethostname(), 6543))
    inspect_socket.listen(1)
    inspect_socket.settimeout(0.5)
    try:
        inspecter, _ = inspect_socket.accept()
        inspecter.settimeout(0.01)
    except:
        pass
    # print('done')
        
    while True:
        
        e_time = time.perf_counter()
        elapsed = e_time - s_time
        # print(elapsed)
        
        try:
            inspecter.send("elapsed {:.2f}".format(elapsed).encode())
        except:
            wait(0.01)
        
        try:
            s_time = float(client.recv(1024).decode())
            inspecter.send("s_time {:.2f}".format(s_time).encode())
        except:
            wait(0.01)
        
        if elapsed > limit:
            print('[TurnManagement] mainloop timeout', elapsed)            
            break
        
        # time.sleep(0.01)

    time.sleep(2)
    os._exit(0)

class Timeout_looper:
    
    def __init__(self, limit = 5, port = 9999):
        
        self.looper = Process(target=timer_loop, args=(limit, port))
        self.looper.start()
        
        self.server = socket.socket()
        self.hostname = socket.gethostname()
        self.server.bind((self.hostname, port))
        self.server.listen(1)
        
        self.conn, _ = self.server.accept()
        
        self.started = False
        
    def reset(self):
        
        assert self.started, "start_count() should be called before reset()"
        
        s_time = time.perf_counter()
        s_time = str(s_time).encode()
        self.conn.send(s_time)
        
    def start_count(self):
        
        self.started = True
        # self.conn.send('start'.encode())
    
    def is_started(self):
        
        return self.started
    
    def is_alive(self):
        
        return self.looper.is_alive()
    
    def kill(self):
        
        # os.kill(int(self.looper.pid), signal.SIGILL)
        self.looper.kill()

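In this example, the main loop calls reset() on every iteration. If one iteration of the main loop exceeds the timeout limit, the timer sub-process terminates itself with os._exit(0), and the main loop stops as soon as is_alive() reports that the timer sub-process has ended.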

Please note that you should avoid creating a thread for this timeout checker, since it can be less accurate due to Python's GIL.
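The timing rule inside timer_loop is small: the elapsed time is measured from the last reset, and the timer fires once it exceeds the limit. The sketch below isolates only that rule. It is written in Java for consistency with the other examples on this page, it is not a port of func_util.py, and the class name LoopWatchdog is hypothetical. Time stamps are passed in explicitly so that the rule is easy to check.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the timeout rule only; no sockets or processes involved.
final class LoopWatchdog {

    private final long limitNanos;
    private long lastResetNanos;

    LoopWatchdog(long limitNanos, long startNanos) {
        this.limitNanos = limitNanos;
        this.lastResetNanos = startNanos;
    }

    /** Called by the main loop on every iteration, like Timeout_looper.reset(). */
    void reset(long nowNanos) {
        lastResetNanos = nowNanos;
    }

    /** Mirrors "elapsed > limit" in timer_loop. */
    boolean isExpired(long nowNanos) {
        return nowNanos - lastResetNanos > limitNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        long limit = TimeUnit.MILLISECONDS.toNanos(200);
        LoopWatchdog watchdog = new LoopWatchdog(limit, System.nanoTime());

        Thread.sleep(50);                      // a fast iteration
        System.out.println(watchdog.isExpired(System.nanoTime())); // prints false
        watchdog.reset(System.nanoTime());

        Thread.sleep(300);                     // an iteration that takes too long
        System.out.println(watchdog.isExpired(System.nanoTime())); // prints true
    }
}
```

This class is not thread-safe; it only shows the arithmetic that the separate timer process performs.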

For more information, please refer to the following programs:

auxiliary/TurnManagement/src/greta/auxiliary/TurnManagement/TurnManagement.java
bin/Common/Data/TurnManagement/func_util.py and bin/Common/Data/TurnManagement/turnManager_vap_audio_faceEmbed_refactored.py (might be in dev branch)