Platform Overview - chitsaw/psi GitHub Wiki

Platform for Situated Intelligence: An Overview

Platform for Situated Intelligence (which we abbreviate as \psi, pronounced like the Greek letter) is an open, extensible framework that accelerates research and development on multimodal, integrative-AI systems. These are systems that operate over different types of streaming data, e.g., video, audio, depth, IMU data, etc., and leverage multiple component technologies to process this data at low latency. Examples range from systems that sense and act in the physical world, such as interactive robots, drones, virtual interactive agents, personal assistants, and interactive instrumented meeting rooms, to software systems that mesh human and machine intelligence, all the way to applications based on small devices that process streaming sensor data.

In recent years, we have seen significant progress with machine learning techniques on various perceptual and control problems. At the same time, building end-to-end, multimodal, integrative-AI systems that leverage multiple technologies and act autonomously in the open world remains a challenging, error-prone, and time-consuming engineering task. Many of the challenges stem from the sheer complexity of these systems and are amplified by the lack of appropriate infrastructure and development tools.

Platform for Situated Intelligence aims to address these issues and provide a robust basis for developing, fielding, and studying multimodal, integrative-AI systems. By releasing the code under an open-source MIT License, we hope to enable the community to contribute to and grow the \psi ecosystem, further lowering the engineering barriers and fostering more innovation in this space.

Concretely, the framework provides infrastructure for working with multimodal, temporally streaming data; a set of tools to support development, debugging, and maintenance; and an ecosystem of components that simplifies application writing and facilitates technology reuse. \psi applications are authored by connecting together \psi components.

  • Infrastructure. The \psi runtime and core libraries provide a parallel, coordinated computation model centered on online processing of streaming data. Time is a first-order citizen in the framework. The runtime provides abstractions for computing over streaming data and for reasoning about time and synchronization, and is optimized from the bottom up for low latency. In addition, the runtime provides fast persistence of generic streams, enabling data-driven development scenarios.

  • Tools. \psi provides a powerful set of tools that enable testing, data visualization, annotation, replay, analytics, and machine learning development for integrative-AI systems. The visualization subsystem allows for live and offline visualization of streaming data. A set of data processing APIs allows for re-running algorithms over collected data, data analytics, and feature extraction for machine learning.

  • Components. \psi provides a wide array of AI technologies encapsulated into \psi components. \psi applications can be easily developed by wiring together \psi components. The initial set of components includes sensor components for cameras and microphones, audio and image processing components, as well as components that provide access to Azure Cognitive Services. Through community contributions, we hope to create a broader ecosystem of components that will lower the barrier to entry for developing integrative-AI systems.

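To make the streaming computation model from the Infrastructure bullet concrete, the sketch below follows the patterns in the public \psi tutorials: it generates a stream, derives a second stream from it, fuses the two with a time-aligned join, and persists the raw stream for later replay. This is a minimal illustration, not canonical usage; it assumes the Microsoft.Psi NuGet package is referenced, and the store name "Demo" and stream name "Sequence" are illustrative choices.

```csharp
using System;
using Microsoft.Psi;

class Program
{
    static void Main()
    {
        // The pipeline owns the components and schedules the streaming computation.
        using (var p = Pipeline.Create())
        {
            // Generate a finite stream of 100 doubles: 0.0, 0.1, 0.2, ...
            var sequence = Generators.Sequence(p, 0d, x => x + 0.1, 100);

            // Derive a second stream by applying a function to each message.
            var sin = sequence.Select(Math.Sin);

            // Fuse the two streams; because sin is derived from sequence,
            // messages share originating times and pair up exactly.
            var joined = sequence.Join(sin);
            joined.Do(m => Console.WriteLine($"t = {m.Item1:F1}, sin(t) = {m.Item2:F3}"));

            // Persist the raw stream to a \psi store on disk for later replay.
            // (In recent releases the entry point is PsiStore; names here are illustrative.)
            var store = PsiStore.Create(p, "Demo", System.IO.Path.GetTempPath());
            sequence.Write("Sequence", store);

            // Run the pipeline to completion.
            p.Run();
        }
    }
}
```

Note that the join pairs messages by originating time, the timestamp at which data originated in the world, rather than by wall-clock arrival time; this is one way the framework makes time a first-order citizen.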
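The data-driven development loop mentioned under Tools, i.e., re-running algorithms over previously collected data, follows the same pipeline pattern: open a persisted store and replay its streams. The following hedged sketch assumes a previously persisted store named "Demo" containing a stream of doubles called "Sequence" (both names illustrative), with the Microsoft.Psi NuGet package referenced.

```csharp
using System;
using Microsoft.Psi;

class Program
{
    static void Main()
    {
        using (var p = Pipeline.Create())
        {
            // Open a previously persisted store from disk.
            var store = PsiStore.Open(p, "Demo", System.IO.Path.GetTempPath());

            // Reconstitute a typed stream and process it like any live stream,
            // e.g., to re-run an algorithm or extract features for machine learning.
            var sequence = store.OpenStream<double>("Sequence");
            sequence
                .Select(Math.Sin)
                .Do((value, envelope) =>
                    Console.WriteLine($"{envelope.OriginatingTime:HH:mm:ss.fff} -> {value:F3}"));

            // Replay the entire store; originating times are preserved on replay.
            p.Run(ReplayDescriptor.ReplayAll);
        }
    }
}
```

Because replayed streams carry their original timestamps, the same downstream processing code can run unchanged against live or recorded data.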
More information about upcoming features in Platform for Situated Intelligence is available in the Roadmap document.