Application of Kalman Filter in Neural Signal Decoding - 180D-FW-2023/Knowledge-Base-Wiki GitHub Wiki

Application of Kalman Filter in Neural Signal Decoding

Hongchang Kuang

Word count: 2610 (including Latex formula and HTML code)

Introduction

Have you imagined a world that we could control things without moving our fingers? Will you fancy playing video games or driving your cars just with your brain? I have to tell you that this fantasy is no longer impossile with the help of Brain Machine Interface (BMI). In recent decades, as technology advancing and social ethics system maturing, a heated debate has been triggered over the usage of BMI. Some argues that current innovations are still not applicable for widely marketing products. However, most engineers and neuroscientists are excited by technological progress that facilitates brain mapping, the most sanguine of them comparing their growing ability to tremendous advances that led to the unimaginable success of the BMI projects. Companies like NeuralLink and Paradromics, as well as government agency have invested on the projects to test the possibilities and work on the real world applications [1]. This articles briefly introduces one of the prevailing technology in BMI: the application of Kalman filter in neural signal decoding, discussing its advantages and potential improvements, and what makes it robust and powerful that could possibly allows BMI to enhance our daily life.

Backgrounds in Neuroscience

Before we get to the real technique, a simple background in neuroscience is neccessary to unravel you from the potential confusion in the world of jargons, since the mind, with our brain in general, is our main subject of research. The main study theme of neuroscience is to link molecules to mind. Human brain is a highly integrated network of approximately 100 billion individual nerve cells. Therefore, to fully understand the mechanism behind the BMI devices, a brief understanding of how neurons are organized into signaling pathways and how they communicated is necessary. We will introduce how brain functions from three different levels: brain, neuron, and neural signal (action potential). Further readings in related areas are also encouraged.

The Brain and Behavior

What does our brain look like? Surprisingly, it is complex and ordered at the same time. The central nervous system (CNS) is bilateral and symmetrical. Studies on brain with modern imaging techniques discovers that different regions on brain are specialized for different functions. As you may have gussed, for BMI to decode human cognitive information, the most important part that we would like to focus on is the cerebral cortex, where brain operations that are responsible for human cognitive abilities occur. As shown in the graph, It consists of four anatomically distinct lobes with different functionalities, including planning future actions, hearing, learning, vision, etc. With the concept of functional engineering prevailing in neuroengineering, it is natural to focus on a specific area on cerebral cortex to gather information of neuron activity. Therefore, to achieve fancy operations like controlling the cursor with our mind, for example, we will mainly focus on motor cortex in frontal lobe, which is responsible for planning future actions and the control of movement.

Neurons

What is the basic unit of our brain and what makes it complex? We have to give credit to these simple but powerful biological creatures: nerve cells (neurons). Just like VLSI transistors, even though they have relatively basic morphology and architecture, approximately 10^11 neurons in the brain can support long-ranged and intricate anatomical circuits, where the "complexity" arises from. Neuron have four regions: cell body (soma), dendrites, axon, and presynaptic terminals. The main component axon is what we should pay special attention to: it conveys signals (action potential) to other neurons in long distance (0.1mm - 3m), while ensure the signals propagate without distortion or failure and preserving its shape at very high speed (1 - 100m/s). Compared to transmission line in our real life, the efficiency and accuracy of neural signal transmission are mind-blowing. Two neurons can also communicate at the synapse chemically using neurotransmitters. Neurons, whose structure diagram is attached below, are the most powerful building blocks of the most complicared circuits in the world.

Neural Signal (Action Potential)

Is the neural signal similar to the digital signal used for communication in the real life? Actually, they are nearly identical. Action potentials are the signals by which brain receives, analyzes and conveys information. Even though they are highly stereotyped just like digital signal (0 and 1), they are able to convey all the information that the brain may need, including vision, audition, emotion, etc. However, unlike the 0s and 1s in digital signals, since they do not have distinguishable shapes, all the information is preserved in the path that the signal travels and pattern of action potentials. The maximum magnitude of action potential can be up to 40 mV, and their frequency is about 500 Hz. On the other hand, it is also their highly stereotyped shape makes decoding them feasible. Just like displacement and velocity, in neural signal decoding, we are more interested in the "velocity" of action potential: firing rate, which is the number of spikes of action potentials per unit time. With this concept, we are able to have a criterion of how active a neuron is.

Methods

With all the knowledge of neuroscience infused in your barin, I think your neurons should be ready for the real methods of decoding the neural data. Sit back, and I will walk you through this wonderful and inspiring process.

Motor Prothesis Concept

This architecture of our decoder is simple but elegant. As shown in the graph, the whole BMI system consists of multiple components, including signal detection, decode algorithm, and feedback. In this section, we will focus on continuous decoding of neural signal in hand movement (motor prosthesis). Image that you would like to grab the water bottle on the table. Your brain will go over two stages: planning and movement. In the planning stage, it need to plan the according trajectories precisely. In the following movement activity, it also needs to control the acceleration and speed of your arm. In general, the movement activity follows the plan activity, and is specially tuned for direction and speed of arm movement. As a result, we will be mainly concerned with the neural movement activity and the corresponding kinematics like hand position, velocity, acceleration, etc.

Decoding Data and Notation

We need to know what are we going to decode at first and what the experiment looks like. Let us set up the following experimental context: a monkey is asked to perform a task. It needs to move its hand to the cursor on the screen. When the cursor appears, it cannot move. As soon as it disappear, the monkey should move its hand to the destination. We call this whole process a reach.

Apparently, for the reaching task, we have two sources of data to consider:

  1. Observed kinematics (dimensionality $M$):

$$ x_k = \begin{bmatrix} p^x_k \ p^y_k \ v^x_k \ v^y_k \ 1 \end{bmatrix} $$

$p^x_k$ and $p^y_k$ represent the position of hand at time $k$ (for $x$ and $y$ coordinates). Respectively, $v^x_k$ and $v^y_k$ represent the $x$ and $y$ velocity of hand at time $k$. We ignore the acceleration and add a $1$ to the end for calculation simplicity and matching the dimension.

  1. Neural activity (dimensionality $N$)

$$ y_k = \begin{bmatrix} y^1_k \ y^2_k \ \vdots \ y^N_k \ 1 \ \end{bmatrix} $$

$y^n_k$ represents the neuron n's firing rate at time k. In our case $N = 96$, which means that we decode neural signals from 96 neurons at the same time. We add a $1$ to the end for calculation simplicity and matching the dimension.

We will further denote the decoded kinematics as $\hat{X}_k$. For convenience, for the total period of time of these activities, we can also define:

$$X = \begin{bmatrix} x_1 & x_2 & \ldots & x_K \end{bmatrix}$$

$$Y = \begin{bmatrix} y_1 & y_2 & \ldots & y_K \end{bmatrix}$$

where K is the total number of data points. If the time interval between two points is $\Delta t$, thus the total period of time of these activity will be $\tau = K\Delta t$.

General Ideas in Decoding

The general idea of the decoding operation is straightforward: decode th kinetics information from neural activities, which means to infer $X$ from $Y$.

You might think this sounds easy. One simple method is that, if we could assume that their relation is linear, we can simply fit a line to them using least square since we have both observed data for $X$ and $Y$:

$$ X = LY $$

Basically is to minimize:

$$ \frac{1}{2} \sum_{k=1}^{K} ||x_k - Ly_k||^2 $$

With our knowledge in linear algebra, we get:

$$ L = XY^T(YY^T)^{-1} $$

Now we have this $L$. Concretely, say that we have a new neural signal data $\widetilde{y}_k$, then we can decode the kinetics $\hat{x}_k$ using:

$$ \hat{x}_k = L\widetilde{y}_k $$

Problem solved! This is what we called Optimal Linear Estimator (OLE). Is this the final method that we should concern? Of course not! Unfortunately, as you may guess, its accuracy and robustness are not very ideal. Obviously, we have ignore something in this case. First, for the adjacent positions, say $x_n$ and $x_{n+1}$, there must be some relation between them; second, even though we introduce the concept of velocity, we did not apply any physics law to exploit them.

Well, the first problem seems easy to solve: we can simply take history into account by doing something like:

$$ x_k = \sum_{p=0}^{P} L_py_{k-p} $$

Rewrite it in matrix form:

$$ X_w = L_wY_w$$

We can find $L_w$ using least square as well:

$$ L_w = X_wY_w^T(Y_wY_w^T)^{-1} $$

Seems more advanced, doesn't it? This technique is called Wiener Filter. We enhance the OPE by considering data from the history. But what about the second problem? it's finally time to welcome Kalman Filter to the stage.

Decoding with Kalman Filter

The main concern for now is, we have additional information we have not used in our decoding process: the law of physics, since we understand the the positions of the arm are the integration of its velocity (we will be in real trouble if they are not). The natural way to incorporate this is to use Kalman Filter.

If we add one more equation to the original OLE, we can get:

$$ x_{k+1} = Ax_k $$

$$ y_k = Cx_k $$

The first equation is called state update process or dynamic process, where $x_k$, the kinematics of the hand, is the state. $A$ is the dynamic matrix where we can encode the law of physics. The second equation is observation process. $y_k$ is just the observation of neural signal. This inverse OLE equation helps us link the neural signal $y_k$ and kinematics $x_k$. If you recall your signals and systems class, these two equations together is called Linear Dynamical System.

In real life, we have to add some random noise to them:

$$ x_{k+1} = Ax_k + w_k $$

$$ y_k = Cx_k + q_k $$

We can simply assume that they are gaussian noise. Zero mean and covariance matrix $W$ and $Q$ will be good in our case:

$$ w_k \sim \mathcal{N}(0, W) $$

$$ q_k \sim \mathcal{N}(0, Q) $$

So how do we get the optimal estimator of $\hat{x}_k$ from those two equations? Under certain assumption (linearity and gaussian noise), there is a recursive solution to this called Kalman Filter

The spoiler is, we will reach to some kind of equation like this:

$$ \hat{x}_k = M_1 \hat{x}_{k-1} + M_2 y_k $$

Apparently, we take all factors into account: history, law of physics, and the neural signal. All of them have some information of what the newly decoded data should be. So the goal is to find two matrices $M_1$ and $M_2$. They are the function of $A$, $C$, $W$, $Q$.

What is Kalman Filter actually doing? In one dimension, we need to find an estimator

$$ \hat{x}_{k|1,\ldots,k} = \mathbb{E}(x_k|y_1,\ldots,y_k) $$

In other words, this means "we want to find the expected value of hand position at time $k$ given all my observations of neural signal from time $1$ to $k$". We also need to know the variance of it to estimate how much we trust this value (in multi-dimension this will be $\Sigma$):

$$ \sigma^2_{k|1,\ldots,k} = \text{Var}(x_k|y_1,\ldots,y_k) $$

In general, we would write the distribution of $x_k|y_1,\ldots,y_k$ and then just take its expected value. This seems to be difficult, but we have a recursive solution provided by Kalman Filter. We will ignore the tedious derivation of it here. The process will be:

  1. Initialize: $\Sigma_{0|0} = 0$

  2. Estimation updated process: Recursively calculate, until convergence of $K_k$:

$$ \Sigma_{k|k-1} = A\Sigma_{k-1|k-1}A^T + W $$

$$ \Sigma_{k|k} = \Sigma_{k-1|k-1} - \Sigma_{k-1|k-1}C^T(C\Sigma_{k-1|k-1}C^T)^{-1}C\Sigma_{k-1|k-1} $$

$$ K_k = \Sigma_{k-1|k-1}C^T(C\Sigma_{k-1|k-1}C^T)^{-1} $$

We called the converged value $\Sigma_\infty$ and $K_\infty$, and $K_k$ is called the Kalman Gain

After some more magic, we can calculate $M_1$ and $M_2$ by

$$ M_1 = A - K_\infty CA $$

$$ M_2 = K_\infty $$

What about $A$, $C$, $W$, $Q$ themselves? We need to calculate them using maximum likelihood method in machine learning. What we need to do is to maximize the following likelihood with parameters set $\theta$:

$$ L = P((x_k, y_k)^K_{k=1} | \theta = {A, W, C, Q}) $$

This means that, with the parameters set $\theta$, we need those obaservations data to appear at the highest possibility.

We can do this by taking the derivative of logarithm of $L$ with respective to $A$, $C$, $W$, $Q$. Then simply set it to zero. What we will finally get is:

$$ A = X_{:,2:end}X_{:,1:end-1}^{\dagger} $$

$$ C = YX^{\dagger} $$

$$ W = \frac{1}{K-1}(X_{:,2:end} - AX_{:,1:end-1})(X_{:,2:end} - AX_{:,1:end-1}) ^T $$

$$ Q = \frac{1}{K}(Y - CX)(Y - CX)^T $$

We also need to make sure that $A$ obeys law of physics. With all these parameters, we should be able to decode $\hat{x}_k$ using the formula we first propose. The final results of the decoding is shown in the graph on the left. On its right is the actually movement of all the reach in the experiment marked by different colors, where x and y axis represent the coordinates of hand positions. Although they don't seem to be close, the analysis on loss function of squared error proved the results to be reliable, and in real life they will work with high accuracy if we are not very serious about fidelity.

Conclusion

I hope this method in neural signal processing is inspiring and fantastic. This article introduces the motivation of choosing Kalman Filter in decoding neural signal, and the basic mathematical recursion in arriving at the final answer. In general, when it comes to decode continous data, Kalman Filter is a powerful technique which incorporates factors like history, law of physics, and the characteristics of decoded data into the decoding process. It presents ideal accuracy and robustness. In our project, for example, we can choose to use Kalman Filter in smoothing the data collected from the IMU. Nevertheless, I cannot guarantee that Kalman Filter will be a panacea to all your problems: it also has some drawbacks. Kalman Filter can easily experience overshoot since it may weight too heavily on the history [3]. Some potential improvements in neural signal processing include rotating the velocity toward the destination (take angles into consideration) or use Closed Loop Decoder Adaptation (CLDA). In modern decoder with more advanced features, Deep Neural Network is also widely used. In the near future, we can expect that neural signal decoder would provide more advanced functionality like speech decoding, wating for you to explore further [5].

References

[1] AI's Next Frontier: Are Brain-Computer Interfaces The Future Of Communication? Bernard Marr, Forbes. https://www.forbes.com/sites/bernardmarr/2023/08/11/ais-next-frontier-are-brain-computer-interfaces-the-future-of-communication/?sh=61d3d2b851d9

[2] Principles of Neural Science, Fourth Edition. Eric R. Kandel, James H. Schwartz, Thomas M. Jessell

[3] Slides from EC ENGR C143A "Neural Signal Processing" by professor Jonathan Kao

[4] Homework answers from EC ENGR C143A "Neural Signal Processing" by Hongchang Kuang

[5] SCIENTISTS SAY NEW BRAIN-COMPUTER INTERFACE LETS USERS TRANSMIT 62 WORDS PER MINUTE. Victor Tangermann, Futurism. https://futurism.com/neoscope/scientists-new-brain-computer-interface-type-62-words-per-minute