# Multi-objective Least Squares and Applications

Author: Bruce Qu. Revisions by: Neil Kane.
This is an introductory article that presents the multi-objective least squares problem to readers with some basic programming background. Table of Contents:
- What is Least Squares?
- Why would you care?
- Preliminaries
- Multi-objective least squares
- Regularized data fitting
- Different Flavors and Extensions
- An interesting Application
- Conclusion
## What is Least Squares?

Before diving into the in-depth mathematics, let's take a few steps back and understand, at a high level, what least squares is. As the name suggests, the idea is to minimize the sum of squares of some quantity. This quantity is generally a metric that we want to penalize; in other words, we prefer values that are smaller over those that are larger. The goal is then to find the parameters that give us the least sum of squares, or simply "least squares". These quantities are usually things like the distances between data points and our line (or hyperplane in higher dimensions). The fundamental idea is to determine a mapping between input data points and outputs.
## Why would you care?

Least squares problems, especially multi-objective least squares problems, are important in many fields of science, engineering, and data analysis. In this wiki article, we will also see that they can be applied to noise reduction: least squares can be used to filter out noise and recover a signal that best represents the "ground-truth" data. Towards the end of this article, we will apply the multi-objective least squares concept to an image-deblurring task. Without further ado, let's dive right in!
## Preliminaries

Before we start, we need to review some fundamental concepts that we will use later on.
- Least Squares
In this section, we will briefly go over the least squares problem, how we can solve this type of problem, and what its solution looks like.
Problem: Given an $m \times n$ matrix $A$ and an $m$-vector $b$, find an $n$-vector $x$ that makes $Ax$ as close as possible to $b$, i.e., that minimizes $\|Ax - b\|^2$.

The term "least squares" comes from the objective being a summation of squares:

$$\|Ax - b\|^2 = (a_1^T x - b_1)^2 + (a_2^T x - b_2)^2 + \cdots + (a_m^T x - b_m)^2,$$

where $a_i^T$ denotes the $i$-th row of $A$ and $b_i$ the $i$-th entry of $b$.
Example: For example, let

$$A = \begin{bmatrix} 2 & 0 \\ -1 & 1 \\ 0 & 2 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}.$$

The least squares solution $\hat{x}$ is the 2-vector that minimizes

$$\|Ax - b\|^2 = (2x_1 - 1)^2 + (-x_1 + x_2)^2 + (2x_2 + 1)^2.$$

To find $\hat{x}$, we set the partial derivatives with respect to $x_1$ and $x_2$ to zero (equivalently, we solve the normal equations $A^T A \hat{x} = A^T b$).

The solution is $\hat{x} = (1/3, -1/3)$, which attains the objective value $\|A\hat{x} - b\|^2 = 2/3$; no other choice of $x$ does better.
Summary: In general, we formulate a least squares problem in the following format.

Find any $x = (x_1, \dots, x_n)$ that minimizes the sum of squared residuals

$$(a_1^T x - b_1)^2 + (a_2^T x - b_2)^2 + \cdots + (a_m^T x - b_m)^2.$$

To make it more compact, let $A$ be the $m \times n$ matrix whose rows are $a_1^T, \dots, a_m^T$, and let $b = (b_1, \dots, b_m)$, so the objective becomes $f(x) = \|Ax - b\|^2$.

Solution: If we assume that the columns of $A$ are linearly independent, then $A^T A$ is invertible and the least squares solution is

$$\hat{x} = (A^T A)^{-1} A^T b.$$

Note that this is the unique solution of the least squares problem: any $x \neq \hat{x}$ gives a strictly larger value of $\|Ax - b\|^2$.
Proof: There are many ways to prove that the previous answer is the correct solution to a general least squares problem. To accommodate readers who are not familiar with basic linear algebra, we will approach the solution from a calculus perspective.
Keep in mind that we want to minimize the function

$$f(x) = \|Ax - b\|^2 = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} A_{ij} x_j - b_i \right)^2.$$

We can then take the partial derivative of $f$ with respect to each variable $x_k$; at a minimizer $\hat{x}$, every partial derivative must vanish. For a single term, the chain rule gives $\frac{\partial}{\partial x_k} \left( \sum_j A_{ij} x_j - b_i \right)^2 = 2 \left( \sum_j A_{ij} x_j - b_i \right) A_{ik}$.

We can apply this concept to every term in the gradient of $f$, which gives

$$\nabla f(x) = 2 A^T (Ax - b).$$

Solving $\nabla f(\hat{x}) = 2 A^T (A\hat{x} - b) = 0$ gives the normal equations $A^T A \hat{x} = A^T b$. Since the columns of $A$ are linearly independent, $A^T A$ is invertible, and therefore $\hat{x} = (A^T A)^{-1} A^T b$, as claimed.
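As a quick numerical check, here is a minimal MATLAB sketch (reusing the small example above) that solves the normal equations directly and compares the result with MATLAB's backslash operator, which returns the same least squares solution for a tall matrix with linearly independent columns.

```matlab
% Small least squares example: A is 3x2 with linearly independent columns.
A = [2 0; -1 1; 0 2];
b = [1; 0; -1];

% Solve the normal equations (A'*A)*x = A'*b.
x_normal = (A' * A) \ (A' * b);

% MATLAB's backslash also returns the least squares solution for tall A.
x_backslash = A \ b;

disp([x_normal, x_backslash]);   % both columns equal [1/3; -1/3]
```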
Now, we are ready to investigate the multi-objective least squares problem!
## Multi-objective least squares

In a multi-objective least squares problem, we seek a single $n$-vector $x$ that makes $k$ different objectives $J_i = \|A_i x - b_i\|^2$, $i = 1, \dots, k$, all small. Since the objectives generally compete with each other, we minimize a weighted sum

$$J = \lambda_1 \|A_1 x - b_1\|^2 + \lambda_2 \|A_2 x - b_2\|^2 + \cdots + \lambda_k \|A_k x - b_k\|^2,$$

where the positive weights $\lambda_i$ express how much we care about each objective relative to the others.
For example, a weighted least squares problem, in which each data point is given its own positive weight, can be formulated directly with the above definitions.
In general, the coefficients $\lambda_1, \dots, \lambda_k$ must be positive, and it is common to normalize $\lambda_1 = 1$ so that the remaining weights are interpreted relative to the first (primary) objective.
Solution: Assuming all $\lambda_i > 0$, we can absorb each weight into its objective and write the weighted sum as a single squared norm,

$$J = \left\| \begin{bmatrix} \sqrt{\lambda_1} A_1 \\ \vdots \\ \sqrt{\lambda_k} A_k \end{bmatrix} x - \begin{bmatrix} \sqrt{\lambda_1} b_1 \\ \vdots \\ \sqrt{\lambda_k} b_k \end{bmatrix} \right\|^2 = \|\tilde{A} x - \tilde{b}\|^2.$$

We can see that in the final form, it looks exactly like a least squares problem. However, if the stacked matrix has linearly dependent columns, we have multiple solutions. Also note that we do not enforce all matrices $A_i$ to have linearly independent columns individually; it is enough for the stacked matrix $\tilde{A}$ to have linearly independent columns.

Hence the solution to the above problem can be derived from a single-objective least squares problem:

$$\hat{x} = (\tilde{A}^T \tilde{A})^{-1} \tilde{A}^T \tilde{b} = \left( \sum_{i=1}^{k} \lambda_i A_i^T A_i \right)^{-1} \left( \sum_{i=1}^{k} \lambda_i A_i^T b_i \right).$$
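Here is a minimal MATLAB sketch of the stacking trick, using two small made-up objectives; the matrices, vectors, and weight below are illustrative assumptions rather than anything from this article.

```matlab
% Two objectives ||A1*x - b1||^2 and ||A2*x - b2||^2, weighted 1 and lambda2.
A1 = [1 2; 3 4; 5 6];   b1 = [1; 2; 3];
A2 = eye(2);            b2 = [0; 0];
lambda2 = 0.5;

% Stack the weighted blocks and solve a single least squares problem.
Atilde = [A1; sqrt(lambda2) * A2];
btilde = [b1; sqrt(lambda2) * b2];
xhat = Atilde \ btilde;

% The same solution from the weighted normal equations.
xhat_check = (A1'*A1 + lambda2*(A2'*A2)) \ (A1'*b1 + lambda2*(A2'*b2));
```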
## Regularized data fitting

Motivation: Occam's razor highlights the idea that simple explanations are better than more complex ones. The same concept is often applied to the models we fit using techniques like least squares. Sometimes the resulting parameter values are too "complex" (too large), and we would like to penalize this in some way. This motivates regularization, which is one of the most widely used applications of multi-objective least squares. Specifically, we add another objective involving the parameters we are solving for. There are many options here, such as taking absolute values or squaring them as in least squares (for those who are familiar, these are often called L1 and L2 regularizers), but in essence we prefer smaller (simpler) models over larger (more complex) ones. As an added benefit, the closed-form least squares solution in the linear case relies on linearly independent columns, and adding L2 regularization is a simple way to ensure this criterion is always met. We can now leverage the solution presented in the previous section to explore an example of regularized data fitting.
Problem: We have data points $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(N)}, y^{(N)})$ and a model of the form

$$\hat{f}(x) = \theta_1 f_1(x) + \theta_2 f_2(x) + \cdots + \theta_p f_p(x),$$

where $f_1, \dots, f_p$ are fixed basis functions and $\theta_1, \dots, \theta_p$ are the parameters we want to choose.

If we collect the model values into an $N \times p$ matrix $A$ with entries $A_{ij} = f_j(x^{(i)})$ and stack the targets into the vector $y = (y^{(1)}, \dots, y^{(N)})$, then the prediction error on the data set is $\|A\theta - y\|^2$.
Now, our problem becomes two-fold:
- We want to fit the model $\hat{f}(x)$ to the data points $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), (x^{(3)}, y^{(3)}), \dots, (x^{(N)}, y^{(N)})$, i.e. minimize the difference between our predictions and the ground truth.
- We want to keep $\theta_1, \theta_2, \dots, \theta_p$ small to avoid over-fitting.
Method: We can easily formulate this as a multi-objective least squares problem (L2 regularization): minimize

$$\|A\theta - y\|^2 + \lambda \|\theta\|^2.$$

Depending on the desired strength of regularization, we can increase or decrease the value of the regularization coefficient $\lambda > 0$: a larger $\lambda$ shrinks the parameters more aggressively, while a smaller $\lambda$ lets the model follow the data more closely.

Solution: Similar to the previous multi-objective least squares problem,

$$\hat{\theta} = (A^T A + \lambda I)^{-1} A^T y,$$

which exists and is unique for every $\lambda > 0$, because the stacked matrix $\begin{bmatrix} A \\ \sqrt{\lambda} I \end{bmatrix}$ always has linearly independent columns.
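The following MATLAB sketch shows regularized data fitting on synthetic data; the data, model size, and value of $\lambda$ are made up purely for illustration.

```matlab
% Synthetic regression data: N examples, p features (made up for illustration).
N = 30; p = 5;
A = randn(N, p);
theta_true = [1; -2; 0.5; 0; 3];
y = A * theta_true + 0.2 * randn(N, 1);

% Regularized least squares: minimize ||A*theta - y||^2 + lambda*||theta||^2,
% solved as a single stacked least squares problem.
lambda = 0.1;
theta_ridge = [A; sqrt(lambda) * eye(p)] \ [y; zeros(p, 1)];

% Equivalent closed form from the normal equations.
theta_check = (A'*A + lambda*eye(p)) \ (A'*y);
```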
## Different Flavors and Extensions

In the prior example we explored the case where each dimension of our input data is treated equally, resulting in a model that is linear in the raw inputs. However, this is not the only type of least squares problem we can explore. We can also augment our data by raising certain dimensions to different powers, applying a log, or really anything we desire; as long as the model remains linear in the parameters, the closed-form solution above still applies, and it is only when the parameters themselves enter nonlinearly that we lose it. These extensions matter because our data is not always related linearly to the outputs: if we have a parabolic curve of data, it does not make sense to approximate it with a straight line, for example (see the sketch below).
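As a quick illustration, a cubic polynomial fit to scalar data is still an ordinary linear least squares problem in the coefficients, because the nonlinearity lives entirely in how we build the feature matrix. A rough MATLAB sketch with made-up data:

```matlab
% Made-up scalar data following a curved trend.
x = linspace(-1, 1, 40)';
y = 1 - 2*x + 3*x.^2 + 0.1*randn(size(x));

% Augment the data with powers of x; the model stays linear in theta.
A = [ones(size(x)), x, x.^2, x.^3];
theta = A \ y;                 % closed-form least squares fit

plot(x, y, 'o', x, A*theta, '-');
```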
The main limitation of these nonlinear-in-the-parameters cases is that we cannot easily get a closed-form solution like we can in the linear case. This has a strong relationship to many machine learning techniques, whose goal is also to minimize some cost and which attack the task with other approaches, often based on gradient descent. The concept of squaring is analogous to using the L2 norm, and it is popular because it is differentiable and leads to nice closed-form solutions. It is by no means the only option, and there is a lot of flexibility in how exactly the penalized quantity is defined: we could just as easily have taken absolute values, combined with augmenting our data with polynomial features. These are just some of the many approaches one can take, and the specific option is usually chosen through empirical trial and error mixed with prior knowledge about the data. Hopefully this article has given some insight into the various least squares approaches; if you want to learn more, related topics include least squares with explicit constraints on the solution (constrained least squares) and other machine learning methods that build on this idea of minimizing a cost.
## An interesting Application

In this part, we will briefly discuss how we can treat some real-world problems as multi-objective least squares problems.
An interesting application of estimation (or multi-objective least squares problem) is image deblurring/denoising.
Let $X$ be the unknown sharp image we want to recover, $B$ a known blurring kernel, and $Y$ the blurred, noisy image that we actually observe, so that (approximately) $Y \approx B * X + \text{noise}$, where $*$ denotes 2-D convolution.
In the MATLAB code below, we load in an image from the USC-SIPI Image Database at http://sipi.usc.edu/database.
```matlab
% MATLAB script
% deblur.mat contains the blurred, noisy image Y and the blurring kernel B.
load('deblur.mat', 'Y', 'B');
imshow(Y);
```
Problem. In this image deblurring problem, we are given the noisy, blurred image $Y$ and the blurring kernel $B$, and we want to recover an estimate of the original, sharp image $X$.

Method. We will try to construct $X$ as the solution of a multi-objective least squares problem: one objective keeps the blurred version of our estimate close to the observed image, and a second objective keeps the reconstruction smooth.

We will introduce the cost function/objective:

$$\|B * X - Y\|^2 + \lambda \left( \|D_v X\|^2 + \|D_h X\|^2 \right),$$

where $D_v$ and $D_h$ take differences between vertically and horizontally adjacent pixels and $\lambda > 0$ is the regularization weight.

Intuition. The term $\|B * X - Y\|^2$ measures how well our estimate, once blurred, explains the image we actually observed, while the difference terms penalize large jumps between neighboring pixels. A noisy image has large pixel-to-pixel differences, so the second objective suppresses noise; the weight $\lambda$ trades off data fidelity against smoothness.
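The code below evaluates the minimizer of this objective. Under a circular-convolution model, the solution can be computed frequency-by-frequency with the 2-D FFT (the derivation follows the same weighted least squares reasoning as above, but the FFT details are beyond this wiki):

$$\hat{X} = \mathcal{F}^{-1}\!\left( \frac{\overline{\mathcal{F}(B)} \circ \mathcal{F}(Y)}{|\mathcal{F}(B)|^2 + \lambda\,|\mathcal{F}(E)|^2 + \lambda\,|\mathcal{F}(E^T)|^2} \right),$$

where $\mathcal{F}$ is the 2-D DFT, the bar denotes the complex conjugate, $\circ$ and the division are elementwise, and $E$ is the first-difference kernel constructed in the code.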
```matlab
% MATLAB script (cont.)
% E is a 1024x1024 circular first-difference kernel: convolving an image with
% E (via the FFT) takes differences between vertically adjacent pixels, and
% E' does the same horizontally.
E = [1, zeros(1, 1023); zeros(1022, 1024); -1, zeros(1, 1023)];

% Denominator of the frequency-domain solution of
% minimize ||B*X - Y||^2 + lambda*(||E*X||^2 + ||E'*X||^2).
D = @(lambda) abs(fft2(B)).^2 + lambda.*abs(fft2(E)).^2 + lambda.*abs(fft2(E')).^2;

% Try lambda = 10^-6, 10^-4, 10^-2, 10^0 and display each reconstruction.
for i = -6:2:0
    % real() discards the negligible imaginary round-off left by ifft2.
    X = real(ifft2((conj(fft2(B)) .* fft2(Y)) ./ D(10.^i)));
    figure();
    imshow(X);
    str = sprintf('lambda = 10^{%d}', i);
    title(str);
end
```
Analysis: In the above code, we used the fast Fourier transform (FFT) and its inverse to evaluate the solution efficiently; the details are outside the scope of this wiki. We notice that when $\lambda$ is very small, the reconstruction stays sharp but remains noisy, while when $\lambda$ is very large the noise disappears but the image becomes overly smooth and blurry again; an intermediate value of $\lambda$ gives the best-looking reconstruction.
## Conclusion

In this wiki article, we established the multi-objective least squares problem, derived its solution, and introduced the regularization term. We also explored other variations to expand on this idea before considering a real-world example: we treated the image deblurring task as a multi-objective least squares problem.
## References

Boyd, S., & Vandenberghe, L. (2019). Chapter 15: Multi-objective least squares. In Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares (pp. 309–325). Cambridge University Press.

USC Signal and Image Processing Institute. (n.d.). Volume 1: Mosaics. USC-SIPI Image Database. Retrieved February 10, 2023, from http://sipi.usc.edu/database
Johari, A. (2020, May 13). A 101 guide on the least squares regression method. Medium. Retrieved February 10, 2023, from https://medium.com/edureka/least-square-regression-40b59cca8ea7