Convex Optimization
In this notebook, we discuss the fundamentals of Convex Optimization. We also examine a simple practical application of convex optimization: image in-painting using the CVXPY package in Python.
Convex Optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many methods are classified as convex optimization. Although it is instrumental in Artificial Intelligence, Convex Optimization is a general technique that is not limited to Artificial Intelligence; it has applications in various fields, such as information and communication systems, circuit design, portfolio optimization (stock exchange), and many others. This notebook covers the fundamental theoretical concepts of optimization and convex optimization and shows some simple Python examples of how to use this technique.
To define Convex Optimization, we must first look at the definitions of optimization and convex functions.
Mathematical optimization is the process of selecting the best element, subject to some criterion, from a set of available alternatives. Note that the word optimization is used in many different contexts. For example, compiler optimization in programming language implementation and compiler theory is very different from what we will talk about. In this notebook, whenever you see optimization, it means "Mathematical Optimization."
Several definitions try to formalize Mathematical Optimization. Here, we present one of the most commonly used notations.
$$\begin{aligned}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x) \leq 0, \quad i=1,\cdots,m \\
& g_i(x) = 0, \quad i=1,\cdots,p
\end{aligned}$$
In which $x \in \mathbb{R}^n$ is a vector variable to be chosen, $f_0$ is the objective function to be minimized, $f_1,\cdots,f_m$ are the inequality constraint functions, and $g_1,\cdots,g_p$ are the equality constraint functions.
There are various variations of this notation, but they can easily be transformed into the one presented above. For example, the problem of maximizing a function $f_0(x)$ can be transformed into the problem of minimizing $-f_0(x)$.
There are two aspects to every optimization problem: modeling it and solving it.
Each of these two aspects is as important as the other one. If you don't have a good model, solving it will not help you solve the real-world problem.
A model is a mapping from the real-world high-level description of the problem to the mathematical notations. For example, in circuit design problems, x can represent the specifications of the actual design, like the placement of each component and other technical information. Constraints come from the physical limitations of the manufacturing process and performance requirements, and last, but not least, the objective function can be a combination of cost, weight, power, area, etc. We should define all of these aspects mathematically in order to have a good model.
There are multiple methods to solve the problem. Convex Optimization focuses on methods of solving specific but prevalent types of optimization problems. More on that later.
If you remember High School geometry, a convex polygon is a polygon in which no line segment between two points on the boundary ever goes outside the polygon.
Convex functions and convex sets follow the same intuition. A function $f: D \to \mathbb{R}$ (with $D$ a convex set) is called convex if and only if, for all $x_1, x_2 \in D$ and all $\lambda \in [0, 1]$, the following condition holds: $$f(\lambda x_1 + (1-\lambda) x_2) \leq \lambda f(x_1) + (1-\lambda) f(x_2)$$
Also, a function is strictly convex if and only if, for all $x_1 \neq x_2$ in $D$ and all $\lambda \in (0, 1)$, the following condition holds: $$f(\lambda x_1 + (1-\lambda) x_2) < \lambda f(x_1) + (1-\lambda) f(x_2)$$
Intuitively, assume we take two points $(x_1, f(x_1))$ and $(x_2, f(x_2))$ and connect them with a straight line segment. If the function $f$ is convex, then every point of the graph of $f$ between $x_1$ and $x_2$ must lie on or below this segment.
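As an illustrative sketch of our own (not part of the original text), we can check this chord inequality numerically at randomly sampled points: $f(x) = x^2$ should satisfy it everywhere, while $\sin(x)$, which is not convex on the whole real line, should violate it somewhere:

```python
import numpy as np

rng = np.random.default_rng(0)

def satisfies_convexity(f, x1, x2, lam):
    # Chord (Jensen) inequality with a tiny tolerance for floating-point error:
    # f(lam*x1 + (1-lam)*x2) <= lam*f(x1) + (1-lam)*f(x2)
    return f(lam * x1 + (1 - lam) * x2) <= lam * f(x1) + (1 - lam) * f(x2) + 1e-12

f = lambda x: x ** 2      # convex everywhere
g = lambda x: np.sin(x)   # not convex on the whole real line

x1 = rng.uniform(-10, 10, 1000)
x2 = rng.uniform(-10, 10, 1000)
lam = rng.uniform(0, 1, 1000)

print(all(satisfies_convexity(f, a, b, l) for a, b, l in zip(x1, x2, lam)))  # True
print(all(satisfies_convexity(g, a, b, l) for a, b, l in zip(x1, x2, lam)))  # False
```

Of course, a finite sample can only refute convexity, never prove it; this is just a sanity check on the definition.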
A convex set is a subset of a real vector space that, given any two points in it, contains the whole line segment joining them.
The most commonly used notation for a Convex Optimization problem is
$$\begin{aligned}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x) \leq 0, \quad i=1,\cdots,m \\
& A x = b
\end{aligned}$$
In which $x \in \mathbb{R}^n$ and $f_0, \cdots, f_m$ are convex. Also, note that the equality constraints are linear and cannot be arbitrary functions. To express the linear equations, we use matrix notation: if we want $p$ different equality constraints, we write them as the multiplication of a $p \times n$ matrix $A$ and an $n \times 1$ column vector variable $x$; therefore, the other side of the equation, $b$, is a $p \times 1$ column vector.
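For concreteness, here is a small sketch of our own (not from the original text): the two equality constraints $x_1 + x_2 = 3$ and $x_1 - x_2 = 1$ can be packed into a single matrix equation $Ax = b$:

```python
import numpy as np

# Constraints x1 + x2 = 3 and x1 - x2 = 1, written as A @ x == b.
A = np.array([[1.0,  1.0],
              [1.0, -1.0]])   # p x n matrix (here p = n = 2)
b = np.array([3.0, 1.0])      # p-vector

# Since A happens to be square and invertible, the constraint set
# contains exactly one point, which we can recover directly:
x = np.linalg.solve(A, b)
print(x)  # [2. 1.]
```

In a real convex program, $A$ is usually not square, and the constraint $Ax = b$ describes an affine set of feasible points rather than a single solution.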
Convex functions have many good properties that make reaching the solution easier; you can find some of these properties on Wikipedia. They lead to crucial properties of convex optimization problems, most notably that every local minimum is a global minimum and that the set of optimal points is itself convex.
These properties lead to methods that can numerically solve convex optimization problems in polynomial time.
Because such efficient methods exist, we usually try to formulate optimization problems as convex problems. Some problems can easily be transformed into this format, but others require some reformulation tricks.
Many optimization methods are special cases of convex optimization or can be reduced to convex optimization with some tricks; well-known examples include least squares and linear programming. You may be familiar with some of these concepts. This shows how versatile convex optimization is.
Moreover, Convex Optimization is used in many different scientific fields and areas. Here we list some application areas of Convex Optimization. Below each item, you can see one or more links to sites containing more information about the subject or the application of convex optimization in that area.
Machine Learning:
CVXPY is a Python library initially developed at Stanford University. It is a domain-specific language embedded in Python that allows the programmer to easily define the problem model and solve the problem using Convex Optimization techniques.
In this notebook, we examine the in-painting problem. In this problem, the program receives a corrupted image. Some pixel values of this corrupted image are missing, and the program should try to guess these missing values to get a clear image.
First, we install the required packages using pip. Note that installing and downloading CVXPY may take a little longer than NumPy and Matplotlib.
!pip install numpy
!pip install matplotlib
!pip install cvxpy
Consider the following image of a cat. We corrupt it by keeping about 30% of its pixels and discarding the others. Then we develop a simple algorithm that takes the corrupted image as input and tries to in-paint it.
import matplotlib.pyplot as plt
import numpy as np
# Load the images.
u_orig = plt.imread("./images/cat512color.png")
rows, cols, colors = u_orig.shape
# known is 1 if the pixel is known,
# 0 if the pixel was corrupted.
# The known matrix is initialized randomly.
known = np.zeros((rows, cols, colors))
for i in range(rows):
    for j in range(cols):
        if np.random.random() > 0.7:
            for k in range(colors):
                known[i, j, k] = 1
u_corr = known * u_orig
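The triple loop above is easy to read but slow for large images. An equivalent vectorized construction (our own sketch, producing the same kind of mask; the image dimensions below are assumed to match the 512x512 RGB cat image) is:

```python
import numpy as np

rng = np.random.default_rng(0)
rows, cols, colors = 512, 512, 3  # assumed dimensions of the cat image

# Draw one random number per pixel and share the keep/discard decision
# across all channels, exactly like the loop version: each pixel is
# kept with probability ~0.3.
mask2d = (rng.random((rows, cols)) > 0.7).astype(float)
known = np.repeat(mask2d[:, :, None], colors, axis=2)

print(known.shape)  # (512, 512, 3)
print(known.mean())  # roughly 0.3
```

Sharing one decision per pixel across channels matters: if each channel were masked independently, a "known" pixel could have some channels missing, which would make the corruption model different from the one described above.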
# Display the images.
%matplotlib inline
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(u_orig);  # RGB image; no colormap needed
ax[0].set_title("Original Image")
ax[0].axis('off')
ax[1].imshow(u_corr);
ax[1].set_title("Corrupted Image")
ax[1].axis('off');
For the in-painting, we must find a way to guess the value of missing pixels. Let us denote the pixels array with $P_{i,j}$ notation. We can choose many different objective functions. Here we use $l_2$ total variation and try to minimize it. By minimizing total variation, we try to make each missing pixel have the minimum possible distance from its neighbors.
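For reference (our paraphrase of the standard definition, not stated in the original), the $\ell_2$ total variation of a single-channel image $U$ is
$$\operatorname{tv}(U) = \sum_{i,j} \left\| \begin{bmatrix} U_{i+1,j} - U_{i,j} \\ U_{i,j+1} - U_{i,j} \end{bmatrix} \right\|_2$$
that is, the sum over pixels of the Euclidean norm of the discrete gradient. To our understanding, when several channels are passed to `cp.tv` at once, the horizontal and vertical differences of all channels at a pixel are stacked into one vector before the norm is taken, which couples the color channels.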
import cvxpy as cp
variables = []
constraints = []
for i in range(colors):  # colors == 3
    U = cp.Variable(shape=(rows, cols))
    variables.append(U)
    constraints.append(cp.multiply(known[:, :, i], U) ==
                       cp.multiply(known[:, :, i], u_corr[:, :, i]))
objective_function = cp.Minimize(cp.tv(*variables)) # tv is total variation
problem = cp.Problem(objective_function, constraints)
problem.solve(verbose=True, solver=cp.SCS)
print("optimal objective value: {}".format(problem.value))
Now let's see the final result and compare it to the original image.
import matplotlib.pyplot as plt
import matplotlib.cm as cm
%matplotlib inline
rec_arr = np.zeros((rows, cols, colors))
for i in range(colors):
    rec_arr[:, :, i] = variables[i].value
rec_arr = np.clip(rec_arr, 0, 1)
fig, ax = plt.subplots(1, 2,figsize=(10, 5))
ax[0].imshow(rec_arr)
ax[0].set_title("In-Painted Image")
ax[0].axis('off')
img_diff = np.clip(10 * np.abs(u_orig - rec_arr), 0, 1)
ax[1].imshow(img_diff)
ax[1].set_title("Difference Image")
ax[1].axis('off')
The in-painted image looks almost identical to the original one. Although it differs from the original in many places and minor artifacts are visible (for example, in the cat's whiskers), the result is acceptable for roughly ten lines of code.