Quick and dirty minimization
Sometimes you just want to minimize a function. You’re in the middle of your coding and you suddenly realize that you need to know the minimum of your 7 dimensional function, let’s call it
def dirty_function(x[0], x[1], x[2]....):
return some_complicated_value
So you hit the books. Problem is, when you start reading up on minimization algorithms, you’ll find a whole smörgåsbord of complicated looking algorithms. You get all confused and you resign yourself to learning how to use a full-blown numerics library in order to minimize your dirty little function. There goes the rest of the week.
A couple of years ago, I had a conformation-analysis problem where I wanted to minimize some distances on a little peptide. I ended wasting a week reading up on Numerical Recipes in C. Their code was doubly terrible. It was terrible because their code was Fortran disguised in C, and it was doubly terrible because in their shitty license, they own the rights to their code and any modifications that you make.
It doesn’t have to be like that. I’ve found that if you want something rough-as-guts, it’s actually very simple to do a minimization in just a few lines of code.
There are roughly two types of minimization algorithms. Those that require a derivative, and those that don’t. Since you have a dirty_function, you probably don’t have an analytical derivative of dirty_function, so that rules out the algorithms that require a derivative (the best being Powell minimization). You have basically two options: downhill-simplex minimization and sequential-univariate minimization.
Although downhill-simplex is more robust, sequential-univariate minimization (even though its name is uglier) is way easier to implement.
The idea to seq-uni (as we’ll call it from now on) is to pretend that your function is parabolic in all 7 dimensions. Let’s say you are at
v = (x[0], x[1], x[2], ...)
So, for each iteration, you first pick two tiny steps in the first dimension, dimension 0:
x_old = x[0]
x_step1 = x_old + step
x_step2 = x_old + 2*step
Then you evaluate your dirty_function at:
y = dirty_function(x_old, x[1], x[2]....)
y1 = dirty_function(x_step1, x[1], x[2]...)
y2 = dirty_function(x_step2, x[1], x[2]...)
Now we pretend that (x_old y), (x_step1, y1), (x_step2, y2) are points on a parabola and calculate the x_min of the parabola.
def get_parabola_x_min(x1, y1, x2, y2, x3, y3):
# assuming y = A*x^2 + B*x + C
term1 = (y3-y2) / ((x3-x2)*(x3-x1))
term2 = (y1-y2) / ((x1-x2)*(x3-x1))
A = term1 - term2
B = (y3 - y2 + A*(x2*x2-x3*x3)) / (x2-x3)
x_min = 0.5 * B / A
return x_min;
x_min = get_parabola_x_min(x_old, y,
x_step1, y1, x_step2, y2)
x_min is now your new point for value of your vector in dimension 0.
v_new = (x_min, x[1], x[2]....)
Now cycle through the rest of your dimensions for your first iteration. At the end of the cycle, we’ve found the minimum of the fake parabolic minimums for each dimension.
Repeat the iteration until the x_min for each dimension is smaller than a certain tolerance (I’ve found that the tiny step you take has to be smaller than the tolerance):
error = abs(x_min - x_old)
if error < tolerance:
finish
If the error is small, it means that approximating a new parabola from the starting point brings us back to the same point. As the approximation is good only near the real minimum, we should be there.
So there you have it. A fast and dirty minimization. But beware: seq-uni can converge slowly with some functions, but this is balanced by the fact that you don’t make a lot of calls to dirty_function. And sometimes, seq-univ cycles around the minimum without terminating. Oh well, that’s why it’s a dirty method.

Numerical Recipes in C is actually triply bad: I’ve heard the algorithms in several cases are actually incorrect or poorly implemented (i.e., not numerically robust). I’ve had workmates tell me that the NRiC authors’ response is to deny there are any problems.
See also scipy:
http://baoilleach.blogspot.com/2007/06/minimizing-objective-function-using.html
Hi baoilleach, thanks for the comment.
I had a look at scipy, it looks great but it’s a pretty big package. However, I do find numpy, the core of scipy, to be an indispensable library that I use all the time.
One of the dilemnas in writing software is to decide how much dependency you want in your programs. What I want to show in the post was that if you want to minimize a function, and it doesn’t behave too badly, it’s really easy to implement the sequential-univariate, and thus you can obviate the need for an external library. This makes it easier to share code.
Besides, the algorithm is fun to figure out.
I was having another look at this post and I it took me a little while to figure out your implementation of `get_parabola_x_min`.
I think I can see why you called the variables what you did but it might be easier to understand if somewhere in your description you state that you’re finding the minimum of the quadratic with equation y = A*x^2 + B*x + C then using y’ = 2Ax + B and y’‘ = 2A to find the minimum at y’ = 0.
Interestingly, although you say this method is good for when you don’t have derivatives you are essentially approximating the first and second derivatives in your minimization. For example, you could write
This is mathematically the same calculation as yours but I believe this makes it clear that you’re estimating the second derivative to find A and then solving y’ = 2Ax + B for B by estimating the derivative using x2 and x3. Finally, you solve for x_min by setting y’ to zero.
Finding numeric approximations for derivatives is fairly straightforward, as this shows. Do you know how seq-uni compares to some of the other derivative requiring algorithms if you approximate the derivatives like this?
The badness of the seq-uni algorithm is that it assumes that the function is effectively independent in all dimensions. That is, the approximation of the derivative is only made in a single dimension. This loses any information about the mixing of the gradient between dimensions. An approximation of the gradient would give you this information.
Therefore, seq-uni performs really badly when there is any coupling between two dimensions. I’ve found that the algorithm performs well if the function is very simple, say a single distance constraint. But if I apply two distance constraints, it slows down so horribly, I don’t even know if it will eventually find a solution.
When it works well, the advantage of seq-univ is that it only requires 3 function evaluation in each dimension for a single iteraction, and that may be all you need. Furthermore, you only need to remember a single vector for each complete iteration, that should progressively get closer to the minimum. In downhill-simplex, you need n+1 complete vertices (where n is the number of dimensions) in each iteration for the algorithm to work.
I am now in the process of implementing a downhill-simplex algorithm to my problem.
FWIW, scipy comes with a downhill-simplex algorithm in optimize.py. I looked at it briefly, and it looks like it only depends on numpy. I don’t know how good the code is, but their copyright notice looks nice:
“You may copy and use this module as you see fit with no guarantee implied provided you keep this notice in all copies.”
Another interesting approach to efficient optimisation I heard about recently is Automatic Differentiation.
This approach recognises that approximating differentials numerically is computationally expensive and potentially unstable. So instead, you run an AD program over your FORTRAN or C/C++ code and the AD program will generate code for you that, when called, will compute the derivative(s) of your original function at the points given to it as arguments.
Yay, meta-programming!
i think i love you!
i have been working with this nasty little function for my thesis, have MEMORIZED “amoeba” in C and FORTRAN, and have been having major problems implementing it. i am ECSTATIC to try these new methods!
i am only a little nervous that when approximating all of these derivatives that the error will become a little bit more nasty than i wish it to be….i guess i shall see!