Often problems occur where we want to maximise or minimise some quantity. For simple functions or graphs this is fairly easy to do. When the information is plotted in a graph we can look for the highest or lowest point visually. Alternatively, if we have a function that can be written mathematically, we can use differentiation to find its maximum or minimum.
What optimisation is able to cope with is finding the maximum or minimum of a quantity when it is limited, or 'constrained', by some other facts. Think of trying to build as many products in a factory as you can, whilst only having a limited amount of raw materials and workers, both of which are needed to build the product.
Maximisation or Minimisation?
Normally we want to deal with either a maximisation or a minimisation problem. It turns out that these two problems are in fact two sides of the same coin: finding the minimum of some function f(x) is the same as finding the maximum of -f(x). As a result we typically define optimisation problems in the form of a minimisation problem.
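This duality can be sketched in a few lines of Python; the function f and the grid of candidate points below are invented purely for illustration:

```python
# Minimal sketch: minimising f over a grid of candidate points gives the
# same answer as maximising -f over that grid. The function f and the
# grid are hypothetical examples.

def f(x):
    return (x - 2) ** 2 + 1  # example function with its minimum at x = 2

grid = [i / 10 for i in range(-50, 51)]  # candidate points -5.0 .. 5.0

x_min = min(grid, key=f)                # minimise f(x)
x_max = max(grid, key=lambda x: -f(x))  # maximise -f(x)

print(x_min, x_max)  # both approaches pick out x = 2.0
```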
If you think about minimising a simple function such as f(x) = x², then strictly speaking what you are actually doing is minimising this function over a certain domain. This domain can be as small or as large as you require, for example an interval such as -2 ≤ x ≤ 2.
Mathematical optimisation is essentially a more general form of this simple optimisation but over a more complicated domain. Instead of being made up of a single variable our domain is constructed from any number of variables – many real world problems can involve millions of variables.
A Basic Definition – courtesy of Jean Francois Puget
The minimisation problem is defined by two features: a domain P and a real-valued function f. What we wish to do is to find an x contained inside P such that for every y in P, f(x) ≤ f(y).
In order to find a solution x in P we need to have a definition of P in the first place. In practice P is usually a subset of ℝⁿ (points contained within P are vectors made of n real values) and P is further specified by constraints. These constraints exclude some points from the valid solution space.
An example of these constraints on P is illustrated above for the case of two dimensions. After applying the constraints we see that only the points that lie within the blue feasible region are allowed to be part of P. This feasible region is often thought of as a 'convex polyhedron' as that is its shape in higher dimensions – assuming that the feasible region is bounded. It is also possible to have constraints of different types, such as linear (seen above), quadratic and integer constraints.
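As a rough sketch, testing whether a point satisfies a set of linear constraints (and so lies inside the feasible region P) might look like this in Python; the constraints themselves are made up for illustration:

```python
# Sketch of feasibility checking against linear constraints in two
# dimensions. The constraints here are hypothetical:
#   x + y <= 4,  x >= 0,  y >= 0

constraints = [
    (lambda x, y: x + y <= 4, "x + y <= 4"),
    (lambda x, y: x >= 0, "x >= 0"),
    (lambda x, y: y >= 0, "y >= 0"),
]

def is_feasible(x, y):
    """A point belongs to P only if it satisfies every constraint."""
    return all(check(x, y) for check, _ in constraints)

print(is_feasible(1, 2))  # True: inside the feasible region
print(is_feasible(3, 3))  # False: violates x + y <= 4
```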
In order to be able to say that we have solved a problem we need to be able to meet two conditions:
- Find x in P
- For all y in P, f(x) ≤ f(y)
If we meet both 1 and 2 we can be said to have found an 'optimal solution'. If we meet 1 but not 2, then we have found a 'feasible solution'. It should be noted that meeting both of these conditions is not always easy. In fact, depending on what P and f are like, 'proving optimality' is often much harder than finding a feasible solution.
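The difference between the two conditions can be illustrated with a toy problem small enough to enumerate; the objective function and constraints below are invented for the sketch:

```python
# Toy sketch: enumerating a small finite P to find both a feasible
# solution (condition 1 alone) and the optimal one (conditions 1 and 2).
# The objective and constraints are hypothetical examples.

def f(x, y):
    return (x - 1) ** 2 + (y - 3) ** 2  # objective to minimise

# P: integer points with x + y <= 4, x >= 0, y >= 0
P = [(x, y) for x in range(5) for y in range(5) if x + y <= 4]

feasible = P[0]                        # condition 1: any point of P will do
optimal = min(P, key=lambda p: f(*p))  # conditions 1 and 2 together

print(feasible, optimal)  # (0, 0) is feasible; (1, 3) is optimal
```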
What if we can’t reach optimality?
Modern computers running powerful optimisation solvers can usually solve most problems to optimality in a short amount of time. There are, however, situations where the problem is so large or complicated that you cannot determine the optimal solution within a useful timeframe. For example, a business trying to solve a scheduling problem for jobs to be undertaken later that day cannot wait until the afternoon for a solution.
This motivates another useful concept in optimisation: being able to find a feasible solution and determine its quality. In the business situation you can probably find a feasible solution x in P, but you do not have enough time to prove the second statement, that all other y in P have f(y) ≥ f(x).
If we wish to find a sub-optimal solution to the problem then ideally we need some way of measuring the quality of this solution. In essence this is a measure of how close the objective function value is to the optimal objective function value. The way that this is done is by finding a lower bound L on the objective function, below which the objective function can never drop.
for all y in P, f(y) ≥ L
Normally this lower bound is found by solving a relaxed version of the problem (for example, allowing integer variables to take continuous values) or during solver methods such as branch and bound. Once this L has been identified we can work out how close the current solution x is to it. The difference between the objective function value and L is known as the 'optimality gap'.
absolute gap = f(x) − L
Often this absolute gap is expressed in relative terms so that you can compare the percentage difference between your best solution and the lower bound. With this knowledge it is possible to state how close to optimality your solutions are, which means that requests such as 'find a solution with an optimality gap of less than 5% within 3 hours' can be dealt with.
relative gap = (f(x) − L) / |f(x)|
Clearly a gap of zero means that you have found the optimal solution. The smaller the gap, the better your solution, but it is likely to take increasingly long to reduce the gap further. For example, it will take longer to go from a gap of 5% to 4% than from 10% to 5%.
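Putting the two gap formulas together, a short sketch with made-up objective values might look like:

```python
# Sketch of the optimality gap calculations; the objective value and
# lower bound below are invented numbers.

best_objective = 105.0  # f(x) for the best feasible solution found so far
lower_bound = 100.0     # L, e.g. from a relaxation or branch and bound

absolute_gap = best_objective - lower_bound        # 5.0
relative_gap = absolute_gap / abs(best_objective)  # about 4.76%

# A solver can be told to stop as soon as the relative gap falls below
# a chosen tolerance, e.g. 5%.
tolerance = 0.05
print(absolute_gap, relative_gap <= tolerance)
```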