100% UNDER CONSTRUCTION
{ BEWARE: I am not a mathematician, this will be dumbed down for noobs and programmers like me, actual mathematicians may suffer brain damage reading this. ~drummyfish }
Calculus is a somewhat unpopular but immensely important area of advanced mathematics whose focus lies in the study of continuous change: for example how quickly a function grows, how fast its growth "accelerates", in which direction a multidimensional function grows the fastest etc. This means in calculus we stop being preoccupied with actual immediate values and start focusing on their CHANGE: things like velocity, acceleration, slopes, gradients etc., in a highly generalized way. Calculus is one of the first disciplines one gets confronted with in higher math, i.e. when starting University, and for some reason it's a very feared subject among students, to whom the name sounds like a curse, although the basics aren't more difficult than other areas of math (that's not to say it shouldn't be feared, just that other areas should be feared equally so). Although from high school textbooks it's easy to acquire the impression that all problems can be solved without calculus and that it will therefore be of little practical use, the opposite is in fact true: in the real world EVERYTHING is about change, proof of which is the fact that in physics the most important phenomena are described by differential equations, i.e. basically "calculus equations" -- it turns out that many things depend on the rate of change of some variable rather than on the variable's direct value: for example air friction depends on how fast we are moving (how quickly our position is changing), our ears hear thanks to CHANGE in air pressure, electric current gets generated by CHANGE of magnetic field etc. Calculus is very similar to (and sometimes interchangeably used with) mathematical analysis (the difference is basically that analysis tries to prove what calculus does, at least according to the "Internet"). The word calculus is also sometimes used to signify any "system for making calculations", for example lambda calculus.
Is this of any importance to a programmer? Fucking YES, you can't avoid it. Physics engines, machine learning, smooth curves and surfaces in computer graphics, interpolation and animation, statistics, scientific simulations, electronics, robotics, signal processing and all kinds of other various shit REQUIRE at least the basics of calculus.
In essence there are two main parts to calculus, two mathematical "operations" that work with functions and are opposite to each other:

- derivative: finds how quickly a function grows at any given point, i.e. the slope of its graph.
- integral: the opposite operation; among other things it computes the area under the function's graph.
One thing shows here: one of the reasons why calculus is considered advanced is probably that instead of simple numbers we suddenly start working with whole functions, i.e. we have operators that we apply to a function and we get a new function back -- this requires somewhat more abstract thinking, as a function is harder to imagine than a number. But then again it's not anything too difficult, it just requires some preliminary study to get familiar with what a function actually is etc.
Now listen up, here comes the truth about calculus. Doing it correctly and precisely is difficult and sometimes literally impossible, and this is left to mathematicians. Programmers and engineers HAVE TO know the basic theory, but we are largely saved by one excellent thing: numerical methods. We can compute derivatives and integrals only approximately, but with algorithms that always work for any function and that will be good enough for almost everything we ever encounter in practice. Besides, in digital computers we deal almost exclusively with non-continuous functions anyway, we just have very dense discrete sets of points, because in the end we only have finite memory, integer values and sampled data, so nothing is more natural than numerical methods here. So where a mathematician spends years trying to figure out how to precisely sum up infinitely many infinitely small parts of some weird function, we just write a program that sums up a very big number of very tiny parts and call it a day. Still, there exist programs for so called symbolic computation that try to automatically do what the mathematician does, i.e. apply reasoning to get precise results, but these belong to quite specialized areas.
[Graph showing a function (x), its derivative (*) and (one of) its integral(s) (#) -- this is the graph printed out by the C program at the end of this article.]
The basics of calculus aren't that hard, however it can go deeper and deeper and one can probably dedicate a whole life just to learning more and more; as you learn the basic derivatives and integrals, you move on to multidimensional calculus, vector calculus, integrating over curves and surfaces, various esoteric methods of analytical and numerical integration etc. etc.
Calculus may also be considered advanced for the fact that -- historically speaking -- it's relatively "new", i.e. it took a long time to develop it and ancient and medieval civilizations existed without it despite otherwise having quite impressive math already. Of course precursors to calculus date very far back in history, parts of it and some special case problems were examined and solved, but it wasn't until the 17th century that it was developed into a complete, general discipline. That happened thanks to Newton and Leibniz (who happened to develop it independently of each other).
Derivative finds how quickly a function grows at any given point. DOING derivatives is called differentiation (confusingly, because differential is a term distinct from derivative). Since derivative and integral are opposite operations, one would assume they'd be equally difficult to handle, but no, the derivative is the easier part! So it's always taught first. It's kind of like multiplication and division -- multiplication is a bit easier (division has remainders, undefined division by zero etc.).
NOTE on notation: there are several notations used for derivatives. We will use a very simple one here: f'(x) to us is the derivative of a function f(x). Mathematicians will probably rather like to write d/dx f(x). Just know that this is a thing.
OK, BUT what exactly IS this "derivative"? What does it say? Basically the derivative is the slope of the tangent to the graph of a function at a given point. The derivative of a function f(x) is a new function f'(x) which for a given x says the slope of the graph of the function f(x) at the point x. Slope here literally means the tangent function, which encodes the angle at which the function is increasing (or decreasing). Tangent is defined as the (unitless) ratio of vertical change to horizontal change (for example if a plane is ascending with tangent equal to 2, we know that for every horizontal meter it gains two meters of height). Note that this is mathematically idealized so that no matter how quickly the function changes we really mean the slope at the exact single point, i.e. imagine drawing a tangent line to the graph of the function and then measuring how quickly it changes vertically versus how quickly it changes horizontally. Mathematicians define this using limits and infinitesimal intervals, but we don't have to care too much about that now, let's just assume it all magically works for now.
Here it is shown graphically:
   tangent        /           __
    line         /      .' ''..
                /   __.'f(x)
               /-''
              /|
         __../:|dy
     _-'    /__|
           / dx
          /   :
         /    :
              :
--------------+--------------->x
              A
Here we see a tangent line drawn at the graph of the function f(x) at point A. We can draw the small right triangle like shown -- the derivative at point A is then literally computed by dividing dy by dx. We can actually approximate the ideal derivative (and this is kind of how computers do it with numerical methods) by computing (f(x + C) - f(x)) / C where C is set to some small number, for example 10^-10. It's basically how it's mathematically defined too, mathematicians just set the C to an "infinitely small distance". From this notice that the derivative will be:
- positive if the function is increasing. This is because dy will be positive and since dx is always positive, we'll get a positive number by dividing them.
- negative if the function is decreasing: dy will be negative, so dividing it by the (always positive) dx gives a negative number.
- zero if the function is neither increasing nor decreasing at that point (a constant segment, the bottom of a "valley", the top of a "hill"): here dy is zero and so is the result of the division.
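Just to see the formula in action, here is a minimal C sketch of the numerical approach (the function x * x and the point 3 are made-up examples; the exact slope of x^2 at 3 is 6):

#include <stdio.h>

double f(double x) // made-up example function
{
  return x * x; // its exact derivative is f'(x) = 2 * x
}

int main(void)
{
  double c = 0.0000000001; // the small number C, here 10^-10

  // approximate f'(3) as (f(x + C) - f(x)) / C, prints approx. 6:
  printf("%f\n",(f(3 + c) - f(3)) / c);

  return 0;
}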
Now it's important to say that derivatives can only be done with differentiable functions, i.e. ones that in fact DO have a derivative. This cyclic definition only says there indeed exist functions which are NOT differentiable -- imagine for example a function f(x) that gives 0 for every x except when x = 1, where f(1) = 1 -- what's the slope of such a function at x = 1? How the hell do you wanna differentiate that? Firstly it's infinite (the tangent line goes completely vertically and computing dy/dx here just results in division by zero), but we don't even know if it's going up or down (it goes up from the left but down to the right), it's just fucked up. Also a function that has holes (is not defined everywhere) clearly isn't differentiable in those holes, because if there's nothing to differentiate then what do you wanna do? A function that's not differentiable everywhere may of course still be differentiable in certain parts, but in general when we claim a function is differentiable we imply it's differentiable everywhere. It may also be the case that a function is differentiable but its derivative is not. It actually gets a bit more complicated still: functions may also be only partially differentiable, a derivative may exist only from "one side", but we won't go into this. There exist conditions that must hold for a function to be differentiable, for example it must be continuous and smooth and whatever, just look that up if you need.
OK so to actually compute a derivative of a function we can use some of the following rules:
f(x) | f'(x) | comment |
---|---|---|
n | 0 | additive const. |
x^n | n * x^(n-1) | var. to power |
e^x | e^x | |
sin(x) | cos(x) | |
cos(x) | -sin(x) | |
ln(x) | 1/x | |
a * g(x) | a * g'(x) | |
g(x) + h(x) | g'(x) + h'(x) | |
g(x) * h(x) | g'(x) * h(x) + g(x) * h'(x) | |
g(h(x)) | g'(h(x)) * h'(x) | chain rule |
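For example, to differentiate the composed function sin(x^2) we use the chain rule with g(x) = sin(x) (whose derivative is cos(x)) and h(x) = x^2 (whose derivative is 2 * x):

f(x) = sin(x^2)
f'(x) = cos(x^2) * 2 * x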
Monkey example: we're about to find the derivative of this super retarded function:
f(x) = x^2 - 2 * x + 3
Its graph looks like this:
         :|          :
  3       +         :
          |:       :
  2       + '.._..'
          |
  1       +
          |
   --+----+----+----+--
    -1   0|   1    2
          |
To differentiate this function we only need to know (from the table above) that a derivative of a sum equals the sum of derivatives and then just invoke a simple rule: the derivative of x^N is N * x^(N-1). We have very little work to do here because there are no composed functions and similar shit, so we simply get:
f'(x) = 2 * x - 2
So x^2 became 2 * x, -2 * x became just -2 (because x^0 = 1) and 3 just disappeared (this always happens to additive constants -- notice that such constants don't affect the function's slope in any way, so that's why). The graph of the derivative looks like this:
          |
  2       +         /
          |       /
  1       +      /
          |     /
   --+----+----+----+--
    -1   0|   /1    2
          |  /
        -1+ /
          |/
        -2+
Things to notice here are:

- Where the original function was decreasing (x < 1), its derivative is negative, and where it was increasing (x > 1), the derivative is positive.
- At x = 1 the derivative is exactly zero -- that's where the original function has its minimum (the "bottom of the valley" where the slope is horizontal). This is super useful: looking for places where the derivative equals zero is THE way of finding minima and maxima of functions.
- The derivative is a line, i.e. the slope of the parabola changes at a constant rate.
OK but what if we differentiate the derivative lol? This is legit, it will give us a higher order derivative and it is very useful and common. If we see the first derivative as the "speed" of the function's change, the second order derivative gives us the "speed" of that speed, i.e. basically acceleration. We will write the second order derivative of a function f(x) as f''(x). This can for example tell us where the function is convex versus concave (how it is "bent"), which again helps with finding minimum and maximum values etc. Of course we may continue and make a third order derivative, fourth etc.
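Numerically a second order derivative can be approximated for example by the well known central difference formula (f(x + h) - 2 * f(x) + f(x - h)) / h^2 -- a minimal C sketch follows (the function x^3 and the point 2 are again just made up; the exact second derivative is 6 * x, so 12 at 2):

#include <stdio.h>

double f(double x) // made-up example function, f''(x) = 6 * x
{
  return x * x * x;
}

int main(void)
{
  double h = 0.0001; // with h^2 in play h mustn't be too small (floating point!)

  // central difference for f''(2), prints approx. 12:
  printf("%f\n",(f(2 + h) - 2 * f(2) + f(2 - h)) / (h * h));

  return 0;
}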
Next we must mention partial derivatives which are basically multidimensional derivatives, i.e. ones we do with functions of multiple variables. There is one important thing to mention: when differentiating a function of multiple variables, we have to say which variable we are differentiating against, which is an equivalent of choosing the axis along which we differentiate. Practically this will result in us treating the non-chosen variables as if they were constants. So say we have a function of two variables f(x,y): we can differentiate it against the variable x and also against y, i.e. we get two different derivatives. If we imagine the function f(x,y) as a two dimensional heightmap, then the derivative against x means we are getting a slope as if we're going in the x axis direction (and accordingly the same holds for y). This is why they're called partial derivatives: there are multiple derivatives, multiple parts. Making a vector out of all the partial derivatives gives us a gradient, which is kind of an "arrow" that can tell us in which direction the increase/decrease is the fastest. This is very important for example for machine learning where we are trying to minimize the error function by following the path of the gradient etc. All this is beyond the scope of this article though.
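For illustration here is a minimal C sketch that numerically computes both partial derivatives, i.e. the gradient, of a made-up function f(x,y) = x * x + 3 * y (the exact partial derivatives are 2 * x against x and 3 against y):

#include <stdio.h>

#define H 0.0001 // small step for the approximation

double f(double x, double y) // made-up example function
{
  return x * x + 3 * y;
}

int main(void)
{
  double x = 2, y = 1;

  double dfdx = (f(x + H,y) - f(x,y)) / H; // y fixed, exact value 4
  double dfdy = (f(x,y + H) - f(x,y)) / H; // x fixed, exact value 3

  printf("gradient at (2,1): (%f,%f)\n",dfdx,dfdy);

  return 0;
}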
Integral is the opposite of derivative. There are usually two main ways to interpret what an integral means:

- the "algebraic" one: the integral of a function f(x) is a new function F(x), the so called antiderivative, whose derivative gives back the original function, i.e. F'(x) = f(x).
- the "geometric" one: the integral measures the area enclosed between the graph of the function and the x axis (with areas below the axis counting as negative).
Both of these interpretations are equivalent in that we will compute the same thing, they only differ in how we think of what we are computing.
As already claimed in the section on derivative, integrating is more difficult than differentiation. Some reasons for this are:

- There is no mechanical recipe that always works: while the differentiation rules handle basically any combination of elementary functions, with integrals methods like substitution or per partes only sometimes lead somewhere and require guessing and experimenting.
- Some innocent looking elementary functions (the infamous e^(-x^2) for example) simply don't have an integral expressible with elementary functions at all.
- The result isn't even unique: an arbitrary constant can always be added (see the constant C below), i.e. a function has infinitely many integrals.
So due to these complications we now also have to explain the two different types of integrals:

- indefinite integral: here the result of integration is a whole function (the antiderivative mentioned above), determined up to the additive constant C.
- definite integral: here we integrate over a specific interval from A to B and the result is a single number: the (signed) area under the graph over that interval. It's computed by taking an indefinite integral F(x) and then subtracting its value at the lower bound from its value at the upper bound, i.e. F(B) - F(A).
Fun fact: before digital computers engineers used very clever methods to find definite integrals of general functions. Analog computers were particularly good at integrating, their continuous nature makes them a quite elegant solution to the problem, however a perhaps even more genius method in its simplicity was the following: the engineer would draw the function he wanted to integrate on a sheet of paper (or maybe more preferably some kind of heavier material), then cut it out and simply weigh it -- this would give him the fraction of the weight of the whole sheet and therefore also the fraction of the area below the function's graph.
Example: we will now try to make an indefinite integral of the function:
f(x) = 2 * x - 2
This is the derivative we got in the example of differentiation, so by integrating we should get back the original function we differentiated there.
Now for the notation: the symbol for integral is kind of a big italic S (Unicode U+222B), but for simplicity we will just use the uppercase letter I here. With indefinite integrals the symbol alone is used. For definite integrals we additionally write the interval over which we integrate, i.e. I(A,B) (normally A is written at the bottom and B at the top of the symbol), where A and B delimit the interval. So we will now write our indefinite integral like this:
I (2 * x - 2) dx
Wait dude WHAT THE FUCK is this dx shit at the end? This question is expected. Look: it has to do with the theory behind what the integral mathematically means; for starters one can just ignore it and remember that an integral starts with I, then the integrated function follows, and then there is dx at the end. But to give a bit of explanation: firstly notice the dx tells us what the integrated variable is -- usually we have a function of a single variable x and so it's pretty clear, but once we move to more dimensions we'll have more variables and this dx tells us which one is the variable (i.e. along which axis we are integrating) and what is to be treated as a constant (maybe this doesn't yet make much sense, but with integration there is a big difference between a variable and a constant, even if both are represented by a letter). The real reason for dx is that the integral really represents an infinite sum. Have you ever seen that big sigma symbol for a sum? The integral symbol (here I) is like this, it likewise says "make an infinite sum of what follows". But if we take a function, make infinitely many steps and keep summing the values the function gives us, we will just get infinity as the sum, so something is missing. In fact we don't want to sum the function values themselves but rather the areas of "tiny strips" we are kind of drawing below the function's graph -- now a strip is basically a rectangle: the area of a rectangle is computed as its height times its width. The height of the strip is the function value (here 2 * x - 2) and its width is dx, which represents the "infinitely narrow" interval. This is just to give some idea about WHY it looks like this, but it's cool to ignore it for now.
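To demonstrate the tiny strip business, here is a minimal C sketch that sums such strips (with a small but finite width dx) below a made-up function x^2 over the interval 0 to 1; the exact answer, by the rule shown right below, is 1/3:

#include <stdio.h>

double f(double x) // made-up example function
{
  return x * x;
}

int main(void)
{
  double dx = 0.0001; // width of one tiny strip
  double sum = 0;

  for (double x = 0; x < 1; x += dx)
    sum += f(x) * dx; // area of one strip: height f(x) times width dx

  printf("%f\n",sum); // prints approx. 0.333..., i.e. 1/3

  return 0;
}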
So now the fuck we can finally move on. Our integral is really easy because it's just a sum of two expressions (and an integral of a sum thankfully equals the sum of integrals) that can each be integrated easily. So from the rule I x^N dx = x^(N + 1) / (N + 1) we deduce that the integral of 2 * x is 2 * x^2 / 2 = x^2 and the integral of -2 is -2 * x, so we get:
I (2 * x - 2) dx = x^2 - 2 * x + C
A few things to note here now:

- There suddenly appears the constant C. Why? Recall that differentiation destroys additive constants (they don't affect the slope), so when integrating back we simply cannot know what constant was there originally -- we express this unknown constant with the letter C. ANY value of C yields a valid integral here, i.e. a function really has infinitely many integrals, all differing just by a vertical shift.
- Up to this constant we indeed got back the function we differentiated in the example above (x^2 - 2 * x + 3) -- the original 3 is now hidden in the C.
Our example integral wasn't that hard, right? Yes, this was extremely easy, but once you start integrating something with composed functions (functions inside other functions) you'll get into all sorts of trouble.
Now let's finish with computing a definite integral, OK? Let's say we want to compute the integral over interval 0 to 1, i.e. we'll write:
I(0,1) (2 * x - 2) dx
Above we said this is done by computing the indefinite integral (already done), then plugging in the upper and lower bound and subtracting, so let's do it:
I(0,1) (2 * x - 2) dx = (1^2 - 2 * 1 + C) - (0^2 - 2 * 0 + C) = -1
Things to notice here:

- The constants C cancel out! So for definite integrals we can just ignore C (or imagine it being set to 0).
- The result is negative: over the interval 0 to 1 our function 2 * x - 2 lies below the x axis, and such area counts as negative.
- The absolute value checks out: between x = 0 and x = 1 the function and the axes enclose a right triangle with legs of lengths 1 and 2, i.e. of area 1 * 2 / 2 = 1.
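We can also quickly check this result numerically, again by summing tiny strips as described above -- a minimal C sketch:

#include <stdio.h>

double f(double x) // the function from our example
{
  return 2 * x - 2;
}

int main(void)
{
  double dx = 0.0001, sum = 0;

  // sum the strips over the interval 0 to 1:
  for (double x = 0; x < 1; x += dx)
    sum += f(x) * dx;

  printf("%f\n",sum); // prints approx. -1, agreeing with the computation above

  return 0;
}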
For completeness here are some rules for integration:
f(x) | I f(x) dx | comment |
---|---|---|
a * x^n | a * (x^(n+1))/(n+1) + C | |
cos(x) | sin(x) + C | |
sin(x) | -cos(x) + C | |
e^x | e^x + C | |
1/x | ln(x) + C | |
a * g(x) + b * h(x) | a * (I g(x) dx) + b * (I h(x) dx) + C | |
g(x) * h(x) | g(x) * (I h(x) dx) - (I g'(x) * (I h(x) dx) dx) + C | per partes |
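For example the per partes rule untangles the integral of x * e^x: take g(x) = x and h(x) = e^x, so that I h(x) dx = e^x and g'(x) = 1, and we get:

I x * e^x dx = x * e^x - I e^x dx = x * e^x - e^x + C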
However note that applying these rules is generally not as simple as with differentiation; there exist methods such as per partes or substitution, but they don't tell you exactly how or when to apply them, so you have to experiment -- like said, this is an entertainment left to those who just enjoy doing math.
Can we do higher order integrals and partial integrals? Yes, of course, just like with derivatives we can do both of these.
Here is a small C code that produces the image at the top showing a graph of a function, its derivative and integral. Please keep in mind this is the most naive example using the simplest algorithms that in practice would be too inaccurate and/or inefficient, but it's good for demonstration. For shorter code we resort to using floating point, but of course we can always avoid it with fixed point. You can try to play around with the function and see how its derivative and integral change. Note that the plotted integral is indeed just one of the infinitely many integrals that would be differently vertically shifted by the constant C -- here we just plot the one that at x = 0 goes through 0.
#include <stdio.h>
#include <math.h>

#define GRAPH_RESX 64  // ASCII graph resolution
#define GRAPH_RESY 28
#define GRAPH_SIZE 2.5 // interval shown in the graph
#define DX 0.01        // step for the numeric methods

double f(double x) // our function
{
  return 1 + sin(2 * x) + 0.2 * x * x;
}

// numerically approximates the derivative of f at point x
double derivative(double (*f)(double), double x)
{
  return (f(x + DX) - f(x)) / DX;
}

/* numerically approximates the integral of f from 0 to x (i.e. the one
integral that's 0 at x = 0) by summing up tiny strips of width DX */
double integral(double (*f)(double), double x)
{
  int steps = x / DX;
  double r = 0;
  int flip = x < 0; // for negative x sum from x to 0 and negate

  if (x < 0)
    steps *= -1;
  else
    x = 0;

  while (steps)
  {
    r += f(x) * DX; // add the area of one strip
    steps--;
    x += DX;
  }

  return flip ? -1 * r : r;
}

char graphImage[GRAPH_RESX * GRAPH_RESY];

void graphDraw(double x, double y, char c) // plots point at given coords
{
  int drawX = ((x + GRAPH_SIZE) / (2 * GRAPH_SIZE)) * GRAPH_RESX,
    drawY = GRAPH_RESY - ((y + GRAPH_SIZE) / (2 * GRAPH_SIZE)) * GRAPH_RESY;

  if (drawX >= 0 && drawX < GRAPH_RESX && drawY >= 0 && drawY < GRAPH_RESY)
    graphImage[drawY * GRAPH_RESX + drawX] = c;
}

int main(void)
{
  // clear the graph image:
  for (int i = 0; i < GRAPH_RESX * GRAPH_RESY; ++i)
    graphImage[i] = (i % GRAPH_RESX) == GRAPH_RESX / 2 ? ':' :
      ((i / GRAPH_RESX) == GRAPH_RESY / 2 ? '-' : ' ');

  // now plot the function, its derivative and integral:
  for (double x = -1 * GRAPH_SIZE; x < GRAPH_SIZE;
    x += GRAPH_SIZE / (2 * GRAPH_RESX))
  {
    graphDraw(x,integral(f,x),'#');
    graphDraw(x,derivative(f,x),'*');
    graphDraw(x,f(x),'x');
  }

  // draw the graph:
  for (int i = 0; i < GRAPH_RESX * GRAPH_RESY; ++i)
  {
    putchar(graphImage[i]);

    if ((i + 1) % GRAPH_RESX == 0)
      putchar('\n');
  }

  return 0;
}
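The program needs nothing besides the standard library; it can be compiled e.g. with cc -o calculus calculus.c -lm (assuming we saved it as calculus.c -- the -lm flag links in the math library needed for sin) and when run it simply prints the graph out.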
Powered by nothing. All content available under CC0 1.0 (public domain). Send comments and corrections to drummyfish at disroot dot org.