### Example - Mandelbrot set

Source code in `hpc/src/mandel`

We start with a standard implementation to calculate the set.

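```c
/* Size of one pixel in the complex plane. */
double dx = (x2-x1)/img->nx, dy = (y2-y1)/img->ny;
for (j = 0; j < img->ny; ++j) {
    for (i = 0; i < img->nx; ++i) {
        /* Point c of the complex plane that corresponds to pixel (i,j). */
        double complex c = x1+dx*i + (y1+dy*j) * I;
        double complex z = 0;
        int count = 0;
        /* Iterate z = z*z + c until |z|^2 >= 4 or the iteration limit is hit. */
        while ((++count < maxiters) &&
               (creal(z)*creal(z)+cimag(z)*cimag(z) < 4.0)) {
            z = z*z+c;
        }
        color_pixel(img,i,j,count,maxiters);
    }
}
```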

The innermost loop is purely scalar: each new value of `z` depends on the previous one. We will see whether some vectorization can be applied here.
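One common way to expose data parallelism in a loop like this is to iterate several points in lockstep, so that the work across points (rather than across iterations) becomes a plain loop over array elements. The following is only a sketch of that idea, not the code from `hpc/src/mandel`: the function name `mandel_batch` and the batch width `LANES` are assumptions made for illustration, and whether a given compiler actually vectorizes the inner loop is not guaranteed.

```c
#include <assert.h>

#define LANES 8  /* hypothetical batch width, chosen only for illustration */

/* Iterate LANES points c[k] = cre[k] + cim[k]*i in lockstep.
   count[k] follows the convention of the scalar loop above: it is the
   iteration at which |z|^2 >= 4 was first seen, or maxiters otherwise. */
void mandel_batch(const double *cre, const double *cim,
                  int maxiters, int *count)
{
    double zre[LANES] = {0.0}, zim[LANES] = {0.0};
    int active[LANES];
    int nactive = LANES;

    assert(maxiters > 0);
    for (int k = 0; k < LANES; ++k) {
        count[k] = maxiters;   /* value kept by lanes that never escape */
        active[k] = 1;
    }

    for (int it = 1; it < maxiters && nactive > 0; ++it) {
        /* No dependence between lanes: this loop is the vectorization candidate. */
        for (int k = 0; k < LANES; ++k) {
            double re2 = zre[k]*zre[k], im2 = zim[k]*zim[k];
            double nre = re2 - im2 + cre[k];
            double nim = 2.0*zre[k]*zim[k] + cim[k];
            int escaped = active[k] && (re2 + im2 >= 4.0);
            if (escaped) { count[k] = it; active[k] = 0; --nactive; }
            if (active[k]) { zre[k] = nre; zim[k] = nim; }  /* escaped lanes stay frozen */
        }
    }
}
```

In an image loop, one call would handle `LANES` consecutive pixels of a row. The price is that all lanes keep running until the slowest point in the batch has finished.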

The program uses some simple functions to save a picture of the set in a PPM file (see directory `hpc/src/utilities`). You can use `display` to view the image on the cluster if you run a local X server.
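For readers unfamiliar with the format, a binary PPM file is just a short text header (`P6`, width, height, maximum color value) followed by raw RGB bytes. The sketch below shows how such a file could be written. The function `write_ppm` and its signature are hypothetical and are not the interface found in `hpc/src/utilities`, which may differ (for example by writing the plain-text `P3` variant).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write an nx-by-ny image as a binary PPM (P6).
   rgb holds nx*ny pixels, row by row, 3 bytes (R,G,B) per pixel.
   Returns 0 on success, -1 on failure. */
int write_ppm(const char *path, int nx, int ny, const unsigned char *rgb)
{
    assert(nx > 0 && ny > 0 && rgb != NULL);

    FILE *f = fopen(path, "wb");
    if (!f) return -1;

    /* Header: magic number, width, height, maximum color value. */
    fprintf(f, "P6\n%d %d\n255\n", nx, ny);

    /* Pixel data: raw bytes, no separators. */
    size_t n = (size_t)nx * (size_t)ny * 3;
    size_t written = fwrite(rgb, 1, n, f);

    if (fclose(f) != 0 || written != n) return -1;
    return 0;
}
```

A file written this way, say under the illustrative name `mandel.ppm`, can then be opened with `display mandel.ppm`.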

We will now apply a series of modifications to the original code and measure the effect these changes have on performance.

- mandel0 - version using the C `complex` type. This is our baseline.
- mandel1 - work directly with the real and imaginary parts of the complex numbers
- mandel2 - iterate over vectors of points in the complex plane
- mandel3 - Ninja version
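To make the first step concrete, here is a minimal sketch of what moving from the `complex` type to explicit real and imaginary parts can look like. It is not the actual `mandel0` or `mandel1` source: the function names `iters_complex` and `iters_reim` are invented for this illustration. The motivation is that `z*z` on a `double complex` may carry extra handling for infinities and NaNs, depending on compiler and flags, while the hand-written form needs only a few multiplications and additions and can reuse the squares computed for the escape test.

```c
#include <assert.h>
#include <complex.h>

/* Baseline-style kernel: the loop from the listing above, wrapped in a
   function (hypothetical name). Returns the iteration count for point c. */
int iters_complex(double complex c, int maxiters)
{
    double complex z = 0;
    int count = 0;
    assert(maxiters > 0);
    while ((++count < maxiters) &&
           (creal(z)*creal(z) + cimag(z)*cimag(z) < 4.0)) {
        z = z*z + c;
    }
    return count;
}

/* Same iteration written on separate real and imaginary parts:
   (a + bi)^2 + c = (a*a - b*b + cre) + (2*a*b + cim) i */
int iters_reim(double cre, double cim, int maxiters)
{
    double zre = 0.0, zim = 0.0;
    int count = 0;
    assert(maxiters > 0);
    while (++count < maxiters) {
        double re2 = zre*zre, im2 = zim*zim;  /* reused by test and update */
        if (re2 + im2 >= 4.0) break;
        zim = 2.0*zre*zim + cim;              /* must use the old zre */
        zre = re2 - im2 + cre;
    }
    return count;
}
```

Both functions return the same counts for simple points such as `c = 1` (3 iterations) or `c = 0` (never escapes, so `maxiters`). Near the boundary of the set, rounding differences between the two formulations could in principle change individual counts.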

I have run these versions on my laptop (Intel Core i7-4600U, Haswell) and on the CICA cluster (AMD Opteron 6344, Piledriver). The results can be found in this PDF.
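For readers who want to repeat such measurements, the following sketch shows one simple way to time a kernel in C. It is not the harness used for the results above: `wall_seconds`, `best_time` and the stand-in workload `sample_kernel` are all hypothetical names, and taking the best of several repetitions is just one common convention for reducing noise from other processes.

```c
#include <assert.h>
#include <time.h>

/* Wall-clock time in seconds, using C11 timespec_get. */
double wall_seconds(void)
{
    struct timespec ts;
    int base = timespec_get(&ts, TIME_UTC);
    assert(base == TIME_UTC);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Run kernel() `repeats` times and return the shortest elapsed time. */
double best_time(void (*kernel)(void), int repeats)
{
    double best = -1.0;
    assert(kernel != NULL && repeats > 0);
    for (int r = 0; r < repeats; ++r) {
        double t0 = wall_seconds();
        kernel();
        double dt = wall_seconds() - t0;
        if (best < 0.0 || dt < best) best = dt;
    }
    return best;
}

/* Stand-in workload for illustration only; a real measurement would call
   one of the Mandelbrot versions here. The volatile store keeps the
   compiler from removing the loop. */
volatile double sink_value;

void sample_kernel(void)
{
    double s = 0.0;
    for (int i = 0; i < 1000000; ++i) s += i * 0.5;
    sink_value = s;
}
```

A call such as `best_time(sample_kernel, 5)` returns the fastest of five runs in seconds. Comparing versions is only meaningful when they are built with the same compiler and flags and run on the same machine.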