Computers use numbers, not letters or images. So how are images stored as numbers?

Start with a pixel. A pixel is a single dot on the screen. It contains numbers that represent the red, green and blue light levels. An image is a large number of pixels. These pixels are saved in an array. An array is a block of reserved memory. It contains width * height pixels, and each pixel contains three numbers representing red green and blue.

Red, green and blue is a simplification. Black and white images can contain a single number that represents black to white. Some images can use yellow, cyan and magenta, which are used for printing. In addition, they may also have alpha which represents transparency.

Bit depth is the number of bits that make each pixel. 24-bit images have 24 bits per pixel. This is typically 8 bits per color. A 32 bit image can have 10-11-11 bit color depth or 8-8-8-8 red, green, blue and alpha.

Old images sometimes use a palette. Old computers lacked memory, so programmers had to work around that. One way to shrink an image is to take all common colors and move them into a palette. A palette contains a set number of colors. Each color contains red, green and blue values. The image then needs to store the palette index instead of the red green and blue values. It provides a significant size reduction at a cost of color smoothness.

Images are saved and loaded. To save disk space, nearly every image format uses tricks to reduce the file size. This includes compression. Some compression formats, such as JPEG, lose data. They are called a lossy compression. While they lose details, the gist of the image can be saved in a much smaller space. There are often multiple versions of each. Open standards exist for most image file formats.

Programs can create images. You can create an image directly, but most programmers use image libraries. An image library provides a set of functions that create images, draw things on it then output it in various formats. Writing an image library is a fun task that most programmers do at least once in their career. The algorithms are precise and clear.

Most image libraries organize memory in the same way. The block of memory (width * height * bit-depth) is set so the top left pixel is first. The pixel to the right is next in memory. At the end of the row, it drops one point and keeps going. The first image libraries used this because it was easy to convert from (x,y) to a memory address. It’s just pixel[y*width+x]. So to set a pixel you just use:

void set(x,y,color) { pixel[y*width+x] = color; }

Now half the programmers are annoyed over lack of input validation. In truth, input validation isn’t always needed. If you are writing this for your own use, it’s often easier to build validation into the algorithms. However, if you are writing a library then you can add safe functions for people who want to use them.

int set_safe(x,y,color) { if (x < 0 || x >= width || y < 0 || y >= height) return -1; set(x,y,color); return 0; }

The next basic function found in all images is line. It draws a line between two points. There are plenty of line drawing algorithms, but some are far simpler than others. There’s even one that doesn’t use division. While this may seem trivial, you sometimes need line drawing algorithms in embedded processors. For example, 3D printers. This line drawing algorithm requires you to check if a line is mostly horizontal or vertical.

(x1, y1) – First point

(x2, y2) – Second point

A simple line looks like this:

If you zoom in, you’ll find a new perspective.

Sometimes these will be blurred, so some are blacker than others. This is called anti-aliased. An aliased image is easier to visualize. Notice that for every two pixels over, it moves one pixel up. Normal line algorithms rely on division to calculate the y value. Typically it is (y2-y1)/(x2-x1)=(y-y1)/(x-x1). Or rearrange to get: y = y1 + (x-x1)*(y2-y1)/(x2-x1). This is the typical geometric approach. Division is typically computationally expensive, or at least far more than multiplication or addition. It also takes additional die space if you are implementing this in hardware, such as through an FPGA.

To eliminate the division we first notice a pattern. When Δx is twice that of Δy, we move two pixels over and one pixel up. When Δx is thrice Δy, we move three pixels over and one pixel up. This isn’t always clear cut. For example if Δx is two and a half times Δy, we move over two pixels, up one, over three, up one then repeat. We can use a simple math trick to calculate this. Every time we draw a pixel, we add Δy to a sum variable. When this variable is greater than Δx, we shift the y value and subtract Δx from the sum.

There’s one more wrinkle which you’ll find after the line draws. It will be shifted over slightly to the right. To correct this, we start the sum as one half of Δx. Ah, but there wasn’t supposed to be division. In this case it’s alright, because it’s division by two. This is accomplished with a bit shift, and it’s extremely simple to do in hardware.

This trick only works if the line is mostly horizonal, in other words if Δx > Δy. That’s alright, because we can just have two algorithms: one for mostly horizontal and the other for mostly vertical. Let’s assume the line is mostly horizontal. First, to simplify the algorithm, we ensure the first point is always on the left. This is simple.

if (x1 > x2) { int tmp = x2; x2 = x1; x1 = tmp; tmp = y2; y2 = y1; y1 = tmp; }

Then we draw the line using the described algorithm.

int dx = x2 - x1; int dy = y2 - y1; int step = sign(dy); dy = abs(dy); int y = y1; int sum = dy >> 1; for (int x = x1;x <= x2;x += 1) { set_safe(x,y,color); sum += dy; if (sum >= dx) { sum -= dx; y += step; } }

There are various ways to optimize this further. You could have two algorithms for horizontal, one for each left-right orientation. The same applies for the vertical line algorithm. This would eliminate the need for the initial swap. It requires more code but runs a slight bit faster, which is the perpetual trade-off between code size and run time. Additionally if you know the points always lie in the image then you can optimize it to write directly to the pixel array.

int dx = x2 - x1; int dy = y2 - y1; int step = sign(dy) * width; dy = abs(dy); int y = y1; int sum = dy >> 1; int *mem, *stop = pixel + y2 * width + x2; for (mem = pixel + y1 * width + x1; mem != stop; mem++) { *mem = color; sum += dy; if (sum >= dx) { sum -= dx; mem += step; } } *mem = color;

If coordinates can be given outside the image, then you can calculate the intercepts and draw the line between the intercept points. That’s another potential optimization. Optimization should fit how the code is used. In a case like this, a good approach is to have the functions line() and line_safe(). This gives programmers a safe or fast option.

Finally, there is a way of optimizing this further, but it requires self-mutating code. In an algorithm like this, where only parts change depending on initial parameters, then you can avoid additional statements inside loops by dynamically rewriting parts of the code. For example, instead of adding a step value, you can rewrite the code to add an immediate value. This reduces the registers used, which can save a few more clock cycles. Unfortunately most code segments are write protected these days so optimization through self-mutation is no longer a viable method.

Obviously there are tradeoffs between code size and optimization. You can generalize the algorithm so it uses one simple loop. If you’re writing it in assembler, you design it around the available registers. If you have a custom core on an FPGA, then you can make opcodes for some of these sequences and you can make as many custom registers as you want.

Computer graphics is a complex field, but everything condenses into a two-dimensional image. Even 3D graphics are converted into a 2D image before they are displayed. So a fundamental understanding of images are essential for any graphics.

## 2 Responses