Recently, AI has come into the spotlight. AI stands for Artificial Intelligence, but in this context, it is not actual intelligence. It’s the term used to describe an algorithm that imitates one aspect of intelligence. How do programmers achieve this?
To understand AI, you must first understand computers. A computer is a logic and math engine. It is, in essence, a very powerful calculator, so the only tools available to it are logic and math. Any solution must be built with these tools. Creating AI for a particular topic begins with a database of known data. Suppose you want to interpret hand-drawn numbers. You would start with a database of tens of thousands of hand-drawn numbers, pairing each image with a human-supplied label. The AI must take this database and learn how to interpret new data based on the existing images. The algorithm used for this is called a neural net.
Neural nets were originally named after neurons in the brain. They aren’t actually neurons, nor do they represent their function. It’s just the name for the algorithm. A neural net algorithm is a giant statistical equation. This equation can be trained on the known database, which takes a considerable amount of time. Once it is trained, it can be applied to new data. Suppose our neural net has been trained with the hand-drawn number database. When it receives an image of a number, it applies an equation to that image. Its output is the digits zero through nine and the probability of each. It selects the most probable outcome as the matching digit. In essence, what a neural net does is compare a new image to a database of known images. It finds the closest match and declares that the result.
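The final step described above can be sketched in a few lines. This is not a real trained network; the raw scores below are made up for illustration, standing in for whatever the trained equation produces for one input image. The sketch only shows how raw scores become probabilities for the digits zero through nine, and how the most probable digit is selected.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to one."""
    exps = [math.exp(s - max(scores)) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores a trained network might emit, one per digit 0-9.
raw_scores = [0.1, 0.3, 2.5, 0.2, 0.0, 0.1, 0.4, 5.1, 0.2, 0.3]
probabilities = softmax(raw_scores)
best_digit = max(range(10), key=lambda d: probabilities[d])
print(best_digit)  # → 7, the digit with the highest probability
```

The "softmax" step is a common way to turn scores into probabilities, which is why the network can always report how confident it is in each digit.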
A neural net algorithm will never be perfect. It can reach a high level of accuracy, over ninety-nine percent, but it has a limit. It is simply an equation that guesses. If you give it random data, something that doesn't represent any number, it will confidently return the closest statistical match, no matter how wrong that is. It is an equation; it cannot understand the data or how to interpret it properly.
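This failure mode is easy to see in miniature. In the sketch below, random scores stand in for what a network might emit when fed pure noise; the point is that picking the maximum always yields a digit, because the equation has no way to say "this is not a number."

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Made-up scores standing in for a network's response to a noise image,
# one score per digit 0-9.
scores_for_noise = [random.random() for _ in range(10)]
guess = max(range(10), key=lambda d: scores_for_noise[d])

# A digit is always returned, even though the input represented nothing.
print(guess)
```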
Neural net algorithms lie at the heart of popular AI applications. How do these applications work? Let’s start with image creation.
Imagine an algorithm that removes noise from an image (noise being random, incorrectly colored pixels). Noise is common in all cameras, so this is easy to visualize. To reduce the noise, a simple algorithm must guess the values of the affected pixels. It takes a noisy image and reduces the noise by guessing based on the surrounding pixels. It can't know what was originally there, so it just guesses something reasonable. A neural net algorithm can instead guess what the picture is about and use that to guide its guesses. We'll call this our anti-noise algorithm.
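The "simple" version of such a denoiser can be sketched directly: each pixel is replaced by the average of itself and its neighbors, which is a reasonable guess based on the surrounding pixels. The tiny grid of grayscale values below is a stand-in for a real image; a neural-net denoiser would make far smarter guesses.

```python
def denoise(image):
    """Replace each pixel with the average of itself and its in-bounds neighbors."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neighbors = [image[ny][nx]
                         for ny in range(max(0, y - 1), min(h, y + 2))
                         for nx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(neighbors) / len(neighbors)
    return out

noisy = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],   # the lone bright pixel is the "noise"
         [0.0, 0.0, 0.0]]
smoothed = denoise(noisy)
# The bright noise pixel is now much dimmer, blended into its surroundings.
```

Averaging smears the noise away, but it also smears away real detail, which is why guiding the guess with an idea of what the picture shows works so much better.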
Now start with an image that contains nothing but random noise. Take the anti-noise algorithm, but seed it with a few keywords or sentences. This primes it in the direction we want. Run the anti-noise algorithm repeatedly, using the output of each pass as the input for the next. Each time, the algorithm removes a little noise and replaces it with a guess. Because we primed that guess with keywords or sentences, the neural net makes those things more likely to appear. After fifty or so iterations, you have a clean, randomly created image. This is how AI-generated images are made. Note that you can also start with a rough sketch, add noise to it, and use that as the seed.
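The loop above can be sketched as a toy. Real image generators use a trained neural net guided by text; here the "anti-noise" step simply nudges each value a little toward a made-up target pattern that stands in for what the prompt wants. The point is the structure of the loop: start from pure noise, repeatedly feed each output back in, and the primed guess gradually takes over.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

target = [0.0, 1.0, 1.0, 0.0]              # stand-in for "what the prompt wants"
image = [random.random() for _ in target]   # start from pure random noise

for step in range(50):                       # fifty or so iterations, as in the text
    # Each pass removes a little noise, replacing it with a primed guess.
    image = [pixel + 0.2 * (goal - pixel)
             for pixel, goal in zip(image, target)]

print([round(p, 3) for p in image])  # → [0.0, 1.0, 1.0, 0.0]
```

After fifty passes the random starting values are gone and only the prompt-guided pattern remains, just as the noise vanishes from a generated image.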
When these anti-noise algorithms first appeared, they created really rough and distorted images. So programmers combined image creation with image interpretation. They used the output to make each neural net equation more precise. This is how they iteratively train these algorithms. A similar process also works for AI-generated music.
Generating images or music is a computationally expensive process. To speed it up, they reduce the image size. For example, the initial image is often tiny, such as sixty-four by sixty-four pixels. To generate a larger image, they use a similar anti-noise algorithm. They take the image created by the iterative process and expand it bit by bit. Again, they guess pixels and fill them in as needed. This generates the final image. The image is not truly random; given the same initial seed, it will create the same output.
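The expansion step can also be sketched in miniature. Real systems use a trained anti-noise net to guess the new pixels; the linear interpolation below is just a stand-in for that guess, shown on a single row of grayscale values.

```python
def upscale_row(row):
    """Double a row's width, guessing each new pixel as the average of its two neighbors."""
    out = []
    for i, value in enumerate(row):
        out.append(value)
        if i + 1 < len(row):
            out.append((value + row[i + 1]) / 2)  # guessed in-between pixel
    return out

small = [0.0, 1.0, 0.5]
large = upscale_row(small)
print(large)  # → [0.0, 0.5, 1.0, 0.75, 0.5]
```

Repeating this on every row and column, bit by bit, turns a tiny sixty-four by sixty-four image into a full-sized one.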
Training these neural net algorithms requires a considerable amount of resources. It requires millions to billions of categorized images to start, then additional time and processing for feedback training.
Language AIs, such as ChatGPT, use a neural net algorithm trained on text. GPT-3, for example, was trained on forty-five terabytes of text. It uses a sequence-type neural net. Over the past decades there have been several types, with the newest labeled the Transformer Neural Net. The basic idea is that the text you enter is first converted into a series of numbers. That sequence of numbers is then processed by the neural net algorithm, which produces its output one word at a time. Unlike image processing, a bit of randomness is added, so the output changes from run to run. This randomness helps train the system: users can regenerate the output and rank the results, and that feedback can be used to improve the neural net. However, just like with images, the neural net cannot understand the material. It is only a statistical engine. If you ask it something, it simply outputs one of the highest-ranked responses.
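The word-at-a-time step with added randomness can be sketched as follows. The three-word vocabulary and its scores are made up; a real model scores tens of thousands of tokens at each step. A "temperature" value is a common way such systems control how much randomness enters the choice.

```python
import math
import random

def sample_next_word(scores, temperature, rng):
    """Pick one word at random, weighted by its score; higher scores win more often."""
    weights = {word: math.exp(s / temperature) for word, s in scores.items()}
    total = sum(weights.values())
    r = rng.random() * total
    for word, weight in weights.items():
        r -= weight
        if r <= 0:
            return word
    return word  # fallback for floating-point edge cases

# Hypothetical scores for the next word after some prompt.
scores = {"cat": 2.0, "dog": 1.5, "pizza": 0.2}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
words = [sample_next_word(scores, temperature=0.8, rng=rng) for _ in range(5)]
print(words)  # mostly the highest-scored word, but randomness lets others appear
```

Because the choice is weighted rather than fixed, regenerating the output gives a different response, which is what makes user ranking of alternatives possible.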
The potential applications of AI are indeed numerous and diverse. Transformer Neural Nets were originally designed for translation, which makes language AIs excellent at translating and correcting grammar errors. Unlike traditional search engines, language AIs have the advantage of not requiring exact terminology to find specific information, expanding their utility as an information source.
Image AIs are also powerful, enabling anyone to quickly sketch something and have an AI enhance it, eliminating the need for artists in certain contexts. While generic music generation is possible, current AI capabilities still fall short of producing top-quality music. However, AI-generated background music for advertisements or videos is already achievable. (e.g., check out https://soundraw.io)
Another promising application lies in voice interpretation and response, where AI-powered systems can understand and interact with human speech. The versatility of neural net AIs leads to numerous special applications that extend beyond translation, language, and image processing, making it impossible to list them all.
The continuous advancements in AI technology promise even more exciting possibilities and improvements across various domains in the future. They are limited only by the nature of neural nets and the lack of true machine understanding. This means AI can be a tool for enhancing productivity, but a human must always direct an AI.