Software – Threads

Software for home computers didn’t use threads until the beginning of this century. Home computers had a single processor with a single core, so there was little point. Software still needed background processing to keep interfaces responsive, though, and programming languages had different methods to accomplish this. One method was a thread pool: a program created an internal pool of threads and executed each one in turn, achieving responsiveness while still processing information in the background. In reality, all these fake threads ran on one real thread, so there were no concurrency issues.

Once multiple processors, and processors with multiple cores, were introduced, software could run multiple threads to make use of them. However, problems arose when threads tried to write to the same memory address. Until this point, a program contained multiple segments, where a segment is an allocated range of memory. The code segment holds instructions. The data segment holds initial data. The heap is dynamically allocated and extended as the program runs. The stack segment tracks execution and holds local variables. A new thread on an existing process can’t reuse the same stack segment, or it would trample the original thread’s state. So each thread gets its own stack segment, while the other segments are shared: every thread executes instructions from the code segment and accesses memory in the data segment and the heap.
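
As a minimal sketch using POSIX threads (the variable and function names are just for illustration), the snippet below starts a second thread: both threads see the same global in the data segment and the same malloc’d buffer on the heap, but each has its own stack for locals.

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    int shared_counter = 0;      /* data segment: visible to every thread   */
    int *shared_heap = NULL;     /* heap allocation: also shared            */

    static void *worker(void *arg)
    {
        int local = 42;          /* lives on this thread's own stack        */
        shared_counter++;        /* unsynchronized write -- see locks below */
        shared_heap[0] = local;
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        shared_heap = malloc(sizeof(int));

        pthread_create(&tid, NULL, worker, NULL);
        pthread_join(tid, NULL);

        printf("counter=%d heap=%d\n", shared_counter, shared_heap[0]);
        free(shared_heap);
        return 0;
    }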

So how do you keep threads from writing to the same address, or reading from an address while another thread writes to it? The answer was the thread lock. A lock hinges on the CMPXCHG opcode (if you don’t know assembler, just ignore that part). Thread locks are themselves thread-safe, meaning all threads can use them at the same time, but only one thread can hold the lock at any moment. The others wait until the lock is unlocked before they can continue (unless you use a trylock, which most languages now provide). Various other forms of lock, such as the read-write lock, have also been implemented. Locks let threads use the same memory safely.
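
With POSIX threads this looks roughly like the sketch below (the names are illustrative): pthread_mutex_lock blocks until the lock is free, while pthread_mutex_trylock returns immediately if another thread already holds it.

    #include <pthread.h>
    #include <errno.h>

    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    long balance = 0;                        /* memory the lock protects */

    void deposit(long amount)
    {
        pthread_mutex_lock(&lock);           /* waits until the lock is free  */
        balance += amount;                   /* only one thread is ever here  */
        pthread_mutex_unlock(&lock);
    }

    int try_deposit(long amount)
    {
        if (pthread_mutex_trylock(&lock) == EBUSY)
            return 0;                        /* someone else holds it; give up */
        balance += amount;
        pthread_mutex_unlock(&lock);
        return 1;
    }

The read-write variant (pthread_rwlock_t in POSIX) allows many concurrent readers but only a single writer at a time.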

To use threads, programmers associate a lock with a particular region of memory or a resource. For example, you lock before making changes to a shared memory structure and unlock afterwards. If all threads use the same lock for that structure, nothing goes wrong. Something programmers often don’t realize is that reading and writing files, and many libraries, assume a single thread. If you call a library from multiple threads, first make sure it can handle that. If it can’t, take a thread lock whenever you use it, or create a wrapper for the library that does the locking for you.
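
For example, a thin wrapper around a log file might look like the sketch below (write_log and the file handle are made-up names, and the file is assumed to be opened elsewhere): every thread funnels its writes through one mutex, so output never interleaves.

    #include <pthread.h>
    #include <stdio.h>

    static FILE *log_file;                   /* assumed opened at startup */
    static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Hypothetical wrapper: serializes access to the shared file handle. */
    void write_log(const char *line)
    {
        pthread_mutex_lock(&log_lock);
        fprintf(log_file, "%s\n", line);
        fflush(log_file);
        pthread_mutex_unlock(&log_lock);
    }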

Locks introduce one new problem of their own: deadlock. If one thread takes lock A then lock B, while a second thread takes lock B then lock A, both threads can end up waiting on each other forever. Always keep that in mind. Deadlocks strike intermittently, which makes them hard to debug.
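
A common cure is to agree on a global lock order, as in this sketch: every thread takes lock_a before lock_b, so the circular wait can never form.

    #include <pthread.h>

    pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

    /* Deadlock-prone: thread 1 takes A then B, thread 2 takes B then A. */
    /* Safe version: every thread acquires the locks in the same order.  */
    void transfer(void)
    {
        pthread_mutex_lock(&lock_a);    /* always A first... */
        pthread_mutex_lock(&lock_b);    /* ...then B          */
        /* touch both protected structures here */
        pthread_mutex_unlock(&lock_b);
        pthread_mutex_unlock(&lock_a);
    }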

When designing multi-threaded applications, keep in mind that locks are slow, so minimize the memory that threads share. This means designing the application with multiple threads in mind from the start. A good approach is to split the processing into stages and put queues between the threads, then profile the stages to determine how many threads should service each queue. Not all applications can be arranged like this, but when they can it avoids a lot of contention.
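
One way to build those queues is a small producer/consumer structure guarded by a mutex and a condition variable, sketched below (the type and field names are illustrative, and the mutex and condition variable are assumed to be initialized with pthread_mutex_init and pthread_cond_init before use).

    #include <pthread.h>

    #define QUEUE_SIZE 64

    typedef struct {
        void *items[QUEUE_SIZE];
        int head, tail, count;
        pthread_mutex_t lock;
        pthread_cond_t  not_empty;
    } work_queue;

    void queue_push(work_queue *q, void *item)
    {
        pthread_mutex_lock(&q->lock);
        q->items[q->tail] = item;             /* (full-queue check omitted) */
        q->tail = (q->tail + 1) % QUEUE_SIZE;
        q->count++;
        pthread_cond_signal(&q->not_empty);   /* wake one waiting consumer */
        pthread_mutex_unlock(&q->lock);
    }

    void *queue_pop(work_queue *q)
    {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0)                 /* sleep until work arrives */
            pthread_cond_wait(&q->not_empty, &q->lock);
        void *item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_SIZE;
        q->count--;
        pthread_mutex_unlock(&q->lock);
        return item;
    }

The lock is only held for the handful of instructions that touch the queue, so the threads spend most of their time running without contending.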

Over a decade after thread pools were originally invented, other companies reinvented them and marketed them as something new. (This is a recurring theme in programming: the field hasn’t changed much in twenty years, so old ideas get rebranded to create hype.) The new version of thread pools is like the original, except that instead of a single thread processing the virtual threads, they are processed on a pool of real threads. It’s like an idiot’s version of threads, for people who have trouble using real threads. Thread pools can be used well, but it’s often far better to manage your own threads. Thread pools strongly discourage infinite loops, because a loop ties up one of the real threads. With your own threads, an infinite loop is no problem, and is often a great way to handle things like socket communication.
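
For instance, a dedicated thread can simply loop forever servicing one connection, as in this sketch (the socket is assumed to be accepted elsewhere and passed in):

    #include <pthread.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Hypothetical long-lived worker: owns one socket for its whole lifetime. */
    static void *socket_loop(void *arg)
    {
        int fd = *(int *)arg;
        char buf[512];

        for (;;) {                               /* infinite loop is fine here */
            ssize_t n = recv(fd, buf, sizeof buf, 0);
            if (n <= 0)
                break;                           /* peer closed or error       */
            /* process the received bytes ... */
        }
        close(fd);
        return NULL;
    }

    /* Usage: pthread_create(&tid, NULL, socket_loop, &client_fd); */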

Junior developers typically don’t use threads, as they’re still learning the trade; threads are mostly used by intermediate to senior developers. They can cause unusual issues, so it’s always best to have threaded code reviewed by senior developers.

If you use multiple threads in pure assembler then there are more options. Adding a LOCK prefix to an instruction, for example, makes it atomic and deals with some threading issues on its own.
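
You can reach the same mechanism from C without writing a full assembler routine; the sketch below uses GCC-style inline assembly on x86-64 (compiler- and architecture-specific, so treat it as illustrative) to emit a LOCK-prefixed increment.

    /* x86-64 + GCC-style inline assembly; not portable. */
    static long counter = 0;

    static void atomic_increment(void)
    {
        __asm__ __volatile__("lock incq %0" : "+m"(counter));
    }

Most compilers also expose this as a builtin such as __sync_fetch_and_add, which compiles down to the same LOCK-prefixed instruction.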

Current operating systems have limited options for thread organization. For example, suppose you are making an embedded device and want to dedicate one core to a single process or thread, to ensure it’s always responsive. Windows can’t do this directly; you can set processor affinity, which comes close, but affinity only pins your thread to a core and doesn’t keep other work off it. On Linux it’s partially possible through the taskset command, though that operates on a process rather than an individual thread.
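
Programmatically, Linux does let you pin an individual thread with the non-portable pthread_setaffinity_np call, sketched below; note that, like taskset, this only pins your own thread to the core and does not stop the scheduler from putting other work there.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>

    /* Pin the calling thread to one CPU core (Linux-specific, non-portable). */
    static int pin_to_core(int core)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    }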
