All threads of a process share its virtual address space. The local variables of a function are unique to each thread that runs the function. However, static and global variables are shared by all threads in the process. With thread-local storage (TLS), you can give each thread its own data, which the thread reaches through an index that is global to the process. One thread allocates the index, and the other threads then use the same index to store and retrieve their own data.
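The index-based scheme described above can be sketched as follows. On Windows the relevant functions are `TlsAlloc`, `TlsSetValue`, `TlsGetValue`, and `TlsFree`. The `slot_*` wrappers and `run_workers` are hypothetical helpers introduced only for this illustration, and the POSIX branch (`pthread_key_create` and friends) is an assumption added so that the same sketch builds outside Windows.

```cpp
#include <thread>
#include <vector>
#ifdef _WIN32
#include <windows.h>
using TlsSlot = DWORD;
#else
#include <pthread.h>
using TlsSlot = pthread_key_t;
#endif

// Hypothetical wrappers: allocate, set, get, and free one TLS index.
inline bool slot_alloc(TlsSlot& slot) {
#ifdef _WIN32
    slot = TlsAlloc();
    return slot != TLS_OUT_OF_INDEXES;
#else
    return pthread_key_create(&slot, nullptr) == 0;
#endif
}

inline void slot_set(TlsSlot slot, void* value) {
#ifdef _WIN32
    TlsSetValue(slot, value);
#else
    pthread_setspecific(slot, value);
#endif
}

inline void* slot_get(TlsSlot slot) {
#ifdef _WIN32
    return TlsGetValue(slot);
#else
    return pthread_getspecific(slot);
#endif
}

inline void slot_free(TlsSlot slot) {
#ifdef _WIN32
    TlsFree(slot);
#else
    pthread_key_delete(slot);
#endif
}

// The calling thread allocates one index; every worker stores a pointer to
// its own value under that index and reads it back. The result holds what
// each worker read, followed by what the calling thread reads at the end.
std::vector<int> run_workers(int count) {
    TlsSlot slot;
    if (!slot_alloc(slot)) return {};

    int main_value = 999;
    slot_set(slot, &main_value);            // the calling thread's own data

    std::vector<int> own(count), seen(count, -1);
    std::vector<std::thread> threads;
    for (int i = 0; i < count; ++i) {
        own[i] = i * 10;
        threads.emplace_back([&, i] {
            slot_set(slot, &own[i]);        // same index, this thread's data
            seen[i] = *static_cast<int*>(slot_get(slot));
        });
    }
    for (auto& t : threads) t.join();

    // The workers' stores did not disturb the calling thread's value.
    seen.push_back(*static_cast<int*>(slot_get(slot)));
    slot_free(slot);
    return seen;
}
```

Every thread uses the same `slot`, yet each one reads back only the pointer it stored itself. That is the sense in which the index is global while the data behind it is per thread.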
There are two types of Thread-Local Storage (TLS):
- Dynamic TLS.
- Static TLS.
Static TLS variables are declared directly in your source code. When the compiler compiles your program, it puts all the static TLS variables into their own section, which is named, unsurprisingly enough, .tls. The linker combines the .tls sections from all the object modules to produce one big .tls section in the resulting executable or DLL file.
For static TLS to work, the operating system must get involved. When your application is loaded into memory, the system looks for the .tls section in your executable file and dynamically allocates a block of memory large enough to hold all the static TLS variables. Every time the code in your application refers to one of these variables, the reference resolves to a memory location contained in the allocated block of memory. As a result, the compiler must generate additional code to reference the static TLS variables, which makes your application both larger in size and slower to execute. On an x86 CPU, three additional machine instructions are generated for every reference to a static TLS variable.
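As a minimal sketch of what a static TLS variable looks like in source code, the example below uses the standard C++11 `thread_local` keyword. That choice is an assumption made for portability: with the Microsoft compiler the same storage class can also be spelled `__declspec(thread)`. The names `calls_on_this_thread`, `count_call`, and `count_on_new_thread` are hypothetical.

```cpp
#include <thread>

// A static TLS variable: one copy exists per thread.
thread_local int calls_on_this_thread = 0;

// Each reference below resolves to the calling thread's copy, so the
// counter only reflects calls made on the current thread.
int count_call() {
    return ++calls_on_this_thread;
}

// Calls count_call() n times on a fresh thread and returns the last result.
int count_on_new_thread(int n) {
    int last = 0;
    std::thread t([&] {
        for (int i = 0; i < n; ++i) last = count_call();
    });
    t.join();
    return last;
}
```

Calling `count_call()` twice on one thread yields 1 and then 2. A call to `count_on_new_thread(5)` then returns 5 because the new thread starts from its own zeroed counter, and the next `count_call()` on the original thread returns 3.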
If another thread is created in your process, the system traps the thread's creation and automatically allocates another block of memory to contain the new thread's static TLS variables. When the new thread refers to a static TLS variable, it always gets its own copy and never the copy belonging to any other thread.
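The per-thread block can be observed from code. The following is a sketch under the same assumption as before (standard C++ `thread_local` standing in for a static TLS variable); `tls_value`, `Observation`, and `observe_new_thread` are hypothetical names.

```cpp
#include <thread>

thread_local int tls_value = 7;   // every thread's copy starts out as 7

struct Observation {
    int  initial_value;      // what the new thread saw before writing
    bool distinct_storage;   // true if its copy lives at a different address
};

Observation observe_new_thread() {
    tls_value = 42;                          // changes the caller's copy only
    const int* caller_copy = &tls_value;

    Observation obs{};
    std::thread t([&] {
        // Inside the new thread, tls_value names that thread's own copy.
        obs.initial_value = tls_value;
        obs.distinct_storage = (&tls_value != caller_copy);
        tls_value = 100;                     // does not touch the caller's copy
    });
    t.join();
    return obs;
}
```

The new thread sees the initial value 7 rather than the caller's 42, its copy sits at a different address, and the caller's copy still holds 42 after the thread has exited.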
That’s basically how static TLS works. Now let’s add DLLs to the story. It’s likely that your application will use static TLS variables and that you’ll link to a DLL that also wants to use static TLS variables. When the system loads your application, it first determines the size of your application’s .tls section and adds that value to the sizes of the .tls sections in all the DLLs to which your application implicitly links. When threads are created in your process, the system automatically allocates a block of memory large enough to hold all the TLS variables required by your application and all the implicitly linked DLLs. This is pretty cool.