Getting Started with CoralThreads

In this article we show how to use CoralThreads to pin a critical thread to a microprocessor core. The focus is on code; if you are interested in the theory behind CoralThreads, refer to its white paper for the complete details.

The microprocessor architecture

Before we start, let’s review the basic architecture of microprocessors and the terminology used by CoralThreads. What you need to know is:

  • Most microprocessors nowadays are multi-core
  • A core can execute a single process or thread in parallel with the other cores (i.e. true parallelism)
  • Intel went a step further and introduced hyper-threading, which allows a single core to run two threads concurrently, with extremely fast context switches between them

Now to keep things simple, CoralThreads uses the following names:

  • Chip for the physical multi-core microprocessor you can hold in your hands
  • Core for each independent processing unit inside a chip
  • Processor for each independent processing unit inside a core

To summarize: a chip can have more than one core. Without hyper-threading, each core has only one processor; with hyper-threading, each core has two. With hyper-threading it is important to understand that a thread is executed by a processor inside the core and not by the core itself. Without hyper-threading a core has a single processor, so one could say that the core itself executes the thread, but to keep the terminology consistent in both cases we always separate the core from its processors.

Examining your microprocessor

Grab your CoralThreads jar and run the Affinity class to see the status of everything we talked about above:

$ java -cp coralthreads-all.jar com.coralblocks.coralthreads.Affinity

CpuInfo: [nChips=1, nCoresPerChip=4, hyper-threading=true, nProcessors=8, procIds=0,1,2,3,4,5,6,7]

Chip-0:
    Core-0:
        Processor-0: free
        Processor-4: free
    Core-1:
        Processor-1: free
        Processor-5: free
    Core-2:
        Processor-2: free
        Processor-6: free
    Core-3:
        Processor-3: free
        Processor-7: free

As you can see above, the machine has one chip with 4 cores (quad-core). Hyper-threading is turned on, so instead of 4 processors we have 8, with ids 0 to 7. You can also see that no threads are currently assigned to any processor.
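In the listing above, the two processors of each core have ids that are 4 apart (0 and 4, 1 and 5, and so on). The sketch below models that numbering. It is only an illustration: SiblingLayout and its methods are hypothetical names, not part of CoralThreads, and the "ids are nCoresPerChip apart" rule is an assumption read from this one machine's output, so other machines may number their processors differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of CoralThreads. It models the id layout
// seen in the sample output (one chip, hyper-threading on).
final class SiblingLayout {

    // Core that owns a processor id, ASSUMING sibling ids are nCores apart.
    static int coreOf(int procId, int nCores) {
        return procId % nCores;
    }

    // Processor ids that share the given core, under the same assumption.
    static List<Integer> processorsOf(int core, int nCores, int nProcessors) {
        List<Integer> ids = new ArrayList<>();
        for (int id = core; id < nProcessors; id += nCores) {
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        int nCores = 4;      // nCoresPerChip from the sample output
        int nProcessors = 8; // nProcessors from the sample output
        for (int core = 0; core < nCores; core++) {
            System.out.println("Core-" + core + ": " + processorsOf(core, nCores, nProcessors));
        }
        // Number of logical processors the JVM sees on the current machine.
        System.out.println("JVM sees " + Runtime.getRuntime().availableProcessors() + " processors");
    }
}
```

Under this layout, processors 2 and 6 belong to the same core, so two threads pinned to them would share that core's execution resources.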

Pinning a thread to a processor

You can see in the example below how easy it is to use CoralThreads:

import com.coralblocks.coralthreads.Affinity;

public class Basics {

    public static void main(String[] args) throws Exception {

        // the processor id is required as the only argument
        if (args.length != 1) {
            System.err.println("Usage: Basics <processorId>");
            return;
        }

        Thread thread = new Thread(new Runnable() {

            @Override
            public void run() {

                // must be the first thing inside the run method
                Affinity.bind();

                try {

                    while(true) {

                        // do whatever you want here...
                    }

                } finally {

                    // must be the last thing inside the run method
                    Affinity.unbind();
                }
            }
        }, "MyPinnedThread");

        System.out.println();
        Affinity.printSituation(); // nothing done yet...

        // assign thread to processor:
        int procToBind = Integer.parseInt(args[0]);
        Affinity.assignToProcessor(procToBind, thread);

        Affinity.printSituation(); // now you see it there...

        // start the thread!
        thread.start();

        Affinity.printSituation(); // now it is running with a pid...
    }
}

Running the program above produces the output below. The processor id to bind to is passed on the command line.

$ java -cp coralthreads-all.jar com.coralblocks.coralthreads.sample.Basics 2

CpuInfo: [nChips=1, nCoresPerChip=4, hyper-threading=true, nProcessors=8, procIds=0,1,2,3,4,5,6,7]

Chip-0:
    Core-0:
        Processor-0: free
        Processor-4: free
    Core-1:
        Processor-1: free
        Processor-5: free
    Core-2:
        Processor-2: free
        Processor-6: free
    Core-3:
        Processor-3: free
        Processor-7: free

CpuInfo: [nChips=1, nCoresPerChip=4, hyper-threading=true, nProcessors=8, procIds=0,1,2,3,4,5,6,7]

Chip-0:
    Core-0:
        Processor-0: free
        Processor-4: free
    Core-1:
        Processor-1: free
        Processor-5: free
    Core-2:
        Processor-2: assigned to MyPinnedThread (not-started)
        Processor-6: free
    Core-3:
        Processor-3: free
        Processor-7: free

CpuInfo: [nChips=1, nCoresPerChip=4, hyper-threading=true, nProcessors=8, procIds=0,1,2,3,4,5,6,7]

Chip-0:
    Core-0:
        Processor-0: free
        Processor-4: free
    Core-1:
        Processor-1: free
        Processor-5: free
    Core-2:
        Processor-2: bound to MyPinnedThread (running pid=2180)
        Processor-6: free
    Core-3:
        Processor-3: free
        Processor-7: free

You can check that your thread is running on the correct processor by using the top -H shell command. Once top is running, press “1” to see the individual CPUs at the top. Cpu2 should be at 100%.

[Figure: top output with Cpu2 at 100%]
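The same check can be done from inside the JVM on Linux. The sketch below is not part of CoralThreads: LastCpu is a hypothetical helper that reads /proc/self/task/<tid>/stat for every thread of the current process and prints field 39 ("processor"), which the proc(5) manual page describes as the CPU the thread last executed on. It assumes a Linux /proc filesystem and gives only a snapshot, so treat it as a quick sanity check rather than proof of pinning.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, NOT part of CoralThreads. Linux only.
final class LastCpu {

    // Returns field 39 ("processor") of a /proc/<pid>/task/<tid>/stat line.
    static int lastCpuOf(String statLine) {
        // Field 2 (the thread name) is wrapped in parentheses and may contain
        // spaces, so only the text after the last ')' is split on spaces.
        int close = statLine.lastIndexOf(')');
        String[] rest = statLine.substring(close + 2).split(" ");
        return Integer.parseInt(rest[36]); // rest[0] is field 3, so field 39 is rest[36]
    }

    // Returns the thread name stored in field 2, without the parentheses.
    static String nameOf(String statLine) {
        return statLine.substring(statLine.indexOf('(') + 1, statLine.lastIndexOf(')'));
    }

    public static void main(String[] args) throws IOException {
        try (DirectoryStream<Path> tasks = Files.newDirectoryStream(Paths.get("/proc/self/task"))) {
            for (Path task : tasks) {
                byte[] raw = Files.readAllBytes(task.resolve("stat"));
                String stat = new String(raw, StandardCharsets.UTF_8).trim();
                System.out.println(nameOf(stat) + " last ran on cpu " + lastCpuOf(stat));
            }
        }
    }
}
```

To inspect a pinned thread, the loop in main would have to run inside the same JVM as that thread, for example shortly after thread.start(). Whether the Java thread name (such as MyPinnedThread) appears in the stat file depends on the JVM version, so matching by name is an assumption as well.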

Isolating processors from kernel and hardware interrupts

Thread affinity only picks the processor on which a thread will always be executed; it does nothing to isolate that processor from external interrupts. Fortunately, all modern operating systems provide ways to do that, so you can be sure that your critical thread has the processor to itself at all times. Coral Blocks has the expertise to configure most operating systems to isolate a processor. Below we list some of the tricks that we have used in the past with great success:

  • Removing a processor from the kernel scheduler load balancer, so that when you pin a thread to a processor the kernel will never preempt it to run another user-space thread there. That effectively isolates the processor from every user-space thread except the ones you pin yourself through CoralThreads.
  • Preventing a processor from receiving hardware interrupts.
  • Removing all RCU kernel threads (read-copy-update threads) from latency-sensitive processors.
  • Removing all bdi-flush kernel threads (pdflush threads) from latency-sensitive processors.
  • Removing timer interrupts (full tickless mode) from latency-sensitive processors.

The availability of the tricks above depends on your flavor of Linux and, most importantly, on your kernel version. We have found through extensive research and testing that a newer kernel version does not necessarily mean better performance.
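The article does not say which mechanisms are used for each trick. As an assumption, the sketch below looks for four standard Linux kernel boot parameters that are commonly associated with this kind of isolation: isolcpus (scheduler isolation), irqaffinity (default interrupt affinity), rcu_nocbs (offloading RCU callbacks) and nohz_full (full tickless mode). IsolationCheck is a hypothetical helper, not part of CoralThreads; it only reports what is present in /proc/cmdline and changes nothing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of CoralThreads. Linux only.
final class IsolationCheck {

    // Boot parameters assumed to be relevant; the article does not name them.
    static final String[] PARAMS = {"isolcpus", "irqaffinity", "rcu_nocbs", "nohz_full"};

    // Extracts the isolation-related key=value pairs from a kernel command line.
    static Map<String, String> isolationParams(String cmdline) {
        Map<String, String> found = new LinkedHashMap<>();
        for (String token : cmdline.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq < 0) {
                continue; // flags without a value, such as "ro" or "quiet"
            }
            String key = token.substring(0, eq);
            for (String param : PARAMS) {
                if (param.equals(key)) {
                    found.put(key, token.substring(eq + 1));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get("/proc/cmdline"));
        Map<String, String> found = isolationParams(new String(raw, StandardCharsets.UTF_8));
        for (String param : PARAMS) {
            System.out.println(param + ": " + found.getOrDefault(param, "(not set)"));
        }
    }
}
```

A parameter that is present on the command line is not guaranteed to be honored: nohz_full, for example, only takes effect on kernels built with the corresponding option, which is one reason the result depends on the kernel version.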