
The IOs of Game Netcode #3 – Synchronisation and Locks

This is part of a series of posts revolving around game netcode development; the introduction and links to the other posts can be found here.

We’ve already established you need to run your netcode framework on a different thread to your main game’s update/render loop, something which all netcode developers will agree is a requirement for writing good netcode. This means you’re already in danger of the threading and concurrency issues mentioned in the last post. So how do you ensure that you don’t run into these issues? Well, you have a few weapons in your arsenal to tackle the problem. The first one I’m going to discuss is locks.

Locks

Think of locks as your most basic weaponry against threading issues. The pistol with infinite ammo you switch back to after your power-up wears off. Locking is a synchronisation mechanism that ensures that only one thread can execute a piece of code at any given time. It does this by forcing other threads to wait until the currently active thread is finished with the locked code.

Locks are such a common mechanism that I’m pretty sure all modern languages that support multi-threading also support locking. It’d be pretty disastrous if they didn’t. The main differences are the types of locks, the terminology and how they’re used. You usually have a couple of types available to you, such as object-level locks and method-level locks. Object-level locks are the bread and butter here and more complex locking mechanisms can be built upon them. In C# an object-level lock looks like this:

lock (this) {
	// Your synchronised code
}

In Java you achieve object-level locks using synchronized blocks, which are essentially the same thing:

synchronized (this) {
	// Your synchronised code
}

Unfortunately, in C++ you do not have a simple statement for object-level locking (although the standard library’s std::lock_guard template gets you most of the way there). Instead, you need to instantiate a mutex (mutual exclusion) object (part of the C++ standard library since C++11) and perform the lock operation on that.

#include <mutex>

std::mutex mtx; // Declared as a member of the class

mtx.lock();
// Your synchronised code
mtx.unlock(); // Skipped if the code above throws; std::lock_guard unlocks automatically

These code snippets will all do the same thing when multiple threads hit them at the same time. The first thread to request the lock will get it and be allowed to proceed to execute the code within the lock. All other threads will request the lock, but will be forced to wait (block) until the thread with the lock leaves the synchronised code, after which another thread will receive rights to the lock and proceed.
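To make that behaviour concrete, here is a minimal Java sketch of several threads contending for the same object-level lock. The LockedCounter class and its run helper are hypothetical names made up for this illustration; they are not part of any netcode framework.

```java
// Hypothetical example class, used only to illustrate an object-level lock.
class LockedCounter {
    private int count = 0;

    void increment() {
        // Only one thread at a time can be inside this block for a given instance;
        // the others block here until the lock is released.
        synchronized (this) {
            count++;
        }
    }

    int get() {
        synchronized (this) {
            return count;
        }
    }

    // Starts `threads` threads that each call increment() `perThread` times,
    // waits for all of them to finish and returns the final count.
    static int run(int threads, int perThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // With the lock in place the result is always threads * perThread.
        System.out.println(run(4, 100_000));
    }
}
```

Without the synchronized blocks, count++ (a read, an add and a write) could interleave between threads and some increments would be lost.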

There are also method-level locks, which work similarly to the object-level locks, except that they work at the method level, synchronising everything inside a particular method. In C# this is achieved with the following:

using System.Runtime.CompilerServices;

[MethodImpl(MethodImplOptions.Synchronized)]
public void SynchronisedMethod() {
	// Your synchronised code
}

In Java, you use the same synchronized keyword as in object-level locking:

public synchronized void synchronisedMethod() {
	// Your synchronised code
}

C++ does not have a mechanism for method-level locking, but this can be easily achieved with a simple object-level lock. The method-level locking is just a convenience mechanism for object-level locking everything in a method.

Now, method-level locking and the examples I’ve shown for object-level locking have a pretty serious issue to take into consideration. A thread can achieve a lock on a method or an object while another thread has the same lock on a different object of the same class. This is because these types of locks are handled at the object level (locking against its own instance or a member object). For the most part this is exactly what is required, but sometimes there are situations in which you need to synchronise all instances of a particular class. This is called class-level locking and can be achieved in a couple of ways. You can perform an object-lock on a static object shared by all instances of a specific class, or you can do method-level locking on static methods.
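As a rough sketch of the first approach (an object-lock on a static object), the Java below shares one lock between every instance of a class. ConnectionRegistry, CLASS_LOCK and the run helper are hypothetical names chosen for this example.

```java
// Hypothetical example class, used only to illustrate a class-level lock.
class ConnectionRegistry {
    // One lock object shared by every instance of the class.
    private static final Object CLASS_LOCK = new Object();
    private static int totalConnections = 0;

    void register() {
        // synchronized (this) would only exclude threads using the SAME instance;
        // locking the shared static object excludes threads using ANY instance.
        synchronized (CLASS_LOCK) {
            totalConnections++;
        }
    }

    static int total() {
        synchronized (CLASS_LOCK) {
            return totalConnections;
        }
    }

    // Gives each thread its own ConnectionRegistry instance, has each one call
    // register() `perInstance` times and returns the shared total.
    static int run(int instances, int perInstance) {
        synchronized (CLASS_LOCK) {
            totalConnections = 0;
        }
        Thread[] workers = new Thread[instances];
        for (int i = 0; i < instances; i++) {
            ConnectionRegistry registry = new ConnectionRegistry();
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perInstance; j++) {
                    registry.register();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return total();
    }
}
```

Note that a static synchronized method in Java locks on the Class object rather than on CLASS_LOCK, so it is worth picking one of the two approaches per class instead of mixing them.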

Of course, as with most coding concepts, there is a downside to every upside. With locking mechanisms, that downside is deadlocks.

Deadlocks

A deadlock is when a thread is indefinitely blocked by a lock. Deadlocks usually occur when multiple threads are executing code that results in nested locks. For example, let’s take the classic bank transfer scenario.

public class Account {

	private double balance;

	public Account(double balance) {
		this.balance = balance;
	}

	public void withdraw(double amount) {
		balance -= amount;
	}

	public void deposit(double amount) {
		balance += amount;
	}

}

public class Bank {

	public void transfer(Account from, Account to, double amount) {
		synchronized (from) {
			synchronized (to) {
				from.withdraw(amount);
				to.deposit(amount);
			}
		}
	}

}

Both accounts are synchronized so that exclusive access is obtained and the program is assured that it can perform all operations required of the transfer without interference from other threads. The deadlock arises when a Bank executes two opposing transfers between the same two accounts at the same time (on separate threads).

final Account account1 = new Account(1000);
final Account account2 = new Account(1000);
final Bank bank = new Bank();

new Thread(new Runnable() {
	public void run() {
		bank.transfer(account1, account2, 500);
	}
}).start();

new Thread(new Runnable() {
	public void run() {
		bank.transfer(account2, account1, 500);
	}
}).start();

The result can vary depending on execution order, but if both threads manage to obtain their first lock, then a deadlock will occur. This is because neither thread can obtain its second lock until the other thread releases it, resulting in both threads blocking indefinitely.

The primary reason behind deadlocks is bad software design. Deadlocks can easily be avoided if the developer takes a step back and designs the inter-thread communication first. If the program/system makes use of multiple threads, then a deadlock scenario should be at the forefront of the developer’s mind. I’ve seen a number of poorly written programs where the deadlock scenario isn’t even considered and, for the most part, the program runs fine, but occasionally a deadlock will occur. This is because the developer will just throw in a lock here or there to ensure they don’t run into concurrent modification exceptions and suddenly have methods with locks calling other methods with locks, resulting in a nested deadlock. Of course, the way they solve this is to ensure the proper execution order by adding more locks, which, as you can guess, just makes things worse.
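One common way to design the bank transfer so that it cannot deadlock is to always take the two locks in the same order, whichever direction the money is moving. The sketch below is one possible approach rather than the only fix; the id field and the OrderedAccount and OrderedBank names are assumptions added purely for illustration.

```java
// Hypothetical variant of the Account class with an added id field.
class OrderedAccount {
    final int id;            // assumed unique; used only to decide lock order
    private double balance;

    OrderedAccount(int id, double balance) {
        this.id = id;
        this.balance = balance;
    }

    void withdraw(double amount) { balance -= amount; }

    void deposit(double amount) { balance += amount; }

    synchronized double getBalance() { return balance; }
}

class OrderedBank {
    void transfer(OrderedAccount from, OrderedAccount to, double amount) {
        // Always lock the account with the lower id first, so two opposing
        // transfers can never each hold the lock the other is waiting for.
        OrderedAccount first = from.id < to.id ? from : to;
        OrderedAccount second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.withdraw(amount);
                to.deposit(amount);
            }
        }
    }

    // Runs the two opposing transfers from the deadlock example `rounds` times
    // each on separate threads and returns the two final balances.
    static double[] runOpposingTransfers(int rounds) {
        OrderedAccount a = new OrderedAccount(1, 1000);
        OrderedAccount b = new OrderedAccount(2, 1000);
        OrderedBank bank = new OrderedBank();
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) bank.transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) bank.transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new double[] { a.getBalance(), b.getBalance() };
    }
}
```

Because both threads now request account 1 before account 2, one of them simply waits at the first lock instead of holding half of what the other needs.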

In my experience, the best implementation of multi-threaded operation consists of very few locks, but in the right places. DESIGN YOUR MULTITHREADING FIRST! 🙂

Thread-safe and Concurrent Data Structures

So now you should understand how to synchronise your multithreaded code, while avoiding potential deadlocks. It will help you avoid the concurrency issues when writing good threaded netcode. Now, I’m going to briefly touch on a set of special data structures that are designed to further help you avoid concurrency issues and deadlocks.

Thread-safe and concurrent data structures are a set of data structures that follow a certain set of rules to ensure that race conditions do not happen. This is done through a variety of different methods, such as locking and atomic operations.

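They take care of the concurrency issues of individual operations without you having to manage the locks yourself, which removes a common source of deadlocks and makes your life a lot easier. Java and C# both have a decent set of thread-safe data structures in their standard libraries (the C++ standard library offers atomics and locks rather than concurrent containers, so C++ developers usually reach for a third-party library). There are too many to go through here, but it’s definitely worth searching the respective documentation and reading up on the specifics.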

One thing you need to keep in mind though: even though the add and remove operations are thread-safe, any iterators or enumerators generated from these data structures may not be. Meaning, if you loop over all the elements in a thread-safe data structure, the loop as a whole may not be thread-safe. So if another thread adds an element to, or removes one from, the data structure while the loop is being processed, you can end up with concurrent modification and a race condition causing unpredictable behaviour. Fortunately, many of these data structures offer some kind of fail-fast iterator or enumerator, which throws a concurrent modification exception when it detects that the underlying structure has been modified after the iterator was created, which can then be handled properly by the developer. Others (most of Java’s java.util.concurrent collections, for example) instead iterate over a snapshot or a weakly consistent view and never throw, so check the documentation for the structure you are using.
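The difference in iterator behaviour is easy to see in Java even on a single thread, since a fail-fast iterator does not care which thread made the change. This is a small sketch using standard java.util classes; the IterationDemo class and its method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical example class comparing iterator behaviour.
class IterationDemo {
    // Adds an element in the middle of a for-each loop and reports whether
    // the list's iterator reacted with a ConcurrentModificationException.
    static boolean throwsOnModification(List<String> list) {
        try {
            for (String element : list) {
                if (element.equals("a")) {
                    list.add("d"); // structural change while the loop is running
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Plain ArrayList: not thread-safe, fail-fast iterator.
    static List<String> arrayList() {
        return new ArrayList<>(Arrays.asList("a", "b", "c"));
    }

    // Synchronized wrapper: add/remove are locked, but the iterator is the
    // underlying ArrayList's fail-fast iterator.
    static List<String> synchronizedList() {
        return Collections.synchronizedList(arrayList());
    }

    // Copy-on-write list: iterates over a snapshot and never throws.
    static List<String> copyOnWriteList() {
        return new CopyOnWriteArrayList<>(arrayList());
    }
}
```

The synchronized wrapper is the case to watch: its individual operations are thread-safe, yet looping over it still needs an explicit synchronized block around the whole loop.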

You can usually avoid the concurrent modification exception by using the correct concurrent data structure for the job and in the right place. I like to use a concurrent queue for inter-thread communication as it allows me to iterate through the queue in a thread-safe way using peek and pop operations (no need for iterators or enumerators), which I will show an example of in a later post.
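As a rough preview of the idea, and not necessarily how the later post implements it, here is a Java sketch where a network thread pushes messages into a ConcurrentLinkedQueue and the game loop drains them with poll (Java’s equivalent of pop). MessagePump and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical example class, used only to illustrate queue-based hand-off.
class MessagePump {
    private final ConcurrentLinkedQueue<String> inbox = new ConcurrentLinkedQueue<>();

    // Called from the network thread whenever a message arrives.
    void onMessageReceived(String message) {
        inbox.offer(message);
    }

    // Called from the game loop: removes and returns everything queued so far.
    // poll() returns null when the queue is empty, so no iterator is needed.
    List<String> drain() {
        List<String> messages = new ArrayList<>();
        String message;
        while ((message = inbox.poll()) != null) {
            messages.add(message);
        }
        return messages;
    }

    // Simulates a network thread producing `count` messages while the calling
    // thread drains the queue until it has received all of them.
    static List<String> demo(int count) {
        MessagePump pump = new MessagePump();
        Thread network = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                pump.onMessageReceived("msg " + i);
            }
        });
        network.start();
        List<String> received = new ArrayList<>();
        while (received.size() < count) {
            received.addAll(pump.drain());
            Thread.yield(); // a real game loop would update and render here
        }
        try {
            network.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```

With a single producer and a single consumer the messages come out in the order they went in, and neither thread ever touches the other’s state directly.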

This pretty much wraps up this post. It’s a bit longer than the previous ones but I wanted to finish with the multithreading so that I can actually get into some nitty-gritty netcode stuff next post 🙂 Hopefully you can take something away from this that will aid you in your game development, and as always, if you want to chat or pick my brains about anything you can either comment below, catch me on IRC (click Chat above) or fire me a tweet to @Jargon64.

Thanks for reading! 🙂

May 21st, 2014 by Jargon | Type: Standard

Filed Under: Development, News Tags: Concurrency, Deadlocks, Locks, Netcode, Synchronisation, Thread-safe, Threads

Apr

The IOs of Game Netcode #2 – Threading and Concurrency

This is part of a series of posts revolving around game netcode development; the introduction and links to the other posts can be found here.

In the last post of The IOs of Game Netcode (found here), I talked about a few general rules of thumb I usually follow when starting a new netcode framework. Over the next few posts I’m going to go a little deeper into the technical options available to us as netcode developers and what routes I take based on different scenarios. The first topic I’m going to start with is threading, and the resulting issue – concurrency. This post is quite long so I’ve only addressed the issues the developer should keep in mind while working with threads. The solutions to these issues will be covered in the next post 🙂

Threading

As any netcode developer will know, the first hurdle you will hit when writing netcode or working with sockets is the threading issue. Normally, you can only execute a single piece of program code at a time. Threading allows you to execute multiple pieces of program code simultaneously by running them on different threads.

By default, reading and writing via sockets will block current execution of program code until the operation is complete. This isn’t necessarily an issue with writing to a socket unless you are writing faster than the hardware can handle (network card or modem), but if you’re reading, the operation will block until there is data to be read. One of the ways around this is to check the number of available bytes before reading and, if there are bytes available, only read that many. However, even using this method, the actual reading and writing of bytes will block, even if only for a moment, and you don’t want this in your main game or render loop.

Another way around this is to use asynchronous read and write operations. These run in their own threads automatically (provided by the socket library you are using) and pass the data back via a callback when complete. They have their uses, such as web services, but for real-time game netcode they can begin to cause problems, as you cannot be sure of the order of transmission and you start to encroach on race condition territory.

So, in order to effectively read and write across a network with sockets, without causing the main program to hang while it is doing so, you’ll need at least one additional thread to perform the socket operations on. The reason I say at least one additional thread is because when developing your game client, you only really need a single socket to connect to your server. However, for the server, you’ll need a thread for every connecting client to handle each of the socket operations. For those of you who are now thinking “Why not use non-blocking sockets?”, I’m aware of this and it will be covered in a future post, but for the time being I’m focusing on the standard variety of sockets as there’s a lot more to consider with using non-blocking sockets 🙂
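A minimal client-side sketch of that idea in Java might look like the following. To keep it self-contained it spins up a throwaway loopback "server" that sends a single line; that stand-in, the SocketReaderSketch class name and the "hello" message are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical example class: blocking reads happen on their own thread.
class SocketReaderSketch {
    static String demo() {
        BlockingQueue<String> received = new LinkedBlockingQueue<>();
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            // Stand-in for a real game server: accept one client, send one line.
            Thread serverThread = new Thread(() -> {
                try (Socket client = server.accept();
                     Writer out = new OutputStreamWriter(
                             client.getOutputStream(), StandardCharsets.UTF_8)) {
                    out.write("hello\n");
                    out.flush();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            serverThread.start();

            Socket socket = new Socket(loopback, server.getLocalPort());
            // The reader thread is the only place that blocks on the socket.
            Thread reader = new Thread(() -> {
                try (BufferedReader in = new BufferedReader(new InputStreamReader(
                        socket.getInputStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = in.readLine()) != null) { // blocks until data or EOF
                        received.add(line);
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            reader.start();

            // The calling thread stays free; a game loop would check the queue
            // once per frame, here we just wait briefly for the first message.
            String message = received.poll(5, TimeUnit.SECONDS);
            reader.join();
            serverThread.join();
            socket.close();
            return message;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The blocking readLine call never runs on the calling thread, which is the whole point: the game loop only ever looks at the queue.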

Anyway, as soon as you start working with multiple threads, it opens up a whole new can of worms in the form of concurrent modifications.

Concurrency

Concurrent modifications are when you are reading the value of a variable in one thread while it is being modified in another. Another example is looping over a collection while a different thread is adding or removing an element from the collection. Most languages allow this type of access, with unpredictable results. This is due to two main reasons.

Race Conditions

The first is the race condition: you just don’t know which thread is going to access the variable first. There are three scenarios for a race condition:

Read & Read – Both threads want to read the value of a variable. It is unknown which thread reads the variable first but it doesn’t matter as it does not change. Both threads read the same value.
Read & Write – One thread reads the variable, while the other writes to it. The final value of the variable will always be what is written, but the value read by the reading thread may be that of the variable before the write, or after. This can lead to the aforementioned unpredictable behaviour and potential crashes.
Write & Write – Both threads want to write. No read operations are carried out, but that does not eliminate an unpredictable value being read later. This is because the final value of the variable is unknown. It is the value of whichever thread wrote to the variable last.

The above scenarios are very specific and only show two threads accessing a single variable. However, in reality, these threads would be doing more than just reading and writing to a variable. For example, we have a shared (global or static) float variable called currentSpeed accessed by both threads:

Thread 1 – Anti speed-hacking protection

currentSpeed = player.getVelocity().getMagnitude();
if (currentSpeed > MAX_PLAYER_SPEED) {
    player.disconnect();
}

Thread 2 – Find the fastest moving entity

Entity fastestEntity = null;
currentSpeed = 0;
for (Entity entity : entities) {
    if (entity.getVelocity().getMagnitude() > currentSpeed) {
        currentSpeed = entity.getVelocity().getMagnitude();
        fastestEntity = entity;
    }
}
return fastestEntity;

For the record, you should never share a variable between two different tasks like this, but if you did this is how it might play out. For this example we are going to assume that the player is moving at a speed of 4 and there is one other entity in the world moving at a speed of 10:

// Start with Thread 2
Entity fastestEntity = null;
currentSpeed = 0;
for (Entity entity : entities) {
    if (entity.getVelocity().getMagnitude() > currentSpeed) {
// Switch to Thread 1
currentSpeed = player.getVelocity().getMagnitude();
// Switch to Thread 2
        currentSpeed = entity.getVelocity().getMagnitude();
        fastestEntity = entity;
    }
}
// Switch to Thread 1
if (currentSpeed > MAX_PLAYER_SPEED) {
    player.disconnect();
}

In this scenario, thread 2 starts by resetting currentSpeed to 0 and enters its loop. After the first switch, thread 1 sets currentSpeed to the player’s speed of 4. Before thread 1 can test the value, the process switches back to thread 2, where currentSpeed is set to the fastest moving entity’s speed, which is 10. Then the process switches to thread 1 again to perform the test. Oh look, the player is moving at a speed of 10, they must be speed-hacking, better disconnect them!
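The earlier remark about never sharing a variable between two tasks can be sketched like this: give each task its own local variable, which lives on that thread’s stack and cannot be touched by anyone else. The SpeedChecks class, the plain float parameters and the MAX_PLAYER_SPEED value of 8 are simplifications assumed for the example.

```java
// Hypothetical example class: the same two tasks without a shared variable.
class SpeedChecks {
    static final float MAX_PLAYER_SPEED = 8f; // assumed limit for illustration

    // Thread 1's task. currentSpeed is local, so no other thread can change it
    // between the assignment and the test.
    static boolean shouldDisconnect(float playerSpeed) {
        float currentSpeed = playerSpeed;
        return currentSpeed > MAX_PLAYER_SPEED;
    }

    // Thread 2's task. Returns the index of the fastest entity, or -1 if none
    // is moving. fastestSpeed is local to this call.
    static int indexOfFastest(float[] entitySpeeds) {
        int fastest = -1;
        float fastestSpeed = 0f;
        for (int i = 0; i < entitySpeeds.length; i++) {
            if (entitySpeeds[i] > fastestSpeed) {
                fastestSpeed = entitySpeeds[i];
                fastest = i;
            }
        }
        return fastest;
    }
}
```

With locals, the player moving at 4 is never mistaken for the entity moving at 10, however the threads interleave. With the shared variable, though, the interleaving shown earlier remains entirely possible.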

This occurs because the threads can switch at any point during normal processing, and it is always something you need to keep in mind while working with multiple threads. There are mechanisms to get around these issues, but first…

Non-Atomic Operations

The second reason concurrent access can cause unpredictable results is due to non-atomic load and store operations. This is a bit more low level and might be harder to grasp for those who aren’t familiar with CPU architecture. There are a couple of definitions when it comes to the atomicity of an operation. It can refer to a single instruction or an operation of multiple instructions. Essentially, an operation is considered atomic if it completes in a single step relative to other threads. Therefore, a non-atomic operation can also result in a race condition as described above, but for the purposes of this section, we’ll be focusing on single instructions.

When you want to run your game (or program), you need to compile it into machine code first. Every developer knows this. During the compilation process, the compiler reads our source code and optimises it internally before outputting machine code, therefore the machine code doesn’t directly reflect the logic that we’ve defined. Most developers know this. One of the optimisations compilers do is to maximise CPU register usage. General purpose CPU registers typically have a size equal to the bit-architecture of the system. Modern day systems use a 64 bit architecture and have 64 bit general purpose CPU registers. If you have two 32 bit integers that have some operation performed on them, the compiler may attempt to optimise the machine code to load them both into the same 64 bit register to perform the operation more efficiently. Some developers know this.

Now, the problem lies in the scenario where you attempt to perform an operation on a data type that has a larger bit requirement than the CPU register can handle, or the register already has some active data in it. The data ends up being split into multiple machine code instructions – and this is what causes the problem. Can you remember when I mentioned that threads can switch at any point in normal processing? Well, this happens at the machine code instruction level. So, a simple variable assignment such as:

long timestamp = 1L;

Can be split into two machine code instructions, with a thread switch right in the middle.

Not many developers know this.

This is the very essence of non-atomic operations. If processing switches to another thread during a multi-instruction load or store, the race condition is the least of your worries. Depending on your operation, you’ll either end up with a torn read or a torn write. For example, one thread attempts to write a 64 bit integer to a variable but only gets as far as the first 32 bit store instruction; another thread then reads the full 64 bit contents of the variable; and only then does the first thread execute its second 32 bit store instruction. What the second thread ends up reading is one-half correct data, one-half bad data and one-whole big problem.
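In Java specifically, the language specification allows a write to a plain long or double to be split in this way, but guarantees that reads and writes of a volatile long or double are atomic. The sketch below relies on that guarantee; TimestampHolder and its helper method are hypothetical names, and other languages have their own rules.

```java
// Hypothetical example class: a 64 bit value that can never be read torn.
class TimestampHolder {
    // volatile makes every read and write of this long a single atomic step
    // (java.util.concurrent.atomic.AtomicLong would be another option).
    private volatile long timestamp = 0L;

    // One thread flips the value between all-bits-clear (0) and all-bits-set (-1)
    // while the calling thread keeps reading it. Returns true if every value
    // read was one of the two whole values, i.e. no torn read was observed.
    static boolean sawOnlyWholeValues(int iterations) {
        TimestampHolder holder = new TimestampHolder();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                holder.timestamp = (i % 2 == 0) ? 0L : -1L;
            }
        });
        writer.start();
        boolean onlyWholeValues = true;
        while (writer.isAlive()) {
            long seen = holder.timestamp;
            if (seen != 0L && seen != -1L) {
                onlyWholeValues = false; // half old bits, half new bits
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return onlyWholeValues;
    }
}
```

Keep in mind that this only makes the individual load and store atomic. A compound operation such as timestamp++ is still a read followed by a write and still needs a lock or an atomic class.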

Some of you may now be thinking “Holy crap, threads are dangerous, how the hell do programs even function without exploding into a flaming ball of random corruption!?” Well, the answer is yes, they are dangerous, but there are also certain principles you can abide by and mechanisms you can use that prevent this pseudo-random behaviour. However, these topics will be addressed in the next post 😉

Like before, if you have any questions about this post or just want to chat netcode, please comment below or fire me a tweet at @Jargon64.