Preshing on ProgrammingPreshing on Programming

Semaphores are Surprisingly Versatile

In multithreaded programming, it’s important to make threads wait. They must wait for exclusive access to a resource. They must wait when there’s no work available. One way to make threads wait – and put them to sleep inside the kernel, so that they no longer take any CPU time – is with a semaphore.

I used to think semaphores were strange and old-fashioned. They were invented by Edsger Dijkstra back in the early 1960s, before anyone had done much multithreaded programming, or much programming at all, for that matter. I knew that a semaphore could keep track of available units of a resource, or function as a clunky kind of mutex, but that seemed to be about it.

My opinion changed once I realized that, using only semaphores and atomic operations, it’s possible to implement all of the following primitives:

  1. A Lightweight Mutex
  2. A Lightweight Auto-Reset Event Object
  3. A Lightweight Read-Write Lock
  4. Another Solution to the Dining Philosophers Problem
  5. A Lightweight Semaphore With Partial Spinning

C++ Has Become More Pythonic

C++ has changed a lot in recent years. The last two revisions, C++11 and C++14, introduce so many new features that, in the words of Bjarne Stroustrup, “It feels like a new language.”

It’s true. Modern C++ lends itself to a whole new style of programming – and I couldn’t help noticing it has more of a Python flavor. Ranged-based for loops, type deduction, vector and map initializers, lambda expressions. The more you explore modern C++, the more you find Python’s fingerprints all over it.

Fixing GCC’s Implementation of memory_order_consume

As I explained previously, there are two valid ways for a C++11 compiler to implement memory_order_consume: an efficient strategy and a heavy one. In the heavy strategy, the compiler simply treats memory_order_consume as an alias for memory_order_acquire. The heavy strategy is not what the designers of memory_order_consume had in mind, but technically, it’s still compliant with the C++11 standard.

There’s a somewhat common misconception that all current C++11 compilers use the heavy strategy. I certainly had that impression until recently, and others I spoke to at CppCon 2014 seemed to have that impression as well.

This belief turns out not to be true: GCC does not always use the heavy strategy (yet). GCC 4.9.2 actually has a bug in its implementation of memory_order_consume, as described in this GCC bug report. I was rather surprised to learn that, since it contradicted my own experience with GCC 4.8.3, in which the PowerPC compiler appeared to use the heavy strategy correctly.

How to Build a GCC Cross-Compiler

GCC is not just a compiler. It’s an open source project that lets you build all kinds of compilers. Some compilers support multithreading; some support shared libraries; some support multilib. It all depends on how you configure the compiler before building it.

This guide will demonstrate how to build a cross-compiler, which is a compiler that builds programs for another machine. All you need is a Unix-like environment with a recent version of GCC already installed.

How to Install the Latest GCC on Windows

Several modern C++ features are currently missing from Visual Studio Express, and from the system GCC compiler provided with many of today’s Linux distributions. Generic lambdas – also known as polymorphic lambdas – are one such feature. This feature is, however, available in the latest versions of GCC and Clang.

The following guide will help you install the latest GCC on Windows, so you can experiment with generic lambdas and other cutting-edge C++ features. You’ll need to compile GCC from sources, but that’s not a problem. Depending on the speed of your machine, you can have the latest GCC up and running in as little as 15 minutes.

My Multicore Talk at CppCon 2014

Last month, I attended CppCon 2014 in Bellevue, Washington. It was an awesome conference, filled with the who’s who of C++ development, and loaded with interesting, relevant talks. It was a first-year conference, so I’m sure CppCon 2015 will be even better. I highly recommend it for any serious C++ developer.

While I was there, I gave a talk entitled, “How Ubisoft Montreal Develops Games For Multicore – Before and After C++11.” You can watch the whole thing here:

The Purpose of memory_order_consume in C++11

In the C++11 standard atomic library, most functions accept a memory_order argument:

enum memory_order {

The above values are referred to as memory ordering constraints. Each of them has its intended purpose. Among them, memory_order_consume is probably the least well-understood. It’s the most complicated ordering constraint, and it offers the least reward for using it correctly. Nonetheless, there it is, tempting the curious programmer to make sense of it – if only to unlock its dark, mysterious secrets. That’s exactly what this post aims to do.

What Is a Bitcoin, Really?

When I first started learning about Bitcoin, I found plenty of information, but nothing that directly answered the most burning question:

When you buy bitcoins… what is it that you own, exactly?

That’s the question I’ll answer in this post. Along the way, I’ll introduce several key Bitcoin concepts. You’ll see for yourself how bitcoins are secured and how they’re transferred.

First and foremost, a bitcoin is a unit of account, in the same sense that a gallon is a unit of volume, or a gram is a unit of mass. You can’t pick up a bitcoin and hold it in your hand like you can a dollar bill. But that’s OK, because that’s not what’s important. What’s important is that:

  • Bitcoins can be possessed.
  • Bitcoins can be transferred.
  • Bitcoins are impossible to copy.

These three properties, combined, allow bitcoins to function effectively as a system of distribution of wealth. And fundamentally, that’s what makes bitcoins useful.

Bitcoin Address Generator in Obfuscated Python

Recently, I became interested in the inner workings of Bitcoin – specifically, the way it uses elliptic curve cryptography to generate Bitcoin addresses such as 1PreshX6QrHmsWbSs8pHpz6kLRcj9kdPy6. It inspired me to write another obfuscated Python script. The following is valid Python code:

_                   =r"""A(W/2,*M(3*G
       ashlib&as&h,os,re,bi    nascii&as&k;J$:int(
     k.b2a_hex(W),16);C$:C    (W/    58)+[W%58]if(W@
    [];"rip           em    d160");Y$:h.sha25
   6(W).digest();I$                 d=32:I(W/256,d-1)+
  chr(W%256)if(d>0@"";                  U$:J(k.a2b_base
 64(W));f=J(os.urando       m(64))        %(H-U("AUVRIxl
Qt1/EQC2hcy/JvsA="))+      1;M$Q,R,G       :((W*W-Q-G)%P,
(W*(G+2*Q-W*W)-R)%P)       ;P=H-2**       32-977;V$Q=P,L=
1,O=0:V(Q%W,W,O-Q/W*                      L,L)if(W@O%P;S,
T=A(f,U("eb5mfvncu6                    xVoGKVzocLBwKb/Nst
zijZWfKBWxb4F5g="),      U("SDra         dyajxGVdpPv8DhEI
qP0XtEimhVQZnEfQj/       sQ1Lg="),        0,0);F$:"1"+F(W
 [1:])if(W[:1           ]=="\0"@""        .join(map(B,C(
  J(W))));K$:               F(W          +Y(Y(W))[:4]);
   X.update(Y("\4"+                     I(S)+I(T)));B$
    :re.sub("[0OIl    _]|            [^\\w]","","".jo
     in(map(chr,ra    nge    (123))))[W];print"Addre
       ss:",K("\0"+X.dig    est())+"\nPrivkey:",K(
         "\x80"+I(f))""";exec(reduce(lambda W,X:
            W.replace(*X),zip(" \n&$@",["","",
               " ","=lambda W,",")else "])

Python 2.5 – 2.7 is required. Each time you run this script, it generates a Bitcoin address with a matching private key.

Acquire and Release Fences Don’t Work the Way You’d Expect

Raymond Chen defined acquire and release semantics as follows, back in 2008:

An operation with acquire semantics is one which does not permit subsequent memory operations to be advanced before it. Conversely, an operation with release semantics is one which does not permit preceding memory operations to be delayed past it.

Raymond’s definition applies perfectly well to Win32 functions like InterlockedIncrementRelease, which he was writing about at the time. It also applies perfectly well to atomic operations in C++11, such as store(1, std::memory_order_release).

It’s perhaps surprising, then, that this definition does not apply to standalone acquire and release fences in C++11! Those are a whole other ball of wax.

To see what I mean, consider the following two code listings. They’re both taken from my post about the double-checked locking pattern in C++11. The code on the left performs a release operation directly on m_instance, while the code on the right uses a release fence instead.