Preshing on ProgrammingPreshing on Programming

How to Install the Latest GCC on Windows

Several modern C++ features are currently missing from Visual Studio Express, and from the system GCC compiler provided with many of today’s Linux distributions. Generic lambdas – also known as polymorphic lambdas – are one such feature. This feature is, however, available in the latest versions of GCC and Clang.

The following guide will help you install the latest GCC on Windows, so you can experiment with generic lambdas and other cutting-edge C++ features. You’ll need to compile GCC from sources, but that’s not a problem. Depending on the speed of your machine, you can have the latest GCC up and running in as little as 15 minutes.

My Multicore Talk at CppCon 2014

Last month, I attended CppCon 2014 in Bellevue, Washington. It was an awesome conference, filled with the who’s who of C++ development, and loaded with interesting, relevant talks. It was a first-year conference, so I’m sure CppCon 2015 will be even better. I highly recommend it for any serious C++ developer.

While I was there, I gave a talk entitled, “How Ubisoft Montreal Develops Games For Multicore – Before and After C++11.” You can watch the whole thing here:

The Purpose of memory_order_consume in C++11

In the C++11 standard atomic library, most functions accept a memory_order argument:

enum memory_order {

The above values are referred to as memory ordering constraints. Each of them has its intended purpose. Among them, memory_order_consume is probably the least well-understood. It’s the most complicated ordering constraint, and it offers the least reward for using it correctly. Nonetheless, there it is, tempting the curious programmer to make sense of it – if only to unlock its dark, mysterious secrets. That’s exactly what this post aims to do.

What Is a Bitcoin, Really?

When I first started learning about Bitcoin, I found plenty of information, but nothing that directly answered the most burning question:

When you buy bitcoins… what is it that you own, exactly?

That’s the question I’ll answer in this post. Along the way, I’ll introduce several key Bitcoin concepts. You’ll see for yourself how bitcoins are secured and how they’re transferred.

First and foremost, a bitcoin is a unit of account, in the same sense that a gallon is a unit of volume, or a gram is a unit of mass. You can’t pick up a bitcoin and hold it in your hand like you can a dollar bill. But that’s OK, because that’s not what’s important. What’s important is that:

  • Bitcoins can be possessed.
  • Bitcoins can be transferred.
  • Bitcoins are impossible to copy.

These three properties, combined, allow bitcoins to function effectively as a system of distribution of wealth. And fundamentally, that’s what makes bitcoins useful.

Bitcoin Address Generator in Obfuscated Python

Recently, I became interested in the inner workings of Bitcoin – specifically, the way it uses elliptic curve cryptography to generate Bitcoin addresses such as 1PreshX6QrHmsWbSs8pHpz6kLRcj9kdPy6. It inspired me to write another obfuscated Python script. The following is valid Python code:

_                   =r"""A(W/2,*M(3*G
       ashlib&as&h,os,re,bi    nascii&as&k;J$:int(
     k.b2a_hex(W),16);C$:C    (W/    58)+[W%58]if(W@
    [];"rip           em    d160");Y$:h.sha25
   6(W).digest();I$                 d=32:I(W/256,d-1)+
  chr(W%256)if(d>0@"";                  U$:J(k.a2b_base
 64(W));f=J(os.urando       m(64))        %(H-U("AUVRIxl
Qt1/EQC2hcy/JvsA="))+      1;M$Q,R,G       :((W*W-Q-G)%P,
(W*(G+2*Q-W*W)-R)%P)       ;P=H-2**       32-977;V$Q=P,L=
1,O=0:V(Q%W,W,O-Q/W*                      L,L)if(W@O%P;S,
T=A(f,U("eb5mfvncu6                    xVoGKVzocLBwKb/Nst
zijZWfKBWxb4F5g="),      U("SDra         dyajxGVdpPv8DhEI
qP0XtEimhVQZnEfQj/       sQ1Lg="),        0,0);F$:"1"+F(W
 [1:])if(W[:1           ]=="\0"@""        .join(map(B,C(
  J(W))));K$:               F(W          +Y(Y(W))[:4]);
   X.update(Y("\4"+                     I(S)+I(T)));B$
    :re.sub("[0OIl    _]|            [^\\w]","","".jo
     in(map(chr,ra    nge    (123))))[W];print"Addre
       ss:",K("\0"+X.dig    est())+"\nPrivkey:",K(
         "\x80"+I(f))""";exec(reduce(lambda W,X:
            W.replace(*X),zip(" \n&$@",["","",
               " ","=lambda W,",")else "])

Python 2.5 – 2.7 is required. Each time you run this script, it generates a Bitcoin address with a matching private key.

Acquire and Release Fences Don’t Work the Way You’d Expect

Raymond Chen defined acquire and release semantics as follows, back in 2008:

An operation with acquire semantics is one which does not permit subsequent memory operations to be advanced before it. Conversely, an operation with release semantics is one which does not permit preceding memory operations to be delayed past it.

Raymond’s definition applies perfectly well to Win32 functions like InterlockedIncrementRelease, which he was writing about at the time. It also applies perfectly well to atomic operations in C++11, such as store(1, std::memory_order_release).

It’s perhaps surprising, then, that this definition does not apply to standalone acquire and release fences in C++11! Those are a whole other ball of wax.

To see what I mean, consider the following two code listings. They’re both taken from my post about the double-checked locking pattern in C++11. The code on the left performs a release operation directly on m_instance, while the code on the right uses a release fence instead.

Double-Checked Locking is Fixed In C++11

The double-checked locking pattern (DCLP) is a bit of a notorious case study in lock-free programming. Up until 2004, there was no safe way to implement it in Java. Before C++11, there was no safe way to implement it in portable C++.

The pattern gained attention for the shortcomings it exposed in those languages, and people began to write about it. In 2000, a group of high-profile Java developers got together and signed a declaration entitled “Double-Checked Locking Is Broken”. In 2004, Scott Meyers and Andrei Alexandrescu published an article entitled “C++ and the Perils of Double-Checked Locking”. Both papers are great primers on what DCLP is, and why, at the time, those languages were inadequate for implementing it.

All of that’s in the past. Java now has a revised memory model, with new semantics for the volatile keyword, which makes it possible to implement DCLP safely. Likewise, C++11 has a shiny new memory model and atomic library which enable a wide variety of portable DCLP implementations. C++11, in turn, inspired Mintomic, a small library I released earlier this year which makes it possible to implement DCLP on some older C/C++ compilers as well.

In this post, I’ll focus on the C++ implementations of DCLP.

Acquire and Release Fences

Acquire and release fences, in my opinion, are rather misunderstood on the web right now. That’s too bad, because the C++11 Standards Committee did a great job specifying the meaning of these memory fences. They enable robust algorithms which scale well across multiple cores, and map nicely onto today’s most common CPU architectures.

First things first: Acquire and release fences are considered low-level lock-free operations. If you stick with higher-level, sequentially consistent atomic types, such as volatile variables in Java 5+, or default atomics in C++11, you don’t need acquire and release fences. The tradeoff is that sequentially consistent types are slightly less scalable or performant for some algorithms.

On the other hand, if you’ve developed for multicore devices in the days before C++11, you might feel an affinity for acquire and release fences. Perhaps, like me, you remember struggling with the placement of some lwsync intrinsics while synchronizing threads on Xbox 360. What’s cool is that once you understand acquire and release fences, you actually see what we were trying to accomplish using those platform-specific fences all along.

Acquire and release fences, as you might imagine, are standalone memory fences, which means that they aren’t coupled with any particular memory operation. So, how do they work?

An acquire fence prevents the memory reordering of any read which precedes it in program order with any read or write which follows it in program order.

A release fence prevents the memory reordering of any read or write which precedes it in program order with any write which follows it in program order.

The Synchronizes-With Relation

In an earlier post, I explained how atomic operations let you manipulate shared variables concurrently without any torn reads or torn writes. Quite often, though, a thread only modifies a shared variable when there are no concurrent readers or writers. In such cases, atomic operations are unnecessary. We just need a way to safely propagate modifications from one thread to another once they’re complete. That’s where the synchronizes-with relation comes in.

Synchronizes-with” is a term invented by language designers to describe ways in which the memory effects of source-level operations – even non-atomic operations – are guaranteed to become visible to other threads. This is a desirable guarantee when writing lock-free code, since you can use it to avoid unwelcome surprises caused by memory reordering.

Synchronizes-with” is a fairly modern computer science term. You’ll find it in the specifications of C++11, Java 5+ and LLVM, all of which were published within the last 10 years. Each specification defines this term, then uses it to make formal guarantees to the programmer. One thing they have in common is that whenever there’s a synchronizes-with relationship between two operations, typically on different threads, there’s a happens-before relationship between those operations as well.

The Happens-Before Relation

Happens-before is a modern computer science term which is instrumental in describing the software memory models behind C++11, Java, Go and even LLVM.

You’ll find a definition of the happens-before relation in the specifications of each of the above languages. Conveniently, the definitions given in those specifications are basically the same, though each specification has a different way of saying it. Roughly speaking, the common definition can be stated as follows:

Let A and B represent operations performed by a multithreaded process. If A happens-before B, then the memory effects of A effectively become visible to the thread performing B before B is performed.

When you consider the various ways in which memory reordering can complicate lock-free programming, the guarantee that A happens-before B is a desirable one. There are several ways to obtain this guarantee, differing slightly from one programming language to next – though obviously, all languages must rely on the same mechanisms at the processor level.