Preshing on ProgrammingPreshing on Programming

What Is a Bitcoin, Really?

When I first started learning about Bitcoin, I found plenty of information, but nothing that directly answered the most burning question:

When you buy bitcoins… what is it that you own, exactly?

That’s the question I’ll answer in this post. Along the way, I’ll introduce several key Bitcoin concepts. You’ll see for yourself how bitcoins are secured and how they’re transferred.

First and foremost, a bitcoin is a unit of account, in the same sense that a gallon is a unit of volume, or a gram is a unit of mass. You can’t pick up a bitcoin and hold it in your hand like you can a dollar bill. But that’s OK, because that’s not what’s important. What’s important is that:

  • Bitcoins can be possessed.
  • Bitcoins can be transferred.
  • Bitcoins are impossible to copy.

These three properties, combined, allow bitcoins to function effectively as a system of distribution of wealth. And fundamentally, that’s what makes bitcoins useful.

Bitcoin Address Generator in Obfuscated Python

Recently, I became interested in the inner workings of Bitcoin – specifically, the way it uses elliptic curve cryptography to generate Bitcoin addresses such as 1PreshX6QrHmsWbSs8pHpz6kLRcj9kdPy6. It inspired me to write another obfuscated Python script. The following is valid Python code:

_                   =r"""A(W/2,*M(3*G
       ashlib&as&h,os,re,bi    nascii&as&k;J$:int(
     k.b2a_hex(W),16);C$:C    (W/    58)+[W%58]if(W@
    [];"rip           em    d160");Y$:h.sha25
   6(W).digest();I$                 d=32:I(W/256,d-1)+
  chr(W%256)if(d>0@"";                  U$:J(k.a2b_base
 64(W));f=J(os.urando       m(64))        %(H-U("AUVRIxl
Qt1/EQC2hcy/JvsA="))+      1;M$Q,R,G       :((W*W-Q-G)%P,
(W*(G+2*Q-W*W)-R)%P)       ;P=H-2**       32-977;V$Q=P,L=
1,O=0:V(Q%W,W,O-Q/W*                      L,L)if(W@O%P;S,
T=A(f,U("eb5mfvncu6                    xVoGKVzocLBwKb/Nst
zijZWfKBWxb4F5g="),      U("SDra         dyajxGVdpPv8DhEI
qP0XtEimhVQZnEfQj/       sQ1Lg="),        0,0);F$:"1"+F(W
 [1:])if(W[:1           ]=="\0"@""        .join(map(B,C(
  J(W))));K$:               F(W          +Y(Y(W))[:4]);
   X.update(Y("\4"+                     I(S)+I(T)));B$
    :re.sub("[0OIl    _]|            [^\\w]","","".jo
     in(map(chr,ra    nge    (123))))[W];print"Addre
       ss:",K("\0"+X.dig    est())+"\nPrivkey:",K(
         "\x80"+I(f))""";exec(reduce(lambda W,X:
            W.replace(*X),zip(" \n&$@",["","",
               " ","=lambda W,",")else "])

Python 2.5 – 2.7 is required. Each time you run this script, it generates a Bitcoin address with a matching private key.

Acquire and Release Fences Don’t Work the Way You’d Expect

Raymond Chen defined acquire and release semantics as follows, back in 2008:

An operation with acquire semantics is one which does not permit subsequent memory operations to be advanced before it. Conversely, an operation with release semantics is one which does not permit preceding memory operations to be delayed past it.

Raymond’s definition applies perfectly well to Win32 functions like InterlockedIncrementRelease, which he was writing about at the time. It also applies perfectly well to atomic operations in C++11, such as store(1, std::memory_order_release).

It’s perhaps surprising, then, that this definition does not apply to standalone acquire and release fences in C++11! Those are a whole other ball of wax.

To see what I mean, consider the following two code listings. They’re both taken from my post about the double-checked locking pattern in C++11. The code on the left performs a release operation directly on m_instance, while the code on the right uses a release fence instead.

Double-Checked Locking is Fixed In C++11

The double-checked locking pattern (DCLP) is a bit of a notorious case study in lock-free programming. Up until 2004, there was no safe way to implement it in Java. Before C++11, there was no safe way to implement it in portable C++.

The pattern gained attention for the shortcomings it exposed in those languages, and people began to write about it. In 2000, a group of high-profile Java developers got together and signed a declaration entitled “Double-Checked Locking Is Broken”. In 2004, Scott Meyers and Andrei Alexandrescu published an article entitled “C++ and the Perils of Double-Checked Locking”. Both papers are great primers on what DCLP is, and why, at the time, those languages were inadequate for implementing it.

All of that’s in the past. Java now has a revised memory model, with new semantics for the volatile keyword, which makes it possible to implement DCLP safely. Likewise, C++11 has a shiny new memory model and atomic library that enable a wide variety of portable DCLP implementations. C++11, in turn, inspired Mintomic, a small library I released earlier this year which makes it possible to implement DCLP on some older C/C++ compilers as well.

In this post, I’ll focus on the C++ implementations of DCLP.

Acquire and Release Fences

Acquire and release fences, in my opinion, are rather misunderstood on the web right now. That’s too bad, because the C++11 Standards Committee did a great job specifying the meaning of these memory fences. They enable robust algorithms which scale well across multiple cores, and map nicely onto today’s most common CPU architectures.

First things first: Acquire and release fences are considered low-level lock-free operations. If you stick with higher-level, sequentially consistent atomic types, such as volatile variables in Java 5+, or default atomics in C++11, you don’t need acquire and release fences. The tradeoff is that sequentially consistent types are slightly less scalable or performant for some algorithms.

On the other hand, if you’ve developed for multicore devices in the days before C++11, you might feel an affinity for acquire and release fences. Perhaps, like me, you remember struggling with the placement of some lwsync intrinsics while synchronizing threads on Xbox 360. What’s cool is that once you understand acquire and release fences, you actually see what we were trying to accomplish using those platform-specific fences all along.

Acquire and release fences, as you might imagine, are standalone memory fences, which means that they aren’t coupled with any particular memory operation. So, how do they work?

An acquire fence prevents the memory reordering of any read which precedes it in program order with any read or write which follows it in program order.

A release fence prevents the memory reordering of any read or write which precedes it in program order with any write which follows it in program order.

The Synchronizes-With Relation

In an earlier post, I explained how atomic operations let you manipulate shared variables concurrently without any torn reads or torn writes. Quite often, though, a thread only modifies a shared variable when there are no concurrent readers or writers. In such cases, atomic operations are unnecessary. We just need a way to safely propagate modifications from one thread to another once they’re complete. That’s where the synchronizes-with relation comes in.

Synchronizes-with” is a term invented by language designers to describe ways in which the memory effects of source-level operations – even non-atomic operations – are guaranteed to become visible to other threads. This is a desirable guarantee when writing lock-free code, since you can use it to avoid unwelcome surprises caused by memory reordering.

Synchronizes-with” is a fairly modern computer science term. You’ll find it in the specifications of C++11, Java 5+ and LLVM, all of which were published within the last 10 years. Each specification defines this term, then uses it to make formal guarantees to the programmer. One thing they have in common is that whenever there’s a synchronizes-with relationship between two operations, typically on different threads, there’s a happens-before relationship between those operations as well.

The Happens-Before Relation

Happens-before is a modern computer science term which is instrumental in describing the software memory models behind C++11, Java, Go and even LLVM.

You’ll find a definition of the happens-before relation in the specifications of each of the above languages. Conveniently, the definitions given in those specifications are basically the same, though each specification has a different way of saying it. Roughly speaking, the common definition can be stated as follows:

Let A and B represent operations performed by a multithreaded process. If A happens-before B, then the memory effects of A effectively become visible to the thread performing B before B is performed.

When you consider the various ways in which memory reordering can complicate lock-free programming, the guarantee that A happens-before B is a desirable one. There are several ways to obtain this guarantee, differing slightly from one programming language to next – though obviously, all languages must rely on the same mechanisms at the processor level.

Atomic vs. Non-Atomic Operations

Much has already been written about atomic operations on the web, usually with a focus on atomic read-modify-write (RMW) operations. However, those aren’t the only kinds of atomic operations. There are also atomic loads and stores, which are equally important. In this post, I’ll compare atomic loads and stores to their non-atomic counterparts at both the processor level and the C/C++ language level. Along the way, we’ll clarify the C++11 concept of a “data race”.

An operation acting on shared memory is atomic if it completes in a single step relative to other threads. When an atomic store is performed on a shared variable, no other thread can observe the modification half-complete. When an atomic load is performed on a shared variable, it reads the entire value as it appeared at a single moment in time. Non-atomic loads and stores do not make those guarantees.

The World’s Simplest Lock-Free Hash Table

A lock-free hash table is a double-edged sword. There are applications where it can provide a performance improvement that would be impossible to achieve otherwise. The downside is that it’s complicated.

The first working lock-free hash table I heard about was written in Java by Dr. Cliff Click. He released the source code back in 2007 and gave a presentation about it at Google that same year. When I first watched that presentation, I’ll admit, I didn’t understand most of it. The main conclusion I took away from it was that Dr. Cliff Click must be some kind of wizard.

Luckily, six years has given me enough time to (mostly) catch up to Cliff on this subject. It turns out that you don’t have to be a wizard to understand and implement a very basic, but perfectly functional, lock-free hash table. I’ll share the source code for one here. I’m pretty sure that anyone with experience writing multithreaded C++, and a willingness to comb through previous information on this blog, can fully understand it.

A Lock-Free… Linear Search?

Why yes. In this post, I’ll present a C++ class which maps integer keys to integer values. Both its SetItem and GetItem methods are thread-safe and lock-free. The catch is that both operations are implemented using a linear search.

As you might suspect, this class is not very efficient for large collections. Obviously, I won’t recommend using it in practice – at least not for storing more than a few items. But I think it’s worth studying anyway, as it provides a nice context in which to discuss several lock-free programming concepts. As a bonus, once you grasp how this class works, you’ll be ready to tackle a basic lock-free hash table.

The class, which I’ve imaginatively named ArrayOfItems, contains an ordinary array of key/value pairs. The array is preallocated to a size sufficiently large to hold all the items we’re going to store. We declare both fields as mint_atomic32_t, since we’re going to manipulate them using Mintomic, a portable library for low-level lock-free programming which I released earlier this month.

struct Entry
    mint_atomic32_t key;
    mint_atomic32_t value;
Entry *m_entries;