# Automatically Detecting Text Encodings in C++

Consider the lowly text file.

This text file can take on a surprising number of different formats. The text could be encoded as ASCII, UTF-8, UTF-16 (little or big-endian), Windows-1252, Shift JIS, or any of dozens of other encodings. The file may or may not begin with a byte order mark (BOM). Lines of text could be terminated with a linefeed character \n (typical on UNIX), a CRLF sequence \r\n (typical on Windows) or, if the file was created on an older system, some other character sequence.

Sometimes it’s impossible to determine the encoding used by a particular text file. For example, suppose a file contains the following bytes:

# I/O in Plywood

Plywood is an open-source C++ framework I released a few weeks ago. It includes, among other things, a runtime module that exposes a cross-platform API for I/O, memory, threads, process management and more.

This post is about the I/O part. For those who don’t know, I/O stands for input/output, and refers to the part of a computer system that either writes serialized data to or reads serialized data from an external interface. The external interface could be a storage device, pipe, network connection or any other type of communication channel.

Typically, it’s the operating system’s responsibility to provide low-level I/O services to an application. But there’s still plenty of work that needs to happen at the application level, such as buffering, data conversion, performance tuning and exposing an interface that makes life easier on application programmers. That’s where Plywood’s I/O system comes in.

# A New Cross-Platform Open Source C++ Framework

For the past little while – OK, long while – I’ve been working on a custom game engine in C++. Today, I’m releasing part of that game engine as an open source framework. It’s called the Plywood framework.

Please note that Plywood, by itself, is not a game engine! It’s a framework for building all kinds of software using C++.

For example, Plywood’s documentation is generated with the help of a C++ parser, formatted by a Markdown parser, and runs on a custom webserver all written using Plywood.

# A Flexible Reflection System in C++: Part 2

In the previous post, I presented a basic system for runtime reflection in C++11. The post included a sample project that created a type descriptor using a block of macros:

// Define Node's type descriptor
REFLECT_STRUCT_BEGIN(Node)
REFLECT_STRUCT_MEMBER(key)
REFLECT_STRUCT_MEMBER(value)
REFLECT_STRUCT_MEMBER(children)
REFLECT_STRUCT_END()


At runtime, the type descriptor was found by calling reflect::TypeResolver<Node>::get().

This reflection system is small but very flexible. In this post, I’ll extend it to support additional built-in types. You can clone the project from GitHub to follow along. At the end, I’ll discuss other ways to extend the system.

# A Flexible Reflection System in C++: Part 1

In this post, I’ll present a small, flexible system for runtime reflection using C++11 language features. This is a system to generate metadata for C++ types. The metadata takes the form of TypeDescriptor objects, created at runtime, that describe the structure of other runtime objects.

I’ll call these objects type descriptors. My initial motivation for writing a reflection system was to support serialization in my custom C++ game engine, since I have very specific needs. Once that worked, I began to use runtime reflection for other engine features, too:

# How to Write Your Own C++ Game Engine

Lately I’ve been writing a game engine in C++. I’m using it to make a little mobile game called Hop Out. Here’s a clip captured from my iPhone 6. (Unmute for sound!)

Hop Out is the kind of game I want to play: Retro arcade gameplay with a 3D cartoon look. The goal is to change the color of every pad, like in Q*Bert.

Hop Out is still in development, but the engine powering it is starting to become quite mature, so I thought I’d share a few tips about engine development here.

# Can Reordering of Release/Acquire Operations Introduce Deadlock?

I wasn’t planning to write about lock-free programming again, but a commenter named Mike recently asked an interesting question on my Acquire and Release Semantics post from 2012. It’s a question I wondered about years ago, but could never really reconcile until (possibly) now.

A quick recap: A read-acquire operation cannot be reordered, either by the compiler or the CPU, with any read or write operation that follows it in program order. A write-release operation cannot be reordered with any read or write operation that precedes it in program order.

Those rules don’t prevent the reordering of a write-release followed by a read-acquire. For example, in C++, if A and B are std::atomic<int>, and we write:

A.store(1, std::memory_order_release);


…the compiler is free to reorder those statements, as if we had written:

int b = B.load(std::memory_order_acquire);
A.store(1, std::memory_order_release);


And that’s fair. Why the heck not? On many architectures, including x86, the CPU could perform this reordering anyway.

# Here’s a Standalone Cairo DLL for Windows

Cairo is an open source C library for drawing vector graphics. I used it to create many of the diagrams and graphs on this blog.

Cairo is great, but it’s always been difficult to find a precompiled Windows DLL that’s up-to-date and that doesn’t depend on a bunch of other DLLs. I was recently unable to find such a DLL, so I wrote a script to simplify the build process for one. The script is shared on GitHub:

If you just want a binary package, you can download one from the Releases page:

The binary package contains Cairo header files, import libraries and DLLs for both x86 and x64. The DLLs are statically linked with their own C runtime and have no external dependencies. Since Cairo’s API is pure C, these DLLs should work with any application built with any version of MSVC. I configured these DLLs to render text using FreeType because I find the quality of FreeType-rendered text better than Win32-rendered text, which Cairo normally uses by default. FreeType also supports more font formats and gives text a consistent appearance across different operating systems.

# Learn CMake’s Scripting Language in 15 Minutes

As explained in my previous post, every CMake-based project must contain a script named CMakeLists.txt. This script defines targets, but it can also do a lot of other things, such as finding third-party libraries or generating C++ header files. CMake scripts have a lot of flexibility.

Every time you integrate an external library, and often when adding support for another platform, you’ll need to edit the script. I spent a long time editing CMake scripts without really understanding the language, as the documentation is quite scattered, but eventually, things clicked. The goal of this post is to get you to the same point as quickly as possible.

# How to Build a CMake-Based Project

CMake is a versatile tool that helps you build C/C++ projects on just about any platform you can think of. It’s used by many popular open source projects including LLVM, Qt, KDE and Blender.

All CMake-based projects contain a script named CMakeLists.txt, and this post is meant as a guide for configuring and building such projects. This post won’t show you how to write a CMake script – that’s getting ahead of things, in my opinion.

As an example, I’ve prepared a CMake-based project that uses SDL2 and OpenGL to render a spinning 3D logo. You can build it on Windows, MacOS or Linux.