<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[Preshing on Programming]]></title>
  <link href="https://preshing.com/feed" rel="self"/>
  <link href="https://preshing.com/"/>
  <updated>2021-10-31T23:07:44-04:00</updated>
  <id>https://preshing.com/</id>
  <author>
    <name><![CDATA[Jeff Preshing]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[How C++ Resolves a Function Call]]></title>
    <link href="https://preshing.com/20210315/how-cpp-resolves-a-function-call"/>
    <updated>2021-03-15T08:10:00-04:00</updated>
    <id>https://preshing.com/?p=20210315</id>
    <content type="html"><![CDATA[<p>C is a simple language. You&rsquo;re only allowed to have one function with each name. C++, on the other hand, gives you much more flexibility:</p>

<ul>
  <li>You can have multiple functions with the same name (<a href="https://docs.microsoft.com/en-us/cpp/cpp/function-overloading">overloading</a>).</li>
  <li>You can overload <a href="https://isocpp.org/wiki/faq/operator-overloading">built-in operators</a> like <code>+</code> and <code>==</code>.</li>
  <li>You can write <a href="https://isocpp.org/wiki/faq/templates#fn-templates">function templates</a>.</li>
  <li><a href="https://docs.microsoft.com/en-us/cpp/cpp/namespaces-cpp">Namespaces</a> help you avoid naming conflicts.</li>
</ul>

<p>I like these C++ features. With these features, you can make <code>str1 + str2</code> return the concatenation of two strings. You can have a pair of 2D points, and another pair of 3D points, and overload <code>dot(a, b)</code> to work with either type. You can have a bunch of array-like classes and write a single <code>sort</code> function template that works with all of them.</p>

<!--more-->
<p>But when you take advantage of these features, it&rsquo;s easy to push things too far. At some point, the compiler might unexpectedly reject your code <a href="https://godbolt.org/z/3ehEGr">with errors like</a>:</p>

<pre><code>error C2666: 'String::operator ==': 2 overloads have similar conversions
note: could be 'bool String::operator ==(const String &amp;) const'
note: or       'built-in C++ operator==(const char *, const char *)'
note: while trying to match the argument list '(const String, const char *)'
</code></pre>
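
<p>As a rough illustration, here is one way an error of this kind can arise. This is a minimal sketch, not necessarily the code behind the link above: the <code>String</code> class and the <code>same_text</code> helper are hypothetical. The class converts implicitly both from and to <code>const char*</code>, so for <code>a == b</code> neither the member operator nor the built-in pointer comparison is a better match for every argument.</p>

```cpp
#include <cassert>
#include <cstring>

// Hypothetical String class (an assumption for illustration). It converts
// implicitly in both directions: const char* -> String and String -> const char*.
struct String {
    const char* chars;

    String(const char* s) : chars{s} {}             // implicit conversion from const char*
    operator const char*() const { return chars; }  // implicit conversion to const char*

    bool operator==(const String& other) const {
        return std::strcmp(chars, other.chars) == 0;
    }
};

// Hypothetical helper comparing a String with a raw C string.
inline bool same_text(const String& a, const char* b) {
    // return a == b;
    //   Ambiguous: String::operator== needs a user-defined conversion on b,
    //   while the built-in operator==(const char*, const char*) needs one on a.
    return a == String{b};  // converting explicitly makes the member operator the clear winner
}
```

<p>Converting one side explicitly is only one possible workaround. Removing one of the two implicit conversions would also resolve the ambiguity.</p>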

<p>Like many C++ programmers, I&rsquo;ve struggled with such errors throughout my career. Each time it happened, I would usually scratch my head, search online for a better understanding, then change the code until it compiled. But more recently, while developing a new runtime library for <a href="https://plywood.arc80.com/">Plywood</a>, I was thwarted by such errors over and over again. It became clear that despite all my previous experience with C++, something was missing from my understanding and I didn&rsquo;t know what it was.</p>

<p>Fortunately, it&rsquo;s now 2021 and information about C++ is more comprehensive than ever. Thanks especially to <a href="https://en.cppreference.com/w/cpp/language">cppreference.com</a>, I now know what was missing from my understanding: a clear picture of the <strong>hidden algorithm</strong> that runs for every function call at compile time.</p>

<p>This is how the compiler, given a function call expression, figures out exactly which function to call:</p>

<svg style="max-width:636px" version="1.1" viewbox="0 0 168.28 187.32" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
 <g transform="translate(-22.225 -117.21)">
  <g fill="none" stroke="#9fbaba" stroke-linecap="round" stroke-width="2.6458">
   <path d="m112.45 196.92v34.728" />
   <path d="m73.819 121.18v9.9606c0 2.506 2.2513 4.7238 4.9108 4.7238h27.101c4.5249 0 6.6835 3.1219 6.6835 5.9351" />
   <path d="m151.14 121.18v9.9606c0 2.506-2.2513 4.7238-4.9108 4.7238h-27.101c-4.5249 0-6.6835 3.1219-6.6835 5.9351" />
  </g>
  <g stroke-width=".26458px">
   <text x="172.50816" y="134.9375" fill="#ff5555" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="172.50816" y="134.9375">may perform</tspan><tspan x="172.50816" y="138.82687">argument-dependent</tspan><tspan x="172.50816" y="142.71626">lookup</tspan></text>
   <path d="m167.71 126.77c2.6806 1.181 3.9049 2.8679 4.9163 4.6289" fill="none" stroke="#f55" />
   <text x="57.149902" y="183.09174" fill="#ff5555" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="57.149902" y="183.09174">may involve</tspan><tspan x="57.149902" y="186.98111">SFINAE</tspan></text>
  </g>
  <path d="m112.51 141.82v1.5875c0 3.2058-2.1874 6.2177-6.2177 6.2177h-7.2761c-4.9493 0-6.6807 3.1542-6.6807 6.6807v26.524c0 3.3398 2.7272 6.6477 6.6477 6.6477h6.6477c4.4641 0 6.813 2.3631 6.813 6.2012" fill="none" stroke="#9fbaba" stroke-linecap="round" stroke-width="2.6458" />
  <g font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text x="144.46245" y="149.22499" fill="#648b8b" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="144.46245" y="149.22499">non-template</tspan><tspan x="144.46245" y="153.11436">function</tspan></text>
   <text x="81.491577" y="149.22499" fill="#648b8b" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="81.491577" y="149.22499">function</tspan><tspan x="81.491577" y="153.11436">template</tspan></text>
   <text x="155.04581" y="215.10635" fill="#ff5555" text-align="center" style="line-height:100%" xml:space="preserve"><tspan x="155.04581" y="215.10635">finds implicit</tspan><tspan x="155.04581" y="218.81052">conversions</tspan></text>
  </g>
  <path transform="matrix(.22958 0 0 .16103 114.31 201.07)" d="m12.088 188.58-20.29 35.143-20.29-35.143z" fill="#9fbaba" />
  <path d="m60.722 195.39h103.72" fill="none" stroke="#648b8b" stroke-dasharray="1.05833, 1.05833" stroke-width=".26458" />
  <g font-family="Arimo" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text x="50.498344" y="194.73335" fill="#648b8b" font-size="3.7042px" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="50.498344" y="194.73335">candidate</tspan><tspan x="50.498344" y="198.62273">functions</tspan></text>
   <text x="50.535419" y="236.80215" fill="#648b8b" font-size="3.7042px" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="50.535419" y="236.80215">viable</tspan><tspan x="50.535419" y="240.69153">functions</tspan></text>
   <text transform="rotate(-11.368)" x="21.082979" y="263.32162" fill="#ff5555" font-size="3.891px" text-align="center" style="line-height:100%" xml:space="preserve"><tspan x="21.082979" y="263.32162" fill="#ff5555" font-family="Arimo" font-size="3.891px" font-weight="bold" stroke-width=".26458px">TIEBREAKERS</tspan></text>
  </g>
  <g>
   <path d="m112.45 125.94v17.462c0 3.2058 2.1874 6.2177 6.2177 6.2177h7.2761c4.9493 0 6.6807 3.1542 6.6807 6.6807v26.524c0 3.3398-2.7272 6.6477-6.6477 6.6477h-6.6477c-4.4641 0-6.813 2.3631-6.813 6.2012" fill="none" stroke="#9fbaba" stroke-linecap="round" stroke-width="2.6458" />
   <rect x="69.585" y="156.37" width="45.508" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="92.2911" y="161.09467" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="92.2911" y="161.09467">Template argument</tspan><tspan x="92.2911" y="165.06342">deduction</tspan></text>
   <path d="m68.046 175.93c-3.3279.61575-5.472 2.1906-7.6648 3.7259" fill="none" stroke="#f55" stroke-width=".26458px" />
   <rect x="69.585" y="171.71" width="45.508" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="92.2911" y="176.44052" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="92.2911" y="176.44052">Template argument</tspan><tspan x="92.2911" y="180.40927">substitution</tspan></text>
   <rect x="87.329" y="200.29" width="50.271" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="112.41785" y="205.01552" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="112.41785" y="205.01552">Arguments must</tspan><tspan x="112.41785" y="208.98427">be compatible</tspan></text>
   <rect x="87.312" y="215.64" width="50.271" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="112.41785" y="220.36137" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="112.41785" y="220.36137">Constraints must</tspan><tspan x="112.41785" y="224.33012">be satisfied (C++20)</tspan></text>
   <rect x="78.846" y="259.56" width="67.204" height="7.1438" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="112.41785" y="264.28214" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="112.41785" y="264.28214" rotate="0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0" fill="#666666" font-size="3.9688px" stroke-width=".26458px">Better-matching arguments wins</tspan></text>
   <rect x="78.846" y="272.26" width="67.204" height="7.1438" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="112.41785" y="276.98206" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="112.41785" y="276.98206" fill="#666666" font-size="3.9688px" stroke-width=".26458px">Non-template function wins</tspan></text>
   <rect x="78.846" y="284.96" width="67.204" height="7.1438" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
  </g>
  <g font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text x="112.41785" y="289.68207" fill="#666666" text-align="center" style="line-height:100%" xml:space="preserve"><tspan x="112.41785" y="289.68207" fill="#666666" font-size="3.9688px" stroke-width=".26458px">More specialized template wins</tspan></text>
   <text x="112.43531" y="302.94763" fill="#999999" text-align="center" style="line-height:100%" xml:space="preserve"><tspan x="112.43531" y="302.94763" fill="#999999" font-size="3.9688px" stroke-width=".26458px">additional tiebreakers</tspan></text>
   <text id="text1221" x="112.45243" y="270.66858" fill="#b3b3b3" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="112.45243" y="270.66858" fill="#b3b3b3" font-size="3.7042px" stroke-width=".26458px">otherwise</tspan></text>
  </g>
  <g id="g1686" transform="translate(25.929 -4.2333)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <use id="use1688" transform="translate(5.8852e-8 -12.7)" width="100%" height="100%" xlink:href="#g1686" />
  <use id="use1690" transform="translate(5.8852e-8 12.7)" width="100%" height="100%" xlink:href="#g1686" />
  <use id="use1692" transform="translate(5.8852e-8 12.7)" width="100%" height="100%" xlink:href="#use1690" />
  <use transform="translate(27.517)" width="100%" height="100%" xlink:href="#g1686" />
  <use transform="translate(27.517)" width="100%" height="100%" xlink:href="#use1688" />
  <use transform="translate(27.517)" width="100%" height="100%" xlink:href="#use1690" />
  <use transform="translate(27.517)" width="100%" height="100%" xlink:href="#use1692" />
  <use id="use1702" transform="translate(5.8852e-8 12.7)" width="100%" height="100%" xlink:href="#text1221" />
  <use transform="translate(5.8852e-8 12.7)" width="100%" height="100%" xlink:href="#use1702" />
  <g fill="none">
   <path d="m139.16 206.1c3.965 1.0146 7.0919 2.9053 8.9994 5.3009" stroke="#f55" stroke-width=".26458px" />
   <path d="m60.722 237.2h103.72" stroke="#648b8b" stroke-dasharray="1.05833, 1.05833" stroke-width=".26458" />
   <g stroke="#1a1a1a" stroke-linecap="round" stroke-width=".794">
    <path d="m38.881 119.24c-1.4271.01-2.2359.82958-2.2359 2.2807v4.7719c0 1.8284-.8478 2.6677-2.3282 2.6904" />
    <path d="m38.881 138.77c-1.4271-.01-2.2359-.82958-2.2359-2.2807v-4.7719c0-1.8284-.8478-2.6677-2.3282-2.6904" />
    <path d="m38.881 147.29c-1.4271.01-2.2359.82958-2.2359 2.2807v16.414c0 1.8284-.8478 2.6677-2.3282 2.6904" />
    <path d="m38.881 190.08c-1.4271-.01-2.2359-.82958-2.2359-2.2807v-16.414c0-1.8284-.8478-2.6677-2.3282-2.6904" />
    <path d="m38.881 200.73c-1.4271.01-2.2359.82958-2.2359 2.2807v46.047c0 1.8284-.8478 2.6677-2.3282 2.6904" />
    <path d="m38.881 302.79c-1.4271-.01-2.2359-.82958-2.2359-2.2807v-46.047c0-1.8284-.8478-2.6677-2.3282-2.6904" />
   </g>
  </g>
  <g font-family="Arimo" font-size="4.4979px" font-weight="bold" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text transform="rotate(-90)" x="-251.63768" y="30.929184" fill="#333333" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="-251.63768" y="30.929184" fill="#1a1a1a" font-family="Arimo" font-size="4.4979px" font-weight="bold" stroke-width=".26458px">Overload Resolution</tspan></text>
   <text transform="rotate(-90)" x="-129.00381" y="26.166685" fill="#1a1a1a" text-align="center" style="line-height:102%" xml:space="preserve"><tspan x="-129.00381" y="26.166685">Name</tspan><tspan x="-129.00381" y="30.754564">Lookup</tspan></text>
   <text transform="rotate(-90)" x="-168.70738" y="26.166685" fill="#1a1a1a" text-align="center" style="line-height:102%" xml:space="preserve"><tspan x="-168.70738" y="26.166685">Special Handling of</tspan><tspan x="-168.70738" y="30.754564">Function Templates</tspan></text>
  </g>
  <image id="image2324" x="117.48" y="239.45" width="16.447" height="12.443" stroke-width="1.85" xlink:href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAHMAAABXBAMAAADFUGf1AAAAMFBMVEX///9wcMCAoNCgwOBgcIDQ8PDw0JDw4MDwsHBgMBDgkGCwYECAQDDQQDAgICBgUIBvBO4iAAAAAXRSTlMAQObYZgAAB71JREFUeF6dl09sG8cVxlnfBAuOh0IORY0yu6yRiwBVJHQokMKX9GrAEogee5udUrEPBbxrGMh1Ox26hXrogSZr3+Tx7B5iF3CpmWkg261gcrfuzQlayAiCNk3qaGOqpyKBt+8thRSVuLbCBx4ILH/8vvdvhqyURfZRZcY49fH5WdHs/PlZZLMi/j6D5VM/anzvfJadmgV9O5vETIafBJf+0QDlb6p7Ktt76tEse/vt85+nabr9jdCLT/9JsxQsP1PKmDQ9NnqSvpt9culf/vtZNuJcIW2Hx0N3govZXvap/VWWSUq54lxE9njKvw/ez/Y6nw6vfT4KKVOblAmlomPp/vzqjb2nT/Tt6HESMq4oBPi2G8eo0uZPoUAfXB2MHscS7ALpUSqi7WOgz55mn+x9+EvDb8RKhZQG+ALZYxj+zDzOdnf/Fm09i9WmD1zgez4T8Stl55KtK+rPu17QvnE9lh4NAQ3AsohfWeREx76/84cPA3ZjEN3yWYH6UOooeYXsvJa/8G347u7uX7tSQ46b1AOUUdax9lUo73SS4e7F3aTXueWHXNLCccgiK14u25P8Uvd6EH9wdUtrsNsGlBaoso9i131ZqoBmEDvtNLrFOGWDEFnKqBJWPGw5L0VF9lk/fXLbaMopVIdTDMbA8qM7ZJm8DFVJt/vbK1tmk1FsiuaoiyFs3FpzTpd2VQkVa+qJvkY5qG2kOAalPrMd4jbK0L5RyigY+9uppBT7woQxWhVfA+jKZZeUiBolRaQ76bD/CKcXwmdxAqQHEwXviFtWqJ6WnMca7KFDr0B5lEhEfa8dp81mmSp+KGgb5AANMCiP+gUa0HZsietUT5eMEkhdBZWQ8olhH9AUiwwoU/ouqDplKOgYdMwA9YBkItrSIAvBhLQrwQUyTTbFevhsC2W5nLSTK6OhTgqTUKxDqpenOk5UWHQeqyUUR5apSKvYmOLJgIqHq2cvONNQVGLCpIlWkWQcAkjJVKpDPKAk7TxsXWhNke2bAeTIdTT8T5pIiWaRDKE9Axws2gmvVeuts8tHZLtaCQk6MvpLnnDaxuwA50LGSdEd1hE2J/ULa4dl57WEzEBI2JvkSwnlCrHERfJF5X2mxF1S32+8XnWOJBpQoRQ3wzH5SkJufNJY3AdEKZedO6RVq9ed/1+fdMA9nwu82W6Oq2ewYEU3sVgHqGB2+fXLpNpoVsnXGxqlqZE0YFgX2yU5+bZB63KybYCGuH6hsI03A+dc80B2HhvGBTQOkwlFNCTV2tjpgayYLDlXVoce+GWdh6tvBm5zwWmQGqCjNhdoCz8VYlPvVxtuXj2jwarik9YCiqeTsIk1dXdtH+D8eaVyr50YjYuCwUDh126zWRuTRHM1icjAlPkU98BoFd1drTbdc3leqQzesfEmZUoWvkGh3603SO46aWLgZVIIaGtBjoprepU0lwF9XrlOmQ2ZSIzkAhWGOrrvkLxK8nzfdQhJBlhknE+7qLDu0NuF5RxVh/zaiCmbTqwBqaDEwdK4muf/XnGqi5sBYqBmHkAvYEbYHXdhCUUrvX5XR/3ETsx1jYKM/hT8ZOwC2iKNHYohpIjSxZEM8f6J3f1CtDKnrd4y0bAPOQ17qojf/fhnuVvL9xvkh8zqzYAJZf44PqNFGCC6CuRHxb6kSWpsMf/FdCioRMvJnVqeu/W31ns9QMFKbXxmxGmBLtfyFwU61x92043J0iGqFY/ca
k4AJWd/YKJ+6EPh74/dtxKJKI3hUQapIttFEFFdDKJWttrKx06ej1cWR4m+5ytla/t1a+TkOnhQy77ID626lkphmdKzraXchYzqZCeWXiDV1hJpWqNCFvht8+C7X+SH0DktJGaadn/TOpdXwfE+ecQ8v3MvepCT76wn8NSjneRBnh9B8eQLmYi2T7QIZAqyJB20I03tzf1lbYRkIWWxuf+88tohtNJTXIe4KZWGA10HW+PFxGgWA9q0Gs9XX+h4COU9gmpuiu2pVOsN6EAOlpMBj/S1vL7yFR7OHlM83pp2RQLKcdC3v7XSIM4LkM37cOvBuVFf+xI0r3gMHNsSVCL6XqW+TFxkD4IsrI7wVAx9j5lo4yg6P5KRDFG1coIsNJby7IAcO+T7BYoX9AAeT/k5gChDtEKa0NgX2YSsNsnpydFG6TrtbE+9dARuMyZzwl1CNMMyk4WVtQnKIJ1wKoqjyLE5yNZyNPyxg9fxSqOCK4co25yKzmmplTrIBUcGjwpASf2NyklA8UagcjqaqI6K7MbX6GuINpukihcLYBxQARWemiw3Ji1QwHDgcpcsNJeLZ1hBCmg8Fe0rKVC1iAPUWWi+cXCfiQK1ZT+HhTL/M4TJOgTJwpICknXem4rOjSQ/jLqnkUNZI2kI6HYJqrieXgaU5RxnbTqaahV2Sv9JzUsBqJ2qmRqlZEfZMlRLzoQtIaXSkSp1bCTn09AuklxEGsapLFlAlT0qmSDJuVCiFD05QPTovimOYGGpzHAPVeMj36c0ujFGK9OtlMQIbUUbh1VtopUyw24/LSXnDK6kOprOfPKK/+TYnJAyrER5lFcJx39jFhREAyrsDOi8Cr3A5zM51pQGM8qeDCmHANkZ6kTDdTZYj2eQHQSC7ryzns6C+rEXrYPfGRyr7Z49AP4Leaiq6xwgjn4AAAAASUVORK5CYII=" preserveaspectratio="none" />
  <use transform="matrix(-1 0 0 1 224.71 0)" width="100%" height="100%" xlink:href="#image2324" />
  <g>
   <rect x="133.61" y="118.27" width="32.842" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="149.98691" y="122.99457" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="149.98691" y="122.99457">Unqualified</tspan><tspan x="149.98691" y="126.96332">name lookup</tspan></text>
   <rect x="95.515" y="118.27" width="32.842" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="111.88691" y="122.99457" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="111.88691" y="122.99457">Qualified</tspan><tspan x="111.88691" y="126.96332">name lookup</tspan></text>
   <rect x="57.415" y="118.27" width="32.842" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="73.786934" y="122.99457" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="73.786934" y="122.99457">Member</tspan><tspan x="73.786934" y="126.96332">name lookup</tspan></text>
  </g>
 </g>
</svg>

<p>These steps are enshrined in the C++ standard. Every C++ compiler must follow them, and the whole thing happens <strong>at compile time</strong> for every function call expression in the program. In hindsight, it&rsquo;s obvious there has to be an algorithm like this. It&rsquo;s the only way C++ can support all the above-mentioned features at the same time. This is what you get when you combine those features together.</p>

<p>I imagine the overall intent of the algorithm is to &ldquo;do what the programmer expects&rdquo;, and to some extent, it&rsquo;s successful at that. You can get pretty far ignoring the algorithm altogether. But when you start to use multiple C++ features, as you might when developing a library, it&rsquo;s better to know the rules.</p>

<p>So let&rsquo;s walk through the algorithm from beginning to end. A lot of what we&rsquo;ll cover will be familiar to experienced C++ programmers. Nonetheless, I think it can be quite eye-opening to see how all the steps fit together. (At least it was for me.) We&rsquo;ll touch on several advanced C++ subtopics along the way, like argument-dependent lookup and SFINAE, but we won&rsquo;t dive too deeply into any particular subtopic. That way, even if you know nothing else about a subtopic, you&rsquo;ll at least know how it fits into C++&rsquo;s overall strategy for resolving function calls at compile time. I&rsquo;d argue that&rsquo;s the most important thing.</p>

<h2 id="name-lookup">Name Lookup</h2>

<p>Our journey begins with a function call expression. Take, for example, the expression <code>blast(ast, 100)</code> in the code listing below. This expression is clearly meant to call a function named <code>blast</code>. But which one?</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">namespace</span> galaxy {
    <span class="keyword">struct</span> Asteroid {
        <span class="predefined-type">float</span> radius = <span class="integer">12</span>;
    };
    <span class="directive">void</span> blast(Asteroid* ast, <span class="predefined-type">float</span> force);
}

<span class="keyword">struct</span> Target {
    galaxy::Asteroid* ast;
    Target(galaxy::Asteroid* ast) : ast{ast} {}
    <span class="directive">operator</span> galaxy::Asteroid*() <span class="directive">const</span> { <span class="keyword">return</span> ast; }
};

<span class="predefined-type">bool</span> blast(Target target);
<span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt; <span class="directive">void</span> blast(T* obj, <span class="predefined-type">float</span> force);

<span class="directive">void</span> play(galaxy::Asteroid* ast) {
    <span class="highlight">blast(ast, <span class="integer">100</span>);</span>
}
</pre></div>
</div>
</div>

<p>The first step toward answering this question is <strong>name lookup</strong>. In this step, the compiler looks at all functions and function templates that have been declared up to this point and identifies the ones that <em>could be</em> referred to by the given name.</p>

<svg style="max-width:506px" version="1.1" viewbox="0 0 133.88 31.221" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(-22.225 -117.21)">
  <path d="m78.317 145.19v-21.101" fill="none" stroke="#9fbaba" stroke-width="2.6458" />
  <path transform="matrix(.22958 0 0 .16103 80.2 112.17)" d="m12.088 188.58-20.29 35.143-20.29-35.143z" fill="#9fbaba" />
  <g>
   <path d="m39.57 121.14v9.9606c0 2.506 2.2513 4.7238 4.9108 4.7238h27.101c4.5249 0 6.735 3.1219 6.735 5.9351" fill="none" stroke="#9fbaba" stroke-linecap="round" stroke-width="2.6458" />
   <path d="m116.89 121.14v9.9606c0 2.506-2.2513 4.7238-4.9108 4.7238h-27.101c-4.5249 0-6.5659 3.1219-6.5659 5.9351" fill="none" stroke="#9fbaba" stroke-linecap="round" stroke-width="2.6458" />
   <text x="138.25912" y="134.89708" fill="#ff5555" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="138.25912" y="134.89708">may perform</tspan><tspan x="138.25912" y="138.78645">argument-dependent</tspan><tspan x="138.25912" y="142.67584">lookup</tspan></text>
   <path d="m133.47 126.73c2.6806 1.181 3.9049 2.8679 4.9163 4.6289" fill="none" stroke="#f55" stroke-width=".26458px" />
  </g>
  <g>
   <rect x="99.219" y="118" width="32.842" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="115.59104" y="122.73" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="115.59104" y="122.73">Unqualified</tspan><tspan x="115.59104" y="126.69875">name lookup</tspan></text>
   <rect x="61.119" y="118" width="32.842" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="77.491066" y="122.73" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="77.491066" y="122.73">Qualified</tspan><tspan x="77.491066" y="126.69875">name lookup</tspan></text>
   <rect x="23.019" y="118" width="32.842" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="39.391109" y="122.73" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="39.391109" y="122.73">Member</tspan><tspan x="39.391109" y="126.69875">name lookup</tspan></text>
  </g>
 </g>
</svg>

<p>As the flowchart suggests, there are three main types of name lookup, each with its own set of rules.</p>

<ul>
  <li><strong>Member name lookup</strong> occurs when a name is to the right of a <code>.</code> or <code>-&gt;</code> token, as in <code>foo-&gt;bar</code>. This type of lookup is used to locate class members.</li>
  <li><strong>Qualified name lookup</strong> occurs when a name has a <code>::</code> token in it, like <code>std::sort</code>. This type of name is explicit. The part to the right of the <code>::</code> token is only looked up in the scope identified by the left part.</li>
  <li><strong>Unqualified name lookup</strong> is neither of those. When the compiler sees an unqualified name, like <code>blast</code>, it looks for matching declarations in various scopes depending on the context. There&rsquo;s a <a href="https://en.cppreference.com/w/cpp/language/unqualified_lookup">detailed set of rules</a> that determines exactly where the compiler should look.</li>
</ul>
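
<p>The three kinds of lookup described above can be sketched side by side. All names here (<code>audio</code>, <code>Mixer</code>, <code>volume</code> and the three wrapper functions) are made up for illustration. Each <code>volume</code> returns a different number purely so you can tell which declaration the call found.</p>

```cpp
#include <cassert>

// Hypothetical declarations that all share the name "volume".
namespace audio {
    inline int volume() { return 7; }
}

struct Mixer {
    int volume() const { return 3; }
};

inline int volume() { return 1; }

// Member name lookup: the name appears to the right of "->",
// so it is looked up in the class Mixer.
inline int member_lookup(const Mixer* m) { return m->volume(); }

// Qualified name lookup: the name to the right of "::" is looked up
// only in the scope named on the left, here the namespace audio.
inline int qualified_lookup() { return audio::volume(); }

// Unqualified name lookup: the compiler searches the enclosing scopes
// and finds the volume declared at global scope.
inline int unqualified_lookup() { return volume(); }
```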

<p>In our case, we have an unqualified name. Now, when name lookup is performed for a function call expression, the compiler may find multiple declarations. Let&rsquo;s call these declarations <strong>candidates</strong>. In the example above, the compiler finds three candidates:</p>

<svg style="max-width:496px" version="1.1" viewbox="0 0 131.23 30.162" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(6.2943e-8 -24.077)">
  <g font-family="cascadia_code" letter-spacing="0px" stroke-width=".26458px" word-spacing="0px">
   <text x="7.7322202" y="40.754318" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="7.7322202" y="40.754318" fill="#666666" font-size="3.87px" stroke-width=".26458px">void galaxy::<tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(galaxy::Asteroid* ast, float force)</tspan></text>
   <text x="-9.9415274" y="47.394417" fill="none" font-size="4.478px" stroke="#000000" text-align="center" text-anchor="middle" style="line-height:125%" xml:space="preserve"><tspan x="-9.9415274" y="47.394417" stroke-width=".26458px" /></text>
   <text x="7.8947835" y="46.607025" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="7.8947835" y="46.607025" fill="#666666" font-size="3.87px" stroke-width=".26458px">bool <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(Target target)</tspan></text>
   <text x="8.1737185" y="52.45974" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="8.1737185" y="52.45974" fill="#666666" font-size="3.87px" stroke-width=".26458px">template &lt;typename T&gt; void <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(T* obj, float force)</tspan></text>
  </g>
  <g>
   <text x="67.050423" y="27.764225" fill="#ff5555" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="67.050423" y="27.764225">this one comes from</tspan><tspan x="67.050423" y="31.653603">the <tspan font-family="cascadia_code" font-weight="bold">galaxy</tspan> namespace</tspan></text>
   <path d="m46.603 30.685c-2.9126.8432-4.9272 2.7755-6.6146 5.2795" fill="none" stroke="#f55" stroke-width=".26458px" />
   <path d="m19.112 38.303c.73679-1.1647 6.8656-1.3761 11.674-1.505 7.0358-.18862 14.238.33481 16.654.7727 2.8302.51312 3.903 3.8448-.03176 4.2333-6.5608.64783-19.293.83821-26.193.26458-2.403-.21925-4.946-1.494-2.8931-3.346" fill="none" stroke="#f55" stroke-width=".52917" />
  </g>
  <circle cx="3.3073" cy="45.376" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="46.613438" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="46.613438" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">2</tspan></text>
  <circle cx="3.3073" cy="39.555" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="40.792603" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="40.792603" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">1</tspan></text>
  <circle cx="3.3073" cy="51.197" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="52.434273" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="52.434273" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">3</tspan></text>
 </g>
</svg>

<p>The first candidate, circled above, deserves extra attention because it demonstrates a feature of C++ that&rsquo;s easy to overlook: <strong>argument-dependent lookup</strong>, or <strong>ADL</strong> for short. I&rsquo;ll admit, I spent most of my C++ career unaware of ADL&rsquo;s role in name lookup. Here&rsquo;s a quick summary in case you&rsquo;re in the same boat. Normally, you wouldn&rsquo;t expect this function to be a candidate for this particular call, since it was declared inside the <code>galaxy</code> namespace and the call comes from <em>outside</em> the <code>galaxy</code> namespace. There&rsquo;s no <code>using namespace galaxy</code> directive in the code to make this function visible, either. So why is this function a candidate?</p>

<p>The reason is that any time you use an unqualified name in a function call &ndash; and the name doesn&rsquo;t refer to a class member, among other things &ndash; ADL kicks in, and name lookup becomes greedier. Specifically, in addition to the usual places, the compiler looks for candidate functions <em>in the namespaces of the argument types</em> &ndash; hence the name &ldquo;argument-dependent lookup&rdquo;.</p>
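
<p>To see ADL in isolation, here is a stripped-down sketch based on the listing above, with only the <code>galaxy</code> declarations kept. The function body, the <code>int</code> return value and the <code>play_qualified</code> helper are assumptions added so the result can be checked. They are not part of the original example.</p>

```cpp
#include <cassert>

namespace galaxy {
    struct Asteroid {
        float radius = 12;
    };

    // Declared only inside galaxy. The int return value is an illustrative
    // assumption so the caller can confirm that this function ran.
    inline int blast(Asteroid* ast, float force) {
        return (ast != nullptr && force > 0) ? 1 : 0;
    }
}

// Unqualified call from outside galaxy. Ordinary lookup finds no blast in
// this scope, but ADL also searches galaxy because the argument has type
// galaxy::Asteroid*.
inline int play(galaxy::Asteroid* ast) {
    return blast(ast, 100);
}

// Qualified call for comparison: ADL is not involved, and only the
// galaxy namespace is searched.
inline int play_qualified(galaxy::Asteroid* ast) {
    return galaxy::blast(ast, 100);
}
```

<p>Both wrappers end up calling <code>galaxy::blast</code>. The difference is that the unqualified call compiles only because ADL adds the <code>galaxy</code> namespace to the search.</p>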

<svg style="max-width:362px" version="1.1" viewbox="0 0 95.779 57.415" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
 <g transform="translate(.26458 -24.077)">
  <text x="-9.9415274" y="47.394417" fill="none" font-family="cascadia_code" font-size="4.478px" letter-spacing="0px" stroke="#000000" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="-9.9415274" y="47.394417" stroke-width=".26458px" /></text>
  <text x="19.998552" y="41.308662" fill="#666666" font-family="cascadia_code" font-size="4.8295px" letter-spacing="0px" stroke-width=".26458px" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="19.998552" y="41.308662" fill="#666666" font-size="4.8295px" stroke-width=".26458px"><tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(ast, 100)</tspan></text>
  <g font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text x="41.55191" y="28.350765" fill="#648b8b" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="41.55191" y="28.350765" fill="#648b8b" font-family="Arimo" font-size="3.7042px" stroke-width=".26458px" text-align="center" text-anchor="middle">function call expression</tspan></text>
   <text x="15.887339" y="54.015358" fill="#648b8b" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="15.887339" y="54.015358" fill="#648b8b" font-family="Arimo" font-size="3.7042px" stroke-width=".26458px" text-align="center" text-anchor="middle">unqualified name</tspan></text>
   <text x="56.368591" y="55.33828" fill="#ff5555" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="56.368591" y="55.33828" font-family="Arimo" font-size="3.7042px" text-align="center">argument type is</tspan><tspan x="56.368591" y="59.227657" font-family="cascadia_code" font-size="3.7335px" font-weight="bold" text-align="center">galaxy::Asteroid*</tspan></text>
   <text x="67.216515" y="71.21328" fill="#ff5555" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="67.216515" y="71.21328" text-align="center">the <tspan fill="#ff5555" font-family="cascadia_code" font-size="3.7335px" font-weight="bold">galaxy</tspan> namespace is</tspan><tspan x="67.216515" y="75.16098" text-align="center">searched for candidate functions</tspan><tspan x="67.216515" y="79.050362" text-align="center">(argument-dependent lookup)</tspan></text>
  </g>
  <path id="path2671" d="m19.043 35.587c-.05575-1.6576 1.3858-3.0428 3.6483-3.0428h14.212c2.1176 0 3.6845-.87517 3.6483-3.0348" fill="none" stroke="#648b8b" stroke-linecap="round" stroke-width=".52917" />
  <use transform="matrix(-1 0 0 1 81.141 0)" width="100%" height="100%" stroke-width="1.3795" xlink:href="#path2671" />
  <g fill="none" stroke-linecap="round">
   <path d="m22.49 50.271 3.4396-7.4083" stroke="#648b8b" stroke-width=".52917" />
   <path d="m46.567 51.594c-2.5688-2.5056-3.543-5.669-4.7625-8.7313" stroke="#f55" stroke-width=".26458" />
   <path d="m51.594 67.733c-2.4669-1.8753-4.0485-4.3167-5.0271-7.1438" stroke="#f55" stroke-width=".26458" />
  </g>
 </g>
</svg>

<p>The <a href="https://en.cppreference.com/w/cpp/language/adl">complete set of rules governing ADL</a> is more nuanced than what I&rsquo;ve described here, but the key thing is that ADL only works with unqualified names. For qualified names, which are looked up in a single scope, there&rsquo;s no point in searching anywhere else. ADL also works when overloading built-in operators like <code>+</code> and <code>==</code>, which lets you take advantage of it when writing, say, a math library.</p>
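<p>Here&rsquo;s a minimal sketch of both cases. It isn&rsquo;t the article&rsquo;s full example: the <code>Vec2</code> type, the helper functions <code>call_unqualified</code> and <code>add_outside_namespace</code>, and the string return value of <code>blast</code> are assumptions added purely so the outcome of name lookup can be observed.</p>

```cpp
#include <cassert>
#include <string>

namespace galaxy {
    struct Asteroid { float radius; };

    // Declared inside galaxy. Returns a label (an assumption for this
    // sketch) so we can observe which function the call resolved to.
    std::string blast(Asteroid*, float) { return "galaxy::blast"; }

    // Hypothetical math type, used to show ADL with operators.
    struct Vec2 { float x, y; };
    Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
}

// We are outside galaxy, and there is no using-directive anywhere.
std::string call_unqualified() {
    galaxy::Asteroid ast{12.0f};
    // Unqualified name + argument of type galaxy::Asteroid*:
    // ADL adds namespace galaxy to the places searched.
    return blast(&ast, 100.0f);
    // By contrast, the qualified call ::blast(&ast, 100.0f) would not
    // compile here, because qualified names are not subject to ADL.
}

float add_outside_namespace() {
    galaxy::Vec2 a{1.0f, 2.0f}, b{3.0f, 4.0f};
    galaxy::Vec2 c = a + b;  // galaxy::operator+ is found through ADL
    return c.x + c.y;
}
```

<p>In this sketch, <code>call_unqualified()</code> compiles only because of ADL; no function named <code>blast</code> is visible at global scope.</p>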

<p>Interestingly, there are cases where member name lookup can find candidates that unqualified name lookup can&rsquo;t. See <a href="https://eli.thegreenplace.net/2012/02/06/dependent-name-lookup-for-c-templates">this post by Eli Bendersky</a> for details about that.</p>

<h2 id="special-handling-of-function-templates">Special Handling of Function Templates</h2>

<p>Some of the candidates found by name lookup are functions; others are function <em>templates</em>. There&rsquo;s just one problem with function templates: You can&rsquo;t call them. You can only call functions. Therefore, after name lookup, the compiler goes through the list of candidates and tries to turn each function template into a function.</p>

<svg style="max-width:465px" version="1.1" viewbox="0 0 123.03 64.294" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(-22.225 -117.21)">
  <g>
   <path d="m93.398 174.03v2.5698" fill="none" stroke="#9fbaba" stroke-linecap="round" stroke-width="2.6458" />
   <text x="38.099903" y="161.3959" fill="#ff5555" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="38.099903" y="161.3959">may involve</tspan><tspan x="38.099903" y="165.28528">SFINAE</tspan></text>
   <path d="m93.371 121.71c0 3.2058-2.0938 6.2177-6.1242 6.2177h-7.2761c-4.9493 0-6.6807 3.1542-6.6807 6.6807v26.524c0 3.3398 2.7272 6.6477 6.6477 6.6477h6.6477c4.4641 0 6.813 2.3631 6.813 6.2012" fill="none" stroke="#9fbaba" stroke-linecap="round" stroke-width="2.6458" />
   <text x="125.41245" y="127.52916" fill="#648b8b" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="125.41245" y="127.52916">non-template</tspan><tspan x="125.41245" y="131.41853">function</tspan></text>
   <text x="62.441578" y="127.52916" fill="#648b8b" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="62.441578" y="127.52916" rotate="0 0 0 0">function</tspan><tspan x="62.441578" y="131.41853">template</tspan></text>
  </g>
  <path transform="matrix(.22958 0 0 .16103 95.26 144.98)" d="m12.088 188.58-20.29 35.143-20.29-35.143z" fill="#9fbaba" />
  <g>
   <path d="m93.398 118v3.7042c0 3.2058 2.1874 6.2177 6.2177 6.2177h7.2761c4.9493 0 6.6807 3.1542 6.6807 6.6807v26.524c0 3.3398-2.7272 6.6477-6.6477 6.6477h-6.6477c-4.4641 0-6.813 2.3631-6.813 6.2012" fill="none" stroke="#9fbaba" stroke-width="2.6458" />
   <rect x="50.535" y="134.67" width="45.508" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="73.241096" y="139.39883" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="73.241096" y="139.39883">Template argument</tspan><tspan x="73.241096" y="143.36758">deduction</tspan></text>
   <path d="m48.996 154.24c-3.3279.61575-5.472 2.1906-7.6648 3.7259" fill="none" stroke="#f55" stroke-width=".26458px" />
   <rect x="50.535" y="150.02" width="45.508" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="73.241096" y="154.74469" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="73.241096" y="154.74469">Template argument</tspan><tspan x="73.241096" y="158.71344">substitution</tspan></text>
  </g>
 </g>
</svg>

<p>In the example we&rsquo;ve been following, one of the candidates is indeed a function template:</p>

<svg style="max-width:487px" version="1.1" viewbox="0 0 128.85 6.35" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(6.2943e-8 -24.077)">
  <text x="-9.9415274" y="47.394417" fill="none" font-family="cascadia_code" font-size="4.478px" letter-spacing="0px" stroke="#000000" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="-9.9415274" y="47.394417" stroke-width=".26458px" /></text>
  <text x="8.1737185" y="28.647236" fill="#666666" font-family="cascadia_code" font-size="3.87px" letter-spacing="0px" stroke-width=".26458px" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="8.1737185" y="28.647236" fill="#666666" font-size="3.87px" stroke-width=".26458px">template &lt;typename T&gt; void <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(T* obj, float force)</tspan></text>
  <circle cx="3.3073" cy="27.384" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="28.621769" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="28.621769" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">3</tspan></text>
 </g>
</svg>

<p>This function template has a single template parameter, <code>T</code>. As such, it expects a single template argument. The caller, <code>blast(ast, 100)</code>, didn&rsquo;t specify any template arguments, so in order to turn this function template into a function, the compiler has to figure out the type of <code>T</code>. That&rsquo;s where <strong>template argument deduction</strong> comes in. In this step, the compiler compares the types of the <em>function arguments</em> passed by the caller (on the left in the diagram below) to the types of the <em>function parameters</em> expected by the function template (on the right). If any unspecified template arguments are referenced on the right, like <code>T</code>, the compiler tries to deduce them using information on the left.</p>

<svg style="max-width:670px" version="1.1" viewbox="0 0 177.27 56.885" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(6.2943e-8 -24.077)">
  <path d="m26.627 35.053c-.05575-1.6576 1.3858-3.0428 3.6483-3.0428h11.037c2.1176 0 3.6845-.87517 3.6483-3.0348" fill="none" stroke="#648b8b" stroke-linecap="round" stroke-width=".52917" />
  <path d="m63.332 35.053c.05575-1.6576-1.3858-3.0428-3.6483-3.0428h-11.037c-2.1176 0-3.6845-.87517-3.6483-3.0348" fill="none" stroke="#648b8b" stroke-linecap="round" stroke-width=".52917" />
  <g letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text x="141.64825" y="39.166817" fill="#666666" font-family="cascadia_code" font-size="3.87px" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="141.64825" y="39.166817" text-align="center">template &lt;typename T&gt;</tspan><tspan x="141.64825" y="44.348904" text-align="center">void <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px" text-align="center" text-anchor="middle">blast</tspan>(T* obj, float force)</tspan></text>
   <text x="-9.9415274" y="47.394417" fill="none" font-family="cascadia_code" font-size="4.478px" stroke="#000000" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="-9.9415274" y="47.394417" stroke-width=".26458px" /></text>
   <g>
    <text x="45.073227" y="27.499638" fill="#648b8b" font-family="Arimo" font-size="3.7042px" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="45.073227" y="27.499638" fill="#648b8b" font-size="3.7042px" stroke-width=".26458px">caller</tspan></text>
    <text x="45.142128" y="40.754318" fill="#666666" font-family="cascadia_code" font-size="3.87px" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="45.142128" y="40.754318" fill="#666666" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle"><tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px" text-align="center" text-anchor="middle">blast</tspan>(ast, 100)</tspan></text>
    <text x="141.57936" y="27.499638" fill="#648b8b" font-family="Arimo" font-size="3.7042px" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="141.57936" y="27.499638" fill="#648b8b" font-size="3.7042px" stroke-width=".26458px">function template</tspan></text>
   </g>
  </g>
  <path d="m107.26 35.053c-.0558-1.6576 1.3858-3.0428 3.6483-3.0428h26.912c2.1176 0 3.6845-.87517 3.6483-3.0348" fill="none" stroke="#648b8b" stroke-linecap="round" stroke-width=".52917" />
  <path d="m175.71 35.053c.0557-1.6576-1.3858-3.0428-3.6483-3.0428h-26.912c-2.1176 0-3.6845-.87517-3.6483-3.0348" fill="none" stroke="#648b8b" stroke-linecap="round" stroke-width=".52917" />
  <g fill="#b3b3b3" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text x="23.025562" y="55.545479" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="23.025562" y="55.545479" fill="#b3b3b3" font-size="3.7042px" stroke-width=".26458px">argument</tspan></text>
   <text x="54.816998" y="55.545479" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="54.816998" y="55.545479" fill="#b3b3b3" font-size="3.7042px" stroke-width=".26458px">type</tspan></text>
   <text x="141.5396" y="55.545479" text-align="center" style="line-height:105%" xml:space="preserve"><tspan x="141.5396" y="55.545479" fill="#b3b3b3" font-size="3.7042px" stroke-width=".26458px">function parameter type</tspan></text>
  </g>
  <rect x="13.229" y="57.415" width="63.5" height="12.171" fill="#fff" stroke="#333" stroke-linecap="round" stroke-width=".52917" />
  <g font-family="cascadia_code" font-size="3.87px" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text x="23.114965" y="61.656403" fill="#333333" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="23.114965" y="61.656403" fill="#333333" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">ast</tspan></text>
   <text x="23.114965" y="68.006401" fill="#333333" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="23.114965" y="68.006401" fill="#333333" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">100</tspan></text>
   <text x="54.864964" y="61.656403" fill="#ff5555" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="54.864964" y="61.656403" fill="#ff5555" font-size="3.87px" font-weight="bold" stroke-width=".26458px" text-align="center" text-anchor="middle"><tspan>galaxy::Asteroid</tspan><tspan>*</tspan></tspan></text>
   <text x="54.747555" y="68.006401" fill="#333333" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="54.747555" y="68.006401" fill="#333333" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">int</tspan></text>
  </g>
  <g stroke="#333" stroke-width=".52917">
   <path d="m13.229 63.5h63.5" fill="none" />
   <path d="m33.073 57.415v12.171" fill="none" />
   <rect x="131.5" y="57.415" width="20.108" height="12.171" fill="#fff" stroke-linecap="round" />
  </g>
  <g>
   <text x="141.56148" y="61.656403" fill="#ff5555" font-family="cascadia_code" font-size="3.87px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="141.56148" y="61.656403" fill="#ff5555" font-size="3.87px" font-weight="bold" stroke-width=".26458px" text-align="center" text-anchor="middle"><tspan>T</tspan><tspan>*</tspan></tspan></text>
   <text x="141.56148" y="68.006401" fill="#333333" font-family="cascadia_code" font-size="3.87px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="141.56148" y="68.006401" fill="#333333" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">float</tspan></text>
   <path d="m131.5 63.5h20.108" fill="none" stroke="#333" stroke-width=".52917" />
   <text x="110.4253" y="75.653801" fill="#ff5555" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="110.4253" y="75.653801"><tspan fill="#ff5555" font-weight="bold">T</tspan> is deduced as</tspan><tspan x="110.4253" y="79.543182"><tspan fill="#ff5555" font-family="cascadia_code" font-weight="bold">galaxy::Asteroid</tspan></tspan></text>
  </g>
  <g fill="none">
   <path d="m70.117 62.733c12.425 9.1603 42.216 13.705 68.258-.81836" stroke="#f55" stroke-linecap="round" stroke-width=".26458" />
   <path d="m135.89 61.737 2.7011-.0076-.77725 2.3149" stroke="#f55" stroke-linecap="round" stroke-width=".26458px" />
   <g stroke="#cdcdcd" stroke-width=".26458px">
    <path d="m44.45 41.808c.26458 6.6211-41.196 3.0883-43.392 11.969-.98617 3.9886 4.4979 6.7484 10.98 6.7484" />
    <path d="m57.216 41.808c.12724 3.184-9.7518 6.5292-22.704 7.9002-14.818 1.5686-30.271 2.006-32.528 9.3603-1.2056 3.9279 3.5719 7.2776 10.054 7.2776" />
    <path d="m132.83 45.115c-.26458 6.6211 38.652 2.92 38.416 10.865-.12223 4.1069-12.089 4.5448-18.572 4.5448" />
    <path d="m156.18 44.983c1.1997 4.8129 18.654 1.3576 19.86 10.719 1.0539 8.177-14.792 10.246-23.366 10.644" />
   </g>
  </g>
 </g>
</svg>

<p>In this case, the compiler deduces <code>T</code> as <code>galaxy::Asteroid</code> because doing so makes the first function parameter <code>T*</code> compatible with the argument <code>ast</code>. The <a href="https://en.cppreference.com/w/cpp/language/template_argument_deduction">rules governing template argument deduction</a> are a big subject in and of themselves, but in simple examples like this one, they usually do what you&rsquo;d expect. If template argument deduction doesn&rsquo;t work &ndash; in other words, if the compiler is unable to deduce template arguments in a way that makes the function parameters compatible with the caller&rsquo;s arguments &ndash; then the function template is removed from the list of candidates.</p>
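<p>One way to see deduction at work is to have the template report what <code>T</code> turned out to be. The following is only a sketch: the <code>bool</code> return type and the helper functions <code>deduced_from_pointer</code> and <code>explicit_template_argument</code> are assumptions made for illustration, not part of the original example.</p>

```cpp
#include <cassert>
#include <type_traits>

namespace galaxy { struct Asteroid {}; }

// Same parameter list as the article's template. For this sketch it
// returns true when T was deduced as galaxy::Asteroid.
template <typename T>
bool blast(T*, float) {
    return std::is_same<T, galaxy::Asteroid>::value;
}

bool deduced_from_pointer() {
    galaxy::Asteroid ast;
    // The argument type galaxy::Asteroid* is matched against T*,
    // so T is deduced as galaxy::Asteroid. The second parameter does
    // not mention T; the int 100 is simply converted to float.
    return blast(&ast, 100);
}

bool explicit_template_argument() {
    galaxy::Asteroid ast;
    // Specifying T explicitly skips deduction for that parameter.
    return blast<galaxy::Asteroid>(&ast, 100.0f);
}

// Passing an Asteroid by value, as in blast(ast, 100), would make
// deduction fail: no choice of T makes T* match galaxy::Asteroid.
```

<p>Both helper functions end up calling the same specialization, <code>blast&lt;galaxy::Asteroid&gt;</code>.</p>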

<p>Any function templates in the list of candidates that survive up to this point are subject to the next step: <strong>template argument substitution</strong>. In this step, the compiler takes the function template declaration and replaces every occurrence of each template parameter with its corresponding template argument. In our example, the template parameter <code>T</code> is replaced with its deduced template argument <code>galaxy::Asteroid</code>. When this step succeeds, we finally have the signature of a real function that can be called &ndash; not just a function template!</p>

<svg style="max-width:590px" version="1.1" viewbox="0 0 156.1 28.84" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(6.2943e-8 -24.077)" stroke-width=".26458px">
  <g font-family="cascadia_code" letter-spacing="0px" text-anchor="middle" word-spacing="0px">
   <text x="-9.9415274" y="47.394417" fill="none" font-size="4.478px" stroke="#000000" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="-9.9415274" y="47.394417" stroke-width=".26458px" /></text>
   <text x="61.104324" y="28.118073" fill="#666666" font-size="3.87px" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="61.104324" y="28.118073" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">template &lt;typename T&gt; void <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px" text-align="center" text-anchor="middle">blast</tspan>(<tspan fill="#ff5555" font-weight="bold">T</tspan>* obj, float force)</tspan></text>
   <text x="83.061707" y="51.401413" fill="#666666" font-size="3.87px" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="83.061707" y="51.401413" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">void <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px" text-align="center" text-anchor="middle">blast</tspan>&lt;galaxy::Asteroid&gt;(<tspan fill="#ff5555" font-weight="bold">galaxy::Asteroid</tspan>* obj, float force)</tspan></text>
  </g>
  <path d="m77.25 29.9c1.4379 9.4122 14.859 5.3462 15.098 16.93" fill="none" stroke="#f55" />
  <path d="m90.487 45.079 1.8447 2.0554 1.709-2.0856" fill="none" stroke="#ff2a2a" />
  <text x="98.147758" y="38.101032" fill="#ff5555" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="98.147758" y="38.101032" fill="#ff5555" font-size="3.7042px" stroke-width=".26458px">substitution</tspan></text>
 </g>
</svg>

<p>Of course, there are cases when template argument substitution can fail. Suppose for a moment that the same function template accepted a third argument, as follows:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt; <span class="directive">void</span> blast(T* obj, <span class="predefined-type">float</span> force, <span class="highlight"><span class="keyword">typename</span> T::Units mass = <span class="integer">5000</span></span>);
</pre></div>
</div>
</div>

<p>If this were the case, the compiler would try to replace the <code>T</code> in <code>T::Units</code> with <code>galaxy::Asteroid</code>. The resulting type specifier, <code>galaxy::Asteroid::Units</code>, would be <a href="https://eel.is/c++draft/intro.defs#defns.ill.formed">ill-formed</a> because the struct <code>galaxy::Asteroid</code> doesn&rsquo;t actually have a member named <code>Units</code>. Therefore, template argument substitution would fail.</p>

<p>When template argument substitution fails, the function template is simply removed from the list of candidates &ndash; and at some point in C++&rsquo;s history, people realized that this was a feature they could exploit! The discovery led to an entire set of metaprogramming techniques in its own right, collectively referred to as <a href="https://en.cppreference.com/w/cpp/language/sfinae"><strong>SFINAE</strong> (substitution failure is not an error)</a>. SFINAE is a complicated, unwieldy subject that I&rsquo;ll just say two things about here. First, it&rsquo;s essentially a way to rig the function call resolution process into choosing the candidate you want. Second, it will probably fall out of favor over time as programmers increasingly turn to modern C++ metaprogramming techniques that achieve the same thing, like <a href="https://en.cppreference.com/w/cpp/language/constraints#Constraints">constraints</a> and <a href="https://en.cppreference.com/w/cpp/language/if#Constexpr_If">constexpr if</a>.</p>
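<p>To make the &ldquo;removed from the list of candidates&rdquo; part concrete, here&rsquo;s a small sketch built on the three-parameter template above. The <code>galaxy::Comet</code> type, the <code>void*</code> fallback overload, the string return values and the two helper functions are all hypothetical additions, included only so that the surviving candidate can be observed.</p>

```cpp
#include <cassert>
#include <string>

namespace galaxy {
    struct Asteroid {};                   // has no member named Units
    struct Comet { using Units = int; };  // hypothetical type that does
}

// Candidate A: substitution succeeds only if T::Units names a type.
template <typename T>
std::string blast(T*, float, typename T::Units /*mass*/ = 5000) {
    return "template using T::Units";
}

// Candidate B: a non-template fallback (an assumption for this sketch).
std::string blast(void*, float) {
    return "void* fallback";
}

std::string blast_asteroid() {
    galaxy::Asteroid ast;
    // T is deduced as galaxy::Asteroid, but galaxy::Asteroid::Units is
    // ill-formed, so candidate A is silently dropped. Candidate B is
    // the only one left, and the call still compiles.
    return blast(&ast, 100);
}

std::string blast_comet() {
    galaxy::Comet comet;
    // Substitution succeeds here. Candidate A takes Comet* directly,
    // which beats the Comet* to void* conversion needed by candidate B.
    return blast(&comet, 100);
}
```

<p>Note that the failed substitution in <code>blast_asteroid()</code> produces no diagnostic at all; the call quietly resolves to the fallback.</p>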

<h2 id="overload-resolution">Overload Resolution</h2>

<p>At this stage, all of the function templates found during name lookup are gone, and we&rsquo;re left with a nice, tidy set of <strong>candidate functions</strong>. This is also referred to as the <strong>overload set</strong>. Here&rsquo;s the updated list of candidate functions for our example:</p>

<svg style="max-width:584px" version="1.1" viewbox="0 0 154.52 17.992" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(6.2943e-8 -24.077)">
  <g font-family="cascadia_code" letter-spacing="0px" stroke-width=".26458px" word-spacing="0px">
   <text x="7.7322202" y="28.583473" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="7.7322202" y="28.583473" fill="#666666" font-size="3.87px" stroke-width=".26458px">void galaxy::<tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(galaxy::Asteroid* ast, float force)</tspan></text>
   <text x="-9.9415274" y="47.394417" fill="none" font-size="4.478px" stroke="#000000" text-align="center" text-anchor="middle" style="line-height:125%" xml:space="preserve"><tspan x="-9.9415274" y="47.394417" stroke-width=".26458px" /></text>
   <text x="7.8947835" y="34.43618" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="7.8947835" y="34.43618" fill="#666666" font-size="3.87px" stroke-width=".26458px">bool <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(Target target)</tspan></text>
   <text x="8.1737185" y="40.288895" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="8.1737185" y="40.288895" fill="#666666" font-size="3.87px" stroke-width=".26458px">void <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>&lt;galaxy::Asteroid&gt;(galaxy::Asteroid* obj, float force)</tspan></text>
  </g>
  <circle cx="3.3073" cy="33.205" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="34.442593" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="34.442593" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">2</tspan></text>
  <circle cx="3.3073" cy="27.384" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="28.621758" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="28.621758" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">1</tspan></text>
  <circle cx="3.3073" cy="39.026" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <g>
   <text x="3.2992578" y="40.263428" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="40.263428" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">3</tspan></text>
  </g>
 </g>
</svg>

<p>The next two steps narrow down this list even further by determining which of the candidate functions are <strong>viable</strong> &ndash; in other words, which ones <em>could</em> handle the function call.</p>

<svg style="max-width:555px" version="1.1" viewbox="0 0 146.84 52.123" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(-22.225 -117.21)">
  <path d="m88.635 119.59v39.891" fill="none" stroke="#9fbaba" stroke-linecap="square" stroke-width="2.6458" />
  <text x="131.2334" y="143.13966" fill="#ff5555" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="131.2334" y="143.13966">finds implicit</tspan><tspan x="131.2334" y="146.84383">conversions</tspan></text>
  <path transform="matrix(.22958 0 0 .16103 90.497 129.1)" d="m12.088 188.58-20.29 35.143-20.29-35.143z" fill="#9fbaba" />
  <g>
   <path d="m41.143 123.43h94.985" fill="none" stroke="#648b8b" stroke-dasharray="1.05833, 1.05833" stroke-width=".26458" />
   <text x="30.91917" y="122.76663" fill="#648b8b" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="30.91917" y="122.76663">candidate</tspan><tspan x="30.91917" y="126.65601">functions</tspan></text>
   <text x="30.956247" y="164.83546" fill="#648b8b" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="30.956247" y="164.83546">viable</tspan><tspan x="30.956247" y="168.72484">functions</tspan></text>
  </g>
  <g>
   <rect x="63.5" y="128.32" width="50.271" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="88.605385" y="133.04883" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="88.605385" y="133.04883">Arguments must</tspan><tspan x="88.605385" y="137.01758">be compatible</tspan></text>
   <rect x="63.5" y="143.67" width="50.271" height="11.112" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
  </g>
  <g>
   <text x="88.605385" y="148.39468" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="88.605385" y="148.39468">Constraints must</tspan><tspan x="88.605385" y="152.36343">be satisfied (C++20)</tspan></text>
   <path d="m115.35 134.13c3.965 1.0146 7.0919 2.9053 8.9994 5.3009" fill="none" stroke="#f55" stroke-width=".26458px" />
   <path d="m41.143 165.23h94.985" fill="none" stroke="#648b8b" stroke-dasharray="1.05833, 1.05833" stroke-width=".26458" />
  </g>
 </g>
</svg>

<p>Perhaps the most obvious requirement is that the <strong>arguments must be compatible</strong>; that is to say, a viable function should be able to accept the caller&rsquo;s arguments. If the caller&rsquo;s argument types don&rsquo;t match the function&rsquo;s parameter types exactly, it should at least be possible to <strong>implicitly convert</strong> each argument to its corresponding parameter type. Let&rsquo;s look at each of our example&rsquo;s candidate functions to see if its parameters are compatible:</p>

<svg style="max-width:585px" version="1.1" viewbox="0 0 154.78 38.629" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
 <g transform="translate(6.2943e-8 -24.077)">
  <g fill="none">
   <path d="m105.68 50.6-3.1647 3.4888m-.0989-3.3211 3.3954 3.1534" stroke="#f55" stroke-width=".66146" />
   <path d="m121.58 25.039c1.0803-.03749 1.9831.932 1.9831 2.4537v.50647c0 1.4242.57038 2.4781 1.9778 2.4537" stroke="#648b8b" stroke-linecap="round" stroke-width=".52917" />
   <path d="m121.58 35.892c1.0803.0375 1.9831-.932 1.9831-2.4537v-.50647c0-1.4242.57038-2.4781 1.9779-2.4537" stroke="#648b8b" stroke-linecap="round" stroke-width=".52917" />
   <text x="-9.9415274" y="47.394417" font-family="cascadia_code" font-size="4.478px" letter-spacing="0px" stroke="#000000" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="-9.9415274" y="47.394417" stroke-width=".26458px" /></text>
  </g>
  <text x="140.0587" y="29.351707" fill="#648b8b" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="140.0587" y="29.351707">caller&#8217;s</tspan><tspan x="140.0587" y="33.241085">argument types</tspan></text>
  <text x="0.8667047" y="47.343384" fill="#666666" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="0.8667047" y="47.343384" fill="#666666" font-size="3.7042px" stroke-width=".26458px">candidate</tspan></text>
  <g>
   <rect x="27.252" y="43.127" width="63.5" height="6.0854" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="55.72113" y="47.368889" fill="#4d4d4d" font-family="cascadia_code" font-size="3.87px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="55.72113" y="47.368889" fill="#4d4d4d" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">galaxy::Asteroid*</tspan></text>
   <rect x="90.752" y="43.127" width="26.723" height="6.0854" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="101.60938" y="47.633469" fill="#4d4d4d" font-family="cascadia_code" font-size="3.87px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="101.60938" y="47.633469" fill="#4d4d4d" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">float</tspan></text>
   <rect x="27.252" y="49.212" width="63.5" height="6.0854" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="55.72113" y="53.4543" fill="#4d4d4d" font-family="cascadia_code" font-size="3.87px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="55.72113" y="53.4543" fill="#4d4d4d" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">Target</tspan></text>
   <rect x="27.252" y="55.298" width="63.5" height="6.0854" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="55.72113" y="59.539719" fill="#4d4d4d" font-family="cascadia_code" font-size="3.87px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="55.72113" y="59.539719" fill="#4d4d4d" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">galaxy::Asteroid*</tspan></text>
   <rect x="90.752" y="55.298" width="26.723" height="6.0854" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
  </g>
  <g fill="#4d4d4d" font-family="cascadia_code" font-size="3.87px" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text x="101.60938" y="59.804298" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="101.60938" y="59.804298" fill="#4d4d4d" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">float</tspan></text>
   <text x="55.72113" y="31.493885" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="55.72113" y="31.493885" fill="#4d4d4d" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">galaxy::Asteroid*</tspan></text>
   <text x="101.60938" y="31.758465" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="101.60938" y="31.758465" fill="#4d4d4d" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">int</tspan></text>
  </g>
  <g transform="translate(-16.933 -236.41)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <g transform="translate(28.84 -236.41)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <g>
   <path d="m121.58 42.501c1.0803-.03749 1.9831.932 1.9831 2.4537v4.7398c0 1.4242.57038 2.4781 1.9778 2.4537" fill="none" stroke="#648b8b" stroke-linecap="round" stroke-width=".52917" />
   <path d="m121.58 61.821c1.0803.0375 1.9831-.932 1.9831-2.4537v-4.7398c0-1.4242.57038-2.4781 1.9779-2.4537" fill="none" stroke="#648b8b" stroke-linecap="round" stroke-width=".52917" />
   <text x="140.58786" y="51.047539" fill="#648b8b" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="140.58786" y="51.047539">function</tspan><tspan x="140.58786" y="54.936916">parameter types</tspan></text>
  </g>
  <circle cx="20.638" cy="52.123" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="20.629478" y="53.360306" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="20.629478" y="53.360306" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">2</tspan></text>
  <circle cx="20.638" cy="46.037" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="20.629478" y="47.274887" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="20.629478" y="47.274887" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">1</tspan></text>
  <circle cx="20.638" cy="58.208" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <g font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" word-spacing="0px">
   <text x="20.629478" y="59.445724" fill="#333333" font-family="cascadia_code" text-align="center" text-anchor="middle" style="line-height:125%" xml:space="preserve"><tspan x="20.629478" y="59.445724" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">3</tspan></text>
   <text x="0.8667047" y="53.428787" fill="#666666" font-family="Arimo" style="line-height:105%" xml:space="preserve"><tspan x="0.8667047" y="53.428787" fill="#666666" font-size="3.7042px" stroke-width=".26458px">candidate</tspan></text>
   <text x="0.8667047" y="59.514191" fill="#666666" font-family="Arimo" style="line-height:105%" xml:space="preserve"><tspan x="0.8667047" y="59.514191" fill="#666666" font-size="3.7042px" stroke-width=".26458px">candidate</tspan></text>
  </g>
  <path id="path2298" d="m76.891 46.071 1.1012 1.6672s1.3069-2.2888 3.0553-3.3532" fill="none" stroke="#00bf00" stroke-linejoin="round" stroke-width=".52917" />
  <use transform="translate(-4.9415e-5 12.171)" width="100%" height="100%" xlink:href="#path2298" />
  <use id="use3627" transform="translate(32.544)" width="100%" height="100%" xlink:href="#path2298" />
  <use transform="translate(0 12.171)" width="100%" height="100%" xlink:href="#use3627" />
  <use transform="translate(-12.171 6.0854)" width="100%" height="100%" xlink:href="#path2298" />
 </g>
</svg>

<dl>
<dt>Candidate 1</dt>
<dd><p>The caller&#8217;s first argument type <code>galaxy::Asteroid*</code> is an exact match. The caller&#8217;s second argument type <code>int</code> is implicitly convertible to the second function parameter type <code>float</code>, since <code>int</code> to <code>float</code> is a <a href="https://docs.microsoft.com/en-us/cpp/cpp/standard-conversions">standard conversion</a>. Therefore, candidate 1&#8217;s parameters are compatible.</p></dd>

<dt>Candidate 2</dt>
<dd><p>The caller&#8217;s first argument type <code>galaxy::Asteroid*</code> is implicitly convertible to the first function parameter type <code>Target</code> because <code>Target</code> has a <a href="https://en.cppreference.com/w/cpp/language/converting_constructor">converting constructor</a> that accepts arguments of type <code>galaxy::Asteroid*</code>. (Incidentally, these types are also convertible in the other direction, since <code>Target</code> has a <a href="https://en.cppreference.com/w/cpp/language/cast_operator">user-defined conversion function</a> back to <code>galaxy::Asteroid*</code>.) However, the caller passed two arguments, and candidate 2 only accepts one. Therefore, candidate 2 is <strong>not</strong> viable.</p>
<svg style="max-width:584px" version="1.1" viewbox="0 0 154.52 17.992" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(6.2943e-8 -24.077)">
  <g font-family="cascadia_code" letter-spacing="0px" stroke-width=".26458px" word-spacing="0px">
   <text x="7.7322202" y="28.583473" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="7.7322202" y="28.583473" fill="#666666" font-size="3.87px" stroke-width=".26458px">void galaxy::<tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(galaxy::Asteroid* ast, float force)</tspan></text>
   <text x="-9.9415274" y="47.394417" fill="none" font-size="4.478px" stroke="#000000" text-align="center" text-anchor="middle" style="line-height:125%" xml:space="preserve"><tspan x="-9.9415274" y="47.394417" stroke-width=".26458px" /></text>
   <text x="7.8947835" y="34.43618" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="7.8947835" y="34.43618" fill="#666666" font-size="3.87px" stroke-width=".26458px">bool <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(Target target)</tspan></text>
   <text x="8.1737185" y="40.288895" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="8.1737185" y="40.288895" fill="#666666" font-size="3.87px" stroke-width=".26458px">void <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>&lt;galaxy::Asteroid&gt;(galaxy::Asteroid* obj, float force)</tspan></text>
  </g>
  <circle cx="3.3073" cy="33.205" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="34.442593" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="34.442593" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">2</tspan></text>
  <circle cx="3.3073" cy="27.384" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="28.621758" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="28.621758" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">1</tspan></text>
  <circle cx="3.3073" cy="39.026" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <g>
   <text x="3.2992578" y="40.263428" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="40.263428" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">3</tspan></text>
   <path d="m.48467 31.221 64.383 3.4396" fill="none" stroke="#f55" stroke-width=".52917" />
   <path d="m.48467 34.66 64.383-3.4396" fill="none" stroke="#f55" stroke-width=".52917" />
  </g>
 </g>
</svg>
</dd>

<dt>Candidate 3</dt>
<dd><p>Candidate 3&#8217;s parameter types are identical to candidate 1&#8217;s, so it&#8217;s compatible too.</p></dd>
</dl>
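<p>If you want to poke at viability yourself, one rough approach is to ask the compiler whether each candidate, taken in isolation, could be called with the caller&rsquo;s argument types. The sketch below does that with <code>std::is_invocable_v</code> (C++17). The <code>Asteroid</code> and <code>Target</code> definitions are reconstructed from the description above, and the aliases <code>Candidate1</code> through <code>Candidate3</code> are names made up for this illustration; they only stand in for the three signatures in the diagram. This checks a call through a function pointer, which is an approximation of what the compiler does for a real overload set.</p>

```cpp
#include <type_traits>

namespace galaxy {
    struct Asteroid {};
}

// Assumed shape of Target: a converting constructor plus a conversion
// function, as described in the text.
struct Target {
    galaxy::Asteroid* ast;
    Target(galaxy::Asteroid* ast) : ast{ast} {}
    operator galaxy::Asteroid*() const { return ast; }
};

// Hypothetical aliases standing in for the three candidates' signatures.
using Candidate1 = void (*)(galaxy::Asteroid*, float);  // galaxy::blast
using Candidate2 = bool (*)(Target);                    // blast(Target)
using Candidate3 = void (*)(galaxy::Asteroid*, float);  // blast<galaxy::Asteroid>

// The caller's argument types are (galaxy::Asteroid*, int).
static_assert(std::is_invocable_v<Candidate1, galaxy::Asteroid*, int>,
              "exact match, then int -> float");
static_assert(!std::is_invocable_v<Candidate2, galaxy::Asteroid*, int>,
              "two arguments, but only one parameter");
static_assert(std::is_invocable_v<Candidate3, galaxy::Asteroid*, int>,
              "same parameter types as candidate 1");

// With a single argument, candidate 2 would be callable: Asteroid* -> Target.
static_assert(std::is_invocable_v<Candidate2, galaxy::Asteroid*>,
              "the converting constructor makes this call possible");
```

<p>The last assertion shows that candidate 2 is rejected because of the argument count, not because of the conversion.</p>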

<p>Like everything else in this process, the <a href="https://en.cppreference.com/w/cpp/language/implicit_conversion">rules that control implicit conversion</a> are an entire subject on their own. The most noteworthy rule is that you can prevent constructors and conversion functions from participating in implicit conversion by marking them <a href="https://en.cppreference.com/w/cpp/language/explicit"><code>explicit</code></a>.</p>
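<p>As a small sketch of what <code>explicit</code> changes, compare two variants of a <code>Target</code>-like class. Both class names below are hypothetical and exist only for this comparison. <code>std::is_convertible_v</code> reports whether an implicit conversion is allowed, while <code>std::is_constructible_v</code> reports whether direct construction is allowed.</p>

```cpp
#include <type_traits>

struct Asteroid {};

// Hypothetical class whose conversions may be applied implicitly.
struct ImplicitTarget {
    Asteroid* ast;
    ImplicitTarget(Asteroid* ast) : ast{ast} {}
    operator Asteroid*() const { return ast; }
};

// Same class, but both conversions are marked explicit.
struct ExplicitTarget {
    Asteroid* ast;
    explicit ExplicitTarget(Asteroid* ast) : ast{ast} {}
    explicit operator Asteroid*() const { return ast; }
};

// Implicit conversion works in both directions for ImplicitTarget...
static_assert(std::is_convertible_v<Asteroid*, ImplicitTarget>, "");
static_assert(std::is_convertible_v<ImplicitTarget, Asteroid*>, "");

// ...and in neither direction for ExplicitTarget.
static_assert(!std::is_convertible_v<Asteroid*, ExplicitTarget>, "");
static_assert(!std::is_convertible_v<ExplicitTarget, Asteroid*>, "");

// Direct construction, as in ExplicitTarget{ptr}, is still allowed.
static_assert(std::is_constructible_v<ExplicitTarget, Asteroid*>, "");
```

<p>A function taking <code>ExplicitTarget</code> by value would therefore not accept a bare <code>Asteroid*</code> argument; its first parameter would be incompatible.</p>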

<p>After using the caller&rsquo;s arguments to filter out incompatible candidates, the compiler proceeds to check whether each function&rsquo;s <a href="https://en.cppreference.com/w/cpp/language/constraints#Constraints"><strong>constraints</strong></a> are satisfied, if there are any. Constraints are a new feature in C++20. They let you use custom logic to eliminate candidate functions (coming from a class template or function template) without having to resort to SFINAE. They&rsquo;re also supposed to give you better error messages. Our example doesn&rsquo;t use constraints, so we can skip this step. (Technically, the standard says that constraints are also checked earlier, during <a href="https://eel.is/c++draft/temp.deduct#general-5">template argument deduction</a>, but I skipped over that detail. Checking in both places helps ensure the best possible error message is shown.)</p>

<h3 id="tiebreakers">Tiebreakers</h3>

<p>At this point in our example, we&rsquo;re down to two <strong>viable</strong> functions. Either of them could handle the original function call just fine:</p>

<svg style="max-width:584px" version="1.1" viewbox="0 0 154.52 12.171" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(6.2943e-8 -24.077)">
  <g font-family="cascadia_code" letter-spacing="0px" stroke-width=".26458px" word-spacing="0px">
   <text x="7.7322202" y="28.583473" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="7.7322202" y="28.583473" fill="#666666" font-size="3.87px" stroke-width=".26458px">void galaxy::<tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>(galaxy::Asteroid* ast, float force)</tspan></text>
   <text x="-9.9415274" y="47.394417" fill="none" font-size="4.478px" stroke="#000000" text-align="center" text-anchor="middle" style="line-height:125%" xml:space="preserve"><tspan x="-9.9415274" y="47.394417" stroke-width=".26458px" /></text>
   <text x="8.1737194" y="34.46806" fill="#666666" font-size="3.87px" style="line-height:125%" xml:space="preserve"><tspan x="8.1737194" y="34.46806" fill="#666666" font-size="3.87px" stroke-width=".26458px">void <tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px">blast</tspan>&lt;galaxy::Asteroid&gt;(galaxy::Asteroid* obj, float force)</tspan></text>
  </g>
  <circle cx="3.3073" cy="33.205" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="34.442593" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="34.442593" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">2</tspan></text>
  <circle cx="3.3073" cy="27.384" r="2.5135" fill="#fff" stroke="#666" stroke-linecap="round" stroke-width=".26458" />
  <text x="3.2992578" y="28.621758" fill="#333333" font-family="cascadia_code" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="3.2992578" y="28.621758" fill="#666666" font-family="Arimo" font-size="3.9688px" stroke-width=".26458px">1</tspan></text>
 </g>
</svg>

<p>Indeed, if either of the above functions were the only viable one, it <em>would</em> be the one that handles the function call. But because there are two, the compiler must now do what it always does when there are multiple viable functions: it must determine which one is the <strong>best viable function</strong>. To be the best viable function, a function must &ldquo;win&rdquo; against every other viable function, as decided by a <a href="https://en.cppreference.com/w/cpp/language/overload_resolution#Best_viable_function">sequence of tiebreaker rules</a>.</p>

<svg style="max-width:466px" version="1.1" viewbox="0 0 123.3 70.379" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
 <g transform="translate(-22.225 -117.21)">
  <g>
   <text transform="rotate(-11.368)" x="24.543682" y="145.49374" fill="#ff5555" font-family="Arimo" font-size="3.891px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="24.543682" y="145.49374" fill="#ff5555" font-family="Arimo" font-size="3.891px" font-weight="bold" stroke-width=".26458px">TIEBREAKERS</tspan></text>
   <rect x="59.013" y="143.36" width="67.204" height="7.1438" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="92.584808" y="148.08389" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="92.584808" y="148.08389" fill="#666666" font-size="3.9688px" stroke-width=".26458px">Better-matching parameters wins</tspan></text>
   <rect x="59.013" y="156.06" width="67.204" height="7.1438" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
   <text x="92.584808" y="160.78381" fill="#666666" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="92.584808" y="160.78381" fill="#666666" font-size="3.9688px" stroke-width=".26458px">Non-template function wins</tspan></text>
   <rect x="59.013" y="168.76" width="67.204" height="7.1438" rx="3.175" ry="3.175" fill="#fffffb" stroke="#cacaca" stroke-linecap="round" stroke-width=".52917" />
  </g>
  <g font-family="Arimo" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
   <text x="92.584808" y="173.48383" fill="#666666" font-size="3.9688px" text-align="center" style="line-height:100%" xml:space="preserve"><tspan x="92.584808" y="173.48383" fill="#666666" font-size="3.9688px" stroke-width=".26458px">More specialized template wins</tspan></text>
   <text x="92.602272" y="186.74939" fill="#999999" font-size="3.9688px" text-align="center" style="line-height:100%" xml:space="preserve"><tspan x="92.602272" y="186.74939" fill="#999999" font-size="3.9688px" stroke-width=".26458px">additional tiebreakers</tspan></text>
   <text x="92.619392" y="154.47034" fill="#b3b3b3" font-size="3.7042px" text-align="center" style="line-height:100%" xml:space="preserve"><tspan x="92.619392" y="154.47034" fill="#b3b3b3" font-size="3.7042px" stroke-width=".26458px">otherwise</tspan></text>
  </g>
  <g transform="translate(6.0961 -120.43)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <g transform="translate(6.0961 -133.13)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <g transform="translate(6.0961 -107.73)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <g transform="translate(6.0961 -95.032)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <g transform="translate(33.613 -120.43)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <g transform="translate(33.613 -133.13)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <g transform="translate(33.613 -107.73)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <g transform="translate(33.613 -95.032)">
   <path d="m72.76 270.01v5.9531" fill="none" stroke="#9fbaba" stroke-width="1.0583" />
   <path transform="matrix(.39001 0 0 .28306 120.54 229.26)" d="m-122.5 170.17-5.9971-10.387h11.994z" fill="#9fbaba" />
  </g>
  <text x="92.619392" y="167.17033" fill="#b3b3b3" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="92.619392" y="167.17033" fill="#b3b3b3" font-size="3.7042px" stroke-width=".26458px">otherwise</tspan></text>
  <text x="92.619392" y="179.87033" fill="#b3b3b3" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="92.619392" y="179.87033" fill="#b3b3b3" font-size="3.7042px" stroke-width=".26458px">otherwise</tspan></text>
  <image x="97.644" y="123.26" width="16.447" height="12.443" stroke-width="1.85" xlink:href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAHMAAABXBAMAAADFUGf1AAAAMFBMVEX///9wcMCAoNCgwOBgcIDQ8PDw0JDw4MDwsHBgMBDgkGCwYECAQDDQQDAgICBgUIBvBO4iAAAAAXRSTlMAQObYZgAAB71JREFUeF6dl09sG8cVxlnfBAuOh0IORY0yu6yRiwBVJHQokMKX9GrAEogee5udUrEPBbxrGMh1Ox26hXrogSZr3+Tx7B5iF3CpmWkg261gcrfuzQlayAiCNk3qaGOqpyKBt+8thRSVuLbCBx4ILH/8vvdvhqyURfZRZcY49fH5WdHs/PlZZLMi/j6D5VM/anzvfJadmgV9O5vETIafBJf+0QDlb6p7Ktt76tEse/vt85+nabr9jdCLT/9JsxQsP1PKmDQ9NnqSvpt9culf/vtZNuJcIW2Hx0N3govZXvap/VWWSUq54lxE9njKvw/ez/Y6nw6vfT4KKVOblAmlomPp/vzqjb2nT/Tt6HESMq4oBPi2G8eo0uZPoUAfXB2MHscS7ALpUSqi7WOgz55mn+x9+EvDb8RKhZQG+ALZYxj+zDzOdnf/Fm09i9WmD1zgez4T8Stl55KtK+rPu17QvnE9lh4NAQ3AsohfWeREx76/84cPA3ZjEN3yWYH6UOooeYXsvJa/8G347u7uX7tSQ46b1AOUUdax9lUo73SS4e7F3aTXueWHXNLCccgiK14u25P8Uvd6EH9wdUtrsNsGlBaoso9i131ZqoBmEDvtNLrFOGWDEFnKqBJWPGw5L0VF9lk/fXLbaMopVIdTDMbA8qM7ZJm8DFVJt/vbK1tmk1FsiuaoiyFs3FpzTpd2VQkVa+qJvkY5qG2kOAalPrMd4jbK0L5RyigY+9uppBT7woQxWhVfA+jKZZeUiBolRaQ76bD/CKcXwmdxAqQHEwXviFtWqJ6WnMca7KFDr0B5lEhEfa8dp81mmSp+KGgb5AANMCiP+gUa0HZsietUT5eMEkhdBZWQ8olhH9AUiwwoU/ouqDplKOgYdMwA9YBkItrSIAvBhLQrwQUyTTbFevhsC2W5nLSTK6OhTgqTUKxDqpenOk5UWHQeqyUUR5apSKvYmOLJgIqHq2cvONNQVGLCpIlWkWQcAkjJVKpDPKAk7TxsXWhNke2bAeTIdTT8T5pIiWaRDKE9Axws2gmvVeuts8tHZLtaCQk6MvpLnnDaxuwA50LGSdEd1hE2J/ULa4dl57WEzEBI2JvkSwnlCrHERfJF5X2mxF1S32+8XnWOJBpQoRQ3wzH5SkJufNJY3AdEKZedO6RVq9ed/1+fdMA9nwu82W6Oq2ewYEU3sVgHqGB2+fXLpNpoVsnXGxqlqZE0YFgX2yU5+bZB63KybYCGuH6hsI03A+dc80B2HhvGBTQOkwlFNCTV2tjpgayYLDlXVoce+GWdh6tvBm5zwWmQGqCjNhdoCz8VYlPvVxtuXj2jwarik9YCiqeTsIk1dXdtH+D8eaVyr50YjYuCwUDh126zWRuTRHM1icjAlPkU98BoFd1drTbdc3leqQzesfEmZUoWvkGh3603SO46aWLgZVIIaGtBjoprepU0lwF9XrlOmQ2ZSIzkAhWGOrrvkLxK8nzfdQhJBlhknE+7qLDu0NuF5RxVh/zaiCmbTqwBqaDEwdK4muf/XnGqi5sBYqBmHkAvYEbYHXdhCUUrvX5XR/3ETsx1jYKM/hT8ZOwC2iKNHYohpIjSxZEM8f6J3f1CtDKnrd4y0bAPOQ17qojf/fhnuVvL9xvkh8zqzYAJZf44PqNFGCC6CuRHxb6kSWpsMf/FdCioRMvJnVqeu/W31ns9QMFKbXxmxGmBLtfyFwU61x92043J0iGqFY/cak4AJWd/YKJ+6EPh
74/dtxKJKI3hUQapIttFEFFdDKJWttrKx06ej1cWR4m+5ytla/t1a+TkOnhQy77ID626lkphmdKzraXchYzqZCeWXiDV1hJpWqNCFvht8+C7X+SH0DktJGaadn/TOpdXwfE+ecQ8v3MvepCT76wn8NSjneRBnh9B8eQLmYi2T7QIZAqyJB20I03tzf1lbYRkIWWxuf+88tohtNJTXIe4KZWGA10HW+PFxGgWA9q0Gs9XX+h4COU9gmpuiu2pVOsN6EAOlpMBj/S1vL7yFR7OHlM83pp2RQLKcdC3v7XSIM4LkM37cOvBuVFf+xI0r3gMHNsSVCL6XqW+TFxkD4IsrI7wVAx9j5lo4yg6P5KRDFG1coIsNJby7IAcO+T7BYoX9AAeT/k5gChDtEKa0NgX2YSsNsnpydFG6TrtbE+9dARuMyZzwl1CNMMyk4WVtQnKIJ1wKoqjyLE5yNZyNPyxg9fxSqOCK4co25yKzmmplTrIBUcGjwpASf2NyklA8UagcjqaqI6K7MbX6GuINpukihcLYBxQARWemiw3Ji1QwHDgcpcsNJeLZ1hBCmg8Fe0rKVC1iAPUWWi+cXCfiQK1ZT+HhTL/M4TJOgTJwpICknXem4rOjSQ/jLqnkUNZI2kI6HYJqrieXgaU5RxnbTqaahV2Sv9JzUsBqJ2qmRqlZEfZMlRLzoQtIaXSkSp1bCTn09AuklxEGsapLFlAlT0qmSDJuVCiFD05QPTovimOYGGpzHAPVeMj36c0ujFGK9OtlMQIbUUbh1VtopUyw24/LSXnDK6kOprOfPKK/+TYnJAyrER5lFcJx39jFhREAyrsDOi8Cr3A5zM51pQGM8qeDCmHANkZ6kTDdTZYj2eQHQSC7ryzns6C+rEXrYPfGRyr7Z49AP4Leaiq6xwgjn4AAAAASUVORK5CYII=" preserveaspectratio="none" />
  <image transform="scale(-1,1)" x="-87.399" y="123.26" width="16.447" height="12.443" stroke-width="1.85" xlink:href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAHMAAABXBAMAAADFUGf1AAAAMFBMVEX///9wcMCAoNCgwOBgcIDQ8PDw0JDw4MDwsHBgMBDgkGCwYECAQDDQQDAgICBgUIBvBO4iAAAAAXRSTlMAQObYZgAAB71JREFUeF6dl09sG8cVxlnfBAuOh0IORY0yu6yRiwBVJHQokMKX9GrAEogee5udUrEPBbxrGMh1Ox26hXrogSZr3+Tx7B5iF3CpmWkg261gcrfuzQlayAiCNk3qaGOqpyKBt+8thRSVuLbCBx4ILH/8vvdvhqyURfZRZcY49fH5WdHs/PlZZLMi/j6D5VM/anzvfJadmgV9O5vETIafBJf+0QDlb6p7Ktt76tEse/vt85+nabr9jdCLT/9JsxQsP1PKmDQ9NnqSvpt9culf/vtZNuJcIW2Hx0N3govZXvap/VWWSUq54lxE9njKvw/ez/Y6nw6vfT4KKVOblAmlomPp/vzqjb2nT/Tt6HESMq4oBPi2G8eo0uZPoUAfXB2MHscS7ALpUSqi7WOgz55mn+x9+EvDb8RKhZQG+ALZYxj+zDzOdnf/Fm09i9WmD1zgez4T8Stl55KtK+rPu17QvnE9lh4NAQ3AsohfWeREx76/84cPA3ZjEN3yWYH6UOooeYXsvJa/8G347u7uX7tSQ46b1AOUUdax9lUo73SS4e7F3aTXueWHXNLCccgiK14u25P8Uvd6EH9wdUtrsNsGlBaoso9i131ZqoBmEDvtNLrFOGWDEFnKqBJWPGw5L0VF9lk/fXLbaMopVIdTDMbA8qM7ZJm8DFVJt/vbK1tmk1FsiuaoiyFs3FpzTpd2VQkVa+qJvkY5qG2kOAalPrMd4jbK0L5RyigY+9uppBT7woQxWhVfA+jKZZeUiBolRaQ76bD/CKcXwmdxAqQHEwXviFtWqJ6WnMca7KFDr0B5lEhEfa8dp81mmSp+KGgb5AANMCiP+gUa0HZsietUT5eMEkhdBZWQ8olhH9AUiwwoU/ouqDplKOgYdMwA9YBkItrSIAvBhLQrwQUyTTbFevhsC2W5nLSTK6OhTgqTUKxDqpenOk5UWHQeqyUUR5apSKvYmOLJgIqHq2cvONNQVGLCpIlWkWQcAkjJVKpDPKAk7TxsXWhNke2bAeTIdTT8T5pIiWaRDKE9Axws2gmvVeuts8tHZLtaCQk6MvpLnnDaxuwA50LGSdEd1hE2J/ULa4dl57WEzEBI2JvkSwnlCrHERfJF5X2mxF1S32+8XnWOJBpQoRQ3wzH5SkJufNJY3AdEKZedO6RVq9ed/1+fdMA9nwu82W6Oq2ewYEU3sVgHqGB2+fXLpNpoVsnXGxqlqZE0YFgX2yU5+bZB63KybYCGuH6hsI03A+dc80B2HhvGBTQOkwlFNCTV2tjpgayYLDlXVoce+GWdh6tvBm5zwWmQGqCjNhdoCz8VYlPvVxtuXj2jwarik9YCiqeTsIk1dXdtH+D8eaVyr50YjYuCwUDh126zWRuTRHM1icjAlPkU98BoFd1drTbdc3leqQzesfEmZUoWvkGh3603SO46aWLgZVIIaGtBjoprepU0lwF9XrlOmQ2ZSIzkAhWGOrrvkLxK8nzfdQhJBlhknE+7qLDu0NuF5RxVh/zaiCmbTqwBqaDEwdK4muf/XnGqi5sBYqBmHkAvYEbYHXdhCUUrvX5XR/3ETsx1jYKM/hT8ZOwC2iKNHYohpIjSxZEM8f6J3f1CtDKnrd4y0bAPOQ17qojf/fhnuVvL9xvkh8zqzYAJZf44PqNFGCC6CuRHxb6kSWpsMf/FdCioRMvJnVqeu/W31ns9QMFKbXxmxGmBLtfyFwU61x92043
J0iGqFY/cak4AJWd/YKJ+6EPh74/dtxKJKI3hUQapIttFEFFdDKJWttrKx06ej1cWR4m+5ytla/t1a+TkOnhQy77ID626lkphmdKzraXchYzqZCeWXiDV1hJpWqNCFvht8+C7X+SH0DktJGaadn/TOpdXwfE+ecQ8v3MvepCT76wn8NSjneRBnh9B8eQLmYi2T7QIZAqyJB20I03tzf1lbYRkIWWxuf+88tohtNJTXIe4KZWGA10HW+PFxGgWA9q0Gs9XX+h4COU9gmpuiu2pVOsN6EAOlpMBj/S1vL7yFR7OHlM83pp2RQLKcdC3v7XSIM4LkM37cOvBuVFf+xI0r3gMHNsSVCL6XqW+TFxkD4IsrI7wVAx9j5lo4yg6P5KRDFG1coIsNJby7IAcO+T7BYoX9AAeT/k5gChDtEKa0NgX2YSsNsnpydFG6TrtbE+9dARuMyZzwl1CNMMyk4WVtQnKIJ1wKoqjyLE5yNZyNPyxg9fxSqOCK4co25yKzmmplTrIBUcGjwpASf2NyklA8UagcjqaqI6K7MbX6GuINpukihcLYBxQARWemiw3Ji1QwHDgcpcsNJeLZ1hBCmg8Fe0rKVC1iAPUWWi+cXCfiQK1ZT+HhTL/M4TJOgTJwpICknXem4rOjSQ/jLqnkUNZI2kI6HYJqrieXgaU5RxnbTqaahV2Sv9JzUsBqJ2qmRqlZEfZMlRLzoQtIaXSkSp1bCTn09AuklxEGsapLFlAlT0qmSDJuVCiFD05QPTovimOYGGpzHAPVeMj36c0ujFGK9OtlMQIbUUbh1VtopUyw24/LSXnDK6kOprOfPKK/+TYnJAyrER5lFcJx39jFhREAyrsDOi8Cr3A5zM51pQGM8qeDCmHANkZ6kTDdTZYj2eQHQSC7ryzns6C+rEXrYPfGRyr7Z49AP4Leaiq6xwgjn4AAAAASUVORK5CYII=" preserveaspectratio="none" />
  <text x="30.701656" y="120.5547" fill="#648b8b" font-family="Arimo" font-size="3.7042px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:105%" xml:space="preserve"><tspan x="30.701656" y="120.5547">viable</tspan><tspan x="30.701656" y="124.44408">functions</tspan></text>
  <path d="m40.888 120.95h103.72" fill="none" stroke="#648b8b" stroke-dasharray="1.05833, 1.05833" stroke-width=".26458" />
 </g>
</svg>

<p>Let&rsquo;s look at the first three tiebreaker rules.</p>

<dl>
<dt>First tiebreaker: Better-matching parameters wins</dt>
<dd><p>C++ places the most importance on how well the caller&#8217;s argument types match the function&#8217;s parameter types. Loosely speaking, it prefers functions that require fewer implicit conversions from the given arguments. When both functions require conversions, <a href="https://en.cppreference.com/w/cpp/language/overload_resolution#Ranking_of_implicit_conversion_sequences">some conversions are considered &#8220;better&#8221; than others</a>. This is the rule that decides whether to call the <code>const</code> or non-<code>const</code> version of <code>std::vector</code>&#8217;s <a href="https://en.cppreference.com/w/cpp/container/vector/operator_at"><code>operator[]</code></a>, for example.</p>
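<p>Here is a minimal sketch of this tiebreaker in action. The <code>pick</code> overloads are hypothetical and return a number identifying which overload ran; the last two checks look at the <code>std::vector</code> case mentioned above by inspecting the type that <code>operator[]</code> returns.</p>

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical overload set: the return value identifies the overload.
constexpr int pick(int)    { return 1; }
constexpr int pick(double) { return 2; }

static_assert(pick(42)   == 1, "int -> int is an exact match");
static_assert(pick(4.2)  == 2, "double -> double is an exact match");
static_assert(pick('a')  == 1, "char -> int is a promotion, which beats char -> double");
static_assert(pick(4.2f) == 2, "float -> double is a promotion, which beats float -> int");

// std::vector has a const and a non-const operator[]; the same ranking picks
// between them based on whether the vector itself is const.
using Vec = std::vector<int>;
static_assert(std::is_same_v<decltype(std::declval<Vec&>()[0]), int&>,
              "non-const vector: non-const operator[] is the better match");
static_assert(std::is_same_v<decltype(std::declval<const Vec&>()[0]), const int&>,
              "const vector: only the const operator[] is viable");
```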

<p>In the example we&#8217;ve been following, the two viable functions have identical parameter types, so neither is better than the other. It&#8217;s a tie. As such, we move on to the second tiebreaker.</p></dd>

<dt>Second tiebreaker: Non-template function wins</dt>
<dd><p>If the first tiebreaker doesn&#8217;t settle it, then C++ prefers to call non-template functions over template functions. This is the rule that decides the winner in our example; viable function 1 is a non-template function while viable function 2 came from a template. Therefore, our <strong>best viable function</strong> is the one that came from the <code>galaxy</code> namespace:</p>
<svg style="max-width:526px" version="1.1" viewbox="0 0 139.17 13.758" xmlns="http://www.w3.org/2000/svg">
 <g transform="translate(6.2943e-8 -24.077)">
  <text x="62.28854" y="36.520977" fill="#666666" font-family="cascadia_code" font-size="4.1457px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="62.28854" y="36.520977" fill="#666666" font-size="3.87px" stroke-width=".26458px" text-align="center" text-anchor="middle">void galaxy::<tspan fill="#2b9696" font-weight="bold" stroke-width=".26458px" text-align="center" text-anchor="middle">blast</tspan>(galaxy::Asteroid* ast, float force)</tspan></text>
  <g>
   <path transform="matrix(-.23106 .017039 .016637 .23172 145.18 37.281)" d="m33.962-32.6 6.6145 4.4304 7.3911-2.9583-2.1695 7.6598 5.0975 6.1152-7.9554.30369-4.2407 6.7377-2.7472-7.4721-7.7184-1.9511 6.2575-4.9217z" fill="#ffff5d" stroke="#ffd161" stroke-linecap="round" stroke-width="1.8367" />
   <text transform="rotate(10.512)" x="125.81173" y="8.9775229" fill="#ff5555" font-family="Arimo" font-size="4.4348px" letter-spacing="0px" stroke-width=".30157px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:100%" xml:space="preserve"><tspan x="125.81173" y="8.9775229" fill="#ff5555" font-family="Arimo" font-size="4.4348px" font-weight="bold" stroke-width=".30157px">WINNER!</tspan></text>
   <path transform="matrix(-.20441 .1086 .1085 .20517 119.12 27.658)" d="m33.962-32.6 6.6145 4.4304 7.3911-2.9583-2.1695 7.6598 5.0975 6.1152-7.9554.30369-4.2407 6.7377-2.7472-7.4721-7.7184-1.9511 6.2575-4.9217z" fill="#ffff5d" stroke="#ffd161" stroke-linecap="round" stroke-width="1.8367" />
  </g>
 </g>
</svg>
<p>It&#8217;s worth reiterating that the previous two tiebreakers are ordered in the way I&#8217;ve described. In other words, if there were a viable function whose parameters matched the given arguments better than all other viable functions, it would win <em>even if it came from a template</em>.</p>
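<p>The ordering of the first two tiebreakers can be checked with a small sketch. It assumes simplified versions of the functions in our example: they return an <code>int</code> identifying the overload instead of <code>void</code>, and they are <code>constexpr</code> so the result can be checked at compile time. The <code>scan</code> overloads and the <code>no_asteroid</code> placeholder are hypothetical additions.</p>

```cpp
namespace galaxy {
    struct Asteroid {};
    // Non-template candidate, found through argument-dependent lookup.
    constexpr int blast(Asteroid*, float) { return 1; }
}

// Function template with identical parameter types once T = galaxy::Asteroid.
template <typename T>
constexpr int blast(T*, float) { return 3; }

// Hypothetical second overload set, to show that tiebreaker order matters.
constexpr int scan(galaxy::Asteroid*, float) { return 1; }  // non-template
template <typename T>
constexpr int scan(T*, int) { return 2; }                   // template

constexpr galaxy::Asteroid* no_asteroid = nullptr;  // placeholder argument

// Parameter types tie, so the second tiebreaker picks the non-template.
static_assert(blast(no_asteroid, 100) == 1, "non-template wins a tie");

// The template matches (Asteroid*, int) exactly, so the first tiebreaker
// already settles it and the second one is never consulted.
static_assert(scan(no_asteroid, 100) == 2, "better-matching template wins");

// With a float argument, the non-template is the better match instead.
static_assert(scan(no_asteroid, 100.0f) == 1, "better-matching non-template wins");
```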
</dd>

<dt>Third tiebreaker: More specialized template wins</dt>
<dd><p>In our example, the best viable function was already found, but if it wasn&#8217;t, we would move on to the third tiebreaker. In this tiebreaker, C++ prefers to call &#8220;more specialized&#8221; template functions over &#8220;less specialized&#8221; ones. For example, consider the following two function templates:</p>
<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt; <span class="directive">void</span> blast(T obj, <span class="predefined-type">float</span> force);
<span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt; <span class="directive">void</span> blast(T* obj, <span class="predefined-type">float</span> force);
</pre></div>
</div>
</div>
<p>When template argument deduction is performed for these two function templates, the first function template accepts any type as its first argument, but the second function template accepts only pointer types. Therefore, the second function template is said to be <strong>more specialized</strong>. If these two function templates were the only results of name lookup for our call to <code>blast(ast, 100)</code>, and both resulted in viable functions, the current tiebreaker rule would cause the second one to be picked over the first one. The <a href="https://en.cppreference.com/w/cpp/language/function_template#Function_template_overloading">rules to decide which function template is more specialized than another</a> are yet another big subject.</p>
<p>Even though it&#8217;s considered more specialized, it&#8217;s important to understand that the second function template isn&#8217;t actually a partial specialization of the first function template. On the contrary, they&#8217;re two completely separate function templates that happen to share the same name. In other words, they&#8217;re <strong>overloaded</strong>. C++ <a href="http://www.gotw.ca/publications/mill17.htm">doesn&#8217;t allow partial specialization</a> of function templates.</p></dd> 
</dl>
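<p>The third tiebreaker can be observed with a small test built around the two function templates shown above. This is only a sketch: the function bodies, the returned labels and the empty <code>Asteroid</code> type are assumptions added so that the chosen overload is visible.</p>

```cpp
#include <cassert>
#include <string>

struct Asteroid {};  // placeholder type, assumed for this sketch

// Accepts any type as its first argument.
template <typename T>
std::string blast(T, float) { return "blast(T, float)"; }

// Accepts only pointer types, so it is the more specialized template.
template <typename T>
std::string blast(T*, float) { return "blast(T*, float)"; }

// Self-check that can be called from main().
void check_third_tiebreaker() {
    Asteroid rock;
    Asteroid* ast = &rock;
    // Both templates are viable with identical conversions; T* is more specialized.
    assert(blast(ast, 100) == "blast(T*, float)");
    // Deduction fails for the T* template, so only the first one is viable.
    assert(blast(rock, 100) == "blast(T, float)");
}
```

<p>Note that both templates remain separate overloads here. Neither one is a specialization of the other.</p>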

<p>There are <a href="https://en.cppreference.com/w/cpp/language/overload_resolution#Best_viable_function">several more tiebreakers</a> in addition to the ones listed here. For example, if both the <a href="https://devblogs.microsoft.com/cppblog/simplify-your-code-with-rocket-science-c20s-spaceship-operator/">spaceship <code>&lt;=&gt;</code> operator</a> and an overloaded comparison operator such as <code>&gt;</code> are viable, C++ prefers the comparison operator. And if the candidates are user-defined conversion functions, there are other rules that take higher priority than the ones I&rsquo;ve shown. Nonetheless, I believe the three tiebreakers I&rsquo;ve shown are the most important to remember.</p>

<p>Needless to say, if the compiler checks every tiebreaker and doesn&rsquo;t find a single, unambiguous winner, compilation fails with an error message similar to the one shown near the beginning of this post.</p>

<h2 id="after-the-function-call-is-resolved">After the Function Call Is Resolved</h2>

<p>We&rsquo;ve reached the end of our journey. The compiler now knows exactly which function should be called by the expression <code>blast(ast, 100)</code>. In many cases, though, the compiler has more work to do after resolving a function call:</p>

<ul>
  <li>If the function being called is a class member, the compiler must check that member&rsquo;s <a href="https://en.cppreference.com/w/cpp/language/access">access specifiers</a> to see if it&rsquo;s accessible to the caller.</li>
  <li>If the function being called is a template function, the compiler attempts to <a href="https://en.cppreference.com/w/cpp/language/function_template#Implicit_instantiation">instantiate</a> that template function, provided its definition is visible.</li>
  <li>If the function being called is a <a href="https://en.cppreference.com/w/cpp/language/virtual">virtual function</a>, the compiler generates an indirect call (typically through a virtual table) so that the correct override is called at runtime.</li>
</ul>

<p>None of those things apply to our example. Besides, they&rsquo;re outside the scope of this post.</p>

<p>This post didn&rsquo;t contain any new information. It was basically a condensed explanation of an algorithm already described by <a href="https://en.cppreference.com/w/cpp/language">cppreference.com</a>, which, in turn, is a condensed version of the <a href="https://eel.is/c++draft/">C++ standard</a>. However, the goal of this post was to convey the main steps without getting dragged down into details. Let&rsquo;s take a look back to see just how much detail was skipped. It&rsquo;s actually kind of remarkable:</p>

<ul>
  <li>There&rsquo;s an entire set of rules for <a href="https://en.cppreference.com/w/cpp/language/unqualified_lookup">unqualified name lookup</a>.</li>
  <li>Within that, there&rsquo;s a set of rules for <a href="https://en.wikipedia.org/wiki/Argument-dependent_name_lookup">argument-dependent lookup</a>.</li>
  <li><a href="https://eel.is/c++draft/class.member.lookup">Member name lookup</a> has its own rules, too.</li>
  <li>There&rsquo;s a set of rules for <a href="https://en.cppreference.com/w/cpp/language/template_argument_deduction">template argument deduction</a>.</li>
  <li>There&rsquo;s an entire family of metaprogramming techniques based on <a href="https://en.wikipedia.org/wiki/Substitution_failure_is_not_an_error">SFINAE</a>.</li>
  <li>There&rsquo;s a set of rules governing how <a href="https://en.cppreference.com/w/cpp/language/implicit_conversion">implicit conversions</a> work.</li>
  <li><a href="https://en.cppreference.com/w/cpp/language/constraints">Constraints</a> (and concepts) are a completely new feature in C++20.</li>
  <li>There&rsquo;s a set of rules to determine <a href="https://en.cppreference.com/w/cpp/language/overload_resolution#Ranking_of_implicit_conversion_sequences">which implicit conversions are better than others</a>.</li>
  <li>There&rsquo;s a set of rules to determine <a href="https://en.cppreference.com/w/cpp/language/function_template#Function_template_overloading">which function template is more specialized than another</a>.</li>
</ul>

<p>Yeah, C++ is complicated. If you&rsquo;d like to spend more time exploring these details, Stephan T. Lavavej produced a very watchable <a href="https://channel9.msdn.com/Series/C9-Lectures-Stephan-T-Lavavej-Core-C-">series of videos on Channel 9 back in 2012</a>. Check out the first three in particular. (Thanks to Stephan for reviewing an early draft of this post.)</p>

<p>Now that I&rsquo;ve learned exactly how C++ resolves a function call, I feel more competent as a library developer. Compilation errors are more obvious. I can better justify API design decisions. I even managed to distill a small set of tips and tricks out of the rules. But that&rsquo;s a subject for another post.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Flap Hero Code Review]]></title>
    <link href="https://preshing.com/20201210/flap-hero-code-review"/>
    <updated>2020-12-10T15:23:00-05:00</updated>
    <id>https://preshing.com/?p=20201210</id>
    <content type="html"><![CDATA[<p>Flap Hero is a small game written entirely in C++ without using an existing game engine. All of its source code is available <a href="https://github.com/arc80/FlapHero">on GitHub</a>. I think it can serve as an interesting resource for novice and intermediate game developers to study.</p>

<p><a href="https://github.com/arc80/FlapHero"><img class="center" src="https://preshing.com/images/flaphero-github-button.svg" /></a></p>

<p>In this post, I&rsquo;ll explain how Flap Hero&rsquo;s code is organized, how it differs from larger game projects, why it was written that way, and what could have been done better.</p>

<p>Very little information in this post is specific to C++. Most of it would still be relevant if Flap Hero was written in another language like C#, Rust or plain C. That said, if you browse (or build) the source code, you will need some fluency in C++. <a href="https://learncpp.com">Learn C++</a> and <a href="https://learnopengl.com">Learn OpenGL</a> are two great resources for beginners. For the most part, Flap Hero&rsquo;s source code sticks to a fairly straightforward subset of C++, but the deeper you go into its low-level modules (like <a href="https://github.com/arc80/plywood/tree/main/repos/plywood/src/runtime/ply-runtime"><code>runtime</code></a>), the more you&rsquo;ll encounter advanced C++ features like templates and SFINAE.</p>

<!--more-->

<h2 id="general-architecture">General Architecture</h2>

<p>Flap Hero was developed using <a href="https://plywood.arc80.com">Plywood</a>, a C++ framework that helps organize code into reusable modules. Each yellow box in the diagram below represents a Plywood module. The blue arrows represent dependencies.</p>

<svg version="1.1" viewbox="0 0 552 457" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" style="max-width:552px">
 <g transform="translate(0 -599.36)">
  <rect x="264.5" y="1029.9" width="69" height="19" rx="6" ry="6" fill="#ffffe2" stroke="#e3e3c1" />
  <text x="299.47314" y="1043.4172" fill="#333333" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="299.47314" y="1043.4172" font-size="13.75px" font-weight="bold" style="line-height:1.1">platform</tspan></text>
  <g fill="#ffffe2" stroke="#e3e3c1">
   <rect x="189.5" y="1005.9" width="69" height="19" rx="6" ry="6" />
   <rect x="264.5" y="1005.9" width="69" height="19" rx="6" ry="6" />
   <rect x="340.5" y="1005.9" width="69" height="19" rx="6" ry="6" />
  </g>
  <g fill="#333333" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-anchor="middle" word-spacing="0px">
   <text x="375.01685" y="1019.4172" text-align="center" style="line-height:0%" xml:space="preserve"><tspan x="375.01685" y="1019.4172" font-size="13.75px" font-weight="bold" style="line-height:1.1">runtime</tspan></text>
   <text x="225.01685" y="1019.4172" text-align="center" style="line-height:0%" xml:space="preserve"><tspan x="225.01685" y="1019.4172" font-size="13.75px" font-weight="bold" style="line-height:1.1">image</tspan></text>
   <text x="299.01685" y="1019.4172" text-align="center" style="line-height:0%" xml:space="preserve"><tspan x="299.01685" y="1019.4172" font-size="13.75px" font-weight="bold" style="line-height:1.1">math</tspan></text>
  </g>
  <g>
   <text x="42.206085" y="1017.0928" fill="#d40000" font-family="Consolas,monospace" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="42.206085" y="1017.0928" fill="#d40000" font-family="Consolas,monospace" font-size="17.5px" style="line-height:1.1">plywood</tspan></text>
   <text x="42.016876" y="1033.4172" fill="#d40000" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="42.016876" y="1033.4172" fill="#d40000" font-size="13.75px" style="line-height:1.1">repo</tspan></text>
   <rect x="112.5" y="668.86" width="126" height="47" rx="6" ry="6" fill="#ffffe2" stroke="#e3e3c1" />
   <rect x="129.5" y="746.86" width="340" height="149" rx="10.565" ry="13.268" fill="#ffffe2" stroke="#e3e3c1" />
   <text x="172.01685" y="761.41718" fill="#333333" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="172.01685" y="761.41718" font-size="13.75px" font-weight="bold" style="line-height:1.1">flapGame</tspan></text>
   <text x="150.01672" y="683.41718" fill="#333333" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="150.01672" y="683.41718" font-size="13.75px" font-weight="bold" style="line-height:1.1">glfwFlap</tspan></text>
   <rect x="245.5" y="759.86" width="105" height="19" rx="0" ry="0" fill="#fff" stroke="#c4c4c4" />
   <text x="297.55701" y="773.41718" fill="#808080" font-family="Arimo" font-size="13px" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="297.55701" y="773.41718" fill="#808080" font-family="Consolas,monospace" font-size="13px" style="line-height:1.1">GameFlow.cpp</tspan></text>
   <rect x="245.5" y="787.86" width="105" height="19" rx="0" ry="0" fill="#fff" stroke="#c4c4c4" />
   <text x="297.55701" y="801.41718" fill="#808080" font-family="Arimo" font-size="13px" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="297.55701" y="801.41718" fill="#808080" font-family="Consolas,monospace" font-size="13px" style="line-height:1.1">GameState.cpp</tspan></text>
   <rect x="205.5" y="845.86" width="105" height="19" rx="0" ry="0" fill="#fff" stroke="#c4c4c4" />
   <text x="257.55701" y="859.41718" fill="#808080" font-family="Arimo" font-size="13px" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="257.55701" y="859.41718" fill="#808080" font-family="Consolas,monospace" font-size="13px" style="line-height:1.1">Collision.cpp</tspan></text>
   <rect x="315.5" y="845.86" width="77" height="19" rx="0" ry="0" fill="#fff" stroke="#c4c4c4" />
   <g letter-spacing="0px" stroke-width="1px" text-anchor="middle" word-spacing="0px">
    <text x="353.55701" y="859.41718" fill="#808080" font-family="Arimo" font-size="13px" text-align="center" style="line-height:0%" xml:space="preserve"><tspan x="353.55701" y="859.41718" fill="#808080" font-family="Consolas,monospace" font-size="13px" style="line-height:1.1">Text.cpp</tspan></text>
    <text x="41.701935" y="813.09271" fill="#d40000" font-family="Consolas,monospace" font-size="17.5px" text-align="center" style="line-height:0%" xml:space="preserve"><tspan x="41.701935" y="813.09271" style="line-height:1.1">FlapHero</tspan><tspan x="41.701935" y="821.383" style="line-height:1.1" /></text>
    <text x="42.016876" y="829.41718" fill="#d40000" font-family="Arimo" text-align="center" style="line-height:0%" xml:space="preserve"><tspan x="42.016876" y="829.41718" fill="#d40000" font-size="13.75px" style="line-height:1.1">repo</tspan></text>
    <text x="330.55701" y="735.41718" fill="#0000b8" font-family="Consolas,monospace" font-size="13px" text-align="center" style="line-height:0%" xml:space="preserve"><tspan x="330.55701" y="735.41718" fill="#0000b8" font-family="Consolas,monospace" font-size="13px" style="line-height:1.1">Public.h</tspan></text>
   </g>
  </g>
  <path d="m229 699.73h26.042c20.342 0 41.958 7.7779 41.958 27.023v20.476" fill="none" stroke="#5555c0" stroke-width="2" />
  <path d="m302.64 741.23-5.6202 5.7519-5.6202-5.7519" fill="none" stroke="#5555c0" stroke-dashoffset="20" stroke-width="2" />
  <g>
   <rect x="166" y="929.86" width="61" height="19" rx="6.364" ry="6.364" fill="#fff6d1" stroke="#e9d6b6" />
   <text x="196.20459" y="943.41718" fill="#cca561" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="196.20459" y="943.41718" font-size="13.75px" font-weight="bold" style="line-height:1.1">glfw</tspan></text>
   <rect x="372" y="929.86" width="61" height="19" rx="5.0416" ry="6.364" fill="#fff6d1" stroke="#e9d6b6" />
   <text x="402.51685" y="943.41718" fill="#cca561" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="402.51685" y="943.41718" font-size="13.75px" font-weight="bold" style="line-height:1.1">soloud</tspan></text>
   <rect x="235" y="929.86" width="61" height="19" rx="6" ry="6" fill="#ffffe2" stroke="#e3e3c1" />
   <text x="265.6712" y="943.41718" fill="#333333" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="265.6712" y="943.41718" font-size="13.75px" font-weight="bold" style="line-height:1.1">glad</tspan></text>
   <rect x="303" y="929.86" width="61" height="19" rx="5.0416" ry="6.364" fill="#fff6d1" stroke="#e9d6b6" />
   <text x="333.51685" y="943.41718" fill="#cca561" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="333.51685" y="943.41718" font-size="13.75px" font-weight="bold" style="line-height:1.1">assimp</tspan></text>
  </g>
  <g>
   <g>
    <path d="m32 659.36h514" fill="none" stroke="#d40000" stroke-dasharray="4, 4" stroke-width="2" />
    <path d="m306 969.36h240" fill="none" stroke="#d40000" stroke-dasharray="4, 4" stroke-width="2" />
    <rect x="159.5" y="689.86" width="71" height="19" rx="0" ry="0" fill="#fff" stroke="#c4c4c4" />
   </g>
   <text x="194.55701" y="703.41718" fill="#808080" font-family="Arimo" font-size="13px" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="194.55701" y="703.41718" fill="#808080" font-family="Consolas,monospace" font-size="13px" style="line-height:1.1">Main.cpp</tspan></text>
   <path d="m360 615.73h-22.042c-20.638 0-40.958 11.003-40.958 31.023v20.476" fill="none" stroke="#5555c0" stroke-width="2" />
  </g>
  <g>
   <rect x="365.5" y="604.86" width="95" height="19" rx="7.8516" ry="6.364" fill="#fff6d1" stroke="#e9d6b6" />
   <text x="412.59045" y="617.41718" fill="#cca561" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="412.59045" y="617.41718" font-size="13.75px" style="line-height:1.1">iOS project</tspan></text>
   <rect x="358.5" y="604.86" width="109" height="19" rx="7.8516" ry="6.364" fill="#fff6d1" stroke="#e9d6b6" />
   <text x="412.59045" y="618.41718" fill="#cca561" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="412.59045" y="618.41718" font-size="13.75px" style="line-height:1.1">Android project</tspan></text>
  </g>
  <path d="m360 639.73h-22.042c-20.638 0-40.958 11.003-40.958 31.023v54.476" fill="none" stroke="#5555c0" stroke-width="2" />
  <g>
   <rect x="365.5" y="629.86" width="95" height="19" rx="7.8516" ry="6.364" fill="#fff6d1" stroke="#e9d6b6" />
   <text x="412.59045" y="642.41718" fill="#cca561" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="412.59045" y="642.41718" font-size="13.75px" style="line-height:1.1">iOS project</tspan></text>
   <rect x="358.5" y="629.86" width="109" height="19" rx="7.8516" ry="6.364" fill="#fff6d1" stroke="#e9d6b6" />
   <text x="412.59045" y="643.41718" fill="#cca561" font-family="Arimo" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="412.59045" y="643.41718" font-size="13.75px" style="line-height:1.1">iOS project</tspan></text>
  </g>
  <path d="m350.28 772.45c29.522 1.0632 26.479 22.779-.11496 22.548" fill="none" stroke="#5555c0" stroke-width="2" />
  <path d="m355.78 799.02-5.1091-3.7221 3.8074-5.1273" fill="none" stroke="#5555c0" stroke-dashoffset="20" stroke-width="2" />
  <g>
   <text x="115.01688" y="704.41718" fill="#b3b3b3" font-family="Arimo" font-size="12px" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="115.01688" y="704.41718" style="line-height:1.1">Windows,</tspan><tspan x="115.01688" y="717.61719" style="line-height:1.1">Linux</tspan><tspan x="115.01688" y="730.8172" style="line-height:1.1">&amp; macOS</tspan></text>
   <rect x="191.5" y="869.86" width="85" height="19" rx="0" ry="0" fill="#fff" stroke="#c4c4c4" />
   <text x="233.55701" y="883.41718" fill="#808080" font-family="Consolas,monospace" font-size="13px" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="233.55701" y="883.41718" fill="#808080" font-family="Consolas,monospace" font-size="13px" style="line-height:1.1">Assets.cpp</tspan></text>
   <rect x="281.5" y="869.86" width="105" height="19" rx="0" ry="0" fill="#fff" stroke="#c4c4c4" />
   <text x="333.55701" y="883.41718" fill="#808080" font-family="Arimo" font-size="13px" letter-spacing="0px" stroke-width="1px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:0%" xml:space="preserve"><tspan x="333.55701" y="883.41718" fill="#808080" font-family="Consolas,monospace" font-size="13px" style="line-height:1.1">GLHelpers.cpp</tspan></text>
  </g>
  <g transform="matrix(.22781 0 0 .22781 223.15 700.49)" stroke-width="4.3897">
   <circle id="path4059" cx="759" cy="789.36" r="9" fill="#b3b3b3" />
   <use id="use4061" transform="translate(31.344)" width="100%" height="100%" stroke-width="4.3897" xlink:href="#path4059" />
   <use transform="translate(31.344)" width="100%" height="100%" stroke-width="4.3897" xlink:href="#use4061" />
  </g>
  <g fill="none">
   <g stroke="#5555c0" stroke-dashoffset="20">
    <path d="m299.47 958.36.002 33.396m14.661-11.563-14.637 14.643-14.637-14.643" stroke-width="6.0909" />
    <path d="m299.48 900.58.00091 19.943m8.7551-6.9053-8.7404 8.7442-8.7404-8.7442" stroke-width="3.6372" />
    <path d="m299.48 814.58.00091 19.943m8.7551-6.9053-8.7404 8.7442-8.7404-8.7442" stroke-width="3.6372" />
   </g>
   <path d="m32 969.36h262" stroke="#d40000" stroke-dasharray="4, 4" stroke-width="2" />
  </g>
 </g>
</svg>

<p>The biggest chunk of Flap Hero&rsquo;s game code is located in the <a href="https://github.com/arc80/FlapHero/tree/main/src/flapGame/flapGame"><code>flapGame</code></a> module, which contains roughly 6400 physical lines of code. The two most important source files in the <code>flapGame</code> module are <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/GameFlow.cpp"><code>GameFlow.cpp</code></a> and <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/GameState.cpp"><code>GameState.cpp</code></a>.</p>

<p>All the state for a single gameplay session is held inside a single <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/GameState.h#L82"><code>GameState</code></a> object. This object is, in turn, owned by a <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/GameFlow.h#L9"><code>GameFlow</code></a> object. The <code>GameFlow</code> can actually own two <code>GameState</code> objects at a given time, with both gameplay sessions updating concurrently. This is used to achieve an animated &ldquo;split screen&rdquo; effect during the transition from one gameplay session to the next.</p>

<svg style="max-width:564px" version="1.1" viewbox="0 0 149.22 61.648" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
 <defs>
  <lineargradient id="linearGradient4810" x1="527" x2="602" y1="201" y2="201" gradienttransform="matrix(.25576 0 0 .26458 -11.673 -22.225)" gradientunits="userSpaceOnUse">
   <stop stop-color="#fff" stop-opacity=".44248" offset="0" />
   <stop stop-color="#d9a4a4" offset="1" />
  </lineargradient>
 </defs>
 <ellipse cx="40.51" cy="38.886" rx="13.097" ry="4.6302" fill="#eef6fb" stroke="#cacee2" stroke-width=".26458" />
 <ellipse cx="57.018" cy="24.209" rx="13.097" ry="4.6302" fill="#eef6fb" stroke="#cacee2" stroke-width=".26458" />
 <g>
  <text x="56.885414" y="25.399992" fill="#333333" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="56.885414" y="25.399992" fill="#333333" font-size="3.9688px" stroke-width=".26458px">GameState</tspan></text>
  <text x="40.377682" y="40.076233" fill="#333333" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="40.377682" y="40.076233" fill="#333333" font-size="3.9688px" stroke-width=".26458px">GameState</tspan></text>
  <path d="m27.517 24.077h16.3" fill="none" stroke="#808080" stroke-width=".52917" />
 </g>
 <path id="rect4758" d="m42.261 22.769 1.5334 1.3192-1.5682 1.2777" fill="none" stroke="#808080" stroke-width=".52916" />
 <use transform="matrix(.063352 .99799 .99799 -.063352 12.109 -7.8661)" width="100%" height="100%" xlink:href="#rect4758" />
 <ellipse cx="14.552" cy="24.209" rx="13.097" ry="4.6302" fill="#eef6fb" stroke="#cacee2" stroke-width=".26458" />
 <text x="14.552083" y="25.399992" fill="#333333" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="14.552083" y="25.399992" fill="#333333" font-size="3.9688px" stroke-width=".26458px">GameFlow</tspan></text>
 <image x="91.281" y="1.3229" width="44.45" height="59.267" stroke-width="2.6667" preserveaspectratio="none" xlink:href="/images/flaphero-swipe.jpg" />
 <path d="m27.487 24.049c6.1675.31748 10.473 1.7433 11.558 10.195" fill="none" stroke="#808080" stroke-dasharray="1.058, 0.529" stroke-width=".529" />
 <path d="m139.08 21.795v6.319h-13.586v6.8172h13.586v6.319l9.4287-9.7276z" fill="url(#linearGradient4810)" />
</svg>
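<p>The ownership relationship in the diagram could be sketched roughly as follows. This is an illustration only, not Flap Hero&rsquo;s actual code: the members <code>current</code>, <code>outgoing</code> and <code>elapsed</code>, and the functions <code>beginTransition()</code>, <code>endTransition()</code> and <code>update()</code>, are hypothetical names invented for this example. It also assumes that &ldquo;updating concurrently&rdquo; means both sessions advance during the same frame.</p>

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for all the state of one gameplay session.
struct GameState {
    float elapsed = 0.f;  // seconds of simulated time
    void update(float dt) { elapsed += dt; }
};

// Hypothetical owner of up to two gameplay sessions.
struct GameFlow {
    std::unique_ptr<GameState> current = std::make_unique<GameState>();
    std::unique_ptr<GameState> outgoing;  // non-null only during a transition

    // Start a new session while keeping the old one alive for the split-screen effect.
    void beginTransition() {
        assert(!outgoing && "a transition is already in progress");
        outgoing = std::move(current);
        current = std::make_unique<GameState>();
    }

    // Drop the old session once the transition has finished.
    void endTransition() { outgoing.reset(); }

    // Advance every live session by the same time step.
    void update(float dt) {
        current->update(dt);
        if (outgoing) outgoing->update(dt);
    }
};
```

<p>Because each <code>GameState</code> is self-contained in this sketch, keeping two of them alive costs nothing more than a second pointer and a second <code>update()</code> call.</p>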

<p>The <code>flapGame</code> module is meant to be incorporated into a main project. On desktop operating systems, the main project is implemented by the <code>glfwFlap</code> module. The <code>glfwFlap</code> module is responsible for initializing the game&rsquo;s OpenGL context and for passing input and update events to <code>flapGame</code>. (Android and iOS use completely different main projects, but those serve the same purpose as <code>glfwFlap</code>.) <code>glfwFlap</code> communicates with <code>flapGame</code> using an API defined in a single file: <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/Public.h"><code>Public.h</code></a>. There are only 13 functions in this API, and not all of them are used on every platform.</p>

<h2 id="as-few-subsystems-as-possible">As Few Subsystems As Possible</h2>

<p>Flap Hero is not based on an existing game engine. It&rsquo;s just a C++ program that, when you run it, plays Flap Hero! As such, it doesn&rsquo;t contain many of the <strong>subsystems</strong> found in a typical game engine. This was a deliberate choice. Flap Hero is a sample application, and I wanted to implement it using as little code as possible.</p>

<p>What do I mean by &ldquo;subsystem&rdquo;? I&rsquo;ll give a few examples in the following sections. In some cases, it was OK to not have the subsystem; in other cases, it turned out to be a disadvantage. I&rsquo;ll give a verdict for each one as we go.</p>

<h3 id="no-abstract-scene-representation">No Abstract Scene Representation</h3>

<p>In a typical game engine, there&rsquo;s an intermediate representation of the 3D (or 2D) scene consisting of a bunch of abstract objects. In Godot, those objects inherit from <a href="https://docs.godotengine.org/en/stable/classes/class_spatial.html"><code>Spatial</code></a>; in Unreal, they&rsquo;re <a href="https://docs.unrealengine.com/en-US/API/Runtime/Engine/GameFramework/AActor/index.html"><code>AActor</code></a> instances; in Unity, they&rsquo;re <a href="https://docs.unity3d.com/Manual/class-GameObject.html"><code>GameObject</code></a> instances. All of these objects are <em>abstract</em>, meaning that the details of how they&rsquo;re rendered to the screen are filled in by subclasses or components.</p>

<p>One benefit of this approach is that it lets the renderer perform view frustum culling in a generic way. First, the renderer determines which objects are visible, then it effectively tells those objects to draw themselves. Ultimately, each object issues a series of draw calls to the underlying graphics API, whether it&rsquo;s OpenGL, Metal, Direct3D, Vulkan or something else.</p>

<svg version="1.1" style="max-width:417px" viewbox="0 0 110.33 37.306" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
 <text x="9.2604151" y="22.188309" fill="#333333" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="9.2604151" y="22.188309" fill="#333333" font-size="3.9688px" stroke-width=".26458px" text-align="center" text-anchor="middle">Renderer</tspan></text>
 <g fill="#eef6fb" stroke="#cacee2" stroke-width=".26458">
  <circle cx="40.217" cy="12.7" r="1.9844" />
  <circle cx="48.683" cy="10.054" r="1.9844" />
  <circle cx="57.415" cy="11.642" r="1.9844" />
  <circle cx="47.36" cy="17.992" r="1.9844" />
  <circle cx="56.885" cy="19.579" r="1.9844" />
  <circle cx="38.629" cy="21.431" r="1.9844" />
  <circle cx="47.36" cy="26.194" r="1.9844" />
  <circle cx="57.415" cy="27.517" r="1.9844" />
  <circle cx="65.088" cy="16.933" r="1.9844" />
  <circle cx="65.088" cy="24.342" r="1.9844" />
 </g>
 <g font-family="Arimo" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
  <text x="97.631233" y="22.188309" fill="#333333" font-size="3.9688px" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="97.631233" y="22.188309" fill="#333333" font-size="3.9688px" stroke-width=".26458px">Graphics API</tspan></text>
  <text x="51.894932" y="3.7041559" fill="#333333" font-size="3.9688px" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="51.894932" y="3.7041559" fill="#333333" font-size="3.9688px" stroke-width=".26458px">Scene</tspan></text>
  <text x="51.885956" y="33.337486" fill="#bbbbbb" font-size="3.4396px" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="51.885956" y="33.337486" fill="#bbbbbb" font-size="3.4396px" stroke-width=".26458px">abstract objects</tspan></text>
 </g>
 <g transform="translate(-11.112 -8.9958)">
  <path id="rect5444" d="m49.808 45.111h-1.5104c-1.9055 0-3.4396-1.5341-3.4396-3.4396v-23.283c0-1.9055 1.5341-3.4396 3.4396-3.4396h1.5104" fill="none" stroke="#c2c2c2" stroke-width=".79375" />
  <use transform="matrix(-1 0 0 1 126.02 0)" width="100%" height="100%" xlink:href="#rect5444" />
 </g>
 <path id="path5453" d="m24.871 14.508v3.2194h-3.7042v6.0854h3.7042v3.2194l5.3712-6.2622z" fill="#bccade" />
 <use transform="translate(53.181)" width="100%" height="100%" fill="#b2cddf" xlink:href="#path5453" />
</svg>
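<p>A heavily simplified sketch of this arrangement is shown below. Every name in it (<code>SceneObject</code>, <code>ViewRange</code>, <code>PipeObject</code>, <code>CloudObject</code>, <code>renderScene()</code>) is hypothetical and not taken from any of the engines mentioned. Culling is reduced to a one-dimensional range test, and &ldquo;draw calls&rdquo; are recorded as strings instead of being sent to a graphics API.</p>

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical visible range along one axis. A real engine would test
// bounding volumes against the planes of a view frustum instead.
struct ViewRange {
    float minX, maxX;
};

// Abstract scene object: the renderer knows its bounds but not how it is drawn.
struct SceneObject {
    float minX, maxX;
    SceneObject(float lo, float hi) : minX(lo), maxX(hi) { assert(lo <= hi); }
    virtual ~SceneObject() = default;
    // Stand-in for issuing draw calls to the graphics API.
    virtual void draw(std::vector<std::string>& drawCalls) const = 0;
};

struct PipeObject : SceneObject {
    using SceneObject::SceneObject;
    void draw(std::vector<std::string>& drawCalls) const override { drawCalls.push_back("pipe"); }
};

struct CloudObject : SceneObject {
    using SceneObject::SceneObject;
    void draw(std::vector<std::string>& drawCalls) const override { drawCalls.push_back("cloud"); }
};

// Generic pass: skip objects outside the view, then ask the rest to draw themselves.
std::vector<std::string> renderScene(const std::vector<std::unique_ptr<SceneObject>>& scene,
                                     const ViewRange& view) {
    std::vector<std::string> drawCalls;
    for (const auto& obj : scene) {
        bool visible = obj->maxX >= view.minX && obj->minX <= view.maxX;
        if (visible) obj->draw(drawCalls);
    }
    return drawCalls;
}
```

<p>The renderer never needs to know which concrete types exist. That is the property that makes the culling step generic.</p>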

<p>Flap Hero has no such scene representation. In Flap Hero, rendering is performed using a dedicated set of functions that issue OpenGL calls directly. Most of the interesting stuff happens in the <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/Render.cpp#L286"><code>renderGamePanel()</code></a> function. This function draws the bird, then the floor, then the pipes, then the shrubs, the cities in the background, the sky, the clouds, particle effects, and finally the UI layer. That&rsquo;s it. No abstract objects are involved.</p>

<svg version="1.1" style="max-width:215px" viewbox="0 -5 56.885 23.758" xmlns="http://www.w3.org/2000/svg">
 <text x="9.2604151" y="8.4299679" fill="#333333" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="9.2604151" y="8.4299679" fill="#333333" font-size="3.9688px" stroke-width=".26458px" text-align="center" text-anchor="middle">Renderer</tspan></text>
 <text x="44.449955" y="8.4299679" fill="#333333" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="44.449955" y="8.4299679" fill="#333333" font-size="3.9688px" stroke-width=".26458px">Graphics API</tspan></text>
 <path d="m24.871.7493v3.2194h-3.7042v6.0854h3.7042v3.2194l5.3712-6.2622z" fill="#bccade" />
</svg>

<p>For a game like Flap Hero, where the contents of the screen are similar every frame, this approach works perfectly fine.</p>

<p class="verdict">Verdict: OK</p>

<p>Obviously, this approach has its limitations. If you&rsquo;re making a game involving exploration, where the contents of the screen can differ greatly from one moment to the next, you&rsquo;re going to want some kind of abstract scene representation. Depending on the style of game, you could even take a hybrid approach. For example, to make Flap Hero draw different obstacles instead of just pipes, you could replace the code that draws pipes with code that draws a collection of arbitrary objects. The rest of the <code>renderGamePanel()</code> function would remain the same.</p>

<h3 id="no-shader-manager">No Shader Manager</h3>

<p>Since the original Xbox, every game engine I&rsquo;ve worked on has included some kind of shader manager. A shader manager allows game objects to refer to shader programs indirectly using some sort of &ldquo;shader key&rdquo;. The shader key typically describes which features are needed to render each mesh, whether it&rsquo;s skinning, normal mapping, detail texturing or something else. For maximum flexibility, shader inputs are often passed to the graphics API automatically using some kind of reflection system.</p>

<p>Flap Hero has none of that. It has only a fixed set of shader programs. <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/Shaders.cpp"><code>Shaders.cpp</code></a> contains the source code for 15 shaders, and <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/Text.cpp"><code>Text.cpp</code></a> contains 2 more. There&rsquo;s a dedicated shader for the pipes, another for the smoke cloud particles, and a couple that are only used in the title screen. All shaders are compiled at startup.</p>

<p>Moreover, in Flap Hero, all shader parameters are managed <em>by hand</em>. What does that mean? It means that the game extracts the locations of all vertex attributes and uniform variables using code that is <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/Shaders.cpp#L48">written by hand for each shader</a>. Similarly, when the shader is used for drawing, the game passes uniform variables and configures vertex attributes using code <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/Shaders.cpp#L84">written by hand for each shader</a>.</p>

<div><div class="CodeRay">
  <div class="code"><pre>    matShader-&gt;vertPositionAttrib =
        GL_NO_CHECK(GetAttribLocation(matShader-&gt;shader.id, <span class="string"><span class="delimiter">&quot;</span><span class="content">vertPosition</span><span class="delimiter">&quot;</span></span>));
    PLY_ASSERT(matShader-&gt;vertPositionAttrib &gt;= <span class="integer">0</span>);
    matShader-&gt;vertNormalAttrib =
        GL_NO_CHECK(GetAttribLocation(matShader-&gt;shader.id, <span class="string"><span class="delimiter">&quot;</span><span class="content">vertNormal</span><span class="delimiter">&quot;</span></span>));
    PLY_ASSERT(matShader-&gt;vertNormalAttrib &gt;= <span class="integer">0</span>);
    ...
</pre></div>
</div>
</div>

<p>This approach became really tedious after the 16th or 17th shader.</p>

<p class="verdict">Verdict: Bad</p>

<p>I found myself really missing the shader manager I had developed for my <a href="https://preshing.com/20171218/how-to-write-your-own-cpp-game-engine/">custom game engine</a>, which uses runtime reflection to automatically configure vertex attributes and pass uniform variables. In other words, it takes care of a lot of &ldquo;glue code&rdquo; automatically. The main reason I didn&rsquo;t use a similar system in Flap Hero is that, again, Flap Hero is a sample application and I didn&rsquo;t want to bring in too much extra machinery.</p>
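
<p>To give a rough idea of how this kind of glue code can be reduced, here is a minimal table-driven sketch. It is neither a reflection system nor Flap Hero&rsquo;s code: <code>MaterialShader</code>, <code>AttribBinding</code>, <code>bindAttribs()</code> and the <code>lookup</code> callback are hypothetical names, and the callback is an assumed stand-in for a call such as <code>glGetAttribLocation</code> so that the example compiles without an OpenGL context.</p>

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical shader wrapper; the field names mirror the snippet above.
struct MaterialShader {
    int vertPositionAttrib = -1;
    int vertNormalAttrib = -1;
};

// One table entry: the attribute's name in the shader source, and the
// member that should receive its location.
struct AttribBinding {
    const char* name;
    int MaterialShader::*member;
};

// Assumed stand-in for glGetAttribLocation(programID, name).
// Expected to return -1 when the attribute is not found.
using LookupFn = std::function<int(const std::string&)>;

// Fills in every attribute location listed in the table.
// Returns false if at least one attribute could not be found.
inline bool bindAttribs(MaterialShader& shader,
                        const std::vector<AttribBinding>& table,
                        const LookupFn& lookup) {
    bool allFound = true;
    for (const AttribBinding& binding : table) {
        assert(binding.name != nullptr);
        int location = lookup(binding.name);
        shader.*(binding.member) = location;
        if (location < 0)
            allFound = false;
    }
    return allFound;
}

// With this in place, the per-shader code shrinks to a single table.
inline const std::vector<AttribBinding>& materialShaderAttribs() {
    static const std::vector<AttribBinding> table = {
        {"vertPosition", &MaterialShader::vertPositionAttrib},
        {"vertNormal", &MaterialShader::vertNormalAttrib},
    };
    return table;
}
```

<p>Each new shader then only needs its own table, and the lookup loop is shared. A full reflection system goes further by generating the table itself, but the idea is similar.</p>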

<p>Incidentally, Flap Hero uses the <a href="http://docs.gl/gl3/glUniform"><code>glUniform</code></a> family of OpenGL functions to pass uniform variables to shaders. This approach is old-fashioned but easy to implement, and helps catch programming errors if you accidentally pass a uniform that the shader doesn&rsquo;t expect. The more modern approach, which incurs less driver overhead, is to use <a href="https://www.khronos.org/opengl/wiki/Uniform_Buffer_Object">Uniform Buffer Objects</a>.</p>

<h3 id="no-physics-engine">No Physics Engine</h3>

<p>Many game engines incorporate a physics engine like <a href="http://bulletphysics.org/">Bullet</a>, <a href="https://www.havok.com/havok-physics/">Havok</a> or <a href="https://box2d.org/">Box2D</a>. Each of these physics engines uses an approach similar to the &ldquo;abstract scene representation&rdquo; described above. They each maintain their own representation of physics objects in a collection known as the <strong>physics world</strong>. For example, in Bullet, the physics world is represented by a <a href="https://pybullet.org/Bullet/BulletFull/classbtDiscreteDynamicsWorld.html"><code>btDiscreteDynamicsWorld</code></a> object and contains a collection of <a href="https://pybullet.org/Bullet/BulletFull/classbtCollisionObject.html"><code>btCollisionObject</code></a>s. In Box2D, there&rsquo;s a <a href="https://box2d.org/documentation/classb2_world.html"><code>b2World</code></a> containing <a href="https://box2d.org/documentation/classb2_body.html"><code>b2Body</code></a> objects.</p>

<p>You can almost think of the physics world as a game within a game. It&rsquo;s more or less self-sufficient and independent of the game engine containing it. As long as the game keeps calling the physics engine&rsquo;s step function &ndash; for example, <a href="https://box2d.org/documentation/classb2_world.html#a82c081319af9a47e282dde807e4cd7b8">b2World::Step</a> in Box2D &ndash; the physics world will keep running on its own. The game engine takes advantage of the physics world by examining it after each step, using the state of physics objects to drive the position &amp; orientation of its own game objects.</p>

<svg version="1.1" viewbox="0 0 96.573 37.306" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" style="max-width:365px">
 <g fill="#eef6fb" stroke="#cacee2" stroke-width=".26458">
  <circle cx="66.146" cy="12.7" r="1.9844" />
  <circle cx="74.613" cy="10.054" r="1.9844" />
  <circle cx="83.344" cy="11.642" r="1.9844" />
  <circle cx="73.29" cy="17.992" r="1.9844" />
  <circle cx="82.815" cy="19.579" r="1.9844" />
  <circle cx="64.558" cy="21.431" r="1.9844" />
  <circle cx="73.29" cy="26.194" r="1.9844" />
  <circle cx="83.344" cy="27.517" r="1.9844" />
  <circle cx="91.017" cy="16.933" r="1.9844" />
  <circle cx="91.017" cy="24.342" r="1.9844" />
 </g>
 <text x="77.824112" y="3.7041559" fill="#333333" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="77.824112" y="3.7041559" fill="#333333" font-size="3.9688px" stroke-width=".26458px">Physics World</tspan></text>
 <text x="77.81514" y="33.337486" fill="#bbbbbb" font-family="Arimo" font-size="3.4396px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="77.81514" y="33.337486" fill="#bbbbbb" font-size="3.4396px" stroke-width=".26458px">physics objects</tspan></text>
 <g transform="translate(14.817 -8.9958)">
  <path id="rect5444" d="m49.808 45.111h-1.5104c-1.9055 0-3.4396-1.5341-3.4396-3.4396v-23.283c0-1.9055 1.5341-3.4396 3.4396-3.4396h1.5104" fill="none" stroke="#c2c2c2" stroke-width=".79375" />
  <use transform="matrix(-1 0 0 1 126.02 0)" width="100%" height="100%" xlink:href="#rect5444" />
 </g>
 <g>
  <circle cx="7.4083" cy="12.7" r="1.9844" fill="#eef6fb" stroke="#cacee2" stroke-width=".26458" />
  <circle cx="15.875" cy="10.054" r="1.9844" fill="#eef6fb" stroke="#cacee2" stroke-width=".26458" />
  <path d="m46.038 14.508-5.371 6.2622 5.371 6.2622v-3.2194h4.7625v3.2194l5.3713-6.2622-5.3713-6.2622v3.2194h-4.7625z" fill="#bccade" />
 </g>
 <g fill="#eef6fb" stroke="#cacee2" stroke-width=".26458">
  <circle cx="24.606" cy="11.642" r="1.9844" />
  <circle cx="14.552" cy="17.992" r="1.9844" />
  <circle cx="24.077" cy="19.579" r="1.9844" />
  <circle cx="5.8208" cy="21.431" r="1.9844" />
  <circle cx="14.552" cy="26.194" r="1.9844" />
  <circle cx="24.606" cy="27.517" r="1.9844" />
  <circle cx="32.279" cy="16.933" r="1.9844" />
  <circle cx="32.279" cy="24.342" r="1.9844" />
 </g>
 <text x="19.086584" y="3.7041559" fill="#333333" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="19.086584" y="3.7041559" fill="#333333" font-size="3.9688px" stroke-width=".26458px">Scene</tspan></text>
 <text x="19.077608" y="33.337486" fill="#bbbbbb" font-family="Arimo" font-size="3.4396px" letter-spacing="0px" stroke-width=".26458px" text-align="center" text-anchor="middle" word-spacing="0px" style="line-height:125%" xml:space="preserve"><tspan x="19.077608" y="33.337486" fill="#bbbbbb" font-size="3.4396px" stroke-width=".26458px">game objects</tspan></text>
 <g transform="translate(-43.921 -8.9958)">
  <path d="m49.808 45.111h-1.5104c-1.9055 0-3.4396-1.5341-3.4396-3.4396v-23.283c0-1.9055 1.5341-3.4396 3.4396-3.4396h1.5104" fill="none" stroke="#c2c2c2" stroke-width=".79375" />
  <use transform="matrix(-1 0 0 1 126.02 0)" width="100%" height="100%" xlink:href="#rect5444" />
 </g>
</svg>

<p>Flap Hero contains some primitive physics, but doesn&rsquo;t use a physics engine. All Flap Hero needs is to check whether the bird collided with something. For collision purposes, the bird is treated as a sphere and the pipes are treated as cylinders. Most of the work is done by the <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/Collision.cpp#L6"><code>sphereCylinderCollisionTest()</code></a> function, which detects sphere-cylinder collisions. The sphere can collide with three parts of the cylinder: the side, the edge or the cap.</p>
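
<p>One way to tell the three cases apart is to measure how far the sphere&rsquo;s center lies outside the cylinder, both radially and along the axis. The sketch below is not a transcription of Flap Hero&rsquo;s function. It assumes an upright cylinder whose axis is parallel to Z, and every name in it (<code>Sphere</code>, <code>Cylinder</code>, <code>Hit</code>, <code>sphereCylinderTest()</code>) is hypothetical.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 {
    float x, y, z;
};

struct Sphere {
    Vec3 center;
    float radius;
};

// Upright cylinder: the axis runs from (x, y, zMin) to (x, y, zMax).
struct Cylinder {
    float x, y;
    float zMin, zMax;
    float radius;
};

enum class Hit { None, Side, Edge, Cap, Inside };

// Classifies which part of the cylinder the sphere touches, if any.
// Spheres that exactly touch the surface count as hits.
inline Hit sphereCylinderTest(const Sphere& s, const Cylinder& c) {
    assert(s.radius >= 0 && c.radius >= 0 && c.zMin <= c.zMax);

    // Radial excess: how far the center lies outside the curved surface.
    float dx = s.center.x - c.x;
    float dy = s.center.y - c.y;
    float radial = std::max(std::sqrt(dx * dx + dy * dy) - c.radius, 0.0f);

    // Axial excess: how far the center lies beyond the nearest cap.
    float clampedZ = std::min(std::max(s.center.z, c.zMin), c.zMax);
    float axial = std::fabs(s.center.z - clampedZ);

    // (radial, axial) is the offset from the closest point on the cylinder.
    if (radial * radial + axial * axial > s.radius * s.radius)
        return Hit::None;
    if (radial > 0 && axial > 0)
        return Hit::Edge;   // closest point is on the rim
    if (radial > 0)
        return Hit::Side;   // closest point is on the curved surface
    if (axial > 0)
        return Hit::Cap;    // closest point is on the flat end
    return Hit::Inside;     // center is inside the cylinder
}
```

<p>The <code>Inside</code> case is an addition for completeness. A real game would also want the contact point and normal, which this sketch leaves out.</p>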

<svg style="max-width:488px" version="1.1" viewbox="0 0 129.12 46.831" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
 <defs>
  <lineargradient id="linearGradient6123" x1="18.455" x2="35.388" y1="34.131" y2="34.131" gradienttransform="matrix(1 0 0 1.782 4.116 -26.318)" gradientunits="userSpaceOnUse">
   <stop stop-color="#fff" offset="0" />
   <stop stop-color="#ebebeb" offset=".7" />
   <stop stop-color="#c1c1c1" offset="1" />
  </lineargradient>
  <radialgradient id="radialGradient6135" cx="-7.2482" cy="6.6329" r="6.0854" gradienttransform="matrix(1.2949 .74706 -.64424 1.115 17.385 15.665)" gradientunits="userSpaceOnUse">
   <stop stop-color="#fff" offset="0" />
   <stop stop-color="#fff" offset=".34883" />
   <stop stop-color="#ddd" offset="1" />
  </radialgradient>
 </defs>
 <g id="g6141" transform="matrix(.81188 0 0 .81126 -2.4507 4.5104)" opacity=".7">
  <path d="m13.112 5.6873v36.645c0 1.4321 5.1529 2.5931 11.509 2.5931s11.509-1.161 11.509-2.5931v-36.311" fill="url(#linearGradient6123)" stroke="#000" stroke-width=".32601" />
  <path d="m36.131 5.9785a11.509 2.5931 0 01-11.509 2.5931 11.509 2.5931 0 01-11.509-2.5931 11.509 2.5931 0 0111.509-2.5931 11.509 2.5931 0 0111.509 2.5931z" fill="#f9f9f9" stroke="#000" stroke-width=".32601" />
 </g>
 <ellipse id="path6127" cx="5.5083" cy="19.356" rx="4.8333" ry="4.8295" fill="url(#radialGradient6135)" opacity=".7" stroke="#000" stroke-width=".26458" />
 <g stroke-width="1.2322">
  <use id="use6143" transform="translate(49.249)" width="100%" height="100%" stroke-width="1.2322" xlink:href="#g6141" />
  <use id="use6145" transform="translate(50.102 -11.942)" width="100%" height="100%" stroke-width="1.2322" xlink:href="#path6127" />
  <use transform="translate(49.249)" width="100%" height="100%" xlink:href="#use6143" />
  <use transform="translate(56.991 -1.4326)" width="100%" height="100%" xlink:href="#use6145" />
 </g>
 <g fill="#333333" font-family="Arimo" font-size="3.9688px" letter-spacing="0px" stroke-width=".26458px" text-anchor="middle" word-spacing="0px">
  <text x="17.536858" y="45.50843" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="17.536858" y="45.50843" fill="#333333" font-size="3.9688px" stroke-width=".26458px">Side</tspan></text>
  <text x="66.713547" y="45.50843" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="66.713547" y="45.50843" fill="#333333" font-size="3.9688px" stroke-width=".26458px">Edge</tspan></text>
  <text x="116.02007" y="45.50843" text-align="center" style="line-height:125%" xml:space="preserve"><tspan x="116.02007" y="45.50843" fill="#333333" font-size="3.9688px" stroke-width=".26458px">Cap</tspan></text>
 </g>
 <g fill="none" stroke="#d40000">
  <path d="m116.22 9.3608a3.6193.73964 0 01-3.6193.73964 3.6193.73964 0 01-3.6193-.73964 3.6193.73964 0 013.6193-.73964 3.6193.73964 0 013.6193.73964z" stroke-dasharray="0.793751, 0.396876" stroke-width=".39688" />
  <path d="m62.122 11.18c-2.8492-.35874-4.7722-1.0393-4.7722-1.8199 0-.23845.17947-.46757.50999-.68104" stroke-dasharray="0.79375, 0.396875" stroke-width=".79375" />
  <path d="m9.2187 16.19c.25493 0 .66338 1.4311.66338 3.1964 0 1.7653-.29452 3.1964-.54946 3.1964-.25493-3e-6-.29127-1.4311-.29127-3.1964s-.077592-3.1964.17735-3.1964z" stroke-dasharray="0.793748, 0.396875" stroke-width=".39688" />
 </g>
</svg>

<p>For an arcade-style game like Flap Hero that only needs a few basic collision checks, this is good enough. A physics engine isn&rsquo;t necessary and would have only added complexity to the project. The amount of code needed to integrate a physics engine would likely have been greater than the amount of code needed to perform the collision checks ourselves.</p>

<p class="verdict">Verdict: Good</p>

<p>Having said that, if you&rsquo;re working with a game engine that already has an integrated physics engine, it often makes sense to use it. And for games requiring collisions between multiple objects, like the debris in a first-person shooter or the collapsing structures in Angry Birds, physics engines are definitely the way to go.</p>

<h3 id="no-asset-pipeline">No Asset Pipeline</h3>

<p>By &ldquo;asset&rdquo;, I&rsquo;m referring to data files that are loaded by the game: mainly textures, meshes, animations and sounds. I wrote about asset pipelines in an earlier post about <a href="https://preshing.com/20171218/how-to-write-your-own-cpp-game-engine/#be-aware-that-serialization-is-a-big-subject">writing your own game engine</a>.</p>

<p>Flap Hero doesn&rsquo;t have an asset pipeline and has no game-specific formats. Each of its assets is loaded from the format that was used to create it. The game imports 3D models from FBX using <a href="https://www.assimp.org/">Assimp</a>; decodes texture images from PNG using <a href="https://github.com/nothings/stb/blob/master/stb_image.h">stb_image</a>; loads a TrueType font and creates a texture atlas using <a href="https://github.com/nothings/stb/blob/master/stb_truetype.h">stb_truetype</a>; and decodes a 33-second Ogg Vorbis music file using <a href="https://github.com/nothings/stb/blob/master/stb_vorbis.c">stb_vorbis</a>. All of this happens when the game starts up. Despite the amount of processing, the game still loads fairly quickly.</p>

<video width="160" height="200" autoplay="" loop="" muted="">
  <source src="https://preshing.com/images/FlapHero-load.mp4" type="video/mp4" />
</video>

<p>If Flap Hero had an asset pipeline, most of that processing would be performed ahead of time using an offline tool (often known as the &ldquo;cooker&rdquo;) and the game would start even more quickly. But I wasn&rsquo;t worried about that. Flap Hero is just a sample project, and I didn&rsquo;t want to introduce additional build steps. In the end, though, I have to admit that the lack of an asset pipeline made certain things more difficult.</p>

<p class="verdict">Verdict: Bad</p>

<p>If you explore the way <a href="https://github.com/arc80/FlapHero/blob/main/src/flapGame/flapGame/Assets.cpp#L444">materials are associated with 3D meshes</a> in Flap Hero, you&rsquo;ll see what I mean. For example, the materials used to draw the bird have several properties: diffuse color, specular color, rim light color, specular exponent and rim light falloff. Not all of these properties can be represented in the FBX format. As a result, I ended up ignoring the FBX material properties and defining new materials entirely in code.</p>

<p>With an asset pipeline in place, that wouldn&rsquo;t be necessary. For example, in my custom game engine, I can define arbitrary material properties in Blender and export them directly to a flexible in-game format. Each time a mesh is exported, the game engine reloads it on the fly, even when the game is running on a mobile device. This approach is great for iteration times, but obviously takes a lot of work to set up in the first place.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Small Open Source Game In C++]]></title>
    <link href="https://preshing.com/20201126/a-small-open-source-game-in-cpp"/>
    <updated>2020-11-26T07:52:00-05:00</updated>
    <id>https://preshing.com/?p=20201126</id>
    <content type="html"><![CDATA[<p>I just released a mobile game called <strong>Flap Hero</strong>. It&rsquo;s a <a href="https://en.wikipedia.org/wiki/Flappy_Bird">Flappy Bird</a> clone with cartoony graphics and a couple of twists: You can go in the pipes (wow!) and it takes <em>two</em> collisions to end the game. Flap Hero is free, quick to download (between 3 - 5 MB) and opens instantly. Give it a try!</p>

<div style="text-align:center;margin-bottom:1.3em">
<a href="https://apps.apple.com/gb/app/flap-hero/id1538082494"><img src="https://preshing.com/images/Download_on_the_App_Store_Badge_US-UK_RGB_blk_092917.svg" width="161" height="54" style="margin:0 20px" /></a>
<a href="https://play.google.com/store/apps/details?id=com.arc80.flaphero"><img src="https://preshing.com/images/google-play-badge.png" width="181" height="54" style="margin:0 20px" /></a>
</div>

<video width="300" height="400" autoplay="" loop="" controls="" muted="">
  <source src="https://preshing.com/images/FlapHeroPreview@300x400.mp4" type="video/mp4" />
</video>

<p>Flap Hero is <strong>open source</strong>, too. Its source code is released under the MIT license and its assets (3D models, sounds, music) are dedicated to the public domain. Do whatever you want with them! Everything&rsquo;s available <a href="https://github.com/arc80/FlapHero">on GitHub</a>.</p>

<!--more-->

<p><a href="https://github.com/arc80/FlapHero"><img class="center" src="https://preshing.com/images/flaphero-github-button.svg" /></a></p>

<p>I&rsquo;m releasing this game to promote <a href="https://plywood.arc80.com/">Plywood</a>, an open source C++ framework I <a href="https://preshing.com/20200526/a-new-cross-platform-open-source-cpp-framework/">released</a> a few months ago. Flap Hero was made using Plywood.</p>

<h2 id="how-flap-hero-uses-plywood">How Flap Hero Uses Plywood</h2>

<p>If you only read up to this point, you might think that Plywood is a game engine. <strong>It isn&rsquo;t!</strong> Plywood is best described as a &ldquo;module-oriented&rdquo; C++ framework. It gives you a workspace, a set of built-in modules and some (optional) code generation tricks.</p>

<p>Plywood currently has 36 built-in modules, none of which are specific to game development. For game-specific functionality, Flap Hero relies on several excellent third-party libraries: <a href="https://www.assimp.org/">Assimp</a> to load 3D models, <a href="https://sol.gfxile.net/soloud/">SoLoud</a> for audio, <a href="https://github.com/nothings/stb">stb</a> to load textures and fonts, and <a href="https://www.glfw.org/">GLFW</a> for desktop windowing &amp; input.</p>

<p>If Flap Hero relies on third-party libraries, you might be wondering, what&rsquo;s the point of Plywood? Well, those libraries have to be integrated into <em>something</em>. In Plywood, that something is the <a href="https://plywood.arc80.com/docs/DirectoryStructure">Plywood workspace</a>. In this workspace, you can create your own <a href="https://plywood.arc80.com/docs/KeyConcepts#modules">modules</a> that depend on other Plywood modules as well as on third-party libraries. You can then instantiate those modules in <a href="https://plywood.arc80.com/docs/KeyConcepts#build-folders">build folders</a>, and they&rsquo;ll bring all their dependencies along with them.</p>

<p><img srcset="/images/plywood-cabinet.png 1x,/images/plywood-cabinet@2x.png 2x" class="center" src="https://preshing.com/images/plywood-cabinet.png" /></p>

<p>In addition to the aforementioned libraries, Flap Hero uses several built-in Plywood modules such as <code>runtime</code>, <code>math</code> and <code>image</code>. Plywood&rsquo;s <code>runtime</code> module offers an alternative to the standard C and C++ runtimes, providing lean cross-platform I/O, strings, containers and more. The <code>math</code> module provides vectors, matrices, quaternions and other primitives. I&rsquo;ll continue fleshing out the details of these modules in <a href="https://plywood.arc80.com/">Plywood&rsquo;s documentation</a> over time.</p>

<h2 id="a-framework-for-efficient-software">A Framework for Efficient Software</h2>

<p>Flap Hero is written entirely in C++ and isn&rsquo;t built on any existing game engine. It keeps bloat to a minimum, resulting in a small download, fast load times, low memory usage, responsive controls and high framerate for the user, even on older devices. This is the &ldquo;handmade&rdquo; style of software development championed by communities such as the <a href="https://handmade.network/manifesto">Handmade Network</a>.</p>

<p>That&rsquo;s the kind of software that Plywood is meant to help create. Still, Plywood is a work in progress. Here&rsquo;s what I&rsquo;d like to do next:</p>

<svg style="max-width:285px" version="1.1" viewbox="0 0 75.406 39.158" xmlns="http://www.w3.org/2000/svg">
 <g>
  <g stroke="#b3b3b3">
   <path d="m3.7042 6.8248v10.549" fill="none" stroke-dasharray="1.05833, 1.05833" stroke-width=".52917" />
   <path d="m3.7042 22.7v10.549" fill="none" stroke-dasharray="1.05833, 1.05833" stroke-width=".52917" />
   <circle cx="3.7042" cy="3.7042" r="2.9104" fill="#fff" stroke-dashoffset="1.8" stroke-linejoin="round" stroke-width=".52917" />
  </g>
  <g fill="#333333" font-family="Arimo, 'Helvetica Neue', Arial, sans-serif" font-size="4.2333px" letter-spacing="0px" stroke-width=".26458px" word-spacing="0px">
   <text x="10.938342" y="5.0270834" style="line-height:125%" xml:space="preserve"><tspan x="10.938342" y="5.0270834" fill="#333333" font-size="4px" stroke-width=".26458px">Improve the documentation</tspan></text>
   <text x="10.938342" y="20.902081" style="line-height:125%" xml:space="preserve"><tspan x="10.938342" y="20.902081" fill="#333333" font-size="4px" stroke-width=".26458px">Create a GUI build manager</tspan></text>
   <text x="10.938342" y="36.777081" style="line-height:125%" xml:space="preserve"><tspan x="10.938342" y="36.777081" fill="#333333" font-size="4px" stroke-width=".26458px">Open source more modules</tspan></text>
  </g>
  <circle cx="3.7042" cy="19.579" r="2.9104" fill="#fff" stroke="#b3b3b3" stroke-dashoffset="1.8" stroke-linejoin="round" stroke-width=".52917" />
  <circle cx="3.7042" cy="35.454" r="2.9104" fill="#fff" stroke="#b3b3b3" stroke-dashoffset="1.8" stroke-linejoin="round" stroke-width=".52917" />
 </g>
</svg>

<p>I&rsquo;m especially excited about the potential of a GUI build manager. The GUI build manager would be a graphical user interface that lets you manage the Plywood workspace interactively, bypassing the somewhat cumbersome <a href="https://plywood.arc80.com/docs/PlyTool">command line tool</a>. Ideally, this tool would have close integration with various package managers across different platforms. The goal would be to make it as simple as possible to integrate third-party libraries and get projects built on other machines &ndash; something that&rsquo;s still a bit of a weak spot in Plywood.</p>

<p>The next post on this blog will be a review of Flap Hero&rsquo;s source code. Stay tuned if that kind of thing interests you!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Automatically Detecting Text Encodings in C++]]></title>
    <link href="https://preshing.com/20200727/automatically-detecting-text-encodings-in-cpp"/>
    <updated>2020-07-27T16:10:00-04:00</updated>
    <id>https://preshing.com/?p=20200727</id>
    <content type="html"><![CDATA[<p>Consider the lowly text file.</p>

<p><img class="center" src="https://preshing.com/images/untitled_txt.png" /></p>

<p>This text file can take on a surprising number of different formats. The text could be encoded as <a href="https://en.wikipedia.org/wiki/ASCII">ASCII</a>, <a href="https://en.wikipedia.org/wiki/UTF-8">UTF-8</a>, <a href="https://en.wikipedia.org/wiki/UTF-16">UTF-16</a> (little or big-endian), <a href="https://en.wikipedia.org/wiki/Windows-1252">Windows-1252</a>, <a href="https://en.wikipedia.org/wiki/Shift_JIS">Shift JIS</a>, or any of dozens of other encodings. The file may or may not begin with a <a href="https://en.wikipedia.org/wiki/Byte_order_mark">byte order mark (BOM)</a>. Lines of text could be terminated with a linefeed character <code>\n</code> (typical on UNIX), a CRLF sequence <code>\r\n</code> (typical on Windows) or, if the file was created on an older system, <a href="https://en.wikipedia.org/wiki/Newline#Representation">some other character sequence</a>.</p>

<p>Sometimes it&rsquo;s impossible to determine the encoding used by a particular text file. For example, suppose a file contains the following bytes:</p>

<!--more-->
<p style="text-align: center; font-size: 1.4em;"><code>C2 A2 C2 A2 C2 A2</code></p>

<p>This could be:</p>

<ul>
  <li>a UTF-8 file containing &ldquo;¢¢¢&rdquo;</li>
  <li>a little-endian UTF-16 (or <a href="http://justsolve.archiveteam.org/wiki/UCS-2">UCS-2</a>) file containing &ldquo;ꋂꋂꋂ&rdquo;</li>
  <li>a big-endian UTF-16 file containing &ldquo;슢슢슢&rdquo;</li>
  <li>a Windows-1252 file containing &ldquo;Â¢Â¢Â¢&rdquo;</li>
</ul>

<p>That&rsquo;s obviously an artificial example, but the point is that text files are inherently ambiguous. This poses a challenge to software that loads text.</p>
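
<p>The ambiguity is easy to reproduce in code. The following sketch decodes the byte pair <code>C2 A2</code> (the UTF-8 encoding of &ldquo;¢&rdquo;, U+00A2) three different ways. The decoders are deliberately simplified and all names are hypothetical: <code>decodeUtf8()</code> doesn&rsquo;t reject overlong forms, and <code>decodeUtf16()</code> assumes the text stays within the Basic Multilingual Plane, so surrogate pairs aren&rsquo;t combined.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
using CodePoints = std::vector<std::uint32_t>;

// Simplified UTF-8 decoder for 1- to 4-byte sequences.
// Returns an empty result on a malformed or truncated sequence.
inline CodePoints decodeUtf8(const Bytes& in) {
    CodePoints out;
    for (std::size_t i = 0; i < in.size();) {
        std::uint8_t lead = in[i];
        std::size_t extra;  // number of continuation bytes expected
        std::uint32_t cp;
        if (lead < 0x80) { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return {};  // stray continuation byte or invalid lead byte
        if (i + extra >= in.size())
            return {};  // sequence runs past the end of the input
        for (std::size_t k = 1; k <= extra; ++k) {
            std::uint8_t cont = in[i + k];
            if ((cont & 0xC0) != 0x80)
                return {};
            cp = (cp << 6) | (cont & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// UTF-16 decoder for the Basic Multilingual Plane only:
// each pair of bytes becomes one code point.
inline CodePoints decodeUtf16(const Bytes& in, bool bigEndian) {
    assert(in.size() % 2 == 0);
    CodePoints out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
        std::uint32_t first = in[i];
        std::uint32_t second = in[i + 1];
        out.push_back(bigEndian ? (first << 8) | second
                                : (second << 8) | first);
    }
    return out;
}
```

<p>Given <code>C2 A2</code>, <code>decodeUtf8()</code> yields U+00A2, little-endian <code>decodeUtf16()</code> yields U+A2C2 and big-endian yields U+C2A2. All three are valid characters, so nothing in the bytes themselves says which reading is correct.</p>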

<p>It&rsquo;s a problem that has <a href="https://devblogs.microsoft.com/oldnewthing/20070417-00/?p=27223">been around for a while</a>. Fortunately, the text file landscape has gotten simpler over time, with UTF-8 winning out over other character encodings. <a href="https://w3techs.com/technologies/history_overview/character_encoding/ms/y">More than 95% of the Internet</a> is now delivered using UTF-8. It&rsquo;s impressive how quickly that number has changed; it was less than 10% <a href="https://googleblog.blogspot.com/2012/02/unicode-over-60-percent-of-web.html">as recently as 2006</a>.</p>

<p><a href="https://googleblog.blogspot.com/2012/02/unicode-over-60-percent-of-web.html"><img class="center" src="https://preshing.com/images/historical-utf8.png" /></a></p>

<p>UTF-8 hasn&rsquo;t taken over the world just yet, though. The Windows Registry editor, for example, still saves text files as UTF-16. When writing a text file from Python, the default encoding is platform-dependent; on my Windows PC, it&rsquo;s Windows-1252. In other words, the ambiguity problem still exists today. And even if a text file is encoded in UTF-8, there are still variations in format, since the file may or may not start with a BOM and could use either UNIX-style or Windows-style line endings.</p>

<h2 id="how-the-plywood-c-framework-loads-text">How the Plywood C++ Framework Loads Text</h2>

<p><a href="https://plywood.arc80.com/">Plywood</a> is a cross-platform open-source C++ framework I <a href="https://preshing.com/20200526/a-new-cross-platform-open-source-cpp-framework">released two months ago</a>. When opening a text file using Plywood, you have a couple of options:</p>

<ul>
  <li>If you know the exact format of the text file ahead of time, you can call <a href="https://plywood.arc80.com/docs/modules/runtime/api/filesystem/FileSystem#openTextForRead"><code>FileSystem::openTextForRead()</code></a>, passing the expected format in a <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/text/TextFormat"><code>TextFormat</code></a> structure.</li>
  <li>If you don&rsquo;t know the exact format, you can call <a href="https://plywood.arc80.com/docs/modules/runtime/api/filesystem/FileSystem#openTextForReadAutodetect"><code>FileSystem::openTextForReadAutodetect()</code></a>, which will attempt to detect the format automatically and return it to you.</li>
</ul>

<p>The input stream returned from these functions never starts with a BOM, is always encoded in UTF-8, and always terminates each line of input with a single linefeed <code>\n</code>, regardless of the input file&rsquo;s original format. Conversion is performed on the fly if needed. This allows Plywood applications to work with a single encoding internally.</p>
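
<p>For text that is already UTF-8, that kind of normalization comes down to two steps: skip a leading BOM and collapse CRLF pairs. The sketch below illustrates the idea on an in-memory string. It is not Plywood&rsquo;s implementation, which converts a stream on the fly, and <code>normalizeText()</code> is a hypothetical name.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns a copy of UTF-8 text with any leading BOM removed and
// every CRLF pair replaced by a single linefeed.
inline std::string normalizeText(const std::string& in) {
    std::size_t pos = 0;

    // Skip a UTF-8 byte order mark (EF BB BF) if present.
    if (in.compare(0, 3, "\xEF\xBB\xBF") == 0)
        pos = 3;

    std::string out;
    out.reserve(in.size() - pos);
    for (; pos < in.size(); ++pos) {
        // Drop the CR of a CRLF pair; a lone CR is kept as-is.
        if (in[pos] == '\r' && pos + 1 < in.size() && in[pos + 1] == '\n')
            continue;
        out.push_back(in[pos]);
    }

    // Normalization only ever removes bytes.
    assert(out.size() <= in.size());
    return out;
}
```

<p>Files in other encodings, such as UTF-16, would need to be transcoded to UTF-8 before a step like this.</p>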

<h3 id="automatic-format-detection">Automatic Format Detection</h3>

<p>Here&rsquo;s how Plywood&rsquo;s automatic text format detection currently works:</p>

<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 362 757" style="max-width:362px" xmlns:xlink="http://www.w3.org/1999/xlink">
 <g transform="translate(0 -295.36)">
  <path d="m360 517.36c0-9.3998-4.4781-14-14-14h-39" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m119 945.36h227c9.5219 0 14-4.6002 14-14v-564c0-9.3998-4.4781-14-14-14h-39" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m118 845.36h150c9.5219 0 14-4.6002 14-14v-11" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m193 679.36h75c9.5219 0 14 4.6002 14 14v85" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m143.75 503.36h95.25" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m143.75 353.36h95.25" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m118 992.86v40.5" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m118 916.86v40.5" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m118 823.86v40.5" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m118 757.36v25" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m118 573.36v25" stroke="#000080" stroke-width="2" fill="none" />
  <path d="m118 405.36v25" stroke="#000080" stroke-width="2" fill="none" />
  <rect stroke-dashoffset="1.8" transform="rotate(90)" height="229" width="62" stroke="#ccc" stroke-miterlimit="6" y="-232.5" x="865.86" fill="#fffff9" />
  <rect stroke-dashoffset="1.8" transform="rotate(90)" height="133" width="44" stroke="#ccc" stroke-miterlimit="6" y="-184.5" x="964.86" fill="#fffff9" />
  <rect stroke-dashoffset="1.8" transform="rotate(90)" height="123" width="44" stroke="#ccc" stroke-miterlimit="6" y="-179.5" x="784.86" fill="#fffff9" />
  <rect stroke-dashoffset="1.8" transform="rotate(90)" height="122" width="44" stroke="#ccc" stroke-miterlimit="6" y="-344.5" x="784.86" fill="#fffff9" />
  <rect stroke-dashoffset="1.8" transform="rotate(90)" height="85" width="44" stroke="#ccc" stroke-miterlimit="6" y="-325.5" x="481.86" fill="#fffff9" />
  <rect stroke-dashoffset="1.8" transform="rotate(90)" height="85" width="44" stroke="#ccc" stroke-miterlimit="6" y="-325.5" x="331.86" fill="#fffff9" />
  <rect stroke-dashoffset="1.8" transform="rotate(45)" height="111.34" width="111.34" stroke="#ccc" stroke-miterlimit="6" y="341.63" x="508.5" fill="#fffff9" />
  <rect stroke-dashoffset="1.8" transform="rotate(45)" height="100.03" width="100.03" stroke="#ccc" stroke-miterlimit="6" y="222.6" x="389.48" fill="#fffff9" />
  <rect stroke-dashoffset="1.8" transform="rotate(45)" height="77.402" width="77.402" stroke="#ccc" stroke-miterlimit="6" y="127.85" x="294.73" fill="#fffff9" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="343.327" x="118.35266" font-family="Arimo" fill="#333333"><tspan x="118.35266" y="343.327">Does the</tspan><tspan x="118.35266" y="358.452">file start with</tspan><tspan x="118.35266" y="373.577">a BOM?</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="349.34866" x="283.04028" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="349.34866" x="283.04028">Use BOM</tspan><tspan y="364.47366" x="283.04028">encoding</tspan><tspan y="379.59866" x="283.04028" /></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="470.45822" x="118.01361" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="470.45822" x="118.01361">Can</tspan><tspan y="485.58322" x="118.01361">the file be</tspan><tspan y="500.70822" x="118.01361">decoded as UTF-8</tspan><tspan y="515.83325" x="118.01361">without any errors</tspan><tspan y="530.95825" x="118.01361">or control</tspan><tspan y="546.08325" x="118.01361">codes?</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="508.27075" x="282.77173" font-family="Arimo" fill="#333333"><tspan x="282.77173" y="508.27075">Use UTF-8</tspan><tspan x="282.77173" y="523.39575" /></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="632.04224" x="118.14453" font-family="Arimo" fill="#333333"><tspan x="118.14453" y="632.04224">When</tspan><tspan x="118.14453" y="647.16724">decoding</tspan><tspan x="118.14453" y="662.29224">as UTF-8, are</tspan><tspan x="118.14453" y="677.41724">there decoding errors</tspan><tspan x="118.14453" y="692.54224">in more than 25% of</tspan><tspan x="118.14453" y="707.66724">non-ASCII code</tspan><tspan x="118.14453" y="722.79224">points?</tspan><tspan x="118.14453" y="737.91724" /></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="803.88953" x="117.89276" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="803.88953" x="117.89276">The 8-bit format</tspan><tspan y="819.01453" x="117.89276">is UTF-8</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="802.52997" x="283.39258" font-family="Arimo" fill="#333333"><tspan x="283.39258" y="802.52997">The 8-bit format</tspan><tspan x="283.39258" y="817.65497">is plain bytes</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="886.32703" x="117.60406" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="886.32703" x="117.60406">Try decoding as little and big-endian</tspan><tspan y="901.45203" x="117.60406">UTF-16, then take the best score</tspan><tspan y="916.57703" x="117.60406">between those and the 8-bit format</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="984.961" x="118.32916" font-family="Arimo" fill="#333333"><tspan x="118.32916" y="984.961">Detect line ending</tspan><tspan x="118.32916" y="1000.086">type (LF or CRLF)</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="15px" line-height="110.00000238%" y="1050.9609" x="118.05725" font-family="Arimo" xml:space="preserve" fill="#000000"><tspan font-size="15px" y="1050.9609" x="118.05725" font-weight="bold">Done</tspan></text>
  <path transform="matrix(.16492 0 0 .12242 36.858 420.07)" fill="#000080" d="m492 100.37-21.22-36.755-21.22-36.758h42.44l42.444-0.000001-21.222 36.758z" />
  <path d="m492 100.37-21.22-36.755-21.22-36.758h42.44l42.444-0.000001-21.222 36.758z" transform="matrix(.16492 0 0 .12242 36.858 588.07)" fill="#000080" />
  <path transform="matrix(.16492 0 0 .12242 36.858 773.07)" fill="#000080" d="m492 100.37-21.22-36.755-21.22-36.758h42.44l42.444-0.000001-21.222 36.758z" />
  <path d="m492 100.37-21.22-36.755-21.22-36.758h42.44l42.444-0.000001-21.222 36.758z" transform="matrix(.16492 0 0 .12242 36.858 854.07)" fill="#000080" />
  <path transform="matrix(.16492 0 0 .12242 36.858 953.07)" fill="#000080" d="m492 100.37-21.22-36.755-21.22-36.758h42.44l42.444-0.000001-21.222 36.758z" />
  <path d="m492 100.37-21.22-36.755-21.22-36.758h42.44l42.444-0.000001-21.222 36.758z" transform="matrix(.16492 0 0 .12242 36.858 1023.1)" fill="#000080" />
  <path d="m492 100.37-21.22-36.755-21.22-36.758h42.44l42.444-0.000001-21.222 36.758z" transform="matrix(0 -.16492 .12242 0 228.71 434.5)" fill="#000080" />
  <path transform="matrix(0 -.16492 .12242 0 228.71 584.5)" fill="#000080" d="m492 100.37-21.22-36.755-21.22-36.758h42.44l42.444-0.000001-21.222 36.758z" />
  <path transform="matrix(.16492 0 0 .12242 200.86 773.07)" fill="#000080" d="m492 100.37-21.22-36.755-21.22-36.758h42.44l42.444-0.000001-21.222 36.758z" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="10px" line-height="110.00000238%" y="364.27075" x="184.77173" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan font-size="12.5px" y="364.27075" x="184.77173" fill="#000080">yes</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="10px" line-height="110.00000238%" y="514.27075" x="200.77173" font-family="Arimo" fill="#333333"><tspan font-size="12.5px" y="514.27075" x="200.77173" fill="#000080">yes</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="10px" line-height="110.00000238%" y="690.27075" x="208.77173" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan font-size="12.5px" y="690.27075" x="208.77173" fill="#000080">yes</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="10px" line-height="110.00000238%" y="416.27075" x="122.49426" font-family="Arimo" fill="#333333"><tspan font-size="12.5px" y="416.27075" x="122.49426" fill="#000080">no</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="10px" line-height="110.00000238%" y="582.27075" x="122.49426" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan font-size="12.5px" y="582.27075" x="122.49426" fill="#000080">no</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="10px" line-height="110.00000238%" y="768.27075" x="122.49426" font-family="Arimo" fill="#333333"><tspan font-size="12.5px" y="768.27075" x="122.49426" fill="#000080">no</tspan></text>
 </g>
</svg>

<p>Plywood analyzes up to the first 4KB of the input file in order to guess its format. The first two checks handle the vast majority of text files I&rsquo;ve encountered. There are lots of invalid byte sequences in UTF-8, so if a text file can be decoded as UTF-8 and doesn&rsquo;t contain any control codes, then it&rsquo;s almost certainly a UTF-8 file. (A control code is considered to be any code point less than 32 except for tab, linefeed and carriage return.)</p>

<p>It&rsquo;s only when we enter the bottom half of the flowchart that some guesswork begins to happen. First, Plywood decides whether it&rsquo;s better to interpret the file as UTF-8 or as plain bytes. This is meant to catch, for example, text encoded in Windows-1252 that uses accented characters. In Windows-1252, the French word <em>détail</em> is encoded as <code>64 E9 74 61 69 6C</code>, which triggers a UTF-8 decoding error since UTF-8 expects <code>E9</code> to be followed by a byte in the range <code>80</code> - <code>BF</code>. After a certain number of such errors, Plywood will favor plain bytes over UTF-8.</p>
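
<p>The checks described so far can be sketched in a few dozen lines of standard C++. This is an illustration only, not Plywood&rsquo;s source: the names <code>Utf8Stats</code>, <code>scanUtf8</code> and <code>preferPlainBytes</code> are made up for this example, and the way the 25% threshold from the flowchart is computed (errors divided by errors plus decoded non-ASCII code points) is my assumption.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Counters gathered while attempting to decode a buffer as UTF-8.
struct Utf8Stats {
    std::size_t nonAscii = 0;      // successfully decoded code points >= U+0080
    std::size_t errors = 0;        // bytes that could not start a valid sequence
    std::size_t controlCodes = 0;  // code points < 32 other than tab, LF, CR
};

Utf8Stats scanUtf8(const std::string& buf) {
    Utf8Stats stats;
    std::size_t i = 0;
    while (i < buf.size()) {
        unsigned char b = static_cast<unsigned char>(buf[i]);
        if (b < 0x80) {
            if (b < 32 && b != '\t' && b != '\n' && b != '\r')
                ++stats.controlCodes;
            ++i;
            continue;
        }
        // Derive the sequence length and the smallest legal value from the lead byte.
        std::size_t len = 0;
        std::uint32_t cp = 0, minCp = 0;
        if ((b & 0xE0) == 0xC0)      { len = 2; cp = b & 0x1F; minCp = 0x80; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; minCp = 0x800; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; minCp = 0x10000; }
        bool ok = len != 0 && i + len <= buf.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(buf[i + k]);
            ok = (c & 0xC0) == 0x80;  // continuation bytes are 80..BF
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, surrogates and values beyond U+10FFFF.
        ok = ok && cp >= minCp && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
        if (ok) { ++stats.nonAscii; i += len; }
        else    { ++stats.errors;   ++i; }  // resynchronize on the next byte
    }
    return stats;
}

// Second check in the flowchart: clean UTF-8 with no control codes.
bool isCleanUtf8(const Utf8Stats& s) {
    return s.errors == 0 && s.controlCodes == 0;
}

// Third check. Assumption: each error counts as one non-ASCII "code point"
// in the denominator of the 25% ratio.
bool preferPlainBytes(const Utf8Stats& s) {
    std::size_t total = s.nonAscii + s.errors;
    return total > 0 && s.errors * 4 > total;
}
```

<p>Feeding the Windows-1252 bytes for <em>détail</em> through <code>scanUtf8</code> yields one error and no decoded non-ASCII code points, so <code>preferPlainBytes</code> returns <code>true</code>. The same word encoded as UTF-8 (<code>64 C3 A9 74 61 69 6C</code>) decodes without errors.</p>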

<h3 id="scoring-system">Scoring System</h3>

<p>After that, Plywood attempts to decode the same data using the 8-bit format, little-endian UTF-16 and big-endian UTF-16. It calculates a score for each encoding as follows:</p>

<ul>
  <li>Each <strong>whitespace</strong> character decoded is worth <strong>+2.5</strong> points. Whitespace is very helpful to identify encodings, since UTF-8 whitespace can&rsquo;t be recognized in UTF-16, and UTF-16 whitespace contains control codes when interpreted in an 8-bit encoding.</li>
  <li><strong>ASCII</strong> characters are worth <strong>+1</strong> point each, except for control codes.</li>
  <li><strong>Decoding errors</strong> incur a penalty of <strong>-100</strong> points.</li>
  <li><strong>Control codes</strong> incur a penalty of <strong>-50</strong> points.</li>
  <li>Code points greater than U+FFFF are worth <strong>+5</strong> points, since the odds of encountering such characters in random data are low no matter what the encoding. This includes <strong>emojis</strong>.</li>
</ul>

<p>Scores are divided by the total number of characters decoded, and the best score is chosen. If you&rsquo;re wondering where these point values came from, I made them up! They&rsquo;re probably not optimal yet.</p>
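
<p>Here&rsquo;s a rough sketch of how such a scoring pass could look, using the point values listed above. It is not Plywood&rsquo;s code, and it fills a few gaps with assumptions of my own: whitespace means space, tab, linefeed and carriage return; a whitespace character earns only the +2.5 (not an extra +1 for being ASCII); other code points up to U+FFFF earn nothing; and the divisor counts decoding errors as well as code points. The 8-bit candidate is treated as plain bytes, and all names (<code>Score</code>, <code>scoreBytes</code>, <code>scoreUtf16</code>) are hypothetical.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Running totals for one candidate encoding.
struct Score {
    double points = 0;
    std::size_t count = 0;  // code points plus errors (assumed denominator)
    double average() const { return count ? points / count : 0.0; }
};

void addError(Score& s) { s.points -= 100; ++s.count; }

void addCodePoint(Score& s, std::uint32_t cp) {
    ++s.count;
    if (cp == ' ' || cp == '\t' || cp == '\n' || cp == '\r') s.points += 2.5;
    else if (cp < 32) s.points -= 50;     // control code
    else if (cp < 128) s.points += 1;     // other ASCII
    else if (cp > 0xFFFF) s.points += 5;  // beyond U+FFFF, e.g. emoji
}

// 8-bit candidate: every byte is taken as one code point, so no errors occur.
Score scoreBytes(const std::string& buf) {
    Score s;
    for (unsigned char b : buf) addCodePoint(s, b);
    return s;
}

// UTF-16 candidate with the given byte order.
Score scoreUtf16(const std::string& buf, bool bigEndian) {
    Score s;
    auto unit = [&](std::size_t i) -> std::uint32_t {
        std::uint32_t a = static_cast<unsigned char>(buf[i]);
        std::uint32_t b = static_cast<unsigned char>(buf[i + 1]);
        return bigEndian ? (a << 8) | b : (b << 8) | a;
    };
    std::size_t i = 0;
    while (i + 1 < buf.size()) {
        std::uint32_t u = unit(i);
        i += 2;
        if (u >= 0xD800 && u <= 0xDBFF) {  // high surrogate: needs a low one next
            if (i + 1 < buf.size()) {
                std::uint32_t lo = unit(i);
                if (lo >= 0xDC00 && lo <= 0xDFFF) {
                    i += 2;
                    addCodePoint(s, 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00));
                    continue;
                }
            }
            addError(s);
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            addError(s);                   // low surrogate without a high one
        } else {
            addCodePoint(s, u);
        }
    }
    if (i < buf.size()) addError(s);       // odd trailing byte
    return s;
}
```

<p>For the six bytes <code>48 00 69 00 0A 00</code> (<code>"Hi\n"</code> in little-endian UTF-16), this sketch gives little-endian UTF-16 an average of 1.5, big-endian UTF-16 an average of 0, and plain bytes a strongly negative average because every <code>00</code> byte is a control code. The caller would then pick the candidate with the highest average.</p>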

<p>The algorithm has other weaknesses. Plywood doesn&rsquo;t yet know how to decode arbitrary 8-bit encodings. Currently, it interprets every 8-bit text file that isn&rsquo;t UTF-8 as Windows-1252. It also doesn&rsquo;t support Shift JIS at this time. The good news is that Plywood is an open source project on GitHub, which means that improvements can be published as soon as they&rsquo;re developed.</p>

<h2 id="the-test-suite">The Test Suite</h2>

<p>In Plywood&rsquo;s GitHub repository, you&rsquo;ll find <a href="https://github.com/arc80/plywood/tree/9c606056faf89f0918b81f5af09c23fefaf9a12d/repos/plywood/src/apps/AutodetectTest/tests">a folder that contains 50 different text files</a> using a variety of formats. All of these files are identified and loaded correctly using <code>FileSystem::openTextForReadAutodetect()</code>.</p>

<p><a href="https://github.com/arc80/plywood/tree/9c606056faf89f0918b81f5af09c23fefaf9a12d/repos/plywood/src/apps/AutodetectTest/tests"><img class="center" src="https://preshing.com/images/autodetect-text-test-files.png" /></a></p>

<p>A lot of modern text editors perform automatic format detection, just like Plywood. Out of curiosity, I tried opening this set of text files in a few editors:</p>

<ul>
  <li><a href="https://notepad-plus-plus.org/">Notepad++</a> correctly detected the format of <strong>38</strong> out of 50 files. It fails on all UTF-16 files that are missing a BOM except for little-endian files that mostly consist of ASCII characters.</li>
  <li><a href="https://www.sublimetext.com/">Sublime Text</a> correctly detected the format of <strong>42</strong> files. When text consists mostly of ASCII, it guesses correctly no matter what the encoding.</li>
  <li><a href="https://code.visualstudio.com/">Visual Studio Code</a> correctly detected the format of <strong>40</strong> files. It&rsquo;s like Sublime Text, but fails on Windows-1252 files containing accented characters.</li>
  <li>And perhaps most impressively, <strong>Windows Notepad</strong> correctly detected the format of a whopping <strong>42</strong> files! It guesses correctly on all little-endian UTF-16 files without BOMs, but fails on all big-endian UTF-16 files without BOMs.</li>
</ul>

<p>Admittedly, this wasn&rsquo;t a fair contest, since the entire test suite is hand-made for Plywood. And most of the time, when an editor failed, it was on a UTF-16 file that was missing a BOM, which seems to be a rare format &ndash; none of the editors allows you to save such a file.</p>

<p>It&rsquo;s worth mentioning that these text editors were the inspiration for Plywood&rsquo;s autodetection strategy in the first place. Working with Unicode has always been difficult in C++ &ndash; the situation is bad enough that the standard C++ committee recently <a href="https://isocpp.org/files/papers/p1238r0.html">formed a study group</a> dedicated to improving it. Meanwhile, I&rsquo;ve always wondered: Why can&rsquo;t loading text in C++ be as simple as it is in a modern text editor?</p>

<p>If you&rsquo;d like to improve Plywood in any of the ways mentioned in this post, feel free to get involved <a href="https://github.com/arc80/plywood">on GitHub</a> or in <a href="https://discord.gg/WnQhuVF">the Discord server</a>. And if you only need the source code that detects text encodings, but don&rsquo;t want to adopt Plywood itself, I get it! Feel free to <a href="https://github.com/arc80/plywood/blob/main/repos/plywood/src/runtime/ply-runtime/io/text/TextFormat.cpp">copy the source code</a> however you see fit &ndash; everything is MIT licensed.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[I/O in Plywood]]></title>
    <link href="https://preshing.com/20200708/io-in-plywood"/>
    <updated>2020-07-08T08:15:00-04:00</updated>
    <id>https://preshing.com/?p=20200708</id>
    <content type="html"><![CDATA[<p><a href="https://plywood.arc80.com/">Plywood</a> is an open-source C++ framework I released a few weeks ago. It includes, among other things, a <a href="https://github.com/arc80/plywood/tree/main/repos/plywood/src/runtime">runtime module</a> that exposes a cross-platform API for I/O, memory, threads, process management and more.</p>

<p>This post is about the <strong>I/O</strong> part. For those who don&rsquo;t know, I/O stands for <a href="https://en.wikipedia.org/wiki/Input/output">input/output</a>, and refers to the part of a computer system that either <strong>writes serialized data to</strong> or <strong>reads serialized data from</strong> an external interface. The external interface could be a storage device, pipe, network connection or any other type of communication channel.</p>

<p><img class="center" src="https://preshing.com/images/c1571-drive.jpg" /></p>

<p>Typically, it&rsquo;s the operating system&rsquo;s responsibility to provide low-level I/O services to an application. But there&rsquo;s still plenty of work that needs to happen at the application level, such as buffering, data conversion, performance tuning and exposing an interface that makes life easier on application programmers. That&rsquo;s where Plywood&rsquo;s I/O system comes in.</p>

<!--more-->

<p>Of course, standard C++ already comes with its own <a href="https://en.cppreference.com/w/cpp/io">input/output library</a>, as does the <a href="https://en.cppreference.com/w/cpp/io/c">standard C runtime</a>, and most C and C++ programmers are quite familiar with those libraries. Plywood&rsquo;s I/O system is meant to serve as an alternative to those libraries. Those libraries were originally developed in <a href="https://en.wikipedia.org/wiki/Input/output_(C%2B%2B)#History">1984</a> and <a href="https://en.wikipedia.org/wiki/C_file_input/output">the early 1970s</a>, respectively. They&rsquo;ve stood the test of time incredibly well, but I don&rsquo;t think it&rsquo;s outrageous to suggest that, hey, <em>maybe</em> some innovation is possible here.</p>

<p>To be clear, when you build a project using Plywood, you aren&rsquo;t <em>required</em> to use Plywood&rsquo;s I/O system &ndash; you can still use the standard C or C++ runtime library, if you prefer.</p>

<p>I&rsquo;m sure this blog post will seem dry for some (or many) readers &ndash; but not for me! I like this topic, and I&rsquo;m willing to bet that there are other low-level I/O wonks out there who will find it interesting as well. So let&rsquo;s jump in.</p>

<h2 id="writing-raw-bytes-to-standard-output">Writing Raw Bytes to Standard Output</h2>

<p>The following program writes <code>"Hello!\n"</code> to standard output as a raw sequence of 7 bytes. No newline conversion or character encoding conversion is performed. Writing takes place through an <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/OutStream"><code>OutStream</code></a>, which is a class (defined in the <code>ply</code> namespace) that performs buffered output.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="preprocessor">#include</span> <span class="include">&lt;ply-runtime/Base.h&gt;</span>

<span class="predefined-type">int</span> main() {
    <span class="directive">using</span> <span class="keyword">namespace</span> ply;
    OutStream outs = StdOut::binary();
    outs.write({<span class="string"><span class="delimiter">&quot;</span><span class="content">Hello!</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, <span class="integer">7</span>});
    <span class="keyword">return</span> <span class="integer">0</span>;
}
</pre></div>
</div>
</div>

<p>Now, suppose we pause this program immediately after the <code>OutStream</code> is created, before anything gets written. On a Linux system, this is what the <code>OutStream</code> initially looks like in memory:</p>

<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 694 336" style="max-width:694px" xmlns:xlink="http://www.w3.org/1999/xlink">
 <defs>
  <filter id="b" style="color-interpolation-filters:sRGB">
   <feflood result="flood" flood-opacity=".062745" flood-color="rgb(0,0,0)" />
   <fecomposite operator="in" result="composite1" in2="SourceGraphic" in="flood" />
   <fegaussianblur stddeviation="3" result="blur" in="composite1" />
   <feoffset result="offset" dx="2" dy="2" />
   <fecomposite operator="over" result="composite2" in2="offset" in="SourceGraphic" />
  </filter>
  <filter id="a" style="color-interpolation-filters:sRGB">
   <feflood result="flood" flood-opacity=".12549" flood-color="rgb(0,0,0)" />
   <fecomposite operator="in" result="composite1" in2="SourceGraphic" in="flood" />
   <fegaussianblur stddeviation="3" result="blur" in="composite1" />
   <feoffset result="offset" dx="2" dy="2" />
   <fecomposite operator="over" result="composite2" in2="offset" in="SourceGraphic" />
  </filter>
 </defs>
 <g transform="translate(0 -716.36)">
  <g stroke-linejoin="round" transform="translate(-1035,410)" stroke-dashoffset="1.8" filter="url(#b)" stroke="#ccc" stroke-linecap="round" fill="#fffff9">
   <rect height="69" width="163" y="514.86" x="1068.5" />
   <rect height="148" width="321" y="345.86" x="1392.5" />
   <rect height="132" width="321" y="493.86" x="1392.5" />
   <rect height="161" width="267" y="329.86" x="1048.5" />
  </g>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="770.36218" x="44" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="770.36218" x="44" font-size="13.125px"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> curByte</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="785.36218" x="44" font-family="Consolas,monospace" fill="#000000"><tspan x="44" font-size="13.125px" y="785.36218"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> endByte</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="800.36218" x="44" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="800.36218" x="44" font-size="13.125px"><tspan fill="#15999c">Reference<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>ChunkListNode</tspan><tspan fill="#07448d">&gt;</tspan><tspan fill="#333333"> chunk</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="815.36218" x="44" font-family="Consolas,monospace" fill="#000000"><tspan x="44" font-size="13.125px" y="815.36218"><tspan fill="#15999c">OutPipe</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> outPipe</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="830.36218" x="44" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="830.36218" x="44" font-size="13.125px"><tspan fill="#15999c">Status</tspan><tspan fill="#333333"> status</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="845.36218" x="68" font-family="Consolas,monospace" fill="#000000"><tspan x="68" font-size="13.125px" y="845.36218"><tspan fill="#15999c">u32</tspan><tspan> chunkSizeExp <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 28 </tspan><tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 12</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="860.36218" x="68" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="860.36218" x="68" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> type <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 2 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 1</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="875.36218" x="68" font-family="Consolas,monospace" fill="#000000"><tspan x="68" font-size="13.125px" y="875.36218"><tspan fill="#15999c">u32</tspan><tspan> isPipeOwner <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 1 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="890.36218" x="68" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="890.36218" x="68" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> eof <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 1 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="970.36218" x="88" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="970.36218" x="88" font-size="13.125px"><tspan fill="#15999c">Funcs<tspan fill="#07448d" font-size="13.125px">*</tspan></tspan><tspan fill="#333333"> funcs</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="955.36218" x="64" font-family="Consolas,monospace" fill="#000000"><tspan x="64" y="955.36218"><tspan fill="#15999c" font-size="13.125px">OutPipe</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="940.36218" x="40.000004" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="940.36218" x="40.000004"><tspan fill="#15999c" font-size="13.125px">OutPipe_FD</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="755.36218" x="20" font-family="Consolas,monospace" fill="#000000"><tspan x="20" y="755.36218"><tspan fill="#15999c" font-size="13.125px">OutStream</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="985.36218" x="64" font-family="Consolas,monospace" fill="#000000"><tspan x="64" font-size="13.125px" y="985.36218"><tspan fill="#15999c">int</tspan><tspan> fd <tspan fill="#15999c" font-size="13.125px" /><tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 1</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="949.36218" x="388" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="949.36218" x="388" font-size="13.125px"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> bytes</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="964.36218" x="388" font-family="Consolas,monospace" fill="#000000"><tspan x="388" font-size="13.125px" y="964.36218"><tspan fill="#15999c">Reference<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>ChunkListNode</tspan><tspan fill="#07448d">&gt;</tspan><tspan> next <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> nullptr</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="979.36218" x="388" font-family="Consolas,monospace" fill="#000000"><tspan x="388" font-size="13.125px" y="979.36218"><tspan fill="#15999c">u32</tspan><tspan> numBytes <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 4096</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="994.36218" x="388" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="994.36218" x="388" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> writePos <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="1009.3622" x="388" font-family="Consolas,monospace" fill="#000000"><tspan x="388" font-size="13.125px" y="1009.3622"><tspan fill="#15999c">u32</tspan><tspan> offsetIntoNextChunk <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="1024.3622" x="388" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="1024.3622" x="388" font-size="13.125px"><tspan fill="#15999c">mutable s32</tspan><tspan> refCount <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 1</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="919.36218" x="364" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="919.36218" x="364"><tspan fill="#15999c" font-size="13.125px">ChunkListNode</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="934.36218" x="388" font-family="Consolas,monospace" fill="#000000"><tspan x="388" font-size="13.125px" y="934.36218"><tspan fill="#15999c">u64</tspan><tspan> fileOffset <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="890.36218" x="368" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="890.36218" x="368"><tspan fill="#cccccc" font-size="13.125px">&#8230;</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="875.36218" x="368" font-family="Consolas,monospace" fill="#cccccc"><tspan x="368" y="875.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="860.36218" x="368" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="860.36218" x="368"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="845.36218" x="368" font-family="Consolas,monospace" fill="#cccccc"><tspan x="368" y="845.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="830.36218" x="368" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="830.36218" x="368"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="815.36218" x="368" font-family="Consolas,monospace" fill="#cccccc"><tspan x="368" y="815.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="800.36218" x="368" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="800.36218" x="368"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="785.36218" x="368" font-family="Consolas,monospace" fill="#cccccc"><tspan x="368" y="785.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="770.36218" x="368" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="770.36218" x="368"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <circle cx="132" cy="766.36" r="3" fill="#4040b2" />
  <circle cy="781.36" cx="132" r="3" fill="#4040b2" />
  <path d="m129 766.36h223" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m378.7 945.36h-36.695c-12.558 0-16-6.6584-16-16v-147c0-9.6538 6.1909-16 16-16h10.5" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cx="379" cy="945.36" r="3" fill="#4040b2" />
  <path d="m34.195 811.36h-15.195c-12.558 0-16 6.6584-16 16v93c0 9.6538 6.1909 16 16 16h10.5" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cy="811.36" cx="34" r="3" fill="#4040b2" />
  <path d="m352.2 915.36h-34.195c-12.558 0-16-6.6584-16-16v-87c0-9.6538-6.1909-16-16-16h-17" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cy="796.36" cx="269" r="3" fill="#4040b2" />
  <path d="m355.35 766.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m355.35 915.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m32.347 936.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m355.35 893.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m353.2 893.36h-23.195c-12.558 0-16-6.6584-16-16v-80c0-9.6538-6.1909-16-16-16h-166.5" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cx="227.74" stroke-dashoffset="1.8" stroke="#9d9dcf" filter="url(#a)" cy="737.34" r="15" stroke-miterlimit="6" stroke-width="3" fill="#fff" />
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="17.5px" line-height="125%" y="745.36224" x="227.68285" font-family="Consolas,monospace" fill="#808080"><tspan style="text-anchor:middle;text-align:center" font-size="25px" y="745.36224" x="227.68285" font-weight="bold" fill="#808080">1</tspan></text>
  <circle stroke-width="3" stroke-dashoffset="1.8" stroke="#9d9dcf" filter="url(#a)" cy="751.34" cx="410.74" stroke-miterlimit="6" r="15" fill="#fff" />
  <text style="word-spacing:0px;letter-spacing:0px" font-size="17.5px" line-height="125%" y="759.36224" x="410.68283" font-family="Consolas,monospace" xml:space="preserve" fill="#808080"><tspan style="text-anchor:middle;text-align:center" font-size="25px" y="759.36224" x="410.68283" font-weight="bold" fill="#808080">2</tspan></text>
  <circle stroke-width="3" stroke-dashoffset="1.8" stroke="#9d9dcf" filter="url(#a)" cy="1029.3" cx="608.74" stroke-miterlimit="6" r="15" fill="#fff" />
  <text style="word-spacing:0px;letter-spacing:0px" font-size="17.5px" line-height="125%" y="1037.3622" x="608.6828" font-family="Consolas,monospace" xml:space="preserve" fill="#808080"><tspan style="text-anchor:middle;text-align:center" font-size="25px" y="1037.3622" x="608.6828" font-weight="bold" fill="#808080">3</tspan></text>
  <circle cx="199.74" stroke-dashoffset="1.8" stroke="#9d9dcf" filter="url(#a)" cy="983.34" r="15" stroke-miterlimit="6" stroke-width="3" fill="#fff" />
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="17.5px" line-height="125%" y="991.36218" x="199.68285" font-family="Consolas,monospace" fill="#808080"><tspan style="text-anchor:middle;text-align:center" font-size="25px" y="991.36218" x="199.68285" font-weight="bold" fill="#808080">4</tspan></text>
 </g>
</svg>

<ol>
  <li>
    <p>This is the <code>OutStream</code> object itself. As already mentioned, <code>OutStream</code> is a class that performs <strong>buffered output</strong>. That means when you write to an <code>OutStream</code>, your data actually gets written to a temporary buffer in memory first.</p>
  </li>
  <li>
    <p>This is the temporary buffer used by the <code>OutStream</code>. <code>OutStream::curByte</code> initially points to the start of this buffer, and <code>OutStream::endByte</code> points to the end. The temporary buffer is 4096 bytes long, as indicated by <code>ChunkListNode::numBytes</code>.</p>
  </li>
  <li>
    <p>This is a <code>ChunkListNode</code>, a reference-counted object that owns the temporary buffer. It&rsquo;s responsible for freeing the temporary buffer in its destructor. The <code>OutStream</code> holds a reference to this object. (The ability to create additional <code>ChunkListNode</code> references gives rise to some interesting features, but I&rsquo;ll skip the details in this post.)</p>
  </li>
  <li>
    <p>This is an <code>OutPipe_FD</code>, which is a subclass of <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/OutPipe"><code>OutPipe</code></a> that writes to a file descriptor. In this example, the file descriptor is <code>1</code>, which corresponds to standard output. The <code>OutStream</code> holds a pointer to this object, but <code>OutStream::status.isPipeOwner</code> is <code>0</code>, which means that the <code>OutPipe_FD</code> won&rsquo;t be destroyed when the <code>OutStream</code> is destructed. That&rsquo;s important, because this particular <code>OutPipe_FD</code> can be shared by several <code>OutStream</code>s.</p>
  </li>
</ol>

<p>From here, the program proceeds in two steps. First, the statement <code>outs.write({"Hello!\n", 7});</code> is executed. This statement copies the string <code>"Hello!\n"</code> to the temporary buffer and advances <code>OutStream::curByte</code> by 7 bytes. Then we return from <code>main</code>, which invokes the <code>OutStream</code> destructor. The destructor flushes the contents of the temporary buffer to the <code>OutPipe</code>. That&rsquo;s when the raw byte sequence for <code>"Hello!\n"</code> actually gets written to standard output.</p>
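
<p>To make that sequence concrete, here&rsquo;s a minimal sketch of the same idea in plain C++. It is not Plywood&rsquo;s implementation: <code>ToyOutStream</code>, <code>StringSink</code> and <code>sinkContentsBeforeDestruction</code> are hypothetical names invented for illustration, and the sink collects bytes in a <code>std::string</code> instead of writing to a file descriptor. It shows that a write only touches the temporary buffer, and that the bytes reach the destination when the destructor flushes.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for an OutPipe: it collects whatever gets flushed to it.
struct StringSink {
    std::string received;
    void write(const char* bytes, std::size_t numBytes) { received.append(bytes, numBytes); }
};

// Toy buffered stream (an assumption for illustration, not Plywood's OutStream).
class ToyOutStream {
public:
    explicit ToyOutStream(StringSink& sink) : sink_(sink), curByte_(buffer_) {}
    ToyOutStream(const ToyOutStream&) = delete;
    ToyOutStream& operator=(const ToyOutStream&) = delete;

    // The destructor flushes whatever is still sitting in the buffer.
    ~ToyOutStream() { flush(); }

    // Copies into the temporary buffer and advances curByte_.
    // To keep the sketch short, a single write must fit in the remaining space.
    void write(const char* bytes, std::size_t numBytes) {
        assert(numBytes <= bytesAvailable());
        std::memcpy(curByte_, bytes, numBytes);
        curByte_ += numBytes;
    }

    // Hands the buffered bytes to the sink and rewinds curByte_.
    void flush() {
        sink_.write(buffer_, static_cast<std::size_t>(curByte_ - buffer_));
        curByte_ = buffer_;
    }

    std::size_t bytesAvailable() const {
        return static_cast<std::size_t>(buffer_ + sizeof(buffer_) - curByte_);
    }

private:
    StringSink& sink_;
    char buffer_[4096];
    char* curByte_;
};

// Returns what the sink had received just before the stream was destroyed.
std::string sinkContentsBeforeDestruction(StringSink& sink) {
    ToyOutStream outs(sink);
    outs.write("Hello!\n", 7);
    return sink.received;  // copied before outs is destroyed, so nothing has arrived yet
}  // outs is destroyed here, which flushes the buffer to the sink
```

<p>After calling <code>sinkContentsBeforeDestruction(sink)</code>, the returned string is empty, while <code>sink.received</code> holds the seven bytes of <code>"Hello!\n"</code>.</p>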

<p>There are other times when <code>OutStream</code> flushes its temporary buffer to the underlying <code>OutPipe</code>, too. For example, if we write several megabytes of data to the <code>OutStream</code>, the temporary buffer will get flushed each time it becomes full, which in this case happens every 4096 bytes. It&rsquo;s also possible to flush the <code>OutStream</code> explicitly at any time by calling <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/OutStream#flush"><code>OutStream::flush()</code></a>.</p>
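
<p>Here&rsquo;s a rough sketch of that flushing behavior, again in plain C++ rather than Plywood itself. <code>ToyBufferedWriter</code>, <code>RecordingSink</code> and <code>flushSizesFor</code> are hypothetical names, and the policy of flushing as soon as the buffer is exactly full is an assumption made to keep the example small. The sink records the size of every flush, so you can see when each one happens.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sink that only records how many bytes arrive with each flush.
struct RecordingSink {
    std::vector<std::size_t> flushSizes;
    void write(const char*, std::size_t numBytes) { flushSizes.push_back(numBytes); }
};

// Toy buffered writer with a 4096-byte temporary buffer (illustration only).
class ToyBufferedWriter {
public:
    static constexpr std::size_t BufferSize = 4096;

    explicit ToyBufferedWriter(RecordingSink& sink) : sink_(sink) {}
    ToyBufferedWriter(const ToyBufferedWriter&) = delete;
    ToyBufferedWriter& operator=(const ToyBufferedWriter&) = delete;
    ~ToyBufferedWriter() { flush(); }

    // Copies data into the buffer piece by piece, flushing whenever it fills up.
    void write(const char* bytes, std::size_t numBytes) {
        while (numBytes > 0) {
            std::size_t n = std::min(numBytes, BufferSize - used_);
            std::memcpy(buffer_ + used_, bytes, n);
            used_ += n;
            bytes += n;
            numBytes -= n;
            if (used_ == BufferSize)
                flush();
        }
    }

    // Explicit flush: can be called at any time; does nothing if the buffer is empty.
    void flush() {
        assert(used_ <= BufferSize);
        if (used_ == 0)
            return;
        sink_.write(buffer_, used_);
        used_ = 0;
    }

private:
    RecordingSink& sink_;
    char buffer_[BufferSize];
    std::size_t used_ = 0;
};

// Writes totalBytes bytes through a ToyBufferedWriter and reports the flush sizes.
std::vector<std::size_t> flushSizesFor(std::size_t totalBytes) {
    RecordingSink sink;
    {
        ToyBufferedWriter writer(sink);
        std::vector<char> data(totalBytes, 'x');
        writer.write(data.data(), data.size());
    }  // the destructor flushes whatever remains
    return sink.flushSizes;
}
```

<p>Under these assumptions, <code>flushSizesFor(10000)</code> reports three flushes of 4096, 4096 and 1808 bytes: two triggered by a full buffer and one by the destructor.</p>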

<p>I&rsquo;m sure many readers will recognize the similarity between <code>OutStream</code> and <code>std::ostream</code> in C++ or <code>FILE</code> in C. Like them, <code>OutStream</code> is a high-level wrapper around a low-level output destination such as a file descriptor, and it performs buffering.</p>

<p>One difference between <code>OutStream</code> and those other stream types &ndash; and this might sound like a disadvantage at first &ndash; is that <code>OutStream</code> objects aren&rsquo;t thread-safe. You must either manipulate each <code>OutStream</code> object from a single thread, or enforce mutual exclusion between threads yourself. That&rsquo;s why there&rsquo;s no single, global <code>OutStream</code> object that writes to standard output, like <code>std::cout</code> in C++ or <code>stdout</code> in C. Instead, if you need to write to standard output, you must call <code>StdOut::binary()</code> &ndash; or perhaps <code>StdOut::text()</code>, as we&rsquo;ll see in the next example &ndash; to create a unique <code>OutStream</code> object.</p>
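
<p>The first option, one stream per thread, can be sketched roughly as follows. This is plain C++ with hypothetical names (<code>SharedSink</code>, <code>ToyThreadLocalStream</code>, <code>writeFromTwoThreads</code>), not Plywood code, and it assumes the shared destination protects itself with a mutex, which is my own assumption for the sake of the example. Each thread fills its own unsynchronized buffer and only touches shared state when that buffer is flushed.</p>

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical shared destination, standing in for standard output.
// The mutex here is an assumption of this sketch.
class SharedSink {
public:
    void write(const std::string& chunk) {
        std::lock_guard<std::mutex> lock(mutex_);
        received_ += chunk;
    }
    std::string contents() {
        std::lock_guard<std::mutex> lock(mutex_);
        return received_;
    }

private:
    std::mutex mutex_;
    std::string received_;
};

// Toy stream meant to be used by exactly one thread. Its buffer is not
// synchronized; it is handed to the sink in one piece on destruction.
class ToyThreadLocalStream {
public:
    explicit ToyThreadLocalStream(SharedSink& sink) : sink_(sink) {}
    ToyThreadLocalStream(const ToyThreadLocalStream&) = delete;
    ToyThreadLocalStream& operator=(const ToyThreadLocalStream&) = delete;
    ~ToyThreadLocalStream() { sink_.write(buffer_); }

    void write(const std::string& text) { buffer_ += text; }

private:
    SharedSink& sink_;
    std::string buffer_;
};

// Two threads each create their own stream and write 100 characters to it.
std::string writeFromTwoThreads() {
    SharedSink sink;
    auto worker = [&sink](char tag) {
        ToyThreadLocalStream outs(sink);  // unique to this thread
        for (int i = 0; i < 100; ++i)
            outs.write(std::string(1, tag));
    };
    std::thread a(worker, 'A');
    std::thread b(worker, 'B');
    a.join();
    b.join();
    return sink.contents();
}
```

<p>Because each stream is flushed as a single chunk, the result is either 100 <code>A</code>s followed by 100 <code>B</code>s or the reverse; the two threads&rsquo; output never interleaves character by character.</p>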

<h2 id="writing-to-standard-output-with-newline-conversion">Writing to Standard Output With Newline Conversion</h2>

<p>In this next example, instead of creating an <code>OutStream</code> object, we create a <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/text/StringWriter"><code>StringWriter</code></a> object that writes to standard output. <code>StringWriter</code> is a subclass of <code>OutStream</code> with additional member functions for writing text.</p>

<p><code>StringWriter</code> does not extend <code>OutStream</code> with additional data members, so the two classes are actually interchangeable. Any time you have an <code>OutStream</code> object, you can freely cast it to <code>StringWriter</code> by calling <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/OutStream#strWriter"><code>OutStream::strWriter()</code></a>. The main reason why <code>OutStream</code> and <code>StringWriter</code> are separate classes is to help express intention in the code. <code>OutStream</code>s are mainly intended to write binary data, and <code>StringWriter</code>s are mainly intended to write text encoded in an 8-bit format compatible with ASCII, such as UTF-8.</p>
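
<p>As a rough illustration of that relationship, here&rsquo;s a sketch with two toy classes. <code>ToyOutStream</code>, <code>ToyStringWriter</code>, <code>writeHeader</code> and <code>writeBinaryThenText</code> are hypothetical names, and none of this is Plywood&rsquo;s actual code. The subclass only adds member functions, so a <code>static_assert</code> can check that both types have the same size. To stay on the safe side of the C++ rules, the sketch only casts back an object that really was created as a <code>ToyStringWriter</code>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy binary-oriented stream (illustration only).
struct ToyOutStream {
    std::string buffer;
    void write(const char* bytes, std::size_t numBytes) { buffer.append(bytes, numBytes); }
};

// Toy text-oriented subclass: extra member functions, no extra data members.
struct ToyStringWriter : ToyOutStream {
    ToyStringWriter& operator<<(const char* text) {
        buffer += text;
        return *this;
    }
    ToyStringWriter& operator<<(int value) {
        buffer += std::to_string(value);
        return *this;
    }
};

static_assert(sizeof(ToyStringWriter) == sizeof(ToyOutStream),
              "the subclass must not add data members");

// A function that only cares about binary output takes the base class.
void writeHeader(ToyOutStream& outs) {
    outs.write("\x01\x02", 2);
}

// Uses the same object first as a binary stream, then as a text writer.
std::string writeBinaryThenText() {
    ToyStringWriter sw;
    writeHeader(sw);                 // implicit conversion to the base class
    ToyOutStream& asStream = sw;     // view the object as a plain stream
    static_cast<ToyStringWriter&>(asStream) << "answer=" << 42;  // and back again
    return sw.buffer;
}
```

<p>Here, <code>writeBinaryThenText()</code> returns the two header bytes followed by the text <code>answer=42</code>. The type in use at each call site signals whether the code is writing binary data or text.</p>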

<div><div class="CodeRay">
  <div class="code"><pre><span class="preprocessor">#include</span> <span class="include">&lt;ply-runtime/Base.h&gt;</span>

<span class="predefined-type">int</span> main() {
    <span class="directive">using</span> <span class="keyword">namespace</span> ply;
    StringWriter sw = StdOut::text();
    sw &lt;&lt; <span class="string"><span class="delimiter">&quot;</span><span class="content">Hello!</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>;
    <span class="keyword">return</span> <span class="integer">0</span>;
}
</pre></div>
</div>
</div>

<p>Besides returning a <code>StringWriter</code>, <code>StdOut::text()</code> installs an adapter that performs <strong>newline conversion</strong>. This is what it looks like in memory immediately after the <code>StringWriter</code> is created, before anything gets written:</p>

<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 568 618" style="max-width:568px" xmlns:xlink="http://www.w3.org/1999/xlink">
 <defs>
  <filter id="a" style="color-interpolation-filters:sRGB">
   <feflood result="flood" flood-opacity=".062745" flood-color="rgb(0,0,0)" />
   <fecomposite operator="in" result="composite1" in2="SourceGraphic" in="flood" />
   <fegaussianblur stddeviation="3" result="blur" in="composite1" />
   <feoffset result="offset" dx="2" dy="2" />
   <fecomposite operator="over" result="composite2" in2="offset" in="SourceGraphic" />
  </filter>
 </defs>
 <g transform="translate(0 -434.36)">
  <g transform="matrix(.5 0 0 .5 -85.1 452.03)">
   <g stroke-linejoin="round" stroke-dashoffset="1.8" stroke="#ccc" stroke-linecap="round" fill="#fffff9">
    <rect height="148" filter="url(#a)" width="321" y="704.86" x="957.5" />
    <rect height="132" filter="url(#a)" width="321" y="852.86" x="957.5" />
   </g>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="898.36218" x="988" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="898.36218" x="988" font-size="13.125px"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> bytes</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="913.36218" x="988" font-family="Consolas,monospace" fill="#000000"><tspan x="988" font-size="13.125px" y="913.36218"><tspan fill="#15999c">Reference<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>ChunkListNode</tspan><tspan fill="#07448d">&gt;</tspan><tspan> next <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> nullptr</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="928.36218" x="988" font-family="Consolas,monospace" fill="#000000"><tspan x="988" font-size="13.125px" y="928.36218"><tspan fill="#15999c">u32</tspan><tspan> numBytes <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 4096</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="943.36218" x="988" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="943.36218" x="988" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> writePos <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="958.36218" x="988" font-family="Consolas,monospace" fill="#000000"><tspan x="988" font-size="13.125px" y="958.36218"><tspan fill="#15999c">u32</tspan><tspan> offsetIntoNextChunk <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="973.36218" x="988" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="973.36218" x="988" font-size="13.125px"><tspan fill="#15999c">mutable s32</tspan><tspan> refCount <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 1</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="868.36218" x="964" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="868.36218" x="964"><tspan fill="#15999c" font-size="13.125px">ChunkListNode</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="883.36218" x="988" font-family="Consolas,monospace" fill="#000000"><tspan x="988" font-size="13.125px" y="883.36218"><tspan fill="#15999c">u64</tspan><tspan> fileOffset <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="839.36218" x="968" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="839.36218" x="968"><tspan fill="#cccccc" font-size="13.125px">&#8230;</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="824.36218" x="968" font-family="Consolas,monospace" fill="#cccccc"><tspan x="968" y="824.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="809.36218" x="968" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="809.36218" x="968"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="794.36218" x="968" font-family="Consolas,monospace" fill="#cccccc"><tspan x="968" y="794.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="779.36218" x="968" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="779.36218" x="968"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="764.36218" x="968" font-family="Consolas,monospace" fill="#cccccc"><tspan x="968" y="764.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="749.36218" x="968" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="749.36218" x="968"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="734.36218" x="968" font-family="Consolas,monospace" fill="#cccccc"><tspan x="968" y="734.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="719.36218" x="968" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="719.36218" x="968"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <path d="m978.7 894.36h-36.695c-12.558 0-16-6.6584-16-16v-147c0-9.6538 6.1909-16 16-16h10.5" stroke="#4040b2" stroke-width="2" fill="none" />
   <circle cx="979" cy="894.36" r="3" fill="#4040b2" />
  </g>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="115" filter="url(#a)" width="277" stroke="#ccc" stroke-linecap="round" y="640.86" x="38.5" fill="#f9f9ff" />
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="177" filter="url(#a)" width="290" stroke="#ccc" stroke-linecap="round" y="439.86" x="15.5" fill="#f9f9ff" />
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="486.36221" x="68.999992" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="486.36221" x="68.999992" font-size="13.125px"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> curByte</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="501.36221" x="68.999992" font-family="Consolas,monospace" fill="#000000"><tspan x="68.999992" font-size="13.125px" y="501.36221"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> endByte</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="516.36218" x="68.999992" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="516.36218" x="68.999992" font-size="13.125px"><tspan fill="#15999c">Reference<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>ChunkListNode</tspan><tspan fill="#07448d">&gt;</tspan><tspan fill="#333333"> chunk</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="531.36218" x="68.999992" font-family="Consolas,monospace" fill="#000000"><tspan x="68.999992" font-size="13.125px" y="531.36218"><tspan fill="#15999c">OutPipe</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> outPipe</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="546.36218" x="68.999992" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="546.36218" x="68.999992" font-size="13.125px"><tspan fill="#15999c">Status</tspan><tspan fill="#333333"> status</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="561.36218" x="93" font-family="Consolas,monospace" fill="#000000"><tspan x="93" font-size="13.125px" y="561.36218"><tspan fill="#15999c">u32</tspan><tspan> chunkSizeExp <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 28 </tspan><tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 12</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="576.36218" x="93" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="576.36218" x="93" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> type <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 2 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 1</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="591.36218" x="93" font-family="Consolas,monospace" fill="#000000"><tspan x="93" font-size="13.125px" y="591.36218"><tspan fill="#15999c">u32</tspan><tspan> isPipeOwner <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 1 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 1</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="606.36218" x="93" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="606.36218" x="93" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> eof <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 1 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="686.36218" x="93" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="686.36218" x="93" font-size="13.125px"><tspan fill="#15999c">Funcs<tspan fill="#07448d" font-size="13.125px">*</tspan></tspan><tspan fill="#333333"> funcs</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="656.36218" x="45.000004" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="656.36218" x="45.000004"><tspan fill="#15999c" font-size="13.125px">OutPipe_NewLineFilter</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="471.36221" x="44.999996" font-family="Consolas,monospace" fill="#000000"><tspan x="44.999996" y="471.36221"><tspan fill="#15999c" font-size="13.125px">OutStream</tspan></tspan></text>
  <circle cx="157" cy="482.36" r="3" fill="#4040b2" />
  <circle cy="497.36" cx="157" r="3" fill="#4040b2" />
  <path d="m158 482.36h211" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m59.195 527.36h-38.195c-12.558 0-16 6.6584-16 16v93c0 9.6538 6.1909 16 16 16h13.5" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cy="527.36" cx="59" r="3" fill="#4040b2" />
  <path d="m369.2 557.36h-26.195c-12.558 0-16-6.6584-16-16v-13c0-9.6538-6.1909-16-16-16h-17" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cy="512.36" cx="294" r="3" fill="#4040b2" />
  <g transform="matrix(.5 0 0 .5 -106.1 124.74)">
   <g stroke-linejoin="round" stroke-dashoffset="1.8" stroke="#ccc" stroke-linecap="round" fill="#f9f9ff">
    <rect height="148" filter="url(#a)" width="321" y="704.86" x="957.5" />
    <rect height="132" filter="url(#a)" width="321" y="852.86" x="957.5" />
   </g>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="898.36218" x="988" font-family="Consolas,monospace" fill="#000000"><tspan x="988" font-size="13.125px" y="898.36218"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> bytes</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="913.36218" x="988" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="913.36218" x="988" font-size="13.125px"><tspan fill="#15999c">Reference<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>ChunkListNode</tspan><tspan fill="#07448d">&gt;</tspan><tspan> next <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> nullptr</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="928.36218" x="988" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="928.36218" x="988" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> numBytes <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 4096</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="943.36218" x="988" font-family="Consolas,monospace" fill="#000000"><tspan x="988" font-size="13.125px" y="943.36218"><tspan fill="#15999c">u32</tspan><tspan> writePos <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="958.36218" x="988" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="958.36218" x="988" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> offsetIntoNextChunk <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="973.36218" x="988" font-family="Consolas,monospace" fill="#000000"><tspan x="988" font-size="13.125px" y="973.36218"><tspan fill="#15999c">mutable s32</tspan><tspan> refCount <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 1</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="868.36218" x="964" font-family="Consolas,monospace" fill="#000000"><tspan x="964" y="868.36218"><tspan fill="#15999c" font-size="13.125px">ChunkListNode</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="883.36218" x="988" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="883.36218" x="988" font-size="13.125px"><tspan fill="#15999c">u64</tspan><tspan> fileOffset <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="839.36218" x="968" font-family="Consolas,monospace" fill="#cccccc"><tspan x="968" y="839.36218"><tspan fill="#cccccc" font-size="13.125px">&#8230;</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="824.36218" x="968" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="824.36218" x="968"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="809.36218" x="968" font-family="Consolas,monospace" fill="#cccccc"><tspan x="968" y="809.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="794.36218" x="968" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="794.36218" x="968"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="779.36218" x="968" font-family="Consolas,monospace" fill="#cccccc"><tspan x="968" y="779.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="764.36218" x="968" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="764.36218" x="968"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="749.36218" x="968" font-family="Consolas,monospace" fill="#cccccc"><tspan x="968" y="749.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="734.36218" x="968" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="734.36218" x="968"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="719.36218" x="968" font-family="Consolas,monospace" fill="#cccccc"><tspan x="968" y="719.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <path d="m978.7 894.36h-36.695c-12.558 0-16-6.6584-16-16v-147c0-9.6538 6.1909-16 16-16h10.5" stroke="#4040b2" stroke-width="2" fill="none" />
   <circle cy="894.36" cx="979" r="3" fill="#4040b2" />
  </g>
  <path d="m37.347 652.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m368.2 543.36h-13.195c-12.558 0-16-6.6584-16-16v-14c0-9.6538-6.1909-16-16-16h-166.5" stroke="#4040b2" stroke-width="2" fill="none" />
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="456.36221" x="21" font-family="Consolas,monospace" xml:space="preserve" fill="#15999c"><tspan y="456.36221" x="21">StringWriter<tspan fill="#15999c" font-size="13.125px" /></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="671.36218" x="69" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="671.36218" x="69"><tspan fill="#15999c" font-size="13.125px">OutPipe</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="701.36218" x="69" font-family="Consolas,monospace" fill="#000000"><tspan x="69" font-size="13.125px" y="701.36218"><tspan fill="#15999c">OptionallyOwned<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>OutStream</tspan><tspan fill="#07448d">&gt;</tspan><tspan fill="#333333"> outs</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="715.86218" x="69" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="715.86218" x="69" font-size="13.125px"><tspan fill="#15999c">NewLineFilter</tspan><tspan fill="#333333"> filter</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="730.86218" x="93" font-family="Consolas,monospace" fill="#000000"><tspan x="93" font-size="13.125px" y="730.86218"><tspan fill="#15999c">bool</tspan><tspan fill="#333333"> crlf = <tspan fill="#45ba45">false</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="745.86218" x="93" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="745.86218" x="93" font-size="13.125px"><tspan fill="#15999c">bool</tspan><tspan fill="#333333"> needsLF = <tspan fill="#45ba45">false</tspan></tspan></tspan></text>
  <flowroot style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="17.5px" transform="translate(4 666.36)" line-height="125%" font-family="Consolas,monospace" fill="#000000"><flowregion><rect y="337" width="564" x="-147" height="189" /></flowregion><flowpara /></flowroot>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="162" filter="url(#a)" width="269" stroke="#ccc" stroke-linecap="round" y="781.86" x="60.5" fill="#fffff9" />
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="813.36218" x="90.999992" font-family="Consolas,monospace" fill="#000000"><tspan x="90.999992" font-size="13.125px" y="813.36218"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> curByte</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="828.36218" x="90.999992" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="828.36218" x="90.999992" font-size="13.125px"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> endByte</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="843.36218" x="90.999992" font-family="Consolas,monospace" fill="#000000"><tspan x="90.999992" font-size="13.125px" y="843.36218"><tspan fill="#15999c">Reference<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>ChunkListNode</tspan><tspan fill="#07448d">&gt;</tspan><tspan fill="#333333"> chunk</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="858.36218" x="90.999992" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="858.36218" x="90.999992" font-size="13.125px"><tspan fill="#15999c">OutPipe</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> outPipe</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="873.36218" x="90.999992" font-family="Consolas,monospace" fill="#000000"><tspan x="90.999992" font-size="13.125px" y="873.36218"><tspan fill="#15999c">Status</tspan><tspan fill="#333333"> status</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="888.36218" x="115" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="888.36218" x="115" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> chunkSizeExp <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 28 </tspan><tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 12</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="903.36218" x="115" font-family="Consolas,monospace" fill="#000000"><tspan x="115" font-size="13.125px" y="903.36218"><tspan fill="#15999c">u32</tspan><tspan> type <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 2 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 1</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="918.36218" x="115" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="918.36218" x="115" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> isPipeOwner <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 1 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="933.36218" x="115" font-family="Consolas,monospace" fill="#000000"><tspan x="115" font-size="13.125px" y="933.36218"><tspan fill="#15999c">u32</tspan><tspan> eof <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 1 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="798.36218" x="66.999992" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="798.36218" x="66.999992"><tspan fill="#15999c" font-size="13.125px">OutStream</tspan></tspan></text>
  <circle cx="81" cy="854.36" r="3" fill="#4040b2" />
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="70" filter="url(#a)" width="152" stroke="#ccc" stroke-linecap="round" y="968.86" x="86.5" fill="#fffff9" />
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="1014.3622" x="141" font-family="Consolas,monospace" fill="#000000"><tspan x="141" font-size="13.125px" y="1014.3622"><tspan fill="#15999c">Funcs<tspan fill="#07448d" font-size="13.125px">*</tspan></tspan><tspan fill="#333333"> funcs</tspan></tspan></text>
  <path d="m84.347 981.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="999.36218" x="117" font-family="Consolas,monospace" fill="#000000"><tspan x="117" y="999.36218"><tspan fill="#15999c" font-size="13.125px">OutPipe</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="1029.3622" x="117" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="1029.3622" x="117" font-size="13.125px"><tspan fill="#15999c">int </tspan><tspan>fd <tspan fill="#07448d">=</tspan><tspan fill="#45ba45"> 1</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="984.36218" x="93" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="984.36218" x="93"><tspan fill="#15999c" font-size="13.125px">OutPipe_FD</tspan></tspan></text>
  <path d="m81.195 854.36h-15.195c-12.558 0-16 6.6584-16 16v95c0 9.6538 6.1909 16 16 16h13.5" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m59.195 697.36h-15.195c-12.558 0-16 6.6584-16 16v65c0 9.6538 6.1909 16 16 16h10.5" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cx="59" cy="697.36" r="3" fill="#4040b2" />
  <path d="m59.347 794.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m370.35 482.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m370.35 543.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m370.35 557.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <circle cy="809.36" cx="178" r="3" fill="#4040b2" />
  <circle cx="178" cy="824.36" r="3" fill="#4040b2" />
  <path d="m179 809.36h211" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m390.2 884.36h-26.195c-12.558 0-16-6.6584-16-16v-13c0-9.6538-6.1909-16-16-16h-17" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cx="315" cy="839.36" r="3" fill="#4040b2" />
  <path d="m389.2 870.36h-13.195c-12.558 0-16-6.6584-16-16v-14c0-9.6538-6.1909-16-16-16h-166.5" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m391.35 809.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m391.35 870.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m391.35 884.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="17.5px" line-height="125%" y="698.36224" x="398" font-family="Consolas,monospace" fill="#000000"><tspan font-size="17.5px" font-style="italic" y="698.36224" x="398" font-family="Asap" fill="#ff2a2a">adapter</tspan></text>
  <path d="m346.82 691.95c28.87 31.922 49.613 28.211 68.257 12.251" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
  <path d="m348.27 705.38-1.6486-14.001 13.838 4.3315" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
 </g>
</svg>

<p>The bottom half of the above diagram is identical to the previous diagram, and the top half is basically an adapter. It&rsquo;s a <code>StringWriter</code> (which derives from <code>OutStream</code>), with its own temporary buffer, pointing to an <code>OutPipe_NewLineFilter</code>. This time, <code>status.isPipeOwner</code> is <code>1</code>, which means that the <code>OutPipe_NewLineFilter</code> will be automatically destroyed when the <code>StringWriter</code> is destroyed.</p>

<p>When the <code>StringWriter</code> flushes the contents of its temporary buffer to the <code>OutPipe_NewLineFilter</code>, the <code>OutPipe_NewLineFilter</code> performs newline conversion on that data and writes the result to its own <code>OutStream</code>. Assuming this example runs on Linux, that basically just means discarding any <code>\r</code> (carriage return) character it encounters. If we run the same program on Windows, it will also replace any <code>\n</code> (linefeed) character it encounters with <code>\r\n</code>.</p>
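
<p>The conversion rule described above can be sketched in standard C++. This is only an illustration, not Plywood&rsquo;s implementation: <code>convertNewlines</code> is a hypothetical helper, and its <code>crlf</code> parameter is named after the flag shown in the diagram. It handles one complete buffer at a time, whereas a real filter also has to cope with data arriving in arbitrary chunks.</p>

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not part of Plywood: converts line endings in one pass.
// Every '\r' in the input is dropped. When crlf is true, each '\n' is
// written out as "\r\n"; otherwise '\n' passes through unchanged.
std::string convertNewlines(std::string_view input, bool crlf) {
    std::string output;
    output.reserve(input.size());
    for (char c : input) {
        if (c == '\r') {
            continue;  // discard carriage returns found in the input
        }
        if (c == '\n' && crlf) {
            output += '\r';  // Windows-style ending: emit '\r' before '\n'
        }
        output += c;
    }
    return output;
}
```

<p>With <code>crlf</code> set to <code>false</code>, both <code>"a\r\nb\n"</code> and <code>"a\nb\n"</code> come out as <code>"a\nb\n"</code>. With <code>crlf</code> set to <code>true</code>, both come out as <code>"a\r\nb\r\n"</code>, so text that already uses CRLF is not doubled up.</p>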

<p>Personally, I think it&rsquo;s a strange/funny convention that Windows applications tend to use <code>\r\n</code> to terminate lines of text, and applications on Unix-like platforms tend to use <code>\n</code>. There are <a href="https://en.wikipedia.org/wiki/Newline#History">historical reasons for this difference</a>, but I don&rsquo;t think there&rsquo;s a very convincing reason for it anymore. Nonetheless, I&rsquo;ve designed Plywood to play along, at least when <code>StdOut::text()</code> is called. You can always override the default behavior by calling <code>StdOut::binary()</code> and installing your preferred type of newline filter on it.</p>

<p>Finally, note that Plywood does not have the equivalent of <a href="https://en.cppreference.com/w/cpp/io/manip/endl"><code>std::endl</code></a>. Instead, <code>StringWriter</code> generally expects <code>\n</code> to terminate lines of text, and lines aren&rsquo;t flushed automatically. If you need to flush a <code>StringWriter</code> after writing a line of text, do so explicitly by calling <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/OutStream#flush"><code>OutStream::flush()</code></a>.</p>

<h2 id="writing-utf-16">Writing UTF-16</h2>

<p>In the previous example, there were two <code>OutStream</code>s chained together with an <code>OutPipe_NewLineFilter</code> acting as an adapter. Plywood has adapters that perform other conversions, too. For example, here&rsquo;s a short program that saves a text file as <a href="https://en.wikipedia.org/wiki/UTF-16">UTF-16</a>. The text file is written in little-endian byte order with Windows-style CRLF line endings, and includes a <a href="https://en.wikipedia.org/wiki/Byte_order_mark">byte order mark (BOM)</a> at the beginning of the file:</p>

<pre><code>#include &lt;ply-runtime/Base.h&gt;

int main() {
    using namespace ply;

    TextFormat tf;
    tf.encoding = TextFormat::Encoding::UTF16_le;
    tf.newLine = TextFormat::NewLine::CRLF;
    tf.bom = true;

    Owned&lt;StringWriter&gt; sw = FileSystem::native()-&gt;openTextForWrite("utf16.txt", tf);
    if (sw) {
        *sw &lt;&lt; "Hello!\n";
        *sw &lt;&lt; u8"😋🍺🍕\n";
    }
    return 0;
}
</code></pre>

<p>Here&rsquo;s what the output file looks like when we open it in <a href="https://code.visualstudio.com/">Visual Studio Code</a>. You can see the expected information about the file format in the status bar: <code>UTF-16 LE</code> and <code>CRLF</code>.</p>

<p><img class="center" src="https://preshing.com/images/utf16-vscode.png" /></p>

<p>However, because Plywood <a href="https://plywood.arc80.com/docs/modules/runtime/guides/Unicode">generally prefers to work with UTF-8</a>, the <code>StringWriter</code> returned from <a href="https://plywood.arc80.com/docs/modules/runtime/api/filesystem/FileSystem#openTextForWrite"><code>FileSystem::openTextForWrite()</code></a> actually expects UTF-8-encoded text as input. That&rsquo;s why the first line we passed to the <code>StringWriter</code> was the 8-bit character string <code>"Hello!\n"</code>, and the second line was the string literal <code>u8"😋🍺🍕\n"</code>, which uses the <code>u8</code> <a href="https://en.cppreference.com/w/cpp/language/string_literal">string literal prefix</a>. Both strings are valid UTF-8 strings at runtime. (In general, any time you use characters outside the ASCII character set, and you want UTF-8-encoded text at runtime, it&rsquo;s a good idea to use the <code>u8</code> prefix.)</p>

<p>As you might expect, the UTF-8 text passed to the <code>StringWriter</code> gets converted to UTF-16 using another adapter:</p>

<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 562 287" style="max-width:562px" xmlns:xlink="http://www.w3.org/1999/xlink">
 <defs>
  <filter id="a" style="color-interpolation-filters:sRGB">
   <feflood result="flood" flood-opacity=".062745" flood-color="rgb(0,0,0)" />
   <fecomposite operator="in" result="composite1" in2="SourceGraphic" in="flood" />
   <fegaussianblur stddeviation="3" result="blur" in="composite1" />
   <feoffset result="offset" dx="2" dy="2" />
   <fecomposite operator="over" result="composite2" in2="offset" in="SourceGraphic" />
  </filter>
 </defs>
 <g transform="translate(0 -765.36)">
  <g transform="translate(390 207)">
   <rect stroke-linejoin="round" stroke-dashoffset="1.8" transform="matrix(.5 0 0 .5 -637.1 404.03)" height="148" filter="url(#a)" width="321" stroke="#ccc" stroke-linecap="round" y="704.86" x="957.5" fill="#fffff9" />
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="823.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="823.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">&#8230;</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="816.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="816.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="808.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="808.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="801.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="801.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="793.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="793.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="786.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="786.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="778.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="778.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="771.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="771.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="763.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="763.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  </g>
  <flowroot style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="17.5px" transform="translate(4 683.36)" line-height="125%" font-family="Consolas,monospace" fill="#000000"><flowregion><rect y="-222" width="545" x="24" height="169" /></flowregion><flowpara /></flowroot>
  <flowroot style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="17.5px" transform="translate(4 666.36)" line-height="125%" font-family="Consolas,monospace" fill="#000000"><flowregion><rect y="337" width="564" x="-147" height="189" /></flowregion><flowpara /></flowroot>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="17.5px" line-height="110.00000238%" y="880.36224" x="500.98105" font-family="Asap" font-style="italic" fill="#ff2a2a"><tspan style="text-anchor:middle;text-align:center" line-height="110.00000238%" y="880.36224" x="500.98105">converts UTF-8</tspan><tspan style="text-anchor:middle;text-align:center" line-height="110.00000238%" y="899.61224" x="500.98105">to UTF-16</tspan></text>
  <path d="m406.82 904.95c26.942.0803 35.919-3.1533 46.257-10.749" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
  <path d="m417.78 912.76-11.799-7.7157 12.199-7.8381" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="866.5" x="5.1427" stroke-width=".28551" fill="#f3fff3" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.125px" line-height="125%" y="883.36218" x="86.365646" font-family="Consolas,monospace" fill="#15999c"><tspan x="86.365646" y="883.36218">OutStream</tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="914.5" x="5.1427" stroke-width=".28551" fill="#f3fff3" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.125px" line-height="125%" y="931.36218" x="86.396347" font-family="Consolas,monospace" xml:space="preserve" fill="#15999c"><tspan y="931.36218" x="86.396347">OutPipe_TextConverter</tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="962.5" x="5.1427" stroke-width=".28551" fill="#fffff9" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.125px" line-height="125%" y="979.36218" x="86.365646" font-family="Consolas,monospace" xml:space="preserve" fill="#15999c"><tspan y="979.36218" x="86.365646">OutStream</tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="1010.5" x="5.1427" stroke-width=".28551" fill="#fffff9" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.125px" line-height="125%" y="1027.3622" x="86.311172" font-family="Consolas,monospace" fill="#15999c"><tspan x="86.311172" y="1027.3622">OutPipe_FD</tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="770.5" x="5.1427" stroke-width=".28551" fill="#f9f9ff" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.125px" line-height="125%" y="787.36218" x="86.223312" font-family="Consolas,monospace" xml:space="preserve" fill="#15999c"><tspan y="787.36218" x="86.223312">StringWriter<tspan style="text-anchor:middle;text-align:center" fill="#15999c" font-size="13.125px" /></tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="818.5" x="5.1427" stroke-width=".28551" fill="#f9f9ff" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.125px" line-height="125%" y="835.36218" x="86.396347" font-family="Consolas,monospace" fill="#15999c"><tspan x="86.396347" y="835.36218">OutPipe_NewLineFilter</tspan></text>
  <g transform="translate(390 111)">
   <rect stroke-linejoin="round" stroke-dashoffset="1.8" transform="matrix(.5 0 0 .5 -637.1 404.03)" height="148" filter="url(#a)" width="321" stroke="#ccc" stroke-linecap="round" y="704.86" x="957.5" fill="#f3fff3" />
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="823.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="823.71222"><tspan fill="#cccccc" font-size="6.5625px">&#8230;</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="816.21222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="816.21222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="808.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="808.71222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="801.21222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="801.21222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="793.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="793.71222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="786.21222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="786.21222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="778.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="778.71222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="771.21222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="771.21222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="763.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="763.71222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  </g>
  <g transform="translate(390 15)">
   <rect stroke-linejoin="round" stroke-dashoffset="1.8" transform="matrix(.5 0 0 .5 -637.1 404.03)" height="148" filter="url(#a)" width="321" stroke="#ccc" stroke-linecap="round" y="704.86" x="957.5" fill="#f9f9ff" />
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="823.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="823.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">&#8230;</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="816.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="816.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="808.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="808.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="801.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="801.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="793.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="793.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="786.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="786.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="778.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="778.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="771.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="771.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="763.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="763.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  </g>
  <path d="m168 783.36h61" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m230.35 783.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m168 879.36h61" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m230.35 879.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m168 975.36h61" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m230.35 975.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 796.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 816.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 844.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 864.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 892.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 912.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 940.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 960.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 988.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 1008.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
 </g>
</svg>
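
<p>To make the &ldquo;converts UTF-8 to UTF-16&rdquo; step more concrete, here is a minimal sketch of that conversion in standard C++. It is not Plywood&rsquo;s converter: <code>utf8ToUtf16le</code> is a hypothetical helper that assumes well-formed UTF-8 input and processes a whole string at once, rather than a stream of chunks.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper, not Plywood's converter. Assumes the input is
// well-formed UTF-8; malformed sequences are not detected or repaired.
std::string utf8ToUtf16le(std::string_view utf8, bool bom) {
    std::string out;
    // Append one 16-bit code unit, low byte first (little-endian).
    auto putUnit = [&out](std::uint32_t unit) {
        out += static_cast<char>(unit & 0xFF);
        out += static_cast<char>((unit >> 8) & 0xFF);
    };
    if (bom) {
        putUnit(0xFEFF);  // byte order mark, written as FF FE
    }
    std::size_t i = 0;
    while (i < utf8.size()) {
        // The lead byte tells us how many continuation bytes follow.
        std::uint32_t lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp;
        std::size_t extra;
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        ++i;
        // Each continuation byte contributes six more bits.
        for (std::size_t k = 0; k < extra && i < utf8.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i]) & 0x3F);
        }
        if (cp >= 0x10000) {
            // Code points outside the BMP become a surrogate pair.
            cp -= 0x10000;
            putUnit(0xD800 + (cp >> 10));
            putUnit(0xDC00 + (cp & 0x3FF));
        } else {
            putUnit(cp);
        }
    }
    return out;
}
```

<p>ASCII characters such as those in <code>"Hello!\n"</code> each become two bytes, while each emoji in the second line is outside the Basic Multilingual Plane and becomes a surrogate pair of four bytes.</p>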

<p>In general, it&rsquo;s possible to create any kind of adapter that takes an input stream and writes an output stream. You could even implement an adapter that performs compression using <a href="https://zlib.net/">zlib</a>, encryption using <a href="https://www.openssl.org/">OpenSSL</a>, or any other compression/encryption codec. I&rsquo;m sure Plywood will provide a few such adapters at some point, but I haven&rsquo;t had to implement them yet.</p>
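
<p>The general shape of such an adapter can be sketched without any Plywood types. Everything in the following block is hypothetical: <code>Sink</code> stands in for an output pipe, <code>StringSink</code> is a destination that collects bytes in memory, and <code>UpperCaseFilter</code> is a deliberately trivial transformation used in place of a real codec.</p>

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical interface standing in for an output pipe; not Plywood's OutPipe.
struct Sink {
    virtual ~Sink() = default;
    virtual void write(std::string_view bytes) = 0;
};

// Terminal sink that collects everything it receives into a string.
struct StringSink : Sink {
    std::string data;
    void write(std::string_view bytes) override { data.append(bytes); }
};

// Adapter: transforms the incoming bytes, then forwards them to the next sink.
struct UpperCaseFilter : Sink {
    Sink& next;
    explicit UpperCaseFilter(Sink& n) : next(n) {}
    void write(std::string_view bytes) override {
        std::string converted(bytes);
        for (char& c : converted) {
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        }
        next.write(converted);
    }
};
```

<p>Constructing a <code>StringSink</code>, wrapping it in an <code>UpperCaseFilter</code>, and writing <code>"hello "</code> followed by <code>"pipe"</code> to the filter leaves <code>"HELLO PIPE"</code> in the sink. A compression or encryption adapter would presumably have the same outline, except that it would carry codec state between calls and need a final flush step.</p>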

<h2 id="reading-lines-of-text">Reading Lines of Text</h2>

<p>So far, all of the examples have involved writing output using either <code>OutStream</code> or <code>StringWriter</code>. This next one involves <strong>reading input</strong> using <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/text/StringReader"><code>StringReader</code></a>. <code>StringReader</code> is a subclass of <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/InStream"><code>InStream</code></a> with additional member functions for reading and parsing text, and <code>InStream</code> is a class that performs buffered input from an underlying data source.</p>

<p>The following function opens a text file and reads its contents into an array of <code>String</code>s, with one array item per line:</p>

<div><div class="CodeRay">
  <div class="code"><pre>Array&lt;String&gt; readLines() {
    Array&lt;String&gt; result;
    Owned&lt;StringReader&gt; sr = FileSystem::native()-&gt;openTextForReadAutodetect(<span class="string"><span class="delimiter">&quot;</span><span class="content">utf16.txt</span><span class="delimiter">&quot;</span></span>).first;
    <span class="keyword">if</span> (sr) {
        <span class="keyword">while</span> (String line = sr-&gt;readString&lt;fmt::Line&gt;()) {
            result.append(std::move(line));
        }
    }
    <span class="keyword">return</span> result;
}
</pre></div>
</div>
</div>

<p>The <a href="https://plywood.arc80.com/docs/modules/runtime/api/filesystem/FileSystem#openTextForReadAutodetect"><code>FileSystem::openTextForReadAutodetect()</code></a> function attempts to guess the file format of the text file automatically. It looks for a byte order mark (BOM) and, if the BOM is missing, uses some heuristics to guess between UTF-8 and UTF-16. Again, because Plywood encourages working with UTF-8 text, the lines returned by the <code>StringReader</code> are always encoded in UTF-8 and terminated with <code>\n</code>, regardless of the source file&rsquo;s original encoding and line endings. All necessary conversions are accomplished using adapters. For example, if we open the UTF-16 file that was written in the previous example, the chain of <code>InStream</code> objects would look like this:</p>

<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 554 287" style="max-width:554px" xmlns:xlink="http://www.w3.org/1999/xlink">
 <defs>
  <filter id="a" style="color-interpolation-filters:sRGB">
   <feflood result="flood" flood-opacity=".062745" flood-color="rgb(0,0,0)" />
   <fecomposite operator="in" result="composite1" in2="SourceGraphic" in="flood" />
   <fegaussianblur stddeviation="3" result="blur" in="composite1" />
   <feoffset result="offset" dx="2" dy="2" />
   <fecomposite operator="over" result="composite2" in2="offset" in="SourceGraphic" />
  </filter>
 </defs>
 <g transform="translate(0 -765.36)">
  <g transform="translate(390 207)">
   <rect stroke-linejoin="round" stroke-dashoffset="1.8" transform="matrix(.5 0 0 .5 -637.1 404.03)" height="148" filter="url(#a)" width="321" stroke="#ccc" stroke-linecap="round" y="704.86" x="957.5" fill="#fffff9" />
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="823.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="823.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">&#8230;</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="816.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="816.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="808.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="808.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="801.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="801.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="793.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="793.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="786.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="786.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="778.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="778.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="771.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="771.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="763.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="763.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  </g>
  <flowroot style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="17.5px" transform="translate(4 683.36)" line-height="125%" font-family="Consolas,monospace" fill="#000000"><flowregion><rect y="-222" width="545" x="24" height="169" /></flowregion><flowpara /></flowroot>
  <flowroot style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="17.5px" transform="translate(4 666.36)" line-height="125%" font-family="Consolas,monospace" fill="#000000"><flowregion><rect y="337" width="564" x="-147" height="189" /></flowregion><flowpara /></flowroot>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="17.5px" line-height="110.00000238%" y="879.36224" x="498.98105" font-family="Asap" font-style="italic" fill="#ff2a2a"><tspan style="text-anchor:middle;text-align:center" line-height="110.00000238%" y="879.36224" x="498.98105">converts to</tspan><tspan style="text-anchor:middle;text-align:center" line-height="110.00000238%" y="898.61224" x="498.98105">UTF-8</tspan></text>
  <path d="m406.82 904.95c26.942.0803 35.919-3.1533 46.257-10.749" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
  <path d="m417.78 912.76-11.799-7.7157 12.199-7.8381" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="866.5" x="5.1427" stroke-width=".28551" fill="#f3fff3" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.125px" line-height="125%" y="883.36218" x="86.365646" font-family="Consolas,monospace" fill="#15999c"><tspan x="86.365646" y="883.36218">InStream</tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="914.5" x="5.1427" stroke-width=".28551" fill="#f3fff3" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.125px" line-height="125%" y="931.36218" x="86.396347" font-family="Consolas,monospace" xml:space="preserve" fill="#15999c"><tspan y="931.36218" x="86.396347">InPipe_TextConverter</tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="962.5" x="5.1427" stroke-width=".28551" fill="#fffff9" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.125px" line-height="125%" y="979.36218" x="86.365646" font-family="Consolas,monospace" xml:space="preserve" fill="#15999c"><tspan y="979.36218" x="86.365646">InStream</tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="1010.5" x="5.1427" stroke-width=".28551" fill="#fffff9" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.125px" line-height="125%" y="1027.3622" x="86.311172" font-family="Consolas,monospace" fill="#15999c"><tspan x="86.311172" y="1027.3622">InPipe_FD</tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="770.5" x="5.1427" stroke-width=".28551" fill="#f9f9ff" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.125px" line-height="125%" y="787.36218" x="86.223312" font-family="Consolas,monospace" xml:space="preserve" fill="#15999c"><tspan y="787.36218" x="86.223312">StringReader</tspan></text>
  <rect stroke-linejoin="round" stroke-dashoffset="1.8" height="25.715" filter="url(#a)" width="162.71" stroke="#ccc" stroke-linecap="round" y="818.5" x="5.1427" stroke-width=".28551" fill="#f9f9ff" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.125px" line-height="125%" y="835.36218" x="86.396347" font-family="Consolas,monospace" fill="#15999c"><tspan x="86.396347" y="835.36218">InPipe_NewLineFilter</tspan></text>
  <g transform="translate(390 111)">
   <rect stroke-linejoin="round" stroke-dashoffset="1.8" transform="matrix(.5 0 0 .5 -637.1 404.03)" height="148" filter="url(#a)" width="321" stroke="#ccc" stroke-linecap="round" y="704.86" x="957.5" fill="#f3fff3" />
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="823.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="823.71222"><tspan fill="#cccccc" font-size="6.5625px">&#8230;</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="816.21222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="816.21222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="808.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="808.71222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="801.21222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="801.21222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="793.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="793.71222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="786.21222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="786.21222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="778.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="778.71222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="771.21222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="771.21222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="763.71222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="763.71222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  </g>
  <g transform="translate(390 15)">
   <rect stroke-linejoin="round" stroke-dashoffset="1.8" transform="matrix(.5 0 0 .5 -637.1 404.03)" height="148" filter="url(#a)" width="321" stroke="#ccc" stroke-linecap="round" y="704.86" x="957.5" fill="#f9f9ff" />
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="823.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="823.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">&#8230;</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="816.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="816.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="808.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="808.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="801.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="801.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="793.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="793.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="786.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="786.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="778.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="778.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="6.5625px" line-height="125%" y="771.21222" x="-153.09987" font-family="Consolas,monospace" fill="#cccccc"><tspan x="-153.09987" y="771.21222"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
   <text style="word-spacing:0px;letter-spacing:0px" font-size="6.5625px" line-height="125%" y="763.71222" x="-153.09987" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="763.71222" x="-153.09987"><tspan fill="#cccccc" font-size="6.5625px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  </g>
  <path d="m168 783.36h61" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m230.35 783.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m168 879.36h61" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m230.35 879.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m168 975.36h61" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m230.35 975.36-2.6383 1.568-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359c0-1.1227-1.1943-3.1358-1.1943-3.1358l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 796.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 816.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 844.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 864.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 892.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 912.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 940.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 960.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m86 988.36v17" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m86 1008.4-1.568-2.6383-1.5679-2.6382s2.0131 1.1943 3.1359 1.1943c1.1227 0 3.1358-1.1943 3.1358-1.1943l-1.5679 2.6382z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="17.5px" line-height="110.00000238%" font-style="italic" y="791.36224" x="502.98105" font-family="Asap" xml:space="preserve" fill="#ff2a2a"><tspan style="text-anchor:middle;text-align:center" line-height="110.00000238%" y="791.36224" x="502.98105">discards <tspan font-size="15px" font-style="normal" font-family="Consolas,monospace">\r</tspan></tspan><tspan style="text-anchor:middle;text-align:center" line-height="110.00000238%" y="813.08557" x="502.98105">characters</tspan></text>
  <path d="m406.82 816.95c26.942.0803 35.919-3.1533 46.257-10.749" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
  <path d="m417.78 824.76-11.799-7.7157 12.199-7.8381" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="17.5px" line-height="110.00000238%" font-style="italic" y="957.36224" x="498.98105" font-family="Asap" xml:space="preserve" fill="#ff2a2a"><tspan style="text-anchor:middle;text-align:center" line-height="110.00000238%" y="957.36224" x="501.0831">reads </tspan><tspan style="text-anchor:middle;text-align:center" line-height="110.00000238%" y="976.61224" x="498.98105">from file</tspan><tspan style="text-anchor:middle;text-align:center" line-height="110.00000238%" y="995.86224" x="498.98105">descriptor</tspan></text>
  <path d="m406.82 992.95c26.942.0803 35.919-3.1533 46.257-10.749" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
  <path d="m417.78 1000.8-11.799-7.7157 12.199-7.8381" stroke="#ff2a2a" stroke-width="1.7502" fill="none" />
 </g>
</svg>

<p>This example works fine, but it will potentially perform a lot of memory allocations since every <code>String</code> object owns its own block of memory. An alternative way to extract lines of text from a file is to read the entire file into a <code>String</code> first &ndash; perhaps using <a href="https://plywood.arc80.com/docs/modules/runtime/api/filesystem/FileSystem#loadTextAutodetect"><code>FileSystem::loadTextAutodetect()</code></a> &ndash; and create an array of <a href="https://plywood.arc80.com/docs/modules/runtime/api/string/StringView"><code>StringView</code></a> objects instead, using a function similar to the following:</p>

<div><div class="CodeRay">
  <div class="code"><pre>Array&lt;StringView&gt; extractLines(StringView src) {
    Array&lt;StringView&gt; result;
    StringViewReader svr{src};
    <span class="keyword">while</span> (StringView line = svr.readView&lt;fmt::Line&gt;()) {
        result.append(line);
    }
    <span class="keyword">return</span> result;
}
</pre></div>
</div>
</div>

<p>This is the general approach used in Plywood&rsquo;s built-in JSON, Markdown and C++ parsers. Source files are always loaded into memory first, then parsed in-place, avoiding additional memory allocations and string copies as much as possible. When using an approach like this, care must be taken to ensure that the original <code>String</code> remains valid as long as there are <code>StringView</code>s into it.</p>
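
<p>For readers more familiar with the standard library, the same idea and the same lifetime caveat can be sketched with <code>std::string_view</code>. This is only an analogy, not Plywood&rsquo;s implementation. The helper <code>countNonEmptyLines</code> is a hypothetical name, and keeping the trailing <code>\n</code> in each view is an assumption based on the description of newline-terminated lines earlier in this section:</p>

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Splits `src` into lines without copying any characters. Each view keeps its
// trailing '\n' when one is present and points directly into the caller's buffer.
std::vector<std::string_view> extractLines(std::string_view src) {
    std::vector<std::string_view> result;
    while (!src.empty()) {
        std::size_t pos = src.find('\n');
        // A final line with no '\n' takes the rest of the buffer.
        std::size_t len = (pos == std::string_view::npos) ? src.size() : pos + 1;
        result.push_back(src.substr(0, len));
        src.remove_prefix(len);
    }
    return result;
}

// Hypothetical helper showing safe usage: `text` outlives every view that
// extractLines() returns, so none of the views can dangle.
std::size_t countNonEmptyLines(const std::string& text) {
    std::size_t count = 0;
    for (std::string_view line : extractLines(text)) {
        if (line != "\n") {
            ++count;
        }
    }
    return count;
}
```

<p>Returning the vector of views from a function that owns the <code>std::string</code> as a local variable would leave every view dangling, which is exactly the mistake to watch out for.</p>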

<h2 id="writing-to-memory">Writing to Memory</h2>

<p>A <code>StringWriter</code> or <code>OutStream</code> doesn&rsquo;t always have to write to an <code>OutPipe</code>. You can create a <code>StringWriter</code> that writes to memory simply by invoking its default constructor. After writing to such a <code>StringWriter</code>, you can extract its contents to a <code>String</code> by calling <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/text/StringWriter#moveToString"><code>StringWriter::moveToString()</code></a>:</p>

<pre><code>String getMessage() {
    StringWriter sw;
    sw &lt;&lt; u8"OCEAN MAN 🌊 😍 ";
    sw.format("Take me by the {} lead me to {} that you {}", u8"hand ✋", "land",
            u8"understand 🙌 🌊");
    return sw.moveToString();
}
</code></pre>

<p>Here&rsquo;s what the <code>StringWriter</code> initially looks like in memory, before anything gets written. <code>OutStream::status.type</code> is <code>2</code>, and instead of an <code>OutPipe</code> pointer, there&rsquo;s a second <code>Reference&lt;ChunkListNode&gt;</code> member named <code>headChunk</code>. If we write a large amount of data to this <code>StringWriter</code>, we&rsquo;ll end up with a linked list of <code>ChunkListNode</code>s in memory, with <code>headChunk</code> always pointing to the start of the linked list.</p>

<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 710 332" style="max-width:710px" xmlns:xlink="http://www.w3.org/1999/xlink">
 <defs>
  <filter id="a" style="color-interpolation-filters:sRGB">
   <feflood result="flood" flood-opacity=".062745" flood-color="rgb(0,0,0)" />
   <fecomposite operator="in" result="composite1" in2="SourceGraphic" in="flood" />
   <fegaussianblur stddeviation="3" result="blur" in="composite1" />
   <feoffset result="offset" dx="2" dy="2" />
   <fecomposite operator="over" result="composite2" in2="offset" in="SourceGraphic" />
  </filter>
 </defs>
 <g transform="translate(0 -720.36)">
  <g stroke-linejoin="round" transform="translate(-1026,410)" stroke-dashoffset="1.8" filter="url(#a)" stroke="#ccc" stroke-linecap="round" fill="#fffff9">
   <rect height="148" width="321" y="345.86" x="1402.5" />
   <rect height="132" width="321" y="493.86" x="1402.5" />
   <rect height="175" width="314" y="315.86" x="1029.5" />
  </g>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="770.36218" x="53" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="770.36218" x="53" font-size="13.125px"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> curByte</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="785.36218" x="53" font-family="Consolas,monospace" fill="#000000"><tspan x="53" font-size="13.125px" y="785.36218"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> endByte</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="800.36218" x="53" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="800.36218" x="53" font-size="13.125px"><tspan fill="#15999c">Reference<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>ChunkListNode</tspan><tspan fill="#07448d">&gt;</tspan><tspan fill="#333333"> chunk</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="830.36218" x="53" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="830.36218" x="53" font-size="13.125px"><tspan fill="#15999c">Status</tspan><tspan fill="#333333"> status</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="845.36218" x="77" font-family="Consolas,monospace" fill="#000000"><tspan x="77" font-size="13.125px" y="845.36218"><tspan fill="#15999c">u32</tspan><tspan> chunkSizeExp <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 28 </tspan><tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 12</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="860.36218" x="77" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="860.36218" x="77" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> type <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 2 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 2</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="875.36218" x="77" font-family="Consolas,monospace" fill="#000000"><tspan x="77" font-size="13.125px" y="875.36218"><tspan fill="#15999c">u32</tspan><tspan> isPipeOwner <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 1 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="890.36218" x="77" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="890.36218" x="77" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> eof <tspan fill="#07448d" font-size="13.125px">:</tspan><tspan fill="#15999c" font-size="13.125px"> 1 <tspan fill="#07448d" font-size="13.125px">=</tspan></tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="755.36218" x="29" font-family="Consolas,monospace" fill="#000000"><tspan x="29" y="755.36218"><tspan fill="#15999c" font-size="13.125px">OutStream</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="949.36218" x="407" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="949.36218" x="407" font-size="13.125px"><tspan fill="#15999c">u8</tspan><tspan fill="#07448d">*</tspan><tspan fill="#333333"> bytes</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="964.36218" x="407" font-family="Consolas,monospace" fill="#000000"><tspan x="407" font-size="13.125px" y="964.36218"><tspan fill="#15999c">Reference<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>ChunkListNode</tspan><tspan fill="#07448d">&gt;</tspan><tspan> next <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> nullptr</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="979.36218" x="407" font-family="Consolas,monospace" fill="#000000"><tspan x="407" font-size="13.125px" y="979.36218"><tspan fill="#15999c">u32</tspan><tspan> numBytes <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 4096</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="994.36218" x="407" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="994.36218" x="407" font-size="13.125px"><tspan fill="#15999c">u32</tspan><tspan> writePos <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="1009.3622" x="407" font-family="Consolas,monospace" fill="#000000"><tspan x="407" font-size="13.125px" y="1009.3622"><tspan fill="#15999c">u32</tspan><tspan> offsetIntoNextChunk <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="1024.3622" x="407" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="1024.3622" x="407" font-size="13.125px"><tspan fill="#15999c">mutable s32</tspan><tspan> refCount <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 1</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="919.36218" x="383" font-family="Consolas,monospace" xml:space="preserve" fill="#000000"><tspan y="919.36218" x="383"><tspan fill="#15999c" font-size="13.125px">ChunkListNode</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="934.36218" x="407" font-family="Consolas,monospace" fill="#000000"><tspan x="407" font-size="13.125px" y="934.36218"><tspan fill="#15999c">u64</tspan><tspan> fileOffset <tspan fill="#07448d" font-size="13.125px">=</tspan><tspan fill="#45ba45" font-size="13.125px"> 0</tspan></tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="890.36218" x="387" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="890.36218" x="387"><tspan fill="#cccccc" font-size="13.125px">&#8230;</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="875.36218" x="387" font-family="Consolas,monospace" fill="#cccccc"><tspan x="387" y="875.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="860.36218" x="387" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="860.36218" x="387"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="845.36218" x="387" font-family="Consolas,monospace" fill="#cccccc"><tspan x="387" y="845.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="830.36218" x="387" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="830.36218" x="387"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="815.36218" x="387" font-family="Consolas,monospace" fill="#cccccc"><tspan x="387" y="815.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="800.36218" x="387" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="800.36218" x="387"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="785.36218" x="387" font-family="Consolas,monospace" fill="#cccccc"><tspan x="387" y="785.36218"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="770.36218" x="387" font-family="Consolas,monospace" xml:space="preserve" fill="#cccccc"><tspan y="770.36218" x="387"><tspan fill="#cccccc" font-size="13.125px">00 00 00 00 00 00 00 00 00 00 00 00 00 00</tspan></tspan></text>
  <circle cx="141" cy="766.36" r="3" fill="#4040b2" />
  <circle cy="781.36" cx="141" r="3" fill="#4040b2" />
  <path d="m140 766.36h231" stroke="#4040b2" stroke-width="2" fill="none" />
  <path d="m397.7 945.36h-30.695c-12.558 0-16-6.6584-16-16v-147c0-9.6538 6.1909-16 16-16h2.5001" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cx="398" cy="945.36" r="3" fill="#4040b2" />
  <path d="m371.2 915.36h-24.195c-12.558 0-18-6.6584-18-16v-87c0-9.6538-6.1909-16-16-16h-35" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cy="796.36" cx="278" r="3" fill="#4040b2" />
  <path d="m374.35 766.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m374.35 915.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m374.35 893.36-2.6383 1.5679-2.6382 1.5679s1.1943-2.0131 1.1943-3.1359-1.1943-3.1359-1.1943-3.1359l2.6382 1.5679z" stroke-dashoffset="1.8" stroke="#4040b2" stroke-miterlimit="6" stroke-width="2" fill="#4040b2" />
  <path d="m370.2 893.36h-14.195c-12.558 0-16-6.6584-16-16v-80c0-9.6538-6.1909-16-16-16h-183.5" stroke="#4040b2" stroke-width="2" fill="none" />
  <text style="word-spacing:0px;letter-spacing:0px" xml:space="preserve" font-size="13.125px" line-height="125%" y="815.36218" x="53" font-family="Consolas,monospace" fill="#000000"><tspan x="53" font-size="13.125px" y="815.36218"><tspan fill="#15999c">Reference<tspan fill="#07448d" font-size="13.125px">&lt;</tspan>ChunkListNode</tspan><tspan fill="#07448d">&gt;</tspan><tspan fill="#333333"> headChunk</tspan></tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px" font-size="13.125px" line-height="125%" y="740.36218" x="9" font-family="Consolas,monospace" xml:space="preserve" fill="#15999c"><tspan y="740.36218" x="9">StringWriter<tspan fill="#15999c" font-size="13.125px" /></tspan></text>
  <path d="m329 827.36c0-9.6538-6.1909-16-16-16h-7" stroke="#4040b2" stroke-width="2" fill="none" />
  <circle cx="306" cy="811.36" r="3" fill="#4040b2" />
  <path d="m57.922 802.86c-25.793-1.5078-21.293 20.186-8.551 19.003 14.93-1.8736 229.77-.0955 249.45.73456 19.991 1.4768 19.689-18.947 2.3827-18.454-20.296-.91239-220.48-2.6959-240.76-.79054" stroke="#ff2a2a" stroke-width="2" fill="none" />
  <path d="m193.15 846.86c-26.81-1.5078-21.872 16.369-8.8883 19.003 20.752 4.2099 27.453-16.955 9.198-17.219" stroke="#ff2a2a" stroke-width="2" fill="none" />
 </g>
</svg>

<p>There&rsquo;s a particular optimization that&rsquo;s worth mentioning here. When working with a <code>StringWriter</code> like this one, each <code>ChunkListNode</code> object is located contiguously in memory <em>after</em> the memory buffer it owns. Therefore, when there&rsquo;s just a single node in the linked list, and <code>moveToString()</code> is called, the existing memory buffer is truncated using <code>realloc</code> and returned directly. In other words, when creating a short <code>String</code> this way (smaller than 4 KB), only a single block of memory is allocated, written to and returned; no additional memory allocations or string copies are performed.</p>
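
<p>To make the idea concrete, here&rsquo;s a minimal standalone sketch of the same layout trick. It is not Plywood&rsquo;s actual code: <code>NodeFooter</code>, <code>kChunkSize</code>, <code>allocChunk</code>, <code>append</code> and <code>moveToBuffer</code> are hypothetical names made up for illustration. The bookkeeping lives <em>after</em> the buffer inside a single <code>malloc</code> block, so shrinking the block with <code>realloc</code> drops the bookkeeping and leaves only the text.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for ChunkListNode: bookkeeping stored after the buffer.
struct NodeFooter {
    std::uint32_t writePos = 0;
};

// Assumed chunk size, matching the 4 KB figure mentioned in the text.
constexpr std::size_t kChunkSize = 4096;

// Allocates one block laid out as [ kChunkSize bytes of buffer ][ NodeFooter ].
char* allocChunk(NodeFooter** outFooter) {
    char* block = static_cast<char*>(std::malloc(kChunkSize + sizeof(NodeFooter)));
    if (!block)
        return nullptr;
    *outFooter = new (block + kChunkSize) NodeFooter{};
    return block;
}

// Appends bytes to the buffer; returns false if they don't fit in this chunk.
bool append(char* block, NodeFooter* footer, const char* src, std::size_t n) {
    if (n > kChunkSize - footer->writePos)
        return false;
    std::memcpy(block + footer->writePos, src, n);
    footer->writePos += static_cast<std::uint32_t>(n);
    return true;
}

// Shrinks the block down to the bytes actually written and hands it to the
// caller, who must release it with std::free(). The footer is cut off by the
// shrink. realloc is allowed to move the block, but the caller never makes a
// second allocation or an explicit copy.
char* moveToBuffer(char* block, NodeFooter* footer, std::size_t* outLen) {
    std::size_t len = footer->writePos;
    *outLen = len;
    // Avoid realloc(ptr, 0), whose behavior is implementation-defined.
    char* shrunk = static_cast<char*>(std::realloc(block, len ? len : 1));
    // If shrinking fails, the original block is still valid.
    return shrunk ? shrunk : block;
}
```

<p>The sketch only covers the single-chunk case. Once a second chunk is linked in, the contents have to be gathered into a new buffer instead.</p>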

<p>The previous example demonstrates using <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/text/StringWriter#format"><code>StringWriter::format()</code></a> to write formatted text to an output stream. Plywood also provides a convenient wrapper function <a href="https://plywood.arc80.com/docs/modules/runtime/api/string/String#format"><code>String::format()</code></a> that hides the temporary <code>StringWriter</code>:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">return</span> String::format(<span class="string"><span class="delimiter">&quot;</span><span class="content">The answer is {}.</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, <span class="integer">42</span>);
</pre></div>
</div>
</div>
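
<p>For readers curious how a wrapper like this can hide its temporary writer, here&rsquo;s a rough sketch in standard C++. It is not how Plywood implements <code>String::format()</code>: <code>formatTo</code> and <code>formatString</code> are hypothetical names, <code>std::ostringstream</code> stands in for <code>StringWriter</code>, and only bare <code>{}</code> placeholders are handled.</p>

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <string_view>

// Base case: no arguments left, so copy the rest of the format string.
inline void formatTo(std::ostream& out, std::string_view fmt) {
    out << fmt;
}

// Writes text up to the next "{}", then the next argument, then recurses.
template <typename First, typename... Rest>
void formatTo(std::ostream& out, std::string_view fmt, const First& first,
              const Rest&... rest) {
    std::size_t pos = fmt.find("{}");
    if (pos == std::string_view::npos) {
        out << fmt;  // more arguments than placeholders: extras are ignored
        return;
    }
    out << fmt.substr(0, pos) << first;
    formatTo(out, fmt.substr(pos + 2), rest...);
}

// Convenience wrapper: the temporary writer never escapes this function.
template <typename... Args>
std::string formatString(std::string_view fmt, const Args&... args) {
    std::ostringstream writer;
    formatTo(writer, fmt, args...);
    return writer.str();
}
```

<p>With this sketch, <code>formatString("The answer is {}.\n", 42)</code> builds the same text as the example above, and the caller never sees the stream.</p>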

<p>If you wish to write to memory using an <code>OutStream</code> instead of a <code>StringWriter</code>, use the derived class <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/MemOutStream"><code>MemOutStream</code></a>. <code>MemOutStream</code> is mainly intended for writing binary data, but as mentioned earlier, there&rsquo;s really nothing stopping you from casting it to a <code>StringWriter</code> using <a href="https://plywood.arc80.com/docs/modules/runtime/api/io/OutStream#strWriter"><code>OutStream::strWriter()</code></a> after it&rsquo;s created.</p>

<h2 id="future-improvements">Future Improvements</h2>

<p>I hope you enjoyed this brief, meandering tour through Plywood&rsquo;s I/O system. Here&rsquo;s a quick list of potential future improvements to the system:</p>

<ul>
  <li>
    <p>Plywood doesn&rsquo;t have a bidirectional stream yet, like <a href="https://en.cppreference.com/w/cpp/io/basic_iostream"><code>std::iostream</code></a> in standard C++. As of this writing, only <code>InStream</code> and <code>OutStream</code> are implemented.</p>
  </li>
  <li>
    <p>Currently, the contents of <code>OutStream</code>&rsquo;s temporary buffer are always flushed using a synchronous function call. When writing to a file descriptor, better throughput could be achieved using the operating system&rsquo;s asynchronous I/O support instead. When writing to an adapter, such as an <code>OutPipe_TextConverter</code> or a compression codec, better throughput could be achieved by processing the data in a background thread or using a job system. The main challenge will be to manage the lifetimes of multiple <code>ChunkListNode</code> objects while I/O is pending.</p>
  </li>
  <li>
    <p>Plywood&rsquo;s text conversion is only aware of UTF-8 and UTF-16 at this time. Some work was done towards ISO 8859-1 and Windows-1252, but support is not yet complete. Additional work is needed to support other encodings like Shift-JIS or GB 2312. The intention would still be to work with UTF-8 when text is loaded in memory, and convert between formats when performing I/O.</p>
  </li>
</ul>
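
<p>As an illustration of the &ldquo;UTF-8 in memory, convert at the I/O boundary&rdquo; approach from the last point, here&rsquo;s a small standalone UTF-16 to UTF-8 converter. It&rsquo;s a sketch under my own assumptions, not Plywood&rsquo;s converter: <code>utf16ToUtf8</code> is a hypothetical name, the whole input is assumed to be available at once (a real stream adapter would work chunk by chunk), and unpaired surrogates are replaced with U+FFFD.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Converts UTF-16 code units to UTF-8. Unpaired surrogates become U+FFFD.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // stray low surrogate
        }

        // Encode the code point using one to four UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

<p>Supporting a legacy encoding such as Windows-1252 or Shift-JIS would follow the same pattern: decode each input unit to a code point, then encode that code point as UTF-8.</p>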
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A New Cross-Platform Open Source C++ Framework]]></title>
    <link href="https://preshing.com/20200526/a-new-cross-platform-open-source-cpp-framework"/>
    <updated>2020-05-26T07:50:00-04:00</updated>
    <id>https://preshing.com/?p=20200526</id>
    <content type="html"><![CDATA[<p>For the past little while &ndash; OK, long while &ndash; I&rsquo;ve been working on a <a href="https://preshing.com/20171218/how-to-write-your-own-cpp-game-engine">custom game engine in C++</a>. Today, I&rsquo;m releasing part of that game engine as an open source framework. It&rsquo;s called the <strong>Plywood</strong> framework.</p>

<div class="linkPanelSet">
<div>
<a href="https://plywood.arc80.com/"><img src="https://preshing.com/images/plywood-logo.svg" /><br />
View the documentation</a>
</div>
<div>
<a href="https://github.com/arc80/plywood"><img src="https://preshing.com/images/plywood-github.svg" /><br />
View on GitHub</a>
</div>
</div>

<p>Please note that Plywood, by itself, is <em>not</em> a game engine! It&rsquo;s a framework for building all kinds of software using C++.</p>

<p>For example, Plywood&rsquo;s <a href="https://plywood.arc80.com/">documentation</a> is generated with the help of a C++ parser, formatted by a Markdown parser, and runs on a custom webserver, all written using Plywood.</p>

<!--more-->
<p>Integrating third-party libraries can be a pain in C++, but Plywood aims to simplify it. <a href="https://github.com/arc80/plywood/blob/main/repos/plywood/src/apps/CairoToVideo/Main.cpp">Here&rsquo;s a short Plywood program</a> that uses <a href="https://www.cairographics.org/">Cairo</a> and <a href="https://www.ffmpeg.org/libavcodec.html">Libavcodec</a> to render a vector animation to a video file:</p>

<video width="240" height="240" autoplay="" loop="" muted="">
  <source src="https://preshing.com/images/vector-animation.mp4" type="video/mp4" />
</video>

<p>And <a href="https://github.com/arc80/plywood/blob/main/repos/plywood/src/apps/MusicSample/Main.cpp">here&rsquo;s one</a> that synthesizes a short music clip to an MP3:</p>

<audio controls="">
  <source src="https://preshing.com/images/super-mario-intro.mp3" type="audio/mpeg" />
</audio>

<p>The source code for these examples is included in the Plywood repository, and everything builds and runs on Windows, Linux and macOS.</p>

<p>Of course, Plywood also serves as the foundation for my (proprietary) game engine, which I call the <strong>Arc80 Engine</strong>. That&rsquo;s why Plywood came into existence in the first place. I haven&rsquo;t shipped a complete game using the Arc80 Engine yet, but I have made a number of prototypes with it. More on that later!</p>

<video width="580" height="176" autoplay="" loop="" muted="">
  <source src="https://preshing.com/images/montage.mp4" type="video/mp4" />
</video>

<h2 id="whats-included-in-plywood">What&rsquo;s Included In Plywood</h2>

<p>Plywood comes with:</p>

<ol>
  <li>A <strong>workspace</strong> designed to help you reuse code between applications.</li>
  <li>A set of <strong>built-in modules</strong> providing cross-platform I/O, containers, process creation and more.</li>
  <li>A runtime <strong>reflection</strong> and serialization system.</li>
</ol>

<p>Here are a few more details about each component.</p>

<h3 id="the-workspace">The Workspace</h3>

<p>Most open source C++ projects are libraries that are meant to be integrated into other applications. Plywood is the opposite of that: It gives you a workspace into which source code and libraries can be integrated. A single Plywood workspace can contain several applications &ndash; a webserver, a game engine, a command-line tool. Plywood simplifies the task of building and sharing code between them.</p>

<p>Plywood uses <a href="https://cmake.org/">CMake</a> under the hood, but you don&rsquo;t have to write any CMake scripts. Like any CMake-based project, a Plywood workspace has the concepts of <strong>targets</strong> and <strong>build folders</strong> that are kept separate from the source code, but it adds several other concepts on top of that, such as <a href="https://plywood.arc80.com/docs/KeyConcepts">modules, root targets and extern providers</a>. (Note: Plywood modules are not to be confused with C++20 modules; they&rsquo;re different things.)</p>

<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 542 311" style="max-width:542px" xmlns:xlink="http://www.w3.org/1999/xlink">
 <g transform="translate(0 -741.36)">
  <rect rx="11" ry="11" height="289" width="537" stroke="#333" y="762.36" x="3" stroke-width="2" fill="none" />
  <rect rx="7" ry="7" height="29" width="81" stroke="#a7a7a7" y="855.86" x="372.5" fill="#f9f9f9" />
  <rect rx="7" ry="7" height="29" width="81" stroke="#a7a7a7" y="855.86" x="283.5" fill="#f9f9f9" />
  <rect rx="7" ry="7" height="29" width="81" stroke="#a7a7a7" y="819.86" x="372.5" fill="#f9f9f9" />
  <rect rx="4.8819" ry="4.8819" height="63" width="178" stroke="#85c885" y="813.86" x="87.5" fill="#f3fff5" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="823.86" x="100.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="823.86" x="132.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="823.86" x="164.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="823.86" x="196.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="847.86" x="100.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="847.86" x="132.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e9d6b6" y="847.86" x="196.5" fill="#fff6d1" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="823.86" x="228.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e9d6b6" y="847.86" x="228.5" fill="#fff6d1" />
  <rect rx="4.8819" ry="4.8819" height="63" width="178" stroke="#85c885" y="883.86" x="87.5" fill="#f3fff5" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="893.86" x="100.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="893.86" x="132.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="893.86" x="164.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="893.86" x="196.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="917.86" x="100.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="917.86" x="132.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e9d6b6" y="917.86" x="164.5" fill="#fff6d1" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e9d6b6" y="917.86" x="196.5" fill="#fff6d1" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="893.86" x="228.5" fill="#fffff0" />
  <rect rx="4.8819" ry="4.8819" height="63" width="178" stroke="#85c885" y="953.86" x="87.5" fill="#f3fff5" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="963.86" x="100.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="963.86" x="132.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="963.86" x="164.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="963.86" x="196.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="987.86" x="100.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="987.86" x="132.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e9d6b6" y="987.86" x="196.5" fill="#fff6d1" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="963.86" x="228.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e9d6b6" y="987.86" x="228.5" fill="#fff6d1" />
  <rect rx="7" ry="7" height="29" width="81" stroke="#a7a7a7" y="819.86" x="283.5" fill="#f9f9f9" />
  <rect rx="7" ry="7" height="14" width="19" stroke="#d2d2d2" y="827.86" x="290.5" fill="#fff" />
  <rect rx="7" ry="7" height="14" width="19" stroke="#d2d2d2" y="827.86" x="314.5" fill="#fff" />
  <rect rx="7" ry="7" height="14" width="19" stroke="#d2d2d2" y="827.86" x="379.5" fill="#fff" />
  <rect rx="7" ry="7" height="14" width="19" stroke="#d2d2d2" y="863.86" x="290.5" fill="#fff" />
  <rect rx="7" ry="7" height="14" width="19" stroke="#d2d2d2" y="863.86" x="379.5" fill="#fff" />
  <rect rx="7" ry="7" height="14" width="19" stroke="#d2d2d2" y="863.86" x="403.5" fill="#fff" />
  <rect rx="11" ry="11" height="29" width="108" stroke="#e9d6b6" y="903.86" x="283.5" fill="#fff6d1" />
  <rect rx="11" ry="11" height="29" width="108" stroke="#e9d6b6" y="939.86" x="283.5" fill="#fff6d1" />
  <rect rx="11" ry="11" height="29" width="108" stroke="#e9d6b6" y="975.86" x="283.5" fill="#fff6d1" />
  <rect rx="7" ry="7" height="14" width="19" stroke="#d2d2d2" y="863.86" x="427.5" fill="#fff" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e3e3c1" y="987.86" x="164.5" fill="#fffff0" />
  <rect rx="9.5" ry="9.5" height="19" width="25" stroke="#e9d6b6" y="847.86" x="164.5" fill="#fff6d1" />
 </g>
 <g transform="translate(0 -741.36)">
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="15px" line-height="110.00000238%" y="753.41718" x="271.57324" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="753.41718" x="271.57324" font-weight="bold">A Plywood workspace</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="970.41718" x="456.01678" font-family="Arimo" xml:space="preserve" fill="#ff5555"><tspan y="970.41718" x="456.01678">external</tspan><tspan y="985.54218" x="456.01678">packages</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="828.41718" x="510.01685" font-family="Arimo" fill="#ff5555"><tspan x="510.01685" y="828.41718">build</tspan><tspan x="510.01685" y="843.54218">folders</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="801.41718" x="320.01678" font-family="Arimo" xml:space="preserve" fill="#ff5555"><tspan y="801.41718" x="320.01678">root targets</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="864.41718" x="32.016785" font-family="Arimo" fill="#ff5555"><tspan x="32.016785" y="864.41718">repos</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="779.41718" x="108.01678" font-family="Arimo" xml:space="preserve" fill="#ff5555"><tspan y="779.41718" x="108.01678">source code</tspan><tspan y="794.54218" x="108.01678">modules</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="1040.4172" x="274.01678" font-family="Arimo" fill="#ff5555"><tspan x="274.01678" y="1040.4172">extern providers</tspan></text>
  <path d="m93 857.36-38 6" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m93 889.37-38-26.01" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m93 962.37-38-99.01" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m244 1001.4-3 25.001" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m212 1001.4 29 25.001" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m320 832.36-15-25" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m296 832.36 9-25.001" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m449 827.36 38 6" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m449 863.37 38-30.012" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m424 970.36-40 18.008" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m424 970.36-40-15.99" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m424 970.36-40-49.99" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m107.2 829.36 5.9183-29" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
  <path d="m139.2 829.36-26.08-29" stroke="#f55" stroke-linecap="round" stroke-width="1px" fill="none" />
 </g>
</svg>

<p>An explanation of the Plywood workspace, and how it interacts with CMake, really deserves its own post, but what it helps achieve is <strong>modularity</strong>.</p>

<p>Arc80 is a modular game engine, and thanks to the Plywood workspace, it&rsquo;s possible to quickly build small applications using Arc80 engine modules. For example, when I was developing the flame effect for a jet engine, I created a small test application that contained nothing but the flame effect. This test application allowed me to focus on the feature itself and kept the effect decoupled from the rest of the game.</p>

<video width="280" height="172" autoplay="" loop="" muted="">
  <source src="https://preshing.com/images/flametest-window.mp4" type="video/mp4" />
</video>

<p>Additional work is still needed to make the process more user-friendly. For example, to define a new module, you must currently write a C++ function using an API I haven&rsquo;t documented yet. A simple configuration file would be preferable. That&rsquo;s the next thing I plan to work on now that Plywood is released.</p>

<h3 id="built-in-modules">Built-In Modules</h3>

<p>I thought about open-sourcing part of the Arc80 Engine for a long time, and that&rsquo;s what led to the <a href="https://plywood.arc80.com/docs/KeyConcepts#repos">repos</a> idea mentioned above. A single Plywood workspace can combine modules from separate Git repositories. When it came time to open-source Plywood, it was a matter of moving a bunch of modules from the proprietary <code>arc80</code> repo to the public <code>plywood</code> repo.</p>

<p>As of today, the <code>plywood</code> repo comes with <strong>36</strong> <a href="https://plywood.arc80.com/docs/KeyConcepts#modules">built-in modules</a>. These modules offer functionality for platform abstraction, vector math, JSON parsing, audio and image processing, and more. Of course, all modules are optional, so if you don&rsquo;t think a specific module suits your needs, you don&rsquo;t have to use it. Here&rsquo;s a simplified diagram showing some of the modules included in the <code>plywood</code> repo (on the bottom) versus the modules I&rsquo;m keeping private for now (on the top). The arrows represent dependencies:</p>

<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 381 453" style="max-width:381px" xmlns:xlink="http://www.w3.org/1999/xlink">
 <g transform="translate(0 -599.36)">
  <path d="m186 786.36v12h-11.111l17.111 16 17.111-16h-11.111v-12z" fill="#dbdbdb" />
  <rect rx="16.927" ry="16.927" height="180" width="362" stroke="#85c885" y="601.86" x="10.5" fill="#f3fff5" />
  <rect rx="6" ry="6" height="19" width="105" stroke="#e3e3c1" y="611.86" x="123.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="611.86" x="47.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="707.86" x="103.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="659.86" x="103.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="659.86" x="27.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="659.86" x="179.5" fill="#fffff0" />
  <rect rx="16.927" ry="16.927" height="232" width="377" stroke="#85c885" y="818.86" x="2.5" fill="#f3fff5" />
  <rect rx="6.364" ry="6.364" height="19" width="77" stroke="#e9d6b6" y="972.86" x="286.5" fill="#fff6d1" />
  <path d="m262 904.36h45.515c9.8394 0 15.485 6.0701 15.485 15.106v14.894" stroke="#dbdbdb" stroke-width="8" fill="none" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="877.86" x="34.5" fill="#fffff0" />
  <rect rx="6.364" ry="6.364" height="19" width="77" stroke="#e9d6b6" y="948.86" x="286.5" fill="#fff6d1" />
  <path d="m156 862.36-11 10-11-10z" fill="#dbdbdb" />
  <path d="m156 959.36-11 10-11-10z" fill="#dbdbdb" />
  <path d="m156 1006.4-11 10-11-10z" fill="#dbdbdb" />
  <path d="m145 854.36v9" stroke="#dbdbdb" stroke-width="8" fill="none" />
  <path d="m145 951.36v9" stroke="#dbdbdb" stroke-width="8" fill="none" />
  <path d="m145 998.36v9" stroke="#dbdbdb" stroke-width="8" fill="none" />
  <path d="m334 933.36-11 10-11-10z" fill="#dbdbdb" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="683.86" x="103.5" fill="#fffff0" />
  <rect rx="6.364" ry="6.364" height="19" width="77" stroke="#e9d6b6" y="729.86" x="279.5" fill="#fff6d1" />
  <rect rx="6.364" ry="6.364" height="19" width="77" stroke="#e9d6b6" y="753.86" x="279.5" fill="#fff6d1" />
  <path d="m255 685.36h45.515c9.8394 0 15.485 6.0701 15.485 15.106v14.894" stroke="#dbdbdb" stroke-width="8" fill="none" />
  <path d="m138 636.36v9" stroke="#dbdbdb" stroke-width="8" fill="none" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="707.86" x="27.5" fill="#fffff0" />
  <path d="m149 644.36-11 10-11-10z" fill="#dbdbdb" />
  <path d="m327 714.36-11 10-11-10z" fill="#dbdbdb" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="877.86" x="110.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="877.86" x="186.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="901.86" x="110.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="901.86" x="186.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="925.86" x="110.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="925.86" x="186.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="925.86" x="34.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="901.86" x="34.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="974.86" x="72.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="974.86" x="148.5" fill="#fffff0" />
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="1022.9" x="110.5" fill="#fffff0" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="1036.4172" x="145.01685" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="1036.4172" x="145.01685">platform</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="988.41724" x="183.01685" font-family="Arimo" fill="#333333"><tspan x="183.01685" y="988.41724">runtime</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="939.41724" x="146.01685" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="939.41724" x="146.01685">reflect</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="939.41724" x="69.016861" font-family="Arimo" fill="#333333"><tspan x="69.016861" y="939.41724">pylon</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="915.41724" x="69.016861" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="915.41724" x="69.016861">cook</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="988.41724" x="107.01686" font-family="Arimo" fill="#333333"><tspan x="107.01686" y="988.41724">math</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="915.41724" x="221.01685" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="915.41724" x="221.01685">image</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="939.41724" x="221.01685" font-family="Arimo" fill="#333333"><tspan x="221.01685" y="939.41724">web</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="915.41724" x="145.01685" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="915.41724" x="145.01685">cpp</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="891.41724" x="221.01685" font-family="Arimo" fill="#333333"><tspan x="221.01685" y="891.41724">codec</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="891.41724" x="146.01685" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="891.41724" x="146.01685">build</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="891.41724" x="69.016861" font-family="Arimo" fill="#333333"><tspan x="69.016861" y="891.41724">audio</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="962.41724" x="325.01685" font-family="Arimo" xml:space="preserve" fill="#cca561"><tspan y="962.41724" x="325.01685">libavcodec</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="986.41724" x="325.01685" font-family="Arimo" fill="#cca561"><tspan x="325.01685" y="986.41724">cairo</tspan></text>
  <rect rx="6.364" ry="6.364" height="19" width="77" stroke="#e9d6b6" y="996.86" x="286.5" fill="#fff6d1" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="1010.4172" x="325.01685" font-family="Arimo" xml:space="preserve" fill="#cca561"><tspan y="1010.4172" x="325.01685">libsass</tspan></text>
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="828.86" x="20.5" fill="#fffff0" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="842.41718" x="55.016777" font-family="Arimo" fill="#333333"><tspan x="55.016777" y="842.41718">plytool</tspan></text>
  <rect rx="7.2174" ry="6" height="19" width="83" stroke="#e3e3c1" y="828.86" x="96.5" fill="#fffff0" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="842.41718" x="138.01678" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="842.41718" x="138.01678">WebCooker</tspan></text>
  <rect rx="7.2174" ry="6" height="19" width="83" stroke="#e3e3c1" y="828.86" x="186.5" fill="#fffff0" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="842.41718" x="228.01678" font-family="Arimo" fill="#333333"><tspan x="228.01678" y="842.41718">WebServer</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="721.41718" x="139.01685" font-family="Arimo" fill="#333333"><tspan x="139.01685" y="721.41718">ui</tspan></text>
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="707.86" x="179.5" fill="#fffff0" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="721.41718" x="62.016861" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="721.41718" x="62.016861">session</tspan></text>
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="683.86" x="27.5" fill="#fffff0" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="697.41718" x="62.016861" font-family="Arimo" fill="#333333"><tspan x="62.016861" y="697.41718">graphic</tspan></text>
  <rect rx="6" ry="6" height="19" width="69" stroke="#e3e3c1" y="683.86" x="179.5" fill="#fffff0" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="697.41718" x="214.01685" font-family="Arimo" fill="#333333"><tspan x="214.01685" y="697.41718">physics</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="721.41718" x="214.01685" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="721.41718" x="214.01685">view</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="697.41718" x="138.01685" font-family="Arimo" fill="#333333"><tspan x="138.01685" y="697.41718">image</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="673.41718" x="214.01685" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="673.41718" x="214.01685">gpu</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="673.41718" x="139.01685" font-family="Arimo" fill="#333333"><tspan x="139.01685" y="673.41718">audio</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="673.41718" x="62.016861" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="673.41718" x="62.016861">assetBank</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="743.41718" x="318.01685" font-family="Arimo" fill="#cca561"><tspan x="318.01685" y="743.41718">FreeType</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="767.41718" x="318.01685" font-family="Arimo" xml:space="preserve" fill="#cca561"><tspan y="767.41718" x="318.01685">SDL</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="625.41718" x="82.016724" font-family="Arimo" xml:space="preserve" fill="#333333"><tspan y="625.41718" x="82.016724">Game</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="625.41718" x="175.55701" font-family="Arimo" fill="#333333"><tspan x="175.55701" y="625.41718">RemoteCooker</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="20px" line-height="110.00000238%" y="621.41718" x="323.82645" font-family="Consolas" xml:space="preserve" fill="#3b9d2b"><tspan font-size="17.5px" y="621.41718" x="323.82645" font-family="Consolas" fill="#3b9d2b">arc80</tspan></text>
  <path style="color-rendering:auto;text-decoration-color:#000000;color:#000000;isolation:auto;mix-blend-mode:normal;shape-rendering:auto;solid-color:#000000;block-progression:tb;text-decoration-line:none;text-decoration-style:solid;image-rendering:auto;white-space:normal;text-indent:0;text-transform:none" d="m224.74 982.85c-.24341-.007-.48564.11397-.6308.30886l-3.7328 5.0175a.082277 .082277 0 0 1 -.0881 .0303l-6.0467-1.6282c-.15154-.0407-.31593-.0332-.46335.02-.14743.0532-.27697.15305-.36664.2815-.0897.12845-.13921.28506-.13858.44173.00064.15661.0507.31234.14147.44026l3.6289 5.0998a.082277 .082277 0 0 1 .003 .0924l-3.4138 5.2398c-.0851.13128-.12855.28837-.1227.44462.006.15634.0612.3112.15589.4359.0947.12469.22749.21923.37675.26702.14926.0478.31375.0499.46335 0l5.9759-1.8618a.082277 .082277 0 0 1 .088 .0255l3.9364 4.8644c.0988.1217.23601.2125.38685.2555.15083.043.31526.038.46335-.013.14809-.052.27943-.1502.37098-.2773.0916-.1269.14254-.2823.14435-.4387l.065-6.2517a.082277 .082277 0 0 1 .0534 -.0754l5.8475-2.2331c.14648-.0563.27482-.15879.36231-.28869.0875-.1299.13317-.28646.12991-.44312-.003-.15656-.0559-.31248-.14867-.43888-.0928-.12629-.22515-.22264-.37386-.27275l-5.9341-1.9977a.082277 .082277 0 0 1 -.0563 -.0739l-.32334-6.2444c-.0157-.3067-.23467-.59361-.52686-.68997-.0732-.0237-.14968-.0362-.22663-.0377z" fill-opacity=".078431" />
  <path stroke-linejoin="round" transform="matrix(.31177 .10661 -.10653 .31122 152.17 602.21)" stroke="#ebc45c" stroke-linecap="round" stroke-width="5.1631" fill="#ffff47" d="m595 1074.4-17.573-11.056-16.739 12.283 5.0849-20.13-16.854-12.124 20.716-1.3843 6.3224-19.776 7.7181 19.274 20.762-.098-15.946 13.296z" />
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="20px" line-height="110.00000238%" y="839.09271" x="329.04007" font-family="Consolas" fill="#3b9d2b"><tspan font-size="17.5px" y="839.09271" x="329.04007" font-family="Consolas" fill="#3b9d2b">plywood</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" font-size="13.75px" line-height="110.00000238%" y="636.41718" x="324.01688" font-family="Arimo" xml:space="preserve" fill="#7aba6f"><tspan y="636.41718" x="324.01688">(not included)</tspan></text>
  <text style="word-spacing:0px;letter-spacing:0px;text-anchor:middle;text-align:center" xml:space="preserve" font-size="13.75px" line-height="110.00000238%" y="855.41718" x="329.01688" font-family="Arimo" fill="#7aba6f"><tspan x="329.01688" y="855.41718">(included)</tspan></text>
 </g>
</svg>

<p>The most important module is the one marked with a star: <code>runtime</code>. I originally used <a href="https://github.com/preshing/turf">Turf</a> for cross-platform threads and memory management, then I gradually added containers, I/O, a filesystem API and sensible Unicode support. The <code>runtime</code> module is the result. It also exposes things not available in the standard C or C++ runtime libraries, like redirecting I/O to a subprocess and watching for changes to a directory. The API could use some improvement here &amp; there, but I already prefer it over the standard libraries quite a bit.</p>

<p>I would have liked to open source more of the Arc80 Engine, especially modules related to graphics and audio processing, but those aren&rsquo;t ready for prime time yet. I might open source them in the future, but it&rsquo;ll depend on whether Plywood is successful in the first place.</p>

<h3 id="the-reflection-system">The Reflection System</h3>

<p>Runtime reflection is, in my opinion, the biggest missing feature in standard C++. This feature enables an entire category of generic programming that&rsquo;s simply not possible otherwise. I wrote about runtime reflection in the previous <a href="https://preshing.com/20180116/a-primitive-reflection-system-in-cpp-part-1/">two</a> <a href="https://preshing.com/20180124/a-flexible-reflection-system-in-cpp-part-2/">posts</a> on this blog. Plywood comes with a built-in reflection system that&rsquo;s based on the approach described in those posts, but adds a code generation step on top of it.</p>

<p>In short, Plywood&rsquo;s reflection system exposes metadata about your program&rsquo;s data structures at runtime. For example, a data structure named <code>Contents</code> is defined in Plywood&rsquo;s <code>web-documentation</code> module:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">struct</span> Contents {
    PLY_REFLECT()
    String title;
    String linkDestination;
    Array&lt;Contents&gt; children;
    <span class="comment">// ply reflect off</span>
};
</pre></div>
</div>
</div>

<p>At runtime, Plywood&rsquo;s documentation system loads an <code>Array&lt;Contents&gt;</code> from a JSON file:</p>

<pre><code>[
  {
    "title": "Home",
    "linkDestination": "/",
    "children": [
    ]
  },
  {
    "title": "Quick Start",
    "linkDestination": "/docs/QuickStart",
    "children": [
      {
        "title": "Building the Documentation",
        "linkDestination": "/docs/QuickStart/BuildDocs",
        "children": [
        ]
      }
    ]
  },
  ...
</code></pre>

<p>The data is loaded by passing the metadata for <code>Array&lt;Contents&gt;</code> to a generic JSON loading function. No type-specific loading code needs to be written.</p>
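
<p>To make the idea concrete, here&rsquo;s a minimal sketch of metadata-driven loading. It is not Plywood&rsquo;s API: <code>MemberInfo</code>, <code>StructInfo</code>, <code>contentsInfo()</code> and <code>loadStrings()</code> are hypothetical names, the <code>Contents</code> struct is trimmed to its two string members, and a <code>std::map</code> stands in for a parsed JSON object. The point is only that the loader never mentions <code>Contents</code>; it walks a table of member names and offsets.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical metadata: a member's name and its byte offset in the struct.
// In this sketch every reflected member is assumed to be a std::string.
struct MemberInfo {
    const char* name;
    std::size_t offset;
};

struct StructInfo {
    std::vector<MemberInfo> members;
};

// Trimmed-down version of the struct from the post (no children array).
struct Contents {
    std::string title;
    std::string linkDestination;
};

// Hand-written here; in the post, equivalent metadata comes from code generation.
inline const StructInfo& contentsInfo() {
    static const StructInfo info{{
        {"title", offsetof(Contents, title)},
        {"linkDestination", offsetof(Contents, linkDestination)},
    }};
    return info;
}

// Generic loader: it only sees a void* and the metadata, never the concrete type.
// The std::map is a stand-in for one parsed JSON object.
inline void loadStrings(void* obj, const StructInfo& info,
                        const std::map<std::string, std::string>& fields) {
    for (const MemberInfo& member : info.members) {
        auto found = fields.find(member.name);
        if (found != fields.end()) {
            auto* target = reinterpret_cast<std::string*>(
                static_cast<char*>(obj) + member.offset);
            *target = found->second;
        }
    }
}
```

<p>Calling <code>loadStrings(&amp;contents, contentsInfo(), fields)</code> fills in whichever members have a matching key and ignores keys that don&rsquo;t correspond to any member. A real loader would also dispatch on each member&rsquo;s type instead of assuming strings.</p>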

<p>Currently, to make this work, you need to run <a href="https://plywood.arc80.com/docs/PlyTool"><code>plytool codegen</code></a> at a command prompt before compiling your project. (I plan to eventually merge this command into the build process, but only after optimizing it so that it runs incrementally.) This command scans all available source code modules, extracts all the type declarations using a custom C++ parser, and generates some additional source code required for the metadata.</p>

<p>The Arc80 Engine also uses reflection for binary serialization, shader parameter passing and more.</p>

<p>If you&rsquo;re a team working on a relatively new, cross-platform C++ project, I think it&rsquo;s already viable to base that project on Plywood. You&rsquo;ll have to follow Plywood development closely, and you&rsquo;ll have to seek out some answers directly from the source code where the documentation is not yet complete (as is the case with any project). But given that I&rsquo;ve already put roughly 4000 hours of work into this framework, you&rsquo;ll have a significant head start compared to building a similar framework yourself.</p>

<p>There are a lot of features in Plywood that could be extended in the future:</p>

<ul>
  <li>Plywood&rsquo;s built-in C++ parser could be extended to make it useful for other applications. It currently recognizes declarations but not expressions; function bodies are currently skipped. This feature would be a big undertaking.</li>
  <li>The webserver could be improved. It doesn&rsquo;t even use a thread pool yet. Any improvements to the webserver would likely make Plywood a better fit for other back-end services, too.</li>
  <li>Plywood&rsquo;s dependency on the standard C++ runtime could possibly be eliminated, and on some platforms, the standard C runtime too. Doing so would reduce build times and result in smaller executables.</li>
  <li><a href="https://plywood.arc80.com/docs/PlyTool">PlyTool</a> could be made to invoke the C++ compiler directly, without requiring CMake to generate an intermediate build system first.</li>
</ul>

<p>It&rsquo;ll depend on what other people consider useful, so I&rsquo;m interested to hear your thoughts. Which of those improvements is worth pursuing? Do you see a way to simplify something in Plywood? Have a question, or more of a comment than a question? I&rsquo;ll be hanging out on the shiny new <a href="https://discord.gg/WnQhuVF">Plywood Discord server</a>, so feel free to jump in and join the discussion! Otherwise, you can <a href="https://preshing.com/contact">contact me directly.</a></p>

<p><a href="https://discord.gg/WnQhuVF"><img class="center" src="https://preshing.com/images/plywood-discord.svg" /></a></p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Flexible Reflection System in C++: Part 2]]></title>
    <link href="https://preshing.com/20180124/a-flexible-reflection-system-in-cpp-part-2"/>
    <updated>2018-01-24T08:07:00-05:00</updated>
    <id>https://preshing.com/?p=20180124</id>
    <content type="html"><![CDATA[<p>In <a href="http://preshing.com/20180116/a-primitive-reflection-system-in-cpp-part-1">the previous post</a>, I presented a basic system for <strong>runtime reflection</strong> in C++11. The post included a sample project that created a <strong>type descriptor</strong> using a block of macros:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Define Node's type descriptor</span>
REFLECT_STRUCT_BEGIN(Node)
REFLECT_STRUCT_MEMBER(key)
REFLECT_STRUCT_MEMBER(value)
REFLECT_STRUCT_MEMBER(children)
REFLECT_STRUCT_END()
</pre></div>
</div>
</div>

<p>At runtime, the type descriptor was found by calling <code>reflect::TypeResolver&lt;Node&gt;::get()</code>.</p>

<p>This reflection system is small but very flexible. In this post, I&rsquo;ll extend it to support additional built-in types. You can clone the project <a href="https://github.com/preshing/FlexibleReflection/tree/part1">from GitHub</a> to follow along. At the end, I&rsquo;ll discuss other ways to extend the system.</p>

<p><a href="https://github.com/preshing/FlexibleReflection/tree/part1"><img srcset="/images/github-flexible-reflection.png 1x,/images/github-flexible-reflection@2x.png 2x" class="center" src="https://preshing.com/images/github-flexible-reflection.png" /></a></p>

<!--more-->

<h2 id="adding-support-for-double">Adding Support for <code>double</code></h2>

<p>In <a href="https://github.com/preshing/FlexibleReflection/blob/part1/Main.cpp#L6"><code>Main.cpp</code></a>, let&rsquo;s change the definition of <code>Node</code> so that it contains a <code>double</code> instead of an <code>int</code>:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">struct</span> Node {
    std::<span class="predefined-type">string</span> key;
    <span class="highlight"><span class="predefined-type">double</span> value;</span>
    std::vector&lt;Node&gt; children;

    REFLECT()      <span class="comment">// Enable reflection for this type</span>
};
</pre></div>
</div>
</div>

<p>Now, when we build the sample project, we get a link error:</p>

<pre><code>error: unresolved external symbol "reflect::getPrimitiveDescriptor&lt;double&gt;()"
</code></pre>

<p>That&rsquo;s because the reflection system doesn&rsquo;t support <code>double</code> yet. To add support, add the following code near the bottom of <a href="https://github.com/preshing/FlexibleReflection/blob/part1/Primitives.cpp#L40"><code>Primitives.cpp</code></a>, inside the <code>reflect</code> namespace. The highlighted line defines the missing function that the linker complained about.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">//--------------------------------------------------------</span>
<span class="comment">// A type descriptor for double</span>
<span class="comment">//--------------------------------------------------------</span>

<span class="keyword">struct</span> TypeDescriptor_Double : TypeDescriptor {
    TypeDescriptor_Double() : TypeDescriptor{<span class="string"><span class="delimiter">&quot;</span><span class="content">double</span><span class="delimiter">&quot;</span></span>, <span class="keyword">sizeof</span>(<span class="predefined-type">double</span>)} {
    }
    <span class="directive">virtual</span> <span class="directive">void</span> dump(<span class="directive">const</span> <span class="directive">void</span>* obj, <span class="predefined-type">int</span> <span class="comment">/* unused */</span>) <span class="directive">const</span> override {
        std::cout &lt;&lt; <span class="string"><span class="delimiter">&quot;</span><span class="content">double{</span><span class="delimiter">&quot;</span></span> &lt;&lt; *(<span class="directive">const</span> <span class="predefined-type">double</span>*) obj &lt;&lt; <span class="string"><span class="delimiter">&quot;</span><span class="content">}</span><span class="delimiter">&quot;</span></span>;
    }
};

<span class="keyword">template</span> &lt;&gt;
TypeDescriptor* <span class="highlight">getPrimitiveDescriptor&lt;<span class="predefined-type">double</span>&gt;()</span> {
    <span class="directive">static</span> TypeDescriptor_Double typeDesc;
    <span class="keyword">return</span> &amp;typeDesc;
}
</pre></div>
</div>
</div>

<p>Now, when we run the program &ndash; which creates a <code>Node</code> object and dumps it to the console &ndash; we get the following output instead. As expected, members that were previously <code>int</code> are now <code>double</code>.</p>

<p><img srcset="/images/reflection-output-double.png 1x,/images/reflection-output-double@2x.png 2x" class="center" src="https://preshing.com/images/reflection-output-double.png" /></p>

<h3 id="how-it-works">How It Works</h3>

<p>In this system, the type descriptor of every primitive &ldquo;built-in&rdquo; type &ndash; whether it&rsquo;s <code>int</code>, <code>double</code>, <code>std::string</code> or something else &ndash; is found using the <code>getPrimitiveDescriptor&lt;&gt;()</code> function template, declared in <code>Reflect.h</code>:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Declare the function template that handles primitive types such as int, std::string, etc.:</span>
<span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt;
TypeDescriptor* getPrimitiveDescriptor();
</pre></div>
</div>
</div>

<p>That&rsquo;s the primary template. The primary template is not <em>defined</em> anywhere &ndash; only <em>declared</em>. When <code>Main.cpp</code> compiles, the compiler happily generates a call to <code>getPrimitiveDescriptor&lt;double&gt;()</code> without knowing its definition &ndash; just like it would for any other external function. Of course, the linker expects to find the definition of this function at link time. In the above example, the function is defined in <code>Primitives.cpp</code>.</p>

<p>The nice thing about this approach is that <code>getPrimitiveDescriptor&lt;&gt;()</code> can be specialized for any C++ type, and those specializations can be placed in any <code>.cpp</code> file in the program. (They don&rsquo;t all have to go in <code>Primitives.cpp</code>!) For example, in my <a href="http://preshing.com/20171218/how-to-write-your-own-cpp-game-engine/">custom game engine</a>, the graphics library specializes it for <code>VertexBuffer</code>, a class that manages OpenGL vertex buffer objects. As far as the reflection system is concerned, <code>VertexBuffer</code> is a built-in type. It can be used as a member of any class/struct, and reflected just like any other member of that class/struct.</p>

<p>Be aware, however, that when you specialize this template in an arbitrary <code>.cpp</code> file, there are <a href="http://eel.is/c++draft/temp.expl.spec#6">rules</a> that limit the things you&rsquo;re allowed to do in the same <code>.cpp</code> file &ndash; though the compiler <a href="https://stackoverflow.com/questions/36997351/using-a-template-before-its-specialized">may or may not complain.</a></p>
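
<p>Here&rsquo;s a single-file sketch of the arrangement, under a few assumptions: <code>TypeDescriptor</code> is reduced to a hypothetical one-member stand-in, and <code>describeDouble()</code> is a made-up caller playing the role of <code>Main.cpp</code>. One way to stay within the rules mentioned above is to <em>declare</em> the specialization before its first use, which is what this sketch does; the sample project itself doesn&rsquo;t necessarily do that.</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the sample project's TypeDescriptor.
struct TypeDescriptor {
    std::string name;
};

// Primary template: declared, never defined.
template <typename T>
TypeDescriptor* getPrimitiveDescriptor();

// Declaration of the explicit specialization, visible before any use.
template <>
TypeDescriptor* getPrimitiveDescriptor<double>();

// A caller that compiles with only the declarations above in view
// (the role Main.cpp plays in the post).
inline std::string describeDouble() {
    return getPrimitiveDescriptor<double>()->name;
}

// The definition the linker looks for (the role Primitives.cpp plays in the post).
template <>
TypeDescriptor* getPrimitiveDescriptor<double>() {
    static TypeDescriptor typeDesc{"double"};
    return &typeDesc;
}
```

<p>If the last definition is removed, this file still compiles, and the build fails only at link time with an unresolved symbol, just like the error shown earlier.</p>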

<h2 id="adding-support-for-stduniqueptr">Adding Support for <code>std::unique_ptr&lt;&gt;</code></h2>

<p>Let&rsquo;s <a href="https://github.com/preshing/FlexibleReflection/blob/e2545737170e35385930808a07e82771d916c3f4/Main.cpp#L7">change the definition of <code>Node</code></a> again so that it contains a <a href="http://en.cppreference.com/w/cpp/memory/unique_ptr"><code>std::unique_ptr&lt;&gt;</code></a> instead of a <code>std::vector&lt;&gt;</code>.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">struct</span> Node {
    std::<span class="predefined-type">string</span> key;
    <span class="predefined-type">double</span> value;
    <span class="highlight">std::unique_ptr&lt;Node&gt; next;</span>

    REFLECT()       <span class="comment">// Enable reflection for this type</span>
};
</pre></div>
</div>
</div>

<p>This time, we&rsquo;ll have to <a href="https://github.com/preshing/FlexibleReflection/blob/e2545737170e35385930808a07e82771d916c3f4/Main.cpp#L14">initialize the <code>Node</code> object</a> differently:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Create an object of type Node</span>
Node node = {
    <span class="string"><span class="delimiter">&quot;</span><span class="content">apple</span><span class="delimiter">&quot;</span></span>,
    <span class="integer">5</span>,
    std::unique_ptr&lt;Node&gt;{<span class="keyword">new</span> Node{
        <span class="string"><span class="delimiter">&quot;</span><span class="content">banana</span><span class="delimiter">&quot;</span></span>,
        <span class="integer">7</span>,
        std::unique_ptr&lt;Node&gt;{<span class="keyword">new</span> Node{
            <span class="string"><span class="delimiter">&quot;</span><span class="content">cherry</span><span class="delimiter">&quot;</span></span>,
            <span class="integer">11</span>,
            nullptr
        }}
    }}
};
</pre></div>
</div>
</div>

<p>The <a href="https://github.com/preshing/FlexibleReflection/blob/e2545737170e35385930808a07e82771d916c3f4/Main.cpp#L29">block of macros</a> needs to be updated, too:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Define Node's type descriptor</span>
REFLECT_STRUCT_BEGIN(Node)
REFLECT_STRUCT_MEMBER(key)
REFLECT_STRUCT_MEMBER(value)
REFLECT_STRUCT_MEMBER(<span class="highlight">next</span>)
REFLECT_STRUCT_END()
</pre></div>
</div>
</div>

<p>If we build the sample project at this point, we&rsquo;ll encounter a link error, as before.</p>

<pre><code>error: unresolved external symbol "reflect::getPrimitiveDescriptor&lt;std::unique_ptr&lt;Node&gt;&gt;()"
</code></pre>

<p>That&rsquo;s because the system doesn&rsquo;t support <code>std::unique_ptr&lt;&gt;</code> yet &ndash; no surprise there. We want the system to consider <code>std::unique_ptr&lt;&gt;</code> a built-in type. Unlike <code>double</code>, however, <code>std::unique_ptr&lt;&gt;</code> is not a primitive type; it&rsquo;s a template type. In this example, we&rsquo;ve instantiated <code>std::unique_ptr&lt;&gt;</code> for <code>Node</code>, but it could be instantiated for an unlimited number of other types. Each instantiation should have its own type descriptor.</p>

<p>The system looks for <code>std::unique_ptr&lt;Node&gt;</code>&rsquo;s type descriptor the same way it looks for every type descriptor: through the <a href="https://github.com/preshing/FlexibleReflection/blob/2ddb02979ba1e96db0eae328c98ba04f8f6f36c3/Reflect.h#L52-L58"><code>TypeResolver&lt;&gt;</code></a> class template. By default, <code>TypeResolver&lt;&gt;::get()</code> tries to call <code>getPrimitiveDescriptor&lt;&gt;()</code>. We&rsquo;ll override that behavior by writing a <a href="http://en.cppreference.com/w/cpp/language/partial_specialization">partial specialization</a> instead:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Partially specialize TypeResolver&lt;&gt; for std::unique_ptr&lt;&gt;:</span>
<span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt;
<span class="keyword">class</span> <span class="class">TypeResolver</span>&lt;std::unique_ptr&lt;T&gt;&gt; {
<span class="directive">public</span>:
    <span class="directive">static</span> TypeDescriptor* get() {
        <span class="directive">static</span> TypeDescriptor_StdUniquePtr typeDesc{(T*) nullptr};
        <span class="keyword">return</span> &amp;typeDesc;
    }
};
</pre></div>
</div>
</div>

<p>In this partial specialization, <code>get()</code> constructs a new kind of type descriptor: <code>TypeDescriptor_StdUniquePtr</code>. Whenever the system looks for a type descriptor for <code>std::unique_ptr&lt;T&gt;</code> &ndash; for some type <code>T</code> &ndash; the compiler will instantiate a copy of the above <code>get()</code>. Each copy of <code>get()</code> will return a different type descriptor for each <code>T</code>, but the same type descriptor will always be returned for the same <code>T</code>, which is exactly what we want.</p>
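
<p>The &ldquo;one descriptor per <code>T</code>&rdquo; behavior falls out of how function-local statics work in templates, and it can be checked with a stripped-down sketch. Here <code>TypeDescriptor</code> is a hypothetical stand-in holding only a name, and the primary template returns a placeholder descriptor instead of calling <code>getPrimitiveDescriptor&lt;&gt;()</code>, purely to keep the example self-contained.</p>

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for the sample project's TypeDescriptor.
struct TypeDescriptor {
    const char* name;
};

// Primary template. In the sample project this forwards to
// getPrimitiveDescriptor<>(); here it returns a placeholder instead.
template <typename T>
struct TypeResolver {
    static TypeDescriptor* get() {
        static TypeDescriptor typeDesc{"other"};
        return &typeDesc;
    }
};

// Partial specialization: chosen for every std::unique_ptr<T>.
template <typename T>
struct TypeResolver<std::unique_ptr<T>> {
    static TypeDescriptor* get() {
        // Each instantiation of get() owns its own static object,
        // so there is exactly one descriptor per T.
        static TypeDescriptor typeDesc{"std::unique_ptr<>"};
        return &typeDesc;
    }
};
```

<p>With this in place, <code>TypeResolver&lt;std::unique_ptr&lt;int&gt;&gt;::get()</code> returns the same address on every call, and a different address from <code>TypeResolver&lt;std::unique_ptr&lt;double&gt;&gt;::get()</code>.</p>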

<p>I&rsquo;ve implemented full support for <code>std::unique_ptr&lt;&gt;</code> in a <a href="https://github.com/preshing/FlexibleReflection/tree/part2">separate branch on GitHub</a>. The partial specialization is <a href="https://github.com/preshing/FlexibleReflection/blob/8e334f6294aba3a781c78c3ae8c18349e437f903/Reflect.h#L199-L207">located in <code>Reflect.h</code></a> so that it&rsquo;s visible from every source file that needs it. With proper support in place, the sample project successfully dumps our updated <code>Node</code> object to the console.</p>

<p><img srcset="/images/reflection-output-unique_ptr.png 1x,/images/reflection-output-unique_ptr@2x.png 2x" class="center" src="https://preshing.com/images/reflection-output-unique_ptr.png" /></p>

<h3 id="how-it-works-1">How It Works</h3>

<p>In memory, the type descriptor for <code>std::unique_ptr&lt;Node&gt;</code> looks like this. It&rsquo;s an object of type <code>TypeDescriptor_StdUniquePtr</code>, a subclass of <code>TypeDescriptor</code> that holds two extra member variables:</p>

<p><img srcset="/images/reflect-uniqueptr.png 1x,/images/reflect-uniqueptr@2x.png 2x" class="center" src="https://preshing.com/images/reflect-uniqueptr.png" /></p>

<p>Of those two member variables, the more mysterious one is <code>getTarget</code>. <code>getTarget</code> is a pointer to a kind of helper function. It points to an anonymous function that, at runtime, will dereference a particular specialization of <code>std::unique_ptr&lt;&gt;</code>. To understand it better, let&rsquo;s see how it gets initialized.</p>

<p>Here&rsquo;s the constructor for <code>TypeDescriptor_StdUniquePtr</code>. It&rsquo;s actually a <a href="https://stackoverflow.com/questions/3960849/c-template-constructor">template constructor</a>, which means that the compiler will instantiate a new copy of this constructor each time it&rsquo;s called with a different template parameter (specified via the dummy argument). It&rsquo;s called from the partial specialization of <code>TypeResolver&lt;&gt;</code> we saw earlier.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">struct</span> TypeDescriptor_StdUniquePtr : TypeDescriptor {
    TypeDescriptor* targetType;
    <span class="directive">const</span> <span class="directive">void</span>* (*getTarget)(<span class="directive">const</span> <span class="directive">void</span>*);

    <span class="comment">// Template constructor:</span>
    <span class="keyword">template</span> &lt;<span class="keyword">typename</span> TargetType&gt;
    TypeDescriptor_StdUniquePtr(TargetType* <span class="comment">/* dummy argument */</span>)
        : TypeDescriptor{<span class="string"><span class="delimiter">&quot;</span><span class="content">std::unique_ptr&lt;&gt;</span><span class="delimiter">&quot;</span></span>, <span class="keyword">sizeof</span>(std::unique_ptr&lt;TargetType&gt;)},
                         targetType{TypeResolver&lt;TargetType&gt;::get()} {
        getTarget = [](<span class="directive">const</span> <span class="directive">void</span>* uniquePtrPtr) -&gt; <span class="directive">const</span> <span class="directive">void</span>* {
            <span class="directive">const</span> <span class="directive">auto</span>&amp; uniquePtr = *(<span class="directive">const</span> std::unique_ptr&lt;TargetType&gt;*) uniquePtrPtr;
            <span class="keyword">return</span> uniquePtr.get();
        };
    }
    ...
</pre></div>
</div>
</div>

<p>Things get a little complex here, but as you can hopefully see, <code>getTarget</code> is initialized to a (captureless) <a href="https://en.wikipedia.org/wiki/Anonymous_function#C++_(since_C++11)">lambda expression</a>. Basically, <code>getTarget</code> points to an anonymous function that casts its argument to a <code>std::unique_ptr&lt;&gt;</code> of the expected type, then dereferences it using <a href="http://en.cppreference.com/w/cpp/memory/unique_ptr/get"><code>std::unique_ptr&lt;&gt;::get()</code></a>. The anonymous function takes a <code>const void*</code> argument because the struct <code>TypeDescriptor_StdUniquePtr</code> can be used to describe <em>any</em> specialization of <code>std::unique_ptr&lt;&gt;</code>. The function itself knows which specialization to expect.</p>

<p>Moreover, because the lambda expression is evaluated inside a template constructor, the compiler will generate a <em>different</em> anonymous function for each specialization of <code>std::unique_ptr&lt;&gt;</code>. That&rsquo;s important, because we don&rsquo;t know how <code>std::unique_ptr&lt;&gt;</code> is implemented by the standard library. All we can do is generate these anonymous functions to help us deal with every possible specialization.</p>
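
<p>The mechanism can be isolated in a smaller sketch. <code>UniquePtrAccessor</code> and <code>readIntThrough()</code> are hypothetical names invented for illustration; the struct keeps only the function pointer and drops everything else a real type descriptor would hold. It also uses <code>static_cast</code> where the sample project uses C-style casts.</p>

```cpp
#include <cassert>
#include <memory>

// Hypothetical, stripped-down descriptor: only the type-erased accessor.
struct UniquePtrAccessor {
    const void* (*getTarget)(const void*);

    // Template constructor: TargetType is deduced from the dummy argument.
    template <typename TargetType>
    explicit UniquePtrAccessor(TargetType* /* dummy argument */) {
        // A captureless lambda converts to a plain function pointer.
        // One distinct function is generated per TargetType.
        getTarget = [](const void* uniquePtrPtr) -> const void* {
            const auto& uniquePtr =
                *static_cast<const std::unique_ptr<TargetType>*>(uniquePtrPtr);
            return uniquePtr.get();
        };
    }
};

// Generic-looking caller: it sees only void pointers, and trusts that the
// accessor was built for std::unique_ptr<int>.
inline int readIntThrough(const UniquePtrAccessor& accessor, const void* uniquePtrPtr) {
    return *static_cast<const int*>(accessor.getTarget(uniquePtrPtr));
}
```

<p>Given <code>std::unique_ptr&lt;int&gt; p{new int{42}}</code> and <code>UniquePtrAccessor acc{(int*) nullptr}</code>, the call <code>acc.getTarget(&amp;p)</code> returns the same address as <code>p.get()</code>, without the caller ever naming <code>std::unique_ptr&lt;int&gt;</code>.</p>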

<p>With all of that in place, the implementation of <code>TypeDescriptor_StdUniquePtr::dump()</code>, which helps dump the object to the console, is much more straightforward. You can <a href="https://github.com/preshing/FlexibleReflection/blob/8e334f6294aba3a781c78c3ae8c18349e437f903/Reflect.h#L183-L196">view the implementation here</a>. It&rsquo;s written in a generic way: The same function is used by all <code>std::unique_ptr&lt;&gt;</code> type descriptors, using <code>getTarget</code> to handle the differences between specializations.</p>

<p>Incidentally, <code>TypeDescriptor_StdVector</code> is implemented in much the same way as <code>TypeDescriptor_StdUniquePtr</code>. The main difference is that, instead of having one anonymous helper function, <code>TypeDescriptor_StdVector</code> has two: one that returns the number of elements in a <code>std::vector&lt;&gt;</code>, and another that returns a pointer to a specific element. You can see how both helper functions are initialized <a href="https://github.com/preshing/FlexibleReflection/blob/8e334f6294aba3a781c78c3ae8c18349e437f903/Reflect.h#L123-L130">here</a>.</p>
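
<p>To make the two-helper idea concrete, here&rsquo;s a minimal, self-contained sketch. It is not the code from the sample project: the struct name <code>VectorDescriptorSketch</code> and the free function <code>countItems</code> are made up for illustration, and the helper names <code>getSize</code> and <code>getItem</code> are assumptions. It only shows how captureless lambdas inside a constructor template can yield one pair of plain function pointers per item type.</p>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a std::vector<> type descriptor.
// The member names getSize and getItem are assumptions for this sketch.
struct VectorDescriptorSketch {
    std::size_t (*getSize)(const void*);
    const void* (*getItem)(const void*, std::size_t);

    // The unused pointer argument exists only so ItemType can be deduced.
    template <typename ItemType>
    explicit VectorDescriptorSketch(ItemType*) {
        // Captureless lambdas convert implicitly to plain function pointers.
        // The compiler emits a separate pair of functions for each ItemType.
        getSize = [](const void* vecPtr) -> std::size_t {
            const auto& vec = *static_cast<const std::vector<ItemType>*>(vecPtr);
            return vec.size();
        };
        getItem = [](const void* vecPtr, std::size_t index) -> const void* {
            const auto& vec = *static_cast<const std::vector<ItemType>*>(vecPtr);
            return &vec[index];
        };
    }
};

// Generic code: counts the elements without knowing the item type.
inline std::size_t countItems(const VectorDescriptorSketch& desc, const void* vecPtr) {
    return desc.getSize(vecPtr);
}
```

<p>A caller would construct the descriptor once per item type, for example <code>VectorDescriptorSketch desc(static_cast&lt;int*&gt;(nullptr))</code>, and from then on pass vectors around as <code>const void*</code>.</p>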

<h2 id="summary-of-how-type-descriptors-are-found">Summary of How Type Descriptors are Found</h2>

<p>As we&rsquo;ve seen, a call to <code>reflect::TypeResolver&lt;T&gt;::get()</code> will return a type descriptor for any reflected type <code>T</code>, whether it&rsquo;s a built-in primitive type, a built-in template type, or a user-defined class or struct. In summary, the compiler resolves the call as follows:</p>

<p><img srcset="/images/reflection-compiler-resolve.png 1x,/images/reflection-compiler-resolve@2x.png 2x" class="center" src="https://preshing.com/images/reflection-compiler-resolve.png" /></p>

<h2 id="further-improvements">Further Improvements</h2>

<p>In the <a href="https://github.com/preshing/FlexibleReflection/tree/part2">FlexibleReflection</a> sample project, type descriptors are useful because they implement virtual functions like <code>getFullName()</code> and <code>dump()</code>. In my real reflection system, however, type descriptors are mainly used to serialize to (and from) a custom binary format. Instead of virtual functions, the serialization API is exposed through a pointer to an explicit table of function pointers. I call this table the <code>TypeKey</code>. For example, the real <code>TypeDescriptor_Struct</code> looks something like this:</p>

<p><img srcset="/images/reflect-realdescriptor.png 1x,/images/reflect-realdescriptor@2x.png 2x" class="center" src="https://preshing.com/images/reflect-realdescriptor.png" /></p>

<p>One benefit of the <code>TypeKey</code> object is that its address serves as an identifier for the <em>kind</em> of type descriptor it is. There&rsquo;s no need to define a separate enum. For example, all <code>TypeDescriptor_Struct</code> objects point to the same <code>TypeKey</code>.</p>
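
<p>As a rough sketch of that idea, and assuming a much smaller table than the real one, the address comparison could look like this. Everything here except the name <code>TypeKey</code> is hypothetical, including the <code>getKindName</code> entry and the <code>isStruct</code> helper.</p>

```cpp
// Hypothetical table of function pointers. The real table exposes a
// serialization API; this single entry is an assumption for the sketch.
struct TypeKey {
    const char* (*getKindName)();
};

// One TypeKey object per kind of type descriptor.
const TypeKey TypeKey_Struct{ []() -> const char* { return "struct"; } };
const TypeKey TypeKey_Vector{ []() -> const char* { return "vector"; } };

// Simplified descriptor that only stores a pointer to its TypeKey.
struct TypeDescriptor {
    const TypeKey* typeKey;
};

// The address of the TypeKey identifies the kind; no separate enum is needed.
inline bool isStruct(const TypeDescriptor& desc) {
    return desc.typeKey == &TypeKey_Struct;
}
```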

<p>You can also see that the type descriptor has function pointers to help <code>construct</code> and <code>destruct</code> objects of the underlying type. These functions, too, are generated by lambda expressions inside a function template. The serializer uses them to create new objects at load time. You can even add helper functions to manipulate dynamic arrays and maps.</p>

<p>Perhaps the biggest weakness of this reflection system is that it relies on preprocessor macros. In particular, a block of <code>REFLECT_STRUCT_*()</code> macros is needed to reflect the member variables of a class, and it&rsquo;s easy to forget to keep this block of macros up-to-date. To prevent such mistakes, you could collect a list of class members automatically using <a href="https://clang.llvm.org/docs/Tooling.html">libclang</a>, a custom header file parser (like <a href="https://docs.unrealengine.com/latest/INT/Programming/UnrealBuildSystem/#unrealheadertool">Unreal</a>), or a data definition language (like <a href="http://doc.qt.io/archives/qt-4.8/moc.html">Qt moc</a>). For my part, I&rsquo;m using a simple approach: A small Python script reads class and member names from clearly-marked sections in each header file, then injects the corresponding <code>REFLECT_STRUCT_*()</code> macros into the corresponding <code>.cpp</code> files. It took a single day to write this script, and it runs in a fraction of a second.</p>

<p>I developed this reflection system for <a href="http://preshing.com/20171218/how-to-write-your-own-cpp-game-engine">a custom game engine I&rsquo;m working on</a>, all to create a little game called <strong>Hop Out</strong>. The reflection system has proven invaluable so far. It&rsquo;s used heavily in both my serializer and 3D renderer, and it&rsquo;s currently reflecting 134 classes that are constantly in flux, with more being added all the time. I doubt I could manage all that data without this system!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Flexible Reflection System in C++: Part 1]]></title>
    <link href="https://preshing.com/20180116/a-primitive-reflection-system-in-cpp-part-1"/>
    <updated>2018-01-16T09:21:00-05:00</updated>
    <id>https://preshing.com/?p=20180116</id>
    <content type="html"><![CDATA[<p>In this post, I&rsquo;ll present a small, flexible system for <strong>runtime reflection</strong> using C++11 language features. This is a system to generate <a href="https://en.wikipedia.org/wiki/Metadata">metadata</a> for C++ types. The metadata takes the form of <code>TypeDescriptor</code> objects, created at runtime, that describe the structure of other runtime objects.</p>

<p><img srcset="/images/type-descriptors.png 1x,/images/type-descriptors@2x.png 2x" class="center" src="https://preshing.com/images/type-descriptors.png" /></p>

<p>I&rsquo;ll call these objects <strong>type descriptors</strong>. My initial motivation for writing a reflection system was to support <strong>serialization</strong> in my <a href="http://preshing.com/20171218/how-to-write-your-own-cpp-game-engine">custom C++ game engine</a>, since I have very specific needs. Once that worked, I began to use runtime reflection for other engine features, too:</p>

<!--more-->
<ul>
  <li><strong>3D rendering</strong>: Every time the game engine draws something using OpenGL ES, it uses reflection to pass uniform parameters and describe vertex formats to the API. It makes graphics programming much more productive!</li>
  <li><strong>Importing JSON</strong>: The engine&rsquo;s asset pipeline has a generic routine to synthesize a C++ object from a JSON file and a type descriptor. It&rsquo;s used to import 3D models, level definitions and other assets.</li>
</ul>

<p>This reflection system is based on preprocessor macros and templates. C++, at least in its current form, was not designed to make runtime reflection easy. As anyone who&rsquo;s written one knows, it&rsquo;s tough to design a reflection system that&rsquo;s easy to use, easily extended, and that actually works. I was burned many times by obscure language rules, order-of-initialization bugs and corner cases before settling on the system I have today.</p>

<p>To illustrate how it works, I&rsquo;ve published a sample project <a href="https://github.com/preshing/FlexibleReflection/tree/part1">on GitHub:</a></p>

<p><a href="https://github.com/preshing/FlexibleReflection/tree/part1"><img srcset="/images/github-flexible-reflection.png 1x,/images/github-flexible-reflection@2x.png 2x" class="center" src="https://preshing.com/images/github-flexible-reflection.png" /></a></p>

<p>This sample doesn&rsquo;t actually use my game engine&rsquo;s reflection system. It uses a tiny reflection system of its own, but the most interesting part &ndash; the way type descriptors are <strong>created</strong>, <strong>structured</strong> and <strong>found</strong> &ndash; is almost identical. That&rsquo;s the part I&rsquo;ll focus on in this post. In the next post, I&rsquo;ll discuss how the system can be extended.</p>

<p>This post is meant for programmers who are interested in how to <em>develop</em> a runtime reflection system, not just use one. It touches on many advanced features of C++, but the sample project is only 242 lines of code, so hopefully, with some persistence, any determined C++ programmer can follow along. If you&rsquo;re more interested in using an existing solution, take a look at <a href="http://www.rttr.org/">RTTR</a>.</p>

<h2 id="demonstration">Demonstration</h2>

<p>In <code>Main.cpp</code>, the sample project defines a struct named <code>Node</code>. The <code>REFLECT()</code> macro tells the system to enable reflection for this type.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">struct</span> Node {
    std::<span class="predefined-type">string</span> key;
    <span class="predefined-type">int</span> value;
    std::vector&lt;Node&gt; children;

    REFLECT()      <span class="comment">// Enable reflection for this type</span>
};
</pre></div>
</div>
</div>

<p>At runtime, the sample creates an object of type <code>Node</code>.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Create an object of type Node</span>
Node node = {<span class="string"><span class="delimiter">&quot;</span><span class="content">apple</span><span class="delimiter">&quot;</span></span>, <span class="integer">3</span>, {{<span class="string"><span class="delimiter">&quot;</span><span class="content">banana</span><span class="delimiter">&quot;</span></span>, <span class="integer">7</span>, {}}, {<span class="string"><span class="delimiter">&quot;</span><span class="content">cherry</span><span class="delimiter">&quot;</span></span>, <span class="integer">11</span>, {}}}};
</pre></div>
</div>
</div>

<p>In memory, the <code>Node</code> object looks something like this:</p>

<p><img srcset="/images/reflect-node.png 1x,/images/reflect-node@2x.png 2x" class="center" src="https://preshing.com/images/reflect-node.png" /></p>

<p>Next, the sample finds <code>Node</code>&rsquo;s type descriptor. For this to work, the following macros must be placed in a <code>.cpp</code> file somewhere. I put them in <code>Main.cpp</code>, but they could be placed in any file from which the definition of <code>Node</code> is visible.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Define Node's type descriptor</span>
REFLECT_STRUCT_BEGIN(Node)
REFLECT_STRUCT_MEMBER(key)
REFLECT_STRUCT_MEMBER(value)
REFLECT_STRUCT_MEMBER(children)
REFLECT_STRUCT_END()
</pre></div>
</div>
</div>

<p><code>Node</code>&rsquo;s member variables are now said to be <strong>reflected</strong>.</p>

<p>A pointer to <code>Node</code>&rsquo;s type descriptor is obtained by calling <code>reflect::TypeResolver&lt;Node&gt;::get()</code>:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Find Node's type descriptor</span>
reflect::TypeDescriptor* typeDesc = reflect::TypeResolver&lt;Node&gt;::get();
</pre></div>
</div>
</div>

<p>Having found the type descriptor, the sample uses it to dump a description of the <code>Node</code> object to the console.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Dump a description of the Node object to the console</span>
typeDesc-&gt;dump(&amp;node);
</pre></div>
</div>
</div>

<p>This produces the following output:</p>

<p><img srcset="/images/reflection-output.png 1x,/images/reflection-output@2x.png 2x" class="center" src="https://preshing.com/images/reflection-output.png" /></p>

<h2 id="how-the-macros-are-implemented">How the Macros Are Implemented</h2>

<p>When you add the <code>REFLECT()</code> macro to a struct or a class, it declares two additional static members: <code>Reflection</code>, the struct&rsquo;s type descriptor, and <code>initReflection</code>, a function to initialize it. Effectively, when the macro is expanded, the complete <code>Node</code> struct looks like this:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">struct</span> Node {
    std::<span class="predefined-type">string</span> key;
    <span class="predefined-type">int</span> value;
    std::vector&lt;Node&gt; children;

    <span class="comment">// Declare the struct's type descriptor:</span>
    <span class="directive">static</span> reflect::TypeDescriptor_Struct <span class="highlight">Reflection</span>;

    <span class="comment">// Declare a function to initialize it:</span>
    <span class="directive">static</span> <span class="directive">void</span> <span class="highlight">initReflection</span>(reflect::TypeDescriptor_Struct*);
};
</pre></div>
</div>
</div>

<p>Similarly, the block of <code>REFLECT_STRUCT_*()</code> macros in <code>Main.cpp</code> looks like this when expanded:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Definition of the struct's type descriptor:</span>
reflect::TypeDescriptor_Struct Node::<span class="highlight">Reflection</span>{Node::initReflection};

<span class="comment">// Definition of the function that initializes it:</span>
<span class="directive">void</span> Node::<span class="highlight">initReflection</span>(reflect::TypeDescriptor_Struct* typeDesc) {
    <span class="directive">using</span> T = Node;
    typeDesc-&gt;name = <span class="string"><span class="delimiter">&quot;</span><span class="content">Node</span><span class="delimiter">&quot;</span></span>;
    typeDesc-&gt;size = <span class="keyword">sizeof</span>(T);
    typeDesc-&gt;members = {
        {<span class="string"><span class="delimiter">&quot;</span><span class="content">key</span><span class="delimiter">&quot;</span></span>, offsetof(T, key), reflect::TypeResolver&lt;decltype(T::key)&gt;::get()},
        {<span class="string"><span class="delimiter">&quot;</span><span class="content">value</span><span class="delimiter">&quot;</span></span>, offsetof(T, value), reflect::TypeResolver&lt;decltype(T::value)&gt;::get()},
        {<span class="string"><span class="delimiter">&quot;</span><span class="content">children</span><span class="delimiter">&quot;</span></span>, offsetof(T, children), reflect::TypeResolver&lt;decltype(T::children)&gt;::get()},
    };
}
</pre></div>
</div>
</div>

<p>Now, because <code>Node::Reflection</code> is a static member variable, its constructor, which accepts a pointer to <code>initReflection()</code>, is automatically called at program startup. You might be wondering: Why pass a function pointer to the constructor? Why not pass an <a href="http://en.cppreference.com/w/cpp/language/initializer_list">initializer list</a> instead? The reason is that the body of the function gives us a place to declare a C++11 <a href="http://en.cppreference.com/w/cpp/language/type_alias">type alias</a>: <code>using T = Node</code>. Without the type alias, we&rsquo;d have to pass the identifier <code>Node</code> as an extra argument to every <code>REFLECT_STRUCT_MEMBER()</code> macro. The macros wouldn&rsquo;t be as easy to use.</p>

<p>As you can see, inside the function, there are three additional calls to <code>reflect::TypeResolver&lt;&gt;::get()</code>. Each one finds the type descriptor for a reflected member of <code>Node</code>. These calls use C++11&rsquo;s <a href="http://en.cppreference.com/w/cpp/language/decltype"><code>decltype</code> specifier</a> to automatically pass the correct type to the <code>TypeResolver</code> template.</p>
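
<p>The role of the type alias is easier to see in isolation. The following sketch is not taken from the sample project: <code>Point</code>, <code>MemberInfo</code>, <code>SKETCH_MEMBER</code> and <code>pointMembers</code> are all hypothetical, and the record stores a member&rsquo;s size where a real type descriptor would store a pointer to another descriptor. It assumes a simple standard-layout struct so that <code>offsetof</code> is fully portable.</p>

```cpp
#include <cstddef>
#include <type_traits>

struct Point {
    int x;
    float y;
};

// Hypothetical record describing one member.
struct MemberInfo {
    const char* name;
    std::size_t offset;
    std::size_t size;
};

// The macro takes only the member name. It relies on an alias named T
// being in scope at the point of expansion.
#define SKETCH_MEMBER(name) \
    { #name, offsetof(T, name), sizeof(decltype(T::name)) }

inline const MemberInfo* pointMembers() {
    using T = Point;  // the function body gives us a place for this alias
    static const MemberInfo members[] = {
        SKETCH_MEMBER(x),
        SKETCH_MEMBER(y),
    };
    return members;
}

// decltype on a member name yields the member's declared type.
static_assert(std::is_same<decltype(Point::y), float>::value,
              "decltype(Point::y) is float");
```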

<h2 id="finding-typedescriptors">Finding TypeDescriptors</h2>

<p>(Note that everything in this section is defined in the <code>reflect</code> namespace.)</p>

<p><code>TypeResolver</code> is a <strong>class template</strong>. When you call <code>TypeResolver&lt;T&gt;::get()</code> for a particular type <code>T</code>, the compiler instantiates a function that returns the corresponding <code>TypeDescriptor</code> for <code>T</code>. It works for reflected structs as well as for every reflected member of those structs. By default, this happens through the primary template, highlighted below.</p>

<p>With the primary template, if <code>T</code> is a struct (or a class) that contains the <code>REFLECT()</code> macro, like <code>Node</code>, <code>get()</code> will return a pointer to that struct&rsquo;s <code>Reflection</code> member &ndash; which is what we want. For every other type <code>T</code>, <code>get()</code> instead calls <code>getPrimitiveDescriptor&lt;T&gt;</code> &ndash; a <strong>function template</strong> that handles primitive types such as <code>int</code> or <code>std::string</code>.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Declare the function template that handles primitive types such as int, std::string, etc.:</span>
<span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt;
TypeDescriptor* getPrimitiveDescriptor();

<span class="comment">// A helper class to find TypeDescriptors in different ways:</span>
<span class="keyword">struct</span> DefaultResolver {
    ...

    <span class="comment">// This version is called if T has a static member variable named &quot;Reflection&quot;:</span>
    <span class="keyword">template</span> &lt;<span class="keyword">typename</span> T, <span class="comment">/* SFINAE stuff here */</span>&gt;
    <span class="directive">static</span> TypeDescriptor* get() {
        <span class="keyword">return</span> &amp;T::Reflection;
    }

    <span class="comment">// This version is called otherwise:</span>
    <span class="keyword">template</span> &lt;<span class="keyword">typename</span> T, <span class="comment">/* SFINAE stuff here */</span>&gt;
    <span class="directive">static</span> TypeDescriptor* get() {
        <span class="keyword">return</span> getPrimitiveDescriptor&lt;T&gt;();
    }
};

<span class="comment">// This is the primary class template for finding all TypeDescriptors:</span>
<span class="keyword">template</span> <span class="highlight">&lt;<span class="keyword">typename</span> T&gt;</span>
<span class="keyword">struct</span> <span class="highlight">TypeResolver</span> {
    <span class="directive">static</span> TypeDescriptor* <span class="highlight">get</span>() {
        <span class="keyword">return</span> DefaultResolver::get&lt;T&gt;();
    }
};
</pre></div>
</div>
</div>

<p>This bit of compile-time logic &ndash; generating different code depending on whether a static member variable is present in <code>T</code> &ndash; is achieved using <a href="http://en.cppreference.com/w/cpp/language/sfinae">SFINAE</a>. I omitted the SFINAE code from the above snippet because, quite frankly, it&rsquo;s ugly. You can check the actual implementation <a href="https://github.com/preshing/FlexibleReflection/blob/a1c5a518e000383a89aca61116329d6fc09a6b3c/Reflect.h#L30-L50">in the source code</a>. Part of it could be rewritten more elegantly using <a href="http://en.cppreference.com/w/cpp/language/if#Constexpr_If"><code>if constexpr</code></a>, but I&rsquo;m targeting C++11. Even then, the part that detects whether <code>T</code> has a specific member variable will remain ugly, at least until C++ adopts <a href="https://meetingcpp.com/blog/items/reflections-on-the-reflection-proposals.html">static reflection</a>. In the meantime, however &ndash; it works!</p>
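
<p>For readers who&rsquo;d rather not click through, here is one way the detection can be written in C++11. Treat it as a sketch of the general technique rather than a copy of the project&rsquo;s code: the helper name <code>probe</code>, the stripped-down <code>TypeDescriptor</code>, the <code>Widget</code> struct and the <code>int</code> specialization are all assumptions made to keep the example self-contained. The linked source remains the authoritative version.</p>

```cpp
#include <type_traits>

// Stripped-down stand-in for the real TypeDescriptor.
struct TypeDescriptor { const char* name; };

// Declared for all T, but defined only for types we choose to support.
template <typename T>
TypeDescriptor* getPrimitiveDescriptor();

template <>
inline TypeDescriptor* getPrimitiveDescriptor<int>() {
    static TypeDescriptor desc{"int"};
    return &desc;
}

struct DefaultResolver {
    // Viable only when &T::Reflection is a valid expression.
    template <typename T> static char probe(decltype(&T::Reflection));
    // Fallback that accepts anything; a worse match than the overload above.
    template <typename T> static int probe(...);

    template <typename T>
    struct IsReflected {
        static const bool value = sizeof(probe<T>(nullptr)) == sizeof(char);
    };

    // Enabled if T has a static member named "Reflection":
    template <typename T,
              typename std::enable_if<IsReflected<T>::value, int>::type = 0>
    static TypeDescriptor* get() { return &T::Reflection; }

    // Enabled otherwise:
    template <typename T,
              typename std::enable_if<!IsReflected<T>::value, int>::type = 0>
    static TypeDescriptor* get() { return getPrimitiveDescriptor<T>(); }
};

// A hypothetical reflected type, written without the REFLECT() macro.
struct Widget {
    static TypeDescriptor Reflection;
};
TypeDescriptor Widget::Reflection{"Widget"};
```

<p>With these definitions, <code>DefaultResolver::get&lt;Widget&gt;()</code> resolves to the first overload and <code>DefaultResolver::get&lt;int&gt;()</code> to the second, all decided at compile time.</p>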

<h2 id="the-structure-of-typedescriptors">The Structure of TypeDescriptors</h2>

<p>In the sample project, every <code>TypeDescriptor</code> has a name, a size, and a couple of virtual functions:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">struct</span> TypeDescriptor {
    <span class="directive">const</span> <span class="predefined-type">char</span>* <span class="highlight">name</span>;
    size_t <span class="highlight">size</span>;

    TypeDescriptor(<span class="directive">const</span> <span class="predefined-type">char</span>* name, size_t size) : name{name}, size{size} {}
    <span class="directive">virtual</span> ~TypeDescriptor() {}
    <span class="directive">virtual</span> std::<span class="predefined-type">string</span> <span class="highlight">getFullName</span>() <span class="directive">const</span> { <span class="keyword">return</span> name; }
    <span class="directive">virtual</span> <span class="directive">void</span> <span class="highlight">dump</span>(<span class="directive">const</span> <span class="directive">void</span>* obj, <span class="predefined-type">int</span> indentLevel = <span class="integer">0</span>) <span class="directive">const</span> = <span class="integer">0</span>;
};
</pre></div>
</div>
</div>

<p>The sample project never creates <code>TypeDescriptor</code> objects directly. Instead, the system creates objects of types derived from <code>TypeDescriptor</code>. That way, every type descriptor can hold extra information depending on, well, the <em>kind</em> of type descriptor it is.</p>

<p>For example, the actual type of the object returned by <code>TypeResolver&lt;Node&gt;::get()</code> is <code>TypeDescriptor_Struct</code>. It has one additional member variable, <code>members</code>, that holds information about every reflected member of <code>Node</code>. For each reflected member, there&rsquo;s a pointer to another <code>TypeDescriptor</code>. Here&rsquo;s what the whole thing looks like in memory. I&rsquo;ve circled the various <code>TypeDescriptor</code> subclasses in red:</p>

<p><img srcset="/images/reflect-typedescs.png 1x,/images/reflect-typedescs@2x.png 2x" class="center" src="https://preshing.com/images/reflect-typedescs.png" /></p>

<p>At runtime, you can get the full name of any type by calling <code>getFullName()</code> on its type descriptor. Most subclasses simply use the base class implementation of <code>getFullName()</code>, which returns <code>TypeDescriptor::name</code>. The only exception, in this example, is <code>TypeDescriptor_StdVector</code>, a subclass that describes <code>std::vector&lt;&gt;</code> specializations. In order to return a full type name, such as <code>"std::vector&lt;Node&gt;"</code>, it keeps a pointer to the type descriptor of its item type. You can see this in the above memory diagram: There&rsquo;s a <code>TypeDescriptor_StdVector</code> object whose <code>itemType</code> member points all the way back to the type descriptor for <code>Node</code>.</p>
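
<p>Here&rsquo;s a condensed sketch of that override. It leaves out <code>size</code>, <code>dump()</code> and the helper functions, so <code>TypeDescriptor</code> is concrete here even though it&rsquo;s abstract in the sample project, and the constructor signatures are simplified assumptions.</p>

```cpp
#include <string>

// Simplified base class: only the parts needed for getFullName().
struct TypeDescriptor {
    const char* name;

    explicit TypeDescriptor(const char* name) : name{name} {}
    virtual ~TypeDescriptor() {}
    virtual std::string getFullName() const { return name; }
};

// Describes a std::vector<> specialization by pointing at its item type.
struct TypeDescriptor_StdVector : TypeDescriptor {
    TypeDescriptor* itemType;

    explicit TypeDescriptor_StdVector(TypeDescriptor* itemType)
        : TypeDescriptor{"std::vector<>"}, itemType{itemType} {}

    // Builds the full name recursively from the item type's full name.
    std::string getFullName() const override {
        return std::string("std::vector<") + itemType->getFullName() + ">";
    }
};
```

<p>Because the call to <code>itemType-&gt;getFullName()</code> is virtual, nested vectors work without any extra code.</p>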

<p>Of course, type descriptors only describe <em>types</em>. For a complete description of a runtime object, we need both a type descriptor and a pointer to the object itself.</p>

<p>Note that <code>TypeDescriptor::dump()</code> accepts a pointer to the object as <code>const void*</code>. That&rsquo;s because the abstract <code>TypeDescriptor</code> interface is meant to deal with <em>any</em> type of object. The subclassed implementation knows what type to expect. For example, here&rsquo;s the implementation of <code>TypeDescriptor_StdString::dump()</code>. It casts the <code>const void*</code> to <code>const std::string*</code>.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="directive">virtual</span> <span class="directive">void</span> dump(<span class="directive">const</span> <span class="directive">void</span>* obj, <span class="predefined-type">int</span> <span class="comment">/*unused*/</span>) <span class="directive">const</span> override {
    std::cout &lt;&lt; <span class="string"><span class="delimiter">&quot;</span><span class="content">std::string{</span><span class="char">\&quot;</span><span class="delimiter">&quot;</span></span> &lt;&lt; *(<span class="directive">const</span> std::<span class="predefined-type">string</span>*) obj &lt;&lt; <span class="string"><span class="delimiter">&quot;</span><span class="char">\&quot;</span><span class="content">}</span><span class="delimiter">&quot;</span></span>;
}
</pre></div>
</div>
</div>

<p>You might wonder whether it&rsquo;s safe to cast <code>void</code> pointers in this way. Clearly, if an invalid pointer is passed in, the program is likely to crash. That&rsquo;s why, in my game engine, objects represented by <code>void</code> pointers always travel around with their type descriptors in pairs. By representing objects this way, it&rsquo;s possible to write many kinds of generic algorithms.</p>
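
<p>As an illustration only, such a pair might be modeled like this. The names <code>AnyObject</code>, <code>TypeDescriptor_Int</code> and <code>describe</code> are hypothetical, and unlike the sample project, <code>dump()</code> here writes to a caller-supplied stream instead of the console so the result is easy to inspect.</p>

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Simplified interface; the stream parameter is an assumption for this sketch.
struct TypeDescriptor {
    virtual ~TypeDescriptor() {}
    virtual void dump(const void* obj, std::ostream& out) const = 0;
};

// The subclass knows which concrete type to expect behind the void pointer.
struct TypeDescriptor_Int : TypeDescriptor {
    void dump(const void* obj, std::ostream& out) const override {
        out << "int{" << *static_cast<const int*>(obj) << "}";
    }
};

// Hypothetical pair: an untyped pointer that always travels with its descriptor.
struct AnyObject {
    const void* ptr;
    const TypeDescriptor* type;
};

// A generic algorithm: works for any object, given only the pair.
inline std::string describe(const AnyObject& object) {
    std::ostringstream out;
    object.type->dump(object.ptr, out);
    return out.str();
}
```

<p>The pair doesn&rsquo;t make the cast checked; it just keeps the pointer and the knowledge of its type in one place, so generic code never has to guess.</p>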

<p>In the sample project, dumping objects to the console is the only functionality implemented, but you can imagine how type descriptors could serve as a framework for serializing to a binary format instead.</p>

<p>In the next post, I&rsquo;ll explain how to add built-in types to the reflection system, and what the &ldquo;anonymous functions&rdquo; are for in the above diagram. I&rsquo;ll also discuss other ways to extend the system.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[How to Write Your Own C++ Game Engine]]></title>
    <link href="https://preshing.com/20171218/how-to-write-your-own-cpp-game-engine"/>
    <updated>2017-12-18T07:54:00-05:00</updated>
    <id>https://preshing.com/?p=20171218</id>
    <content type="html"><![CDATA[<p>Lately I&rsquo;ve been writing a game engine in C++. I&rsquo;m using it to make a little mobile game called <strong>Hop Out</strong>. Here&rsquo;s a clip captured from my iPhone 6. (Unmute for sound!)</p>

<video width="504" height="284" autoplay="" loop="" controls="" muted="">
  <source src="https://preshing.com/images/hopoutclip.mp4" type="video/mp4" />
  <source src="https://preshing.com/images/hopoutclip.webm" type="video/webm" />
  <source src="https://preshing.com/images/hopoutclip.ogv" type="video/ogg" />
<img class="center" src="https://preshing.com/images/hopout-snap.jpg" />
</video>

<p>Hop Out is the kind of game I want to play: Retro arcade gameplay with a 3D cartoon look. The goal is to change the color of every pad, like in Q*Bert.</p>

<p>Hop Out is still in development, but the engine powering it is starting to become quite mature, so I thought I&rsquo;d share a few tips about engine development here.</p>

<!--more-->
<p>Why would you want to write a game engine? There are many possible reasons:</p>

<ul>
  <li>You&rsquo;re a tinkerer. You love building systems from the ground up and seeing them come to life.</li>
  <li>You want to learn more about game development. I spent 14 years in the game industry and I&rsquo;m still figuring it out. I wasn&rsquo;t even sure I could write an engine from scratch, since it&rsquo;s vastly different from the daily responsibilities of a programming job at a big studio. I wanted to find out.</li>
  <li>You like control. It&rsquo;s satisfying to organize the code exactly the way you want, knowing where everything is at all times.</li>
  <li>You feel inspired by classic game engines like <a href="https://en.wikipedia.org/wiki/Adventure_Game_Interpreter">AGI</a> (1984), <a href="https://en.wikipedia.org/wiki/Doom_engine">id Tech 1</a> (1993), <a href="https://en.wikipedia.org/wiki/Build_(game_engine)">Build</a> (1995), and industry giants like Unity and Unreal.</li>
  <li>You believe that we, the game industry, should try to demystify the engine development process. It&rsquo;s not like we&rsquo;ve mastered the art of making games. Far from it! The more we examine this process, the greater our chances of improving upon it.</li>
</ul>

<p>The gaming platforms of 2017 &ndash; mobile, console and PC &ndash; are very powerful and, in many ways, quite similar to one another. Game engine development is not so much about struggling with weak and exotic hardware, as it was in the past. In my opinion, it&rsquo;s more about struggling with <strong>complexity of your own making</strong>. It&rsquo;s easy to create a monster! That&rsquo;s why the advice in this post centers around keeping things manageable. I&rsquo;ve organized it into three sections:</p>

<ol>
  <li>Use an iterative approach</li>
  <li>Think twice before unifying things too much</li>
  <li>Be aware that serialization is a big subject</li>
</ol>

<p>This advice applies to any kind of game engine. I&rsquo;m not going to tell you how to write a shader, what an octree is, or how to add physics. Those are the kinds of things that, I assume, you already know that you should know &ndash; and it depends largely on the type of game you want to make. Instead, I&rsquo;ve deliberately chosen points that don&rsquo;t seem to be widely acknowledged or talked about &ndash; these are the kinds of points I find most interesting when trying to demystify a subject.</p>

<h2 id="use-an-iterative-approach">Use an Iterative Approach</h2>

<p>My first piece of advice is to get something (anything!) running quickly, then iterate.</p>

<p>If possible, start with a sample application that initializes the device and draws something on the screen. In my case, I downloaded <a href="https://www.libsdl.org/">SDL</a>, opened <code>Xcode-iOS/Test/TestiPhoneOS.xcodeproj</code>, then ran the <code>testgles2</code> sample on my iPhone.</p>

<p><img class="center" src="https://preshing.com/images/gameengine-step1.jpg" /></p>

<p>Voilà! I had a lovely spinning cube using OpenGL ES 2.0.</p>

<p>My next step was to download a 3D model somebody made of Mario. I wrote a quick &amp; dirty OBJ file loader &ndash; the file format is not that complicated &ndash; and hacked the sample application to render Mario instead of a cube. I also integrated <a href="https://www.libsdl.org/projects/SDL_image/">SDL_image</a> to help load textures.</p>

<p><img class="center" src="https://preshing.com/images/gameengine-step2.png" /></p>

<p>Then I implemented dual-stick controls to move Mario around. (In the beginning, I was contemplating making a dual-stick shooter. Not with Mario, though.)</p>

<p><img class="center" src="https://preshing.com/images/gameengine-step3.png" /></p>

<p>Next, I wanted to explore skeletal animation, so I opened <a href="https://www.blender.org/">Blender</a>, modeled a tentacle, and rigged it with a two-bone skeleton that wiggled back and forth.</p>

<p><img class="center" src="https://preshing.com/images/gameengine-step4.png" /></p>

<p>At this point, I abandoned the OBJ file format and wrote a Python script to export custom JSON files from Blender. These JSON files described the skinned mesh, skeleton and animation data. I loaded these files into the game with the help of a <a href="https://github.com/nlohmann/json">C++ JSON library</a>.</p>

<p><img class="center" src="https://preshing.com/images/gameengine-step5.png" /></p>

<p>Once that worked, I went back into Blender and made a more elaborate character. (This was the first rigged 3D human I ever created. I was quite proud of him.)</p>

<p><img class="center" src="https://preshing.com/images/gameengine-step6.jpg" /></p>

<p>Over the next few months, I took the following steps:</p>

<ul>
  <li>Started factoring out vector and matrix functions into my own 3D math library.</li>
  <li>Replaced the <code>.xcodeproj</code> with a CMake project.</li>
  <li>Got the engine running on both Windows and iOS, because I like working in Visual Studio.</li>
  <li>Started moving code into separate &ldquo;engine&rdquo; and &ldquo;game&rdquo; libraries. Over time, I split those into even more granular libraries.</li>
  <li>Wrote a separate application to convert my JSON files into binary data that the game can load directly.</li>
  <li>Eventually removed all SDL libraries from the iOS build. (The Windows build still uses SDL.)</li>
</ul>

<p>The point is: <strong>I didn&rsquo;t plan the engine architecture before I started programming</strong>. This was a deliberate choice. Instead, I just wrote the simplest code that implemented the next feature, then I&rsquo;d look at the code to see what kind of architecture emerged naturally. By &ldquo;engine architecture&rdquo;, I mean the set of modules that make up the game engine, the dependencies between those modules, and the <a href="https://en.wikipedia.org/wiki/Application_programming_interface">API</a> for interacting with each module.</p>

<p><img class="center" src="https://preshing.com/images/how-to-iterate.png" /></p>

<p>This is an <a href="https://en.wikipedia.org/wiki/Iterative_and_incremental_development">iterative</a> approach because it focuses on smaller deliverables. It works well when writing a game engine because, at each step along the way, you have a running program. If something goes wrong when you&rsquo;re factoring code into a new module, you can always compare your changes with the code that worked previously. Obviously, I assume you&rsquo;re using some kind of <a href="https://www.perforce.com/blog/list-of-equivalent-commands-in-git-mercurial-and-svn">source control</a>.</p>

<p>You might think a lot of time gets wasted in this approach, since you&rsquo;re always writing bad code that needs to be cleaned up later. But most of the cleanup involves moving code from one <code>.cpp</code> file to another, extracting function declarations into <code>.h</code> files, or equally straightforward changes. Deciding <em>where</em> things should go is the hard part, and that&rsquo;s easier to do when the code already exists.</p>

<p>I would argue that more time is wasted in the opposite approach: Trying too hard to come up with an architecture that will do everything you think you&rsquo;ll need ahead of time. Two of my favorite articles about the perils of over-engineering are <a href="http://altdevblog.com/2011/04/01/vicious-circle-of-generalization/">The Vicious Circle of Generalization</a> by Tomasz Dąbrowski and <a href="https://www.joelonsoftware.com/2001/04/21/dont-let-architecture-astronauts-scare-you/">Don&rsquo;t Let Architecture Astronauts Scare You</a> by Joel Spolsky.</p>

<p>I&rsquo;m not saying you should never solve a problem on paper before tackling it in code. I&rsquo;m also not saying you shouldn&rsquo;t decide what features you want in advance. For example, I knew from the beginning that I wanted my engine to load all assets in a background thread. I just didn&rsquo;t try to design or implement that feature until my engine actually loaded some assets first.</p>

<p>The iterative approach has given me a much more elegant architecture than I ever could have dreamed up by staring at a blank sheet of paper. The iOS build of my engine is now 100% original code including a custom math library, container templates, reflection/serialization system, rendering framework, physics and audio mixer. I had reasons for writing each of those modules, but you might not find it necessary to write all those things yourself. There are lots of great, permissively licensed open source libraries that you might find appropriate for your engine instead. <a href="https://glm.g-truc.net/">GLM</a>, <a href="https://pybullet.org/wordpress/">Bullet Physics</a> and the <a href="https://github.com/nothings/stb">STB headers</a> are just a few interesting examples.</p>

<h2 id="think-twice-before-unifying-things-too-much">Think Twice Before Unifying Things Too Much</h2>

<p>As programmers, we try to avoid code duplication, and we like it when our code follows a uniform style. However, I think it&rsquo;s good not to let those instincts override every decision.</p>

<h3 id="resist-the-dry-principle-once-in-a-while">Resist the DRY Principle Once in a While</h3>

<p>To give you an example, my engine contains several &ldquo;smart pointer&rdquo; template classes, similar in spirit to <a href="http://en.cppreference.com/w/cpp/memory/shared_ptr"><code>std::shared_ptr</code></a>. Each one helps prevent memory leaks by serving as a wrapper around a raw pointer.</p>

<ul>
  <li><code>Owned&lt;&gt;</code> is for dynamically allocated objects that have a single owner.</li>
  <li><code>Reference&lt;&gt;</code> uses reference counting to allow an object to have several owners.</li>
  <li><code>audio::AppOwned&lt;&gt;</code> is used by code outside the audio mixer. It allows game systems to own objects that the audio mixer uses, such as a voice that&rsquo;s currently playing.</li>
  <li><code>audio::AudioHandle&lt;&gt;</code> uses a reference counting system internal to the audio mixer.</li>
</ul>

<p>It may look like some of those classes duplicate the functionality of the others, in violation of the <a href="https://en.wikipedia.org/wiki/Don%27t_repeat_yourself">DRY (Don&rsquo;t Repeat Yourself) Principle</a>. Indeed, earlier in development, I tried to re-use the existing <code>Reference&lt;&gt;</code> class as much as possible. However, I found that the lifetime of an audio object is governed by special rules: If an audio voice has finished playing a sample, and the game does not hold a pointer to that voice, the voice can be queued for deletion immediately. If the game holds a pointer, then the voice object should not be deleted. And if the game holds a pointer, but the pointer&rsquo;s owner is destroyed before the voice has ended, the voice should be canceled. Rather than adding complexity to <code>Reference&lt;&gt;</code>, I decided it was more practical to introduce separate template classes instead.</p>
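<p>For readers who haven&rsquo;t written one, here is a minimal sketch of what a simple reference-counting pointer in the spirit of <code>Reference&lt;&gt;</code> might look like. It is not the engine&rsquo;s actual code: the <code>RefCounted</code> base class and its <code>incRef</code>/<code>decRef</code> functions are hypothetical names, and the counter is not atomic, so the sketch assumes single-threaded use. The point is how little there is to it, which is exactly what special-case lifetime rules would complicate.</p>

```cpp
#include <utility>

// Hypothetical base class: each object carries its own reference count.
// Assumes single-threaded use; the counter is deliberately not atomic.
class RefCounted {
public:
    RefCounted() = default;
    RefCounted(const RefCounted&) = delete;
    RefCounted& operator=(const RefCounted&) = delete;
    virtual ~RefCounted() = default;

    void incRef() { ++m_refCount; }
    void decRef() {
        // The last owner to let go destroys the object.
        if (--m_refCount == 0)
            delete this;
    }
    int refCount() const { return m_refCount; }

private:
    int m_refCount = 0;
};

// Smart pointer that shares ownership of a RefCounted-derived object.
template <typename T>
class Reference {
public:
    Reference() = default;
    explicit Reference(T* ptr) : m_ptr(ptr) {
        if (m_ptr) m_ptr->incRef();
    }
    Reference(const Reference& other) : m_ptr(other.m_ptr) {
        if (m_ptr) m_ptr->incRef();
    }
    Reference(Reference&& other) noexcept : m_ptr(other.m_ptr) {
        other.m_ptr = nullptr;
    }
    ~Reference() {
        if (m_ptr) m_ptr->decRef();
    }
    // Copy-and-swap handles both copy and move assignment.
    Reference& operator=(Reference other) noexcept {
        std::swap(m_ptr, other.m_ptr);
        return *this;
    }

    T* operator->() const { return m_ptr; }
    T* get() const { return m_ptr; }

private:
    T* m_ptr = nullptr;
};
```

<p>Encoding the audio rules described above (cancel the voice if its owner goes away early, delete it immediately if nobody holds it) would mean threading extra state through every one of these member functions, which is one way to see why a separate class can be the simpler choice.</p>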

<p>95% of the time, re-using existing code is the way to go. But if you start to feel paralyzed, or find yourself adding complexity to something that was once simple, ask yourself if something in the codebase should actually be two things.</p>

<h3 id="its-ok-to-use-different-calling-conventions">It&rsquo;s OK to Use Different Calling Conventions</h3>

<p>One thing I dislike about Java is that it forces you to define every function inside a class. That&rsquo;s nonsense, in my opinion. It might make your code look more consistent, but it also encourages over-engineering and doesn&rsquo;t lend itself well to the iterative approach I described earlier.</p>

<p>In my C++ engine, some functions belong to classes and some don&rsquo;t. For example, every enemy in the game is a class, and most of the enemy&rsquo;s behavior is implemented inside that class, as you&rsquo;d probably expect. On the other hand, <a href="https://stackoverflow.com/questions/7136449/shape-casting-a-capsule-against-convex-polyhedra">sphere casts</a> in my engine are performed by calling <code>sphereCast()</code>, a function in the <code>physics</code> namespace. <code>sphereCast()</code> doesn&rsquo;t belong to any class &ndash; it&rsquo;s just part of the <code>physics</code> module. I have a build system that manages dependencies between modules, which keeps the code organized well enough for me. Wrapping this function inside an arbitrary class won&rsquo;t improve the code organization in any meaningful way.</p>

<p>Then there&rsquo;s <a href="https://en.wikipedia.org/wiki/Dynamic_dispatch">dynamic dispatch</a>, which is a form of <a href="https://en.wikipedia.org/wiki/Polymorphism_(computer_science)">polymorphism</a>. We often need to call a function for an object without knowing the exact type of that object. A C++ programmer&rsquo;s first instinct is to define an abstract base class with virtual functions, then override those functions in a derived class. That&rsquo;s valid, but it&rsquo;s only one technique. There are other dynamic dispatch techniques that don&rsquo;t introduce as much extra code, or that bring other benefits:</p>

<ul>
  <li>C++11 introduced <a href="http://en.cppreference.com/w/cpp/utility/functional/function"><code>std::function</code></a>, which is a convenient way to store callback functions. It&rsquo;s also possible to write your own version of <code>std::function</code> that&rsquo;s less painful to step into in the debugger.</li>
  <li>Many callback functions can be implemented with a pair of pointers: A function pointer and an opaque argument. It just requires an explicit cast inside the callback function. You see this a lot in pure C libraries.</li>
  <li>Sometimes, the underlying type is actually known at compile time, and you can bind the function call without any additional runtime overhead. <a href="https://github.com/preshing/turf">Turf</a>, a library that I use in my game engine, relies on this technique a lot. See <a href="https://github.com/preshing/turf/blob/9ae0d4b984fa95ed5f823274b39c87ee742f6650/turf/Mutex.h#L50"><code>turf::Mutex</code></a> for example. It&rsquo;s just a <code>typedef</code> over a platform-specific class.</li>
  <li>Sometimes, the most straightforward approach is to build and maintain a table of raw function pointers yourself. I used this approach in my audio mixer and serialization system. The Python interpreter also makes heavy use of this technique, as mentioned below.</li>
  <li>You can even store function pointers in a hash table, using the function names as keys. I use this technique to dispatch input events, such as multitouch events. It&rsquo;s part of a strategy to record game inputs and play them back with a replay system.</li>
</ul>
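<p>Two of the techniques in the list above, the function pointer with an opaque argument and the hash table keyed by name, can be sketched together in a few lines. This is only an illustration under assumed names: <code>Callback</code>, <code>Counter</code>, <code>EventDispatcher</code> and the event name used in the usage note are all hypothetical, not taken from any real engine.</p>

```cpp
#include <string>
#include <unordered_map>

// C-style callback: a function pointer plus an opaque argument.
using Callback = void (*)(void* userData, int value);

struct Counter {
    int total = 0;
};

// The callback casts the opaque pointer back to the type it expects.
void addToCounter(void* userData, int value) {
    Counter* counter = static_cast<Counter*>(userData);
    counter->total += value;
}

// The "pair of pointers" stored for each registered handler.
struct Handler {
    Callback func;
    void* userData;
};

// Hash table of handlers, using event names as keys.
class EventDispatcher {
public:
    void registerHandler(const std::string& name, Callback func, void* userData) {
        m_handlers[name] = Handler{func, userData};
    }

    // Returns false if no handler is registered under that name.
    bool dispatch(const std::string& name, int value) const {
        auto iter = m_handlers.find(name);
        if (iter == m_handlers.end())
            return false;
        iter->second.func(iter->second.userData, value);
        return true;
    }

private:
    std::unordered_map<std::string, Handler> m_handlers;
};
```

<p>After <code>dispatcher.registerHandler("touchBegan", addToCounter, &amp;counter)</code>, a call to <code>dispatcher.dispatch("touchBegan", 5)</code> adds 5 to <code>counter.total</code>. Because events are identified by plain strings and integers, a log of them is easy to write to disk and feed back through <code>dispatch</code> later, which is the property a replay system needs.</p>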

<p>Dynamic dispatch is a big subject. I&rsquo;m only scratching the surface to show that there are many ways to achieve it. The more you write extensible low-level code &ndash; which is common in a game engine &ndash; the more you&rsquo;ll find yourself exploring alternatives. If you&rsquo;re not used to this kind of programming, the Python interpreter, which is written in C, is an excellent resource to learn from. It implements a powerful object model: Every <code>PyObject</code> points to a <code>PyTypeObject</code>, and every <code>PyTypeObject</code> contains a table of function pointers for dynamic dispatch. The document <a href="https://docs.python.org/3/extending/newtypes.html">Defining New Types</a> is a good starting point if you want to jump right in.</p>

<h2 id="be-aware-that-serialization-is-a-big-subject">Be Aware that Serialization Is a Big Subject</h2>

<p><a href="https://en.wikipedia.org/wiki/Serialization">Serialization</a> is the act of converting runtime objects to and from a sequence of bytes. In other words, saving and loading data.</p>

<p>For many if not most game engines, game content is created in various editable formats such as <code>.png</code>, <code>.json</code>, <code>.blend</code> or proprietary formats, then eventually converted to platform-specific game formats that the engine can load quickly. The last application in this pipeline is often referred to as a &ldquo;cooker&rdquo;. The cooker might be integrated into another tool, or even distributed across several machines. Usually, the cooker and a number of tools are developed and maintained in tandem with the game engine itself.</p>

<p><img class="center" src="https://preshing.com/images/asset-pipeline.png" /></p>

<p>When setting up such a pipeline, the choice of file format at each stage is up to you. You might define some file formats of your own, and those formats might evolve as you add engine features. As they evolve, you might find it necessary to keep certain programs compatible with previously saved files. No matter what format, you&rsquo;ll ultimately need to serialize it in C++.</p>

<p>There are countless ways to implement serialization in C++. One fairly obvious way is to add <code>load</code> and <code>save</code> functions to the C++ classes you want to serialize. You can achieve backward compatibility by storing a version number in the file header, then passing this number into every <code>load</code> function. This works, although the code can become cumbersome to maintain.</p>

<div><div class="CodeRay">
  <div class="code"><pre>    <span class="directive">void</span> load(InStream&amp; in, u32 fileVersion) {
        <span class="comment">// Load expected member variables</span>
        in &gt;&gt; m_position;
        in &gt;&gt; m_direction;

        <span class="comment">// Load a newer variable only if the file version being loaded is 2 or greater</span>
        <span class="keyword">if</span> (fileVersion &gt;= <span class="integer">2</span>) {
            in &gt;&gt; m_velocity;
        }
    }
</pre></div>
</div>
</div>

<p>It&rsquo;s possible to write more flexible, less error-prone serialization code by taking advantage of <a href="https://en.wikipedia.org/wiki/Reflection_(computer_programming)">reflection</a> &ndash; specifically, by creating runtime data that describes the layout of your C++ types. For a quick idea of how reflection can help with serialization, take a look at how <a href="https://www.blender.org/get-involved/developers/">Blender</a>, an open source project, does it.</p>

<p><img class="center" src="https://preshing.com/images/blender-reflection.png" /></p>

<p>When you build Blender from source code, many steps happen. First, a custom utility named <code>makesdna</code> is compiled and run. This utility parses a set of C header files in the Blender source tree, then outputs a compact summary of all C types defined within, in a custom format known as <a href="https://wiki.blender.org/index.php/Dev:Source/Architecture/SDNA_Notes">SDNA</a>. This SDNA data serves as <strong>reflection data</strong>. The SDNA is then linked into Blender itself, and saved with every <code>.blend</code> file that Blender writes. From that point on, whenever Blender loads a <code>.blend</code> file, it compares the <code>.blend</code> file&rsquo;s SDNA with the SDNA linked into the current version at runtime, and uses generic serialization code to handle any differences. This strategy gives Blender an impressive degree of backward and forward compatibility. You can <a href="https://www.blendernation.com/2008/12/01/blender-dna-rna-and-backward-compatibility/">still load 1.0 files</a> in the latest version of Blender, and new <code>.blend</code> files can be loaded in older versions.</p>

<p>Like Blender, many game engines &ndash; and their associated tools &ndash; generate and use their own reflection data. There are many ways to do it: You can parse your own C/C++ source code to extract type information, as Blender does. You can create a separate data description language, and write a tool to generate C++ type definitions and reflection data from this language. You can use preprocessor macros and C++ templates to generate reflection data at runtime. And once you have reflection data available, there are countless ways to write a generic serializer on top of it.</p>
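<p>As a rough illustration of the macro-based option, here is a minimal sketch of reflection data built at runtime and a generic serializer written on top of it. Everything in it is an assumption made for the example: the <code>Field</code> and <code>TypeDesc</code> structures, the <code>REFLECT_FIELD</code> macro, the <code>Particle</code> type and the text output format. A real system would handle many more field types, nested structs and a binary format.</p>

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

enum class FieldType { Int, Float };

// Runtime description of one member variable.
struct Field {
    const char* name;
    FieldType type;
    std::size_t offset;
};

// Runtime description of a whole struct.
struct TypeDesc {
    const char* name;
    std::vector<Field> fields;
};

// Hypothetical type we want to serialize.
struct Particle {
    int id;
    float mass;
};

// Records a member's name, type tag and byte offset.
#define REFLECT_FIELD(Struct, member, tag) \
    Field{#member, tag, offsetof(Struct, member)}

const TypeDesc particleDesc = {
    "Particle",
    {
        REFLECT_FIELD(Particle, id, FieldType::Int),
        REFLECT_FIELD(Particle, mass, FieldType::Float),
    },
};

// Generic serializer: it only walks the reflection data and knows
// nothing about Particle itself.
std::string serialize(const void* object, const TypeDesc& desc) {
    std::ostringstream out;
    out << desc.name << "{";
    const char* bytes = static_cast<const char*>(object);
    for (std::size_t i = 0; i < desc.fields.size(); i++) {
        const Field& field = desc.fields[i];
        if (i > 0)
            out << ",";
        out << field.name << "=";
        if (field.type == FieldType::Int)
            out << *reinterpret_cast<const int*>(bytes + field.offset);
        else
            out << *reinterpret_cast<const float*>(bytes + field.offset);
    }
    out << "}";
    return out.str();
}
```

<p>Because field names travel with the data, a matching loader could skip fields it doesn&rsquo;t recognize and leave missing ones at their defaults, instead of relying on hand-written version checks in every <code>load</code> function.</p>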

<p>Clearly, I&rsquo;m omitting a lot of detail. In this post, I only want to show that there are many different ways to serialize data, some of which are very complex. Programmers just don&rsquo;t discuss serialization as much as other engine systems, even though most other systems rely on it. For example, out of the 96 programming talks given at <a href="https://www.gdcvault.com/browse/gdc-17/?categories=Pg">GDC 2017</a>, I counted 31 talks about graphics, 11 about online, 10 about tools, 4 about AI, 3 about physics, 2 about audio &ndash; but only one that <a href="https://www.gdcvault.com/play/1024444/The-Data-Building-Pipeline-of">touched directly on serialization</a>.</p>

<p>At a minimum, try to have an idea how complex your needs will be. If you&rsquo;re making a tiny game like Flappy Bird, with only a few assets, you probably don&rsquo;t need to think too hard about serialization. You can probably load textures directly from PNG and it&rsquo;ll be fine. If you need a compact binary format with backward compatibility, but don&rsquo;t want to develop your own, take a look at third-party libraries such as <a href="https://uscilab.github.io/cereal/">Cereal</a> or <a href="https://theboostcpplibraries.com/boost.serialization">Boost.Serialization</a>. I don&rsquo;t think <a href="https://developers.google.com/protocol-buffers/">Google Protocol Buffers</a> are ideal for serializing game assets, but they&rsquo;re worth studying nonetheless.</p>

<p>Writing a game engine &ndash; even a small one &ndash; is a big undertaking. There&rsquo;s a lot more I could say about it, but for a post of this length, that&rsquo;s honestly the most helpful advice I can think to give: Work iteratively, resist the urge to unify code too much, and know that serialization is a big subject so you can choose an appropriate strategy. In my experience, each of those things can become a stumbling block if ignored.</p>

<p>I love comparing notes on this stuff, so I&rsquo;d be really interested to hear from other developers. If you&rsquo;ve written an engine, did your experience lead you to any of the same conclusions? And if you haven&rsquo;t written one, or are just thinking about it, I&rsquo;m interested in your thoughts too. What do you consider a good resource to learn from? What parts still seem mysterious to you? Feel free to leave a comment below or hit me up <a href="https://twitter.com/preshing">on Twitter</a>!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Can Reordering of Release/Acquire Operations Introduce Deadlock?]]></title>
    <link href="https://preshing.com/20170612/can-reordering-of-release-acquire-operations-introduce-deadlock"/>
    <updated>2017-06-12T07:34:00-04:00</updated>
    <id>https://preshing.com/?p=20170612</id>
    <content type="html"><![CDATA[<p>I wasn&rsquo;t planning to write about lock-free programming again, but a commenter named Mike recently asked an interesting question on my <a href="http://preshing.com/20120913/acquire-and-release-semantics">Acquire and Release Semantics</a> post from 2012. It&rsquo;s a question I wondered about years ago, but could never really reconcile until (possibly) now.</p>

<p>A quick recap: A <strong>read-acquire</strong> operation cannot be reordered, either by the compiler or the CPU, with any read or write operation that <em>follows</em> it in program order. A <strong>write-release</strong> operation cannot be reordered with any read or write operation that <em>precedes</em> it in program order.</p>
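<p>To see what those two rules buy you, here is a minimal sketch of the usual message-passing pattern. The names <code>g_payload</code>, <code>g_ready</code>, <code>producer</code>, <code>consumer</code> and <code>runExample</code> are made up for this example. The write to <code>g_payload</code> can&rsquo;t move below the write-release, and the read of <code>g_payload</code> can&rsquo;t move above the read-acquire, so the consumer is guaranteed to see the payload once it sees the flag.</p>

```cpp
#include <atomic>
#include <thread>

int g_payload = 0;             // plain, non-atomic data
std::atomic<int> g_ready{0};   // guard variable shared between threads

void producer() {
    g_payload = 42;  // precedes the write-release, so it can't sink below it
    g_ready.store(1, std::memory_order_release);
}

int consumer() {
    // Spin until the read-acquire observes the value stored by the producer.
    while (g_ready.load(std::memory_order_acquire) == 0) {
    }
    return g_payload;  // follows the read-acquire, so it can't hoist above it
}

// Runs both threads and returns the value the consumer observed.
int runExample() {
    g_payload = 0;
    g_ready.store(0);
    int result = 0;
    std::thread consumerThread([&result] { result = consumer(); });
    std::thread producerThread(producer);
    consumerThread.join();
    producerThread.join();
    return result;
}
```

<p>Note that nothing here constrains a write-release followed by a <em>later</em> read-acquire in the same thread, which is the case the rest of this post is about.</p>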

<p>Those rules <em>don&rsquo;t</em> prevent the reordering of a write-release followed by a read-acquire. For example, in C++, if <code>A</code> and <code>B</code> are <code>std::atomic&lt;int&gt;</code>, and we write:</p>

<div><div class="CodeRay">
  <div class="code"><pre>A.store(<span class="integer">1</span>, std::memory_order_release);
<span class="predefined-type">int</span> b = B.load(std::memory_order_acquire);
</pre></div>
</div>
</div>

<p>&hellip;the compiler is free to reorder those statements, as if we had written:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="predefined-type">int</span> b = B.load(std::memory_order_acquire);
A.store(<span class="integer">1</span>, std::memory_order_release);
</pre></div>
</div>
</div>

<p>And that&rsquo;s fair. Why the heck not? On many architectures, including x86, the CPU could perform this reordering anyway.</p>

<!--more-->
<p>Well, here&rsquo;s where Mike&rsquo;s question comes in. What if <code>A</code> and <code>B</code> are spinlocks? Let&rsquo;s say that the spinlock is initially 0. To lock it, we repeatedly attempt a <a href="http://preshing.com/20150402/you-can-do-any-kind-of-atomic-read-modify-write-operation/#compare-and-swap-the-mother-of-all-rmws">compare-and-swap</a>, with acquire semantics, until it changes from 0 to 1. To unlock it, we simply set it back to 0, with release semantics.</p>
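<p>Wrapped up as a class, the spinlock just described might look like the following sketch. The <code>Spinlock</code> class and the <code>countWithTwoThreads</code> helper are hypothetical names added for illustration. The examples in this post inline the same two operations directly on <code>A</code> and <code>B</code>.</p>

```cpp
#include <atomic>
#include <thread>

class Spinlock {
public:
    void lock() {
        int expected = 0;
        // Retry the compare-and-swap, with acquire semantics, until the
        // state changes from 0 to 1.
        while (!m_state.compare_exchange_weak(expected, 1, std::memory_order_acquire)) {
            expected = 0;  // a failed attempt overwrites expected, so reset it
        }
    }

    void unlock() {
        // Set the state back to 0 with release semantics.
        m_state.store(0, std::memory_order_release);
    }

private:
    std::atomic<int> m_state{0};
};

// Two threads increment a plain int under the lock; returns the final count.
int countWithTwoThreads(int incrementsPerThread) {
    Spinlock lock;
    int counter = 0;
    auto work = [&] {
        for (int i = 0; i < incrementsPerThread; i++) {
            lock.lock();
            counter++;
            lock.unlock();
        }
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    return counter;
}
```

<p>With the lock in place, <code>countWithTwoThreads(10000)</code> returns 20000. Without it, the unsynchronized increments would be a data race.</p>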

<p>Now, suppose <strong>Thread 1</strong> does the following:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Lock A</span>
<span class="predefined-type">int</span> expected = <span class="integer">0</span>;
<span class="keyword">while</span> (!A.compare_exchange_weak(expected, <span class="integer">1</span>, std::memory_order_acquire)) {
    expected = <span class="integer">0</span>;
}

<span class="comment">// Unlock A</span>
<span class="highlight">A.store(<span class="integer">0</span>, std::memory_order_release);</span>

<span class="comment">// Lock B</span>
<span class="keyword">while</span> (!<span class="highlight">B.compare_exchange_weak(expected, <span class="integer">1</span>, std::memory_order_acquire)</span>) {
    expected = <span class="integer">0</span>;
}

<span class="comment">// Unlock B</span>
B.store(<span class="integer">0</span>, std::memory_order_release);
</pre></div>
</div>
</div>

<p>Meanwhile, <strong>Thread 2</strong> does the following:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Lock B</span>
<span class="predefined-type">int</span> expected = <span class="integer">0</span>;
<span class="keyword">while</span> (!B.compare_exchange_weak(expected, <span class="integer">1</span>, std::memory_order_acquire)) {
    expected = <span class="integer">0</span>;
}

<span class="comment">// Lock A</span>
<span class="keyword">while</span> (!A.compare_exchange_weak(expected, <span class="integer">1</span>, std::memory_order_acquire)) {
    expected = <span class="integer">0</span>;
}

<span class="comment">// Unlock A</span>
A.store(<span class="integer">0</span>, std::memory_order_release);

<span class="comment">// Unlock B</span>
B.store(<span class="integer">0</span>, std::memory_order_release);
</pre></div>
</div>
</div>

<p>Check the highlighted lines in Thread 1. It&rsquo;s a write-release followed by a read-acquire! I just said that acquire and release semantics <em>don&rsquo;t</em> prevent the reordering of those operations. So, is the compiler free to reorder those statements? If it reorders those statements, then it would be as if we had written:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Lock A</span>
<span class="predefined-type">int</span> expected = <span class="integer">0</span>;
<span class="keyword">while</span> (!A.compare_exchange_weak(expected, <span class="integer">1</span>, std::memory_order_acquire)) {
    expected = <span class="integer">0</span>;
}

<span class="comment">// Lock B</span>
<span class="keyword">while</span> (!<span class="highlight">B.compare_exchange_weak(expected, <span class="integer">1</span>, std::memory_order_acquire)</span>) {
    expected = <span class="integer">0</span>;
}

<span class="comment">// Unlock A</span>
<span class="highlight">A.store(<span class="integer">0</span>, std::memory_order_release);</span>

<span class="comment">// Unlock B</span>
B.store(<span class="integer">0</span>, std::memory_order_release);
</pre></div>
</div>
</div>

<p>This version is quite different from the original code. In the original code, Thread 1 only held one spinlock at a time. In this version, Thread 1 obtains both spinlocks. This introduces a potential <strong>deadlock</strong> in our program: Thread 1 could successfully lock A, but get stuck waiting for lock B; and Thread 2 could successfully lock B, but get stuck waiting for lock A.</p>

<p>That&rsquo;s bad.</p>

<p>However, I&rsquo;m not so sure the compiler is allowed to reorder those statements. Not because of acquire and release semantics, but because of a different rule from the C++ standard. In <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf">working draft N4659</a>, section 4.7.2:18 states:</p>

<blockquote>
  <p>An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.</p>
</blockquote>

<p>So, getting back to Thread 1&rsquo;s original code:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// Unlock A</span>
<span class="highlight">A.store(<span class="integer">0</span>, std::memory_order_release);</span>

<span class="comment">// Lock B</span>
<span class="keyword">while</span> (!<span class="highlight">B.compare_exchange_weak(expected, <span class="integer">1</span>, std::memory_order_acquire)</span>) {
    expected = <span class="integer">0</span>;
}
</pre></div>
</div>
</div>

<p>Once execution reaches the <code>while</code> loop, the last value assigned to <code>A</code> is <strong>0</strong>. The standard says that this value should become visible to all other threads in a finite period of time. But what if the <code>while</code> loop is infinite? The compiler has no way of ruling that out. And if the compiler can&rsquo;t rule out that the <code>while</code> loop is infinite, then it shouldn&rsquo;t reorder the first highlighted line to occur after the loop. If it moves that line after an infinite loop, then it is violating §4.7.2:18 of the C++ standard.</p>

<p>Therefore, I believe the compiler shouldn&rsquo;t reorder those statements, and deadlock is not possible. <em>[Note: This is not an iron-clad guarantee; see the update at the end of this post.]</em></p>

<p>As a sanity check, I pasted Thread 1&rsquo;s code into <a href="https://godbolt.org/g/DW6fuV">Matt Godbolt&rsquo;s Compiler Explorer</a>. Judging from the assembly code, none of the three major C++ compilers reorder those statements when optimizations are enabled. This obviously doesn&rsquo;t prove my claim, but it doesn&rsquo;t disprove it either.</p>

<p><a href="https://godbolt.org/g/DW6fuV"><img class="center" src="https://preshing.com/images/godbolt-spinlocks.png" /></a></p>

<p>I&rsquo;ve wondered about this question ever since watching Herb Sutter&rsquo;s <a href="http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2">Atomic Weapons talk from 2012</a>. At the 44:35 mark in the video, he alludes to an example exactly like this one &ndash; involving spinlocks &ndash; and warns that the reordering of release/acquire operations could introduce deadlock, exactly as described here. I thought it was an alarming point.</p>

<p><a href="http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2"><img class="center" src="https://preshing.com/images/release-acquire-slide.png" /></a></p>

<p>Now I don&rsquo;t think there&rsquo;s anything to worry about. At least not in this example. Am I right, or am I misinterpreting §4.7.2:18 of the standard? It would be nice if a compiler developer or other expert could weigh in.</p>

<p>By the way, in that part of Herb&rsquo;s talk, he describes the difference between what he calls &ldquo;plain acquire and release&rdquo; and &ldquo;SC (sequentially consistent) acquire and release&rdquo;. From what I can tell, the term &ldquo;SC acquire and release&rdquo; describes the behavior of the <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/CHDCJBGA.html"><code>stlr</code> and <code>ldar</code></a> instructions introduced in ARMv8. Those instructions were introduced to help implement C++&rsquo;s <a href="http://preshing.com/20120930/weak-vs-strong-memory-models/#sequential-consistency">sequentially consistent</a> atomic operations more efficiently on ARM processors, as there is an implicit hardware <a href="http://preshing.com/20120710/memory-barriers-are-like-source-control-operations/#storeload"><code>#StoreLoad</code></a> barrier between those instructions. However, neither a hardware <code>#StoreLoad</code> barrier nor C++&rsquo;s sequentially consistent atomics are necessary to prevent the deadlock described in this post. All that&rsquo;s needed is to forbid the compiler reordering I pointed out, which I believe the standard already does.</p>

<p>Finally, this post should not be taken as an endorsement of spin locks. I defer to <a href="https://randomascii.wordpress.com/2012/06/05/in-praise-of-idleness/">Bruce Dawson&rsquo;s advice</a> on that subject. This post is just an attempt to better understand lock-free programming in C++.</p>

<h2 id="update-jun-16-2017">Update (Jun 16, 2017)</h2>

<p>Anthony Williams (author of <a href="http://www.amazon.com/gp/product/1933988770/ref=as_li_ss_tl?ie=UTF8&amp;tag=preshonprogr-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1933988770">C++ Concurrency in Action</a>) states in the comments that he doesn&rsquo;t think the above example can deadlock either.</p>

<p>Here&rsquo;s a simpler example that illustrates the same question: <code>thread2</code> busy-waits for a signal from <code>thread1</code>, then <code>thread1</code> busy-waits for a signal from <code>thread2</code>. Is the compiler allowed to reorder the highlighted line to the end of <code>thread1</code>? If it does, neither thread will terminate.</p>

<div><div class="CodeRay">
  <div class="code"><pre>std::atomic&lt;<span class="predefined-type">int</span>&gt; A = <span class="integer">0</span>;
std::atomic&lt;<span class="predefined-type">int</span>&gt; B = <span class="integer">0</span>;

<span class="directive">void</span> thread1() {
    <span class="highlight">A.store(<span class="integer">1</span>, std::memory_order_release);</span>

    <span class="keyword">while</span> (B.load(std::memory_order_acquire) == <span class="integer">0</span>) {
    }
}

<span class="directive">void</span> thread2() {
    <span class="keyword">while</span> (A.load(std::memory_order_acquire) == <span class="integer">0</span>) {
    }

    B.store(<span class="integer">1</span>, std::memory_order_release);
}
</pre></div>
</div>
</div>

<p>Nothing about acquire &amp; release semantics seems to prohibit this particular reordering, but I still contend that the answer is no. Again, I&rsquo;m assuming that the compiler follows <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf">N4659 §4.7.2:18</a>, which says that once the abstract machine issues an atomic store, it should ultimately become visible to all other threads, though it may take a long time. If the above reordering did take place, it would be as if the abstract machine didn&rsquo;t issue the store at all.</p>

<p>Part of the reason why this question is murky, at least for me, is that the standard&rsquo;s wording is weak. §4.7.2:18 says that implementations &ldquo;should&rdquo; ensure that stores become visible, not that they must. It&rsquo;s a recommendation, not a requirement.</p>

<p>Perhaps this weak wording was chosen because it&rsquo;s possible to run C++ programs on a single CPU without any thread preemption (say, on an embedded system). In such an environment, all of the above examples are likely to livelock anyway &ndash; they can get stuck on the first loop. Stronger ordering constraints, such as <code>memory_order_acq_rel</code> or <code>memory_order_seq_cst</code>, won&rsquo;t make the code any safer on such machines.</p>

<p>In the end, while <code>memory_order_acquire</code> and <code>memory_order_release</code> are certainly harder to <a href="http://preshing.com/20130823/the-synchronizes-with-relation">synchronize</a> than other ordering constraints, I don&rsquo;t think they are more inherently deadlock-prone. Any evidence to the contrary is welcome.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Here's a Standalone Cairo DLL for Windows]]></title>
    <link href="https://preshing.com/20170529/heres-a-standalone-cairo-dll-for-windows"/>
    <updated>2017-05-29T06:22:00-04:00</updated>
    <id>https://preshing.com/?p=20170528</id>
    <content type="html"><![CDATA[<p><a href="https://www.cairographics.org/">Cairo</a> is an open source C library for drawing vector graphics. I used it to create many of the diagrams and graphs on this blog.</p>

<p>Cairo is great, but it&rsquo;s always been difficult to find a precompiled Windows DLL that&rsquo;s up-to-date and that doesn&rsquo;t depend on a bunch of other DLLs. I was recently unable to find such a DLL, so I wrote a script to simplify the build process for one. The script is shared <a href="https://github.com/preshing/cairo-windows">on GitHub</a>:</p>

<p><a href="https://github.com/preshing/cairo-windows"><img class="center" src="https://preshing.com/images/cairo-windows-repo.png" /></a></p>

<p>If you just want a binary package, you can download one from the <a href="https://github.com/preshing/cairo-windows/releases">Releases</a> page:</p>

<p><a href="https://github.com/preshing/cairo-windows/releases"><img class="center" src="https://preshing.com/images/cairo-windows-download.png" /></a></p>

<p>The binary package contains Cairo header files, import libraries and DLLs for both x86 and x64. The DLLs are statically linked with their own C runtime and have no external dependencies. Since Cairo&rsquo;s API is pure C, these DLLs should work with any application built with any version of MSVC. I configured these DLLs to render text using <a href="https://www.freetype.org/">FreeType</a> because I find the quality of FreeType-rendered text better than Win32-rendered text, which Cairo normally uses by default. FreeType also supports more font formats and gives text a consistent appearance across different operating systems.</p>

<!--more-->

<h2 id="sample-application-using-cmake">Sample Application Using CMake</h2>

<p>Here&rsquo;s a <a href="https://github.com/preshing/CairoSample">small Cairo application</a> to test the DLLs. It uses <a href="http://preshing.com/20170511/how-to-build-a-cmake-based-project">CMake</a> to support multiple platforms including Windows, macOS and Linux.</p>

<p><a href="https://github.com/preshing/CairoSample"><img class="center" src="https://preshing.com/images/cairo-sample-repo.png" /></a></p>

<p><a href="https://github.com/preshing/CairoSample"><img class="center" src="https://preshing.com/images/cairo-spiral.png" /></a></p>

<p>Hope this helps somebody!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Learn CMake's Scripting Language in 15 Minutes]]></title>
    <link href="https://preshing.com/20170522/learn-cmakes-scripting-language-in-15-minutes"/>
    <updated>2017-05-22T08:20:00-04:00</updated>
    <id>https://preshing.com/?p=20170522</id>
    <content type="html"><![CDATA[<p>As explained in my <a href="http://preshing.com/20170511/how-to-build-a-cmake-based-project">previous post</a>, every CMake-based project must contain a script named <code>CMakeLists.txt</code>. This script defines <strong>targets</strong>, but it can also do a lot of other things, such as finding third-party libraries or generating C++ header files. CMake scripts have a lot of flexibility.</p>

<p><img class="center" src="https://preshing.com/images/cmake-simpler-flowchart.png" /></p>

<p>Every time you integrate an external library, and often when adding support for another platform, you&rsquo;ll need to edit the script. I spent a long time editing CMake scripts without <em>really</em> understanding the language, as the documentation is quite scattered, but eventually, things clicked. The goal of this post is to get you to the same point as quickly as possible.</p>

<!--more-->
<p>This post won&rsquo;t cover all of CMake&rsquo;s built-in commands, as there are hundreds, but it is a fairly complete guide to the <strong>syntax</strong> and <strong>programming model</strong> of the language.</p>

<h2 id="hello-world">Hello World</h2>

<p>If you create a file named <code>hello.txt</code> with the following contents:</p>

<pre><code>message("Hello world!")         # A message to print
</code></pre>

<p>&hellip;you can run it from the command line using <code>cmake -P hello.txt</code>. (The <code>-P</code> option runs the given script, but doesn&rsquo;t generate a build pipeline.) As expected, it prints &ldquo;Hello world!&rdquo;.</p>

<pre><code>$ cmake -P hello.txt
Hello world!
</code></pre>

<h2 id="all-variables-are-strings">All Variables Are Strings</h2>

<p>In CMake, every variable is a string. You can substitute a variable inside a string literal by surrounding it with <code>${}</code>. This is called a <strong>variable reference</strong>. Modify <code>hello.txt</code> as follows:</p>

<pre><code>message("Hello ${NAME}!")       # Substitute a variable into the message
</code></pre>

<p>Now, if we define <code>NAME</code> on the <code>cmake</code> command line using the <code>-D</code> option, the script will use it:</p>

<pre><code>$ cmake -DNAME=Newman -P hello.txt
Hello Newman!
</code></pre>

<p>When a variable is undefined, it defaults to an empty string:</p>

<pre><code>$ cmake -P hello.txt
Hello !
</code></pre>

<p>To define a variable inside a script, use the <a href="https://cmake.org/cmake/help/latest/command/set.html"><code>set</code></a> command. The first argument is the name of the variable to assign, and the second argument is its value:</p>

<pre><code>set(THING "funk")
message("We want the ${THING}!")
</code></pre>

<p>Quotes around arguments are optional, as long as there are no <strong>spaces</strong> or <strong>variable references</strong> in the argument. For example, I could have written <code>set("THING" funk)</code> in the first line above &ndash; it would have been equivalent. For most CMake commands (except <code>if</code> and <code>while</code>, described below), the choice of whether to quote such arguments is simply a matter of style. When the argument is the name of a variable, I tend not to use quotes.</p>
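
<p>As a minimal sketch of the quoting rule (the variable names <code>BAND</code> and <code>TITLE</code> are only illustrative), quotes make no difference for a simple argument, but they matter once an argument contains a space:</p>

<pre><code>set(BAND funk)                          # Same as set(BAND "funk")
set(TITLE "Tear the Roof Off")          # Quotes keep the spaces inside one argument
message("${BAND}: ${TITLE}")            # Prints: funk: Tear the Roof Off
message(Hello world)                    # Two separate arguments; prints: Helloworld
</code></pre>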

<h2 id="you-can-simulate-a-data-structure-using-prefixes">You Can Simulate a Data Structure using Prefixes</h2>

<p>CMake does not have classes, but you can simulate a data structure by defining a group of variables with names that begin with the same prefix. You can then look up variables in that group using nested <code>${}</code> variable references. For example, the following script will print &ldquo;John Smith lives at 123 Fake St.&rdquo;:</p>

<pre><code>set(JOHN_NAME "John Smith")
set(JOHN_ADDRESS "123 Fake St")
set(PERSON "JOHN")
message("${${PERSON}_NAME} lives at ${${PERSON}_ADDRESS}.")
</code></pre>

<p>You can even use variable references in the name of the variable to set. For example, if the value of <code>PERSON</code> is still &ldquo;JOHN&rdquo;, the following will set the variable <code>JOHN_NAME</code> to &ldquo;John Goodman&rdquo;:</p>

<pre><code>set(${PERSON}_NAME "John Goodman")
</code></pre>
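
<p>To illustrate how the prefix acts like a record selector, here&rsquo;s a small sketch that adds a second, made-up record (<code>JANE</code>) and switches between records by changing <code>PERSON</code>:</p>

<pre><code>set(JOHN_NAME "John Smith")
set(JANE_NAME "Jane Doe")               # A second, hypothetical record
set(PERSON "JANE")                      # Select which record to use
message("${${PERSON}_NAME}")            # Prints: Jane Doe
set(${PERSON}_ADDRESS "456 Real Ave")   # Sets the variable JANE_ADDRESS
message("${JANE_ADDRESS}")              # Prints: 456 Real Ave
</code></pre>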

<h2 id="every-statement-is-a-command">Every Statement is a Command</h2>

<p>In CMake, every statement is a command that takes a list of <strong>string arguments</strong> and has <strong>no return value</strong>. Arguments are separated by (unquoted) spaces. As we&rsquo;ve already seen, the <code>set</code> command defines a variable at file scope.</p>

<p>As another example, CMake has a <a href="https://cmake.org/cmake/help/latest/command/math.html"><code>math</code></a> command that performs arithmetic. The first argument must be <code>EXPR</code>, the second argument is the name of the variable to assign, and the third argument is the expression to evaluate &ndash; all strings. Note that on the third line below, CMake substitutes the <em>string</em> value of <code>MY_SUM</code> into the enclosing argument before passing the argument to <code>math</code>.</p>

<pre><code>math(EXPR MY_SUM "1 + 1")                   # Evaluate 1 + 1; store result in MY_SUM
message("The sum is ${MY_SUM}.")
math(EXPR DOUBLE_SUM "${MY_SUM} * 2")       # Multiply by 2; store result in DOUBLE_SUM
message("Double that is ${DOUBLE_SUM}.")
</code></pre>

<p>There&rsquo;s a <a href="https://cmake.org/cmake/help/latest/manual/cmake-commands.7.html">CMake command</a> for just about anything you&rsquo;ll need to do. The <a href="https://cmake.org/cmake/help/latest/command/string.html"><code>string</code></a> command lets you perform advanced string manipulation, including regular expression replacement. The <a href="https://cmake.org/cmake/help/latest/command/file.html"><code>file</code></a> command can read or write files, or manipulate filesystem paths.</p>
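
<p>As a rough sketch of those two commands, the following script converts a string to upper case, performs a regular expression replacement, then writes and reads back a file. The file name <code>note.txt</code> is just an example. It&rsquo;s created next to the script:</p>

<pre><code>string(TOUPPER "funk" LOUD)                             # LOUD = FUNK
string(REGEX REPLACE "[0-9]+" "N" MASKED "Room 101")    # MASKED = Room N
message("${LOUD}, ${MASKED}")                           # Prints: FUNK, Room N

file(WRITE "${CMAKE_CURRENT_LIST_DIR}/note.txt" "We want the ${LOUD}!")
file(READ "${CMAKE_CURRENT_LIST_DIR}/note.txt" CONTENTS)
message("${CONTENTS}")                                  # Prints: We want the FUNK!
</code></pre>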

<h2 id="flow-control-commands">Flow Control Commands</h2>

<p>Even flow control statements are commands. The <a href="https://cmake.org/cmake/help/latest/command/if.html"><code>if</code></a>/<code>endif</code> commands execute the enclosed commands conditionally. Whitespace doesn&rsquo;t matter, but it&rsquo;s common to indent the enclosed commands for readability. The following checks whether CMake&rsquo;s built-in variable <a href="https://cmake.org/cmake/help/latest/manual/cmake-variables.7.html#variables-that-describe-the-system"><code>WIN32</code></a> is set:</p>

<pre><code>if(WIN32)
    message("You're running CMake on Windows.")
endif()
</code></pre>

<p>CMake also has <a href="https://cmake.org/cmake/help/latest/command/while.html"><code>while</code></a>/<code>endwhile</code> commands which, as you might expect, repeat the enclosed commands as long as the condition is true. Here&rsquo;s a loop that prints all the <a href="https://en.wikipedia.org/wiki/Fibonacci_number">Fibonacci numbers</a> up to one million:</p>

<pre><code>set(A "1")
set(B "1")
while(A LESS "1000000")
    message("${A}")                 # Print A
    math(EXPR T "${A} + ${B}")      # Add the numeric values of A and B; store result in T
    set(A "${B}")                   # Assign the value of B to A
    set(B "${T}")                   # Assign the value of T to B
endwhile()
</code></pre>

<p>CMake&rsquo;s <code>if</code> and <code>while</code> conditions aren&rsquo;t written the same way as in other languages. For example, to perform a numeric comparison, you must specify <code>LESS</code> as a string argument, as shown above. The <a href="https://cmake.org/cmake/help/latest/command/if.html">documentation</a> explains how to write a valid condition.</p>

<p><code>if</code> and <code>while</code> are different from other CMake commands in that if the name of a variable is specified without quotes, the command will use the variable&rsquo;s value. In the above code, I took advantage of that behavior by writing <code>while(A LESS "1000000")</code> instead of <code>while("${A}" LESS "1000000")</code> &ndash; both forms are equivalent. Other CMake commands don&rsquo;t do that.</p>
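
<p>Here&rsquo;s a short sketch that combines a few of the condition forms described in the documentation: <code>DEFINED</code>, numeric comparison, string comparison and <code>AND</code>. The variables <code>COUNT</code> and <code>NAME</code> are made up for the example:</p>

<pre><code>set(COUNT "3")
if(NOT DEFINED COUNT)
    message("COUNT is not set.")
elseif(COUNT GREATER "2")               # Unquoted COUNT is looked up as a variable
    message("COUNT is greater than 2.") # This branch runs
else()
    message("COUNT is 2 or less.")
endif()

set(NAME "Newman")
if("${NAME}" STREQUAL "Newman" AND COUNT EQUAL "3")
    message("Hello, Newman.")           # Both conditions hold, so this prints
endif()
</code></pre>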

<h2 id="lists-are-just-semicolon-delimited-strings">Lists are Just Semicolon-Delimited Strings</h2>

<p>CMake has a special substitution rule for <strong>unquoted</strong> arguments. If the entire argument is a variable reference without quotes, and the variable&rsquo;s value contains <strong>semicolons</strong>, CMake will split the value at the semicolons and pass <strong>multiple arguments</strong> to the enclosing command. For example, the following passes three arguments to <code>math</code>:</p>

<pre><code>set(ARGS "EXPR;T;1 + 1")
math(${ARGS})                                   # Equivalent to calling math(EXPR T "1 + 1")
</code></pre>

<p>On the other hand, <strong>quoted</strong> arguments are never split into multiple arguments, even after substitution. CMake always passes a quoted string as a single argument, leaving semicolons intact:</p>

<pre><code>set(ARGS "EXPR;T;1 + 1")
message("${ARGS}")                              # Prints: EXPR;T;1 + 1
</code></pre>

<p>If more than two arguments are passed to the <code>set</code> command, all arguments after the variable name are joined by semicolons, then assigned to the specified variable. This effectively creates a list from the arguments:</p>

<pre><code>set(MY_LIST These are separate arguments)
message("${MY_LIST}")                           # Prints: These;are;separate;arguments
</code></pre>

<p>You can manipulate such lists using the <a href="https://cmake.org/cmake/help/latest/command/list.html"><code>list</code></a> command:</p>

<pre><code>set(MY_LIST These are separate arguments)
list(REMOVE_ITEM MY_LIST "separate")            # Removes "separate" from the list
message("${MY_LIST}")                           # Prints: These;are;arguments
</code></pre>
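
<p>A few other <code>list</code> subcommands, sketched on the same example list (the output variable names <code>COUNT</code> and <code>FIRST</code> are arbitrary):</p>

<pre><code>set(MY_LIST These are separate arguments)
list(LENGTH MY_LIST COUNT)                      # COUNT = 4
list(GET MY_LIST 0 FIRST)                       # FIRST = These (indices start at 0)
list(APPEND MY_LIST "indeed")                   # Adds an item to the end of the list
message("${COUNT} items; first is ${FIRST}")    # Prints: 4 items; first is These
message("${MY_LIST}")                           # Prints: These;are;separate;arguments;indeed
</code></pre>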

<p>The <a href="https://cmake.org/cmake/help/latest/command/foreach.html"><code>foreach</code></a>/<code>endforeach</code> command accepts multiple arguments. It iterates over all arguments except the first, assigning each one to the named variable:</p>

<pre><code>foreach(ARG These are separate arguments)
    message("${ARG}")                           # Prints each word on a separate line
endforeach()
</code></pre>

<p>You can iterate over a list by passing an unquoted variable reference to <code>foreach</code>. As with any other command, CMake will split the variable&rsquo;s value and pass multiple arguments to the command:</p>

<pre><code>foreach(ARG ${MY_LIST})                         # Splits the list; passes items as arguments
    message("${ARG}")                           # Prints each item on a separate line
endforeach()
</code></pre>
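
<p>For comparison, <code>foreach</code> also has forms that take the list&rsquo;s <em>name</em> or a numeric range instead of pre-split arguments. A minimal sketch, assuming the same <code>MY_LIST</code> as above:</p>

<pre><code>set(MY_LIST These are separate arguments)
foreach(ARG IN LISTS MY_LIST)                   # Names the list variable; no ${} needed
    message("${ARG}")                           # Prints each item on a separate line
endforeach()

foreach(I RANGE 1 3)                            # Counts 1, 2, 3 (the stop value is included)
    message("${I}")
endforeach()
</code></pre>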

<h2 id="functions-run-in-their-own-scope-macros-dont">Functions Run In Their Own Scope; Macros Don&rsquo;t</h2>

<p>In CMake, you can use a pair of <a href="https://cmake.org/cmake/help/latest/command/function.html"><code>function</code></a>/<code>endfunction</code> commands to define a function. Here&rsquo;s one that doubles the numeric value of its argument, then prints the result:</p>

<pre><code>function(doubleIt VALUE)
    math(EXPR RESULT "${VALUE} * 2")
    message("${RESULT}")
endfunction()

doubleIt("4")                           # Prints: 8
</code></pre>

<p>Functions run in their own scope. None of the variables defined in a function pollute the caller&rsquo;s scope. If you want to return a value, you can pass the name of a variable to your function, then call the <a href="https://cmake.org/cmake/help/latest/command/set.html"><code>set</code></a> command with the special argument <code>PARENT_SCOPE</code>:</p>

<pre><code>function(doubleIt VARNAME VALUE)
    math(EXPR RESULT "${VALUE} * 2")
    set(${VARNAME} "${RESULT}" PARENT_SCOPE)    # Set the named variable in caller's scope
endfunction()

doubleIt(RESULT "4")                    # Tell the function to set the variable named RESULT
message("${RESULT}")                    # Prints: 8
</code></pre>

<p>Similarly, a pair of <a href="https://cmake.org/cmake/help/latest/command/macro.html"><code>macro</code></a>/<code>endmacro</code> commands defines a macro. Unlike functions, macros run in the same scope as their caller. Therefore, all variables defined inside a macro are set in the caller&rsquo;s scope. We can replace the previous function with the following:</p>

<pre><code>macro(doubleIt VARNAME VALUE)
    math(EXPR ${VARNAME} "${VALUE} * 2")        # Set the named variable in caller's scope
endmacro()

doubleIt(RESULT "4")                    # Tell the macro to set the variable named RESULT
message("${RESULT}")                    # Prints: 8
</code></pre>
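
<p>To see the scoping difference side by side, here&rsquo;s a small sketch in which a function and a macro each set the same variable. The names <code>setInFunction</code>, <code>setInMacro</code> and <code>WHO</code> are invented for the example:</p>

<pre><code>function(setInFunction)
    set(WHO "function")                 # Local to the function's scope
endfunction()

macro(setInMacro)
    set(WHO "macro")                    # Set directly in the caller's scope
endmacro()

setInFunction()
message("After function: '${WHO}'")     # Prints: After function: ''
setInMacro()
message("After macro: '${WHO}'")        # Prints: After macro: 'macro'
</code></pre>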

<p>Both functions and macros accept an arbitrary number of arguments. Unnamed arguments are exposed to the function as a list, through a special variable named <code>ARGN</code>. Here&rsquo;s a function that doubles every argument it receives, printing each one on a separate line:</p>

<pre><code>function(doubleEach)
    foreach(ARG ${ARGN})                # Iterate over each argument
        math(EXPR N "${ARG} * 2")       # Double ARG's numeric value; store result in N
        message("${N}")                 # Print N
    endforeach()
endfunction()

doubleEach(5 6 7 8)                     # Prints 10, 12, 14, 16 on separate lines
</code></pre>
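
<p>Named and unnamed arguments can be mixed. In this sketch, a hypothetical <code>scaleEach</code> function takes one named argument, <code>FACTOR</code>, and finds the remaining arguments in <code>ARGN</code>:</p>

<pre><code>function(scaleEach FACTOR)
    foreach(ARG ${ARGN})                        # ARGN holds the arguments after FACTOR
        math(EXPR N "${ARG} * ${FACTOR}")       # Multiply ARG by FACTOR; store result in N
        message("${N}")
    endforeach()
endfunction()

scaleEach(3 5 6 7)                              # Prints 15, 18, 21 on separate lines
</code></pre>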

<h2 id="including-other-scripts">Including Other Scripts</h2>

<p>CMake variables are defined at file scope. The <a href="https://cmake.org/cmake/help/latest/command/include.html"><code>include</code></a> command executes another CMake script in the <strong>same scope</strong> as the calling script. It&rsquo;s a lot like the <code>#include</code> directive in C/C++. It&rsquo;s typically used to define a common set of functions or macros in the calling script. It uses the variable <a href="https://cmake.org/cmake/help/latest/variable/CMAKE_MODULE_PATH.html"><code>CMAKE_MODULE_PATH</code></a> as a search path.</p>

<p>The <a href="https://cmake.org/cmake/help/v3.3/command/find_package.html"><code>find_package</code></a> command looks for scripts of the form <code>Find*.cmake</code> and also runs them in the same scope. Such scripts are often used to help find external libraries. For example, if there is a file named <code>FindSDL2.cmake</code> in the search path, <code>find_package(SDL2)</code> is equivalent to <code>include(FindSDL2.cmake)</code>. (Note that there are several ways to use the <code>find_package</code> command &ndash; this is just one of them.)</p>

<p>CMake&rsquo;s <a href="https://cmake.org/cmake/help/latest/command/add_subdirectory.html"><code>add_subdirectory</code></a> command, on the other hand, creates a <strong>new scope</strong>, then executes the script named <code>CMakeLists.txt</code> from the specified directory in that new scope. You typically use it to add another CMake-based subproject, such as a library or executable, to the calling project. The targets defined by the subproject are added to the build pipeline unless otherwise specified. None of the variables defined in the subproject&rsquo;s script will pollute the parent&rsquo;s scope unless the <code>set</code> command&rsquo;s <code>PARENT_SCOPE</code> option is used.</p>
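
<p>Put together, a top-level <code>CMakeLists.txt</code> might use all three commands roughly as follows. This is only a sketch: the project name, the <code>cmake</code> folder, <code>Helpers.cmake</code> and the <code>engine</code> subdirectory are all hypothetical.</p>

<pre><code># Top-level CMakeLists.txt (hypothetical project layout)
cmake_minimum_required(VERSION 3.5)
project(MyGame)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake")
include(Helpers)                # Runs cmake/Helpers.cmake in this scope
find_package(SDL2)              # Can run cmake/FindSDL2.cmake, also in this scope

add_subdirectory(engine)        # Runs engine/CMakeLists.txt in a new child scope
</code></pre>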

<p>As an example, here are some of the scripts involved when you run CMake on the <a href="https://github.com/preshing/turf">Turf</a> project:</p>

<p><img class="center" src="https://preshing.com/images/cmake-variable-scopes.png" /></p>

<h2 id="getting-and-setting-properties">Getting and Setting Properties</h2>

<p>A CMake script defines <strong>targets</strong> using the <a href="https://cmake.org/cmake/help/latest/command/add_executable.html"><code>add_executable</code></a>, <a href="https://cmake.org/cmake/help/latest/command/add_library.html"><code>add_library</code></a> or <a href="https://cmake.org/cmake/help/latest/command/add_custom_target.html"><code>add_custom_target</code></a> commands. Once a target is created, it has <strong>properties</strong> that you can manipulate using the <a href="https://cmake.org/cmake/help/latest/command/get_property.html"><code>get_property</code></a> and <a href="https://cmake.org/cmake/help/latest/command/set_property.html"><code>set_property</code></a> commands. Unlike variables, targets are visible in every scope, even if they were defined in a subdirectory. All target properties are strings.</p>

<pre><code>add_executable(MyApp "main.cpp")        # Create a target named MyApp

# Get the target's SOURCES property and assign it to MYAPP_SOURCES
get_property(MYAPP_SOURCES TARGET MyApp PROPERTY SOURCES)

message("${MYAPP_SOURCES}")             # Prints: main.cpp
</code></pre>

<p>Other <a href="https://cmake.org/cmake/help/latest/manual/cmake-properties.7.html#properties-on-targets">target properties</a> include <code>LINK_LIBRARIES</code>, <code>INCLUDE_DIRECTORIES</code> and <code>COMPILE_DEFINITIONS</code>. Those properties are modified, indirectly, by the <a href="https://cmake.org/cmake/help/latest/command/target_link_libraries.html"><code>target_link_libraries</code></a>, <a href="https://cmake.org/cmake/help/latest/command/target_include_directories.html"><code>target_include_directories</code></a> and <a href="https://cmake.org/cmake/help/latest/command/target_compile_definitions.html"><code>target_compile_definitions</code></a> commands. At the end of the script, CMake uses those target properties to generate the build pipeline.</p>
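
<p>As a sketch of that indirect relationship, the following fragment (meant for a <code>CMakeLists.txt</code>, not for <code>cmake -P</code>) modifies <code>COMPILE_DEFINITIONS</code> through a command, reads it back, then appends to it directly. The definitions <code>USE_FANCY_LOGGING</code> and <code>EXTRA_CHECKS</code> are made up:</p>

<pre><code>add_executable(MyApp "main.cpp")
target_compile_definitions(MyApp PRIVATE USE_FANCY_LOGGING=1)

# Read back the property that the command above modified
get_property(MYAPP_DEFS TARGET MyApp PROPERTY COMPILE_DEFINITIONS)
message("${MYAPP_DEFS}")                # Should include USE_FANCY_LOGGING=1

# Append to the same property directly
set_property(TARGET MyApp APPEND PROPERTY COMPILE_DEFINITIONS "EXTRA_CHECKS=1")
</code></pre>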

<p>There are properties for other CMake entities, too. There is a set of <a href="https://cmake.org/cmake/help/latest/manual/cmake-properties.7.html#properties-on-directories">directory properties</a> at every file scope. There is a set of <a href="https://cmake.org/cmake/help/latest/manual/cmake-properties.7.html#properties-of-global-scope">global properties</a> that is accessible from all scripts. And there is a set of <a href="https://cmake.org/cmake/help/latest/manual/cmake-properties.7.html#properties-on-source-files">source file properties</a> for every C/C++ source file.</p>
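
<p>Global properties use the same pair of commands with the <code>GLOBAL</code> keyword in place of a target. A minimal sketch, using an invented property name:</p>

<pre><code>set_property(GLOBAL PROPERTY MYPROJECT_GREETING "Hello from anywhere")
get_property(GREETING GLOBAL PROPERTY MYPROJECT_GREETING)
message("${GREETING}")                  # Prints: Hello from anywhere
</code></pre>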

<p>Congratulations! You now know the CMake scripting language &ndash; or at least, it should be easier to understand large scripts using CMake&rsquo;s <a href="https://cmake.org/cmake/help/latest/manual/cmake-commands.7.html">command reference</a>. Otherwise, the only thing missing from this guide, that I can think of, is <a href="https://cmake.org/cmake/help/latest/manual/cmake-generator-expressions.7.html#manual:cmake-generator-expressions(7)">generator expressions</a>. Let me know if I forgot anything else!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[How to Build a CMake-Based Project]]></title>
    <link href="https://preshing.com/20170511/how-to-build-a-cmake-based-project"/>
    <updated>2017-05-11T08:30:00-04:00</updated>
    <id>https://preshing.com/?p=20170511</id>
    <content type="html"><![CDATA[<p><a href="https://cmake.org/">CMake</a> is a versatile tool that helps you build C/C++ projects on just about any platform you can think of. It&rsquo;s used by many popular open source projects including LLVM, Qt, KDE and Blender.</p>

<p>All CMake-based projects contain a script named <code>CMakeLists.txt</code>, and this post is meant as a guide for configuring and building such projects. This post won&rsquo;t show you how to <em>write</em> a CMake script &ndash; that&rsquo;s getting ahead of things, in my opinion.</p>

<p>As an example, I&rsquo;ve prepared a <a href="https://github.com/preshing/CMakeDemo">CMake-based project</a> that uses SDL2 and OpenGL to render a spinning 3D logo. You can build it on Windows, macOS or Linux.</p>

<p><a href="https://github.com/preshing/CMakeDemo"><img class="center" src="https://preshing.com/images/cmakedemo-github.png" /></a>
<a href="https://github.com/preshing/CMakeDemo"><img class="center" src="https://preshing.com/images/cmakedemo-preview.png" /></a></p>

<!--more-->
<div class="panel_note">
<p>The information here applies to any CMake-based project, so feel free to skip ahead to any section. However, I recommend reading the first two sections first.</p>
<ul>
<li><a href="#the-source-and-binary-folders">The Source and Binary Folders</a></li>
<li><a href="#the-configure-and-generate-steps">The Configure and Generate Steps</a></li>
<li><a href="#running-cmake-from-the-command-line">Running CMake from the Command Line</a></li>
<li><a href="#running-cmake-gui">Running cmake-gui</a></li>
<li><a href="#running-ccmake">Running ccmake</a></li>
<li><a href="#building-with-unix-makefiles">Building with Unix Makefiles</a></li>
<li><a href="#building-with-visual-studio">Building with Visual Studio</a></li>
<li><a href="#building-with-xcode">Building with Xcode</a></li>
<li><a href="#building-with-qt-creator">Building with Qt Creator</a></li>
<li><a href="#other-cmake-features">Other CMake Features</a></li>
</ul>
</div>

<p>If you don&rsquo;t have CMake yet, there are installers and binary distributions <a href="https://cmake.org/download/">on the CMake website</a>. In Unix-like environments, including Linux, it&rsquo;s usually available through the system package manager. You can also install it through <a href="https://www.macports.org/">MacPorts</a>, <a href="https://brew.sh/">Homebrew</a>, <a href="https://www.cygwin.com/">Cygwin</a> or <a href="http://www.msys2.org/">MSYS2</a>.</p>

<h2 id="the-source-and-binary-folders">The Source and Binary Folders</h2>

<p>CMake generates <strong>build pipelines</strong>. A build pipeline might be a Visual Studio <code>.sln</code> file, an Xcode <code>.xcodeproj</code> or a Unix-style <code>Makefile</code>. It can also take several other forms.</p>

<p>To generate a build pipeline, CMake needs to know the <strong>source</strong> and <strong>binary</strong> folders. The source folder is the one containing <code>CMakeLists.txt</code>. The binary folder is where CMake generates the build pipeline. You can create the binary folder anywhere you want. A common practice is to create a subdirectory named <code>build</code> in the folder that contains <code>CMakeLists.txt</code>.</p>

<p><img class="center" src="https://preshing.com/images/cmake-concepts.png" /></p>

<p>By keeping the binary folder separate from the source, you can delete the binary folder at any time to get back to a clean slate. You can even create several binary folders, side-by-side, that use different build systems or configuration options.</p>

<p>The <strong>cache</strong> is an important concept. It&rsquo;s a single text file in the binary folder named <code>CMakeCache.txt</code>. This is where <strong>cache variables</strong> are stored. Cache variables include user-configurable options defined by the project such as <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a>&rsquo;s <code>DEMO_ENABLE_MULTISAMPLE</code> option (explained later), and precomputed information to help speed up CMake runs. (You can, and will, re-run CMake several times on the same binary folder.)</p>

<p>You aren&rsquo;t meant to submit the generated build pipeline to source control, as it usually contains paths that are hardcoded to the local filesystem. Instead, simply re-run CMake each time you clone the project to a new folder. I usually add the rule <code>*build*/</code> to my <code>.gitignore</code> files.</p>

<h2 id="the-configure-and-generate-steps">The Configure and Generate Steps</h2>

<p>As you&rsquo;ll see in the following sections, there are several ways to run CMake. No matter how you run it, it performs two steps: the <strong>configure</strong> step and the <strong>generate</strong> step.</p>

<p><img class="center" src="https://preshing.com/images/cmake-simple-flowchart.png" /></p>

<p>The <code>CMakeLists.txt</code> script is executed during the configure step. This script is responsible for defining <strong>targets</strong>. Each target represents an executable, library, or some other output of the build pipeline.</p>

<p>If the configure step succeeds &ndash; meaning <code>CMakeLists.txt</code> completed without errors &ndash; CMake will generate a build pipeline using the targets defined by the script. The type of build pipeline generated depends on the type of <strong>generator</strong> used, as explained in the following sections.</p>

<p>Additional things may happen during the configure step, depending on the contents of <code>CMakeLists.txt</code>. For example, in our sample <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a> project, the configure step also:</p>

<ul>
  <li>Finds the header files and libraries for SDL2 and OpenGL.</li>
  <li>Generates a header file <code>demo-config.h</code> in the binary folder, which will be included from C++ code.</li>
</ul>

<p><img class="center" src="https://preshing.com/images/cmake-linux-config-steps.png" /></p>

<p>In a more sophisticated project, the configure step might also test the availability of system functions (as a traditional Unix <code>configure</code> script would), or define a special &ldquo;install&rdquo; target (to help create a distributable package). If you re-run CMake on the same binary folder, many of the slow steps are skipped during subsequent runs, thanks to the cache.</p>

<h2 id="running-cmake-from-the-command-line">Running CMake from the Command Line</h2>

<p>Before running CMake, make sure you have the required dependencies for your project and platform. For <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a> on Windows, you can run <code>setup-win32.py</code>. For other platforms, check the <a href="https://github.com/preshing/CMakeDemo/blob/master/README.md">README</a>.</p>

<p>You&rsquo;ll often want to tell CMake which generator to use. For a list of available generators, run <code>cmake --help</code>.</p>

<p><img class="center" src="https://preshing.com/images/cmake-generators.png" /></p>

<p>Create the binary folder, <code>cd</code> to that folder, then run <code>cmake</code>, specifying the path to the source folder on the command line. Specify the desired generator using the <code>-G</code> option. If you omit the <code>-G</code> option, <code>cmake</code> will choose one for you. (If you don&rsquo;t like its choice, you can always delete the binary folder and start over.)</p>

<pre><code>mkdir build
cd build
cmake -G "Visual Studio 15 2017" ..
</code></pre>

<p>If there are project-specific configuration options, you can specify those on the command line as well. For example, the CMakeDemo project has a configuration option <code>DEMO_ENABLE_MULTISAMPLE</code> that defaults to 0. You can enable this configuration option by specifying <code>-DDEMO_ENABLE_MULTISAMPLE=1</code> on the <code>cmake</code> command line. Changing the value of <code>DEMO_ENABLE_MULTISAMPLE</code> will change the contents of <code>demo-config.h</code>, a header file that&rsquo;s generated by <code>CMakeLists.txt</code> during the configure step. The value of this variable is also stored in the cache so that it persists during subsequent runs. Other projects have different configuration options.</p>

<pre><code>cmake -G "Visual Studio 15 2017" -DDEMO_ENABLE_MULTISAMPLE=1 ..
</code></pre>
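
<p>For context, here&rsquo;s a sketch of how a project <em>could</em> define such an option and generate the header. This is not CMakeDemo&rsquo;s actual script. The template file <code>demo-config.h.in</code> and the description string are assumptions:</p>

<pre><code># Define a cache variable with a default of 0; a value given with -D or
# already stored in the cache takes precedence over this default
set(DEMO_ENABLE_MULTISAMPLE "0" CACHE BOOL "Hypothetical description of the option")

# Copy demo-config.h.in to the binary folder, replacing @DEMO_ENABLE_MULTISAMPLE@
# with the variable's current value
configure_file("${CMAKE_CURRENT_SOURCE_DIR}/demo-config.h.in"
               "${CMAKE_CURRENT_BINARY_DIR}/demo-config.h")
</code></pre>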

<p>If you change your mind about the value of <code>DEMO_ENABLE_MULTISAMPLE</code>, you can re-run CMake at any time. On subsequent runs, instead of passing the source folder path to the <code>cmake</code> command line, you can simply specify the path to the existing binary folder. CMake will find all previous settings in the cache, such as the choice of generator, and re-use them.</p>

<pre><code>cmake -DDEMO_ENABLE_MULTISAMPLE=0 .
</code></pre>

<p>You can view project-defined cache variables by running <code>cmake -L -N .</code>. Here you can see <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a>&rsquo;s <code>DEMO_ENABLE_MULTISAMPLE</code> option left at its default 0 value:</p>

<p><img class="center" src="https://preshing.com/images/cmake-cl-cache-vars.png" /></p>

<h2 id="running-cmake-gui">Running cmake-gui</h2>

<p>I prefer the <a href="#running-cmake-from-the-command-line">command line</a>, but CMake also has a GUI. The GUI offers an interactive way to set cache variables. Again, make sure to install your project&rsquo;s required dependencies first.</p>

<p>To use it, run <code>cmake-gui</code>, fill in the source and binary folder paths, then click Configure.</p>

<p><img class="center" src="https://preshing.com/images/cmake-gui.png" /></p>

<p>If the binary folder doesn&rsquo;t exist, CMake will prompt you to create it. It will then ask you to select a generator.</p>

<p><img class="center" src="https://preshing.com/images/cmake-choose-generator.png" /></p>

<p>After the initial configure step, the GUI will show you a list of cache variables, similar to the list you see when you run <code>cmake -L -N .</code> from the command line. New cache variables are highlighted in red. (In this case, that&rsquo;s all of them.) If you click Configure again, the red highlights will disappear, since the variables are no longer considered new.</p>

<p><img class="center" src="https://preshing.com/images/cmake-gui-options.png" /></p>

<p>The idea is that if you change a cache variable, then click Configure, new cache variables might appear as a result of your change. The red highlights are meant to help you see any new variables, customize them, then click Configure again. In practice, changing a value doesn&rsquo;t introduce new cache variables very often. It depends how the project&rsquo;s <code>CMakeLists.txt</code> script was written.</p>

<p>Once you&rsquo;ve customized the cache variables to your liking, click Generate. This will generate the build pipeline in the binary folder. You can then use it to build your project.</p>

<h2 id="running-ccmake">Running ccmake</h2>

<p><code>ccmake</code> is the console equivalent to <code>cmake-gui</code>. Like the GUI, it lets you set cache variables interactively. It can be handy when running CMake on a remote machine, or if you just like using the console. If you can figure out the <a href="#running-cmake-gui">CMake GUI</a>, you can figure out <code>ccmake</code>.</p>

<p><img class="center" src="https://preshing.com/images/ccmake-grab.png" /></p>

<h2 id="building-with-unix-makefiles">Building with Unix Makefiles</h2>

<p>CMake generates a Unix makefile by default when <a href="#running-cmake-from-the-command-line">run from the command line</a> in a Unix-like environment. Of course, you can generate makefiles explicitly using the <code>-G</code> option. When generating a makefile, you should also define the <code>CMAKE_BUILD_TYPE</code> variable. Assuming the source folder is the parent:</p>

<pre><code>cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Debug ..
</code></pre>

<p>You should define the <code>CMAKE_BUILD_TYPE</code> variable because makefiles generated by CMake are <strong>single-configuration</strong>. Unlike with a Visual Studio solution, you can&rsquo;t use the same makefile to build multiple configurations such as Debug and Release. A single makefile is capable of building <em>exactly one</em> build type. By default, the available types are Debug, MinSizeRel, RelWithDebInfo and Release. Watch out &ndash; if you forget to define <code>CMAKE_BUILD_TYPE</code>, you&rsquo;ll probably get an unoptimized build without debug information, which is useless. To change to a different build type, you must re-run CMake and generate a new makefile.</p>

<p>Personally, I also find CMake&rsquo;s default Release configuration useless because it doesn&rsquo;t generate any debug information. If you&rsquo;ve ever opened a crash dump or fixed a bug in Release, you&rsquo;ll appreciate the availability of debug information, even in an optimized build. That&rsquo;s why, in my other CMake projects, I usually delete the Release configuration from <code>CMakeLists.txt</code> and use RelWithDebInfo instead.</p>

<p>Once the makefile exists, you can actually build your project by running <code>make</code>. By default, <code>make</code> will build every target that was defined by <code>CMakeLists.txt</code>. In <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a>&rsquo;s case, there&rsquo;s only one target. You can also build a specific target by passing its name to <code>make</code>:</p>

<pre><code>make CMakeDemo
</code></pre>

<p>The makefile generated by CMake detects header file dependencies automatically, so editing a single header file won&rsquo;t necessarily rebuild the entire project. You can also parallelize the build by passing <code>-j 4</code> (or a higher number) to <code>make</code>.</p>

<p>CMake also exposes a <a href="https://ninja-build.org/">Ninja</a> generator. Ninja is similar to <code>make</code>, but faster. The generator produces a <code>build.ninja</code> file, which is similar to a <code>Makefile</code>. The Ninja generator is also single-configuration. Ninja runs jobs in parallel by default, choosing the job count based on the number of available CPUs, so you only need its <code>-j</code> option to override that choice.</p>

<h2 id="building-with-visual-studio">Building with Visual Studio</h2>

<p>We&rsquo;ll generate a Visual Studio <code>.sln</code> file from the <a href="#running-cmake-from-the-command-line">CMake command line</a>. If you have several versions of Visual Studio installed, you&rsquo;ll want to tell <code>cmake</code> which version to use. Again, assuming that the source folder is the parent:</p>

<pre><code>cmake -G "Visual Studio 15 2017" ..
</code></pre>

<p>The above command line will generate a Visual Studio <code>.sln</code> file for a 32-bit build. There are no multiplatform <code>.sln</code> files using CMake, so for a 64-bit build, you must specify the 64-bit generator:</p>

<pre><code>cmake -G "Visual Studio 15 2017 Win64" ..
</code></pre>

<p>Open the resulting <code>.sln</code> file in Visual Studio, go to the Solution Explorer panel, right-click the target you want to run, then choose &ldquo;Set as Startup Project&rdquo;. Build and run as you normally would.</p>

<p><img class="center" src="https://preshing.com/images/cmake-sln-explorer.png" /></p>

<p>Note that CMake adds two additional targets to the solution: ALL_BUILD and ZERO_CHECK. ZERO_CHECK automatically re-runs CMake when it detects a change to <code>CMakeLists.txt</code>. ALL_BUILD usually builds all other targets, making it somewhat redundant in Visual Studio. If you&rsquo;re used to setting up your solutions a certain way, it might seem annoying to have these extra targets in your <code>.sln</code> file, but you get used to it. CMake lets you organize targets and source files into folders, but I didn&rsquo;t demonstrate that in the <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a> sample.</p>

<p>Like any Visual Studio solution, you can change build type at any time from the Solution Configuration drop-down list. The <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a> sample uses CMake&rsquo;s default set of build types, shown below. Again, I find the default Release configuration rather useless as it doesn&rsquo;t produce any debug information. In my other CMake projects, I usually delete the Release configuration from <code>CMakeLists.txt</code> and use RelWithDebInfo instead.</p>

<p><img class="center" src="https://preshing.com/images/cmake-vs-configs.png" /></p>

<h3 id="built-in-cmake-support-in-visual-studio-2017">Built-In CMake Support in Visual Studio 2017</h3>

<p>In Visual Studio 2017, Microsoft introduced <a href="https://blogs.msdn.microsoft.com/vcblog/2016/10/05/cmake-support-in-visual-studio/">another way to use CMake</a> with Visual Studio. You can now open the source folder containing <code>CMakeLists.txt</code> from Visual Studio&rsquo;s File &rarr; Open &rarr; Folder menu. This new method avoids creating intermediate <code>.sln</code> and <code>.vcxproj</code> files. It also exposes 32-bit and 64-bit builds in the same workspace. It&rsquo;s a nice idea that, in my opinion, falls short for a few reasons:</p>

<ul>
  <li>If there are any source files <em>outside</em> the source folder containing <code>CMakeLists.txt</code>, they won&rsquo;t appear in the Solution Explorer.</li>
  <li>The familiar C/C++ Property Pages are no longer available.</li>
  <li>Cache variables can only be set by editing a JSON file, which is pretty unintuitive for a Visual IDE.</li>
</ul>

<p>I&rsquo;m not really a fan. For now, I intend to keep generating <code>.sln</code> files by hand using CMake.</p>

<h2 id="building-with-xcode">Building with Xcode</h2>

<p>The CMake website publishes a <a href="https://cmake.org/download/">binary distribution</a> of CMake for MacOS as a <code>.dmg</code> file. The <code>.dmg</code> file contains an app that you can drag &amp; drop to your Applications folder. Note that if you install CMake this way, <code>cmake</code> won&rsquo;t be available from the command line unless you create a link to <code>/Applications/CMake.app/Contents/bin/cmake</code> somewhere. I prefer installing CMake from <a href="https://www.macports.org/">MacPorts</a> because it sets up the command line for you, and because dependencies like SDL2 can be installed the same way.</p>

<p>Specify the Xcode generator from the <a href="#running-cmake-from-the-command-line">CMake command line</a>. Again, assuming that the source folder is the parent:</p>

<pre><code>cmake -G "Xcode" ..
</code></pre>

<p>This will create an <code>.xcodeproj</code> folder. Open it in Xcode. (I tested in Xcode 8.3.1.) In the Xcode toolbar, click the &ldquo;active scheme&rdquo; drop-down list and select the target you want to run.</p>

<p><img class="center" src="https://preshing.com/images/cmake-xcode-target.png" /></p>

<p>After that, click &ldquo;Edit Scheme&hellip;&rdquo; from the same drop-down list, then choose a build configuration under Run &rarr; Info. Again, I don&rsquo;t recommend CMake&rsquo;s default Release configuration, as the lack of debug information limits its usefulness.</p>

<p><img class="center" src="https://preshing.com/images/cmake-xcode-config.png" /></p>

<p>Finally, build from the Product &rarr; Build menu (or the &#8984;B shortcut), run using Product &rarr; Run (or &#8984;R), or click the big play button in the toolbar.</p>

<p>It&rsquo;s possible to make CMake generate an Xcode project that builds a MacOS bundle or framework, but I didn&rsquo;t demonstrate that in the <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a> project.</p>

<h2 id="building-with-qt-creator">Building with Qt Creator</h2>

<p>Qt Creator provides built-in support for CMake using the <a href="#building-with-unix-makefiles">Makefile or Ninja generator</a> under the hood. I tested the following steps in Qt Creator 3.5.1.</p>

<p>In Qt Creator, go to File &rarr; Open File or Project&hellip; and choose <code>CMakeLists.txt</code> from the source folder you want to build.</p>

<p><img class="center" src="https://preshing.com/images/cmake-qt-choose-folder.png" /></p>

<p>Qt Creator will prompt you for the location of the binary folder, calling it the &ldquo;build directory&rdquo;. By default, it suggests a path adjacent to the source folder. You can change this location if you want.</p>

<p><img class="center" src="https://preshing.com/images/cmake-qt-build-location.png" /></p>

<p>When prompted to run CMake, make sure to define the <code>CMAKE_BUILD_TYPE</code> variable since the Makefile generator is <a href="#building-with-unix-makefiles">single-configuration</a>. You can also specify project-specific variables here, such as <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a>&rsquo;s <code>DEMO_ENABLE_MULTISAMPLE</code> option.</p>

<p><img class="center" src="https://preshing.com/images/cmake-qt-run.png" /></p>

<p>After that, you can build and run the project from Qt Creator&rsquo;s menus or using the Shift+Ctrl+B or F5 shortcuts.</p>

<p>If you want to re-run CMake, for example to change the build type from Debug to RelWithDebInfo, navigate to Projects &rarr; Build &amp; Run &rarr; Build, then click &ldquo;Run CMake&rdquo;.</p>

<p><img class="center" src="https://preshing.com/images/cmake-qt-project-mode.png" /></p>

<p>The <a href="https://github.com/preshing/CMakeDemo">CMakeDemo</a> project contains a single executable target, but if your project contains multiple executable targets, you can tell Qt Creator which one to run by navigating to Projects &rarr; Build &amp; Run &rarr; Run and changing the &ldquo;Run configuration&rdquo; to something else. The drop-down list is automatically populated with a list of executable targets created by the build pipeline.</p>

<p><img class="center" src="https://preshing.com/images/cmake-qt-run-config.png" /></p>

<h2 id="other-cmake-features">Other CMake Features</h2>

<ul>
  <li>You can perform a build from the command line, regardless of the generator used: <code>cmake --build . --target CMakeDemo --config Debug</code></li>
  <li>You can create build pipelines that cross-compile for other environments with the help of the <code>CMAKE_TOOLCHAIN_FILE</code> variable.</li>
  <li>You can generate a <code>compile_commands.json</code> file that can be fed to Clang&rsquo;s <a href="https://clang.llvm.org/docs/LibTooling.html">LibTooling</a> library.</li>
</ul>

<p>I really appreciate how CMake helps integrate all kinds of C/C++ components and build them in all kinds of environments. It&rsquo;s not without its flaws, but once you&rsquo;re proficient with it, the open source world is your oyster, even when integrating non-CMake projects. My next post will be a crash course in CMake&rsquo;s scripting language.</p>

<p><a href="https://www.amazon.com/gp/product/1930934319?ie=UTF8&amp;tag=preshonprogr-20&amp;camp=1789&amp;linkCode=xm2&amp;creativeASIN=1930934319"><img class="right" src="https://preshing.com/images/mastering-cmake.png" /></a>If you wish to become a power user, and don&rsquo;t mind forking over a few bucks, the authors&rsquo; book <a href="https://www.amazon.com/gp/product/1930934319?ie=UTF8&amp;tag=preshonprogr-20&amp;camp=1789&amp;linkCode=xm2&amp;creativeASIN=1930934319">Mastering CMake</a> offers a big leap forward. Their article in <a href="http://aosabook.org/en/cmake.html">The Architecture of Open Source Applications</a> is also an interesting read.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Using Quiescent States to Reclaim Memory]]></title>
    <link href="https://preshing.com/20160726/using-quiescent-states-to-reclaim-memory"/>
    <updated>2016-07-26T06:30:00-04:00</updated>
    <id>https://preshing.com/?p=20160726</id>
    <content type="html"><![CDATA[<p>If you want to support multiple readers for a data structure, while protecting against concurrent writes, a <a href="https://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock">read-write lock</a> might seem like the only way &ndash; but it isn&rsquo;t! You can achieve the same thing without a read-write lock if you allow several copies of the data structure to exist in memory. You just need a way to delete old copies when they&rsquo;re no longer in use.</p>

<p>Let&rsquo;s look at one way to achieve that in C++. We&rsquo;ll start with an example based on a read-write lock.</p>

<h2 id="using-a-read-write-lock">Using a Read-Write Lock</h2>

<p>Suppose you have a network server with dozens of threads. Each thread broadcasts messages to dozens of connected clients. Once in a while, a new client connects or an existing client disconnects, so the list of connected clients must change. We can store the list of connected clients in a <code>std::vector</code> and protect it using a read-write lock such as <code>std::shared_mutex</code>.</p>

<!--more-->
<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">class</span> <span class="class">Server</span> {
<span class="directive">private</span>:
    std::shared_mutex m_rwLock;                                   <span class="comment">// Read-write lock</span>
    std::vector&lt;<span class="predefined-type">int</span>&gt; m_clients;                                   <span class="comment">// List of connected clients</span>
    
<span class="directive">public</span>:
    <span class="directive">void</span> broadcast(<span class="directive">const</span> <span class="directive">void</span>* msg, size_t len) {
        std::shared_lock&lt;std::shared_mutex&gt; shared(m_rwLock);     <span class="comment">// Shared lock</span>
        <span class="keyword">for</span> (<span class="predefined-type">int</span> fd : m_clients)
            send(fd, msg, len, <span class="integer">0</span>);
    }

    <span class="directive">void</span> addClient(<span class="predefined-type">int</span> fd) {
        std::unique_lock&lt;std::shared_mutex&gt; exclusive(m_rwLock);  <span class="comment">// Exclusive lock</span>
        m_clients.push_back(fd);
    }

    ...
</pre></div>
</div>
</div>

<p>The <code>broadcast</code> function reads from the list of connected clients, but doesn&rsquo;t modify it, so it takes a read lock (also known as a shared lock). <code>addClient</code>, on the other hand, needs to modify the list, so it takes a write lock (also known as an exclusive lock).</p>

<p>That&rsquo;s all fine and dandy. Now let&rsquo;s eliminate the read-write lock by allowing multiple copies of the list to exist at the same time.</p>

<h2 id="eliminating-the-read-write-lock">Eliminating the Read-Write Lock</h2>

<p>First, we must establish an atomic pointer to the current list. This pointer will hold the most up-to-date list of connected clients at any moment in time.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">class</span> <span class="class">Server</span> {
<span class="directive">private</span>:
    <span class="keyword">struct</span> ClientList {
        std::vector&lt;<span class="predefined-type">int</span>&gt; clients;
    };

    <span class="highlight">std::atomic&lt;ClientList*&gt; m_currentList;</span>      <span class="comment">// The most up-to-date list</span>

<span class="directive">public</span>:
    ...
</pre></div>
</div>
</div>

<p>The <code>broadcast</code> function copies that pointer to a local variable, then uses that local variable for the remainder of the function. Note that the shared lock has been eliminated. That reduces the number of modifications to shared memory, which is <a href="http://www.1024cores.net/home/lock-free-algorithms/first-things-first">better for scalability</a>.</p>

<div><div class="CodeRay">
  <div class="code"><pre>    <span class="directive">void</span> broadcast(<span class="directive">const</span> <span class="directive">void</span>* msg, size_t len) {
        ClientList* list = m_currentList.load();        <span class="comment">// Atomic load from m_currentList</span>
        <span class="keyword">for</span> (<span class="predefined-type">int</span> fd : list-&gt;clients)
            send(fd, msg, len, <span class="integer">0</span>);
    }
</pre></div>
</div>
</div>

<p>The <code>addClient</code> function, called less frequently, makes a new, private copy of the list, modifies the copy, then publishes the new copy back to the atomic pointer. For simplicity, let&rsquo;s assume all calls to <code>addClient</code> are made from a single thread. (If calls were made from multiple threads, we&rsquo;d need to protect <code>addClient</code> with a mutex or a <a href="http://preshing.com/20150402/you-can-do-any-kind-of-atomic-read-modify-write-operation">CAS loop</a>.)</p>

<div><div class="CodeRay">
  <div class="code"><pre>    <span class="directive">void</span> addClient(<span class="predefined-type">int</span> fd) {
        ClientList* oldList = m_currentList.load();        <span class="comment">// Atomic load from m_currentList</span>
        ClientList* newList = <span class="keyword">new</span> ClientList{*oldList};    <span class="comment">// Make a private copy</span>
        newList-&gt;clients.push_back(fd);                    <span class="comment">// Modify it</span>
        m_currentList.store(newList);                      <span class="comment">// Publish the new copy</span>

        <span class="comment">// *** Note: Must do something with the old list here ***</span>
    }
</pre></div>
</div>
</div>
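<p>For the multi-threaded case mentioned above, here&rsquo;s a minimal sketch of the CAS-loop variant. It isn&rsquo;t part of the server code in this post: <code>addClientCas</code> is a hypothetical free function standing in for <code>addClient</code>, and it assumes nothing deletes the old list while the loop is still reading it.</p>

```cpp
#include <atomic>
#include <cassert>
#include <vector>

struct ClientList {
    std::vector<int> clients;
};

// Hypothetical stand-in for Server::addClient that tolerates several writers.
// Returns the list that was replaced; the caller still has to dispose of it
// safely, exactly as in the single-writer version.
ClientList* addClientCas(std::atomic<ClientList*>& currentList, int fd) {
    assert(currentList.load() != nullptr);
    ClientList* oldList = currentList.load();
    ClientList* newList = new ClientList;
    for (;;) {
        *newList = *oldList;               // (Re)make a private copy of the latest list
        newList->clients.push_back(fd);    // Modify it
        // Publish only if no other writer replaced the list in the meantime.
        // On failure, oldList is refreshed with the current pointer and we retry.
        if (currentList.compare_exchange_weak(oldList, newList))
            return oldList;
    }
}
```

<p>The copy is redone on every retry, so a writer that loses the race never publishes a list that is missing another writer&rsquo;s client.</p>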

<p>At the moment when <code>m_currentList</code> is replaced, other threads might still be using the old list, but that&rsquo;s fine. We allow it.</p>

<p><img class="center" src="https://preshing.com/images/qsbr-replace-client-list.png" /></p>

<p>We aren&rsquo;t done yet, though. <code>addClient</code> needs to do something with the old list. We can&rsquo;t delete the old list immediately, since other threads might still be using it. And we can&rsquo;t <em>not</em> delete it, since that would result in a memory leak. Let&rsquo;s introduce a new object that&rsquo;s responsible for deleting old lists at a safe point in time. We&rsquo;ll call it a <code>MemoryReclaimer</code>.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">class</span> <span class="class">Server</span> {
    ...

    <span class="highlight">MemoryReclaimer m_reclaimer;</span>

    ...

    <span class="directive">void</span> addClient(<span class="predefined-type">int</span> fd) {
        ClientList* oldList = m_currentList.load();         <span class="comment">// Atomic load from m_currentList</span>
        ClientList* newList = <span class="keyword">new</span> ClientList{*oldList};     <span class="comment">// Make a private copy</span>
        newList-&gt;clients.push_back(fd);                     <span class="comment">// Modify it</span>
        m_currentList.store(newList);                       <span class="comment">// Publish the new copy</span>

        <span class="highlight">m_reclaimer.addCallback([=](){ <span class="keyword">delete</span> oldList; });</span>
    }

    ...
</pre></div>
</div>
</div>

<p>It&rsquo;s interesting to note that if this was Java, we wouldn&rsquo;t need to introduce such a <code>MemoryReclaimer</code>. We could just stop referencing the old list, and Java&rsquo;s garbage collector would eventually delete it. But this is C++, so we must clean up those old lists explicitly.</p>

<p>We notify <code>MemoryReclaimer</code> about objects to delete by passing a callback to <code>addCallback</code>. <code>MemoryReclaimer</code> must invoke this callback sometime after all threads are finished reading from the old object. It must also ensure that none of those threads will ever access the old object again. Here&rsquo;s one way to achieve both goals.</p>

<h2 id="quiescent-state-based-reclamation">Quiescent State-Based Reclamation</h2>

<p>The approach I&rsquo;ll describe here is known as <em>quiescent state-based reclamation</em>, or QSBR for short. The idea is to identify a <strong>quiescent state</strong> in each thread. A quiescent state is a bit like the opposite of a critical section. It&rsquo;s some point in the thread&rsquo;s execution that lies outside all related critical sections performed by that thread. For example, our <code>broadcast</code> function still contains a critical section, even though it doesn&rsquo;t explicitly lock anymore, because it&rsquo;s critical not to delete the list before the function returns. Therefore, at a very minimum, the quiescent state should lie somewhere outside the <code>broadcast</code> function.</p>

<p>Wherever we choose to put the quiescent state, we must notify the <code>MemoryReclaimer</code> object about it. In our case, we&rsquo;ll require threads to call <code>onQuiescentState</code>. At a minimum, before invoking a given callback, the <code>MemoryReclaimer</code> should wait until all participating threads have called <code>onQuiescentState</code> first. Once that condition is satisfied, it is guaranteed that if any preceding critical sections used the old object, those critical sections have ended.</p>

<p><img class="center" src="https://preshing.com/images/qsbr-timeline.png" /></p>

<p>Finding a good place to call <code>onQuiescentState</code> for each thread is really application-specific. Ideally, in our example, it would be called much less often than the <code>broadcast</code> function &ndash; otherwise, we&rsquo;d negate the benefit of eliminating the read-write lock in the first place. For example, it could be called after a fixed number of calls to <code>broadcast</code>, or a fixed amount of time, whichever comes first. If this was a game engine, it could be called on every iteration of the main loop, or some other coarse-grained unit of work.</p>
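<p>As a rough illustration of the &ldquo;fixed number of calls or fixed amount of time, whichever comes first&rdquo; idea, here&rsquo;s a small sketch. <code>QuiescentPolicy</code> and <code>shouldReport</code> are names I made up for this example; each thread would own one instance and call <code>onQuiescentState</code> whenever <code>shouldReport</code> returns true.</p>

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical per-thread helper: decides when to report a quiescent state.
class QuiescentPolicy {
public:
    using Clock = std::chrono::steady_clock;

    QuiescentPolicy(std::size_t maxCalls, Clock::duration maxElapsed)
        : m_maxCalls(maxCalls), m_maxElapsed(maxElapsed), m_last(Clock::now()) {
        assert(maxCalls > 0);
    }

    // Call once after each unit of work, outside the critical section
    // (for example, right after broadcast returns).
    // Returns true when the thread should report a quiescent state now.
    bool shouldReport() {
        ++m_calls;
        const Clock::time_point now = Clock::now();
        if (m_calls >= m_maxCalls || now - m_last >= m_maxElapsed) {
            m_calls = 0;      // Start counting toward the next report
            m_last = now;
            return true;
        }
        return false;
    }

private:
    std::size_t m_maxCalls;
    Clock::duration m_maxElapsed;
    Clock::time_point m_last;
    std::size_t m_calls = 0;
};
```

<p>Reading the clock on every call has a cost of its own, so in a hot path you might prefer to check only the counter. The right thresholds depend entirely on the application.</p>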

<h2 id="intervals">Intervals</h2>

<p>A simple implementation of <code>MemoryReclaimer</code> could work as follows. Instead of handling each callback individually, we can introduce the concept of <em>intervals</em>, and group callbacks together by interval. Once every thread has called <code>onQuiescentState</code>, the current interval is considered to end, and a new interval is considered to begin. At the end of each interval, we know that it&rsquo;s safe to invoke all the callbacks added in the <em>previous</em> interval, because every participating thread has called <code>onQuiescentState</code> since the previous interval ended.</p>

<p><img class="center" src="https://preshing.com/images/qsbr-intervals.png" /></p>

<p>Here&rsquo;s a quick implementation of such a <code>MemoryReclaimer</code>. It uses a <code>bool</code> vector to keep track of which threads have called <code>onQuiescentState</code> during the current interval, and which ones haven&rsquo;t yet. Every participating thread in the system must call <code>registerThread</code> beforehand.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">class</span> <span class="class">MemoryReclaimer</span> {
<span class="directive">private</span>:
    std::mutex m_mutex;
    std::vector&lt;<span class="predefined-type">bool</span>&gt; m_threadWasQuiescent;
    std::vector&lt;std::function&lt;<span class="directive">void</span>()&gt;&gt; m_currentIntervalCallbacks;
    std::vector&lt;std::function&lt;<span class="directive">void</span>()&gt;&gt; m_previousIntervalCallbacks;
    size_t m_numRemaining = <span class="integer">0</span>;

<span class="directive">public</span>:
    <span class="keyword">typedef</span> size_t ThreadIndex;

    ThreadIndex registerThread() {
        std::lock_guard&lt;std::mutex&gt; guard(m_mutex);
        ThreadIndex id = m_threadWasQuiescent.size();
        m_threadWasQuiescent.push_back(<span class="predefined-constant">false</span>);
        m_numRemaining++;
        <span class="keyword">return</span> id;
    }

    <span class="directive">void</span> addCallback(<span class="directive">const</span> std::function&lt;<span class="directive">void</span>()&gt;&amp; callback) {
        std::lock_guard&lt;std::mutex&gt; guard(m_mutex);
        m_currentIntervalCallbacks.push_back(callback);
    }

    <span class="directive">void</span> onQuiescentState(ThreadIndex id) {
        std::lock_guard&lt;std::mutex&gt; guard(m_mutex);
        <span class="keyword">if</span> (!m_threadWasQuiescent[id]) {
            m_threadWasQuiescent[id] = <span class="predefined-constant">true</span>;
            m_numRemaining--;
            <span class="keyword">if</span> (m_numRemaining == <span class="integer">0</span>) {
                <span class="comment">// End of interval. Invoke all callbacks from the previous interval.</span>
                <span class="keyword">for</span> (<span class="directive">const</span> <span class="directive">auto</span>&amp; callback : m_previousIntervalCallbacks) {
                    callback();
                }
                
                <span class="comment">// Move current callbacks to previous interval.</span>
                m_previousIntervalCallbacks = std::move(m_currentIntervalCallbacks);
                m_currentIntervalCallbacks.clear();
                
                <span class="comment">// Reset all thread statuses.</span>
                <span class="keyword">for</span> (size_t i = <span class="integer">0</span>; i &lt; m_threadWasQuiescent.size(); i++) {
                    m_threadWasQuiescent[i] = <span class="predefined-constant">false</span>;
                }
                m_numRemaining = m_threadWasQuiescent.size();
            }
        }
    }
};
</pre></div>
</div>
</div>
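<p>To see the interval bookkeeping in action, here&rsquo;s a condensed, self-contained variant of the class above. It&rsquo;s a sketch, not a drop-in replacement: I renamed it <code>MemoryReclaimerSketch</code>, and it differs in one way that is my own assumption about what&rsquo;s desirable, namely that it invokes the callbacks <em>after</em> releasing the mutex, so a slow callback doesn&rsquo;t block other threads.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

class MemoryReclaimerSketch {
    std::mutex m_mutex;
    std::vector<bool> m_threadWasQuiescent;
    std::vector<std::function<void()>> m_current, m_previous;
    std::size_t m_numRemaining = 0;

public:
    std::size_t registerThread() {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_threadWasQuiescent.push_back(false);
        m_numRemaining++;
        return m_threadWasQuiescent.size() - 1;
    }

    void addCallback(std::function<void()> callback) {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_current.push_back(std::move(callback));
    }

    void onQuiescentState(std::size_t id) {
        std::vector<std::function<void()>> ready;
        {
            std::lock_guard<std::mutex> guard(m_mutex);
            assert(id < m_threadWasQuiescent.size());
            if (m_threadWasQuiescent[id])
                return;                        // Already counted in this interval
            m_threadWasQuiescent[id] = true;
            if (--m_numRemaining != 0)
                return;                        // Still waiting on other threads
            // End of interval: callbacks from the previous interval are now safe.
            ready = std::move(m_previous);
            m_previous = std::move(m_current);
            m_current.clear();
            m_threadWasQuiescent.assign(m_threadWasQuiescent.size(), false);
            m_numRemaining = m_threadWasQuiescent.size();
        }
        for (const auto& callback : ready)     // Invoked outside the lock
            callback();
    }
};
```

<p>With two registered threads, a callback added during the first interval is not invoked when that interval ends. It only runs once both threads have reported a quiescent state again, at the end of the following interval.</p>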

<p>Not only does <code>MemoryReclaimer</code> guarantee that preceding critical sections have ended &ndash; when used correctly, it also ensures that no thread will ever use an old object again. Consider again our server&rsquo;s <code>addClient</code> function. This function modifies <code>m_currentList</code>, which doesn&rsquo;t necessarily become visible to other threads right away, then calls <code>addCallback</code>. <code>addCallback</code> locks a mutex, then unlocks it. According to the C++ standard (<a href="http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/n4594.pdf">§30.4.1.2.11</a>), the unlock will <a href="http://preshing.com/20130823/the-synchronizes-with-relation">synchronize-with</a> every subsequent lock of the same mutex, which in our case includes calls to <code>onQuiescentState</code> from other threads. As a result, the new value of <code>m_currentList</code> will automatically become visible to other threads when <code>onQuiescentState</code> is called.</p>

<p><img class="center" src="https://preshing.com/images/qsbr-synchronize.png" /></p>

<p>That&rsquo;s just one implementation of a <code>MemoryReclaimer</code> based on QSBR. It might be possible to implement a more efficient version, but I haven&rsquo;t thought too hard about it. If you know of a better one, let me know in the comments.</p>

<h2 id="related-information">Related Information</h2>

<p>I&rsquo;m not sure exactly when the term &ldquo;QSBR&rdquo; was coined, but it seems to have emerged from research into <a href="https://lwn.net/Articles/262464/">read-copy-update</a> (RCU), a family of techniques that&rsquo;s especially popular inside the Linux kernel. The memory reclamation strategy described in this post, on the other hand, takes place entirely at the application level. It&rsquo;s similar to the QSBR flavor of <a href="https://lwn.net/Articles/573424/">userspace RCU</a>.</p>

<p>I used this technique in <a href="https://github.com/preshing/junction">Junction</a> to implement a <a href="http://preshing.com/20160222/a-resizable-concurrent-map">resizable concurrent map</a>. Every time a map&rsquo;s contents are migrated to a new table, QSBR is used to reclaim the memory of the old table. If Junction used read-write locks to protect those tables instead, I don&rsquo;t think its maps would be as scalable.</p>

<p>QSBR is not the only memory reclamation strategy that exists. <a href="http://www.cs.toronto.edu/~tomhart/papers/tomhart_thesis.pdf">Tom Hart&rsquo;s 2005 thesis</a> gives a nice overview of other strategies. To be honest, I&rsquo;ve never personally seen any of those techniques used in any C++ application or library besides Junction. If you have, I&rsquo;d be interested to hear about it. I can only think of one or two instances where a game I worked on <em>might</em> have benefitted from QSBR, performance-wise.</p>

<p>QSBR can be used to clean up resources other than memory. For example, the server described in this post maintains a list of open file descriptors &ndash; one for each connected client. A safe strategy for closing those file descriptors could be based on QSBR as well.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Leapfrog Probing]]></title>
    <link href="https://preshing.com/20160314/leapfrog-probing"/>
    <updated>2016-03-14T16:24:00-04:00</updated>
    <id>https://preshing.com/?p=20160314</id>
    <content type="html"><![CDATA[<p>A <a href="https://en.wikipedia.org/wiki/Hash_table">hash table</a> is a data structure that stores a set of items, each of which maps a specific key to a specific value. There are many ways to implement a hash table, but they all have one thing in common: <strong>buckets</strong>. Every hash table maintains an array of buckets somewhere, and each item belongs to exactly one bucket.</p>

<p>To determine the bucket for a given item, you typically hash the item&rsquo;s key, then compute its <a href="https://en.wikipedia.org/wiki/Modulo_operation">modulus</a> &ndash; that is, the remainder when divided by the number of buckets. For a hash table with 16 buckets, the modulus is given by the final hexadecimal digit of the hash.</p>

<p><img class="center" src="https://preshing.com/images/buckets.png" /></p>

<p>Inevitably, several items will end up belonging to the same bucket. For simplicity, let&rsquo;s suppose the hash function is <a href="http://preshing.com/20160222/a-resizable-concurrent-map/#the-data-structure">invertible</a>, so that we only need to store hashed keys. A well-known strategy is to store the bucket contents in a linked list:</p>

<!--more-->
<p><img class="center" src="https://preshing.com/images/probe-chained-before.png" /></p>

<p>This strategy is known as <strong>separate chaining</strong>. Separate chaining tends to be relatively slow on modern CPUs, since it requires a lot of pointer lookups.</p>

<p>I&rsquo;m more fond of <strong>open addressing</strong>, which stores all the items in the array itself:</p>

<p><img class="center" src="https://preshing.com/images/probe-linear-before.png" /></p>

<p>In open addressing, each cell in the array still <em>represents</em> a single bucket, but can actually store an item belonging to any bucket. Open addressing is more cache-friendly than separate chaining. If an item is not found in its ideal cell, it&rsquo;s often nearby. The drawback is that as the array becomes full, you may need to search a lot of cells before finding a particular item, depending on the probing strategy.</p>

<p>For example, consider <strong>linear probing</strong>, the simplest probing strategy. Suppose we want to insert the item (13, &ldquo;orange&rdquo;) into the above table, and the hash of 13 is <code>0x95bb7d92</code>. Ideally, we&rsquo;d store this item at index 2, the last hexadecimal digit of the hash, but that cell is already taken. Under linear probing, we find the next free cell by searching linearly, starting at the item&rsquo;s ideal index, and store the item there instead:</p>

<p><img class="center" src="https://preshing.com/images/probe-linear-after.png" /></p>

<p>As you can see, the item (13, &ldquo;orange&rdquo;) ended up quite far from its ideal cell. Not great for lookups. Every time someone calls <code>get</code> with this key, they&rsquo;ll have to search cells 2 through 11 before finding it. As the array becomes full, long searches become more and more common. Nonetheless, linear probing tends to be quite fast as long as you don&rsquo;t let the array get too full. I&rsquo;ve shown benchmarks in <a href="http://preshing.com/20130107/this-hash-table-is-faster-than-a-judy-array">previous</a> <a href="http://preshing.com/20110603/hash-table-performance-tests">posts</a>.</p>
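<p>For reference, linear probing insertion can be sketched in a few lines. This is a simplified illustration under my own assumptions: the table stores hashed keys only, the value <code>0</code> marks an empty cell, and <code>linearProbeInsert</code> is a made-up name. It doesn&rsquo;t check for duplicate keys.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: cells hold hashed keys only, and 0 marks an empty cell.
// Returns the index where `hash` was stored, or -1 if the table is full.
int linearProbeInsert(std::vector<std::uint32_t>& cells, std::uint32_t hash) {
    assert(hash != 0 && !cells.empty());
    const std::size_t ideal = hash % cells.size();    // The item's ideal cell
    for (std::size_t step = 0; step < cells.size(); ++step) {
        const std::size_t idx = (ideal + step) % cells.size();  // Wrap at the end
        if (cells[idx] == 0) {
            cells[idx] = hash;                        // First free cell wins
            return static_cast<int>(idx);
        }
    }
    return -1;
}
```

<p>In a 16-cell table where cells 2 through 10 are occupied, inserting the hash <code>0x95bb7d92</code> with this function stores it at index 11.</p>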

<p>There are alternatives to linear probing, such as quadratic probing, double hashing, cuckoo hashing and hopscotch hashing. While developing <a href="https://github.com/preshing/junction">Junction</a>, I came up with yet another strategy. I call it <strong>leapfrog probing</strong>. Leapfrog probing reduces the average search length compared to linear probing, while also lending itself nicely to concurrency. It was inspired by <a href="https://en.wikipedia.org/wiki/Hopscotch_hashing">hopscotch hashing</a>, but uses explicit delta values instead of bitfields to identify cells belonging to a given bucket. </p>

<h2 id="finding-existing-items">Finding Existing Items</h2>

<p>In leapfrog probing, we store two additional delta values for each cell. These delta values define an explicit <strong>probe chain</strong> for each bucket.</p>

<p><img class="center" src="https://preshing.com/images/probe-leapfrog-before.png" /></p>

<p>To find a given key, proceed as follows:</p>

<ol>
  <li>First, hash the key and compute its modulus to get the bucket index. That&rsquo;s the item&rsquo;s ideal cell. Check there first.</li>
  <li>If the item isn&rsquo;t found in that cell, use that cell&rsquo;s <em>first</em> delta value to determine the next cell to check. Just add the delta value to the current array index, making sure to wrap at the end of the array.</li>
  <li>If the item isn&rsquo;t found in that cell, use the <em>second</em> delta value for all subsequent cells. Stop when the delta is zero.</li>
</ol>

<p>For the strategy to work, there really need to be two delta values per cell. The first delta value directs us to the desired bucket&rsquo;s probe chain (if not already in it), and the second delta value keeps us in the same probe chain.</p>

<p>For example, suppose we look for the key 40 in the above table, and 40 hashes to <code>0x674a0243</code>. The modulus (last digit) is 3, so we check index 3 first, but index 3 contains an item belonging to a different bucket. The <em>first</em> delta value at index 3 is <code>2</code>, so we add that to the current index and check index 5. The item isn&rsquo;t there either, but at least index 5 contains an item belonging to the desired bucket, since its hash also ends with 3. The <em>second</em> delta value at index 5 is <code>3</code>, so we add that to the current index and check index 8. At index 8, the hashed key <code>0x674a0243</code> is found.</p>
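<p>The lookup steps can be sketched in a few lines of single-threaded C++. This is only an illustration: the <code>Cell</code> layout, the <code>leapfrogFind</code> name, and the use of hash 0 to mark an empty cell are assumptions of the sketch, not Junction&rsquo;s actual definitions.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cell layout; a hash of 0 marks an empty cell (an assumption of this sketch).
struct Cell {
    uint32_t hash = 0;
    int value = 0;
    uint8_t firstDelta = 0;  // Leads from this cell's own bucket into its probe chain.
    uint8_t nextDelta = 0;   // Leads to the next item in the chain this cell's item belongs to.
};

// Returns the index of the cell holding `hash`, or -1 if it is absent.
// cells.size() must be a power of two, and hash must be nonzero.
inline int leapfrogFind(const std::vector<Cell>& cells, uint32_t hash) {
    assert(hash != 0);
    size_t mask = cells.size() - 1;
    size_t idx = hash & mask;  // Step 1: check the ideal cell.
    if (cells[idx].hash == hash)
        return static_cast<int>(idx);
    uint8_t delta = cells[idx].firstDelta;  // Step 2: first delta of the ideal cell.
    while (delta != 0) {
        idx = (idx + delta) & mask;  // Wrap at the end of the array.
        if (cells[idx].hash == hash)
            return static_cast<int>(idx);
        delta = cells[idx].nextDelta;  // Step 3: second delta from here on.
    }
    return -1;  // A zero delta marks the end of the probe chain.
}
```

<p>Filling a 16-cell table with a first delta of 2 at index 3, a second delta of 3 at index 5, and the hash <code>0x674a0243</code> at index 8 reproduces the walk described above: 3, then 5, then 8.</p>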

<p>A single byte is sufficient to store each delta value. If the hash table&rsquo;s keys and values are 4 bytes each, and we pack the delta values together, it only takes 25% additional memory to add them. If the keys and values are 8 bytes, the additional memory is just 12.5%. Best of all, we can let the hash table become much more full before having to resize.</p>
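<p>The arithmetic is easy to check with <code>sizeof</code>. The layout below is a guess at what &ldquo;packing the delta values together&rdquo; could look like: four cells per group, with their eight one-byte deltas stored up front. The <code>CellGroup</code> template and <code>deltaOverheadPercent</code> are hypothetical names, and the exact sizes assume a mainstream ABI with natural alignment.</p>

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: four cells share one group, and their eight
// one-byte deltas (two per cell) are packed together at the front.
template <typename Word>
struct CellGroup {
    struct Cell {
        Word hash;
        Word value;
    };
    uint8_t deltas[8];
    Cell cells[4];
};

// Memory spent on deltas, as a percentage of the memory spent on the cells themselves.
template <typename Word>
constexpr double deltaOverheadPercent() {
    return 100.0 * sizeof(CellGroup<Word>::deltas) / sizeof(CellGroup<Word>::cells);
}
```

<p>With 4-byte words, the group holds 8 bytes of deltas next to 32 bytes of cells; with 8-byte words, it&rsquo;s 8 bytes next to 64.</p>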

<h2 id="inserting-new-items">Inserting New Items</h2>

<p>Inserting an item into a leapfrog table consists of two phases: following the probe chain to see if an item with the same key already exists, then, if not, performing a linear search for a free cell. The linear search begins at the end of the probe chain. Once a free cell is found and reserved, it gets linked to the end of the chain. </p>

<p>For example, suppose we insert the same item (13, &ldquo;orange&rdquo;) we inserted earlier, with hash <code>0x95bb7d92</code>. This item&rsquo;s bucket index is 2, but index 2 already contains a different key. The first delta value at index 2 is zero, which marks the end of the probe chain. We proceed to the second phase: performing a linear search starting at index 2 to locate the next free cell. As before, the item ends up quite far from its ideal cell, but this time, we set index 2&rsquo;s first delta value to <code>9</code>, linking the item to its probe chain. Now, subsequent lookups will find the item more quickly.</p>

<p><img class="center" src="https://preshing.com/images/probe-leapfrog-after.png" /></p>

<p>Of course, any time we search for a free cell, the cell must fall within reach of its designated probe chain. In other words, the resulting delta value must fit into a single byte. If the delta value doesn&rsquo;t fit, then the table is overpopulated, and it&rsquo;s time to migrate its entire contents to a new table.</p>
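<p>Here&rsquo;s a single-threaded sketch of the two insert phases, including the overflow check. As before, the <code>Cell</code> layout, the <code>InsertResult</code> names, and the convention that hash 0 means &ldquo;free&rdquo; are assumptions made for the example; Junction&rsquo;s own single-threaded map is the authoritative version.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cell layout; a hash of 0 marks a free cell (an assumption of this sketch).
struct Cell {
    uint32_t hash = 0;
    int value = 0;
    uint8_t firstDelta = 0;
    uint8_t nextDelta = 0;
};

enum class InsertResult { InsertedNew, AlreadyFound, Overflow };

// cells.size() must be a power of two, and hash must be nonzero.
inline InsertResult leapfrogInsert(std::vector<Cell>& cells, uint32_t hash, int value) {
    assert(hash != 0);
    size_t mask = cells.size() - 1;
    size_t idx = hash & mask;
    Cell* cell = &cells[idx];
    if (cell->hash == 0) {
        // The ideal cell is free. Take it.
        cell->hash = hash;
        cell->value = value;
        return InsertResult::InsertedNew;
    }
    // Phase 1: follow the probe chain, looking for the same hash.
    uint8_t* prevLink = &cell->firstDelta;
    for (;;) {
        if (cell->hash == hash) {
            cell->value = value;
            return InsertResult::AlreadyFound;
        }
        if (*prevLink == 0)
            break;  // End of the probe chain.
        idx = (idx + *prevLink) & mask;
        cell = &cells[idx];
        prevLink = &cell->nextDelta;
    }
    // Phase 2: linear search for a free cell, starting at the end of the chain.
    // The delta must fit in a single byte, so the search gives up after 255 cells.
    size_t maxDelta = std::min<size_t>(255, cells.size() - 1);
    for (size_t delta = 1; delta <= maxDelta; delta++) {
        Cell& probe = cells[(idx + delta) & mask];
        if (probe.hash == 0) {
            probe.hash = hash;
            probe.value = value;
            *prevLink = static_cast<uint8_t>(delta);  // Link the new cell to the chain.
            return InsertResult::InsertedNew;
        }
    }
    return InsertResult::Overflow;  // Out of reach: time to migrate to a new table.
}
```

<p>If cells 2 through 10 of a 16-cell table are already occupied by other buckets, inserting <code>0x95bb7d92</code> lands at index 11 and sets index 2&rsquo;s first delta to 9, matching the example above.</p>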

<p>If you&rsquo;re interested, I added a <a href="https://github.com/preshing/junction/blob/master/junction/SingleMap_Leapfrog.h">single-threaded leapfrog map</a> to Junction. The single-threaded version is easier to follow than the concurrent one, if you&rsquo;d just like to study how leapfrog probing works.</p>

<h2 id="concurrent-operations">Concurrent Operations</h2>

<p>Junction also contains <a href="https://github.com/preshing/junction/blob/master/junction/ConcurrentMap_Leapfrog.h">a concurrent leapfrog map</a>. It&rsquo;s the most interesting map to talk about, so I&rsquo;ll simply refer to it as <strong>Leapfrog</strong> from this point on.</p>

<p>Leapfrog is quite similar to Junction&rsquo;s concurrent Linear map, which I described in the <a href="http://preshing.com/20160222/a-resizable-concurrent-map">previous post</a>. In Leapfrog, (hashed) keys are assigned to cells in the same way that Linear would assign them, and once a cell is reserved for a key, that cell&rsquo;s key never changes as long as the table is in use. The delta values are nothing but shortcuts between keys; they&rsquo;re entirely determined by key placement.</p>

<p>In Leapfrog, reserving a cell and linking that cell to its probe chain are two discrete steps. First, the cell is reserved via <a href="https://en.wikipedia.org/wiki/Compare-and-swap">compare-and-swap (CAS)</a>. If that succeeds, the cell is then linked to the end of its probe chain using a relaxed atomic store:</p>

<div><div class="CodeRay">
  <div class="code"><pre>    ...
    <span class="keyword">while</span> (linearProbesRemaining-- &gt; <span class="integer">0</span>) {
        idx++;
        group = table-&gt;getCellGroups() + ((idx &amp; sizeMask) &gt;&gt; <span class="integer">2</span>);
        cell = group-&gt;cells + (idx &amp; <span class="integer">3</span>);
        probeHash = cell-&gt;hash.load(turf::Relaxed);
        <span class="keyword">if</span> (probeHash == KeyTraits::NullHash) {
            <span class="comment">// It's an empty cell. Try to reserve it.</span>
            <span class="keyword">if</span> (<span class="highlight">cell-&gt;hash.compareExchangeStrong(probeHash, hash, turf::Relaxed)</span>) {
                <span class="comment">// Success. We've reserved the cell. Link it to previous cell in same bucket.</span>
                u8 desiredDelta = idx - prevLinkIdx;
                <span class="highlight">prevLink-&gt;store(desiredDelta, turf::Relaxed)</span>;
                <span class="keyword">return</span> InsertResult_InsertedNew;
            }
        }
        ...
</pre></div>
</div>
</div>

<p>Each step is atomic on its own, but together, they&rsquo;re not atomic. That means that if another thread performs a concurrent operation, one of the steps could be visible but not the other. Leapfrog operations must account for all possibilities.</p>

<p>For example, suppose Thread 1 inserts (18, &ldquo;fig&rdquo;) into the table shown earlier, and 18 hashes to <code>0xd6045317</code>. This item belongs in bucket 7, so it will ultimately get linked to the existing &ldquo;mango&rdquo; item at index 9, which was the last item belonging to the same bucket.</p>

<p>Now, suppose Thread 2 performs a concurrent <code>get</code> with the same key. It&rsquo;s totally possible that, during the <code>get</code>, Thread 1&rsquo;s link from index 9 will be visible, but not the new item itself:</p>

<p><img class="center" src="https://preshing.com/images/racing-insert-1.png" /></p>

<p>In this case, Leapfrog simply lets the concurrent <code>get</code> return <code>NullValue</code>, meaning the key wasn&rsquo;t found. This is perfectly OK, since the insert and the <code>get</code> are concurrent. In other words, they&rsquo;re racing. We aren&rsquo;t obligated to ensure the newly inserted item is visible in this case. For a well-behaved concurrent map, we only need to ensure visibility when there&rsquo;s a <a href="http://preshing.com/20130702/the-happens-before-relation">happens-before</a> relationship between two operations, not when they&rsquo;re racing.</p>

<p>Concurrent inserts require a bit more care. For example, suppose Thread 3 performs a concurrent insert of (62, &ldquo;plum&rdquo;) while Thread 1 is inserting (18, &ldquo;fig&rdquo;), and 62 hashes to <code>0x95ed72b7</code>. In this case, both items belong to the same bucket. Again, it&rsquo;s totally possible that from Thread 3&rsquo;s point of view, the link from index 9 is visible, but not Thread 1&rsquo;s new item, as illustrated above.</p>

<p>We can&rsquo;t allow Thread 3 to continue any further at this point. Thread 3 absolutely needs to know whether Thread 1&rsquo;s new item uses the same hashed key &ndash; otherwise, it could end up inserting a second copy of the same hashed key into the table. In Leapfrog, when this case is detected, Thread 3 simply spin-reads until the new item becomes visible before proceeding.</p>

<p>The opposite order is possible, too. During Thread 2&rsquo;s concurrent <code>get</code>, Thread 1&rsquo;s link from index 9 might not be visible even though the item itself is already visible:</p>

<p><img class="center" src="https://preshing.com/images/racing-insert-2.png" /></p>

<p>In this case, the concurrent <code>get</code> will simply consider index 9 to be the end of the probe chain and return <code>NullValue</code>, meaning the key wasn&rsquo;t found. This, too, is perfectly OK. Again, Thread 1 and Thread 2 are in a race, and we aren&rsquo;t obligated to ensure the insert is visible to the <code>get</code>.</p>

<p>Once again, concurrent inserts require a bit more care. If Thread 3 performs a concurrent insert into the same bucket as Thread 1, and Thread 1&rsquo;s link from index 9 is not yet visible, Thread 3 will consider index 9 to be the end of the probe chain, then switch to a linear search for a free cell. During the linear search, Thread 3 might encounter Thread 1&rsquo;s item (18, &ldquo;fig&rdquo;), as illustrated above. The item is unexpected during the linear search, because normally, items in the same bucket should be linked to the same probe chain.</p>

<p>In Leapfrog, when this case is detected, Thread 3 takes matters into its own hands: It sets the link from index 9 <em>on behalf</em> of Thread 1! This is obviously redundant, since both threads will end up writing the same delta value at roughly the same time, but it&rsquo;s important. If Thread 3 doesn&rsquo;t set the link from index 9, it could then call <code>get</code> with key 62, and if the link is <em>still</em> not visible, the item (62, &ldquo;plum&rdquo;) it just inserted will not be found.</p>

<p><img class="center" src="https://preshing.com/images/racing-insert-3.png" /></p>

<p>That&rsquo;s bad behavior, because sequential operations performed by a single thread should always be visible to each other. I actually encountered this bug during testing.</p>

<p>In the <a href="http://preshing.com/20160201/new-concurrent-hash-maps-for-cpp">benchmarks</a> I posted earlier, Leapfrog was the fastest of all the concurrent maps I tested, at least up to a modest number of CPU cores. I suspect that part of its speed comes from leapfrog probing, and part comes from having very low write contention on shared variables. In particular, Leapfrog doesn&rsquo;t keep track of the map&rsquo;s population anywhere, since the decision to resize is based on failed inserts, not on the map&rsquo;s population.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Resizable Concurrent Map]]></title>
    <link href="https://preshing.com/20160222/a-resizable-concurrent-map"/>
    <updated>2016-02-22T08:05:00-05:00</updated>
    <id>https://preshing.com/?p=20160222</id>
    <content type="html"><![CDATA[<p>In an earlier post, I showed how to implement the <a href="http://preshing.com/20130605/the-worlds-simplest-lock-free-hash-table/">&ldquo;world&rsquo;s simplest lock-free hash table&rdquo;</a> in C++. It was so simple that you couldn&rsquo;t even delete entries or resize the table. Well, a few years have passed since then, and I&rsquo;ve recently written some concurrent maps without those limitations. You&rsquo;ll find them in my <a href="https://github.com/preshing/junction">Junction</a> project on GitHub.</p>

<p>Junction contains several concurrent maps &ndash; even the &lsquo;world&rsquo;s simplest&rsquo; is there, under the name <code>ConcurrentMap_Crude</code>. For brevity, let&rsquo;s call that one the <strong>Crude</strong> map. In this post, I&rsquo;ll explain the difference between the Crude map and Junction&rsquo;s <strong>Linear</strong> map. Linear is the simplest Junction map that supports both resize and delete.</p>

<p>You can <a href="http://preshing.com/20130605/the-worlds-simplest-lock-free-hash-table/">review the original post</a> for an explanation of how the Crude map works. To recap: It&rsquo;s based on <em>open addressing</em> and <em>linear probing</em>. That means it&rsquo;s basically a big array of keys and values using a linear search. When inserting or looking up a given key, you hash the key to determine where to begin the search. Concurrent inserts and lookups are permitted.</p>

<p><img class="center" src="https://preshing.com/images/hash-diagram-2.png" /></p>

<!--more-->
<p>Junction&rsquo;s Linear map is based on the same principle, except that when the array gets too full, its entire contents are migrated to a new, larger array. When the migration completes, the old table is replaced with the new one. So, how do we achieve that while still allowing concurrent operations? The Linear map&rsquo;s approach is based on Cliff Click&rsquo;s <a href="http://high-scale-lib.cvs.sourceforge.net/viewvc/high-scale-lib/high-scale-lib/org/cliffc/high_scale_lib/NonBlockingHashMap.java?view=markup">non-blocking hash map</a> in Java, but has a few differences.</p>

<h2 id="the-data-structure">The Data Structure</h2>

<p>First, we need to modify our data structure a little bit. The original Crude map had two data members: A pointer <code>m_cells</code> and an integer <code>m_sizeMask</code>.</p>

<p><img class="center" src="https://preshing.com/images/concurrentmap-crude-structure.png" /></p>

<p>The Linear map instead has a single data member <code>m_root</code>, which points to a <code>Table</code> structure followed by the cells themselves in a single, contiguous memory block.</p>

<p><img class="center" src="https://preshing.com/images/concurrentmap-linear-structure.png" /></p>

<p>In the <code>Table</code> structure, there&rsquo;s a new shared counter <code>cellsRemaining</code>, initially set to 75% of the table size. Whenever a thread tries to insert a new key, it decrements <code>cellsRemaining</code> first. If it decrements <code>cellsRemaining</code> below zero, that means the table is overpopulated, and it&rsquo;s time to migrate everything to a new table.</p>
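<p>A rough sketch of that counter using <code>std::atomic</code> might look like the following. The <code>Table</code> here is a stripped-down stand-in, and <code>reserveCell</code> is a hypothetical helper name; the relaxed memory ordering is also an assumption of the sketch.</p>

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the Table header; the real one has more members.
struct Table {
    std::atomic<int32_t> cellsRemaining;

    // The counter starts at 75% of the table size.
    explicit Table(int32_t tableSize) : cellsRemaining(tableSize * 3 / 4) {
        assert(tableSize > 0);
    }

    // Called before inserting a new key. Returns false once the counter has been
    // decremented below zero, which means a migration is needed.
    bool reserveCell() {
        int32_t previous = cellsRemaining.fetch_sub(1, std::memory_order_relaxed);
        return previous > 0;
    }
};
```

<p>Because <code>fetch_sub</code> returns the value held before the decrement, exactly one thread observes the transition from 0 to -1, even when several threads insert at once.</p>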

<p>With this new data structure, we can simultaneously replace the table, <code>sizeMask</code> and <code>cellsRemaining</code> all in a single atomic step, simply by reassigning the <code>m_root</code> pointer.</p>

<p><img class="center" src="https://preshing.com/images/replace-table.png" /></p>
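<p>In code, the idea boils down to swapping one atomic pointer. The sketch below leaves out the cells that follow the header in memory, and the <code>publish</code> function is a hypothetical name. It returns the old table so the caller can decide when it&rsquo;s safe to free it.</p>

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical table header. In the real map, the cells follow it in the same memory block.
struct Table {
    size_t sizeMask;
    std::atomic<int32_t> cellsRemaining;

    explicit Table(size_t size)
        : sizeMask(size - 1), cellsRemaining(static_cast<int32_t>(size * 3 / 4)) {
    }
};

struct Map {
    std::atomic<Table*> m_root;

    explicit Map(Table* initial) : m_root(initial) {
    }

    // Replaces the table, its sizeMask and its cellsRemaining in one atomic step.
    // Returns the previous table; other threads may still be reading it.
    Table* publish(Table* newTable) {
        assert(newTable != nullptr);
        return m_root.exchange(newTable, std::memory_order_acq_rel);
    }
};
```

<p>Any thread that loads <code>m_root</code> sees either the old table with its old mask and counter, or the new table with its new ones, never a mixture.</p>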

<p>Another difference between the two maps is that the Linear map stores <em>hashed keys</em> instead of raw keys. That makes migrations faster, since we never need to recompute the hash function. Junction&rsquo;s hash function is invertible, too, so it&rsquo;s always possible to recover an original key from a hashed one.</p>

<p><img class="center" src="https://preshing.com/images/hash-dehash.png" /></p>

<p>Because the hash function is invertible, finding an existing key is as simple as finding its hash. That&rsquo;s why currently, Junction only supports integer and raw pointer keys. (In my opinion, the best way to support more complex keys would be to implement a concurrent set instead of a map.)</p>
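<p>To show what an invertible hash can look like, here&rsquo;s a sketch built from xor-shifts and multiplications by odd constants, each of which can be undone on 32-bit integers. The constants are the ones from the well-known MurmurHash3 finalizer; treat the whole thing as an illustrative assumption rather than a statement of what Junction uses. The names <code>hashKey</code>, <code>unhashKey</code> and <code>inverseMod2_32</code> are made up.</p>

```cpp
#include <cassert>
#include <cstdint>

const uint32_t kMul1 = 0x85ebca6bu;  // Assumed constants (MurmurHash3 finalizer).
const uint32_t kMul2 = 0xc2b2ae35u;

// Multiplicative inverse of an odd number modulo 2^32, by Newton's iteration.
inline uint32_t inverseMod2_32(uint32_t x) {
    assert((x & 1) == 1);        // Only odd numbers are invertible modulo 2^32.
    uint32_t inv = x;            // Correct to 3 bits for any odd x.
    for (int i = 0; i < 4; i++)  // Each step doubles the number of correct bits.
        inv *= 2u - x * inv;
    return inv;
}

inline uint32_t hashKey(uint32_t h) {
    h ^= h >> 16;
    h *= kMul1;
    h ^= h >> 13;
    h *= kMul2;
    h ^= h >> 16;
    return h;
}

// Undoes hashKey by reversing each step in the opposite order.
inline uint32_t unhashKey(uint32_t h) {
    h ^= h >> 16;                // A 16-bit xor-shift is its own inverse.
    h *= inverseMod2_32(kMul2);
    h ^= (h >> 13) ^ (h >> 26);  // Inverse of the 13-bit xor-shift.
    h *= inverseMod2_32(kMul1);
    h ^= h >> 16;
    return h;
}
```

<p>Since every 32-bit key maps to a distinct 32-bit hash, two keys are equal exactly when their hashes are equal, which is what lets the table compare hashes instead of keys.</p>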

<h2 id="migrating-to-a-new-table----the-incorrect-way">Migrating to a New Table &ndash; The Incorrect Way</h2>

<p>Now that we know when a migration should begin, let&rsquo;s turn to the challenge of actually performing that migration. Essentially, we must identify every cell that&rsquo;s in use in the old table and insert a copy of it into the new table. Some entries will end up at the same array index, some will end up at a higher index, and others will shift closer to their ideal index.</p>

<p><img class="center" src="https://preshing.com/images/migration.png" /></p>

<p>Of course, if other threads can still modify the old table during migration, things are not so simple. If we take a naïve approach, we risk losing changes. For example, suppose we have a map that&rsquo;s almost full when two threads perform the following:</p>

<ol>
  <li>Thread 1 calls <code>assign(2, "apple")</code>, decrementing <code>cellsRemaining</code> to 0.</li>
  <li>Thread 2 enters <code>assign(14, "peach")</code> and decrements <code>cellsRemaining</code> to -1. A migration is needed.</li>
  <li>Thread 2 migrates the contents of the old table to a new table, but doesn&rsquo;t publish the new table yet.</li>
  <li>Thread 1 calls <code>assign(2, "banana")</code> on the old table. Because a cell already exists for this key, the function doesn&rsquo;t decrement <code>cellsRemaining</code>. It simply replaces &ldquo;apple&rdquo; with &ldquo;banana&rdquo; in the old cell.</li>
  <li>Thread 2 publishes the new table to <code>m_root</code>, wiping out Thread 1&rsquo;s changes.</li>
  <li>Thread 1 calls <code>get(2)</code> on the new table.</li>
</ol>

<p>At this point, we would like <code>get(2)</code> to return &ldquo;banana&rdquo;, because this key was only modified by a single thread, and that was the last value it wrote. Unfortunately, <code>get(2)</code> will return the older value &ldquo;apple&rdquo;, which is incorrect. We need a better migration strategy.</p>

<h2 id="migrating-to-a-new-table-safely">Migrating to a New Table Safely</h2>

<p>To prevent that problem, we could block concurrent modifications using a <a href="https://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock">read-write lock</a>, although in this case, &lsquo;shared-exclusive lock&rsquo; would be a better description. In that strategy, any function that modifies the contents of the table would take the shared lock first. The thread that migrates the table would take the exclusive lock. And thanks to <a href="http://preshing.com/20160726/using-quiescent-states-to-reclaim-memory">QSBR</a>, <code>get</code> wouldn&rsquo;t need any lock at all.</p>

<p>The Linear map doesn&rsquo;t do that. It goes a step further, so that even modifications don&rsquo;t require a shared lock. Essentially, as the migration proceeds through the old table, it replaces each cell&rsquo;s <code>value</code> field with a special <code>Redirect</code> marker.</p>

<p><img class="center" src="https://preshing.com/images/migrate-redirect.png" /></p>

<p>All map operations are impacted by this change. In particular, the <code>assign</code> function can&rsquo;t just blindly modify a cell&rsquo;s <code>value</code> anymore. It must perform a <a href="http://preshing.com/20150402/you-can-do-any-kind-of-atomic-read-modify-write-operation">read-modify-write</a> on <code>value</code> instead, to avoid overwriting the <code>Redirect</code> marker if one has been placed. If it sees a <code>Redirect</code> marker in <code>value</code>, that means a new table exists, and the operation should be performed in the new table instead.</p>
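<p>As a sketch, that read-modify-write can be written as a CAS loop over a <code>std::atomic</code>. The <code>tryAssign</code> helper and the marker value of 1 are assumptions for illustration; the real <code>assign</code> also has to locate the cell and handle the retry in the new table.</p>

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using Value = uintptr_t;
const Value Redirect = 1;  // Assumed marker value for this sketch.

// Tries to store `desired` into a cell's value field.
// Returns false if the cell has been redirected; the caller should then
// perform the operation in the new table instead.
inline bool tryAssign(std::atomic<Value>& cellValue, Value desired) {
    assert(desired != Redirect);  // The marker itself can't be stored as a value.
    Value seen = cellValue.load(std::memory_order_relaxed);
    for (;;) {
        if (seen == Redirect)
            return false;  // Never overwrite the marker.
        if (cellValue.compare_exchange_weak(seen, desired, std::memory_order_release,
                                            std::memory_order_relaxed))
            return true;
        // The CAS failed and refreshed `seen`. Check it again and retry.
    }
}
```

<p>A plain store would be cheaper, but it could silently replace a <code>Redirect</code> marker written a moment earlier by the migrating thread.</p>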

<p>Now, if we allow concurrent operations while a migration is in progress, then clearly, we must maintain consistent values for each key between the two tables. Unfortunately, there&rsquo;s no way to atomically write <code>Redirect</code> to an old cell while simultaneously copying its previous value to a new cell. Nonetheless, we can still ensure consistency by migrating each value using a loop. This is the loop Linear maps use:</p>

<p><img class="center" src="https://preshing.com/images/migrate-retry-loop.png" /></p>

<p>In this loop, it&rsquo;s still possible for racing threads to modify the source <code>value</code> immediately after the migration thread reads it, since a <code>Redirect</code> marker hasn&rsquo;t been placed there yet. In that case, when the migration thread tries to change the source to <code>Redirect</code> via <a href="https://en.wikipedia.org/wiki/Compare-and-swap">CAS</a>, the CAS will fail and the operation will retry using the updated value. As long as the source <code>value</code> keeps changing, the migration thread will keep retrying, but eventually it will succeed. This strategy allows concurrent <code>get</code> calls to safely find values in the new table, but concurrent <code>assign</code> calls cannot modify the new table until the migration is complete. (Cliff Click&rsquo;s hash map doesn&rsquo;t have that limitation, so his migration loop involves a few more steps.)</p>
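<p>Going by the description above, the loop can be approximated as follows. This is a reading of the prose, not a transcription of Junction&rsquo;s code: <code>migrateValue</code> is a hypothetical name, the marker value is assumed, the default (sequentially consistent) ordering is used for simplicity, and only one thread is assumed to migrate a given cell.</p>

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using Value = uintptr_t;
const Value Redirect = 1;  // Assumed marker value for this sketch.

// Moves one cell's value from the old table (src) to the new table (dst).
// Assumes a single thread migrates this particular cell.
inline void migrateValue(std::atomic<Value>& src, std::atomic<Value>& dst) {
    Value seen = src.load();
    assert(seen != Redirect);  // The cell must not have been migrated already.
    for (;;) {
        dst.store(seen);  // Copy the value we believe is current.
        if (src.compare_exchange_strong(seen, Redirect))
            return;  // Nobody changed src in between, so dst is up to date.
        // The CAS failed and reloaded `seen` with the racing thread's value. Retry.
    }
}
```

<p>Once the CAS succeeds, no later <code>assign</code> can modify the source cell, so the destination holds the last value ever written to it.</p>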

<p>In the current version of Linear, even <code>get</code> calls don&rsquo;t read from the new table until the migration is complete. Therefore, in the current version, a loop is not really necessary; the migration could be implemented as an atomic exchange of the source <code>value</code> followed by a plain store to the destination <code>value</code>. (I just realized that while writing this post.) Right now, if a <code>get</code> call encounters a <code>Redirect</code>, it helps complete the migration. Perhaps it would be better for scalability if it immediately read from the new table instead. That&rsquo;s something worth investigating.</p>

<h2 id="multithreaded-migration">Multithreaded Migration</h2>

<p>The <code>Table</code> structure has some additional data members I didn&rsquo;t mention earlier. One member is <code>jobCoordinator</code>. During migration, the <code>jobCoordinator</code> points to a <code>TableMigration</code> object that represents the migration in progress. This is where the new table is stored before it gets published to <code>m_root</code>. I won&rsquo;t go into details, but the <code>jobCoordinator</code> allows multiple threads to participate in the migration in parallel.</p>

<p><img class="center" src="https://preshing.com/images/tablemigration.png" /></p>

<p>What if multiple threads try to <em>begin</em> a migration at the same time? In the event of such a race, Linear maps use <a href="http://preshing.com/20130930/double-checked-locking-is-fixed-in-cpp11/">double-checked locking</a> to prevent duplicate <code>TableMigration</code> objects from being created. That&rsquo;s why each <code>Table</code> has a mutex. (Cliff Click&rsquo;s hash map differs here, too. He allows racing threads to create new tables optimistically.)</p>
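<p>Here&rsquo;s what that double-checked locking pattern could look like with standard C++11 primitives. It&rsquo;s a simplified sketch: <code>jobCoordinator</code> is modeled as a bare atomic pointer, <code>beginMigration</code> is a hypothetical name, and <code>TableMigration</code> is an empty placeholder.</p>

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct TableMigration {};  // Placeholder for the object that tracks a migration.

// Hypothetical, simplified Table with just the members needed for this sketch.
struct Table {
    std::mutex mutex;
    std::atomic<TableMigration*> jobCoordinator{nullptr};

    // Returns the migration in progress, creating it if this thread got there first.
    TableMigration* beginMigration() {
        TableMigration* migration = jobCoordinator.load(std::memory_order_acquire);
        if (!migration) {
            std::lock_guard<std::mutex> lock(mutex);
            // Check again: another thread may have created it while we waited for the lock.
            migration = jobCoordinator.load(std::memory_order_relaxed);
            if (!migration) {
                migration = new TableMigration;
                jobCoordinator.store(migration, std::memory_order_release);
            }
        }
        assert(migration != nullptr);
        return migration;
    }

    ~Table() {
        delete jobCoordinator.load();
    }
};
```

<p>The first check avoids taking the mutex once a migration exists, and the second check, made while holding the mutex, guarantees that only one <code>TableMigration</code> is ever created per table.</p>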

<p>I haven&rsquo;t said much about the Linear map&rsquo;s <code>erase</code> function in this post. That&rsquo;s because it&rsquo;s easy: It simply changes the cell&rsquo;s <code>value</code> to the special <code>NullValue</code>, the same value used to initialize a cell. The cell&rsquo;s <code>hash</code> field, however, is left unchanged. That means the table may eventually fill up with deleted cells, but those cells will be purged when migrating to a new table. There are a few remaining details about choosing the size of the destination table, but I&rsquo;ll skip those details here.</p>

<p>That&rsquo;s the Linear map in a nutshell! Junction&rsquo;s <strong>Leapfrog</strong> and <strong>Grampa</strong> maps are based on the same design, but extend it in different ways.</p>

<p>Concurrent programming is difficult, but I feel that a better understanding is worth pursuing, since multicore processors are not going away. That&rsquo;s why I wanted to share the experience of building the Linear map. Examples are a powerful way to learn, or at least to become familiar with the problem domain.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[New Concurrent Hash Maps for C++]]></title>
    <link href="https://preshing.com/20160201/new-concurrent-hash-maps-for-cpp"/>
    <updated>2016-02-01T08:30:00-05:00</updated>
    <id>https://preshing.com/?p=20160201</id>
    <content type="html"><![CDATA[<p>A <a href="https://en.wikipedia.org/wiki/Associative_array">map</a> is a data structure that maps a collection of keys to a collection of values. It&rsquo;s a common concept in computer programming. You typically manipulate maps using functions such as <code>find</code>, <code>insert</code> and <code>erase</code>.</p>

<p>A <strong>concurrent map</strong> is one that lets you call some of those functions <em>concurrently</em> &ndash; even in combinations where the map is modified. If it lets you call <code>insert</code> from multiple threads, with no mutual exclusion, it&rsquo;s a concurrent map. If it lets you call <code>insert</code> while another thread is calling <code>find</code>, with no mutual exclusion, it&rsquo;s a concurrent map. Other combinations might be allowed, too. Traditional maps, such as <code>std::map</code> and <code>std::unordered_map</code>, don&rsquo;t allow that.</p>

<p>Today I&rsquo;m releasing <a href="https://github.com/preshing/junction">Junction</a>, a C++ library that contains several new concurrent maps. It&rsquo;s BSD-licensed, so you can use the source code freely in any project, for any purpose.</p>

<p><a href="https://github.com/preshing/junction"><img class="center" src="https://preshing.com/images/junction-repo.png" /></a></p>

<p>On my Core i7-5930K, Junction&rsquo;s two fastest maps outperform all the other concurrent maps I tested.</p>

<p><img class="center" src="https://preshing.com/images/concurrent-map-graph.png" /></p>

<!-- more -->
<p>Junction&rsquo;s maps come in three flavors:</p>

<ol>
  <li>
    <p>Junction&rsquo;s <strong>Linear</strong> map is similar to the <a href="http://preshing.com/20130605/the-worlds-simplest-lock-free-hash-table/">simple lock-free hash table</a> I published a while ago, except that it also supports resizing, deleting entries, and templated key/value types. It was inspired by Cliff Click&rsquo;s <a href="http://high-scale-lib.cvs.sourceforge.net/viewvc/high-scale-lib/high-scale-lib/org/cliffc/high_scale_lib/NonBlockingHashMap.java?view=markup">non-blocking hash map</a> in Java, but has a few differences.</p>
  </li>
  <li>
    <p>Junction&rsquo;s <strong>Leapfrog</strong> map is similar to Linear, except that it uses a probing strategy loosely based on <a href="https://en.wikipedia.org/wiki/Hopscotch_hashing">hopscotch hashing</a>. This strategy improves lookup efficiency when the table is densely populated. Leapfrog scales better than Linear because it modifies shared state far less frequently.</p>
  </li>
  <li>
    <p>Junction&rsquo;s <strong>Grampa</strong> map is similar to Leapfrog, except that at high populations, the map gets split into a set of smaller, fixed-size Leapfrog tables. Whenever one of those tables overflows, it gets split into two new tables instead of resizing the entire map.</p>
  </li>
</ol>

<p>Junction aims to support as many platforms as possible. So far, it&rsquo;s been tested on Windows, Ubuntu, OS X and iOS. Its main dependencies are CMake and a companion library called <a href="https://github.com/preshing/turf">Turf</a>. Turf is an abstraction layer over POSIX, Win32, Mach, Linux, Boost, C++11, and possibly other platform APIs. You configure Turf to use the API you want.</p>

<h2 id="using-junction-maps">Using Junction Maps</h2>

<p>Instantiate one of the class templates using integer or raw pointer types.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">typedef</span> junction::ConcurrentMap_Grampa&lt;turf::u64, Foo*&gt; ConcurrentMap;
ConcurrentMap myMap;
</pre></div>
</div>
</div>

<p>Each map exposes functions such as <code>get</code>, <code>assign</code>, <code>exchange</code> and <code>erase</code>. These functions are all atomic with respect to each other, so you can call them from any thread at any time. They also provide <a href="http://preshing.com/20140709/the-purpose-of-memory_order_consume-in-cpp11/">release and consume semantics</a> implicitly, so you can safely pass non-atomic information between threads.</p>

<div><div class="CodeRay">
  <div class="code"><pre>myMap.assign(<span class="integer">14</span>, <span class="keyword">new</span> Foo);
Foo* foo = myMap.get(<span class="integer">14</span>);
foo = myMap.exchange(<span class="integer">14</span>, <span class="keyword">new</span> Foo);
<span class="keyword">delete</span> foo;
foo = myMap.erase(<span class="integer">14</span>);
<span class="keyword">delete</span> foo;
</pre></div>
</div>
</div>

<p>Out of all possible keys, a <em>null</em> key must be reserved, and out of all possible values, <em>null</em> and <em>redirect</em> values must be reserved. The defaults are 0 and 1. You can override those defaults by passing custom <code>KeyTraits</code> and <code>ValueTraits</code> parameters to the template.</p>

<h2 id="safe-memory-reclamation">Safe Memory Reclamation</h2>

<p>All Junction maps rely on a form of safe memory reclamation known as QSBR, or <a href="http://preshing.com/20160726/using-quiescent-states-to-reclaim-memory">quiescent state-based memory reclamation</a>. QSBR could be described as a primitive garbage collector.</p>

<p>If it seems odd to perform garbage collection in C++, keep in mind that scalable concurrent maps are already prevalent in Java, an entirely garbage-collected language. That&rsquo;s no coincidence. Garbage collection allows you to sidestep locks, especially during read operations, which greatly improves scalability. Not even a read-write lock is necessary. You can certainly write a concurrent map in C++ without garbage collection, but I doubt it will scale as well as a Junction map.</p>

<p>To make QSBR work, each thread must periodically call <code>junction::DefaultQSBR.update</code> at a moment when that thread is <em>quiescent</em> &ndash; that is, not in the middle of an operation that uses the map. In a game engine, you could call it between iterations of the main loop.</p>

<h2 id="dynamically-allocated-values">Dynamically Allocated Values</h2>

<p>Junction maps use QSBR internally, but you must still manage object lifetimes yourself. The maps don&rsquo;t currently support smart pointers.</p>

<p>If you&rsquo;re storing dynamically allocated objects, you&rsquo;ll often want to check for existing entries in the table before inserting a new one. There are a couple of ways to do that. One way is to create objects optimistically, then detect racing inserts using <code>exchangeValue</code>.</p>

<div><div class="CodeRay">
  <div class="code"><pre>ConcurrentMap::Mutator mutator = myMap.insertOrFind(<span class="integer">14</span>);
Foo* value = mutator.getValue();
<span class="keyword">if</span> (!value) {
    value = <span class="keyword">new</span> Foo;
    Foo* oldValue = mutator.exchangeValue(value);
    <span class="keyword">if</span> (oldValue)
        junction::DefaultQSBR.enqueue(&amp;Foo::destroy, oldValue);
}
</pre></div>
</div>
</div>

<p>The above code uses a <code>Mutator</code>, which is like a pointer to a single entry in the map. First, <code>insertOrFind</code> creates an entry if one doesn&rsquo;t already exist. Then, if two threads race to insert the same key, the second thread will garbage-collect the object created by the first.</p>

<p>Another way is to prevent such collisions entirely using double-checked locking. This approach guarantees that only one object will ever be created for a given key.</p>

<div><div class="CodeRay">
  <div class="code"><pre>Foo* value = myMap.get(<span class="integer">14</span>);
<span class="keyword">if</span> (!value) {
    turf::LockGuard&lt;turf::Mutex&gt; lock(someMutex);
    ConcurrentMap::Mutator mutator = myMap.insertOrFind(<span class="integer">14</span>);
    value = mutator.getValue();
    <span class="keyword">if</span> (!value) {
        value = <span class="keyword">new</span> Foo;
        mutator.assignValue(value);
    }
}
</pre></div>
</div>
</div>

<h2 id="development-status">Development Status</h2>

<p>You should consider this alpha software. All of the code is experimental. I spent a lot of effort to get it right, and it passes all the tests I&rsquo;ve thrown at it, but you never know &ndash; bugs might still lurk under the surface. That&rsquo;s part of the reason why I&rsquo;m releasing the code for free. Readers of this blog have proven quite good at finding obscure bugs. I hope you&rsquo;ll subject Junction to your harshest scrutiny.</p>

<p><em>[Update Dec. 30, 2017: Almost two years after this was published, <a href="https://github.com/preshing/junction/issues/31">the first bug</a> has been spotted and fixed.]</em></p>

<p>If you&rsquo;d like to contribute to the project in other ways, here are a few suggestions:</p>

<ul>
  <li>Porting Turf to additional platforms</li>
  <li>Further optimization</li>
  <li>Searching the repository for <code>FIXME</code> comments</li>
  <li>Identifying missing functionality that would be useful</li>
</ul>

<p>To leave feedback, simply post a comment below, <a href="https://github.com/preshing/junction/issues">open an issue</a> on GitHub, or use my direct <a href="http://preshing.com/contact/">contact form</a>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[You Can Do Any Kind of Atomic Read-Modify-Write Operation]]></title>
    <link href="https://preshing.com/20150402/you-can-do-any-kind-of-atomic-read-modify-write-operation"/>
    <updated>2015-04-02T07:20:00-04:00</updated>
    <id>https://preshing.com/?p=20150402</id>
    <content type="html"><![CDATA[<p>Atomic read-modify-write operations &ndash; or &ldquo;RMWs&rdquo; &ndash; are more sophisticated than <a href="http://preshing.com/20130618/atomic-vs-non-atomic-operations">atomic loads and stores</a>. They let you read from a variable in shared memory and simultaneously write a different value in its place. In the C++11 atomic library, all of the following functions perform an RMW:</p>

<div><div class="CodeRay">
  <div class="code"><pre>std::atomic&lt;&gt;::fetch_add()
std::atomic&lt;&gt;::fetch_sub()
std::atomic&lt;&gt;::fetch_and()
std::atomic&lt;&gt;::fetch_or()
std::atomic&lt;&gt;::fetch_xor()
std::atomic&lt;&gt;::exchange()
std::atomic&lt;&gt;::compare_exchange_strong()
std::atomic&lt;&gt;::compare_exchange_weak()
</pre></div>
</div>
</div>

<p><code>fetch_add</code>, for example, reads from a shared variable, adds another value to it, and writes the result back &ndash; all in one indivisible step. You can accomplish the same thing using a mutex, but a mutex-based version wouldn&rsquo;t be <a href="http://preshing.com/20120612/an-introduction-to-lock-free-programming">lock-free</a>. RMW operations, on the other hand, are designed to be lock-free. They&rsquo;ll take advantage of lock-free CPU instructions whenever possible, such as <code>ldrex</code>/<code>strex</code> on ARMv7.</p>
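<p>To make the comparison concrete, here is a minimal sketch of both approaches side by side. The function names <code>addWithMutex</code> and <code>addWithFetchAdd</code>, and the two counters, are made up for illustration; only <code>fetch_add</code> and the standard mutex types come from the C++11 library.</p>

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>

// Hypothetical shared counters, one per approach.
std::mutex counterMutex;
uint32_t lockedCounter = 0;
std::atomic<uint32_t> atomicCounter(0);

// Mutex-based version: indivisible with respect to other callers, but not lock-free.
uint32_t addWithMutex(uint32_t amount)
{
    std::lock_guard<std::mutex> lock(counterMutex);
    uint32_t oldValue = lockedCounter;
    lockedCounter = oldValue + amount;
    return oldValue;
}

// RMW version: read, add and write back in one indivisible step.
uint32_t addWithFetchAdd(uint32_t amount)
{
    return atomicCounter.fetch_add(amount);   // returns the value held before the add
}
```

<p>Both functions return the value the counter held before the addition, which is the same convention <code>fetch_add</code> uses.</p>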

<!--more-->
<p>A novice programmer might look at the above list of functions and ask, &ldquo;Why does C++11 offer so few RMW operations? Why is there an atomic <code>fetch_add</code>, but no atomic <code>fetch_multiply</code>, no <code>fetch_divide</code> and no <code>fetch_shift_left</code>?&rdquo; There are two reasons:</p>

<ol>
  <li>Because there is very little need for those RMW operations in practice. Try not to get the wrong impression of how RMWs are used. You can&rsquo;t write safe multithreaded code by taking a single-threaded algorithm and turning each step into an RMW.</li>
  <li>Because if you do need those operations, you can easily implement them yourself. As the title says, you can do any kind of RMW operation!</li>
</ol>

<h2 id="compare-and-swap-the-mother-of-all-rmws">Compare-and-Swap: The Mother of All RMWs</h2>

<p>Out of all the available RMW operations in C++11, the only one that is absolutely essential is <code>compare_exchange_weak</code>. Every other RMW operation can be implemented using that one. It takes a minimum of two arguments:</p>

<div><div class="CodeRay">
  <div class="code"><pre>shared.compare_exchange_weak(T&amp; expected, T desired, ...);
</pre></div>
</div>
</div>

<p>This function attempts to store the <code>desired</code> value to <code>shared</code>, but only if the current value of <code>shared</code> matches <code>expected</code>. It returns <code>true</code> if successful. If it fails, it loads the current value of <code>shared</code> back into <code>expected</code>, which despite its name, is an in/out parameter. This is called a <strong>compare-and-swap</strong> operation, and it all happens in one atomic, indivisible step.</p>

<p><img class="center" src="https://preshing.com/images/compare-exchange.png" /></p>
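<p>The in/out behavior of <code>expected</code> is easy to miss, so here is a small sketch that performs one compare-and-swap and reports what happened. <code>tryCas</code> and <code>CasResult</code> are hypothetical names. The sketch calls <code>compare_exchange_strong</code> rather than <code>compare_exchange_weak</code> purely so that a single-threaded demonstration cannot fail spuriously.</p>

```cpp
#include <atomic>
#include <cstdint>

// Outcome of a single compare-and-swap attempt.
struct CasResult
{
    bool succeeded;      // true if `desired` was stored
    uint32_t expected;   // contents of `expected` after the call
};

CasResult tryCas(std::atomic<uint32_t>& shared, uint32_t expected, uint32_t desired)
{
    // On failure, the library overwrites `expected` with the current value of `shared`.
    bool ok = shared.compare_exchange_strong(expected, desired);
    return CasResult{ok, expected};
}
```

<p>If <code>shared</code> holds 5, then <code>tryCas(shared, 5, 9)</code> succeeds and stores 9. A second call with the stale expectation, <code>tryCas(shared, 5, 7)</code>, fails, leaves <code>shared</code> at 9, and hands back 9 in <code>expected</code>.</p>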

<p>So, suppose you really need an atomic <code>fetch_multiply</code> operation, though I can&rsquo;t imagine why. Here&rsquo;s one way to implement it:</p>

<div><div class="CodeRay">
  <div class="code"><pre>uint32_t fetch_multiply(std::atomic&lt;uint32_t&gt;&amp; shared, uint32_t multiplier)
{
    uint32_t oldValue = shared.load();
    <span class="keyword">while</span> (!shared.<span class="highlight">compare_exchange_weak</span>(oldValue, oldValue * multiplier))
    {
    }
    <span class="keyword">return</span> oldValue;
}
</pre></div>
</div>
</div>

<p>This is known as a compare-and-swap loop, or <strong>CAS loop</strong>. The function repeatedly tries to exchange <code>oldValue</code> with <code>oldValue * multiplier</code> until it succeeds. If no concurrent modifications happen in other threads, <code>compare_exchange_weak</code> will usually succeed on the first try. On the other hand, if <code>shared</code> is concurrently modified by another thread, it&rsquo;s totally possible for its value to change between the call to <code>load</code> and the call to <code>compare_exchange_weak</code>, causing the compare-and-swap operation to fail. In that case, <code>oldValue</code> will be updated with the most recent value of <code>shared</code>, and the loop will try again.</p>

<p><img class="center" src="https://preshing.com/images/fetch-multiply-timeline.png" /></p>

<p>The above implementation of <code>fetch_multiply</code> is both atomic and lock-free. It&rsquo;s atomic even though the CAS loop may take an indeterminate number of tries, because when the loop finally does modify <nobr><code>shared</code>,</nobr> it does so atomically. It&rsquo;s lock-free because if a single iteration of the CAS loop fails, it&rsquo;s usually because some other thread modified <code>shared</code> successfully. That last statement hinges on the assumption that <code>compare_exchange_weak</code> actually compiles to lock-free machine code &ndash; more on that below. It also ignores the fact that <code>compare_exchange_weak</code> can <a href="http://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange">fail spuriously</a> on certain platforms, but that&rsquo;s a rare event.</p>

<h2 id="you-can-combine-several-steps-into-one-rmw">You Can Combine Several Steps Into One RMW</h2>

<p><code>fetch_multiply</code> just replaces the value of <code>shared</code> with a multiple of the same value. What if we want to perform a more elaborate kind of RMW? Can we still make the operation atomic <em>and</em> lock-free? Sure we can. To offer a somewhat convoluted example, here&rsquo;s a function that loads a shared variable, decrements the value if odd, divides it in half if even, and stores the result back only if it&rsquo;s greater than or equal to 10, all in a single atomic, lock-free operation:</p>

<div><div class="CodeRay">
  <div class="code"><pre>uint32_t atomicDecrementOrHalveWithLimit(std::atomic&lt;uint32_t&gt;&amp; shared)
{
    uint32_t oldValue = shared.load();
    uint32_t newValue;
    <span class="keyword">do</span>
    {
        <span class="keyword">if</span> (oldValue % <span class="integer">2</span> == <span class="integer">1</span>)
            newValue = oldValue - <span class="integer">1</span>;
        <span class="keyword">else</span>
            newValue = oldValue / <span class="integer">2</span>;
        <span class="keyword">if</span> (newValue &lt; <span class="integer">10</span>)
            <span class="keyword">break</span>;
    }
    <span class="keyword">while</span> (!shared.<span class="highlight">compare_exchange_weak</span>(oldValue, newValue));
    <span class="keyword">return</span> oldValue;
}
</pre></div>
</div>
</div>

<p>It&rsquo;s the same idea as before: If <code>compare_exchange_weak</code> fails &ndash; usually due to a modification performed by another thread &ndash; <code>oldValue</code> is updated with a more recent value, and the loop tries again. If, during any attempt, we find that <code>newValue</code> is less than 10, the CAS loop terminates early, effectively turning the RMW operation into a no-op.</p>

<p>The point is that you can put anything inside the CAS loop. Think of the body of the CAS loop as a critical section. Normally, we protect a critical section using a mutex. With a CAS loop, we simply retry the entire transaction until it succeeds.</p>
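<p>One way to see that the loop body really is arbitrary is to factor the loop out into a helper that accepts the body as a parameter. The following is only a sketch: <code>atomicUpdate</code> and <code>tripleUpTo100</code> are hypothetical names, and the convention that the transform returns <code>false</code> to abandon the operation is an assumption made for this example.</p>

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: runs `transform` inside a CAS loop.
// transform(oldValue, newValue) fills in newValue and returns true,
// or returns false to abandon the update.
template <typename T, typename Transform>
T atomicUpdate(std::atomic<T>& shared, Transform transform)
{
    T oldValue = shared.load();
    T newValue = oldValue;
    do
    {
        if (!transform(oldValue, newValue))
            break;                          // early exit: the RMW becomes a no-op
    }
    while (!shared.compare_exchange_weak(oldValue, newValue));
    return oldValue;                        // value observed by the final attempt
}

// Example transform: triple the value, unless the result would exceed 100.
inline bool tripleUpTo100(uint32_t oldValue, uint32_t& newValue)
{
    newValue = oldValue * 3;
    return newValue <= 100;
}
```

<p>Calling <code>atomicUpdate(shared, tripleUpTo100)</code> repeatedly on a variable that starts at 7 takes it to 21, then 63, and then leaves it at 63, because 189 would exceed the limit.</p>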

<p>This is obviously a synthetic example. A more practical example can be seen in the <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/autoresetevent.h"><code>AutoResetEvent</code></a> class described in my <a href="http://preshing.com/20150316/semaphores-are-surprisingly-versatile">earlier post about semaphores</a>. It uses a CAS loop with multiple steps to atomically increment a shared variable up to a limit of 1.</p>
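<p>As a rough illustration of that kind of bounded increment, here is a simplified sketch. It is not the actual <code>AutoResetEvent</code> code; the function name <code>incrementUpToOne</code> and the use of a plain <code>std::atomic&lt;int&gt;</code> are assumptions made for this example.</p>

```cpp
#include <atomic>

// Adds 1 to `shared`, but never lets it go past 1. Returns the previous value.
int incrementUpToOne(std::atomic<int>& shared)
{
    int oldValue = shared.load();
    for (;;)
    {
        int newValue = oldValue < 1 ? oldValue + 1 : 1;
        if (shared.compare_exchange_weak(oldValue, newValue))
            return oldValue;
        // On failure, oldValue now holds the latest value of shared; try again.
    }
}
```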

<h2 id="you-can-combine-several-variables-into-one-rmw">You Can Combine Several Variables Into One RMW</h2>

<p>So far, we&rsquo;ve only looked at examples that perform an atomic operation on a single shared variable. What if we want to perform an atomic operation on multiple variables? Normally, we&rsquo;d protect those variables using a mutex:</p>

<div><div class="CodeRay">
  <div class="code"><pre>std::mutex mutex;
uint32_t x;
uint32_t y;

<span class="directive">void</span> atomicFibonacciStep()
{
    <span class="highlight">std::lock_guard&lt;std::mutex&gt; lock(mutex);</span>
    uint32_t t = y;
    y = x + y;
    x = t;
}
</pre></div>
</div>
</div>

<p>This mutex-based approach is atomic, but obviously not lock-free. That <a href="http://preshing.com/20111118/locks-arent-slow-lock-contention-is">may very well be good enough</a>, but for the sake of illustration, let&rsquo;s go ahead and convert it to a CAS loop just like the other examples. <code>std::atomic&lt;&gt;</code> is a template, so we can actually pack both shared variables into a <code>struct</code> and apply the same pattern as before:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">struct</span> Terms
{
    uint32_t x;
    uint32_t y;
};

<span class="highlight">std::atomic&lt;Terms&gt; terms;</span>

<span class="directive">void</span> atomicFibonacciStep()
{
    Terms oldTerms = terms.load();
    Terms newTerms;
    <span class="keyword">do</span>
    {
        newTerms.x = oldTerms.y;
        newTerms.y = oldTerms.x + oldTerms.y;
    }
    <span class="keyword">while</span> (!<span class="highlight">terms.compare_exchange_weak</span>(oldTerms, newTerms));
}
</pre></div>
</div>
</div>

<p>Is <em>this</em> operation lock-free? Now we&rsquo;re venturing into dicey territory. As I wrote at the start, C++11 atomic operations are designed to take advantage of lock-free CPU instructions &ldquo;whenever possible&rdquo; &ndash; admittedly a loose definition. In this case, we&rsquo;ve wrapped <code>std::atomic&lt;&gt;</code> around a struct, <code>Terms</code>. Let&rsquo;s see how GCC 4.9.2 compiles it for x64:</p>

<p><img class="center" src="https://preshing.com/images/atomic-terms-rmw.png" /></p>

<p>We got lucky. The compiler was clever enough to see that <code>Terms</code> fits inside a single 64-bit register, and implemented <code>compare_exchange_weak</code> using <code>lock cmpxchg</code>. The compiled code is lock-free.</p>

<p>This brings up an interesting point: In general, the C++11 standard does <em>not</em> guarantee that atomic operations will be lock-free. There are simply too many CPU architectures to support and too many ways to specialize the <code>std::atomic&lt;&gt;</code> template. You need to <a href="http://en.cppreference.com/w/cpp/atomic/atomic/is_lock_free">check with your compiler</a> to make absolutely sure. In practice, though, it&rsquo;s pretty safe to assume that atomic operations are lock-free when all of the following conditions are true:</p>

<ol>
  <li>The compiler is a recent version of MSVC, GCC or Clang.</li>
  <li>The target processor is x86, x64 or ARMv7 (and possibly others).</li>
  <li>The atomic type is <code>std::atomic&lt;uint32_t&gt;</code>, <code>std::atomic&lt;uint64_t&gt;</code> or <code>std::atomic&lt;T*&gt;</code> for some type <code>T</code>.</li>
</ol>
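<p>Checking with your compiler can be as simple as asking the atomic object itself. Here is a minimal sketch; the helper name <code>isLockFree</code> is hypothetical, while <code>is_lock_free()</code> is the standard member function. The answer depends on the compiler and target, so treat any particular result as specific to your platform.</p>

```cpp
#include <atomic>
#include <cstdint>

struct Terms
{
    uint32_t x;
    uint32_t y;
};

// Asks the implementation whether operations on std::atomic<T> are lock-free.
template <typename T>
bool isLockFree()
{
    std::atomic<T> probe;
    return probe.is_lock_free();
}
```

<p>For example, <code>isLockFree&lt;Terms&gt;()</code> reports whether the struct specialization used above avoids a hidden lock on the platform you are building for.</p>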

<p>As a personal preference, I like to hang my hat on that third point, and limit myself to specializations of the <code>std::atomic&lt;&gt;</code> template that use explicit integer or pointer types. The <a href="http://preshing.com/20150324/safe-bitfields-in-cpp">safe bitfield technique</a> I described in the previous post gives us a convenient way to rewrite the above function using an explicit integer specialization, <code>std::atomic&lt;uint64_t&gt;</code>:</p>

<div><div class="CodeRay">
  <div class="code"><pre>BEGIN_BITFIELD_TYPE(Terms, uint64_t)
    ADD_BITFIELD_MEMBER(x, <span class="integer">0</span>, <span class="integer">32</span>)
    ADD_BITFIELD_MEMBER(y, <span class="integer">32</span>, <span class="integer">32</span>)
END_BITFIELD_TYPE()

<span class="highlight">std::atomic&lt;uint64_t&gt; terms;</span>

<span class="directive">void</span> atomicFibonacciStep()
{
    Terms oldTerms = terms.load();
    Terms newTerms;
    <span class="keyword">do</span>
    {
        newTerms.x = oldTerms.y;
        newTerms.y = (uint32_t) (oldTerms.x + oldTerms.y);
    }
    <span class="keyword">while</span> (!terms.compare_exchange_weak(oldTerms, newTerms));
}
</pre></div>
</div>
</div>

<p>Some real-world examples where we pack several values into an atomic bitfield include:</p>

<ul>
  <li>Implementing tagged pointers as a <a href="http://en.wikipedia.org/wiki/ABA_problem#Tagged_state_reference">workaround for the ABA problem</a>.</li>
  <li>Implementing a lightweight read-write lock, which I touched upon briefly <a href="http://preshing.com/20150316/semaphores-are-surprisingly-versatile">in a previous post</a>.</li>
</ul>

<p>In general, any time you have a small amount of data protected by a mutex, and you can pack that data entirely into a 32- or 64-bit integer type, you can always convert your mutex-based operations into lock-free RMW operations, no matter what those operations actually do! That&rsquo;s the principle I exploited in my <a href="http://preshing.com/20150316/semaphores-are-surprisingly-versatile">Semaphores are Surprisingly Versatile</a> post, to implement a bunch of lightweight synchronization primitives.</p>

<p>Of course, this technique is not unique to the C++11 atomic library. I&rsquo;m just using C++11 atomics because they&rsquo;re quite widely available now, and compiler support is pretty good. You can implement a custom RMW operation using any library that exposes a compare-and-swap function, such as <a href="https://msdn.microsoft.com/en-us/library/ttk2z1ws.aspx">Win32</a>, the <a href="https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/OSAtomicCompareAndSwap32.3.html">Mach kernel API</a>, the <a href="http://lxr.free-electrons.com/ident?i=atomic_cmpxchg">Linux kernel API</a>, <a href="https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/_005f_005fatomic-Builtins.html">GCC atomic builtins</a> or <a href="http://mintomic.github.io/lock-free/atomics/">Mintomic</a>. In the interest of brevity, I didn&rsquo;t discuss memory ordering concerns in this post, but it&rsquo;s critical to consider the guarantees made by your atomic library. In particular, if your custom RMW operation is intended to pass non-atomic information between threads, then at a minimum, you should ensure that there is the equivalent of a <a href="http://preshing.com/20130823/the-synchronizes-with-relation"><em>synchronizes-with</em></a> relationship somewhere.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Safe Bitfields in C++]]></title>
    <link href="https://preshing.com/20150324/safe-bitfields-in-cpp"/>
    <updated>2015-03-24T06:15:00-04:00</updated>
    <id>https://preshing.com/?p=20150324</id>
    <content type="html"><![CDATA[<p>In my <a href="https://github.com/preshing/cpp11-on-multicore">cpp11-on-multicore</a> project on GitHub, there&rsquo;s <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/rwlock.h">a class</a> that packs three 10-bit values into a 32-bit integer.</p>

<p><img class="center" src="https://preshing.com/images/rwlock-bitfield.png" /></p>

<p>I could have implemented it using traditional bitfields&hellip;</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">struct</span> Status
{
    uint32_t readers : <span class="integer">10</span>;
    uint32_t waitToRead : <span class="integer">10</span>;
    uint32_t writers : <span class="integer">10</span>;
};
</pre></div>
</div>
</div>

<p>Or with some bit twiddling&hellip;</p>

<pre><code>uint32_t status = readers | (waitToRead &lt;&lt; 10) | (writers &lt;&lt; 20);
</code></pre>
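<p>Spelled out a little further, the manual approach needs a matching pair of pack and unpack routines. This is only a sketch with hypothetical function names. Unlike the one-liner above, it masks each input to 10 bits so that an oversized value cannot spill into a neighboring field.</p>

```cpp
#include <cstdint>

// Packs three 10-bit values into one 32-bit integer.
// Each input is masked to 10 bits first (an addition for this sketch).
uint32_t packStatus(uint32_t readers, uint32_t waitToRead, uint32_t writers)
{
    const uint32_t mask = 0x3ff;          // ten 1-bits
    return (readers & mask) | ((waitToRead & mask) << 10) | ((writers & mask) << 20);
}

// Extracts the 10-bit field that starts at `offset` (0, 10 or 20).
uint32_t unpackField(uint32_t status, int offset)
{
    return (status >> offset) & 0x3ff;
}
```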

<!--more-->
<p>Instead, I did what any overzealous C++ programmer does. I abused the preprocessor and templating system.</p>

<div><div class="CodeRay">
  <div class="code"><pre>BEGIN_BITFIELD_TYPE(Status, uint32_t)           <span class="comment">// type name, storage size</span>
    ADD_BITFIELD_MEMBER(readers, <span class="integer">0</span>, <span class="integer">10</span>)         <span class="comment">// member name, offset, number of bits</span>
    ADD_BITFIELD_MEMBER(waitToRead, <span class="integer">10</span>, <span class="integer">10</span>)
    ADD_BITFIELD_MEMBER(writers, <span class="integer">20</span>, <span class="integer">10</span>)
END_BITFIELD_TYPE()
</pre></div>
</div>
</div>

<p>The above set of macros defines a new bitfield type <code>Status</code> with three members. The second argument to <code>BEGIN_BITFIELD_TYPE()</code> must be an unsigned integer type. The second argument to <code>ADD_BITFIELD_MEMBER()</code> specifies each member&rsquo;s offset, while the third argument specifies the number of bits.</p>

<p>I call this a <strong>safe bitfield</strong> because it performs safety checks to ensure that every operation on the bitfield fits within the available number of bits. It also supports packed arrays. I thought the technique deserved a quick explanation here, since I&rsquo;m going to refer back to it in future posts.</p>

<h2 id="how-to-manipulate-a-safe-bitfield">How to Manipulate a Safe Bitfield</h2>

<p>Let&rsquo;s take <code>Status</code> as an example. Simply create an object of type <code>Status</code> as you would any other object. By default, it&rsquo;s initialized to zero, but you can initialize it from any integer of the same size. In the GitHub project, it&rsquo;s often initialized from the result of a C++11 atomic operation.</p>

<div><div class="CodeRay">
  <div class="code"><pre>Status status = m_status.load(std::memory_order_relaxed);
</pre></div>
</div>
</div>

<p>Setting the value of a bitfield member is easy. Just assign to the member the same way you would using a traditional bitfield. If asserts are enabled &ndash; such as in a debug build &ndash; and you try to assign a value that&rsquo;s too large for the bitfield, an assert will occur at runtime. It&rsquo;s meant to help catch programming errors during development.</p>

<div><div class="CodeRay">
  <div class="code"><pre>status.writers = <span class="integer">1023</span>;     <span class="comment">// OK</span>
status.writers = <span class="integer">1024</span>;     <span class="comment">// assert: value out of range</span>
</pre></div>
</div>
</div>

<p>You can increment or decrement a bitfield member using the <code>++</code> and <code>--</code> operators. If the resulting value is too large, or if it underflows past zero, the operation will trigger an assert as well.</p>

<div><div class="CodeRay">
  <div class="code"><pre>status.writers++;          <span class="comment">// assert if overflow; otherwise OK</span>
status.writers--;          <span class="comment">// assert if underflow; otherwise OK</span>
</pre></div>
</div>
</div>

<p>It would be easy to implement versions of increment and decrement that silently wrap around, without corrupting any neighboring bitfield members, but I haven&rsquo;t done so yet. I&rsquo;ll add those functions as soon as I have a need for them.</p>
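<p>For the curious, here is one way wrapping variants might look, written as free functions on a raw <code>uint32_t</code> rather than as part of the bitfield classes. The names and signatures are hypothetical, and the sketch assumes <code>bits</code> is less than 32 and <code>offset + bits</code> is at most 32.</p>

```cpp
#include <cstdint>

// Adds 1 to the `bits`-wide field at `offset`, wrapping to zero on overflow.
// All bits outside the field are left untouched.
uint32_t wrappingIncrement(uint32_t value, int offset, int bits)
{
    const uint32_t maximum = (uint32_t(1) << bits) - 1;
    const uint32_t mask = maximum << offset;
    const uint32_t field = ((value >> offset) + 1) & maximum;   // wraps within the field
    return (value & ~mask) | (field << offset);
}

// Subtracts 1 from the field, wrapping from zero to the field's maximum.
uint32_t wrappingDecrement(uint32_t value, int offset, int bits)
{
    const uint32_t maximum = (uint32_t(1) << bits) - 1;
    const uint32_t mask = maximum << offset;
    const uint32_t field = ((value >> offset) - 1) & maximum;   // unsigned arithmetic wraps
    return (value & ~mask) | (field << offset);
}
```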

<p>You can pass the entire bitfield to any function that expects a <code>uint32_t</code>. In the GitHub project, bitfields are often passed to C++11 atomic operations. It even works by reference.</p>

<div><div class="CodeRay">
  <div class="code"><pre>m_status.store(<span class="highlight">status</span>, std::memory_order_relaxed);
m_status.compare_exchange_weak(<span class="highlight">oldStatus</span>, <span class="highlight">newStatus</span>,
                               std::memory_order_acquire, std::memory_order_relaxed);
</pre></div>
</div>
</div>

<p>For each bitfield member, there are helper functions that return the representation of <code>1</code>, as well as the maximum value the member can hold. These helper functions let you atomically increment a specific member using <code>std::atomic&lt;&gt;::fetch_add()</code>. You can invoke them on temporary objects, since they return the same value for any <code>Status</code> object.</p>

<div><div class="CodeRay">
  <div class="code"><pre>Status oldStatus = m_status.fetch_add(<span class="highlight">Status().writers.one()</span>, std::memory_order_acquire);
assert(oldStatus.writers + <span class="integer">1</span> &lt;= <span class="highlight">Status().writers.maximum()</span>);
</pre></div>
</div>
</div>
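<p>To see why adding the representation of <code>1</code> works, it may help to strip away the bitfield classes and do the same thing with plain integers. The constants and the function name below are hypothetical; the field layout is taken from the <code>Status</code> example, where <code>writers</code> starts at bit 20 and is 10 bits wide.</p>

```cpp
#include <atomic>
#include <cstdint>

// Layout assumed from the Status example: writers occupies bits 20..29.
const uint32_t kWritersOffset = 20;
const uint32_t kWritersOne = uint32_t(1) << kWritersOffset;   // representation of 1 in that field
const uint32_t kWritersMaximum = (uint32_t(1) << 10) - 1;

// Atomically adds 1 to the writers field and returns the previous writers count.
// The caller must ensure the field cannot overflow into its neighbor.
uint32_t incrementWriters(std::atomic<uint32_t>& status)
{
    uint32_t oldStatus = status.fetch_add(kWritersOne, std::memory_order_acquire);
    return (oldStatus >> kWritersOffset) & kWritersMaximum;
}
```

<p>Because the addend has a single bit set at the field&rsquo;s offset, the lower fields are never disturbed, and the upper bits are only affected if the field overflows.</p>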

<h2 id="how-its-implemented">How It&rsquo;s Implemented</h2>

<p>When expanded by the preprocessor, the macros shown near the top of this post generate a <strong>union</strong> that contains four member variables: <code>wrapper</code>, <code>readers</code>, <code>waitToRead</code> and <code>writers</code>:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="comment">// BEGIN_BITFIELD_TYPE(Status, uint32_t)</span>
<span class="keyword">union</span> Status
{
    <span class="keyword">struct</span> Wrapper
    {
        uint32_t value;
    };
    Wrapper <span class="highlight">wrapper</span>;

    Status(uint32_t v = <span class="integer">0</span>) { wrapper.value = v; }
    Status&amp; <span class="directive">operator</span>=(uint32_t v) { wrapper.value = v; <span class="keyword">return</span> *<span class="local-variable">this</span>; }
    <span class="directive">operator</span> uint32_t&amp;() { <span class="keyword">return</span> wrapper.value; }
    <span class="directive">operator</span> uint32_t() <span class="directive">const</span> { <span class="keyword">return</span> wrapper.value; }

    <span class="keyword">typedef</span> uint32_t StorageType;

    <span class="comment">// ADD_BITFIELD_MEMBER(readers, 0, 10)</span>
    BitFieldMember&lt;StorageType, <span class="integer">0</span>, <span class="integer">10</span>&gt; <span class="highlight">readers</span>;

    <span class="comment">// ADD_BITFIELD_MEMBER(waitToRead, 10, 10)</span>
    BitFieldMember&lt;StorageType, <span class="integer">10</span>, <span class="integer">10</span>&gt; <span class="highlight">waitToRead</span>;

    <span class="comment">// ADD_BITFIELD_MEMBER(writers, 20, 10)</span>
    BitFieldMember&lt;StorageType, <span class="integer">20</span>, <span class="integer">10</span>&gt; <span class="highlight">writers</span>;

<span class="comment">// END_BITFIELD_TYPE()</span>
};
</pre></div>
</div>
</div>

<p>The cool thing about unions in C++ is that they share a lot of the same capabilities as C++ classes. As you can see, I&rsquo;ve given this one a constructor and overloaded several operators, to support some of the functionality described earlier.</p>

<p>Each member of the union is exactly 32 bits wide. <code>readers</code>, <code>waitToRead</code> and <code>writers</code> are all instances of the <code>BitFieldMember</code> class template. <code>BitFieldMember&lt;uint32_t, 20, 10&gt;</code>, for example, represents a range of 10 bits starting at offset 20 within a <code>uint32_t</code>. (In the diagram below, the bits are ordered from most significant to least, so we count offsets starting from the right.)</p>

<p><img class="center" src="https://preshing.com/images/bitfieldmember.png" /></p>

<p>Here&rsquo;s a partial definition of the <code>BitFieldMember</code> class template. You can view the full definition <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/bitfield.h">on GitHub</a>:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">template</span> &lt;<span class="keyword">typename</span> T, <span class="predefined-type">int</span> Offset, <span class="predefined-type">int</span> Bits&gt;
<span class="keyword">struct</span> BitFieldMember
{
    T value;

    <span class="directive">static</span> <span class="directive">const</span> T Maximum = (T(<span class="integer">1</span>) &lt;&lt; Bits) - <span class="integer">1</span>;
    <span class="directive">static</span> <span class="directive">const</span> T Mask = Maximum &lt;&lt; Offset;

    <span class="directive">operator</span> T() <span class="directive">const</span>
    {
        <span class="keyword">return</span> (value &gt;&gt; Offset) &amp; Maximum;
    }

    BitFieldMember&amp; <span class="directive">operator</span>=(T v)
    {
        assert(v &lt;= Maximum);               <span class="comment">// v must fit inside the bitfield member</span>
        value = (value &amp; ~Mask) | (v &lt;&lt; Offset);
        <span class="keyword">return</span> *<span class="local-variable">this</span>;
    }

    ...
</pre></div>
</div>
</div>

<p><code>operator T()</code> is a user-defined conversion that lets us read the bitfield member as if it were a plain integer. <code>operator=(T v)</code> is, of course, an assignment operator that lets us write to the bitfield member. This is where all the necessary bit twiddling and safety checks take place.</p>

<h3 id="no-undefined-behavior">No Undefined Behavior</h3>

<p>Is this legal C++? We&rsquo;ve been reading from various <code>Status</code> members after writing to others, something the C++ standard <a href="http://stackoverflow.com/q/17273320/3043469">generally forbids</a>. Luckily, in <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf">§9.5.1</a>, it makes the following exception:</p>

<blockquote>
  <p>If a standard-layout union contains several standard-layout structs that share a common initial sequence &hellip; it is permitted to inspect the common initial sequence of any of standard-layout struct members.</p>
</blockquote>

<p>In our case, <code>Status</code> fits the definition of a standard-layout union; <code>wrapper</code>, <code>readers</code>, <code>waitToRead</code> and <code>writers</code> are all standard-layout structs; and they share a common initial sequence: <code>uint32_t value</code>. Therefore, we have the standard&rsquo;s endorsement, and there&rsquo;s no <a href="http://en.wikipedia.org/wiki/Undefined_behavior">undefined behavior</a>. (Thanks to Michael Reilly and others for helping me sort that out.)</p>

<h2 id="bonus-support-for-packed-arrays">Bonus: Support for Packed Arrays</h2>

<p>In <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/diningphilosophers.h">another class</a>, I needed a bitfield to hold a packed array of eight 4-bit values.</p>

<p><img class="center" src="https://preshing.com/images/bitfield-packed-array.png" /></p>

<p>Packed array members are supported using the <code>ADD_BITFIELD_ARRAY</code> macro. It&rsquo;s similar to the <code>ADD_BITFIELD_MEMBER</code> macro, but it takes an additional argument to specify the number of array elements.</p>

<div><div class="CodeRay">
  <div class="code"><pre>BEGIN_BITFIELD_TYPE(AllStatus, uint32_t)
    ADD_BITFIELD_ARRAY(philos, <span class="integer">0</span>, <span class="integer">4</span>, <span class="integer">8</span>)     <span class="comment">// 8 array elements, 4 bits each</span>
END_BITFIELD_TYPE()
</pre></div>
</div>
</div>

<p>You can index a packed array member just like a regular array. An assert is triggered if the array index is out of range.</p>

<div><div class="CodeRay">
  <div class="code"><pre>AllStatus status;
status.philos[<span class="integer">0</span>] = <span class="integer">5</span>;           <span class="comment">// OK</span>
status.philos[<span class="integer">8</span>] = <span class="integer">0</span>;           <span class="comment">// assert: array index out of range</span>
</pre></div>
</div>
</div>

<p>Packed array items support all of the same operations as bitfield members. I won&rsquo;t go into the details, but the trick is to overload <code>operator[]</code> in <code>philos</code> so that it returns a temporary object that has the same capabilities as a <code>BitFieldMember</code> instance.</p>

<div><div class="CodeRay">
  <div class="code"><pre>status.philos[<span class="integer">1</span>]++;
status.philos[<span class="integer">2</span>]--;
std::cout &lt;&lt; status.philos[<span class="integer">3</span>];
</pre></div>
</div>
</div>
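<p>To give a feel for the <code>operator[]</code> trick, here is a simplified, standalone sketch of a packed array whose subscript operator returns a temporary proxy. It is not the implementation from the GitHub project: <code>PackedArray</code> and <code>Element</code> are hypothetical names, the storage is fixed at <code>uint32_t</code>, and only reading and assignment are supported.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed array: NumItems fields of BitsPerItem bits inside a uint32_t.
template <int BitsPerItem, int NumItems>
struct PackedArray
{
    uint32_t value;

    static const uint32_t Maximum = (uint32_t(1) << BitsPerItem) - 1;

    // Temporary proxy that refers back to one item of the array.
    class Element
    {
    public:
        Element(uint32_t& value, int offset) : m_value(value), m_offset(offset) {}

        // Read the item as a plain integer.
        operator uint32_t() const { return (m_value >> m_offset) & Maximum; }

        // Write the item without touching its neighbors.
        Element& operator=(uint32_t v)
        {
            assert(v <= Maximum);                       // v must fit inside the item
            m_value = (m_value & ~(Maximum << m_offset)) | (v << m_offset);
            return *this;
        }

    private:
        uint32_t& m_value;
        int m_offset;
    };

    Element operator[](int i)
    {
        assert(i >= 0 && i < NumItems);                 // array index must be in range
        return Element(value, i * BitsPerItem);
    }
};
```

<p>With <code>PackedArray&lt;4, 8&gt;</code>, the expression <code>philos[7] = 15</code> creates a short-lived <code>Element</code> that writes the top four bits of <code>value</code> and then disappears.</p>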

<p>When optimizations are enabled, MSVC, GCC and Clang do a great job of inlining all the hidden function calls behind this technique. The generated machine code ends up as efficient as if you had explicitly performed all of the bit twiddling yourself.</p>

<p>I&rsquo;m not the first person to implement custom bitfields on top of C++ unions and templates. The implementation here was inspired by <a href="http://blog.codef00.com/2014/12/06/portable-bitfields-using-c11/">this blog post</a> by Evan Teran, with a few twists of my own. I don&rsquo;t usually like to rely on clever language contortions, but this is one of those cases where the convenience gained feels worth the increase in obfuscation.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Semaphores are Surprisingly Versatile]]></title>
    <link href="https://preshing.com/20150316/semaphores-are-surprisingly-versatile"/>
    <updated>2015-03-16T05:50:00-04:00</updated>
    <id>https://preshing.com/?p=20150316</id>
    <content type="html"><![CDATA[<p>In multithreaded programming, it&rsquo;s important to make threads wait. They must wait for exclusive access to a resource. They must wait when there&rsquo;s no work available. One way to make threads wait &ndash; and put them to sleep inside the kernel, so that they no longer take any CPU time &ndash; is with a <strong>semaphore</strong>.</p>

<p>I used to think semaphores were strange and old-fashioned. They were invented by Edsger Dijkstra <a href="http://en.wikipedia.org/wiki/Semaphore_%28programming%29">back in the early 1960s</a>, before anyone had done much multithreaded programming, or much programming at all, for that matter. I knew that a semaphore could keep track of available units of a resource, or function as a clunky kind of <a href="http://en.wikipedia.org/wiki/Semaphore_%28programming%29#Semaphores_vs._mutexes">mutex</a>, but that seemed to be about it.</p>

<p>My opinion changed once I realized that, using only semaphores and atomic operations, it&rsquo;s possible to implement all of the following primitives:</p>

<ol>
  <li>A Lightweight Mutex</li>
  <li>A Lightweight Auto-Reset Event Object</li>
  <li>A Lightweight Read-Write Lock</li>
  <li>Another Solution to the Dining Philosophers Problem</li>
  <li>A Lightweight Semaphore With Partial Spinning</li>
</ol>

<!-- more -->
<p>Not only that, but these implementations share some desirable properties. They&rsquo;re <em>lightweight</em>, in the sense that some operations happen entirely in userspace, and they can (optionally) spin for a short period before sleeping in the kernel. You&rsquo;ll find all of the C++11 source code <a href="https://github.com/preshing/cpp11-on-multicore">on GitHub</a>. Since the standard C++11 library does not include semaphores, I&rsquo;ve also provided a portable <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/sema.h"><code>Semaphore</code></a> class that maps directly to native semaphores on Windows, MacOS, iOS, Linux and other POSIX environments. You should be able to drop any of these primitives into almost any existing C++11 project.</p>

<p><a href="https://github.com/preshing/cpp11-on-multicore"><img class="center" src="https://preshing.com/images/cpp11om-repo.png" /></a></p>

<h2 id="a-semaphore-is-like-a-bouncer">A Semaphore Is Like a Bouncer</h2>

<p>Imagine a set of waiting threads, lined up in a queue &ndash; much like a lineup in front of a busy nightclub or theatre. A semaphore is like a bouncer at the front of the lineup. He only allows threads to proceed when instructed to do so.</p>

<p><img class="center" src="https://preshing.com/images/sema-intro.png" /></p>

<p>Each thread decides for itself when to join the queue. Dijkstra called this the <code>P</code> operation. <code>P</code> originally stood for some funny-sounding Dutch word, but in a modern semaphore implementation, you&rsquo;re more likely to see this operation called <code>wait</code>. Basically, when a thread calls the semaphore&rsquo;s <code>wait</code> operation, it enters the lineup.</p>

<p>The bouncer, himself, only needs to understand a single instruction. Originally, Dijkstra called this the <code>V</code> operation. Nowadays, the operation goes by various names, such as <code>post</code>, <code>release</code> or <code>signal</code>. I prefer <code>signal</code>. Any running thread can call <code>signal</code> at any time, and when it does, the bouncer releases exactly one waiting thread from the queue. (Not necessarily in the same order they arrived.)</p>

<p>Now, what happens if some thread calls <code>signal</code> <em>before</em> there are any threads waiting in line? No problem: As soon as the next thread arrives in the lineup, the bouncer will let it pass directly through. And if <code>signal</code> is called, say, 3 times on an empty lineup, the bouncer will let the next 3 threads to arrive pass directly through.</p>

<p><img class="center" src="https://preshing.com/images/sema-count.png" /></p>

<p>Of course, the bouncer needs to keep track of this number, which is why all semaphores maintain an <a href="http://linux.die.net/man/7/sem_overview">integer counter</a>. <code>signal</code> increments the counter, and <code>wait</code> decrements it.</p>

<p>The beauty of this strategy is that if <code>wait</code> is called some number of times, and <code>signal</code> is called some number of times, the outcome is always the same: The bouncer will always release the same number of threads, and there will always be the same number of threads left waiting in line, regardless of the order in which those <code>wait</code> and <code>signal</code> calls occurred.</p>

<p><img class="center" src="https://preshing.com/images/sema-order.png" /></p>
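
<p>To make the counter concrete, here&rsquo;s a minimal sketch of a counting semaphore built from a mutex and a condition variable. It&rsquo;s only a stand-in for illustration: the name <code>SimpleSemaphore</code> and the <code>pendingSignals</code> helper are my own assumptions, and a native semaphore would normally live inside the kernel instead.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Illustrative stand-in only: a counting semaphore built from a mutex and a
// condition variable. SimpleSemaphore is a hypothetical name for this sketch.
class SimpleSemaphore
{
private:
    std::mutex m_mutex;
    std::condition_variable m_condition;
    int m_count;    // Signals that no waiting thread has consumed yet

public:
    explicit SimpleSemaphore(int initialCount = 0) : m_count(initialCount)
    {
        assert(initialCount >= 0);
    }

    void wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        // Sleep until at least one signal is available, then consume it.
        m_condition.wait(lock, [this] { return m_count > 0; });
        m_count--;
    }

    void signal(int count = 1)
    {
        assert(count > 0);
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_count += count;
        }
        // Wake up to `count` sleeping threads, one per signal.
        for (int i = 0; i < count; i++)
            m_condition.notify_one();
    }

    // Hypothetical helper, for demonstration only.
    int pendingSignals()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_count;
    }
};
```

<p>If <code>signal(3)</code> is called on an empty lineup, the next three calls to <code>wait</code> return immediately, and a fourth would block until somebody signals again.</p>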

<h2 id="a-lightweight-mutex">1. A Lightweight Mutex</h2>

<p>I&rsquo;ve already shown how to implement a lightweight mutex in an <a href="http://preshing.com/20120226/roll-your-own-lightweight-mutex">earlier post</a>. I didn&rsquo;t know it at the time, but that post was just one example of a reusable pattern. The trick is to build another mechanism in front of the semaphore, which I like to call the <strong>box office</strong>.</p>

<p><img class="center" src="https://preshing.com/images/sema-box-office.png" /></p>

<p>The box office is where the real decisions are made. Should the current thread wait in line? Should it bypass the queue entirely? Should another thread be released from the queue? The box office cannot directly check how many threads are waiting on the semaphore, nor can it check the semaphore&rsquo;s current signal count. Instead, the box office must somehow keep track of its own previous decisions. In the case of a lightweight mutex, all it needs is an atomic counter. I&rsquo;ll call this counter <code>m_contention</code>, since it keeps track of how many threads are simultaneously contending for the mutex.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">class</span> <span class="class">LightweightMutex</span>
{
<span class="directive">private</span>:
    std::atomic&lt;<span class="predefined-type">int</span>&gt; <span class="highlight">m_contention</span>;         <span class="comment">// The &quot;box office&quot;</span>
    Semaphore m_semaphore;                 <span class="comment">// The &quot;bouncer&quot;</span>
</pre></div>
</div>
</div>

<p>When a thread decides to lock the mutex, it first visits the box office to increment <code>m_contention</code>.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="directive">public</span>:
    <span class="directive">void</span> lock()
    {
        <span class="keyword">if</span> (<span class="highlight">m_contention.fetch_add(<span class="integer">1</span>, std::memory_order_acquire)</span> &gt; <span class="integer">0</span>)  <span class="comment">// Visit the box office</span>
        {
            m_semaphore.wait();     <span class="comment">// Enter the wait queue</span>
        }
    }
</pre></div>
</div>
</div>

<p>If the previous value was 0, that means no other thread has contended for the mutex yet. As such, the current thread immediately considers itself the new owner, bypasses the semaphore, returns from <code>lock</code> and proceeds into whatever code the mutex is intended to protect.</p>

<p>Otherwise, if the previous value was greater than 0, that means another thread is already considered to own the mutex. In that case, the current thread must wait in line for its turn.</p>

<p><img class="center" src="https://preshing.com/images/sema-mutex-1.png" /></p>

<p>When the previous thread unlocks the mutex, it visits the box office to decrement the counter:</p>

<div><div class="CodeRay">
  <div class="code"><pre>    <span class="directive">void</span> unlock()
    {
        <span class="keyword">if</span> (<span class="highlight">m_contention.fetch_sub(<span class="integer">1</span>, std::memory_order_release)</span> &gt; <span class="integer">1</span>)  <span class="comment">// Visit the box office</span>
        {
            m_semaphore.signal();   <span class="comment">// Release a waiting thread from the queue</span>
        }
    }
</pre></div>
</div>
</div>

<p>If the previous counter value was 1, that means no other threads arrived in the meantime, so there&rsquo;s nothing else to do. <code>m_contention</code> is simply left at 0.</p>

<p>Otherwise, if the previous counter value was greater than 1, another thread has attempted to lock the mutex, and is therefore waiting in the queue. As such, we alert the bouncer that it&rsquo;s now safe to release the next thread. That thread will be considered the new owner.</p>

<p><img class="center" src="https://preshing.com/images/sema-mutex-2.png" /></p>

<p>Every visit to the box office is an indivisible, atomic operation. Therefore, even if multiple threads call <code>lock</code> and <code>unlock</code> concurrently, they will always visit the box office one at a time. Furthermore, the behavior of the mutex is <em>completely determined</em> by the decisions made at the box office. After threads visit the box office, they may operate on the semaphore in an unpredictable order, but that&rsquo;s OK. As I&rsquo;ve already explained, the outcome will remain valid regardless of the order in which those semaphore operations occur. (In the worst case, some threads may trade places in line.)</p>
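
<p>Putting the pieces together, the whole class might look roughly like the sketch below. The <code>Semaphore</code> here is a condition-variable stand-in that I&rsquo;m assuming for the sake of a self-contained example, and <code>countWithMutex</code> is a hypothetical test function, not something from the repository.</p>

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for a native semaphore (an assumption made for this sketch).
class Semaphore
{
    std::mutex m_mutex;
    std::condition_variable m_condition;
    int m_count = 0;
public:
    void wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_condition.wait(lock, [this] { return m_count > 0; });
        m_count--;
    }
    void signal()
    {
        { std::lock_guard<std::mutex> lock(m_mutex); m_count++; }
        m_condition.notify_one();
    }
};

class LightweightMutex
{
private:
    std::atomic<int> m_contention{0};   // The "box office"
    Semaphore m_semaphore;              // The "bouncer"

public:
    void lock()
    {
        // Previous value > 0 means somebody else already owns the mutex.
        if (m_contention.fetch_add(1, std::memory_order_acquire) > 0)
            m_semaphore.wait();
    }

    void unlock()
    {
        // Previous value > 1 means at least one thread is waiting in line.
        if (m_contention.fetch_sub(1, std::memory_order_release) > 1)
            m_semaphore.signal();
    }
};

// Hypothetical demo: several threads increment a plain int under the mutex.
int countWithMutex(int threadCount, int itersPerThread)
{
    assert(threadCount > 0 && itersPerThread >= 0);
    LightweightMutex mutex;
    int counter = 0;    // Deliberately not atomic; protected by the mutex
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; t++)
    {
        threads.emplace_back([&] {
            for (int i = 0; i < itersPerThread; i++)
            {
                mutex.lock();
                counter++;
                mutex.unlock();
            }
        });
    }
    for (std::thread& th : threads)
        th.join();
    return counter;
}
```

<p>If the mutex works, <code>countWithMutex(4, 10000)</code> should return exactly 40000, since no increment can be lost.</p>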

<p>This class is considered &ldquo;lightweight&rdquo; because it bypasses the semaphore when there&rsquo;s no contention, thereby avoiding system calls. I&rsquo;ve published it to GitHub as <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/benaphore.h"><code>NonRecursiveBenaphore</code></a> along with a <a href="http://preshing.com/20120305/implementing-a-recursive-mutex">recursive version</a>. However, there&rsquo;s no need to use these classes in practice. Most available mutex implementations are <a href="http://preshing.com/20111124/always-use-a-lightweight-mutex">already lightweight</a>. Nonetheless, they&rsquo;re noteworthy for serving as inspiration for the rest of the primitives described here.</p>

<h2 id="a-lightweight-auto-reset-event-object">2. A Lightweight Auto-Reset Event Object</h2>

<p>You don&rsquo;t hear autoreset event objects discussed very often, but as I mentioned in my <a href="http://preshing.com/20141024/my-multicore-talk-at-cppcon-2014">CppCon 2014 talk</a>, they&rsquo;re widely used in game engines. Most often, they&rsquo;re used to notify a single other thread (possibly sleeping) of available work.</p>

<p><a href="http://preshing.com/20141024/my-multicore-talk-at-cppcon-2014"><img class="center" src="https://preshing.com/images/cppcon-event-slide.png" /></a></p>

<p>An autoreset event object is basically a semaphore that ignores redundant signals. In other words, when <code>signal</code> is called multiple times, the event object&rsquo;s signal count will never exceed 1. That means you can go ahead and publish work units somewhere, blindly calling <code>signal</code> after each one. It&rsquo;s a flexible technique that works even when you publish work units to some data structure other than a queue.</p>

<p>Windows has native support for event objects, but its <a href="https://msdn.microsoft.com/en-us/library/windows/desktop/ms686211.aspx"><code>SetEvent</code></a> function &ndash; the equivalent of <code>signal</code> &ndash; can be expensive. On one machine, I timed it at <strong>700 ns</strong> per call, even when the event was already signaled. If you&rsquo;re publishing thousands of work units between threads, the overhead for each <code>SetEvent</code> can quickly add up.</p>

<p>Luckily, the box office/bouncer pattern reduces this overhead significantly. All of the autoreset event logic can be implemented at the box office using atomic operations, and the box office will invoke the semaphore only when it&rsquo;s absolutely necessary for threads to wait.</p>

<p><img class="center" src="https://preshing.com/images/sema-event.png" /></p>

<p>I&rsquo;ve published the implementation as <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/autoresetevent.h"><code>AutoResetEvent</code></a>. This time, the box office has a different way to keep track of how many threads have been sent to wait in the queue. When <code>m_status</code> is negative, its magnitude indicates how many threads are waiting:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="keyword">class</span> <span class="class">AutoResetEvent</span>
{
<span class="directive">private</span>:
    <span class="comment">// m_status == 1: Event object is signaled.</span>
    <span class="comment">// m_status == 0: Event object is reset and no threads are waiting.</span>
    <span class="comment">// m_status == -N: Event object is reset and N threads are waiting.</span>
    std::atomic&lt;<span class="predefined-type">int</span>&gt; <span class="highlight">m_status</span>;
    Semaphore m_sema;
</pre></div>
</div>
</div>

<p>In the event object&rsquo;s <code>signal</code> operation, we increment <code>m_status</code> atomically, up to the limit of 1:</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="directive">public</span>:
    <span class="directive">void</span> signal()
    {
        <span class="predefined-type">int</span> oldStatus = m_status.load(std::memory_order_relaxed);
        <span class="keyword">for</span> (;;)    <span class="comment">// Increment m_status atomically via CAS loop.</span>
        {
            assert(oldStatus &lt;= <span class="integer">1</span>);
            <span class="highlight"><span class="predefined-type">int</span> newStatus = oldStatus &lt; <span class="integer">1</span> ? oldStatus + <span class="integer">1</span> : <span class="integer">1</span>;</span>
            <span class="keyword">if</span> (m_status.compare_exchange_weak(oldStatus, newStatus, std::memory_order_release, std::memory_order_relaxed))
                <span class="keyword">break</span>;
            <span class="comment">// The compare-exchange failed, likely because another thread changed m_status.</span>
            <span class="comment">// oldStatus has been updated. Retry the CAS loop.</span>
        }
        <span class="keyword">if</span> (oldStatus &lt; <span class="integer">0</span>)
            m_sema.signal();    <span class="comment">// Release one waiting thread.</span>
    }
</pre></div>
</div>
</div>

<p>Note that because the initial load from <code>m_status</code> is relaxed, it&rsquo;s important for the above code to call <code>compare_exchange_weak</code> even if <code>m_status</code> already equals 1. Thanks to commenter Tobias Brüll for pointing that out. See <a href="https://github.com/preshing/cpp11-on-multicore/tree/master/tests/lostwakeup">this README file</a> for more information.</p>
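
<p>The matching <code>wait</code> operation isn&rsquo;t shown above. Going only by the meaning of <code>m_status</code>, it could plausibly look like the sketch below; the published version may differ in its details. The <code>Semaphore</code> stand-in and the <code>status</code> accessor are assumptions added to keep the example self-contained.</p>

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

// Stand-in for a native semaphore (an assumption made for this sketch).
class Semaphore
{
    std::mutex m_mutex;
    std::condition_variable m_condition;
    int m_count = 0;
public:
    void wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_condition.wait(lock, [this] { return m_count > 0; });
        m_count--;
    }
    void signal()
    {
        { std::lock_guard<std::mutex> lock(m_mutex); m_count++; }
        m_condition.notify_one();
    }
};

class AutoResetEvent
{
private:
    // 1: signaled; 0: reset, nobody waiting; -N: reset, N threads waiting.
    std::atomic<int> m_status{0};
    Semaphore m_sema;

public:
    void signal()
    {
        int oldStatus = m_status.load(std::memory_order_relaxed);
        for (;;)    // Increment m_status atomically, up to the limit of 1.
        {
            assert(oldStatus <= 1);
            int newStatus = oldStatus < 1 ? oldStatus + 1 : 1;
            if (m_status.compare_exchange_weak(oldStatus, newStatus,
                    std::memory_order_release, std::memory_order_relaxed))
                break;
        }
        if (oldStatus < 0)
            m_sema.signal();    // Release one waiting thread.
    }

    void wait()
    {
        // Visit the box office: consume the signal if there is one,
        // otherwise register as one more waiting thread.
        int oldStatus = m_status.fetch_sub(1, std::memory_order_acquire);
        assert(oldStatus <= 1);
        if (oldStatus < 1)
            m_sema.wait();      // Enter the wait queue.
    }

    // Hypothetical accessor, for demonstration only.
    int status() const { return m_status.load(std::memory_order_relaxed); }
};
```

<p>With this sketch, calling <code>signal</code> three times in a row leaves <code>m_status</code> at 1, and a single <code>wait</code> afterwards returns immediately and resets it to 0.</p>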

<h2 id="a-lightweight-read-write-lock">3. A Lightweight Read-Write Lock</h2>

<p>Using the same box office/bouncer pattern, it&rsquo;s possible to implement a pretty good <a href="http://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock">read-write lock</a>. This read-write lock is completely lock-free in the absence of writers, it&rsquo;s starvation-free for both readers and writers, and just like the other primitives, it can spin before putting threads to sleep. It requires two semaphores: one for waiting readers, and another for waiting writers. The code is available as <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/rwlock.h"><code>NonRecursiveRWLock</code></a>.</p>

<p><img class="center" src="https://preshing.com/images/sema-rwlock.png" /></p>
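
<p>As a rough idea of how such a lock can be put together, here&rsquo;s a sketch that packs three counters into a single atomic word. The bit layout, the name <code>SketchRWLock</code>, the <code>activeReaders</code> helper and the <code>Semaphore</code> stand-in are all assumptions of mine for illustration; see the repository for the real thing.</p>

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Stand-in for a native semaphore (an assumption made for this sketch).
class Semaphore
{
    std::mutex m_mutex;
    std::condition_variable m_condition;
    int m_count = 0;
public:
    void wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_condition.wait(lock, [this] { return m_count > 0; });
        m_count--;
    }
    void signal(int count = 1)
    {
        { std::lock_guard<std::mutex> lock(m_mutex); m_count += count; }
        m_condition.notify_all();
    }
};

// Assumed layout of the box office word, 10 bits per field:
// bits 0-9 active readers, bits 10-19 waiting readers, bits 20-29 writers.
constexpr uint32_t kOneReader = 1u;
constexpr uint32_t kOneWaitToRead = 1u << 10;
constexpr uint32_t kOneWriter = 1u << 20;
inline uint32_t readers(uint32_t s) { return s & 0x3ff; }
inline uint32_t waitToRead(uint32_t s) { return (s >> 10) & 0x3ff; }
inline uint32_t writers(uint32_t s) { return (s >> 20) & 0x3ff; }

class SketchRWLock
{
private:
    std::atomic<uint32_t> m_status{0};
    Semaphore m_readSema;     // Bouncer for waiting readers
    Semaphore m_writeSema;    // Bouncer for waiting writers

public:
    void lockReader()
    {
        uint32_t oldStatus = m_status.load(std::memory_order_relaxed);
        uint32_t newStatus;
        do
        {
            // If any writer is present, join the waiting readers instead.
            uint32_t step = writers(oldStatus) > 0 ? kOneWaitToRead : kOneReader;
            newStatus = oldStatus + step;
        } while (!m_status.compare_exchange_weak(oldStatus, newStatus,
                     std::memory_order_acquire, std::memory_order_relaxed));
        if (writers(oldStatus) > 0)
            m_readSema.wait();
    }

    void unlockReader()
    {
        uint32_t oldStatus = m_status.fetch_sub(kOneReader, std::memory_order_release);
        assert(readers(oldStatus) > 0);
        if (readers(oldStatus) == 1 && writers(oldStatus) > 0)
            m_writeSema.signal();   // Last reader out wakes one writer.
    }

    void lockWriter()
    {
        uint32_t oldStatus = m_status.fetch_add(kOneWriter, std::memory_order_acquire);
        if (readers(oldStatus) > 0 || writers(oldStatus) > 0)
            m_writeSema.wait();
    }

    void unlockWriter()
    {
        uint32_t oldStatus = m_status.load(std::memory_order_relaxed);
        uint32_t newStatus;
        uint32_t waiting;
        do
        {
            assert(readers(oldStatus) == 0);
            waiting = waitToRead(oldStatus);
            // Remove this writer and promote every waiting reader to active.
            newStatus = oldStatus - kOneWriter
                        - waiting * kOneWaitToRead + waiting * kOneReader;
        } while (!m_status.compare_exchange_weak(oldStatus, newStatus,
                     std::memory_order_release, std::memory_order_relaxed));
        if (waiting > 0)
            m_readSema.signal(static_cast<int>(waiting));
        else if (writers(oldStatus) > 1)
            m_writeSema.signal();   // Hand off to the next writer.
    }

    // Hypothetical helper, for demonstration only.
    uint32_t activeReaders() const { return readers(m_status.load()); }
};
```

<p>In this sketch, readers that arrive while no writer is present never touch a semaphore, which is where the lock-free fast path comes from. A writer that unlocks lets in every reader that queued up behind it before the next writer gets a turn.</p>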

<h2 id="another-solution-to-the-dining-philosophers-problem">4. Another Solution to the Dining Philosophers Problem</h2>

<p>The box office/bouncer pattern can also solve Dijkstra&rsquo;s <a href="http://en.wikipedia.org/wiki/Dining_philosophers_problem">dining philosophers problem</a> in a way that I haven&rsquo;t seen described elsewhere. If you&rsquo;re not familiar with this problem, it involves philosophers that share dinner forks with each other. Each philosopher must obtain two specific forks before he or she can eat. I don&rsquo;t believe this solution will prove useful to anybody, so I won&rsquo;t go into great detail. I&rsquo;m just including it as further demonstration of semaphores&rsquo; versatility.</p>

<p>In this solution, we assign each philosopher (thread) its own dedicated semaphore. The box office keeps track of which philosophers are eating, which ones have requested to eat, and the order in which those requests arrived. With that information, the box office is able to shepherd all philosophers through their bouncers in an optimal way.</p>

<p><img class="center" src="https://preshing.com/images/sema-philosophers.png" /></p>

<p>I&rsquo;ve posted two implementations. One is <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/diningphilosophers.h"><code>DiningPhilosophers</code></a>, which implements the box office using a mutex. The other is <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/diningphilosophers.h"><code>LockReducedDiningPhilosophers</code></a>, in which every visit to the box office is lock-free.</p>

<h2 id="a-lightweight-semaphore-with-partial-spinning">5. A Lightweight Semaphore with Partial Spinning</h2>

<p>You read that right: It&rsquo;s possible to combine a semaphore with a box office to implement&hellip; another semaphore.</p>

<p>Why would you do such a thing? Because you end up with a <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/sema.h"><code>LightweightSemaphore</code></a>. It becomes extremely cheap when the lineup is empty and the signal count climbs above zero, regardless of how the underlying semaphore is implemented. In such cases, the box office will rely entirely on atomic operations, leaving the underlying semaphore untouched.</p>

<p><img class="center" src="https://preshing.com/images/sema-lwsema2.png" /></p>

<p>Not only that, but you can make threads wait in a <a href="http://en.wikipedia.org/wiki/Spinlock">spin loop</a> for a short period of time before invoking the underlying semaphore. This trick helps avoid expensive system calls when the wait time ends up being short.</p>
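
<p>Here&rsquo;s a sketch of how the box office and the spin loop might fit together. I&rsquo;ve named it <code>SpinningSemaphore</code> to make clear it&rsquo;s an illustration rather than the published class; the <code>Semaphore</code> stand-in and the <code>kSpinLimit</code> constant are likewise assumptions.</p>

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

// Stand-in for a native semaphore (an assumption made for this sketch).
class Semaphore
{
    std::mutex m_mutex;
    std::condition_variable m_condition;
    int m_count = 0;
public:
    void wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_condition.wait(lock, [this] { return m_count > 0; });
        m_count--;
    }
    void signal(int count = 1)
    {
        { std::lock_guard<std::mutex> lock(m_mutex); m_count += count; }
        m_condition.notify_all();
    }
};

constexpr int kSpinLimit = 10000;   // Assumed fixed spin count

class SpinningSemaphore
{
private:
    // Box office: > 0 means signals are available,
    // < 0 means -m_count threads are sleeping on m_sema.
    std::atomic<int> m_count;
    Semaphore m_sema;

public:
    explicit SpinningSemaphore(int initialCount = 0) : m_count(initialCount)
    {
        assert(initialCount >= 0);
    }

    // Take a signal only if one is available right now.
    bool tryWait()
    {
        int oldCount = m_count.load(std::memory_order_relaxed);
        return oldCount > 0 &&
               m_count.compare_exchange_strong(oldCount, oldCount - 1,
                                               std::memory_order_acquire);
    }

    void wait()
    {
        // Spin for a while in userspace, hoping a signal shows up.
        for (int spin = 0; spin < kSpinLimit; spin++)
        {
            if (tryWait())
                return;
            // Discourage the compiler from collapsing the loop.
            std::atomic_signal_fence(std::memory_order_acquire);
        }
        // Give up spinning: register as a waiter, and sleep if necessary.
        int oldCount = m_count.fetch_sub(1, std::memory_order_acquire);
        if (oldCount <= 0)
            m_sema.wait();
    }

    void signal(int count = 1)
    {
        assert(count > 0);
        int oldCount = m_count.fetch_add(count, std::memory_order_release);
        // Only wake as many threads as are actually sleeping.
        int waiters = oldCount < 0 ? -oldCount : 0;
        int toRelease = waiters < count ? waiters : count;
        if (toRelease > 0)
            m_sema.signal(toRelease);
    }
};
```

<p>As long as the count stays above zero, both <code>wait</code> and <code>signal</code> in this sketch complete with a single atomic operation and never touch the underlying semaphore.</p>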

<p>In the <a href="https://github.com/preshing/cpp11-on-multicore/tree/master/common">GitHub repository</a>, all of the other primitives are implemented on top of <code>LightweightSemaphore</code>, rather than using <code>Semaphore</code> directly. That&rsquo;s how they all inherit the ability to partially spin. <code>LightweightSemaphore</code> sits on top of <code>Semaphore</code>, which in turn encapsulates a platform-specific semaphore.</p>

<p><img class="center" src="https://preshing.com/images/semaphore-class-diagram.png" /></p>

<p>The repository comes with a simple test suite, with each test case exercising a different primitive. It&rsquo;s possible to remove <code>LightweightSemaphore</code> and force all primitives to use <code>Semaphore</code> directly. Here are the resulting timings on my Windows PC:</p>

<table class="grid" id="shake">
<thead><tr><th /><th>LightweightSemaphore</th><th>Semaphore</th></tr></thead>
<tbody>
<tr><td>testBenaphore</td><td style="text-align: center">375 ms</td><td style="text-align: center; background-color: #fff8f8">5503 ms</td></tr>
<tr><td>testRecursiveBenaphore</td><td style="text-align: center">393 ms</td><td style="text-align: center; background-color: #fff8f8">404 ms</td></tr>
<tr><td>testAutoResetEvent</td><td style="text-align: center">593 ms</td><td style="text-align: center; background-color: #fff8f8">4665 ms</td></tr>
<tr><td>testRWLock</td><td style="text-align: center">598 ms</td><td style="text-align: center; background-color: #fff8f8">7126 ms</td></tr>
<tr><td>testDiningPhilosophers</td><td style="text-align: center">309 ms</td><td style="text-align: center; background-color: #fff8f8">580 ms</td></tr>
</tbody></table>

<p>As you can see, the test suite benefits significantly from <code>LightweightSemaphore</code> in this environment. Having said that, I&rsquo;m pretty sure the current spinning strategy is not optimal for every environment. It simply spins a fixed 10,000 times before falling back on <code>Semaphore</code>. I looked briefly into adaptive spinning, but the best approach wasn&rsquo;t obvious. Any suggestions?</p>

<h2 id="comparison-with-condition-variables">Comparison With Condition Variables</h2>

<p>With all of these applications, semaphores are more general-purpose than I originally thought &ndash; and this wasn&rsquo;t even a complete list. So why are semaphores absent from the standard C++11 library? For the same reason they&rsquo;re absent from Boost: a preference for <a href="http://en.wikipedia.org/wiki/Monitor_%28synchronization%29"><strong>mutexes and condition variables</strong></a>. From the library maintainers&rsquo; point of view, conventional semaphore techniques are just <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2043.html#SemaphoreTypes">too error prone</a>.</p>

<p>When you think about it, though, the box office/bouncer pattern shown here is really just an optimization for condition variables in a specific case &ndash; the case where all condition variable operations are performed at the end of the critical section.</p>

<p>Consider the <code>AutoResetEvent</code> class described above. I&rsquo;ve implemented <a href="https://github.com/preshing/cpp11-on-multicore/blob/master/common/autoreseteventcondvar.h"><code>AutoResetEventCondVar</code></a>, an equivalent class based on a condition variable, in the same repository. Its condition variable is always manipulated at the end of the critical section.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="directive">void</span> AutoResetEventCondVar::signal()
{
    <span class="comment">// Increment m_status atomically via critical section.</span>
    std::unique_lock&lt;std::mutex&gt; lock(m_mutex);
    <span class="predefined-type">int</span> oldStatus = m_status;
    <span class="keyword">if</span> (oldStatus == <span class="integer">1</span>)
        <span class="keyword">return</span>;     <span class="comment">// Event object is already signaled.</span>
    m_status++;
    <span class="keyword">if</span> (oldStatus &lt; <span class="integer">0</span>)
        <span class="highlight">m_condition.notify_one()</span>;   <span class="comment">// Release one waiting thread.</span>
}
</pre></div>
</div>
</div>

<p>We can optimize <code>AutoResetEventCondVar</code> in two steps:</p>

<ol>
  <li>
    <p>Pull each condition variable outside of its critical section and convert it to a semaphore. The order-independence of semaphore operations makes this safe. After this step, we&rsquo;ve already implemented the box office/bouncer pattern. (In general, this step also lets us avoid a <a href="http://javaagile.blogspot.ca/2012/12/the-thundering-herd.html">thundering herd</a> when multiple threads are signaled at once.)</p>
  </li>
  <li>
    <p>Make the box office lock-free by <a href="http://preshing.com/20150402/you-can-do-any-kind-of-atomic-read-modify-write-operation">converting all operations to CAS loops</a>, greatly improving its scalability. This step results in <code>AutoResetEvent</code>.</p>
  </li>
</ol>
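
<p>To illustrate the halfway point after step 1, here&rsquo;s a sketch in which the box office is still a mutex-protected integer, but the condition variable has been replaced by a semaphore that&rsquo;s operated outside the critical section. The class name <code>AutoResetEventStepOne</code>, the <code>status</code> accessor and the <code>Semaphore</code> stand-in are assumptions made for this example.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Stand-in for a native semaphore (an assumption made for this sketch).
class Semaphore
{
    std::mutex m_mutex;
    std::condition_variable m_condition;
    int m_count = 0;
public:
    void wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_condition.wait(lock, [this] { return m_count > 0; });
        m_count--;
    }
    void signal()
    {
        { std::lock_guard<std::mutex> lock(m_mutex); m_count++; }
        m_condition.notify_one();
    }
};

class AutoResetEventStepOne
{
private:
    std::mutex m_mutex;     // Protects m_status: the box office, still lock-based
    int m_status = 0;       // Same meaning as in AutoResetEvent
    Semaphore m_sema;       // The bouncer, replacing the condition variable

public:
    void signal()
    {
        int oldStatus;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            oldStatus = m_status;
            if (oldStatus == 1)
                return;     // Event object is already signaled.
            m_status++;
        }
        // The decision was made inside the critical section;
        // the semaphore operation happens outside of it.
        if (oldStatus < 0)
            m_sema.signal();
    }

    void wait()
    {
        int oldStatus;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            oldStatus = m_status;
            assert(oldStatus <= 1);
            m_status--;
        }
        if (oldStatus < 1)
            m_sema.wait();
    }

    // Hypothetical accessor, for demonstration only.
    int status()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_status;
    }
};
```

<p>Replacing the mutex-protected integer with an atomic one, and each critical section with a CAS loop or a single read-modify-write, is what step 2 describes.</p>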

<p><img class="center" src="https://preshing.com/images/sema-condvar.png" /></p>

<p>On my Windows PC, using <code>AutoResetEvent</code> in place of <code>AutoResetEventCondVar</code> makes the associated test case run <strong>10x</strong> faster.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[C++ Has Become More Pythonic]]></title>
    <link href="https://preshing.com/20141202/cpp-has-become-more-pythonic"/>
    <updated>2014-12-02T08:20:00-05:00</updated>
    <id>https://preshing.com/?p=20141202</id>
    <content type="html"><![CDATA[<p><img class="center" src="https://preshing.com/images/pythonic_cpp.png" /></p>

<p>C++ has changed a lot in recent years. The last two revisions, C++11 and C++14, introduce so many new features that, in the words of Bjarne Stroustrup, <a href="http://www.stroustrup.com/C++11FAQ.html#think">&ldquo;It feels like a new language.&rdquo;</a></p>

<p>It&rsquo;s true. Modern C++ lends itself to a whole new style of programming &ndash; and I couldn&rsquo;t help noticing it has more of a <a href="https://www.python.org/">Python</a> flavor. Range-based for loops, type deduction, vector and map initializers, lambda expressions. The more you explore modern C++, the more you find Python&rsquo;s fingerprints all over it.</p>

<!--more-->
<p>Was Python a direct influence on modern C++? Or did Python simply adopt a few useful constructs before C++ got around to it? You be the judge.</p>

<h2 id="literals">Literals</h2>

<p>Python introduced <a href="https://docs.python.org/dev/whatsnew/2.6.html#pep-3127-integer-literal-support-and-syntax">binary literals</a> in 2008. Now <a href="http://en.cppreference.com/w/cpp/language/integer_literal">C++14 has them</a>. <em>[Update: Thiago Macieira points out in the comments that GCC actually supported them back in 2007.]</em></p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre>static const unsigned int primes = 0b10100000100010100010100010101100;
</pre></div>
</div>
</div>

<p>Python also introduced <a href="https://www.python.org/download/releases/1.5/whatsnew/">raw string literals</a> back in 1998. They&rsquo;re convenient when hardcoding a regular expression or a Windows path. <a href="http://en.cppreference.com/w/cpp/language/string_literal">C++11</a> added the same idea with a slightly different syntax:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre>const char* path = R&quot;(c:\this\string\has\backslashes)&quot;;
</pre></div>
</div>
</div>

<h2 id="range-based-for-loops">Range-Based For Loops</h2>

<p>In Python, a <code>for</code> loop always iterates over a Python object:</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="keyword">for</span> x <span class="keyword">in</span> myList:
    print(x)
</pre></div>
</div>
</div>

<p>Meanwhile, for nearly three decades, C++ supported only C-style <code>for</code> loops. Finally, in C++11, <a href="http://en.cppreference.com/w/cpp/language/range-for">range-based for loops</a> were added:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="keyword">for</span> (<span class="predefined-type">int</span> x : myList)
    std::cout &lt;&lt; x;
</pre></div>
</div>
</div>

<p>You can iterate over a <code>std::vector</code> or any class which implements the <code>begin</code> and <code>end</code> member functions &ndash; not unlike Python&rsquo;s <a href="https://docs.python.org/release/2.2/lib/typeiter.html">iterator protocol</a>. With range-based for loops, I often find myself wishing C++ had Python&rsquo;s <code>xrange</code> function built-in.</p>
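
<p>In the meantime, it doesn&rsquo;t take much to roll your own. Here&rsquo;s a minimal sketch of a class that works in a range-based for loop; <code>IntRange</code> and <code>collect</code> are hypothetical names, and the range only counts upward from zero.</p>

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, loosely modeled on Python's xrange:
// iterates over the integers in [0, stop).
class IntRange
{
public:
    // The loop only needs operator*, operator++ and operator!=.
    class Iterator
    {
    public:
        explicit Iterator(int value) : m_value(value) {}
        int operator*() const { return m_value; }
        Iterator& operator++() { ++m_value; return *this; }
        bool operator!=(const Iterator& other) const { return m_value != other.m_value; }
    private:
        int m_value;
    };

    explicit IntRange(int stop) : m_stop(stop) { assert(stop >= 0); }
    Iterator begin() const { return Iterator(0); }
    Iterator end() const { return Iterator(m_stop); }

private:
    int m_stop;
};

// Gathers the values produced by the loop, to show what it visits.
std::vector<int> collect(int stop)
{
    std::vector<int> values;
    for (int x : IntRange(stop))    // Works because IntRange has begin() and end()
        values.push_back(x);
    return values;
}
```

<p>Calling <code>collect(4)</code> visits 0, 1, 2 and 3, much like <code>for x in xrange(4)</code> would in Python.</p>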

<h2 id="auto">Auto</h2>

<p>Python has always been a dynamically typed language. You don&rsquo;t need to declare variable types anywhere, since types are a property of the objects themselves.</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre>x = <span class="string"><span class="delimiter">&quot;</span><span class="content">Hello world!</span><span class="delimiter">&quot;</span></span>
print(x)
</pre></div>
</div>
</div>

<p>C++, on the other hand, is not dynamically typed. It&rsquo;s statically typed. But since C++11 <a href="http://en.cppreference.com/w/cpp/language/auto">repurposed</a> the <code>auto</code> keyword for type deduction, you can write code that <em>looks</em> a lot like dynamic typing:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="directive">auto</span> x = <span class="string"><span class="delimiter">&quot;</span><span class="content">Hello world!</span><span class="delimiter">&quot;</span></span>;
std::cout &lt;&lt; x;
</pre></div>
</div>
</div>

<p>When you call functions that are overloaded for several types, such as <code>std::ostream::operator&lt;&lt;</code> or a function template, C++ resembles a dynamically typed language even more. C++14 further fleshes out support for the <code>auto</code> keyword, adding support for <code>auto</code> <a href="http://en.wikipedia.org/wiki/C%2B%2B14#Function_return_type_deduction">return values</a> and <code>auto</code> <a href="http://en.wikipedia.org/wiki/C%2B%2B14#Generic_lambdas">arguments</a> to lambda functions.</p>
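
<p>Here&rsquo;s a small sketch of both C++14 additions. The function names <code>twice</code> and <code>genericLambdaDemo</code> are made up for the example, and it needs a compiler running in C++14 mode or later.</p>

```cpp
#include <cassert>
#include <string>

// C++14: the return type (int) is deduced from the return statement.
auto twice(int x)
{
    return x * 2;
}

std::string genericLambdaDemo()
{
    // C++14 generic lambda: each `auto` parameter is deduced per call,
    // so the same lambda works for ints and for strings.
    auto add = [](auto a, auto b) { return a + b; };

    int sum = add(2, 3);
    assert(sum == 5);
    std::string text = add(std::string("foo"), "bar");

    return text + std::to_string(sum + twice(10));
}
```

<p>Everything is still checked at compile time. Each distinct combination of argument types simply instantiates its own version of the lambda&rsquo;s call operator.</p>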

<h2 id="tuples">Tuples</h2>

<p>Python has had <a href="https://docs.python.org/release/1.4/ref/ref3.html">tuples</a> pretty much since the beginning. They&rsquo;re nice when you need to package several values together, but don&rsquo;t feel like naming a class.</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre>triple = (<span class="integer">5</span>, <span class="integer">6</span>, <span class="integer">7</span>)
print(triple[<span class="integer">0</span>])
</pre></div>
</div>
</div>

<p>C++ added tuples to the standard library in C++11. The proposal <a href="http://open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1403.pdf">even mentions Python</a> as an inspiration:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="directive">auto</span> triple = std::make_tuple(<span class="integer">5</span>, <span class="integer">6</span>, <span class="integer">7</span>);
std::cout &lt;&lt; std::get&lt;<span class="integer">0</span>&gt;(triple);
</pre></div>
</div>
</div>

<p>Python lets you unpack a tuple into separate variables:</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre>x, y, z = triple
</pre></div>
</div>
</div>

<p>You can do the same thing in C++ using <code>std::tie</code>:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre>std::tie(x, y, z) = triple;
</pre></div>
</div>
</div>
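
<p>One difference from Python is that the variables must already be declared before you unpack into them. Here&rsquo;s a slightly fuller sketch, where <code>makeTriple</code> and <code>sumOfUnpacked</code> are hypothetical names; it also uses <code>std::ignore</code> to skip elements you don&rsquo;t care about.</p>

```cpp
#include <cassert>
#include <tuple>

std::tuple<int, int, int> makeTriple()
{
    return std::make_tuple(5, 6, 7);
}

int sumOfUnpacked()
{
    int x, y, z;                        // Unlike Python, these must exist first
    std::tie(x, y, z) = makeTriple();   // x = 5, y = 6, z = 7

    int first;
    std::tie(first, std::ignore, std::ignore) = makeTriple();   // Keep only the first element
    assert(first == x);

    return x + y + z + first;
}
```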

<h2 id="uniform-initialization">Uniform Initialization</h2>

<p>In Python, lists are a <a href="https://docs.python.org/2/tutorial/introduction.html#lists">built-in type</a>. As such, you can create a Python list using a single expression:</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre>myList = [<span class="integer">6</span>, <span class="integer">3</span>, <span class="integer">7</span>, <span class="integer">8</span>]
myList.append(<span class="integer">5</span>)
</pre></div>
</div>
</div>

<p>C++&rsquo;s <code>std::vector</code> is the closest analog to a Python list. <a href="http://www.stroustrup.com/C++11FAQ.html#init-list">Uniform initialization</a>, new in C++11, now lets us create them using a single expression as well:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="directive">auto</span> myList = std::vector&lt;<span class="predefined-type">int</span>&gt;{ <span class="integer">6</span>, <span class="integer">3</span>, <span class="integer">7</span>, <span class="integer">8</span> };
myList.push_back(<span class="integer">5</span>);
</pre></div>
</div>
</div>

<p>In Python, you can also create a <a href="https://docs.python.org/2/tutorial/datastructures.html#dictionaries">dictionary</a> with a single expression:</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre>myDict = {<span class="integer">5</span>: <span class="string"><span class="delimiter">&quot;</span><span class="content">foo</span><span class="delimiter">&quot;</span></span>, <span class="integer">6</span>: <span class="string"><span class="delimiter">&quot;</span><span class="content">bar</span><span class="delimiter">&quot;</span></span>}
print(myDict[<span class="integer">5</span>])
</pre></div>
</div>
</div>

<p>Similarly, uniform initialization also works on C++&rsquo;s <code>std::map</code> and <code>unordered_map</code>:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="directive">auto</span> myDict = std::unordered_map&lt;<span class="predefined-type">int</span>, <span class="directive">const</span> <span class="predefined-type">char</span>*&gt;{ { <span class="integer">5</span>, <span class="string"><span class="delimiter">&quot;</span><span class="content">foo</span><span class="delimiter">&quot;</span></span> }, { <span class="integer">6</span>, <span class="string"><span class="delimiter">&quot;</span><span class="content">bar</span><span class="delimiter">&quot;</span></span> } };
std::cout &lt;&lt; myDict[<span class="integer">5</span>];
</pre></div>
</div>
</div>

<h2 id="lambda-expressions">Lambda Expressions</h2>

<p>Python has supported <a href="http://www.artima.com/weblogs/viewpost.jsp?thread=98196">lambda functions</a> since 1994:</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre>myList.sort(key = <span class="keyword">lambda</span> x: <span class="predefined">abs</span>(x))
</pre></div>
</div>
</div>

<p><a href="http://www.stroustrup.com/C++11FAQ.html#lambda">Lambda expressions</a> were added in C++11:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre>std::sort(myList.begin(), myList.end(), [](<span class="predefined-type">int</span> x, <span class="predefined-type">int</span> y){ <span class="keyword">return</span> std::abs(x) &lt; std::abs(y); });
</pre></div>
</div>
</div>

<p>In 2001, Python added <a href="https://docs.python.org/2/whatsnew/2.2.html#pep-227-nested-scopes">statically nested scopes</a>, which allow lambda functions to capture variables defined in enclosing functions:</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="keyword">def</span> <span class="function">adder</span>(amount):
    <span class="keyword">return</span> <span class="keyword">lambda</span> x: x + amount
...
print(adder(<span class="integer">5</span>)(<span class="integer">5</span>))
</pre></div>
</div>
</div>

<p>Likewise, C++ lambda expressions support a flexible set of <a href="http://en.cppreference.com/w/cpp/language/lambda#Lambda_capture">capture rules</a>, allowing you to do similar things:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="directive">auto</span> adder(<span class="predefined-type">int</span> amount) {
    <span class="keyword">return</span> [=](<span class="predefined-type">int</span> x){ <span class="keyword">return</span> x + amount; };
}
...
std::cout &lt;&lt; adder(<span class="integer">5</span>)(<span class="integer">5</span>);
</pre></div>
</div>
</div>
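
<p>To give a rough feel for how the capture list changes behavior, here is a minimal sketch contrasting capture by value (<code>[=]</code>) with capture by reference (<code>[&amp;]</code>). The function names <code>captureByValue</code> and <code>captureByReference</code> are made up for this illustration and are not part of any library:</p>

```cpp
#include <cassert>

// Hypothetical helper: the lambda copies `amount` at the point where the
// lambda is created, so the later assignment is not seen by the lambda.
int captureByValue() {
    int amount = 5;
    auto add = [=](int x) { return x + amount; };
    amount = 100;              // does not affect the copy held by `add`
    return add(1);
}

// Hypothetical helper: the lambda refers to the original `amount`, so it
// observes the assignment made before the call.
int captureByReference() {
    int amount = 5;
    auto add = [&](int x) { return x + amount; };
    amount = 100;              // visible through the captured reference
    return add(1);
}
```

<p>This is also why <code>adder</code> above captures by value: a lambda that captured <code>amount</code> by reference would be left holding a dangling reference once <code>adder</code> returns.</p>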

<h2 id="standard-algorithms">Standard Algorithms</h2>

<p>Python&rsquo;s built-in <code>filter</code> function lets you selectively copy elements from a list (though <a href="https://docs.python.org/3/whatsnew/2.0.html#list-comprehensions">list comprehensions</a> are preferred):</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre>result = <span class="predefined">filter</span>(<span class="keyword">lambda</span> x: x &gt;= <span class="integer">0</span>, myList)
</pre></div>
</div>
</div>

<p>C++11 <a href="http://en.cppreference.com/w/cpp/algorithm/copy">introduces</a> <code>std::copy_if</code>, which lets us use a similar, almost-functional style:</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="directive">auto</span> result = std::vector&lt;<span class="predefined-type">int</span>&gt;{};
std::copy_if(myList.begin(), myList.end(), std::back_inserter(result), [](<span class="predefined-type">int</span> x){ <span class="keyword">return</span> x &gt;= <span class="integer">0</span>; });
</pre></div>
</div>
</div>

<p>Other C++ <a href="http://en.cppreference.com/w/cpp/algorithm">algorithms</a> that mimic Python built-ins include <code>transform</code>, <code>any_of</code>, <code>all_of</code>, <code>min</code> and <code>max</code>. The upcoming <a href="https://github.com/ericniebler/range-v3/blob/master/doc/D4128.md">ranges proposal</a> has the potential to simplify such expressions further.</p>
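
<p>To make the comparison a little more concrete, here is a small sketch in which <code>std::transform</code>, <code>std::any_of</code> and <code>std::all_of</code> play roughly the roles of a Python list comprehension, <code>any</code> and <code>all</code>. The helper names <code>squares</code>, <code>hasNegative</code> and <code>allEven</code> are invented for this example:</p>

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Roughly Python's: [x * x for x in values]
std::vector<int> squares(const std::vector<int>& values) {
    std::vector<int> result;
    result.reserve(values.size());
    std::transform(values.begin(), values.end(), std::back_inserter(result),
                   [](int x) { return x * x; });
    return result;
}

// Roughly Python's: any(x < 0 for x in values)
bool hasNegative(const std::vector<int>& values) {
    return std::any_of(values.begin(), values.end(),
                       [](int x) { return x < 0; });
}

// Roughly Python's: all(x % 2 == 0 for x in values)
bool allEven(const std::vector<int>& values) {
    return std::all_of(values.begin(), values.end(),
                       [](int x) { return x % 2 == 0; });
}
```

<p>As in Python, <code>std::all_of</code> is true and <code>std::any_of</code> is false when the input is empty.</p>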

<h2 id="parameter-packs">Parameter Packs</h2>

<p>Python began supporting arbitrary argument lists in <a href="https://docs.python.org/release/1.5/tut/node29.html">1998</a>. You can define a function taking a variable number of arguments, exposed as a tuple, and expand a tuple when passing arguments to another function:</p>

<div overlay="/images/pyicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="keyword">def</span> <span class="function">foo</span>(*args):
    <span class="keyword">return</span> <span class="predefined">tuple</span>(args)
...
triple = foo(<span class="integer">5</span>, <span class="integer">6</span>, <span class="integer">7</span>)
</pre></div>
</div>
</div>

<p>C++11 adds support for <a href="http://www.stroustrup.com/C++11FAQ.html#variadic-templates">parameter packs</a>. Unlike C-style variable arguments, but like Python&rsquo;s arbitrary argument lists, the parameter pack has a name which represents the entire sequence of arguments. One important difference: C++ parameter packs are <em>not</em> exposed as a single object at runtime. You can only manipulate them through template metaprogramming at compile time.</p>

<div overlay="/images/cppicon.png"><div class="CodeRay">
  <div class="code"><pre><span class="keyword">template</span> &lt;<span class="keyword">typename</span>... T&gt; <span class="directive">auto</span> foo(T&amp;&amp;... args) {
    <span class="keyword">return</span> std::make_tuple(args...);
}
...
<span class="directive">auto</span> triple = foo(<span class="integer">5</span>, <span class="integer">6</span>, <span class="integer">7</span>);
</pre></div>
</div>
</div>
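
<p>As a sketch of what manipulating a pack at compile time can look like, the following uses <code>sizeof...</code> to count the arguments and a recursive expansion to add them up. The names <code>countArgs</code> and <code>sum</code> are hypothetical, and the example sticks to <code>int</code> arguments to stay short:</p>

```cpp
#include <cassert>
#include <cstddef>

// sizeof... yields the number of elements in the pack at compile time.
template <typename... T>
constexpr std::size_t countArgs(T&&...) {
    return sizeof...(T);
}

// Base case: an empty argument list sums to zero.
inline int sum() {
    return 0;
}

// Recursive case: peel off the first argument, then expand the rest of the
// pack into another call. Each step is a separate template instantiation.
template <typename... Rest>
int sum(int first, Rest... rest) {
    return first + sum(rest...);
}
```

<p>There is no loop over the pack at runtime here. The compiler instantiates one overload of <code>sum</code> per remaining argument count, which is quite different from iterating over a Python tuple.</p>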

<p>Not all of the new C++11 and C++14 features mimic Python functionality, but it seems a lot of them do. Python is recognized as a friendly, approachable programming language. Perhaps some of its charisma has rubbed off?</p>

<p>What do you think? Do the new features succeed in making C++ simpler, more approachable or more expressive?</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Fixing GCC's Implementation of memory_order_consume]]></title>
    <link href="https://preshing.com/20141124/fixing-gccs-implementation-of-memory_order_consume"/>
    <updated>2014-11-24T06:25:00-05:00</updated>
    <id>https://preshing.com/?p=20141124</id>
    <content type="html"><![CDATA[<p>As I <a href="http://preshing.com/20140709/the-purpose-of-memory_order_consume-in-cpp11">explained previously</a>, there are two valid ways for a C++11 compiler to implement <code>memory_order_consume</code>: an efficient strategy and a heavy one. In the heavy strategy, the compiler simply treats <code>memory_order_consume</code> as an alias for <code>memory_order_acquire</code>. The heavy strategy is not what the designers of <code>memory_order_consume</code> had in mind, but technically, it&rsquo;s still compliant with the C++11 standard.</p>

<p><img class="center" src="https://preshing.com/images/consume-strategy-circled.png" /></p>

<p>There&rsquo;s a somewhat common misconception that all current C++11 compilers use the heavy strategy. I certainly had that impression until recently, and others I spoke to at <a href="http://preshing.com/20141024/my-multicore-talk-at-cppcon-2014">CppCon 2014</a> seemed to have that impression as well.</p>

<p>This belief turns out not to be true: GCC does not always use the heavy strategy (yet). GCC 4.9.2 actually has a bug in its implementation of <code>memory_order_consume</code>, as described in <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448">this GCC bug report</a>. I was rather surprised to learn that, since it contradicted <a href="http://preshing.com/20140709/the-purpose-of-memory_order_consume-in-cpp11">my own experience with GCC 4.8.3</a>, in which the PowerPC compiler appeared to use the heavy strategy correctly.</p>

<!--more-->
<p>I decided to verify the bug on my own, which is why I recently took an interest in <a href="http://preshing.com/20141119/how-to-build-a-gcc-cross-compiler">building GCC cross-compilers</a>. This post will explain the bug and document the process of patching the compiler.</p>

<h2 id="an-example-that-illustrates-the-compiler-bug">An Example That Illustrates the Compiler Bug</h2>

<p>Imagine a bunch of threads repeatedly calling the following <code>read</code> function:</p>

<div><div class="CodeRay">
  <div class="code"><pre>#include &lt;atomic&gt;

std::atomic&lt;int&gt; Guard(0);
int Payload[1] = { 0xbadf00d };

int read()
{
    int f = <span class="highlight">Guard.load(std::memory_order_consume)</span>;    // load-consume
    if (f != 0)
        return Payload[f - f];                        // plain load from Payload[f - f]
    return 0;
}
</pre></div>
</div>
</div>

<p>At some point, another thread comes along and calls <code>write</code>:    </p>

<div><div class="CodeRay">
  <div class="code"><pre>void write()
{
    Payload[0] = 42;                                  // plain store to Payload[0]
    Guard.store(1, std::memory_order_release);        // store-release
}
</pre></div>
</div>
</div>

<p>If the compiler is fully compliant with the current C++11 standard, then there are only two possible return values from <code>read</code>: <strong>0</strong> or <strong>42</strong>. The outcome depends on the value seen by the load-consume highlighted above. If the load-consume sees 0, then obviously, <code>read</code> will return 0. If the load-consume sees 1, then according to the <a href="">rules of the standard</a>, the plain store <code>Payload[0] = 42</code> must be visible to the plain load <code>Payload[f - f]</code>, and <code>read</code> must return 42.</p>

<p><img class="center" src="https://preshing.com/images/supposed-consume-ordering.png" /></p>

<p>As I&rsquo;ve <a href="http://preshing.com/20140709/the-purpose-of-memory_order_consume-in-cpp11">already explained</a>, <code>memory_order_consume</code> is meant to provide ordering guarantees that are similar to those of <code>memory_order_acquire</code>, only restricted to code that lies along the load-consume&rsquo;s <strong>dependency chain</strong> at the source code level. In other words, the load-consume must <em>carry-a-dependency</em> to the source code statements we want ordered.</p>

<p>In this example, we are admittedly abusing C++11&rsquo;s definition of <em>carry-a-dependency</em> by using <code>f</code> in an expression that cancels it out (<code>f - f</code>). Nonetheless, we are still technically playing by the standard&rsquo;s current rules, and thus, its ordering guarantees should still apply.</p>
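
<p>For contrast with the <code>f - f</code> trick, here is a sketch of a more conventional dependency chain, where the load-consume returns a pointer and the plain load goes through that pointer. The names <code>Message</code>, <code>g_guard</code>, <code>publish</code> and <code>tryRead</code> are my own inventions for this illustration and do not come from the bug report:</p>

```cpp
#include <atomic>
#include <cassert>

struct Message {
    int payload;
};

static Message g_message = { 0xbadf00d };
static std::atomic<Message*> g_guard(nullptr);

// Writer: fill in the payload, then publish the pointer with a store-release.
void publish(int value) {
    g_message.payload = value;                              // plain store
    g_guard.store(&g_message, std::memory_order_release);   // store-release
}

// Reader: the address used by the plain load is computed from the value
// returned by the load-consume, so the load-consume carries a dependency
// to that load.
int tryRead() {
    Message* m = g_guard.load(std::memory_order_consume);   // load-consume
    if (m != nullptr)
        return m->payload;                                  // dependent plain load
    return 0;
}
```

<p>Here the dependency cannot be cancelled out, because the address of the second load really does come from the result of the first. That is the kind of code <code>memory_order_consume</code> was designed for, though a compiler remains free to promote the consume to acquire.</p>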

<h2 id="compiling-for-aarch64">Compiling for AArch64</h2>

<p>The <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448">compiler bug report</a> mentions AArch64, a new 64-bit instruction set supported by the latest ARM processors. Conveniently enough, I described how to build a GCC cross-compiler for AArch64 in the <a href="http://preshing.com/20141119/how-to-build-a-gcc-cross-compiler">previous post</a>. Let&rsquo;s use that cross-compiler to compile the above code and examine the assembly listing for <code>read</code>:</p>

<pre><code>$ aarch64-linux-g++ -std=c++11 -O2 -S consumetest.cpp
$ cat consumetest.s
</code></pre>

<p><img class="center" src="https://preshing.com/images/gcc-consume-aarch64.png" /></p>

<p>This machine code is flawed. AArch64 is a <a href="http://preshing.com/20120930/weak-vs-strong-memory-models">weakly-ordered CPU architecture</a> that preserves data dependency ordering, and yet neither compiler strategy has been taken:</p>

<ul>
  <li><strong>No heavy strategy:</strong> There is no memory barrier instruction between the load from <code>Guard</code> and the load from <code>Payload[f - f]</code>. The load-consume has not been promoted to a load-acquire.</li>
  <li><strong>No efficient strategy:</strong> There is no <a href="http://preshing.com/20140709/the-purpose-of-memory_order_consume-in-cpp11">dependency chain</a> connecting the two loads at the machine code level. I&rsquo;ve highlighted the two machine-level dependency chains above, in blue and green. As you can see, the two loads lie along separate chains.</li>
</ul>

<p>As a result, the processor is free to reorder the loads at runtime so that the second load sees an older value than the first. There is a very real possibility that <code>read</code> will return <code>0xbadf00d</code>, the initial value of <code>Payload[0]</code>, even though the C++ standard forbids it.</p>

<h2 id="patching-the-cross-compiler">Patching the Cross-Compiler</h2>

<p>Andrew Macleod <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448#c15">posted a patch</a> for this issue in the bug report. His patch adds the following lines near the end of the <code>get_memmodel</code> function in <code>gcc/builtins.c</code>:</p>

<pre><code>  /* Workaround for Bugzilla 59448. GCC doesn't track consume properly, so
     be conservative and promote consume to acquire.  */
  if (val == MEMMODEL_CONSUME)
    val = MEMMODEL_ACQUIRE;
</code></pre>

<p>Let&rsquo;s apply this patch and build a new cross-compiler.</p>

<pre><code>$ cd gcc-4.9.2/gcc
$ wget -qO- https://gcc.gnu.org/bugzilla/attachment.cgi?id=33831 | patch
$ cd ../../build-gcc
$ make
$ make install
$ cd ..
</code></pre>

<p>Now let&rsquo;s compile the same source code as before:</p>

<pre><code>$ aarch64-linux-g++ -std=c++11 -O2 -S consumetest.cpp
$ cat consumetest.s
</code></pre>

<p><img class="center" src="https://preshing.com/images/gcc-consume-aarch64-fixed.png" /></p>

<p>This time, the generated assembly is valid. The compiler now implements the load-consume from <code>Guard</code> using <code>ldar</code>, a <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0802a/LDAR.html">new AArch64 instruction</a> that provides acquire semantics. This instruction acts as a memory barrier on the load itself, ensuring that the load will be completed before all subsequent loads and stores (among other things). In other words, our AArch64 cross-compiler now implements the &ldquo;heavy&rdquo; strategy correctly.</p>
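
<p>If patching the compiler is not an option, one source-level alternative is to request acquire semantics explicitly. The following is only a sketch of that workaround under my own naming (<code>writePayload</code> and <code>readAcquire</code> are hypothetical), and it is not part of Andrew&rsquo;s patch:</p>

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> Guard(0);
int Payload[1] = { 0xbadf00d };

void writePayload() {
    Payload[0] = 42;                                  // plain store to Payload[0]
    Guard.store(1, std::memory_order_release);        // store-release
}

// Same shape as read(), except that the load asks for acquire semantics
// directly, so the ordering no longer relies on how the compiler handles
// memory_order_consume.
int readAcquire() {
    int f = Guard.load(std::memory_order_acquire);    // load-acquire
    if (f != 0)
        return Payload[f - f];                        // ordered after the load-acquire
    return 0;
}
```

<p>The trade-off is that you give up the chance of the efficient strategy on every compiler, not just the ones that mishandle consume.</p>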

<h2 id="this-bug-doesnt-happen-on-powerpc">This Bug Doesn&rsquo;t Happen on PowerPC</h2>

<p>Interestingly, if you compile the same example for PowerPC, there is no bug. This is using the same GCC version 4.9.2 without Andrew&rsquo;s patch applied:</p>

<pre><code>$ powerpc-linux-g++ -std=c++11 -O2 -S consumetest.cpp
$ cat consumetest.s
</code></pre>

<p><img class="center" src="https://preshing.com/images/gcc-consume-powerpc.png" /></p>

<p>The PowerPC cross-compiler appears to implement the &ldquo;heavy&rdquo; strategy correctly, promoting consume to acquire and emitting the necessary memory barrier instructions. Why does the PowerPC cross-compiler work in this case, but not the AArch64 cross-compiler? One hint lies in GCC&rsquo;s <a href="https://gcc.gnu.org/onlinedocs/gccint/Machine-Desc.html">machine description (MD)</a> files. GCC uses these MD files in its final stage of compilation, after optimization, when it converts its intermediate <a href="https://gcc.gnu.org/onlinedocs/gccint/RTL.html">RTL format</a> to a native assembly code listing. Among the AArch64 MD files, in <code>gcc-4.9.2/gcc/config/aarch64/atomics.md</code>, you&rsquo;ll currently find the following:</p>

<div><div class="CodeRay">
  <div class="code"><pre>    if (model == MEMMODEL_RELAXED
    || model == <span class="highlight">MEMMODEL_CONSUME</span>
    || model == MEMMODEL_RELEASE)
      return &quot;ldr&lt;atomic_sfx&gt;\t%&lt;w&gt;0, %1&quot;;
    else
      return &quot;ldar&lt;atomic_sfx&gt;\t%&lt;w&gt;0, %1&quot;;
</pre></div>
</div>
</div>

<p>Meanwhile, among PowerPC&rsquo;s MD files, in <code>gcc-4.9.2/gcc/config/rs6000/sync.md</code>, you&rsquo;ll find:</p>

<div><div class="CodeRay">
  <div class="code"><pre>  switch (model)
    {
    case MEMMODEL_RELAXED:
      break;
    case <span class="highlight">MEMMODEL_CONSUME</span>:
    case MEMMODEL_ACQUIRE:
    case MEMMODEL_SEQ_CST:
      emit_insn (gen_loadsync_&lt;mode&gt; (operands[0]));
      break;
</pre></div>
</div>
</div>

<p>Based on the above, it seems that the AArch64 cross-compiler currently treats consume the same as relaxed at the final stage of compilation, whereas the PowerPC cross-compiler treats consume the same as acquire at the final stage. Indeed, if you move <code>case MEMMODEL_CONSUME:</code> one line earlier in the PowerPC MD file, you can reproduce the bug on PowerPC, too.</p>

<p>Andrew&rsquo;s patch appears to make <em>all</em> compilers treat consume the same as acquire at an earlier stage of compilation.</p>

<h2 id="the-uncertain-future-of-memoryorderconsume">The Uncertain Future of memory_order_consume</h2>

<p>It&rsquo;s fair to call <code>memory_order_consume</code> an obscure subject, and the current status of GCC support reflects that. The C++ standard committee is wondering what to do with <code>memory_order_consume</code> in future revisions of C++.</p>

<p>My opinion is that the definition of <em>carries-a-dependency</em> should be narrowed to require that different return values from a load-consume result in different behavior for any dependent statements that are executed. Let&rsquo;s face it: Using <code>f - f</code> as a dependency is nonsense, and narrowing the definition would free the compiler from having to support such nonsense &ldquo;dependencies&rdquo; if it chooses to implement the efficient strategy. This idea was first proposed by Torvald Riegel <a href="https://lkml.org/lkml/2014/2/27/806">in the Linux Kernel Mailing List</a> and is captured among various alternatives described in Paul McKenney&rsquo;s <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4036.pdf">proposal N4036</a>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[How to Build a GCC Cross-Compiler]]></title>
    <link href="https://preshing.com/20141119/how-to-build-a-gcc-cross-compiler"/>
    <updated>2014-11-19T06:30:00-05:00</updated>
    <id>https://preshing.com/?p=20141119</id>
    <content type="html"><![CDATA[<p>GCC is not just a compiler. It&rsquo;s an open source project that lets you build all kinds of compilers. Some compilers support multithreading; some support shared libraries; some support <a href="http://wiki.gentoo.org/wiki/Multilib">multilib</a>. It all depends on how you configure the compiler before building it.</p>

<p>This guide will demonstrate how to build a <strong>cross-compiler</strong>, which is a compiler that builds programs for another machine. All you need is a Unix-like environment with a recent version of GCC already installed.</p>

<p><img class="center" src="https://preshing.com/images/gcc-cross-compiler.png" /></p>

<!--more-->
<p>In this guide, I&rsquo;ll use Debian Linux to build a full C++ cross-compiler for <strong>AArch64</strong>, a 64-bit instruction set available in the latest ARM processors. I don&rsquo;t actually own an AArch64 device &ndash; I just wanted an AArch64 compiler to verify <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448">this bug</a>.</p>

<h2 id="required-packages">Required Packages</h2>

<p>Starting with a clean Debian system, you must first install a few packages:</p>

<pre><code>$ sudo apt-get install g++ make gawk
</code></pre>

<p>Everything else will be built from source. Create a new directory somewhere, and download the following source packages. (If you&rsquo;re following this guide at a later date, there will be more recent releases of each package available. Check for newer releases by pasting each URL into your browser without the filename. For example: <a href="http://ftpmirror.gnu.org/binutils/">http://ftpmirror.gnu.org/binutils/</a>)</p>

<pre><code>$ wget http://ftpmirror.gnu.org/binutils/binutils-2.24.tar.gz
$ wget http://ftpmirror.gnu.org/gcc/gcc-4.9.2/gcc-4.9.2.tar.gz
$ wget https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.17.2.tar.xz
$ wget http://ftpmirror.gnu.org/glibc/glibc-2.20.tar.xz
$ wget http://ftpmirror.gnu.org/mpfr/mpfr-3.1.2.tar.xz
$ wget http://ftpmirror.gnu.org/gmp/gmp-6.0.0a.tar.xz
$ wget http://ftpmirror.gnu.org/mpc/mpc-1.0.2.tar.gz
$ wget ftp://gcc.gnu.org/pub/gcc/infrastructure/isl-0.12.2.tar.bz2
$ wget ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-0.18.1.tar.gz
</code></pre>

<p>The first four packages &ndash; <a href="http://www.gnu.org/software/binutils/">Binutils</a>, <a href="https://gcc.gnu.org/">GCC</a>, the <a href="https://www.kernel.org/">Linux kernel</a> and <a href="http://www.gnu.org/software/libc/">Glibc</a> &ndash; are the main ones. We could have installed the next three packages in binary form using our system&rsquo;s package manager instead, but that tends to provide older versions. The last two packages, ISL and CLooG, are optional, but they enable a few more optimizations in the compiler we&rsquo;re about to build.</p>

<h2 id="how-the-pieces-fit-together">How The Pieces Fit Together</h2>

<p>By the time we&rsquo;re finished, we will have built each of the following programs and libraries. First, we&rsquo;ll build the tools on the left, then we&rsquo;ll use those tools to build the programs and libraries on the right. We won&rsquo;t actually build the target system&rsquo;s Linux kernel, but we do need the kernel header files in order to build the target system&rsquo;s standard C library.</p>

<p><img class="center" src="https://preshing.com/images/cross-gcc-sandwich.png" /></p>

<p>The compilers on the left will invoke the assembler &amp; linker as part of their job. All the other packages we downloaded, such as MPFR, GMP and MPC, will be linked into the compilers themselves.</p>

<p>The diagram on the right represents a sample program, <code>a.out</code>, running on the target OS, built using the cross compiler and linked with the target system&rsquo;s standard C and C++ libraries. The standard C++ library makes calls to the standard C library, and the C library makes direct system calls to the AArch64 Linux kernel.</p>

<p>Note that instead of using Glibc as the standard C library implementation, we could have used <a href="https://sourceware.org/newlib/">Newlib</a>, an alternative implementation. Newlib is a popular C library implementation for embedded devices. Unlike Glibc, Newlib doesn&rsquo;t require a complete OS on the target system &ndash; just a thin hardware abstraction layer called <a href="http://ieee.uwaterloo.ca/coldfire/gcc-doc/docs/porting_1.html">Libgloss</a>. Newlib doesn&rsquo;t have regular releases; instead, you&rsquo;re meant to pull the source directly from the <a href="https://sourceware.org/newlib/">Newlib CVS repository</a>. One limitation of Newlib is that currently, it doesn&rsquo;t seem to support building multithreaded programs for AArch64. That&rsquo;s why I chose not to use it here.</p>

<h2 id="build-steps">Build Steps</h2>

<p>Extract all the source packages.</p>

<pre><code>$ for f in *.tar*; do tar xf "$f"; done
</code></pre>

<p>Create symbolic links from the GCC directory to some of the other directories. These five packages are dependencies of GCC, and when the symbolic links are present, GCC&rsquo;s build script <a href="https://gcc.gnu.org/install/download.html">will build them automatically</a>.</p>

<pre><code>$ cd gcc-4.9.2
$ ln -s ../mpfr-3.1.2 mpfr
$ ln -s ../gmp-6.0.0 gmp
$ ln -s ../mpc-1.0.2 mpc
$ ln -s ../isl-0.12.2 isl
$ ln -s ../cloog-0.18.1 cloog
$ cd ..
</code></pre>

<p>Choose an installation directory, and make sure you have write permission to it. In the steps that follow, I&rsquo;ll install the new toolchain to <code>/opt/cross</code>.</p>

<pre><code>$ sudo mkdir -p /opt/cross
$ sudo chown jeff /opt/cross
</code></pre>

<p>Throughout the entire build process, make sure the installation&rsquo;s <code>bin</code> subdirectory is in your <code>PATH</code> environment variable. You can remove this directory from your <code>PATH</code> later, but most of the build steps expect to find <code>aarch64-linux-gcc</code> and other host tools via the <code>PATH</code> by default.</p>

<pre><code>$ export PATH=/opt/cross/bin:$PATH
</code></pre>

<p>Pay particular attention to the stuff that gets installed under <code>/opt/cross/aarch64-linux/</code>. This directory is considered the <strong>system root</strong> of an imaginary AArch64 Linux target system. A self-hosted AArch64 Linux compiler could, in theory, use all the headers and libraries placed here. Obviously, none of the programs built for the host system, such as the cross-compiler itself, will be installed to this directory.</p>

<h3 id="binutils">1. Binutils</h3>

<p>This step builds and installs the cross-assembler, cross-linker, and other tools.</p>

<pre><code>$ mkdir build-binutils
$ cd build-binutils
$ ../binutils-2.24/configure --prefix=/opt/cross --target=aarch64-linux --disable-multilib
$ make -j4
$ make install
$ cd ..
</code></pre>

<ul>
  <li>We&rsquo;ve specified <code>aarch64-linux</code> as the target system type. Binutils&rsquo;s <code>configure</code> script will recognize that this target is different from the machine we&rsquo;re building on, and configure a cross-assembler and cross-linker as a result. The tools will be installed to <code>/opt/cross/bin</code>, their names prefixed by <code>aarch64-linux-</code>.</li>
  <li><code>--disable-multilib</code> means that we only want our Binutils installation to work with programs and libraries using the AArch64 instruction set, and not any related instruction sets such as AArch32.</li>
</ul>

<h3 id="linux-kernel-headers">2. Linux Kernel Headers</h3>

<p>This step installs the Linux kernel header files to <code>/opt/cross/aarch64-linux/include</code>, which will ultimately allow programs built using our new toolchain to make system calls to the AArch64 kernel in the target environment.</p>

<pre><code>$ cd linux-3.17.2
$ make ARCH=arm64 INSTALL_HDR_PATH=/opt/cross/aarch64-linux headers_install
$ cd ..
</code></pre>

<ul>
  <li>We could even have done this before installing Binutils.</li>
  <li>The Linux kernel header files won&rsquo;t actually be used until step 6, when we build the standard C library, although the <code>configure</code> script in step 4 expects them to be already installed.</li>
  <li>Because the Linux kernel is a different open-source project from the others, it has a different way of identifying the target CPU architecture: <code>ARCH=arm64</code></li>
</ul>

<p>All of the remaining steps involve building GCC and Glibc. The trick is that there are parts of GCC which depend on parts of Glibc already being built, and vice versa. We can&rsquo;t build either package in a single step; we need to go back and forth between the two packages and build their components in a way that satisfies their dependencies.</p>

<p><img class="center" src="https://preshing.com/images/cross-gcc-steps.png" /></p>

<h3 id="cc-compilers">3. C/C++ Compilers</h3>

<p>This step will build GCC&rsquo;s C and C++ cross-compilers only, and install them to <code>/opt/cross/bin</code>. It won&rsquo;t invoke those compilers to build any libraries just yet.</p>

<pre><code>$ mkdir -p build-gcc
$ cd build-gcc
$ ../gcc-4.9.2/configure --prefix=/opt/cross --target=aarch64-linux --enable-languages=c,c++ --disable-multilib
$ make -j4 all-gcc
$ make install-gcc
$ cd ..
</code></pre>

<ul>
  <li>Because we&rsquo;ve specified <code>--target=aarch64-linux</code>, the build script looks for the Binutils cross-tools we built in step 1 with names prefixed by <code>aarch64-linux-</code>. Likewise, the C/C++ compiler names will be prefixed by <code>aarch64-linux-</code>.</li>
  <li><code>--enable-languages=c,c++</code> prevents other compilers in the GCC suite, such as Fortran, Go or Java, from being built.</li>
</ul>

<h3 id="standard-c-library-headers-and-startup-files">4. Standard C Library Headers and Startup Files</h3>

<p>In this step, we install Glibc&rsquo;s standard C library headers to <code>/opt/cross/aarch64-linux/include</code>. We also use the C compiler built in step 3 to compile the library&rsquo;s startup files and install them to <code>/opt/cross/aarch64-linux/lib</code>. Finally, we create a couple of dummy files, <code>libc.so</code> and <code>stubs.h</code>, which are expected in step 5, but which will be replaced in step 6.</p>

<pre><code>$ mkdir -p build-glibc
$ cd build-glibc
$ ../glibc-2.20/configure --prefix=/opt/cross/aarch64-linux --build=$MACHTYPE --host=aarch64-linux --target=aarch64-linux --with-headers=/opt/cross/aarch64-linux/include --disable-multilib libc_cv_forced_unwind=yes
$ make install-bootstrap-headers=yes install-headers
$ make -j4 csu/subdir_lib
$ install csu/crt1.o csu/crti.o csu/crtn.o /opt/cross/aarch64-linux/lib
$ aarch64-linux-gcc -nostdlib -nostartfiles -shared -x c /dev/null -o /opt/cross/aarch64-linux/lib/libc.so
$ touch /opt/cross/aarch64-linux/include/gnu/stubs.h
$ cd ..
</code></pre>

<ul>
  <li><code>--prefix=/opt/cross/aarch64-linux</code> tells Glibc&rsquo;s <code>configure</code> script where it should install its headers and libraries. Note that it&rsquo;s different from the usual <code>--prefix</code>.</li>
  <li>Despite some contradictory information out there, Glibc&rsquo;s <code>configure</code> script currently requires us to specify all three <code>--build</code>, <code>--host</code> and <code>--target</code> system types.</li>
  <li><code>$MACHTYPE</code> is a shell variable, set automatically by bash, which describes the machine running the build script. <code>--build=$MACHTYPE</code> is needed because in step 6, the build script will compile some additional tools which run as part of the build process itself.</li>
  <li><code>--host</code> has a different meaning here than we&rsquo;ve been using so far. In Glibc&rsquo;s <code>configure</code>, both the <code>--host</code> and <code>--target</code> options are meant to describe the system on which Glibc&rsquo;s libraries will ultimately run.</li>
  <li>We install the C library&rsquo;s startup files, <code>crt1.o</code>, <code>crti.o</code> and <code>crtn.o</code>, to the installation directory manually. There doesn&rsquo;t seem to be a <code>make</code> rule that does this without having other side effects.</li>
</ul>

<h3 id="compiler-support-library">5. Compiler Support Library</h3>

<p>This step uses the cross-compilers built in step 3 to build the compiler support library. The compiler support library contains some C++ exception handling boilerplate code, among other things. This library depends on the startup files installed in step 4. The library itself is needed in step 6. Unlike some <a href="http://www.ifp.illinois.edu/~nakazato/tips/xgcc.html">other guides</a>, we don&rsquo;t need to re-run GCC&rsquo;s <code>configure</code>. We&rsquo;re just building additional targets in the same configuration.</p>

<pre><code>$ cd build-gcc
$ make -j4 all-target-libgcc
$ make install-target-libgcc
$ cd ..
</code></pre>

<ul>
  <li>Two static libraries, <code>libgcc.a</code> and <code>libgcc_eh.a</code>, are installed to <code>/opt/cross/lib/gcc/aarch64-linux/4.9.2/</code>.</li>
  <li>A shared library, <code>libgcc_s.so</code>, is installed to <code>/opt/cross/aarch64-linux/lib64</code>.    </li>
</ul>

<h3 id="standard-c-library">6. Standard C Library</h3>

<p>In this step, we finish off the Glibc package, which builds the standard C library and installs its files to <code>/opt/cross/aarch64-linux/lib/</code>. The static library is named <code>libc.a</code> and the shared library is <code>libc.so</code>.</p>

<pre><code>$ cd build-glibc
$ make -j4
$ make install
$ cd ..
</code></pre>

<h3 id="standard-c-library-1">7. Standard C++ Library</h3>

<p>Finally, we finish off the GCC package, which builds the standard C++ library and installs it to <code>/opt/cross/aarch64-linux/lib64/</code>. It depends on the C library built in step 6. The resulting static library is named <code>libstdc++.a</code> and the shared library is <code>libstdc++.so</code>.</p>

<pre><code>$ cd build-gcc
$ make -j4
$ make install
$ cd ..
</code></pre>

<h2 id="dealing-with-build-errors">Dealing with Build Errors</h2>

<p>If you encounter any errors during the build process, there are three possibilities:</p>

<ol>
  <li>You&rsquo;re missing a required package or tool on the build system.</li>
  <li>You&rsquo;re attempting to perform the build steps in an incorrect order.</li>
  <li>You&rsquo;ve done everything right, but something is just broken in the configuration you&rsquo;re attempting to build.</li>
</ol>

<p>You&rsquo;ll have to examine the build logs to determine which case applies. GCC supports a lot of configurations, and some of them may not build right away. The less popular a configuration is, the greater the chance of it being broken. GCC, being an open source project, depends on contributions from its users to keep each configuration working.</p>

<h2 id="automating-the-above-steps">Automating the Above Steps</h2>

<p>I&rsquo;ve written a small bash script named <code>build_cross_gcc</code> to perform all of the above steps. You can find it <a href="https://gist.github.com/preshing/41d5c7248dea16238b60">on GitHub</a>. On my Core 2 Quad Q9550 Debian machine, it takes 13 minutes from start to finish. Customize it to your liking before running.</p>

<p><a href="https://gist.github.com/preshing/41d5c7248dea16238b60"><img class="center" src="https://preshing.com/images/build_cross_gcc.png" /></a></p>

<p><code>build_cross_gcc</code> also supports Newlib configurations. When you build a Newlib-based cross-compiler, steps 4, 5 and 6 above can be combined into a single step. (Indeed, that&rsquo;s what many <a href="http://kunen.org/uC/gnu_tool.html">existing guides</a> do.) For Newlib support, edit the script options as follows:</p>

<pre><code>TARGET=aarch64-elf
USE_NEWLIB=1
CONFIGURATION_OPTIONS="--disable-multilib --disable-threads"
</code></pre>

<p>Another way to build a GCC cross-compiler is using a <a href="http://raghunathlolur.wordpress.com/2014/06/30/combined-tree-build-of-gcc-binutils-and-libraries/">combined tree</a>, where the source code for Binutils, GCC and Newlib is merged into a single directory. A combined tree will only work if the <code>intl</code> and <code>libiberty</code> libraries bundled with GCC and Binutils are identical, which is not the case for the versions used in this post. Combined trees don&rsquo;t support Glibc either, so a combined tree wasn&rsquo;t an option for this configuration.</p>

<p>There are a couple of popular build scripts, namely <a href="http://crosstool-ng.org/">crosstool-NG</a> and <a href="https://www.embtoolkit.org/">EmbToolkit</a>, which automate the entire process of building cross-compilers. I had mixed results using crosstool-NG, but it helped me make sense of the build process while putting together this guide.</p>

<h2 id="testing-the-cross-compiler">Testing the Cross-Compiler</h2>

<p>If everything built successfully, let&rsquo;s check our cross-compiler for a dial tone:</p>

<pre><code>$ aarch64-linux-g++ -v
Using built-in specs.
COLLECT_GCC=aarch64-linux-g++
COLLECT_LTO_WRAPPER=/opt/cross/libexec/gcc/aarch64-linux/4.9.2/lto-wrapper
Target: aarch64-linux
Configured with: ../gcc-4.9.2/configure --prefix=/opt/cross --target=aarch64-linux --enable-languages=c,c++ --disable-multilib
Thread model: posix
gcc version 4.9.2 (GCC)
</code></pre>

<p>We can compile the C++14 program from <a href="http://preshing.com/20141108/how-to-install-the-latest-gcc-on-windows">the previous post</a>, then disassemble it:</p>

<pre><code>$ aarch64-linux-g++ -std=c++14 test.cpp
$ aarch64-linux-objdump -d a.out
...
0000000000400830 &lt;main&gt;:
  400830:       a9be7bfd        stp     x29, x30, [sp,#-32]!
  400834:       910003fd        mov     x29, sp
  400838:       910063a2        add     x2, x29, #0x18
  40083c:       90000000        adrp    x0, 400000 &lt;_init-0x618&gt;
  ...
</code></pre>

<p>This was my first foray into building a cross-compiler. I basically wrote this guide to remember what I&rsquo;ve learned. I think the above steps serve as a pretty good template for building other configurations; I used <code>build_cross_gcc</code> to build <code>TARGET=powerpc-eabi</code> as well. You can browse <code>config.sub</code> from any of the packages to see what other target environments are supported. Comments and corrections are more than welcome!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[How to Install the Latest GCC on Windows]]></title>
    <link href="https://preshing.com/20141108/how-to-install-the-latest-gcc-on-windows"/>
    <updated>2014-11-08T10:50:00-05:00</updated>
    <id>https://preshing.com/?p=20141108</id>
    <content type="html"><![CDATA[<p>Several modern C++ features are currently missing from Visual Studio Express, and from the system GCC compiler provided with many of today&rsquo;s Linux distributions. <a href="http://en.wikipedia.org/wiki/C%2B%2B14#Generic_lambdas">Generic lambdas</a> &ndash; also known as <em>polymorphic lambdas</em> &ndash; are one such feature. This feature is, however, available in the latest versions of GCC and Clang.</p>

<p>The following guide will help you install the latest GCC on Windows, so you can experiment with generic lambdas and <a href="https://gcc.gnu.org/projects/cxx1y.html">other cutting-edge C++ features</a>. You&rsquo;ll need to compile GCC from sources, but that&rsquo;s not a problem. Depending on the speed of your machine, you can have the latest GCC up and running in as little as <strong>15 minutes</strong>.</p>

<!--more-->
<p>The steps are:</p>

<ol>
  <li>Install Cygwin, which gives us a Unix-like environment running on Windows.</li>
  <li>Install a set of Cygwin packages required for building GCC.</li>
  <li>From within Cygwin, download the GCC source code, build and install it.</li>
  <li>Test the new GCC compiler in C++14 mode using the <code>-std=c++14</code> option.</li>
</ol>

<p><em>[Update: As a commenter points out, you can also install native GCC compilers <a href="http://mingw-w64.sourceforge.net/">from the MinGW-w64 project</a> without needing Cygwin.]</em></p>

<h2 id="install-cygwin">1. Install Cygwin</h2>

<p>First, download and run either the 32- or 64-bit version of the <a href="https://cygwin.com/install.html">Cygwin installer</a>, depending on your version of Windows. Cygwin&rsquo;s setup wizard will walk you through a series of steps. If your machine is located behind a proxy server, make sure to check &ldquo;Use Internet Explorer Proxy Settings&rdquo; when you get to the &ldquo;Select Your Internet Connection&rdquo; step.</p>

<p>When you reach the &ldquo;Select Packages&rdquo; step (shown below), don&rsquo;t bother selecting any packages yet. Just go ahead and click Next. We&rsquo;ll add additional packages from the command line later.</p>

<p><img class="center" src="https://preshing.com/images/cygwin-select-packages.png" /></p>

<p>After the Cygwin installer completes, it&rsquo;s very important to keep the installer around. The installer is an executable named either <code>setup-x86.exe</code> or <code>setup-x86_64.exe</code>, and you&rsquo;ll need it to add or remove Cygwin packages in the future. I suggest moving the installer to the same folder where you installed Cygwin itself; typically <code>C:\cygwin</code> or <code>C:\cygwin64</code>.</p>

<p><img class="center" src="https://preshing.com/images/cygwin-setup-exe.png" /></p>

<p>If you already have Cygwin installed, it&rsquo;s a good idea to re-run the installer to make sure it has the latest available packages. Alternatively, you can install a new instance of Cygwin in a different folder.</p>

<h2 id="install-required-cygwin-packages">2. Install Required Cygwin Packages</h2>

<p>Next, you&rsquo;ll need to add several packages to Cygwin. You can add them all in one fell swoop. Just open a Command Prompt (in Windows), navigate to the folder where the Cygwin installer is located, and run the following command:</p>

<pre><code>C:\cygwin64&gt;setup-x86_64.exe -q -P wget -P gcc-g++ -P make -P diffutils -P libmpfr-devel -P libgmp-devel -P libmpc-devel
</code></pre>

<p><img class="center" src="https://preshing.com/images/cygwin-additional-packages.png" /></p>

<p>A window will pop up and download all the required packages along with their dependencies.</p>

<p>At this point, you have a working GCC compiler on your system. It&rsquo;s not the latest version of GCC; it&rsquo;s whatever version the Cygwin maintainers chose as their system compiler. At the time of writing, that&rsquo;s GCC 4.8.3. To get a more recent version of GCC, you&rsquo;ll have to compile it yourself, using the GCC compiler you already have.</p>

<h2 id="download-build-and-install-the-latest-gcc">3. Download, Build and Install the Latest GCC</h2>

<p>Open a Cygwin terminal, either from the Start menu or by running <code>Cygwin.bat</code> from the Cygwin installation folder.</p>

<p><img class="center" src="https://preshing.com/images/cygwin-prompt.png" /></p>

<p>If your machine is located behind a proxy server, you must run the following command from the Cygwin terminal before proceeding &ndash; otherwise, <code>wget</code> won&rsquo;t work. This step is not needed if your machine is directly connected to the Internet.</p>

<pre><code>$ export http_proxy=$HTTP_PROXY https_proxy=$HTTP_PROXY ftp_proxy=$HTTP_PROXY
</code></pre>

<p>To download and extract the latest GCC source code, enter the following commands in the Cygwin terminal. If you&rsquo;re following this guide at a later date, there will surely be a <a href="https://www.gnu.org/software/gcc/releases.html">more recent version</a> of GCC available. I used 4.9.2, but you can use any version you like. Keep in mind, though, that it&rsquo;s always best to have the latest Cygwin packages installed when building the latest GCC. Be patient with the <code>tar</code> command; it takes several minutes.</p>

<pre><code>$ wget http://ftpmirror.gnu.org/gcc/gcc-4.9.2/gcc-4.9.2.tar.gz
$ tar xf gcc-4.9.2.tar.gz
</code></pre>

<p>That will create a subdirectory named <code>gcc-4.9.2</code>. Next, we&rsquo;ll configure our GCC build. As the <a href="https://gcc.gnu.org/install/configure.html">GCC documentation</a> recommends, it&rsquo;s best to configure and build GCC in another directory <em>outside</em> <code>gcc-4.9.2</code>, so that&rsquo;s what we&rsquo;ll do.</p>

<pre><code>$ mkdir build-gcc
$ cd build-gcc
$ ../gcc-4.9.2/configure --program-suffix=-4.9.2 --enable-languages=c,c++ --disable-bootstrap --disable-shared
</code></pre>

<p>Here&rsquo;s a description of the command-line options passed to <code>configure</code>:</p>

<ul>
  <li>
    <p>The <code>--program-suffix=-4.9.2</code> option means that once our new GCC is installed, we&rsquo;ll run it as <code>g++-4.9.2</code>. This will make it easier for the new GCC compiler to coexist alongside the system GCC compiler provided by Cygwin.</p>
  </li>
  <li>
    <p>The <code>--enable-languages=c,c++</code> option means that only the C and C++ compilers will be built. Compilers for other languages, such as Fortran, Java and Go, will be excluded. This will save compile time.</p>
  </li>
  <li>
    <p>The <code>--disable-bootstrap</code> option means that we only want to build the new compiler once. If we don&rsquo;t specify <code>--disable-bootstrap</code>, the new compiler will be built three times, for testing and performance reasons. However, the system GCC compiler (4.8.3) provided by Cygwin is pretty recent, so <code>--disable-bootstrap</code> is good enough for our purposes. This will save a significant amount of compile time.</p>
  </li>
  <li>
    <p>The <code>--disable-shared</code> option means that we don&rsquo;t want to build the new standard C++ runtime library as a DLL that&rsquo;s shared with other C++ applications on the system. It&rsquo;s totally possible to make C++ executables work with such DLLs, but it takes care to avoid introducing conflicts with C++ executables created by older or newer versions of GCC. That&rsquo;s something distribution maintainers need to worry about, not us. Let&rsquo;s just avoid the additional headache.</p>
  </li>
  <li>
    <p>By default, the new version of GCC will be installed to <code>/usr/local</code> in Cygwin&rsquo;s virtual filesystem. This will make it easier to launch the new GCC, since <code>/usr/local/bin</code> is already listed in Cygwin&rsquo;s <code>PATH</code> environment variable. However, if you&rsquo;re using an existing Cygwin installation, it might prove difficult to uninstall GCC from <code>/usr/local</code> later on (if you so choose), since that directory tends to contain files from several different packages. If you prefer to install the new GCC to a different directory, add the option <code>--prefix=/path/to/directory</code> to the above <code>configure</code> command.</p>
  </li>
</ul>

<p>We&rsquo;re not going to build a new Binutils, which GCC relies on, because the existing Binutils provided by Cygwin is already quite recent. We&rsquo;re also skipping a couple of packages, namely ISL and CLooG, which means that the new compiler won&rsquo;t be able to use any of the <a href="https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html">Graphite loop optimizations</a>.</p>

<p>Next, we&rsquo;ll actually build the new GCC compiler suite, including C, C++ and the standard C++ library. This is the longest step.</p>

<pre><code>$ make -j4
</code></pre>

<p>The <code>-j4</code> option lets the build process spawn up to four child processes in parallel. If your machine&rsquo;s CPU has at least four hardware threads, this option makes the build process run significantly faster. The main downside is that it jumbles the output messages generated during the build process. If your CPU has even more hardware threads, you can specify a higher number with <code>-j</code>. For comparison, I tried various numbers on a <a href="http://ark.intel.com/products/75780/Intel-Xeon-Processor-E5-1650-v2-12M-Cache-3_50-GHz">Xeon-based</a> machine having 12 hardware threads, and got the following build times:</p>

<p><img class="center" src="https://preshing.com/images/make-j-results.png" /></p>

<p>Be warned: I encountered a <a href="http://en.wikipedia.org/wiki/Segmentation_fault">segmentation fault</a> the first time I ran with <code>-j4</code>. Bad luck on my part. If that happens to you, running the same command a second time should allow the build process to finish successfully. Also, when specifying higher numbers with <code>-j</code>, there are often strange error messages at the end of the build process involving &ldquo;jobserver tokens&rdquo;, but they&rsquo;re harmless.</p>

<p>Once that&rsquo;s finished, install the new compiler:</p>

<pre><code>$ make install
$ cd ..
</code></pre>

<p>This installs several executables to <code>/usr/local/bin</code>; it installs the standard C++ library&rsquo;s include files to <code>/usr/local/include/c++/4.9.2</code>; and it installs the static standard C++ library to <code>/usr/local/lib</code>, among other things. Interestingly, it does <em>not</em> install a new standard C library! The new compiler will continue to use the existing system C library that came with Cygwin.</p>

<p>If, later, you decide to uninstall the new GCC compiler, you have several options:</p>

<ul>
  <li>If you installed GCC to a directory other than <code>/usr/local</code>, and that directory contains no other files, you can simply delete that directory.</li>
  <li>If you installed GCC to <code>/usr/local</code>, and there are files from other packages mixed into the same directory tree, you can run the <code>list_modifications.py</code> script from <a href="http://preshing.com/20130115/view-your-filesystem-history-using-python">this post</a> to determine which files are safe to delete from <code>/usr/local</code>.</li>
  <li>You can simply uninstall Cygwin itself, by deleting the <code>C:\cygwin64</code> folder in Windows, along with its associated Start menu entry.</li>
</ul>

<h2 id="test-the-new-compiler">4. Test the New Compiler</h2>

<p>All right, let&rsquo;s compile some code that uses generic lambdas! Generic lambdas are part of the C++14 standard. They let you declare a lambda&rsquo;s parameters using <code>auto</code> (including forms such as <code>const auto&amp;</code>), like the one highlighted below. Create a file named <code>test.cpp</code> with the following contents:</p>

<div><div class="CodeRay">
  <div class="code"><pre>#include &lt;iostream&gt;

int main()
{
    auto lambda = [](<span class="highlight">auto</span> x){ return x; };
    std::cout &lt;&lt; lambda(&quot;Hello generic lambda!\n&quot;);
    return 0;
}
</pre></div>
</div>
</div>

<p>You can add files to your home directory in Cygwin using any Windows-based text editor; just save them to the folder <code>C:\cygwin64\home\Jeff</code> (or similar) in Windows.</p>

<p>First, let&rsquo;s see what happens when we try to compile it using the system GCC compiler provided by Cygwin:</p>

<pre><code>$ g++ --version
$ g++ -std=c++1y test.cpp
</code></pre>

<p>If the system compiler version is less than 4.9, compilation will fail:</p>

<p><img class="center" src="https://preshing.com/images/generic-lambda-error.png" /></p>

<p>Now, let&rsquo;s try it again using our freshly built GCC compiler. The new compiler is already configured to locate its include files in <code>/usr/local/include/c++/4.9.2</code> and its static libraries in <code>/usr/local/lib</code>. All we need to do is run it:</p>

<pre><code>$ g++-4.9.2 -std=c++14 test.cpp
$ ./a.exe
</code></pre>

<p><img class="center" src="https://preshing.com/images/generic-lambda-ok.png" /></p>

<p>It works!</p>
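
<p>For a sense of what the new compiler is doing here: the C++14 standard gives a generic lambda&rsquo;s closure type a function call operator <em>template</em>. As a rough illustration, the lambda in <code>test.cpp</code> behaves much like the hand-written functor below. This is only a sketch of the idea, not what GCC literally generates, and the names <code>IdentityFunctor</code> and <code>demoIdentityFunctor</code> are made up for this example.</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;string&gt;

// Hand-written stand-in for the closure type of [](auto x){ return x; }.
struct IdentityFunctor
{
    // 'auto x' in the lambda corresponds to a template parameter here.
    template &lt;typename T&gt;
    T operator()(T x) const
    {
        return x;
    }
};

// The same object accepts arguments of different types.
void demoIdentityFunctor()
{
    IdentityFunctor identity;
    assert(identity(42) == 42);
    assert(identity(std::string("Hello")) == "Hello");
}
</code></pre>

<p>Since this version relies only on a member function template, it should compile even with a compiler that lacks generic lambda support.</p>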
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[My Multicore Talk at CppCon 2014]]></title>
    <link href="https://preshing.com/20141024/my-multicore-talk-at-cppcon-2014"/>
    <updated>2014-10-24T06:40:00-04:00</updated>
    <id>https://preshing.com/?p=20141024</id>
    <content type="html"><![CDATA[<p>Last month, I attended <a href="http://cppcon.org/">CppCon</a> 2014 in Bellevue, Washington. It was an awesome conference, filled with the who&rsquo;s who of C++ development, and loaded with interesting, relevant talks. It was a first-year conference, so I&rsquo;m sure CppCon 2015 will be even better. I highly recommend it for any serious C++ developer.</p>

<p>While I was there, I gave a talk entitled, &ldquo;How Ubisoft Montreal Develops Games For Multicore &ndash; Before and After C++11.&rdquo; You can watch the whole thing here:</p>

<div class="embed-video-container"><iframe src="http://www.youtube.com/embed/X1T3IQ4N-3g" width="640" height="360"></iframe></div>

<!--more-->
<p>To summarize the talk:</p>

<ul>
  <li>At Ubisoft Montreal, we exploit multicore by building our game engines on top of three common threading patterns.</li>
  <li>To implement those patterns, we need to write a lot of custom concurrent objects.</li>
  <li>When a concurrent object is under heavy contention, we optimize it using atomic operations.</li>
  <li>Game engines have their own portable atomic libraries. These libraries are similar to the C++11 atomic library&rsquo;s &ldquo;low level&rdquo; functionality.</li>
</ul>

<p>Most of the talk is spent exploring that last point: Comparing game atomics to low-level C++11 atomics.</p>

<p>There was a wide range of experience levels in the room, which was cool. Among the attendees were Michael Wong, CEO of OpenMP and C++ standard committee member, and Lawrence Crowl, who authored most of section 29, &ldquo;Atomic operations library,&rdquo; in the C++11 standard. Both of them chime in at various points. (I certainly wasn&rsquo;t expecting to explain the standard to the guy who wrote it!)</p>

<p>You can download the slides <a href="https://github.com/CppCon/CppCon2014/blob/master/Presentations/How%20Ubisoft%20Montreal%20Develops%20Games%20for%20Multicore/How%20Ubisoft%20Montreal%20Develops%20Games%20for%20Multicore%20-%20Before%20and%20After%20C++11%20-%20Jeff%20Preshing%20-%20CppCon%202014.pdf?raw=true">here</a> and grab the source code for the sample application <a href="https://gist.github.com/preshing/4d28abad8da4e40cb1d4">here</a>. A couple of corrections about certain points:</p>

<h3 id="compiler-ordering-around-c-volatiles">Compiler Ordering Around C++ Volatiles</h3>

<p>At 24:05, I said that the compiler could have reordered some instructions on x86, leading to the same kind of memory reordering bug we saw at runtime on PowerPC, and that we were just lucky it didn&rsquo;t.</p>

<p><img class="center" src="https://preshing.com/images/capped-waitfree-queue.png" /></p>

<p>However, I should acknowledge that in the previous console generation, the only x86 compiler we used at Ubisoft was Microsoft&rsquo;s. Microsoft&rsquo;s compiler is exceptional in that it does <em>not</em> perform those particular instruction reorderings on x86, because it treats volatile variables differently from other compilers, and <code>m_writePos</code> is volatile. That&rsquo;s Microsoft&rsquo;s <a href="http://msdn.microsoft.com/en-us/library/12a04hfd.aspx">default x86 behavior</a> today, and it was its only x86 behavior back then. So in fact, the absence of compiler reordering was more than just luck: It was a vendor-specific guarantee. If we had used GCC or Clang, <em>then</em> we would have run the risk of compiler reordering in these two places.</p>
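
<p>For comparison, C++11 atomics express the same ordering requirement portably, without relying on any vendor&rsquo;s treatment of <code>volatile</code>. The following is a minimal sketch, not code from the talk: the names <code>g_item</code>, <code>g_writePos</code>, <code>publish</code>, <code>tryConsume</code> and <code>demoPublishThenConsume</code> are hypothetical, and a single slot stands in for the queue&rsquo;s item array.</p>

<pre><code>#include &lt;atomic&gt;
#include &lt;cassert&gt;

int g_item = 0;                    // plain, non-atomic payload
std::atomic&lt;int&gt; g_writePos{0};    // plays the role of m_writePos

// Producer side: write the payload, then publish it.
void publish(int value)
{
    g_item = value;
    // Release store: neither the compiler nor the CPU may move the
    // write to g_item after this store.
    g_writePos.store(1, std::memory_order_release);
}

// Consumer side: returns true and fills 'out' once a value is published.
bool tryConsume(int&amp; out)
{
    // Acquire load: pairs with the release store in publish().
    if (g_writePos.load(std::memory_order_acquire) == 0)
        return false;
    out = g_item;
    return true;
}

// Single-threaded walk-through of the intended call order.
void demoPublishThenConsume()
{
    int value = 0;
    assert(!tryConsume(value));   // nothing published yet
    publish(42);
    assert(tryConsume(value) &amp;&amp; value == 42);
}
</code></pre>

<p>With the release store and acquire load in place, the ordering is guaranteed by the C++11 standard rather than by a particular compiler, so the same source should behave correctly under GCC, Clang and Microsoft&rsquo;s compiler alike.</p>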

<h3 id="enforcing-correct-usage-of-concurrent-objects">Enforcing Correct Usage of Concurrent Objects</h3>

<p>Throughout the talk, I keep returning to the example of a single-producer, single-consumer concurrent queue. For this queue to work correctly, you must follow the rules. In particular, it&rsquo;s important not to call <code>tryPush</code> from multiple threads at the same time.</p>

<p>At 54:00, somebody asks if there&rsquo;s a way to prevent coworkers from breaking such rules. My answer was to talk to them. At Ubisoft Montreal, the community of programmers playing with lock-free data structures is small, and we tend to know each other, so this answer is actually quite true for us. In many cases, the only person using a lock-free data structure is the one who implemented it.</p>

<p>But there was a better answer to his question, which I didn&rsquo;t think of at the time: We can implement a macro that fires an assert when two threads enter the same function simultaneously. I won&rsquo;t show the macro&rsquo;s implementation here, but as it turns out, the <code>tryPush</code> and <code>tryPop</code> functions are two perfect candidates for it. This assert won&rsquo;t prevent people from breaking the rules, but it will help catch errors earlier.</p>

<div><div class="CodeRay">
  <div class="code"><pre><span class="predefined-type">bool</span> tryPush(<span class="directive">const</span> T&amp; item)
{
    <span class="highlight">ASSERT_SINGLE_THREADED(m_pushDetector);</span>
    <span class="predefined-type">int</span> w = m_writePos.load(memory_order_relaxed);
    <span class="keyword">if</span> (w &gt;= size)
        <span class="keyword">return</span> <span class="predefined-constant">false</span>;
    m_items[w] = item;
    m_writePos.store(w + <span class="integer">1</span>, memory_order_release);
    <span class="keyword">return</span> <span class="predefined-constant">true</span>;
}
</pre></div>
</div>
</div>
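
<p>As a rough idea of how such a macro could work, here is a minimal sketch. It is an assumption about one possible design, not necessarily the implementation alluded to above: the names <code>SingleThreadDetector</code> and <code>SingleThreadGuard</code> are hypothetical, and the check relies on the standard <code>assert</code>.</p>

<pre><code>#include &lt;atomic&gt;
#include &lt;cassert&gt;

// Hypothetical detector: one instance per guarded function,
// for example a member named m_pushDetector.
struct SingleThreadDetector
{
    std::atomic&lt;bool&gt; entered{false};
};

// Hypothetical scope guard: marks the detector as entered for the
// lifetime of the enclosing scope.
class SingleThreadGuard
{
public:
    explicit SingleThreadGuard(SingleThreadDetector&amp; detector)
        : m_detector(detector)
    {
        // exchange() returns the previous value; true means another
        // thread (or a recursive call) is already inside.
        bool alreadyEntered = m_detector.entered.exchange(true);
        assert(!alreadyEntered);
        (void) alreadyEntered;   // avoid an unused-variable warning under NDEBUG
    }

    ~SingleThreadGuard()
    {
        m_detector.entered.store(false);
    }

    SingleThreadGuard(const SingleThreadGuard&amp;) = delete;
    SingleThreadGuard&amp; operator=(const SingleThreadGuard&amp;) = delete;

private:
    SingleThreadDetector&amp; m_detector;
};

#define ASSERT_SINGLE_THREADED(detector) \
    SingleThreadGuard singleThreadGuard_(detector)
</code></pre>

<p>In this sketch, the assert only fires when two calls actually overlap at runtime, so it is a best-effort check. The assert itself compiles away when <code>NDEBUG</code> is defined, though the atomic operations remain in this simple version.</p>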
]]></content>
  </entry>
  
</feed>
