{"id":1744,"date":"2018-01-06T14:57:40","date_gmt":"2018-01-06T04:57:40","guid":{"rendered":"http:\/\/brnz.org\/hbr\/?p=1744"},"modified":"2018-01-06T14:58:47","modified_gmt":"2018-01-06T04:58:47","slug":"a-little-bit-of-floating-point-in-a-memory-allocator-part-2-a-little-bit-of-floating-point","status":"publish","type":"post","link":"https:\/\/brnz.org\/hbr\/?p=1744","title":{"rendered":"A little bit of floating point in a memory allocator \u2014 Part 2: The floating point"},"content":{"rendered":"<p><a href=\"https:\/\/brnz.org\/hbr\/?p=1735\">[Previously]<\/a><\/p>\n<p>This post contains the same material as\u00a0<a href=\"https:\/\/twitter.com\/twoscomplement\/status\/949335343836180480\">this thread of tweets<\/a>, with a few minor edits.<\/p>\n<p>In IEEE754, floating point numbers are represented like this:<\/p>\n<p>\u00b12\u207f\u207f\u207f\u00d71.sss&#8230;<\/p>\n<p>nnn is the exponent, which is <span style=\"font-family: terminal, monaco, monospace;\">floor(log2(size))<\/span>\u00a0&#8212; which happens to be the <span style=\"font-family: terminal, monaco, monospace;\">fl<\/span> value computed by TLSF.<\/p>\n<p>sss&#8230; is the significand fraction: the part that follows the decimal point, which happens to be <span style=\"font-family: terminal, monaco, monospace;\">sl<\/span>.<\/p>\n<p>And so to calculate <span style=\"font-family: terminal, monaco, monospace;\">fl<\/span> and <span style=\"font-family: terminal, monaco, monospace;\">sl<\/span>, all we need to do is convert <span style=\"font-family: terminal, monaco, monospace;\">size<\/span> to a floating point value (on recent x86 hardware, that&#8217;s a single instruction). Then we can extract the exponent, and the upper bits of the fractional part, and we&#8217;re all done :D<\/p>\n<p>That can be implemented like this:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\ndouble sf = (int64_t)size;\r\nuint64_t sfi;\r\nmemcpy(&amp;sfi, &amp;sf, 8);\r\nfl = (sfi &gt;&gt; 52) - (1023 + 7);\r\nsl = (sfi &gt;&gt; 47) &amp; 31;\r\n<\/pre>\n<p>There&#8217;s some subtleties (there always is). I&#8217;ll break it down&#8230;<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\ndouble sf = (int64_t)size;\r\n<\/pre>\n<p>Convert <span style=\"font-family: terminal, monaco, monospace;\">size<\/span> to a double, with an explicit cast. <span style=\"font-family: terminal, monaco, monospace;\">size<\/span> has type <span style=\"font-family: terminal, monaco, monospace;\">size_t<\/span>, but using TLSF from <a href=\"https:\/\/github.com\/mattconte\/tlsf\/\">github.com\/mattconte\/tlsf<\/a>, the largest supported allocation on 64bit architecture is 2^32 bytes &#8211; comfortably less than the precision provided by the <span style=\"font-family: terminal, monaco, monospace;\">double<\/span> type. If you need your TLSF allocator to allocate chunks bigger than 2^53, this isn&#8217;t the technique for you :)<\/p>\n<p>I first tried using <span style=\"font-family: terminal, monaco, monospace;\">float<\/span> (not <span style=\"font-family: terminal, monaco, monospace;\">double<\/span>), which can provide correct results &#8212; but only if the rounding mode happens to be set correctly.\u00a0<span style=\"font-family: terminal, monaco, monospace;\">double<\/span> is easier.<\/p>\n<p>The cast to <span style=\"font-family: terminal, monaco, monospace;\">(int64_t)<\/span> results in better codegen on x86: without it, the compiler will generate a full 64bit unsigned conversion, and there is no single instruction for that.<\/p>\n<p>The cast tells the compiler to (in effect) consider the bits of <span style=\"font-family: terminal, monaco, monospace;\">size<\/span> as if they were a two&#8217;s complement signed value &#8212; and there is an SSE instruction to handle that case (<span style=\"font-family: terminal, monaco, monospace;\">cvtsi2sdq<\/span> or similar). Again, with the implementation we&#8217;re using\u00a0<span style=\"font-family: terminal, monaco, monospace;\">size<\/span> can&#8217;t be that big, so this will do the Right Thing.<\/p>\n<pre class=\"brush: cpp; first-line: 2; title: ; notranslate\" title=\"\">\r\nuint64_t sfi;\r\nmemcpy(&amp;sfi, &amp;sf, 8);\r\n<\/pre>\n<p>Copy the 8 bytes of the double into an unsigned integer variable.\u00a0There are a lot of ways that C\/C++ programmers copy bits from floating point to integer &#8211; some of them are well defined :) memcpy() does what we want, and any moderately respectable compiler knows how to select decent instructions to implement it.<\/p>\n<p>Now we have floating point bits in an integer register, consisting of one sign bit (always zero for this, because size is always positive), eleven exponent bits (offset by 1023), and 52 bits of significant fraction. All we need to do is extract those, and we&#8217;re done :)<\/p>\n<pre class=\"brush: cpp; first-line: 4; title: ; notranslate\" title=\"\">\r\nfl = (sfi &gt;&gt; 52) - (1023 + 7);\r\n<\/pre>\n<p>Extract the exponent: shift it down (ignoring the always-zero sign bit), subtract the offset (1023), and that 7 we saw earlier, at the same time.<\/p>\n<pre class=\"brush: cpp; first-line: 5; title: ; notranslate\" title=\"\">\r\nsl = (sfi &gt;&gt; 47) &amp; 31;\r\n<\/pre>\n<p>Extract the five most significant bits of the fraction &#8211; we do need to mask out the exponent.<\/p>\n<p>And, just like that*, we have <span style=\"font-family: terminal, monaco, monospace;\">mapping_insert()<\/span>, implemented in terms of integer -&gt; floating point conversion.<\/p>\n<p>* Actual code (rather than fragments) may be included in a later post&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>[Previously] This post contains the same material as\u00a0this thread of tweets, with a few minor edits. In IEEE754, floating point numbers are represented like this: \u00b12\u207f\u207f\u207f\u00d71.sss&#8230; nnn is the exponent, which is floor(log2(size))\u00a0&#8212; which happens to be the fl value computed by TLSF. sss&#8230; is the significand fraction: the part that follows the decimal point, &hellip; <a href=\"https:\/\/brnz.org\/hbr\/?p=1744\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;A little bit of floating point in a memory allocator \u2014 Part 2: The floating point&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[5,26],"tags":[],"_links":{"self":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/1744"}],"collection":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1744"}],"version-history":[{"count":10,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/1744\/revisions"}],"predecessor-version":[{"id":1755,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/1744\/revisions\/1755"}],"wp:attachment":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1744"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1744"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1744"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}