{"id":1735,"date":"2018-01-06T14:22:54","date_gmt":"2018-01-06T04:22:54","guid":{"rendered":"http:\/\/brnz.org\/hbr\/?p=1735"},"modified":"2018-01-06T16:20:52","modified_gmt":"2018-01-06T06:20:52","slug":"a-little-bit-of-floating-point-in-a-memory-allocator-part-1-background","status":"publish","type":"post","link":"https:\/\/brnz.org\/hbr\/?p=1735","title":{"rendered":"A little bit of floating point in a memory allocator &#8212; Part 1: Background"},"content":{"rendered":"<p>This post contains the same material as <a href=\"https:\/\/twitter.com\/twoscomplement\/status\/949335034015526913\">this thread of tweets<\/a>, with a few minor edits.<\/p>\n<p>Over my holiday break at the end of 2017, I took a look into the TLSF (Two Level Segregated Fit) memory allocator to better understand how it works. I&#8217;ve made use of this allocator and have been impressed by its real world performance, but never really done a deep dive to properly understand it.<\/p>\n<p>The <span style=\"font-family: terminal, monaco, monospace;\">mapping_insert()<\/span> function is a key part of the allocator implementation, and caught my eye. Here&#8217;s how that function is described in the paper <a href=\"http:\/\/www.gii.upv.es\/tlsf\/files\/jrts2008.pdf\">A constant-time dynamic storage allocator for real-time systems<\/a>:<\/p>\n<p><img loading=\"lazy\" src=\"https:\/\/brnz.org\/hbr\/wp-content\/uploads\/2018\/01\/Screenshot_20180104_225911-540x509.png\" alt=\"\" width=\"540\" height=\"509\" \/><\/p>\n<p>I&#8217;ll be honest: from that description, I never developed a clear picture in my mind of what that function does.<\/p>\n<p>(Reading it now, it seems reasonably clear &#8211; but I can say that only after I spent quite a bit of time using other methods to develop my understanding)<\/p>\n<p>Something that helped me a lot was by looking at <a href=\"https:\/\/github.com\/mattconte\/tlsf\/blob\/master\/tlsf.c#L508\">the implementation of that function<\/a> from\u00a0<a href=\"https:\/\/github.com\/mattconte\/tlsf\/\">github.com\/mattconte\/tlsf\/<\/a>.\u00a0 There&#8217;s a bunch of long-named macro constants in there, and a few extra implementation details. If you collapse those it looks something like this:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nvoid mapping_insert(size_t size, int* fli, int* sli)\r\n{ \r\n  int fl, sl;\r\n  if (size &lt; 256)\r\n  {\r\n    fl = 0;\r\n    sl = (int)size \/ 8;\r\n  }\r\n  else\r\n  {\r\n    fl = fls(size);\r\n    sl = (int)(size &gt;&gt; (fl - 5)) ^ 0x20;\r\n    fl -= 7;\r\n  }\r\n  *fli = fl;\r\n  *sli = sl;\r\n}\r\n<\/pre>\n<p>It&#8217;s a pretty simple function (it really is). But I still failed to *see* the pattern of results that would be produced in my mind&#8217;s eye.<\/p>\n<p>I went so far as to make a giant spreadsheet of all the intermediate values for a range of inputs, to paint myself a picture of the effect of each step :) That helped immensely.<\/p>\n<p>Breaking it down&#8230;<\/p>\n<p>There are two cases handled in the function: one for when size is below a certain threshold, and on for when it is larger. The first is straightforward, and accounts for a small number of possible input values. The large size case is more interesting.<\/p>\n<p>The function computes two values: <span style=\"font-family: terminal, monaco, monospace;\">fl<\/span> and <span style=\"font-family: terminal, monaco, monospace;\">sl<\/span>, the first and second level indices for a lookup table. For the large case, <span style=\"font-family: terminal, monaco, monospace;\">fl<\/span> (where <span style=\"font-family: terminal, monaco, monospace;\">fl<\/span>\u00a0is &#8220;first level&#8221;) is computed via <span style=\"font-family: terminal, monaco, monospace;\">fls(size)<\/span> (where <span style=\"font-family: terminal, monaco, monospace;\">fls<\/span>\u00a0is short for &#8220;find last set&#8221; &#8211; similar names, just to keep you on your toes).<\/p>\n<p><span style=\"font-family: terminal, monaco, monospace;\">fls()<\/span> returns the index of the largest bit set, counting from the least significant slbit, which is the index of the largest power of two. In the words of the paper:<\/p>\n<p>&#8220;the instruction fls can be used to compute the \u230alog2(x)\u230b function&#8221;<\/p>\n<p>Which is, in C-like syntax: <span style=\"font-family: terminal, monaco, monospace;\">floor(log2(x))<\/span><\/p>\n<p>And there&#8217;s that &#8220;<span style=\"font-family: terminal, monaco, monospace;\">fl -= 7<\/span>&#8221; at the end. That will show up again later.<\/p>\n<p>For the large case, the computation of <span style=\"font-family: terminal, monaco, monospace;\">sl<\/span> has a few steps:<\/p>\n<p><span style=\"font-family: terminal, monaco, monospace;\">\u00a0 sl = (size &gt;&gt; (fl &#8211; 5)) ^ 0x20;<\/span><\/p>\n<p>Depending on shift down <span style=\"font-family: terminal, monaco, monospace;\">size<\/span> by some amount (based on <span style=\"font-family: terminal, monaco, monospace;\">fl<\/span>), and mask out the sixth bit?<\/p>\n<p>(Aside: The CellBE programmer in me is flinching at that variable shift)<\/p>\n<p>It took me a while (longer than I would have liked&#8230;) to realize that this<br \/>\n<span style=\"font-family: terminal, monaco, monospace;\">size &gt;&gt; (fl &#8211; 5)<\/span> is shifting <span style=\"font-family: terminal, monaco, monospace;\">size<\/span> to generate a number that has exactly six significant bits, at the least significant end of the register (bits 5 thru 0).<\/p>\n<p>Because <span style=\"font-family: terminal, monaco, monospace;\">fl<\/span> is the index of the most significant bit, after this shift, bit 5 will always be 1 &#8211; and that &#8220;<span style=\"font-family: terminal, monaco, monospace;\">^ 0x20<\/span>&#8221; will unset it, leaving the result as a value between 0 and 31 (inclusive).<\/p>\n<p>So here&#8217;s where floating point comes into it, and the cute thing I saw: another way to compute <span style=\"font-family: terminal, monaco, monospace;\">fl<\/span> and <span style=\"font-family: terminal, monaco, monospace;\">sl<\/span> is to convert <span style=\"font-family: terminal, monaco, monospace;\">size<\/span> into an IEEE754 floating point number, and extract the exponent, and most significant bits of the mantissa. I&#8217;ll cover that in the next part, <a href=\"https:\/\/brnz.org\/hbr\/?p=1744\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post contains the same material as this thread of tweets, with a few minor edits. Over my holiday break at the end of 2017, I took a look into the TLSF (Two Level Segregated Fit) memory allocator to better understand how it works. I&#8217;ve made use of this allocator and have been impressed by &hellip; <a href=\"https:\/\/brnz.org\/hbr\/?p=1735\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;A little bit of floating point in a memory allocator &#8212; Part 1: Background&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[5,26],"tags":[],"_links":{"self":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/1735"}],"collection":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1735"}],"version-history":[{"count":9,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/1735\/revisions"}],"predecessor-version":[{"id":1757,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/1735\/revisions\/1757"}],"wp:attachment":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1735"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1735"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1735"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}