{"id":850,"date":"2010-12-14T21:16:16","date_gmt":"2010-12-14T11:16:16","guid":{"rendered":"http:\/\/brnz.org\/hbr\/?p=850"},"modified":"2010-12-14T21:16:16","modified_gmt":"2010-12-14T11:16:16","slug":"diamond-square-dma-lists-and-dirty-tricks","status":"publish","type":"post","link":"https:\/\/brnz.org\/hbr\/?p=850","title":{"rendered":"Diamond-Square &#8212; DMA lists and dirty tricks"},"content":{"rendered":"<h3>There&#8217;s a hole in my dataset, dear MFC, dear MFC&#8230;<\/h3>\n<p>There&#8217;s a block of local store that holds a tile of pixels that need to be DMAd out to main memory.\u00a0 With 128 lines, performing many smaller transfers gets complex, so we&#8217;ll try DMA lists, why not.\u00a0 The problem here is that the pixel data for each line does not match up directly with the next &#8212; between each is stored the extra lines of extra-tile information that was needed for the calculation.\u00a0 When using DMA lists to transfer data to main memory, the source for each DMA transfer starts where the previous one ended. 
This means that between consecutive transfers there are (148-128)\u00d74=80 bytes of information that we don&#8217;t want to see on the screen.<\/p>\n<p>There are a lot of things that are &#8220;best-practice&#8221; for SPU DMA, particularly relating to size and alignment of transfers, and I&#8217;m typically a religious adherent.\u00a0 In this case, I did not comply with best practice, still met my time budget, and only feel a small amount of shame :P<\/p>\n<p>Overall, less time is spent performing DMA than is spent calculating pixels, so further optimising DMA <em>for this particular program<\/em> is unnecessary.<\/p>\n<p>When transferring tiles from the SPU to the framebuffer, there are three cases of particular interest:<\/p>\n<ol>\n<li>Tiles on the right edge of the screen<\/li>\n<li>Tiles on the lower edge of the screen<\/li>\n<li>The Other Tiles<\/li>\n<\/ol>\n<figure id=\"attachment_852\" aria-describedby=\"caption-attachment-852\" style=\"width: 111px\" class=\"wp-caption alignright\"><img loading=\"lazy\" class=\"size-full wp-image-852 \" title=\"3x3x5p\" src=\"https:\/\/brnz.org\/hbr\/wp-content\/uploads\/2010\/12\/3x3x5p.png\" alt=\"\" width=\"111\" height=\"111\" \/><figcaption id=\"caption-attachment-852\" class=\"wp-caption-text\">Tile data in local store<\/figcaption><\/figure>\n<h3>3. 
The Other Tiles<\/h3>\n<p>These are the easy ones.\u00a0 They will never need to be trimmed to fit the screen edges and, if drawn in the right order, have the wonderful characteristic that the extra data can be DMAd to the framebuffer, where it will be overwritten by pixel data from a later tile.\u00a0 There&#8217;s no special case needed: just draw each pixel line, and all the data between pixel lines, to the screen.<\/p>\n<figure id=\"attachment_851\" aria-describedby=\"caption-attachment-851\" style=\"width: 111px\" class=\"wp-caption alignright\"><img loading=\"lazy\" class=\"size-full wp-image-851 \" title=\"3x3x5s\" src=\"https:\/\/brnz.org\/hbr\/wp-content\/uploads\/2010\/12\/3x3x5s.png\" alt=\"\" width=\"111\" height=\"41\" \/><figcaption id=\"caption-attachment-851\" class=\"wp-caption-text\">Tile data drawn to screen<\/figcaption><\/figure>\n<p>(For the diagrams, the amount of overdraw is far greater than the actual tile part &#8212; this is a consequence of the small size.\u00a0 It&#8217;s a much smaller percentage for 128 pixel tiles.\u00a0 I&#8217;ll post some actual screengrabs here sometime&#8230;)<\/p>\n<h3>1. Tiles on the right edge of the screen<\/h3>\n<p>Overdraw isn&#8217;t the answer in this case. It is not possible to overdraw on either the left or right of the rightmost tile in a way that will be correct when the screen is finished.\u00a0 Instead, the extra information (including any portion that may not fit onto the visible screen) must be dealt with some other way.<\/p>\n<p>My solution, ugly as it is, is to write each surplus portion of a tile to a scratch location in memory &#8212; every one of them to the same location.\u00a0 It works :|<\/p>\n<h3>2. 
Tiles on the lower edge of the screen<\/h3>\n<p>These tiles are really just like the others, except they&#8217;ll stop a few lines short.\u00a0 They&#8217;re still fully calculated, but only the visible lines are transferred.<\/p>\n<p>(In hindsight, increasing the spacing between lines would help reduce the alignment and size problem here.\u00a0 Adding an extra 48 bytes to each line would allow every transfer to be optimally aligned and sized.\u00a0 And would probably make no measurable difference to the total runtime.\u00a0 Heck, there&#8217;s probably enough time to repack the data in local store before performing the DMA. There&#8217;s not that much&#8230;)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There&#8217;s a hole in my dataset, dear MFC, dear MFC&#8230; There&#8217;s a block of local store that holds a tile of pixels that need to be DMAd out to main memory.\u00a0 With 128 lines, performing many smaller transfers gets complex, so we&#8217;ll try DMA lists, why not.\u00a0 The problem here is that the pixel &hellip; <a href=\"https:\/\/brnz.org\/hbr\/?p=850\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Diamond-Square &#8212; DMA lists and dirty 
tricks&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[5,26],"tags":[36,40],"_links":{"self":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/850"}],"collection":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=850"}],"version-history":[{"count":24,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/850\/revisions"}],"predecessor-version":[{"id":906,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/850\/revisions\/906"}],"wp:attachment":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=850"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=850"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=850"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}