{"id":1001,"date":"2011-07-20T20:33:19","date_gmt":"2011-07-20T10:33:19","guid":{"rendered":"http:\/\/brnz.org\/hbr\/?p=1001"},"modified":"2011-07-21T00:15:45","modified_gmt":"2011-07-20T14:15:45","slug":"assembly-primer-part-7-%e2%80%94-working-with-strings-%e2%80%94-arm","status":"publish","type":"post","link":"https:\/\/brnz.org\/hbr\/?p=1001","title":{"rendered":"Assembly Primer Part 7 \u2014 Working with Strings \u2014 ARM"},"content":{"rendered":"<p>These are my notes for where I can see ARM varying from IA32, as presented in the video <a href=\"http:\/\/securitytube.net\/Assembly-Primer-for-Hackers-%28Part-7%29-Working-with-Strings-video.aspx\">Part 7 \u2014 Working with Strings<\/a>.<\/p>\n<p>I&#8217;ve not remotely attempted to implement anything approximating optimal string operations for this part &#8212; I&#8217;m just working my way through the examples and finding obvious mappings to the ARM arch (or, at least what seem to be obvious). When I do something particularly stupid, leave a comment and let me know :)<\/p>\n<h2>Working with Strings<\/h2>\n<pre escaped=\"true\" lang=\"txt\">.data\r\n     HelloWorldString:\r\n        .asciz \"Hello World of Assembly!\"\r\n    H3110:\r\n        .asciz \"H3110\"\r\n\r\n.bss\r\n    .lcomm Destination, 100\r\n    .lcomm DestinationUsingRep, 100\r\n    .lcomm DestinationUsingStos, 100<\/pre>\n<p>Here&#8217;s the storage that the provided example <a href=\"http:\/\/code.securitytube.net\/StringBasics.s\">StringBasics.s<\/a> uses. No changes are required to compile this for ARM.<\/p>\n<h3>1. 
Simple copying using movsb, movsw, movsl<\/h3>\n<pre escaped=\"true\" lang=\"txt\">    @movl $HelloWorldString, %esi\r\n    movw r0, #:lower16:HelloWorldString\r\n    movt r0, #:upper16:HelloWorldString\r\n\r\n    @movl $Destination, %edi\r\n    movw r1, #:lower16:Destination\r\n    movt r1, #:upper16:Destination\r\n\r\n    @movsb\r\n    ldrb r2, [r0], #1\r\n    strb r2, [r1], #1\r\n\r\n    @movsw\r\n    ldrh r3, [r0], #2\r\n    strh r3, [r1], #2\r\n\r\n    @movsl\r\n    ldr r4, [r0], #4\r\n    str r4, [r1], #4<\/pre>\n<p>More visible complexity than IA32, but not too bad overall.<\/p>\n<p>IA32&#8217;s <strong>movs<\/strong> instructions implicitly take their source and destination addresses from <strong>%esi<\/strong> and <strong>%edi<\/strong>, and increment\/decrement both. Because of ARM&#8217;s load\/store architecture, separate load and store instructions are required in each case, but there is support for indexing of these registers:<\/p>\n<h4>ARM addressing modes<\/h4>\n<p>According to ARM A8.5, memory access instructions commonly support three addressing modes:<\/p>\n<ul>\n<li><strong>Offset addressing<\/strong> &#8212; An offset is applied to an address from a base register and the result is used to perform the memory access. It&#8217;s the form of addressing I&#8217;ve used in previous parts and looks like <strong>[rN, offset]<\/strong><\/li>\n<li><strong>Pre-indexed addressing<\/strong> &#8212; An offset is applied to an address from a base register, the result is used to perform the memory access and also written back into the base register. It looks like <strong>[rN, offset]!<\/strong><\/li>\n<li><strong>Post-indexed addressing<\/strong> &#8212; An address is used as-is from a base register for memory access. The offset is applied and the result is stored back to the base register. It looks like <strong>[rN], offset<\/strong> and is what I&#8217;ve used in the example above.<\/li>\n<\/ul>\n<h3>2. 
Setting \/ Clearing the DF flag<\/h3>\n<p>ARM doesn&#8217;t have a DF flag (to the best of my understanding). It could perhaps be simulated through the use of two instructions and conditional execution to select the right direction. I&#8217;ll look further into conditional execution of instructions on ARM in a later post.<\/p>\n<h3>3. Using Rep<\/h3>\n<p>ARM also doesn&#8217;t appear to have an instruction quite like IA32&#8217;s <strong>rep<\/strong> instruction. A conditional branch and a decrement will be the long-form equivalent. As branches are part of a later section, I&#8217;ll skip them for now.<\/p>\n<pre escaped=\"true\" lang=\"txt\">    @movl $HelloWorldString, %esi\r\n    movw r0, #:lower16:HelloWorldString\r\n    movt r0, #:upper16:HelloWorldString\r\n\r\n    @movl $DestinationUsingRep, %edi\r\n    movw r1, #:lower16:DestinationUsingRep\r\n    movt r1, #:upper16:DestinationUsingRep\r\n\r\n    @movl $25, %ecx # set the string length in ECX\r\n    @cld # clear the DF\r\n    @rep movsb\r\n    @std\r\n\r\n    ldm r0!, {r2,r3,r4,r5,r6,r7}\r\n    ldrb r8, [r0,#0]\r\n    stm r1!, {r2,r3,r4,r5,r6,r7}\r\n    strb r8, [r1,#0]<\/pre>\n<p>To avoid conditional branches, I&#8217;ll start with the assumption that the string length is known (25 bytes). One approach would be using multiple load instructions, but the <strong>load multiple<\/strong> (<strong>ldm<\/strong>) instruction makes it somewhat easier for us &#8212; one instruction to fetch 24 bytes, and a <strong>load register byte <\/strong>(<strong>ldrb<\/strong>) for the last one. Using the <strong>!<\/strong> after the source-address register indicates that it should be updated with the address of the next byte after those that have been read.<\/p>\n<p>The storing of the data back to memory is done analogously. <strong>Store multiple<\/strong> (<strong>stm<\/strong>) writes 6 registers\u00d74 bytes = 24 bytes (with the <strong>!<\/strong> to have the destination address updated). 
The final byte is written using <strong>strb<\/strong>.<\/p>\n<h3>4. Loading string from memory into EAX register<\/h3>\n<pre escaped=\"true\" lang=\"txt\">    @cld\r\n    @leal HelloWorldString, %esi\r\n    movw r0, #:lower16:HelloWorldString\r\n    movt r0, #:upper16:HelloWorldString\r\n\r\n    @lodsb\r\n    ldrb r1, [r0, #0]\r\n\r\n    @movb $0, %al\r\n    mov r1, #0\r\n\r\n    @dec %esi  @ unneeded. equiv: sub r0, r0, #1\r\n    @lodsw\r\n    ldrh r1, [r0, #0]\r\n\r\n    @movw $0, %ax\r\n    mov r1, #0\r\n\r\n    @subl $2, %esi # Make ESI point back to the original string. unneeded. equiv: sub r0, r0, #2\r\n    @lodsl\r\n    ldr r1, [r0, #0]<\/pre>\n<p>In this section, we are shown how the IA32 <strong>lodsb<\/strong>, <strong>lodsw<\/strong> and <strong>lodsl<\/strong> instructions work. Again, they have implicitly assigned register usage, which isn&#8217;t how ARM operates.<\/p>\n<p>So, instead of a simple, no-operand instruction like <strong>lodsb<\/strong>, we have a <strong>ldrb r1, [r0, #0]<\/strong> loading a byte from the address in r0 into r1. Because I didn&#8217;t use post indexed addressing, there&#8217;s no need to dec or subl the address after the load. If I were to do so, it could look like this:<\/p>\n<pre escaped=\"true\" lang=\"txt\">    ldrb r1, [r0], #1\r\n    sub r0, r0, #1\r\n\r\n    ldrh r1, [r0], #2\r\n    sub r0, r0, #2\r\n\r\n    ldr r1, [r0], #4<\/pre>\n<p>If you trace through it in gdb, look at how the value in r0 changes after each instruction.<\/p>\n<h3>5. Storing strings from EAX to memory<\/h3>\n<pre escaped=\"true\" lang=\"txt\">    @leal DestinationUsingStos, %edi\r\n    movw r0, #:lower16:DestinationUsingStos\r\n    movt r0, #:upper16:DestinationUsingStos\r\n\r\n    @stosb\r\n    strb r1, [r0], #1\r\n    @stosw\r\n    strh r1, [r0], #2\r\n    @stosl\r\n    str r1, [r0], #4<\/pre>\n<p>Same kind of thing as for the loads. 
This writes the letters in r1 (being &#8220;Hell&#8221; &#8212; leftovers from the previous section) into DestinationUsingStos (the result being &#8220;HHeHell&#8221;). String processing on little-endian architectures has its appeal.<\/p>\n<h3>6. Comparing Strings<\/h3>\n<pre escaped=\"true\" lang=\"txt\">    @cld\r\n    @leal HelloWorldString, %esi\r\n    movw r0, #:lower16:HelloWorldString\r\n    movt r0, #:upper16:HelloWorldString\r\n    @leal H3110, %edi\r\n    movw r1, #:lower16:H3110\r\n    movt r1, #:upper16:H3110\r\n\r\n    @cmpsb\r\n    ldrb r2, [r0,#0]\r\n    ldrb r3, [r1,#0]\r\n    cmp r2, r3\r\n\r\n    @dec %esi\r\n    @dec %edi\r\n    @not needed because of the addressing mode used\r\n\r\n    @cmpsw\r\n    ldrh r2, [r0,#0]\r\n    ldrh r3, [r1,#0]\r\n    cmp r2, r3\r\n\r\n    @subl $2, %esi\r\n    @subl $2, %edi\r\n    @not needed because of the addressing mode used\r\n    @cmpsl\r\n    ldr r2, [r0,#0]\r\n    ldr r3, [r1,#0]\r\n    cmp r2, r3<\/pre>\n<p>Where IA32&#8217;s <strong>cmps<\/strong> instructions implicitly load through the pointers in <strong>%edi<\/strong> and <strong>%esi<\/strong>, explicit loads are needed for ARM. The compare then works in pretty much the same way as for IA32, setting condition code flags in the <strong>current program status register<\/strong> (<strong>cpsr<\/strong>). 
If you run the above code, and check the status registers before and after execution of the <strong>cmp<\/strong> instructions, you&#8217;ll see the zero flag set and unset in the same way as is demonstrated in the video.<\/p>\n<p>The condition code flags are:<\/p>\n<ul>\n<li>bit 31 &#8212; negative (N)<\/li>\n<li>bit 30 &#8212; zero (Z)<\/li>\n<li>bit 29 &#8212; carry (C)<\/li>\n<li>bit 28 &#8212; overflow (V)<\/li>\n<\/ul>\n<p>There are other flags in that register &#8212; all the details are on pages B1-16 and B1-17 in the ARM Architecture Reference Manual.<\/p>\n<p>And with that, I think we&#8217;ve made it (finally) to the end of this part for ARM.<\/p>\n<h3>Other assembly primer notes are linked <a href=\"..\/?page_id=737\">here<\/a>.<\/h3>\n","protected":false},"excerpt":{"rendered":"<p>These are my notes for where I can see ARM varying from IA32, as presented in the video Part 7 \u2014 Working with Strings. I&#8217;ve not remotely attempted to implement anything approximating optimal string operations for this part &#8212; I&#8217;m just working my way through the examples and finding obvious mappings to the ARM arch &hellip; <a href=\"https:\/\/brnz.org\/hbr\/?p=1001\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Assembly Primer Part 7 \u2014 Working with Strings \u2014 
ARM&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[5,26],"tags":[45,38],"_links":{"self":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/1001"}],"collection":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1001"}],"version-history":[{"count":21,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/1001\/revisions"}],"predecessor-version":[{"id":1054,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=\/wp\/v2\/posts\/1001\/revisions\/1054"}],"wp:attachment":[{"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1001"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1001"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brnz.org\/hbr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1001"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}