Lab 5: 64-bit Assembler - Aarch64

Instruction
This posting will describe the process and the finding of lab 5. This lab covers the comparison between Aarch64 and x86_64. The lab instructs to compare the result of the 'objdump -d' command in Aarch64 and x86_64. Also, it covers the writing code to make the same result using Aarch64 and x86_64. This posting only covers Aarch64 part because I got a login problem in x86_64. You can see the detailed instruction of the lab5 on the Seneca SPO600 website.

The Brief Instruction of the Lab 5 about Aarch64

There are four representative part of the Lab 5
  • Take a look at the code using 'objdump -d objectfile' using aarch64 assembly language programs
  • Take a look at the code using 'objdump -d objectfile' using x86_64 assembly language programs
  • Write code to loop a message with digit using Aarch64 assembly language programs
  • Write code to loop a message with digit using x86_64 assembly language programs
This posting only covers the Aarch64 part.


'Objdump -d hello' using Aarch64 language programs
The <main> section in the output of 'objdump -d' command using Aarch 64 language programs is below.

hello

hello2

hello3

Designing the Lab 5 using Aarch64
Provided code
The provided code for this lab is written below.
=====================================================
.text
.globl _start
_start:

        mov     x0, 1           /* file descriptor: 1 is stdout */
        adr     x1, msg         /* message location (memory address) */
        mov     x2, len         /* message length (bytes) */

        mov     x8, 64          /* write is syscall #64 */
        svc     0               /* invoke syscall */

        mov     x0, 0           /* status -> 0 */
        mov     x8, 93          /* exit is syscall #93 */
        svc     0               /* invoke syscall */

.data
msg:    .ascii      "Hello, world!\n"
len=    . - msg
======================================================
When you assemble this, it prints "Hello, world!". The ascii string "Hello, world!\n" is stored in msg and the length of the message is stored in len. The message is printed at the point of 'svc  0' after 'mov   x8, 64'. The program is done at the point of svc after 'mov  x8, 93'. 



Looping
To loop the message, it needs the loop index for starting point and the loop index for ending point. Also, It needs a subroutine to execute looping.
======================================================


.text

.globl _start



_start:

        mov     x19,min

loop:
        mov     x0, 1           /* file descriptor: 1 is stdout /
        adr     x1, msg         / message location (memory address) /
        mov     x2, len         / message length (bytes) /

        mov     x8, 64          / write is syscall #64 /
        svc     0               / invoke syscall /

        add     x19, x19, 1
        cmp     x19, max
        b.ne    loop

        mov     x0, 0           / status -> 0 /
        mov     x8, 93          / exit is syscall #93 /
        svc     0               / invoke syscall */


.data
msg:    .ascii      "Hello, world!\n"
len=    . - msg
min = 0
max = 30
======================================================
The shape of the code is similar to c language. The code to loop using C language is the following:
for(int i = 0; i < 30, i++){
     printf("Hello, world!\n");
}
The code to correspond to for is 'loop:'. This is the subroutine to loop the code. The register x19 works as an 'i' which is used to check the condition. Every time the message is printed, the value in register x19 is increment and the incremented value is used to compare with max value. If the value is the same, the looping finishes. It is not the same, it loops again.

Print the digits
As the code loops 30 times, there are two cases of printing the digits: printing 1-digit and printing 2-digits. Therefore, it needs two subroutines for each case. To print the digit depend on the each loop, the blank spaces '  ' is added in the message (msg: .ascii "Hello, world!:   \n"). 

Print the character number using the loop index
The hexadecimal for character 0 starts at 30. Therefore, one way to print number using the loop index is adding 30 to index value and store it in a register.
======================================================
        add   x20, x19, 30
======================================================
 If x20 is printed and the value in x19 is 0, the character 0 will be printed.

However, there is an easier way to calculate number character.
======================================================
        add   x20, x19, '0'
======================================================
By adding character 0 with the value of x19, x20 stores the value for the number character.

Distinguish the Digit: 1-digit or 2-digits
You need to distinguish whether you need to print 1-digit or 2-digits by checking the value you want to print. If the value is less than 10, it means the digit is 1-digit. If it is greater or equal than 10, it means, the digit is 2-digits.
======================================================
        cmp     x19, 9
        b.gt    digit_2
        bl      digit_1
======================================================
The subroutine for 1-digit names digit_1. The subroutine for 2-digits names digit_2.

Print 1-digit (Subroutine digit_1)
Printing 1 digit is simple. After the system calculates the value for the character number, store the value at the position for digit (for example, position # in the message).
======================================================
        adr     x30, msg+14
        strb    w20, [x30]
======================================================
The position is loaded in x30. As the character number value is stored in x20, it uses w20 to write one byte. It stores the x20 value in the position of x30.

Print 2-digits (Subroutine digit_2)
Printing 2 digits needs more steps because it needs division. As Aarch64 language program can print one byte, it you want to print 2 digits, you need to divide the value by 10 so you can store each value in a separate register to print. There are a command for division 'udiv' and a command for remainder 'msub'.
======================================================
        udiv    x25, x19, x18
        msub    x26, x25, x18, x19
======================================================
The register x25 and x26 is for storing the value. The x25 is for the left digit and the x26 is for the right digit. For example, if you want to print 30, 3 will be stored in x25 and 0 will be stored in x26. The x19 is the value you want to output. For esample, it you want to print 30, x19 stores the value 30. The x18 stores 10 for division.

The whole code to implement the Lab 5
.text
.globl _start

_start:
        mov     x19,min         /*store the min value into x19 as a loop index*/
        mov     x18,division    /*store the division value(10) into x18*/
loop:
        mov     x0, 1           /* file descriptor: 1 is stdout */

        cmp     x19, 9          /* compare x19(loop index value) with 9*/
        b.gt    digit_2         /* if the value is greater than 9(2-digit), go to the subroutine digit_2*/
        bl      digit_1         /* if the value is less or equal than 9(1-digit), go to the subroutine digit_1*/

digit_1:
        add     x20, x19, '0'   /* ascii number character*/

        adr     x30, msg+14     /* the digit location within string */
        strb    w20, [x30]      /* store the digit at the location */
        bl      print           /* go to the print subroutine */


digit_2:
        udiv    x25, x19, x18   /* divide the value by 10 and store the value into the x25 */
        msub    x26, x25, x18, x19 /* store the remainder into x26 */

        add     x21, x25, '0'   /* ascii number character */
        add     x20, x26, '0'   /* ascii number character */

        adr     x25, msg+14     /* the digit location within string */
        strb    w21, [x25]      /* store the digit at the location */
        adr     x26, msg+15     /* the digit location within string */
        strb    w20, [x26]      /* store the digit at the location */

        bl      print

print:
        adr     x1, msg         /* store the locatio of message */
        mov     x2, len         /* store the string length into x2 */

        mov     x8, 64          /* write is syscall #64 */
        svc     0               /* invoke syscall */

        add     x19, x19, 1     /* increment x19 value which is loop index */
        cmp     x19, max        /* compare x19 with max value */

        b.ne    loop            /* if the value is not equal to max value, loop it again */

        mov     x0, 0           /* status -> 0 */
        mov     x8, 93          /* exit is syscall #93 */
        svc     0               /* invoke syscall */

.data
msg:    .ascii      "Hello World!:   \n"
len= .- msg
min = 0
max = 30
division=10

Struggles to Implement the Code
Login problem
  • I downloaded a virtual machine for this called VMware Workstation. I tried to log in to the server but ssh and scp do not work in the VM. The network was the problem. While the VM is created, it does not include the network so it cannot connect to another server.
  • As the environment of my laptop is Windows 10, I can use PowerShell for this workshop. Therefore, I copied the ssh key I have generated and sent to the professor into my local drive and tried to log in again. However, it does not work. The SSH key was the reason for this problem. As the format of the SSH key I have sent to the professor was not right, it does not allow me to log in to the server. I resent the properly formatted public key to the professor and that allows me to log in.

Segmentation Fault
I tried to divide the case for 1-digit and 2-digits, make subroutines for each case, use them to calculate, and go back to the previous subroutine to print the string. The segmentation fault happens when I tried to use a subroutine. I looked at my code and found the problem was located to 'go back to the previous subroutine' part. I used 'ret' for it but this command is for 'bl' not 'b.ne' or 'b.gt'. Therefore, there is no subroutine to go back. When I tried to find the point the error generated using 'gdb' command and then 'where' command, it said 'No Stack'. That was the reason. Thus, I have changed the way to implement it. Instead of using 'ret' to go back, used 'bl' to jump the subroutine for printing.

Next Step
In the next posting, I will implement the looping using the x86_64 language program. Also, I will compare the code using Aarch64 with the code using x86_64.

What I have learned from Lab 5
I have learned how to implement the code using Aarch64. Also, I got some ideas about how the ssh key works while I got a struggle to log in. 
The Aarch64 part was the most interesting. Compared to 6502 Assembler, it has more features such as division, multiplies, and remainder. I feel it is similar to the C language. It just uses a different command. I have fun to do the lab 5 Aarch64 part and have more interest than the 6502 assembler or x86_64.

Comments

Popular posts from this blog

Open Source Project: Lab2 (Pull Request)

Release 4.0 - Part 3: Release

Lab 8: Automated Testing and Continuous Integration