INTRODUCTION TO ASSEMBLY LANGUAGE PROGRAMMING
Aims of the Experiment:
- Give an introduction to the architecture of the 8086 microprocessor.
- Enable the students to write small assembly language programs using the DEBUG utility program.
Assembly language programs resemble programs written in a compiled language such as Pascal. In a compiled language the user first creates a source file, which is a text file containing the entire program. The compiler then translates it into machine language instructions. (A linker is also used, but we ignore it for the time being.) In a compiled language the entire program is transformed into machine language at once.
Some of the most rudimentary functions that a debugger can perform are the following:
- Assemble short programs
- View a program’s source code, along with its machine code
- View the CPU registers and flags
- Trace or execute a program, watching variables for changes
- Enter new values into memory
- Search for binary or ASCII values in memory
- Move a block of memory from one location to another
- Fill a block of memory
- Load and write disk files
We’ll be using only some of the DEBUG commands in this experiment. If you want to have a look at the entire command-set, you can enter ‘HELP DEBUG’ at the DOS prompt on your PC, if it has help available. The meaning of some of the following commands might become clear to you later during this experiment.
- A Assemble a program using instruction mnemonics
- G Execute the program currently in the memory
- R Display the contents of registers and flags
- T Trace a single instruction
- U Disassemble memory into assembler mnemonics
- D Dump (display) the contents of memory
- E Enter bytes into memory
- F Fill a memory range with a single value
- Q Quit DEBUG and return to DOS
- L Load data from disk
- W Write data from memory to disk
- N Name a file for use by the L and W commands
The 8086 Registers
Registers are special work areas inside the microprocessor designed to be accessed at high speeds. The registers are 16 bits long but you have the option of accessing the upper or lower halves of the four data registers. Again in this experiment we’ll restrict ourselves only to the use of data registers, so don’t get upset if you do not understand the meaning of some of the following registers:
- 16-bit: AX (accumulator), BX (base), CX (counter), DX (data)
- 8-bit: AH, AL, BH, BL, CH, CL, DH, DL
- CS (code segment), DS (data segment)
- SI (source index), DI (destination index), BP (base pointer)
- IP (instruction pointer), SP (stack pointer)
- Flags: Overflow, Direction, Interrupt, Trap, Sign, Zero, Auxiliary Carry, Parity, Carry
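The relationship between a 16-bit data register and its two 8-bit halves can be sketched in Python. This is only an illustration using ordinary integers, not DEBUG code; the function names split_register and join_register are made up for this sketch.

```python
def split_register(value16):
    """Return the (high, low) bytes of a 16-bit value, e.g. AX -> (AH, AL)."""
    value16 &= 0xFFFF               # keep only 16 bits, like a real register
    high = (value16 >> 8) & 0xFF    # upper half (AH, BH, CH or DH)
    low = value16 & 0xFF            # lower half (AL, BL, CL or DL)
    return high, low


def join_register(high, low):
    """Rebuild the 16-bit value from its two 8-bit halves."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


ah, al = split_register(0x1234)
print(f"AX=1234h -> AH={ah:02X}h AL={al:02X}h")
```

Changing only AH or AL therefore changes only one half of AX, which is why a value placed in AH shows up in the two leftmost hex digits of AX.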
IBM – PC Memory Architecture:
The IBM – PC can access 1 MB of memory using a standard 20-bit address. The memory is divided between RAM and ROM. RAM starts at location 00000h and extends to BFFFFh. ROM begins at location C0000h and extends to FFFFFh.
The RAM area (from 00000h to BFFFFh) is further divided into the following parts:
- Interrupt vector table
- BIOS and DOS data
- Resident portion of RAM
- User RAM
- EGA color video
- Monochrome video
- Color video
We’ll be using only the user RAM for our programs, which extends from about 22400h to BFFFFh in a machine with 640 K of RAM (about 540K).
Caution: Do not write data to any address outside the range 22400h to BFFFFh; otherwise your system might halt.
- Memory Dump (or display) Command
This command is used to display memory on the screen as single bytes in both hexadecimal and ASCII. Enter the following command at the DEBUG prompt:
D 100
You’ll see eight lines of output on your screen. Note down the first line. The leftmost column gives the address in segment:offset format. The second column shows the contents of memory at offset 0100h, the third column shows the contents of 0101h, and so on, up to the 17th column, which shows the contents of memory at offset 010Fh. (The characters to the right are the ASCII representation of each byte.) Calculate the absolute address corresponding to this last memory location. Refer to Appendix Notes B if you have not been through them.
Now display (or dump) the memory contents from offset address 140h to 14Bh. Note down the command that you use.
Try the following commands as well and note down your observations on the work sheet:
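The layout of one dump line (address, sixteen hex bytes, then their ASCII form) can be imitated with a short Python sketch. The function name dump_line and the sample bytes are made up for illustration, the segment value 18F1 is borrowed from the appendix example, and the exact spacing of DEBUG's own output may differ.

```python
def dump_line(segment, offset, data):
    """Format up to 16 bytes as: SEGM:OFFS  hex bytes  ASCII characters."""
    hex_part = " ".join(f"{b:02X}" for b in data)
    # Printable ASCII (20h to 7Eh) is shown as-is, everything else as a dot.
    ascii_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)
    return f"{segment:04X}:{offset:04X}  {hex_part}  {ascii_part}"


sample = bytes(range(0x41, 0x51))   # 16 sample bytes: the letters A to P
print(dump_line(0x18F1, 0x0100, sample))
```

Counting the address as column 1, the byte at offset 0100h is in column 2 and the byte at offset 010Fh is in column 17, as described above.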
- Memory Fill Command
This command is used to fill a range of memory with a single value or a list of values.
Try the following commands and comment on what they do:
F 100 200 ‘A’
F 2260:20 30 FF
F 100 L 10 ‘A’, ‘B’
Use the ‘d’ command to display the memory that you just filled, and check that the contents match.
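As a rough model of filling a range with a list of values, the Python sketch below repeats the list until the range is full. It assumes that a list shorter than the range is repeated, and it uses a bytearray to stand in for memory; the function name fill is made up for this sketch and is not part of DEBUG.

```python
def fill(memory, start, length, values):
    """Fill memory[start:start+length] by repeating the list of byte values."""
    for i in range(length):
        memory[start + i] = values[i % len(values)]


memory = bytearray(0x300)                        # stand-in for a block of RAM
fill(memory, 0x100, 0x10, [ord("A"), ord("B")])  # 10h bytes starting at 100h
print(memory[0x100:0x110].decode("ascii"))
```

Compare what this prints with what the ‘d’ command shows on your machine.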
- Register Command
This command may be used in one of three ways:
- Display the contents of one register, allowing it to be changed.
- Display the registers, the flags, and the next instruction about to be executed.
- Display the eight flag settings, allowing any or all of them to be changed. (We’ll not use this one in this experiment.)
Enter ‘r’ at the DEBUG prompt and note down what you see on the screen. All the register contents, the flags, and an instruction will be displayed.
Now enter ‘r cx’. The screen will display the contents of register CX, and will wait for a new value to be entered. Enter 100 and then verify by entering ‘r’ at the next prompt.
- Assemble Command
This command assembles a program into machine language. Let us start our program at the location with offset address 100h. Type ‘a 100’ at the DEBUG prompt and press ‘ENTER’. The current address in segment:offset format will be displayed, and DEBUG will wait for you to enter an assembly language instruction. Enter the following instructions, pressing the ‘ENTER’ key after every line. The characters after the semicolon (;) are ignored by DEBUG. They are only comments, and for this experiment you don’t need to include them in your program. When you write bigger programs, it is always recommended to include comments, both for clarity and for later reference.
mov ah, 2 ; put 02h in the 8-bit register AH
mov bh, 5 ; put 05h in the 8-bit register BH
add ah, bh ; add the two numbers, store the result in AH
int 20 ; end of program, return to DEBUG (DEBUG reads all numbers as hex, without an ‘h’ suffix)
Press the ‘ENTER’ key once more to exit from the assemble mode. This is your first assembly language program, which adds two numbers together. Now let us see the machine language code.
For this we can use the ‘d’ command like this.
D CS:100
The first few bytes (how many is not clear yet) represent the machine language code of the above program. Now we’ll use another DEBUG command that lets us see both the assembly language mnemonics (instructions) and the corresponding machine language code.
- Unassemble Command
This command translates memory into assembly language mnemonics. This process is called disassembly. Enter the following command and note down what you observe.
U 100
Now you can see both your assembly language program and the corresponding machine code on the screen. Can you now figure out how many bytes are occupied by your small program? Note down the number. Our next step will be to run this program by executing one instruction at a time and watching the results in various registers. This process is called single-step operation in the debugging of assembly language programs.
- Trace Command
This command executes one or more instructions from the current CS:IP location (see Appendix Notes B) or from an optional address if one is specified. The contents of the registers are shown after each instruction is executed. Check the contents of the data registers AX and BX using the ‘r’ command.
Enter the following command at DEBUG prompt:
T
Now check the contents of AX, and BX.
Again enter ‘t’ and note the contents. Both AH and BH will contain the bytes you specified.
Again enter ‘t’ and note the contents. Now AH should contain the sum. Confirm if this is true.
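What to expect from the three trace steps can be modelled with a few lines of Python. This is a hand-written model of the three instructions, not an emulator; it assumes AH and BH start at zero, which may not be true on your machine, and the function name run_steps is made up for this sketch.

```python
def run_steps():
    """Apply the three instructions one at a time and record AH and BH."""
    regs = {"AH": 0x00, "BH": 0x00}   # assumed starting values
    history = []

    regs["AH"] = 0x02                               # mov ah, 2
    history.append(dict(regs))
    regs["BH"] = 0x05                               # mov bh, 5
    history.append(dict(regs))
    regs["AH"] = (regs["AH"] + regs["BH"]) & 0xFF   # add ah, bh (8-bit result)
    history.append(dict(regs))
    return history


for step, regs in enumerate(run_steps(), start=1):
    print(f"after step {step}: AH={regs['AH']:02X} BH={regs['BH']:02X}")
```

The masking with FFh reflects that AH is only 8 bits wide, so a sum larger than FFh would wrap around.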
You can also run the entire program at once, but then you will not be able to see the intermediate results. The command is described below.
- Go Command
This command executes the program in memory. You can specify a starting address (preceded by an equals sign) and a breakpoint, causing the program to stop at a given address. Enter ‘g =100’ at the prompt. The program starting at location 100h will be executed, and you’ll see the following message on the screen.
‘Program terminated normally.’
Now try ‘g =100 102’
In this case the program will stop after executing the first instruction. This is called setting a breakpoint. You can set breakpoints at various locations in your program to help you find errors.
Here is an interesting QUIZ for you to test what you have learned.
1. Write a program that adds the following three numbers together:
0267h + 004Ah + 19DDh
Debug your program using the trace command and note down the important register contents after each step.
Note: Don’t forget to include the ‘int 20’ instruction to end your program normally.
2. Enter the following command and explain its purpose.
F 200 ‘I am a student of UNIVERSITY’
Now enter ‘a 100’ and enter the following program,
mov cx, 20 ; initialize the counter
mov bx, 0200 ; initialize DS:BX to point to the first memory location
mov dl, [bx] ; put in DL the contents of DS:BX
mov ah, 02 ; select DOS function 02h (display character)
int 21 ; call DOS to display the character in DL
inc bx ; increment the memory pointer
dec cx ; decrement the counter
mov ax, cx ; put the count in the accumulator
jnz 0106 ; jump back if the count is not zero
int 20 ; end of program
Then use ‘g’ to run the program.
- DEBUG Command parameters:
Address: A complete segment-offset address (see appendix notes B) may be given, or just an offset. The segment portion may be a hexadecimal number or segment register name.
F000:100 Segment, offset
DS:200 Segment register, offset
List: One or more bytes or string values, separated by commas.
‘A’, ‘B’, 50
Range: A range refers to a span of memory, identified by addresses in one of two formats. In format 1, if the second address is omitted, it defaults to a standard value. In format 2, the value following the letter L is the number of bytes to be processed by the command. A range cannot be greater than 10000h (65,536) bytes.
Format 1: address [,address]
Format 2: address L [value]
100 L 20 (Refers to 20h bytes starting at location 100h)
Value: A value consists of a 1- to 4-character hexadecimal number.
- APPENDIX NOTES
Finding the Real Address:
The addresses in an 8086 system are expressed by combining two hex numbers. As an example, you may find an address expressed as 18F1:0120. The number on the left is called the segment address and the one on the right is called the offset address. Both of the hex numbers separated by the colon are 16-bit numbers; however, the address bus of the 8086 microprocessor is 20 bits wide. To get a 20-bit real or absolute address, the segment address and the offset address are combined in an unusual way. The segment address (the number on the left) is shifted left by one hex digit, which is the same as multiplying it by 10h, or putting a ‘0’ at its right and making it a 5-digit (20-bit) hex number. It is then added to the offset address (the number on the right) to get the real or absolute address. In the above example, 18F1h is shifted left one position to make it 18F10h. It is then added to 0120h to get 19030h, which is the absolute 20-bit address.
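The calculation described above can be checked with a few lines of Python. This is a minimal sketch; the function name absolute_address is made up for illustration, and the final mask simply keeps the result within the 20 bits of the 8086 address bus.

```python
def absolute_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    # Shifting left by 4 bits (one hex digit) is the same as multiplying by 10h.
    return ((segment << 4) + offset) & 0xFFFFF   # keep only 20 bits


print(f"{absolute_address(0x18F1, 0x0120):05X}h")   # the example from the text
```

You can use the same function to check the absolute address you calculated for offset 010Fh in the memory dump exercise, substituting the segment value shown on your own screen.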
You may be wondering why Intel designed the 8086 family devices to access memory using the segment : offset approach rather than accessing memory directly with 20-bit addresses. One of the reasons for doing so is that the 8086 is a 16-bit microprocessor. It performs all its internal operations based on 16-bit words. The segment : offset approach requires only a 16-bit number to represent the base address for a segment, and only a 16-bit offset to access any location in a segment.