CA225 Assembly Language Programming http://www.computing.dcu.ie/%7Eray/CA225.html
1 of 58 17/11/2005 08:08 p.m.
PART One 80x86 Assembly Language Programming
PART Two Introduction to MIPS Programming
Recommended Texts:
Sargent, Murray. - The personal computer from the inside out / Murray Sargent
III and Rich. - Rev. ed. - Reading, Mass : Addison-Wesley Pub. Co, 1986. -
0201069180
Waldron, John, 1964-. - Introduction to RISC assembly language programming
/ John Waldron. - Harlow, England :
Addison-Wesley, 1999. - 0201398281
OVERVIEW OF THE 80x86 FAMILY
Why Assembly Language ?
REPRESENTATION OF NUMBERS IN BINARY
REGISTERS
General Purpose Regs
Index Registers
Stack Register
SEGMENTS AND OFFSETS
THE STACK
INTRODUCTION TO ASSEMBLY
PUSH AND POP
TYPES OF OPERAND
SOME USEFUL INSTRUCTIONS
MOV INT ADD SUB
MUL IMUL DIV IDIV
WHAT ARE MEMORY MODELS
Tiny Small Medium
Large Flat
BASIC ASSEMBLY PROGRAM
Listing 1:1stProgram.asm
COMPILATION INSTRUCTIONS
MAKING THINGS EASIER
KEYBOARD INPUT
PRINTING A CHARACTER
Listing 2:
DOS Interrupt 21h
INTRODUCTION TO PROCEDURES
Listing 3: SIMPROC.ASM
PROCEDURES THAT PASS PARAMETERS
Parameter Passing in Registers
Listing 4:Proc1.asm
PASSING PARAMETERS THROUGH MEMORY
Listing 5: PROC2.ASM
PASSING PARAMETERS THROUGH THE STACK
Listing 6:Proc3.asm
MACROS (in Turbo Assembler)
Macros with Parameters
FILES AND HOW TO USE THEM
Function 3Dh: open file
Function 3Eh: close file
Function 3Fh: read file/device
Listing 7: READFILE.ASM
Function 3Ch: Create File
OVERVIEW OF THE 80x86 FAMILY
--------------------------------------------------
The 80x86 family began in 1978 with the 8086, and the newest
member is the Pentium, which was released fifteen years later in 1993.
All members are backwards compatible with their predecessors, but each new
generation has added features and more speed. Today very few computers in use
have 8088 or 8086 chips in them, as they are very outdated and slow. There are
a few 286's, but their numbers are declining as today's software becomes more
and more demanding. Even the 386, Intel's first 32-bit CPU, is now declining,
and it seems that the 486 is now the entry-level system.
Why Assembly Language?
An old joke goes something like this: "There are three reasons for using assembly
language: speed, speed, and more speed." Even those who absolutely hate assembly
language will admit that if speed is your primary concern, assembly language is the
way to go. Assembly language has several benefits:
Speed. Assembly language programs are generally the fastest programs
around.
Space. Assembly language programs are often the smallest.
Capability. You can do things in assembly which are difficult or impossible
in high-level languages (HLLs).
Knowledge. Your knowledge of assembly language will help you write better
programs, even when using HLLs.
Assembly language is the uncontested speed champion among programming
languages. An expert assembly language programmer will almost always produce a
faster program than an expert C programmer. While certain programs may not
benefit much from implementation in assembly, you can speed up many programs by
a factor of five or ten over their HLL counterparts by careful coding in assembly
language; even greater improvement is possible if you're not using an
optimizing compiler. Alas, speedups on the order of five to ten times are generally
not achieved by beginning assembly language programmers. However, if you spend
the time to learn assembly language really well, you too can achieve these
impressive performance gains.
Despite some people's claims that programmers no longer have to worry about
memory constraints, there are many programmers who need to write smaller
programs. Assembly language programs are often less than one-half the size of
comparable HLL programs. This is especially impressive when you consider the fact
that data items generally consume the same amount of space in both types of
programs, and that data is responsible for a good amount of the space used by a
typical application. Saving space saves money. Pure and simple. If a program
requires 1.5 megabytes, it will not fit on a 1.44 Mbyte floppy. Likewise, if an
application requires 2 megabytes RAM, the user will have to install an extra
megabyte if there is only one available in the machine. Even on big machines with
32 or more megabytes, writing gigantic applications isn't excusable. Most users put
more than eight megabytes in their machines so they can run multiple programs
from memory at one time. The bigger a program is, the fewer applications will be
able to coexist in memory with it. Virtual memory isn't a particularly attractive
solution either. With virtual memory, the bigger an application is, the slower the
system will run as a result of that program's size.
Capability is another reason people resort to assembly language. HLLs are an
abstraction of a typical machine architecture. They are designed to be independent of
the particular machine architecture. As a result, they rarely take into account any
special features of the machine, features which are available to assembly language
programmers. If you want to use such features, you will need to use assembly
language. A really good example is the input/output instructions available
on the 80x86 microprocessors. These instructions let you directly access certain I/O
devices on the computer. In general, such access is not part of any high level
language. Indeed, some languages like C pride themselves on not supporting any
specific I/O operations. In assembly language you have no such restrictions.
Anything you can do on the machine you can do in assembly language. This is
definitely not the case with most HLLs.
Of course, another reason for learning assembly language is just for the knowledge.
Now some of you may be thinking, "Gee, that would be wonderful, but I've got lots
to do. My time would be better spent writing code than learning assembly language."
There are some practical reasons for learning assembly, even if you never intend to
write a single line of assembly code. If you know assembly language well, you'll
have an appreciation for the compiler, and you'll know exactly
what the compiler is doing with all those HLL statements. Once you see how
compilers translate seemingly innocuous statements into a ton of machine code,
you'll want to search for better ways to accomplish the same thing. Good assembly
language programmers make better HLL programmers because they understand the
limitations of the compiler and they know what it's doing with their code. Those who
don't know assembly language will accept the poor performance their
compiler produces and simply shrug it off.
Representation of numbers in binary
---------------------------------------------------------------------
Before we begin to understand how to program in assembly it is best to try to
understand how numbers are represented in computers. Numbers are stored in
binary, base two. There are several terms which are used to describe different size
numbers and I will describe what these mean.
1 BIT: 0
One bit is the simplest piece of data that exists. It is either a one or a zero.
1 NIBBLE: 0000
4 BITS
The nibble is four bits, or half a byte. Note that it has a maximum value of 15 (1111
= 15). This is the basis for the hexadecimal (base 16) number system, which is used
because it is far easier to read than long strings of binary digits.
Hexadecimal digits go from 0 to F, and hexadecimal numbers are followed by an h to
state that they are in hex, e.g. Fh = 15 decimal. Hexadecimal numbers that begin with
a letter are prefixed with a 0 (zero), e.g. 0Fh, so that the assembler does not
mistake them for names.
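The notation rule above can be checked mechanically. The following is a minimal sketch in Python, used here purely as a calculator and not as part of the course's assembly code. The function name to_asm_hex is made up for illustration.

```python
def to_asm_hex(value: int) -> str:
    """Format a non-negative integer in the 'h'-suffixed hex notation."""
    digits = format(value, "X")      # uppercase hex digits, no prefix
    if digits[0].isalpha():          # starts with A-F: prefix a zero
        digits = "0" + digits
    return digits + "h"


for n in (15, 61, 255):
    print(n, "decimal =", to_asm_hex(n))
```

Here 15 becomes 0Fh, 61 becomes 3Dh and 255 becomes 0FFh. Only numbers whose first hex digit is a letter receive the extra zero.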
1 BYTE: 00000000
2 NIBBLES
8 BITS
A byte is 8 bits or 2 nibbles. A byte has a maximum value of FFh (255 decimal).
Because a byte is 2 nibbles, its hexadecimal representation is two hex digits in a
row, e.g. 3Dh. The byte is also the size of the 8-bit registers, which we will be
covering later.
1 WORD: 0000000000000000
2 BYTES
4 NIBBLES
16 BITS
A word is two bytes that are stuck together. A word has a maximum value of FFFFh
(65,535 decimal). Since a word is four nibbles, it is represented by four hex digits.
This is the size of the 16-bit registers.
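The maximum values quoted for each size all follow from one rule: n bits can hold values from 0 up to 2^n - 1. A small sketch, again using Python only as a calculator (the names max_unsigned and SIZES are illustrative, not from the course material):

```python
def max_unsigned(bits: int) -> int:
    """Largest unsigned value that fits in the given number of bits."""
    return (1 << bits) - 1


# Sizes named in the text, in bits.
SIZES = {"bit": 1, "nibble": 4, "byte": 8, "word": 16}

for name, bits in SIZES.items():
    top = max_unsigned(bits)
    print(name, bits, "bits, maximum", top, "decimal =", format(top, "X") + "h")
```

Each extra nibble adds exactly one hex digit, which is why a byte needs two hex digits and a word needs four.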
Registers
---------------------------------------------------------------------
Registers are places in the CPU where a number can be stored and manipulated.
There are three sizes of registers: 8-bit, 16-bit and, on the 386 and above, 32-bit.
There are four different types of registers: general purpose registers, segment
registers, index registers and stack registers. First, here are descriptions of the
main registers. Stack registers and segment registers will be covered later.
General Purpose Registers
---------------------------------------------------------------------
These are 16-bit registers. There are four general purpose registers:
AX, BX, CX and DX.
Each is split into two 8-bit registers. AX, for example, is split into AH, which
contains the high byte, and AL, which contains the low byte; BX, CX and DX are split
in the same way. On 386's and above there are also 32-bit registers. These have the
same names as the 16-bit registers but with an 'E' in front, e.g. EAX. You can use
AL, AH, AX and EAX separately and treat them as separate registers for some tasks.
CPU registers are very special memory locations constructed from flip-flops. They
are not part of main memory; the CPU implements them on-chip. Various members
of the 80x86 family have different register sizes. The hypothetical 886, 8286, 8486,
and 8686 CPUs (simplified models of the real chips, called x86 from now on) have
exactly four registers, all 16 bits wide.
All arithmetic and location operations occur in the CPU registers.
Because the x86 processor has so few registers, we'll give each register its own name
and refer to it by that name rather than its address. The names for the x86
registers are
AX - The accumulator register
BX - The base address register
CX - The count register
DX - The data register
Besides the above registers, which are visible to the programmer, the x86 processors
also have an instruction pointer register which contains the address of the next
instruction to execute. There is also a flags register that holds the result of a
comparison. The flags register remembers if one value was less than, equal to, or
greater than another value.
Because registers are on-chip and handled specially by the CPU, they are much
faster than memory. Accessing a memory location requires one or more clock cycles.
Accessing data in a register usually takes no extra clock cycles. Therefore, you should
try to keep variables in the registers. Register sets are very small and most registers
have special purposes which limit their use as variables, but they are still an
excellent place to store temporary data.
If AX contained 24689 decimal:
AH       AL
01100000 01110001
AH would be 96 and AL would be 113. If you added one to AL it would be 114 and
AH would be unchanged.
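The AH/AL example can be reproduced with a few shifts and masks. This is a minimal sketch in Python that models the 16-bit register as a plain integer. The helper names split_word and join_bytes are invented for illustration and do not correspond to any instruction.

```python
def split_word(ax: int) -> tuple:
    """Return (AH, AL): the high and low bytes of a 16-bit value."""
    return (ax >> 8) & 0xFF, ax & 0xFF


def join_bytes(ah: int, al: int) -> int:
    """Rebuild the 16-bit value from its high and low bytes."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)


ah, al = split_word(24689)
print(ah, al)                 # 96 113

al = (al + 1) & 0xFF          # add one to AL only, keeping it to 8 bits
print(ah, al)                 # 96 114
print(join_bytes(ah, al))     # 24690
```

The mask with 0xFF models the fact that AL is only 8 bits wide: if AL held 255, adding one would wrap it to 0 and, as the text says, AH would still be unchanged.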
SI, DI, SP and BP can also be used as general purpose registers but have more
specific uses. They are not split into two halves.
Index Registers
---------------------------------------------------------------------
These are sometimes called pointer registers, and they are 16-bit registers. They are
mainly used for string instructions. There are three index registers: SI (source
index), DI (destination index) and IP (instruction pointer). On 386's and above there
are also 32-bit index registers: EDI and ESI. You can also use BX to index strings. IP
is an index register, but it can't be manipulated directly as it stores the address of
the next instruction.
Stack registers
---------------------------------------------------------------------
BP and SP are stack registers and are used when dealing with the stack. They will be
covered when we talk about the stack later on.
Segments and offsets
---------------------------------------------------------------------
You cannot discuss memory addressing on the 80x86 processor family without first
discussing segmentation. Among other things, segmentation provides a powerful
memory management mechanism. It allows programmers to partition their programs
into modules that operate independently of one another. Segments provide a way to
easily implement object-oriented programs. Segments allow two processes to easily
share data. All in all, segmentation is a really neat feature. On the other hand, if you
ask ten programmers what they think of segmentation, at least nine of the ten will
claim it's terrible. Why such a response?
Well, it turns out that segmentation provides one other nifty feature: it allows you to
extend the addressability of a processor. In the case of the 8086, segmentation let
Intel's designers extend the maximum addressable memory from 64K to one
megabyte. Gee, that sounds good. Why is everyone complaining? Well, a little
history lesson is in order to understand what went wrong.
In 1976, when Intel began designing the 8086 processor, memory was very
expensive. Personal computers, such as they were at the time, typically had four
thousand bytes of memory. Even when IBM introduced the PC five years later, 64K
was still quite a bit of memory; one megabyte was a tremendous amount. Intel's
designers felt that 64K memory would remain a large amount throughout the
lifetime of the 8086. The only mistake they made was completely underestimating
the lifetime of the 8086. They figured it would last about five years, like their earlier
8080 processor. They had plans for lots of other processors at the time, and "86" was
not a suffix on the names of any of those. Intel figured they were set. Surely one
megabyte would be more than enough to last until they came out with something
better.
Unfortunately, Intel didn't count on the IBM PC and the massive amount of software
to appear for it. By 1983, it was very clear that Intel could not abandon the 80x86
architecture. They were stuck with it, but by then people were running up against the
one megabyte limit of the 8086. So Intel gave us the 80286. This processor could
address up to 16 megabytes of memory. Surely more than enough. The only problem
was that all that wonderful software written for the IBM PC
was written in such a way that it couldn't take advantage of any memory beyond one
megabyte.
It turns out that the maximum amount of addressable memory is not everyone's main
complaint. The real problem is that the 8086 was a 16 bit processor, with 16 bit
registers and 16 bit addresses. This limited the processor to addressing 64K chunks
of memory. Intel's clever use of segmentation extended this to one megabyte, but
addressing more than 64K at one time takes some effort. Addressing more than
256K at one time takes a lot of effort.
Despite what you might have heard, segmentation is not bad. In fact, it is a really
great memory management scheme. What is bad is Intel's 1976 implementation of
segmentation still in use today. You can't blame Intel for this - they fixed the
problem in the 80's with the release of the 80386. The real culprit is MS-DOS, which
forces programmers to continue to use 1976 style segmentation. Fortunately, newer
operating systems such as Linux, UNIX, Windows 9x, Windows NT, and
OS/2 don't suffer from the same problems as MS-DOS. Furthermore, users finally
seem to be more willing to switch to these newer operating systems so programmers
can take advantage of the new features of the 80x86 family.
With the history lesson aside, it's probably a good idea to figure out what
segmentation is all about. Consider the current view of memory: it looks like a linear
array of bytes. A single index (address) selects some particular byte from that array.
Let's call this type of addressing linear or flat addressing. Segmented addressing uses
two components to specify a memory location: a segment value and an offset within
that segment. Ideally, the segment and offset values are independent of one another.
The best way to describe segmented addressing is with a two-dimensional array. The
segment provides one of the indices into the array, and the offset provides the other.
Now you may be wondering, "Why make this process more complex?" Linear
addresses seem to work fine, why bother with this two dimensional addressing
scheme? Well, let's consider the way you typically write a program. If you were to
write, say, a SIN(X) routine and you needed some temporary variables, you probably
would not use global variables. Instead, you would use local variables inside the
SIN(X) function. In a broad sense, this is one of the features that segmentation offers
- the ability to attach blocks of variables (a segment) to a particular piece of code.
You could, for example, have a segment containing local variables for SIN, a
segment for SQRT, a segment for DRAWWindow, etc. Since the variables for SIN
appear in the segment for SIN, it's less likely your SIN routine will affect the
variables belonging to the SQRT routine. Indeed, on the 80286 and later operating in
protected mode, the CPU can prevent one routine from accidentally modifying the
variables in a different segment.
A full segmented address contains a segment component and an offset component.
This text will write segmented addresses as segment:offset. On the 8086 through the
80286, these two values are 16 bit constants. On the 80386 and later, the offset can
be a 16 bit constant or a 32 bit constant.
The size of the offset limits the maximum size of a segment. On the 8086 with 16 bit
offsets, a segment may be no longer than 64K; it could be smaller (and most
segments are), but never larger. The 80386 and later processors allow 32 bit offsets
with segments as large as four gigabytes.
The segment portion is 16 bits on all 80x86 processors. This lets a single program
have up to 65,536 different segments. Most programs have fewer than
16 segments (or thereabouts) so this isn't a practical limitation.
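The limits in the last two paragraphs follow directly from the widths of the two address components. A short sketch, using Python only as a calculator (the function names are illustrative assumptions, not anything defined by the processor or the course):

```python
def max_segment_size(offset_bits: int) -> int:
    """Number of bytes an offset of this width can reach within one segment."""
    return 1 << offset_bits


def max_segment_count(segment_bits: int = 16) -> int:
    """Number of distinct values the segment component can take."""
    return 1 << segment_bits


print(max_segment_size(16))   # 65536 bytes, i.e. 64K
print(max_segment_size(32))   # 4294967296 bytes, i.e. four gigabytes
print(max_segment_count())    # 65536 possible segment values
```

A 16 bit offset therefore caps a segment at 64K, while the 32 bit offsets of the 80386 and later raise the cap to four gigabytes.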
Of course, despite the fact that the 80x86 family uses segmented addressing, the
actual (physical) memory connected to the CPU is still a linear array of bytes. There
is a function that converts the segment value to a physical memory address. The
processor then adds the offset to this physical address to obtain the actual address of
the data in memory. This text will refer to addresses in your programs as segmented
addresses or logical addresses. The actual linear address that
appears on the address bus is the physical address.
On the 8086, 8088, 80186, and 80188 (and other processors operating in real mode),
the function that maps a segment to a physical address is very simple. The CPU
multiplies the segment value by sixteen (10h) and adds the offset portion. For
example, consider the segmented address: 1000:1F00. To convert this to a physical
address you multiply the segment value (1000h) by sixteen. Multiplying by the
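The real-mode rule described above (segment value times sixteen, plus the offset) can be tried out with a minimal sketch. Python is used here only as a calculator, the function names are made up for illustration, and the sketch assumes a plain real-mode 8086-style mapping with no protected-mode translation.

```python
def parse_segmented(text: str) -> tuple:
    """Split a 'segment:offset' string of hex digits into two integers."""
    segment, offset = text.split(":")
    return int(segment, 16), int(offset, 16)


def real_mode_physical(segment: int, offset: int) -> int:
    """Real-mode mapping: segment * 16 (10h) plus the offset."""
    return (segment << 4) + offset   # shifting left by 4 bits multiplies by 16


seg, off = parse_segmented("1000:1F00")
print(format(seg * 16, "X") + "h")                        # 10000h
print(format(real_mode_physical(seg, off), "X") + "h")    # 11F00h
```

Multiplying a hex number by sixteen simply appends a zero digit, so 1000h becomes 10000h, and adding the offset 1F00h gives the physical address 11F00h.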