To demonstrate some common problems with assembly language programs, consider the following programs or program segments. These are actual programs written in assembly language taken from the Internet. Each example demonstrates a separate problem. (By the way, the choice of these examples is not intended to embarrass the original authors. These programs are typical of assembly language source code found on the Internet.)
%TITLE "Sums TWO hex values"
IDEAL
DOSSEG
MODEL small
STACK 256
DATASEG
exitCode db 0
prompt1 db 'Enter value 1: ', 0
prompt2 db 'Enter value 2: ', 0
string db 20 DUP (?)
CODESEG
EXTRN StrLength:proc
EXTRN StrWrite:proc, StrRead:proc, NewLine:proc
EXTRN AscToBin:proc, BinToAscHex:proc
Start:
mov ax,@data
mov ds,ax
mov es,ax
mov di, offset prompt1
call GetValue
push ax
mov di, offset prompt2
call GetValue
pop bx
add ax,bx
mov cx,4
mov di, offset string
call BinToAscHex
call StrWrite
Exit:
mov ah,04Ch
mov al,[exitCode]
int 21h
PROC GetValue
call StrWrite
mov di, offset string
mov cl,4
call StrRead
call NewLine
call StrLength
mov bx,cx
mov [word bx + di], 'h'
call AscToBin
ret
ENDP GetValue
END Start
Well, the biggest problem with this program should be fairly obvious - it
has absolutely no comments other than the title of the program. Another
problem is the fact that strings that prompt the user appear in one part
of the program and the calls that print those strings appear in another.
While this is typical assembly language programming, it still makes the
program harder to read. Another, relatively minor, problem is that it
uses TASM's "less-than" IDEAL syntax[1]. This program also uses the MASM/TASM "simplified" segment directives. How typically Microsoft to name a feature that adds complexity to a product "simplified." It turns out that programs that use the standard segmentation directives will be easier to read[2].
Before moving on, it is worthwhile to point out two good features of this program (with respect to readability). First, the programmer chose a reasonable set of names for the procedures and variables this program uses (I'll assume the author of this code segment is also the author of the library routines it calls). Another positive aspect of this program is that the mnemonic and operand fields are nicely aligned.
Okay, after complaining about how hard this code is to read, how about a more readable version? The following program is, arguably, more readable than the version above. Arguably, because this version uses the UCR Standard Library v2.0 and it assumes that the reader is familiar with features of that particular library.
;**************************************************
;
; AddHex-
;
; This simple program reads two integer values from
; the user, computes their sum, and prints the
; result to the display.
;
; This example uses the "UCR Standard Library for
; 80x86 Assembly Language Programmers v2.0"
;
; Randall Hyde
; 12/13/96
title AddHex
.xlist
include ucrlib.a
includelib ucrlib.lib
.list
cseg segment para public 'code'
assume cs:cseg
; GetInt-
;
; This function reads an integer value from the keyboard and
; returns that value in the AX register.
;
; This routine traps illegal values (either too large or
; incorrect digits) and makes the user re-enter the value.
GetInt textequ <call GetInt_p>
GetInt_p proc
push dx ;DX holds the error code.
GetIntLoop: mov dx, false ;Assume no error.
try ;Trap any errors.
FlushGetc ;Force input from a new line.
geti ;Read the integer.
except $Conversion ;Trap if bad characters.
print "Illegal numeric conversion, please re-enter", nl
mov dx, true
except $Overflow ;Trap if # too large.
print "Value out of range, please re-enter.",nl
mov dx, true
endtry
cmp dx, true
je GetIntLoop
pop dx
ret
GetInt_p endp
Main proc
InitExcept
print 'Enter value 1: '
GetInt
mov bx, ax
print 'Enter value 2: '
GetInt
print cr, lf, 'The sum of the two values is '
add ax, bx
puti
putcr
Quit: CleanUpEx
ExitPgm ;DOS macro to quit program.
Main endp
cseg ends
sseg segment para stack 'stack'
stk db 256 dup (?)
sseg ends
zzzzzzseg segment para public 'zzzzzz'
LastBytes db 16 dup (?)
zzzzzzseg ends
end Main
It is well worth pointing out that this code does quite a bit more than
the original AddHex program. In particular, it validates the user's
input; something the original program did not do. If one were to exactly
simulate the original program, the program could be simplified to the
following:
print nl, 'Enter value 1: '
Geti
mov bx, ax
print nl, 'Enter value 2: '
Geti
add ax, bx
putcr
puti
putcr
In this example, the two sample solutions improved the readability of the
program by adding comments, formatting the program a little bit better,
and by using the high-level features of the UCR Standard Library to
simplify the coding and keep output string literals with the statements
that print them.
;===================================
;SET_POINT (Xpos%, Ypos%, ColorNum%)
;===================================
;
; Plots a single Pixel on the active display page
;
; ENTRY: Xpos = X position to plot pixel at
; Ypos = Y position to plot pixel at
; ColorNum = Color to plot pixel with
;
; EXIT: No meaningful values returned
;
SP_STACK STRUC
DW ?,? ; BP, DI
DD ? ; Caller
SETP_Color DB ?,? ; Color of Point to Plot
SETP_Ypos DW ? ; Y pos of Point to Plot
SETP_Xpos DW ? ; X pos of Point to Plot
SP_STACK ENDS
PUBLIC SET_POINT
SET_POINT PROC FAR
PUSHx BP, DI ; Preserve Registers
MOV BP, SP ; Set up Stack Frame
LES DI, d CURRENT_PAGE ; Point to Active VGA Page
MOV AX, [BP].SETP_Ypos ; Get Line # of Pixel
MUL SCREEN_WIDTH ; Get Offset to Start of Line
MOV BX, [BP].SETP_Xpos ; Get Xpos
MOV CX, BX ; Copy to extract Plane # from
SHR BX, 2 ; X offset (Bytes) = Xpos/4
ADD BX, AX ; Offset = Width*Ypos + Xpos/4
MOV AX, MAP_MASK_PLANE1 ; Map Mask & Plane Select Register
AND CL, PLANE_BITS ; Get Plane Bits
SHL AH, CL ; Get Plane Select Value
OUT_16 SC_Index, AX ; Select Plane
MOV AL,[BP].SETP_Color ; Get Pixel Color
MOV ES:[DI+BX], AL ; Draw Pixel
POPx DI, BP ; Restore Saved Registers
RET 6 ; Exit and Clean up Stack
SET_POINT ENDP
Unlike the previous example, this one has lots of comments. Indeed, the
comments are not bad. However, this particular routine suffers from its
own set of problems. First, most of the instructions, register names, and
identifiers appear in upper case. Upper case characters are much harder
to read than lower case letters. Considering the extra work involved in
entering upper case letters into the computer, it's a real shame to see
this type of mistake in a program[3]. Another
big problem with this particular code segment is that the author didn't
align the label field, the mnemonic field, and the operand field very well
(it's not horrible, but it's bad enough to affect the readability of the
program). Here is an improved version of the program:
;===================================
;
;SetPoint (Xpos%, Ypos%, ColorNum%)
;
;
; Plots a single Pixel on the active display page
;
; ENTRY: Xpos = X position to plot pixel at
; Ypos = Y position to plot pixel at
; ColorNum = Color to plot pixel with
;
; ES:DI = Screen base address (??? I added this without really
; knowing what is going on here [RLH]).
;
; EXIT: No meaningful values returned
;
dp textequ <dword ptr>
Color textequ <[bp+6]>
YPos textequ <[bp+8]>
XPos textequ <[bp+10]>
public SetPoint
SetPoint proc far
push bp
mov bp, sp
push di
les di, dp CurrentPage ;Point at active VGA Page
mov ax, YPos ;Get line # of Pixel
mul ScreenWidth ;Get offset to start of line
mov bx, XPos ;Get offset into line
mov cx, bx ;Save for plane computations
shr bx, 2 ;X offset (bytes)= XPos/4
add bx, ax ;Offset=Width*YPos + XPos/4
mov ax, MapMaskPlane1 ;Map mask & plane select reg
and cl, PlaneBits ;Get plane bits
shl ah, cl ;Get plane select value
out_16 SCIndex, ax ;Select plane
mov al, Color ;Get pixel color
mov es:[di+bx], al ;Draw pixel
pop di
pop bp
ret 6
SetPoint endp
Most of the changes here were purely mechanical: reducing the number of
upper case letters in the program, spacing the program out better,
adjusting some comments, etc. Nevertheless, these small, subtle, changes
have a big impact on how easy the code is to read (at least, to an
experienced assembly language programmer).
;Get all file names matching filespec and set up tables
GetFileRecords:
mov dx, OFFSET DTA ;Set up DTA
mov ah, 1Ah
int 21h
mov dx, FILESPEC ;Get first file name
mov cl, 37h
mov ah, 4Eh
int 21h
jnc FileFound ;No files. Try a different filespec.
mov si, OFFSET NoFilesMsg
call Error
jmp NewFilespec
FileFound:
mov di, OFFSET fileRecords ;DI -> storage for file names
mov bx, OFFSET files ;BX -> array of files
sub bx, 2
StoreFileName:
add bx, 2 ;For all files that will fit,
cmp bx, (OFFSET files) + NFILES*2
jb @@L1
sub bx, 2
mov [last], bx
mov si, OFFSET tooManyMsg
jmp DoError
@@L1:
mov [bx], di ;Store pointer to status/filename in files[]
mov al, [DTA_ATTRIB] ;Store status byte
and al, 3Fh ;Top bit is used to indicate file is marked
stosb
mov si, OFFSET DTA_NAME ;Copy file name from DTA to filename storage
call CopyString
inc di
mov si, OFFSET DTA_TIME ;Copy time, date and size
mov cx, 4
rep movsw
mov ah, 4Fh ;Next filename
int 21h
jnc StoreFileName
mov [last], bx ;Save pointer to last file entry
mov al, [keepSorted] ;If returning from EXEC, need to resort files?
or al, al
jz DisplayFiles
jmp Sort0
The primary problem with this program is the formatting. The label
fields overlap the mnemonic fields (in almost every instance), the operand
fields of the various instructions are not aligned, there are very few
blank lines to organize the code, the programmer makes excessive use of
"local" label names, and, although not prevalent, there are a
few items that are all uppercase (remember, upper case characters are
harder to read). This program also makes considerable use of "magic
numbers," especially with respect to opcodes passed on to DOS.
Another subtle problem with this program is the way it organizes control flow. At a couple of points in the code it checks to see if an error condition exists (file not found and too many files processed). If no error exists, the code above branches around some error handling code that the author places in the middle of the routine. Unfortunately, this interrupts the flow of the program. Most readers will want to see a straight-line version of the program's typical operation without having to worry about details concerning error conditions. The organization of this code is such that the reader must skip over seldom-executed code in order to follow what is happening with the common case[4].
Here is a slightly improved version of the above program:
;Get all file names matching filespec and set up tables
GetFileRecords: mov dx, offset DTA ;Set up DTA
DOS SetDTA
; Get the first file that matches the specified filename (that may
; contain wildcard characters). If no such file exists, then
; we've got an error.
mov dx, FileSpec
mov cl, 37h
DOS FindFirstFile
jc FileNotFound
; As long as there are more files matching our file spec (that contains
; wildcard characters), get the file information and place it in the
; "files" array. Each time through the "StoreFileName" loop we've got
; a new file name via a call to DOS' FindNextFile function (FindFirstFile
; for the first iteration). Store the info concerning the file away and
; move on to the next file.
mov di, offset fileRecords ;DI -> storage for file names
mov bx, offset files ;BX -> array of files
sub bx, 2 ;Special case for 1st iteration
StoreFileName: add bx, 2
cmp bx, (offset files) + NFILES*2
jae TooManyFiles
; Store away the pointer to the status/filename in files[] array.
; Note that the H.O. bit of the status byte indicates that the file
; is marked.
mov [bx], di ;Store pointer in files[]
mov al, [DTAattrib] ;Store status byte
and al, 3Fh ;Clear file is marked bit
stosb
; Copy the filename from the DTA storage area to the space we've set aside.
mov si, offset DTAname
call CopyString
inc di ;Skip zero byte (???).
mov si, offset DTAtime ;Copy time, date and size
mov cx, 4
rep movsw
; Move on to the next file and try again.
DOS FindNextFile
jnc StoreFileName
; After processing the last file entry, do some clean up.
; (1) Save pointer to last file entry.
; (2) If returning from EXEC, we may need to resort and display the files.
mov [last], bx
mov al, [keepSorted]
or al, al
jz DisplayFiles
jmp Sort0
; Jump down here if there were no files to process.
FileNotFound: mov si, offset NoFilesMsg
call Error
jmp NewFilespec
; Jump down here if there were too many files to process.
TooManyFiles: sub bx, 2
mov [last], bx
mov si, offset tooManyMsg
jmp DoError
This improved version dispenses with the local labels, formats the code
better by aligning all the statement fields and inserting blank lines into
the code. It also eliminates much of the uppercase characters appearing
in the previous version. Another improvement is that this code moves the
error handling code out of the main stream of this code segment, allowing
the reader to follow the typical execution in a more linear fashion.
In view of the above, it makes sense to define an "intended audience" that we intend to have read our assembly language programs. Such a person should:
To develop a metric for measuring the readability of an assembly language program, the first thing we must ask is "Why is readability important?" This question has a simple (though somewhat flippant) answer: Readability is important because programs are read (furthermore, a line of code is typically read ten times more often than it is written). To expand on this, consider the fact that most programs are read and maintained by other programmers (Steve McConnell claims that up to ten generations of maintenance programmers work on a typical real-world program before it is rewritten; furthermore, they spend up to 60% of their effort on that code simply figuring out how it works). The more readable your programs are, the less time these other people will have to spend figuring out what your program does. Instead, they can concentrate on adding features or correcting defects in the code.
For the purposes of this document, we will define a "readable" program as one that has the following trait:
That's a tall order! This definition doesn't sound very difficult to achieve, but few non-trivial programs ever really achieve this status. This definition suggests that an appropriate programmer (i.e., one who is familiar with the problem the program is trying to solve) can pick up a program, read it at their normal reading pace (just once), and fully comprehend the program. Anything less is not a "readable" program.
Of course, in practice, this definition is unusable since very few programs reach this goal. Part of the problem is that programs tend to be quite long and few human beings are capable of managing a large number of details in their head at one time. Furthermore, no matter how well-written a program may be, "a competent programmer" does not suggest that the programmer's IQ is so high they can read a statement and fully comprehend its meaning without expending much thought. Therefore, we must define readability, not as a boolean entity, but as a scale. Although truly unreadable programs exist, there are many "readable" programs that are less readable than other programs. Therefore, perhaps the following definition is more realistic:
An 80% comprehension level means that the programmer can correct bugs in the program and add new features to the program without making mistakes due to a misunderstanding of the code at hand.
Of course, consistency by itself is not good enough. Consistently bad programs are not particularly easy to read. Therefore, one must carefully consider the guidelines to use when defining an all-encompassing standard. The purpose of this paper is to create such a standard. However, don't get the impression that the material appearing in this document appears simply because it sounded good at the time or because of some personal preferences. The material in this paper comes from several software engineering texts on the subject (including Elements of Programming Style, Code Complete, and Writing Solid Code), nearly 20 years of personal assembly language programming experience, and a set of generic programming guidelines developed for Information Management Associates, Inc.
This document assumes consistent usage by its readers. Therefore, it concentrates on a lot of mechanical and psychological issues that affect the readability of a program. For example, uppercase letters are harder to read than lower case letters (this is a well-known result from psychology research). It takes longer for a human being to recognize uppercase characters, therefore, an average human being will take more time to read text written all in upper case. Hence, this document suggests that one should avoid the use of uppercase sequences in a program. Many of the other issues appearing in this document are in a similar vein; they suggest minor changes to the way you might write your programs that make it easier for someone to recognize some pattern in your code, thus aiding in comprehension.
Section Two discusses programs in general. It primarily discusses documentation that must accompany a program and the organization of source files. It also discusses, briefly, configuration management and source code control issues. Keep in mind that figuring out how to build a program (make, assemble, link, test, debug, etc.) is important. If your reader fully understands the "heapsort" algorithm you are using, but cannot build an executable module to run, they still do not fully understand your program.
Section Three discusses how to organize modules in your program in a logical fashion. This makes it easier for others to locate sections of code and organizes related sections of code together so someone can easily find important code and ignore unimportant or unrelated code while attempting to understand what your program does.
Section Four discusses the use of procedures within a program. This is a continuation of the theme in Section Three, although at a lower, more detailed, level.
Section Five discusses the program at the level of the statement. This (large) section provides the meat of this proposal. Most of the rules this paper presents appear in this section.
Section Six discusses those items that make up a statement (labels, names, instructions, operands, operators, etc.) This is another large section that presents a large number of rules one should follow when writing readable programs. This section discusses naming conventions, appropriateness of operators, and so on.
Section Seven discusses data types and other related topics.
Section Eight covers miscellaneous topics that the previous sections did not cover.
A Guideline is a suggestion. It is a rule you should follow unless you can verbally defend why you should break the rule. As long as there is a good, defensible reason, you should feel no apprehension about violating a guideline. Guidelines exist in order to encourage consistency in areas where there are no good reasons for choosing one methodology over another. You shouldn't violate a Guideline just because you don't like it -- doing so will make your programs inconsistent with respect to other programs that do follow the Guideline (and, therefore, harder to read). However, you shouldn't lose any sleep because you violated a Guideline.
Rules are much stronger than Guidelines. You should never break a rule unless there is some external reason for doing so (e.g., making a call to a library routine forces you to use a bad naming convention). Whenever you feel you must violate a rule, you should verify that it is reasonable to do so in a peer review with at least two peers. Furthermore, you should explain in the program's comments why it was necessary to violate the rule. Rules are just that -- rules to be followed. However, there are certain situations where it may be necessary to violate the rule in order to satisfy external requirements or even make the program more readable.
Enforced Rules are the toughest of the lot. You should never violate an enforced rule. If there is ever a true need to do this, then you should consider demoting the Enforced Rule to a simple Rule rather than treating the violation as a reasonable alternative.
An Exception is exactly that, a known example where one would commonly violate a Guideline, Rule, or (very rarely) Enforced Rule. Although exceptions are rare, the old adage "Every rule has its exceptions..." certainly applies to this document. The Exceptions point out some of the common violations one might expect.
Of course, the categorization of Guidelines, Rules, Enforced Rules, and Exceptions herein is one man's opinion. At some organizations, this categorization may require reworking depending on the needs of that organization.
The following rules take care of case one:
The first issue to consider is the contents of these new subdirectories. Since programmers rummaging through this project in the future will need to easily locate source files in a project, it is important that you organize these new subdirectories so that it is easy to find the source files you are moving into them. The best organization is to put each source module (or a small group of strongly related modules) into its own subdirectory. The subdirectory should bear the name of the source module minus its suffix (or the main module if there is more than one present in the subdirectory). If you place two or more source files in the same directory, ensure this set of source files forms a cohesive set (meaning the source files contain code that solves a single problem). A discussion of cohesiveness appears later in this document.
Modules contain several different objects including constants, types, variables, and program units (routines). Modules share many attributes with routines (program units); this is not surprising since routines are the major component of a typical module. However, modules have some additional attributes of their own. The following sections describe the attributes of a well-written module.
The first three forms of cohesion above are generally acceptable in a program. The fourth (temporal) is probably okay, but you should rarely use it. The last three forms should almost never appear in a program. For some reasonable examples of module cohesion, you should consult "Code Complete".
A module that uses loose coupling generally contains fewer errors per KLOC (thousands of lines of code). Furthermore, modules that exhibit loose coupling are easier to reuse (both in the current and future projects). For more information on coupling, see the appropriate chapter in "Code Complete".
This document does not address the decomposition of a problem into its modular components. Presumably, you can already handle that part of the task. There are a wide variety of texts on this subject if you feel weak in this area.
The MASM 6.x externdef directive is perfect for creating interface files. When you use externdef within a source module that defines a symbol, externdef behaves like the public directive, exporting the name to other modules. When you use externdef within a source module that refers to an external name, externdef behaves like the extern (or extrn) directive. This lets you place an externdef directive in a single file and include this file in both the modules that import and the modules that export the public names.
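As a minimal sketch of such an interface file, assuming a hypothetical module "MyModule" that exports a near procedure "MyProc" and a word variable "MyVar" (none of these names come from the original text):

```asm
; Module: MyModule.a  (hypothetical interface file)
;
; Included by MyModule.asm, which defines these symbols, and by
; every module that calls MyProc or accesses MyVar.  In the
; defining module externdef exports the names; everywhere else
; it declares them as external.

                externdef MyProc:near, MyVar:word
```

The appeal of this arrangement is that each name and its type appear exactly once, so the importing and exporting modules cannot drift out of sync.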
If you are using an assembler that does not support externdef, you should probably consider switching to MASM 6.x. If switching to a better assembler (that supports externdef) is not feasible, the last thing you want to do is have to maintain the interface information in several separate files. Instead, use the assembler's ifdef conditional assembly directives to assemble a set of public statements in the header file if a symbol with the module's name is defined prior to including the header file. It should assemble a set of extrn statements otherwise. Although you still have to maintain the public and external information in two places (in the ifdef true and false sections), they are in the same file and located near one another.
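The ifdef approach might look something like the following sketch. The names "MyModule", "MyProc", and "MyVar" are hypothetical, and the sketch assumes the implementing source file defines the symbol "MyModule" (for example, with "MyModule = 0") before it includes the header:

```asm
; Module: MyModule.a  (hypothetical interface file)
;
; MyModule.asm defines the symbol "MyModule" before including
; this file, so it assembles the public statements.  All other
; modules leave "MyModule" undefined and get the extrn statements.

                ifdef   MyModule
                public  MyProc, MyVar
                else
                extrn   MyProc:near, MyVar:word
                endif
```

The two lists still have to be kept in agreement by hand, but because they sit next to each other a mismatch is easy to spot.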
; Module: MyHeader.a
ifndef MyHeader_A
MyHeader_A = 0
.
. ;Statements in this header file.
.
endif
The first time a source file includes "MyHeader.a" the symbol
"MyHeader_A" is undefined. Therefore, the assembler will
process all the statements in the header file. In successive include
operations (during the same assembly) the symbol "MyHeader_A" is
already defined, so the assembler ignores the body of the include file.
Why would you ever include a file twice? Easy. Some header files may include other header files. By including the file "YourHeader.a" a module might also be including "MyHeader.a" (assuming "YourHeader.a" contains the appropriate include directive). Your main program, which includes "YourHeader.a", might also need "MyHeader.a", so it explicitly includes this file, not realizing "YourHeader.a" has already processed "MyHeader.a", thereby causing symbol redefinitions.
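The scenario just described can be sketched as follows. The file "Main.asm" and the symbol "YourHeader_A" are hypothetical names chosen for illustration, and the two files are shown together in one listing only for brevity:

```asm
; Module: YourHeader.a  (hypothetical)

                ifndef  YourHeader_A
YourHeader_A    =       0
                include MyHeader.a      ;YourHeader.a depends on MyHeader.a.
                endif

; Module: Main.asm  (hypothetical)

                include YourHeader.a    ;Also processes MyHeader.a.
                include MyHeader.a      ;MyHeader_A is already defined, so
                                        ; the guard skips the file's body.
```

With the ifndef guard in place in "MyHeader.a", the second include is harmless and the main program does not need to know which headers its other headers pull in.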
Routines are closely related to modules, since they tend to be the major component of a module (along with data, constants, and types). Hence, many of the attributes that apply to a module also apply to routines. The following paragraphs, at the expense of being redundant, repeat the earlier definitions so you don't have to flip back to the previous sections.
A program that uses loose coupling generally contains fewer errors per KLOC (thousands of lines of code). Furthermore, routines that exhibit loose coupling are easier to reuse (both in the current and future projects). For more information on coupling, see the appropriate chapter in "Code Complete".
A routine that exhibits functional cohesiveness is the right size, almost regardless of the number of lines of code it contains. You shouldn't artificially break up a routine into two or more subroutines (e.g., sub_partI and sub_partII) just because you feel a routine is getting to be too long. First, verify that your routine exhibits strong cohesion and loose coupling. If this is the case, the routine is not too long. Do keep in mind, however, that a long routine is probably a good indication that it is performing several actions and, therefore, does not exhibit strong cohesion.
Of course, you can take this too far. Most studies on the subject indicate that routines in excess of 150-200 lines of code tend to contain more bugs and are more costly to fix than shorter routines. Note, by the way, that you do not count blank lines or lines containing only comments when counting the lines of code in a program.
Also note that most studies involving routine size deal with HLLs. A comparable assembly language routine will contain more lines of code than the corresponding HLL routine. Therefore, you can expect your routines in assembly language to be a little longer.
______________________________________________________
mov ax, 0
mov bx, ax
add ax, dx
mov cx, ax
______________________________________________________
mov ax, 0
mov bx, ax
add ax, dx
mov cx, ax
______________________________________________________
While this is an extreme example, do note that it only takes a few
mistakes to have a large impact on the readability of a program. Consider
(a short section from) an example presented earlier:
GetFileRecords:
mov dx, OFFSET DTA ;Set up DTA
mov ah, 1Ah
int 21h
mov dx, FILESPEC ;Get first file name
mov cl, 37h
mov ah, 4Eh
int 21h
jnc FileFound ;No files. Try a different filespec.
mov si, OFFSET NoFilesMsg
call Error
jmp NewFilespec
FileFound:
mov di, OFFSET fileRecords ;DI -> storage for file names
mov bx, OFFSET files ;BX -> array of files
sub bx, 2
Improved version:
GetFileRecords: mov dx, offset DTA ;Set up DTA
DOS SetDTA
mov dx, FileSpec
mov cl, 37h
DOS FindFirstFile
jc FileNotFound
mov di, offset fileRecords ;DI -> storage for file names
mov bx, offset files ;BX -> array of files
sub bx, 2 ;Special case for 1st iteration
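The improved version relies on a "DOS" macro and on symbolic names for the DOS function codes, neither of which the excerpt defines. A minimal sketch of how they might be written follows; the function numbers are the ones the original code loads into AH, but the macro body itself is an assumption, not the author's actual definition:

```asm
; Symbolic names for the DOS (int 21h) functions used above.

SetDTA          equ     1Ah             ;Set disk transfer address.
FindFirstFile   equ     4Eh             ;Find first matching file.
FindNextFile    equ     4Fh             ;Find next matching file.

; DOS-
;
; Hypothetical macro that loads the function code into AH and
; invokes DOS.  All other parameters must already be in place.

DOS             macro   function
                mov     ah, function
                int     21h
                endm
```

Naming the function codes once removes the "magic numbers" from the body of the routine, so "DOS FindFirstFile" documents itself where "mov ah, 4Eh" needed a comment.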
An assembly language statement consists of four possible fields: a label
field, a mnemonic field, an operand field, and a comment field. The
mnemonic and comment fields are always optional. The label field is
generally optional although certain instructions (mnemonics) do not allow
labels while others require labels. The operand field's presence is tied
to the mnemonic field. For most instructions the actual mnemonic
determines whether an operand field must be present.
MASM is a free-form assembler insofar as it does not require these fields to appear in any particular column[12]. However, the freedom to arrange these columns in any manner is one of the primary contributors to hard-to-read assembly language programs. Although MASM lets you enter your programs in free-form, there is absolutely no reason you cannot adopt a fixed field format, always starting each field in the same column. Doing so generally helps make an assembly language program much easier to read. Here are the rules you should use:
If you need to set off a sequence of statements from surrounding code, the best thing you can do is use blank lines in your source code. For a small amount of detachment, to separate one computation from another for example, a single blank line is sufficient. To really show that one section of code is special, use two, three, or even four blank lines to separate one block of statements from the surrounding code. To separate two totally unrelated sections of code, you might use several blank lines and a row of dashes or asterisks to separate the statements. E.g.,
mov dx, FileSpec
mov cl, 37h
DOS FindFirstFile
jc FileNotFound
; *********************************************
mov di, offset fileRecords ;DI -> storage for file names
mov bx, offset files ;BX -> array of files
sub bx, 2 ;Special case for 1st iteration
mov ax, 0 ;Set AX to zero.
Quite frankly, this comment is worse than no comment at all. It doesn't
tell the reader anything the instruction itself doesn't tell and it
requires the reader to take some of his or her precious time to figure out
that the comment is worthless. If someone cannot tell that this
instruction is setting AX to zero, they have no business reading an
assembly language program. This brings up the first guideline of this
section:
mov ax, 0 ;AX is the resulting sum. Initialize it.
Note that the comment does not say "Initialize it to zero."
Although there would be nothing intrinsically wrong with saying this, the
phrase "Initialize it" remains true no matter what value you
assign to AX. This makes maintaining the code (and comment) much easier
since you don't have to change the comment whenever you change the
constant associated with the instruction.
mov ax, 1 ;Set AX to zero.
It is amazing how long a typical person will look at this code trying to
figure out how on earth the program sets AX to zero when it's obvious it
does not do this. People will always believe comments over code. If
there is some ambiguity between the comments and the code, they will
assume that the code is tricky and that the comments are correct. Only
after exhausting all possible options is the average person likely to
concede that the comment must be incorrect.
mov ax, 10; { Set AX to 11 }
; This is a comment with a blank line between it and the next comment.
;
; This is another line with a comment on it.
Rather than like this:
; This is a comment with a blank line between it and the next comment.

; This is another line with a comment on it.
The semicolon appearing between the two statements suggests continuity that is not present when you remove the semicolon. If two blocks of comments are truly separate and whitespace between them is appropriate, you should consider separating them by a large number of blank lines to completely eliminate any possible association between the two.
Standalone comments are great for describing the actions of the code that immediately follows. So what are endline comments useful for? Endline comments can explain how a sequence of instructions is implementing the algorithm described in a previous set of standalone comments. Consider the following code:
; Compute the transpose of a matrix using the algorithm:
;
; for i := 0 to 3 do
; for j := 0 to 3 do
; swap( a[i][j], b[j][i] );
forlp i, 0, 3
forlp j, 0, 3
mov bx, i ;Compute address of a[i][j] using
shl bx, 2 ; row major ordering (i*4 + j)*2.
add bx, j
add bx, bx
lea bx, a[bx]
push bx ;Push address of a[i][j] onto stack.
mov bx, j ;Compute address of b[j][i] using
shl bx, 2 ; row major ordering (j*4 + i)*2.
add bx, i
add bx, bx
lea bx, b[bx]
push bx ;Push address of b[j][i] onto stack.
call swap ;Swap a[i][j] with b[j][i].
next
next
Note that the block comments before this sequence explain, in high level
terms, what the code is doing. The endline comments explain how the
statement sequence implements the general algorithm. Note, however, that
the endline comments do not explain what each statement is doing (at least
at the machine level). Rather than claiming "add bx, bx" is
multiplying the quantity in BX by two, this code assumes the reader can
figure that out for themselves (any reasonable assembly programmer would
know this). Once again, keep in mind your audience and write your
comments for them.
Ideally, one should never have to put unfinished code into a program. Of course, ideally, programs never have any defects in them, either. Since such code inevitably finds its way into a program, it's best to have a policy in place to deal with it, hence this section.
Unfinished code comes in five general categories: non-functional code, partially functioning code, suspect code, code in need of enhancement, and code documentation. Non-functional code might be a stub or driver that needs to be replaced in the future with actual code, or some code that has severe enough defects that it is useless except for some small special cases. This code is really bad; fortunately, its severity prevents you from ignoring it. It is unlikely anyone would miss such a poorly constructed piece of code in early testing prior to release.
Partially functioning code is, perhaps, the biggest problem. This code works well enough to pass some simple tests yet contains serious defects that should be corrected. Moreover, these defects are known. Software often contains a large number of unknown defects; it's a shame to let some (prior) known defects ship with the product simply because a programmer forgot about a defect or couldn't find the defect later.
Suspect code is exactly that: code that is suspicious. The programmer may not be aware of a quantifiable problem but may suspect that a problem exists. Such code will need a later review in order to verify whether it is correct.
The fourth category, code in need of enhancement, is the least serious. For example, to expedite a release, a programmer might choose to use a simple algorithm rather than a complex, faster algorithm. S/he could make a comment in the code like "This linear search should be replaced by a hash table lookup in a future version of the software." Although it might not be absolutely necessary to correct such a problem, it would be nice to know about such problems so they can be dealt with in the future.
The fifth category, documentation, refers to changes made to software that will affect the corresponding documentation (user guide, design document, etc.). The documentation department can search for these defects to bring existing documentation in line with the current code.
This standard defines a mechanism for dealing with these five classes of problems. Any occurrence of unfinished code will be preceded by a comment that takes one of the following forms (where "_" denotes a single space):
;_#defect#severe_;
;_#defect#functional_;
;_#defect#suspect_;
;_#defect#enhancement_;
;_#defect#documentation_;
It is important to use all lower case and verify the correct spelling so it is easy to find these comments using a text editor search or a tool like grep. Obviously, a separate comment explaining the situation must follow these comments in the source code.
Examples:
; #defect#suspect ;
; #defect#enhancement ;
; #defect#documentation ;
Notice the use of comment delimiters (the semicolon) on both sides even though assembly language doesn't require them.
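Because the tags have one fixed spelling, a few lines of script can list every outstanding defect in a source file. The following Python sketch is illustrative only and is not part of this standard; the function name find_defect_tags and the sample listing are hypothetical. It accepts exactly the five lower case forms given above, so a mis-cased tag goes unreported, which is why the spelling rule matters.

```python
import re

# The five defect classes named by this standard (all lower case).
DEFECT_CLASSES = ("severe", "functional", "suspect", "enhancement", "documentation")

# Matches a whole line of the form "; #defect#<class> ;" with single spaces.
DEFECT_TAG = re.compile(r"^\s*; #defect#(" + "|".join(DEFECT_CLASSES) + r") ;\s*$")


def find_defect_tags(source_text):
    """Return a list of (line_number, defect_class) pairs, numbering lines from 1."""
    hits = []
    for line_number, line in enumerate(source_text.splitlines(), start=1):
        match = DEFECT_TAG.match(line)
        if match:
            hits.append((line_number, match.group(1)))
    return hits


# Hypothetical listing used only to exercise the scanner.
SAMPLE_LISTING = """\
; #defect#suspect ;
; Not sure this handles a zero-length string.
        call    StrLength
; #Defect#Enhancement ;
; Replace this linear search with a hash table lookup.
        call    LinearSearch
; #defect#documentation ;
; User guide still describes the old prompt text.
"""

if __name__ == "__main__":
    for line_number, defect_class in find_defect_tags(SAMPLE_LISTING):
        print(line_number, defect_class)
```

The mis-cased tag on the fourth line of the sample is skipped by the scanner, just as it would be missed by a case-sensitive grep.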
; text #link#location text ;
"Text" is optional and represents arbitrary text (although it is really intended for embedding html commands to provide hyperlinks to the specified document). "Location" describes the document and section where the associated information can be found.
Examples:
; #link#User's Guide Section 3.1 ;
; #link#Program Design Document, Page 5 ;
; #link#Funcs.pas module, "xyz" function ;
; <A HREF="DesignDoc.html#xyzfunc"> #link#xyzfunc </a> ;
The aforementioned researchers at IBM developed several programs with the following set of attributes:
As should be obvious, the programs that had bad comments and names were the hardest to read; likewise, those programs with good comments and names were the easiest to read. The surprising results concerned the other two cases. Most people assume good comments are more important than good names in a program. Not only did IBM find this to be false, they found it to be really false.
As it turns out, good names are even more important than good comments in a program. This is not to say that comments are unimportant; they are extremely important. However, it is worth pointing out that if you spend the time to write good comments and then choose poor names for your program's identifiers, you've damaged the readability of your program despite the work you've put into your comments. Quickly read over the following code:
mov ax, SignedValue
cwd
add ax, -1
rcl dx, 1
mov AbsoluteValue, dx
Question: What does this code compute and store in the AbsoluteValue
variable?
The obvious answer is the absolute value of SignedValue. This is also incorrect. The correct answer is signum:
mov ax, SignedValue ;Get value to check.
cwd ;DX = FFFF if neg, 0000 otherwise.
add ax, 0ffffh ;Carry=0 if ax is zero, one otherwise.
rcl dx, 1 ;DX = FFFF if AX is neg, 0 if ax=0,
mov Signum, dx ; 1 if ax>0.
Granted, this is a tricky piece of code[16].
Nonetheless, even without the comments you can probably figure out what
the code sequence does even if you can't figure out how it does it:
mov ax, SignedValue
cwd
add ax, 0ffffh
rcl dx, 1
mov Signum, dx
Based on the names alone you can probably figure out that this code
computes the signum function. This is the "understanding 80% of the
code" referred to earlier. Note that you don't need misleading names
to make this code unfathomable. Consider the following code that doesn't
trick you by using misleading names:
mov ax, x
cwd
add ax, 0ffffh
rcl dx, 1
mov y, dx
This is a very simple example. Now imagine a large program that has many
names. As the number of names increases in a program, it becomes harder to
keep track of them all. If the names themselves do not provide a good
clue to the meaning of the name, understanding the program becomes very
difficult.
The vast majority of programmers know only one language - English. Some programmers know English as a second language and may not be familiar with a common non-English phrase that is not in their own language (e.g., rendezvous). Since English is the common language of most programmers, all identifiers should use easily recognizable English words and phrases.
A case-neutral identifier will work properly whether you compile it with a compiler that has case sensitive identifiers or case insensitive identifiers. In practice, this means that all uses of the identifiers must be spelled exactly the same way (including case) and that no other identifier exists whose only difference is the case of the letters in the identifier. For example, if you declare an identifier "ProfitsThisYear" in Pascal (a case-insensitive language), you could legally refer to this variable as "profitsThisYear" and "PROFITSTHISYEAR". However, this is not a case-neutral usage since a case sensitive language would treat these three identifiers as different names. Conversely, in case-sensitive languages like C/C++, it is possible to create two different identifiers with names like "PROFITS" and "profits" in the program. This is not case-neutral since attempting to use these two identifiers in a case insensitive language (like Pascal) would produce an error since the case-insensitive language would think they were the same name.
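The second half of this definition (no two identifiers may differ only in case, and every use must be spelled the same way) is easy to check mechanically. The following Python sketch shows one way to do it; it is an illustration rather than a tool this standard prescribes, and the function name and sample names are hypothetical. It assumes the identifiers have already been extracted from the source file.

```python
from collections import defaultdict


def case_neutrality_violations(identifiers):
    """Group spellings that differ only in alphabetic case.

    Returns a dict mapping the lower-cased name to the sorted list of
    distinct spellings, for every name that has more than one spelling.
    """
    spellings = defaultdict(set)
    for name in identifiers:
        spellings[name.lower()].add(name)
    return {
        folded: sorted(names)
        for folded, names in spellings.items()
        if len(names) > 1
    }


# Hypothetical identifier uses collected from a program.
SAMPLE_USES = [
    "ProfitsThisYear", "profitsThisYear", "PROFITSTHISYEAR",
    "Count", "Count", "IsReady",
]

if __name__ == "__main__":
    for folded, names in case_neutrality_violations(SAMPLE_USES).items():
        print(folded, names)
```

An empty result means every identifier in the list is spelled one way only, so the program should behave the same under case-sensitive and case-insensitive translators.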
Different programmers (especially in different languages) use alphabetic case to denote different objects. For example, a common C/C++ coding convention is to use all upper case to denote a constant, macro, or type definition and to use all lower case to denote variable names or reserved words. Prolog programmers use an initial upper case alphabetic to denote a variable. Other comparable coding conventions exist. Unfortunately, there are so many different conventions that make use of alphabetic case, they are nearly worthless, hence the following rule:
There are going to be some obvious exceptions to the above rule; this document will cover those exceptions a little later. Alphabetic case does have one very useful purpose in identifiers - it is useful for separating words in a multi-word identifier; more on that subject in a moment.
Producing readable identifiers often requires a multi-word phrase. Natural languages typically use spaces to separate words; we cannot, however, use this technique in identifiers. Unfortunatelywritingmultiwordidentifiersmakesthemalmostimpossibletoreadifyoudonotdosomethingtodistinguishtheindividualwords (Unfortunately writing multiword identifiers makes them almost impossible to read if you do not do something to distinguish the individual words). There are a couple of good conventions in place to solve this problem. This standard's convention is to capitalize the first alphabetic character of each word in the middle of an identifier.
Lower case characters are easier to read than upper case. Identifiers written completely in upper case take almost twice as long to recognize and, therefore, impair the readability of a program. Yes, all upper case does make an identifier stand out. Such emphasis is rarely necessary in real programs. Yes, common C/C++ coding conventions dictate the use of all upper case identifiers. Forget them. They not only make your programs harder to read, they also violate the first rule above.
Avoid abbreviations as much as possible. What may seem like a perfectly reasonable abbreviation to you may totally confound someone else. Consider the following variable names that have actually appeared in commercial software:
NoEmployees, NoAccounts, pend
The "NoEmployees" and "NoAccounts" variables seem to be boolean variables indicating the presence or absence of employees and accounts. In fact, this particular programmer was using the (perfectly reasonable in the real world) abbreviation of "number" to indicate the number of employees and the number of accounts. The "pend" name referred to a procedure's end rather than any pending operation.
Programmers often use abbreviations in two situations: they're poor typists and they want to reduce the typing effort, or a good descriptive name for an object is simply too long. The former case is an unacceptable reason for using abbreviations. The second case, especially if care is taken, may warrant the occasional use of an abbreviation.
Many C/C++ Programmers, especially Microsoft Windows programmers, have adopted a formal naming convention known as "Hungarian Notation." To quote Steve McConnell from Code Complete: "The term 'Hungarian' refers both to the fact that names that follow the convention look like words in a foreign language and to the fact that the creator of the convention, Charles Simonyi, is originally from Hungary." One of the first rules given concerning identifiers stated that all identifiers are to be English names. Do we really want to create "artificially foreign" identifiers? Hungarian notation actually violates another rule as well: names using the Hungarian notation generally have very common prefixes, thus making them harder to read.
Hungarian notation does have a few minor advantages, but the disadvantages far outweigh the advantages. The following list from Code Complete and other sources describes what's wrong with Hungarian notation:
Although attaching machine type information to an identifier is generally a bad idea, a well thought-out name can successfully associate some high-level type information with the identifier, especially if the name implies the type or the type information appears as a suffix. For example, names like "PencilCount" and "BytesAvailable" suggest integer values. Likewise, names like "IsReady" and "Busy" indicate boolean values. "KeyCode" and "MiddleInitial" suggest character variables. A name like "StopWatchTime" probably indicates a real value. Likewise, "CustomerName" is probably a string variable. Unfortunately, it isn't always possible to choose a great name that describes both the content and type of an object; this is particularly true when the object is an instance (or definition of) some abstract data type. In such instances, some additional text can improve the identifier. Hungarian notation is a raw attempt at this that, unfortunately, fails for a variety of reasons.
A better solution is to use a suffix phrase to denote the type or class of an identifier. A common UNIX/C convention, for example, is to apply a "_t" suffix to denote a type name (e.g., size_t, key_t, etc.). This convention succeeds over Hungarian notation for several reasons including (1) the "type phrase" is a suffix and doesn't interfere with reading the name, (2) this particular convention specifies the class of the object (const, var, type, function, etc.) rather than a low level type, and (3) it certainly makes sense to change the identifier if its classification changes.
Can we apply this suffix idea to variables and avoid the pitfalls? Sometimes. Consider a high level data type "button" corresponding to a button on a Visual BASIC or Delphi form. A variable name like "CancelButton" makes perfect sense. Likewise, labels appearing on a form could use names like "ETWWLabel" and "EditPageLabel". Note that these suffixes still suffer from the fact that a change in type will require that you change the variable's name. However, changes in high level types are far less common than changes in low-level types, so this shouldn't present a big problem.
Avoid misleading abbreviations and names. For example, FALSE shouldn't be an identifier that stands for "Failed As a Legitimate Software Engineer." Likewise, you shouldn't compute the amount of free memory available to a program and stuff it into the variable "Profits".
You should avoid names with similar meanings. For example, if you have two variables "InputLine" and "InputLn" that you use for two separate purposes, you will undoubtedly confuse the two when writing or reading the code. If you can swap the names of the two objects and the program still makes sense, you should rename those identifiers. Note that the names do not have to be similar, only their meanings. "InputLine" and "LineBuffer" are obviously different but you can still easily confuse them in a program.
In a similar vein, you should avoid using two or more variables that have different meanings but similar names. For example, if you are writing a teacher's grading program you probably wouldn't want to use the name "NumStudents" to indicate the number of students in the class along with the variable "StudentNum" to hold an individual student's ID number. "NumStudents" and "StudentNum" are too similar.
Avoid names that sound similar when read aloud, especially out of context. This would include names like "hard" and "heart", "Knew" and "new", etc. Remember the discussion in the section above on abbreviations: you should be able to discuss your problem listing over the telephone with a peer. Names that sound alike make such discussions difficult.
Avoid misspelled words in names and avoid names that are commonly misspelled. Most programmers are notoriously bad spellers (look at some of the comments in our own code!). Spelling words correctly is hard enough; remembering how to spell an identifier incorrectly is even more difficult. Likewise, if a word is often spelled incorrectly, requiring a programmer to spell it correctly on each use is probably asking too much.
If you redefine the name of some library routine in your code, another programmer will surely confuse your name with the library's version. This is especially true when dealing with standard library routines and APIs.
mov ax, SignedValue ;Get value to check.
cwd ;DX = FFFF if neg, 0000 otherwise.
add ax, 0ffffh ;Carry=0 if ax is zero.
rcl dx, 1 ;DX = FFFF if AX is neg, 0 if AX=0,
mov Signum, dx ; 1 if AX>0.
Now consider the following code sequence that also computes the signum
function:
mov ax, SignedValue ;Get value to check.
cmp ax, 0 ;Check the sign.
je GotSignum ;We're done if it's zero.
mov ax, 1 ;Assume it was positive.
jns GotSignum ;Branch if it was positive.
neg ax ;Else return -1 for negative values.
GotSignum: mov Signum, ax
Yes, the second version is longer and slower. However, an average person
can read the instruction sequence and figure out what it's doing; hence
the second version is much easier to read than the first. Which sequence
is best? Unless speed or space is an extremely critical factor and you
can show that this routine is in the critical execution path, then the
second version is obviously better. There is a time and a place for
tricky assembly code; however, it's rare that you would need to pull
tricks like this throughout your code.
So how does one choose appropriate instruction sequences when there are many possible ways to accomplish the same task? The best way is to ensure that you have a choice. Although there are many different ways to accomplish an operation, few people bother to consider any instruction sequence other than the first one that comes to mind. Unfortunately, the "best" instruction sequence is rarely the first instruction sequence that comes to most people's minds[17]. In order to make a choice, you have to have a choice to make. That means you should create at least two different code sequences for a given operation if there is ever a question concerning the readability of your code. Once you have at least two versions, you can choose between them based on your needs at hand. While it is impractical to "write your program twice" so that you'll have a choice for every sequence of instructions in the program, you should apply this technique to particularly bothersome code sequences.
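When you do keep two versions of a sequence, it helps to confirm that they really compute the same result before choosing between them. One inexpensive way is to model both in a high level language and compare them over every 16-bit input. The Python sketch below models the two signum sequences shown above; it is not a replacement for testing the assembled code. The function names are hypothetical, and results are returned as unsigned 16-bit values, so -1 appears as 0FFFFh.

```python
WORD_MASK = 0xFFFF
SIGN_BIT = 0x8000


def signum_tricky(value):
    """Model of: mov ax,value / cwd / add ax,0ffffh / rcl dx,1."""
    ax = value & WORD_MASK
    dx = WORD_MASK if ax & SIGN_BIT else 0      # cwd: copy AX's sign into DX
    carry = (ax + 0xFFFF) >> 16                 # add ax,0ffffh: carry set unless AX=0
    dx = ((dx << 1) | carry) & WORD_MASK        # rcl dx,1: shift the carry into bit 0
    return dx


def signum_plain(value):
    """Model of the compare-and-branch version."""
    ax = value & WORD_MASK
    if ax == 0:                                 # je GotSignum
        return ax
    negative = bool(ax & SIGN_BIT)              # sign flag left by cmp ax,0
    ax = 1                                      # mov ax,1 (does not change flags)
    if negative:
        ax = (-ax) & WORD_MASK                  # neg ax
    return ax


def versions_agree():
    """True if both models match for all 65,536 possible inputs."""
    return all(signum_tricky(v) == signum_plain(v) for v in range(0x10000))


if __name__ == "__main__":
    print("versions agree:", versions_agree())
```

With agreement established, the choice between the two sequences rests on readability and speed alone.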
Fortunately, with a little discipline it is possible to write readable assembly language programs. How you design your control structures can have a big impact on the readability of your programs. The best way to do this can be summed up in two words: avoid spaghetti.
Spaghetti code is the name given to a program that has a large number of intertwined branches and branch targets within a code sequence. Consider the following example:
jmp L0
L1: mov ax, 0
jmp L2
L3: mov ax, 1
jmp L2
L4: mov ax, -1
jmp L2
L0: mov ax, x
cmp ax, 0
je L1
jns L3
jmp L4
L2: mov y, ax
This code sequence, by the way, is our good friend the Signum function.
It takes a few moments to figure this out because as you manually trace
through the code you find yourself spending more time following jumps
around than you do looking at code that computes useful results. Now this
is a rather extreme example, but it is also fairly short. A longer code
sequence could become just as obfuscated with even fewer branches all over
the place.
Spaghetti code is given this name because it resembles a bowl of spaghetti. That is, if we consider a control path in the program a spaghetti noodle, spaghetti code contains lots of intertwined branches into and out of different sections of the program. Needless to say, most spaghetti programs are difficult to understand, generally contain lots of bugs, and are often inefficient (don't forget that branches are among the slowest executing instructions on most modern processors).
So how do we resolve this? Easy: by physically adopting structured programming techniques in assembly language code. Of course, 80x86 assembly language doesn't provide if..then..else..endif, while..endwhile, repeat..until, and other such statements[22], but we can certainly simulate them. Consider the following high level language code sequence:
if(expression) then
<< statements to execute if expression is true >>
else
<< statements to execute if expression is false >>
endif
Almost any high level language programmer can figure out what this type of
statement will do. Assembly language programmers should leverage this
knowledge by attempting to organize their code so it takes this same
form. Specifically, the assembly language version should look something
like the following:
<< Assembly code to compute value of expression >>
JNxx ElsePart ;JNxx is the opposite of the condition we want to check.
<< Assembly code corresponding to the then portion >>
jmp AroundElsePart
ElsePart:
<< Assembly code corresponding to the else portion >>
AroundElsePart:
For a concrete example, consider the following:
if ( x=y ) then
write( 'x = y' );
else
write( 'x <> y' );
endif;
; Corresponding Assembly Code:
mov ax, x
cmp ax, y
jne ElsePart
print "x=y",nl
jmp IfDone
ElsePart: print "x<>y",nl
IfDone:
While this may seem like the obvious way to organize an
if..then..else..endif statement, it is surprising how many people would
naturally assume they've got to place the else part somewhere else in the
program as follows:
mov ax, x
cmp ax, y
jne ElsePart
print "x=y",nl
IfDone:
.
.
.
ElsePart: print "x<>y",nl
jmp IfDone
This code organization makes the program more difficult to follow. Most
programmers have a HLL background and despite a current assignment, they
still work mostly in HLLs. Assembly language programs will be more
readable if they mimic the HLL control constructs[23].
For similar reasons, you should attempt to organize your assembly code that simulates while loops, repeat..until loops, for loops, etc., so that the code resembles the HLL code (for example, a while loop should physically test the condition at the beginning of the loop with a jump at the bottom of the loop).
if( x <= y ) then
<< true statements>>
else
<< false statements>>
endif
; Assembly code:
mov ax, x
cmp ax, y
ja ElsePart
<< true code >>
jmp IfDone
ElsePart: << false code >>
IfDone:
When someone reads this program, the "JA" statement skips over
the true portion. Unfortunately, the "JA" instruction gives the
illusion we're checking to see if something is greater than something
else; in actuality, we're testing to see if some condition is less than
or equal, not greater than. As such, this code sequence hides some of the
original intent of the high level algorithm. One solution is to swap the
false and true portions of the code:
mov ax, x
cmp ax, y
jbe ThenPart
<< false code >>
jmp IfDone
ThenPart: << true code >>
IfDone:
This code sequence uses the conditional jump that matches the high level
algorithm's test (less than or equal). However, this code is now
organized in a non-standard fashion (it's an if..else..then..endif
statement). This hurts the readability more than using the proper jump
helped it. Now consider the following solution:
mov ax, x
cmp ax, y
jnbe ElsePart
<< true code >>
jmp IfDone
ElsePart: << false code >>
IfDone:
This code is organized in the traditional if..then..else..endif fashion.
Instead of using JA to skip over the then portion, it uses JNBE to do so.
This helps indicate, in a more readable fashion, that the code falls
through on below or equal and branches if it is not below or equal. Since
the instruction (JNBE) is easier to relate to the original test (<=)
than JA, this makes this section of code a little more readable.
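The same idea extends to the other relational tests: pick the negated mnemonic whose name still contains the high level condition. The following Python sketch tabulates one such mapping and fills in the if..then..else..endif skeleton used above. It is only an illustration; the table and function names are hypothetical, and the signed table is an addition for completeness (the examples in this section use the unsigned forms). Note that no "not not-equal" mnemonic exists, so the `<>` test has to use JE.

```python
# Jump that skips the "then" part when the HLL test is false, written in the
# negated (JNxx) form so the mnemonic still names the HLL test.
UNSIGNED_SKIP_JUMP = {
    "=": "jne", "<>": "je",
    "<": "jnb", "<=": "jnbe",
    ">": "jna", ">=": "jnae",
}
SIGNED_SKIP_JUMP = {
    "=": "jne", "<>": "je",
    "<": "jnl", "<=": "jnle",
    ">": "jng", ">=": "jnge",
}


def if_skeleton(left, operator, right, signed=False):
    """Return the lines of an if..then..else..endif skeleton for 'left operator right'."""
    table = SIGNED_SKIP_JUMP if signed else UNSIGNED_SKIP_JUMP
    return [
        f"mov ax, {left}",
        f"cmp ax, {right}",
        f"{table[operator]} ElsePart",
        "<< true code >>",
        "jmp IfDone",
        "ElsePart: << false code >>",
        "IfDone:",
    ]


if __name__ == "__main__":
    print("\n".join(if_skeleton("x", "<=", "y")))
```

For the `x <= y` test this reproduces the JNBE sequence above; passing signed=True selects JNLE instead.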
Note that MASM is a "high-level" assembler. It does things assemblers for other chips won't do, like checking the types of operands and reporting errors if there are mismatches. Some people who are used to assemblers on other machines find this annoying. However, it's a great idea in assembly language for the same reason it's a great idea in HLLs[25]. These features have one other beneficial side-effect: they help others understand what you're trying to do in your programs. It should come as no surprise, then, that this style guide will encourage the use of these features in your assembly language programs.
In its simplest form, the typedef directive behaves like a textequ. It lets you replace one string in your program with another. For example, you can create the following definitions with MASM:
char typedef byte
integer typedef sword
boolean typedef byte
float typedef real4
IntPtr typedef far ptr integer
Once you have declared these names, you can define char, integer, boolean, and float variables as follows:
MyChar char ?
I integer ?
Ptr2I IntPtr I
IsPresent boolean ?
ProfitsThisYear float ?
DupOperator = expression ws* 'DUP' ws* '(' ws* operand ws* ')' %%
Note that "expression" expands to a valid numeric value (or
numeric expression), "ws*" means "zero or more whitespace
characters" and "operand" expands to anything that is legal
in the operand field of a MASM word/dw, byte/db, etc., directive[27]. One would typically use this operator to
reserve a block of memory locations as follows:
ArrayName integer 16 dup (?) ;Declare array of 16 words.
This declaration would set aside 16 contiguous words in memory.
The interesting thing about the DUP operator is that any legal operand field for a directive like byte or word may appear inside the parentheses, including additional DUP expressions. The DUP operator simply says "duplicate this object the specified number of times." For example, "16 dup (1,2)" says "give me 16 copies of the value pair one and two." If this operand appeared in the operand field of a byte directive, it would reserve 32 bytes, containing the alternating values one and two.
So what happens if we apply this technique recursively? Well, "4 dup ( 3 dup (0))" when read recursively says "give me four copies of whatever is inside the (outermost) parentheses." This turns out to be the expression "3 dup (0)" that says "give me three zeros." Since the original operand says to give four copies of three copies of a zero, the end result is that this expression produces 12 zeros. Now consider the following two declarations:
Array1 integer 4 dup ( 3 dup (0))
Array2 integer 12 dup (0)
Both definitions set aside 12 integers in memory (initializing each to zero). To the assembler these are nearly identical; to the 80x86 they are absolutely identical. To the reader, however, they are obviously different. Were you to declare two identical one-dimensional arrays of integers, using two different declarations makes your program inconsistent and, therefore, harder to read.
However, we can exploit this difference to declare multidimensional arrays. The first example above suggests that we have four copies of an array containing three integers each. This corresponds to the popular row-major array access function. The second example above suggests that we have a single dimensional array containing 12 integers.
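The DUP expansion and the row-major layout described above can be modeled in a few lines of a high level language. The following Python sketch is only an illustration under stated assumptions: dup and row_major_offset are hypothetical helper names, lists stand in for assembled memory, and each integer is taken to be a two-byte word as in the typedef shown earlier.

```python
def dup(count, *operands):
    """Model 'count DUP (operands)': repeat the operand list count times.

    An operand that is itself a list (the result of a nested dup) is
    flattened, just as a nested DUP expands in place.
    """
    expanded = []
    for _ in range(count):
        for operand in operands:
            if isinstance(operand, list):
                expanded.extend(operand)
            else:
                expanded.append(operand)
    return expanded


def row_major_offset(row, column, columns_per_row, element_size):
    """Byte offset of element [row][column] in a row-major array."""
    return (row * columns_per_row + column) * element_size


INTEGER_SIZE = 2  # assumption: "integer" is a 16-bit sword

array1 = dup(4, dup(3, 0))   # Array1 integer 4 dup ( 3 dup (0))
array2 = dup(12, 0)          # Array2 integer 12 dup (0)
pairs = dup(16, 1, 2)        # 16 dup (1,2) in a byte directive

if __name__ == "__main__":
    print(len(array1), len(array2), len(pairs))
    print(row_major_offset(3, 2, 3, INTEGER_SIZE))
```

Both declarations expand to the same 12 zeros; only the first spelling tells the reader that the array is four rows of three, which is the shape row_major_offset assumes when it is called with three columns per row.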
s struct
a integer ?
b integer ?
s ends
.
.
.
r s {}
ptr2r dword r
.
.
.
les di, ptr2r
mov ax, es:[di].s.a ;No indication this is ptr2r!
.
.
.
mov es:[di].b, bx ;Really no indication!
Now consider the following:
s struct
a integer ?
b integer ?
s ends
sPtr typedef far ptr s
.
.
.
q s {}
r sPtr q
r@ textequ <es:[di].s>
.
.
.
les di, r
mov ax, r@.a ;Now it's clear this is using r
.
.
.
mov r@.b, bx ;Ditto.
Note that the "@" symbol is a legal identifier character to
MASM, hence "r@" is just another symbol. As a general rule you
should avoid using symbols like "@" in identifiers, but it
serves a good purpose here - it indicates we've got an indirect pointer.
Of course, you must always make sure to load the pointer into ES:DI when
using the textequ above. If you use several different segment/register
pairs to access the data that "r" points at, this trick may not
make the code any more readable since you will need several text equates
that all mean the same thing.
var
integer i, j, array[10], array2[10][3], *ptr2Int
char *FirstName, LastName[32]
endvar
These declarations emit the following assembly code:
i integer ?
j integer 25
array integer 10 dup (?)
array2 integer 10 dup ( 3 dup (?))
ptr2Int dword ?
LastName char 32 dup (?)
Name dword LastName
For those comfortable with C/C++ (and other HLLs) the UCR Standard Library declarations should look very familiar. For that reason, their use is a good idea when writing assembly code that uses the UCR Standard Library.