8 This file documents the GNU Assembler "as".
10 Copyright (C) 1991 Free Software Foundation, Inc.
12 Permission is granted to make and distribute verbatim copies of
13 this manual provided the copyright notice and this permission notice
14 are preserved on all copies.
Permission is granted to process this file through TeX and print the
results, provided the printed document carries copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).
23 Permission is granted to copy and distribute modified versions of this
24 manual under the conditions for verbatim copying, provided also that the
25 section entitled ``GNU General Public License'' is included exactly as
26 in the original, and provided that the entire resulting derived work is
27 distributed under the terms of a permission notice identical to this
30 Permission is granted to copy and distribute translations of this manual
31 into another language, under the above conditions for modified versions,
32 except that the section entitled ``GNU General Public License'' may be
33 included in a translation approved by the author instead of in the
37 @setchapternewpage odd
39 @c @settitle Using GNU as (680x0)
42 @settitle Using GNU as (AMD 29K)
47 @subtitle{The GNU Assembler}
49 @c @subtitle{for Motorola 680x0}
52 @subtitle{for the AMD 29K family}
55 @subtitle February 1991
57 The Free Software Foundation Inc. thanks The Nice Computer
58 Company of Australia for loaning Dean Elsner to write the
59 first (Vax) version of @code{as} for Project GNU.
60 The proprietors, management and staff of TNCCA thank FSF for
61 distracting the boss while they got some work
64 @author{Dean Elsner, Jay Fenlason & friends}
65 @author{revised by Roland Pesch for Cygnus Support}
69 \def\$#1${{#1}} % Kluge: collect RCS revision info without $...$
70 \xdef\manvers{\$Revision$} % For use in headers, footers too
72 \hfill Cygnus Support\par
74 \hfill \TeX{}info \texinfoversion\par
76 %"boxit" macro for figures:
77 %Modified from Knuth's ``boxit'' macro from TeXbook (answer to exercise 21.3)
78 \gdef\boxit#1#2{\vbox{\hrule\hbox{\vrule\kern3pt
79 \vbox{\parindent=0pt\parskip=0pt\hsize=#1\kern3pt\strut\hfil
80 #2\hfil\strut\kern3pt}\kern3pt\vrule}\hrule}}%box with visible outline
81 \gdef\ibox#1#2{\hbox to #1{#2\hfil}\kern8pt}% invisible box
84 @vskip 0pt plus 1filll
85 Copyright @copyright{} 1991 Free Software Foundation, Inc.
87 Permission is granted to make and distribute verbatim copies of
88 this manual provided the copyright notice and this permission notice
89 are preserved on all copies.
91 Permission is granted to copy and distribute modified versions of this
92 manual under the conditions for verbatim copying, provided also that the
93 section entitled ``GNU General Public License'' is included exactly as
94 in the original, and provided that the entire resulting derived work is
95 distributed under the terms of a permission notice identical to this
98 Permission is granted to copy and distribute translations of this manual
99 into another language, under the above conditions for modified versions,
100 except that the section entitled ``GNU General Public License'' may be
101 included in a translation approved by the author instead of in the
106 @node Top, Overview, (dir), (dir)
109 * Overview:: Overview
111 * Segments:: Segments and Relocation
113 * Expressions:: Expressions
114 * Pseudo Ops:: Assembler Directives
115 * Maintenance:: Maintaining the Assembler
116 * Retargeting:: Teaching the Assembler about a New Machine
117 * License:: GNU GENERAL PUBLIC LICENSE
119 --- The Detailed Node Listing ---
123 * Invoking:: Invoking @code{as}
124 * Manual:: Structure of this Manual
125 * GNU Assembler:: as, the GNU Assembler
126 * Command Line:: Command Line
127 * Input Files:: Input Files
128 * Object:: Output (Object) File
129 * Errors:: Error and Warning Messages
134 * Filenames:: Input Filenames and Line-numbers
138 * Pre-processing:: Pre-processing
139 * Whitespace:: Whitespace
140 * Comments:: Comments
141 * Symbol Intro:: Symbols
142 * Statements:: Statements
143 * Constants:: Constants
147 * Characters:: Character Constants
148 * Numbers:: Number Constants
155 Segments and Relocation
157 * Segs Background:: Background
158 * ld Segments:: ld Segments
159 * as Segments:: as Internal Segments
160 * Sub-Segments:: Sub-Segments
173 * Setting Symbols:: Giving Symbols Other Values
174 * Symbol Names:: Symbol Names
175 * Dot:: The Special Dot Symbol
176 * Symbol Attributes:: Symbol Attributes
180 * Local Symbols:: Local Symbol Names
184 * Symbol Value:: Value
186 * Symbol Desc:: Descriptor
187 * Symbol Other:: Other
191 * Empty Exprs:: Empty Expressions
192 * Integer Exprs:: Integer Expressions
196 * Arguments:: Arguments
197 * Operators:: Operators
198 * Prefix Ops:: Prefix Operators
199 * Infix Ops:: Infix Operators
203 * Abort:: The Abort directive causes as to abort
204 * Align:: Pad the location counter to a power of 2
205 * App-File:: Set the logical file name
206 * Ascii:: Fill memory with bytes of ASCII characters
207 * Asciz:: Fill memory with bytes of ASCII characters followed
209 * Byte:: Fill memory with 8-bit integers
210 * Comm:: Reserve public space in the BSS segment
211 * Data:: Change to the data segment
212 * Desc:: Set the n_desc of a symbol
213 * Double:: Fill memory with double-precision floating-point numbers
214 * Else:: @code{.else}
216 * Endif:: @code{.endif}
217 * Equ:: @code{.equ @var{symbol}, @var{expression}}
218 * Extern:: @code{.extern}
219 * Fill:: Fill memory with repeated values
220 * Float:: Fill memory with single-precision floating-point numbers
221 * Global:: Make a symbol visible to the linker
222 * Ident:: @code{.ident}
223 * If:: @code{.if @var{absolute expression}}
224 * Include:: @code{.include "@var{file}"}
225 * Int:: Fill memory with 32-bit integers
226 * Lcomm:: Reserve private space in the BSS segment
227 * Line:: Set the logical line number
228 * Ln:: @code{.ln @var{line-number}}
229 * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
230 * Long:: Fill memory with 32-bit integers
231 * Lsym:: Create a local symbol
232 * Octa:: Fill memory with 128-bit integers
233 * Org:: Change the location counter
234 * Quad:: Fill memory with 64-bit integers
235 * Set:: Set the value of a symbol
236 * Short:: Fill memory with 16-bit integers
237 * Single:: @code{.single @var{flonums}}
238 * Stab:: Store debugging information
239 * Text:: Change to the text segment
241 * Word:: Fill memory with 32-bit integers
242 @c else (not am29k or sparc)
243 * Deprecated:: Deprecated Directives
244 * Machine Options:: Options
245 * Machine Syntax:: Syntax
246 * Floating Point:: Floating Point
247 * Machine Directives:: Machine Directives
252 * block:: @code{.block @var{size} , @var{fill}}
253 * cputype:: @code{.cputype}
254 * file:: @code{.file}
255 * hword:: @code{.hword @var{expressions}}
256 * line:: @code{.line}
257 * reg:: @code{.reg @var{symbol}, @var{expression}}
258 * sect:: @code{.sect}
259 * use:: @code{.use @var{segment name}}
262 @node Overview, Syntax, Top, Top
265 This manual is a user guide to the GNU assembler @code{as}.
267 @c The following should be conditional on machine config
269 @c This version of the manual describes @code{as} configured to generate
270 @c code for Motorola 680x0 architectures.
273 This version of the manual describes @code{as} configured to generate
274 code for Advanced Micro Devices' 29K architectures.
278 * Invoking:: Invoking @code{as}
279 * Manual:: Structure of this Manual
280 * GNU Assembler:: as, the GNU Assembler
281 * Command Line:: Command Line
282 * Input Files:: Input Files
283 * Object:: Output (Object) File
284 * Errors:: Error and Warning Messages
288 @node Invoking, Manual, Overview, Overview
289 @section Invoking @code{as}
291 Here is a brief summary of how to invoke GNU @code{as}. For details,
294 @c We don't use @deffn and friends for the following because they seem
295 @c to be limited to one line for the header.
as [ -D ] [ -f ] [ -I @var{path} ] [ -k ] [ -L ] [ -o @var{objfile} ] [ -R ] [ -v ] [ -W ]
@c [ -l ] [ -mc68000 | -mc68010 | -mc68020 ]
@c@c am29k has no machine-dependent assembler options
[ -- | @var{files} @dots{} ]
310 This option is accepted only for script compatibility with calls to
311 other assemblers; it has no effect on GNU @code{as}.
314 ``fast''---skip preprocessing (assume source is compiler output)
317 Add @var{path} to the search list for @code{.include} directives
321 This option is accepted but has no effect on the 29K family.
324 @c Issue warnings when difference tables altered for long displacements
328 Keep (in symbol table) local symbols, starting with @samp{L}
330 @item -o @var{objfile}
331 Name the object-file output from @code{as}
334 Fold data segment into text segment
337 Suppress warning messages
341 @c Shorten references to undefined symbols, to one word instead of two
343 @c @item -mc68000 | -mc68010 | -mc68020
344 @c Specify what processor in the 68000 family is the target (default 68020)
347 @item -- | @var{files} @dots{}
348 Source files to assemble, or standard input
351 @node Manual, GNU Assembler, Invoking, Overview
352 @section Structure of this Manual
353 This document is intended to describe what you need to know to use GNU
354 @code{as}. We cover the syntax expected in source files, including
355 notation for symbols, constants, and expressions; the directives that
356 @code{as} understands; and of course how to invoke @code{as}.
359 @c We also cover special features in the 68000 configuration of @code{as},
360 @c including pseudo-operations.
363 We also cover special features in the AMD 29K configuration of @code{as},
364 including assembler directives.
368 This document also describes some of the
369 machine-dependent features of various flavors of the assembler.
370 This document also describes how the assembler works internally, and
371 provides some information that may be useful to people attempting to
372 port the assembler to another machine.
375 On the other hand, this manual is @emph{not} intended as an introduction
376 to programming in assembly language---let alone programming in general!
377 In a similar vein, we make no attempt to introduce the machine
378 architecture; we do @emph{not} describe the instruction set, standard
379 mnemonics, registers or addressing modes that are standard to a
380 particular architecture. You may want to consult the manufacturer's
381 machine architecture manual for this information.
384 @c I think this is premature---pesch@cygnus.com, 17jan1991
386 Throughout this document, we assume that you are running @dfn{GNU},
387 the portable operating system from the @dfn{Free Software
388 Foundation, Inc.}. This restricts our attention to certain kinds of
389 computer (in particular, the kinds of computers that GNU can run on);
390 once this assumption is granted examples and definitions need less
393 @code{as} is part of a team of programs that turn a high-level
394 human-readable series of instructions into a low-level
395 computer-readable series of instructions. Different versions of
396 @code{as} are used for different kinds of computer. In particular,
397 at the moment, @code{as} only works for the DEC Vax, the Motorola
398 680x0, the Intel 80386, the Sparc, and the National Semiconductor
402 @c There used to be a section "Terminology" here, which defined
403 @c "contents", "byte", "word", and "long". Defining "word" to any
404 @c particular size is confusing when the .word directive may generate 16
405 @c bits on one machine and 32 bits on another; in general, for the user
406 @c version of this manual, none of these terms seem essential to define.
407 @c They were used very little even in the former draft of the manual;
408 @c this draft makes an effort to avoid them (except in names of
411 @node GNU Assembler, Command Line, Manual, Overview
412 @section as, the GNU Assembler
413 @code{as} is primarily intended to assemble the output of the GNU C
414 compiler @code{gcc} for use by the linker @code{ld}. Nevertheless,
415 we've tried to make @code{as} assemble correctly everything that the native
419 Any exceptions are documented explicitly (@pxref{Machine Dependent}).
422 This doesn't mean @code{as} always uses the same syntax as another
423 assembler for the same architecture; for example, we know of several
424 incompatible versions of 680x0 assembly language syntax.
GNU @code{as} is really a family of assemblers. If you use (or have
used) GNU @code{as} on another architecture, you should find a fairly
similar environment. Each version has much in common with the others,
including object file formats, most assembler directives (often called
@dfn{pseudo-ops}) and assembler syntax.
432 Unlike older assemblers, @code{as} is designed to assemble a source
433 program in one pass of the source file. This has a subtle impact on the
434 @kbd{.org} directive (@pxref{Org}).
436 @node Command Line, Input Files, GNU Assembler, Overview
437 @section Command Line
439 as [ options @dots{} ] [ file1 @dots{} ]
After the program name @code{as}, the command line may contain
options and file names. Options may be in any order, and may be
before, after, or between file names. The order of file names is
significant.

@file{--} (two hyphens) by itself names the standard input file
448 explicitly, as one of the files for @code{as} to assemble.
450 Except for @samp{--} any command line argument that begins with a
451 hyphen (@samp{-}) is an option. Each option changes the behavior of
452 @code{as}. No option changes the way another option works. An
453 option is a @samp{-} followed by one or more letters; the case of
454 the letter is important. All options are optional.
456 Some options expect exactly one file name to follow them. The file
457 name may either immediately follow the option's letter (compatible
458 with older assemblers) or it may be the next command argument (GNU
459 standard). These two command lines are equivalent:
462 as -o my-object-file.o mumble
463 as -omy-object-file.o mumble
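
As an illustration only, the rules above can be modelled in a few lines
of Python. This is a sketch, not the argument parser used by @code{as}:
the names @code{split_command_line} and @code{TAKES_ARGUMENT} are
hypothetical, and we assume that only @samp{-o} and @samp{-I} expect a
name to follow them.

```python
# Hypothetical model of the command line rules; not the parser in as.
TAKES_ARGUMENT = ("o", "I")      # assumed: options that expect one name

def split_command_line(args):
    """Split the arguments after the program name into (options, files)."""
    options = []
    files = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            # Two hyphens by themselves name the standard input file.
            files.append("<standard input>")
        elif arg.startswith("-"):
            name = arg[1:]
            if name[:1] in TAKES_ARGUMENT:
                if len(name) > 1:
                    value = name[1:]         # -ofile: older assemblers
                elif i + 1 < len(args):
                    i += 1
                    value = args[i]          # -o file: GNU standard
                else:
                    value = None             # name missing; not modelled
                options.append((name[0], value))
            else:
                options.append((name, None))
        else:
            # Anything with no special meaning is an input file name.
            files.append(arg)
        i += 1
    return options, files
```

With this model the two command lines shown above give the same result:
both produce the option pair @code{("o", "my-object-file.o")} and the
single input file @file{mumble}.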
466 @node Input Files, Object, Command Line, Overview
469 We use the phrase @dfn{source program}, abbreviated @dfn{source}, to
470 describe the program input to one run of @code{as}. The program may
471 be in one or more files; how the source is partitioned into files
472 doesn't change the meaning of the source.
474 @c I added "con" prefix to "catenation" just to prove I can overcome my
475 @c APL training... pesch@cygnus.com
476 The source program is a concatenation of the text in all the files, in the
479 Each time you run @code{as} it assembles exactly one source
480 program. The source program is made up of one or more files.
481 (The standard input is also a file.)
483 You give @code{as} a command line that has zero or more input file
484 names. The input files are read (from left file name to right). A
485 command line argument (in any position) that has no special meaning
486 is taken to be an input file name.
488 If @code{as} is given no file names it attempts to read one input file
489 from @code{as}'s standard input, which is normally your terminal. You
490 may have to type @key{ctl-D} to tell @code{as} there is no more program
493 Use @samp{--} if you need to explicitly name the standard input file
494 in your command line.
496 If the source is empty, @code{as} will produce a small, empty object
500 * Filenames:: Input Filenames and Line-numbers
503 @node Filenames, , Input Files, Input Files
504 @subsection Input Filenames and Line-numbers
505 There are two ways of locating a line in the input file (or files) and both
506 are used in reporting error messages. One way refers to a line
507 number in a physical file; the other refers to a line number in a
510 @dfn{Physical files} are those files named in the command line given
513 @dfn{Logical files} are simply names declared explicitly by assembler
514 directives; they bear no relation to physical files. Logical file names
515 help error messages reflect the original source file, when @code{as}
516 source is itself synthesized from other files. @xref{App-File}.
518 @node Object, Errors, Input Files, Overview
519 @section Output (Object) File
520 Every time you run @code{as} it produces an output file, which is
521 your assembly language program translated into numbers. This file
522 is the object file, named @code{a.out} unless you tell @code{as} to
523 give it another name by using the @code{-o} option. Conventionally,
524 object file names end with @file{.o}. The default name of
525 @file{a.out} is used for historical reasons: older assemblers were
526 capable of assembling self-contained programs directly into a
528 @c This may still work, but hasn't been tested.
530 The object file is meant for input to the linker @code{ld}. It contains
531 assembled program code, information to help @code{ld} integrate
532 the assembled program into a runnable file, and (optionally) symbolic
533 information for the debugger.
535 @comment link above to some info file(s) like the description of a.out.
536 @comment don't forget to describe GNU info as well as Unix lossage.
538 @node Errors, Options, Object, Overview
539 @section Error and Warning Messages
541 @code{as} may write warnings and error messages to the standard error
542 file (usually your terminal). This should not happen when @code{as} is
543 run automatically by a compiler. Warnings report an assumption made so
544 that @code{as} could keep assembling a flawed program; errors report a
545 grave problem that stops the assembly.
547 Warning messages have the format
549 file_name:@b{NNN}:Warning Message Text
@noindent
(where @b{NNN} is a line number). If a logical file name has
been given (@pxref{App-File}) it is used for the filename, otherwise the
name of the current input file is used. If a logical line number was
given (@pxref{Line}) then it is used to calculate the number printed,
otherwise the actual line in the current source file is printed. The
message text is intended to be self-explanatory (in the grand Unix
559 Error messages have the format
561 file_name:@b{NNN}:FATAL:Error Message Text
The file name and line number are derived as for warning
messages. The actual message text may be rather less explanatory,
because many of these errors aren't supposed to happen.
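
The two message layouts can be sketched in Python as follows. This is
only a model of the formats shown above: @code{format_message} and its
parameters are hypothetical names, and the way a logical line number
enters the calculation is not modelled, so the caller supplies the
number to print.

```python
# Hypothetical model of the message layouts; not code taken from as.
def format_message(text, input_file, line_number,
                   logical_file=None, fatal=False):
    """Build one diagnostic line in the warning or the error layout."""
    # A logical file name, when one has been given, replaces the name
    # of the current input file.
    if logical_file is not None:
        name = logical_file
    else:
        name = input_file
    # Error messages carry the extra FATAL: field before the text.
    prefix = "FATAL:" if fatal else ""
    return "%s:%d:%s%s" % (name, line_number, prefix, text)
```

For example, @code{format_message("Missing operand", "foo.s", 12)}
yields @samp{foo.s:12:Missing operand}; the message text here is
invented for the example.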
567 @node Options, , Errors, Overview
569 @subsection @code{-D}
570 This option has no effect whatsoever, but it is accepted to make it more
571 likely that scripts written for other assemblers will also work with
574 @subsection Work Faster: @code{-f}
575 @samp{-f} should only be used when assembling programs written by a
576 (trusted) compiler. @samp{-f} stops the assembler from pre-processing
577 the input file(s) before assembling them.
579 @emph{Warning:} if the files actually need to be pre-processed (if they
580 contain comments, for example), @code{as} will not work correctly if
584 @subsection Add to @code{.include} search path: @code{-I} @var{path}
585 Use this option to add a @var{path} to the list of directories GNU
586 @code{as} will search for files specified in @code{.include} directives
587 (@pxref{Include}). You may use @code{-I} as many times as necessary to
588 include a variety of paths. The current working directory is always
589 searched first; after that, @code{as} searches any @samp{-I} directories
590 in the same order as they were specified (left to right) on the command
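
The search order can be pictured with a short Python sketch. The
function name @code{include_candidates} is hypothetical; the sketch
only lists the paths that would be tried, in order, and assumes nothing
about how @code{as} opens them.

```python
import os

# Hypothetical model of the .include search order; not code from as.
def include_candidates(name, include_dirs, cwd="."):
    """List the paths tried for an included file, first to last."""
    # The current working directory always comes first, followed by
    # the -I directories in the order given on the command line.
    search = [cwd] + list(include_dirs)
    return [os.path.join(directory, name) for directory in search]
```

Given @samp{-I inc -I /usr/inc}, the call
@code{include_candidates("defs.s", ["inc", "/usr/inc"])} lists the copy
in the current directory first, then the one under @file{inc}, then the
one under @file{/usr/inc}.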
593 @subsection Warn if difference tables altered: @code{-k}
595 On the AMD 29K family, this option is allowed, but has no effect. It is
596 permitted for compatibility with GNU @code{as} on other platforms,
597 where it can be used to warn when @code{as} alters the machine code
598 generated for @samp{.word} directives in difference tables. The AMD 29K
599 family does not have the addressing limitations that sometimes lead to this
600 alteration on other platforms.
605 @code{as} sometimes alters the code emitted for directives of the form
606 @samp{.word @var{sym1}-@var{sym2}}; @pxref{Word}.
607 You can use the @samp{-k} option if you want a warning issued when this
612 @subsection Include Local Labels: @code{-L}
Labels beginning with @samp{L} (upper case only) are called @dfn{local
labels}. @xref{Symbol Names}. Normally you don't see such labels when
debugging, because they are intended for the use of programs (like
compilers) that compose assembler programs, not for your notice.
By default both @code{as} and @code{ld} discard such labels, so you do
not usually debug with them.
620 This option tells @code{as} to retain those @samp{L@dots{}} symbols
621 in the object file. Usually if you do this you also tell the linker
622 @code{ld} to preserve symbols whose names begin with @samp{L}.
624 @subsection Name the Object File: @code{-o}
625 There is always one object file output when you run @code{as}. By
626 default it has the name @file{a.out}. You use this option (which
627 takes exactly one filename) to give the object file a different name.
629 Whatever the object file is called, @code{as} will overwrite any
630 existing file of the same name.
632 @subsection Fold Data Segment into Text Segment: @code{-R}
@code{-R} tells @code{as} to write the object file as if all
data-segment data lives in the text segment. This is only done at
the very last moment: your binary data are the same, but data
segment parts are relocated differently. The data segment part of
your object file is zero bytes long because all its bytes are
appended to the text segment. (@xref{Segments}.)
640 When you specify @code{-R} it would be possible to generate shorter
641 address displacements (because we don't have to cross between text and
642 data segment). We don't do this simply for compatibility with older
643 versions of @code{as}. In future, @code{-R} may work this way.
645 @subsection Suppress Warnings: @code{-W}
@code{as} should never give a warning or error message when
assembling compiler output. But programs written by people often
cause @code{as} to give a warning that a particular assumption was
made. All such warnings are directed to the standard error file.
If you use this option, no warnings are issued. This option only
affects the warning messages: it does not change any detail of how
@code{as} assembles your file. Errors, which stop the assembly, are
still reported.
655 @node Syntax, Segments, Overview, Top
This chapter describes the machine-independent syntax allowed in a
source file. @code{as} syntax is similar to what many other assemblers
use; it is inspired by the BSD 4.2 assembler.
664 @c assembler, except that @code{as} does not
665 @c assemble Vax bit-fields.
669 * Pre-processing:: Pre-processing
670 * Whitespace:: Whitespace
671 * Comments:: Comments
672 * Symbol Intro:: Symbols
673 * Statements:: Statements
674 * Constants:: Constants
677 @node Pre-processing, Whitespace, Syntax, Syntax
678 @section Pre-processing
683 adjusts and removes extra whitespace. It leaves one space or tab before
684 the keywords on a line, and turns any other whitespace on the line into
688 removes all comments, replacing them with a single space, or an
689 appropriate number of newlines.
692 converts character constants into the appropriate numeric values.
695 Excess whitespace, comments, and character constants
696 cannot be used in the portions of the input text that are not
If the first line of an input file is @code{#NO_APP} or the @samp{-f}
option is given, the input file will not be pre-processed. Within such
an input file, parts of the file can be pre-processed by putting a line
that says @code{#APP} before the text that should be pre-processed, and
putting a line that says @code{#NO_APP} after it. This feature is
mainly intended to support @code{asm} statements in compilers whose output
normally does not need to be pre-processed.
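
The effect of the @code{#APP} and @code{#NO_APP} lines can be sketched
in Python. This is a model under stated assumptions, not the
pre-processor itself: @code{preprocessed_flags} is a hypothetical name,
and we assume the marker lines are recognized only when they stand
alone on a line in a file that is not being pre-processed as a whole.

```python
# Hypothetical model of #APP / #NO_APP handling; not code from as.
def preprocessed_flags(lines, fast_option=False):
    """Return one boolean per line: True where pre-processing applies."""
    first = lines[0].rstrip("\n") if lines else ""
    switchable = fast_option or first == "#NO_APP"
    if not switchable:
        # An ordinary file is pre-processed from start to finish.
        return [True] * len(lines)
    active = False
    flags = []
    for line in lines:
        marker = line.rstrip("\n")
        if marker == "#APP":
            active = True        # pre-process the text that follows
        elif marker == "#NO_APP":
            active = False       # back to unprocessed compiler output
        flags.append(active)
    return flags
```

A compiler that emits @code{asm} statements written by a person would
bracket just those statements with the two marker lines, so that only
they pay the cost of pre-processing.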
707 @node Whitespace, Comments, Pre-processing, Syntax
709 @dfn{Whitespace} is one or more blanks or tabs, in any order.
710 Whitespace is used to separate symbols, and to make programs neater
711 for people to read. Unless within character constants
712 (@pxref{Characters}), any whitespace means the same as exactly one
715 @node Comments, Symbol Intro, Whitespace, Syntax
717 There are two ways of rendering comments to @code{as}. In both
718 cases the comment is equivalent to one space.
720 Anything from @samp{/*} through the next @samp{*/} is a comment.
721 This means you may not nest these comments.
725 The only way to include a newline ('\n') in a comment
726 is to use this sort of comment.
729 /* This sort of comment does not nest. */
732 Anything from the @dfn{line comment} character to the next newline
733 is considered a comment and is ignored. The line comment character is
735 @c @samp{#} on the Vax. @xref{Machine Dependent}. @refill
738 @c @samp{|} on the 680x0. @xref{Machine Dependent}. @refill
741 @samp{;} for the AMD 29K family. @xref{Machine Dependent}. @refill
745 On some machines there are two different line comment characters. One
746 will only begin a comment if it is the first non-whitespace character on
747 a line, while the other will always begin a comment.
To be compatible with past assemblers a special interpretation is
given to lines that begin with @samp{#}. Following the @samp{#} an
absolute expression (@pxref{Expressions}) is expected: this will be
the logical line number of the @b{next} line. Then a string
(@pxref{Strings}) is allowed: if present it is a new logical file
name. The rest of the line, if any, should be whitespace.
758 If the first non-whitespace characters on the line are not numeric,
759 the line is ignored. (Just like a comment.)
761 # This is an ordinary comment.
762 # 42-6 "new_file_name" # New logical file name
763 # This is logical line # 36.
765 This feature is deprecated, and may disappear from future versions
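
The interpretation of lines that begin with @samp{#} can be sketched in
Python. This is a rough model only: @code{parse_hash_line} is a
hypothetical name, and the sketch assumes the absolute expression is
built from decimal integers joined by @samp{+} and @samp{-}, which is
far less than a real expression allows.

```python
import re

# Hypothetical model of the special '#' lines; not code from as.
_HASH_LINE = re.compile(r'\s*#\s*(\d+(?:\s*[-+]\s*\d+)*)\s*(?:"([^"]*)")?')

def parse_hash_line(line):
    """Return (next_line_number, file_name_or_None), or None for a comment."""
    match = _HASH_LINE.match(line)
    if match is None:
        # The first non-whitespace characters after '#' are not numeric,
        # so the line is ignored like any other comment.
        return None
    total = 0
    for sign, digits in re.findall(r'([-+]?)\s*(\d+)', match.group(1)):
        value = int(digits)
        total += -value if sign == "-" else value
    return total, match.group(2)
```

Applied to the example lines above, the first and third lines give
@code{None}, while the second gives the logical line number 36 together
with the file name @file{new_file_name}.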
768 @node Symbol Intro, Statements, Comments, Syntax
770 A @dfn{symbol} is one or more characters chosen from the set of all
771 letters (both upper and lower case), digits and the three characters
772 @samp{_.$}. No symbol may begin with a digit. Case is significant.
773 There is no length limit: all characters are significant. Symbols are
774 delimited by characters not in that set, or by the beginning of a file
775 (since the source program must end with a newline, the end of a file is
776 not a possible symbol delimiter). @xref{Symbols}.
778 @node Statements, Constants, Symbol Intro, Syntax
780 A @dfn{statement} ends at a newline character (@samp{\n})
781 @c @if m680x0 (or is this if !am29k?)
782 @c or at a semicolon (@samp{;}). The newline or semicolon
783 @c fi m680x0 (or !am29k)
785 or an ``at'' sign (@samp{@@}). The newline or at sign
788 of the preceding statement. Newlines
789 @c if m680x0 (or !am29k)
791 @c fi m680x0 (or !am29k)
796 character constants are an exception: they don't end statements.
797 It is an error to end any statement with end-of-file: the last
798 character of any input file should be a newline.@refill
800 You may write a statement on more than one line if you put a
801 backslash (@kbd{\}) immediately in front of any newlines within the
802 statement. When @code{as} reads a backslashed newline both
803 characters are ignored. You can even put backslashed newlines in
804 the middle of symbol names without changing the meaning of your
807 An empty statement is allowed, and may include whitespace. It is ignored.
809 @c "key symbol" is not used elsewhere in the document; seems pedantic to
810 @c @defn{} it in that case, as was done previously... pesch@cygnus.com,
812 A statement begins with zero or more labels, optionally followed by a
813 key symbol which determines what kind of statement it is. The key
814 symbol determines the syntax of the rest of the statement. If the
815 symbol begins with a dot @samp{.} then the statement is an assembler
816 directive: typically valid for any computer. If the symbol begins with
817 a letter the statement is an assembly language @dfn{instruction}: it
818 will assemble into a machine language instruction. Different versions
819 of @code{as} for different computers will recognize different
820 instructions. In fact, the same symbol may represent a different
821 instruction in a different computer's assembly language.
823 A label is a symbol immediately followed by a colon (@code{:}).
824 Whitespace before a label or after a colon is permitted, but you may not
825 have whitespace between a label's symbol and its colon. @xref{Labels}.
828 label: .directive followed by something
829 another$label: # This is an empty statement.
830 instruction operand_1, operand_2, @dots{}
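
The shape of a statement can be sketched in Python. This is an
illustration under assumptions, not the parser in @code{as}:
@code{classify_statement} and @code{SYMBOL} are hypothetical names,
comments are not removed first, and any key symbol that does not begin
with a dot is treated as an instruction.

```python
import re

# Hypothetical model of statement structure; not code from as.
# A symbol: letters, digits and the characters _ . $, not starting
# with a digit.
SYMBOL = r'[A-Za-z_.$][A-Za-z0-9_.$]*'

def classify_statement(text):
    """Return (labels, kind, key_symbol) for one statement."""
    labels = []
    rest = text.strip()
    while True:
        # A label is a symbol immediately followed by a colon;
        # whitespace is allowed only after the colon.
        match = re.match(r'(%s):\s*' % SYMBOL, rest)
        if match is None:
            break
        labels.append(match.group(1))
        rest = rest[match.end():]
    match = re.match(SYMBOL, rest)
    if match is None:
        return labels, "empty", None
    key = match.group(0)
    kind = "directive" if key.startswith(".") else "instruction"
    return labels, kind, key
```

For the first example line above, the sketch reports the label
@code{label} and the directive @code{.directive}; for the second it
reports the label @code{another$label} and an empty statement.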
833 @node Constants, , Statements, Syntax
835 A constant is a number, written so that its value is known by
836 inspection, without knowing any context. Like this:
838 .byte 74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
839 .ascii "Ring the bell\7" # A string constant.
840 .octa 0x123456789abcdef0123456789ABCDEF0 # A bignum.
841 .float 0f-314159265358979323846264338327\
842 95028841971.693993751E-40 # - pi, a flonum.
846 * Characters:: Character Constants
847 * Numbers:: Number Constants
850 @node Characters, Numbers, Constants, Constants
851 @subsection Character Constants
852 There are two kinds of character constants. A @dfn{character} stands
853 for one character in one byte and its value may be used in
854 numeric expressions. String constants (properly called string
855 @emph{literals}) are potentially many bytes and their values may not be
856 used in arithmetic expressions.
863 @node Strings, Chars, Characters, Characters
864 @subsubsection Strings
865 A @dfn{string} is written between double-quotes. It may contain
866 double-quotes or null characters. The way to get special characters
867 into a string is to @dfn{escape} these characters: precede them with
868 a backslash @samp{\} character. For example @samp{\\} represents
869 one backslash: the first @code{\} is an escape which tells
870 @code{as} to interpret the second character literally as a backslash
871 (which prevents @code{as} from recognizing the second @code{\} as an
872 escape character). The complete list of escapes follows.
876 @c Mnemonic for ACKnowledge; for ASCII this is octal code 007.
878 Mnemonic for backspace; for ASCII this is octal code 010.
880 @c Mnemonic for EOText; for ASCII this is octal code 004.
882 Mnemonic for FormFeed; for ASCII this is octal code 014.
884 Mnemonic for newline; for ASCII this is octal code 012.
886 @c Mnemonic for prefix; for ASCII this is octal code 033, usually known as @code{escape}.
888 Mnemonic for carriage-Return; for ASCII this is octal code 015.
890 @c Mnemonic for space; for ASCII this is octal code 040. Included for compliance with
893 Mnemonic for horizontal Tab; for ASCII this is octal code 011.
895 @c Mnemonic for Vertical tab; for ASCII this is octal code 013.
896 @c @item \x @var{digit} @var{digit} @var{digit}
897 @c A hexadecimal character code. The numeric code is 3 hexadecimal digits.
898 @item \ @var{digit} @var{digit} @var{digit}
899 An octal character code. The numeric code is 3 octal digits.
900 For compatibility with other Unix systems, 8 and 9 are accepted as digits:
901 for example, @code{\008} has the value 010, and @code{\009} the value 011.
903 Represents one @samp{\} character.
905 @c Represents one @samp{'} (accent acute) character.
906 @c This is needed in single character literals
907 @c (@xref{Characters}.) to represent
910 Represents one @samp{"} character. Needed in strings to represent
911 this character, because an unescaped @samp{"} would end the string.
912 @item \ @var{anything-else}
913 Any other character when escaped by @kbd{\} will give a warning, but
914 assemble as if the @samp{\} was not present. The idea is that if
915 you used an escape sequence you clearly didn't want the literal
916 interpretation of the following character. However @code{as} has no
917 other interpretation, so @code{as} knows it is giving you the wrong
918 code and warns you of the fact.
921 Which characters are escapable, and what those escapes represent,
922 varies widely among assemblers. The current set is what we think
923 BSD 4.2 @code{as} recognizes, and is a subset of what most C
compilers recognize. If you are in doubt, don't use an escape sequence.
927 @node Chars, , Strings, Characters
928 @subsubsection Characters
929 A single character may be written as a single quote immediately
930 followed by that character. The same escapes apply to characters as
931 to strings. So if you want to write the character backslash, you
932 must write @kbd{'\\} where the first @code{\} escapes the second
933 @code{\}. As you can see, the quote is an acute accent, not a
934 grave accent. A newline
935 @c if 680x0 (or !am29k)
936 @c (or semicolon @samp{;})
937 @c fi 680x0 (or !am29k)
939 (or at sign @samp{@@})
942 following an acute accent is taken as a literal character and does
943 not count as the end of a statement. The value of a character
944 constant in a numeric expression is the machine's byte-wide code for
945 that character. @code{as} assumes your character code is ASCII: @kbd{'A}
946 means 65, @kbd{'B} means 66, and so on. @refill
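
As a small illustration of the escapes and character constants described
above, the following sketch uses only the @code{.ascii} and @code{.byte}
directives. The strings themselves are invented for this example, and
the comments sit on lines of their own so they do not depend on any
one target's comment character.

@example
# Each escape below assembles to a single byte.
        .ascii  "tab:\t newline:\n"
        .ascii  "quote:\" backslash:\\"
# Three octal digits give the byte value; 101 octal is ASCII A.
        .ascii  "\101"
# Two character constants: 65 (ASCII A), then one backslash.
        .byte   'A, '\\
@end example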
948 @node Numbers, , Characters, Constants
949 @subsection Number Constants
950 @code{as} distinguishes three kinds of numbers according to how they
951 are stored in the target machine. @emph{Integers} are numbers that
952 would fit into an @code{int} in the C language. @emph{Bignums} are
integers, but they are stored in more than 32 bits. @emph{Flonums}
954 are floating point numbers, described below.
956 @subsubsection Integers
957 A binary integer is @samp{0b} or @samp{0B} followed by zero or more of
958 the binary digits @samp{01}.
960 An octal integer is @samp{0} followed by zero or more of the octal
961 digits (@samp{01234567}).
963 A decimal integer starts with a non-zero digit followed by zero or
964 more digits (@samp{0123456789}).
966 A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
967 more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.
969 Integers have the usual values. To denote a negative integer, use
970 the prefix operator @samp{-} discussed under expressions
971 (@pxref{Prefix Ops}).
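
The four notations can be compared side by side. In this sketch, which
assumes only the @code{.long} directive, the first four statements
should all assemble the same value, 26.

@example
# Binary, octal, decimal and hexadecimal spellings of 26.
        .long   0b11010
        .long   032
        .long   26
        .long   0x1a
# A negative value is written with the prefix operator.
        .long   -26
@end example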
973 @subsubsection Bignums
974 A @dfn{bignum} has the same syntax and semantics as an integer
975 except that the number (or its negative) takes more than 32 bits to
976 represent in binary. The distinction is made because in some places
977 integers are permitted while bignums are not.
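
For instance, the constants below are too wide to be integers, so
@code{as} should treat them as bignums. This is only a sketch: it
assumes the @code{.quad} and @code{.octa} directives are among the
places where bignums are permitted.

@example
# 0x100000000 needs 33 bits, so it is a bignum rather than an integer.
        .quad   0x100000000
# A 128-bit bignum, written with 32 hexadecimal digits.
        .octa   0x0123456789abcdef0123456789abcdef
@end example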
979 @subsubsection Flonums
980 A @dfn{flonum} represents a floating point number. The translation is
981 complex: a decimal floating point number from the text is converted by
982 @code{as} to a generic binary floating point number of more than
983 sufficient precision. This generic floating point number is converted
984 to a particular computer's floating point format (or formats) by a
985 portion of @code{as} specialized to that computer.
987 A flonum is written by writing (in order)
993 One of the letters @samp{DFPRSX} (in upper or lower case), to tell
994 @code{as} the rest of the number is a flonum.
998 A letter, to tell @code{as} the rest of the number is a flonum. @kbd{e}
999 is recommended. Case is not important. (Any otherwise illegal letter
1000 will work here, but that might be changed. Vax BSD 4.2 assembler seems
1001 to allow any of @samp{defghDEFGH}.)
1005 An optional sign: either @samp{+} or @samp{-}.
1007 An optional @dfn{integer part}: zero or more decimal digits.
1009 An optional @dfn{fraction part}: @samp{.} followed by zero
1010 or more decimal digits.
1012 An optional exponent, consisting of:
1016 An @samp{E} or @samp{e}.
1019 A letter; the exact significance varies according to
1020 the computer that executes the program. @code{as}
1021 accepts any letter for now. Case is not important.
1025 Optional sign: either @samp{+} or @samp{-}.
1027 One or more decimal digits.
1031 At least one of @var{integer part} or @var{fraction part} must be
1032 present. The floating point number has the usual base-10 value.
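
The following sketch shows a few flonums as operands of @code{.float}
and @code{.double}. It assumes that a flonum begins with the digit
@samp{0} before the letter, and it uses the letters @samp{f} and
@samp{d}, which appear in both sets of letters listed above.

@example
# Integer part and fraction part, no exponent.
        .float  0f3.14159
# A signed value with a signed exponent: minus 1500.
        .double 0d-1.5e+3
# The integer part is omitted; the fraction part is present.
        .float  0f.5
@end example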
1034 @code{as} does all processing using integers. Flonums are computed
independently of any floating point hardware in the computer running
@code{as}.
1038 @node Segments, Symbols, Syntax, Top
1039 @chapter Segments and Relocation
1041 * Segs Background:: Background
1042 * ld Segments:: ld Segments
1043 * as Segments:: as Internal Segments
1044 * Sub-Segments:: Sub-Segments
1048 @node Segs Background, ld Segments, Segments, Segments
1050 Roughly, a segment is a range of addresses, with no gaps; all data
1051 ``in'' those addresses is treated the same for some particular purpose.
1052 For example there may be a ``read only'' segment.
1054 The linker @code{ld} reads many object files (partial programs) and
1055 combines their contents to form a runnable program. When @code{as}
1056 emits an object file, the partial program is assumed to start at address
1057 0. @code{ld} will assign the final addresses the partial program
1058 occupies, so that different partial programs don't overlap. This is
1059 actually an over-simplification, but it will suffice to explain how
1060 @code{as} uses segments.
1062 @code{ld} moves blocks of bytes of your program to their run-time
1063 addresses. These blocks slide to their run-time addresses as rigid
1064 units; their length does not change and neither does the order of bytes
1065 within them. Such a rigid unit is called a @emph{segment}. Assigning
1066 run-time addresses to segments is called @dfn{relocation}. It includes
1067 the task of adjusting mentions of object-file addresses so they refer to
1068 the proper run-time addresses.
1070 An object file written by @code{as} has three segments, any of which may
1071 be empty. These are named @dfn{text}, @dfn{data} and @dfn{bss}
1072 segments. Within the object file, the text segment starts at
1073 address @code{0}, the data segment follows, and the bss segment
1074 follows the data segment.
1076 To let @code{ld} know which data will change when the segments are
1077 relocated, and how to change that data, @code{as} also writes to the
1078 object file details of the relocation needed. To perform relocation
@code{ld} must know, each time an address in the object file is mentioned:
Where in the object file is the beginning of this reference to an address?
1086 How long (in bytes) is this reference?
1088 Which segment does the address refer to? What is the numeric value of
1090 (@var{address}) @minus{} (@var{start-address of segment})?
1093 Is the reference to an address ``Program-Counter relative''?
1096 In fact, every address @code{as} ever uses is expressed as
1097 @code{(@var{segment}) + (@var{offset into segment})}. Further, every
1098 expression @code{as} computes is of this segmented nature.
1099 @dfn{Absolute expression} means an expression with segment ``absolute''
1100 (@pxref{ld Segments}). A @dfn{pass1 expression} means an expression
1101 with segment ``pass1'' (@pxref{as Segments}). In this manual we use the
notation @{@var{segname} @var{N}@} to mean ``offset @var{N} into segment
@var{segname}''.
1105 Apart from text, data and bss segments you need to know about the
1106 @dfn{absolute} segment. When @code{ld} mixes partial programs,
1107 addresses in the absolute segment remain unchanged. That is, address
1108 @code{@{absolute 0@}} is ``relocated'' to run-time address 0 by @code{ld}.
1109 Although two partial programs' data segments will not overlap addresses
1110 after linking, @emph{by definition} their absolute segments will overlap.
1111 Address @code{@{absolute@ 239@}} in one partial program will always be the same
1112 address when the program is running as address @code{@{absolute@ 239@}} in any
1113 other partial program.
1115 The idea of segments is extended to the @dfn{undefined} segment. Any
1116 address whose segment is unknown at assembly time is by definition
1117 rendered @{undefined @var{U}@}---where @var{U} will be filled in later.
1118 Since numbers are always defined, the only way to generate an undefined
1119 address is to mention an undefined symbol. A reference to a named
1120 common block would be such a symbol: its value is unknown at assembly
1121 time so it has segment @emph{undefined}.
1123 By analogy the word @emph{segment} is used to describe groups of segments in
1124 the linked program. @code{ld} puts all partial programs' text
1125 segments in contiguous addresses in the linked program. It is
1126 customary to refer to the @emph{text segment} of a program, meaning all
the addresses of all partial programs' text segments. Likewise for
1128 data and bss segments.
1130 Some segments are manipulated by @code{ld}; others are invented for
1131 use of @code{as} and have no meaning except during assembly.
1134 * ld Segments:: ld Segments
1135 * as Segments:: as Internal Segments
1136 * Sub-Segments:: Sub-Segments
1140 @node ld Segments, as Segments, Segs Background, Segments
1141 @section ld Segments
1142 @code{ld} deals with just five kinds of segments, summarized below.
1148 These segments hold your program. @code{as} and @code{ld} treat them as
1149 separate but equal segments. Anything you can say of one segment is
1150 true of the other. When the program is running, however, it is
1151 customary for the text segment to be unalterable. The
1152 text segment is often shared among processes: it will contain
1153 instructions, constants and the like. The data segment of a running
1154 program is usually alterable: for example, C variables would be stored
1155 in the data segment.
1158 This segment contains zeroed bytes when your program begins running. It
is used to hold uninitialized variables or common storage. The length of
1160 each partial program's bss segment is important, but because it starts
1161 out containing zeroed bytes there is no need to store explicit zero
1162 bytes in the object file. The bss segment was invented to eliminate
1163 those explicit zeros from object files.
1165 @item absolute segment
1166 Address 0 of this segment is always ``relocated'' to runtime address 0.
1167 This is useful if you want to refer to an address that @code{ld} must
1168 not change when relocating. In this sense we speak of absolute
1169 addresses being ``unrelocatable'': they don't change during relocation.
1171 @item @code{undefined} segment
1172 This ``segment'' is a catch-all for address references to objects not in
1173 the preceding segments.
1174 @c FIXME: ref to some other doc on obj-file formats could go here.
1178 An idealized example of the 3 relocatable segments follows. Memory
1179 addresses are on the horizontal axis.
1184 partial program # 1: |ttttt|dddd|00|
1191 partial program # 2: |TTT|DDD|000|
1194 +--+---+-----+--+----+---+-----+~~
1195 linked program: | |TTT|ttttt| |dddd|DDD|00000|
1196 +--+---+-----+--+----+---+-----+~~
1198 addresses: 0 @dots{}
1202 \halign{\hfil\rm #\quad&#\cr
1204 &\ibox{2.5cm}{\tt text}\ibox{2cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1205 Partial program \#1:
1206 &\boxit{2.5cm}{\tt ttttt}\boxit{2cm}{\tt dddd}\boxit{1cm}{\tt 00}\cr
1208 &\ibox{1cm}{\tt text}\ibox{1.5cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1209 Partial program \#2:
1210 &\boxit{1cm}{\tt TTT}\boxit{1.5cm}{\tt DDDD}\boxit{1cm}{\tt 000}\cr
1212 &\ibox{.5cm}{}\ibox{1cm}{\tt text}\ibox{2.5cm}{}\ibox{.75cm}{}\ibox{2cm}{\tt data}\ibox{1.5cm}{}\ibox{2cm}{\tt bss}\cr
1214 &\boxit{.5cm}{}\boxit{1cm}{\tt TTT}\boxit{2.5cm}{\tt
1215 ttttt}\boxit{.75cm}{}\boxit{2cm}{\tt dddd}\boxit{1.5cm}{\tt
1216 DDDD}\boxit{2cm}{00000}\ \dots\cr
1222 @node as Segments, Sub-Segments, ld Segments, Segments
1223 @section as Internal Segments
1224 These segments are invented for the internal use of @code{as}. They
1225 have no meaning at run-time. You don't need to know about these
1226 segments except that they might be mentioned in @code{as}' warning
1227 messages. These segments are invented to permit the value of every
1228 expression in your assembly language program to be a segmented
1232 @item absent segment
An expression was expected and none was found.
1237 An internal assembler logic error has been
1238 found. This means there is a bug in the assembler.
1241 A @dfn{grand number} is a bignum or a flonum, but not an integer. If a
1242 number can't be written as a C @code{int} constant, it is a grand
1243 number. @code{as} has to remember that a flonum or a bignum does not
1244 fit into 32 bits, and cannot be an argument (@pxref{Arguments}) in an
1245 expression: this is done by making a flonum or bignum be in segment
1246 grand. This is purely for internal @code{as} convenience; grand
1247 segment behaves similarly to absolute segment.
1250 The expression was impossible to evaluate in the first pass. The
1251 assembler will attempt a second pass (second reading of the source) to
1252 evaluate the expression. Your expression mentioned an undefined symbol
1253 in a way that defies the one-pass (segment + offset in segment) assembly
1254 process. No compiler need emit such an expression.
1257 @emph{Warning:} the second pass is currently not implemented. @code{as}
1258 will abort with an error message if one is required.
1261 @item difference segment
1262 As an assist to the C compiler, expressions of the forms
(@var{undefined symbol}) @minus{} (@var{expression})
(@var{something}) @minus{} (@var{undefined symbol})
1266 (@var{undefined symbol}) @minus{} (@var{undefined symbol})
1268 are permitted, and belong to the difference segment. @code{as}
1269 re-evaluates such expressions after the source file has been read and
1270 the symbol table built. If by that time there are no undefined symbols
1271 in the expression then the expression assumes a new segment. The
1272 intention is to permit statements like
1273 @samp{.word label - base_of_table}
1274 to be assembled in one pass where both @code{label} and
1275 @code{base_of_table} are undefined. This is useful for compiling C and
1276 Algol switch statements, Pascal case statements, FORTRAN computed goto
1277 statements and the like.
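
A jump table built from statements like the one above might look like
the following sketch. The label names are hypothetical, and the
@code{.long 0} statements merely stand in for the code of each case.

@example
        .text
# Each entry is the distance from the start of the table to a case.
# The case labels are still undefined when the entries are read.
base_of_table:
        .word   case0 - base_of_table
        .word   case1 - base_of_table
case0:
        .long   0
case1:
        .long   0
@end example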
1280 @node Sub-Segments, bss, as Segments, Segments
1281 @section Sub-Segments
1282 Assembled bytes fall into two segments: text and data.
1283 Because you may have groups of text or data that you want to end up near
1284 to each other in the object file, @code{as} allows you to use
1285 @dfn{subsegments}. Within each segment, there can be numbered
1286 subsegments with values from 0 to 8192. Objects assembled into the same
1287 subsegment will be grouped with other objects in the same subsegment
1288 when they are all put into the object file. For example, a compiler
1289 might want to store constants in the text segment, but might not want to
1290 have them interspersed with the program being assembled. In this case,
the compiler could issue a @code{.text 0} before each section of code
being output, and a @code{.text 1} before each group of constants being
output.
1295 Subsegments are optional. If you don't use subsegments, everything
1296 will be stored in subsegment number zero.
1299 @c Each subsegment is zero-padded up to a multiple of four bytes.
1300 @c (Subsegments may be padded a different amount on different flavors
1304 On the AMD 29K family, no particular padding is added to segment sizes;
1305 GNU as forces no alignment on this platform.
1307 Subsegments appear in your object file in numeric order, lowest numbered
1308 to highest. (All this to be compatible with other people's assemblers.)
1309 The object file contains no representation of subsegments; @code{ld} and
1310 other programs that manipulate object files will see no trace of them.
1311 They just see all your text subsegments as a text segment, and all your
1312 data subsegments as a data segment.
1314 To specify which subsegment you want subsequent statements assembled
1315 into, use a @samp{.text @var{expression}} or a @samp{.data
1316 @var{expression}} statement. @var{Expression} should be an absolute
1317 expression. (@xref{Expressions}.) If you just say @samp{.text}
1318 then @samp{.text 0} is assumed. Likewise @samp{.data} means
1319 @samp{.data 0}. Assembly begins in @code{text 0}.
@example
.text 0     # The default subsegment is text 0 anyway.
.ascii "This lives in the first text subsegment. *"
.text 1
.ascii "But this lives in the second text subsegment."
.data 0
.ascii "This lives in the data segment,"
.ascii "in the first data subsegment."
.text 0
.ascii "This lives in the first text subsegment,"
.ascii "immediately following the asterisk (*)."
@end example
1334 Each segment has a @dfn{location counter} incremented by one for every
1335 byte assembled into that segment. Because subsegments are merely a
1336 convenience restricted to @code{as} there is no concept of a subsegment
1337 location counter. There is no way to directly manipulate a location
1338 counter---but the @code{.align} directive will change it, and any label
1339 definition will capture its current value. The location counter of the
1340 segment that statements are being assembled into is said to be the
1341 @dfn{active} location counter.
1343 @node bss, , Sub-Segments, Segments
1344 @section bss Segment
1345 The bss segment is used for local common variable storage.
1346 You may allocate address space in the bss segment, but you may
1347 not dictate data to load into it before your program executes. When
1348 your program starts running, all the contents of the bss
1349 segment are zeroed bytes.
1351 Addresses in the bss segment are allocated with special directives;
1352 you may not assemble anything directly into the bss segment. Hence
1353 there are no bss subsegments. @xref{Comm}; @pxref{Lcomm}.
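
As a sketch of how such allocation might look, the statements below
reserve space without assembling any bytes. The symbol names are
hypothetical, and the operand order (symbol, then length in bytes) is
an assumption to be checked against the descriptions of @code{.comm}
and @code{.lcomm}.

@example
# Reserve 256 zeroed bytes, private to this file, in the bss segment.
        .lcomm  buffer, 256
# Declare 64 bytes of common storage, visible to other files.
        .comm   shared, 64
@end example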
1355 @node Symbols, Expressions, Segments, Top
1357 Symbols are a central concept: the programmer uses symbols to name
things, the linker uses symbols to link, and the debugger uses symbols
to debug.
1362 @emph{Warning:} @code{as} does not place symbols in the object file in
1363 the same order they were declared. This may break some debuggers.
1368 * Setting Symbols:: Giving Symbols Other Values
1369 * Symbol Names:: Symbol Names
1370 * Dot:: The Special Dot Symbol
1371 * Symbol Attributes:: Symbol Attributes
1374 @node Labels, Setting Symbols, Symbols, Symbols
1376 A @dfn{label} is written as a symbol immediately followed by a colon
1377 @samp{:}. The symbol then represents the current value of the
1378 active location counter, and is, for example, a suitable instruction
1379 operand. You are warned if you use the same symbol to represent two
different locations: the first definition overrides any other
definitions.
1383 @node Setting Symbols, Symbol Names, Labels, Symbols
1384 @section Giving Symbols Other Values
1385 A symbol can be given an arbitrary value by writing a symbol, followed
1386 by an equals sign @samp{=}, followed by an expression
1387 (@pxref{Expressions}). This is equivalent to using the @code{.set}
1388 directive. @xref{Set}.
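
For example, the two statements in this sketch should leave both
symbols with the same absolute value, 64. The symbol names are
invented for the illustration.

@example
# Assignment with an equals sign.
bufsize = 0x40
# The equivalent .set directive.
        .set    buflen, 0x40
@end example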
1390 @node Symbol Names, Dot, Setting Symbols, Symbols
1391 @section Symbol Names
1392 Symbol names begin with a letter or with one of @samp{$._}. That
1393 character may be followed by any string of digits, letters,
1394 underscores and dollar signs. Case of letters is significant:
1395 @code{foo} is a different symbol name than @code{Foo}.
1398 For the AMD 29K family, @samp{?} is also allowed in the
1399 body of a symbol name, though not at its beginning.
1402 Each symbol has exactly one name. Each name in an assembly language
1403 program refers to exactly one symbol. You may use that symbol name any
1404 number of times in a program.
1407 * Local Symbols:: Local Symbol Names
1410 @node Local Symbols, , Symbol Names, Symbol Names
1411 @subsection Local Symbol Names
1413 Local symbols help compilers and programmers use names temporarily.
1414 There are ten local symbol names, which are re-used throughout the
1415 program. You may refer to them using the names @samp{0} @samp{1}
1416 @dots{} @samp{9}. To define a local symbol, write a label of the form
1417 @samp{@b{N}:} (where @b{N} represents any digit). To refer to the most
1418 recent previous definition of that symbol write @samp{@b{N}b}, using the
1419 same digit as when you defined the label. To refer to the next
1420 definition of a local label, write @samp{@b{N}f}---where @b{N} gives you
1421 a choice of 10 forward references. The @samp{b} stands for
1422 ``backwards'' and the @samp{f} stands for ``forwards''.
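
The sketch below defines the local symbol @samp{1} twice and refers to
it in both directions. The digit @samp{0} is avoided here only so that
a backward reference is not mistaken for the start of a binary integer.

@example
1:
# The address of the label just above.
        .long   1b
# The address of the label just below.
        .long   1f
1:
# Now the most recent definition is the second one.
        .long   1b
@end example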
1424 Local symbols are not emitted by the current GNU C compiler.
1426 There is no restriction on how you can use these labels, but
1427 remember that at any point in the assembly you can refer to at most
1428 10 prior local labels and to at most 10 forward local labels.
1430 Local symbol names are only a notation device. They are immediately
1431 transformed into more conventional symbol names before the assembler
1432 uses them. The symbol names stored in the symbol table, appearing in
1433 error messages and optionally emitted to the object file have these
1438 All local labels begin with @samp{L}. Normally both @code{as} and
1439 @code{ld} forget symbols that start with @samp{L}. These labels are
1440 used for symbols you are never intended to see. If you give the
1441 @samp{-L} option then @code{as} will retain these symbols in the
1442 object file. If you also instruct @code{ld} to retain these symbols,
1443 you may use them in debugging.
1446 If the label is written @samp{0:} then the digit is @samp{0}.
1447 If the label is written @samp{1:} then the digit is @samp{1}.
1448 And so on up through @samp{9:}.
1451 This unusual character is included so you don't accidentally invent
1452 a symbol of the same name. The character has ASCII value
1455 @item @emph{ordinal number}
1456 This is a serial number to keep the labels distinct. The first
@samp{0:} gets the number @samp{1}; the 15th @samp{0:} gets the
number @samp{15}; @emph{etc.} Likewise for the other labels @samp{1:}
1462 For instance, the first @code{1:} is named @code{L1@ctrl{A}1}, the 44th
1463 @code{3:} is named @code{L3@ctrl{A}44}.
1465 @node Dot, Symbol Attributes, Symbol Names, Symbols
1466 @section The Special Dot Symbol
1468 The special symbol @samp{.} refers to the current address that
1469 @code{as} is assembling into. Thus, the expression @samp{melvin:
1470 .long .} will cause @code{melvin} to contain its own address.
1471 Assigning a value to @code{.} is treated the same as a @code{.org}
1472 directive. Thus, the expression @samp{.=.+4} is the same as saying
1480 @node Symbol Attributes, , Dot, Symbols
1481 @section Symbol Attributes
1482 Every symbol has these attributes: Value, Type, Descriptor, and ``Other''.
1484 @c The detailed definitions are in <a.out.h>.
1487 If you use a symbol without defining it, @code{as} assumes zero for
1488 all these attributes, and probably won't warn you. This makes the
1489 symbol an externally defined symbol, which is generally what you
1493 * Symbol Value:: Value
1494 * Symbol Type:: Type
1495 * Symbol Desc:: Descriptor
1496 * Symbol Other:: Other
1499 @node Symbol Value, Symbol Type, Symbol Attributes, Symbol Attributes
1501 The value of a symbol is (usually) 32 bits, the size of one GNU C
1502 @code{int}. For a symbol which labels a location in the
1503 text, data, bss or absolute segments the
1504 value is the number of addresses from the start of that segment to
1505 the label. Naturally for text, data and bss
1506 segments the value of a symbol changes as @code{ld} changes segment
base addresses during linking. Absolute symbols' values do
1508 not change during linking: that is why they are called absolute.
1510 The value of an undefined symbol is treated in a special way. If it is
1511 0 then the symbol is not defined in this assembler source program, and
1512 @code{ld} will try to determine its value from other programs it is
1513 linked with. You make this kind of symbol simply by mentioning a symbol
1514 name without defining it. A non-zero value represents a @code{.comm}
1515 common declaration. The value is how much common storage to reserve, in
1516 bytes (addresses). The symbol refers to the first address of the
1519 @node Symbol Type, Symbol Desc, Symbol Value, Symbol Attributes
1521 The type attribute of a symbol is 8 bits encoded in a devious way.
We kept this coding standard for compatibility with older operating
systems.
1528 7 6 5 4 3 2 1 0 bit numbers
1529 +-----+-----+-----+-----+-----+-----+-----+-----+
1531 | N_STAB bits | N_TYPE bits |N_EXT|
1533 +-----+-----+-----+-----+-----+-----+-----+-----+
1541 \ibox{3cm}{7}\ibox{4cm}{4}\ibox{1cm}{0}&bit numbers\cr
1542 \boxit{3cm}{{\tt N\_STAB} bits}\boxit{4cm}{{\tt N\_TYPE}
1543 bits}\boxit{1cm}{\tt N\_EXT}\cr
1544 \hfill {\bf Type} byte\hfill\cr
1548 @subsubsection @code{N_EXT} bit
1549 This bit is set if @code{ld} might need to use the symbol's type bits
1550 and value. If this bit is off, then @code{ld} can ignore the
1551 symbol while linking. It is set in two cases. If the symbol is
1552 undefined, then @code{ld} is expected to find the symbol's value
1553 elsewhere in another program module. Otherwise the symbol has the
1554 value given, but this symbol name and value are revealed to any other
1555 programs linked in the same executable program. This second use of
1556 the @code{N_EXT} bit is most often made by a @code{.globl} statement.
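
A minimal sketch of the second case follows. The symbol names are
hypothetical. Only @code{entry} is named in a @code{.globl} statement,
so only @code{entry} would be expected to have its @code{N_EXT} bit set;
@code{helper} would stay private to this file.

@example
        .globl  entry
        .text
entry:
        .long   0
helper:
        .long   0
@end example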
1558 @subsubsection @code{N_TYPE} bits
1559 These establish the symbol's ``type'', which is mainly a relocation
1560 concept. Common values are detailed in the manual describing the
1561 executable file format.
1563 @subsubsection @code{N_STAB} bits
1564 Common values for these bits are described in the manual on the
1565 executable file format.
1567 @node Symbol Desc, Symbol Other, Symbol Type, Symbol Attributes
1568 @subsection Descriptor
1569 This is an arbitrary 16-bit value. You may establish a symbol's
1570 descriptor value by using a @code{.desc} statement (@pxref{Desc}).
1571 A descriptor value means nothing to @code{as}.
1573 @node Symbol Other, , Symbol Desc, Symbol Attributes
1575 This is an arbitrary 8-bit value. It means nothing to @code{as}.
1577 @node Expressions, Pseudo Ops, Symbols, Top
1578 @chapter Expressions
1579 An @dfn{expression} specifies an address or numeric value.
1580 Whitespace may precede and/or follow an expression.
1583 * Empty Exprs:: Empty Expressions
1584 * Integer Exprs:: Integer Expressions
1587 @node Empty Exprs, Integer Exprs, Expressions, Expressions
1588 @section Empty Expressions
1589 An empty expression has no value: it is just whitespace or null.
1590 Wherever an absolute expression is required, you may omit the
1591 expression and @code{as} will assume a value of (absolute) 0. This
1592 is compatible with other assemblers.
1594 @node Integer Exprs, , Empty Exprs, Expressions
1595 @section Integer Expressions
1596 An @dfn{integer expression} is one or more @emph{arguments} delimited
1597 by @emph{operators}.
1600 * Arguments:: Arguments
1601 * Operators:: Operators
1602 * Prefix Ops:: Prefix Operators
1603 * Infix Ops:: Infix Operators
1606 @node Arguments, Operators, Integer Exprs, Integer Exprs
1607 @subsection Arguments
1609 @dfn{Arguments} are symbols, numbers or subexpressions. In other
1610 contexts arguments are sometimes called ``arithmetic operands''. In
1611 this manual, to avoid confusing them with the ``instruction operands'' of
1612 the machine language, we use the term ``argument'' to refer to parts of
1613 expressions only, reserving the word ``operand'' to refer only to machine
1614 instruction operands.
1616 Symbols are evaluated to yield @{@var{segment} @var{NNN}@} where
1617 @var{segment} is one of text, data, bss, absolute,
1618 or @code{undefined}. @var{NNN} is a signed, 2's complement 32 bit
1621 Numbers are usually integers.
1623 A number can be a flonum or bignum. In this case, you are warned
1624 that only the low order 32 bits are used, and @code{as} pretends
1625 these 32 bits are an integer. You may write integer-manipulating
instructions that act on exotic constants, compatible with other
assemblers.
1629 Subexpressions are a left parenthesis @samp{(} followed by an integer
1630 expression, followed by a right parenthesis @samp{)}; or a prefix
1631 operator followed by an argument.
1633 @node Operators, Prefix Ops, Arguments, Integer Exprs
1634 @subsection Operators
1635 @dfn{Operators} are arithmetic functions, like @code{+} or @code{%}. Prefix
1636 operators are followed by an argument. Infix operators appear
between their arguments. Operators may be preceded and/or followed by
whitespace.
1640 @node Prefix Ops, Infix Ops, Operators, Integer Exprs
1641 @subsection Prefix Operators
1642 @code{as} has the following @dfn{prefix operators}. They each take
1643 one argument, which must be absolute.
1646 @dfn{Negation}. Two's complement negation.
1648 @dfn{Complementation}. Bitwise not.
1651 @node Infix Ops, , Prefix Ops, Integer Exprs
1652 @subsection Infix Operators
1654 @dfn{Infix operators} take two arguments, one on either side. Operators
1655 have precedence, but operations with equal precedence are performed left
1656 to right. Apart from @code{+} or @code{-}, both arguments must be
1657 absolute, and the result is absolute.
1665 @dfn{Multiplication}.
1667 @dfn{Division}. Truncation is the same as the C operator @samp{/}
1672 @dfn{Shift Left}. Same as the C operator @samp{<<}
1675 @dfn{Shift Right}. Same as the C operator @samp{>>}
1679 Intermediate precedence
1682 @dfn{Bitwise Inclusive Or}.
1686 @dfn{Bitwise Exclusive Or}.
1688 @dfn{Bitwise Or Not}.
1695 @dfn{Addition}. If either argument is absolute, the result
1696 has the segment of the other argument.
1697 If either argument is pass1 or undefined, the result is pass1.
1698 Otherwise @code{+} is illegal.
1700 @dfn{Subtraction}. If the right argument is absolute, the
1701 result has the segment of the left argument.
1702 If either argument is pass1 the result is pass1.
1703 If either argument is undefined the result is difference segment.
1704 If both arguments are in the same segment, the result is absolute---provided
1705 that segment is one of text, data or bss.
1706 Otherwise subtraction is illegal.
1710 The sense of the rule for addition is that it's only meaningful to add
1711 the @emph{offsets} in an address; you can only have a defined segment in
1712 one of the two arguments.
1714 Similarly, you can't subtract quantities from two different segments.
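
The rules for @code{+} and @code{-} can be seen in the following
sketch. The label names are hypothetical, and the last statement is
left as a comment because, by the rules above, it should be rejected.

@example
        .text
start:
        .long   0
finish:
        .data
table:
# Same segment on both sides: the difference is absolute (here 4).
        .long   finish - start
# Text plus absolute: the result stays in the text segment.
        .long   start + 8
# Different segments on the two sides, so this would be illegal:
#       .long   table - start
@end example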
1716 @node Pseudo Ops, Machine Dependent, Expressions, Top
1717 @chapter Assembler Directives
1719 * Abort:: The Abort directive causes as to abort
1720 * Align:: Pad the location counter to a power of 2
1721 * App-File:: Set the logical file name
1722 * Ascii:: Fill memory with bytes of ASCII characters
1723 * Asciz:: Fill memory with bytes of ASCII characters followed
1725 * Byte:: Fill memory with 8-bit integers
1726 * Comm:: Reserve public space in the BSS segment
1727 * Data:: Change to the data segment
1728 * Desc:: Set the n_desc of a symbol
1729 * Double:: Fill memory with double-precision floating-point numbers
1730 * Else:: @code{.else}
1732 * Endif:: @code{.endif}
1733 * Equ:: @code{.equ @var{symbol}, @var{expression}}
1734 * Extern:: @code{.extern}
1735 * Fill:: Fill memory with repeated values
1736 * Float:: Fill memory with single-precision floating-point numbers
1737 * Global:: Make a symbol visible to the linker
1738 * Ident:: @code{.ident}
1739 * If:: @code{.if @var{absolute expression}}
1740 * Include:: @code{.include "@var{file}"}
1741 * Int:: Fill memory with 32-bit integers
1742 * Lcomm:: Reserve private space in the BSS segment
1743 * Line:: Set the logical line number
1744 * Ln:: @code{.ln @var{line-number}}
1745 * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
1746 * Long:: Fill memory with 32-bit integers
1747 * Lsym:: Create a local symbol
1748 * Octa:: Fill memory with 128-bit integers
1749 * Org:: Change the location counter
1750 * Quad:: Fill memory with 64-bit integers
1751 * Set:: Set the value of a symbol
1752 * Short:: Fill memory with 16-bit integers
1753 * Single:: @code{.single @var{flonums}}
1754 * Stab:: Store debugging information
1755 * Text:: Change to the text segment
1756 @c if am29k or sparc
1757 * Word:: Fill memory with 32-bit integers
1758 @c else (not am29k or sparc)
1759 * Deprecated:: Deprecated Directives
1760 * Machine Options:: Options
1761 * Machine Syntax:: Syntax
1762 * Floating Point:: Floating Point
1763 * Machine Directives:: Machine Directives
1767 All assembler directives have names that begin with a period (@samp{.}).
1768 The rest of the name is letters: their case does not matter.
1770 This chapter discusses directives present in all versions of GNU
1771 @code{as}; @pxref{Machine Dependent} for additional directives.
1773 @node Abort, Align, Pseudo Ops, Pseudo Ops
1774 @section @code{.abort}
1775 This directive stops the assembly immediately. It is for
1776 compatibility with other assemblers. The original idea was that the
1777 assembler program would be piped into the assembler. If the sender
of a program quit, it could use this directive to tell @code{as} to
1779 quit also. One day @code{.abort} will not be supported.
1781 @node Align, App-File, Abort, Pseudo Ops
1782 @section @code{.align @var{absolute-expression} , @var{absolute-expression}}
1783 Pad the location counter (in the current subsegment) to a particular
1784 storage boundary. The first expression is the number of low-order zero
1785 bits the location counter will have after advancement. For example
@samp{.align 3} will advance the location counter until it is a multiple of
8. If the location counter is already a multiple of 8, no change is needed.
1790 The second expression gives the value to be stored in the padding
1791 bytes. It (and the comma) may be omitted. If it is omitted, the
1792 padding bytes are zero.
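
A short sketch of the convention described above, in which the first
expression counts low-order zero bits; the label @code{table} is
hypothetical:

@example
        .byte 1
        .align 2            /* advance to a multiple of 4, padding with zeros */
table:
        .byte 2
        .align 3, 0xff      /* advance to a multiple of 8, padding with 0xff */
@end example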
1794 @node App-File, Ascii, Align, Pseudo Ops
1795 @section @code{.app-file @var{string}}
1796 @code{.app-file} tells @code{as} that we are about to start a new
1797 logical file. @var{String} is the new file name. In general, the
1798 filename is recognized whether or not it is surrounded by quotes @samp{"};
but if you wish to specify an empty file name, you must give the
quotes: @code{""}. This statement may go away in the
future: it is only recognized to be compatible with old @code{as} programs.
1804 @node Ascii, Asciz, App-File, Pseudo Ops
1805 @section @code{.ascii "@var{string}"}@dots{}
1806 @code{.ascii} expects zero or more string literals (@pxref{Strings})
1807 separated by commas. It assembles each string (with no automatic
1808 trailing zero byte) into consecutive addresses.
1810 @node Asciz, Byte, Ascii, Pseudo Ops
1811 @section @code{.asciz "@var{string}"}@dots{}
1812 @code{.asciz} is just like @code{.ascii}, but each string is followed by
1813 a zero byte. The ``z'' in @samp{.asciz} stands for ``zero''.
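
For example, the following sketch contrasts the two directives:

@example
        .ascii "abc"        /* three bytes, no trailing zero byte */
        .asciz "abc"        /* four bytes: the string plus a zero byte */
        .ascii "ab", "cd"   /* four bytes at consecutive addresses */
@end example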
1815 @node Byte, Comm, Asciz, Pseudo Ops
1816 @section @code{.byte @var{expressions}}
1818 @code{.byte} expects zero or more expressions, separated by commas.
1819 Each expression is assembled into the next byte.
1821 @node Comm, Data, Byte, Pseudo Ops
1822 @section @code{.comm @var{symbol} , @var{length} }
1823 @code{.comm} declares a named common area in the bss segment. Normally
1824 @code{ld} reserves memory addresses for it during linking, so no partial
1825 program defines the location of the symbol. Use @code{.comm} to tell
1826 @code{ld} that it must be at least @var{length} bytes long. @code{ld}
1827 will allocate space for each @code{.comm} symbol that is at least as
1828 long as the longest @code{.comm} request in any of the partial programs
1829 linked. @var{length} is an absolute expression.
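
As a sketch, with a hypothetical symbol name @code{buffer}:

@example
        .comm buffer, 256   /* this file asks for at least 256 bytes */
@end example

@noindent
If another partial program in the same link contained
@samp{.comm buffer, 1024}, the rule above implies that @code{ld} would
allocate at least 1024 bytes for @code{buffer}.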
1831 @node Data, Desc, Comm, Pseudo Ops
1832 @section @code{.data @var{subsegment}}
1833 @code{.data} tells @code{as} to assemble the following statements onto the
1834 end of the data subsegment numbered @var{subsegment} (which is an
absolute expression). If @var{subsegment} is omitted, it defaults to zero.
1838 @node Desc, Double, Data, Pseudo Ops
1839 @section @code{.desc @var{symbol}, @var{absolute-expression}}
1840 This directive sets the descriptor of the symbol (@pxref{Symbol Attributes})
1841 to the low 16 bits of @var{absolute-expression}.
1843 @node Double, Else, Desc, Pseudo Ops
1844 @section @code{.double @var{flonums}}
1845 @code{.double} expects zero or more flonums, separated by commas. It assembles
1846 floating point numbers.
1848 @c The exact kind of floating point numbers
1849 @c emitted depends on how @code{as} is configured. @xref{Machine
1853 On the AMD 29K family the floating point format used is IEEE.
1856 @node Else, End, Double, Pseudo Ops
1857 @section @code{.else}
1858 @code{.else} is part of the @code{as} support for conditional assembly;
1859 @pxref{If}. It marks the beginning of a section of code to be assembled
1860 if the condition for the preceding @code{.if} was false.
1863 @node End, Endif, Else, Pseudo Ops
1864 @section @code{.end}
1865 This doesn't do anything---but isn't an s_ignore, so I suspect it's
1866 meant to do something eventually (which is why it isn't documented here
1867 as "for compatibility with blah").
1870 @node Endif, Equ, End, Pseudo Ops
1871 @section @code{.endif}
1872 @code{.endif} is part of the @code{as} support for conditional assembly;
1873 it marks the end of a block of code that is only assembled
1874 conditionally. @xref{If}.
1876 @node Equ, Extern, Endif, Pseudo Ops
1877 @section @code{.equ @var{symbol}, @var{expression}}
1879 This directive sets the value of @var{symbol} to @var{expression}.
1880 It is synonymous with @samp{.set}; @pxref{Set}.
1882 @node Extern, Fill, Equ, Pseudo Ops
1883 @section @code{.extern}
1884 @code{.extern} is accepted in the source program---for compatibility
1885 with other assemblers---but it is ignored. GNU @code{as} treats
1886 all undefined symbols as external.
1888 @node Fill, Float, Extern, Pseudo Ops
1889 @section @code{.fill @var{repeat} , @var{size} , @var{value}}
@var{repeat}, @var{size} and @var{value} are absolute expressions.
1891 This emits @var{repeat} copies of @var{size} bytes. @var{Repeat}
1892 may be zero or more. @var{Size} may be zero or more, but if it is
1893 more than 8, then it is deemed to have the value 8, compatible with
1894 other people's assemblers. The contents of each @var{repeat} bytes
1895 is taken from an 8-byte number. The highest order 4 bytes are
1896 zero. The lowest order 4 bytes are @var{value} rendered in the
1897 byte-order of an integer on the computer @code{as} is assembling for.
1898 Each @var{size} bytes in a repetition is taken from the lowest order
1899 @var{size} bytes of this number. Again, this bizarre behavior is
1900 compatible with other people's assemblers.
1902 @var{Size} and @var{value} are optional.
1903 If the second comma and @var{value} are absent, @var{value} is
1904 assumed zero. If the first comma and following tokens are absent,
1905 @var{size} is assumed to be 1.
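
The defaults can be illustrated with a brief sketch; the values are
arbitrary:

@example
        .fill 4                 /* size defaults to 1: four zero bytes */
        .fill 3, 2              /* value defaults to 0: three 2-byte zeros */
        .fill 2, 4, 0x11223344  /* two 4-byte copies of the value */
@end example

@noindent
The order of the bytes emitted by the last line depends on the target
machine, as described above.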
1907 @node Float, Global, Fill, Pseudo Ops
1908 @section @code{.float @var{flonums}}
1909 This directive assembles zero or more flonums, separated by commas. It
1910 has the same effect as @code{.single}.
1912 @c The exact kind of floating point numbers emitted depends on how
1913 @c @code{as} is configured.
1914 @c @xref{Machine Dependent}.
1917 The floating point format used for the AMD 29K family is IEEE.
1920 @node Global, Ident, Float, Pseudo Ops
1921 @section @code{.global @var{symbol}}, @code{.globl @var{symbol}}
1922 @code{.global} makes the symbol visible to @code{ld}. If you define
1923 @var{symbol} in your partial program, its value is made available to
1924 other partial programs that are linked with it. Otherwise,
1925 @var{symbol} will take its attributes from a symbol of the same name
1926 from another partial program it is linked with.
1928 This is done by setting the @code{N_EXT} bit of that symbol's type byte
1929 to 1. @xref{Symbol Attributes}.
1931 Both spellings (@samp{.globl} and @samp{.global}) are accepted, for
1932 compatibility with other assemblers.
1934 @node Ident, If, Global, Pseudo Ops
1935 @section @code{.ident}
1936 This directive is used by some assemblers to place tags in object files.
1937 GNU @code{as} simply accepts the directive for source-file
compatibility with such assemblers, but does not actually emit anything for it.
1941 @node If, Include, Ident, Pseudo Ops
1942 @section @code{.if @var{absolute expression}}
1943 @code{.if} marks the beginning of a section of code which is only
1944 considered part of the source program being assembled if the argument
1945 (which must be an @var{absolute expression}) is non-zero. The end of
1946 the conditional section of code must be marked by @code{.endif}
1947 (@pxref{Endif}); optionally, you may include code for the
alternative condition, flagged by @code{.else} (@pxref{Else}).
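
A minimal sketch of a conditional block, assuming a hypothetical symbol
@code{debug} that has been given an absolute value:

@example
        .set debug, 1
        .if debug
        .ascii "debug build"
        .else
        .ascii "release build"
        .endif
@end example

@noindent
With @code{debug} set to a non-zero value only the first @code{.ascii}
is assembled; setting it to zero selects the second.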
1950 The following variants of @code{.if} are also supported:
1952 @item ifdef @var{symbol}
Assembles the following section of code if the specified @var{symbol}
has been defined.
1961 @item ifndef @var{symbol}
1962 @itemx ifnotdef @var{symbol}
1963 Assembles the following section of code if the specified @var{symbol}
1964 has not been defined. Both spelling variants are equivalent.
1968 NO bogons, I presume?
1972 @node Include, Int, If, Pseudo Ops
1973 @section @code{.include "@var{file}"}
1974 This directive provides a way to include supporting files at specified
1975 points in your source program. The code from @var{file} is assembled as
1976 if it followed the point of the @code{.include}; when the end of the
1977 included file is reached, assembly of the original file continues. You
1978 can control the search paths used with the @samp{-I} command-line option
1979 (@pxref{Options}). Quotation marks are required around @var{file}.
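
For instance, assuming a hypothetical file named @file{defs.s} somewhere on
the search path:

@example
        .include "defs.s"
@end example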
1981 @node Int, Lcomm, Include, Pseudo Ops
1982 @section @code{.int @var{expressions}}
1983 Expect zero or more @var{expressions}, of any segment, separated by
1984 commas. For each expression, emit a 32-bit number that will, at run
1985 time, be the value of that expression. The byte order of the
1986 expression depends on what kind of computer will run the program.
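
A brief sketch; @code{target} is a hypothetical symbol that may be defined
in any segment:

@example
        .int 1, 0x7fffffff, target + 8
@end example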
1988 @node Lcomm, Line, Int, Pseudo Ops
1989 @section @code{.lcomm @var{symbol} , @var{length}}
1990 Reserve @var{length} (an absolute expression) bytes for a local
1991 common denoted by @var{symbol}. The segment and value of @var{symbol} are
1992 those of the new local common. The addresses are allocated in the
1993 bss segment, so at run-time the bytes will start off zeroed.
1994 @var{Symbol} is not declared global (@pxref{Global}), so is normally
1995 not visible to @code{ld}.
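
As a sketch, with a hypothetical symbol name @code{scratch}:

@example
        .lcomm scratch, 64  /* 64 bytes in bss, zeroed at run time */
@end example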
1999 @node Line, Ln, Lcomm, Pseudo Ops
2000 @section @code{.line @var{line-number}}, @code{.ln @var{line-number}}
2001 @code{.line}, and its alternate spelling @code{.ln}, tell
2005 @node Ln, List, Line, Pseudo Ops
2006 @section @code{.ln @var{line-number}}
2009 @code{as} to change the logical line number. @var{line-number} must be
2010 an absolute expression. The next line will have that logical line
2011 number. So any other statements on the current line (after a statement
2019 will be reported as on logical line number
2020 @var{logical line number} @minus{} 1.
2021 One day this directive will be unsupported: it is used only
2022 for compatibility with existing assembler programs. @refill
2024 @node List, Long, Ln, Pseudo Ops
2025 @section @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
2026 GNU @code{as} ignores these directives; however, they're
2027 accepted for compatibility with assemblers that use them.
2029 @node Long, Lsym, List, Pseudo Ops
2030 @section @code{.long @var{expressions}}
2031 @code{.long} is the same as @samp{.int}, @pxref{Int}.
2033 @node Lsym, Octa, Long, Pseudo Ops
2034 @section @code{.lsym @var{symbol}, @var{expression}}
2035 @code{.lsym} creates a new symbol named @var{symbol}, but does not put it in
2036 the hash table, ensuring it cannot be referenced by name during the
2037 rest of the assembly. This sets the attributes of the symbol to be
2038 the same as the expression value:
2040 @var{other} = @var{descriptor} = 0
2041 @var{type} = @r{(segment of @var{expression})}
2043 @var{value} = @var{expression}
2046 @node Octa, Org, Lsym, Pseudo Ops
2047 @section @code{.octa @var{bignums}}
2048 This directive expects zero or more bignums, separated by commas. For each
2049 bignum, it emits a 16-byte integer.
The term ``octa'' comes from contexts in which a ``word'' was two bytes;
hence @emph{octa}-word for 16 bytes.
2054 @node Org, Quad, Octa, Pseudo Ops
2055 @section @code{.org @var{new-lc} , @var{fill}}
2057 @code{.org} will advance the location counter of the current segment to
2058 @var{new-lc}. @var{new-lc} is either an absolute expression or an
2059 expression with the same segment as the current subsegment. That is,
2060 you can't use @code{.org} to cross segments: if @var{new-lc} has the
2061 wrong segment, the @code{.org} directive is ignored. To be compatible
2062 with former assemblers, if the segment of @var{new-lc} is absolute,
2063 @code{as} will issue a warning, then pretend the segment of @var{new-lc}
2064 is the same as the current subsegment.
2066 @code{.org} may only increase the location counter, or leave it
unchanged; you cannot use @code{.org} to move the location counter backwards.
2070 @c double negative used below "not undefined" because this is a specific
2071 @c reference to "undefined" (as SEG_UNKNOWN is called in this manual)
2072 @c segment. pesch@cygnus.com 18feb91
Because @code{as} tries to assemble programs in one pass, @var{new-lc}
2074 may not be undefined. If you really detest this restriction we eagerly await
2075 a chance to share your improved assembler.
2077 Beware that the origin is relative to the start of the segment, not
2078 to the start of the subsegment. This is compatible with other
2079 people's assemblers.
2081 When the location counter (of the current subsegment) is advanced, the
2082 intervening bytes are filled with @var{fill} which should be an
2083 absolute expression. If the comma and @var{fill} are omitted,
2084 @var{fill} defaults to zero.
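
A sketch of advancing the location counter, assuming a hypothetical label
@code{base} in the data segment. Because @code{base + 16} has the same
segment as the current subsegment, the warning described above for an
absolute @var{new-lc} should not arise:

@example
        .data
base:
        .byte 1
        .org base + 16, 0xff    /* fill up to 16 bytes past base with 0xff */
        .byte 2                 /* assembled 16 bytes past base */
@end example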
2086 @node Quad, Set, Org, Pseudo Ops
2087 @section @code{.quad @var{bignums}}
2088 @code{.quad} expects zero or more bignums, separated by commas. For
each bignum, it emits an 8-byte integer. If the bignum won't fit in 8
bytes, it prints a warning message and just takes the lowest order 8
bytes of the bignum.
2093 The term ``quad'' comes from contexts in which a ``word'' was two bytes;
2094 hence @emph{quad}-word for 8 bytes.
2096 @node Set, Short, Quad, Pseudo Ops
2097 @section @code{.set @var{symbol}, @var{expression}}
2099 This directive sets the value of @var{symbol} to @var{expression}. This
2100 will change @var{symbol}'s value and type to conform to
2101 @var{expression}. If @code{N_EXT} is set, it remains set.
2102 (@xref{Symbol Attributes}.)
2104 You may @code{.set} a symbol many times in the same assembly.
2105 If the expression's segment is unknowable during pass 1, a second
2106 pass over the source program will be forced. The second pass is
2107 currently not implemented. @code{as} will abort with an error
2108 message if one is required.
2110 If you @code{.set} a global symbol, the value stored in the object
2111 file is the last value stored into it.
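
A small sketch of setting a symbol more than once; the name @code{count}
is hypothetical:

@example
        .set count, 4
        .long count             /* count is 4 here */
        .set count, count + 1
        .long count             /* count is 5 here */
@end example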
2113 @node Short, Single, Set, Pseudo Ops
2114 @section @code{.short @var{expressions}}
2115 @c if not (sparc or amd29k)
2116 @c @code{.short} is the same as @samp{.word}. @xref{Word}.
2117 @c fi not (sparc or amd29k)
2118 @c if (sparc or amd29k)
2119 This expects zero or more @var{expressions}, and emits
2120 a 16 bit number for each.
2121 @c fi (sparc or amd29k)
2123 @node Single, Space, Short, Pseudo Ops
2124 @section @code{.single @var{flonums}}
2125 This directive assembles zero or more flonums, separated by commas. It
2126 has the same effect as @code{.float}.
2128 @c The exact kind of floating point numbers emitted depends on how
2129 @c @code{as} is configured. @xref{Machine Dependent}.
2132 The floating point format used for the AMD 29K family is IEEE.
2136 @node Space, Space, Single, Pseudo Ops
2139 @section @code{.space @var{size} , @var{fill}}
2140 This directive emits @var{size} bytes, each of value @var{fill}. Both
2141 @var{size} and @var{fill} are absolute expressions. If the comma
2142 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2147 @section @code{.space}
2148 This directive is ignored; it is accepted for compatibility with other
2152 @emph{Warning:} In other versions of GNU @code{as}, the directive
@code{.space} has the effect of @code{.block}. @xref{Machine Directives}.
2157 @node Stab, Text, Space, Pseudo Ops
2158 @section @code{.stabd, .stabn, .stabs}
2159 There are three directives that begin @samp{.stab}.
2160 All emit symbols (@pxref{Symbols}), for use by symbolic debuggers.
2161 The symbols are not entered in @code{as}' hash table: they
2162 cannot be referenced elsewhere in the source file.
2163 Up to five fields are required:
2166 This is the symbol's name. It may contain any character except @samp{\000},
2167 so is more general than ordinary symbol names. Some debuggers used to
2168 code arbitrarily complex structures into symbol names using this field.
2170 An absolute expression. The symbol's type is set to the low 8
2171 bits of this expression.
2172 Any bit pattern is permitted, but @code{ld} and debuggers will choke on
2175 An absolute expression.
2176 The symbol's ``other'' attribute is set to the low 8 bits of this expression.
2178 An absolute expression.
2179 The symbol's descriptor is set to the low 16 bits of this expression.
2181 An absolute expression which becomes the symbol's value.
2184 If a warning is detected while reading a @code{.stabd}, @code{.stabn},
2185 or @code{.stabs} statement, the symbol has probably already been created
2186 and you will get a half-formed symbol in your object file. This is
2187 compatible with earlier assemblers!
2190 @item .stabd @var{type} , @var{other} , @var{desc}
2192 The ``name'' of the symbol generated is not even an empty string.
2193 It is a null pointer, for compatibility. Older assemblers used a
2194 null pointer so they didn't waste space in object files with empty
2197 The symbol's value is set to the location counter,
2198 relocatably. When your program is linked, the value of this symbol
2199 will be where the location counter was when the @code{.stabd} was
2202 @item .stabn @var{type} , @var{other} , @var{desc} , @var{value}
2204 The name of the symbol is set to the empty string @code{""}.
2206 @item .stabs @var{string} , @var{type} , @var{other} , @var{desc} , @var{value}
2208 All five fields are specified.
2211 @node Text, Word, Stab, Pseudo Ops
2212 @section @code{.text @var{subsegment}}
2213 Tells @code{as} to assemble the following statements onto the end of
2214 the text subsegment numbered @var{subsegment}, which is an absolute
expression. If @var{subsegment} is omitted, subsegment number zero
is used.
2218 @node Word, Deprecated, Text, Pseudo Ops
2219 @section @code{.word @var{expressions}}
2220 This directive expects zero or more @var{expressions}, of any segment,
2221 separated by commas.
2222 @c if sparc or amd29k
2223 For each expression, @code{as} emits a 32-bit number.
2224 @c fi sparc or amd29k
2225 @c if not (sparc or amd29k)
2226 @c For each expression, @code{as} emits a 16-bit number.
2227 @c fi not (sparc or amd29k)
2231 of the expression depends on what kind of computer will run the
2237 @c on the 29k this doesn't happen---32-bit addressability, period; no
2238 @c long/short jumps.
2240 @subsection Special Treatment to support Compilers
2242 In order to assemble compiler output into something that will work,
@code{as} will occasionally do strange things to @samp{.word} directives.
2244 Directives of the form @samp{.word sym1-sym2} are often emitted by
2245 compilers as part of jump tables. Therefore, when @code{as} assembles a
2246 directive of the form @samp{.word sym1-sym2}, and the difference between
2247 @code{sym1} and @code{sym2} does not fit in 16 bits, @code{as} will
2248 create a @dfn{secondary jump table}, immediately before the next label.
2249 This @var{secondary jump table} will be preceded by a short-jump to the
2250 first byte after the secondary table. This short-jump prevents the flow
2251 of control from accidentally falling into the new table. Inside the
2252 table will be a long-jump to @code{sym2}. The original @samp{.word}
will contain @code{sym1} minus the address of the long-jump to
@code{sym2}.
2256 If there were several occurrences of @samp{.word sym1-sym2} before the
2257 secondary jump table, all of them will be adjusted. If there was a
2258 @samp{.word sym3-sym4}, that also did not fit in sixteen bits, a
2259 long-jump to @code{sym4} will be included in the secondary jump table,
2260 and the @code{.word} directives will be adjusted to contain @code{sym3}
2261 minus the address of the long-jump to @code{sym4}; and so on, for as many
2262 entries in the original jump table as necessary.
2266 @emph{This feature may be disabled by compiling @code{as} with the
2267 @samp{-DWORKING_DOT_WORD} option.} This feature is likely to confuse
2268 assembly language programmers.
2273 @node Deprecated, Machine Dependent, Word, Pseudo Ops
2274 @section Deprecated Directives
2275 One day these directives won't work.
2276 They are included for compatibility with older assemblers.
2283 @node Machine Dependent, Machine Dependent, Pseudo Ops, Top
2285 @c chapter Machine Dependent Features
2288 @c chapter Machine Dependent Features: Motorola 680x0
2291 @chapter Machine Dependent Features: AMD 29K
2293 @c pesch@cygnus.com: This version of the manual is specifically hacked
2294 @c for gas on a particular machine.
2295 @c We should have a config method of
2296 @c automating this; in the meantime, use ignore
2297 @c for the other architectures (or for their stubs)
2304 The Vax version of @code{as} accepts any of the following options,
2305 gives a warning message that the option was ignored and proceeds.
2306 These options are for compatibility with scripts designed for other
2307 people's assemblers.
2310 @item @kbd{-D} (Debug)
2311 @itemx @kbd{-S} (Symbol Table)
2312 @itemx @kbd{-T} (Token Trace)
2313 These are obsolete options used to debug old assemblers.
2315 @item @kbd{-d} (Displacement size for JUMPs)
2316 This option expects a number following the @kbd{-d}. Like options
2317 that expect filenames, the number may immediately follow the
2318 @kbd{-d} (old standard) or constitute the whole of the command line
2319 argument that follows @kbd{-d} (GNU standard).
2321 @item @kbd{-V} (Virtualize Interpass Temporary File)
2322 Some other assemblers use a temporary file. This option
2323 commanded them to keep the information in active memory rather
2324 than in a disk file. @code{as} always does this, so this
2325 option is redundant.
2327 @item @kbd{-J} (JUMPify Longer Branches)
2328 Many 32-bit computers permit a variety of branch instructions
2329 to do the same job. Some of these instructions are short (and
2330 fast) but have a limited range; others are long (and slow) but
2331 can branch anywhere in virtual memory. Often there are 3
2332 flavors of branch: short, medium and long. Some other
2333 assemblers would emit short and medium branches, unless told by
2334 this option to emit short and long branches.
2336 @item @kbd{-t} (Temporary File Directory)
2337 Some other assemblers may use a temporary file, and this option
names the directory in which to place the temporary
2339 file. @code{as} does not use a temporary disk file, so this
2340 option makes no difference. @kbd{-t} needs exactly one
2344 The Vax version of the assembler accepts two options when
2345 compiled for VMS. They are @kbd{-h}, and @kbd{-+}. The
2346 @kbd{-h} option prevents @code{as} from modifying the
2347 symbol-table entries for symbols that contain lowercase
2348 characters (I think). The @kbd{-+} option causes @code{as} to
2349 print warning messages if the FILENAME part of the object file,
2350 or any symbol name is larger than 31 characters. The @kbd{-+}
option also inserts some code following the @samp{_main}
2352 symbol so that the object file will be compatible with Vax-11
2355 @subsection Floating Point
2356 Conversion of flonums to floating point is correct, and
2357 compatible with previous assemblers. Rounding is
2358 towards zero if the remainder is exactly half the least significant bit.
2360 @code{D}, @code{F}, @code{G} and @code{H} floating point formats
2363 Immediate floating literals (@emph{e.g.} @samp{S`$6.9})
2364 are rendered correctly. Again, rounding is towards zero in the
2367 The @code{.float} directive produces @code{f} format numbers.
2368 The @code{.double} directive produces @code{d} format numbers.
2370 @subsection Machine Directives
2371 The Vax version of the assembler supports four directives for
2372 generating Vax floating point constants. They are described in the
2377 This expects zero or more flonums, separated by commas, and
2378 assembles Vax @code{d} format 64-bit floating point constants.
2381 This expects zero or more flonums, separated by commas, and
2382 assembles Vax @code{f} format 32-bit floating point constants.
2385 This expects zero or more flonums, separated by commas, and
2386 assembles Vax @code{g} format 64-bit floating point constants.
2389 This expects zero or more flonums, separated by commas, and
2390 assembles Vax @code{h} format 128-bit floating point constants.
2395 All DEC mnemonics are supported. Beware that @code{case@dots{}}
2396 instructions have exactly 3 operands. The dispatch table that
2397 follows the @code{case@dots{}} instruction should be made with
2398 @code{.word} statements. This is compatible with all unix
2399 assemblers we know of.
2401 @subsection Branch Improvement
2402 Certain pseudo opcodes are permitted. They are for branch
2403 instructions. They expand to the shortest branch instruction that
2404 will reach the target. Generally these mnemonics are made by
2405 substituting @samp{j} for @samp{b} at the start of a DEC mnemonic.
2406 This feature is included both for compatibility and to help
2407 compilers. If you don't need this feature, don't use these
2408 opcodes. Here are the mnemonics, and the code they can expand into.
2412 @samp{Jsb} is already an instruction mnemonic, so we chose @samp{jbsb}.
2414 @item (byte displacement)
2416 @item (word displacement)
2418 @item (long displacement)
2423 Unconditional branch.
2425 @item (byte displacement)
2427 @item (word displacement)
2429 @item (long displacement)
2433 @var{COND} may be any one of the conditional branches
2434 @code{neq nequ eql eqlu gtr geq lss gtru lequ vc vs gequ cc lssu cs}.
2435 @var{COND} may also be one of the bit tests
2436 @code{bs bc bss bcs bsc bcc bssi bcci lbs lbc}.
2437 @var{NOTCOND} is the opposite condition to @var{COND}.
2439 @item (byte displacement)
2440 @kbd{b@var{COND} @dots{}}
2441 @item (word displacement)
@kbd{b@var{NOTCOND} foo ; brw @dots{} ; foo:}
@item (long displacement)
@kbd{b@var{NOTCOND} foo ; jmp @dots{} ; foo:}
2447 @var{X} may be one of @code{b d f g h l w}.
2449 @item (word displacement)
2450 @kbd{@var{OPCODE} @dots{}}
2451 @item (long displacement)
2452 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @dots{} ; bar:}
2455 @var{YYY} may be one of @code{lss leq}.
2457 @var{ZZZ} may be one of @code{geq gtr}.
2459 @item (byte displacement)
2460 @kbd{@var{OPCODE} @dots{}}
2461 @item (word displacement)
2462 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2463 @item (long displacement)
2464 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar: }
2471 @item (byte displacement)
2472 @kbd{@var{OPCODE} @dots{}}
2473 @item (word displacement)
2474 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2475 @item (long displacement)
2476 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar:}
@subsection Operands
2481 The immediate character is @samp{$} for Unix compatibility, not
2482 @samp{#} as DEC writes it.
2484 The indirect character is @samp{*} for Unix compatibility, not
2485 @samp{@@} as DEC writes it.
2487 The displacement sizing character is @samp{`} (an accent grave) for
2488 Unix compatibility, not @samp{^} as DEC writes it. The letter
2489 preceding @samp{`} may have either case. @samp{G} is not
2490 understood, but all other letters (@code{b i l s w}) are understood.
2492 Register names understood are @code{r0 r1 r2 @dots{} r15 ap fp sp
2493 pc}. Any case of letters will do.
2500 Any expression is permitted in an operand. Operands are comma
2503 @c There is some bug to do with recognizing expressions
2504 @c in operands, but I forget what it is. It is
2505 @c a syntax clash because () is used as an address mode
2506 @c and to encapsulate sub-expressions.
2507 @subsection Not Supported
Vax bit fields cannot be assembled with @code{as}. Someone
2509 can add the required code if they really need it.
2513 @node Machine Options, Machine Syntax, Machine Dependent, Machine Dependent
GNU @code{as} has no additional command-line options for the AMD
29K family.
2518 @node Machine Syntax, Floating Point, Machine Options, Machine Dependent
2520 @subsection Special Characters
2521 @samp{;} is the line comment character.
2523 @samp{@@} can be used instead of a newline to separate statements.
The character @samp{?} is permitted in identifiers (but may not begin
an identifier).
2528 @subsection Register Names
2529 General-purpose registers are represented by predefined symbols of the
2530 form @samp{GR@var{nnn}} (for global registers) or @samp{LR@var{nnn}}
2531 (for local registers), where @var{nnn} represents a number between
2532 @code{0} and @code{127}, written with no leading zeros. The leading
2533 letters may be in either upper or lower case; for example, @samp{gr13}
2534 and @samp{LR7} are both valid register names.
2536 You may also refer to general-purpose registers by specifying the
2537 register number as the result of an expression (prefixed with @samp{%%}
2538 to flag the expression as a register number):
2542 @noindent---where @var{expression} must be an absolute expression
2543 evaluating to a number between @code{0} and @code{255}. The range
[0, 127] refers to global registers, and the range [128, 255] to local
registers.
2547 In addition, GNU @code{as} understands the following protected
2548 special-purpose register names for the AMD 29K family:
2558 These unprotected special-purpose register names are also recognized:
2566 @node Floating Point, Machine Directives, Machine Syntax, Machine Dependent
2567 @section Floating Point
2568 The AMD 29K family uses IEEE floating-point numbers.
2570 @node Machine Directives, Opcodes, Floating Point, Machine Dependent
2571 @section Machine Directives
2574 * block:: @code{.block @var{size} , @var{fill}}
2575 * cputype:: @code{.cputype}
2576 * file:: @code{.file}
2577 * hword:: @code{.hword @var{expressions}}
2578 * line:: @code{.line}
2579 * reg:: @code{.reg @var{symbol}, @var{expression}}
2580 * sect:: @code{.sect}
2581 * use:: @code{.use @var{segment name}}
2584 @node block, cputype, Machine Directives, Machine Directives
2585 @subsection @code{.block @var{size} , @var{fill}}
2586 This directive emits @var{size} bytes, each of value @var{fill}. Both
2587 @var{size} and @var{fill} are absolute expressions. If the comma
2588 and @var{fill} are omitted, @var{fill} is assumed to be zero.
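
For example, as a brief sketch with arbitrary sizes:

@example
        .block 16           /* sixteen zero bytes */
        .block 4, 0xff      /* four bytes, each of value 0xff */
@end example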
2590 In other versions of GNU @code{as}, this directive is called
2593 @node cputype, file, block, Machine Directives
2594 @subsection @code{.cputype}
2595 This directive is ignored; it is accepted for compatibility with other
2598 @node file, hword, cputype, Machine Directives
2599 @subsection @code{.file}
2600 This directive is ignored; it is accepted for compatibility with other
2604 @emph{Warning:} in other versions of GNU @code{as}, @code{.file} is
2605 used for the directive called @code{.app-file} in the AMD 29K support.
2608 @node hword, line, file, Machine Directives
2609 @subsection @code{.hword @var{expressions}}
This expects zero or more @var{expressions}, and emits
a 16-bit number for each. (Synonym for @samp{.short}.)
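A minimal Python sketch of the bytes emitted by @code{.hword} follows.
The function name @code{hword} is hypothetical. The sketch assumes
big-endian byte order and assumes that each value is reduced to 16 bits;
neither point is stated in the text above.

```python
import struct

def hword(*values):
    """Return the bytes emitted for `.hword v1, v2, ...`."""
    # Assumption: big-endian byte order; values are reduced to 16 bits.
    return b"".join(struct.pack(">H", v & 0xFFFF) for v in values)
```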
2613 @node line, reg, hword, Machine Directives
2614 @subsection @code{.line}
2615 This directive is ignored; it is accepted for compatibility with other
2618 @node reg, sect, line, Machine Directives
2619 @subsection @code{.reg @var{symbol}, @var{expression}}
2620 @code{.reg} has the same effect as @code{.lsym}; @pxref{Lsym}.
2622 @node sect, use, reg, Machine Directives
2623 @subsection @code{.sect}
2624 This directive is ignored; it is accepted for compatibility with other
2627 @node use, , sect, Machine Directives
2628 @subsection @code{.use @var{segment name}}
2629 Establishes the segment and subsegment for the following code;
2630 @var{segment name} may be one of @code{.text}, @code{.data},
2631 @code{.data1}, or @code{.lit}. With one of the first three @var{segment
2632 name} options, @samp{.use} is equivalent to the machine directive
2633 @var{segment name}; the remaining case, @samp{.use .lit}, is the same as
2637 @node Opcodes, Opcodes, Machine Directives, Machine Dependent
2639 GNU @code{as} implements all the standard AMD 29K opcodes. No
2640 additional pseudo-instructions are needed on this family.
2642 For information on the 29K machine instruction set, see @cite{Am29000
2643 User's Manual}, Advanced Micro Devices, Inc.
2650 The 680x0 version of @code{as} has two machine dependent options.
2651 One shortens undefined references from 32 to 16 bits, while the
2652 other is used to tell @code{as} what kind of machine it is
2655 You can use the @kbd{-l} option to shorten the size of references to
2656 undefined symbols. If the @kbd{-l} option is not given, references to
2657 undefined symbols will be a full long (32 bits) wide. (Since @code{as}
2658 cannot know where these symbols will end up, @code{as} can only allocate
2659 space for the linker to fill in later. Since @code{as} doesn't know how
2660 far away these symbols will be, it allocates as much space as it can.)
2661 If this option is given, the references will only be one word wide (16
2662 bits). This may be useful if you want the object file to be as small as
2663 possible, and you know that the relevant symbols will be less than 17
2666 The 680x0 version of @code{as} is most frequently used to assemble
2667 programs for the Motorola MC68020 microprocessor. Occasionally it is
2668 used to assemble programs for the mostly similar, but slightly different
2669 MC68000 or MC68010 microprocessors. You can give @code{as} the options
2670 @samp{-m68000}, @samp{-mc68000}, @samp{-m68010}, @samp{-mc68010},
2671 @samp{-m68020}, and @samp{-mc68020} to tell it what processor is the
2676 The 680x0 version of @code{as} uses syntax similar to the Sun assembler.
2677 Size modifiers are appended directly to the end of the opcode without an
2678 intervening period. For example, write @samp{movl} rather than
2681 @c pesch@cygnus.com: Vintage Release c1.37 isn't compiled with
If @code{as} is compiled with @code{SUN_ASM_SYNTAX} defined, it will also allow
Sun-style local labels of the form @samp{1$} through @samp{9$}.
2688 In the following table @dfn{apc} stands for any of the address
2689 registers (@samp{a0} through @samp{a7}), nothing, (@samp{}), the
2690 Program Counter (@samp{pc}), or the zero-address relative to the
2691 program counter (@samp{zpc}).
2693 The following addressing modes are understood:
2696 @samp{#@var{digits}}
2699 @samp{d0} through @samp{d7}
2701 @item Address Register
2702 @samp{a0} through @samp{a7}
2704 @item Address Register Indirect
2705 @samp{a0@@} through @samp{a7@@}
2707 @item Address Register Postincrement
2708 @samp{a0@@+} through @samp{a7@@+}
2710 @item Address Register Predecrement
2711 @samp{a0@@-} through @samp{a7@@-}
2713 @item Indirect Plus Offset
2714 @samp{@var{apc}@@(@var{digits})}
2717 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2718 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})}
2721 @samp{@var{apc}@@(@var{digits})@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2722 or @samp{@var{apc}@@(@var{digits})@@(@var{register}:@var{size}:@var{scale})}
2725 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2726 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2728 @item Memory Indirect
2729 @samp{@var{apc}@@(@var{digits})@@(@var{digits})}
2732 @samp{@var{symbol}}, or @samp{@var{digits}}
2734 @c pesch@cygnus.com: gnu, rich concur the following needs careful
2735 @c research before documenting.
2736 , or either of the above followed
2737 by @samp{:b}, @samp{:w}, or @samp{:l}.
2741 @section Floating Point
2742 The floating point code is not too well tested, and may have
2745 Packed decimal (P) format floating literals are not supported.
2746 Feel free to add the code!
2748 The floating point formats generated by directives are these.
2751 @code{Single} precision floating point constants.
2753 @code{Double} precision floating point constants.
There is no directive to produce regions of memory holding
extended precision numbers; however, they can be used as
immediate operands to floating-point instructions. Adding a
directive to create extended precision numbers would not be
hard, but it has not yet seemed necessary.
2762 @section Machine Directives
In order to be compatible with the Sun assembler, the 680x0 assembler
understands the following directives.
2767 This directive is identical to a @code{.data 1} directive.
2769 This directive is identical to a @code{.data 2} directive.
2771 This directive is identical to a @code{.align 1} directive.
2772 @c Is this true? does it work???
2774 This directive is identical to a @code{.space} directive.
2778 @c pesch@cygnus.com: I don't see any point in the following
2779 @c paragraph. Bugs are bugs; how does saying this
2782 Danger: Several bugs have been found in the opcode table (and
2783 fixed). More bugs may exist. Be careful when using obscure
2787 @subsection Branch Improvement
2789 Certain pseudo opcodes are permitted for branch instructions.
2790 They expand to the shortest branch instruction that will reach the
2791 target. Generally these mnemonics are made by substituting @samp{j} for
2792 @samp{b} at the start of a Motorola mnemonic.
2794 The following table summarizes the pseudo-operations. A @code{*} flags
2795 cases that are more fully described after the table:
          +-------------------------------------------------
          |                68020   68000/10
Pseudo-Op |BYTE    WORD    LONG    LONG      non-PC relative
          +-------------------------------------------------
     jbsr |bsrs    bsr     bsrl    jsr       jsr
      jra |bras    bra     bral    jmp       jmp
*     jXX |bXXs    bXX     bXXl    bNXs;jmpl bNXs;jmp
*    dbXX |dbXX    dbXX        dbXX; bra; jmpl
*    fjXX |fbXXw   fbXXw   fbXXl             fbNXw;jmp
2810 NX: negative of condition XX
2813 @center{@code{*}---see full description below}
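The two unstarred rows of the table can be read as a simple lookup, as
in the following Python sketch. The names @code{BRANCH_TABLE} and
@code{choose_branch} are hypothetical. The sketch assumes signed 8-bit
and 16-bit displacement ranges, and it ignores any displacement values
that the hardware reserves for special encodings.

```python
# Rows of the table above for the two simplest pseudo-ops.
BRANCH_TABLE = {
    "jbsr": {"byte": "bsrs", "word": "bsr", "long": "bsrl", "far": "jsr"},
    "jra":  {"byte": "bras", "word": "bra", "long": "bral", "far": "jmp"},
}

def choose_branch(pseudo_op, displacement=None, cpu="68020"):
    """Pick the instruction for jbsr or jra.

    displacement is the PC-relative distance in bytes, or None when
    the target is not PC relative.
    """
    row = BRANCH_TABLE[pseudo_op]
    if displacement is None:
        return row["far"]
    if -128 <= displacement <= 127:
        return row["byte"]
    if -32768 <= displacement <= 32767:
        return row["word"]
    # Long PC-relative branches are available only on the 68020.
    return row["long"] if cpu == "68020" else row["far"]
```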
2818 These are the simplest jump pseudo-operations; they always map to one
2819 particular machine instruction, depending on the displacement to the
2823 Here, @samp{j@var{XX}} stands for an entire family of pseudo-operations,
2824 where @var{XX} is a conditional branch or condition-code test. The full
2825 list of pseudo-ops in this family is:
2827 jhi jls jcc jcs jne jeq jvc
2828 jvs jpl jmi jge jlt jgt jle
2831 For the cases of non-PC relative displacements and long displacements on
2832 the 68000 or 68010, @code{as} will issue a longer code fragment in terms of
2833 @var{NX}, the opposite condition to @var{XX}:
2845 The full family of pseudo-operations covered here is
2847 dbhi dbls dbcc dbcs dbne dbeq dbvc
2848 dbvs dbpl dbmi dbge dblt dbgt dble
2852 Other than for word and byte displacements, when the source reads
2853 @samp{db@var{XX} foo}, @code{as} will emit
2862 This family includes
2864 fjne fjeq fjge fjlt fjgt fjle fjf
2865 fjt fjgl fjgle fjnge fjngl fjngle fjngt
2866 fjnle fjnlt fjoge fjogl fjogt fjole fjolt
2867 fjor fjseq fjsf fjsne fjst fjueq fjuge
2868 fjugt fjule fjult fjun
2871 For branch targets that are not PC relative, @code{as} emits
2877 when it encounters @samp{fj@var{XX} foo}.
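For the @samp{j@var{XX}} family described earlier, the longer code
fragment depends on finding @var{NX}, the opposite of condition
@var{XX}. The following Python sketch pairs up the conditions and
builds the fragment for a target that is not PC relative. The names
@code{OPPOSITE} and @code{expand_long_jump}, and the skip label
@samp{oof}, are hypothetical choices for illustration.

```python
# Each condition paired with its opposite, for the jXX family.
_PAIRS = [("hi", "ls"), ("cc", "cs"), ("ne", "eq"), ("vc", "vs"),
          ("pl", "mi"), ("ge", "lt"), ("gt", "le")]
OPPOSITE = {}
for a, b in _PAIRS:
    OPPOSITE[a] = b
    OPPOSITE[b] = a

def expand_long_jump(pseudo_op, target, skip_label="oof"):
    """Expand jXX for a target that a conditional branch cannot reach."""
    cond = pseudo_op[1:]                  # "jeq" -> "eq"
    return ["b%ss %s" % (OPPOSITE[cond], skip_label),
            "jmp %s" % target,
            "%s:" % skip_label]
```

The short branch on the opposite condition skips over an unconditional
jump, so the jump is taken exactly when the original condition holds.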
2881 @subsection Special Characters
2882 The immediate character is @samp{#} for Sun compatibility. The
2883 line-comment character is @samp{|}. If a @samp{#} appears at the
2884 beginning of a line, it is treated as a comment unless it looks like
2885 @samp{# line file}, in which case it is treated normally.
2889 @c pesch@cygnus.com: see remarks at ignore for vax.
The 32x32 version of @code{as} accepts a @kbd{-m32032} option to
specify that it is assembling for a 32032 processor, or a
@kbd{-m32532} option to specify that it is assembling for a 32532 processor.
The default (if neither is specified) is chosen when the assembler
I don't know anything about the 32x32 syntax assembled by
@code{as}. Someone who understands the processor (I've never seen
one) and the possible syntaxes should write this section.
2904 @subsection Floating Point
2905 The 32x32 uses IEEE floating point numbers, but @code{as} will only
2906 create single or double precision values. I don't know if the 32x32
2907 understands extended precision numbers.
2909 @subsection Machine Directives
2910 The 32x32 has no machine dependent directives.
The Sparc has no machine dependent options.
2917 I don't know anything about Sparc syntax. Someone who does
2918 will have to write this section.
2920 @subsection Floating Point
The Sparc uses IEEE floating-point numbers.
2923 @subsection Machine Directives
2924 The Sparc version of @code{as} supports the following additional
2929 This must be followed by a symbol name, a positive number, and
2930 @code{"bss"}. This behaves somewhat like @code{.comm}, but the
2931 syntax is different.
2934 This is functionally identical to @code{.globl}.
2937 This is functionally identical to @code{.short}.
2940 This directive is ignored. Any text following it on the same
2941 line is also ignored.
2944 This must be followed by a symbol name, a positive number, and
2945 @code{"bss"}. This behaves somewhat like @code{.lcomm}, but the
2946 syntax is different.
2949 This must be followed by @code{"text"}, @code{"data"}, or
2950 @code{"data1"}. It behaves like @code{.text}, @code{.data}, or
This is functionally identical to the @code{.space} directive.
On the Sparc, the @code{.word} directive produces 32-bit values,
instead of the 16-bit values it produces on every other machine.
2962 @section Intel 80386
2964 The 80386 has no machine dependent options.
2966 @subsection AT&T Syntax versus Intel Syntax
2967 In order to maintain compatibility with the output of @code{GCC},
2968 @code{as} supports AT&T System V/386 assembler syntax. This is quite
different from Intel syntax. We mention these differences because
almost all 80386 documents use only Intel syntax. Notable differences
2971 between the two syntaxes are:
2974 AT&T immediate operands are preceded by @samp{$}; Intel immediate
2975 operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}).
2976 AT&T register operands are preceded by @samp{%}; Intel register operands
2977 are undelimited. AT&T absolute (as opposed to PC relative) jump/call
2978 operands are prefixed by @samp{*}; they are undelimited in Intel syntax.
2981 AT&T and Intel syntax use the opposite order for source and destination
2982 operands. Intel @samp{add eax, 4} is @samp{addl $4, %eax}. The
2983 @samp{source, dest} convention is maintained for compatibility with
2984 previous Unix assemblers.
2987 In AT&T syntax the size of memory operands is determined from the last
2988 character of the opcode name. Opcode suffixes of @samp{b}, @samp{w},
2989 and @samp{l} specify byte (8-bit), word (16-bit), and long (32-bit)
memory references. Intel syntax accomplishes this by prefixing memory
2991 operands (@emph{not} the opcodes themselves) with @samp{byte ptr},
2992 @samp{word ptr}, and @samp{dword ptr}. Thus, Intel @samp{mov al, byte
2993 ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T syntax.
2996 Immediate form long jumps and calls are
2997 @samp{lcall/ljmp $@var{segment}, $@var{offset}} in AT&T syntax; the
2999 @samp{call/jmp far @var{segment}:@var{offset}}. Also, the far return
3001 is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is
3002 @samp{ret far @var{stack-adjust}}.
3005 The AT&T assembler does not provide support for multiple segment
3006 programs. Unix style systems expect all programs to be single segments.
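The first two differences listed above (operand prefixes and operand
order) can be illustrated with a small Python sketch. It is
deliberately limited: it handles only 32-bit register names and numeric
immediates, and it always appends the @samp{l} suffix. The names
@code{REG32}, @code{att_operand}, and @code{intel_to_att} are
hypothetical and not part of @code{as}.

```python
REG32 = {"eax", "ebx", "ecx", "edx", "edi", "esi", "ebp", "esp"}

def att_operand(operand):
    """Decorate one Intel operand: registers get %, numbers get $."""
    operand = operand.strip()
    if operand.lower() in REG32:
        return "%" + operand.lower()
    int(operand, 0)                   # raises ValueError if not a number
    return "$" + operand

def intel_to_att(line):
    """Convert e.g. 'add eax, 4' to 'addl $4, %eax' (32-bit only)."""
    opcode, _, rest = line.strip().partition(" ")
    operands = [att_operand(op) for op in rest.split(",")]
    operands.reverse()                # Intel: dest, source; AT&T: source, dest
    return opcode + "l " + ", ".join(operands)
```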
3009 @subsection Opcode Naming
3010 Opcode names are suffixed with one character modifiers which specify the
3011 size of operands. The letters @samp{b}, @samp{w}, and @samp{l} specify
3012 byte, word, and long operands. If no suffix is specified by an
3013 instruction and it contains no memory operands then @code{as} tries to
3014 fill in the missing suffix based on the destination register operand
3015 (the last one by convention). Thus, @samp{mov %ax, %bx} is equivalent
3016 to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to
3017 @samp{movw $1, %bx}. Note that this is incompatible with the AT&T Unix
3018 assembler which assumes that a missing opcode suffix implies long
3019 operand size. (This incompatibility does not affect compiler output
3020 since compilers always explicitly specify the opcode suffix.)
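The rule for filling in a missing suffix can be sketched in Python as
follows. The sketch assumes that the instruction has no suffix yet and
that its last operand is one of the general registers named in this
chapter; @code{REG_SIZE} and @code{add_suffix} are hypothetical names.

```python
REG_SIZE = {}
for name in ("eax", "ebx", "ecx", "edx", "edi", "esi", "ebp", "esp"):
    REG_SIZE["%" + name] = "l"        # 32-bit registers
for name in ("ax", "bx", "cx", "dx", "di", "si", "bp", "sp"):
    REG_SIZE["%" + name] = "w"        # 16-bit registers
for name in ("ah", "al", "bh", "bl", "ch", "cl", "dh", "dl"):
    REG_SIZE["%" + name] = "b"        # 8-bit registers

def add_suffix(instruction):
    """Append the size suffix implied by the last (destination) register."""
    opcode, _, rest = instruction.strip().partition(" ")
    dest = rest.split(",")[-1].strip()
    return opcode + REG_SIZE[dest] + " " + rest.strip()
```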
3022 Almost all opcodes have the same names in AT&T and Intel format. There
3023 are a few exceptions. The sign extend and zero extend instructions need
3024 two sizes to specify them. They need a size to sign/zero extend
3025 @emph{from} and a size to zero extend @emph{to}. This is accomplished
3026 by using two opcode suffixes in AT&T syntax. Base names for sign extend
3027 and zero extend are @samp{movs@dots{}} and @samp{movz@dots{}} in AT&T
3028 syntax (@samp{movsx} and @samp{movzx} in Intel syntax). The opcode
3029 suffixes are tacked on to this base name, the @emph{from} suffix before
3030 the @emph{to} suffix. Thus, @samp{movsbl %al, %edx} is AT&T syntax for
3031 ``move sign extend @emph{from} %al @emph{to} %edx.'' Possible suffixes,
3032 thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word),
3033 and @samp{wl} (from word to long).
3035 The Intel syntax conversion instructions
3038 @samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax},
3040 @samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax},
3042 @samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax},
3044 @samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax},
3046 are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, and @samp{cltd} in
3047 AT&T naming. @code{as} accepts either naming for these instructions.
Far call/jump instructions are @samp{lcall} and @samp{ljmp} in
AT&T syntax, but are @samp{call far} and @samp{jmp far} in Intel
3053 @subsection Register Naming
Register operands are always prefixed with @samp{%}. The 80386 registers
3058 the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx},
3059 @samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the
3060 frame pointer), and @samp{%esp} (the stack pointer).
3063 the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx},
3064 @samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}.
the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh},
@samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl} (these
are the high bytes and low bytes of @samp{%ax}, @samp{%bx},
@samp{%cx}, and @samp{%dx}).
3073 the 6 segment registers @samp{%cs} (code segment), @samp{%ds}
3074 (data segment), @samp{%ss} (stack segment), @samp{%es}, @samp{%fs},
3078 the 3 processor control registers @samp{%cr0}, @samp{%cr2}, and
3082 the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2},
3083 @samp{%db3}, @samp{%db6}, and @samp{%db7}.
3086 the 2 test registers @samp{%tr6} and @samp{%tr7}.
3089 the 8 floating point register stack @samp{%st} or equivalently
3090 @samp{%st(0)}, @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)},
3091 @samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}.
3094 @subsection Opcode Prefixes
3095 Opcode prefixes are used to modify the following opcode. They are used
3096 to repeat string instructions, to provide segment overrides, to perform
3097 bus lock operations, and to give operand and address size (16-bit
3098 operands are specified in an instruction by prefixing what would
normally be 32-bit operands with an ``operand size'' opcode prefix).
3100 Opcode prefixes are usually given as single-line instructions with no
3101 operands, and must directly precede the instruction they act upon. For
3102 example, the @samp{scas} (scan string) instruction is repeated with:
3108 Here is a list of opcode prefixes:
3111 Segment override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es},
@samp{fs}, @samp{gs}. These are automatically added by
using the @var{segment}:@var{memory-operand} form for memory references.
3116 Operand/Address size prefixes @samp{data16} and @samp{addr16}
3117 change 32-bit operands/addresses into 16-bit operands/addresses. Note
3118 that 16-bit addressing modes (i.e. 8086 and 80286 addressing modes)
3119 are not supported (yet).
The bus lock prefix @samp{lock} asserts the bus lock signal during
execution of the instruction it precedes. (This is only valid with
certain instructions; see an 80386 manual for details.)
3127 The wait for coprocessor prefix @samp{wait} waits for the
3128 coprocessor to complete the current instruction. This should never be
3129 needed for the 80386/80387 combination.
3132 The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added
3133 to string instructions to make them repeat @samp{%ecx} times.
3136 @subsection Memory References
3137 An Intel syntax indirect memory reference of the form
3139 @var{segment}:[@var{base} + @var{index}*@var{scale} + @var{disp}]
3141 is translated into the AT&T syntax
3143 @var{segment}:@var{disp}(@var{base}, @var{index}, @var{scale})
3145 where @var{base} and @var{index} are the optional 32-bit base and
3146 index registers, @var{disp} is the optional displacement, and
3147 @var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index}
3148 to calculate the address of the operand. If no @var{scale} is
3149 specified, @var{scale} is taken to be 1. @var{segment} specifies the
3150 optional segment register for the memory operand, and may override the
default segment register (see an 80386 manual for segment register
defaults). Note that segment overrides in AT&T syntax @emph{must}
be preceded by a @samp{%}. If you specify a segment override which
3154 coincides with the default segment register, @code{as} will @emph{not}
3155 output any segment register override prefixes to assemble the given
3156 instruction. Thus, segment overrides can be specified to emphasize which
3157 segment register is used for a given memory operand.
3159 Here are some examples of Intel and AT&T style memory references:
3162 @item AT&T: @samp{-4(%ebp)}, Intel: @samp{[ebp - 4]}
3163 @var{base} is @samp{%ebp}; @var{disp} is @samp{-4}. @var{segment} is
3164 missing, and the default segment is used (@samp{%ss} for addressing with
3165 @samp{%ebp} as the base register). @var{index}, @var{scale} are both missing.
3167 @item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]}
3168 @var{index} is @samp{%eax} (scaled by a @var{scale} 4); @var{disp} is
3169 @samp{foo}. All other fields are missing. The segment register here
3170 defaults to @samp{%ds}.
3172 @item AT&T: @samp{foo(,1)}; Intel @samp{[foo]}
3173 This uses the value pointed to by @samp{foo} as a memory operand.
3174 Note that @var{base} and @var{index} are both missing, but there is only
3175 @emph{one} @samp{,}. This is a syntactic exception.
3177 @item AT&T: @samp{%gs:foo}; Intel @samp{gs:foo}
3178 This selects the contents of the variable @samp{foo} with segment
3179 register @var{segment} being @samp{%gs}.
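The translation rule can also be written as a short Python sketch that
assembles the AT&T form from its parts. The function name
@code{att_memory} and its parameter names are hypothetical. The sketch
omits a @var{scale} of 1, and it does not generate the @samp{foo(,1)}
form shown above; a lone displacement is written without parentheses.

```python
def att_memory(disp="", base="", index="", scale=1, segment=""):
    """Build segment:disp(base, index, scale) from Intel-style parts.

    Register arguments are given without the leading '%'.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    text = str(disp)
    if base or index:
        inner = "%" + base if base else ""
        if index:
            inner += ",%" + index
            if scale != 1:
                inner += ",%d" % scale
        text += "(" + inner + ")"
    if segment:
        text = "%" + segment + ":" + text
    return text
```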
3183 Absolute (as opposed to PC relative) call and jump operands must be
3184 prefixed with @samp{*}. If no @samp{*} is specified, @code{as} will
3185 always choose PC relative addressing for jump/call labels.
3187 Any instruction that has a memory operand @emph{must} specify its size (byte,
3188 word, or long) with an opcode suffix (@samp{b}, @samp{w}, or @samp{l},
3191 @subsection Handling of Jump Instructions
3192 Jump instructions are always optimized to use the smallest possible
3193 displacements. This is accomplished by using byte (8-bit) displacement
3194 jumps whenever the target is sufficiently close. If a byte displacement
3195 is insufficient a long (32-bit) displacement is used. We do not support
3196 word (16-bit) displacement jumps (i.e. prefixing the jump instruction
3197 with the @samp{addr16} opcode prefix), since the 80386 insists upon masking
3198 @samp{%eip} to 16 bits after the word displacement is added.
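The choice between the two displacement sizes amounts to a range check,
as in this Python sketch. The function name
@code{jump_displacement_size} is hypothetical, and the sketch assumes
the usual signed 8-bit range for byte displacements.

```python
def jump_displacement_size(displacement):
    """Return the displacement width in bits chosen for a jump."""
    if -128 <= displacement <= 127:
        return 8      # byte displacement
    return 32         # long displacement; 16-bit jumps are never used
```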
3200 Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz},
3201 @samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in
3202 byte displacements, so that it is possible that use of these
3203 instructions (@code{GCC} does not use them) will cause the assembler to
3204 print an error message (and generate incorrect code). The AT&T 80386
3205 assembler tries to get around this problem by expanding @samp{jcxz foo} to
3213 @subsection Floating Point
3214 All 80387 floating point types except packed BCD are supported.
(BCD support may be added without much difficulty.) These data
types are 16-, 32-, and 64-bit integers, and single (32-bit),
double (64-bit), and extended (80-bit) precision floating point.
Each supported type has an opcode suffix and a constructor
associated with it. Opcode suffixes specify the operands' data
types. Constructors build these data types into memory.
3224 Floating point constructors are @samp{.float} or @samp{.single},
3225 @samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats.
3226 These correspond to opcode suffixes @samp{s}, @samp{l}, and @samp{t}.
@samp{t} stands for temporary real; the 80387 supports
this format only via the @samp{fldt} (load temporary real to stack top) and
@samp{fstpt} (store temporary real and pop stack) instructions.
3232 Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and
3233 @samp{.quad} for the 16-, 32-, and 64-bit integer formats. The corresponding
3234 opcode suffixes are @samp{s} (single), @samp{l} (long), and @samp{q}
3235 (quad). As with the temporary real format the 64-bit @samp{q} format is
3236 only present in the @samp{fildq} (load quad integer to stack top) and
3237 @samp{fistpq} (store quad integer and pop stack) instructions.
3240 Register to register operations do not require opcode suffixes,
3241 so that @samp{fst %st, %st(1)} is equivalent to @samp{fstl %st, %st(1)}.
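To show what the 32-bit and 64-bit floating point constructors build in
memory, the following Python sketch packs a value as an IEEE number in
the little-endian byte order of the 80386. The function names are
hypothetical, and the sketch assumes that @code{as} rounds a literal the
same way Python does. The 80-bit @samp{.tfloat} format is left out
because Python's @code{struct} module has no code for it.

```python
import struct

def float_bytes(value):
    """Bytes expected from `.float value` (IEEE single, little-endian)."""
    return struct.pack("<f", value)

def double_bytes(value):
    """Bytes expected from `.double value` (IEEE double, little-endian)."""
    return struct.pack("<d", value)
```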
Since the 80387 automatically synchronizes with the 80386, @samp{fwait}
3244 instructions are almost never needed (this is not the case for the
3245 80286/80287 and 8086/8087 combinations). Therefore, @code{as} suppresses
3246 the @samp{fwait} instruction whenever it is implicitly selected by one
3247 of the @samp{fn@dots{}} instructions. For example, @samp{fsave} and
3248 @samp{fnsave} are treated identically. In general, all the @samp{fn@dots{}}
3249 instructions are made equivalent to @samp{f@dots{}} instructions. If
3250 @samp{fwait} is desired it must be explicitly coded.
3253 There is some trickery concerning the @samp{mul} and @samp{imul}
3254 instructions that deserves mention. The 16-, 32-, and 64-bit expanding
3255 multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5
3256 for @samp{imul}) can be output only in the one operand form. Thus,
3257 @samp{imul %ebx, %eax} does @emph{not} select the expanding multiply;
3258 the expanding multiply would clobber the @samp{%edx} register, and this
3259 would confuse @code{GCC} output. Use @samp{imul %ebx} to get the
3260 64-bit product in @samp{%edx:%eax}.
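The effect of the one operand form on @samp{%edx:%eax} can be modeled
with a few lines of Python. This is only an arithmetic illustration of
the 32-bit signed case; the function name @code{imul_one_operand} is
hypothetical, and processor flags are not modeled.

```python
def imul_one_operand(eax, operand):
    """Model `imul operand`: signed 32x32 -> 64-bit product in edx:eax."""
    def signed32(x):
        x &= 0xFFFFFFFF
        return x - (1 << 32) if x & 0x80000000 else x
    product = (signed32(eax) * signed32(operand)) & 0xFFFFFFFFFFFFFFFF
    return product >> 32, product & 0xFFFFFFFF      # (edx, eax)
```

The first element of the result is the new value of @samp{%edx}, which
is why this form clobbers that register.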
3262 We have added a two operand form of @samp{imul} when the first operand
3263 is an immediate mode expression and the second operand is a register.
This is just a shorthand, so that multiplying @samp{%eax} by 69, for
3265 example, can be done with @samp{imul $69, %eax} rather than @samp{imul
3268 @c pesch@cygnus.com: we also ignore the following chapters, but for
3269 @c a different reason---internals are changing
3270 @c rapidly. These may need to be moved to another
3271 @c book anyhow, if we adopt the model of user/modifier
3274 @node Maintenance, Retargeting, Machine Dependent, Top
3275 @chapter Maintaining the Assembler
3276 [[this chapter is still being built]]
3279 We had these goals, in descending priority:
3282 For every program composed by a compiler, @code{as} should emit
3283 ``correct'' code. This leaves some latitude in choosing addressing
3284 modes, order of @code{relocation_info} structures in the object
3287 @item Speed, for usual case.
3288 By far the most common use of @code{as} will be assembling compiler
3291 @item Upward compatibility for existing assembler code.
3292 Well @dots{} we don't support Vax bit fields but everything else
3293 seems to be upward compatible.
3296 The code should be maintainable with few surprises. (JF: ha!)
3300 We assumed that disk I/O was slow and expensive while memory was
3301 fast and access to memory was cheap. We expect the in-memory data
3302 structures to be less than 10 times the size of the emitted object
3303 file. (Contrast this with the C compiler where in-memory structures
3304 might be 100 times object file size!)
3308 Try to read the source file from disk only one time. For other
3309 reasons, we keep large chunks of the source file in memory during
3310 assembly so this is not a problem. Also the assembly algorithm
3311 should only scan the source text once if the compiler composed the
3312 text according to a few simple rules.
3314 Emit the object code bytes only once. Don't store values and then
3317 Build the object file in memory and do direct writes to disk of
3321 RMS suggested a one-pass algorithm which seems to work well. By not
3322 parsing text during a second pass considerable time is saved on
3323 large programs (@emph{e.g.} the sort of C program @code{yacc} would
3326 It happened that the data structures needed to emit relocation
3327 information to the object file were neatly subsumed into the data
3328 structures that do backpatching of addresses after pass 1.
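The one-pass idea with backpatching can be sketched with a toy
assembler in Python. Everything in it is hypothetical and far simpler
than @code{as}: the source language has only labels and a two-byte
@samp{jmp}, the opcode value is invented, and addresses fit in one byte.
The point is only that forward references are recorded during the single
pass and filled in afterward without reading the source again.

```python
def assemble(lines):
    """One pass over toy source: 'name:' defines a label at the current
    address, 'jmp name' emits two bytes [opcode, address]."""
    JMP = 0xE9                     # hypothetical opcode value
    code = bytearray()
    symbols = {}                   # label -> address
    fixups = []                    # (offset of address byte, label)
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            symbols[line[:-1]] = len(code)
        elif line.startswith("jmp "):
            code.append(JMP)
            fixups.append((len(code), line[4:].strip()))
            code.append(0)         # placeholder, patched below
    # Backpatch: the source text is never read a second time.
    for offset, label in fixups:
        code[offset] = symbols[label]
    return bytes(code)
```

In this toy the @code{fixups} list plays the role of both the
backpatching records and the relocation information; a fixup whose label
is never defined would be left for the linker.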
3330 Many of the functions began life as re-usable modules, loosely
3331 connected. RMS changed this to gain speed. For example, input
3332 parsing routines which used to work on pre-sanitized strings now
3333 must parse raw data. Hence they have to import knowledge of the
assembler's comment conventions @emph{etc}.
3336 @section Deprecated Feature(?)s
3337 We have stopped supporting some features:
3340 @code{.org} statements must have @b{defined} expressions.
3342 Vax Bit fields (@kbd{:} operator) are entirely unsupported.
3345 It might be a good idea to not support these features in a future release:
3348 @kbd{#} should begin a comment, even in column 1.
3350 Why support the logical line & file concept any more?
3352 Subsegments are a good candidate for flushing.
3353 Depends on which compilers need them I guess.
3356 @section Bugs, Ideas, Further Work
3357 Clearly the major improvement is DON'T USE A TEXT-READING
3358 ASSEMBLER for the back end of a compiler. It is much faster to
3359 interpret binary gobbledygook from a compiler's tables than to
3360 ask the compiler to write out human-readable code just so the
3361 assembler can parse it back to binary.
3363 Assuming you use @code{as} for human written programs: here are
3367 Document (here) @code{APP}.
3369 Take advantage of knowing no spaces except after opcode
3370 to speed up @code{as}. (Modify @code{app.c} to flush useless spaces:
3371 only keep space/tabs at begin of line or between 2
3374 Put pointers in this documentation to @file{a.out} documentation.
3376 Split the assembler into parts so it can gobble direct binary
from @emph{e.g.} @code{cc}. It is silly for @code{cc} to compose text
3378 just so @code{as} can parse it back to binary.
3380 Rewrite hash functions: I want a more modular, faster library.
3382 Clean up LOTS of code.
3384 Include all the non-@file{.c} files in the maintenance chapter.
3388 Implement flonum short literals.
3390 Change all talk of expression operands to expression quantities,
3391 or perhaps to expression arguments.
3395 Whenever a @code{.text} or @code{.data} statement is seen, we close
off the current frag with an imaginary @code{.fill 0}. This is
3397 because we only have one obstack for frags, and we can't grow new
3398 frags for a new subsegment, then go back to the old subsegment and
3399 append bytes to the old frag. All this nonsense goes away if we
3400 give each subsegment its own obstack. It makes code simpler in
3401 about 10 places, but nobody has bothered to do it because C compiler
3402 output rarely changes subsegments (compared to ending frags with
3403 relaxable addresses, which is common).
3407 @c The following files in the @file{as} directory
3408 @c are symbolic links to other files, of
3409 @c the same name, in a different directory.
3412 @c @file{atof_generic.c}
3414 @c @file{atof_vax.c}
3416 @c @file{flonum_const.c}
3418 @c @file{flonum_copy.c}
3420 @c @file{flonum_get.c}
3422 @c @file{flonum_multip.c}
3424 @c @file{flonum_normal.c}
3426 @c @file{flonum_print.c}
3429 Here is a list of the source files in the @file{as} directory.
3433 This contains the pre-processing phase, which deletes comments,
3434 handles whitespace, etc. This was recently re-written, since app
3435 used to be a separate program, but RMS wanted it to be inline.
3438 This is a subroutine to append a string to another string returning a
3439 pointer just after the last @code{char} appended. (JF: All these
3440 little routines should probably all be put in one file.)
3443 Here you will find the main program of the assembler @code{as}.
3446 This is a branch office of @file{read.c}. This understands
3447 expressions, arguments. Inside @code{as}, arguments are called
3448 (expression) @emph{operands}. This is confusing, because we also talk
3449 (elsewhere) about instruction @emph{operands}. Also, expression
3450 operands are called @emph{quantities} explicitly to avoid confusion
3451 with instruction operands. What a mess.
3454 This implements the @b{frag} concept. Without frags, finding the
3455 right size for branch instructions would be a lot harder.
3458 This contains the symbol table, opcode table @emph{etc.} hashing
3462 This is a table of values of digits, for use in atoi() type
3463 functions. Could probably be flushed by using calls to strtol(), or
This contains operating system dependent source file reading
3468 routines. Since error messages often say where we are in reading
3469 the source file, they live here too. Since @code{as} is intended to
3470 run under GNU and Unix only, this might be worth flushing. Anyway,
3471 almost all C compilers support stdio.
3474 This deals with calling the pre-processor (if needed) and feeding the
3475 chunks back to the rest of the assembler the right way.
3478 This contains operating system independent parts of fatal and
3479 warning message reporting. See @file{append.c} above.
3482 This contains operating system dependent functions that write an
3483 object file for @code{as}. See @file{input-file.c} above.
3486 This implements all the directives of @code{as}. This also deals
3487 with passing input lines to the machine-dependent part of the assembler.
3491 This is a C library function that isn't in most C libraries yet.
3492 See @file{append.c} above.
3495 This implements subsegments.
3498 This implements symbols.
3501 This contains the code to perform relaxation, and to write out
3502 the object file. It is mostly operating system independent, but
3503 different OSes have different object file formats in any case.
3506 This implements @code{malloc()} or bust. See @file{append.c} above.
3509 This implements @code{realloc()} or bust. See @file{append.c} above.
3511 @item atof-generic.c
3512 The following files were taken from a machine-independent subroutine
3513 library for manipulating floating point numbers and very large
3516 @file{atof-generic.c} turns a string into a flonum internal format
3517 floating-point number.
3519 @item flonum-const.c
3520 This contains some potentially useful floating point numbers in
3524 This copies a flonum.
3526 @item flonum-multip.c
3527 This multiplies two flonums together.
3530 This copies a bignum.
3534 Here is a table of all the machine-specific files (this includes
3535 both source and header files). Typically, there is a
3536 @var{machine}.c file, a @var{machine}-opcode.h file, and an
3537 atof-@var{machine}.c file. The @var{machine}-opcode.h file should
3538 be identical to the one used by GDB (which uses it for disassembly).
3543 This contains code to turn a flonum into an IEEE literal constant.
3544 This is used by the 680x0, 32x32, sparc, and i386 versions of @code{as}.
3547 This is the opcode-table for the i386 version of the assembler.
3550 This contains all the code for the i386 version of the assembler.
3553 This defines constants and macros used by the i386 version of the assembler.
3556 Generic 68020 header file. To be linked to @file{m68k.h} on a
3557 non-sun3, non-hpux system.
3560 68010 header file for Sun2 workstations. Not well tested. To be linked
3561 to m68k.h on a sun2. (See also @samp{-DSUN_ASM_SYNTAX} in the
3565 68020 header file for Sun3 workstations. To be linked to m68k.h before
3566 compiling on a Sun3 system. (See also @samp{-DSUN_ASM_SYNTAX} in the
3570 68020 header file for an HPUX (system 5?) box. Which box, which
3571 version of HPUX, etc? I don't know.
3574 A hard or symbolic link to one of @file{m-generic.h},
3575 @file{m-hpux.h} or @file{m-sun3.h} depending on which kind of
3576 680x0 you are assembling for. (See also @samp{-DSUN_ASM_SYNTAX} in the
3580 Opcode table for 68020. This is now a link to the opcode table
3581 in the @code{GDB} source directory.
3584 All the mc680x0 code, in one huge, slow-to-compile file.
3587 This contains the code for the ns32032/ns32532 version of the assembler.
3590 @item ns32k-opcode.h
3591 This contains the opcode table for the ns32032/ns32532 version
3595 Vax specific file for describing Vax operands and other Vax-ish things.
3601 Vax specific parts of @code{as}. Also includes the former files
3602 @file{vax-ins-parse.c}, @file{vax-reg-parse.c} and @file{vip-op.c}.
3605 Turns a flonum into a Vax constant.
3608 This file contains the special code needed to put out a VMS
3609 style object file for the Vax.
3613 Here is a list of the header files in the source directory.
3614 (Warning: This section may not be very accurate. I didn't
3615 write the header files; I just report them.) Also note that I
3616 think many of these header files could be cleaned up or
3622 This describes the structures used to create the binary header data
3623 inside the object file. Perhaps we should use the one in
3624 @file{/usr/include}?
3627 This defines all the globally useful things, and pulls in <stdio.h>
3631 This defines macros useful for dealing with bignums.
3634 Structure and macros for dealing with expression()
3637 This defines the structure for dealing with floating point
3638 numbers. It #includes @file{bignum.h}.
3641 This contains a macro for appending a byte to the current frag.
3644 Structures and function definitions for the hashing functions.
3647 Function headers for the input-file.c functions.
3650 Structures and function headers for things defined in the
3651 machine dependent part of the assembler.
3654 This is the GNU systemwide include file for manipulating obstacks.
3655 Since nobody is running under real GNU yet, we include this file.
3658 Macros and function headers for reading in source files.
3660 @item struct-symbol.h
3661 Structure definition and macros for dealing with the gas
3662 internal form of a symbol.
3665 Structure definition for dealing with the numbered subsegments
3666 of the text and data segments.
3669 Macros and function headers for dealing with symbols.
3672 Structure for doing segment fixups.
3675 @comment ~subsection Test Directory
3676 @comment (Note: The test directory seems to have disappeared somewhere
3677 @comment along the line. If you want it, you'll probably have to find a
3678 @comment REALLY OLD dump tape~dots{})
3680 @comment The ~file{test/} directory is used for regression testing.
3681 @comment After you modify ~@code{as}, you can get a quick go/nogo
3682 @comment confidence test by running the new ~@code{as} over the source
3683 @comment files in this directory. You use a shell script ~file{test/do}.
3685 @comment The tests in this suite are evolving. They are not comprehensive.
3686 @comment They have, however, caught hundreds of bugs early in the debugging
3687 @comment cycle of ~@code{as}. Most test statements in this suite were naturally
3688 @comment selected: they were used to demonstrate actual ~@code{as} bugs rather
3689 @comment than being written ~i{a priori}.
3691 @comment Another testing suggestion: over 30 bugs have been found simply by
3692 @comment running examples from this manual through ~@code{as}.
3693 @comment Some examples in this manual are selected
3694 @comment to distinguish boundary conditions; they are good for testing ~@code{as}.
3696 @comment ~subsubsection Regression Testing
3697 @comment Each regression test involves assembling a file and comparing the
3698 @comment actual output of ~@code{as} to ``known good'' output files. Both
3699 @comment the object file and the error/warning message file (stderr) are
3700 @comment inspected. Optionally ~@code{as}' exit status may be checked.
3701 @comment Discrepancies are reported. Each discrepancy means either that
3702 @comment you broke some part of ~@code{as} or that the ``known good'' files
3703 @comment are now out of date and should be changed to reflect the new
3704 @comment definition of ``good''.
3706 @comment Each regression test lives in its own directory, in a tree
3707 @comment rooted in the directory ~file{test/}. Each such directory
3708 @comment has a name ending in ~file{.ret}, where `ret' stands for
3709 @comment REgression Test. The ~file{.ret} ending allows ~code{find
3710 @comment (1)} to find all regression tests in the tree, without
3711 @comment needing to list them explicitly.
3713 @comment Any ~file{.ret} directory must contain a file called
3714 @comment ~file{input} which is the source file to assemble. During
3715 @comment testing an object file ~file{output} is created, as well as
3716 @comment a file ~file{stdouterr} which contains the output to both
3717 @comment stdout and stderr. If there is a file ~file{output.good} in
3718 @comment the directory, and if ~file{output} contains exactly the
3719 @comment same data as ~file{output.good}, the file ~file{output} is
3720 @comment deleted. Likewise ~file{stdouterr} is removed if it exactly
3721 @comment matches a file ~file{stdouterr.good}. If file
3722 @comment ~file{status.good} is present, containing a decimal number
3723 @comment before a newline, the exit status of ~@code{as} is compared
3724 @comment to this number. If the status numbers are not equal, a file
3725 @comment ~file{status} is written to the directory, containing the
3726 @comment actual status as a decimal number followed by newline.
3728 @comment Should any of the ~file{*.good} files fail to match their corresponding
3729 @comment actual files, this is noted by a 1-line message on the screen during
3730 @comment the regression test, and you can use ~@code{find (1)} to find any
3731 @comment files named ~file{status}, ~file {output} or ~file{stdouterr}.
3733 @node Retargeting, License, Maintenance, Top
3734 @chapter Teaching the Assembler about a New Machine
3736 This chapter describes the steps required in order to make the
3737 assembler work with another machine's assembly language. This
3738 chapter is not complete, and only describes the steps in the
3739 broadest terms. You should look at the source for the
3740 currently supported machines in order to discover some of the
3741 details that aren't mentioned here.
3743 You should create a new file called @file{@var{machine}.c}, and
3744 add the appropriate lines to the file @file{Makefile} so that
3745 you can compile your new version of the assembler. This should
3746 be straightforward; simply add lines similar to the ones there
3747 for the four current versions of the assembler.
3749 If you want to be compatible with GDB (and the current
3750 machine-dependent versions of the assembler), you should create
3751 a file called @file{@var{machine}-opcode.h} which should
3752 contain all the information about the names of the machine
3753 instructions, their opcodes, and what addressing modes they
3754 support. If you do this right, the assembler and GDB can share
3755 this file, and you'll only have to write it once. Note that
3756 while you're writing @code{as}, you may want to use an
3757 independent program (if you have access to one), to make sure
3758 that @code{as} is emitting the correct bytes. Since @code{as}
3759 and @code{GDB} share the opcode table, an incorrect opcode
3760 table entry may make invalid bytes look OK when you disassemble
3761 them with @code{GDB}.
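
As a rough illustration of the kind of information such an opcode table
carries, here is a minimal sketch in C. The struct layout, the field
names, the mode flags, and the three instructions are all hypothetical.
Each real @file{@var{machine}-opcode.h} file uses its own layout, so
look at an existing one before writing yours.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical addressing-mode flags; a real table defines its own. */
#define MODE_REG 0x1 /* register operand */
#define MODE_IMM 0x2 /* immediate operand */
#define MODE_MEM 0x4 /* memory operand */

/* Hypothetical table entry: mnemonic, opcode bits, permitted modes. */
struct sketch_opcode {
    const char *name;
    unsigned long opcode;
    unsigned int modes;
};

static const struct sketch_opcode sketch_opcodes[] = {
    { "add", 0x10, MODE_REG | MODE_IMM },
    { "ld",  0x20, MODE_REG | MODE_MEM },
    { "jmp", 0x30, MODE_MEM },
};

/* Linear search by mnemonic; returns NULL if the name is unknown. */
const struct sketch_opcode *sketch_lookup(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof sketch_opcodes / sizeof sketch_opcodes[0]; i++)
        if (strcmp(sketch_opcodes[i].name, name) == 0)
            return &sketch_opcodes[i];
    return NULL;
}
```

Because an assembler and a disassembler would both read the same
entries, a wrong @code{opcode} value in a table like this would be
wrong in both directions, which is why an independent check of the
emitted bytes is worth having.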
3763 @section Functions You will Have to Write
3765 Your file @file{@var{machine}.c} should contain definitions for
3766 the following functions and variables. It will need to include
3767 some header files in order to use some of the structures
3768 defined in the machine-independent part of the assembler. The
3769 needed header files are mentioned in the descriptions of the
3770 functions that will need them.
3775 This long integer holds the value to place at the beginning of
3776 the @file{a.out} file. It is usually @samp{OMAGIC}, except on
3777 machines that store additional information in the magic-number.
3779 @item char comment_chars[];
3780 This character array holds the values of the characters that
3781 start a comment anywhere in a line. Comments are stripped off
3782 automatically by the machine independent part of the
3783 assembler. Note that @samp{/*} will always start a
3784 comment, and that only @samp{*/} will end a comment started by @samp{/*}.
3787 @item char line_comment_chars[];
3788 This character array holds the values of the chars that start a
3789 comment only if they are the first (non-whitespace) character
3790 on a line. If the character @samp{#} does not appear in this
3791 list, you may get unexpected results. (Various
3792 machine-independent parts of the assembler treat the comments
3793 @samp{#APP} and @samp{#NO_APP} specially, and assume that lines
3794 that start with @samp{#} are comments.)
3796 @item char EXP_CHARS[];
3797 This character array holds the letters that can separate the
3798 mantissa and the exponent of a floating point number. Typical
3799 values are @samp{e} and @samp{E}.
3801 @item char FLT_CHARS[];
3802 This character array holds the letters that--when they appear
3803 immediately after a leading zero--indicate that a number is a
3804 floating-point number. (Sort of how 0x indicates that a
3805 hexadecimal number follows.)
3807 @item pseudo_typeS md_pseudo_table[];
3808 (@var{pseudo_typeS} is defined in @file{md.h})
3809 This array contains a list of the machine-dependent directives
3810 the assembler must support. It contains the name of each
3811 pseudo-op (without the leading @samp{.}), a pointer to a
3812 function to be called when that directive is encountered, and
3813 an integer argument to be passed to that function.
3815 @item void md_begin(void)
3816 This function is called as part of the assembler's
3817 initialization. It should do any initialization required by
3818 any of your other routines.
3820 @item int md_parse_option(char **optionPTR, int *argcPTR, char ***argvPTR)
3821 This routine is called once for each option on the command line
3822 that the machine-independent part of @code{as} does not
3823 understand. This function should return non-zero if the option
3824 pointed to by @var{optionPTR} is a valid option. If it is not
3825 a valid option, this routine should return zero. The variables
3826 @var{argcPTR} and @var{argvPTR} are provided in case the option
3827 requires a filename or something similar as an argument. If
3828 the option is multi-character, @var{optionPTR} should be
3829 advanced past the end of the option, otherwise every letter in
3830 the option will be treated as a separate single-character option.
3833 @item void md_assemble(char *string)
3834 This routine is called for every machine-dependent
3835 non-directive line in the source file. It does all the real
3836 work involved in reading the opcode, parsing the operands,
3837 etc. @var{string} is a pointer to a null-terminated string
3838 that comprises the input line, with all excess whitespace and comments removed.
3841 @item void md_number_to_chars(char *outputPTR,long value,int nbytes)
3842 This routine is called to turn a C long int, short int, or char
3843 into the series of bytes that represents that number on the
3844 target machine. @var{outputPTR} points to an array where the
3845 result should be stored; @var{value} is the value to store; and
3846 @var{nbytes} is the number of bytes in @var{value} that should be stored.
3849 @item void md_number_to_imm(char *outputPTR,long value,int nbytes)
3850 This routine is called to turn a C long int, short int, or char
3851 into the series of bytes that represent an immediate value on
3852 the target machine. It is identical to the function @code{md_number_to_chars},
3853 except on NS32K machines.@refill
3855 @item void md_number_to_disp(char *outputPTR,long value,int nbytes)
3856 This routine is called to turn a C long int, short int, or char
3857 into the series of bytes that represent a displacement value on
3858 the target machine. It is identical to the function @code{md_number_to_chars},
3859 except on NS32K machines.@refill
3861 @item void md_number_to_field(char *outputPTR,long value,int nbytes)
3862 This routine is identical to @code{md_number_to_chars},
3863 except on NS32K machines.
3865 @item void md_ri_to_chars(struct relocation_info *riPTR,ri)
3866 (@code{struct relocation_info} is defined in @file{a.out.h})
3867 This routine emits the relocation info in @var{ri}
3868 in the appropriate bit-pattern for the target machine.
3869 The result should be stored in the location pointed
3870 to by @var{riPTR}. This routine may be a no-op unless you are
3871 attempting to do cross-assembly.
3873 @item char *md_atof(char type,char *outputPTR,int *sizePTR)
3874 This routine turns a series of digits into the appropriate
3875 internal representation for a floating-point number.
3876 @var{type} is a character from @var{FLT_CHARS[]} that describes
3877 what kind of floating point number is wanted; @var{outputPTR}
3878 is a pointer to an array that the result should be stored in;
3879 and @var{sizePTR} is a pointer to an integer where the size (in
3880 bytes) of the result should be stored. This routine should
3881 return an error message, or an empty string (not @code{(char *)0}) for success.
3884 @item int md_short_jump_size;
3885 This variable holds the (maximum) size in bytes of a short (16
3886 bit or so) jump created by @code{md_create_short_jump()}. This
3887 variable is used as part of the broken-word feature, and isn't
3888 needed if the assembler is compiled with
3889 @samp{-DWORKING_DOT_WORD}.
3891 @item int md_long_jump_size;
3892 This variable holds the (maximum) size in bytes of a long (32
3893 bit or so) jump created by @code{md_create_long_jump()}. This
3894 variable is used as part of the broken-word feature, and isn't
3895 needed if the assembler is compiled with
3896 @samp{-DWORKING_DOT_WORD}.
3898 @item void md_create_short_jump(char *resultPTR,long from_addr,
3899 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3900 This function emits a jump from @var{from_addr} to @var{to_addr} in
3901 the array of bytes pointed to by @var{resultPTR}. If this creates a
3902 type of jump that must be relocated, this function should call
3903 @code{fix_new()} with @var{frag} and @var{to_symbol}. The jump
3904 emitted by this function may be smaller than @var{md_short_jump_size},
3905 but it must never create a larger one.
3906 (If it creates a smaller jump, the extra bytes of memory will not be
3907 used.) This function is used as part of the broken-word feature,
3908 and isn't needed if the assembler is compiled with
3909 @samp{-DWORKING_DOT_WORD}.@refill
3911 @item void md_create_long_jump(char *ptr,long from_addr,
3912 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3913 This function is similar to the previous function,
3914 @code{md_create_short_jump()}, except that it creates a long
3915 jump instead of a short one. This function is used as part of
3916 the broken-word feature, and isn't needed if the assembler is
3917 compiled with @samp{-DWORKING_DOT_WORD}.
3919 @item int md_estimate_size_before_relax(fragS *fragPTR,int segment_type)
3920 This function does the initial setting up for relaxation. This
3921 includes forcing references to still-undefined symbols to the
3922 appropriate addressing modes.
3924 @item relax_typeS md_relax_table[];
3925 (@var{relax_typeS} is defined in @file{md.h})
3926 This array describes the various machine dependent states a
3927 frag may be in before relaxation. You will need one group of
3928 entries for each type of addressing mode you intend to relax.
3930 @item void md_convert_frag(fragS *fragPTR)
3931 (@var{fragS} is defined in @file{as.h})
3932 This routine does the required cleanup after relaxation.
3933 Relaxation has changed the type of the frag to a type that can
3934 reach its destination. This function should adjust the opcode
3935 of the frag to use the appropriate addressing mode.
3936 @var{fragPTR} points to the frag to clean up.
3938 @item void md_end(void)
3939 This function is called just before the assembler exits. It
3940 need not free up memory unless the operating system doesn't do
3941 it automatically on exit. (In which case you'll also have to
3942 track down all the other places where the assembler allocates
3943 space but never frees it.)
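
To make a few of the definitions above more concrete, here is a minimal
sketch of the character arrays and of @code{md_number_to_chars} for an
imaginary little-endian machine. The particular characters chosen and
the byte order are assumptions made for illustration only; they are not
taken from any of the supported machines.

```c
#include <assert.h>

/* Hypothetical choices for an imaginary little-endian target. */
char comment_chars[] = ";";       /* starts a comment anywhere in a line */
char line_comment_chars[] = "#";  /* starts a comment only at line start */
char EXP_CHARS[] = "eE";          /* separates mantissa from exponent */
char FLT_CHARS[] = "fFdD";        /* after a leading zero: floating point */

/* Store the low NBYTES bytes of VALUE at OUTPUTPTR, least significant
   byte first.  A big-endian target would fill the array from the
   other end instead. */
void md_number_to_chars(char *outputPTR, long value, int nbytes)
{
    unsigned long bits = (unsigned long) value;

    assert(outputPTR != 0 && nbytes > 0);
    while (nbytes-- > 0) {
        *outputPTR++ = (char) (bits & 0xff);
        bits >>= 8;
    }
}
```

Working on an unsigned copy of the value keeps the shifts well defined
for negative numbers, so that @code{-2} stored in two bytes comes out
as @code{0xfe 0xff}.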
3947 @section External Variables You will Need to Use
3949 You will need to refer to or change the following external variables
3950 from within the machine-dependent part of the assembler.
3953 @item extern char flagseen[];
3954 This array holds non-zero values in locations corresponding to
3955 the options that were on the command line. Thus, if the
3956 assembler was called with @samp{-W}, @var{flagseen['W']} would be non-zero.
3959 @item extern fragS *frag_now;
3960 This pointer points to the current frag--the frag that bytes
3961 are currently being added to. If nothing else, you will need
3962 to pass it as an argument to various machine-independent
3963 functions. It is maintained automatically by the
3964 frag-manipulating functions; you should never have to change it
3967 @item extern LITTLENUM_TYPE generic_bignum[];
3968 (@var{LITTLENUM_TYPE} is defined in @file{bignum.h}.)
3969 This is where @dfn{bignums}--numbers larger than 32 bits--are
3970 returned when they are encountered in an expression. You will
3971 need to use this if you need to implement directives (or
3972 anything else) that must deal with these large numbers.
3973 @code{Bignums} are of @code{segT} @code{SEG_BIG} (defined in
3974 @file{as.h}), and have a positive @code{X_add_number}. The
3975 @code{X_add_number} of a @code{bignum} is the number of
3976 @code{LITTLENUMS} in @var{generic_bignum} that the number takes up.
3979 @item extern FLONUM_TYPE generic_floating_point_number;
3980 (@var{FLONUM_TYPE} is defined in @file{flonum.h}.)
3981 This is where @dfn{flonums}--floating-point numbers within
3982 expressions--are returned. @code{Flonums} are of @code{segT}
3983 @code{SEG_BIG}, and have a negative @code{X_add_number}.
3984 @code{Flonums} are returned in a generic format. You will have
3985 to write a routine to turn this generic format into the
3986 appropriate floating-point format for your machine.
3988 @item extern int need_pass_2;
3989 If this variable is non-zero, the assembler has encountered an
3990 expression that cannot be assembled in a single pass. Since
3991 the second pass isn't implemented, this flag means that the
3992 assembler is punting, and is only looking for additional syntax
3993 errors. (Or something like that.)
3995 @item extern segT now_seg;
3996 This variable holds the value of the segment the assembler is
3997 currently assembling into.
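
The following sketch shows one way the @code{SEG_BIG} convention
described above might be used. It rests on several assumptions that
you should check against @file{bignum.h}: that a littlenum is a 16-bit
unsigned quantity, and that @var{generic_bignum[0]} holds the least
significant littlenum. The array here is a stand-in so that the sketch
compiles on its own, and the two helper functions are hypothetical.

```c
#include <assert.h>

/* Assumption: a littlenum is a 16-bit unsigned chunk of a larger number. */
typedef unsigned short LITTLENUM_TYPE;

/* Stand-in for the assembler's real array, so the sketch is self-contained. */
LITTLENUM_TYPE generic_bignum[8];

/* Per the text: for SEG_BIG, a positive X_add_number means a bignum
   (and gives its length in littlenums); a negative one means a flonum. */
int sketch_is_bignum(long x_add_number)
{
    return x_add_number > 0;
}

/* Copy COUNT littlenums into OUT as 2*COUNT bytes, least significant
   byte first.  Assumes generic_bignum[0] is the least significant
   littlenum. */
void sketch_bignum_to_bytes(unsigned char *out, long count)
{
    long i;

    assert(out != 0 && count > 0 && count <= 8);
    for (i = 0; i < count; i++) {
        out[2 * i] = (unsigned char) (generic_bignum[i] & 0xff);
        out[2 * i + 1] = (unsigned char) ((generic_bignum[i] >> 8) & 0xff);
    }
}
```

A directive that accepts large constants could test the sign of
@code{X_add_number} first and then emit the bytes in whatever order the
target machine requires.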
4001 @section External Functions You will Need
4003 You will find the following external functions useful (or
4004 indispensable) when you're writing the machine-dependent part
4009 @item char *frag_more(int bytes)
4010 This function allocates @var{bytes} more bytes in the current
4011 frag (or starts a new frag, if it can't expand the current frag
4012 any more) for you to store some object-file bytes in. It
4013 returns a pointer to the bytes, ready for you to store data in.
4015 @item void fix_new(fragS *frag, int where, short size, symbolS *add_symbol, symbolS *sub_symbol, long offset, int pcrel)
4016 This function stores a relocation fixup to be acted on later.
4017 @var{frag} points to the frag the relocation belongs in;
4018 @var{where} is the location within the frag where the relocation begins;
4019 @var{size} is the size of the relocation, and is usually 1 (a single byte),
4020 2 (sixteen bits), or 4 (a longword).
4021 The value @var{add_symbol} @minus{} @var{sub_symbol} + @var{offset}, is added to the byte(s)
4022 at @var{frag->literal[where]}. If @var{pcrel} is non-zero, the address of the
4023 location is subtracted from the result. A relocation entry is also added
4024 to the @file{a.out} file. @var{add_symbol}, @var{sub_symbol}, and/or
4025 @var{offset} may be NULL.@refill
4027 @item char *frag_var(relax_stateT type, int max_chars, int var,
4028 @code{relax_substateT subtype, symbolS *symbol, char *opcode)}
4029 This function creates a machine-dependent frag of type @var{type}
4030 (usually @code{rs_machine_dependent}).
4031 @var{max_chars} is the maximum size in bytes that the frag may grow by;
4032 @var{var} is the current size of the variable end of the frag;
4033 @var{subtype} is the sub-type of the frag. The sub-type is used to index into
4034 @var{md_relax_table[]} during @code{relaxation}.
4035 @var{symbol} is the symbol whose value should be used when relaxing this frag.
4036 @var{opcode} points to a byte whose value may have to be modified if the
4037 addressing mode used by this frag changes. It typically points into the
4038 @var{fr_literal[]} of the previous frag, and is used to point to a location
4039 that @code{md_convert_frag()} may have to change.@refill
4041 @item void frag_wane(fragS *fragPTR)
4042 This function is useful from within @code{md_convert_frag}. It
4043 changes a frag to type @code{rs_fill}, and sets the variable-sized
4044 piece of the frag to zero. The frag will never change in size again.
4047 @item segT expression(expressionS *retval)
4048 (@var{segT} is defined in @file{as.h}; @var{expressionS} is defined in @file{expr.h})
4049 This function parses the string pointed to by the external char
4050 pointer @var{input_line_pointer}, and returns the segment-type
4051 of the expression. It also stores the results in the
4052 @var{expressionS} pointed to by @var{retval}.
4053 @var{input_line_pointer} is advanced to point past the end of
4054 the expression. (@var{input_line_pointer} is used by other
4055 parts of the assembler. If you modify it, be sure to restore
4056 it to its original value.)
4058 @item as_warn(char *message,@dots{})
4059 If warning messages are disabled, this function does nothing.
4060 Otherwise, it prints out the current file name, and the current
4061 line number, then uses @code{fprintf} to print the
4062 @var{message} and any arguments it was passed.
4064 @item as_bad(char *message,@dots{})
4065 This function should be called when @code{as} encounters
4066 conditions that are bad enough that @code{as} should not
4067 produce an object file, but should continue reading input and
4068 printing warning and bad error messages.
4070 @item as_fatal(char *message,@dots{})
4071 This function prints out the current file name and line number,
4072 prints the word @samp{FATAL:}, then uses @code{fprintf} to
4073 print the @var{message} and any arguments it was passed. Then
4074 the assembler exits. This function should only be used for
4075 serious, unrecoverable errors.
4077 @item void float_const(int float_type)
4078 This function reads floating-point constants from the current
4079 input line, and calls @code{md_atof} to assemble them. It is
4080 useful as the function to call for the directives
4081 @samp{.single}, @samp{.double}, @samp{.float}, etc.
4082 @var{float_type} must be a character from @var{FLT_CHARS}.
4084 @item void demand_empty_rest_of_line(void);
4085 This function can be used by machine-dependent directives to
4086 make sure the rest of the input line is empty. It prints a
4087 warning message if there are additional characters on the line.
4089 @item long int get_absolute_expression(void)
4090 This function can be used by machine-dependent directives to
4091 read an absolute number from the current input line. It
4092 returns the result. If it isn't given an absolute expression,
4093 it prints a warning message and returns zero.
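
Here is a sketch of how a machine-dependent directive might combine
three of the functions above. The directive name @samp{.half}, the
handler @code{s_sketch_half}, and the little-endian byte order are
hypothetical. The versions of @code{frag_more},
@code{get_absolute_expression} and @code{demand_empty_rest_of_line}
shown here are crude stand-ins written only so that the sketch compiles
by itself; they are not the assembler's own code.

```c
#include <assert.h>
#include <stdlib.h>

/* --- Stand-ins for the machine-independent assembler (not the real code) --- */
char *input_line_pointer;      /* where parsing has got to */
static char sketch_frag[64];   /* pretend "current frag" storage */
static int sketch_used;        /* bytes handed out so far */
static int sketch_warnings;    /* count of warnings issued */

char *frag_more(int bytes)
{
    char *p = sketch_frag + sketch_used;

    assert(bytes >= 0 && sketch_used + bytes <= (int) sizeof sketch_frag);
    sketch_used += bytes;
    return p;
}

long get_absolute_expression(void)
{
    char *end;
    long value = strtol(input_line_pointer, &end, 0);

    if (end == input_line_pointer) {   /* no number found */
        sketch_warnings++;
        return 0;
    }
    input_line_pointer = end;
    return value;
}

void demand_empty_rest_of_line(void)
{
    while (*input_line_pointer == ' ' || *input_line_pointer == '\t')
        input_line_pointer++;
    if (*input_line_pointer != '\n' && *input_line_pointer != '\0')
        sketch_warnings++;
}

/* --- Hypothetical machine-dependent directive: ".half EXPR" --- */
void s_sketch_half(int unused)
{
    unsigned long bits = (unsigned long) get_absolute_expression();
    char *where = frag_more(2);

    (void) unused;
    where[0] = (char) (bits & 0xff);         /* little-endian target assumed */
    where[1] = (char) ((bits >> 8) & 0xff);
    demand_empty_rest_of_line();
}
```

In a real port, a handler of this shape would be listed in
@code{md_pseudo_table[]} under the directive's name (without the
leading @samp{.}), together with the integer argument it should
receive.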
4098 @section The Concept of Frags
4100 This assembler works to optimize the size of certain addressing
4101 modes (for example, branch instructions). This means the size of many
4102 pieces of object code cannot be determined until after assembly
4103 is finished. (This also means that the addresses of symbols cannot be
4104 determined until assembly is finished.) In order to do this,
4105 @code{as} stores the output bytes as @dfn{frags}.
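
The consequence for addresses can be sketched in a few lines of C. The
structure below is a deliberately simplified, hypothetical stand-in for
a frag (its field names are not the ones used in @file{as.h}), and the
functions only illustrate the idea that every address after a
variable-sized piece depends on the size finally chosen for that piece.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical frag: only the fields this sketch needs. */
struct sketch_frag {
    unsigned long address;     /* valid only after addresses are assigned */
    long fixed_size;           /* size of the fixed-size piece */
    long var_size;             /* size finally chosen for the variable piece */
    struct sketch_frag *next;  /* next frag in the singly-linked list */
};

/* Walk the list once sizes are settled, giving each frag its address.
   Returns the total size of the chain. */
unsigned long sketch_assign_addresses(struct sketch_frag *head)
{
    unsigned long address = 0;
    struct sketch_frag *f;

    for (f = head; f != NULL; f = f->next) {
        assert(f->fixed_size >= 0 && f->var_size >= 0);
        f->address = address;
        address += (unsigned long) (f->fixed_size + f->var_size);
    }
    return address;
}

/* A location is named as (frag, offset) until addresses are known. */
unsigned long sketch_resolve(const struct sketch_frag *frag, long offset)
{
    assert(frag != NULL && offset >= 0);
    return frag->address + (unsigned long) offset;
}
```

If the variable piece of an early frag shrinks from four bytes to one,
every later frag moves down by three bytes, which is why a symbol is
best remembered as a frag plus an offset rather than as a number.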
4107 Here is the definition of a frag (from @file{as.h})
4113 relax_stateT fr_type;
4114 relax_substateT fr_substate;
4115 unsigned long fr_address;
4117 struct symbol *fr_symbol;
4119 struct frag *fr_next;
4126 is the size of the fixed-size piece of the frag.
4129 is the maximum (?) size of the variable-sized piece of the frag.
4132 is the type of the frag.
4137 rs_machine_dependent
4140 This stores the type of machine-dependent frag this is. (what
4141 kind of addressing mode is being used, and what size is being
4145 @var{fr_address} is only valid after relaxation is finished.
4146 Before relaxation, the only way to store an address is (pointer
4147 to frag containing the address) plus (offset into the frag).
4150 This contains a number, whose meaning depends on the type of
4152 For machine-dependent frags, this contains the offset from
4153 @var{fr_symbol} that the frag wants to go to. Thus, for branch
4154 instructions it is usually zero (unless the instruction was
4155 @samp{jba foo+12} or something like that).
4158 For machine-dependent frags, this points to the symbol the frag
4162 This points to the location in the frag (or in a previous frag)
4163 of the opcode for the instruction that caused this to be a frag.
4164 @var{fr_opcode} is needed if the actual opcode must be changed
4165 in order to use a different form of the addressing mode.
4166 (For example, if a conditional branch only comes in size tiny,
4167 a large-size branch could be implemented by reversing the sense
4168 of the test, and turning it into a tiny branch over a large jump.
4169 This would require changing the opcode.)
4171 @var{fr_literal} is a variable-size array that contains the
4172 actual object bytes. A frag consists of a fixed size piece of
4173 object data, (which may be zero bytes long), followed by a
4174 piece of object data whose size may not have been determined
4175 yet. Other information includes the type of the frag (which
4176 controls how it is relaxed),
4179 This is the next frag in the singly-linked list. This is
4180 usually only needed by the machine-independent part of @code{as}.
4186 @node License, , Retargeting, Top
4187 @unnumbered GNU GENERAL PUBLIC LICENSE
4188 @center Version 1, February 1989
4191 Copyright @copyright{} 1989 Free Software Foundation, Inc.
4192 675 Mass Ave, Cambridge, MA 02139, USA
4194 Everyone is permitted to copy and distribute verbatim copies
4195 of this license document, but changing it is not allowed.
4198 @unnumberedsec Preamble
4200 The license agreements of most software companies try to keep users
4201 at the mercy of those companies. By contrast, our General Public
4202 License is intended to guarantee your freedom to share and change free
4203 software---to make sure the software is free for all its users. The
4204 General Public License applies to the Free Software Foundation's
4205 software and to any other program whose authors commit to using it.
4206 You can use it for your programs, too.
4208 When we speak of free software, we are referring to freedom, not
4209 price. Specifically, the General Public License is designed to make
4210 sure that you have the freedom to give away or sell copies of free
4211 software, that you receive source code or can get it if you want it,
4212 that you can change the software or use pieces of it in new free
4213 programs; and that you know you can do these things.
4215 To protect your rights, we need to make restrictions that forbid
4216 anyone to deny you these rights or to ask you to surrender the rights.
4217 These restrictions translate to certain responsibilities for you if you
4218 distribute copies of the software, or if you modify it.
4220 For example, if you distribute copies of a such a program, whether
4221 gratis or for a fee, you must give the recipients all the rights that
4222 you have. You must make sure that they, too, receive or can get the
4223 source code. And you must tell them their rights.
4225 We protect your rights with two steps: (1) copyright the software, and
4226 (2) offer you this license which gives you legal permission to copy,
4227 distribute and/or modify the software.
4229 Also, for each author's protection and ours, we want to make certain
4230 that everyone understands that there is no warranty for this free
4231 software. If the software is modified by someone else and passed on, we
4232 want its recipients to know that what they have is not the original, so
4233 that any problems introduced by others will not reflect on the original
4234 authors' reputations.
4236 The precise terms and conditions for copying, distribution and
4237 modification follow.
4240 @unnumberedsec TERMS AND CONDITIONS
4243 @center TERMS AND CONDITIONS
4248 This License Agreement applies to any program or other work which
4249 contains a notice placed by the copyright holder saying it may be
4250 distributed under the terms of this General Public License. The
4251 ``Program'', below, refers to any such program or work, and a ``work based
4252 on the Program'' means either the Program or any work containing the
4253 Program or a portion of it, either verbatim or with modifications. Each
4254 licensee is addressed as ``you''.
4257 You may copy and distribute verbatim copies of the Program's source
4258 code as you receive it, in any medium, provided that you conspicuously and
4259 appropriately publish on each copy an appropriate copyright notice and
4260 disclaimer of warranty; keep intact all the notices that refer to this
4261 General Public License and to the absence of any warranty; and give any
4262 other recipients of the Program a copy of this General Public License
4263 along with the Program. You may charge a fee for the physical act of
4264 transferring a copy.
4267 You may modify your copy or copies of the Program or any portion of
4268 it, and copy and distribute such modifications under the terms of Paragraph
4269 1 above, provided that you also do the following:
4273 cause the modified files to carry prominent notices stating that
4274 you changed the files and the date of any change; and
4277 cause the whole of any work that you distribute or publish, that
4278 in whole or in part contains the Program or any part thereof, either
4279 with or without modifications, to be licensed at no charge to all
4280 third parties under the terms of this General Public License (except
4281 that you may choose to grant warranty protection to some or all
4282 third parties, at your option).
4285 If the modified program normally reads commands interactively when
4286 run, you must cause it, when started running for such interactive use
4287 in the simplest and most usual way, to print or display an
4288 announcement including an appropriate copyright notice and a notice
4289 that there is no warranty (or else, saying that you provide a
4290 warranty) and that users may redistribute the program under these
4291 conditions, and telling the user how to view a copy of this General Public License.
4295 You may charge a fee for the physical act of transferring a
4296 copy, and you may at your option offer warranty protection in exchange for a fee.
4300 Mere aggregation of another independent work with the Program (or its
4301 derivative) on a volume of a storage or distribution medium does not bring
4302 the other work under the scope of these terms.
4305 You may copy and distribute the Program (or a portion or derivative of
4306 it, under Paragraph 2) in object code or executable form under the terms of
4307 Paragraphs 1 and 2 above provided that you also do one of the following:
4311 accompany it with the complete corresponding machine-readable
4312 source code, which must be distributed under the terms of
4313 Paragraphs 1 and 2 above; or,
4316 accompany it with a written offer, valid for at least three
4317 years, to give any third party free (except for a nominal charge
4318 for the cost of distribution) a complete machine-readable copy of the
4319 corresponding source code, to be distributed under the terms of
4320 Paragraphs 1 and 2 above; or,
4323 accompany it with the information you received as to where the
4324 corresponding source code may be obtained. (This alternative is
4325 allowed only for noncommercial distribution and only if you
4326 received the program in object code or executable form alone.)
4329 Source code for a work means the preferred form of the work for making
4330 modifications to it. For an executable file, complete source code means
4331 all the source code for all modules it contains; but, as a special
4332 exception, it need not include source code for modules which are standard
4333 libraries that accompany the operating system on which the executable
4334 file runs, or for standard header files or definitions files that
4335 accompany that operating system.
4338 You may not copy, modify, sublicense, distribute or transfer the
4339 Program except as expressly provided under this General Public License.
4340 Any attempt otherwise to copy, modify, sublicense, distribute or transfer
4341 the Program is void, and will automatically terminate your rights to use
4342 the Program under this License. However, parties who have received
4343 copies, or rights to use copies, from you under this General Public
4344 License will not have their licenses terminated so long as such parties
4345 remain in full compliance.
4348 By copying, distributing or modifying the Program (or any work based
4349 on the Program) you indicate your acceptance of this license to do so,
4350 and all its terms and conditions.
4353 Each time you redistribute the Program (or any work based on the
4354 Program), the recipient automatically receives a license from the original
4355 licensor to copy, distribute or modify the Program subject to these
4356 terms and conditions. You may not impose any further restrictions on the
4357 recipients' exercise of the rights granted herein.
4360 The Free Software Foundation may publish revised and/or new versions
4361 of the General Public License from time to time. Such new versions will
4362 be similar in spirit to the present version, but may differ in detail to
4363 address new problems or concerns.
4365 Each version is given a distinguishing version number. If the Program
4366 specifies a version number of the license which applies to it and ``any
4367 later version'', you have the option of following the terms and conditions
4368 either of that version or of any later version published by the Free
4369 Software Foundation. If the Program does not specify a version number of
4370 the license, you may choose any version ever published by the Free Software Foundation.
4374 If you wish to incorporate parts of the Program into other free
4375 programs whose distribution conditions are different, write to the author
4376 to ask for permission. For software which is copyrighted by the Free
4377 Software Foundation, write to the Free Software Foundation; we sometimes
4378 make exceptions for this. Our decision will be guided by the two goals
4379 of preserving the free status of all derivatives of our free software and
4380 of promoting the sharing and reuse of software generally.
4383 @heading NO WARRANTY
4390 BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
4391 FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
4392 OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
4393 PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
4394 OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
4395 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
4396 TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
4397 PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
4398 REPAIR OR CORRECTION.
4401 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
4402 ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
4403 REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
4404 INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
4405 ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT
4406 LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES
4407 SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
4408 WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
4409 ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
4413 @heading END OF TERMS AND CONDITIONS
4416 @center END OF TERMS AND CONDITIONS
4420 @unnumberedsec Appendix: How to Apply These Terms to Your New Programs
4422 If you develop a new program, and you want it to be of the greatest
4423 possible use to humanity, the best way to achieve this is to make it
4424 free software which everyone can redistribute and change under these terms.
4427 To do so, attach the following notices to the program. It is safest to
4428 attach them to the start of each source file to most effectively convey
4429 the exclusion of warranty; and each file should have at least the
4430 ``copyright'' line and a pointer to where the full notice is found.
4433 @var{one line to give the program's name and a brief idea of what it does.}
4434 Copyright (C) 19@var{yy} @var{name of author}
4436 This program is free software; you can redistribute it and/or modify
4437 it under the terms of the GNU General Public License as published by
4438 the Free Software Foundation; either version 1, or (at your option) any later version.
4441 This program is distributed in the hope that it will be useful,
4442 but WITHOUT ANY WARRANTY; without even the implied warranty of
4443 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
4444 GNU General Public License for more details.
4446 You should have received a copy of the GNU General Public License
4447 along with this program; if not, write to the Free Software
4448 Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
4451 Also add information on how to contact you by electronic and paper mail.
4453 If the program is interactive, make it output a short notice like this
4454 when it starts in an interactive mode:
4457 Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
4458 Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
4459 This is free software, and you are welcome to redistribute it
4460 under certain conditions; type `show c' for details.
4463 The hypothetical commands `show w' and `show c' should show the
4464 appropriate parts of the General Public License. Of course, the
4465 commands you use may be called something other than `show w' and `show
4466 c'; they could even be mouse-clicks or menu items---whatever suits your program.
4469 You should also get your employer (if you work as a programmer) or your
4470 school, if any, to sign a ``copyright disclaimer'' for the program, if
4471 necessary. Here is a sample; alter the names:
4474 Yoyodyne, Inc., hereby disclaims all copyright interest in the
4475 program `Gnomovision' (a program to direct compilers to make passes
4476 at assemblers) written by James Hacker.
4478 @var{signature of Ty Coon}, 1 April 1989
4479 Ty Coon, President of Vice
4482 That's all there is to it!