Commit | Line | Data |
---|---|---|
ae6cd60f KR |
1 | \input texinfo |
2 | @setfilename internals.info | |
af16e411 FF |
3 | @node Top |
4 | @top Assembler Internals | |
5 | @raisesections | |
582ffe70 KR |
6 | @cindex internals |
7 | ||
af16e411 FF |
8 | This chapter describes the internals of the assembler. It is incomplete, but |
9 | it may help a bit. | |
ae6cd60f | 10 | |
af16e411 FF |
11 | This chapter was last modified on $Date$. It is not updated regularly, and it |
12 | may be out of date. | |
ae6cd60f | 13 | |
582ffe70 | 14 | @menu |
af16e411 | 15 | * GAS versions:: GAS versions |
582ffe70 | 16 | * Data types:: Data types |
af16e411 FF |
17 | * GAS processing:: What GAS does when it runs |
18 | * Porting GAS:: Porting GAS | |
19 | * Relaxation:: Relaxation | |
20 | * Broken words:: Broken words | |
21 | * Internal functions:: Internal functions | |
22 | * Test suite:: Test suite | |
23 | @end menu | |
24 | ||
25 | @node GAS versions | |
26 | @section GAS versions | |
27 | ||
28 | GAS has acquired layers of code over time. The original GAS only supported the | |
29 | a.out object file format, with three sections. Support for multiple sections | |
30 | has been added in two different ways. | |
31 | ||
32 | The preferred approach is to use the version of GAS created when the symbol | |
33 | @code{BFD_ASSEMBLER} is defined. The other versions of GAS are documented for | |
34 | historical purposes, and to help anybody who has to debug code written for | |
35 | them. | |
36 | ||
37 | The type @code{segT} is used to represent a section in code which must work | |
38 | with all versions of GAS. | |
39 | ||
40 | @menu | |
41 | * Original GAS:: Original GAS version | |
42 | * MANY_SEGMENTS:: MANY_SEGMENTS gas version | |
43 | * BFD_ASSEMBLER:: BFD_ASSEMBLER gas version | |
582ffe70 KR |
44 | @end menu |
45 | ||
af16e411 FF |
46 | @node Original GAS |
47 | @subsection Original GAS | |
48 | ||
49 | The original GAS only supported the a.out object file format with three | |
50 | sections: @samp{.text}, @samp{.data}, and @samp{.bss}. This is the version of | |
51 | GAS that is compiled if neither @code{BFD_ASSEMBLER} nor @code{MANY_SEGMENTS} | |
52 | is defined. This version of GAS is still used for the m68k-aout target, and | |
53 | perhaps others. | |
54 | ||
55 | This version of GAS should not be used for any new development. | |
582ffe70 | 56 | |
af16e411 FF |
57 | There is still code that is specific to this version of GAS, notably in |
58 | @file{write.c}. There is no way for this code to loop through all the | |
59 | sections; it simply looks at global variables like @code{text_frag_root} and | |
60 | @code{data_frag_root}. | |
582ffe70 | 61 | |
af16e411 FF |
62 | The type @code{segT} is an enum. |
63 | ||
64 | @node MANY_SEGMENTS | |
65 | @subsection MANY_SEGMENTS gas version | |
66 | @cindex MANY_SEGMENTS | |
67 | ||
68 | The @code{MANY_SEGMENTS} version of gas is only used for COFF. It uses the BFD | |
69 | library, but it writes out all the data itself using @code{bfd_write}. This | |
70 | version of gas supports up to 40 normal sections. The section names are stored | |
71 | in the @code{seg_name} array. Other information is stored in the | |
72 | @code{segment_info} array. | |
73 | ||
74 | The type @code{segT} is an enum. Code that wants to examine all the sections | |
75 | can use a @code{segT} variable as loop index from @code{SEG_E0} up to but not | |
76 | including @code{SEG_UNKNOWN}. | |
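The loop described above might look like the following minimal sketch. The enum is a simplified stand-in, not the real GAS declaration: only @code{SEG_E0} and @code{SEG_UNKNOWN} come from the text, while the intermediate enumerators, the contents of the @code{seg_name} table, and the function @code{count_sections} are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified stand-in for the MANY_SEGMENTS segT enum.  Only SEG_E0
   and SEG_UNKNOWN are taken from the text; SEG_E1 and SEG_E2 are
   hypothetical, and the real enum has many more enumerators.  */
typedef enum { SEG_E0, SEG_E1, SEG_E2, SEG_UNKNOWN } segT;

/* Hypothetical contents standing in for the seg_name array.  */
static const char *const seg_name[] = { ".text", ".data", ".bss" };

/* Visit every normal section: start at SEG_E0 and stop before
   SEG_UNKNOWN.  Returns the number of sections visited.  */
static int
count_sections (void)
{
  int n = 0;
  segT seg;

  for (seg = SEG_E0; seg < SEG_UNKNOWN; seg = (segT) (seg + 1))
    {
      assert (seg < SEG_UNKNOWN);
      printf ("section %d: %s\n", (int) seg, seg_name[seg]);
      n++;
    }
  return n;
}
```

The explicit cast in the increment keeps the loop valid even for compilers that are strict about arithmetic on enum types.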
77 | ||
78 | Most of the code specific to this version of GAS is in the file | |
79 | @file{config/obj-coff.c}, in the portion of that file that is compiled when | |
80 | @code{BFD_ASSEMBLER} is not defined. | |
81 | ||
82 | This version of GAS is still used for several COFF targets. | |
83 | ||
84 | @node BFD_ASSEMBLER | |
85 | @subsection BFD_ASSEMBLER gas version | |
86 | @cindex BFD_ASSEMBLER | |
87 | ||
88 | The preferred version of GAS is the @code{BFD_ASSEMBLER} version. In this | |
89 | version of GAS, the output file is a normal BFD, and the BFD routines are used | |
90 | to generate the output. | |
91 | ||
92 | @code{BFD_ASSEMBLER} will automatically be used for certain targets, including | |
93 | those that use the ELF, ECOFF, and SOM object file formats, and also all Alpha, | |
94 | MIPS, PowerPC, and SPARC targets. You can force the use of | |
95 | @code{BFD_ASSEMBLER} for other targets with the configure option | |
96 | @samp{--enable-bfd-assembler}; however, it has not been tested for many | |
97 | targets, and cannot be assumed to work. | |
582ffe70 KR |
98 | |
99 | @node Data types | |
100 | @section Data types | |
101 | @cindex internals, data types | |
102 | ||
af16e411 FF |
103 | This section describes some fundamental GAS data types. |
104 | ||
105 | @menu | |
106 | * Symbols:: The symbolS structure | |
107 | * Expressions:: The expressionS structure | |
108 | * Fixups:: The fixS structure | |
109 | * Frags:: The fragS structure | |
110 | @end menu | |
111 | ||
112 | @node Symbols | |
ae6cd60f | 113 | @subsection Symbols |
582ffe70 KR |
114 | @cindex internals, symbols |
115 | @cindex symbols, internal | |
af16e411 | 116 | @cindex symbolS structure |
582ffe70 | 117 | |
ae6cd60f | 118 | The definition for @code{struct symbol}, also known as @code{symbolS}, is |
af16e411 | 119 | located in @file{struc-symbol.h}. Symbol structures contain the following |
ae6cd60f | 120 | fields: |
582ffe70 KR |
121 | |
122 | @table @code | |
123 | @item sy_value | |
ae6cd60f | 124 | This is an @code{expressionS} that describes the value of the symbol. It might |
af16e411 FF |
125 | refer to one or more other symbols; if so, its true value may not be known |
126 | until @code{resolve_symbol_value} is called in @code{write_object_file}. | |
582ffe70 | 127 | |
af16e411 FF |
128 | The expression is often simply a constant. Before @code{resolve_symbol_value} |
129 | is called, the value is the offset from the frag (@pxref{Frags}). Afterward, | |
130 | the frag address has been added in. | |
582ffe70 KR |
131 | |
132 | @item sy_resolved | |
ae6cd60f KR |
133 | This field is non-zero if the symbol's value has been completely resolved. It |
134 | is used during the final pass over the symbol table. | |
582ffe70 KR |
135 | |
136 | @item sy_resolving | |
137 | This field is used to detect loops while resolving the symbol's value. | |
138 | ||
139 | @item sy_used_in_reloc | |
ae6cd60f KR |
140 | This field is non-zero if the symbol is used by a relocation entry. If a local |
141 | symbol is used in a relocation entry, it must be possible to redirect those | |
142 | relocations to other symbols, or this symbol cannot be removed from the final | |
143 | symbol list. | |
582ffe70 KR |
144 | |
145 | @item sy_next | |
146 | @itemx sy_previous | |
ae6cd60f KR |
147 | These pointers to other @code{symbolS} structures describe a singly or doubly |
148 | linked list. (If @code{SYMBOLS_NEED_BACKPOINTERS} is not defined, the | |
af16e411 FF |
149 | @code{sy_previous} field will be omitted; @code{SYMBOLS_NEED_BACKPOINTERS} is |
150 | always defined if @code{BFD_ASSEMBLER}.) These fields should be accessed with | |
151 | the @code{symbol_next} and @code{symbol_previous} macros. | |
582ffe70 KR |
152 | |
153 | @item sy_frag | |
af16e411 | 154 | This points to the frag (@pxref{Frags}) that this symbol is attached to. |
582ffe70 KR |
155 | |
156 | @item sy_used | |
ae6cd60f KR |
157 | Whether the symbol is used as an operand or in an expression. Note: Not all of |
158 | the backends keep this information accurate; backends which use this bit are | |
159 | responsible for setting it when a symbol is used in backend routines. | |
582ffe70 | 160 | |
af16e411 FF |
161 | @item sy_mri_common |
162 | Whether the symbol is an MRI common symbol created by the @code{COMMON} | |
163 | pseudo-op when assembling in MRI mode. | |
164 | ||
582ffe70 | 165 | @item bsym |
af16e411 FF |
166 | If @code{BFD_ASSEMBLER} is defined, this points to the BFD @code{asymbol} that |
167 | will be used in writing the object file. | |
582ffe70 KR |
168 | |
169 | @item sy_name_offset | |
ae6cd60f | 170 | (Only used if @code{BFD_ASSEMBLER} is not defined.) This is the position of |
af16e411 | 171 | the symbol's name in the string table of the object file. On some formats, |
ae6cd60f KR |
172 | this will start at position 4, with position 0 reserved for unnamed symbols. |
173 | This field is not used until @code{write_object_file} is called. | |
582ffe70 KR |
174 | |
175 | @item sy_symbol | |
ae6cd60f KR |
176 | (Only used if @code{BFD_ASSEMBLER} is not defined.) This is the |
177 | format-specific symbol structure, as it would be written into the object file. | |
582ffe70 KR |
178 | |
179 | @item sy_number | |
ae6cd60f KR |
180 | (Only used if @code{BFD_ASSEMBLER} is not defined.) This is a 24-bit symbol |
181 | number, for use in constructing relocation table entries. | |
582ffe70 KR |
182 | |
183 | @item sy_obj | |
ae6cd60f KR |
184 | This format-specific data is of type @code{OBJ_SYMFIELD_TYPE}. If no macro by |
185 | that name is defined in @file{obj-format.h}, this field is not defined. | |
582ffe70 KR |
186 | |
187 | @item sy_tc | |
ae6cd60f KR |
188 | This processor-specific data is of type @code{TC_SYMFIELD_TYPE}. If no macro |
189 | by that name is defined in @file{targ-cpu.h}, this field is not defined. | |
582ffe70 KR |
190 | |
191 | @item TARGET_SYMBOL_FIELDS | |
ae6cd60f KR |
192 | If this macro is defined, it defines additional fields in the symbol structure. |
193 | This macro is obsolete, and should be replaced when possible by uses of | |
194 | @code{OBJ_SYMFIELD_TYPE} and @code{TC_SYMFIELD_TYPE}. | |
582ffe70 KR |
195 | @end table |
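The interaction of @code{sy_value}, @code{sy_frag}, @code{sy_resolved}, and @code{sy_resolving} can be illustrated with a small model. This is a sketch under simplifying assumptions, not the code of @code{resolve_symbol_value}: the value is a plain number rather than an @code{expressionS}, and the @code{depends_on} field and the function @code{resolve} are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real structures have many more fields.  */
struct frag { long fr_address; };

struct symbol
{
  long value;                 /* stands in for a constant sy_value */
  struct symbol *depends_on;  /* hypothetical: symbol whose value is added */
  struct frag *sy_frag;       /* frag the symbol is attached to, or NULL */
  unsigned int sy_resolved : 1;
  unsigned int sy_resolving : 1;
};

/* Resolve SYM in place.  Returns 0 on success, or -1 if the symbol
   is (indirectly) defined in terms of itself.  */
static int
resolve (struct symbol *sym)
{
  assert (sym != NULL);
  if (sym->sy_resolved)
    return 0;
  if (sym->sy_resolving)
    return -1;                /* came back to a symbol still in progress */

  sym->sy_resolving = 1;
  if (sym->depends_on != NULL)
    {
      if (resolve (sym->depends_on) != 0)
        {
          sym->sy_resolving = 0;
          return -1;
        }
      sym->value += sym->depends_on->value;
    }
  /* Before resolution the value is an offset from the frag;
     afterward the frag address has been added in.  */
  if (sym->sy_frag != NULL)
    sym->value += sym->sy_frag->fr_address;
  sym->sy_resolving = 0;
  sym->sy_resolved = 1;
  return 0;
}
```

For example, a symbol at offset 8 in a frag placed at address 100 resolves to 108, and two symbols defined in terms of each other are reported as a loop.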
196 | ||
af16e411 FF |
197 | There are a number of access routines used to extract the fields of a |
198 | @code{symbolS} structure. When possible, these routines should be used rather | |
199 | than referring to the fields directly. These routines will work for any GAS | |
200 | version. | |
582ffe70 | 201 | |
af16e411 FF |
202 | @table @code |
203 | @item S_SET_VALUE | |
204 | @cindex S_SET_VALUE | |
205 | Set the symbol's value. | |
206 | ||
207 | @item S_GET_VALUE | |
208 | @cindex S_GET_VALUE | |
209 | Get the symbol's value. This will cause @code{resolve_symbol_value} to be | |
210 | called if necessary, so @code{S_GET_VALUE} should only be called when it is | |
211 | safe to resolve symbols (i.e., after the entire input file has been read and | |
212 | all symbols have been defined). | |
213 | ||
214 | @item S_SET_SEGMENT | |
215 | @cindex S_SET_SEGMENT | |
216 | Set the section of the symbol. | |
217 | ||
218 | @item S_GET_SEGMENT | |
219 | @cindex S_GET_SEGMENT | |
220 | Get the symbol's section. | |
221 | ||
222 | @item S_GET_NAME | |
223 | @cindex S_GET_NAME | |
224 | Get the name of the symbol. | |
225 | ||
226 | @item S_SET_NAME | |
227 | @cindex S_SET_NAME | |
228 | Set the name of the symbol. | |
229 | ||
230 | @item S_IS_EXTERNAL | |
231 | @cindex S_IS_EXTERNAL | |
232 | Return non-zero if the symbol is externally visible. | |
233 | ||
234 | @item S_IS_EXTERN | |
235 | @cindex S_IS_EXTERN | |
236 | A synonym for @code{S_IS_EXTERNAL}. Don't use it. | |
237 | ||
238 | @item S_IS_WEAK | |
239 | @cindex S_IS_WEAK | |
240 | Return non-zero if the symbol is weak. | |
241 | ||
242 | @item S_IS_COMMON | |
243 | @cindex S_IS_COMMON | |
244 | Return non-zero if this is a common symbol. Common symbols are sometimes | |
245 | represented as undefined symbols with a value, in which case this function will | |
246 | not be reliable. | |
247 | ||
248 | @item S_IS_DEFINED | |
249 | @cindex S_IS_DEFINED | |
250 | Return non-zero if this symbol is defined. This function is not reliable when | |
251 | called on a common symbol. | |
252 | ||
253 | @item S_IS_DEBUG | |
254 | @cindex S_IS_DEBUG | |
255 | Return non-zero if this is a debugging symbol. | |
256 | ||
257 | @item S_IS_LOCAL | |
258 | @cindex S_IS_LOCAL | |
259 | Return non-zero if this is a local assembler symbol which should not be | |
260 | included in the final symbol table. Note that this is not the opposite of | |
261 | @code{S_IS_EXTERNAL}. The @samp{-L} assembler option affects the return value | |
262 | of this function. | |
263 | ||
264 | @item S_SET_EXTERNAL | |
265 | @cindex S_SET_EXTERNAL | |
266 | Mark the symbol as externally visible. | |
267 | ||
268 | @item S_CLEAR_EXTERNAL | |
269 | @cindex S_CLEAR_EXTERNAL | |
270 | Mark the symbol as not externally visible. | |
271 | ||
272 | @item S_SET_WEAK | |
273 | @cindex S_SET_WEAK | |
274 | Mark the symbol as weak. | |
275 | ||
276 | @item S_GET_TYPE | |
277 | @itemx S_GET_DESC | |
278 | @itemx S_GET_OTHER | |
279 | @cindex S_GET_TYPE | |
280 | @cindex S_GET_DESC | |
281 | @cindex S_GET_OTHER | |
282 | Get the @code{type}, @code{desc}, and @code{other} fields of the symbol. These | |
283 | are only defined for object file formats for which they make sense (primarily | |
284 | a.out). | |
285 | ||
286 | @item S_SET_TYPE | |
287 | @itemx S_SET_DESC | |
288 | @itemx S_SET_OTHER | |
289 | @cindex S_SET_TYPE | |
290 | @cindex S_SET_DESC | |
291 | @cindex S_SET_OTHER | |
292 | Set the @code{type}, @code{desc}, and @code{other} fields of the symbol. These | |
293 | are only defined for object file formats for which they make sense (primarily | |
294 | a.out). | |
295 | ||
296 | @item S_GET_SIZE | |
297 | @cindex S_GET_SIZE | |
298 | Get the size of a symbol. This is only defined for object file formats for | |
299 | which it makes sense (primarily ELF). | |
300 | ||
301 | @item S_SET_SIZE | |
302 | @cindex S_SET_SIZE | |
303 | Set the size of a symbol. This is only defined for object file formats for | |
304 | which it makes sense (primarily ELF). | |
305 | @end table | |
306 | ||
307 | @node Expressions | |
ae6cd60f | 308 | @subsection Expressions |
582ffe70 KR |
309 | @cindex internals, expressions |
310 | @cindex expressions, internal | |
af16e411 FF |
311 | @cindex expressionS structure |
312 | ||
313 | Expressions are stored in an @code{expressionS} structure. The structure is | |
314 | defined in @file{expr.h}. | |
315 | ||
316 | @cindex expression | |
317 | The macro @code{expression} will create an @code{expressionS} structure based | |
318 | on the text found at the global variable @code{input_line_pointer}. | |
319 | ||
320 | @cindex make_expr_symbol | |
321 | @cindex expr_symbol_where | |
322 | A single @code{expressionS} structure can represent a single operation. | |
323 | Complex expressions are formed by creating @dfn{expression symbols} and | |
324 | combining them in @code{expressionS} structures. An expression symbol is | |
325 | created by calling @code{make_expr_symbol}. An expression symbol should | |
326 | naturally never appear in a symbol table, and the implementation of | |
327 | @code{S_IS_LOCAL} (@pxref{Symbols}) reflects that. The function | |
328 | @code{expr_symbol_where} returns non-zero if a symbol is an expression symbol, | |
329 | and also returns the file and line for the expression which caused it to be | |
330 | created. | |
331 | ||
332 | The @code{expressionS} structure has two symbol fields, a number field, an | |
333 | operator field, and a field indicating whether the number is unsigned. | |
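As an illustration of how single-operation structures combine, here is a minimal sketch that builds and evaluates @code{(2 + 3) * 4}. The types are simplified stand-ins: the field names are modeled on @file{expr.h} but should be treated as assumptions, only four operators are shown, and @code{constant_symbol}, @code{expr_symbol} (a stand-in for @code{make_expr_symbol}), @code{eval}, and @code{example_value} are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* A few operators; the real operatorT has many more.  */
typedef enum { O_constant, O_symbol, O_add, O_multiply } operatorT;

struct symbol;

/* Simplified stand-in for expressionS: two symbol fields, a number
   field, an operator field, and an unsigned flag.  */
typedef struct
{
  struct symbol *X_add_symbol;
  struct symbol *X_op_symbol;
  long X_add_number;
  operatorT X_op;
  unsigned int X_unsigned : 1;
} expressionS;

struct symbol { expressionS sy_value; };

static long eval (const expressionS *e);

static long
eval_symbol (const struct symbol *s)
{
  assert (s != NULL);
  return eval (&s->sy_value);
}

/* Evaluate one expression; the operator says how to read the fields.  */
static long
eval (const expressionS *e)
{
  switch (e->X_op)
    {
    case O_constant:
      return e->X_add_number;
    case O_symbol:
      return eval_symbol (e->X_add_symbol) + e->X_add_number;
    case O_add:
      return (eval_symbol (e->X_add_symbol)
              + eval_symbol (e->X_op_symbol) + e->X_add_number);
    case O_multiply:
      return (eval_symbol (e->X_add_symbol)
              * eval_symbol (e->X_op_symbol) + e->X_add_number);
    }
  return 0;
}

static struct symbol
constant_symbol (long n)
{
  struct symbol s = { { NULL, NULL, 0, O_constant, 0 } };
  s.sy_value.X_add_number = n;
  return s;
}

/* Hypothetical stand-in for make_expr_symbol: wrap one operation.  */
static struct symbol
expr_symbol (operatorT op, struct symbol *left, struct symbol *right)
{
  struct symbol s = { { NULL, NULL, 0, O_constant, 0 } };
  s.sy_value.X_add_symbol = left;
  s.sy_value.X_op_symbol = right;
  s.sy_value.X_op = op;
  return s;
}

/* Build (2 + 3) * 4: the sum needs its own expression symbol because
   the product can only hold one operation.  */
static long
example_value (void)
{
  struct symbol two = constant_symbol (2);
  struct symbol three = constant_symbol (3);
  struct symbol four = constant_symbol (4);
  struct symbol sum = expr_symbol (O_add, &two, &three);
  expressionS product = expr_symbol (O_multiply, &sum, &four).sy_value;

  return eval (&product);
}
```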
334 | ||
335 | The operator field is of type @code{operatorT}, and describes how to interpret | |
336 | the other fields; see the definition in @file{expr.h} for the possibilities. | |
337 | ||
338 | An @code{operatorT} value of @code{O_big} indicates either a floating point | |
339 | number, stored in the global variable @code{generic_floating_point_number}, or | |
340 | an integer too large to store in an @code{offsetT} type, stored in the global | |
341 | array @code{generic_bignum}. This rather inflexible approach makes it | |
342 | impossible to use floating point numbers or large integers in complex | |
343 | expressions. | |
344 | ||
345 | @node Fixups | |
ae6cd60f | 346 | @subsection Fixups |
582ffe70 KR |
347 | @cindex internals, fixups |
348 | @cindex fixups | |
af16e411 FF |
349 | @cindex fixS structure |
350 | ||
351 | A @dfn{fixup} is basically anything which cannot be resolved in the first | |
352 | pass. Sometimes a fixup can be resolved by the end of the assembly; if not, | |
353 | the fixup becomes a relocation entry in the object file. | |
354 | ||
355 | @cindex fix_new | |
356 | @cindex fix_new_exp | |
357 | A fixup is created by a call to @code{fix_new} or @code{fix_new_exp}. Both | |
358 | take a frag (@pxref{Frags}), a position within the frag, a size, an indication | |
359 | of whether the fixup is PC relative, and a type. In a @code{BFD_ASSEMBLER} | |
360 | GAS, the type is nominally a @code{bfd_reloc_code_real_type}, but several | |
361 | targets use other type codes to represent fixups that cannot be described as | |
362 | relocations. | |
363 | ||
364 | The @code{fixS} structure has a number of fields, several of which are obsolete | |
365 | or are only used by a particular target. The important fields are: | |
366 | ||
367 | @table @code | |
368 | @item fx_frag | |
369 | The frag (@pxref{Frags}) this fixup is in. | |
370 | ||
371 | @item fx_where | |
372 | The location within the frag where the fixup occurs. | |
373 | ||
374 | @item fx_addsy | |
375 | The symbol this fixup is against. Typically, the value of this symbol is added | |
376 | into the object contents. This may be NULL. | |
377 | ||
378 | @item fx_subsy | |
379 | The value of this symbol is subtracted from the object contents. This is | |
380 | normally NULL. | |
381 | ||
382 | @item fx_offset | |
383 | A number which is added into the fixup. | |
582ffe70 | 384 | |
af16e411 FF |
385 | @item fx_addnumber |
386 | Some CPU backends use this field to convey information between | |
387 | @code{md_apply_fix} and @code{tc_gen_reloc}. The machine independent code does | |
388 | not use it. | |
389 | ||
390 | @item fx_next | |
391 | The next fixup in the section. | |
392 | ||
393 | @item fx_r_type | |
394 | The type of the fixup. This field is only defined if @code{BFD_ASSEMBLER}, or | |
395 | if the target defines @code{NEED_FX_R_TYPE}. | |
396 | ||
397 | @item fx_size | |
398 | The size of the fixup. This is mostly used for error checking. | |
399 | ||
400 | @item fx_pcrel | |
401 | Whether the fixup is PC relative. | |
402 | ||
403 | @item fx_done | |
404 | Non-zero if the fixup has been applied, and no relocation entry needs to be | |
405 | generated. | |
406 | ||
407 | @item fx_file | |
408 | @itemx fx_line | |
409 | The file and line where the fixup was created. | |
410 | ||
411 | @item tc_fix_data | |
412 | This has the type @code{TC_FIX_TYPE}, and is only defined if the target defines | |
413 | that macro. | |
414 | @end table | |
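The following sketch shows one way the fields above could combine when a fixup is resolved at the end of assembly. It is a simplified model, not the logic of @code{fixup_segment}: the @code{defined} flag on the symbol, the function @code{try_resolve_fix}, and the exact PC-relative adjustment are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the real structures.  */
struct frag { long fr_address; };
struct symbol { long value; int defined; };   /* 'defined' is hypothetical */

struct fix
{
  struct frag *fx_frag;       /* frag this fixup is in */
  long fx_where;              /* location within the frag */
  struct symbol *fx_addsy;    /* symbol whose value is added, or NULL */
  struct symbol *fx_subsy;    /* symbol whose value is subtracted, or NULL */
  long fx_offset;             /* number added into the fixup */
  unsigned int fx_pcrel : 1;
  unsigned int fx_done : 1;
};

/* Try to resolve FIX.  On success, store the value to patch into the
   object contents in *VALUE, set fx_done, and return 1.  Otherwise
   return 0, leaving the fixup to become a relocation entry.  */
static int
try_resolve_fix (struct fix *fix, long *value)
{
  long v;

  assert (fix != NULL && value != NULL);
  v = fix->fx_offset;
  if (fix->fx_addsy != NULL)
    {
      if (!fix->fx_addsy->defined)
        return 0;
      v += fix->fx_addsy->value;
    }
  if (fix->fx_subsy != NULL)
    {
      if (!fix->fx_subsy->defined)
        return 0;
      v -= fix->fx_subsy->value;
    }
  if (fix->fx_pcrel)
    v -= fix->fx_frag->fr_address + fix->fx_where;

  *value = v;
  fix->fx_done = 1;
  return 1;
}
```

A real backend also has to consider sections, overflow of @code{fx_size}, and target-specific relocation rules, all of which this sketch leaves out.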
415 | ||
416 | @node Frags | |
ae6cd60f | 417 | @subsection Frags |
582ffe70 KR |
418 | @cindex internals, frags |
419 | @cindex frags | |
af16e411 | 420 | @cindex fragS structure |
582ffe70 | 421 | |
af16e411 FF |
422 | The @code{fragS} structure is defined in @file{as.h}. Each frag represents a |
423 | portion of the final object file. As GAS reads the source file, it creates | |
424 | frags to hold the data that it reads. At the end of the assembly the frags and | |
425 | fixups are processed to produce the final contents. | |
ae6cd60f KR |
426 | |
427 | @table @code | |
ae6cd60f KR |
428 | @item fr_address |
429 | The address of the frag. This is not set until the assembler rescans the list | |
430 | of all frags after the entire input file is parsed. The function | |
431 | @code{relax_segment} fills in this field. | |
432 | ||
433 | @item fr_next | |
434 | Pointer to the next frag in this (sub)section. | |
435 | ||
436 | @item fr_fix | |
437 | Fixed number of characters we know we're going to emit to the output file. May | |
438 | be zero. | |
439 | ||
440 | @item fr_var | |
441 | Variable number of characters we may output, after the initial @code{fr_fix} | |
442 | characters. May be zero. | |
443 | ||
af16e411 FF |
444 | @item fr_offset |
445 | The interpretation of this field is controlled by @code{fr_type}. Generally, | |
446 | if @code{fr_var} is non-zero, this is a repeat count: the @code{fr_var} | |
447 | characters are output @code{fr_offset} times. | |
ae6cd60f KR |
448 | |
449 | @item line | |
af16e411 | 450 | Holds line number info when an assembler listing was requested. |
ae6cd60f KR |
451 | |
452 | @item fr_type | |
453 | Relaxation state. This field indicates the interpretation of @code{fr_offset}, | |
454 | @code{fr_symbol} and the variable-length tail of the frag, as well as the | |
455 | treatment it gets in various phases of processing. It does not affect the | |
456 | initial @code{fr_fix} characters; they are always supposed to be output | |
457 | verbatim (fixups aside). See below for specific values this field can have. | |
458 | ||
459 | @item fr_subtype | |
460 | Relaxation substate. If the macro @code{md_relax_frag} isn't defined, this is | |
af16e411 FF |
461 | assumed to be an index into @code{TC_GENERIC_RELAX_TABLE} for the generic |
462 | relaxation code to process (@pxref{Relaxation}). If @code{md_relax_frag} is | |
463 | defined, this field is available for any use by the CPU-specific code. | |
464 | ||
465 | @item fr_symbol | |
466 | This normally indicates the symbol to use when relaxing the frag according to | |
467 | @code{fr_type}. | |
468 | ||
469 | @item fr_opcode | |
470 | Points to the lowest-addressed byte of the opcode, for use in relaxation. | |
ae6cd60f KR |
471 | |
472 | @item fr_pcrel_adjust | |
473 | @itemx fr_bsr | |
474 | These fields are only used in the NS32k configuration. But since @code{struct | |
475 | frag} is defined before the CPU-specific header files are included, they must | |
476 | unconditionally be defined. | |
477 | ||
af16e411 FF |
478 | @item fr_file |
479 | @itemx fr_line | |
480 | The file and line where this frag was last modified. | |
481 | ||
ae6cd60f KR |
482 | @item fr_literal |
483 | Declared as a one-character array, this last field grows arbitrarily large to | |
484 | hold the actual contents of the frag. | |
ae6cd60f KR |
485 | @end table |
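The way @code{fr_literal} grows can be pictured with the classic C idiom of over-allocating a structure whose last member is a one-character array. This is only a sketch: it uses @code{malloc} for simplicity, which is an assumption about allocation made for the example, the structure is cut down to three fields, and @code{frag_with_contents} is a hypothetical helper. In current ISO C the same layout is usually written with a flexible array member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Cut-down stand-in for struct frag.  */
struct frag
{
  struct frag *fr_next;
  long fr_fix;
  char fr_literal[1];   /* really as long as the contents require */
};

/* Allocate a frag with room for LEN bytes of contents after the
   header and copy DATA into it.  Returns NULL if allocation fails;
   the caller releases the frag with free.  */
static struct frag *
frag_with_contents (const char *data, size_t len)
{
  struct frag *f;

  assert (data != NULL || len == 0);
  f = malloc (sizeof (struct frag) + len);
  if (f == NULL)
    return NULL;
  f->fr_next = NULL;
  f->fr_fix = (long) len;
  if (len > 0)
    memcpy (f->fr_literal, data, len);
  return f;
}
```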
486 | ||
487 | These are the possible relaxation states, provided in the enumeration type | |
488 | @code{relax_stateT}, and the interpretations they represent for the other | |
489 | fields: | |
490 | ||
491 | @table @code | |
ae6cd60f | 492 | @item rs_align |
af16e411 | 493 | @itemx rs_align_code |
ae6cd60f KR |
494 | The start of the following frag should be aligned on some boundary. In this |
495 | frag, @code{fr_offset} is the logarithm (base 2) of the alignment in bytes. | |
496 | (For example, if alignment on an 8-byte boundary were desired, @code{fr_offset} | |
497 | would have a value of 3.) The variable characters indicate the fill pattern to | |
d7bf6158 ILT |
498 | be used. The @code{fr_subtype} field holds the maximum number of bytes to skip |
499 | when doing this alignment. If more bytes are needed, the alignment is not | |
500 | done. An @code{fr_subtype} value of 0 means no maximum, which is the normal | |
501 | case. Target backends can use @code{rs_align_code} to handle certain types of | |
502 | alignment differently. | |
ae6cd60f KR |
503 | |
504 | @item rs_broken_word | |
af16e411 FF |
505 | This indicates that ``broken word'' processing should be done (@pxref{Broken |
506 | words}). If broken word processing is not necessary on the target machine, | |
507 | this enumerator value will not be defined. | |
ae6cd60f KR |
508 | |
509 | @item rs_fill | |
510 | The variable characters are to be repeated @code{fr_offset} times. If | |
af16e411 FF |
511 | @code{fr_offset} is 0, this frag has a length of @code{fr_fix}. Most frags |
512 | have this type. | |
ae6cd60f KR |
513 | |
514 | @item rs_machine_dependent | |
515 | Displacement relaxation is to be done on this frag. The target is indicated by | |
516 | @code{fr_symbol} and @code{fr_offset}, and @code{fr_subtype} indicates the | |
517 | particular machine-specific addressing mode desired. @xref{Relaxation}. | |
518 | ||
519 | @item rs_org | |
520 | The start of the following frag should be pushed back to some specific offset | |
af16e411 FF |
521 | within the section. (Some assemblers use the value as an absolute address; GAS |
522 | does not handle final absolute addresses, but rather requires that the linker | |
523 | set them.) The offset is given by @code{fr_symbol} and @code{fr_offset}; one | |
524 | character from the variable-length tail is used as the fill character. | |
ae6cd60f KR |
525 | @end table |
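To make the @code{rs_fill} and @code{rs_align} descriptions concrete, the sketch below computes how many bytes a frag contributes once its address is known. It is an illustration under stated assumptions, not the code of @code{relax_segment}: only two relaxation states are modeled, the structure is cut down, and @code{frag_size} is a hypothetical helper.

```c
#include <assert.h>

/* Subset of the relaxation states.  */
typedef enum { rs_fill, rs_align } relax_stateT;

/* Cut-down stand-in for struct frag.  */
struct frag
{
  long fr_address;
  long fr_fix;
  long fr_var;
  long fr_offset;
  long fr_subtype;
  relax_stateT fr_type;
};

/* Number of bytes F occupies, given that fr_address is already set.  */
static long
frag_size (const struct frag *f)
{
  assert (f->fr_offset >= 0);
  switch (f->fr_type)
    {
    case rs_fill:
      /* The variable part is repeated fr_offset times.  */
      return f->fr_fix + f->fr_var * f->fr_offset;

    case rs_align:
      {
        long align = 1L << f->fr_offset;      /* fr_offset is log2 */
        long end = f->fr_address + f->fr_fix;
        long pad = (align - end % align) % align;

        /* A non-zero fr_subtype caps the bytes that may be skipped;
           if more are needed, the alignment is not done.  */
        if (f->fr_subtype != 0 && pad > f->fr_subtype)
          pad = 0;
        return f->fr_fix + pad;
      }
    }
  return f->fr_fix;
}
```

With @code{fr_offset} equal to 3, a frag whose fixed part ends at address 5 gets 3 bytes of padding so that the next frag starts at address 8.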
526 | ||
af16e411 | 527 | @cindex frchainS structure |
ae6cd60f KR |
528 | A chain of frags is built up for each subsection. The data structure |
529 | describing a chain is called a @code{frchainS}, and contains the following | |
530 | fields: | |
531 | ||
532 | @table @code | |
533 | @item frch_root | |
af16e411 | 534 | Points to the first frag in the chain. May be NULL if there are no frags in |
ae6cd60f KR |
535 | this chain. |
536 | @item frch_last | |
af16e411 | 537 | Points to the last frag in the chain, or NULL if there are none. |
ae6cd60f KR |
538 | @item frch_next |
539 | Next in the list of @code{frchainS} structures. | |
540 | @item frch_seg | |
541 | Indicates the section this frag chain belongs to. | |
542 | @item frch_subseg | |
543 | Subsection (subsegment) number of this frag chain. | |
544 | @item fix_root, fix_tail | |
af16e411 | 545 | (Defined only if @code{BFD_ASSEMBLER} is defined.) Point to the first and last |
ae6cd60f KR |
546 | @code{fixS} structures associated with this subsection. |
547 | @item frch_obstack | |
548 | Not currently used. Intended to be used for frag allocation for this | |
549 | subsection. This should reduce frag generation caused by switching sections. | |
af16e411 FF |
550 | @item frch_frag_now |
551 | The current frag for this subsegment. | |
ae6cd60f KR |
552 | @end table |
553 | ||
554 | A @code{frchainS} corresponds to a subsection; each section has a list of | |
555 | @code{frchainS} records associated with it. In most cases, only one subsection | |
556 | of each section is used, so the list will only be one element long, but any | |
557 | processing of frag chains should be prepared to deal with multiple chains per | |
558 | section. | |
559 | ||
560 | After the input files have been completely processed, and no more frags are to | |
561 | be generated, the frag chains are joined into one per section for further | |
562 | processing. After this point, it is safe to operate on one chain per section. | |
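Joining the chains of one section can be sketched as a linked-list splice. The structures here are cut-down stand-ins, the @code{id} field and the function @code{join_chains} are hypothetical, and the sketch assumes the list passed in holds only the chains of a single section, already in subsection order.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for fragS and frchainS.  */
struct frag { struct frag *fr_next; int id; };

struct frchain
{
  struct frag *frch_root;       /* first frag, or NULL */
  struct frag *frch_last;       /* last frag, or NULL */
  struct frchain *frch_next;    /* next chain in the list */
  int frch_subseg;
};

/* Splice every later chain onto FIRST so that the section ends up
   with a single chain.  Returns the first frag of the merged chain.  */
static struct frag *
join_chains (struct frchain *first)
{
  struct frchain *c;

  if (first == NULL)
    return NULL;
  for (c = first->frch_next; c != NULL; c = c->frch_next)
    {
      if (c->frch_root == NULL)
        continue;                       /* empty subsection */
      if (first->frch_last == NULL)
        first->frch_root = c->frch_root;
      else
        first->frch_last->fr_next = c->frch_root;
      first->frch_last = c->frch_last;
    }
  first->frch_next = NULL;
  return first->frch_root;
}
```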
563 | ||
af16e411 FF |
564 | The assembler always has a current frag, named @code{frag_now}. More space is |
565 | allocated for the current frag using the @code{frag_more} function; this | |
566 | returns a pointer to the requested amount of space. Relaxing is done using | |
567 | variant frags allocated by @code{frag_var} or @code{frag_variant} | |
568 | (@pxref{Relaxation}). | |
569 | ||
570 | @node GAS processing | |
571 | @section What GAS does when it runs | |
572 | @cindex internals, overview | |
573 | ||
574 | This is a quick look at what an assembler run looks like. | |
575 | ||
576 | @itemize @bullet | |
577 | @item | |
578 | The assembler initializes itself by calling various init routines. | |
579 | ||
580 | @item | |
581 | For each source file, the @code{read_a_source_file} function reads in the file | |
582 | and parses it. The global variable @code{input_line_pointer} points to the | |
583 | current text; it is guaranteed to be correct up to the end of the line, but not | |
584 | farther. | |
585 | ||
586 | @item | |
587 | For each line, the assembler passes labels to the @code{colon} function, and | |
588 | isolates the first word. If it looks like a pseudo-op, the word is looked up | |
589 | in the pseudo-op hash table @code{po_hash} and dispatched to a pseudo-op | |
590 | routine. Otherwise, the target dependent @code{md_assemble} routine is called | |
591 | to parse the instruction. | |
592 | ||
593 | @item | |
594 | When pseudo-ops or instructions output data, they add it to a frag, calling | |
595 | @code{frag_more} to get space to store it in. | |
596 | ||
597 | @item | |
598 | Pseudo-ops and instructions can also output fixups created by @code{fix_new} or | |
599 | @code{fix_new_exp}. | |
600 | ||
601 | @item | |
602 | For certain targets, instructions can create variant frags which are used to | |
603 | store relaxation information (@pxref{Relaxation}). | |
604 | ||
605 | @item | |
606 | When the input file is finished, the @code{write_object_file} routine is | |
607 | called. It assigns addresses to all the frags (@code{relax_segment}), resolves | |
608 | all the fixups (@code{fixup_segment}), resolves all the symbol values (using | |
609 | @code{resolve_symbol_value}), and finally writes out the file (in the | |
610 | @code{BFD_ASSEMBLER} case, this is done by simply calling @code{bfd_close}). | |
611 | @end itemize | |
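The per-line dispatch step in the list above can be modeled as follows. This is a sketch, not the code of @code{read_a_source_file}: label handling is omitted, a small linear table replaces the @code{po_hash} hash table, a leading @samp{.} is assumed to be what makes a word look like a pseudo-op, and @code{assemble_line}, @code{handle_pseudo}, @code{md_assemble_stub}, and the counters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*pseudo_handler) (const char *args);

struct pseudo_op { const char *name; pseudo_handler handler; };

/* Counters so the effect of a dispatch can be observed.  */
static int pseudo_count;
static int insn_count;

static void
handle_pseudo (const char *args)
{
  (void) args;
  pseudo_count++;
}

/* Stand-in for the target dependent md_assemble routine.  */
static void
md_assemble_stub (const char *line)
{
  (void) line;
  insn_count++;
}

/* Linear table standing in for the po_hash hash table.  */
static const struct pseudo_op po_table[] =
{
  { "text", handle_pseudo },
  { "data", handle_pseudo },
  { "word", handle_pseudo },
};

/* Dispatch one line.  Returns 0 for a pseudo-op, 1 for an
   instruction, and -1 for an unknown pseudo-op.  */
static int
assemble_line (const char *line)
{
  assert (line != NULL);
  if (line[0] == '.')
    {
      size_t len = strcspn (line + 1, " \t");
      size_t i;

      for (i = 0; i < sizeof po_table / sizeof po_table[0]; i++)
        if (strlen (po_table[i].name) == len
            && strncmp (po_table[i].name, line + 1, len) == 0)
          {
            po_table[i].handler (line + 1 + len);
            return 0;
          }
      return -1;
    }
  md_assemble_stub (line);
  return 1;
}
```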
612 | ||
613 | @node Porting GAS | |
614 | @section Porting GAS | |
615 | @cindex porting | |
616 | ||
617 | Each GAS target specifies two main things: the CPU file and the object format | |
618 | file. Two main switches in the @file{configure.in} file handle this. The | |
619 | first switches on CPU type to set the shell variable @code{cpu_type}. The | |
620 | second switches on the entire target to set the shell variable @code{fmt}. | |
621 | ||
622 | The configure script uses the value of @code{cpu_type} to select two files in | |
623 | the @file{config} directory: @file{tc-@var{CPU}.c} and @file{tc-@var{CPU}.h}. | |
624 | The configuration process will create symlinks to these files from | |
625 | @file{targ-cpu.c} and @file{targ-cpu.h} in the build directory. | |
626 | ||
627 | The configure script also uses the value of @code{fmt} to select two files: | |
628 | @file{obj-@var{fmt}.c} and @file{obj-@var{fmt}.h}. The configuration process | |
629 | will create symlinks to these files from @file{obj-format.h} and | |
630 | @file{obj-format.c}. | |
631 | ||
632 | You can also set the emulation in the configure script by setting the @code{em} | |
633 | variable. Normally the default value of @samp{generic} is fine. The | |
634 | configuration process will create a symlink from @file{targ-env.h} to | |
635 | @file{te-@var{em}.h}. | |
636 | ||
637 | Porting GAS to a new CPU requires writing the @file{tc-@var{CPU}} files. | |
638 | Porting GAS to a new object file format requires writing the | |
639 | @file{obj-@var{fmt}} files. There is sometimes some interaction between these | |
640 | two files, but it is normally minimal. | |
641 | ||
642 | The best approach is, of course, to copy existing files. The documentation | |
643 | below assumes that you are looking at existing files to see usage details. | |
644 | ||
645 | These interfaces have grown over time, and have never been carefully thought | |
646 | out or designed. Nothing about the interfaces described here is cast in stone. | |
647 | It is possible that they will change from one version of the assembler to the | |
648 | next. Also, new macros are added all the time as they are needed. | |
649 | ||
650 | @menu | |
651 | * CPU backend:: Writing a CPU backend | |
652 | * Object format backend:: Writing an object format backend | |
653 | * Emulations:: Writing emulation files | |
654 | @end menu | |
655 | ||
656 | @node CPU backend | |
657 | @subsection Writing a CPU backend | |
658 | @cindex CPU backend | |
659 | @cindex @file{tc-@var{CPU}} | |
660 | ||
661 | The CPU backend files are the heart of the assembler. They are the only parts | |
662 | of the assembler which actually know anything about the instruction set of the | |
663 | processor. | |
664 | ||
665 | You must define a reasonably small list of macros and functions in the CPU | |
666 | backend files. You may define a large number of additional macros in the CPU | |
667 | backend files, not all of which are documented here. You must, of course, | |
668 | define macros in the @file{.h} file, which is included by every assembler | |
669 | source file. You may define the functions as macros in the @file{.h} file, or | |
670 | as functions in the @file{.c} file. | |
671 | ||
672 | @table @code | |
673 | @item TC_@var{CPU} | |
674 | @cindex TC_@var{CPU} | |
675 | By convention, you should define this macro in the @file{.h} file. For | |
676 | example, @file{tc-m68k.h} defines @code{TC_M68K}. You might have to use this | |
677 | if it is necessary to add CPU specific code to the object format file. | |
678 | ||
679 | @item TARGET_FORMAT | |
680 | This macro is the BFD target name to use when creating the output file. This | |
681 | will normally depend upon the @code{OBJ_@var{FMT}} macro. | |
682 | ||
683 | @item TARGET_ARCH | |
684 | This macro is the BFD architecture to pass to @code{bfd_set_arch_mach}. | |
685 | ||
686 | @item TARGET_MACH | |
687 | This macro is the BFD machine number to pass to @code{bfd_set_arch_mach}. If | |
688 | it is not defined, GAS will use 0. | |
689 | ||
690 | @item TARGET_BYTES_BIG_ENDIAN | |
691 | You should define this macro to be non-zero if the target is big endian, and | |
692 | zero if the target is little endian. | |
693 | ||
694 | @item md_shortopts | |
695 | @itemx md_longopts | |
696 | @itemx md_longopts_size | |
697 | @itemx md_parse_option | |
698 | @itemx md_show_usage | |
699 | @cindex md_shortopts | |
700 | @cindex md_longopts | |
701 | @cindex md_longopts_size | |
702 | @cindex md_parse_option | |
703 | @cindex md_show_usage | |
704 | GAS uses these variables and functions during option processing. | |
705 | @code{md_shortopts} is a @code{const char *} which GAS adds to the machine | |
706 | independent string passed to @code{getopt}. @code{md_longopts} is a | |
707 | @code{struct option []} which GAS adds to the machine independent long options | |
708 | passed to @code{getopt}; you may use @code{OPTION_MD_BASE}, defined in | |
709 | @file{as.h}, as the start of a set of long option indices, if necessary. | |
710 | @code{md_longopts_size} is a @code{size_t} holding the size of @code{md_longopts}. | |
711 | GAS will call @code{md_parse_option} whenever @code{getopt} returns an | |
712 | unrecognized code, presumably indicating a special code value which appears in | |
713 | @code{md_longopts}. GAS will call @code{md_show_usage} when a usage message is | |
714 | printed; it should print a description of the machine specific options. | |
715 | ||
716 | @item md_begin | |
717 | @cindex md_begin | |
718 | GAS will call this function at the start of the assembly, after the command | |
719 | line arguments have been parsed and all the machine independent initializations | |
720 | have been completed. | |
721 | ||
722 | @item md_cleanup | |
723 | @cindex md_cleanup | |
724 | If you define this macro, GAS will call it at the end of each input file. | |
725 | ||
726 | @item md_assemble | |
727 | @cindex md_assemble | |
728 | GAS will call this function for each input line which does not contain a | |
729 | pseudo-op. The argument is a null terminated string. The function should | |
730 | assemble the string as an instruction with operands. Normally | |
731 | @code{md_assemble} will do this by calling @code{frag_more} and writing out | |
732 | some bytes (@pxref{Frags}). @code{md_assemble} will call @code{fix_new} to | |
733 | create fixups as needed (@pxref{Fixups}). Targets which need to do special | |
734 | purpose relaxation will call @code{frag_var}. | |
735 | ||
736 | @item md_pseudo_table | |
737 | @cindex md_pseudo_table | |
738 | This is a const array of type @code{pseudo_typeS}. It is a mapping from | |
739 | pseudo-op names to functions. You should use this table to implement | |
740 | pseudo-ops which are specific to the CPU. | |
741 | ||
742 | @item tc_conditional_pseudoop | |
743 | @cindex tc_conditional_pseudoop | |
744 | If this macro is defined, GAS will call it with a @code{pseudo_typeS} argument. | |
745 | It should return non-zero if the pseudo-op is a conditional which controls | |
746 | whether code is assembled, such as @samp{.if}. GAS knows about the normal | |
747 | conditional pseudo-ops, and you should normally not have to define this macro. | |
748 | ||
749 | @item comment_chars | |
750 | @cindex comment_chars | |
751 | This is a null terminated @code{const char} array of characters which start a | |
752 | comment. | |
753 | ||
754 | @item tc_comment_chars | |
755 | @cindex tc_comment_chars | |
756 | If this macro is defined, GAS will use it instead of @code{comment_chars}. | |
757 | ||
758 | @item line_comment_chars | |
759 | @cindex line_comment_chars | |
760 | This is a null terminated @code{const char} array of characters which start a | |
761 | comment when they appear at the start of a line. | |
762 | ||
763 | @item line_separator_chars | |
764 | @cindex line_separator_chars | |
765 | This is a null terminated @code{const char} array of characters which separate | |
766 | lines (the semicolon is such a character by default, and need not be listed in | |
767 | this array). | |
768 | ||
769 | @item EXP_CHARS | |
770 | @cindex EXP_CHARS | |
771 | This is a null terminated @code{const char} array of characters which may be | |
772 | used as the exponent character in a floating point number. This is normally | |
773 | @code{"eE"}. | |
774 | ||
775 | @item FLT_CHARS | |
776 | @cindex FLT_CHARS | |
777 | This is a null terminated @code{const char} array of characters which may be | |
778 | used to indicate a floating point constant. A zero followed by one of these | |
779 | characters is assumed to be followed by a floating point number; thus they | |
780 | operate the way that @code{0x} is used to indicate a hexadecimal constant. | |
781 | Usually this includes @samp{r} and @samp{f}. | |
782 | ||
783 | @item LEX_AT | |
784 | @cindex LEX_AT | |
785 | You may define this macro to the lexical type of the @kbd{@@} character. The | |
786 | default is zero. | |
787 | ||
788 | Lexical types are a combination of @code{LEX_NAME} and @code{LEX_BEGIN_NAME}, | |
789 | both defined in @file{read.h}. @code{LEX_NAME} indicates that the character | |
790 | may appear in a name. @code{LEX_BEGIN_NAME} indicates that the character may | |
791 | appear at the beginning of a name. | |
792 | ||
793 | @item LEX_BR | |
794 | @cindex LEX_BR | |
795 | You may define this macro to the lexical type of the brace characters @kbd{@{}, | |
796 | @kbd{@}}, @kbd{[}, and @kbd{]}. The default value is zero. | |
797 | ||
798 | @item LEX_PCT | |
799 | @cindex LEX_PCT | |
800 | You may define this macro to the lexical type of the @kbd{%} character. The | |
801 | default value is zero. | |
802 | ||
803 | @item LEX_QM | |
804 | @cindex LEX_QM | |
805 | You may define this macro to the lexical type of the @kbd{?} character. The | |
806 | default value is zero. | |
807 | ||
808 | @item LEX_DOLLAR | |
809 | @cindex LEX_DOLLAR | |
810 | You may define this macro to the lexical type of the @kbd{$} character. The | |
811 | default value is @code{LEX_NAME | LEX_BEGIN_NAME}. | |
812 | ||
813 | @item SINGLE_QUOTE_STRINGS | |
814 | @cindex SINGLE_QUOTE_STRINGS | |
815 | If you define this macro, GAS will treat single quotes as string delimiters. | |
816 | Normally only double quotes are accepted as string delimiters. | |
817 | ||
818 | @item NO_STRING_ESCAPES | |
819 | @cindex NO_STRING_ESCAPES | |
820 | If you define this macro, GAS will not permit escape sequences in a string. | |
821 | ||
822 | @item ONLY_STANDARD_ESCAPES | |
823 | @cindex ONLY_STANDARD_ESCAPES | |
824 | If you define this macro, GAS will warn about the use of nonstandard escape | |
825 | sequences in a string. | |
826 | ||
827 | @item md_start_line_hook | |
828 | @cindex md_start_line_hook | |
829 | If you define this macro, GAS will call it at the start of each line. | |
830 | ||
831 | @item LABELS_WITHOUT_COLONS | |
832 | @cindex LABELS_WITHOUT_COLONS | |
833 | If you define this macro, GAS will assume that any text at the start of a line | |
834 | is a label, even if it does not have a colon. | |
835 | ||
836 | @item TC_START_LABEL | |
837 | @cindex TC_START_LABEL | |
838 | You may define this macro to control what GAS considers to be a label. The | |
839 | default definition is to accept any name followed by a colon character. | |
840 | ||
841 | @item NO_PSEUDO_DOT | |
842 | @cindex NO_PSEUDO_DOT | |
843 | If you define this macro, GAS will not require pseudo-ops to start with a | |
844 | @kbd{.} character. | |
845 | ||
846 | @item TC_EQUAL_IN_INSN | |
847 | @cindex TC_EQUAL_IN_INSN | |
848 | If you define this macro, it should return nonzero if the instruction is | |
849 | permitted to contain an @kbd{=} character. GAS will use this to decide if a | |
850 | @kbd{=} is an assignment or an instruction. | |
851 | ||
852 | @item TC_EOL_IN_INSN | |
853 | @cindex TC_EOL_IN_INSN | |
854 | If you define this macro, it should return nonzero if the current input line | |
855 | pointer should be treated as the end of a line. | |
856 | ||
857 | @item md_parse_name | |
858 | @cindex md_parse_name | |
859 | If this macro is defined, GAS will call it for any symbol found in an | |
860 | expression. You can define this to handle special symbols in a special way. | |
861 | If a symbol always has a certain value, you should normally enter it in the | |
862 | symbol table, perhaps using @code{reg_section}. | |
863 | ||
d7bf6158 ILT |
864 | @item md_undefined_symbol |
865 | @cindex md_undefined_symbol | |
866 | GAS will call this function when a symbol table lookup fails, before it | |
867 | creates a new symbol. Typically this would be used to supply symbols whose | |
868 | name or value changes dynamically, possibly in a context sensitive way. | |
869 | Predefined symbols with fixed values, such as register names or condition | |
870 | codes, are typically entered directly into the symbol table when @code{md_begin} | |
871 | is called. | |
872 | ||
af16e411 FF |
873 | @item md_operand |
874 | @cindex md_operand | |
875 | GAS will call this function for any expression that can not be recognized. | |
876 | When the function is called, @code{input_line_pointer} will point to the start | |
877 | of the expression. | |
878 | ||
879 | @item tc_unrecognized_line | |
880 | @cindex tc_unrecognized_line | |
881 | If you define this macro, GAS will call it when it finds a line that it can not | |
882 | parse. | |
883 | ||
884 | @item md_do_align | |
885 | @cindex md_do_align | |
886 | You may define this macro to handle an alignment directive. GAS will call it | |
887 | when the directive is seen in the input file. For example, the i386 backend | |
888 | uses this to generate efficient nop instructions of varying lengths, depending | |
889 | upon the number of bytes that the alignment will skip. | |
890 | ||
891 | @item HANDLE_ALIGN | |
892 | @cindex HANDLE_ALIGN | |
893 | You may define this macro to do special handling for an alignment directive. | |
894 | GAS will call it at the end of the assembly. | |
895 | ||
896 | @item md_flush_pending_output | |
897 | @cindex md_flush_pending_output | |
898 | If you define this macro, GAS will call it each time it skips any space because of a | |
899 | space filling or alignment or data allocation pseudo-op. | |
900 | ||
901 | @item TC_PARSE_CONS_EXPRESSION | |
902 | @cindex TC_PARSE_CONS_EXPRESSION | |
903 | You may define this macro to parse an expression used in a data allocation | |
904 | pseudo-op such as @code{.word}. You can use this to recognize relocation | |
905 | directives that may appear in such directives. | |
906 | ||
907 | @item BITFIELD_CONS_EXPRESSION | |
908 | @cindex BITFIELD_CONS_EXPRESSION | |
909 | If you define this macro, GAS will recognize bitfield instructions in data | |
910 | allocation pseudo-ops, as used on the i960. | |
911 | ||
912 | @item REPEAT_CONS_EXPRESSION | |
913 | @cindex REPEAT_CONS_EXPRESSION | |
914 | If you define this macro, GAS will recognize repeat counts in data allocation | |
915 | pseudo-ops, as used on the MIPS. | |
916 | ||
917 | @item md_cons_align | |
918 | @cindex md_cons_align | |
919 | You may define this macro to do any special alignment before a data allocation | |
920 | pseudo-op. | |
921 | ||
922 | @item TC_CONS_FIX_NEW | |
923 | @cindex TC_CONS_FIX_NEW | |
924 | You may define this macro to generate a fixup for a data allocation pseudo-op. | |
925 | ||
926 | @item md_number_to_chars | |
927 | @cindex md_number_to_chars | |
928 | This should just call either @code{number_to_chars_bigendian} or | |
929 | @code{number_to_chars_littleendian}, whichever is appropriate. On targets like | |
930 | the MIPS which support options to change the endianness, which function to call | |
931 | is a runtime decision. On other targets, @code{md_number_to_chars} can be a | |
932 | simple macro. | |
933 | ||
934 | @item md_reloc_size | |
935 | @cindex md_reloc_size | |
936 | This variable is only used in the original version of gas (not | |
937 | @code{BFD_ASSEMBLER} and not @code{MANY_SEGMENTS}). It holds the size of a | |
938 | relocation entry. | |
939 | ||
940 | @item WORKING_DOT_WORD | |
941 | @itemx md_short_jump_size | |
942 | @itemx md_long_jump_size | |
943 | @itemx md_create_short_jump | |
944 | @itemx md_create_long_jump | |
945 | @cindex WORKING_DOT_WORD | |
946 | @cindex md_short_jump_size | |
947 | @cindex md_long_jump_size | |
948 | @cindex md_create_short_jump | |
949 | @cindex md_create_long_jump | |
950 | If @code{WORKING_DOT_WORD} is defined, GAS will not do broken word processing | |
951 | (@pxref{Broken words}). Otherwise, you should set @code{md_short_jump_size} to | |
952 | the size of a short jump (a jump that is just long enough to jump around a long | |
953 | jump) and @code{md_long_jump_size} to the size of a long jump (a jump that can | |
954 | go anywhere in the function). You should define @code{md_create_short_jump} to | |
955 | create a short jump around a long jump, and define @code{md_create_long_jump} | |
956 | to create a long jump. | |
957 | ||
958 | @item md_estimate_size_before_relax | |
959 | @cindex md_estimate_size_before_relax | |
960 | This function returns an estimate of the size of a @code{rs_machine_dependent} | |
961 | frag before any relaxing is done. It may also create any necessary | |
962 | relocations. | |
963 | ||
964 | @item md_relax_frag | |
965 | @cindex md_relax_frag | |
966 | This macro may be defined to relax a frag. GAS will call this with the frag | |
967 | and the change in size of all previous frags; @code{md_relax_frag} should | |
968 | return the change in size of the frag. @xref{Relaxation}. | |
969 | ||
970 | @item TC_GENERIC_RELAX_TABLE | |
971 | @cindex TC_GENERIC_RELAX_TABLE | |
972 | If you do not define @code{md_relax_frag}, you may define | |
973 | @code{TC_GENERIC_RELAX_TABLE} as a table of @code{relax_typeS} structures. The | |
974 | machine independent code knows how to use such a table to relax PC relative | |
975 | references. See @file{tc-m68k.c} for an example. @xref{Relaxation}. | |
976 | ||
977 | @item md_prepare_relax_scan | |
978 | @cindex md_prepare_relax_scan | |
979 | If defined, it is a C statement that is invoked prior to scanning | |
980 | the relax table. | |
981 | ||
982 | @item LINKER_RELAXING_SHRINKS_ONLY | |
983 | @cindex LINKER_RELAXING_SHRINKS_ONLY | |
984 | If you define this macro, and the global variable @samp{linkrelax} is set | |
985 | (because of a command line option, or unconditionally in @code{md_begin}), a | |
986 | @samp{.align} directive will cause extra space to be allocated. The linker can | |
987 | then discard this space when relaxing the section. | |
988 | ||
989 | @item md_convert_frag | |
990 | @cindex md_convert_frag | |
991 | GAS will call this for each @code{rs_machine_dependent} fragment. | |
992 | The instruction is completed using the data from the relaxation pass. | |
993 | It may also create any necessary relocations. | |
994 | @xref{Relaxation}. | |
995 | ||
996 | @item md_apply_fix | |
997 | @cindex md_apply_fix | |
998 | GAS will call this for each fixup. It should store the correct value in the | |
999 | object file. | |
1000 | ||
1001 | @item TC_HANDLES_FX_DONE | |
1002 | @cindex TC_HANDLES_FX_DONE | |
1003 | If this macro is defined, it means that @code{md_apply_fix} correctly sets the | |
1004 | @code{fx_done} field in the fixup. | |
1005 | ||
1006 | @item tc_gen_reloc | |
1007 | @cindex tc_gen_reloc | |
1008 | A @code{BFD_ASSEMBLER} GAS will call this to generate a reloc. GAS will pass | |
1009 | the resulting reloc to @code{bfd_install_relocation}. This currently works | |
1010 | poorly, as @code{bfd_install_relocation} often does the wrong thing, and | |
1011 | instances of @code{tc_gen_reloc} have been written to work around the problems, | |
1012 | which in turn makes it difficult to fix @code{bfd_install_relocation}. | |
1013 | ||
1014 | @item RELOC_EXPANSION_POSSIBLE | |
1015 | @cindex RELOC_EXPANSION_POSSIBLE | |
1016 | If you define this macro, it means that @code{tc_gen_reloc} may return multiple | |
1017 | relocation entries for a single fixup. In this case, the return value of | |
1018 | @code{tc_gen_reloc} is a pointer to a null terminated array. | |
1019 | ||
1020 | @item MAX_RELOC_EXPANSION | |
1021 | @cindex MAX_RELOC_EXPANSION | |
1022 | You must define this if @code{RELOC_EXPANSION_POSSIBLE} is defined; it | |
1023 | indicates the largest number of relocs which @code{tc_gen_reloc} may return for | |
1024 | a single fixup. | |
1025 | ||
1026 | @item tc_fix_adjustable | |
1027 | @cindex tc_fix_adjustable | |
1028 | You may define this macro to indicate whether a fixup against a locally defined | |
1029 | symbol should be adjusted to be against the section symbol. It should return a | |
1030 | non-zero value if the adjustment is acceptable. | |
1031 | ||
1032 | @item MD_PCREL_FROM_SECTION | |
1033 | @cindex MD_PCREL_FROM_SECTION | |
1034 | If you define this macro, it should return the offset between the address of a | |
1035 | PC relative fixup and the position from which the PC relative adjustment should | |
1036 | be made. On many processors, the base of a PC relative instruction is the next | |
1037 | instruction, so this macro would return the length of an instruction. | |
1038 | ||
1039 | @item md_pcrel_from | |
1040 | @cindex md_pcrel_from | |
1041 | This is the default value of @code{MD_PCREL_FROM_SECTION}. The difference is | |
1042 | that @code{md_pcrel_from} does not take a section argument. | |
1043 | ||
1044 | @item tc_frob_label | |
1045 | @cindex tc_frob_label | |
1046 | If you define this macro, GAS will call it each time a label is defined. | |
1047 | ||
1048 | @item md_section_align | |
1049 | @cindex md_section_align | |
1050 | GAS will call this function for each section at the end of the assembly, to | |
1051 | permit the CPU backend to adjust the alignment of a section. | |
1052 | ||
1053 | @item tc_frob_section | |
1054 | @cindex tc_frob_section | |
1055 | If you define this macro, a @code{BFD_ASSEMBLER} GAS will call it for each | |
1056 | section at the end of the assembly. | |
1057 | ||
1058 | @item tc_frob_file_before_adjust | |
1059 | @cindex tc_frob_file_before_adjust | |
1060 | If you define this macro, GAS will call it after the symbol values are | |
1061 | resolved, but before the fixups have been changed from local symbols to section | |
1062 | symbols. | |
1063 | ||
1064 | @item tc_frob_symbol | |
1065 | @cindex tc_frob_symbol | |
1066 | If you define this macro, GAS will call it for each symbol. You can indicate | |
1067 | that the symbol should not be included in the object file by defining this | |
1068 | macro to set its second argument to a non-zero value. | |
1069 | ||
1070 | @item tc_frob_file | |
1071 | @cindex tc_frob_file | |
1072 | If you define this macro, GAS will call it after the symbol table has been | |
1073 | completed, but before the relocations have been generated. | |
1074 | ||
1075 | @item tc_frob_file_after_relocs | |
1076 | If you define this macro, GAS will call it after the relocs have been | |
1077 | generated. | |
1078 | @end table | |
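
As an illustration of the @code{md_number_to_chars} entry in the table above, the following is a minimal sketch of a runtime endianness dispatch, as a target with an endianness option might use. All names beginning with @code{sketch_} are hypothetical stand-ins written for this example; they are not the GAS helpers, and the real function signatures may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the helpers named in the text; they are not
   the GAS implementations.  Each stores the low N bytes of VAL into BUF.  */
void
sketch_number_to_chars_bigendian (unsigned char *buf, unsigned long val,
                                  size_t n)
{
  assert (n <= sizeof val);
  while (n-- > 0)
    {
      buf[n] = (unsigned char) (val & 0xff);  /* last byte gets the low bits */
      val >>= 8;
    }
}

void
sketch_number_to_chars_littleendian (unsigned char *buf, unsigned long val,
                                     size_t n)
{
  size_t i;

  assert (n <= sizeof val);
  for (i = 0; i < n; i++)
    {
      buf[i] = (unsigned char) (val & 0xff);  /* first byte gets the low bits */
      val >>= 8;
    }
}

/* Assumed runtime switch, of the kind a command line option might set.  */
int sketch_target_big_endian = 1;

/* The dispatch itself: pick a helper according to the current setting.  */
void
sketch_md_number_to_chars (unsigned char *buf, unsigned long val, size_t n)
{
  if (sketch_target_big_endian)
    sketch_number_to_chars_bigendian (buf, val, n);
  else
    sketch_number_to_chars_littleendian (buf, val, n);
}
```

On a target whose byte order never changes, the dispatch function would be unnecessary, and a simple macro naming one of the two helpers would do.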
1079 | ||
1080 | @node Object format backend | |
1081 | @subsection Writing an object format backend | |
1082 | @cindex object format backend | |
1083 | @cindex @file{obj-@var{fmt}} | |
1084 | ||
1085 | As with the CPU backend, the object format backend must define a few things, | |
1086 | and may define some other things. The interface to the object format backend | |
1087 | is generally simpler; most of the support for an object file format consists of | |
1088 | defining a number of pseudo-ops. | |
1089 | ||
1090 | The object format @file{.h} file must include @file{targ-cpu.h}. | |
1091 | ||
1092 | This section will only define the @code{BFD_ASSEMBLER} version of GAS. It is | |
1093 | impossible to support a new object file format using any other version anyhow, | |
1094 | as the original GAS version only supports a.out, and the @code{MANY_SEGMENTS} | |
1095 | GAS version only supports COFF. | |
1096 | ||
1097 | @table @code | |
1098 | @item OBJ_@var{format} | |
1099 | @cindex OBJ_@var{format} | |
1100 | By convention, you should define this macro in the @file{.h} file. For | |
1101 | example, @file{obj-elf.h} defines @code{OBJ_ELF}. You might have to use this | |
1102 | if it is necessary to add object file format specific code to the CPU file. | |
1103 | ||
1104 | @item obj_begin | |
1105 | If you define this macro, GAS will call it at the start of the assembly, after | |
1106 | the command line arguments have been parsed and all the machine independent | |
1107 | initializations have been completed. | |
1108 | ||
1109 | @item obj_app_file | |
1110 | @cindex obj_app_file | |
1111 | If you define this macro, GAS will invoke it when it sees a @code{.file} | |
1112 | pseudo-op or a @samp{#} line as used by the C preprocessor. | |
1113 | ||
1114 | @item OBJ_COPY_SYMBOL_ATTRIBUTES | |
1115 | @cindex OBJ_COPY_SYMBOL_ATTRIBUTES | |
1116 | You should define this macro to copy object format specific information from | |
1117 | one symbol to another. GAS will call it when one symbol is equated to | |
1118 | another. | |
1119 | ||
1120 | @item obj_fix_adjustable | |
1121 | @cindex obj_fix_adjustable | |
1122 | You may define this macro to indicate whether a fixup against a locally defined | |
1123 | symbol should be adjusted to be against the section symbol. It should return a | |
1124 | non-zero value if the adjustment is acceptable. | |
1125 | ||
1126 | @item obj_sec_sym_ok_for_reloc | |
1127 | @cindex obj_sec_sym_ok_for_reloc | |
1128 | You may define this macro to indicate that it is OK to use a section symbol in | |
1129 | a relocation entry. If it is not, GAS will define a new symbol at the start | |
1130 | of a section. | |
1131 | ||
1132 | @item EMIT_SECTION_SYMBOLS | |
1133 | @cindex EMIT_SECTION_SYMBOLS | |
1134 | You should define this macro with a zero value if you do not want to include | |
1135 | section symbols in the output symbol table. The default value for this macro | |
1136 | is one. | |
1137 | ||
1138 | @item obj_adjust_symtab | |
1139 | @cindex obj_adjust_symtab | |
1140 | If you define this macro, GAS will invoke it just before setting the symbol | |
1141 | table of the output BFD. For example, the COFF support uses this macro to | |
1142 | generate a @code{.file} symbol if none was generated previously. | |
1143 | ||
1144 | @item SEPARATE_STAB_SECTIONS | |
1145 | @cindex SEPARATE_STAB_SECTIONS | |
1146 | You may define this macro to indicate that stabs should be placed in separate | |
1147 | sections, as in ELF. | |
1148 | ||
1149 | @item INIT_STAB_SECTION | |
1150 | @cindex INIT_STAB_SECTION | |
1151 | You may define this macro to initialize the stabs section in the output file. | |
1152 | ||
1153 | @item OBJ_PROCESS_STAB | |
1154 | @cindex OBJ_PROCESS_STAB | |
1155 | You may define this macro to do specific processing on a stabs entry. | |
1156 | ||
1157 | @item obj_frob_section | |
1158 | @cindex obj_frob_section | |
1159 | If you define this macro, GAS will call it for each section at the end of the | |
1160 | assembly. | |
1161 | ||
1162 | @item obj_frob_file_before_adjust | |
1163 | @cindex obj_frob_file_before_adjust | |
1164 | If you define this macro, GAS will call it after the symbol values are | |
1165 | resolved, but before the fixups have been changed from local symbols to section | |
1166 | symbols. | |
1167 | ||
1168 | @item obj_frob_symbol | |
1169 | @cindex obj_frob_symbol | |
1170 | If you define this macro, GAS will call it for each symbol. You can indicate | |
1171 | that the symbol should not be included in the object file by defining this | |
1172 | macro to set its second argument to a non-zero value. | |
1173 | ||
1174 | @item obj_frob_file | |
1175 | @cindex obj_frob_file | |
1176 | If you define this macro, GAS will call it after the symbol table has been | |
1177 | completed, but before the relocations have been generated. | |
1178 | ||
1179 | @item obj_frob_file_after_relocs | |
1180 | If you define this macro, GAS will call it after the relocs have been | |
1181 | generated. | |
1182 | @end table | |
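
The @code{obj_frob_symbol} entry above says that the macro drops a symbol by setting its second argument to a non-zero value. The sketch below shows that calling convention on a toy symbol type. The symbol structure, the @samp{.L} naming policy, and every @code{sketch_} name are assumptions made for this example only.

```c
#include <string.h>

/* Toy symbol type; the real symbol structure is not shown in this chapter.  */
struct sketch_symbol
{
  const char *name;
};

/* Hypothetical policy: drop symbols whose names start with ".L".  The macro
   assigns to its second argument, which is how the text says a symbol is
   kept out of the object file.  */
#define sketch_frob_symbol(sym, punt) \
  do { if (strncmp ((sym)->name, ".L", 2) == 0) (punt) = 1; } while (0)

/* Count the symbols that survive, invoking the hook once per symbol.  */
int
sketch_count_kept (const struct sketch_symbol *syms, int n)
{
  int i, kept = 0;

  for (i = 0; i < n; i++)
    {
      int punt = 0;

      sketch_frob_symbol (&syms[i], punt);
      if (!punt)
        kept++;
    }
  return kept;
}
```

Because the hook is a macro, it can assign directly to the caller's variable, which is why the second argument is passed by name and not by address.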
1183 | ||
1184 | @node Emulations | |
1185 | @subsection Writing emulation files | |
1186 | ||
1187 | Normally you do not have to write an emulation file. You can just use | |
1188 | @file{te-generic.h}. | |
1189 | ||
1190 | If you do write your own emulation file, it must include @file{obj-format.h}. | |
1191 | ||
1192 | An emulation file will often define @code{TE_@var{EM}}; this may then be used | |
1193 | in other files to change the output. | |
ae6cd60f KR |
1194 | |
1195 | @node Relaxation | |
af16e411 FF |
1196 | @section Relaxation |
1197 | @cindex relaxation | |
1198 | ||
1199 | @dfn{Relaxation} is a generic term used when the size of some instruction or | |
1200 | data depends upon the value of some symbol or other data. | |
1201 | ||
1202 | GAS knows how to relax a particular type of PC relative relocation using a table. | |
1203 | You can also define arbitrarily complex forms of relaxation yourself. | |
1204 | ||
1205 | @menu | |
1206 | * Relaxing with a table:: Relaxing with a table | |
1207 | * General relaxing:: General relaxing | |
1208 | @end menu | |
1209 | ||
1210 | @node Relaxing with a table | |
1211 | @subsection Relaxing with a table | |
1212 | ||
1213 | If you do not define @code{md_relax_frag}, and you do define | |
1214 | @code{TC_GENERIC_RELAX_TABLE}, GAS will relax @code{rs_machine_dependent} frags | |
1215 | based on the frag subtype and the displacement to some specified target | |
1216 | address. The basic idea is that several machines have different addressing | |
1217 | modes for instructions that can specify different ranges of values, with | |
1218 | successive modes able to access wider ranges, including the entirety of the | |
1219 | previous range. Smaller ranges are assumed to be more desirable (perhaps the | |
1220 | instruction requires one word instead of two or three); if this is not the | |
1221 | case, don't describe the smaller-range, inferior mode. | |
1222 | ||
1223 | The @code{fr_subtype} field of a frag is an index into a CPU-specific | |
ae6cd60f KR |
1224 | relaxation table. That table entry indicates the range of values that can be |
1225 | stored, the number of bytes that will have to be added to the frag to | |
1226 | accommodate the addressing mode, and the index of the next entry to examine if | |
1227 | the value to be stored is outside the range accessible by the current | |
1228 | addressing mode. The @code{fr_symbol} field of the frag indicates what symbol | |
1229 | is to be accessed; the @code{fr_offset} field is added in. | |
1230 | ||
1231 | If the @code{fr_pcrel_adjust} field is set, which currently should only happen | |
1232 | for the NS32k family, the @code{TC_PCREL_ADJUST} macro is called on the frag to | |
1233 | compute an adjustment to be made to the displacement. | |
1234 | ||
1235 | The value fitted by the relaxation code is always assumed to be a displacement | |
1236 | from the current frag. (More specifically, from @code{fr_fix} bytes into the | |
af16e411 FF |
1237 | frag.) |
1238 | @ignore | |
1239 | This seems kinda silly. What about fitting small absolute values? I suppose | |
1240 | @code{md_assemble} is supposed to take care of that, but if the operand is a | |
1241 | difference between symbols, it might not be able to, if the difference was not | |
1242 | computable yet. | |
1243 | @end ignore | |
ae6cd60f KR |
1244 | |
1245 | The end of the relaxation sequence is indicated by a ``next'' value of 0. This | |
af16e411 FF |
1246 | means that the first entry in the table can't be used. |
1247 | ||
1248 | For some configurations, the linker can do relaxing within a section of an | |
1249 | object file. If call instructions of various sizes exist, the linker can | |
1250 | determine which should be used in each instance, when a symbol's value is | |
1251 | resolved. In order for the linker to avoid wasting space and having to insert | |
1252 | no-op instructions, it must be able to expand or shrink the section contents | |
1253 | while still preserving intra-section references and meeting alignment | |
1254 | requirements. | |
1255 | ||
1256 | For the i960 using b.out format, no expansion is done; instead, each | |
1257 | @samp{.align} directive causes extra space to be allocated, enough that when | |
1258 | the linker is relaxing a section and removing unneeded space, it can discard | |
1259 | some or all of this extra padding and cause the following data to be correctly | |
1260 | aligned. | |
1261 | ||
1262 | For the H8/300, I think the linker expands calls that can't reach, and doesn't | |
1263 | worry about alignment issues; the cpu probably never needs any significant | |
1264 | alignment beyond the instruction size. | |
ae6cd60f KR |
1265 | |
1266 | The relaxation table type contains these fields: | |
1267 | ||
1268 | @table @code | |
1269 | @item long rlx_forward | |
1270 | Forward reach, must be non-negative. | |
1271 | @item long rlx_backward | |
1272 | Backward reach, must be zero or negative. | |
1273 | @item rlx_length | |
1274 | Length in bytes of this addressing mode. | |
1275 | @item rlx_more | |
af16e411 | 1276 | Index of the next-longer relax state, or zero if there is no next relax state. |
ae6cd60f KR |
1277 | @end table |
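
The fields listed above can be pictured with a small table and a simplified walk over it. This is a sketch only: the structure is a stand-in that borrows the documented field names, the three addressing modes are invented, and the real logic in @code{relax_segment} handles more cases than this loop does.

```c
#include <stddef.h>

/* Field names follow the list above; the struct itself is a stand-in.  */
struct sketch_relax
{
  long rlx_forward;   /* forward reach, non-negative */
  long rlx_backward;  /* backward reach, zero or negative */
  int rlx_length;     /* length in bytes of this addressing mode */
  int rlx_more;       /* index of the next-longer state, 0 if none */
};

/* Hypothetical three-mode table.  Entry 0 is a placeholder, since a "next"
   value of 0 marks the end of a relaxation sequence.  */
const struct sketch_relax sketch_table[] =
{
  { 0, 0, 0, 0 },
  { 127, -128, 1, 2 },                      /* state 1: 8-bit displacement */
  { 32767, -32768, 2, 3 },                  /* state 2: 16-bit displacement */
  { 0x7fffffffL, -0x7fffffffL - 1, 4, 0 }   /* state 3: 32-bit displacement */
};

/* Starting from STATE, move to wider states until DISP fits or no wider
   state remains.  This sketch never moves back to a narrower state.  */
int
sketch_pick_state (const struct sketch_relax *table, int state, long disp)
{
  for (;;)
    {
      const struct sketch_relax *r = &table[state];

      if (disp >= r->rlx_backward && disp <= r->rlx_forward)
        return state;
      if (r->rlx_more == 0)
        return state;       /* nothing wider to try */
      state = r->rlx_more;
    }
}
```

With this table, a displacement of 200 starting from state 1 does not fit the 8-bit range, so the walk follows @code{rlx_more} and settles on state 2.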
1278 | ||
1279 | The relaxation is done in @code{relax_segment} in @file{write.c}. The | |
1280 | difference in the length fields between the original mode and the one finally | |
1281 | chosen by the relaxing code is taken as the size by which the current frag will | |
1282 | be increased in size. For example, if the initial relaxing mode has a length | |
1283 | of 2 bytes, and because of the size of the displacement, it gets upgraded to a | |
1284 | mode with a size of 6 bytes, it is assumed that the frag will grow by 4 bytes. | |
1285 | (The initial two bytes should have been part of the fixed portion of the frag, | |
1286 | since it is already known that they will be output.) This growth must be | |
1287 | effected by @code{md_convert_frag}; it should increase the @code{fr_fix} field | |
1288 | by the appropriate size, and fill in the appropriate bytes of the frag. | |
1289 | (Enough space for the maximum growth should have been allocated in the call to | |
1290 | @code{frag_var} as the second argument.) | |
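
The 2 byte to 6 byte example above can be sketched as follows. The toy frag below carries only the pieces that paragraph mentions; @code{fr_max_var} and every @code{sketch_} name are hypothetical, and the real @code{md_convert_frag} takes different arguments.

```c
#include <assert.h>
#include <string.h>

/* Toy frag: only the pieces the paragraph above mentions.  The real frag
   structure has more fields.  */
struct sketch_frag
{
  int fr_fix;                    /* bytes already known to be output */
  int fr_max_var;                /* hypothetical: room reserved for growth */
  unsigned char fr_literal[16];  /* fixed bytes, then the variable part */
};

/* Growth is the difference between the final and initial mode lengths.  */
int
sketch_growth (int initial_length, int final_length)
{
  return final_length - initial_length;
}

/* Stand-in for the bookkeeping the conversion step has to do: write the
   extra bytes after the fixed part, then extend fr_fix over them.  */
void
sketch_convert_frag (struct sketch_frag *f, const unsigned char *extra,
                     int growth)
{
  assert (growth >= 0 && growth <= f->fr_max_var);
  assert (f->fr_fix + growth <= (int) sizeof f->fr_literal);
  memcpy (f->fr_literal + f->fr_fix, extra, (size_t) growth);
  f->fr_fix += growth;
}
```

Here a frag with @code{fr_fix} of 2 that is upgraded to a 6 byte mode grows by 4 bytes, and @code{fr_fix} becomes 6 once the conversion has filled in the extra bytes.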
1291 | ||
1292 | If relocation records are needed, they should be emitted by | |
af16e411 FF |
1293 | @code{md_estimate_size_before_relax}. This function should examine the target |
1294 | symbol of the supplied frag and correct the @code{fr_subtype} of the frag if | |
1295 | needed. When this function is called, if the symbol has not yet been defined, | |
1296 | it will not become defined later; however, its value may still change if the | |
1297 | section it is in gets relaxed. | |
ae6cd60f KR |
1298 | |
1299 | Usually, if the symbol is in the same section as the frag (given by the | |
1300 | @var{sec} argument), the narrowest likely relaxation mode is stored in | |
1301 | @code{fr_subtype}, and that's that. | |
1302 | ||
1303 | If the symbol is undefined, or in a different section (and therefore moveable | |
1304 | to an arbitrarily large distance), the largest available relaxation mode is | |
1305 | specified, @code{fix_new} is called to produce the relocation record, | |
1306 | @code{fr_fix} is increased to include the relocated field (remember, this | |
1307 | storage was allocated when @code{frag_var} was called), and @code{frag_wane} is | |
1308 | called to convert the frag to an @code{rs_fill} frag with no variant part. | |
1309 | Sometimes changing addressing modes may also require rewriting the instruction. | |
1310 | It can be accessed via @code{fr_opcode} or @code{fr_fix}. | |
1311 | ||
1312 | Sometimes @code{fr_var} is increased instead, and @code{frag_wane} is not | |
1313 | called. I'm not sure, but I think this is to keep @code{fr_fix} referring to | |
1314 | an earlier byte, and @code{fr_type} set to @code{rs_machine_dependent} so | |
1315 | that @code{md_convert_frag} will get called. | |
ae6cd60f | 1316 | |
af16e411 FF |
1317 | @node General relaxing |
1318 | @subsection General relaxing | |
ae6cd60f | 1319 | |
af16e411 FF |
1320 | If using a simple table is not suitable, you may implement arbitrarily complex |
1321 | relaxation semantics yourself. For example, the MIPS backend uses this to emit | |
1322 | different instruction sequences depending upon the size of the symbol being | |
1323 | accessed. | |
ae6cd60f | 1324 | |
af16e411 FF |
1325 | When you assemble an instruction that may need relaxation, you should allocate |
1326 | a frag using @code{frag_var} or @code{frag_variant} with a type of | |
1327 | @code{rs_machine_dependent}. You should store some sort of information in the | |
1328 | @code{fr_subtype} field so that you can figure out what to do with the frag | |
1329 | later. | |
ae6cd60f | 1330 | |
af16e411 FF |
1331 | When GAS reaches the end of the input file, it will look through the frags and |
1332 | work out their final sizes. | |
ae6cd60f | 1333 | |
af16e411 FF |
1334 | GAS will first call @code{md_estimate_size_before_relax} on each |
1335 | @code{rs_machine_dependent} frag. This function must return an estimated size | |
1336 | for the frag. | |
ae6cd60f | 1337 | |
af16e411 FF |
1338 | GAS will then loop over the frags, calling @code{md_relax_frag} on each |
1339 | @code{rs_machine_dependent} frag. This function should return the change in | |
1340 | size of the frag. GAS will keep looping over the frags until none of the frags | |
1341 | changes size. | |
ae6cd60f | 1342 | |
af16e411 FF |
1343 | @node Broken words |
1344 | @section Broken words | |
1345 | @cindex internals, broken words | |
1346 | @cindex broken words | |
ed307a20 | 1347 | |
af16e411 FF |
1348 | Some compilers, including GCC, will sometimes emit switch tables specifying |
1349 | 16-bit @code{.word} displacements to branch targets, and branch instructions | |
1350 | that load entries from that table to compute the target address. If this is | |
1351 | done on a 32-bit machine, there is a chance (at least with really large | |
1352 | functions) that the displacement will not fit in 16 bits. The assembler | |
1353 | handles this using a concept called @dfn{broken words}. The name is apt: | |
1354 | a @code{.word} carries an implied promise that the 16-bit field will in fact | |
1355 | hold the specified displacement, and here that promise is broken. | |
1356 | ||
1357 | If broken word processing is enabled, and a situation like this is encountered, | |
1358 | the assembler will insert a jump instruction into the instruction stream, close | |
1359 | enough to be reached with the 16-bit displacement. This jump instruction will | |
1360 | transfer to the real desired target address. Thus, as long as the @code{.word} | |
1361 | value really is used as a displacement to compute an address to jump to, the | |
1362 | net effect will be correct (minus a very small efficiency cost). If | |
1363 | @code{.word} directives with label differences for values are used for other | |
1364 | purposes, however, things may not work properly. For targets which use broken | |
1365 | words, the @samp{-K} option will warn when a broken word is discovered. | |
1366 | ||
1367 | The broken word code is turned off by the @code{WORKING_DOT_WORD} macro. It | |
1368 | isn't needed if @code{.word} emits a value large enough to contain an address | |
1369 | (or, more correctly, any possible difference between two addresses). | |
1370 | ||
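The decision behind a broken word can be sketched as follows. This is a minimal illustration, not the GAS implementation: the function names are hypothetical, and the sketch assumes that the 16-bit field holds a signed displacement.

```c
/* Assumption: the 16-bit field is a signed displacement.  */
static int
demo_fits_in_word (long displacement)
{
  return displacement >= -32768L && displacement <= 32767L;
}

/* Return the displacement to store in the 16-bit field.  If the real
   target is out of reach, store instead the displacement of a nearby
   jump instruction (hypothetically at JUMP_DISPLACEMENT) that transfers
   control to the real target.  */
static long
demo_word_value (long target_displacement, long jump_displacement)
{
  if (demo_fits_in_word (target_displacement))
    return target_displacement;
  return jump_displacement;
}
```

If the @code{.word} value is used for anything other than computing a jump address, the substituted displacement is simply wrong, which is the hazard the text describes.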
1371 | @node Internal functions | |
1372 | @section Internal functions | |
1373 | ||
1374 | This section describes basic internal functions used by GAS. | |
ed307a20 | 1375 | |
af16e411 FF |
1376 | @menu |
1377 | * Warning and error messages:: Warning and error messages | |
1378 | * Hash tables:: Hash tables | |
1379 | @end menu | |
ed307a20 | 1380 | |
af16e411 FF |
1381 | @node Warning and error messages |
1382 | @subsection Warning and error messages | |
ed307a20 | 1383 | |
af16e411 FF |
1384 | @deftypefun @{@} int had_warnings (void) |
1385 | @deftypefunx @{@} int had_errors (void) | |
ae6cd60f KR |
1386 | Returns non-zero if any warnings or errors, respectively, have been printed |
1387 | during this invocation. | |
ed307a20 KR |
1388 | @end deftypefun |
1389 | ||
af16e411 | 1390 | @deftypefun @{@} void as_perror (const char *@var{gripe}, const char *@var{filename}) |
ed307a20 | 1391 | Displays a BFD or system error, then clears the error status. |
ed307a20 KR |
1392 | @end deftypefun |
1393 | ||
af16e411 FF |
1394 | @deftypefun @{@} void as_tsktsk (const char *@var{format}, ...) |
1395 | @deftypefunx @{@} void as_warn (const char *@var{format}, ...) | |
1396 | @deftypefunx @{@} void as_bad (const char *@var{format}, ...) | |
1397 | @deftypefunx @{@} void as_fatal (const char *@var{format}, ...) | |
ae6cd60f KR |
1398 | These functions display messages about something amiss with the input file, or |
1399 | internal problems in the assembler itself. The current file name and line | |
1400 | number are printed, followed by the supplied message, formatted using | |
1401 | @code{vfprintf}, and a final newline. | |
1402 | ||
1403 | An error indicated by @code{as_bad} will result in a non-zero exit status when | |
1404 | the assembler has finished. Calling @code{as_fatal} will result in immediate | |
1405 | termination of the assembler process. | |
ed307a20 KR |
1406 | @end deftypefun |
1407 | ||
af16e411 FF |
1408 | @deftypefun @{@} void as_warn_where (char *@var{file}, unsigned int @var{line}, const char *@var{format}, ...) |
1409 | @deftypefunx @{@} void as_bad_where (char *@var{file}, unsigned int @var{line}, const char *@var{format}, ...) | |
ae6cd60f KR |
1410 | These variants permit specification of the file name and line number, and are |
1411 | used for problems detected while reprocessing information that was saved | |
1412 | during some earlier part of the file. For example, fixups are processed | |
1413 | after all input has been read, but messages about fixups should refer to the | |
1414 | original filename and line number that they are applicable to. | |
ed307a20 KR |
1415 | @end deftypefun |
1416 | ||
af16e411 FF |
1417 | @deftypefun @{@} void fprint_value (FILE *@var{file}, valueT @var{val}) |
1418 | @deftypefunx @{@} void sprint_value (char *@var{buf}, valueT @var{val}) | |
ae6cd60f KR |
1419 | These functions are helpful for converting a @code{valueT} value into printable |
1420 | format, in case it's wider than modes that @code{*printf} can handle. If the | |
1421 | type is narrow enough, a decimal number will be produced; otherwise, it will be | |
af16e411 FF |
1422 | in hexadecimal. The value itself is not examined to make this determination. |
1423 | @end deftypefun | |
1424 | ||
1425 | @node Hash tables | |
1426 | @subsection Hash tables | |
1427 | @cindex hash tables | |
ed307a20 | 1428 | |
af16e411 FF |
1429 | @deftypefun @{@} @{struct hash_control *@} hash_new (void) |
1430 | Creates the hash table control structure. | |
ed307a20 | 1431 | @end deftypefun |
ae6cd60f | 1432 | |
af16e411 FF |
1433 | @deftypefun @{@} void hash_die (struct hash_control *) |
1434 | Destroys a hash table. | |
1435 | @end deftypefun | |
1436 | ||
1437 | @deftypefun @{@} PTR hash_delete (struct hash_control *, const char *) | |
1438 | Deletes an entry from the hash table and returns the value it had. | |
1439 | @end deftypefun | |
1440 | ||
1441 | @deftypefun @{@} PTR hash_replace (struct hash_control *, const char *, PTR) | |
1442 | Updates the value for an entry already in the table, returning the old value. | |
1443 | If no entry was found, just returns NULL. | |
1444 | @end deftypefun | |
1445 | ||
1446 | @deftypefun @{@} @{const char *@} hash_insert (struct hash_control *, const char *, PTR) | |
1447 | Inserts a new entry. Inserting a key that is already in the table is an error. | |
1448 | Returns NULL on success, or an error message on failure. | |
1449 | @end deftypefun | |
1450 | ||
1451 | @deftypefun @{@} @{const char *@} hash_jam (struct hash_control *, const char *, PTR) | |
1452 | Inserts the entry if the key isn't already present, or updates its value if it is. | |
1453 | @end deftypefun | |
ae6cd60f KR |
1454 | |
1455 | @node Test suite | |
1456 | @section Test suite | |
1457 | @cindex test suite | |
1458 | ||
1459 | The test suite is kind of lame for most processors. Often it only checks to | |
1460 | see if a couple of files can be assembled without the assembler reporting any | |
1461 | errors. For more complete testing, write a test which either examines the | |
1462 | assembler listing, or runs @code{objdump} and examines its output. For the | |
1463 | latter, the Tcl procedure @code{run_dump_test} may come in handy. It takes the | |
1464 | base name of a file, and looks for @file{@var{file}.d}. This file should | |
1465 | contain as its initial lines a set of variable settings in @samp{#} comments, | |
1466 | in the form: | |
1467 | ||
1468 | @example | |
1469 | #@var{varname}: @var{value} | |
1470 | @end example | |
1471 | ||
1472 | The @var{varname} may be @code{objdump}, @code{nm}, or @code{as}, in which case | |
1473 | it specifies the options to be passed to the specified programs. Exactly one | |
1474 | of @code{objdump} or @code{nm} must be specified, as that also specifies which | |
1475 | program to run after the assembler has finished. If @var{varname} is | |
1476 | @code{source}, it specifies the name of the source file; otherwise, | |
1477 | @file{@var{file}.s} is used. If @var{varname} is @code{name}, it specifies the | |
1478 | name of the test to be used in the @code{pass} or @code{fail} messages. | |
1479 | ||
1480 | The non-commented parts of the file are interpreted as regular expressions, one | |
1481 | per line. Blank lines in the @code{objdump} or @code{nm} output are skipped, | |
1482 | as are blank lines in the @file{.d} file; the other lines are tested to see if | |
1483 | the regular expression matches the program output. If it does not, the test | |
1484 | fails. | |
1485 | ||
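As an illustration, a small @file{.d} file might look like the following. The test name and the patterns are hypothetical; the expressions a real test needs depend on the target and on the @code{objdump} output being checked.

@example
#objdump: -dr
#name: hypothetical branch test
.*: +file format .*
Disassembly of section \.text:
@end example

The first two lines set the @code{objdump} options and the test name; the remaining lines are regular expressions matched against the @code{objdump} output.
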
1486 | Note that this means the tests must be modified if the @code{objdump} output | |
1487 | style is changed. | |
1488 | ||
1489 | @bye | |
1490 | @c Local Variables: | |
1491 | @c fill-column: 79 | |
1492 | @c End: |