Commit | Line | Data |
---|---|---|
252b5132 | 1 | \input texinfo |
7898deda | 2 | @c Copyright 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1998, |
c067354b | 3 | @c 2000, 2001, 2002, 2003, 2004, 2006, 2007, 2009 |
7898deda | 4 | @c Free Software Foundation, Inc. |
252b5132 RH |
5 | @setfilename bfdint.info |
6 | ||
7 | @settitle BFD Internals | |
8 | @iftex | |
9 | @titlepage | |
10 | @title{BFD Internals} | |
11 | @author{Ian Lance Taylor} | |
12 | @author{Cygnus Solutions} | |
13 | @page | |
14 | @end iftex | |
15 | ||
0e9517a9 | 16 | @copying |
f0757517 NC |
17 | This file documents the internals of the BFD library. |
18 | ||
0e9517a9 | 19 | Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, |
c067354b | 20 | 1996, 1998, 2000, 2001, 2002, 2003, 2004, 2006, 2007, 2009 |
f0757517 NC |
21 | Free Software Foundation, Inc. |
22 | Contributed by Cygnus Support. | |
23 | ||
0e9517a9 NC |
24 | Permission is granted to copy, distribute and/or modify this document |
25 | under the terms of the GNU Free Documentation License, Version 1.1 or | |
26 | any later version published by the Free Software Foundation; with the | |
27 | Invariant Sections being ``GNU General Public License'' and ``Funding | |
28 | Free Software'', the Front-Cover texts being (a) (see below), and with | |
29 | the Back-Cover Texts being (b) (see below). A copy of the license is | |
30 | included in the section entitled ``GNU Free Documentation License''. | |
f0757517 | 31 | |
0e9517a9 | 32 | (a) The FSF's Front-Cover Text is: |
f0757517 | 33 | |
0e9517a9 NC |
34 | A GNU Manual |
35 | ||
36 | (b) The FSF's Back-Cover Text is: | |
37 | ||
38 | You have freedom to copy and modify this GNU Manual, like GNU | |
39 | software. Copies published by the Free Software Foundation raise | |
40 | funds for GNU development. | |
41 | @end copying | |
f0757517 | 42 | |
252b5132 RH |
43 | @node Top |
44 | @top BFD Internals | |
45 | @raisesections | |
46 | @cindex bfd internals | |
47 | ||
48 | This document describes some BFD internal information which may be | |
49 | helpful when working on BFD. It is very incomplete. | |
50 | ||
5b343f5a | 51 | This document is not updated regularly, and may be out of date. |
252b5132 RH |
52 | |
53 | The initial version of this document was written by Ian Lance Taylor | |
54 | @email{ian@@cygnus.com}. | |
55 | ||
56 | @menu | |
57 | * BFD overview:: BFD overview | |
58 | * BFD guidelines:: BFD programming guidelines | |
59 | * BFD target vector:: BFD target vector | |
60 | * BFD generated files:: BFD generated files | |
61 | * BFD multiple compilations:: Files compiled multiple times in BFD | |
62 | * BFD relocation handling:: BFD relocation handling | |
63 | * BFD ELF support:: BFD ELF support | |
64 | * BFD glossary:: Glossary | |
65 | * Index:: Index | |
66 | @end menu | |
67 | ||
68 | @node BFD overview | |
69 | @section BFD overview | |
70 | ||
71 | BFD is a library which provides a single interface to read and write | |
72 | object files, executables, archive files, and core files in any format. | |
73 | ||
74 | @menu | |
75 | * BFD library interfaces:: BFD library interfaces | |
76 | * BFD library users:: BFD library users | |
77 | * BFD view:: The BFD view of a file | |
78 | * BFD blindness:: BFD loses information | |
79 | @end menu | |
80 | ||
81 | @node BFD library interfaces | |
82 | @subsection BFD library interfaces | |
83 | ||
84 | One way to look at the BFD library is to divide it into four parts by | |
85 | type of interface. | |
86 | ||
87 | The first interface is the set of generic functions which programs using | |
88 | the BFD library will call. These generic functions normally translate | |
89 | directly or indirectly into calls to routines which are specific to a | |
90 | particular object file format. Many of these generic functions are | |
91 | actually defined as macros in @file{bfd.h}. These functions comprise | |
92 | the official BFD interface. | |
93 | ||
94 | The second interface is the set of functions which appear in the target | |
95 | vectors. This is the bulk of the code in BFD. A target vector is a set | |
96 | of function pointers specific to a particular object file format. The | |
97 | target vector is used to implement the generic BFD functions. These | |
98 | functions are always called through the target vector, and are never | |
99 | called directly. The target vector is described in detail in @ref{BFD | |
100 | target vector}. The set of functions which appear in a particular | |
101 | target vector is often referred to as a BFD backend. | |
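The relationship between a generic function and a target vector entry can be sketched in a few lines of C. This is only an illustration: the @samp{toy_} types and functions below are hypothetical stand-ins, not the real BFD declarations, and the real target vector has many more entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the real BFD types.  */
struct toy_bfd;

struct toy_target
{
  const char *name;
  /* One function pointer per format specific operation.  */
  long (*get_symtab_upper_bound) (struct toy_bfd *);
};

struct toy_bfd
{
  const struct toy_target *xvec;  /* The target vector for this file.  */
  long symcount;
};

/* A backend routine.  It is only ever reached through the vector.  */
long
toy_elf_get_symtab_upper_bound (struct toy_bfd *abfd)
{
  /* Room for one pointer per symbol plus a terminating NULL.  */
  return (abfd->symcount + 1) * (long) sizeof (void *);
}

const struct toy_target toy_elf_vec =
{
  "toy-elf",
  toy_elf_get_symtab_upper_bound
};

/* The generic entry point which a program using the library calls.  */
long
toy_get_symtab_upper_bound (struct toy_bfd *abfd)
{
  assert (abfd != NULL && abfd->xvec != NULL);
  return abfd->xvec->get_symtab_upper_bound (abfd);
}
```

A caller only ever names @samp{toy_get_symtab_upper_bound}; which backend routine runs is decided by the vector attached to the file.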
102 | ||
103 | The third interface is a set of oddball functions which are typically | |
104 | specific to a particular object file format, are not generic functions, | |
105 | and are called from outside of the BFD library. These are used as hooks | |
106 | by the linker and the assembler when a particular object file format | |
107 | requires some action which the BFD generic interface does not provide. | |
108 | These functions are typically declared in @file{bfd.h}, but in many | |
109 | cases they are only provided when BFD is configured with support for a | |
110 | particular object file format. These functions live in a grey area, and | |
111 | are not really part of the official BFD interface. | |
112 | ||
113 | The fourth interface is the set of BFD support functions which are | |
114 | called by the other BFD functions. These manage issues like memory | |
115 | allocation, error handling, file access, hash tables, swapping, and the | |
116 | like. These functions are never called from outside of the BFD library. | |
117 | ||
118 | @node BFD library users | |
119 | @subsection BFD library users | |
120 | ||
121 | Another way to look at the BFD library is to divide it into three parts | |
122 | by the manner in which it is used. | |
123 | ||
124 | The first use is to read an object file. The object file readers are | |
125 | programs like @samp{gdb}, @samp{nm}, @samp{objdump}, and @samp{objcopy}. | |
126 | These programs use BFD to view an object file in a generic form. The | |
127 | official BFD interface is normally fully adequate for these programs. | |
128 | ||
129 | The second use is to write an object file. The object file writers are | |
130 | programs like @samp{gas} and @samp{objcopy}. These programs use BFD to | |
131 | create an object file. The official BFD interface is normally adequate | |
132 | for these programs, but for some object file formats the assembler needs | |
133 | some additional hooks in order to set particular flags or other | |
134 | information. The official BFD interface includes functions to copy | |
135 | private information from one object file to another, and these functions | |
136 | are used by @samp{objcopy} to avoid information loss. | |
137 | ||
138 | The third use is to link object files. There is only one object file | |
139 | linker, @samp{ld}. Originally, @samp{ld} was an object file reader and | |
140 | an object file writer, and it did the link operation using the generic | |
141 | BFD structures. However, this turned out to be too slow and too memory | |
142 | intensive. | |
143 | ||
144 | The official BFD linker functions were written to permit specific BFD | |
145 | backends to perform the link without translating through the generic | |
146 | structures, in the normal case where all the input files and output file | |
147 | have the same object file format. Not all of the backends currently | |
148 | implement the new interface, and there are default linking functions | |
149 | within BFD which use the generic structures and which work with all | |
150 | backends. | |
151 | ||
152 | For several object file formats the linker needs additional hooks which | |
153 | are not provided by the official BFD interface, particularly for dynamic | |
154 | linking support. These functions are typically called from the linker | |
155 | emulation template. | |
156 | ||
157 | @node BFD view | |
158 | @subsection The BFD view of a file | |
159 | ||
160 | BFD uses generic structures to manage information. It translates data | |
161 | into the generic form when reading files, and out of the generic form | |
162 | when writing files. | |
163 | ||
164 | BFD describes a file as a pointer to the @samp{bfd} type. A @samp{bfd} | |
165 | is composed of the following elements. The BFD information can be | |
166 | displayed using the @samp{objdump} program with various options. | |
167 | ||
168 | @table @asis | |
169 | @item general information | |
170 | The object file format, a few general flags, the start address. | |
171 | @item architecture | |
172 | The architecture, including both a general processor type (m68k, MIPS, | |
173 | etc.) and a specific machine number (m68000, R4000, etc.). | |
174 | @item sections | |
175 | A list of sections. | |
176 | @item symbols | |
177 | A symbol table. | |
178 | @end table | |
179 | ||
180 | BFD represents a section as a pointer to the @samp{asection} type. Each | |
181 | section has a name and a size. Most sections also have an associated | |
182 | block of data, known as the section contents. Sections also have | |
183 | associated flags, a virtual memory address, a load memory address, a | |
184 | required alignment, a list of relocations, and other miscellaneous | |
185 | information. | |
186 | ||
187 | BFD represents a relocation as a pointer to the @samp{arelent} type. A | |
188 | relocation describes an action which the linker must take to modify the | |
189 | section contents. Relocations have a symbol, an address, an addend, and | |
190 | a pointer to a howto structure which describes how to perform the | |
191 | relocation. For more information, see @ref{BFD relocation handling}. | |
192 | ||
193 | BFD represents a symbol as a pointer to the @samp{asymbol} type. A | |
194 | symbol has a name, a pointer to a section, an offset within that | |
195 | section, and some flags. | |
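Because a symbol is stored as a section plus an offset, its address has to be computed from both. The sketch below shows that arithmetic with hypothetical @samp{toy_} structures; the field names are assumptions chosen for illustration and the real @samp{asection} and @samp{asymbol} types carry far more information.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long toy_vma;

struct toy_section
{
  const char *name;
  toy_vma vma;   /* Virtual memory address of the section.  */
  toy_vma size;
};

struct toy_symbol
{
  const char *name;
  const struct toy_section *section;
  toy_vma value;       /* Offset within SECTION.  */
  unsigned int flags;
};

/* Return the address of SYM: the section address plus the offset.  */
toy_vma
toy_symbol_address (const struct toy_symbol *sym)
{
  assert (sym != NULL && sym->section != NULL);
  return sym->section->vma + sym->value;
}
```

With a section at address 0x1000 and a symbol at offset 0x40 within it, the function returns 0x1040.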
196 | ||
197 | Archive files do not have any sections or symbols. Instead, BFD | |
198 | represents an archive file as a file which contains a list of | |
199 | @samp{bfd}s. BFD also provides access to the archive symbol map, as a | |
200 | list of symbol names. BFD provides a function to return the @samp{bfd} | |
201 | within the archive which corresponds to a particular entry in the | |
202 | archive symbol map. | |
203 | ||
204 | @node BFD blindness | |
205 | @subsection BFD loses information | |
206 | ||
207 | Most object file formats have information which BFD can not represent in | |
208 | its generic form, at least as currently defined. | |
209 | ||
210 | There is often explicit information which BFD can not represent, for | |
211 | example the COFF version stamp or the ELF program segments. BFD | |
212 | provides special hooks to handle this information when copying, | |
213 | printing, or linking an object file. The BFD support for a particular | |
214 | object file format will normally store this information in private data | |
215 | and handle it using the special hooks. | |
216 | ||
217 | In some cases there is also implicit information which BFD can not | |
218 | represent. For example, the MIPS processor distinguishes small and | |
b45619c0 | 219 | large symbols, and requires that all small symbols be within 32K of the |
252b5132 RH |
220 | GP register. This means that the MIPS assembler must be able to mark |
221 | variables as either small or large, and the MIPS linker must know to put | |
222 | small symbols within range of the GP register. Since BFD can not | |
223 | represent this information, this means that the assembler and linker | |
224 | must have information that is specific to a particular object file | |
225 | format which is outside of the BFD library. | |
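The range constraint on small symbols can be written as a simple check. The sketch below assumes that ``within 32K'' means a signed 16-bit displacement from the GP value, which is an assumption about the addressing mode rather than something stated above; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if SYMBOL_ADDRESS can be reached from GP_VALUE using a
   signed 16-bit displacement (assumed range: -32768 to 32767).  */
bool
toy_gp_reachable (uint32_t symbol_address, uint32_t gp_value)
{
  int64_t offset = (int64_t) symbol_address - (int64_t) gp_value;
  return offset >= -32768 && offset <= 32767;
}
```

A linker which places small data would need every small symbol to pass a check of this kind, which is why it must know which symbols are small.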
226 | ||
227 | This loss of information indicates areas where the BFD paradigm breaks | |
228 | down. It is not actually possible to represent the myriad differences | |
229 | among object file formats using a single generic interface, at least not | |
230 | in the manner which BFD does it today. | |
231 | ||
232 | Nevertheless, the BFD library does greatly simplify the task of dealing | |
233 | with object files, and particular problems caused by information loss | |
234 | can normally be solved using some sort of relatively constrained hook | |
235 | into the library. | |
236 | ||
237 | ||
238 | ||
239 | @node BFD guidelines | |
240 | @section BFD programming guidelines | |
241 | @cindex bfd programming guidelines | |
242 | @cindex programming guidelines for bfd | |
243 | @cindex guidelines, bfd programming | |
244 | ||
245 | There is a lot of poorly written and confusing code in BFD. New BFD | |
246 | code should be written to a higher standard. Merely because some BFD | |
247 | code is written in a particular manner does not mean that you should | |
248 | emulate it. | |
249 | ||
250 | Here are some general BFD programming guidelines: | |
251 | ||
252 | @itemize @bullet | |
253 | @item | |
254 | Follow the GNU coding standards. | |
255 | ||
256 | @item | |
257 | Avoid global variables. We ideally want BFD to be fully reentrant, so | |
258 | that it can be used in multiple threads. All uses of global or static | |
259 | variables interfere with that. Initialized constant variables are OK, | |
b45619c0 | 260 | and they should be explicitly marked with @samp{const}. Instead of global |
252b5132 RH |
261 | variables, use data attached to a BFD or to a linker hash table. |
262 | ||
263 | @item | |
264 | All externally visible functions should have names which start with | |
265 | @samp{bfd_}. All such functions should be declared in some header file, | |
266 | typically @file{bfd.h}. See, for example, the various declarations near | |
267 | the end of @file{bfd-in.h}, which mostly declare functions required by | |
268 | specific linker emulations. | |
269 | ||
270 | @item | |
271 | All functions which need to be visible from one file to another within | |
272 | BFD, but should not be visible outside of BFD, should start with | |
273 | @samp{_bfd_}. Although external names beginning with @samp{_} are | |
274 | prohibited by the ANSI standard, in practice this usage will always | |
275 | work, and it is required by the GNU coding standards. | |
276 | ||
277 | @item | |
278 | Always remember that people can compile using @samp{--enable-targets} to | |
279 | build several, or all, targets at once. It must be possible to link | |
280 | together the files for all targets. | |
281 | ||
282 | @item | |
283 | BFD code should compile with few or no warnings using @samp{gcc -Wall}. | |
284 | Some warnings are OK, like the absence of certain function declarations | |
285 | which may or may not be declared in system header files. Warnings about | |
286 | ambiguous expressions and the like should always be fixed. | |
287 | @end itemize | |
288 | ||
289 | @node BFD target vector | |
290 | @section BFD target vector | |
291 | @cindex bfd target vector | |
292 | @cindex target vector in bfd | |
293 | ||
294 | BFD supports multiple object file formats by using the @dfn{target | |
295 | vector}. This is simply a set of function pointers which implement | |
296 | behaviour that is specific to a particular object file format. | |
297 | ||
298 | In this section I list all of the entries in the target vector and | |
299 | describe what they do. | |
300 | ||
301 | @menu | |
302 | * BFD target vector miscellaneous:: Miscellaneous constants | |
303 | * BFD target vector swap:: Swapping functions | |
304 | * BFD target vector format:: Format type dependent functions | |
305 | * BFD_JUMP_TABLE macros:: BFD_JUMP_TABLE macros | |
306 | * BFD target vector generic:: Generic functions | |
307 | * BFD target vector copy:: Copy functions | |
308 | * BFD target vector core:: Core file support functions | |
309 | * BFD target vector archive:: Archive functions | |
310 | * BFD target vector symbols:: Symbol table functions | |
311 | * BFD target vector relocs:: Relocation support | |
312 | * BFD target vector write:: Output functions | |
313 | * BFD target vector link:: Linker functions | |
314 | * BFD target vector dynamic:: Dynamic linking information functions | |
315 | @end menu | |
316 | ||
317 | @node BFD target vector miscellaneous | |
318 | @subsection Miscellaneous constants | |
319 | ||
320 | The target vector starts with a set of constants. | |
321 | ||
322 | @table @samp | |
323 | @item name | |
324 | The name of the target vector. This is an arbitrary string. This is | |
325 | how the target vector is named in command line options for tools which | |
d9bc7a44 | 326 | use BFD, such as the @samp{--oformat} linker option. |
252b5132 RH |
327 | |
328 | @item flavour | |
329 | A general description of the type of target. The following flavours are | |
330 | currently defined: | |
331 | ||
332 | @table @samp | |
333 | @item bfd_target_unknown_flavour | |
334 | Undefined or unknown. | |
335 | @item bfd_target_aout_flavour | |
336 | a.out. | |
337 | @item bfd_target_coff_flavour | |
338 | COFF. | |
339 | @item bfd_target_ecoff_flavour | |
340 | ECOFF. | |
341 | @item bfd_target_elf_flavour | |
342 | ELF. | |
343 | @item bfd_target_ieee_flavour | |
344 | IEEE-695. | |
345 | @item bfd_target_nlm_flavour | |
346 | NLM. | |
347 | @item bfd_target_oasys_flavour | |
348 | OASYS. | |
349 | @item bfd_target_tekhex_flavour | |
350 | Tektronix hex format. | |
351 | @item bfd_target_srec_flavour | |
352 | Motorola S-record format. | |
353 | @item bfd_target_ihex_flavour | |
354 | Intel hex format. | |
355 | @item bfd_target_som_flavour | |
356 | SOM (used on HP/UX). | |
c067354b NC |
357 | @item bfd_target_verilog_flavour |
358 | Verilog memory hex dump format. | |
252b5132 RH |
359 | @item bfd_target_os9k_flavour |
360 | os9000. | |
361 | @item bfd_target_versados_flavour | |
362 | VERSAdos. | |
363 | @item bfd_target_msdos_flavour | |
364 | MS-DOS. | |
365 | @item bfd_target_evax_flavour | |
366 | openVMS. | |
3c3bdf30 NC |
367 | @item bfd_target_mmo_flavour |
368 | Donald Knuth's MMIXware object format. | |
252b5132 RH |
369 | @end table |
370 | ||
371 | @item byteorder | |
372 | The byte order of data in the object file. One of | |
373 | @samp{BFD_ENDIAN_BIG}, @samp{BFD_ENDIAN_LITTLE}, or | |
374 | @samp{BFD_ENDIAN_UNKNOWN}. The last would be used for a format such | |
375 | as S-records which do not record the architecture of the data. | |
376 | ||
377 | @item header_byteorder | |
378 | The byte order of header information in the object file. Normally the | |
379 | same as the @samp{byteorder} field, but there are certain cases where it | |
380 | may be different. | |
381 | ||
382 | @item object_flags | |
383 | Flags which may appear in the @samp{flags} field of a BFD with this | |
384 | format. | |
385 | ||
386 | @item section_flags | |
387 | Flags which may appear in the @samp{flags} field of a section within a | |
388 | BFD with this format. | |
389 | ||
390 | @item symbol_leading_char | |
391 | A character which the C compiler normally puts before a symbol. For | |
392 | example, an a.out compiler will typically generate the symbol | |
393 | @samp{_foo} for a function named @samp{foo} in the C source, in which | |
394 | case this field would be @samp{_}. If there is no such character, this | |
395 | field will be @samp{0}. | |
396 | ||
397 | @item ar_pad_char | |
398 | The padding character to use at the end of an archive name. Normally | |
399 | @samp{/}. | |
400 | ||
401 | @item ar_max_namelen | |
402 | The maximum length of a short name in an archive. Normally @samp{14}. | |
403 | ||
404 | @item backend_data | |
405 | A pointer to constant backend data. This is used by backends to store | |
406 | whatever additional information they need to distinguish similar target | |
407 | vectors which use the same sets of functions. | |
408 | @end table | |
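As an illustration of how one of these constants might be consumed, the following sketch strips the leading character from a symbol name. The structure and function are hypothetical and model only the @samp{symbol_leading_char} field described above.

```c
#include <stddef.h>

/* Hypothetical cut-down target vector holding two of the constants.  */
struct toy_target
{
  const char *name;
  char symbol_leading_char;  /* 0 if the compiler adds no prefix.  */
};

/* Return the C source level name for the object file symbol SYMNAME.
   The result points into SYMNAME; nothing is allocated.  */
const char *
toy_strip_leading_char (const struct toy_target *targ, const char *symname)
{
  if (targ->symbol_leading_char != 0
      && symname[0] == targ->symbol_leading_char)
    return symname + 1;
  return symname;
}
```

For a target whose leading character is @samp{_}, the symbol @samp{_foo} maps back to @samp{foo}; for a target whose field is @samp{0}, the name is returned unchanged.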
409 | ||
410 | @node BFD target vector swap | |
411 | @subsection Swapping functions | |
412 | ||
d1d013c3 | 413 | Every target vector has function pointers used for swapping information |
252b5132 RH |
414 | in and out of the target representation. There are two sets of |
415 | functions: one for data information, and one for header information. | |
416 | Each set has three sizes: 64-bit, 32-bit, and 16-bit. Each size has | |
417 | three actual functions: put, get unsigned, and get signed. | |
418 | ||
419 | These 18 functions are used to convert data between the host and target | |
420 | representations. | |
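A minimal sketch of the 32-bit members of one such set follows, for both byte orders. All of the @samp{toy_} names are hypothetical; the real vector also has the 16-bit and 64-bit variants and a second copy of the set for header information.

```c
#include <stdint.h>

/* Big endian: most significant byte first.  */
uint32_t
toy_getb32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Little endian: least significant byte first.  */
uint32_t
toy_getl32 (const unsigned char *p)
{
  return ((uint32_t) p[3] << 24) | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[1] << 8) | (uint32_t) p[0];
}

/* Sign extend a 32-bit value without relying on conversion rules.  */
static int32_t
toy_sign_extend_32 (uint32_t x)
{
  return (int32_t) (((int64_t) x ^ 0x80000000) - 0x80000000);
}

int32_t
toy_getb_signed_32 (const unsigned char *p)
{
  return toy_sign_extend_32 (toy_getb32 (p));
}

int32_t
toy_getl_signed_32 (const unsigned char *p)
{
  return toy_sign_extend_32 (toy_getl32 (p));
}

void
toy_putb32 (uint32_t v, unsigned char *p)
{
  p[0] = (unsigned char) (v >> 24);
  p[1] = (unsigned char) (v >> 16);
  p[2] = (unsigned char) (v >> 8);
  p[3] = (unsigned char) v;
}

void
toy_putl32 (uint32_t v, unsigned char *p)
{
  p[0] = (unsigned char) v;
  p[1] = (unsigned char) (v >> 8);
  p[2] = (unsigned char) (v >> 16);
  p[3] = (unsigned char) (v >> 24);
}

/* The put, get unsigned, and get signed entries for one size.  */
struct toy_swap
{
  uint32_t (*get_32) (const unsigned char *);
  int32_t (*get_signed_32) (const unsigned char *);
  void (*put_32) (uint32_t, unsigned char *);
};

const struct toy_swap toy_big_swap =
  { toy_getb32, toy_getb_signed_32, toy_putb32 };
const struct toy_swap toy_little_swap =
  { toy_getl32, toy_getl_signed_32, toy_putl32 };
```

Code which reads a file calls through the structure, so the same reader works for either byte order without testing the host's own endianness.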
421 | ||
422 | @node BFD target vector format | |
423 | @subsection Format type dependent functions | |
424 | ||
425 | Every target vector has three arrays of function pointers which are | |
426 | indexed by the BFD format type. The BFD format types are as follows: | |
427 | ||
428 | @table @samp | |
429 | @item bfd_unknown | |
430 | Unknown format. Not used for anything useful. | |
431 | @item bfd_object | |
432 | Object file. | |
433 | @item bfd_archive | |
434 | Archive file. | |
435 | @item bfd_core | |
436 | Core file. | |
437 | @end table | |
438 | ||
439 | The three arrays of function pointers are as follows: | |
440 | ||
441 | @table @samp | |
442 | @item bfd_check_format | |
443 | Check whether the BFD is of a particular format (object file, archive | |
444 | file, or core file) corresponding to this target vector. This is called | |
445 | by the @samp{bfd_check_format} function when examining an existing BFD. | |
446 | If the BFD matches the desired format, this function will initialize any | |
447 | format specific information such as the @samp{tdata} field of the BFD. | |
448 | This function must be called before any other BFD target vector function | |
449 | on a file opened for reading. | |
450 | ||
451 | @item bfd_set_format | |
452 | Set the format of a BFD which was created for output. This is called by | |
453 | the @samp{bfd_set_format} function after creating the BFD with a | |
454 | function such as @samp{bfd_openw}. This function will initialize format | |
455 | specific information required to write out a file of | |
456 | the given format. This function must be called before any other BFD | |
457 | target vector function on a file opened for writing. | |
458 | ||
459 | @item bfd_write_contents | |
460 | Write out the contents of the BFD in the given format. This is called | |
461 | by the @samp{bfd_close} function for a BFD opened for writing. This really | |
462 | should not be an array selected by format type, as the | |
463 | @samp{bfd_set_format} function provides all the required information. | |
464 | In fact, BFD will fail if a different format is used when calling | |
465 | through the @samp{bfd_set_format} and the @samp{bfd_write_contents} | |
466 | arrays; fortunately, since @samp{bfd_close} gets it right, this is a | |
467 | difficult error to make. | |
468 | @end table | |
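The idea of an array of function pointers indexed by format type can be sketched as follows. Everything here is a hypothetical simplification: the @samp{toy_} names are invented, only the check array is modelled, and the magic number tests stand in for whatever a real backend would do to recognise a file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum toy_format
{
  toy_unknown = 0,
  toy_object,
  toy_archive,
  toy_core,
  toy_type_end   /* Number of format types; sizes the arrays.  */
};

struct toy_bfd;

struct toy_target
{
  /* One checker per format type, indexed by enum toy_format.  */
  bool (*check_format[toy_type_end]) (struct toy_bfd *);
};

struct toy_bfd
{
  const struct toy_target *xvec;
  const unsigned char *data;   /* File contents, already in memory.  */
  size_t size;
  enum toy_format format;
};

static bool
toy_reject (struct toy_bfd *abfd)
{
  (void) abfd;
  return false;
}

static bool
toy_object_p (struct toy_bfd *abfd)
{
  return abfd->size >= 4 && memcmp (abfd->data, "\177ELF", 4) == 0;
}

static bool
toy_archive_p (struct toy_bfd *abfd)
{
  return abfd->size >= 8 && memcmp (abfd->data, "!<arch>\n", 8) == 0;
}

const struct toy_target toy_vec =
{
  { toy_reject, toy_object_p, toy_archive_p, toy_reject }
};

/* Generic entry point: select the checker by format and record the
   format on success.  */
bool
toy_check_format (struct toy_bfd *abfd, enum toy_format format)
{
  if (format <= toy_unknown || format >= toy_type_end)
    return false;
  if (!abfd->xvec->check_format[format] (abfd))
    return false;
  abfd->format = format;
  return true;
}
```

The entry for the unknown format is present only so that the array can be indexed directly by the format value.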
469 | ||
470 | @node BFD_JUMP_TABLE macros | |
471 | @subsection @samp{BFD_JUMP_TABLE} macros | |
472 | @cindex @samp{BFD_JUMP_TABLE} | |
473 | ||
474 | Most target vectors are defined using @samp{BFD_JUMP_TABLE} macros. | |
475 | These macros take a single argument, which is a prefix applied to a set | |
476 | of functions. The macros are then used to initialize the fields in the | |
477 | target vector. | |
478 | ||
479 | For example, the @samp{BFD_JUMP_TABLE_RELOCS} macro defines three | |
480 | functions: @samp{_get_reloc_upper_bound}, @samp{_canonicalize_reloc}, | |
481 | and @samp{_bfd_reloc_type_lookup}. A reference like | |
482 | @samp{BFD_JUMP_TABLE_RELOCS (foo)} will expand into three functions | |
5398f678 | 483 | prefixed with @samp{foo}: @samp{foo_get_reloc_upper_bound}, etc. The |
252b5132 RH |
484 | @samp{BFD_JUMP_TABLE_RELOCS} macro will be placed such that those three |
485 | functions initialize the appropriate fields in the BFD target vector. | |
486 | ||
487 | This is done because it turns out that many different target vectors can | |
488 | share certain classes of functions. For example, archives are similar | |
489 | on most platforms, so most target vectors can use the same archive | |
490 | functions. Those target vectors all use @samp{BFD_JUMP_TABLE_ARCHIVE} | |
491 | with the same argument, calling a set of functions which is defined in | |
492 | @file{archive.c}. | |
493 | ||
494 | Each of the @samp{BFD_JUMP_TABLE} macros is mentioned below along with | |
495 | the description of the function pointers which it defines. The function | |
496 | pointers will be described using the name without the prefix which the | |
497 | @samp{BFD_JUMP_TABLE} macro defines. This name is normally the same as | |
498 | the name of the field in the target vector structure. Any differences | |
499 | will be noted. | |
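The prefixing mechanism is ordinary preprocessor token pasting. The sketch below uses a hypothetical two-entry @samp{TOY_JUMP_TABLE_RELOCS} macro and an invented @samp{toy_target} structure to show the shape of the technique; it is not the definition found in the BFD headers.

```c
struct toy_bfd;

struct toy_target
{
  const char *name;
  long (*get_reloc_upper_bound) (struct toy_bfd *);
  long (*canonicalize_reloc) (struct toy_bfd *);
};

/* Expands to a comma separated list of NAME-prefixed functions, in the
   same order as the fields they initialize.  */
#define TOY_JUMP_TABLE_RELOCS(NAME) \
  NAME##_get_reloc_upper_bound, \
  NAME##_canonicalize_reloc

long
foo_get_reloc_upper_bound (struct toy_bfd *abfd)
{
  (void) abfd;
  return 16;
}

long
foo_canonicalize_reloc (struct toy_bfd *abfd)
{
  (void) abfd;
  return 0;
}

/* The macro is placed where the matching fields sit in the structure.  */
const struct toy_target foo_vec =
{
  "toy-foo",
  TOY_JUMP_TABLE_RELOCS (foo)
};
```

A second target which can share these routines would write @samp{TOY_JUMP_TABLE_RELOCS (foo)} in its own vector rather than defining new functions.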
500 | ||
501 | @node BFD target vector generic | |
502 | @subsection Generic functions | |
503 | @cindex @samp{BFD_JUMP_TABLE_GENERIC} | |
504 | ||
505 | The @samp{BFD_JUMP_TABLE_GENERIC} macro is used for some catch-all | |
506 | functions which don't easily fit into other categories. | |
507 | ||
508 | @table @samp | |
509 | @item _close_and_cleanup | |
510 | Free any target specific information associated with the BFD. This is | |
511 | called when any BFD is closed (the @samp{bfd_write_contents} function | |
512 | mentioned earlier is only called for a BFD opened for writing). Most | |
513 | targets use @samp{bfd_alloc} to allocate all target specific | |
514 | information, and therefore don't have to do anything in this function. | |
515 | This function pointer is typically set to | |
516 | @samp{_bfd_generic_close_and_cleanup}, which simply returns true. | |
517 | ||
518 | @item _bfd_free_cached_info | |
519 | Free any cached information associated with the BFD which can be | |
520 | recreated later if necessary. This is used to reduce the memory | |
521 | consumption required by programs using BFD. This is normally called via | |
522 | the @samp{bfd_free_cached_info} macro. It is used by the default | |
523 | archive routines when computing the archive map. Most targets do not | |
524 | do anything special for this entry point, and just set it to | |
525 | @samp{_bfd_generic_free_cached_info}, which simply returns true. | |
526 | ||
527 | @item _new_section_hook | |
528 | This is called from @samp{bfd_make_section_anyway} whenever a new | |
529 | section is created. Most targets use it to initialize section specific | |
530 | information. This function is called whether or not the section | |
531 | corresponds to an actual section in an actual BFD. | |
532 | ||
533 | @item _get_section_contents | |
534 | Get the contents of a section. This is called from | |
535 | @samp{bfd_get_section_contents}. Most targets set this to | |
536 | @samp{_bfd_generic_get_section_contents}, which does a @samp{bfd_seek} | |
17c1c87f | 537 | based on the section's @samp{filepos} field and a @samp{bfd_bread}. The |
252b5132 RH |
538 | corresponding field in the target vector is named |
539 | @samp{_bfd_get_section_contents}. | |
540 | ||
541 | @item _get_section_contents_in_window | |
542 | Set a @samp{bfd_window} to hold the contents of a section. This is | |
543 | called from @samp{bfd_get_section_contents_in_window}. The | |
544 | @samp{bfd_window} idea never really caught on, and I don't think this is | |
545 | ever called. Pretty much all targets implement this as | |
546 | @samp{bfd_generic_get_section_contents_in_window}, which uses | |
547 | @samp{bfd_get_section_contents} to do the right thing. The | |
548 | corresponding field in the target vector is named | |
549 | @samp{_bfd_get_section_contents_in_window}. | |
550 | @end table | |
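The seek and read pattern described for the generic section contents routine can be approximated with standard I/O. This sketch assumes a plain @samp{FILE *} in place of a BFD and a hypothetical @samp{toy_section} structure; the bounds check is an addition for safety and is not taken from the description above.

```c
#include <stdbool.h>
#include <stdio.h>

struct toy_section
{
  long filepos;   /* Offset of the section contents within the file.  */
  size_t size;    /* Size of the section contents in bytes.  */
};

/* Copy COUNT bytes, starting OFFSET bytes into SEC, into LOCATION.
   Return false if the request is out of range or the read fails.  */
bool
toy_generic_get_section_contents (FILE *file, const struct toy_section *sec,
                                  void *location, long offset, size_t count)
{
  if (count == 0)
    return true;
  if (offset < 0
      || (size_t) offset > sec->size
      || count > sec->size - (size_t) offset)
    return false;
  /* Seek relative to the section's position in the file, then read.  */
  if (fseek (file, sec->filepos + offset, SEEK_SET) != 0)
    return false;
  return fread (location, 1, count, file) == count;
}
```

A backend whose section contents are not stored contiguously in the file could not use a routine of this shape and would supply its own.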
551 | ||
552 | @node BFD target vector copy | |
553 | @subsection Copy functions | |
554 | @cindex @samp{BFD_JUMP_TABLE_COPY} | |
555 | ||
556 | The @samp{BFD_JUMP_TABLE_COPY} macro is used for functions which are | |
557 | called when copying BFDs, and for a couple of functions which deal with | |
558 | internal BFD information. | |
559 | ||
560 | @table @samp | |
561 | @item _bfd_copy_private_bfd_data | |
562 | This is called when copying a BFD, via @samp{bfd_copy_private_bfd_data}. | |
563 | If the input and output BFDs have the same format, this will copy any | |
564 | private information over. This is called after all the section contents | |
565 | have been written to the output file. Only a few targets do anything in | |
566 | this function. | |
567 | ||
568 | @item _bfd_merge_private_bfd_data | |
569 | This is called when linking, via @samp{bfd_merge_private_bfd_data}. It | |
570 | gives the backend linker code a chance to set any special flags in the | |
571 | output file based on the contents of the input file. Only a few targets | |
572 | do anything in this function. | |
573 | ||
574 | @item _bfd_copy_private_section_data | |
575 | This is similar to @samp{_bfd_copy_private_bfd_data}, but it is called | |
576 | for each section, via @samp{bfd_copy_private_section_data}. This | |
577 | function is called before any section contents have been written. Only | |
578 | a few targets do anything in this function. | |
579 | ||
580 | @item _bfd_copy_private_symbol_data | |
581 | This is called via @samp{bfd_copy_private_symbol_data}, but I don't | |
582 | think anything actually calls it. If it were defined, it could be used | |
583 | to copy private symbol data from one BFD to another. However, most BFDs | |
584 | store extra symbol information by allocating space which is larger than | |
585 | the @samp{asymbol} structure and storing private information in the | |
586 | extra space. Since @samp{objcopy} and other programs copy symbol | |
587 | information by copying pointers to @samp{asymbol} structures, the | |
588 | private symbol information is automatically copied as well. Most | |
589 | targets do not do anything in this function. | |
590 | ||
591 | @item _bfd_set_private_flags | |
592 | This is called via @samp{bfd_set_private_flags}. It is basically a hook | |
593 | for the assembler to set magic information. For example, the PowerPC | |
594 | ELF assembler uses it to set flags which appear in the e_flags field of | |
595 | the ELF header. Most targets do not do anything in this function. | |
596 | ||
597 | @item _bfd_print_private_bfd_data | |
598 | This is called by @samp{objdump} when the @samp{-p} option is used. It | |
599 | is called via @samp{bfd_print_private_bfd_data}. It prints any interesting | |
600 | information about the BFD which can not be otherwise represented by BFD | |
601 | and thus can not be printed by @samp{objdump}. Most targets do not do | |
602 | anything in this function. | |
603 | @end table | |
604 | ||
605 | @node BFD target vector core | |
606 | @subsection Core file support functions | |
607 | @cindex @samp{BFD_JUMP_TABLE_CORE} | |
608 | ||
609 | The @samp{BFD_JUMP_TABLE_CORE} macro is used for functions which deal | |
610 | with core files. Obviously, these functions only do something | |
611 | interesting for targets which have core file support. | |
612 | ||
613 | @table @samp | |
614 | @item _core_file_failing_command | |
615 | Given a core file, this returns the command which was run to produce the | |
616 | core file. | |
617 | ||
618 | @item _core_file_failing_signal | |
619 | Given a core file, this returns the signal number which produced the | |
620 | core file. | |
621 | ||
622 | @item _core_file_matches_executable_p | |
623 | Given a core file and a BFD for an executable, this returns whether the | |
624 | core file was generated by the executable. | |
625 | @end table | |
626 | ||
627 | @node BFD target vector archive | |
628 | @subsection Archive functions | |
629 | @cindex @samp{BFD_JUMP_TABLE_ARCHIVE} | |
630 | ||
631 | The @samp{BFD_JUMP_TABLE_ARCHIVE} macro is used for functions which deal | |
632 | with archive files. Most targets use COFF style archive files | |
633 | (including ELF targets), and these use @samp{_bfd_archive_coff} as the | |
634 | argument to @samp{BFD_JUMP_TABLE_ARCHIVE}. Some targets use BSD/a.out | |
635 | style archives, and these use @samp{_bfd_archive_bsd}. (The main | |
636 | difference between BSD and COFF archives is the format of the archive | |
637 | symbol table.)  Targets with no archive support use | |
638 | @samp{_bfd_noarchive}. Finally, a few targets have unusual archive | |
639 | handling. | |
640 | ||
641 | @table @samp | |
642 | @item _slurp_armap | |
643 | Read in the archive symbol table, storing it in private BFD data. This | |
644 | is normally called from the archive @samp{check_format} routine. The | |
645 | corresponding field in the target vector is named | |
646 | @samp{_bfd_slurp_armap}. | |
647 | ||
648 | @item _slurp_extended_name_table | |
649 | Read in the extended name table from the archive, if there is one, | |
650 | storing it in private BFD data. This is normally called from the | |
651 | archive @samp{check_format} routine. The corresponding field in the | |
652 | target vector is named @samp{_bfd_slurp_extended_name_table}. | |
653 | ||
654 | @item construct_extended_name_table | |
655 | Build and return an extended name table if one is needed to write out | |
656 | the archive. This also adjusts the archive headers to refer to the | |
657 | extended name table appropriately. This is normally called from the | |
658 | archive @samp{write_contents} routine. The corresponding field in the | |
659 | target vector is named @samp{_bfd_construct_extended_name_table}. | |
660 | ||
661 | @item _truncate_arname | |
662 | This copies a file name into an archive header, truncating it as | |
663 | required. It is normally called from the archive @samp{write_contents} | |
664 | routine. This function is more interesting in targets which do not | |
665 | support extended name tables, but I think the GNU @samp{ar} program | |
666 | always uses extended name tables anyhow. The corresponding field in the | |
667 | target vector is named @samp{_bfd_truncate_arname}. | |
668 | ||
669 | @item _write_armap | |
17c1c87f | 670 | Write out the archive symbol table using calls to @samp{bfd_bwrite}. |
252b5132 RH |
671 | This is normally called from the archive @samp{write_contents} routine. |
672 | The corresponding field in the target vector is named @samp{write_armap} | |
673 | (no leading underscore). | |
674 | ||
675 | @item _read_ar_hdr | |
676 | Read and parse an archive header. This handles expanding the archive | |
677 | header name into the real file name using the extended name table. This | |
678 | is called by routines which read the archive symbol table or the archive | |
679 | itself. The corresponding field in the target vector is named | |
680 | @samp{_bfd_read_ar_hdr_fn}. | |
681 | ||
682 | @item _openr_next_archived_file | |
683 | Given an archive and a BFD representing a file stored within the | |
684 | archive, return a BFD for the next file in the archive. This is called | |
685 | via @samp{bfd_openr_next_archived_file}. The corresponding field in the | |
686 | target vector is named @samp{openr_next_archived_file} (no leading | |
687 | underscore). | |
688 | ||
689 | @item _get_elt_at_index | |
690 | Given an archive and an index, return a BFD for the file in the archive | |
691 | corresponding to that entry in the archive symbol table. This is called | |
692 | via @samp{bfd_get_elt_at_index}. The corresponding field in the target | |
693 | vector is named @samp{_bfd_get_elt_at_index}. | |
694 | ||
695 | @item _generic_stat_arch_elt | |
696 | Do a stat on an element of an archive, returning information read from | |
697 | the archive header (modification time, uid, gid, file mode, size). This | |
698 | is called via @samp{bfd_stat_arch_elt}. The corresponding field in the | |
699 | target vector is named @samp{_bfd_stat_arch_elt}. | |
700 | ||
701 | @item _update_armap_timestamp | |
702 | After the entire contents of an archive have been written out, update | |
703 | the timestamp of the archive symbol table to be newer than that of the | |
704 | file. This is required for a.out style archives. This is normally | |
705 | called by the archive @samp{write_contents} routine. The corresponding | |
706 | field in the target vector is named @samp{_bfd_update_armap_timestamp}. | |
707 | @end table | |
708 | ||
709 | @node BFD target vector symbols | |
710 | @subsection Symbol table functions | |
711 | @cindex @samp{BFD_JUMP_TABLE_SYMBOLS} | |
712 | ||
713 | The @samp{BFD_JUMP_TABLE_SYMBOLS} macro is used for functions which deal | |
714 | with symbols. | |
715 | ||
716 | @table @samp | |
717 | @item _get_symtab_upper_bound | |
718 | Return a sensible upper bound on the amount of memory which will be | |
719 | required to read the symbol table. In practice most targets return the | |
720 | amount of memory required to hold @samp{asymbol} pointers for all the | |
721 | symbols plus a trailing @samp{NULL} entry, and store the actual symbol | |
722 | information in BFD private data. This is called via | |
723 | @samp{bfd_get_symtab_upper_bound}. The corresponding field in the | |
724 | target vector is named @samp{_bfd_get_symtab_upper_bound}. | |
725 | ||
6cee3f79 | 726 | @item _canonicalize_symtab |
252b5132 RH |
727 | Read in the symbol table. This is called via |
728 | @samp{bfd_canonicalize_symtab}. The corresponding field in the target | |
729 | vector is named @samp{_bfd_canonicalize_symtab}. | |
730 | ||
731 | @item _make_empty_symbol | |
732 | Create an empty symbol for the BFD. This is needed because most targets | |
733 | store extra information with each symbol by allocating a structure | |
734 | larger than an @samp{asymbol} and storing the extra information at the | |
735 | end. This function will allocate the right amount of memory, and return | |
736 | what looks like a pointer to an empty @samp{asymbol}. This is called | |
737 | via @samp{bfd_make_empty_symbol}. The corresponding field in the target | |
738 | vector is named @samp{_bfd_make_empty_symbol}. | |
739 | ||
740 | @item _print_symbol | |
741 | Print information about the symbol. This is called via | |
742 | @samp{bfd_print_symbol}. One of the arguments indicates what sort of | |
743 | information should be printed: | |
744 | ||
745 | @table @samp | |
746 | @item bfd_print_symbol_name | |
747 | Just print the symbol name. | |
748 | @item bfd_print_symbol_more | |
749 | Print the symbol name and some interesting flags. I don't think | |
750 | anything actually uses this. | |
751 | @item bfd_print_symbol_all | |
752 | Print all information about the symbol. This is used by @samp{objdump} | |
753 | when run with the @samp{-t} option. | |
754 | @end table | |
755 | The corresponding field in the target vector is named | |
756 | @samp{_bfd_print_symbol}. | |
757 | ||
758 | @item _get_symbol_info | |
759 | Return a standard set of information about the symbol. This is called | |
760 | via @samp{bfd_symbol_info}. The corresponding field in the target | |
761 | vector is named @samp{_bfd_get_symbol_info}. | |
762 | ||
763 | @item _bfd_is_local_label_name | |
764 | Return whether the given string would normally represent the name of a | |
765 | local label. This is called via @samp{bfd_is_local_label} and | |
766 | @samp{bfd_is_local_label_name}. Local labels are normally discarded by | |
767 | the assembler. In the linker, this defines the difference between the | |
768 | @samp{-x} and @samp{-X} options. | |
769 | ||
770 | @item _get_lineno | |
771 | Return line number information for a symbol. This is only meaningful | |
772 | for a COFF target. This is called when writing out COFF line numbers. | |
773 | ||
774 | @item _find_nearest_line | |
775 | Given an address within a section, use the debugging information to find | |
776 | the matching file name, function name, and line number, if any. This is | |
777 | called via @samp{bfd_find_nearest_line}. The corresponding field in the | |
778 | target vector is named @samp{_bfd_find_nearest_line}. | |
779 | ||
780 | @item _bfd_make_debug_symbol | |
781 | Make a debugging symbol. This is only meaningful for a COFF target, | |
782 | where it simply returns a symbol which will be placed in the | |
783 | @samp{N_DEBUG} section when it is written out. This is called via | |
784 | @samp{bfd_make_debug_symbol}. | |
785 | ||
786 | @item _read_minisymbols | |
787 | Minisymbols are used to reduce the memory requirements of programs like | |
788 | @samp{nm}. A minisymbol is a cookie pointing to internal symbol | |
789 | information which the caller can use to extract complete symbol | |
790 | information. This permits BFD to not convert all the symbols into | |
791 | generic form, but to instead convert them one at a time. This is called | |
792 | via @samp{bfd_read_minisymbols}. Most targets do not implement this, | |
793 | and just use generic support which is based on using standard | |
794 | @samp{asymbol} structures. | |
795 | ||
796 | @item _minisymbol_to_symbol | |
797 | Convert a minisymbol to a standard @samp{asymbol}. This is called via | |
798 | @samp{bfd_minisymbol_to_symbol}. | |
799 | @end table | |
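The upper bound convention described under @samp{_get_symtab_upper_bound} above can be sketched without BFD itself. In this sketch the type @samp{demo_symbol} stands in for @samp{asymbol}, the static array stands in for BFD private data, and all of the @samp{demo_} functions are hypothetical. I am also assuming that the canonicalize step returns the number of symbols it stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for asymbol.  */
typedef struct demo_symbol
{
  const char *name;
} demo_symbol;

/* Stand-in for symbol information kept in backend private data.  */
static demo_symbol demo_private_syms[] =
{
  { "main" }, { "_start" }, { "data" }
};

#define DEMO_SYMCOUNT \
  (sizeof demo_private_syms / sizeof demo_private_syms[0])

/* Bytes needed for one pointer per symbol plus a trailing NULL.  */
static long
demo_get_symtab_upper_bound (void)
{
  return (long) ((DEMO_SYMCOUNT + 1) * sizeof (demo_symbol *));
}

/* Fill the caller's array with pointers into the private data,
   terminate it with NULL, and return the number of symbols.  */
static long
demo_canonicalize_symtab (demo_symbol **out)
{
  size_t i;

  for (i = 0; i < DEMO_SYMCOUNT; i++)
    out[i] = &demo_private_syms[i];
  out[i] = NULL;
  return (long) DEMO_SYMCOUNT;
}

/* Caller side: ask for the bound, allocate, then canonicalize.
   Returns the symbol count, or -1 on failure.  */
static long
demo_count_symbols (void)
{
  long bytes = demo_get_symtab_upper_bound ();
  demo_symbol **tab = malloc ((size_t) bytes);
  long n;

  if (tab == NULL)
    return -1;
  n = demo_canonicalize_symtab (tab);
  if (tab[n] != NULL)
    n = -1;
  free (tab);
  return n;
}
```

The point of the two step protocol is that the caller owns the pointer array while the backend owns the symbol data itself, so the bound only has to cover the pointers.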
800 | ||
801 | @node BFD target vector relocs | |
802 | @subsection Relocation support | |
803 | @cindex @samp{BFD_JUMP_TABLE_RELOCS} | |
804 | ||
805 | The @samp{BFD_JUMP_TABLE_RELOCS} macro is used for functions which deal | |
806 | with relocations. | |
807 | ||
808 | @table @samp | |
809 | @item _get_reloc_upper_bound | |
810 | Return a sensible upper bound on the amount of memory which will be | |
811 | required to read the relocations for a section. In practice most | |
812 | targets return the amount of memory required to hold @samp{arelent} | |
813 | pointers for all the relocations plus a trailing @samp{NULL} entry, and | |
814 | store the actual relocation information in BFD private data. This is | |
815 | called via @samp{bfd_get_reloc_upper_bound}. | |
816 | ||
817 | @item _canonicalize_reloc | |
818 | Return the relocation information for a section. This is called via | |
819 | @samp{bfd_canonicalize_reloc}. The corresponding field in the target | |
820 | vector is named @samp{_bfd_canonicalize_reloc}. | |
821 | ||
822 | @item _bfd_reloc_type_lookup | |
823 | Given a relocation code, return the corresponding howto structure | |
824 | (@pxref{BFD relocation codes}). This is called via | |
825 | @samp{bfd_reloc_type_lookup}. The corresponding field in the target | |
826 | vector is named @samp{reloc_type_lookup}. | |
827 | @end table | |
828 | ||
829 | @node BFD target vector write | |
830 | @subsection Output functions | |
831 | @cindex @samp{BFD_JUMP_TABLE_WRITE} | |
832 | ||
833 | The @samp{BFD_JUMP_TABLE_WRITE} macro is used for functions which deal | |
834 | with writing out a BFD. | |
835 | ||
836 | @table @samp | |
837 | @item _set_arch_mach | |
838 | Set the architecture and machine number for a BFD. This is called via | |
839 | @samp{bfd_set_arch_mach}. Most targets implement this by calling | |
840 | @samp{bfd_default_set_arch_mach}. The corresponding field in the target | |
841 | vector is named @samp{_bfd_set_arch_mach}. | |
842 | ||
843 | @item _set_section_contents | |
844 | Write out the contents of a section. This is called via | |
845 | @samp{bfd_set_section_contents}. The corresponding field in the target | |
846 | vector is named @samp{_bfd_set_section_contents}. | |
847 | @end table | |
848 | ||
849 | @node BFD target vector link | |
850 | @subsection Linker functions | |
851 | @cindex @samp{BFD_JUMP_TABLE_LINK} | |
852 | ||
853 | The @samp{BFD_JUMP_TABLE_LINK} macro is used for functions called by the | |
854 | linker. | |
855 | ||
856 | @table @samp | |
857 | @item _sizeof_headers | |
858 | Return the size of the header information required for a BFD. This is | |
859 | used to implement the @samp{SIZEOF_HEADERS} linker script function. It | |
860 | is normally used to align the first section at an efficient position on | |
861 | the page. This is called via @samp{bfd_sizeof_headers}. The | |
862 | corresponding field in the target vector is named | |
863 | @samp{_bfd_sizeof_headers}. | |
864 | ||
865 | @item _bfd_get_relocated_section_contents | |
866 | Read the contents of a section and apply the relocation information. | |
1049f94e | 867 | This handles both a final link and a relocatable link; in the latter |
252b5132 RH |
868 | case, it adjusts the relocation information as well. This is called via |
869 | @samp{bfd_get_relocated_section_contents}. Most targets implement it by | |
870 | calling @samp{bfd_generic_get_relocated_section_contents}. | |
871 | ||
872 | @item _bfd_relax_section | |
873 | Try to use relaxation to shrink the size of a section. This is called | |
874 | by the linker when the @samp{-relax} option is used. This is called via | |
875 | @samp{bfd_relax_section}. Most targets do not support any sort of | |
876 | relaxation. | |
877 | ||
878 | @item _bfd_link_hash_table_create | |
879 | Create the symbol hash table to use for the linker. This linker hook | |
880 | permits the backend to control the size and information of the elements | |
881 | in the linker symbol hash table. This is called via | |
882 | @samp{bfd_link_hash_table_create}. | |
883 | ||
884 | @item _bfd_link_add_symbols | |
885 | Given an object file or an archive, add all symbols into the linker | |
886 | symbol hash table. Use callbacks to the linker to include archive | |
887 | elements in the link. This is called via @samp{bfd_link_add_symbols}. | |
888 | ||
889 | @item _bfd_final_link | |
890 | Finish the linking process. The linker calls this hook after all of the | |
891 | input files have been read, when it is ready to finish the link and | |
892 | generate the output file. This is called via @samp{bfd_final_link}. | |
893 | ||
894 | @item _bfd_link_split_section | |
895 | I don't know what this is for. Nothing seems to call it. The only | |
896 | non-trivial definition is in @file{som.c}. | |
897 | @end table | |
898 | ||
899 | @node BFD target vector dynamic | |
900 | @subsection Dynamic linking information functions | |
901 | @cindex @samp{BFD_JUMP_TABLE_DYNAMIC} | |
902 | ||
903 | The @samp{BFD_JUMP_TABLE_DYNAMIC} macro is used for functions which read | |
904 | dynamic linking information. | |
905 | ||
906 | @table @samp | |
907 | @item _get_dynamic_symtab_upper_bound | |
908 | Return a sensible upper bound on the amount of memory which will be | |
909 | required to read the dynamic symbol table. In practice most targets | |
910 | return the amount of memory required to hold @samp{asymbol} pointers for | |
911 | all the symbols plus a trailing @samp{NULL} entry, and store the actual | |
912 | symbol information in BFD private data. This is called via | |
913 | @samp{bfd_get_dynamic_symtab_upper_bound}. The corresponding field in | |
914 | the target vector is named @samp{_bfd_get_dynamic_symtab_upper_bound}. | |
915 | ||
916 | @item _canonicalize_dynamic_symtab | |
917 | Read the dynamic symbol table. This is called via | |
918 | @samp{bfd_canonicalize_dynamic_symtab}. The corresponding field in the | |
919 | target vector is named @samp{_bfd_canonicalize_dynamic_symtab}. | |
920 | ||
921 | @item _get_dynamic_reloc_upper_bound | |
922 | Return a sensible upper bound on the amount of memory which will be | |
923 | required to read the dynamic relocations. In practice most targets | |
924 | return the amount of memory required to hold @samp{arelent} pointers for | |
925 | all the relocations plus a trailing @samp{NULL} entry, and store the | |
926 | actual relocation information in BFD private data. This is called via | |
927 | @samp{bfd_get_dynamic_reloc_upper_bound}. The corresponding field in | |
928 | the target vector is named @samp{_bfd_get_dynamic_reloc_upper_bound}. | |
929 | ||
930 | @item _canonicalize_dynamic_reloc | |
931 | Read the dynamic relocations. This is called via | |
932 | @samp{bfd_canonicalize_dynamic_reloc}. The corresponding field in the | |
933 | target vector is named @samp{_bfd_canonicalize_dynamic_reloc}. | |
934 | @end table | |
935 | ||
936 | @node BFD generated files | |
937 | @section BFD generated files | |
938 | @cindex generated files in bfd | |
939 | @cindex bfd generated files | |
940 | ||
941 | BFD contains several automatically generated files. This section | |
942 | describes them. Some files are created at configure time, when you | |
943 | configure BFD. Some files are created at make time, when you build | |
afdaa25f | 944 | BFD. Some files are automatically rebuilt at make time, but only if |
252b5132 RH |
945 | you configure with the @samp{--enable-maintainer-mode} option. Some |
946 | files live in the object directory---the directory from which you run | |
947 | configure---and some live in the source directory. All files that live | |
948 | in the source directory are checked into the CVS repository. | |
949 | ||
950 | @table @file | |
951 | @item bfd.h | |
952 | @cindex @file{bfd.h} | |
953 | @cindex @file{bfd-in3.h} | |
954 | Lives in the object directory. Created at make time from | |
955 | @file{bfd-in2.h} via @file{bfd-in3.h}. @file{bfd-in3.h} is created at | |
956 | configure time from @file{bfd-in2.h}. There are automatic dependencies | |
957 | to rebuild @file{bfd-in3.h} and hence @file{bfd.h} if @file{bfd-in2.h} | |
958 | changes, so you can normally ignore @file{bfd-in3.h}, and just think | |
959 | about @file{bfd-in2.h} and @file{bfd.h}. | |
960 | ||
961 | @file{bfd.h} is built by replacing a few strings in @file{bfd-in2.h}. | |
962 | To see them, search for @samp{@@} in @file{bfd-in2.h}. They mainly | |
963 | control whether BFD is built for a 32 bit target or a 64 bit target. | |
964 | ||
965 | @item bfd-in2.h | |
966 | @cindex @file{bfd-in2.h} | |
967 | Lives in the source directory. Created from @file{bfd-in.h} and several | |
968 | other BFD source files. If you configure with the | |
969 | @samp{--enable-maintainer-mode} option, @file{bfd-in2.h} is rebuilt | |
970 | automatically when a source file changes. | |
971 | ||
972 | @item elf32-target.h | |
973 | @itemx elf64-target.h | |
974 | @cindex @file{elf32-target.h} | |
975 | @cindex @file{elf64-target.h} | |
976 | Live in the object directory. Created from @file{elfxx-target.h}. | |
977 | These files are versions of @file{elfxx-target.h} customized for either | |
978 | a 32 bit ELF target or a 64 bit ELF target. | |
979 | ||
980 | @item libbfd.h | |
981 | @cindex @file{libbfd.h} | |
982 | Lives in the source directory. Created from @file{libbfd-in.h} and | |
983 | several other BFD source files. If you configure with the | |
984 | @samp{--enable-maintainer-mode} option, @file{libbfd.h} is rebuilt | |
985 | automatically when a source file changes. | |
986 | ||
987 | @item libcoff.h | |
988 | @cindex @file{libcoff.h} | |
989 | Lives in the source directory. Created from @file{libcoff-in.h} and | |
990 | @file{coffcode.h}. If you configure with the | |
991 | @samp{--enable-maintainer-mode} option, @file{libcoff.h} is rebuilt | |
992 | automatically when a source file changes. | |
993 | ||
994 | @item targmatch.h | |
995 | @cindex @file{targmatch.h} | |
996 | Lives in the object directory. Created at make time from | |
997 | @file{config.bfd}. This file is used to map configuration triplets into | |
998 | BFD target vector variable names at run time. | |
999 | @end table | |
1000 | ||
1001 | @node BFD multiple compilations | |
1002 | @section Files compiled multiple times in BFD | |
1003 | Several files in BFD are compiled multiple times. By this I mean that | |
1004 | there are header files which contain function definitions. These header | |
1005 | files are included by other files, and thus the functions are compiled | |
1006 | once per file which includes them. | |
1007 | ||
1008 | Preprocessor macros are used to control the compilation, so that each | |
1009 | time the files are compiled the resulting functions are slightly | |
1010 | different. Naturally, if they weren't different, there would be no | |
1011 | reason to compile them multiple times. | |
1012 | ||
1013 | This is not a particularly good programming technique, and future BFD | |
1014 | work should avoid it. | |
1015 | ||
1016 | @itemize @bullet | |
1017 | @item | |
1018 | Since this technique is rarely used, even experienced C programmers find | |
1019 | it confusing. | |
1020 | ||
1021 | @item | |
1022 | It is difficult to debug programs which use BFD, since there is no way | |
1023 | to describe which version of a particular function you are looking at. | |
1024 | ||
1025 | @item | |
1026 | Programs which use BFD wind up incorporating two or more slightly | |
1027 | different versions of the same function, which wastes space in the | |
1028 | executable. | |
1029 | ||
1030 | @item | |
1031 | This technique is never required nor is it especially efficient. It is | |
1032 | always possible to use statically initialized structures holding | |
1033 | function pointers and magic constants instead. | |
1034 | @end itemize | |
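The alternative mentioned in the last item, a statically initialized structure holding function pointers and magic constants, might look roughly like the following. This is only a sketch under my own assumptions: @samp{demo_backend}, its fields, and the two variants are invented for illustration and do not correspond to any existing BFD structure.

```c
#include <assert.h>

/* Hypothetical per-target description.  The constant plays the role
   that a macro such as N_TXTADDR plays in the header based scheme.  */
struct demo_backend
{
  unsigned long text_addr;
  unsigned long (*page_align) (unsigned long);
};

static unsigned long
align_4k (unsigned long v)
{
  return (v + 0xfffUL) & ~0xfffUL;
}

static unsigned long
align_8k (unsigned long v)
{
  return (v + 0x1fffUL) & ~0x1fffUL;
}

/* Two targets which differ only in data, not in compiled code.  */
static const struct demo_backend demo_small = { 0x1000UL, align_4k };
static const struct demo_backend demo_large = { 0x2000UL, align_8k };

/* One shared function, compiled once, parameterized at run time.  */
static unsigned long
demo_text_end (const struct demo_backend *be, unsigned long size)
{
  return be->page_align (be->text_addr + size);
}
```

There is exactly one copy of @samp{demo_text_end} in the executable, and a debugger can name it unambiguously, which addresses the second and third objections above.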
1035 | ||
1036 | The following is a list of the files which are compiled multiple times. | |
1037 | ||
1038 | @table @file | |
1039 | @item aout-target.h | |
1040 | @cindex @file{aout-target.h} | |
1041 | Describes a few functions and the target vector for a.out targets. This | |
1042 | is used by individual a.out targets with different definitions of | |
1043 | @samp{N_TXTADDR} and similar a.out macros. | |
1044 | ||
1045 | @item aoutf1.h | |
1046 | @cindex @file{aoutf1.h} | |
1047 | Implements standard SunOS a.out files. In principle it supports 64 bit | |
1048 | a.out targets based on the preprocessor macro @samp{ARCH_SIZE}, but | |
1049 | since all known a.out targets are 32 bits, this code may or may not | |
1050 | work. This file is only included by a few other files, and it is | |
1051 | difficult to justify its existence. | |
1052 | ||
1053 | @item aoutx.h | |
1054 | @cindex @file{aoutx.h} | |
1055 | Implements basic a.out support routines. This file can be compiled for | |
1056 | either 32 or 64 bit support. Since all known a.out targets are 32 bits, | |
1057 | the 64 bit support may or may not work. I believe the original | |
1058 | intention was that this file would only be included by @samp{aout32.c} | |
1059 | and @samp{aout64.c}, and that other a.out targets would simply refer to | |
1060 | the functions it defined. Unfortunately, some other a.out targets | |
1061 | started including it directly, leading to a somewhat confused state of | |
1062 | affairs. | |
1063 | ||
1064 | @item coffcode.h | |
1065 | @cindex @file{coffcode.h} | |
1066 | Implements basic COFF support routines. This file is included by every | |
1067 | COFF target. It implements code which handles COFF magic numbers as | |
1068 | well as various hook functions called by the generic COFF functions in | |
1069 | @file{coffgen.c}. This file is controlled by a number of different | |
1070 | macros, and more are added regularly. | |
1071 | ||
1072 | @item coffswap.h | |
1073 | @cindex @file{coffswap.h} | |
1074 | Implements COFF swapping routines. This file is included by | |
1075 | @file{coffcode.h}, and thus by every COFF target. It implements the | |
1076 | routines which swap COFF structures between internal and external | |
1077 | format. The main control for this file is the external structure | |
1078 | definitions in the files in the @file{include/coff} directory. A COFF | |
1079 | target file will include one of those files before including | |
1080 | @file{coffcode.h} and thus @file{coffswap.h}. There are a few other | |
1081 | macros which affect @file{coffswap.h} as well, mostly describing whether | |
1082 | certain fields are present in the external structures. | |
1083 | ||
1084 | @item ecoffswap.h | |
1085 | @cindex @file{ecoffswap.h} | |
1086 | Implements ECOFF swapping routines. This is like @file{coffswap.h}, but | |
1087 | for ECOFF. It is included by the ECOFF target files (of which there are | |
1088 | only two). The control is the preprocessor macro @samp{ECOFF_32} or | |
1089 | @samp{ECOFF_64}. | |
1090 | ||
1091 | @item elfcode.h | |
1092 | @cindex @file{elfcode.h} | |
1093 | Implements ELF functions that use external structure definitions. This | |
1094 | file is included by two other files: @file{elf32.c} and @file{elf64.c}. | |
1095 | It is controlled by the @samp{ARCH_SIZE} macro which is defined to be | |
1096 | @samp{32} or @samp{64} before including it. The @samp{NAME} macro is | |
1097 | used internally to give the functions different names for the two target | |
1098 | sizes. | |
1099 | ||
1100 | @item elfcore.h | |
1101 | @cindex @file{elfcore.h} | |
1102 | Like @file{elfcode.h}, but for functions that are specific to ELF core | |
1103 | files. This is included only by @file{elfcode.h}. | |
1104 | ||
252b5132 RH |
1105 | @item elfxx-target.h |
1106 | @cindex @file{elfxx-target.h} | |
1107 | This file is the source for the generated files @file{elf32-target.h} | |
1108 | and @file{elf64-target.h}, one of which is included by every ELF target. | |
1109 | It defines the ELF target vector. | |
1110 | ||
1111 | @item freebsd.h | |
1112 | @cindex @file{freebsd.h} | |
1113 | Presumably intended to be included by all FreeBSD targets, but in fact | |
1114 | there is only one such target, @samp{i386-freebsd}. This defines a | |
1115 | function used to set the right magic number for FreeBSD, as well as | |
1116 | various macros, and includes @file{aout-target.h}. | |
1117 | ||
1118 | @item netbsd.h | |
1119 | @cindex @file{netbsd.h} | |
1120 | Like @file{freebsd.h}, except that there are several files which include | |
1121 | it. | |
1122 | ||
1123 | @item nlm-target.h | |
1124 | @cindex @file{nlm-target.h} | |
1125 | Defines the target vector for a standard NLM target. | |
1126 | ||
1127 | @item nlmcode.h | |
1128 | @cindex @file{nlmcode.h} | |
1129 | Like @file{elfcode.h}, but for NLM targets. This is only included by | |
1130 | @file{nlm32.c} and @file{nlm64.c}, both of which define the macro | |
1131 | @samp{ARCH_SIZE} to an appropriate value. There are no 64 bit NLM | |
1132 | targets anyhow, so this is sort of useless. | |
1133 | ||
1134 | @item nlmswap.h | |
1135 | @cindex @file{nlmswap.h} | |
1136 | Like @file{coffswap.h}, but for NLM targets. This is included by each | |
1137 | NLM target, but I think it winds up compiling to the exact same code for | |
1138 | every target, and as such is fairly useless. | |
1139 | ||
1140 | @item peicode.h | |
1141 | @cindex @file{peicode.h} | |
1142 | Provides swapping routines and other hooks for PE targets. | |
1143 | @file{coffcode.h} will include this rather than @file{coffswap.h} for a | |
1144 | PE target. This defines PE specific versions of the COFF swapping | |
1145 | routines, and also defines some macros which control @file{coffcode.h} | |
1146 | itself. | |
1147 | @end table | |
1148 | ||
1149 | @node BFD relocation handling | |
1150 | @section BFD relocation handling | |
1151 | @cindex bfd relocation handling | |
1152 | @cindex relocations in bfd | |
1153 | ||
1154 | The handling of relocations is one of the more confusing aspects of BFD. | |
1155 | Relocation handling has been implemented in various different ways, all | |
1156 | somewhat incompatible, none perfect. | |
1157 | ||
1158 | @menu | |
1159 | * BFD relocation concepts:: BFD relocation concepts | |
1160 | * BFD relocation functions:: BFD relocation functions | |
1161 | * BFD relocation codes:: BFD relocation codes | |
1162 | * BFD relocation future:: BFD relocation future | |
1163 | @end menu | |
1164 | ||
1165 | @node BFD relocation concepts | |
1166 | @subsection BFD relocation concepts | |
1167 | ||
1168 | A relocation is an action which the linker must take when linking. It | |
1169 | describes a change to the contents of a section. The change is normally | |
1170 | based on the final value of one or more symbols. Relocations are | |
1171 | created by the assembler when it creates an object file. | |
1172 | ||
1173 | Most relocations are simple. A typical simple relocation is to set 32 | |
1174 | bits at a given offset in a section to the value of a symbol. This type | |
1175 | of relocation would be generated for code like @code{int *p = &i;} where | |
1176 | @samp{p} and @samp{i} are global variables. A relocation for the symbol | |
1177 | @samp{i} would be generated such that the linker would initialize the | |
1178 | area of memory which holds the value of @samp{p} to the value of the | |
1179 | symbol @samp{i}. | |
1180 | ||
1181 | Slightly more complex relocations may include an addend, which is a | |
1182 | constant to add to the symbol value before using it. In some cases a | |
1183 | relocation will require adding the symbol value to the existing contents | |
1184 | of the section in the object file. In others the relocation will simply | |
1185 | replace the contents of the section with the symbol value. Some | |
1186 | relocations are PC relative, so that the value to be stored in the | |
1187 | section is the difference between the value of a symbol and the final | |
1188 | address of the section contents. | |
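
The three cases in the preceding paragraph reduce to simple arithmetic. The following sketch assumes 32 bit quantities and uses hypothetical function names; real relocations also have to deal with field widths, shifts, and overflow checks, which are left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the section contents with the symbol value plus addend.  */
static uint32_t
reloc_absolute (uint32_t sym, uint32_t addend)
{
  return sym + addend;
}

/* Add the symbol value and addend to what the object file already
   holds at that location.  */
static uint32_t
reloc_add_to_contents (uint32_t contents, uint32_t sym, uint32_t addend)
{
  return contents + sym + addend;
}

/* PC relative: store the difference between the symbol value and the
   final address of the location being relocated.  Unsigned
   arithmetic wraps, so a backward reference yields a large value
   which is the two's complement form of a negative offset.  */
static uint32_t
reloc_pc_relative (uint32_t sym, uint32_t addend, uint32_t place)
{
  return sym + addend - place;
}
```

For instance, a symbol at 0x1000 referenced from a location whose final address is 0x1010 gives a PC relative value of 0xfffffff0, that is, minus 16.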
1189 | ||
1190 | In general, relocations can be arbitrarily complex. For example, | |
1191 | relocations used in dynamic linking systems often require the linker to | |
1192 | allocate space in a different section and use the offset within that | |
1193 | section as the value to store. In the IEEE object file format, | |
1194 | relocations may involve arbitrary expressions. | |
1195 | ||
1049f94e | 1196 | When doing a relocatable link, the linker may or may not have to do |
252b5132 RH |
1197 | anything with a relocation, depending upon the definition of the |
1198 | relocation. Simple relocations generally do not require any special | |
1199 | action. | |
1200 | ||
1201 | @node BFD relocation functions | |
1202 | @subsection BFD relocation functions | |
1203 | ||
1204 | In BFD, each section has an array of @samp{arelent} structures. Each | |
1205 | structure has a pointer to a symbol, an address within the section, an | |
1206 | addend, and a pointer to a @samp{reloc_howto_struct} structure. The | |
1207 | howto structure has a bunch of fields describing the reloc, including a | |
1208 | type field. The type field is specific to the object file format | |
1209 | backend; none of the generic code in BFD examines it. | |
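
As a rough picture of the layout just described, here is a simplified sketch. These are not the real @samp{arelent} and @samp{reloc_howto_struct} definitions: the @samp{demo_} structures and their field names are hypothetical, and the real howto structure has many more fields. The helper shows generic code consulting a descriptive howto field while never looking at the type field.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the howto structure.  */
struct demo_howto
{
  unsigned int type;    /* meaningful only to the backend */
  const char *name;
  int pc_relative;      /* one of the descriptive fields */
};

/* Simplified stand-in for a symbol.  */
struct demo_symbol
{
  const char *name;
  unsigned long value;
};

/* Simplified stand-in for arelent: a symbol, an address within the
   section, an addend, and a pointer to the howto.  */
struct demo_reloc
{
  struct demo_symbol *sym;
  unsigned long address;
  unsigned long addend;
  const struct demo_howto *howto;
};

/* Generic style code: counts the PC relative relocations in an array
   by reading the descriptive flag, without examining the type.  */
static size_t
demo_count_pcrel (const struct demo_reloc *relocs, size_t count)
{
  size_t i;
  size_t n = 0;

  for (i = 0; i < count; i++)
    if (relocs[i].howto->pc_relative)
      n++;
  return n;
}
```

Several relocations normally share one howto structure, which is why the relocation holds a pointer to it rather than a copy.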
1210 | ||
1211 | Originally, the function @samp{bfd_perform_relocation} was supposed to | |
1212 | handle all relocations. In theory, many relocations would be simple | |
1213 | enough to be described by the fields in the howto structure. For those | |
1214 | that weren't, the howto structure included a @samp{special_function} | |
1215 | field to use as an escape. | |
1216 | ||
1217 | While this seems plausible, a look at @samp{bfd_perform_relocation} | |
1218 | shows that it failed. The function has odd special cases. Some of the | |
1219 | fields in the howto structure, such as @samp{pcrel_offset}, were not | |
1220 | adequately documented. | |
1221 | ||
1222 | The linker uses @samp{bfd_perform_relocation} to do all relocations when | |
1223 | the input and output file have different formats (e.g., when generating | |
1224 | S-records). The generic linker code, which is used by all targets which | |
1225 | do not define their own special purpose linker, uses | |
1226 | @samp{bfd_get_relocated_section_contents}, which for most targets turns | |
1227 | into a call to @samp{bfd_generic_get_relocated_section_contents}, which | |
1228 | calls @samp{bfd_perform_relocation}. So @samp{bfd_perform_relocation} | |
1229 | is still widely used, which makes it difficult to change, since it is | |
1230 | difficult to test all possible cases. | |
1231 | ||
1232 | The assembler used @samp{bfd_perform_relocation} for a while. This | |
1233 | turned out to be the wrong thing to do, since | |
1234 | @samp{bfd_perform_relocation} was written to handle relocations on an | |
1235 | existing object file, while the assembler needed to create relocations | |
1236 | in a new object file. The assembler was changed to use the new function | |
1237 | @samp{bfd_install_relocation} instead, and @samp{bfd_install_relocation} | |
1238 | was created as a copy of @samp{bfd_perform_relocation}. | |
1239 | ||
1240 | Unfortunately, the work did not progress any further, so | |
1241 | @samp{bfd_install_relocation} remains a simple copy of | |
1242 | @samp{bfd_perform_relocation}, with all the odd special cases and | |
1243 | confusing code. This again is difficult to change, because again any | |
1244 | change can affect any assembler target, and so is difficult to test. | |
1245 | ||
1246 | The new linker, when using the same object file format for all input | |
1247 | files and the output file, does not convert relocations into | |
1248 | @samp{arelent} structures, so it cannot use | |
1249 | @samp{bfd_perform_relocation} at all. Instead, users of the new linker | |
1250 | are expected to write a @samp{relocate_section} function which will | |
1251 | handle relocations in a target specific fashion. | |
1252 | ||
1253 | There are two helper functions for target specific relocation: | |
1254 | @samp{_bfd_final_link_relocate} and @samp{_bfd_relocate_contents}. | |
1255 | These functions use a howto structure, but they @emph{do not} use the | |
1256 | @samp{special_function} field. Since the functions are normally called | |
1257 | from target specific code, the @samp{special_function} field adds | |
1258 | little; any relocations which require special handling can be handled | |
1259 | without calling those functions. | |
1260 | ||
1261 | So, if you want to add a new target, or add a new relocation to an | |
1262 | existing target, you need to do the following: | |
1263 | ||
1264 | @itemize @bullet | |
1265 | @item | |
1266 | Make sure you clearly understand what the contents of the section should | |
1049f94e | 1267 | look like after assembly, after a relocatable link, and after a final |
252b5132 | 1268 | link. Make sure you clearly understand the operations the linker must |
1049f94e | 1269 | perform during a relocatable link and during a final link. |
252b5132 RH |
1270 | |
1271 | @item | |
1272 | Write a howto structure for the relocation. The howto structure is | |
1273 | flexible enough to represent any relocation which should be handled by | |
1274 | setting a contiguous bitfield in the destination to the value of a | |
1275 | symbol, possibly with an addend, possibly adding the symbol value to the | |
1276 | value already present in the destination. | |
1277 | ||
1278 | @item | |
1279 | Change the assembler to generate your relocation. The assembler will | |
1280 | call @samp{bfd_install_relocation}, so your howto structure has to be | |
1281 | able to handle that. You may need to set the @samp{special_function} | |
1282 | field to handle assembly correctly. Be careful to ensure that any code | |
1283 | you write to handle the assembler will also work correctly when doing a | |
1049f94e | 1284 | relocatable link. For example, see @samp{bfd_elf_generic_reloc}. |
252b5132 RH |
1285 | |
1286 | @item | |
1287 | Test the assembler. Consider the cases of relocation against an | |
1288 | undefined symbol, a common symbol, a symbol defined in the object file | |
1289 | in the same section, and a symbol defined in the object file in a | |
1290 | different section. These cases may not all be applicable for your | |
1291 | reloc. | |
1292 | ||
1293 | @item | |
1294 | If your target uses the new linker, which is recommended, add any | |
1295 | required handling to the target specific relocation function. In simple | |
1296 | cases this will just involve a call to @samp{_bfd_final_link_relocate} | |
1297 | or @samp{_bfd_relocate_contents}, depending upon the definition of the | |
1049f94e | 1298 | relocation and whether the link is relocatable or not. |
252b5132 RH |
1299 | |
1300 | @item | |
1301 | Test the linker. Test the case of a final link. If the relocation can | |
1302 | overflow, use a linker script to force an overflow and make sure the | |
1049f94e AM |
1303 | error is reported correctly. Test a relocatable link, whether the |
1304 | symbol is defined or undefined in the relocatable output. For both the | |
1305 | final and relocatable link, test the case when the symbol is a common | |
252b5132 RH |
1306 | symbol, when the symbol looked like a common symbol but became a defined |
1307 | symbol, when the symbol is defined in a different object file, and when | |
1308 | the symbol is defined in the same object file. | |
1309 | ||
1310 | @item | |
1311 | In order for linking to another object file format, such as S-records, | |
1312 | to work correctly, @samp{bfd_perform_relocation} has to do the right | |
1313 | thing for the relocation. You may need to set the | |
1314 | @samp{special_function} field to handle this correctly. Test this by | |
1315 | doing a link in which the output object file format is S-records. | |
1316 | ||
1317 | @item | |
1049f94e | 1318 | Using the linker to generate relocatable output in a different object |
252b5132 | 1319 | file format is impossible in the general case, so you generally don't |
d1d013c3 HPN |
1320 | have to worry about that. The GNU linker makes sure to stop that from |
1321 | happening when an input file in a different format has relocations. | |
1322 | ||
1323 | Linking input files of different object file formats together is quite | |
1324 | unusual, but if you're really dedicated you may want to consider testing | |
1325 | this case, both when the output object file format is the same as your | |
1326 | format, and when it is different. | |
252b5132 RH |
1327 | @end itemize |
1328 | ||
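As a rough illustration of the kind of relocation a howto structure can describe, the following self-contained C sketch stores a symbol value plus an addend into a contiguous bitfield, optionally adding the value already present. The @samp{demo_field_howto} type and the @samp{demo_apply_reloc} function are invented for this example; they imitate a few fields of the real howto structure but are not BFD code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, much reduced stand-in for a howto structure.  */
struct demo_field_howto
{
  unsigned rightshift;  /* Bits to shift the value right before storing.  */
  unsigned bitpos;      /* Position of the field within the word.  */
  uint32_t src_mask;    /* Bits of the old contents to add in (0 if none).  */
  uint32_t dst_mask;    /* Bits of the word occupied by the field.  */
};

/* Return WORD with the relocated value stored in the field described
   by HOWTO.  Bits outside dst_mask are left untouched.  */
uint32_t
demo_apply_reloc (const struct demo_field_howto *howto, uint32_t word,
                  uint32_t symbol_value, uint32_t addend)
{
  uint32_t relocation = symbol_value + addend;

  assert (howto->rightshift < 32 && howto->bitpos < 32);

  relocation >>= howto->rightshift;
  relocation <<= howto->bitpos;

  /* Add the value already present in the destination, if src_mask
     asks for it, then replace only the bits covered by dst_mask.  */
  return ((word & ~howto->dst_mask)
          | (((word & howto->src_mask) + relocation) & howto->dst_mask));
}
```

With @samp{src_mask} set to zero the field is simply overwritten; with @samp{src_mask} equal to @samp{dst_mask} the old field contents act as an extra addend. Relocations that do not fit this shape are the ones that need special handling.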
1329 | @node BFD relocation codes | |
1330 | @subsection BFD relocation codes | |
1331 | ||
1332 | BFD has another way of describing relocations besides the howto | |
1333 | structures described above: the enum @samp{bfd_reloc_code_real_type}. | |
1334 | ||
1335 | Every known relocation type can be described as a value in this | |
1336 | enumeration. The enumeration contains many target specific relocations, | |
1337 | but where two or more targets have the same relocation, a single code is | |
1338 | used. For example, the single value @samp{BFD_RELOC_32} is used for all | |
1339 | simple 32 bit relocation types. | |
1340 | ||
1341 | The main purpose of this relocation code is to give the assembler some | |
1342 | mechanism to create @samp{arelent} structures. In order for the | |
1343 | assembler to create an @samp{arelent} structure, it has to be able to | |
1344 | obtain a howto structure. The function @samp{bfd_reloc_type_lookup}, | |
1345 | which simply calls the target vector entry point | |
1346 | @samp{reloc_type_lookup}, takes a relocation code and returns a howto | |
1347 | structure. | |
1348 | ||
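The lookup performed behind @samp{reloc_type_lookup} is typically a small table search. The sketch below shows the general shape of such a routine. Everything in it is hypothetical: the @samp{demo_} types, the code values, and the table contents merely stand in for @samp{bfd_reloc_code_real_type} and the real howto definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for generic BFD relocation codes.  */
enum demo_reloc_code
{
  DEMO_RELOC_NONE,
  DEMO_RELOC_32,
  DEMO_RELOC_16,
  DEMO_RELOC_HI16
};

/* Hypothetical, heavily reduced howto structure.  */
struct demo_howto
{
  unsigned type;     /* Format specific relocation number.  */
  unsigned bitsize;  /* Width of the relocated field.  */
  const char *name;  /* Name used in error messages.  */
};

static const struct demo_howto demo_howto_table[] =
{
  { 0, 0, "R_DEMO_NONE" },
  { 1, 32, "R_DEMO_32" },
  { 2, 16, "R_DEMO_16" }
};

/* Maps generic codes to indices in demo_howto_table.  */
static const struct
{
  enum demo_reloc_code code;
  size_t howto_index;
} demo_reloc_map[] =
{
  { DEMO_RELOC_NONE, 0 },
  { DEMO_RELOC_32, 1 },
  { DEMO_RELOC_16, 2 }
};

#define DEMO_ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

/* Return the howto for CODE, or NULL if this format cannot
   represent that relocation.  */
const struct demo_howto *
demo_reloc_type_lookup (enum demo_reloc_code code)
{
  size_t i;

  for (i = 0; i < DEMO_ARRAY_SIZE (demo_reloc_map); i++)
    if (demo_reloc_map[i].code == code)
      {
        assert (demo_reloc_map[i].howto_index
                < DEMO_ARRAY_SIZE (demo_howto_table));
        return &demo_howto_table[demo_reloc_map[i].howto_index];
      }

  return NULL;
}
```

A second object file format would supply its own table and map, which is how one relocation code can end up with a different howto structure in each format.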
1349 | The function @samp{bfd_get_reloc_code_name} returns the name of a | |
1350 | relocation code. This is mainly used in error messages. | |
1351 | ||
1352 | Using both howto structures and relocation codes can be somewhat | |
1353 | confusing. There are many processor specific relocation codes. | |
1354 | However, the relocation is only fully defined by the howto structure. | |
1355 | The same relocation code will map to different howto structures in | |
1356 | different object file formats. For example, the addend handling may be | |
1357 | different. | |
1358 | ||
1359 | Most of the relocation codes are not really general. The assembler | |
1360 | cannot use them without already understanding what sorts of relocations can | |
1361 | be used for a particular target. It might be possible to replace the | |
1362 | relocation codes with something simpler. | |
1363 | ||
1364 | @node BFD relocation future | |
1365 | @subsection BFD relocation future | |
1366 | ||
1367 | Clearly the current BFD relocation support is in bad shape. A | |
1368 | wholesale rewrite would be very difficult, because it would require | |
1369 | thorough testing of every BFD target. So some sort of incremental | |
1370 | change is required. | |
1371 | ||
1372 | My vague thoughts on this would involve defining a new, clearly defined, | |
1373 | howto structure. Some mechanism would be used to determine which type | |
1374 | of howto structure was being used by a particular format. | |
1375 | ||
1376 | The new howto structure would clearly define the relocation behaviour in | |
1049f94e | 1377 | the case of an assembly, a relocatable link, and a final link. At |
252b5132 RH |
1378 | least one special function would be defined as an escape, and it might |
1379 | make sense to define more. | |
1380 | ||
1381 | One or more generic functions similar to @samp{bfd_perform_relocation} | |
1382 | would be written to handle the new howto structure. | |
1383 | ||
1384 | This should make it possible to write a generic version of the relocate | |
1385 | section functions used by the new linker. The target specific code | |
1386 | would provide some mechanism (a function pointer or an initial | |
1387 | conversion) to convert target specific relocations into howto | |
1388 | structures. | |
1389 | ||
1390 | Ideally it would be possible to use this generic relocate section | |
1391 | function for the generic linker as well. That is, it would replace the | |
1392 | @samp{bfd_generic_get_relocated_section_contents} function which is | |
1393 | currently normally used. | |
1394 | ||
1395 | For the special case of ELF dynamic linking, more consideration needs to | |
1396 | be given to writing ELF specific but ELF target generic code to handle | |
1397 | special relocation types such as GOT and PLT. | |
1398 | ||
1399 | @node BFD ELF support | |
1400 | @section BFD ELF support | |
1401 | @cindex elf support in bfd | |
1402 | @cindex bfd elf support | |
1403 | ||
1404 | The ELF object file format is defined in two parts: a generic ABI and a | |
1405 | processor specific supplement. The ELF support in BFD is split in a | |
1406 | similar fashion. The processor specific support is largely kept within | |
1407 | a single file. The generic support is provided by several other files. | |
1408 | The processor specific support provides a set of function pointers and | |
1409 | constants used by the generic support. | |
1410 | ||
1411 | @menu | |
1412 | * BFD ELF sections and segments:: ELF sections and segments | |
1413 | * BFD ELF generic support:: BFD ELF generic support | |
1414 | * BFD ELF processor specific support:: BFD ELF processor specific support | |
1415 | * BFD ELF core files:: BFD ELF core files | |
1416 | * BFD ELF future:: BFD ELF future | |
1417 | @end menu | |
1418 | ||
1419 | @node BFD ELF sections and segments | |
1420 | @subsection ELF sections and segments | |
1421 | ||
1422 | The ELF ABI permits a file to have either sections or segments or both. | |
b45619c0 | 1423 | Relocatable object files conventionally have only sections. |
252b5132 RH |
1424 | Executables conventionally have both. Core files conventionally have |
1425 | only program segments. | |
1426 | ||
1427 | ELF sections are similar to sections in other object file formats: they | |
1428 | have a name, a VMA, file contents, flags, and other miscellaneous | |
1429 | information. ELF relocations are stored in sections of a particular | |
1430 | type; BFD automatically converts these sections into internal relocation | |
1431 | information. | |
1432 | ||
1433 | ELF program segments are intended for fast interpretation by a system | |
1434 | loader. They have a type, a VMA, an LMA, file contents, and a couple of | |
1435 | other fields. When an ELF executable is run on a Unix system, the | |
1436 | system loader will examine the program segments to decide how to load | |
1437 | it. The loader will ignore the section information. Loadable program | |
1438 | segments (type @samp{PT_LOAD}) are directly loaded into memory. Other | |
1439 | program segments are interpreted by the loader, and generally provide | |
1440 | dynamic linking information. | |
1441 | ||
1442 | When an ELF file has both program segments and sections, an ELF program | |
1443 | segment may encompass one or more ELF sections, in the sense that the | |
1444 | portion of the file which corresponds to the program segment may include | |
1445 | the portions of the file corresponding to one or more sections. When | |
1446 | there is more than one section in a loadable program segment, the | |
1447 | relative positions of the section contents in the file must correspond | |
1448 | to the relative positions they should hold when the program segment is | |
1449 | loaded. This requirement should be obvious if you consider that the | |
1450 | system loader will load an entire program segment at a time. | |
1451 | ||
1452 | On a system which supports dynamic paging, such as any native Unix | |
1453 | system, the contents of a loadable program segment must be at the same | |
1454 | offset in the file as in memory, modulo the memory page size used on the | |
1455 | system. This is because the system loader will map the file into memory | |
1456 | starting at the start of a page. The system loader can easily remap | |
1457 | entire pages to the correct load address. However, if the contents of | |
1458 | the file were not correctly aligned within the page, the system loader | |
1459 | would have to shift the contents around within the page, which is too | |
1460 | expensive. For example, if the LMA of a loadable program segment is | |
1461 | @samp{0x40080} and the page size is @samp{0x1000}, then the position of | |
1462 | the segment contents within the file must equal @samp{0x80} modulo | |
1463 | @samp{0x1000}. | |
1464 | ||
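The alignment rule above amounts to a congruence between the file offset and the load address. A minimal sketch in C, using an invented helper name (@samp{demo_segment_is_pageable} is not a BFD function):

```c
#include <assert.h>
#include <stdint.h>

/* Return nonzero if segment contents stored at FILE_OFFSET can be
   mapped directly at ADDRESS on a system using PAGE_SIZE byte pages,
   that is, if the two values are equal modulo the page size.  */
int
demo_segment_is_pageable (uint64_t file_offset, uint64_t address,
                          uint64_t page_size)
{
  assert (page_size != 0);
  return file_offset % page_size == address % page_size;
}
```

For the example in the text, an address of @samp{0x40080} with a page size of @samp{0x1000} is satisfied by file offsets such as @samp{0x80} or @samp{0x2080}, but not by @samp{0x100}.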
1465 | BFD has only a single set of sections. It does not provide any generic | |
1466 | way to examine both sections and segments. When BFD is used to open an | |
1467 | object file or executable, the BFD sections will represent ELF sections. | |
1468 | When BFD is used to open a core file, the BFD sections will represent | |
1469 | ELF program segments. | |
1470 | ||
1471 | When BFD is used to examine an object file or executable, any program | |
1472 | segments will be read to set the LMA of the sections. This is because | |
1473 | ELF sections only have a VMA, while ELF program segments have both a VMA | |
1474 | and an LMA. Any program segments will be copied by the | |
1475 | @samp{copy_private} entry points. They will be printed by the | |
1476 | @samp{print_private} entry point. Otherwise, the program segments are | |
1477 | ignored. In particular, programs which use BFD currently have no direct | |
1478 | access to the program segments. | |
1479 | ||
1480 | When BFD is used to create an executable, the program segments will be | |
1481 | created automatically based on the section information. This is done in | |
1482 | the function @samp{assign_file_positions_for_segments} in @file{elf.c}. | |
1483 | This function has been tweaked many times, and probably still has | |
1484 | problems that arise in particular cases. | |
1485 | ||
1486 | There is a hook which may be used to explicitly define the program | |
1487 | segments when creating an executable: the @samp{bfd_record_phdr} | |
1488 | function in @file{bfd.c}. If this function is called, BFD will not | |
1489 | create program segments itself, but will only create the program | |
1490 | segments specified by the caller. The linker uses this function to | |
1491 | implement the @samp{PHDRS} linker script command. | |
1492 | ||
1493 | @node BFD ELF generic support | |
1494 | @subsection BFD ELF generic support | |
1495 | ||
1496 | In general, functions which do not read external data from the ELF file | |
1497 | are found in @file{elf.c}. They operate on the internal forms of the | |
1498 | ELF structures, which are defined in @file{include/elf/internal.h}. The | |
1499 | internal structures are defined in terms of @samp{bfd_vma}, and so may | |
1500 | be used for both 32 bit and 64 bit ELF targets. | |
1501 | ||
1502 | The file @file{elfcode.h} contains functions which operate on the | |
1503 | external data. @file{elfcode.h} is compiled twice, once via | |
1504 | @file{elf32.c} with @samp{ARCH_SIZE} defined as @samp{32}, and once via | |
1505 | @file{elf64.c} with @samp{ARCH_SIZE} defined as @samp{64}. | |
1506 | @file{elfcode.h} includes functions to swap the ELF structures in and | |
1507 | out of external form, as well as a few more complex functions. | |
1508 | ||
c152c796 | 1509 | Linker support is found in @file{elflink.c}. The |
252b5132 RH |
1510 | linker support is only used if the processor specific file defines |
1511 | @samp{elf_backend_relocate_section}, which is required to relocate the | |
1512 | section contents. If that macro is not defined, the generic linker code | |
1513 | is used, and relocations are handled via @samp{bfd_perform_relocation}. | |
1514 | ||
1515 | The core file support is in @file{elfcore.h}, which is compiled twice, | |
1516 | for both 32 and 64 bit support. The more interesting cases of core file | |
1517 | support only work on a native system which has the @file{sys/procfs.h} | |
1518 | header file. Without that file, the core file support does little more | |
1519 | than read the ELF program segments as BFD sections. | |
1520 | ||
1521 | The BFD internal header file @file{elf-bfd.h} is used for communication | |
1522 | among these files and the processor specific files. | |
1523 | ||
1524 | The default entries for the BFD ELF target vector are found mainly in | |
1525 | @file{elf.c}. Some functions are found in @file{elfcode.h}. | |
1526 | ||
1527 | The processor specific files may override particular entries in the | |
1528 | target vector, but most do not, with one exception: the | |
1529 | @samp{bfd_reloc_type_lookup} entry point is always processor specific. | |
1530 | ||
1531 | @node BFD ELF processor specific support | |
1532 | @subsection BFD ELF processor specific support | |
1533 | ||
1534 | By convention, the processor specific support for a particular processor | |
1535 | will be found in @file{elf@var{nn}-@var{cpu}.c}, where @var{nn} is | |
1536 | either 32 or 64, and @var{cpu} is the name of the processor. | |
1537 | ||
1538 | @menu | |
1539 | * BFD ELF processor required:: Required processor specific support | |
1540 | * BFD ELF processor linker:: Processor specific linker support | |
1541 | * BFD ELF processor other:: Other processor specific support options | |
1542 | @end menu | |
1543 | ||
1544 | @node BFD ELF processor required | |
1545 | @subsubsection Required processor specific support | |
1546 | ||
1547 | When writing a @file{elf@var{nn}-@var{cpu}.c} file, you must do the | |
1548 | following: | |
1549 | ||
1550 | @itemize @bullet | |
1551 | @item | |
1552 | Define either @samp{TARGET_BIG_SYM} or @samp{TARGET_LITTLE_SYM}, or | |
1553 | both, to a unique C name to use for the target vector. This name should | |
1554 | appear in the list of target vectors in @file{targets.c}, and will also | |
1555 | have to appear in @file{config.bfd} and @file{configure.in}. Define | |
1556 | @samp{TARGET_BIG_SYM} for a big-endian processor, | |
1557 | @samp{TARGET_LITTLE_SYM} for a little-endian processor, and define both | |
1558 | for a bi-endian processor. | |
1559 | @item | |
1560 | Define either @samp{TARGET_BIG_NAME} or @samp{TARGET_LITTLE_NAME}, or | |
1561 | both, to a string used as the name of the target vector. This is the | |
1562 | name which a user of the BFD tool would use to specify the object file | |
1563 | format. It would normally appear in a linker emulation parameters | |
1564 | file. | |
1565 | @item | |
1566 | Define @samp{ELF_ARCH} to the BFD architecture (an element of the | |
1567 | @samp{bfd_architecture} enum, typically @samp{bfd_arch_@var{cpu}}). | |
1568 | @item | |
1569 | Define @samp{ELF_MACHINE_CODE} to the magic number which should appear | |
1570 | in the @samp{e_machine} field of the ELF header. As of this writing, | |
abd4c6a2 | 1571 | these magic numbers are assigned by Caldera; if you want to get a magic |
252b5132 | 1572 | number for a particular processor, try sending a note to |
abd4c6a2 | 1573 | @email{registry@@caldera.com}. In the BFD sources, the magic numbers are |
252b5132 RH |
1574 | found in @file{include/elf/common.h}; they have names beginning with |
1575 | @samp{EM_}. | |
1576 | @item | |
1577 | Define @samp{ELF_MAXPAGESIZE} to the maximum size of a virtual page in | |
1578 | memory. This can normally be found at the start of chapter 5 in the | |
1579 | processor specific supplement. For a processor which will only be used | |
1580 | in an embedded system, or which has no memory management hardware, this | |
1581 | can simply be @samp{1}. | |
1582 | @item | |
1583 | If the format should use @samp{Rel} rather than @samp{Rela} relocations, | |
1584 | define @samp{USE_REL}. This is normally defined in chapter 4 of the | |
1585 | processor specific supplement. | |
1586 | ||
1587 | In the absence of a supplement, it's easier to work with @samp{Rela} | |
1588 | relocations. @samp{Rela} relocations will require more space in object | |
1589 | files (but not in executables, except when using dynamic linking). | |
1590 | However, this is outweighed by the simplicity of addend handling when | |
1591 | using @samp{Rela} relocations. With @samp{Rel} relocations, the addend | |
1049f94e | 1592 | must be stored in the section contents, which makes relocatable links |
252b5132 RH |
1593 | more complex. |
1594 | ||
1595 | For example, consider C code like @code{i = a[1000];} where @samp{a} is | |
1596 | a global array. The instructions which load the value of @samp{a[1000]} | |
1597 | will most likely use a relocation which refers to the symbol | |
1598 | representing @samp{a}, with an addend that gives the offset from the | |
1599 | start of @samp{a} to element @samp{1000}. When using @samp{Rel} | |
1600 | relocations, that addend must be stored in the instructions themselves. | |
1601 | If you are adding support for a RISC chip which uses two or more | |
1602 | instructions to load an address, then the addend may not fit in a single | |
1603 | instruction, and will have to be somehow split among the instructions. | |
1049f94e | 1604 | This makes linking awkward, particularly when doing a relocatable link |
252b5132 RH |
1605 | in which the addend may have to be updated. It can be done---the MIPS |
1606 | ELF support does it---but it should be avoided when possible. | |
1607 | ||
1608 | It is possible, though somewhat awkward, to support both @samp{Rel} and | |
1609 | @samp{Rela} relocations for a single target; @file{elf64-mips.c} does it | |
1610 | by overriding the relocation reading and writing routines. | |
1611 | @item | |
1612 | Define howto structures for all the relocation types. | |
1613 | @item | |
1614 | Define a @samp{bfd_reloc_type_lookup} routine. This must be named | |
1615 | @samp{bfd_elf@var{nn}_bfd_reloc_type_lookup}, and may be either a | |
1616 | function or a macro. It must translate a BFD relocation code into a | |
1617 | howto structure. This is normally a table lookup or a simple switch. | |
1618 | @item | |
1619 | If using @samp{Rel} relocations, define @samp{elf_info_to_howto_rel}. | |
1620 | If using @samp{Rela} relocations, define @samp{elf_info_to_howto}. | |
1621 | Either way, this is a macro defined as the name of a function which | |
1622 | takes an @samp{arelent} and a @samp{Rel} or @samp{Rela} structure, and | |
1623 | sets the @samp{howto} field of the @samp{arelent} based on the | |
1624 | @samp{Rel} or @samp{Rela} structure. This normally uses | |
1625 | @samp{ELF@var{nn}_R_TYPE} to get the ELF relocation type and uses it as | |
1626 | an index into a table of howto structures. | |
1627 | @end itemize | |
1628 | ||
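To make the @samp{Rel} addend problem described in the list above more concrete, here is a sketch of splitting an addend between two 16 bit instruction fields and reassembling it. It is loosely modelled on a high half/low half instruction pair in which the low half is sign-extended at run time; the function names and the field layout are assumptions made for this example, not the code of any particular BFD backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split ADDEND into the values that would be stored in the immediate
   fields of a "load high half" and an "add low half" instruction.  */
void
demo_split_addend (uint32_t addend, uint16_t *hi, uint16_t *lo)
{
  assert (hi != NULL && lo != NULL);

  *lo = (uint16_t) (addend & 0xffffu);
  /* The low half is sign-extended when the instructions execute, so
     the high half is rounded up to compensate.  */
  *hi = (uint16_t) ((addend + 0x8000u) >> 16);
}

/* Recover the addend from the two fields, as the linker must do
   before it can adjust the addend and store it back.  */
uint32_t
demo_join_addend (uint16_t hi, uint16_t lo)
{
  /* Sign-extend the low half using modular arithmetic.  */
  uint32_t low = ((uint32_t) lo ^ 0x8000u) - 0x8000u;

  return ((uint32_t) hi << 16) + low;
}
```

Assuming 4 byte array elements, the reference to @samp{a[1000]} has an addend of 4000, which fits entirely in the low field. An addend such as @samp{0x18000} does not: it is stored as a high half of 2 and a low half of @samp{0x8000}. During a relocatable link both fields have to be read, joined, adjusted, and split again, which is the awkwardness that @samp{Rela} relocations avoid.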
1629 | You must also add the magic number for this processor to the | |
1630 | @samp{prep_headers} function in @file{elf.c}. | |
1631 | ||
1632 | You must also create a header file in the @file{include/elf} directory | |
1633 | called @file{@var{cpu}.h}. This file should define any target specific | |
1634 | information which may be needed outside of the BFD code. In particular | |
1635 | it should use the @samp{START_RELOC_NUMBERS}, @samp{RELOC_NUMBER}, | |
1636 | @samp{FAKE_RELOC}, @samp{EMPTY_RELOC} and @samp{END_RELOC_NUMBERS} | |
4ee79850 | 1637 | macros to create a table mapping the number used to identify a |
252b5132 RH |
1638 | relocation to a name describing that relocation. |
1639 | ||
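The following sketch suggests how a single list of relocation numbers can yield both an enumeration and a number-to-name mapping. The macro definitions here are simplified stand-ins written for this example and cover only three of the macros; the real definitions in the binutils sources differ in detail, and the @samp{demo} processor and its relocation names are hypothetical.

```c
#include <stddef.h>

/* The relocation list for a hypothetical "demo" processor.  In a real
   port this list would be the body of include/elf/CPU.h.  */
#define DEMO_RELOC_LIST \
  START_RELOC_NUMBERS (elf_demo_reloc_type) \
    RELOC_NUMBER (R_DEMO_NONE, 0) \
    RELOC_NUMBER (R_DEMO_32, 1) \
    RELOC_NUMBER (R_DEMO_PC32, 2) \
  END_RELOC_NUMBERS (R_DEMO_max)

/* First expansion: simplified stand-in macros that build an enum.  */
#define START_RELOC_NUMBERS(name)   enum name {
#define RELOC_NUMBER(name, number)  name = number,
#define END_RELOC_NUMBERS(name)     name };

DEMO_RELOC_LIST

#undef START_RELOC_NUMBERS
#undef RELOC_NUMBER
#undef END_RELOC_NUMBERS

/* Second expansion: the same list becomes a function that maps a
   relocation number to its name, or NULL if the number is unknown.  */
#define START_RELOC_NUMBERS(name) \
  const char *name##_name (unsigned long rtype) { switch (rtype) {
#define RELOC_NUMBER(name, number)  case number: return #name;
#define END_RELOC_NUMBERS(name)     default: return NULL; } }

DEMO_RELOC_LIST
```

After both expansions, @samp{R_DEMO_32} is an enumerator with the value 1, and the generated function @samp{elf_demo_reloc_type_name} returns the string @samp{R_DEMO_32} for that number. Keeping the list in one place means the numbers and the names cannot drift apart.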
dd167cc8 HPN |
1640 | While not a BFD component, you probably also want to make the binutils |
1641 | program @samp{readelf} parse your ELF objects. For this, you need to add | |
964802a8 | 1642 | code for @code{EM_@var{cpu}} as appropriate in @file{binutils/readelf.c}. |
dd167cc8 | 1643 | |
252b5132 RH |
1644 | @node BFD ELF processor linker |
1645 | @subsubsection Processor specific linker support | |
1646 | ||
1647 | The linker will be much more efficient if you define a relocate section | |
1648 | function. This will permit BFD to use the ELF specific linker support. | |
1649 | ||
1650 | If you do not define a relocate section function, BFD must use the | |
1651 | generic linker support, which requires converting all symbols and | |
1652 | relocations into BFD @samp{asymbol} and @samp{arelent} structures. In | |
1653 | this case, relocations will be handled by calling | |
1654 | @samp{bfd_perform_relocation}, which will use the howto structures you | |
1655 | have defined. @xref{BFD relocation handling}. | |
1656 | ||
1657 | In order to support linking into a different object file format, such as | |
1658 | S-records, @samp{bfd_perform_relocation} must work correctly with your | |
1659 | howto structures, so you can't skip that step. However, if you define | |
1660 | the relocate section function, then in the normal case of linking into | |
1661 | an ELF file the linker will not need to convert symbols and relocations, | |
1662 | and will be much more efficient. | |
1663 | ||
1664 | To use a relocate section function, define the macro | |
1665 | @samp{elf_backend_relocate_section} as the name of a function which will | |
1666 | take the contents of a section, as well as relocation, symbol, and other | |
1667 | information, and modify the section contents according to the relocation | |
1668 | information. In simple cases, this is little more than a loop over the | |
1669 | relocations which computes the value of each relocation and calls | |
1670 | @samp{_bfd_final_link_relocate}. The function must check for a | |
1049f94e | 1671 | relocatable link, and in that case normally needs to do nothing other |
252b5132 RH |
1672 | than adjust the addend for relocations against a section symbol. |
1673 | ||
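The overall shape of a simple relocate section function can be sketched as follows. This is a toy model, not the real @samp{elf_backend_relocate_section} interface: the @samp{demo_} types are invented, every relocation is assumed to be a plain 32 bit absolute one with a @samp{Rela} style addend, and the value is stored directly in host byte order where real code would call a helper such as @samp{_bfd_final_link_relocate}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified symbol and relocation records.  */
struct demo_symbol
{
  uint32_t value;                  /* Final address of the symbol.  */
  int is_section;                  /* Nonzero for a section symbol.  */
  uint32_t section_output_offset;  /* Where the input section landed
                                      within the output section.  */
};

struct demo_reloc
{
  size_t offset;    /* Offset of the field within the section.  */
  size_t symndx;    /* Index into the symbol array.  */
  uint32_t addend;
};

void
demo_relocate_section (uint8_t *contents, size_t size,
                       struct demo_reloc *relocs, size_t reloc_count,
                       const struct demo_symbol *syms, int relocatable)
{
  size_t i;

  for (i = 0; i < reloc_count; i++)
    {
      struct demo_reloc *rel = &relocs[i];
      const struct demo_symbol *sym = &syms[rel->symndx];
      uint32_t value;

      if (relocatable)
        {
          /* Relocatable link: the contents stay as they are.  Only a
             relocation against a section symbol needs its addend
             adjusted, because the input section has moved.  */
          if (sym->is_section)
            rel->addend += sym->section_output_offset;
          continue;
        }

      /* Final link: compute the value and store it in the section.  */
      assert (rel->offset + sizeof value <= size);
      value = sym->value + rel->addend;
      memcpy (contents + rel->offset, &value, sizeof value);
    }
}
```

The dynamic linking cases discussed next do not fit this simple loop, which is why real backends are considerably longer.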
1674 | The complex cases generally have to do with dynamic linker support. GOT | |
1675 | and PLT relocations must be handled specially, and the linker normally | |
1676 | arranges to set up the GOT and PLT sections while handling relocations. | |
1677 | When generating a shared library, arbitrary relocations must normally be | |
1678 | copied into the shared library, or converted to RELATIVE relocations | |
1679 | when possible. | |
1680 | ||
1681 | @node BFD ELF processor other | |
1682 | @subsubsection Other processor specific support options | |
1683 | ||
1684 | There are many other macros which may be defined in | |
1685 | @file{elf@var{nn}-@var{cpu}.c}. These macros may be found in | |
1686 | @file{elfxx-target.h}. | |
1687 | ||
1688 | Macros may be used to override some of the generic ELF target vector | |
1689 | functions. | |
1690 | ||
1691 | There are several processor specific hook functions which may be defined as | |
1692 | macros. These functions are found as function pointers in the | |
1693 | @samp{elf_backend_data} structure defined in @file{elf-bfd.h}. In | |
1694 | general, a hook function is set by defining a macro | |
1695 | @samp{elf_backend_@var{name}}. | |
1696 | ||
1697 | There are a few processor specific constants which may also be defined. | |
1698 | These are again found in the @samp{elf_backend_data} structure. | |
1699 | ||
1700 | I will not define the various functions and constants here; see the | |
1701 | comments in @file{elf-bfd.h}. | |
1702 | ||
1703 | Normally any odd characteristic of a particular ELF processor is handled | |
1704 | via a hook function. For example, the special @samp{SHN_MIPS_SCOMMON} | |
1705 | section number found in MIPS ELF is handled via the hooks | |
1706 | @samp{section_from_bfd_section}, @samp{symbol_processing}, | |
1707 | @samp{add_symbol_hook}, and @samp{output_symbol_hook}. | |
1708 | ||
1709 | Dynamic linking support, which involves processor specific relocations | |
1710 | requiring special handling, is also implemented via hook functions. | |
1711 | ||
1712 | @node BFD ELF core files | |
1713 | @subsection BFD ELF core files | |
1714 | @cindex elf core files | |
1715 | ||
1716 | On native ELF Unix systems, core files are generated without any | |
1717 | sections. Instead, they only have program segments. | |
1718 | ||
1719 | When BFD is used to read an ELF core file, the BFD sections will | |
1720 | actually represent program segments. Since ELF program segments do not | |
1721 | have names, BFD will invent names like @samp{segment@var{n}} where | |
1722 | @var{n} is a number. | |
1723 | ||
1724 | A single ELF program segment may include both an initialized part and an | |
1725 | uninitialized part. The size of the initialized part is given by the | |
1726 | @samp{p_filesz} field. The total size of the segment is given by the | |
1727 | @samp{p_memsz} field. If @samp{p_memsz} is larger than @samp{p_filesz}, | |
1728 | then the extra space is uninitialized, or, more precisely, initialized | |
1729 | to zero. | |
1730 | ||
1731 | BFD will represent such a program segment as two different sections. | |
1732 | The first, named @samp{segment@var{n}a}, will represent the initialized | |
1733 | part of the program segment. The second, named @samp{segment@var{n}b}, | |
1734 | will represent the uninitialized part. | |
1735 | ||
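The split described above can be sketched in C as follows. The @samp{demo_section} type and @samp{demo_split_segment} function are invented for this example and only mirror the naming scheme given in the text; the treatment of a segment whose @samp{p_filesz} is zero is an assumption, not something the text specifies.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, reduced description of a BFD section.  */
struct demo_section
{
  char name[32];
  uint64_t size;
  int has_contents;  /* Zero for the uninitialized part.  */
};

/* Describe program segment number INDEX as one or two sections in OUT.
   Return the number of sections written.  */
int
demo_split_segment (unsigned index, uint64_t p_filesz, uint64_t p_memsz,
                    struct demo_section out[2])
{
  assert (p_filesz <= p_memsz);

  /* Assumption: only split when both parts are non-empty.  */
  if (p_filesz == 0 || p_filesz == p_memsz)
    {
      snprintf (out[0].name, sizeof out[0].name, "segment%u", index);
      out[0].size = p_memsz;
      out[0].has_contents = p_filesz != 0;
      return 1;
    }

  /* Initialized part, backed by p_filesz bytes of the file.  */
  snprintf (out[0].name, sizeof out[0].name, "segment%ua", index);
  out[0].size = p_filesz;
  out[0].has_contents = 1;

  /* Remainder of the segment, which is zero filled in memory.  */
  snprintf (out[1].name, sizeof out[1].name, "segment%ub", index);
  out[1].size = p_memsz - p_filesz;
  out[1].has_contents = 0;
  return 2;
}
```

For instance, segment 3 with a @samp{p_filesz} of @samp{0x100} and a @samp{p_memsz} of @samp{0x400} would be described as @samp{segment3a} of size @samp{0x100} and @samp{segment3b} of size @samp{0x300}.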
1736 | ELF core files store special information such as register values in | |
1737 | program segments with the type @samp{PT_NOTE}. BFD will attempt to | |
1738 | interpret the information in these segments, and will create additional | |
1739 | sections holding the information. Some of this interpretation requires | |
1740 | information found in the host header file @file{sys/procfs.h}, and so | |
1741 | will only work when BFD is built on a native system. | |
1742 | ||
1743 | BFD does not currently provide any way to create an ELF core file. In | |
1744 | general, BFD does not provide a way to create core files. The way to | |
1745 | implement this would be to write @samp{bfd_set_format} and | |
1746 | @samp{bfd_write_contents} routines for the @samp{bfd_core} type; see | |
1747 | @ref{BFD target vector format}. | |
1748 | ||
1749 | @node BFD ELF future | |
1750 | @subsection BFD ELF future | |
1751 | ||
1752 | The current dynamic linking support has too much code duplication. | |
1753 | While each processor has particular differences, much of the dynamic | |
1754 | linking support is quite similar for each processor. The GOT and PLT | |
1755 | are handled in fairly similar ways, the details of -Bsymbolic linking | |
1756 | are generally similar, etc. This code should be reworked to use more | |
1757 | generic functions, eliminating the duplication. | |
1758 | ||
1759 | Similarly, the relocation handling has too much duplication. Many of | |
1760 | the @samp{reloc_type_lookup} and @samp{info_to_howto} functions are | |
1761 | quite similar. The relocate section functions are also often quite | |
1762 | similar, both in the standard linker handling and the dynamic linker | |
1763 | handling. Many of the COFF processor specific backends share a single | |
1764 | relocate section function (@samp{_bfd_coff_generic_relocate_section}), | |
1765 | and it should be possible to do something like this for the ELF targets | |
1766 | as well. | |
1767 | ||
1768 | The appearance of the processor specific magic number in | |
1769 | @samp{prep_headers} in @file{elf.c} is somewhat bogus. It should be | |
1770 | possible to add support for a new processor without changing the generic | |
1771 | support. | |
1772 | ||
1773 | The processor function hooks and constants are ad hoc and need better | |
1774 | documentation. | |
1775 | ||
252b5132 RH |
1776 | @node BFD glossary |
1777 | @section BFD glossary | |
1778 | @cindex glossary for bfd | |
1779 | @cindex bfd glossary | |
1780 | ||
1781 | This is a short glossary of some BFD terms. | |
1782 | ||
1783 | @table @asis | |
1784 | @item a.out | |
1785 | The a.out object file format. The original Unix object file format. | |
1786 | Still used on SunOS, though not Solaris. Supports only three sections. | |
1787 | ||
1788 | @item archive | |
1789 | A collection of object files produced and manipulated by the @samp{ar} | |
1790 | program. | |
1791 | ||
1792 | @item backend | |
1793 | The implementation within BFD of a particular object file format. The | |
1794 | set of functions which appear in a particular target vector. | |
1795 | ||
1796 | @item BFD | |
4ee79850 | 1797 | The BFD library itself. Also, each object file, archive, or executable |
252b5132 RH |
1798 | opened by the BFD library has the type @samp{bfd *}, and is sometimes |
1799 | referred to as a bfd. | |
1800 | ||
1801 | @item COFF | |
1802 | The Common Object File Format. Used on Unix SVR3. Used by some | |
1803 | embedded targets, although ELF is normally better. | |
1804 | ||
1805 | @item DLL | |
1806 | A shared library on Windows. | |
1807 | ||
1808 | @item dynamic linker | |
1809 | When a program linked against a shared library is run, the dynamic | |
1810 | linker locates the appropriate shared library and arranges to include | |
1811 | it in the running image. | |
1812 | ||
1813 | @item dynamic object | |
1814 | Another name for an ELF shared library. | |
1815 | ||
1816 | @item ECOFF | |
1817 | The Extended Common Object File Format. Used on Alpha Digital Unix | |
1818 | (formerly OSF/1), as well as Ultrix and Irix 4. A variant of COFF. | |
1819 | ||
1820 | @item ELF | |
1821 | The Executable and Linking Format. The object file format used on most | |
1822 | modern Unix systems, including GNU/Linux, Solaris, Irix, and SVR4. Also | |
1823 | used on many embedded systems. | |
1824 | ||
1825 | @item executable | |
1826 | A program, with instructions and symbols, and perhaps dynamic linking | |
1827 | information. Normally produced by a linker. | |
1828 | ||
1829 | @item LMA | |
1830 | Load Memory Address. This is the address at which a section will be | |
1831 | loaded. Compare with VMA, below. | |
1832 | ||
1833 | @item NLM | |
1834 | NetWare Loadable Module. Used to describe the format of an object which | |
1835 | can be loaded into NetWare, a PC-based network server operating | |
1836 | system. | |
1837 | ||
1838 | @item object file | |
1839 | A binary file including machine instructions, symbols, and relocation | |
1840 | information. Normally produced by an assembler. | |
1841 | ||
1842 | @item object file format | |
1843 | The format of an object file. Typically object files and executables | |
1844 | for a particular system are in the same format, although executables | |
1845 | will not contain any relocation information. | |
1846 | ||
1847 | @item PE | |
1848 | The Portable Executable format. This is the object file format used for | |
1849 | Windows (specifically, Win32) object files. It is based closely on | |
1850 | COFF, but has a few significant differences. | |
1851 | ||
1852 | @item PEI | |
1853 | The Portable Executable Image format. This is the object file format | |
1854 | used for Windows (specifically, Win32) executables. It is very similar | |
1855 | to PE, but includes some additional header information. | |
1856 | ||
1857 | @item relocations | |
1858 | Information used by the linker to adjust section contents. Also called | |
1859 | relocs. | |
1860 | ||
1861 | @item section | |
1862 | Object files and executables are composed of sections. Sections have | |
1863 | optional data and optional relocation information. | |
1864 | ||
1865 | @item shared library | |
1866 | A library of functions which may be used by many executables without | |
1867 | actually being linked into each executable. There are several different | |
1868 | implementations of shared libraries, each having slightly different | |
1869 | features. | |
1870 | ||
1871 | @item symbol | |
1872 | Each object file and executable may have a list of symbols, often | |
1873 | referred to as the symbol table. A symbol is basically a name and an | |
1874 | address. There may also be some additional information like the type of | |
1875 | symbol, although the type of a symbol is normally something simple like | |
1876 | function or object, and should not be confused with the more complex C | |
1877 | notion of type. Typically every global function and variable in a C | |
1878 | program will have an associated symbol. | |
1879 | ||
1880 | @item target vector | |
1881 | A set of functions which implement support for a particular object file | |
1882 | format. The @samp{bfd_target} structure. | |
1883 | ||
1884 | @item Win32 | |
1885 | The current Windows API, implemented by Windows 95 and later and Windows | |
1886 | NT 3.51 and later, but not by Windows 3.1. | |
1887 | ||
1888 | @item XCOFF | |
1889 | The eXtended Common Object File Format. Used on AIX. A variant of | |
1890 | COFF, with a completely different symbol table implementation. | |
1891 | ||
1892 | @item VMA | |
1893 | Virtual Memory Address. This is the address a section will have when | |
1894 | an executable is run. Compare with LMA, above. | |
1895 | @end table | |
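The difference between the LMA and VMA entries above can be made concrete with a small sketch. It assumes a ROM-based system in which initialized data is stored in ROM and copied to RAM at startup, which is the usual case where the two addresses differ; on most other systems they are the same. The @samp{section_addrs} structure, the helper functions, and the example addresses are hypothetical and are not part of BFD.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a section's two addresses.  */
struct section_addrs
{
  const char *name;
  uint32_t vma;   /* Address the section has while the program runs.  */
  uint32_t lma;   /* Address at which the section is loaded.  */
  uint32_t size;
};

/* Nonzero if startup code must copy the section from its load
   address to its run-time address.  */
int
needs_startup_copy (const struct section_addrs *sec)
{
  return sec->vma != sec->lma;
}

/* Translate a run-time address inside SEC into the address at which
   that byte sits in the load image.  */
uint32_t
vma_to_lma (const struct section_addrs *sec, uint32_t addr)
{
  return addr - sec->vma + sec->lma;
}

/* Example layout (made-up addresses): .text runs in place from ROM,
   while .data is stored in ROM after it but runs from RAM.  */
const struct section_addrs example_text =
  { ".text", 0x08000000, 0x08000000, 0x4000 };
const struct section_addrs example_data =
  { ".data", 0x20000000, 0x08004000, 0x200 };
```

In this layout only the data section needs a startup copy, because its load address in ROM differs from the RAM address it has while the program runs.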
1896 | ||
1897 | @node Index | |
1898 | @unnumberedsec Index | |
1899 | @printindex cp | |
1900 | ||
1901 | @contents | |
1902 | @bye |