| 1 | \input texinfo |
| 2 | @setfilename ldint.info |
| 3 | |
| 4 | @ifinfo |
| 5 | @format |
| 6 | START-INFO-DIR-ENTRY |
| 7 | * Ld-Internals: (ldint). The GNU linker internals. |
| 8 | END-INFO-DIR-ENTRY |
| 9 | @end format |
| 10 | @end ifinfo |
| 11 | |
| 12 | @ifinfo |
| 13 | This file documents the internals of the GNU linker ld. |
| 14 | |
| 15 | Copyright (C) 1992, 93, 94, 95, 96, 97, 1998 Free Software Foundation, Inc. |
| 16 | Contributed by Cygnus Support. |
| 17 | |
| 18 | Permission is granted to make and distribute verbatim copies of |
| 19 | this manual provided the copyright notice and this permission notice |
| 20 | are preserved on all copies. |
| 21 | |
| 22 | @ignore |
| 23 | Permission is granted to process this file through TeX and print the |
| 24 | results, provided the printed document carries copying permission |
| 25 | notice identical to this one except for the removal of this paragraph |
| 26 | (this paragraph not being relevant to the printed manual). |
| 27 | |
| 28 | @end ignore |
| 29 | Permission is granted to copy or distribute modified versions of this |
| 30 | manual under the terms of the GPL (for which purpose this text may be |
| 31 | regarded as a program in the language TeX). |
| 32 | @end ifinfo |
| 33 | |
| 34 | @iftex |
| 35 | @finalout |
| 36 | @setchapternewpage off |
| 37 | @settitle GNU Linker Internals |
| 38 | @titlepage |
| 39 | @title{A guide to the internals of the GNU linker} |
| 40 | @author Per Bothner, Steve Chamberlain, Ian Lance Taylor, DJ Delorie |
| 41 | @author Cygnus Support |
| 42 | @page |
| 43 | |
| 44 | @tex |
| 45 | \def\$#1${{#1}} % Kluge: collect RCS revision info without $...$ |
| 46 | \xdef\manvers{\$Revision$} % For use in headers, footers too |
| 47 | {\parskip=0pt |
| 48 | \hfill Cygnus Support\par |
| 49 | \hfill \manvers\par |
| 50 | \hfill \TeX{}info \texinfoversion\par |
| 51 | } |
| 52 | @end tex |
| 53 | |
| 54 | @vskip 0pt plus 1filll |
| 55 | Copyright @copyright{} 1992, 93, 94, 95, 96, 97, 1998 |
| 56 | Free Software Foundation, Inc. |
| 57 | |
| 58 | Permission is granted to make and distribute verbatim copies of |
| 59 | this manual provided the copyright notice and this permission notice |
| 60 | are preserved on all copies. |
| 61 | |
| 62 | @end titlepage |
| 63 | @end iftex |
| 64 | |
| 65 | @node Top |
| 66 | @top |
| 67 | |
| 68 | This file documents the internals of the GNU linker @code{ld}. It is a |
| 69 | collection of miscellaneous information with little form at this point. |
| 70 | Mostly, it is a repository into which you can put information about |
| 71 | GNU @code{ld} as you discover it (or as you design changes to @code{ld}). |
| 72 | |
| 73 | @menu |
| 74 | * README:: The README File |
| 75 | * Emulations:: How linker emulations are generated |
| 76 | * Emulation Walkthrough:: A Walkthrough of a Typical Emulation |
| 77 | @end menu |
| 78 | |
| 79 | @node README |
| 80 | @chapter The @file{README} File |
| 81 | |
| 82 | Check the @file{README} file; it often has useful information that does not |
| 83 | appear anywhere else in the directory. |
| 84 | |
| 85 | @node Emulations |
| 86 | @chapter How linker emulations are generated |
| 87 | |
| 88 | Each linker target has an @dfn{emulation}. The emulation includes the |
| 89 | default linker script, and certain emulations also modify certain types |
| 90 | of linker behaviour. |
| 91 | |
| 92 | Emulations are created during the build process by the shell script |
| 93 | @file{genscripts.sh}. |
| 94 | |
| 95 | The @file{genscripts.sh} script starts by reading a file in the |
| 96 | @file{emulparams} directory. This is a shell script which sets various |
| 97 | shell variables used by @file{genscripts.sh} and the other shell scripts |
| 98 | it invokes. |
| 99 | |
| 100 | The @file{genscripts.sh} script will invoke a shell script in the |
| 101 | @file{scripttempl} directory in order to create default linker scripts |
| 102 | written in the linker command language. The @file{scripttempl} script |
| 103 | will be invoked 5 (or, in some cases, 6) times, with different |
| 104 | assignments to shell variables, to create different default scripts. |
| 105 | The choice of script is made based on the command line options. |
| 106 | |
| 107 | After creating the scripts, @file{genscripts.sh} will invoke yet another |
| 108 | shell script, this time in the @file{emultempl} directory. That shell |
| 109 | script will create the emulation source file, which contains C code. |
| 110 | This C code permits the linker emulation to override various linker |
| 111 | behaviours. Most targets use the generic emulation code, which is in |
| 112 | @file{emultempl/generic.em}. |
| 113 | |
| 114 | To summarize, @file{genscripts.sh} reads three shell scripts: an |
| 115 | emulation parameters script in the @file{emulparams} directory, a linker |
| 116 | script generation script in the @file{scripttempl} directory, and an |
| 117 | emulation source file generation script in the @file{emultempl} |
| 118 | directory. |
| 119 | |
| 120 | For example, the Sun 4 linker sets up variables in |
| 121 | @file{emulparams/sun4.sh}, creates linker scripts using |
| 122 | @file{scripttempl/aout.sc}, and creates the emulation code using |
| 123 | @file{emultempl/sunos.em}. |
| 124 | |
| 125 | Note that the linker can support several emulations simultaneously, |
| 126 | depending upon how it is configured. An emulation can be selected with |
| 127 | the @code{-m} option. The @code{-V} option will list all supported |
| 128 | emulations. |
| 129 | |
| 130 | @menu |
| 131 | * emulation parameters:: @file{emulparams} scripts |
| 132 | * linker scripts:: @file{scripttempl} scripts |
| 133 | * linker emulations:: @file{emultempl} scripts |
| 134 | @end menu |
| 135 | |
| 136 | @node emulation parameters |
| 137 | @section @file{emulparams} scripts |
| 138 | |
| 139 | Each target selects a particular file in the @file{emulparams} directory |
| 140 | by setting the shell variable @code{targ_emul} in @file{configure.tgt}. |
| 141 | This shell variable is used by the @file{configure} script to control |
| 142 | building an emulation source file. |
| 143 | |
| 144 | Certain conventions are enforced. Suppose the @code{targ_emul} variable |
| 145 | is set to @var{emul} in @file{configure.tgt}. The name of the emulation |
| 146 | shell script will be @file{emulparams/@var{emul}.sh}. The |
| 147 | @file{Makefile} must have a target named @file{e@var{emul}.c}; this |
| 148 | target must depend upon @file{emulparams/@var{emul}.sh}, as well as the |
| 149 | appropriate scripts in the @file{scripttempl} and @file{emultempl} |
| 150 | directories. The @file{Makefile} target must invoke @code{GENSCRIPTS} |
| 151 | with two arguments: @var{emul}, and the value of the make variable |
| 152 | @code{tdir_@var{emul}}. The value of the latter variable will be set by |
| 153 | the @file{configure} script, and is used to set the default target |
| 154 | directory to search. |
| 155 | |
| 156 | By convention, the @file{emulparams/@var{emul}.sh} shell script should |
| 157 | only set shell variables. It may set shell variables which are to be |
| 158 | interpreted by the @file{scripttempl} and the @file{emultempl} scripts. |
| 159 | Certain shell variables are interpreted directly by the |
| 160 | @file{genscripts.sh} script. |
| 161 | |
| 162 | Here is a list of shell variables interpreted by @file{genscripts.sh}, |
| 163 | as well as some conventional shell variables interpreted by the |
| 164 | @file{scripttempl} and @file{emultempl} scripts. |
| 165 | |
| 166 | @table @code |
| 167 | @item SCRIPT_NAME |
| 168 | This is the name of the @file{scripttempl} script to use. If |
| 169 | @code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will use |
| 170 | the script @file{scripttempl/@var{script}.sc}. |
| 171 | |
| 172 | @item TEMPLATE_NAME |
| 173 | This is the name of the @file{emultempl} script to use. If |
| 174 | @code{TEMPLATE_NAME} is set to @var{template}, @file{genscripts.sh} will |
| 175 | use the script @file{emultempl/@var{template}.em}. If this variable is |
| 176 | not set, the default value is @samp{generic}. |
| 177 | |
| 178 | @item GENERATE_SHLIB_SCRIPT |
| 179 | If this is set to a nonempty string, @file{genscripts.sh} will invoke |
| 180 | the @file{scripttempl} script an extra time to create a shared library |
| 181 | script. @xref{linker scripts}. |
| 182 | |
| 183 | @item OUTPUT_FORMAT |
| 184 | This is normally set to indicate the BFD output format to use (e.g., |
| 185 | @samp{"a.out-sunos-big"}). The @file{scripttempl} script will normally |
| 186 | use it in an @code{OUTPUT_FORMAT} expression in the linker script. |
| 187 | |
| 188 | @item ARCH |
| 189 | This is normally set to indicate the architecture to use (e.g., |
| 190 | @samp{sparc}). The @file{scripttempl} script will normally use it in an |
| 191 | @code{OUTPUT_ARCH} expression in the linker script. |
| 192 | |
| 193 | @item ENTRY |
| 194 | Some @file{scripttempl} scripts use this to set the entry address, in an |
| 195 | @code{ENTRY} expression in the linker script. |
| 196 | |
| 197 | @item TEXT_START_ADDR |
| 198 | Some @file{scripttempl} scripts use this to set the start address of the |
| 199 | @samp{.text} section. |
| 200 | |
| 201 | @item NONPAGED_TEXT_START_ADDR |
| 202 | If this is defined, the @file{genscripts.sh} script sets |
| 203 | @code{TEXT_START_ADDR} to its value before running the |
| 204 | @file{scripttempl} script for the @code{-n} and @code{-N} options |
| 205 | (@pxref{linker scripts}). |
| 206 | |
| 207 | @item SEGMENT_SIZE |
| 208 | The @file{genscripts.sh} script uses this to set the default value of |
| 209 | @code{DATA_ALIGNMENT} when running the @file{scripttempl} script. |
| 210 | |
| 211 | @item TARGET_PAGE_SIZE |
| 212 | If @code{SEGMENT_SIZE} is not defined, the @file{genscripts.sh} script |
| 213 | uses this to define it. |
| 214 | |
| 215 | @item ALIGNMENT |
| 216 | Some @file{scripttempl} scripts set this to a number to pass to |
| 217 | @code{ALIGN} to set the required alignment for the @code{end} symbol. |
| 218 | @end table |
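As an illustration of the conventions above, here is a minimal sketch of what an @file{emulparams} script might look like. The file name @file{emulparams/myemul.sh} and every value in it are assumptions chosen for the example, not copied from a real target. The last line is not part of the @file{emulparams} script; it only sketches the documented fallback from @code{TARGET_PAGE_SIZE} to @code{SEGMENT_SIZE}, and the real @file{genscripts.sh} may do this differently.

```shell
# Hypothetical emulparams/myemul.sh: it only sets shell variables.
# All values are illustrative assumptions, not taken from a real target.
SCRIPT_NAME=aout                 # selects scripttempl/aout.sc
TEMPLATE_NAME=generic            # selects emultempl/generic.em (also the default)
OUTPUT_FORMAT="a.out-sunos-big"  # BFD output format name
ARCH=sparc                       # architecture used for OUTPUT_ARCH
TEXT_START_ADDR=0x2020           # start address of the .text section
TARGET_PAGE_SIZE=0x2000          # used for SEGMENT_SIZE when that is unset

# Sketch of the documented fallback (assumed form, performed by the caller,
# not by the emulparams script itself).
SEGMENT_SIZE=${SEGMENT_SIZE-${TARGET_PAGE_SIZE}}
```

Because the script contains nothing but assignments, it can be sourced by any of the generation scripts without side effects.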
| 219 | |
| 220 | @node linker scripts |
| 221 | @section @file{scripttempl} scripts |
| 222 | |
| 223 | Each linker target uses a @file{scripttempl} script to generate the |
| 224 | default linker scripts. The name of the @file{scripttempl} script is |
| 225 | set by the @code{SCRIPT_NAME} variable in the @file{emulparams} script. |
| 226 | If @code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will |
| 227 | invoke @file{scripttempl/@var{script}.sc}. |
| 228 | |
| 229 | The @file{genscripts.sh} script will invoke the @file{scripttempl} |
| 230 | script 5 or 6 times. Each time it will set the shell variable |
| 231 | @code{LD_FLAG} to a different value. When the linker is run, the |
| 232 | options used will direct it to select a particular script. (Script |
| 233 | selection is controlled by the @code{get_script} emulation entry point; |
| 234 | this describes the conventional behaviour). |
| 235 | |
| 236 | The @file{scripttempl} script should just write a linker script, written |
| 237 | in the linker command language, to standard output. If the emulation |
| 238 | name--the name of the @file{emulparams} file without the @file{.sh} |
| 239 | extension--is @var{emul}, then the output will be directed to |
| 240 | @file{ldscripts/@var{emul}.@var{extension}} in the build directory, |
| 241 | where @var{extension} changes each time the @file{scripttempl} script is |
| 242 | invoked. |
| 243 | |
| 244 | Here is the list of values assigned to @code{LD_FLAG}. |
| 245 | |
| 246 | @table @code |
| 247 | @item (empty) |
| 248 | The script generated is used by default (when none of the following |
| 249 | cases apply). The output has an extension of @file{.x}. |
| 250 | @item n |
| 251 | The script generated is used when the linker is invoked with the |
| 252 | @code{-n} option. The output has an extension of @file{.xn}. |
| 253 | @item N |
| 254 | The script generated is used when the linker is invoked with the |
| 255 | @code{-N} option. The output has an extension of @file{.xbn}. |
| 256 | @item r |
| 257 | The script generated is used when the linker is invoked with the |
| 258 | @code{-r} option. The output has an extension of @file{.xr}. |
| 259 | @item u |
| 260 | The script generated is used when the linker is invoked with the |
| 261 | @code{-Ur} option. The output has an extension of @file{.xu}. |
| 262 | @item shared |
| 263 | The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to |
| 264 | this value if @code{GENERATE_SHLIB_SCRIPT} is defined in the |
| 265 | @file{emulparams} file. The @file{emultempl} script must arrange to use |
| 266 | this script at the appropriate time, normally when the linker is invoked |
| 267 | with the @code{-shared} option. The output has an extension of |
| 268 | @file{.xs}. |
| 269 | @end table |
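The mapping in the table above can be summarized as a small shell function. This is only a sketch: the function name @code{script_extension} and the emulation name @samp{myemul} are hypothetical, and the real @file{genscripts.sh} is not organized this way.

```shell
# Map an LD_FLAG value to the extension of the generated linker script,
# following the table above. script_extension is a hypothetical helper.
script_extension() {
  case "$1" in
    "")     echo x   ;;
    n)      echo xn  ;;
    N)      echo xbn ;;
    r)      echo xr  ;;
    u)      echo xu  ;;
    shared) echo xs  ;;
    *)      return 1 ;;   # not a documented LD_FLAG value
  esac
}

EMULATION_NAME=myemul   # hypothetical emulation name
# The five runs that always happen; "shared" is the optional sixth run.
for LD_FLAG in "" n N r u; do
  echo "ldscripts/${EMULATION_NAME}.$(script_extension "$LD_FLAG")"
done
```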
| 270 | |
| 271 | Besides the shell variables set by the @file{emulparams} script, and the |
| 272 | @code{LD_FLAG} variable, the @file{genscripts.sh} script will set |
| 273 | certain variables for each run of the @file{scripttempl} script. |
| 274 | |
| 275 | @table @code |
| 276 | @item RELOCATING |
| 277 | This will be set to a non-empty string when the linker is doing a final |
| 278 | relocation (i.e., all scripts other than @code{-r} and @code{-Ur}). |
| 279 | |
| 280 | @item CONSTRUCTING |
| 281 | This will be set to a non-empty string when the linker is building |
| 282 | global constructor and destructor tables (i.e., all scripts other than |
| 283 | @code{-r}). |
| 284 | |
| 285 | @item DATA_ALIGNMENT |
| 286 | This will be set to an @code{ALIGN} expression when the output should be |
| 287 | page aligned, or to @samp{.} when generating the @code{-N} script. |
| 288 | |
| 289 | @item CREATE_SHLIB |
| 290 | This will be set to a non-empty string when generating a @code{-shared} |
| 291 | script. |
| 292 | @end table |
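The following sketch shows one way the per-run variables described above could be derived from @code{LD_FLAG}. The helper @code{set_run_variables}, the value @samp{yes}, and the @code{SEGMENT_SIZE} value are assumptions made for the example. In particular, the text above only says that @code{DATA_ALIGNMENT} is @samp{.} for the @code{-N} script; using an @code{ALIGN} expression for every other run is a simplification.

```shell
SEGMENT_SIZE=0x2000   # hypothetical value from an emulparams script

# Hypothetical helper: set the per-run variables for one LD_FLAG value.
set_run_variables() {
  unset RELOCATING CONSTRUCTING CREATE_SHLIB
  DATA_ALIGNMENT="ALIGN(${SEGMENT_SIZE})"   # assumed for every run except N
  case "$1" in
    r)      ;;                                # -r: neither variable is set
    u)      CONSTRUCTING=yes ;;               # -Ur: constructor tables only
    N)      RELOCATING=yes; CONSTRUCTING=yes; DATA_ALIGNMENT="." ;;
    shared) RELOCATING=yes; CONSTRUCTING=yes; CREATE_SHLIB=yes ;;
    *)      RELOCATING=yes; CONSTRUCTING=yes ;;
  esac
}

# Print the resulting settings for each run.
for flag in "" n N r u shared; do
  set_run_variables "$flag"
  echo "LD_FLAG='$flag' RELOCATING=${RELOCATING-} CONSTRUCTING=${CONSTRUCTING-} DATA_ALIGNMENT=$DATA_ALIGNMENT"
done
```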
| 293 | |
| 294 | The conventional way to write a @file{scripttempl} script is to first |
| 295 | set a few shell variables, and then write out a linker script using |
| 296 | @code{cat} with a here document. The linker script will use variable |
| 297 | substitutions, based on the above variables and those set in the |
| 298 | @file{emulparams} script, to control its behaviour. |
| 299 | |
| 300 | When there are parts of the @file{scripttempl} script which should only |
| 301 | be run when doing a final relocation, they should be enclosed within a |
| 302 | variable substitution based on @code{RELOCATING}. For example, on many |
| 303 | targets special symbols such as @code{_end} should be defined when doing |
| 304 | a final link. Naturally, those symbols should not be defined when doing |
| 305 | a relocatable link using @code{-r}. The @file{scripttempl} script |
| 306 | could use a construct like this to define those symbols: |
| 307 | @smallexample |
| 308 | $@{RELOCATING+ _end = .;@} |
| 309 | @end smallexample |
| 310 | This will do the symbol assignment only if the @code{RELOCATING} |
| 311 | variable is defined. |
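To see the effect of this substitution outside the build, the sketch below expands the same here document twice, once with @code{RELOCATING} set and once with it unset. The function name @code{emit_fragment} and the surrounding @samp{.bss} statement are assumptions chosen for the example.

```shell
# Hypothetical scripttempl fragment: write part of a linker script with cat
# and a here document, as scripttempl scripts conventionally do.
emit_fragment() {
  cat <<EOF
  .bss :
  {
    *(.bss)
    ${RELOCATING+ _end = .;}
  }
EOF
}

RELOCATING=yes
final=$(emit_fragment)      # final link: the _end assignment is present

unset RELOCATING
partial=$(emit_fragment)    # -r style link: the line expands to nothing

printf '%s\n---\n%s\n' "$final" "$partial"
```

The @samp{$@{@var{var}+@var{word}@}} form expands to @var{word} whenever @var{var} is set, which is why the variable only needs to be set, whatever its value.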
| 312 | |
| 313 | The basic job of the linker script is to put the sections in the correct |
| 314 | order, and at the correct memory addresses. For some targets, the |
| 315 | linker script may have to do some other operations. |
| 316 | |
| 317 | For example, on most MIPS platforms, the linker is responsible for |
| 318 | defining the special symbol @code{_gp}, used to initialize the |
| 319 | @code{$gp} register. It must be set to the start of the small data |
| 320 | section plus @code{0x8000}. Naturally, it should only be defined when |
| 321 | doing a final relocation. This will typically be done like this: |
| 322 | @smallexample |
| 323 | $@{RELOCATING+ _gp = ALIGN(16) + 0x8000;@} |
| 324 | @end smallexample |
| 325 | This line would appear just before the sections which compose the small |
| 326 | data section (@samp{.sdata}, @samp{.sbss}). All those sections would be |
| 327 | contiguous in memory. |
| 328 | |
| 329 | Many COFF systems build constructor tables in the linker script. The |
| 330 | compiler will arrange to output the address of each global constructor |
| 331 | in a @samp{.ctors} section, and the address of each global destructor in |
| 332 | a @samp{.dtors} section (this is done by defining |
| 333 | @code{ASM_OUTPUT_CONSTRUCTOR} and @code{ASM_OUTPUT_DESTRUCTOR} in the |
| 334 | @code{gcc} configuration files). The @code{gcc} runtime support |
| 335 | routines expect the constructor table to be named @code{__CTOR_LIST__}. |
| 336 | They expect it to be a list of words, with the first word being the |
| 337 | count of the number of entries. There should be a trailing zero word. |
| 338 | (Actually, the count may be -1 if the trailing word is present, and the |
| 339 | trailing word may be omitted if the count is correct, but, as the |
| 340 | @code{gcc} behaviour has changed slightly over the years, it is safest |
| 341 | to provide both). Here is a typical way that might be handled in a |
| 342 | @file{scripttempl} file. |
| 343 | @smallexample |
| 344 | $@{CONSTRUCTING+ __CTOR_LIST__ = .;@} |
| 345 | $@{CONSTRUCTING+ LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)@} |
| 346 | $@{CONSTRUCTING+ *(.ctors)@} |
| 347 | $@{CONSTRUCTING+ LONG(0)@} |
| 348 | $@{CONSTRUCTING+ __CTOR_END__ = .;@} |
| 349 | $@{CONSTRUCTING+ __DTOR_LIST__ = .;@} |
| 350 | $@{CONSTRUCTING+ LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)@} |
| 351 | $@{CONSTRUCTING+ *(.dtors)@} |
| 352 | $@{CONSTRUCTING+ LONG(0)@} |
| 353 | $@{CONSTRUCTING+ __DTOR_END__ = .;@} |
| 354 | @end smallexample |
| 355 | The use of @code{CONSTRUCTING} ensures that these linker script commands |
| 356 | will only appear when the linker is supposed to be building the |
| 357 | constructor and destructor tables. This example is written for a target |
| 358 | which uses 4 byte pointers. |
| 359 | |
| 360 | Embedded systems often need to set a stack address. This is normally |
| 361 | best done by using the @code{PROVIDE} construct with a default stack |
| 362 | address. This permits the user to easily override the stack address |
| 363 | using the @code{--defsym} option. Here is an example: |
| 364 | @smallexample |
| 365 | $@{RELOCATING+ PROVIDE (__stack = 0x80000000);@} |
| 366 | @end smallexample |
| 367 | The value of the symbol @code{__stack} would then be used in the startup |
| 368 | code to initialize the stack pointer. |
| 369 | |
| 370 | @node linker emulations |
| 371 | @section @file{emultempl} scripts |
| 372 | |
| 373 | Each linker target uses an @file{emultempl} script to generate the |
| 374 | emulation code. The name of the @file{emultempl} script is set by the |
| 375 | @code{TEMPLATE_NAME} variable in the @file{emulparams} script. If the |
| 376 | @code{TEMPLATE_NAME} variable is not set, the default is |
| 377 | @samp{generic}. If the value of @code{TEMPLATE_NAME} is @var{template}, |
| 378 | @file{genscripts.sh} will use @file{emultempl/@var{template}.em}. |
| 379 | |
| 380 | Most targets use the generic @file{emultempl} script, |
| 381 | @file{emultempl/generic.em}. A different @file{emultempl} script is |
| 382 | only needed if the linker must support unusual actions, such as linking |
| 383 | against shared libraries. |
| 384 | |
| 385 | The @file{emultempl} script is normally written as a simple invocation |
| 386 | of @code{cat} with a here document. The document will use a few |
| 387 | variable substitutions. Typically each function name uses a |
| 388 | substitution involving @code{EMULATION_NAME}, for ease of debugging when |
| 389 | the linker supports multiple emulations. |
| 390 | |
| 391 | Every function and variable in the emitted file should be static. The |
| 392 | only globally visible object must be named |
| 393 | @code{ld_@var{EMULATION_NAME}_emulation}, where @var{EMULATION_NAME} is |
| 394 | the name of the emulation set in @file{configure.tgt} (this is also the |
| 395 | name of the @file{emulparams} file without the @file{.sh} extension). |
| 396 | The @file{genscripts.sh} script will set the shell variable |
| 397 | @code{EMULATION_NAME} before invoking the @file{emultempl} script. |
| 398 | |
| 399 | The @code{ld_@var{EMULATION_NAME}_emulation} variable must be a |
| 400 | @code{struct ld_emulation_xfer_struct}, as defined in @file{ldemul.h}. |
| 401 | It defines a set of function pointers which are invoked by the linker, |
| 402 | as well as strings for the emulation name (normally set from the shell |
| 403 | variable @code{EMULATION_NAME}) and the default BFD target name (normally |
| 404 | set from the shell variable @code{OUTPUT_FORMAT}, which is normally set |
| 405 | by the @file{emulparams} file). |
| 406 | |
| 407 | The @file{genscripts.sh} script will set the shell variable |
| 408 | @code{COMPILE_IN} when it invokes the @file{emultempl} script for the |
| 409 | default emulation. In this case, the @file{emultempl} script should |
| 410 | include the linker scripts directly, and return them from the |
| 411 | @code{get_script} entry point. When the emulation is not the default, |
| 412 | the @code{get_script} entry point should just return a file name. See |
| 413 | @file{emultempl/generic.em} for an example of how this is done. |
| 414 | |
| 415 | At some point, the linker emulation entry points should be documented. |
| 416 | |
| 417 | @node Emulation Walkthrough |
| 418 | @chapter A Walkthrough of a Typical Emulation |
| 419 | |
| 420 | This chapter is to help people who are new to the way emulations |
| 421 | interact with the linker, or who are suddenly thrust into the position |
| 422 | of having to work with existing emulations. It will discuss the files |
| 423 | you need to be aware of. It will tell you when the given ``hooks'' in |
| 424 | the emulation will be called. It will, hopefully, give you enough |
| 425 | information about when and how things happen that you'll be able to |
| 426 | get by. As always, the source is the definitive reference to this. |
| 427 | |
| 428 | The starting point for the linker is in @file{ldmain.c} where |
| 429 | @code{main} is defined. The bulk of the code that's emulation |
| 430 | specific will initially be in @file{emultempl/@var{emulation}.em} but |
| 431 | will end up in @file{e@var{emulation}.c} when the build is done. |
| 432 | Most of the work to select and interface with emulations is in |
| 433 | @file{ldemul.h} and @file{ldemul.c}. Specifically, @file{ldemul.h} |
| 434 | defines the @code{ld_emulation_xfer_struct} structure your emulation |
| 435 | exports. |
| 436 | |
| 437 | Your emulation file exports a symbol |
| 438 | @code{ld_@var{EMULATION_NAME}_emulation}. If your emulation is |
| 439 | selected (it usually is, since usually there's only one), |
| 440 | @file{ldemul.c} sets the variable @code{ld_emulation} to point to it. |
| 441 | @file{ldemul.c} also defines a number of API functions that interface |
| 442 | to your emulation, like @code{ldemul_after_parse} which simply calls |
| 443 | your @code{ld_@var{EMULATION_NAME}_emulation.after_parse} function. For |
| 444 | the rest of this section, the functions will be mentioned, but you |
| 445 | should assume the indirect reference to your emulation also. |
| 446 | |
| 447 | We will also skip or gloss over parts of the link process that don't |
| 448 | relate to emulations, like setting up internationalization. |
| 449 | |
| 450 | After initialization, @code{main} selects an emulation by pre-scanning |
| 451 | the command line arguments. It calls @code{ldemul_choose_target} to |
| 452 | choose a target. If you set @code{choose_target} to |
| 453 | @code{ldemul_default_target}, it picks your @code{target_name} by |
| 454 | default. |
| 455 | |
| 456 | @code{main} calls @code{ldemul_before_parse}, then @code{parse_args}. |
| 457 | @code{parse_args} calls @code{ldemul_parse_args} for each arg, which |
| 458 | must update the @code{getopt} globals if it recognizes the argument. |
| 459 | If the emulation doesn't recognize it, then @code{parse_args} checks to see |
| 460 | if it recognizes it. |
| 461 | |
| 462 | Now that the emulation has had access to all its command-line options, |
| 463 | @code{main} calls @code{ldemul_set_symbols}. This can be used for any |
| 464 | initialization that may be affected by options. It is also supposed |
| 465 | to set up any variables needed by the emulation script. |
| 466 | |
| 467 | @code{main} now calls @code{ldemul_get_script} to get the emulation |
| 468 | script to use (based on arguments, no doubt, @pxref{Emulations}) and |
| 469 | runs it. While parsing, @file{ldgram.y} may call @code{ldemul_hll} or |
| 470 | @code{ldemul_syslib} to handle the @code{HLL} or @code{SYSLIB} |
| 471 | commands. It may call @code{ldemul_unrecognized_file} if you asked |
| 472 | the linker to link a file it doesn't recognize. It will call |
| 473 | @code{ldemul_recognized_file} for each file it does recognize, in case |
| 474 | the emulation wants to handle some files specially. All the while, |
| 475 | it's loading the files (possibly calling |
| 476 | @code{ldemul_open_dynamic_archive}) and symbols and stuff. After it's |
| 477 | done reading the script, @code{main} calls @code{ldemul_after_parse}. |
| 478 | Use the after-parse hook to set up anything that depends on stuff the |
| 479 | script might have set up, like the entry point. |
| 480 | |
| 481 | @code{main} next calls @code{lang_process} in @file{ldlang.c}. This |
| 482 | appears to be the main core of the linking itself, as far as emulation |
| 483 | hooks are concerned(*). It first opens the output file's BFD, calling |
| 484 | @code{ldemul_set_output_arch}, and calls |
| 485 | @code{ldemul_create_output_section_statements} in case you need to use |
| 486 | other means to find or create object files (e.g., shared libraries |
| 487 | found on a path, or fake stub objects). Despite the name, nobody |
| 488 | creates output sections here. |
| 489 | |
| 490 | (*) In most cases, the BFD library does the bulk of the actual |
| 491 | linking, handling symbol tables, symbol resolution, relocations, and |
| 492 | building the final output file. See the BFD reference for all the |
| 493 | details. Your emulation is usually concerned more with managing |
| 494 | things at the file and section level, like ``put this here, add this |
| 495 | section'', etc. |
| 496 | |
| 497 | Next, the objects to be linked are opened and BFDs created for them, |
| 498 | and @code{ldemul_after_open} is called. At this point, you have all |
| 499 | the objects and symbols loaded, but none of the data has been placed |
| 500 | yet. |
| 501 | |
| 502 | Next comes the Big Linking Thingy (except for the parts BFD does). |
| 503 | All input sections are mapped to output sections according to the |
| 504 | script. If a section doesn't get mapped by default, |
| 505 | @code{ldemul_place_orphan} will get called to figure out where it goes. |
| 506 | Next it figures out the offsets for each section, calling |
| 507 | @code{ldemul_before_allocation} before and |
| 508 | @code{ldemul_after_allocation} after deciding where each input section |
| 509 | ends up in the output sections. |
| 510 | |
| 511 | The last part of @code{lang_process} is to figure out all the symbols' |
| 512 | values. After assigning final values to the symbols, |
| 513 | @code{ldemul_finish} is called, and after that, any undefined symbols |
| 514 | are turned into fatal errors. |
| 515 | |
| 516 | OK, back to @code{main}, which calls @code{ldwrite} in |
| 517 | @file{ldwrite.c}. @code{ldwrite} calls BFD's @code{final_link}, which does |
| 518 | all the relocation fixups and writes the output BFD to disk, and we're |
| 519 | done. |
| 520 | |
| 521 | In summary, |
| 522 | |
| 523 | @itemize @bullet |
| 524 | |
| 525 | @item @code{main()} in @file{ldmain.c} |
| 526 | @item @file{emultempl/@var{EMULATION}.em} has your code |
| 527 | @item @code{ldemul_choose_target} (defaults to your @code{target_name}) |
| 528 | @item @code{ldemul_before_parse} |
| 529 | @item Parse argv, calls @code{ldemul_parse_args} for each |
| 530 | @item @code{ldemul_set_symbols} |
| 531 | @item @code{ldemul_get_script} |
| 532 | @item parse script |
| 533 | |
| 534 | @itemize @bullet |
| 535 | @item may call @code{ldemul_hll} or @code{ldemul_syslib} |
| 536 | @item may call @code{ldemul_open_dynamic_archive} |
| 537 | @end itemize |
| 538 | |
| 539 | @item @code{ldemul_after_parse} |
| 540 | @item @code{lang_process()} in @file{ldlang.c} |
| 541 | |
| 542 | @itemize @bullet |
| 543 | @item create @code{output_bfd} |
| 544 | @item @code{ldemul_set_output_arch} |
| 545 | @item @code{ldemul_create_output_section_statements} |
| 546 | @item read objects, create input BFDs - all symbols exist, but have no values |
| 547 | @item may call @code{ldemul_unrecognized_file} |
| 548 | @item will call @code{ldemul_recognized_file} |
| 549 | @item @code{ldemul_after_open} |
| 550 | @item map input sections to output sections |
| 551 | @item may call @code{ldemul_place_orphan} for remaining sections |
| 552 | @item @code{ldemul_before_allocation} |
| 553 | @item gives input sections offsets into output sections, places output sections |
| 554 | @item @code{ldemul_after_allocation} - section addresses valid |
| 555 | @item assigns values to symbols |
| 556 | @item @code{ldemul_finish} - symbol values valid |
| 557 | @end itemize |
| 558 | |
| 559 | @item output BFD is written to disk |
| 560 | |
| 561 | @end itemize |
| 562 | |
| 563 | @contents |
| 564 | @bye |