\input texinfo
@setfilename ldint.info
@c Copyright 1992, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
@c 2003, 2005, 2006, 2007
@c Free Software Foundation, Inc.

@ifinfo
@format
START-INFO-DIR-ENTRY
* Ld-Internals: (ldint). The GNU linker internals.
END-INFO-DIR-ENTRY
@end format
@end ifinfo

@copying
This file documents the internals of the GNU linker ld.

Copyright @copyright{} 1992, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2007
Free Software Foundation, Inc.
Contributed by Cygnus Support.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``GNU General Public License'' and ``Funding
Free Software'', the Front-Cover texts being (a) (see below), and with
the Back-Cover Texts being (b) (see below). A copy of the license is
included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software. Copies published by the Free Software Foundation raise
     funds for GNU development.
@end copying

@iftex
@finalout
@setchapternewpage off
@settitle GNU Linker Internals
@titlepage
@title{A guide to the internals of the GNU linker}
@author Per Bothner, Steve Chamberlain, Ian Lance Taylor, DJ Delorie
@author Cygnus Support
@page

@tex
\def\$#1${{#1}} % Kluge: collect RCS revision info without $...$
\xdef\manvers{2.10.91} % For use in headers, footers too
{\parskip=0pt
\hfill Cygnus Support\par
\hfill \manvers\par
\hfill \TeX{}info \texinfoversion\par
}
@end tex

@vskip 0pt plus 1filll
Copyright @copyright{} 1992, 1993, 1994, 1995, 1996, 1997, 1998, 2000
Free Software Foundation, Inc.

      Permission is granted to copy, distribute and/or modify this document
      under the terms of the GNU Free Documentation License, Version 1.3
      or any later version published by the Free Software Foundation;
      with no Invariant Sections, with no Front-Cover Texts, and with no
      Back-Cover Texts. A copy of the license is included in the
      section entitled "GNU Free Documentation License".

@end titlepage
@end iftex

@node Top
@top

This file documents the internals of the GNU linker @code{ld}. It is a
collection of miscellaneous information with little form at this point.
Mostly, it is a repository into which you can put information about
GNU @code{ld} as you discover it (or as you design changes to @code{ld}).

This document is distributed under the terms of the GNU Free
Documentation License. A copy of the license is included in the
section entitled "GNU Free Documentation License".

@menu
* README:: The README File
* Emulations:: How linker emulations are generated
* Emulation Walkthrough:: A Walkthrough of a Typical Emulation
* Architecture Specific:: Some Architecture Specific Notes
* GNU Free Documentation License:: GNU Free Documentation License
@end menu

@node README
@chapter The @file{README} File

Check the @file{README} file; it often has useful information that does not
appear anywhere else in the directory.

@node Emulations
@chapter How linker emulations are generated

Each linker target has an @dfn{emulation}. The emulation includes the
default linker script, and certain emulations also modify certain types
of linker behaviour.

Emulations are created during the build process by the shell script
@file{genscripts.sh}.

The @file{genscripts.sh} script starts by reading a file in the
@file{emulparams} directory. This is a shell script which sets various
shell variables used by @file{genscripts.sh} and the other shell scripts
it invokes.

The @file{genscripts.sh} script will invoke a shell script in the
@file{scripttempl} directory in order to create default linker scripts
written in the linker command language. The @file{scripttempl} script
will be invoked at least 5 times (and up to 9 times, depending on which
optional scripts the emulation requests), with different
assignments to shell variables, to create different default scripts.
When the linker runs, the choice of script is made based on the command
line options.

After creating the scripts, @file{genscripts.sh} will invoke yet another
shell script, this time in the @file{emultempl} directory. That shell
script will create the emulation source file, which contains C code.
This C code permits the linker emulation to override various linker
behaviours. Most targets use the generic emulation code, which is in
@file{emultempl/generic.em}.

To summarize, @file{genscripts.sh} reads three shell scripts: an
emulation parameters script in the @file{emulparams} directory, a linker
script generation script in the @file{scripttempl} directory, and an
emulation source file generation script in the @file{emultempl}
directory.

For example, the Sun 4 linker sets up variables in
@file{emulparams/sun4.sh}, creates linker scripts using
@file{scripttempl/aout.sc}, and creates the emulation code using
@file{emultempl/sunos.em}.

Note that the linker can support several emulations simultaneously,
depending upon how it is configured. An emulation can be selected with
the @code{-m} option. The @code{-V} option will list all supported
emulations.
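
As a brief illustration of the two options just mentioned, the commands
below might be used. The emulation name @samp{elf_i386} and the file
names are assumptions made for the example; the names actually available
depend on how the linker was configured.
@smallexample
# List the emulations this ld binary was configured with.
ld -V

# Link using one specific emulation.  elf_i386 is only an
# illustration; use a name that appears in the -V output.
ld -m elf_i386 -o prog crt0.o main.o
@end smallexample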

@menu
* emulation parameters:: @file{emulparams} scripts
* linker scripts:: @file{scripttempl} scripts
* linker emulations:: @file{emultempl} scripts
@end menu

@node emulation parameters
@section @file{emulparams} scripts

Each target selects a particular file in the @file{emulparams} directory
by setting the shell variable @code{targ_emul} in @file{configure.tgt}.
This shell variable is used by the @file{configure} script to control
building an emulation source file.

Certain conventions are enforced. Suppose the @code{targ_emul} variable
is set to @var{emul} in @file{configure.tgt}. The name of the emulation
shell script will be @file{emulparams/@var{emul}.sh}. The
@file{Makefile} must have a target named @file{e@var{emul}.c}; this
target must depend upon @file{emulparams/@var{emul}.sh}, as well as the
appropriate scripts in the @file{scripttempl} and @file{emultempl}
directories. The @file{Makefile} target must invoke @code{GENSCRIPTS}
with two arguments: @var{emul}, and the value of the make variable
@code{tdir_@var{emul}}. The value of the latter variable will be set by
the @file{configure} script, and is used to set the default target
directory to search.

By convention, the @file{emulparams/@var{emul}.sh} shell script should
only set shell variables. It may set shell variables which are to be
interpreted by the @file{scripttempl} and the @file{emultempl} scripts.
Certain shell variables are interpreted directly by the
@file{genscripts.sh} script.

Here is a list of shell variables interpreted by @file{genscripts.sh},
as well as some conventional shell variables interpreted by the
@file{scripttempl} and @file{emultempl} scripts.

@table @code
@item SCRIPT_NAME
This is the name of the @file{scripttempl} script to use. If
@code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will use
the script @file{scripttempl/@var{script}.sc}.

@item TEMPLATE_NAME
This is the name of the @file{emultempl} script to use. If
@code{TEMPLATE_NAME} is set to @var{template}, @file{genscripts.sh} will
use the script @file{emultempl/@var{template}.em}. If this variable is
not set, the default value is @samp{generic}.

@item GENERATE_SHLIB_SCRIPT
If this is set to a nonempty string, @file{genscripts.sh} will invoke
the @file{scripttempl} script an extra time to create a shared library
script. @xref{linker scripts}.

@item OUTPUT_FORMAT
This is normally set to indicate the BFD output format to use (e.g.,
@samp{"a.out-sunos-big"}). The @file{scripttempl} script will normally
use it in an @code{OUTPUT_FORMAT} expression in the linker script.

@item ARCH
This is normally set to indicate the architecture to use (e.g.,
@samp{sparc}). The @file{scripttempl} script will normally use it in an
@code{OUTPUT_ARCH} expression in the linker script.

@item ENTRY
Some @file{scripttempl} scripts use this to set the entry address, in an
@code{ENTRY} expression in the linker script.

@item TEXT_START_ADDR
Some @file{scripttempl} scripts use this to set the start address of the
@samp{.text} section.

@item SEGMENT_SIZE
The @file{genscripts.sh} script uses this to set the default value of
@code{DATA_ALIGNMENT} when running the @file{scripttempl} script.

@item TARGET_PAGE_SIZE
If @code{SEGMENT_SIZE} is not defined, the @file{genscripts.sh} script
uses this to define it.

@item ALIGNMENT
Some @file{scripttempl} scripts set this to a number to pass to
@code{ALIGN} to set the required alignment for the @code{end} symbol.
@end table
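
To make the conventions above concrete, here is a minimal sketch of what
an @file{emulparams} file might look like. The file name
@file{myemul.sh} and the two addresses are hypothetical; the script and
template names are the ones used in the Sun 4 example earlier in this
chapter.
@smallexample
# emulparams/myemul.sh (hypothetical): variable assignments only.
SCRIPT_NAME=aout                  # use scripttempl/aout.sc
TEMPLATE_NAME=sunos               # use emultempl/sunos.em
OUTPUT_FORMAT="a.out-sunos-big"   # BFD output format
ARCH=sparc                        # used in OUTPUT_ARCH
TARGET_PAGE_SIZE=0x2000           # illustrative value
TEXT_START_ADDR=0x2020            # illustrative value
@end smallexample
Because the file only assigns variables, @file{genscripts.sh} and the
scripts it invokes can read it without side effects.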

@node linker scripts
@section @file{scripttempl} scripts

Each linker target uses a @file{scripttempl} script to generate the
default linker scripts. The name of the @file{scripttempl} script is
set by the @code{SCRIPT_NAME} variable in the @file{emulparams} script.
If @code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will
invoke @file{scripttempl/@var{script}.sc}.

The @file{genscripts.sh} script will invoke the @file{scripttempl}
script 5 to 9 times. Each time it will set the shell variable
@code{LD_FLAG} to a different value. When the linker is run, the
options used will direct it to select a particular script. (Script
selection is controlled by the @code{get_script} emulation entry point;
this describes the conventional behaviour.)

The @file{scripttempl} script should just write a linker script, written
in the linker command language, to standard output. If the emulation
name--the name of the @file{emulparams} file without the @file{.sh}
extension--is @var{emul}, then the output will be directed to
@file{ldscripts/@var{emul}.@var{extension}} in the build directory,
where @var{extension} changes each time the @file{scripttempl} script is
invoked.

Here is the list of values assigned to @code{LD_FLAG}.

@table @code
@item (empty)
The script generated is used by default (when none of the following
cases apply). The output has an extension of @file{.x}.
@item n
The script generated is used when the linker is invoked with the
@code{-n} option. The output has an extension of @file{.xn}.
@item N
The script generated is used when the linker is invoked with the
@code{-N} option. The output has an extension of @file{.xbn}.
@item r
The script generated is used when the linker is invoked with the
@code{-r} option. The output has an extension of @file{.xr}.
@item u
The script generated is used when the linker is invoked with the
@code{-Ur} option. The output has an extension of @file{.xu}.
@item shared
The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
this value if @code{GENERATE_SHLIB_SCRIPT} is defined in the
@file{emulparams} file. The @file{emultempl} script must arrange to use
this script at the appropriate time, normally when the linker is invoked
with the @code{-shared} option. The output has an extension of
@file{.xs}.
@item c
The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
@file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf}. The
@file{emultempl} script must arrange to use this script at the appropriate
time, normally when the linker is invoked with the @code{-z combreloc}
option. The output has an extension of
@file{.xc}.
@item cshared
The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
@file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf} and
@code{GENERATE_SHLIB_SCRIPT} is defined in the @file{emulparams} file.
The @file{emultempl} script must arrange to use this script at the
appropriate time, normally when the linker is invoked with the @code{-shared
-z combreloc} option. The output has an extension of @file{.xsc}.
@item auto_import
The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
this value if @code{GENERATE_AUTO_IMPORT_SCRIPT} is defined in the
@file{emulparams} file. The @file{emultempl} script must arrange to
use this script at the appropriate time, normally when the linker is
invoked with the @code{--enable-auto-import} option. The output has
an extension of @file{.xa}.
@end table
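
One way to check which of these generated scripts a given set of options
selects is the linker's @code{--verbose} option, which prints the linker
script in use. The commands below are a sketch; they assume a linker
built with the conventional @code{get_script} behaviour described above.
@smallexample
# Print the default script (the .x variant).
ld --verbose

# Print the script selected for a relocatable link (the .xr variant).
ld -r --verbose
@end smallexample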

Besides the shell variables set by the @file{emulparams} script, and the
@code{LD_FLAG} variable, the @file{genscripts.sh} script will set
certain variables for each run of the @file{scripttempl} script.

@table @code
@item RELOCATING
This will be set to a non-empty string when the linker is doing a final
relocation (i.e., all scripts other than @code{-r} and @code{-Ur}).

@item CONSTRUCTING
This will be set to a non-empty string when the linker is building
global constructor and destructor tables (i.e., all scripts other than
@code{-r}).

@item DATA_ALIGNMENT
This will be set to an @code{ALIGN} expression when the output should be
page aligned, or to @samp{.} when generating the @code{-N} script.

@item CREATE_SHLIB
This will be set to a non-empty string when generating a @code{-shared}
script.

@item COMBRELOC
When generating @code{-z combreloc} scripts, this will be set to a
temporary file name which can be used during script generation.
@end table

The conventional way to write a @file{scripttempl} script is to first
set a few shell variables, and then write out a linker script using
@code{cat} with a here document. The linker script will use variable
substitutions, based on the above variables and those set in the
@file{emulparams} script, to control its behaviour.
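
The overall shape of such a script can be sketched as follows. This is
an illustration only, not a copy of any script in the
@file{scripttempl} directory: the file name @file{myscript.sc}, the
default entry symbol and the section layout are all assumptions.
@smallexample
# scripttempl/myscript.sc (hypothetical).  Set defaults first ...
test -z "$@{ENTRY@}" && ENTRY=_start

# ... then write the linker script to standard output.
cat <<EOF
OUTPUT_FORMAT("$@{OUTPUT_FORMAT@}")
OUTPUT_ARCH($@{ARCH@})
ENTRY($@{ENTRY@})

SECTIONS
@{
  $@{RELOCATING+. = $@{TEXT_START_ADDR@};@}
  .text :
  @{
    *(.text)
    $@{RELOCATING+ _etext = .;@}
  @}
  $@{RELOCATING+. = $@{DATA_ALIGNMENT@};@}
  .data :
  @{
    *(.data)
  @}
  .bss :
  @{
    *(.bss)
    *(COMMON)
    $@{RELOCATING+ _end = .;@}
  @}
@}
EOF
@end smallexample
Since the here document is unquoted, the shell expands every
substitution before @code{cat} writes the text, so each invocation by
@file{genscripts.sh} yields a different linker script from the same
template.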

When there are parts of the @file{scripttempl} script which should only
be run when doing a final relocation, they should be enclosed within a
variable substitution based on @code{RELOCATING}. For example, on many
targets special symbols such as @code{_end} should be defined when doing
a final link. Naturally, those symbols should not be defined when doing
a relocatable link using @code{-r}. The @file{scripttempl} script
could use a construct like this to define those symbols:
@smallexample
  $@{RELOCATING+ _end = .;@}
@end smallexample
This will do the symbol assignment only if the @code{RELOCATING}
variable is defined.
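
The behaviour of this substitution can be tried directly in a shell.
The short experiment below is a sketch; the single-space value given to
@code{RELOCATING} is an assumption chosen for the demonstration.
@smallexample
unset RELOCATING
echo "[$@{RELOCATING+ _end = .;@}]"    # prints []

RELOCATING=" "
echo "[$@{RELOCATING+ _end = .;@}]"    # prints [ _end = .;]
@end smallexample
Note that the @samp{+} form (without a colon) tests only whether the
variable is set, so even a value consisting of a single space enables
the enclosed text.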

The basic job of the linker script is to put the sections in the correct
order, and at the correct memory addresses. For some targets, the
linker script may have to do some other operations.

For example, on most MIPS platforms, the linker is responsible for
defining the special symbol @code{_gp}, used to initialize the
@code{$gp} register. It must be set to the start of the small data
section plus @code{0x8000}. Naturally, it should only be defined when
doing a final relocation. This will typically be done like this:
@smallexample
  $@{RELOCATING+ _gp = ALIGN(16) + 0x8000;@}
@end smallexample
This line would appear just before the sections which compose the small
data section (@samp{.sdata}, @samp{.sbss}). All those sections would be
contiguous in memory.

Many COFF systems build constructor tables in the linker script. The
compiler will arrange to output the address of each global constructor
in a @samp{.ctors} section, and the address of each global destructor in
a @samp{.dtors} section (this is done by defining
@code{ASM_OUTPUT_CONSTRUCTOR} and @code{ASM_OUTPUT_DESTRUCTOR} in the
@code{gcc} configuration files). The @code{gcc} runtime support
routines expect the constructor table to be named @code{__CTOR_LIST__}.
They expect it to be a list of words, with the first word being the
count of the number of entries. There should be a trailing zero word.
(Actually, the count may be -1 if the trailing word is present, and the
trailing word may be omitted if the count is correct, but, as the
@code{gcc} behaviour has changed slightly over the years, it is safest
to provide both.) Here is a typical way that might be handled in a
@file{scripttempl} file.
@smallexample
  $@{CONSTRUCTING+ __CTOR_LIST__ = .;@}
  $@{CONSTRUCTING+ LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)@}
  $@{CONSTRUCTING+ *(.ctors)@}
  $@{CONSTRUCTING+ LONG(0)@}
  $@{CONSTRUCTING+ __CTOR_END__ = .;@}
  $@{CONSTRUCTING+ __DTOR_LIST__ = .;@}
  $@{CONSTRUCTING+ LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)@}
  $@{CONSTRUCTING+ *(.dtors)@}
  $@{CONSTRUCTING+ LONG(0)@}
  $@{CONSTRUCTING+ __DTOR_END__ = .;@}
@end smallexample
The use of @code{CONSTRUCTING} ensures that these linker script commands
will only appear when the linker is supposed to be building the
constructor and destructor tables. This example is written for a target
which uses 4 byte pointers.
397Embedded systems often need to set a stack address. This is normally
398best done by using the @code{PROVIDE} construct with a default stack
399address. This permits the user to easily override the stack address
400using the @code{--defsym} option. Here is an example:
401@smallexample
402 $@{RELOCATING+ PROVIDE (__stack = 0x80000000);@}
403@end smallexample
404The value of the symbol @code{__stack} would then be used in the startup
405code to initialize the stack pointer.
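
For instance, a user who wants a different stack address might link as
shown below. The address and the object file names are hypothetical.
@smallexample
# The explicit definition takes precedence over the PROVIDE default.
ld --defsym __stack=0x20000000 -o prog crt0.o main.o
@end smallexample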

@node linker emulations
@section @file{emultempl} scripts

Each linker target uses an @file{emultempl} script to generate the
emulation code. The name of the @file{emultempl} script is set by the
@code{TEMPLATE_NAME} variable in the @file{emulparams} script. If the
@code{TEMPLATE_NAME} variable is not set, the default is
@samp{generic}. If the value of @code{TEMPLATE_NAME} is @var{template},
@file{genscripts.sh} will use @file{emultempl/@var{template}.em}.

Most targets use the generic @file{emultempl} script,
@file{emultempl/generic.em}. A different @file{emultempl} script is
only needed if the linker must support unusual actions, such as linking
against shared libraries.

The @file{emultempl} script is normally written as a simple invocation
of @code{cat} with a here document. The document will use a few
variable substitutions. Typically each function name uses a
substitution involving @code{EMULATION_NAME}, for ease of debugging when
the linker supports multiple emulations.

Every function and variable in the emitted file should be static. The
only globally visible object must be named
@code{ld_@var{EMULATION_NAME}_emulation}, where @var{EMULATION_NAME} is
the name of the emulation set in @file{configure.tgt} (this is also the
name of the @file{emulparams} file without the @file{.sh} extension).
The @file{genscripts.sh} script will set the shell variable
@code{EMULATION_NAME} before invoking the @file{emultempl} script.
436The @code{ld_@var{EMULATION_NAME}_emulation} variable must be a
437@code{struct ld_emulation_xfer_struct}, as defined in @file{ldemul.h}.
438It defines a set of function pointers which are invoked by the linker,
439as well as strings for the emulation name (normally set from the shell
440variable @code{EMULATION_NAME} and the default BFD target name (normally
441set from the shell variable @code{OUTPUT_FORMAT} which is normally set
442by the @file{emulparams} file).
443
444The @file{genscripts.sh} script will set the shell variable
445@code{COMPILE_IN} when it invokes the @file{emultempl} script for the
446default emulation. In this case, the @file{emultempl} script should
447include the linker scripts directly, and return them from the
448@code{get_scripts} entry point. When the emulation is not the default,
449the @code{get_scripts} entry point should just return a file name. See
450@file{emultempl/generic.em} for an example of how this is done.
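
The non-default case might look roughly like the sketch below. The
function signature, the @samp{gld} prefix and the use of
@code{link_info.relocatable} are assumptions modelled on the generic
template and should be checked against the source; only two of the
script variants are shown.
@smallexample
cat >>e$@{EMULATION_NAME@}.c <<EOF
static char *
gld$@{EMULATION_NAME@}_get_script (int *isfile)
@{
  /* Not compiled in: hand back a file name for the linker to read.  */
  *isfile = 1;
  if (link_info.relocatable)
    return "ldscripts/$@{EMULATION_NAME@}.xr";
  return "ldscripts/$@{EMULATION_NAME@}.x";
@}
EOF
@end smallexample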

At some point, the linker emulation entry points should be documented.

@node Emulation Walkthrough
@chapter A Walkthrough of a Typical Emulation

This chapter is to help people who are new to the way emulations
interact with the linker, or who are suddenly thrust into the position
of having to work with existing emulations. It will discuss the files
you need to be aware of. It will tell you when the given "hooks" in
the emulation will be called. It will, hopefully, give you enough
information about when and how things happen that you'll be able to
get by. As always, the source is the definitive reference to this.

The starting point for the linker is in @file{ldmain.c} where
@code{main} is defined. The bulk of the code that's emulation
specific will initially be in @file{emultempl/@var{emulation}.em} but
will end up in @file{e@var{emulation}.c} when the build is done.
Most of the work to select and interface with emulations is in
@file{ldemul.h} and @file{ldemul.c}. Specifically, @file{ldemul.h}
defines the @code{ld_emulation_xfer_struct} structure your emulation
exports.

Your emulation file exports a symbol
@code{ld_@var{EMULATION_NAME}_emulation}. If your emulation is
selected (it usually is, since usually there's only one),
@file{ldemul.c} sets the variable @code{ld_emulation} to point to it.
@file{ldemul.c} also defines a number of API functions that interface
to your emulation, like @code{ldemul_after_parse} which simply calls
your @code{ld_@var{EMULATION}_emulation.after_parse} function. For
the rest of this section, the functions will be mentioned, but you
should assume the indirect reference to your emulation also.

We will also skip or gloss over parts of the link process that don't
relate to emulations, like setting up internationalization.

After initialization, @code{main} selects an emulation by pre-scanning
the command line arguments. It calls @code{ldemul_choose_target} to
choose a target. If you set @code{choose_target} to
@code{ldemul_default_target}, it picks your @code{target_name} by
default.

@code{main} calls @code{ldemul_before_parse}, then @code{parse_args}.
@code{parse_args} calls @code{ldemul_parse_args} for each arg, which
must update the @code{getopt} globals if it recognizes the argument.
If the emulation doesn't recognize it, then @code{parse_args} checks to see
if it recognizes it.

Now that the emulation has had access to all its command-line options,
@code{main} calls @code{ldemul_set_symbols}. This can be used for any
initialization that may be affected by options. It is also supposed
to set up any variables needed by the emulation script.

@code{main} now calls @code{ldemul_get_script} to get the emulation
script to use (based on arguments, no doubt, @pxref{Emulations}) and
runs it. While parsing, @file{ldgram.y} may call @code{ldemul_hll} or
@code{ldemul_syslib} to handle the @code{HLL} or @code{SYSLIB}
commands. It may call @code{ldemul_unrecognized_file} if you asked
the linker to link a file it doesn't recognize. It will call
@code{ldemul_recognized_file} for each file it does recognize, in case
the emulation wants to handle some files specially. All the while,
it's loading the files (possibly calling
@code{ldemul_open_dynamic_archive}) and symbols and stuff. After it's
done reading the script, @code{main} calls @code{ldemul_after_parse}.
Use the after-parse hook to set up anything that depends on stuff the
script might have set up, like the entry point.
518@code{main} next calls @code{lang_process} in @code{ldlang.c}. This
519appears to be the main core of the linking itself, as far as emulation
520hooks are concerned(*). It first opens the output file's BFD, calling
521@code{ldemul_set_output_arch}, and calls
522@code{ldemul_create_output_section_statements} in case you need to use
523other means to find or create object files (i.e. shared libraries
524found on a path, or fake stub objects). Despite the name, nobody
525creates output sections here.
526
527(*) In most cases, the BFD library does the bulk of the actual
528linking, handling symbol tables, symbol resolution, relocations, and
529building the final output file. See the BFD reference for all the
530details. Your emulation is usually concerned more with managing
531things at the file and section level, like "put this here, add this
532section", etc.
533
534Next, the objects to be linked are opened and BFDs created for them,
535and @code{ldemul_after_open} is called. At this point, you have all
536the objects and symbols loaded, but none of the data has been placed
537yet.
538
539Next comes the Big Linking Thingy (except for the parts BFD does).
540All input sections are mapped to output sections according to the
541script. If a section doesn't get mapped by default,
542@code{ldemul_place_orphan} will get called to figure out where it goes.
543Next it figures out the offsets for each section, calling
544@code{ldemul_before_allocation} before and
545@code{ldemul_after_allocation} after deciding where each input section
546ends up in the output sections.
547
548The last part of @code{lang_process} is to figure out all the symbols'
549values. After assigning final values to the symbols,
550@code{ldemul_finish} is called, and after that, any undefined symbols
551are turned into fatal errors.
552
553OK, back to @code{main}, which calls @code{ldwrite} in
554@file{ldwrite.c}. @code{ldwrite} calls BFD's final_link, which does
555all the relocation fixups and writes the output bfd to disk, and we're
556done.
557
558In summary,
559
560@itemize @bullet
561
562@item @code{main()} in @file{ldmain.c}
563@item @file{emultempl/@var{EMULATION}.em} has your code
564@item @code{ldemul_choose_target} (defaults to your @code{target_name})
565@item @code{ldemul_before_parse}
566@item Parse argv, calls @code{ldemul_parse_args} for each
567@item @code{ldemul_set_symbols}
568@item @code{ldemul_get_script}
569@item parse script
570
571@itemize @bullet
572@item may call @code{ldemul_hll} or @code{ldemul_syslib}
573@item may call @code{ldemul_open_dynamic_archive}
574@end itemize
575
576@item @code{ldemul_after_parse}
577@item @code{lang_process()} in @file{ldlang.c}
578
579@itemize @bullet
580@item create @code{output_bfd}
581@item @code{ldemul_set_output_arch}
582@item @code{ldemul_create_output_section_statements}
583@item read objects, create input bfds - all symbols exist, but have no values
584@item may call @code{ldemul_unrecognized_file}
585@item will call @code{ldemul_recognized_file}
586@item @code{ldemul_after_open}
587@item map input sections to output sections
588@item may call @code{ldemul_place_orphan} for remaining sections
589@item @code{ldemul_before_allocation}
590@item gives input sections offsets into output sections, places output sections
591@item @code{ldemul_after_allocation} - section addresses valid
592@item assigns values to symbols
593@item @code{ldemul_finish} - symbol values valid
594@end itemize
595
596@item output bfd is written to disk
597
598@end itemize
599
@node Architecture Specific
@chapter Some Architecture Specific Notes

This is the place for notes on the behavior of @code{ld} on
specific platforms. Currently, only Intel x86 is documented (and
of that, only the auto-import behavior for DLLs).

@menu
* ix86:: Intel x86
@end menu

@node ix86
@section Intel x86

@code{ld} can create DLLs that operate with various runtimes available
on a common x86 operating system. These runtimes include native (using
the mingw "platform"), cygwin, and pw.

@table @emph
@item auto-import from DLLs
@enumerate
@item
With this feature on, DLL clients can import variables from a DLL
without any concern from their side (for example, without any source
code modifications). Auto-import can be enabled using the
@code{--enable-auto-import} flag, or disabled via the
@code{--disable-auto-import} flag. Auto-import is disabled by default.

@item
This is done completely within the bounds of the PE specification (to be
fair, there's a minor violation of the spec at one point, but in practice
auto-import works on all known variants of that common x86 operating
system). So, the resulting DLL can be used with any other PE
compiler/linker.

@item
Auto-import is fully compatible with the standard import method, in which
variables are decorated using attribute modifiers. Libraries of either
type may be mixed together.

@item
Overhead (space): 8 bytes per imported symbol, plus 20 for each
reference to it; Overhead (load time): negligible; Overhead
(virtual/physical memory): should be less than the effect of DLL
relocation.
@end enumerate
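
As an illustration of enabling the feature, the commands below sketch
how a DLL and a client might be built. They assume a @code{gcc} driver
for a PE target passing options through to @code{ld}; all file names
are hypothetical.
@smallexample
# Build a DLL together with its import library ...
gcc -shared -o mydll.dll mydll.o -Wl,--out-implib,libmydll.dll.a

# ... then link a client that reads the DLL's variables directly.
gcc -o client.exe client.o -L. -lmydll -Wl,--enable-auto-import
@end smallexample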

Motivation

The obvious and only way to get rid of dllimport insanity is
to make the client access the variable directly in the DLL, bypassing
the extra dereference imposed by ordinary DLL runtime linking.
I.e., whenever the client contains something like

@code{mov dll_var,%eax},

the address of dll_var in the instruction should be relocated to point
into the loaded DLL. The aim is to make the OS loader do so, and then
make ld help with that. The import section of a PE file is laid out the
following way: there's a vector of structures, each describing imports
from a particular DLL. Each such structure points to two other
parallel vectors: one holding imported names, and one which
will hold the address of the corresponding imported name. So, the
solution is to de-vectorize these structures, making import
locations sparse and pointing directly into code.

Implementation

For each reference to a data symbol to be imported from a DLL (a
symbol named <sym> belongs to this set if __imp_<sym> is
found in the implib), an import fixup entry is generated. That
entry is of type IMAGE_IMPORT_DESCRIPTOR and is stored in the .idata$3
subsection. Each fixup entry contains a pointer to the symbol's address
within the .text section (marked with a __fuN_<sym> symbol, where N is
an integer), a pointer to the DLL name (so the DLL name is referenced by
multiple entries), and a pointer to the symbol name thunk. The symbol name
thunk is a singleton vector (__nm_th_<symbol>) pointing to an
IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) directly containing the
imported name. Here comes that "on the edge" problem mentioned above:
the PE specification states that the name vector (OriginalFirstThunk) should
run in parallel with the address vector (FirstThunk), i.e., that they
should have the same number of elements and be terminated with zero. We
violate this, since FirstThunk points directly into machine code. But in
practice, the OS loader is implemented the sane way: it goes through
OriginalFirstThunk and puts addresses into FirstThunk, and nothing
more. It should be noted once again that the DLL and symbol name
structures are reused across fixup entries and would have to be there
anyway to support standard imports, so the sustained overhead is
20 bytes per reference. Another question is whether having several
IMAGE_IMPORT_DESCRIPTORs for the same DLL is possible. The answer is yes;
it is done even by the native compiler/linker (libth32's functions are in
fact resident in the Windows 9x kernel32.dll, so if you use it, you have
two IMAGE_IMPORT_DESCRIPTORs for kernel32.dll). Yet another question is
whether referencing the same PE structures several times is valid.
The answer is: why not? Prohibiting that (detecting the violation) would
require more work on the loader's part than allowing it.

@end table

@node GNU Free Documentation License
@chapter GNU Free Documentation License

@include fdl.texi

@contents
@bye