doc:aout [2015/10/22 20:37]
vak
doc:aout [2015/10/23 19:30] (current)
vak [Relocation information]
 +====== Format of a.out files ======
  
 +**A.out** is the output file of the assembler **as** and the link editor
 +**ld**. The linker makes **a.out** executable if there were no errors and
 +no unresolved external references.
 +
 +The file has four sections: a header, the program code and data,
 +relocation information and a symbol table (in that order).
 +The last two may be omitted if the program was loaded with the `-s'
 +option of **ld** or if the symbols and relocation have been removed by
 +**strip**(1).
 +
 +Layout information as given in the include file <a.out.h> for PIC32 is:
 +
 +    /*
 +     * Header prepended to each a.out file.
 +     */
 +    struct  exec {
 +        unsigned a_midmag;      /* magic number */
 +        unsigned a_text;        /* size of code segment */
 +        unsigned a_data;        /* size of initialized data */
 +        unsigned a_bss;         /* size of uninitialized data */
 +        unsigned a_reltext;     /* size of text relocation info */
 +        unsigned a_reldata;     /* size of data relocation info */
 +        unsigned a_syms;        /* size of symbol table */
 +        unsigned a_entry;       /* entry address */
 +    };
 +
 +    #define RMAGIC      0406    /* relocatable object file */
 +    #define OMAGIC      0407    /* old impure format */
 +
 +In the header, the size of each section is given in bytes and is word aligned.
 +The size of the header itself is not included in any of the other sizes.
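The section order and the header sizes are enough to locate each section inside a file. The following is a minimal sketch, not the actual RetroBSD code: it assumes each header field is a 32-bit little-endian word, and that text relocation precedes data relocation (mirroring the field order in the header). The names `aout_offsets`, `aout_parse_header` and `aout_section_offsets` are hypothetical.

```c
#include <stdint.h>

/* Header fields as listed above; each is assumed to be a 32-bit word. */
struct exec {
    uint32_t a_midmag, a_text, a_data, a_bss;
    uint32_t a_reltext, a_reldata, a_syms, a_entry;
};

/* Hypothetical helper type: byte offset of each section within the file. */
struct aout_offsets {
    uint32_t text, data, reltext, reldata, syms, end;
};

/* Read one 32-bit word, assuming little-endian byte order. */
static uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Fill *h from the first 32 bytes of a file image. */
static void aout_parse_header(const unsigned char *buf, struct exec *h)
{
    h->a_midmag  = get_le32(buf);
    h->a_text    = get_le32(buf + 4);
    h->a_data    = get_le32(buf + 8);
    h->a_bss     = get_le32(buf + 12);
    h->a_reltext = get_le32(buf + 16);
    h->a_reldata = get_le32(buf + 20);
    h->a_syms    = get_le32(buf + 24);
    h->a_entry   = get_le32(buf + 28);
}

/* The header size is not counted in any field, so text starts right after it.
 * The bss size does not contribute: uninitialized data occupies no file space. */
static struct aout_offsets aout_section_offsets(const struct exec *h)
{
    struct aout_offsets o;
    o.text    = (uint32_t)sizeof(struct exec);
    o.data    = o.text + h->a_text;
    o.reltext = o.data + h->a_data;
    o.reldata = o.reltext + h->a_reltext;   /* assumed order: text, then data */
    o.syms    = o.reldata + h->a_reldata;
    o.end     = o.syms + h->a_syms;
    return o;
}
```

For a stripped file, `a_reltext`, `a_reldata` and `a_syms` would be zero, so the last four offsets coincide.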
 +
 +When an **a.out** file is executed, three logical segments are set up:
 +the code (text) segment, the data segment (initialized data followed by
 +uninitialized data, which starts off as all 0), and a stack.
 +The text segment begins at address 0x7f008000 in memory;
 +the header is not loaded.
 +
 +If the magic number in the header is OMAGIC (0407),
 +it indicates that the text segment is not to be write-protected
 +and shared, so the data segment is immediately contiguous with the text
 +segment. This is the oldest kind of executable program and is the default.
 +
 +The stack segment occupies the highest possible locations in the
 +core image, growing downwards from 0x7f01fffc. The stack segment is
 +automatically extended as required. The data segment is only extended
 +as requested by **brk**(2).
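The addresses above determine where each segment of an OMAGIC program lands in memory. The sketch below computes them from the header sizes. The type `aout_layout` and both function names are hypothetical, and the fit check against the initial stack top is an illustrative assumption, not a description of what the kernel actually tests.

```c
#include <stdint.h>

#define TEXT_BASE 0x7f008000u   /* start of the text segment */
#define STACK_TOP 0x7f01fffcu   /* the stack grows downwards from here */

/* Hypothetical helper type: start address of each segment. */
struct aout_layout {
    uint32_t text;   /* first byte of code */
    uint32_t data;   /* first byte of initialized data */
    uint32_t bss;    /* first byte of uninitialized data */
    uint32_t end;    /* first byte past the bss */
};

/* For OMAGIC the data segment is immediately contiguous with the text. */
static struct aout_layout aout_omagic_layout(uint32_t a_text, uint32_t a_data,
                                             uint32_t a_bss)
{
    struct aout_layout m;
    m.text = TEXT_BASE;
    m.data = m.text + a_text;
    m.bss  = m.data + a_data;
    m.end  = m.bss + a_bss;
    return m;
}

/* Nonzero if the image ends at or below the initial stack top
 * (an assumed sanity check; it leaves no room for the stack itself). */
static int aout_layout_fits(const struct aout_layout *m)
{
    return m->end >= m->text && m->end <= STACK_TOP;
}
```

With `a_text = 0x1000`, `a_data = 0x200` and `a_bss = 0x100`, the data segment starts at 0x7f009000 and the bss at 0x7f009200.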
 +
 +
 +===== Relocation information =====
 +
 +Relocation information is present only if the magic number in the header is RMAGIC (0406).
 +For every word of program text or initialized data, the relocation section
 +contains a record of variable length, from 1 to 6 bytes.
 +Bytes 2-4 are present only when the relocation refers to an external symbol (xxx=7).
 +Bytes 5-6 are present only for the upper-address relocation types (zzz=2 or zzz=3).
 +
 +^ Byte 1      ^ Byte 2 ^ Byte 3 ^ Byte 4 ^ Byte 5 ^ Byte 6  ^
 +| Descriptor  |  Symbol index          |||  Lower address  ||
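Because the record length depends only on the descriptor byte, a reader can step through the relocation section one record at a time. A minimal sketch follows, assuming that the external-symbol code is xxx = 7 (as in the segment table) and that absent fields are simply skipped rather than padded. The function name is hypothetical.

```c
#include <stddef.h>

/* Length in bytes (1 to 6) of a relocation record, given its descriptor byte. */
static size_t reloc_record_length(unsigned char desc)
{
    unsigned segment = (desc >> 4) & 7;   /* xxx, bits 6:4 */
    unsigned type    = desc & 7;          /* zzz, bits 2:0 */
    size_t len = 1;                       /* the descriptor itself */

    if (segment == 7)
        len += 3;                         /* 3-byte symbol index */
    if (type == 2 || type == 3)
        len += 2;                         /* 2-byte lower address */
    return len;
}
```

Under these assumptions the only possible lengths are 1, 3, 4 and 6 bytes.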
 +
 +Byte 1 of a relocation record contains a descriptor of format:
 +
 +^ 7 ^ 6 ^ 5 ^ 4 ^ 3 ^ 2 ^ 1 ^ 0 ^
 +| 0 |   xxx   ||| y |   zzz   |||
 +
 +Bits 6:4 (xxx) of relocation descriptor indicate the segment referred to
 +by the text or data word associated with the relocation record:
 +
 +^ xxx ^ Description     ^
 +^ 0   | Absolute number |
 +^ 2   | Reference to text segment |
 +^ 3   | Reference to initialized data |
 +^ 4   | Reference to uninitialized data (bss) |
 +^ 7   | Reference to an external symbol with index specified by bytes 2,3,4 |
 +
 +Bit 3 (y) of the relocation descriptor indicates, if 1, that the reference is
 +relative to the GP register.
 +
 +Bits 2:0 (zzz) of the relocation descriptor define a relocation type,
 +or a method of transforming the text or data word:
 +
 +^ zzz ^ Description           ^
 +^ 0   | Byte address, 16 bits |
 +^ 1   | Byte address, 32 bits |
 +^ 2   | Upper part of byte address [31:16] |
 +^ 3   | Upper part of byte address with signed offset |
 +^ 4   | Word address [17:2]   |
 +^ 5   | Word address [27:2]   |
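The three descriptor fields can be separated with shifts and masks. This sketch follows the bit layout given above; the struct and function names are hypothetical and are not taken from <a.out.h>.

```c
/* Hypothetical decoded form of a relocation descriptor byte. */
struct reloc_desc {
    unsigned segment;   /* xxx, bits 6:4: segment referred to */
    unsigned gp_rel;    /* y, bit 3: 1 if relative to the GP register */
    unsigned type;      /* zzz, bits 2:0: relocation type */
};

static struct reloc_desc reloc_decode(unsigned char desc)
{
    struct reloc_desc d;
    d.segment = (desc >> 4) & 7;
    d.gp_rel  = (desc >> 3) & 1;
    d.type    = desc & 7;
    return d;
}

/* Human-readable name for the xxx field, per the segment table. */
static const char *reloc_segment_name(unsigned segment)
{
    switch (segment) {
    case 0:  return "absolute";
    case 2:  return "text";
    case 3:  return "data";
    case 4:  return "bss";
    case 7:  return "external";
    default: return "unknown";
    }
}
```

For example, the descriptor 0x2d decodes to a GP-relative reference to the text segment with type 5 (word address [27:2]).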
 +
 +The value of a word in the text or data which is not a portion of a
 +reference to an undefined external symbol is exactly that value which
 +will appear in memory when the file is executed. If a word in the text
 +or data involves a reference to an undefined external symbol, as indicated
 +by the relocation information, then the value stored in the file
 +is an offset from the associated external symbol. When the file is
 +processed by the link editor and the external symbol becomes defined,
 +the value of the symbol will be added into the word in the file.
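For the simplest case, a 32-bit byte address (zzz = 1), resolving an external reference amounts to adding the symbol value to the stored offset. The sketch below patches one word of a loaded image in place. It assumes the text and data words are stored little endian, like the symbol values. The function name is hypothetical, and the other relocation types, which modify only part of the word, are not covered.

```c
#include <stddef.h>
#include <stdint.h>

/* Add symbol_value to the 32-bit word at image[offset .. offset+3].
 * Before the call the word holds an offset from the external symbol;
 * afterwards it holds the final address. Little-endian order is assumed. */
static void reloc_patch_word32(unsigned char *image, size_t offset,
                               uint32_t symbol_value)
{
    unsigned char *p = image + offset;
    uint32_t word = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
                    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;

    word += symbol_value;

    p[0] = (unsigned char)(word & 0xff);
    p[1] = (unsigned char)((word >> 8) & 0xff);
    p[2] = (unsigned char)((word >> 16) & 0xff);
    p[3] = (unsigned char)((word >> 24) & 0xff);
}
```

A word holding the offset 8 from a symbol that ends up at 0x7f008100 becomes 0x7f008108.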
 +
 +
 +===== Symbol table =====
 +
 +The symbol table is a sequence of variable-length records, one for each symbol.
 +The first symbol is numbered 0, the second 1, etc.
 +
 +^ Byte #  ^ Description ^
 +^    0    | Name length  |
 +^    1    | Symbol type  |
 +^  2...5  | Symbol value |
 +^  6...N  | Symbol name  |
 +
 +Byte 0 specifies the length of the symbol name in bytes (2...255), including the terminating zero byte.
 +
 +Byte 1 indicates the type of the symbol - see below.
 +
 +Bytes 2...5 store the symbol value (little endian).
 +
 +Bytes 6...N contain the symbol name, null terminated.
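The record layout above can be read with a small parser. This is a sketch with hypothetical names (`aout_symbol`, `aout_parse_symbol`). It returns the number of bytes consumed so that the caller can advance to the next record, and it returns 0 for a truncated or malformed record.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one symbol table record. */
struct aout_symbol {
    unsigned char type;   /* byte 1 */
    uint32_t value;       /* bytes 2...5, little endian */
    const char *name;     /* bytes 6...N, points into the input buffer */
};

/* Parse one record from buf, of which avail bytes are readable.
 * Returns the record length (6 + name length), or 0 on error. */
static size_t aout_parse_symbol(const unsigned char *buf, size_t avail,
                                struct aout_symbol *sym)
{
    size_t namelen;

    if (avail < 6)
        return 0;
    namelen = buf[0];                 /* includes the terminating zero */
    if (namelen < 2 || avail < 6 + namelen)
        return 0;
    if (buf[6 + namelen - 1] != 0)    /* name must be null terminated */
        return 0;

    sym->type  = buf[1];
    sym->value = (uint32_t)buf[2] | (uint32_t)buf[3] << 8 |
                 (uint32_t)buf[4] << 16 | (uint32_t)buf[5] << 24;
    sym->name  = (const char *)(buf + 6);
    return 6 + namelen;
}
```

Calling the function repeatedly, advancing by the returned length each time, visits the symbols in index order (0, 1, ...), which is the numbering the relocation records refer to.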
 +
 +
 +===== Symbol type =====
 +
 +^ 7 ^ 6 ^ 5 ^ 4 ^ 3 ^ 2 ^ 1 ^ 0 ^
 +| 0 | W | G |    ttttt      |||||
 +
 +  * W - weak reference
 +  * G - global, or external symbol
 +  * ttttt - symbol type:
 +
 +^ ttttt ^ Description  ^
 +^   0   | Undefined symbol |
 +^   1   | Absolute     |
 +^   2   | Text segment |
 +^   3   | Data segment |
 +^   4   | BSS segment  |
 +^  31   | File name    |
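The type byte can be split into its flags and type code with bit masks. The constant and function names below are hypothetical, chosen for this illustration rather than taken from <a.out.h>.

```c
/* Hypothetical names for the fields of the symbol type byte. */
#define SYM_WEAK      0x40   /* W, bit 6: weak reference */
#define SYM_GLOBAL    0x20   /* G, bit 5: global, or external symbol */
#define SYM_TYPE_MASK 0x1f   /* ttttt, bits 4:0 */

enum sym_kind {
    SYM_UNDEF = 0,    /* undefined symbol */
    SYM_ABS   = 1,    /* absolute */
    SYM_TEXT  = 2,    /* text segment */
    SYM_DATA  = 3,    /* data segment */
    SYM_BSS   = 4,    /* BSS segment */
    SYM_FILE  = 31    /* file name */
};

static int sym_is_weak(unsigned char type)
{
    return (type & SYM_WEAK) != 0;
}

static int sym_is_global(unsigned char type)
{
    return (type & SYM_GLOBAL) != 0;
}

static unsigned sym_kind_of(unsigned char type)
{
    return type & SYM_TYPE_MASK;
}
```

For instance, a type byte of 0x22 describes a global symbol in the text segment.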
 +
 +If a symbol's type is undefined external, and the value field is nonzero,
 +the symbol is interpreted by the loader **ld** as the name of a common
 +region whose size is indicated by the value of the symbol.
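The common-region rule can be expressed as a single predicate on the type byte and the value. This sketch takes "undefined external" to mean type code 0 with the G bit set. The function name is hypothetical.

```c
#include <stdint.h>

/* Nonzero if the symbol names a common region: undefined (ttttt = 0),
 * external (G bit set), and with a nonzero value giving the region size. */
static int sym_is_common(unsigned char type, uint32_t value)
{
    int undefined = (type & 0x1f) == 0;
    int external  = (type & 0x20) != 0;
    return undefined && external && value != 0;
}
```

An undefined external symbol with value 0 is an ordinary unresolved reference, so the predicate is false for it.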