
Study notes on the GCC-3.4.6 source (11)

2010-03-08 13:00

2. Compiler initialization

The compiler core starts in the main function in main.c, under the YOUR-GCC-SOURCE-DIR/gcc directory. That function does nothing but call toplev_main.

4684 int
4685 toplev_main (unsigned int argc, const char **argv) in toplev.c
4686 {
4687 save_argv = argv;
4688
4689 /* Initialization of GCC's environment, and diagnostics. */
4690 general_init (argv[0]);
4691
4692 /* Parse the options and do minimal processing; basically just
4693 enough to default flags appropriately. */
4694 decode_options (argc, argv);
4695
4696 randomize ();
4697
4698 /* Exit early if we can (e.g. -help). */
4699 if (!exit_after_options)
4700 do_compile ();
4701
4702 if (errorcount || sorrycount)
4703 return (FATAL_EXIT_CODE);
4704
4705 return (SUCCESS_EXIT_CODE);
4706 }

This function drives the whole compilation. The functions it calls are very complex, so we look at them one by one. The first is general_init which, as its name indicates, does some basic initialization for the compiler.

4208 static void
4209 general_init (const char *argv0) in toplev.c
4210 {
4211 const char *p;
4212
4213 p = argv0 + strlen (argv0);
4214 while (p != argv0 && !IS_DIR_SEPARATOR (p[-1]))
4215 --p;
4216 progname = p;
4217
4218 xmalloc_set_program_name (progname);
4219
4220 hex_init ();
4221
4222 gcc_init_libintl ();
4223
4224 /* Initialize the diagnostics reporting machinery, so option parsing
4225 can give warnings and errors. */
4226 diagnostic_initialize (global_dc);
4227 /* Set a default printer. Language specific initializations will
4228 override it later. */
4229 pp_format_decoder (global_dc->printer) = &default_tree_printer;
4230
4231 /* Trap fatal signals, e.g. SIGSEGV, and convert them to ICE messages. */
4232 #ifdef SIGSEGV
4233 signal (SIGSEGV, crash_signal);
4234 #endif
4235 #ifdef SIGILL
4236 signal (SIGILL, crash_signal);
4237 #endif
4238 #ifdef SIGBUS
4239 signal (SIGBUS, crash_signal);
4240 #endif
4241 #ifdef SIGABRT
4242 signal (SIGABRT, crash_signal);
4243 #endif
4244 #if defined SIGIOT && (!defined SIGABRT || SIGABRT != SIGIOT)
4245 signal (SIGIOT, crash_signal);
4246 #endif
4247 #ifdef SIGFPE
4248 signal (SIGFPE, crash_signal);
4249 #endif
4250
4251 /* Other host-specific signal setup. */
4252 (*host_hooks.extra_signals)();
4253
4254 /* Initialize the garbage-collector, string pools and tree type hash
4255 table. */
4256 init_ggc ();
4257 init_stringpool ();
4258 init_ttree ();
4259
4260 /* Initialize register usage now so switches may override. */
4261 init_reg_sets ();
4262
4263 /* Register the language-independent parameters. */
4264 add_params (lang_independent_params, LAST_PARAM);
4265
4266 /* This must be done after add_params but before argument processing. */
4267 init_ggc_heuristics();
4268 }

At line 4218, on platforms that have sbrk, xmalloc_set_program_name saves the current location of the program break in first_break. If a memory allocation fails later, this value can be used, together with the name saved at line 102, to form the diagnostic message.

98 void
99 xmalloc_set_program_name (s) in xmalloc.c
100 const char *s;
101 {
102 name = s;
103 #ifdef HAVE_SBRK
104 /* Win32 ports other than cygwin32 don't have brk() */
105 if (first_break == NULL)
106 first_break = (char *) sbrk (0);
107 #endif /* HAVE_SBRK */
108 }

At line 4220, hex_init is empty except on hosts that do not use ASCII, such as IBM and compatible systems that use EBCDIC. On such a system, hex_init fills the array _hex_value at run time so that it maps each hexadecimal digit character ('0' to '9', 'a' to 'f' and 'A' to 'F') to its numeric value. On ASCII systems the table is already initialized statically.
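The idea behind such a table can be sketched as follows. This is a minimal illustration, not the libiberty code: the names sketch_hex_init, sketch_hex_value and HEX_BAD are hypothetical, and the "bad digit" marker value is an assumption of the sketch. Because the table is filled from character literals, it comes out right whatever character set the host uses.

```c
#include <assert.h>
#include <limits.h>

/* Marker for "not a hex digit"; the value is chosen for this sketch. */
#define HEX_BAD 99

static unsigned char hex_table[UCHAR_MAX + 1];

/* Fill the table at run time from character literals, so the mapping
   is correct on ASCII and non-ASCII hosts alike. */
void sketch_hex_init(void)
{
    static const char lower[] = "0123456789abcdef";
    static const char upper[] = "ABCDEF";
    int i;

    for (i = 0; i <= UCHAR_MAX; i++)
        hex_table[i] = HEX_BAD;
    for (i = 0; i < 16; i++)
        hex_table[(unsigned char) lower[i]] = (unsigned char) i;
    /* Upper-case letters map to the same values as lower-case ones. */
    for (i = 0; i < 6; i++)
        hex_table[(unsigned char) upper[i]] = (unsigned char) (10 + i);
}

/* Return the numeric value of hex digit C, or HEX_BAD. */
int sketch_hex_value(int c)
{
    assert(c >= 0 && c <= UCHAR_MAX);
    return hex_table[c];
}
```

After calling sketch_hex_init once, a caller can convert a digit with sketch_hex_value and compare the result against HEX_BAD to reject anything that is not a hexadecimal digit.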
Next, at line 4222, gcc_init_libintl initializes the translation library for GCC (this depends on the character sets supported by the terminal).
At line 4252, host_hooks offers a mechanism that gives a platform the chance to run host-specific initialization. On Linux, extra_signals is an empty function.
At line 4256, init_ggc initializes the garbage collector of the compiler. Garbage collection in GCC is a big topic of its own; we will look at it later.

2.1. Initialize hashtable for identifier

In C++, identifiers are distinguished by their literal names: the front end always looks up the literal name in the current binding context, while the mangled name is only handed to the assembler and linker. To make name lookup efficient, GCC records identifiers in a hashtable, which init_stringpool initializes at line 4257.

57 void
58 init_stringpool (void) in stringpool.c
59 {
60 /* Create with 16K (2^14) entries. */
61 ident_hash = ht_create (14);
62 ident_hash->alloc_node = alloc_node;
63 gcc_obstack_init (&string_stack);
64 }

ident_hash, at line 61 above, is a global pointer to an object of type ht. The field stack, at line 46 below, is an obstack; the names of all identifiers are allocated from it.

49 struct ht *ident_hash; in stringpool.c

43 struct ht in hashtable.h
44 {
45 /* Identifiers are allocated from here. */
46 struct obstack stack;
47
48 hashnode *entries;
49 /* Call back. */
50 hashnode (*alloc_node) (hash_table *);
51
52 unsigned int nslots; /* Total slots in the entries array. */
53 unsigned int nelements; /* Number of live elements. */
54
55 /* Link to reader, if any. For the benefit of cpplib. */
56 struct cpp_reader *pfile;
57
58 /* Table usage statistics. */
59 unsigned int searches;
60 unsigned int collisions;
61 };

hashnode, at line 48 above, is a pointer to the ht_identifier defined at line 27 below.

37 typedef struct ht hash_table; in hashtable.h
38 typedef struct ht_identifier *hashnode;

26 typedef struct ht_identifier ht_identifier; in hashtable.h
27 struct ht_identifier GTY(())
28 {
29 const unsigned char *str;
30 unsigned int len;
31 unsigned int hash_value;
32 };

At line 27, GTY(()) places the type under the control of the garbage collector; we will come back to this topic later. As the first step in creating the hashtable, ht_create is invoked to create a table with 16K entries.

53 hash_table *
54 ht_create (unsigned int order) in hashtable.c
55 {
56 unsigned int nslots = 1 << order;
57 hash_table *table;
58
59 table = xcalloc (1, sizeof (hash_table));
60
61 /* Strings need no alignment. */
62 _obstack_begin (&table->stack, 0, 0,
63 (void *(*) (long)) xmalloc,
64 (void (*) (void *)) free);
65
66 obstack_alignment_mask (&table->stack) = 0;
67
68 table->entries = xcalloc (nslots, sizeof (hashnode));
69 table->nslots = nslots;
70 return table;
71 }

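obstack_alignment_mask at line 66 sets the alignment mask of the obstack. Setting it to 0 overrides the default mask installed by _obstack_begin, because strings need no alignment. (Notice that this is the requirement of the stack; a type has its own alignment as well, and the alignment finally applied is the larger of the two.)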

2.1.1. Memory Management of Obstack

2.1.1.1. Initialize the control block

The stack object in the hashtable is of type obstack, which is defined as follows. It is the control block that manages the stack.

168 struct obstack /* control current object in current chunk */ in obstack.h
169 {
170 long chunk_size; /* preferred size to allocate chunks in */
171 struct _obstack_chunk *chunk; /* address of current struct obstack_chunk */
172 char *object_base; /* address of object we are building */
173 char *next_free; /* where to add next char to current object */
174 char *chunk_limit; /* address of char after current chunk */
175 PTR_INT_TYPE temp; /* Temporary for some macros. */
176 int alignment_mask; /* Mask of alignment for each object. */
177 #if defined __STDC__ && __STDC__
178 /* These prototypes vary based on `use_extra_arg', and we use
179 casts to the prototypeless function type in all assignments,
180 but having prototypes here quiets -Wstrict-prototypes. */
181 struct _obstack_chunk *(*chunkfun) (void *, long);
182 void (*freefun) (void *, struct _obstack_chunk *);
183 void *extra_arg; /* first arg for chunk alloc/dealloc funcs */
184 #else
185 struct _obstack_chunk *(*chunkfun) (); /* User's fcn to allocate a chunk. */
186 void (*freefun) (); /* User's function to free a chunk. */
187 char *extra_arg; /* first arg for chunk alloc/dealloc funcs */
188 #endif
189 unsigned use_extra_arg:1; /* chunk alloc/dealloc funcs take extra arg */
190 unsigned maybe_empty_object:1;/* There is a possibility that the current
191 chunk contains a zero-length object. This
192 prevents freeing the chunk if we allocate
193 a bigger chunk to replace it. */
194 unsigned alloc_failed:1; /* No longer used, as we now call the failed
195 handler on error, but retained for binary
196 compatibility. */
197 };

To manage memory with an obstack, we first need to initialize the control block with an alignment requirement and with the functions that allocate and free chunks. Here _obstack_begin is passed a size of 0 (use the default chunk size), an alignment of 0 (use the default alignment), xmalloc as chunkfun, and free as freefun.

150 int
151 _obstack_begin (h, size, alignment, chunkfun, freefun) in obstack.c
152 struct obstack *h;
153 int size;
154 int alignment;
155 #if defined (__STDC__) && __STDC__
156 POINTER (*chunkfun) (long);
157 void (*freefun) (void *);
158 #else
159 POINTER (*chunkfun) ();
160 void (*freefun) ();
161 #endif
162 {
163 register struct _obstack_chunk *chunk; /* points to new chunk */
164
165 if (alignment == 0)
166 alignment = (int) DEFAULT_ALIGNMENT;
167 if (size == 0)
168 /* Default size is what GNU malloc can fit in a 4096-byte block. */
169 {
170 /* 12 is sizeof (mhead) and 4 is EXTRA from GNU malloc.
171 Use the values for range checking, because if range checking is off,
172 the extra bytes won't be missed terribly, but if range checking is on
173 and we used a larger request, a whole extra 4096 bytes would be
174 allocated.
175
176 These number are irrelevant to the new GNU malloc. I suspect it is
177 less sensitive to the size of the request. */
178 int extra = ((((12 + DEFAULT_ROUNDING - 1) & ~(DEFAULT_ROUNDING - 1))
179 + 4 + DEFAULT_ROUNDING - 1)
180 & ~(DEFAULT_ROUNDING - 1));
181 size = 4096 - extra;
182 }
183
184 #if defined (__STDC__) && __STDC__
185 h->chunkfun = (struct _obstack_chunk * (*)(void *, long)) chunkfun;
186 h->freefun = (void (*) (void *, struct _obstack_chunk *)) freefun;
187 #else
188 h->chunkfun = (struct _obstack_chunk * (*)()) chunkfun;
189 h->freefun = freefun;
190 #endif
191 h->chunk_size = size;
192 h->alignment_mask = alignment - 1;
193 h->use_extra_arg = 0;
194
195 chunk = h->chunk = CALL_CHUNKFUN (h, h -> chunk_size);
196 if (!chunk)
197 (*obstack_alloc_failed_handler) ();
198 h->next_free = h->object_base = chunk->contents;
199 h->chunk_limit = chunk->limit
200 = (char *) chunk + h->chunk_size;
201 chunk->prev = 0;
202 /* The initial chunk now contains no empty object. */
203 h->maybe_empty_object = 0;
204 h->alloc_failed = 0;
205 return 1;
206 }

In the definitions below, DEFAULT_ALIGNMENT uses fooalign to find the alignment of double, which is 4 on x86 Linux. DEFAULT_ROUNDING uses fooround to determine the rounding applied to the first chunk, which is 8 on x86 Linux.

62 struct fooalign {char x; double d;}; in obstack.h
63 #define DEFAULT_ALIGNMENT \
64 ((PTR_INT_TYPE) ((char *) &((struct fooalign *) 0)->d - (char *) 0))
65 /* If malloc were really smart, it would round addresses to DEFAULT_ALIGNMENT.
66 But in fact it might be less smart and round addresses to as much as
67 DEFAULT_ROUNDING. So we prepare for it to do that. */
68 union fooround {long x; double d;};
69 #define DEFAULT_ROUNDING (sizeof (union fooround))

Since we would like an allocated chunk to occupy exactly one page (normally 4K), the recommended chunk size must take into account the memory malloc uses for housekeeping. At line 178 above, as the comment indicates, 12 bytes are used by struct mhead inside malloc and 4 bytes are extra. Both parts are rounded up to a multiple of DEFAULT_ROUNDING. With a rounding of 8 the overhead is 24 bytes, so the default chunk size is 4096 - 24 = 4072 bytes.
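The arithmetic can be checked with a small sketch. The function and type names here are hypothetical; only the formula is taken from lines 178 to 181 of _obstack_begin, and the last two helpers simply report what the fooalign and fooround tricks evaluate to on the host where the code is compiled.

```c
#include <assert.h>
#include <stddef.h>

/* Same shapes as fooalign and fooround, under names local to this sketch. */
struct sketch_align { char x; double d; };
union sketch_round { long x; double d; };

/* Round N up to a multiple of ROUNDING, which must be a power of two. */
static long round_up(long n, long rounding)
{
    assert(rounding > 0 && (rounding & (rounding - 1)) == 0);
    return (n + rounding - 1) & ~(rounding - 1);
}

/* Bytes set aside for malloc bookkeeping: a 12-byte header plus 4 extra
   bytes, each rounded up to ROUNDING (the formula at lines 178-180). */
long sketch_malloc_overhead(long rounding)
{
    return round_up(round_up(12, rounding) + 4, rounding);
}

/* Default chunk size: what is left of a 4096-byte block (line 181). */
long sketch_default_chunk_size(long rounding)
{
    return 4096 - sketch_malloc_overhead(rounding);
}

/* Host values corresponding to DEFAULT_ALIGNMENT and DEFAULT_ROUNDING. */
size_t sketch_default_alignment(void)
{
    return offsetof(struct sketch_align, d);
}

size_t sketch_default_rounding(void)
{
    return sizeof(union sketch_round);
}
```

With a rounding of 8, the header rounds up from 12 to 16, adding 4 gives 20, and rounding again gives 24, which matches the figure in the text.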
The object returned by the chunk allocation is an _obstack_chunk, which is defined as follows.

161 struct _obstack_chunk /* Lives at front of each chunk. */ in obstack.h
162 {
163 char *limit; /* 1 past end of this chunk */
164 struct _obstack_chunk *prev; /* address of prior chunk or NULL */
165 char contents[4]; /* objects begin here */
166 };

As its name suggests, an obstack should only be used as a stack. Within a chunk, the stack grows from contents until it reaches limit, the end of the chunk. Chunks are linked through the prev field, each one pointing to the chunk allocated before it.

2.1.1.2. Allocate from obstack

The macro below allocates length bytes from the obstack h. The obstack_grow family of macros can also be used to build an object that needs no alignment, such as a string, in the obstack.

566 #define obstack_alloc(h,length) \ in obstack.h
567 (obstack_blank ((h), (length)), obstack_finish ((h)))

560 #define obstack_blank(h,length) \ in obstack.h
561 ( (h)->temp = (length), \
562 (((h)->chunk_limit - (h)->next_free < (h)->temp) \
563 ? (_obstack_newchunk ((h), (h)->temp), 0) : 0), \
564 obstack_blank_fast (h, (h)->temp))

When the remaining space is big enough, only the following step is executed.

348 #define obstack_blank_fast(h,n) ((h)->next_free += (n)) in obstack.h

The macro above makes it possible to detect an empty object. See line 576 below: if next_free still equals object_base after obstack_blank_fast, an empty object was requested. Once the memory request can be satisfied, obstack_finish must be invoked to complete the allocation.

575 #define obstack_finish(h) \ in obstack.h
576 ( ((h)->next_free == (h)->object_base \
577 ? (((h)->maybe_empty_object = 1), 0) \
578 : 0), \
579 (h)->temp = __PTR_TO_INT ((h)->object_base), \
580 (h)->next_free \
581 = __INT_TO_PTR ((__PTR_TO_INT ((h)->next_free)+(h)->alignment_mask) \
582 & ~ ((h)->alignment_mask)), \
583 (((h)->next_free - (char *) (h)->chunk \
584 > (h)->chunk_limit - (char *) (h)->chunk) \
585 ? ((h)->next_free = (h)->chunk_limit) : 0), \
586 (h)->object_base = (h)->next_free, \
587 __INT_TO_PTR ((h)->temp))

Above, because next_free must be aligned according to alignment_mask, the resulting next_free may fall outside the chunk. In that case some of the padding bytes cannot be allocated, but this does not affect the object just allocated; next_free is simply set to the end of the chunk. Notice that line 587 gives the value of obstack_finish, and therefore also of obstack_alloc: a pointer to the start of the object just allocated.
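The rounding done at lines 581 to 585 can be sketched on plain integers. The function names are hypothetical and addresses are modelled as uintptr_t values, which is an assumption of this sketch rather than what the macro does with __PTR_TO_INT.

```c
#include <assert.h>
#include <stdint.h>

/* Round ADDR up to the next multiple of (MASK + 1).  MASK + 1 must be a
   power of two; a MASK of 0 means no alignment is applied. */
uintptr_t sketch_align_up(uintptr_t addr, uintptr_t mask)
{
    assert((mask & (mask + 1)) == 0);
    return (addr + mask) & ~mask;
}

/* Mimic lines 581-585: align NEXT_FREE, then clamp it to the chunk
   limit if the padding would run past the end of the chunk. */
uintptr_t sketch_finish_next_free(uintptr_t next_free, uintptr_t limit,
                                  uintptr_t mask)
{
    uintptr_t aligned = sketch_align_up(next_free, mask);
    return aligned > limit ? limit : aligned;
}
```

With a mask of 7, an address of 13 is rounded up to 16, while an address that is already a multiple of 8 is left unchanged. With a mask of 0, as ht_create sets for the identifier table, every address is left as it is.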
When the space left in the current chunk cannot satisfy a request of length bytes (line 562 above), a new chunk big enough for the request must be allocated. Notice line 282 below: when allocating with obstack_alloc, next_free and object_base are always equal at this point, so obj_size is 0. With obstack_grow, obj_size is the memory requested so far but not yet finished, because obstack_grow may run several times before obstack_finish is invoked. At line 287, we allocate not just the required memory but somewhat more, to leave room for further requests, especially from obstack_grow. Furthermore, the new chunk should be no smaller than chunk_size (4K - 24).

274 void
275 _obstack_newchunk (h, length) in obstack.c
276 struct obstack *h;
277 int length;
278 {
279 register struct _obstack_chunk *old_chunk = h->chunk;
280 register struct _obstack_chunk *new_chunk;
281 register long new_size;
282 register long obj_size = h->next_free - h->object_base;
283 register long i;
284 long already;
285
286 /* Compute size for new chunk. */
287 new_size = (obj_size + length) + (obj_size >> 3) + 100;
288 if (new_size < h->chunk_size)
289 new_size = h->chunk_size;
290
291 /* Allocate and initialize the new chunk. */
292 new_chunk = CALL_CHUNKFUN (h, new_size);
293 if (!new_chunk)
294 (*obstack_alloc_failed_handler) ();
295 h->chunk = new_chunk;
296 new_chunk->prev = old_chunk;
297 new_chunk->limit = h->chunk_limit = (char *) new_chunk + new_size;
298
299 /* Move the existing object to the new chunk.
300 Word at a time is fast and is safe if the object
301 is sufficiently aligned. */
302 if (h->alignment_mask + 1 >= DEFAULT_ALIGNMENT)
303 {
304 for (i = obj_size / sizeof (COPYING_UNIT) - 1;
305 i >= 0; i--)
306 ((COPYING_UNIT *)new_chunk->contents)[i]
307 = ((COPYING_UNIT *)h->object_base)[i];
308 /* We used to copy the odd few remaining bytes as one extra COPYING_UNIT,
309 but that can cross a page boundary on a machine
310 which does not do strict alignment for COPYING_UNITS. */
311 already = obj_size / sizeof (COPYING_UNIT) * sizeof (COPYING_UNIT);
312 }
313 else
314 already = 0;
315 /* Copy remaining bytes one by one. */
316 for (i = already; i < obj_size; i++)
317 new_chunk->contents[i] = h->object_base[i];
318
319 /* If the object just copied was the only data in OLD_CHUNK,
320 free that chunk and remove it from the chain.
321 But not if that chunk might contain an empty object. */
322 if (h->object_base == old_chunk->contents && ! h->maybe_empty_object)
323 {
324 new_chunk->prev = old_chunk->prev;
325 CALL_FREEFUN (h, old_chunk);
326 }
327
328 h->object_base = new_chunk->contents;
329 h->next_free = h->object_base + obj_size;
330 /* The new chunk certainly contains no empty object yet. */
331 h->maybe_empty_object = 0;
332 }

The unfinished object must then be copied into the new chunk. Remember that contents is always the start of the chunk's buffer, and object_base refers to the start of the unfinished object. So if the two are equal in the old chunk and the chunk cannot contain an empty object, the chunk holds nothing but this object and can now be released.
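The sizing rule at lines 287 to 289 is easy to try in isolation. The function name below is hypothetical; the formula is the one in the listing: the pending object plus the new request, one eighth of the pending object as slack, and 100 more bytes, but never less than the preferred chunk size.

```c
#include <assert.h>

/* Size of the chunk _obstack_newchunk would request, given the size of
   the unfinished object, the new request, and the preferred chunk size. */
long sketch_new_chunk_size(long obj_size, long length, long chunk_size)
{
    long new_size;

    assert(obj_size >= 0 && length >= 0 && chunk_size > 0);
    new_size = (obj_size + length) + (obj_size >> 3) + 100;
    if (new_size < chunk_size)
        new_size = chunk_size;
    return new_size;
}
```

For a small request with no pending object the result is just the preferred chunk size. For an 8000-byte object being grown by another 100 bytes, the request is 8100 + 1000 + 100 = 9200 bytes.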

2.1.1.3. Free objects

Because an obstack is a stack, freeing an object also frees every object allocated after it.

597 #define obstack_free(h,obj) \ in obstack.h
598 ( (h)->temp = (char *) (obj) - (char *) (h)->chunk, \
599 (((h)->temp > 0 && (h)->temp < (h)->chunk_limit - (char *) (h)->chunk)\
600 ? (int) ((h)->next_free = (h)->object_base \
601 = (h)->temp + (char *) (h)->chunk) \
602 : (_obstack_free ((h), (h)->temp + (char *) (h)->chunk), 0)))

If the object does not belong to the current chunk, it must lie in another chunk. As we have seen, chunks are linked through prev, and the closer a chunk is to the current one, the newer it is. By the LIFO rule of a stack, every chunk visited through prev before the target chunk is found contains only objects pushed after the object of interest, so those chunks are popped (that is, freed).

372 void
373 _obstack_free (h, obj) in obstack.c
374 struct obstack *h;
375 POINTER obj;
376 {
377 register struct _obstack_chunk *lp; /* below addr of any objects in this chunk */
378 register struct _obstack_chunk *plp; /* point to previous chunk if any */
379
380 lp = h->chunk;
381 /* We use >= because there cannot be an object at the beginning of a chunk.
382 But there can be an empty object at that address
383 at the end of another chunk. */
384 while (lp != 0 && ((POINTER) lp >= obj || (POINTER) (lp)->limit < obj))
385 {
386 plp = lp->prev;
387 CALL_FREEFUN (h, lp);
388 lp = plp;
389 /* If we switch chunks, we can't tell whether the new current
390 chunk contains an empty object, so assume that it may. */
391 h->maybe_empty_object = 1;
392 }
393 if (lp)
394 {
395 h->object_base = h->next_free = (char *) (obj);
396 h->chunk_limit = lp->limit;
397 h->chunk = lp;
398 }
399 else if (obj != 0)
400 /* obj is not in any of the chunks! */
401 abort ();
402 }

If any chunk is released, the target chunk becomes the current chunk. We cannot know whether it contains an empty object, so we assume that it does. As a result, _obstack_newchunk will not release this chunk as if it were empty. If such an empty object later needs to be freed, it is not an orphan and does not reach the abort at line 401.
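The LIFO behaviour across chunks can be sketched with a much smaller allocator. This is not the obstack implementation: mini_stack, mini_chunk and the mini_ functions are hypothetical names, there is no alignment, no object growing and no empty-object tracking, and chunks come straight from malloc. It only shows that freeing an object discards newer chunks and rewinds the one that holds the object.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct mini_chunk {
    struct mini_chunk *prev;   /* older chunk, or NULL */
    char *limit;               /* one past the end of this chunk */
    char data[];               /* objects start here */
};

struct mini_stack {
    struct mini_chunk *chunk;  /* current (newest) chunk */
    char *next_free;           /* next free byte in the current chunk */
    size_t chunk_size;         /* preferred payload size of a chunk */
};

void mini_init(struct mini_stack *s, size_t chunk_size)
{
    assert(s != NULL && chunk_size > 0);
    s->chunk = NULL;
    s->next_free = NULL;
    s->chunk_size = chunk_size;
}

/* Bump-allocate SIZE bytes, starting a new chunk when the current one
   is full. */
void *mini_alloc(struct mini_stack *s, size_t size)
{
    void *obj;

    if (s->chunk == NULL
        || (size_t) (s->chunk->limit - s->next_free) < size) {
        size_t payload = size > s->chunk_size ? size : s->chunk_size;
        struct mini_chunk *c = malloc(sizeof *c + payload);
        if (c == NULL)
            abort();
        c->prev = s->chunk;
        c->limit = c->data + payload;
        s->chunk = c;
        s->next_free = c->data;
    }
    obj = s->next_free;
    s->next_free += size;
    return obj;
}

/* Free OBJ and everything allocated after it.  NULL frees everything. */
void mini_free(struct mini_stack *s, void *obj)
{
    uintptr_t p = (uintptr_t) obj;

    /* Pop chunks until we reach the one that contains OBJ. */
    while (s->chunk != NULL
           && (p < (uintptr_t) s->chunk->data
               || p > (uintptr_t) s->chunk->limit)) {
        struct mini_chunk *older = s->chunk->prev;
        free(s->chunk);
        s->chunk = older;
    }
    if (s->chunk != NULL)
        s->next_free = obj;        /* rewind inside the target chunk */
    else if (obj != NULL)
        abort();                   /* OBJ was not in any chunk */
    else
        s->next_free = NULL;
}

/* Number of chunks currently linked through prev. */
int mini_chunk_count(const struct mini_stack *s)
{
    const struct mini_chunk *c;
    int n = 0;

    for (c = s->chunk; c != NULL; c = c->prev)
        n++;
    return n;
}
```

With a chunk size of 16, three 8-byte allocations use two chunks. Freeing the second object releases the newer chunk, and the next 8-byte allocation returns the address the second object had.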

2.1.2. The allocation function

The function that allocates a node for the table is alloc_node, as set at line 62 in init_stringpool.

67 static hashnode
68 alloc_node (hash_table *table ATTRIBUTE_UNUSED) in stringpool.c
69 {
70 return GCC_IDENT_TO_HT_IDENT (make_node (IDENTIFIER_NODE));
71 }

Since make_node is called with IDENTIFIER_NODE, it uses lang_hooks.identifier_size to decide the size of the node. For C++, this is the default value defined in langhooks-def.h, sizeof (struct lang_identifier). lang_identifier is itself language specific; for C++ it is defined as follows.

220 struct lang_identifier GTY(()) in cp-tree.h
221 {
222 struct c_common_identifier c_common;
223 cxx_binding *namespace_bindings;
224 cxx_binding *bindings;
225 tree class_value;
226 tree class_template_info;
227 tree label_value;
228 tree implicit_decl;
229 tree error_locus;
230 };

In the definition of c_common_identifier, notice that its first member is tree_common.

180 struct c_common_identifier GTY(()) in c-common.h
181 {
182 struct tree_common common;
183 struct cpp_hashnode node;
184 };

Besides, in lang_identifier, the field namespace_bindings is the list of namespace bindings that define the identifier, and the field bindings is the list of non-namespace scopes in which a definition appears. Later we will see that this arrangement is needed for name lookup.

2.1.2.1. How an identifier node is represented in the tree and the hashtable

make_node above returns the newly created lang_identifier object. GCC_IDENT_TO_HT_IDENT casts it to tree_identifier and returns the address of its ht_identifier part.

753 #define HT_IDENT_TO_GCC_IDENT(NODE) \ in tree.h
754 ((tree) ((char *) (NODE) - sizeof (struct tree_common)))
755 #define GCC_IDENT_TO_HT_IDENT(NODE) (&((struct tree_identifier *) (NODE))->id)

In the definition of tree_identifier, we can see that its first member is also tree_common. But how does its second member compare with cpp_hashnode?

757 struct tree_identifier GTY(()) in tree.h
758 {
759 struct tree_common common;
760 struct ht_identifier id;
761 };

Not surprisingly, the first member of cpp_hashnode is an ht_identifier. As we have found, identifiers not only reside in the tree but are also organized in the hashtable ident_hash. So the first part of lang_identifier lets the object live in the tree, and the second part lets it live in the hashtable.

478 struct cpp_hashnode GTY(()) in cpplib.h
479 {
480 struct ht_identifier ident;
481 unsigned int is_directive : 1;
482 unsigned int directive_index : 7; /* If is_directive,
483 then index into directive table.
484 Otherwise, a NODE_OPERATOR. */
485 unsigned char rid_code; /* Rid code - for front ends. */
486 ENUM_BITFIELD(node_type) type : 8; /* CPP node type. */
487 unsigned char flags; /* CPP flags. */
488
489 union _cpp_hashnode_value
490 {
491 /* If a macro. */
492 cpp_macro * GTY((skip (""))) macro;
493 /* Answers to an assertion. */
494 struct answer * GTY ((skip (""))) answers;
495 /* Code for a builtin macro. */
496 enum builtin_type GTY ((tag ("1"))) builtin;
497 /* Macro argument index. */
498 unsigned short GTY ((tag ("0"))) arg_index;
499 } GTY ((desc ("0"))) value;
500 };

When the lexer creates a cpp_token (which represents one token) for an identifier, the token refers to a cpp_hashnode. cpp_hashnode is used mainly by the preprocessor.
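The layout trick behind GCC_IDENT_TO_HT_IDENT and HT_IDENT_TO_GCC_IDENT can be sketched with stand-in types. All names below are hypothetical and the structures are far smaller than the real ones. The sketch also uses offsetof where the real macro subtracts sizeof (struct tree_common), which avoids relying on there being no padding between the two members.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for tree_common and ht_identifier. */
struct sketch_common { int code; void *chain; };
struct sketch_ident {
    const unsigned char *str;
    unsigned int len;
    unsigned int hash_value;
};

/* Stand-in for tree_identifier: tree part first, hashtable part second. */
struct sketch_tree_identifier {
    struct sketch_common common;
    struct sketch_ident id;
};

/* Stand-in for a language-specific identifier.  It starts with the same
   two parts, so a pointer to it can be viewed as a tree identifier. */
struct sketch_lang_identifier {
    struct sketch_common common;
    struct sketch_ident id;
    void *bindings;
};

/* From the tree view to the hashtable view. */
struct sketch_ident *sketch_tree_to_ident(void *node)
{
    assert(node != NULL);
    return &((struct sketch_tree_identifier *) node)->id;
}

/* From the hashtable view back to the tree view. */
void *sketch_ident_to_tree(struct sketch_ident *id)
{
    assert(id != NULL);
    return (char *) id - offsetof(struct sketch_tree_identifier, id);
}
```

Converting a node to its hashtable view and back yields the original address, which is what allows the same object to be stored in the hashtable as a hashnode and handled elsewhere as a tree.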

2.1.3. Searching in the hashtable

The lookup function for the hashtable is cpp_lookup.

91 cpp_hashnode *
92 cpp_lookup (cpp_reader *pfile, const unsigned char *str, unsigned int len) in cpphash.c
93 {
94 /* ht_lookup cannot return NULL. */
95 return CPP_HASHNODE (ht_lookup (pfile->hash_table, str, len, HT_ALLOC));
96 }

At line 95, ht_lookup returns a hashnode, and CPP_HASHNODE casts it back to cpp_hashnode *.

470 #define CPP_HASHNODE(HNODE) ((cpp_hashnode *) (HNODE)) in cpplib.h
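To make the "look up, and allocate if missing" behaviour concrete, here is a minimal sketch of a table with a power-of-two number of slots. It is not the code in hashtable.c: the sk_ names are hypothetical, the hash function is an arbitrary choice, collisions are resolved by plain linear probing, and the table never grows, all of which are simplifying assumptions of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sk_node { char *str; size_t len; unsigned int hash; };

struct sk_table {
    struct sk_node **entries;
    unsigned int nslots;     /* always a power of two */
    unsigned int nelements;  /* number of live entries */
};

enum sk_option { SK_NO_INSERT, SK_ALLOC };

/* Create a table with 2^ORDER empty slots. */
struct sk_table *sk_create(unsigned int order)
{
    struct sk_table *t = calloc(1, sizeof *t);
    if (t == NULL)
        abort();
    t->nslots = 1u << order;
    t->entries = calloc(t->nslots, sizeof *t->entries);
    if (t->entries == NULL)
        abort();
    return t;
}

/* Arbitrary string hash, chosen only for this sketch. */
static unsigned int sk_hash(const char *s, size_t len)
{
    unsigned int h = 0;
    while (len-- > 0)
        h = h * 31 + (unsigned char) *s++;
    return h;
}

/* Find the node for S.  With SK_ALLOC a missing node is created, so the
   result is never NULL; with SK_NO_INSERT a miss returns NULL. */
struct sk_node *sk_lookup(struct sk_table *t, const char *s, size_t len,
                          enum sk_option opt)
{
    unsigned int hash = sk_hash(s, len);
    unsigned int mask = t->nslots - 1;
    unsigned int i = hash & mask;
    struct sk_node *n;

    /* The sketch never grows the table, so one slot must stay empty. */
    assert(t->nelements < t->nslots);
    while ((n = t->entries[i]) != NULL) {
        if (n->hash == hash && n->len == len
            && memcmp(n->str, s, len) == 0)
            return n;
        i = (i + 1) & mask;   /* linear probing */
    }
    if (opt == SK_NO_INSERT)
        return NULL;
    n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->str = malloc(len + 1);
    if (n->str == NULL)
        abort();
    memcpy(n->str, s, len);
    n->str[len] = '\0';
    n->len = len;
    n->hash = hash;
    t->entries[i] = n;
    t->nelements++;
    return n;
}
```

Looking up the same name twice with SK_ALLOC returns the same node, which is the property that lets identifiers be compared by pointer once they have been entered in the table.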