<div class="section-level-extent" id="Type-encoding"> <div class="nav-panel"> <p> Next: <a href="garbage-collection" accesskey="n" rel="next">Garbage Collection</a>, Previous: <a href="executing-code-before-main" accesskey="p" rel="prev"><code class="code">+load</code>: Executing Code before <code class="code">main</code></a>, Up: <a href="objective-c" accesskey="u" rel="up">GNU Objective-C Features</a> [<a href="index#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="indices" title="Index" rel="index">Index</a>]</p> </div>  <h1 class="section" id="Type-Encoding"><span>8.3 Type Encoding<a class="copiable-link" href="#Type-Encoding"> ¶</a></span></h1> <p>This is an advanced section. Type encodings are used extensively by the compiler and by the runtime, but you generally do not need to know about them to use Objective-C. </p> <p>The Objective-C compiler generates type encodings for all the types. These type encodings are used at runtime to find out information about selectors and methods and about objects and classes. </p> <p>The types are encoded in the following way: </p> <table class="multitable"> <tbody>
<tr>
<td width="25%"><code class="code">_Bool</code></td>
<td width="75%"><code class="code">B</code></td>
</tr> <tr>
<td width="25%"><code class="code">char</code></td>
<td width="75%"><code class="code">c</code></td>
</tr> <tr>
<td width="25%"><code class="code">unsigned char</code></td>
<td width="75%"><code class="code">C</code></td>
</tr> <tr>
<td width="25%"><code class="code">short</code></td>
<td width="75%"><code class="code">s</code></td>
</tr> <tr>
<td width="25%"><code class="code">unsigned short</code></td>
<td width="75%"><code class="code">S</code></td>
</tr> <tr>
<td width="25%"><code class="code">int</code></td>
<td width="75%"><code class="code">i</code></td>
</tr> <tr>
<td width="25%"><code class="code">unsigned int</code></td>
<td width="75%"><code class="code">I</code></td>
</tr> <tr>
<td width="25%"><code class="code">long</code></td>
<td width="75%"><code class="code">l</code></td>
</tr> <tr>
<td width="25%"><code class="code">unsigned long</code></td>
<td width="75%"><code class="code">L</code></td>
</tr> <tr>
<td width="25%"><code class="code">long long</code></td>
<td width="75%"><code class="code">q</code></td>
</tr> <tr>
<td width="25%"><code class="code">unsigned long long</code></td>
<td width="75%"><code class="code">Q</code></td>
</tr> <tr>
<td width="25%"><code class="code">float</code></td>
<td width="75%"><code class="code">f</code></td>
</tr> <tr>
<td width="25%"><code class="code">double</code></td>
<td width="75%"><code class="code">d</code></td>
</tr> <tr>
<td width="25%"><code class="code">long double</code></td>
<td width="75%"><code class="code">D</code></td>
</tr> <tr>
<td width="25%"><code class="code">void</code></td>
<td width="75%"><code class="code">v</code></td>
</tr> <tr>
<td width="25%"><code class="code">id</code></td>
<td width="75%"><code class="code">@</code></td>
</tr> <tr>
<td width="25%"><code class="code">Class</code></td>
<td width="75%"><code class="code">#</code></td>
</tr> <tr>
<td width="25%"><code class="code">SEL</code></td>
<td width="75%"><code class="code">:</code></td>
</tr> <tr>
<td width="25%"><code class="code">char*</code></td>
<td width="75%"><code class="code">*</code></td>
</tr> <tr>
<td width="25%"><code class="code">enum</code></td>
<td width="75%">An <code class="code">enum</code> is encoded exactly as the integer type that the compiler uses for it, which depends on the enumeration values. Often the compiler uses <code class="code">unsigned int</code>, which is then encoded as <code class="code">I</code>.</td>
</tr> <tr>
<td width="25%">unknown type</td>
<td width="75%"><code class="code">?</code></td>
</tr> <tr>
<td width="25%">Complex types</td>
<td width="75%">
<code class="code">j</code> followed by the inner type. For example, <code class="code">_Complex double</code> is encoded as <code class="code">jd</code>.</td>
</tr> <tr>
<td width="25%">bit-fields</td>
<td width="75%">
<code class="code">b</code> followed by the starting position of the bit-field, the type of the bit-field and the size of the bit-field (this bit-field encoding differs from the NeXT compiler’s encoding; see below)</td>
</tr> </tbody> </table> <p>The encoding of bit-fields has changed to allow bit-fields to be properly handled by the runtime functions that compute sizes and alignments of types that contain bit-fields. The previous encoding contained only the size of the bit-field. With only this information it is not possible to reliably compute the size occupied by the bit-field. This matters particularly with the Boehm garbage collector, because objects are allocated using the typed memory facility available in this collector. The typed memory allocation requires information about where the pointers are located inside the object. </p> <p>The position of the bit-field is the position, counted in bits, of its bit closest to the beginning of the structure. </p> <p>The non-atomic types are encoded as follows: </p> <table class="multitable"> <tbody>
<tr>
<td width="20%">pointers</td>
<td width="80%">‘<samp class="samp">^</samp>’ followed by the pointed-to type.</td>
</tr> <tr>
<td width="20%">arrays</td>
<td width="80%">‘<samp class="samp">[</samp>’ followed by the number of elements in the array followed by the type of the elements followed by ‘<samp class="samp">]</samp>’</td>
</tr> <tr>
<td width="20%">structures</td>
<td width="80%">‘<samp class="samp">{</samp>’ followed by the name of the structure (or ‘<samp class="samp">?</samp>’ if the structure is unnamed), the ‘<samp class="samp">=</samp>’ sign, the types of the members, followed by ‘<samp class="samp">}</samp>’</td>
</tr> <tr>
<td width="20%">unions</td>
<td width="80%">‘<samp class="samp">(</samp>’ followed by the name of the union (or ‘<samp class="samp">?</samp>’ if the union is unnamed), the ‘<samp class="samp">=</samp>’ sign, the types of the members, followed by ‘<samp class="samp">)</samp>’</td>
</tr> <tr>
<td width="20%">vectors</td>
<td width="80%">‘<samp class="samp">![</samp>’ followed by the vector_size (the number of bytes composing the vector) followed by a comma, followed by the alignment (in bytes) of the vector, followed by the type of the elements followed by ‘<samp class="samp">]</samp>’</td>
</tr> </tbody> </table> <p>Here are some types and their encodings, as they are generated by the compiler on an i386 machine: </p>  <table class="multitable"> <thead><tr>
<th width="60%">Objective-C type</th>
<th width="40%">Compiler encoding</th>
</tr></thead> <tbody>
<tr>
<td width="60%"><div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">int a[10];</pre>
</div></td>
<td width="40%"><code class="code">[10i]</code></td>
</tr> <tr>
<td width="60%"><div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">struct {
  int i;
  float f[3];
  int a:3;
  int b:2;
  char c;
}</pre>
</div></td>
<td width="40%"><code class="code">{?=i[3f]b128i3b131i2c}</code></td>
</tr> <tr>
<td width="60%"><div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">int a __attribute__ ((vector_size (16)));</pre>
</div></td>
<td width="40%">
<code class="code">![16,16i]</code> (alignment depends on the machine)</td>
</tr> </tbody> </table>  <p>In addition to the types, the compiler also encodes the type specifiers. The table below describes the encoding of the current Objective-C type specifiers: </p>  <table class="multitable"> <thead><tr>
<th width="25%">Specifier</th>
<th width="75%">Encoding</th>
</tr></thead> <tbody>
<tr>
<td width="25%"><code class="code">const</code></td>
<td width="75%"><code class="code">r</code></td>
</tr> <tr>
<td width="25%"><code class="code">in</code></td>
<td width="75%"><code class="code">n</code></td>
</tr> <tr>
<td width="25%"><code class="code">inout</code></td>
<td width="75%"><code class="code">N</code></td>
</tr> <tr>
<td width="25%"><code class="code">out</code></td>
<td width="75%"><code class="code">o</code></td>
</tr> <tr>
<td width="25%"><code class="code">bycopy</code></td>
<td width="75%"><code class="code">O</code></td>
</tr> <tr>
<td width="25%"><code class="code">byref</code></td>
<td width="75%"><code class="code">R</code></td>
</tr> <tr>
<td width="25%"><code class="code">oneway</code></td>
<td width="75%"><code class="code">V</code></td>
</tr> </tbody> </table>  <p>The type specifiers are encoded just before the type. Unlike types, however, the type specifiers are only encoded when they appear in method argument types. </p> <p>Note how <code class="code">const</code> interacts with pointers: </p>  <table class="multitable"> <thead><tr>
<th width="25%">Objective-C type</th>
<th width="75%">Compiler encoding</th>
</tr></thead> <tbody>
<tr>
<td width="25%"><div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">const int</pre>
</div></td>
<td width="75%"><code class="code">ri</code></td>
</tr> <tr>
<td width="25%"><div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">const int*</pre>
</div></td>
<td width="75%"><code class="code">^ri</code></td>
</tr> <tr>
<td width="25%"><div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">int *const</pre>
</div></td>
<td width="75%"><code class="code">r^i</code></td>
</tr> </tbody> </table>  <p><code class="code">const int*</code> is a pointer to a <code class="code">const int</code>, and so is encoded as <code class="code">^ri</code>. <code class="code">int* const</code>, by contrast, is a <code class="code">const</code> pointer to an <code class="code">int</code>, and so is encoded as <code class="code">r^i</code>. </p> <p>Finally, there is a complication when encoding <code class="code">const char *</code> versus <code class="code">char * const</code>. Because <code class="code">char *</code> is encoded as <code class="code">*</code> and not as <code class="code">^c</code>, there is no way to express whether <code class="code">r</code> applies to the pointer or to the pointee. </p> <p>Hence, by convention, <code class="code">r*</code> means <code class="code">const
char *</code> (since it is what is most often meant), and there is no way to encode <code class="code">char *const</code>. <code class="code">char *const</code> would simply be encoded as <code class="code">*</code>, and the <code class="code">const</code> is lost. </p> <ul class="mini-toc"> <li><a href="legacy-type-encoding" accesskey="1">Legacy Type Encoding</a></li> <li><a href="_0040encode" accesskey="2"><code class="code">@encode</code></a></li> <li><a href="method-signatures" accesskey="3">Method Signatures</a></li> </ul> </div><div class="_attribution">
  <p class="_attribution-p">
    &copy; Free Software Foundation<br>Licensed under the GNU Free Documentation License, Version 1.3.<br>
    <a href="https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/Type-encoding.html" class="_attribution-link">https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/Type-encoding.html</a>
  </p>
</div>
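<p>The encoding rules described in this section can be exercised with a small parser. The following is a minimal sketch in plain C, not the runtime's own code: the names <code class="code">skip_digits</code>, <code class="code">skip_type</code> and <code class="code">count_members</code> are hypothetical helpers invented for this illustration, and the sketch assumes that a bit-field's type is always a single-character atomic code.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: advance past a run of decimal digits. */
static const char *skip_digits(const char *p)
{
    while (isdigit((unsigned char)*p))
        p++;
    return p;
}

/* Hypothetical helper: return a pointer just past one complete type
   encoding starting at p, or NULL if the text is malformed. */
const char *skip_type(const char *p)
{
    assert(p != NULL);

    /* Type specifiers (const, in, inout, out, bycopy, byref, oneway)
       are encoded just before the type. */
    while (*p != '\0' && strchr("rnNoORV", *p) != NULL)
        p++;

    switch (*p) {
    case '\0':
        return NULL;
    case '^':                 /* pointer: '^' then the pointed-to type */
    case 'j':                 /* complex: 'j' then the inner type */
        return skip_type(p + 1);
    case 'b':                 /* bit-field: 'b' position type size */
        p = skip_digits(p + 1);
        if (*p == '\0')
            return NULL;
        return skip_digits(p + 1);  /* assumes a one-character type */
    case '!':                 /* vector: "![" size ',' alignment type ']' */
        if (p[1] != '[')
            return NULL;
        p = skip_digits(p + 2);
        if (*p != ',')
            return NULL;
        p = skip_type(skip_digits(p + 1));
        return (p != NULL && *p == ']') ? p + 1 : NULL;
    case '[':                 /* array: '[' count type ']' */
        p = skip_type(skip_digits(p + 1));
        return (p != NULL && *p == ']') ? p + 1 : NULL;
    case '{':                 /* structure: '{' name '=' members '}' */
    case '(': {               /* union: '(' name '=' members ')' */
        char close = (*p == '{') ? '}' : ')';
        p++;
        while (*p != '\0' && *p != '=' && *p != close)
            p++;
        if (*p == '=')
            p++;
        while (p != NULL && *p != '\0' && *p != close)
            p = skip_type(p);
        return (p != NULL && *p == close) ? p + 1 : NULL;
    }
    default:                  /* atomic types from the first table */
        return strchr("BcCsSiIlLqQfdDv@#:*?", *p) != NULL ? p + 1 : NULL;
    }
}

/* Hypothetical helper: count the members of a structure or union
   encoding; returns -1 if enc is not a well-formed one. */
int count_members(const char *enc)
{
    assert(enc != NULL);
    if (*enc != '{' && *enc != '(')
        return -1;
    char close = (*enc == '{') ? '}' : ')';
    const char *p = strchr(enc, '=');
    if (p == NULL)
        return -1;
    p++;
    int n = 0;
    while (*p != '\0' && *p != close) {
        p = skip_type(p);
        if (p == NULL)
            return -1;
        n++;
    }
    return (*p == close) ? n : -1;
}
```

<p>With these definitions, <code class="code">count_members("{?=i[3f]b128i3b131i2c}")</code> yields 5 for the structure example shown earlier, and <code class="code">skip_type("^ri")</code> consumes the whole string, since the <code class="code">r</code> specifier is skipped before the pointed-to <code class="code">i</code>.</p>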