devdocs/gcc~13/x86-options.html



<div class="subsection-level-extent" id="x86-Options">  <h1 class="subsection" id="x86-Options-1"><span>3.19.54 x86 Options<a class="copiable-link" href="#x86-Options-1"> ¶</a></span></h1>  <p>These ‘<samp class="samp">-m</samp>’ options are defined for the x86 family of computers. </p> <dl class="table"> <dt>
<span><code class="code">-march=<var class="var">cpu-type</var></code><a class="copiable-link" href="#index-march-16"> ¶</a></span>
</dt> <dd>
<p>Generate instructions for the machine type <var class="var">cpu-type</var>. In contrast to <samp class="option">-mtune=<var class="var">cpu-type</var></samp>, which merely tunes the generated code for the specified <var class="var">cpu-type</var>, <samp class="option">-march=<var class="var">cpu-type</var></samp> allows GCC to generate code that may not run at all on processors other than the one indicated. Specifying <samp class="option">-march=<var class="var">cpu-type</var></samp> implies <samp class="option">-mtune=<var class="var">cpu-type</var></samp>, except where noted otherwise. </p> <p>The choices for <var class="var">cpu-type</var> are: </p> <dl class="table"> <dt>‘<samp class="samp">native</samp>’</dt> <dd>
<p>This selects the CPU to generate code for at compilation time by determining the processor type of the compiling machine. Using <samp class="option">-march=native</samp> enables all instruction subsets supported by the local machine (hence the result might not run on different machines). Using <samp class="option">-mtune=native</samp> produces code optimized for the local machine under the constraints of the selected instruction set. </p> </dd> <dt>‘<samp class="samp">x86-64</samp>’</dt> <dd>
<p>A generic CPU with 64-bit extensions. </p> </dd> <dt>‘<samp class="samp">x86-64-v2</samp>’</dt> <dt>‘<samp class="samp">x86-64-v3</samp>’</dt> <dt>‘<samp class="samp">x86-64-v4</samp>’</dt> <dd>
<p>These choices for <var class="var">cpu-type</var> select the corresponding micro-architecture level from the x86-64 psABI. On ABIs other than the x86-64 psABI they select the same CPU features as the x86-64 psABI documents for the particular micro-architecture level. </p> <p>Since these <var class="var">cpu-type</var> values do not have a corresponding <samp class="option">-mtune</samp> setting, using <samp class="option">-march</samp> with these values enables generic tuning. Specific tuning can be enabled using the <samp class="option">-mtune=<var class="var">other-cpu-type</var></samp> option with an appropriate <var class="var">other-cpu-type</var> value. </p> </dd> <dt>‘<samp class="samp">i386</samp>’</dt> <dd>
<p>Original Intel i386 CPU. </p> </dd> <dt>‘<samp class="samp">i486</samp>’</dt> <dd>
<p>Intel i486 CPU. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">i586</samp>’</dt> <dt>‘<samp class="samp">pentium</samp>’</dt> <dd>
<p>Intel Pentium CPU with no MMX support. </p> </dd> <dt>‘<samp class="samp">lakemont</samp>’</dt> <dd>
<p>Intel Lakemont MCU, based on Intel Pentium CPU. </p> </dd> <dt>‘<samp class="samp">pentium-mmx</samp>’</dt> <dd>
<p>Intel Pentium MMX CPU, based on Pentium core with MMX instruction set support. </p> </dd> <dt>‘<samp class="samp">pentiumpro</samp>’</dt> <dd>
<p>Intel Pentium Pro CPU. </p> </dd> <dt>‘<samp class="samp">i686</samp>’</dt> <dd>
<p>When used with <samp class="option">-march</samp>, the Pentium Pro instruction set is used, so the code runs on all i686 family chips. When used with <samp class="option">-mtune</samp>, it has the same meaning as ‘<samp class="samp">generic</samp>’. </p> </dd> <dt>‘<samp class="samp">pentium2</samp>’</dt> <dd>
<p>Intel Pentium II CPU, based on Pentium Pro core with MMX and FXSR instruction set support. </p> </dd> <dt>‘<samp class="samp">pentium3</samp>’</dt> <dt>‘<samp class="samp">pentium3m</samp>’</dt> <dd>
<p>Intel Pentium III CPU, based on Pentium Pro core with MMX, FXSR and SSE instruction set support. </p> </dd> <dt>‘<samp class="samp">pentium-m</samp>’</dt> <dd>
<p>Intel Pentium M; low-power version of Intel Pentium III CPU with MMX, SSE, SSE2 and FXSR instruction set support. Used by Centrino notebooks. </p> </dd> <dt>‘<samp class="samp">pentium4</samp>’</dt> <dt>‘<samp class="samp">pentium4m</samp>’</dt> <dd>
<p>Intel Pentium 4 CPU with MMX, SSE, SSE2 and FXSR instruction set support. </p> </dd> <dt>‘<samp class="samp">prescott</samp>’</dt> <dd>
<p>Improved version of Intel Pentium 4 CPU with MMX, SSE, SSE2, SSE3 and FXSR instruction set support. </p> </dd> <dt>‘<samp class="samp">nocona</samp>’</dt> <dd>
<p>Improved version of Intel Pentium 4 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and FXSR instruction set support. </p> </dd> <dt>‘<samp class="samp">core2</samp>’</dt> <dd>
<p>Intel Core 2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, CX16, SAHF and FXSR instruction set support. </p> </dd> <dt>‘<samp class="samp">nehalem</samp>’</dt> <dd>
<p>Intel Nehalem CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF and FXSR instruction set support. </p> </dd> <dt>‘<samp class="samp">westmere</samp>’</dt> <dd>
<p>Intel Westmere CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR and PCLMUL instruction set support. </p> </dd> <dt>‘<samp class="samp">sandybridge</samp>’</dt> <dd>
<p>Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE and PCLMUL instruction set support. </p> </dd> <dt>‘<samp class="samp">ivybridge</samp>’</dt> <dd>
<p>Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND and F16C instruction set support. </p> </dd> <dt>‘<samp class="samp">haswell</samp>’</dt> <dd>
<p>Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE and HLE instruction set support. </p> </dd> <dt>‘<samp class="samp">broadwell</samp>’</dt> <dd>
<p>Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX and PREFETCHW instruction set support. </p> </dd> <dt>‘<samp class="samp">skylake</samp>’</dt> <dd>
<p>Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES and SGX instruction set support. </p> </dd> <dt>‘<samp class="samp">bonnell</samp>’</dt> <dd>
<p>Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support. </p> </dd> <dt>‘<samp class="samp">silvermont</samp>’</dt> <dd>
<p>Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, PCLMUL, PREFETCHW and RDRND instruction set support. </p> </dd> <dt>‘<samp class="samp">goldmont</samp>’</dt> <dd>
<p>Intel Goldmont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, PCLMUL, PREFETCHW, RDRND, AES, SHA, RDSEED, XSAVE, XSAVEC, XSAVES, XSAVEOPT, CLFLUSHOPT and FSGSBASE instruction set support. </p> </dd> <dt>‘<samp class="samp">goldmont-plus</samp>’</dt> <dd>
<p>Intel Goldmont Plus CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, PCLMUL, PREFETCHW, RDRND, AES, SHA, RDSEED, XSAVE, XSAVEC, XSAVES, XSAVEOPT, CLFLUSHOPT, FSGSBASE, PTWRITE, RDPID and SGX instruction set support. </p> </dd> <dt>‘<samp class="samp">tremont</samp>’</dt> <dd>
<p>Intel Tremont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, PCLMUL, PREFETCHW, RDRND, AES, SHA, RDSEED, XSAVE, XSAVEC, XSAVES, XSAVEOPT, CLFLUSHOPT, FSGSBASE, PTWRITE, RDPID, SGX, CLWB, GFNI-SSE, MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG instruction set support. </p> </dd> <dt>‘<samp class="samp">sierraforest</samp>’</dt> <dd>
<p>Intel Sierra Forest CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PREFETCHW, PCLMUL, RDRND, XSAVE, XSAVEC, XSAVES, XSAVEOPT, FSGSBASE, PTWRITE, RDPID, SGX, GFNI-SSE, CLWB, MOVDIRI, MOVDIR64B, CLDEMOTE, WAITPKG, ADCX, AVX, AVX2, BMI, BMI2, F16C, FMA, LZCNT, PCONFIG, PKU, VAES, VPCLMULQDQ, SERIALIZE, HRESET, KL, WIDEKL, AVX-VNNI, AVXIFMA, AVXVNNIINT8, AVXNECONVERT and CMPCCXADD instruction set support. </p> </dd> <dt>‘<samp class="samp">grandridge</samp>’</dt> <dd>
<p>Intel Grand Ridge CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PREFETCHW, PCLMUL, RDRND, XSAVE, XSAVEC, XSAVES, XSAVEOPT, FSGSBASE, PTWRITE, RDPID, SGX, GFNI-SSE, CLWB, MOVDIRI, MOVDIR64B, CLDEMOTE, WAITPKG, ADCX, AVX, AVX2, BMI, BMI2, F16C, FMA, LZCNT, PCONFIG, PKU, VAES, VPCLMULQDQ, SERIALIZE, HRESET, KL, WIDEKL, AVX-VNNI, AVXIFMA, AVXVNNIINT8, AVXNECONVERT, CMPCCXADD and RAOINT instruction set support. </p> </dd> <dt>‘<samp class="samp">knl</samp>’</dt> <dd>
<p>Intel Knights Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AVX512PF, AVX512ER, AVX512F, AVX512CD and PREFETCHWT1 instruction set support. </p> </dd> <dt>‘<samp class="samp">knm</samp>’</dt> <dd>
<p>Intel Knights Mill CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AVX512PF, AVX512ER, AVX512F, AVX512CD, PREFETCHWT1, AVX5124VNNIW, AVX5124FMAPS and AVX512VPOPCNTDQ instruction set support. </p> </dd> <dt>‘<samp class="samp">skylake-avx512</samp>’</dt> <dd>
<p>Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, CLWB, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction set support. </p> </dd> <dt>‘<samp class="samp">cannonlake</samp>’</dt> <dd>
<p>Intel Cannonlake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, PKU, AVX512VBMI, AVX512IFMA and SHA instruction set support. </p> </dd> <dt>‘<samp class="samp">icelake-client</samp>’</dt> <dd>
<p>Intel Icelake Client CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, PKU, AVX512VBMI, AVX512IFMA, SHA, AVX512VNNI, GFNI, VAES, AVX512VBMI2, VPCLMULQDQ, AVX512BITALG, RDPID and AVX512VPOPCNTDQ instruction set support. </p> </dd> <dt>‘<samp class="samp">icelake-server</samp>’</dt> <dd>
<p>Intel Icelake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, PKU, AVX512VBMI, AVX512IFMA, SHA, AVX512VNNI, GFNI, VAES, AVX512VBMI2, VPCLMULQDQ, AVX512BITALG, RDPID, AVX512VPOPCNTDQ, PCONFIG, WBNOINVD and CLWB instruction set support. </p> </dd> <dt>‘<samp class="samp">cascadelake</samp>’</dt> <dd>
<p>Intel Cascadelake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, CLWB, AVX512VL, AVX512BW, AVX512DQ, AVX512CD and AVX512VNNI instruction set support. </p> </dd> <dt>‘<samp class="samp">cooperlake</samp>’</dt> <dd>
<p>Intel Cooperlake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, CLWB, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VNNI and AVX512BF16 instruction set support. </p> </dd> <dt>‘<samp class="samp">tigerlake</samp>’</dt> <dd>
<p>Intel Tigerlake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, PKU, AVX512VBMI, AVX512IFMA, SHA, AVX512VNNI, GFNI, VAES, AVX512VBMI2, VPCLMULQDQ, AVX512BITALG, RDPID, AVX512VPOPCNTDQ, MOVDIRI, MOVDIR64B, CLWB, AVX512VP2INTERSECT and KEYLOCKER instruction set support. </p> </dd> <dt>‘<samp class="samp">sapphirerapids</samp>’</dt> <dd>
<p>Intel Sapphire Rapids CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, PKU, AVX512VBMI, AVX512IFMA, SHA, AVX512VNNI, GFNI, VAES, AVX512VBMI2, VPCLMULQDQ, AVX512BITALG, RDPID, AVX512VPOPCNTDQ, PCONFIG, WBNOINVD, CLWB, MOVDIRI, MOVDIR64B, ENQCMD, CLDEMOTE, PTWRITE, WAITPKG, SERIALIZE, TSXLDTRK, UINTR, AMX-BF16, AMX-TILE, AMX-INT8, AVX-VNNI, AVX512-FP16 and AVX512BF16 instruction set support. </p> </dd> <dt>‘<samp class="samp">alderlake</samp>’</dt> <dd>
<p>Intel Alderlake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PREFETCHW, PCLMUL, RDRND, XSAVE, XSAVEC, XSAVES, XSAVEOPT, FSGSBASE, PTWRITE, RDPID, SGX, GFNI-SSE, CLWB, MOVDIRI, MOVDIR64B, CLDEMOTE, WAITPKG, ADCX, AVX, AVX2, BMI, BMI2, F16C, FMA, LZCNT, PCONFIG, PKU, VAES, VPCLMULQDQ, SERIALIZE, HRESET, KL, WIDEKL and AVX-VNNI instruction set support. </p> </dd> <dt>‘<samp class="samp">rocketlake</samp>’</dt> <dd>
<p>Intel Rocketlake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, PKU, AVX512VBMI, AVX512IFMA, SHA, AVX512VNNI, GFNI, VAES, AVX512VBMI2, VPCLMULQDQ, AVX512BITALG, RDPID and AVX512VPOPCNTDQ instruction set support. </p> </dd> <dt>‘<samp class="samp">graniterapids</samp>’</dt> <dd>
<p>Intel Granite Rapids CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, PREFETCHW, AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, PKU, AVX512VBMI, AVX512IFMA, SHA, AVX512VNNI, GFNI, VAES, AVX512VBMI2, VPCLMULQDQ, AVX512BITALG, RDPID, AVX512VPOPCNTDQ, PCONFIG, WBNOINVD, CLWB, MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, ENQCMD, CLDEMOTE, PTWRITE, WAITPKG, SERIALIZE, TSXLDTRK, UINTR, AMX-BF16, AMX-TILE, AMX-INT8, AVX-VNNI, AVX512-FP16, AVX512BF16, AMX-FP16 and PREFETCHI instruction set support. </p> </dd> <dt>‘<samp class="samp">k6</samp>’</dt> <dd>
<p>AMD K6 CPU with MMX instruction set support. </p> </dd> <dt>‘<samp class="samp">k6-2</samp>’</dt> <dt>‘<samp class="samp">k6-3</samp>’</dt> <dd>
<p>Improved versions of AMD K6 CPU with MMX and 3DNow! instruction set support. </p> </dd> <dt>‘<samp class="samp">athlon</samp>’</dt> <dt>‘<samp class="samp">athlon-tbird</samp>’</dt> <dd>
<p>AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and SSE prefetch instruction support. </p> </dd> <dt>‘<samp class="samp">athlon-4</samp>’</dt> <dt>‘<samp class="samp">athlon-xp</samp>’</dt> <dt>‘<samp class="samp">athlon-mp</samp>’</dt> <dd>
<p>Improved AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and full SSE instruction set support. </p> </dd> <dt>‘<samp class="samp">k8</samp>’</dt> <dt>‘<samp class="samp">opteron</samp>’</dt> <dt>‘<samp class="samp">athlon64</samp>’</dt> <dt>‘<samp class="samp">athlon-fx</samp>’</dt> <dd>
<p>Processors based on the AMD K8 core with x86-64 instruction set support, including the AMD Opteron, Athlon 64, and Athlon 64 FX processors. (This supersets MMX, SSE, SSE2, 3DNow!, enhanced 3DNow! and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">k8-sse3</samp>’</dt> <dt>‘<samp class="samp">opteron-sse3</samp>’</dt> <dt>‘<samp class="samp">athlon64-sse3</samp>’</dt> <dd>
<p>Improved versions of AMD K8 cores with SSE3 instruction set support. </p> </dd> <dt>‘<samp class="samp">amdfam10</samp>’</dt> <dt>‘<samp class="samp">barcelona</samp>’</dt> <dd>
<p>CPUs based on AMD Family 10h cores with x86-64 instruction set support. (This supersets MMX, SSE, SSE2, SSE3, SSE4A, 3DNow!, enhanced 3DNow!, ABM and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">bdver1</samp>’</dt> <dd>
<p>CPUs based on AMD Family 15h cores with x86-64 instruction set support. (This supersets FMA4, AVX, XOP, LWP, AES, PCLMUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">bdver2</samp>’</dt> <dd>
<p>AMD Family 15h core based CPUs with x86-64 instruction set support. (This supersets BMI, TBM, F16C, FMA, FMA4, AVX, XOP, LWP, AES, PCLMUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">bdver3</samp>’</dt> <dd>
<p>AMD Family 15h core based CPUs with x86-64 instruction set support. (This supersets BMI, TBM, F16C, FMA, FMA4, FSGSBASE, AVX, XOP, LWP, AES, PCLMUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">bdver4</samp>’</dt> <dd>
<p>AMD Family 15h core based CPUs with x86-64 instruction set support. (This supersets BMI, BMI2, TBM, F16C, FMA, FMA4, FSGSBASE, AVX, AVX2, XOP, LWP, AES, PCLMUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">znver1</samp>’</dt> <dd>
<p>AMD Family 17h core based CPUs with x86-64 instruction set support. (This supersets BMI, BMI2, F16C, FMA, FSGSBASE, AVX, AVX2, ADCX, RDSEED, MWAITX, SHA, CLZERO, AES, PCLMUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">znver2</samp>’</dt> <dd>
<p>AMD Family 17h core based CPUs with x86-64 instruction set support. (This supersets BMI, BMI2, CLWB, F16C, FMA, FSGSBASE, AVX, AVX2, ADCX, RDSEED, MWAITX, SHA, CLZERO, AES, PCLMUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, RDPID, WBNOINVD, and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">znver3</samp>’</dt> <dd>
<p>AMD Family 19h core based CPUs with x86-64 instruction set support. (This supersets BMI, BMI2, CLWB, F16C, FMA, FSGSBASE, AVX, AVX2, ADCX, RDSEED, MWAITX, SHA, CLZERO, AES, PCLMUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, RDPID, WBNOINVD, PKU, VPCLMULQDQ, VAES, and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">znver4</samp>’</dt> <dd>
<p>AMD Family 19h core based CPUs with x86-64 instruction set support. (This supersets BMI, BMI2, CLWB, F16C, FMA, FSGSBASE, AVX, AVX2, ADCX, RDSEED, MWAITX, SHA, CLZERO, AES, PCLMUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, RDPID, WBNOINVD, PKU, VPCLMULQDQ, VAES, AVX512F, AVX512DQ, AVX512IFMA, AVX512CD, AVX512BW, AVX512VL, AVX512BF16, AVX512VBMI, AVX512VBMI2, AVX512VNNI, AVX512BITALG, AVX512VPOPCNTDQ, GFNI and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">btver1</samp>’</dt> <dd>
<p>CPUs based on AMD Family 14h cores with x86-64 instruction set support. (This supersets MMX, SSE, SSE2, SSE3, SSSE3, SSE4A, CX16, ABM and 64-bit instruction set extensions.) </p> </dd> <dt>‘<samp class="samp">btver2</samp>’</dt> <dd>
<p>CPUs based on AMD Family 16h cores with x86-64 instruction set support. This includes MOVBE, F16C, BMI, AVX, PCLMUL, AES, SSE4.2, SSE4.1, CX16, ABM, SSE4A, SSSE3, SSE3, SSE2, SSE, MMX and 64-bit instruction set extensions. </p> </dd> <dt>‘<samp class="samp">winchip-c6</samp>’</dt> <dd>
<p>IDT WinChip C6 CPU, treated the same way as i486 with additional MMX instruction set support. </p> </dd> <dt>‘<samp class="samp">winchip2</samp>’</dt> <dd>
<p>IDT WinChip 2 CPU, treated the same way as i486 with additional MMX and 3DNow! instruction set support. </p> </dd> <dt>‘<samp class="samp">c3</samp>’</dt> <dd>
<p>VIA C3 CPU with MMX and 3DNow! instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">c3-2</samp>’</dt> <dd>
<p>VIA C3-2 (Nehemiah/C5XL) CPU with MMX and SSE instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">c7</samp>’</dt> <dd>
<p>VIA C7 (Esther) CPU with MMX, SSE, SSE2 and SSE3 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">samuel-2</samp>’</dt> <dd>
<p>VIA Eden Samuel 2 CPU with MMX and 3DNow! instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">nehemiah</samp>’</dt> <dd>
<p>VIA Eden Nehemiah CPU with MMX and SSE instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">esther</samp>’</dt> <dd>
<p>VIA Eden Esther CPU with MMX, SSE, SSE2 and SSE3 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">eden-x2</samp>’</dt> <dd>
<p>VIA Eden X2 CPU with x86-64, MMX, SSE, SSE2 and SSE3 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">eden-x4</samp>’</dt> <dd>
<p>VIA Eden X4 CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and AVX2 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">nano</samp>’</dt> <dd>
<p>Generic VIA Nano CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">nano-1000</samp>’</dt> <dd>
<p>VIA Nano 1xxx CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">nano-2000</samp>’</dt> <dd>
<p>VIA Nano 2xxx CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">nano-3000</samp>’</dt> <dd>
<p>VIA Nano 3xxx CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">nano-x2</samp>’</dt> <dd>
<p>VIA Nano Dual Core CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">nano-x4</samp>’</dt> <dd>
<p>VIA Nano Quad Core CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 instruction set support. (No scheduling is implemented for this chip.) </p> </dd> <dt>‘<samp class="samp">lujiazui</samp>’</dt> <dd>
<p>ZHAOXIN lujiazui CPU with x86-64, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, POPCNT, AES, PCLMUL, RDRND, XSAVE, XSAVEOPT, FSGSBASE, CX16, ABM, BMI, BMI2, F16C, FXSR and RDSEED instruction set support. </p> </dd> <dt>‘<samp class="samp">geode</samp>’</dt> <dd><p>AMD Geode embedded processor with MMX and 3DNow! instruction set support. </p></dd> </dl> </dd> <dt>
<span><code class="code">-mtune=<var class="var">cpu-type</var></code><a class="copiable-link" href="#index-mtune-17"> ¶</a></span>
</dt> <dd>
<p>Tune to <var class="var">cpu-type</var> everything applicable about the generated code, except for the ABI and the set of available instructions. While picking a specific <var class="var">cpu-type</var> schedules things appropriately for that particular chip, the compiler does not generate any code that cannot run on the default machine type unless you use a <samp class="option">-march=<var class="var">cpu-type</var></samp> option. For example, if GCC is configured for i686-pc-linux-gnu then <samp class="option">-mtune=pentium4</samp> generates code that is tuned for Pentium 4 but still runs on i686 machines. </p> <p>The choices for <var class="var">cpu-type</var> are the same as for <samp class="option">-march</samp>. In addition, <samp class="option">-mtune</samp> supports 2 extra choices for <var class="var">cpu-type</var>: </p> <dl class="table"> <dt>‘<samp class="samp">generic</samp>’</dt> <dd>
<p>Produce code optimized for the most common IA32/AMD64/EM64T processors. If you know the CPU on which your code will run, then you should use the corresponding <samp class="option">-mtune</samp> or <samp class="option">-march</samp> option instead of <samp class="option">-mtune=generic</samp>. But, if you do not know exactly what CPU users of your application will have, then you should use this option. </p> <p>As new processors are deployed in the marketplace, the behavior of this option will change. Therefore, if you upgrade to a newer version of GCC, code generation controlled by this option will change to reflect the processors that are most common at the time that version of GCC is released. </p> <p>There is no <samp class="option">-march=generic</samp> option because <samp class="option">-march</samp> indicates the instruction set the compiler can use, and there is no generic instruction set applicable to all processors. In contrast, <samp class="option">-mtune</samp> indicates the processor (or, in this case, collection of processors) for which the code is optimized. </p> </dd> <dt>‘<samp class="samp">intel</samp>’</dt> <dd>
<p>Produce code optimized for the most current Intel processors, which are Haswell and Silvermont for this version of GCC. If you know the CPU on which your code will run, then you should use the corresponding <samp class="option">-mtune</samp> or <samp class="option">-march</samp> option instead of <samp class="option">-mtune=intel</samp>. But, if you want your application to perform well on both Haswell and Silvermont, then you should use this option. </p> <p>As new Intel processors are deployed in the marketplace, the behavior of this option will change. Therefore, if you upgrade to a newer version of GCC, code generation controlled by this option will change to reflect the most current Intel processors at the time that version of GCC is released. </p> <p>There is no <samp class="option">-march=intel</samp> option because <samp class="option">-march</samp> indicates the instruction set the compiler can use, and there is no common instruction set applicable to all processors. In contrast, <samp class="option">-mtune</samp> indicates the processor (or, in this case, collection of processors) for which the code is optimized. </p>
</dd> </dl> </dd> <dt>
<span><code class="code">-mcpu=<var class="var">cpu-type</var></code><a class="copiable-link" href="#index-mcpu-14"> ¶</a></span>
</dt> <dd>
<p>A deprecated synonym for <samp class="option">-mtune</samp>. </p> </dd> <dt>
<span><code class="code">-mfpmath=<var class="var">unit</var></code><a class="copiable-link" href="#index-mfpmath-1"> ¶</a></span>
</dt> <dd>
<p>Generate floating-point arithmetic for selected unit <var class="var">unit</var>. The choices for <var class="var">unit</var> are: </p> <dl class="table"> <dt>‘<samp class="samp">387</samp>’</dt> <dd>
<p>Use the standard 387 floating-point coprocessor present on the majority of chips and emulated otherwise. Code compiled with this option runs almost everywhere. The temporary results are computed in 80-bit precision instead of the precision specified by the type, resulting in slightly different results compared to most other chips. See <samp class="option">-ffloat-store</samp> for a more detailed description. </p> <p>This is the default choice for non-Darwin x86-32 targets. </p> </dd> <dt>‘<samp class="samp">sse</samp>’</dt> <dd>
<p>Use scalar floating-point instructions present in the SSE instruction set. This instruction set is supported by Pentium III and newer chips, and in the AMD line by Athlon-4, Athlon XP and Athlon MP chips. The earlier version of the SSE instruction set supports only single-precision arithmetic, thus the double and extended-precision arithmetic are still done using 387. A later version, present only in Pentium 4 and AMD x86-64 chips, supports double-precision arithmetic too. </p> <p>For the x86-32 compiler, you must use <samp class="option">-march=<var class="var">cpu-type</var></samp>, <samp class="option">-msse</samp> or <samp class="option">-msse2</samp> switches to enable SSE extensions and make this option effective. For the x86-64 compiler, these extensions are enabled by default. </p> <p>The resulting code should be considerably faster in the majority of cases and avoid the numerical instability problems of 387 code, but may break some existing code that expects temporaries to be 80 bits. </p> <p>This is the default choice for the x86-64 compiler, Darwin x86-32 targets, and the default choice for x86-32 targets with the SSE2 instruction set when <samp class="option">-ffast-math</samp> is enabled. </p> </dd> <dt>‘<samp class="samp">sse,387</samp>’</dt> <dt>‘<samp class="samp">sse+387</samp>’</dt> <dt>‘<samp class="samp">both</samp>’</dt> <dd><p>Attempt to utilize both instruction sets at once. This effectively doubles the number of available registers and, on chips with separate execution units for 387 and SSE, the execution resources as well. Use this option with care: it is still experimental, because the GCC register allocator does not model separate functional units well, resulting in unstable performance. </p></dd> </dl> </dd> <dt>
<span><code class="code">-masm=<var class="var">dialect</var></code><a class="copiable-link" href="#index-masm_003ddialect"> ¶</a></span>
</dt> <dd>
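<p>Because <samp class="option">-masm</samp> also changes how inline <code class="code">asm</code> templates are read, portable inline assembly can spell out both dialects with the <code class="code">{att|intel}</code> alternative syntax of extended <code class="code">asm</code>. The following is a minimal sketch, assuming a GCC-compatible compiler; the function name <code class="code">add_u32</code> is hypothetical, and non-x86 targets fall back to plain C. </p>

```c
/* Adds b to a. On x86 the template is written with {att|intel}
   alternatives, so it assembles under either -masm=att or -masm=intel. */
unsigned add_u32(unsigned a, unsigned b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* AT&T form: addl %1, %0     Intel form: add %0, %1 */
    __asm__("add{l} {%1, %0|%0, %1}" : "+r"(a) : "r"(b) : "cc");
#else
    a += b;  /* fallback for other targets */
#endif
    return a;
}
```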
<p>Output assembly instructions using selected <var class="var">dialect</var>. Also affects which dialect is used for basic <code class="code">asm</code> (see <a class="pxref" href="basic-asm">Basic Asm — Assembler Instructions Without Operands</a>) and extended <code class="code">asm</code> (see <a class="pxref" href="extended-asm">Extended Asm - Assembler Instructions with C Expression Operands</a>). Supported choices (in dialect order) are ‘<samp class="samp">att</samp>’ or ‘<samp class="samp">intel</samp>’. The default is ‘<samp class="samp">att</samp>’. Darwin does not support ‘<samp class="samp">intel</samp>’. </p> </dd> <dt>
 <span><code class="code">-mieee-fp</code><a class="copiable-link" href="#index-mieee-fp"> ¶</a></span>
</dt> <dt><code class="code">-mno-ieee-fp</code></dt> <dd>
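<p>An unordered comparison is one in which at least one operand is a NaN. The following is a minimal sketch of code whose result depends on such comparisons being handled correctly; the function name <code class="code">compare_less</code> is hypothetical, and the example assumes IEEE semantics are in effect (for instance, no <samp class="option">-ffast-math</samp>). </p>

```c
#include <math.h>

/* Returns 1 if a < b, 0 if not, and -1 if the operands are unordered
   (at least one of them is a NaN). */
int compare_less(double a, double b)
{
    if (isunordered(a, b))
        return -1;
    return a < b;
}
```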
<p>Control whether or not the compiler uses IEEE floating-point comparisons. These correctly handle the case where the result of a comparison is unordered. </p> </dd> <dt>
 <span><code class="code">-m80387</code><a class="copiable-link" href="#index-m80387"> ¶</a></span>
</dt> <dt><code class="code">-mhard-float</code></dt> <dd>
<p>Generate output containing 80387 instructions for floating point. </p> </dd> <dt>
 <span><code class="code">-mno-80387</code><a class="copiable-link" href="#index-no-80387"> ¶</a></span>
</dt> <dt><code class="code">-msoft-float</code></dt> <dd>
<p>Generate output containing library calls for floating point. </p> <p><strong class="strong">Warning:</strong> the requisite libraries are not part of GCC. Normally the facilities of the machine’s usual C compiler are used, but this cannot be done directly in cross-compilation. You must make your own arrangements to provide suitable library functions for cross-compilation. </p> <p>On machines where a function returns floating-point results in the 80387 register stack, some floating-point opcodes may be emitted even if <samp class="option">-msoft-float</samp> is used. </p> </dd> <dt>
 <span><code class="code">-mno-fp-ret-in-387</code><a class="copiable-link" href="#index-mno-fp-ret-in-387"> ¶</a></span>
</dt> <dd>
<p>Do not use the FPU registers for return values of functions. </p> <p>The usual calling convention has functions return values of types <code class="code">float</code> and <code class="code">double</code> in an FPU register, even if there is no FPU. The idea is that the operating system should emulate an FPU. </p> <p>The option <samp class="option">-mno-fp-ret-in-387</samp> causes such values to be returned in ordinary CPU registers instead. </p> </dd> <dt>
 <span><code class="code">-mno-fancy-math-387</code><a class="copiable-link" href="#index-mno-fancy-math-387"> ¶</a></span>
</dt> <dd>
<p>Some 387 emulators do not support the <code class="code">sin</code>, <code class="code">cos</code> and <code class="code">sqrt</code> instructions for the 387. Specify this option to avoid generating those instructions. This option is overridden when <samp class="option">-march</samp> indicates that the target CPU always has an FPU and so the instruction does not need emulation. These instructions are not generated unless you also use the <samp class="option">-funsafe-math-optimizations</samp> switch. </p> </dd> <dt>
 <span><code class="code">-malign-double</code><a class="copiable-link" href="#index-malign-double"> ¶</a></span>
</dt> <dt><code class="code">-mno-align-double</code></dt> <dd>
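<p>The effect of this option on structure layout can be checked with <code class="code">offsetof</code>. The following is a minimal sketch; the structure and function names are hypothetical. On x86-32 the member offset is expected to be 4 by default and 8 with <samp class="option">-malign-double</samp>, which is also the value on x86-64. </p>

```c
#include <stddef.h>

struct sample {
    char   tag;    /* one byte, followed by padding */
    double value;  /* placed at the next suitably aligned offset */
};

/* Returns the offset of the double member within struct sample. */
size_t double_member_offset(void)
{
    return offsetof(struct sample, value);
}
```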
<p>Control whether GCC aligns <code class="code">double</code>, <code class="code">long double</code>, and <code class="code">long long</code> variables on a two-word boundary or a one-word boundary. Aligning <code class="code">double</code> variables on a two-word boundary produces code that runs somewhat faster on a Pentium at the expense of more memory. </p> <p>On x86-64, <samp class="option">-malign-double</samp> is enabled by default. </p> <p><strong class="strong">Warning:</strong> if you use the <samp class="option">-malign-double</samp> switch, structures containing the above types are aligned differently than the published application binary interface specifications for the x86-32 and are not binary compatible with structures in code compiled without that switch. </p> </dd> <dt>
 <span><code class="code">-m96bit-long-double</code><a class="copiable-link" href="#index-m96bit-long-double"> ¶</a></span>
</dt> <dt><code class="code">-m128bit-long-double</code></dt> <dd>
<p>These switches control the size of the <code class="code">long double</code> type. The x86-32 application binary interface specifies the size to be 96 bits, so <samp class="option">-m96bit-long-double</samp> is the default in 32-bit mode. </p> <p>Modern architectures (Pentium and newer) prefer <code class="code">long double</code> to be aligned to an 8- or 16-byte boundary. In arrays or structures conforming to the ABI, this is not possible. So specifying <samp class="option">-m128bit-long-double</samp> aligns <code class="code">long double</code> to a 16-byte boundary by padding the <code class="code">long double</code> with an additional 32-bit zero. </p> <p>In the x86-64 compiler, <samp class="option">-m128bit-long-double</samp> is the default choice as its ABI specifies that <code class="code">long double</code> is aligned on a 16-byte boundary. </p> <p>Notice that neither of these options enables any extra precision over the x87 standard of 80 bits for a <code class="code">long double</code>. </p> <p><strong class="strong">Warning:</strong> if you override the default value for your target ABI, this changes the size of structures and arrays containing <code class="code">long double</code> variables, as well as modifying the function calling convention for functions taking <code class="code">long double</code>. Hence they are not binary-compatible with code compiled without that switch. </p> </dd> <dt>
  <span><code class="code">-mlong-double-64</code><a class="copiable-link" href="#index-mlong-double-64-1"> ¶</a></span>
</dt> <dt><code class="code">-mlong-double-80</code></dt> <dt><code class="code">-mlong-double-128</code></dt> <dd>
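<p>The format actually selected for <code class="code">long double</code> can be inferred from <code class="code">LDBL_MANT_DIG</code>, while <code class="code">sizeof</code> reports the storage size including any padding. The following is a minimal sketch; both function names are hypothetical, and the mapping from significand width to format is an assumption that holds for the usual IEEE-style formats. </p>

```c
#include <float.h>
#include <limits.h>

/* Classifies long double by the width of its significand. */
const char *long_double_format(void)
{
#if LDBL_MANT_DIG == 53
    return "64-bit, same as double";
#elif LDBL_MANT_DIG == 64
    return "80-bit x87 extended";
#elif LDBL_MANT_DIG == 113
    return "128-bit, same as __float128";
#else
    return "other";
#endif
}

/* Storage size in bits, padding included (for example 96 or 128 for
   the 80-bit x87 format). */
unsigned long_double_storage_bits(void)
{
    return (unsigned)(sizeof(long double) * CHAR_BIT);
}
```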
<p>These switches control the size of <code class="code">long double</code> type. A size of 64 bits makes the <code class="code">long double</code> type equivalent to the <code class="code">double</code> type. This is the default for 32-bit Bionic C library. A size of 128 bits makes the <code class="code">long double</code> type equivalent to the <code class="code">__float128</code> type. This is the default for 64-bit Bionic C library. </p> <p><strong class="strong">Warning:</strong> if you override the default value for your target ABI, this changes the size of structures and arrays containing <code class="code">long double</code> variables, as well as modifying the function calling convention for functions taking <code class="code">long double</code>. Hence they are not binary-compatible with code compiled without that switch. </p> </dd> <dt>
<span><code class="code">-malign-data=<var class="var">type</var></code><a class="copiable-link" href="#index-malign-data-1"> ¶</a></span>
</dt> <dd>
<p>Control how GCC aligns variables. Supported values for <var class="var">type</var> are: ‘<samp class="samp">compat</samp>’, which uses an increased alignment value compatible with GCC 4.8 and earlier; ‘<samp class="samp">abi</samp>’, which uses the alignment value specified by the psABI; and ‘<samp class="samp">cacheline</samp>’, which uses an increased alignment value to match the cache line size. ‘<samp class="samp">compat</samp>’ is the default. </p> </dd> <dt>
<span><code class="code">-mlarge-data-threshold=<var class="var">threshold</var></code><a class="copiable-link" href="#index-mlarge-data-threshold"> ¶</a></span>
</dt> <dd>
<p>When <samp class="option">-mcmodel=medium</samp> is specified, data objects larger than <var class="var">threshold</var> are placed in the large data section. This value must be the same across all objects linked into the binary, and defaults to 65535. </p> </dd> <dt>
<span><code class="code">-mrtd</code><a class="copiable-link" href="#index-mrtd-1"> ¶</a></span>
</dt> <dd>
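<p>The per-function attributes that correspond to this option can be applied as in the following minimal sketch. The macro and function names are hypothetical, and the attributes are applied only on 32-bit x86, since other targets do not use these conventions. The variadic function keeps the caller-pops convention, because a callee cannot know how many bytes to pop. </p>

```c
#include <stdarg.h>

#if defined(__GNUC__) && defined(__i386__)
#define CALLEE_POPS __attribute__((stdcall))  /* same convention as -mrtd */
#define CALLER_POPS __attribute__((cdecl))    /* overrides -mrtd */
#else
#define CALLEE_POPS
#define CALLER_POPS
#endif

/* Fixed argument count: eligible for the callee-pops convention. */
CALLEE_POPS int add3(int a, int b, int c)
{
    return a + b + c;
}

/* Variable argument count: the caller must pop the arguments. */
CALLER_POPS int sum_n(int n, ...)
{
    va_list ap;
    int total = 0;
    va_start(ap, n);
    for (int i = 0; i < n; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```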
<p>Use a different function-calling convention, in which functions that take a fixed number of arguments return with the <code class="code">ret <var class="var">num</var></code> instruction, which pops their arguments while returning. This saves one instruction in the caller since there is no need to pop the arguments there. </p> <p>You can specify that an individual function is called with this calling sequence with the function attribute <code class="code">stdcall</code>. You can also override the <samp class="option">-mrtd</samp> option by using the function attribute <code class="code">cdecl</code>. See <a class="xref" href="function-attributes">Declaring Attributes of Functions</a>. </p> <p><strong class="strong">Warning:</strong> this calling convention is incompatible with the one normally used on Unix, so you cannot use it if you need to call libraries compiled with the Unix compiler. </p> <p>Also, you must provide function prototypes for all functions that take variable numbers of arguments (including <code class="code">printf</code>); otherwise incorrect code is generated for calls to those functions. </p> <p>In addition, seriously incorrect code results if you call a function with too many arguments. (Normally, extra arguments are harmlessly ignored.) </p> </dd> <dt>
<span><code class="code">-mregparm=<var class="var">num</var></code><a class="copiable-link" href="#index-mregparm"> ¶</a></span>
</dt> <dd>
<p>Control how many registers are used to pass integer arguments. By default, no registers are used to pass arguments, and at most 3 registers can be used. You can control this behavior for a specific function by using the function attribute <code class="code">regparm</code>. See <a class="xref" href="function-attributes">Declaring Attributes of Functions</a>. </p> <p><strong class="strong">Warning:</strong> if you use this switch, and <var class="var">num</var> is nonzero, then you must build all modules with the same value, including any libraries. This includes the system libraries and startup modules. </p> </dd> <dt>
<span><code class="code">-msseregparm</code><a class="copiable-link" href="#index-msseregparm"> ¶</a></span>
</dt> <dd>
<p>Use SSE register passing conventions for float and double arguments and return values. You can control this behavior for a specific function by using the function attribute <code class="code">sseregparm</code>. See <a class="xref" href="function-attributes">Declaring Attributes of Functions</a>. </p> <p><strong class="strong">Warning:</strong> if you use this switch then you must build all modules with the same value, including any libraries. This includes the system libraries and startup modules. </p> </dd> <dt>
<span><code class="code">-mvect8-ret-in-mem</code><a class="copiable-link" href="#index-mvect8-ret-in-mem"> ¶</a></span>
</dt> <dd>
<p>Return 8-byte vectors in memory instead of MMX registers. This is the default on VxWorks to match the ABI of the Sun Studio compilers until version 12. <em class="emph">Only</em> use this option if you need to remain compatible with existing code produced by those previous compiler versions or older versions of GCC. </p> </dd> <dt>
  <span><code class="code">-mpc32</code><a class="copiable-link" href="#index-mpc32"> ¶</a></span>
</dt> <dt><code class="code">-mpc64</code></dt> <dt><code class="code">-mpc80</code></dt> <dd> <p>Set 80387 floating-point precision to 32, 64 or 80 bits. When <samp class="option">-mpc32</samp> is specified, the significands of results of floating-point operations are rounded to 24 bits (single precision); <samp class="option">-mpc64</samp> rounds the significands of results of floating-point operations to 53 bits (double precision) and <samp class="option">-mpc80</samp> rounds the significands of results of floating-point operations to 64 bits (extended double precision), which is the default. When this option is used, floating-point operations in higher precisions are not available to the programmer without setting the FPU control word explicitly. </p> <p>Setting the rounding of floating-point operations to less than the default 80 bits can speed some programs by 2% or more. Note that some mathematical libraries assume that extended-precision (80-bit) floating-point operations are enabled by default; routines in such libraries could suffer significant loss of accuracy, typically through so-called “catastrophic cancellation”, when this option is used to set the precision to less than extended precision. </p> </dd> <dt>
<span><code class="code">-mdaz-ftz</code><a class="copiable-link" href="#index-mdaz-ftz"> ¶</a></span>
</dt> <dd> <p>The flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the MXCSR register are used to control floating-point calculations. SSE and AVX instructions, both scalar and vector, can benefit from having the FTZ and DAZ flags enabled, which is what <samp class="option">-mdaz-ftz</samp> does. The flags are not set when <samp class="option">-mno-daz-ftz</samp> or <samp class="option">-shared</samp> is specified; however, an explicit <samp class="option">-mdaz-ftz</samp> sets the FTZ/DAZ flags even with <samp class="option">-shared</samp>. </p> </dd> <dt>
<span><code class="code">-mstackrealign</code><a class="copiable-link" href="#index-mstackrealign"> ¶</a></span>
</dt> <dd>
<p>Realign the stack at entry. On the x86, the <samp class="option">-mstackrealign</samp> option generates an alternate prologue and epilogue that realigns the run-time stack if necessary. This supports mixing legacy codes that keep 4-byte stack alignment with modern codes that keep 16-byte stack alignment for SSE compatibility. See also the attribute <code class="code">force_align_arg_pointer</code>, applicable to individual functions. </p> </dd> <dt>
<span><code class="code">-mpreferred-stack-boundary=<var class="var">num</var></code><a class="copiable-link" href="#index-mpreferred-stack-boundary-1"> ¶</a></span>
</dt> <dd>
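<p>The argument of this option is an exponent, not a byte count. The following minimal sketch spells out the arithmetic; the function name <code class="code">stack_boundary_bytes</code> is hypothetical. </p>

```c
#include <assert.h>

/* Byte alignment implied by -mpreferred-stack-boundary=num,
   that is, 2 raised to the power num. */
unsigned long stack_boundary_bytes(unsigned num)
{
    assert(num < 8 * sizeof(unsigned long));  /* keep the shift well-defined */
    return 1UL << num;
}
```

<p>For example, a value of 4 corresponds to 16 bytes, 3 to 8 bytes, and 2 to 4 bytes. </p>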
<p>Attempt to keep the stack boundary aligned to a 2 raised to <var class="var">num</var> byte boundary. If <samp class="option">-mpreferred-stack-boundary</samp> is not specified, the default is 4 (16 bytes or 128 bits). </p> <p><strong class="strong">Warning:</strong> When generating code for the x86-64 architecture with SSE extensions disabled, <samp class="option">-mpreferred-stack-boundary=3</samp> can be used to keep the stack boundary aligned to an 8-byte boundary. Since the x86-64 ABI requires 16-byte stack alignment, this is ABI-incompatible and intended to be used in controlled environments where stack space is an important limitation. This option leads to wrong code when functions compiled with 16-byte stack alignment (such as functions from a standard library) are called with a misaligned stack. In this case, SSE instructions may lead to misaligned memory access traps. In addition, variable arguments are handled incorrectly for 16-byte aligned objects (including x87 long double and __int128), leading to wrong results. You must build all modules with <samp class="option">-mpreferred-stack-boundary=3</samp>, including any libraries. This includes the system libraries and startup modules. </p> </dd> <dt>
<span><code class="code">-mincoming-stack-boundary=<var class="var">num</var></code><a class="copiable-link" href="#index-mincoming-stack-boundary"> ¶</a></span>
</dt> <dd>
<p>Assume the incoming stack is aligned to a 2 raised to <var class="var">num</var> byte boundary. If <samp class="option">-mincoming-stack-boundary</samp> is not specified, the one specified by <samp class="option">-mpreferred-stack-boundary</samp> is used. </p> <p>On Pentium and Pentium Pro, <code class="code">double</code> and <code class="code">long double</code> values should be aligned to an 8-byte boundary (see <samp class="option">-malign-double</samp>) or suffer significant run time performance penalties. On Pentium III, the Streaming SIMD Extension (SSE) data type <code class="code">__m128</code> may not work properly if it is not 16-byte aligned. </p> <p>To ensure proper alignment of these values on the stack, the stack boundary must be as aligned as that required by any value stored on the stack. Further, every function must be generated such that it keeps the stack aligned. Thus calling a function compiled with a higher preferred stack boundary from a function compiled with a lower preferred stack boundary most likely misaligns the stack. It is recommended that libraries that use callbacks always use the default setting. </p> <p>This extra alignment does consume extra stack space, and generally increases code size. Code that is sensitive to stack space usage, such as embedded systems and operating system kernels, may want to reduce the preferred alignment to <samp class="option">-mpreferred-stack-boundary=2</samp>. </p> </dd> <dt>
<span><code class="code">-mmmx</code><a class="copiable-link" href="#index-mmmx"> ¶</a></span>
</dt>  <dt><code class="code">-msse</code></dt>  <dt><code class="code">-msse2</code></dt>  <dt><code class="code">-msse3</code></dt>  <dt><code class="code">-mssse3</code></dt>  <dt><code class="code">-msse4</code></dt>  <dt><code class="code">-msse4a</code></dt>  <dt><code class="code">-msse4.1</code></dt>  <dt><code class="code">-msse4.2</code></dt>  <dt><code class="code">-mavx</code></dt>  <dt><code class="code">-mavx2</code></dt>  <dt><code class="code">-mavx512f</code></dt>  <dt><code class="code">-mavx512pf</code></dt>  <dt><code class="code">-mavx512er</code></dt>  <dt><code class="code">-mavx512cd</code></dt>  <dt><code class="code">-mavx512vl</code></dt>  <dt><code class="code">-mavx512bw</code></dt>  <dt><code class="code">-mavx512dq</code></dt>  <dt><code class="code">-mavx512ifma</code></dt>  <dt><code class="code">-mavx512vbmi</code></dt>  <dt><code class="code">-msha</code></dt>  <dt><code class="code">-maes</code></dt>  <dt><code class="code">-mpclmul</code></dt>  <dt><code class="code">-mclflushopt</code></dt>  <dt><code class="code">-mclwb</code></dt>  <dt><code class="code">-mfsgsbase</code></dt>  <dt><code class="code">-mptwrite</code></dt>  <dt><code class="code">-mrdrnd</code></dt>  <dt><code class="code">-mf16c</code></dt>  <dt><code class="code">-mfma</code></dt>  <dt><code class="code">-mpconfig</code></dt>  <dt><code class="code">-mwbnoinvd</code></dt>  <dt><code class="code">-mfma4</code></dt>  <dt><code class="code">-mprfchw</code></dt>  <dt><code class="code">-mrdpid</code></dt>  <dt><code class="code">-mprefetchwt1</code></dt>  <dt><code class="code">-mrdseed</code></dt>  <dt><code class="code">-msgx</code></dt>  <dt><code class="code">-mxop</code></dt>  <dt><code class="code">-mlwp</code></dt>  <dt><code class="code">-m3dnow</code></dt>  <dt><code class="code">-m3dnowa</code></dt>  <dt><code class="code">-mpopcnt</code></dt>  <dt><code class="code">-mabm</code></dt>  <dt><code class="code">-madx</code></dt>  <dt><code 
class="code">-mbmi</code></dt>  <dt><code class="code">-mbmi2</code></dt>  <dt><code class="code">-mlzcnt</code></dt>  <dt><code class="code">-mfxsr</code></dt>  <dt><code class="code">-mxsave</code></dt>  <dt><code class="code">-mxsaveopt</code></dt>  <dt><code class="code">-mxsavec</code></dt>  <dt><code class="code">-mxsaves</code></dt>  <dt><code class="code">-mrtm</code></dt>  <dt><code class="code">-mhle</code></dt>  <dt><code class="code">-mtbm</code></dt>  <dt><code class="code">-mmwaitx</code></dt>  <dt><code class="code">-mclzero</code></dt>  <dt><code class="code">-mpku</code></dt>  <dt><code class="code">-mavx512vbmi2</code></dt>  <dt><code class="code">-mavx512bf16</code></dt>  <dt><code class="code">-mavx512fp16</code></dt>  <dt><code class="code">-mgfni</code></dt>  <dt><code class="code">-mvaes</code></dt>  <dt><code class="code">-mwaitpkg</code></dt>  <dt><code class="code">-mvpclmulqdq</code></dt>  <dt><code class="code">-mavx512bitalg</code></dt>  <dt><code class="code">-mmovdiri</code></dt>  <dt><code class="code">-mmovdir64b</code></dt>   <dt><code class="code">-menqcmd</code></dt> <dt><code class="code">-muintr</code></dt>  <dt><code class="code">-mtsxldtrk</code></dt>  <dt><code class="code">-mavx512vpopcntdq</code></dt>  <dt><code class="code">-mavx512vp2intersect</code></dt>  <dt><code class="code">-mavx5124fmaps</code></dt>  <dt><code class="code">-mavx512vnni</code></dt>  <dt><code class="code">-mavxvnni</code></dt>  <dt><code class="code">-mavx5124vnniw</code></dt>  <dt><code class="code">-mcldemote</code></dt>  <dt><code class="code">-mserialize</code></dt>  <dt><code class="code">-mamx-tile</code></dt>  <dt><code class="code">-mamx-int8</code></dt>  <dt><code class="code">-mamx-bf16</code></dt>   <dt><code class="code">-mhreset</code></dt> <dt><code class="code">-mkl</code></dt>  <dt><code class="code">-mwidekl</code></dt>  <dt><code class="code">-mavxifma</code></dt>  <dt><code class="code">-mavxvnniint8</code></dt>  <dt><code 
class="code">-mavxneconvert</code></dt>  <dt><code class="code">-mcmpccxadd</code></dt>  <dt><code class="code">-mamx-fp16</code></dt>  <dt><code class="code">-mprefetchi</code></dt>  <dt><code class="code">-mraoint</code></dt>  <dt><code class="code">-mamx-complex</code></dt> <dd>
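<p>Whether one of these extensions was enabled at compile time, and whether the running CPU provides it, are separate questions. The following minimal sketch shows both checks for AVX2, assuming a GCC-compatible compiler; the function names are hypothetical. The compile-time check uses the predefined macro <code class="code">__AVX2__</code>, and the run-time check uses <code class="code">__builtin_cpu_supports</code>. </p>

```c
/* Returns 1 if this translation unit was compiled with AVX2 enabled
   (for example with -mavx2 or a suitable -march), otherwise 0. */
int built_with_avx2(void)
{
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the running CPU reports AVX2, 0 if it does not, and -1
   if the check is unavailable (non-x86 target or other compiler). */
int cpu_has_avx2(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return -1;
#endif
}
```

<p>A dispatcher along these lines belongs in a file compiled without the extension flags, so that the detection code itself runs on every supported CPU. </p>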
<p>These switches enable the use of instructions in the MMX, SSE, SSE2, SSE3, SSSE3, SSE4, SSE4A, SSE4.1, SSE4.2, AVX, AVX2, AVX512F, AVX512PF, AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, SHA, AES, PCLMUL, CLFLUSHOPT, CLWB, FSGSBASE, PTWRITE, RDRND, F16C, FMA, PCONFIG, WBNOINVD, FMA4, PREFETCHW, RDPID, PREFETCHWT1, RDSEED, SGX, XOP, LWP, 3DNow!, enhanced 3DNow!, POPCNT, ABM, ADX, BMI, BMI2, LZCNT, FXSR, XSAVE, XSAVEOPT, XSAVEC, XSAVES, RTM, HLE, TBM, MWAITX, CLZERO, PKU, AVX512VBMI2, GFNI, VAES, WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, AVX512BF16, ENQCMD, AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, AVX5124VNNIW, SERIALIZE, UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI, AVX512-FP16, AVXIFMA, AVXVNNIINT8, AVXNECONVERT, CMPCCXADD, AMX-FP16, PREFETCHI, RAOINT, AMX-COMPLEX or CLDEMOTE extended instruction sets. Each has a corresponding <samp class="option">-mno-</samp> option to disable use of these instructions. </p> <p>These extensions are also available as built-in functions: see <a class="ref" href="x86-built-in-functions">x86 Built-in Functions</a>, for details of the functions enabled and disabled by these switches. </p> <p>To generate SSE/SSE2 instructions automatically from floating-point code (as opposed to 387 instructions), see <samp class="option">-mfpmath=sse</samp>. </p> <p>GCC suppresses SSEx instructions when <samp class="option">-mavx</samp> is used. Instead, it generates new AVX instructions or AVX equivalents for all SSEx instructions when needed. </p> <p>These options enable GCC to use these extended instructions in generated code, even without <samp class="option">-mfpmath=sse</samp>. Applications that perform run-time CPU detection must compile separate files for each supported architecture, using the appropriate flags. In particular, the file containing the CPU detection code should be compiled without these options. </p> </dd> <dt>
<span><code class="code">-mdump-tune-features</code><a class="copiable-link" href="#index-mdump-tune-features"> ¶</a></span>
</dt> <dd>
<p>This option instructs GCC to dump the names of the x86 performance tuning features and default settings. The names can be used in <samp class="option">-mtune-ctrl=<var class="var">feature-list</var></samp>. </p> </dd> <dt>
<span><code class="code">-mtune-ctrl=<var class="var">feature-list</var></code><a class="copiable-link" href="#index-mtune-ctrl_003dfeature-list"> ¶</a></span>
</dt> <dd>
<p>This option provides fine-grained control of x86 code generation features. <var class="var">feature-list</var> is a comma-separated list of <var class="var">feature</var> names. See also <samp class="option">-mdump-tune-features</samp>. When specified, the <var class="var">feature</var> is turned on if it is not preceded by ‘<samp class="samp">^</samp>’; otherwise, it is turned off. <samp class="option">-mtune-ctrl=<var class="var">feature-list</var></samp> is intended to be used by GCC developers. Using it may lead to code paths not covered by testing and can potentially result in compiler ICEs or runtime errors. </p> </dd> <dt>
<span><code class="code">-mno-default</code><a class="copiable-link" href="#index-mno-default"> ¶</a></span>
</dt> <dd>
<p>This option instructs GCC to turn off all tunable features. See also <samp class="option">-mtune-ctrl=<var class="var">feature-list</var></samp> and <samp class="option">-mdump-tune-features</samp>. </p> </dd> <dt>
<span><code class="code">-mcld</code><a class="copiable-link" href="#index-mcld"> ¶</a></span>
</dt> <dd>
<p>This option instructs GCC to emit a <code class="code">cld</code> instruction in the prologue of functions that use string instructions. String instructions depend on the DF flag to select between autoincrement or autodecrement mode. While the ABI specifies the DF flag to be cleared on function entry, some operating systems violate this specification by not clearing the DF flag in their exception dispatchers. The exception handler can be invoked with the DF flag set, which leads to wrong direction mode when string instructions are used. This option can be enabled by default on 32-bit x86 targets by configuring GCC with the <samp class="option">--enable-cld</samp> configure option. Generation of <code class="code">cld</code> instructions can be suppressed with the <samp class="option">-mno-cld</samp> compiler option in this case. </p> </dd> <dt>
<span><code class="code">-mvzeroupper</code><a class="copiable-link" href="#index-mvzeroupper"> ¶</a></span>
</dt> <dd>
<p>This option instructs GCC to emit a <code class="code">vzeroupper</code> instruction before a transfer of control flow out of the function to minimize the AVX to SSE transition penalty as well as remove unnecessary <code class="code">zeroupper</code> intrinsics. </p> </dd> <dt>
<span><code class="code">-mprefer-avx128</code><a class="copiable-link" href="#index-mprefer-avx128"> ¶</a></span>
</dt> <dd>
<p>This option instructs GCC to use 128-bit AVX instructions instead of 256-bit AVX instructions in the auto-vectorizer. </p> </dd> <dt>
<span><code class="code">-mprefer-vector-width=<var class="var">opt</var></code><a class="copiable-link" href="#index-mprefer-vector-width"> ¶</a></span>
</dt> <dd>
<p>This option instructs GCC to use <var class="var">opt</var>-bit vector width in instructions instead of the default on the selected platform. </p> <dl class="table"> <dt>‘<samp class="samp">none</samp>’</dt> <dd>
<p>No extra limitations applied to GCC other than defined by the selected platform. </p> </dd> <dt>‘<samp class="samp">128</samp>’</dt> <dd>
<p>Prefer 128-bit vector width for instructions. </p> </dd> <dt>‘<samp class="samp">256</samp>’</dt> <dd>
<p>Prefer 256-bit vector width for instructions. </p> </dd> <dt>‘<samp class="samp">512</samp>’</dt> <dd><p>Prefer 512-bit vector width for instructions. </p></dd> </dl> </dd> <dt>
<span><code class="code">-mmove-max=<var class="var">bits</var></code><a class="copiable-link" href="#index-mmove-max"> ¶</a></span>
</dt> <dd>
<p>This option instructs GCC to set the maximum number of bits that can be moved from memory to memory efficiently to <var class="var">bits</var>. The valid values of <var class="var">bits</var> are 128, 256 and 512. </p> </dd> <dt>
<span><code class="code">-mstore-max=<var class="var">bits</var></code><a class="copiable-link" href="#index-mstore-max"> ¶</a></span>
</dt> <dd>
<p>This option instructs GCC to set the maximum number of bits that can be stored to memory efficiently to <var class="var">bits</var>. The valid values of <var class="var">bits</var> are 128, 256 and 512. </p> </dd> <dt>
<span><code class="code">-mcx16</code><a class="copiable-link" href="#index-mcx16"> ¶</a></span>
</dt> <dd>
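<p>Whether an inline 16-byte compare-and-swap is available can be tested with the predefined macro <code class="code">__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16</code>, which GCC defines on x86-64 when <samp class="option">-mcx16</samp> is in effect. The following is a minimal sketch, assuming a GCC-compatible compiler; the names <code class="code">slot</code> and <code class="code">cas16_demo</code> are hypothetical. </p>

```c
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
/* A static __int128 object is 16-byte aligned, as the instruction requires. */
static unsigned __int128 slot;

/* Returns 1 if a 128-bit compare-and-swap from 1 to 2 succeeded. */
int cas16_demo(void)
{
    slot = 1;
    return __sync_bool_compare_and_swap(&slot,
                                        (unsigned __int128)1,
                                        (unsigned __int128)2)
           && slot == 2;
}
#else
/* Returns -1 when no inline 16-byte compare-and-swap is available. */
int cas16_demo(void)
{
    return -1;
}
#endif
```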
<p>This option enables GCC to generate <code class="code">CMPXCHG16B</code> instructions in 64-bit code to implement compare-and-exchange operations on 16-byte aligned 128-bit objects. This is useful for atomic updates of data structures exceeding one machine word in size. The compiler uses this instruction to implement <a class="ref" href="_005f_005fsync-builtins">Legacy <code class="code">__sync</code> Built-in Functions for Atomic Memory Access</a>. However, for <a class="ref" href="_005f_005fatomic-builtins">Built-in Functions for Memory Model Aware Atomic Operations</a> operating on 128-bit integers, a library call is always used. </p> </dd> <dt>
<span><code class="code">-msahf</code><a class="copiable-link" href="#index-msahf"> ¶</a></span>
</dt> <dd>
<p>This option enables generation of <code class="code">SAHF</code> instructions in 64-bit code. Early Intel Pentium 4 CPUs with Intel 64 support, prior to the introduction of Pentium 4 G1 step in December 2005, lacked the <code class="code">LAHF</code> and <code class="code">SAHF</code> instructions which are supported by AMD64. These are load and store instructions, respectively, for certain status flags. In 64-bit mode, the <code class="code">SAHF</code> instruction is used to optimize <code class="code">fmod</code>, <code class="code">drem</code>, and <code class="code">remainder</code> built-in functions; see <a class="ref" href="other-builtins">Other Built-in Functions Provided by GCC</a> for details. </p> </dd> <dt>
<span><code class="code">-mmovbe</code><a class="copiable-link" href="#index-mmovbe"> ¶</a></span>
</dt> <dd>
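<p>A typical candidate for this instruction is a byte-swapping load, as in the following minimal sketch. It assumes a GCC-compatible compiler; the function name <code class="code">load_be32</code> is hypothetical, and whether <code class="code">movbe</code> is actually emitted depends on the compiler and its options. </p>

```c
#include <stdint.h>
#include <string.h>

/* Loads a 32-bit big-endian value from memory. */
uint32_t load_be32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);      /* unaligned-safe load */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);     /* swap only on little-endian hosts */
#endif
    return v;
}
```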
<p>This option enables use of the <code class="code">movbe</code> instruction to implement <code class="code">__builtin_bswap32</code> and <code class="code">__builtin_bswap64</code>. </p> </dd> <dt>
<span><code class="code">-mshstk</code><a class="copiable-link" href="#index-mshstk"> ¶</a></span>
</dt> <dd>
<p>The <samp class="option">-mshstk</samp> option enables shadow stack built-in functions from x86 Control-flow Enforcement Technology (CET). </p> </dd> <dt>
<span><code class="code">-mcrc32</code><a class="copiable-link" href="#index-mcrc32"> ¶</a></span>
</dt> <dd>
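<p>The <code class="code">crc32</code> instruction accumulates a CRC-32C (Castagnoli) checksum. The following portable software model is a sketch of that computation, not GCC's implementation; the function names are hypothetical, and the assumption is that <code class="code">crc32c_update_byte</code> mirrors the per-byte step performed by <code class="code">__builtin_ia32_crc32qi</code>. </p>

```c
#include <stddef.h>
#include <stdint.h>

/* Folds one byte into a running CRC-32C value, bit by bit, using the
   reflected Castagnoli polynomial 0x82F63B78. */
uint32_t crc32c_update_byte(uint32_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int i = 0; i < 8; i++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/* Conventional CRC-32C of a buffer: initial value and final XOR of all ones. */
uint32_t crc32c(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    while (len--)
        crc = crc32c_update_byte(crc, *p++);
    return crc ^ 0xFFFFFFFFu;
}
```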
<p>This option enables built-in functions <code class="code">__builtin_ia32_crc32qi</code>, <code class="code">__builtin_ia32_crc32hi</code>, <code class="code">__builtin_ia32_crc32si</code> and <code class="code">__builtin_ia32_crc32di</code> to generate the <code class="code">crc32</code> machine instruction. </p> </dd> <dt>
<span><code class="code">-mmwait</code><a class="copiable-link" href="#index-mmwait"> ¶</a></span>
</dt> <dd>
<p>This option enables the built-in functions <code class="code">__builtin_ia32_monitor</code> and <code class="code">__builtin_ia32_mwait</code> to generate the <code class="code">monitor</code> and <code class="code">mwait</code> machine instructions. </p> </dd> <dt>
<span><code class="code">-mrecip</code><a class="copiable-link" href="#index-mrecip-1"> ¶</a></span>
</dt> <dd>
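<p>The Newton-Raphson refinement that accompanies a hardware reciprocal estimate can be written out in plain C. The following is a minimal sketch of the arithmetic only, not of the code GCC emits; the function names are hypothetical, and the initial estimates are passed in rather than obtained from <code class="code">RCPSS</code> or <code class="code">RSQRTSS</code>. </p>

```c
/* One Newton-Raphson step improving an estimate x0 of 1/a. */
float refine_recip(float a, float x0)
{
    return x0 * (2.0f - a * x0);
}

/* One Newton-Raphson step improving an estimate y0 of 1/sqrt(a). */
float refine_rsqrt(float a, float y0)
{
    return y0 * (1.5f - 0.5f * a * y0 * y0);
}
```

<p>For instance, starting from the rough estimate 0.2 for 1/4, one step gives approximately 0.24, which is closer to the exact value 0.25. </p>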
<p>This option enables use of <code class="code">RCPSS</code> and <code class="code">RSQRTSS</code> instructions (and their vectorized variants <code class="code">RCPPS</code> and <code class="code">RSQRTPS</code>) with an additional Newton-Raphson step to increase precision instead of <code class="code">DIVSS</code> and <code class="code">SQRTSS</code> (and their vectorized variants) for single-precision floating-point arguments. These instructions are generated only when <samp class="option">-funsafe-math-optimizations</samp> is enabled together with <samp class="option">-ffinite-math-only</samp> and <samp class="option">-fno-trapping-math</samp>. Note that while the throughput of the sequence is higher than the throughput of the non-reciprocal instruction, the precision of the sequence can be decreased by up to 2 ulp (i.e. the inverse of 1.0 equals 0.99999994). </p> <p>Note that GCC implements <code class="code">1.0f/sqrtf(<var class="var">x</var>)</code> in terms of <code class="code">RSQRTSS</code> (or <code class="code">RSQRTPS</code>) already with <samp class="option">-ffast-math</samp> (or the above option combination), and doesn’t need <samp class="option">-mrecip</samp>. </p> <p>Also note that GCC emits the above sequence with additional Newton-Raphson step for vectorized single-float division and vectorized <code class="code">sqrtf(<var class="var">x</var>)</code> already with <samp class="option">-ffast-math</samp> (or the above option combination), and doesn’t need <samp class="option">-mrecip</samp>. </p> </dd> <dt>
<span><code class="code">-mrecip=<var class="var">opt</var></code><a class="copiable-link" href="#index-mrecip_003dopt-1"> ¶</a></span>
</dt> <dd>
<p>This option controls which reciprocal estimate instructions may be used. <var class="var">opt</var> is a comma-separated list of options, which may be preceded by a ‘<samp class="samp">!</samp>’ to invert the option: </p> <dl class="table"> <dt>‘<samp class="samp">all</samp>’</dt> <dd>
<p>Enable all estimate instructions. </p> </dd> <dt>‘<samp class="samp">default</samp>’</dt> <dd>
<p>Enable the default instructions, equivalent to <samp class="option">-mrecip</samp>. </p> </dd> <dt>‘<samp class="samp">none</samp>’</dt> <dd>
<p>Disable all estimate instructions, equivalent to <samp class="option">-mno-recip</samp>. </p> </dd> <dt>‘<samp class="samp">div</samp>’</dt> <dd>
<p>Enable the approximation for scalar division. </p> </dd> <dt>‘<samp class="samp">vec-div</samp>’</dt> <dd>
<p>Enable the approximation for vectorized division. </p> </dd> <dt>‘<samp class="samp">sqrt</samp>’</dt> <dd>
<p>Enable the approximation for scalar square root. </p> </dd> <dt>‘<samp class="samp">vec-sqrt</samp>’</dt> <dd><p>Enable the approximation for vectorized square root. </p></dd> </dl> <p>So, for example, <samp class="option">-mrecip=all,!sqrt</samp> enables all of the reciprocal approximations, except for square root. </p> </dd> <dt>
<span><code class="code">-mveclibabi=<var class="var">type</var></code><a class="copiable-link" href="#index-mveclibabi-1"> ¶</a></span>
</dt> <dd>
<p>Specifies the ABI type to use for vectorizing intrinsics using an external library. Supported values for <var class="var">type</var> are ‘<samp class="samp">svml</samp>’ for the Intel short vector math library and ‘<samp class="samp">acml</samp>’ for the AMD math core library. To use this option, both <samp class="option">-ftree-vectorize</samp> and <samp class="option">-funsafe-math-optimizations</samp> have to be enabled, and an SVML or ACML ABI-compatible library must be specified at link time. </p> <p>GCC currently emits calls to <code class="code">vmldExp2</code>, <code class="code">vmldLn2</code>, <code class="code">vmldLog102</code>, <code class="code">vmldPow2</code>, <code class="code">vmldTanh2</code>, <code class="code">vmldTan2</code>, <code class="code">vmldAtan2</code>, <code class="code">vmldAtanh2</code>, <code class="code">vmldCbrt2</code>, <code class="code">vmldSinh2</code>, <code class="code">vmldSin2</code>, <code class="code">vmldAsinh2</code>, <code class="code">vmldAsin2</code>, <code class="code">vmldCosh2</code>, <code class="code">vmldCos2</code>, <code class="code">vmldAcosh2</code>, <code class="code">vmldAcos2</code>, <code class="code">vmlsExp4</code>, <code class="code">vmlsLn4</code>, <code class="code">vmlsLog104</code>, <code class="code">vmlsPow4</code>, <code class="code">vmlsTanh4</code>, <code class="code">vmlsTan4</code>, <code class="code">vmlsAtan4</code>, <code class="code">vmlsAtanh4</code>, <code class="code">vmlsCbrt4</code>, <code class="code">vmlsSinh4</code>, <code class="code">vmlsSin4</code>, <code class="code">vmlsAsinh4</code>, <code class="code">vmlsAsin4</code>, <code class="code">vmlsCosh4</code>, <code class="code">vmlsCos4</code>, <code class="code">vmlsAcosh4</code> and <code class="code">vmlsAcos4</code> for corresponding function type when <samp class="option">-mveclibabi=svml</samp> is used, and <code class="code">__vrd2_sin</code>, <code class="code">__vrd2_cos</code>, <code 
class="code">__vrd2_exp</code>, <code class="code">__vrd2_log</code>, <code class="code">__vrd2_log2</code>, <code class="code">__vrd2_log10</code>, <code class="code">__vrs4_sinf</code>, <code class="code">__vrs4_cosf</code>, <code class="code">__vrs4_expf</code>, <code class="code">__vrs4_logf</code>, <code class="code">__vrs4_log2f</code>, <code class="code">__vrs4_log10f</code> and <code class="code">__vrs4_powf</code> for the corresponding function type when <samp class="option">-mveclibabi=acml</samp> is used. </p> </dd> <dt>
<span><code class="code">-mabi=<var class="var">name</var></code><a class="copiable-link" href="#index-mabi-6"> ¶</a></span>
</dt> <dd>
<p>Generate code for the specified calling convention. Permissible values are ‘<samp class="samp">sysv</samp>’ for the ABI used on GNU/Linux and other systems, and ‘<samp class="samp">ms</samp>’ for the Microsoft ABI. The default is to use the Microsoft ABI when targeting Microsoft Windows and the SysV ABI on all other systems. You can control this behavior for specific functions by using the function attributes <code class="code">ms_abi</code> and <code class="code">sysv_abi</code>. See <a class="xref" href="function-attributes">Declaring Attributes of Functions</a>. </p> </dd> <dt>
<span><code class="code">-mforce-indirect-call</code><a class="copiable-link" href="#index-mforce-indirect-call"> ¶</a></span>
</dt> <dd>
<p>Force all calls to functions to be indirect. This is useful when using Intel Processor Trace where it generates more precise timing information for function calls. </p> </dd> <dt>
<span><code class="code">-mmanual-endbr</code><a class="copiable-link" href="#index-mmanual-endbr"> ¶</a></span>
</dt> <dd>
<p>Insert ENDBR instruction at function entry only via the <code class="code">cf_check</code> function attribute. This is useful when used with the option <samp class="option">-fcf-protection=branch</samp> to control ENDBR insertion at the function entry. </p> </dd> <dt>
<span><code class="code">-mcet-switch</code><a class="copiable-link" href="#index-mcet-switch"> ¶</a></span>
</dt> <dd>
<p>By default, CET instrumentation is turned off on switch statements that use a jump table, and indirect branch tracking is disabled for them. Since jump tables are stored in read-only memory, this does not result in a direct loss of hardening. But if the jump table index is attacker-controlled, the indirect jump may not be constrained by CET. This option turns on CET instrumentation to enable indirect branch tracking for switch statements with jump tables, which makes the jump targets reachable via any indirect jump. </p> </dd> <dt>
 <span><code class="code">-mcall-ms2sysv-xlogues</code><a class="copiable-link" href="#index-mcall-ms2sysv-xlogues"> ¶</a></span>
</dt> <dd>
<p>Due to differences in 64-bit ABIs, any Microsoft ABI function that calls a System V ABI function must consider RSI, RDI and XMM6-15 as clobbered. By default, the code for saving and restoring these registers is emitted inline, resulting in fairly lengthy prologues and epilogues. Using <samp class="option">-mcall-ms2sysv-xlogues</samp> emits prologues and epilogues that use stubs in the static portion of libgcc to perform these saves and restores, thus reducing function size at the cost of a few extra instructions. </p> </dd> <dt>
<span><code class="code">-mtls-dialect=<var class="var">type</var></code><a class="copiable-link" href="#index-mtls-dialect-1"> ¶</a></span>
</dt> <dd>
<p>Generate code to access thread-local storage using the ‘<samp class="samp">gnu</samp>’ or ‘<samp class="samp">gnu2</samp>’ conventions. ‘<samp class="samp">gnu</samp>’ is the conservative default; ‘<samp class="samp">gnu2</samp>’ is more efficient, but it may add compile- and run-time requirements that cannot be satisfied on all systems. </p> </dd> <dt>
 <span><code class="code">-mpush-args</code><a class="copiable-link" href="#index-mpush-args"> ¶</a></span>
</dt> <dt><code class="code">-mno-push-args</code></dt> <dd>
<p>Use PUSH operations to store outgoing parameters. This method is shorter and usually as fast as the method using SUB/MOV operations, and it is enabled by default. In some cases disabling it may improve performance because of improved scheduling and reduced dependencies. </p> </dd> <dt>
<span><code class="code">-maccumulate-outgoing-args</code><a class="copiable-link" href="#index-maccumulate-outgoing-args-1"> ¶</a></span>
</dt> <dd>
<p>If enabled, the maximum amount of space required for outgoing arguments is computed in the function prologue. This is faster on most modern CPUs because of reduced dependencies, improved scheduling and reduced stack usage when the preferred stack boundary is not equal to 2. The drawback is a notable increase in code size. This switch implies <samp class="option">-mno-push-args</samp>. </p> </dd> <dt>
<span><code class="code">-mthreads</code><a class="copiable-link" href="#index-mthreads"> ¶</a></span>
</dt> <dd>
<p>Support thread-safe exception handling on MinGW. Programs that rely on thread-safe exception handling must compile and link all code with the <samp class="option">-mthreads</samp> option. When compiling, <samp class="option">-mthreads</samp> defines <samp class="option">-D_MT</samp>; when linking, it links in a special thread helper library <samp class="option">-lmingwthrd</samp> which cleans up per-thread exception-handling data. </p> </dd> <dt>
 <span><code class="code">-mms-bitfields</code><a class="copiable-link" href="#index-mms-bitfields"> ¶</a></span>
</dt> <dt><code class="code">-mno-ms-bitfields</code></dt> <dd> <p>Enable/disable bit-field layout compatible with the native Microsoft Windows compiler. </p> <p>If <code class="code">packed</code> is used on a structure, or if bit-fields are used, it may be that the Microsoft ABI lays out the structure differently than the way GCC normally does. Particularly when moving packed data between functions compiled with GCC and the native Microsoft compiler (either via function call or as data in a file), it may be necessary to access either format. </p> <p>This option is enabled by default for Microsoft Windows targets. This behavior can also be controlled locally by use of variable or type attributes. For more information, see <a class="ref" href="variable-attributes">x86 Variable Attributes</a> and <a class="ref" href="type-attributes">x86 Type Attributes</a>. </p> <p>The Microsoft structure layout algorithm is fairly simple with the exception of the bit-field packing. The padding and alignment of members of structures and whether a bit-field can straddle a storage-unit boundary are determine by these rules: </p> <ol class="enumerate"> <li> Structure members are stored sequentially in the order in which they are declared: the first member has the lowest memory address and the last member the highest. </li>
<li> Every data object has an alignment requirement. The alignment requirement for all data except structures, unions, and arrays is either the size of the object or the current packing size (specified with either the <code class="code">aligned</code> attribute or the <code class="code">pack</code> pragma), whichever is less. For structures, unions, and arrays, the alignment requirement is the largest alignment requirement of its members. Every object is allocated an offset so that: <div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">offset % alignment_requirement == 0</pre>
</div> </li>
<li> Adjacent bit-fields are packed into the same 1-, 2-, or 4-byte allocation unit if the integral types are the same size and if the next bit-field fits into the current allocation unit without crossing the boundary imposed by the common alignment requirements of the bit-fields. </li>
</ol> <p>MSVC interprets zero-length bit-fields in the following ways: </p> <ol class="enumerate"> <li> If a zero-length bit-field is inserted between two bit-fields that are normally coalesced, the bit-fields are not coalesced. <p>For example: </p> <div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">struct
 {
   unsigned long bf_1 : 12;
   unsigned long : 0;
   unsigned long bf_2 : 12;
 } t1;</pre>
</div> <p>The size of <code class="code">t1</code> is 8 bytes with the zero-length bit-field. If the zero-length bit-field were removed, <code class="code">t1</code>’s size would be 4 bytes. </p> </li>
<li> If a zero-length bit-field is inserted after a bit-field, <code class="code">foo</code>, and the alignment of the zero-length bit-field is greater than the member that follows it, <code class="code">bar</code>, <code class="code">bar</code> is aligned as the type of the zero-length bit-field. <p>For example: </p> <div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">struct
 {
   char foo : 4;
   short : 0;
   char bar;
 } t2;

struct
 {
   char foo : 4;
   short : 0;
   double bar;
 } t3;</pre>
</div> <p>For <code class="code">t2</code>, <code class="code">bar</code> is placed at offset 2, rather than offset 1. Accordingly, the size of <code class="code">t2</code> is 4. For <code class="code">t3</code>, the zero-length bit-field does not affect the alignment of <code class="code">bar</code> or, as a result, the size of the structure. </p> <p>Taking this into account, it is important to note the following: </p> <ol class="enumerate"> <li> If a zero-length bit-field follows a normal bit-field, the type of the zero-length bit-field may affect the alignment of the structure as whole. For example, <code class="code">t2</code> has a size of 4 bytes, since the zero-length bit-field follows a normal bit-field, and is of type short. </li>
<li> Even if a zero-length bit-field is not followed by a normal bit-field, it may still affect the alignment of the structure: <div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">struct
 {
   char foo : 6;
   long : 0;
 } t4;</pre>
</div> <p>Here, <code class="code">t4</code> takes up 4 bytes. </p>
</li>
</ol> </li>
<li> Zero-length bit-fields following non-bit-field members are ignored: <div class="example smallexample"> <pre class="example-preformatted" data-language="cpp">struct
 {
   char foo;
   long : 0;
   char bar;
 } t5;</pre>
</div> <p>Here, <code class="code">t5</code> takes up 2 bytes. </p>
</li>
</ol> </dd> <dt>
 <span><code class="code">-mno-align-stringops</code><a class="copiable-link" href="#index-mno-align-stringops"> ¶</a></span>
</dt> <dd>
<p>Do not align the destination of inlined string operations. This switch reduces code size and improves performance in case the destination is already aligned, but GCC doesn’t know about it. </p> </dd> <dt>
<span><code class="code">-minline-all-stringops</code><a class="copiable-link" href="#index-minline-all-stringops"> ¶</a></span>
</dt> <dd>
<p>By default GCC inlines string operations only when the destination is known to be aligned to at least a 4-byte boundary. This option enables more inlining and increases code size, but may improve performance of code that depends on fast <code class="code">memcpy</code> and <code class="code">memset</code> for short lengths. It also enables inline expansion of <code class="code">strlen</code> for all pointer alignments. </p> </dd> <dt>
<span><code class="code">-minline-stringops-dynamically</code><a class="copiable-link" href="#index-minline-stringops-dynamically"> ¶</a></span>
</dt> <dd>
<p>For string operations of unknown size, use run-time checks with inline code for small blocks and a library call for large blocks. </p> </dd> <dt>
<span><code class="code">-mstringop-strategy=<var class="var">alg</var></code><a class="copiable-link" href="#index-mstringop-strategy_003dalg"> ¶</a></span>
</dt> <dd>
<p>Override the internal decision heuristic for the particular algorithm to use for inlining string operations. The allowed values for <var class="var">alg</var> are: </p> <dl class="table"> <dt>‘<samp class="samp">rep_byte</samp>’</dt> <dt>‘<samp class="samp">rep_4byte</samp>’</dt> <dt>‘<samp class="samp">rep_8byte</samp>’</dt> <dd>
<p>Expand using i386 <code class="code">rep</code> prefix of the specified size. </p> </dd> <dt>‘<samp class="samp">byte_loop</samp>’</dt> <dt>‘<samp class="samp">loop</samp>’</dt> <dt>‘<samp class="samp">unrolled_loop</samp>’</dt> <dd>
<p>Expand into an inline loop. </p> </dd> <dt>‘<samp class="samp">libcall</samp>’</dt> <dd><p>Always use a library call. </p></dd> </dl> </dd> <dt>
<span><code class="code">-mmemcpy-strategy=<var class="var">strategy</var></code><a class="copiable-link" href="#index-mmemcpy-strategy_003dstrategy"> ¶</a></span>
</dt> <dd>
<p>Override the internal decision heuristic to decide if <code class="code">__builtin_memcpy</code> should be inlined and what inline algorithm to use when the expected size of the copy operation is known. <var class="var">strategy</var> is a comma-separated list of <var class="var">alg</var>:<var class="var">max_size</var>:<var class="var">dest_align</var> triplets. <var class="var">alg</var> is one of the values accepted by <samp class="option">-mstringop-strategy</samp>, and <var class="var">max_size</var> specifies the maximum byte size for which inline algorithm <var class="var">alg</var> is allowed. For the last triplet, the <var class="var">max_size</var> must be <code class="code">-1</code>. The <var class="var">max_size</var> of the triplets in the list must be specified in increasing order. The minimum byte size for <var class="var">alg</var> is <code class="code">0</code> for the first triplet and, for each later triplet, <code class="code"><var class="var">max_size</var> + 1</code> of the preceding triplet. </p> </dd> <dt>
<span><code class="code">-mmemset-strategy=<var class="var">strategy</var></code><a class="copiable-link" href="#index-mmemset-strategy_003dstrategy"> ¶</a></span>
</dt> <dd>
<p>The option is similar to <samp class="option">-mmemcpy-strategy=</samp> except that it is to control <code class="code">__builtin_memset</code> expansion. </p> </dd> <dt>
<span><code class="code">-momit-leaf-frame-pointer</code><a class="copiable-link" href="#index-momit-leaf-frame-pointer-2"> ¶</a></span>
</dt> <dd>
<p>Don’t keep the frame pointer in a register for leaf functions. This avoids the instructions to save, set up, and restore frame pointers and makes an extra register available in leaf functions. The option <samp class="option">-momit-leaf-frame-pointer</samp> removes the frame pointer for leaf functions, which might make debugging harder. </p> </dd> <dt>
<span><code class="code">-mtls-direct-seg-refs</code><a class="copiable-link" href="#index-mtls-direct-seg-refs"> ¶</a></span>
</dt> <dt><code class="code">-mno-tls-direct-seg-refs</code></dt> <dd>
<p>Controls whether TLS variables may be accessed with offsets from the TLS segment register (<code class="code">%gs</code> for 32-bit, <code class="code">%fs</code> for 64-bit), or whether the thread base pointer must be added. Whether or not this is valid depends on the operating system, and whether it maps the segment to cover the entire TLS area. </p> <p>For systems that use the GNU C Library, the default is on. </p> </dd> <dt>
<span><code class="code">-msse2avx</code><a class="copiable-link" href="#index-msse2avx"> ¶</a></span>
</dt> <dt><code class="code">-mno-sse2avx</code></dt> <dd>
<p>Specify that the assembler should encode SSE instructions with VEX prefix. The option <samp class="option">-mavx</samp> turns this on by default. </p> </dd> <dt>
<span><code class="code">-mfentry</code><a class="copiable-link" href="#index-mfentry"> ¶</a></span>
</dt> <dt><code class="code">-mno-fentry</code></dt> <dd>
<p>If profiling is active (<samp class="option">-pg</samp>), put the profiling counter call before the prologue. Note: On x86 architectures the attribute <code class="code">ms_hook_prologue</code> isn’t possible at the moment for <samp class="option">-mfentry</samp> and <samp class="option">-pg</samp>. </p> </dd> <dt>
<span><code class="code">-mrecord-mcount</code><a class="copiable-link" href="#index-mrecord-mcount"> ¶</a></span>
</dt> <dt><code class="code">-mno-record-mcount</code></dt> <dd>
<p>If profiling is active (<samp class="option">-pg</samp>), generate a __mcount_loc section that contains pointers to each profiling call. This is useful for automatically patching and out calls. </p> </dd> <dt>
<span><code class="code">-mnop-mcount</code><a class="copiable-link" href="#index-mnop-mcount"> ¶</a></span>
</dt> <dt><code class="code">-mno-nop-mcount</code></dt> <dd>
<p>If profiling is active (<samp class="option">-pg</samp>), generate the calls to the profiling functions as NOPs. This is useful when they should be patched in later dynamically. This is likely only useful together with <samp class="option">-mrecord-mcount</samp>. </p> </dd> <dt>
<span><code class="code">-minstrument-return=<var class="var">type</var></code><a class="copiable-link" href="#index-minstrument-return"> ¶</a></span>
</dt> <dd>
<p>Instrument function exit in -pg -mfentry instrumented functions with a call to the specified function. This only instruments true returns ending with ret, but not sibling calls ending with jump. Valid types are <var class="var">none</var> to not instrument, <var class="var">call</var> to generate a call to __return__, or <var class="var">nop5</var> to generate a 5-byte nop. </p> </dd> <dt>
<span><code class="code">-mrecord-return</code><a class="copiable-link" href="#index-mrecord-return"> ¶</a></span>
</dt> <dt><code class="code">-mno-record-return</code></dt> <dd>
<p>Generate a __return_loc section pointing to all return instrumentation code. </p> </dd> <dt>
<span><code class="code">-mfentry-name=<var class="var">name</var></code><a class="copiable-link" href="#index-mfentry-name"> ¶</a></span>
</dt> <dd>
<p>Set name of __fentry__ symbol called at function entry for -pg -mfentry functions. </p> </dd> <dt>
<span><code class="code">-mfentry-section=<var class="var">name</var></code><a class="copiable-link" href="#index-mfentry-section"> ¶</a></span>
</dt> <dd>
<p>Set name of section to record -mrecord-mcount calls (default __mcount_loc). </p> </dd> <dt>
<span><code class="code">-mskip-rax-setup</code><a class="copiable-link" href="#index-mskip-rax-setup"> ¶</a></span>
</dt> <dt><code class="code">-mno-skip-rax-setup</code></dt> <dd>
<p>When generating code for the x86-64 architecture with SSE extensions disabled, <samp class="option">-mskip-rax-setup</samp> can be used to skip setting up the RAX register when there are no variable arguments passed in vector registers. </p> <p><strong class="strong">Warning:</strong> Since the RAX register is used to avoid unnecessarily saving vector registers on the stack when passing variable arguments, with this option callees may waste some stack space, misbehave or jump to a random location. Code generated by GCC 4.4 or newer does not have those issues, regardless of the RAX register value. </p> </dd> <dt>
<span><code class="code">-m8bit-idiv</code><a class="copiable-link" href="#index-m8bit-idiv"> ¶</a></span>
</dt> <dt><code class="code">-mno-8bit-idiv</code></dt> <dd>
<p>On some processors, like Intel Atom, 8-bit unsigned integer divide is much faster than 32-bit/64-bit integer divide. This option generates a run-time check. If both dividend and divisor are within range of 0 to 255, 8-bit unsigned integer divide is used instead of 32-bit/64-bit integer divide. </p> </dd> <dt>
 <span><code class="code">-mavx256-split-unaligned-load</code><a class="copiable-link" href="#index-mavx256-split-unaligned-load"> ¶</a></span>
</dt> <dt><code class="code">-mavx256-split-unaligned-store</code></dt> <dd>
<p>Split 32-byte AVX unaligned load and store. </p> </dd> <dt>
  <span><code class="code">-mstack-protector-guard=<var class="var">guard</var></code><a class="copiable-link" href="#index-mstack-protector-guard-4"> ¶</a></span>
</dt> <dt><code class="code">-mstack-protector-guard-reg=<var class="var">reg</var></code></dt> <dt><code class="code">-mstack-protector-guard-offset=<var class="var">offset</var></code></dt> <dd>
<p>Generate stack protection code using canary at <var class="var">guard</var>. Supported locations are ‘<samp class="samp">global</samp>’ for global canary or ‘<samp class="samp">tls</samp>’ for per-thread canary in the TLS block (the default). This option has effect only when <samp class="option">-fstack-protector</samp> or <samp class="option">-fstack-protector-all</samp> is specified. </p> <p>With the latter choice the options <samp class="option">-mstack-protector-guard-reg=<var class="var">reg</var></samp> and <samp class="option">-mstack-protector-guard-offset=<var class="var">offset</var></samp> furthermore specify which segment register (<code class="code">%fs</code> or <code class="code">%gs</code>) to use as base register for reading the canary, and from what offset from that base register. The default for those is as specified in the relevant ABI. </p> </dd> <dt>
<span><code class="code">-mgeneral-regs-only</code><a class="copiable-link" href="#index-mgeneral-regs-only-2"> ¶</a></span>
</dt> <dd>
<p>Generate code that uses only the general-purpose registers. This prevents the compiler from using floating-point, vector, mask and bound registers. </p> </dd> <dt>
<span><code class="code">-mrelax-cmpxchg-loop</code><a class="copiable-link" href="#index-mrelax-cmpxchg-loop"> ¶</a></span>
</dt> <dd>
<p>When emitting a compare-and-swap loop for <a class="ref" href="_005f_005fsync-builtins">Legacy <code class="code">__sync</code> Built-in Functions for Atomic Memory Access</a> and <a class="ref" href="_005f_005fatomic-builtins">Built-in Functions for Memory Model Aware Atomic Operations</a> lacking a native instruction, optimize for the highly contended case by issuing an atomic load before the <code class="code">CMPXCHG</code> instruction, and using the <code class="code">PAUSE</code> instruction to save CPU power when restarting the loop. </p> </dd> <dt>
<span><code class="code">-mindirect-branch=<var class="var">choice</var></code><a class="copiable-link" href="#index-mindirect-branch"> ¶</a></span>
</dt> <dd>
<p>Convert indirect call and jump with <var class="var">choice</var>. The default is ‘<samp class="samp">keep</samp>’, which keeps indirect call and jump unmodified. ‘<samp class="samp">thunk</samp>’ converts indirect call and jump to call and return thunk. ‘<samp class="samp">thunk-inline</samp>’ converts indirect call and jump to inlined call and return thunk. ‘<samp class="samp">thunk-extern</samp>’ converts indirect call and jump to external call and return thunk provided in a separate object file. You can control this behavior for a specific function by using the function attribute <code class="code">indirect_branch</code>. See <a class="xref" href="function-attributes">Declaring Attributes of Functions</a>. </p> <p>Note that <samp class="option">-mcmodel=large</samp> is incompatible with <samp class="option">-mindirect-branch=thunk</samp> and <samp class="option">-mindirect-branch=thunk-extern</samp> since the thunk function may not be reachable in the large code model. </p> <p>Note that <samp class="option">-mindirect-branch=thunk-extern</samp> is compatible with <samp class="option">-fcf-protection=branch</samp> since the external thunk can be made to enable control-flow check. </p> </dd> <dt>
<span><code class="code">-mfunction-return=<var class="var">choice</var></code><a class="copiable-link" href="#index-mfunction-return"> ¶</a></span>
</dt> <dd>
<p>Convert function return with <var class="var">choice</var>. The default is ‘<samp class="samp">keep</samp>’, which keeps function return unmodified. ‘<samp class="samp">thunk</samp>’ converts function return to call and return thunk. ‘<samp class="samp">thunk-inline</samp>’ converts function return to inlined call and return thunk. ‘<samp class="samp">thunk-extern</samp>’ converts function return to external call and return thunk provided in a separate object file. You can control this behavior for a specific function by using the function attribute <code class="code">function_return</code>. See <a class="xref" href="function-attributes">Declaring Attributes of Functions</a>. </p> <p>Note that <samp class="option">-mfunction-return=thunk-extern</samp> is compatible with <samp class="option">-fcf-protection=branch</samp> since the external thunk can be made to enable control-flow check. </p> <p>Note that <samp class="option">-mcmodel=large</samp> is incompatible with <samp class="option">-mfunction-return=thunk</samp> and <samp class="option">-mfunction-return=thunk-extern</samp> since the thunk function may not be reachable in the large code model. </p> </dd> <dt>
<span><code class="code">-mindirect-branch-register</code><a class="copiable-link" href="#index-mindirect-branch-register"> ¶</a></span>
</dt> <dd>
<p>Force indirect call and jump via register. </p> </dd> <dt>
<span><code class="code">-mharden-sls=<var class="var">choice</var></code><a class="copiable-link" href="#index-mharden-sls-1"> ¶</a></span>
</dt> <dd>
<p>Generate code to mitigate against straight line speculation (SLS) with <var class="var">choice</var>. The default is ‘<samp class="samp">none</samp>’ which disables all SLS hardening. ‘<samp class="samp">return</samp>’ enables SLS hardening for function returns. ‘<samp class="samp">indirect-jmp</samp>’ enables SLS hardening for indirect jumps. ‘<samp class="samp">all</samp>’ enables all SLS hardening. </p> </dd> <dt>
<span><code class="code">-mindirect-branch-cs-prefix</code><a class="copiable-link" href="#index-mindirect-branch-cs-prefix"> ¶</a></span>
</dt> <dd>
<p>Add CS prefix to call and jmp to indirect thunk with branch target in r8-r15 registers so that the call and jmp instruction length is 6 bytes to allow them to be replaced with ‘<samp class="samp">lfence; call *%r8-r15</samp>’ or ‘<samp class="samp">lfence; jmp *%r8-r15</samp>’ at run-time. </p> </dd> </dl> <p>These ‘<samp class="samp">-m</samp>’ switches are supported in addition to the above on x86-64 processors in 64-bit environments. </p> <dl class="table"> <dt>
    <span><code class="code">-m32</code><a class="copiable-link" href="#index-m32-2"> ¶</a></span>
</dt> <dt><code class="code">-m64</code></dt> <dt><code class="code">-mx32</code></dt> <dt><code class="code">-m16</code></dt> <dt><code class="code">-miamcu</code></dt> <dd>
<p>Generate code for a 16-bit, 32-bit or 64-bit environment. The <samp class="option">-m32</samp> option sets <code class="code">int</code>, <code class="code">long</code>, and pointer types to 32 bits, and generates code that runs on any i386 system. </p> <p>The <samp class="option">-m64</samp> option sets <code class="code">int</code> to 32 bits and <code class="code">long</code> and pointer types to 64 bits, and generates code for the x86-64 architecture. For Darwin only the <samp class="option">-m64</samp> option also turns off the <samp class="option">-fno-pic</samp> and <samp class="option">-mdynamic-no-pic</samp> options. </p> <p>The <samp class="option">-mx32</samp> option sets <code class="code">int</code>, <code class="code">long</code>, and pointer types to 32 bits, and generates code for the x86-64 architecture. </p> <p>The <samp class="option">-m16</samp> option is the same as <samp class="option">-m32</samp>, except that it outputs the <code class="code">.code16gcc</code> assembly directive at the beginning of the assembly output so that the binary can run in 16-bit mode. </p> <p>The <samp class="option">-miamcu</samp> option generates code which conforms to Intel MCU psABI. It requires the <samp class="option">-m32</samp> option to be turned on. </p> </dd> <dt>
 <span><code class="code">-mno-red-zone</code><a class="copiable-link" href="#index-mno-red-zone"> ¶</a></span>
</dt> <dd>
<p>Do not use a so-called “red zone” for x86-64 code. The red zone is mandated by the x86-64 ABI; it is a 128-byte area beyond the location of the stack pointer that is not modified by signal or interrupt handlers and therefore can be used for temporary data without adjusting the stack pointer. The flag <samp class="option">-mno-red-zone</samp> disables this red zone. </p> </dd> <dt>
<span><code class="code">-mcmodel=small</code><a class="copiable-link" href="#index-mcmodel_003dsmall-3"> ¶</a></span>
</dt> <dd>
<p>Generate code for the small code model: the program and its symbols must be linked in the lower 2 GB of the address space. Pointers are 64 bits. Programs can be statically or dynamically linked. This is the default code model. </p> </dd> <dt>
<span><code class="code">-mcmodel=kernel</code><a class="copiable-link" href="#index-mcmodel_003dkernel"> ¶</a></span>
</dt> <dd>
<p>Generate code for the kernel code model. The kernel runs in the negative 2 GB of the address space. This model has to be used for Linux kernel code. </p> </dd> <dt>
<span><code class="code">-mcmodel=medium</code><a class="copiable-link" href="#index-mcmodel_003dmedium-1"> ¶</a></span>
</dt> <dd>
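<p>As a rough illustration (a sketch, not taken from the GCC manual), the medium model distinguishes between small and large data objects such as the two below. The names <code class="code">big_table</code>, <code class="code">small_counter</code> and <code class="code">fill_and_sum</code> are hypothetical, and the 1 MiB size is chosen on the assumption that it exceeds the <samp class="option">-mlarge-data-threshold</samp> in effect.</p>

```c
#include <stddef.h>

/* 1 MiB of zero-initialized data. Under -mcmodel=medium an object this
   large is assumed to exceed the large-data threshold, so it would be a
   candidate for a large BSS section that may end up above 2 GB. */
#define BIG_COUNT ((size_t)1 << 20)
static unsigned char big_table[BIG_COUNT];

/* A few bytes only: this object stays with the ordinary small data. */
static int small_counter;

/* Fill the large table with `value` and return the sum of its bytes.
   The source code is the same under every code model; only the
   addressing sequences chosen by the compiler differ. */
unsigned long fill_and_sum(unsigned char value)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < BIG_COUNT; i++) {
        big_table[i] = value;
        sum += big_table[i];
    }
    small_counter++;   /* small object, reachable with 32-bit offsets */
    return sum;
}
```

<p>Nothing in the source needs to change between code models. Inspecting the section headers of the resulting object file is one way to check where each object was placed; the exact section names depend on the toolchain.</p>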
<p>Generate code for the medium model: the program is linked in the lower 2 GB of the address space. Small symbols are also placed there. Symbols with sizes larger than <samp class="option">-mlarge-data-threshold</samp> are put into large data or BSS sections and can be located above 2 GB. Programs can be statically or dynamically linked. </p> </dd> <dt>
<span><code class="code">-mcmodel=large</code><a class="copiable-link" href="#index-mcmodel_003dlarge-3"> ¶</a></span>
</dt> <dd>
<p>Generate code for the large model. This model makes no assumptions about addresses and sizes of sections. </p> </dd> <dt>
<span><code class="code">-maddress-mode=long</code><a class="copiable-link" href="#index-maddress-mode_003dlong"> ¶</a></span>
</dt> <dd>
<p>Generate code for long address mode. This is only supported for 64-bit and x32 environments. It is the default address mode for 64-bit environments. </p> </dd> <dt>
<span><code class="code">-maddress-mode=short</code><a class="copiable-link" href="#index-maddress-mode_003dshort"> ¶</a></span>
</dt> <dd>
<p>Generate code for short address mode. This is only supported for 32-bit and x32 environments. It is the default address mode for 32-bit and x32 environments. </p> </dd> <dt>
<span><code class="code">-mneeded</code><a class="copiable-link" href="#index-mneeded"> ¶</a></span>
</dt> <dt><code class="code">-mno-needed</code></dt> <dd>
<p>Emit the GNU_PROPERTY_X86_ISA_1_NEEDED GNU property for Linux targets to indicate the micro-architecture ISA level required to execute the binary. </p> </dd> <dt>
 <span><code class="code">-mno-direct-extern-access</code><a class="copiable-link" href="#index-mno-direct-extern-access"> ¶</a></span>
</dt> <dd>
<p>Without <samp class="option">-fpic</samp> or <samp class="option">-fPIC</samp>, always use the GOT pointer to access external symbols. With <samp class="option">-fpic</samp> or <samp class="option">-fPIC</samp>, treat access to protected symbols as access to local symbols. The default is <samp class="option">-mdirect-extern-access</samp>. </p> <p><strong class="strong">Warning:</strong> shared libraries compiled with <samp class="option">-mno-direct-extern-access</samp> and executables compiled with <samp class="option">-mdirect-extern-access</samp> may not be binary compatible if protected symbols are used in both the shared libraries and the executable. </p> </dd> <dt>
 <span><code class="code">-munroll-only-small-loops</code><a class="copiable-link" href="#index-munroll-only-small-loops"> ¶</a></span>
</dt> <dd>
<p>Controls conservative unrolling of small loops. It is enabled by default at <samp class="option">-O2</samp>, and unrolls loops with fewer than 4 insns once. An explicit <samp class="option">-f[no-]unroll-[all-]loops</samp> option disables this flag, to avoid unrolling behavior that the user did not ask for. </p> </dd> <dt>
<span><code class="code">-mlam=<var class="var">choice</var></code><a class="copiable-link" href="#index-mlam"> ¶</a></span>
</dt> <dd><p>LAM (linear-address masking) allows special bits in the pointer to be used for metadata. The default is ‘<samp class="samp">none</samp>’. With ‘<samp class="samp">u48</samp>’, pointer bits in positions 62:48 can be used for metadata; with ‘<samp class="samp">u57</samp>’, pointer bits in positions 62:57 can be used for metadata. </p></dd> </dl> </div> <div class="_attribution">
  <p class="_attribution-p">
    &copy; Free Software Foundation<br>Licensed under the GNU Free Documentation License, Version 1.3.<br>
    <a href="https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/x86-Options.html" class="_attribution-link">https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/x86-Options.html</a>
  </p>
</div>