1 // Show ToC at a specific location for a GitHub rendering
2 ifdef::env-github[]
3 :toc: macro
4 endif::env-github[]
5
6 ifndef::env-github[]
7 :toc: left
8 endif::env-github[]
9
10 // This is to mimic what GitHub does so that anchors work in an offline
11 // rendering too.
12 :idprefix:
13 :idseparator: -
14
15 // Other attributes
16 :py3: Python{nbsp}3
17
18 = Normand
19 Philippe Proulx
20
21 image::normand-logo.png[]
22
23 [.normal]
24 image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]
25
26 [.lead]
27 _**Normand**_ is a text-to-binary processor with its own language.
28
29 This package offers both a portable {py3} module and a command-line
30 tool.
31
32 WARNING: This version of Normand is 0.12, meaning both the Normand
33 language and the module/CLI interface aren't stable.
34
35 ifdef::env-github[]
36 // ToC location for a GitHub rendering
37 toc::[]
38 endif::env-github[]
39
40 == Introduction
41
42 The purpose of Normand is to consume human-readable text representing
43 bytes and to produce the corresponding binary data.
44
45 .Simple bytes input.
46 ====
47 Consider the following Normand input:
48
49 ----
50 4f 55 32 bb $167 fe %10100111 a9 $-32
51 ----
52
53 The generated nine bytes are:
54
55 ----
56 4f 55 32 bb a7 fe a7 a9 e0
57 ----
58 ====
59
60 As the previous example shows, the fundamental unit of the Normand
61 language is the _byte_. The order in which you list bytes is the
62 order of the generated data.
63
64 The Normand language is more than simple lists of bytes, though. Its
65 main features are:
66
67 Comments, including a bunch of insignificant symbols which may improve readability::
68 +
69 Input:
70 +
71 ----
72 ff bb %1101:0010 # This is a comment
73 78 29 af $192 # This too # 99 $-80
74 fe80::6257:18ff:fea3:4229
75 60:57:18:a3:42:29
76 10839636-5d65-4a68-8e6a-21608ddf7258
77 ----
78 +
79 Output:
80 +
81 ----
82 ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
83 a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
84 68 8e 6a 21 60 8d df 72 58
85 ----
86
87 Hexadecimal, decimal, and binary byte constants::
88 +
89 Input:
90 +
91 ----
92 aa bb $247 $-89 %0011_0010 %11.01= 10/10
93 ----
94 +
95 Output:
96 +
97 ----
98 aa bb f7 a7 32 da
99 ----
100
101 UTF-8, UTF-16, and UTF-32 literal strings::
102 +
103 Input:
104 +
105 ----
106 "hello world!" 00
107 u16le"stress\nverdict 🤣"
108 ----
109 +
110 Output:
111 +
112 ----
113 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
114 00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
115 00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd ┆ •d•i•c•t• •>•#•
116 ----
117
118 Labels: special variables holding the offset where they're defined::
119 +
120 ----
121 <beg> b2 52 e3 bc 91 05
122 $100 $50 <chair> 33 9f fe
123 25 e9 89 8a <end>
124 ----
125
126 Variables::
127 +
128 ----
129 5e 65 {tower = 47} c6 7f f2 c4
130 44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
131 ----
132 +
133 The value of a variable assignment is the evaluation of a valid {py3}
134 expression which may include label and variable names.
135
136 Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
137 +
138 Input:
139 +
140 ----
141 {strength = 4}
142 {be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
143 {le} {-1993 : 32}
144 {-3.141593 : 64}
145 ----
146 +
147 Output:
148 +
149 ----
150 67 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
151 09 c0
152 ----
153 +
154 The encoded number is the evaluation of a valid {py3} expression which
155 may include label and variable names.
156
157 https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
158 +
159 Input:
160 +
161 ----
162 aa bb cc {-1993 : sleb128} <meow> dd ee ff
163 {meow * 199 : uleb128}
164 ----
165 +
166 Output:
167 +
168 ----
169 aa bb cc b7 70 dd ee ff e3 07
170 ----
171 +
172 The encoded integer is the evaluation of a valid {py3} expression which
173 may include label and variable names.
174
175 Conditional::
176 +
177 Input:
178 +
179 ----
180 aa bb cc
181
182 (
183 "foo"
184
185 !if {ICITTE > 10}
186 "bar"
187 !end
188 ) * 4
189 ----
190 +
191 Output:
192 +
193 ----
194 aa bb cc 66 6f 6f 66 6f 6f 66 6f 6f 62 61 72 66 ┆ •••foofoofoobarf
195 6f 6f 62 61 72 ┆ oobar
196 ----
197
198 Repetition::
199 +
200 Input:
201 +
202 ----
203 aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
204
205 !repeat 3
206 ff ee "juice"
207 !end
208 ----
209 +
210 Output:
211 +
212 ----
213 aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
214 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
215 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
216 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
217 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
218 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
219 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
220 ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
221 6a 75 69 63 65 ┆ juice
222 ----
223
224 Alignment::
225 +
226 Input:
227 +
228 ----
229 {be}
230
231 {199:32}
232 @64 {43:64}
233 @16 {-123:16}
234 @32~255 {5584:32}
235 ----
236 +
237 Output:
238 +
239 ----
240 00 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
241 ff 85 ff ff 00 00 15 d0
242 ----
243
244 Filling::
245 +
246 Input:
247 +
248 ----
249 {le}
250 {0xdeadbeef:32}
251 {-1993:16}
252 {9:16}
253 +0x40
254 {ICITTE:8}
255 "meow mix"
256 +200~0xff
257 {ICITTE:8}
258 ----
259 +
260 Output:
261 +
262 ----
263 ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7•••••••••••
264 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
265 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
266 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
267 40 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix•••••••
268 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
269 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
270 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
271 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
272 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
273 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
274 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
275 ff ff ff ff ff ff ff ff c8 ┆ •••••••••
276 ----
277
278 Multilevel grouping::
279 +
280 Input:
281 +
282 ----
283 ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
284 ----
285 +
286 Output:
287 +
288 ----
289 ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
290 bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
291 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
292 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
293 aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
294 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
295 6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
296 ----
297
298 Macros::
299 +
300 Input:
301 +
302 ----
303 !macro hello(world)
304 "hello"
305 !if world " world" !end
306 !end
307
308 !repeat 17
309 ff ff ff ff
310 m:hello({ICITTE > 15 and ICITTE < 60})
311 !end
312 ----
313 +
314 Output:
315 +
316 ----
317 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
318 6c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
319 64 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
320 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
321 ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
322 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
323 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
324 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
325 ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
326 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
327 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
328 ----
329
330 Precise error reporting::
331 +
332 ----
333 /tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
334 ----
335 +
336 ----
337 /tmp/meow.normand:32:6 - Unexpected character `k`.
338 ----
339 +
340 ----
341 /tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
342 ----
343 +
344 ----
345 /tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
346 ----
347
348 You can use Normand to track data source files in your favorite VCS
349 instead of raw binary files. You can use the binary files that Normand
350 generates to test file format decoders, including with malformed data,
351 as well as for education.
352
353 See <<learn-normand>> to explore all the Normand features.
354
355 == Install Normand
356
357 Normand requires Python ≥ 3.4.
358
359 To install Normand:
360
361 ----
362 $ python3 -m pip install --user normand
363 ----
364
365 See
366 https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
367 to learn more about a user site installation.
368
369 [NOTE]
370 ====
371 Normand has a single module file, `normand.py`, which you can copy as is
372 to your project to use it (both the <<python3-api,`normand.parse()`>>
373 function and the <<command-line-tool,command-line tool>>).
374
375 `normand.py` has _no external dependencies_, but if you're using
376 Python{nbsp}3.4, you'll need a local copy of the standard `typing`
377 module.
378 ====
379
380 == Learn Normand
381
382 A Normand text input is a sequence of items which represent a sequence
383 of raw bytes.
384
385 [[state]] During the processing of items to data, Normand relies on a
386 current state:
387
388 [%header%autowidth]
389 |===
390 |State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>
391
392 |[[cur-offset]] Current offset
393 |
394 The current offset has an effect on the value of <<label,labels>> and of
395 the special `ICITTE` name in <<fixed-length-number,fixed-length
396 number>>, <<leb128-integer,LEB128 integer>>,
397 <<variable-assignment,variable assignment>>,
398 <<conditional-block,conditional block>>, <<repetition-block,repetition
399 block>>, <<macro-expansion,macro expansion>>, and
400 <<post-item-repetition,post-item repetition>> expression evaluation.
401
402 Each generated byte increments the current offset.
403
404 A <<current-offset-setting,current offset setting>> may change the
405 current offset without generating data.
406
407 A <<current-offset-alignment,current offset alignment>> generates
408 padding bytes to make the current offset satisfy a given alignment.
409 |`init_offset` parameter of the `parse()` function.
410 |`--offset` option.
411
412 |[[cur-bo]] Current byte order
413 |
414 The current byte order has an effect on the encoding of
415 <<fixed-length-number,fixed-length numbers>>.
416
417 A <<current-byte-order-setting,current byte order setting>> may change
418 the current byte order.
419 |`init_byte_order` parameter of the `parse()` function.
420 |`--byte-order` option.
421
422 |<<label,Labels>>
423 |Mapping of label names to integral values.
424 |`init_labels` parameter of the `parse()` function.
425 |One or more `--label` options.
426
427 |<<variable-assignment,Variables>>
428 |Mapping of variable names to integral or floating point number values.
429 |`init_variables` parameter of the `parse()` function.
430 |One or more `--var` options.
431 |===
432
433 The available items are:
434
435 * A <<byte-constant,constant integer>> representing a single byte.
436
437 * A <<literal-string,literal string>> representing a sequence of bytes
438 encoding UTF-8, UTF-16, or UTF-32 data.
439
440 * A <<current-byte-order-setting,current byte order setting>> (big or
441 little endian).
442
443 * A <<fixed-length-number,fixed-length number>> (integer or
444 floating point) using the <<cur-bo,current byte order>> and of which
445 the value is the result of a {py3} expression.
446
447 * An <<leb128-integer,LEB128 integer>> of which the value is the result
448 of a {py3} expression.
449
450 * A <<current-offset-setting,current offset setting>>.
451
452 * A <<current-offset-alignment,current offset alignment>>.
453
454 * A <<filling,filling>>.
455
456 * A <<label,label>>, that is, a named constant holding the current
457 offset.
458 +
459 This is similar to an assembly label.
460
461 * A <<variable-assignment,variable assignment>> associating a name to
462 the integral or floating point result of an evaluated {py3} expression.
463
464 * A <<group,group>>, that is, a scoped sequence of items.
465
466 * A <<conditional-block,conditional block>>.
467
468 * A <<repetition-block,repetition block>>.
469
470 * A <<macro-definition-block,macro definition block>>.
471
472 * A <<macro-expansion,macro expansion>>.
473
474 Moreover, you can repeat many items above a constant or variable number
475 of times with the ``pass:[*]`` operator _after_ the item to repeat. This
476 is called a <<post-item-repetition,post-item repetition>>.
477
478 A Normand comment may exist:
479
480 * Between items, possibly within a group.
481 * Between the nibbles of a constant hexadecimal byte.
482 * Between the bits of a constant binary byte.
483 * Between the last item and the ``pass:[*]`` character of a post-item
484 repetition, and between that ``pass:[*]`` character and the following
485 number or expression.
486 * Between the ``!repeat``/``!r`` block opening and the following
487 constant integer, name, or expression of a repetition block.
488 * Between the ``!if`` block opening and the following name or expression
489 of a conditional block.
490
491 A comment is anything between two ``pass:[#]`` characters on the same
492 line, or from ``pass:[#]`` until the end of the line. Whitespace and
493 the following symbol characters also count as comments wherever a
494 comment may exist:
495
496 ----
497 / \ ? & : ; . , [ ] _ = | -
498 ----
499
500 The latter serve to improve readability so that you may write, for
501 example, a MAC address or a UUID as is.
502
503 You can test the examples of this section with the `normand`
504 <<command-line-tool,command-line tool>> as such:
505
506 ----
507 $ normand file | hexdump -C
508 ----
509
510 where `file` is the name of a file containing the Normand input.
511
512 === Byte constant
513
514 A _byte constant_ represents a single byte.
515
516 A byte constant is:
517
518 Hexadecimal form::
519 Two consecutive hexits.
520
521 Decimal form::
522 A decimal number after the `$` prefix.
523
524 Binary form::
525 Eight bits after the `%` prefix.
526
527 ====
528 Input:
529
530 ----
531 ab cd [3d 8F] CC
532 ----
533
534 Output:
535
536 ----
537 ab cd 3d 8f cc
538 ----
539 ====
540
541 ====
542 Input:
543
544 ----
545 $192 %1100/0011 $ -77
546 ----
547
548 Output:
549
550 ----
551 c0 c3 b3
552 ----
553 ====
554
555 ====
556 Input:
557
558 ----
559 58f64689-6316-4d55-8a1a-04cada366172
560 fe80::6257:18ff:fea3:4229
561 ----
562
563 Output:
564
565 ----
566 58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
567 fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
568 ----
569 ====
570
571 ====
572 Input:
573
574 ----
575 %01110011 %01100001 %01101100 %01110101 %01110100
576 ----
577
578 Output:
579
580 ----
581 73 61 6c 75 74 ┆ salut
582 ----
583 ====
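
As a rough illustration of the decimal and binary forms above, the
following {py3} sketch computes the same byte values. The function names
are hypothetical and the accepted decimal range (-128 to 255) is an
assumption; this is not how `normand.py` is implemented.

```python
def byte_from_decimal(value: int) -> int:
    """Return the byte value for a decimal constant such as `$192` or `$-77`."""
    # Assumption: anything from -128 to 255 fits in a single byte.
    if not -128 <= value <= 255:
        raise ValueError("{} doesn't fit in a single byte".format(value))

    # Masking maps a negative value to its two's complement byte.
    return value & 0xFF


def byte_from_binary(bits: str) -> int:
    """Return the byte value for the eight bits of a binary constant."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expecting exactly eight bits")

    return int(bits, 2)


# Same values as the `$192 %1100/0011 $ -77` example above
data = bytes([
    byte_from_decimal(192),
    byte_from_binary("11000011"),
    byte_from_decimal(-77),
])
print(data.hex())  # c0c3b3
```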
584
585 === Literal string
586
587 A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
588 bytes of a string.
589
590 The string to encode isn't implicitly null-terminated: use `\0` at the
591 end of the string to add a null character.
592
593 A literal string is:
594
595 . **Optional**: one of the following encodings instead of UTF-8:
596 +
597 --
598 [horizontal]
599 `u16be`:: UTF-16BE.
600 `u16le`:: UTF-16LE.
601 `u32be`:: UTF-32BE.
602 `u32le`:: UTF-32LE.
603 --
604
605 . The ``pass:["]`` prefix.
606
607 . A sequence of zero or more characters, possibly containing escape
608 sequences.
609 +
610 An escape sequence is the ``\`` character followed by one of:
611 +
612 --
613 [horizontal]
614 `0`:: Null (U+0000)
615 `a`:: Alert (U+0007)
616 `b`:: Backspace (U+0008)
617 `e`:: Escape (U+001B)
618 `f`:: Form feed (U+000C)
619 `n`:: End of line (U+000A)
620 `r`:: Carriage return (U+000D)
621 `t`:: Character tabulation (U+0009)
622 `v`:: Line tabulation (U+000B)
623 ``\``:: Reverse solidus (U+005C)
624 ``pass:["]``:: Quotation mark (U+0022)
625 --
626
627 . The ``pass:["]`` suffix.
628
629 ====
630 Input:
631
632 ----
633 "coucou tout le monde!"
634 ----
635
636 Output:
637
638 ----
639 63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
640 6f 6e 64 65 21 ┆ onde!
641 ----
642 ====
643
644 ====
645 Input:
646
647 ----
648 u16le"I am not young enough to know everything."
649 ----
650
651 Output:
652
653 ----
654 49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
655 20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
656 6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
657 20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
658 65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
659 2e 00 ┆ .•
660 ----
661 ====
662
663 ====
664 Input:
665
666 ----
667 u32be "\"illusion is the first\nof all pleasures\" 🦉"
668 ----
669
670 Output:
671
672 ----
673 00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
674 00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
675 00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
676 00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
677 00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
678 00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
679 00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
680 00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
681 00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
682 00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
683 00 00 00 20 00 01 f9 89 ┆ ••• ••••
684 ----
685 ====
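
The encoding prefixes above correspond to standard {py3} codecs. The
sketch below, with a hypothetical `encode_literal()` helper, shows one
way to get the same bytes from plain Python; it doesn't parse escape
sequences and isn't taken from `normand.py`.

```python
# Map each encoding prefix to the name of a Python codec
# (the empty prefix stands for the default, UTF-8).
CODECS = {
    "": "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}


def encode_literal(text: str, prefix: str = "") -> bytes:
    """Encode `text` like a literal string having the prefix `prefix`."""
    # The explicit `-be`/`-le` codecs don't add a byte order mark.
    return text.encode(CODECS[prefix])


print(encode_literal("salut").hex())       # 73616c7574
print(encode_literal("🦉", "u32be").hex())  # 0001f989
print(encode_literal("🤣", "u16le").hex())  # 3ed823dd
```

Like a literal string, `encode_literal()` doesn't append a null
character: add `\0` to the text yourself if you need one.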
686
687 === Current byte order setting
688
689 This special item sets the <<cur-bo,_current byte order_>>.
690
691 The two accepted forms are:
692
693 [horizontal]
694 ``pass:[{be}]``:: Set the current byte order to big endian.
695 ``pass:[{le}]``:: Set the current byte order to little endian.
696
697 === Fixed-length number
698
699 A _fixed-length number_ represents a fixed number of bytes encoding
700 either:
701
702 * An unsigned or signed integer (two's complement).
703 +
704 The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.
705
706 * A floating point number
707 (https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
708 +
709 The available lengths are 32 (_binary32_) and 64 (_binary64_).
710
711 The value is the result of evaluating a {py3} expression; Normand
712 encodes it using the <<cur-bo,current byte order>>.
713
714 A fixed-length number is:
715
716 . The ``pass:[{]`` prefix.
717
718 . A valid {py3} expression.
719 +
720 For a fixed-length number at some source location{nbsp}__**L**__, this
721 expression may contain the name of any accessible <<label,label>> (not
722 within a nested group), including the name of a label defined
723 after{nbsp}__**L**__, as well as the name of any
724 <<variable-assignment,variable>> known at{nbsp}__**L**__.
725 +
726 The value of the special name `ICITTE` (`int` type) in this expression
727 is the <<cur-offset,current offset>> (before encoding the number).
728
729 . The `:` character.
730
731 . An encoding length in bits amongst:
732 +
733 --
734 The expression evaluates to an `int` or `bool` value::
735 `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
736 +
737 NOTE: Normand automatically converts a `bool` value to `int`.
738
739 The expression evaluates to a `float` value::
740 `32` and `64`.
741 --
742
743 . The `}` suffix.
744
745 ====
746 Input:
747
748 ----
749 {le} {345:16}
750 {be} {-0xabcd:32}
751 ----
752
753 Output:
754
755 ----
756 59 01 ff ff 54 33
757 ----
758 ====
759
760 ====
761 Input:
762
763 ----
764 {be}
765
766 # String length in bits
767 {8 * (str_end - str_beg) : 16}
768
769 # String
770 <str_beg>
771 "hello world!"
772 <str_end>
773 ----
774
775 Output:
776
777 ----
778 00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
779 ----
780 ====
781
782 ====
783 Input:
784
785 ----
786 {20 - ICITTE : 8} * 10
787 ----
788
789 Output:
790
791 ----
792 14 13 12 11 10 0f 0e 0d 0c 0b
793 ----
794 ====
795
796 ====
797 Input:
798
799 ----
800 {le}
801 {2 * 0.0529 : 32}
802 ----
803
804 Output:
805
806 ----
807 ac ad d8 3d
808 ----
809 ====
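
To make the encoding rules above more concrete, here's a minimal {py3}
sketch which produces the same bytes as some of the examples of this
document. The `encode_int()` and `encode_float()` names are
hypothetical, and accepting both the signed and the unsigned range for
a given length is an assumption.

```python
import struct


def encode_int(value: int, bits: int, byte_order: str) -> bytes:
    """Encode `value` on `bits` bits; `byte_order` is "big" or "little"."""
    if bits % 8 != 0 or not 8 <= bits <= 64:
        raise ValueError("unsupported length: {} bits".format(bits))

    # Assumption: accept the signed and the unsigned range for this length.
    if not -(1 << (bits - 1)) <= value < (1 << bits):
        raise ValueError("{} is outside the {}-bit range".format(value, bits))

    # Masking converts a negative value to its two's complement form.
    return (value & ((1 << bits) - 1)).to_bytes(bits // 8, byte_order)


def encode_float(value: float, bits: int, byte_order: str) -> bytes:
    """Encode `value` as binary32 (32 bits) or binary64 (64 bits)."""
    fmt = {32: "f", 64: "d"}[bits]
    return struct.pack((">" if byte_order == "big" else "<") + fmt, value)


print(encode_int(345, 16, "little").hex())       # 5901
print(encode_int(-0xabcd, 32, "big").hex())      # ffff5433
print(encode_float(-893.5, 32, "little").hex())  # 00605fc4
```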
810
811 === LEB128 integer
812
813 An _LEB128 integer_ represents a variable number of bytes which encode,
814 following the https://en.wikipedia.org/wiki/LEB128[LEB128] format, an
815 unsigned or signed integer: the result of evaluating a {py3}
816 expression.
817
818 An LEB128 integer is:
819
820 . The ``pass:[{]`` prefix.
821
822 . A valid {py3} expression of which the evaluation result type
823 is `int` or `bool` (automatically converted to `int`).
824 +
825 For an LEB128 integer at some source location{nbsp}__**L**__, this
826 expression may contain:
827 +
828 --
829 * The name of any <<label,label>> defined before{nbsp}__**L**__.
830 * The name of any <<variable-assignment,variable>> known
831 at{nbsp}__**L**__.
832 --
833 +
834 The value of the special name `ICITTE` (`int` type) in this expression
835 is the <<cur-offset,current offset>> (before encoding the integer).
836
837 . The `:` character.
838
839 . One of:
840 +
841 --
842 [horizontal]
843 `uleb128`:: Use the unsigned LEB128 format.
844 `sleb128`:: Use the signed LEB128 format.
845 --
846
847 . The `}` suffix.
848
849 ====
850 Input:
851
852 ----
853 {624485 : uleb128}
854 ----
855
856 Output:
857
858 ----
859 e5 8e 26
860 ----
861 ====
862
863 ====
864 Input:
865
866 ----
867 aa bb cc dd
868 <meow>
869 ee ff
870 {-981238311 + (meow * -23) : sleb128}
871 "hello"
872 ----
873
874 Output:
875
876 ----
877 aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
878 ----
879 ====
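
For reference, the two LEB128 variants can be sketched in a few lines
of {py3}. This is the textbook algorithm written with hypothetical
function names, not an excerpt of `normand.py`.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer using the unsigned LEB128 format."""
    if value < 0:
        raise ValueError("ULEB128 can't encode a negative integer")

    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7

        if value == 0:
            # Last group: continuation bit cleared
            out.append(byte)
            return bytes(out)

        out.append(byte | 0x80)


def sleb128(value: int) -> bytes:
    """Encode an integer using the signed LEB128 format."""
    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign

        # Stop when what remains is pure sign extension and the sign
        # bit (0x40) of this group already carries that sign.
        sign_bit = (byte & 0x40) != 0

        if (value == 0 and not sign_bit) or (value == -1 and sign_bit):
            out.append(byte)
            return bytes(out)

        out.append(byte | 0x80)


print(uleb128(624485).hex())  # e58e26
print(sleb128(-1993).hex())   # b770
```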
880
881 === Current offset setting
882
883 This special item sets the <<cur-offset,_current offset_>>.
884
885 A current offset setting is:
886
887 . The `<` prefix.
888
889 . A positive integer (hexadecimal starting with `0x` or `0X` accepted)
890 which is the new current offset.
891
892 . The `>` suffix.
893
894 ====
895 Input:
896
897 ----
898 {ICITTE : 8} * 8
899 <0x61> {ICITTE : 8} * 8
900 ----
901
902 Output:
903
904 ----
905 00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
906 ----
907 ====
908
909 ====
910 Input:
911
912 ----
913 aa bb cc dd <meow> ee ff
914 <12> 11 22 33 <mix> 44 55
915 {meow : 8} {mix : 8}
916 ----
917
918 Output:
919
920 ----
921 aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
922 ----
923 ====
924
925 === Current offset alignment
926
927 A _current offset alignment_ represents zero or more padding bytes to
928 make the <<cur-offset,current offset>> meet a given
929 https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.
930
931 More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
932 a current offset alignment represents the required padding bytes until
933 the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.
934
935 A current offset alignment is:
936
937 . The `@` prefix.
938
939 . A positive integer (hexadecimal starting with `0x` or `0X` accepted)
940 which is the alignment value in _bits_.
941 +
942 This value must be greater than zero and a multiple of{nbsp}8.
943
944 . **Optional**:
945 +
946 --
947 . The ``pass:[~]`` prefix.
948 . A positive integer (hexadecimal starting with `0x` or `0X` accepted)
949 which is the value of the byte to use as padding to align the
950 <<cur-offset,current offset>>.
951 --
952 +
953 Without this section, the padding byte value is zero.
954
955 ====
956 Input:
957
958 ----
959 11 22 (@32 aa bb cc) * 3
960 ----
961
962 Output:
963
964 ----
965 11 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
966 ----
967 ====
968
969 ====
970 Input:
971
972 ----
973 {le}
974 77 88
975 @32~0xcc {-893.5:32}
976 @128~0x55 "meow"
977 ----
978
979 Output:
980
981 ----
982 77 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
983 6d 65 6f 77 ┆ meow
984 ----
985 ====
986
987 ====
988 Input:
989
990 ----
991 aa bb cc <29> @64~255 "zoom"
992 ----
993
994 Output:
995
996 ----
997 aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
998 ----
999 ====
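
The padding length follows directly from the definition above. As a
minimal sketch (the `align()` helper is hypothetical and only models
the arithmetic):

```python
def align(offset: int, align_bits: int, pad_byte: int = 0) -> bytes:
    """Return the padding bytes needed at `offset` for an `align_bits` alignment."""
    if align_bits <= 0 or align_bits % 8 != 0:
        raise ValueError("the alignment must be a positive multiple of 8 bits")

    align_bytes = align_bits // 8

    # Distance to the next multiple of `align_bytes` (zero if already aligned)
    pad_len = -offset % align_bytes
    return bytes([pad_byte]) * pad_len


print(align(2, 32).hex())        # 0000
print(align(29, 64, 255).hex())  # ffffff
print(len(align(8, 128, 0x55)))  # 8
```

The three calls mirror the padding of the examples above: two zero
bytes after `11 22`, three `ff` bytes starting at offset 29, and eight
`55` bytes before `"meow"`.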
1000
1001 === Filling
1002
1003 A _filling_ represents zero or more padding bytes to make the
1004 <<cur-offset,current offset>> reach a given value.
1005
1006 A filling is:
1007
1008 . The ``pass:[+]`` prefix.
1009
1010 . One of:
1011
1012 ** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
1013 which is the current offset target.
1014
1015 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1016 evaluation result type is `int` or `bool` (automatically converted to
1017 `int`), and the ``pass:[}]`` suffix.
1018 +
1019 For a filling at some source location{nbsp}__**L**__, this expression
1020 may contain:
1021 +
1022 --
1023 * The name of any <<label,label>> defined before{nbsp}__**L**__
1024 which isn't within a nested group.
1025 * The name of any <<variable-assignment,variable>> known
1026 at{nbsp}__**L**__.
1027 --
1028 +
1029 The value of the special name `ICITTE` (`int` type) in this expression
1030 is the <<cur-offset,current offset>> (before generating the padding
1031 bytes).
1032
1033 ** A valid {py3} name.
1034 +
1035 For the name `__NAME__`, this is equivalent to the
1036 `pass:[{]__NAME__pass:[}]` form above.
1037
1038 +
1039 This value must be greater than or equal to the current offset where
1040 it's used.
1041
1042 . **Optional**:
1043 +
1044 --
1045 . The ``pass:[~]`` prefix.
1046 . A positive integer (hexadecimal starting with `0x` or `0X` accepted)
1047 which is the value of the byte to use as padding to reach the
1048 current offset target.
1049 --
1050 +
1051 Without this section, the padding byte value is zero.
1052
1053 ====
1054 Input:
1055
1056 ----
1057 aa bb cc dd
1058 +0x40
1059 "hello world"
1060 ----
1061
1062 Output:
1063
1064 ----
1065 aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1066 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1067 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1068 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1069 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
1070 ----
1071 ====
1072
1073 ====
1074 Input:
1075
1076 ----
1077 !macro part(iter, fill)
1078 <0> "particular security " {ord('0') + iter : 8} +fill~0x80
1079 !end
1080
1081 {iter = 1}
1082
1083 !repeat 5
1084 m:part(iter, {32 + 4 * iter})
1085 {iter = iter + 1}
1086 !end
1087 ----
1088
1089 Output:
1090
1091 ----
1092 70 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur
1093 69 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1•••••••••••
1094 80 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s
1095 65 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2•••••••
1096 80 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part
1097 69 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security
1098 33 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3•••••••••••••••
1099 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
1100 61 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4•••
1101 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
1102 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
1103 61 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5•••
1104 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
1105 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
1106 ----
1107 ====
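
The second example above can be reproduced with plain {py3} to check
the arithmetic. In this sketch, `fill()` is a hypothetical helper, and
`len(part)` stands for the current offset because each macro expansion
resets it to zero.

```python
def fill(offset: int, target: int, pad_byte: int = 0) -> bytes:
    """Return the padding bytes needed to go from `offset` to `target`."""
    if target < offset:
        raise ValueError(
            "target {} is before the current offset {}".format(target, offset)
        )

    return bytes([pad_byte]) * (target - offset)


data = bytearray()

for it in range(1, 6):
    # The text plus one digit: 21 bytes starting at offset zero
    part = b"particular security " + bytes([ord("0") + it])

    # Pad with 0x80 until the offset reaches 32 + 4 * it
    part += fill(len(part), 32 + 4 * it, 0x80)
    data += part

print(len(data))  # 220
```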
1108
1109 === Label
1110
1111 A _label_ associates a name to the <<cur-offset,current offset>>.
1112
1113 All the labels of a whole Normand input must have unique names.
1114
1115 A label must not have the same name as a
1116 <<variable-assignment,variable>>.
1117
1118 A label is:
1119
1120 . The `<` prefix.
1121
1122 . A valid {py3} name which is not `ICITTE`.
1123
1124 . The `>` suffix.
1125
1126 === Variable assignment
1127
1128 A _variable assignment_ associates a name to the integral or floating
1129 point result of an evaluated {py3} expression.
1130
1131 A variable assignment is:
1132
1133 . The ``pass:[{]`` prefix.
1134
1135 . A valid {py3} name which is not `ICITTE`.
1136
1137 . The `=` character.
1138
1139 . A valid {py3} expression of which the evaluation result type
1140 is `int`, `float`, or `bool` (automatically converted to `int`).
1141 +
1142 For a variable assignment at some source location{nbsp}__**L**__, this
1143 expression may contain:
1144 +
1145 --
1146 * The name of any <<label,label>> defined before{nbsp}__**L**__
1147 which isn't within a nested group.
1148 * The name of any <<variable-assignment,variable>> known
1149 at{nbsp}__**L**__.
1150 --
1151 +
1152 The value of the special name `ICITTE` (`int` type) in this expression
1153 is the <<cur-offset,current offset>>.
1154
1155 . The `}` suffix.
1156
1157 ====
1158 Input:
1159
1160 ----
1161 {mix = 101} {le}
1162 {meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
1163 "yooo" {meow + mix : 16}
1164 ----
1165
1166 Output:
1167
1168 ----
1169 11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
1170 ----
1171 ====
1172
1173 === Group
1174
1175 A _group_ is a scoped sequence of items.
1176
1177 The <<label,labels>> within a group aren't visible outside of it.
1178
1179 The main purposes of a group are to <<post-item-repetition,repeat>>
1180 more than a single item and to isolate labels.
1181
1182 A group is:
1183
1184 . The `(`, `!group`, or `!g` opening.
1185
1186 . Zero or more items.
1187
1188 . Depending on the group opening:
1189 +
1190 --
1191 `(`::
1192 The `)` closing.
1193
1194 `!group`::
1195 `!g`::
1196 The `!end` closing.
1197 --
1198
1199 ====
1200 Input:
1201
1202 ----
1203 ((aa bb cc) dd () ee) "leclerc"
1204 ----
1205
1206 Output:
1207
1208 ----
1209 aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
1210 ----
1211 ====
1212
1213 ====
1214 Input:
1215
1216 ----
1217 !group
1218 (aa bb cc) * 3 dd ee
1219 !end * 5
1220 ----
1221
1222 Output:
1223
1224 ----
1225 aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
1226 cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
1227 ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
1228 bb cc aa bb cc dd ee
1229 ----
1230 ====
1231
1232 ====
1233 Input:
1234
1235 ----
1236 {be}
1237 (
1238 <str_beg> u16le"sébastien diaz" <str_end>
1239 {ICITTE - str_beg : 8}
1240 {(end - str_beg) * 5 : 24}
1241 ) * 3
1242 <end>
1243 ----
1244
1245 Output:
1246
1247 ----
1248 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1249 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
1250 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1251 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
1252 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1253 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
1254 ----
1255 ====
1256
1257 === Conditional block
1258
1259 A _conditional block_ represents either the bytes of one or more items
1260 if some expression is true, or no bytes at all if it's false.
1261
1262 A conditional block is:
1263
1264 . The `!if` opening.
1265
1266 . One of:
1267
1268 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1269 evaluation result type is `int` or `bool` (automatically converted to
1270 `int`), and the ``pass:[}]`` suffix.
1271 +
1272 For a conditional block at some source location{nbsp}__**L**__, this
1273 expression may contain:
1274 +
1275 --
1276 * The name of any <<label,label>> defined before{nbsp}__**L**__
1277 which isn't within a nested group.
1278 * The name of any <<variable-assignment,variable>> known
1279 at{nbsp}__**L**__.
1280 --
1281 +
1282 The value of the special name `ICITTE` (`int` type) in this expression
1283 is the <<cur-offset,current offset>> (before handling the contained
1284 items).
1285
1286 ** A valid {py3} name.
1287 +
1288 For the name `__NAME__`, this is equivalent to the
1289 `pass:[{]__NAME__pass:[}]` form above.
1290
1291 . Zero or more items.
1292
1293 . The `!end` closing.
1294
1295 ====
1296 Input:
1297
1298 ----
1299 {at = 1}
1300 {rep_count = 9}
1301
1302 !repeat rep_count
1303 "meow "
1304
1305 !if {ICITTE > 25}
1306 "mix"
1307
1308 !if {at < rep_count} 20 !end
1309 !end
1310
1311 {at = at + 1}
1312 !end
1313 ----
1314
1315 Output:
1316
1317 ----
1318 6d 65 6f 77 20 6d 65 6f 77 20 6d 65 6f 77 20 6d ┆ meow meow meow m
1319 65 6f 77 20 6d 65 6f 77 20 6d 65 6f 77 20 6d 69 ┆ eow meow meow mi
1320 78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
1321 6d 69 78 20 6d 65 6f 77 20 6d 69 78 ┆ mix meow mix
1322 ----
1323 ====
1324
1325 ====
1326 Input:
1327
1328 ----
1329 <str_beg>
1330 u16le"meow mix!"
1331 <str_end>
1332
1333 !if {str_end - str_beg > 10}
1334 " BIG"
1335 !end
1336 ----
1337
1338 Output:
1339
1340 ----
1341 6d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
1342 21 00 20 42 49 47 ┆ !• BIG
1343 ----
1344 ====
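
To see why `mix` only appears from the sixth `meow` on in the first example above, here is a plain-Python sketch of the same logic. It isn't Normand's implementation; `len(data)` simply stands in for `ICITTE`, evaluated before the items of the conditional block are handled.

```python
# Plain-Python emulation of the first conditional block example (illustration only).
at = 1
rep_count = 9
data = bytearray()

for _ in range(rep_count):      # !repeat rep_count
    data += b"meow "

    if len(data) > 25:          # !if {ICITTE > 25}
        data += b"mix"

        if at < rep_count:      # !if {at < rep_count} 20 !end
            data += b"\x20"     # byte constant 20 (a space)

    at += 1                     # {at = at + 1}

print(data.decode())
```

After five repetitions the offset is exactly 25, so the condition first holds on the sixth one, and the trailing space is skipped on the last repetition because `at` equals `rep_count`.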
1345
1346 === Repetition block
1347
1348 A _repetition block_ represents the bytes of one or more items repeated
1349 a given number of times.
1350
1351 A repetition block is:
1352
1353 . The `!repeat` or `!r` opening.
1354
1355 . One of:
1356
1357 ** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
1358 which is the number of times to repeat the contained items.
1359
1360 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1361 evaluation result type is `int` or `bool` (automatically converted to
1362 `int`), and the ``pass:[}]`` suffix.
1363 +
1364 For a repetition block at some source location{nbsp}__**L**__, this
1365 expression may contain:
1366 +
1367 --
1368 * The name of any <<label,label>> defined before{nbsp}__**L**__
1369 which isn't within a nested group.
1370 * The name of any <<variable-assignment,variable>> known
1371 at{nbsp}__**L**__.
1372 --
1373 +
1374 The value of the special name `ICITTE` (`int` type) in this expression
1375 is the <<cur-offset,current offset>> (before handling the items to
1376 repeat).
1377
1378 ** A valid {py3} name.
1379 +
1380 For the name `__NAME__`, this is equivalent to the
1381 `pass:[{]__NAME__pass:[}]` form above.
1382
1383 . Zero or more items.
1384
1385 . The `!end` closing.
1386
1387 You may also use a <<post-item-repetition,post-item repetition>> after
1388 some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
1389 is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.
1390
1391 ====
1392 Input:
1393
1394 ----
1395 !repeat 0x100
1396 {end - ICITTE - 1 : 8}
1397 !end
1398
1399 <end>
1400 ----
1401
1402 Output:
1403
1404 ----
1405 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1406 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1407 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1408 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1409 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1410 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
1411 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
1412 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
1413 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
1414 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
1415 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
1416 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
1417 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
1418 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1419 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
1420 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1421 ----
1422 ====
1423
1424 ====
1425 Input:
1426
1427 ----
1428 {times = 1}
1429
1430 aa bb cc dd
1431
1432 !repeat 3
1433 <here>
1434
1435 !repeat {here + 1}
1436 ee ff
1437 !end
1438
1439 11 22 !repeat times 33 !end
1440
1441 {times = times + 1}
1442 !end
1443
1444 "coucou!"
1445 ----
1446
1447 Output:
1448
1449 ----
1450 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
1451 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1452 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1453 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1454 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1455 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1456 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1457 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1458 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1459 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1460 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
1461 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1462 ----
1463 ====
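
The nested example above can be reproduced in plain Python to check the sizes involved. This is a sketch for illustration, not Normand's implementation: it assumes that the repetition count `{here + 1}` is evaluated once, before the repeated items are handled.

```python
# Plain-Python emulation of the nested repetition example (illustration only).
times = 1
data = bytearray.fromhex("aabbccdd")

for _ in range(3):                                # !repeat 3
    here = len(data)                              # <here>
    data += bytes.fromhex("eeff") * (here + 1)    # !repeat {here + 1} ee ff !end
    data += bytes.fromhex("1122")
    data += b"\x33" * times                       # !repeat times 33 !end
    times += 1                                    # {times = times + 1}

data += b"coucou!"
```

The `here` label takes the values 4, 17, and 57, so the inner block repeats 5, 18, and 58 times, for a total of 185 bytes.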
1464
1465 === Macro definition block
1466
1467 A _macro definition block_ associates a name and parameter names to
1468 a group of items.
1469
1470 A macro definition block doesn't lead to generated bytes itself: a
1471 <<macro-expansion,macro expansion>> does so.
1472
1473 A macro definition may only exist at the root level, that is, not within
1474 a <<group,group>>, a <<repetition-block,repetition block>>, a
1475 <<conditional-block,conditional block>>, or another
1476 <<macro-definition-block,macro definition block>>.
1477
1478 All macro definitions must have unique names.
1479
1480 A macro definition is:
1481
1482 . The `!macro` or `!m` opening.
1483
1484 . A valid {py3} name (the macro name).
1485
1486 . The `(` parameter name list prefix.
1487
1488 . A comma-separated list of zero or more unique parameter names,
1489 each one being a valid {py3} name.
1490
1491 . The `)` parameter name list suffix.
1492
1493 . Zero or more items except, recursively, a macro definition block.
1494
1495 . The `!end` closing.
1496
1497 ====
1498 ----
1499 !macro bake()
1500 {le} {ICITTE * 8 : 16}
1501 u16le"predict explode"
1502 !end
1503 ----
1504 ====
1505
1506 ====
1507 ----
1508 !macro nail(rep, with_extra, val)
1509 {iter = 1}
1510
1511 !repeat rep
1512 {val + iter : uleb128}
1513 {0xdeadbeef : 32}
1514 {iter = iter + 1}
1515 !end
1516
1517 !if with_extra
1518 "meow mix\0"
1519 !end
1520 !end
1521 ----
1522 ====
1523
1524 === Macro expansion
1525
1526 A _macro expansion_ expands the items of a defined
1527 <<macro-definition-block,macro>>.
1528
1529 The macro to expand must be defined _before_ the expansion.
1530
1531 The <<state,state>> before handling the first item of the chosen macro
1532 is:
1533
1534 <<cur-offset,Current offset>>::
1535 Unchanged.
1536
1537 <<cur-bo,Current byte order>>::
1538 Unchanged.
1539
1540 Variables::
1541 The only available variables initially are the macro parameters.
1542
1543 Labels::
1544 None.
1545
1546 The state after having handled the last item of the chosen macro is:
1547
1548 Current offset::
1549 The one before handling the first item of the macro plus the size
1550 of the generated data of the macro expansion.
1551 +
1552 IMPORTANT: This means <<current-offset-setting,current offset setting>>
1553 items within the expanded macro don't impact the final current offset.
1554
1555 Current byte order::
1556 The one before handling the first item of the macro.
1557
1558 Variables::
1559 The ones before handling the first item of the macro.
1560
1561 Labels::
1562 The ones before handling the first item of the macro.
1563
1564 A macro expansion is:
1565
1566 . The `m:` prefix.
1567
1568 . A valid {py3} name (the name of the macro to expand).
1569
1570 . The `(` parameter value list prefix.
1571
1572 . A comma-separated list of zero or more unique parameter values.
1573 +
1574 The number of parameter values must match the number of parameter
1575 names of the definition of the chosen macro.
1576 +
1577 A parameter value is one of:
1578 +
1579 --
1580 * A positive integer (hexadecimal starting with `0x` or `0X` accepted).
1581
1582 * The ``pass:[{]`` prefix, a valid {py3} expression of which the
1583 evaluation result type is `int` or `bool` (automatically converted to
1584 `int`), and the ``pass:[}]`` suffix.
1585 +
1586 For a macro expansion at some source location{nbsp}__**L**__, this
1587 expression may contain:
1588
1589 ** The name of any <<label,label>> defined before{nbsp}__**L**__
1590 which isn't within a nested group.
1591 ** The name of any <<variable-assignment,variable>> known
1592 at{nbsp}__**L**__.
1593
1594 +
1595 The value of the special name `ICITTE` (`int` type) in this expression
1596 is the <<cur-offset,current offset>> (before handling the items of the
1597 chosen macro).
1598
1599 * A valid {py3} name.
1600 +
1601 For the name `__NAME__`, this is equivalent to the
1602 `pass:[{]__NAME__pass:[}]` form above.
1603 --
1604
1605 . The `)` parameter value list suffix.
1606
1607 ====
1608 Input:
1609
1610 ----
1611 !macro bake()
1612 {le} {ICITTE * 8 : 16}
1613 u16le"predict explode"
1614 !end
1615
1616 "hello [" m:bake() "] world"
1617
1618 m:bake() * 5
1619 ----
1620
1621 Output:
1622
1623 ----
1624 68 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
1625 00 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
1626 00 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
1627 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1628 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
1629 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1630 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
1631 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1632 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
1633 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1634 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
1635 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1636 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
1637 ----
1638 ====
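
The following plain-Python sketch mirrors the example above to show that each expansion sees the current offset at its own location. The `bake()` function is a hypothetical stand-in for the macro, not part of the `normand` API.

```python
# Plain-Python emulation of the `bake` macro example (illustration only).
def bake(offset: int) -> bytes:
    # {le} {ICITTE * 8 : 16} followed by u16le"predict explode"
    return (offset * 8).to_bytes(2, "little") + "predict explode".encode("utf-16-le")


data = bytearray(b"hello [")
data += bake(len(data))      # m:bake() at offset 7
data += b"] world"

for _ in range(5):           # m:bake() * 5
    data += bake(len(data))
```

The first expansion starts at offset 7, hence the `38 00` prefix (7 × 8 = 56); the next ones start at offsets 46, 78, 110, 142, and 174.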
1639
1640 ====
1641 Input:
1642
1643 ----
1644 !macro A(val, is_be)
1645 {le}
1646
1647 !if is_be
1648 {be}
1649 !end
1650
1651 {val : 16}
1652 !end
1653
1654 !macro B(rep, is_be)
1655 {iter = 1}
1656
1657 !repeat rep
1658 m:A({iter * 3}, is_be)
1659 {iter = iter + 1}
1660 !end
1661 !end
1662
1663 m:B(5, 1)
1664 m:B(3, 0)
1665 ----
1666
1667 Output:
1668
1669 ----
1670 00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
1671 ----
1672 ====
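
As a cross-check of the output above, the two macros can be modelled as ordinary Python functions. This is a sketch under the assumption that a macro behaves like a function with its own local state; `macro_a()` and `macro_b()` are hypothetical names.

```python
# Plain-Python emulation of the `A`/`B` macro example (illustration only).
def macro_a(val: int, is_be: int) -> bytes:
    byte_order = "little"                 # {le}

    if is_be:                             # !if is_be {be} !end
        byte_order = "big"

    return val.to_bytes(2, byte_order)    # {val : 16}


def macro_b(rep: int, is_be: int) -> bytes:
    data = bytearray()
    iter_count = 1                        # {iter = 1}

    for _ in range(rep):                  # !repeat rep
        data += macro_a(iter_count * 3, is_be)
        iter_count += 1                   # {iter = iter + 1}

    return bytes(data)


data = macro_b(5, 1) + macro_b(3, 0)      # m:B(5, 1) m:B(3, 0)
print(" ".join(format(byte, "02x") for byte in data))
```

The byte order chosen inside `macro_a()` stays local to each call, which matches the rule that the current byte order after an expansion is the one before it.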
1673
1674 === Post-item repetition
1675
1676 A _post-item repetition_ represents the bytes of an item repeated a
1677 given number of times.
1678
1679 A post-item repetition is:
1680
1681 . One of these items:
1682
1683 ** A <<byte-constant,byte constant>>.
1684 ** A <<literal-string,literal string>>.
1685 ** A <<fixed-length-number,fixed-length number>>.
1686 ** An <<leb128-integer,LEB128 integer>>.
1687 ** A <<macro-expansion,macro expansion>>.
1688 ** A <<group,group>>.
1689
1690 . The ``pass:[*]`` character.
1691
1692 . One of:
1693
1694 ** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
1695 which is the number of times to repeat the previous item.
1696
1697 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1698 evaluation result type is `int` or `bool` (automatically converted to
1699 `int`), and the ``pass:[}]`` suffix.
1700 +
1701 For a post-item repetition at some source location{nbsp}__**L**__, this
1702 expression may contain:
1703 +
1704 --
1705 * The name of any <<label,label>> defined before{nbsp}__**L**__
1706 which isn't within a nested group and
1707 which isn't part of the repeated item.
1708 * The name of any <<variable-assignment,variable>> known
1709 at{nbsp}__**L**__ and which isn't part of the repeated
1710 item.
1711 --
1712 +
1713 The value of the special name `ICITTE` (`int` type) in this expression
1714 is the <<cur-offset,current offset>> (before handling the items to
1715 repeat).
1716
1717 ** A valid {py3} name.
1718 +
1719 For the name `__NAME__`, this is equivalent to the
1720 `pass:[{]__NAME__pass:[}]` form above.
1721
1722 You may also use a <<repetition-block,repetition block>>. The form
1723 ``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
1724 ``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.
1725
1726 ====
1727 Input:
1728
1729 ----
1730 {end - ICITTE - 1 : 8} * 0x100 <end>
1731 ----
1732
1733 Output:
1734
1735 ----
1736 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1737 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1738 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1739 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1740 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1741 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
1742 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
1743 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
1744 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
1745 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
1746 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
1747 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
1748 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
1749 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1750 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
1751 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1752 ----
1753 ====
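
The countdown above follows from the expression being evaluated again for each repetition, with `ICITTE` advancing by one byte each time. A plain-Python sketch of the same arithmetic (not Normand's implementation; `icitte` stands in for `ICITTE`):

```python
# Plain-Python emulation of `{end - ICITTE - 1 : 8} * 0x100 <end>` (illustration only).
end = 0x100  # value of the <end> label: 256 one-byte items precede it

data = bytes(end - icitte - 1 for icitte in range(end))
```

The first byte is `ff` (256 − 0 − 1) and the last one is `00` (256 − 255 − 1).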
1754
1755 ====
1756 Input:
1757
1758 ----
1759 {times = 1}
1760 aa bb cc dd
1761 (
1762 <here>
1763 (ee ff) * {here + 1}
1764 11 22 33 * {times}
1765 {times = times + 1}
1766 ) * 3
1767 "coucou!"
1768 ----
1769
1770 Output:
1771
1772 ----
1773 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
1774 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1775 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1776 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1777 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1778 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1779 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1780 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1781 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1782 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1783 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
1784 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1785 ----
1786 ====
1787
1788 == Command-line tool
1789
1790 If you <<install-normand,installed>> the `normand` package, then you
1791 can use the `normand` command-line tool:
1792
1793 ----
1794 $ normand <<< '"ma gang de malades"' | hexdump -C
1795 ----
1796
1797 ----
1798 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
1799 00000010 65 73 |es|
1800 ----
1801
1802 If you copy the `normand.py` module to your own project, then you can
1803 run the module itself:
1804
1805 ----
1806 $ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
1807 ----
1808
1809 ----
1810 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
1811 00000010 65 73 |es|
1812 ----
1813
1814 Without a path argument, the `normand` tool reads from the standard
1815 input.
1816
1817 The `normand` tool prints the generated binary data to the standard
1818 output.
1819
1820 Various options control the initial <<state,state>> of the processor:
1821 use the `--help` option to learn more.
1822
1823 == {py3} API
1824
1825 The whole `normand` package/module public API is:
1826
1827 [source,python]
1828 ----
1829 # Byte order.
1830 class ByteOrder(enum.Enum):
1831     # Big endian.
1832     BE = ...
1833
1834     # Little endian.
1835     LE = ...
1836
1837
1838 # Text location.
1839 class TextLocation:
1840     # Line number.
1841     @property
1842     def line_no(self) -> int:
1843         ...
1844
1845     # Column number.
1846     @property
1847     def col_no(self) -> int:
1848         ...
1849
1850
1851 # Parsing error.
1852 class ParseError(RuntimeError):
1853     # Source text location.
1854     @property
1855     def text_loc(self) -> TextLocation:
1856         ...
1857
1858
1859 # Variables dictionary type (for type hints).
1860 VariablesT = typing.Dict[str, typing.Union[int, float]]
1861
1862
1863 # Labels dictionary type (for type hints).
1864 LabelsT = typing.Dict[str, int]
1865
1866
1867 # Parsing result.
1868 class ParseResult:
1869     # Generated data.
1870     @property
1871     def data(self) -> bytearray:
1872         ...
1873
1874     # Updated variable values.
1875     @property
1876     def variables(self) -> VariablesT:
1877         ...
1878
1879     # Updated main group label values.
1880     @property
1881     def labels(self) -> LabelsT:
1882         ...
1883
1884     # Final offset.
1885     @property
1886     def offset(self) -> int:
1887         ...
1888
1889     # Final byte order.
1890     @property
1891     def byte_order(self) -> typing.Optional[ByteOrder]:
1892         ...
1893
1894
1895 # Parses the `normand` input using the initial state defined by
1896 # `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
1897 # and returns the corresponding parsing result.
1898 def parse(normand: str,
1899           init_variables: typing.Optional[VariablesT] = None,
1900           init_labels: typing.Optional[LabelsT] = None,
1901           init_offset: int = 0,
1902           init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
1903     ...
1904 ----
1905
1906 The `normand` parameter is the actual <<learn-normand,Normand input>>
1907 while the other parameters control the initial <<state,state>>.
1908
1909 The `parse()` function raises a `ParseError` instance should it fail to
1910 parse the `normand` string for any reason.
1911
1912 == Development
1913
1914 Normand is a https://python-poetry.org/[Poetry] project.
1915
1916 To develop it, install it through Poetry and enter the virtual
1917 environment:
1918
1919 ----
1920 $ poetry install
1921 $ poetry shell
1922 $ normand <<< '"lol" * 10 0a'
1923 ----
1924
1925 `normand.py` is processed by:
1926
1927 * https://microsoft.github.io/pyright/[Pyright]
1928 * https://github.com/psf/black[Black]
1929 * https://pycqa.github.io/isort/[isort]
1930
1931 === Testing
1932
1933 Use https://docs.pytest.org/[pytest] to test Normand once the package is
1934 part of your virtual environment, for example:
1935
1936 ----
1937 $ poetry install
1938 $ poetry run pip3 install pytest
1939 $ poetry run pytest
1940 ----
1941
1942 The `pytest` project is currently not a development dependency in
1943 `pyproject.toml` due to backward compatibility issues with
1944 Python{nbsp}3.4.
1945
1946 In the `tests` directory, each `*.nt` file is a test. The file name
1947 prefix indicates what it's meant to test:
1948
1949 `pass-`::
1950 Everything above the `---` line is the valid Normand input
1951 to test.
1952 +
1953 Everything below the `---` line is the expected data
1954 (whitespace-separated hexadecimal bytes).
1955
1956 `fail-`::
1957 Everything above the `---` line is the invalid Normand input
1958 to test.
1959 +
1960 Everything below the `---` line is the expected error message having
1961 this form:
1962 +
1963 ----
1964 LINE:COL - MESSAGE
1965 ----
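
To make the two file layouts concrete, here is a minimal Python sketch of how such a file could be split and interpreted. It isn't the project's actual test harness: the `split_nt()`, `expected_bytes()`, and `expected_error()` helpers are hypothetical, the sample contents are made up, and the sketch assumes that the first line made of exactly `---` is the separator.

```python
import re
from typing import Tuple


def split_nt(text: str) -> Tuple[str, str]:
    """Split the contents of a `.nt` file at its first `---` line."""
    lines = text.splitlines()
    sep = lines.index("---")  # raises ValueError without a separator line
    return "\n".join(lines[:sep]), "\n".join(lines[sep + 1:])


def expected_bytes(below: str) -> bytes:
    """Decode whitespace-separated hexadecimal bytes (`pass-` files)."""
    return bytes(int(token, 16) for token in below.split())


def expected_error(below: str) -> Tuple[int, int, str]:
    """Decode a `LINE:COL - MESSAGE` expectation (`fail-` files)."""
    match = re.fullmatch(r"(\d+):(\d+) - (.+)", below.strip(), re.DOTALL)

    if match is None:
        raise ValueError("expected `LINE:COL - MESSAGE`")

    return int(match.group(1)), int(match.group(2)), match.group(3)


# Made-up contents of a `pass-` file
normand_input, below = split_nt('"lol" * 2 0a\n---\n6c 6f 6c 6c 6f 6c 0a\n')
print(normand_input, expected_bytes(below))
```

A test runner built on such helpers would then compare `expected_bytes()` with the generated data, or `expected_error()` with the location and message of the raised `ParseError`.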
1966
1967 === Contributing
1968
1969 Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
1970 for code review.
1971
1972 To report a bug, https://github.com/efficios/normand/issues/new[create a
1973 GitHub issue].