1 // Show ToC at a specific location for a GitHub rendering
2 ifdef::env-github[]
3 :toc: macro
4 endif::env-github[]
5
6 ifndef::env-github[]
7 :toc: left
8 endif::env-github[]
9
10 // This is to mimic what GitHub does so that anchors work in an offline
11 // rendering too.
12 :idprefix:
13 :idseparator: -
14
15 // Other attributes
16 :py3: Python{nbsp}3
17
18 = Normand
19 Philippe Proulx
20
21 image::normand-logo.png[]
22
23 [.normal]
24 image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]
25
26 [.lead]
27 _**Normand**_ is a text-to-binary processor with its own language.
28
29 This package offers both a portable {py3} module and a command-line
30 tool.
31
32 WARNING: This version of Normand is 0.14, meaning both the Normand
33 language and the module/CLI interface aren't stable.
34
35 ifdef::env-github[]
36 // ToC location for a GitHub rendering
37 toc::[]
38 endif::env-github[]
39
40 == Introduction
41
42 The purpose of Normand is to consume human-readable text representing
43 bytes and to produce the corresponding binary data.
44
45 .Simple bytes input.
46 ====
47 Consider the following Normand input:
48
49 ----
50 4f 55 32 bb $167 fe %10100111 a9 $-32
51 ----
52
53 The generated nine bytes are:
54
55 ----
56 4f 55 32 bb a7 fe a7 a9 e0
57 ----
58 ====
59
As you can see in the previous example, the fundamental unit of the
Normand language is the _byte_. Normand generates the bytes in the order
in which you list them.
63
64 The Normand language is more than simple lists of bytes, though. Its
65 main features are:
66
67 Comments, including a bunch of insignificant symbols which may improve readability::
68 +
69 Input:
70 +
71 ----
72 ff bb %1101:0010 # This is a comment
73 78 29 af $192 # This too # 99 $-80
74 fe80::6257:18ff:fea3:4229
75 60:57:18:a3:42:29
76 10839636-5d65-4a68-8e6a-21608ddf7258
77 ----
78 +
79 Output:
80 +
81 ----
82 ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
83 a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
84 68 8e 6a 21 60 8d df 72 58
85 ----
86
87 Hexadecimal, decimal, and binary byte constants::
88 +
89 Input:
90 +
91 ----
92 aa bb $247 $-89 %0011_0010 %11.01= 10/10
93 ----
94 +
95 Output:
96 +
97 ----
98 aa bb f7 a7 32 da
99 ----
100
101 UTF-8, UTF-16, and UTF-32 literal strings::
102 +
103 Input:
104 +
105 ----
106 "hello world!" 00
107 u16le"stress\nverdict 🤣"
108 ----
109 +
110 Output:
111 +
112 ----
113 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
114 00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
115 00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd ┆ •d•i•c•t• •>•#•
116 ----
117
118 Labels: special variables holding the offset where they're defined::
119 +
120 ----
121 <beg> b2 52 e3 bc 91 05
122 $100 $50 <chair> 33 9f fe
123 25 e9 89 8a <end>
124 ----
125
126 Variables::
127 +
128 ----
129 5e 65 {tower = 47} c6 7f f2 c4
130 44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
131 ----
132 +
133 The value of a variable assignment is the evaluation of a valid {py3}
134 expression which may include label and variable names.
135
136 Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
137 +
138 Input:
139 +
140 ----
141 {strength = 4}
142 {be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
143 {le} {-1993 : 32}
144 {-3.141593 : 64}
145 ----
146 +
147 Output:
148 +
149 ----
150 67 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
151 09 c0
152 ----
153 +
154 The encoded number is the evaluation of a valid {py3} expression which
155 may include label and variable names.
156
157 https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
158 +
159 Input:
160 +
161 ----
162 aa bb cc {-1993 : sleb128} <meow> dd ee ff
163 {meow * 199 : uleb128}
164 ----
165 +
166 Output:
167 +
168 ----
169 aa bb cc b7 70 dd ee ff e3 07
170 ----
171 +
172 The encoded integer is the evaluation of a valid {py3} expression which
173 may include label and variable names.
174
175 Conditional::
176 +
177 Input:
178 +
179 ----
180 aa bb cc
181
182 (
183 "foo"
184
185 !if {ICITTE > 10}
186 "bar"
187 !else
188 "fight"
189 !end
190 ) * 4
191 ----
192 +
193 Output:
194 +
195 ----
196 aa bb cc 66 6f 6f 66 69 67 68 74 66 6f 6f 66 69 ┆ •••foofightfoofi
197 67 68 74 66 6f 6f 62 61 72 66 6f 6f 62 61 72 ┆ ghtfoobarfoobar
198 ----
199
200 Repetition::
201 +
202 Input:
203 +
204 ----
205 aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
206
207 !repeat 3
208 ff ee "juice"
209 !end
210 ----
211 +
212 Output:
213 +
214 ----
215 aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
216 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
217 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
218 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
219 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
220 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
221 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
222 ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
223 6a 75 69 63 65 ┆ juice
224 ----
225
226 Alignment::
227 +
228 Input:
229 +
230 ----
231 {be}
232
233 {199:32}
234 @64 {43:64}
235 @16 {-123:16}
236 @32~255 {5584:32}
237 ----
238 +
239 Output:
240 +
241 ----
242 00 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
243 ff 85 ff ff 00 00 15 d0
244 ----
245
246 Filling::
247 +
248 Input:
249 +
250 ----
251 {le}
252 {0xdeadbeef:32}
253 {-1993:16}
254 {9:16}
255 +0x40
256 {ICITTE:8}
257 "meow mix"
258 +200~FFh
259 {ICITTE:8}
260 ----
261 +
262 Output:
263 +
264 ----
265 ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7•••••••••••
266 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
267 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
268 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
269 40 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix•••••••
270 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
271 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
272 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
273 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
274 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
275 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
276 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
277 ff ff ff ff ff ff ff ff c8 ┆ •••••••••
278 ----
279
280 Multilevel grouping::
281 +
282 Input:
283 +
284 ----
285 ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
286 ----
287 +
288 Output:
289 +
290 ----
291 ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
292 bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
293 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
294 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
295 aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
296 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
297 6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
298 ----
299
300 Macros::
301 +
302 Input:
303 +
304 ----
305 !macro hello(world)
306 "hello"
307 !if world " world" !end
308 !end
309
310 !repeat 17
311 ff ff ff ff
312 m:hello({ICITTE > 15 and ICITTE < 60})
313 !end
314 ----
315 +
316 Output:
317 +
318 ----
319 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
320 6c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
321 64 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
322 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
323 ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
324 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
325 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
326 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
327 ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
328 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
329 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
330 ----
331
332 Precise error reporting::
333 +
334 ----
335 /tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
336 ----
337 +
338 ----
339 /tmp/meow.normand:32:6 - Unexpected character `k`.
340 ----
341 +
342 ----
343 /tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
344 ----
345 +
346 ----
347 /tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
348 ----
349
Because a Normand input is plain text, you can track it in your favorite
VCS instead of raw binary files. You can use the binary files that
Normand generates to test file format decoders, including with malformed
data, as well as for education.
354
355 See <<learn-normand>> to explore all the Normand features.
356
357 == Install Normand
358
359 Normand requires Python ≥ 3.4.
360
361 To install Normand:
362
363 ----
364 $ python3 -m pip install --user normand
365 ----
366
367 See
368 https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
369 to learn more about a user site installation.
370
371 [NOTE]
372 ====
373 Normand has a single module file, `normand.py`, which you can copy as is
374 to your project to use it (both the <<python3-api,`normand.parse()`>>
375 function and the <<command-line-tool,command-line tool>>).
376
377 `normand.py` has _no external dependencies_, but if you're using
378 Python{nbsp}3.4, you'll need a local copy of the standard `typing`
379 module.
380 ====
381
382 == Learn Normand
383
384 A Normand text input is a sequence of items which represent a sequence
385 of raw bytes.
386
387 [[state]] During the processing of items to data, Normand relies on a
388 current state:
389
390 [%header%autowidth]
391 |===
392 |State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>
393
394 |[[cur-offset]] Current offset
395 |
396 The current offset has an effect on the value of <<label,labels>> and of
the special `ICITTE` name in <<fixed-length-number,fixed-length
number>>, <<leb128-integer,LEB128 integer>>,
399 <<variable-assignment,variable assignment>>,
400 <<conditional-block,conditional block>>, <<repetition-block,repetition
401 block>>, <<macro-expansion,macro expansion>>, and
402 <<post-item-repetition,post-item repetition>> expression evaluation.
403
404 Each generated byte increments the current offset.
405
406 A <<current-offset-setting,current offset setting>> may change the
407 current offset without generating data.
408
A <<current-offset-alignment,current offset alignment>> generates
410 padding bytes to make the current offset satisfy a given alignment.
411 |`init_offset` parameter of the `parse()` function.
412 |`--offset` option.
413
414 |[[cur-bo]] Current byte order
415 |
416 The current byte order has an effect on the encoding of
417 <<fixed-length-number,fixed-length numbers>>.
418
419 A <<current-byte-order-setting,current byte order setting>> may change
420 the current byte order.
421 |`init_byte_order` parameter of the `parse()` function.
422 |`--byte-order` option.
423
424 |<<label,Labels>>
425 |Mapping of label names to integral values.
426 |`init_labels` parameter of the `parse()` function.
427 |One or more `--label` options.
428
429 |<<variable-assignment,Variables>>
430 |Mapping of variable names to integral or floating point number values.
431 |`init_variables` parameter of the `parse()` function.
432 |One or more `--var` options.
433 |===
434
435 The available items are:
436
437 * A <<byte-constant,constant integer>> representing a single byte.
438
439 * A <<literal-string,literal string>> representing a sequence of bytes
440 encoding UTF-8, UTF-16, or UTF-32 data.
441
442 * A <<current-byte-order-setting,current byte order setting>> (big or
443 little endian).
444
445 * A <<fixed-length-number,fixed-length number>> (integer or
446 floating point) using the <<cur-bo,current byte order>> and of which
447 the value is the result of a {py3} expression.
448
449 * An <<leb128-integer,LEB128 integer>> of which the value is the result
450 of a {py3} expression.
451
452 * A <<current-offset-setting,current offset setting>>.
453
454 * A <<current-offset-alignment,current offset alignment>>.
455
456 * A <<filling,filling>>.
457
458 * A <<label,label>>, that is, a named constant holding the current
459 offset.
460 +
461 This is similar to an assembly label.
462
* A <<variable-assignment,variable assignment>> associating a name to
the integral or floating point result of an evaluated {py3} expression.
465
466 * A <<group,group>>, that is, a scoped sequence of items.
467
468 * A <<conditional-block,conditional block>>.
469
470 * A <<repetition-block,repetition block>>.
471
472 * A <<macro-definition-block,macro definition block>>.
473
474 * A <<macro-expansion,macro expansion>>.
475
476 Moreover, you can repeat many items above a constant or variable number
477 of times with the ``pass:[*]`` operator _after_ the item to repeat. This
478 is called a <<post-item-repetition,post-item repetition>>.
479
480 A Normand comment may exist:
481
482 * Between items, possibly within a group.
483 * Between the nibbles of a constant hexadecimal byte.
484 * Between the bits of a constant binary byte.
485 * Between the last item and the ``pass:[*]`` character of a post-item
486 repetition, and between that ``pass:[*]`` character and the following
487 number or expression.
488 * Between the ``!repeat``/``!r`` block opening and the following
489 constant integer, name, or expression of a repetition block.
490 * Between the ``!if`` block opening and the following name or expression
491 of a conditional block.
492
A comment is anything between two ``pass:[#]`` characters on the same
line, or from ``pass:[#]`` until the end of the line. Whitespace and
the following symbol characters are also considered comments where a
comment may exist:
497
498 ----
499 / \ ? & : ; . , [ ] _ = | -
500 ----
501
502 The latter serve to improve readability so that you may write, for
503 example, a MAC address or a UUID as is.
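
To illustrate why those symbols are convenient, here's a minimal {py3} sketch, not Normand's actual implementation, which drops the insignificant characters from an input made only of hexadecimal byte constants and decodes what remains. The `hex_bytes()` name is hypothetical.

```python
# Hypothetical helper, not part of the Normand API. It only handles
# hexadecimal byte constants (no `$`/`%` constants, literal strings,
# or `#` comments).
IGNORED = set("/\\?&:;.,[]_=|-") | set(" \t\r\n")


def hex_bytes(text: str) -> bytes:
    # Drop whitespace and the symbol characters treated as comments.
    digits = "".join(ch for ch in text if ch not in IGNORED)

    # Each remaining pair of hexadecimal digits is one byte.
    return bytes.fromhex(digits)


mac = hex_bytes("60:57:18:a3:42:29")
uuid = hex_bytes("58f64689-6316-4d55-8a1a-04cada366172")
ipv6 = hex_bytes("fe80::6257:18ff:fea3:4229")
```

The three results contain 6, 16, and 10 bytes: only the hexadecimal digits contribute to the data.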
504
[[const-int]] Many items require a _constant integer_, which may be
negative, in which case it starts with `-`. A positive constant integer
is any of:
508
509 Decimal::
One or more digits (`0` to `9`).
511
Hexadecimal::
One of:
+
* The `0x` or `0X` prefix followed by one or more hexadecimal digits
(`0` to `9`, `a` to `f`, or `A` to `F`).
* One or more hexadecimal digits followed by the `h` or `H` suffix.

Octal::
One of:
+
* The `0o` or `0O` prefix followed by one or more octal digits
(`0` to `7`).
* One or more octal digits followed by the `o`, `O`, `q`, or `Q`
suffix.

Binary::
One of:
+
* The `0b` or `0B` prefix followed by one or more bits (`0` or `1`).
* One or more bits followed by the `b` or `B` suffix.
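
The constant integer forms above can be sketched in {py3} as follows. This is only an illustration: the `parse_const_int()` name is hypothetical, and the order in which the forms are tried (which matters for inputs such as `0bh`) is an assumption, not something this document specifies.

```python
import re

# (pattern, base) pairs for the forms listed above. The order in which
# Normand itself tries them is an assumption of this sketch.
_FORMS = [
    (r"0[xX]([0-9a-fA-F]+)", 16),
    (r"([0-9a-fA-F]+)[hH]", 16),
    (r"0[oO]([0-7]+)", 8),
    (r"([0-7]+)[oOqQ]", 8),
    (r"0[bB]([01]+)", 2),
    (r"([01]+)[bB]", 2),
    (r"([0-9]+)", 10),
]


def parse_const_int(text: str) -> int:
    # An optional leading `-` makes the constant integer negative.
    negative = text.startswith("-")
    body = text[1:] if negative else text

    for pattern, base in _FORMS:
        # The whole text must match the form, not only a prefix of it.
        match = re.fullmatch(pattern, body)

        if match:
            value = int(match.group(1), base)
            return -value if negative else value

    raise ValueError("not a constant integer: {!r}".format(text))
```

For example, `parse_const_int("0x40")` and `parse_const_int("40h")` both return 64, whereas `parse_const_int("17q")` returns 15.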
532
533 You can test the examples of this section with the `normand`
534 <<command-line-tool,command-line tool>> as such:
535
536 ----
537 $ normand file | hexdump -C
538 ----
539
540 where `file` is the name of a file containing the Normand input.
541
542 === Byte constant
543
544 A _byte constant_ represents a single byte.
545
546 A byte constant is:
547
548 Hexadecimal form::
549 Two consecutive hexadecimal digits.
550
551 Decimal form::
552 One or more digits after the `$` prefix.
553
554 Binary form::
555 Eight bits after the `%` prefix.
556
557 ====
558 Input:
559
560 ----
561 ab cd [3d 8F] CC
562 ----
563
564 Output:
565
566 ----
567 ab cd 3d 8f cc
568 ----
569 ====
570
571 ====
572 Input:
573
574 ----
575 $192 %1100/0011 $ -77
576 ----
577
578 Output:
579
580 ----
581 c0 c3 b3
582 ----
583 ====
584
585 ====
586 Input:
587
588 ----
589 58f64689-6316-4d55-8a1a-04cada366172
590 fe80::6257:18ff:fea3:4229
591 ----
592
593 Output:
594
595 ----
596 58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
597 fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
598 ----
599 ====
600
601 ====
602 Input:
603
604 ----
605 %01110011 %01100001 %01101100 %01110101 %01110100
606 ----
607
608 Output:
609
610 ----
611 73 61 6c 75 74 ┆ salut
612 ----
613 ====
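
As a rough {py3} equivalent of the decimal and binary forms, the sketch below rebuilds the bytes of the `$192 %1100/0011 $ -77` example. The `byte_from_int()` helper is hypothetical, and the accepted range (-128 to 255) is an assumption based on what fits in eight bits.

```python
def byte_from_int(value: int) -> int:
    # Assumption: a single byte accepts -128..255, with negative values
    # stored as two's complement.
    if not -128 <= value <= 255:
        raise ValueError("{} doesn't fit in a single byte".format(value))

    return value & 0xFF


data = bytes([
    byte_from_int(192),    # $192
    int("11000011", 2),    # %1100/0011 (the `/` is insignificant)
    byte_from_int(-77),    # $ -77
])
```

Here `data.hex()` is `c0c3b3`, matching the output of that example.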
614
615 === Literal string
616
617 A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
618 bytes of a string.
619
620 The string to encode isn't implicitly null-terminated: use `\0` at the
621 end of the string to add a null character.
622
623 A literal string is:
624
625 . **Optional**: one of the following encodings instead of UTF-8:
626 +
627 --
628 [horizontal]
629 `u16be`:: UTF-16BE.
630 `u16le`:: UTF-16LE.
631 `u32be`:: UTF-32BE.
632 `u32le`:: UTF-32LE.
633 --
634
635 . The ``pass:["]`` prefix.
636
637 . A sequence of zero or more characters, possibly containing escape
638 sequences.
639 +
640 An escape sequence is the ``\`` character followed by one of:
641 +
642 --
643 [horizontal]
644 `0`:: Null (U+0000)
645 `a`:: Alert (U+0007)
646 `b`:: Backspace (U+0008)
647 `e`:: Escape (U+001B)
648 `f`:: Form feed (U+000C)
649 `n`:: End of line (U+000A)
650 `r`:: Carriage return (U+000D)
651 `t`:: Character tabulation (U+0009)
652 `v`:: Line tabulation (U+000B)
653 ``\``:: Reverse solidus (U+005C)
654 ``pass:["]``:: Quotation mark (U+0022)
655 --
656
657 . The ``pass:["]`` suffix.
658
659 ====
660 Input:
661
662 ----
663 "coucou tout le monde!"
664 ----
665
666 Output:
667
668 ----
669 63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
670 6f 6e 64 65 21 ┆ onde!
671 ----
672 ====
673
674 ====
675 Input:
676
677 ----
678 u16le"I am not young enough to know everything."
679 ----
680
681 Output:
682
683 ----
684 49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
685 20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
686 6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
687 20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
688 65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
689 2e 00 ┆ .•
690 ----
691 ====
692
693 ====
694 Input:
695
696 ----
697 u32be "\"illusion is the first\nof all pleasures\" 🦉"
698 ----
699
700 Output:
701
702 ----
703 00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
704 00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
705 00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
706 00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
707 00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
708 00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
709 00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
710 00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
711 00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
712 00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
713 00 00 00 20 00 01 f9 89 ┆ ••• ••••
714 ----
715 ====
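
The encoding prefixes correspond to standard {py3} codecs, so you can cross-check the expected bytes of a literal string with `str.encode()`. The `CODECS` mapping and the `encode_literal()` helper below are hypothetical names used for illustration only.

```python
# Normand encoding prefix -> Python codec name (no prefix means UTF-8).
CODECS = {
    "": "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}


def encode_literal(text: str, prefix: str = "") -> bytes:
    # The explicit-endianness codecs don't emit a byte order mark.
    return text.encode(CODECS[prefix])


utf8 = encode_literal("coucou tout le monde!")
utf16 = encode_literal("stress\nverdict 🤣", "u16le")
utf32 = encode_literal("🦉", "u32be")

# No implicit null terminator: add `\0` yourself when you need one.
terminated = encode_literal("yeah\0")
```

Note how the non-BMP characters become a surrogate pair in UTF-16 (`3e d8 23 dd`) and a single four-byte unit in UTF-32 (`00 01 f9 89`).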
716
717 === Current byte order setting
718
719 This special item sets the <<cur-bo,_current byte order_>>.
720
721 The two accepted forms are:
722
723 [horizontal]
724 ``pass:[{be}]``:: Set the current byte order to big endian.
725 ``pass:[{le}]``:: Set the current byte order to little endian.
726
727 === Fixed-length number
728
729 A _fixed-length number_ represents a fixed number of bytes encoding
730 either:
731
732 * An unsigned or signed integer (two's complement).
733 +
734 The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.
735
* A floating point number
(https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
+
The available lengths are 32 (_binary32_) and 64 (_binary64_).
740
The value is the result of evaluating a {py3} expression. Normand
encodes it using the <<cur-bo,current byte order>>.
743
744 A fixed-length number is:
745
746 . The ``pass:[{]`` prefix.
747
748 . A valid {py3} expression.
749 +
750 For a fixed-length number at some source location{nbsp}__**L**__, this
751 expression may contain the name of any accessible <<label,label>> (not
752 within a nested group), including the name of a label defined
753 after{nbsp}__**L**__, as well as the name of any
754 <<variable-assignment,variable>> known at{nbsp}__**L**__.
755 +
756 The value of the special name `ICITTE` (`int` type) in this expression
757 is the <<cur-offset,current offset>> (before encoding the number).
758
759 . The `:` character.
760
761 . An encoding length in bits amongst:
762 +
763 --
764 The expression evaluates to an `int` or `bool` value::
765 `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
766 +
767 NOTE: Normand automatically converts a `bool` value to `int`.
768
769 The expression evaluates to a `float` value::
770 `32` and `64`.
771 --
772
773 . The `}` suffix.
774
775 ====
776 Input:
777
778 ----
779 {le} {345:16}
780 {be} {-0xabcd:32}
781 ----
782
783 Output:
784
785 ----
786 59 01 ff ff 54 33
787 ----
788 ====
789
790 ====
791 Input:
792
793 ----
794 {be}
795
796 # String length in bits
797 {8 * (str_end - str_beg) : 16}
798
799 # String
800 <str_beg>
801 "hello world!"
802 <str_end>
803 ----
804
805 Output:
806
807 ----
808 00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
809 ----
810 ====
811
812 ====
813 Input:
814
815 ----
816 {20 - ICITTE : 8} * 10
817 ----
818
819 Output:
820
821 ----
822 14 13 12 11 10 0f 0e 0d 0c 0b
823 ----
824 ====
825
826 ====
827 Input:
828
829 ----
830 {le}
831 {2 * 0.0529 : 32}
832 ----
833
834 Output:
835
836 ----
837 ac ad d8 3d
838 ----
839 ====
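
To double-check the expected output of a fixed-length number, you can reproduce the encoding with the {py3} standard library. The sketch below is not how Normand is necessarily implemented: `encode_fixed()` is a hypothetical helper, and choosing the signed encoding only for negative values is an assumption.

```python
import struct


def encode_fixed(value, bits: int, byte_order: str) -> bytes:
    # `byte_order` is "big" or "little", like the {be} and {le} settings.
    if isinstance(value, float):
        if bits not in (32, 64):
            raise ValueError("floating point lengths are 32 and 64 bits")

        fmt = ">" if byte_order == "big" else "<"
        fmt += "f" if bits == 32 else "d"
        return struct.pack(fmt, value)

    if bits not in range(8, 65, 8):
        raise ValueError("integer lengths are 8, 16, ..., 64 bits")

    # int() also converts a `bool`; a negative value uses two's
    # complement. to_bytes() raises OverflowError if the value doesn't
    # fit in the requested length.
    value = int(value)
    return value.to_bytes(bits // 8, byte_order, signed=value < 0)


a = encode_fixed(345, 16, "little")
b = encode_fixed(-0xabcd, 32, "big")
c = encode_fixed(2 * 0.0529, 32, "little")
```

The three results are `59 01`, `ff ff 54 33`, and `ac ad d8 3d`, as in the examples above.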
840
841 === LEB128 integer
842
An _LEB128 integer_ represents a variable number of bytes encoding, in
the https://en.wikipedia.org/wiki/LEB128[LEB128] format, an unsigned or
signed integer which is the result of evaluating a {py3} expression.
847
848 An LEB128 integer is:
849
850 . The ``pass:[{]`` prefix.
851
852 . A valid {py3} expression of which the evaluation result type
853 is `int` or `bool` (automatically converted to `int`).
854 +
855 For an LEB128 integer at some source location{nbsp}__**L**__, this
856 expression may contain:
857 +
858 --
859 * The name of any <<label,label>> defined before{nbsp}__**L**__
860 which isn't within a nested group.
861 * The name of any <<variable-assignment,variable>> known
862 at{nbsp}__**L**__.
863 --
864 +
865 The value of the special name `ICITTE` (`int` type) in this expression
866 is the <<cur-offset,current offset>> (before encoding the integer).
867
868 . The `:` character.
869
870 . One of:
871 +
872 --
873 [horizontal]
874 `uleb128`:: Use the unsigned LEB128 format.
875 `sleb128`:: Use the signed LEB128 format.
876 --
877
878 . The `}` suffix.
879
880 ====
881 Input:
882
883 ----
884 {624485 : uleb128}
885 ----
886
887 Output:
888
889 ----
890 e5 8e 26
891 ----
892 ====
893
894 ====
895 Input:
896
897 ----
898 aa bb cc dd
899 <meow>
900 ee ff
901 {-981238311 + (meow * -23) : sleb128}
902 "hello"
903 ----
904
905 Output:
906
907 ----
908 aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
909 ----
910 ====
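
For reference, here's a short {py3} sketch of the usual LEB128 encoding algorithms (seven payload bits per byte, with the high bit set on every byte except the last one). The `uleb128()` and `sleb128()` function names are hypothetical, chosen to mirror the Normand keywords.

```python
def uleb128(value: int) -> bytes:
    if value < 0:
        raise ValueError("unsigned LEB128 needs a non-negative integer")

    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7

        if value:
            # More groups follow: set the continuation bit.
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value: int) -> bytes:
    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7  # Arithmetic shift: keeps the sign
        sign_bit = bool(byte & 0x40)

        # Stop once the remaining value is pure sign extension of the
        # group which was just extracted.
        done = (value == 0 and not sign_bit) or (value == -1 and sign_bit)
        out.append(byte if done else byte | 0x80)

        if done:
            return bytes(out)
```

With those, `uleb128(624485)` is `e5 8e 26` and `sleb128(-981238311 + 4 * -23)` is `fd fa 8d ac 7c`, where 4 is the value of the `meow` label in the second example.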
911
912 === Current offset setting
913
914 This special item sets the <<cur-offset,_current offset_>>.
915
916 A current offset setting is:
917
918 . The `<` prefix.
919
920 . A <<const-int,positive constant integer>> which is the new current
921 offset.
922
923 . The `>` suffix.
924
925 ====
926 Input:
927
928 ----
929 {ICITTE : 8} * 8
930 <0x61> {ICITTE : 8} * 8
931 ----
932
933 Output:
934
935 ----
936 00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
937 ----
938 ====
939
940 ====
941 Input:
942
943 ----
944 aa bb cc dd <meow> ee ff
945 <12> 11 22 33 <mix> 44 55
946 {meow : 8} {mix : 8}
947 ----
948
949 Output:
950
951 ----
952 aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
953 ----
954 ====
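
The key point of the last example is that the current offset and the size of the generated data are two distinct things. The following {py3} sketch models this with a hypothetical `State` class (not part of the Normand API) and replays that example.

```python
class State:
    """Toy model of the current offset, the labels, and the data."""

    def __init__(self, offset: int = 0):
        self.offset = offset
        self.data = bytearray()
        self.labels = {}

    def emit(self, data: bytes) -> None:
        # Each generated byte increments the current offset.
        self.data += data
        self.offset += len(data)

    def label(self, name: str) -> None:
        # A label holds the current offset where it's defined.
        self.labels[name] = self.offset

    def set_offset(self, offset: int) -> None:
        # Changes the current offset without generating data.
        self.offset = offset


st = State()
st.emit(bytes.fromhex("aabbccdd"))
st.label("meow")
st.emit(bytes.fromhex("eeff"))
st.set_offset(12)
st.emit(bytes.fromhex("112233"))
st.label("mix")
st.emit(bytes.fromhex("4455"))
st.emit(bytes([st.labels["meow"], st.labels["mix"]]))
```

At the end, `st.data` contains 13 bytes whereas `st.offset` is 19.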
955
956 === Current offset alignment
957
958 A _current offset alignment_ represents zero or more padding bytes to
959 make the <<cur-offset,current offset>> meet a given
960 https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.
961
962 More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
963 a current offset alignment represents the required padding bytes until
964 the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.
965
966 A current offset alignment is:
967
968 . The `@` prefix.
969
970 . A <<const-int,positive constant integer>> which is the alignment value
971 in _bits_.
972 +
973 This value must be greater than zero and a multiple of{nbsp}8.
974
975 . **Optional**:
976 +
977 --
978 . The ``pass:[~]`` prefix.
979 . A <<const-int,positive constant integer>> which is the value of the
980 byte to use as padding to align the <<cur-offset,current offset>>.
981 --
982 +
983 Without this section, the padding byte value is zero.
984
985 ====
986 Input:
987
988 ----
989 11 22 (@32 aa bb cc) * 3
990 ----
991
992 Output:
993
994 ----
995 11 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
996 ----
997 ====
998
999 ====
1000 Input:
1001
1002 ----
1003 {le}
1004 77 88
1005 @32~0xcc {-893.5:32}
1006 @128~0x55 "meow"
1007 ----
1008
1009 Output:
1010
1011 ----
1012 77 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
1013 6d 65 6f 77 ┆ meow
1014 ----
1015 ====
1016
1017 ====
1018 Input:
1019
1020 ----
1021 aa bb cc <29> @64~255 "zoom"
1022 ----
1023
1024 Output:
1025
1026 ----
1027 aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
1028 ----
1029 ====
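
The number of padding bytes follows directly from the definition above. Here's a small {py3} sketch; the `alignment_padding()` name and its parameters are hypothetical.

```python
def alignment_padding(offset: int, align_bits: int, pad_byte: int = 0) -> bytes:
    # The alignment value is in bits and must be a positive multiple of 8.
    if align_bits <= 0 or align_bits % 8 != 0:
        raise ValueError("alignment must be a positive multiple of 8")

    align_bytes = align_bits // 8

    # Distance from `offset` to the next multiple of `align_bytes`
    # (zero if `offset` is already aligned).
    count = -offset % align_bytes
    return bytes([pad_byte]) * count
```

For the last example, `alignment_padding(29, 64, 255)` returns three `ff` bytes: the current offset is 29 and the next multiple of 8 is 32.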
1030
1031 === Filling
1032
1033 A _filling_ represents zero or more padding bytes to make the
1034 <<cur-offset,current offset>> reach a given value.
1035
1036 A filling is:
1037
1038 . The ``pass:[+]`` prefix.
1039
1040 . One of:
1041
1042 ** A <<const-int,positive constant integer>> which is the current offset
1043 target.
1044
1045 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1046 evaluation result type is `int` or `bool` (automatically converted to
1047 `int`), and the ``pass:[}]`` suffix.
1048 +
1049 For a filling at some source location{nbsp}__**L**__, this expression
1050 may contain:
1051 +
1052 --
1053 * The name of any <<label,label>> defined before{nbsp}__**L**__
1054 which isn't within a nested group.
1055 * The name of any <<variable-assignment,variable>> known
1056 at{nbsp}__**L**__.
1057 --
1058 +
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before generating the padding
bytes).
1062
1063 ** A valid {py3} name.
1064 +
1065 For the name `__NAME__`, this is equivalent to the
1066 `pass:[{]__NAME__pass:[}]` form above.
1067
1068 +
1069 This value must be greater than or equal to the current offset where
1070 it's used.
1071
1072 . **Optional**:
1073 +
1074 --
1075 . The ``pass:[~]`` prefix.
1076 . A <<const-int,positive constant integer>> which is the value of the
1077 byte to use as padding to reach the current offset target.
1078 --
1079 +
1080 Without this section, the padding byte value is zero.
1081
1082 ====
1083 Input:
1084
1085 ----
1086 aa bb cc dd
1087 +0x40
1088 "hello world"
1089 ----
1090
1091 Output:
1092
1093 ----
1094 aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1095 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1096 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1097 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1098 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
1099 ----
1100 ====
1101
1102 ====
1103 Input:
1104
1105 ----
1106 !macro part(iter, fill)
1107 <0> "particular security " {ord('0') + iter : 8} +fill~0x80
1108 !end
1109
1110 {iter = 1}
1111
1112 !repeat 5
1113 m:part(iter, {32 + 4 * iter})
1114 {iter = iter + 1}
1115 !end
1116 ----
1117
1118 Output:
1119
1120 ----
1121 70 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur
1122 69 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1•••••••••••
1123 80 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s
1124 65 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2•••••••
1125 80 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part
1126 69 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security
1127 33 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3•••••••••••••••
1128 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
1129 61 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4•••
1130 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
1131 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
1132 61 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5•••
1133 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
1134 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
1135 ----
1136 ====
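
Unlike an alignment, a filling targets an absolute offset. A minimal {py3} sketch, with a hypothetical `filling()` helper and assuming the current offset starts at zero so that it equals the data size:

```python
def filling(offset: int, target: int, pad_byte: int = 0) -> bytes:
    # The target must not be before the current offset.
    if target < offset:
        raise ValueError("target {} is before offset {}".format(target, offset))

    return bytes([pad_byte]) * (target - offset)


# Replay of the first example: `aa bb cc dd +0x40 "hello world"`.
out = bytearray.fromhex("aabbccdd")
out += filling(len(out), 0x40)
out += b"hello world"
```

In the second example, the first macro expansion generates 21 bytes before its filling, so `filling(21, 36, 0x80)` accounts for the 15 `80` bytes which follow `particular security 1`.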
1137
1138 === Label
1139
1140 A _label_ associates a name to the <<cur-offset,current offset>>.
1141
1142 All the labels of a whole Normand input must have unique names.
1143
A label must not share its name with a
<<variable-assignment,variable>>.
1146
1147 A label is:
1148
1149 . The `<` prefix.
1150
1151 . A valid {py3} name which is not `ICITTE`.
1152
1153 . The `>` suffix.
1154
1155 === Variable assignment
1156
A _variable assignment_ associates a name to the integral or floating
point result of an evaluated {py3} expression.
1159
1160 A variable assignment is:
1161
1162 . The ``pass:[{]`` prefix.
1163
1164 . A valid {py3} name which is not `ICITTE`.
1165
1166 . The `=` character.
1167
1168 . A valid {py3} expression of which the evaluation result type
1169 is `int`, `float`, or `bool` (automatically converted to `int`).
1170 +
1171 For a variable assignment at some source location{nbsp}__**L**__, this
1172 expression may contain:
1173 +
1174 --
1175 * The name of any <<label,label>> defined before{nbsp}__**L**__
1176 which isn't within a nested group.
1177 * The name of any <<variable-assignment,variable>> known
1178 at{nbsp}__**L**__.
1179 --
1180 +
1181 The value of the special name `ICITTE` (`int` type) in this expression
1182 is the <<cur-offset,current offset>>.
1183
1184 . The `}` suffix.
1185
1186 ====
1187 Input:
1188
1189 ----
1190 {mix = 101} {le}
1191 {meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
1192 "yooo" {meow + mix : 16}
1193 ----
1194
1195 Output:
1196
1197 ----
1198 11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
1199 ----
1200 ====
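
One way to picture the expression evaluation is a plain `eval()` call with the known names and `ICITTE` as local names. This is only a sketch under that assumption (this document doesn't describe how Normand evaluates expressions), and `evaluate()` is a hypothetical helper. Don't use `eval()` like this with untrusted input.

```python
def evaluate(expr: str, names: dict, offset: int):
    # Known labels and variables, plus the special `ICITTE` name.
    scope = dict(names, ICITTE=offset)

    # Assumption: the built-in functions stay available (Python adds
    # them to the empty globals), because an example of this document
    # calls ord().
    value = eval(expr, {}, scope)

    if isinstance(value, bool):
        # A `bool` result is converted to `int`.
        value = int(value)

    if not isinstance(value, (int, float)):
        raise TypeError("unexpected result type for {!r}".format(expr))

    return value


# Replay of the assignments of the example above.
variables = {"mix": 101, "meow": 42}
variables["meow"] = evaluate("ICITTE + 17", variables, offset=4)
total = evaluate("meow + mix", variables, offset=8)
```

After the second assignment, `meow` is 21 (the current offset is 4 after the four first bytes), so `total` is 122, which the example encodes as `7a 00` in little endian.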
1201
1202 === Group
1203
1204 A _group_ is a scoped sequence of items.
1205
1206 The <<label,labels>> within a group aren't visible outside of it.
1207
The main purposes of a group are to <<post-item-repetition,repeat>>
more than a single item and to isolate labels.
1210
1211 A group is:
1212
1213 . The `(`, `!group`, or `!g` opening.
1214
1215 . Zero or more items.
1216
1217 . Depending on the group opening:
1218 +
1219 --
1220 `(`::
1221 The `)` closing.
1222
1223 `!group`::
1224 `!g`::
1225 The `!end` closing.
1226 --
1227
1228 ====
1229 Input:
1230
1231 ----
1232 ((aa bb cc) dd () ee) "leclerc"
1233 ----
1234
1235 Output:
1236
1237 ----
1238 aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
1239 ----
1240 ====
1241
1242 ====
1243 Input:
1244
1245 ----
1246 !group
1247 (aa bb cc) * 3 dd ee
1248 !end * 5
1249 ----
1250
1251 Output:
1252
1253 ----
1254 aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
1255 cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
1256 ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
1257 bb cc aa bb cc dd ee
1258 ----
1259 ====
1260
1261 ====
1262 Input:
1263
1264 ----
1265 {be}
1266 (
1267 <str_beg> u16le"sébastien diaz" <str_end>
1268 {ICITTE - str_beg : 8}
1269 {(end - str_beg) * 5 : 24}
1270 ) * 3
1271 <end>
1272 ----
1273
1274 Output:
1275
1276 ----
1277 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1278 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
1279 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1280 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
1281 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1282 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
1283 ----
1284 ====
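
The last example uses the `end` label before the point where it's
defined. As a hedged illustration of the resulting values, the
following {py3} sketch precomputes `end` (possible here because every
repetition has a fixed size) and rebuilds the same bytes. It models the
arithmetic only and assumes an initial offset of 0.

```python
# Illustrative model only: NOT Normand's implementation.
text = "s\u00e9bastien diaz".encode("utf-16-le")  # u16le"..." (28 bytes)
group_size = len(text) + 1 + 3                    # string, 8-bit and 24-bit fields
end = 3 * group_size                              # <end>, right after `) * 3`

out = bytearray()

for _ in range(3):
    str_beg = len(out)                                 # <str_beg>
    out += text
    out += (len(out) - str_beg).to_bytes(1, "big")     # {ICITTE - str_beg : 8}
    out += ((end - str_beg) * 5).to_bytes(3, "big")    # {(end - str_beg) * 5 : 24}

print(len(out), end)
```

Each repetition sees its own `str_beg` value (0, 32, and 64), which is
why the 24-bit field differs from one repetition to the next.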
1285
1286 === Conditional block
1287
1288 A _conditional block_ represents either the bytes of zero or more items
1289 if some expression is true, or the bytes of zero or more other items if
1290 it's false.
1291
1292 A conditional block is:
1293
1294 . The `!if` opening.
1295
1296 . One of:
1297
1298 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1299 evaluation result type is `int` or `bool` (automatically converted to
1300 `int`), and the ``pass:[}]`` suffix.
1301 +
1302 For a conditional block at some source location{nbsp}__**L**__, this
1303 expression may contain:
1304 +
1305 --
1306 * The name of any <<label,label>> defined before{nbsp}__**L**__
1307 which isn't within a nested group.
1308 * The name of any <<variable-assignment,variable>> known
1309 at{nbsp}__**L**__.
1310 --
1311 +
1312 The value of the special name `ICITTE` (`int` type) in this expression
1313 is the <<cur-offset,current offset>> (before handling the contained
1314 items).
1315
1316 ** A valid {py3} name.
1317 +
1318 For the name `__NAME__`, this is equivalent to the
1319 `pass:[{]__NAME__pass:[}]` form above.
1320
1321 . Zero or more items to be handled when the condition is true.
1322
1323 . **Optional**:
1324
1325 .. The `!else` opening.
1326 .. Zero or more items to be handled when the condition is false.
1327
1328 . The `!end` closing.
1329
1330 ====
1331 Input:
1332
1333 ----
1334 {at = 1}
1335 {rep_count = 9}
1336
1337 !repeat rep_count
1338 "meow "
1339
1340 !if {ICITTE > 25}
1341 "mix"
1342 !else
1343 "zoom"
1344 !end
1345
1346 !if {at < rep_count} 20 !end
1347
1348 {at = at + 1}
1349 !end
1350 ----
1351
1352 Output:
1353
1354 ----
1355 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 6f 77 20 7a ┆ meow zoom meow z
1356 6f 6f 6d 20 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 ┆ oom meow zoom me
1357 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 78 20 ┆ ow mix meow mix
1358 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 ┆ meow mix meow mi
1359 78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
1360 6d 69 78 ┆ mix
1361 ----
1362 ====
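
As a rough cross-check of the example above, this {py3} sketch mirrors
its control flow. It assumes an initial offset of 0 and reads `ICITTE`
as the output length right before the conditional block; it's a model
of the semantics described here, not Normand code.

```python
# Illustrative model only: NOT Normand's implementation.
out = bytearray()
at = 1                                   # {at = 1}
rep_count = 9                            # {rep_count = 9}

for _ in range(rep_count):               # !repeat rep_count
    out += b"meow "
    icitte = len(out)                    # offset before the contained items

    if icitte > 25:                      # !if {ICITTE > 25}
        out += b"mix"
    else:                                # !else
        out += b"zoom"

    if at < rep_count:                   # !if {at < rep_count} 20 !end
        out += b"\x20"

    at += 1                              # {at = at + 1}

print(out.decode("ascii"))
```

The third repetition evaluates the condition with `ICITTE` equal to
exactly 25, which isn't greater than 25, hence three `zoom` before the
first `mix`.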
1363
1364 ====
1365 Input:
1366
1367 ----
1368 <str_beg>
1369 u16le"meow mix!"
1370 <str_end>
1371
1372 !if {str_end - str_beg > 10}
1373 " BIG"
1374 !end
1375 ----
1376
1377 Output:
1378
1379 ----
1380 6d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
1381 21 00 20 42 49 47 ┆ !• BIG
1382 ----
1383 ====
1384
1385 === Repetition block
1386
A _repetition block_ represents the bytes of zero or more items repeated
a given number of times.
1389
1390 A repetition block is:
1391
1392 . The `!repeat` or `!r` opening.
1393
1394 . One of:
1395
** A <<const-int,positive constant integer>> which is the number of
times to repeat the contained items.
1398
1399 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1400 evaluation result type is `int` or `bool` (automatically converted to
1401 `int`), and the ``pass:[}]`` suffix.
1402 +
1403 For a repetition block at some source location{nbsp}__**L**__, this
1404 expression may contain:
1405 +
1406 --
1407 * The name of any <<label,label>> defined before{nbsp}__**L**__
1408 which isn't within a nested group.
1409 * The name of any <<variable-assignment,variable>> known
1410 at{nbsp}__**L**__.
1411 --
1412 +
1413 The value of the special name `ICITTE` (`int` type) in this expression
1414 is the <<cur-offset,current offset>> (before handling the items to
1415 repeat).
1416
1417 ** A valid {py3} name.
1418 +
1419 For the name `__NAME__`, this is equivalent to the
1420 `pass:[{]__NAME__pass:[}]` form above.
1421
1422 . Zero or more items.
1423
1424 . The `!end` closing.
1425
1426 You may also use a <<post-item-repetition,post-item repetition>> after
1427 some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
1428 is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.
1429
1430 ====
1431 Input:
1432
1433 ----
1434 !repeat 0o400
1435 {end - ICITTE - 1 : 8}
1436 !end
1437
1438 <end>
1439 ----
1440
1441 Output:
1442
1443 ----
1444 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1445 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1446 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1447 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1448 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1449 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
1450 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
1451 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
1452 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
1453 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
1454 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
1455 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
1456 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
1457 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1458 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
1459 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1460 ----
1461 ====
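
The countdown above follows from `ICITTE` growing by one on each
repetition while `end` stays fixed. A minimal {py3} sketch of that
reasoning, assuming an initial offset of 0 (a model only, not Normand's
implementation):

```python
# Illustrative model only: NOT Normand's implementation.
count = 0o400                # 256, the same octal literal as in the input
end = count                  # <end>: each repetition generates one byte
out = bytearray()

for _ in range(count):       # !repeat 0o400
    icitte = len(out)
    out += (end - icitte - 1).to_bytes(1, "big")   # {end - ICITTE - 1 : 8}

print(out[0], out[-1])
```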
1462
1463 ====
1464 Input:
1465
1466 ----
1467 {times = 1}
1468
1469 aa bb cc dd
1470
1471 !repeat 3
1472 <here>
1473
1474 !repeat {here + 1}
1475 ee ff
1476 !end
1477
1478 11 22 !repeat times 33 !end
1479
1480 {times = times + 1}
1481 !end
1482
1483 "coucou!"
1484 ----
1485
1486 Output:
1487
1488 ----
1489 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
1490 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1491 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1492 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1493 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1494 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1495 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1496 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1497 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1498 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1499 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
1500 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1501 ----
1502 ====
1503
1504 === Macro definition block
1505
1506 A _macro definition block_ associates a name and parameter names to
1507 a group of items.
1508
1509 A macro definition block doesn't lead to generated bytes itself: a
1510 <<macro-expansion,macro expansion>> does so.
1511
1512 A macro definition may only exist at the root level, that is, not within
1513 a <<group,group>>, a <<repetition-block,repetition block>>, a
1514 <<conditional-block,conditional block>>, or another
1515 <<macro-definition-block,macro definition block>>.
1516
1517 All macro definitions must have unique names.
1518
1519 A macro definition is:
1520
1521 . The `!macro` or `!m` opening.
1522
1523 . A valid {py3} name (the macro name).
1524
1525 . The `(` parameter name list prefix.
1526
1527 . A comma-separated list of zero or more unique parameter names,
1528 each one being a valid {py3} name.
1529
1530 . The `)` parameter name list suffix.
1531
1532 . Zero or more items except, recursively, a macro definition block.
1533
1534 . The `!end` closing.
1535
1536 ====
1537 ----
1538 !macro bake()
1539 {le} {ICITTE * 8 : 16}
1540 u16le"predict explode"
1541 !end
1542 ----
1543 ====
1544
1545 ====
1546 ----
1547 !macro nail(rep, with_extra, val)
1548 {iter = 1}
1549
1550 !repeat rep
1551 {val + iter : uleb128}
1552 {0xdeadbeef : 32}
1553 {iter = iter + 1}
1554 !end
1555
1556 !if with_extra
1557 "meow mix\0"
1558 !end
1559 !end
1560 ----
1561 ====
1562
1563 === Macro expansion
1564
1565 A _macro expansion_ expands the items of a defined
1566 <<macro-definition-block,macro>>.
1567
1568 The macro to expand must be defined _before_ the expansion.
1569
1570 The <<state,state>> before handling the first item of the chosen macro
1571 is:
1572
1573 <<cur-offset,Current offset>>::
1574 Unchanged.
1575
1576 <<cur-bo,Current byte order>>::
1577 Unchanged.
1578
1579 Variables::
1580 The only available variables initially are the macro parameters.
1581
1582 Labels::
1583 None.
1584
1585 The state after having handled the last item of the chosen macro is:
1586
1587 Current offset::
1588 The one before handling the first item of the macro plus the size
1589 of the generated data of the macro expansion.
1590 +
1591 IMPORTANT: This means <<current-offset-setting,current offset setting>>
1592 items within the expanded macro don't impact the final current offset.
1593
1594 Current byte order::
1595 The one before handling the first item of the macro.
1596
1597 Variables::
1598 The ones before handling the first item of the macro.
1599
1600 Labels::
1601 The ones before handling the first item of the macro.
1602
1603 A macro expansion is:
1604
1605 . The `m:` prefix.
1606
1607 . A valid {py3} name (the name of the macro to expand).
1608
1609 . The `(` parameter value list prefix.
1610
. A comma-separated list of zero or more parameter values.
1612 +
1613 The number of parameter values must match the number of parameter
1614 names of the definition of the chosen macro.
1615 +
1616 A parameter value is one of:
1617 +
1618 --
1619 * A <<const-int,constant integer>>, possibly negative.
1620
1621 * The ``pass:[{]`` prefix, a valid {py3} expression of which the
1622 evaluation result type is `int` or `bool` (automatically converted to
1623 `int`), and the ``pass:[}]`` suffix.
1624 +
1625 For a macro expansion at some source location{nbsp}__**L**__, this
1626 expression may contain:
1627
1628 ** The name of any <<label,label>> defined before{nbsp}__**L**__
1629 which isn't within a nested group.
1630 ** The name of any <<variable-assignment,variable>> known
1631 at{nbsp}__**L**__.
1632
1633 +
1634 The value of the special name `ICITTE` (`int` type) in this expression
1635 is the <<cur-offset,current offset>> (before handling the items of the
1636 chosen macro).
1637
1638 * A valid {py3} name.
1639 +
1640 For the name `__NAME__`, this is equivalent to the
1641 `pass:[{]__NAME__pass:[}]` form above.
1642 --
1643
1644 . The `)` parameter value list suffix.
1645
1646 ====
1647 Input:
1648
1649 ----
1650 !macro bake()
1651 {le} {ICITTE * 8 : 16}
1652 u16le"predict explode"
1653 !end
1654
1655 "hello [" m:bake() "] world"
1656
1657 m:bake() * 5
1658 ----
1659
1660 Output:
1661
1662 ----
1663 68 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
1664 00 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
1665 00 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
1666 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1667 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
1668 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1669 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
1670 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1671 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
1672 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1673 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
1674 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1675 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
1676 ----
1677 ====
1678
1679 ====
1680 Input:
1681
1682 ----
1683 !macro A(val, is_be)
1684 {le}
1685
1686 !if is_be
1687 {be}
1688 !end
1689
1690 {val : 16}
1691 !end
1692
1693 !macro B(rep, is_be)
1694 {iter = 1}
1695
1696 !repeat rep
1697 m:A({iter * 3}, is_be)
1698 {iter = iter + 1}
1699 !end
1700 !end
1701
1702 m:B(5, 1)
1703 m:B(3, 0)
1704 ----
1705
1706 Output:
1707
1708 ----
1709 00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
1710 ----
1711 ====
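
The entry and exit state rules above can be sketched in {py3}. The
`State` class and the `emit()`, `expand()`, `macro_a()`, and
`macro_b()` functions below are hypothetical names made up for this
illustration; they model the `A` and `B` macros of the last example and
aren't part of the Normand API.

```python
# Illustrative model only: NOT Normand's implementation.
class State:
    """Tiny stand-in for the processor state."""

    def __init__(self, offset=0, byte_order=None, variables=None, labels=None):
        self.offset = offset
        self.byte_order = byte_order
        self.variables = dict(variables or {})
        self.labels = dict(labels or {})


def emit(state, out, data):
    # Append generated bytes and advance the current offset.
    out += data
    state.offset += len(data)


def expand(state, out, macro, **params):
    # Entry: same offset and byte order, parameters as the only
    # variables, and no labels.
    inner = State(state.offset, state.byte_order, params, {})
    start = len(out)
    macro(inner, out)
    # Exit: only the offset changes, by the size of the generated data.
    state.offset += len(out) - start


def macro_a(st, out):
    st.byte_order = "little"                  # {le}
    if st.variables["is_be"]:                 # !if is_be
        st.byte_order = "big"                 # {be}
    emit(st, out, st.variables["val"].to_bytes(2, st.byte_order))  # {val : 16}


def macro_b(st, out):
    st.variables["iter"] = 1                  # {iter = 1}
    for _ in range(st.variables["rep"]):      # !repeat rep
        expand(st, out, macro_a,
               val=st.variables["iter"] * 3, is_be=st.variables["is_be"])
        st.variables["iter"] += 1             # {iter = iter + 1}


state = State()
out = bytearray()
expand(state, out, macro_b, rep=5, is_be=1)   # m:B(5, 1)
expand(state, out, macro_b, rep=3, is_be=0)   # m:B(3, 0)
print(" ".join("{:02x}".format(b) for b in out))
```

After both expansions, the outer byte order and variables are what they
were initially, while the outer offset has advanced by the 16 generated
bytes.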
1712
1713 === Post-item repetition
1714
1715 A _post-item repetition_ represents the bytes of an item repeated a
1716 given number of times.
1717
1718 A post-item repetition is:
1719
1720 . One of those items:
1721
1722 ** A <<byte-constant,byte constant>>.
1723 ** A <<literal-string,literal string>>.
1724 ** A <<fixed-length-number,fixed-length number>>.
1725 ** An <<leb128-integer,LEB128 integer>>.
** A <<macro-expansion,macro expansion>>.
1727 ** A <<group,group>>.
1728
1729 . The ``pass:[*]`` character.
1730
1731 . One of:
1732
1733 ** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
1734 which is the number of times to repeat the previous item.
1735
1736 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1737 evaluation result type is `int` or `bool` (automatically converted to
1738 `int`), and the ``pass:[}]`` suffix.
1739 +
1740 For a post-item repetition at some source location{nbsp}__**L**__, this
1741 expression may contain:
1742 +
1743 --
* The name of any <<label,label>> defined before{nbsp}__**L**__
which isn't within a nested group and which isn't part of the
repeated item.
* The name of any <<variable-assignment,variable>> known
at{nbsp}__**L**__ which isn't part of the repeated item.
1750 --
1751 +
1752 The value of the special name `ICITTE` (`int` type) in this expression
1753 is the <<cur-offset,current offset>> (before handling the items to
1754 repeat).
1755
1756 ** A valid {py3} name.
1757 +
1758 For the name `__NAME__`, this is equivalent to the
1759 `pass:[{]__NAME__pass:[}]` form above.
1760
1761 You may also use a <<repetition-block,repetition block>>. The form
1762 ``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
1763 ``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.
1764
1765 ====
1766 Input:
1767
1768 ----
1769 {end - ICITTE - 1 : 8} * 0x100 <end>
1770 ----
1771
1772 Output:
1773
1774 ----
1775 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1776 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1777 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1778 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1779 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1780 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
1781 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
1782 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
1783 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
1784 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
1785 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
1786 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
1787 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
1788 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1789 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
1790 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1791 ----
1792 ====
1793
1794 ====
1795 Input:
1796
1797 ----
1798 {times = 1}
1799 aa bb cc dd
1800 (
1801 <here>
1802 (ee ff) * {here + 1}
1803 11 22 33 * {times}
1804 {times = times + 1}
1805 ) * 3
1806 "coucou!"
1807 ----
1808
1809 Output:
1810
1811 ----
1812 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
1813 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1814 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1815 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1816 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1817 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1818 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1819 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1820 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1821 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1822 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
1823 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1824 ----
1825 ====
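
In the last example, `pass:[*]{nbsp}{times}` applies to the single
preceding item (`33`), not to `11 22 33` as a whole. The following
{py3} sketch models that reading and the growth of `here`; it assumes
an initial offset of 0 and isn't Normand's implementation.

```python
# Illustrative model only: NOT Normand's implementation.
out = bytearray(b"\xaa\xbb\xcc\xdd")       # aa bb cc dd
times = 1                                  # {times = 1}

for _ in range(3):                         # ( ... ) * 3
    here = len(out)                        # <here>
    out += b"\xee\xff" * (here + 1)        # (ee ff) * {here + 1}
    out += b"\x11\x22"                     # 11 22
    out += b"\x33" * times                 # 33 * {times}
    times += 1                             # {times = times + 1}

out += b"coucou!"
print(len(out))
```

The `here` label is successively 4, 17, and 57, so the `ee ff` pair is
repeated 5, 18, and 58 times.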
1826
1827 == Command-line tool
1828
1829 If you <<install-normand,installed>> the `normand` package, then you
1830 can use the `normand` command-line tool:
1831
1832 ----
1833 $ normand <<< '"ma gang de malades"' | hexdump -C
1834 ----
1835
1836 ----
1837 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
1838 00000010 65 73 |es|
1839 ----
1840
1841 If you copy the `normand.py` module to your own project, then you can
1842 run the module itself:
1843
1844 ----
1845 $ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
1846 ----
1847
1848 ----
1849 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
1850 00000010 65 73 |es|
1851 ----
1852
1853 Without a path argument, the `normand` tool reads from the standard
1854 input.
1855
1856 The `normand` tool prints the generated binary data to the standard
1857 output.
1858
1859 Various options control the initial <<state,state>> of the processor:
1860 use the `--help` option to learn more.
1861
1862 == {py3} API
1863
1864 The whole `normand` package/module public API is:
1865
1866 [source,python]
1867 ----
import enum
import typing


# Byte order.
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


# Text location.
class TextLocation:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


# Parsing error.
class ParseError(RuntimeError):
    # Source text location.
    @property
    def text_loc(self) -> TextLocation:
        ...


# Variables dictionary type (for type hints).
VariablesT = typing.Dict[str, typing.Union[int, float]]


# Labels dictionary type (for type hints).
LabelsT = typing.Dict[str, int]


# Parsing result.
class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> VariablesT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> LabelsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


# Parses the `normand` input using the initial state defined by
# `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
# and returns the corresponding parsing result.
def parse(normand: str,
          init_variables: typing.Optional[VariablesT] = None,
          init_labels: typing.Optional[LabelsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
1943 ----
1944
1945 The `normand` parameter is the actual <<learn-normand,Normand input>>
1946 while the other parameters control the initial <<state,state>>.
1947
1948 The `parse()` function raises a `ParseError` instance should it fail to
1949 parse the `normand` string for any reason.
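
As a usage sketch of the API above, the following helper parses some
Normand input and reports either the generated bytes or the error
location. The `to_hex()` and `render()` names are hypothetical, and
relying on `str()` of a `ParseError` to get a readable message is an
assumption. The `normand` import is deferred so that `to_hex()` remains
usable without the package installed.

```python
def to_hex(data):
    """Formats bytes as space-separated lowercase hexadecimal pairs."""
    return " ".join("{:02x}".format(b) for b in data)


def render(text, variables=None):
    """Parses `text` as Normand input; returns hex bytes or an error line."""
    import normand  # requires the `normand` package or module

    try:
        res = normand.parse(text, init_variables=variables)
    except normand.ParseError as exc:
        loc = exc.text_loc
        # Assumption: str(exc) gives a human-readable message.
        return "{}:{} - {}".format(loc.line_no, loc.col_no, exc)

    return to_hex(res.data)
```

With Normand available, `render('"lol" * 3 0a')` should return the
hexadecimal form of three `lol` strings followed by a newline byte.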
1950
1951 == Development
1952
1953 Normand is a https://python-poetry.org/[Poetry] project.
1954
1955 To develop it, install it through Poetry and enter the virtual
1956 environment:
1957
1958 ----
1959 $ poetry install
1960 $ poetry shell
1961 $ normand <<< '"lol" * 10 0a'
1962 ----
1963
1964 `normand.py` is processed by:
1965
1966 * https://microsoft.github.io/pyright/[Pyright]
1967 * https://github.com/psf/black[Black]
1968 * https://pycqa.github.io/isort/[isort]
1969
1970 === Testing
1971
1972 Use https://docs.pytest.org/[pytest] to test Normand once the package is
1973 part of your virtual environment, for example:
1974
1975 ----
1976 $ poetry install
1977 $ poetry run pip3 install pytest
1978 $ poetry run pytest
1979 ----
1980
The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.
1984
1985 In the `tests` directory, each `*.nt` file is a test. The file name
1986 prefix indicates what it's meant to test:
1987
1988 `pass-`::
1989 Everything above the `---` line is the valid Normand input
1990 to test.
1991 +
1992 Everything below the `---` line is the expected data
1993 (whitespace-separated hexadecimal bytes).
1994
1995 `fail-`::
1996 Everything above the `---` line is the invalid Normand input
1997 to test.
1998 +
1999 Everything below the `---` line is the expected error message having
2000 this form:
2001 +
2002 ----
2003 LINE:COL - MESSAGE
2004 ----
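
To illustrate the `*.nt` layout described above, here's a minimal
{py3} sketch which splits such a file and decodes its expected part.
The function names are hypothetical and this isn't the actual test
harness; it assumes the separator is a line containing only `---` and
that the error form is exactly `LINE:COL - MESSAGE`.

```python
import re


def split_nt(text):
    """Returns the (input, expected) parts around the first `---` line."""
    lines = text.split("\n")

    for idx, line in enumerate(lines):
        if line.rstrip() == "---":
            return "\n".join(lines[:idx]), "\n".join(lines[idx + 1:])

    raise ValueError("missing `---` separator line")


def expected_bytes(expected):
    """Decodes whitespace-separated hexadecimal bytes (`pass-` tests)."""
    return bytes(int(tok, 16) for tok in expected.split())


_ERROR_RE = re.compile(r"(\d+):(\d+) - (.*)", re.DOTALL)


def expected_error(expected):
    """Returns (line, column, message) from a `fail-` expectation."""
    m = _ERROR_RE.match(expected.strip())

    if m is None:
        raise ValueError("expecting `LINE:COL - MESSAGE`")

    return int(m.group(1)), int(m.group(2)), m.group(3)
```

A test runner could then compare `expected_bytes()` with the generated
data for `pass-` files, and `expected_error()` with the location and
message of the raised `ParseError` for `fail-` files.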
2005
2006 === Contributing
2007
2008 Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
2009 for code review.
2010
2011 To report a bug, https://github.com/efficios/normand/issues/new[create a
2012 GitHub issue].