1 // Show ToC at a specific location for a GitHub rendering
2 ifdef::env-github[]
3 :toc: macro
4 endif::env-github[]
5
6 ifndef::env-github[]
7 :toc: left
8 endif::env-github[]
9
10 // This is to mimic what GitHub does so that anchors work in an offline
11 // rendering too.
12 :idprefix:
13 :idseparator: -
14
15 // Other attributes
16 :py3: Python{nbsp}3
17
18 = Normand
19 Philippe Proulx
20
21 image::normand-logo.png[]
22
23 [.normal]
24 image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]
25
26 [.lead]
27 _**Normand**_ is a text-to-binary processor with its own language.
28
29 This package offers both a portable {py3} module and a command-line
30 tool.
31
32 WARNING: This version of Normand is 0.22, meaning both the Normand
33 language and the module/CLI interface aren't stable.
34
35 ifdef::env-github[]
36 // ToC location for a GitHub rendering
37 toc::[]
38 endif::env-github[]
39
40 == Introduction
41
42 The purpose of Normand is to consume human-readable text representing
43 bytes and to produce the corresponding binary data.
44
45 .Simple bytes input.
46 ====
47 Consider the following Normand input:
48
49 ----
50 4f 55 32 bb $167 fe %10100111 a9 $-32
51 ----
52
53 The generated nine bytes are:
54
55 ----
56 4f 55 32 bb a7 fe a7 a9 e0
57 ----
58 ====
59
60 As you can see in the last example, the fundamental unit of the Normand
61 language is the _byte_. The order in which you list bytes will be the
62 order of the generated data.
63
64 The Normand language is more than simple lists of bytes, though. Its
65 main features are:
66
67 Comments, including a bunch of insignificant symbols which may improve readability::
68 +
69 Input:
70 +
71 ----
72 ff bb %1101:0010 # This is a comment
73 78 29 af $192 # This too # 99 $-80
74 fe80::6257:18ff:fea3:4229
75 60:57:18:a3:42:29
76 10839636-5d65-4a68-8e6a-21608ddf7258
77 ----
78 +
79 Output:
80 +
81 ----
82 ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
83 a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
84 68 8e 6a 21 60 8d df 72 58
85 ----
86
87 Hexadecimal, decimal, and binary byte constants::
88 +
89 Input:
90 +
91 ----
92 aa bb $247 $-89 %0011_0010 %11.01= 10/10
93 ----
94 +
95 Output:
96 +
97 ----
98 aa bb f7 a7 32 da
99 ----
100
101 Strings::
102 +
103 Input:
104 +
105 ----
106 "hello world!" 00
107 u16le"stress\nverdict 🤣"
108 s:latin3{hex(ICITTE)}
109 ----
110 +
111 Output:
112 +
113 ----
114 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
115 00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
116 00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd 30 ┆ •d•i•c•t• •>•#•0
117 78 32 66 ┆ x2f
118 ----
119
120 Labels: special variables holding the offset where they're defined::
121 +
122 ----
123 <beg> b2 52 e3 bc 91 05
124 $100 $50 <chair> 33 9f fe
125 25 e9 89 8a <end>
126 ----
127
128 Variables::
129 +
130 ----
131 5e 65 {tower = 47} c6 7f f2 c4
132 44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
133 ----
134 +
135 The value of a variable assignment is the evaluation of a valid {py3}
136 expression which may include label and variable names.
137
138 Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
139 +
140 Input:
141 +
142 ----
143 {strength = 4}
144 !be 67 <lbl> 44 $178 [(end - lbl) * 8 + strength : 16] $99 <end>
145 !le [-1993 : 32]
146 [-3.141593 : 64]
147 ----
148 +
149 Output:
150 +
151 ----
152 67 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
153 09 c0
154 ----
155 +
156 The encoded number is the evaluation of a valid {py3} expression which
157 may include label and variable names.
158
159 https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
160 +
161 Input:
162 +
163 ----
164 aa bb cc [-1993 : sleb128] <meow> dd ee ff
165 [meow * 199 : uleb128]
166 ----
167 +
168 Output:
169 +
170 ----
171 aa bb cc b7 70 dd ee ff e3 07
172 ----
173 +
174 The encoded integer is the evaluation of a valid {py3} expression which
175 may include label and variable names.
176
177 Conditional::
178 +
179 Input:
180 +
181 ----
182 aa bb cc
183
184 (
185 "foo"
186
187 !if {ICITTE > 10}
188 "bar"
189 !else
190 "fight"
191 !end
192 ) * 4
193 ----
194 +
195 Output:
196 +
197 ----
198 aa bb cc 66 6f 6f 66 69 67 68 74 66 6f 6f 66 69 ┆ •••foofightfoofi
199 67 68 74 66 6f 6f 62 61 72 66 6f 6f 62 61 72 ┆ ghtfoobarfoobar
200 ----
201
202 Repetition::
203 +
204 Input:
205 +
206 ----
207 aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
208
209 !repeat 3
210 ff ee "juice"
211 !end
212 ----
213 +
214 Output:
215 +
216 ----
217 aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
218 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
219 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
220 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
221 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
222 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
223 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
224 ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
225 6a 75 69 63 65 ┆ juice
226 ----
227
228 Alignment::
229 +
230 Input:
231 +
232 ----
233 !be
234
235 [199:32]
236 @64 [43:64]
237 @16 [-123:16]
238 @32~255 [5584:32]
239 ----
240 +
241 Output:
242 +
243 ----
244 00 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
245 ff 85 ff ff 00 00 15 d0
246 ----
247
248 Filling::
249 +
250 Input:
251 +
252 ----
253 !le
254 [0xdeadbeef:32]
255 [-1993:16]
256 [9:16]
257 +0x40
258 [ICITTE:8]
259 "meow mix"
260 +200~FFh
261 [ICITTE:8]
262 ----
263 +
264 Output:
265 +
266 ----
267 ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7•••••••••••
268 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
269 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
270 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
271 40 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix•••••••
272 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
273 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
274 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
275 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
276 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
277 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
278 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
279 ff ff ff ff ff ff ff ff c8 ┆ •••••••••
280 ----
281
282 Transformation::
283 +
284 Input:
285 +
286 ----
287 "end of file @ " [end:8]
288
289 !transform gzip
290 "this part will be gzipped"
291 !end
292
293 <end>
294 ----
295 +
296 Output:
297 +
298 ----
299 65 6e 64 20 6f 66 20 66 69 6c 65 20 40 20 3c 1f ┆ end of file @ <•
300 8b 08 00 7b 7b 26 65 02 ff 2b c9 c8 2c 56 28 48 ┆ •••{{&e••+••,V(H
301 2c 2a 51 28 cf cc c9 51 48 4a 55 48 af ca 2c 28 ┆ ,*Q(•••QHJUH••,(
302 48 4d 01 00 d4 cc 5b 8a 19 00 00 00 ┆ HM••••[•••••
303 ----
304
305 Multilevel grouping::
306 +
307 Input:
308 +
309 ----
310 ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
311 ----
312 +
313 Output:
314 +
315 ----
316 ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
317 bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
318 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
319 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
320 aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
321 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
322 6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
323 ----
324
325 Macros::
326 +
327 Input:
328 +
329 ----
330 !macro hello(world)
331 "hello"
332 !if world " world" !end
333 !end
334
335 !repeat 17
336 ff ff ff ff
337 m:hello({ICITTE > 15 and ICITTE < 60})
338 !end
339 ----
340 +
341 Output:
342 +
343 ----
344 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
345 6c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
346 64 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
347 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
348 ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
349 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
350 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
351 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
352 ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
353 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
354 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
355 ----
356
357 Precise error reporting::
358 +
359 ----
360 /tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
361 ----
362 +
363 ----
364 /tmp/meow.normand:32:6 - Unexpected character `k`.
365 ----
366 +
367 ----
368 /tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
369 ----
370 +
371 ----
372 /tmp/meow.normand:32:19 - While expanding the macro `meow`:
373 /tmp/meow.normand:35:5 - While expanding the macro `zzz`:
374 /tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
375 ----
376
You can use Normand to track data source files in your favorite VCS
instead of raw binary files. The binary files that Normand generates are
useful, for example, to test file format decoders (including with
malformed data) and for education.
381
382 See <<learn-normand>> to explore all the Normand features.
383
384 == Install Normand
385
386 Normand requires Python ≥ 3.4.
387
388 To install Normand:
389
390 ----
391 $ python3 -m pip install --user normand
392 ----
393
394 See
395 https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
396 to learn more about a user site installation.
397
398 [NOTE]
399 ====
400 Normand has a single module file, `normand.py`, which you can copy as is
401 to your project to use it (both the <<python3-api,`normand.parse()`>>
402 function and the <<command-line-tool,command-line tool>>).
403
404 `normand.py` has _no external dependencies_, but if you're using
405 Python{nbsp}3.4, you'll need a local copy of the standard `typing`
406 module.
407 ====
408
409 == Design goals
410
411 The design goals of Normand are:
412
413 Portability::
414 We're making sure `normand.py` works with Python{nbsp}≥{nbsp}3.4 and
415 doesn't have any external dependencies so that you may just copy the
416 module as is to your own project.
417
418 Ease of use::
419 The most basic Normand input is a sequence of hexadecimal constants
420 (for example, `4e6f726d616e64`) which produce exactly what you'd
421 expect.
422 +
423 Most Normand features map to programming language concepts you already
424 know and understand: constant integers, literal strings, variables,
425 conditionals, repetitions/loops, and the rest.
426
427 Concise and readable input::
428 We could have chosen XML or YAML as the input format, but having a
429 DSL here makes a Normand input compact and easy to read, two
430 important traits when using Normand to write tests, for example.
431 +
432 Compare the following Normand input and some hypothetical XML
433 equivalent, for example:
434 +
.Actual Normand input.
436 ----
437 ff dd 01 ab $192 $-128 %1101:0011
438
439 [end:8]
440
441 {iter = 1}
442
443 !if {not something}
444 # five times because xyz
445 !repeat 5
446 "hello world " [iter:8]
447 {iter = iter + 1}
448 !end
449 !end
450
451 <end>
452 ----
453 +
454 .Hypothetical Normand XML input.
455 [source,xml]
456 ----
457 <?xml version="1.0" encoding="utf-8" ?>
458 <group>
459 <byte base="x" val="ff" />
460 <byte base="x" val="dd" />
461 <byte base="x" val="1" />
462 <byte base="x" val="ab" />
463 <byte base="d" val="192" />
464 <byte base="d" val="-128" />
465 <byte base="b" val="11010011" />
466 <fixed-len-num expr="end" len="8" />
467 <var-assign name="iter" expr="1" />
468 <cond expr="not something">
469 <!-- five times because xyz -->
470 <repeat expr="5">
471 <str>hello world </str>
472 <fixed-len-num expr="iter" len="8" />
473 <var-assign name="iter" expr="iter + 1" />
474 </repeat>
475 </cond>
476 <label name="end" />
477 </group>
478 ----
479
480 == Learn Normand
481
482 A Normand text input is a sequence of items which represent a sequence
483 of raw bytes.
484
485 [[state]] During the processing of items to data, Normand relies on a
486 current state:
487
488 [%header%autowidth]
489 |===
490 |State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>
491
492 |[[cur-offset]] Current offset
493 |
The current offset has an effect on the value of <<label,labels>> and of
the special `ICITTE` name in <<fixed-length-number,fixed-length
number>>, <<leb128-integer,LEB128 integer>>, <<string,string>>,
<<filling,filling>>, <<variable-assignment,variable assignment>>,
<<conditional-block,conditional block>>, <<repetition-block,repetition
block>>, <<macro-expansion,macro expansion>>, and
<<post-item-repetition,post-item repetition>> expression evaluation.
501
502 Each generated byte increments the current offset.
503
504 A <<current-offset-setting,current offset setting>> may change the
505 current offset without generating data.
506
A <<current-offset-alignment,current offset alignment>> generates
padding bytes to make the current offset satisfy a given alignment.
509 |`init_offset` parameter of the `parse()` function.
510 |`--offset` option.
511
512 |[[cur-bo]] Current byte order
513 |
514 The current byte order has an effect on the encoding of
515 <<fixed-length-number,fixed-length numbers>>.
516
517 A <<current-byte-order-setting,current byte order setting>> may change
518 the current byte order.
519 |`init_byte_order` parameter of the `parse()` function.
520 |`--byte-order` option.
521
522 |<<label,Labels>>
523 |Mapping of label names to integral values.
524 |`init_labels` parameter of the `parse()` function.
525 |One or more `--label` options.
526
527 |<<variable-assignment,Variables>>
528 |Mapping of variable names to integral or floating point number values.
529 |`init_variables` parameter of the `parse()` function.
530 |One or more `--var` or `--var-str` options.
531 |===
532
533 The available items are:
534
* A <<byte-constant,byte constant>> representing one or more
constant bytes.
537
538 * A <<literal-string,literal string>> representing a constant sequence
539 of bytes encoding UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 data.
540
541 * A <<current-byte-order-setting,current byte order setting>> (big or
542 little endian).
543
544 * A <<fixed-length-number,fixed-length number>> (integer or
545 floating point) using the <<cur-bo,current byte order>> and of which
546 the value is the result of a {py3} expression.
547
548 * An <<leb128-integer,LEB128 integer>> of which the value is the result
549 of a {py3} expression.
550
551 * A <<string,string>> representing a sequence of bytes encoding UTF-8,
552 UTF-16, UTF-32, or Latin-1 to Latin-10 data, and of which the value is
553 the result of a {py3} expression.
554
555 * A <<current-offset-setting,current offset setting>>.
556
557 * A <<current-offset-alignment,current offset alignment>>.
558
559 * A <<filling,filling>>.
560
561 * A <<label,label>>, that is, a named constant holding the current
562 offset.
563 +
564 This is similar to an assembly label.
565
566 * A <<variable-assignment,variable assignment>> associating a name to
567 the integral result of an evaluated {py3} expression.
568
569 * A <<group,group>>, that is, a scoped sequence of items.
570
571 * A <<conditional-block,conditional block>>.
572
573 * A <<repetition-block,repetition block>>.
574
575 * A <<transformation-block,transformation block>>.
576
577 * A <<macro-definition-block,macro definition block>>.
578
579 * A <<macro-expansion,macro expansion>>.
580
581 Moreover, you can repeat many items above a constant or variable number
582 of times with the ``pass:[*]`` operator _after_ the item to repeat. This
583 is called a <<post-item-repetition,post-item repetition>>.
584
585 A Normand comment may exist pretty much anywhere between tokens.
586
587 A comment is anything between two ``pass:[#]`` characters on the same
588 line, or from ``pass:[#]`` until the end of the line. Whitespaces are
589 also considered comments. The following symbols are also considered
590 comments around and between items, as well as between hexadecimal
591 nibbles and binary bits of <<byte-constant,byte constants>>:
592
593 ----
594 & , - . / : ; = ? \ _ |
595 ----
596
597 The latter serve to improve readability so that you may write, for
598 example, a MAC address or a UUID as is.
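
To illustrate why those symbols help, here's a minimal {py3} sketch which drops them, along with whitespace, before decoding pairs of hexadecimal digits. This isn't Normand's actual parser: the `hex_bytes()` name is hypothetical, and the sketch ignores `#` comments and every item other than hexadecimal byte constants.

```python
# Illustrative sketch only: this is not how Normand is implemented.
# Symbols that Normand treats as comments (from the list above).
INSIGNIFICANT = set("&,-./:;=?\\_|")


def hex_bytes(text):
    """Decode hexadecimal byte constants, ignoring symbols and whitespace."""
    digits = "".join(
        ch for ch in text if ch not in INSIGNIFICANT and not ch.isspace()
    )
    # bytes.fromhex() raises ValueError on an odd digit count or a bad digit.
    return bytes.fromhex(digits)


# A MAC address and a UUID written as is.
print(hex_bytes("60:57:18:a3:42:29").hex())
print(hex_bytes("10839636-5d65-4a68-8e6a-21608ddf7258").hex())
```

Both calls yield the digits of the input with the separators removed, which is the effect the MAC address and UUID examples of this document rely on.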
599
[[const-int]] Many items require a _constant integer_, which may start
with `-` when negative. A positive constant integer is any of:
603
604 Decimal::
One or more digits (`0` to `9`).
606
607 Hexadecimal::
608 One of:
609 +
* The `0x` or `0X` prefix followed by one or more hexadecimal digits
(`0` to `9`, `a` to `f`, or `A` to `F`).
* One or more hexadecimal digits followed by the `h` or `H` suffix.
613
614 Octal::
615 One of:
616 +
* The `0o` or `0O` prefix followed by one or more octal digits
(`0` to `7`).
* One or more octal digits followed by the `o`, `O`, `q`, or `Q`
suffix.
621
622 Binary::
623 One of:
624 +
* The `0b` or `0B` prefix followed by one or more bits (`0` or `1`).
* One or more bits followed by the `b` or `B` suffix.
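
The constant integer forms above can be sketched in {py3} as follows. This is only an illustration under assumptions: the `parse_const_int()` name is hypothetical, and the order in which the forms are tried (prefixed forms first, then suffixed forms, then decimal) is a guess, not something this document specifies.

```python
import re

# Illustrative sketch only; the matching order is an assumption.
_FORMS = [
    (re.compile(r"0[xX]([0-9a-fA-F]+)"), 16),  # `0x` prefix
    (re.compile(r"0[oO]([0-7]+)"), 8),         # `0o` prefix
    (re.compile(r"0[bB]([01]+)"), 2),          # `0b` prefix
    (re.compile(r"([0-9a-fA-F]+)[hH]"), 16),   # `h` suffix
    (re.compile(r"([0-7]+)[oOqQ]"), 8),        # `o` or `q` suffix
    (re.compile(r"([01]+)[bB]"), 2),           # `b` suffix
    (re.compile(r"([0-9]+)"), 10),             # decimal
]


def parse_const_int(text):
    """Return the value of a constant integer such as `0x40`, `FFh`, or `-12`."""
    negative = text.startswith("-")
    body = text[1:] if negative else text

    for pattern, base in _FORMS:
        match = pattern.fullmatch(body)

        if match:
            value = int(match.group(1), base)
            return -value if negative else value

    raise ValueError("not a constant integer: {!r}".format(text))


print(parse_const_int("0x40"), parse_const_int("FFh"), parse_const_int("101b"))
```

Each pattern must match the whole text, so that `0b1h`, for example, falls through the binary prefix form and is read as a hexadecimal number with the `h` suffix.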
627
628 In general, anything between `pass:[{]` and `}` is a {py3} expression.
629
630 You can test the examples of this section with the `normand`
631 <<command-line-tool,command-line tool>> as such:
632
633 ----
634 $ normand file | hexdump -C
635 ----
636
637 where `file` is the name of a file containing the Normand input.
638
639 === Byte constant
640
641 A _byte constant_ represents one or more constant bytes.
642
643 A byte constant is:
644
645 Hexadecimal form::
646 Two consecutive hexadecimal digits representing a single byte.
647
648 Decimal form::
649 One or more digits after the `$` prefix representing a single byte.
650
651 Binary form:: {empty}
652 +
653 --
654 . __**N**__ `%` prefixes (at least one).
655 +
656 The number of `%` characters is the number of subsequent expected bytes.
657
658 . __**N**__{nbsp}×{nbsp}8 bits (`0` or `1`).
659 --
660
661 ====
662 Input:
663
664 ----
665 ab cd (3d 8F) CC
666 ----
667
668 Output:
669
670 ----
671 ab cd 3d 8f cc
672 ----
673 ====
674
675 ====
676 Input:
677
678 ----
679 $192 %1100/0011 $ -77
680 ----
681
682 Output:
683
684 ----
685 c0 c3 b3
686 ----
687 ====
688
689 ====
690 Input:
691
692 ----
693 58f64689-6316-4d55-8a1a-04cada366172
694 fe80::6257:18ff:fea3:4229
695 ----
696
697 Output:
698
699 ----
700 58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
701 fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
702 ----
703 ====
704
705 ====
706 Input:
707
708 ----
709 %01110011 %01100001 %01101100 %01110101 %01110100
710 %%%1101:0010 11111111 #A#11 #B#00 #C#011 #D#1
711 ----
712
713 Output:
714
715 ----
716 73 61 6c 75 74 d2 ff c7 ┆ salut•••
717 ----
718 ====
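
As a rough illustration of the decimal and binary forms, the following {py3} sketch encodes a single constant. It isn't Normand's parser: the `decimal_byte()` and `binary_bytes()` names are hypothetical, and the accepted decimal range (-128 to 255, with two's complement for negative values) is an assumption inferred from the examples above.

```python
import re

# Illustrative sketch only: not Normand's parser.
INSIGNIFICANT = "&,-./:;=?\\_|"


def decimal_byte(text):
    """Encode a decimal-form byte constant such as `$192` or `$-77`."""
    value = int(text.lstrip("$").replace(" ", ""))

    # Assumed range: negative values use two's complement on one byte.
    if not -128 <= value <= 255:
        raise ValueError("{} doesn't fit in one byte".format(value))

    return bytes([value % 256])


def binary_bytes(text):
    """Encode a binary-form byte constant: N `%` prefixes, then N x 8 bits."""
    text = re.sub(r"#[^#\n]*#", "", text)  # drop `#...#` comments
    count = len(text) - len(text.lstrip("%"))
    bits = "".join(
        ch for ch in text[count:]
        if ch not in INSIGNIFICANT and not ch.isspace()
    )

    if count == 0 or len(bits) != count * 8 or set(bits) - set("01"):
        raise ValueError("expecting {} bits".format(count * 8))

    return int(bits, 2).to_bytes(count, "big")


print(decimal_byte("$ -77").hex())
print(binary_bytes("%%%1101:0010 11111111 #A#11 #B#00 #C#011 #D#1").hex())
```

The second call shows how three `%` prefixes announce 24 bits, which comments and insignificant symbols may split freely.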
719
720 === Literal string
721
722 A _literal string_ represents the encoded bytes of a literal string
723 using the UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 encoding.
724
725 The string to encode isn't implicitly null-terminated: use `\0` at the
726 end of the string to add a null character.
727
728 A literal string is:
729
730 . **Optional**: one of the following encodings instead of the default
731 UTF-8:
732 +
733 --
734 [horizontal]
735 `s:u8`::
736 `u8`::
737 UTF-8.
738
739 `s:u16be`::
740 `u16be`::
741 UTF-16BE.
742
743 `s:u16le`::
744 `u16le`::
745 UTF-16LE.
746
747 `s:u32be`::
748 `u32be`::
749 UTF-32BE.
750
751 `s:u32le`::
752 `u32le`::
753 UTF-32LE.
754
755 `s:latin1`::
756 ISO/IEC 8859-1.
757
758 `s:latin2`::
759 ISO/IEC 8859-2.
760
761 `s:latin3`::
762 ISO/IEC 8859-3.
763
764 `s:latin4`::
765 ISO/IEC 8859-4.
766
767 `s:latin5`::
768 ISO/IEC 8859-9.
769
770 `s:latin6`::
771 ISO/IEC 8859-10.
772
773 `s:latin7`::
774 ISO/IEC 8859-13.
775
776 `s:latin8`::
777 ISO/IEC 8859-14.
778
779 `s:latin9`::
780 ISO/IEC 8859-15.
781
782 `s:latin10`::
783 ISO/IEC 8859-16.
784 --
785
786 . The ``pass:["]`` prefix.
787
788 . A sequence of zero or more characters, possibly containing escape
789 sequences.
790 +
791 An escape sequence is the ``\`` character followed by one of:
792 +
793 --
794 [horizontal]
795 `0`:: Null (U+0000)
796 `a`:: Alert (U+0007)
797 `b`:: Backspace (U+0008)
798 `e`:: Escape (U+001B)
799 `f`:: Form feed (U+000C)
800 `n`:: End of line (U+000A)
801 `r`:: Carriage return (U+000D)
802 `t`:: Character tabulation (U+0009)
803 `v`:: Line tabulation (U+000B)
804 ``\``:: Reverse solidus (U+005C)
805 ``pass:["]``:: Quotation mark (U+0022)
806 --
807
808 . The ``pass:["]`` suffix.
809
810 ====
811 Input:
812
813 ----
814 "coucou tout le monde!"
815 ----
816
817 Output:
818
819 ----
820 63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
821 6f 6e 64 65 21 ┆ onde!
822 ----
823 ====
824
825 ====
826 Input:
827
828 ----
829 u16le"I am not young enough to know everything."
830 ----
831
832 Output:
833
834 ----
835 49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
836 20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
837 6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
838 20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
839 65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
840 2e 00 ┆ .•
841 ----
842 ====
843
844 ====
845 Input:
846
847 ----
848 s:u32be "\"illusion is the first\nof all pleasures\" 🦉"
849 ----
850
851 Output:
852
853 ----
854 00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
855 00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
856 00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
857 00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
858 00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
859 00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
860 00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
861 00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
862 00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
863 00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
864 00 00 00 20 00 01 f9 89 ┆ ••• ••••
865 ----
866 ====
867
868 ====
869 Input:
870
871 ----
872 s:latin1 "Paul Piché"
873 ----
874
875 Output:
876
877 ----
878 50 61 75 6c 20 50 69 63 68 e9 ┆ Paul Pich•
879 ----
880 ====
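
To reproduce the bytes of such examples outside Normand, a minimal {py3} sketch can map the encoding names above to the codecs of the standard library. The `CODECS` table and the `encode_literal()` name are hypothetical; how Normand itself selects its codecs isn't shown in this document.

```python
# Illustrative sketch only: Normand encoding names (without the `s:`
# prefix) mapped to Python standard codec names.
CODECS = {
    "u8": "utf_8",
    "u16be": "utf_16_be",
    "u16le": "utf_16_le",
    "u32be": "utf_32_be",
    "u32le": "utf_32_le",
    "latin1": "iso8859_1",
    "latin2": "iso8859_2",
    "latin3": "iso8859_3",
    "latin4": "iso8859_4",
    "latin5": "iso8859_9",
    "latin6": "iso8859_10",
    "latin7": "iso8859_13",
    "latin8": "iso8859_14",
    "latin9": "iso8859_15",
    "latin10": "iso8859_16",
}


def encode_literal(text, encoding="u8"):
    """Encode `text` as is: no null terminator is added."""
    return text.encode(CODECS[encoding])


print(encode_literal("Paul Piché", "latin1").hex())
print(encode_literal("hi\0", "u16le").hex())
```

Note that the explicit `_be` and `_le` codecs don't emit a byte order mark, which matches the outputs shown above.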
881
882 === Current byte order setting
883
884 This special item sets the <<cur-bo,_current byte order_>>.
885
886 The two accepted forms are:
887
888 [horizontal]
889 `!be`:: Set the current byte order to big endian.
890 `!le`:: Set the current byte order to little endian.
891
892 === Fixed-length number
893
894 A _fixed-length number_ represents a fixed number of bytes encoding
895 either:
896
897 * An unsigned or signed integer (two's complement).
898 +
899 The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.
900
901 * A floating point number
902 (https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
903 +
The available lengths are 32 (_binary32_) and 64 (_binary64_).
905
The value is the result of evaluating a {py3} expression. Normand
encodes it using the <<cur-bo,current byte order>>.
908
909 A fixed-length number is:
910
911 . The `[` prefix.
912
913 . A valid {py3} expression.
914 +
915 For a fixed-length number at some source location{nbsp}__**L**__, this
916 expression may contain the name of any accessible <<label,label>> (not
917 within a nested group), including the name of a label defined
918 after{nbsp}__**L**__ (except within a
919 <<transformation-block,transformation block>>), as well as the name of
920 any <<variable-assignment,variable>> known at{nbsp}__**L**__.
921 +
922 The value of the special name `ICITTE` (`int` type) in this expression
923 is the <<cur-offset,current offset>> (before encoding the number).
924
925 . The `:` character.
926
927 . An encoding length in bits amongst:
928 +
929 --
930 The expression evaluates to an `int` or `bool` value::
931 `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
932 +
933 NOTE: Normand automatically converts a `bool` value to `int`.
934
935 The expression evaluates to a `float` value::
936 `32` and `64`.
937 --
938
939 . The `]` suffix.
940
941 ====
942 Input:
943
944 ----
945 !le [345:16]
946 !be [-0xabcd:32]
947 ----
948
949 Output:
950
951 ----
952 59 01 ff ff 54 33
953 ----
954 ====
955
956 ====
957 Input:
958
959 ----
960 !be
961
962 # String length in bits
963 [8 * (str_end - str_beg) : 16]
964
965 # String
966 <str_beg>
967 "hello world!"
968 <str_end>
969 ----
970
971 Output:
972
973 ----
974 00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
975 ----
976 ====
977
978 ====
979 Input:
980
981 ----
982 [20 - ICITTE : 8] * 10
983 ----
984
985 Output:
986
987 ----
988 14 13 12 11 10 0f 0e 0d 0c 0b
989 ----
990 ====
991
992 ====
993 Input:
994
995 ----
996 !le
997 [2 * 0.0529 : 32]
998 ----
999
1000 Output:
1001
1002 ----
1003 ac ad d8 3d
1004 ----
1005 ====
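
The encoding rules of this section can be approximated with the {py3} standard library alone. The sketch below is an assumption-laden illustration, not Normand's code: `encode_fixed()` is a hypothetical name, and the accepted integer range (from the smallest signed value to the largest unsigned value of the given length) is inferred from the examples and error messages of this document.

```python
import struct

# Illustrative sketch only: not Normand's implementation.


def encode_fixed(value, bits, byte_order):
    """Encode `value` on `bits` bits; `byte_order` is "big" or "little"."""
    if isinstance(value, float):
        if bits not in (32, 64):
            raise ValueError("float lengths are 32 and 64 bits")

        endian = ">" if byte_order == "big" else "<"
        return struct.pack(endian + ("f" if bits == 32 else "d"), value)

    if bits not in range(8, 65, 8):
        raise ValueError("integer lengths are multiples of 8, up to 64 bits")

    value = int(value)  # also converts a `bool` value

    if not -(1 << (bits - 1)) <= value < (1 << bits):
        raise ValueError("{} is outside the {}-bit range".format(value, bits))

    # Two's complement for negative values.
    return (value % (1 << bits)).to_bytes(bits // 8, byte_order)


print(encode_fixed(345, 16, "little").hex())
print(encode_fixed(-0xabcd, 32, "big").hex())
```

The two calls mirror the first example of this section (`!le [345:16]` and `!be [-0xabcd:32]`).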
1006
1007 === LEB128 integer
1008
An _LEB128 integer_ represents a variable number of bytes encoding, in
the https://en.wikipedia.org/wiki/LEB128[LEB128] format, an unsigned or
signed integer which is the result of evaluating a {py3} expression.
1013
1014 An LEB128 integer is:
1015
1016 . The `[` prefix.
1017
1018 . A valid {py3} expression of which the evaluation result type
1019 is `int` or `bool` (automatically converted to `int`).
1020 +
1021 For an LEB128 integer at some source location{nbsp}__**L**__, this
1022 expression may contain:
1023 +
1024 --
1025 * The name of any <<label,label>> defined before{nbsp}__**L**__
1026 which isn't within a nested group.
1027 * The name of any <<variable-assignment,variable>> known
1028 at{nbsp}__**L**__.
1029 --
1030 +
1031 The value of the special name `ICITTE` (`int` type) in this expression
1032 is the <<cur-offset,current offset>> (before encoding the integer).
1033
1034 . The `:` character.
1035
1036 . One of:
1037 +
1038 --
1039 [horizontal]
1040 `uleb128`:: Use the unsigned LEB128 format.
1041 `sleb128`:: Use the signed LEB128 format.
1042 --
1043
1044 . The `]` suffix.
1045
1046 ====
1047 Input:
1048
1049 ----
1050 [624485 : uleb128]
1051 ----
1052
1053 Output:
1054
1055 ----
1056 e5 8e 26
1057 ----
1058 ====
1059
1060 ====
1061 Input:
1062
1063 ----
1064 aa bb cc dd
1065 <meow>
1066 ee ff
1067 [-981238311 + (meow * -23) : sleb128]
1068 "hello"
1069 ----
1070
1071 Output:
1072
1073 ----
1074 aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
1075 ----
1076 ====
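
For readers unfamiliar with the format, here's a compact {py3} sketch of the usual unsigned and signed LEB128 encoding algorithms. The `uleb128()` and `sleb128()` function names are hypothetical and this isn't taken from Normand's source; the sketch only reproduces the outputs shown above.

```python
# Illustrative sketch only: the textbook LEB128 encoders.


def uleb128(value):
    """Encode a non-negative integer using unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 requires a non-negative integer")

    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7

        if value == 0:
            out.append(byte)
            return bytes(out)

        out.append(byte | 0x80)  # more groups follow


def sleb128(value):
    """Encode an integer using signed LEB128."""
    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign
        sign_bit = byte & 0x40

        if (value == 0 and not sign_bit) or (value == -1 and sign_bit):
            out.append(byte)
            return bytes(out)

        out.append(byte | 0x80)  # more groups follow


print(uleb128(624485).hex())
print(sleb128(-981238311 + 4 * -23).hex())
```

The second call uses `4` for `meow`, which is the offset of that label in the last example.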
1077
1078 === String
1079
A _string_ represents a variable number of bytes encoding, with the
UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 encoding, a string which
is the result of evaluating a {py3} expression.
1083
1084 A string has two possible forms:
1085
1086 Encoding prefix form:: {empty}
1087 +
1088 . An encoding amongst:
1089 +
1090 --
1091 [horizontal]
1092 `s:u8`::
1093 `u8`::
1094 UTF-8.
1095
1096 `s:u16be`::
1097 `u16be`::
1098 UTF-16BE.
1099
1100 `s:u16le`::
1101 `u16le`::
1102 UTF-16LE.
1103
1104 `s:u32be`::
1105 `u32be`::
1106 UTF-32BE.
1107
1108 `s:u32le`::
1109 `u32le`::
1110 UTF-32LE.
1111
1112 `s:latin1`::
1113 ISO/IEC 8859-1.
1114
1115 `s:latin2`::
1116 ISO/IEC 8859-2.
1117
1118 `s:latin3`::
1119 ISO/IEC 8859-3.
1120
1121 `s:latin4`::
1122 ISO/IEC 8859-4.
1123
1124 `s:latin5`::
1125 ISO/IEC 8859-9.
1126
1127 `s:latin6`::
1128 ISO/IEC 8859-10.
1129
1130 `s:latin7`::
1131 ISO/IEC 8859-13.
1132
1133 `s:latin8`::
1134 ISO/IEC 8859-14.
1135
1136 `s:latin9`::
1137 ISO/IEC 8859-15.
1138
1139 `s:latin10`::
1140 ISO/IEC 8859-16.
1141 --
1142
1143 . The ``pass:[{]`` prefix.
1144
1145 . A valid {py3} expression of which the evaluation result type
1146 is `bool`, `int`, `float`, or `str` (the first three automatically
1147 converted to `str`).
1148 +
1149 For a string at some source location{nbsp}__**L**__, this expression may
1150 contain:
1151 +
1152 --
1153 * The name of any <<label,label>> defined before{nbsp}__**L**__
1154 which isn't within a nested group.
1155 * The name of any <<variable-assignment,variable>> known
1156 at{nbsp}__**L**__.
1157 --
1158 +
1159 The value of the special name `ICITTE` (`int` type) in this expression
1160 is the <<cur-offset,current offset>> (before encoding the string).
1161
1162 . The `}` suffix.
1163
1164 Encoding suffix form:: {empty}
1165 +
1166 . The `[` prefix.
1167
1168 . A valid {py3} expression of which the evaluation result type
1169 is `bool`, `int`, `float`, or `str` (the first three automatically
1170 converted to `str`).
1171 +
1172 For a string at some source location{nbsp}__**L**__, this expression may
1173 contain:
1174 +
1175 --
1176 * The name of any <<label,label>> defined before{nbsp}__**L**__
1177 which isn't within a nested group.
1178 * The name of any <<variable-assignment,variable>> known
1179 at{nbsp}__**L**__.
1180 --
1181 +
1182 The value of the special name `ICITTE` (`int` type) in this expression
1183 is the <<cur-offset,current offset>> (before encoding the string).
1184
1185 . The `:` character.
1186
1187 . A string encoding amongst:
1188 +
1189 --
1190 [horizontal]
1191 `s:u8`::
1192 UTF-8.
1193
1194 `s:u16be`::
1195 UTF-16BE.
1196
1197 `s:u16le`::
1198 UTF-16LE.
1199
1200 `s:u32be`::
1201 UTF-32BE.
1202
1203 `s:u32le`::
1204 UTF-32LE.
1205
1206 `s:latin1`::
1207 ISO/IEC 8859-1.
1208
1209 `s:latin2`::
1210 ISO/IEC 8859-2.
1211
1212 `s:latin3`::
1213 ISO/IEC 8859-3.
1214
1215 `s:latin4`::
1216 ISO/IEC 8859-4.
1217
1218 `s:latin5`::
1219 ISO/IEC 8859-9.
1220
1221 `s:latin6`::
1222 ISO/IEC 8859-10.
1223
1224 `s:latin7`::
1225 ISO/IEC 8859-13.
1226
1227 `s:latin8`::
1228 ISO/IEC 8859-14.
1229
1230 `s:latin9`::
1231 ISO/IEC 8859-15.
1232
1233 `s:latin10`::
1234 ISO/IEC 8859-16.
1235 --
1236
1237 . The `]` suffix.
1238
1239 ====
1240 Input:
1241
1242 ----
1243 {iter = 1}
1244
1245 !repeat 10
1246 u8{iter} " "
1247 {iter = iter + 1}
1248 !end
1249 ----
1250
1251 Output:
1252
1253 ----
1254 31 20 32 20 33 20 34 20 35 20 36 20 37 20 38 20 ┆ 1 2 3 4 5 6 7 8
1255 39 20 31 30 20 ┆ 9 10
1256 ----
1257 ====
1258
1259 ====
1260 Input:
1261
1262 ----
1263 {meow = 'salut jérémie'}
1264 [meow.upper() : s:latin1]
1265 ----
1266
1267 Output:
1268
1269 ----
1270 53 41 4c 55 54 20 4a c9 52 c9 4d 49 45 ┆ SALUT J•R•MIE
1271 ----
1272 ====
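
The automatic conversion to `str` can be pictured with the short {py3} sketch below. It assumes that the conversion behaves like the built-in `str()` function, which this document doesn't state explicitly, and the `encode_string()` name is hypothetical.

```python
# Illustrative sketch only; assumes the conversion is equivalent to str().


def encode_string(value, codec="utf_8"):
    """Convert a `bool`, `int`, or `float` to `str`, then encode the text."""
    if not isinstance(value, (bool, int, float, str)):
        raise TypeError("unsupported type: {}".format(type(value).__name__))

    return str(value).encode(codec)


# Mirrors the first example above: `u8{iter} " "` repeated 10 times.
data = b"".join(encode_string(i) + b" " for i in range(1, 11))
print(data)

# Mirrors the second example above, using the ISO/IEC 8859-1 codec.
print(encode_string("salut jérémie".upper(), "iso8859_1").hex())
```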
1273
1274 === Current offset setting
1275
1276 This special item sets the <<cur-offset,_current offset_>>.
1277
1278 A current offset setting is:
1279
1280 . The `<` prefix.
1281
1282 . A <<const-int,positive constant integer>> which is the new current
1283 offset.
1284
1285 . The `>` suffix.
1286
1287 ====
1288 Input:
1289
1290 ----
1291 [ICITTE : 8] * 8
1292 <0x61> [ICITTE : 8] * 8
1293 ----
1294
1295 Output:
1296
1297 ----
1298 00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
1299 ----
1300 ====
1301
1302 ====
1303 Input:
1304
1305 ----
1306 aa bb cc dd <meow> ee ff
1307 <12> 11 22 33 <mix> 44 55
1308 [meow : 8] [mix : 8]
1309 ----
1310
1311 Output:
1312
1313 ----
1314 aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
1315 ----
1316 ====
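
The key point of this item is that it changes the current offset without generating any data. A minimal {py3} model of that behaviour follows; the `Output` class and its methods are hypothetical names for this illustration, not part of the Normand API.

```python
# Illustrative sketch only: a toy model of the current offset.


class Output:
    def __init__(self, offset=0):
        self.offset = offset
        self.data = bytearray()

    def emit(self, chunk):
        """Append bytes; each generated byte increments the offset."""
        self.data += chunk
        self.offset += len(chunk)

    def set_offset(self, offset):
        """Change the current offset without generating data."""
        self.offset = offset


# Mirrors `[ICITTE : 8] * 8`, then `<0x61> [ICITTE : 8] * 8`.
out = Output()

for _ in range(8):
    out.emit(bytes([out.offset]))

out.set_offset(0x61)

for _ in range(8):
    out.emit(bytes([out.offset]))

print(bytes(out.data))
```

The generated data is contiguous even though the offsets aren't: the second run of bytes directly follows the first one.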
1317
1318 === Current offset alignment
1319
1320 A _current offset alignment_ represents zero or more padding bytes to
1321 make the <<cur-offset,current offset>> meet a given
1322 https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.
1323
1324 More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
1325 a current offset alignment represents the required padding bytes until
1326 the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.
1327
1328 A current offset alignment is:
1329
1330 . The `@` prefix.
1331
1332 . A <<const-int,positive constant integer>> which is the alignment value
1333 in _bits_.
1334 +
1335 This value must be greater than zero and a multiple of{nbsp}8.
1336
1337 . **Optional**:
1338 +
1339 --
1340 . The ``pass:[~]`` prefix.
1341 . A <<const-int,positive constant integer>> which is the value of the
1342 byte to use as padding to align the <<cur-offset,current offset>>.
1343 --
1344 +
1345 Without this section, the padding byte value is zero.
1346
1347 ====
1348 Input:
1349
1350 ----
1351 11 22 (@32 aa bb cc) * 3
1352 ----
1353
1354 Output:
1355
1356 ----
1357 11 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
1358 ----
1359 ====
1360
1361 ====
1362 Input:
1363
1364 ----
1365 !le
1366 77 88
1367 @32~0xcc [-893.5:32]
1368 @128~0x55 "meow"
1369 ----
1370
1371 Output:
1372
1373 ----
1374 77 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
1375 6d 65 6f 77 ┆ meow
1376 ----
1377 ====
1378
1379 ====
1380 Input:
1381
1382 ----
1383 aa bb cc <29> @64~255 "zoom"
1384 ----
1385
1386 Output:
1387
1388 ----
1389 aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
1390 ----
1391 ====
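
The padding rule above can be sketched in {py3} as follows. This is only an illustration of the arithmetic, not Normand's actual implementation; the `alignment_padding()` function and its parameter names are made up for this example.

```python
def alignment_padding(cur_offset: int, align_bits: int, pad_byte: int = 0) -> bytes:
    """Return the padding bytes which align `cur_offset` to `align_bits` bits."""
    if align_bits <= 0 or align_bits % 8 != 0:
        raise ValueError("alignment must be a positive multiple of 8 bits")

    align_bytes = align_bits // 8

    # Number of bytes missing to reach the next multiple of `align_bytes`
    # (zero when the offset is already aligned).
    pad_len = -cur_offset % align_bytes
    return bytes([pad_byte]) * pad_len


# `11 22 @32`: the current offset is 2, so two zero bytes are needed.
print(alignment_padding(2, 32).hex(" "))

# `<29> @64~255`: three `ff` bytes are needed to reach offset 32.
print(alignment_padding(29, 64, 255).hex(" "))
```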
1392
1393 === Filling
1394
1395 A _filling_ represents zero or more padding bytes to make the
1396 <<cur-offset,current offset>> reach a given value.
1397
1398 A filling is:
1399
1400 . The ``pass:[+]`` prefix.
1401
1402 . One of:
1403
1404 ** A <<const-int,positive constant integer>> which is the current offset
1405 target.
1406
1407 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1408 evaluation result type is `int` or `bool` (automatically converted to
1409 `int`), and the `}` suffix.
1410 +
1411 For a filling at some source location{nbsp}__**L**__, this expression
1412 may contain:
1413 +
1414 --
1415 * The name of any <<label,label>> defined before{nbsp}__**L**__
1416 which isn't within a nested group.
1417 * The name of any <<variable-assignment,variable>> known
1418 at{nbsp}__**L**__.
1419 --
1420 +
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before adding the padding bytes).
1424
1425 ** A valid {py3} name.
1426 +
1427 For the name `__NAME__`, this is equivalent to the
1428 `pass:[{]__NAME__}` form above.
1429
1430 +
1431 This value must be greater than or equal to the current offset where
1432 it's used.
1433
1434 . **Optional**:
1435 +
1436 --
1437 . The ``pass:[~]`` prefix.
1438 . A <<const-int,positive constant integer>> which is the value of the
1439 byte to use as padding to reach the current offset target.
1440 --
1441 +
1442 Without this section, the padding byte value is zero.
1443
1444 ====
1445 Input:
1446
1447 ----
1448 aa bb cc dd
1449 +0x40
1450 "hello world"
1451 ----
1452
1453 Output:
1454
1455 ----
1456 aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1457 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1458 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1459 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1460 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
1461 ----
1462 ====
1463
1464 ====
1465 Input:
1466
1467 ----
1468 !macro part(iter, fill)
1469 <0> "particular security " [ord('0') + iter : 8] +fill~0x80
1470 !end
1471
1472 {iter = 1}
1473
1474 !repeat 5
1475 m:part(iter, {32 + 4 * iter})
1476 {iter = iter + 1}
1477 !end
1478 ----
1479
1480 Output:
1481
1482 ----
1483 70 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur
1484 69 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1•••••••••••
1485 80 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s
1486 65 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2•••••••
1487 80 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part
1488 69 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security
1489 33 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3•••••••••••••••
1490 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
1491 61 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4•••
1492 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
1493 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
1494 61 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5•••
1495 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
1496 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
1497 ----
1498 ====
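
As a rough {py3} illustration of the rule above (not Normand's own code), a filling generates exactly the number of bytes between the current offset and the target. The `filling()` function below is a hypothetical helper.

```python
def filling(cur_offset: int, target_offset: int, pad_byte: int = 0) -> bytes:
    """Return the padding bytes needed to go from `cur_offset` to `target_offset`."""
    if target_offset < cur_offset:
        # The target must be greater than or equal to the current offset.
        raise ValueError("target offset is before the current offset")

    return bytes([pad_byte]) * (target_offset - cur_offset)


# `aa bb cc dd +0x40`: the current offset is 4, the target is 64.
print(len(filling(4, 0x40)))

# First expansion of `m:part()` above: 21 bytes of text, then `+fill~0x80`
# with `fill` being 36.
print(filling(21, 36, 0x80).hex(" "))
```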
1499
1500 === Label
1501
1502 A _label_ associates a name to the <<cur-offset,current offset>>.
1503
1504 All the labels of a whole Normand input must have unique names.
1505
A label must not share the name of a <<variable-assignment,variable>>.
1508
1509 A label is:
1510
1511 . The `<` prefix.
1512
1513 . A valid {py3} name which is not `ICITTE`.
1514
1515 . The `>` suffix.
1516
1517 === Variable assignment
1518
A _variable assignment_ associates a name to the result of an evaluated
{py3} expression.
1521
1522 A variable assignment is:
1523
1524 . The ``pass:[{]`` prefix.
1525
1526 . A valid {py3} name which is not `ICITTE`.
1527
1528 . The `=` character.
1529
. A valid {py3} expression of which the evaluation result type is `int`,
`float`, `bool` (automatically converted to `int`), or `str`.
1532 +
1533 For a variable assignment at some source location{nbsp}__**L**__, this
1534 expression may contain:
1535 +
1536 --
1537 * The name of any <<label,label>> defined before{nbsp}__**L**__
1538 which isn't within a nested group.
1539 * The name of any <<variable-assignment,variable>> known
1540 at{nbsp}__**L**__.
1541 --
1542 +
1543 The value of the special name `ICITTE` (`int` type) in this expression
1544 is the <<cur-offset,current offset>>.
1545
1546 . The `}` suffix.
1547
1548 ====
1549 Input:
1550
1551 ----
1552 {mix = 101} !le
1553 {meow = 42} 11 22 [meow:8] 33 {meow = ICITTE + 17}
1554 "yooo" [meow + mix : 16]
1555 ----
1556
1557 Output:
1558
1559 ----
1560 11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
1561 ----
1562 ====
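
The following {py3} sketch reproduces the arithmetic of the example above by hand. It doesn't use Normand itself; the variable names simply mirror the ones of the Normand input.

```python
mix = 101                                    # {mix = 101}
meow = 42                                    # {meow = 42}

data = bytearray.fromhex("11 22")            # 11 22
data += meow.to_bytes(1, "little")           # [meow:8]
data += bytes.fromhex("33")                  # 33
meow = len(data) + 17                        # {meow = ICITTE + 17}: ICITTE is 4 here
data += b"yooo"                              # "yooo"
data += (meow + mix).to_bytes(2, "little")   # [meow + mix : 16] under !le

print(data.hex(" "))
```

The second assignment to `meow` replaces its value, so the last item encodes 21{nbsp}+{nbsp}101 as a 16-bit little-endian integer.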
1563
1564 === Group
1565
1566 A _group_ is a scoped sequence of items.
1567
1568 The <<label,labels>> within a group aren't visible outside of it.
1569
1570 The main purpose of a group is to <<post-item-repetition,repeat>> more
1571 than a single item and to isolate labels.
1572
1573 A group is:
1574
1575 . The `(`, `!group`, or `!g` opening.
1576
1577 . Zero or more items except, recursively, a macro definition block.
1578
1579 . Depending on the group opening:
1580 +
1581 --
1582 `(`::
1583 The `)` closing.
1584
1585 `!group`::
1586 `!g`::
1587 The `!end` closing.
1588 --
1589
1590 ====
1591 Input:
1592
1593 ----
1594 ((aa bb cc) dd () ee) "leclerc"
1595 ----
1596
1597 Output:
1598
1599 ----
1600 aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
1601 ----
1602 ====
1603
1604 ====
1605 Input:
1606
1607 ----
1608 !group
1609 (aa bb cc) * 3 dd ee
1610 !end * 5
1611 ----
1612
1613 Output:
1614
1615 ----
1616 aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
1617 cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
1618 ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
1619 bb cc aa bb cc dd ee
1620 ----
1621 ====
1622
1623 ====
1624 Input:
1625
1626 ----
1627 !be
1628 (
1629 <str_beg> u16le"sébastien diaz" <str_end>
1630 [ICITTE - str_beg : 8]
1631 [(end - str_beg) * 5 : 24]
1632 ) * 3
1633 <end>
1634 ----
1635
1636 Output:
1637
1638 ----
1639 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1640 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
1641 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1642 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
1643 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1644 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
1645 ----
1646 ====
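
The numbers of the last example can be checked with a short {py3} sketch, assuming that each repetition of the group gets its own `str_beg` label while `end` belongs to the main group. This is hand-written arithmetic, not a call into Normand.

```python
# u16le"sébastien diaz": 14 characters, two bytes each.
text = "s\u00e9bastien diaz".encode("utf-16-le")

# One copy of the group: string, 8-bit number, 24-bit number.
group_size = len(text) + 1 + 3

# Value of the `end` label, defined after the three copies.
end = 3 * group_size

data = bytearray()

for _ in range(3):
    str_beg = len(data)                                  # <str_beg>
    data += text
    data += (len(data) - str_beg).to_bytes(1, "big")     # [ICITTE - str_beg : 8]
    data += ((end - str_beg) * 5).to_bytes(3, "big")     # [(end - str_beg) * 5 : 24]

print(data[28:32].hex(" "))
```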
1647
1648 === Conditional block
1649
1650 A _conditional block_ represents either the bytes of zero or more items
1651 if some expression is true, or the bytes of zero or more other items if
1652 it's false.
1653
1654 A conditional block is:
1655
1656 . The `!if` opening.
1657
1658 . One of:
1659
1660 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1661 evaluation result type is `int` or `bool` (automatically converted to
1662 `int`), and the `}` suffix.
1663 +
1664 For a conditional block at some source location{nbsp}__**L**__, this
1665 expression may contain:
1666 +
1667 --
1668 * The name of any <<label,label>> defined before{nbsp}__**L**__
1669 which isn't within a nested group.
1670 * The name of any <<variable-assignment,variable>> known
1671 at{nbsp}__**L**__.
1672 --
1673 +
1674 The value of the special name `ICITTE` (`int` type) in this expression
1675 is the <<cur-offset,current offset>> (before handling the contained
1676 items).
1677
1678 ** A valid {py3} name.
1679 +
1680 For the name `__NAME__`, this is equivalent to the
1681 `pass:[{]__NAME__}` form above.
1682
1683 . Zero or more items to be handled when the condition is true
1684 except, recursively, a macro definition block.
1685
1686 . **Optional**:
1687
1688 .. The `!else` opening.
1689 .. Zero or more items to be handled when the condition is false
except, recursively, a macro definition block.
1691
1692 . The `!end` closing.
1693
1694 ====
1695 Input:
1696
1697 ----
1698 {at = 1}
1699 {rep_count = 9}
1700
1701 !repeat rep_count
1702 "meow "
1703
1704 !if {ICITTE > 25}
1705 "mix"
1706 !else
1707 "zoom"
1708 !end
1709
1710 !if {at < rep_count} 20 !end
1711
1712 {at = at + 1}
1713 !end
1714 ----
1715
1716 Output:
1717
1718 ----
1719 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 6f 77 20 7a ┆ meow zoom meow z
1720 6f 6f 6d 20 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 ┆ oom meow zoom me
1721 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 78 20 ┆ ow mix meow mix
1722 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 ┆ meow mix meow mi
1723 78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
1724 6d 69 78 ┆ mix
1725 ----
1726 ====
1727
1728 ====
1729 Input:
1730
1731 ----
1732 <str_beg>
1733 u16le"meow mix!"
1734 <str_end>
1735
1736 !if {str_end - str_beg > 10}
1737 " BIG"
1738 !end
1739 ----
1740
1741 Output:
1742
1743 ----
1744 6d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
1745 21 00 20 42 49 47 ┆ !• BIG
1746 ----
1747 ====
1748
1749 === Repetition block
1750
1751 A _repetition block_ represents the bytes of one or more items repeated
1752 a given number of times.
1753
1754 A repetition block is:
1755
1756 . The `!repeat` or `!r` opening.
1757
1758 . One of:
1759
** A <<const-int,positive constant integer>> which is the number of
times to repeat the items of the block.
1762
1763 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1764 evaluation result type is `int` or `bool` (automatically converted to
1765 `int`), and the `}` suffix.
1766 +
1767 For a repetition block at some source location{nbsp}__**L**__, this
1768 expression may contain:
1769 +
1770 --
1771 * The name of any <<label,label>> defined before{nbsp}__**L**__
1772 which isn't within a nested group.
1773 * The name of any <<variable-assignment,variable>> known
1774 at{nbsp}__**L**__.
1775 --
1776 +
1777 The value of the special name `ICITTE` (`int` type) in this expression
1778 is the <<cur-offset,current offset>> (before handling the items to
1779 repeat).
1780
1781 ** A valid {py3} name.
1782 +
1783 For the name `__NAME__`, this is equivalent to the
1784 `pass:[{]__NAME__}` form above.
1785
1786 . Zero or more items except, recursively, a macro definition block.
1787
1788 . The `!end` closing.
1789
1790 You may also use a <<post-item-repetition,post-item repetition>> after
1791 some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
1792 is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.
1793
1794 ====
1795 Input:
1796
1797 ----
1798 !repeat 0o400
1799 [end - ICITTE - 1 : 8]
1800 !end
1801
1802 <end>
1803 ----
1804
1805 Output:
1806
1807 ----
1808 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1809 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1810 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1811 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1812 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1813 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
1814 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
1815 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
1816 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
1817 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
1818 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
1819 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
1820 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
1821 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1822 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
1823 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1824 ----
1825 ====
1826
1827 ====
1828 Input:
1829
1830 ----
1831 {times = 1}
1832
1833 aa bb cc dd
1834
1835 !repeat 3
1836 <here>
1837
1838 !repeat {here + 1}
1839 ee ff
1840 !end
1841
1842 11 22 !repeat times 33 !end
1843
1844 {times = times + 1}
1845 !end
1846
1847 "coucou!"
1848 ----
1849
1850 Output:
1851
1852 ----
1853 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
1854 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1855 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1856 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1857 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1858 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1859 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1860 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1861 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1862 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1863 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
1864 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1865 ----
1866 ====
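
The first example above relies on `ICITTE` being evaluated again for each repetition. A minimal {py3} equivalent, written by hand and assuming an initial offset of zero:

```python
# Value of the `end` label: 0o400 (256) one-byte items come before it.
end = 0o400

# [end - ICITTE - 1 : 8], with ICITTE growing by one on each repetition.
data = bytes(end - icitte - 1 for icitte in range(end))

print(data[:4].hex(" "), "...", data[-4:].hex(" "))
```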
1867
1868 === Transformation block
1869
1870 A _transformation block_ represents the bytes of one or more items
1871 transformed into other bytes by a function.
1872
1873 As of this version, Normand only offers a predetermined set of
1874 transformation functions.
1875
A transformation block is:
1877
1878 . The `!transform` or `!t` opening.
1879
1880 . A transformation function name amongst:
1881 +
1882 --
1883 [horizontal]
1884 `base64`::
1885 `b64`::
1886 Standard https://datatracker.ietf.org/doc/html/rfc4648.html#section-4[Base64].
1887
1888 `base64u`::
1889 `b64u`::
1890 URL-safe Base64, using `-` instead of `pass:[+]` and `_` instead of
1891 `/`.
1892
1893 `base32`::
1894 `b32`::
1895 Standard https://datatracker.ietf.org/doc/html/rfc4648.html#section-6[Base32].
1896
1897 `base16`::
1898 `b16`::
1899 Standard https://datatracker.ietf.org/doc/html/rfc4648.html#section-8[Base16].
1900
1901 `ascii85`::
1902 `a85`::
1903 https://en.wikipedia.org/wiki/Ascii85[Ascii85] without padding.
1904
1905 `ascii85p`::
1906 `a85p`::
1907 Ascii85 with padding.
1908
1909 `base85`::
1910 `b85`::
1911 https://en.wikipedia.org/wiki/Ascii85[Base85] (like Git-style binary
1912 diffs) without padding.
1913
1914 `base85p`::
1915 `b85p`::
1916 Base85 with padding.
1917
1918 `quopri`::
1919 `qp`::
1920 MIME
1921 https://datatracker.ietf.org/doc/html/rfc2045#section-6.7[quoted-printable]
1922 without quoted whitespaces.
1923
1924 `quoprit`::
1925 `qpt`::
1926 MIME quoted-printable with quoted whitespaces.
1927
1928 `gzip`::
1929 `gz`::
1930 https://en.wikipedia.org/wiki/Gzip[gzip].
1931
1932 `bzip2`::
1933 `bz2`::
1934 https://en.wikipedia.org/wiki/Bzip2[bzip2].
1935 --
1936
1937 . Zero or more items except, recursively, a macro definition block.
1938 +
1939 Any {py3} expression within any of those items may not refer to a future
1940 <<label,label>>.
1941 +
1942 The value of the special name `ICITTE` in any {py3} expression within
1943 any of those items is the <<cur-offset,current offset>> _before_ Normand
1944 applies the transformation function. Therefore, labels defined within
1945 those items also have the current offset value _before_ Normand applies
1946 the transformation function.
1947
1948 . The `!end` closing.
1949
1950 The <<cur-offset,current offset>> after having handled the last item of
1951 a transformation block is the value of the current offset before
1952 handling the first item plus the size of the generated (transformed)
1953 bytes. In other words, <<current-offset-setting,current offset
1954 settings>> within the items of the block have no impact outside said
1955 block.
1956
1957 ====
1958 Input:
1959
1960 ----
1961 aa bb cc dd
1962
1963 "size of compressed section: " [end - start : 8]
1964
1965 <start>
1966
1967 !transform bzip2
1968 "this will be compressed!"
1969 89*100 00*5000
1970 !end
1971
1972 <end>
1973
1974 "yes!"
1975 ----
1976
1977 Output:
1978
1979 ----
1980 aa bb cc dd 73 69 7a 65 20 6f 66 20 63 6f 6d 70 ┆ ••••size of comp
1981 72 65 73 73 65 64 20 73 65 63 74 69 6f 6e 3a 20 ┆ ressed section:
1982 52 42 5a 68 39 31 41 59 26 53 59 68 e1 8c fc 00 ┆ RBZh91AY&SYh••••
1983 00 33 d1 e0 c0 00 60 00 5e 66 dc 80 00 20 00 80 ┆ •3••••`•^f••• ••
1984 00 08 20 00 31 40 d3 43 23 26 20 ca 87 a9 a1 e8 ┆ •• •1@•C#& •••••
1985 18 29 44 80 9c 80 49 bf cc b3 e8 45 ed e2 76 ad ┆ •)D•••I••••E••v•
1986 0f 12 8b 8a d6 cd 40 04 7e 2e e4 8a 70 a1 20 d1 ┆ ••••••@•~.••p• •
1987 c3 19 f8 79 65 73 21 ┆ •••yes!
1988 ----
1989 ====
1990
1991 ====
1992 Input:
1993
1994 ----
1995 88*16
1996
1997 !t a85
1998 "I am determined to be cheerful and happy in whatever situation "
1999 "I may find myself. For I have learned that the greater part of "
2000 "our misery or unhappiness is determined not by our circumstance "
2001 "but by our disposition."
2002 !end
2003
2004 @128~99h
2005
2006 !t qp <beg> [ICITTE - beg : 8] * 50 !end
2007 ----
2008
2009 Output:
2010
2011 ----
2012 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 ┆ ••••••••••••••••
2013 38 4b 5f 47 59 2b 43 6f 26 2a 41 54 44 58 25 44 ┆ 8K_GY+Co&*ATDX%D
2014 49 6d 3f 24 46 44 69 3a 32 41 4b 59 4a 72 41 53 ┆ Im?$FDi:2AKYJrAS
2015 23 6d 6f 46 5f 69 31 2f 44 49 61 6c 27 40 3b 70 ┆ #moF_i1/DIal'@;p
2016 31 32 2b 44 47 5e 39 47 41 28 45 2c 41 54 68 58 ┆ 12+DG^9GA(E,AThX
2017 2a 2b 45 4d 37 3d 46 5e 5d 42 2b 44 66 2d 5b 68 ┆ *+EM7=F^]B+Df-[h
2018 2b 44 6b 50 34 2b 44 2c 3e 2a 41 30 3e 60 37 46 ┆ +DkP4+D,>*A0>`7F
2019 28 4b 30 22 2f 67 2a 57 25 45 5a 64 70 72 42 4f ┆ (K0"/g*W%EZdprBO
2020 51 27 71 2b 44 62 55 74 45 63 2c 48 21 2b 45 56 ┆ Q'q+DbUtEc,H!+EV
2021 3a 2a 46 3c 47 5b 3d 41 4b 59 57 2b 41 52 54 5b ┆ :*F<G[=AKYW+ART[
2022 6c 45 5a 66 3d 30 45 63 60 46 42 41 66 75 23 37 ┆ lEZf=0Ec`FBAfu#7
2023 45 5a 66 34 35 46 28 4b 42 3b 2b 45 29 39 43 46 ┆ EZf45F(KB;+E)9CF
2024 60 28 6c 24 45 2c 5d 4e 2f 41 54 4d 6f 38 42 6c ┆ `(l$E,]N/ATMo8Bl
2025 62 44 2d 41 54 56 4c 28 44 2f 21 6d 21 41 30 3e ┆ bD-ATVL(D/!m!A0>
2026 63 2e 46 3c 47 25 3c 2b 45 29 43 43 2b 43 66 2c ┆ c.F<G%<+E)CC+Cf,
2027 2b 40 73 29 58 30 46 43 42 26 73 41 4b 59 48 29 ┆ +@s)X0FCB&sAKYH)
2028 46 3c 47 25 3c 2b 45 29 43 43 2b 43 6f 32 2d 45 ┆ F<G%<+E)CC+Co2-E
2029 2c 54 66 33 46 44 35 5a 32 2f 63 99 99 99 99 99 ┆ ,Tf3FD5Z2/c•••••
2030 3d 30 30 3d 30 31 3d 30 32 3d 30 33 3d 30 34 3d ┆ =00=01=02=03=04=
2031 30 35 3d 30 36 3d 30 37 3d 30 38 3d 30 39 0a 3d ┆ 05=06=07=08=09•=
2032 30 42 3d 30 43 0d 3d 30 45 3d 30 46 3d 31 30 3d ┆ 0B=0C•=0E=0F=10=
2033 31 31 3d 31 32 3d 31 33 3d 31 34 3d 31 35 3d 31 ┆ 11=12=13=14=15=1
2034 36 3d 31 37 3d 31 38 3d 31 39 3d 31 41 3d 31 42 ┆ 6=17=18=19=1A=1B
2035 3d 31 43 3d 31 44 3d 31 45 3d 31 46 20 21 22 23 ┆ =1C=1D=1E=1F !"#
2036 24 25 26 27 28 29 2a 2b 2c 2d 3d 0a 2e 2f 30 31 ┆ $%&'()*+,-=•./01
2037 ----
2038 ====
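
For readers who want to reproduce such outputs outside of Normand, the sketch below maps the long transformation function names to {py3} standard library calls. The mapping is an assumption made for illustration, not a statement about how Normand implements those functions; the `TRANSFORMS` table and the `transform()` helper are hypothetical.

```python
import base64
import bz2
import gzip
import quopri

# Assumed mapping from transformation function names to standard
# library encoders and compressors (illustration only).
TRANSFORMS = {
    "base64": base64.b64encode,
    "base64u": base64.urlsafe_b64encode,
    "base32": base64.b32encode,
    "base16": base64.b16encode,
    "ascii85": base64.a85encode,
    "ascii85p": lambda data: base64.a85encode(data, pad=True),
    "base85": base64.b85encode,
    "base85p": lambda data: base64.b85encode(data, pad=True),
    "quopri": lambda data: quopri.encodestring(data, quotetabs=False),
    "quoprit": lambda data: quopri.encodestring(data, quotetabs=True),
    "gzip": gzip.compress,
    "bzip2": bz2.compress,
}


def transform(name: str, data: bytes, start_offset: int):
    """Return the transformed bytes and the current offset after the block."""
    out = TRANSFORMS[name](data)

    # The offset after the block only depends on the transformed size.
    return out, start_offset + len(out)


# First four bytes of the `!t a85` block above, starting at offset 16.
encoded, new_offset = transform("ascii85", b"I am", 16)
print(encoded, new_offset)
```

The new offset is the starting offset plus the size of the transformed bytes, whatever the size of the original items.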
2039
2040 === Macro definition block
2041
2042 A _macro definition block_ associates a name and parameter names to
2043 a group of items.
2044
2045 A macro definition block doesn't lead to generated bytes itself: a
2046 <<macro-expansion,macro expansion>> does so.
2047
A macro definition may only exist at the root level, that is, not within
a <<group,group>>, a <<repetition-block,repetition block>>, a
<<conditional-block,conditional block>>, a
<<transformation-block,transformation block>>, or another
<<macro-definition-block,macro definition block>>.
2052
2053 All macro definitions must have unique names.
2054
2055 A macro definition is:
2056
2057 . The `!macro` or `!m` opening.
2058
2059 . A valid {py3} name (the macro name).
2060
2061 . The `(` parameter name list prefix.
2062
2063 . A comma-separated list of zero or more unique parameter names,
2064 each one being a valid {py3} name.
2065
2066 . The `)` parameter name list suffix.
2067
2068 . Zero or more items except, recursively, a macro definition block.
2069
2070 . The `!end` closing.
2071
2072 ====
2073 ----
2074 !macro bake()
2075 !le [ICITTE * 8 : 16]
2076 u16le"predict explode"
2077 !end
2078 ----
2079 ====
2080
2081 ====
2082 ----
2083 !macro nail(rep, with_extra, val)
2084 {iter = 1}
2085
2086 !repeat rep
2087 [val + iter : uleb128]
2088 [0xdeadbeef : 32]
2089 {iter = iter + 1}
2090 !end
2091
2092 !if with_extra
2093 "meow mix\0"
2094 !end
2095 !end
2096 ----
2097 ====
2098
2099 === Macro expansion
2100
2101 A _macro expansion_ expands the items of a defined
2102 <<macro-definition-block,macro>>.
2103
2104 The macro to expand must be defined _before_ the expansion.
2105
2106 The <<state,state>> before handling the first item of the chosen macro
2107 is:
2108
2109 <<cur-offset,Current offset>>::
2110 Unchanged.
2111
2112 <<cur-bo,Current byte order>>::
2113 Unchanged.
2114
2115 Variables::
2116 The only available variables initially are the macro parameters.
2117
2118 Labels::
2119 None.
2120
2121 The state after having handled the last item of the chosen macro is:
2122
2123 Current offset::
2124 The one before handling the first item of the macro plus the size
2125 of the generated data of the macro expansion.
2126 +
2127 IMPORTANT: This means <<current-offset-setting,current offset setting>>
2128 items within the expanded macro don't impact the final current offset.
2129
2130 Current byte order::
2131 The one before handling the first item of the macro.
2132
2133 Variables::
2134 The ones before handling the first item of the macro.
2135
2136 Labels::
2137 The ones before handling the first item of the macro.
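
The state rules above can be summarized with a {py3} sketch. The `State` class and the `expand_macro()` function are hypothetical and only model the description; they aren't Normand's internals.

```python
import dataclasses
from typing import Callable, Dict, Optional


@dataclasses.dataclass
class State:
    # Hypothetical model of the processor state.
    offset: int = 0
    byte_order: Optional[str] = None
    variables: Dict[str, object] = dataclasses.field(default_factory=dict)
    labels: Dict[str, int] = dataclasses.field(default_factory=dict)


def expand_macro(state: State, params: Dict[str, object],
                 body: Callable[[State], bytes]) -> bytes:
    # State seen by the first item of the macro: same offset and byte
    # order, only the parameters as variables, and no labels.
    inner = State(offset=state.offset, byte_order=state.byte_order,
                  variables=dict(params), labels={})

    # `body` stands for handling the items of the macro; it may change
    # `inner` freely.
    data = body(inner)

    # For the caller, only the offset changes, and only by the size of
    # the generated data.
    state.offset += len(data)
    return data
```

Because the body works on a separate state, a current offset setting or a byte order setting within the macro has no effect once the expansion is done.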
2138
2139 A macro expansion is:
2140
2141 . The `m:` prefix.
2142
2143 . A valid {py3} name (the name of the macro to expand).
2144
2145 . The `(` parameter value list prefix.
2146
. A comma-separated list of zero or more parameter values.
2148 +
2149 The number of parameter values must match the number of parameter
2150 names of the definition of the chosen macro.
2151 +
2152 A parameter value is one of:
2153 +
2154 --
2155 * A <<const-int,constant integer>>, possibly negative.
2156
2157 * A constant floating point number.
2158
2159 * The ``pass:[{]`` prefix, a valid {py3} expression of which the
2160 evaluation result type is `int` or `bool` (automatically converted to
2161 `int`), and the `}` suffix.
2162 +
2163 For a macro expansion at some source location{nbsp}__**L**__, this
2164 expression may contain:
2165
2166 ** The name of any <<label,label>> defined before{nbsp}__**L**__
2167 which isn't within a nested group.
2168 ** The name of any <<variable-assignment,variable>> known
2169 at{nbsp}__**L**__.
2170
2171 +
2172 The value of the special name `ICITTE` (`int` type) in this expression
2173 is the <<cur-offset,current offset>> (before handling the items of the
2174 chosen macro).
2175
2176 * A valid {py3} name.
2177 +
2178 For the name `__NAME__`, this is equivalent to the
2179 `pass:[{]__NAME__pass:[}]` form above.
2180 --
2181
2182 . The `)` parameter value list suffix.
2183
2184 ====
2185 Input:
2186
2187 ----
2188 !macro bake()
2189 !le [ICITTE * 8 : 16]
2190 u16le"predict explode"
2191 !end
2192
2193 "hello [" m:bake() "] world"
2194
2195 m:bake() * 5
2196 ----
2197
2198 Output:
2199
2200 ----
2201 68 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
2202 00 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
2203 00 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
2204 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2205 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
2206 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2207 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
2208 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2209 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
2210 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2211 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
2212 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2213 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
2214 ----
2215 ====
2216
2217 ====
2218 Input:
2219
2220 ----
2221 !macro A(val, is_be)
2222 !le
2223
2224 !if is_be
2225 !be
2226 !end
2227
2228 [val : 16]
2229 !end
2230
2231 !macro B(rep, is_be)
2232 {iter = 1}
2233
2234 !repeat rep
2235 m:A({iter * 3}, is_be)
2236 {iter = iter + 1}
2237 !end
2238 !end
2239
2240 m:B(5, 1)
2241 m:B(3, 0)
2242 ----
2243
2244 Output:
2245
2246 ----
2247 00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
2248 ----
2249 ====
2250
2251 ====
2252 Input:
2253
2254 ----
2255 !macro flt32be(val) !be [val : 32] !end
2256
2257 "CHEETOS"
2258 m:flt32be(-42.17)
2259 m:flt32be(56.23e-4)
2260 ----
2261
2262 Output:
2263
2264 ----
2265 43 48 45 45 54 4f 53 c2 28 ae 14 3b b8 41 25 ┆ CHEETOS•(••;•A%
2266 ----
2267 ====
2268
2269 === Post-item repetition
2270
2271 A _post-item repetition_ represents the bytes of an item repeated a
2272 given number of times.
2273
2274 A post-item repetition is:
2275
2276 . One of those items:
2277
2278 ** A <<byte-constant,byte constant>>.
2279 ** A <<literal-string,literal string>>.
2280 ** A <<fixed-length-number,fixed-length number>>.
2281 ** An <<leb128-integer,LEB128 integer>>.
2282 ** A <<string,string>>.
** A <<macro-expansion,macro expansion>>.
2284 ** A <<transformation-block,transformation block>>.
2285 ** A <<group,group>>.
2286
2287 . The ``pass:[*]`` character.
2288
2289 . One of:
2290
2291 ** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
2292 which is the number of times to repeat the previous item.
2293
2294 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
2295 evaluation result type is `int` or `bool` (automatically converted to
2296 `int`), and the `}` suffix.
2297 +
2298 For a post-item repetition at some source location{nbsp}__**L**__, this
2299 expression may contain:
2300 +
2301 --
2302 * The name of any <<label,label>> defined before{nbsp}__**L**__
2303 which isn't within a nested group and
2304 which isn't part of the repeated item.
* The name of any <<variable-assignment,variable>> known
at{nbsp}__**L**__ which isn't part of the repeated item.
2308 --
2309 +
2310 The value of the special name `ICITTE` (`int` type) in this expression
2311 is the <<cur-offset,current offset>> (before handling the items to
2312 repeat).
2313
2314 ** A valid {py3} name.
2315 +
2316 For the name `__NAME__`, this is equivalent to the
2317 `pass:[{]__NAME__pass:[}]` form above.
2318
2319 You may also use a <<repetition-block,repetition block>>. The form
2320 ``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
2321 ``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.
2322
2323 ====
2324 Input:
2325
2326 ----
2327 [end - ICITTE - 1 : 8] * 0x100 <end>
2328 ----
2329
2330 Output:
2331
2332 ----
2333 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
2334 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
2335 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
2336 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
2337 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
2338 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
2339 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
2340 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
2341 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
2342 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
2343 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
2344 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
2345 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
2346 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
2347 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
2348 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
2349 ----
2350 ====
2351
2352 ====
2353 Input:
2354
2355 ----
2356 {times = 1}
2357 aa bb cc dd
2358 (
2359 <here>
2360 (ee ff) * {here + 1}
2361 11 22 33 * {times}
2362 {times = times + 1}
2363 ) * 3
2364 "coucou!"
2365 ----
2366
2367 Output:
2368
2369 ----
2370 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
2371 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
2372 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2373 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
2374 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2375 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2376 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2377 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2378 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2379 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2380 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
2381 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
2382 ----
2383 ====
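
In the last example, `pass:[*] {times}` applies to the single item right before it (`33`), not to `11 22 33` as a whole. The hand-written {py3} loop below, which assumes an initial offset of zero and doesn't use Normand, generates the same bytes:

```python
data = bytearray.fromhex("aa bb cc dd")
times = 1

for _ in range(3):
    here = len(data)                              # <here>
    data += bytes.fromhex("ee ff") * (here + 1)   # (ee ff) * {here + 1}
    data += bytes.fromhex("11 22")                # 11 22
    data += bytes.fromhex("33") * times           # 33 * {times}
    times += 1                                    # {times = times + 1}

data += b"coucou!"

print(len(data))
```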
2384
2385 == Command-line tool
2386
2387 If you <<install-normand,installed>> the `normand` package, then you
2388 can use the `normand` command-line tool:
2389
2390 ----
2391 $ normand <<< '"ma gang de malades"' | hexdump -C
2392 ----
2393
2394 ----
2395 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
2396 00000010 65 73 |es|
2397 ----
2398
2399 If you copy the `normand.py` module to your own project, then you can
2400 run the module itself:
2401
2402 ----
2403 $ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
2404 ----
2405
2406 ----
2407 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
2408 00000010 65 73 |es|
2409 ----
2410
2411 Without a path argument, the `normand` tool reads from the standard
2412 input.
2413
2414 The `normand` tool prints the generated binary data to the standard
2415 output.
2416
2417 Various options control the initial <<state,state>> of the processor:
2418 use the `--help` option to learn more.
2419
2420 == {py3} API
2421
2422 The whole `normand` package/module public API is:
2423
2424 [source,python]
2425 ----
# Byte order.
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


# Text location.
class TextLocation:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


# Parsing error message.
class ParseErrorMessage:
    # Message text.
    @property
    def text(self):
        ...

    # Source text location.
    @property
    def text_location(self):
        ...


# Parsing error.
class ParseError(RuntimeError):
    # Parsing error messages.
    #
    # The first message is the most _specific_ one.
    @property
    def messages(self):
        ...


# Variables dictionary type (for type hints).
VariablesT = typing.Dict[str, typing.Union[int, float]]


# Labels dictionary type (for type hints).
LabelsT = typing.Dict[str, int]


# Parsing result.
class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> VariablesT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> LabelsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


# Parses the `normand` input using the initial state defined by
# `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
# and returns the corresponding parsing result.
def parse(normand: str,
          init_variables: typing.Optional[VariablesT] = None,
          init_labels: typing.Optional[LabelsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
2516 ----

The `normand` parameter is the actual <<learn-normand,Normand input>>
while the other parameters control the initial <<state,state>>.
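
Because a `ParseResult` exposes the final variables, labels, offset, and
byte order, one possible use of the `init_*` parameters is to process an
input in several pieces, feeding the final state of one call into the
next. This is only a sketch of that idea: `parse_chunks()` is a
hypothetical helper, its `parse` parameter exists only so that the
function can be exercised without Normand, and whether carrying labels
over from one piece to the next suits your input is an assumption to
check.

```python
import typing


def parse_chunks(chunks: typing.Iterable[str], parse=None) -> bytes:
    if parse is None:
        # Assumes that the `normand` package or module is importable.
        import normand

        parse = normand.parse

    data = bytearray()
    variables = None
    labels = None
    offset = 0
    byte_order = None

    for chunk in chunks:
        res = parse(chunk, init_variables=variables, init_labels=labels,
                    init_offset=offset, init_byte_order=byte_order)
        data += res.data

        # The final state of this chunk is the initial state of the next.
        variables = res.variables
        labels = res.labels
        offset = res.offset
        byte_order = res.byte_order

    return bytes(data)
```

The first call uses the same initial state as the defaults of `parse()`:
no variables, no labels, offset 0, and no byte order.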

The `parse()` function raises a `ParseError` instance should it fail to
parse the `normand` string for any reason.
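
A caller can turn a `ParseError` into a readable report using only the
properties listed above. The sketch below is illustrative:
`format_parse_error()` and `parse_or_report()` are hypothetical helpers,
and the `LINE:COL - MESSAGE` layout is a choice of this example rather
than something the API produces.

```python
import typing


def format_parse_error(exc) -> str:
    # One `LINE:COL - MESSAGE` line per message, in the order given by
    # `exc.messages` (the first message is the most specific one).
    lines = []

    for msg in exc.messages:
        loc = msg.text_location
        lines.append("{}:{} - {}".format(loc.line_no, loc.col_no, msg.text))

    return "\n".join(lines)


def parse_or_report(text: str) -> typing.Tuple[typing.Optional[bytes],
                                               typing.Optional[str]]:
    # Assumes that the `normand` package or module is importable.
    import normand

    try:
        return bytes(normand.parse(text).data), None
    except normand.ParseError as exc:
        return None, format_parse_error(exc)
```

`parse_or_report()` returns the generated bytes and `None` on success,
or `None` and the formatted report on failure.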

== Development

Normand is a https://python-poetry.org/[Poetry] project.

To develop it, install it through Poetry and enter the virtual
environment:

----
$ poetry install
$ poetry shell
$ normand <<< '"lol" * 10 0a'
----

`normand.py` is processed by:

* https://microsoft.github.io/pyright/[Pyright]
* https://github.com/psf/black[Black]
* https://pycqa.github.io/isort/[isort]

=== Testing

Use https://docs.pytest.org/[pytest] to test Normand once the package is
part of your virtual environment, for example:

----
$ poetry install
$ poetry run pip3 install pytest
$ poetry run pytest
----

The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.

In the `tests` directory, each `*.nt` file is a test. The file name
prefix indicates what it's meant to test:

`pass-`::
Everything above the `---` line is the valid Normand input
to test.
+
Everything below the `---` line is the expected data
(whitespace-separated hexadecimal bytes).

`fail-`::
Everything above the `---` line is the invalid Normand input
to test.
+
Everything below the `---` line is the expected error message having
this form:
+
----
LINE:COL - MESSAGE
----
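
The layout of a `*.nt` file described above can be read with a few lines
of Python. This is a minimal sketch, not the project's actual test
harness: `split_nt()`, `expected_bytes()`, and `expected_error()` are
hypothetical names, and the sketch assumes that the separator is the
first line containing exactly `---`, that each expected byte is written
with two hexadecimal digits, and that the expected error fits on a
single line.

```python
import typing


def split_nt(text: str) -> typing.Tuple[str, str]:
    # Split at the first line which is exactly `---` (assumption);
    # raises `ValueError` if there's no such line.
    lines = text.splitlines()
    idx = lines.index("---")
    return "\n".join(lines[:idx]), "\n".join(lines[idx + 1:])


def expected_bytes(section: str) -> bytes:
    # `pass-` test: whitespace-separated hexadecimal bytes.
    return bytes.fromhex("".join(section.split()))


def expected_error(section: str) -> typing.Tuple[int, int, str]:
    # `fail-` test: a single `LINE:COL - MESSAGE` line (assumption).
    loc, _, message = section.strip().partition(" - ")
    line_no, col_no = loc.split(":")
    return int(line_no), int(col_no), message
```

A runner could then compare `expected_bytes()` with the data generated
from the first part for a `pass-` file, or `expected_error()` with the
reported error for a `fail-` file.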

=== Contributing

Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
for code review.

To report a bug, https://github.com/efficios/normand/issues/new[create a
GitHub issue].