// Show ToC at a specific location for a GitHub rendering
ifdef::env-github[]
:toc: macro
endif::env-github[]

ifndef::env-github[]
:toc: left
endif::env-github[]

// This is to mimic what GitHub does so that anchors work in an offline
// rendering too.
:idprefix:
:idseparator: -

// Other attributes
:py3: Python{nbsp}3

= Normand
Philippe Proulx

image::normand-logo.png[]

[.normal]
image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]

[.lead]
_**Normand**_ is a text-to-binary processor with its own language.

This package offers both a portable {py3} module and a command-line
tool.

WARNING: This version of Normand is 0.16, meaning both the Normand
language and the module/CLI interface aren't stable.

ifdef::env-github[]
// ToC location for a GitHub rendering
toc::[]
endif::env-github[]

== Introduction
41
42The purpose of Normand is to consume human-readable text representing
43bytes and to produce the corresponding binary data.
44
45.Simple bytes input.
46====
47Consider the following Normand input:
48
49----
504f 55 32 bb $167 fe %10100111 a9 $-32
51----
52
53The generated nine bytes are:
54
55----
564f 55 32 bb a7 fe a7 a9 e0
57----
58====
59
60As you can see in the last example, the fundamental unit of the Normand
61language is the _byte_. The order in which you list bytes will be the
62order of the generated data.
63
64The Normand language is more than simple lists of bytes, though. Its
65main features are:
66
67Comments, including a bunch of insignificant symbols which may improve readability::
68+
69Input:
70+
71----
72ff bb %1101:0010 # This is a comment
7378 29 af $192 # This too # 99 $-80
74fe80::6257:18ff:fea3:4229
7560:57:18:a3:42:29
7610839636-5d65-4a68-8e6a-21608ddf7258
77----
78+
79Output:
80+
81----
82ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
83a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
8468 8e 6a 21 60 8d df 72 58
85----
86
87Hexadecimal, decimal, and binary byte constants::
88+
89Input:
90+
91----
92aa bb $247 $-89 %0011_0010 %11.01= 10/10
93----
94+
95Output:
96+
97----
98aa bb f7 a7 32 da
99----
100
101UTF-8, UTF-16, and UTF-32 literal strings::
102+
103Input:
104+
105----
106"hello world!" 00
107u16le"stress\nverdict 🤣"
108----
109+
110Output:
111+
112----
11368 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
11400 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
11500 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd ┆ •d•i•c•t• •>•#•
116----
117
118Labels: special variables holding the offset where they're defined::
119+
120----
121<beg> b2 52 e3 bc 91 05
122$100 $50 <chair> 33 9f fe
12325 e9 89 8a <end>
124----
125
126Variables::
127+
128----
1295e 65 {tower = 47} c6 7f f2 c4
13044 {hurl = tower - 14} b5 {tower = hurl} 26 2d
131----
132+
133The value of a variable assignment is the evaluation of a valid {py3}
134expression which may include label and variable names.
135
Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
137+
138Input:
139+
140----
141{strength = 4}
142{be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
143{le} {-1993 : 32}
{-3.141593 : 64}
145----
146+
147Output:
148+
149----
15067 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
15109 c0
152----
153+
The encoded number is the evaluation of a valid {py3} expression which
155may include label and variable names.
156
157https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
158+
159Input:
160+
161----
162aa bb cc {-1993 : sleb128} <meow> dd ee ff
163{meow * 199 : uleb128}
164----
165+
166Output:
167+
168----
169aa bb cc b7 70 dd ee ff e3 07
170----
171+
172The encoded integer is the evaluation of a valid {py3} expression which
173may include label and variable names.
174
175Conditional::
176+
177Input:
178+
179----
180aa bb cc
181
182(
183 "foo"
184
185 !if {ICITTE > 10}
186 "bar"
187 !else
188 "fight"
189 !end
190) * 4
191----
192+
193Output:
194+
195----
196aa bb cc 66 6f 6f 66 69 67 68 74 66 6f 6f 66 69 ┆ •••foofightfoofi
19767 68 74 66 6f 6f 62 61 72 66 6f 6f 62 61 72 ┆ ghtfoobarfoobar
198----
199
200Repetition::
201+
202Input:
203+
204----
aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
206
207!repeat 3
208 ff ee "juice"
209!end
210----
211+
212Output:
213+
214----
215aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
21600 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
21779 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
21865 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
21961 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
22068 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
222ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
2236a 75 69 63 65 ┆ juice
224----
225
226Alignment::
227+
228Input:
229+
230----
231{be}
232
233 {199:32}
234@64 {43:64}
235@16 {-123:16}
236@32~255 {5584:32}
237----
238+
239Output:
240+
241----
24200 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
243ff 85 ff ff 00 00 15 d0
----

Filling::
247+
248Input:
249+
250----
251{le}
252{0xdeadbeef:32}
253{-1993:16}
254{9:16}
255+0x40
256{ICITTE:8}
257"meow mix"
+200~FFh
259{ICITTE:8}
260----
261+
262Output:
263+
264----
265ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7•••••••••••
26600 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
26700 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
26800 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
26940 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix•••••••
270ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
271ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
272ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
273ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
274ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
275ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
276ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
277ff ff ff ff ff ff ff ff c8 ┆ •••••••••
278----
279
280Multilevel grouping::
281+
282Input:
283+
284----
285ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
286----
287+
288Output:
289+
290----
291ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
292bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
2936f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
2946d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
295aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
2967a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
2976f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
298----
299
300Macros::
301+
302Input:
303+
304----
305!macro hello(world)
306 "hello"
307 !if world " world" !end
308!end
309
310!repeat 17
311 ff ff ff ff
312 m:hello({ICITTE > 15 and ICITTE < 60})
313!end
314----
315+
316Output:
317+
318----
319ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
3206c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
32164 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
322ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
323ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
3246f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
3256c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
32668 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
327ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
328ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
3296c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
330----
331
332Precise error reporting::
333+
334----
335/tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
336----
337+
338----
339/tmp/meow.normand:32:6 - Unexpected character `k`.
340----
341+
342----
/tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
344----
345+
346----
347/tmp/meow.normand:32:19 - While expanding the macro `meow`:
348/tmp/meow.normand:35:5 - While expanding the macro `zzz`:
/tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
350----
351
With Normand, you can track text source files in your favorite VCS
instead of raw binary files. You can use the binary files that Normand
generates to test file format decoders, including how they handle
malformed data, as well as for education.
356
357See <<learn-normand>> to explore all the Normand features.
358
359== Install Normand
360
361Normand requires Python ≥ 3.4.
362
363To install Normand:
364
365----
366$ python3 -m pip install --user normand
367----
368
369See
370https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
371to learn more about a user site installation.
372
373[NOTE]
374====
375Normand has a single module file, `normand.py`, which you can copy as is
to your project to use it (both the <<python3-api,`normand.parse()`>>
377function and the <<command-line-tool,command-line tool>>).
378
379`normand.py` has _no external dependencies_, but if you're using
380Python{nbsp}3.4, you'll need a local copy of the standard `typing`
381module.
382====
383
384== Design goals
385
386The design goals of Normand are:
387
388Portability::
389 We're making sure `normand.py` works with Python{nbsp}≥{nbsp}3.4 and
390 doesn't have any external dependencies so that you may just copy the
391 module as is to your own project.
392
393Ease of use::
394 The most basic Normand input is a sequence of hexadecimal constants
395 (for example, `4e6f726d616e64`) which produce exactly what you'd
396 expect.
397+
398Most Normand features map to programming language concepts you already
399know and understand: constant integers, literal strings, variables,
400conditionals, repetitions/loops, and the rest.
401
402Concise and readable input::
403 We could have chosen XML or YAML as the input format, but having a
404 DSL here makes a Normand input compact and easy to read, two
405 important traits when using Normand to write tests, for example.
406+
407Compare the following Normand input and some hypothetical XML
408equivalent, for example:
409+
.Actual Normand input.
411----
412ff dd 01 ab $192 $-128 %1101:0011
413
414{end:8}
415
416{iter = 1}
417
418!if {not something}
419 # five times because xyz
420 !repeat 5
421 "hello world " {iter:8}
422 {iter = iter + 1}
423 !end
424!end
425
426<end>
427----
428+
429.Hypothetical Normand XML input.
430[source,xml]
431----
432<?xml version="1.0" encoding="utf-8" ?>
433<group>
434 <byte base="x" val="ff" />
435 <byte base="x" val="dd" />
436 <byte base="x" val="1" />
437 <byte base="x" val="ab" />
438 <byte base="d" val="192" />
439 <byte base="d" val="-128" />
440 <byte base="b" val="11010011" />
441 <fixed-len-num expr="end" len="8" />
442 <var-assign name="iter" expr="1" />
443 <cond expr="not something">
444 <!-- five times because xyz -->
445 <repeat expr="5">
446 <str>hello world </str>
447 <fixed-len-num expr="iter" len="8" />
448 <var-assign name="iter" expr="iter + 1" />
449 </repeat>
450 </cond>
451 <label name="end" />
452</group>
453----
454
455== Learn Normand
456
457A Normand text input is a sequence of items which represent a sequence
458of raw bytes.
459
460[[state]] During the processing of items to data, Normand relies on a
461current state:
462
[%header%autowidth]
|===
|State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>

|[[cur-offset]] Current offset
|
The current offset has an effect on the value of <<label,labels>> and of
the special `ICITTE` name in <<fixed-length-number,fixed-length
number>>, <<leb128-integer,LEB128 integer>>,
<<filling,filling>>, <<variable-assignment,variable assignment>>,
<<conditional-block,conditional block>>, <<repetition-block,repetition
block>>, <<macro-expansion,macro expansion>>, and
<<post-item-repetition,post-item repetition>> expression evaluation.

Each generated byte increments the current offset.

A <<current-offset-setting,current offset setting>> may change the
current offset without generating data.

A <<current-offset-alignment,current offset alignment>> generates
padding bytes to make the current offset satisfy a given alignment.
|`init_offset` parameter of the `parse()` function.
|`--offset` option.

|[[cur-bo]] Current byte order
|
The current byte order has an effect on the encoding of
<<fixed-length-number,fixed-length numbers>>.

A <<current-byte-order-setting,current byte order setting>> may change
the current byte order.
|`init_byte_order` parameter of the `parse()` function.
|`--byte-order` option.

|<<label,Labels>>
|Mapping of label names to integral values.
|`init_labels` parameter of the `parse()` function.
|One or more `--label` options.

|<<variable-assignment,Variables>>
|Mapping of variable names to integral or floating point number values.
|`init_variables` parameter of the `parse()` function.
|One or more `--var` options.
|===
507
The available items are:

* A <<byte-constant,byte constant>> representing one or more
  constant bytes.

* A <<literal-string,literal string>> representing a sequence of bytes
  encoding UTF-8, UTF-16, or UTF-32 data.

* A <<current-byte-order-setting,current byte order setting>> (big or
  little endian).

* A <<fixed-length-number,fixed-length number>> (integer or
  floating point) using the <<cur-bo,current byte order>> and of which
  the value is the result of a {py3} expression.

* An <<leb128-integer,LEB128 integer>> of which the value is the result
  of a {py3} expression.

* A <<current-offset-setting,current offset setting>>.

* A <<current-offset-alignment,current offset alignment>>.

* A <<filling,filling>>.

* A <<label,label>>, that is, a named constant holding the current
  offset.
+
This is similar to an assembly label.

* A <<variable-assignment,variable assignment>> associating a name to
  the integral or floating point number result of an evaluated {py3}
  expression.

* A <<group,group>>, that is, a scoped sequence of items.

* A <<conditional-block,conditional block>>.

* A <<repetition-block,repetition block>>.

* A <<macro-definition-block,macro definition block>>.

* A <<macro-expansion,macro expansion>>.

Moreover, you can repeat many items above a constant or variable number
of times with the ``pass:[*]`` operator _after_ the item to repeat. This
is called a <<post-item-repetition,post-item repetition>>.

A Normand comment may exist:

* Between items, possibly within a group.
* Between the nibbles of a constant hexadecimal byte.
* Between the bits of a constant binary byte.
* Between the last item and the ``pass:[*]`` character of a post-item
  repetition, and between that ``pass:[*]`` character and the following
  number or expression.
* Between the ``!repeat``/``!r`` block opening and the following
  constant integer, name, or expression of a repetition block.
* Between the ``!if`` block opening and the following name or expression
  of a conditional block.

A comment is anything between two ``pass:[#]`` characters on the same
line, or from ``pass:[#]`` until the end of the line. Whitespace
characters and the following symbol characters are also considered
comments where a comment may exist:

----
/ \ ? & : ; . , [ ] _ = | -
----

The latter serve to improve readability so that you may write, for
example, a MAC address or a UUID as is.
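As a rough illustration of the symbol rule above, the following {py3}
sketch (not Normand's actual parser) drops whitespace and the listed
symbol characters from an input made only of hexadecimal byte constants,
then decodes the remaining nibbles. The `strip_symbols()` and
`hex_bytes()` names are hypothetical, and the sketch ignores
``pass:[#]`` comments and every other item type.

```python
# Hypothetical helper names: this only mimics the symbol-skipping rule
# for inputs made solely of hexadecimal byte constants.
IGNORED_SYMBOLS = set("/\\?&:;.,[]_=|-")


def strip_symbols(text: str) -> str:
    """Remove whitespace and the symbol characters treated as comments."""
    return "".join(
        ch for ch in text if not ch.isspace() and ch not in IGNORED_SYMBOLS
    )


def hex_bytes(text: str) -> bytes:
    """Decode what remains as consecutive two-digit hexadecimal bytes."""
    return bytes.fromhex(strip_symbols(text))


# A MAC address and an IPv6 address written as is
print(hex_bytes("60:57:18:a3:42:29").hex(" "))
print(hex_bytes("fe80::6257:18ff:fea3:4229").hex(" "))
```

Both calls print the bytes in the order in which the nibbles appear,
which matches the order rule stated in the introduction.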
578
[[const-int]] Many items require a _constant integer_, which may be
negative, in which case it starts with `-`. A positive constant integer
is any of:

Decimal::
  One or more digits (`0` to `9`).

Hexadecimal::
  One of:
+
* The `0x` or `0X` prefix followed by one or more hexadecimal digits
  (`0` to `9`, `a` to `f`, or `A` to `F`).
* One or more hexadecimal digits followed by the `h` or `H` suffix.

Octal::
  One of:
+
* The `0o` or `0O` prefix followed by one or more octal digits
  (`0` to `7`).
* One or more octal digits followed by the `o`, `O`, `q`, or `Q`
  suffix.

Binary::
  One of:
+
* The `0b` or `0B` prefix followed by one or more bits (`0` or `1`).
* One or more bits followed by the `b` or `B` suffix.
You can test the examples of this section with the `normand`
<<command-line-tool,command-line tool>> as such:

----
$ normand file | hexdump -C
----

where `file` is the name of a file containing the Normand input.
615
=== Byte constant

A _byte constant_ represents one or more constant bytes.

A byte constant is:

Hexadecimal form::
  Two consecutive hexadecimal digits representing a single byte.

Decimal form::
  One or more digits, optionally preceded by `-`, after the `$` prefix,
  representing a single byte.

Binary form:: {empty}
+
--
. __**N**__ `%` prefixes (at least one).
+
The number of `%` characters is the number of subsequent expected bytes.

. __**N**__{nbsp}×{nbsp}8 bits (`0` or `1`).
--
637
638====
639Input:
640
641----
642ab cd [3d 8F] CC
643----
644
645Output:
646
647----
648ab cd 3d 8f cc
649----
650====
651
652====
653Input:
654
655----
656$192 %1100/0011 $ -77
657----
658
659Output:
660
661----
662c0 c3 b3
663----
664====
665
666====
667Input:
668
669----
67058f64689-6316-4d55-8a1a-04cada366172
671fe80::6257:18ff:fea3:4229
672----
673
674Output:
675
676----
67758 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
678fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
679----
680====
681
682====
683Input:
684
685----
686%01110011 %01100001 %01101100 %01110101 %01110100
%%%1101:0010 11111111 #A#11 #B#00 #C#011 #D#1
688----
689
690Output:
691
692----
73 61 6c 75 74 d2 ff c7 ┆ salut•••
694----
695====
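To make the decimal and binary forms more concrete, here's a minimal
{py3} sketch of how such constants could map to bytes. It isn't
Normand's implementation: `decimal_byte()` and `binary_bytes()` are
hypothetical names, the accepted decimal range (-128 to 255) is an
assumption based on the examples above, and only
``pass:[#]``...``pass:[#]`` comments on a single line are handled.

```python
import re


def decimal_byte(value: int) -> bytes:
    """Encode the integer of a `$` constant as a single byte."""
    # Assumption: the accepted range is -128 to 255, negative values
    # using two's complement.
    if not -128 <= value <= 255:
        raise ValueError("{} doesn't fit a single byte".format(value))

    return bytes([value & 0xFF])


def binary_bytes(text: str) -> bytes:
    """Encode a binary form such as `%%1101:0010 #x# 1111_0000`."""
    prefix_len = len(text) - len(text.lstrip("%"))

    # Drop `#...#` comments first, then keep only the bits.
    rest = re.sub(r"#[^#\n]*#", "", text[prefix_len:])
    bits = "".join(ch for ch in rest if ch in "01")

    if prefix_len < 1 or len(bits) != prefix_len * 8:
        raise ValueError("expecting 8 bits per `%` prefix")

    return int(bits, 2).to_bytes(prefix_len, "big")
```

With these helpers, `decimal_byte(-77)` yields the byte `b3` and the
three-byte binary constant of the last example yields `d2 ff c7`.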
696
697=== Literal string
698
699A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
700bytes of a string.
701
702The string to encode isn't implicitly null-terminated: use `\0` at the
703end of the string to add a null character.
704
705A literal string is:
706
707. **Optional**: one of the following encodings instead of UTF-8:
708+
709--
710[horizontal]
711`u16be`:: UTF-16BE.
712`u16le`:: UTF-16LE.
713`u32be`:: UTF-32BE.
714`u32le`:: UTF-32LE.
715--
716
717. The ``pass:["]`` prefix.
718
719. A sequence of zero or more characters, possibly containing escape
720 sequences.
721+
722An escape sequence is the ``\`` character followed by one of:
723+
724--
725[horizontal]
726`0`:: Null (U+0000)
727`a`:: Alert (U+0007)
728`b`:: Backspace (U+0008)
729`e`:: Escape (U+001B)
730`f`:: Form feed (U+000C)
731`n`:: End of line (U+000A)
732`r`:: Carriage return (U+000D)
733`t`:: Character tabulation (U+0009)
734`v`:: Line tabulation (U+000B)
735``\``:: Reverse solidus (U+005C)
736``pass:["]``:: Quotation mark (U+0022)
737--
738
739. The ``pass:["]`` suffix.
740
741====
742Input:
743
744----
745"coucou tout le monde!"
746----
747
748Output:
749
750----
75163 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
7526f 6e 64 65 21 ┆ onde!
753----
754====
755
756====
757Input:
758
759----
760u16le"I am not young enough to know everything."
761----
762
763Output:
764
765----
76649 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
76720 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
7686e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
76920 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
77065 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
7712e 00 ┆ .•
772----
773====
774
775====
776Input:
777
778----
779u32be "\"illusion is the first\nof all pleasures\" 🦉"
780----
781
782Output:
783
784----
78500 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
78600 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
78700 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
78800 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
78900 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
79000 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
79100 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
79200 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
79300 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
79400 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
79500 00 00 20 00 01 f9 89 ┆ ••• ••••
796----
797====
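The encoding prefixes correspond to standard codecs, which the following
{py3} sketch illustrates. It assumes the escape sequences are already
resolved; the `CODECS` table and the `encode_literal()` function are
hypothetical names for illustration only.

```python
# Map each Normand encoding prefix to a Python codec name; no prefix
# means UTF-8.
CODECS = {
    "": "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}


def encode_literal(text: str, prefix: str = "") -> bytes:
    """Encode an already-unescaped string; no terminator is added."""
    return text.encode(CODECS[prefix])


# UTF-8 with an explicit null character at the end
print(encode_literal("hello world!\0").hex(" "))

# U+1F989 (owl) as UTF-32BE
print(encode_literal("\U0001F989", "u32be").hex(" "))
```

The explicit `-be` and `-le` codec names matter here: the plain
`utf-16` and `utf-32` Python codecs would prepend a byte order mark.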
798
799=== Current byte order setting
800
801This special item sets the <<cur-bo,_current byte order_>>.
802
803The two accepted forms are:
804
805[horizontal]
806``pass:[{be}]``:: Set the current byte order to big endian.
807``pass:[{le}]``:: Set the current byte order to little endian.
808
=== Fixed-length number

A _fixed-length number_ represents a fixed number of bytes encoding
either:

* An unsigned or signed integer (two's complement).
+
The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64{nbsp}bits.

* A floating point number
  (https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
+
The available lengths are 32{nbsp}bits (_binary32_) and 64{nbsp}bits
(_binary64_).

The value is the result of evaluating a {py3} expression; Normand
encodes it using the <<cur-bo,current byte order>>.

A fixed-length number is:

. The ``pass:[{]`` prefix.

. A valid {py3} expression.
+
For a fixed-length number at some source location{nbsp}__**L**__, this
expression may contain the name of any accessible <<label,label>> (not
within a nested group), including the name of a label defined
after{nbsp}__**L**__, as well as the name of any
<<variable-assignment,variable>> known at{nbsp}__**L**__.
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before encoding the number).

. The `:` character.

. An encoding length in bits amongst:
+
--
The expression evaluates to an `int` or `bool` value::
  `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
+
NOTE: Normand automatically converts a `bool` value to `int`.

The expression evaluates to a `float` value::
  `32` and `64`.
--

. The `}` suffix.
856
857====
858Input:
859
860----
861{le} {345:16}
862{be} {-0xabcd:32}
863----
864
865Output:
866
867----
86859 01 ff ff 54 33
869----
870====
871
872====
873Input:
874
875----
876{be}
877
878# String length in bits
879{8 * (str_end - str_beg) : 16}
880
881# String
882<str_beg>
883 "hello world!"
884<str_end>
885----
886
887Output:
888
889----
89000 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
891----
892====
893
894====
895Input:
896
897----
898{20 - ICITTE : 8} * 10
899----
900
901Output:
902
903----
90414 13 12 11 10 0f 0e 0d 0c 0b
905----
906====
907
908====
909Input:
910
911----
912{le}
913{2 * 0.0529 : 32}
914----
915
916Output:
917
918----
919ac ad d8 3d
920----
921====
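The encoding of a fixed-length number can be approximated with the
{py3} standard library, as in the sketch below. This is an illustration
under stated assumptions, not Normand's code: `encode_fixed()` is a
hypothetical name, and the accepted integer range for a length of
__**N**__{nbsp}bits (-2^__N__-1^ to 2^__N__^{nbsp}-{nbsp}1) is inferred
from the examples of this document.

```python
import struct


def encode_fixed(value, length: int, byte_order: str) -> bytes:
    """Encode `value` on `length` bits; `byte_order` is "big" or "little"."""
    if isinstance(value, float):
        if length not in (32, 64):
            raise ValueError("a float needs a length of 32 or 64 bits")

        fmt = (">" if byte_order == "big" else "<") + ("f" if length == 32 else "d")
        return struct.pack(fmt, value)

    value = int(value)  # also converts `bool` to `int`

    if length not in range(8, 65, 8):
        raise ValueError("an integer needs a length of 8, 16, ..., or 64 bits")

    # Assumption: a length of N bits accepts -(2 ** (N - 1)) to 2 ** N - 1.
    if not -(1 << (length - 1)) <= value < (1 << length):
        raise ValueError("{} is outside the {}-bit range".format(value, length))

    # Masking yields the two's complement pattern of a negative value.
    return (value & ((1 << length) - 1)).to_bytes(length // 8, byte_order)
```

For instance, `encode_fixed(345, 16, "little")` yields `59 01` and
`encode_fixed(-0xabcd, 32, "big")` yields `ff ff 54 33`, as in the first
example of this section.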
922
923=== LEB128 integer
924
925An _LEB128 integer_ represents a variable number of bytes encoding an
926unsigned or signed integer which is the result of evaluating a {py3}
927expression following the https://en.wikipedia.org/wiki/LEB128[LEB128]
928format.
929
930An LEB128 integer is:
931
932. The ``pass:[{]`` prefix.
933
934. A valid {py3} expression of which the evaluation result type
935 is `int` or `bool` (automatically converted to `int`).
936+
937For an LEB128 integer at some source location{nbsp}__**L**__, this
938expression may contain:
939+
940--
941* The name of any <<label,label>> defined before{nbsp}__**L**__
942 which isn't within a nested group.
943* The name of any <<variable-assignment,variable>> known
944 at{nbsp}__**L**__.
945--
946+
947The value of the special name `ICITTE` (`int` type) in this expression
948is the <<cur-offset,current offset>> (before encoding the integer).
949
950. The `:` character.
951
952. One of:
953+
954--
955[horizontal]
956`uleb128`:: Use the unsigned LEB128 format.
957`sleb128`:: Use the signed LEB128 format.
958--
959
960. The `}` suffix.
961
962====
963Input:
964
965----
966{624485 : uleb128}
967----
968
969Output:
970
971----
972e5 8e 26
973----
974====
975
976====
977Input:
978
979----
980aa bb cc dd
981<meow>
982ee ff
983{-981238311 + (meow * -23) : sleb128}
984"hello"
985----
986
987Output:
988
989----
990aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
991----
992====
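For readers unfamiliar with LEB128, here's a minimal {py3} sketch of
both variants, following the usual description of the format (groups of
seven bits, least significant group first, with the high bit of each
byte signalling that more bytes follow). The `uleb128()` and `sleb128()`
function names are chosen for this illustration and aren't part of the
Normand API.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 can't encode a negative integer")

    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7

        if value:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value: int) -> bytes:
    """Encode an integer as signed LEB128."""
    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign
        sign_bit_set = bool(byte & 0x40)

        # Stop once the remaining value is pure sign extension.
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)

        out.append(byte | 0x80)
```

`uleb128(624485)` yields `e5 8e 26`, and
`sleb128(-981238311 + 4 * -23)` yields `fd fa 8d ac 7c`, matching the
two examples above (where the `meow` label is 4).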
993
994=== Current offset setting
995
996This special item sets the <<cur-offset,_current offset_>>.
997
998A current offset setting is:
999
1000. The `<` prefix.
1001
1002. A <<const-int,positive constant integer>> which is the new current
1003 offset.
1004
1005. The `>` suffix.
1006
1007====
1008Input:
1009
1010----
1011 {ICITTE : 8} * 8
1012<0x61> {ICITTE : 8} * 8
1013----
1014
1015Output:
1016
1017----
101800 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
1019----
1020====
1021
1022====
1023Input:
1024
1025----
1026aa bb cc dd <meow> ee ff
1027<12> 11 22 33 <mix> 44 55
1028{meow : 8} {mix : 8}
1029----
1030
1031Output:
1032
1033----
1034aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
1035----
1036====
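The interplay between generated bytes, current offset settings, and
labels in the last example can be modelled with a few lines of {py3}.
This toy model is only a sketch: the `run()` function and its item
representation (`bytes` for data, `int` for a current offset setting,
`str` for a label name) are assumptions made for this illustration.

```python
def run(items, offset=0):
    """Return `(data, labels)` for a list of simplified items.

    Each item is `bytes` (data), an `int` (current offset setting), or
    a `str` (label name).
    """
    data = bytearray()
    labels = {}

    for item in items:
        if isinstance(item, bytes):
            data += item
            offset += len(item)  # each generated byte increments it
        elif isinstance(item, int):
            offset = item  # changes the offset, generates nothing
        else:
            labels[item] = offset

    return bytes(data), labels


# aa bb cc dd <meow> ee ff <12> 11 22 33 <mix> 44 55
data, labels = run([
    bytes.fromhex("aabbccdd"), "meow", bytes.fromhex("eeff"),
    12, bytes.fromhex("112233"), "mix", bytes.fromhex("4455"),
])
print(labels)
```

The model shows why `mix` is 15 rather than 9: the `<12>` setting moves
the current offset without generating any byte.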
1037
1038=== Current offset alignment
1039
A _current offset alignment_ represents zero or more padding bytes to
1041make the <<cur-offset,current offset>> meet a given
1042https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.
1043
1044More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
1045a current offset alignment represents the required padding bytes until
1046the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.
1047
1048A current offset alignment is:
1049
1050. The `@` prefix.
1051
1052. A <<const-int,positive constant integer>> which is the alignment value
1053 in _bits_.
1054+
1055This value must be greater than zero and a multiple of{nbsp}8.
1056
1057. **Optional**:
1058+
1059--
1060. The ``pass:[~]`` prefix.
1061. A <<const-int,positive constant integer>> which is the value of the
1062 byte to use as padding to align the <<cur-offset,current offset>>.
1063--
1064+
1065Without this section, the padding byte value is zero.
1066
1067====
1068Input:
1069
1070----
107111 22 (@32 aa bb cc) * 3
1072----
1073
1074Output:
1075
1076----
107711 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
1078----
1079====
1080
1081====
1082Input:
1083
1084----
1085{le}
108677 88
1087@32~0xcc {-893.5:32}
1088@128~0x55 "meow"
1089----
1090
1091Output:
1092
1093----
109477 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
10956d 65 6f 77 ┆ meow
1096----
1097====
1098
1099====
1100Input:
1101
1102----
1103aa bb cc <29> @64~255 "zoom"
1104----
1105
1106Output:
1107
1108----
1109aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
1110----
1111====
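The number of padding bytes follows directly from the rule above: pad
until the current offset is a multiple of
__**N**__{nbsp}/{nbsp}8. A minimal {py3} sketch of that arithmetic,
with `alignment_padding()` as a hypothetical helper name:

```python
def alignment_padding(offset: int, align_bits: int, pad_byte: int = 0) -> bytes:
    """Return the padding making `offset` a multiple of `align_bits / 8`."""
    if align_bits <= 0 or align_bits % 8 != 0:
        raise ValueError("the alignment must be a positive multiple of 8")

    align_bytes = align_bits // 8

    # `-offset % n` is the distance to the next multiple of `n`.
    return bytes([pad_byte]) * (-offset % align_bytes)
```

In the last example, the current offset is 29 when Normand reaches
`@64~255`, so `alignment_padding(29, 64, 255)` returns three `ff` bytes
to reach offset 32.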
1112
1113=== Filling
1114
1115A _filling_ represents zero or more padding bytes to make the
1116<<cur-offset,current offset>> reach a given value.
1117
1118A filling is:
1119
1120. The ``pass:[+]`` prefix.
1121
1122. One of:
1123
1124** A <<const-int,positive constant integer>> which is the current offset
1125 target.
1126
1127** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1128 evaluation result type is `int` or `bool` (automatically converted to
1129 `int`), and the ``pass:[}]`` suffix.
1130+
1131For a filling at some source location{nbsp}__**L**__, this expression
1132may contain:
1133+
1134--
1135* The name of any <<label,label>> defined before{nbsp}__**L**__
1136 which isn't within a nested group.
1137* The name of any <<variable-assignment,variable>> known
1138 at{nbsp}__**L**__.
1139--
1140+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before generating the padding
bytes).
1144
1145** A valid {py3} name.
1146+
1147For the name `__NAME__`, this is equivalent to the
1148`pass:[{]__NAME__pass:[}]` form above.
1149
1150+
1151This value must be greater than or equal to the current offset where
1152it's used.
1153
1154. **Optional**:
1155+
1156--
1157. The ``pass:[~]`` prefix.
1158. A <<const-int,positive constant integer>> which is the value of the
1159 byte to use as padding to reach the current offset target.
1160--
1161+
1162Without this section, the padding byte value is zero.
1163
1164====
1165Input:
1166
1167----
1168aa bb cc dd
1169+0x40
1170"hello world"
1171----
1172
1173Output:
1174
1175----
1176aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
117700 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
117800 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
117900 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
118068 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
1181----
1182====
1183
1184====
1185Input:
1186
1187----
1188!macro part(iter, fill)
1189 <0> "particular security " {ord('0') + iter : 8} +fill~0x80
1190!end
1191
1192{iter = 1}
1193
1194!repeat 5
1195 m:part(iter, {32 + 4 * iter})
1196 {iter = iter + 1}
1197!end
1198----
1199
1200Output:
1201
1202----
120370 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur
120469 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1•••••••••••
120580 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s
120665 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2•••••••
120780 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part
120869 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security
120933 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3•••••••••••••••
121080 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
121161 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4•••
121280 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
121380 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
121461 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5•••
121580 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
121680 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
1217----
1218====
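The padding rule above can be sketched in plain {py3}. This is only an illustration, not Normand's implementation: `pad_to()` is a hypothetical helper, and the sketch assumes that the first generated byte sits at offset 0.

```python
def pad_to(data: bytearray, target_offset: int, pad_byte: int = 0) -> None:
    """Append `pad_byte` to `data` until its length reaches `target_offset`.

    Hypothetical helper: assumes the first byte of `data` is at offset 0.
    """
    # The target must be greater than or equal to the current offset.
    if target_offset < len(data):
        raise ValueError("target offset is before the current offset")

    data.extend(bytes([pad_byte]) * (target_offset - len(data)))


# Mimics: aa bb cc dd  +0x40  "hello world"
out = bytearray.fromhex("aabbccdd")
pad_to(out, 0x40)          # no `~` part: the padding byte is zero
out += b"hello world"
```

With these inputs, `out` holds the four initial bytes, 60 zero bytes, and then the string, which matches the first example above.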
1219
=== Label

A _label_ associates a name to the <<cur-offset,current offset>>.

All the labels of a whole Normand input must have unique names.

A label must not share the name of a <<variable-assignment,variable>>.

A label is:

. The `<` prefix.

. A valid {py3} name which is not `ICITTE`.

. The `>` suffix.
1236
=== Variable assignment

A _variable assignment_ associates a name to the integral or floating
point number result of an evaluated {py3} expression.
1241
A variable assignment is:

. The ``pass:[{]`` prefix.

. A valid {py3} name which is not `ICITTE`.

. The `=` character.

. A valid {py3} expression of which the evaluation result type
  is `int`, `float`, or `bool` (automatically converted to `int`).
+
For a variable assignment at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>>.

. The `}` suffix.
1267
1268====
1269Input:
1270
1271----
1272{mix = 101} {le}
1273{meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
1274"yooo" {meow + mix : 16}
1275----
1276
1277Output:
1278
1279----
128011 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
1281----
1282====
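To see where each output byte of this example comes from, the following {py3} sketch replays the input by hand. It isn't how Normand evaluates its input: each statement mirrors one item, and the current offset (`ICITTE`) is assumed to be the length of the data generated so far (initial offset 0).

```python
out = bytearray()

mix = 101                                     # {mix = 101}
byte_order = "little"                         # {le}
meow = 42                                     # {meow = 42}
out += bytes([0x11, 0x22])                    # 11 22
out += meow.to_bytes(1, byte_order)           # {meow:8}
out += bytes([0x33])                          # 33
meow = len(out) + 17                          # {meow = ICITTE + 17}, ICITTE is 4
out += b"yooo"                                # "yooo"
out += (meow + mix).to_bytes(2, byte_order)   # {meow + mix : 16}

print(out.hex(" "))  # 11 22 2a 33 79 6f 6f 6f 7a 00
```

The second assignment replaces the value of `meow` (42) with 21, so the last item encodes 122 (`7a 00` in little endian).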
1283
=== Group

A _group_ is a scoped sequence of items.

The <<label,labels>> within a group aren't visible outside of it.

The main purpose of a group is to <<post-item-repetition,repeat>> more
than a single item and to isolate labels.

A group is:

. The `(`, `!group`, or `!g` opening.

. Zero or more items.

. Depending on the group opening:
+
--
`(`::
    The `)` closing.

`!group`::
`!g`::
    The `!end` closing.
--
1309
1310====
1311Input:
1312
1313----
1314((aa bb cc) dd () ee) "leclerc"
1315----
1316
1317Output:
1318
1319----
1320aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
1321----
1322====
1323
1324====
1325Input:
1326
----
!group
  (aa bb cc) * 3 dd ee
!end * 5
----
1332
1333Output:
1334
1335----
1336aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
1337cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
1338ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
1339bb cc aa bb cc dd ee
1340----
1341====
1342
1343====
1344Input:
1345
1346----
1347{be}
1348(
1349 <str_beg> u16le"sébastien diaz" <str_end>
1350 {ICITTE - str_beg : 8}
1351 {(end - str_beg) * 5 : 24}
1352) * 3
1353<end>
1354----
1355
1356Output:
1357
1358----
135973 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
13606e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
136173 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
13626e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
136373 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
13646e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
1365----
1366====
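The last example relies on labels being local to each repetition of the group (`str_beg`) while `end` belongs to the outer level. A {py3} sketch of the same computation, under the assumption that the initial offset is 0, could look as follows. The value of `end` is precomputed by hand here, which is a shortcut of the sketch and not a description of how Normand resolves labels.

```python
text = "s\u00e9bastien diaz".encode("utf-16-le")  # u16le"sébastien diaz"
group_size = len(text) + 1 + 3   # string + 8-bit number + 24-bit number
repeat_count = 3
end = repeat_count * group_size  # offset of the <end> label

out = bytearray()

for _ in range(repeat_count):    # ( ... ) * 3
    str_beg = len(out)           # <str_beg>: a new value for each repetition
    out += text
    # {ICITTE - str_beg : 8}
    out += (len(out) - str_beg).to_bytes(1, "big")
    # {(end - str_beg) * 5 : 24}, big endian because of {be}
    out += ((end - str_beg) * 5).to_bytes(3, "big")
```

Each repetition produces 32 bytes, and only the trailing 24-bit number changes because `str_beg` moves while `end` stays at 96.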
1367
=== Conditional block

A _conditional block_ represents either the bytes of zero or more items
if some expression is true, or the bytes of zero or more other items if
it's false.

A conditional block is:

. The `!if` opening.

. One of:

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a conditional block at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the contained
items).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

. Zero or more items to be handled when the condition is true.

. **Optional**:

.. The `!else` opening.
.. Zero or more items to be handled when the condition is false.

. The `!end` closing.
1411
====
Input:

----
{at = 1}
{rep_count = 9}

!repeat rep_count
  "meow "

  !if {ICITTE > 25}
    "mix"
  !else
    "zoom"
  !end

  !if {at < rep_count} 20 !end

  {at = at + 1}
!end
----

Output:

----
6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 6f 77 20 7a ┆ meow zoom meow z
6f 6f 6d 20 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 ┆ oom meow zoom me
6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 78 20 ┆ ow mix meow mix
6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 ┆ meow mix meow mi
78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
6d 69 78                                        ┆ mix
----
====
1445
1446====
1447Input:
1448
1449----
1450<str_beg>
1451u16le"meow mix!"
1452<str_end>
1453
1454!if {str_end - str_beg > 10}
1455 " BIG"
1456!end
1457----
1458
1459Output:
1460
1461----
14626d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
146321 00 20 42 49 47 ┆ !• BIG
1464----
1465====
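The first example above switches from `"zoom"` to `"mix"` once the current offset exceeds 25. The following {py3} sketch replays it by hand to show when that happens. It assumes an initial offset of 0 and uses the length of the generated data as a stand-in for `ICITTE`.

```python
out = bytearray()
at = 1            # {at = 1}
rep_count = 9     # {rep_count = 9}

for _ in range(rep_count):              # !repeat rep_count
    out += b"meow "
    icitte = len(out)                   # ICITTE when the `!if` is evaluated
    out += b"mix" if icitte > 25 else b"zoom"

    if at < rep_count:                  # !if {at < rep_count} 20 !end
        out.append(0x20)

    at += 1                             # {at = at + 1}
```

The condition is evaluated at offsets 5, 15, and 25 for the first three repetitions, none of which is greater than 25, hence three `"zoom"` strings followed by six `"mix"` strings.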
1466
=== Repetition block

A _repetition block_ represents the bytes of zero or more items repeated
a given number of times.

A repetition block is:

. The `!repeat` or `!r` opening.

. One of:

** A <<const-int,positive constant integer>> which is the number of
   times to repeat the items of the block.

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a repetition block at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the items to
repeat).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

. Zero or more items.

. The `!end` closing.

You may also use a <<post-item-repetition,post-item repetition>> after
some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.
1511
1512====
1513Input:
1514
----
!repeat 0o400
  {end - ICITTE - 1 : 8}
!end

<end>
----
1522
1523Output:
1524
1525----
1526ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1527ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1528df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1529cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1530bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1531af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
15329f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
15338f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
15347f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
15356f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
15365f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
15374f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
15383f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
15392f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
15401f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
15410f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1542----
1543====
1544
1545====
1546Input:
1547
----
{times = 1}

aa bb cc dd

!repeat 3
  <here>

  !repeat {here + 1}
    ee ff
  !end

  11 22 !repeat times 33 !end

  {times = times + 1}
!end

"coucou!"
----
1567
1568Output:
1569
1570----
1571aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
157233 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1573ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1574ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1575ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1576ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1577ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1578ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1579ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1580ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1581ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
158233 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1583----
1584====
1585
1586=== Macro definition block
1587
1588A _macro definition block_ associates a name and parameter names to
1589a group of items.
1590
1591A macro definition block doesn't lead to generated bytes itself: a
1592<<macro-expansion,macro expansion>> does so.
1593
1594A macro definition may only exist at the root level, that is, not within
1595a <<group,group>>, a <<repetition-block,repetition block>>, a
1596<<conditional-block,conditional block>>, or another
1597<<macro-definition-block,macro definition block>>.
1598
1599All macro definitions must have unique names.
1600
1601A macro definition is:
1602
1603. The `!macro` or `!m` opening.
1604
1605. A valid {py3} name (the macro name).
1606
1607. The `(` parameter name list prefix.
1608
1609. A comma-separated list of zero or more unique parameter names,
1610 each one being a valid {py3} name.
1611
1612. The `)` parameter name list suffix.
1613
1614. Zero or more items except, recursively, a macro definition block.
1615
1616. The `!end` closing.
1617
1618====
1619----
1620!macro bake()
1621 {le} {ICITTE * 8 : 16}
1622 u16le"predict explode"
1623!end
1624----
1625====
1626
1627====
1628----
1629!macro nail(rep, with_extra, val)
1630 {iter = 1}
1631
1632 !repeat rep
1633 {val + iter : uleb128}
1634 {0xdeadbeef : 32}
1635 {iter = iter + 1}
1636 !end
1637
1638 !if with_extra
1639 "meow mix\0"
1640 !end
1641!end
1642----
1643====
1644
1645=== Macro expansion
1646
1647A _macro expansion_ expands the items of a defined
1648<<macro-definition-block,macro>>.
1649
1650The macro to expand must be defined _before_ the expansion.
1651
1652The <<state,state>> before handling the first item of the chosen macro
1653is:
1654
1655<<cur-offset,Current offset>>::
1656 Unchanged.
1657
1658<<cur-bo,Current byte order>>::
1659 Unchanged.
1660
1661Variables::
1662 The only available variables initially are the macro parameters.
1663
1664Labels::
1665 None.
1666
1667The state after having handled the last item of the chosen macro is:
1668
1669Current offset::
1670 The one before handling the first item of the macro plus the size
1671 of the generated data of the macro expansion.
1672+
1673IMPORTANT: This means <<current-offset-setting,current offset setting>>
1674items within the expanded macro don't impact the final current offset.
1675
1676Current byte order::
1677 The one before handling the first item of the macro.
1678
1679Variables::
1680 The ones before handling the first item of the macro.
1681
1682Labels::
1683 The ones before handling the first item of the macro.
1684
1685A macro expansion is:
1686
1687. The `m:` prefix.
1688
1689. A valid {py3} name (the name of the macro to expand).
1690
1691. The `(` parameter value list prefix.
1692
. A comma-separated list of zero or more parameter values.
1694+
1695The number of parameter values must match the number of parameter
1696names of the definition of the chosen macro.
1697+
1698A parameter value is one of:
1699+
1700--
* A <<const-int,constant integer>>, possibly negative.
1702
1703* The ``pass:[{]`` prefix, a valid {py3} expression of which the
1704 evaluation result type is `int` or `bool` (automatically converted to
1705 `int`), and the ``pass:[}]`` suffix.
1706+
1707For a macro expansion at some source location{nbsp}__**L**__, this
1708expression may contain:
1709
1710** The name of any <<label,label>> defined before{nbsp}__**L**__
1711 which isn't within a nested group.
1712** The name of any <<variable-assignment,variable>> known
1713 at{nbsp}__**L**__.
1714
1715+
1716The value of the special name `ICITTE` (`int` type) in this expression
1717is the <<cur-offset,current offset>> (before handling the items of the
1718chosen macro).
1719
1720* A valid {py3} name.
1721+
1722For the name `__NAME__`, this is equivalent to the
1723`pass:[{]__NAME__pass:[}]` form above.
1724--
1725
1726. The `)` parameter value list suffix.
1727
1728====
1729Input:
1730
1731----
1732!macro bake()
1733 {le} {ICITTE * 8 : 16}
1734 u16le"predict explode"
1735!end
1736
1737"hello [" m:bake() "] world"
1738
1739m:bake() * 5
1740----
1741
1742Output:
1743
1744----
174568 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
174600 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
174700 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
174870 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
174965 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
175070 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
175165 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
175270 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
175365 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
175470 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
175565 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
175670 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
175765 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
1758----
1759====
1760
1761====
1762Input:
1763
1764----
1765!macro A(val, is_be)
1766 {le}
1767
1768 !if is_be
1769 {be}
1770 !end
1771
1772 {val : 16}
1773!end
1774
1775!macro B(rep, is_be)
1776 {iter = 1}
1777
1778 !repeat rep
1779 m:A({iter * 3}, is_be)
1780 {iter = iter + 1}
1781 !end
1782!end
1783
1784m:B(5, 1)
1785m:B(3, 0)
1786----
1787
1788Output:
1789
1790----
179100 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
1792----
1793====
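The state rules of a macro expansion resemble those of a function call: parameters and variables are local, and the byte order chosen inside doesn't leak out. The {py3} sketch below mirrors the `A` and `B` macros of the last example with two hypothetical functions, `expand_a()` and `expand_b()`. It only illustrates the scoping, not how Normand expands macros.

```python
def expand_a(out: bytearray, val: int, is_be: int) -> None:
    # m:A(val, is_be): the byte order set here is local to the expansion.
    byte_order = "little"                 # {le}

    if is_be:                             # !if is_be {be} !end
        byte_order = "big"

    out += val.to_bytes(2, byte_order)    # {val : 16}


def expand_b(out: bytearray, rep: int, is_be: int) -> None:
    # m:B(rep, is_be): `iter_` only exists during the expansion.
    iter_ = 1                             # {iter = 1}

    for _ in range(rep):                  # !repeat rep
        expand_a(out, iter_ * 3, is_be)   # m:A({iter * 3}, is_be)
        iter_ += 1                        # {iter = iter + 1}


out = bytearray()
expand_b(out, 5, 1)   # m:B(5, 1)
expand_b(out, 3, 0)   # m:B(3, 0)
```

The second call starts again with `iter_` set to 1, just as the second `m:B()` expansion starts without the variables of the first one.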
1794
=== Post-item repetition

A _post-item repetition_ represents the bytes of an item repeated a
given number of times.

A post-item repetition is:

. One of these items:

** A <<byte-constant,byte constant>>.
** A <<literal-string,literal string>>.
** A <<fixed-length-number,fixed-length number>>.
** An <<leb128-integer,LEB128 integer>>.
** A <<macro-expansion,macro expansion>>.
** A <<group,group>>.

. The ``pass:[*]`` character.

. One of:

** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
   which is the number of times to repeat the previous item.

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a post-item repetition at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group and
  which isn't part of the repeated item.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__ which isn't part of the repeated item.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the item to
repeat).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

You may also use a <<repetition-block,repetition block>>. The form
``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.
1846
1847====
1848Input:
1849
1850----
1851{end - ICITTE - 1 : 8} * 0x100 <end>
1852----
1853
1854Output:
1855
1856----
1857ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1858ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1859df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1860cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1861bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1862af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
18639f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
18648f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
18657f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
18666f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
18675f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
18684f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
18693f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
18702f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
18711f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
18720f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1873----
1874====
1875
1876====
1877Input:
1878
1879----
1880{times = 1}
1881aa bb cc dd
1882(
1883 <here>
1884 (ee ff) * {here + 1}
1885 11 22 33 * {times}
1886 {times = times + 1}
1887) * 3
1888"coucou!"
1889----
1890
1891Output:
1892
1893----
1894aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
189533 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1896ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1897ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1898ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1899ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1900ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1901ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1902ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1903ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1904ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
190533 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1906----
1907====
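In the first example above, the repeated item is evaluated again for each repetition, with `ICITTE` advancing by one byte every time. A minimal {py3} sketch of that behaviour follows. It assumes an initial offset of 0, and the value of the `end` label is computed by hand.

```python
count = 0x100
end = count              # offset of <end>: 256 one-byte items starting at 0

out = bytearray()

for _ in range(count):                 # ... * 0x100
    icitte = len(out)                  # ICITTE for this repetition
    out.append(end - icitte - 1)       # {end - ICITTE - 1 : 8}
```

The result counts down from `ff` to `00`, as in the output above.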
1908
1909== Command-line tool
1910
1911If you <<install-normand,installed>> the `normand` package, then you
1912can use the `normand` command-line tool:
1913
1914----
1915$ normand <<< '"ma gang de malades"' | hexdump -C
1916----
1917
1918----
191900000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
192000000010 65 73 |es|
1921----
1922
1923If you copy the `normand.py` module to your own project, then you can
1924run the module itself:
1925
1926----
1927$ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
1928----
1929
1930----
193100000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
193200000010 65 73 |es|
1933----
1934
1935Without a path argument, the `normand` tool reads from the standard
1936input.
1937
1938The `normand` tool prints the generated binary data to the standard
1939output.
1940
1941Various options control the initial <<state,state>> of the processor:
1942use the `--help` option to learn more.
1943
1944== {py3} API
1945
The whole `normand` package/module public API is:

[source,python]
----
# Byte order.
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


# Text location.
class TextLocation:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


# Parsing error message.
class ParseErrorMessage:
    # Message text.
    @property
    def text(self):
        ...

    # Source text location.
    @property
    def text_location(self):
        ...


# Parsing error.
class ParseError(RuntimeError):
    # Parsing error messages.
    #
    # The first message is the most _specific_ one.
    @property
    def messages(self):
        ...


# Variables dictionary type (for type hints).
VariablesT = typing.Dict[str, typing.Union[int, float]]


# Labels dictionary type (for type hints).
LabelsT = typing.Dict[str, int]


# Parsing result.
class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> VariablesT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> LabelsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


# Parses the `normand` input using the initial state defined by
# `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
# and returns the corresponding parsing result.
def parse(normand: str,
          init_variables: typing.Optional[VariablesT] = None,
          init_labels: typing.Optional[LabelsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
----
2041
2042The `normand` parameter is the actual <<learn-normand,Normand input>>
2043while the other parameters control the initial <<state,state>>.
2044
2045The `parse()` function raises a `ParseError` instance should it fail to
2046parse the `normand` string for any reason.
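As an illustration of how a caller might report a parsing error, the sketch below formats every message of a `ParseError`-like object. It only relies on the `messages`, `text`, `text_location`, `line_no`, and `col_no` properties listed above. The `format_parse_error()` helper, the stand-in error object, and its message text are hypothetical. In real code, you would pass the exception caught with `except normand.ParseError as exc`.

```python
import types


def format_parse_error(exc) -> list:
    """Return one `LINE:COL - MESSAGE` string per message of `exc`."""
    lines = []

    for msg in exc.messages:
        loc = msg.text_location
        lines.append("{}:{} - {}".format(loc.line_no, loc.col_no, msg.text))

    return lines


# Stand-in with the same attributes as a `ParseError` instance, so that
# this sketch runs without the `normand` package (made-up message text).
fake_error = types.SimpleNamespace(
    messages=[
        types.SimpleNamespace(
            text="Expecting an item",
            text_location=types.SimpleNamespace(line_no=3, col_no=7),
        )
    ]
)

print(format_parse_error(fake_error))  # ['3:7 - Expecting an item']
```

The `LINE:COL - MESSAGE` layout is chosen only for readability. Nothing in the API requires it.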
2047
2048== Development
2049
2050Normand is a https://python-poetry.org/[Poetry] project.
2051
2052To develop it, install it through Poetry and enter the virtual
2053environment:
2054
2055----
2056$ poetry install
2057$ poetry shell
2058$ normand <<< '"lol" * 10 0a'
2059----
2060
2061`normand.py` is processed by:
2062
2063* https://microsoft.github.io/pyright/[Pyright]
2064* https://github.com/psf/black[Black]
2065* https://pycqa.github.io/isort/[isort]
2066
2067=== Testing
2068
2069Use https://docs.pytest.org/[pytest] to test Normand once the package is
2070part of your virtual environment, for example:
2071
2072----
2073$ poetry install
2074$ poetry run pip3 install pytest
2075$ poetry run pytest
2076----
2077
The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.
2081
2082In the `tests` directory, each `*.nt` file is a test. The file name
2083prefix indicates what it's meant to test:
2084
2085`pass-`::
2086 Everything above the `---` line is the valid Normand input
2087 to test.
2088+
2089Everything below the `---` line is the expected data
2090(whitespace-separated hexadecimal bytes).
2091
2092`fail-`::
2093 Everything above the `---` line is the invalid Normand input
2094 to test.
2095+
2096Everything below the `---` line is the expected error message having
2097this form:
2098+
2099----
2100LINE:COL - MESSAGE
2101----
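To make the `pass-` layout concrete, here's a small {py3} sketch which splits the contents of such a file into its two parts. It isn't the project's actual test runner: `split_pass_test()` is a hypothetical helper, and it assumes that the separator is a line containing exactly `---`.

```python
def split_pass_test(contents: str):
    """Split a `pass-` test into its Normand input and expected bytes."""
    lines = contents.splitlines()
    sep_index = lines.index("---")    # first line which is exactly `---`
    normand_input = "\n".join(lines[:sep_index])

    # The expected data is whitespace-separated hexadecimal bytes.
    expected = bytes.fromhex(" ".join(lines[sep_index + 1:]))
    return normand_input, expected


sample = 'aa bb "hi"\n---\naa bb\n68 69\n'
text, expected = split_pass_test(sample)
```

Here `text` is the Normand input to parse and `expected` is the data to compare against the generated bytes.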
2102
2103=== Contributing
2104
2105Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
2106for code review.
2107
2108To report a bug, https://github.com/efficios/normand/issues/new[create a
2109GitHub issue].