// Show ToC at a specific location for a GitHub rendering
ifdef::env-github[]
:toc: macro
endif::env-github[]

ifndef::env-github[]
:toc: left
endif::env-github[]

// This is to mimic what GitHub does so that anchors work in an offline
// rendering too.
:idprefix:
:idseparator: -

// Other attributes
:py3: Python{nbsp}3

= Normand
Philippe Proulx

image::normand-logo.png[]

[.normal]
image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]

[.lead]
_**Normand**_ is a text-to-binary processor with its own language.

This package offers both a portable {py3} module and a command-line
tool.

WARNING: This version of Normand is 0.15, meaning both the Normand
language and the module/CLI interface aren't stable.

ifdef::env-github[]
// ToC location for a GitHub rendering
toc::[]
endif::env-github[]

== Introduction

The purpose of Normand is to consume human-readable text representing
bytes and to produce the corresponding binary data.

.Simple bytes input.
====
Consider the following Normand input:

----
4f 55 32 bb $167 fe %10100111 a9 $-32
----

The generated nine bytes are:

----
4f 55 32 bb a7 fe a7 a9 e0
----
====

As you can see in the last example, the fundamental unit of the Normand
language is the _byte_. The order in which you list bytes will be the
order of the generated data.

The Normand language is more than simple lists of bytes, though. Its
main features are:

Comments, including a bunch of insignificant symbols which may improve readability::
+
Input:
+
----
ff bb %1101:0010 # This is a comment
78 29 af $192 # This too # 99 $-80
fe80::6257:18ff:fea3:4229
60:57:18:a3:42:29
10839636-5d65-4a68-8e6a-21608ddf7258
----
+
Output:
+
----
ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
68 8e 6a 21 60 8d df 72 58
----

Hexadecimal, decimal, and binary byte constants::
+
Input:
+
----
aa bb $247 $-89 %0011_0010 %11.01= 10/10
----
+
Output:
+
----
aa bb f7 a7 32 da
----

UTF-8, UTF-16, and UTF-32 literal strings::
+
Input:
+
----
"hello world!" 00
u16le"stress\nverdict 🤣"
----
+
Output:
+
----
68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd ┆ •d•i•c•t• •>•#•
----

Labels: special variables holding the offset where they're defined::
+
----
<beg> b2 52 e3 bc 91 05
$100 $50 <chair> 33 9f fe
25 e9 89 8a <end>
----

Variables::
+
----
5e 65 {tower = 47} c6 7f f2 c4
44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
----
+
The value of a variable assignment is the evaluation of a valid {py3}
expression which may include label and variable names.

Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
+
Input:
+
----
{strength = 4}
{be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
{le} {-1993 : 32}
{-3.141593 : 64}
----
+
Output:
+
----
67 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
09 c0
----
+
The encoded number is the evaluation of a valid {py3} expression which
may include label and variable names.

https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
+
Input:
+
----
aa bb cc {-1993 : sleb128} <meow> dd ee ff
{meow * 199 : uleb128}
----
+
Output:
+
----
aa bb cc b7 70 dd ee ff e3 07
----
+
The encoded integer is the evaluation of a valid {py3} expression which
may include label and variable names.

Conditional::
+
Input:
+
----
aa bb cc

(
  "foo"

  !if {ICITTE > 10}
    "bar"
  !else
    "fight"
  !end
) * 4
----
+
Output:
+
----
aa bb cc 66 6f 6f 66 69 67 68 74 66 6f 6f 66 69 ┆ •••foofightfoofi
67 68 74 66 6f 6f 62 61 72 66 6f 6f 62 61 72 ┆ ghtfoobarfoobar
----

Repetition::
+
Input:
+
----
aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}

!repeat 3
  ff ee "juice"
!end
----
+
Output:
+
----
aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
6a 75 69 63 65 ┆ juice
----

Alignment::
+
Input:
+
----
{be}

        {199:32}
@64     {43:64}
@16     {-123:16}
@32~255 {5584:32}
----
+
Output:
+
----
00 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
ff 85 ff ff 00 00 15 d0
----

Filling::
+
Input:
+
----
{le}
{0xdeadbeef:32}
{-1993:16}
{9:16}
+0x40
{ICITTE:8}
"meow mix"
+200~FFh
{ICITTE:8}
----
+
Output:
+
----
ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7•••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
40 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix•••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff c8 ┆ •••••••••
----

Multilevel grouping::
+
Input:
+
----
ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
----
+
Output:
+
----
ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
----

Macros::
+
Input:
+
----
!macro hello(world)
  "hello"
  !if world " world" !end
!end

!repeat 17
  ff ff ff ff
  m:hello({ICITTE > 15 and ICITTE < 60})
!end
----
+
Output:
+
----
ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
6c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
64 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
----

Precise error reporting::
+
----
/tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
----
+
----
/tmp/meow.normand:32:6 - Unexpected character `k`.
----
+
----
/tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
----
+
----
/tmp/meow.normand:32:19 - While expanding the macro `meow`:
/tmp/meow.normand:35:5 - While expanding the macro `zzz`:
/tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
----

You can use Normand to track data source files in your favorite VCS
instead of raw binary files. The binary files that Normand generates can
be used to test file format decoding, including malformed data, for
example, as well as for education.

See <<learn-normand>> to explore all the Normand features.

== Install Normand

Normand requires Python ≥ 3.4.

To install Normand:

----
$ python3 -m pip install --user normand
----

See
https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
to learn more about a user site installation.

[NOTE]
====
Normand has a single module file, `normand.py`, which you can copy as is
to your project to use it (both the <<python3-api,`normand.parse()`>>
function and the <<command-line-tool,command-line tool>>).

`normand.py` has _no external dependencies_, but if you're using
Python{nbsp}3.4, you'll need a local copy of the standard `typing`
module.
====

== Design goals

The design goals of Normand are:

Portability::
  We're making sure `normand.py` works with Python{nbsp}≥{nbsp}3.4 and
  doesn't have any external dependencies so that you may just copy the
  module as is to your own project.

Ease of use::
  The most basic Normand input is a sequence of hexadecimal constants
  (for example, `4e6f726d616e64`) which produce exactly what you'd
  expect.
+
Most Normand features map to programming language concepts you already
know and understand: constant integers, literal strings, variables,
conditionals, repetitions/loops, and the rest.

Concise and readable input::
  We could have chosen XML or YAML as the input format, but having a
  DSL here makes a Normand input compact and easy to read, two
  important traits when using Normand to write tests, for example.
+
Compare the following Normand input and some hypothetical XML
equivalent, for example:
+
.Actual Normand input.
----
ff dd 01 ab $192 $-128 %1101:0011

{end:8}

{iter = 1}

!if {not something}
  # five times because xyz
  !repeat 5
    "hello world " {iter:8}
    {iter = iter + 1}
  !end
!end

<end>
----
+
.Hypothetical Normand XML input.
[source,xml]
----
<?xml version="1.0" encoding="utf-8" ?>
<group>
  <byte base="x" val="ff" />
  <byte base="x" val="dd" />
  <byte base="x" val="1" />
  <byte base="x" val="ab" />
  <byte base="d" val="192" />
  <byte base="d" val="-128" />
  <byte base="b" val="11010011" />
  <fixed-len-num expr="end" len="8" />
  <var-assign name="iter" expr="1" />
  <cond expr="not something">
    <!-- five times because xyz -->
    <repeat expr="5">
      <str>hello world </str>
      <fixed-len-num expr="iter" len="8" />
      <var-assign name="iter" expr="iter + 1" />
    </repeat>
  </cond>
  <label name="end" />
</group>
----

== Learn Normand

A Normand text input is a sequence of items which represent a sequence
of raw bytes.

[[state]] During the processing of items to data, Normand relies on a
current state:

[%header%autowidth]
|===
|State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>

|[[cur-offset]] Current offset
|
The current offset has an effect on the value of <<label,labels>> and of
the special `ICITTE` name in <<fixed-length-number,fixed-length
number>>, <<leb128-integer,LEB128 integer>>,
<<filling,filling>>, <<variable-assignment,variable assignment>>,
<<conditional-block,conditional block>>, <<repetition-block,repetition
block>>, <<macro-expansion,macro expansion>>, and
<<post-item-repetition,post-item repetition>> expression evaluation.

Each generated byte increments the current offset.

A <<current-offset-setting,current offset setting>> may change the
current offset without generating data.

A <<current-offset-alignment,current offset alignment>> generates
padding bytes to make the current offset satisfy a given alignment.
|`init_offset` parameter of the `parse()` function.
|`--offset` option.

|[[cur-bo]] Current byte order
|
The current byte order has an effect on the encoding of
<<fixed-length-number,fixed-length numbers>>.

A <<current-byte-order-setting,current byte order setting>> may change
the current byte order.
|`init_byte_order` parameter of the `parse()` function.
|`--byte-order` option.

|<<label,Labels>>
|Mapping of label names to integral values.
|`init_labels` parameter of the `parse()` function.
|One or more `--label` options.

|<<variable-assignment,Variables>>
|Mapping of variable names to integral or floating point number values.
|`init_variables` parameter of the `parse()` function.
|One or more `--var` options.
|===

The available items are:

* A <<byte-constant,byte constant>> representing a single byte.

* A <<literal-string,literal string>> representing a sequence of bytes
  encoding UTF-8, UTF-16, or UTF-32 data.

* A <<current-byte-order-setting,current byte order setting>> (big or
  little endian).

* A <<fixed-length-number,fixed-length number>> (integer or
  floating point) using the <<cur-bo,current byte order>> and of which
  the value is the result of a {py3} expression.

* An <<leb128-integer,LEB128 integer>> of which the value is the result
  of a {py3} expression.

* A <<current-offset-setting,current offset setting>>.

* A <<current-offset-alignment,current offset alignment>>.

* A <<filling,filling>>.

* A <<label,label>>, that is, a named constant holding the current
  offset.
+
This is similar to an assembly label.

* A <<variable-assignment,variable assignment>> associating a name to
  the integral or floating point result of an evaluated {py3}
  expression.

* A <<group,group>>, that is, a scoped sequence of items.

* A <<conditional-block,conditional block>>.

* A <<repetition-block,repetition block>>.

* A <<macro-definition-block,macro definition block>>.

* A <<macro-expansion,macro expansion>>.

Moreover, you can repeat many items above a constant or variable number
of times with the ``pass:[*]`` operator _after_ the item to repeat. This
is called a <<post-item-repetition,post-item repetition>>.

A Normand comment may exist:

* Between items, possibly within a group.
* Between the nibbles of a constant hexadecimal byte.
* Between the bits of a constant binary byte.
* Between the last item and the ``pass:[*]`` character of a post-item
  repetition, and between that ``pass:[*]`` character and the following
  number or expression.
* Between the ``!repeat``/``!r`` block opening and the following
  constant integer, name, or expression of a repetition block.
* Between the ``!if`` block opening and the following name or expression
  of a conditional block.

A comment is anything between two ``pass:[#]`` characters on the same
line, or from ``pass:[#]`` until the end of the line. Whitespace and
the following symbol characters are also considered comments where a
comment may exist:

----
/ \ ? & : ; . , [ ] _ = | -
----

The latter serve to improve readability so that you may write, for
example, a MAC address or a UUID as is.

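To see why a MAC address or a UUID can be written as is, the following
{py3} sketch drops whitespace and the symbol characters listed above,
then decodes the remaining hexadecimal digits. It only covers
hexadecimal byte constants and doesn't handle ``pass:[#]`` comments. The
`strip_insignificant()` helper is hypothetical and isn't part of
Normand.

```python
# Symbol characters that Normand treats like comments (list above).
INSIGNIFICANT = set("/\\?&:;.,[]_=|-")


def strip_insignificant(text: str) -> bytes:
    """Drop whitespace and insignificant symbols, then decode hex pairs."""
    digits = "".join(
        ch for ch in text if not ch.isspace() and ch not in INSIGNIFICANT
    )
    return bytes.fromhex(digits)


mac = strip_insignificant("60:57:18:a3:42:29")
uuid = strip_insignificant("58f64689-6316-4d55-8a1a-04cada366172")
print(mac.hex(), len(uuid))
```

Both inputs reduce to plain pairs of hexadecimal digits, which is all a
sequence of hexadecimal byte constants needs.
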
[[const-int]] Many items require a _constant integer_, which may be
negative, in which case it starts with `-`. A positive constant integer
is any of:

Decimal::
  One or more digits (`0` to `9`).

Hexadecimal::
  One of:
+
* The `0x` or `0X` prefix followed with one or more hexadecimal digits
  (`0` to `9`, `a` to `f`, or `A` to `F`).
* One or more hexadecimal digits followed with the `h` or `H` suffix.

Octal::
  One of:
+
* The `0o` or `0O` prefix followed with one or more octal digits
  (`0` to `7`).
* One or more octal digits followed with the `o`, `O`, `q`, or `Q`
  suffix.

Binary::
  One of:
+
* The `0b` or `0B` prefix followed with one or more bits (`0` or `1`).
* One or more bits followed with the `b` or `B` suffix.

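The constant integer forms above can be sketched as a small {py3}
function. This is only an illustration of the accepted spellings: the
`parse_const_int()` name and the order in which the forms are tried are
assumptions, not how `normand.py` is known to do it.

```python
import re

# (pattern, base) pairs for each positive form, tried in this order.
_FORMS = [
    (re.compile(r"0[xX]([0-9a-fA-F]+)"), 16),
    (re.compile(r"([0-9a-fA-F]+)[hH]"), 16),
    (re.compile(r"0[oO]([0-7]+)"), 8),
    (re.compile(r"([0-7]+)[oOqQ]"), 8),
    (re.compile(r"0[bB]([01]+)"), 2),
    (re.compile(r"([01]+)[bB]"), 2),
    (re.compile(r"([0-9]+)"), 10),
]


def parse_const_int(text: str) -> int:
    """Parse one constant integer, with an optional leading `-`."""
    negative = text.startswith("-")
    body = text[1:] if negative else text
    for pattern, base in _FORMS:
        match = pattern.fullmatch(body)
        if match:
            value = int(match.group(1), base)
            return -value if negative else value
    raise ValueError("not a constant integer: {!r}".format(text))


print(parse_const_int("0x40"), parse_const_int("FFh"), parse_const_int("17q"))
```

With this sketch, `0x40` and `40h` both yield 64, and `FFh`, `0o377`,
`11111111b`, and `255` all yield 255.
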
You can test the examples of this section with the `normand`
<<command-line-tool,command-line tool>> as such:

----
$ normand file | hexdump -C
----

where `file` is the name of a file containing the Normand input.

=== Byte constant

A _byte constant_ represents a single byte.

A byte constant is:

Hexadecimal form::
  Two consecutive hexadecimal digits.

Decimal form::
  One or more digits after the `$` prefix, with an optional `-` before
  the digits for a negative value.

Binary form::
  Eight bits after the `%` prefix.

====
Input:

----
ab cd [3d 8F] CC
----

Output:

----
ab cd 3d 8f cc
----
====

====
Input:

----
$192 %1100/0011 $ -77
----

Output:

----
c0 c3 b3
----
====

====
Input:

----
58f64689-6316-4d55-8a1a-04cada366172
fe80::6257:18ff:fea3:4229
----

Output:

----
58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
----
====

====
Input:

----
%01110011 %01100001 %01101100 %01110101 %01110100
----

Output:

----
73 61 6c 75 74 ┆ salut
----
====

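The decimal and binary forms of the second example above can be
reproduced with a few lines of {py3}. The helper names are hypothetical,
and the accepted decimal range (-128 to 255) is an assumption drawn from
the examples of this document, not a documented limit.

```python
def decimal_byte(value: int) -> int:
    """Map the value of a decimal byte constant to the encoded byte."""
    # Assumed range: signed (-128 to 127) or unsigned (0 to 255).
    if not -128 <= value <= 255:
        raise ValueError("{} doesn't fit in one byte".format(value))
    return value & 0xFF  # two's complement wrap for negative values


def binary_byte(bits: str) -> int:
    """Map the eight bits of a binary byte constant to the encoded byte."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expecting exactly eight bits")
    return int(bits, 2)


data = bytes([decimal_byte(192), binary_byte("11000011"), decimal_byte(-77)])
print(data.hex())  # c0c3b3
```

A negative decimal value maps to its two's complement byte, which is why
`$-77` becomes `b3`.
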
=== Literal string

A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
bytes of a string.

The string to encode isn't implicitly null-terminated: use `\0` at the
end of the string to add a null character.

A literal string is:

. **Optional**: one of the following encodings instead of UTF-8:
+
--
[horizontal]
`u16be`:: UTF-16BE.
`u16le`:: UTF-16LE.
`u32be`:: UTF-32BE.
`u32le`:: UTF-32LE.
--

. The ``pass:["]`` prefix.

. A sequence of zero or more characters, possibly containing escape
  sequences.
+
An escape sequence is the ``\`` character followed by one of:
+
--
[horizontal]
`0`:: Null (U+0000)
`a`:: Alert (U+0007)
`b`:: Backspace (U+0008)
`e`:: Escape (U+001B)
`f`:: Form feed (U+000C)
`n`:: End of line (U+000A)
`r`:: Carriage return (U+000D)
`t`:: Character tabulation (U+0009)
`v`:: Line tabulation (U+000B)
``\``:: Reverse solidus (U+005C)
``pass:["]``:: Quotation mark (U+0022)
--

. The ``pass:["]`` suffix.

====
Input:

----
"coucou tout le monde!"
----

Output:

----
63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
6f 6e 64 65 21 ┆ onde!
----
====

====
Input:

----
u16le"I am not young enough to know everything."
----

Output:

----
49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
2e 00 ┆ .•
----
====

====
Input:

----
u32be "\"illusion is the first\nof all pleasures\" 🦉"
----

Output:

----
00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
00 00 00 20 00 01 f9 89 ┆ ••• ••••
----
====

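The encoding prefixes above line up with standard {py3} codecs, so you
can cross-check an expected output with `str.encode()`. In this sketch,
the `CODECS` mapping and the `encode_literal()` helper are hypothetical
names, and the string is assumed to be already unescaped.

```python
# Normand encoding prefix -> Python codec name (no prefix means UTF-8).
CODECS = {
    "": "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}


def encode_literal(text: str, prefix: str = "") -> bytes:
    """Encode an already unescaped string as the given prefix would."""
    return text.encode(CODECS[prefix])


print(encode_literal("coucou").hex())
print(encode_literal("\U0001F989", "u32be").hex())  # the owl of the example above
print(len(encode_literal("hi\0")))  # the null character is explicit
```

The explicitly named byte order codecs don't emit a byte order mark,
which matches the outputs shown above.
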
=== Current byte order setting

This special item sets the <<cur-bo,_current byte order_>>.

The two accepted forms are:

[horizontal]
``pass:[{be}]``:: Set the current byte order to big endian.
``pass:[{le}]``:: Set the current byte order to little endian.

=== Fixed-length number

A _fixed-length number_ represents a fixed number of bytes encoding
either:

* An unsigned or signed integer (two's complement).
+
The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.

* A floating point number
  (https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
+
The available lengths are 32 (_binary32_) and 64 (_binary64_).

The value is the result of evaluating a {py3} expression; Normand
encodes it using the <<cur-bo,current byte order>>.

A fixed-length number is:

. The ``pass:[{]`` prefix.

. A valid {py3} expression.
+
For a fixed-length number at some source location{nbsp}__**L**__, this
expression may contain the name of any accessible <<label,label>> (not
within a nested group), including the name of a label defined
after{nbsp}__**L**__, as well as the name of any
<<variable-assignment,variable>> known at{nbsp}__**L**__.
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before encoding the number).

. The `:` character.

. An encoding length in bits amongst:
+
--
The expression evaluates to an `int` or `bool` value::
  `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
+
NOTE: Normand automatically converts a `bool` value to `int`.

The expression evaluates to a `float` value::
  `32` and `64`.
--

. The `}` suffix.

====
Input:

----
{le} {345:16}
{be} {-0xabcd:32}
----

Output:

----
59 01 ff ff 54 33
----
====

====
Input:

----
{be}

# String length in bits
{8 * (str_end - str_beg) : 16}

# String
<str_beg>
  "hello world!"
<str_end>
----

Output:

----
00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
----
====

====
Input:

----
{20 - ICITTE : 8} * 10
----

Output:

----
14 13 12 11 10 0f 0e 0d 0c 0b
----
====

====
Input:

----
{le}
{2 * 0.0529 : 32}
----

Output:

----
ac ad d8 3d
----
====

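The following {py3} sketch shows one way to obtain the bytes of the
fixed-length number examples above with `int.to_bytes()` and the
standard `struct` module. The `encode_fixed()` function is hypothetical;
its range check is an assumption modelled on the "outside the 8-bit
range" error message shown in the introduction.

```python
import struct


def encode_fixed(value, length_bits: int, byte_order: str) -> bytes:
    """Encode `value` on `length_bits` bits ("big" or "little" byte order)."""
    if isinstance(value, float):
        if length_bits not in (32, 64):
            raise ValueError("a float needs 32 or 64 bits")
        order = ">" if byte_order == "big" else "<"
        return struct.pack(order + ("f" if length_bits == 32 else "d"), value)
    if length_bits % 8 or not 8 <= length_bits <= 64:
        raise ValueError("unsupported integer length")
    value = int(value)  # also converts `bool` to `int`
    if not -(1 << (length_bits - 1)) <= value < (1 << length_bits):
        raise ValueError("value outside the {}-bit range".format(length_bits))
    # Masking yields the two's complement bit pattern of a negative value.
    masked = value & ((1 << length_bits) - 1)
    return masked.to_bytes(length_bits // 8, byte_order)


print(encode_fixed(345, 16, "little").hex())
print(encode_fixed(-0xABCD, 32, "big").hex())
print(encode_fixed(2 * 0.0529, 32, "little").hex())
```

The three calls correspond to `{le} {345:16}`, `{be} {-0xabcd:32}`, and
`{le} {2 * 0.0529 : 32}` in the examples above.
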
=== LEB128 integer

An _LEB128 integer_ represents a variable number of bytes encoding, in
the https://en.wikipedia.org/wiki/LEB128[LEB128] format, an unsigned or
signed integer which is the result of evaluating a {py3} expression.

An LEB128 integer is:

. The ``pass:[{]`` prefix.

. A valid {py3} expression of which the evaluation result type
  is `int` or `bool` (automatically converted to `int`).
+
For an LEB128 integer at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before encoding the integer).

. The `:` character.

. One of:
+
--
[horizontal]
`uleb128`:: Use the unsigned LEB128 format.
`sleb128`:: Use the signed LEB128 format.
--

. The `}` suffix.

====
Input:

----
{624485 : uleb128}
----

Output:

----
e5 8e 26
----
====

====
Input:

----
aa bb cc dd
<meow>
ee ff
{-981238311 + (meow * -23) : sleb128}
"hello"
----

Output:

----
aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
----
====

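For reference, here's a minimal {py3} sketch of the two LEB128 variants,
following the usual description of the format (groups of seven bits,
least significant group first, high bit set on every byte but the last).
The `uleb128()` and `sleb128()` functions are illustrative and not taken
from `normand.py`.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 needs a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value: int) -> bytes:
    """Encode an integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign
        # Stop once what remains is only the sign extension of bit 6.
        done = (value == 0 and not byte & 0x40) or (value == -1 and byte & 0x40)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


print(uleb128(624485).hex(), sleb128(-1993).hex())
```

These reproduce the `e5 8e 26` output of the first example above and the
`b7 70` bytes of the LEB128 example of the introduction.
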
=== Current offset setting

This special item sets the <<cur-offset,_current offset_>>.

A current offset setting is:

. The `<` prefix.

. A <<const-int,positive constant integer>> which is the new current
  offset.

. The `>` suffix.

====
Input:

----
       {ICITTE : 8} * 8
<0x61> {ICITTE : 8} * 8
----

Output:

----
00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
----
====

====
Input:

----
aa bb cc dd <meow> ee ff
<12> 11 22 33 <mix> 44 55
{meow : 8} {mix : 8}
----

Output:

----
aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
----
====

=== Current offset alignment

A _current offset alignment_ represents zero or more padding bytes to
make the <<cur-offset,current offset>> meet a given
https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.

More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
a current offset alignment represents the required padding bytes until
the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.

A current offset alignment is:

. The `@` prefix.

. A <<const-int,positive constant integer>> which is the alignment value
  in _bits_.
+
This value must be greater than zero and a multiple of{nbsp}8.

. **Optional**:
+
--
. The ``pass:[~]`` prefix.
. A <<const-int,positive constant integer>> which is the value of the
  byte to use as padding to align the <<cur-offset,current offset>>.
--
+
Without this section, the padding byte value is zero.

====
Input:

----
11 22 (@32 aa bb cc) * 3
----

Output:

----
11 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
----
====

====
Input:

----
{le}
77 88
@32~0xcc {-893.5:32}
@128~0x55 "meow"
----

Output:

----
77 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
6d 65 6f 77 ┆ meow
----
====

====
Input:

----
aa bb cc <29> @64~255 "zoom"
----

Output:

----
aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
----
====

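The number of padding bytes follows directly from the rule above: pad
until the current offset is a multiple of __**N**__{nbsp}/{nbsp}8. A
short {py3} sketch, where `alignment_padding()` is a hypothetical helper
name:

```python
def alignment_padding(offset: int, align_bits: int, pad_byte: int = 0) -> bytes:
    """Return the padding bytes which align `offset` to `align_bits` bits."""
    if align_bits <= 0 or align_bits % 8:
        raise ValueError("alignment must be a positive multiple of 8 bits")
    align_bytes = align_bits // 8
    # Distance to the next multiple of `align_bytes` (zero if already aligned).
    count = -offset % align_bytes
    return bytes([pad_byte]) * count


print(alignment_padding(2, 32).hex())
print(alignment_padding(29, 64, 255).hex())
```

The first call matches the two zero bytes after `11 22` in the first
example above. The second matches the three `ff` bytes before `zoom` in
the last one, where the current offset is 29 when the alignment applies.
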
=== Filling

A _filling_ represents zero or more padding bytes to make the
<<cur-offset,current offset>> reach a given value.

A filling is:

. The ``pass:[+]`` prefix.

. One of:

** A <<const-int,positive constant integer>> which is the current offset
   target.

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a filling at some source location{nbsp}__**L**__, this expression
may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before generating the padding
bytes).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

+
This value must be greater than or equal to the current offset where
it's used.

. **Optional**:
+
--
. The ``pass:[~]`` prefix.
. A <<const-int,positive constant integer>> which is the value of the
  byte to use as padding to reach the current offset target.
--
+
Without this section, the padding byte value is zero.

====
Input:

----
aa bb cc dd
+0x40
"hello world"
----

Output:

----
aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
----
====

====
Input:

----
!macro part(iter, fill)
  <0> "particular security " {ord('0') + iter : 8} +fill~0x80
!end

{iter = 1}

!repeat 5
  m:part(iter, {32 + 4 * iter})
  {iter = iter + 1}
!end
----

Output:

----
70 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur
69 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1•••••••••••
80 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s
65 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2•••••••
80 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part
69 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security
33 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3•••••••••••••••
80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
61 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4•••
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
61 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5•••
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
----
====

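Unlike an alignment, a filling targets an absolute offset, so the
padding length is simply the target minus the current offset. The
following {py3} sketch uses a hypothetical `filling()` helper to show
the arithmetic behind the two examples above:

```python
def filling(offset: int, target: int, pad_byte: int = 0) -> bytes:
    """Return the padding bytes needed to move `offset` up to `target`."""
    if target < offset:
        raise ValueError("target offset is before the current offset")
    return bytes([pad_byte]) * (target - offset)


# First example: four bytes written, then `+0x40`.
print(len(filling(4, 0x40)))

# Second example, first iteration: 21 bytes written, target 32 + 4 * 1.
print(len(filling(21, 32 + 4 * 1, 0x80)))
```

In the second example, each macro expansion resets the current offset
to zero with `<0>`, which is why every part grows by four bytes as
`iter` increases.
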
=== Label

A _label_ associates a name to the <<cur-offset,current offset>>.

All the labels of a whole Normand input must have unique names.

A label must not share its name with a
<<variable-assignment,variable>>.

A label is:

. The `<` prefix.

. A valid {py3} name which is not `ICITTE`.

. The `>` suffix.

1228=== Variable assignment
1229
1230A _variable assignment_ associates a name to the integral result of an
1231evaluated {py3} expression.
1232
05f81895 1233A variable assignment is:
71aaa3f7
PP
1234
1235. The ``pass:[{]`` prefix.
1236
27d52a19 1237. A valid {py3} name which is not `ICITTE`.
71aaa3f7
PP
1238
1239. The `=` character.
1240
27d52a19
PP
1241. A valid {py3} expression of which the evaluation result type
1242 is `int`, `float`, or `bool` (automatically converted to `int`).
05f81895
PP
1243+
1244For a variable assignment at some source location{nbsp}__**L**__, this
320644e2
PP
1245expression may contain:
1246+
1247--
1248* The name of any <<label,label>> defined before{nbsp}__**L**__
1249 which isn't within a nested group.
1250* The name of any <<variable-assignment,variable>> known
1251 at{nbsp}__**L**__.
1252--
05f81895 1253+
269f6eb3
PP
1254The value of the special name `ICITTE` (`int` type) in this expression
1255is the <<cur-offset,current offset>>.
71aaa3f7
PP
1256
1257. The `}` suffix.
1258
1259====
1260Input:
1261
1262----
1263{mix = 101} {le}
1264{meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
1265"yooo" {meow + mix : 16}
1266----
1267
1268Output:
1269
1270----
127111 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
1272----
1273====
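
The way `ICITTE` feeds the second assignment in the example above can be traced by hand. The following {py3} sketch is only a model of that example, assuming an initial offset of zero; `out` is a made-up name and `len(out)` stands in for the current offset.

```python
# Hand-written model of the example above; this isn't Normand code.
out = bytearray()
mix = 101                                   # {mix = 101}
meow = 42                                   # {meow = 42}

out += bytes([0x11, 0x22])
out += meow.to_bytes(1, "little")           # {meow:8}
out += bytes([0x33])

# {meow = ICITTE + 17}: the current offset is 4 at this point.
meow = len(out) + 17

out += b"yooo"
out += (meow + mix).to_bytes(2, "little")   # {meow + mix : 16} after {le}

print(out.hex())  # 11222a33796f6f6f7a00
```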

=== Group

A _group_ is a scoped sequence of items.

The <<label,labels>> within a group aren't visible outside of it.

The main purposes of a group are to <<post-item-repetition,repeat>> more
than a single item and to isolate labels.

A group is:

. The `(`, `!group`, or `!g` opening.

. Zero or more items.

. Depending on the group opening:
+
--
`(`::
  The `)` closing.

`!group`::
`!g`::
  The `!end` closing.
--

====
Input:

----
((aa bb cc) dd () ee) "leclerc"
----

Output:

----
aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
----
====

====
Input:

----
!group
  (aa bb cc) * 3 dd ee
!end * 5
----

Output:

----
aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
bb cc aa bb cc dd ee
----
====

====
Input:

----
{be}
(
  <str_beg> u16le"sébastien diaz" <str_end>
  {ICITTE - str_beg : 8}
  {(end - str_beg) * 5 : 24}
) * 3
<end>
----

Output:

----
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
----
====
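
In the last example above, `str_beg` takes a new value on each repetition because it is local to the group, while `end` is a single label of the enclosing level. A rough {py3} model of the resulting numbers follows. It is a sketch for checking the values only: `group_size` and `out` are made-up names, and `end` is computed ahead of time here because this simple model makes a single pass.

```python
# Hand-written model of the example above; this isn't Normand code.
text = "s\u00e9bastien diaz".encode("utf-16-le")   # u16le"sébastien diaz"
group_size = len(text) + 1 + 3                     # string + 8 bits + 24 bits
end = 3 * group_size                               # offset of the <end> label

out = bytearray()

for _ in range(3):
    str_beg = len(out)                                # <str_beg>, group-local
    out += text
    out += (len(out) - str_beg).to_bytes(1, "big")    # {ICITTE - str_beg : 8}
    out += ((end - str_beg) * 5).to_bytes(3, "big")   # {(end - str_beg) * 5 : 24}

print(len(out))  # 96
```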

=== Conditional block

A _conditional block_ represents either the bytes of zero or more items
if some expression is true, or the bytes of zero or more other items if
it's false.

A conditional block is:

. The `!if` opening.

. One of:

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a conditional block at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the contained
items).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

. Zero or more items to be handled when the condition is true.

. **Optional**:

.. The `!else` opening.
.. Zero or more items to be handled when the condition is false.

. The `!end` closing.

====
Input:

----
{at = 1}
{rep_count = 9}

!repeat rep_count
  "meow "

  !if {ICITTE > 25}
    "mix"
  !else
    "zoom"
  !end

  !if {at < rep_count} 20 !end

  {at = at + 1}
!end
----

Output:

----
6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 6f 77 20 7a ┆ meow zoom meow z
6f 6f 6d 20 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 ┆ oom meow zoom me
6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 78 20 ┆ ow mix meow mix
6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 ┆ meow mix meow mi
78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
6d 69 78 ┆ mix
----
====

====
Input:

----
<str_beg>
u16le"meow mix!"
<str_end>

!if {str_end - str_beg > 10}
  " BIG"
!end
----

Output:

----
6d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
21 00 20 42 49 47 ┆ !• BIG
----
====
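
The first example above switches from `"zoom"` to `"mix"` once the current offset, as seen by `!if`, exceeds 25. The {py3} loop below is a sketch that mirrors this decision, with `len(out)` standing in for `ICITTE`. It assumes an initial offset of zero, and `out` is a made-up name.

```python
# Hand-written model of the example above; this isn't Normand code.
out = bytearray()
at = 1
rep_count = 9

for _ in range(rep_count):
    out += b"meow "

    # !if {ICITTE > 25} ... !else ... !end
    out += b"mix" if len(out) > 25 else b"zoom"

    # !if {at < rep_count} 20 !end
    if at < rep_count:
        out += b"\x20"

    at += 1

print(out.decode())
```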

=== Repetition block

A _repetition block_ represents the bytes of one or more items repeated
a given number of times.

A repetition block is:

. The `!repeat` or `!r` opening.

. One of:

** A <<const-int,positive constant integer>> which is the number of
   times to repeat the contained items.

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a repetition block at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the items to
repeat).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

. Zero or more items.

. The `!end` closing.

You may also use a <<post-item-repetition,post-item repetition>> after
some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.

====
Input:

----
!repeat 0o400
  {end - ICITTE - 1 : 8}
!end

<end>
----

Output:

----
ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
----
====

====
Input:

----
{times = 1}

aa bb cc dd

!repeat 3
  <here>

  !repeat {here + 1}
    ee ff
  !end

  11 22 !repeat times 33 !end

  {times = times + 1}
!end

"coucou!"
----

Output:

----
aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
----
====
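
The second example above grows quickly because the inner repetition count, `here + 1`, depends on the offset reached so far. The following {py3} sketch reproduces the sizes involved. It is a model for illustration only, assuming an initial offset of zero, with `len(out)` standing in for the value of the `here` label.

```python
# Hand-written model of the example above; this isn't Normand code.
out = bytearray.fromhex("aa bb cc dd")
times = 1

for _ in range(3):
    here = len(out)                              # <here>
    out += bytes.fromhex("ee ff") * (here + 1)   # !repeat {here + 1} ee ff !end
    out += bytes.fromhex("11 22")
    out += b"\x33" * times                       # !repeat times 33 !end
    times += 1

out += b"coucou!"
print(len(out))  # 185
```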

=== Macro definition block

A _macro definition block_ associates a name and parameter names to
a group of items.

A macro definition block doesn't lead to generated bytes itself: a
<<macro-expansion,macro expansion>> does so.

A macro definition may only exist at the root level, that is, not within
a <<group,group>>, a <<repetition-block,repetition block>>, a
<<conditional-block,conditional block>>, or another
<<macro-definition-block,macro definition block>>.

All macro definitions must have unique names.

A macro definition is:

. The `!macro` or `!m` opening.

. A valid {py3} name (the macro name).

. The `(` parameter name list prefix.

. A comma-separated list of zero or more unique parameter names,
  each one being a valid {py3} name.

. The `)` parameter name list suffix.

. Zero or more items except, recursively, a macro definition block.

. The `!end` closing.

====
----
!macro bake()
  {le} {ICITTE * 8 : 16}
  u16le"predict explode"
!end
----
====

====
----
!macro nail(rep, with_extra, val)
  {iter = 1}

  !repeat rep
    {val + iter : uleb128}
    {0xdeadbeef : 32}
    {iter = iter + 1}
  !end

  !if with_extra
    "meow mix\0"
  !end
!end
----
====

=== Macro expansion

A _macro expansion_ expands the items of a defined
<<macro-definition-block,macro>>.

The macro to expand must be defined _before_ the expansion.

The <<state,state>> before handling the first item of the chosen macro
is:

<<cur-offset,Current offset>>::
  Unchanged.

<<cur-bo,Current byte order>>::
  Unchanged.

Variables::
  The only available variables initially are the macro parameters.

Labels::
  None.

The state after having handled the last item of the chosen macro is:

Current offset::
  The one before handling the first item of the macro plus the size
  of the generated data of the macro expansion.
+
IMPORTANT: This means <<current-offset-setting,current offset setting>>
items within the expanded macro don't impact the final current offset.

Current byte order::
  The one before handling the first item of the macro.

Variables::
  The ones before handling the first item of the macro.

Labels::
  The ones before handling the first item of the macro.

A macro expansion is:

. The `m:` prefix.

. A valid {py3} name (the name of the macro to expand).

. The `(` parameter value list prefix.

. A comma-separated list of zero or more unique parameter values.
+
The number of parameter values must match the number of parameter
names of the definition of the chosen macro.
+
A parameter value is one of:
+
--
* A <<const-int,constant integer>>, possibly negative.

* The ``pass:[{]`` prefix, a valid {py3} expression of which the
  evaluation result type is `int` or `bool` (automatically converted to
  `int`), and the ``pass:[}]`` suffix.
+
For a macro expansion at some source location{nbsp}__**L**__, this
expression may contain:

** The name of any <<label,label>> defined before{nbsp}__**L**__
   which isn't within a nested group.
** The name of any <<variable-assignment,variable>> known
   at{nbsp}__**L**__.

+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the items of the
chosen macro).

* A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.
--

. The `)` parameter value list suffix.

====
Input:

----
!macro bake()
  {le} {ICITTE * 8 : 16}
  u16le"predict explode"
!end

"hello [" m:bake() "] world"

m:bake() * 5
----

Output:

----
68 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
00 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
00 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
----
====
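
The state rules of this section (what a macro sees when it starts, and what survives once it ends) can be summarized with a small {py3} model. This is a sketch under assumptions, not the Normand implementation: the `State` class, `expand_macro()`, and the `bake()` callable are hypothetical names, and the byte order is modelled as a plain string. The caller starts at offset 7 to match the first `m:bake()` above, and with a big-endian byte order only to show that it is restored.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class State:
    # Hypothetical container for the processor state described above.
    offset: int = 0
    byte_order: Optional[str] = None
    variables: Dict[str, int] = field(default_factory=dict)
    labels: Dict[str, int] = field(default_factory=dict)


def expand_macro(outer: State, params: Dict[str, int],
                 body: Callable[[State], bytes]) -> bytes:
    # The macro starts with the current offset and byte order of the
    # caller, only its parameters as variables, and no labels.
    inner = State(outer.offset, outer.byte_order, dict(params), {})
    data = body(inner)

    # Only the current offset of the caller changes, by the size of the
    # generated data; whatever `body` did to `inner` is discarded.
    outer.offset += len(data)
    return data


def bake(state: State) -> bytes:
    # Model of the `bake` macro above.
    state.byte_order = "le"                             # {le}
    head = (state.offset * 8).to_bytes(2, "little")     # {ICITTE * 8 : 16}
    return head + "predict explode".encode("utf-16-le")


outer = State(offset=7, byte_order="be", variables={"x": 1})
data = expand_macro(outer, {}, bake)
print(outer.offset, outer.byte_order)  # 39 be
```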

====
Input:

----
!macro A(val, is_be)
  {le}

  !if is_be
    {be}
  !end

  {val : 16}
!end

!macro B(rep, is_be)
  {iter = 1}

  !repeat rep
    m:A({iter * 3}, is_be)
    {iter = iter + 1}
  !end
!end

m:B(5, 1)
m:B(3, 0)
----

Output:

----
00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
----
====
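
The nested expansion in the last example maps naturally to two small functions. The {py3} sketch below models macros `A` and `B` as `macro_a()` and `macro_b()` (hypothetical names) to show how the `is_be` parameter value travels from the outer expansion to the inner one. It models the example only, not Normand itself.

```python
# Hand-written model of the example above; this isn't Normand code.
def macro_a(val: int, is_be: int) -> bytes:
    # {le}, then {be} if `is_be` is true, then {val : 16}.
    return val.to_bytes(2, "big" if is_be else "little")


def macro_b(rep: int, is_be: int) -> bytes:
    # `iter` starts at 1 and is incremented after each expansion of A.
    return b"".join(macro_a(it * 3, is_be) for it in range(1, rep + 1))


out = macro_b(5, 1) + macro_b(3, 0)
print(out.hex())  # 000300060009000c000f030006000900
```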

=== Post-item repetition

A _post-item repetition_ represents the bytes of an item repeated a
given number of times.

A post-item repetition is:

. One of these items:

** A <<byte-constant,byte constant>>.
** A <<literal-string,literal string>>.
** A <<fixed-length-number,fixed-length number>>.
** An <<leb128-integer,LEB128 integer>>.
** A <<macro-expansion,macro expansion>>.
** A <<group,group>>.

. The ``pass:[*]`` character.

. One of:

** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
   which is the number of times to repeat the previous item.

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a post-item repetition at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group and
  which isn't part of the repeated item.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__ which isn't part of the repeated item.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the items to
repeat).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

You may also use a <<repetition-block,repetition block>>. The form
``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.

====
Input:

----
{end - ICITTE - 1 : 8} * 0x100 <end>
----

Output:

----
ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
----
====

====
Input:

----
{times = 1}
aa bb cc dd
(
  <here>
  (ee ff) * {here + 1}
  11 22 33 * {times}
  {times = times + 1}
) * 3
"coucou!"
----

Output:

----
aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
----
====
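
In the first example of this section, each repetition re-evaluates the expression with a new `ICITTE`, which is why the bytes count down instead of repeating one value. A one-line {py3} model is sketched below. It assumes an initial offset of zero, and because each repeated item is one byte long, the loop index stands in for `ICITTE`.

```python
# Hand-written model of the example above; this isn't Normand code.
end = 0x100  # offset of the <end> label after 0x100 one-byte items

# {end - ICITTE - 1 : 8} * 0x100
out = bytes(end - icitte - 1 for icitte in range(0x100))

print(out[:4].hex(), out[-4:].hex())  # fffefdfc 03020100
```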

== Command-line tool

If you <<install-normand,installed>> the `normand` package, then you
can use the `normand` command-line tool:

----
$ normand <<< '"ma gang de malades"' | hexdump -C
----

----
00000000  6d 61 20 67 61 6e 67 20  64 65 20 6d 61 6c 61 64  |ma gang de malad|
00000010  65 73                                             |es|
----

If you copy the `normand.py` module to your own project, then you can
run the module itself:

----
$ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
----

----
00000000  6d 61 20 67 61 6e 67 20  64 65 20 6d 61 6c 61 64  |ma gang de malad|
00000010  65 73                                             |es|
----

Without a path argument, the `normand` tool reads from the standard
input.

The `normand` tool prints the generated binary data to the standard
output.

Various options control the initial <<state,state>> of the processor:
use the `--help` option to learn more.

== {py3} API

The whole `normand` package/module public API is:

[source,python]
----
# Byte order.
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


# Text location.
class TextLocation:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


# Parsing error message.
class ParseErrorMessage:
    # Message text.
    @property
    def text(self):
        ...

    # Source text location.
    @property
    def text_location(self):
        ...


# Parsing error.
class ParseError(RuntimeError):
    # Parsing error messages.
    #
    # The first message is the most _specific_ one.
    @property
    def messages(self):
        ...


# Variables dictionary type (for type hints).
VariablesT = typing.Dict[str, typing.Union[int, float]]


# Labels dictionary type (for type hints).
LabelsT = typing.Dict[str, int]


# Parsing result.
class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> VariablesT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> LabelsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


# Parses the `normand` input using the initial state defined by
# `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
# and returns the corresponding parsing result.
def parse(normand: str,
          init_variables: typing.Optional[VariablesT] = None,
          init_labels: typing.Optional[LabelsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
----

The `normand` parameter is the actual <<learn-normand,Normand input>>
while the other parameters control the initial <<state,state>>.

The `parse()` function raises a `ParseError` instance should it fail to
parse the `normand` string for any reason.
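
As an illustration of how the error-related properties listed above fit together, here is a minimal sketch of a helper which renders the messages of a parsing error, one per line. The `format_parse_error()` name is hypothetical, and the `LINE:COL - MESSAGE` layout is only a choice made for this sketch. So that the block runs without the package installed, it exercises the helper with a stand-in object whose message text is made up.

```python
from types import SimpleNamespace


def format_parse_error(exc) -> str:
    # Works with any object exposing the documented `messages`,
    # `text`, `text_location`, `line_no`, and `col_no` properties.
    lines = []

    for msg in exc.messages:
        loc = msg.text_location
        lines.append("{}:{} - {}".format(loc.line_no, loc.col_no, msg.text))

    return "\n".join(lines)


# Stand-in for a `normand.ParseError` instance so that this sketch runs
# without the package; the message text is made up.
fake_error = SimpleNamespace(
    messages=[
        SimpleNamespace(
            text="example message",
            text_location=SimpleNamespace(line_no=3, col_no=7),
        )
    ]
)

print(format_parse_error(fake_error))  # 3:7 - example message

# With the real package, the same helper would be used as follows:
#
#     try:
#         res = normand.parse('"lol" * 3 0a')
#     except normand.ParseError as exc:
#         print(format_parse_error(exc))
```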

== Development

Normand is a https://python-poetry.org/[Poetry] project.

To develop it, install it through Poetry and enter the virtual
environment:

----
$ poetry install
$ poetry shell
$ normand <<< '"lol" * 10 0a'
----

`normand.py` is processed by:

* https://microsoft.github.io/pyright/[Pyright]
* https://github.com/psf/black[Black]
* https://pycqa.github.io/isort/[isort]

=== Testing

Use https://docs.pytest.org/[pytest] to test Normand once the package is
part of your virtual environment, for example:

----
$ poetry install
$ poetry run pip3 install pytest
$ poetry run pytest
----

The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.

In the `tests` directory, each `*.nt` file is a test. The file name
prefix indicates what it's meant to test:

`pass-`::
  Everything above the `---` line is the valid Normand input
  to test.
+
Everything below the `---` line is the expected data
(whitespace-separated hexadecimal bytes).

`fail-`::
  Everything above the `---` line is the invalid Normand input
  to test.
+
Everything below the `---` line is the expected error message having
this form:
+
----
LINE:COL - MESSAGE
----
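
To make the layout of a `pass-` test file concrete, the following {py3} sketch splits the text of such a file into its Normand input and its expected bytes. It is not the project's actual test code: `split_test()` and `expected_bytes()` are hypothetical helpers, the sample content is made up, and the sketch assumes the separator is the first line consisting of exactly `---`.

```python
from typing import Tuple


def split_test(content: str) -> Tuple[str, str]:
    # Everything above the first `---` line is the Normand input;
    # everything below is the expected result.
    lines = content.splitlines()
    sep = lines.index("---")
    return "\n".join(lines[:sep]), "\n".join(lines[sep + 1:])


def expected_bytes(hex_text: str) -> bytes:
    # The expected data is whitespace-separated hexadecimal bytes.
    return bytes(int(tok, 16) for tok in hex_text.split())


# Made-up content of a `pass-` test file.
sample = '"lol" * 2 0a\n---\n6c 6f 6c 6c 6f 6c\n0a\n'
source, expected = split_test(sample)

print(source)                    # "lol" * 2 0a
print(expected_bytes(expected))  # b'lollol\n'
```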

=== Contributing

Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
for code review.

To report a bug, https://github.com/efficios/normand/issues/new[create a
GitHub issue].