// Show ToC at a specific location for a GitHub rendering
ifdef::env-github[]
:toc: macro
endif::env-github[]

ifndef::env-github[]
:toc: left
endif::env-github[]

// This is to mimic what GitHub does so that anchors work in an offline
// rendering too.
:idprefix:
:idseparator: -

// Other attributes
:py3: Python{nbsp}3

= Normand
Philippe Proulx

image::normand-logo.png[]

[.normal]
image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]

[.lead]
_**Normand**_ is a text-to-binary processor with its own language.

This package offers both a portable {py3} module and a command-line
tool.

WARNING: This version of Normand is 0.13, meaning both the Normand
language and the module/CLI interface aren't stable.

ifdef::env-github[]
// ToC location for a GitHub rendering
toc::[]
endif::env-github[]

== Introduction

The purpose of Normand is to consume human-readable text representing
bytes and to produce the corresponding binary data.

.Simple bytes input.
====
Consider the following Normand input:

----
4f 55 32 bb $167 fe %10100111 a9 $-32
----

The generated nine bytes are:

----
4f 55 32 bb a7 fe a7 a9 e0
----
====

As you can see in the last example, the fundamental unit of the Normand
language is the _byte_. The order in which you list bytes will be the
order of the generated data.

The Normand language is more than simple lists of bytes, though. Its
main features are:

Comments, including a bunch of insignificant symbols which may improve readability::
+
Input:
+
----
ff bb %1101:0010 # This is a comment
78 29 af $192 # This too # 99 $-80
fe80::6257:18ff:fea3:4229
60:57:18:a3:42:29
10839636-5d65-4a68-8e6a-21608ddf7258
----
+
Output:
+
----
ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
68 8e 6a 21 60 8d df 72 58
----

Hexadecimal, decimal, and binary byte constants::
+
Input:
+
----
aa bb $247 $-89 %0011_0010 %11.01= 10/10
----
+
Output:
+
----
aa bb f7 a7 32 da
----

UTF-8, UTF-16, and UTF-32 literal strings::
+
Input:
+
----
"hello world!" 00
u16le"stress\nverdict 🤣"
----
+
Output:
+
----
68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd ┆ •d•i•c•t• •>•#•
----

Labels: special variables holding the offset where they're defined::
+
----
<beg> b2 52 e3 bc 91 05
$100 $50 <chair> 33 9f fe
25 e9 89 8a <end>
----

Variables::
+
----
5e 65 {tower = 47} c6 7f f2 c4
44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
----
+
The value of a variable assignment is the evaluation of a valid {py3}
expression which may include label and variable names.

Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
+
Input:
+
----
{strength = 4}
{be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
{le} {-1993 : 32}
{-3.141593 : 64}
----
+
Output:
+
----
67 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
09 c0
----
+
The encoded number is the evaluation of a valid {py3} expression which
may include label and variable names.

https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
+
Input:
+
----
aa bb cc {-1993 : sleb128} <meow> dd ee ff
{meow * 199 : uleb128}
----
+
Output:
+
----
aa bb cc b7 70 dd ee ff e3 07
----
+
The encoded integer is the evaluation of a valid {py3} expression which
may include label and variable names.

Conditional::
+
Input:
+
----
aa bb cc

(
 "foo"

 !if {ICITTE > 10}
 "bar"
 !end
) * 4
----
+
Output:
+
----
aa bb cc 66 6f 6f 66 6f 6f 66 6f 6f 62 61 72 66 ┆ •••foofoofoobarf
6f 6f 62 61 72 ┆ oobar
----

Repetition::
+
Input:
+
----
aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}

!repeat 3
 ff ee "juice"
!end
----
+
Output:
+
----
aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
6a 75 69 63 65 ┆ juice
----

Alignment::
+
Input:
+
----
{be}

 {199:32}
@64 {43:64}
@16 {-123:16}
@32~255 {5584:32}
----
+
Output:
+
----
00 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
ff 85 ff ff 00 00 15 d0
----

Filling::
+
Input:
+
----
{le}
{0xdeadbeef:32}
{-1993:16}
{9:16}
+0x40
{ICITTE:8}
"meow mix"
+200~FFh
{ICITTE:8}
----
+
Output:
+
----
ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7•••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
40 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix•••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
ff ff ff ff ff ff ff ff c8 ┆ •••••••••
----

Multilevel grouping::
+
Input:
+
----
ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
----
+
Output:
+
----
ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
----

Macros::
+
Input:
+
----
!macro hello(world)
 "hello"
 !if world " world" !end
!end

!repeat 17
 ff ff ff ff
 m:hello({ICITTE > 15 and ICITTE < 60})
!end
----
+
Output:
+
----
ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
6c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
64 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
----

Precise error reporting::
+
----
/tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
----
+
----
/tmp/meow.normand:32:6 - Unexpected character `k`.
----
+
----
/tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
----
+
----
/tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
----

You can use Normand to track data source files in your favorite VCS
instead of raw binary files. You can use the binary files that Normand
generates to test file format decoders, including with malformed data,
as well as for education.

See <<learn-normand>> to explore all the Normand features.

== Install Normand

Normand requires Python ≥ 3.4.

To install Normand:

----
$ python3 -m pip install --user normand
----

See
https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
to learn more about a user site installation.

[NOTE]
====
Normand has a single module file, `normand.py`, which you can copy as is
to your project to use it (both the <<python3-api,`normand.parse()`>>
function and the <<command-line-tool,command-line tool>>).

`normand.py` has _no external dependencies_, but if you're using
Python{nbsp}3.4, you'll need a local copy of the standard `typing`
module.
====

== Learn Normand

A Normand text input is a sequence of items which represent a sequence
of raw bytes.

[[state]] During the processing of items to data, Normand relies on a
current state:

[%header%autowidth]
|===
|State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>

|[[cur-offset]] Current offset
|
The current offset has an effect on the value of <<label,labels>> and of
the special `ICITTE` name in <<fixed-length-number,fixed-length
number>>, <<leb128-integer,LEB128 integer>>,
<<variable-assignment,variable assignment>>,
<<conditional-block,conditional block>>, <<repetition-block,repetition
block>>, <<macro-expansion,macro expansion>>, and
<<post-item-repetition,post-item repetition>> expression evaluation.

Each generated byte increments the current offset.

A <<current-offset-setting,current offset setting>> may change the
current offset without generating data.

A <<current-offset-alignment,current offset alignment>> generates
padding bytes to make the current offset satisfy a given alignment.
|`init_offset` parameter of the `parse()` function.
|`--offset` option.

|[[cur-bo]] Current byte order
|
The current byte order has an effect on the encoding of
<<fixed-length-number,fixed-length numbers>>.

A <<current-byte-order-setting,current byte order setting>> may change
the current byte order.
|`init_byte_order` parameter of the `parse()` function.
|`--byte-order` option.

|<<label,Labels>>
|Mapping of label names to integral values.
|`init_labels` parameter of the `parse()` function.
|One or more `--label` options.

|<<variable-assignment,Variables>>
|Mapping of variable names to integral or floating point number values.
|`init_variables` parameter of the `parse()` function.
|One or more `--var` options.
|===

The available items are:

* A <<byte-constant,byte constant>> representing a single byte.

* A <<literal-string,literal string>> representing a sequence of bytes
 encoding UTF-8, UTF-16, or UTF-32 data.

* A <<current-byte-order-setting,current byte order setting>> (big or
 little endian).

* A <<fixed-length-number,fixed-length number>> (integer or
 floating point) using the <<cur-bo,current byte order>> and of which
 the value is the result of a {py3} expression.

* An <<leb128-integer,LEB128 integer>> of which the value is the result
 of a {py3} expression.

* A <<current-offset-setting,current offset setting>>.

* A <<current-offset-alignment,current offset alignment>>.

* A <<filling,filling>>.

* A <<label,label>>, that is, a named constant holding the current
 offset.
+
This is similar to an assembly label.

* A <<variable-assignment,variable assignment>> associating a name to
 the result of an evaluated {py3} expression.

* A <<group,group>>, that is, a scoped sequence of items.

* A <<conditional-block,conditional block>>.

* A <<repetition-block,repetition block>>.

* A <<macro-definition-block,macro definition block>>.

* A <<macro-expansion,macro expansion>>.

Moreover, you can repeat many of the items above a constant or variable
number of times with the ``pass:[*]`` operator _after_ the item to
repeat. This is called a <<post-item-repetition,post-item repetition>>.

A Normand comment may exist:

* Between items, possibly within a group.
* Between the nibbles of a constant hexadecimal byte.
* Between the bits of a constant binary byte.
* Between the last item and the ``pass:[*]`` character of a post-item
 repetition, and between that ``pass:[*]`` character and the following
 number or expression.
* Between the ``!repeat``/``!r`` block opening and the following
 constant integer, name, or expression of a repetition block.
* Between the ``!if`` block opening and the following name or expression
 of a conditional block.

A comment is anything between two ``pass:[#]`` characters on the same
line, or from ``pass:[#]`` until the end of the line. Whitespaces and
the following symbol characters are also considered comments where a
comment may exist:

----
/ \ ? & : ; . , [ ] _ = | -
----

The latter serve to improve readability so that you may write, for
example, a MAC address or a UUID as is.

[[const-int]] Many items require a _constant integer_, possibly
negative, in which case it starts with `-`. A positive constant integer
is any of:

Decimal::
 One or more digits (`0` to `9`).

Hexadecimal::
 One of:
+
* The `0x` or `0X` prefix followed by one or more hexadecimal digits
 (`0` to `9`, `a` to `f`, or `A` to `F`).
* One or more hexadecimal digits followed by the `h` or `H` suffix.

Octal::
 One of:
+
* The `0o` or `0O` prefix followed by one or more octal digits
 (`0` to `7`).
* One or more octal digits followed by the `o`, `O`, `q`, or `Q`
 suffix.

Binary::
 One of:
+
* The `0b` or `0B` prefix followed by one or more bits (`0` or `1`).
* One or more bits followed by the `b` or `B` suffix.

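The constant integer forms above can be sketched in {py3} as follows. This is only an illustration, not Normand's actual parser: the `parse_const_int()` name is hypothetical, and the sketch assumes that a form must match the whole token and that no whitespace separates `-` from the digits.

```python
import re

# (pattern, base) pairs for the positive constant integer forms listed
# above. Hypothetical sketch: Normand's real parser may differ.
_FORMS = [
    (re.compile(r"0[xX]([0-9a-fA-F]+)"), 16),  # `0x`/`0X` prefix
    (re.compile(r"([0-9a-fA-F]+)[hH]"), 16),   # `h`/`H` suffix
    (re.compile(r"0[oO]([0-7]+)"), 8),         # `0o`/`0O` prefix
    (re.compile(r"([0-7]+)[oOqQ]"), 8),        # `o`/`O`/`q`/`Q` suffix
    (re.compile(r"0[bB]([01]+)"), 2),          # `0b`/`0B` prefix
    (re.compile(r"([01]+)[bB]"), 2),           # `b`/`B` suffix
    (re.compile(r"([0-9]+)"), 10),             # plain decimal
]


def parse_const_int(text):
    """Return the integer value of a constant integer token."""
    negative = text.startswith("-")
    body = text[1:] if negative else text

    for pattern, base in _FORMS:
        match = pattern.fullmatch(body)

        if match:
            value = int(match.group(1), base)
            return -value if negative else value

    raise ValueError("not a constant integer: {!r}".format(text))


for token in ("0x40", "FFh", "-0o17", "17q", "101b", "200"):
    print(token, "->", parse_const_int(token))
```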
You can test the examples of this section with the `normand`
<<command-line-tool,command-line tool>> as such:

----
$ normand file | hexdump -C
----

where `file` is the name of a file containing the Normand input.

=== Byte constant

A _byte constant_ represents a single byte.

A byte constant is:

Hexadecimal form::
 Two consecutive hexadecimal digits.

Decimal form::
 One or more digits after the `$` prefix.

Binary form::
 Eight bits after the `%` prefix.

====
Input:

----
ab cd [3d 8F] CC
----

Output:

----
ab cd 3d 8f cc
----
====

====
Input:

----
$192 %1100/0011 $ -77
----

Output:

----
c0 c3 b3
----
====

====
Input:

----
58f64689-6316-4d55-8a1a-04cada366172
fe80::6257:18ff:fea3:4229
----

Output:

----
58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
----
====

====
Input:

----
%01110011 %01100001 %01101100 %01110101 %01110100
----

Output:

----
73 61 6c 75 74 ┆ salut
----
====
613=== Literal string
614
615A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
616bytes of a string.
617
618The string to encode isn't implicitly null-terminated: use `\0` at the
619end of the string to add a null character.
620
621A literal string is:
622
623. **Optional**: one of the following encodings instead of UTF-8:
624+
625--
626[horizontal]
627`u16be`:: UTF-16BE.
628`u16le`:: UTF-16LE.
629`u32be`:: UTF-32BE.
630`u32le`:: UTF-32LE.
631--
632
633. The ``pass:["]`` prefix.
634
635. A sequence of zero or more characters, possibly containing escape
636 sequences.
637+
638An escape sequence is the ``\`` character followed by one of:
639+
640--
641[horizontal]
642`0`:: Null (U+0000)
643`a`:: Alert (U+0007)
644`b`:: Backspace (U+0008)
645`e`:: Escape (U+001B)
646`f`:: Form feed (U+000C)
647`n`:: End of line (U+000A)
648`r`:: Carriage return (U+000D)
649`t`:: Character tabulation (U+0009)
650`v`:: Line tabulation (U+000B)
651``\``:: Reverse solidus (U+005C)
652``pass:["]``:: Quotation mark (U+0022)
653--
654
655. The ``pass:["]`` suffix.
656
657====
658Input:
659
660----
661"coucou tout le monde!"
662----
663
664Output:
665
666----
66763 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
6686f 6e 64 65 21 ┆ onde!
669----
670====
671
672====
673Input:
674
675----
676u16le"I am not young enough to know everything."
677----
678
679Output:
680
681----
68249 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
68320 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
6846e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
68520 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
68665 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
6872e 00 ┆ .•
688----
689====
690
691====
692Input:
693
694----
695u32be "\"illusion is the first\nof all pleasures\" 🦉"
696----
697
698Output:
699
700----
70100 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
70200 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
70300 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
70400 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
70500 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
70600 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
70700 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
70800 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
70900 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
71000 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
71100 00 00 20 00 01 f9 89 ┆ ••• ••••
712----
713====
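
The outputs above match what the standard {py3} codecs produce without a byte order mark. The following sketch assumes a straightforward mapping from the Normand encoding prefixes to codec names; `encode_literal()` is a hypothetical helper, not part of the `normand` module, and it expects a string whose escape sequences are already resolved.

```python
# Assumed mapping from Normand encoding prefixes to Python codec names;
# `None` stands for a literal string without any prefix (UTF-8).
CODECS = {
    None: "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}


def encode_literal(text, prefix=None):
    """Encode the characters of a literal string (escapes already resolved)."""
    return text.encode(CODECS[prefix])


utf8_data = encode_literal("coucou tout le monde!")
utf16_data = encode_literal("I am not young enough to know everything.", "u16le")
owl_data = encode_literal("\U0001f989", "u32be")  # the owl emoji above

print(utf8_data.hex())
print(len(utf16_data), "bytes of UTF-16LE data")
print(owl_data.hex())
```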

=== Current byte order setting

This special item sets the <<cur-bo,_current byte order_>>.

The two accepted forms are:

[horizontal]
``pass:[{be}]``:: Set the current byte order to big endian.
``pass:[{le}]``:: Set the current byte order to little endian.

=== Fixed-length number

A _fixed-length number_ represents a fixed number of bytes encoding
either:

* An unsigned or signed integer (two's complement).
+
The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.

* A floating point number
 (https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
+
The available lengths are 32 (_binary32_) and 64 (_binary64_).

The value is the result of evaluating a {py3} expression; Normand
encodes it using the <<cur-bo,current byte order>>.

A fixed-length number is:

. The ``pass:[{]`` prefix.

. A valid {py3} expression.
+
For a fixed-length number at some source location{nbsp}__**L**__, this
expression may contain the name of any accessible <<label,label>> (not
within a nested group), including the name of a label defined
after{nbsp}__**L**__, as well as the name of any
<<variable-assignment,variable>> known at{nbsp}__**L**__.
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before encoding the number).

. The `:` character.

. An encoding length in bits amongst:
+
--
The expression evaluates to an `int` or `bool` value::
 `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
+
NOTE: Normand automatically converts a `bool` value to `int`.

The expression evaluates to a `float` value::
 `32` and `64`.
--

. The `}` suffix.

====
Input:

----
{le} {345:16}
{be} {-0xabcd:32}
----

Output:

----
59 01 ff ff 54 33
----
====

====
Input:

----
{be}

# String length in bits
{8 * (str_end - str_beg) : 16}

# String
<str_beg>
 "hello world!"
<str_end>
----

Output:

----
00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
----
====

====
Input:

----
{20 - ICITTE : 8} * 10
----

Output:

----
14 13 12 11 10 0f 0e 0d 0c 0b
----
====

====
Input:

----
{le}
{2 * 0.0529 : 32}
----

Output:

----
ac ad d8 3d
----
====
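
As a rough {py3} equivalent of the encoding step, the sketch below reproduces the bytes of the first and last examples above with `int.to_bytes()` and `struct.pack()`. The `encode_fixed()` helper and its `"big"`/`"little"` byte order argument are hypothetical, and the sketch assumes that a negative integer selects the signed (two's complement) interpretation.

```python
import struct


def encode_fixed(value, length_bits, byte_order):
    """Encode `value` on `length_bits` bits; `byte_order` is "big" or "little"."""
    if isinstance(value, float):
        # binary32 or binary64 (IEEE 754)
        fmt = {32: "f", 64: "d"}[length_bits]
        return struct.pack((">" if byte_order == "big" else "<") + fmt, value)

    if length_bits not in (8, 16, 24, 32, 40, 48, 56, 64):
        raise ValueError("invalid integer length: {}".format(length_bits))

    # `int()` also converts a `bool`; `to_bytes()` raises `OverflowError`
    # when the value doesn't fit.
    value = int(value)
    return value.to_bytes(length_bits // 8, byte_order, signed=value < 0)


print(encode_fixed(345, 16, "little").hex())        # {le} {345:16}
print(encode_fixed(-0xABCD, 32, "big").hex())       # {be} {-0xabcd:32}
print(encode_fixed(2 * 0.0529, 32, "little").hex()) # {le} {2 * 0.0529 : 32}
```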

=== LEB128 integer

An _LEB128 integer_ represents a variable number of bytes encoding,
following the https://en.wikipedia.org/wiki/LEB128[LEB128] format, an
unsigned or signed integer which is the result of evaluating a {py3}
expression.

An LEB128 integer is:

. The ``pass:[{]`` prefix.

. A valid {py3} expression of which the evaluation result type
 is `int` or `bool` (automatically converted to `int`).
+
For an LEB128 integer at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
 which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
 at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before encoding the integer).

. The `:` character.

. One of:
+
--
[horizontal]
`uleb128`:: Use the unsigned LEB128 format.
`sleb128`:: Use the signed LEB128 format.
--

. The `}` suffix.

====
Input:

----
{624485 : uleb128}
----

Output:

----
e5 8e 26
----
====

====
Input:

----
aa bb cc dd
<meow>
ee ff
{-981238311 + (meow * -23) : sleb128}
"hello"
----

Output:

----
aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
----
====
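
For reference, here is a minimal {py3} sketch of the standard LEB128 algorithm which reproduces the bytes of the examples above. It isn't Normand's own code, and the function names are only illustrative.

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 requires a non-negative integer")

    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7

        if value:
            # More groups follow: set the continuation bit
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value):
    """Encode an integer as signed LEB128."""
    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign

        # Done when the remaining bits are pure sign extension of bit 6
        done = (value == 0 and not byte & 0x40) or (value == -1 and byte & 0x40)
        out.append(byte if done else byte | 0x80)

        if done:
            return bytes(out)


print(uleb128(624485).hex())
print(sleb128(-981238311 + 4 * -23).hex())  # `meow` is 4 in the example
```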

=== Current offset setting

This special item sets the <<cur-offset,_current offset_>>.

A current offset setting is:

. The `<` prefix.

. A <<const-int,positive constant integer>> which is the new current
 offset.

. The `>` suffix.

====
Input:

----
 {ICITTE : 8} * 8
<0x61> {ICITTE : 8} * 8
----

Output:

----
00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
----
====

====
Input:

----
aa bb cc dd <meow> ee ff
<12> 11 22 33 <mix> 44 55
{meow : 8} {mix : 8}
----

Output:

----
aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
----
====

=== Current offset alignment

A _current offset alignment_ represents zero or more padding bytes to
make the <<cur-offset,current offset>> meet a given
https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.

More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
a current offset alignment represents the required padding bytes until
the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.

A current offset alignment is:

. The `@` prefix.

. A <<const-int,positive constant integer>> which is the alignment value
 in _bits_.
+
This value must be greater than zero and a multiple of{nbsp}8.

. **Optional**:
+
--
. The ``pass:[~]`` prefix.
. A <<const-int,positive constant integer>> which is the value of the
 byte to use as padding to align the <<cur-offset,current offset>>.
--
+
Without this section, the padding byte value is zero.

====
Input:

----
11 22 (@32 aa bb cc) * 3
----

Output:

----
11 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
----
====

====
Input:

----
{le}
77 88
@32~0xcc {-893.5:32}
@128~0x55 "meow"
----

Output:

----
77 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
6d 65 6f 77 ┆ meow
----
====

====
Input:

----
aa bb cc <29> @64~255 "zoom"
----

Output:

----
aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
----
====
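
The number of padding bytes follows directly from the rule above: pad until the current offset is a multiple of __**N**__{nbsp}/{nbsp}8. A minimal {py3} sketch, with a hypothetical `alignment_padding()` helper:

```python
def alignment_padding(offset, align_bits, pad_byte=0):
    """Return the padding bytes needed to align `offset` to `align_bits` bits."""
    if align_bits <= 0 or align_bits % 8 != 0:
        raise ValueError("alignment must be a positive multiple of 8 bits")

    align_bytes = align_bits // 8

    # `-offset % align_bytes` is the distance to the next multiple
    # (zero when `offset` is already aligned).
    return bytes([pad_byte]) * (-offset % align_bytes)


# `11 22 @32`: two zero bytes to reach offset 4
print(alignment_padding(2, 32).hex())

# `<29> @64~255`: three ff bytes to reach offset 32
print(alignment_padding(29, 64, 255).hex())
```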

=== Filling

A _filling_ represents zero or more padding bytes to make the
<<cur-offset,current offset>> reach a given value.

A filling is:

. The ``pass:[+]`` prefix.

. One of:

** A <<const-int,positive constant integer>> which is the current offset
 target.

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
 evaluation result type is `int` or `bool` (automatically converted to
 `int`), and the ``pass:[}]`` suffix.
+
For a filling at some source location{nbsp}__**L**__, this expression
may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
 which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
 at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before generating the padding
bytes).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

+
This value must be greater than or equal to the current offset where
it's used.

. **Optional**:
+
--
. The ``pass:[~]`` prefix.
. A <<const-int,positive constant integer>> which is the value of the
 byte to use as padding to reach the current offset target.
--
+
Without this section, the padding byte value is zero.

====
Input:

----
aa bb cc dd
+0x40
"hello world"
----

Output:

----
aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
----
====

====
Input:

----
!macro part(iter, fill)
 <0> "particular security " {ord('0') + iter : 8} +fill~0x80
!end

{iter = 1}

!repeat 5
 m:part(iter, {32 + 4 * iter})
 {iter = iter + 1}
!end
----

Output:

----
70 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur
69 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1•••••••••••
80 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s
65 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2•••••••
80 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part
69 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security
33 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3•••••••••••••••
80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
61 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4•••
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
61 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5•••
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
----
====
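
Unlike an alignment, a filling targets an absolute offset, so the padding length is simply the target minus the current offset. The following {py3} sketch uses a hypothetical `filling()` helper to check the first example above (offset 4, target `0x40`):

```python
def filling(offset, target, pad_byte=0):
    """Return the padding bytes needed to move `offset` up to `target`."""
    if target < offset:
        raise ValueError(
            "target {} is before the current offset {}".format(target, offset))

    return bytes([pad_byte]) * (target - offset)


# `aa bb cc dd +0x40`: 60 zero bytes, then "hello world" starts at 0x40
data = bytes.fromhex("aabbccdd")
data += filling(len(data), 0x40)
data += b"hello world"
print(len(data), data.index(b"hello world"))
```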

=== Label

A _label_ associates a name to the <<cur-offset,current offset>>.

All the labels of a whole Normand input must have unique names.

A label must not share its name with a
<<variable-assignment,variable>>.

A label is:

. The `<` prefix.

. A valid {py3} name which is not `ICITTE`.

. The `>` suffix.

=== Variable assignment

A _variable assignment_ associates a name to the result of an
evaluated {py3} expression.

A variable assignment is:

. The ``pass:[{]`` prefix.

. A valid {py3} name which is not `ICITTE`.

. The `=` character.

. A valid {py3} expression of which the evaluation result type
 is `int`, `float`, or `bool` (automatically converted to `int`).
+
For a variable assignment at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
 which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
 at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>>.

. The `}` suffix.

====
Input:

----
{mix = 101} {le}
{meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
"yooo" {meow + mix : 16}
----

Output:

----
11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
----
====
1200=== Group
1201
1202A _group_ is a scoped sequence of items.
1203
1204The <<label,labels>> within a group aren't visible outside of it.
1205
e57a18e1
PP
1206The main purpose of a group is to <<post-item-repetition,repeat>> more
1207than a single item and to isolate labels.
71aaa3f7
PP
1208
1209A group is:
1210
261c5ecf 1211. The `(`, `!group`, or `!g` opening.
71aaa3f7
PP
1212
1213. Zero or more items.
1214
261c5ecf
PP
1215. Depending on the group opening:
1216+
1217--
1218`(`::
1219 The `)` closing.
1220
1221`!group`::
1222`!g`::
1223 The `!end` closing.
1224--
71aaa3f7
PP
1225
1226====
1227Input:
1228
1229----
1230((aa bb cc) dd () ee) "leclerc"
1231----
1232
1233Output:
1234
1235----
1236aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
1237----
1238====
1239
1240====
1241Input:
1242
1243----
261c5ecf
PP
1244!group
1245 (aa bb cc) * 3 dd ee
1246!end * 5
71aaa3f7
PP
1247----
1248
1249Output:
1250
1251----
1252aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
1253cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
1254ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
1255bb cc aa bb cc dd ee
1256----
1257====
1258
1259====
1260Input:
1261
1262----
1263{be}
1264(
1265 <str_beg> u16le"sébastien diaz" <str_end>
1266 {ICITTE - str_beg : 8}
1267 {(end - str_beg) * 5 : 24}
1268) * 3
1269<end>
1270----
1271
1272Output:
1273
1274----
127573 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
12766e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
127773 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
12786e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
127973 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
12806e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
1281----
1282====
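
The numbers of the last example are easier to follow when computed step by step. The {py3} sketch below is only a hand-written replay, not how Normand works internally: it assumes an initial offset of zero and precomputes the value of the `end` label, whereas how Normand itself resolves that forward reference isn't covered here.

```python
text = "sébastien diaz".encode("utf-16-le")  # u16le"sébastien diaz" (28 bytes)

# Each copy of the group is the string, one 8-bit and one 24-bit number.
iteration_size = len(text) + 1 + 3
end = 3 * iteration_size                     # value of the <end> label

data = bytearray()

for _ in range(3):                           # ( ... ) * 3
    # <str_beg> is scoped to the group: each repetition gets its own value.
    str_beg = len(data)
    data += text

    # {ICITTE - str_beg : 8}
    data += (len(data) - str_beg).to_bytes(1, "big")

    # {(end - str_beg) * 5 : 24}, big endian because of {be}
    data += ((end - str_beg) * 5).to_bytes(3, "big")

# Last four bytes of each repetition
for start in (28, 60, 92):
    print(data[start:start + 4].hex(" "))
```

Because `str_beg` differs for each repetition while `end` doesn't, the 24-bit value shrinks from one copy of the group to the next.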

=== Conditional block

A _conditional block_ represents either the bytes of one or more items
if some expression is true, or no bytes at all if it's false.

A conditional block is:

. The `!if` opening.

. One of:

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a conditional block at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the contained
items).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

. Zero or more items.

. The `!end` closing.

====
Input:

----
{at = 1}
{rep_count = 9}

!repeat rep_count
  "meow "

  !if {ICITTE > 25}
    "mix"

    !if {at < rep_count} 20 !end
  !end

  {at = at + 1}
!end
----

Output:

----
6d 65 6f 77 20 6d 65 6f 77 20 6d 65 6f 77 20 6d ┆ meow meow meow m
65 6f 77 20 6d 65 6f 77 20 6d 65 6f 77 20 6d 69 ┆ eow meow meow mi
78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
6d 69 78 20 6d 65 6f 77 20 6d 69 78             ┆ mix meow mix
----
====
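
As a rough guide to reading the last example, the following {py3} sketch mirrors its control flow with plain `for` and `if` statements. It's a hand-written equivalent, not Normand itself, and it assumes an initial offset of zero so that `len(data)` can stand in for `ICITTE`.

```python
data = bytearray()
at = 1                              # {at = 1}
rep_count = 9                       # {rep_count = 9}

for _ in range(rep_count):          # !repeat rep_count
    data += b"meow "

    if len(data) > 25:              # !if {ICITTE > 25}
        data += b"mix"

        if at < rep_count:          # !if {at < rep_count} 20 !end
            data += b"\x20"

    at += 1                         # {at = at + 1}

print(data.decode("ascii"))
```

The outer condition first becomes true during the sixth repetition, when the offset reaches 30, and the inner condition suppresses the trailing space on the last repetition only.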

====
Input:

----
<str_beg>
u16le"meow mix!"
<str_end>

!if {str_end - str_beg > 10}
  " BIG"
!end
----

Output:

----
6d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
21 00 20 42 49 47                               ┆ !• BIG
----
====

=== Repetition block

A _repetition block_ represents the bytes of one or more items repeated
a given number of times.

A repetition block is:

. The `!repeat` or `!r` opening.

. One of:

** A <<const-int,positive constant integer>> which is the number of
   times to repeat the contained items.

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a repetition block at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the items to
repeat).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

. Zero or more items.

. The `!end` closing.

You may also use a <<post-item-repetition,post-item repetition>> after
some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.

====
Input:

----
!repeat 0o400
  {end - ICITTE - 1 : 8}
!end

<end>
----

Output:

----
ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
----
====

====
Input:

----
{times = 1}

aa bb cc dd

!repeat 3
  <here>

  !repeat {here + 1}
    ee ff
  !end

  11 22 !repeat times 33 !end

  {times = times + 1}
!end

"coucou!"
----

Output:

----
aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
33 33 63 6f 75 63 6f 75 21                      ┆ 33coucou!
----
====
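
The size of the last output (185 bytes) follows from how the `here` label grows from one outer repetition to the next. The {py3} sketch below replays that example by hand, assuming an initial offset of zero. It isn't Normand code, and its variable names simply mirror those of the input.

```python
data = bytearray.fromhex("aa bb cc dd")
times = 1                                   # {times = 1}

for _ in range(3):                          # !repeat 3
    here = len(data)                        # <here>: 4, then 17, then 57

    # !repeat {here + 1} ee ff !end
    data += bytes.fromhex("ee ff") * (here + 1)

    # 11 22 !repeat times 33 !end
    data += bytes.fromhex("11 22") + b"\x33" * times

    times += 1                              # {times = times + 1}

data += b"coucou!"
print(len(data))
```

Each outer repetition therefore produces more `ee ff` pairs than the previous one (5, 18, and 58), along with one more `33` byte.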

=== Macro definition block

A _macro definition block_ associates a name and parameter names to
a group of items.

A macro definition block doesn't lead to generated bytes itself: a
<<macro-expansion,macro expansion>> does so.

A macro definition may only exist at the root level, that is, not within
a <<group,group>>, a <<repetition-block,repetition block>>, a
<<conditional-block,conditional block>>, or another
<<macro-definition-block,macro definition block>>.

All macro definitions must have unique names.

A macro definition is:

. The `!macro` or `!m` opening.

. A valid {py3} name (the macro name).

. The `(` parameter name list prefix.

. A comma-separated list of zero or more unique parameter names,
  each one being a valid {py3} name.

. The `)` parameter name list suffix.

. Zero or more items except, recursively, a macro definition block.

. The `!end` closing.

====
----
!macro bake()
  {le} {ICITTE * 8 : 16}
  u16le"predict explode"
!end
----
====

====
----
!macro nail(rep, with_extra, val)
  {iter = 1}

  !repeat rep
    {val + iter : uleb128}
    {0xdeadbeef : 32}
    {iter = iter + 1}
  !end

  !if with_extra
    "meow mix\0"
  !end
!end
----
====

=== Macro expansion

A _macro expansion_ expands the items of a defined
<<macro-definition-block,macro>>.

The macro to expand must be defined _before_ the expansion.

The <<state,state>> before handling the first item of the chosen macro
is:

<<cur-offset,Current offset>>::
    Unchanged.

<<cur-bo,Current byte order>>::
    Unchanged.

Variables::
    The only available variables initially are the macro parameters.

Labels::
    None.

The state after having handled the last item of the chosen macro is:

Current offset::
    The one before handling the first item of the macro plus the size
    of the generated data of the macro expansion.
+
IMPORTANT: This means <<current-offset-setting,current offset setting>>
items within the expanded macro don't impact the final current offset.

Current byte order::
    The one before handling the first item of the macro.

Variables::
    The ones before handling the first item of the macro.

Labels::
    The ones before handling the first item of the macro.

A macro expansion is:

. The `m:` prefix.

. A valid {py3} name (the name of the macro to expand).

. The `(` parameter value list prefix.

. A comma-separated list of zero or more unique parameter values.
+
The number of parameter values must match the number of parameter
names of the definition of the chosen macro.
+
A parameter value is one of:
+
--
* A <<const-int,constant integer>>, possibly negative.

* The ``pass:[{]`` prefix, a valid {py3} expression of which the
  evaluation result type is `int` or `bool` (automatically converted to
  `int`), and the ``pass:[}]`` suffix.
+
For a macro expansion at some source location{nbsp}__**L**__, this
expression may contain:

** The name of any <<label,label>> defined before{nbsp}__**L**__
   which isn't within a nested group.
** The name of any <<variable-assignment,variable>> known
   at{nbsp}__**L**__.

+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the items of the
chosen macro).

* A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.
--

. The `)` parameter value list suffix.

====
Input:

----
!macro bake()
  {le} {ICITTE * 8 : 16}
  u16le"predict explode"
!end

"hello [" m:bake() "] world"

m:bake() * 5
----

Output:

----
68 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
00 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
00 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
65 00 78 00 70 00 6c 00 6f 00 64 00 65 00       ┆ e•x•p•l•o•d•e•
----
====

====
Input:

----
!macro A(val, is_be)
  {le}

  !if is_be
    {be}
  !end

  {val : 16}
!end

!macro B(rep, is_be)
  {iter = 1}

  !repeat rep
    m:A({iter * 3}, is_be)
    {iter = iter + 1}
  !end
!end

m:B(5, 1)
m:B(3, 0)
----

Output:

----
00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
----
====
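
One way to picture the state rules above is to think of each macro as a {py3} function: its parameters and variables are local, and nothing leaks back to the caller except the generated bytes. The sketch below replays the last example under that analogy. The `expand_a()` and `expand_b()` names are made up for the illustration and aren't part of Normand.

```python
def expand_a(val: int, is_be: bool) -> bytes:
    # {le}, then {be} if `is_be` is true; this byte order choice stays
    # local to the expansion, like a local variable.
    byte_order = "big" if is_be else "little"

    # {val : 16}
    return val.to_bytes(2, byte_order)


def expand_b(rep: int, is_be: bool) -> bytes:
    out = bytearray()
    iter_ = 1                                 # {iter = 1}: known only here

    for _ in range(rep):                      # !repeat rep
        out += expand_a(iter_ * 3, is_be)     # m:A({iter * 3}, is_be)
        iter_ += 1                            # {iter = iter + 1}

    return bytes(out)


data = expand_b(5, True) + expand_b(3, False)   # m:B(5, 1) m:B(3, 0)
print(data.hex(" "))
```

The second call starts again with `iter_` equal to 1, just as the second `m:B()` expansion starts with a fresh `iter` variable.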

=== Post-item repetition

A _post-item repetition_ represents the bytes of an item repeated a
given number of times.

A post-item repetition is:

. One of these items:

** A <<byte-constant,byte constant>>.
** A <<literal-string,literal string>>.
** A <<fixed-length-number,fixed-length number>>.
** An <<leb128-integer,LEB128 integer>>.
** A <<macro-expansion,macro expansion>>.
** A <<group,group>>.

. The ``pass:[*]`` character.

. One of:

** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
   which is the number of times to repeat the previous item.

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a post-item repetition at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group and
  which isn't part of the repeated item.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__ which isn't part of the repeated item.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the item to
repeat).

** A valid {py3} name.
+
For the name `__NAME__`, this is equivalent to the
`pass:[{]__NAME__pass:[}]` form above.

You may also use a <<repetition-block,repetition block>>. The form
``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.

====
Input:

----
{end - ICITTE - 1 : 8} * 0x100 <end>
----

Output:

----
ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
----
====

====
Input:

----
{times = 1}
aa bb cc dd
(
  <here>
  (ee ff) * {here + 1}
  11 22 33 * {times}
  {times = times + 1}
) * 3
"coucou!"
----

Output:

----
aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
33 33 63 6f 75 63 6f 75 21                      ┆ 33coucou!
----
====
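
The descending byte sequence of the `{end - ICITTE - 1 : 8} * 0x100` example comes from the item's expression being evaluated again for each repetition, with `ICITTE` one byte further along each time. A minimal {py3} sketch of that behaviour follows. It assumes an initial offset of zero and precomputes the value of the `end` label, whereas how Normand resolves it isn't shown here.

```python
count = 0x100

# Each repetition emits a single byte, so <end> sits at offset 256.
end = count

data = bytearray()

for _ in range(count):
    # {end - ICITTE - 1 : 8}, with len(data) standing in for ICITTE
    data += (end - len(data) - 1).to_bytes(1, "big")

print(data[:4].hex(" "), "...", data[-4:].hex(" "))
```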

== Command-line tool

If you <<install-normand,installed>> the `normand` package, then you
can use the `normand` command-line tool:

----
$ normand <<< '"ma gang de malades"' | hexdump -C
----

----
00000000  6d 61 20 67 61 6e 67 20  64 65 20 6d 61 6c 61 64  |ma gang de malad|
00000010  65 73                                             |es|
----

If you copy the `normand.py` module to your own project, then you can
run the module itself:

----
$ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
----

----
00000000  6d 61 20 67 61 6e 67 20  64 65 20 6d 61 6c 61 64  |ma gang de malad|
00000010  65 73                                             |es|
----

Without a path argument, the `normand` tool reads from the standard
input.

The `normand` tool prints the generated binary data to the standard
output.

Various options control the initial <<state,state>> of the processor:
use the `--help` option to learn more.

== {py3} API

The whole `normand` package/module public API is:

[source,python]
----
# Byte order.
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


# Text location.
class TextLocation:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


# Parsing error.
class ParseError(RuntimeError):
    # Source text location.
    @property
    def text_loc(self) -> TextLocation:
        ...


# Variables dictionary type (for type hints).
VariablesT = typing.Dict[str, typing.Union[int, float]]


# Labels dictionary type (for type hints).
LabelsT = typing.Dict[str, int]


# Parsing result.
class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> VariablesT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> LabelsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


# Parses the `normand` input using the initial state defined by
# `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
# and returns the corresponding parsing result.
def parse(normand: str,
          init_variables: typing.Optional[VariablesT] = None,
          init_labels: typing.Optional[LabelsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
----

The `normand` parameter is the actual <<learn-normand,Normand input>>
while the other parameters control the initial <<state,state>>.

The `parse()` function raises a `ParseError` instance should it fail to
parse the `normand` string for any reason.
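
As an illustration of how these pieces may fit together, the following sketch calls `parse()` with an initial variable and byte order, then prints the generated data or the error location. It only relies on the names listed above. The `hex_dump()` helper, the `n` variable, and the input string are made up for the example, and the guard around the import is there only so that the snippet also runs where `normand` isn't installed.

```python
import importlib.util


def hex_dump(data: bytes) -> str:
    """Formats bytes as space-separated lowercase hexadecimal pairs."""
    return " ".join("{:02x}".format(byte) for byte in data)


# Only call the real API when the `normand` module is importable.
if importlib.util.find_spec("normand") is not None:
    import normand

    try:
        # `n` comes from `init_variables`; the fixed-length number uses
        # the initial byte order given by `init_byte_order`.
        res = normand.parse('{n : 16} "yo" <end>',
                            init_variables={"n": 0x1234},
                            init_byte_order=normand.ByteOrder.LE)
    except normand.ParseError as exc:
        # Same LINE:COL layout as the expected messages of the tests.
        print("{}:{} - {}".format(exc.text_loc.line_no,
                                  exc.text_loc.col_no, exc))
    else:
        print(hex_dump(res.data))
        print(res.offset, res.labels, res.variables)
```

Passing the state through `init_variables` and `init_byte_order` keeps the Normand input free of setup items, which can be convenient when generating many inputs from the same template.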

== Development

Normand is a https://python-poetry.org/[Poetry] project.

To develop it, install it through Poetry and enter the virtual
environment:

----
$ poetry install
$ poetry shell
$ normand <<< '"lol" * 10 0a'
----

`normand.py` is processed by:

* https://microsoft.github.io/pyright/[Pyright]
* https://github.com/psf/black[Black]
* https://pycqa.github.io/isort/[isort]

=== Testing

Use https://docs.pytest.org/[pytest] to test Normand once the package is
part of your virtual environment, for example:

----
$ poetry install
$ poetry run pip3 install pytest
$ poetry run pytest
----

The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.

In the `tests` directory, each `*.nt` file is a test. The file name
prefix indicates what it's meant to test:

`pass-`::
    Everything above the `---` line is the valid Normand input
    to test.
+
Everything below the `---` line is the expected data
(whitespace-separated hexadecimal bytes).

`fail-`::
    Everything above the `---` line is the invalid Normand input
    to test.
+
Everything below the `---` line is the expected error message having
this form:
+
----
LINE:COL - MESSAGE
----
1994=== Contributing
1995
1996Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
1997for code review.
1998
1999To report a bug, https://github.com/efficios/normand/issues/new[create a
2000GitHub issue].