1// Show ToC at a specific location for a GitHub rendering
2ifdef::env-github[]
3:toc: macro
4endif::env-github[]
5
6ifndef::env-github[]
:toc: left
8endif::env-github[]
9
10// This is to mimic what GitHub does so that anchors work in an offline
11// rendering too.
12:idprefix:
:idseparator: -

// Other attributes
16:py3: Python{nbsp}3
17
18= Normand
19Philippe Proulx
20
21image::normand-logo.png[]
22
23[.normal]
24image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]
25
26[.lead]
27_**Normand**_ is a text-to-binary processor with its own language.
28
29This package offers both a portable {py3} module and a command-line
30tool.
31
WARNING: This version of Normand is 0.19, meaning both the Normand
language and the module/CLI interface aren't stable.
34
35ifdef::env-github[]
36// ToC location for a GitHub rendering
37toc::[]
38endif::env-github[]
39
40== Introduction
41
42The purpose of Normand is to consume human-readable text representing
43bytes and to produce the corresponding binary data.
44
45.Simple bytes input.
46====
47Consider the following Normand input:
48
49----
504f 55 32 bb $167 fe %10100111 a9 $-32
51----
52
53The generated nine bytes are:
54
55----
564f 55 32 bb a7 fe a7 a9 e0
57----
58====
59
As the previous example shows, the fundamental unit of the Normand
language is the _byte_. The order in which you list bytes is the order
of the generated data.
63
64The Normand language is more than simple lists of bytes, though. Its
65main features are:
66
67Comments, including a bunch of insignificant symbols which may improve readability::
68+
69Input:
70+
71----
72ff bb %1101:0010 # This is a comment
7378 29 af $192 # This too # 99 $-80
74fe80::6257:18ff:fea3:4229
7560:57:18:a3:42:29
7610839636-5d65-4a68-8e6a-21608ddf7258
77----
78+
79Output:
80+
81----
82ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
83a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
8468 8e 6a 21 60 8d df 72 58
85----
86
87Hexadecimal, decimal, and binary byte constants::
88+
89Input:
90+
91----
92aa bb $247 $-89 %0011_0010 %11.01= 10/10
93----
94+
95Output:
96+
97----
98aa bb f7 a7 32 da
99----
100
Strings::
102+
103Input:
104+
105----
106"hello world!" 00
107u16le"stress\nverdict 🤣"
s:latin3{hex(ICITTE)}
109----
110+
111Output:
112+
113----
11468 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
11500 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
11600 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd 30 ┆ •d•i•c•t• •>•#•0
11778 32 66 ┆ x2f
118----
119
120Labels: special variables holding the offset where they're defined::
121+
122----
123<beg> b2 52 e3 bc 91 05
124$100 $50 <chair> 33 9f fe
12525 e9 89 8a <end>
126----
127
128Variables::
129+
130----
1315e 65 {tower = 47} c6 7f f2 c4
13244 {hurl = tower - 14} b5 {tower = hurl} 26 2d
133----
134+
135The value of a variable assignment is the evaluation of a valid {py3}
136expression which may include label and variable names.
137
Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
139+
140Input:
141+
142----
143{strength = 4}
144{be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
145{le} {-1993 : 32}
{-3.141593 : 64}
147----
148+
149Output:
150+
151----
15267 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
15309 c0
154----
155+
The encoded number is the evaluation of a valid {py3} expression which
may include label and variable names.
158
159https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
160+
161Input:
162+
163----
164aa bb cc {-1993 : sleb128} <meow> dd ee ff
165{meow * 199 : uleb128}
166----
167+
168Output:
169+
170----
171aa bb cc b7 70 dd ee ff e3 07
172----
173+
174The encoded integer is the evaluation of a valid {py3} expression which
175may include label and variable names.
176
177Conditional::
178+
179Input:
180+
181----
182aa bb cc
183
184(
185 "foo"
186
187 !if {ICITTE > 10}
188 "bar"
189 !else
190 "fight"
191 !end
192) * 4
193----
194+
195Output:
196+
197----
198aa bb cc 66 6f 6f 66 69 67 68 74 66 6f 6f 66 69 ┆ •••foofightfoofi
19967 68 74 66 6f 6f 62 61 72 66 6f 6f 62 61 72 ┆ ghtfoobarfoobar
200----
201
202Repetition::
203+
204Input:
205+
206----
aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
208
209!repeat 3
210 ff ee "juice"
211!end
212----
213+
214Output:
215+
216----
217aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
21800 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
21979 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
22065 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
22161 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
22268 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
224ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
2256a 75 69 63 65 ┆ juice
226----
227
228Alignment::
229+
230Input:
231+
232----
233{be}
234
235 {199:32}
236@64 {43:64}
237@16 {-123:16}
238@32~255 {5584:32}
239----
240+
241Output:
242+
243----
24400 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
245ff 85 ff ff 00 00 15 d0
----

Filling::
249+
250Input:
251+
252----
253{le}
254{0xdeadbeef:32}
255{-1993:16}
256{9:16}
257+0x40
258{ICITTE:8}
259"meow mix"
+200~FFh
261{ICITTE:8}
262----
263+
264Output:
265+
266----
267ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7•••••••••••
26800 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
26900 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
27000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
27140 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix•••••••
272ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
273ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
274ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
275ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
276ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
277ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
278ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
279ff ff ff ff ff ff ff ff c8 ┆ •••••••••
280----
281
282Multilevel grouping::
283+
284Input:
285+
286----
287ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
288----
289+
290Output:
291+
292----
293ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
294bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
2956f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
2966d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
297aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
2987a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
2996f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
300----
301
302Macros::
303+
304Input:
305+
306----
307!macro hello(world)
308 "hello"
309 !if world " world" !end
310!end
311
312!repeat 17
313 ff ff ff ff
314 m:hello({ICITTE > 15 and ICITTE < 60})
315!end
316----
317+
318Output:
319+
320----
321ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
3226c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
32364 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
324ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
325ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
3266f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
3276c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
32868 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
329ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
330ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
3316c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
332----
333
334Precise error reporting::
335+
336----
337/tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
338----
339+
340----
341/tmp/meow.normand:32:6 - Unexpected character `k`.
342----
343+
344----
/tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
346----
347+
348----
349/tmp/meow.normand:32:19 - While expanding the macro `meow`:
350/tmp/meow.normand:35:5 - While expanding the macro `zzz`:
/tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
352----
353
You can use Normand to track data source files in your favorite VCS
instead of raw binary files. You can use the binary files that Normand
generates to test file format decoding, including with malformed data,
as well as for education.
358
359See <<learn-normand>> to explore all the Normand features.
360
361== Install Normand
362
363Normand requires Python ≥ 3.4.
364
365To install Normand:
366
367----
368$ python3 -m pip install --user normand
369----
370
371See
372https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
373to learn more about a user site installation.
374
375[NOTE]
376====
377Normand has a single module file, `normand.py`, which you can copy as is
to your project to use it (both the <<python3-api,`normand.parse()`>>
379function and the <<command-line-tool,command-line tool>>).
380
381`normand.py` has _no external dependencies_, but if you're using
382Python{nbsp}3.4, you'll need a local copy of the standard `typing`
383module.
384====
385
386== Design goals
387
388The design goals of Normand are:
389
390Portability::
391 We're making sure `normand.py` works with Python{nbsp}≥{nbsp}3.4 and
392 doesn't have any external dependencies so that you may just copy the
393 module as is to your own project.
394
395Ease of use::
396 The most basic Normand input is a sequence of hexadecimal constants
397 (for example, `4e6f726d616e64`) which produce exactly what you'd
398 expect.
399+
400Most Normand features map to programming language concepts you already
401know and understand: constant integers, literal strings, variables,
402conditionals, repetitions/loops, and the rest.
403
404Concise and readable input::
405 We could have chosen XML or YAML as the input format, but having a
406 DSL here makes a Normand input compact and easy to read, two
407 important traits when using Normand to write tests, for example.
408+
409Compare the following Normand input and some hypothetical XML
410equivalent, for example:
411+
.Actual Normand input.
413----
414ff dd 01 ab $192 $-128 %1101:0011
415
416{end:8}
417
418{iter = 1}
419
420!if {not something}
421 # five times because xyz
422 !repeat 5
423 "hello world " {iter:8}
424 {iter = iter + 1}
425 !end
426!end
427
428<end>
429----
430+
431.Hypothetical Normand XML input.
432[source,xml]
433----
434<?xml version="1.0" encoding="utf-8" ?>
435<group>
436 <byte base="x" val="ff" />
437 <byte base="x" val="dd" />
438 <byte base="x" val="1" />
439 <byte base="x" val="ab" />
440 <byte base="d" val="192" />
441 <byte base="d" val="-128" />
442 <byte base="b" val="11010011" />
443 <fixed-len-num expr="end" len="8" />
444 <var-assign name="iter" expr="1" />
445 <cond expr="not something">
446 <!-- five times because xyz -->
447 <repeat expr="5">
448 <str>hello world </str>
449 <fixed-len-num expr="iter" len="8" />
450 <var-assign name="iter" expr="iter + 1" />
451 </repeat>
452 </cond>
453 <label name="end" />
454</group>
455----
456
457== Learn Normand
458
459A Normand text input is a sequence of items which represent a sequence
460of raw bytes.
461
462[[state]] During the processing of items to data, Normand relies on a
463current state:
464
465[%header%autowidth]
466|===
|State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>
468
469|[[cur-offset]] Current offset
470|
The current offset has an effect on the value of <<label,labels>> and of
the special `ICITTE` name in <<fixed-length-number,fixed-length
number>>, <<leb128-integer,LEB128 integer>>, <<string,string>>,
<<filling,filling>>, <<variable-assignment,variable assignment>>,
<<conditional-block,conditional block>>, <<repetition-block,repetition
block>>, <<macro-expansion,macro expansion>>, and
<<post-item-repetition,post-item repetition>> expression evaluation.
478
479Each generated byte increments the current offset.
480
481A <<current-offset-setting,current offset setting>> may change the
482current offset without generating data.
483
A <<current-offset-alignment,current offset alignment>> generates
padding bytes to make the current offset satisfy a given alignment.
486|`init_offset` parameter of the `parse()` function.
487|`--offset` option.
488
489|[[cur-bo]] Current byte order
490|
The current byte order has an effect on the encoding of
<<fixed-length-number,fixed-length numbers>>.
493
494A <<current-byte-order-setting,current byte order setting>> may change
495the current byte order.
496|`init_byte_order` parameter of the `parse()` function.
497|`--byte-order` option.
498
499|<<label,Labels>>
500|Mapping of label names to integral values.
501|`init_labels` parameter of the `parse()` function.
502|One or more `--label` options.
503
504|<<variable-assignment,Variables>>
|Mapping of variable names to integral or floating point number values.
|`init_variables` parameter of the `parse()` function.
|One or more `--var` or `--var-str` options.
508|===
509
510The available items are:
511
* A <<byte-constant,byte constant>> representing one or more
  constant bytes.

* A <<literal-string,literal string>> representing a constant sequence
  of bytes encoding UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 data.
517
518* A <<current-byte-order-setting,current byte order setting>> (big or
519 little endian).
520
521* A <<fixed-length-number,fixed-length number>> (integer or
522 floating point) using the <<cur-bo,current byte order>> and of which
523 the value is the result of a {py3} expression.
524
* An <<leb128-integer,LEB128 integer>> of which the value is the result
  of a {py3} expression.

* A <<string,string>> representing a sequence of bytes encoding UTF-8,
  UTF-16, UTF-32, or Latin-1 to Latin-10 data, and of which the value is
  the result of a {py3} expression.
531
532* A <<current-offset-setting,current offset setting>>.
533
534* A <<current-offset-alignment,current offset alignment>>.
535
536* A <<filling,filling>>.
537
538* A <<label,label>>, that is, a named constant holding the current
539 offset.
540+
541This is similar to an assembly label.
542
543* A <<variable-assignment,variable assignment>> associating a name to
544 the integral result of an evaluated {py3} expression.
545
546* A <<group,group>>, that is, a scoped sequence of items.
547
548* A <<conditional-block,conditional block>>.
549
550* A <<repetition-block,repetition block>>.
551
552* A <<macro-definition-block,macro definition block>>.
553
554* A <<macro-expansion,macro expansion>>.
555
Moreover, you can repeat many of the items above a constant or variable
number of times with the ``pass:[*]`` operator _after_ the item to
repeat. This is called a <<post-item-repetition,post-item repetition>>.
559
560A Normand comment may exist:
561
562* Between items, possibly within a group.
563* Between the nibbles of a constant hexadecimal byte.
564* Between the bits of a constant binary byte.
565* Between the last item and the ``pass:[*]`` character of a post-item
566 repetition, and between that ``pass:[*]`` character and the following
567 number or expression.
568* Between the ``!repeat``/``!r`` block opening and the following
569 constant integer, name, or expression of a repetition block.
570* Between the ``!if`` block opening and the following name or expression
571 of a conditional block.
572
A comment is anything between two ``pass:[#]`` characters on the same
line, or from ``pass:[#]`` until the end of the line. Whitespace and
the following symbol characters also count as comments where a
comment may exist:
577
578----
/ \ ? & : ; . , [ ] _ = | -
580----
581
582The latter serve to improve readability so that you may write, for
583example, a MAC address or a UUID as is.
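
As a rough illustration of the rule above (a {py3} sketch, not Normand's actual parser), the hypothetical `hex_bytes()` helper below removes whitespace and the insignificant symbols from text made only of hexadecimal byte constants, then decodes what remains. It deliberately ignores `#` comments and every other kind of item.

```python
import re

# Whitespace and the symbols which count as comments, per the list above.
_INSIGNIFICANT = re.compile(r"[\s/\\?&:;.,\[\]_=|-]+")


def hex_bytes(text: str) -> bytes:
    """Decode text made only of hexadecimal byte constants (hypothetical helper)."""
    return bytes.fromhex(_INSIGNIFICANT.sub("", text))


# A MAC address and an IPv6 address written as is
print(hex_bytes("60:57:18:a3:42:29").hex())
print(hex_bytes("fe80::6257:18ff:fea3:4229").hex())
```

The sketch only works because every remaining character is a nibble; a real parser must keep `-` significant in contexts such as `$-80`.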
584
[[const-int]] Many items require a _constant integer_, which may start
with `-` to make it negative. A positive constant integer is any of:

Decimal::
    One or more digits (`0` to `9`).

Hexadecimal::
    One of:
+
* The `0x` or `0X` prefix followed by one or more hexadecimal digits
  (`0` to `9`, `a` to `f`, or `A` to `F`).
* One or more hexadecimal digits followed by the `h` or `H` suffix.

Octal::
    One of:
+
* The `0o` or `0O` prefix followed by one or more octal digits
  (`0` to `7`).
* One or more octal digits followed by the `o`, `O`, `q`, or `Q`
  suffix.

Binary::
    One of:
+
* The `0b` or `0B` prefix followed by one or more bits (`0` or `1`).
* One or more bits followed by the `b` or `B` suffix.
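
The constant integer forms above can be sketched in {py3} as follows. This is only an illustration under assumptions: `parse_const_int()` and `_FORMS` are hypothetical names, the helper accepts no whitespace, and Normand's real parser may well work differently.

```python
import re

# (pattern, base) pairs for the positive forms; group 1 always holds
# the digits.
_FORMS = [
    (re.compile(r"([0-9]+)"), 10),
    (re.compile(r"0[xX]([0-9a-fA-F]+)"), 16),
    (re.compile(r"([0-9a-fA-F]+)[hH]"), 16),
    (re.compile(r"0[oO]([0-7]+)"), 8),
    (re.compile(r"([0-7]+)[oOqQ]"), 8),
    (re.compile(r"0[bB]([01]+)"), 2),
    (re.compile(r"([01]+)[bB]"), 2),
]


def parse_const_int(text: str) -> int:
    """Convert a constant integer to an `int` (hypothetical helper)."""
    negative = text.startswith("-")
    digits = text[1:] if negative else text

    for pattern, base in _FORMS:
        m = pattern.fullmatch(digits)

        if m:
            value = int(m.group(1), base)
            return -value if negative else value

    raise ValueError("not a constant integer: {!r}".format(text))


print(parse_const_int("0x40"), parse_const_int("FFh"), parse_const_int("-17q"))
```

Because each pattern must match the whole text, the forms don't overlap: `0b101` can only be binary, whereas `0bh` can only be hexadecimal.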
612
613You can test the examples of this section with the `normand`
614<<command-line-tool,command-line tool>> as such:
615
616----
617$ normand file | hexdump -C
618----
619
620where `file` is the name of a file containing the Normand input.
621
622=== Byte constant
623
A _byte constant_ represents one or more constant bytes.
625
626A byte constant is:
627
628Hexadecimal form::
    Two consecutive hexadecimal digits representing a single byte.
630
631Decimal form::
    One or more digits after the `$` prefix representing a single byte.

Binary form:: {empty}
635+
636--
637. __**N**__ `%` prefixes (at least one).
638+
639The number of `%` characters is the number of subsequent expected bytes.
640
641. __**N**__{nbsp}×{nbsp}8 bits (`0` or `1`).
642--
643
644====
645Input:
646
647----
648ab cd [3d 8F] CC
649----
650
651Output:
652
653----
654ab cd 3d 8f cc
655----
656====
657
658====
659Input:
660
661----
662$192 %1100/0011 $ -77
663----
664
665Output:
666
667----
668c0 c3 b3
669----
670====
671
672====
673Input:
674
675----
67658f64689-6316-4d55-8a1a-04cada366172
677fe80::6257:18ff:fea3:4229
678----
679
680Output:
681
682----
68358 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
684fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
685----
686====
687
688====
689Input:
690
691----
692%01110011 %01100001 %01101100 %01110101 %01110100
%%%1101:0010 11111111 #A#11 #B#00 #C#011 #D#1
694----
695
696Output:
697
698----
73 61 6c 75 74 d2 ff c7 ┆ salut•••
700----
701====
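
To make the value-to-byte mapping of the decimal and binary forms concrete, here's a small {py3} sketch. The names `decimal_byte()` and `binary_bytes()` are hypothetical, and the accepted decimal range (-128 to 255) is an assumption drawn from the examples of this document, not a statement about Normand's implementation.

```python
def decimal_byte(value: int) -> bytes:
    """Encode the value of a `$` byte constant as a single byte."""
    # Assumption: the accepted range is -128 to 255.
    if not -128 <= value <= 255:
        raise ValueError("{} doesn't fit in one byte".format(value))

    # A negative value wraps to its two's complement representation.
    return bytes([value & 0xFF])


def binary_bytes(bits: str) -> bytes:
    """Encode a string of N x 8 bits (`0` or `1`) as N bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("expecting a multiple of 8 bits")

    return int(bits, 2).to_bytes(len(bits) // 8, "big")


# $192 %1100/0011 $ -77
data = decimal_byte(192) + binary_bytes("11000011") + decimal_byte(-77)
print(data.hex())
```

The last line models the second example above and prints `c0c3b3`.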
702
703=== Literal string
704
705A _literal string_ represents the encoded bytes of a literal string
706using the UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 encoding.
707
708The string to encode isn't implicitly null-terminated: use `\0` at the
709end of the string to add a null character.
710
711A literal string is:
712
713. **Optional**: one of the following encodings instead of the default
714 UTF-8:
715+
716--
717[horizontal]
718`s:u8`::
719`u8`::
720 UTF-8.
721
722`s:u16be`::
723`u16be`::
724 UTF-16BE.
725
726`s:u16le`::
727`u16le`::
728 UTF-16LE.
729
730`s:u32be`::
731`u32be`::
732 UTF-32BE.
733
734`s:u32le`::
735`u32le`::
736 UTF-32LE.
737
738`s:latin1`::
739 ISO/IEC 8859-1.
740
741`s:latin2`::
742 ISO/IEC 8859-2.
743
744`s:latin3`::
745 ISO/IEC 8859-3.
746
747`s:latin4`::
748 ISO/IEC 8859-4.
749
750`s:latin5`::
751 ISO/IEC 8859-9.
752
753`s:latin6`::
754 ISO/IEC 8859-10.
755
756`s:latin7`::
757 ISO/IEC 8859-13.
758
759`s:latin8`::
760 ISO/IEC 8859-14.
761
762`s:latin9`::
763 ISO/IEC 8859-15.
764
765`s:latin10`::
766 ISO/IEC 8859-16.
767--
768
769. The ``pass:["]`` prefix.
770
771. A sequence of zero or more characters, possibly containing escape
772 sequences.
773+
774An escape sequence is the ``\`` character followed by one of:
775+
776--
777[horizontal]
778`0`:: Null (U+0000)
779`a`:: Alert (U+0007)
780`b`:: Backspace (U+0008)
781`e`:: Escape (U+001B)
782`f`:: Form feed (U+000C)
783`n`:: End of line (U+000A)
784`r`:: Carriage return (U+000D)
785`t`:: Character tabulation (U+0009)
786`v`:: Line tabulation (U+000B)
787``\``:: Reverse solidus (U+005C)
788``pass:["]``:: Quotation mark (U+0022)
789--
790
791. The ``pass:["]`` suffix.
792
793====
794Input:
795
796----
797"coucou tout le monde!"
798----
799
800Output:
801
802----
80363 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
8046f 6e 64 65 21 ┆ onde!
805----
806====
807
808====
809Input:
810
811----
812u16le"I am not young enough to know everything."
813----
814
815Output:
816
817----
81849 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
81920 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
8206e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
82120 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
82265 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
8232e 00 ┆ .•
824----
825====
826
827====
828Input:
829
830----
s:u32be "\"illusion is the first\nof all pleasures\" 🦉"
832----
833
834Output:
835
836----
83700 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
83800 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
83900 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
84000 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
84100 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
84200 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
84300 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
84400 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
84500 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
84600 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
84700 00 00 20 00 01 f9 89 ┆ ••• ••••
848----
849====
850
851====
852Input:
853
854----
855s:latin1 "Paul Piché"
856----
857
858Output:
859
860----
86150 61 75 6c 20 50 69 63 68 e9 ┆ Paul Pich•
862----
863====
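
The encoding names above map naturally to standard {py3} codecs. The following sketch shows one possible mapping; the `CODECS` table and the `encode_literal()` helper are hypothetical, and the codec names are this sketch's choice rather than what Normand necessarily uses.

```python
# Normand encoding name (without the `s:` prefix) to Python codec name
CODECS = {
    "u8": "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}

# `latin1` to `latin10` are ISO/IEC 8859 parts 1, 2, 3, 4, 9, 10, 13,
# 14, 15, and 16, in this order.
for index, part in enumerate([1, 2, 3, 4, 9, 10, 13, 14, 15, 16], start=1):
    CODECS["latin{}".format(index)] = "iso8859_{}".format(part)


def encode_literal(text: str, encoding: str = "u8") -> bytes:
    """Encode `text` without adding any null terminator."""
    return text.encode(CODECS[encoding])


print(encode_literal("Paul Piché", "latin1").hex())
```

Note that `latin5` is ISO/IEC 8859-9, not 8859-5: the Latin-N numbering and the ISO/IEC 8859 part numbers diverge after Latin-4.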
864
865=== Current byte order setting
866
867This special item sets the <<cur-bo,_current byte order_>>.
868
869The two accepted forms are:
870
871[horizontal]
872``pass:[{be}]``:: Set the current byte order to big endian.
873``pass:[{le}]``:: Set the current byte order to little endian.
874
=== Fixed-length number

A _fixed-length number_ represents a fixed number of bytes encoding
either:
879
880* An unsigned or signed integer (two's complement).
881+
882The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.
883
* A floating point number
  (https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
+
The available lengths are 32 (_binary32_) and 64 (_binary64_).

The value is the result of evaluating a {py3} expression, encoded using
the <<cur-bo,current byte order>>.
891
892A fixed-length number is:
893
894. The ``pass:[{]`` prefix.
895
896. A valid {py3} expression.
+
For a fixed-length number at some source location{nbsp}__**L**__, this
899expression may contain the name of any accessible <<label,label>> (not
900within a nested group), including the name of a label defined
901after{nbsp}__**L**__, as well as the name of any
902<<variable-assignment,variable>> known at{nbsp}__**L**__.
903+
904The value of the special name `ICITTE` (`int` type) in this expression
905is the <<cur-offset,current offset>> (before encoding the number).
906
907. The `:` character.
908
909. An encoding length in bits amongst:
910+
911--
The expression evaluates to an `int` or `bool` value::
    `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
914+
915NOTE: Normand automatically converts a `bool` value to `int`.
916
917The expression evaluates to a `float` value::
918 `32` and `64`.
919--
920
921. The `}` suffix.
922
923====
924Input:
925
926----
927{le} {345:16}
928{be} {-0xabcd:32}
929----
930
931Output:
932
933----
93459 01 ff ff 54 33
935----
936====
937
938====
939Input:
940
941----
942{be}
943
944# String length in bits
945{8 * (str_end - str_beg) : 16}
946
947# String
948<str_beg>
949 "hello world!"
950<str_end>
951----
952
953Output:
954
955----
95600 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
957----
958====
959
960====
961Input:
962
963----
964{20 - ICITTE : 8} * 10
965----
966
967Output:
968
969----
97014 13 12 11 10 0f 0e 0d 0c 0b
971----
972====
973
974====
975Input:
976
977----
978{le}
979{2 * 0.0529 : 32}
980----
981
982Output:
983
984----
985ac ad d8 3d
986----
987====
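
The encoding rules of this section can be approximated with the {py3} standard library alone. The sketch below is an illustration, not Normand's code: `encode_fixed()` is a hypothetical helper, and it assumes that a negative integer uses the signed range while any other integer uses the unsigned range.

```python
import struct


def encode_fixed(value, length: int, byte_order: str) -> bytes:
    """Encode `value` on `length` bits; `byte_order` is "big" or "little"."""
    if isinstance(value, float):
        # binary32 or binary64 (IEEE 754); any other length is an error
        fmt = {32: "f", 64: "d"}[length]
        return struct.pack((">" if byte_order == "big" else "<") + fmt, value)

    if length not in range(8, 65, 8):
        raise ValueError("unsupported length: {}".format(length))

    # `int()` also converts a `bool` value. A negative value uses the
    # two's complement representation; `to_bytes()` raises
    # `OverflowError` when the value doesn't fit.
    return int(value).to_bytes(length // 8, byte_order, signed=value < 0)


print(encode_fixed(345, 16, "little").hex())    # {le} {345:16}
print(encode_fixed(-0xabcd, 32, "big").hex())   # {be} {-0xabcd:32}
```

The two calls model the first example of this section and print `5901` and `ffff5433`.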
988
989=== LEB128 integer
990
991An _LEB128 integer_ represents a variable number of bytes encoding an
992unsigned or signed integer which is the result of evaluating a {py3}
993expression following the https://en.wikipedia.org/wiki/LEB128[LEB128]
994format.
995
996An LEB128 integer is:
997
998. The ``pass:[{]`` prefix.
999
1000. A valid {py3} expression of which the evaluation result type
1001 is `int` or `bool` (automatically converted to `int`).
1002+
1003For an LEB128 integer at some source location{nbsp}__**L**__, this
1004expression may contain:
1005+
1006--
1007* The name of any <<label,label>> defined before{nbsp}__**L**__
1008 which isn't within a nested group.
1009* The name of any <<variable-assignment,variable>> known
1010 at{nbsp}__**L**__.
1011--
1012+
1013The value of the special name `ICITTE` (`int` type) in this expression
1014is the <<cur-offset,current offset>> (before encoding the integer).
1015
1016. The `:` character.
1017
1018. One of:
1019+
1020--
1021[horizontal]
1022`uleb128`:: Use the unsigned LEB128 format.
1023`sleb128`:: Use the signed LEB128 format.
1024--
1025
1026. The `}` suffix.
1027
1028====
1029Input:
1030
1031----
1032{624485 : uleb128}
1033----
1034
1035Output:
1036
1037----
1038e5 8e 26
1039----
1040====
1041
1042====
1043Input:
1044
1045----
1046aa bb cc dd
1047<meow>
1048ee ff
1049{-981238311 + (meow * -23) : sleb128}
1050"hello"
1051----
1052
1053Output:
1054
1055----
1056aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
1057----
1058====
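
For readers unfamiliar with LEB128, here's a compact {py3} sketch of both variants. It follows the usual description of the format (7 value bits per byte, with the high bit set on every byte but the last); the function names are hypothetical and this isn't taken from Normand's source.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 requires a non-negative value")

    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7

        if value == 0:
            out.append(byte)
            return bytes(out)

        # More groups follow: set the continuation bit
        out.append(byte | 0x80)


def sleb128(value: int) -> bytes:
    """Encode an integer as signed LEB128."""
    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign

        # Stop once the remaining bits only repeat the sign bit (0x40)
        # of the current group.
        if (value == 0 and not byte & 0x40) or (value == -1 and byte & 0x40):
            out.append(byte)
            return bytes(out)

        out.append(byte | 0x80)


print(uleb128(624485).hex())                # {624485 : uleb128}
print(sleb128(-981238311 + 4 * -23).hex())  # second example, `meow` is 4
```

The two calls print `e58e26` and `fdfa8dac7c`, which match the LEB128 bytes of the two examples above.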
1059
1060=== String
1061
1062A _string_ represents a variable number of bytes encoding a string which
1063is the result of evaluating a {py3} expression using the UTF-8, UTF-16,
1064UTF-32, or Latin-1 to Latin-10 encoding.
1065
1066A string has two possible forms:
1067
1068Encoding prefix form:: {empty}
1069+
1070. An encoding amongst:
1071+
1072--
1073[horizontal]
1074`s:u8`::
1075`u8`::
1076 UTF-8.
1077
1078`s:u16be`::
1079`u16be`::
1080 UTF-16BE.
1081
1082`s:u16le`::
1083`u16le`::
1084 UTF-16LE.
1085
1086`s:u32be`::
1087`u32be`::
1088 UTF-32BE.
1089
1090`s:u32le`::
1091`u32le`::
1092 UTF-32LE.
1093
1094`s:latin1`::
1095 ISO/IEC 8859-1.
1096
1097`s:latin2`::
1098 ISO/IEC 8859-2.
1099
1100`s:latin3`::
1101 ISO/IEC 8859-3.
1102
1103`s:latin4`::
1104 ISO/IEC 8859-4.
1105
1106`s:latin5`::
1107 ISO/IEC 8859-9.
1108
1109`s:latin6`::
1110 ISO/IEC 8859-10.
1111
1112`s:latin7`::
1113 ISO/IEC 8859-13.
1114
1115`s:latin8`::
1116 ISO/IEC 8859-14.
1117
1118`s:latin9`::
1119 ISO/IEC 8859-15.
1120
1121`s:latin10`::
1122 ISO/IEC 8859-16.
1123--
1124
1125. The ``pass:[{]`` prefix.
1126
1127. A valid {py3} expression of which the evaluation result type
1128 is `bool`, `int`, `float`, or `str` (the first three automatically
1129 converted to `str`).
1130+
1131For a string at some source location{nbsp}__**L**__, this expression may
1132contain:
1133+
1134--
1135* The name of any <<label,label>> defined before{nbsp}__**L**__
1136 which isn't within a nested group.
1137* The name of any <<variable-assignment,variable>> known
1138 at{nbsp}__**L**__.
1139--
1140+
1141The value of the special name `ICITTE` (`int` type) in this expression
1142is the <<cur-offset,current offset>> (before encoding the string).
1143
1144. The `}` suffix.
1145
1146Encoding suffix form:: {empty}
1147+
1148. The ``pass:[{]`` prefix.
1149
1150. A valid {py3} expression of which the evaluation result type
1151 is `bool`, `int`, `float`, or `str` (the first three automatically
1152 converted to `str`).
1153+
1154For a string at some source location{nbsp}__**L**__, this expression may
1155contain:
1156+
1157--
1158* The name of any <<label,label>> defined before{nbsp}__**L**__
1159 which isn't within a nested group.
1160* The name of any <<variable-assignment,variable>> known
1161 at{nbsp}__**L**__.
1162--
1163+
1164The value of the special name `ICITTE` (`int` type) in this expression
1165is the <<cur-offset,current offset>> (before encoding the string).
1166
1167. The `:` character.
1168
1169. A string encoding amongst:
1170+
1171--
1172[horizontal]
1173`s:u8`::
1174 UTF-8.
1175
1176`s:u16be`::
1177 UTF-16BE.
1178
1179`s:u16le`::
1180 UTF-16LE.
1181
1182`s:u32be`::
1183 UTF-32BE.
1184
1185`s:u32le`::
1186 UTF-32LE.
1187
1188`s:latin1`::
1189 ISO/IEC 8859-1.
1190
1191`s:latin2`::
1192 ISO/IEC 8859-2.
1193
1194`s:latin3`::
1195 ISO/IEC 8859-3.
1196
1197`s:latin4`::
1198 ISO/IEC 8859-4.
1199
1200`s:latin5`::
1201 ISO/IEC 8859-9.
1202
1203`s:latin6`::
1204 ISO/IEC 8859-10.
1205
1206`s:latin7`::
1207 ISO/IEC 8859-13.
1208
1209`s:latin8`::
1210 ISO/IEC 8859-14.
1211
1212`s:latin9`::
1213 ISO/IEC 8859-15.
1214
1215`s:latin10`::
1216 ISO/IEC 8859-16.
1217--
1218
1219. The `}` suffix.
1220
1221====
1222Input:
1223
1224----
1225{iter = 1}
1226
1227!repeat 10
1228 {iter : s:u8} " "
1229 {iter = iter + 1}
1230!end
1231----
1232
1233Output:
1234
1235----
123631 20 32 20 33 20 34 20 35 20 36 20 37 20 38 20 ┆ 1 2 3 4 5 6 7 8
123739 20 31 30 20 ┆ 9 10
1238----
1239====
1240
1241====
1242Input:
1243
1244----
1245{meow = 'salut jérémie'}
1246{meow.upper() : s:latin1}
1247----
1248
1249Output:
1250
1251----
125253 41 4c 55 54 20 4a c9 52 c9 4d 49 45 ┆ SALUT J•R•MIE
1253----
1254====
1255
1256=== Current offset setting
1257
1258This special item sets the <<cur-offset,_current offset_>>.
1259
1260A current offset setting is:
1261
1262. The `<` prefix.
1263
1264. A <<const-int,positive constant integer>> which is the new current
1265 offset.
1266
1267. The `>` suffix.
1268
1269====
1270Input:
1271
1272----
1273 {ICITTE : 8} * 8
1274<0x61> {ICITTE : 8} * 8
1275----
1276
1277Output:
1278
1279----
128000 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
1281----
1282====
1283
1284====
1285Input:
1286
1287----
1288aa bb cc dd <meow> ee ff
1289<12> 11 22 33 <mix> 44 55
1290{meow : 8} {mix : 8}
1291----
1292
1293Output:
1294
1295----
1296aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
1297----
1298====
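The key point is that a current offset setting changes the offset without producing any byte. The sketch below models the first example with a hypothetical `Generator` class (not part of Normand) which tracks the current offset separately from the generated data, assuming an initial offset of zero.

```python
class Generator:
    # Minimal model: generated bytes plus an independent current offset.
    def __init__(self):
        self.data = bytearray()
        self.offset = 0

    def emit(self, chunk):
        # Appending bytes advances the current offset by their count.
        self.data += chunk
        self.offset += len(chunk)

    def set_offset(self, new_offset):
        # Models `<N>`: only the current offset changes, no bytes appear.
        self.offset = new_offset


gen = Generator()

for _ in range(8):
    gen.emit(bytes([gen.offset]))  # {ICITTE : 8}

gen.set_offset(0x61)  # <0x61>

for _ in range(8):
    gen.emit(bytes([gen.offset]))  # {ICITTE : 8}

print(bytes(gen.data))
```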
1299
1300=== Current offset alignment
1301
A _current offset alignment_ represents zero or more padding bytes to
1303make the <<cur-offset,current offset>> meet a given
1304https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.
1305
1306More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
1307a current offset alignment represents the required padding bytes until
1308the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.
1309
1310A current offset alignment is:
1311
1312. The `@` prefix.
1313
1314. A <<const-int,positive constant integer>> which is the alignment value
1315 in _bits_.
1316+
1317This value must be greater than zero and a multiple of{nbsp}8.
1318
1319. **Optional**:
1320+
1321--
1322. The ``pass:[~]`` prefix.
1323. A <<const-int,positive constant integer>> which is the value of the
1324 byte to use as padding to align the <<cur-offset,current offset>>.
1325--
1326+
1327Without this section, the padding byte value is zero.
1328
1329====
1330Input:
1331
1332----
133311 22 (@32 aa bb cc) * 3
1334----
1335
1336Output:
1337
1338----
133911 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
1340----
1341====
1342
1343====
1344Input:
1345
1346----
1347{le}
134877 88
1349@32~0xcc {-893.5:32}
1350@128~0x55 "meow"
1351----
1352
1353Output:
1354
1355----
135677 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
13576d 65 6f 77 ┆ meow
1358----
1359====
1360
1361====
1362Input:
1363
1364----
1365aa bb cc <29> @64~255 "zoom"
1366----
1367
1368Output:
1369
1370----
1371aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
1372----
1373====
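The padding size follows directly from the rule above: convert the alignment value from bits to bytes, then pad up to the next multiple. A minimal sketch in plain Python, where `alignment_padding()` is a hypothetical helper and not a Normand function:

```python
def alignment_padding(offset, align_bits, pad_value=0):
    # The alignment value is in bits: positive and a multiple of 8.
    if align_bits <= 0 or align_bits % 8 != 0:
        raise ValueError("alignment must be a positive multiple of 8")

    align_bytes = align_bits // 8

    # Number of bytes needed to reach the next multiple of `align_bytes`.
    count = -offset % align_bytes
    return bytes([pad_value]) * count


print(alignment_padding(2, 32).hex(" "))        # after `11 22`
print(alignment_padding(29, 64, 255).hex(" "))  # after `aa bb cc <29>`
```

An offset which is already aligned needs no padding, so the helper returns an empty `bytes` object in that case.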
1374
1375=== Filling
1376
1377A _filling_ represents zero or more padding bytes to make the
1378<<cur-offset,current offset>> reach a given value.
1379
1380A filling is:
1381
1382. The ``pass:[+]`` prefix.
1383
1384. One of:
1385
1386** A <<const-int,positive constant integer>> which is the current offset
1387 target.
1388
1389** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1390 evaluation result type is `int` or `bool` (automatically converted to
1391 `int`), and the ``pass:[}]`` suffix.
1392+
1393For a filling at some source location{nbsp}__**L**__, this expression
1394may contain:
1395+
1396--
1397* The name of any <<label,label>> defined before{nbsp}__**L**__
1398 which isn't within a nested group.
1399* The name of any <<variable-assignment,variable>> known
1400 at{nbsp}__**L**__.
1401--
1402+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the filling).
1406
1407** A valid {py3} name.
1408+
1409For the name `__NAME__`, this is equivalent to the
1410`pass:[{]__NAME__pass:[}]` form above.
1411
1412+
1413This value must be greater than or equal to the current offset where
1414it's used.
1415
1416. **Optional**:
1417+
1418--
1419. The ``pass:[~]`` prefix.
1420. A <<const-int,positive constant integer>> which is the value of the
1421 byte to use as padding to reach the current offset target.
1422--
1423+
1424Without this section, the padding byte value is zero.
1425
1426====
1427Input:
1428
1429----
1430aa bb cc dd
1431+0x40
1432"hello world"
1433----
1434
1435Output:
1436
1437----
1438aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
143900 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
144000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
144100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
144268 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
1443----
1444====
1445
1446====
1447Input:
1448
1449----
1450!macro part(iter, fill)
1451 <0> "particular security " {ord('0') + iter : 8} +fill~0x80
1452!end
1453
1454{iter = 1}
1455
1456!repeat 5
1457 m:part(iter, {32 + 4 * iter})
1458 {iter = iter + 1}
1459!end
1460----
1461
1462Output:
1463
1464----
146570 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur
146669 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1•••••••••••
146780 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s
146865 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2•••••••
146980 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part
147069 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security
147133 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3•••••••••••••••
147280 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
147361 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4•••
147480 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
147580 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
147661 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5•••
147780 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
147880 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
1479----
1480====
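Unlike an alignment, a filling targets an absolute offset, so the padding size is a plain subtraction. The sketch below checks the sizes used in both examples; `filling()` is a hypothetical helper and not a Normand function.

```python
def filling(offset, target, pad_value=0):
    # The target must be greater than or equal to the current offset.
    if target < offset:
        raise ValueError(
            "target {} is before current offset {}".format(target, offset)
        )

    return bytes([pad_value]) * (target - offset)


# First example: `aa bb cc dd` (4 bytes), then `+0x40`.
print(len(filling(4, 0x40)))

# Second example, first round: 21 bytes, then `+fill~0x80` with `fill`
# being 32 + 4 * 1.
print(filling(21, 32 + 4 * 1, 0x80).hex(" "))
```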
1481
1482=== Label
1483
1484A _label_ associates a name to the <<cur-offset,current offset>>.
1485
1486All the labels of a whole Normand input must have unique names.
1487
A label must not share the name of a <<variable-assignment,variable>>.
1490
1491A label is:
1492
1493. The `<` prefix.
1494
. A valid {py3} name which is not `ICITTE`.
1496
1497. The `>` suffix.
1498
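The naming rules above can be summarized as a small validation routine. This is only a sketch: `check_label_name()` is a hypothetical helper, and approximating "valid {py3} name" with `str.isidentifier()` plus a keyword check is an assumption, not necessarily what Normand does.

```python
import keyword


def check_label_name(name, labels, variables):
    # "Valid Python 3 name" is approximated with `str.isidentifier()`
    # plus a keyword check (an assumption of this sketch).
    if not name.isidentifier() or keyword.iskeyword(name):
        raise ValueError("`{}` isn't a valid name".format(name))

    if name == "ICITTE":
        raise ValueError("`ICITTE` is reserved")

    if name in labels:
        raise ValueError("duplicate label `{}`".format(name))

    if name in variables:
        raise ValueError("`{}` is already a variable".format(name))


labels = {"meow": 4}
variables = {"mix": 101}
check_label_name("end", labels, variables)  # accepted: returns None
```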
1499=== Variable assignment
1500
A _variable assignment_ associates a name to the result of an
evaluated {py3} expression.
1503
A variable assignment is:
1505
1506. The ``pass:[{]`` prefix.
1507
. A valid {py3} name which is not `ICITTE`.
1509
1510. The `=` character.
1511
. A valid {py3} expression of which the evaluation result type is `int`,
  `float`, `bool` (automatically converted to `int`), or `str`.
1514+
1515For a variable assignment at some source location{nbsp}__**L**__, this
1516expression may contain:
1517+
1518--
1519* The name of any <<label,label>> defined before{nbsp}__**L**__
1520 which isn't within a nested group.
1521* The name of any <<variable-assignment,variable>> known
1522 at{nbsp}__**L**__.
1523--
+
1525The value of the special name `ICITTE` (`int` type) in this expression
1526is the <<cur-offset,current offset>>.
1527
1528. The `}` suffix.
1529
1530====
1531Input:
1532
1533----
1534{mix = 101} {le}
1535{meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
1536"yooo" {meow + mix : 16}
1537----
1538
1539Output:
1540
1541----
154211 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
1543----
1544====
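To follow the arithmetic of the example above, it can help to picture the expression as evaluated against a namespace made of the visible labels, the known variables, and `ICITTE`. The sketch below does so with the built-in `eval()`. The `evaluate()` helper is hypothetical and only illustrates the documented semantics, not how Normand evaluates expressions internally.

```python
def evaluate(expr, variables, labels, offset):
    # Names visible to the expression: labels, variables, and `ICITTE`.
    names = dict(labels)
    names.update(variables)
    names["ICITTE"] = offset
    result = eval(expr, {}, names)

    # `bool` results are converted to `int`.
    if isinstance(result, bool):
        result = int(result)

    return result


variables = {"mix": 101, "meow": 42}

# `{meow = ICITTE + 17}` after `11 22 {meow:8} 33` (current offset 4).
variables["meow"] = evaluate("ICITTE + 17", variables, {}, 4)

# `{meow + mix : 16}` in little-endian byte order, after `"yooo"`.
encoded = evaluate("meow + mix", variables, {}, 8).to_bytes(2, "little")
print(variables["meow"], encoded.hex(" "))
```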
1545
1546=== Group
1547
1548A _group_ is a scoped sequence of items.
1549
1550The <<label,labels>> within a group aren't visible outside of it.
1551
1552The main purpose of a group is to <<post-item-repetition,repeat>> more
1553than a single item and to isolate labels.
1554
1555A group is:
1556
. The `(`, `!group`, or `!g` opening.
1558
1559. Zero or more items.
1560
1561. Depending on the group opening:
1562+
1563--
1564`(`::
1565 The `)` closing.
1566
1567`!group`::
1568`!g`::
1569 The `!end` closing.
1570--
1571
1572====
1573Input:
1574
1575----
1576((aa bb cc) dd () ee) "leclerc"
1577----
1578
1579Output:
1580
1581----
1582aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
1583----
1584====
1585
1586====
1587Input:
1588
----
!group
  (aa bb cc) * 3 dd ee
!end * 5
----
1594
1595Output:
1596
1597----
1598aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
1599cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
1600ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
1601bb cc aa bb cc dd ee
1602----
1603====
1604
1605====
1606Input:
1607
1608----
1609{be}
1610(
1611 <str_beg> u16le"sébastien diaz" <str_end>
1612 {ICITTE - str_beg : 8}
1613 {(end - str_beg) * 5 : 24}
1614) * 3
1615<end>
1616----
1617
1618Output:
1619
1620----
162173 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
16226e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
162373 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
16246e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
162573 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
16266e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
1627----
1628====
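To see where the last four bytes of each repetition come from in the last example, the following sketch recomputes it by hand in plain Python. It assumes an initial offset of zero and mirrors the Normand items in comments; it doesn't use Normand itself.

```python
text = "sébastien diaz".encode("utf-16-le")  # u16le"sébastien diaz"
group_size = len(text) + 1 + 3  # string + 8-bit number + 24-bit number
end = 3 * group_size            # offset of the `<end>` label

out = bytearray()

for _ in range(3):
    str_beg = len(out)  # `<str_beg>`: scoped to this group instance
    out += text

    # {ICITTE - str_beg : 8}
    out += (len(out) - str_beg).to_bytes(1, "big")

    # {(end - str_beg) * 5 : 24} with the `{be}` byte order
    out += ((end - str_beg) * 5).to_bytes(3, "big")

print(out[28:32].hex(" "), out[60:64].hex(" "), out[92:96].hex(" "))
```

Because `str_beg` is local to each group instance, its value differs for each repetition, whereas `end` stays the same.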
1629
1630=== Conditional block
1631
1632A _conditional block_ represents either the bytes of zero or more items
1633if some expression is true, or the bytes of zero or more other items if
1634it's false.
1635
1636A conditional block is:
1637
. The `!if` opening.
1639
1640. One of:
1641
1642** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1643 evaluation result type is `int` or `bool` (automatically converted to
1644 `int`), and the ``pass:[}]`` suffix.
1645+
1646For a conditional block at some source location{nbsp}__**L**__, this
1647expression may contain:
1648+
1649--
1650* The name of any <<label,label>> defined before{nbsp}__**L**__
1651 which isn't within a nested group.
1652* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
1654--
1655+
1656The value of the special name `ICITTE` (`int` type) in this expression
1657is the <<cur-offset,current offset>> (before handling the contained
1658items).
1659
1660** A valid {py3} name.
1661+
1662For the name `__NAME__`, this is equivalent to the
1663`pass:[{]__NAME__pass:[}]` form above.
1664
1665. Zero or more items to be handled when the condition is true.
1666
1667. **Optional**:
1668
1669.. The `!else` opening.
.. Zero or more items to be handled when the condition is false.

. The `!end` closing.
1673
1674====
1675Input:
1676
----
{at = 1}
{rep_count = 9}

!repeat rep_count
  "meow "

  !if {ICITTE > 25}
    "mix"
  !else
    "zoom"
  !end

  !if {at < rep_count} 20 !end

  {at = at + 1}
!end
----
1695
1696Output:
1697
----
6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 6f 77 20 7a ┆ meow zoom meow z
6f 6f 6d 20 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 ┆ oom meow zoom me
6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 78 20 ┆ ow mix meow mix
6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 ┆ meow mix meow mi
78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
6d 69 78                                        ┆ mix
----
1706====
1707
1708====
1709Input:
1710
1711----
1712<str_beg>
1713u16le"meow mix!"
1714<str_end>
1715
1716!if {str_end - str_beg > 10}
1717 " BIG"
1718!end
1719----
1720
1721Output:
1722
1723----
17246d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
172521 00 20 42 49 47 ┆ !• BIG
1726----
1727====
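The first example switches from `zoom` to `mix` because `ICITTE` is evaluated when the condition is handled, that is, right after `"meow "`. The same logic written as a plain Python loop (a sketch assuming an initial offset of zero, so that the current offset equals the length of the generated data):

```python
rep_count = 9
out = bytearray()

for at in range(1, rep_count + 1):
    out += b"meow "

    # `!if {ICITTE > 25}`: the current offset here equals len(out).
    out += b"mix" if len(out) > 25 else b"zoom"

    # `!if {at < rep_count} 20 !end`: a space after all but the last round.
    if at < rep_count:
        out += b"\x20"

print(out.decode())
```

During the third round, the current offset is exactly 25 when the condition is evaluated, which isn't greater than 25: the first `mix` only appears during the fourth round.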
1728
=== Repetition block

A _repetition block_ represents the bytes of zero or more items repeated
a given number of times.

A repetition block is:

. The `!repeat` or `!r` opening.

. One of:

** A <<const-int,positive constant integer>> which is the number of
   times to repeat the contained items.

** The ``pass:[{]`` prefix, a valid {py3} expression of which the
   evaluation result type is `int` or `bool` (automatically converted to
   `int`), and the ``pass:[}]`` suffix.
+
For a repetition block at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the items to
repeat).
1760
1761** A valid {py3} name.
1762+
1763For the name `__NAME__`, this is equivalent to the
1764`pass:[{]__NAME__pass:[}]` form above.
1765
1766. Zero or more items.
1767
. The `!end` closing.
1769
1770You may also use a <<post-item-repetition,post-item repetition>> after
1771some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
1772is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.
1773
1774====
1775Input:
1776
----
!repeat 0o400
  {end - ICITTE - 1 : 8}
!end

<end>
----
1784
1785Output:
1786
1787----
1788ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1789ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1790df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1791cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1792bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1793af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
17949f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
17958f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
17967f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
17976f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
17985f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
17994f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
18003f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
18012f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
18021f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
18030f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1804----
1805====
1806
1807====
1808Input:
1809
----
{times = 1}

aa bb cc dd

!repeat 3
  <here>

  !repeat {here + 1}
    ee ff
  !end

  11 22 !repeat times 33 !end

  {times = times + 1}
!end

"coucou!"
----
1829
1830Output:
1831
1832----
1833aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
183433 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1835ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1836ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1837ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1838ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1839ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1840ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1841ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1842ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1843ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
184433 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1845----
1846====
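The size of the last output may look surprising at first: the inner repetition count depends on `here`, which grows with each outer round. The following sketch replays the example in plain Python to make the growth visible. It assumes an initial offset of zero and doesn't use Normand itself.

```python
out = bytearray.fromhex("aa bb cc dd")
times = 1

for _ in range(3):  # !repeat 3
    here = len(out)  # <here>

    # !repeat {here + 1} ee ff !end: the count is evaluated once,
    # before handling the items to repeat.
    out += b"\xee\xff" * (here + 1)

    out += b"\x11\x22"
    out += b"\x33" * times  # !repeat times 33 !end
    times += 1              # {times = times + 1}

out += b"coucou!"
print(len(out))
```

The values of `here` are 4, 17, and 57, so the inner block repeats 5, 18, and 58 times.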
1847
1848=== Macro definition block
1849
1850A _macro definition block_ associates a name and parameter names to
1851a group of items.
1852
1853A macro definition block doesn't lead to generated bytes itself: a
1854<<macro-expansion,macro expansion>> does so.
1855
1856A macro definition may only exist at the root level, that is, not within
1857a <<group,group>>, a <<repetition-block,repetition block>>, a
1858<<conditional-block,conditional block>>, or another
1859<<macro-definition-block,macro definition block>>.
1860
1861All macro definitions must have unique names.
1862
1863A macro definition is:
1864
1865. The `!macro` or `!m` opening.
1866
1867. A valid {py3} name (the macro name).
1868
1869. The `(` parameter name list prefix.
1870
1871. A comma-separated list of zero or more unique parameter names,
1872 each one being a valid {py3} name.
1873
1874. The `)` parameter name list suffix.
1875
1876. Zero or more items except, recursively, a macro definition block.
1877
1878. The `!end` closing.
1879
1880====
1881----
1882!macro bake()
1883 {le} {ICITTE * 8 : 16}
1884 u16le"predict explode"
1885!end
1886----
1887====
1888
1889====
1890----
1891!macro nail(rep, with_extra, val)
1892 {iter = 1}
1893
1894 !repeat rep
1895 {val + iter : uleb128}
1896 {0xdeadbeef : 32}
1897 {iter = iter + 1}
1898 !end
1899
1900 !if with_extra
1901 "meow mix\0"
1902 !end
1903!end
1904----
1905====
1906
1907=== Macro expansion
1908
1909A _macro expansion_ expands the items of a defined
1910<<macro-definition-block,macro>>.
1911
1912The macro to expand must be defined _before_ the expansion.
1913
1914The <<state,state>> before handling the first item of the chosen macro
1915is:
1916
1917<<cur-offset,Current offset>>::
1918 Unchanged.
1919
1920<<cur-bo,Current byte order>>::
1921 Unchanged.
1922
1923Variables::
1924 The only available variables initially are the macro parameters.
1925
1926Labels::
1927 None.
1928
1929The state after having handled the last item of the chosen macro is:
1930
1931Current offset::
1932 The one before handling the first item of the macro plus the size
1933 of the generated data of the macro expansion.
1934+
1935IMPORTANT: This means <<current-offset-setting,current offset setting>>
1936items within the expanded macro don't impact the final current offset.
1937
1938Current byte order::
1939 The one before handling the first item of the macro.
1940
1941Variables::
1942 The ones before handling the first item of the macro.
1943
1944Labels::
1945 The ones before handling the first item of the macro.
1946
1947A macro expansion is:
1948
1949. The `m:` prefix.
1950
1951. A valid {py3} name (the name of the macro to expand).
1952
1953. The `(` parameter value list prefix.
1954
. A comma-separated list of zero or more parameter values.
1956+
1957The number of parameter values must match the number of parameter
1958names of the definition of the chosen macro.
1959+
1960A parameter value is one of:
1961+
1962--
* A <<const-int,constant integer>>, possibly negative.

* A constant floating point number.
1966
1967* The ``pass:[{]`` prefix, a valid {py3} expression of which the
1968 evaluation result type is `int` or `bool` (automatically converted to
1969 `int`), and the ``pass:[}]`` suffix.
1970+
1971For a macro expansion at some source location{nbsp}__**L**__, this
1972expression may contain:
1973
1974** The name of any <<label,label>> defined before{nbsp}__**L**__
1975 which isn't within a nested group.
1976** The name of any <<variable-assignment,variable>> known
1977 at{nbsp}__**L**__.
1978
1979+
1980The value of the special name `ICITTE` (`int` type) in this expression
1981is the <<cur-offset,current offset>> (before handling the items of the
1982chosen macro).
1983
1984* A valid {py3} name.
1985+
1986For the name `__NAME__`, this is equivalent to the
1987`pass:[{]__NAME__pass:[}]` form above.
1988--
1989
1990. The `)` parameter value list suffix.
1991
1992====
1993Input:
1994
1995----
1996!macro bake()
1997 {le} {ICITTE * 8 : 16}
1998 u16le"predict explode"
1999!end
2000
2001"hello [" m:bake() "] world"
2002
2003m:bake() * 5
2004----
2005
2006Output:
2007
2008----
200968 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
201000 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
201100 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
201270 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
201365 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
201470 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
201565 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
201670 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
201765 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
201870 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
201965 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
202070 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
202165 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
2022----
2023====
2024
2025====
2026Input:
2027
2028----
2029!macro A(val, is_be)
2030 {le}
2031
2032 !if is_be
2033 {be}
2034 !end
2035
2036 {val : 16}
2037!end
2038
2039!macro B(rep, is_be)
2040 {iter = 1}
2041
2042 !repeat rep
2043 m:A({iter * 3}, is_be)
2044 {iter = iter + 1}
2045 !end
2046!end
2047
2048m:B(5, 1)
2049m:B(3, 0)
2050----
2051
2052Output:
2053
2054----
205500 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
2056----
2057====
2058
2059====
2060Input:
2061
2062----
2063!macro flt32be(val) {be} {val : 32} !end
2064
2065"CHEETOS"
2066m:flt32be(-42.17)
2067m:flt32be(56.23e-4)
2068----
2069
2070Output:
2071
2072----
207343 48 45 45 54 4f 53 c2 28 ae 14 3b b8 41 25 ┆ CHEETOS•(••;•A%
2074----
2075====
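The state rules of a macro expansion are easier to remember as code: the macro body works on a private copy of the state, and only the current offset of the outer state moves afterwards. The sketch below models this with a hypothetical `State` class and `expand_macro()` function. These names, and the use of a Python callable as the macro body, are assumptions made for illustration and don't reflect the internals of Normand.

```python
import typing


class State:
    def __init__(self, offset=0, byte_order=None, variables=None, labels=None):
        self.offset = offset
        self.byte_order = byte_order
        self.variables = dict(variables or {})
        self.labels = dict(labels or {})


def expand_macro(state: State, params: typing.Dict[str, typing.Any],
                 body: typing.Callable[[State], bytes]) -> bytes:
    # Inner state: same offset and byte order, only the parameters as
    # variables, and no labels.
    inner = State(state.offset, state.byte_order, params, {})
    data = body(inner)

    # Outer state: only the offset moves, by the size of the generated data.
    state.offset += len(data)
    return data


def body(inner: State) -> bytes:
    inner.byte_order = "be"     # like `{be}` within the macro
    inner.offset = 0            # like `<0>` within the macro
    inner.variables["tmp"] = 1  # like `{tmp = 1}` within the macro
    return inner.variables["val"].to_bytes(2, "big")


outer = State(offset=10, byte_order="le", variables={"mix": 101})
data = expand_macro(outer, {"val": 3}, body)
print(data.hex(" "), outer.offset, outer.byte_order, outer.variables)
```

After the expansion, the outer byte order and variables are untouched, and the outer offset advanced by two bytes despite the offset setting within the body.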
2076
2077=== Post-item repetition
2078
2079A _post-item repetition_ represents the bytes of an item repeated a
2080given number of times.
2081
2082A post-item repetition is:
2083
. One of these items:

** A <<byte-constant,byte constant>>.
** A <<literal-string,literal string>>.
** A <<fixed-length-number,fixed-length number>>.
** An <<leb128-integer,LEB128 integer>>.
** A <<string,string>>.
** A <<macro-expansion,macro expansion>>.
** A <<group,group>>.
2093
2094. The ``pass:[*]`` character.
2095
2096. One of:
2097
2098** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
2099 which is the number of times to repeat the previous item.
2100
2101** The ``pass:[{]`` prefix, a valid {py3} expression of which the
2102 evaluation result type is `int` or `bool` (automatically converted to
2103 `int`), and the ``pass:[}]`` suffix.
+
2105For a post-item repetition at some source location{nbsp}__**L**__, this
2106expression may contain:
2107+
2108--
* The name of any <<label,label>> defined before{nbsp}__**L**__
  which isn't within a nested group and
  which isn't part of the repeated item.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__ which isn't part of the repeated item.
2115--
2116+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the item to
repeat).
2120
2121** A valid {py3} name.
2122+
2123For the name `__NAME__`, this is equivalent to the
2124`pass:[{]__NAME__pass:[}]` form above.
2125
2126You may also use a <<repetition-block,repetition block>>. The form
2127``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
2128``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.
2129
2130====
2131Input:
2132
2133----
2134{end - ICITTE - 1 : 8} * 0x100 <end>
2135----
2136
2137Output:
2138
2139----
2140ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
2141ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
2142df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
2143cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
2144bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
2145af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
21469f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
21478f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
21487f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
21496f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
21505f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
21514f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
21523f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
21532f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
21541f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
21550f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
2156----
2157====
2158
2159====
2160Input:
2161
2162----
2163{times = 1}
2164aa bb cc dd
2165(
2166 <here>
2167 (ee ff) * {here + 1}
2168 11 22 33 * {times}
2169 {times = times + 1}
2170) * 3
2171"coucou!"
2172----
2173
2174Output:
2175
2176----
2177aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
217833 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
2179ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2180ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
2181ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2182ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2183ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2184ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2185ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2186ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2187ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
218833 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
2189----
2190====
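In the first example, the repeated item is evaluated again for each repetition, so `ICITTE` grows by one each time while `end` stays fixed. A plain Python sketch of the same computation, assuming an initial offset of zero:

```python
count = 0x100
end = count  # offset of `<end>`: one byte per repetition

out = bytearray()

for _ in range(count):
    icitte = len(out)             # current offset before this repetition
    out.append(end - icitte - 1)  # {end - ICITTE - 1 : 8}

print(out[:4].hex(" "), out[-4:].hex(" "))
```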
2191
2192== Command-line tool
2193
2194If you <<install-normand,installed>> the `normand` package, then you
2195can use the `normand` command-line tool:
2196
2197----
2198$ normand <<< '"ma gang de malades"' | hexdump -C
2199----
2200
2201----
220200000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
220300000010 65 73 |es|
2204----
2205
2206If you copy the `normand.py` module to your own project, then you can
2207run the module itself:
2208
2209----
2210$ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
2211----
2212
2213----
221400000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
221500000010 65 73 |es|
2216----
2217
2218Without a path argument, the `normand` tool reads from the standard
2219input.
2220
2221The `normand` tool prints the generated binary data to the standard
2222output.
2223
2224Various options control the initial <<state,state>> of the processor:
2225use the `--help` option to learn more.
2226
2227== {py3} API
2228
The whole `normand` package/module public API is:

[source,python]
----
# Byte order.
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


# Text location.
class TextLocation:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


# Parsing error message.
class ParseErrorMessage:
    # Message text.
    @property
    def text(self):
        ...

    # Source text location.
    @property
    def text_location(self):
        ...


# Parsing error.
class ParseError(RuntimeError):
    # Parsing error messages.
    #
    # The first message is the most _specific_ one.
    @property
    def messages(self):
        ...


# Variables dictionary type (for type hints).
VariablesT = typing.Dict[str, typing.Union[int, float]]


# Labels dictionary type (for type hints).
LabelsT = typing.Dict[str, int]


# Parsing result.
class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> VariablesT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> LabelsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


# Parses the `normand` input using the initial state defined by
# `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
# and returns the corresponding parsing result.
def parse(normand: str,
          init_variables: typing.Optional[VariablesT] = None,
          init_labels: typing.Optional[LabelsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
----
2324
2325The `normand` parameter is the actual <<learn-normand,Normand input>>
2326while the other parameters control the initial <<state,state>>.
2327
2328The `parse()` function raises a `ParseError` instance should it fail to
2329parse the `normand` string for any reason.
2330
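As a usage sketch of the API above, the following shows one possible way to call `parse()` and report errors. The `hex_dump()`, `describe_error()`, and `main()` helpers are hypothetical names; `main()` relies only on the names listed above and needs the installed `normand` package, so it's defined here but not called.

```python
def hex_dump(data):
    # Whitespace-separated hexadecimal bytes, as in the outputs above.
    return " ".join("{:02x}".format(byte) for byte in data)


def describe_error(exc):
    # One `LINE:COL - MESSAGE` string per message; `exc` is expected to
    # offer the `ParseError` interface listed above (duck typing).
    return [
        "{}:{} - {}".format(
            msg.text_location.line_no, msg.text_location.col_no, msg.text
        )
        for msg in exc.messages
    ]


def main():
    # Imported here so that the helpers above work without the package.
    import normand

    try:
        res = normand.parse('"meow" {le} {0x1234 : 16}', init_offset=0x10)
    except normand.ParseError as exc:
        print("\n".join(describe_error(exc)))
    else:
        print(hex_dump(res.data), "(final offset: {})".format(res.offset))
```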
2331== Development
2332
2333Normand is a https://python-poetry.org/[Poetry] project.
2334
2335To develop it, install it through Poetry and enter the virtual
2336environment:
2337
2338----
2339$ poetry install
2340$ poetry shell
2341$ normand <<< '"lol" * 10 0a'
2342----
2343
2344`normand.py` is processed by:
2345
2346* https://microsoft.github.io/pyright/[Pyright]
2347* https://github.com/psf/black[Black]
2348* https://pycqa.github.io/isort/[isort]
2349
2350=== Testing
2351
2352Use https://docs.pytest.org/[pytest] to test Normand once the package is
2353part of your virtual environment, for example:
2354
2355----
2356$ poetry install
2357$ poetry run pip3 install pytest
2358$ poetry run pytest
2359----
2360
The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.
2364
2365In the `tests` directory, each `*.nt` file is a test. The file name
2366prefix indicates what it's meant to test:
2367
2368`pass-`::
2369 Everything above the `---` line is the valid Normand input
2370 to test.
2371+
2372Everything below the `---` line is the expected data
2373(whitespace-separated hexadecimal bytes).
2374
2375`fail-`::
2376 Everything above the `---` line is the invalid Normand input
2377 to test.
2378+
2379Everything below the `---` line is the expected error message having
2380this form:
2381+
2382----
2383LINE:COL - MESSAGE
2384----
2385
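To make the `*.nt` layout concrete, here's a sketch of how such a file could be split and how the expected data of a `pass-` test could be decoded. The `split_test_file()` and `expected_data()` helpers are hypothetical: the real test harness may recognize the `---` separator differently.

```python
def split_test_file(text):
    # Split at the first line which is exactly `---` (an assumption about
    # how the separator is recognized).
    lines = text.splitlines(keepends=True)
    index = [line.rstrip("\r\n") for line in lines].index("---")
    return "".join(lines[:index]), "".join(lines[index + 1:])


def expected_data(text):
    # `pass-` files: whitespace-separated hexadecimal bytes.
    return bytes(int(token, 16) for token in text.split())


normand_input, expected = split_test_file(
    '"lol" * 2 0a\n---\n6c 6f 6c 6c 6f 6c\n0a\n'
)
print(repr(normand_input), expected_data(expected))
```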
2386=== Contributing
2387
2388Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
2389for code review.
2390
2391To report a bug, https://github.com/efficios/normand/issues/new[create a
2392GitHub issue].