1 // Show ToC at a specific location for a GitHub rendering
2 ifdef::env-github[]
3 :toc: macro
4 endif::env-github[]
5
6 ifndef::env-github[]
7 :toc: left
8 endif::env-github[]
9
10 // This is to mimic what GitHub does so that anchors work in an offline
11 // rendering too.
12 :idprefix:
13 :idseparator: -
14
15 // Other attributes
16 :py3: Python{nbsp}3
17
18 = Normand
19 Philippe Proulx
20
21 image::normand-logo.png[]
22
23 [.normal]
24 image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]
25
26 [.lead]
27 _**Normand**_ is a text-to-binary processor with its own language.
28
29 This package offers both a portable {py3} module and a command-line
30 tool.
31
32 WARNING: This version of Normand is 0.20, meaning both the Normand
33 language and the module/CLI interface aren't stable.
34
35 ifdef::env-github[]
36 // ToC location for a GitHub rendering
37 toc::[]
38 endif::env-github[]
39
40 == Introduction
41
42 The purpose of Normand is to consume human-readable text representing
43 bytes and to produce the corresponding binary data.
44
45 .Simple bytes input.
46 ====
47 Consider the following Normand input:
48
49 ----
50 4f 55 32 bb $167 fe %10100111 a9 $-32
51 ----
52
53 The generated nine bytes are:
54
55 ----
56 4f 55 32 bb a7 fe a7 a9 e0
57 ----
58 ====
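As a rough illustration of how the decimal and binary constants above turn into byte values, here's a minimal {py3} sketch. It isn't Normand's implementation: `decimal_byte()` is a hypothetical helper, and the accepted range of -128 to 255 is an assumption based on the examples of this document.

```python
def decimal_byte(value: int) -> int:
    """Return the unsigned byte value of a decimal byte constant ($167, $-32)."""
    # Assumption: a decimal byte constant accepts -128 to 255.
    if not -128 <= value <= 255:
        raise ValueError("{} doesn't fit in a single byte".format(value))

    # Masking keeps the low eight bits, which is the two's complement
    # encoding of a negative value.
    return value & 0xFF


# Equivalent of: 4f 55 32 bb $167 fe %10100111 a9 $-32
generated = bytes([
    0x4F, 0x55, 0x32, 0xBB,
    decimal_byte(167),
    0xFE,
    int("10100111", 2),
    0xA9,
    decimal_byte(-32),
])

print(generated.hex())
```

Both `$167` and `%10100111` yield the byte `a7`, and `$-32` yields `e0`, which matches the nine bytes above.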
59
As the previous example shows, the fundamental unit of the Normand
language is the _byte_. Normand generates bytes in the order in which
you list them.
63
64 The Normand language is more than simple lists of bytes, though. Its
65 main features are:
66
67 Comments, including a bunch of insignificant symbols which may improve readability::
68 +
69 Input:
70 +
71 ----
72 ff bb %1101:0010 # This is a comment
73 78 29 af $192 # This too # 99 $-80
74 fe80::6257:18ff:fea3:4229
75 60:57:18:a3:42:29
76 10839636-5d65-4a68-8e6a-21608ddf7258
77 ----
78 +
79 Output:
80 +
81 ----
82 ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
83 a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
84 68 8e 6a 21 60 8d df 72 58
85 ----
86
87 Hexadecimal, decimal, and binary byte constants::
88 +
89 Input:
90 +
91 ----
92 aa bb $247 $-89 %0011_0010 %11.01= 10/10
93 ----
94 +
95 Output:
96 +
97 ----
98 aa bb f7 a7 32 da
99 ----
100
101 Strings::
102 +
103 Input:
104 +
105 ----
106 "hello world!" 00
107 u16le"stress\nverdict 🤣"
108 s:latin3{hex(ICITTE)}
109 ----
110 +
111 Output:
112 +
113 ----
114 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
115 00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
116 00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd 30 ┆ •d•i•c•t• •>•#•0
117 78 32 66 ┆ x2f
118 ----
119
120 Labels: special variables holding the offset where they're defined::
121 +
122 ----
123 <beg> b2 52 e3 bc 91 05
124 $100 $50 <chair> 33 9f fe
125 25 e9 89 8a <end>
126 ----
127
128 Variables::
129 +
130 ----
131 5e 65 {tower = 47} c6 7f f2 c4
132 44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
133 ----
134 +
135 The value of a variable assignment is the evaluation of a valid {py3}
136 expression which may include label and variable names.
137
138 Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
139 +
140 Input:
141 +
142 ----
143 {strength = 4}
144 {be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
145 {le} {-1993 : 32}
146 {-3.141593 : 64}
147 ----
148 +
149 Output:
150 +
151 ----
152 67 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
153 09 c0
154 ----
155 +
156 The encoded number is the evaluation of a valid {py3} expression which
157 may include label and variable names.
158
159 https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
160 +
161 Input:
162 +
163 ----
164 aa bb cc {-1993 : sleb128} <meow> dd ee ff
165 {meow * 199 : uleb128}
166 ----
167 +
168 Output:
169 +
170 ----
171 aa bb cc b7 70 dd ee ff e3 07
172 ----
173 +
174 The encoded integer is the evaluation of a valid {py3} expression which
175 may include label and variable names.
176
177 Conditional::
178 +
179 Input:
180 +
181 ----
182 aa bb cc
183
184 (
185 "foo"
186
187 !if {ICITTE > 10}
188 "bar"
189 !else
190 "fight"
191 !end
192 ) * 4
193 ----
194 +
195 Output:
196 +
197 ----
198 aa bb cc 66 6f 6f 66 69 67 68 74 66 6f 6f 66 69 ┆ •••foofightfoofi
199 67 68 74 66 6f 6f 62 61 72 66 6f 6f 62 61 72 ┆ ghtfoobarfoobar
200 ----
201
202 Repetition::
203 +
204 Input:
205 +
206 ----
207 aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
208
209 !repeat 3
210 ff ee "juice"
211 !end
212 ----
213 +
214 Output:
215 +
216 ----
217 aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
218 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
219 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
220 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
221 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
222 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
223 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
224 ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
225 6a 75 69 63 65 ┆ juice
226 ----
227
228 Alignment::
229 +
230 Input:
231 +
232 ----
233 {be}
234
235 {199:32}
236 @64 {43:64}
237 @16 {-123:16}
238 @32~255 {5584:32}
239 ----
240 +
241 Output:
242 +
243 ----
244 00 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
245 ff 85 ff ff 00 00 15 d0
246 ----
247
248 Filling::
249 +
250 Input:
251 +
252 ----
253 {le}
254 {0xdeadbeef:32}
255 {-1993:16}
256 {9:16}
257 +0x40
258 {ICITTE:8}
259 "meow mix"
260 +200~FFh
261 {ICITTE:8}
262 ----
263 +
264 Output:
265 +
266 ----
267 ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7•••••••••••
268 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
269 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
270 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
271 40 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix•••••••
272 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
273 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
274 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
275 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
276 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
277 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
278 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
279 ff ff ff ff ff ff ff ff c8 ┆ •••••••••
280 ----
281
282 Multilevel grouping::
283 +
284 Input:
285 +
286 ----
287 ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
288 ----
289 +
290 Output:
291 +
292 ----
293 ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
294 bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
295 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
296 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
297 aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
298 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
299 6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
300 ----
301
302 Macros::
303 +
304 Input:
305 +
306 ----
307 !macro hello(world)
308 "hello"
309 !if world " world" !end
310 !end
311
312 !repeat 17
313 ff ff ff ff
314 m:hello({ICITTE > 15 and ICITTE < 60})
315 !end
316 ----
317 +
318 Output:
319 +
320 ----
321 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
322 6c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
323 64 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
324 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
325 ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
326 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
327 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
328 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
329 ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
330 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
331 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
332 ----
333
334 Precise error reporting::
335 +
336 ----
337 /tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
338 ----
339 +
340 ----
341 /tmp/meow.normand:32:6 - Unexpected character `k`.
342 ----
343 +
344 ----
345 /tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
346 ----
347 +
348 ----
349 /tmp/meow.normand:32:19 - While expanding the macro `meow`:
350 /tmp/meow.normand:35:5 - While expanding the macro `zzz`:
351 /tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
352 ----
353
With Normand, you can track human-readable data source files in your
favorite VCS instead of raw binary files. You can use the binary files
that Normand generates to test file format decoders, including with
malformed data, as well as for education.
358
359 See <<learn-normand>> to explore all the Normand features.
360
361 == Install Normand
362
363 Normand requires Python ≥ 3.4.
364
365 To install Normand:
366
367 ----
368 $ python3 -m pip install --user normand
369 ----
370
371 See
372 https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
373 to learn more about a user site installation.
374
375 [NOTE]
376 ====
377 Normand has a single module file, `normand.py`, which you can copy as is
378 to your project to use it (both the <<python3-api,`normand.parse()`>>
379 function and the <<command-line-tool,command-line tool>>).
380
381 `normand.py` has _no external dependencies_, but if you're using
382 Python{nbsp}3.4, you'll need a local copy of the standard `typing`
383 module.
384 ====
385
386 == Design goals
387
388 The design goals of Normand are:
389
390 Portability::
391 We're making sure `normand.py` works with Python{nbsp}≥{nbsp}3.4 and
392 doesn't have any external dependencies so that you may just copy the
393 module as is to your own project.
394
395 Ease of use::
396 The most basic Normand input is a sequence of hexadecimal constants
397 (for example, `4e6f726d616e64`) which produce exactly what you'd
398 expect.
399 +
400 Most Normand features map to programming language concepts you already
401 know and understand: constant integers, literal strings, variables,
402 conditionals, repetitions/loops, and the rest.
403
404 Concise and readable input::
405 We could have chosen XML or YAML as the input format, but having a
406 DSL here makes a Normand input compact and easy to read, two
407 important traits when using Normand to write tests, for example.
408 +
409 Compare the following Normand input and some hypothetical XML
410 equivalent, for example:
411 +
.Actual Normand input.
413 ----
414 ff dd 01 ab $192 $-128 %1101:0011
415
416 {end:8}
417
418 {iter = 1}
419
420 !if {not something}
421 # five times because xyz
422 !repeat 5
423 "hello world " {iter:8}
424 {iter = iter + 1}
425 !end
426 !end
427
428 <end>
429 ----
430 +
431 .Hypothetical Normand XML input.
432 [source,xml]
433 ----
434 <?xml version="1.0" encoding="utf-8" ?>
435 <group>
436 <byte base="x" val="ff" />
437 <byte base="x" val="dd" />
438 <byte base="x" val="1" />
439 <byte base="x" val="ab" />
440 <byte base="d" val="192" />
441 <byte base="d" val="-128" />
442 <byte base="b" val="11010011" />
443 <fixed-len-num expr="end" len="8" />
444 <var-assign name="iter" expr="1" />
445 <cond expr="not something">
446 <!-- five times because xyz -->
447 <repeat expr="5">
448 <str>hello world </str>
449 <fixed-len-num expr="iter" len="8" />
450 <var-assign name="iter" expr="iter + 1" />
451 </repeat>
452 </cond>
453 <label name="end" />
454 </group>
455 ----
456
457 == Learn Normand
458
459 A Normand text input is a sequence of items which represent a sequence
460 of raw bytes.
461
462 [[state]] During the processing of items to data, Normand relies on a
463 current state:
464
465 [%header%autowidth]
466 |===
467 |State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>
468
469 |[[cur-offset]] Current offset
470 |
471 The current offset has an effect on the value of <<label,labels>> and of
472 the special `ICITTE` name in <<fixed-length-number,fixed-length
number>>, <<leb128-integer,LEB128 integer>>, <<string,string>>,
474 <<filling,filling>>, <<variable-assignment,variable assignment>>,
475 <<conditional-block,conditional block>>, <<repetition-block,repetition
476 block>>, <<macro-expansion,macro expansion>>, and
477 <<post-item-repetition,post-item repetition>> expression evaluation.
478
479 Each generated byte increments the current offset.
480
481 A <<current-offset-setting,current offset setting>> may change the
482 current offset without generating data.
483
A <<current-offset-alignment,current offset alignment>> generates
485 padding bytes to make the current offset satisfy a given alignment.
486 |`init_offset` parameter of the `parse()` function.
487 |`--offset` option.
488
489 |[[cur-bo]] Current byte order
490 |
491 The current byte order has an effect on the encoding of
492 <<fixed-length-number,fixed-length numbers>>.
493
494 A <<current-byte-order-setting,current byte order setting>> may change
495 the current byte order.
496 |`init_byte_order` parameter of the `parse()` function.
497 |`--byte-order` option.
498
499 |<<label,Labels>>
500 |Mapping of label names to integral values.
501 |`init_labels` parameter of the `parse()` function.
502 |One or more `--label` options.
503
504 |<<variable-assignment,Variables>>
505 |Mapping of variable names to integral or floating point number values.
506 |`init_variables` parameter of the `parse()` function.
507 |One or more `--var` or `--var-str` options.
508 |===
509
510 The available items are:
511
* A <<byte-constant,byte constant>> representing one or more
513 constant bytes.
514
515 * A <<literal-string,literal string>> representing a constant sequence
516 of bytes encoding UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 data.
517
518 * A <<current-byte-order-setting,current byte order setting>> (big or
519 little endian).
520
521 * A <<fixed-length-number,fixed-length number>> (integer or
522 floating point) using the <<cur-bo,current byte order>> and of which
523 the value is the result of a {py3} expression.
524
525 * An <<leb128-integer,LEB128 integer>> of which the value is the result
526 of a {py3} expression.
527
528 * A <<string,string>> representing a sequence of bytes encoding UTF-8,
529 UTF-16, UTF-32, or Latin-1 to Latin-10 data, and of which the value is
530 the result of a {py3} expression.
531
532 * A <<current-offset-setting,current offset setting>>.
533
534 * A <<current-offset-alignment,current offset alignment>>.
535
536 * A <<filling,filling>>.
537
538 * A <<label,label>>, that is, a named constant holding the current
539 offset.
540 +
541 This is similar to an assembly label.
542
543 * A <<variable-assignment,variable assignment>> associating a name to
544 the integral result of an evaluated {py3} expression.
545
546 * A <<group,group>>, that is, a scoped sequence of items.
547
548 * A <<conditional-block,conditional block>>.
549
550 * A <<repetition-block,repetition block>>.
551
552 * A <<macro-definition-block,macro definition block>>.
553
554 * A <<macro-expansion,macro expansion>>.
555
556 Moreover, you can repeat many items above a constant or variable number
557 of times with the ``pass:[*]`` operator _after_ the item to repeat. This
558 is called a <<post-item-repetition,post-item repetition>>.
559
560 A Normand comment may exist pretty much anywhere between tokens.
561
562 A comment is anything between two ``pass:[#]`` characters on the same
563 line, or from ``pass:[#]`` until the end of the line. Whitespaces are
564 also considered comments. The following symbols are also considered
565 comments around and between items, as well as between hexadecimal
566 nibbles and binary bits of <<byte-constant,byte constants>>:
567
568 ----
569 / \ ? & : ; . , [ ] _ = | -
570 ----
571
572 The latter serve to improve readability so that you may write, for
573 example, a MAC address or a UUID as is.
574
[[const-int]] Many items require a _constant integer_. A negative
constant integer starts with `-`. A positive constant integer is any
of:
578
579 Decimal::
One or more digits (`0` to `9`).
581
582 Hexadecimal::
583 One of:
584 +
* The `0x` or `0X` prefix followed by one or more hexadecimal digits
(`0` to `9`, `a` to `f`, or `A` to `F`).
* One or more hexadecimal digits followed by the `h` or `H` suffix.
588
589 Octal::
590 One of:
591 +
* The `0o` or `0O` prefix followed by one or more octal digits
(`0` to `7`).
* One or more octal digits followed by the `o`, `O`, `q`, or `Q`
suffix.
596
597 Binary::
598 One of:
599 +
* The `0b` or `0B` prefix followed by one or more bits (`0` or `1`).
* One or more bits followed by the `b` or `B` suffix.
602
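To make the constant integer notations above concrete, here's a small {py3} sketch which converts such a token to an `int`. The `parse_const_int()` name and the order in which the forms are tried are assumptions for illustration; Normand's own parser may resolve ambiguous tokens differently.

```python
import re

# Each documented notation with its base. Prefix and suffix forms are
# tried before plain decimal (this order is an assumption).
_FORMS = [
    (re.compile(r"0[xX]([0-9a-fA-F]+)"), 16),
    (re.compile(r"([0-9a-fA-F]+)[hH]"), 16),
    (re.compile(r"0[oO]([0-7]+)"), 8),
    (re.compile(r"([0-7]+)[oOqQ]"), 8),
    (re.compile(r"0[bB]([01]+)"), 2),
    (re.compile(r"([01]+)[bB]"), 2),
    (re.compile(r"([0-9]+)"), 10),
]


def parse_const_int(text: str) -> int:
    """Convert a constant integer such as `0x40`, `FFh`, or `-12` to an int."""
    negative = text.startswith("-")
    digits = text[1:] if negative else text

    for pattern, base in _FORMS:
        # The whole token must match one of the documented forms.
        match = pattern.fullmatch(digits)

        if match:
            value = int(match.group(1), base)
            return -value if negative else value

    raise ValueError("invalid constant integer: {!r}".format(text))


print(parse_const_int("0x40"), parse_const_int("FFh"), parse_const_int("17q"))
```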
603 You can test the examples of this section with the `normand`
604 <<command-line-tool,command-line tool>> as such:
605
606 ----
607 $ normand file | hexdump -C
608 ----
609
610 where `file` is the name of a file containing the Normand input.
611
612 === Byte constant
613
614 A _byte constant_ represents one or more constant bytes.
615
616 A byte constant is:
617
618 Hexadecimal form::
619 Two consecutive hexadecimal digits representing a single byte.
620
621 Decimal form::
622 One or more digits after the `$` prefix representing a single byte.
623
624 Binary form:: {empty}
625 +
626 --
627 . __**N**__ `%` prefixes (at least one).
628 +
629 The number of `%` characters is the number of subsequent expected bytes.
630
631 . __**N**__{nbsp}×{nbsp}8 bits (`0` or `1`).
632 --
633
634 ====
635 Input:
636
637 ----
638 ab cd [3d 8F] CC
639 ----
640
641 Output:
642
643 ----
644 ab cd 3d 8f cc
645 ----
646 ====
647
648 ====
649 Input:
650
651 ----
652 $192 %1100/0011 $ -77
653 ----
654
655 Output:
656
657 ----
658 c0 c3 b3
659 ----
660 ====
661
662 ====
663 Input:
664
665 ----
666 58f64689-6316-4d55-8a1a-04cada366172
667 fe80::6257:18ff:fea3:4229
668 ----
669
670 Output:
671
672 ----
673 58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
674 fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
675 ----
676 ====
677
678 ====
679 Input:
680
681 ----
682 %01110011 %01100001 %01101100 %01110101 %01110100
683 %%%1101:0010 11111111 #A#11 #B#00 #C#011 #D#1
684 ----
685
686 Output:
687
688 ----
689 73 61 6c 75 74 d2 ff c7 ┆ salut•••
690 ----
691 ====
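The hexadecimal form, with its ignored symbols, can be approximated in a few lines of {py3}. This is only a sketch: `hex_bytes()` is a hypothetical helper which handles the hexadecimal form alone (no `$` or `%` constants and no `#` comments), and `IGNORED` copies the symbol list given in <<learn-normand>>.

```python
# Symbols which Normand treats as comments between nibbles and items.
IGNORED = "/\\?&:;.,[]_=|-"


def hex_bytes(text: str) -> bytes:
    """Decode hexadecimal byte constants, skipping the ignored symbols."""
    # Delete every ignored symbol; bytes.fromhex() itself skips whitespace.
    stripped = text.translate(str.maketrans("", "", IGNORED))
    return bytes.fromhex(stripped)


print(hex_bytes("ab cd [3d 8F] CC").hex())
print(hex_bytes("58f64689-6316-4d55-8a1a-04cada366172").hex())
```

Because the separators are dropped before decoding, a UUID or an IPv6 address written as is decodes to its raw bytes, as in the examples above.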
692
693 === Literal string
694
695 A _literal string_ represents the encoded bytes of a literal string
696 using the UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 encoding.
697
698 The string to encode isn't implicitly null-terminated: use `\0` at the
699 end of the string to add a null character.
700
701 A literal string is:
702
703 . **Optional**: one of the following encodings instead of the default
704 UTF-8:
705 +
706 --
707 [horizontal]
708 `s:u8`::
709 `u8`::
710 UTF-8.
711
712 `s:u16be`::
713 `u16be`::
714 UTF-16BE.
715
716 `s:u16le`::
717 `u16le`::
718 UTF-16LE.
719
720 `s:u32be`::
721 `u32be`::
722 UTF-32BE.
723
724 `s:u32le`::
725 `u32le`::
726 UTF-32LE.
727
728 `s:latin1`::
729 ISO/IEC 8859-1.
730
731 `s:latin2`::
732 ISO/IEC 8859-2.
733
734 `s:latin3`::
735 ISO/IEC 8859-3.
736
737 `s:latin4`::
738 ISO/IEC 8859-4.
739
740 `s:latin5`::
741 ISO/IEC 8859-9.
742
743 `s:latin6`::
744 ISO/IEC 8859-10.
745
746 `s:latin7`::
747 ISO/IEC 8859-13.
748
749 `s:latin8`::
750 ISO/IEC 8859-14.
751
752 `s:latin9`::
753 ISO/IEC 8859-15.
754
755 `s:latin10`::
756 ISO/IEC 8859-16.
757 --
758
759 . The ``pass:["]`` prefix.
760
761 . A sequence of zero or more characters, possibly containing escape
762 sequences.
763 +
764 An escape sequence is the ``\`` character followed by one of:
765 +
766 --
767 [horizontal]
768 `0`:: Null (U+0000)
769 `a`:: Alert (U+0007)
770 `b`:: Backspace (U+0008)
771 `e`:: Escape (U+001B)
772 `f`:: Form feed (U+000C)
773 `n`:: End of line (U+000A)
774 `r`:: Carriage return (U+000D)
775 `t`:: Character tabulation (U+0009)
776 `v`:: Line tabulation (U+000B)
777 ``\``:: Reverse solidus (U+005C)
778 ``pass:["]``:: Quotation mark (U+0022)
779 --
780
781 . The ``pass:["]`` suffix.
782
783 ====
784 Input:
785
786 ----
787 "coucou tout le monde!"
788 ----
789
790 Output:
791
792 ----
793 63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
794 6f 6e 64 65 21 ┆ onde!
795 ----
796 ====
797
798 ====
799 Input:
800
801 ----
802 u16le"I am not young enough to know everything."
803 ----
804
805 Output:
806
807 ----
808 49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
809 20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
810 6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
811 20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
812 65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
813 2e 00 ┆ .•
814 ----
815 ====
816
817 ====
818 Input:
819
820 ----
821 s:u32be "\"illusion is the first\nof all pleasures\" 🦉"
822 ----
823
824 Output:
825
826 ----
827 00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
828 00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
829 00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
830 00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
831 00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
832 00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
833 00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
834 00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
835 00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
836 00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
837 00 00 00 20 00 01 f9 89 ┆ ••• ••••
838 ----
839 ====
840
841 ====
842 Input:
843
844 ----
845 s:latin1 "Paul Piché"
846 ----
847
848 Output:
849
850 ----
851 50 61 75 6c 20 50 69 63 68 e9 ┆ Paul Pich•
852 ----
853 ====
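To check the expected bytes of a literal string by hand, you can rely on the codecs of the {py3} standard library. The sketch below assumes a mapping from the Normand encoding names to Python codec names (`CODECS` and `encode_literal()` are illustrative names, not part of the `normand` module).

```python
# Latin-N to ISO/IEC 8859 part number, following the list above.
_LATIN_PARTS = {1: 1, 2: 2, 3: 3, 4: 4, 5: 9, 6: 10, 7: 13, 8: 14, 9: 15, 10: 16}

# Assumed mapping of Normand encoding names to Python codec names.
CODECS = {
    "u8": "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}
CODECS.update(
    ("latin{}".format(n), "iso8859_{}".format(part))
    for n, part in _LATIN_PARTS.items()
)


def encode_literal(text: str, encoding: str = "u8") -> bytes:
    """Encode `text` without adding any null terminator."""
    return text.encode(CODECS[encoding])


print(encode_literal("Paul Pich\u00e9", "latin1").hex())
print(encode_literal("\U0001F989", "u32be").hex())
```

As with Normand, nothing is appended implicitly: to get a null-terminated string, the text itself must end with a null character.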
854
855 === Current byte order setting
856
857 This special item sets the <<cur-bo,_current byte order_>>.
858
859 The two accepted forms are:
860
861 [horizontal]
862 ``pass:[{be}]``:: Set the current byte order to big endian.
863 ``pass:[{le}]``:: Set the current byte order to little endian.
864
865 === Fixed-length number
866
867 A _fixed-length number_ represents a fixed number of bytes encoding
868 either:
869
870 * An unsigned or signed integer (two's complement).
871 +
872 The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.
873
874 * A floating point number
875 (https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
876 +
The available lengths are 32 (_binary32_) and 64 (_binary64_).
878
The value is the result of evaluating a {py3} expression. Normand
encodes it using the <<cur-bo,current byte order>>.
881
882 A fixed-length number is:
883
884 . The ``pass:[{]`` prefix.
885
886 . A valid {py3} expression.
887 +
888 For a fixed-length number at some source location{nbsp}__**L**__, this
889 expression may contain the name of any accessible <<label,label>> (not
890 within a nested group), including the name of a label defined
891 after{nbsp}__**L**__, as well as the name of any
892 <<variable-assignment,variable>> known at{nbsp}__**L**__.
893 +
894 The value of the special name `ICITTE` (`int` type) in this expression
895 is the <<cur-offset,current offset>> (before encoding the number).
896
897 . The `:` character.
898
899 . An encoding length in bits amongst:
900 +
901 --
902 The expression evaluates to an `int` or `bool` value::
903 `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
904 +
905 NOTE: Normand automatically converts a `bool` value to `int`.
906
907 The expression evaluates to a `float` value::
908 `32` and `64`.
909 --
910
911 . The `}` suffix.
912
913 ====
914 Input:
915
916 ----
917 {le} {345:16}
918 {be} {-0xabcd:32}
919 ----
920
921 Output:
922
923 ----
924 59 01 ff ff 54 33
925 ----
926 ====
927
928 ====
929 Input:
930
931 ----
932 {be}
933
934 # String length in bits
935 {8 * (str_end - str_beg) : 16}
936
937 # String
938 <str_beg>
939 "hello world!"
940 <str_end>
941 ----
942
943 Output:
944
945 ----
946 00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
947 ----
948 ====
949
950 ====
951 Input:
952
953 ----
954 {20 - ICITTE : 8} * 10
955 ----
956
957 Output:
958
959 ----
960 14 13 12 11 10 0f 0e 0d 0c 0b
961 ----
962 ====
963
964 ====
965 Input:
966
967 ----
968 {le}
969 {2 * 0.0529 : 32}
970 ----
971
972 Output:
973
974 ----
975 ac ad d8 3d
976 ----
977 ====
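The encodings of the examples above can be reproduced with `int.to_bytes()` and the standard `struct` module. This is a sketch under stated assumptions, not Normand's code: `encode_fixed()` is a hypothetical helper, and the accepted integer range (from the smallest signed value to the largest unsigned value of the given length) is inferred from the examples and error messages of this document.

```python
import struct


def encode_fixed(value, bits: int, byte_order: str) -> bytes:
    """Encode `value` on `bits` bits; `byte_order` is "big" or "little"."""
    if isinstance(value, float):
        # IEEE 754 binary32 or binary64.
        if bits not in (32, 64):
            raise ValueError("a float needs a length of 32 or 64 bits")

        prefix = ">" if byte_order == "big" else "<"
        return struct.pack(prefix + ("f" if bits == 32 else "d"), value)

    if bits not in range(8, 65, 8):
        raise ValueError("invalid length: {} bits".format(bits))

    value = int(value)  # also converts a bool to an int

    # Assumption: both signed and unsigned values of this length are valid.
    if not -(1 << (bits - 1)) <= value < (1 << bits):
        raise ValueError(
            "value {} is outside the {}-bit range".format(value, bits))

    # The modulo gives the two's complement pattern of a negative value.
    return (value % (1 << bits)).to_bytes(bits // 8, byte_order)


print(encode_fixed(345, 16, "little").hex())
print(encode_fixed(-0xabcd, 32, "big").hex())
print(encode_fixed(2 * 0.0529, 32, "little").hex())
```

The `byte_order` argument plays the role of the current byte order which `{le}` and `{be}` set.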
978
979 === LEB128 integer
980
An _LEB128 integer_ represents a variable number of bytes encoding, in
the https://en.wikipedia.org/wiki/LEB128[LEB128] format, an unsigned or
signed integer which is the result of evaluating a {py3} expression.
985
986 An LEB128 integer is:
987
988 . The ``pass:[{]`` prefix.
989
990 . A valid {py3} expression of which the evaluation result type
991 is `int` or `bool` (automatically converted to `int`).
992 +
993 For an LEB128 integer at some source location{nbsp}__**L**__, this
994 expression may contain:
995 +
996 --
997 * The name of any <<label,label>> defined before{nbsp}__**L**__
998 which isn't within a nested group.
999 * The name of any <<variable-assignment,variable>> known
1000 at{nbsp}__**L**__.
1001 --
1002 +
1003 The value of the special name `ICITTE` (`int` type) in this expression
1004 is the <<cur-offset,current offset>> (before encoding the integer).
1005
1006 . The `:` character.
1007
1008 . One of:
1009 +
1010 --
1011 [horizontal]
1012 `uleb128`:: Use the unsigned LEB128 format.
1013 `sleb128`:: Use the signed LEB128 format.
1014 --
1015
1016 . The `}` suffix.
1017
1018 ====
1019 Input:
1020
1021 ----
1022 {624485 : uleb128}
1023 ----
1024
1025 Output:
1026
1027 ----
1028 e5 8e 26
1029 ----
1030 ====
1031
1032 ====
1033 Input:
1034
1035 ----
1036 aa bb cc dd
1037 <meow>
1038 ee ff
1039 {-981238311 + (meow * -23) : sleb128}
1040 "hello"
1041 ----
1042
1043 Output:
1044
1045 ----
1046 aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
1047 ----
1048 ====
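For reference, the two LEB128 variants can be sketched in {py3} as follows. The `uleb128()` and `sleb128()` function names only mirror the Normand keywords; this is the usual textbook algorithm, not necessarily how Normand implements it.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer using unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 needs a non-negative integer")

    out = bytearray()

    while True:
        byte = value & 0x7F  # low seven bits
        value >>= 7

        if value == 0:
            out.append(byte)  # last byte: continuation bit clear
            return bytes(out)

        out.append(byte | 0x80)  # more bytes follow


def sleb128(value: int) -> bytes:
    """Encode an integer using signed LEB128."""
    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign

        # Done when the remaining bits are pure sign extension of bit 6.
        done = (value == 0 and not byte & 0x40) or \
               (value == -1 and bool(byte & 0x40))
        out.append(byte if done else byte | 0x80)

        if done:
            return bytes(out)


print(uleb128(624485).hex())
print(sleb128(-981238311 + 4 * -23).hex())
```

In the second call, `4` stands for the value of the `meow` label of the example above.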
1049
1050 === String
1051
A _string_ represents a variable number of bytes encoding, with the
UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 encoding, a string which
is the result of evaluating a {py3} expression.
1055
1056 A string has two possible forms:
1057
1058 Encoding prefix form:: {empty}
1059 +
1060 . An encoding amongst:
1061 +
1062 --
1063 [horizontal]
1064 `s:u8`::
1065 `u8`::
1066 UTF-8.
1067
1068 `s:u16be`::
1069 `u16be`::
1070 UTF-16BE.
1071
1072 `s:u16le`::
1073 `u16le`::
1074 UTF-16LE.
1075
1076 `s:u32be`::
1077 `u32be`::
1078 UTF-32BE.
1079
1080 `s:u32le`::
1081 `u32le`::
1082 UTF-32LE.
1083
1084 `s:latin1`::
1085 ISO/IEC 8859-1.
1086
1087 `s:latin2`::
1088 ISO/IEC 8859-2.
1089
1090 `s:latin3`::
1091 ISO/IEC 8859-3.
1092
1093 `s:latin4`::
1094 ISO/IEC 8859-4.
1095
1096 `s:latin5`::
1097 ISO/IEC 8859-9.
1098
1099 `s:latin6`::
1100 ISO/IEC 8859-10.
1101
1102 `s:latin7`::
1103 ISO/IEC 8859-13.
1104
1105 `s:latin8`::
1106 ISO/IEC 8859-14.
1107
1108 `s:latin9`::
1109 ISO/IEC 8859-15.
1110
1111 `s:latin10`::
1112 ISO/IEC 8859-16.
1113 --
1114
1115 . The ``pass:[{]`` prefix.
1116
1117 . A valid {py3} expression of which the evaluation result type
1118 is `bool`, `int`, `float`, or `str` (the first three automatically
1119 converted to `str`).
1120 +
1121 For a string at some source location{nbsp}__**L**__, this expression may
1122 contain:
1123 +
1124 --
1125 * The name of any <<label,label>> defined before{nbsp}__**L**__
1126 which isn't within a nested group.
1127 * The name of any <<variable-assignment,variable>> known
1128 at{nbsp}__**L**__.
1129 --
1130 +
1131 The value of the special name `ICITTE` (`int` type) in this expression
1132 is the <<cur-offset,current offset>> (before encoding the string).
1133
1134 . The `}` suffix.
1135
1136 Encoding suffix form:: {empty}
1137 +
1138 . The ``pass:[{]`` prefix.
1139
1140 . A valid {py3} expression of which the evaluation result type
1141 is `bool`, `int`, `float`, or `str` (the first three automatically
1142 converted to `str`).
1143 +
1144 For a string at some source location{nbsp}__**L**__, this expression may
1145 contain:
1146 +
1147 --
1148 * The name of any <<label,label>> defined before{nbsp}__**L**__
1149 which isn't within a nested group.
1150 * The name of any <<variable-assignment,variable>> known
1151 at{nbsp}__**L**__.
1152 --
1153 +
1154 The value of the special name `ICITTE` (`int` type) in this expression
1155 is the <<cur-offset,current offset>> (before encoding the string).
1156
1157 . The `:` character.
1158
1159 . A string encoding amongst:
1160 +
1161 --
1162 [horizontal]
1163 `s:u8`::
1164 UTF-8.
1165
1166 `s:u16be`::
1167 UTF-16BE.
1168
1169 `s:u16le`::
1170 UTF-16LE.
1171
1172 `s:u32be`::
1173 UTF-32BE.
1174
1175 `s:u32le`::
1176 UTF-32LE.
1177
1178 `s:latin1`::
1179 ISO/IEC 8859-1.
1180
1181 `s:latin2`::
1182 ISO/IEC 8859-2.
1183
1184 `s:latin3`::
1185 ISO/IEC 8859-3.
1186
1187 `s:latin4`::
1188 ISO/IEC 8859-4.
1189
1190 `s:latin5`::
1191 ISO/IEC 8859-9.
1192
1193 `s:latin6`::
1194 ISO/IEC 8859-10.
1195
1196 `s:latin7`::
1197 ISO/IEC 8859-13.
1198
1199 `s:latin8`::
1200 ISO/IEC 8859-14.
1201
1202 `s:latin9`::
1203 ISO/IEC 8859-15.
1204
1205 `s:latin10`::
1206 ISO/IEC 8859-16.
1207 --
1208
1209 . The `}` suffix.
1210
1211 ====
1212 Input:
1213
1214 ----
1215 {iter = 1}
1216
1217 !repeat 10
1218 {iter : s:u8} " "
1219 {iter = iter + 1}
1220 !end
1221 ----
1222
1223 Output:
1224
1225 ----
1226 31 20 32 20 33 20 34 20 35 20 36 20 37 20 38 20 ┆ 1 2 3 4 5 6 7 8
1227 39 20 31 30 20 ┆ 9 10
1228 ----
1229 ====
1230
1231 ====
1232 Input:
1233
1234 ----
1235 {meow = 'salut jérémie'}
1236 {meow.upper() : s:latin1}
1237 ----
1238
1239 Output:
1240
1241 ----
1242 53 41 4c 55 54 20 4a c9 52 c9 4d 49 45 ┆ SALUT J•R•MIE
1243 ----
1244 ====
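The two examples above boil down to evaluating an expression, converting the result to `str`, and encoding it. Here's a hedged {py3} sketch: `encode_string()` is a hypothetical helper, and converting non-string results with `str()` is an assumption about how Normand performs the automatic conversion.

```python
def encode_string(value, codec: str = "utf-8") -> bytes:
    """Encode the result of an expression as a string item would."""
    # Assumption: bool, int, and float results are converted with str().
    if not isinstance(value, str):
        value = str(value)

    return value.encode(codec)


# Equivalent of the `!repeat 10` example: each number followed by a space.
counted = b"".join(encode_string(i) + b" " for i in range(1, 11))

# Equivalent of `{meow.upper() : s:latin1}`.
shouted = encode_string("salut j\u00e9r\u00e9mie".upper(), "latin-1")

print(counted)
print(shouted.hex())
```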
1245
1246 === Current offset setting
1247
1248 This special item sets the <<cur-offset,_current offset_>>.
1249
1250 A current offset setting is:
1251
1252 . The `<` prefix.
1253
1254 . A <<const-int,positive constant integer>> which is the new current
1255 offset.
1256
1257 . The `>` suffix.
1258
1259 ====
1260 Input:
1261
1262 ----
1263 {ICITTE : 8} * 8
1264 <0x61> {ICITTE : 8} * 8
1265 ----
1266
1267 Output:
1268
1269 ----
1270 00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
1271 ----
1272 ====
1273
1274 ====
1275 Input:
1276
1277 ----
1278 aa bb cc dd <meow> ee ff
1279 <12> 11 22 33 <mix> 44 55
1280 {meow : 8} {mix : 8}
1281 ----
1282
1283 Output:
1284
1285 ----
1286 aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
1287 ----
1288 ====
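The interaction between generated bytes, current offset settings, and labels can be modelled with a tiny state object. The `State` class below is purely illustrative (it isn't part of the `normand` module); it replays the second example above.

```python
class State:
    """Minimal model of the current offset, the labels, and the data."""

    def __init__(self, offset: int = 0):
        self.offset = offset
        self.labels = {}
        self.data = bytearray()

    def emit(self, chunk: bytes) -> None:
        # Each generated byte increments the current offset.
        self.data += chunk
        self.offset += len(chunk)

    def set_offset(self, offset: int) -> None:
        # A current offset setting generates no data.
        self.offset = offset

    def label(self, name: str) -> None:
        # A label holds the current offset where it's defined.
        self.labels[name] = self.offset


state = State()
state.emit(bytes.fromhex("aabbccdd"))
state.label("meow")
state.emit(bytes.fromhex("eeff"))
state.set_offset(12)
state.emit(bytes.fromhex("112233"))
state.label("mix")
state.emit(bytes.fromhex("4455"))
state.emit(bytes([state.labels["meow"], state.labels["mix"]]))

print(state.labels, state.data.hex())
```

Note that `mix` is 15, not 9: the label value follows the current offset, not the number of bytes generated so far.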
1289
1290 === Current offset alignment
1291
1292 A _current offset alignment_ represents zero or more padding bytes to
1293 make the <<cur-offset,current offset>> meet a given
1294 https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.
1295
1296 More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
1297 a current offset alignment represents the required padding bytes until
1298 the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.
1299
1300 A current offset alignment is:
1301
1302 . The `@` prefix.
1303
1304 . A <<const-int,positive constant integer>> which is the alignment value
1305 in _bits_.
1306 +
1307 This value must be greater than zero and a multiple of{nbsp}8.
1308
1309 . **Optional**:
1310 +
1311 --
1312 . The ``pass:[~]`` prefix.
1313 . A <<const-int,positive constant integer>> which is the value of the
1314 byte to use as padding to align the <<cur-offset,current offset>>.
1315 --
1316 +
1317 Without this section, the padding byte value is zero.
1318
1319 ====
1320 Input:
1321
1322 ----
1323 11 22 (@32 aa bb cc) * 3
1324 ----
1325
1326 Output:
1327
1328 ----
1329 11 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
1330 ----
1331 ====
1332
1333 ====
1334 Input:
1335
1336 ----
1337 {le}
1338 77 88
1339 @32~0xcc {-893.5:32}
1340 @128~0x55 "meow"
1341 ----
1342
1343 Output:
1344
1345 ----
1346 77 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
1347 6d 65 6f 77 ┆ meow
1348 ----
1349 ====
1350
1351 ====
1352 Input:
1353
1354 ----
1355 aa bb cc <29> @64~255 "zoom"
1356 ----
1357
1358 Output:
1359
1360 ----
1361 aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
1362 ----
1363 ====
1364
1365 === Filling
1366
1367 A _filling_ represents zero or more padding bytes to make the
1368 <<cur-offset,current offset>> reach a given value.
1369
1370 A filling is:
1371
1372 . The ``pass:[+]`` prefix.
1373
1374 . One of:
1375
1376 ** A <<const-int,positive constant integer>> which is the current offset
1377 target.
1378
1379 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1380 evaluation result type is `int` or `bool` (automatically converted to
1381 `int`), and the ``pass:[}]`` suffix.
1382 +
1383 For a filling at some source location{nbsp}__**L**__, this expression
1384 may contain:
1385 +
1386 --
1387 * The name of any <<label,label>> defined before{nbsp}__**L**__
1388 which isn't within a nested group.
1389 * The name of any <<variable-assignment,variable>> known
1390 at{nbsp}__**L**__.
1391 --
1392 +
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before handling the filling).
1396
1397 ** A valid {py3} name.
1398 +
1399 For the name `__NAME__`, this is equivalent to the
1400 `pass:[{]__NAME__pass:[}]` form above.
1401
1402 +
1403 This value must be greater than or equal to the current offset where
1404 it's used.
1405
1406 . **Optional**:
1407 +
1408 --
1409 . The ``pass:[~]`` prefix.
1410 . A <<const-int,positive constant integer>> which is the value of the
1411 byte to use as padding to reach the current offset target.
1412 --
1413 +
1414 Without this section, the padding byte value is zero.
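
As a rough Python model of the rules above (a sketch only; `filling()` is a hypothetical helper, not a `normand` function), a filling produces the bytes between the current offset and the target, and rejects a target located before the current offset:

```python
def filling(offset: int, target: int, pad_byte: int = 0) -> bytes:
    # The target must be greater than or equal to the current offset.
    if target < offset:
        raise ValueError(
            "target {} is before current offset {}".format(target, offset)
        )

    # One padding byte per missing offset unit.
    return bytes([pad_byte]) * (target - offset)


# `aa bb cc dd +0x40`: 60 zero bytes bring the offset from 4 to 0x40.
print(len(filling(4, 0x40)))

# `+36~0x80` at offset 21: 15 bytes having the value 0x80.
print(filling(21, 36, 0x80))
```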
1415
1416 ====
1417 Input:
1418
1419 ----
1420 aa bb cc dd
1421 +0x40
1422 "hello world"
1423 ----
1424
1425 Output:
1426
1427 ----
1428 aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1429 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1430 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1431 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
1432 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
1433 ----
1434 ====
1435
1436 ====
1437 Input:
1438
1439 ----
1440 !macro part(iter, fill)
1441 <0> "particular security " {ord('0') + iter : 8} +fill~0x80
1442 !end
1443
1444 {iter = 1}
1445
1446 !repeat 5
1447 m:part(iter, {32 + 4 * iter})
1448 {iter = iter + 1}
1449 !end
1450 ----
1451
1452 Output:
1453
1454 ----
1455 70 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur
1456 69 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1•••••••••••
1457 80 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s
1458 65 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2•••••••
1459 80 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part
1460 69 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security
1461 33 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3•••••••••••••••
1462 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
1463 61 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4•••
1464 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
1465 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
1466 61 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5•••
1467 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
1468 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
1469 ----
1470 ====
1471
1472 === Label
1473
1474 A _label_ associates a name to the <<cur-offset,current offset>>.
1475
1476 All the labels of a whole Normand input must have unique names.
1477
A label must not share the name of a <<variable-assignment,variable>>.
1480
1481 A label is:
1482
1483 . The `<` prefix.
1484
1485 . A valid {py3} name which is not `ICITTE`.
1486
1487 . The `>` suffix.
1488
1489 === Variable assignment
1490
A _variable assignment_ associates a name to the result of an
evaluated {py3} expression.
1493
1494 A variable assignment is:
1495
1496 . The ``pass:[{]`` prefix.
1497
1498 . A valid {py3} name which is not `ICITTE`.
1499
1500 . The `=` character.
1501
. A valid {py3} expression of which the evaluation result type is `int`,
`float`, `bool` (automatically converted to `int`), or `str`.
1504 +
1505 For a variable assignment at some source location{nbsp}__**L**__, this
1506 expression may contain:
1507 +
1508 --
1509 * The name of any <<label,label>> defined before{nbsp}__**L**__
1510 which isn't within a nested group.
1511 * The name of any <<variable-assignment,variable>> known
1512 at{nbsp}__**L**__.
1513 --
1514 +
1515 The value of the special name `ICITTE` (`int` type) in this expression
1516 is the <<cur-offset,current offset>>.
1517
1518 . The `}` suffix.
1519
1520 ====
1521 Input:
1522
1523 ----
1524 {mix = 101} {le}
1525 {meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
1526 "yooo" {meow + mix : 16}
1527 ----
1528
1529 Output:
1530
1531 ----
1532 11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
1533 ----
1534 ====
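
To make the name lookup concrete, the example above can be replayed in plain Python. This sketch assumes expressions are evaluated with labels, variables, and `ICITTE` as the only visible names; using `eval()` and the `evaluate()` helper are choices made for this illustration, not a description of Normand internals.

```python
def evaluate(expr: str, variables: dict, labels: dict, offset: int):
    # Names visible to the expression: labels, variables, and `ICITTE`
    # (the current offset).
    names = dict(labels)
    names.update(variables)
    names["ICITTE"] = offset

    # No builtins: only the names above are available.
    return eval(expr, {"__builtins__": {}}, names)


variables = {}
labels = {}
out = bytearray()

variables["mix"] = evaluate("101", variables, labels, len(out))
variables["meow"] = evaluate("42", variables, labels, len(out))
out += bytes([0x11, 0x22])
out += variables["meow"].to_bytes(1, "little")      # {meow:8}
out.append(0x33)
variables["meow"] = evaluate("ICITTE + 17", variables, labels, len(out))
out += b"yooo"
out += evaluate("meow + mix", variables, labels, len(out)).to_bytes(2, "little")

print(" ".join("{:02x}".format(b) for b in out))
```

When `{meow = ICITTE + 17}` is handled, four bytes exist already, so `meow` becomes 21 and `meow + mix` is 122 (0x7a).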
1535
1536 === Group
1537
1538 A _group_ is a scoped sequence of items.
1539
1540 The <<label,labels>> within a group aren't visible outside of it.
1541
The main purposes of a group are to <<post-item-repetition,repeat>> more
than a single item and to isolate labels.
1544
1545 A group is:
1546
1547 . The `(`, `!group`, or `!g` opening.
1548
1549 . Zero or more items.
1550
1551 . Depending on the group opening:
1552 +
1553 --
1554 `(`::
1555 The `)` closing.
1556
1557 `!group`::
1558 `!g`::
1559 The `!end` closing.
1560 --
1561
1562 ====
1563 Input:
1564
1565 ----
1566 ((aa bb cc) dd () ee) "leclerc"
1567 ----
1568
1569 Output:
1570
1571 ----
1572 aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
1573 ----
1574 ====
1575
1576 ====
1577 Input:
1578
1579 ----
1580 !group
1581 (aa bb cc) * 3 dd ee
1582 !end * 5
1583 ----
1584
1585 Output:
1586
1587 ----
1588 aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
1589 cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
1590 ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
1591 bb cc aa bb cc dd ee
1592 ----
1593 ====
1594
1595 ====
1596 Input:
1597
1598 ----
1599 {be}
1600 (
1601 <str_beg> u16le"sébastien diaz" <str_end>
1602 {ICITTE - str_beg : 8}
1603 {(end - str_beg) * 5 : 24}
1604 ) * 3
1605 <end>
1606 ----
1607
1608 Output:
1609
1610 ----
1611 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1612 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
1613 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1614 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
1615 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1616 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
1617 ----
1618 ====
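
The arithmetic of the last example can be checked with a few lines of Python. This is only a worked computation under the assumption that each repetition of the group redefines `str_beg` at its own starting offset; the value of `end` is computed up front here because the size of each repetition is known.

```python
text = "sébastien diaz".encode("utf-16-le")  # 14 characters, 28 bytes

# One repetition: the string, an 8-bit length, and a 24-bit value.
item_len = len(text) + 1 + 3
end = 3 * item_len  # offset of the `<end>` label

out = bytearray()

for _ in range(3):
    str_beg = len(out)                               # <str_beg>
    out += text
    out += (len(out) - str_beg).to_bytes(1, "big")   # {ICITTE - str_beg : 8}
    out += ((end - str_beg) * 5).to_bytes(3, "big")  # {(end - str_beg) * 5 : 24}

print(" ".join("{:02x}".format(b) for b in out[28:32]))
```

The three repetitions see `str_beg` values of 0, 32, and 64, which gives the 24-bit values 480 (0x0001e0), 320 (0x000140), and 160 (0x0000a0).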
1619
1620 === Conditional block
1621
1622 A _conditional block_ represents either the bytes of zero or more items
1623 if some expression is true, or the bytes of zero or more other items if
1624 it's false.
1625
1626 A conditional block is:
1627
1628 . The `!if` opening.
1629
1630 . One of:
1631
1632 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1633 evaluation result type is `int` or `bool` (automatically converted to
1634 `int`), and the ``pass:[}]`` suffix.
1635 +
1636 For a conditional block at some source location{nbsp}__**L**__, this
1637 expression may contain:
1638 +
1639 --
1640 * The name of any <<label,label>> defined before{nbsp}__**L**__
1641 which isn't within a nested group.
1642 * The name of any <<variable-assignment,variable>> known
1643 at{nbsp}__**L**__.
1644 --
1645 +
1646 The value of the special name `ICITTE` (`int` type) in this expression
1647 is the <<cur-offset,current offset>> (before handling the contained
1648 items).
1649
1650 ** A valid {py3} name.
1651 +
1652 For the name `__NAME__`, this is equivalent to the
1653 `pass:[{]__NAME__pass:[}]` form above.
1654
1655 . Zero or more items to be handled when the condition is true.
1656
1657 . **Optional**:
1658
1659 .. The `!else` opening.
1660 .. Zero or more items to be handled when the condition is false.
1661
1662 . The `!end` closing.
1663
1664 ====
1665 Input:
1666
1667 ----
1668 {at = 1}
1669 {rep_count = 9}
1670
1671 !repeat rep_count
1672 "meow "
1673
1674 !if {ICITTE > 25}
1675 "mix"
1676 !else
1677 "zoom"
1678 !end
1679
1680 !if {at < rep_count} 20 !end
1681
1682 {at = at + 1}
1683 !end
1684 ----
1685
1686 Output:
1687
1688 ----
1689 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 6f 77 20 7a ┆ meow zoom meow z
1690 6f 6f 6d 20 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 ┆ oom meow zoom me
1691 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 78 20 ┆ ow mix meow mix
1692 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 ┆ meow mix meow mi
1693 78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
1694 6d 69 78 ┆ mix
1695 ----
1696 ====
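
The control flow of the example above maps to an ordinary Python loop. This is a hand-written model of that single example, with the output length standing in for `ICITTE`; it isn't generated by Normand.

```python
rep_count = 9
out = bytearray()

for at in range(1, rep_count + 1):
    out += b"meow "

    # `!if {ICITTE > 25}`: `ICITTE` is the current offset before
    # handling the items of the conditional block.
    if len(out) > 25:
        out += b"mix"
    else:
        out += b"zoom"

    # `!if {at < rep_count} 20 !end`: a space after every iteration
    # except the last one.
    if at < rep_count:
        out.append(0x20)

print(out.decode("ascii"))
```

The first three iterations see offsets 5, 15, and 25, none of which is greater than 25, so they produce `zoom`; the remaining six produce `mix`.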
1697
1698 ====
1699 Input:
1700
1701 ----
1702 <str_beg>
1703 u16le"meow mix!"
1704 <str_end>
1705
1706 !if {str_end - str_beg > 10}
1707 " BIG"
1708 !end
1709 ----
1710
1711 Output:
1712
1713 ----
1714 6d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
1715 21 00 20 42 49 47 ┆ !• BIG
1716 ----
1717 ====
1718
1719 === Repetition block
1720
A _repetition block_ represents the bytes of zero or more items repeated
a given number of times.
1723
1724 A repetition block is:
1725
1726 . The `!repeat` or `!r` opening.
1727
1728 . One of:
1729
** A <<const-int,positive constant integer>> which is the number of
times to repeat the contained items.
1732
1733 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1734 evaluation result type is `int` or `bool` (automatically converted to
1735 `int`), and the ``pass:[}]`` suffix.
1736 +
1737 For a repetition block at some source location{nbsp}__**L**__, this
1738 expression may contain:
1739 +
1740 --
1741 * The name of any <<label,label>> defined before{nbsp}__**L**__
1742 which isn't within a nested group.
1743 * The name of any <<variable-assignment,variable>> known
1744 at{nbsp}__**L**__.
1745 --
1746 +
1747 The value of the special name `ICITTE` (`int` type) in this expression
1748 is the <<cur-offset,current offset>> (before handling the items to
1749 repeat).
1750
1751 ** A valid {py3} name.
1752 +
1753 For the name `__NAME__`, this is equivalent to the
1754 `pass:[{]__NAME__pass:[}]` form above.
1755
1756 . Zero or more items.
1757
1758 . The `!end` closing.
1759
1760 You may also use a <<post-item-repetition,post-item repetition>> after
1761 some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
1762 is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.
1763
1764 ====
1765 Input:
1766
1767 ----
1768 !repeat 0o400
1769 {end - ICITTE - 1 : 8}
1770 !end
1771
1772 <end>
1773 ----
1774
1775 Output:
1776
1777 ----
1778 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1779 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1780 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1781 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1782 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1783 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
1784 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
1785 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
1786 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
1787 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
1788 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
1789 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
1790 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
1791 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1792 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
1793 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1794 ----
1795 ====
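
The example above is equivalent to the following Python computation, given as a sketch to show how `ICITTE` changes from one iteration to the next. It assumes that each iteration generates exactly one byte, so that the `end` label is at offset 256.

```python
count = 0o400  # 256 iterations
end = count    # one byte per iteration, so `<end>` is at offset 256

# `{end - ICITTE - 1 : 8}` evaluated once per iteration; `ICITTE` is the
# current offset when the item is handled.
out = bytes(end - icitte - 1 for icitte in range(count))

print(out[:4].hex(), out[-4:].hex())
```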
1796
1797 ====
1798 Input:
1799
1800 ----
1801 {times = 1}
1802
1803 aa bb cc dd
1804
1805 !repeat 3
1806 <here>
1807
1808 !repeat {here + 1}
1809 ee ff
1810 !end
1811
1812 11 22 !repeat times 33 !end
1813
1814 {times = times + 1}
1815 !end
1816
1817 "coucou!"
1818 ----
1819
1820 Output:
1821
1822 ----
1823 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
1824 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1825 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1826 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1827 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1828 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1829 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1830 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1831 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1832 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1833 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
1834 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1835 ----
1836 ====
1837
1838 === Macro definition block
1839
1840 A _macro definition block_ associates a name and parameter names to
1841 a group of items.
1842
1843 A macro definition block doesn't lead to generated bytes itself: a
1844 <<macro-expansion,macro expansion>> does so.
1845
1846 A macro definition may only exist at the root level, that is, not within
1847 a <<group,group>>, a <<repetition-block,repetition block>>, a
1848 <<conditional-block,conditional block>>, or another
1849 <<macro-definition-block,macro definition block>>.
1850
1851 All macro definitions must have unique names.
1852
1853 A macro definition is:
1854
1855 . The `!macro` or `!m` opening.
1856
1857 . A valid {py3} name (the macro name).
1858
1859 . The `(` parameter name list prefix.
1860
1861 . A comma-separated list of zero or more unique parameter names,
1862 each one being a valid {py3} name.
1863
1864 . The `)` parameter name list suffix.
1865
1866 . Zero or more items except, recursively, a macro definition block.
1867
1868 . The `!end` closing.
1869
1870 ====
1871 ----
1872 !macro bake()
1873 {le} {ICITTE * 8 : 16}
1874 u16le"predict explode"
1875 !end
1876 ----
1877 ====
1878
1879 ====
1880 ----
1881 !macro nail(rep, with_extra, val)
1882 {iter = 1}
1883
1884 !repeat rep
1885 {val + iter : uleb128}
1886 {0xdeadbeef : 32}
1887 {iter = iter + 1}
1888 !end
1889
1890 !if with_extra
1891 "meow mix\0"
1892 !end
1893 !end
1894 ----
1895 ====
1896
1897 === Macro expansion
1898
1899 A _macro expansion_ expands the items of a defined
1900 <<macro-definition-block,macro>>.
1901
1902 The macro to expand must be defined _before_ the expansion.
1903
1904 The <<state,state>> before handling the first item of the chosen macro
1905 is:
1906
1907 <<cur-offset,Current offset>>::
1908 Unchanged.
1909
1910 <<cur-bo,Current byte order>>::
1911 Unchanged.
1912
1913 Variables::
1914 The only available variables initially are the macro parameters.
1915
1916 Labels::
1917 None.
1918
1919 The state after having handled the last item of the chosen macro is:
1920
1921 Current offset::
1922 The one before handling the first item of the macro plus the size
1923 of the generated data of the macro expansion.
1924 +
1925 IMPORTANT: This means <<current-offset-setting,current offset setting>>
1926 items within the expanded macro don't impact the final current offset.
1927
1928 Current byte order::
1929 The one before handling the first item of the macro.
1930
1931 Variables::
1932 The ones before handling the first item of the macro.
1933
1934 Labels::
1935 The ones before handling the first item of the macro.
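
The state rules above can be summarized with a small Python model. Everything here (`State`, `expand_macro()`, and the `body` callable) is hypothetical and only mirrors the description above; it isn't taken from `normand.py`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class State:
    offset: int = 0
    byte_order: Optional[str] = None
    variables: Dict[str, object] = field(default_factory=dict)
    labels: Dict[str, int] = field(default_factory=dict)


def expand_macro(state: State, body: Callable[[State], bytes],
                 params: Dict[str, object]) -> bytes:
    # Initial macro state: same offset and byte order, only the macro
    # parameters as variables, and no labels.
    inner = State(offset=state.offset, byte_order=state.byte_order,
                  variables=dict(params), labels={})
    data = body(inner)

    # Final state: the offset advances by the size of the generated
    # data; byte order, variables, and labels of the caller are untouched.
    state.offset += len(data)
    return data


def body(inner: State) -> bytes:
    inner.offset = 0          # current offset setting within the macro
    inner.byte_order = "le"   # byte order setting within the macro
    inner.variables["iter"] = 1
    inner.labels["here"] = 0
    return b"abc"


outer = State(offset=10, byte_order="be", variables={"meow": 42},
              labels={"beg": 0})
data = expand_macro(outer, body, {"val": 3})
print(outer)
```

After the expansion, `outer.offset` is 13 (10 plus three generated bytes) even though the macro body reset its own offset to zero.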
1936
1937 A macro expansion is:
1938
1939 . The `m:` prefix.
1940
1941 . A valid {py3} name (the name of the macro to expand).
1942
1943 . The `(` parameter value list prefix.
1944
. A comma-separated list of zero or more parameter values.
1946 +
1947 The number of parameter values must match the number of parameter
1948 names of the definition of the chosen macro.
1949 +
1950 A parameter value is one of:
1951 +
1952 --
1953 * A <<const-int,constant integer>>, possibly negative.
1954
1955 * A constant floating point number.
1956
1957 * The ``pass:[{]`` prefix, a valid {py3} expression of which the
1958 evaluation result type is `int` or `bool` (automatically converted to
1959 `int`), and the ``pass:[}]`` suffix.
1960 +
1961 For a macro expansion at some source location{nbsp}__**L**__, this
1962 expression may contain:
1963
1964 ** The name of any <<label,label>> defined before{nbsp}__**L**__
1965 which isn't within a nested group.
1966 ** The name of any <<variable-assignment,variable>> known
1967 at{nbsp}__**L**__.
1968
1969 +
1970 The value of the special name `ICITTE` (`int` type) in this expression
1971 is the <<cur-offset,current offset>> (before handling the items of the
1972 chosen macro).
1973
1974 * A valid {py3} name.
1975 +
1976 For the name `__NAME__`, this is equivalent to the
1977 `pass:[{]__NAME__pass:[}]` form above.
1978 --
1979
1980 . The `)` parameter value list suffix.
1981
1982 ====
1983 Input:
1984
1985 ----
1986 !macro bake()
1987 {le} {ICITTE * 8 : 16}
1988 u16le"predict explode"
1989 !end
1990
1991 "hello [" m:bake() "] world"
1992
1993 m:bake() * 5
1994 ----
1995
1996 Output:
1997
1998 ----
1999 68 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
2000 00 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
2001 00 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
2002 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2003 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
2004 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2005 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
2006 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2007 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
2008 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2009 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
2010 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
2011 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
2012 ----
2013 ====
2014
2015 ====
2016 Input:
2017
2018 ----
2019 !macro A(val, is_be)
2020 {le}
2021
2022 !if is_be
2023 {be}
2024 !end
2025
2026 {val : 16}
2027 !end
2028
2029 !macro B(rep, is_be)
2030 {iter = 1}
2031
2032 !repeat rep
2033 m:A({iter * 3}, is_be)
2034 {iter = iter + 1}
2035 !end
2036 !end
2037
2038 m:B(5, 1)
2039 m:B(3, 0)
2040 ----
2041
2042 Output:
2043
2044 ----
2045 00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
2046 ----
2047 ====
2048
2049 ====
2050 Input:
2051
2052 ----
2053 !macro flt32be(val) {be} {val : 32} !end
2054
2055 "CHEETOS"
2056 m:flt32be(-42.17)
2057 m:flt32be(56.23e-4)
2058 ----
2059
2060 Output:
2061
2062 ----
2063 43 48 45 45 54 4f 53 c2 28 ae 14 3b b8 41 25 ┆ CHEETOS•(••;•A%
2064 ----
2065 ====
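
The bytes of the last example can be cross-checked with the standard `struct` module. This assumes that a 32-bit floating point number in Normand is an IEEE 754 binary32 value, which is consistent with the output shown above.

```python
import struct

# `{be}` followed by `{val : 32}` with a floating point value: packed
# here as a big-endian binary32 number.
out = b"CHEETOS"
out += struct.pack(">f", -42.17)
out += struct.pack(">f", 56.23e-4)

print(" ".join("{:02x}".format(b) for b in out))
```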
2066
2067 === Post-item repetition
2068
2069 A _post-item repetition_ represents the bytes of an item repeated a
2070 given number of times.
2071
2072 A post-item repetition is:
2073
. One of these items:
2075
2076 ** A <<byte-constant,byte constant>>.
2077 ** A <<literal-string,literal string>>.
2078 ** A <<fixed-length-number,fixed-length number>>.
2079 ** An <<leb128-integer,LEB128 integer>>.
2080 ** A <<string,string>>.
** A <<macro-expansion,macro expansion>>.
2082 ** A <<group,group>>.
2083
2084 . The ``pass:[*]`` character.
2085
2086 . One of:
2087
2088 ** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
2089 which is the number of times to repeat the previous item.
2090
2091 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
2092 evaluation result type is `int` or `bool` (automatically converted to
2093 `int`), and the ``pass:[}]`` suffix.
2094 +
2095 For a post-item repetition at some source location{nbsp}__**L**__, this
2096 expression may contain:
2097 +
2098 --
2099 * The name of any <<label,label>> defined before{nbsp}__**L**__
2100 which isn't within a nested group and
2101 which isn't part of the repeated item.
* The name of any <<variable-assignment,variable>> known
at{nbsp}__**L**__ which isn't part of the repeated item.
2105 --
2106 +
2107 The value of the special name `ICITTE` (`int` type) in this expression
2108 is the <<cur-offset,current offset>> (before handling the items to
2109 repeat).
2110
2111 ** A valid {py3} name.
2112 +
2113 For the name `__NAME__`, this is equivalent to the
2114 `pass:[{]__NAME__pass:[}]` form above.
2115
2116 You may also use a <<repetition-block,repetition block>>. The form
2117 ``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
2118 ``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.
2119
2120 ====
2121 Input:
2122
2123 ----
2124 {end - ICITTE - 1 : 8} * 0x100 <end>
2125 ----
2126
2127 Output:
2128
2129 ----
2130 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
2131 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
2132 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
2133 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
2134 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
2135 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
2136 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
2137 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
2138 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
2139 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
2140 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
2141 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
2142 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
2143 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
2144 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
2145 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
2146 ----
2147 ====
2148
2149 ====
2150 Input:
2151
2152 ----
2153 {times = 1}
2154 aa bb cc dd
2155 (
2156 <here>
2157 (ee ff) * {here + 1}
2158 11 22 33 * {times}
2159 {times = times + 1}
2160 ) * 3
2161 "coucou!"
2162 ----
2163
2164 Output:
2165
2166 ----
2167 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
2168 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
2169 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2170 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
2171 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2172 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2173 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2174 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2175 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2176 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
2177 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
2178 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
2179 ----
2180 ====
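
The nested repetitions of the last example can be replayed in Python to see how `here` and `times` evolve. This is a hand-written model of that one input, in which the output length plays the role of the current offset.

```python
times = 1
out = bytearray.fromhex("aabbccdd")

for _ in range(3):                                # ( ... ) * 3
    here = len(out)                               # <here>
    out += bytes.fromhex("eeff") * (here + 1)     # (ee ff) * {here + 1}
    out += bytes.fromhex("1122") + b"\x33" * times  # 11 22 33 * {times}
    times += 1                                    # {times = times + 1}

out += b"coucou!"
print(len(out))
```

The `here` label is 4, 17, and 57 in the three iterations, so the `ee ff` group is repeated 5, 18, and 58 times, and only the `33` byte is affected by `* {times}`.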
2181
2182 == Command-line tool
2183
2184 If you <<install-normand,installed>> the `normand` package, then you
2185 can use the `normand` command-line tool:
2186
2187 ----
2188 $ normand <<< '"ma gang de malades"' | hexdump -C
2189 ----
2190
2191 ----
2192 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
2193 00000010 65 73 |es|
2194 ----
2195
2196 If you copy the `normand.py` module to your own project, then you can
2197 run the module itself:
2198
2199 ----
2200 $ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
2201 ----
2202
2203 ----
2204 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
2205 00000010 65 73 |es|
2206 ----
2207
2208 Without a path argument, the `normand` tool reads from the standard
2209 input.
2210
2211 The `normand` tool prints the generated binary data to the standard
2212 output.
2213
2214 Various options control the initial <<state,state>> of the processor:
2215 use the `--help` option to learn more.
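
If you need to call the tool from a script, something like the following Python sketch could work. It only relies on the behaviour described above (input read from the standard input, binary data written to the standard output); the `run_normand()` helper, its `command` parameter, and the UTF-8 input encoding are assumptions of this example.

```python
import subprocess
from typing import Sequence


def run_normand(text: str, command: Sequence[str] = ("normand",)) -> bytes:
    # Feed the Normand input through the standard input and collect the
    # generated binary data from the standard output.
    #
    # `check=True` raises `subprocess.CalledProcessError` if the tool
    # exits with a nonzero status.
    result = subprocess.run(list(command), input=text.encode("utf-8"),
                            stdout=subprocess.PIPE, check=True)
    return result.stdout
```

With the `normand` tool available in your `PATH`, `run_normand('"ma gang de malades"')` is expected to return the same bytes as the `hexdump` output above. Pass `("python3", "-m", "normand")` as `command` to run the module instead.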
2216
2217 == {py3} API
2218
2219 The whole `normand` package/module public API is:
2220
[source,python]
----
# Byte order.
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


# Text location.
class TextLocation:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


# Parsing error message.
class ParseErrorMessage:
    # Message text.
    @property
    def text(self):
        ...

    # Source text location.
    @property
    def text_location(self):
        ...


# Parsing error.
class ParseError(RuntimeError):
    # Parsing error messages.
    #
    # The first message is the most _specific_ one.
    @property
    def messages(self):
        ...


# Variables dictionary type (for type hints).
VariablesT = typing.Dict[str, typing.Union[int, float]]


# Labels dictionary type (for type hints).
LabelsT = typing.Dict[str, int]


# Parsing result.
class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> VariablesT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> LabelsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


# Parses the `normand` input using the initial state defined by
# `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
# and returns the corresponding parsing result.
def parse(normand: str,
          init_variables: typing.Optional[VariablesT] = None,
          init_labels: typing.Optional[LabelsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
----
2314
2315 The `normand` parameter is the actual <<learn-normand,Normand input>>
2316 while the other parameters control the initial <<state,state>>.
2317
2318 The `parse()` function raises a `ParseError` instance should it fail to
2319 parse the `normand` string for any reason.
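
A possible way to use this API is sketched below. It only touches the names listed above (`parse()`, `ParseError`, `messages`, `text`, `text_location`, `line_no`, `col_no`, and `data`); the `assemble()` and `format_parse_error()` helpers are hypothetical, and the `LINE:COL - MESSAGE` output format is a choice of this example.

```python
from typing import List


def format_parse_error(exc) -> List[str]:
    # One line per parsing error message, most specific one first.
    lines = []

    for msg in exc.messages:
        loc = msg.text_location
        lines.append("{}:{} - {}".format(loc.line_no, loc.col_no, msg.text))

    return lines


def assemble(text: str, variables=None, offset: int = 0) -> bytes:
    # Imported here so that this sketch loads even without the package.
    import normand

    try:
        result = normand.parse(text, init_variables=variables,
                               init_offset=offset)
    except normand.ParseError as exc:
        raise SystemExit("\n".join(format_parse_error(exc)))

    return bytes(result.data)
```

With the `normand` package installed, `assemble('aa bb "meow"')` would return the generated bytes, or print the formatted messages and exit if the input is invalid.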
2320
2321 == Development
2322
2323 Normand is a https://python-poetry.org/[Poetry] project.
2324
2325 To develop it, install it through Poetry and enter the virtual
2326 environment:
2327
2328 ----
2329 $ poetry install
2330 $ poetry shell
2331 $ normand <<< '"lol" * 10 0a'
2332 ----
2333
2334 `normand.py` is processed by:
2335
2336 * https://microsoft.github.io/pyright/[Pyright]
2337 * https://github.com/psf/black[Black]
2338 * https://pycqa.github.io/isort/[isort]
2339
2340 === Testing
2341
2342 Use https://docs.pytest.org/[pytest] to test Normand once the package is
2343 part of your virtual environment, for example:
2344
2345 ----
2346 $ poetry install
2347 $ poetry run pip3 install pytest
2348 $ poetry run pytest
2349 ----
2350
The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.
2354
2355 In the `tests` directory, each `*.nt` file is a test. The file name
2356 prefix indicates what it's meant to test:
2357
2358 `pass-`::
2359 Everything above the `---` line is the valid Normand input
2360 to test.
2361 +
2362 Everything below the `---` line is the expected data
2363 (whitespace-separated hexadecimal bytes).
2364
2365 `fail-`::
2366 Everything above the `---` line is the invalid Normand input
2367 to test.
2368 +
2369 Everything below the `---` line is the expected error message having
2370 this form:
2371 +
2372 ----
2373 LINE:COL - MESSAGE
2374 ----
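
The layout of a `pass-` test file can be illustrated with a short Python function. This isn't the project's actual test harness: `split_pass_test()` is a hypothetical helper which assumes that the separator is the first line containing exactly `---`, and the sample content is made up for this example.

```python
from typing import Tuple


def split_pass_test(content: str) -> Tuple[str, bytes]:
    lines = content.splitlines()

    # Assumption: the separator is the first line which is exactly `---`.
    sep = lines.index("---")

    # Everything above the separator is the Normand input.
    normand_input = "\n".join(lines[:sep])

    # Everything below is whitespace-separated hexadecimal bytes.
    hex_bytes = " ".join(lines[sep + 1:]).split()
    expected = bytes(int(b, 16) for b in hex_bytes)
    return normand_input, expected


sample = '"lol" * 2 0a\n---\n6c 6f 6c 6c 6f 6c\n0a\n'
text, expected = split_pass_test(sample)
print(repr(text), expected)
```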
2375
2376 === Contributing
2377
2378 Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
2379 for code review.
2380
2381 To report a bug, https://github.com/efficios/normand/issues/new[create a
2382 GitHub issue].