1 // Show ToC at a specific location for a GitHub rendering
10 // This is to mimic what GitHub does so that anchors work in an offline
21 image::normand-logo.png[]
24 image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]
27 _**Normand**_ is a text-to-binary processor with its own language.
29 This package offers both a portable {py3} module and a command-line
32 WARNING: This version of Normand is 0.11, meaning both the Normand
33 language and the module/CLI interface aren't stable.
36 // ToC location for a GitHub rendering
42 The purpose of Normand is to consume human-readable text representing
43 bytes and to produce the corresponding binary data.
47 Consider the following Normand input:
50 4f 55 32 bb $167 fe %10100111 a9 $-32
53 The generated nine bytes are:
56 4f 55 32 bb a7 fe a7 a9 e0
As the example above shows, the fundamental unit of the Normand
language is the _byte_. The order in which you list bytes is the
order of the generated data.
64 The Normand language is more than simple lists of bytes, though. Its
67 Comments, including a bunch of insignificant symbols which may improve readability::
72 ff bb %1101:0010 # This is a comment
73 78 29 af $192 # This too # 99 $-80
74 fe80::6257:18ff:fea3:4229
76 10839636-5d65-4a68-8e6a-21608ddf7258
82 ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
83 a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
84 68 8e 6a 21 60 8d df 72 58
87 Hexadecimal, decimal, and binary byte constants::
92 aa bb $247 $-89 %0011_0010 %11.01= 10/10
101 UTF-8, UTF-16, and UTF-32 literal strings::
107 u16le"stress\nverdict 🤣"
113 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
114 00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
115 00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd ┆ •d•i•c•t• •>•#•
118 Labels: special variables holding the offset where they're defined::
121 <beg> b2 52 e3 bc 91 05
122 $100 $50 <chair> 33 9f fe
129 5e 65 {tower = 47} c6 7f f2 c4
130 44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
133 The value of a variable assignment is the evaluation of a valid {py3}
134 expression which may include label and variable names.
136 Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
142 {be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
150 67 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
154 The encoded number is the evaluation of a valid {py3} expression which
155 may include label and variable names.
157 https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
162 aa bb cc {-1993 : sleb128} <meow> dd ee ff
163 {meow * 199 : uleb128}
169 aa bb cc b7 70 dd ee ff e3 07
172 The encoded integer is the evaluation of a valid {py3} expression which
173 may include label and variable names.
194 aa bb cc 66 6f 6f 66 6f 6f 66 6f 6f 62 61 72 66 ┆ •••foofoofoobarf
195 6f 6f 62 61 72 ┆ oobar
203 aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
213 aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
214 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
215 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
216 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
217 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
218 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
219 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
220 ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
221 6a 75 69 63 65 ┆ juice
240 00 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
241 ff 85 ff ff 00 00 15 d0
244 Multilevel grouping::
249 ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
255 ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
256 bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
257 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
258 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
259 aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
260 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
261 6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
271 !if world " world" !end
276 m:hello({ICITTE > 15 and ICITTE < 60})
283 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
284 6c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
285 64 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
286 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
287 ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
288 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
289 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
290 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
291 ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
292 ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
293 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
296 Precise error reporting::
299 /tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
303 /tmp/meow.normand:32:6 - Unexpected character `k`.
307 /tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
311 /tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
You can use Normand to track data source files in your favorite VCS
instead of raw binary files. The binary files that Normand generates are
useful, for example, to test file format decoders (including with
malformed data) and for education.
319 See <<learn-normand>> to explore all the Normand features.
323 Normand requires Python ≥ 3.4.
328 $ python3 -m pip install --user normand
332 https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
333 to learn more about a user site installation.
337 Normand has a single module file, `normand.py`, which you can copy as is
338 to your project to use it (both the <<python3-api,`normand.parse()`>>
339 function and the <<command-line-tool,command-line tool>>).
341 `normand.py` has _no external dependencies_, but if you're using
342 Python{nbsp}3.4, you'll need a local copy of the standard `typing`
348 A Normand text input is a sequence of items which represent a sequence
351 [[state]] During the processing of items to data, Normand relies on a
356 |State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>
358 |[[cur-offset]] Current offset
360 The current offset has an effect on the value of <<label,labels>> and of
361 the special `ICITTE` name in <<fixed-length-number,fixed-length
number>>, <<leb128-integer,LEB128 integer>>,
363 <<variable-assignment,variable assignment>>,
364 <<conditional-block,conditional block>>, <<repetition-block,repetition
365 block>>, <<macro-expansion,macro expansion>>, and
366 <<post-item-repetition,post-item repetition>> expression evaluation.
368 Each generated byte increments the current offset.
370 A <<current-offset-setting,current offset setting>> may change the
371 current offset without generating data.
A <<current-offset-alignment,current offset alignment>> generates
374 padding bytes to make the current offset satisfy a given alignment.
375 |`init_offset` parameter of the `parse()` function.
378 |[[cur-bo]] Current byte order
380 The current byte order has an effect on the encoding of
381 <<fixed-length-number,fixed-length numbers>>.
383 A <<current-byte-order-setting,current byte order setting>> may change
384 the current byte order.
385 |`init_byte_order` parameter of the `parse()` function.
386 |`--byte-order` option.
389 |Mapping of label names to integral values.
390 |`init_labels` parameter of the `parse()` function.
391 |One or more `--label` options.
393 |<<variable-assignment,Variables>>
394 |Mapping of variable names to integral or floating point number values.
395 |`init_variables` parameter of the `parse()` function.
396 |One or more `--var` options.
399 The available items are:
401 * A <<byte-constant,constant integer>> representing a single byte.
403 * A <<literal-string,literal string>> representing a sequence of bytes
404 encoding UTF-8, UTF-16, or UTF-32 data.
406 * A <<current-byte-order-setting,current byte order setting>> (big or
409 * A <<fixed-length-number,fixed-length number>> (integer or
410 floating point) using the <<cur-bo,current byte order>> and of which
411 the value is the result of a {py3} expression.
413 * An <<leb128-integer,LEB128 integer>> of which the value is the result
414 of a {py3} expression.
416 * A <<current-offset-setting,current offset setting>>.
418 * A <<current-offset-alignment,current offset alignment>>.
420 * A <<label,label>>, that is, a named constant holding the current
423 This is similar to an assembly label.
425 * A <<variable-assignment,variable assignment>> associating a name to
426 the integral result of an evaluated {py3} expression.
428 * A <<group,group>>, that is, a scoped sequence of items.
430 * A <<conditional-block,conditional block>>.
432 * A <<repetition-block,repetition block>>.
434 * A <<macro-definition-block,macro definition block>>.
436 * A <<macro-expansion,macro expansion>>.
438 Moreover, you can repeat many items above a constant or variable number
439 of times with the ``pass:[*]`` operator _after_ the item to repeat. This
440 is called a <<post-item-repetition,post-item repetition>>.
442 A Normand comment may exist:
444 * Between items, possibly within a group.
445 * Between the nibbles of a constant hexadecimal byte.
446 * Between the bits of a constant binary byte.
447 * Between the last item and the ``pass:[*]`` character of a post-item
448 repetition, and between that ``pass:[*]`` character and the following
449 number or expression.
450 * Between the ``!repeat``/``!r`` block opening and the following
451 constant integer, name, or expression of a repetition block.
452 * Between the ``!if`` block opening and the following name or expression
453 of a conditional block.
455 A comment is anything between two ``pass:[#]`` characters on the same
line, or from ``pass:[#]`` until the end of the line. Whitespace and
457 the following symbol characters are also considered comments where a
461 / \ ? & : ; . , + [ ] _ = | -
464 The latter serve to improve readability so that you may write, for
465 example, a MAC address or a UUID as is.
You can test the examples of this section with the `normand`
<<command-line-tool,command-line tool>> as follows:
471 $ normand file | hexdump -C
474 where `file` is the name of a file containing the Normand input.
478 A _byte constant_ represents a single byte.
483 Two consecutive hexits.
486 A decimal number after the `$` prefix.
489 Eight bits after the `%` prefix.
509 $192 %1100/0011 $ -77
523 58f64689-6316-4d55-8a1a-04cada366172
524 fe80::6257:18ff:fea3:4229
530 58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
531 fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
539 %01110011 %01100001 %01101100 %01110101 %01110100
545 73 61 6c 75 74 ┆ salut
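
As a rough illustration of the three byte constant forms, the following Python sketch converts one constant at a time to its byte value. The `byte_constant()` helper is hypothetical and isn't part of Normand: it ignores comments and readability symbols, and it assumes that a decimal constant accepts values from -128 to 255.

```python
def byte_constant(text: str) -> int:
    """Return the byte value (0 to 255) of one simplified byte constant."""
    if text.startswith("$"):
        # Decimal form: a negative value becomes its two's complement byte.
        # The accepted range below is an assumption of this sketch.
        value = int(text[1:])
        if not -128 <= value <= 255:
            raise ValueError("{} is outside the 8-bit range".format(value))
        return value & 0xFF
    if text.startswith("%"):
        # Binary form: exactly eight bits.
        bits = text[1:]
        if len(bits) != 8 or set(bits) - {"0", "1"}:
            raise ValueError("expecting eight bits")
        return int(bits, 2)
    # Hexadecimal form: exactly two hexits.
    if len(text) != 2:
        raise ValueError("expecting two hexits")
    return int(text, 16)


source = "4f 55 32 bb $167 fe %10100111 a9 $-32"
data = bytes(byte_constant(constant) for constant in source.split())
print(data.hex())  # 4f5532bba7fea7a9e0
```

Both `$167` and `%10100111` produce the same byte (`a7`), and `$-32` produces `e0`.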
551 A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
554 The string to encode isn't implicitly null-terminated: use `\0` at the
555 end of the string to add a null character.
559 . **Optional**: one of the following encodings instead of UTF-8:
569 . The ``pass:["]`` prefix.
571 . A sequence of zero or more characters, possibly containing escape
574 An escape sequence is the ``\`` character followed by one of:
580 `b`:: Backspace (U+0008)
581 `e`:: Escape (U+001B)
582 `f`:: Form feed (U+000C)
583 `n`:: End of line (U+000A)
584 `r`:: Carriage return (U+000D)
585 `t`:: Character tabulation (U+0009)
586 `v`:: Line tabulation (U+000B)
587 ``\``:: Reverse solidus (U+005C)
588 ``pass:["]``:: Quotation mark (U+0022)
591 . The ``pass:["]`` suffix.
597 "coucou tout le monde!"
603 63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
604 6f 6e 64 65 21 ┆ onde!
612 u16le"I am not young enough to know everything."
618 49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
619 20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
620 6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
621 20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
622 65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
631 u32be "\"illusion is the first\nof all pleasures\" 🦉"
637 00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
638 00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
639 00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
640 00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
641 00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
642 00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
643 00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
644 00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
645 00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
646 00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
647 00 00 00 20 00 01 f9 89 ┆ ••• ••••
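
To check the bytes of a literal string independently, Python's own codecs are enough. This sketch only uses the standard `str.encode()` method; the `hex_bytes()` helper is a hypothetical formatting function, not part of Normand.

```python
def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated hexadecimal pairs."""
    return " ".join("{:02x}".format(byte) for byte in data)


# No encoding prefix: UTF-8.
utf8 = "coucou".encode("utf-8")
# `u16le` prefix: UTF-16, little endian, without a byte order mark.
# U+1F923 is outside the BMP, so it becomes a surrogate pair.
utf16le = "t \U0001F923".encode("utf-16-le")
# `u32be` prefix: UTF-32, big endian, without a byte order mark.
utf32be = "\U0001F989".encode("utf-32-be")
# Strings aren't implicitly null-terminated: add `\0` explicitly.
terminated = "yeah\0".encode("utf-8")

print(hex_bytes(utf8))        # 63 6f 75 63 6f 75
print(hex_bytes(utf16le))     # 74 00 20 00 3e d8 23 dd
print(hex_bytes(utf32be))     # 00 01 f9 89
print(hex_bytes(terminated))  # 79 65 61 68 00
```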
651 === Current byte order setting
653 This special item sets the <<cur-bo,_current byte order_>>.
655 The two accepted forms are:
658 ``pass:[{be}]``:: Set the current byte order to big endian.
659 ``pass:[{le}]``:: Set the current byte order to little endian.
661 === Fixed-length number
663 A _fixed-length number_ represents a fixed number of bytes encoding
666 * An unsigned or signed integer (two's complement).
668 The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.
670 * A floating point number
(https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
The available lengths are 32 (_binary32_) and 64 (_binary64_).
675 The value is the result of evaluating a {py3} expression using the
676 <<cur-bo,current byte order>>.
678 A fixed-length number is:
680 . The ``pass:[{]`` prefix.
682 . A valid {py3} expression.
684 For a fixed-length number at some source location{nbsp}__**L**__, this
685 expression may contain the name of any accessible <<label,label>> (not
686 within a nested group), including the name of a label defined
687 after{nbsp}__**L**__, as well as the name of any
688 <<variable-assignment,variable>> known at{nbsp}__**L**__.
690 The value of the special name `ICITTE` (`int` type) in this expression
691 is the <<cur-offset,current offset>> (before encoding the number).
695 . An encoding length in bits amongst:
698 The expression evaluates to an `int` or `bool` value::
699 `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
701 NOTE: Normand automatically converts a `bool` value to `int`.
703 The expression evaluates to a `float` value::
730 # String length in bits
731 {8 * (str_end - str_beg) : 16}
742 00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
750 {20 - ICITTE : 8} * 10
756 14 13 12 11 10 0f 0e 0d 0c 0b
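
The encoding of a fixed-length number can be approximated in plain Python. This is a minimal sketch, not Normand's implementation: `fixed_length_int()` and `fixed_length_float()` are hypothetical helpers, and accepting both the signed and the unsigned range for a given length is an assumption.

```python
import struct


def fixed_length_int(value: int, bits: int, byte_order: str) -> bytes:
    """Encode `value` over `bits` bits; `byte_order` is "big" or "little"."""
    if bits not in (8, 16, 24, 32, 40, 48, 56, 64):
        raise ValueError("unsupported length: {}".format(bits))
    # Assumption: signed and unsigned values of that length are both valid.
    if not -(2 ** (bits - 1)) <= value < 2 ** bits:
        raise ValueError(
            "value {} is outside the {}-bit range".format(value, bits))
    # Masking yields the two's complement form of a negative value.
    return (value & (2 ** bits - 1)).to_bytes(bits // 8, byte_order)


def fixed_length_float(value: float, bits: int, byte_order: str) -> bytes:
    """Encode `value` as binary32 (32 bits) or binary64 (64 bits)."""
    fmt = {32: "f", 64: "d"}[bits]
    prefix = ">" if byte_order == "big" else "<"
    return struct.pack(prefix + fmt, value)


# Ten 8-bit numbers of which the value depends on the current offset
# (0 to 9), like an expression using `ICITTE` repeated ten times.
countdown = b"".join(fixed_length_int(20 - offset, 8, "big")
                     for offset in range(10))
print(countdown.hex())                          # 14131211100f0e0d0c0b
print(fixed_length_int(8 * 12, 16, "big").hex())  # 0060
```

A value which doesn't fit, such as 315 over 8 bits, raises `ValueError` in this sketch.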
777 An _LEB128 integer_ represents a variable number of bytes encoding an
778 unsigned or signed integer which is the result of evaluating a {py3}
779 expression following the https://en.wikipedia.org/wiki/LEB128[LEB128]
782 An LEB128 integer is:
784 . The ``pass:[{]`` prefix.
786 . A valid {py3} expression of which the evaluation result type
787 is `int` or `bool` (automatically converted to `int`).
789 For an LEB128 integer at some source location{nbsp}__**L**__, this
790 expression may contain:
793 * The name of any <<label,label>> defined before{nbsp}__**L**__.
794 * The name of any <<variable-assignment,variable>> known
798 The value of the special name `ICITTE` (`int` type) in this expression
799 is the <<cur-offset,current offset>> (before encoding the integer).
807 `uleb128`:: Use the unsigned LEB128 format.
808 `sleb128`:: Use the signed LEB128 format.
834 {-981238311 + (meow * -23) : sleb128}
841 aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
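
For reference, here's a compact Python sketch of both LEB128 formats. The `uleb128()` and `sleb128()` functions are hypothetical helpers written for this illustration; they follow the general LEB128 algorithm and aren't taken from Normand.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer in the unsigned LEB128 format."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode a negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F  # low seven bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value: int) -> bytes:
    """Encode an integer in the signed LEB128 format."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign
        sign_bit_set = bool(byte & 0x40)
        # Done once the remaining value is pure sign extension.
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


print(sleb128(-1993).hex())    # b770
print(uleb128(5 * 199).hex())  # e307
```

Those two values match the overview example, where the `meow` label is at offset 5.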
845 === Current offset setting
847 This special item sets the <<cur-offset,_current offset_>>.
849 A current offset setting is:
853 . A positive integer (hexadecimal starting with `0x` or `0X` accepted)
854 which is the new current offset.
863 <0x61> {ICITTE : 8} * 8
869 00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
877 aa bb cc dd <meow> ee ff
878 <12> 11 22 33 <mix> 44 55
885 aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
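
The effect of a current offset setting on `ICITTE` can be mimicked with a simple counter. This sketch assumes an input which encodes the current offset as one byte eight times, sets the current offset to `0x61`, and does the same again; the `offset_bytes()` helper is hypothetical.

```python
def offset_bytes(count: int, start_offset: int) -> bytes:
    """Encode the current offset as one byte, `count` times in a row."""
    out = bytearray()
    offset = start_offset
    for _ in range(count):
        out.append(offset)  # like `{ICITTE : 8}`
        offset += 1         # each generated byte increments the offset
    return bytes(out)


# The offset starts at 0, then a setting changes it to 0x61 without
# generating any data.
data = offset_bytes(8, 0) + offset_bytes(8, 0x61)
print(data)  # b'\x00\x01\x02\x03\x04\x05\x06\x07abcdefgh'
```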
889 === Current offset alignment
891 A _current offset alignment_ represents zero or more padding bytes to
892 make the <<cur-offset,current offset>> meet a given
893 https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.
895 More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
896 a current offset alignment represents the required padding bytes until
897 the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.
899 A current offset alignment is:
903 . A positive integer (hexadecimal starting with `0x` or `0X` accepted)
904 which is the alignment value in _bits_.
906 This value must be greater than zero and a multiple of{nbsp}8.
911 . The ``pass:[~]`` prefix.
912 . A positive integer (hexadecimal starting with `0x` or `0X` accepted)
913 which is the value of the byte to use as padding to align the
914 <<cur-offset,current offset>>.
917 Without this section, the padding byte value is zero.
923 11 22 (@32 aa bb cc) * 3
929 11 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
946 77 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
955 aa bb cc <29> @64~255 "zoom"
961 aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
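
The number of padding bytes follows directly from the current offset and the alignment value. A minimal sketch of that arithmetic, with the hypothetical helpers `padding_size()` and `padding()`:

```python
def padding_size(offset: int, align_bits: int) -> int:
    """Return how many padding bytes make `offset` a multiple of N / 8."""
    if align_bits <= 0 or align_bits % 8 != 0:
        raise ValueError("alignment must be a positive multiple of 8 bits")
    align_bytes = align_bits // 8
    # Distance from `offset` up to the next multiple of `align_bytes`.
    return -offset % align_bytes


def padding(offset: int, align_bits: int, pad_byte: int = 0) -> bytes:
    """Return the padding bytes themselves (zeros by default)."""
    return bytes([pad_byte]) * padding_size(offset, align_bits)


print(padding_size(2, 32))         # 2
print(padding_size(8, 32))         # 0
print(padding(29, 64, 255).hex())  # ffffff
```

The last call corresponds to aligning offset 29 to 64 bits with a padding byte value of 255: three `ff` bytes bring the current offset to 32.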
967 A _label_ associates a name to the <<cur-offset,current offset>>.
969 All the labels of a whole Normand input must have unique names.
971 A label must not share the name of a <<variable-assignment,variable>>
978 . A valid {py3} name which is not `ICITTE`.
982 === Variable assignment
984 A _variable assignment_ associates a name to the integral result of an
985 evaluated {py3} expression.
987 A variable assignment is:
989 . The ``pass:[{]`` prefix.
991 . A valid {py3} name which is not `ICITTE`.
995 . A valid {py3} expression of which the evaluation result type
996 is `int`, `float`, or `bool` (automatically converted to `int`).
998 For a variable assignment at some source location{nbsp}__**L**__, this
999 expression may contain:
1002 * The name of any <<label,label>> defined before{nbsp}__**L**__
1003 which isn't within a nested group.
1004 * The name of any <<variable-assignment,variable>> known
1008 The value of the special name `ICITTE` (`int` type) in this expression
1009 is the <<cur-offset,current offset>>.
1018 {meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
1019 "yooo" {meow + mix : 16}
1025 11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
1031 A _group_ is a scoped sequence of items.
1033 The <<label,labels>> within a group aren't visible outside of it.
The main purposes of a group are to <<post-item-repetition,repeat>> more
than a single item and to isolate labels.
1040 . The `(`, `!group`, or `!g` opening.
1042 . Zero or more items.
1044 . Depending on the group opening:
1059 ((aa bb cc) dd () ee) "leclerc"
1065 aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
1074 (aa bb cc) * 3 dd ee
1081 aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
1082 cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
1083 ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
1084 bb cc aa bb cc dd ee
1094 <str_beg> u16le"sébastien diaz" <str_end>
1095 {ICITTE - str_beg : 8}
1096 {(end - str_beg) * 5 : 24}
1104 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1105 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
1106 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1107 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
1108 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
1109 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
1113 === Conditional block
1115 A _conditional block_ represents either the bytes of one or more items
1116 if some expression is true, or no bytes at all if it's false.
1118 A conditional block is:
1120 . The `!if` opening.
1124 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1125 evaluation result type is `int` or `bool` (automatically converted to
1126 `int`), and the ``pass:[}]`` suffix.
1128 For a conditional block at some source location{nbsp}__**L**__, this
1129 expression may contain:
1132 * The name of any <<label,label>> defined before{nbsp}__**L**__
1133 which isn't within a nested group.
1134 * The name of any <<variable-assignment,variable>> known
1138 The value of the special name `ICITTE` (`int` type) in this expression
1139 is the <<cur-offset,current offset>> (before handling the contained
1142 ** A valid {py3} name.
1144 For the name `__NAME__`, this is equivalent to the
1145 `pass:[{]__NAME__pass:[}]` form above.
1147 . Zero or more items.
1149 . The `!end` closing.
1164 !if {at < rep_count} 20 !end
1174 6d 65 6f 77 20 6d 65 6f 77 20 6d 65 6f 77 20 6d ┆ meow meow meow m
1175 65 6f 77 20 6d 65 6f 77 20 6d 65 6f 77 20 6d 69 ┆ eow meow meow mi
1176 78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
1177 6d 69 78 20 6d 65 6f 77 20 6d 69 78 ┆ mix meow mix
1189 !if {str_end - str_beg > 10}
1197 6d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
1198 21 00 20 42 49 47 ┆ !• BIG
1202 === Repetition block
1204 A _repetition block_ represents the bytes of one or more items repeated
1205 a given number of times.
1207 A repetition block is:
1209 . The `!repeat` or `!r` opening.
1213 ** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
which is the number of times to repeat the contained items.
1216 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1217 evaluation result type is `int` or `bool` (automatically converted to
1218 `int`), and the ``pass:[}]`` suffix.
1220 For a repetition block at some source location{nbsp}__**L**__, this
1221 expression may contain:
1224 * The name of any <<label,label>> defined before{nbsp}__**L**__
1225 which isn't within a nested group.
1226 * The name of any <<variable-assignment,variable>> known
1230 The value of the special name `ICITTE` (`int` type) in this expression
1231 is the <<cur-offset,current offset>> (before handling the items to
1234 ** A valid {py3} name.
1236 For the name `__NAME__`, this is equivalent to the
1237 `pass:[{]__NAME__pass:[}]` form above.
1239 . Zero or more items.
1241 . The `!end` closing.
1243 You may also use a <<post-item-repetition,post-item repetition>> after
1244 some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
1245 is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.
1252 {end - ICITTE - 1 : 8}
1261 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1262 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1263 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1264 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1265 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1266 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
1267 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
1268 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
1269 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
1270 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
1271 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
1272 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
1273 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
1274 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1275 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
1276 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1295 11 22 !repeat times 33 !end
1306 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
1307 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1308 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1309 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1310 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1311 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1312 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1313 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1314 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1315 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1316 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
1317 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1321 === Macro definition block
1323 A _macro definition block_ associates a name and parameter names to
1326 A macro definition block doesn't lead to generated bytes itself: a
1327 <<macro-expansion,macro expansion>> does so.
1329 A macro definition may only exist at the root level, that is, not within
1330 a <<group,group>>, a <<repetition-block,repetition block>>, a
1331 <<conditional-block,conditional block>>, or another
1332 <<macro-definition-block,macro definition block>>.
1334 All macro definitions must have unique names.
1336 A macro definition is:
1338 . The `!macro` or `!m` opening.
1340 . A valid {py3} name (the macro name).
1342 . The `(` parameter name list prefix.
1344 . A comma-separated list of zero or more unique parameter names,
1345 each one being a valid {py3} name.
1347 . The `)` parameter name list suffix.
1349 . Zero or more items except, recursively, a macro definition block.
1351 . The `!end` closing.
1356 {le} {ICITTE * 8 : 16}
1357 u16le"predict explode"
1364 !macro nail(rep, with_extra, val)
1368 {val + iter : uleb128}
1382 A _macro expansion_ expands the items of a defined
1383 <<macro-definition-block,macro>>.
1385 The macro to expand must be defined _before_ the expansion.
1387 The <<state,state>> before handling the first item of the chosen macro
1390 <<cur-offset,Current offset>>::
1393 <<cur-bo,Current byte order>>::
1397 The only available variables initially are the macro parameters.
1402 The state after having handled the last item of the chosen macro is:
1405 The one before handling the first item of the macro plus the size
1406 of the generated data of the macro expansion.
1408 IMPORTANT: This means <<current-offset-setting,current offset setting>>
1409 items within the expanded macro don't impact the final current offset.
1411 Current byte order::
1412 The one before handling the first item of the macro.
1415 The ones before handling the first item of the macro.
1418 The ones before handling the first item of the macro.
1420 A macro expansion is:
1424 . A valid {py3} name (the name of the macro to expand).
1426 . The `(` parameter value list prefix.
. A comma-separated list of zero or more parameter values.
1430 The number of parameter values must match the number of parameter
1431 names of the definition of the chosen macro.
1433 A parameter value is one of:
1436 * A positive integer (hexadecimal starting with `0x` or `0X` accepted).
1438 * The ``pass:[{]`` prefix, a valid {py3} expression of which the
1439 evaluation result type is `int` or `bool` (automatically converted to
1440 `int`), and the ``pass:[}]`` suffix.
1442 For a macro expansion at some source location{nbsp}__**L**__, this
1443 expression may contain:
1445 ** The name of any <<label,label>> defined before{nbsp}__**L**__
1446 which isn't within a nested group.
1447 ** The name of any <<variable-assignment,variable>> known
1451 The value of the special name `ICITTE` (`int` type) in this expression
1452 is the <<cur-offset,current offset>> (before handling the items of the
1455 * A valid {py3} name.
1457 For the name `__NAME__`, this is equivalent to the
1458 `pass:[{]__NAME__pass:[}]` form above.
1461 . The `)` parameter value list suffix.
1468 {le} {ICITTE * 8 : 16}
1469 u16le"predict explode"
1472 "hello [" m:bake() "] world"
1480 68 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
1481 00 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
1482 00 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
1483 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1484 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
1485 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1486 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
1487 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1488 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
1489 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1490 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
1491 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
1492 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
1500 !macro A(val, is_be)
1510 !macro B(rep, is_be)
1514 m:A({iter * 3}, is_be)
1526 00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
1530 === Post-item repetition
1532 A _post-item repetition_ represents the bytes of an item repeated a
1533 given number of times.
1535 A post-item repetition is:
. One of these items:
1539 ** A <<byte-constant,byte constant>>.
1540 ** A <<literal-string,literal string>>.
1541 ** A <<fixed-length-number,fixed-length number>>.
1542 ** An <<leb128-integer,LEB128 integer>>.
** A <<macro-expansion,macro expansion>>.
1544 ** A <<group,group>>.
1546 . The ``pass:[*]`` character.
1550 ** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
1551 which is the number of times to repeat the previous item.
1553 ** The ``pass:[{]`` prefix, a valid {py3} expression of which the
1554 evaluation result type is `int` or `bool` (automatically converted to
1555 `int`), and the ``pass:[}]`` suffix.
1557 For a post-item repetition at some source location{nbsp}__**L**__, this
1558 expression may contain:
1561 * The name of any <<label,label>> defined before{nbsp}__**L**__
1562 which isn't within a nested group and
1563 which isn't part of the repeated item.
1564 * The name of any <<variable-assignment,variable>> known
1565 at{nbsp}__**L**__, which isn't part of its repeated item, and which
1569 The value of the special name `ICITTE` (`int` type) in this expression
1570 is the <<cur-offset,current offset>> (before handling the items to
1573 ** A valid {py3} name.
1575 For the name `__NAME__`, this is equivalent to the
1576 `pass:[{]__NAME__pass:[}]` form above.
1578 You may also use a <<repetition-block,repetition block>>. The form
1579 ``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
1580 ``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.
1586 {end - ICITTE - 1 : 8} * 0x100 <end>
1592 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
1593 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
1594 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
1595 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
1596 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
1597 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
1598 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
1599 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
1600 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
1601 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
1602 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
1603 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
1604 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
1605 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1606 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
1607 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
1619 (ee ff) * {here + 1}
1629 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
1630 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
1631 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1632 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
1633 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1634 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1635 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1636 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1637 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1638 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
1639 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
1640 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
1644 == Command-line tool
1646 If you <<install-normand,installed>> the `normand` package, then you
1647 can use the `normand` command-line tool:
1650 $ normand <<< '"ma gang de malades"' | hexdump -C
1654 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
1658 If you copy the `normand.py` module to your own project, then you can
1659 run the module itself:
1662 $ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
1666 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
1670 Without a path argument, the `normand` tool reads from the standard
1673 The `normand` tool prints the generated binary data to the standard
1676 Various options control the initial <<state,state>> of the processor:
1677 use the `--help` option to learn more.
1681 The complete public API of the `normand` package/module is:
1686 class ByteOrder(enum.Enum):
1698 def line_no(self) -> int:
1703 def col_no(self) -> int:
1708 class ParseError(RuntimeError):
1709 # Source text location.
1711 def text_loc(self) -> TextLocation:
1715 # Variables dictionary type (for type hints).
1716 VariablesT = typing.Dict[str, typing.Union[int, float]]
1719 # Labels dictionary type (for type hints).
1720 LabelsT = typing.Dict[str, int]
1727 def data(self) -> bytearray:
1730 # Updated variable values.
1732 def variables(self) -> VariablesT:
1735 # Updated main group label values.
1737 def labels(self) -> LabelsT:
1742 def offset(self) -> int:
1747 def byte_order(self) -> typing.Optional[ByteOrder]:
1751 # Parses the `normand` input using the initial state defined by
1752 # `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
1753 # and returns the corresponding parsing result.
1754 def parse(normand: str,
1755 init_variables: typing.Optional[VariablesT] = None,
1756 init_labels: typing.Optional[LabelsT] = None,
1757 init_offset: int = 0,
1758 init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
1762 The `normand` parameter is the actual <<learn-normand,Normand input>>
1763 while the other parameters control the initial <<state,state>>.
1765 The `parse()` function raises a `ParseError` instance should it fail to
1766 parse the `normand` string for any reason.
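
A minimal usage sketch of this API could look like the following. It assumes that the `normand` package is importable and that `text_loc`, `line_no`, and `col_no` are read as properties, as the listing above suggests. The `normand_to_hex()` helper is a hypothetical name used for illustration only.

```python
import sys


def normand_to_hex(text: str) -> str:
    """Return the bytes generated from `text` as space-separated hex."""
    # Imported here so that a missing package only fails when the
    # function is actually called.
    import normand

    try:
        res = normand.parse(text)
    except normand.ParseError as exc:
        # Assumption: `text_loc` exposes `line_no` and `col_no` as properties.
        loc = exc.text_loc
        print(f"Parse error at {loc.line_no}:{loc.col_no}: {exc}", file=sys.stderr)
        raise

    # `res.data` is a `bytearray` containing the generated data.
    return " ".join(f"{byte:02x}" for byte in res.data)
```

For example, `normand_to_hex('"lol"')` would return the hexadecimal form of the UTF-8 bytes of `lol`. To control the initial state, pass the `init_variables`, `init_labels`, `init_offset`, or `init_byte_order` parameters to `parse()`.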
1770 Normand is a https://python-poetry.org/[Poetry] project.
1772 To develop it, install it through Poetry and enter the virtual
1778 $ normand <<< '"lol" * 10 0a'
1781 `normand.py` is processed by:
1783 * https://microsoft.github.io/pyright/[Pyright]
1784 * https://github.com/psf/black[Black]
1785 * https://pycqa.github.io/isort/[isort]
1789 Use https://docs.pytest.org/[pytest] to test Normand once the package is
1790 part of your virtual environment, for example:
1794 $ poetry run pip3 install pytest
1798 The `pytest` project is currently not a development dependency in
1799 `pyproject.toml` due to backward compatibility issues with
1802 In the `tests` directory, each `*.nt` file is a test. The file name
1803 prefix indicates what it's meant to test:
1806 Everything above the `---` line is the valid Normand input
1809 Everything below the `---` line is the expected data
1810 (whitespace-separated hexadecimal bytes).
1813 Everything above the `---` line is the invalid Normand input
1816 Everything below the `---` line is the expected error message having
1825 Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
1828 To report a bug, https://github.com/efficios/normand/issues/new[create a