// Show ToC at a specific location for a GitHub rendering
ifdef::env-github[]
:toc: macro
endif::env-github[]

ifndef::env-github[]
:toc: left
endif::env-github[]

// This is to mimic what GitHub does so that anchors work in an offline
// rendering too.
:idprefix:
:idseparator: -

// Other attributes
:py3: Python{nbsp}3

= Normand
Philippe Proulx

image::normand-logo.png[]

[.normal]
image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]

[.lead]
_**Normand**_ is a text-to-binary processor with its own language.

This package offers both a portable {py3} module and a command-line
tool.

WARNING: This version of Normand is 0.5, meaning both the Normand
language and the module/CLI interface aren't stable.

ifdef::env-github[]
// ToC location for a GitHub rendering
toc::[]
endif::env-github[]

== Introduction

The purpose of Normand is to consume human-readable text representing
bytes and to produce the corresponding binary data.

.Simple bytes input.
====
Consider the following Normand input:

----
4f 55 32 bb $167 fe %10100111 a9 $-32
----

The generated nine bytes are:

----
4f 55 32 bb a7 fe a7 a9 e0
----
====

As you can see in the example above, the fundamental unit of the Normand
language is the _byte_. The order in which you list bytes will be the
order of the generated data.

The Normand language is more than simple lists of bytes, though. Its
main features are:

Comments, including a bunch of insignificant symbols which may improve readability::
+
Input:
+
----
ff bb %1101:0010 # This is a comment
78 29 af $192 # This too # 99 $-80
fe80::6257:18ff:fea3:4229
60:57:18:a3:42:29
10839636-5d65-4a68-8e6a-21608ddf7258
----
+
Output:
+
----
ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
68 8e 6a 21 60 8d df 72 58
----

Hexadecimal, decimal, and binary byte constants::
+
Input:
+
----
aa bb $247 $-89 %0011_0010 %11.01= 10/10
----
+
Output:
+
----
aa bb f7 a7 32 da
----

UTF-8, UTF-16, and UTF-32 literal strings::
+
Input:
+
----
"hello world!" 00
u16le"stress\nverdict 🤣"
----
+
Output:
+
----
68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd    ┆ •d•i•c•t• •>•#•
----

Labels: special variables holding the offset where they're defined::
+
----
<beg> b2 52 e3 bc 91 05
$100 $50 <chair> 33 9f fe
25 e9 89 8a <end>
----

Variables::
+
----
5e 65 {tower = 47} c6 7f f2 c4
44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
----
+
The value of a variable assignment is the evaluation of a valid {py3}
expression which may include label and variable names.

Fixed-length integer with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
+
Input:
+
----
{strength = 4}
{be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
{le} {-1993 : 32}
----
+
Output:
+
----
67 44 b2 00 2c 63 37 f8 ff ff
----
+
The encoded integer is the evaluation of a valid {py3} expression which
may include label and variable names.

https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
+
Input:
+
----
aa bb cc {-1993 : sleb128} <meow> dd ee ff
{meow * 199 : uleb128}
----
+
Output:
+
----
aa bb cc b7 70 dd ee ff e3 07
----
+
The encoded integer is the evaluation of a valid {py3} expression which
may include label and variable names.

Repetition::
+
Input:
+
----
aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
----
+
Output:
+
----
aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
----

Multilevel grouping::
+
Input:
+
----
ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
----
+
Output:
+
----
ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
6f 6d cc aa bb 7a 6f 6f 6d cc de de de de       ┆ om•••zoom•••••
----

Precise error reporting::
+
----
/tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
----
+
----
/tmp/meow.normand:32:6 - Unexpected character `k`.
----
+
----
/tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`mix`, `zoom`}.
----
+
----
/tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE` at byte offset 45.
----

You can use Normand to track data source files in your favorite VCS
instead of raw binary files. The binary files that Normand generates can
be used to test file format decoding, including malformed data, for
example, as well as for education.

See <<learn-normand>> to explore all the Normand features.

== Install Normand

Normand requires Python ≥ 3.4.

To install Normand:

----
$ python3 -m pip install --user normand
----

See
https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
to learn more about a user site installation.

[NOTE]
====
Normand has a single module file, `normand.py`, which you can copy as is
to your project to use it (both the <<python3-api,`normand.parse()`>>
function and the <<command-line-tool,command-line tool>>).

`normand.py` has _no external dependencies_, but if you're using
Python{nbsp}3.4, you'll need a local copy of the standard `typing`
module.
====

== Learn Normand

A Normand text input is a sequence of items which represent a sequence
of raw bytes.

[[state]] During the processing of items to data, Normand relies on a
current state:

[%header%autowidth]
|===
|State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>

|[[cur-offset]] Current offset
|
The current offset has an effect on the value of <<label,labels>> and of
the special `ICITTE` name in <<fixed-length-integer,fixed-length
integer>>, <<leb128-integer,LEB128 integer>>, and
<<variable-assignment,variable assignment>> expression evaluation.

Each generated byte increments the current offset.

A <<current-offset-setting,current offset setting>> may change the
current offset.
|`init_offset` parameter of the `parse()` function.
|`--offset` option.

|[[cur-bo]] Current byte order
|
The current byte order has an effect on the encoding of
<<fixed-length-integer,fixed-length integers>>.

A <<current-byte-order-setting,current byte order setting>> may change
the current byte order.
|`init_byte_order` parameter of the `parse()` function.
|`--byte-order` option.

|<<label,Labels>>
|Mapping of label names to integral values.
|`init_labels` parameter of the `parse()` function.
|One or more `--label` options.

|<<variable-assignment,Variables>>
|Mapping of variable names to integral values.
|`init_variables` parameter of the `parse()` function.
|One or more `--var` options.
|===

The available items are:

* A <<byte-constant,constant integer>> representing a single byte.

* A <<literal-string,literal string>> representing a sequence of bytes
  encoding UTF-8, UTF-16, or UTF-32 data.

* A <<current-byte-order-setting,current byte order setting>> (big or
  little endian).

* A <<fixed-length-integer,fixed-length integer>> using the
  <<cur-bo,current byte order>> and of which the value is the result of
  a {py3} expression.

* An <<leb128-integer,LEB128 integer>> of which the value is the result
  of a {py3} expression.

* A <<current-offset-setting,current offset setting>>.

* A <<label,label>>, that is, a named constant holding the current
  offset.
+
This is similar to an assembly label.

* A <<variable-assignment,variable assignment>> associating a name to
  the integral result of an evaluated {py3} expression.

* A <<group,group>>, that is, a scoped sequence of items.

Moreover, you can <<repetition,repeat>> any item above, except an offset
or a label, a given fixed or variable number of times. This is called a
repetition.

A Normand comment may exist:

* Between items, possibly within a group.
* Between the nibbles of a constant hexadecimal byte.
* Between the bits of a constant binary byte.
* Between the last item and the ``pass:[*]`` character of a repetition,
  and between that ``pass:[*]`` character and the following number
  or expression.

A comment is anything between two ``pass:[#]`` characters on the same
line, or from ``pass:[#]`` until the end of the line. Whitespace and
the following symbol characters are also considered comments where a
comment may exist:

----
! @ / \ ? & : ; . , + [ ] _ = | -
----

The latter serve to improve readability so that you may write, for
example, a MAC address or a UUID as is.
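
To illustrate why those symbols make addresses readable, here's a small
{py3} sketch which drops the same symbol characters from a
hexadecimal-only input before decoding it. This isn't how Normand itself
parses its input: `hex_items_to_bytes()` is a hypothetical helper which
only understands hexadecimal byte constants.

```python
# Hypothetical helper, not part of the `normand` module: it only handles
# hexadecimal byte constants, treating whitespace and the symbol
# characters listed above as insignificant.
_INSIGNIFICANT = set("!@/\\?&:;.,+[]_=|-")


def hex_items_to_bytes(text: str) -> bytes:
    hexits = "".join(
        ch for ch in text if not ch.isspace() and ch not in _INSIGNIFICANT
    )

    # Raises `ValueError` on an odd number of hexits or a non-hexit
    return bytes.fromhex(hexits)


mac = hex_items_to_bytes("60:57:18:a3:42:29")
uuid = hex_items_to_bytes("10839636-5d65-4a68-8e6a-21608ddf7258")
print(mac.hex())
print(uuid.hex())
```

Both the MAC address and the UUID decode to the bytes you'd get by
writing each pair of hexits separately.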

You can test the examples of this section with the `normand`
<<command-line-tool,command-line tool>> as such:

----
$ normand file | hexdump -C
----

where `file` is the name of a file containing the Normand input.

=== Byte constant

A _byte constant_ represents a single byte.

A byte constant is:

Hexadecimal form::
    Two consecutive hexits.

Decimal form::
    A decimal number after the `$` prefix.

Binary form::
    Eight bits after the `%` prefix.

====
Input:

----
ab cd [3d 8F] CC
----

Output:

----
ab cd 3d 8f cc
----
====

====
Input:

----
$192 %1100/0011 $ -77
----

Output:

----
c0 c3 b3
----
====

====
Input:

----
58f64689-6316-4d55-8a1a-04cada366172
fe80::6257:18ff:fea3:4229
----

Output:

----
58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
fe 80 62 57 18 ff fe a3 42 29                   ┆ ••bW••••B)
----
====

====
Input:

----
%01110011 %01100001 %01101100 %01110101 %01110100
----

Output:

----
73 61 6c 75 74                                  ┆ salut
----
====
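
The decimal and binary forms above map to byte values as in the
following {py3} sketch. `decimal_byte()` and `binary_byte()` are
hypothetical helpers, and the accepted decimal range (-128 to 255, with
negative values wrapped as two's complement) is an assumption drawn from
the examples, not from the Normand source.

```python
def decimal_byte(value: int) -> int:
    # Assumption: a negative value is encoded as two's complement
    if not -128 <= value <= 255:
        raise ValueError("{} is outside the 8-bit range".format(value))

    return value & 0xFF


def binary_byte(bits: str) -> int:
    # `bits` holds the eight bits once any comment symbol is removed
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expecting exactly eight bits")

    return int(bits, 2)


# Same values as `$192 %1100/0011 $ -77`
data = bytes([decimal_byte(192), binary_byte("11000011"), decimal_byte(-77)])
print(data.hex())  # c0c3b3
```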

=== Literal string

A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
bytes of a string.

The string to encode isn't implicitly null-terminated: use `\0` at the
end of the string to add a null character.

A literal string is:

. **Optional**: one of the following encodings instead of UTF-8:
+
--
[horizontal]
`u16be`:: UTF-16BE.
`u16le`:: UTF-16LE.
`u32be`:: UTF-32BE.
`u32le`:: UTF-32LE.
--

. The ``pass:["]`` prefix.

. A sequence of zero or more characters, possibly containing escape
  sequences.
+
An escape sequence is the ``\`` character followed by one of:
+
--
[horizontal]
`0`:: Null (U+0000)
`a`:: Alert (U+0007)
`b`:: Backspace (U+0008)
`e`:: Escape (U+001B)
`f`:: Form feed (U+000C)
`n`:: End of line (U+000A)
`r`:: Carriage return (U+000D)
`t`:: Character tabulation (U+0009)
`v`:: Line tabulation (U+000B)
``\``:: Reverse solidus (U+005C)
``pass:["]``:: Quotation mark (U+0022)
--

. The ``pass:["]`` suffix.

====
Input:

----
"coucou tout le monde!"
----

Output:

----
63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
6f 6e 64 65 21                                  ┆ onde!
----
====

====
Input:

----
u16le"I am not young enough to know everything."
----

Output:

----
49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆  •y•o•u•n•g• •e•
6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆  •k•n•o•w• •e•v•
65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
2e 00                                           ┆ .•
----
====

====
Input:

----
u32be "\"illusion is the first\nof all pleasures\" 🦉"
----

Output:

----
00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
00 00 00 20 00 01 f9 89                         ┆ ••• ••••
----
====
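
If you want to cross-check such outputs, the encoding prefixes above
correspond to standard {py3} codecs. The following sketch only uses
`str.encode()`. The mapping of each Normand prefix to a codec name is
inferred from the prefix descriptions, and the variable names are
arbitrary.

```python
# Inferred mapping of Normand encoding prefixes to Python codec names
# (none of those codecs emits a byte order mark)
CODECS = {
    "": "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}

# No implicit null terminator: add U+0000 explicitly if you need one
greeting = "coucou\0".encode(CODECS[""])

# U+1F923 needs a surrogate pair in UTF-16
laugh = "\U0001F923".encode(CODECS["u16le"])

# U+1F989 is a single 32-bit unit in UTF-32
owl = "\U0001F989".encode(CODECS["u32be"])

print(greeting.hex(), laugh.hex(), owl.hex())
```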

=== Current byte order setting

This special item sets the <<cur-bo,_current byte order_>>.

The two accepted forms are:

[horizontal]
``pass:[{be}]``:: Set the current byte order to big endian.
``pass:[{le}]``:: Set the current byte order to little endian.

=== Fixed-length integer

A _fixed-length integer_ represents a fixed number of bytes encoding,
using the <<cur-bo,current byte order>>, an unsigned or signed integer
which is the result of evaluating a {py3} expression.

A fixed-length integer is:

. The ``pass:[{]`` prefix.

. A valid {py3} expression.
+
For a fixed-length integer at some source location{nbsp}__**L**__, this
expression may contain the name of any accessible <<label,label>> (not
within a nested group), including the name of a label defined
after{nbsp}__**L**__, as well as the name of any
<<variable-assignment,variable>> known at{nbsp}__**L**__.
+
The value of the special name `ICITTE` in this expression is the
<<cur-offset,current offset>> (before encoding the integer).

. The `:` character.

. An encoding length in bits amongst `8`, `16`, `24`, `32`, `40`,
  `48`, `56`, and `64`.

. The `}` suffix.

====
Input:

----
{le} {345:16}
{be} {-0xabcd:32}
----

Output:

----
59 01 ff ff 54 33
----
====

====
Input:

----
{be}

# String length in bits
{8 * (str_end - str_beg) : 16}

# String
<str_beg>
  "hello world!"
<str_end>
----

Output:

----
00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21       ┆ •`hello world!
----
====

====
Input:

----
{20 - ICITTE : 8} * 10
----

Output:

----
14 13 12 11 10 0f 0e 0d 0c 0b
----
====
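
The first example above can be reproduced with the standard
`int.to_bytes()` method, as in this {py3} sketch. `fixed_length_int()`
is a hypothetical helper, not a function of the `normand` module, and
encoding negative values as two's complement is an assumption which
matches the documented outputs.

```python
def fixed_length_int(value: int, bits: int, byte_order: str) -> bytes:
    # `byte_order` is "big" or "little", like the `{be}`/`{le}` settings
    if bits not in (8, 16, 24, 32, 40, 48, 56, 64):
        raise ValueError("unsupported length: {} bits".format(bits))

    # Assumption: negative values use two's complement;
    # `to_bytes()` raises `OverflowError` if the value doesn't fit.
    return value.to_bytes(bits // 8, byteorder=byte_order, signed=value < 0)


data = fixed_length_int(345, 16, "little") + fixed_length_int(-0xABCD, 32, "big")
print(data.hex())  # 5901ffff5433
```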

=== LEB128 integer

An _LEB128 integer_ represents a variable number of bytes encoding,
following the https://en.wikipedia.org/wiki/LEB128[LEB128] format, an
unsigned or signed integer which is the result of evaluating a {py3}
expression.

An LEB128 integer is:

. The ``pass:[{]`` prefix.

. A valid {py3} expression.
+
For an LEB128 integer at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__.
* The name of any <<variable-assignment,variable>> known at{nbsp}__**L**__
  which doesn't, directly or indirectly, refer to a label
  defined after{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` in this expression is the
<<cur-offset,current offset>> (before encoding the integer).

. The `:` character.

. One of:
+
--
[horizontal]
`uleb128`:: Use the unsigned LEB128 format.
`sleb128`:: Use the signed LEB128 format.
--

. The `}` suffix.

====
Input:

----
{624485 : uleb128}
----

Output:

----
e5 8e 26
----
====

====
Input:

----
aa bb cc dd
<meow>
ee ff
{-981238311 + (meow * -23) : sleb128}
"hello"
----

Output:

----
aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
----
====
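
For reference, here's a minimal {py3} sketch of the two LEB128 variants.
It follows the usual description of the format (seven payload bits per
byte, high bit set on every byte except the last) and isn't taken from
the Normand source: `uleb128()` and `sleb128()` are illustrative names.

```python
def uleb128(value: int) -> bytes:
    if value < 0:
        raise ValueError("unsigned LEB128 needs a non-negative value")

    out = bytearray()

    while True:
        byte = value & 0x7F
        value >>= 7

        if value == 0:
            out.append(byte)
            return bytes(out)

        # More bytes follow: set the continuation bit
        out.append(byte | 0x80)


def sleb128(value: int) -> bytes:
    out = bytearray()

    while True:
        byte = value & 0x7F

        # Python's `>>` is an arithmetic shift, so the sign is kept
        value >>= 7
        sign_bit_set = bool(byte & 0x40)

        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)

        out.append(byte | 0x80)


print(uleb128(624485).hex())               # e58e26
print(sleb128(-981238311 + 4 * -23).hex())  # fdfa8dac7c
```

In the second call, `4` stands for the value of the `meow` label of the
example above.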

=== Current offset setting

This special item sets the <<cur-offset,_current offset_>>.

A current offset setting is:

. The `<` prefix.

. A positive integer (hexadecimal starting with `0x` or `0X` accepted)
  which is the new current offset.

. The `>` suffix.

====
Input:

----
       {ICITTE : 8} * 8
<0x61> {ICITTE : 8} * 8
----

Output:

----
00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
----
====

====
Input:

----
aa bb cc dd <meow> ee ff
<12> 11 22 33 <mix> 44 55
{meow : 8} {mix : 8}
----

Output:

----
aa bb cc dd ee ff 11 22 33 44 55 04 0f          ┆ •••••••"3DU••
----
====

=== Label

A _label_ associates a name to the <<cur-offset,current offset>>.

All the labels of a whole Normand input must have unique names.

A label must not share its name with a
<<variable-assignment,variable>>.

A label is:

. The `<` prefix.

. A valid {py3} name which is not `ICITTE` (see
  <<fixed-length-integer>>, <<leb128-integer>>, and
  <<variable-assignment>> to learn more).

. The `>` suffix.

=== Variable assignment

A _variable assignment_ associates a name to the integral result of an
evaluated {py3} expression.

A variable assignment is:

. The ``pass:[{]`` prefix.

. A valid {py3} name which is not `ICITTE` (see
  <<fixed-length-integer>>, <<leb128-integer>>, and
  <<variable-assignment>> to learn more).

. The `=` character.

. A valid {py3} expression.
+
For a variable assignment at some source location{nbsp}__**L**__, this
expression may contain the name of any accessible <<label,label>> (not
within a nested group), including the name of a label defined
after{nbsp}__**L**__, as well as the name of any
<<variable-assignment,variable>> known at{nbsp}__**L**__.
+
The value of the special name `ICITTE` in this expression is the
<<cur-offset,current offset>>.

. The `}` suffix.

====
Input:

----
{mix = 101} {le}
{meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
"yooo" {meow + mix : 16}
----

Output:

----
11 22 2a 33 79 6f 6f 6f 7a 00                   ┆ •"*3yoooz•
----
====

=== Group

A _group_ is a scoped sequence of items.

The <<label,labels>> within a group aren't visible outside of it.

The main purpose of a group is to <<repetition,repeat>> more than a
single item.

A group is:

. The `(` prefix.

. Zero or more items.

. The `)` suffix.

====
Input:

----
((aa bb cc) dd () ee) "leclerc"
----

Output:

----
aa bb cc dd ee 6c 65 63 6c 65 72 63             ┆ •••••leclerc
----
====

====
Input:

----
((aa bb cc) * 3 dd ee) * 5
----

Output:

----
aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
bb cc aa bb cc dd ee
----
====

====
Input:

----
{be}
(
  <str_beg> u16le"sébastien diaz" <str_end>
  {ICITTE - str_beg : 8}
  {(end - str_beg) * 5 : 24}
) * 3
<end>
----

Output:

----
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
----
====

=== Repetition

A _repetition_ represents the bytes of an item repeated a given number
of times.

A repetition is:

. Any item.

. The ``pass:[*]`` character.

. One of:

** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
   which is the number of times to repeat the previous item.

** The ``pass:[{]`` prefix, a valid {py3} expression, and the
   ``pass:[}]`` suffix.
+
For a repetition at some source location{nbsp}__**L**__, this expression
may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__ and
  which isn't part of its repeated item.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__, which isn't part of its repeated item, and which
  doesn't, directly or indirectly, refer to a label defined
  after{nbsp}__**L**__.
--
+
This expression must not contain the special name `ICITTE`.

====
Input:

----
{end - ICITTE - 1 : 8} * 0x100 <end>
----

Output:

----
ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
----
====

====
Input:

----
{times = 1}
aa bb cc dd
(
  <here>
  (ee ff) * {here + 1}
  11 22 33 * {times}
  {times = times + 1}
) * 3
"coucou!"
----

Output:

----
aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
33 33 63 6f 75 63 6f 75 21                      ┆ 33coucou!
----
====

====
This example shows how to use a repetition as a conditional section
depending on some predefined variable.

Input:

----
aa bb cc dd
(ee ff "meow mix" 00) * {cond}
{be} {-1993:16}
----

Output (`cond` is 0):

----
aa bb cc dd f8 37
----

Output (`cond` is 1):

----
aa bb cc dd ee ff 6d 65 6f 77 20 6d 69 78 00 f8 ┆ ••••••meow mix••
37                                              ┆ 7
----
====
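
To make the semantics of those examples concrete, here's a {py3} sketch
which builds the same bytes by hand. It assumes, as the first output
suggests, that Normand evaluates the repeated item again for each
repetition with an updated `ICITTE`. The `icitte` and `build()` names
are only illustrative.

```python
# `{end - ICITTE - 1 : 8} * 0x100 <end>`: the item is evaluated once
# per repetition, `icitte` being the offset before each encoded byte.
end = 0x100
countdown = bytes(end - icitte - 1 for icitte in range(end))


def build(cond: int) -> bytes:
    # `(ee ff "meow mix" 00) * {cond}`: repeating zero times drops the
    # group, so `cond` acts as a switch.
    optional = bytes.fromhex("eeff") + b"meow mix\x00"
    tail = (-1993).to_bytes(2, byteorder="big", signed=True)
    return bytes.fromhex("aabbccdd") + optional * cond + tail


print(countdown[:4].hex())  # fffefdfc
print(build(0).hex())       # aabbccddf837
```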

== Command-line tool

If you <<install-normand,installed>> the `normand` package, then you
can use the `normand` command-line tool:

----
$ normand <<< '"ma gang de malades"' | hexdump -C
----

----
00000000  6d 61 20 67 61 6e 67 20  64 65 20 6d 61 6c 61 64  |ma gang de malad|
00000010  65 73                                             |es|
----

If you copy the `normand.py` module to your own project, then you can
run the module itself:

----
$ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
----

----
00000000  6d 61 20 67 61 6e 67 20  64 65 20 6d 61 6c 61 64  |ma gang de malad|
00000010  65 73                                             |es|
----

Without a path argument, the `normand` tool reads from the standard
input.

The `normand` tool prints the generated binary data to the standard
output.

Various options control the initial <<state,state>> of the processor:
use the `--help` option to learn more.

== {py3} API

The whole `normand` package/module API is:

[source,python]
----
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


class TextLoc:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


class ParseError(RuntimeError):
    # Source text location.
    @property
    def text_loc(self) -> TextLoc:
        ...


SymbolsT = typing.Dict[str, int]


class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> SymbolsT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> SymbolsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


def parse(normand: str,
          init_variables: typing.Optional[SymbolsT] = None,
          init_labels: typing.Optional[SymbolsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
----

The `normand` parameter is the actual <<learn-normand,Normand input>>
while the other parameters control the initial <<state,state>>.

The `parse()` function raises a `ParseError` instance should it fail to
parse the `normand` string for any reason.

== Development

Normand is a https://python-poetry.org/[Poetry] project.

To develop it, install it through Poetry and enter the virtual
environment:

----
$ poetry install
$ poetry shell
$ normand <<< '"lol" * 10 0a'
----

`normand.py` is processed by:

* https://microsoft.github.io/pyright/[Pyright]
* https://github.com/psf/black[Black]
* https://pycqa.github.io/isort/[isort]

=== Testing

Use https://docs.pytest.org/[pytest] to test Normand once the package is
part of your virtual environment, for example:

----
$ poetry install
$ poetry run pip3 install pytest
$ poetry run pytest
----

The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.

In the `tests` directory, each `*.nt` file is a test. The file name
prefix indicates what it's meant to test:

`pass-`::
    Everything above the `---` line is the valid Normand input
    to test.
+
Everything below the `---` line is the expected data
(whitespace-separated hexadecimal bytes).

`fail-`::
    Everything above the `---` line is the invalid Normand input
    to test.
+
Everything below the `---` line is the expected error message having
this form:
+
----
LINE:COL - MESSAGE
----
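
As an illustration of that layout, the following {py3} sketch splits the
contents of such a test file and decodes its expected part. It's not the
project's actual test code: the three functions are hypothetical, and
the sample contents are made up for the occasion.

```python
import re
from typing import Tuple


def split_test_file(text: str) -> Tuple[str, str]:
    # Parts above and below the first `---` line
    lines = text.split("\n")
    sep = lines.index("---")
    return "\n".join(lines[:sep]), "\n".join(lines[sep + 1:])


def expected_bytes(expected: str) -> bytes:
    # `pass-` file: whitespace-separated hexadecimal bytes
    return bytes.fromhex("".join(expected.split()))


def expected_error(expected: str) -> Tuple[int, int, str]:
    # `fail-` file: `LINE:COL - MESSAGE`
    m = re.fullmatch(r"(\d+):(\d+) - (.+)", expected.strip(), re.DOTALL)

    if m is None:
        raise ValueError("unexpected error message format")

    return int(m.group(1)), int(m.group(2)), m.group(3)


# Made-up file contents
pass_input, pass_expected = split_test_file('aa bb "hi"\n---\naa bb 68 69\n')
fail_input, fail_expected = split_test_file("aa k\n---\n1:4 - Some message.\n")
print(expected_bytes(pass_expected).hex())
print(expected_error(fail_expected))
```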

=== Contributing

Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
for code review.

To report a bug, https://github.com/efficios/normand/issues/new[create a
GitHub issue].