1 // Show ToC at a specific location for a GitHub rendering
2 ifdef::env-github[]
3 :toc: macro
4 endif::env-github[]
5
6 ifndef::env-github[]
7 :toc: left
8 endif::env-github[]
9
10 // This is to mimic what GitHub does so that anchors work in an offline
11 // rendering too.
12 :idprefix:
13 :idseparator: -
14
15 // Other attributes
16 :py3: Python{nbsp}3
17
18 = Normand
19 Philippe Proulx
20
21 image::normand-logo.png[]
22
23 [.normal]
24 image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]
25
26 [.lead]
27 _**Normand**_ is a text-to-binary processor with its own language.
28
29 This package offers both a portable {py3} module and a command-line
30 tool.
31
32 WARNING: This version of Normand is 0.1, meaning both the Normand
33 language and the module/CLI interface aren't stable.
34
35 ifdef::env-github[]
36 // ToC location for a GitHub rendering
37 toc::[]
38 endif::env-github[]
39
40 == Introduction
41
42 The purpose of Normand is to consume human-readable text representing
43 bytes and to produce the corresponding binary data.
44
45 .Simple bytes input.
46 ====
47 Consider the following Normand input:
48
49 ----
50 4f 55 32 bb $167 fe %10100111 a9 $-32
51 ----
52
53 The generated nine bytes are:
54
55 ----
56 4f 55 32 bb a7 fe a7 a9 e0
57 ----
58 ====
59
As the previous example shows, the fundamental unit of the Normand
language is the _byte_. Normand generates the bytes in the order in
which you list them.
63
64 The Normand language is more than simple lists of bytes, though. Its
65 main features are:
66
67 Comments, including a bunch of insignificant symbols which may improve readability::
68 +
69 Input:
70 +
71 ----
72 ff bb %1101:0010 # This is a comment
73 78 29 af $192 # This too # 99 $-80
74 fe80::6257:18ff:fea3:4229
75 60:57:18:a3:42:29
76 10839636-5d65-4a68-8e6a-21608ddf7258
77 ----
78 +
79 Output:
80 +
81 ----
82 ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
83 a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
84 68 8e 6a 21 60 8d df 72 58
85 ----
86
87 Hexadecimal, decimal, and binary byte constants::
88 +
89 Input:
90 +
91 ----
92 aa bb $247 $-89 %0011_0010 %11.01= 10/10
93 ----
94 +
95 Output:
96 +
97 ----
98 aa bb f7 a7 32 da
99 ----
100
101 UTF-8, UTF-16, and UTF-32 literal strings::
102 +
103 Input:
104 +
105 ----
106 "hello world!" 00
107 u16le"stress\nverdict 🤣"
108 ----
109 +
110 Output:
111 +
112 ----
113 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
114 00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
115 00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd ┆ •d•i•c•t• •>•#•
116 ----
117
118 Labels: special variables holding the offset where they're defined::
119 +
120 ----
121 <beg> b2 52 e3 bc 91 05
122 $100 $50 <chair> 33 9f fe
123 25 e9 89 8a <end>
124 ----
125
126 Variables::
127 +
128 ----
129 5e 65 {tower = 47} c6 7f f2 c4
130 44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
131 ----
132 +
133 The value of a variable assignment is the evaluation of a valid {py3}
134 expression which may include label and variable names.
135
136 Value encoding with a specific length (8{nbsp}bits to 64{nbsp}bits) and byte order::
137 +
138 Input:
139 +
140 ----
141 {strength = 4}
142 {be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
143 {le} {-1993 : 32}
144 ----
145 +
146 Output:
147 +
148 ----
149 67 44 b2 00 2c 63 37 f8 ff ff
150 ----
151 +
152 The encoded value is the evaluation of a valid {py3} expression which
153 may include label and variable names.
154
155 Repetition::
156 +
157 Input:
158 +
159 ----
160 aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
161 ----
162 +
163 Output:
164 +
165 ----
166 aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
167 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
168 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
169 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
170 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
171 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
172 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
173 ----
174
175
176 Multilevel grouping::
177 +
178 Input:
179 +
180 ----
181 ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
182 ----
183 +
184 Output:
185 +
186 ----
187 ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
188 bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
189 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
190 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
191 aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
192 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
193 6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
194 ----
195
196 Precise error reporting::
197 +
198 ----
199 /tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
200 ----
201 +
202 ----
203 /tmp/meow.normand:32:6 - Unexpected character `k`.
204 ----
205 +
206 ----
207 /tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`mix`, `zoom`}.
208 ----
209 +
210 ----
211 /tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE` at byte offset 45.
212 ----
213
You can use Normand to track data source files in your favorite VCS
instead of raw binary files. The binary files that Normand generates are
useful to test file format decoding, including the handling of malformed
data, as well as for education.
218
219 See <<learn-normand>> to explore all the Normand features.
220
221 == Install Normand
222
223 Normand requires Python ≥ 3.4.
224
225 To install Normand:
226
227 ----
228 $ python3 -m pip install --user normand
229 ----
230
231 See
232 https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
233 to learn more about a user site installation.
234
235 [NOTE]
236 ====
237 Normand has a single module file, `normand.py`, which you can copy as is
238 to your project to use it (both the <<python-3-api,`normand.parse()`>>
239 function and the <<command-line-tool,command-line tool>>).
240
241 `normand.py` has _no external dependencies_, but if you're using
242 Python{nbsp}3.4, you'll need a local copy of the standard `typing`
243 module.
244 ====
245
246 == Learn Normand
247
248 A Normand text input is a sequence of items which represent a sequence
249 of raw bytes.
250
251 [[state]] During the processing of items to data, Normand relies on a
252 current state:
253
254 [%header%autowidth]
255 |===
256 |State variable |Description |Initial value: <<python-3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>
257
258 |[[cur-offset]] Current offset
259 |
260 The current offset has an effect on the value of
261 <<label,labels>> and of the special `ICITTE` name in <<value,value>> and
262 <<variable-assignment,variable assignment>> expression evaluation.
263
264 Each generated byte increments the current offset.
265
266 A <<current-offset-setting,current offset setting>> may change the
267 current offset.
268 |`init_offset` parameter of the `parse()` function.
269 |`--offset` option.
270
271 |[[cur-bo]] Current byte order
272 |
273 The current byte order has an effect on the encoding of <<value,values>>.
274
275 A <<current-byte-order-setting,current byte order setting>> may change
276 the current byte order.
277 |`init_byte_order` parameter of the `parse()` function.
278 |`--byte-order` option.
279
280 |<<label,Labels>>
281 |Mapping of label names to integral values.
282 |`init_labels` parameter of the `parse()` function.
283 |One or more `--label` options.
284
285 |<<variable-assignment,Variables>>
286 |Mapping of variable names to integral values.
287 |`init_variables` parameter of the `parse()` function.
288 |One or more `--var` options.
289 |===
290
291 The available items are:
292
293 * A <<byte-constant,constant integer>> representing a single byte.
294
295 * A <<literal-string,literal string>> representing a sequence of bytes
296 encoding UTF-8, UTF-16, or UTF-32 data.
297
298 * A <<current-byte-order-setting,current byte order setting>> (big or
299 little endian).
300
301 * A <<value,{py3} expression to be evaluated>> as an unsigned or signed
302 integer to be encoded on one or more bytes using the current byte
303 order.
304
305 * A <<current-offset-setting,current offset setting>>.
306
307 * A <<label,label>>, that is, a named constant holding the current
308 offset.
309 +
310 This is similar to an assembly label.
311
312 * A <<variable-assignment,variable assignment>> associating a name to
313 the integral result of an evaluated {py3} expression.
314
315 * A <<group,group>>, that is, a scoped sequence of items.
316
317 Moreover, you can <<repetition,repeat>> any item above, except an offset
318 or a label, a given fixed or variable number of times. This is called a
319 repetition.
320
321 A Normand comment may exist:
322
323 * Between items, possibly within a group.
324 * Between the nibbles of a constant hexadecimal byte.
325 * Between the bits of a constant binary byte.
326 * Between the last item and the ``pass:[*]`` character of a repetition,
327 and between that ``pass:[*]`` character and the following number
328 or expression.
329
A comment is anything between two ``pass:[#]`` characters on the same
line, or from ``pass:[#]`` until the end of the line. Whitespace and
the following symbol characters are also considered comments where a
comment may exist:
334
335 ----
336 ! @ / \ ? & : ; . , + [ ] _ = | -
337 ----
338
339 The latter serve to improve readability so that you may write, for
340 example, a MAC address or a UUID as is.
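
To make the role of those symbols concrete, here is a minimal {py3} sketch which decodes hexadecimal bytes while skipping them. It's an illustration only, not Normand's actual lexer: the `hex_bytes()` helper is hypothetical, and it handles neither ``pass:[#]`` comments nor any item other than hexadecimal byte constants.

```python
import re

# The symbols listed above, plus whitespace: all of them are skipped.
_INSIGNIFICANT = re.compile(r"[!@/\\?&:;.,+\[\]_=|\-\s]")


def hex_bytes(text: str) -> bytes:
    """Decode hexadecimal byte constants, skipping insignificant symbols."""
    hexits = _INSIGNIFICANT.sub("", text)
    # Raises ValueError on a non-hexit or on an odd number of hexits.
    return bytes.fromhex(hexits)


mac = hex_bytes("60:57:18:a3:42:29")
uuid = hex_bytes("10839636-5d65-4a68-8e6a-21608ddf7258")
print(mac.hex())  # 605718a34229
print(len(uuid))  # 16
```

This is why a MAC address or a UUID pasted as is yields the expected 6 or 16 bytes.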
341
342 You can test the examples of this section with the `normand`
343 <<command-line-tool,command-line tool>> as such:
344
345 ----
346 $ normand file | hexdump -C
347 ----
348
349 where `file` is the name of a file containing the Normand input.
350
351 === Byte constant
352
353 A _byte constant_ represents a single byte.
354
355 A byte constant is:
356
357 Hexadecimal form::
358 Two consecutive hexits.
359
360 Decimal form::
361 A decimal number after the `$` prefix.
362
363 Binary form::
364 Eight bits after the `%` prefix.
365
366 ====
367 Input:
368
369 ----
370 ab cd [3d 8F] CC
371 ----
372
373 Output:
374
375 ----
376 ab cd 3d 8f cc
377 ----
378 ====
379
380 ====
381 Input:
382
383 ----
384 $192 %1100/0011 $ -77
385 ----
386
387 Output:
388
389 ----
390 c0 c3 b3
391 ----
392 ====
393
394 ====
395 Input:
396
397 ----
398 58f64689-6316-4d55-8a1a-04cada366172
399 fe80::6257:18ff:fea3:4229
400 ----
401
402 Output:
403
404 ----
405 58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
406 fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
407 ----
408 ====
409
410 ====
411 Input:
412
413 ----
414 %01110011 %01100001 %01101100 %01110101 %01110100
415 ----
416
417 Output:
418
419 ----
420 73 61 6c 75 74 ┆ salut
421 ----
422 ====
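
The three forms above can be sketched in plain {py3} as follows. The helper names are hypothetical, and the accepted decimal range (-128 to 255) is an assumption made for this sketch, not something this document specifies.

```python
def hex_constant(hexits: str) -> int:
    """Two hexits, for example `8F`."""
    if len(hexits) != 2:
        raise ValueError("expecting exactly two hexits")
    return int(hexits, 16)


def decimal_constant(text: str) -> int:
    """Decimal number following `$`, for example `-77`."""
    value = int(text)
    # Assumed range: anything which fits in 8 bits, signed or unsigned.
    if not -128 <= value <= 255:
        raise ValueError("value {} is outside the 8-bit range".format(value))
    return value & 0xFF  # two's complement for negative values


def binary_constant(bits: str) -> int:
    """Eight bits following `%`, for example `11000011`."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expecting eight bits (`0` or `1`)")
    return int(bits, 2)


data = bytes([
    decimal_constant("192"),
    binary_constant("11000011"),
    decimal_constant("-77"),
])
print(data.hex())  # c0c3b3
```

A negative decimal constant ends up as its two's complement byte, which is why `$-77` produces `b3`.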
423
424 === Literal string
425
426 A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
427 bytes of a string.
428
429 The string to encode isn't implicitly null-terminated: use `\0` at the
430 end of the string to add a null character.
431
432 A literal string is:
433
434 . **Optional**: one of the following encodings instead of UTF-8:
435 +
436 --
437 [horizontal]
438 `u16be`:: UTF-16BE.
439 `u16le`:: UTF-16LE.
440 `u32be`:: UTF-32BE.
441 `u32le`:: UTF-32LE.
442 --
443
444 . The ``pass:["]`` prefix.
445
446 . A sequence of zero or more characters, possibly containing escape
447 sequences.
448 +
449 An escape sequence is the ``\`` character followed by one of:
450 +
451 --
452 [horizontal]
453 `0`:: Null (U+0000)
454 `a`:: Alert (U+0007)
455 `b`:: Backspace (U+0008)
456 `e`:: Escape (U+001B)
457 `f`:: Form feed (U+000C)
458 `n`:: End of line (U+000A)
459 `r`:: Carriage return (U+000D)
460 `t`:: Character tabulation (U+0009)
461 `v`:: Line tabulation (U+000B)
462 ``\``:: Reverse solidus (U+005C)
463 ``pass:["]``:: Quotation mark (U+0022)
464 --
465
466 . The ``pass:["]`` suffix.
467
468 ====
469 Input:
470
471 ----
472 "coucou tout le monde!"
473 ----
474
475 Output:
476
477 ----
478 63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
479 6f 6e 64 65 21 ┆ onde!
480 ----
481 ====
482
483 ====
484 Input:
485
486 ----
487 u16le"I am not young enough to know everything."
488 ----
489
490 Output:
491
492 ----
493 49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
494 20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e•
495 6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
496 20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v•
497 65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
498 2e 00 ┆ .•
499 ----
500 ====
501
502 ====
503 Input:
504
505 ----
506 u32be "\"illusion is the first\nof all pleasures\" 🦉"
507 ----
508
509 Output:
510
511 ----
512 00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
513 00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
514 00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
515 00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
516 00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
517 00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
518 00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
519 00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
520 00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
521 00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
522 00 00 00 20 00 01 f9 89 ┆ ••• ••••
523 ----
524 ====
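
As a rough {py3} equivalent, the encoding prefixes map to standard codec names as sketched below. The `CODECS` table and the `encode_literal()` helper are hypothetical names for this illustration; escape sequences are assumed to be already resolved.

```python
# Normand encoding prefix -> Python codec name (no prefix means UTF-8).
CODECS = {
    "": "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}


def encode_literal(text: str, prefix: str = "") -> bytes:
    """Encode `text` without a BOM and without an implicit null terminator."""
    return text.encode(CODECS[prefix])


print(encode_literal("salut").hex())          # 73616c7574
print(encode_literal("🦉", "u32be").hex())    # 0001f989
print(encode_literal("hi\0", "u16le").hex())  # 680069000000
```

The explicit `-be` and `-le` codecs don't emit a byte order mark, and the null character only appears when the string contains one.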
525
526 === Current byte order setting
527
528 This special item sets the <<cur-bo,_current byte order_>>.
529
530 The two accepted forms are:
531
532 [horizontal]
533 ``pass:[{be}]``:: Set the current byte order to big endian.
534 ``pass:[{le}]``:: Set the current byte order to little endian.
535
536 === Value
537
538 A _value_ represents a fixed number of bytes encoding an unsigned or
539 signed integer which is the result of evaluating a {py3} expression
540 using the <<cur-bo,current byte order>>.
541
542 For a value at some source location{nbsp}__**L**__, its {py3} expression
543 may contain the name of any accessible <<label,label>>, including the
544 name of a label defined after{nbsp}__**L**__, as well as the name of any
545 <<variable-assignment,variable>> known at{nbsp}__**L**__.
546
547 An accessible label is either:
548
549 * Outside of the current <<group,group>>.
550 * Within the same immediate group (not within a nested group).
551
552 In the {py3} expression of a value, the value of the special name
553 `ICITTE` is the <<cur-offset,current offset>> (before encoding the
554 value).
555
556 A value is:
557
558 . The ``pass:[{]`` prefix.
559
560 . A valid {py3} expression.
561
562 . The `:` character.
563
564 . An encoding length in bits amongst `8`, `16`, `24`, `32`, `40`,
565 `48`, `56`, and `64`.
566
567 . The `}` suffix.
568
569 ====
570 Input:
571
572 ----
573 {le} {345:16}
574 {be} {-0xabcd:32}
575 ----
576
577 Output:
578
579 ----
580 59 01 ff ff 54 33
581 ----
582 ====
583
584 ====
585 Input:
586
587 ----
588 {be}
589
590 # String length in bits
591 {8 * (str_end - str_beg) : 16}
592
593 # String
594 <str_beg>
595 "hello world!"
596 <str_end>
597 ----
598
599 Output:
600
601 ----
602 00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
603 ----
604 ====
605
606 ====
607 Input:
608
609 ----
610 {20 - ICITTE : 8} * 10
611 ----
612
613 Output:
614
615 ----
616 14 13 12 11 10 0f 0e 0d 0c 0b
617 ----
618 ====
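
The encoding itself resembles what `int.to_bytes()` does in {py3}. The following sketch assumes that negative results are encoded as signed (two's complement) integers and other results as unsigned ones; `encode_value()` is a hypothetical helper, not part of the `normand` module.

```python
def encode_value(value: int, bits: int, byte_order: str) -> bytes:
    """Encode `value` on `bits` bits; `byte_order` is "big" or "little"."""
    if bits not in range(8, 65, 8):
        raise ValueError("length must be a multiple of 8 between 8 and 64")
    # Negative values are encoded as signed, others as unsigned.
    # `to_bytes()` raises OverflowError when the value doesn't fit.
    return value.to_bytes(bits // 8, byte_order, signed=value < 0)


print(encode_value(345, 16, "little").hex())   # 5901
print(encode_value(-0xabcd, 32, "big").hex())  # ffff5433
```

With this sketch, a value which doesn't fit the requested length, such as 315 on 8{nbsp}bits, raises `OverflowError`.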
619
620 === Current offset setting
621
622 This special item sets the <<cur-offset,_current offset_>>.
623
624 A current offset setting is:
625
626 . The `<` prefix.
627
628 . A positive integer (hexadecimal starting with `0x` or `0X` accepted)
629 which is the new current offset.
630
631 . The `>` suffix.
632
633 ====
634 Input:
635
636 ----
637 {ICITTE : 8} * 8
638 <0x61> {ICITTE : 8} * 8
639 ----
640
641 Output:
642
643 ----
644 00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
645 ----
646 ====
647
648 ====
649 Input:
650
651 ----
652 aa bb cc dd <meow> ee ff
653 <12> 11 22 33 <mix> 44 55
654 {meow : 8} {mix : 8}
655 ----
656
657 Output:
658
659 ----
660 aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
661 ----
662 ====
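
To see how the current offset, offset settings, and labels interact, here is a small {py3} simulation of the last example. The `track_offsets()` function and its `(kind, arg)` item tuples are hypothetical and only model the bookkeeping, not the real parser.

```python
def track_offsets(items, init_offset=0):
    """Return `(final_offset, labels)` for a list of `(kind, arg)` items."""
    offset = init_offset
    labels = {}
    for kind, arg in items:
        if kind == "bytes":     # `arg` generated bytes
            offset += arg
        elif kind == "offset":  # current offset setting, e.g. `<12>`
            offset = arg
        elif kind == "label":   # label, e.g. `<meow>`
            labels[arg] = offset
        else:
            raise ValueError("unknown item kind: {}".format(kind))
    return offset, labels


# aa bb cc dd <meow> ee ff <12> 11 22 33 <mix> 44 55 {meow : 8} {mix : 8}
items = [
    ("bytes", 4), ("label", "meow"), ("bytes", 2),
    ("offset", 12), ("bytes", 3), ("label", "mix"), ("bytes", 2),
    ("bytes", 2),
]
final_offset, labels = track_offsets(items)
print(labels)        # {'meow': 4, 'mix': 15}
print(final_offset)  # 19
```

Setting the current offset changes the value of subsequent labels without generating any byte, which is why `mix` is 15 although only nine bytes precede it.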
663
664 === Label
665
666 A _label_ associates a name to the <<cur-offset,current offset>>.
667
668 All the labels of a whole Normand input must have unique names.
669
A label may not share its name with a
<<variable-assignment,variable>>.
672
673 A label name may not be `ICITTE` (see <<value>> and
674 <<variable-assignment>> to learn more).
675
676 A label is:
677
678 . The `<` prefix.
679
680 . A valid {py3} name which is not `ICITTE`.
681
682 . The `>` suffix.
683
684 === Variable assignment
685
686 A _variable assignment_ associates a name to the integral result of an
687 evaluated {py3} expression.
688
689 For a variable assignment at some source location{nbsp}__**L**__, its
690 {py3} expression may contain the name of any accessible <<label,label>>,
691 including the name of a label defined after{nbsp}__**L**__, as well as
692 the name of any variable known at{nbsp}__**L**__.
693
694 An accessible label is either:
695
696 * Outside of the current <<group,group>>.
697 * Within the same immediate group (not within a nested group).
698
699 A variable name may not be `ICITTE`.
700
701 In the {py3} expression of a variable assignment, the special name
702 `ICITTE` is the <<cur-offset,current offset>>.
703
A variable assignment is:
705
706 . The ``pass:[{]`` prefix.
707
708 . A valid {py3} name which is not `ICITTE`.
709
710 . The `=` character.
711
712 . A valid {py3} expression.
713
714 . The `}` suffix.
715
716 ====
717 Input:
718
719 ----
720 {mix = 101} {le}
721 {meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
722 "yooo" {meow + mix : 16}
723 ----
724
725 Output:
726
727 ----
728 11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
729 ----
730 ====
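
One way to picture the evaluation of such an expression is the following {py3} sketch, which rejects unknown names before evaluating. It's an assumption about a reasonable approach, not a description of how Normand does it; `evaluate()` is a hypothetical helper, and `eval()` with empty builtins is not a security sandbox.

```python
import ast


def evaluate(expr: str, names: dict) -> int:
    """Evaluate `expr` using only the label/variable values in `names`."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id not in names:
            raise NameError(
                "illegal name `{}`; the legal names are {}".format(
                    node.id, sorted(names)
                )
            )
    result = eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}}, dict(names))
    if not isinstance(result, int):
        raise TypeError("expression must evaluate to an integer")
    return result


# Names as they are known at the end of the example above.
print(evaluate("meow + mix", {"meow": 21, "mix": 101}))  # 122
```

The value 122 is `7a` in hexadecimal, which matches the last two bytes (`7a 00`, little endian) of the example output.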
731
732 === Group
733
734 A _group_ is a scoped sequence of items.
735
736 The <<label,labels>> within a group aren't visible outside of it.
737
738 The main purpose of a group is to <<repetition,repeat>> more than a
739 single item.
740
741 A group is:
742
743 . The `(` prefix.
744
745 . Zero or more items.
746
747 . The `)` suffix.
748
749 ====
750 Input:
751
752 ----
753 ((aa bb cc) dd () ee) "leclerc"
754 ----
755
756 Output:
757
758 ----
759 aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
760 ----
761 ====
762
763 ====
764 Input:
765
766 ----
767 ((aa bb cc) * 3 dd ee) * 5
768 ----
769
770 Output:
771
772 ----
773 aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
774 cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
775 ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
776 bb cc aa bb cc dd ee
777 ----
778 ====
779
780 ====
781 Input:
782
783 ----
784 {be}
785 (
786 <str_beg> u16le"sébastien diaz" <str_end>
787 {ICITTE - str_beg : 8}
788 {(end - str_beg) * 5 : 24}
789 ) * 3
790 <end>
791 ----
792
793 Output:
794
795 ----
796 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
797 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
798 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
799 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
800 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
801 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
802 ----
803 ====
804
805 === Repetition
806
807 A _repetition_ represents the bytes of an item repeated a given number
808 of times.
809
810 A repetition is:
811
812 . Any item.
813
814 . The ``pass:[*]`` character.
815
816 . One of:
817
818 ** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
819 which is the number of times to repeat the previous item.
820
821 ** The ``pass:[{]`` prefix, a valid {py3} expression, and the
822 ``pass:[}]`` suffix.
823
When the repetition count is an expression, it can't refer, directly or
indirectly, to a subsequent label name or to the reserved `ICITTE` name.
826
827 ====
828 Input:
829
830 ----
831 {end - ICITTE - 1 : 8} * 0x100 <end>
832 ----
833
834 Output:
835
836 ----
837 ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
838 ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
839 df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
840 cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
841 bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
842 af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
843 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
844 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
845 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
846 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
847 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
848 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
849 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
850 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
851 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
852 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
853 ----
854 ====
855
856 ====
857 Input:
858
859 ----
860 {times = 1}
861 aa bb cc dd
862 (
863 <here>
864 (ee ff) * {here + 1}
865 11 22 33 * {times}
866 {times = times + 1}
867 ) * 3
868 "coucou!"
869 ----
870
871 Output:
872
873 ----
874 aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
875 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
876 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
877 ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
878 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
879 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
880 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
881 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
882 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
883 ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
884 ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
885 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
886 ----
887 ====
888
889 ====
890 This example shows how to use a repetition as a conditional section
891 depending on some predefined variable.
892
893 Input:
894
895 ----
896 aa bb cc dd
897 (ee ff "meow mix" 00) * {cond}
898 {be} {-1993:16}
899 ----
900
901 Output (`cond` is 0):
902
903 ----
904 aa bb cc dd f8 37
905 ----
906
907 Output (`cond` is 1):
908
909 ----
910 aa bb cc dd ee ff 6d 65 6f 77 20 6d 69 78 00 f8 ┆ ••••••meow mix••
911 37 ┆ 7
912 ----
913 ====
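
The conditional section trick works like multiplying a byte sequence by 0 or 1 in {py3}. This sketch rebuilds the last example by hand; `build()` is a hypothetical function which doesn't use the `normand` module.

```python
def build(cond: int) -> bytes:
    header = bytes.fromhex("aabbccdd")
    optional = bytes.fromhex("eeff") + "meow mix".encode("utf-8") + b"\x00"
    trailer = (-1993).to_bytes(2, "big", signed=True)
    # Repeating a sequence zero times yields nothing; once yields it as is.
    return header + optional * cond + trailer


print(build(0).hex())  # aabbccddf837
print(build(1).hex())  # aabbccddeeff6d656f77206d697800f837
```

Any other non-negative value of `cond` repeats the optional section that many times.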
914
915 == Command-line tool
916
917 If you <<install-normand,installed>> the `normand` package, then you
918 can use the `normand` command-line tool:
919
920 ----
921 $ normand <<< '"ma gang de malades"' | hexdump -C
922 ----
923
924 ----
925 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
926 00000010 65 73 |es|
927 ----
928
929 If you copy the `normand.py` module to your own project, then you can
930 run the module itself:
931
932 ----
933 $ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
934 ----
935
936 ----
937 00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
938 00000010 65 73 |es|
939 ----
940
941 Without a path argument, the `normand` tool reads from the standard
942 input.
943
944 The `normand` tool prints the generated binary data to the standard
945 output.
946
947 Various options control the initial <<state,state>> of the processor:
948 use the `--help` option to learn more.
949
950 == {py3} API
951
952 The whole `normand` package/module API is:
953
954 [source,python]
955 ----
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


VarsT = typing.Dict[str, int]


class TextLoc:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


class ParseError(RuntimeError):
    # Source text location.
    @property
    def text_loc(self) -> TextLoc:
        ...


class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> VarsT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> VarsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


def parse(normand: str,
          init_variables: typing.Optional[VarsT] = None,
          init_labels: typing.Optional[VarsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
1018 ----
1019
1020 The `normand` parameter is the actual <<learn-normand,Normand input>>
1021 while the other parameters control the initial <<state,state>>.
1022
1023 The `parse()` function raises a `ParseError` instance should it fail to
1024 parse the `normand` string for any reason.
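
A caller would typically catch `ParseError` and report the source location. The sketch below formats such an error like the messages shown in the introduction. It assumes that converting the exception to a string yields its message, which this document doesn't state. `FakeTextLoc` and `FakeParseError` are stand-ins which only mirror the documented properties so that the sketch runs without Normand installed.

```python
def format_error(path: str, exc) -> str:
    """Format a `ParseError`-like exception as `PATH:LINE:COL - MESSAGE`."""
    loc = exc.text_loc
    return "{}:{}:{} - {}".format(path, loc.line_no, loc.col_no, exc)


# Stand-ins: with the real module, you would catch `normand.ParseError`
# around a call to `normand.parse()` instead.
class FakeTextLoc:
    def __init__(self, line_no: int, col_no: int):
        self.line_no = line_no
        self.col_no = col_no


class FakeParseError(RuntimeError):
    def __init__(self, msg: str, text_loc: FakeTextLoc):
        super().__init__(msg)
        self.text_loc = text_loc


err = FakeParseError("Unexpected character `k`.", FakeTextLoc(32, 6))
print(format_error("/tmp/meow.normand", err))
# /tmp/meow.normand:32:6 - Unexpected character `k`.
```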
1025
1026 == Development
1027
1028 Normand is a https://python-poetry.org/[Poetry] project.
1029
1030 To develop it, install it through Poetry and enter the virtual
1031 environment:
1032
1033 ----
1034 $ poetry install
1035 $ poetry shell
1036 $ normand <<< '"lol" * 10 0a'
1037 ----
1038
1039 `normand.py` is processed by:
1040
1041 * https://microsoft.github.io/pyright/[Pyright]
1042 * https://github.com/psf/black[Black]
1043 * https://pycqa.github.io/isort/[isort]
1044
1045 === Testing
1046
1047 Use https://docs.pytest.org/[pytest] to test Normand once the package is
1048 part of your virtual environment, for example:
1049
1050 ----
1051 $ poetry install
1052 $ poetry run pip3 install pytest
1053 $ poetry run pytest
1054 ----
1055
The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.
1059
1060 In the `tests` directory, each `*.nt` file is a test. The file name
1061 prefix indicates what it's meant to test:
1062
1063 `pass-`::
1064 Everything above the `---` line is the valid Normand input
1065 to test.
1066 +
1067 Everything below the `---` line is the expected data
1068 (whitespace-separated hexadecimal bytes).
1069
1070 `fail-`::
1071 Everything above the `---` line is the invalid Normand input
1072 to test.
1073 +
1074 Everything below the `---` line is the expected error message having
1075 this form:
1076 +
1077 ----
1078 LINE:COL - MESSAGE
1079 ----
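
The test file layout described above can be read with a few lines of {py3}. This is a sketch under assumptions: `split_test()` and `expected_bytes()` are hypothetical helpers (the actual test runner may differ), the separator is taken to be the first line which is exactly `---`, and the sample contents are made up.

```python
def split_test(text: str):
    """Return `(normand_input, expected)` from the contents of a test file."""
    lines = text.split("\n")
    sep = lines.index("---")  # raises ValueError if the separator is missing
    return "\n".join(lines[:sep]), "\n".join(lines[sep + 1:]).strip()


def expected_bytes(expected: str) -> bytes:
    """Decode whitespace-separated hexadecimal bytes (`pass-` tests)."""
    return bytes(int(token, 16) for token in expected.split())


sample = 'aa bb $192 "yo"\n---\naa bb c0\n79 6f\n'
normand_input, expected = split_test(sample)
print(normand_input)                   # aa bb $192 "yo"
print(expected_bytes(expected).hex())  # aabbc0796f
```

For a `fail-` test, the second element returned by `split_test()` is the expected `LINE:COL - MESSAGE` string to compare with the raised error.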
1080
1081 === Contributing
1082
1083 Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
1084 for code review.
1085
1086 To report a bug, https://github.com/efficios/normand/issues/new[create a
1087 GitHub issue].