// Show ToC at a specific location for a GitHub rendering
ifdef::env-github[]
:toc: macro
endif::env-github[]

ifndef::env-github[]
:toc: left
endif::env-github[]

// This is to mimic what GitHub does so that anchors work in an offline
// rendering too.
:idprefix:
:idseparator: -

// Other attributes
:py3: Python{nbsp}3

= Normand
Philippe Proulx

image::normand-logo.png[]

[.normal]
image:https://img.shields.io/pypi/v/normand.svg?label=Latest%20version[link="https://pypi.python.org/pypi/normand"]

[.lead]
_**Normand**_ is a text-to-binary processor with its own language.

This package offers both a portable {py3} module and a command-line
tool.

WARNING: This version of Normand is 0.6, meaning both the Normand
language and the module/CLI interface aren't stable.

ifdef::env-github[]
// ToC location for a GitHub rendering
toc::[]
endif::env-github[]


== Introduction

The purpose of Normand is to consume human-readable text representing
bytes and to produce the corresponding binary data.

.Simple bytes input.
====
Consider the following Normand input:

----
4f 55 32 bb $167 fe %10100111 a9 $-32
----

The generated nine bytes are:

----
4f 55 32 bb a7 fe a7 a9 e0
----
====

As you can see in the last example, the fundamental unit of the Normand
language is the _byte_. The order in which you list bytes will be the
order of the generated data.

The Normand language is more than simple lists of bytes, though. Its
main features are:


Comments, including a bunch of insignificant symbols which may improve readability::
+
Input:
+
----
ff bb %1101:0010 # This is a comment
78 29 af $192 # This too # 99 $-80
fe80::6257:18ff:fea3:4229
60:57:18:a3:42:29
10839636-5d65-4a68-8e6a-21608ddf7258
----
+
Output:
+
----
ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe
a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a
68 8e 6a 21 60 8d df 72 58
----

Hexadecimal, decimal, and binary byte constants::
+
Input:
+
----
aa bb $247 $-89 %0011_0010 %11.01= 10/10
----
+
Output:
+
----
aa bb f7 a7 32 da
----

UTF-8, UTF-16, and UTF-32 literal strings::
+
Input:
+
----
"hello world!" 00
u16le"stress\nverdict 🤣"
----
+
Output:
+
----
68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd    ┆ •d•i•c•t• •>•#•
----


Labels: special variables holding the offset where they're defined::
+
----
<beg> b2 52 e3 bc 91 05
$100 $50 <chair> 33 9f fe
25 e9 89 8a <end>
----

Variables::
+
----
5e 65 {tower = 47} c6 7f f2 c4
44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
----
+
The value of a variable assignment is the evaluation of a valid {py3}
expression which may include label and variable names.

Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
+
Input:
+
----
{strength = 4}
{be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
{le} {-1993 : 32}
{-3.141593 : 64}
----
+
Output:
+
----
67 44 b2 00 2c 63 37 f8 ff ff 7f bd c2 82 fb 21
09 c0
----
+
The encoded number is the evaluation of a valid {py3} expression which
may include label and variable names.

https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
+
Input:
+
----
aa bb cc {-1993 : sleb128} <meow> dd ee ff
{meow * 199 : uleb128}
----
+
Output:
+
----
aa bb cc b7 70 dd ee ff e3 07
----
+
The encoded integer is the evaluation of a valid {py3} expression which
may include label and variable names.


Repetition::
+
Input:
+
----
aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
----
+
Output:
+
----
aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y
65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye
61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
----

Multilevel grouping::
+
Input:
+
----
ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
----
+
Output:
+
----
ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom••
bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z
6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo
6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom•
aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom•••
7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo
6f 6d cc aa bb 7a 6f 6f 6d cc de de de de       ┆ om•••zoom•••••
----


Precise error reporting::
+
----
/tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
----
+
----
/tmp/meow.normand:32:6 - Unexpected character `k`.
----
+
----
/tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`mix`, `zoom`}.
----
+
----
/tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE` at byte offset 45.
----

You can use Normand to track data source files in your favorite VCS
instead of raw binary files. The binary files that Normand generates can
be used to test file format decoding, including malformed data, for
example, as well as for education.

See <<learn-normand>> to explore all the Normand features.


== Install Normand

Normand requires Python ≥ 3.4.

To install Normand:

----
$ python3 -m pip install --user normand
----

See
https://packaging.python.org/en/latest/tutorials/installing-packages/#installing-to-the-user-site[Installing to the User Site]
to learn more about a user site installation.

[NOTE]
====
Normand has a single module file, `normand.py`, which you can copy as is
to your project to use it (both the <<python3-api,`normand.parse()`>>
function and the <<command-line-tool,command-line tool>>).

`normand.py` has _no external dependencies_, but if you're using
Python{nbsp}3.4, you'll need a local copy of the standard `typing`
module.
====


== Learn Normand

A Normand text input is a sequence of items which represent a sequence
of raw bytes.

[[state]] During the processing of items to data, Normand relies on a
current state:

[%header%autowidth]
|===
|State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>

|[[cur-offset]] Current offset
|
The current offset has an effect on the value of <<label,labels>> and of
the special `ICITTE` name in <<fixed-length-number,fixed-length
number>>, <<leb128-integer,LEB128 integer>>, and
<<variable-assignment,variable assignment>> expression evaluation.

Each generated byte increments the current offset.

A <<current-offset-setting,current offset setting>> may change the
current offset.
|`init_offset` parameter of the `parse()` function.
|`--offset` option.

|[[cur-bo]] Current byte order
|
The current byte order has an effect on the encoding of
<<fixed-length-number,fixed-length numbers>>.

A <<current-byte-order-setting,current byte order setting>> may change
the current byte order.
|`init_byte_order` parameter of the `parse()` function.
|`--byte-order` option.

|<<label,Labels>>
|Mapping of label names to integral values.
|`init_labels` parameter of the `parse()` function.
|One or more `--label` options.

|<<variable-assignment,Variables>>
|Mapping of variable names to integral values.
|`init_variables` parameter of the `parse()` function.
|One or more `--var` options.
|===


The available items are:

* A <<byte-constant,constant integer>> representing a single byte.

* A <<literal-string,literal string>> representing a sequence of bytes
  encoding UTF-8, UTF-16, or UTF-32 data.

* A <<current-byte-order-setting,current byte order setting>> (big or
  little endian).

* A <<fixed-length-number,fixed-length number>> (integer or
  floating point) using the <<cur-bo,current byte order>> and of which
  the value is the result of a {py3} expression.

* An <<leb128-integer,LEB128 integer>> of which the value is the result
  of a {py3} expression.

* A <<current-offset-setting,current offset setting>>.

* A <<label,label>>, that is, a named constant holding the current
  offset.
+
This is similar to an assembly label.

* A <<variable-assignment,variable assignment>> associating a name to
  the integral result of an evaluated {py3} expression.

* A <<group,group>>, that is, a scoped sequence of items.

Moreover, you can <<repetition,repeat>> any item above, except an offset
or a label, a given fixed or variable number of times. This is called a
repetition.

A Normand comment may exist:

* Between items, possibly within a group.
* Between the nibbles of a constant hexadecimal byte.
* Between the bits of a constant binary byte.
* Between the last item and the ``pass:[*]`` character of a repetition,
  and between that ``pass:[*]`` character and the following number
  or expression.

A comment is anything between two ``pass:[#]`` characters on the same
line, or from ``pass:[#]`` until the end of the line. Whitespace and
the following symbol characters are also considered comments where a
comment may exist:

----
! @ / \ ? & : ; . , + [ ] _ = | -
----

The latter serve to improve readability so that you may write, for
example, a MAC address or a UUID as is.
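
To make the effect of those symbols concrete, here is a minimal {py3} sketch, independent of Normand's actual parser, which drops whitespace and the symbol characters listed above before decoding pairs of hexits. The `hex_bytes()` helper is hypothetical and deliberately ignores ``pass:[#]`` comments and the `$`, `%`, and string forms.

```python
# Hypothetical helper, not part of the `normand` module: it only mimics how
# the insignificant symbols vanish between hexadecimal byte constants.
INSIGNIFICANT = set("!@/\\?&:;.,+[]_=|-")


def hex_bytes(text: str) -> bytes:
    """Decode hexadecimal byte constants, ignoring whitespace and symbols."""
    hexits = "".join(
        ch for ch in text if not ch.isspace() and ch not in INSIGNIFICANT
    )
    return bytes.fromhex(hexits)


mac = hex_bytes("60:57:18:a3:42:29")
uuid = hex_bytes("10839636-5d65-4a68-8e6a-21608ddf7258")
print(len(mac), len(uuid))
```

With the MAC address and UUID of the introduction, this yields 6 and 16 bytes respectively, matching the number of bytes each one contributes to the output shown there.
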

You can test the examples of this section with the `normand`
<<command-line-tool,command-line tool>> as such:

----
$ normand file | hexdump -C
----

where `file` is the name of a file containing the Normand input.

=== Byte constant

A _byte constant_ represents a single byte.

A byte constant is:

Hexadecimal form::
    Two consecutive hexits.

Decimal form::
    A decimal number after the `$` prefix.

Binary form::
    Eight bits after the `%` prefix.

====
Input:

----
ab cd [3d 8F] CC
----

Output:

----
ab cd 3d 8f cc
----
====

====
Input:

----
$192 %1100/0011 $ -77
----

Output:

----
c0 c3 b3
----
====

====
Input:

----
58f64689-6316-4d55-8a1a-04cada366172
fe80::6257:18ff:fea3:4229
----

Output:

----
58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar
fe 80 62 57 18 ff fe a3 42 29                   ┆ ••bW••••B)
----
====

====
Input:

----
%01110011 %01100001 %01101100 %01110101 %01110100
----

Output:

----
73 61 6c 75 74                                  ┆ salut
----
====
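
The three forms above all reduce to one value in the 0 to 255 range. As a rough illustration only, and not Normand's implementation, the following {py3} sketch maps each form to its byte. The helper names are hypothetical, and the accepted decimal range (-128 to 255) is an assumption inferred from the examples of this document.

```python
def byte_from_decimal(value: int) -> int:
    """Map a decimal constant to one byte (two's complement if negative)."""
    # Assumed range: the examples use values such as $247 and $-89.
    if not -128 <= value <= 255:
        raise ValueError("{} is outside the 8-bit range".format(value))
    return value & 0xFF


def byte_from_bits(bits: str) -> int:
    """Map exactly eight `0`/`1` characters to one byte."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expecting eight bits, got {!r}".format(bits))
    return int(bits, 2)


def byte_from_hexits(hexits: str) -> int:
    """Map exactly two hexits (any case) to one byte."""
    if len(hexits) != 2:
        raise ValueError("expecting two hexits, got {!r}".format(hexits))
    return int(hexits, 16)


data = bytes([
    byte_from_hexits("8F"),
    byte_from_decimal(192),
    byte_from_bits("11000011"),
    byte_from_decimal(-77),
])
print(" ".join("{:02x}".format(b) for b in data))
```
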

=== Literal string

A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
bytes of a string.

The string to encode isn't implicitly null-terminated: use `\0` at the
end of the string to add a null character.

A literal string is:

. **Optional**: one of the following encodings instead of UTF-8:
+
--
[horizontal]
`u16be`:: UTF-16BE.
`u16le`:: UTF-16LE.
`u32be`:: UTF-32BE.
`u32le`:: UTF-32LE.
--

. The ``pass:["]`` prefix.

. A sequence of zero or more characters, possibly containing escape
  sequences.
+
An escape sequence is the ``\`` character followed by one of:
+
--
[horizontal]
`0`:: Null (U+0000)
`a`:: Alert (U+0007)
`b`:: Backspace (U+0008)
`e`:: Escape (U+001B)
`f`:: Form feed (U+000C)
`n`:: End of line (U+000A)
`r`:: Carriage return (U+000D)
`t`:: Character tabulation (U+0009)
`v`:: Line tabulation (U+000B)
``\``:: Reverse solidus (U+005C)
``pass:["]``:: Quotation mark (U+0022)
--

. The ``pass:["]`` suffix.

====
Input:

----
"coucou tout le monde!"
----

Output:

----
63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m
6f 6e 64 65 21                                  ┆ onde!
----
====

====
Input:

----
u16le"I am not young enough to know everything."
----

Output:

----
49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t•
20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆  •y•o•u•n•g• •e•
6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o•
20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆  •k•n•o•w• •e•v•
65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g•
2e 00                                           ┆ .•
----
====

====
Input:

----
u32be "\"illusion is the first\nof all pleasures\" 🦉"
----

Output:

----
00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l
00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o
00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s
00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e
00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r
00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o
00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l
00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l
00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u
00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••"
00 00 00 20 00 01 f9 89                         ┆ ••• ••••
----
====
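
The encoding prefixes above correspond to codecs that {py3} already ships, so you can cross-check an expected output without Normand. The sketch below is only such a cross-check: the `PREFIX_TO_CODEC` mapping and the `encode_literal()` helper are hypothetical names, and the function takes an already unescaped string (it doesn't process Normand escape sequences).

```python
# Hypothetical mapping from the Normand encoding prefixes to Python codecs.
PREFIX_TO_CODEC = {
    None: "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
}


def encode_literal(text: str, prefix=None) -> bytes:
    """Encode `text` like a literal string with the given prefix would be."""
    # The explicit `-be`/`-le` codecs never emit a byte order mark.
    return text.encode(PREFIX_TO_CODEC[prefix])


# No implicit null terminator: add one explicitly if needed.
print(encode_literal("hello world!\0").hex())

# A character outside the BMP becomes a surrogate pair in UTF-16.
print(encode_literal("\U0001F923", "u16le").hex())
```
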

=== Current byte order setting

This special item sets the <<cur-bo,_current byte order_>>.

The two accepted forms are:

[horizontal]
``pass:[{be}]``:: Set the current byte order to big endian.
``pass:[{le}]``:: Set the current byte order to little endian.

=== Fixed-length number

A _fixed-length number_ represents a fixed number of bytes encoding
either:

* An unsigned or signed integer (two's complement).
+
The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.

* A floating point number
  (https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
+
The available lengths are 32 (_binary32_) and 64 (_binary64_).

The value is the result of evaluating a {py3} expression using the
<<cur-bo,current byte order>>.

A fixed-length number is:

. The ``pass:[{]`` prefix.

. A valid {py3} expression.
+
For a fixed-length number at some source location{nbsp}__**L**__, this
expression may contain the name of any accessible <<label,label>> (not
within a nested group), including the name of a label defined
after{nbsp}__**L**__, as well as the name of any
<<variable-assignment,variable>> known at{nbsp}__**L**__.
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before encoding the number).

. The `:` character.

. An encoding length in bits amongst:
+
--
The expression evaluates to an `int` value::
    `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.

The expression evaluates to a `float` value::
    `32` and `64`.
--

. The `}` suffix.

====
Input:

----
{le} {345:16}
{be} {-0xabcd:32}
----

Output:

----
59 01 ff ff 54 33
----
====

====
Input:

----
{be}

# String length in bits
{8 * (str_end - str_beg) : 16}

# String
<str_beg>
    "hello world!"
<str_end>
----

Output:

----
00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21       ┆ •`hello world!
----
====

====
Input:

----
{20 - ICITTE : 8} * 10
----

Output:

----
14 13 12 11 10 0f 0e 0d 0c 0b
----
====

====
Input:

----
{le}
{2 * 0.0529 : 32}
----

Output:

----
ac ad d8 3d
----
====
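
To double-check the bytes of a fixed-length number by hand, the standard `int.to_bytes()` method and `struct` module are enough. The following is a sketch under assumptions, not the way Normand encodes numbers internally: `encode_fixed()` is a hypothetical helper, and choosing the signed encoding only for negative integers is a simplification.

```python
import struct


def encode_fixed(value, bits: int, big_endian: bool) -> bytes:
    """Encode an `int` or `float` on `bits` bits with the given byte order."""
    if isinstance(value, float):
        # binary32 and binary64 only.
        if bits not in (32, 64):
            raise ValueError("invalid float length: {}".format(bits))
        fmt = (">" if big_endian else "<") + ("f" if bits == 32 else "d")
        return struct.pack(fmt, value)

    if bits not in (8, 16, 24, 32, 40, 48, 56, 64):
        raise ValueError("invalid integer length: {}".format(bits))

    # Two's complement for negative values; raises `OverflowError` when
    # the value doesn't fit.
    order = "big" if big_endian else "little"
    return value.to_bytes(bits // 8, order, signed=value < 0)


print(encode_fixed(345, 16, big_endian=False).hex())
print(encode_fixed(-0xABCD, 32, big_endian=True).hex())
print(encode_fixed(2 * 0.0529, 32, big_endian=False).hex())
```

The three calls correspond to the `{345:16}`, `{-0xabcd:32}`, and `{2 * 0.0529 : 32}` examples above.
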

=== LEB128 integer

An _LEB128 integer_ represents a variable number of bytes encoding an
unsigned or signed integer which is the result of evaluating a {py3}
expression following the https://en.wikipedia.org/wiki/LEB128[LEB128]
format.

An LEB128 integer is:

. The ``pass:[{]`` prefix.

. A valid {py3} expression.
+
For an LEB128 integer at some source location{nbsp}__**L**__, this
expression may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__.
* The name of any <<variable-assignment,variable>> known at{nbsp}__**L**__
  which doesn't, directly or indirectly, refer to a label
  defined after{nbsp}__**L**__.
--
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>> (before encoding the integer).

. The `:` character.

. One of:
+
--
[horizontal]
`uleb128`:: Use the unsigned LEB128 format.
`sleb128`:: Use the signed LEB128 format.
--

. The `}` suffix.

====
Input:

----
{624485 : uleb128}
----

Output:

----
e5 8e 26
----
====

====
Input:

----
aa bb cc dd
<meow>
ee ff
{-981238311 + (meow * -23) : sleb128}
"hello"
----

Output:

----
aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
----
====
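
For readers unfamiliar with LEB128, here is a short, generic {py3} sketch of both variants as commonly described (7{nbsp}bits of payload per byte, with the high bit set on every byte except the last). It's a standalone illustration with hypothetical function names, not an excerpt of `normand.py`.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer using unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 requires a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value == 0:
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)  # more bytes follow


def encode_sleb128(value: int) -> bytes:
    """Encode an integer using signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign
        sign_bit_set = bool(byte & 0x40)
        # Done once the remaining value is pure sign extension.
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


print(encode_uleb128(624485).hex())
print(encode_sleb128(-1993).hex())
```

In the second example above, `meow` is 4, so the encoded value is `-981238311 + (4 * -23)`, that is, -981238403.
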

=== Current offset setting

This special item sets the <<cur-offset,_current offset_>>.

A current offset setting is:

. The `<` prefix.

. A positive integer (hexadecimal starting with `0x` or `0X` accepted)
  which is the new current offset.

. The `>` suffix.

====
Input:

----
       {ICITTE : 8} * 8
<0x61> {ICITTE : 8} * 8
----

Output:

----
00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
----
====

====
Input:

----
aa bb cc dd <meow> ee ff
<12> 11 22 33 <mix> 44 55
{meow : 8} {mix : 8}
----

Output:

----
aa bb cc dd ee ff 11 22 33 44 55 04 0f          ┆ •••••••"3DU••
----
====

=== Label

A _label_ associates a name to the <<cur-offset,current offset>>.

All the labels of a whole Normand input must have unique names.

A label must not share the name of a <<variable-assignment,variable>>.

A label is:

. The `<` prefix.

. A valid {py3} name which is not `ICITTE` (see
  <<fixed-length-number>>, <<leb128-integer>>, and
  <<variable-assignment>> to learn more).

. The `>` suffix.

=== Variable assignment

A _variable assignment_ associates a name to the integral result of an
evaluated {py3} expression.

A variable assignment is:

. The ``pass:[{]`` prefix.

. A valid {py3} name which is not `ICITTE` (see
  <<fixed-length-number>>, <<leb128-integer>>, and
  <<variable-assignment>> to learn more).

. The `=` character.

. A valid {py3} expression.
+
For a variable assignment at some source location{nbsp}__**L**__, this
expression may contain the name of any accessible <<label,label>> (not
within a nested group), including the name of a label defined
after{nbsp}__**L**__, as well as the name of any
<<variable-assignment,variable>> known at{nbsp}__**L**__.
+
The value of the special name `ICITTE` (`int` type) in this expression
is the <<cur-offset,current offset>>.

. The `}` suffix.

====
Input:

----
{mix = 101} {le}
{meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
"yooo" {meow + mix : 16}
----

Output:

----
11 22 2a 33 79 6f 6f 6f 7a 00                   ┆ •"*3yoooz•
----
====

=== Group

A _group_ is a scoped sequence of items.

The <<label,labels>> within a group aren't visible outside of it.

The main purpose of a group is to <<repetition,repeat>> more than a
single item.

A group is:

. The `(` prefix.

. Zero or more items.

. The `)` suffix.

====
Input:

----
((aa bb cc) dd () ee) "leclerc"
----

Output:

----
aa bb cc dd ee 6c 65 63 6c 65 72 63             ┆ •••••leclerc
----
====

====
Input:

----
((aa bb cc) * 3 dd ee) * 5
----

Output:

----
aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
bb cc aa bb cc dd ee
----
====

====
Input:

----
{be}
(
    <str_beg> u16le"sébastien diaz" <str_end>
    {ICITTE - str_beg : 8}
    {(end - str_beg) * 5 : 24}
) * 3
<end>
----

Output:

----
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
----
====

=== Repetition

A _repetition_ represents the bytes of an item repeated a given number
of times.

A repetition is:

. Any item.

. The ``pass:[*]`` character.

. One of:

** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
   which is the number of times to repeat the previous item.

** The ``pass:[{]`` prefix, a valid {py3} expression, and the
   ``pass:[}]`` suffix.
+
For a repetition at some source location{nbsp}__**L**__, this expression
may contain:
+
--
* The name of any <<label,label>> defined before{nbsp}__**L**__ and
  which isn't part of its repeated item.
* The name of any <<variable-assignment,variable>> known
  at{nbsp}__**L**__, which isn't part of its repeated item, and which
  doesn't, directly or indirectly, refer to a label defined
  after{nbsp}__**L**__.
--
+
This expression must not contain the special name `ICITTE`.

====
Input:

----
{end - ICITTE - 1 : 8} * 0x100 <end>
----

Output:

----
ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
----
====

====
Input:

----
{times = 1}
aa bb cc dd
(
    <here>
    (ee ff) * {here + 1}
    11 22 33 * {times}
    {times = times + 1}
) * 3
"coucou!"
----

Output:

----
aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
33 33 63 6f 75 63 6f 75 21                      ┆ 33coucou!
----
====

====
This example shows how to use a repetition as a conditional section
depending on some predefined variable.

Input:

----
aa bb cc dd
(ee ff "meow mix" 00) * {cond}
{be} {-1993:16}
----

Output (`cond` is 0):

----
aa bb cc dd f8 37
----

Output (`cond` is 1):

----
aa bb cc dd ee ff 6d 65 6f 77 20 6d 69 78 00 f8 ┆ ••••••meow mix••
37                                              ┆ 7
----
====
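
The first example of this section works because the repeated item's expression is evaluated again for each repetition, with `ICITTE` being the offset of that repetition. The {py3} sketch below imitates this for 8-bit numbers only. Everything in it is an assumption made for illustration: `repeat_expr()` is a hypothetical helper, the label values are passed by hand (Normand resolves them itself), and the use of `eval()` is a shortcut which says nothing about how `normand.py` evaluates expressions.

```python
def repeat_expr(expr: str, count: int, start_offset: int = 0, names=None) -> bytes:
    """Emulate `{expr : 8} * count` starting at offset `start_offset`."""
    out = bytearray()
    for _ in range(count):
        # `ICITTE` is the offset before encoding this repetition's byte.
        scope = dict(names or {}, ICITTE=start_offset + len(out))
        value = eval(expr, {"__builtins__": {}}, scope)
        # Unsigned 8-bit only here; raises `OverflowError` if out of range.
        out += value.to_bytes(1, "big")
    return bytes(out)


# `{20 - ICITTE : 8} * 10`
print(repeat_expr("20 - ICITTE", 10).hex())

# `{end - ICITTE - 1 : 8} * 0x100 <end>`: `end` is 256 after 256 bytes.
countdown = repeat_expr("end - ICITTE - 1", 0x100, names={"end": 0x100})
print(countdown[:4].hex(), countdown[-4:].hex())
```
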

== Command-line tool

If you <<install-normand,installed>> the `normand` package, then you
can use the `normand` command-line tool:

----
$ normand <<< '"ma gang de malades"' | hexdump -C
----

----
00000000  6d 61 20 67 61 6e 67 20  64 65 20 6d 61 6c 61 64  |ma gang de malad|
00000010  65 73                                             |es|
----

If you copy the `normand.py` module to your own project, then you can
run the module itself:

----
$ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
----

----
00000000  6d 61 20 67 61 6e 67 20  64 65 20 6d 61 6c 61 64  |ma gang de malad|
00000010  65 73                                             |es|
----

Without a path argument, the `normand` tool reads from the standard
input.

The `normand` tool prints the generated binary data to the standard
output.

Various options control the initial <<state,state>> of the processor:
use the `--help` option to learn more.


== {py3} API

The whole `normand` package/module API is:

[source,python]
----
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


class TextLoc:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


class ParseError(RuntimeError):
    # Source text location.
    @property
    def text_loc(self) -> TextLoc:
        ...


SymbolsT = typing.Dict[str, int]


class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> SymbolsT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> SymbolsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


def parse(normand: str,
          init_variables: typing.Optional[SymbolsT] = None,
          init_labels: typing.Optional[SymbolsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
----

The `normand` parameter is the actual <<learn-normand,Normand input>>
while the other parameters control the initial <<state,state>>.

The `parse()` function raises a `ParseError` instance should it fail to
parse the `normand` string for any reason.
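
As a usage sketch of the API listed above, the following hypothetical `generate()` wrapper calls `parse()` with an initial variable and reports a failure with its source location. It relies only on the names shown in the listing and assumes that converting a `ParseError` to a string yields a readable message, which the listing doesn't guarantee. The `normand` module is imported inside the function so that defining the wrapper doesn't require Normand to be installed.

```python
def generate(text: str, variables=None, offset: int = 0) -> bytes:
    """Return the bytes generated from the Normand input `text`."""
    import normand  # requires the installed package or a local `normand.py`

    try:
        res = normand.parse(text, init_variables=variables, init_offset=offset)
    except normand.ParseError as exc:
        loc = exc.text_loc
        # Assumption: `str(exc)` is a human-readable description.
        raise ValueError("{}:{} - {}".format(loc.line_no, loc.col_no, exc))

    # `res.variables`, `res.labels`, `res.offset`, and `res.byte_order`
    # hold the final state, should the caller need to chain inputs.
    return bytes(res.data)
```

For example, `generate('aa bb (ee ff) * {cond}', variables={"cond": 1})` mirrors the conditional section example of <<repetition>>.
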

== Development

Normand is a https://python-poetry.org/[Poetry] project.

To develop it, install it through Poetry and enter the virtual
environment:

----
$ poetry install
$ poetry shell
$ normand <<< '"lol" * 10 0a'
----

`normand.py` is processed by:

* https://microsoft.github.io/pyright/[Pyright]
* https://github.com/psf/black[Black]
* https://pycqa.github.io/isort/[isort]

=== Testing

Use https://docs.pytest.org/[pytest] to test Normand once the package is
part of your virtual environment, for example:

----
$ poetry install
$ poetry run pip3 install pytest
$ poetry run pytest
----

The `pytest` project is currently not a development dependency in
`pyproject.toml` due to backward compatibility issues with
Python{nbsp}3.4.

In the `tests` directory, each `*.nt` file is a test. The file name
prefix indicates what it's meant to test:

`pass-`::
    Everything above the `---` line is the valid Normand input
    to test.
+
Everything below the `---` line is the expected data
(whitespace-separated hexadecimal bytes).

`fail-`::
    Everything above the `---` line is the invalid Normand input
    to test.
+
Everything below the `---` line is the expected error message having
this form:
+
----
LINE:COL - MESSAGE
----
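
To illustrate the `*.nt` layout described above, here is a small {py3} sketch which splits the text of such a file at its `---` line and decodes the expected bytes of a `pass-` test. It isn't the project's actual test harness: the helper names are hypothetical, and splitting at the first line which is exactly `---` is an assumption.

```python
def split_test_file(text: str):
    """Split `.nt` file contents into (Normand input, expected part)."""
    lines = text.splitlines()
    # Assumption: the separator is the first line being exactly `---`;
    # `list.index()` raises `ValueError` if there's none.
    sep = lines.index("---")
    return "\n".join(lines[:sep]), "\n".join(lines[sep + 1:])


def expected_bytes(expected: str) -> bytes:
    """Decode whitespace-separated hexadecimal bytes (`pass-` tests)."""
    return bytes.fromhex("".join(expected.split()))


sample = 'aa bb "yo"\n---\naa bb\n79 6f\n'
normand_input, expected = split_test_file(sample)
print(repr(normand_input), expected_bytes(expected).hex())
```

For a `fail-` test, the second element returned by `split_test_file()` would be the expected `LINE:COL - MESSAGE` text instead.
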

=== Contributing

Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
for code review.

To report a bug, https://github.com/efficios/normand/issues/new[create a
GitHub issue].