Commit | Line | Data |
---|---|---|
5d026c72 KB |
1 | The QNX6 Filesystem |
2 | =================== | |
3 | ||
4 | The qnx6fs is used by newer QNX operating system versions (e.g. Neutrino). | |
5 | It was introduced in QNX 6.4.0 and has been the default since 6.4.1. | |
6 | ||
7 | Option | |
8 | ====== | |
9 | ||
10 | mmi_fs Mount filesystem as used for example by Audi MMI 3G system | |
11 | ||
12 | Specification | |
13 | ============= | |
14 | ||
15 | qnx6fs shares many properties with traditional Unix filesystems. It has the | |
16 | concepts of blocks, inodes and directories. | |
17 | On QNX it is possible to create little endian and big endian qnx6 filesystems. | |
18 | This feature makes it possible to create and use a filesystem whose | |
19 | endianness matches the target platform (QNX is used on quite a range of | |
c94bed8e | 20 | embedded systems) even when the host has a different endianness. |
5d026c72 KB |
21 | The Linux driver handles both endiannesses (LE and BE) transparently. |
22 | ||
23 | Blocks | |
24 | ------ | |
25 | ||
26 | The space in the device or file is split up into blocks. These are a fixed | |
27 | size of 512, 1024, 2048 or 4096 bytes, which is decided when the filesystem is | |
28 | created. | |
c94bed8e | 29 | Block pointers are 32 bits wide, so the maximum space that can be addressed is |
5d026c72 KB |
30 | 2^32 * 4096 bytes, or 16 TiB. |
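
The arithmetic above can be checked with a short sketch. The function and constant names are illustrative only and are not taken from the QNX or Linux sources.

```python
# Sketch: maximum addressable space per blocksize, assuming 32-bit
# block pointers as described above. All names are hypothetical.
VALID_BLOCKSIZES = (512, 1024, 2048, 4096)
POINTER_BITS = 32


def max_fs_bytes(blocksize):
    """Return the number of bytes addressable with 32-bit block pointers."""
    if blocksize not in VALID_BLOCKSIZES:
        raise ValueError("unsupported blocksize: %d" % blocksize)
    return (1 << POINTER_BITS) * blocksize


if __name__ == "__main__":
    for bs in VALID_BLOCKSIZES:
        print(bs, max_fs_bytes(bs) // 2**40, "TiB")
```

Only the largest blocksize reaches the 16 TiB limit; smaller blocksizes scale the limit down proportionally.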
31 | ||
32 | The superblocks | |
33 | --------------- | |
34 | ||
35 | The superblock contains all global information about the filesystem. | |
36 | Each qnx6fs has two superblocks, each one having a 64-bit serial number. | |
37 | That serial number is used to identify the "active" superblock. | |
38 | In write mode, with each new snapshot (after each synchronous write), the | |
39 | serial of the new master superblock is increased (old superblock serial + 1). | |
40 | ||
41 | So basically the snapshot functionality is realized by an atomic final | |
42 | update of the serial number. Before updating that serial, all modifications | |
43 | are done by copying all modified blocks during that specific write request | |
44 | (or period) and building up a new (stable) filesystem structure under the | |
45 | inactive superblock. | |
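
As a rough model of the serial-number scheme described above, the following sketch picks the active superblock and simulates the final atomic step of a snapshot. The `Superblock` tuple and both function names are hypothetical. Treating equal serials as "first superblock wins" is an assumption, since the text does not cover that case.

```python
from collections import namedtuple

# Hypothetical, heavily simplified superblock: only the serial matters here.
Superblock = namedtuple("Superblock", ["name", "serial"])


def pick_active(sb1, sb2):
    """Return the superblock with the higher serial number.

    Assumption: on equal serials the first superblock is preferred.
    """
    return sb1 if sb1.serial >= sb2.serial else sb2


def commit_snapshot(active, inactive):
    """Model the final step: the inactive superblock gets active serial + 1."""
    return inactive._replace(serial=active.serial + 1)


if __name__ == "__main__":
    sb1 = Superblock("sb1", 10)
    sb2 = Superblock("sb2", 9)
    print(pick_active(sb1, sb2).name)   # superblock in use before the write
    sb2 = commit_snapshot(sb1, sb2)
    print(pick_active(sb1, sb2).name)   # superblock in use after the commit
```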
46 | ||
47 | Each superblock holds a set of root inodes for the different filesystem | |
48 | parts (inode, bitmap and long filenames). | |
49 | Each of these root nodes holds information like total size of the stored | |
c94bed8e MI |
50 | data and the addressing levels in that specific tree. |
51 | If the level value is 0, up to 16 direct blocks can be addressed by each | |
5d026c72 | 52 | node. |
c94bed8e MI |
53 | Level 1 adds an additional indirect addressing level where each indirect |
54 | addressing block holds up to blocksize / 4 block pointers (4 bytes each). | |
55 | Level 2 adds another indirect addressing block level (for 1024 byte blocks | |
56 | that is already 16 * 256 * 256 = 1048576 blocks addressable by the tree). | |
5d026c72 KB |
57 | |
58 | Unused block pointers are always set to ~0 - regardless of root node, | |
c94bed8e | 59 | indirect addressing blocks or inodes. |
5d026c72 KB |
60 | Data leaves are always on the lowest level. So no data is stored on upper |
61 | tree levels. | |
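
The capacity of such an addressing tree follows directly from the description above: 16 pointers in the root node, and blocksize / 4 pointers per indirect block for each additional level. A minimal sketch, with illustrative names:

```python
DIRECT_POINTERS = 16          # pointers held by a root node or inode
POINTER_SIZE = 4              # bytes per 32-bit block pointer
UNUSED_POINTER = 0xFFFFFFFF   # ~0 as a 32-bit value marks an unused pointer


def max_tree_blocks(blocksize, levels):
    """Return how many data blocks a tree with `levels` indirect levels holds."""
    pointers_per_block = blocksize // POINTER_SIZE
    return DIRECT_POINTERS * pointers_per_block ** levels


if __name__ == "__main__":
    for level in range(3):
        print(level, max_tree_blocks(1024, level))
```

With a blocksize of 1024 bytes, level 2 gives the 16 * 256 * 256 figure quoted in the text.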
62 | ||
63 | The first superblock is located at 0x2000 (0x2000 is the bootblock size). | |
64 | On the Audi MMI 3G, the first superblock starts directly at byte 0. | |
65 | The second superblock position can either be calculated from the superblock | |
66 | information (total number of filesystem blocks) or by taking the highest | |
c94bed8e | 67 | device address, zeroing the last 3 bytes and then subtracting 0x1000 from |
5d026c72 KB |
68 | that address. |
69 | ||
70 | 0x1000 is the size reserved for each superblock - regardless of the | |
71 | blocksize of the filesystem. | |
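
The first of the two methods (using the total number of filesystem blocks) might be sketched as follows. This assumes that the second superblock directly follows the last filesystem block, which the text above does not state explicitly. The function names are hypothetical.

```python
BOOTBLOCK_SIZE = 0x2000    # size of the bootblock on a normal qnx6fs
SUPERBLOCK_AREA = 0x1000   # space reserved for each superblock


def first_superblock_offset(mmi_fs=False):
    """Byte offset of the first superblock (0 for the Audi MMI 3G layout)."""
    return 0 if mmi_fs else BOOTBLOCK_SIZE


def second_superblock_offset(num_blocks, blocksize, mmi_fs=False):
    """Byte offset of the second superblock.

    Assumption: it sits right behind the last filesystem block, i.e. after
    the first superblock area plus num_blocks * blocksize bytes of blocks.
    """
    start = first_superblock_offset(mmi_fs)
    return start + SUPERBLOCK_AREA + num_blocks * blocksize


if __name__ == "__main__":
    print(hex(second_superblock_offset(1000, 1024)))
```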
72 | ||
73 | Inodes | |
74 | ------ | |
75 | ||
76 | Each object in the filesystem is represented by an inode (index node). | |
77 | The inode structure contains pointers to the filesystem blocks which contain | |
78 | the data held in the object and all of the metadata about an object except | |
79 | its long name (filenames longer than 27 characters). | |
80 | The metadata about an object includes the permissions, owner, group, flags, | |
81 | size, number of blocks used, access time, change time and modification time. | |
82 | ||
83 | The object mode field is in POSIX format (which makes things easier). | |
84 | ||
85 | There are also pointers to the first 16 blocks, if the object data can be | |
c94bed8e MI |
86 | addressed with 16 direct blocks. |
87 | For more than 16 blocks, indirect addressing in the form of another tree is | |
5d026c72 KB |
88 | used (the scheme is the same as the one used for the superblock root nodes). |
89 | ||
90 | The file size is stored as a 64-bit value. Inode counting starts at 1 (whilst | |
91 | long filename inodes start at 0). | |
92 | ||
93 | Directories | |
94 | ----------- | |
95 | ||
96 | A directory is a filesystem object and has an inode just like a file. | |
97 | It is a specially formatted file containing records which associate each | |
98 | name with an inode number. | |
99 | '.' inode number points to the directory inode | |
100 | '..' inode number points to the parent directory inode | |
101 | Each filename record additionally has a filename length field. | |
102 | ||
103 | Long filenames and long subdirectory names are a special case. | |
104 | Their directory record has the filename length field set to 0xff and | |
105 | additionally stores the long filename inode number. | |
106 | With that longfilename inode number, the longfilename tree can be walked | |
107 | starting with the superblock longfilename root node pointers. | |
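
To illustrate how a reader could tell short and long records apart, here is a minimal parsing sketch. The exact record layout is an assumption and is not given in this document: a 32-byte record with a 4-byte inode number, a 1-byte length field and up to 27 name bytes, and, for long names, the long filename inode number at byte offset 8. All names are hypothetical.

```python
import struct

SHORT_NAME_MAX = 27       # longest name stored directly in the record
LONG_NAME_MARKER = 0xFF   # length value that flags a long filename record


def parse_dir_record(record, byteorder="<"):
    """Parse one directory record (layout assumed, see lead-in).

    byteorder is "<" for little endian and ">" for big endian filesystems.
    """
    inode, length = struct.unpack_from(byteorder + "IB", record, 0)
    if length == LONG_NAME_MARKER:
        # Assumed layout: 3 padding bytes, then the long filename inode.
        (long_inode,) = struct.unpack_from(byteorder + "I", record, 8)
        return {"inode": inode, "long_inode": long_inode}
    if length > SHORT_NAME_MAX:
        raise ValueError("invalid filename length: %d" % length)
    return {"inode": inode, "name": record[5:5 + length]}


if __name__ == "__main__":
    short = struct.pack("<IB27s", 5, 4, b"file")
    print(parse_dir_record(short))
```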
108 | ||
109 | Special files | |
110 | ------------- | |
111 | ||
112 | Symbolic links are also filesystem objects with inodes. They have a specific | |
113 | bit in the inode mode field identifying them as symbolic links. | |
114 | The directory entry file inode pointer points to the target file inode. | |
115 | ||
116 | Hard links have an inode and a directory entry, but with a specific mode bit | |
117 | set, no block pointers and the directory file record pointing to the target | |
118 | file inode. | |
119 | ||
120 | Character and block special devices do not exist in QNX as those files | |
c94bed8e | 121 | are handled by the QNX kernel/drivers and created in /dev independently of the |
5d026c72 KB |
122 | underlying filesystem. |
123 | ||
124 | Long filenames | |
125 | -------------- | |
126 | ||
c94bed8e | 127 | Long filenames are stored in a separate addressing tree. The starting point |
5d026c72 KB |
128 | is the longfilename root node in the active superblock. |
129 | Each data block (tree leaf) holds one long filename. That filename is | |
130 | limited to 510 bytes. The first two bytes of the block are used as a length | |
131 | field for the actual filename. | |
132 | Since that structure has to fit into every allowed blocksize, including the | |
133 | smallest one of 512 bytes, the stored filename is limited to 510 bytes. | |
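
Reading a long filename from its data block could look like the sketch below: a 2-byte length field followed by the name. The function name is hypothetical, and the byte order of the length field is assumed to follow the filesystem endianness.

```python
import struct

LENGTH_FIELD_SIZE = 2   # bytes used for the length field
LONG_NAME_MAX = 510     # 512 (smallest blocksize) minus the length field


def parse_long_name_block(block, byteorder="<"):
    """Return the filename stored in a long filename data block."""
    (length,) = struct.unpack_from(byteorder + "H", block, 0)
    if length > LONG_NAME_MAX:
        raise ValueError("long filename too long: %d" % length)
    return block[LENGTH_FIELD_SIZE:LENGTH_FIELD_SIZE + length]


if __name__ == "__main__":
    name = b"a-rather-long-file-name-that-exceeds-27-characters.txt"
    block = struct.pack("<H", len(name)) + name
    block += bytes(512 - len(block))   # pad to one 512-byte block
    print(parse_long_name_block(block))
```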
134 | ||
135 | Bitmap | |
136 | ------ | |
137 | ||
138 | The qnx6fs filesystem allocation bitmap is stored in a tree under the bitmap | |
139 | root node in the superblock and each bit in the bitmap represents one | |
140 | filesystem block. | |
141 | The first block is block 0, which starts 0x1000 after superblock start. | |
142 | So for a normal qnx6fs 0x3000 (bootblock + superblock) is the physical | |
143 | address at which block 0 is located. | |
144 | ||
145 | Bits at the end of the last bitmap block are set to 1 if the device is | |
146 | smaller than the address space covered by the bitmap. | |
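
The block numbering and the padding at the end of the bitmap can be sketched as follows. The names are illustrative. The only inputs are the facts stated above: block 0 starts 0x1000 after the superblock start, and one bitmap bit represents one filesystem block.

```python
BOOTBLOCK_SIZE = 0x2000    # first superblock start on a normal qnx6fs
SUPERBLOCK_AREA = 0x1000   # space reserved for each superblock


def block_address(block, blocksize, superblock_start=BOOTBLOCK_SIZE):
    """Physical byte address of a filesystem block."""
    return superblock_start + SUPERBLOCK_AREA + block * blocksize


def bitmap_padding_bits(num_blocks, blocksize):
    """Number of trailing bits in the last bitmap block that are set to 1."""
    bits_per_block = blocksize * 8
    bitmap_blocks = (num_blocks + bits_per_block - 1) // bits_per_block
    return bitmap_blocks * bits_per_block - num_blocks


if __name__ == "__main__":
    print(hex(block_address(0, 1024)))
    print(bitmap_padding_bits(10000, 1024))
```

For the Audi MMI 3G layout, `superblock_start=0` would be passed instead of the default.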
147 | ||
148 | Bitmap system area | |
149 | ------------------ | |
150 | ||
f884ab15 | 151 | The bitmap itself is divided into three parts. |
9ed354b7 | 152 | First the system area, which is split into two halves. |
5d026c72 KB |
153 | Then userspace. |
154 | ||
155 | The requirement for a static, fixed preallocated system area comes from how | |
156 | qnx6fs deals with writes. | |
157 | Each superblock has its own half of the system area. So superblock #1 | |
158 | always uses blocks from the lower half whilst superblock #2 just writes to | |
159 | blocks represented by the upper half bitmap system area bits. | |
160 | ||
161 | Bitmap blocks, Inode blocks and indirect addressing blocks for those two | |
162 | tree structures are treated as system blocks. | |
163 | ||
164 | The rationale behind that is that a write request can work on a new snapshot | |
165 | (system area of the inactive, i.e. lower serial numbered, superblock) while | |
166 | at the same time there is still a complete stable filesystem structure in the | |
167 | other half of the system area. | |
168 | ||
169 | When writing is finished (a sync write is completed, the maximum sync leap | |
170 | time is reached or a filesystem sync is requested), the serial of the | |
171 | previously inactive superblock is atomically increased and the fs switches | |
172 | over to that superblock, which is then declared stable. | |
173 | ||
174 | For all data outside the system area, blocks are just copied while writing. |