Commit | Line | Data |
---|---|---|
8f0aa2f2 DH |
1 | ==================================== |
2 | SLOW WORK ITEM EXECUTION THREAD POOL | |
3 | ==================================== | |
4 | ||
5 | By: David Howells <dhowells@redhat.com> | |
6 | ||
7 | The slow work item execution thread pool is a pool of threads for performing | |
8 | things that take a relatively long time, such as making mkdir calls. | |
9 | Typically, when processing something, these items will spend a lot of time | |
10 | blocking a thread on I/O, thus making that thread unavailable for doing other | |
11 | work. | |
12 | ||
13 | The standard workqueue model is unsuitable for this class of work item as that | |
14 | limits the owner to a single thread or a single thread per CPU. For some | |
15 | tasks, however, more threads - or fewer - are required. | |
16 | ||
17 | There is just one pool per system. It contains no threads unless something | |
18 | wants to use it - and that something must register its interest first. When | |
19 | the pool is active, the number of threads it contains is dynamic, varying | |
20 | between a maximum and minimum setting, depending on the load. | |
21 | ||
22 | ||
23 | ==================== | |
24 | CLASSES OF WORK ITEM | |
25 | ==================== | |
26 | ||
27 | This pool supports two classes of work items: | |
28 | ||
29 | (*) Slow work items. | |
30 | ||
31 | (*) Very slow work items. | |
32 | ||
33 | The former are expected to finish much more quickly than the latter. | |
34 | ||
35 | An operation of the very slow class may do a batch combination of several | |
36 | lookups, mkdirs, and a create for instance. | |
37 | ||
38 | An operation of the ordinarily slow class may, for example, write stuff or | |
39 | expand files, provided the time taken to do so isn't too long. | |
40 | ||
41 | Operations of both types may sleep during execution, thus tying up the thread | |
42 | loaned to them. | |
43 | ||
6b8268b1 JA |
44 | A further class of work item is available, based on the slow work item class: |
45 | ||
46 | (*) Delayed slow work items. | |
47 | ||
48 | These are slow work items that have a timer to defer queueing of the item for | |
49 | a while. | |
50 | ||
8f0aa2f2 DH |
51 | |
52 | THREAD-TO-CLASS ALLOCATION | |
53 | -------------------------- | |
54 | ||
55 | Not all the threads in the pool are available to work on very slow work items. | |
56 | The number will be between one and one fewer than the number of active threads. | |
57 | This is configurable (see the "Pool Configuration" section). | |
58 | ||
59 | All the threads are available to work on ordinarily slow work items, but a | |
60 | percentage of the threads will prefer to work on very slow work items. | |
61 | ||
62 | The configuration ensures that at least one thread will be available to work on | |
63 | very slow work items, and at least one thread will be available that won't work | |
64 | on very slow work items at all. | |
65 | ||
66 | ||
67 | ===================== | |
68 | USING SLOW WORK ITEMS | |
69 | ===================== | |
70 | ||
71 | Firstly, a module or subsystem wanting to make use of slow work items must | |
72 | register its interest: | |
73 | ||
3d7a641e | 74 | int ret = slow_work_register_user(struct module *module); |
8f0aa2f2 | 75 | |
3d7a641e DH |
76 | This will return 0 if successful, or a -ve error upon failure. The module |
77 | pointer should be the module interested in using this facility (almost | |
78 | certainly THIS_MODULE). | |
8f0aa2f2 DH |
79 | |
80 | ||
81 | Slow work items may then be set up by: | |
82 | ||
83 | (1) Declaring a slow_work struct type variable: | |
84 | ||
85 | #include <linux/slow-work.h> | |
86 | ||
87 | struct slow_work myitem; | |
88 | ||
89 | (2) Declaring the operations to be used for this item: | |
90 | ||
91 | struct slow_work_ops myitem_ops = { | |
92 | .get_ref = myitem_get_ref, | |
93 | .put_ref = myitem_put_ref, | |
94 | .execute = myitem_execute, | |
95 | }; | |
96 | ||
97 | [*] For a description of the ops, see section "Item Operations". | |
98 | ||
99 | (3) Initialising the item: | |
100 | ||
101 | slow_work_init(&myitem, &myitem_ops); | |
102 | ||
6b8268b1 JA |
103 | or: |
104 | ||
105 | delayed_slow_work_init(&myitem, &myitem_ops); | |
106 | ||
8f0aa2f2 DH |
107 | or: |
108 | ||
109 | vslow_work_init(&myitem, &myitem_ops); | |
110 | ||
111 | depending on its class. | |
112 | ||
113 | A suitably set up work item can then be enqueued for processing: | |
114 | ||
115 | int ret = slow_work_enqueue(&myitem); | |
116 | ||
117 | This will return a -ve error if the thread pool is unable to gain a reference | |
6b8268b1 JA |
118 | on the item, 0 otherwise, or (for delayed work): |
119 | ||
120 | int ret = delayed_slow_work_enqueue(&myitem, my_jiffy_delay); | |
8f0aa2f2 DH |
121 | |
122 | ||
123 | The items are reference counted, so there ought to be no need for a flush | |
01609502 JA |
124 | operation. But as the reference counting is optional, means to cancel |
125 | existing work items are also included: | |
126 | ||
127 | cancel_slow_work(&myitem); | |
6b8268b1 | 128 | cancel_delayed_slow_work(&myitem); |
01609502 JA |
129 | |
130 | can be used to cancel pending work. The above cancel functions wait for | |
131 | existing work to finish executing (or prevent its execution, depending on | |
132 | timing). | |
133 | ||
134 | ||
135 | When all a module's slow work items have been processed, and the | |
8f0aa2f2 DH |
136 | module has no further interest in the facility, it should unregister its |
137 | interest: | |
138 | ||
3d7a641e DH |
139 | slow_work_unregister_user(struct module *module); |
140 | ||
141 | The module pointer is used to wait for all outstanding work items for that | |
142 | module before completing the unregistration. This prevents the put_ref() code | |
143 | from being taken away before it completes. module should almost certainly be | |
144 | THIS_MODULE. | |
8f0aa2f2 DH |
145 | |
146 | ||
147 | =============== | |
148 | ITEM OPERATIONS | |
149 | =============== | |
150 | ||
151 | Each work item requires a table of operations of type struct slow_work_ops. | |
8fba10a4 DH |
152 | Only ->execute() is required; the getting and putting of a reference and the |
153 | describing of an item are all optional. | |
8f0aa2f2 DH |
154 | |
155 | (*) Get a reference on an item: | |
156 | ||
157 | int (*get_ref)(struct slow_work *work); | |
158 | ||
159 | This allows the thread pool to attempt to pin an item by getting a | |
160 | reference on it. This function should return 0 if the reference was | |
161 | granted, or a -ve error otherwise. If an error is returned, | |
162 | slow_work_enqueue() will fail. | |
163 | ||
164 | The reference is held whilst the item is queued and whilst it is being | |
165 | executed. The item may then be requeued with the same reference held, or | |
166 | the reference will be released. | |
167 | ||
168 | (*) Release a reference on an item: | |
169 | ||
170 | void (*put_ref)(struct slow_work *work); | |
171 | ||
172 | This allows the thread pool to unpin an item by releasing the reference on | |
173 | it. The thread pool will not touch the item again once this has been | |
174 | called. | |
175 | ||
176 | (*) Execute an item: | |
177 | ||
178 | void (*execute)(struct slow_work *work); | |
179 | ||
180 | This should perform the work required of the item. It may sleep, it may | |
181 | perform disk I/O and it may wait for locks. | |
182 | ||
8fba10a4 DH |
183 | (*) View an item through /proc: |
184 | ||
185 | void (*desc)(struct slow_work *work, struct seq_file *m); | |
186 | ||
187 | If supplied, this should print to 'm' a small string describing the work | |
188 | the item is to do. This should be no more than about 40 characters, and | |
189 | shouldn't include a newline character. | |
190 | ||
191 | See the 'Viewing executing and queued items' section below. | |
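
The way the three main operations fit together can be sketched in userspace. This is a minimal mock, not the kernel implementation: the struct layouts are assumed shapes, and struct my_object, to_my_object() and mock_pool_run() are hypothetical names introduced for illustration. It only shows the order described above: the pool pins the item with get_ref(), calls execute(), and then unpins it with put_ref().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; shapes assumed for illustration. */
struct slow_work;

struct slow_work_ops {
	int  (*get_ref)(struct slow_work *work);	/* optional */
	void (*put_ref)(struct slow_work *work);	/* optional */
	void (*execute)(struct slow_work *work);	/* required */
};

struct slow_work {
	const struct slow_work_ops *ops;
};

/* Hypothetical owner object that embeds its work item. */
struct my_object {
	int refcount;	/* a real module would need an atomic counter */
	int dying;	/* set once the object must not be pinned again */
	int executed;	/* counts calls to execute() */
	struct slow_work work;
};

/* Recover the owner from the embedded work item. */
static struct my_object *to_my_object(struct slow_work *work)
{
	return (struct my_object *)((char *)work -
				    offsetof(struct my_object, work));
}

/* Grant a reference, or refuse with a -ve error if the object is dying. */
static int myitem_get_ref(struct slow_work *work)
{
	struct my_object *obj = to_my_object(work);

	if (obj->dying)
		return -EBUSY;
	obj->refcount++;
	return 0;
}

/* Drop the reference taken by myitem_get_ref(). */
static void myitem_put_ref(struct slow_work *work)
{
	to_my_object(work)->refcount--;
}

/* The work itself; here it merely records that it ran. */
static void myitem_execute(struct slow_work *work)
{
	to_my_object(work)->executed++;
}

static const struct slow_work_ops myitem_ops = {
	.get_ref = myitem_get_ref,
	.put_ref = myitem_put_ref,
	.execute = myitem_execute,
};

/* Mock of the pool's handling of one item: pin, execute, unpin. */
static int mock_pool_run(struct slow_work *work)
{
	const struct slow_work_ops *ops = work->ops;
	int ret;

	assert(ops && ops->execute);
	if (ops->get_ref) {
		ret = ops->get_ref(work);
		if (ret < 0)
			return ret;	/* enqueueing fails, nothing runs */
	}
	ops->execute(work);
	if (ops->put_ref)
		ops->put_ref(work);
	return 0;
}
```

In this mock the reference count is back at its starting value once mock_pool_run() returns, and an item whose get_ref() fails is never executed.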
192 | ||
8f0aa2f2 DH |
193 | |
194 | ================== | |
195 | POOL CONFIGURATION | |
196 | ================== | |
197 | ||
198 | The slow-work thread pool has a number of configurables: | |
199 | ||
200 | (*) /proc/sys/kernel/slow-work/min-threads | |
201 | ||
202 | The minimum number of threads that should be in the pool whilst it is in | |
203 | use. This may be anywhere between 2 and max-threads. | |
204 | ||
205 | (*) /proc/sys/kernel/slow-work/max-threads | |
206 | ||
207 | The maximum number of threads that should be in the pool. This may be | |
208 | anywhere between min-threads and 255 or NR_CPUS * 2, whichever is greater. | |
209 | ||
210 | (*) /proc/sys/kernel/slow-work/vslow-percentage | |
211 | ||
212 | The percentage of active threads in the pool that may be used to execute | |
213 | very slow work items. This may be between 1 and 99. The resultant number | |
214 | is bounded to between 1 and one fewer than the number of active threads. | |
215 | This ensures there is always at least one thread that can process very | |
216 | slow work items, and always at least one thread that won't. | |
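
The bounds described above can be expressed as a little arithmetic. The following is a sketch under stated assumptions rather than the kernel's own code: vslow_thread_limit() and max_threads_ceiling() are hypothetical helper names, and the first assumes at least two active threads, in line with the lower bound on min-threads.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many of the active threads may run very slow
 * items.  The percentage is applied and the result is then clamped to
 * between 1 and one fewer than the number of active threads.
 */
static int vslow_thread_limit(int active_threads, int vslow_percentage)
{
	int limit;

	assert(active_threads >= 2);
	assert(vslow_percentage >= 1 && vslow_percentage <= 99);

	limit = active_threads * vslow_percentage / 100;
	if (limit < 1)
		limit = 1;			/* always one vslow-capable thread */
	if (limit > active_threads - 1)
		limit = active_threads - 1;	/* always one thread that isn't */
	return limit;
}

/*
 * Hypothetical helper: the largest value max-threads may take, that is
 * 255 or twice the number of CPUs, whichever is greater.
 */
static int max_threads_ceiling(int nr_cpus)
{
	int twice = nr_cpus * 2;

	return twice > 255 ? twice : 255;
}
```

For instance, with 8 active threads and a vslow-percentage of 50, four threads may take very slow items; with only 2 active threads the limit is 1 whatever the percentage.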
8fba10a4 DH |
217 | |
218 | ||
219 | ================================== | |
220 | VIEWING EXECUTING AND QUEUED ITEMS | |
221 | ================================== | |
222 | ||
223 | If CONFIG_SLOW_WORK_PROC is enabled, a proc file is made available: | |
224 | ||
225 | /proc/slow_work_rq | |
226 | ||
227 | through which the list of work items being executed and the queues of items to | |
228 | be executed may be viewed. The owner of a work item is given the chance to | |
229 | add some information of its own. | |
230 | ||
231 | The contents look something like the following: | |
232 | ||
233 | THR PID   ITEM ADDR        FL MARK  DESC | |
234 | === ===== ================ == ===== ========== | |
235 |   0  3005 ffff880023f52348  a 952ms FSC: OBJ17d3: LOOK | |
236 |   1  3006 ffff880024e33668  2 160ms FSC: OBJ17e5 OP60d3b: Write1/Store fl=2 | |
237 |   2  3165 ffff8800296dd180  a 424ms FSC: OBJ17e4: LOOK | |
238 |   3  4089 ffff8800262c8d78  a 212ms FSC: OBJ17ea: CRTN | |
239 |   4  4090 ffff88002792bed8  2 388ms FSC: OBJ17e8 OP60d36: Write1/Store fl=2 | |
240 |   5  4092 ffff88002a0ef308  2 388ms FSC: OBJ17e7 OP60d2e: Write1/Store fl=2 | |
241 |   6  4094 ffff88002abaf4b8  2 132ms FSC: OBJ17e2 OP60d4e: Write1/Store fl=2 | |
242 |   7  4095 ffff88002bb188e0  a 388ms FSC: OBJ17e9: CRTN | |
243 | vsq     - ffff880023d99668  1 308ms FSC: OBJ17e0 OP60f91: Write1/EnQ fl=2 | |
244 | vsq     - ffff8800295d1740  1 212ms FSC: OBJ16be OP4d4b6: Write1/EnQ fl=2 | |
245 | vsq     - ffff880025ba3308  1 160ms FSC: OBJ179a OP58dec: Write1/EnQ fl=2 | |
246 | vsq     - ffff880024ec83e0  1 160ms FSC: OBJ17ae OP599f2: Write1/EnQ fl=2 | |
247 | vsq     - ffff880026618e00  1 160ms FSC: OBJ17e6 OP60d33: Write1/EnQ fl=2 | |
248 | vsq     - ffff880025a2a4b8  1 132ms FSC: OBJ16a2 OP4d583: Write1/EnQ fl=2 | |
249 | vsq     - ffff880023cbe6d8  9 212ms FSC: OBJ17eb: LOOK | |
250 | vsq     - ffff880024d37590  9 212ms FSC: OBJ17ec: LOOK | |
251 | vsq     - ffff880027746cb0  9 212ms FSC: OBJ17ed: LOOK | |
252 | vsq     - ffff880024d37ae8  9 212ms FSC: OBJ17ee: LOOK | |
253 | vsq     - ffff880024d37cb0  9 212ms FSC: OBJ17ef: LOOK | |
254 | vsq     - ffff880025036550  9 212ms FSC: OBJ17f0: LOOK | |
255 | vsq     - ffff8800250368e0  9 212ms FSC: OBJ17f1: LOOK | |
256 | vsq     - ffff880025036aa8  9 212ms FSC: OBJ17f2: LOOK | |
257 | ||
258 | In the 'THR' column, executing items show the thread they're occupying and | |
259 | queued items indicate which queue they're on. 'PID' shows the process ID of | |
260 | a slow-work thread that's executing something. 'FL' shows the work item flags. | |
261 | 'MARK' indicates how long since an item was queued or began executing. Lastly, | |
262 | the 'DESC' column permits the owner of an item to give some information. | |
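
A userspace tool wanting to consume this file might split each row into its columns roughly as follows. This is only a sketch based on the sample output above: struct rq_entry, parse_rq_line() and rq_entry_is_queued() are hypothetical names, and it assumes that the flags are printed in hexadecimal and that the mark always carries an "ms" suffix, neither of which the text above guarantees.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed row of /proc/slow_work_rq (hypothetical structure). */
struct rq_entry {
	char thr[8];			/* thread index, or queue name */
	char pid[8];			/* PID, or "-" for a queued item */
	unsigned long long addr;	/* work item address */
	unsigned int flags;		/* assumed to be hexadecimal */
	unsigned int mark_ms;		/* time since queued or started */
	char desc[64];			/* owner-supplied description */
};

/*
 * Parse one row.  Returns 0 on success, or -1 if the line does not have
 * the expected shape (as is the case for the two header lines).
 */
static int parse_rq_line(const char *line, struct rq_entry *e)
{
	int n;

	assert(line && e);
	n = sscanf(line, "%7s %7s %llx %x %ums %63[^\n]",
		   e->thr, e->pid, &e->addr, &e->flags, &e->mark_ms,
		   e->desc);
	return n == 6 ? 0 : -1;
}

/* In the sample, queued items show "-" in place of a PID. */
static int rq_entry_is_queued(const struct rq_entry *e)
{
	return strcmp(e->pid, "-") == 0;
}
```

Rows that fail to parse, such as the column headings and the row of '=' signs, are reported with -1 so that a caller can simply skip them.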
263 |