Commit | Line | Data |
---|---|---|
2908d778 JB |
1 | SAS Layer |
2 | --------- | |
3 | ||
4 | The SAS Layer is a management infrastructure which manages | |
5 | SAS LLDDs. It sits between SCSI Core and SAS LLDDs. The | |
6 | layout is as follows: while SCSI Core is concerned with | |
7 | SAM/SPC issues, and a SAS LLDD+sequencer is concerned with | |
8 | phy/OOB/link management, the SAS layer is concerned with: | |
9 | ||
10 | * SAS Phy/Port/HA event management (LLDD generates, | |
11 | SAS Layer processes), | |
12 | * SAS Port management (creation/destruction), | |
13 | * SAS Domain discovery and revalidation, | |
14 | * SAS Domain device management, | |
15 | * SCSI Host registration/unregistration, | |
16 | * Device registration with SCSI Core (SAS) or libata | |
17 | (SATA), and | |
18 | * Expander management and exporting expander control | |
19 | to user space. | |
20 | ||
21 | A SAS LLDD is a PCI device driver. It is concerned with | |
22 | phy/OOB management, and vendor specific tasks and generates | |
23 | events to the SAS layer. | |
24 | ||
25 | The SAS Layer does most SAS tasks as outlined in the SAS 1.1 | |
26 | spec. | |
27 | ||
28 | The sas_ha_struct describes the SAS LLDD to the SAS layer. | |
29 | Most of it is used by the SAS Layer but a few fields need to | |
30 | be initialized by the LLDDs. | |
31 | ||
32 | After initializing your hardware, from the probe() function | |
33 | you call sas_register_ha(). It will register your LLDD with | |
34 | the SCSI subsystem, creating a SCSI host and it will | |
35 | register your SAS driver with the sysfs SAS tree it creates. | |
36 | It will then return. Then you enable your phys to actually | |
37 | start OOB (at which point your driver will start calling the | |
38 | notify_* event callbacks). | |
39 | ||
40 | Structure descriptions: | |
41 | ||
42 | struct sas_phy -------------------- | |
43 | Normally this is embedded in your driver's | |
44 | phy structure: | |
45 | struct my_phy { | |
46 | blah; | |
47 | struct sas_phy sas_phy; | |
48 | bleh; | |
49 | }; | |
50 | And then all the phys are an array of my_phy in your HA | |
51 | struct (shown below). | |
52 | ||
53 | Then as you go along and initialize your phys you also | |
54 | initialize the sas_phy struct, along with your own | |
55 | phy structure. | |
56 | ||
57 | In general, the phys are managed by the LLDD and the ports | |
58 | are managed by the SAS layer. So the phys are initialized | |
59 | and updated by the LLDD and the ports are initialized and | |
60 | updated by the SAS layer. | |
61 | ||
62 | There is a scheme where the LLDD can RW certain fields, | |
63 | and the SAS layer can only read such ones, and vice versa. | |
64 | The idea is to avoid unnecessary locking. | |
65 | ||
66 | enabled -- must be set (0/1) | |
67 | id -- must be set [0,MAX_PHYS) | |
68 | class, proto, type, role, oob_mode, linkrate -- must be set | |
69 | oob_mode -- you set this when OOB has finished and then notify | |
70 | the SAS Layer. | |
71 | ||
72 | sas_addr -- this normally points to an array holding the sas | |
73 | address of the phy, possibly somewhere in your my_phy | |
74 | struct. | |
75 | ||
76 | attached_sas_addr -- set this when you (LLDD) receive an | |
77 | IDENTIFY frame or a FIS frame, _before_ notifying the SAS | |
78 | layer. The idea is that sometimes the LLDD may want to fake | |
79 | or provide a different SAS address on that phy/port and this | |
80 | allows it to do this. At best you should copy the sas | |
81 | address from the IDENTIFY frame or maybe generate a SAS | |
82 | address for SATA directly attached devices. The Discover | |
83 | process may later change this. | |
84 | ||
85 | frame_rcvd -- this is where you copy the IDENTIFY/FIS frame | |
86 | when you get it; you lock, copy, set frame_rcvd_size and | |
87 | unlock the lock, and then call the event. It is a pointer | |
88 | since there's no way to know your hw frame size _exactly_, | |
89 | so you define the actual array in your phy struct and let | |
90 | this pointer point to it. You copy the frame from your | |
91 | DMAable memory to that area holding the lock. | |
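
The frame_rcvd sequence described above (lock, copy, set frame_rcvd_size, unlock, then notify) can be sketched in user space as follows. This is only an illustration under stated assumptions: the structs are cut-down stand-ins for the real ones, the lock is a plain flag rather than a kernel spinlock, and MY_FRAME_MAX, frame_buf, lock_held, mock_lock(), mock_unlock(), record_port_event() and my_frame_received() are hypothetical names that do not come from the SAS layer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MY_FRAME_MAX 64	/* hypothetical size of the driver's frame buffer */

enum port_event { PORTE_BYTES_DMAED };	/* numeric value is a stand-in */

struct sas_phy;

/* Stand-in holding only the callback used in this sketch. */
struct sas_ha_struct {
	void (*notify_port_event)(struct sas_phy *, enum port_event);
};

/* Stand-in holding only the fields used in this sketch. */
struct sas_phy {
	int lock_held;			/* stand-in for a spinlock: 1 = held */
	unsigned char *frame_rcvd;	/* points into the driver's phy struct */
	int frame_rcvd_size;
	struct sas_ha_struct *ha;
};

struct my_phy {
	struct sas_phy sas_phy;
	unsigned char frame_buf[MY_FRAME_MAX];	/* the actual array */
};

/* Stand-in for the SAS layer's handler: it only records the event. */
static int events_seen;
static enum port_event last_event;

static void record_port_event(struct sas_phy *phy, enum port_event ev)
{
	(void)phy;
	last_event = ev;
	events_seen++;
}

static void mock_lock(struct sas_phy *phy)
{
	assert(!phy->lock_held);
	phy->lock_held = 1;
}

static void mock_unlock(struct sas_phy *phy)
{
	assert(phy->lock_held);
	phy->lock_held = 0;
}

/* Returns 0 on success, -1 if the frame does not fit in frame_buf. */
static int my_frame_received(struct my_phy *phy, const void *dma_buf,
			     size_t len)
{
	struct sas_phy *sp = &phy->sas_phy;

	if (len > sizeof(phy->frame_buf))
		return -1;

	mock_lock(sp);
	memcpy(sp->frame_rcvd, dma_buf, len);	/* copy while holding the lock */
	sp->frame_rcvd_size = (int)len;
	mock_unlock(sp);

	/* Notify only after the lock has been dropped. */
	sp->ha->notify_port_event(sp, PORTE_BYTES_DMAED);
	return 0;
}
```

In this sketch frame_rcvd is expected to have been pointed at frame_buf during phy initialization, which matches the "define the actual array in your phy struct" advice above.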
92 | ||
93 | sas_prim -- this is where primitives go when they're | |
94 | received. See sas.h. Grab the lock, set the primitive, | |
95 | release the lock, notify. | |
96 | ||
97 | port -- this points to the sas_port if the phy belongs | |
98 | to a port -- the LLDD only reads this. It points to the | |
99 | sas_port this phy is part of. Set by the SAS Layer. | |
100 | ||
101 | ha -- may be set; the SAS layer sets it anyway. | |
102 | ||
103 | lldd_phy -- you should set this to point to your phy so you | |
104 | can find your way around faster when the SAS layer calls one | |
105 | of your callbacks and passes you a phy. If the sas_phy is | |
106 | embedded you can also use container_of -- whatever you | |
107 | prefer. | |
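
The two ways of getting from a sas_phy back to your own phy structure can be compared with a small user-space sketch. It assumes a cut-down struct sas_phy and a user-space container_of built on offsetof; hw_index, my_phy_init(), my_phy_from_lldd_phy() and my_phy_from_container() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in holding only the fields used in this sketch. */
struct sas_phy {
	int id;
	void *lldd_phy;
};

struct my_phy {
	int hw_index;		/* hypothetical driver-private field */
	struct sas_phy sas_phy;	/* embedded, as described above */
};

static void my_phy_init(struct my_phy *phy, int id)
{
	phy->hw_index = id;
	phy->sas_phy.id = id;
	phy->sas_phy.lldd_phy = phy;	/* back pointer for callbacks */
}

/* Option 1: follow the lldd_phy back pointer. */
static struct my_phy *my_phy_from_lldd_phy(struct sas_phy *sp)
{
	assert(sp->lldd_phy != NULL);	/* must have been set at init time */
	return sp->lldd_phy;
}

/* Option 2: compute the enclosing struct from the embedded member. */
static struct my_phy *my_phy_from_container(struct sas_phy *sp)
{
	return container_of(sp, struct my_phy, sas_phy);
}
```

Both helpers return the same pointer when the sas_phy is embedded. Only the lldd_phy variant keeps working if the sas_phy is allocated separately from your phy structure.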
108 | ||
109 | ||
110 | struct sas_port -------------------- | |
111 | The LLDD doesn't set any fields of this struct -- it only | |
112 | reads them. They should be self explanatory. | |
113 | ||
114 | phy_mask is 32 bits wide. This should be enough for now, as I | |
115 | haven't heard of a HA having more than 8 phys. | |
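
Since the LLDD only reads the port fields, reading phy_mask might look like the sketch below. It assumes that bit i of the mask corresponds to the phy with id i, which the text above does not state explicitly; port_has_phy() and port_num_phys() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: bit i of phy_mask is set when phy id i belongs to the port. */
static int port_has_phy(uint32_t phy_mask, unsigned int phy_id)
{
	assert(phy_id < 32);	/* ids beyond the mask width are a caller bug */
	return (phy_mask >> phy_id) & 1u;
}

/* Number of phys in the port, i.e. the width of a wide port. */
static unsigned int port_num_phys(uint32_t phy_mask)
{
	unsigned int n = 0;

	/* Clear the lowest set bit on each iteration. */
	for (; phy_mask; phy_mask &= phy_mask - 1)
		n++;
	return n;
}
```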
116 | ||
117 | lldd_port -- I haven't found a use for that -- maybe other | |
118 | LLDDs that wish to have an internal port representation can make | |
119 | use of this. | |
120 | ||
121 | ||
122 | struct sas_ha_struct -------------------- | |
123 | It normally is statically declared in your own LLDD | |
124 | structure describing your adapter: | |
125 | struct my_sas_ha { | |
126 | blah; | |
127 | struct sas_ha_struct sas_ha; | |
128 | struct my_phy phys[MAX_PHYS]; | |
129 | struct sas_port sas_ports[MAX_PHYS]; /* (1) */ | |
130 | bleh; | |
131 | }; | |
132 | ||
133 | (1) If your LLDD doesn't have its own port representation. | |
134 | ||
135 | What needs to be initialized (sample function given below). | |
136 | ||
137 | pcidev | |
138 | sas_addr -- since the SAS layer doesn't want to mess with | |
139 | memory allocation, etc, this points to statically | |
140 | allocated array somewhere (say in your host adapter | |
141 | structure) and holds the SAS address of the host | |
142 | adapter as given by you or the manufacturer, etc. | |
143 | sas_port | |
144 | sas_phy -- an array of pointers to structures. (see | |
145 | note above on sas_addr). | |
146 | These must be set. See more notes below. | |
147 | num_phys -- the number of phys present in the sas_phy array, | |
148 | and the number of ports present in the sas_port | |
149 | array. There can be at most num_phys ports (one per | |
150 | phy), so we drop num_ports and only use | |
151 | num_phys. | |
152 | ||
153 | The event interface: | |
154 | ||
155 | /* LLDD calls these to notify the class of an event. */ | |
156 | void (*notify_ha_event)(struct sas_ha_struct *, enum ha_event); | |
157 | void (*notify_port_event)(struct sas_phy *, enum port_event); | |
158 | void (*notify_phy_event)(struct sas_phy *, enum phy_event); | |
159 | ||
160 | When sas_register_ha() returns, those are set and can be | |
161 | called by the LLDD to notify the SAS layer of such | |
162 | events. | |
163 | ||
164 | The port notification: | |
165 | ||
166 | /* The class calls these to notify the LLDD of an event. */ | |
167 | void (*lldd_port_formed)(struct sas_phy *); | |
168 | void (*lldd_port_deformed)(struct sas_phy *); | |
169 | ||
170 | If the LLDD wants notification when a port has been formed | |
171 | or deformed it sets those to a function satisfying the type. | |
172 | ||
173 | A SAS LLDD should also implement at least one of the Task | |
174 | Management Functions (TMFs) described in SAM: | |
175 | ||
176 | /* Task Management Functions. Must be called from process context. */ | |
177 | int (*lldd_abort_task)(struct sas_task *); | |
178 | int (*lldd_abort_task_set)(struct domain_device *, u8 *lun); | |
179 | int (*lldd_clear_aca)(struct domain_device *, u8 *lun); | |
180 | int (*lldd_clear_task_set)(struct domain_device *, u8 *lun); | |
181 | int (*lldd_I_T_nexus_reset)(struct domain_device *); | |
182 | int (*lldd_lu_reset)(struct domain_device *, u8 *lun); | |
183 | int (*lldd_query_task)(struct sas_task *); | |
184 | ||
185 | For more information please read SAM from T10.org. | |
186 | ||
187 | Port and Adapter management: | |
188 | ||
189 | /* Port and Adapter management */ | |
190 | int (*lldd_clear_nexus_port)(struct sas_port *); | |
191 | int (*lldd_clear_nexus_ha)(struct sas_ha_struct *); | |
192 | ||
193 | A SAS LLDD should implement at least one of those. | |
194 | ||
195 | Phy management: | |
196 | ||
197 | /* Phy management */ | |
198 | int (*lldd_control_phy)(struct sas_phy *, enum phy_func); | |
199 | ||
200 | lldd_ha -- set this to point to your HA struct. You can also | |
201 | use container_of if you embedded it as shown above. | |
202 | ||
203 | A sample initialization and registration function | |
204 | can look like this (called last thing from probe()) | |
205 | *but* before you enable the phys to do OOB: | |
206 | ||
207 | static int register_sas_ha(struct my_sas_ha *my_ha) | |
208 | { | |
209 | int i; | |
210 | static struct sas_phy *sas_phys[MAX_PHYS]; | |
211 | static struct sas_port *sas_ports[MAX_PHYS]; | |
212 | ||
213 | my_ha->sas_ha.sas_addr = &my_ha->sas_addr[0]; | |
214 | ||
215 | for (i = 0; i < MAX_PHYS; i++) { | |
216 | sas_phys[i] = &my_ha->phys[i].sas_phy; | |
217 | sas_ports[i] = &my_ha->sas_ports[i]; | |
218 | } | |
219 | ||
220 | my_ha->sas_ha.sas_phy = sas_phys; | |
221 | my_ha->sas_ha.sas_port = sas_ports; | |
222 | my_ha->sas_ha.num_phys = MAX_PHYS; | |
223 | ||
224 | my_ha->sas_ha.lldd_port_formed = my_port_formed; | |
225 | ||
226 | my_ha->sas_ha.lldd_dev_found = my_dev_found; | |
227 | my_ha->sas_ha.lldd_dev_gone = my_dev_gone; | |
228 | ||
229 | my_ha->sas_ha.lldd_max_execute_num = lldd_max_execute_num; /* (1) */ | |
230 | ||
231 | my_ha->sas_ha.lldd_queue_size = ha_can_queue; | |
232 | my_ha->sas_ha.lldd_execute_task = my_execute_task; | |
233 | ||
234 | my_ha->sas_ha.lldd_abort_task = my_abort_task; | |
235 | my_ha->sas_ha.lldd_abort_task_set = my_abort_task_set; | |
236 | my_ha->sas_ha.lldd_clear_aca = my_clear_aca; | |
237 | my_ha->sas_ha.lldd_clear_task_set = my_clear_task_set; | |
238 | my_ha->sas_ha.lldd_I_T_nexus_reset = NULL; /* (2) */ | |
239 | my_ha->sas_ha.lldd_lu_reset = my_lu_reset; | |
240 | my_ha->sas_ha.lldd_query_task = my_query_task; | |
241 | ||
242 | my_ha->sas_ha.lldd_clear_nexus_port = my_clear_nexus_port; | |
243 | my_ha->sas_ha.lldd_clear_nexus_ha = my_clear_nexus_ha; | |
244 | ||
245 | my_ha->sas_ha.lldd_control_phy = my_control_phy; | |
246 | ||
247 | return sas_register_ha(&my_ha->sas_ha); | |
248 | } | |
249 | ||
250 | (1) This is normally a LLDD parameter, something along the | |
251 | lines of a task collector. What it tells the SAS Layer is | |
252 | whether the SAS layer should run in Direct Mode (default: | |
253 | value 0 or 1) or Task Collector Mode (value greater than 1). | |
254 | ||
255 | In Direct Mode, the SAS Layer calls Execute Task as soon as | |
256 | it has a command to send to the SDS, _and_ this is a single | |
257 | command, i.e. not linked. | |
258 | ||
259 | Some hardware (e.g. aic94xx) has the capability to DMA more | |
260 | than one task at a time (interrupt) from host memory. Task | |
261 | Collector Mode is an optional feature for HAs which support | |
262 | this in their hardware. (Again, it is completely optional | |
263 | even if your hardware supports it.) | |
264 | ||
265 | In Task Collector Mode, the SAS Layer would do _natural_ | |
266 | coalescing of tasks and at the appropriate moment it would | |
267 | call your driver to DMA more than one task in a single HA | |
268 | interrupt. A DBMS may want to use this by insmod/modprobe | |
269 | setting the lldd_max_execute_num to something greater than | |
270 | 1. | |
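
The mapping from lldd_max_execute_num to the two modes described in note (1) is small enough to write down directly. In this sketch the enum and the function name are hypothetical; only the rule itself (0 or 1 means Direct Mode, greater than 1 means Task Collector Mode) comes from the text above.

```c
#include <assert.h>

/* Hypothetical names for the two modes described in note (1). */
enum sas_exec_mode {
	SAS_DIRECT_MODE,
	SAS_TASK_COLLECTOR_MODE,
};

static enum sas_exec_mode sas_exec_mode_for(int lldd_max_execute_num)
{
	assert(lldd_max_execute_num >= 0);	/* negative values make no sense */
	return lldd_max_execute_num > 1 ? SAS_TASK_COLLECTOR_MODE
					: SAS_DIRECT_MODE;
}
```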
271 | ||
272 | (2) SAS 1.1 does not define I_T Nexus Reset TMF. | |
273 | ||
274 | Events | |
275 | ------ | |
276 | ||
277 | Events are _the only way_ a SAS LLDD notifies the SAS layer | |
278 | of anything. There is no other way for a LLDD to tell | |
279 | the SAS layer of anything happening internally or in the SAS | |
280 | domain. | |
281 | ||
282 | Phy events: | |
283 | PHYE_LOSS_OF_SIGNAL, (C) | |
284 | PHYE_OOB_DONE, | |
285 | PHYE_OOB_ERROR, (C) | |
286 | PHYE_SPINUP_HOLD. | |
287 | ||
288 | Port events, passed on a _phy_: | |
289 | PORTE_BYTES_DMAED, (M) | |
290 | PORTE_BROADCAST_RCVD, (E) | |
291 | PORTE_LINK_RESET_ERR, (C) | |
292 | PORTE_TIMER_EVENT, (C) | |
293 | PORTE_HARD_RESET. | |
294 | ||
295 | Host Adapter event: | |
296 | HAE_RESET | |
297 | ||
298 | A SAS LLDD should be able to generate | |
299 | - at least one event from group C (choice), | |
300 | - events marked M (mandatory) are mandatory (only one), | |
301 | - events marked E (expander) if it wants the SAS layer | |
302 | to handle domain revalidation (only one such). | |
303 | - Unmarked events are optional. | |
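
The C/M/E rules above can be expressed as a small check over the set of events a driver is able to generate. This is a sketch only: the EVF_* flag bits are hypothetical and exist just so the events from the separate phy, port and HA enums can share one bitmask; they are not part of the SAS layer.

```c
#include <assert.h>

/* Hypothetical flag bits, one per event named in the lists above. */
enum lldd_event_flag {
	EVF_PHYE_LOSS_OF_SIGNAL  = 1u << 0,	/* C */
	EVF_PHYE_OOB_DONE        = 1u << 1,
	EVF_PHYE_OOB_ERROR       = 1u << 2,	/* C */
	EVF_PHYE_SPINUP_HOLD     = 1u << 3,
	EVF_PORTE_BYTES_DMAED    = 1u << 4,	/* M */
	EVF_PORTE_BROADCAST_RCVD = 1u << 5,	/* E */
	EVF_PORTE_LINK_RESET_ERR = 1u << 6,	/* C */
	EVF_PORTE_TIMER_EVENT    = 1u << 7,	/* C */
	EVF_PORTE_HARD_RESET     = 1u << 8,
	EVF_HAE_RESET            = 1u << 9,
};

#define EVF_ALL		((1u << 10) - 1)
#define EVF_GROUP_C	(EVF_PHYE_LOSS_OF_SIGNAL | EVF_PHYE_OOB_ERROR | \
			 EVF_PORTE_LINK_RESET_ERR | EVF_PORTE_TIMER_EVENT)

/* 1 if the mandatory event and at least one group C event are present. */
static int lldd_events_sufficient(unsigned int supported)
{
	assert((supported & ~EVF_ALL) == 0);	/* unknown bits are a caller bug */
	return (supported & EVF_PORTE_BYTES_DMAED) &&
	       (supported & EVF_GROUP_C);
}

/* 1 if the SAS layer can handle domain revalidation for this driver. */
static int lldd_wants_revalidation(unsigned int supported)
{
	assert((supported & ~EVF_ALL) == 0);
	return (supported & EVF_PORTE_BROADCAST_RCVD) != 0;
}
```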
304 | ||
305 | Meaning: | |
306 | ||
307 | HAE_RESET -- when your HA got internal error and was reset. | |
308 | ||
309 | PORTE_BYTES_DMAED -- on receiving an IDENTIFY/FIS frame | |
310 | PORTE_BROADCAST_RCVD -- on receiving a primitive | |
311 | PORTE_LINK_RESET_ERR -- timer expired, loss of signal, loss | |
312 | of DWS, etc. (*) | |
313 | PORTE_TIMER_EVENT -- DWS reset timeout timer expired (*) | |
314 | PORTE_HARD_RESET -- Hard Reset primitive received. | |
315 | ||
316 | PHYE_LOSS_OF_SIGNAL -- the device is gone (*) | |
317 | PHYE_OOB_DONE -- OOB went fine and oob_mode is valid | |
318 | PHYE_OOB_ERROR -- Error while doing OOB, the device probably | |
319 | got disconnected. (*) | |
320 | PHYE_SPINUP_HOLD -- SATA is present, COMWAKE not sent. | |
321 | ||
322 | (*) should set/clear the appropriate fields in the phy, | |
323 | or alternatively call the inlined sas_phy_disconnected() | |
324 | which is just a helper, from their tasklet. | |
325 | ||
326 | The Execute Command SCSI RPC: | |
327 | ||
328 | int (*lldd_execute_task)(struct sas_task *, int num, | |
329 | unsigned long gfp_flags); | |
330 | ||
331 | Used to queue a task to the SAS LLDD. @task is the task(s) to | |
332 | be executed. @num should be the number of tasks being | |
333 | queued at this function call (they are linked via | |
334 | task::list). @gfp_flags should be the GFP mask defining the | |
335 | context of the caller. | |
336 | ||
337 | This function should implement the Execute Command SCSI RPC, | |
338 | or if you're sending a SCSI Task as linked commands, you | |
339 | should also use this function. | |
340 | ||
341 | That is, when lldd_execute_task() is called, the command(s) | |
342 | go out on the transport *immediately*. There is *no* | |
343 | queuing of any sort and at any level in a SAS LLDD. | |
344 | ||
345 | The use of task::list is two-fold, one for linked commands, | |
346 | the other discussed below. | |
347 | ||
348 | It is possible to queue up more than one task at a time, by | |
349 | initializing the list element of struct sas_task, and | |
350 | passing the number of tasks enlisted in this manner in num. | |
351 | ||
352 | Returns: -SAS_QUEUE_FULL, -ENOMEM, nothing was queued; | |
353 | 0, the task(s) were queued. | |
354 | ||
355 | If you want to pass num > 1, then either | |
356 | A) you're the only caller of this function and keep track | |
357 | of what you've queued to the LLDD, or | |
358 | B) you know what you're doing and have a strategy of | |
359 | retrying. | |
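
The all-or-nothing return convention for a batch (num > 1) can be illustrated with a user-space mock. Everything here is an assumption made for the sketch: the task list is a plain singly linked list instead of task::list, the adapter is passed in explicitly instead of being reached through the task, the numeric value of SAS_QUEUE_FULL is invented, and struct my_ha, pending and my_execute_task() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define SAS_QUEUE_FULL 1	/* hypothetical value; only the sign matters here */

struct sas_task {
	struct sas_task *next;	/* stand-in for the kernel's task::list */
	int queued;
};

struct my_ha {
	int queue_size;		/* corresponds to lldd_queue_size */
	int pending;		/* tasks currently outstanding */
};

/*
 * Accept the whole batch or nothing at all: on -SAS_QUEUE_FULL no task
 * has been touched, so the caller can retry the same list later.
 */
static int my_execute_task(struct my_ha *ha, struct sas_task *task, int num)
{
	struct sas_task *t = task;
	int i;

	assert(num >= 1);
	if (ha->pending + num > ha->queue_size)
		return -SAS_QUEUE_FULL;

	for (i = 0; i < num; i++) {
		assert(t != NULL);	/* the list must hold at least num tasks */
		t->queued = 1;
		t = t->next;
	}
	ha->pending += num;
	return 0;
}
```

Because a rejected batch leaves every task untouched, a caller following strategy A or B above only has to remember the head of the list and num in order to retry.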
360 | ||
361 | As opposed to queuing one task at a time (function call), | |
362 | batch queuing of tasks, by having num > 1, greatly | |
363 | simplifies LLDD code, sequencer code, and _hardware design_, | |
364 | and has some performance advantages in certain situations | |
365 | (DBMS). | |
366 | ||
367 | The LLDD advertises if it can take more than one command at | |
368 | a time at lldd_execute_task(), by setting the | |
369 | lldd_max_execute_num parameter (controlled by "collector" | |
370 | module parameter in aic94xx SAS LLDD). | |
371 | ||
372 | You should leave this to the default 1, unless you know what | |
373 | you're doing. | |
374 | ||
375 | This is a function of the LLDD, which the SAS layer can | |
376 | cater to. | |
377 | ||
378 | int lldd_queue_size | |
379 | The host adapter's queue size. This is the maximum | |
380 | number of commands the LLDD can have pending to domain | |
381 | devices on behalf of all upper layers submitting through | |
382 | lldd_execute_task(). | |
383 | ||
384 | You really want to set this to something (much) larger than | |
385 | 1. | |
386 | ||
387 | This _really_ has absolutely nothing to do with queuing. | |
388 | There is no queuing in SAS LLDDs. | |
389 | ||
390 | struct sas_task { | |
391 | dev -- the device this task is destined to | |
392 | list -- must be initialized (INIT_LIST_HEAD) | |
393 | task_proto -- _one_ of enum sas_proto | |
394 | scatter -- pointer to scatter gather list array | |
395 | num_scatter -- number of elements in scatter | |
fa00e7e1 | 396 | total_xfer_len -- total number of bytes expected to be transferred |
2908d778 JB |
397 | data_dir -- PCI_DMA_... |
398 | task_done -- callback when the task has finished execution | |
399 | }; | |
400 | ||
401 | When an external entity, i.e. an entity other than the LLDD or the | |
402 | SAS Layer, wants to work with a struct domain_device, it | |
403 | _must_ call kobject_get() when getting a handle on the | |
404 | device and kobject_put() when it is done with the device. | |
405 | ||
406 | This does two things: | |
407 | A) implements proper kfree() for the device; | |
408 | B) increments/decrements the kref for all players: | |
409 | domain_device | |
410 | all domain_device's ... (if past an expander) | |
411 | port | |
412 | host adapter | |
413 | pci device | |
414 | and up the ladder, etc. | |
415 | ||
416 | DISCOVERY | |
417 | --------- | |
418 | ||
419 | The sysfs tree has the following purposes: | |
420 | a) It shows you the physical layout of the SAS domain at | |
421 | the current time, i.e. how the domain looks in the | |
422 | physical world right now. | |
423 | b) Shows some device parameters _at_discovery_time_. | |
424 | ||
425 | This is a link to the tree(1) program, very useful in | |
426 | viewing the SAS domain: | |
427 | ftp://mama.indstate.edu/linux/tree/ | |
428 | I expect user space applications to actually create a | |
429 | graphical interface of this. | |
430 | ||
431 | That is, the sysfs domain tree doesn't show or keep state if | |
432 | you, e.g., change the READY LED MEANING | |
433 | setting, but it does show you the current connection status | |
434 | of the domain device. | |
435 | ||
436 | Keeping track of internal device state changes is the responsibility of | |
437 | upper layers (Command set drivers) and user space. | |
438 | ||
439 | When a device or devices are unplugged from the domain, this | |
440 | is reflected in the sysfs tree immediately, and the device(s) | |
441 | are removed from the system. | |
442 | ||
443 | The structure domain_device describes any device in the SAS | |
444 | domain. It is completely managed by the SAS layer. A task | |
445 | points to a domain device; this is how the SAS LLDD knows | |
446 | where to send the task(s). A SAS LLDD only reads the | |
447 | contents of the domain_device structure, but it never creates | |
448 | or destroys one. | |
449 | ||
450 | Expander management from User Space | |
451 | ----------------------------------- | |
452 | ||
453 | In each expander directory in sysfs, there is a file called | |
454 | "smp_portal". It is a binary sysfs attribute file, which | |
455 | implements an SMP portal (Note: this is *NOT* an SMP port), | |
456 | to which user space applications can send SMP requests and | |
457 | receive SMP responses. | |
458 | ||
459 | Functionality is deceptively simple: | |
460 | ||
461 | 1. Build the SMP frame you want to send. The format and layout | |
462 | are described in the SAS spec. Leave the CRC field equal to 0. | |
463 | open(2) | |
464 | 2. Open the expander's SMP portal sysfs file in RW mode. | |
465 | write(2) | |
466 | 3. Write the frame you built in 1. | |
467 | read(2) | |
468 | 4. Read the amount of data you expect to receive for the frame you built. | |
469 | If you receive a different amount of data than you expected, | |
470 | then there was some kind of error. | |
471 | close(2) | |
472 | This whole process is shown in detail in the function do_smp_func() | |
473 | and its callers, in the file "expander_conf.c". | |
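
As a rough illustration of steps 2 to 4 above, the following user-space sketch opens the portal, writes a prebuilt request and reads back the expected number of bytes. It is not a copy of do_smp_func(); smp_portal_xfer() is a hypothetical name, and building the SMP frame itself (step 1) is left to the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 0 when exactly resp_len bytes were read back, -1 on any error
 * or when a different amount of data was transferred.
 */
static int smp_portal_xfer(const char *portal_path,
			   const void *req, size_t req_len,
			   void *resp, size_t resp_len)
{
	ssize_t n;
	int ret = -1;
	int fd;

	assert(portal_path != NULL && req != NULL && resp != NULL);

	fd = open(portal_path, O_RDWR);		/* step 2 */
	if (fd < 0)
		return -1;

	n = write(fd, req, req_len);		/* step 3 */
	if (n < 0 || (size_t)n != req_len)
		goto out;

	n = read(fd, resp, resp_len);		/* step 4 */
	if (n < 0 || (size_t)n != resp_len)
		goto out;			/* unexpected size: treat as error */

	ret = 0;
out:
	close(fd);
	return ret;
}
```

The caller passes the sysfs path of the expander's smp_portal file and sizes resp_len according to the SMP function it is sending.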
474 | ||
475 | The kernel functionality is implemented in the file | |
476 | "sas_expander.c". | |
477 | ||
478 | The program "expander_conf.c" implements this. It takes one | |
479 | argument, the sysfs file name of the SMP portal to the | |
480 | expander, and gives expander information, including routing | |
481 | tables. | |
482 | ||
483 | The SMP portal gives you complete control of the expander, | |
484 | so please be careful. |