Commit | Line | Data |
---|---|---|
a3fa73bd JL |
1 | ################################################################################ |
2 | # # | |
3 | # NFS/RDMA README # | |
4 | # # | |
5 | ################################################################################ | |
6 | ||
7 | Author: NetApp and Open Grid Computing | |
007de8b4 | 8 | Date: May 29, 2008 |
a3fa73bd JL |
9 | |
10 | Table of Contents | |
11 | ~~~~~~~~~~~~~~~~~ | |
12 | - Overview | |
13 | - Getting Help | |
14 | - Installation | |
15 | - Check RDMA and NFS Setup | |
16 | - NFS/RDMA Setup | |
17 | ||
18 | Overview | |
19 | ~~~~~~~~ | |
20 | ||
21 | This document describes how to install and set up the Linux NFS/RDMA client | |
22 | and server software. | |
23 | ||
24 | The NFS/RDMA client was first included in Linux 2.6.24. The NFS/RDMA server | |
25 | was first included in the following release, Linux 2.6.25. | |
26 | ||
27 | In our testing, we have obtained excellent performance results (full 10Gbit | |
28 | wire bandwidth at minimal client CPU) under many workloads. The code passes | |
29 | the full Connectathon test suite and operates over both InfiniBand and iWARP | |
30 | RDMA adapters. | |
31 | ||
32 | Getting Help | |
33 | ~~~~~~~~~~~~ | |
34 | ||
35 | If you get stuck, you can ask questions on the | |
36 | ||
37 | nfs-rdma-devel@lists.sourceforge.net | |
38 | ||
39 | mailing list. | |
40 | ||
41 | Installation | |
42 | ~~~~~~~~~~~~ | |
43 | ||
44 | These instructions are a step-by-step guide to building a machine for | |
45 | use with NFS/RDMA. | |
46 | ||
47 | - Install an RDMA device | |
48 | ||
49 | Any device supported by the drivers in drivers/infiniband/hw is acceptable. | |
50 | ||
51 | Testing has been performed using several Mellanox-based IB cards, the | |
52 | Ammasso AMS1100 iWARP adapter, and the Chelsio cxgb3 iWARP adapter. | |
53 | ||
54 | - Install a Linux distribution and tools | |
55 | ||
56 | The first kernel release to contain both the NFS/RDMA client and server was | |
57 | Linux 2.6.25. Therefore, a distribution compatible with this and subsequent | |
58 | Linux kernel releases should be installed. | |
59 | ||
60 | The procedures described in this document have been tested with | |
61 | distributions from Red Hat's Fedora Project (http://fedora.redhat.com/). | |
62 | ||
007de8b4 | 63 | - Install nfs-utils-1.1.2 or greater on the client |
a3fa73bd | 64 | |
007de8b4 | 65 | An NFS/RDMA mount point can be obtained by using the mount.nfs command in |
3cd2cfea BF |
66 | nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils |
67 | version with support for NFS/RDMA mounts, but for various reasons we | |
68 | recommend using nfs-utils-1.1.2 or greater). To see which version of | |
69 | mount.nfs you are using, type: | |
a3fa73bd | 70 | |
007de8b4 | 71 | $ /sbin/mount.nfs -V |
a3fa73bd | 72 | |
007de8b4 JL |
73 | If the version is less than 1.1.2 or the command does not exist, |
74 | you should install the latest version of nfs-utils. | |
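
The version check above can be scripted. The following is a minimal sketch, not part of nfs-utils: the helper names version_ge and extract_version are hypothetical, the exact wording printed by "mount.nfs -V" is assumed to contain a dotted version number, and GNU sort (for the -V option) is assumed to be available.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Assumes GNU sort with -V (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# extract_version: print the first dotted version number found on stdin.
# The exact format of "mount.nfs -V" output is an assumption here.
extract_version() {
    grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Only query the binary if it is actually installed.
if [ -x /sbin/mount.nfs ]; then
    installed=$(/sbin/mount.nfs -V 2>&1 | extract_version)
    if [ -n "$installed" ] && version_ge "$installed" 1.1.2; then
        echo "mount.nfs $installed is recent enough"
    else
        echo "mount.nfs is older than 1.1.2; upgrade nfs-utils"
    fi
else
    echo "/sbin/mount.nfs not found; install nfs-utils"
fi
```

Comparing with a version sort rather than a plain string comparison keeps a release such as 1.10.0 from being treated as older than 1.1.2.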
a3fa73bd JL |
75 | |
76 | Download the latest package from: | |
77 | ||
78 | http://www.kernel.org/pub/linux/utils/nfs | |
79 | ||
80 | Uncompress the package and follow the installation instructions. | |
81 | ||
007de8b4 JL |
82 | If you will not need the idmapper and gssd executables (you do not need |
83 | these to create an NFS/RDMA enabled mount command), the installation | |
84 | process can be simplified by disabling these features when running | |
85 | configure: | |
a3fa73bd | 86 | |
007de8b4 | 87 | $ ./configure --disable-gss --disable-nfsv4 |
a3fa73bd | 88 | |
007de8b4 JL |
89 | To build nfs-utils you will need the tcp_wrappers package installed. For |
90 | more information on this see the package's README and INSTALL files. | |
a3fa73bd JL |
91 | |
92 | After building the nfs-utils package, there will be a mount.nfs binary in | |
93 | the utils/mount directory. This binary can be used to initiate NFS v2, v3, | |
3cd2cfea BF |
94 | or v4 mounts. To initiate a v4 mount, the binary must be called |
95 | mount.nfs4. The standard technique is to create a symlink called | |
96 | mount.nfs4 to mount.nfs. | |
a3fa73bd | 97 | |
007de8b4 JL |
98 | This mount.nfs binary should be installed at /sbin/mount.nfs as follows: |
99 | ||
100 | $ sudo cp utils/mount/mount.nfs /sbin/mount.nfs | |
101 | ||
102 | In this location, mount.nfs will be invoked automatically for NFS mounts | |
103 | by the system mount command. | |
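
The copy step and the mount.nfs4 symlink described earlier can be combined. This is a sketch only: install_mount_nfs and its DESTDIR argument are hypothetical names, added so the steps can be tried in a scratch directory before touching /sbin.

```shell
#!/bin/sh
# install_mount_nfs SRC DESTDIR: copy a freshly built mount.nfs into DESTDIR
# and create a mount.nfs4 symlink next to it that points at mount.nfs.
# DESTDIR is a hypothetical parameter; the README itself installs to /sbin.
install_mount_nfs() {
    src=$1
    destdir=$2
    cp "$src" "$destdir/mount.nfs" &&
    ln -sf mount.nfs "$destdir/mount.nfs4"
}

# Example (run as root from the top of the nfs-utils build tree):
#   install_mount_nfs utils/mount/mount.nfs /sbin
```

The symlink is relative, so it keeps working if the directory is later moved or mounted elsewhere.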
104 | ||
105 | NOTE: mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed | |
a3fa73bd JL |
106 | on the NFS client machine. You do not need this specific version of |
107 | nfs-utils on the server. Furthermore, only the mount.nfs command from | |
007de8b4 | 108 | nfs-utils-1.1.2 is needed on the client. |
a3fa73bd JL |
109 | |
110 | - Install a Linux kernel with NFS/RDMA | |
111 | ||
112 | The NFS/RDMA client and server are both included in the mainline Linux | |
113 | kernel version 2.6.25 and later. This and other versions of the 2.6 Linux | |
114 | kernel can be found at: | |
115 | ||
116 | ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ | |
117 | ||
118 | Download the sources and place them in an appropriate location. | |
119 | ||
120 | - Configure the RDMA stack | |
121 | ||
122 | Make sure your kernel configuration has RDMA support enabled. Under | |
123 | Device Drivers -> InfiniBand support, update the kernel configuration | |
124 | to enable InfiniBand support [NOTE: the option name is misleading. Enabling | |
125 | InfiniBand support is required for all RDMA devices (IB, iWARP, etc.)]. | |
126 | ||
127 | Enable the appropriate IB HCA support (mlx4, mthca, ehca, ipath, etc.) or | |
128 | iWARP adapter support (amso, cxgb3, etc.). | |
129 | ||
130 | If you are using InfiniBand, be sure to enable IP-over-InfiniBand support. | |
131 | ||
132 | - Configure the NFS client and server | |
133 | ||
134 | Your kernel configuration must also have NFS file system support and/or | |
135 | NFS server support enabled. These and other NFS related configuration | |
136 | options can be found under File Systems -> Network File Systems. | |
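
Before building, the options named in the last two steps can be checked in the kernel .config. The sketch below assumes the usual "CONFIG_FOO=y" / "# CONFIG_FOO is not set" file format; the function names and the example path are hypothetical.

```shell
#!/bin/sh
# config_value FILE SYMBOL: print y, m, or n for CONFIG_<SYMBOL> in a kernel
# .config file. A symbol that is absent or commented out is reported as n.
config_value() {
    value=$(sed -n "s/^CONFIG_$2=\([ym]\)\$/\1/p" "$1" | head -n 1)
    echo "${value:-n}"
}

# report_nfs_rdma_config FILE: list the options this README relies on.
report_nfs_rdma_config() {
    for sym in INFINIBAND INFINIBAND_IPOIB SUNRPC NFS_FS NFSD; do
        echo "CONFIG_$sym=$(config_value "$1" "$sym")"
    done
}

# Example (the path is an assumption; use your own source tree):
#   report_nfs_rdma_config /usr/src/linux/.config
```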
137 | ||
138 | - Build, install, reboot | |
139 | ||
140 | The NFS/RDMA code will be enabled automatically if NFS and RDMA | |
141 | are turned on. The NFS/RDMA client and server are configured via the hidden | |
142 | SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The | |
143 | value of SUNRPC_XPRT_RDMA will be: | |
144 | ||
145 | - N if either SUNRPC or INFINIBAND is N; in this case the NFS/RDMA client | |
146 | and server will not be built | |
147 | - M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M; | |
148 | in this case the NFS/RDMA client and server will be built as modules | |
149 | - Y if both SUNRPC and INFINIBAND are Y; in this case the NFS/RDMA client | |
150 | and server will be built into the kernel | |
151 | ||
152 | Therefore, if you have followed the steps above and turned on NFS and RDMA, | |
153 | the NFS/RDMA client and server will be built. | |
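
The three rules for SUNRPC_XPRT_RDMA can be written as a small function. This is an illustration of the rules as stated in this README, not the kernel's Kconfig machinery; xprt_rdma_value is a hypothetical name, and values are given in lowercase as they appear in a .config file.

```shell
#!/bin/sh
# xprt_rdma_value SUNRPC INFINIBAND: print the value SUNRPC_XPRT_RDMA would
# take for the given settings (each argument is y, m, or n).
xprt_rdma_value() {
    case "$1$2" in
        *n*) echo n ;;   # either dependency off: NFS/RDMA is not built
        yy)  echo y ;;   # both built in: NFS/RDMA is built into the kernel
        *)   echo m ;;   # both on, at least one modular: built as modules
    esac
}

# Example: SUNRPC built in, INFINIBAND modular
xprt_rdma_value y m
```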
154 | ||
155 | Build a new kernel, install it, boot it. | |
156 | ||
157 | Check RDMA and NFS Setup | |
158 | ~~~~~~~~~~~~~~~~~~~~~~~~ | |
159 | ||
160 | Before configuring the NFS/RDMA software, it is a good idea to test | |
161 | your new kernel to ensure that the kernel is working correctly. | |
162 | In particular, it is a good idea to verify that the RDMA stack | |
163 | is functioning as expected and standard NFS over TCP/IP and/or UDP/IP | |
164 | is working properly. | |
165 | ||
166 | - Check RDMA Setup | |
167 | ||
168 | If you built the RDMA components as modules, load them at | |
169 | this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel | |
170 | card: | |
171 | ||
007de8b4 JL |
172 | $ modprobe ib_mthca |
173 | $ modprobe ib_ipoib | |
a3fa73bd JL |
174 | |
175 | If you are using InfiniBand, make sure there is a Subnet Manager (SM) | |
176 | running on the network. If your IB switch has an embedded SM, you can | |
177 | use it. Otherwise, you will need to run an SM, such as OpenSM, on one | |
178 | of your end nodes. | |
179 | ||
180 | If an SM is running on your network, you should see the following: | |
181 | ||
007de8b4 | 182 | $ cat /sys/class/infiniband/driverX/ports/1/state |
a3fa73bd JL |
183 | 4: ACTIVE |
184 | ||
185 | where driverX is mthca0, ipath5, ehca3, etc. | |
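
When a host has several devices or ports, the same check can be looped over every state file. This is a minimal sketch: list_ib_port_states is a hypothetical helper, and its optional argument exists only so the function can be pointed at a copy of the sysfs tree.

```shell
#!/bin/sh
# list_ib_port_states [SYSFS_ROOT]: print "<device> port <n>: <state>" for
# every InfiniBand port found, and fail if any port is not "4: ACTIVE".
# SYSFS_ROOT defaults to /sys/class/infiniband.
list_ib_port_states() {
    root=${1:-/sys/class/infiniband}
    status=0
    for f in "$root"/*/ports/*/state; do
        [ -r "$f" ] || continue          # no devices: the glob did not match
        port=${f%/state}; port=${port##*/}
        dev=${f%/ports/*}; dev=${dev##*/}
        state=$(cat "$f")
        echo "$dev port $port: $state"
        case "$state" in
            "4: ACTIVE") ;;
            *) status=1 ;;
        esac
    done
    return $status
}

# Example: list_ib_port_states || echo "at least one port is not ACTIVE"
```

Note that the function prints nothing and succeeds when no RDMA device is present, so an empty listing means the driver is not loaded rather than that all ports are healthy.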
186 | ||
187 | To further test the InfiniBand software stack, use IPoIB (this | |
188 | assumes you have two IB hosts named host1 and host2): | |
189 | ||
007de8b4 JL |
190 | host1$ ifconfig ib0 a.b.c.x |
191 | host2$ ifconfig ib0 a.b.c.y | |
192 | host1$ ping a.b.c.y | |
193 | host2$ ping a.b.c.x | |
a3fa73bd JL |
194 | |
195 | For other device types, follow the appropriate procedures. | |
196 | ||
197 | - Check NFS Setup | |
198 | ||
199 | For the NFS components enabled above (client and/or server), | |
200 | test their functionality over standard Ethernet using TCP/IP or UDP/IP. | |
201 | ||
202 | NFS/RDMA Setup | |
203 | ~~~~~~~~~~~~~~ | |
204 | ||
205 | We recommend that you use two machines, one to act as the client and | |
206 | one to act as the server. | |
207 | ||
208 | One time configuration: | |
209 | ||
210 | - On the server system, configure the /etc/exports file and | |
211 | start the NFS/RDMA server. | |
212 | ||
c272cca6 | 213 | Exports entries with the following formats have been tested: |
a3fa73bd | 214 | |
c272cca6 JL |
215 | /vol0 192.168.0.47(fsid=0,rw,async,insecure,no_root_squash) |
216 | /vol0 192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash) | |
a3fa73bd | 217 | |
3cd2cfea BF |
218 | The IP address(es) is(are) the client's IPoIB address for an InfiniBand |
219 | HCA or the client's iWARP address(es) for an RNIC. | |
c272cca6 | 220 | |
3cd2cfea BF |
221 | NOTE: The "insecure" option must be used because the NFS/RDMA client does |
222 | not use a reserved port. | |
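
Because a missing "insecure" option is easy to overlook, the exports file can be scanned for it. The following is a coarse per-line sketch (a line that lists several clients passes if any one of them has the option); exports_missing_insecure is a hypothetical helper name.

```shell
#!/bin/sh
# exports_missing_insecure [FILE]: print every export line in FILE
# (default /etc/exports) whose option list lacks "insecure".
# Succeeds only when no such line is found.
exports_missing_insecure() {
    file=${1:-/etc/exports}
    # Drop blank lines and comments, then keep lines without the option.
    missing=$(grep -Ev '^[[:space:]]*(#|$)' "$file" |
              grep -Ev '[(,]insecure[,)]')
    if [ -n "$missing" ]; then
        echo "$missing"
        return 1
    fi
    return 0
}

# Example: exports_missing_insecure || echo "add 'insecure' to the lines above"
```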
a3fa73bd JL |
223 | |
224 | Each time a machine boots: | |
225 | ||
226 | - Load and configure the RDMA drivers | |
227 | ||
228 | For InfiniBand using a Mellanox adapter: | |
229 | ||
007de8b4 JL |
230 | $ modprobe ib_mthca |
231 | $ modprobe ib_ipoib | |
232 | $ ifconfig ib0 a.b.c.d | |
a3fa73bd JL |
233 | |
234 | NOTE: use unique addresses for the client and server | |
235 | ||
236 | - Start the NFS server | |
237 | ||
3cd2cfea BF |
238 | If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in |
239 | kernel config), load the RDMA transport module: | |
a3fa73bd | 240 | |
007de8b4 | 241 | $ modprobe svcrdma |
a3fa73bd | 242 | |
3cd2cfea BF |
243 | Regardless of how the server was built (module or built-in), start the |
244 | server: | |
a3fa73bd | 245 | |
007de8b4 | 246 | $ /etc/init.d/nfs start |
a3fa73bd JL |
247 | |
248 | or | |
249 | ||
007de8b4 | 250 | $ service nfs start |
a3fa73bd JL |
251 | |
252 | Instruct the server to listen on the RDMA transport: | |
253 | ||
096abd77 | 254 | $ echo rdma 20049 > /proc/fs/nfsd/portlist |
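
To confirm that the server accepted the request, the same file can be read back. This sketch assumes the file lists one "<transport> <port>" pair per line; nfsd_listens_on is a hypothetical helper, and its optional third argument only exists so it can be tried on an ordinary file.

```shell
#!/bin/sh
# nfsd_listens_on TRANSPORT PORT [FILE]: succeed if FILE (default
# /proc/fs/nfsd/portlist) has a line naming that transport and port.
# Assumes one "<transport> <port>" pair per line.
nfsd_listens_on() {
    grep -Eq "^$1[[:space:]]+$2\$" "${3:-/proc/fs/nfsd/portlist}"
}

# Example, after the echo command above:
#   nfsd_listens_on rdma 20049 && echo "server is listening on RDMA"
```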
a3fa73bd JL |
255 | |
256 | - On the client system | |
257 | ||
3cd2cfea BF |
258 | If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in |
259 | kernel config), load the RDMA client module: | |
a3fa73bd | 260 | |
007de8b4 | 261 | $ modprobe xprtrdma |
a3fa73bd | 262 | |
3cd2cfea BF |
263 | Regardless of how the client was built (module or built-in), use this |
264 | command to mount the NFS/RDMA server: | |
a3fa73bd | 265 | |
096abd77 | 266 | $ mount -o rdma,port=20049 <IPoIB-server-name-or-address>:/<export> /mnt |
a3fa73bd | 267 | |
3cd2cfea BF |
268 | To verify that the mount is using RDMA, run "cat /proc/mounts" and check |
269 | the "proto" field for the given mount. | |
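
The "proto" check can also be done without reading the whole file by eye. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, options, ...); mount_proto is a hypothetical helper name.

```shell
#!/bin/sh
# mount_proto MOUNTPOINT [FILE]: print the value of the "proto=" mount option
# for MOUNTPOINT, reading FILE (default /proc/mounts).
mount_proto() {
    awk -v mp="$1" '
        $2 == mp {
            # Field 4 is the comma-separated option list.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^proto=/) {
                    sub(/^proto=/, "", opts[i])
                    print opts[i]
                }
        }' "${2:-/proc/mounts}"
}

# Example: [ "$(mount_proto /mnt)" = rdma ] && echo "/mnt is using RDMA"
```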
a3fa73bd JL |
270 | |
271 | Congratulations! You're using NFS/RDMA! |