This guide is intended for RHEL-based systems. As of this writing, the latest RHEL release supported by Lustre is 8.7, so that is what will be used. This guide assumes a fairly standard installation of the operating system. It also assumes a user familiar with RHEL systems and Linux in general. It will not cover every step, nor will it cover every possible configuration. The official Lustre documentation should be consulted for more advanced usage. I will not be taking any questions.
Lustre is a filesystem/clustering/networking tool for creating highly configurable storage setups. It has features like clustering, active/passive and active/active failover, network-wide file striping, adaptive routing, InfiniBand and RDMA support, ZFS backends, and more. The Lustre website has more detailed information about features and capabilities.
Lustre is a kernel module for Linux and as a result, it does require specific versions of the kernel alongside specific distributions in order to work. Check here for more information as to which systems are supported.
Lustre divides the duties of the filesystem into several distinct groups, which allows different servers to specialize in each of these duties. The three types are as follows:

MGS (Management Server)
- Stores the configuration information for the filesystem; servers and clients contact the MGS to learn how to reach the rest of the cluster.

MDS (Metadata Server)
- Stores the permissions, file structure, access times, etc.
- Can also store small files, which can speed up certain workloads.
- Needs high IOPS and low latency for optimal performance.
- Storage capacity should be a few dozen gigabytes per terabyte of OST storage.
- Read more about MDS/MDT sizing here.

OSS (Object Storage Server)
- Block storage for file data.
- Large disk and memory capacities are best for this.
- Higher network bandwidth is also preferred.
Each function can be performed by one or more servers, and multiple servers can perform the same function. This allows for features like failover and data striping. A single server can also perform multiple functions. Having a combined MGS/MDS setup is not uncommon as the MGS requirements are low.
As of right now, RHEL version 8.7 is the latest supported operating system.
The following software packages are dependencies of Lustre and must be installed. The high availability, extras, and powertools repositories are also required.
yum-config-manager --enable ha powertools extras
Install epel-release now that the needed repositories have been enabled.
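For example (assuming yum; dnf behaves the same on RHEL 8):

```shell
# Install the EPEL bootstrap package, which adds the EPEL repository
# needed for some Lustre dependencies.
yum install -y epel-release
```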
Make sure to use the correct version of Lustre for your version of RHEL. You MUST use the kernel provided by Lustre in order for it to work.
Go here to download the latest RPMs for the Lustre server packages:
https://downloads.whamcloud.com/public/lustre/latest-release/el8.7/server/RPMS/x86_64/
You'll also need the e2fsprogs packages located here:
https://downloads.whamcloud.com/public/e2fsprogs/latest/el8/RPMS/x86_64/
Packages marked debugsource or tests are not required. These packages are fairly large, at about 900MB, and likely should not be used on production servers. Additionally, if you are not using ZFS, do not download any packages with zfs in the name, nor should you download lustre-all-dkms. Otherwise, you'll need to grab them in order to use ZFS.
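One way to fetch everything at once is a recursive wget; a sketch, using the server RPM directory above (the accept pattern and URL are the only assumptions -- adjust for your release):

```shell
# Mirror all RPMs from the Lustre server directory into the current
# directory; -np keeps wget from ascending, -nd flattens the tree.
wget -r -np -nd -A '*.rpm' \
    https://downloads.whamcloud.com/public/lustre/latest-release/el8.7/server/RPMS/x86_64/
```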
In the directory where you've placed all of the packages, use localinstall to install the new kernel. This may take a bit of time.
yum localinstall kernel* -y
Now that the new kernel has been installed, reboot and confirm that the Lustre kernel is loaded. For example:
uname -r
Should return something like this:
4.18.0-425.3.1.el8_lustre.x86_64
If not, you may need to tweak which kernel is automatically selected.
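On RHEL 8, the default kernel can be inspected and changed with grubby; a sketch (the vmlinuz path below is just the example version from above):

```shell
# Show the kernel that boots by default.
if command -v grubby >/dev/null 2>&1; then
    grubby --default-kernel
    # If it isn't the Lustre kernel, point grubby at it, e.g.:
    # grubby --set-default /boot/vmlinuz-4.18.0-425.3.1.el8_lustre.x86_64
else
    echo "grubby not found; are you on a RHEL-style system?"
fi
```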
Run the following commands to install the rest of the software.
Note the order; it takes into account some dependency needs. This may take a bit of time, as modules will need to compile. If you aren't using ZFS, simply leave out libzpool*, zfs*, and libzfs*, and replace lustre-all-dkms* with lustre-ldiskfs-dkms*.
yum localinstall kmod-lustre-2.* -y
yum localinstall e2fsprogs-* libcom* libss* -y
yum localinstall zfs* libuutil* libzpool* libnvpair* libzfs* -y
yum localinstall lustre-all-dkms* lustre-debuginfo* -y
yum localinstall lustre-2* lustre-osd-* lustre-resource-agents-* -y
Lustre requires a unique host ID be set on the platform. Run the following two commands to generate one:
hid=`[ -f /etc/hostid ] && od -An -tx /etc/hostid|sed 's/ //g'`
[ "$hid" = `hostid` ] || genhostid
Load the newly installed modules to confirm that they are working:
modprobe zfs
modprobe lustre
modprobe lnet
If all three of those worked, place them into /etc/modules-load.d/lustre.conf. The file likely will not exist, so you'll need to create it. This will ensure that they load on boot. ksocklnd is LNet's socket (TCP) network driver module; it is needed for lnet to load.
The contents of the file should look like this:
ksocklnd
zfs
lustre
lnet
Make sure to enable the Lustre services with systemd.
systemctl enable lustre
systemctl enable lnet
The software installation should be complete now. Reboot the system to verify that the modules load automatically and that no errors occur.
Check dmesg and the status of the services for errors. If you see -ENOENT from LNetError in dmesg, that's okay for now. This will be resolved when the disks and network are set up.
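Once the network is ready, LNet can be configured with lnetctl; a sketch, assuming a tcp1 network on interface eth0 (substitute your actual NIC):

```shell
# Configure LNet and attach it to an interface. Requires the Lustre
# modules from above to be loaded.
lnetctl lnet configure
lnetctl net add --net tcp1 --if eth0   # eth0 is an assumption
lnetctl net show                       # verify the NID, e.g. 10.0.0.50@tcp1
```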
To create the MGT (the MGS's backing target) on a server:
mkfs.lustre --mgs --servicenode 10.0.0.50@tcp1 --backfstype=zfs z_mdt/p
This tells Lustre that this target is the MGS, with a service node of 10.0.0.50 reachable on the tcp1 LNet network. The backing filesystem is ZFS, hence the --backfstype=zfs. The ZFS pool selected is z_mdt and the dataset within it is p.