My Lustre Installation Notes

2022-11-25 | Research notes

The servers run CentOS 7.9.

Installation steps:

1. Install the OS.

2. Install the InfiniBand driver from the Lustre website (the pre-installed MLNX driver has to be removed first).

If ib0 does not show up:

sudo modprobe -rv ib_isert rpcrdma ib_srpt

sudo service openibd start
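If that worked, the interface should be visible again; a quick check (ibstat comes from the infiniband-diags package, assuming it is installed):

ip link show ib0   # the IPoIB interface should be listed again
ibstat             # the port state should be Active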

3. Install the kernel downloaded from the Lustre website.

4. Install the Lustre server packages.

Partway through, the installer will complain about the ZFS OSD (I am using ldiskfs, so this can be ignored).

A missing libmpi.so.12 can likewise be forced past.
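For reference, a minimal sketch of forcing the install past such dependency complaints (the wildcard stands for whichever server RPMs were downloaded; this is one way to do it, not necessarily the cleanest):

rpm -ivh --nodeps lustre-*.rpm   # skip the zfs-osd / libmpi.so.12 dependency checks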


If the following appears while installing the InfiniBand driver:

Module mlx4_core belong to kernel which is not a part of ML[FAILED] skipping…

Module mlx4_ib belong to kernel which is not a part of MLNX[FAILED] skipping…

Module mlx4_core belong to kernel which is not a part of ML[FAILED] skipping…

Module mlx4_en belong to kernel which is not a part of MLNX[FAILED] skipping…

Module mlx5_core belong to kernel which is not a part of ML[FAILED] skipping…

Module mlx5_ib belong to kernel which is not a part of MLNX[FAILED] skipping…

Module mlx5_fpga_tools does not exist, skipping… [FAILED]

Module ib_umad belong to kernel which is not a part of MLNX[FAILED] skipping…

Module ib_uverbs belong to kernel which is not a part of ML[FAILED] skipping…

Module ib_ipoib belong to kernel which is not a part of MLN[FAILED]skipping…

Loading HCA driver and Access Layer:                       [  OK  ]

Module rdma_cm belong to kernel which is not a part of MLNX[FAILED]skipping…

Module ib_ucm does not exist, skipping…                    [FAILED]

Module rdma_ucm belong to kernel which is not a part of MLN[FAILED]skipping…

There are two potential solutions.

As a workaround for the Bright packages, perform the following:

(1). In /etc/init.d/openibd, on line 132, change FORCE=0 to FORCE=1. This causes openibd to ignore the kernel difference, but it then relies on weak-updates.

(2). Edit /etc/infiniband/openib.conf and set UCM_LOAD=no and MLX5_FPGA_LOAD=no. As most customers aren't using legacy cards or FPGAs, this should not be an issue.

(3). Restart the openibd service.
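A minimal scripted version of steps (1) and (2), assuming the stock file layout (the line number and variable names are taken from the text above); step (3) is the service restart shown below:

sed -i '132s/FORCE=0/FORCE=1/' /etc/init.d/openibd
sed -i 's/^UCM_LOAD=.*/UCM_LOAD=no/' /etc/infiniband/openib.conf
sed -i 's/^MLX5_FPGA_LOAD=.*/MLX5_FPGA_LOAD=no/' /etc/infiniband/openib.conf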

Once complete, the Mellanox OFED modules should load as expected.

# service openibd start

Loading HCA driver and Access Layer:                       [  OK  ]

My two I/O servers each have two network cards: a 10 GbE fiber card that connects the original 20 machines, and an HCA card that connects, over InfiniBand, the 10 servers bought later. The servers provide file service to the machines on both networks at the same time.

My /etc/modprobe.d/lustre.conf looks like this:

options lnet networks="tcp0(ens2f0),o2ib0(ib0),tcp1(enp61s0f0)"

Although no traffic will ever use tcp0(ens2f0), it still has to be listed. tcp1 is the interface of my fiber card. The solution I found online is described as follows:

Lustre:  

In recent Lustre releases, certain filesystems cannot be mounted because of a communication error between clients and servers, depending on the LNet configuration.

Suppose a filesystem runs on a host with two interfaces, say tcp0 and tcp1, and the targets are set up to reply on both interfaces (formatted with --servicenode IP1@tcp0,IP2@tcp1).

If a client is connected only to tcp0 and tries to mount this filesystem, the mount fails with an I/O error because the client tries to connect over the tcp1 interface.

Mount failed:

# mount -t lustre x.y.z.a@tcp:/lustre /mnt/lustre

mount.lustre: mount x.y.z.a@tcp:/lustre at /mnt/client failed: Input/output error

Is the MGS running?

dmesg shows that communication fails using the wrong IP

[422880.743179] LNetError: 19787:0:(lib-move.c:1714:lnet_select_pathway()) no route to a.b.c.d@tcp1

# lnetctl peer show

peer:

 - primary nid: a.b.c.d@tcp1

 Multi-Rail: False

 peer ni:

 - nid: x.y.z.a@tcp

 state: NA

 - nid: 0@<0:0>

 state:

Ping is OK though:

# lctl ping x.y.z.a@tcp

12345-0@lo

12345-a.b.c.d@tcp1

12345-x.y.z.a@tcp

server (lustre-2.10.4-1.el7.x86_64):

options lnet networks=tcp0(en0),o2ib0(in0)

client (lustre-2.12.5-RC1-0.el7.x86_64):

options lnet networks="o2ib(ib0)"

These two workarounds seem to work (only very limited testing so far):

Configuring LNET tcp on the client (although I actually only want to use IB):

options lnet networks="o2ib(ib0),tcp(enp3s0f0)" 

Executing this before the actual Lustre mount:

lnetctl set discovery 0
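As far as I know, lnetctl set discovery 0 does not survive a module reload; since Lustre 2.11 the same effect can reportedly be made persistent through a module parameter (an assumption worth verifying against your release):

options lnet networks="o2ib(ib0)" lnet_peer_discovery_disabled=1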


My MDT filesystem after configuration, as shown by tunefs.lustre /dev/sdb1:

checking for existing Lustre data: found

Reading CONFIGS/mountdata

   Read previous values:

Target:     lustre-MDT0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x5

              (MDT MGS )

Persistent mount opts: user_xattr,errors=remount-ro

Parameters:

   Permanent disk data:

Target:     lustre-MDT0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x5

              (MDT MGS )

Persistent mount opts: user_xattr,errors=remount-ro

Parameters:

The configured OST is as follows:

tunefs.lustre  /dev/sdb2

checking for existing Lustre data: found

Reading CONFIGS/mountdata


   Read previous values:

Target:     lustre-OST0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x2

              (OST )

Persistent mount opts: ,errors=remount-ro

Parameters: mgsnode=10.10.1.101@tcp

   Permanent disk data:

Target:     lustre-OST0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x2

              (OST )

Persistent mount opts: ,errors=remount-ro

Parameters: mgsnode=10.10.1.101@tcp

(10.10.1.101 is the IP of my fiber card.)
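For reference, a hypothetical sketch of the mkfs.lustre commands that would produce the MDT and OST shown above (standard mkfs.lustre flags; device names as in my setup):

mkfs.lustre --fsname=lustre --mgs --mdt --index=0 /dev/sdb1
mkfs.lustre --fsname=lustre --ost --index=0 --mgsnode=10.10.1.101@tcp /dev/sdb2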

/etc/rc.local contains:

modprobe lnet

modprobe lustre

mount -t lustre /dev/sdb1 /mdt

mount -t lustre /dev/sdb2 /ost0

mount -t lustre /dev/sdb3 /ost1
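After rc.local runs, a quick sanity check that all three targets actually mounted:

df -t lustre   # should list /mdt, /ost0, and /ost1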

On the second I/O server:

cat /etc/modprobe.d/lustre.conf 

options lnet networks="tcp0(ens2f0),o2ib0(ib0),tcp1(enp61s0f0)"


tunefs.lustre

checking for existing Lustre data: found

Reading CONFIGS/mountdata

   Read previous values:

Target:     lustre-OST0002

Index:      2

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x2

              (OST )

Persistent mount opts: ,errors=remount-ro

Parameters: mgsnode=10.10.1.101@tcp

   Permanent disk data:

Target:     lustre-OST0002

Index:      2

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x2

              (OST )

Persistent mount opts: ,errors=remount-ro

Parameters: mgsnode=10.10.1.101@tcp

On a client connected via the fiber card:

modprobe lnet

modprobe lustre

mount -t lustre fio01@tcp:/lustre /lustre

where fio01 resolves to 10.10.1.101.
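A minimal sketch of that name resolution, assuming it goes through /etc/hosts rather than DNS:

10.10.1.101   fio01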

On a client connected via the HCA card:

modprobe lnet

modprobe lustre

lnetctl set discovery 0

mount -t lustre 11.11.1.101@o2ib0:/lustre /lustre

umount -l /lustre

mount -t lustre 11.11.1.101@o2ib0:/lustre /lustre

Here 11.11.1.101 is the IP of the HCA on the MDS server.

(Strangely, the filesystem has to be unmounted once and remounted before access becomes stable.)
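Once the second mount succeeds, the client can confirm it sees every target (lfs ships with the Lustre client packages):

lfs df -h   # should list the MDT and all OSTs with their capacities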


