Bartas1979
Mission Specialist
  • 4,669 Views

Global lock failed: check that global lockspace is started


Hi all,

I'm preparing for the EX436 exam on RH labs and testing many scenarios. One of these scenarios is to prepare a cluster with two IP addresses assigned to each node and then use an iSCSI target with dlm, multipath and LVM.

Everything goes smoothly until I start the LVM configuration. I get the error "Global lock failed: check that global lockspace is started".

Below are the steps I use to reproduce the error:

1. pcs cluster setup prod1 \
node1.mydomain.com addr=172.25.250.10 addr=172.25.250.50 \
node2.mydomain.com addr=172.25.250.11 addr=172.25.250.51 \
node3.mydomain.com addr=172.25.250.12 addr=172.25.250.52
 
2. pcs stonith create fence_node1 fence_ipmilan pcmk_host_list=node1.mydomain.com  ip=192.168.0.101 username=myadmin password=secret_password lanplus=1 power_timeout=180
pcs stonith create fence_node2 fence_ipmilan pcmk_host_list=node2.mydomain.com  ip=192.168.0.101 username=myadmin password=secret_password lanplus=1 power_timeout=180
pcs stonith create fence_node3 fence_ipmilan pcmk_host_list=node3.mydomain.com  ip=192.168.0.101 username=myadmin password=secret_password lanplus=1 power_timeout=180
 
3. cat /etc/corosync/corosync.conf
nodelist {
node {
ring0_addr: 172.25.250.10
ring1_addr: 172.25.250.50
name: node1.mydomain.com
nodeid: 1
}
node {
ring0_addr: 172.25.250.11
ring1_addr: 172.25.250.51
name: node2.mydomain.com
nodeid: 2
}
node {
ring0_addr: 172.25.250.12
ring1_addr: 172.25.250.52
name: node3.mydomain.com
nodeid: 3
}
}
 
4. dnf install -y dlm iscsi-initiator-utils lvm2-lockd device-mapper-multipath gfs2-utils [on all nodes]
 
5. Edit the /etc/iscsi/initiatorname.iscsi file and set the IQN for the client initiator
(iqn.2023-11.com.mydomain:<short_hostname>) [on all nodes]
 
6. systemctl enable --now iscsid [on all nodes]
 
7. iscsiadm -m discovery -t st -p 192.168.1.15
iscsiadm -m node -T iqn.2023-11.com.mydomain:store-prod -p 192.168.1.15 -l
iscsiadm -m discovery -t st -p 192.168.2.15
iscsiadm -m node -T iqn.2023-11.com.mydomain:store-prod -p 192.168.2.15 -l
[on all nodes]
 
8. mpathconf --enable
 
9. iscsiadm -m session -P 3
 
10. udevadm info /dev/sdb | grep ID_SERIAL=
 
11. Edit the /etc/multipath.conf
multipaths {
multipath {
wwid 3600140562aeac25dc4c4eb5842574c7a
alias diska
}
}
 
12. Copy /etc/multipath.conf to the other two cluster nodes, node2 and node3:
scp /etc/multipath.conf root@node2:/etc/
scp /etc/multipath.conf root@node3:/etc/
 
13. systemctl enable --now multipathd [on all nodes]
 
14. pcs resource create dlm ocf:pacemaker:controld --group=locking
 
15. pcs resource create lvmlockd ocf:heartbeat:lvmlockd --group=locking
 
16. pcs resource clone locking interleave=true
 
17. pvcreate /dev/mapper/diska
 
Global lock failed: check that global lockspace is started
 
Could you help me find where I made a mistake? Thank you.
16 Replies
Boolabs
Cadet
  • 1,208 Views

Hello,

In my scenario, I have a two-node HA cluster with a separate QNet device for quorum. The cluster nodes have two network rings for cluster communication. I was running into the same error when trying to configure a GFS2 volume ("Global lock failed: check that global lockspace is started").

What appeared to have solved it for me is a combination of:

1. Enabling the "sctp" kernel module by loosely following: https://access.redhat.com/solutions/6625041 (note that the module will initially be blacklisted)

2. Adding the line "rrp_mode: passive" in the "totem{}" section of /etc/corosync/corosync.conf on each cluster node.

This was in addition to setting "use_lvmlockd = 1" in "/etc/lvm/lvm.conf".
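
To confirm the lvm.conf setting on each node, a small check like the one below may help. This is only a minimal sketch: the function name lvmlockd_enabled_in is hypothetical, and it just looks for an uncommented "use_lvmlockd = 1" line in the file it is given. It does not ask LVM for its effective configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LVM): succeeds if the given lvm.conf-style
# file contains an uncommented "use_lvmlockd = 1" line.
lvmlockd_enabled_in() {
    local conf_file="$1"
    # Optional leading whitespace, the key, "=", then the value 1.
    # Commented lines ("# use_lvmlockd = 1") do not match because "#" is
    # neither whitespace nor the start of the key.
    grep -Eq '^[[:space:]]*use_lvmlockd[[:space:]]*=[[:space:]]*1([[:space:]]|$)' "$conf_file"
}

# Example usage on a node (path taken from the post):
#   lvmlockd_enabled_in /etc/lvm/lvm.conf && echo "use_lvmlockd is set" || echo "use_lvmlockd is NOT set"
```

Running it on every cluster node is a quick way to rule out one node having been missed when the file was edited.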

Only then did the "vgcreate ..." command work:

[booboo@server-01 ~]$ sudo vgcreate --shared iscsi-shared /dev/mapper/mpatha
Volume group "iscsi-shared" successfully created
VG iscsi-shared starting dlm lockspace
Starting locking. Waiting until locks are ready...
[booboo@server-01 ~]$ sudo vgdisplay
Devices file PVID ttFV7VFjsRfGtnTkYvOvbdtknr2yfpjn last seen on /dev/sdc not found.
--- Volume group ---
VG Name iscsi-shared
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 2
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 1
Act PV 1
VG Size <50.00 GiB
PE Size 4.00 MiB
Total PE 12799
Alloc PE / Size 0 / 0
Free PE / Size 12799 / <50.00 GiB
VG UUID pvfG10-GM4T-VY4N-GnQN-Y5LH-7iV2-gcASx8

NOTE: The device being set up here (/dev/mapper/mpatha) is a multipath device backed by an iSCSI target on a remote Ceph instance.

This was all essentially reverse-engineered through trial and error. I believe that in a cluster with multiple corosync rings configured, "dlm" must use the "sctp" protocol for lock management between cluster nodes. Without the "rrp_mode: passive" setting it defaults to TCP, which does not work on a multi-homed server (you may see messages to that effect on the console and in "dmesg").

See section 19.1 of https://documentation.suse.com/sle-ha/15-SP3/html/SLE-HA-all/cha-ha-storage-dlm.html for a little more detail.

With these two pieces in place ("sctp" and "rrp_mode"), dlm should show: "dlm: Using SCTP for communications" in "dmesg" and the subsequent commands for creating shared volume groups should work.
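
The dmesg check described here can be scripted roughly as follows. It is a sketch under assumptions: the function name dlm_transport_from_log is hypothetical, the SCTP message text is the one quoted in this post, and the matching TCP message ("dlm: Using TCP for communications") is assumed by analogy rather than taken from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads kernel log text on stdin and prints the transport
# named in the last "dlm: Using ... for communications" line ("sctp" or "tcp"),
# or "unknown" if no such line is present.
dlm_transport_from_log() {
    awk '
        /dlm: Using SCTP for communications/ { proto = "sctp" }
        # Assumed wording for the TCP case; adjust if your kernel logs differ.
        /dlm: Using TCP for communications/  { proto = "tcp" }
        END { print (proto == "" ? "unknown" : proto) }
    '
}

# Example usage on a cluster node (reading the kernel log may require root):
#   dmesg | dlm_transport_from_log
```

An "unknown" result most likely means dlm has not logged a transport yet, for example because the lockspace has not been started on that node.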

jeet11
Mission Specialist
  • 953 Views

Hello Everyone,

I'm also facing the same issue: I'm not able to create a PV or VG, even though use_lvmlockd is already set to 1 and the service is running.

Is there any solution?

Thanks

jeet11
Mission Specialist
  • 952 Views

pvcreate /dev/mapper/mpatha

start a lock manager, lvmlockd did not find one running.
Global lock failed: check global lockspace is started.
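
The first message points at the lock manager rather than at lvmlockd itself. In this thread the lock manager is DLM, whose dlm_controld daemon is managed by the ocf:pacemaker:controld resource. A rough way to see whether such a daemon is present on a node is sketched below. The function name lock_manager_listed is hypothetical, and looking at process names is only a proxy for what lvmlockd actually checks.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads one process name per line on stdin and succeeds
# if a lock manager daemon usable by lvmlockd appears in the list.
# Assumption: dlm_controld (DLM) or sanlock is the daemon to look for.
lock_manager_listed() {
    grep -Eqx '[[:space:]]*(dlm_controld|sanlock)[[:space:]]*'
}

# Example usage on a cluster node:
#   ps -e -o comm= | lock_manager_listed && echo "lock manager found" || echo "no lock manager process"
```

If no lock manager process shows up, the state of the dlm resource in "pcs status --full" is the next thing to look at.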

jeet11
Mission Specialist
  • 952 Views

I also tried rebooting both cluster nodes.
use_lvmlockd = 1  <<---- I set this in the /etc/lvm/lvm.conf file.
I manually restarted the lvmlockd service as well.
I can also see the "mpatha" device on both cluster nodes using "lsblk".
But none of the above solved the issue in my case. I'm preparing for the Red Hat EX436 exam. Any solution will be appreciated.

Thank you in advance.

AlexonOliveira
Flight Engineer
  • 2,233 Views

I could reproduce the reported issue, as follows:

 

[root@nodea ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 10G 0 disk
└─diska 253:0 0 10G 0 mpath
sdb 8:16 0 10G 0 disk
└─diska 253:0 0 10G 0 mpath
vda 252:0 0 10G 0 disk
├─vda1 252:1 0 1M 0 part
├─vda2 252:2 0 100M 0 part /boot/efi
└─vda3 252:3 0 9.9G 0 part /

[root@nodea ~]# pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s on-fail=fence --group=locking

[root@nodea ~]# pcs resource create lvmlockd ocf:heartbeat:lvmlockd op monitor interval=30s on-fail=fence --group=locking

[root@nodea ~]# pcs resource clone locking interleave=true

[root@nodea ~]# pvs
Skipping global lock: lockspace not found or started

[root@nodea ~]# pvs
Global lock failed: error -210

[root@nodea ~]# pvs
Skipping global lock: lockspace not found or started

[root@nodea ~]# pvcreate /dev/mapper/diska
Global lock failed: check that global lockspace is started

 

As you can see, I'm running my cluster with two links for each node. Apparently, that's the issue, according to the following KB:

https://access.redhat.com/solutions/5099971
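
To check whether a cluster has this multi-link layout, the ringN_addr entries per node can be counted. The snippet below is a minimal sketch: the function name links_per_node is hypothetical, and it assumes the nodelist uses the usual "key: value" syntax with one node { } block per node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads corosync.conf text on stdin and prints
# "<node name> <number of ringN_addr entries>" for each node { } block.
links_per_node() {
    awk '
        # "node {" opens a node block ("nodelist {" does not match).
        /^[[:space:]]*node[[:space:]]*[{]/ { in_node = 1; links = 0; name = "(unnamed)"; next }
        # Count every ringN_addr key inside the block.
        in_node && /^[[:space:]]*ring[0-9]+_addr[[:space:]]*:/ { links++ }
        # Remember the node name (text after the first colon).
        in_node && /^[[:space:]]*name[[:space:]]*:/ { sub(/^[^:]*:[[:space:]]*/, ""); name = $0 }
        # "}" closes the block: report and reset.
        in_node && /^[[:space:]]*[}]/ { print name, links; in_node = 0 }
    '
}

# Example usage on a cluster node:
#   links_per_node < /etc/corosync/corosync.conf
```

Any node that reports more than 1 link has the two-link configuration described in this post.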

So, to solve it, remove the extra ring, like the following example:

 

[root@nodea ~]# pcs cluster corosync | grep nodelist -A21
nodelist {
node {
ring0_addr: 192.168.0.10
ring1_addr: 192.168.2.10
name: nodea.private.example.com
nodeid: 1
}

node {
ring0_addr: 192.168.0.11
ring1_addr: 192.168.2.11
name: nodeb.private.example.com
nodeid: 2
}

node {
ring0_addr: 192.168.0.12
ring1_addr: 192.168.2.12
name: nodec.private.example.com
nodeid: 3
}
}

[root@nodea ~]# pcs cluster link remove 1
Sending updated corosync.conf to nodes...
nodea.private.example.com: Succeeded
nodeb.private.example.com: Succeeded
nodec.private.example.com: Succeeded
nodea.private.example.com: Corosync configuration reloaded

[root@nodea ~]# pcs cluster corosync | grep nodelist -A18
nodelist {
node {
ring0_addr: 192.168.0.10
name: nodea.private.example.com
nodeid: 1
}

node {
ring0_addr: 192.168.0.11
name: nodeb.private.example.com
nodeid: 2
}

node {
ring0_addr: 192.168.0.12
name: nodec.private.example.com
nodeid: 3
}
}

[root@nodea ~]# pcs cluster stop --all
nodea.private.example.com: Stopping Cluster (pacemaker)...
nodec.private.example.com: Stopping Cluster (pacemaker)...
nodeb.private.example.com: Stopping Cluster (pacemaker)...
nodea.private.example.com: Stopping Cluster (corosync)...
nodeb.private.example.com: Stopping Cluster (corosync)...
nodec.private.example.com: Stopping Cluster (corosync)...

[root@nodea ~]# reboot

[root@nodeb ~]# reboot

[root@nodec ~]# reboot

[root@nodea ~]# pcs status --full
Cluster name: cluster1
Cluster Summary:
* Stack: corosync
* Current DC: nodec.private.example.com (3) (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
* Last updated: Wed May 29 21:38:34 2024
* Last change: Wed May 29 21:31:46 2024 by root via crm_resource on nodea.private.example.com
* 3 nodes configured
* 9 resource instances configured

Node List:
* Online: [ nodea.private.example.com (1) nodeb.private.example.com (2) nodec.private.example.com (3) ]

Full List of Resources:
* fence_nodea (stonith:fence_ipmilan): Started nodea.private.example.com
* fence_nodeb (stonith:fence_ipmilan): Started nodeb.private.example.com
* fence_nodec (stonith:fence_ipmilan): Started nodec.private.example.com
* Clone Set: locking-clone [locking]:
* Resource Group: locking:0:
* dlm (ocf::pacemaker:controld): Started nodec.private.example.com
* lvmlockd (ocf::heartbeat:lvmlockd): Started nodec.private.example.com
* Resource Group: locking:1:
* dlm (ocf::pacemaker:controld): Started nodea.private.example.com
* lvmlockd (ocf::heartbeat:lvmlockd): Started nodea.private.example.com
* Resource Group: locking:2:
* dlm (ocf::pacemaker:controld): Started nodeb.private.example.com
* lvmlockd (ocf::heartbeat:lvmlockd): Started nodeb.private.example.com

Migration Summary:

Tickets:

PCSD Status:
nodea.private.example.com: Online
nodeb.private.example.com: Online
nodec.private.example.com: Online

Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled

[root@nodea ~]# pvs

[root@nodea ~]# pvcreate /dev/mapper/diska
Physical volume "/dev/mapper/diska" successfully created.

[root@nodea ~]# vgcreate --shared vg1 /dev/mapper/diska
Volume group "vg1" successfully created
VG vg1 starting dlm lockspace
Starting locking. Waiting until locks are ready...

[root@nodea ~]# ssh nodeb vgchange --lock-start vg1
VG vg1 starting dlm lockspace
Starting locking. Waiting until locks are ready...

[root@nodea ~]# ssh nodec vgchange --lock-start vg1
VG vg1 starting dlm lockspace
Starting locking. Waiting until locks are ready...

[root@nodea ~]# lvcreate --activate sy -L4G -n lv1 vg1
Logical volume "lv1" created.

[root@nodea ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/mapper/diska vg1 lvm2 a-- <10.00g <6.00g

[root@nodea ~]# vgs
VG #PV #LV #SN Attr VSize VFree
vg1 1 1 0 wz--ns <10.00g <6.00g

[root@nodea ~]# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lv1 vg1 -wi-a----- 4.00g
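
For reference, the LVM part of the session above boils down to a short command sequence. The sketch below only prints that sequence so it can be reviewed before running anything. The function name print_shared_vg_plan and its argument order are hypothetical; the commands themselves are the ones shown in the session.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not run) the shared-VG command sequence
# for a VG name, device, LV name, LV size, and the other cluster nodes that
# still need the VG lockspace started.
print_shared_vg_plan() {
    local vg="$1" dev="$2" lv="$3" size="$4"
    shift 4
    echo "pvcreate $dev"
    echo "vgcreate --shared $vg $dev"
    local node
    for node in "$@"; do
        # Each remaining node starts the lockspace for the shared VG.
        echo "ssh $node vgchange --lock-start $vg"
    done
    echo "lvcreate --activate sy -L$size -n $lv $vg"
}

# Example: the values used in the session above.
print_shared_vg_plan vg1 /dev/mapper/diska lv1 4G nodeb nodec
```

The first node does not need a separate "vgchange --lock-start" in this sequence because, as the session output shows, "vgcreate --shared" already starts the dlm lockspace there.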

Alexon Oliveira
Jeet2
Cadet
  • 735 Views

Hi, I'm also facing the same issue.
If anyone knows the solution, please share it.

Thank you in advance for your time and help.

sgvredhat
Cadet
  • 899 Views

I can see a KB article which explains that DLM does not support configuring more than one ring. Please refer to this KB article for details:

 

https://access.redhat.com/solutions/5099971

 

Here is another policy document with more details: https://access.redhat.com/login?redirectTo=https%3A%2F%2Faccess.redhat.com%2Fsolutions%2F5099971

 

 
