Hi, everyone. I’m using RHEL 7.9 Virtual Server Instances (VSIs) that I create through the IBM Portal. My goal is to back up a VSI by creating an image of it with the dd command, uploading that image to an IBM Cloud Object Storage (COS) bucket, converting it to a format IBM Cloud accepts (from .img to .qcow2), creating a custom image from it, and using that custom image to create a newly restored VSI.
The problem is that when I create a new VSI from the custom image, the new VSI fails to boot: it keeps stopping with errors during startup. Screenshots are attached.
The problem is intermittent, though: occasionally the VSI does boot successfully, but failures are far more frequent than successes.
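For reference, the imaging and conversion steps above might look roughly like the sketch below. The function names, /dev/vda and the file paths are assumptions for illustration, not details from the actual setup, and the upload to COS is left out.

```shell
#!/usr/bin/env bash
# Sketch only: function names, device and paths are illustrative assumptions.

# Copy a block device (or any file) into a raw image file.
# conv=fsync makes dd flush the output to disk before it exits.
image_raw() {
    local src="$1" dst="$2"
    dd if="$src" of="$dst" bs=4M conv=fsync status=none
}

# Convert a raw image to qcow2, then let qemu-img check the result.
raw_to_qcow2() {
    local raw="$1" qcow="$2"
    qemu-img convert -f raw -O qcow2 "$raw" "$qcow" && qemu-img check "$qcow"
}

# Example (as root; the paths are placeholders):
#   image_raw /dev/vda /mnt/backup/vsi.img
#   raw_to_qcow2 /mnt/backup/vsi.img /mnt/backup/vsi.qcow2
```

Keeping the two steps as separate functions makes it easier to tell whether a bad image comes from the copy or from the conversion.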
Solutions I’ve tried:
- Installing cloud-init
- Installing the virtio drivers
- Following the instructions on this page: https://www.ibm.com/docs/en/cic/1.1.2?topic=zvm-installation-configuration-cloud-init-linux-server
Solutions that have worked but very inconsistently:
- Booting into rescue mode, running “dracut -f”, and then rebooting into normal mode
- Booting directly into rescue mode (by changing the grub file on the source machine) and then rebooting into normal mode
Note: Both of these solutions have worked only once or twice. Repeating them results in a blank screen, visible through the KVM Console on the IBM Portal.
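Since “dracut -f” helps only some of the time, one thing that could be checked is whether the rebuilt initramfs actually contains the virtio modules. This is a guess at a cause, not something the thread confirms. A minimal sketch, assuming a helper name (missing_virtio) and a module list chosen for illustration:

```shell
#!/usr/bin/env bash
# Sketch only: the function name and module list are assumptions.

# Read an lsinitrd listing on stdin and print each virtio module
# from the list that does not appear in it.
missing_virtio() {
    local listing mod
    listing=$(cat)
    for mod in virtio_blk virtio_scsi virtio_net virtio_pci; do
        printf '%s\n' "$listing" | grep -q "${mod}\.ko" || echo "$mod"
    done
}

# Usage on the source VSI (as root):
#   lsinitrd /boot/initramfs-$(uname -r).img | missing_virtio
# If modules are reported, the initramfs can be rebuilt with them forced in:
#   dracut -f --add-drivers "virtio_blk virtio_scsi virtio_net virtio_pci"
```

A driver that is compiled into the kernel rather than built as a module would also be listed as missing here, so an entry in the output is a hint to look closer, not proof of a problem.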
The final results:
- Blank screen shown on KVM Console
- Errors upon boot (Refer to attachments)
Note: These problems occur on RHEL 7.9, 8.4, and 8.6. RHEL 9 seems to show no problems at all.
I would appreciate any help, as I have tried countless times and the problem keeps coming back.
Hello, could you edit the boot entry and try booting into emergency.target by adding systemd.unit=emergency.target to the kernel command line?
Then look at the full errors, e.g. upload the complete error text here (the full output of dmesg; if possible, start the network and upload the text file via ssh).
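Collecting those logs from the emergency shell could look something like the sketch below. The function name and the /run/bootlogs default are assumptions. The output goes under /run because the root filesystem is typically mounted read-only in emergency mode, while /run is a tmpfs.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and default directory are assumptions.

# Save kernel and journal output for the current boot as text files
# and print the directory they were written to.
collect_boot_logs() {
    local out="${1:-/run/bootlogs}"
    mkdir -p "$out" || return 1
    # Keep going if a command fails; its error message lands in the file.
    dmesg > "$out/dmesg.txt" 2>&1 || true
    journalctl -xb --no-pager > "$out/journal.txt" 2>&1 || true
    echo "$out"
}

# Usage (as root in the emergency shell):
#   collect_boot_logs
# Once networking is up, copy the files off, for example:
#   scp /run/bootlogs/*.txt user@host:    # user@host is a placeholder
```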
My second guess is that the copy made with dd into the .img file is broken, leaving the XFS filesystem with bad metadata. I can't help much further, because it's hard to guess the problem without hands-on access to the actual VMs.
I think the best way to solve it is to ask the IBM Cloud team (if you have a subscription) or Red Hat (also if you have a subscription).
There are probably other experts here who can help. But to be fair, it's hard to guess with such limited data.
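One way to test the broken-copy guess is to compare checksums of the source and the image. A minimal sketch, with same_digest as a made-up helper name; it only gives a meaningful answer if the source did not change between the dd run and the check (for example, a disk imaged from rescue mode or from a snapshot).

```shell
#!/usr/bin/env bash
# Sketch only: the function name is an assumption.

# Succeed only if both files (or block devices) are readable and
# have the same SHA-256 digest.
same_digest() {
    local a b
    a=$(sha256sum < "$1" | cut -d' ' -f1) || return 1
    b=$(sha256sum < "$2" | cut -d' ' -f1) || return 1
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage (paths are placeholders):
#   same_digest /dev/vda /mnt/backup/vsi.img && echo "copy matches"
```

If the disk was imaged while its filesystems were mounted read-write, the digests will normally differ even when dd itself worked. In that case a read-only filesystem check such as xfs_repair -n against the XFS partition inside the image may say more about the metadata.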
Hi. I appreciate your comprehensive reply.
I tried booting into emergency mode and got this error once the VSI booted up (screenshot attached). This is the screen that shows up after I add systemd.unit=emergency.target and press Ctrl+X. But if I edit the GRUB menu again, "systemd.unit=emergency.target" is no longer there, and the VSI still gives the error shown in the screenshot.
Your second guess makes a lot of sense, but it seems unlikely given all the tests I've run on other flavours of Linux as well as RHEL 9. Only the 7.x and 8.x versions of RHEL show these issues.