Author: Brandon B. Jozsa

"The world hates change, yet it is the only thing that has brought progress."
- Charles Kettering

Table of Contents

- Part I: Introduction
- Part II: Gathering Drive Information
    - Preparation: Gather information about your environment
- Part III: Creating a MachineConfig
    - Using Butane to Create a MachineConfig
    - Verification
- Part IV: Conclusion

Part I: Introduction

The other day I was working on OpenShift 4.17, and attempting to install the OpenShift Data Foundation (ODF) operator. There was a subtle change made that wasn't called out in the release notes, so I wanted to discuss it here for a moment.

In OpenShift Data Foundation 4.17, for new deployments (this is key, really), you can no longer install the operator on rotational disks. To be fair, ODF always had this requirement, but it would let you continue to install the operator on rotational disks anyway, with the understanding that Red Hat would not support issues related to storage latency, etc. This should be fairly obvious, but always run ODF on non-rotational drives, such as SSD or preferably NVMe. It should be pointed out that there are scenarios where you can run ODF on SAN-presented disks as well, but this is beyond the scope of what I want to discuss today.

But let me discuss my lab scenario, and why the workaround I am presenting today is acceptable, specifically for my testing purposes. I am running an OpenShift SNO deployment on a Dell T550, with Dell-branded Micron 9200 Pro U.2 drives. These are very capable drives, and well within the specification for running ODF.

However, I am running OpenShift Virtualization within this SNO environment, and because of this, any of the Virtual Machines running within this OpenShift environment will be presented from QEMU as Rotational drives.

Recently, I started deploying OpenShift on top of OpenShift Virtualization, so I can maximize my hardware and so I can wholly rely on OpenShift as my primary virtualization platform for 2025. (I will create a blog post on this very soon, so stay tuned).

So you may see where this is going. On my host OpenShift environment, I am deploying a guest OpenShift environment, and within that guest environment I want to deploy ODF and yes, OpenShift Virtualization (along with some amazing network scenarios I've been working on - again, more to come on this soon).

So in summary, the disks are fast - I don't have to worry about this. But they're being labelled as Rotational disks, when in fact they're actually non-rotational.

I want to change that, and I'll tell you how to do this as well in the next section.

Part II: Gathering Drive Information

The first thing we're going to do is collect a bunch of useful information about our disks. You may not need all of this, but it's always good practice to know your environment before you start changing things. I can't tell you how many times a customer knew their environment well, yet something had changed unexpectedly, and a review of the environment prevented a potential disruption to their proof of concept. Leave nothing to chance.

Preparation: Gather information about your environment

Collect information about your environment.

❯ oc get nodes
NAME       STATUS   ROLES                         AGE   VERSION
cp0        Ready    control-plane,master,worker   66d   v1.29.8+f10c92d
cp1        Ready    control-plane,master,worker   66d   v1.29.8+f10c92d
cp2        Ready    control-plane,master,worker   66d   v1.29.8+f10c92d

Have a look at the physical disks associated with each node.

NODE_NAME=node/cp0

cat <<EOF | oc debug $NODE_NAME
chroot /host
lsblk -o NAME,ROTA,SIZE,TYPE
EOF

The output will look similar to the following:

sh-5.1# lsblk -o NAME,ROTA,SIZE,TYPE
sh-5.1# exit
NAME   ROTA   SIZE TYPE
sda       1   1.8T disk
sdb       1   5.5T disk
sdc       1   7.3T disk
sdd       1   3.6T disk
sde       1   931G disk
|-sde1    1     1M part
|-sde2    1   127M part
|-sde3    1   384M part
`-sde4    1 930.5G part
sr0       1  1024M rom

Removing debug pod ...
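
If you have more than a few disks, it can help to filter that output down to the whole disks that are still flagged as rotational. Here is a minimal sketch, assuming the four-column layout produced by the lsblk command above. The function name list_rotational_disks is my own invention, not part of any tool.

```shell
#!/bin/bash

# Hypothetical helper: reads `lsblk -o NAME,ROTA,SIZE,TYPE` output on stdin
# and prints the name of every whole disk (TYPE "disk") whose ROTA flag is 1.
# Header lines, partitions, and rom devices do not match and are skipped.
list_rotational_disks() {
    awk '$4 == "disk" && $2 == "1" { print $1 }'
}

# Usage on a node (inside `oc debug`, after `chroot /host`):
#   lsblk -o NAME,ROTA,SIZE,TYPE | list_rotational_disks
```

Anything this prints is a candidate for the workaround later in this post, provided you know the underlying storage is actually fast.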

Now I want you to run the following command, which will list all of the mappings between disk assignments (e.g. /dev/sdc) and disk by-path entries (e.g. /dev/disk/by-path/).

NODE_NAME=node/cp0

cat <<EOF | oc debug $NODE_NAME
chroot /host
ls -aslc /dev/disk/by-path/
EOF

The output will look like the following:

sh-5.1# ls -aslc /dev/disk/by-path/
sh-5.1# exit
total 0
0 drwxr-xr-x. 2 root root 260 Sep  7 00:30 .
0 drwxr-xr-x. 9 root root 180 Sep  7 00:30 ..
0 lrwxrwxrwx. 1 root root   9 Sep  7 00:30 pci-0000:00:17.0-ata-8 -> ../../sr0
0 lrwxrwxrwx. 1 root root   9 Sep  7 00:30 pci-0000:00:17.0-ata-8.0 -> ../../sr0
0 lrwxrwxrwx. 1 root root   9 Sep  7 00:30 pci-0000:1a:00.0-scsi-0:2:0:0 -> ../../sda
0 lrwxrwxrwx. 1 root root   9 Sep  7 00:30 pci-0000:1a:00.0-scsi-0:2:1:0 -> ../../sdc
0 lrwxrwxrwx. 1 root root   9 Sep  7 00:30 pci-0000:1a:00.0-scsi-0:2:2:0 -> ../../sdb
0 lrwxrwxrwx. 1 root root   9 Sep  7 00:30 pci-0000:1a:00.0-scsi-0:2:3:0 -> ../../sdd
0 lrwxrwxrwx. 1 root root   9 Sep  7 00:30 pci-0000:1a:00.0-scsi-0:2:4:0 -> ../../sde
0 lrwxrwxrwx. 1 root root  10 Sep  7 00:30 pci-0000:1a:00.0-scsi-0:2:4:0-part1 -> ../../sde1
0 lrwxrwxrwx. 1 root root  10 Sep  7 00:30 pci-0000:1a:00.0-scsi-0:2:4:0-part2 -> ../../sde2
0 lrwxrwxrwx. 1 root root  10 Sep  7 00:30 pci-0000:1a:00.0-scsi-0:2:4:0-part3 -> ../../sde3
0 lrwxrwxrwx. 1 root root  10 Sep  7 00:30 pci-0000:1a:00.0-scsi-0:2:4:0-part4 -> ../../sde4

Removing debug pod ...
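
If you would rather see those mappings as simple "device, path" pairs, the symlinks can be resolved directly. This is only a sketch: map_by_path is a hypothetical helper name, and it assumes readlink -f is available (it is with GNU coreutils, which is what you get on the node).

```shell
#!/bin/bash

# Hypothetical helper: prints "<kernel device> <by-path link>" for every
# symlink in a directory (defaults to /dev/disk/by-path), sorted by device.
map_by_path() {
    local dir="${1:-/dev/disk/by-path}"
    local link
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob)
        [ -L "$link" ] || continue
        # readlink -f resolves the relative ../../sdX target to a full path
        printf '%s %s\n' "$(readlink -f "$link")" "$link"
    done | sort
}

# Usage on a node (inside `oc debug`, after `chroot /host`):
#   map_by_path
```

The by-path names are tied to the controller and port, so they are a useful cross-check if the /dev/sdX letters ever change between boots.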

Lastly, if you want to collect read/write IOPS information about your drives, in order to determine if you're within specification to deploy ODF on your "Rotational" drives, you can follow the GitHub/Gist link HERE which will walk you through gathering this information.

Part III: Creating a MachineConfig

If you haven't created a MachineConfig yet, it can seem a bit confusing at first. So let me explain this at a very high level, and then I will get into the details for what we need to change below.

As you have probably heard before, RHCOS is an immutable, container-focused operating system. This means that the core OS is read-only and is managed by an operator called the "Machine Config Operator" (MCO). The MCO controls the state of the OS at all times. So if someone makes a breaking change to the OS, the MCO can return that OS to its last known working state. This helps organizations make sweeping changes across their organization centrally, and in a safe way. Red Hat treats RHCOS more like an appliance than a traditional OS like RHEL. When I say that the OS is a container-focused OS, I also mean that there is no package manager like Yum or DNF. So then, how would you get packages or make changes to the OS?

This is where Machine Configs come into play. With MCs, you can create files, add binary utilities, create systemd units, etc.

This is what we're going to leverage, in order to create a boot/watchdog service that can make our HDD devices look like NonRotational devices for ODF.

Using Butane to Create a MachineConfig

Source Documentation: HERE

The first thing I would like you to do is download a Red Hat utility called butane. I'm going to be installing the macOS binary (darwin-amd64 in the example below), so be sure to change the ARCH variable to match your own system if required.
```
ARCH="darwin-amd64"

mkdir -p ~/.local/bin/
curl https://mirror.openshift.com/pub/openshift-v4/clients/butane/latest/butane-$ARCH --output ~/.local/bin/butane
chmod +x ~/.local/bin/butane

butane --version
```

Have a look at the docs for Butane when you get a chance, but I'm going to show you everything you need to do right here. We're going to use the butane utility to create a custom MachineConfig that we can apply to each of our nodes. So let's create a file called 99-fake-nonrotational-mc.bu.

NOTE: You do need to pay attention to the label that I am targeting in my example below. I am targeting machineconfiguration.openshift.io/role: master because I am running this MachineConfig against a compact cluster (3 schedulable control-plane nodes). If you have dedicated infra nodes, or are using a full cluster deployment, change your label accordingly.

cat << 'EOF' > 99-fake-nonrotational-mc.bu
variant: openshift
version: 4.17.0
metadata:
  name: 99-fake-nonrotational
  labels:
    machineconfiguration.openshift.io/role: master
storage:
  files:
    - path: /etc/fake-nonrotational.sh
      mode: 0755
      contents:
        inline: |
          #!/bin/bash

          # Find disks with a size of 120G:
          target_disks=$(lsblk -dn -o NAME,SIZE | awk '$2 == "120G" {print $1}')

          # Mark these disks as NonRotational:
          for disk in $target_disks; do
              echo "Changing disk: /dev/$disk to non-rotational."
              echo 0 > /sys/block/$disk/queue/rotational
          done
systemd:
  units:
    - name: fake-nonrotational.service
      enabled: true
      contents: |
        [Unit]
        Description=Force specific-size disks to be nonrotational
        After=local-fs.target
        Wants=local-fs.target

        [Service]
        Type=oneshot
        ExecStart=/etc/fake-nonrotational.sh
        Restart=on-failure
        RestartSec=5s
        RemainAfterExit=true
        User=root

        [Install]
        WantedBy=multi-user.target
EOF

What you'll notice is that we have the script embedded in our butane .bu file (see the example below). If you want to make changes to the script for your use case, make these changes now.

#!/bin/bash

# Find disks with a size of 120G:
target_disks=$(lsblk -dn -o NAME,SIZE | awk '$2 == "120G" {print $1}')

# Mark these disks as NonRotational:
for disk in $target_disks; do
    echo "Changing disk: /dev/$disk to non-rotational."
    echo 0 > /sys/block/$disk/queue/rotational
done

My script example above is doing the following:

- Look for disks whose size is exactly 120G, as reported by lsblk.
- Change each matching disk from a Rotational disk to a NonRotational disk by writing 0 to /sys/block/$disk/queue/rotational (only disks that match 120G are touched).
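
One thing to keep in mind is that matching on the string "120G" depends on how lsblk rounds the size for display. If that worries you, a variant can compare sizes in bytes instead (lsblk -b prints bytes). The sketch below is my own illustration, not what is embedded in the MachineConfig: the function name mark_nonrotational and its optional second argument are assumptions, and it reads NAME/SIZE pairs from stdin so the logic can be exercised without real disks.

```shell
#!/bin/bash

# Hypothetical variant of the script above: match disks by exact size in bytes.
# Reads "NAME SIZE" pairs on stdin (as printed by `lsblk -dn -b -o NAME,SIZE`)
# and writes 0 to <sys_block>/<disk>/queue/rotational for each match.
mark_nonrotational() {
    local target_bytes="$1"
    local sys_block="${2:-/sys/block}"   # overridable so the logic can be tested
    local name size

    while read -r name size; do
        # Skip every disk whose byte size is not an exact match
        [ "$size" = "$target_bytes" ] || continue
        echo "Changing disk: /dev/$name to non-rotational."
        echo 0 > "$sys_block/$name/queue/rotational"
    done
}

# Usage on a node, assuming the disk is exactly 120 GiB (120 * 1024^3 bytes):
#   lsblk -dn -b -o NAME,SIZE | mark_nonrotational 128849018880
```

Whichever matching rule you choose, make sure it cannot accidentally catch a disk that really is rotational.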

Now you can run butane to create the MachineConfig by using the following command.
```
butane 99-fake-nonrotational-mc.bu -o 99-fake-nonrotational-mc.yaml
```

Below is the final output from butane. I am simply going to add an annotations.description to my MachineConfig, and once I've done that I'm going to apply the manifest to my cluster like so.

cat <<EOF | oc apply -f -
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  annotations:
    description: "MC sets disks with size 120G to non-rotational. Documentation found at: https://tinyurl.com/mc-nonrotational-hack"
  name: 99-fake-nonrotational
spec:
  config:
    ignition:
      version: 3.4.0
    storage:
      files:
        - contents:
            compression: gzip
            source: data:;base64,H4sIAAAAAAAC/0zNvU7DQBDE8X6fYnBOCkiYTVKCjIRQQBTQ0NGgc+5ir2ztgu9CxNe7I44i9P/fzOyIW1FufeqJZrgRDQiShoS95B4eST4ibIvlanFL2U9dzM8laNzxmNpxQB0UteHh6n59+nj3tMYX/H7A3K3QNKh+YYXPl0k0wy2/5ydEW5vKC0Th/o9eIBgBQNz0huq699qJdiU+B4f4xq7AbFDTerLss5j68aw6uAUuwek9cTvaZvgT/LqLu8gHQME00k8AAAD//0gklW0AAQAA
          mode: 493
          path: /etc/fake-nonrotational.sh
    systemd:
      units:
        - contents: |
            [Unit]
            Description=Force specific-size disks to be nonrotational
            After=local-fs.target
            Wants=local-fs.target

            [Service]
            Type=oneshot
            ExecStart=/etc/fake-nonrotational.sh
            Restart=on-failure
            RestartSec=5s
            RemainAfterExit=true
            User=root

            [Install]
            WantedBy=multi-user.target
          enabled: true
          name: fake-nonrotational.service
EOF
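
Notice that butane stored the script as a gzip-compressed, base64-encoded data URL, which makes it hard to eyeball what will actually land on the node. If you want to double check the payload before applying it, something like the following sketch can decode it. The helper name decode_ignition_source is hypothetical, and it assumes the base64 and gunzip utilities are installed on your workstation.

```shell
#!/bin/bash

# Hypothetical helper: decodes a gzip-compressed, base64-encoded Ignition
# "source" value (data:;base64,<payload>) back into plain text.
decode_ignition_source() {
    local source="$1"
    # Strip everything up to and including the "base64," marker
    local payload="${source#*base64,}"
    printf '%s' "$payload" | base64 --decode | gunzip -c
}

# Usage against the butane output file (which has a single source: line):
#   decode_ignition_source "$(awk '/source:/ {print $2}' 99-fake-nonrotational-mc.yaml)"
```

The decoded text should match the script you embedded in the .bu file.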

Verification

To verify that things are working, use the following command against one of your nodes where the MC was applied.

NODE_NAME=node/cp0

cat <<EOF | oc debug $NODE_NAME
  cat /sys/block/sda/queue/rotational
  cat /sys/block/sdb/queue/rotational
EOF
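
Rather than checking device names one at a time, you could also let lsblk report the flag for every disk and fail loudly if a 120G disk is still rotational. This is a rough sketch under my own assumptions: verify_nonrotational is a made-up name, and it expects the three-column output of lsblk -dn -o NAME,ROTA,SIZE on stdin.

```shell
#!/bin/bash

# Hypothetical check: reads `lsblk -dn -o NAME,ROTA,SIZE` output on stdin and
# returns non-zero if any disk of the given size is still flagged rotational.
verify_nonrotational() {
    local target_size="$1"
    awk -v size="$target_size" '
        $3 == size && $2 == "1" { print $1 " is still rotational"; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}

# Usage on a node (inside `oc debug`, after `chroot /host`):
#   lsblk -dn -o NAME,ROTA,SIZE | verify_nonrotational 120G
```

A zero exit status with no output means every disk of that size is now reported as non-rotational.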

You can also check the journald logs, since this was created as a service.

NODE_NAME=node/cp0

cat <<EOF | oc debug $NODE_NAME
chroot /host

journalctl | grep -e fake-nonrotational
EOF

Part IV: Conclusion

So, there you have it! With Linux, you pretty much control your own destiny. I love little hacks like this, because it enables people to get the most out of their home lab environments. And for larger organizations, the more that people learn in safe lab environments, the more confident they can be when working on production environments.

I hope you've enjoyed today's little blog post. I was pretty happy to get this working, and I'm even happier to pass this information along to others!

It's the holiday season (Dec 21st 2024). Enjoy the time with your family and friends! Until the next post, take care everyone.

- v1k0d3n (Brandon B. Jozsa)