Overview

In Red Hat OpenStack Platform (RHOSP), a Red Hat-supported overcloud is built by the director node (the undercloud) using the TripleO deployment service. A single undercloud can deploy and manage multiple overclouds.

The Deployment template

An overcloud consists of numerous deployed nodes, each configured by assigning roles that specify the services and configuration required for the node to perform that role. Predefined RHOSP role definitions are located in the /usr/share/openstack-tripleo-heat-templates/roles directory and can be listed with the openstack overcloud roles list command.

Check all default roles

(undercloud) [stack@director ~]$ ls /usr/share/openstack-tripleo-heat-templates/roles | wc -l
52

(undercloud) [stack@director ~]$ ls /usr/share/openstack-tripleo-heat-templates/roles
BlockStorage.yaml ComputePPC64LE.yaml DistributedComputeScaleOut.yaml
CellController.yaml ComputeRBDEphemeral.yaml HciCephAll.yaml
CephAll.yaml ComputeRealTime.yaml HciCephFile.yaml
CephFile.yaml ComputeSriov.yaml HciCephMon.yaml
CephObject.yaml ComputeSriovIB.yaml HciCephObject.yaml
CephStorage.yaml ComputeSriovRT.yaml IronicConductor.yaml
Compute.yaml Controller.yaml Messaging.yaml
ComputeAlt.yaml ControllerAllNovaStandalone.yaml Networker.yaml
ComputeDVR.yaml ControllerNoCeph.yaml NetworkerSriov.yaml
ComputeHCI.yaml ControllerNovaStandalone.yaml Novacontrol.yaml
ComputeHCIOvsDpdk.yaml ControllerOpenstack.yaml ObjectStorage.yaml
ComputeInstanceHA.yaml ControllerSriov.yaml README.rst
ComputeLiquidio.yaml ControllerStorageDashboard.yaml Standalone.yaml
ComputeLocalEphemeral.yaml ControllerStorageNfs.yaml Telemetry.yaml
ComputeOvsDpdk.yaml Database.yaml Undercloud.yaml
ComputeOvsDpdkRT.yaml DistributedCompute.yaml UndercloudMinion.yaml
ComputeOvsDpdkSriov.yaml DistributedComputeHCI.yaml
ComputeOvsDpdkSriovRT.yaml DistributedComputeHCIScaleOut.yaml

Check all deployed roles

(undercloud) [stack@director ~]$ grep '^- name:' ~/templates/roles_data.yaml
- name: Controller
- name: Compute
- name: CephStorage
- name: ComputeHCI
  • Controller

    Control plane node with all core services loaded. Provides service APIs and handles database, messaging, and network processing functions.

    In this exercise environment, only one Controller node is provided.

  • Compute

    Standard compute node that acts as a hypervisor, running deployed instances and stacks.

  • CephStorage

    Storage back-end node running a containerized Red Hat Ceph Storage server with multiple object storage daemons (OSDs).

  • ComputeHCI

    Converged node (Compute + CephStorage).

    Compute node with hyperconverged infrastructure, combining compute and Ceph OSD functionality on a single node.
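
The roles deployed in this environment come from ~/templates/roles_data.yaml. As a rough sketch (role names taken from this lab; verify the exact syntax with openstack overcloud roles generate --help), such a file can be generated from the predefined role definitions:

(undercloud) [stack@director ~]$ openstack overcloud roles generate \
> -o ~/templates/roles_data.yaml Controller Compute CephStorage ComputeHCI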

Components and Services

Core Services

  • Block Storage Service (Cinder)

    Cinder -> Ceph back end.

    Swift provides only object storage, so Cinder uses Ceph as its back end for block storage (and Manila uses Ceph for file shares).

  • Image Service (Glance)

    The Image service stores and manages the images used to deploy instances, keeping them in the storage back end (here, the Ceph images pool).

  • Orchestration Service (Heat)

    In the undercloud, Heat templates deploy each overcloud as a stack.

    In the overcloud, Heat templates deploy application workloads as stacks.

    All templates are written in YAML (see the stack-listing sketch after this list).

  • Dashboard Service (Horizon)

    Web UI front end. It works like a console and is more convenient and intuitive for many tasks.

    The Dashboard service provides a browser-based interface for self-service cloud users to create and configure resources and launch and manage instances and stacks.

  • Identity Service (Keystone)

    It acts as a gatekeeper, verifying the identity of whoever is making a request.

    The Identity service provides domain, project, and user authorization for other overcloud services.

  • OpenStack Networking Service (Neutron)

    It’s responsible for the network.

    The OpenStack Networking service manages virtual networking infrastructure. It provides the virtual network and manages the network interfaces.

  • Compute Service (Nova)

    The Compute service schedules and runs on-demand virtual machines.

    It is responsible for creating, starting, stopping, and removing virtual machines.

  • Messaging Service (Oslo)

    It provides a common messaging framework with a compatible and consistent set of communication features across services.

  • Object Store (Swift)

    The original and default object storage back end.

    The Object Store service provides self-service cloud user object storage. Other overcloud services that collect data or objects can use the Object Store service as a back end.
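
As noted in the Orchestration item above, the overcloud itself is deployed as a Heat stack on the undercloud. A minimal sketch for inspecting it, assuming the default stack name overcloud:

# Run with the undercloud (stackrc) credentials
(undercloud) [stack@director ~]$ openstack stack list
(undercloud) [stack@director ~]$ openstack stack show overcloud -c stack_status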

Discretionary and Operational OpenStack Services

  • Bare Metal Service (Ironic)

    Introspects and provisions nodes; used when deploying and scaling the overcloud.

    The Bare Metal service locates and prepares compute resources, including bare metal and virtual machines.

  • File Share Service (Manila)

    It can use either the NFS or CIFS protocols to provide file sharing to instances.

  • Load Balancing Service (Octavia)

    It is an HAProxy-based service (the default driver runs HAProxy inside amphora instances).
    The Load Balancing service enables failover network traffic distribution to instances in a multitier application architecture.
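
A quick sketch for confirming that these optional services respond, assuming the python-octaviaclient and python-manilaclient CLI plugins are installed and the overcloud credentials are loaded:

(overcloud) [stack@director ~]$ openstack loadbalancer list
(overcloud) [stack@director ~]$ manila list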

Managing the Overcloud

Listing all Overcloud nodes

(undercloud) [stack@director ~]$ openstack server list \
> -c 'Name' -c 'Status' -c 'Networks'
+-------------+--------+------------------------+
| Name | Status | Networks |
+-------------+--------+------------------------+
| controller0 | ACTIVE | ctlplane=172.25.249.56 |
| computehci0 | ACTIVE | ctlplane=172.25.249.54 |
| compute1 | ACTIVE | ctlplane=172.25.249.53 |
| compute0 | ACTIVE | ctlplane=172.25.249.59 |
| ceph0 | ACTIVE | ctlplane=172.25.249.58 |
+-------------+--------+------------------------+
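
The same machines are also registered with the undercloud's Bare Metal service. A sketch for cross-checking them (output omitted):

(undercloud) [stack@director ~]$ openstack baremetal node list \
> -c Name -c 'Power State' -c 'Provisioning State'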

High Availability

Red Hat OpenStack Platform uses Pacemaker as the cluster resource manager, HAProxy as the load balancing cluster service, and MariaDB Galera as the replicated database service.

RHOSP Director installs a duplicate set of OpenStack components on each controller node and manages them as a single service.

  • Pacemaker -> cluster resource manager that starts, stops, and monitors the HA services.

  • HAProxy -> distributes incoming network traffic across the controller services.

  • MariaDB Galera -> replicates the database across controller nodes.

Use the pcs status command on a controller node to check the cluster status.

[heat-admin@controller0 ~]$ sudo pcs status
Cluster name: tripleo_cluster
Cluster Summary:
* Stack: corosync
* Current DC: controller0 (version 2.0.3-5.el8_2.1-4b1f869f0f) - partition with quorum
* Last updated: Tue May 6 13:33:04 2025
* Last change: Mon May 5 02:29:30 2025 by root via crm_resource on controller0
* 5 nodes configured
* 22 resource instances configured

Node List:
* Online: [ controller0 ]
* GuestOnline: [ galera-bundle-0@controller0 ovn-dbs-bundle-0@controller0 rabbitmq-bundle-0@controller0 redis-bundle-0@controller0 ]

Full List of Resources:
* Container bundle: galera-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-mariadb:pcmklatest]:
* galera-bundle-0 (ocf::heartbeat:galera): Master controller0
* Container bundle: rabbitmq-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-rabbitmq:pcmklatest]:
* rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started controller0
* Container bundle: redis-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-redis:pcmklatest]:
* redis-bundle-0 (ocf::heartbeat:redis): Master controller0
* ip-172.25.249.50 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.25.250.50 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.24.1.51 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.24.1.50 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.24.3.50 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.24.4.50 (ocf::heartbeat:IPaddr2): Started controller0
* Container bundle: haproxy-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-haproxy:pcmklatest]:
* haproxy-bundle-podman-0 (ocf::heartbeat:podman): Started controller0
* Container bundle: ovn-dbs-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-ovn-northd:pcmklatest]:
* ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers): Master controller0
* ip-172.24.1.52 (ocf::heartbeat:IPaddr2): Started controller0
* Container bundle: openstack-cinder-volume [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-cinder-volume:pcmklatest]:
* openstack-cinder-volume-podman-0 (ocf::heartbeat:podman): Started controller0
* Container bundle: openstack-manila-share [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-manila-share:pcmklatest]:
* openstack-manila-share-podman-0 (ocf::heartbeat:podman): Started controller0

Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
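
To inspect a single resource rather than the whole cluster, pcs can show one bundle's configuration; a sketch assuming the pcs 0.10 syntax used on RHEL 8:

[heat-admin@controller0 ~]$ sudo pcs resource config galera-bundle
[heat-admin@controller0 ~]$ sudo pcs status --full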

Accessing Services on the Overcloud

When the overcloud is first installed, the deployment creates the overcloudrc identity environment file in the stack user's home directory on the undercloud, containing the admin credentials needed to access the overcloud.

  • stackrc

    Created automatically when the undercloud is deployed; it contains the undercloud environment variables.

    It is used to deploy and manage the overcloud from the undercloud.

  • overcloudrc

    Generated by TripleO after the overcloud is deployed.

    It is the authentication file used to access the overcloud.

    Source it to check and manage the overcloud services from the CLI.
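
For example, switching from the undercloud to the overcloud credentials and confirming that authentication works might look like this (a sketch; openstack token issue requests a token with the loaded credentials):

(undercloud) [stack@director ~]$ source ~/overcloudrc
(overcloud) [stack@director ~]$ openstack token issue -c expires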

Switch to overcloud environment variables

# Check all environment variables on the overcloud
(overcloud) [stack@director ~]$ env | grep ^OS_*
OS_IMAGE_API_VERSION=2
OS_AUTH_URL=http://172.25.250.50:5000
OS_CLOUDNAME=overcloud
OS_REGION_NAME=regionOne
OS_PROJECT_NAME=admin
OS_PROJECT_DOMAIN_NAME=Default
OS_USER_DOMAIN_NAME=Default
OS_IDENTITY_API_VERSION=3
OS_AUTH_TYPE=password
OS_NO_CACHE=True
OS_COMPUTE_API_VERSION=2.latest
OS_PASSWORD=redhat
OS_USERNAME=admin
OS_VOLUME_API_VERSION=3

# Check all services on the overcloud
(overcloud) [stack@director ~]$ openstack service list
+----------------------------------+-----------+----------------+
| ID | Name | Type |
+----------------------------------+-----------+----------------+
| 0932e03c896a4e73b905b81a0e4e8b02 | cinderv2 | volumev2 |
| 1b5a14e5ff2d4d988dcbb7d1a31aa257 | panko | event |
| 1e50893246094c05b0b8b5028efa65d6 | heat | orchestration |
| 20ee8664c18146639e207d1dd833066b | heat-cfn | cloudformation |
| 2ed07d8f7afd4126b279bd273d029962 | gnocchi | metric |
| 31442049b55040d5bb52589e0fd1b31f | nova | compute |
| 446c6d0b77a74d05a491393c8d4cc106 | keystone | identity |
| 76870267b75a4d0bb70772f586730659 | neutron | network |
| 8413046b29f44285a772a8bbd172dbd8 | manilav2 | sharev2 |
| bb7a5e08aad345a489ce0207684ae402 | swift | object-store |
| c970fbfbc71a49008760c6d9a6f3ee91 | aodh | alarming |
| cc386f44375343d19e1fc78e8e78db35 | cinderv3 | volumev3 |
| cddaafd6d76b424b946a8fdb47a8ed12 | manila | share |
| d799151c504541edacd68a08324d37c0 | glance | image |
| e15d67325321405b854447a92bfe866f | octavia | load-balancer |
| f8ebce21244748c09b20df6e9c939b73 | placement | placement |
+----------------------------------+-----------+----------------+
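
Each registered service publishes its API endpoints through Keystone; a sketch for listing them (output omitted):

(overcloud) [stack@director ~]$ openstack catalog list
(overcloud) [stack@director ~]$ openstack endpoint list --service identity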

Network

Check the allocated VLANs.

[heat-admin@controller0 ~]$ sudo ovs-vsctl show
da52b817-899b-459f-b486-b529fcbd9275
Bridge br-prov2
fail_mode: standalone
Port eth4
Interface eth4
Port br-prov2
Interface br-prov2
type: internal
Bridge br-int
Port br-int
Interface br-int
type: internal
Port o-hm0
Interface o-hm0
type: internal
Port patch-br-int-to-provnet-275c889d-ef54-4a38-88ff-a7cedee11506
Interface patch-br-int-to-provnet-275c889d-ef54-4a38-88ff-a7cedee11506
type: patch
options: {peer=patch-provnet-275c889d-ef54-4a38-88ff-a7cedee11506-to-br-int}
Port ovn-8e0b99-0
Interface ovn-8e0b99-0
type: geneve
options: {csum="true", key=flow, remote_ip="172.24.2.2"}
Port ovn-3f678a-0
Interface ovn-3f678a-0
type: geneve
options: {csum="true", key=flow, remote_ip="172.24.2.6"}
Port ovn-b8bd46-0
Interface ovn-b8bd46-0
type: geneve
options: {csum="true", key=flow, remote_ip="172.24.2.12"}
Bridge br-ex
fail_mode: standalone
Port br-ex
Interface br-ex
type: internal
Port patch-provnet-275c889d-ef54-4a38-88ff-a7cedee11506-to-br-int
Interface patch-provnet-275c889d-ef54-4a38-88ff-a7cedee11506-to-br-int
type: patch
options: {peer=patch-br-int-to-provnet-275c889d-ef54-4a38-88ff-a7cedee11506}
Port eth2
Interface eth2
Bridge br-prov1
fail_mode: standalone
Port eth3
Interface eth3
Port br-prov1
Interface br-prov1
type: internal
Bridge br-trunk
fail_mode: standalone
Port vlan40
tag: 40
Interface vlan40
type: internal
Port eth1
Interface eth1
Port vlan20
tag: 20
Interface vlan20
type: internal
Port vlan30
tag: 30
Interface vlan30
type: internal
Port vlan50
tag: 50
Interface vlan50
type: internal
Port br-trunk
Interface br-trunk
type: internal
Port vlan10
tag: 10
Interface vlan10
type: internal
ovs_version: "2.13.0"

Check the IP addresses.

[heat-admin@controller0 ~]$ ip --br a
lo UNKNOWN 127.0.0.1/8 ::1/128
eth0 UP 172.25.249.56/24 172.25.249.50/32 fe80::5054:ff:fe00:f901/64
eth1 UP fe80::5054:ff:fe01:1/64
eth2 UP fe80::5054:ff:fe02:fa01/64
eth3 UP fe80::5054:ff:fe03:1/64
eth4 UP fe80::5054:ff:fe04:1/64
ovs-system DOWN
br-prov1 UNKNOWN fe80::5054:ff:fe03:1/64
genev_sys_6081 UNKNOWN fe80::3c78:cbff:feee:3bf0/64
o-hm0 UNKNOWN 172.23.3.42/16 fe80::f816:3eff:fecd:dfd4/64
br-int UNKNOWN fe80::40e:d7ff:fe34:2944/64
br-ex UNKNOWN 172.25.250.1/24 172.25.250.50/32 fe80::5054:ff:fe02:fa01/64
br-prov2 UNKNOWN fe80::5054:ff:fe04:1/64
vlan40 UNKNOWN 172.24.4.1/24 172.24.4.50/32 fe80::b4ca:55ff:fe88:4fd4/64
vlan10 UNKNOWN 172.24.1.1/24 172.24.1.51/32 172.24.1.50/32 172.24.1.52/32 fe80::f45f:9ff:feed:88a4/64
vlan20 UNKNOWN 172.24.2.1/24 fe80::8cd7:ccff:fe8a:c345/64
vlan50 UNKNOWN 172.24.5.1/24 fe80::c2c:bcff:fe88:d879/64
vlan30 UNKNOWN 172.24.3.1/24 172.24.3.50/32 fe80::14aa:e9ff:fe4e:95a2/64
br-trunk UNKNOWN fe80::5054:ff:fe01:1/64

The eth0 interface is on the 172.25.249.0 provisioning network.

The br-trunk bridge carries the internal VLAN networks (vlan10 through vlan50).

The br-ex bridge is on the 172.25.250.0 public network.
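
To confirm which VLAN interfaces hang off the trunk bridge, ovs-vsctl can query the bridge directly; a sketch:

[heat-admin@controller0 ~]$ sudo ovs-vsctl list-ports br-trunk
[heat-admin@controller0 ~]$ sudo ovs-vsctl port-to-br vlan10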

Containerized services

As introduced in Chapter 1, all overcloud services are containerized.

List all services running as podman containers.

[root@controller0 ~]# podman ps --format "table {{.Names}} {{.Status}}"
Names Status
openstack-manila-share-podman-0 Up 11 hours ago
openstack-cinder-volume-podman-0 Up 11 hours ago
ovn-dbs-bundle-podman-0 Up 11 hours ago
haproxy-bundle-podman-0 Up 11 hours ago
rabbitmq-bundle-podman-0 Up 11 hours ago
galera-bundle-podman-0 Up 11 hours ago
redis-bundle-podman-0 Up 11 hours ago
ceph-mgr-controller0 Up 11 hours ago
ceph-mon-controller0 Up 11 hours ago
ceph-mds-controller0 Up 11 hours ago
octavia_worker Up 11 hours ago
...omitted...

More detailed configuration information can be viewed with the podman inspect command.

For example:

[root@controller0 ~]# podman inspect keystone | jq .[].HostConfig.Binds
[
"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro,rprivate,rbind",
"/etc/puppet:/etc/puppet:ro,rprivate,rbind",
"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro,rprivate,rbind",
"/var/lib/config-data/puppet-generated/keystone:/var/lib/kolla/config_files/src:ro,rprivate,rbind",
"/var/log/containers/keystone:/var/log/keystone:rw,rprivate,rbind",
"/dev/log:/dev/log:rw,rprivate,nosuid,rbind",
"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro,rprivate,rbind",
"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro,rprivate,rbind",
"/var/log/containers/httpd/keystone:/var/log/httpd:rw,rprivate,rbind",
"/etc/localtime:/etc/localtime:ro,rprivate,rbind",
"/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro,rprivate,rbind",
"/var/lib/kolla/config_files/keystone.json:/var/lib/kolla/config_files/config.json:ro,rprivate,rbind",
"/etc/hosts:/etc/hosts:ro,rprivate,rbind"
]

Configuration Management of Containerized Services

Remember to modify configuration files on the host rather than inside the container, because changes made inside the container are lost when the container restarts.

  • Configuration files path

    /var/lib/config-data/puppet-generated/[service_name]/etc/

    Besides editing the files directly with vim, crudini can also be used to query and change settings.

    [root@controller0 puppet-generated]# podman exec -it keystone crudini --get /etc/keystone/keystone.conf DEFAULT debug
    False

    [root@controller0 puppet-generated]# crudini --get keystone/etc/keystone/keystone.conf DEFAULT debug
    False

    [root@controller0 puppet-generated]# crudini --set keystone/etc/keystone/keystone.conf DEFAULT debug True

    [root@controller0 puppet-generated]# crudini --get keystone/etc/keystone/keystone.conf DEFAULT debug
    True

    [root@controller0 puppet-generated]# podman restart keystone
    keystone

    [root@controller0 puppet-generated]# podman exec -it keystone crudini --get /etc/keystone/keystone.conf DEFAULT debug
    True
  • Log files path

    /var/log/containers/[service_name]

    [root@controller0 containers]# pwd
    /var/log/containers
    [root@controller0 containers]# ls
    aodh glance heat keystone mysql octavia placement stdouts
    ceilometer gnocchi horizon manila neutron openvswitch rabbitmq swift
    cinder haproxy httpd memcached nova panko redis
    [root@controller0 containers]# cd keystone/
    [root@controller0 keystone]# ls
    keystone.log keystone.log.11.gz keystone.log.14.gz keystone.log.4.gz keystone.log.7.gz
    keystone.log.1 keystone.log.12.gz keystone.log.2.gz keystone.log.5.gz keystone.log.8.gz
    keystone.log.10.gz keystone.log.13.gz keystone.log.3.gz keystone.log.6.gz keystone.log.9.gz

    Containers also retain the container's console (STDOUT) activity as logs under /var/log/containers/stdouts. Use the podman logs command to view a container's console activity, for example podman logs keystone.
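
    A quick sketch of viewing that console activity both ways, limited to the most recent lines:

    [root@controller0 ~]# podman logs --tail 20 keystone
    [root@controller0 ~]# ls /var/log/containers/stdouts/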

Managing Storage Nodes

Check the Ceph services on the Controller node

[root@controller0 stdouts]# podman ps --format "{{.Names}}" | grep ceph
ceph-mgr-controller0
ceph-mon-controller0
ceph-mds-controller0

Three Ceph services run on the controller node: mgr, mon, and mds.

[root@controller0 stdouts]# systemctl list-units | grep ceph
ceph-mds@controller0.service loaded active running Ceph MDS
ceph-mgr@controller0.service loaded active running Ceph Manager
ceph-mon@controller0.service loaded active running Ceph Monitor
system-ceph\x2dmds.slice loaded active active system-ceph\x2dmds.slice
system-ceph\x2dmgr.slice loaded active active system-ceph\x2dmgr.slice
system-ceph\x2dmon.slice loaded active active system-ceph\x2dmon.slice

Check all storage pools

[root@controller0 stdouts]# systemctl status ceph-mon@controller.service
● ceph-mon@controller.service - Ceph Monitor
Loaded: loaded (/etc/systemd/system/ceph-mon@.service; disabled; vendor preset: disabled)
Active: inactive (dead)

[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph osd pool ls
vms
volumes
images
manila_data
manila_metadata

List all OSDs

[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph osd ls
0
1
2
3
4
5
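
To see how these OSDs map to hosts and how much space each pool uses, the same mon container can run additional ceph commands; a sketch (output omitted):

[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph osd tree
[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph df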

Check the health status of the Ceph cluster

[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph -s
cluster:
id: 96157d2f-395f-4a54-8a19-b6042465100d
health: HEALTH_OK

services:
mon: 1 daemons, quorum controller0 (age 11h)
mgr: controller0(active, since 11h)
mds: cephfs:1 {0=controller0=up:active}
osd: 6 osds: 6 up (since 11h), 6 in (since 4y)

task status:
scrub status:
mds.controller0: idle

data:
pools: 5 pools, 320 pgs
objects: 2.62k objects, 19 GiB
usage: 25 GiB used, 95 GiB / 120 GiB avail
pgs: 320 active+clean