Overview

In Red Hat OpenStack Platform (RHOSP), a Red Hat-supported overcloud is built by the director node (the undercloud) using the TripleO deployment service. A single undercloud can deploy and manage multiple overclouds.

The Deployment template

An overcloud consists of numerous deployed nodes, each configured by assigning roles that specify the services and configuration required for the node to perform that role. Predefined RHOSP role definitions are located in the /usr/share/openstack-tripleo-heat-templates/roles directory and can be listed with the openstack overcloud roles list command.

Check all default roles

(undercloud) [stack@director ~]$ ls /usr/share/openstack-tripleo-heat-templates/roles | wc -l
52

(undercloud) [stack@director ~]$ ls /usr/share/openstack-tripleo-heat-templates/roles
BlockStorage.yaml ComputePPC64LE.yaml DistributedComputeScaleOut.yaml
CellController.yaml ComputeRBDEphemeral.yaml HciCephAll.yaml
CephAll.yaml ComputeRealTime.yaml HciCephFile.yaml
CephFile.yaml ComputeSriov.yaml HciCephMon.yaml
CephObject.yaml ComputeSriovIB.yaml HciCephObject.yaml
CephStorage.yaml ComputeSriovRT.yaml IronicConductor.yaml
Compute.yaml Controller.yaml Messaging.yaml
ComputeAlt.yaml ControllerAllNovaStandalone.yaml Networker.yaml
ComputeDVR.yaml ControllerNoCeph.yaml NetworkerSriov.yaml
ComputeHCI.yaml ControllerNovaStandalone.yaml Novacontrol.yaml
ComputeHCIOvsDpdk.yaml ControllerOpenstack.yaml ObjectStorage.yaml
ComputeInstanceHA.yaml ControllerSriov.yaml README.rst
ComputeLiquidio.yaml ControllerStorageDashboard.yaml Standalone.yaml
ComputeLocalEphemeral.yaml ControllerStorageNfs.yaml Telemetry.yaml
ComputeOvsDpdk.yaml Database.yaml Undercloud.yaml
ComputeOvsDpdkRT.yaml DistributedCompute.yaml UndercloudMinion.yaml
ComputeOvsDpdkSriov.yaml DistributedComputeHCI.yaml
ComputeOvsDpdkSriovRT.yaml DistributedComputeHCIScaleOut.yaml

Check all deployed roles

(undercloud) [stack@director ~]$ grep '^- name:' ~/templates/roles_data.yaml
- name: Controller
- name: Compute
- name: CephStorage
- name: ComputeHCI
  • Controller

    Control plane node with all core services loaded. Provides service APIs and handles database, messaging, and network processing functions.

    In this exercise environment, only one Controller node is provided.

  • Compute

    Standard compute node that acts as a hypervisor, running deployed instances and stacks.

  • CephStorage

    Storage back-end node running a containerized Red Hat Ceph Storage server with multiple object storage daemons (OSDs).

  • ComputeHCI

    Converged node (Compute + CephStorage).

    Compute node with hyperconverged infrastructure, combining compute and Ceph OSD functionality on a single node.
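
The roles deployed in this environment come from ~/templates/roles_data.yaml. As a rough sketch (role names taken from this lab; verify the exact syntax with openstack overcloud roles generate --help), such a file can be generated from the predefined role definitions:

(undercloud) [stack@director ~]$ openstack overcloud roles generate \
> -o ~/templates/roles_data.yaml Controller Compute CephStorage ComputeHCI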

Components and Services

Core Services

  • Block Storage Service (Cinder)

    Cinder -> Ceph back end.

    Swift provides only object storage, so Cinder uses Ceph as its back end for block storage (and Manila uses Ceph for file shares).

  • Image Service (Glance)

    The Image service stores and manages the images used to deploy instances, keeping them in the storage back end (here, the Ceph images pool).

  • Orchestration Service (Heat)

    In the undercloud, Heat templates deploy each overcloud as a stack.

    In the overcloud, Heat templates deploy application workloads as stacks.

    All templates are written in YAML (see the stack-listing sketch after this list).

  • Dashboard Service (Horizon)

    Web UI front end. It works like a console and is more convenient and intuitive for many tasks.

    The Dashboard service provides a browser-based interface for self-service cloud users to create and configure resources and launch and manage instances and stacks.

  • Identity Service (Keystone)

    It acts as a gatekeeper, verifying the identity of whoever is making a request.

    The Identity service provides domain, project, and user authorization for other overcloud services.

  • OpenStack Networking Service (Neutron)

    It’s responsible for the network.

    The OpenStack Networking service manages virtual networking infrastructure. It provides the virtual network and manages the network interfaces.

  • Compute Service (Nova)

    The Compute service schedules and runs on-demand virtual machines.

    It is responsible for creating, starting, stopping, and removing virtual machines.

  • Messaging Service (Oslo)

    It provides a common messaging framework with a compatible and consistent set of communication features across services.

  • Object Store (Swift)

    The original and default object storage back end.

    The Object Store service provides self-service cloud user object storage. Other overcloud services that collect data or objects can use the Object Store service as a back end.
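
As noted in the Orchestration item above, the overcloud itself is deployed as a Heat stack on the undercloud. A minimal sketch for inspecting it, assuming the default stack name overcloud:

# Run with the undercloud (stackrc) credentials
(undercloud) [stack@director ~]$ openstack stack list
(undercloud) [stack@director ~]$ openstack stack show overcloud -c stack_status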

Discretionary and Operational OpenStack Services

  • Bare Metal Service (Ironic)

    Introspects and provisions nodes; used when deploying and scaling the overcloud.

    The Bare Metal service locates and prepares compute resources, including bare metal and virtual machines.

  • File Share Service (Manila)

    It can use either the NFS or CIFS protocols to provide file sharing to instances.

  • Load Balancing Service (Octavia)

    It is an HAProxy-based service (the default driver runs HAProxy inside amphora instances).
    The Load Balancing service enables failover network traffic distribution to instances in a multitier application architecture.
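
A quick sketch for confirming that these optional services respond, assuming the python-octaviaclient and python-manilaclient CLI plugins are installed and the overcloud credentials are loaded:

(overcloud) [stack@director ~]$ openstack loadbalancer list
(overcloud) [stack@director ~]$ manila list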

Managing the Overcloud

Listing all Overcloud nodes

(undercloud) [stack@director ~]$ openstack server list \
> -c 'Name' -c 'Status' -c 'Networks'
+-------------+--------+------------------------+
| Name | Status | Networks |
+-------------+--------+------------------------+
| controller0 | ACTIVE | ctlplane=172.25.249.56 |
| computehci0 | ACTIVE | ctlplane=172.25.249.54 |
| compute1 | ACTIVE | ctlplane=172.25.249.53 |
| compute0 | ACTIVE | ctlplane=172.25.249.59 |
| ceph0 | ACTIVE | ctlplane=172.25.249.58 |
+-------------+--------+------------------------+
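
The same machines are also registered with the undercloud's Bare Metal service. A sketch for cross-checking them (output omitted):

(undercloud) [stack@director ~]$ openstack baremetal node list \
> -c Name -c 'Power State' -c 'Provisioning State'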

High Availability

Red Hat OpenStack Platform uses Pacemaker as the cluster resource manager, HAProxy as the load balancing cluster service, and MariaDB Galera as the replicated database service.

RHOSP Director installs a duplicate set of OpenStack components on each controller node and manages them as a single service.

  • Pacemaker -> cluster resource manager that starts, stops, and monitors the HA services.

  • HAProxy -> distributes incoming network traffic across the controller services.

  • MariaDB Galera -> replicates the database across controller nodes.

Use the pcs status command on a controller node to check the cluster status.

[heat-admin@controller0 ~]$ sudo pcs status
Cluster name: tripleo_cluster
Cluster Summary:
* Stack: corosync
* Current DC: controller0 (version 2.0.3-5.el8_2.1-4b1f869f0f) - partition with quorum
* Last updated: Tue May 6 13:33:04 2025
* Last change: Mon May 5 02:29:30 2025 by root via crm_resource on controller0
* 5 nodes configured
* 22 resource instances configured

Node List:
* Online: [ controller0 ]
* GuestOnline: [ galera-bundle-0@controller0 ovn-dbs-bundle-0@controller0 rabbitmq-bundle-0@controller0 redis-bundle-0@controller0 ]

Full List of Resources:
* Container bundle: galera-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-mariadb:pcmklatest]:
* galera-bundle-0 (ocf::heartbeat:galera): Master controller0
* Container bundle: rabbitmq-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-rabbitmq:pcmklatest]:
* rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started controller0
* Container bundle: redis-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-redis:pcmklatest]:
* redis-bundle-0 (ocf::heartbeat:redis): Master controller0
* ip-172.25.249.50 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.25.250.50 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.24.1.51 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.24.1.50 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.24.3.50 (ocf::heartbeat:IPaddr2): Started controller0
* ip-172.24.4.50 (ocf::heartbeat:IPaddr2): Started controller0
* Container bundle: haproxy-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-haproxy:pcmklatest]:
* haproxy-bundle-podman-0 (ocf::heartbeat:podman): Started controller0
* Container bundle: ovn-dbs-bundle [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-ovn-northd:pcmklatest]:
* ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers): Master controller0
* ip-172.24.1.52 (ocf::heartbeat:IPaddr2): Started controller0
* Container bundle: openstack-cinder-volume [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-cinder-volume:pcmklatest]:
* openstack-cinder-volume-podman-0 (ocf::heartbeat:podman): Started controller0
* Container bundle: openstack-manila-share [cluster.common.tag/gls-dle-dev-osp16-osp16_containers-openstack-manila-share:pcmklatest]:
* openstack-manila-share-podman-0 (ocf::heartbeat:podman): Started controller0

Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
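
To inspect a single resource rather than the whole cluster, pcs can show one bundle's configuration; a sketch assuming the pcs 0.10 syntax used on RHEL 8:

[heat-admin@controller0 ~]$ sudo pcs resource config galera-bundle
[heat-admin@controller0 ~]$ sudo pcs status --full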

Accessing Services on the Overcloud

When the overcloud is first installed, the deployment creates the overcloudrc identity environment file in the stack user's home directory on the undercloud, containing the admin credentials needed to access the overcloud.

  • stackrc

    Created automatically when the undercloud is deployed; it contains the undercloud environment variables.

    It is used to deploy and manage the overcloud from the undercloud.

  • overcloudrc

    Generated by TripleO after the overcloud is deployed.

    It is the authentication file used to access the overcloud.

    Source it to check and manage the overcloud services from the CLI.
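
For example, switching from the undercloud to the overcloud credentials and confirming that authentication works might look like this (a sketch; openstack token issue requests a token with the loaded credentials):

(undercloud) [stack@director ~]$ source ~/overcloudrc
(overcloud) [stack@director ~]$ openstack token issue -c expires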

Switch to overcloud environment variables

# Check all environment variables on the overcloud
(overcloud) [stack@director ~]$ env | grep ^OS_*
OS_IMAGE_API_VERSION=2
OS_AUTH_URL=http://172.25.250.50:5000
OS_CLOUDNAME=overcloud
OS_REGION_NAME=regionOne
OS_PROJECT_NAME=admin
OS_PROJECT_DOMAIN_NAME=Default
OS_USER_DOMAIN_NAME=Default
OS_IDENTITY_API_VERSION=3
OS_AUTH_TYPE=password
OS_NO_CACHE=True
OS_COMPUTE_API_VERSION=2.latest
OS_PASSWORD=redhat
OS_USERNAME=admin
OS_VOLUME_API_VERSION=3

# Check all services on the overcloud
(overcloud) [stack@director ~]$ openstack service list
+----------------------------------+-----------+----------------+
| ID | Name | Type |
+----------------------------------+-----------+----------------+
| 0932e03c896a4e73b905b81a0e4e8b02 | cinderv2 | volumev2 |
| 1b5a14e5ff2d4d988dcbb7d1a31aa257 | panko | event |
| 1e50893246094c05b0b8b5028efa65d6 | heat | orchestration |
| 20ee8664c18146639e207d1dd833066b | heat-cfn | cloudformation |
| 2ed07d8f7afd4126b279bd273d029962 | gnocchi | metric |
| 31442049b55040d5bb52589e0fd1b31f | nova | compute |
| 446c6d0b77a74d05a491393c8d4cc106 | keystone | identity |
| 76870267b75a4d0bb70772f586730659 | neutron | network |
| 8413046b29f44285a772a8bbd172dbd8 | manilav2 | sharev2 |
| bb7a5e08aad345a489ce0207684ae402 | swift | object-store |
| c970fbfbc71a49008760c6d9a6f3ee91 | aodh | alarming |
| cc386f44375343d19e1fc78e8e78db35 | cinderv3 | volumev3 |
| cddaafd6d76b424b946a8fdb47a8ed12 | manila | share |
| d799151c504541edacd68a08324d37c0 | glance | image |
| e15d67325321405b854447a92bfe866f | octavia | load-balancer |
| f8ebce21244748c09b20df6e9c939b73 | placement | placement |
+----------------------------------+-----------+----------------+
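
Each registered service publishes its API endpoints through Keystone; a sketch for listing them (output omitted):

(overcloud) [stack@director ~]$ openstack catalog list
(overcloud) [stack@director ~]$ openstack endpoint list --service identity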

Network

Check the allocated VLANs.

[heat-admin@controller0 ~]$ sudo ovs-vsctl show
da52b817-899b-459f-b486-b529fcbd9275
Bridge br-prov2
fail_mode: standalone
Port eth4
Interface eth4
Port br-prov2
Interface br-prov2
type: internal
Bridge br-int
Port br-int
Interface br-int
type: internal
Port o-hm0
Interface o-hm0
type: internal
Port patch-br-int-to-provnet-275c889d-ef54-4a38-88ff-a7cedee11506
Interface patch-br-int-to-provnet-275c889d-ef54-4a38-88ff-a7cedee11506
type: patch
options: {peer=patch-provnet-275c889d-ef54-4a38-88ff-a7cedee11506-to-br-int}
Port ovn-8e0b99-0
Interface ovn-8e0b99-0
type: geneve
options: {csum="true", key=flow, remote_ip="172.24.2.2"}
Port ovn-3f678a-0
Interface ovn-3f678a-0
type: geneve
options: {csum="true", key=flow, remote_ip="172.24.2.6"}
Port ovn-b8bd46-0
Interface ovn-b8bd46-0
type: geneve
options: {csum="true", key=flow, remote_ip="172.24.2.12"}
Bridge br-ex
fail_mode: standalone
Port br-ex
Interface br-ex
type: internal
Port patch-provnet-275c889d-ef54-4a38-88ff-a7cedee11506-to-br-int
Interface patch-provnet-275c889d-ef54-4a38-88ff-a7cedee11506-to-br-int
type: patch
options: {peer=patch-br-int-to-provnet-275c889d-ef54-4a38-88ff-a7cedee11506}
Port eth2
Interface eth2
Bridge br-prov1
fail_mode: standalone
Port eth3
Interface eth3
Port br-prov1
Interface br-prov1
type: internal
Bridge br-trunk
fail_mode: standalone
Port vlan40
tag: 40
Interface vlan40
type: internal
Port eth1
Interface eth1
Port vlan20
tag: 20
Interface vlan20
type: internal
Port vlan30
tag: 30
Interface vlan30
type: internal
Port vlan50
tag: 50
Interface vlan50
type: internal
Port br-trunk
Interface br-trunk
type: internal
Port vlan10
tag: 10
Interface vlan10
type: internal
ovs_version: "2.13.0"

Check the IP addresses.

[heat-admin@controller0 ~]$ ip --br a
lo UNKNOWN 127.0.0.1/8 ::1/128
eth0 UP 172.25.249.56/24 172.25.249.50/32 fe80::5054:ff:fe00:f901/64
eth1 UP fe80::5054:ff:fe01:1/64
eth2 UP fe80::5054:ff:fe02:fa01/64
eth3 UP fe80::5054:ff:fe03:1/64
eth4 UP fe80::5054:ff:fe04:1/64
ovs-system DOWN
br-prov1 UNKNOWN fe80::5054:ff:fe03:1/64
genev_sys_6081 UNKNOWN fe80::3c78:cbff:feee:3bf0/64
o-hm0 UNKNOWN 172.23.3.42/16 fe80::f816:3eff:fecd:dfd4/64
br-int UNKNOWN fe80::40e:d7ff:fe34:2944/64
br-ex UNKNOWN 172.25.250.1/24 172.25.250.50/32 fe80::5054:ff:fe02:fa01/64
br-prov2 UNKNOWN fe80::5054:ff:fe04:1/64
vlan40 UNKNOWN 172.24.4.1/24 172.24.4.50/32 fe80::b4ca:55ff:fe88:4fd4/64
vlan10 UNKNOWN 172.24.1.1/24 172.24.1.51/32 172.24.1.50/32 172.24.1.52/32 fe80::f45f:9ff:feed:88a4/64
vlan20 UNKNOWN 172.24.2.1/24 fe80::8cd7:ccff:fe8a:c345/64
vlan50 UNKNOWN 172.24.5.1/24 fe80::c2c:bcff:fe88:d879/64
vlan30 UNKNOWN 172.24.3.1/24 172.24.3.50/32 fe80::14aa:e9ff:fe4e:95a2/64
br-trunk UNKNOWN fe80::5054:ff:fe01:1/64

The eth0 interface is on the 172.25.249.0 provisioning network.

The br-trunk bridge carries the internal VLAN networks (vlan10 through vlan50).

The br-ex bridge is on the 172.25.250.0 public network.
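
To confirm which VLAN interfaces hang off the trunk bridge, ovs-vsctl can query the bridge directly; a sketch:

[heat-admin@controller0 ~]$ sudo ovs-vsctl list-ports br-trunk
[heat-admin@controller0 ~]$ sudo ovs-vsctl port-to-br vlan10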

Containerized services

As introduced in Chapter 1, all overcloud services are containerized.

List all services running as podman containers.

[root@controller0 ~]# podman ps --format "table {{.Names}} {{.Status}}"
Names Status
openstack-manila-share-podman-0 Up 11 hours ago
openstack-cinder-volume-podman-0 Up 11 hours ago
ovn-dbs-bundle-podman-0 Up 11 hours ago
haproxy-bundle-podman-0 Up 11 hours ago
rabbitmq-bundle-podman-0 Up 11 hours ago
galera-bundle-podman-0 Up 11 hours ago
redis-bundle-podman-0 Up 11 hours ago
ceph-mgr-controller0 Up 11 hours ago
ceph-mon-controller0 Up 11 hours ago
ceph-mds-controller0 Up 11 hours ago
octavia_worker Up 11 hours ago
...omitted...

More detailed configuration information can be viewed with the podman inspect command.

For example:

[root@controller0 ~]# podman inspect keystone | jq .[].HostConfig.Binds
[
"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro,rprivate,rbind",
"/etc/puppet:/etc/puppet:ro,rprivate,rbind",
"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro,rprivate,rbind",
"/var/lib/config-data/puppet-generated/keystone:/var/lib/kolla/config_files/src:ro,rprivate,rbind",
"/var/log/containers/keystone:/var/log/keystone:rw,rprivate,rbind",
"/dev/log:/dev/log:rw,rprivate,nosuid,rbind",
"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro,rprivate,rbind",
"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro,rprivate,rbind",
"/var/log/containers/httpd/keystone:/var/log/httpd:rw,rprivate,rbind",
"/etc/localtime:/etc/localtime:ro,rprivate,rbind",
"/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro,rprivate,rbind",
"/var/lib/kolla/config_files/keystone.json:/var/lib/kolla/config_files/config.json:ro,rprivate,rbind",
"/etc/hosts:/etc/hosts:ro,rprivate,rbind"
]

Configuration Management of Containerized Services

Remember to modify configuration files on the host rather than inside the container, because changes made inside the container are lost when the container restarts.

  • Configuration files path

    /var/lib/config-data/puppet-generated/[service_name]/etc/

    Besides editing the files directly with vim, crudini can also be used to query and change settings.

    [root@controller0 puppet-generated]# podman exec -it keystone crudini --get /etc/keystone/keystone.conf DEFAULT debug
    False

    [root@controller0 puppet-generated]# crudini --get keystone/etc/keystone/keystone.conf DEFAULT debug
    False

    [root@controller0 puppet-generated]# crudini --set keystone/etc/keystone/keystone.conf DEFAULT debug True

    [root@controller0 puppet-generated]# crudini --get keystone/etc/keystone/keystone.conf DEFAULT debug
    True

    [root@controller0 puppet-generated]# podman restart keystone
    keystone

    [root@controller0 puppet-generated]# podman exec -it keystone crudini --get /etc/keystone/keystone.conf DEFAULT debug
    True
  • Log files path

    /var/log/containers/[service_name]

    [root@controller0 containers]# pwd
    /var/log/containers
    [root@controller0 containers]# ls
    aodh glance heat keystone mysql octavia placement stdouts
    ceilometer gnocchi horizon manila neutron openvswitch rabbitmq swift
    cinder haproxy httpd memcached nova panko redis
    [root@controller0 containers]# cd keystone/
    [root@controller0 keystone]# ls
    keystone.log keystone.log.11.gz keystone.log.14.gz keystone.log.4.gz keystone.log.7.gz
    keystone.log.1 keystone.log.12.gz keystone.log.2.gz keystone.log.5.gz keystone.log.8.gz
    keystone.log.10.gz keystone.log.13.gz keystone.log.3.gz keystone.log.6.gz keystone.log.9.gz

    Containers also retain the container's console (STDOUT) activity as logs under /var/log/containers/stdouts. Use the podman logs command to view a container's console activity, for example podman logs keystone.
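
    A quick sketch of viewing that console activity both ways, limited to the most recent lines:

    [root@controller0 ~]# podman logs --tail 20 keystone
    [root@controller0 ~]# ls /var/log/containers/stdouts/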

Managing Storage Nodes

Check the Ceph services on the Controller node

[root@controller0 stdouts]# podman ps --format "{{.Names}}" | grep ceph
ceph-mgr-controller0
ceph-mon-controller0
ceph-mds-controller0

Three Ceph services run on the controller node: mgr, mon, and mds.

[root@controller0 stdouts]# systemctl list-units | grep ceph
ceph-mds@controller0.service loaded active running Ceph MDS
ceph-mgr@controller0.service loaded active running Ceph Manager
ceph-mon@controller0.service loaded active running Ceph Monitor
system-ceph\x2dmds.slice loaded active active system-ceph\x2dmds.slice
system-ceph\x2dmgr.slice loaded active active system-ceph\x2dmgr.slice
system-ceph\x2dmon.slice loaded active active system-ceph\x2dmon.slice

Check all storage pools

[root@controller0 stdouts]# systemctl status ceph-mon@controller.service
● ceph-mon@controller.service - Ceph Monitor
Loaded: loaded (/etc/systemd/system/ceph-mon@.service; disabled; vendor preset: disabled)
Active: inactive (dead)

[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph osd pool ls
vms
volumes
images
manila_data
manila_metadata

List all OSDs

[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph osd ls
0
1
2
3
4
5
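
To see how these OSDs map to hosts and how much space each pool uses, the same mon container can run additional ceph commands; a sketch (output omitted):

[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph osd tree
[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph df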

Check the health status of the Ceph cluster

[root@controller0 stdouts]# podman exec ceph-mon-controller0 ceph -s
cluster:
id: 96157d2f-395f-4a54-8a19-b6042465100d
health: HEALTH_OK

services:
mon: 1 daemons, quorum controller0 (age 11h)
mgr: controller0(active, since 11h)
mds: cephfs:1 {0=controller0=up:active}
osd: 6 osds: 6 up (since 11h), 6 in (since 4y)

task status:
scrub status:
mds.controller0: idle

data:
pools: 5 pools, 320 pgs
objects: 2.62k objects, 19 GiB
usage: 25 GiB used, 95 GiB / 120 GiB avail
pgs: 320 active+clean