
xp idle


Idle Experimentation

Configuration

  • physical nodes: 1 control, 1 network, 20 computes.
  • control: neutron_server, nova_conductor, nova_scheduler, nova_novncproxy, nova_consoleauth, nova_api, glance_api, glance_registry, keystone, rabbitmq, mariadb, memcached, cron, kolla_toolbox, heka, cadvisor, grafana, influx, docker_registry, collectd
  • network: neutron_metadata_agent, neutron_l3_agent, neutron_dhcp_agent, neutron_openvswitch_agent, openvswitch_db, keepalived, haproxy, cron, kolla_toolbox, heka, cadvisor
  • compute: nova_ssh, nova_libvirt, nova_compute_fake_1, …, nova_compute_fake_#fake, openvswitch_db, openvswitch_vswitchd, neutron_openvswitch_agent, neutron_openvswitch_agent_fake_1, …, neutron_openvswitch_agent_fake_#fake, cron, kolla_toolbox, heka, cadvisor

Monitored information

  • collectd: memcached (hit, miss, set), mysql (total inserts/deletes/updates), docker (context switches), tcpconns (number of sockets open on a specific port).
  • cadvisor: networkIn/networkOut, mysql cpu/mem, rabbit cpu/mem. Logs into InfluxDB every 10 seconds; monitors one physical compute node.
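
For illustration, a cadvisor container configured this way could be started roughly as follows; this is a sketch, not the exact command used here: the image tag, the InfluxDB host ($INFLUX_HOST) and the database name are assumptions.

# Push container metrics to InfluxDB every 10 seconds.
docker run -d --name cadvisor \
  --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro \
  google/cadvisor:latest \
  -storage_driver=influxdb \
  -storage_driver_host=${INFLUX_HOST}:8086 \
  -storage_driver_db=cadvisor \
  -housekeeping_interval=10s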

Get results

First, find the name of the host machine.

cd results
vagrant up idle
XPHOST=`vagrant ssh-config idle | grep HostName | awk '{print $2}'`

Then, create ssh tunnels

# Access Grafana (http://localhost:3000 once the tunnel is up)
ssh -NL 3000:${XPHOST}:3000 rennes.g5k
# Access the nginx serving the Kolla logs (http://localhost:8000)
ssh -NL 8000:${XPHOST}:8000 rennes.g5k

Results

Nova API

Purpose: Nova API is the user-facing interface of the Nova (OpenStack compute) service. It processes client REST requests, which typically involve database reads/writes and, optionally, sending RPC messages to other Nova services via the oslo.messaging queue [fn:os-archi].
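
For illustration, a typical user-facing request of this kind; a sketch, assuming an authenticated session where $OS_TOKEN holds a Keystone token and $NOVA_ENDPOINT the compute endpoint from the catalog:

# Listing servers issues GET .../servers, which nova-api answers with database
# reads only (no RPC to other Nova services is needed for this call).
openstack server list
curl -s -H "X-Auth-Token: ${OS_TOKEN}" "${NOVA_ENDPOINT}/servers"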

Nova API 5 10 25 50
Mem max (GB) 1.60 1.60 1.72 1.60

Remarks: Using max for memory makes sense since the curve instantly reaches the max and never decreases.

Trend: The amount of memory used is constant regardless of the number of nodes.

Findings: TODO

[fn:os-archi] http://docs.openstack.org/developer/nova/architecture.html

Nova Conductor

Purpose: Nova conductor acts as a database proxy between the compute nodes and the database. Such a proxy makes it possible for an upgraded control plane to communicate with nova-compute services still running an older version. The conductor handles requests that need reconfiguration and objects that need conversion.

Nova Conductor 5 10 25 50
Mem max (MB) 2.47 2.47 2.45 2.47
CPU avg (#usage) 1.22 2.00 3.68 7.00
Rx avg 122kB/s 161kB/s 560kB/s 1.1MB/s
Tx avg 85kB/s 242kB/s 360kB/s 750kB/s

Remarks: For memory, using the max seems OK since the curve quickly (within 10 minutes) reaches the max and never decreases.

Trend: Memory is constant and CPU grows linearly with the number of nodes.

Findings: TODO

Nova Scheduler

Purpose: Nova scheduler decides which host gets each instance.

Nova Scheduler 5 10 25 50
Mem max (MB) 100.30 104.00 106.30 109.60

Remarks: Using max makes sense since the curve instantly reaches the max and never decreases. Is this linear with the number of nodes?

Findings: TODO

Keystone

Keystone is the OpenStack identity service. It provides an API for client authentication, service discovery, and multi-tenant authorisation. Nova API talks to Keystone through an HTTP connection.

Keystone 5 10 25 50
Mem max (MB) 774 771 772 780

Remarks: None

Trends: The maximum memory usage seems constant with respect to the number of nodes.

What about the CPU? Keystone is called 4 times during the deployment, and every call produces a CPU peak. The height of these peaks is the same across all experiments. However, the time between the second and the third peak increases with the number of nodes: 5 minutes in the first experiment, 7 minutes in the third one, and 9.3 minutes in the last one. Thus, OpenStack does something after the second peak that takes more time as the number of nodes increases. The time between the third and fourth peak increases in the same manner.

Findings: TODO

Neutron server

Purpose: Neutron is the OpenStack networking service; Nova API and nova-compute talk to Neutron through an HTTP connection.

Neutron Server 5 10 25 50
Mem max (MB) 419 420 359 441
CPU avg (#usage) 0.14 0.21 0.39 0.69

Remarks: The trend of memory usage is not clear, especially because of the third and fourth experiment results. It could be constant or linear.

Trends: I don’t know for the memory usage (see the remarks above). The trend of CPU usage is clearer: it appears to increase linearly with the number of nodes.

haproxy

TODO: Purpose

Haproxy 5 10 25 50
Mem max (MB) 6.27 6.32 7.04 8.71
Mem avg (MB) 5.60 5.54 5.99 7.22
CPU avg (#usage) 0.11 0.18 0.33 0.49

Remarks: Mem & CPU seem linear.

TODO: What we could say

rabbitmq

TODO: Purpose

rabbitmq 5 10 25 50
Mem max (GB) 1.59 2.52 5.08 11.25
CPU avg (#usage) 1 1 3 5
#connection avg (K) 1.32 2.35 5.53 10.05
#connection max (stationary value) (K) 1.5 2.93 6.89 13.5

Remarks: Mem, CPU and #connection seem linear.

TODO: What we could say

mariadb

TODO: Purpose

mariadb 5 10 25 50
Mem max (MB) 502 546 570 594
CPU avg 0.03 0.06 0.13 0.21
#connection avg 68 72 100 132
#connection total (more stationary) 79 85 120 170
#select / s todo
#update / s todo

Remarks: Linear.

TODO: What we could say

memcached

The number of memcached get/hit/miss/set operations stays constant.

Mariadb periodic requests

nova-compute → nova-scheduler

Compute nodes periodically send the UUIDs of their instances to the scheduler.

See https://github.com/openstack/nova/blob/35b2132723cf2412e42bb5e52f72abaef31dadbd/nova/scheduler/host_manager.py#L682

Controlled by the configuration parameter scheduler_instance_sync_interval: https://github.com/openstack/nova/blob/407e659eb9c228eb1ec06ec49864279aeab0a1a1/nova/conf/compute.py#L438

  • 100 computes -> 5039 times in 1h48 -> 46 per minute
  • 200 computes -> 4285 times in 1h25 -> 50 per minute (do we really have 200 computes here?) EDIT: there was a misconfiguration in these results; this is fixed (TODO put the right number here)
  • 500 computes -> 18223 times in 1h14 -> 246 per minute
  • 1000 computes -> 34996 times in 1h11 -> 500 per minute -> 8.3 /s

These numbers are consistent since, by default, instances are synced every 120 s.
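
As a quick sanity check using the numbers above, the observed counts match the default 120 s sync interval; for instance, for the 1000-compute run:

N=1000; DURATION_MIN=71; SYNC_INTERVAL=120                              # values from the run above
echo "expected syncs: $(( N * DURATION_MIN * 60 / SYNC_INTERVAL ))"    # ~35500, close to the 34996 observed
echo "expected rate : $(( N * 60 / SYNC_INTERVAL )) per minute"        # 500 per minute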

nova-scheduler runs some periodic tasks (but their frequency shouldn’t depend on the number of compute nodes)

See https://github.com/openstack/nova/blob/35b2132723cf2412e42bb5e52f72abaef31dadbd/nova/scheduler/manager.py#L79

nova-conductor

From the conductor logs (connection_debug=100)

Each nova-conductor updates its state in the db

  • 1 UPDATE every 10s
  • 25 SELECTs / min for the conductor service

Nova-compute updates its state in the db

  • 1 UPDATE every 10s
  • 1 SELECT, at a slightly lower frequency than above, for nova-compute

For each compute we have periodic tasks about the state (instances, …)

-> if we count every match that does not deal with the conductor itself:

# Count the DB-query log lines that do not concern the conductor itself
grep -e "-\] (0," /var/lib/docker/volumes/kolla_logs/_data/nova/nova-conductor.log | grep -v -e "(0, 7, 1)" | grep -v -e 'nova-conductor' | wc -l

We get a count that increases linearly with the number of computes, at about 12 matches/min/compute.

In this context, one match corresponds to at least one SELECT and sometimes to a SELECT plus an UPDATE (service update).

Every Nova service reports its state periodically; this is controlled by the parameter report_interval in nova.conf: https://github.com/openstack/nova/blob/407e659eb9c228eb1ec06ec49864279aeab0a1a1/nova/conf/service.py#L24
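
For reference, a minimal sketch of how these two intervals could be tuned; the values shown are the defaults discussed above, and the /etc/kolla/config/nova.conf override path is an assumption about the deployment:

# Sketch: both intervals live in the [DEFAULT] section of nova.conf.
cat >> /etc/kolla/config/nova.conf <<'EOF'
[DEFAULT]
report_interval = 10
scheduler_instance_sync_interval = 120
EOF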

neutron-server

From the neutron-server logs (connection_debug=100)

Agent statuses are updated periodically, thus the load increases linearly with the number of openvswitch agents.

Observation: 1 SELECT and 1 UPDATE / 25s / agent

Low level access to all mariadb queries

This could be used to see what’s missing: https://mariadb.com/kb/en/mariadb/general-query-log/
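
For instance, the general query log can be enabled at runtime; a sketch, assuming the kolla "mariadb" container and the DB root password in $DB_PASS:

# Beware: this logs every statement, so only keep it enabled for a short window.
docker exec mariadb mysql -uroot -p"${DB_PASS}" -e "
  SET GLOBAL log_output = 'FILE';
  SET GLOBAL general_log_file = '/var/lib/mysql/general.log';
  SET GLOBAL general_log = 'ON';"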

Notes on connection pooling

Connection pools are used for the connections to the:

  • DB and
  • Messaging middleware

For the DB, it relies on the SQLAlchemy pooling system: http://docs.sqlalchemy.org/en/latest/core/pooling.html. This can be controlled on the OpenStack side with the following parameters:

  • (nova default) max_overflow (= None), max_pool_size (= None), pool_timeout (= None)
  • (sqlalchemy default) max_overflow (= 10), max_pool_size (= 5), pool_timeout (= 30)
  • (kolla default) max_overflow (= 1000), max_pool_size (= 50), pool_timeout (= -1)

Ref: http://docs.openstack.org/mitaka/config-reference/compute/config-options.html#nova-common
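
As an illustration, the kolla defaults above map to the following oslo.db options in the [database] section; a sketch, with the /etc/kolla/config/nova.conf override path again being an assumption:

# Sketch: kolla's pooling defaults expressed as [database] options.
cat >> /etc/kolla/config/nova.conf <<'EOF'
[database]
max_pool_size = 50
max_overflow = 1000
pool_timeout = -1
EOF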