KVM/QEMU/OpenStack – Manage a live migration

Cancel an in-progress migration

virsh qemu-monitor-command {VMNAME} --pretty '{"execute":"migrate_cancel"}'
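Newer libvirt also exposes the same thing directly through virsh, so as a sketch (VMNAME as above):

# Abort the active migration job via libvirt instead of raw QMP
virsh domjobabort VMNAME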

Allow virsh more downtime (if the migration can't keep up with the RAM dirty rate; the value is in milliseconds)

virsh migrate-setmaxdowntime VMNAME 2500
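If the migration still can't converge, raising the bandwidth cap can help too; a sketch, with the value in MiB/s and purely illustrative:

# Let the migration use up to roughly 1 GiB/s of bandwidth
virsh migrate-setspeed VMNAME 1024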

 

Check migration status

virsh domjobinfo instance-000002ac
Job type: Unbounded
Operation: Outgoing migration
Time elapsed: 1307956 ms
Data processed: 118.662 GiB
Data remaining: 9.203 MiB
Data total: 8.005 GiB
Memory processed: 118.662 GiB
Memory remaining: 9.203 MiB
Memory total: 8.005 GiB
Memory bandwidth: 41.294 MiB/s
Dirty rate: 35040 pages/s
Page size: 4096 bytes
Iteration: 197
Constant pages: 1751031
Normal pages: 31041965
Normal data: 118.416 GiB
Expected downtime: 3314 ms
Setup time: 70 ms
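To keep an eye on it rather than re-running the command by hand, a simple sketch:

# Refresh the job stats every 5 seconds until the migration finishes
watch -n 5 virsh domjobinfo instance-000002ac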

 

https://www.redhat.com/archives/libvirt-users/2014-January/msg00007.html
https://specs.openstack.org/openstack/nova-specs/specs/mitaka/implemented/abort-live-migration.html

 

https://www.server24.eu/private-cloud/complete-live-migration-vms-high-load/

 

OpenStack Rocky – Keystone – Requesting a token scoped to a different project

Because yet again I found the documentation on the OpenStack site to be wrong, here is what I have FINALLY managed to determine is the correct request to get a token issued to an admin user in another project. In my case I need this to create volume backups, because the backup API does not allow you to pass a project ID when creating a backup.

POST the following body to http://KeystoneIP:5000/v3/auth/tokens

{
  "auth": {
    "scope": {
      "project": {
        "domain": {
          "name": "Default"
        },
        "name": "OtherProjectName"
      }
    },
    "identity": {
      "password": {
        "user": {
          "domain": {
            "name": "Default"
          },
          "password": "password",
          "name": "admin"
        }
      },
      "methods": [
        "password"
      ]
    }
  }
}
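A minimal curl sketch for sending it, assuming the body above is saved as token.json (the filename is just an example); the issued token comes back in the X-Subject-Token response header:

# POST the auth request and show the X-Subject-Token header
curl -si -H "Content-Type: application/json" -d @token.json \
  http://KeystoneIP:5000/v3/auth/tokens | grep -i X-Subject-Token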

Bonding in active-backup using linux bridges on Ubuntu 18

Because this was WAAYY more difficult to find any decent doco on than I had ever expected, here is what worked for me.

I deleted the netplan config file at /etc/netplan/01-netcfg.yaml

rm /etc/netplan/01-netcfg.yaml
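Note that /etc/network/interfaces is only honoured if the classic ifupdown tooling is present, which isn't guaranteed on an 18.04 image, so install it if it's missing:

apt install ifupdown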

Ensure that 'bonding' appears in /etc/modules (it's not there by default)

echo bonding >> /etc/modules
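Load the module straight away as well so you don't need a reboot:

modprobe bonding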

 

Here is /etc/network/interfaces

source-directory /etc/network/interfaces.d

auto lo
iface lo inet loopback

allow-hotplug ens3f0
iface ens3f0 inet manual
    bond-master bond0

allow-hotplug ens3f1
iface ens3f1 inet manual
    bond-master bond0

allow-hotplug bond0
iface bond0 inet static
    address 172.16.103.12/24
    gateway 172.16.103.254
    mtu 9000
    bond-mode active-backup
    bond-miimon 100
    bond-slaves none
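Once that's in place, bring the bond up and sanity check which slave is currently active (a quick sketch):

ifup bond0
cat /proc/net/bonding/bond0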

 

Ceph Nautilus – “Required devices (block and data) not present for bluestore”

When using the new ceph-volume scan and activate commands on Ceph Nautilus after an upgrade from Luminous, I was getting the following message:

[root@ceph2 ~]# ceph-volume simple activate --all
--> activating OSD specified in /etc/ceph/osd/37-11af5440-dadf-40e3-8924-2bbad3ee5b58.json
Running command: /bin/ln -snf /dev/sdh2 /var/lib/ceph/osd/ceph-37/block
Running command: /bin/chown -R ceph:ceph /dev/sdh2
Running command: /bin/systemctl enable ceph-volume@simple-37-11af5440-dadf-40e3-8924-2bbad3ee5b58
Running command: /bin/ln -sf /dev/null /etc/systemd/system/ceph-disk@.service
--> All ceph-disk systemd units have been disabled to prevent OSDs getting triggered by UDEV events
Running command: /bin/systemctl enable --runtime ceph-osd@37
Running command: /bin/systemctl start ceph-osd@37
--> Successfully activated OSD 37 with FSID 11af5440-dadf-40e3-8924-2bbad3ee5b58
--> activating OSD specified in /etc/ceph/osd/11-8c5b0218-4d32-404f-b06b-f6e90906ab7d.json
--> Required devices (block and data) not present for bluestore
--> bluestore devices found: [u'data']
--> RuntimeError: Unable to activate bluestore OSD due to missing devices

You can see one volume activated while the other didn't.
It turns out this is because one volume was configured for bluestore and the other wasn't, and there is some sort of bug in the ceph-volume scan command: when it writes out the /etc/ceph/osd/{OSDID}-GUID.json files it omits the "type": "filestore" line for any non-bluestore disks, but ceph-volume simple activate assumes a volume is bluestore unless the json file says otherwise.
The quick and easy fix was to add the line "type": "filestore", to the json files for any non-bluestore disks and run ceph-volume simple activate --all again.
Time permitting I'll hunt down the bug in the scan command and submit a pull request if it's not already been done.
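As a sketch of that fix for the filestore OSD from the log above (assumes jq is installed; repeat for each affected json file):

# Add the missing "type" key, then retry the activation
f=/etc/ceph/osd/11-8c5b0218-4d32-404f-b06b-f6e90906ab7d.json
jq '. + {"type": "filestore"}' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
ceph-volume simple activate --all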

Metadata service in DHCP namespace

Some gold info
http://kimizhang.com/metadata-service-in-dhcp-namespace/

What's inside an APT package?

sudo dpkg --listfiles docker-ce
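The same idea works for a .deb that hasn't been installed yet (the filename below is just a placeholder):

dpkg -c ./docker-ce_*.deb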

Generate OTP keys in linux – Extracting FreeOTP keys

apt install oathtool
oathtool --totp -b -d 6 KY3OUPMUYWCKS53F

Linux: TOTP Password Generator

https://github.com/philipsharp/FreeOTPDecoder
Enable USB debugging – https://www.kingoapp.com/root-tutorials/how-to-enable-usb-debugging-mode-on-android.htm
Backup the FreeOTP app – adb backup -f ~/freeotp.ab -noapk org.fedorahosted.freeotp
Decompress the backup (this one-liner is Python 2; a Python 3 variant follows below) – dd if=freeotp.ab bs=1 skip=24 | python -c "import zlib,sys;sys.stdout.write(zlib.decompress(sys.stdin.read()))" | tar -xvf -
Decode the keys – https://github.com/philipsharp/FreeOTPDecoder
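The decompress one-liner above is Python 2 only; on a Python 3 box the equivalent sketch is:

# Python 3 variant: read and write bytes via the buffer objects
dd if=freeotp.ab bs=1 skip=24 | python3 -c "import zlib,sys;sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))" | tar -xvf -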

OpenStack – Manually edit a VM

Find the host the VM is running on and the instance ID (use the console view to get the instance ID)

cp /etc/libvirt/qemu/instance-0000030a.xml .
edit instance-0000030a.xml to be what you need it to be

While the VM is running (warning: virsh destroy hard-stops it, so the VM will go down)

virsh destroy instance-0000030a
virsh undefine instance-0000030a
virsh define instance-0000030a.xml
virsh start instance-0000030a
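To double-check the edit actually stuck, dump what libvirt now has on record (sketch):

virsh dumpxml instance-0000030a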

Ceph scrubbing performance

Original article here – http://sudomakeinstall.com/linux-systems/ceph-scrubbing

Ceph's default IO priority and class for behind-the-scenes disk operations treat them as required rather than best-effort. Those of us who actually use our storage for services that need performance will quickly find that deep scrub grinds even the most powerful systems to a halt.

Below are the settings to run the scrub at the lowest possible priority. This REQUIRES CFQ as the scheduler for the spindle disks. Without CFQ you cannot prioritize IO. Since only one service uses these disks, CFQ performance will be comparable to deadline and noop.

Show the current scheduler

for file in /sys/block/sd*; do
  echo ${file}
  cat ${file}/queue/scheduler
  echo ""
done

Set all disks to CFQ

for file in /sys/block/sd*; do
  echo cfq > ${file}/queue/scheduler
  cat ${file}/queue/scheduler
  echo ""
done

Inject the new settings into the existing OSDs:
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
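To confirm the values took, the OSD admin socket can be queried on the node hosting it (osd.0 below is just a placeholder):

ceph daemon osd.0 config show | grep ioprio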

Edit your ceph.conf on your storage nodes (typically under the [osd] section) so the priority is set automatically whenever the OSDs start.
#Reduce impact of scrub.
osd_disk_thread_ioprio_class = "idle"
osd_disk_thread_ioprio_priority = 7

You can go a step further and set up Red Hat's optimizations for the system characteristics.
tuned-adm profile latency-performance
This information is referenced from multiple sources.

Reference documentation.
http://dachary.org/?p=3268

Disable scrubbing in realtime to determine its impact on your running cluster.
http://dachary.org/?p=3157

A detailed analysis of the scrubbing io impact.
http://blog.simon.leinen.ch/2015/02/ceph-deep-scrubbing-impact.html

OSD Configuration Reference
http://ceph.com/docs/master/rados/configuration/osd-config-ref/

Redhat system tuning.
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Tool_Reference-tuned_adm.html

List pools and their crush rule

ceph osd pool ls | while read line; do echo $line && ceph osd pool get $line crush_rule; done