[Bug 1717224] Comment bridged from LTC Bugzilla
bugproxy
bugproxy@us.ibm.com
Fri Nov 10 19:50:30 UTC 2017
------- Comment From swgreenl@us.ibm.com 2017-11-10 14:39 EDT-------
(In reply to comment #31)
> Hi Scott,
> the howto is mixed for Desktop users, Server users and selective upgrades.
> For your case you only need the most simple case which would be:
>
> Essentially you want to:
>
> # Check - all other updates done (to clear the view)
> $ apt list --upgradable
> Listing... Done
>
> # Enable proposed for z on Server
> $ echo "deb http://ports.ubuntu.com/ubuntu-ports/ xenial-proposed main
> restricted universe multiverse" | sudo tee
> /etc/apt/sources.list.d/enable-proposed.list
> $ sudo apt update
> $ apt list --upgradable
> [...]
> linux-headers-generic/xenial-proposed 4.4.0.100.105 s390x [upgradable from:
> 4.4.0.98.103]
> linux-headers-virtual/xenial-proposed 4.4.0.100.105 s390x [upgradable from:
> 4.4.0.98.103]
> linux-image-virtual/xenial-proposed 4.4.0.100.105 s390x [upgradable from:
> 4.4.0.98.103]
>
> # Install just the kernels from proposed
> $ sudo apt install linux-generic
>
> No need to set apt prefs if you only do a selective install.
> If you'd do a global "sudo apt upgrade" you'd get all, but that is likely
> not what you want in your case. After you have done so you can just
> enable/disable the line in /etc/apt/sources.list.d/enable-proposed.list as
> needed.
>
> Hope that helps
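The final enable/disable step in the quoted instructions can be sketched as a simple toggle. This is a minimal illustration run against a local copy of the file; on a real system the target would be /etc/apt/sources.list.d/enable-proposed.list, edited with sudo and followed by `sudo apt update`:

```shell
# Work on a local copy; the real file lives in /etc/apt/sources.list.d/
LIST=./enable-proposed.list
echo "deb http://ports.ubuntu.com/ubuntu-ports/ xenial-proposed main restricted universe multiverse" > "$LIST"

# Disable proposed after the selective install: comment the deb line out
sed -i 's/^deb/# deb/' "$LIST"
grep -c '^# deb' "$LIST"    # prints 1

# Re-enable proposed when the next verification is needed
sed -i 's/^# deb/deb/' "$LIST"
grep -c '^deb' "$LIST"      # prints 1
```

With the line commented out, `apt update` no longer indexes xenial-proposed, so a later `apt upgrade` cannot pull unverified packages from it.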
Yes, your instructions were immensely useful, thanks for the
explanation.
With the proposed fix applied, I am now able to start over 100 virtual
guests, even with aio-max-nr set to 64K:
root@zm93k8:~# cat /proc/sys/fs/aio-max-nr
65535
root@zm93k8:/tmp# virsh list |grep running
86 zs93kag70041 running
87 zs93kag70042 running
88 zs93kag70055 running
89 zs93kag70056 running
90 zs93kag70057 running
91 zs93kag70058 running
92 zs93kag70059 running
93 zs93kag70060 running
94 zs93kag70061 running
95 zs93kag70062 running
96 zs93kag70063 running
97 zs93kag70064 running
98 zs93kag70065 running
99 zs93kag70066 running
100 zs93kag70067 running
101 zs93kag70068 running
102 zs93kag70069 running
103 zs93kag70070 running
104 zs93kag70071 running
105 zs93kag70072 running
106 zs93kag70073 running
107 zs93kag70074 running
108 zs93kag70075 running
109 zs93kag70077 running
110 zs93kag70078 running
111 zs93kag70079 running
112 zs93kag70080 running
113 zs93kag70081 running
114 zs93kag70082 running
115 zs93kag70083 running
116 zs93kag70084 running
117 zs93kag70085 running
118 zs93kag70086 running
119 zs93kag70087 running
120 zs93kag70088 running
121 zs93kag70089 running
122 zs93kag70090 running
123 zs93kag70091 running
124 zs93kag70092 running
125 zs93kag70093 running
126 zs93kag70094 running
127 zs93kag70095 running
128 zs93kag70096 running
129 zs93kag70097 running
130 zs93kag70098 running
131 zs93kag70099 running
132 zs93kag70100 running
133 zs93kag70101 running
134 zs93kag70102 running
135 zs93kag70103 running
136 zs93kag70104 running
137 zs93kag70105 running
138 zs93kag70106 running
139 zs93kag70107 running
140 zs93kag70108 running
141 zs93kag70109 running
142 zs93kag70110 running
143 zs93kag70111 running
144 zs93kag70112 running
145 zs93kag70113 running
146 zs93kag70114 running
147 zs93kag70115 running
148 zs93kag70116 running
149 zs93kag70117 running
150 zs93kag70118 running
151 zs93kag70119 running
152 zs93kag70120 running
153 zs93kag70121 running
154 zs93kag70122 running
155 zs93kag70123 running
156 zs93kag70124 running
157 zs93kag70125 running
158 zs93kag70126 running
159 zs93kag70127 running
160 zs93kag70128 running
161 zs93kag70129 running
162 zs93kag70130 running
163 zs93kag70131 running
164 zs93kag70132 running
165 zs93kag70133 running
166 zs93kag70134 running
167 zs93kag70135 running
168 zs93kag70136 running
169 zs93kag70137 running
170 zs93kag70138 running
172 zs93kag70024 running
173 zs93kag70025 running
174 zs93kag70026 running
175 zs93kag70027 running
176 zs93kag70038 running
177 zs93kag70039 running
178 zs93kag70040 running
179 zs93kag70043 running
180 zs93kag70044 running
181 zs93kag70045 running
182 zs93kag70046 running
183 zs93kag70047 running
184 zs93kag70048 running
185 zs93kag70049 running
186 zs93kag70050 running
187 zs93kag70051 running
188 zs93kag70052 running
189 zs93kag70053 running
190 zs93kag70054 running
191 zs93kag70076 running
root@zm93k8:/tmp#
When will this fix be available to external customers? We will want to
recommend it to our zKVM users. Thank you !
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to procps in Ubuntu.
https://bugs.launchpad.net/bugs/1717224
Title:
virsh start of virtual guest domain fails with internal error due to
low default aio-max-nr sysctl value
Status in Ubuntu on IBM z Systems:
In Progress
Status in kvm package in Ubuntu:
Confirmed
Status in linux package in Ubuntu:
In Progress
Status in procps package in Ubuntu:
New
Status in kvm source package in Xenial:
New
Status in linux source package in Xenial:
In Progress
Status in procps source package in Xenial:
New
Status in kvm source package in Zesty:
New
Status in linux source package in Zesty:
In Progress
Status in procps source package in Zesty:
New
Status in kvm source package in Artful:
Confirmed
Status in linux source package in Artful:
In Progress
Status in procps source package in Artful:
New
Bug description:
Starting virtual guests on Ubuntu 16.04.2 LTS with its KVM hypervisor, installed on an IBM z14 system LPAR, fails on the 18th guest with the following error:
root@zm93k8:/rawimages/ubu1604qcow2# virsh start zs93kag70038
error: Failed to start domain zs93kag70038
error: internal error: process exited while connecting to monitor: 2017-07-26T01:48:26.352534Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70038.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not open backing file: Could not set AIO state: Inappropriate ioctl for device
The previous 17 guests started fine:
root@zm93k8# virsh start zs93kag70020
Domain zs93kag70020 started
root@zm93k8# virsh start zs93kag70021
Domain zs93kag70021 started
.
.
root@zm93k8:/rawimages/ubu1604qcow2# virsh start zs93kag70036
Domain zs93kag70036 started
We ended up fixing the issue by adding the following line to /etc/sysctl.conf:
fs.aio-max-nr = 4194304
... then, reload the sysctl config file:
root@zm93k8:/etc# sysctl -p /etc/sysctl.conf
fs.aio-max-nr = 4194304
Now, we're able to start more guests...
root@zm93k8:/etc# virsh start zs93kag70036
Domain zs93kag70036 started
The default value was originally set to 65536:
root@zm93k8:/rawimages/ubu1604qcow2# cat /proc/sys/fs/aio-max-nr
65536
Note: we chose the 4194304 value because this is what our KVM on System z hypervisor ships as its default. E.g., on our zKVM system:
[root@zs93ka ~]# cat /proc/sys/fs/aio-max-nr
4194304
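When picking a value, it can also help to compare the kernel's current AIO usage against the limit; a small sketch using the standard procfs entries:

```shell
# fs.aio-nr counts AIO contexts currently reserved; fs.aio-max-nr is the cap.
# io_setup() starts failing once a new request would push aio-nr past the cap,
# which is what surfaces as qemu's "Could not set AIO state" error.
aio_nr=$(cat /proc/sys/fs/aio-nr)
aio_max=$(cat /proc/sys/fs/aio-max-nr)
echo "in use: $aio_nr  limit: $aio_max  headroom: $((aio_max - aio_nr))"
```

Watching the headroom shrink while guests with aio=native disks start makes it straightforward to size the limit for the intended number of guests.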
ubuntu@zm93k8:/etc$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Codename: xenial
ubuntu@zm93k8:/etc$
ubuntu@zm93k8:/etc$ dpkg -s qemu-kvm |grep Version
Version: 1:2.5+dfsg-5ubuntu10.8
Is something already documented for Ubuntu KVM users warning them about the low default value, along with guidance on how to select an appropriate value? Also, would you consider raising the default aio-max-nr value to something much higher, to accommodate significantly more virtual guests?
Thanks!
---uname output---
ubuntu@zm93k8:/etc$ uname -a
Linux zm93k8 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:12:54 UTC 2017 s390x s390x s390x GNU/Linux
Machine Type = z14
---Debugger---
A debugger is not configured
---Steps to Reproduce---
See Problem Description.
The problem was happening a week ago, so this data may not reflect that
activity.
This file was collected on Aug 7, one week after we were hitting the
problem. If I need to reproduce the problem and get fresh data,
please let me know.
/var/log/messages doesn't exist on this system, so I provided syslog
output instead.
All data were collected over a week after the problem was observed. If
you need me to reproduce the problem and collect new data, please let
me know; that's not a problem.
Also, we would have to make special arrangements for login access to
these systems. I'm happy to run traces and data collection for you as
needed. If that's not sufficient, then we'll explore log in access
for you.
Thanks... - Scott G.
I was able to successfully recreate the problem and captured and attached new debug docs.
Recreate procedure:
# Started out with no virtual guests running.
ubuntu@zm93k8:/home/scottg$ virsh list
Id Name State
----------------------------------------------------
# Set fs.aio-max-nr back to original Ubuntu "out of the box" value in /etc/sysctl.conf
ubuntu@zm93k8:~$ tail -1 /etc/sysctl.conf
fs.aio-max-nr = 65536
## Before the reload, sysctl -a still shows the previous runtime value:
fs.aio-max-nr = 4194304
## Reload sysctl.
ubuntu@zm93k8:~$ sudo sysctl -p /etc/sysctl.conf
fs.aio-max-nr = 65536
ubuntu@zm93k8:~$
ubuntu@zm93k8:~$ sudo sysctl -a |grep fs.aio-max-nr
fs.aio-max-nr = 65536
ubuntu@zm93k8:~$ cat /proc/sys/fs/aio-max-nr
65536
# Attempt to start more than 17 qcow2 virtual guests on the Ubuntu
host. Fails on the 18th XML.
Script used to start the guests:
ubuntu@zm93k8:/home/scottg$ date;./start_privs.sh
Wed Aug 23 13:21:25 EDT 2017
virsh start zs93kag70015
Domain zs93kag70015 started
Started zs93kag70015 succesfully ...
virsh start zs93kag70020
Domain zs93kag70020 started
Started zs93kag70020 succesfully ...
virsh start zs93kag70021
Domain zs93kag70021 started
Started zs93kag70021 succesfully ...
virsh start zs93kag70022
Domain zs93kag70022 started
Started zs93kag70022 succesfully ...
virsh start zs93kag70023
Domain zs93kag70023 started
Started zs93kag70023 succesfully ...
virsh start zs93kag70024
Domain zs93kag70024 started
Started zs93kag70024 succesfully ...
virsh start zs93kag70025
Domain zs93kag70025 started
Started zs93kag70025 succesfully ...
virsh start zs93kag70026
Domain zs93kag70026 started
Started zs93kag70026 succesfully ...
virsh start zs93kag70027
Domain zs93kag70027 started
Started zs93kag70027 succesfully ...
virsh start zs93kag70028
Domain zs93kag70028 started
Started zs93kag70028 succesfully ...
virsh start zs93kag70029
Domain zs93kag70029 started
Started zs93kag70029 succesfully ...
virsh start zs93kag70030
Domain zs93kag70030 started
Started zs93kag70030 succesfully ...
virsh start zs93kag70031
Domain zs93kag70031 started
Started zs93kag70031 succesfully ...
virsh start zs93kag70032
Domain zs93kag70032 started
Started zs93kag70032 succesfully ...
virsh start zs93kag70033
Domain zs93kag70033 started
Started zs93kag70033 succesfully ...
virsh start zs93kag70034
Domain zs93kag70034 started
Started zs93kag70034 succesfully ...
virsh start zs93kag70035
Domain zs93kag70035 started
Started zs93kag70035 succesfully ...
virsh start zs93kag70036
error: Failed to start domain zs93kag70036
error: internal error: process exited while connecting to monitor: 2017-08-23T17:21:47.131809Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70036.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not open backing file: Could not set AIO state: Inappropriate ioctl for device
Exiting script ... start zs93kag70036 failed
ubuntu@zm93k8:/home/scottg$
# Show that there are only 17 running guests.
ubuntu@zm93k8:/home/scottg$ virsh list |grep run |wc -l
17
ubuntu@zm93k8:/home/scottg$ virsh list
Id Name State
----------------------------------------------------
25 zs93kag70015 running
26 zs93kag70020 running
27 zs93kag70021 running
28 zs93kag70022 running
29 zs93kag70023 running
30 zs93kag70024 running
31 zs93kag70025 running
32 zs93kag70026 running
33 zs93kag70027 running
34 zs93kag70028 running
35 zs93kag70029 running
36 zs93kag70030 running
37 zs93kag70031 running
38 zs93kag70032 running
39 zs93kag70033 running
40 zs93kag70034 running
41 zs93kag70035 running
# For fun, try starting zs93kag70036 again manually.
ubuntu@zm93k8:/home/scottg$ date;virsh start zs93kag70036
Wed Aug 23 13:27:28 EDT 2017
error: Failed to start domain zs93kag70036
error: internal error: process exited while connecting to monitor: 2017-08-23T17:27:30.031782Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70036.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not open backing file: Could not set AIO state: Inappropriate ioctl for device
# Show the XML (they're all basically the same)...
ubuntu@zm93k8:/home/scottg$ cat zs93kag70036.xml
<domain type='kvm'>
<name>zs93kag70036</name>
<memory unit='MiB'>4096</memory>
<currentMemory unit='MiB'>2048</currentMemory>
<vcpu placement='static'>2</vcpu>
<os>
<type arch='s390x' machine='s390-ccw-virtio'>hvm</type>
</os>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>preserve</on_crash>
<devices>
<emulator>/usr/bin/qemu-kvm</emulator>
<disk type='file' device='disk'>
<driver name ='qemu' type='qcow2' cache='none' io='native'/>
<source file='/guestimages/data1/zs93kag70036.qcow2'/>
<target dev='vda' bus='virtio'/>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0000'/>
<boot order='1'/>
</disk>
<interface type='network'>
<source network='privnet1'/>
<model type='virtio'/>
<mac address='52:54:00:70:d0:36'/>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
</interface>
<!--
<disk type='block' device='disk'>
<driver name ='qemu' type='raw' cache='none'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-36005076802810e5540000000000006e4'/>
<target dev='vde' bus='virtio'/>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0005'/>
<readonly/>
</disk>
-->
<disk type='file' device='disk'>
<driver name ='qemu' type='raw' cache='none' io='native'/>
<source file='/guestimages/data1/zs93kag70036.prm'/>
<target dev='vdf' bus='virtio'/>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0006'/>
</disk>
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/guestimages/data1/zs93kag70036.iso'/>
<target dev='sda' bus='scsi'/>
<readonly/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<controller type='usb' index='0' model='none'/>
<memballoon model='none'/>
<console type='pty'>
<target type='sclp' port='0'/>
</console>
</devices>
</domain>
This condition is very easy to replicate. However, we may be losing this system in the next day or two, so please let me know ASAP if you need any more data. Thank you...
- Scott G.
== Comment: #11 - Viktor Mihajlovski <MIHAJLOV@de.ibm.com> - 2017-09-14
In order to support many KVM guests, it is advisable to raise aio-max-nr as suggested in the problem description; see also http://kvmonz.blogspot.co.uk/p/blog-page_7.html. I would also suggest that the system default setting be increased.