delete instance bug
- What happens to the instance in the database when a keystoneclient ConnectionRefused exception is raised here?
- Call chain: compute manager -> do_terminate_instance -> delete_instance -> shutdown_instance -> volume_api.detach
- The instance in the database will indeed be stuck in the deleting task state (nova bug); a minimal sketch of this failure mode follows this list.
- Perhaps the terminate_connection contract in cinderclient changed (cinder bug)?
- Perhaps it's inheriting the keystone error by mistake?
- Where does ClientException appear? https://github.com/openstack/python-cinderclient/search?utf8=%E2%9C%93&q=ClientException
- Where is ClientException raised? https://github.com/openstack/python-cinderclient/blob/635b14b058505e8e17ad8dbcfbf8c2a245b2dc5f/cinderclient/client.py#L306
- Related? "Use adapter from keystoneclient" (Aug 25, 2014) https://github.com/openstack/python-cinderclient/commit/84af49ce97f481731939344c9c45752ea62db5af
- Related? "Allow cinderclient to handle exception response" (April 6, 2015) https://review.openstack.org/#/c/166351/
- Fixed in master? The bug was reported in Juno, which was released around Oct 16, 2014.
- Hmmm: https://specs.openstack.org/openstack/cinder-specs/specs/liberty/implement-force-detach-for-safe-cleanup.html
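- A minimal standalone sketch of the suspected failure mode (toy names and plain dicts, not the actual nova code): an unhandled connection error out of the volume detach step aborts the delete before the instance record is ever updated, so the task state is left at deleting:
class ConnectionRefused(Exception):
    """Stand-in for the keystoneclient/requests connection error."""

class DownVolumeAPI(object):
    def detach(self, context, volume_id):
        # the cinder API is unreachable, so every call blows up
        raise ConnectionRefused("Unable to establish connection to cinder")

def delete_instance(instance, volume_api):
    instance["task_state"] = "deleting"     # persisted before cleanup starts
    for volume_id in instance["volumes"]:
        volume_api.detach(None, volume_id)  # raises; nothing below runs
    instance["task_state"] = None           # never reached
    instance["deleted"] = True              # never reached

instance = {"task_state": None, "deleted": False, "volumes": ["volume-1"]}
try:
    delete_instance(instance, DownVolumeAPI())
except ConnectionRefused:
    pass
print(instance)  # task_state is still 'deleting', deleted is still False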
- Reproduce:
- Create instance via dashboard:
$ nova list
+--------------------------------------+------------+--------+------------+-------------+------------------+
| ID                                   | Name       | Status | Task State | Power State | Networks         |
+--------------------------------------+------------+--------+------------+-------------+------------------+
| 76eea290-5523-40ff-8ca5-90a19cb4d7e2 | instance-1 | ACTIVE | -          | Running     | private=10.0.0.2 |
+--------------------------------------+------------+--------+------------+-------------+------------------+
- Create volume via dashboard:
$ cinder list
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| ID                                   | Status    | Name     | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| 971c61ce-fa14-4749-9646-8b2f441ea3b8 | available | volume-1 | 1    | lvmdriver-1 | false    |             |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
- Attach volume to instance:
$ nova volume-attach 76eea290-5523-40ff-8ca5-90a19cb4d7e2 971c61ce-fa14-4749-9646-8b2f441ea3b8
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdb                             |
| id       | 971c61ce-fa14-4749-9646-8b2f441ea3b8 |
| serverId | 76eea290-5523-40ff-8ca5-90a19cb4d7e2 |
| volumeId | 971c61ce-fa14-4749-9646-8b2f441ea3b8 |
+----------+--------------------------------------+
- Delete instance (success):
$ nova delete 76eea290-5523-40ff-8ca5-90a19cb4d7e2
Request to delete server 76eea290-5523-40ff-8ca5-90a19cb4d7e2 has been accepted.
[diana@localhost devstack]$ nova list
+----+------+--------+------------+-------------+----------+
| ID | Name | Status | Task State | Power State | Networks |
+----+------+--------+------------+-------------+----------+
+----+------+--------+------------+-------------+----------+
- Create another instance and attach the volume:
$ nova list
+--------------------------------------+------------+--------+------------+-------------+------------------+
| ID                                   | Name       | Status | Task State | Power State | Networks         |
+--------------------------------------+------------+--------+------------+-------------+------------------+
| 51d9efbe-056e-4061-87ee-40ebdc78bc22 | instance-2 | ACTIVE | -          | Running     | private=10.0.0.3 |
+--------------------------------------+------------+--------+------------+-------------+------------------+
$ cinder list
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| ID                                   | Status    | Name     | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| 971c61ce-fa14-4749-9646-8b2f441ea3b8 | available | volume-1 | 1    | lvmdriver-1 | false    |             |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
$ nova volume-attach 51d9efbe-056e-4061-87ee-40ebdc78bc22 971c61ce-fa14-4749-9646-8b2f441ea3b8
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdb                             |
| id       | 971c61ce-fa14-4749-9646-8b2f441ea3b8 |
| serverId | 51d9efbe-056e-4061-87ee-40ebdc78bc22 |
| volumeId | 971c61ce-fa14-4749-9646-8b2f441ea3b8 |
+----------+--------------------------------------+
- DevStack screen:
$ screen -ls
There is a screen on:
11460.stack (Detached)
1 Socket in /var/run/screen/S-diana.
$ screen -r 11460.stack
- Bring down the cinder API via screen (c-api):
- This should now fail:
$ cinder list
ERROR: Unable to establish connection to http://192.168.122.8:8776/v2/1dafdc28eced46bea0b4ef3fd1a19954/volumes/detail
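- The failure is just a refused TCP connection to the cinder endpoint; quick to confirm by hand (a sketch using the requests library and the endpoint from the error above):
import requests

try:
    requests.get("http://192.168.122.8:8776/", timeout=2)
except requests.exceptions.ConnectionError as exc:
    # roughly the low-level error that keystoneclient/cinderclient end up wrapping
    print("cinder endpoint unreachable: %s" % exc)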
- What happens when we delete the instance with the cinder API down? Yup, the delete is accepted but the instance ends up stuck in the ERROR state.
$ nova delete 51d9efbe-056e-4061-87ee-40ebdc78bc22
Request to delete server 51d9efbe-056e-4061-87ee-40ebdc78bc22 has been accepted.
[diana@localhost devstack]$ nova list
+--------------------------------------+------------+--------+------------+-------------+------------------+
| ID                                   | Name       | Status | Task State | Power State | Networks         |
+--------------------------------------+------------+--------+------------+-------------+------------------+
| 51d9efbe-056e-4061-87ee-40ebdc78bc22 | instance-2 | ERROR  | -          | Running     | private=10.0.0.3 |
+--------------------------------------+------------+--------+------------+-------------+------------------+
- Reset the instance state back to ACTIVE (novaclient equivalent sketched below):
$ nova reset-state --active 51d9efbe-056e-4061-87ee-40ebdc78bc22
Reset state for server 51d9efbe-056e-4061-87ee-40ebdc78bc22 succeeded; new state is active
$ nova list
+--------------------------------------+------------+--------+------------+-------------+----------+
| ID                                   | Name       | Status | Task State | Power State | Networks |
+--------------------------------------+------------+--------+------------+-------------+----------+
| 51d9efbe-056e-4061-87ee-40ebdc78bc22 | instance-2 | ACTIVE | -          | NOSTATE     |          |
+--------------------------------------+------------+--------+------------+-------------+----------+
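- The same reset can also be done from python-novaclient; a minimal sketch, with placeholder credentials and keystone URL for a devstack-style setup:
from novaclient import client

# placeholder username / password / tenant / auth URL
nova = client.Client("2", "admin", "secret", "demo",
                     "http://192.168.122.8:5000/v2.0")
server = nova.servers.get("51d9efbe-056e-4061-87ee-40ebdc78bc22")
nova.servers.reset_state(server, "active")  # same effect as: nova reset-state --active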
- Logs during a failed instance delete (with c-api down):
2016-03-29 13:19:49.004 DEBUG nova.osapi_compute.wsgi.server [req-50801e06-3ec0-463a-983d-6d53c693542e None None] (17028) accepted ('192.168.122.8', 41163) from (pid=17028) server /usr/lib/python2.7/site-packages/eventlet/wsgi.py:867
2016-03-29 13:19:49.053 INFO nova.osapi_compute.wsgi.server [req-820e7f43-b8c1-47d9-9f4e-11f98402cb26 admin demo] 192.168.122.8 "GET /v2.1/1dafdc28eced46bea0b4ef3fd1a19954 HTTP/1.1" status: 404 len: 264 time: 0.0479391
2016-03-29 13:19:49.058 DEBUG nova.api.openstack.wsgi [req-00268547-c0d8-4d9a-a1ee-84ad5c859feb admin demo] Calling method '<bound method VersionsController.show of <nova.api.openstack.compute.versionsV21.VersionsController object at 0x7f4479a028d0>>' from (pid=17028) _process_stack /opt/stack/nova/nova/api/openstack/wsgi.py:699
2016-03-29 13:19:49.063 INFO nova.osapi_compute.wsgi.server [req-00268547-c0d8-4d9a-a1ee-84ad5c859feb admin demo] 192.168.122.8 "GET /v2.1/ HTTP/1.1" status: 200 len: 655 time: 0.0083120
2016-03-29 13:19:49.204 DEBUG nova.api.openstack.wsgi [req-66951ebb-bc69-4c2d-a49a-edd33a229aa7 admin demo] Calling method '<bound method ServersController.show of <nova.api.openstack.compute.servers.ServersController object at 0x7f4479b1f350>>' from (pid=17028) _process_stack /opt/stack/nova/nova/api/openstack/wsgi.py:699
2016-03-29 13:19:49.205 DEBUG nova.compute.api [req-66951ebb-bc69-4c2d-a49a-edd33a229aa7 admin demo] [instance: b25f6951-c5c0-4bac-a3c0-c7ebecd93e5d] Fetching instance by UUID from (pid=17028) get /opt/stack/nova/nova/compute/api.py:2020
2016-03-29 13:19:49.332 DEBUG nova.objects.instance [req-66951ebb-bc69-4c2d-a49a-edd33a229aa7 admin demo] Lazy-loading 'fault' on Instance uuid b25f6951-c5c0-4bac-a3c0-c7ebecd93e5d from (pid=17028) obj_load_attr /opt/stack/nova/nova/objects/instance.py:895
2016-03-29 13:19:49.367 DEBUG nova.policy [req-66951ebb-bc69-4c2d-a49a-edd33a229aa7 admin demo] Policy check for os_compute_api:os-hide-server-addresses failed with credentials {'domain': None, 'project_name': u'demo', 'project_domain': None, 'timestamp': '2016-03-29T17:19:49.201691', 'remote_address': '192.168.122.8', 'quota_class': None, 'resource_uuid': None, 'is_admin': True, 'user': u'e2b98906b1a0432bb892ae37ca89b2fe', 'service_catalog': [{u'endpoints': [{u'adminURL': u'http://192.168.122.8:8776/v2/1dafdc28eced46bea0b4ef3fd1a19954', u'region': u'RegionOne', u'internalURL': u'http://192.168.122.8:8776/v2/1dafdc28eced46bea0b4ef3fd1a19954', u'publicURL': u'http://192.168.122.8:8776/v2/1dafdc28eced46bea0b4ef3fd1a19954'}], u'type': u'volumev2', u'name': u'cinderv2'}, {u'endpoints': [{u'adminURL': u'http://192.168.122.8:8776/v1/1dafdc28eced46bea0b4ef3fd1a19954', u'region': u'RegionOne', u'internalURL': u'http://192.168.122.8:8776/v1/1dafdc28eced46bea0b4ef3fd1a19954', u'publicURL': u'http://192.168.122.8:8776/v1/1dafdc28eced46bea0b4ef3fd1a19954'}], u'type': u'volume', u'name': u'cinder'}], 'tenant': u'1dafdc28eced46bea0b4ef3fd1a19954', 'read_only': False, 'project_id': u'1dafdc28eced46bea0b4ef3fd1a19954', 'user_id': u'e2b98906b1a0432bb892ae37ca89b2fe', 'show_deleted': False, 'roles': [u'admin'], 'user_identity': 'e2b98906b1a0432bb892ae37ca89b2fe 1dafdc28eced46bea0b4ef3fd1a19954 - - -', 'read_deleted': 'no', 'request_id': 'req-66951ebb-bc69-4c2d-a49a-edd33a229aa7', 'instance_lock_checked': False, 'user_domain': None, 'user_name': u'admin'} from (pid=17028) enforce /opt/stack/nova/nova/policy.py:107
2016-03-29 13:19:49.554 INFO nova.osapi_compute.wsgi.server [req-66951ebb-bc69-4c2d-a49a-edd33a229aa7 admin demo] 192.168.122.8 "GET /v2.1/1dafdc28eced46bea0b4ef3fd1a19954/servers/b25f6951-c5c0-4bac-a3c0-c7ebecd93e5d HTTP/1.1" status: 200 len: 6520 time: 0.3544841
2016-03-29 13:19:49.571 DEBUG nova.api.openstack.wsgi [req-8b9dfd45-5b24-44fd-abf4-4b6067828f8e admin demo] Calling method '<bound method ServersController.delete of <nova.api.openstack.compute.servers.ServersController object at 0x7f4479b1f350>>' from (pid=17028) _process_stack /opt/stack/nova/nova/api/openstack/wsgi.py:699
2016-03-29 13:19:49.573 DEBUG nova.compute.api [req-8b9dfd45-5b24-44fd-abf4-4b6067828f8e admin demo] [instance: b25f6951-c5c0-4bac-a3c0-c7ebecd93e5d] Fetching instance by UUID from (pid=17028) get /opt/stack/nova/nova/compute/api.py:2020
2016-03-29 13:19:49.777 DEBUG nova.compute.api [req-8b9dfd45-5b24-44fd-abf4-4b6067828f8e admin demo] [instance: b25f6951-c5c0-4bac-a3c0-c7ebecd93e5d] Going to try to terminate instance from (pid=17028) delete /opt/stack/nova/nova/compute/api.py:1917
2016-03-29 13:19:49.949 DEBUG nova.quota [req-8b9dfd45-5b24-44fd-abf4-4b6067828f8e admin demo] Quota limits for project 1dafdc28eced46bea0b4ef3fd1a19954: {'project_id': u'1dafdc28eced46bea0b4ef3fd1a19954'} from (pid=17028) reserve /opt/stack/nova/nova/quota.py:559
2016-03-29 13:19:49.949 DEBUG nova.quota [req-8b9dfd45-5b24-44fd-abf4-4b6067828f8e admin demo] Getting quotas for project 1dafdc28eced46bea0b4ef3fd1a19954. Resources: ['instances', 'ram', 'cores'] from (pid=17028) _get_quotas /opt/stack/nova/nova/quota.py:409
2016-03-29 13:19:50.049 DEBUG nova.quota [req-8b9dfd45-5b24-44fd-abf4-4b6067828f8e admin demo] Created reservations ['6c31c4d2-5d12-4efa-b656-567db369592e', '9f5c6887-579b-4ed2-a9f7-45fb82506837', '0c0a1e41-ad6f-4184-9d1c-16fe331c00e4'] from (pid=17028) reserve /opt/stack/nova/nova/quota.py:1345
2016-03-29 13:19:50.144 DEBUG oslo_messaging._drivers.amqpdriver [req-8b9dfd45-5b24-44fd-abf4-4b6067828f8e admin demo] CAST unique_id: 5524e64ea0c64ac9ba9ee0dcd4841120 exchange 'nova' topic 'compute' from (pid=17028) _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:454
2016-03-29 13:19:50.169 INFO nova.osapi_compute.wsgi.server [req-8b9dfd45-5b24-44fd-abf4-4b6067828f8e admin demo] 192.168.122.8 "DELETE /v2.1/1dafdc28eced46bea0b4ef3fd1a19954/servers/b25f6951-c5c0-4bac-a3c0-c7ebecd93e5d HTTP/1.1" status: 204 len: 275 time: 0.6138361
- Hmmm... this isn't actually hitting the code in question ;(
- Trying again. I think I was looking at the wrong logs.
$ nova list
+--------------------------------------+------------+--------+------------+-------------+------------------+
| ID                                   | Name       | Status | Task State | Power State | Networks         |
+--------------------------------------+------------+--------+------------+-------------+------------------+
| 70bca485-9244-4ada-8ee9-cb9657df839f | instance-c | ACTIVE | -          | Running     | private=10.0.0.4 |
+--------------------------------------+------------+--------+------------+-------------+------------------+
$ cinder list
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| ID                                   | Status    | Name     | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| 98298ff3-7b5a-4167-aa1d-ba33bfcf3d83 | available | volume-c | 1    | lvmdriver-1 | false    |             |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
$ nova volume-attach 70bca485-9244-4ada-8ee9-cb9657df839f 98298ff3-7b5a-4167-aa1d-ba33bfcf3d83
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdb                             |
| id       | 98298ff3-7b5a-4167-aa1d-ba33bfcf3d83 |
| serverId | 70bca485-9244-4ada-8ee9-cb9657df839f |
| volumeId | 98298ff3-7b5a-4167-aa1d-ba33bfcf3d83 |
+----------+--------------------------------------+
- Bring c-api down via screen:
$ nova delete 70bca485-9244-4ada-8ee9-cb9657df839f
Request to delete server 70bca485-9244-4ada-8ee9-cb9657df839f has been accepted.
2016-03-29 16:11:05.181 ERROR nova.compute.manager [req-051c3fa7-53bb-48f5-8400-6c023649a921 admin alt_demo] Connection to cinder host failed: Unable to establish connection to http://192.168.122.8:8776/v2/4d7ab515a49b47d98f6a72ea3d4c0ed8/volumes/98298ff3-7b5a-4167-aa1d-ba33bfcf3d83/action
$ nova list
+----+------+--------+------------+-------------+----------+
| ID | Name | Status | Task State | Power State | Networks |
+----+------+--------+------------+-------------+----------+
+----+------+--------+------------+-------------+----------+
- The above delete succeeded because I added a catch-all ("diaper") exception handler to this chain: https://github.com/openstack/nova/blob/1906aa421433713c2719dcd1ce57c541fb46fbd2/nova/compute/manager.py#L2355 (a rough sketch of the approach is at the end of this page)
- Conclusion: able to reproduce; submit a similar patch upstream.
- Added upstream bug: https://bugs.launchpad.net/nova/+bug/1563547
- Example error message, post patch:
2016-04-01 09:28:50.236 WARNING nova.compute.manager [req-bcce6323-45bb-4b2e-9a59-d1d30559ac94 admin alt_demo] [instance: 60147f99-f604-460a-b127-7664a825c095] Ignoring unknown exception for volume b87f98e4-edf1-4fbe-bd12-0772e606d833: Connection to cinder host failed: Unable to establish connection to http://192.168.122.8:8776/v2/4d7ab515a49b47d98f6a72ea3d4c0ed8/volumes/b87f98e4-edf1-4fbe-bd12-0772e606d833/action
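- For reference, a rough standalone sketch of the diaper-exception approach (toy names like cleanup_volumes, not the actual upstream patch): wrap each per-volume cleanup call, log a warning like the one above, and carry on so the instance delete can still finish:
import logging

LOG = logging.getLogger(__name__)

def cleanup_volumes(volume_api, context, bdms, connector):
    """Best-effort volume cleanup while deleting an instance."""
    for bdm in bdms:
        if not bdm.is_volume:
            continue
        try:
            volume_api.terminate_connection(context, bdm.volume_id, connector)
            volume_api.detach(context, bdm.volume_id)
        except Exception as exc:
            # cinder may be unreachable (e.g. c-api down); swallow the error
            # so the instance delete can still complete
            LOG.warning("Ignoring unknown exception for volume %s: %s",
                        bdm.volume_id, exc)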