[Bug 1904585] [NEW] opal-prd: Have a worker process handle page offlining (Fixes "PlatServices: dyndealloc memory_error() failed" is getting reported in error log (opal-prd))
bugproxy
1904585 at bugs.launchpad.net
Tue Nov 17 14:59:48 UTC 2020
Public bug reported:
---Problem Description---
https://github.com/open-power/skiboot/commit/8cbd0de88d162e387f11569eee1bdecef8fad2e3
opal-prd: Have a worker process handle page offlining
The memory_error() hservice interface expects the memory_error() call to
just accept the offline request and return without actually offlining the
memory. Currently we will attempt to offline the marked pages before
returning to HBRT which can result in an excessively long time spent in the
memory_error() hservice call which blocks HBRT from processing other
errors. Fix this by adding a worker process which performs the page
offlining via the sysfs memory error interfaces.
Reviewed-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall at gmail.com>
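For readers unfamiliar with the pattern, the worker-process approach described in the commit message can be sketched roughly as below. This is an illustration only, not the code from the skiboot commit: the function names (offline_page, worker_loop, start_worker, queue_offline_range) and the pipe-based request format are assumptions made for this sketch. The sysfs file /sys/devices/system/memory/soft_offline_page is a standard kernel interface on kernels built with memory-failure support; whether opal-prd uses it in exactly this way is not stated in this report.

```c
/* Sketch only: hypothetical names, not the skiboot implementation. */
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kernel sysfs file that soft-offlines the page containing a physical
 * address. Writing to it normally requires root. */
#define SOFT_OFFLINE_PATH "/sys/devices/system/memory/soft_offline_page"

/* Write one physical address to the offlining file. Returns 0 on success. */
static int offline_page(const char *path, uint64_t addr)
{
	char buf[32];
	int fd, len, rc = 0;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	len = snprintf(buf, sizeof(buf), "0x%016" PRIx64 "\n", addr);
	if (write(fd, buf, len) != len)
		rc = -1;
	close(fd);
	return rc;
}

/* Worker: drain requests from the pipe until the writer closes it.
 * The potentially slow offlining happens here, not in the caller. */
static void worker_loop(int rfd, const char *path)
{
	uint64_t addr;

	while (read(rfd, &addr, sizeof(addr)) == (ssize_t)sizeof(addr))
		offline_page(path, addr);
}

/* Fork the worker. Returns the write end of the request pipe, or -1. */
static int start_worker(const char *path, pid_t *pid_out)
{
	int fds[2];
	pid_t pid;

	if (pipe(fds) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		close(fds[1]);
		worker_loop(fds[0], path);
		_exit(0);
	}
	close(fds[0]);
	*pid_out = pid;
	return fds[1];
}

/* Called from a memory_error()-style handler: queue every page in
 * [start, end] for the worker and return without offlining anything. */
static int queue_offline_range(int wfd, uint64_t start, uint64_t end,
			       uint64_t page_size)
{
	uint64_t addr;

	if (page_size == 0 || start > end)
		return -1;
	for (addr = start; ; addr += page_size) {
		if (write(wfd, &addr, sizeof(addr)) != (ssize_t)sizeof(addr))
			return -1;
		if (end - addr < page_size)	/* last page; avoids wraparound */
			break;
	}
	return 0;
}
```

In this sketch a daemon would call start_worker(SOFT_OFFLINE_PATH, &pid) once at startup and then call queue_offline_range() from the error handler. The handler only writes 8-byte requests into the pipe, so it returns quickly unless the pipe buffer is full, while the worker performs the sysfs writes in the background.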
Thanks in advance for your support.
Machine Type = Power8 and Power9 OPAL systems
---Steps to Reproduce---
* Inject memory error (UE)
* Observe that opal-prd does not return to the platform immediately after the memory offlining request; it blocks until the pages have been offlined
Userspace tool common name: opal-prd
We need this fix for 16.04.x and 18.04.x LTS releases.
The fix is also needed for 20.04 and 20.10.
** Affects: ubuntu-power-systems
Importance: Critical
Assignee: Canonical Foundations Team (canonical-foundations)
Status: New
** Affects: skiboot (Ubuntu)
Importance: Undecided
Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
Status: New
** Tags: architecture-ppc64le bugnameltc-189252 severity-critical targetmilestone-inin18045
** Tags added: architecture-ppc64le bugnameltc-189252 severity-critical
targetmilestone-inin---
** Changed in: ubuntu
Assignee: (unassigned) => Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
** Package changed: ubuntu => skiboot (Ubuntu)
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to skiboot in Ubuntu.
Matching subscriptions: foundations-bugs-skiboot
https://bugs.launchpad.net/bugs/1904585
Title:
opal-prd: Have a worker process handle page offlining (Fixes
"PlatServices: dyndealloc memory_error() failed" is getting reported
in error log (opal-prd))
Status in The Ubuntu-power-systems project:
New
Status in skiboot package in Ubuntu:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1904585/+subscriptions