[PATCH] opal: prd_info: Add resilience to service check

Deb McLemore debmc at linux.vnet.ibm.com
Mon Apr 9 13:28:13 UTC 2018


Output from each system:


~/fwts$ opal-prd --version
opal-prd opal-prd-5.1.13

~/fwts$ sudo systemctl stop opal-prd.service
Warning: Stopping opal-prd.service, but it can still be activated by:
  opal-prd.socket


~/fwts$ opal-prd --version
opal-prd opal-prd-5.4.3

~/fwts$ sudo systemctl stop opal-prd.service

On 04/09/2018 08:07 AM, Deb McLemore wrote:

> Just an update on this: I have narrowed it down to the Host OS systems
> (Ubuntu 16.04) having different levels of the opal-prd daemon. So far it
> seems that, after some changes, fwts_pipe_readwrite does not return some
> socket info that it used to, so we may be taking different paths. There is
> a fix we can do: in the case where no socket data comes back from the
> systemctl stop command, look only at the return code from the child exit
> process (fwts_pipe_close2) and not at the output buffer of the socket
> handling. We really need to look deeper to see the underlying issue more
> clearly, but I wanted to update the mailing list.
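
For illustration only, here is a minimal sketch of the "look only at the
child exit code" idea. It uses plain popen()/pclose() instead of the fwts
pipe helpers, so it is not how fwts_exec2 or fwts_pipe_close2 are
implemented, and the name run_and_get_exit_code is hypothetical. The point
is that an empty output stream is not an error by itself; only the wait
status reported when the pipe is closed decides the result.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not part of fwts): run a shell command, discard
 * whatever it prints, and return the child's exit code.
 * Returns -1 if the command could not be run or did not exit normally.
 */
static int run_and_get_exit_code(const char *command)
{
	char buf[256];
	FILE *fp;
	int wstatus;

	assert(command != NULL);

	fp = popen(command, "r");
	if (fp == NULL)
		return -1;

	/* Drain the output; reading nothing at all is not treated as a failure. */
	while (fgets(buf, sizeof(buf), fp) != NULL)
		;

	/* pclose() returns the wait status of the child shell. */
	wstatus = pclose(fp);
	if (wstatus == -1 || !WIFEXITED(wstatus))
		return -1;

	return WEXITSTATUS(wstatus);
}
```

Called as run_and_get_exit_code("systemctl stop opal-prd.service 2>&1"),
this would report the exit code of the stop command whether or not it
printed anything.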
>
>
> $ opal-prd --version
> opal-prd opal-prd-5.1.13
>
>
> $ opal-prd --version
> opal-prd opal-prd-5.4.3
>
>
> On 04/07/2018 01:41 PM, Deborah McLemore wrote:
>> The case I reproduced was manually running "fwts prd_info". All it does
>> is a 'systemctl status' and then, if the service is 'running', a
>> 'systemctl stop'. The 'systemctl stop' fails with -1.
>> It works OK on some levels of Ubuntu and not on others. I will investigate
>> further to find the root differences, but the proposed enhancement of
>> ignoring the 'systemctl stop' exit status is a good one, since we did get a
>> successful status of 'running' from the 'systemctl status' query.
>> The 'systemctl stop' functionally works (the service is stopped); it's just
>> the exit status from the 'systemctl stop' that is -1 on some OSes. We should
>> be more resilient. We only attempt a 'systemctl start' after the test runs
>> if we had determined that the service was 'running' and tried the
>> 'systemctl stop', so the start is probably not requested too quickly, but
>> it is possible.
>> =====================================
>> Deb McLemore
>> IBM OpenPower - IBM Systems
>> (512) 286 9980
>>
>> debmc at us.ibm.com
>> debmc at linux.vnet.ibm.com - (plain text)
>> =====================================
>>
>>     ----- Original message -----
>>     From: ppaidipe <ppaidipe at linux.vnet.ibm.com>
>>     To: Deborah McLemore/Austin/IBM at IBMUS
>>     Cc: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>, Deb McLemore
>>     <debmc at linux.vnet.ibm.com>, fwts-devel at lists.ubuntu.com
>>     Subject: Re: [PATCH] opal: prd_info: Add resilience to service check
>>     Date: Sat, Apr 7, 2018 1:16 PM
>>     On 2018-04-07 20:50, Deborah McLemore wrote:
>>      > We are getting -1 back. What is the expected exit status from
>>      > 'systemctl stop'?
>>      >
>>
>>     From the execution of the test, what I understand is that we are
>>     requesting start/stop of the service too quickly, which made the test
>>     fail.
>>
>>     Apr 07 13:11:18 xxxxxxxxxxx systemd[1]: opal-prd.service: Start request repeated too quickly.
>>     Apr 07 13:11:18 xxxxxxxxxxx systemd[1]: opal-prd.service: Failed with result 'start-limit-hit'.
>>     Apr 07 13:11:18 xxxxxxxxxxx systemd[1]: Failed to start OPAL PRD daemon.
>>
>>     So we need to request a start/restart only once the stop has
>>     completed, and request a stop only when the daemon is already started.
>>
>>
>>     Thanks
>>     Pridhiviraj
>>
>>      > Sent from my iPhone
>>      >
>>      >> On Apr 7, 2018, at 9:23 AM, Vasant Hegde
>>      > <hegdevasant at linux.vnet.ibm.com> wrote:
>>      >>
>>      >>> On 04/07/2018 07:40 PM, Deb McLemore wrote:
>>      >>> When the opal-prd.service is running and an attempt to stop it is
>>      >>> performed, ignore the exit status and continue.
>>      >>
>>      >> Deb,
>>      >>
>>      >> Can you please explain why you want to ignore the exit status here?
>>      >> Are there any issues?
>>      >>
>>      >> -Vasant
>>      >>
>>      >>
>>      >>
>>      >>>
>>      >>> Signed-off-by: Deb McLemore <debmc at linux.vnet.ibm.com>
>>      >>> ---
>>      >>> src/opal/prd_info.c | 20 ++++----------------
>>      >>> 1 file changed, 4 insertions(+), 16 deletions(-)
>>      >>>
>>      >>> diff --git a/src/opal/prd_info.c b/src/opal/prd_info.c
>>      >>> index 4082a18..2db9413 100644
>>      >>> --- a/src/opal/prd_info.c
>>      >>> +++ b/src/opal/prd_info.c
>>      >>> @@ -73,7 +73,7 @@ static int prd_dev_query(fwts_framework *fw)
>>      >>>
>>      >>> static int prd_service_check(fwts_framework *fw, int *restart)
>>      >>> {
>>      >>> - int rc = FWTS_OK, status = 0, stop_status = 0;
>>      >>> + int rc = FWTS_OK, status = 0;
>>      >>> char *command;
>>      >>> char *output = NULL;
>>      >>>
>>      >>> @@ -97,25 +97,13 @@ static int prd_service_check(fwts_framework *fw, int *restart)
>>      >>> goto out;
>>      >>> case 0: /* "running" */
>>      >>> command = "systemctl stop opal-prd.service 2>&1";
>>      >>> - stop_status = fwts_exec2(command, &output);
>>      >>> + fwts_exec2(command, &output);
>>      >>>
>>      >>> if (output)
>>      >>> free(output);
>>      >>>
>>      >>> - switch (stop_status) {
>>      >>> - case 0:
>>      >>> - *restart = 1;
>>      >>> - break;
>>      >>> - default:
>>      >>> - fwts_failed(fw, LOG_LEVEL_HIGH, "OPAL PRD Info",
>>      >>> - "Attempt was made to stop the "
>>      >>> - "opal-prd.service but was not "
>>      >>> - "successful. Try to "
>>      >>> - "\"sudo systemctl stop "
>>      >>> - "opal-prd.service\" and retry.");
>>      >>> - rc = FWTS_ERROR;
>>      >>> - goto out;
>>      >>> - }
>>      >>> + *restart = 1;
>>      >>> + break;
>>      >>> default:
>>      >>> break;
>>      >>> }
>>      >>>
>>      >>
>>      >>
>>      >> --
>>      >> fwts-devel mailing list
>>      >> fwts-devel at lists.ubuntu.com
>>      >> Modify settings or unsubscribe at:
>>      >
>>     https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.ubuntu.com_mailman_listinfo_fwts-2Ddevel&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=V3KRDPsp3yMosW9R4elWYg&m=Sy-O20yWd_N3piZoJOEzigB1XzmLV4OUCfEyl3ENAcc&s=oPh1ACx1NGTgif-0V5BIQffXXqjymI8QC_bagI2jZsA&e=
>>
>>



