NAK: [SRU][Trusty][PATCH 1/1] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c

joseph.salisbury at canonical.com joseph.salisbury at canonical.com
Sat Nov 14 18:28:15 UTC 2015


Thanks for the feedback, Tim.  

On November 14, 2015 9:57:51 AM EST, Tim Gardner <tim.gardner at canonical.com> wrote:
>On 11/12/2015 01:16 PM, Tim Gardner wrote:
>> On 11/12/2015 11:24 AM, Joseph Salisbury wrote:
>>> From: Kosuke Tatsukawa <tatsu at ab.jp.nec.com>
>>>
>>> BugLink: http://bugs.launchpad.net/bugs/1512815
>>>
>>> My colleague ran into a program stall on a x86_64 server, where
>>> n_tty_read() was waiting for data even if there was data in the
>buffer
>>> in the pty.  kernel stack for the stuck process looks like below.
>>>   #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
>>>   #1 [ffff88303d107bd0] schedule at ffffffff815c513e
>>>   #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
>>>   #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
>>>   #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
>>>   #5 [ffff88303d107dd0] tty_read at ffffffff81368013
>>>   #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
>>>   #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
>>>   #8 [ffff88303d107f00] sys_read at ffffffff811a4306
>>>   #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at
>ffffffff815c86d7
>>>
>>> There seems to be two problems causing this issue.
>>>
>>> First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
>>> updates ldata->commit_head using smp_store_release() and then checks
>>> the wait queue using waitqueue_active().  However, since there is no
>>> memory barrier, __receive_buf() could return without calling
>>> wake_up_interactive_poll(), and at the same time, n_tty_read() could
>>> start to wait in wait_woken() as in the following chart.
>>>
>>>          __receive_buf()                         n_tty_read()
>>>
>------------------------------------------------------------------------
>>> if (waitqueue_active(&tty->read_wait))
>>> /* Memory operations issued after the
>>>     RELEASE may be completed before the
>>>     RELEASE operation has completed */
>>>
>>> add_wait_queue(&tty->read_wait, &wait);
>>>                                          ...
>>>                                          if (!input_available_p(tty,
>>> 0)) {
>>> smp_store_release(&ldata->commit_head,
>>>                    ldata->read_head);
>>>                                          ...
>>>                                          timeout = wait_woken(&wait,
>>>                                            TASK_INTERRUPTIBLE,
>timeout);
>>>
>------------------------------------------------------------------------
>>>
>>> The second problem is that n_tty_read() also lacks a memory barrier
>>> call and could also cause __receive_buf() to return without calling
>>> wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
>>> as in the chart below.
>>>
>>>          __receive_buf()                         n_tty_read()
>>>
>------------------------------------------------------------------------
>>>                                          spin_lock_irqsave(&q->lock,
>>> flags);
>>>                                          /* from add_wait_queue() */
>>>                                          ...
>>>                                          if (!input_available_p(tty,
>>> 0)) {
>>>                                          /* Memory operations issued
>>> after the
>>>                                             RELEASE may be completed
>>> before the
>>>                                             RELEASE operation has
>>> completed */
>>> smp_store_release(&ldata->commit_head,
>>>                    ldata->read_head);
>>> if (waitqueue_active(&tty->read_wait))
>>>                                          __add_wait_queue(q, wait);
>>>
>>> spin_unlock_irqrestore(&q->lock,flags);
>>>                                          /* from add_wait_queue() */
>>>                                          ...
>>>                                          timeout = wait_woken(&wait,
>>>                                            TASK_INTERRUPTIBLE,
>timeout);
>>>
>------------------------------------------------------------------------
>>>
>>> There are also other places in drivers/tty/n_tty.c which have
>similar
>>> calls to waitqueue_active(), so instead of adding many memory
>barrier
>>> calls, this patch simply removes the call to waitqueue_active(),
>>> leaving just wake_up*() behind.
>>>
>>> This fixes both problems because, even though the memory access
>before
>>> or after the spinlocks in both wake_up*() and add_wait_queue() can
>>> sneak into the critical section, it cannot go past it and the
>critical
>>> section assures that they will be serialized (please see "INTER-CPU
>>> ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for
>a
>>> better explanation).  Moreover, the resulting code is much simpler.
>>>
>>> Latency measurement using a ping-pong test over a pty doesn't show
>any
>>> visible performance drop.
>>>
>>> Signed-off-by: Kosuke Tatsukawa <tatsu at ab.jp.nec.com>
>>> Cc: stable at vger.kernel.org
>>> Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
>>> (backported from commit e81107d4c6bd098878af9796b24edc8d4a9524fd)
>>> Signed-off-by: Joseph Salisbury <joseph.salisbury at canonical.com>
>>> ---
>>>   drivers/tty/n_tty.c | 11 +++++------
>>>   1 file changed, 5 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
>>> index 84dcdf4..2d4088d 100644
>>> --- a/drivers/tty/n_tty.c
>>> +++ b/drivers/tty/n_tty.c
>>> @@ -363,6 +363,7 @@ static void n_tty_packet_mode_flush(struct
>>> tty_struct *tty)
>>>       spin_lock_irqsave(&tty->ctrl_lock, flags);
>>>       if (tty->link->packet) {
>>>           tty->ctrl_status |= TIOCPKT_FLUSHREAD;
>>> +        spin_unlock_irqrestore(&tty->ctrl_lock, flags);
>>>           wake_up_interruptible(&tty->link->read_wait);
>>>       }
>>>       spin_unlock_irqrestore(&tty->ctrl_lock, flags);
>>
>> This doesn't look right. I think spin lock/unlock have to be
>balanced,
>> plus it was not part of the original patch.
>>
>>> @@ -1173,7 +1174,7 @@ static void n_tty_receive_break(struct
>>> tty_struct *tty)
>>>           put_tty_queue('\0', ldata);
>>>       }
>>>       put_tty_queue('\0', ldata);
>>> -    wake_up_interruptible(&tty->read_wait);
>>> +    wake_up_interruptible_poll(&tty->read_wait, POLLIN);
>>>   }
>>>
>>
>> I'm also a little leery of using wake_up_interruptible_poll() when
>the
>> original code was wake_up_interruptible(), though upstream is
>currently
>> using wake_up_interruptible_poll(). However, test results appear to
>be
>> positive, so meh.
>>
>> See attached backport.
>>
>> rtg
>>
>>
>
>well, my backport isn't completely right either. Just reproduce your 
>original patch without the spin_unlock_irqrestore() changes.
>
>rtg
>-- 
>Tim Gardner tim.gardner at canonical.com

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20151114/9330bca5/attachment.html>


More information about the kernel-team mailing list