[Bug 1861776] Re: fseek(…, …, SEEK_SET) causes reading over the skipped range

Tue Dec 8 03:17:33 UTC 2020

Launchpad has imported 6 comments from the remote bug at
https://sourceware.org/bugzilla/show_bug.cgi?id=25497.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2020-02-03T16:12:07+00:00 Konstantin wrote:

When fseek() is called, it in turn calls lseek() (as expected) and then
calls read() over the skipped range (not expected). In the best case,
it's a waste of CPU and I/O resources. In the worst case, an application
that tries to skip too large a range simply hangs in fseek().

This is a follow-up to the discussion at
https://sourceware.org/ml/libc-help/2020-01/threads.html#00046

# Steps to reproduce (in terms of terminal commands)

    $ cat test.c
    #include <fcntl.h>
    #include <stdio.h>

    int main() {
        FILE* f = fopen("/tmp/test.c", "r");
        if (!f)
            perror("");
        fseek(f, 30, SEEK_SET);
    }
    $ gcc test.c -o a
    $ strace ./a 2>&1 | tail
    mprotect(0x7fd2c36c1000, 4096, PROT_READ) = 0
    munmap(0x7fd2c3628000, 451693)          = 0
    brk(NULL)                               = 0x557c9e900000
    brk(0x557c9e921000)                     = 0x557c9e921000
    openat(AT_FDCWD, "/tmp/test.c", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=155, ...}) = 0
    lseek(3, 0, SEEK_SET)                   = 0
    read(3, "#include <fcntl.h>\n#include <s", 30) = 30
    exit_group(0)                           = ?
    +++ exited with

## Expected

There is no read() call after lseek().

## Actual

Both lseek() and read() are called.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1861776/comments/0

------------------------------------------------------------------------
On 2020-02-03T16:19:43+00:00 Carlos-0 wrote:

I'm not sure what the consequences are for optimizing away the read as
part of the FILE buffer management. That is the question that would need
to be answered here before we could do something like this.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1861776/comments/1

------------------------------------------------------------------------
On 2020-02-03T17:11:03+00:00 Andreas Schwab wrote:

The read is required to synchronize the underlying file position, while
keeping the stdio buffer aligned on a block boundary.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1861776/comments/2

------------------------------------------------------------------------
On 2020-02-03T18:05:01+00:00 Konstantin wrote:

(In reply to Andreas Schwab from comment #2)
> The read is required to synchronize the underlying file position, while
> keeping the stdio buffer aligned on a block boundary.

I don't know why it's necessary, but would it be possible in this case
to read just one block, namely the last block before the position the
program is trying to set with fseek()? Then, when a program calls
fseek(…, 0x80000000, SEEK_SET), it wouldn't hang in fseek() trying to
read 2 GiB of data.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1861776/comments/3

------------------------------------------------------------------------
On 2020-02-03T19:47:30+00:00 Andreas Schwab wrote:

Where do you see it reading more than one block?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1861776/comments/4

------------------------------------------------------------------------
On 2020-02-03T22:47:23+00:00 Konstantin wrote:

(In reply to Andreas Schwab from comment #4)
> Where do you see it reading more than one block?

Oh, I stand corrected: on glibc 2.30 this is no longer reproducible. It
is reproducible on glibc 2.27, though, just 3 versions ago. To reproduce
it, run something like `sudo hexdump -C /dev/sda -s 0xa8000f9000 -n 1`:
if it hangs, it's because the `fseek()` that hexdump uses tries to read
0xa8000f9000 bytes of data.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1861776/comments/5

** Changed in: glibc
       Status: Unknown => New

** Changed in: glibc
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to glibc in Ubuntu.
https://bugs.launchpad.net/bugs/1861776

Title:
  fseek(…, …, SEEK_SET) causes reading over the skipped range

Status in GLibC:
  New
Status in glibc package in Ubuntu:
  Fix Released
Status in glibc source package in Bionic:
  New
Status in glibc source package in Focal:
  Fix Released

Bug description:
  When fseek() is called, it in turn calls 1. lseek(), and 2. read(). In
  glibc 2.29 (maybe earlier), read() is only called for the last block.
  However, in glibc 2.27, which Ubuntu 18.04 uses, the read happens over
  the whole skipped range, which may hang an application that tries to
  skip too large a range.

  There is a related report:
  https://sourceware.org/bugzilla/show_bug.cgi?id=25497
  Note that, per the comments, in at least glibc 2.29 read() only happens
  for the *last block*. This means there was some fix that stopped
  fseek() from reading over everything it skipped, which Ubuntu didn't
  backport to the older glibc it's using.

  # Steps to reproduce

  In the command below, replace `/dev/sda` if necessary with a device
  that is larger than 2 GiB.

  Run `sudo hexdump -C /dev/sda -s 0x80000000 -n 1`. This command uses
  `hexdump` to print the content of a disk at a large offset.

  ## Expected

  The command returns immediately after printing the requested byte.

  ## Actual

  The command hangs with high CPU load. If you run `strace hexdump …`,
  you'll see a large number of reads. These reads come from the glibc
  2.27 implementation of `fseek()`.
