[Bug 1882535] Re: [focal/core20][python3.7+] staging conflicts when multiple python parts have the same python dependencies

Dmitrii Shcherbakov 1882535 at bugs.launchpad.net
Tue Jun 9 13:30:04 UTC 2020


** Bug watch added: github.com/pypa/pip/issues #8414
   https://github.com/pypa/pip/issues/8414

** Also affects: pip via
   https://github.com/pypa/pip/issues/8414
   Importance: Unknown
       Status: Unknown

** No longer affects: python3.8 (Ubuntu)

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to python3.8 in Ubuntu.
https://bugs.launchpad.net/bugs/1882535

Title:
  [focal/core20][python3.7+] staging conflicts when multiple python
  parts have the same python dependencies

Status in pip:
  Unknown
Status in Snapcraft:
  Triaged
Status in python-pip package in Ubuntu:
  New

Bug description:
  TL;DR: as of python 3.7, .pyc files by default include the timestamp and size of the source file, so the hash of a .pyc file changes every time it is regenerated for a given source file. This results in staging conflicts between python parts.
  https://docs.python.org/3/library/py_compile.html#py_compile.compile
  https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.TIMESTAMP

  Description/analysis:

  When building a project in which multiple parts have identical python
  dependencies, I consistently get an error like this:

  Failed to stage: Parts 'openstack-projects' and 'cluster' have the following files, but with different contents:
      bin/activate
      bin/activate.csh
      bin/activate.fish
      bin/python3
      pyvenv.cfg
      bin/python3
      lib/python3.8/site-packages/Flask-1.1.2.dist-info/RECORD
      lib/python3.8/site-packages/__pycache__/easy_install.cpython-38.pyc
      lib/python3.8/site-packages/certifi/__pycache__/__init__.cpython-38.pyc
      lib/python3.8/site-packages/certifi/__pycache__/__main__.cpython-38.pyc
  # many other .pyc files ...

  While snapcraft suggests that I use something like `organize`,
  `filesets` and `stage`, the issue is that the source files for those
  dependencies are identical - there is no reason for any manual work
  here.

  Source hashes are the same:

  snapcraft-microstack # sha256sum ./parts/cluster/install/lib/python3.8/site-packages/click/_textwrap.py
  6a30b3933165cb9b639bd7e843937dfcc39e69824c063025b6e15aebd9f88976  ./parts/cluster/install/lib/python3.8/site-packages/click/_textwrap.py
  snapcraft-microstack # sha256sum ./parts/openstack-projects/install/lib/python3.8/site-packages/click/_textwrap.py
  6a30b3933165cb9b639bd7e843937dfcc39e69824c063025b6e15aebd9f88976  ./parts/openstack-projects/install/lib/python3.8/site-packages/click/_textwrap.py

  .pyc files are different:

  snapcraft-microstack # sha256sum ./parts/openstack-projects/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
  398b47a5abfc87e9da73153e42d48dcd5d917bd637a0e0af1eb6999f19fb1085  ./parts/openstack-projects/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc

  snapcraft-microstack # sha256sum ./parts/cluster/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
  d4642cfecd727d228944a1d31ff728e7ef6529a7a88898f6568ea6e96d1f8f82  ./parts/cluster/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc

  RECORD files include hashes as well, hence they are also different:

  snapcraft-microstack # diff ./parts/openstack-projects/install/lib/python3.8/site-packages/Flask-1.1.2.dist-info/RECORD ./parts/cluster/install/lib/python3.8/site-packages/Flask-1.1.2.dist-info/RECORD
  1c1
  < ../../../bin/flask,sha256=VXQqccMeG03Rn8_yN8Kq3Up13rzyaoHsEckFnCxHor4,242
  ---
  > ../../../bin/flask,sha256=NAzPpe84iZFX3PYsCZEirt3fAFObAjBuCpM25792kSU,231
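
  The manual sha256sum and diff checks above can be approximated for whole
  trees with a short script. This is only a sketch: the helper names
  (sha256_of, conflicting_files) are made up for illustration and are not
  part of snapcraft, and the demo runs on two throwaway directories rather
  than on real part install trees.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file's bytes, comparable to `sha256sum`."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def conflicting_files(tree_a, tree_b):
    """Return relative paths that exist in both trees with different contents."""
    conflicts = []
    for file_a in sorted(tree_a.rglob("*")):
        if not file_a.is_file():
            continue
        rel = file_a.relative_to(tree_a)
        file_b = tree_b / rel
        if file_b.is_file() and sha256_of(file_a) != sha256_of(file_b):
            conflicts.append(rel.as_posix())
    return conflicts


# Demo on two temporary trees (hypothetical stand-ins for two parts).
with tempfile.TemporaryDirectory() as tmp:
    a, b = Path(tmp, "a"), Path(tmp, "b")
    for tree, pyc_bytes in ((a, b"pyc-one"), (b, b"pyc-two")):
        (tree / "lib").mkdir(parents=True)
        (tree / "lib" / "mod.py").write_text("x = 1\n")     # identical source
        (tree / "lib" / "mod.pyc").write_bytes(pyc_bytes)   # differing "bytecode"
    demo_conflicts = conflicting_files(a, b)

print(demo_conflicts)  # ['lib/mod.pyc']
```

  Pointing conflicting_files at Path("parts/cluster/install") and
  Path("parts/openstack-projects/install") should list the same kind of
  files that the staging error reports.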

  Apparently, as of python 3.7, .pyc files include a timestamp and a
  size of the source by default (PycInvalidationMode.TIMESTAMP). There
  is a way to override this behavior by setting the SOURCE_DATE_EPOCH
  environment variable to switch py_compile to using
  PycInvalidationMode.CHECKED_HASH:

  https://docs.python.org/3/library/py_compile.html
  py_compile.compile(file, cfile=None, dfile=None, doraise=False, optimize=-1, invalidation_mode=PycInvalidationMode.TIMESTAMP, quiet=0)

  invalidation_mode should be a member of the PycInvalidationMode enum
  and controls how the generated bytecode cache is invalidated at
  runtime. The default is PycInvalidationMode.CHECKED_HASH ***if the
  SOURCE_DATE_EPOCH environment variable is set***, otherwise ***the
  default is PycInvalidationMode.TIMESTAMP***.
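
  The effect of SOURCE_DATE_EPOCH on the default can be observed by reading
  the flags field that PEP 552 defines in the .pyc header. The following is
  a minimal sketch, assuming CPython 3.7 or later; pyc_flags and
  compile_with_env are hypothetical helper names, and the source file is a
  throwaway module created only for the demo.

```python
import os
import py_compile
import tempfile
from pathlib import Path


def pyc_flags(pyc_path):
    """Read the 32-bit little-endian flags field (bytes 4..8) of a .pyc header.

    Per PEP 552: 0 means timestamp-based, 0b11 means checked hash-based.
    """
    header = Path(pyc_path).read_bytes()[:16]
    return int.from_bytes(header[4:8], "little")


def compile_with_env(source, cfile, epoch):
    """Compile with SOURCE_DATE_EPOCH set to `epoch` (or unset if None)."""
    saved = os.environ.pop("SOURCE_DATE_EPOCH", None)
    try:
        if epoch is not None:
            os.environ["SOURCE_DATE_EPOCH"] = epoch
        # invalidation_mode is left at its default on purpose.
        py_compile.compile(str(source), cfile=str(cfile), doraise=True)
    finally:
        # Restore the caller's environment.
        os.environ.pop("SOURCE_DATE_EPOCH", None)
        if saved is not None:
            os.environ["SOURCE_DATE_EPOCH"] = saved
    return pyc_flags(cfile)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "mod.py")
    src.write_text("x = 1\n")
    flags_default = compile_with_env(src, Path(tmp, "default.pyc"), None)
    flags_epoch = compile_with_env(src, Path(tmp, "epoch.pyc"), "1591640328")

print(flags_default, flags_epoch)  # 0 3
```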

  https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.TIMESTAMP
  TIMESTAMP
  The .pyc file includes the timestamp and size of the source file, which Python will compare against the metadata of the source file at runtime to determine if the .pyc file needs to be regenerated.

  https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.CHECKED_HASH
  CHECKED_HASH
  The .pyc file includes a hash of the source file content, which Python will compare against the source at runtime to determine if the .pyc file needs to be regenerated.
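
  The difference between the two modes can be reproduced in isolation. The
  sketch below, assuming CPython 3.7.2 or later, compiles the same source
  file twice with different modification times and compares the sha256 of
  the results. The helper name pyc_digest is hypothetical. Note that a
  single source path is compiled both times, because the bytecode also
  records the path of the source file; this sketch does not cover sources
  located at different paths.

```python
import hashlib
import os
import py_compile
import tempfile
from pathlib import Path
from py_compile import PycInvalidationMode


def pyc_digest(source, mode, cfile):
    """Compile `source` to `cfile` with the given mode; return sha256 of the .pyc."""
    py_compile.compile(str(source), cfile=str(cfile), doraise=True,
                       invalidation_mode=mode)
    return hashlib.sha256(Path(cfile).read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "mod.py")
    src.write_text("x = 1\n")
    reproducible = {}
    for mode in (PycInvalidationMode.TIMESTAMP, PycInvalidationMode.CHECKED_HASH):
        # Same content, two different mtimes, as if the file were
        # installed at two different moments.
        os.utime(src, (1_000_000_000, 1_000_000_000))
        first = pyc_digest(src, mode, Path(tmp, "first.pyc"))
        os.utime(src, (1_500_000_000, 1_500_000_000))
        second = pyc_digest(src, mode, Path(tmp, "second.pyc"))
        reproducible[mode.name] = (first == second)

print(reproducible)  # {'TIMESTAMP': False, 'CHECKED_HASH': True}
```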

  Adding something like this seems to be needed:
      build-environment:
        - SOURCE_DATE_EPOCH: '1591640328'

  However, see
  https://bugs.launchpad.net/snapcraft/+bug/1882535/comments/2

To manage notifications about this bug go to:
https://bugs.launchpad.net/pip/+bug/1882535/+subscriptions


