[Bug 1882535] Re: [focal/core20][python3.7+] staging conflicts when multiple python parts have the same python dependencies
Dmitrii Shcherbakov
1882535 at bugs.launchpad.net
Tue Jun 9 13:30:04 UTC 2020
** Bug watch added: github.com/pypa/pip/issues #8414
https://github.com/pypa/pip/issues/8414
** Also affects: pip via
https://github.com/pypa/pip/issues/8414
Importance: Unknown
Status: Unknown
** No longer affects: python3.8 (Ubuntu)
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to python3.8 in Ubuntu.
https://bugs.launchpad.net/bugs/1882535
Title:
[focal/core20][python3.7+] staging conflicts when multiple python
parts have the same python dependencies
Status in pip:
Unknown
Status in Snapcraft:
Triaged
Status in python-pip package in Ubuntu:
New
Bug description:
TL;DR: as of Python 3.7, .pyc files by default include the timestamp and size of the source file, so the hash of the .pyc file generated for a given source file changes whenever the source file's modification time does. This results in staging conflicts for Python parts.
https://docs.python.org/3/library/py_compile.html#py_compile.compile
https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.TIMESTAMP
Description/analysis:
When building a project in which multiple parts have identical Python
dependencies, I consistently get an error like this:
Failed to stage: Parts 'openstack-projects' and 'cluster' have the following files, but with different contents:
bin/activate
bin/activate.csh
bin/activate.fish
bin/python3
pyvenv.cfg
bin/python3
lib/python3.8/site-packages/Flask-1.1.2.dist-info/RECORD
lib/python3.8/site-packages/__pycache__/easy_install.cpython-38.pyc
lib/python3.8/site-packages/certifi/__pycache__/__init__.cpython-38.pyc
lib/python3.8/site-packages/certifi/__pycache__/__main__.cpython-38.pyc
# many other .pyc files ...
While snapcraft suggests that I use something like `organize`,
`filesets` and `stage`, the issue is that the source files for those
dependencies are identical - there is no reason for any manual work
here.
Source hashes are the same:
snapcraft-microstack # sha256sum ./parts/cluster/install/lib/python3.8/site-packages/click/_textwrap.py
6a30b3933165cb9b639bd7e843937dfcc39e69824c063025b6e15aebd9f88976 ./parts/cluster/install/lib/python3.8/site-packages/click/_textwrap.py
snapcraft-microstack # sha256sum ./parts/openstack-projects/install/lib/python3.8/site-packages/click/_textwrap.py
6a30b3933165cb9b639bd7e843937dfcc39e69824c063025b6e15aebd9f88976 ./parts/openstack-projects/install/lib/python3.8/site-packages/click/_textwrap.py
.pyc files are different:
snapcraft-microstack # sha256sum ./parts/openstack-projects/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
398b47a5abfc87e9da73153e42d48dcd5d917bd637a0e0af1eb6999f19fb1085 ./parts/openstack-projects/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
snapcraft-microstack # sha256sum ./parts/cluster/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
d4642cfecd727d228944a1d31ff728e7ef6529a7a88898f6568ea6e96d1f8f82 ./parts/cluster/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
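To see where two such .pyc files diverge, the 16-byte header can be decoded. The following is a minimal sketch based on the header layout in PEP 552 (magic number, flags, then either source mtime and size or a source hash); the helper name read_pyc_header is hypothetical and not part of any tool mentioned in this bug.

```python
import struct


def read_pyc_header(path):
    """Decode the 16-byte .pyc header (Python 3.7+ layout, PEP 552).

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        header = f.read(16)
    if len(header) < 16:
        raise ValueError(f"{path}: too short to be a Python 3.7+ .pyc file")

    (flags,) = struct.unpack("<I", header[4:8])
    fields = {"magic": header[:4], "flags": flags}
    if flags & 0x1:
        # Hash-based .pyc: the last 8 bytes are a hash of the source content.
        fields["source_hash"] = header[8:16]
    else:
        # Timestamp-based .pyc: source mtime and size, 32 bits each.
        mtime, size = struct.unpack("<II", header[8:16])
        fields["source_mtime"] = mtime
        fields["source_size"] = size
    return fields
```

Running it on the two _textwrap.cpython-38.pyc files above and comparing the resulting dictionaries would show whether the source_mtime field is one of the differences.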
RECORD files include hashes as well, hence they are also different:
snapcraft-microstack # diff ./parts/openstack-projects/install/lib/python3.8/site-packages/Flask-1.1.2.dist-info/RECORD ./parts/cluster/install/lib/python3.8/site-packages/Flask-1.1.2.dist-info/RECORD
1c1
< ../../../bin/flask,sha256=VXQqccMeG03Rn8_yN8Kq3Up13rzyaoHsEckFnCxHor4,242
---
> ../../../bin/flask,sha256=NAzPpe84iZFX3PYsCZEirt3fAFObAjBuCpM25792kSU,231
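Note that the two entries above differ in size (242 vs 231 bytes) as well as in hash, so the generated bin/flask scripts themselves are not identical. For reference, a RECORD line has the form path,sha256=<digest>,<size>, where the digest is URL-safe base64 without padding. A small sketch of how such a line can be recomputed for comparison (the function name record_entry is made up for this example):

```python
import base64
import hashlib


def record_entry(path_in_record, data):
    """Build a RECORD-style line for the given file content.

    Hypothetical helper: path_in_record is the path as it should appear
    in RECORD, data is the file content as bytes.
    """
    digest = hashlib.sha256(data).digest()
    # RECORD uses URL-safe base64 with the trailing '=' padding removed.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path_in_record},sha256={encoded},{len(data)}"
```

Any byte-level difference in an installed file, including a regenerated .pyc, therefore changes the corresponding RECORD line too.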
Apparently, as of python 3.7, .pyc files include a timestamp and a
size of the source by default (PycInvalidationMode.TIMESTAMP). There
is a way to override this behavior by setting the SOURCE_DATE_EPOCH
environment variable to switch py_compile to using
PycInvalidationMode.CHECKED_HASH:
https://docs.python.org/3/library/py_compile.html
py_compile.compile(file, cfile=None, dfile=None, doraise=False, optimize=-1, invalidation_mode=PycInvalidationMode.TIMESTAMP, quiet=0)
invalidation_mode should be a member of the PycInvalidationMode enum
and controls how the generated bytecode cache is invalidated at
runtime. The default is PycInvalidationMode.CHECKED_HASH ***if the
SOURCE_DATE_EPOCH environment variable is set***, otherwise ***the
default is PycInvalidationMode.TIMESTAMP***.
https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.TIMESTAMP
TIMESTAMP
The .pyc file includes the timestamp and size of the source file, which Python will compare against the metadata of the source file at runtime to determine if the .pyc file needs to be regenerated.
https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.CHECKED_HASH
CHECKED_HASH
The .pyc file includes a hash of the source file content, which Python will compare against the source at runtime to determine if the .pyc file needs to be regenerated.
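The difference between the two modes can be reproduced outside of snapcraft. The sketch below compiles the same source twice, changing only its modification time in between, and returns the sha256 of each resulting .pyc. It assumes the same source path is used for both compilations, because the path is also embedded in the compiled code object; the function names are illustrative only.

```python
import hashlib
import os
import py_compile
import tempfile


def pyc_digest(source, cfile, mode):
    """Compile source to cfile with the given mode and hash the result."""
    py_compile.compile(source, cfile=cfile, doraise=True, invalidation_mode=mode)
    with open(cfile, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def digests_across_mtime_change(mode):
    """Return the .pyc digests before and after touching the source file."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "mod.py")
        cfile = os.path.join(tmp, "mod.pyc")
        with open(source, "wb") as f:
            f.write(b"x = 1\n")

        # Same content and path, two different modification times.
        os.utime(source, (1_000_000_000, 1_000_000_000))
        first = pyc_digest(source, cfile, mode)
        os.utime(source, (1_500_000_000, 1_500_000_000))
        second = pyc_digest(source, cfile, mode)
        return first, second
```

With PycInvalidationMode.TIMESTAMP the two digests differ, while with PycInvalidationMode.CHECKED_HASH they match, since only the source content goes into the header in that mode.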
Adding something like this seems to be needed:
build-environment:
- SOURCE_DATE_EPOCH: '1591640328'
However, see
https://bugs.launchpad.net/snapcraft/+bug/1882535/comments/2
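The effect of SOURCE_DATE_EPOCH on the py_compile default can be checked in isolation. This is only a sketch of the documented py_compile behaviour; it does not show whether pip or snapcraft pass the variable through during a build. The helper name default_pyc_flags is hypothetical, and the flag values follow PEP 552 (0 for a timestamp-based .pyc, 3 for a checked hash-based one).

```python
import os
import py_compile
import struct
import tempfile


def default_pyc_flags(source_date_epoch=None):
    """Compile with the default invalidation mode; return the header flags."""
    saved = os.environ.pop("SOURCE_DATE_EPOCH", None)
    try:
        if source_date_epoch is not None:
            os.environ["SOURCE_DATE_EPOCH"] = str(source_date_epoch)
        with tempfile.TemporaryDirectory() as tmp:
            source = os.path.join(tmp, "mod.py")
            cfile = os.path.join(tmp, "mod.pyc")
            with open(source, "wb") as f:
                f.write(b"x = 1\n")
            # No invalidation_mode argument: py_compile picks the default.
            py_compile.compile(source, cfile=cfile, doraise=True)
            with open(cfile, "rb") as f:
                header = f.read(16)
        # Bytes 4..8 of the header hold the flags field.
        return struct.unpack("<I", header[4:8])[0]
    finally:
        # Restore the caller's environment.
        os.environ.pop("SOURCE_DATE_EPOCH", None)
        if saved is not None:
            os.environ["SOURCE_DATE_EPOCH"] = saved
```

Calling default_pyc_flags() without an argument should give a timestamp-based .pyc, and default_pyc_flags(1591640328) a checked hash-based one.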
To manage notifications about this bug go to:
https://bugs.launchpad.net/pip/+bug/1882535/+subscriptions