[Merge] lp:~cjwatson/extract-changelogs/order-by-date into lp:extract-changelogs

Colin Watson cjwatson at canonical.com
Wed Apr 13 23:48:22 UTC 2016


Colin Watson has proposed merging lp:~cjwatson/extract-changelogs/order-by-date into lp:extract-changelogs.

Commit message:
Use archive.getPublishedSources(order_by_date=True) for a significant speedup.

Requested reviews:
  Ubuntu Core Development Team (ubuntu-core-dev)

For more details, see:
https://code.launchpad.net/~cjwatson/extract-changelogs/order-by-date/+merge/291837

The query that extract-changelogs is currently relying on is very slow, and there are some subtle ways in which iterating over the collection can go wrong.  For ddeb-retriever, we did a fair bit of work on this:

  https://bugs.launchpad.net/launchpad/+bug/1441729
  https://code.launchpad.net/~cjwatson/launchpad/db-index-bpph-datecreated/+merge/255539
  https://code.launchpad.net/~cjwatson/launchpad/getpublishedbinaries-sorting/+merge/255822

In the case of extract-changelogs, it should be sufficient to add order_by_date=True.  This joins fewer tables and uses a reasonably well-indexed query to return a collection in decreasing ID order.  As long as you don't try to do any status filtering or similar (as explained in a comment here), the worst case if the collection changes during iteration is that you get the same source package more than once, and extract-changelogs already handles this in LaunchpadChangelogsCrawler._unpack_changelogs_to_target.
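
To make the iteration pattern concrete, here is a minimal sketch of it.  It is not the code from this branch or from extract-changelogs: the helper name iter_unique_sources and the de-duplication key are illustrative assumptions, and the archive argument is assumed to be a launchpadlib archive object (or anything else exposing getPublishedSources).  It passes order_by_date=True, applies no status filter, and skips a source package it has already seen, which is the only anomaly expected if the collection changes mid-iteration.

```python
def iter_unique_sources(archive, since=None):
    """Yield each published source once, in the order the archive returns them.

    Hypothetical helper for illustration.  `archive` is assumed to be a
    launchpadlib archive entry; `since` is an optional lower bound that is
    passed through as created_since_date.
    """
    # No status filter here: filtering by status is what makes iteration
    # over a changing collection unreliable.
    kwargs = {"order_by_date": True}
    if since is not None:
        kwargs["created_since_date"] = since

    seen = set()
    for spph in archive.getPublishedSources(**kwargs):
        # Assumed de-duplication key: source package name plus version.
        key = (spph.source_package_name, spph.source_package_version)
        if key in seen:
            # The same publication can show up twice if rows were added
            # while we were paging through the collection.
            continue
        seen.add(key)
        yield spph
```

With a real archive this would be called as something like iter_unique_sources(ubuntu.main_archive, since=start_date), where ubuntu and start_date are whatever the caller already has to hand.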

Please do test this!  I have not done so.  However, I hear that extract-changelogs times out when asked to work from a very old starting date, and this should make it behave a lot better.
-- 
Your team Ubuntu Core Development Team is requested to review the proposed merge of lp:~cjwatson/extract-changelogs/order-by-date into lp:extract-changelogs.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: review-diff.txt
Type: text/x-diff
Size: 1878 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/ubuntu-reviews/attachments/20160413/d262c8ad/attachment.diff>