ALT Linux Team development discussions
From: "Vladimir D. Seleznev" <vseleznv@altlinux.org>
To: ALT Linux Team development discussions <devel@lists.altlinux.org>
Subject: Re: [devel] RFC: girar: optimize rebuild
Date: Sat, 11 Apr 2020 18:33:49 +0300
Message-ID: <20200411153349.GB1624106@portlab>
In-Reply-To: <20200411133631.daac861f97979c67511cf3ef@altlinux.org>

On Sat, Apr 11, 2020 at 01:36:31PM +0300, Andrey Savchenko wrote:
> On Sat, 11 Apr 2020 02:10:42 +0300 Vladimir D. Seleznev wrote:
> > 
> > Hi!
> > 
> > This is the first part of the rebuilt-package optimization for girar. It
> > introduces pkg_identity() and a simple optimization of the rebuilt sourcerpm.
> > 
> > pkg_identity() takes an RPM package and returns a value called the package
> > identity: a hash of a subset of the RPM package header. That subset is the
> > entire header minus some nonessential artifacts such as buildhost, buildtime,
> > header hashsum, etc.
> > 
> > Two builds of a package with the same NEVR might have equal or different
> > package identities. Equal identities mean that the build results of these
> > packages are equal too, which allows the build to be optimized. A practical
> > example, a simple optimization of a rebuilt sourcerpm, is also introduced.
> > 
> > Future work could cover optimizing a sourcerpm "copied" to another branch
> > by using the sourcerpm retrieved from the archive, and optimizing binary
> > packages (this case has an issue when binary subpackages have mixed
> > architectures, i.e. arch and noarch; it probably could work only with
> > single-arch builds).
> > 
> > Please review and discuss.
> 
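
To make the description above concrete, here is a minimal sketch of the idea.
It is not the code from the patch. It assumes the header is available as a
plain "TAG: value" text dump on stdin (a hypothetical format chosen for
illustration), and the function name header_identity and the SHA1HEADER tag
standing in for the "header hashsum" are assumptions as well.

```shell
#!/bin/sh
# Sketch only: the real pkg_identity() works on an RPM package, not a text dump.

# Tags treated as nonessential. BUILDHOST and BUILDTIME come from the
# description above; SHA1HEADER is an assumed stand-in for "header hashsum".
VOLATILE_TAGS='BUILDHOST|BUILDTIME|SHA1HEADER'

header_identity() {
    # Drop the volatile tags, sort for a stable order, hash what remains.
    grep -Ev "^(${VOLATILE_TAGS}):" | LC_ALL=C sort | sha256sum | cut -d' ' -f1
}

# Two builds of the same NEVR that differ only in volatile tags.
build_a='NAME: foo
VERSION: 1.0
RELEASE: alt1
BUILDHOST: host-a
BUILDTIME: 1586560000'

build_b='NAME: foo
VERSION: 1.0
RELEASE: alt1
BUILDHOST: host-b
BUILDTIME: 1586570000'

# A third build whose essential content changed.
build_c='NAME: foo
VERSION: 1.0
RELEASE: alt1
REQUIRES: libbar
BUILDHOST: host-a
BUILDTIME: 1586560000'

id_a=$(printf '%s\n' "$build_a" | header_identity)
id_b=$(printf '%s\n' "$build_b" | header_identity)
id_c=$(printf '%s\n' "$build_c" | header_identity)

if [ "$id_a" = "$id_b" ]; then echo "a and b: same identity"; fi
if [ "$id_a" != "$id_c" ]; then echo "a and c: different identity"; fi
```

Sorting before hashing is a choice made for this sketch so that tag order does
not affect the result.
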
> I see two problems with the proposed approach:
> 
> 1) It assumes there will be no pkg_identity hash collisions. This
> is wrong. They may occur sooner or later and the code *must*
> deal with such collisions correctly. Remember what happened to
> subversion when a collision occurred in a repository, while git was
> resilient.

Any hashsum function has collisions by definition. The only way to avoid
them is not to use hashsums.

> As the proposal stands now, an identity hash collision will lead to
> a degraded repository at best and a broken one at worst.

No, it will not, because any issues that such a collision might bring up
will be caught by later build checks.
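
For readers who would rather see the collision case handled at the comparison
itself, independently of the later checks, one option is to treat equal
digests only as a hint and confirm them byte for byte. This is a sketch under
assumptions, not part of the patch: the function name same_build is
hypothetical, and its arguments are assumed to be files that already contain
the filtered header data of two builds.

```shell
#!/bin/sh
# Sketch only: same_build and its input files are hypothetical.

same_build() {
    # $1, $2: files holding the filtered header data of two builds.
    sb_id1=$(sha256sum < "$1" | cut -d' ' -f1)
    sb_id2=$(sha256sum < "$2" | cut -d' ' -f1)

    # Different digests: the builds certainly differ.
    [ "$sb_id1" = "$sb_id2" ] || return 1

    # Equal digests: confirm with a direct comparison, so that a hash
    # collision can never be mistaken for an identical build.
    cmp -s "$1" "$2"
}

tmpdir=$(mktemp -d)
printf 'NAME: foo\nVERSION: 1.0\n' > "$tmpdir/old"
printf 'NAME: foo\nVERSION: 1.0\n' > "$tmpdir/new"

if same_build "$tmpdir/old" "$tmpdir/new"; then
    echo "rebuild can be optimized"
else
    echo "rebuild must be kept"
fi

rm -rf "$tmpdir"
```

The cost is that both inputs have to be kept around for the comparison, which
a digest-only scheme avoids.
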

> I see no easy way to fix this problem, but it must either be fixed
> or the proposed optimization must be rejected.
> 
> 2) The choice of hash function, sha256, is very unfortunate: it has
> a longer digest than sha1, but otherwise is vulnerable to the same
> attack; so right now it is still marginally secure, but that will not
> last long. Moreover, sha256 is quite slow.

The good news: it is not about security.

> It is better to use a newer generation of hash functions, e.g.
> blake2b, which is based on the chacha stream cipher. It is more
> future-proof and faster at the same time. You can just use the b2sum
> implementation from GNU coreutils.
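
As an illustration of how mechanical such a swap would be, here is a small
sketch in which the checksum tool is a parameter. The helper name digest_with
is hypothetical and the sample input is made up. sha256sum and b2sum are the
coreutils commands, and the b2sum branch is skipped if the tool is not
installed.

```shell
#!/bin/sh
# Sketch only: digest_with is a hypothetical helper, not girar code.

digest_with() {
    # $1: name of a coreutils-style checksum tool; data is read from stdin.
    # These tools print "<hex digest>  -" for stdin, so keep the first field.
    "$1" | cut -d' ' -f1
}

sample='NAME: foo'

sha=$(printf '%s\n' "$sample" | digest_with sha256sum)
echo "sha256sum: ${#sha} hex digits"

if command -v b2sum >/dev/null 2>&1; then
    b2=$(printf '%s\n' "$sample" | digest_with b2sum)
    echo "b2sum: ${#b2} hex digits"
fi
```

By default b2sum produces a 512-bit digest, so the stored identity would be
twice as long as with sha256sum.
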

-- 
   WBR,
   Vladimir D. Seleznev


Thread overview: 22+ messages
2020-04-10 23:10 Vladimir D. Seleznev
2020-04-10 23:10 ` [devel] [PATCH 1/2] gb/gb-sh-functions: introduce pkg_identity() Vladimir D. Seleznev
2020-04-13 18:01   ` Dmitry V. Levin
2020-04-13 19:32     ` Vladimir D. Seleznev
2020-04-10 23:10 ` [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo Vladimir D. Seleznev
2020-04-11 11:29   ` Alexey Tourbin
2020-04-14 16:42     ` Vladimir D. Seleznev
2020-04-16 21:51       ` Alexey Tourbin
2020-04-17 13:54         ` Dmitry V. Levin
2020-04-20  9:05           ` [devel] stopping a cascade of rebuilds Alexey Tourbin
2020-04-23 19:21             ` Vladimir D. Seleznev
2020-04-23 20:54               ` Dmitry V. Levin
2020-04-27  5:38               ` Alexey Tourbin
2020-04-20  8:36         ` [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo Alexey Tourbin
2020-04-11 10:36 ` [devel] RFC: girar: optimize rebuild Andrey Savchenko
2020-04-11 15:33   ` Vladimir D. Seleznev [this message]
2020-04-11 23:31   ` Alexey V. Vissarionov
2020-04-14 14:57     ` Andrey Savchenko
2020-04-14 16:20       ` Vladimir D. Seleznev
2020-04-11 11:04 ` Gleb Fotengauer-Malinovskiy
2020-04-11 15:21   ` Vladimir D. Seleznev
2020-04-11 16:41     ` Gleb Fotengauer-Malinovskiy
