From: "Vladimir D. Seleznev" <vseleznv@altlinux.org>
To: ALT Linux Team development discussions <devel@lists.altlinux.org>
Subject: Re: [devel] RFC: girar: optimize rebuild
Date: Sat, 11 Apr 2020 18:33:49 +0300
Message-ID: <20200411153349.GB1624106@portlab>
In-Reply-To: <20200411133631.daac861f97979c67511cf3ef@altlinux.org>

On Sat, Apr 11, 2020 at 01:36:31PM +0300, Andrey Savchenko wrote:
> On Sat, 11 Apr 2020 02:10:42 +0300 Vladimir D. Seleznev wrote:
> > Hi!
> >
> > This is the first part of the rebuilt-package optimization for girar.
> > It introduces pkg_identity() and a simple optimization of the rebuilt
> > sourcerpm.
> >
> > pkg_identity() takes an RPM package and returns a value called the
> > package identity, a hash of a subset of the RPM package header. That
> > subset is the entire header without some nonessential artifacts such
> > as buildhost, buildtime, header hashsum, etc.
> >
> > Two builds of a package with the same NEVR might have equal or
> > different package identities. Equal identities mean that the build
> > results of these packages are equal too, which allows build
> > optimization. A practical example, a simple optimization of the
> > rebuilt sourcerpm, is also introduced.
> >
> > Future work could cover optimizing a sourcerpm "copied" to another
> > branch against the sourcerpm retrieved from the archive, and
> > optimizing binary packages (this case has an issue when binary
> > subpackages are of mixed archs, i.e. arch and noarch; it probably
> > could work only with single-arch builds).
> >
> > Please review and discuss.
>
> I see two problems with the proposed approach:
>
> 1) It assumes there will be no pkg_identity hash collisions. This
> is wrong. They may occur sooner or later and the code *must*
> correctly deal with such collisions. Remember what happened to
> subversion when a collision occurred in a repository, while git was
> resilient.

Any hashsum function has collisions by definition. The only way to avoid
them is not to use hashsums.

> The way the proposal is now, an identity hash collision will lead to
> a degraded repository at best and a broken one at worst.

No, it will not, because any issues that such a collision might bring up
will be caught by later build checks.

> I see no easy way to fix this problem, but it must either be fixed
> or the proposed optimization rejected.
>
> 2) The hash function choice, sha256, is very unfortunate: it has
> a longer digest than sha1, but otherwise is vulnerable to the same
> attack; so right now it is still marginally secure, but it will not
> last long. Moreover, sha256 is quite slow.

The good news: it is not about security.

> It is better to use a newer generation of hash functions, e.g.
> blake2b, based on the chacha stream cipher. It is more future-proof
> and faster at the same time. You can just use the b2sum
> implementation from the GNU coreutils.

-- 
WBR,
Vladimir D. Seleznev
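
The identity idea discussed in this message can be illustrated with a minimal sketch. It is written in Python for brevity and is not the girar implementation, which lives in gb/gb-sh-functions. The header is modelled as a plain dict of tag names to values, the VOLATILE_TAGS list is a hypothetical stand-in for the "nonessential artifacts" named above, and the algo parameter only shows that sha256 and blake2b are interchangeable in this role.

```python
import hashlib

# Hypothetical list of tags treated as nonessential. The thread names
# buildhost, buildtime and header hashsums; the exact set is an assumption.
VOLATILE_TAGS = frozenset({"BUILDHOST", "BUILDTIME", "SHA1HEADER", "SIGMD5"})


def pkg_identity(header, algo="sha256"):
    """Hash a header (dict of tag -> value), skipping volatile tags."""
    h = hashlib.new(algo)
    for tag in sorted(header):  # fixed order, so dict ordering does not matter
        if tag in VOLATILE_TAGS:
            continue
        # Length-prefix every field so that adjacent fields cannot run together.
        for part in (tag, str(header[tag])):
            data = part.encode("utf-8")
            h.update(len(data).to_bytes(8, "big"))
            h.update(data)
    return h.hexdigest()


# Two builds of the same NEVR that differ only in build host and build time.
first = {"NAME": "foo", "VERSION": "1.0", "RELEASE": "alt1",
         "BUILDHOST": "a.example", "BUILDTIME": 1586560000}
second = dict(first, BUILDHOST="b.example", BUILDTIME=1586646400)

same_identity = pkg_identity(first) == pkg_identity(second)
same_identity_b2 = pkg_identity(first, "blake2b") == pkg_identity(second, "blake2b")
```

In this sketch both same_identity and same_identity_b2 are true, because the two headers differ only in tags that are left out of the hash. Equal identities are treated here only as a hint that a rebuild can be skipped; as the reply above notes, the proposal relies on later build checks to catch anything a collision might let through.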