* [devel] RFC: girar: optimize rebuild @ 2020-04-10 23:10 Vladimir D. Seleznev 2020-04-10 23:10 ` [devel] [PATCH 1/2] gb/gb-sh-functions: introduce pkg_identity() Vladimir D. Seleznev ` (3 more replies) 0 siblings, 4 replies; 22+ messages in thread From: Vladimir D. Seleznev @ 2020-04-10 23:10 UTC (permalink / raw) To: devel; +Cc: vseleznv Hi! The first part of rebuilt packages optimization for girar. It introduces pkg_identity() and simple optimization of the rebuilt sourcerpm. pkg_identity() takes RPM package and returns a value called package identity, a hash of subset of RPM package header. That subset is the entire header without some nonessential artifacts like buildhost, buildtime, header hashsum, etc. The two package builds of the same NEVR might have equal or different package identities. The equal identities mean that build results of these packages are equal too, that allows build optimization. The practical example of simple rebuilt sourcerpm optimization also introduced. The future work can be about optimization of "copied" to another branch sourcerpm with retrieved from archive sourcerpm, and binary packages optimization (this case has an issue when binary subpackages are mixed archs, i.e. arch and noarch, this probably could work only with single-arch builds). Please review and discuss. -- WBR, Vladimir D. Seleznev ^ permalink raw reply [flat|nested] 22+ messages in thread
* [devel] [PATCH 1/2] gb/gb-sh-functions: introduce pkg_identity() 2020-04-10 23:10 [devel] RFC: girar: optimize rebuild Vladimir D. Seleznev @ 2020-04-10 23:10 ` Vladimir D. Seleznev 2020-04-13 18:01 ` Dmitry V. Levin 2020-04-10 23:10 ` [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo Vladimir D. Seleznev ` (2 subsequent siblings) 3 siblings, 1 reply; 22+ messages in thread From: Vladimir D. Seleznev @ 2020-04-10 23:10 UTC (permalink / raw) To: devel; +Cc: vseleznv --- gb/gb-sh-functions | 112 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 112 insertions(+) diff --git a/gb/gb-sh-functions b/gb/gb-sh-functions index cd9039a..d7b1980 100644 --- a/gb/gb-sh-functions +++ b/gb/gb-sh-functions @@ -283,4 +283,115 @@ rpm_changes_since() gb-x-changelog-complement ${tmpdir}/changelog_old ${tmpdir}/changelog_new } + +pkg_identity() +{ + local keep_branch= pkg= + pkg="${1-}"; shift + if [ "$pkg" = "--with-branch" ]; then + pkg="${1-}"; shift + keep_branch=1 + fi + + # List of rpm tags that should be filtered + # + # RPM tags that contain insufficient information about package + # contents and relationship, and do not affect package functionality + # should be filtered. + # + # The main criterias for tags to be filtered: + # + # - Tag contains random or not reproducible value that is assigning + # during the build, and this value does not affect package + # functionality; + # - Tag contains metadata about build host properties; + # - Tag contains metadata of package headers, including its signatures; + # - Tag is only related to package database; + # - Other reasons that are considered worthy. + cat >"$tmpdir"/filtertags <<EOF +ARCH +ARCHIVESIZE +AUTOINSTALLED +BUILDARCHS +BUILDHOST +BUILDTIME +COOKIE +DBINSTANCE +DISTRIBUTION +DISTTAG +DISTURL +DSAHEADER +FILESTATES +HDRID +HEADERCOLOR +HEADERI18NTABLE +HEADERIMAGE +HEADERIMMUTABLE +HEADERREGIONS +HEADERSIGNATURES +IDENTITY +INSTALLCOLOR +INSTALLTID +INSTALLTIME +INSTFILENAMES +INSTPREFIXES +LONGARCHIVESIZE +LONGSIGSIZE +LONGSIZE +PKGID +RPMVERSION +RSAHEADER +SHA1HEADER +SIGGPG +SIGLEMD5_1 +SIGMD5 +SIGPGP +SIGSIZE +SIZE +SOURCEPKGID +EOF + + local rpm_version="$(rpm --version)" + # tag extensions do not exists in the rpm 4.0.4 + if [ "$rpm_version" != "RPM version 4.0.4" ]; then + rpm --verbose --querytags | + while read tagname tagval tagtype ext; do + if [ "$ext" = "ext" ]; then + # filter all tag extensions too + echo "$tagname" >> "$tmpdir"/filtertags + fi + done + fi + + sort -o "$tmpdir"/filtertags{,} + rpmquery --querytags |sort >"$tmpdir"/querytags + join -v1 "$tmpdir/querytags" "$tmpdir/filtertags" >"$tmpdir"/tags + + # erase extra apt indices tags + sed -i '/APTINDEX/d' "$tmpdir"/tags + + # construct query format string in form "[tag:%{tag:shescape}\n]" + local qf="$(sed -E 's/^(.+)$/[\1:%{\1:shescape}\\n]/' "$tmpdir"/tags)" + + # we want to filter disttag part of provides and requires ... + local disttag= var_disttag= + disttag="$(rpmquery --qf '%{disttag}' -p "$pkg")" + quote_sed_regexp_variable var_disttag "$disttag" + + # ... except cases when we want to keep branch name + local branch_name= + if [ -n "$keep_branch" ]; then + branch_name=":${disttag%%+*}" + fi + + (rpmquery --qf "${qf}\n" -p "$pkg" || touch "$tmpdir"/FAIL) | + sed "s/^\(\(PROVIDEVERSION\|REQUIREVERSION\):.*\):$var_disttag$/\1$branch_name/g" | + sed '/^$/d' | + sha256sum - | + cut -d" " -f1 + + if [ -f "$tmpdir"/FAIL ]; then + stamp_echo >&2 "Cannot calculate package identity of $pkg" + return 1 + fi +} -- 2.25.2 ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] [PATCH 1/2] gb/gb-sh-functions: introduce pkg_identity() 2020-04-10 23:10 ` [devel] [PATCH 1/2] gb/gb-sh-functions: introduce pkg_identity() Vladimir D. Seleznev @ 2020-04-13 18:01 ` Dmitry V. Levin 2020-04-13 19:32 ` Vladimir D. Seleznev 0 siblings, 1 reply; 22+ messages in thread From: Dmitry V. Levin @ 2020-04-13 18:01 UTC (permalink / raw) To: Vladimir D. Seleznev; +Cc: ALT Devel discussion list On Sat, Apr 11, 2020 at 02:10:43AM +0300, Vladimir D. Seleznev wrote: > --- > gb/gb-sh-functions | 112 +++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 112 insertions(+) > > diff --git a/gb/gb-sh-functions b/gb/gb-sh-functions > index cd9039a..d7b1980 100644 > --- a/gb/gb-sh-functions > +++ b/gb/gb-sh-functions > @@ -283,4 +283,115 @@ rpm_changes_since() > > gb-x-changelog-complement ${tmpdir}/changelog_old ${tmpdir}/changelog_new > } > + > +pkg_identity() > +{ > + local keep_branch= pkg= > + pkg="${1-}"; shift > + if [ "$pkg" = "--with-branch" ]; then > + pkg="${1-}"; shift > + keep_branch=1 > + fi > + > + # List of rpm tags that should be filtered > + # > + # RPM tags that contain insufficient information about package > + # contents and relationship, and do not affect package functionality > + # should be filtered. > + # > + # The main criterias for tags to be filtered: > + # > + # - Tag contains random or not reproducible value that is assigning > + # during the build, and this value does not affect package > + # functionality; > + # - Tag contains metadata about build host properties; > + # - Tag contains metadata of package headers, including its signatures; > + # - Tag is only related to package database; > + # - Other reasons that are considered worthy. > + cat >"$tmpdir"/filtertags <<EOF > +ARCH > +ARCHIVESIZE > +AUTOINSTALLED > +BUILDARCHS > +BUILDHOST > +BUILDTIME > +COOKIE > +DBINSTANCE > +DISTRIBUTION > +DISTTAG > +DISTURL > +DSAHEADER > +FILESTATES > +HDRID > +HEADERCOLOR > +HEADERI18NTABLE > +HEADERIMAGE > +HEADERIMMUTABLE > +HEADERREGIONS > +HEADERSIGNATURES > +IDENTITY > +INSTALLCOLOR > +INSTALLTID > +INSTALLTIME > +INSTFILENAMES > +INSTPREFIXES > +LONGARCHIVESIZE > +LONGSIGSIZE > +LONGSIZE > +PKGID > +RPMVERSION > +RSAHEADER > +SHA1HEADER > +SIGGPG > +SIGLEMD5_1 > +SIGMD5 > +SIGPGP > +SIGSIZE > +SIZE > +SOURCEPKGID > +EOF First of all, I don't think this tools belongs to girar. > + local rpm_version="$(rpm --version)" > + # tag extensions do not exists in the rpm 4.0.4 > + if [ "$rpm_version" != "RPM version 4.0.4" ]; then > + rpm --verbose --querytags | > + while read tagname tagval tagtype ext; do > + if [ "$ext" = "ext" ]; then > + # filter all tag extensions too > + echo "$tagname" >> "$tmpdir"/filtertags > + fi > + done > + fi Second, I don't see why do you need this rpm version check. Can't you just do this regardless of rpm version? > + sort -o "$tmpdir"/filtertags{,} > + rpmquery --querytags |sort >"$tmpdir"/querytags We don't assume that pipefail is enabled, so we prefer rpmquery --querytags "$tmpdir"/querytags sort -u -o "$tmpdir"/querytags{,} > + join -v1 "$tmpdir/querytags" "$tmpdir/filtertags" >"$tmpdir"/tags > + > + # erase extra apt indices tags > + sed -i '/APTINDEX/d' "$tmpdir"/tags Have you ever seen a package with these tags in its header? > + > + # construct query format string in form "[tag:%{tag:shescape}\n]" > + local qf="$(sed -E 's/^(.+)$/[\1:%{\1:shescape}\\n]/' "$tmpdir"/tags)" This construction ignores sed exit status. -- ldv ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] [PATCH 1/2] gb/gb-sh-functions: introduce pkg_identity() 2020-04-13 18:01 ` Dmitry V. Levin @ 2020-04-13 19:32 ` Vladimir D. Seleznev 0 siblings, 0 replies; 22+ messages in thread From: Vladimir D. Seleznev @ 2020-04-13 19:32 UTC (permalink / raw) To: ALT Linux Team development discussions On Mon, Apr 13, 2020 at 09:01:43PM +0300, Dmitry V. Levin wrote: > On Sat, Apr 11, 2020 at 02:10:43AM +0300, Vladimir D. Seleznev wrote: > > --- > > gb/gb-sh-functions | 112 +++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 112 insertions(+) > > > > diff --git a/gb/gb-sh-functions b/gb/gb-sh-functions > > index cd9039a..d7b1980 100644 > > --- a/gb/gb-sh-functions > > +++ b/gb/gb-sh-functions > > @@ -283,4 +283,115 @@ rpm_changes_since() > > > > gb-x-changelog-complement ${tmpdir}/changelog_old ${tmpdir}/changelog_new > > } > > + > > +pkg_identity() > > +{ > > + local keep_branch= pkg= > > + pkg="${1-}"; shift > > + if [ "$pkg" = "--with-branch" ]; then > > + pkg="${1-}"; shift > > + keep_branch=1 > > + fi > > + > > + # List of rpm tags that should be filtered > > + # > > + # RPM tags that contain insufficient information about package > > + # contents and relationship, and do not affect package functionality > > + # should be filtered. > > + # > > + # The main criterias for tags to be filtered: > > + # > > + # - Tag contains random or not reproducible value that is assigning > > + # during the build, and this value does not affect package > > + # functionality; > > + # - Tag contains metadata about build host properties; > > + # - Tag contains metadata of package headers, including its signatures; > > + # - Tag is only related to package database; > > + # - Other reasons that are considered worthy. > > + cat >"$tmpdir"/filtertags <<EOF > > +ARCH > > +ARCHIVESIZE > > +AUTOINSTALLED > > +BUILDARCHS > > +BUILDHOST > > +BUILDTIME > > +COOKIE > > +DBINSTANCE > > +DISTRIBUTION > > +DISTTAG > > +DISTURL > > +DSAHEADER > > +FILESTATES > > +HDRID > > +HEADERCOLOR > > +HEADERI18NTABLE > > +HEADERIMAGE > > +HEADERIMMUTABLE > > +HEADERREGIONS > > +HEADERSIGNATURES > > +IDENTITY > > +INSTALLCOLOR > > +INSTALLTID > > +INSTALLTIME > > +INSTFILENAMES > > +INSTPREFIXES > > +LONGARCHIVESIZE > > +LONGSIGSIZE > > +LONGSIZE > > +PKGID > > +RPMVERSION > > +RSAHEADER > > +SHA1HEADER > > +SIGGPG > > +SIGLEMD5_1 > > +SIGMD5 > > +SIGPGP > > +SIGSIZE > > +SIZE > > +SOURCEPKGID > > +EOF > > First of all, I don't think this tools belongs to girar. Ok, it will be a separate tool. Should it be implemented as some functionality of rpm, or as a side project? > > + local rpm_version="$(rpm --version)" > > + # tag extensions do not exists in the rpm 4.0.4 > > + if [ "$rpm_version" != "RPM version 4.0.4" ]; then > > + rpm --verbose --querytags | > > + while read tagname tagval tagtype ext; do > > + if [ "$ext" = "ext" ]; then > > + # filter all tag extensions too > > + echo "$tagname" >> "$tmpdir"/filtertags > > + fi > > + done > > + fi > > Second, I don't see why do you need this rpm version check. > Can't you just do this regardless of rpm version? A new version of rpm can brings new tag extensions, so I can maintain list of tag extensions by myself, or I can get the list from rpm output. The second option is way more convenient. > > + sort -o "$tmpdir"/filtertags{,} > > + rpmquery --querytags |sort >"$tmpdir"/querytags Ok. > We don't assume that pipefail is enabled, so we prefer > rpmquery --querytags "$tmpdir"/querytags > sort -u -o "$tmpdir"/querytags{,} > > > + join -v1 "$tmpdir/querytags" "$tmpdir/filtertags" >"$tmpdir"/tags > > + > > + # erase extra apt indices tags > > + sed -i '/APTINDEX/d' "$tmpdir"/tags > > Have you ever seen a package with these tags in its header? I haven't, but just for sure... But if actually they would appear in package header we shouldn't just ignore them. I'll wipe these lines out. > > + > > + # construct query format string in form "[tag:%{tag:shescape}\n]" > > + local qf="$(sed -E 's/^(.+)$/[\1:%{\1:shescape}\\n]/' "$tmpdir"/tags)" > > This construction ignores sed exit status. Is local qf="$(sed -E 's/^(.+)$/[\1:%{\1:shescape}\\n]/' "$tmpdir"/tags |touch $tmpdir/FAIL)" if [ -f"$tmpdir/FAIL" ]; then ... OK? -- WBR, Vladimir D. Seleznev ^ permalink raw reply [flat|nested] 22+ messages in thread
* [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo 2020-04-10 23:10 [devel] RFC: girar: optimize rebuild Vladimir D. Seleznev 2020-04-10 23:10 ` [devel] [PATCH 1/2] gb/gb-sh-functions: introduce pkg_identity() Vladimir D. Seleznev @ 2020-04-10 23:10 ` Vladimir D. Seleznev 2020-04-11 11:29 ` Alexey Tourbin 2020-04-11 10:36 ` [devel] RFC: girar: optimize rebuild Andrey Savchenko 2020-04-11 11:04 ` Gleb Fotengauer-Malinovskiy 3 siblings, 1 reply; 22+ messages in thread From: Vladimir D. Seleznev @ 2020-04-10 23:10 UTC (permalink / raw) To: devel; +Cc: vseleznv --- gb/gb-task-check-build-i | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/gb/gb-task-check-build-i b/gb/gb-task-check-build-i index 9017e12..38a6ab3 100755 --- a/gb/gb-task-check-build-i +++ b/gb/gb-task-check-build-i @@ -249,10 +249,26 @@ for arch in $GB_ARCH; do fi done +osrpm_identity= +osrpm="$GB_REPO_DIR/files/SRPMS/$srpmsu" +if [ -f "$osrpm" ]; then + echo >&2 "$I: Found $srpmsu in the repo, this means the package was rebuilt" + osrpm_identity="$(pkg_identity "$osrpm")" +fi + for arch in $GB_ARCH; do [ -d "$arch/srpm" -o ! -s "$arch/excluded" ] || continue f="$arch/srpm/$srpmsu" [ -f "$f" ] || continue + srpm_identity="$(pkg_identity "$f")" + echo >&2 "$I: $arch $srpmsu identity = $srpm_identity" + # non-empty $osrpm_identity means the NEVR was rebuilt + # optimize rebuilt sourcerpm if identities of original and rebuilt sourcerpms are equal + if [ -n "$osrpm_identity" ] && + [ "$osrpm_identity" = "$srpm_identity" ]; then + echo >&2 "$I: $arch: optimize rebuilt $srpmsu cause its identity is equal to $srpmsu in the repo" + install -p "$osrpm" "$f" + fi built_pkgname="$(rpmquery --qf '%{name}' -p -- "$f")" echo "$built_pkgname" > pkgname break -- 2.25.2 ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo 2020-04-10 23:10 ` [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo Vladimir D. Seleznev @ 2020-04-11 11:29 ` Alexey Tourbin 2020-04-14 16:42 ` Vladimir D. Seleznev 0 siblings, 1 reply; 22+ messages in thread From: Alexey Tourbin @ 2020-04-11 11:29 UTC (permalink / raw) To: ALT Linux Team development discussions On Sat, Apr 11, 2020 at 2:11 AM Vladimir D. Seleznev <vseleznv@altlinux.org> wrote: > +osrpm_identity= > +osrpm="$GB_REPO_DIR/files/SRPMS/$srpmsu" > +if [ -f "$osrpm" ]; then > + echo >&2 "$I: Found $srpmsu in the repo, this means the package was rebuilt" > + osrpm_identity="$(pkg_identity "$osrpm")" > +fi > + > for arch in $GB_ARCH; do > [ -d "$arch/srpm" -o ! -s "$arch/excluded" ] || continue > f="$arch/srpm/$srpmsu" > [ -f "$f" ] || continue > + srpm_identity="$(pkg_identity "$f")" > + echo >&2 "$I: $arch $srpmsu identity = $srpm_identity" > + # non-empty $osrpm_identity means the NEVR was rebuilt > + # optimize rebuilt sourcerpm if identities of original and rebuilt sourcerpms are equal > + if [ -n "$osrpm_identity" ] && > + [ "$osrpm_identity" = "$srpm_identity" ]; then > + echo >&2 "$I: $arch: optimize rebuilt $srpmsu cause its identity is equal to $srpmsu in the repo" > + install -p "$osrpm" "$f" > + fi > built_pkgname="$(rpmquery --qf '%{name}' -p -- "$f")" > echo "$built_pkgname" > pkgname > break So how does it work in practice? Suppose I first uploaded a .src.rpm package. Do we store the original src.rpm, the one with the uploader's signature? When it gets rebuilt, this should not affect the original .src.rpm (as if it was uploaded again). No special handling is required in this case. Then suppose I build a gearifeid package from Sisyphus for p9. But your code only handles GB_REPO_DIR, not the NEIGHBOUR_REPO_DIR the package comes from. To be clear, that information is lost: when you request to build a signed tag from /gears, it does not imply that there is a corresponding .src.rpm in any REPO_DIR. There is already a problem with cross-repo copying: if done in earnest, both repos need to be locked. And of course this is deadlock-prone. You can do better without any locking if you identify every package in all repos with your new identity hash. This can be done relatively easy, since you already have that big content-addressable storage. You can hardlink it into a shadow identity-addressable storage. Once you've done that, you obtain the global / beatific vision: given a package, you instantly know if you have already seen something like this. (On the second thought: you don't need locking because the -f test is atomic and files cannot be removed from the storage, but there will still be race conditions. It's not too bad in practice. Further those race conditions can be detected at the task-commit stage.) There is one specific problem with the outlined approach: the notion of identity is flawed, because the disttag may or may not matter. Sometimes you cannot substitute a package for another package with the same identity but a different disttag. Specifically this is the case with strict dependencies between subpackages. You cannot substitute a subpackage unless you also substitute all the other subpackages. This is further complicated by noarch subpackages: you need to coordinate substitution across architectures. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo 2020-04-11 11:29 ` Alexey Tourbin @ 2020-04-14 16:42 ` Vladimir D. Seleznev 2020-04-16 21:51 ` Alexey Tourbin 0 siblings, 1 reply; 22+ messages in thread From: Vladimir D. Seleznev @ 2020-04-14 16:42 UTC (permalink / raw) To: ALT Linux Team development discussions On Sat, Apr 11, 2020 at 02:29:55PM +0300, Alexey Tourbin wrote: > On Sat, Apr 11, 2020 at 2:11 AM Vladimir D. Seleznev > <vseleznv@altlinux.org> wrote: > > +osrpm_identity= > > +osrpm="$GB_REPO_DIR/files/SRPMS/$srpmsu" > > +if [ -f "$osrpm" ]; then > > + echo >&2 "$I: Found $srpmsu in the repo, this means the package was rebuilt" > > + osrpm_identity="$(pkg_identity "$osrpm")" > > +fi > > + > > for arch in $GB_ARCH; do > > [ -d "$arch/srpm" -o ! -s "$arch/excluded" ] || continue > > f="$arch/srpm/$srpmsu" > > [ -f "$f" ] || continue > > + srpm_identity="$(pkg_identity "$f")" > > + echo >&2 "$I: $arch $srpmsu identity = $srpm_identity" > > + # non-empty $osrpm_identity means the NEVR was rebuilt > > + # optimize rebuilt sourcerpm if identities of original and rebuilt sourcerpms are equal > > + if [ -n "$osrpm_identity" ] && > > + [ "$osrpm_identity" = "$srpm_identity" ]; then > > + echo >&2 "$I: $arch: optimize rebuilt $srpmsu cause its identity is equal to $srpmsu in the repo" > > + install -p "$osrpm" "$f" > > + fi > > built_pkgname="$(rpmquery --qf '%{name}' -p -- "$f")" > > echo "$built_pkgname" > pkgname > > break > > So how does it work in practice? Suppose I first uploaded a .src.rpm > package. Do we store the original src.rpm, the one with the uploader's > signature? When it gets rebuilt, this should not affect the original > .src.rpm (as if it was uploaded again). No special handling is > required in this case. Yes. It all was about the package build from the gear repo to not multiply generated sourcerpms. > Then suppose I build a gearifeid package from Sisyphus for p9. But > your code only handles GB_REPO_DIR, not the NEIGHBOUR_REPO_DIR the > package comes from. To be clear, that information is lost: when you > request to build a signed tag from /gears, it does not imply that > there is a corresponding .src.rpm in any REPO_DIR. It's future part. I wrote some code that check the uprepos, but I didn't like it. The correct way is checking uprepos archives as well. > There is already a problem with cross-repo copying: if done in > earnest, both repos need to be locked. And of course this is > deadlock-prone. You can do better without any locking if you identify > every package in all repos with your new identity hash. This can be > done relatively easy, since you already have that big > content-addressable storage. You can hardlink it into a shadow > identity-addressable storage. Once you've done that, you obtain the > global / beatific vision: given a package, you instantly know if you > have already seen something like this. (On the second thought: you > don't need locking because the -f test is atomic and files cannot be > removed from the storage, but there will still be race conditions. > It's not too bad in practice. Further those race conditions can be > detected at the task-commit stage.) I like the idea, but there are some issues with this solution: these *are* collisions. I explain this below, but this idea will work perfectly with sourcerpms. The problem is that if we want to hande binary rpms as well, there will be kind of collisions by design. For example, package foo has two subpackages: foo-data and libfoo. After foo rebuild foo-data has the same identity as previous foo-data build, but libfoo has the different now. According the plan, the whole rebuild has significant changes and all binary packages should be substituted with new one. And now we have two foo-data packages with the same identity value, but they are belong to different builds. > There is one specific problem with the outlined approach: the notion > of identity is flawed, because the disttag may or may not matter. > Sometimes you cannot substitute a package for another package with the > same identity but a different disttag. Specifically this is the case > with strict dependencies between subpackages. You cannot substitute a > subpackage unless you also substitute all the other subpackages. Yes, that is correct, I considered this. > This is further complicated by noarch subpackages: you need to > coordinate substitution across architectures. This is more complicated with mix-arch builds. -- WBR, Vladimir D. Seleznev ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo 2020-04-14 16:42 ` Vladimir D. Seleznev @ 2020-04-16 21:51 ` Alexey Tourbin 2020-04-17 13:54 ` Dmitry V. Levin 2020-04-20 8:36 ` [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo Alexey Tourbin 0 siblings, 2 replies; 22+ messages in thread From: Alexey Tourbin @ 2020-04-16 21:51 UTC (permalink / raw) To: ALT Linux Team development discussions On Tue, Apr 14, 2020 at 7:42 PM Vladimir D. Seleznev <vseleznv@altlinux.org> wrote: > > Then suppose I build a gearifeid package from Sisyphus for p9. But > > your code only handles GB_REPO_DIR, not the NEIGHBOUR_REPO_DIR the > > package comes from. To be clear, that information is lost: when you > > request to build a signed tag from /gears, it does not imply that > > there is a corresponding .src.rpm in any REPO_DIR. > > It's future part. I wrote some code that check the uprepos, but I didn't > like it. The correct way is checking uprepos archives as well. I gave it some more thought. First, the way you're trying to hash all the unknown tags is interesting, but I wouldn't do that. You may want to hash everything if you don't understand the internal structure and prefer to treat the input as a black box. On the contrary, we understand the internals of a package well. What's then a minimal subset of tags to identify a package? For a source package, it's FileDigests + BuildRequires + BuildConflicts. That's all! They determine the outcome of a build, and we may reasonably postulate that the rest of the tags should not influence that outcome. You don't even have to hash NEVR, because the specfile is already in FileDigests. (But you should probably hash FileFlags, because they point out which file is the specfile. You should also hash FileModes, because some sources may be executable. But that's about all.) Second, referring to the discussion about hash functions, the hash function you're using isn't all that important. That's because you're hashing MD5 sums in FileDigests, and those are the weakest link and (theoretically) the main cause of any collision. The speed isn't important either, because you're hashing relatively short inputs. So what's the right set of tags for a binary package, and what is its identity? (I'm not sure identity is the right word, I would rather call it ID. Identity is who you are and what you believe in, for example a black person who votes for Obama.) I've already hinted that identity can be defined via substitution: if you replace a package with a different package but the same identity, there should be no functional difference, and furthermore no difference "for all intents and purposes", except for a few observable differences which we deem immaterial and permit explicitly, such as FileMtimes. So obviously you need to hash at least FileDigests and Requires/Provides/Obsoletes/Conflicts. This should satisfy the definition of ID for rpm (the dependencies are satisfied in the same way, and file conflicts are the resolved in the same way, so rpm can't tell the difference if we make a substitution.) It isn't clear whether you should hash informational sections such as %description. It can be argued that under the same NEVR, the description shouldn't change anyway. Is it possible that nothing changes in a package but the description? Would we still want to update/replace the package then? Finally, your identity hash need not to be fixed once and forever. It is used only for internal bookkeeping, so once in a while you are allowed to change the hash and rebuild the identity-addressable storage. You should have a script for that in girar/admin. It may take an hour or so to complete, but that's not too bad. > > There is already a problem with cross-repo copying: if done in > > earnest, both repos need to be locked. And of course this is > > deadlock-prone. You can do better without any locking if you identify > > every package in all repos with your new identity hash. This can be > > done relatively easy, since you already have that big > > content-addressable storage. You can hardlink it into a shadow > > identity-addressable storage. Once you've done that, you obtain the > > global / beatific vision: given a package, you instantly know if you > > have already seen something like this. (On the second thought: you > > don't need locking because the -f test is atomic and files cannot be > > removed from the storage, but there will still be race conditions. > > It's not too bad in practice. Further those race conditions can be > > detected at the task-commit stage.) > > I like the idea, but there are some issues with this solution: these > *are* collisions. I explain this below, but this idea will work > perfectly with sourcerpms. > > The problem is that if we want to hande binary rpms as well, there will > be kind of collisions by design. For example, package foo has two > subpackages: foo-data and libfoo. After foo rebuild foo-data has the > same identity as previous foo-data build, but libfoo has the different > now. According the plan, the whole rebuild has significant changes and > all binary packages should be substituted with new one. And now we have > two foo-data packages with the same identity value, but they are belong > to different builds. > > > There is one specific problem with the outlined approach: the notion > > of identity is flawed, because the disttag may or may not matter. > > Sometimes you cannot substitute a package for another package with the > > same identity but a different disttag. Specifically this is the case > > with strict dependencies between subpackages. You cannot substitute a > > subpackage unless you also substitute all the other subpackages. > > Yes, that is correct, I considered this. So for src.rpm packages, it's a solved problem. For binary packages, the identity should specifically exclude disttag. It will no longer satisfy the definition of ID for rpm (substitution will break for subpackages with strict dependencies). Therefore for binary packages, we need to track <ID,disttag> tuples. This is a one-to-many relation: for each ID, there may be a few disttags. So for binary packages we need a separate identity-addressable storage which maps ID to <disttag,filehash> (while for source packages, a hardlink maps ID to filehash). If implemented naively, this will create many small files, one file per ID, most files with just one line. In a more practical implementation, you should probably group all those small files by package name. So you'll have: $ cat id2f/libfoo <libfoo-ID1> <disttag1> <libfoo-filehash1> <libfoo-ID2> <disttag2> <libfoo-filehash2> $ cat id2f/foo-data <foo-data-ID1> <disttag1> <foo-data-filehash1> <foo-data-ID1> <disttag2> <foo-data-filehash2> Note that for libfoo, the IDs are different, but with foo-data the IDs are the same. This indicates that the contents of libfoo have changed after a rebuild, while the contents of foo-data have not. Suppose you have such a store, and foo.src.rpm is getting rebuilt again (or copied to p9). You can then check up with the store and see if the outcome can be replaced with either libfoo-filehash1 + foo-data-filehash1 (with disttag1) or libfoo-filehash2 + foo-data-filehash2 (with disttag2), but not in other combinations. You'll need an elaborate algorithm which coordinates substitutions across architectures, but this seems doable. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo 2020-04-16 21:51 ` Alexey Tourbin @ 2020-04-17 13:54 ` Dmitry V. Levin 2020-04-20 9:05 ` [devel] stopping a cascade of rebuilds Alexey Tourbin 2020-04-20 8:36 ` [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo Alexey Tourbin 1 sibling, 1 reply; 22+ messages in thread From: Dmitry V. Levin @ 2020-04-17 13:54 UTC (permalink / raw) To: devel On Fri, Apr 17, 2020 at 12:51:51AM +0300, Alexey Tourbin wrote: [...] > So what's the right set of tags for a binary package, and what is its > identity? (I'm not sure identity is the right word, I would rather > call it ID. Identity is who you are and what you believe in, for > example a black person who votes for Obama.) I've already hinted that > identity can be defined via substitution: if you replace a package > with a different package but the same identity, there should be no > functional difference, and furthermore no difference "for all intents > and purposes", except for a few observable differences which we deem > immaterial and permit explicitly, such as FileMtimes. So obviously you > need to hash at least FileDigests and > Requires/Provides/Obsoletes/Conflicts. This should satisfy the > definition of ID for rpm (the dependencies are satisfied in the same > way, and file conflicts are the resolved in the same way, so rpm can't > tell the difference if we make a substitution.) What about various types of scripts? $ rpmquery --querytags | grep 'PROG$' PREINPROG POSTINPROG PREUNPROG POSTUNPROG VERIFYSCRIPTPROG TRIGGERSCRIPTPROG > It isn't clear whether you should hash informational sections such as > %description. It can be argued that under the same NEVR, the > description shouldn't change anyway. Is it possible that nothing > changes in a package but the description? Would we still want to > update/replace the package then? > > Finally, your identity hash need not to be fixed once and forever. It > is used only for internal bookkeeping, so once in a while you are > allowed to change the hash and rebuild the identity-addressable > storage. You should have a script for that in girar/admin. It may take > an hour or so to complete, but that's not too bad. -- ldv ^ permalink raw reply [flat|nested] 22+ messages in thread
* [devel] stopping a cascade of rebuilds 2020-04-17 13:54 ` Dmitry V. Levin @ 2020-04-20 9:05 ` Alexey Tourbin 2020-04-23 19:21 ` Vladimir D. Seleznev 0 siblings, 1 reply; 22+ messages in thread From: Alexey Tourbin @ 2020-04-20 9:05 UTC (permalink / raw) To: ALT Linux Team development discussions On Fri, Apr 17, 2020 at 4:54 PM Dmitry V. Levin <ldv@altlinux.org> wrote: > > So what's the right set of tags for a binary package, and what is its > > identity? (I'm not sure identity is the right word, I would rather > > call it ID. Identity is who you are and what you believe in, for > > example a black person who votes for Obama.) I've already hinted that > > identity can be defined via substitution: if you replace a package > > with a different package but the same identity, there should be no > > functional difference, and furthermore no difference "for all intents > > and purposes", except for a few observable differences which we deem > > immaterial and permit explicitly, such as FileMtimes. So obviously you > > need to hash at least FileDigests and > > Requires/Provides/Obsoletes/Conflicts. This should satisfy the > > definition of ID for rpm (the dependencies are satisfied in the same > > way, and file conflicts are the resolved in the same way, so rpm can't > > tell the difference if we make a substitution.) > > What about various types of scripts? > > $ rpmquery --querytags | grep 'PROG$' > PREINPROG > POSTINPROG > PREUNPROG > POSTUNPROG > VERIFYSCRIPTPROG > TRIGGERSCRIPTPROG Sure, slipped my mind. These IDs can have another application. Suppose a task is resumed, and subtask #100 needs a rebuild. It is likely that, after the rebuild, the result won't change (all subpackage end up with the same IDs). However, we replace the packages anyway. This will trigger the rebuild of subtask #200, if it depends on subtrask #100. There are two approaches to spare the unnecessary rebuild: 1) Don't overwrite local files with those from the remote build, if they are identical. 2) When tracking the contents of BuildRoot, take into account IDs instead of HeaderSHA1. There is a peculiarity with the first approach: if subtask #100 has both arch and noarch subpackages, the decision to overwrite files cannot be made locally / per architecture. If you need to overwrite subpackages for at least one architecture, then you have to overwrite for all architectures, otherwise there will be unmet dependencies between arch and noarch subpackages. I think we already have this problem: the decision to rebuild a subtask is made locally (or rather remotely), per architecture. So occasionally we've seen unmet dependencies due to disttag mismatch. Has the problem been addressed? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] stopping a cascade of rebuilds 2020-04-20 9:05 ` [devel] stopping a cascade of rebuilds Alexey Tourbin @ 2020-04-23 19:21 ` Vladimir D. Seleznev 2020-04-23 20:54 ` Dmitry V. Levin 2020-04-27 5:38 ` Alexey Tourbin 0 siblings, 2 replies; 22+ messages in thread From: Vladimir D. Seleznev @ 2020-04-23 19:21 UTC (permalink / raw) To: ALT Linux Team development discussions On Mon, Apr 20, 2020 at 12:05:26PM +0300, Alexey Tourbin wrote: > On Fri, Apr 17, 2020 at 4:54 PM Dmitry V. Levin <ldv@altlinux.org> wrote: > > > So what's the right set of tags for a binary package, and what is its > > > identity? (I'm not sure identity is the right word, I would rather > > > call it ID. Identity is who you are and what you believe in, for > > > example a black person who votes for Obama.) I've already hinted that > > > identity can be defined via substitution: if you replace a package > > > with a different package but the same identity, there should be no > > > functional difference, and furthermore no difference "for all intents > > > and purposes", except for a few observable differences which we deem > > > immaterial and permit explicitly, such as FileMtimes. So obviously you > > > need to hash at least FileDigests and > > > Requires/Provides/Obsoletes/Conflicts. This should satisfy the > > > definition of ID for rpm (the dependencies are satisfied in the same > > > way, and file conflicts are the resolved in the same way, so rpm can't > > > tell the difference if we make a substitution.) > > > > What about various types of scripts? > > > > $ rpmquery --querytags | grep 'PROG$' > > PREINPROG > > POSTINPROG > > PREUNPROG > > POSTUNPROG > > VERIFYSCRIPTPROG > > TRIGGERSCRIPTPROG Well, so, your position is that we control and understand our packages, and we can explicitly define what tags are essential for the task. Indeed, my idea was to filter all unrelated tags because in the future rpm-build there can be new tags, but it seems like a needless reinsurance. You are right that we always can add more tags to calculate the identity. > Sure, slipped my mind. > So for src.rpm packages, it's a solved problem. For binary packages, > the identity should specifically exclude disttag. It will no longer > satisfy the definition of ID for rpm (substitution will break for > subpackages with strict dependencies). Therefore for binary packages, > we need to track <ID,disttag> tuples. Why should we track them? If we rebuild a package, we should check whether identity of its binary packages had changed. If it had not, we shouldn't replace its binary packages by rebuilt packages. That simple. The things get more complicated in case of "copying" packages. In that case this schema could help. But we need also track buildtime. Just to prevent package replacing with earlier build. So, it is triple now. > This is a one-to-many relation: for each ID, there may be a few > disttags. So for binary packages we need a separate > identity-addressable storage which maps ID to <disttag,filehash> > (while for source packages, a hardlink maps ID to filehash). If > implemented naively, this will create many small files, one file per > ID, most files with just one line. In a more practical implementation, > you should probably group all those small files by package name. So > you'll have: > > $ cat id2f/libfoo > <libfoo-ID1> <disttag1> <libfoo-filehash1> > <libfoo-ID2> <disttag2> <libfoo-filehash2> > > $ cat id2f/foo-data > <foo-data-ID1> <disttag1> <foo-data-filehash1> > <foo-data-ID1> <disttag2> <foo-data-filehash2> > > Note that for libfoo, the IDs are different, but with foo-data the IDs > are the same. This indicates that the contents of libfoo have changed > after a rebuild, while the contents of foo-data have not. That is correct. > It may even make sense to group the mappings by src.rpm name instead > of package name. At first it seems less intuitive, but in return it > can give you a consistent view similar to MVCC snapshot. Of course, > these files should be updated atomically, with rename(2). To check a > set subpackages, you first need to copy the file to a local dir. This > should rule out the case in which some subpackages have been added to > the file and some not. I don't get this idea, please expand it. > These files are to be updated during the task-commit stage, under the > exclusive lock. This is also the right moment to detect race > conditions. Suppose you build the same package for sisyphus and p9 in > parallel, and the build result is the same. Before adding new > packages, you recheck if the whole set can be replaced with the > already existing packages. One of the two tasks then should fail (or > automatically scheduled for another iteration). > These IDs can have another application. Suppose a task is resumed, and > subtask #100 needs a rebuild. It is likely that, after the rebuild, > the result won't change (all subpackage end up with the same IDs). > However, we replace the packages anyway. This will trigger the > rebuild of subtask #200, if it depends on subtrask #100. There are two > approaches to spare the unnecessary rebuild: > > 1) Don't overwrite local files with those from the remote build, if > they are identical. > 2) When tracking the contents of BuildRoot, take into account IDs > instead of HeaderSHA1. I like the idea. > There is a peculiarity with the first approach: if subtask #100 has > both arch and noarch subpackages, the decision to overwrite files > cannot be made locally / per architecture. If you need to overwrite > subpackages for at least one architecture, then you have to overwrite > for all architectures, otherwise there will be unmet dependencies > between arch and noarch subpackages. That is correct. > I think we already have this problem: the decision to rebuild a > subtask is made locally (or rather remotely), per architecture. So > occasionally we've seen unmet dependencies due to disttag mismatch. > Has the problem been addressed? Hmm, I'll think about that. P.S. > I'm not sure identity is the right word, I would rather call it ID. It's just a term, we can define it as we want. There was another option to name it BuildID, but for some reason it was refused. -- WBR, Vladimir D. Seleznev ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] stopping a cascade of rebuilds 2020-04-23 19:21 ` Vladimir D. Seleznev @ 2020-04-23 20:54 ` Dmitry V. Levin 2020-04-27 5:38 ` Alexey Tourbin 1 sibling, 0 replies; 22+ messages in thread From: Dmitry V. Levin @ 2020-04-23 20:54 UTC (permalink / raw) To: ALT Linux Team development discussions On Thu, Apr 23, 2020 at 10:21:28PM +0300, Vladimir D. Seleznev wrote: [...] > It's just a term, we can define it as we want. There was another option > to name it BuildID, but for some reason it was refused. The reason why it was rejected is to avoid confusion with Build ID: $ readelf -n /bin/true |grep -i 'build.*id' Displaying notes found in: .note.gnu.build-id GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring) Build ID: a8463ba44bcb8caad1b9c466a28dc573ff7878f2 -- ldv ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] stopping a cascade of rebuilds 2020-04-23 19:21 ` Vladimir D. Seleznev 2020-04-23 20:54 ` Dmitry V. Levin @ 2020-04-27 5:38 ` Alexey Tourbin 1 sibling, 0 replies; 22+ messages in thread From: Alexey Tourbin @ 2020-04-27 5:38 UTC (permalink / raw) To: ALT Linux Team development discussions On Thu, Apr 23, 2020 at 10:21 PM Vladimir D. Seleznev <vseleznv@altlinux.org> wrote: > > So for src.rpm packages, it's a solved problem. For binary packages, > > the identity should specifically exclude disttag. It will no longer > > satisfy the definition of ID for rpm (substitution will break for > > subpackages with strict dependencies). Therefore for binary packages, > > we need to track <ID,disttag> tuples. > > Why should we track them? If we rebuild a package, we should check > whether identity of its binary packages had changed. If it had not, we > shouldn't replace its binary packages by rebuilt packages. That simple. Because in the identity-addressable storage, we can have a few packages with the same ID but different disttags, as in the example below. The fact that foo-data-ID1 hasn't changed doesn't mean you can immediately grab foo-data-filehash1. > > $ cat id2f/libfoo > > <libfoo-ID1> <disttag1> <libfoo-filehash1> > > <libfoo-ID2> <disttag2> <libfoo-filehash2> > > > > $ cat id2f/foo-data > > <foo-data-ID1> <disttag1> <foo-data-filehash1> > > <foo-data-ID1> <disttag2> <foo-data-filehash2> > The things get more complicated in case of "copying" packages. In that > case this schema could help. But we need also track buildtime. Just to > prevent package replacing with earlier build. So, it is triple now. Hmm, haven't thought of buildtime. We've got a package in the repo, a freshly rebuilt package with the same EVR, and an identical package (or a few identical packages) in the identity-addressable storage. We need to prevent the outcome in which the buildtime of the package in the repo goes down. That's what you mean by "prevent package replacing with earlier build", right? (Otherwise replacing with an earlier build is not a problem.) Such an outcome is unlikely to occur in practice, but not impossible. Three parties are now involved. You cannot simply ask, "can I overwrite build/100 with packages from identity-addressable storage?" The third piece of information is what you already have in the repo. > > It may even make sense to group the mappings by src.rpm name instead > > of package name. At first it seems less intuitive, but in return it > > can give you a consistent view similar to MVCC snapshot. Of course, > > these files should be updated atomically, with rename(2). To check a > > set subpackages, you first need to copy the file to a local dir. This > > should rule out the case in which some subpackages have been added to > > the file and some not. > > I don't get this idea, please expand it. These are implementation details (that are not all that important unless we agree on the data model). How do you access the identity-addressable storage concurrently? The worst case is that id2f/libfoo is written and read simultaneously, so you read a half-written file. To avoid this, files should be replaced atomically: $ mv local/libfoo id2f/.tmp$$.libfoo && mv id2f/.tmp$$.libfoo id2f/libfoo You are now guaranteed that you open either old id2f/libfoo or new id2f/libfoo, but not something in between. But this is not enough, because there is a higher-order inconsistency: you don't want to process new id2f/libfoo with old id2f/foo-data (when id2f/foo-data is just being replaced). To get a "consistent view" across subpackages, you can combine id2f/libfoo and id2f/foo-data into a single file. To process subpackages, you first need to make a local copy of that file. Because with command-line tools, you'd want to open the file a few times, so you can't do it directly on id2f/libfoo. This is an alternative to the "single writer, multiple readers" model. In this model, the reader first has to obtain a shared lock on id2f. In what I describe, the reader can proceed without any locking. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo 2020-04-16 21:51 ` Alexey Tourbin 2020-04-17 13:54 ` Dmitry V. Levin @ 2020-04-20 8:36 ` Alexey Tourbin 1 sibling, 0 replies; 22+ messages in thread From: Alexey Tourbin @ 2020-04-20 8:36 UTC (permalink / raw) To: ALT Linux Team development discussions On Fri, Apr 17, 2020 at 12:51 AM Alexey Tourbin <alexey.tourbin@gmail.com> wrote: > So for src.rpm packages, it's a solved problem. For binary packages, > the identity should specifically exclude disttag. It will no longer > satisfy the definition of ID for rpm (substitution will break for > subpackages with strict dependencies). Therefore for binary packages, > we need to track <ID,disttag> tuples. This is a one-to-many relation: > for each ID, there may be a few disttags. So for binary packages we > need a separate identity-addressable storage which maps ID to > <disttag,filehash> (while for source packages, a hardlink maps ID to > filehash). If implemented naively, this will create many small files, > one file per ID, most files with just one line. In a more practical > implementation, you should probably group all those small files by > package name. So you'll have: > > $ cat id2f/libfoo > <libfoo-ID1> <disttag1> <libfoo-filehash1> > <libfoo-ID2> <disttag2> <libfoo-filehash2> > > $ cat id2f/foo-data > <foo-data-ID1> <disttag1> <foo-data-filehash1> > <foo-data-ID1> <disttag2> <foo-data-filehash2> > > Note that for libfoo, the IDs are different, but with foo-data the IDs > are the same. This indicates that the contents of libfoo have changed > after a rebuild, while the contents of foo-data have not. It may even make sense to group the mappings by src.rpm name instead of package name. At first it seems less intuitive, but in return it can give you a consistent view similar to MVCC snapshot. Of course, these files should be updated atomically, with rename(2). To check a set subpackages, you first need to copy the file to a local dir. This should rule out the case in which some subpackages have been added to the file and some not. These files are to be updated during the task-commit stage, under the exclusive lock. This is also the right moment to detect race conditions. Suppose you build the same package for sisyphus and p9 in parallel, and the build result is the same. Before adding new packages, you recheck if the whole set can be replaced with the already existing packages. One of the two tasks then should fail (or automatically scheduled for another iteration). ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] RFC: girar: optimize rebuild 2020-04-10 23:10 [devel] RFC: girar: optimize rebuild Vladimir D. Seleznev 2020-04-10 23:10 ` [devel] [PATCH 1/2] gb/gb-sh-functions: introduce pkg_identity() Vladimir D. Seleznev 2020-04-10 23:10 ` [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo Vladimir D. Seleznev @ 2020-04-11 10:36 ` Andrey Savchenko 2020-04-11 15:33 ` Vladimir D. Seleznev 2020-04-11 23:31 ` Alexey V. Vissarionov 2020-04-11 11:04 ` Gleb Fotengauer-Malinovskiy 3 siblings, 2 replies; 22+ messages in thread From: Andrey Savchenko @ 2020-04-11 10:36 UTC (permalink / raw) To: ALT Linux Team development discussions [-- Attachment #1: Type: text/plain, Size: 2186 bytes --] On Sat, 11 Apr 2020 02:10:42 +0300 Vladimir D. Seleznev wrote: > > Hi! > > The first part of rebuilt packages optimization for girar. It introduces > pkg_identity() and simple optimization of the rebuilt sourcerpm. > > pkg_identity() takes RPM package and returns a value called package identity, > a hash of subset of RPM package header. That subset is the entire header > without some nonessential artifacts like buildhost, buildtime, header hashsum, > etc. > > The two package builds of the same NEVR might have equal or different > package identities. The equal identities mean that build results of these > packages are equal too, that allows build optimization. The practical > example of simple rebuilt sourcerpm optimization also introduced. > > The future work can be about optimization of "copied" to another branch > sourcerpm with retrieved from archive sourcerpm, and binary packages > optimization (this case has an issue when binary subpackages are mixed > archs, i.e. arch and noarch, this probably could work only with single-arch > builds). > > Please review and discuss. I see two problems with proposed approach: 1) It assumes there will be not pkg_identity hash collisions. This is wrong. They may occur sooner or later and the code *must* correctly deal with such collisions. Remember what happened to subversion when collision occurred in a repository, while git was resilient. The way proposal is now the identity hash collision will lead to undergraded repository at best and broken at worst. I see no easy way to fix this problem, but it must be either fixed or proposed optimization rejected. 2) The hash function choise — sha256 — is very unfortunate: it has longer digest than sha1, but otherwise is vulnerable to the same attack; so right now it is still marginally secure, but it will not last long. Moreover sha256 is quite slow. It is better to use newer generation of hash functions, e.g. blake2b based on the chacha stream cipher. It is more future proof and faster at the same time. You can just use the b2sum implementation from the GNU coreutils. Best regards, Andrew Savchenko [-- Attachment #2: Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] RFC: girar: optimize rebuild 2020-04-11 10:36 ` [devel] RFC: girar: optimize rebuild Andrey Savchenko @ 2020-04-11 15:33 ` Vladimir D. Seleznev 2020-04-11 23:31 ` Alexey V. Vissarionov 1 sibling, 0 replies; 22+ messages in thread From: Vladimir D. Seleznev @ 2020-04-11 15:33 UTC (permalink / raw) To: ALT Linux Team development discussions On Sat, Apr 11, 2020 at 01:36:31PM +0300, Andrey Savchenko wrote: > On Sat, 11 Apr 2020 02:10:42 +0300 Vladimir D. Seleznev wrote: > > > > Hi! > > > > The first part of rebuilt packages optimization for girar. It introduces > > pkg_identity() and simple optimization of the rebuilt sourcerpm. > > > > pkg_identity() takes RPM package and returns a value called package identity, > > a hash of subset of RPM package header. That subset is the entire header > > without some nonessential artifacts like buildhost, buildtime, header hashsum, > > etc. > > > > The two package builds of the same NEVR might have equal or different > > package identities. The equal identities mean that build results of these > > packages are equal too, that allows build optimization. The practical > > example of simple rebuilt sourcerpm optimization also introduced. > > > > The future work can be about optimization of "copied" to another branch > > sourcerpm with retrieved from archive sourcerpm, and binary packages > > optimization (this case has an issue when binary subpackages are mixed > > archs, i.e. arch and noarch, this probably could work only with single-arch > > builds). > > > > Please review and discuss. > > I see two problems with proposed approach: > > 1) It assumes there will be not pkg_identity hash collisions. This > is wrong. They may occur sooner or later and the code *must* > correctly deal with such collisions. Remember what happened to > subversion when collision occurred in a repository, while git was > resilient. Any hashsum function has collisions by definition. The only way to avoid them is not to use hashsums. > The way proposal is now the identity hash collision will lead to > undergraded repository at best and broken at worst. No, it will not, cause any issues that this collision might bring up will be caught by later build checks. > I see no easy way to fix this problem, but it must be either fixed > or proposed optimization rejected. > > 2) The hash function choise — sha256 — is very unfortunate: it has > longer digest than sha1, but otherwise is vulnerable to the same > attack; so right now it is still marginally secure, but it will not > last long. Moreover sha256 is quite slow. The good news: it is not about security. > It is better to use newer generation of hash functions, e.g. > blake2b based on the chacha stream cipher. It is more future proof > and faster at the same time. You can just use the b2sum > implementation from the GNU coreutils. -- WBR, Vladimir D. Seleznev ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] RFC: girar: optimize rebuild 2020-04-11 10:36 ` [devel] RFC: girar: optimize rebuild Andrey Savchenko 2020-04-11 15:33 ` Vladimir D. Seleznev @ 2020-04-11 23:31 ` Alexey V. Vissarionov 2020-04-14 14:57 ` Andrey Savchenko 1 sibling, 1 reply; 22+ messages in thread From: Alexey V. Vissarionov @ 2020-04-11 23:31 UTC (permalink / raw) To: ALT Linux Team development discussions On 2020-04-11 13:36:31 +0300, Andrey Savchenko wrote: >> The first part of rebuilt packages optimization for girar. >> It introduces pkg_identity() and simple optimization of the >> rebuilt sourcerpm. >> pkg_identity() takes RPM package and returns a value called >> package identity, a hash of subset of RPM package header. >> That subset is the entire header without some nonessential >> artifacts like buildhost, buildtime, header hashsum, etc. > I see two problems with proposed approach: > 1) It assumes there will be not pkg_identity hash collisions. > This is wrong. They may occur sooner or later and the code > *must* correctly deal with such collisions. The solution is well known: prefix the hash with a time_t value to let it grow monotonously while still being strictly dependent on sensitive data. Whether we'd face a hash collision, we could check whether the timestamps differ significantly. > 2) The hash function choise — sha256 — is very unfortunate: > it has longer digest than sha1, but otherwise is vulnerable > to the same attack; so right now it is still marginally secure, > but it will not last long. We don't really need any cryptographic-grade hash function here: all we need is just a checksum with a good distribution to detect whether something had changed - obviously enough, nobody would try to build and exploit collisions here. Said that, we can use almost any polynomial. > Moreover sha256 is quite slow. SHA2 is implemented in the hardware in some modern CPUs, so it's quite fast there. > It is better to use newer generation of hash functions, e.g. > blake2b based on the chacha stream cipher. Both are still quite marginal... but, once again, we should not care of that too much - any hash function would do the job. Even if we'd switch to another polynomial in the future. -- Alexey V. Vissarionov gremlin ПРИ altlinux ТЧК org; +vii-cmiii-ccxxix-lxxix-xlii GPG: 0D92F19E1C0DC36E27F61A29CD17E2B43D879005 @ hkp://keys.gnupg.net ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] RFC: girar: optimize rebuild 2020-04-11 23:31 ` Alexey V. Vissarionov @ 2020-04-14 14:57 ` Andrey Savchenko 2020-04-14 16:20 ` Vladimir D. Seleznev 0 siblings, 1 reply; 22+ messages in thread From: Andrey Savchenko @ 2020-04-14 14:57 UTC (permalink / raw) To: ALT Linux Team development discussions [-- Attachment #1: Type: text/plain, Size: 3631 bytes --] On Sun, 12 Apr 2020 02:31:43 +0300 Alexey V. Vissarionov wrote: > On 2020-04-11 13:36:31 +0300, Andrey Savchenko wrote: > > >> The first part of rebuilt packages optimization for girar. > >> It introduces pkg_identity() and simple optimization of the > >> rebuilt sourcerpm. > >> pkg_identity() takes RPM package and returns a value called > >> package identity, a hash of subset of RPM package header. > >> That subset is the entire header without some nonessential > >> artifacts like buildhost, buildtime, header hashsum, etc. > > I see two problems with proposed approach: > > 1) It assumes there will be not pkg_identity hash collisions. > > This is wrong. They may occur sooner or later and the code > > *must* correctly deal with such collisions. > > The solution is well known: prefix the hash with a time_t value > to let it grow monotonously while still being strictly dependent > on sensitive data. Yes, this is a good idea. > Whether we'd face a hash collision, we could check whether the > timestamps differ significantly. > > > 2) The hash function choise — sha256 — is very unfortunate: > > it has longer digest than sha1, but otherwise is vulnerable > > to the same attack; so right now it is still marginally secure, > > but it will not last long. > We don't really need any cryptographic-grade hash function here: > all we need is just a checksum with a good distribution to detect > whether something had changed - obviously enough, nobody would > try to build and exploit collisions here. Said that, we can use > almost any polynomial. Still it may be a security issue. Consider what will happen if wrong source rpm will be used: new modifications including security fixes may be silently omitted from a branch. > > Moreover sha256 is quite slow. > > SHA2 is implemented in the hardware in some modern CPUs, so it's > quite fast there. Only in some and only for amd64 arch. But our man build infrastructure also uses ppc64le and aarch64, so it is very important to be efficient, especially on aarch64 which is a bottleneck for most tasks. And consider that we have secondary build systems for other arches like mips, riscv, e2k. A talk is cheap, so let's see some some numbers. 0) dd if=/dev/urandom of=/tmp/test.file bs=1M count=2048 1) Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz $ time sha256sum -b /tmp/test.file 8.67user 0.27system 0:08.94elapsed 99%CPU (0avgtext+0avgdata 1944maxresident)k 8.70user 0.25system 0:08.96elapsed 99%CPU (0avgtext+0avgdata 2148maxresident)k 8.65user 0.28system 0:08.93elapsed 99%CPU (0avgtext+0avgdata 2064maxresident)k $ time b2sum -b /tmp/test.file 2.48user 0.32system 0:02.81elapsed 99%CPU (0avgtext+0avgdata 2120maxresident)k 2.46user 0.30system 0:02.76elapsed 99%CPU (0avgtext+0avgdata 2120maxresident)k 2.47user 0.29system 0:02.77elapsed 99%CPU (0avgtext+0avgdata 2068maxresident)k 2) E8C (1300 MHz, MBE8C-PC v.2) $ time sha256sum -b /tmp/test.file 11.69user 0.93system 0:12.64elapsed 99%CPU (0avgtext+0avgdata 3784maxresident)k 11.78user 0.85system 0:12.63elapsed 99%CPU (0avgtext+0avgdata 3836maxresident)k 11.72user 0.90system 0:12.63elapsed 99%CPU (0avgtext+0avgdata 3956maxresident)k $ time b2sum -b /tmp/test.file 6.90user 1.37system 0:08.27elapsed 99%CPU (0avgtext+0avgdata 3896maxresident)k 6.76user 1.10system 0:07.87elapsed 99%CPU (0avgtext+0avgdata 3844maxresident)k 6.93user 0.95system 0:07.88elapsed 99%CPU (0avgtext+0avgdata 3872maxresident)k I see no reason for using slower and less secure sha256 algorithm. Best regards, Andrew Savchenko [-- Attachment #2: Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] RFC: girar: optimize rebuild 2020-04-14 14:57 ` Andrey Savchenko @ 2020-04-14 16:20 ` Vladimir D. Seleznev 0 siblings, 0 replies; 22+ messages in thread From: Vladimir D. Seleznev @ 2020-04-14 16:20 UTC (permalink / raw) To: ALT Linux Team development discussions On Tue, Apr 14, 2020 at 05:57:13PM +0300, Andrey Savchenko wrote: > On Sun, 12 Apr 2020 02:31:43 +0300 Alexey V. Vissarionov wrote: > > On 2020-04-11 13:36:31 +0300, Andrey Savchenko wrote: > > > > >> The first part of rebuilt packages optimization for girar. > > >> It introduces pkg_identity() and simple optimization of the > > >> rebuilt sourcerpm. > > >> pkg_identity() takes RPM package and returns a value called > > >> package identity, a hash of subset of RPM package header. > > >> That subset is the entire header without some nonessential > > >> artifacts like buildhost, buildtime, header hashsum, etc. > > > I see two problems with proposed approach: > > > 1) It assumes there will be not pkg_identity hash collisions. > > > This is wrong. They may occur sooner or later and the code > > > *must* correctly deal with such collisions. > > > > The solution is well known: prefix the hash with a time_t value > > to let it grow monotonously while still being strictly dependent > > on sensitive data. > > Yes, this is a good idea. I don't get the idea. > > Whether we'd face a hash collision, we could check whether the > > timestamps differ significantly. > > > > > 2) The hash function choise — sha256 — is very unfortunate: > > > it has longer digest than sha1, but otherwise is vulnerable > > > to the same attack; so right now it is still marginally secure, > > > but it will not last long. > > We don't really need any cryptographic-grade hash function here: > > all we need is just a checksum with a good distribution to detect > > whether something had changed - obviously enough, nobody would > > try to build and exploit collisions here. Said that, we can use > > almost any polynomial. > > Still it may be a security issue. Consider what will happen if > wrong source rpm will be used: new modifications including security > fixes may be silently omitted from a branch. Nothing bad will happen. I see you don't understand the task: it's not about neither the new modifications or new releases. It's only about package rebuild. It uses no new sources. > > > Moreover sha256 is quite slow. > > > > SHA2 is implemented in the hardware in some modern CPUs, so it's > > quite fast there. > > Only in some and only for amd64 arch. But our man build infrastructure > also uses ppc64le and aarch64, so it is very important to be > efficient, especially on aarch64 which is a bottleneck for most > tasks. And consider that we have secondary build systems for other > arches like mips, riscv, e2k. > > A talk is cheap, so let's see some some numbers. > > 0) dd if=/dev/urandom of=/tmp/test.file bs=1M count=2048 > > 1) Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz > $ time sha256sum -b /tmp/test.file > 8.67user 0.27system 0:08.94elapsed 99%CPU (0avgtext+0avgdata 1944maxresident)k > 8.70user 0.25system 0:08.96elapsed 99%CPU (0avgtext+0avgdata 2148maxresident)k > 8.65user 0.28system 0:08.93elapsed 99%CPU (0avgtext+0avgdata 2064maxresident)k > > $ time b2sum -b /tmp/test.file > 2.48user 0.32system 0:02.81elapsed 99%CPU (0avgtext+0avgdata 2120maxresident)k > 2.46user 0.30system 0:02.76elapsed 99%CPU (0avgtext+0avgdata 2120maxresident)k > 2.47user 0.29system 0:02.77elapsed 99%CPU (0avgtext+0avgdata 2068maxresident)k > > 2) E8C (1300 MHz, MBE8C-PC v.2) > $ time sha256sum -b /tmp/test.file > 11.69user 0.93system 0:12.64elapsed 99%CPU (0avgtext+0avgdata 3784maxresident)k > 11.78user 0.85system 0:12.63elapsed 99%CPU (0avgtext+0avgdata 3836maxresident)k > 11.72user 0.90system 0:12.63elapsed 99%CPU (0avgtext+0avgdata 3956maxresident)k > > $ time b2sum -b /tmp/test.file > 6.90user 1.37system 0:08.27elapsed 99%CPU (0avgtext+0avgdata 3896maxresident)k > 6.76user 1.10system 0:07.87elapsed 99%CPU (0avgtext+0avgdata 3844maxresident)k > 6.93user 0.95system 0:07.88elapsed 99%CPU (0avgtext+0avgdata 3872maxresident)k > > I see no reason for using slower and less secure sha256 algorithm. We can use more faster algorithm. Again, it is not about security. -- WBR, Vladimir D. Seleznev ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] RFC: girar: optimize rebuild 2020-04-10 23:10 [devel] RFC: girar: optimize rebuild Vladimir D. Seleznev ` (2 preceding siblings ...) 2020-04-11 10:36 ` [devel] RFC: girar: optimize rebuild Andrey Savchenko @ 2020-04-11 11:04 ` Gleb Fotengauer-Malinovskiy 2020-04-11 15:21 ` Vladimir D. Seleznev 3 siblings, 1 reply; 22+ messages in thread From: Gleb Fotengauer-Malinovskiy @ 2020-04-11 11:04 UTC (permalink / raw) To: ALT Linux Team development discussions; +Cc: vseleznv [-- Attachment #1: Type: text/plain, Size: 1913 bytes --] Hi, On Sat, Apr 11, 2020 at 02:10:42AM +0300, Vladimir D. Seleznev wrote: > > Hi! > > The first part of rebuilt packages optimization for girar. It introduces > pkg_identity() and simple optimization of the rebuilt sourcerpm. Why do we rebuild source rpm at all when we already have one? I mean, when we use hasher with --query-repackage this new rebuilt source rpm is no better then original one. I think we can always save the original source rpm when we rebuild a package or copy it from branch to branch (like we actually do for packages originally built from src.rpm-s). > pkg_identity() takes RPM package and returns a value called package identity, > a hash of subset of RPM package header. That subset is the entire header > without some nonessential artifacts like buildhost, buildtime, header hashsum, > etc. > > The two package builds of the same NEVR might have equal or different > package identities. The equal identities mean that build results of these > packages are equal too, that allows build optimization. The practical > example of simple rebuilt sourcerpm optimization also introduced. Did you consider adding all this identity logic on the rpm's side (as a standalone helper may be)? I personally don't like the whole idea of tracking rpm tags status on girar side. Also, this helper may be useful outside of girar. > The future work can be about optimization of "copied" to another branch > sourcerpm with retrieved from archive sourcerpm, and binary packages > optimization (this case has an issue when binary subpackages are mixed > archs, i.e. arch and noarch, this probably could work only with single-arch > builds). Looks like a good plan. I think optimization of binary packages is more important then optimization which looks for archived packages. We may want to take binary packages from archive too anyway. -- glebfm [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 801 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] RFC: girar: optimize rebuild 2020-04-11 11:04 ` Gleb Fotengauer-Malinovskiy @ 2020-04-11 15:21 ` Vladimir D. Seleznev 2020-04-11 16:41 ` Gleb Fotengauer-Malinovskiy 0 siblings, 1 reply; 22+ messages in thread From: Vladimir D. Seleznev @ 2020-04-11 15:21 UTC (permalink / raw) To: ALT Linux Team development discussions On Sat, Apr 11, 2020 at 02:04:25PM +0300, Gleb Fotengauer-Malinovskiy wrote: > Hi, > > On Sat, Apr 11, 2020 at 02:10:42AM +0300, Vladimir D. Seleznev wrote: > > > > Hi! > > > > The first part of rebuilt packages optimization for girar. It introduces > > pkg_identity() and simple optimization of the rebuilt sourcerpm. > > Why do we rebuild source rpm at all when we already have one? I mean, > when we use hasher with --query-repackage this new rebuilt source rpm is > no better then original one. > > I think we can always save the original source rpm when we rebuild > a package or copy it from branch to branch (like we actually do for > packages originally built from src.rpm-s). I'm sorry, I was not clear. Sure when a package is built from the sourcerpm, no optimization is required in this case as girar saves only original sourcerpm. The different things happen when package is built from the gear. In the case when package is rebuilt from the gear, girar produce new source and binary rpms, and when the rebuilt task is done it saves all these new source and binary rpms. The proposed optimization is aimed for that case. > > pkg_identity() takes RPM package and returns a value called package identity, > > a hash of subset of RPM package header. That subset is the entire header > > without some nonessential artifacts like buildhost, buildtime, header hashsum, > > etc. > > > > The two package builds of the same NEVR might have equal or different > > package identities. The equal identities mean that build results of these > > packages are equal too, that allows build optimization. The practical > > example of simple rebuilt sourcerpm optimization also introduced. > > Did you consider adding all this identity logic on the rpm's side (as a > standalone helper may be)? I personally don't like the whole idea > of tracking rpm tags status on girar side. Also, this helper may be > useful outside of girar. I did, but it's a bit complicated. RPM community likes the idea, but there is no consensus about how it should work. Sure each project can realize it by its own specific way. So, whether we should calculate the package identity in the girar side or the rpm side? If it should be on rpm side, should it support rpm 4.0.4? > > The future work can be about optimization of "copied" to another branch > > sourcerpm with retrieved from archive sourcerpm, and binary packages > > optimization (this case has an issue when binary subpackages are mixed > > archs, i.e. arch and noarch, this probably could work only with single-arch > > builds). > > Looks like a good plan. I think optimization of binary packages is more > important then optimization which looks for archived packages. > We may want to take binary packages from archive too anyway. Ok. -- WBR, Vladimir D. Seleznev ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [devel] RFC: girar: optimize rebuild 2020-04-11 15:21 ` Vladimir D. Seleznev @ 2020-04-11 16:41 ` Gleb Fotengauer-Malinovskiy 0 siblings, 0 replies; 22+ messages in thread From: Gleb Fotengauer-Malinovskiy @ 2020-04-11 16:41 UTC (permalink / raw) To: ALT Linux Team development discussions [-- Attachment #1: Type: text/plain, Size: 1828 bytes --] On Sat, Apr 11, 2020 at 06:21:01PM +0300, Vladimir D. Seleznev wrote: > On Sat, Apr 11, 2020 at 02:04:25PM +0300, Gleb Fotengauer-Malinovskiy wrote: > > Hi, > > > > On Sat, Apr 11, 2020 at 02:10:42AM +0300, Vladimir D. Seleznev wrote: > > > > > > Hi! > > > > > > The first part of rebuilt packages optimization for girar. It introduces > > > pkg_identity() and simple optimization of the rebuilt sourcerpm. > > > > Why do we rebuild source rpm at all when we already have one? I mean, > > when we use hasher with --query-repackage this new rebuilt source rpm is > > no better then original one. > > > > I think we can always save the original source rpm when we rebuild > > a package or copy it from branch to branch (like we actually do for > > packages originally built from src.rpm-s). > > I'm sorry, I was not clear. No, I think you were clear enough. Unlike me, obviously. :) > Sure when a package is built from the sourcerpm, no optimization is > required in this case as girar saves only original sourcerpm. Yes. > The different things happen when package is built > from the gear. In the case when package is rebuilt from the gear, girar > produce new source and binary rpms, and when the rebuilt task is done it > saves all these new source and binary rpms. The proposed optimization is > aimed for that case. Yes, I was talking about this same case. I think we don't need to bother with identity of src.rpm-s at all. If we build two src.rpm-s from two identical pkg.tar-s we still get equivalent src.rpm-s because in --query-repackage mode hasher uses src.rpm as a source archive (buildtime is the only read rpm tag). Moreover, if we do not store rebuilt src.rpm-s we do not need to rebuild it at all, we can use old src.rpm for rebuild and copy. -- glebfm [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 801 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2020-04-27 5:38 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-04-10 23:10 [devel] RFC: girar: optimize rebuild Vladimir D. Seleznev 2020-04-10 23:10 ` [devel] [PATCH 1/2] gb/gb-sh-functions: introduce pkg_identity() Vladimir D. Seleznev 2020-04-13 18:01 ` Dmitry V. Levin 2020-04-13 19:32 ` Vladimir D. Seleznev 2020-04-10 23:10 ` [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo Vladimir D. Seleznev 2020-04-11 11:29 ` Alexey Tourbin 2020-04-14 16:42 ` Vladimir D. Seleznev 2020-04-16 21:51 ` Alexey Tourbin 2020-04-17 13:54 ` Dmitry V. Levin 2020-04-20 9:05 ` [devel] stopping a cascade of rebuilds Alexey Tourbin 2020-04-23 19:21 ` Vladimir D. Seleznev 2020-04-23 20:54 ` Dmitry V. Levin 2020-04-27 5:38 ` Alexey Tourbin 2020-04-20 8:36 ` [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo Alexey Tourbin 2020-04-11 10:36 ` [devel] RFC: girar: optimize rebuild Andrey Savchenko 2020-04-11 15:33 ` Vladimir D. Seleznev 2020-04-11 23:31 ` Alexey V. Vissarionov 2020-04-14 14:57 ` Andrey Savchenko 2020-04-14 16:20 ` Vladimir D. Seleznev 2020-04-11 11:04 ` Gleb Fotengauer-Malinovskiy 2020-04-11 15:21 ` Vladimir D. Seleznev 2020-04-11 16:41 ` Gleb Fotengauer-Malinovskiy
ALT Linux Team development discussions This inbox may be cloned and mirrored by anyone: git clone --mirror http://lore.altlinux.org/devel/0 devel/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 devel devel/ http://lore.altlinux.org/devel \ devel@altlinux.org devel@altlinux.ru devel@lists.altlinux.org devel@lists.altlinux.ru devel@linux.iplabs.ru mandrake-russian@linuxteam.iplabs.ru sisyphus@linuxteam.iplabs.ru public-inbox-index devel Example config snippet for mirrors. Newsgroup available over NNTP: nntp://lore.altlinux.org/org.altlinux.lists.devel AGPL code for this site: git clone https://public-inbox.org/public-inbox.git