From: Alexey Tourbin
Date: Mon, 20 Apr 2020 11:36:00 +0300
To: ALT Linux Team development discussions
Subject: Re: [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo
References: <20200410231044.1436970-1-vseleznv@altlinux.org> <20200410231044.1436970-3-vseleznv@altlinux.org> <20200414164244.GC618226@portlab>

On Fri, Apr 17, 2020 at 12:51 AM Alexey Tourbin wrote:
> So for src.rpm packages, it's a solved problem. For binary packages,
> the identity should specifically exclude disttag. It will no longer
> satisfy the definition of ID for rpm (substitution will break for
> subpackages with strict dependencies). Therefore for binary packages,
> we need to track tuples. This is a one-to-many relation:
> for each ID, there may be a few disttags. So for binary packages we
> need a separate identity-addressable storage which maps ID to
> (while for source packages, a hardlink maps ID to
> filehash). If implemented naively, this will create many small files,
> one file per ID, most files with just one line. In a more practical
> implementation, you should probably group all those small files by
> package name. So you'll have:
>
> $ cat id2f/libfoo
>
>
>
> $ cat id2f/foo-data
>
>
>
> Note that for libfoo, the IDs are different, but with foo-data the IDs
> are the same. This indicates that the contents of libfoo have changed
> after a rebuild, while the contents of foo-data have not.

It may even make sense to group the mappings by src.rpm name instead
of package name. At first it seems less intuitive, but in return it
can give you a consistent view similar to an MVCC snapshot. Of course,
these files should be updated atomically, with rename(2).
To check a set of subpackages, you first need to copy the file to a
local dir. This should rule out the case in which some subpackages
have been added to the file and some have not.

These files are to be updated during the task-commit stage, under the
exclusive lock. This is also the right moment to detect race
conditions. Suppose you build the same package for sisyphus and p9 in
parallel, and the build result is the same. Before adding new
packages, you recheck if the whole set can be replaced with the
already existing packages. One of the two tasks should then fail (or
be automatically scheduled for another iteration).
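
To make the update protocol concrete, here is a rough sketch in Python
rather than in gb's own shell. It is only an illustration under
assumptions: the function names (write_atomically, snapshot, commit),
the separate lock file, and the one-mapping-per-line file format are
all hypothetical, and the mapping lines are treated as opaque strings.
It shows the three pieces together: replacing the file with rename(2),
reading from a local copy, and rechecking under the exclusive lock at
commit time.

```python
import fcntl
import os
import shutil
import tempfile


def write_atomically(path, lines):
    """Replace `path` so that readers see the old or the new file, never a mix."""
    dirname = os.path.dirname(path) or "."
    # The temp file must live on the same filesystem for rename(2) to be atomic.
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as f:
            f.writelines(line + "\n" for line in lines)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # rename(2) on POSIX
    except BaseException:
        os.unlink(tmp)
        raise


def snapshot(path, workdir):
    """Copy the mapping file to a local dir and return its lines (hypothetical format)."""
    local = os.path.join(workdir, os.path.basename(path))
    try:
        shutil.copyfile(path, local)
    except FileNotFoundError:
        return []  # no mappings recorded yet for this src.rpm
    with open(local) as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def commit(path, lock_path, seen, new_lines, workdir):
    """Append `new_lines` unless the file changed since the task saw `seen`.

    Returns True on success and False if another task committed first,
    in which case the caller should fail or go for another iteration.
    """
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # released when the lock file is closed
        current = snapshot(path, workdir)
        if current != seen:
            return False  # race detected: recheck against the new contents
        write_atomically(path, current + new_lines)
        return True
```

With two tasks that both started from an empty file, the first commit
succeeds and the second one returns False; the second task can then
take a fresh snapshot and decide whether its packages can be replaced
with the ones that are already there.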