From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 5 May 2021 16:10:04 +0300
From: "Dmitry V. Levin"
To: devel@lists.altlinux.org
Message-ID: <20210505131003.GB18368@altlinux.org>
References: <20210501064431.C5F099A456D@gyle.altlinux.org>
 <20210501090437.bp7fx33thclcgman@example.org>
 <079b855b-b0d0-b0e6-2da3-4b31f4b4ab1f@basealt.ru>
 <20210504104532.bd2o2qmcgmves4wl@example.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Subject: Re: [devel] hash collision in rpm
X-BeenThere: devel@lists.altlinux.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: ALT Linux Team development discussions
List-Id: ALT Linux Team development discussions
X-List-Received-Date: Wed, 05 May 2021 13:10:04 -0000

On Wed, May 05, 2021 at 08:40:27AM +0300, Alexey Tourbin wrote:
[...]
> By the way, there's a clever way to detect collisions at a higher
> level, even though low-level collisions are unavoidable, subject to
> implacable math. We can take advantage of the fact that the build
> system synchronizes package builds across a few architectures.
> Currently, if a missing symbol goes undetected due to a hash
> collision, it does so on all architectures simultaneously. To change
> that, we need to hash symbols differently, depending on the target
> architecture (we can pass seed=hash(arch) to the hash function, or use
> different initialization vectors IV=hash(arch)). The desired outcome
> is that when a missing symbol goes undetected, it does so, with an
> overwhelming probability, on only one target. Roughly speaking, if
> the probability of an undetected missing symbol is 10^-3 on a single
> target, it must be 10^-6 on two targets, 10^-9 on three targets, etc.

This looks very promising indeed.
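To make the quoted idea concrete, here is a toy sketch in Python. It is
not how rpm's set-versions are actually encoded: keyed BLAKE2b stands in
for whatever seeded hash would really be used, the 10-bit hash width is
deliberately tiny so that collisions are easy to observe, and all
function names are hypothetical. It only illustrates that a missing
symbol must collide on every architecture at once to slip through, so
that, if the per-arch hashes behave independently, a per-target miss
probability p becomes roughly p^k over k targets.

```python
import hashlib


def sym_hash(symbol: str, arch: str, bits: int = 10) -> int:
    """Hash a symbol name to `bits` bits, keyed by the target architecture.

    Assumption: keyed BLAKE2b is a stand-in for seed=hash(arch).
    """
    digest = hashlib.blake2b(
        symbol.encode(), digest_size=8, key=arch.encode()
    ).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)


def undetected(missing: str, provided, arch: str, bits: int = 10) -> bool:
    """True if the symbol's hash is found among the provided hashes on `arch`.

    For a symbol that is not really provided, this is a hash collision,
    i.e. the missing symbol goes unnoticed on this architecture.
    """
    provided_hashes = {sym_hash(s, arch, bits) for s in provided}
    return sym_hash(missing, arch, bits) in provided_hashes


def undetected_everywhere(missing: str, provided, arches, bits: int = 10) -> bool:
    """True only if the missing symbol collides on every architecture."""
    return all(undetected(missing, provided, a, bits) for a in arches)


def count_undetected(candidates, provided, arches, bits: int = 10) -> int:
    """Count the candidates that would slip through on all given arches."""
    return sum(
        undetected_everywhere(c, provided, arches, bits) for c in candidates
    )


if __name__ == "__main__":
    provided = [f"sym_{i}" for i in range(50)]
    candidates = [f"missing_{i}" for i in range(20000)]
    for arches in (["x86_64"], ["x86_64", "aarch64"]):
        n = count_undetected(candidates, provided, arches)
        print(arches, n, "of", len(candidates), "missing symbols undetected")
```

With a single shared seed, the second architecture would add nothing: a
symbol that collides on one target collides on all of them, which is the
current situation described in the quoted text.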
I suppose the gain would justify the necessary rebuilding of all
packages with set-versions in their arch-specific dependencies.

-- 
ldv