Does anybody know anything about deduplication at the #InternetArchive? People quite frequently upload the same files as independent items, which could be prevented by checking new uploads against the checksums of files that are already there.
See, for example, https://archive.org/details/Om-Alqura, which seems to be an exact copy of https://archive.org/details/1320_20220730
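
To make the checksum idea concrete, here is a minimal local sketch in Python. It is not how the Internet Archive works internally (I don't know that); it only shows the general technique of grouping files by digest. The helper names `file_digest` and `find_duplicates` are made up for illustration, and SHA-256 is my own choice of hash.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Group paths by digest; keep only digests shared by 2+ files."""
    groups = defaultdict(list)
    for p in paths:
        groups[file_digest(p)].append(Path(p))
    return {digest: ps for digest, ps in groups.items() if len(ps) > 1}
```

Calling `find_duplicates` on the files of two downloaded items would return one entry per set of byte-identical files. Note that this only catches exact copies: a re-encoded or re-scanned version of the same work would get a different digest.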