Optimizing Encrypted Email Overlap Calculation with Jaxite #23

@bllmo

I'm using jaxite to encrypt SHA-256 hashes of email addresses, and I want to calculate the overlap between two lists of emails, company_emails and partner_emails. My current approach compares the encrypted hash arrays bit by bit, which seems inefficient: I have to compare 256 * 881 elements to determine which entries match.

Here's a snippet of the code for clarity:

company_emails = [
    "fabrizio@example.com",
    "filippo@example.com",
]
partner_emails = [
    "fabrizio@example.com",
    "fabio@example.com",
]

company_emails_hashes = [email_to_sha256_bytes(email) for email in company_emails]
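partner_emails_hashes = [email_to_sha256_bytes(email) for email in partner_emails]


def encrypt_bits(byte_array, client_key_set, lwe_random_number_generator):
    # Encrypts one element at a time; each element of byte_array is expected
    # to be a single bit of the hash, so a SHA-256 hash yields 256 ciphertexts.
    return [
        jaxite_bool.encrypt(bit, client_key_set, lwe_random_number_generator)
        for bit in byte_array
    ]


def encrypt_hashes(hashes, client_key_set, lwe_random_number_generator):
    # Encrypts every hash in the list, producing one list of ciphertexts per hash.
    return [
        encrypt_bits(hash_bits, client_key_set, lwe_random_number_generator)
        for hash_bits in hashes
    ]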

company_emails_encrypted = encrypt_hashes(
    company_emails_hashes, client_key_set, lwe_random_number_generator
)
partner_emails_encrypted = encrypt_hashes(
    partner_emails_hashes, client_key_set, lwe_random_number_generator
)


def are_encrypted_emails_identical(
    company_emails_encrypted,
    partner_emails_encrypted,
    server_keys,
    client_keys,
    encryption_params,
):
    # _xor ?
    pass

are_encrypted_emails_identical(
    company_emails_encrypted[0],
    partner_emails_encrypted[0],
    server_key_set,
    client_key_set,
    parameters,
)

Is there a more efficient way to compare these encrypted email hashes without resorting to bit-by-bit comparison? Am I misunderstanding something, or is there a better approach available within the jaxite library?
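
To make the cost of the bit-by-bit approach concrete, here is a minimal plaintext sketch of the equality circuit in question: one XNOR per bit position, followed by an AND reduction arranged as a balanced tree. The names email_to_sha256_bits, bits_equal, plain_xnor and plain_and are hypothetical and introduced only for illustration; the gates are passed in as parameters so that homomorphic gates could in principle be substituted for the plaintext stand-ins, which is an assumption and is not verified against jaxite here.

```python
import hashlib
from typing import Callable, List, Sequence, TypeVar

Bit = TypeVar("Bit")  # a plaintext bool here; would be an encrypted bit under FHE


def email_to_sha256_bits(email: str) -> List[bool]:
    """Hypothetical helper: SHA-256 digest of the email as 256 bools, MSB first."""
    digest = hashlib.sha256(email.encode("utf-8")).digest()
    return [bool((byte >> shift) & 1) for byte in digest for shift in range(7, -1, -1)]


def bits_equal(
    a: Sequence[Bit],
    b: Sequence[Bit],
    xnor: Callable[[Bit, Bit], Bit],
    and_: Callable[[Bit, Bit], Bit],
) -> Bit:
    """Return a single bit that is true iff a and b agree at every position."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    # One XNOR per position: true where the two bits match.
    layer = [xnor(x, y) for x, y in zip(a, b)]
    # AND the matches together as a balanced tree, so gates within a layer
    # are independent of each other and the depth is about log2(n).
    while len(layer) > 1:
        nxt = [and_(layer[i], layer[i + 1]) for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:
            nxt.append(layer[-1])  # odd element carries over to the next layer
        layer = nxt
    return layer[0]


# Plaintext stand-ins for the homomorphic gates, used only to check the circuit.
def plain_xnor(x: bool, y: bool) -> bool:
    return x == y


def plain_and(x: bool, y: bool) -> bool:
    return x and y


if __name__ == "__main__":
    a = email_to_sha256_bits("fabrizio@example.com")
    b = email_to_sha256_bits("fabio@example.com")
    print(bits_equal(a, a, plain_xnor, plain_and))
    print(bits_equal(a, b, plain_xnor, plain_and))
```

For a 256-bit hash this circuit uses 256 XNOR gates and 255 AND gates per pair of emails. The tree layout does not reduce the gate count; it only shortens the dependency chain, so the comparison is still bit by bit.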
