Efficiently decrypting vectors of GPS coordinates #24

Open
koenniem opened this issue Dec 2, 2022 · 0 comments

I am working with a fairly large dataset containing GPS coordinates encrypted with sodium in another programme. Now I need to decrypt them for analysis, but I am not sure how to do so efficiently. Please see the example below to see how I am currently decrypting data.

library(sodium)

# Create some fake GPS coordinates
data <- replicate(
  n = 400000,
  expr = paste0(
    sample(0:50, size = 1), ".", 
    paste0(sample(0:9, size = 14, replace = TRUE), collapse = "")
  )
)

# Generate keypair
key <- keygen()
pub <- pubkey(key)

# Encrypt message with pubkey
# Efficiency doesn't matter here
# For some reason, serialize doesn't work for my data
msg <- lapply(data, charToRaw) 
ciphertext <- lapply(msg, function(x) simple_encrypt(x, pub))
ciphertext <- lapply(ciphertext, bin2hex)

# Now for decrypting
# How can this be done faster?
out <- lapply(ciphertext, hex2bin)
out <- lapply(out, simple_decrypt, key = key)
out <- lapply(out, rawToChar)
out <- unlist(out)
identical(out, data)
#> [1] TRUE

Created on 2022-12-02 with reprex v2.0.2

Two steps slow the process down:

  1. sodium::hex2bin() only accepts one value.
  2. sodium::simple_decrypt() only accepts one value.

Running hex2bin() on the encrypted data takes about 7 seconds on my machine, and decrypting takes about 35 seconds. Please keep in mind that this is just an example; on real data, I would have to repeat this process many times. I do not know much about encryption, so I cannot judge whether these timings are inherently fast or slow. However, collapsing the ciphertext into a single string (using paste()) and then running hex2bin() once on that string is significantly faster than calling it per element.
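To make the collapsing idea concrete, here is a minimal sketch of decoding all hex strings with a single hex2bin() call and then cutting the resulting raw vector back into one piece per ciphertext. The helper name hex2bin_many() is hypothetical (it is not part of sodium), and the sketch assumes every element is a non-empty hex string with no separator characters, so that each byte corresponds to exactly two characters.

```r
library(sodium)

# Hypothetical helper, not part of sodium: decode many hex strings at once.
# Assumes each element is a non-empty hex string without separators.
hex2bin_many <- function(hex) {
  hex <- unlist(hex, use.names = FALSE)

  # Two hex characters encode one byte
  n_bytes <- nchar(hex) / 2

  # Decode everything in a single call
  raw_all <- hex2bin(paste(hex, collapse = ""))

  # Cut the long raw vector back into one raw vector per input string
  ends <- cumsum(n_bytes)
  starts <- ends - n_bytes + 1
  Map(function(s, e) raw_all[s:e], starts, ends)
}
```

With the example above, `out <- hex2bin_many(ciphertext)` would stand in for `lapply(ciphertext, hex2bin)`. This only addresses the hex decoding step; each element still has to go through simple_decrypt() individually.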

In an ideal world, you'd run vectorized functions:

ciphertext |>
  hex2bin() |>
  simple_decrypt(key = key) |>
  rawToChar()

However, this is not possible with sodium. Is there something wrong with my approach, or is this how one works with large vectors of data?
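For illustration, the pipeline above can be approximated with a thin wrapper that applies the scalar functions element by element and, optionally, spreads the work over several cores. This is only a sketch under assumptions: decrypt_hex() and its cores argument are hypothetical names (not part of sodium), parallel::mclapply() relies on forking and therefore only supports cores > 1 on Unix-like systems, and whether it actually helps depends on the machine and the data.

```r
library(sodium)
library(parallel)

# Hypothetical wrapper, not part of sodium: accepts a vector (or list) of
# hex-encoded ciphertexts and returns a character vector of plaintexts.
decrypt_hex <- function(hex, key, cores = 1L) {
  hex <- unlist(hex, use.names = FALSE)

  # Full round trip for a single element
  decrypt_one <- function(h) rawToChar(simple_decrypt(hex2bin(h), key))

  # With cores = 1 this behaves like lapply(); cores > 1 requires a
  # Unix-like system because mclapply() forks the R process
  res <- mclapply(hex, decrypt_one, mc.cores = cores)
  unlist(res, use.names = FALSE)
}
```

With the example above, `out <- decrypt_hex(ciphertext, key)` would replace the three lapply() calls and the final unlist(). The wrapper does not make the individual sodium calls any faster; it only does the whole round trip in one pass per element and allows the elements to be processed in parallel.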

@koenniem koenniem changed the title Efficiently decrypting vectors. Efficiently decrypting vectors of GPS coordinates Dec 2, 2022