Input/Output/State Encryption/Decryption protocol

For the most part, our goal is to keep the input/output encryption protocol built for Discovery largely unchanged. The benefits are obvious - that code exists, and it has been audited. There’s one relatively simple issue to resolve, which is replacing the secp256k1 library, as it’s not known to be constant-time.

Initialization

We assume validators start with the following shared keys. Generating these keys is described in this post:

  1. (sk_io, pk_io) --> a shared key-pair for deriving input/output keys. pk_io is available on-chain, and sk_io is shared across all validators’ enclaves.
  2. master_state_key --> a symmetric master key used to derive other state keys.
  3. master_iv --> an IV seed used to generate fresh pseudo-random IVs for both encrypted outputs and state encryption.

Input Encryption/Decryption

As usual, a secret contract execution should enable users to encrypt their inputs, which will only ever be decrypted inside of secure enclaves. Therefore, this protocol is part of a larger user tx that should trigger a secret contract. A sketch of the protocol proceeds as follows (I included some key pieces from the larger computation protocol, but it’s not meant to be complete - just to provide context):

User:

  1. User generates an ephemeral secp256k1 key-pair. I’d recommend we keep using the secp256k1 curve since, again, that code has been implemented and audited.

  2. Derives a symmetric key by combining the validators’ shared public key (pk_io) with the ephemeral secret key (a simple EC multiplication - basically Ephemeral DH, as we have now; see the sketch after this list).

  3. Symmetrically encrypts/authenticates the transaction inputs and the secret contract function, using AES-256-GCM (again, already implemented).

  4. Creates a compute tx with the payload (contract_address, enc_func, enc_inputs, ephemeral_pubkey) and sends it to the Enigma chain.
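
For concreteness, here’s a minimal sketch of the user-side steps above (Python, using the cryptography package). The stand-in values, the HKDF info label, and the IV-prepended ciphertext layout are illustrative assumptions, not the actual Discovery wire format:

```python
import os
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Stand-ins for illustration only: in reality sk_io lives in the enclaves and
# pk_io is read from the chain.
sk_io = ec.generate_private_key(ec.SECP256K1())
pk_io_bytes = sk_io.public_key().public_bytes(
    serialization.Encoding.X962, serialization.PublicFormat.CompressedPoint)
func = b"transfer"
inputs = b'{"amount": 10}'

# 1. Ephemeral secp256k1 key-pair.
eph_sk = ec.generate_private_key(ec.SECP256K1())
ephemeral_pubkey = eph_sk.public_key().public_bytes(
    serialization.Encoding.X962, serialization.PublicFormat.CompressedPoint)

# 2. ECDH against the validators' shared public key, then a KDF to turn the
#    shared secret into a 256-bit symmetric key.
pk_io = ec.EllipticCurvePublicKey.from_encoded_point(ec.SECP256K1(), pk_io_bytes)
shared_secret = eph_sk.exchange(ec.ECDH(), pk_io)
ephemeral_secret = HKDF(algorithm=hashes.SHA256(), length=32,
                        salt=None, info=b"io-encryption").derive(shared_secret)

# 3. AES-256-GCM encrypt/authenticate the function selector and the inputs,
#    each under its own fresh 96-bit IV (prepended to the ciphertext).
aead = AESGCM(ephemeral_secret)
iv_func, iv_inputs = os.urandom(12), os.urandom(12)
enc_func = iv_func + aead.encrypt(iv_func, func, None)
enc_inputs = iv_inputs + aead.encrypt(iv_inputs, inputs, None)

# 4. The compute tx payload would then carry roughly:
#    (contract_address, enc_func, enc_inputs, ephemeral_pubkey)
```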

Validators:

  1. Receive the tx, see that it’s a compute call - forward it to the enclave for processing.
  2. Derive the same symmetric key (let’s call it ephemeral_secret) by applying the same EC multiplication to the shared secret key and ephemeral_pubkey (see the sketch after this list).
  3. Decrypt (func, inputs) = decrypt(enc_func, enc_inputs, ephemeral_secret) using the key generated in #2.
  4. Execute contract::func(inputs).
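
And the matching enclave-side sketch, under the same assumptions (and the same IV-prepended ciphertext layout) as the user-side sketch above:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def derive_ephemeral_secret(sk_io: ec.EllipticCurvePrivateKey,
                            ephemeral_pubkey: bytes) -> bytes:
    # Step 2: the same EC multiplication, this time between sk_io (held only
    # inside the enclave) and the user's ephemeral_pubkey.
    eph_pk = ec.EllipticCurvePublicKey.from_encoded_point(ec.SECP256K1(),
                                                          ephemeral_pubkey)
    shared_secret = sk_io.exchange(ec.ECDH(), eph_pk)
    return HKDF(algorithm=hashes.SHA256(), length=32,
                salt=None, info=b"io-encryption").derive(shared_secret)

def open_box(key: bytes, box: bytes) -> bytes:
    # Step 3: split off the prepended 96-bit IV, then decrypt and authenticate.
    iv, ciphertext = box[:12], box[12:]
    return AESGCM(key).decrypt(iv, ciphertext, None)

def handle_compute_tx(sk_io, enc_func, enc_inputs, ephemeral_pubkey):
    ephemeral_secret = derive_ephemeral_secret(sk_io, ephemeral_pubkey)
    func = open_box(ephemeral_secret, enc_func)
    inputs = open_box(ephemeral_secret, enc_inputs)
    return func, inputs  # step 4: execute contract::func(inputs) in the enclave
```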

Encrypted outputs

Encrypted outputs (where the output is encrypted with the sender’s key) could be handled in largely the same way. We can use the same ephemeral_secret to encrypt the output (which is stored as part of the contract’s state), and only the sender will be able to decrypt it. This is generally how Discovery handles it today.

The challenge here is that AES-256-GCM, like most encryption schemes, is randomized (i.e., executing it twice on the same message with the same key will lead to different results). Since, unlike Discovery, we now have multiple validators per computation, each validator will end up with a different result, and they will not be able to reach consensus (consensus is reached on a deterministic output).

To solve this, one easy option is for the validators to generate a deterministic IV, for example by taking the hash of the tx. This is extremely insecure in cases where there’s key reuse, but given that the protocol above generates a fresh key for each execution, this is probably okay.

Still, users may fail to create fresh keys for every execution, or this may become undesirable in time. Also, I might be missing something in my analysis, so I’m proposing an alternative protocol that is slightly more complicated:

Assume that the validators also share a random master_iv. Using that and a KDF, they can generate a unique IV for each transaction by taking: iv = KDF(master_iv || tx_hash). I recommend using HKDF, which is already implemented in Ring, since we already modified it to fit inside SGX.
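
A minimal sketch of that derivation, assuming HKDF-SHA256 (the cryptography package is used here purely for illustration in place of Ring, with master_iv as the input keying material and tx_hash as the info field):

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def derive_iv(master_iv: bytes, tx_hash: bytes) -> bytes:
    # iv = KDF(master_iv || tx_hash), truncated to AES-GCM's 96-bit IV size.
    # Every validator computes the same value, so the encrypted output is
    # deterministic and consensus can be reached on it.
    return HKDF(algorithm=hashes.SHA256(), length=12,
                salt=None, info=tx_hash).derive(master_iv)
```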

The above protocol needs to be validated. We aren’t doing much here, but more eyes are needed so if anyone has feedback - please share!

One thing to mention is that by using randomized IVs, I’m assuming that the key won’t be reused A LOT. Specifically, with a 96-bit IV, there’s meaningful risk of a collision after 2^48 encryptions (Birthday paradox), so we definitely need to stay significantly below that threshold.

Encrypted State

In light of the above, the current assessment is that getting an encrypted state should not be difficult, assuming we avoid writing a complicated serialization/deserialization protocol (e.g., like the encrypted deltas system that we wrote for Discovery). To be fair, that protocol was mostly useful for really large states, but as we’ve seen, this is an unlikely product requirement and would require a lot of effort. In other words, it’s an optimization and if it’s needed - we can evaluate it over time.

The simplest solution to get to an encrypted state, and assuming we have a public state which allows for arbitrary values, is to allow selective encryption of specific state dict entries (i.e., the key is public and the value is encrypted). If a developer wants to encrypt the WHOLE state, then they should have a single attribute called ‘state’ (or whatever) and fully encrypt a state dictionary as the value.

For this purpose, the validators should also share a symmetric master_state_key as described in the beginning of this post. With that key, they can use HKDF in the same way to get a fresh deterministic state key for each transaction:

state_key := HKDF(master_state_key || tx_hash)
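
As a sketch, and reusing the same assumptions as above (deterministic per-tx IVs derived from master_iv, the cryptography package standing in for Ring), encrypting a single state value could look roughly like this:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def _hkdf(master_key: bytes, tx_hash: bytes, length: int) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=length,
                salt=None, info=tx_hash).derive(master_key)

def encrypt_state_value(master_state_key: bytes, master_iv: bytes,
                        tx_hash: bytes, value: bytes) -> bytes:
    # state_key := HKDF(master_state_key || tx_hash), fresh per transaction.
    state_key = _hkdf(master_state_key, tx_hash, 32)
    # Deterministic per-tx IV, same construction as for encrypted outputs.
    iv = _hkdf(master_iv, tx_hash, 12)
    return iv + AESGCM(state_key).encrypt(iv, value, None)

# Selective encryption: the dict key stays public, only the value is encrypted:
# state["balances"] = encrypt_state_value(master_state_key, master_iv, tx_hash, plaintext)
```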


Also wait why is the consensus being forged on ciphertexts not cleartexts

The consensus is reached on on-chain data, and we cannot write cleartexts on-chain (it wouldn’t be private anymore).

Perhaps just hashes of the cleartext + random salt, then? (or of the cleartext, assuming we don’t care about CPA security/there’s a wide message space)

I.e., one validator posts ‘E(salt) || hash(salt||result)’, and the others can reach consensus on the hash(salt||result) term.

And yes, I know the salt is a nonce more than a salt here.

This is less secure (definitely no CPA security, but in general I think it leaks too much data potentially). Also, storing the encrypted output on-chain allows the user to easily read it.

Also #2, you can’t do encrypted state without using the shared pseudo-randomness trick we’re suggesting.

Ahhh, alright - in that case this works (assuming you use the master IV to rotate to new IVs before the birthday paradox kills you, and you don’t care about a single compromised SGX eating the whole system).


Hi @guy!

I know I posted this already in the other thread, but here is where it belongs. If I don’t get things horribly wrong, you can use a 96-bit counter for AES-256-GCM nonces. This is explicitly covered by NIST SP 800-38D: 8.2.1 Deterministic Construction. You can apply this with a 0-bit context (you only need one context) and have the whole 96 bits for your counter. I’m not 100% sure which limit applies in 8.3 Constraints on the Number of Invocations, but you get either 2^32 or 2^96 possible invocations.
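
For illustration, a minimal sketch of that deterministic construction (how the counter would be sourced and kept in sync across enclaves is exactly the open question discussed below):

```python
def counter_nonce(invocation_counter: int) -> bytes:
    # NIST SP 800-38D, 8.2.1, with a 0-bit fixed field: all 96 bits carry the
    # invocation counter, which must never repeat under a given key.
    assert 0 <= invocation_counter < 2 ** 96
    return invocation_counter.to_bytes(12, "big")
```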


Hey Simon! Great to see you here :slight_smile:.

You’re perfectly correct. The reason we didn’t take the counter direction is that we weren’t sure nodes in Tendermint would remain implicitly synced no matter what.

I know in Tendermint there are no forks, but what happens if a proposer goes offline mid-way and a view-change is triggered? Couldn’t it be the case that two nodes in the network end up not running exactly the same computations? For example, imagine block B is proposed but never committed, and block B’ is committed instead. Then, can’t it be the case that:

Node 1: Executes block B, realizes there’s no consensus on it, executes block B’.

Node 2: Executes block B’ directly.

Since enclaves don’t know (and can’t trust) the outside world, if we had an internal counter, Node 1 and Node 2 would become permanently unsynchronized.

I don’t know if what I mentioned is even possible. If it isn’t, and there’s NO WAY for different nodes (taking into account byzantine faults) to advance the internal counter in a different order, then I’d agree it’s a better solution.


Why not just use block height as your counter nonce then?

Block height is also somewhat controllable by the tx sender (they could choose when to send the tx).

That shouldn’t matter for CPA security as long as it’s not reusable? The worry is posting bad transactions that get rolled back, I guess.

Good catch. While you can be sure that those conflicting computations will not make it into the blockchain, you cannot be sure they don’t make it into the internet.

The danger is when B and B’ use

  1. the same encryption key
  2. the same IV (e.g. the pair (height, transaction index))
  3. different messages (i.e. different contract execution transactions)

This is an unlikely thing to happen, but I don’t see any guarantees here.

Why is that a problem if your desired property is uniqueness, not randomness? In every counter-based system you can control the IV (to some degree).


If the sender is a full node and is malicious, they can calculate the tx locally and not broadcast it if the result isn’t favorable to them.

The IV must be unique for any given key. The encryption key is only available in the enclave. So other than the active validators’ enclaves, nobody can execute the tx.

But yeah, if a validator is the block proposer and the transaction sender at the same time and doesn’t like the result of the calculation, it can be censored.


Exactly. Although for simplicity we are currently considering sharing the private keys with every full node.
This will probably be simpler to implement and help with data decentralization and network health.
The downside is a bigger attack surface.


I think contract_address should be encrypted, because when contract_address isn’t encrypted, anyone can trace the usage history of a secret contract (even though they can’t know which function and inputs the user executed), and some exchanges can connect an address to their user.
(Depending on the type of application the address uses, the user could be banned by the exchanges, or pursued by regulatory authorities.)

For the same reason, the amount of gas consumed should be encrypted.

Are there technical reasons why contract_address is not encrypted?


contract_address could be considered one of the inputs, and we are thinking of ways to encrypt that too, but this is more complex than other kinds of inputs.

The reason is that we need to load the code attached to this address from the blockchain. This happens outside of SGX. If all users know the encrypted contract_address, then this is effectively not private.

If the user encrypts the public contract_address, then we will have to decrypt the address inside SGX to find the public contract_address, map it to the encrypted store key in which the code is stored on-chain, exit SGX to load the encrypted code, pass it to SGX again, and decrypt the code inside SGX. Any other way leaks private information.

This is way more complex than our original plan for the first iteration of secret contracts, and will probably add considerable development time. Also, this will make verifying contracts’ source code very hard, if not impossible, because now the code and the code’s location must be private as well.

This is a great feature suggestion and I’d love to be wrong about everything I wrote above because ultimately we want every single bit on our chain to be private by default. :blush:
