Non-deterministic execution (Randomized functionalities)

This was asked on Telegram and previously on the forum, so it’s worth addressing how we’ve been thinking about this situation.

Generally speaking, consensus protocols allow multiple nodes/actors in a network to agree on the result of a certain computation. However, this becomes tricky when the computation's result depends on some randomness. This is known as non-deterministic execution (or randomized functionalities), and currently, most blockchains, Ethereum included, simply do not support it.

To further explain what randomized functionalities are, we can work with a simple example, such as the function x_plus_random(x), that takes an input x and returns x+r, where r is generated randomly at every invocation of the function. Now imagine we have a network of nodes. If we ask each node to independently run that function, each of them would likely obtain a very different result, because r changes from one call to another.
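As a minimal sketch (Python; the 32-bit range for r is an arbitrary choice for illustration), two nodes invoking the same function independently will almost certainly disagree:

```python
import secrets

def x_plus_random(x: int) -> int:
    """Return x plus a value r generated freshly at every invocation."""
    r = secrets.randbelow(2**32)  # r changes from one call to the next
    return x + r

# Two nodes running the same call independently will almost certainly
# obtain different results, so there is no single output for the
# network to agree on.
node_a = x_plus_random(7)
node_b = x_plus_random(7)
```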

Given this, how can the network agree on a single outcome? Who’s to determine what the ‘right’ r should be for a certain function call, and how do we communicate that to the rest of the network?

For the current release of our test-net, we randomly select a single worker to execute each computation. With the assumption that TEEs cannot be tampered with, we can support randomized executions natively, since there’s no need to reach consensus on an execution with other workers (each one works on a different problem, and Ethereum only validates the remote attestation proof).

However, working with a model of a single worker per computation has many disadvantages when it comes to availability, safety and performance. This is therefore one of the things we’re working to address in the next release of Discovery. Instead of a single worker, we plan to randomly sample a committee of workers for each secret contract and each epoch (an epoch is several blocks long, potentially many), which is more in line with how the original Enigma whitepaper worked.

Unfortunately, having each computation executed by a group instead of a single node brings us back to the consensus problem, which requires deterministic execution. To overcome this, here are a couple of proposed solutions:

  1. Before the start of an epoch, share with each of the workers assigned to a committee a unique random seed. This seed could then be used in a PRG to generate as much randomness as the computations in the epoch require, in a way that doesn’t require any further communication between the workers.

Advantages:
a. Easy to implement. Likely the first approximation of a solution which will be used in Discovery.
b. Negligible performance overhead.

The main disadvantage is that if even one TEE is compromised during the epoch (which is hard, but not impossible), the attacker can fully predict the randomness for the rest of that epoch. This is a potential concern for applications like lotteries that have a lot of money at stake.
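A rough sketch of approach #1, assuming a SHA-256 counter-mode construction as the PRG (the actual PRG used in Discovery may differ): every committee member derives the same stream of values from the shared seed, with no communication between workers.

```python
import hashlib

def epoch_prg(seed: bytes, counter: int) -> bytes:
    """Derive the counter-th random value of the epoch from the shared seed."""
    return hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()

# The seed is shared with the committee before the epoch starts.
seed = b"example-epoch-seed"

# Each worker independently generates as much randomness as the epoch's
# computations require, and all workers obtain identical values.
worker_1 = [epoch_prg(seed, i) for i in range(3)]
worker_2 = [epoch_prg(seed, i) for i in range(3)]
assert worker_1 == worker_2  # identical randomness, hence agreement
```

This also illustrates the stated disadvantage: anyone who learns `seed` can predict every value for the rest of the epoch.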

  2. Implement a coin-tossing protocol - this is essentially an MPC protocol that allows multiple parties to agree on shared randomness.

Advantages:
a. It practically solves (or at least greatly mitigates) the attack above, especially if the random value is revealed to the parties only at the time of the computation.
b. It’s cool :wink: .

Disadvantages:
a. Longer development time. Won’t be ready in time for Discovery.
b. Performance - coin-tossing, like consensus, requires interaction between the workers which would slow down computations. The best way to mitigate this would be to pre-process the shared randomness, which would create other development challenges.
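For intuition, here is a toy commit-reveal coin toss in Python, one simple instance of a coin-tossing protocol (the names are illustrative; this is not necessarily the protocol that would be deployed). Each party commits to its random share before any share is revealed, so no party can bias the result; the agreed value is the XOR of all shares.

```python
import hashlib
import secrets

def commit(share: bytes, nonce: bytes) -> bytes:
    """Hash-based commitment: binds a party to its share before reveal."""
    return hashlib.sha256(nonce + share).digest()

# Commit phase: each party publishes only its commitment.
parties = []
for _ in range(3):
    share, nonce = secrets.token_bytes(32), secrets.token_bytes(16)
    parties.append((share, nonce, commit(share, nonce)))

# Reveal phase: check every opening against its commitment, then XOR
# all shares into the agreed-upon randomness.
assert all(commit(s, n) == c for s, n, c in parties)
coin = bytes(a ^ b ^ c for a, b, c in zip(*(p[0] for p in parties)))
```

Note that plain commit-reveal lets the last party abort after seeing the others' shares; production protocols address this (e.g. via penalties or threshold techniques), and the reveal round is the interaction that motivates pre-processing the shared randomness.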

The working plan right now is to go with #1 for Discovery, and #2 for the next release when MPC is available. #2 will likely be broken down to several smaller milestones.

6 Likes

You can mitigate the performance disadvantage of MPC without compromising security by having nodes that aren’t currently active perform the coin tossing and then pass the randomness off to others – assuming the network isn’t at 100% computational capacity, the parallelism should mean you don’t actually get any slowdown.

1 Like

Yes. That’s what I meant by:

The best way to mitigate this would be to pre-process the shared randomness

Pre-processing could (and actually would have to) occur on different nodes, as it happens before the committee is even selected.

Thoughts on implementing DFINITY’s random beacon?

1 Like

Yes, that’s one option. I don’t believe it would help with generating correlated randomness though (which is needed for MPC).

1 Like

Instead of a single worker, we plan to randomly sample a committee of workers for each secret contract and each epoch (epoch=several blocks long).

Will this be implemented at Mainnet launch?

@Brendan,

Not yet. In the upcoming Discovery mainnet release, we’ll start with a single worker per epoch assigned to a given secret contract. But the selection of that worker will be random, and change every epoch.

This scheme is scheduled to be upgraded in a subsequent release to increase from one worker per contract to a committee of workers per contract, also selected at random, per epoch.

3 Likes

Thanks @victor. This has come up in Telegram as a big vulnerability and I wanted to be able to provide a clear answer.

2 Likes

@Brendan,

To provide more context to your comment, we acknowledge that this solution is far from ideal and this will be one of the first features that we will improve in the subsequent release, but I would challenge the notion that this is a big vulnerability, or at least it contextualize that assertion depending on how one thinks about a vulnerability.

The main disadvantage of this solution is the reliance, at any given point in time (or epoch, as we define our timeslots in the network), on a single node/machine performing computations on a given contract, and the main implication is on availability. If that node goes down during the epoch in which it is assigned to a contract (and is penalized by losing its stake), all computation tasks have to wait until the epoch ends, nodes are reshuffled randomly, and another node is assigned to that contract and picks up the work queued by the node that went offline.

However, no data will be leaked during that process, and no secrets revealed. A data-leakage attack is equally likely whether one or multiple nodes are assigned to that contract: a targeted attack on a given node for a given contract has to account for the fact that nodes are assigned to contracts at random, so there is no way to know in advance which contract the targeted node will work on.

The other attack vector is a malicious node returning the wrong result for a given computation (but again, no data leakage takes place). Say I am a malicious node assigned to work on contract A, which computes x + y, and upon receiving x=1 and y=2 I decide to output 10… We argue that this is not possible because of the reliance on SGX and the process of remote attestation, by which a node, when it comes online and registers to the network, submits a proof of the following two things:

a) The code that the node is running is running inside a legitimate enclave (and not in simulation mode, where it could be debugged and data leaked).
b) The code that the node is running is exactly the same code that Enigma publishes, unmodified and signed. Along with this proof, the node submits a signature that we will verify every time it submits the result of a computation.

Thus, because of (a) and (b) combined, it is not possible for a malicious node to return the wrong output.
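As a hedged illustration of why (a) and (b) make a forged output detectable: below, an HMAC key stands in for the enclave's real attested signing key, and all names (`ENCLAVE_KEY`, `sign_result`, `verify_result`) are hypothetical, not Enigma's actual API.

```python
import hashlib
import hmac

# Stand-in for the key material bound to the attested enclave at
# registration time; a real deployment uses the enclave's signing key.
ENCLAVE_KEY = b"key-proven-by-remote-attestation"

def sign_result(task_id: str, output: int) -> bytes:
    """Signature the enclave attaches to every computation result."""
    msg = f"{task_id}:{output}".encode()
    return hmac.new(ENCLAVE_KEY, msg, hashlib.sha256).digest()

def verify_result(task_id: str, output: int, sig: bytes) -> bool:
    """Check a submitted result against the registered key material."""
    return hmac.compare_digest(sign_result(task_id, output), sig)

# Honest enclave: contract A computes x + y with x=1, y=2 and signs 3.
sig = sign_result("contract-A/task-1", 1 + 2)
assert verify_result("contract-A/task-1", 3, sig)
# A node claiming the output is 10 cannot produce a valid signature,
# since only the attested, unmodified code holds the key.
assert not verify_result("contract-A/task-1", 10, sig)
```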

Hence, the big vulnerability comes from availability, or the possible lack thereof (which I am not downplaying; I just want to make sure we all share an understanding that this is what we are facing), but no data leakage takes place.

Hope this helps.
Victor

4 Likes

GREAT info. Thank you Victor!

1 Like