Large scale data architecture?

Hi, Enigma team. I have a question about architecting large-scale data storage and access so that it is efficient in both performance and cost while still keeping the data secret. I was reviewing the secret voting contract that Taariq mentioned to me, and I see that all the data is stored in a HashMap that appears to need to be loaded in its entirety to read and write values. I am guessing that is not a production-ready design for millions or billions of records, right? :wink: I know that Enigma has the enigma-p2p service, but I have not found any guides on how to use it effectively. In essence, I guess I am looking for something like orbit-db (https://github.com/orbitdb/orbit-db) but built on top of enigma-p2p. Please advise. Thanks!

2 Likes

To give a specific example: if I had millions of users and millions of “projects” that they were assigned to, and the user and project data needed to be kept secret, what would be a good way of handling that?

1 Like

Hey @bmiller59 - secret contracts have storage based on RocksDB. Agreed, the secret voting contract isn’t optimized. In this “database” all the data is private; the only way to expose data to the outside is by writing a “getter” function that returns the data decrypted.

1 Like

Thanks for the note about RocksDB. Are the read_state! and write_state! macros essentially reading from and writing to RocksDB? If that is the case, then I can just use a more detailed index/key to look up individual user or project records, rather than retrieving a HashMap of all values, correct?
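To make the "more detailed index/key" idea concrete, here is a minimal sketch of a per-record key layout. Everything in it is an assumption for illustration: `KvState` is a hypothetical in-memory stand-in for contract state (a real contract would go through `read_state!` / `write_state!`), and the `user:` / `assignment:` key prefixes are made up. It only shows the key scheme; whether per-key access actually avoids loading the whole state is the open question in this thread.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a contract's key-value state.
/// A real secret contract would use `read_state!` / `write_state!`
/// rather than this in-memory map.
#[derive(Default)]
pub struct KvState {
    entries: BTreeMap<String, String>,
}

impl KvState {
    /// Stores `value` under `key`, replacing any previous value.
    pub fn write(&mut self, key: &str, value: &str) {
        self.entries.insert(key.to_string(), value.to_string());
    }

    /// Returns the value stored under `key`, if any.
    pub fn read(&self, key: &str) -> Option<&String> {
        self.entries.get(key)
    }
}

/// Key for a single user record, e.g. "user:42" (assumed naming scheme).
pub fn user_key(user_id: u64) -> String {
    format!("user:{}", user_id)
}

/// Key for one user-to-project assignment, e.g. "assignment:42:7".
pub fn assignment_key(user_id: u64, project_id: u64) -> String {
    format!("assignment:{}:{}", user_id, project_id)
}

/// Records the role a user holds on a project under its own key,
/// so no collection of all assignments has to be rewritten.
pub fn assign(state: &mut KvState, user_id: u64, project_id: u64, role: &str) {
    state.write(&assignment_key(user_id, project_id), role);
}

/// Looks up a single assignment by its composite key.
pub fn role_of(state: &KvState, user_id: u64, project_id: u64) -> Option<String> {
    state.read(&assignment_key(user_id, project_id)).cloned()
}
```

With this layout, checking one assignment is a single keyed read (`role_of(&state, 42, 7)`) instead of deserializing a HashMap of every user and project.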

3 Likes

@ainsley Can you confirm if I have that right? Thx.

Sorry for the delay, @bmiller59. Yes, I think that’s correct. I’ll confirm this and also make sure it ends up in the documentation this week.

@bmiller59,

Regarding p2p limitations, we are trying to limit the size of messages passed over enigma-p2p to the order of MBs. Also, secret contracts, which can act as a DB, are currently limited to 4GB in size; this limit is due to SGX. Intel has mentioned that the 4GB limit will be significantly increased in SGX2, which is expected to hit the market in H2 2020.
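As a rough illustration of working within a per-message size cap, a larger payload can be split into bounded chunks before sending and rejoined on the other side. This is only a sketch: `MAX_MESSAGE_BYTES` is a made-up value (the only guidance above is "order of MBs"), and `split_into_messages` / `reassemble` are hypothetical helpers, not part of enigma-p2p.

```rust
/// Hypothetical cap of 1 MiB; the actual enigma-p2p limit is not specified here.
pub const MAX_MESSAGE_BYTES: usize = 1024 * 1024;

/// Splits `payload` into consecutive chunks of at most `max_bytes` bytes.
/// The final chunk may be shorter; an empty payload yields no chunks.
pub fn split_into_messages(payload: &[u8], max_bytes: usize) -> Vec<Vec<u8>> {
    assert!(max_bytes > 0, "max_bytes must be positive");
    payload.chunks(max_bytes).map(|chunk| chunk.to_vec()).collect()
}

/// Concatenates chunks back into the original payload, in order.
pub fn reassemble(messages: &[Vec<u8>]) -> Vec<u8> {
    messages.concat()
}
```

A caller would use something like `split_into_messages(&data, MAX_MESSAGE_BYTES)` and send each chunk separately; ordering and delivery guarantees are outside the scope of this sketch.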

One idea would be to build a bridge between OrbitDB and the Enigma network, such that data stored in OrbitDB can be pushed to an Enigma secret contract. This would require Enigma.js to be integrated into OrbitDB.

I actually discussed this with Haad from OrbitDB last month in Berlin. It would be an interesting integration collaboration…

2 Likes

Hey @bmiller59, I can confirm that these macros access key-value pair storage, but I also have some other info from our architect, @fredfortier.

I tried to break your question down into two separate concepts, but please correct me if you are asking something different.
questions
1 - Are there more efficient ways to store data than a hash map in our system?
2 - Is sending millions of records feasible given message size limits?

answers (via Fred)

  1. No, the entire state of each secret contract must be loaded in memory because it’s fully encrypted.
  2. I don’t know what is meant by “sending” or by “records”. Caching the state in memory during execution is efficient, so sending millions of “records” is conceivable as long as state storage is used sparingly.

cc @can and @taariq and @laura who will probably find this discussion interesting and helpful as well.

3 Likes

Hi @can, just curious about your suggestion. Basically, if we want to store an amount of data larger than 4GB, that data should be stored in OrbitDB. And every time we want to do something with it, we load it from OrbitDB and pass it over the bridge into the Enigma network.
But that data will be encrypted, so how can we process it?
And we will process one chunk of data at a time, right?

2 Likes

@arookie,

To store data encrypted in OrbitDB so that it can be decrypted inside Enigma, check out the Salad code and how we create key pairs inside the contract that are passed to the end user by the relayer. This way users are able to encrypt their recipient addresses and pass them to the relayer, and the relayer then submits these encrypted inputs to the Enigma secret contract. Given the encryption, the relayer has no way to decrypt or tamper with users’ recipient addresses.

Now replace the relayer with OrbitDB and we would have the same capability.

3 Likes

Wondering how state gets held between contract calls? And how does state get managed across the network?

I see mention of RocksDB, but I am unclear on its use in eng-core. Looking through the code, it seems state get / put is done in memory: https://github.com/enigmampc/enigma-core/blob/2544d51ec87e42afb836631f0e5872fdbf395f08/enigma-runtime-t/src/lib.rs#L32

If you follow the stack trace:
fn read_state ->
fn read_state_key_from_memory ->
self.memory -> wasmi::MemoryRef

Where might I find the connection between the WASM Memory and Persistent state?

2 Likes