Large scale data architecture?

bmiller59 · November 12, 2019, 5:51pm

Hi, Enigma team. I have a question about architecting large-scale data storage and access so that it is efficient in performance and cost while still preserving secrecy for the data. I was reviewing the secret voting contract that Taariq mentioned to me, and I see that all the data is stored in a HashMap that appears to need to be loaded in its entirety to read and write values. I am guessing that is not a production-ready design for millions or billions of records, right? I know that Enigma has the enigma-p2p service, but I have not found any guides on how to use it effectively. In essence, I guess I am looking for something like orbit-db (https://github.com/orbitdb/orbit-db) but built on top of enigma-p2p. Please advise. Thanks!

bmiller59 · November 13, 2019, 6:59pm

To give a specific example, if I had millions of users and millions of “projects” that they were assigned to, and the user and project data needed to be secret, what would be a good way of handling that?

ainsley · November 13, 2019, 9:30pm

hey @bmiller59 – Secret contracts have storage based on RocksDB. Agree the secret voting isn’t optimized. In this “database” all the data is private, the only way to expose data to the outside is by writing a “getter” function that will return the data decrypted.

bmiller59 · November 14, 2019, 1:01am

Thanks for the note about RocksDB. Are the read_state! and write_state! macros essentially reading from and writing to RocksDB? If that is the case, then I can just use a more detailed index/key to look up individual user or project records, rather than retrieving a HashMap of all values, correct?

bmiller59 · November 15, 2019, 6:28pm

@ainsley Can you confirm if I have that right? Thx.

ainsley · November 17, 2019, 11:56pm

Sorry for the delay, @bmiller59. Yes, I think that’s correct. I’ll confirm this and also make sure it ends up in the documentation this week.
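
For readers following along, the per-record key idea discussed above can be sketched as follows. This is a minimal illustration only: `MockState`, `user_key`, `assignment_key`, `assign`, and `is_assigned` are hypothetical names, and a plain in-memory map stands in for whatever the read_state! and write_state! macros do underneath. It shows the key layout, not how the Enigma runtime actually loads or encrypts state.

```rust
use std::collections::HashMap;

/// Stand-in for a contract's key-value state (an assumption for illustration;
/// a real secret contract would go through read_state! / write_state!).
#[derive(Default)]
pub struct MockState {
    entries: HashMap<String, Vec<u8>>,
}

impl MockState {
    /// Store one value under one key, overwriting any previous value.
    pub fn write(&mut self, key: &str, value: &[u8]) {
        self.entries.insert(key.to_string(), value.to_vec());
    }

    /// Look up a single key without touching any other record.
    pub fn read(&self, key: &str) -> Option<&[u8]> {
        self.entries.get(key).map(|v| v.as_slice())
    }
}

/// Hypothetical key scheme: one entry per user record.
pub fn user_key(user_id: u64) -> String {
    format!("user/{}", user_id)
}

/// Hypothetical key scheme: one entry per (user, project) assignment.
pub fn assignment_key(user_id: u64, project_id: u64) -> String {
    format!("assignment/{}/{}", user_id, project_id)
}

/// Record that a user is assigned to a project.
pub fn assign(state: &mut MockState, user_id: u64, project_id: u64) {
    state.write(&assignment_key(user_id, project_id), &[1]);
}

/// Check a single assignment by key instead of scanning a map of all values.
pub fn is_assigned(state: &MockState, user_id: u64, project_id: u64) -> bool {
    state.read(&assignment_key(user_id, project_id)).is_some()
}
```

With this layout, checking whether user 7 is on project 42 is a single lookup of `assignment/7/42`. Whether that translates into a smaller read inside the enclave depends on how the runtime handles state, which this sketch does not model.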

can · November 18, 2019, 4:33pm

@bmiller59,

Regarding p2p limitations, we are trying to limit the size of messages passed over enigma-p2p to the order of MBs. Also, secret contracts, which can act as a DB, are currently limited to a size of 4GB. This limit is due to SGX. Intel has mentioned that the 4GB limit will be significantly increased in SGX2, which is expected to hit the market in H2 2020.

An idea could be to build a bridge between OrbitDB and Enigma network, such that data stored in OrbitDB can be pushed to an Enigma secret contract. This would require Enigma.js to be integrated into OrbitDB.

I actually discussed this with Haad from OrbitDB last month in Berlin. Would be an interesting integration collaboration…
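
As a rough illustration of working within the message-size limit mentioned above, a bridge could split a large (already encrypted) payload into chunks before pushing it, and reassemble it on the other side. This is only a sketch under assumptions: `MAX_CHUNK_BYTES`, `Chunk`, `split_into_chunks`, and `reassemble` are hypothetical names, the 1 MiB cap is a placeholder for "the order of MBs", and no OrbitDB or Enigma.js calls are shown.

```rust
/// Hypothetical per-message cap; the actual limit is only described as
/// "on the order of MBs".
pub const MAX_CHUNK_BYTES: usize = 1024 * 1024;

/// One piece of a larger payload, tagged so the receiver can reassemble it.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub index: usize,
    pub total: usize,
    pub bytes: Vec<u8>,
}

/// Split a payload into pieces of at most `max_bytes` each.
pub fn split_into_chunks(payload: &[u8], max_bytes: usize) -> Vec<Chunk> {
    assert!(max_bytes > 0, "chunk size must be positive");
    // Ceiling division gives the number of chunks the payload needs.
    let total = (payload.len() + max_bytes - 1) / max_bytes;
    payload
        .chunks(max_bytes)
        .enumerate()
        .map(|(index, part)| Chunk {
            index,
            total,
            bytes: part.to_vec(),
        })
        .collect()
}

/// Put chunks back in order and concatenate their bytes.
pub fn reassemble(mut chunks: Vec<Chunk>) -> Vec<u8> {
    chunks.sort_by_key(|c| c.index);
    chunks.into_iter().flat_map(|c| c.bytes).collect()
}
```

A caller would use something like `split_into_chunks(&ciphertext, MAX_CHUNK_BYTES)` and send each chunk as its own message. The `index` and `total` fields let the receiver detect missing pieces and restore the order.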

ainsley · November 18, 2019, 7:54pm

Hey @bmiller59, I can confirm that these macros access key-value pair storage – but I also have some other info from our architect @fredfortier.

I tried to break your question down into two separate questions, but please correct me if you are asking something different.

Questions
1 - Are there more efficient ways to store data than a hash map using our system?
2 - Is sending millions of records feasible given message size limits?

Answers (via Fred)
1 - No, the entire state of each secret contract must be loaded in memory because it’s fully encrypted.
2 - I don’t know what is meant by “sending” or by “records”. Caching the state in memory during execution is efficient, so sending millions of “records” is conceivable as long as state storage is used sparingly.

cc @can and @taariq and @laura who will probably find this discussion interesting and helpful as well.

arookie · November 19, 2019, 2:30am

Hi @can, just curious about your mention. Basically, if we want to store an amount of data larger than 4GB, that data should be stored in OrbitDB. And every time we want to do something with it, we will load it from OrbitDB and put it on the bridge to load into the Enigma network.
But that data will be encrypted, so how can we process it?
And we will process one chunk of data at a time, right?

can · November 19, 2019, 2:34am

@arookie,

In order to store encrypted data in OrbitDB that can be decrypted inside Enigma, check out the Salad code and how we are creating key pairs inside the contract that is passed to the end user by the relayer. This way the user is able to encrypt their recipient addresses and pass them to the relayer, and then the relayer submits these encrypted inputs to the Enigma secret contract. Given the encryption, the relayer has no way to decrypt and tamper with users’ recipient addresses.

Now replace relayer with OrbitDB and we would have the same capability

moskalyk · November 23, 2019, 3:08am

Wondering how state gets held between contract calls? And how does state get managed across the network?

I see mention of RocksDB, but I’m unclear on its use in enigma-core. Looking through the code, it seems state get/put is done in memory: https://github.com/enigmampc/enigma-core/blob/2544d51ec87e42afb836631f0e5872fdbf395f08/enigma-runtime-t/src/lib.rs#L32

If you follow the stack trace:
fn read_state ->
fn read_state_key_from_memory ->
self.memory -> wasmi::MemoryRef

Where might I find the connection between the WASM Memory and Persistent state?
