Is ML model training actually possible on Enigma? If so, how?

Training ML models requires huge datasets, on the order of terabytes.

I saw somewhere on the web that DiDi's cab data is around 3 TB. If I wanted to train a self-driving car model using data from Tesla, GM, Uber, and BMW, how would I process that many terabytes of data on the blockchain?

How would the training take place? Could anyone please explain, step by step, how it would happen?

On the Discord I saw that the current SGX memory limit is 4 GB. How would training happen if the raw data is in terabytes?


Hey there @ezio, welcome!
I think that initially, using pre-trained models to evaluate smaller sections of user data is a feasible approach. Additionally, work in federated learning is promising for training models while doing the computation on edge devices, so that only the modifications to the model (not the raw data) are returned as encrypted tasks.
For ML as a vertical, you are right to identify the size constraint as key. But this is an active field of research; here are some recent research papers that address federated learning on edge devices using TEE hardware, which account for limited computational capacity: https://www.intel.ai/federated-learning-for-medical-imaging/#gs.bw8dg3 and https://eurosys2019.org/wp-content/uploads/2019/03/eurosys19posters-abstract66.pdf
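To make the federated learning idea concrete, here is a minimal sketch of federated averaging (FedAvg): each "edge device" trains on its own private shard and sends back only a weight delta, whose size is the model size rather than the dataset size. All names here are illustrative; this is not an Enigma or SGX API, just the general pattern the papers above build on.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training pass (plain linear regression via
    full-batch gradient descent). Returns only the weight delta --
    the raw data X, y never leaves the device."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w - weights  # in the Enigma setting, only this would be shipped (encrypted)

def federated_round(weights, clients):
    """Coordinator step: collect each client's delta, average, apply."""
    deltas = [local_update(weights, X, y) for X, y in clients]
    return weights + np.mean(deltas, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -3.0])

# Three simulated edge devices, each holding its own private data shard.
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=50)))

w = np.zeros(2)
for _ in range(30):
    w = federated_round(w, clients)
```

Note that what crosses the network per round is one small vector per client (here, 2 floats), regardless of how many terabytes of raw data the clients hold, which is exactly why this pattern sidesteps the enclave memory limit for the bulk data.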
