Secret Infrastructure Mega Thread

I’m starting this thread as a main thread for infrastructure needs of the network.

6 Likes

We’ve run into issues time and time again during peak traffic from dapp launches and it seems to me that we are still not planning aggressively enough as a network on infrastructure.

We were recently funded with a $64,600 hardware budget (which has increased in value and will all be spent on infra) to add 40 secret nodes to the community API. Suppliers are quoting us 4-6 week wait times, with no assurance that there won’t be further delays (due to a silicon supply shortage). Because of this, anything we purchase needs to be planned further in advance than with the previous generation of hardware.

It seems to me this won’t be enough to handle much more than what our network’s current top APIs can handle. Considering we still have several product launches upcoming and are also onboarding a lot more users in general (which is generating more network activity and load on APIs), I’d like to discuss expanding this budget to a more ambitious number so dapps could handle far larger peak traffic. The number I think we should target is 300 nodes, but we could do it in batches larger than 40 (perhaps batches of 100 or 150). Please share your feedback / thoughts on this topic and ask any questions.

4 Likes

I’d like to see other people get involved in running APIs, like Dan from CoS. I’d also like to see the full SCRT budget that was issued from the first spend used toward its original purpose.

1 Like

Two things prevent Dan from getting involved with secret infrastructure proposals: 1) Dan keeps saying no to collaborating, and 2) I have concerns related to past experiences, which would be better discussed on calls. We each seem to have our conditions for working together and we have not found a mutually agreeable middle ground. That will not change unless he agrees to discuss and we mutually agree on something.

Furthermore, the more decentralized an API is, the less performant it is. So personally I’m not interested in being a part of a very distributed API. Scaling here makes more sense with Dan scaling out his own API, Figment growing theirs, and the community API also growing. Each needs to be robust, because if one goes down, we need backups that can handle peak traffic.

Then please comment here so we all can decide how much flexibility there is. I clearly shared my preference, and will state again: I will abide by any standards that the community sets in the charter.

I like that you’re looking forward. In my 4 years of experience here, one thing that has been frustrating is that time and time again we only grow once we’ve reached a barrier, oftentimes one that could have been foreseen and avoided. If we don’t fully expect a massive surge in usage then what are we doing here? The motto has always been to bring dApps that support millions of users. If we can’t handle the traffic millions of users will bring, then how will we achieve our vision?

I’m sure this would be very expensive… but not as expensive as having our infrastructure crack every time a new dApp comes out! I’m tired of turning users away. Let’s be precautionary instead of reactionary, and try our best to make that a habit going forward in all facets of the ecosystem.

5 Likes

Ok, so here’s how I’d like to proceed with this.

Tomorrow I will check in with my supplier on wait times for an upgraded / increased unit order. Then I will come back and post a budget for a 300 node secret cluster. Then discuss from there.

1 Like

Okay, here’s my 2 cents.

First of all, it’s important to realize that performance issues are normal. Look at any half-decent successful launch and you’ll see performance issues. The reason is that it doesn’t make economic sense to buy 10 or 100x the capacity if you’re only going to use it for a couple of hours. Also, sometimes we just aren’t that good at estimating the number of users we will see, or how our applications will behave at such scales.
The best example of this is Secretswap. It was wonky on launch because we didn’t optimize it enough, and we didn’t predict performance impact of specific features at scale. Once we got those figured out it was mostly smooth sailing, even though we didn’t do too much work on the infrastructure.

Now, on to the SN community infrastructure itself. Firstly, I’m not a huge fan of buying more servers. I think we’re at a point where for a fraction of the upfront cost we can create a scalable cloud cluster (i.e. k8s) which is open-sourced and public so it can be deployed by anyone (including a public or community cluster), and will be more robust and require less maintenance than we can do with bare-metal.

That aside, I think right now (especially with supernova around the corner) we should wait and focus on benchmarking & improving the robustness of what we currently have, rather than committing to purchasing and maintaining more hardware.

For example, do we have the numbers on what kind of requests/second the currently deployed hardware can do? What do we expect the new hardware (the 40 nodes) to be able to support? Are we using caching, rate limiting, and implementing DR best practices? Is recovering nodes from crashes automated? Is there automatic failure detection for out-of-sync nodes? Is traffic load balanced between the nodes? These are all questions that I would tackle before scaling to even more hardware, since then those issues become even more complex. There is more to the issue than simply adding raw horsepower.
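To make the out-of-sync question concrete, here is a minimal sketch of the kind of check a load balancer could run before sending traffic to a node. It assumes each node's status has already been fetched and parsed; the field names mirror what I believe a Tendermint-based node reports under `sync_info` on its RPC `/status` endpoint (`latest_block_height`, `catching_up`). The `NodeStatus` type, the `healthy_nodes` helper, and the 5-block lag threshold are hypothetical and only there for illustration, not a description of how any existing API is run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStatus:
    """Parsed sync state of one node (hypothetical container type)."""
    name: str
    latest_block_height: int
    catching_up: bool


def healthy_nodes(statuses, max_lag_blocks=5):
    """Return names of nodes that are safe to keep behind the load balancer.

    A node is dropped if it reports that it is still catching up, or if it
    trails the highest height seen in the cluster by more than
    max_lag_blocks (an assumed threshold; tune it to the block time).
    """
    if not statuses:
        return []
    tip = max(s.latest_block_height for s in statuses)
    return [
        s.name
        for s in statuses
        if not s.catching_up and tip - s.latest_block_height <= max_lag_blocks
    ]


if __name__ == "__main__":
    cluster = [
        NodeStatus("node-a", 1000, False),
        NodeStatus("node-b", 998, False),   # 2 blocks behind: still fine
        NodeStatus("node-c", 900, False),   # 100 blocks behind: out of sync
        NodeStatus("node-d", 1000, True),   # still catching up
    ]
    print(healthy_nodes(cluster))
```

Note that the tip is taken from the cluster itself, so if every node stalls at the same height this check would not notice. A real setup would probably also compare against an external reference.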

9 Likes

+1 to autoscaling k8s

+1 to assessing what the current 40 nodes can do, + you have the other part of the budget for k8s forever <3

1 Like

  1. If people want to scale in the cloud I respect that. But we will not be using our budget for cloud machines.

  2. Timing has been an open question of mine (if people want to wait or not). I’ll do whatever people want us to do timing wise, so if now is not the right time or if you think we shouldn’t get more physical hardware then those are things I want to hear.

Thanks for sharing your feedback @Cashmaney :pray:t5:

I have answers to these questions, and I’ll share more information after we deploy the hardware.

From this juncture I am opting to stick to the original plan, which is to deliver exactly 40 secret nodes on 3rd gen scalable hardware (no more, no less) and relay infrastructure. Due to the silicon shortage, we expect to be able to deliver this sometime in Q1 2022, but we will let people know if we are able to before then. Thanks to everyone who chimed in 🙂

2 Likes

After further discussions with Cashmaney, he indicated he doesn’t care if we do cloud or bare-metal physical hardware and that there are pros and cons to each. I’m not claiming this as an endorsement from him, but we are looking to proceed with our plans after this input.

2 Likes

I’m assuming that the current state of this has moved over to the “spartan proposal” thread.
I would like to suggest that if the community proceeds, new nodes should be launched in phases. Proof of the first 20 nodes being successfully launched should be given to the community before new nodes are approved.
A phased solution with proof of successful launch and uptime for existing nodes is ideal.

For anyone that has missed it, get voting:

Keplr - Browser extension wallet for the Inter blockchain ecosystem.

Hello all,

I wanted to update everyone on the status of our API related proposals (such as spartan).

TLDR

Current status : hardware undelivered for secure secrets and secret llc.

ETA on hardware: No eta. Best guess is Q1 / Q2.

What is covered without more funding?

  1. 20 nodes run for 1 year. (Data center costs for securesecrets)

  2. All hardware.

What is not covered?

  1. Any hours for support of API. (Any time we have to spend on the api after deployment)

  2. Data center costs for secret llc.

Note: infrastructure team is not pursuing API related funding until a time after we get and deploy hardware

More Details

Existing Community API status : Due to increasing system requirements, the community api was reduced from a max of 8 nodes to 4. It may be further reduced in the face of further increasing system requirements. This API is also not currently funded for its operation so we don’t put much time into it.

Changes

  1. Taking what we have learned from issues with existing APIs, we will not run the entire community API as one shared cluster. Instead we will have a shared option with permissionless access to a 10-20 node cluster, then we will divide up the rest of the capacity into individual clusters dedicated to individual dapps. This decision was made because when everything is on one cluster, a single dapp having issues can cause problems for all other users of the API.

  2. Secret LLC is in the process of developing/deploying enterprise quality automation and industry standard monitoring solutions specifically for a large cluster of nodes and we’ve made a hire to work on infrastructure with me. This was in part done based on previous feedback from scrtlabs.
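As a rough illustration of the split described in item 1, the sketch below routes requests from a dapp with its own cluster only to that cluster's nodes, and sends everyone else to the shared pool. The `ClusterRouter` class, the node names, and the round-robin choice are all assumptions made for the example. The actual deployment may route very differently.

```python
from itertools import cycle


class ClusterRouter:
    """Round-robin router with per-dapp dedicated pools (hypothetical)."""

    def __init__(self, shared_nodes, dedicated_nodes):
        if not shared_nodes:
            raise ValueError("the shared pool needs at least one node")
        # Permissionless pool used by anyone without a dedicated cluster.
        self._shared = cycle(shared_nodes)
        # One independent rotation per dapp, so load from one dapp never
        # lands on another dapp's nodes or on the shared pool.
        self._dedicated = {
            dapp: cycle(nodes) for dapp, nodes in dedicated_nodes.items() if nodes
        }

    def route(self, dapp_id=None):
        """Pick the next node for a request; unknown callers use the shared pool."""
        pool = self._dedicated.get(dapp_id, self._shared)
        return next(pool)


if __name__ == "__main__":
    router = ClusterRouter(
        shared_nodes=["shared-1", "shared-2"],
        dedicated_nodes={"dapp-a": ["a-1", "a-2"]},
    )
    # A burst from dapp-a only ever touches a-1 and a-2.
    print([router.route("dapp-a") for _ in range(4)])
    # Anonymous or unknown callers stay on the shared pool.
    print(router.route(), router.route("some-other-dapp"))
```

The point of the design is isolation: a traffic spike or a misbehaving client on one dapp exhausts only that dapp's rotation, while the shared pool and other dedicated clusters keep serving.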

What happens when we get the hardware?

  1. When Mohammed from securesecrets gets the machine they ordered, my team will get nodes deployed to the machine. This deployment will be a public free access API. Since Mohammed included data center costs for 1 year, that is the amount of time from launch it will run without needing data center costs covered again.

  2. When I get our hardware from spartan, our team will deploy the nodes to it, run various tests, and make sure things are production ready. Once deemed production ready by our team, we will be ready for it to be available to users. On our side of things we used the funds for hardware and there are no funds left over for maintenance / support or data center costs. For this reason and because we are committed to carrying out the proposal in a sustainable manner, we will be seeking funding options (foundation, slabs, pool). If these funding options fail then we will mirror the figment tiered model for offering an API, and have a free tier + a paid enterprise tier providing private secret clusters to a handful of dapps.

Closing statement

Secret LLC and Secure Secrets are deeply passionate about the Secret Network and, no matter what, we will find ways to solve the root issues we set out to solve on the network. We believe, now more than ever, that multiple robust APIs are needed for the network to grow, and we look forward to achieving these mutual goals.

8 Likes

Wholeheartedly agree and support :100:.

I have confidence that Secret LLC and Secure Secrets have the expertise, determination and fortitude to address the infrastructure pain points the community has experienced with launches and high activity times with existing dapps such as SecretSwap and Stashh.

2 Likes