(Return of) The Spartan Proposal

The Spartan Proposal

Recently we have seen significant growth in the usage and user base of the Secret Network, but this same growth has highlighted that we are ill-prepared for it from an infrastructure standpoint.

To address existing infrastructure shortcomings, we propose building a 70-node cluster of load-balanced endpoints to help power the upcoming generation of Secret dapps launching in the years to come. This cluster would be publicly available to all and private to none, acting as a public-good resource maintained by the Secret Infrastructure team. We estimate this capacity is sufficient to power a backbone capable of handling a minimum of 2 simultaneous launches on the scale of Sienna and SecretSwap. And while we don't believe it's enough capacity to handle traffic alone if Secret becomes a top-30 coin, we think this is a good first step towards addressing the demands placed on infrastructure and preparing the network for new phases of growth.

This initiative is intended to let us get ahead of the curve on API-related scaling issues. Given the current wait times on hardware, achieving something at this scale requires significant planning ahead of time. Secret Infrastructure has done that planning and is qualified to deploy and maintain this public good.

Plan

Based both on feedback and the expected 4x improvement to query throughput, here is the plan.

A proposal to upgrade one of the builds from the Infra B proposal to be able to handle 70 nodes (up from 20).

Note 1: 1 x Supermicro 120U-TNR system & 2 x Intel Xeon Platinum 8380 CPUs.

Note 2: We are open to structuring the new API to load balance across our nodes plus nodes from other API providers on the network. We are currently exploring the feasibility; doing this would require cooperation from third parties.

Note 3: There is a wait time of 4-6 weeks minimum, and longer for certain components. We hope to have the equipment arrive in time to deploy in Q1 2022.

Note 4: This is not a discretionary budget. If the value of the SCRT in this proposal increases after it passes, or is more than is needed to purchase the hardware, the excess will be used exclusively to upgrade the build for this specific proposal. Any surplus that is not applied to hardware upgrades must go to maintenance, data center costs, and any software licenses associated with this equipment.

Budget

  • Hardware, Shipping, Warranty: $75,010.90
  • Sales Tax on hardware: $3,427.51
  • Buffer for Volatility (10%): $7,843.85
  • Subtotal: $86,282.26
  • Subtract $20,000 (taking the 20 nodes / $20,000 budget from Infra prop B)

Total: $66,282.26
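
For anyone checking the math, the line items reconcile as follows. This is just a quick sanity check, not part of the proposal itself; all figures are copied from the table above.

```python
# Reconciling the budget line items above (all figures copied from the proposal).
hardware_shipping_warranty = 75_010.90
sales_tax = 3_427.51
volatility_buffer = 7_843.85            # ~10% of hardware + tax, as stated above

gross = hardware_shipping_warranty + sales_tax + volatility_buffer   # 86,282.26
infra_b_credit = 20_000.00              # 20 nodes / $20,000 already funded by Infra prop B
total_request = gross - infra_b_credit  # 66,282.26

print(f"gross = ${gross:,.2f}, total request = ${total_request:,.2f}")
```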

(Note: this proposal was modified from 300 nodes to 70 based on discussions and feedback)

13 Likes

This is an interesting idea. Will other users like Dan be able to bid on this? That way it’s more like a government contract.

If Dan wants to continue to build out his current API, I have no issue with that. This proposal is from Secret Infrastructure, though; he would have to ask in his own proposal. Given Dan's history on the network, I am not comfortable being financially tied to him on the same proposal.

I've never heard of Secret Infrastructure before; is it a new committee?

1 Like

Secret Infrastructure is over a year old, sir. We were the first "committee". We recently changed our structure from a committee to a group, and we have 2 approved proposals for this initiative on chain.

1 Like

Sounds cool, sir.
Who are its members?

1 Like

@dylanschultzie @mohammedpatla and myself. We are open to adding other reputable people though. Infra is a big task to tackle.

1 Like

I know that Dan from CoS is a busy guy, but I would suggest asking him to join Secret Infrastructure; he has a huge amount of experience.
Have you invited him?

1 Like

Dan has said no to invites to collaborate on mutually agreeable terms, and as I have stated several times, after his past behavior with spend proposals and issues with paying multiple people what he promised, I am not comfortable risking the infrastructure budget by including him. He is free to write his own proposal and submit it himself for any work he wants to do.

I think it's imperative that we get ahead of the curve, and that if we want to be a top-20 cryptocurrency we prepare ourselves to become one. The Sienna launch was telling, and with a vast application layer being built, and being encouraged to be built, we'd certainly be remiss to wait for errors and then be stuck waiting for gear for 2 months instead of preparing properly.

I fully support the Spartan Proposal!

5 Likes

TL;DR: I think this is too large a budget for hardware alone. I'd rather see more focus on DevOps effort.

I'll preface this by noting that a lot of what I'm saying is already written in this post: (Secret Infrastructure Mega Thread - #8 by Cashmaney).

I'm all for creating a robust, open, and community-powered endpoint for devs and users. I'm also all for creating robust infrastructure that can be scaled to meet future network needs. However, I think that at this point the big challenge is building the fundamentals first. What do I mean by this? Think about a service like Infura. They have to have:

  • Solid rate limiting, and the ability to tier users based on their needs
  • Visibility into usage metrics - by users, and specific API calls to know what the usage profile looks like
  • Ability to easily block or limit abuse (such as DDoS)
  • Monitoring tools for their nodes
  • Load balancing between all the different nodes
  • Disaster recovery - redundancy (at the local and geographic levels), and single node recovery automation/processes
  • A portal that allows registration, payment & self-service

And as you can imagine, all of these get vastly more complex with the amount of traffic and hardware being employed. The reason I'm stressing these things is that, even with the hardware that was already purchased, these are tough challenges.
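
To make the first bullet above concrete: "solid rate limiting" usually boils down to something like a per-client token bucket in front of the nodes. Below is a minimal sketch; the refill rate, burst size, and the idea of keying on a client ID are illustrative assumptions, not proposed values. In practice this would normally live in the reverse proxy or API gateway rather than application code.

```python
import time
from collections import defaultdict

# Illustrative per-client token bucket. RATE and BURST are placeholder numbers.
RATE = 10.0    # tokens refilled per second
BURST = 20.0   # maximum bucket size (allowed burst)

_buckets = defaultdict(lambda: (BURST, time.monotonic()))  # client_id -> (tokens, last_seen)

def allow_request(client_id: str) -> bool:
    """Return True if the client may proceed, False if it should receive HTTP 429."""
    tokens, last = _buckets[client_id]
    now = time.monotonic()
    tokens = min(BURST, tokens + (now - last) * RATE)  # refill since last request
    if tokens < 1.0:
        _buckets[client_id] = (tokens, now)
        return False
    _buckets[client_id] = (tokens - 1.0, now)
    return True
```

The same mechanism, keyed by API key instead of IP address, is also what makes it possible to tier users based on their needs.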

Furthermore, solving these challenges will create open-sourced solutions that node runners can use, empowering the ecosystem. For example, monitoring and recovery tools for nodes are things that any validator will tell you are sorely missing right now.

Lastly, with Supernova around the corner, performance will be changing a lot. Contracts will be cheaper and more efficient, but there is also a lot of uncertainty. This is another reason why I think making large hardware purchases is unwise.

So what am I suggesting? Creating a budget for contracting two DevOps engineers for at least two months. Instead of purchasing 4x, let's start with ordering 1x (by my math, if we have 40 nodes right now and 4 servers get us to 300, then just 1 more server already more than doubles capacity).
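
Spelling out that back-of-the-envelope math (reading "4 servers gets us to 300" as roughly 75 nodes per server):

```python
current_nodes = 40
nodes_per_server = 300 // 4                              # ~75 nodes per server in the original plan
with_one_more_server = current_nodes + nodes_per_server  # 115
print(with_one_more_server >= 2 * current_nodes)         # True: one more server already doubles capacity
```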

Then, in two months' time, after the server arrives and we have a robust API with proper visibility, we can decide on expanding even further.

In the meantime I would focus the community API on providing high-quality service to key network services - Keplr being the first and foremost. A solution that uses local hardware, load balanced with 3rd-party solutions such as Chain of Secrets and Figment, could create redundancy so that core network services don't have a single point of failure anymore.
A small number of nodes could be retained as a public API, with solutions for app developers supplemented by Figment, Chain of Secrets, and self-hosted nodes.
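
As a rough illustration of that failover idea (not an actual deployment; the endpoint URLs and the /node_info probe path are placeholders), the load balancer would rotate across a mixed pool of self-hosted and third-party endpoints and skip any that fail a health check:

```python
import itertools
import urllib.request

# Hypothetical endpoint pool mixing self-hosted nodes with third-party providers.
ENDPOINTS = [
    "https://lcd.self-hosted.example",
    "https://lcd.provider-a.example",
    "https://lcd.provider-b.example",
]
_rotation = itertools.cycle(ENDPOINTS)

def healthy(url: str, timeout: float = 2.0) -> bool:
    """Cheap liveness probe; a real setup would also check node sync status."""
    try:
        with urllib.request.urlopen(f"{url}/node_info", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def pick_endpoint() -> str:
    """Round-robin over the pool, skipping endpoints that fail the health check."""
    for _ in range(len(ENDPOINTS)):
        url = next(_rotation)
        if healthy(url):
            return url
    raise RuntimeError("no healthy endpoints available")
```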

8 Likes

I tend to agree with a lot of your points. I am rather new to the Secret community so I know my opinion probably does not hold much value.

However, I am wondering if someone could answer my questions.

Judging by the Secret analytics, it seems the network has had a major spike in popularity since the beginning of October. With the release of Supernova, I imagine there could be more spikes in popularity and network usage in the coming months.

After waiting 2 months plus a 4-6 week minimum to acquire some of the components, would "more than double capacity" be sufficient to withstand potential spikes in network usage and popularity like the ones we have seen this past month? I understand that network usage and popularity over the next few months are purely speculative. However, I guess my question is not only whether doubling is sufficient, but if it were insufficient, what would the consequences look like?

Apologies if these questions seem redundant or stupid; I am still learning and new to the Secret Network.

1 Like

No worries man, those are reasonable questions.

Estimating traffic is always tough. There are so many unknowns. However, we can simplify a bit, and I’ll try to explain some of the logic behind my suggestions.

First, we have to ask ourselves - what is the purpose of the community API?

Is it a service for small app developers and users? If this is the case, then it isn't as important to predict growth, as it doesn't provide a critical service. Also, I would argue that for this use case the current capacity is more than enough (I could be convinced otherwise, but it would have to be backed up with metrics and usage data) - smaller devs don't tend to have high capacity demands. In most cases, high usage will be caused by unoptimized applications or devs that don't care about the API since it is free (this is why rate limiting is critical).

Okay, so what if we want the community API to provide critical services in the network? Well, firstly, to be able to provide the kind of reliable service that critical applications require, you have to have very good processes in place (the things I talked about in the previous post). The answer to "what happens if traffic spikes beyond capacity" depends a lot on how strong the infrastructure is. It could cause a crash for all services depending on it that would take days to recover from, or it could just slow things down a bit while the developers and infra maintainers scramble to reduce load.

But even putting that aside, in terms of raw capacity, given what we're seeing on SecretSwap & Enigma-powered services, I'm fairly confident that even current capacity can meet current and short-term demands. Larger applications tend to have fairly predictable and optimized usage, which lowers the raw demand on nodes, and they can take better advantage of more advanced solutions such as caching and websockets. For example, in SecretSwap, instead of having every user query all possible pairs to be able to do route calculations, the on-chain values are queried once, cached, and propagated to users on demand.
I could be wrong, of course. Maybe Sienna has so many users that they have some ungodly requirements of the network, or maybe Keplr does - but I think that discussion is pointless without real-world data.
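
To illustrate the caching pattern described above (this is not SecretSwap's actual code; the fetch function and the 6-second TTL are placeholder assumptions, with the TTL chosen to be roughly one block time):

```python
import time
from typing import Any, Callable

def cached(fetch: Callable[[], Any], ttl_seconds: float = 6.0) -> Callable[[], Any]:
    """Wrap an expensive chain query so all callers share one result per TTL window."""
    value: Any = None
    fetched_at = float("-inf")

    def get() -> Any:
        nonlocal value, fetched_at
        if time.monotonic() - fetched_at > ttl_seconds:
            value = fetch()            # the node is queried once per TTL window...
            fetched_at = time.monotonic()
        return value                   # ...and every other caller gets the cached copy

    return get

# Hypothetical usage: fetch_all_pairs would query a node for pool reserves once,
# and the API would serve get_pairs() to every user asking for route data.
# get_pairs = cached(fetch_all_pairs, ttl_seconds=6.0)
```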

4 Likes

I like your suggestions. I'm happy to adjust along these lines; I need to have some discussions before I respond further.

PREACH!!! Strongly agree with every suggestion here.

3 Likes

I can get behind everything you quoted for sure. I will be coming back with an adjusted plan later today.

1 Like

Based both on feedback from Cashmaney and the expected 4x improvement to query throughput, here is the adjusted plan.

A proposal to upgrade one of the builds from the Infra B proposal to be able to handle 70 nodes (up from 20).

Note 1: 1 x Supermicro 120U-TNR system & 2 x Intel Xeon Platinum 8380 CPUs.

Note 2: We are open to structuring the new API to load balance across our nodes plus nodes from other API providers on the network. We are currently exploring the feasibility.

Budget

  • Hardware, Shipping, Warranty: $75,010.90
  • Sales Tax on hardware: $3,427.51
  • Buffer for Volatility (10%): $7,843.85
  • Subtotal: $86,282.26
  • Subtract $20,000 (taking the 20 nodes / $20,000 budget from Infra prop B)

Total: $66,282.26

1 Like

Just pointing out that the last proposal mentioned 40-50 nodes:

Deploy a 3rd gen scalable xeon, migrate api load balancer, build out a 40-50 secret node endpoint capacity, research and deploy relay nodes to connect secret network to other networks via IBC.

Are you suggesting this spend is to add an additional 20-30 nodes? Or were the funds from the first proposal not enough to cover the 40-50 nodes mentioned?

2 Likes

Easy explanation. The last proposal has a total responsibility of 40 nodes, and Mohammed and I split that responsibility. Mohammed is still responsible for 20. This would make me responsible for 70, and him still responsible for his 20.

As mentioned in the Infrastructure mega thread, a phased approach to launching nodes is most appropriate, imo. Proof of 20 fully operating nodes should be provided to the community before extending the next payout for subsequent batches of nodes.
Proposers should fully utilize existing funds for nodes and show proof of the successful work before proceeding with further requests for funds.