I commit to personally give $5,000 worth of STEEM or BTC, at his convenience, plus 100% of the reward from this post and the comments below, to @Furion to cover server costs for hosting SteemData. I will also ask him for some kind of proof of payment to verify the costs.
What is SteemData?
SteemData helps developers and researchers build better STEEM applications. We parse the STEEM blockchain for you, and provide the data as a fast and convenient MongoDB service.
Here is the latest update he made about SteemData:
SteemData is currently available in a limited scope.
The following features are not available at the moment:
- AccountOperations (Account History with virtual ops)
A full steemd node with high throughput is required for SteemData to function properly. It needs all plugins and all operations enabled, so that it can construct account history and store all the virtual operations alongside the operations stored on the blockchain.
Further, the node has to be in the same datacenter as SteemData (SD) to handle the required throughput. This is because usage of the Steem blockchain is growing, and re-syncing all the affected state requires over 100 requests per second. Private network latency within a datacenter is typically below 1ms, while public network latency is usually tenfold or more. Running the node outside the datacenter would therefore decrease SD's throughput significantly.
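To make the latency argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes requests are issued one at a time over a single connection and that round-trip latency dominates each request; the update does not say how SteemData actually batches or parallelizes its calls, so treat this as an illustration only. The 1ms, tenfold, and 100 requests per second figures come from the paragraph above; the function name is made up for this example.

```python
def max_sequential_rps(round_trip_ms: float) -> float:
    """Upper bound on requests/second when each request waits for the previous reply."""
    return 1000.0 / round_trip_ms


REQUIRED_RPS = 100  # "over 100 requests per second", per the update

# Latency figures from the update: below 1ms privately, roughly 10x on public networks.
scenarios = [("private network (~1ms)", 1.0), ("public network (~10ms)", 10.0)]

for label, latency_ms in scenarios:
    ceiling = max_sequential_rps(latency_ms)
    headroom = ceiling / REQUIRED_RPS
    print(f"{label}: at most {ceiling:.0f} req/s ({headroom:.1f}x the required rate)")
```

Under these simplifying assumptions, a tenfold latency increase removes essentially all headroom above the required rate, which is in line with the point about keeping the node in the same datacenter.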
Unfortunately, the datacenter where SteemData currently resides only offers servers with up to 256GB of RAM, and a steemd node configured to SD's requirements needs more than that.
Also, the SteemData MongoDB server is running out of disk space :(
I have patched things up in a quick-and-dirty fashion for a couple of months now, and the SteemData codebase has gotten a bit messy. I see this as an opportunity to clean things up, and improve reliability/performance.
Further, this is an opportunity to add infrastructure support and documentation, such that anyone can spin up their own SteemData cluster.
What needs to be done?
I am currently speccing out the new infrastructure. A new cluster will be set up in a different datacenter, capable of provisioning servers with NVMe SSDs in soft & hard RAID configs, and up to 512 GB RAM.
I need to test various steemd configurations to achieve the desired performance and provision the appropriate hardware for the next 6 months of operations.
The new DB server will also benefit from faster SSDs and a larger in-RAM cache (currently 30GB, will be 60GB or 120GB).
Creating a replica set would add resilience and decouple blockchain processing from the database, freeing additional CPU cycles for queries. Further, adding geo-distributed replicas would allow for low-latency in-app integrations worldwide.
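As a sketch of what a geo-distributed replica set could look like from a client's point of view, the snippet below builds a MongoDB connection string that lists several replicas and asks the driver to read from the lowest-latency member. The hostnames, the replica set name `rs0`, and the database name are all hypothetical placeholders, not SteemData's actual endpoints; `replicaSet` and `readPreference=nearest` are standard MongoDB connection string options.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="nearest", database="SteemData"):
    """Build a MongoDB URI that names every replica and sets a read preference."""
    query = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return f"mongodb://{','.join(hosts)}/{database}?{query}"


# Hypothetical replica hosts in three regions (placeholders, not real servers).
HOSTS = [
    "db-eu.example.com:27017",
    "db-us.example.com:27017",
    "db-asia.example.com:27017",
]

uri = replica_set_uri(HOSTS, "rs0")
print(uri)
```

With `nearest`, reads can be served by whichever member responds fastest to the client, while writes still go to the primary. That is the mechanism that would let apps in different regions query a nearby replica.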
As mentioned before, this is an opportunity to upgrade the operational side as well, with automated provisioning and monitoring/remediation improvements.
- Clean up the codebase
- Infrastructure as code
- New Servers
- Multi-Replica DB Cluster
SteemData is a popular choice for indie developers and power users.
Steem is growing at a rapid pace, and its daily on-chain state throughput is making Bitcoin and Ethereum look pale in comparison. To scale SteemData, I'm looking at recurring server costs of approximately $3,000/mo. This would not have been possible without witness pay. Thank you for supporting my work, and thank you for using SteemData.