diff --git a/docs/assets/images/aws-node-runners-1.png b/docs/assets/images/aws-node-runners-1.png new file mode 100644 index 00000000000..8ca6fbda722 Binary files /dev/null and b/docs/assets/images/aws-node-runners-1.png differ diff --git a/docs/assets/images/aws-node-runners-2.png b/docs/assets/images/aws-node-runners-2.png new file mode 100644 index 00000000000..6cdac6fd226 Binary files /dev/null and b/docs/assets/images/aws-node-runners-2.png differ diff --git a/docs/public-networks/how-to/bonsai-limit-trie-logs.md b/docs/public-networks/how-to/bonsai-limit-trie-logs.md index eb73f85b36b..f3a10e33cfe 100644 --- a/docs/public-networks/how-to/bonsai-limit-trie-logs.md +++ b/docs/public-networks/how-to/bonsai-limit-trie-logs.md @@ -1,6 +1,6 @@ --- title: Reduce storage for Bonsai Tries -sidebar_position: 12 +sidebar_position: 9 description: Reduce the size of your database when using Bonsai Tries tags: - public networks diff --git a/docs/public-networks/how-to/develop/_category_.json b/docs/public-networks/how-to/develop/_category_.json index 8c8a280b930..7c6d05c724f 100644 --- a/docs/public-networks/how-to/develop/_category_.json +++ b/docs/public-networks/how-to/develop/_category_.json @@ -1,4 +1,4 @@ { "label": "Develop dapps", - "position": 9 + "position": 10 } diff --git a/docs/public-networks/tutorials/aws-node-runners.md b/docs/public-networks/tutorials/aws-node-runners.md new file mode 100644 index 00000000000..0dee5a56dd1 --- /dev/null +++ b/docs/public-networks/tutorials/aws-node-runners.md @@ -0,0 +1,429 @@ +--- +sidebar_position: 4 +description: Configure Ethereum nodes using AWS Blockchain Node Runners. +toc_max_heading_level: 3 +tags: + - Public networks +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +# Deploy AWS Node Runners + +[AWS Blockchain Node Runners](https://aws-samples.github.io/aws-blockchain-node-runners/docs/intro) +is an open-source initiative aimed at simplifying the deployment of self-managed blockchain nodes +on AWS using vetted deployment blueprints and infrastructure configurations. +AWS Node Runners solves common challenges in architecting and deploying blockchain nodes on AWS, +helping users identify optimal configurations for specific protocol clients. + +This page walks you through the AWS Node Runners [architecture](#architecture), and how to +[deploy Besu and Teku on AWS](#deploy-aws-node-runners). + +## Architecture + +AWS Blockchain Node Runners supports several Ethereum client combinations and offers two +configuration options: a single node setup for development environments, and a highly available +multi-node setup for production environments. +The following diagrams illustrate the high level architecture of these setups. + +### Single RPC node setup + +
+

+ +![Architecture-PoC](../../assets/images/aws-node-runners-1.png) + +

+
+ +This single node setup is for small-scale development environments. +It deploys a single EC2 instance with both consensus and execution clients. +The RPC port is exposed only to the internal IP range of the VPC, while P2P ports allow external access to keep the clients synced. + +### Highly available setup + +
+

+ +![Architecture](../../assets/images/aws-node-runners-2.png) + +

+
+ +In this highly available, multiple node setup: + +1. The sync node synchronizes data continuously with the Ethereum network. +1. The sync node copies node state data to an Amazon S3 bucket. +1. New RPC nodes copy state data from the Amazon S3 bucket to accelerate their initial sync. +1. The Application Load Balancer routes application and smart contract development tool requests to available RPC nodes. + +### Architecture checklist + +The following is a checklist for an implementation of the AWS Blockchain Node Runners. +This checklist takes into account questions from the [AWS Well-Architected framework](https://aws.amazon.com/architecture/well-architected/) +that are relevant to this workload. +You can add more checks from the framework if required for your workload. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Pillar | Control | Question/Check | Notes |
|--------|---------|----------------|-------|
| Security | Network protection | Are there unnecessary open ports in security groups? | The Erigon snap sync port (`42069`) remains open for non-Erigon clients. |
| Security | Network protection | Traffic inspection | AWS WAF can be implemented for traffic inspection. Additional charges will apply. |
| Security | Compute protection | Reduce attack surface | This solution uses the Amazon Linux 2 AMI. You can run hardening scripts on it. |
| Security | Compute protection | Enable users to perform actions at a distance | This solution uses AWS Systems Manager for terminal sessions, not SSH ports. |
| Security | Data protection at rest | Use encrypted Amazon Elastic Block Store (Amazon EBS) volumes | This solution uses encrypted Amazon EBS volumes. |
| Security | Data protection at rest | Use encrypted Amazon Simple Storage Service (Amazon S3) buckets | This solution uses Amazon S3 managed keys (SSE-S3) encryption. |
| Security | Data protection in transit | Use TLS | The AWS Application Load Balancer currently uses an HTTP listener. To use TLS, create an HTTPS listener with a self-signed certificate (see the example after this table). |
| Security | Authorization and access control | Use instance profile with Amazon Elastic Compute Cloud (Amazon EC2) instances | This solution uses an AWS Identity and Access Management (AWS IAM) role instead of an IAM user. |
| Security | Authorization and access control | Follow the principle of least privilege access | In the sync node, the root user is not used (it uses the special user `ethereum` instead). |
| Security | Application security | Security-focused development practices | cdk-nag is used with appropriate suppressions. |
| Cost optimization | | Use cost-effective resources | AWS Graviton-based Amazon EC2 instances are used, which are cost-effective compared to Intel/AMD instances. |
| Cost optimization | | Estimate costs | One sync node with m7g.2xlarge for the geth-Lighthouse configuration (2048 GB SSD) costs around $430 per month in the US East region. Additional charges apply if you deploy RPC nodes with a load balancer. |
| Reliability | | Withstand component failures | This solution uses an AWS Application Load Balancer with RPC nodes for high availability. If the sync node fails, the Amazon S3 backup can be used to reinstate the nodes. |
| Reliability | | How is data backed up? | Data is backed up to Amazon S3 using the s5cmd tool. |
| Reliability | | How are workload resources monitored? | Resources are monitored using Amazon CloudWatch dashboards. Amazon CloudWatch custom metrics are pushed through the CloudWatch Agent. |
| Performance efficiency | | How is the compute solution selected? | The solution is selected based on best price-performance, that is, AWS Graviton-based Amazon EC2 instances. |
| Performance efficiency | | How is the storage solution selected? | The solution is selected based on best price-performance, that is, gp3 Amazon EBS volumes with optimal IOPS and throughput. |
| Performance efficiency | | How is the architecture selected? | The s5cmd tool is used for Amazon S3 uploads and downloads because it gives better price-performance compared to Amazon EBS snapshots. |
| Operational excellence | | How is the health of the workload determined? | Workload health is determined via AWS Application Load Balancer Target Group Health Checks on port `8545`. |
| Sustainability | | Select the most efficient hardware for your workload | This solution uses AWS Graviton-based Amazon EC2 instances, which offer the best performance per watt of energy use in Amazon EC2. |
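
The TLS item above notes that the Application Load Balancer ships with an HTTP listener only. The following is a rough sketch of one way to add TLS: generate a self-signed certificate, import it into AWS Certificate Manager, and attach it to a new HTTPS listener. The domain name, load balancer ARN, and target group ARN are placeholders; replace them with values from your own deployment.

```bash
# Generate a self-signed certificate (replace the CN with your internal domain).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout rpc-key.pem -out rpc-cert.pem -subj "/CN=rpc.internal.example"

# Import the certificate into AWS Certificate Manager and capture its ARN.
CERT_ARN=$(aws acm import-certificate \
  --certificate fileb://rpc-cert.pem \
  --private-key fileb://rpc-key.pem \
  --query CertificateArn --output text)

# Create an HTTPS listener on the existing load balancer that forwards to the RPC target group.
aws elbv2 create-listener \
  --load-balancer-arn <your-load-balancer-arn> \
  --protocol HTTPS --port 443 \
  --certificates CertificateArn=$CERT_ARN \
  --default-actions Type=forward,TargetGroupArn=<your-target-group-arn>
```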
+ +## Deploy Besu and Teku on AWS + +:::note +In this guide, you'll set all major configuration through environment variables, but you can also +modify parameters in the `config/config.ts` file. +::: + +### 1. Configure the AWS CloudShell + +#### 1.1. Log into AWS + +Log in to your [AWS account](https://aws.amazon.com/) with permissions to create and modify +resources in IAM, EC2, EBS, VPC, S3, KMS, and Secrets Manager. +From the AWS Management Console, open the [AWS CloudShell](https://docs.aws.amazon.com/cloudshell/latest/userguide/welcome.html), +a web-based shell environment. +For more information, see [this demo](https://youtu.be/fz4rbjRaiQM) on +[CloudShell with VPC environment](https://docs.aws.amazon.com/cloudshell/latest/userguide/creating-vpc-environment.html), +which you'll use to test APIs from an internal IP address space. + +#### 1.2. Install dependencies + +To deploy and test blueprints in the CloudShell, clone the following repository and install dependencies: + +```bash +git clone https://github.com/aws-samples/aws-blockchain-node-runners.git +cd aws-blockchain-node-runners +npm install +``` + +### 2. Prepare to deploy nodes + +In the root directory of your project: + +1. If you have deleted or don't have the default VPC, create a default VPC: + + ```bash + aws ec2 create-default-vpc + ``` + + :::note + You might see the following error if the default VPC already exists: + + ```bash + An error occurred (DefaultVpcAlreadyExists) when calling the CreateDefaultVpc operation: A Default VPC already exists for this account in this region. + ``` + + This means that the default VPC must have at least two public subnets in different availability + zones, and public subnet must set `Auto-assign public IPv4 address` to `YES`. + ::: + +1. Configure your Node Runners Ethereum blueprint deployment. + To specify the Ethereum client combination you want to deploy, create your own copy of the `.env` + file and edit it using your preferred text editor. + The following example uses a sample configuration from the repository for a Besu and Teku node deployment: + + ```bash + # Ensure you're in aws-blockchain-node-runners/lib/ethereum + cd lib/ethereum + pwd + cp ./sample-configs/.env-besu-teku .env + nano .env + ``` + + :::note + You can find more examples for other Ethereum client combinations in the `sample-configs` directory. + ::: + +1. Deploy common components, such as IAM role and Amazon S3 bucket to store data snapshots: + + ```bash + pwd + # Ensure you're in aws-blockchain-node-runners/lib/ethereum + npx cdk deploy eth-common + ``` + +### 3. Deploy nodes + +Deploy your node or nodes, depending on your setup: + +- [Single RPC node](#31-option-1-single-rpc-node) +- [Highly available RPC nodes](#32-option-2-highly-available-rpc-nodes) + +#### 3.1. (Option 1) Single RPC node + +In a single RPC node setup: + +1. Deploy the node: + + ```bash + pwd + # Ensure you're in aws-blockchain-node-runners/lib/ethereum + npx cdk deploy eth-single-node --json --outputs-file single-node-deploy.json + ``` + + :::note + The default VPC must have at least two public subnets in different Availability Zones, and the + public subnets must set `Auto-assign public IPv4 address` to `YES`. + ::: + +1. After starting the node, wait for the initial synchronization process to finish. + It can take half a day to approximately 6-10 days, depending on the client combination and + the network state. + You can use Amazon CloudWatch to track the progress, which publishes metrics every five minutes. 
+ Watch `sync distance` for the consensus client, and `blocks behind` for the execution client. + When the node is fully synced, those two metrics should be `0`. + To see them: + + - Navigate to [CloudWatch service](https://console.aws.amazon.com/cloudwatch/) (ensure you're + in the region you specified for `AWS_REGION`). + - Open `Dashboards` and select `eth-sync-node-` from the list of dashboards. + +1. Once the initial synchronization is done, you can access the RPC API of that node from within the + same VPC. + The RPC port is not exposed to the Internet. + Run the following query against the private IP of the single RPC node you deployed: + + ```bash + INSTANCE_ID=$(cat single-node-deploy.json | jq -r '..|.node-instance-id? | select(. != null)') + NODE_INTERNAL_IP=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID --query 'Reservations[*].Instances[*].PrivateIpAddress' --output text) + echo "NODE_INTERNAL_IP=$NODE_INTERNAL_IP" + ``` + + Copy the output from the last `echo` command with `NODE_INTERNAL_IP=` and open + [CloudShell tab with VPC environment](https://docs.aws.amazon.com/cloudshell/latest/userguide/creating-vpc-environment.html) + to access the internal IP address space. + Paste `NODE_INTERNAL_IP=` into the new CloudShell tab. + Then, query the API: + + ``` bash + # IMPORTANT: Run from CloudShell VPC environment tab + # This queries the token balance of a Beacon deposit contract: https://etherscan.io/address/0x00000000219ab540356cbb839cbe05303d7705fa + curl http://$NODE_INTERNAL_IP:8545 -X POST -H "Content-Type: application/json" \ + --data '{"method":"eth_getBalance","params":["0x00000000219ab540356cBB839Cbe05303d7705Fa", "latest"],"id":1,"jsonrpc":"2.0"}' + ``` + + The result should look like the following (the actual balance might change): + + ```javascript + {"jsonrpc":"2.0","id":1,"result":"0xe791d050f91d9949d344d"} + ``` + +#### 3.2. (Option 2) Highly available RPC nodes + +In a highly available multi-node setup: + +1. Deploy the sync node: + + ```bash + pwd + # Ensure you're in aws-blockchain-node-runners/lib/ethereum + npx cdk deploy eth-sync-node --json --outputs-file sync-node-deploy.json + ``` + + :::note + The default VPC must have at least two public subnets in different Availability Zones, and the + public subnets must set `Auto-assign public IPv4 address` to `YES`. + ::: + +1. After starting the node, wait for the initial synchronization process to finish. + It can take from half a day to approximately 6-10 days, depending on the client combination and + the network state. + You can use Amazon CloudWatch to track the progress, which publishes metrics every five minutes. + Watch `sync distance` for the consensus client, and `blocks behind` for the execution client. + When the node is fully synced, those two metrics should be `0`. + To see them: + + - Navigate to [CloudWatch service](https://console.aws.amazon.com/cloudwatch/) (make sure you are + in the region you have specified for `AWS_REGION`). + - Open `Dashboards` and select `eth-sync-node-` from the list of dashboards. + + Once the synchronization process is over, the script automatically stops both clients and copies + all the contents of the `/data` directory to your snapshot S3 bucket. + That can take from 30 minutes to approximately 2 hours. + During the process, you will see lower CPU and RAM usage, but high data disc throughput and + outbound network traffic. + The script automatically starts the clients after the process is done. 
+ + :::note + The snapshot backup process automatically runs every day at midnight of the time zone were the + sync node runs. + To change the schedule, modify `crontab` of the root user on the node's EC2 instance. + ::: + +1. Configure and deploy two RPC nodes: + + ```bash + pwd + # Ensure you're in aws-blockchain-node-runners/lib/ethereum + npx cdk deploy eth-rpc-nodes --json --outputs-file rpc-node-deploy.json + ``` + +1. Give the new RPC nodes approximately 30 minutes to initialize, then run the following query + against the load balancer behind the RPC node created: + + ```bash + export ETH_RPC_ABL_URL=$(cat rpc-node-deploy.json | jq -r '..|.alburl? | select(. != null)') + echo ETH_RPC_ABL_URL=$ETH_RPC_ABL_URL + ``` + + ```bash + # IMPORTANT: Run from CloudShell VPC environment tab + # We query token balance of Beacon deposit contract: https://etherscan.io/address/0x00000000219ab540356cbb839cbe05303d7705fa + curl http://$ETH_RPC_ABL_URL:8545 -X POST -H "Content-Type: application/json" \ + --data '{"method":"eth_getBalance","params":["0x00000000219ab540356cBB839Cbe05303d7705Fa", "latest"],"id":1,"jsonrpc":"2.0"}' + ``` + + The result should look like the following (the actual balance might change): + + ```javascript + {"jsonrpc":"2.0","id":1,"result":"0xe791d050f91d9949d344d"} + ``` + + If the nodes are still starting and catching up with the chain, you will see the following response: + + ```HTML + + 503 Service Temporarily Unavailable + +

503 Service Temporarily Unavailable

+ + ``` + + :::note + By default and for security reasons, the load balancer is available only from within the default + VPC in the region where it is deployed. + It is not available from the Internet and is not open for external connections. + Before opening it up, protect your RPC APIs. + ::: + +### 4. Clear and undeploy nodes + +To clear and undeploy the RPC nodes, sync nodes, and common components, use the following commands: + +```bash +# Set the AWS account ID and region in case the local .env file is lost. +export AWS_ACCOUNT_ID= +export AWS_REGION= + +pwd +# Ensure you're in aws-blockchain-node-runners/lib/ethereum. + +# Destroy the single RPC node. +cdk destroy eth-single-node + +# Destroy multiple RPC nodes. +cdk destroy eth-rpc-nodes + +# Destroy the sync node. +cdk destroy eth-sync-node + +# You need to manually delete an s3 bucket with a name similar to 'eth-snapshots-$accountid-eth-nodes-common' +# on the console: +# 1. Empty the bucket +# 2. Delete the bucket +# 3. Execute and delete all common components like IAM role and Security Group +cdk destroy eth-common +``` diff --git a/docs/public-networks/tutorials/kubernetes.md b/docs/public-networks/tutorials/kubernetes.md index 17c64f3172c..bcbae22e1b3 100644 --- a/docs/public-networks/tutorials/kubernetes.md +++ b/docs/public-networks/tutorials/kubernetes.md @@ -1,6 +1,7 @@ --- title: Deploy Besu using Kubernetes description: Deploy a Besu node using Kubernetes. +sidebar_position: 3 toc_max_heading_level: 3 tags: - public networks