Separate replica from master

Without a specific placement strategy, the replica might be deployed
on the same EC2 instance as the master.

When both tasks run on the same instance, traffic between them reaches
the target instance with identical source and destination IP addresses,
and such packets are dropped, as is the convention for L4 networking[1].

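Ensure the replica is never deployed to the same EC2 instance as the
master by creating separate ASGs for master and replica. The instances
of each ASG register a target_group instance attribute (master or
replica), and each task definition carries a placement constraint that
matches only its own target_group.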
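
The separation relies on a custom ECS instance attribute that each launch
configuration appends to the ECS agent configuration. The following is a
minimal sketch of that line, written to a scratch file rather than to
/etc/ecs/ecs.config; the helper name write_target_group is hypothetical and
used only for illustration:

```shell
#!/bin/bash
# Append the ECS instance attribute for a given role ("master" or "replica")
# to an agent configuration file.
# Hypothetical helper, for illustration only; the change itself writes this
# line directly from the launch configuration user data.
write_target_group() {
  local role="$1"
  local config_file="$2"
  echo "ECS_INSTANCE_ATTRIBUTES={\"target_group\":\"${role}\"}" >> "${config_file}"
}

# Usage: write to a scratch file and inspect the result.
scratch_config="$(mktemp)"
write_target_group replica "${scratch_config}"
cat "${scratch_config}"
rm -f "${scratch_config}"
```

A task definition whose placement constraint is
"attribute:target_group == replica" can then only be placed on instances
that registered with this attribute.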

Since the cluster is now composed of multiple autoscaling groups
rather than one, the CLUSTER_DESIRED_CAPACITY variable can no longer be
used to specify the size of the cluster as a whole, as each autoscaling
group has its own size. The variable is therefore removed.

The master ASG is defined as follows:
* minimum:1 - desired:1 - maximum: configurable via MASTER_MAX_COUNT

The replica ASG cannot scale yet (this is on the roadmap[2]), so its
size is fixed:
* minimum:1 - desired:1 - maximum:1
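
Because the master group's minimum and desired sizes are both fixed at 1,
a MASTER_MAX_COUNT below 1 cannot satisfy the autoscaling requirement that
minimum <= desired <= maximum. A pre-flight check along the following lines
could catch this before the stack is created. This is a sketch only:
check_master_max_count is a hypothetical helper and is not part of this
change.

```shell
#!/bin/bash
# Succeed only when the argument is a whole number greater than or equal to 1.
# Hypothetical helper, not part of the Makefile touched by this change.
check_master_max_count() {
  local value="$1"
  case "${value}" in
    ''|*[!0-9]*) return 1 ;;   # reject empty input and anything non-numeric
  esac
  [ "${value}" -ge 1 ]
}

# Usage: validate the variable before creating the cluster stack.
if check_master_max_count "${MASTER_MAX_COUNT:-}"; then
  echo "MASTER_MAX_COUNT is valid"
else
  echo "MASTER_MAX_COUNT must be an integer >= 1" >&2
fi
```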

[1] https://aws.amazon.com/premiumsupport/knowledge-center/target-connection-fails-load-balancer/
[2] https://bugs.chromium.org/p/gerrit/issues/detail?id=13619

Bug: Issue 13879
Change-Id: I306610e9b5720363f730765d9f9332d0b7f52814
diff --git a/master-slave/Makefile b/master-slave/Makefile
index 89f4526..071378b 100644
--- a/master-slave/Makefile
+++ b/master-slave/Makefile
@@ -38,6 +38,9 @@
 ifdef VPC_CIDR
 		$(eval CLUSTER_OPTIONAL_PARAMS := $(CLUSTER_OPTIONAL_PARAMS) ParameterKey=VPCCIDR,ParameterValue=$(VPC_CIDR))
 endif
+ifdef MASTER_MAX_COUNT
+		$(eval CLUSTER_OPTIONAL_PARAMS := $(CLUSTER_OPTIONAL_PARAMS) ParameterKey=MasterMaxCount,ParameterValue=$(MASTER_MAX_COUNT))
+endif
 
 	$(AWS_FC_COMMAND) create-stack \
 		--stack-name $(CLUSTER_STACK_NAME) \
@@ -45,7 +48,6 @@
 		--template-body file://`pwd`/$(CLUSTER_TEMPLATE) \
 		--region $(AWS_REGION) \
 		--parameters \
-		ParameterKey=DesiredCapacity,ParameterValue=$(CLUSTER_DESIRED_CAPACITY) \
 		ParameterKey=ECSKeyName,ParameterValue=$(CLUSTER_KEYS) \
 		ParameterKey=TemplateBucketName,ParameterValue=$(TEMPLATE_BUCKET_NAME) \
 		ParameterKey=InternetGatewayIdProp,ParameterValue=$(INTERNET_GATEWAY_ID) \
diff --git a/master-slave/README.md b/master-slave/README.md
index 2581e05..f08f0f2 100644
--- a/master-slave/README.md
+++ b/master-slave/README.md
@@ -76,14 +76,13 @@
 * `DASHBOARD_STACK_NAME` : Optional. Name of the dashboard stack. `gerrit-dashboard` by default.
 * `MASTER_SUBDOMAIN`: Optional. Name of the master sub domain. `gerrit-master-demo` by default.
 * `SLAVE_SUBDOMAIN`: Optional. Name of the slave sub domain. `gerrit-slave-demo` by default.
-* `CLUSTER_DESIRED_CAPACITY`: Optional. Number of EC2 instances composing the cluster. `1` by default.
 * `GERRIT_MASTER_INSTANCE_ID`: Optional. Identifier for the Gerrit master instance.
 "gerrit-master-slave-MASTER" by default.
 * `GERRIT_SLAVE_INSTANCE_ID`: Optional. Identifier for the Gerrit slave instance.
 "gerrit-master-slave-SLAVE" by default.
 
 *NOTE*: if you are planning to run the monitoring stack, set the
-`CLUSTER_DESIRED_CAPACITY` value to at least 2. The resources provided by
+`MASTER_MAX_COUNT` value to at least 2. The resources provided by
 a single EC2 instance won't be enough for all the services that will be ran*
 
 * `PROMETHEUS_SUBDOMAIN`: Optional. Prometheus subdomain. For example: `<AWS_PREFIX>-prometheus`
diff --git a/master-slave/cf-cluster.yml b/master-slave/cf-cluster.yml
index f2668a2..e56e341 100644
--- a/master-slave/cf-cluster.yml
+++ b/master-slave/cf-cluster.yml
@@ -6,14 +6,9 @@
   TemplateBucketName:
     Description: S3 bucket containing cloudformation templates
     Type: String
-  DesiredCapacity:
+  MasterMaxCount:
+    Description: The maximum number of EC2 instances in the master autoscaling group
     Type: Number
-    Default: '1'
-    Description: Number of EC2 instances to launch in your ECS cluster.
-  MaxSize:
-    Type: Number
-    Default: '6'
-    Description: Maximum number of EC2 instances that can be launched in your ECS cluster.
   ECSAMI:
     Description: AMI ID
     Type: AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>
@@ -77,24 +72,23 @@
           LogGroupName: !Ref AWS::StackName
           RetentionInDays: 14
 
-  # Autoscaling group. This launches the actual EC2 instances that will register
-  # themselves as members of the cluster, and run the docker containers.
-  ECSAutoScalingGroup:
+  MasterASG:
     Type: AWS::AutoScaling::AutoScalingGroup
     Properties:
       VPCZoneIdentifier:
         - !GetAtt ECSTaskNetworkStack.Outputs.PublicSubnetOneRef
-      LaunchConfigurationName: !Ref 'ContainerInstances'
+      LaunchConfigurationName: !Ref 'MasterLaunchConfiguration'
       MinSize: '1'
-      MaxSize: !Ref 'MaxSize'
-      DesiredCapacity: !Ref 'DesiredCapacity'
+      MaxSize: !Ref MasterMaxCount
+      DesiredCapacity: '1'
     CreationPolicy:
       ResourceSignal:
         Timeout: PT15M
     UpdatePolicy:
       AutoScalingReplacingUpdate:
         WillReplace: 'true'
-  ContainerInstances:
+
+  MasterLaunchConfiguration:
     Type: AWS::AutoScaling::LaunchConfiguration
     Properties:
       ImageId: !Ref 'ECSAMI'
@@ -106,9 +100,10 @@
         Fn::Base64: !Sub |
           #!/bin/bash -xe
           echo ECS_CLUSTER=${ECSCluster} >> /etc/ecs/ecs.config
+          echo ECS_INSTANCE_ATTRIBUTES={\"target_group\":\"master\"} >> /etc/ecs/ecs.config
           # Make sure latest version of the helper scripts are installed as per recommendation:
           # https://github.com/awsdocs/aws-cloudformation-user-guide/blob/master/doc_source/cfn-helper-scripts-reference.md#using-the-latest-version
-          yum install -y aws-cfn-bootstrap
+          yum install -y aws-cfn-bootstrap wget
           # Get the CloudWatch Logs agent
           echo -e "
             {\"logs\":
@@ -116,26 +111,6 @@
                 {\"files\":
                   {\"collect_list\":
                     [
-                      {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-slave/_data/httpd_log\",
-                      \"log_group_name\": \"${AWS::StackName}\",
-                      \"log_stream_name\": \"${EnvironmentName}/{instance_id}/slave/httpd_log\",
-                      \"timezone\": \"UTC\"
-                      },
-                      {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-slave/_data/sshd_log\",
-                      \"log_group_name\": \"${AWS::StackName}\",
-                      \"log_stream_name\": \"${EnvironmentName}/{instance_id}/slave/sshd_log\",
-                      \"timezone\": \"UTC\"
-                      },
-                      {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-slave/_data/gc_log\",
-                      \"log_group_name\": \"${AWS::StackName}\",
-                      \"log_stream_name\": \"${EnvironmentName}/{instance_id}/slave/gc_log\",
-                      \"timezone\": \"UTC\"
-                      },
-                      {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-slave/_data/audit_log\",
-                      \"log_group_name\": \"${AWS::StackName}\",
-                      \"log_stream_name\": \"${EnvironmentName}/{instance_id}/slave/audit_log\",
-                      \"timezone\": \"UTC\"
-                      },
                       {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-master/_data/replication_log\",
                       \"log_group_name\": \"${AWS::StackName}\",
                       \"log_stream_name\": \"${EnvironmentName}/{instance_id}/master/replication_log\",
@@ -167,12 +142,89 @@
               }
             }" >> /home/ec2-user/gerritlogsaccess.json
           # Install the CloudWatch Logs agent
-          yum install -y wget
           wget https://s3.amazonaws.com/amazoncloudwatch-agent/centos/amd64/latest/amazon-cloudwatch-agent.rpm
           rpm -U ./amazon-cloudwatch-agent.rpm
           /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/home/ec2-user/gerritlogsaccess.json -s
           # Signal to CloudFormation aws-cfn-bootstrap has been correctly updated
-          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource ECSAutoScalingGroup --region ${AWS::Region}
+          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource MasterASG --region ${AWS::Region}
+
+
+  ReplicaASG:
+    Type: AWS::AutoScaling::AutoScalingGroup
+    Properties:
+      VPCZoneIdentifier:
+        - !GetAtt ECSTaskNetworkStack.Outputs.PublicSubnetOneRef
+      LaunchConfigurationName: !Ref 'ReplicaLaunchConfiguration'
+      MinSize: '1'
+      MaxSize: '1'
+      DesiredCapacity: '1'
+    CreationPolicy:
+      ResourceSignal:
+        Timeout: PT15M
+    UpdatePolicy:
+      AutoScalingReplacingUpdate:
+        WillReplace: 'true'
+
+  ReplicaLaunchConfiguration:
+    Type: AWS::AutoScaling::LaunchConfiguration
+    Properties:
+      ImageId: !Ref 'ECSAMI'
+      SecurityGroups: [!Ref 'EcsHostSecurityGroup']
+      InstanceType: !Ref 'InstanceType'
+      IamInstanceProfile: !Ref 'EC2InstanceProfile'
+      KeyName: !Ref ECSKeyName
+      UserData:
+        Fn::Base64: !Sub |
+          #!/bin/bash -xe
+          echo ECS_CLUSTER=${ECSCluster} >> /etc/ecs/ecs.config
+          echo ECS_INSTANCE_ATTRIBUTES={\"target_group\":\"replica\"} >> /etc/ecs/ecs.config
+
+          # Make sure latest version of the helper scripts are installed as per recommendation:
+          # https://github.com/awsdocs/aws-cloudformation-user-guide/blob/master/doc_source/cfn-helper-scripts-reference.md#using-the-latest-version
+          yum install -y aws-cfn-bootstrap wget
+          # Get the CloudWatch Logs agent
+          echo -e "
+            {\"logs\":
+              {\"logs_collected\":
+                {\"files\":
+                  {\"collect_list\":
+                    [
+                      {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-slave/_data/replication_log\",
+                      \"log_group_name\": \"${AWS::StackName}\",
+                      \"log_stream_name\": \"${EnvironmentName}/{instance_id}/slave/replication_log\",
+                      \"timezone\": \"UTC\"
+                      },
+                      {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-slave/_data/httpd_log\",
+                      \"log_group_name\": \"${AWS::StackName}\",
+                      \"log_stream_name\": \"${EnvironmentName}/{instance_id}/slave/httpd_log\",
+                      \"timezone\": \"UTC\"
+                      },
+                      {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-slave/_data/sshd_log\",
+                      \"log_group_name\": \"${AWS::StackName}\",
+                      \"log_stream_name\": \"${EnvironmentName}/{instance_id}/slave/sshd_log\",
+                      \"timezone\": \"UTC\"
+                      },
+                      {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-slave/_data/gc_log\",
+                      \"log_group_name\": \"${AWS::StackName}\",
+                      \"log_stream_name\": \"${EnvironmentName}/{instance_id}/slave/gc_log\",
+                      \"timezone\": \"UTC\"
+                      },
+                      {\"file_path\": \"/var/lib/docker/volumes/gerrit-logs-slave/_data/audit_log\",
+                      \"log_group_name\": \"${AWS::StackName}\",
+                      \"log_stream_name\": \"${EnvironmentName}/{instance_id}/slave/audit_log\",
+                      \"timezone\": \"UTC\"
+                      }
+                    ]
+                  }
+                }
+              }
+            }" >> /home/ec2-user/gerritlogsaccess.json
+          # Install the CloudWatch Logs agent
+          wget https://s3.amazonaws.com/amazoncloudwatch-agent/centos/amd64/latest/amazon-cloudwatch-agent.rpm
+          rpm -U ./amazon-cloudwatch-agent.rpm
+          /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/home/ec2-user/gerritlogsaccess.json -s
+          # Signal to CloudFormation aws-cfn-bootstrap has been correctly updated
+          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource ReplicaASG --region ${AWS::Region}
 
   EC2InstanceProfile:
     Type: AWS::IAM::InstanceProfile
diff --git a/master-slave/cf-service-master.yml b/master-slave/cf-service-master.yml
index 3c1f39b..ba08b97 100644
--- a/master-slave/cf-service-master.yml
+++ b/master-slave/cf-service-master.yml
@@ -228,6 +228,9 @@
             TaskRoleArn: !GetAtt ECSTaskExecutionRoleStack.Outputs.TaskExecutionRoleRef
             ExecutionRoleArn: !GetAtt ECSTaskExecutionRoleStack.Outputs.TaskExecutionRoleRef
             NetworkMode: bridge
+            PlacementConstraints:
+              - Expression: !Sub 'attribute:target_group == master'
+                Type: "memberOf"
             ContainerDefinitions:
                 - Name: !Ref GerritServiceName
                   Essential: true
diff --git a/master-slave/cf-service-slave.yml b/master-slave/cf-service-slave.yml
index 80014f6..44f19c2 100644
--- a/master-slave/cf-service-slave.yml
+++ b/master-slave/cf-service-slave.yml
@@ -225,6 +225,9 @@
             TaskRoleArn: !GetAtt ECSTaskExecutionRoleStack.Outputs.TaskExecutionRoleRef
             ExecutionRoleArn: !GetAtt ECSTaskExecutionRoleStack.Outputs.TaskExecutionRoleRef
             NetworkMode: bridge
+            PlacementConstraints:
+              - Expression: !Sub 'attribute:target_group == replica'
+                Type: "memberOf"
             ContainerDefinitions:
                 - Name: !Ref GerritServiceName
                   Essential: true
diff --git a/master-slave/setup.env.template b/master-slave/setup.env.template
index d87bbaa..c14f6a8 100644
--- a/master-slave/setup.env.template
+++ b/master-slave/setup.env.template
@@ -1,4 +1,4 @@
-CLUSTER_DESIRED_CAPACITY:=1
+MASTER_MAX_COUNT=2
 CLUSTER_INSTANCE_TYPE:=m4.xlarge
 SERVICE_MASTER_STACK_NAME:=$(AWS_PREFIX)-service-master
 SERVICE_SLAVE_STACK_NAME:=$(AWS_PREFIX)-service-slave