Skip to content

davidclin/cloudcustodian-policies

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

David Lin's Cloud Custodian Policies

Policies in Production

Policy Description
iam-user-DeleteAccessKey.yml
Sends Slack notification when an IAM user is deleted along with its AWS access key id for auditing purposes.
health-event-issues.yml
Sends Slack and email notification when Personal Health Dashboard Operational Issues are found. Useful to get early visibility into AWS reported service outages. | (Optional) Replace periodic polling with event driven CW Rule Event Pattern | resources.json example output
phd-notifications.yml
Sends Slack notification when Personal Health Dashboard 'Other notifications' are found less than 1 day old.
get-resources.yml
Vanilla policy with no filter or actions defined. Used to quickly generate resources.json file to view all parameters available for given resource and generate a custom report against.
cfn-garbage-collection-audit.yml
Policy that garbage collects old CloudFormation stacks after specific number of days and that match a user specified string. Useful for destroying CloudFormation stacks that should be short lived such as lab, training, or test environments.
offhours.yml
Starts and stops instances during offhours via Lambda function. Instances filtered on presence of maid_offhours tag or StartAfterHours/StopAfterHours custom tags. (See Offhour Examples)
unused-sgroup-audit.yml
Retrieves all unused security groups that match regex, deletes them, then sends notifications.
iam-user-set-groups.yml
Use to add/remove a list of users to an IAM group.
s3-public-audit.yml
Sends notification when public S3 bucket is created.
copy-instance-tags.yml
Periodically copies tags from EC2 instances to respective EBS volumes.
public-instance-audit.yml
Sends notification when EC2 instance is launched with a Public IP address or attached to a Public subnet.
mfa-audit.yml Sends reminder to Slack channel so users who are in the Administrators group don't forget to enable MFA to comply with business security policies. If MFA remains disabled after 5 days of the user create date, console access is disabled and access keys are deleted.
termination-protection-audit.yml Sends email and Slack notification when EC2 instances in whitelist are found with termination protection disabled.
team-tag-ec2-audit.yml Retrieves all EC2 instances with absent or empty Team tag and sends notification.
team-tag-s3-audit.yml Retrieves all S3 buckets with absent or empty Team tag and sends notification.
s3-service-limit-audit.yml Monitors S3 service limits based on user threshhold and sends notifications.
s3-server-access-logging.yml Enables S3 server access logging with TargetBucket set to s3-access-log-account_number-region and PrefixTarget set to name of bucket.
s3-target-bucket-audit.yml Checks S3 server access logging target bucket name is in compliance. Supplements s3-server-access-logging.yml policy
s3-prevent-bucket-creation.yml Prevents creation of S3 buckets.
ebs-autocleanup-tag.yml Tags ebs volume with name 'AutoCleanup' and sets value to 'false' unless volume is already tagged with 'true'. This policy is used with the ebs-garbage-collection policy.
ebs-garbage-collection-with-tags.yml Deletes unattached EBS volumes older than 1 day (24hrs) that have the tag AutoCleanup set to true.
iam-ec2-policy-check.yml Checks IAM policies that have EC2 related full access permissions.
iam-user-audit.yml Monitors IAM users not in Ec2InstanceLaunchers group but found with AmazonEC2FullAccess managed policy attached.
iam-role-with-managed-policy-audit.yml Monitors IAM roles with the AmazonEC2FullAccess managed policy among others.
iam-user-administrator-access-audit.yml Monitors non-whitelisted IAM users with Administrator access.
iam-user-tagged-resources-audit.yml Retrieves list of AWS resources that belong to tag:Owner
ec2-health-event.yml Sends email and Slack notifications when an EC2 schedule event occurs.
iam-policy-account-audit.yml Sends email and Slack notifications when an IAM policy is found with escalated permissions.
iam-policy-account-Summary-audit.yml Sends email and Slack summary notifications of IAM policies with escalated permissions
iam-policy-CreatePolicy-audit.yml Sends email and Slack summary notifications of IAM Policies with escalated permissions based on CreatePolicy CloudTrail event. Requires a custom iam.py c7n_mailer site-package which can be obtained in the custom-site-packages directory of this repo.
iam-policy-CreatePolicyVersion-audit.yml Sends email and Slack summary notifications of IAM Policies with escalated permissions based on CreatePolicyVersion CloudTrail event. Requires a custom iam.py c7n_mailer site-package which can be obtained in the custom-site-packages directory of this repo.
iam-user-UpdateAccessKey-offboarding.yml Sends Slack notifications when IAM user's access key(s) get deactivated
iam-user-DeleteLoginProfile-offboarding.yml Sends Slack notifications when IAM user's management console gets disabled
acm-certificate-audit.yml Retrieves list of ACM certificates with expiration dates less than 60 days and sends email/Slack notification
ec2-garbage-collection-audit.yml Terminate instances with user defined name tag and age greater than 1 hour. Send email/Slack notifications.
acm-certificate-audit.yml Send notifications when ACM certificates with expiration dates less than 60 days are matched.
acm-certificate-cross-accounts-audit.yml c7n-org policy used to send notifications when ACM certificates with expiration dates less than 60 days are matched.
auto-team-tag.yml Auto tag EC2 instance with Team tag based on owner.
autotag-owner.yml Auto tag EC2 instance based on owner.
sgroup-audit.yml Retrieves all sgroups that match filter. The in-line documentation for the security-groups resource is buggy but this policy works as expected for ingress rules. Egress rules were not tested.
tag-audit.yml Retrieves list of resources that match tag.
subnet-ip-address-usage-audit.yml Sends notication via SES/Slack when private subnets in specified regions reach below 10 available IP addresses in specified region
get_lambda_runtime_audit.yml Retrieves lambdas based on runtime

Shell Scripts

Script Description
report.sh Invokes iam-user-tagged-resources-audit.yml then writes reports to file.

Policies in Test

Policy Description
ssm-managed-instance.yml Retrieves SSM managed instance attributes
ec2.yml Retrieves instances that match on specified tag name and instance type.
elasticsearch-find-all-domains.yml Finds Elasticsearch domaina and retrieves attributes.
new-user-audit.yml Retrieves iam users in specified group with MFA disabled in the last 30 days
termination-protection-list.yml Retrieves list of all EC2 instances with termination protection enabled.
security-groups-unused.yml Retrieves unused security groups using regex
stopped-instances.yml Retrieves list of all stopped instances in specific VPC. Can be further customized to match other criteria.
security-groups-unused-notify.yml Retrieves unused security groups using regex and notifies via email
iam.yml Retrieves iam users using regex
mfa.yml Retrieves iam users with MFA enabled
roles.yml Retrieves unused roles on EC2, Lambda, and ECS
admin-group.yml Retrieves users in the group named 'Administrators'
mfa-unused.yml Retrieves users who have MFA disabled in the group named 'Administrators'
emailer.yml Sends email notification via Simple Email Service (SES) using notify action
ebs-garbage-collection.yml Deletes all unattached volumes
ebs-garbage-collection-lambda.yml Deletes all unattached volumes using Lambda function
public-subnet-instance-audit-notify.yml Sends email notification via SES when EC2 instance launches in a public subnet
public-subnet-instance-audit-whitelist.yml Lambda that sends email notification via SES when EC2 instance launches in a public subnet and is NOT in the whitelist
mark-unused-sgroups.yml Mark unused security groups for deletion after N days ; to be used with delete-marked-sgroups.yml
delete-marked-sgroups.yml Unmarks used security groups that were marked for deletion then deletes remaining marked security
slack-notify.yml Slack example

Cloud Custodian Architecture and AWS Services

Getting Started

Quick Install
*** Linux ***
$ sudo apt-get install python3-venv
$ python3 -m venv custodian
$ source custodian/bin/activate
$ pip install pip --upgrade           #Upgrade pip
$ pip install setuptools --upgrade    #Upgrade setuptools
(custodian) $ pip install c7n         #Install AWS package
(custodian) $ pip install c7n-mailer  #Install mailer
(custodian) $ pip install c7n-org     #Install c7n-org

Note: 
When c7n-mailer is installed using pip: 

#Find the msg-templates directory
sudo find / -name default.j2  

#Copy email/Slack templates from this repo to the c7n_mailer/msg-templates directory
cp ~/cloudcustodian/msg-templates/*.* ~/custodian/lib/python3.6/site-packages/c7n_mailer/msg-templates/   

#Copy/update the mailer.yml from this repo to the c7n-mailer mailer directory
3) cp ~/cloudcustodian/policies/mailer.yml to /home/ubuntu/cloudcustodian/mailer/mailer.yml


*** Verify Installation ***
$ c7n-mailer
$ custodian
$ c7n-org

For more info, check out Cloud Custodian in GitHub

Quick Upgrade
*** Linux ***
$ source custodian/bin/activate
$ pip list --outdated                 #List outdated modules
$ pip install pip --upgrade           #Upgrade pip
$ pip install setuptools --upgrade    #Upgrade setuptools
$ pip install c7n --upgrade           #Upgrade c7n
$ pip install c7n-mailer --upgrade    #Upgrade c7n-mailer

Note: 
Repeat for source 'c7n_org/bin/activate' 

*** Verify Upgrade ***
$ pip list --outdated                 #c7n, c7n-mailer, setuptools, and c7n-org should not appear in output
Quick Install (Deprecated; for historical purposes)
*** Install repository***
$ git clone https://github.com/capitalone/cloud-custodian

*** Install dependencies (with virtualenv) ***
$ virtualenv c7n_mailer
$ source c7n_mailer/bin/activate
$ cd cloud-custodian/tools/c7n_mailer
$ pip install -r requirements.txt

Note: If you upgrade PIP and encounter issues related to "pip ImportError: cannot import name 'main' after update",
save yourself some grief and simply remove /usr/bin/pip and re-install pip.

*** Install extensions ***
$ python setup.py develop

*** Verify Installation ***
$ c7n-mailer
$ custodian

*** Upgrade AWS CLI ***
$ sudo pip install awscli --upgrade

*** Upgrade c7n-mailer runtime ***
Step 1 - Add following line to mailer.yml:
         runtime: python3.7
Step 2 - $ c7n-mailer --config mailer.yml --update-lambda

For more info, check out Cloud Custodian in GitHub

Usage

Getting Started
Cloud Custodian must be run within a virtual environment.

$ cd ~
$ virtualenv c7n_mailer/bin/activate
$ cd cloudcustodian  (this is the IE/cloudcustodian repo where all the policies reside)

As a test, try
$ custodian run -s out mfa.yml
$ custodian report -s out mfa.yml --format grid

Cloud Custodian will create a log file in the ~/cloudcustodian/out/ subdirectory IF there are any matches. 

Environment Settings

mailer.yml
# Which queue should we listen to for messages
queue_url: https://sqs.us-east-1.amazonaws.com/1234567890/sandbox

# Default from address
from_address: [email protected]

# Tags that we should look at for address infomation
contact_tags:
  - OwnerContact
  - OwnerEmail
  - SNSTopicARN

# Standard Lambda Function Config
region: us-east-1
role: arn:aws:iam::1234567890:role/CloudCustodianRole
slack_token: xoxb-bot_token_string_goes_here
Cloud Custodian Lambda AWS Role
Note: Based on your use case, additional permissions may be needed. 
Cloud Custodian will generate a msg if that is the case after invocation.
When in doubt, exercise the principle of least privilege and incrementally
add permissions only as needed.

Trust relationship:
"Service": "lambda.amazonaws.com"

General policy permissions:
health:DescribeEvents
health:DescribeEventDetails
health:DescribeAffectedEntities
iam:PassRole
iam:ListAccountAliases
iam:ListUsers
iam:ListRoles
iam:ListAttachedRolePolicies
iam:ListAttachedUserPolicies
iam:ListPolicies
iam:ListPolicyVersions
iam:ListGroupsForUser
iam:ListAccessKeys
iam:GetCredentialReport
iam:GenerateCredentialReport
iam:GetPolicyVersions
iam:GetPolicy
iam:GetUser
iam:DeleteAccessKey
iam:DeleteLoginProfile
iam:TagUser
ses:SendEmail
ses:SendRawEmail
lambda:CreateFunction
lambda:ListTags
lambda:GetFunction
lambda:AddPermission
lambda:ListFunctions
lambda:UpdateFunctionCode
lambda:CreateAlias
events:DescribeRule
events:PutRule
events:ListTargetsByRule
events:PutTargets
events:ListTargetsByRule
tag:GetResources
cloudwatch:CreateLogGroup
cloudwatch:CreateLogStream
autoscaling:DescribeLaunchConfigurations
s3:GetBucketLocation
s3:GetBucketTagging
s3:GetBucketPolicy
s3:GetReplicationConfiguration
s3:GetBucketVersioning
s3:GetBucketNotification  
s3:GetLifeCycleConfiguration
s3:ListAllMyBuckets
s3:GetBucketAcl
s3:GetBucketWebsite
s3:GetBucketLogging 
s3:DeleteBucket 
s3:PutBucketTagging
es:DescribeElasticsearchDomains
es:ListDomainsNames
es:ListTags
es:AddTags
cloudwatch:DescribeAlarmsForMetric
cloudwatch:PutMetricAlarm
Slack Oauth Permissions for App with Bot User
incoming-webhook
channels:history
channels:read
chat:write:bot
chat:write:user
groups:history
groups:read
im:write
users:read
users:read.email

Schemas Used

ebs
(custodian) [hostname]$ custodian schema ebs
aws.ebs:
  actions: [auto-tag-user, copy-instance-tags, delete, detach, encrypt-instance-volumes,
    invoke-lambda, mark, mark-for-op, modify, normalize-tag, notify, post-finding,
    put-metric, remove-tag, rename-tag, snapshot, tag, tag-trim, unmark, untag]
  filters: [and, config-compliance, event, fault-tolerant, health-event, instance,
    kms-alias, marked-for-op, metrics, modifyable, not, or, tag-count, value]
ec2

(custodian) [hostname]$ custodian schema ec2 aws.ec2: actions: [auto-tag-user, autorecover-alarm, invoke-lambda, mark, mark-for-op, modify-security-groups, normalize-tag, notify, post-finding, propagate-spot-tags, put-metric, reboot, remove-tag, rename-tag, resize, set-instance-profile, snapshot, start, stop, tag, tag-trim, terminate, unmark, untag] filters: [and, config-compliance, default-vpc, ebs, ephemeral, event, health-event, image, image-age, instance-age, instance-attribute, instance-uptime, marked-for-op, metrics, network-location, not, offhour, onhour, or, security-group, singleton, state-age, subnet, tag-count, termination-protected, user-data, value, vpc]

elasticsearch
aws.elasticsearch:
  actions: [auto-tag-user, delete, invoke-lambda, mark-for-op, modify-security-groups,
    notify, post-finding, put-metric, remove-tag, tag]
  filters: [and, event, marked-for-op, metrics, not, or, security-group, subnet, value,
    vpc]
iam-role
(custodian) [hostname]$ custodian schema iam-role
aws.iam-role:
  actions: [invoke-lambda, notify, put-metric]
  filters: [and, event, has-inline-policy, has-specific-managed-policy, no-specific-managed-policy,
    not, or, unused, used, value]
iam-user
(custodian) [hostname]$ custodian schema iam-user
aws.iam-user:
  actions: [delete, invoke-lambda, notify, put-metric, remove-keys]
  filters: [access-key, and, credential, event, group, mfa-device, not, or, policy,
    value]
s3
(custodian) [hostname]$ custodian schema s3
aws.s3:
  actions: [attach-encrypt, auto-tag-user, configure-lifecycle, delete, delete-bucket-notification,
    delete-global-grants, encrypt-keys, encryption-policy, invoke-lambda, mark-for-op,
    no-op, notify, post-finding, put-metric, remove-statements, remove-website-hosting,
    set-bucket-encryption, set-inventory, set-statements, tag, toggle-logging, toggle-versioning,
    unmark]
  filters: [and, bucket-encryption, bucket-notification, config-compliance, cross-account,
    data-events, event, global-grants, has-statement, inventory, is-log-target, marked-for-op,
    metrics, missing-policy-statement, missing-statement, no-encryption-statement,
    not, or, value]

security-group
(custodian) [hostname]$ custodian schema security-group
aws.security-group:
  actions: [auto-tag-user, delete, invoke-lambda, mark, mark-for-op, normalize-tag,
    notify, patch, put-metric, remove-permissions, remove-tag, rename-tag, tag, tag-trim,
    unmark, untag]
  filters: [and, default-vpc, diff, egress, event, ingress, json-diff, locked, marked-for-op,
    not, or, stale, tag-count, unused, used, value]

Getting Started

Example info returned for an EC2 health event
{
    "account": "XXXXXXXXXXXX",
    "region": "us-east-1",
    "detail": {
        "eventDescription": [
            {
                "latestDescription": "EC2 has detected degradation of the underlying hardware hosting your Amazon EC2 instance associated with this event in the us-east-1 region. Due to this degradation your instance could already be unreachable. We will stop your instance after 2019-06-26 15:00 UTC.\\n\\nYou can find more information about maintenance events scheduled for your EC2 instances in the AWS Management Console (https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Events)\\n\\n* What will happen to my instance?\\nYour instance will be stopped after the specified retirement date. You can start it again at any time after it’s stopped. Any data on local instance-store volumes will be lost when the instance is stopped or terminated.\\n\\n* What do I need to do?\\nWe recommend that you stop and start the instance which will migrate the instance to a new host. Please note that any data on your local instance-store volumes will not be preserved when you stop and start your instance. For more information about stopping and starting your instance, and what to expect when your instance is stopped, such as the effect on public, private and Elastic IP addresses associated with your instance, see Stop and Start Your Instance in the EC2 User Guide (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Stop_Start.html). However, if you do not need this instance, you can stop it at any time yourself or wait for EC2 to stop it after the retirement date.\\n\\n* Why is EC2 retiring my instance?\\nEC2 may schedule instances for retirement in cases where there is an unrecoverable issue with the underlying hardware. For more information about scheduled retirement events please see the EC2 user guide (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-retirement.html). To avoid single points of failure within critical applications, please refer to our architecture center for more information on implementing fault-tolerant architectures: (http://aws.amazon.com/architecture)\\n\\nIf you have any questions or concerns, you can contact the AWS Support Team on the community forums and via AWS Premium Support at: (http://aws.amazon.com/support)",
                "language": "en_US"
            }
        ],
        "service": "EC2",
        "eventTypeCode": "AWS_EC2_PERSISTENT_INSTANCE_RETIREMENT_SCHEDULED",
        "affectedEntities": [
            {
                "entityValue": "i-XXXXXXXXXXXXXXX",
                "tags": {}
            }
        ],
        "startTime": "Wed, 26 Jun 2019 15:00:00 GMT",
        "eventTypeCategory": "scheduledChange",
        "endTime": "Wed, 26 Jun 2019 15:00:00 GMT",
        "eventArn": "arn:aws:health:us-east-1::event/EC2/AWS_EC2_PERSISTENT_INSTANCE_RETIREMENT_SCHEDULED/AWS_EC2_PERSISTENT_INSTANCE_RETIREMENT_SCHEDULED4e63a1f4-b043-4543-88fa-53f30aeb295b"
    },
    "detail-type": "AWS Health Event",
    "source": "aws.health",
    "version": "0",
    "time": "2019-06-26T15:00:00Z",
    "debug": true,
    "id": "c0b8b90d-eafd-c6fd-0203-568c3c2fdc79",
    "resources": [
        "i-XXXXXXXXXXXXXXXXX" <-- This is an instance ID
    ]
}
Example of EC2 info returned for variable `resources` to Slack template
[{
u'Monitoring': {u'State': u'disabled'}, 
u'Hypervisor': u'xen', 
u'PublicDnsName': u'', 
u'State': {u'Code': 16, u'Name': u'running'}, 
u'EbsOptimized': True, 
u'LaunchTime': u'2019-06-14T18:19:47+00:00', 
u'PrivateIpAddress': u'10.100.8.111', 
u'ProductCodes': [], 
u'VpcId': u'vpc-b7407ed3', 
u'CpuOptions': {u'CoreCount': 16, u'ThreadsPerCore': 2}, 
u'StateTransitionReason': u'', 
u'InstanceId': u'i-XXXXXXXXXXXXXXXXX', 
u'EnaSupport': True, 
u'ImageId': u'ami-XXXXXXXXXXXXXXXXX', 
u'PrivateDnsName': u'ip-10-100-8-111.ec2.internal', 
u'KeyName': u'davids-sample-keyname', 
u'SecurityGroups': [{u'GroupName': u'launch-wizard-6', u'GroupId': u'sg-XXXXXXXXXXXXXXXXX'}], 
u'ClientToken': u'', 
u'SubnetId': u'subnet-0a38466e', 
u'InstanceType': u'p3.8xlarge', 
u'NetworkInterfaces': [{u'Status': u'in-use', u'MacAddress': u'XX:XX:XX:XX:XX:XX', u'SourceDestCheck': True, u'VpcId': u'vpc-XXXXXXXX', u'Description': u'Primary network interface', u'NetworkInterfaceId': u'eni-XXXXXXXXXXXXXXXXX', u'PrivateIpAddresses': [{u'PrivateDnsName': u'ip-10-100-8-111.ec2.internal', u'Primary': True, u'PrivateIpAddress': u'10.100.8.111'}], u'PrivateDnsName': u'ip-10-100-8-111.ec2.internal', u'Attachment': {u'Status': u'attached', u'DeviceIndex': 0, u'DeleteOnTermination': True, u'AttachmentId': u'eni-attach-076666ed57bd1c918', u'AttachTime': u'2019-06-14T18:19:47+00:00'}, u'Groups': [{u'GroupName': u'launch-wizard-6', u'GroupId': u'sg-0f74a46e92b348155'}], u'Ipv6Addresses': [], u'OwnerId': u'', u'SubnetId': u'subnet-0a38466e', u'PrivateIpAddress': u'10.100.8.111'}], u'SourceDestCheck': True, u'Placement': {u'GroupName': u'', u'Tenancy': u'default', u'AvailabilityZone': u'us-east-1d'}, u'CapacityReservationSpecification': {u'CapacityReservationPreference': u'open'}, u'c7n:MatchedFilters': [u'tag:Team', u'tag:Team'], u'BlockDeviceMappings': [{u'DeviceName': u'/dev/sda1', u'Ebs': {u'Status': u'attached', u'DeleteOnTermination': True, u'VolumeId': u'vol-04a5b8d53229ab212', u'AttachTime': u'2019-06-14T18:19:47+00:00'}…
security-groups-unused.yml
(custodian) [hostname]$ custodian run --dryrun -s . security-groups-unused.yml
2018-04-13 20:02:01,043: custodian.policy:INFO policy: security-groups-unused resource:security-group region:us-east-1 count:29 time:0.30

(custodian) [hostname]$ more ./security-groups-unused/resources.json | grep 'GroupName\|GroupId'
(custodian) [hostname]$ more ./security-groups-unused/resources.json | grep GroupName\"\:
    "GroupName": "rds-launch-wizard-5",
    "GroupName": "rds-launch-wizard",
    "GroupName": "rds-launch-wizard-2",
    "GroupName": "launch-wizard-17",
    "GroupName": "launch-wizard-5",
    "GroupName": "launch-wizard-7",
    "GroupName": "launch-wizard-6",
    "GroupName": "launch-wizard-1",
    "GroupName": "rds-launch-wizard-4",
    "GroupName": "launch-wizard-4",
    "GroupName": "launch-wizard-2",
    "GroupName": "launch-wizard-3",
    etc.
iam.yml
(custodian) [ec2-user@ip-10-100-0-195 custodian]$ custodian run --dryrun -s . iam.yml
2018-04-13 22:51:05,472: custodian.policy:INFO policy: iam-user-filter-policy resource:iam-user region:us-east-1 count:1 time:0.01

(custodian) [hostname]$ more ./iam-user-filter-policy/resources.json | grep UserName\"\:
    "UserName": "david.lin",
mfa.yml
(custodian) [hostname]$ custodian run --dryrun mfa.yml -s .
2018-04-13 23:47:40,901: custodian.policy:INFO policy: mfa-user-filter-policy resource:iam-user region:us-east-1 count:15 time:0.01

(custodian) [hostname]$ more ./mfa-user-filter-policy/resources.json | grep UserName\"\:
    "UserName": "username_1",
    "UserName": "username_2,
    "UserName": "username_3",
    "UserName": "username_4",
     etc.
roles.yml
(custodian) [hostname]$ custodian run --dryrun roles.yml -s .
2018-04-14 07:11:22,425: custodian.policy:INFO policy: iam-roles-unused resource:iam-role region:us-east-1 count:55 time:1.92

(custodian) [hostname]$ more ./iam-roles-unused/resources.json | grep RoleName
    "RoleName": "AmazonSageMaker-ExecutionRole-20180412T161207",
    "RoleName": "autotag-AutoTagExecutionRole-KA3LH5ARKJ2E",
    "RoleName": "autotag-AutoTagMasterRole-3VSL2AF3480E",
    "RoleName": "AWS-Cloudera-Infrastructu-ClusterLauncherInstanceR-1HUTDQJUYVGVE",
    etc.
admin-group.yml
(custodian) [hostname]$ custodian run --dryrun admin_group.yml -s .
2018-04-14 07:54:08,198: custodian.policy:INFO policy: iam-users-in-admin-group resource:iam-user region:us-east-1 count:14 time:3.67

(custodian) [hostname]$ more ./iam-users-in-admin-group/resources.json | grep UserName
    "UserName": "username_1",
    "UserName": "username_2",
    "UserName": "username_3",
    "UserName": "username_4",
    etc.
mfa-unused.yml
(custodian) [hostname]$ custodian run --dryrun mfa-unused.yml -s .
2018-04-14 08:13:07,214: custodian.policy:INFO policy: mfa-unused resource:iam-user region:us-east-1 count:2 time:2.54

(custodian) [ec2-user@ip-10-100-0-195 custodian]$ more ./mfa-unused/resources.json | grep UserName
    "UserName": "username_1",
    "UserName": "username_2"
emailer.yml
(custodian) [hostname]$ custodian run -s . emailer.yml
2018-04-23 22:25:12,614: custodian.policy:INFO policy: mfa-unused resource:iam-user region:us-east-1 count:2 time:8.41
2018-04-23 22:25:12,812: custodian.actions:INFO sent message:71ba67dd-731a-4734-bf63-15991754249e policy:mfa-unused template:default.html count:2
2018-04-23 22:25:12,813: custodian.policy:INFO policy: mfa-unused action: notify resources: 2 execution_time: 0.20
public-subnet-instance-audit-notify.yml
(custodian) $ custodian run -s . public-subnet-instance-audit-notify.yml
2018-05-04 01:07:56,937: custodian.policy:INFO Provisioning policy lambda public-subnet-instance-audit-notification

Usage Considerations

Offhour Examples
-------------------------------------------------------
Option 1: Using a Single Tag with key = "maid_offhours"
-------------------------------------------------------
# up mon-fri from 7am-7pm; eastern time
off=(M-F,19);on=(M-F,7);tz=est
# up mon-fri from 6am-9pm; up sun from 10am-6pm; pacific time
off=[(M-F,21),(U,18)];on=[(M-F,6),(U,10)];tz=pt



---------------------------------------------------------------------
Option 2: Using Tags with Names "StartAfterHours" and "StopAfterHours"
---------------------------------------------------------------------
# Using key "StartAfterHours"
# up mon-fri starting 7am; eastern time
on=(M-F,7);tz=est

#Using key "StopAfterHours"
# off mon-fri after 5pm; pacific time
off=(M-F,17);tz=pt



Important Note: When you stop an instance, the data on any instance store volumes is erased. 
                Therefore, if you have any data on instance store volumes that you want to 
                keep, be sure to back it up to persistent storage.

More Examples : http://capitalone.github.io/cloud-custodian/docs/quickstart/offhours.html#offhours
Other Misc Usage Considerations

copy-tag and tag-team policies require addtional enhancements that were added to c7n/tags.py. A modified version that tracks these changes can be found here.

emailer.yml requires the custodian mailer described here.

ebs-garbage-collection.yml can be run across all regions with the --region all option.

For example:

 custodian run --dryrun -s out --region all ebs-garbage-collection.yml
More

offhours.yml is run as a Lambda with CloudWatch periodic scheduler. It filters for EC2 instances tagged with "maid_offhours" and obeys rules set forth in the corresponding value pair per Cloud Custodian Offhours Policy. When specifying on/off/tz values, the values in the policies are overrided by the EC2 instance maid_offhours tag. So you can set the onhour/offhour to anything in the policy and it will not do anything.

emailer.yml requires the custodian mailer described here.

ebs-garbage-collection.yml can be run across all regions with the --region all option.

For example:

 custodian run --dryrun -s out --region all ebs-garbage-collection.yml

AWS resources that support the "health-event" filter

aws.acm-certificate
aws.app-elb
aws.cache-cluster
aws.directconnect
aws.directory
aws.dms-instance
aws.dynamodb-table
aws.ec2
aws.efs
aws.elb
aws.emr
aws.rds
aws.storage-gateway

For more information on the health-event resource filter, 
see https://cloudcustodian.io/docs/aws/resources/aws-common-filters.html#aws-common-filters-health-event

Note, the health-event resource filter is not to be confused with the resource aws.health-event.
The resource aws.health-event is useful but doesn't catch service issues.
The resource aws.health-event is useful for catching 'Other Notifications' that get reported in the Personal Health Dashboard (PHD).

For more information on the resource aws.health-event,
see https://cloudcustodian.io/docs/aws/resources/health-event.html

Troubleshooting Tips

Use 'custodian validate' to find syntax errors
Check 'name' of policy doesn't contain spaces
Check SQS to see if Custodian payload is entering the queue
Check SQS permissions permit other accounts if using c7n-org Check cloud-custodian-mailer lambda CloudWatch rule schedule (5 minute by default)
Check Lambda error logs (this requires CloudWatch logging)
Check role for lambda(s) have adequate permissions
Remember to update the cloud-custodian-mailer lambda when making changes to a policy that uses notifications
Clear the cache if you encounter errors due to stale information (rm ~/.cache/cloud-custodian.cache)

How to decode SQS msg

Grab one of the email messages in the c7n mailer SQS queue before it gets picked up and then unzip and base64 decode it.
You can see the entire metadata file with all the available info being passed to the mailer. 

To get the plan text:
Copy the encoded SQS message into a file (e.g. result)
Then decode the text using:
$ cat result | base64 -d > result.zlib
$ printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - result.zlib | gzip -dc

The printf is just to pad a proper header for gzip. Otherwise gzip will not be able to uncompress it.

*Taken from https://groups.google.com/d/msg/cloud-custodian/z67zuVApHp0/xX81toqVAgAJ

If you're using VIM to create the text file and getting an error "base64: invalid input", try pasting the encoded SQS message into a different text editor like nano instead.

How to retrieve value(s) from 'resources' from SQS msg and include it custom Slack msg

Sometimes the parameter you want to report in Slack is burried in 'resources'.
You can use the "How to decode SQS msg" section above to get the parameter then add lines similar to below
in your Slack template:

Example of extracting the SubnetId from the output of 'resources':
{
  "title":"SubnetId(s)",
  "value":"{{ resources | selectattr('SubnetId') | map(attribute='SubnetId') | list }}"
}

How to get parameters for a given resource

The trick here is to create a filter that will be true when using a type of "value" followed by a key/value pair that doesn't exist. You can then go to the output resources file to get the parameters you can then query against to refine your policy.

policies:
  - name: ssm-managed-instance
    resource: ssm-managed-instance
    description: |
      Cloud Custodian SSM Managed Instances
    comments: |
      Retrieve SSM Managed Instance Attributes
    filters:
      - not:
        - type: value
          key: foo
          value: "foo"

Example report for above policy

$ custodian report -s output ssm-managed-instance.yml --no-default-fields --field InstanceId=InstanceId --field PlatformName=PlatformName --field PlatformType=PlatformType

Log Messages

If you see the following CloudWatch log when sending notifications via Slack, ignore it:

[WARNING]	2018-06-06T23:42:21.321Z	413b5506-69e3-11e8-8a8c-6f167e23dc1a	Error: An error occurred (InvalidCiphertextException) when calling the Decrypt operation: Unable to decrypt slack_token with kms, will assume plaintext.

Canned Code Cheatsheet

Invoking Lambda Funtions
mode:
  type: cloudtrail
  role: arn:aws:iam::929292782238:role/CloudCustodian
  events:
    - CreateBucket
mode:
  type: periodic
  role: arn:aws:iam::929292782238:role/CloudCustodian
  schedule: "rate(15 minutes)"```
mode:
  type: periodic
  schedule: "rate(1 day)"
  role: arn:aws:iam::123456789012:role/lambda-role
  execution-options:
    assume_role: arn:aws:iam::123123123123:role/target-role
    metrics_enabled: false
Sending Notifications via SES and Slack
actions:
 - type: notify
   template: default.html
   slack_template: slack-default
   template_format: 'html'
   priority_header: '5'
   subject: 'Security Audit: Unused Security Groups'
   to:
     - <your-email-address-goes-here>
     - slack://#<slack-channel-name>
   owner_absent_contact:
     - <your-emails-address-goes-here>
   transport:
     type: sqs
     queue: https://sqs.us-east-1.amazonaws.com/1234567890/cloud-cloudcustodian
Filtering with regex and whitelist
filters:
  - not:
    - type: value
      key: "tag:Name"
      value: (MyJenkinsInstance|MyCloudCustodianInstance)
      op: regex
  - and:
    - type: subnet 
      key: "tag:Name"
      value: "david.lin-subnet" 

Canned Emails You Can Send to Users Who Own EC2 Instances with Scheduled Events

Canned Email for instances running on degraded hardware
(include Custodian email notification generated by ec2-health-event.yml policy for context)

Hi <username/team>,

Your instance <instance-id> is scheduled for retirement by AWS on <scheduled-start-date-goes-here>. This is likely due to an underlying AWS infrastructure issue and might mean there are hardware or other issues with the instance. You should migrate any important data off it prior to <scheduled-start-date-goes-here> and replace the instance. 
Canned Email for instances scheduled for reboot
(include Custodian email notification generated by ec2-health-event.yml policy for context)

Hi <username/team>,

Your artifactory instance is scheduled for a system reboot by AWS between July 10, 2019 at 10:00:00 PM UTC-4 and July 11, 2019 at 12:00:00 AM UTC-4. You can reschedule the reboot up until July 26, 2019 at 10:00:00 PM; the other option is to stop and restart the instance. For more information:  https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-instances-status-check_sched.html?icmpid=docs_ec2_console#schedevents_actions_reboot


Updating Latest Changes to Message Templates

When you make changes to a message template and you've deployed the Lambda mailer, you will need to update the mailer.

c7n-mailer --config mailer.yml --update-lambda

Updating Latest Merges to Master

From your virtualenv

cd ~/cloud-custodian
git pull
python setup.py install

This will reflect changes in your virtualenv Python lib such that the schema validation uses the latest fixes/updates.

Running Policy as Cron Job

See Example

crontab

  $ crontab -l
  # Run job every day at 5 pm PST.
  # Clean log at 23:00 pm PST every month to save disk space.
  * 17 * * * /home/ubuntu/cloudcustodian/cron/mfa-audit.sh > /home/ubuntu/cloudcustodian/logs/mfa-audit.log 2>&1
  * 23 * 1-12 * /home/ubuntu/cloudcustodian/cron/cleanlogs.sh
  

mfa-audit.sh

  $ pwd
  /home/ubuntu/cloudcustodian-policies/cron
  $ more mfa-audit.sh
  #!/bin/bash
 PATH=/home/ubuntu/bin:/home/ubuntu/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
  export PATH
  source c7n_mailer/bin/activate
  echo "Running policy..."
  c7n-mailer --config /home/ubuntu/cloudcustodian-policies/mailer.yml --update-lambda && custodian run -c /home/ubuntu/cloudcustodian-policies/mfa-audit.yml -s output
  echo "MFA policy run completed"
  

cleanlogs.sh

  $ pwd
  /home/ubuntu/cloudcustodian-policies/cron
  $ more cleanlogs.sh
  #!/bin/bash
      PATH=/home/ubuntu/bin:/home/ubuntu/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
  export PATH
  echo "Cleaning logs ..."
  rm /home/ubuntu/cloudcustodian/logs/mfa-audit.log
  echo "Log files deleted!"
  

Useful Tool: Quick simple editor for cron schedule expressions.

Lambda Policies 101

Lambda policies can get confusing in a hurry. My advice. RTFM before diving into the weeds!

Lambda Policies

Supported Lambda Mode Types:

  • cloudtrail
  • ec2-instance-state
  • periodic
  • config-rule
When using execution-options:
- Metrics are pushed using the assumed role which may or may not be desired. Use 'metrics_enabled: false' to disable if not desired.
- The mode must be periodic as there are restrictions on where policy executions can run according to the module:
  -- Config: May run in a different region but NOT cross-account
  -- Event: Only run in the SAME region and account
  -- Periodic: May run in a different region AND different account (this is the most flexible)

Cross-Account Notes

  • Cross account is supported in the c7n_org tool via the c7n-org CLI command.
  • c7n-org supports multiple regions via the --region option (i.e. --region all).
  • c7n-org can manage policies across different accounts and restrict the execution of policy by tag (See more).
Example:
(c7n_org) $ c7n-org run -s output -c accounts.yml -u c7n-org-public-instance-audit.yml --region us-west-1 -a "Sandbox"
2018-07-19 18:09:04,624: c7n_org:INFO Ran account:Sandbox region:us-west-1 policy:c7n-org-public-instance-audit matched:17 time:2.88
2018-07-19 18:09:04,633: c7n_org:INFO Policy resource counts Counter({'c7n-org-public-instance-audit': 17})
  • c7n_org includes a tool that auto generates the config file c7n-org uses for accounts using the aws organizations API. ( link )
  • To run policies across multiple AWS accounts, create roles in the cross-accounts that trust a 'primary/governance' account and from the primary/governance account create an instance profile that has the STS assume role to switch to N other accounts.
  • To send email/Slack notifications using the existing SQS mailer queue, add permission to the SQS mailer queue that allows cross-accounts.
  • c7n-org gets credentials from the [default] section of the ~/.aws/credentials and ~/.aws/config files. Support for profile as part of the account config was later introduced in Feb 2018.
  • The cache file can handle multiple regions but you need a separate cache for each account (i.e. --cache /home/custodian/.accountname.cache)
  • Policies can be run locally on EC2 instance or via Lambdas (or containers on k8s/ECS although I haven't tried this)

Example Trust Relationship Policy for role OrganizationAccountAccessRole in cross account:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<account_id_where_c7n-org_is_invoked>:root",
          "arn:aws:iam::<account_id_where_c7n-org_is_invokded>:role/CloudCustodian"
        ],
        "Service": [
          "lambda.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

How to generate the c7n-org config file for accounts using the AWS org API

$ source c7n_org/bin/activate
$ cd /home/ubuntu/cloud-custodian/tools/c7n_org/scripts/
$ python orgaccounts.py -f accounts.yml

Copy the file accounts.yml to the appropriate directory where your c7n-org policies live.

Worst case scenario, if the python script fails, perform a fresh install of c7n-org 
on another instance and repeat steps above.

See https://pypi.org/project/c7n-org/

Usage examples:
$ export policy=""
$ c7n-org run -s output -c accounts.yml -u $policy --region all
$c7n-org run -s output -c all-accounts.yml -u $policy -t team:devops --region us-east-1

Cross-Account Questions

  • How are Lambda policies run across accounts? The same but you need to remove role: under the mode section. In addition, the CloudCustodian role assigned to the c7n-org instance requires sts:AssumeRole and the role being assumed in the other account(s) must trust the CloudCustodian role assigned to the c7n-org.

  • How is Lambda policy sprawl managed across accounts? [TBD]

Email and Slack Message Templates

The default email and slack templates will show the primary account ID by default. In order to have the templates show cross account ID's, you need to include the variable name 'account_id'.

Email Template Example:

{% if account == "tri-na" %}
     <h2><font color="#505151"> Account {{  "%s - %s" | format(account,region)  }} </h2>
{% else %}
     <h2><font color="#505151"> Cross Account {{ "%s - %s" | format(account_id,region)  }} </h2>
{% endif %}

Slack Template Example:

{
   "title":"Account ID",
   "value":"{{ account_id }}"
},
{
   "title":"Region",
   "value":"{{ region }}"
},

Customizing Slack Message Templates Using Jinja2 Filters

The following example taken from the default Slack message template (slack_default.j2) that returns output of resources looks like this:

{
   "attachments":[
      {
         "fallback":"Cloud Custodian Policy Violation",
         "title":"Custodian",
         "color":"danger",
         "fields":[
            {
               "title":"Resources",
               "value":"{%- for resource in resources -%}
                        {{ format_resource(resource, policy['resource']) | replace('\\"', '"') | replace('"', '\\"'
) }}
                        {%- endfor -%}"
            },
            {
               "title":"Account",
               "value":"{{ account }}"
            },
            {
               "title":"Region",
               "value":"{{ region }}"
            },
            {
               "title":"Violation Description",
               "value":"{{ action['violation_desc'] }}"
            },
            {
               "title":"Action Description",
               "value":"{{ action['action_desc'] }}"
            }
         ]
      }
   ],
   {%- if not recipient.startswith('https://') %}
   "channel":"{{ recipient }}",
   {%- endif -%}
   "username":"Custodian"
}

Sometimes the information returned by the for loop is too verbose and you are only interested in rendering specific key/value pairs.

To extract specific keys from the output of resources, use Jinja2 filters instead.

Here's an example used by the acm-certificate-audit.yml policy:

{
   "attachments":[
      {
         "fallback":"Cloud Custodian ACM Certificate Audit",
         "title":"ACM Certificate Audit",
         "color":"danger",
         "fields":[
            {
               "title":"Description",
               "value":"ACM certificate(s) found nearing their expiration date <60 days"
            },
            {
               "title":"Domain Name(s)",
               "value":"{{ resources | selectattr('DomainName') | map(attribute='DomainName') | list }}"
            },
            {
               "title":"Expiration Date(s)",
               "value":"{{ resources | selectattr('NotAfter') | map(attribute='NotAfter') | list }}"
            },
            {
               "title":"Certificate Arn(s)",
               "value":"{{ resources | selectattr('CertificateArn') | map(attribute='CertificateArn') | list }}"
            },
            {
               "title":"Account ID",
               "value":"{{ account_id }}"
            },
            {
               "title":"Region",
               "value":"{{ region }}"
            },
            {
               "title":"Action Description",
               "value":"See email notification for details."
            }
         ]
      }
   ],
   "channel":"{{ recipient }}",
   "username":"Custodian"
}

c7n-mailer sendgrid Traceback

If you encounter the following traceback related to sendgrid, make sure you're using the latest version.

sendgrid had a major rev thats backwards incompatible; the latest mailer uses the latest version of sendgrid

Traceback

...
File "/home/ubuntu/cloud-custodian/tools/c7n_mailer/c7n_mailer/azure/sendgrid_delivery.py", line 20, in 
    from sendgrid.helpers.mail import Mail, To, From

Fix

$ pip freeze | grep sendgrid
$ pip list --outdated
$ pip install sendgrid -U

General Policy Notes

Cloud Custodian policies can be run

  • serverless as separate Lambdas per account per region
  • as EC2 instance via cron job
  • as EC2 instance via c7n-org
  • as container via ECS Fargate c7n-org
  • Cross account Lambda policies are not supported per Issue #1071 But was recently support per Issue #2533
  • Support for cross-account CloudWatch events is supported per Issue #2005 but requires an AWS CloudWatch footprint in each cross-account which can be stood up using CloudFormation

Resources

Custom msg-templates for c7n_mailer
Slack API and Token
Using ec2-instance-state, lessons around roles, how to view lambda logs, and more
How does garbage collection get enforced?
EC2 Offhours Support
Example offhours support
Lambda Support
AWS CloudWatch Schedule Rules
iam-user feature enhancement
Offhours Examples
CloudWatch Rules Expressions
Adding Custom Fields to Reports

About

Cloud Custodian policy that logs unused security groups

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published