Security Monitoring & Logging Guide

Physical Security Analogy

Imagine a luxury villa with valuable items inside. You have high walls, locked doors, and a safe for the most precious belongings. But what if someone still manages to get inside?

Cameras (Logging): Record everything that happens - who arrived, what they did, where they went
Alarm (Monitoring): Immediately alerts when someone enters a secured area
Security Service (Incident Response): The team that responds to alarms

Security is Layered

We protect the system at every stage of an attack: before, during, and after. Monitoring and logging are the key to detection and response during and after an attack.

AWS Monitoring & Logging Services

Audit & Compliance

AWS CloudTrail

Continuous monitoring and auditing of all API calls in your AWS account.

Records every API call
Who, what, when, from where
Management and Data events
Multi-region trails
Log file integrity validation

Monitoring & Alerting

Amazon CloudWatch

Monitoring and management service for metrics, logs, and alarms.

Metrics from AWS services
Custom metrics
Log groups and log streams
Alarms and notifications
Dashboards

Threat Detection

Amazon GuardDuty

Intelligent threat detection using machine learning.

Anomaly detection
Threat intelligence feeds
Automated findings
Integration with Security Hub

Centralized Overview

AWS Security Hub

Central dashboard for security findings from multiple services.

Findings aggregation
Compliance checks
Priority scoring
Automated remediation (through EventBridge rules and custom actions)

CloudTrail - Basic Configuration

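CloudTrail records API activity in your AWS account: console logins, instance launches, security group changes, and so on. Management events are recorded by default; data events (such as S3 object reads and writes) must be enabled explicitly.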

# Terraform - Multi-region CloudTrail with CloudWatch integration

resource "aws_cloudtrail" "security_trail" {
  name                          = "security-audit-trail"
  s3_bucket_name               = aws_s3_bucket.cloudtrail_logs.id
  include_global_service_events = true
  is_multi_region_trail        = true
  enable_log_file_validation   = true

  # Send logs to CloudWatch
  cloud_watch_logs_group_arn = "${aws_cloudwatch_log_group.cloudtrail.arn}:*"
  cloud_watch_logs_role_arn  = aws_iam_role.cloudtrail_cloudwatch.arn

  # Management events (API calls)
  event_selector {
    read_write_type           = "All"
    include_management_events = true
  }

  # Data events for S3 (optional)
  event_selector {
    read_write_type           = "All"
    include_management_events = false

    data_resource {
      type   = "AWS::S3::Object"
      values = ["arn:aws:s3:::sensitive-bucket/"]
    }
  }

  tags = {
    Name        = "security-audit-trail"
    Environment = "production"
  }
}

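# Current account ID, used below to make the bucket name globally unique
data "aws_caller_identity" "current" {}

# S3 bucket for CloudTrail logs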
resource "aws_s3_bucket" "cloudtrail_logs" {
  bucket = "company-cloudtrail-logs-${data.aws_caller_identity.current.account_id}"

  tags = {
    Name = "cloudtrail-logs"
  }
}

# Encryption for CloudTrail logs
resource "aws_s3_bucket_server_side_encryption_configuration" "cloudtrail_logs" {
  bucket = aws_s3_bucket.cloudtrail_logs.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# CloudWatch Log Group
resource "aws_cloudwatch_log_group" "cloudtrail" {
  name              = "/aws/cloudtrail/security-audit"
  retention_in_days = 365

  tags = {
    Name = "cloudtrail-logs"
  }
}
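
The configuration above refers to aws_iam_role.cloudtrail_cloudwatch without defining it, and CloudTrail can only deliver to the S3 bucket if a bucket policy lets it write there. The following is a minimal sketch of both pieces, not a hardened setup. The role name, policy names, and the broad AWSLogs/* resource path are assumptions made for illustration; a production policy would normally narrow the path to the account ID and add source ARN conditions.

```hcl
# Bucket policy that lets the CloudTrail service check the ACL and write log files
data "aws_iam_policy_document" "cloudtrail_bucket" {
  statement {
    sid       = "AWSCloudTrailAclCheck"
    actions   = ["s3:GetBucketAcl"]
    resources = [aws_s3_bucket.cloudtrail_logs.arn]

    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
  }

  statement {
    sid       = "AWSCloudTrailWrite"
    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.cloudtrail_logs.arn}/AWSLogs/*"] # assumption: not narrowed to one account

    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "s3:x-amz-acl"
      values   = ["bucket-owner-full-control"]
    }
  }
}

resource "aws_s3_bucket_policy" "cloudtrail_logs" {
  bucket = aws_s3_bucket.cloudtrail_logs.id
  policy = data.aws_iam_policy_document.cloudtrail_bucket.json
}

# Role that CloudTrail assumes to deliver events to CloudWatch Logs
resource "aws_iam_role" "cloudtrail_cloudwatch" {
  name = "cloudtrail-to-cloudwatch" # hypothetical name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "cloudtrail.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "cloudtrail_cloudwatch" {
  name = "write-cloudtrail-logs" # hypothetical name
  role = aws_iam_role.cloudtrail_cloudwatch.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["logs:CreateLogStream", "logs:PutLogEvents"]
      Resource = "${aws_cloudwatch_log_group.cloudtrail.arn}:*"
    }]
  })
}
```

Because the trail cannot be created until the bucket policy exists, you would likely also add depends_on = [aws_s3_bucket_policy.cloudtrail_logs] to the aws_cloudtrail resource.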
                    

CloudWatch Alarms for Security Events

Example: Failed Login Alarm

# Metric filter for failed console logins
resource "aws_cloudwatch_log_metric_filter" "failed_console_login" {
  name           = "FailedConsoleLogin"
  pattern        = "{ ($.eventName = ConsoleLogin) && ($.errorMessage = \"Failed authentication\") }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "FailedConsoleLoginCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}

# Alarm when more than 3 failed logins within 5 minutes
resource "aws_cloudwatch_metric_alarm" "failed_login_alarm" {
  alarm_name          = "FailedConsoleLoginAlarm"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "FailedConsoleLoginCount"
  namespace           = "SecurityMetrics"
  period              = 300  # 5 minutes
  statistic           = "Sum"
  threshold           = 3
  alarm_description   = "More than 3 failed login attempts within 5 minutes"

  alarm_actions = [aws_sns_topic.security_alerts.arn]

  tags = {
    Name = "failed-login-alarm"
  }
}

# SNS topic for security alerts
resource "aws_sns_topic" "security_alerts" {
  name = "security-alerts"
}

# Email subscription (the recipient must confirm it before alerts are delivered)
resource "aws_sns_topic_subscription" "security_email" {
  topic_arn = aws_sns_topic.security_alerts.arn
  protocol  = "email"
  endpoint  = "security-team@company.cz"
}
                    

Other Important Alarms

# Alarm on root user activity
resource "aws_cloudwatch_log_metric_filter" "root_activity" {
  name           = "RootAccountUsage"
  pattern        = "{ $.userIdentity.type = \"Root\" && $.userIdentity.invokedBy NOT EXISTS && $.eventType != \"AwsServiceEvent\" }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "RootAccountUsageCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}

resource "aws_cloudwatch_metric_alarm" "root_usage_alarm" {
  alarm_name          = "RootAccountUsageAlarm"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "RootAccountUsageCount"
  namespace           = "SecurityMetrics"
  period              = 60
  statistic           = "Sum"
  threshold           = 0
  alarm_description   = "Root account was used!"

  alarm_actions = [aws_sns_topic.security_alerts.arn]
}

# Metric filter for Security Group changes
resource "aws_cloudwatch_log_metric_filter" "security_group_changes" {
  name           = "SecurityGroupChanges"
  pattern        = "{ ($.eventName = AuthorizeSecurityGroupIngress) || ($.eventName = AuthorizeSecurityGroupEgress) || ($.eventName = RevokeSecurityGroupIngress) || ($.eventName = RevokeSecurityGroupEgress) || ($.eventName = CreateSecurityGroup) || ($.eventName = DeleteSecurityGroup) }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "SecurityGroupChangeCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}

# Metric filter for IAM policy changes
resource "aws_cloudwatch_log_metric_filter" "iam_policy_changes" {
  name           = "IAMPolicyChanges"
  pattern        = "{ ($.eventName = DeleteGroupPolicy) || ($.eventName = DeleteRolePolicy) || ($.eventName = DeleteUserPolicy) || ($.eventName = PutGroupPolicy) || ($.eventName = PutRolePolicy) || ($.eventName = PutUserPolicy) || ($.eventName = CreatePolicy) || ($.eventName = DeletePolicy) || ($.eventName = AttachRolePolicy) || ($.eventName = DetachRolePolicy) || ($.eventName = AttachUserPolicy) || ($.eventName = DetachUserPolicy) }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "IAMPolicyChangeCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}
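
The security group and IAM policy metric filters above only publish metrics; nothing alerts on them yet. A minimal sketch of matching alarms follows, reusing the security_alerts SNS topic defined earlier. The alarm names, the 5-minute period, and the treat_missing_data setting are assumptions chosen for illustration.

```hcl
# Alarm on any security group change
resource "aws_cloudwatch_metric_alarm" "security_group_change_alarm" {
  alarm_name          = "SecurityGroupChangeAlarm" # hypothetical name
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "SecurityGroupChangeCount"
  namespace           = "SecurityMetrics"
  period              = 300 # 5 minutes
  statistic           = "Sum"
  threshold           = 0
  treat_missing_data  = "notBreaching" # no matching events means no data, not a breach
  alarm_description   = "A security group was created, modified, or deleted"

  alarm_actions = [aws_sns_topic.security_alerts.arn]
}

# Alarm on any IAM policy change
resource "aws_cloudwatch_metric_alarm" "iam_policy_change_alarm" {
  alarm_name          = "IAMPolicyChangeAlarm" # hypothetical name
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "IAMPolicyChangeCount"
  namespace           = "SecurityMetrics"
  period              = 300 # 5 minutes
  statistic           = "Sum"
  threshold           = 0
  treat_missing_data  = "notBreaching"
  alarm_description   = "An IAM policy was created, changed, attached, or detached"

  alarm_actions = [aws_sns_topic.security_alerts.arn]
}
```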
                    

What to Monitor?

Key Security Events

Authentication & Authorization

Failed login attempts (brute force detection)
Root account usage
MFA disabled
New IAM users/roles created
Permission changes
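
As an illustration of the "MFA disabled" item, the sketch below counts IAM calls that remove an MFA device. It assumes the CloudTrail log group and the SecurityMetrics namespace from the earlier examples; the filter and metric names are hypothetical.

```hcl
# Metric filter for MFA devices being deactivated or deleted
resource "aws_cloudwatch_log_metric_filter" "mfa_disabled" {
  name           = "MFADisabled" # hypothetical name
  pattern        = "{ ($.eventName = DeactivateMFADevice) || ($.eventName = DeleteVirtualMFADevice) }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "MFADisabledCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}
```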

Network & Infrastructure

Security group changes
VPC changes
Route table modifications
Network ACL changes
New EC2 instances

Data Access

S3 bucket policy changes
S3 public access
KMS key deletion/disable
RDS snapshot sharing
Secrets Manager access
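
The "KMS key deletion/disable" item can be tracked the same way. This is a sketch under the same assumptions as the earlier filters (CloudTrail log group, SecurityMetrics namespace); the resource and metric names are hypothetical.

```hcl
# Metric filter for KMS keys being disabled or scheduled for deletion
resource "aws_cloudwatch_log_metric_filter" "kms_key_disabled_or_deleted" {
  name           = "KMSKeyDisabledOrDeleted" # hypothetical name
  pattern        = "{ ($.eventSource = \"kms.amazonaws.com\") && (($.eventName = DisableKey) || ($.eventName = ScheduleKeyDeletion)) }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "KMSKeyDisabledOrDeletedCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}
```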

Billing & Account

CloudTrail disabled
AWS Config rule changes
Unusual billing spikes (cryptomining)
Account settings changes
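
For the "CloudTrail disabled" item, one possible approach is to watch for calls that stop or alter a trail and alarm on the first occurrence. The sketch assumes the log group and SNS topic from the earlier examples; the names are hypothetical.

```hcl
# Metric filter for attempts to stop, delete, or reconfigure a trail
resource "aws_cloudwatch_log_metric_filter" "cloudtrail_changes" {
  name           = "CloudTrailChanges" # hypothetical name
  pattern        = "{ ($.eventName = StopLogging) || ($.eventName = DeleteTrail) || ($.eventName = UpdateTrail) }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "CloudTrailChangeCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}

resource "aws_cloudwatch_metric_alarm" "cloudtrail_change_alarm" {
  alarm_name          = "CloudTrailChangeAlarm" # hypothetical name
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "CloudTrailChangeCount"
  namespace           = "SecurityMetrics"
  period              = 60
  statistic           = "Sum"
  threshold           = 0
  treat_missing_data  = "notBreaching"
  alarm_description   = "A CloudTrail trail was stopped, deleted, or updated"

  alarm_actions = [aws_sns_topic.security_alerts.arn]
}
```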

Security Monitoring Best Practices

Logging

Enable CloudTrail in all regions
Retain logs for at least 1 year
Encrypt logs at rest (KMS)
Enable log file integrity validation
VPC Flow Logs for network traffic
Centralize logs into a single account (Security Account)
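
VPC Flow Logs can be enabled with a single resource. The sketch below delivers them to an S3 bucket; the vpc_id variable and the bucket prefix are hypothetical inputs, and delivering to CloudWatch Logs instead would additionally need an IAM role.

```hcl
variable "vpc_id" {
  description = "ID of the VPC to capture traffic for (hypothetical input)"
  type        = string
}

# Destination bucket for flow log files
resource "aws_s3_bucket" "flow_logs" {
  bucket_prefix = "vpc-flow-logs-" # hypothetical prefix
}

# Capture accepted and rejected traffic for the whole VPC
resource "aws_flow_log" "vpc" {
  vpc_id               = var.vpc_id
  traffic_type         = "ALL"
  log_destination_type = "s3"
  log_destination      = aws_s3_bucket.flow_logs.arn
}
```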

Monitoring & Alerting

Alarms on critical security events
Real-time notifications (SNS, Slack, PagerDuty)
Dashboards for the security team
GuardDuty for threat detection
Security Hub for centralized overview
Regular review of findings
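
Turning on GuardDuty and Security Hub takes very little Terraform. This minimal sketch enables both in the provider's current region only; both services are regional, so a real rollout would repeat this per region or use organization-level administration. The resource labels are hypothetical.

```hcl
# Enable GuardDuty threat detection in this region
resource "aws_guardduty_detector" "main" {
  enable = true
}

# Enable Security Hub in this region so it can aggregate findings
resource "aws_securityhub_account" "main" {}
```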

Incident Response

Defined playbooks for different types of incidents
Automated remediation where possible
Regular incident response drills
Post-incident reviews
Documentation of all incidents
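
Automated response usually starts with an EventBridge rule that routes findings to a target. The sketch below forwards GuardDuty findings with severity 7 or higher to the security_alerts SNS topic from the alarm examples; swapping the target for a Lambda function is one way to move from notification to remediation. The rule name and severity cutoff are assumptions, and the SNS topic would also need a policy allowing events.amazonaws.com to publish, which is not shown here.

```hcl
# Match high-severity GuardDuty findings
resource "aws_cloudwatch_event_rule" "guardduty_high_severity" {
  name        = "guardduty-high-severity" # hypothetical name
  description = "GuardDuty findings with severity 7 or higher"

  event_pattern = jsonencode({
    source        = ["aws.guardduty"]
    "detail-type" = ["GuardDuty Finding"]
    detail = {
      severity = [{ numeric = [">=", 7] }] # assumed cutoff
    }
  })
}

# Send matching findings to the security alerts topic
resource "aws_cloudwatch_event_target" "guardduty_to_sns" {
  rule = aws_cloudwatch_event_rule.guardduty_high_severity.name
  arn  = aws_sns_topic.security_alerts.arn
}
```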

What NOT to Do

Disable CloudTrail, even temporarily
Ignore alarms (tune noisy ones instead, to avoid alert fatigue)
Delete logs before the retention period ends
Make logs accessible to everyone (apply least privilege)
Leave production unmonitored
Operate without an incident response plan

