Why Do We Need Security Monitoring?
Even with the strongest security measures, we must be prepared for situations where an attack is attempted or actually carried out.
Logging is like security cameras in your building - they record what happens. Monitoring is the alarm system - it alerts you when something suspicious occurs.
Without these tools, you won't know that someone stole your credentials until it's too late.
Physical Security Analogy
Imagine a luxury villa with valuable items inside. You have high walls, locked doors, and a safe for the most precious belongings.
But what if someone still manages to get inside?
- Cameras (Logging): Record everything that happens - who arrived, what they did, where they went
- Alarm (Monitoring): Immediately alerts when someone enters a secured area
- Security Service (Incident Response): The team that responds to alarms
Security is Layered
We protect the system at every stage of an attack: before, during, and after. Prevention covers the first stage; monitoring and logging are what make detection and response possible during and after an attack.
AWS Monitoring & Logging Services
CloudTrail - Basic Configuration
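CloudTrail records API activity in your AWS account: console sign-ins, instance launches, security group changes. Management events are logged by default; data events (such as reads and writes of S3 objects) have to be enabled explicitly, as the second event selector in the configuration does.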
# Terraform - Multi-region CloudTrail with CloudWatch integration
resource "aws_cloudtrail" "security_trail" {
  name                          = "security-audit-trail"
  s3_bucket_name                = aws_s3_bucket.cloudtrail_logs.id
  include_global_service_events = true
  is_multi_region_trail         = true
  enable_log_file_validation    = true

  # Send logs to CloudWatch (CloudTrail requires the ":*" log stream wildcard)
  cloud_watch_logs_group_arn = "${aws_cloudwatch_log_group.cloudtrail.arn}:*"
  cloud_watch_logs_role_arn  = aws_iam_role.cloudtrail_cloudwatch.arn

  # Management events (API calls)
  event_selector {
    read_write_type           = "All"
    include_management_events = true
  }

  # Data events for S3 (optional)
  event_selector {
    read_write_type           = "All"
    include_management_events = false

    data_resource {
      type   = "AWS::S3::Object"
      values = ["arn:aws:s3:::sensitive-bucket/"]
    }
  }

  tags = {
    Name        = "security-audit-trail"
    Environment = "production"
  }
}
# Current account ID, used to make the bucket name unique
data "aws_caller_identity" "current" {}

# S3 bucket for CloudTrail logs
resource "aws_s3_bucket" "cloudtrail_logs" {
  bucket = "company-cloudtrail-logs-${data.aws_caller_identity.current.account_id}"

  tags = {
    Name = "cloudtrail-logs"
  }
}
# Encryption for CloudTrail logs
resource "aws_s3_bucket_server_side_encryption_configuration" "cloudtrail_logs" {
  bucket = aws_s3_bucket.cloudtrail_logs.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
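CloudTrail can only deliver log files if the bucket policy allows it, and this section does not show such a policy. The following is a minimal sketch based on the standard CloudTrail bucket policy. It assumes the `aws_s3_bucket.cloudtrail_logs` bucket from this section and a `data "aws_caller_identity" "current"` data source; the resource names `cloudtrail_logs` for the policy document and bucket policy are illustrative.

```hcl
# Sketch: bucket policy that lets CloudTrail write to the log bucket.
# Assumes data.aws_caller_identity.current is declared elsewhere.
data "aws_iam_policy_document" "cloudtrail_logs" {
  # CloudTrail checks the bucket ACL before delivering logs
  statement {
    sid       = "AWSCloudTrailAclCheck"
    effect    = "Allow"
    actions   = ["s3:GetBucketAcl"]
    resources = [aws_s3_bucket.cloudtrail_logs.arn]

    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
  }

  # CloudTrail writes log files under AWSLogs/<account-id>/
  statement {
    sid       = "AWSCloudTrailWrite"
    effect    = "Allow"
    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.cloudtrail_logs.arn}/AWSLogs/${data.aws_caller_identity.current.account_id}/*"]

    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "s3:x-amz-acl"
      values   = ["bucket-owner-full-control"]
    }
  }
}

resource "aws_s3_bucket_policy" "cloudtrail_logs" {
  bucket = aws_s3_bucket.cloudtrail_logs.id
  policy = data.aws_iam_policy_document.cloudtrail_logs.json
}
```

If you use this, the trail should be created after the policy, for example by adding `depends_on = [aws_s3_bucket_policy.cloudtrail_logs]` to the `aws_cloudtrail` resource.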
# CloudWatch Log Group
resource "aws_cloudwatch_log_group" "cloudtrail" {
  name              = "/aws/cloudtrail/security-audit"
  retention_in_days = 365

  tags = {
    Name = "cloudtrail-logs"
  }
}
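The trail references `aws_iam_role.cloudtrail_cloudwatch`, which is not defined in this section. One possible definition is sketched below: a role that CloudTrail can assume, with permission to create log streams and put events in the log group above. The role and policy name `cloudtrail-to-cloudwatch` is an assumption; adjust it to your naming scheme.

```hcl
# Sketch: role that CloudTrail assumes to write into the CloudWatch log group.
resource "aws_iam_role" "cloudtrail_cloudwatch" {
  name = "cloudtrail-to-cloudwatch" # hypothetical name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "cloudtrail.amazonaws.com" }
    }]
  })
}

# Only the two actions CloudTrail needs, scoped to this log group's streams
resource "aws_iam_role_policy" "cloudtrail_cloudwatch" {
  name = "cloudtrail-to-cloudwatch" # hypothetical name
  role = aws_iam_role.cloudtrail_cloudwatch.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["logs:CreateLogStream", "logs:PutLogEvents"]
      Resource = "${aws_cloudwatch_log_group.cloudtrail.arn}:*"
    }]
  })
}
```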
CloudWatch Alarms for Security Events
Example: Failed Login Alarm
# Metric filter for failed console logins
resource "aws_cloudwatch_log_metric_filter" "failed_console_login" {
  name           = "FailedConsoleLogin"
  pattern        = "{ ($.eventName = ConsoleLogin) && ($.errorMessage = \"Failed authentication\") }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "FailedConsoleLoginCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}
# Alarm when more than 3 failed logins within 5 minutes
resource "aws_cloudwatch_metric_alarm" "failed_login_alarm" {
  alarm_name          = "FailedConsoleLoginAlarm"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "FailedConsoleLoginCount"
  namespace           = "SecurityMetrics"
  period              = 300 # 5 minutes
  statistic           = "Sum"
  threshold           = 3
  alarm_description   = "More than 3 failed login attempts within 5 minutes"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]

  tags = {
    Name = "failed-login-alarm"
  }
}
# SNS topic for security alerts
resource "aws_sns_topic" "security_alerts" {
  name = "security-alerts"
}

# Email subscription (the recipient must confirm it before alerts are delivered)
resource "aws_sns_topic_subscription" "security_email" {
  topic_arn = aws_sns_topic.security_alerts.arn
  protocol  = "email"
  endpoint  = "security-team@company.cz"
}
Other Important Alarms
# Metric filter and alarm for root user activity
resource "aws_cloudwatch_log_metric_filter" "root_activity" {
  name           = "RootAccountUsage"
  pattern        = "{ $.userIdentity.type = \"Root\" && $.userIdentity.invokedBy NOT EXISTS && $.eventType != \"AwsServiceEvent\" }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "RootAccountUsageCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}

# Fires on any root activity (sum greater than 0 within 1 minute)
resource "aws_cloudwatch_metric_alarm" "root_usage_alarm" {
  alarm_name          = "RootAccountUsageAlarm"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "RootAccountUsageCount"
  namespace           = "SecurityMetrics"
  period              = 60
  statistic           = "Sum"
  threshold           = 0
  alarm_description   = "Root account was used!"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]
}
# Metric filter for Security Group changes (needs a matching alarm to send alerts)
resource "aws_cloudwatch_log_metric_filter" "security_group_changes" {
  name           = "SecurityGroupChanges"
  pattern        = "{ ($.eventName = AuthorizeSecurityGroupIngress) || ($.eventName = AuthorizeSecurityGroupEgress) || ($.eventName = RevokeSecurityGroupIngress) || ($.eventName = RevokeSecurityGroupEgress) || ($.eventName = CreateSecurityGroup) || ($.eventName = DeleteSecurityGroup) }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "SecurityGroupChangeCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}
# Metric filter for IAM policy changes (needs a matching alarm to send alerts)
resource "aws_cloudwatch_log_metric_filter" "iam_policy_changes" {
  name           = "IAMPolicyChanges"
  pattern        = "{ ($.eventName = DeleteGroupPolicy) || ($.eventName = DeleteRolePolicy) || ($.eventName = DeleteUserPolicy) || ($.eventName = PutGroupPolicy) || ($.eventName = PutRolePolicy) || ($.eventName = PutUserPolicy) || ($.eventName = CreatePolicy) || ($.eventName = DeletePolicy) || ($.eventName = AttachRolePolicy) || ($.eventName = DetachRolePolicy) || ($.eventName = AttachUserPolicy) || ($.eventName = DetachUserPolicy) }"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "IAMPolicyChangeCount"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}
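The security group and IAM policy filters above only produce metrics; nothing is sent until an alarm watches them. A sketch of the two missing alarms follows, modeled on the root usage alarm. The alarm names, the 5-minute period, and the `treat_missing_data` setting are assumptions, not part of the original configuration.

```hcl
# Sketch: alarms for the SecurityGroupChangeCount and IAMPolicyChangeCount metrics.
resource "aws_cloudwatch_metric_alarm" "security_group_change_alarm" {
  alarm_name          = "SecurityGroupChangeAlarm" # hypothetical name
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "SecurityGroupChangeCount"
  namespace           = "SecurityMetrics"
  period              = 300
  statistic           = "Sum"
  threshold           = 0
  # The filter emits no data points when nothing matches; treat that as OK
  treat_missing_data = "notBreaching"
  alarm_description  = "A security group was created, changed, or deleted"
  alarm_actions      = [aws_sns_topic.security_alerts.arn]
}

resource "aws_cloudwatch_metric_alarm" "iam_policy_change_alarm" {
  alarm_name          = "IAMPolicyChangeAlarm" # hypothetical name
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "IAMPolicyChangeCount"
  namespace           = "SecurityMetrics"
  period              = 300
  statistic           = "Sum"
  threshold           = 0
  treat_missing_data  = "notBreaching"
  alarm_description   = "An IAM policy was created, changed, attached, or detached"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]
}
```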
What to Monitor?
Key Security Events
Authentication & Authorization
- Failed login attempts (brute force detection)
- Root account usage
- MFA disabled
- New IAM users/roles created
- Permission changes
Network & Infrastructure
- Security group changes
- VPC changes
- Route table modifications
- Network ACL changes
- New EC2 instances
Data Access
- S3 bucket policy changes
- S3 public access
- KMS key deletion/disable
- RDS snapshot sharing
- Secrets Manager access
Billing & Account
- CloudTrail disabled
- Config rules changes
- Unusual billing spikes (cryptomining)
- Account settings changes
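Several of the events listed above can be detected with the same metric filter and alarm pattern used earlier. Rather than repeating the resources for each event, one option is to drive them from a map with `for_each`. The sketch below covers only a few of the listed events; the map keys and the local name `security_event_patterns` are illustrative, and the event names are the CloudTrail API action names for the respective services.

```hcl
# Sketch: one metric filter and one alarm per entry in the map.
locals {
  # Filter patterns keyed by a short event label (labels are illustrative)
  security_event_patterns = {
    MfaDeactivated        = "{ ($.eventName = DeactivateMFADevice) || ($.eventName = DeleteVirtualMFADevice) }"
    IamPrincipalCreated   = "{ ($.eventName = CreateUser) || ($.eventName = CreateRole) }"
    NetworkAclChanged     = "{ ($.eventName = CreateNetworkAcl) || ($.eventName = CreateNetworkAclEntry) || ($.eventName = DeleteNetworkAcl) || ($.eventName = DeleteNetworkAclEntry) || ($.eventName = ReplaceNetworkAclEntry) || ($.eventName = ReplaceNetworkAclAssociation) }"
    RouteTableChanged     = "{ ($.eventName = CreateRoute) || ($.eventName = CreateRouteTable) || ($.eventName = ReplaceRoute) || ($.eventName = ReplaceRouteTableAssociation) || ($.eventName = DeleteRouteTable) || ($.eventName = DeleteRoute) || ($.eventName = DisassociateRouteTable) }"
    S3BucketPolicyChanged = "{ ($.eventName = PutBucketPolicy) || ($.eventName = DeleteBucketPolicy) || ($.eventName = PutBucketAcl) }"
    KmsKeyDisabled        = "{ ($.eventName = DisableKey) || ($.eventName = ScheduleKeyDeletion) }"
    CloudTrailChanged     = "{ ($.eventName = CreateTrail) || ($.eventName = UpdateTrail) || ($.eventName = DeleteTrail) || ($.eventName = StartLogging) || ($.eventName = StopLogging) }"
  }
}

resource "aws_cloudwatch_log_metric_filter" "security_event" {
  for_each = local.security_event_patterns

  name           = each.key
  pattern        = each.value
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name

  metric_transformation {
    name      = "${each.key}Count"
    namespace = "SecurityMetrics"
    value     = "1"
  }
}

# One alarm per filter; any single matching event triggers it
resource "aws_cloudwatch_metric_alarm" "security_event" {
  for_each = aws_cloudwatch_log_metric_filter.security_event

  alarm_name          = "${each.key}Alarm"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = each.value.metric_transformation[0].name
  namespace           = each.value.metric_transformation[0].namespace
  period              = 300
  statistic           = "Sum"
  threshold           = 0
  treat_missing_data  = "notBreaching"
  alarm_description   = "Security event detected: ${each.key}"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]
}
```

Adding another event then means adding one line to the map. Billing spikes are not CloudTrail events, so they would need a different mechanism (for example a billing alarm or budget) that is not shown here.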
Security Monitoring Best Practices
Logging
- Enable CloudTrail in all regions
- Retain logs for at least 1 year
- Encrypt logs at rest (KMS)
- Enable log file integrity validation
- VPC Flow Logs for network traffic
- Centralize logs into a single account (Security Account)
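CloudTrail does not capture network traffic, which is why VPC Flow Logs appear in the list above. A minimal sketch of enabling them with delivery to S3 follows. The variable `vpc_id` and the bucket `aws_s3_bucket.flow_logs` are hypothetical names introduced for this example.

```hcl
# Sketch: VPC Flow Logs delivered to a dedicated S3 bucket.
variable "vpc_id" {
  type        = string
  description = "ID of the VPC to capture traffic for (hypothetical input)"
}

resource "aws_s3_bucket" "flow_logs" {
  # bucket_prefix lets AWS append a unique suffix to the name
  bucket_prefix = "vpc-flow-logs-"
}

resource "aws_flow_log" "vpc" {
  vpc_id               = var.vpc_id
  traffic_type         = "ALL" # accepted and rejected traffic
  log_destination_type = "s3"
  log_destination      = aws_s3_bucket.flow_logs.arn
}
```

Delivering to S3 avoids the IAM role that the CloudWatch Logs destination requires. In a centralized setup the bucket would typically live in the Security Account, which needs a bucket policy that is not shown here.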
Monitoring & Alerting
- Alarms on critical security events
- Real-time notifications (SNS, Slack, PagerDuty)
- Dashboards for the security team
- GuardDuty for threat detection
- Security Hub for centralized overview
- Regular review of findings
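GuardDuty and Security Hub are each enabled with a single resource per account and region. The sketch below shows the minimum; the resource name `main` is an assumption, and organization-wide enablement or Security Hub standards subscriptions are not covered.

```hcl
# Sketch: enable GuardDuty and Security Hub in the current account and region.
resource "aws_guardduty_detector" "main" {
  enable = true
}

resource "aws_securityhub_account" "main" {}
```

Both services are regional, so a multi-region deployment would repeat these resources once per provider alias.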
Incident Response
- Defined playbooks for different types of incidents
- Automated remediation where possible
- Regular incident response drills
- Post-incident reviews
- Documentation of all incidents
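A common first step toward automated response is routing findings to the team in real time. The sketch below forwards high-severity GuardDuty findings to the `security_alerts` SNS topic through an EventBridge rule. The rule name, the severity cutoff of 7, and the topic policy are assumptions. Note that attaching a topic policy replaces the topic's default policy, so the sketch also keeps CloudWatch alarms allowed to publish.

```hcl
# Sketch: send high-severity GuardDuty findings to the security SNS topic.
resource "aws_cloudwatch_event_rule" "guardduty_high_severity" {
  name        = "guardduty-high-severity" # hypothetical name
  description = "GuardDuty findings with severity 7 or higher"

  event_pattern = jsonencode({
    source        = ["aws.guardduty"]
    "detail-type" = ["GuardDuty Finding"]
    detail = {
      severity = [{ numeric = [">=", 7] }]
    }
  })
}

resource "aws_cloudwatch_event_target" "guardduty_to_sns" {
  rule = aws_cloudwatch_event_rule.guardduty_high_severity.name
  arn  = aws_sns_topic.security_alerts.arn
}

# EventBridge can only publish if the topic policy allows it
data "aws_iam_policy_document" "security_alerts" {
  statement {
    sid       = "AllowEventBridgeAndCloudWatchPublish"
    effect    = "Allow"
    actions   = ["sns:Publish"]
    resources = [aws_sns_topic.security_alerts.arn]

    principals {
      type        = "Service"
      identifiers = ["events.amazonaws.com", "cloudwatch.amazonaws.com"]
    }
  }
}

resource "aws_sns_topic_policy" "security_alerts" {
  arn    = aws_sns_topic.security_alerts.arn
  policy = data.aws_iam_policy_document.security_alerts.json
}
```

The same rule could target a Lambda function or Step Functions state machine instead of SNS when a playbook step is safe to automate.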
What NOT to Do
- Disable CloudTrail (even temporarily)
- Ignore alarms (tune noisy alarms instead, so alert fatigue does not set in)
- Delete logs before the retention period ends
- Make logs accessible to everyone (apply least privilege)
- Forget to monitor production
- Operate without an incident response plan
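Some of these mistakes can be ruled out in configuration. As one example, the sketch below blocks all public access to the CloudTrail log bucket, so the logs cannot be exposed by an accidental ACL or policy change. It assumes the `aws_s3_bucket.cloudtrail_logs` bucket defined earlier; the resource name is illustrative.

```hcl
# Sketch: block every form of public access to the log bucket.
resource "aws_s3_bucket_public_access_block" "cloudtrail_logs" {
  bucket = aws_s3_bucket.cloudtrail_logs.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

This covers only public exposure; restricting which internal roles can read the logs still depends on IAM and bucket policies.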