Harbinger Explorer

Data Encryption at Rest and In Transit: A Practical Guide

14 min read·Tags: security, encryption, kms, tls, compliance, data-platform

Encryption is the non-negotiable baseline for any production data platform handling sensitive information. Yet "we encrypt our data" is one of the most misunderstood statements in cloud security—it can mean anything from S3 default encryption (good) to a hand-rolled crypto library (terrifying).

This guide gives you the practitioner's view: what to encrypt, how to do it correctly, where the gotchas are, and how to verify you've actually done it.


The Threat Model

Before picking algorithms and key sizes, understand what you're protecting against:

Threat | Encryption at Rest | Encryption in Transit | Both
Stolen S3 bucket
Compromised network path
Compromised host OS
Rogue DBA

Key insight: Encryption protects data at storage and wire level. It does not protect against compromised compute or identity. Combine encryption with IAM, VPC boundaries, and audit logging.


Encryption at Rest

AWS KMS: The Right Way

Never manage your own key material in a cloud environment. Use a managed Key Management Service (KMS).

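This pattern is envelope encryption: a data encryption key (DEK) encrypts the data, and KMS encrypts (wraps) the DEK under the KMS key. KMS never sees your bulk data; it only generates, wraps, and unwraps DEKs.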
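The envelope flow can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions: `FakeKms` is a hypothetical in-memory stand-in for a real KMS, and `_toy_cipher` is a standard-library placeholder used only so the sketch runs without dependencies. It is not production cryptography; a real pipeline would call KMS for the data key and use a vetted AEAD such as AES-GCM.

```python
import hashlib
import hmac
import secrets


def _toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 keystream. Illustration only, NOT production crypto."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class FakeKms:
    """Hypothetical stand-in for a KMS: the master key never leaves this object."""

    def __init__(self) -> None:
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self) -> tuple[bytes, bytes]:
        """Return (plaintext DEK, wrapped DEK)."""
        dek = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        return dek, nonce + _toy_cipher(self._master_key, nonce, dek)

    def decrypt(self, wrapped: bytes) -> bytes:
        """Unwrap a DEK that generate_data_key() wrapped earlier."""
        nonce, body = wrapped[:16], wrapped[16:]
        return _toy_cipher(self._master_key, nonce, body)


def encrypt_record(kms: FakeKms, plaintext: bytes) -> dict:
    dek, wrapped_dek = kms.generate_data_key()
    nonce = secrets.token_bytes(16)
    ciphertext = _toy_cipher(dek, nonce, plaintext)
    # Only the wrapped DEK is stored next to the data; the plaintext DEK is discarded.
    return {"wrapped_dek": wrapped_dek, "nonce": nonce, "ciphertext": ciphertext}


def decrypt_record(kms: FakeKms, record: dict) -> bytes:
    dek = kms.decrypt(record["wrapped_dek"])  # the only step that needs KMS access
    return _toy_cipher(dek, record["nonce"], record["ciphertext"])


kms = FakeKms()
record = encrypt_record(kms, b"user@example.com")
print(decrypt_record(kms, record))
```

The point of the design is that the bulk data is processed locally, while access to the data is still gated by whoever is allowed to ask KMS to unwrap the DEK.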

Terraform: KMS Key with Rotation

resource "aws_kms_key" "data_platform" {
  description             = "Data platform encryption key"
  deletion_window_in_days = 30
  enable_key_rotation     = true
  multi_region            = false

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "AllowKeyAdministration"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::${var.account_id}:role/DataPlatformAdmin"
        }
        Action   = ["kms:*"]
        Resource = "*"
      },
      {
        Sid    = "AllowServiceUse"
        Effect = "Allow"
        Principal = {
          Service = ["s3.amazonaws.com", "rds.amazonaws.com", "glue.amazonaws.com"]
        }
        Action = [
          "kms:GenerateDataKey",
          "kms:Decrypt",
          "kms:DescribeKey"
        ]
        Resource = "*"
      }
    ]
  })

  tags = {
    Purpose   = "data-platform-encryption"
    Rotation  = "annual-automatic"
  }
}

resource "aws_kms_alias" "data_platform" {
  name          = "alias/data-platform-${var.environment}"
  target_key_id = aws_kms_key.data_platform.key_id
}

S3 Bucket Encryption

resource "aws_s3_bucket_server_side_encryption_configuration" "data_lake" {
  bucket = aws_s3_bucket.data_lake["bronze"].id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.data_platform.arn
    }
    bucket_key_enabled = true  # S3 Bucket Keys can cut KMS request costs by up to 99%
  }
}

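# Enforce encryption on upload (deny PutObject requests that do not ask for SSE-KMS).
# Note: this also denies uploads that omit the header and rely on the bucket default.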
resource "aws_s3_bucket_policy" "enforce_encryption" {
  bucket = aws_s3_bucket.data_lake["bronze"].id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid    = "DenyUnencryptedObjectUploads"
      Effect = "Deny"
      Principal = "*"
      Action = "s3:PutObject"
      Resource = "${aws_s3_bucket.data_lake["bronze"].arn}/*"
      Condition = {
        StringNotEquals = {
          "s3:x-amz-server-side-encryption" = "aws:kms"
        }
      }
    }]
  })
}
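To make the effect of that Deny statement concrete, here is a small Python sketch that mimics how the `StringNotEquals` condition is evaluated. It is a simplified model, not the IAM engine; `put_object_denied` is a hypothetical helper. It relies on one documented IAM behavior: a negated operator such as `StringNotEquals` evaluates to true when the condition key is absent from the request.

```python
def put_object_denied(headers: dict) -> bool:
    """Return True if the Deny statement above would block this PutObject request.

    Simplified model: only the s3:x-amz-server-side-encryption condition is checked.
    A missing header counts as "not equal", so it is denied too.
    """
    value = headers.get("x-amz-server-side-encryption")
    return value != "aws:kms"


requests = {
    "no header": {},
    "SSE-S3": {"x-amz-server-side-encryption": "AES256"},
    "SSE-KMS": {"x-amz-server-side-encryption": "aws:kms"},
}
for name, headers in requests.items():
    print(name, "->", "denied" if put_object_denied(headers) else "allowed")
```

In practice this means every writer, including Spark and Glue jobs, has to send the SSE-KMS header explicitly once the policy is attached.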

Database Encryption: RDS and Redshift

resource "aws_db_instance" "data_warehouse" {
  identifier        = "data-warehouse-${var.environment}"
  engine            = "postgres"
  engine_version    = "15.4"
  instance_class    = "db.r6g.xlarge"
  storage_encrypted = true
  kms_key_id        = aws_kms_key.data_platform.arn

  # Enable Performance Insights with encryption
  performance_insights_enabled          = true
  performance_insights_kms_key_id       = aws_kms_key.data_platform.arn
  performance_insights_retention_period = 7

  # Automated backups (also encrypted with KMS)
  backup_retention_period = 30
  backup_window           = "03:00-04:00"
}

Column-Level Encryption

For PII fields that must be encrypted even from database admins, use column-level encryption. This is where envelope encryption really shines.

# AWS CLI: encrypt a single small value directly with KMS
# (Encrypt accepts at most 4 KB of plaintext; larger payloads need a data key)
aws kms encrypt \
  --key-id alias/data-platform-prod \
  --plaintext fileb://<(echo -n "user@example.com") \
  --output text \
  --query CiphertextBlob | base64 --decode > encrypted_email.bin

# Decrypt
aws kms decrypt \
  --ciphertext-blob fileb://encrypted_email.bin \
  --output text \
  --query Plaintext | base64 --decode

For at-scale column encryption in a data pipeline:

# Spark job config for PII column encryption
encryption:
  enabled: true
  pii_columns:
    - email
    - phone_number
    - national_id
    - credit_card_number
  strategy: deterministic  # deterministic keeps equality lookups and joins working; random hides repeated values but cannot be joined on
  key_provider: aws_kms
  kms_key_id: "alias/data-platform-prod"
  cache_ttl_seconds: 300  # Cache DEKs to reduce KMS API calls
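The trade-off behind `strategy: deterministic` can be shown with a short Python sketch. Note the substitution: this uses a keyed hash (HMAC-SHA256), which is one-way pseudonymization rather than reversible encryption, purely to demonstrate the property that matters for joins, namely that equal inputs produce equal tokens under the same key. The `deterministic_token` helper, the normalization rule, and the sample rows are all assumptions for illustration.

```python
import hashlib
import hmac
import secrets


def deterministic_token(key: bytes, value: str) -> str:
    """Map a value to a stable token: same key + same value -> same token."""
    normalized = value.strip().lower()  # assumed normalization so trivial variants still match
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; in a pipeline this would be a DEK unwrapped via KMS and cached.
key = secrets.token_bytes(32)

customers = [{"email": "User@Example.com", "plan": "pro"}]
events = [{"email": "user@example.com ", "event": "login"}]

# Join the two datasets on the token instead of the raw email address.
plan_by_token = {deterministic_token(key, c["email"]): c["plan"] for c in customers}
joined = [
    {"event": e["event"], "plan": plan_by_token[deterministic_token(key, e["email"])]}
    for e in events
    if deterministic_token(key, e["email"]) in plan_by_token
]
print(joined)
```

The same property is the weakness: anyone who can see the column can tell which rows share a value, so a random (non-deterministic) strategy is the safer default for columns that are never joined or filtered on.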

Encryption in Transit

TLS 1.3: The Baseline

TLS 1.3 eliminates the weak cipher suites that plagued 1.2. Enforce it everywhere.

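# Verify TLS version and cipher suite on an endpoint
# (</dev/null closes stdin so s_client exits after the handshake)
openssl s_client -connect my-kafka-broker:9093 -tls1_3 </dev/null 2>/dev/null | grep -E "Protocol|Cipher"

# Expected output (exact formatting varies by OpenSSL version):
# Protocol  : TLSv1.3
# Cipher    : TLS_AES_256_GCM_SHA384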
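On the client side, the same floor can be enforced in application code. The sketch below uses Python's standard `ssl` module and assumes the interpreter is linked against an OpenSSL build with TLS 1.3 support; `tls13_client_context` is a hypothetical helper name.

```python
import ssl


def tls13_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything below TLS 1.3."""
    ctx = ssl.create_default_context()  # certificate and hostname verification are on by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = tls13_client_context()
print(ctx.minimum_version.name, ctx.check_hostname)
```

Wrapping a socket with this context (`ctx.wrap_socket(sock, server_hostname=...)`) makes the handshake fail against servers that only offer TLS 1.2 or older, which is usually preferable to silently downgrading.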

Kafka TLS Configuration

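# server.properties - Kafka broker
# Note: Kafka does not expand shell-style ${VAR} placeholders on its own;
# render them at deploy time or use a config provider.
# The PLAINTEXT listener is bound to localhost only.
listeners=PLAINTEXT://localhost:9092,SSL://0.0.0.0:9093
security.inter.broker.protocol=SSL
ssl.keystore.location=/etc/kafka/ssl/kafka.server.keystore.jks
ssl.keystore.password=${KEYSTORE_PASSWORD}
ssl.key.password=${KEY_PASSWORD}
ssl.truststore.location=/etc/kafka/ssl/kafka.server.truststore.jks
ssl.truststore.password=${TRUSTSTORE_PASSWORD}
ssl.client.auth=required
ssl.enabled.protocols=TLSv1.3
ssl.cipher.suites=TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256

# consumer.properties
# (a keystore is needed because the broker sets ssl.client.auth=required)
security.protocol=SSL
ssl.truststore.location=/etc/kafka/ssl/client.truststore.jks
ssl.truststore.password=${TRUSTSTORE_PASSWORD}
ssl.keystore.location=/etc/kafka/ssl/client.keystore.jks
ssl.keystore.password=${KEYSTORE_PASSWORD}
ssl.key.password=${KEY_PASSWORD}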

mTLS for Service-to-Service

Mutual TLS ensures both parties authenticate. Essential for internal microservice communication handling data.

# Generate CA, server cert, client cert with OpenSSL
# 1. Create CA
openssl genrsa -out ca.key 4096
openssl req -new -x509 -days 3650 -key ca.key -out ca.crt \
  -subj "/CN=DataPlatform-CA/O=MyOrg"

# 2. Server cert (SAN included; many TLS clients ignore the CN for hostname checks)
openssl genrsa -out server.key 2048
openssl req -new -key server.key -out server.csr \
  -subj "/CN=kafka-broker-1.internal/O=MyOrg"
openssl x509 -req -days 365 -in server.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out server.crt \
  -extfile <(printf "subjectAltName=DNS:kafka-broker-1.internal")

# 3. Client cert (for ETL service)
openssl genrsa -out etl-client.key 2048
openssl req -new -key etl-client.key -out etl-client.csr \
  -subj "/CN=etl-pipeline/O=MyOrg"
openssl x509 -req -days 365 -in etl-client.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out etl-client.crt
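What makes TLS "mutual" is a single server-side setting: the server must require and verify a client certificate. A minimal Python sketch with the standard `ssl` module follows. `mtls_server_context` is a hypothetical helper, and the file paths are optional here only so the sketch runs without certificates on disk; a real server would always pass them.

```python
import ssl
from typing import Optional


def mtls_server_context(
    certfile: Optional[str] = None,
    keyfile: Optional[str] = None,
    cafile: Optional[str] = None,
) -> ssl.SSLContext:
    """Server context that rejects clients without a certificate signed by the trusted CA."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    ctx.verify_mode = ssl.CERT_REQUIRED  # this line is what turns TLS into mTLS
    if certfile and keyfile:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)  # the server's own identity
    if cafile:
        ctx.load_verify_locations(cafile=cafile)  # CA used to verify client certificates
    return ctx


ctx = mtls_server_context()  # paths omitted so the sketch runs without files
print(ctx.verify_mode == ssl.CERT_REQUIRED)
```

With the files generated above, the call would be `mtls_server_context("server.crt", "server.key", "ca.crt")`; the ETL client would then present `etl-client.crt` and `etl-client.key` during the handshake.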

ALB + ACM: TLS for HTTP Endpoints

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.data_api.arn
  port              = "443"
  protocol          = "HTTPS"
  # Allows TLS 1.3 with TLS 1.2 fallback; use ELBSecurityPolicy-TLS13-1-3-2021-06 for TLS 1.3 only
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate_validation.api.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.data_api.arn
  }
}

# Redirect HTTP to HTTPS
resource "aws_lb_listener" "http_redirect" {
  load_balancer_arn = aws_lb.data_api.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}

Key Management Best Practices

Key Rotation Schedule

Key Type | Rotation Frequency | Method
KMS key (customer-managed) | Annual (automatic) | Set enable_key_rotation = true
Data Encryption Keys | Per-job or per-day | Re-wrap with new CMK
TLS certificates | 90 days (Let's Encrypt) / Annual (ACM) | Auto-renew via ACM
Database master password | 90 days | Secrets Manager rotation
API keys | 30–90 days | Manual + CI/CD rotation
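A schedule like this is only useful if something checks it. Below is a small Python sketch of such a check. The `MAX_AGE_DAYS` mapping takes the upper bound from each row of the table, and both the `rotation_due` helper and the credential type names are hypothetical.

```python
from datetime import date, timedelta

# Maximum age per credential type, taken from the upper bounds in the schedule above.
MAX_AGE_DAYS = {
    "kms_key": 365,
    "tls_certificate": 90,
    "db_master_password": 90,
    "api_key": 90,
}


def rotation_due(kind: str, last_rotated: date, today: date) -> bool:
    """Return True once the credential has reached its maximum allowed age."""
    return today - last_rotated >= timedelta(days=MAX_AGE_DAYS[kind])


today = date(2024, 3, 31)
inventory = [
    ("api_key", date(2024, 1, 1)),
    ("db_master_password", date(2024, 3, 1)),
]
for kind, last_rotated in inventory:
    print(kind, "due" if rotation_due(kind, last_rotated, today) else "ok")
```

Feeding it an inventory exported from Secrets Manager or a certificate store would turn the table into an alert rather than a document.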

AWS Secrets Manager for Credential Rotation

# Rotation is configured on the aws_secretsmanager_secret_rotation resource;
# recent AWS provider versions no longer accept rotation_rules on the secret itself.
resource "aws_secretsmanager_secret" "db_password" {
  name                    = "data-platform/${var.environment}/db-master-password"
  kms_key_id              = aws_kms_key.data_platform.arn
  recovery_window_in_days = 7
}

resource "aws_secretsmanager_secret_rotation" "db_password" {
  secret_id           = aws_secretsmanager_secret.db_password.id
  rotation_lambda_arn = aws_lambda_function.db_password_rotator.arn

  rotation_rules {
    automatically_after_days = 90
  }
}

Compliance Mapping

Requirement | Control | Implementation
GDPR Art. 32 | Encryption of personal data | KMS + column-level for PII
SOC 2 CC6.1 | Logical access controls | KMS key policies + IAM
PCI DSS 3.4 | Cardholder data encryption | Deterministic column encryption
HIPAA § 164.312 | PHI encryption at rest + transit | KMS + TLS 1.3
ISO 27001 A.10 | Cryptographic controls | Key management policy

Encryption Audit with AWS Config

resource "aws_config_config_rule" "s3_encryption" {
  name = "s3-bucket-server-side-encryption-enabled"

  source {
    owner             = "AWS"
    source_identifier = "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED"
  }
}

resource "aws_config_config_rule" "rds_encryption" {
  name = "rds-storage-encrypted"

  source {
    owner             = "AWS"
    source_identifier = "RDS_STORAGE_ENCRYPTED"
  }
}

resource "aws_config_config_rule" "kms_rotation" {
  name = "cmk-backing-key-rotation-enabled"

  source {
    owner             = "AWS"
    source_identifier = "CMK_BACKING_KEY_ROTATION_ENABLED"
  }
}

Common Mistakes

  1. S3 default encryption ≠ customer-managed keys: SSE-S3 and AWS-managed keys give you no control over the key policy, the rotation schedule, or the ability to disable a key
  2. Encrypting backups with the same key as prod: a compromised key means both are lost
  3. Ignoring KMS request quotas: without S3 Bucket Keys, high-throughput workloads can hit the shared per-Region KMS request quota (10,000 req/s in some Regions, lower or higher in others)
  4. Terminating TLS at the load balancer only: backend-to-backend traffic must also be encrypted
  5. Hardcoding encryption keys in application code: use Secrets Manager or environment injection
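For the last item, environment injection can look like the Python sketch below. The variable name `DATA_KEY_B64`, the 32-byte length check, and the `load_key_from_env` helper are assumptions for illustration. The sketch sets the variable itself only so that it runs standalone; in a deployment the platform would inject it, for example from Secrets Manager.

```python
import base64
import os


def load_key_from_env(var_name: str = "DATA_KEY_B64") -> bytes:
    """Read a base64-encoded 32-byte key from the environment, failing loudly if absent."""
    encoded = os.environ.get(var_name)
    if not encoded:
        # No fallback on purpose: a built-in default key is a hardcoded key.
        raise RuntimeError(f"{var_name} is not set")
    key = base64.b64decode(encoded, validate=True)
    if len(key) != 32:
        raise ValueError(f"{var_name} must decode to 32 bytes, got {len(key)}")
    return key


# Simulate the injection that a deployment platform would perform.
os.environ["DATA_KEY_B64"] = base64.b64encode(os.urandom(32)).decode("ascii")
key = load_key_from_env()
print(len(key))
```

Failing at startup when the variable is missing keeps a misconfigured service from quietly running with a weak or shared key.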

Monitoring Encryption Health

Platforms like Harbinger Explorer can continuously audit your data platform's encryption posture—flagging unencrypted resources, expiring certificates, and key policy drift before they become compliance incidents.

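# Quick CLI audit: print every bucket whose default encryption is not SSE-KMS
# (S3 now applies SSE-S3 to buckets by default, so checking only for a missing configuration finds little)
for bucket in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
  algo=$(aws s3api get-bucket-encryption --bucket "$bucket" --output text \
    --query 'ServerSideEncryptionConfiguration.Rules[0].ApplyServerSideEncryptionByDefault.SSEAlgorithm' 2>/dev/null)
  case "$algo" in aws:kms*) ;; *) echo "$bucket: ${algo:-none}" ;; esac
done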
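The same audit is easier to extend when the classification logic lives in code. The Python sketch below classifies dictionaries shaped like the `GetBucketEncryption` response. The sample responses are made up for illustration, `classify_bucket_encryption` is a hypothetical helper, and fetching real responses (for example with boto3) is left out so that the sketch runs offline.

```python
def classify_bucket_encryption(response: dict) -> str:
    """Classify a GetBucketEncryption-style response by its default encryption rule."""
    rules = response.get("ServerSideEncryptionConfiguration", {}).get("Rules", [])
    if not rules:
        return "unencrypted"
    rule = rules[0]
    algorithm = rule.get("ApplyServerSideEncryptionByDefault", {}).get("SSEAlgorithm", "")
    if algorithm.startswith("aws:kms"):
        return "kms+bucket-key" if rule.get("BucketKeyEnabled") else "kms"
    return "sse-s3" if algorithm == "AES256" else "unknown"


# Made-up sample responses keyed by hypothetical bucket names.
samples = {
    "bronze": {"ServerSideEncryptionConfiguration": {"Rules": [{
        "ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"},
        "BucketKeyEnabled": True,
    }]}},
    "legacy": {"ServerSideEncryptionConfiguration": {"Rules": [{
        "ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"},
    }]}},
    "empty": {},
}
for bucket, response in samples.items():
    print(bucket, classify_bucket_encryption(response))
```

Separating "kms" from "kms+bucket-key" makes it easy to flag buckets that are encrypted correctly but still generate avoidable KMS traffic.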

Summary

Encryption is a system, not a checkbox. Implement it correctly:

  1. Use KMS with customer-managed keys + automatic rotation
  2. Enable bucket-key to control costs
  3. Enforce TLS 1.3 everywhere, mTLS for internal services
  4. Apply column-level encryption for PII fields
  5. Audit continuously with AWS Config rules
  6. Never co-locate prod and backup keys

Try Harbinger Explorer free for 7 days and get automated encryption posture monitoring across your entire cloud data platform—no agents required.

