Overview
This guide outlines the recommended installation order, deployment strategies, and architectural considerations for deploying the Enterprise SOC Architecture. Components must be installed in a specific sequence to satisfy dependencies.
Installation Order and Dependencies
Components should be installed in the order presented below to ensure dependencies are met at each stage.
Dependency Graph
Deployment Architecture Strategies
- Single Server (Lab/POC)
- Distributed (Production)
- Hybrid Cloud
- Containerized (Kubernetes)
All-in-One Deployment
Use Case: Testing, proof of concept, small environments (fewer than 50 endpoints)
Architecture:
- Single server hosts all SOC components
- Containers (Docker/Podman) or VMs for service isolation
- Minimal high availability
Hardware requirements:
- CPU: 16 cores
- RAM: 64 GB
- Storage: 2 TB SSD
- Network: 2x 1 Gbps NICs (management + monitoring)
Advantages:
- Simple deployment and management
- Lower hardware costs
- Easy for testing and learning
Limitations:
- No high availability
- Limited scalability
- Single point of failure
- Resource contention between components
High Availability Considerations
Critical Components Requiring HA
Elasticsearch Cluster
HA Strategy: Multi-node cluster with replication
- Deploy 3+ nodes (odd number for split-brain prevention)
- Configure index replication (minimum 1 replica)
- Use dedicated master-eligible nodes in large clusters
- Implement cluster-level shard allocation awareness
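The preference for an odd node count can be illustrated with a little arithmetic. The sketch below assumes the usual majority-quorum rule for master-eligible nodes; the function names are hypothetical helpers, not part of Elasticsearch.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest majority of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """Number of master-eligible nodes that can fail while a majority remains."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


if __name__ == "__main__":
    # Print node count, quorum size, and tolerated failures side by side.
    for nodes in (1, 2, 3, 4, 5):
        print(nodes, quorum(nodes), tolerated_failures(nodes))
```

Under this rule a four-node cluster tolerates the same single failure as a three-node cluster, which is why adding an even-numbered node buys no extra resilience.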
Wazuh Manager Cluster
HA Strategy: Master-worker cluster architecture
- Deploy master node and one or more worker nodes
- Agents connect to cluster (automatic failover)
- Shared configuration and rules across cluster
- Load balancer distributes agent connections
- Configure cluster in `/var/ossec/etc/ossec.conf`
- Enable cluster mode and set cluster key
- Configure node type (master/worker)
- Use load balancer (HAProxy, nginx) for agent connections
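The configuration steps above could be scripted roughly as follows. This is a sketch, not an official tool: `render_cluster_block` and `new_cluster_key` are hypothetical helpers, the element names follow the `<cluster>` layout as commonly documented for Wazuh, and port 1516 and the bind address are assumed defaults. Verify all of these against the documentation for your Wazuh version.

```python
import secrets
from xml.sax.saxutils import escape


def new_cluster_key() -> str:
    """Return a random 32-character hexadecimal key."""
    return secrets.token_hex(16)


def render_cluster_block(node_name, node_type, master_address, key,
                         cluster_name="wazuh", port=1516):
    """Render a <cluster> block for ossec.conf (assumed layout, see lead-in)."""
    if node_type not in ("master", "worker"):
        raise ValueError("node_type must be 'master' or 'worker'")
    if len(key) != 32:
        raise ValueError("cluster key must be 32 characters long")
    lines = [
        "<cluster>",
        f"  <name>{escape(cluster_name)}</name>",
        f"  <node_name>{escape(node_name)}</node_name>",
        f"  <node_type>{node_type}</node_type>",
        f"  <key>{escape(key)}</key>",
        f"  <port>{port}</port>",
        "  <bind_addr>0.0.0.0</bind_addr>",
        "  <nodes>",
        f"    <node>{escape(master_address)}</node>",
        "  </nodes>",
        "  <hidden>no</hidden>",
        "  <disabled>no</disabled>",
        "</cluster>",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # The same key must be used on every node in the cluster.
    shared_key = new_cluster_key()
    print(render_cluster_block("worker-01", "worker", "10.0.0.10", shared_key))
```

The node name and address in the usage example are placeholders. Generating the key once and reusing it for every node keeps the master and workers consistent.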
Logstash Pipeline Redundancy
HA Strategy: Multiple pipeline instances with load balancing
- Deploy 2+ Logstash instances
- Use load balancer for input (if using Beats)
- Configure persistent queues for data durability
- Monitor pipeline throughput and backpressure
Database High Availability
HA Strategy: Database replication and clustering
For MySQL/PostgreSQL (Zabbix, TheHive):
- Master-slave replication
- Automatic failover (using tools like Patroni for PostgreSQL)
- Regular backups to separate storage
For Cassandra (TheHive):
- Multi-node cluster with replication factor 3
- Distributed architecture provides natural HA
Application Load Balancing
HA Strategy: Load balancers for web interfaces
Components needing load balancing:
- Wazuh Dashboard (multiple dashboard instances)
- TheHive web interface
- Grafana dashboards
Load balancer options:
- HAProxy (open source, highly recommended)
- Nginx (reverse proxy + load balancing)
- Cloud load balancers (ALB on AWS, Azure Load Balancer)
Backup and Disaster Recovery
High availability prevents service interruption, but backups protect against data loss, corruption, and disasters.
| Component | Backup Method | Frequency | Retention |
|---|---|---|---|
| Elasticsearch | Snapshot API to S3/NFS | Daily | 30 days |
| Wazuh Configuration | File backup of /var/ossec | Daily | 90 days |
| TheHive Database | Database dump or snapshot | Daily | 90 days |
| Zabbix Database | MySQL/PostgreSQL dump | Daily | 30 days |
| Custom Rules/Scripts | Git repository | On change | Indefinite |
| System Configs | Configuration management (Terraform/Ansible) | On change | Indefinite |
- Document recovery procedures for each component
- Test restoration quarterly (minimum)
- Maintain runbooks for critical failure scenarios
- Store backups off-site or in separate cloud region
- Define RTO/RPO (Recovery Time/Point Objectives) for each tier
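The retention periods in the table above can be enforced with a small pruning check. The following is a minimal sketch: the component keys and the `expired_backups` helper are hypothetical names, and only the day counts come from the table.

```python
from datetime import date, timedelta

# Retention in days, taken from the backup table; the keys are illustrative.
RETENTION_DAYS = {
    "elasticsearch": 30,
    "wazuh_config": 90,
    "thehive_db": 90,
    "zabbix_db": 30,
}


def expired_backups(backups, today):
    """Return the (component, taken_on) pairs older than their retention period."""
    expired = []
    for component, taken_on in backups:
        keep_for = timedelta(days=RETENTION_DAYS[component])
        if today - taken_on > keep_for:
            expired.append((component, taken_on))
    return expired


if __name__ == "__main__":
    inventory = [
        ("elasticsearch", date(2024, 5, 1)),
        ("wazuh_config", date(2024, 5, 1)),
        ("zabbix_db", date(2024, 6, 15)),
    ]
    for component, taken_on in expired_backups(inventory, date(2024, 6, 30)):
        print(f"prune {component} backup from {taken_on}")
```

A backup taken exactly on the retention boundary is kept. Whether to prune on or after the boundary is a policy choice to settle alongside the RPO.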
Installation Steps by Component
Phase 1: Foundation
Operating System Preparation
Security hardening (for all servers):
- Disable root SSH login
- Configure SSH key-based authentication
- Enable automatic security updates
- Install and configure fail2ban
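The first two hardening items can be spot-checked by reading `sshd_config`. The sketch below is a simplified audit, not a full parser: it assumes that "key-based authentication" means `PasswordAuthentication no`, stops at the first `Match` block, and does not follow `Include` directives. The function names are hypothetical.

```python
def effective_sshd_options(config_text):
    """Map lower-cased keywords to values; the first occurrence wins, as in sshd."""
    options = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "match":
            break  # conditional blocks are out of scope for this sketch
        if len(parts) == 2:
            options.setdefault(keyword, parts[1].strip())
    return options


def hardening_findings(config_text):
    """Return human-readable findings for the two SSH hardening items."""
    options = effective_sshd_options(config_text)
    findings = []
    if options.get("permitrootlogin", "").lower() != "no":
        findings.append("PermitRootLogin is not set to 'no'")
    if options.get("passwordauthentication", "").lower() != "no":
        findings.append("PasswordAuthentication is not set to 'no'")
    return findings


if __name__ == "__main__":
    sample = "PermitRootLogin no\nPasswordAuthentication yes\n"
    for finding in hardening_findings(sample):
        print(finding)
```

On a real host the input would come from reading `/etc/ssh/sshd_config`. A missing directive is reported as a finding, so the setting has to be explicit rather than left to the default.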
Phase 2: Core Infrastructure
Elasticsearch Cluster Installation
Installation (per node):
Database Installation (MySQL/PostgreSQL)
For Zabbix and TheHive (if not using Elasticsearch/Cassandra):
Phase 3-7: Component Installation
Detailed installation procedures for each component (Wazuh, Logstash, Snort/Suricata, Zabbix, Prometheus, TheHive, Cortex) will be provided in component-specific documentation. The key is to follow the installation order defined in the dependency graph above.
- Add official package repository
- Install package via package manager
- Configure component (see Configuration guide)
- Enable and start systemd service
- Verify component health and connectivity
- Integrate with dependent components
Phase 8: Automation Tools
Infrastructure as Code Setup
- Terraform (for infrastructure provisioning)
- PyInfra (for configuration management)
Post-Installation Validation
Data Flow Verification
Confirm data flows through the pipeline:
- Deploy test Wazuh agent and verify events in Elasticsearch
- Send test syslog message to Logstash and check indexing
- Trigger test IDS alert and verify in Wazuh dashboard
- Create test incident in TheHive and verify storage
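For the syslog check, a test message can be built and sent with the standard library alone. This is a sketch under assumptions: the message uses the classic BSD syslog layout, and the host name, tag, target address, and UDP port 5514 are placeholders that depend on how the Logstash syslog input is configured.

```python
import socket
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_syslog_message(message, hostname, tag="soc-test",
                         facility=1, severity=6, when=None):
    """Build a BSD-style syslog line: <PRI>Mmm dd hh:mm:ss host tag: message."""
    when = when or datetime.now()
    priority = facility * 8 + severity
    # The day of month is padded with a space, not a zero.
    timestamp = f"{MONTHS[when.month - 1]} {when.day:>2} {when:%H:%M:%S}"
    return f"<{priority}>{timestamp} {hostname} {tag}: {message}"


def send_udp(message, host, port):
    """Send one syslog line over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))


if __name__ == "__main__":
    line = build_syslog_message("pipeline check", "sensor01")
    send_udp(line, "127.0.0.1", 5514)  # assumed Logstash syslog input address
    print(line)
```

Searching the target index for the unique message text afterwards confirms that the event was parsed and indexed.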
Troubleshooting Common Issues
Elasticsearch cluster won't form
Symptoms: Nodes don’t discover each other
Solutions:
- Verify `discovery.seed_hosts` contains all node IPs
- Check firewall allows port 9300 between nodes
- Ensure `cluster.name` is identical on all nodes
- Verify network connectivity: `ping` and `telnet` between nodes
- Check logs: `/var/log/elasticsearch/`
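Where `telnet` is not installed, the transport-port check can be scripted. The sketch below uses only the Python standard library; the helper names and the peer addresses in the usage example are hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_peers(hosts, port=9300, timeout=3.0):
    """Return the hosts whose transport port cannot be reached from this node."""
    return [host for host in hosts if not tcp_port_open(host, port, timeout)]


if __name__ == "__main__":
    peers = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]  # placeholder node IPs
    for host in unreachable_peers(peers):
        print(f"cannot reach {host}:9300")
```

Running it from each node in turn helps separate a one-way firewall rule from a node that is actually down.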
Wazuh agents not connecting
Symptoms: Agents show as disconnected in dashboard
Solutions:
- Verify firewall allows ports 1514 and 1515 to manager
- Check agent configuration: `cat /var/ossec/etc/ossec.conf`
- Verify manager address is correct in agent config
- Check manager logs: `/var/ossec/logs/ossec.log`
- Restart agent: `sudo systemctl restart wazuh-agent`
High memory usage on Elasticsearch
Symptoms: System running out of memory
Solutions:
- Verify JVM heap is set to 50% of system RAM (max 31 GB)
- Check heap settings: `/etc/elasticsearch/jvm.options`
- Monitor heap usage: `curl localhost:9200/_nodes/stats/jvm`
- Reduce replica count or index retention if needed
- Consider adding more nodes to distribute load
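The heap rule above (half of system RAM, capped at 31 GB) reduces to a one-line calculation. The sketch below uses hypothetical helper names; `-Xms` and `-Xmx` are the standard JVM flags for minimum and maximum heap, set to the same value here.

```python
def recommended_heap_gb(system_ram_gb, cap_gb=31):
    """Half of system RAM in whole gigabytes, capped at cap_gb."""
    if system_ram_gb < 2:
        raise ValueError("this sketch expects at least 2 GB of system RAM")
    return min(system_ram_gb // 2, cap_gb)


def jvm_options_lines(heap_gb):
    """Return matching minimum and maximum heap flags for jvm.options."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


if __name__ == "__main__":
    for ram_gb in (16, 32, 64, 128):
        print(ram_gb, jvm_options_lines(recommended_heap_gb(ram_gb)))
```

For the 64 GB single-server profile the cap applies, so the result is 31 GB rather than 32 GB.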
IDS sensor dropping packets
Symptoms: High packet drop rate in Suricata/Snort stats
Solutions:
- Increase AF_PACKET buffer size (Suricata)
- Enable multi-threading in IDS configuration
- Verify NIC is in promiscuous mode: `ip link show`
- Check if SPAN session is overloading sensor
- Consider hardware upgrade or additional sensors
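To quantify the drop rate for Suricata, the capture counters from a stats event can be turned into a percentage. This sketch assumes the EVE JSON stats layout with `capture.kernel_packets` and `capture.kernel_drops`, and uses drops divided by packets as a common approximation. The sample numbers are made up for illustration.

```python
import json


def drop_rate_percent(stats_event):
    """Percentage of captured packets dropped, from one EVE stats event."""
    capture = stats_event["stats"]["capture"]
    packets = capture["kernel_packets"]
    drops = capture["kernel_drops"]
    if packets == 0:
        return 0.0
    return 100.0 * drops / packets


if __name__ == "__main__":
    # Illustrative sample line; real events come from the sensor's eve.json.
    sample = ('{"event_type": "stats", "stats": {"capture": '
              '{"kernel_packets": 200000, "kernel_drops": 5000}}}')
    print(f"{drop_rate_percent(json.loads(sample)):.2f}% dropped")
```

The counters are cumulative since the sensor started, so comparing two events taken a few minutes apart gives a better picture of the current drop rate than a single reading.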
Installation Checklist
Before marking installation complete:
- All components installed in correct order
- Systemd services enabled and running
- Inter-component connectivity verified
- Web dashboards accessible
- Test data flows through entire pipeline
- High availability configured (if production)
- Backup procedures implemented and tested
- Firewall rules validated
- TLS/SSL certificates installed
- Documentation updated with actual configuration
- Monitoring of SOC infrastructure itself enabled
- Team trained on basic operations
Next Steps
With components installed:
- Proceed to detailed Configuration of each component
- Deploy agents and sensors to production endpoints
- Configure alerting rules and correlation logic
- Develop incident response playbooks
- Begin security event monitoring and tuning
Installation is just the beginning. Plan for 2-4 weeks of tuning and optimization before considering the SOC fully operational.
