Open Source Large Scale Full Packet Capturing
Moloch is an open source, large scale, full packet capturing, indexing, and database system. Moloch augments your current security infrastructure to store and index network traffic in standard PCAP format, providing fast, indexed access. An intuitive and simple web interface is provided for PCAP browsing, searching, and exporting. Moloch exposes APIs which allow PCAP data and JSON-formatted session data to be downloaded and consumed directly. Moloch stores and exports all packets in standard PCAP format, allowing you to also use your favorite PCAP-ingesting tools, such as Wireshark, during your analysis workflow.
Access to Moloch is protected by using HTTPS with digest passwords or by using a web server proxy that provides authentication. All PCAPs are stored on the sensors and are only accessed through the Moloch interface or API. Moloch is not meant to replace an IDS but instead works alongside them, storing and indexing all the network traffic in standard PCAP format and providing fast access. Moloch is built to be deployed across many systems and can scale to handle tens of gigabits/sec of traffic. PCAP retention is based on available sensor disk space; metadata retention is based on the scale of the Elasticsearch cluster. Both can be increased at any time and are under your complete control.
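For example, session metadata can be pulled with a few lines of Python. This is a minimal sketch only: the host, port, credentials, and the /sessions.json endpoint with its date and expression parameters are illustrative assumptions, so check your deployment's API documentation for the exact interface.

    import requests
    from requests.auth import HTTPDigestAuth

    VIEWER = "https://moloch.example.com:8005"  # hypothetical viewer host

    # Fetch session (SPI) metadata as JSON; the endpoint and parameters are assumptions
    resp = requests.get(
        VIEWER + "/sessions.json",
        params={"date": 1, "expression": "port == 443"},
        auth=HTTPDigestAuth("analyst", "password"),  # digest auth, as noted above
    )
    resp.raise_for_status()
    sessions = resp.json().get("data", [])
    print("retrieved %d sessions" % len(sessions))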
Components
The Moloch system is composed of three components:
capture – A threaded C application that monitors network traffic, writes PCAP-formatted files to disk, parses the captured packets, and sends metadata (SPI data) to elasticsearch.
viewer – A node.js application that runs on each capture machine, handling the web interface and the transfer of PCAP files.
elasticsearch – The search database technology powering Moloch.
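Since the SPI data lives in elasticsearch, it can also be inspected with plain HTTP against the elasticsearch REST API. A minimal sketch, assuming elasticsearch is listening on its default service port on localhost (the exact index names Moloch creates vary by version):

    import requests

    # List the cluster's indices; Moloch's session indices will appear among them
    r = requests.get("http://localhost:9200/_cat/indices?v")
    r.raise_for_status()
    print(r.text)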
Hardware Requirements
Moloch is built to run across many machines for large deployments. What follows are rough guidelines for folks capturing large amounts of data with high bit rates; obviously, tailor them to your situation. It is not recommended to run the capture and elasticsearch processes on the same machines for highly utilized GigE networks. For demo, small network, or home installations, everything on a single machine is fine.
Moloch capture/viewer systems
One dedicated management network interface and CPU for OS
For each network interface being monitored, we recommend ~10G of memory and another dedicated CPU (see the sizing sketch after this list)
If running suricata or another IDS, add an additional two (2) CPUs per interface and an additional 5G of memory (or more, depending on IDS requirements)
Disk space to store the PCAP files: we recommend at least 10TB, xfs (with the inode64 option set in fstab), RAID 5, and at least 5 spindles
Disable swap by removing it from fstab
If networks are highly utilized and an IDS is also running, CPU affinity is required
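These guidelines reduce to simple arithmetic. Here is a rough sizing helper, purely illustrative and using only the per-interface numbers from the list above:

    def capture_box_sizing(monitored_ifaces, with_ids=False):
        cpus = 1 + monitored_ifaces         # one dedicated CPU for the OS
        mem_gb = 10 * monitored_ifaces      # ~10G of memory per monitored interface
        if with_ids:
            cpus += 2 * monitored_ifaces    # two additional CPUs per interface for the IDS
            mem_gb += 5 * monitored_ifaces  # an additional 5G (or more) per interface
        return cpus, mem_gb

    # Example: two monitored interfaces plus an IDS -> 7 CPUs, 30G of memory
    print(capture_box_sizing(2, with_ids=True))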
Moloch elasticsearch systems (some black magic here!)
1/4 * Number_Highly_Utilized_Interfaces * Number_of_Days_of_History is a ROUGH guideline for the number of elasticsearch instances (nodes) required. (Example: 1/4 * 8 interfaces * 7 days = 14 nodes; a worked sketch follows this list)
Each elasticsearch node should have ~30G-40G memory (20G-30G [no more!] for the java process, at least 10G for the OS disk cache)
You can have multiple nodes per machine (Example: a 64G machine can run 2 ES nodes, 22G for each java process, leaving at least 10G for the disk cache)
Disable swap by removing it from fstab
Obviously the more nodes, the faster responses will be
You can always add more nodes, but it’s hard to remove nodes (more on this later)
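The same back-of-the-envelope math works for the elasticsearch cluster. A sketch of the rough guideline above, assuming 64G machines running two nodes each:

    import math

    def es_cluster_sizing(hot_interfaces, days_of_history, nodes_per_machine=2):
        # ROUGH guideline: 1/4 * highly utilized interfaces * days of history
        nodes = math.ceil(0.25 * hot_interfaces * days_of_history)
        machines = math.ceil(nodes / nodes_per_machine)
        return nodes, machines

    # Example from above: 1/4 * 8 interfaces * 7 days = 14 nodes, i.e. 7 machines
    print(es_cluster_sizing(8, 7))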
What OSes are supported?
Moloch is no longer supported on 32-bit machines. Our deployment is on CentOS 6 with the elrepo 4.x kernel upgrade for packet performance increases. A large amount of development is done on Mac OS X 10.11 using MacPorts; however, it has never been tested in a production setting. 🙂 Since 0.16, Moloch requires gcc/g++ 4.8.4 or later to compile, because node.js requires it.
The following OSes should work out of the box:
CentOS 7
Ubuntu 14.04, 16.04
FreeBSD 9
FreeBSD 10.0 (wiseService doesn’t work, but the wise plugin does), 10.3 is known NOT to compile currently
Example Configuration
Here is an example system setup for monitoring 8 highly utilized GigE networks, averaging ~5 Gigabit/sec, with ~7 days of PCAP storage.
capture/viewer machines
5x HP Apollo 4200
64GB of memory
80TB of disk
Running Moloch and Suricata
elasticsearch machines
10x HP DL380-G7
128GB of memory
6TB of disk
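Note that this example is consistent with the rough guideline above: 1/4 * 8 interfaces * 7 days = 14 elasticsearch nodes, which ten 128GB machines can host comfortably with multiple nodes per machine.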
Ports Used
tcp 8005 – Moloch web interface
tcp 9200-920x (configurable upper limit) – Elasticsearch service ports
tcp 9300-930x (configurable upper limit) – Elasticsearch mesh connections
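As a quick post-deployment sanity check, these ports can be probed with a few lines of Python (the hostname below is a placeholder):

    import socket

    # Probe the default Moloch and elasticsearch ports; adjust the host and ranges to taste
    for port in (8005, 9200, 9300):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(2)
            state = "open" if s.connect_ex(("moloch.example.com", port)) == 0 else "closed/filtered"
            print("tcp %d: %s" % (port, state))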