Containerized Malware Baselines

Defining Behavioral Ground Truth with Rootless Podman on Arch Linux

PROJECT STATUS: ACTIVE 🟢 · HOST: Arch Linux · ENGINE: Podman (Rootless) · FOCUS: Malware Baselines · Behavioral Analysis · Ground Truth

⚡ TL;DR

Malware labs often rely on analyst intuition—Sherlock Holmes with just coffee and a magnifying glass. This project establishes a reproducible behavioral ground truth: a verifiable baseline for known malware behavior in a strictly controlled environment. We achieve this using rootless containers on Arch Linux, creating an ecosystem where “root” is an illusion, the network is a simulation, and environments are ephemeral.

1. Principles of Isolation: Science Without Arson

We leverage Linux namespaces to create “hermetic seals” around processes. Each container believes it holds the keys to the kingdom, but in reality, it cannot even open the host’s bathroom door.

The Namespace Stack

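User Namespace: The container sees UID 0 (root), but the host maps it to the unprivileged user who started the container (e.g., UID 1000). Other container UIDs map into that user's subordinate range from /etc/subuid (e.g., UID 100000+). Internal Root ≠ External Root.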
PID Namespace: Process isolation. PID 1 is the container’s entry point, distinct from the host’s systemd.
Network Namespace: Virtual interfaces completely decoupled from the host’s network stack.
Mount Namespace: Filesystem isolation. The container cannot access host files unless explicitly granted via bind mounts.
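
To make the user-namespace mapping concrete, the sketch below translates a container UID into the host UID it corresponds to, given lines in the /proc/<pid>/uid_map format (container start, host start, range length). The function name map_uid is an assumption of this example, and real ranges depend on the host's /etc/subuid.

```shell
# map_uid CONTAINER_UID: read uid_map-style lines on stdin and print the
# host UID that CONTAINER_UID maps to. Exits non-zero if it is unmapped.
# Each input line is: <first container UID> <first host UID> <range length>
map_uid() {
  awk -v uid="$1" '
    uid >= $1 && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
    END { exit found ? 0 : 1 }
  '
}

# Usage on a live rootless setup (not executed here):
#   podman unshare cat /proc/self/uid_map | map_uid 0
```

If the first command prints your own login UID for container UID 0, the "illusory root" mapping is working as described.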

Daemonless Architecture

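Unlike Docker's default setup, which relies on a persistent daemon running as root, Podman has no central daemon: each container is started by fork/exec from the invoking user's session and supervised by a small per-container monitor process (conmon). This reduces the attack surface, since there is no long-lived privileged service for a sample to target.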

🧪 Golden Rule: Baselines are generated unprivileged, without persistence, and with simulated Internet only.

2. Building the Lab from Scratch

2.1 Host Preparation (Arch Linux)

User namespaces are the magic that makes rootless possible. Ensure they are enabled for unprivileged users (the stock Arch kernel enables them by default; linux-hardened does not):

sudo sysctl -w kernel.unprivileged_userns_clone=1
echo "kernel.unprivileged_userns_clone=1" | sudo tee /etc/sysctl.d/userns.conf

Install the necessary tooling, including the modern netavark network stack:

sudo pacman -Syu podman crun slirp4netns netavark tcpdump yara jq

Validation:

podman info | grep -i rootless
# Output should be: rootless: true
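
Rootless Podman also relies on subordinate UID/GID ranges for your user in /etc/subuid and /etc/subgid; without them it typically cannot map more than your own UID into the container. The check below is a small sketch, assuming the standard name:start:count format of those files. has_subid_range is a hypothetical helper, not a Podman command.

```shell
# has_subid_range USER FILE: succeed if FILE (in /etc/subuid format,
# name:start:count) has an entry for USER with a non-zero count.
has_subid_range() {
  awk -F: -v u="$1" '$1 == u && $3 > 0 { ok = 1 } END { exit ok ? 0 : 1 }' "$2"
}

# Usage (not executed here):
#   has_subid_range "$USER" /etc/subuid || echo "no subuid range for $USER"
#   has_subid_range "$USER" /etc/subgid || echo "no subgid range for $USER"
```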

3. Network Segmentation: The Stage Prop Internet

We create isolated networks to simulate a realistic environment without actual internet access.

podman network create netLAN --internal --subnet 10.201.0.0/24
podman network create netLAB --internal --subnet 10.202.0.0/24
podman network create netDNS --internal --subnet 10.203.0.0/24

💡 Note: The --internal flag ensures no external connectivity.
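
The isolation claim is easy to test empirically. The following sketch runs a throwaway container on a given network and tries to fetch an external address; on an --internal network the attempt should fail. The function name check_isolated, the probe address 1.1.1.1, and the PODMAN override variable (handy for dry runs) are assumptions of this example. A failed probe is necessary but not sufficient evidence of isolation, since it can also fail for unrelated reasons.

```shell
PODMAN=${PODMAN:-podman}   # override to stub out podman in dry runs

# check_isolated NETWORK: return 0 if an outbound HTTP probe from a
# container attached to NETWORK fails, 1 if it unexpectedly succeeds.
check_isolated() {
  if "$PODMAN" run --rm --network "$1" alpine \
       wget -q -T 3 -O /dev/null http://1.1.1.1/ >/dev/null 2>&1; then
    echo "LEAK: $1 reached an external host"
    return 1
  fi
  echo "OK: $1 has no external connectivity"
}

# Usage (not executed here):
#   for net in netLAN netLAB netDNS; do check_isolated "$net"; done
```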

Infrastructure Mock-up

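Fake DNS Listener (netDNS): A placeholder for a DNS sinkhole. As written, it only accepts TCP connections on port 5353 and answers nothing; real DNS lookups normally go to UDP port 53, so a sample's C2 domains will not resolve until a responding resolver stub takes its place.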

podman run -d --rm --network netDNS --name fake-dns alpine nc -lk -p 5353

Fake C2 Listener (netLAB): Captures HTTP beacons from malware.

podman run -d --rm --network netLAB --name fake-c2 alpine nc -lk -p 80

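These listeners live in separate network namespaces, creating a “Matrix-like” simulation for the malware. A sample can only reach a listener on a network it is attached to, so in the runs below (attached to netLAB only) fake-c2 is reachable and fake-dns is not.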

4. Obtaining Samples: Atomic Fossils

We download legal samples from public repositories like MalwareBazaar. We store the binary and its hash, never executing it on the host.

mkdir -p ~/malware-baselines
cd ~/malware-baselines

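# Example download (pseudo-command; the hash below is a placeholder).
# get_file returns a ZIP archive protected with the password "infected",
# and current versions of the API also expect an Auth-Key header.
curl -s -X POST https://mb-api.abuse.ch/api/v1/ \
  -d "query=get_file&sha256_hash=dabba0ff455exampleHASH" \
  -o sample.zip

# After extracting the archive to sample.bin (never run it on the host),
# record its hash
sha256sum sample.bin > sample.sha256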
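
Recording a hash is not the same as verifying it. Here is a minimal sketch of the verification step, assuming you already know the expected SHA-256 from the repository listing. verify_sha256 and the EXPECTED_SHA256 variable are hypothetical names.

```shell
# verify_sha256 FILE EXPECTED: compare FILE's SHA-256 (lowercase hex)
# against EXPECTED and report the result.
verify_sha256() {
  actual=$(sha256sum "$1" | awk '{ print $1 }')
  if [ "$actual" = "$2" ]; then
    echo "OK: $1"
  else
    echo "MISMATCH: $1 (got $actual)"
    return 1
  fi
}

# Usage (not executed here):
#   verify_sha256 sample.bin "$EXPECTED_SHA256"
```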

5. Execution: Running as “Root” (But Not Really)

5.1 Static Baseline Extraction

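We first inspect the binary’s DNA without execution. The container gets no network at all for this step. Note that the stock alpine image does not include file, so use a locally built image with it pre-installed if you need the type header.

podman run --rm --read-only --network none \
  -v ~/malware-baselines:/samples:ro \
  alpine sh -c "
  file /samples/sample.bin
  strings /samples/sample.bin | head -n 80
  "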

Expected Output:

File type headers.
Suspicious strings: CreateFile, InternetOpenA, Mutex Global\a1b2.
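
The manual scan for suspicious strings can be made repeatable with a small filter. This sketch reads strings output on stdin and prints only the lines matching an indicator pattern. The default pattern simply reuses the examples above and is an assumption to extend per sample family; flag_strings is a hypothetical name.

```shell
# flag_strings [PATTERN]: print stdin lines that match an extended
# regex of indicators, de-duplicated and sorted.
flag_strings() {
  grep -E -- "${1:-CreateFile|InternetOpenA|Mutex}" | sort -u
}

# Usage inside the analysis container (not executed here):
#   strings /samples/sample.bin | flag_strings
```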

5.2 Dynamic Baseline & Traffic Capture

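Now we execute the sample in the isolated environment, capturing its attempts to “phone home.” This only works for a sample the container can actually run, i.e. a Linux ELF binary compatible with the image; a Windows PE sample (such as one importing CreateFile or InternetOpenA) needs a Wine-based image or a VM instead.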

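1. Start Sniffer:

Rootless Podman networks are not visible from the host's network namespace, so the capture runs inside the listener's network namespace (fake-c2 must already be running):

sudo -v   # cache credentials so the background sudo does not prompt
C2_PID=$(podman inspect --format '{{.State.Pid}}' fake-c2)
sudo nsenter -t "$C2_PID" -n tcpdump -i any -w capture.pcap &
TCP_PID=$!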

2. Execute in Container:

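podman run --rm -it \
  --network netLAB \
  -v ~/malware-baselines:/data:ro \
  alpine sh -c "
  echo 'Launching sample...'
  sleep 2
  # The download has no execute bit and /data is read-only, so run a copy
  cp /data/sample.bin /tmp/sample.bin
  chmod +x /tmp/sample.bin
  /tmp/sample.bin || true
  "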

3. Stop Capture & Save Logs:

sudo kill "$TCP_PID"   # the capture runs as root, so a plain kill is denied
podman logs fake-c2 > fake-c2.log

Interpretation: Compare capture.pcap with fake-c2.log to identify repeated HTTP requests (beaconing or floods), lookups that never resolve, and C2 connection attempts. Together these form the sample's behavioral baseline.
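
Part of that interpretation can be automated. Since the nc listener writes whatever clients send to its stdout, fake-c2.log should contain raw HTTP request text. The sketch below counts distinct request lines in such a log. The function name summarize_beacons and the assumption that requests arrive as plain HTTP/1.x are specific to this example.

```shell
# summarize_beacons LOGFILE: print "COUNT METHOD PATH" for each distinct
# HTTP/1.x request line in LOGFILE, most frequent first.
summarize_beacons() {
  tr -d '\r' < "$1" |
    grep -E '^(GET|POST|PUT|HEAD) [^ ]+ HTTP/1\.[01]$' |
    sort | uniq -c | sort -rn |
    awk '{ print $1, $2, $3 }'
}

# Usage (not executed here):
#   summarize_beacons fake-c2.log
```

A request line that repeats at a steady rate is a reasonable first candidate for the sample's beacon.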

6. Resource Overhead Benchmark

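Is the container overhead significant? We compare hashing a 1 GiB file on the host vs. inside a rootless container. The container figure includes container startup, and truncate creates a sparse, all-zero file, so this mostly measures CPU and startup cost rather than disk throughput.

truncate -s 1G benchmark.data
time sha256sum benchmark.data
time podman run --rm -v "$(pwd)":/bench:ro alpine sha256sum /bench/benchmark.data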

Typical Results (absolute times vary with hardware; the ratio is what matters):

Bare Metal: ~100s
Rootless Container: ~105s (~5% overhead)

📊 Analysis: If overhead exceeds ~10%, check cgroup limits or storage driver performance.
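
The overhead figure is simply the relative difference between the two wall-clock times. Here is a small helper, assuming times in seconds read off the time output; overhead_pct is a hypothetical name.

```shell
# overhead_pct HOST_SECONDS CONTAINER_SECONDS: print the container's
# relative overhead as a percentage with one decimal place.
overhead_pct() {
  awk -v h="$1" -v c="$2" 'BEGIN { printf "%.1f\n", (c - h) / h * 100 }'
}

# Example with the figures quoted above:
#   overhead_pct 100 105    # prints 5.0
```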

7. Constructing IoC Baselines

We generate a standardized report for each sample. Since we don’t distribute live viruses, we distribute these “digital fingerprints.”

### Indicators of Compromise (Baseline)
- **Sample Hash:** `dabba0ff455exampleHASH`
- **Suspicious Strings:** `CreateFile`, `InternetOpenA`, `Mutex Global\a1b2`
- **Fake C2 Network:** `netLAB 10.202.0.0/24`
- **Observed Behavior:** Internal HTTP beacon captured by listener; no external leakage.
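
Writing these reports by hand invites drift between samples. One possible way to keep the layout identical is to render it from a function. This is a sketch, and render_ioc_report with its four positional arguments is a hypothetical interface.

```shell
# render_ioc_report HASH STRINGS NETWORK BEHAVIOR
# Prints the baseline report in the Markdown layout shown above.
# Arguments are inserted as-is, so backslashes (e.g. in mutex names)
# stay literal.
render_ioc_report() {
  cat <<EOF
### Indicators of Compromise (Baseline)
- **Sample Hash:** \`$1\`
- **Suspicious Strings:** $2
- **Fake C2 Network:** \`$3\`
- **Observed Behavior:** $4
EOF
}

# Usage (not executed here):
#   render_ioc_report "$(cut -d' ' -f1 sample.sha256)" \
#     '`CreateFile`, `InternetOpenA`' 'netLAB 10.202.0.0/24' \
#     'Internal HTTP beacon captured by listener; no external leakage.' \
#     > sample.ioc.md
```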

8. Conclusion

Rootless Podman on Arch Linux provides:

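Illusory Root: Root inside the container maps to an unprivileged user on the host, which sharply limits (but does not eliminate) what a sample can do to the host.
Segmented Networking: Internal-only networks with no route to the Internet.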
Ephemeral Environments: Containers die, but logs remain as fossils.
Reproducible Science: Consistent benchmarks and auditable workflows.

This approach transforms malware analysis from “gut feeling” to measurable science. It’s lightweight, disposable, and ethical—even when the malware tries its best to be bad.