Data Collection with eBPF
Learn how Pynt uses eBPF to collect live traffic data for real-time API discovery and security testing, enabling seamless monitoring with minimal performance impact.
Introduction & Purpose
This page provides a concise overview of how Pynt leverages eBPF and AWS Traffic Mirroring (where relevant) to capture real-time HTTP traffic, generate HAR files, and integrate with Pynt’s SaaS for deeper API security insights. The goal is to deliver comprehensive API visibility, uncovering both known and shadow endpoints while maintaining minimal overhead.
1. Why Live Traffic Capture?
Full Visibility: By monitoring raw network activity, you discover all real-time API calls - including shadow or undocumented endpoints.
Low Overhead: eBPF operates at the Linux kernel level with minimal performance impact and runs seamlessly in cloud environments.
Actionable Insights: Generating accurate HAR files enables quick integration into security workflows, identifying risks you might otherwise miss with static or manual discovery methods.
2. Key Components
eBPF Sniffer
Purpose: Intercepts HTTP traffic by hooking into system calls (accept, read, write, close) at the kernel level.
Deployment: Runs as a DaemonSet in Kubernetes, ensuring continuous coverage across all nodes.
Value: Near real-time data capture with minimal system overhead or code instrumentation.
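To make the capture step concrete, the sketch below shows what happens to a raw buffer once it has been intercepted from a read()/write() syscall: the bytes are parsed into structured HTTP fields for downstream processing. This is an illustrative stand-in, not Pynt's actual parser; the function name and returned fields are assumptions.

```python
def parse_http_request(raw: bytes) -> dict:
    """Split a captured request buffer into method, path, and headers.
    Illustrative only: a real sniffer must also handle pipelining,
    partial reads, and chunked bodies."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path, "version": version,
            "headers": headers, "body": body}

# A buffer as it might be captured from a read() syscall on an accepted socket:
captured = (b"GET /api/v1/users HTTP/1.1\r\n"
            b"Host: example.internal\r\n"
            b"Accept: application/json\r\n\r\n")
print(parse_http_request(captured)["path"])  # /api/v1/users
```

In the real pipeline the kernel-side eBPF program only copies buffers and connection metadata to user space; parsing like the above runs outside the kernel.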
Aggregator
Purpose: Collects, filters, and deduplicates HTTP session data; generates HAR files on demand.
Deployment: A Kubernetes Deployment that pulls data from the Sniffer through RabbitMQ.
Responsibilities:
Filtering irrelevant traffic
Deduplicating repeated sessions
Storing the last X sessions in memory
Exposing an API to produce HAR files
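The Aggregator's responsibilities above can be sketched as a small in-memory store: filter out irrelevant paths, deduplicate repeated sessions by a content-derived key, and retain only a bounded number of recent sessions. The class name, filter list, and dedup key are hypothetical choices for illustration, not Pynt's implementation.

```python
import hashlib
from collections import deque

class SessionStore:
    """Sketch of the Aggregator pipeline (hypothetical API):
    filter, deduplicate, and keep only the last `max_sessions`."""

    def __init__(self, max_sessions=100,
                 ignored_paths=("/healthz", "/metrics")):
        self.ignored_paths = ignored_paths
        self.seen = set()                            # dedup keys
        self.sessions = deque(maxlen=max_sessions)   # bounded retention

    def _key(self, session):
        # Treat method + path + header names as the session identity.
        ident = (session["method"], session["path"],
                 tuple(sorted(session.get("headers", {}))))
        return hashlib.sha256(repr(ident).encode()).hexdigest()

    def add(self, session):
        if session["path"] in self.ignored_paths:
            return False          # filtered: irrelevant traffic
        key = self._key(session)
        if key in self.seen:
            return False          # duplicate of a stored session
        self.seen.add(key)
        self.sessions.append(session)
        return True

store = SessionStore(max_sessions=100)
store.add({"method": "GET", "path": "/api/v1/users", "headers": {"host": "a"}})  # True
store.add({"method": "GET", "path": "/api/v1/users", "headers": {"host": "a"}})  # False (duplicate)
store.add({"method": "GET", "path": "/healthz", "headers": {}})                  # False (filtered)
```

The bounded deque gives the "last X sessions" behavior for free: once the limit is reached, the oldest session is evicted automatically.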
RabbitMQ
Purpose: Acts as a message queue for seamless, scalable communication between the Sniffer and the Aggregator.
Deployment: A Kubernetes Deployment in the same cluster.
Benefit: Decouples the data capture from processing, improving reliability and scalability.
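A session traveling from Sniffer to Aggregator needs a serializable envelope. The sketch below shows one plausible JSON message shape; the field names are assumptions for illustration, not Pynt's wire format, and the publish call shown in the comment uses a generic RabbitMQ client (such as pika) for context only.

```python
import json
import time

def make_session_message(method, path, status, node):
    """Hypothetical message envelope a sniffer could publish to RabbitMQ.
    Field names are illustrative, not Pynt's actual wire format."""
    return json.dumps({
        "captured_at": time.time(),
        "node": node,
        "request": {"method": method, "path": path},
        "response": {"status": status},
    }).encode("utf-8")

# Against a real broker, the body would be published with a client, e.g.:
#   channel.basic_publish(exchange="", routing_key="sessions", body=msg)
msg = make_session_message("GET", "/api/v1/users", 200, "node-a")
decoded = json.loads(msg)
print(decoded["response"]["status"])  # 200
```

Because the Sniffer only serializes and publishes, a slow or busy Aggregator never blocks capture; the queue absorbs the backlog, which is the decoupling benefit described above.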
Attacker Container (Sidecar)
Purpose: Shares a volume with the Aggregator to access generated HAR files for subsequent scanning or analysis.
Deployment: A sidecar container within the Aggregator pod.
Benefit: Automates vulnerability testing (e.g., a “har-based” approach) without extra network hops.
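The sidecar pattern is just shared-filesystem handoff: the Aggregator writes HAR files into a mounted volume, and the attacker container reads them off disk with no network hop. The sketch below uses a temporary directory to stand in for the shared volume; paths and file names are illustrative.

```python
import json
import pathlib
import tempfile

# Stand-in for the volume mounted by both containers in the Aggregator pod.
shared = pathlib.Path(tempfile.mkdtemp())

# Aggregator side: write a minimal HAR file to the shared volume.
har = {"log": {"version": "1.2",
               "creator": {"name": "aggregator-sketch", "version": "0"},
               "entries": []}}
(shared / "capture.har").write_text(json.dumps(har))

# Attacker-container side: pick up whatever HAR files have appeared.
found = sorted(p.name for p in shared.glob("*.har"))
print(found)  # ['capture.har']
```

In Kubernetes the same effect comes from an `emptyDir` (or similar) volume mounted into both containers of the pod, so file visibility is immediate and local.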
3. Architecture Diagram
(Diagram: the eBPF Sniffer DaemonSet on each node publishes captured sessions to RabbitMQ; the Aggregator consumes, filters, and writes HAR files to a volume shared with the Attacker sidecar; session metadata is uploaded to Pynt SaaS.)
4. Data Flow & High-Level Steps
Syscall Interception: The eBPF Sniffer hooks system calls (accept, read, write, close), capturing HTTP requests/responses in real time.
Queue & Transport: Captured session data is sent to RabbitMQ.
Aggregation & Storage: The Aggregator consumes these sessions, applying filtering and deduplication rules.
HAR Request: An API endpoint on the Aggregator lets you request a HAR file for the last X sessions.
HAR File Generation: The Aggregator writes the HAR file to a shared volume.
Attacker Container: If needed, the sidecar container accesses the HAR file for security tests or other usage.
Pynt SaaS Upload: Session metadata is also uploaded to Pynt’s SaaS for broader API cataloging and vulnerability analysis.
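The steps above can be sketched end to end in a few lines, with `queue.Queue` standing in for RabbitMQ and a minimal HAR 1.2 document as output. All names are illustrative assumptions, not Pynt's code.

```python
import queue
from datetime import datetime, timezone

# Steps 1-2: the Sniffer publishes captured sessions (queue.Queue as the broker).
bus = queue.Queue()
bus.put({"method": "GET", "path": "/api/v1/users", "status": 200})
bus.put({"method": "POST", "path": "/api/v1/login", "status": 401})

# Step 3: the Aggregator drains and retains sessions.
sessions = []
while not bus.empty():
    sessions.append(bus.get())

# Steps 4-5: build a minimal HAR 1.2 document for the retained sessions.
def to_har(sessions):
    now = datetime.now(timezone.utc).isoformat()
    entries = [{
        "startedDateTime": now,
        "time": 0,
        "request": {"method": s["method"], "url": s["path"],
                    "httpVersion": "HTTP/1.1", "headers": [],
                    "queryString": [], "cookies": [],
                    "headersSize": -1, "bodySize": -1},
        "response": {"status": s["status"], "statusText": "",
                     "httpVersion": "HTTP/1.1", "headers": [], "cookies": [],
                     "content": {"size": 0, "mimeType": ""},
                     "redirectURL": "", "headersSize": -1, "bodySize": -1},
        "cache": {},
        "timings": {"send": 0, "wait": 0, "receive": 0},
    } for s in sessions]
    return {"log": {"version": "1.2",
                    "creator": {"name": "sketch", "version": "0"},
                    "entries": entries}}

har = to_har(sessions)
print(len(har["log"]["entries"]))  # 2
```

From here, step 6 is writing `har` to the shared volume for the sidecar, and step 7 is uploading session metadata to Pynt's SaaS.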
5. Security & Trust Considerations
Minimal Footprint: eBPF runs in a sandboxed environment at the kernel layer, ensuring system stability and performance.
Controlled Access: Only authenticated users and designated microservices can request HAR files, reducing risk.
Filtered Storage: The Aggregator’s filtering rules ensure sensitive or irrelevant data is not stored beyond the configured retention.
Encryption in Transit: Data from the Sniffer to the Aggregator is protected via secure protocols and contained within your Kubernetes cluster.
6. Next Steps & Additional Resources
Setup Guidelines: We can provide a private repository or documentation detailing the exact deployment steps for your cluster.
Configuration Examples: Sample YAML manifests for deploying the DaemonSet, Aggregator, RabbitMQ, and sidecar containers.
Integration Support: Our team is available to guide you through customizing filters, storage limits, and any specialized network settings.
For specific inquiries regarding this feature, please reach out to support@pynt.io.