Setting up mitmproxy SOCKSv5 proxy on Firefox - lmmx/devnotes GitHub Wiki

Many sites serve JSON from internal APIs that is turned into HTML in the browser. This setup lets you store those responses locally and do whatever you please with them. It works with anything going through your browser (images, full HTML pages); you can even rewrite all your web traffic with it!

:warning: Don't forget to remove the mitmproxy certificate from Firefox after use.

Setup

To set up mitmproxy to scan your own web traffic (i.e. to "Man-In-The-Middle" yourself) and handle the data your computer receives over the web, you add its certificate authority to Firefox and then run mitmdump with a script whose handler acts when a condition (an if statement) is met.

  1. First make a venv for your project somewhere, e.g. called my_web_proxy (or pick your own name)
mkdir my_web_proxy
cd my_web_proxy
uv venv
source .venv/bin/activate
uv init --app
uv add mitmproxy
     You are now in an activated virtual environment, with a Python project named my_web_proxy and a main.py script. This is the script you will point mitmdump at.

  2. To make Firefox accept mitmproxy (the CLI/TUI interface) and/or mitmdump (headless mitmproxy), we need to add its certificate .pem, which we get from the official site. Download the files there to ~/.mitmproxy on your machine; the one named mitmproxy-ca-cert.pem is the one you will add to Firefox.

  1. Go to Firefox settings, search for "proxy" in the settings search, click configure

  2. Your connection settings menu should show "Use system proxy settings". Change it to "Manual proxy configuration", then set the HTTP proxy to "127.0.0.1" (this means "my own machine") and the port to 8080. Check the box "Also use this proxy for HTTPS". The box "Proxy DNS when using SOCKS v5" should already be checked. Click OK to save changes.

  3. In Firefox settings again, search for "certificates" in the settings search, click "Manage certificates".

  4. It should open on the Authorities tab. Click "Import", then select the file you saved to ~/.mitmproxy/mitmproxy-ca-cert.pem. Check the box "This certificate can identify web sites". This means that Firefox will accept sites presented with this certificate as legitimate (whereas the traffic is actually coming from the mitmproxy interception middleware).

    • Note that you can delete this at any time and your machine will be back to normal: no one will be able to run mitmproxy and intercept your traffic. While it is set up, though, any running instance of mitmproxy could read your traffic, so it's advisable not to leave it there if you're not using it!
  5. [Optional] To review, scroll down to "m" in the alphabetical list under certificate name: you should find a certificate authority listed as "mitmproxy", with a certificate called "mitmproxy" whose security device is "Software Security Device".
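
For the optional review step, one way to confirm that the authority listed in Firefox is the same file you downloaded is to compare fingerprints. The sketch below uses only the standard library. The function name `pem_sha256_fingerprint` is made up for illustration, and it assumes the file holds a single PEM certificate and that your Firefox certificate viewer shows a SHA-256 fingerprint as colon-separated hex.

```python
import hashlib
import ssl
from pathlib import Path


def pem_sha256_fingerprint(pem_text: str) -> str:
    """Return the SHA-256 fingerprint of one PEM certificate as colon-separated hex."""
    # Strip the PEM armour and base64-decode to the raw DER bytes
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.sha256(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


if __name__ == "__main__":
    # Path taken from the setup steps above
    cert_path = Path.home() / ".mitmproxy" / "mitmproxy-ca-cert.pem"
    if cert_path.exists():
        print(pem_sha256_fingerprint(cert_path.read_text()))
    else:
        print(f"No certificate found at {cert_path}")
```

If the printed value differs from what Firefox shows for the "mitmproxy" authority, the imported certificate is not the one in ~/.mitmproxy.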

If you then run mitmproxy (to view traffic as a table on the command line) or mitmdump (the non-interactive scripting layer), your browser's requests will be picked up there: all URLs, the requests sent, and the responses received are accessible.
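
If pages stop loading after you change the proxy settings, the usual cause is that nothing is listening on the configured port yet. A minimal sketch for checking this from Python, assuming the 127.0.0.1:8080 address used in the Firefox settings above (the function name `proxy_is_listening` is hypothetical):

```python
import socket


def proxy_is_listening(host: str = "127.0.0.1", port: int = 8080, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable
        return False


if __name__ == "__main__":
    state = "up" if proxy_is_listening() else "not reachable"
    print(f"Proxy on 127.0.0.1:8080 is {state}")
```

This only shows that something accepts connections on that port; it does not confirm that the listener is mitmproxy.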

Scripting

You can then run a script that triggers with mitmdump (mitmdump -s main.py), such as this:

  • Checks POST requests with a JSON content type only
  • Keeps only responses whose JSON contains a specific nested key (data → someRequiredKey)
  • Parses the response body as JSON
  • Extracts the targeted nested data field
  • Builds a filename from selected attributes in the extracted object (with safe fallbacks)
  • Writes the original response payload to disk as a JSON file
  • Falls back to a timestamp-based filename if extraction or naming fails
  • Ensures output directory exists before writing files
import json
from datetime import datetime
from pathlib import Path

from mitmproxy import http

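out_dir = Path("json")
# Create the output directory up front so write_text() cannot fail on a missing folder
out_dir.mkdir(parents=True, exist_ok=True)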

def response(flow: http.HTTPFlow) -> None:
    method = flow.request.method
    content_type = flow.response.headers.get("content-type", "")

    if method != "POST" or "application/json" not in content_type:
        return

    try:
        # get_text() can raise on undecodable bodies or return None for empty ones
        raw = flow.response.get_text()
        payload = json.loads(raw)
    except (TypeError, ValueError):
        return

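    # Guard each level: the payload may be a list, or "data" may be null
    data = payload.get("data") if isinstance(payload, dict) else None
    data_block = data.get("someRequiredKey") if isinstance(data, dict) else None
    if not data_block:
        return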

    try:
        primary_id = data_block.get("id", "unknown")
        secondary_id = (data_block.get("label") or "").replace(" ", "-")

        file_stem = f"{primary_id}"
        if secondary_id:
            file_stem += f"_{secondary_id}"

        out_file = out_dir / f"{file_stem}.json"

    except Exception as exc:
        print(f"-------- ERROR building filename: {exc} -------------")
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
        out_file = out_dir / f"{timestamp}.json"

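    try:
        out_file.write_text(raw, encoding="utf-8")
    except OSError as exc:
        # Never let a failed write propagate out of the hook
        print(f"-------- ERROR writing {out_file}: {exc} -------------")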
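
The filename logic in this script only replaces spaces, so an id or label containing a slash or other awkward character could still produce a bad path. One possible hardening is sketched below. The helper name `safe_stem` and the allowed character set are assumptions for illustration, not part of mitmproxy.

```python
import re
from typing import Optional


def safe_stem(primary_id: object, label: Optional[str] = None, max_len: int = 100) -> str:
    """Build a filesystem-friendly file stem from an id and an optional label."""
    parts = [str(primary_id) if primary_id not in (None, "") else "unknown"]
    if label:
        parts.append(label)
    stem = "_".join(parts)
    # Collapse whitespace runs to hyphens, then drop anything outside a safe set
    stem = re.sub(r"\s+", "-", stem.strip())
    stem = re.sub(r"[^A-Za-z0-9._-]", "", stem)
    # Avoid leading or trailing dots (hidden files, "..")
    stem = stem.strip(".")
    return stem[:max_len] or "unknown"
```

In the script, the stem could then be built with `safe_stem(data_block.get("id"), data_block.get("label"))` before appending `.json`.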

Obviously you'd need to edit this to fit your data. A simple way to get started is to trigger on a keyword you know will be in the response, and then print or set a breakpoint rather than save files (or save the files and deal with them afterwards).
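
The keyword approach can be sketched as follows. `KEYWORD` is a placeholder you would replace, and `body_matches` is a hypothetical helper. The type annotation on `flow` is left out so the snippet has no imports; mitmdump calls a module-level `response(flow)` either way.

```python
KEYWORD = "someRequiredKey"  # placeholder: a string you expect in the response body


def body_matches(text, keyword=KEYWORD):
    """Return True if the decoded body is non-empty and contains the keyword."""
    return bool(text) and keyword in text


def response(flow):
    """Hook called for each response; prints a line instead of saving a file."""
    try:
        # strict=False decodes leniently rather than raising on bad bytes
        text = flow.response.get_text(strict=False)
        if body_matches(text):
            print("KEYWORD HIT:", flow.request.pretty_url, f"({len(text)} chars)")
    except Exception as exc:
        print("keyword handler error:", exc)
```

Once the print lines show the right URLs, the same condition can be moved into a script that parses and saves the body.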

The most important thing is not to raise errors: make sure the script catches them rather than crashing, as an uncaught error will block the loop.
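
One way to apply this rule uniformly is to wrap each hook in a decorator that catches everything. This is a sketch, not something mitmproxy provides: the name `never_raise` is made up here, and it assumes mitmdump is happy to call a wrapped module-level function, which `functools.wraps` is meant to help with by preserving the hook's name.

```python
import functools
import json
import traceback


def never_raise(hook):
    """Wrap a hook so any exception is printed and swallowed, not propagated."""
    @functools.wraps(hook)
    def wrapper(*args, **kwargs):
        try:
            return hook(*args, **kwargs)
        except Exception:
            print(f"-------- ERROR in {hook.__name__} --------")
            traceback.print_exc()
            return None
    return wrapper


@never_raise
def response(flow):
    # Deliberately fragile body: any failure here is caught by the wrapper
    payload = json.loads(flow.response.get_text())
    print(payload["data"]["someRequiredKey"])
```

The trade-off is that mistakes in the hook only show up as printed tracebacks, so watch the mitmdump output while developing.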

Some example variants to get started with:

Minimal logger

import json
from mitmproxy import http

def response(flow: http.HTTPFlow) -> None:
    try:
        if flow.request.method != "POST":
            return

        ct = flow.response.headers.get("content-type", "")
        if "application/json" not in ct:
            return

        payload = json.loads(flow.response.get_text())

        data_block = payload.get("data", {}).get("someRequiredKey")
        if not data_block:
            return

        print("MATCH:", {
            "id": data_block.get("id"),
            "label": data_block.get("label")
        })

    except Exception as e:
        print("handler error:", e)
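
To exercise the filtering logic from this logger without opening the browser, you can pass it a stand-in object. The sketch below is an assumption-heavy stub: `make_fake_flow` and `extract_match` are hypothetical names, and the fake only mimics the few attributes used on this page (a real flow has case-insensitive headers, which a plain dict does not).

```python
import json
from types import SimpleNamespace


def make_fake_flow(method, content_type, body):
    """Build a minimal object with the attributes the handlers on this page read."""
    request = SimpleNamespace(method=method)
    response = SimpleNamespace(
        headers={"content-type": content_type},
        get_text=lambda strict=True: body,
    )
    return SimpleNamespace(request=request, response=response)


def extract_match(flow):
    """Apply the logger's filtering steps, returning the match instead of printing."""
    if flow.request.method != "POST":
        return None
    if "application/json" not in flow.response.headers.get("content-type", ""):
        return None
    payload = json.loads(flow.response.get_text())
    data = payload.get("data") if isinstance(payload, dict) else None
    data_block = data.get("someRequiredKey") if isinstance(data, dict) else None
    if not isinstance(data_block, dict):
        return None
    return {"id": data_block.get("id"), "label": data_block.get("label")}


if __name__ == "__main__":
    body = json.dumps({"data": {"someRequiredKey": {"id": 7, "label": "demo"}}})
    print(extract_match(make_fake_flow("POST", "application/json", body)))
```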

Safe file dump with timestamp fallback

Persist to disk with timestamps

import json
from datetime import datetime
from pathlib import Path
from mitmproxy import http

out_dir = Path("json")
out_dir.mkdir(exist_ok=True)


def response(flow: http.HTTPFlow) -> None:
    try:
        if flow.request.method != "POST":
            return

        if "application/json" not in flow.response.headers.get("content-type", ""):
            return

        raw = flow.response.get_text()

        try:
            payload = json.loads(raw)
        except Exception:
            return

        data_block = payload.get("data", {}).get("someRequiredKey")
        if not data_block:
            return

        # very defensive filename logic
        try:
            base = str(data_block.get("id") or "unknown")
        except Exception:
            base = "unknown"

        ts = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
        out_file = out_dir / f"{base}_{ts}.json"

        try:
            out_file.write_text(raw, encoding="utf-8")
        except Exception as e:
            print("write failed:", e)

    except Exception as e:
        print("handler error:", e)
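
Once files have accumulated, you will probably want to read them back. A minimal sketch, assuming the same `json` output directory as the script above (the name `load_dumps` is hypothetical):

```python
import json
from pathlib import Path


def load_dumps(out_dir="json"):
    """Yield (path, payload) for each readable JSON file in out_dir, sorted by name."""
    for path in sorted(Path(out_dir).glob("*.json")):
        try:
            yield path, json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError) as exc:
            # ValueError covers both invalid JSON and undecodable bytes
            print(f"skipping {path.name}: {exc}")


if __name__ == "__main__":
    for path, payload in load_dumps():
        print(path.name, type(payload).__name__)
```

Because the filenames end in a timestamp, sorting by name groups the dumps by id and orders each group chronologically.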

Debug switch inspector mode

Useful for when you don't yet know the structure

import json
from mitmproxy import http

DEBUG = True


def response(flow: http.HTTPFlow) -> None:
    try:
        if flow.request.method != "POST":
            return

        ct = flow.response.headers.get("content-type", "")
        if "application/json" not in ct:
            return

        payload = json.loads(flow.response.get_text())

        if DEBUG:
            # inspect top-level shape only
            print("TOP KEYS:", list(payload.keys())[:10])

        data_block = payload.get("data", {}).get("someRequiredKey")

        if DEBUG and data_block:
            print("DATA SAMPLE:", {
                k: data_block.get(k)
                for k in list(data_block.keys())[:5]
            })

        if not data_block:
            return

    except Exception as e:
        print("debug handler error:", e)
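
The inspector above prints top-level keys only. When the interesting data is nested more deeply, a recursive summary of keys and value types can help. The following is a sketch: `describe_shape` and its depth and key limits are illustrative choices, not part of mitmproxy.

```python
def describe_shape(obj, max_depth=3, max_keys=10):
    """Return a nested summary of keys and value types, without the values."""
    if max_depth <= 0:
        return type(obj).__name__
    if isinstance(obj, dict):
        return {
            key: describe_shape(value, max_depth - 1, max_keys)
            for key, value in list(obj.items())[:max_keys]
        }
    if isinstance(obj, list):
        # Summarise a list by its first element only
        return [describe_shape(obj[0], max_depth - 1, max_keys)] if obj else []
    return type(obj).__name__
```

Inside the handler, `print("SHAPE:", describe_shape(payload))` could replace the `TOP KEYS` line while you work out which nested key to filter on.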