multipart/form-data parsing appears to truncate a file part by 1 byte #3077

@swell-d

Guys, sorry, I'm emotional because I spent 7 hours chasing a bug, and it turned out that the latest Werkzeug update, 3.1.4, was to blame. Below is text from ChatGPT explaining the problem:

In Werkzeug 3.1.4, multipart/form-data parsing appears to truncate a file part by 1 byte in a reproducible edge case. The same code and client request work correctly with Werkzeug 3.1.3.

This shows up during chunked uploads from the browser: one specific chunk, at one specific byte offset, comes out of the multipart parser 1 byte short (length expected - 1), while a raw (non-multipart) upload of the exact same Blob arrives at full length. Since only the multipart path is affected and 3.1.3 handles the same request correctly, this points to a regression in multipart parsing in 3.1.4.

Steps to reproduce:

  1. Run the minimal Flask app below.
  2. Use the HTML test page below to select a large file and run the probe.
  3. Observe that with Werkzeug 3.1.4 the multipart endpoint reports one chunk as 1 byte shorter, while the raw endpoint reports the correct size.
  4. Downgrade to Werkzeug 3.1.3 and repeat: the multipart endpoint reports correct sizes for all tested chunks.
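
For anyone who wants to script the probe without a browser, here is a minimal sketch using only the Python standard library. The helper names (`build_multipart`, `probe_form`) and the boundary handling are hypothetical, and the body layout only approximates what a browser's `FormData` sends. It is not verified to trigger the truncation, which may depend on the exact payload bytes and on how the request body is split up while the server reads it.

```python
import json
import urllib.request
import uuid

CRLF = b"\r\n"


def build_multipart(fields, file_field, filename, payload, boundary=None):
    """Build a multipart/form-data body; returns (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    # Plain text fields (offset, expected) come first, as in the HTML page.
    for name, value in fields.items():
        parts.append(delimiter + CRLF)
        header = f'Content-Disposition: form-data; name="{name}"'
        parts.append(header.encode("ascii") + CRLF + CRLF)
        parts.append(str(value).encode("ascii") + CRLF)
    # The file part carries the chunk bytes unchanged.
    parts.append(delimiter + CRLF)
    header = (
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"'
    )
    parts.append(header.encode("ascii") + CRLF)
    parts.append(b"Content-Type: application/octet-stream" + CRLF + CRLF)
    parts.append(payload + CRLF)
    # Closing delimiter.
    parts.append(delimiter + b"--" + CRLF)
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def probe_form(url, payload, offset):
    """POST one chunk to the multipart endpoint and return its JSON reply."""
    body, content_type = build_multipart(
        {"offset": offset, "expected": len(payload)},
        "file",
        "chunk.bin",
        payload,
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the Flask app running, `probe_form("http://127.0.0.1:5000/upload_probe_form", chunk, offset)` should return the same JSON fields the HTML page logs, so `file_len` can be compared with `len(chunk)` directly.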

Minimal reproducible example (server):

from flask import Flask, request, jsonify
from flask_login import LoginManager

app = Flask(__name__)
app.secret_key = "test"

login_manager = LoginManager(app)

@app.route("/upload_probe_raw", methods=["POST"])
def upload_probe_raw():
    offset = request.headers.get("X-Offset", type=int)
    expected = request.headers.get("X-Expected", type=int)
    data = request.get_data(cache=False) or b""
    return jsonify({
        "offset": offset,
        "expected": expected,
        "content_length": request.content_length,
        "data_len": len(data),
    })

@app.route("/upload_probe_form", methods=["POST"])
def upload_probe_form():
    offset = request.form.get("offset", type=int)
    expected = request.form.get("expected", type=int)

    f = request.files.get("file")
    if not f:
        return jsonify({"error": "no file"}), 400

    # Read the whole parsed file part; on 3.1.4 this is what comes back 1 byte short.
    data = f.read() or b""

    return jsonify({
        "offset": offset,
        "expected": expected,
        "content_length": request.content_length,
        "file_len": len(data),
    })

if __name__ == "__main__":
    app.run(port=5000, debug=True)

Minimal reproducible example (client):

<input type="file" id="fileInput">
<button id="probeBtn">Probe upload</button>
<pre id="out"></pre>

<script>
const chunkSize = 10 * 1024 * 1024;
const badIndex = 25;

function log(s) {
  document.getElementById("out").textContent += s + "\n";
}

async function sendRaw(blob, offset, expected) {
  const r = await fetch("/upload_probe_raw", {
    method: "POST",
    headers: {
      "Content-Type": "application/octet-stream",
      "X-Offset": String(offset),
      "X-Expected": String(expected)
    },
    body: blob
  });
  return await r.json();
}

async function sendForm(blob, offset, expected) {
  const fd = new FormData();
  fd.append("offset", String(offset));
  fd.append("expected", String(expected));
  fd.append("file", blob, "chunk.bin");

  const r = await fetch("/upload_probe_form", {
    method: "POST",
    body: fd
  });
  return await r.json();
}

document.getElementById("probeBtn").onclick = async () => {
  const out = document.getElementById("out");
  out.textContent = "";

  const f = document.getElementById("fileInput").files[0];
  if (!f) {
    log("No file selected");
    return;
  }

  const indices = [badIndex - 1, badIndex, badIndex + 1];

  log(`File: ${f.name}`);
  log(`File size: ${f.size}`);
  log(`Chunk size: ${chunkSize}`);
  log("");

  for (const idx of indices) {
    const offset = idx * chunkSize;
    const end = Math.min(offset + chunkSize, f.size);
    const expected = end - offset;
    const blob = f.slice(offset, end);

    log(`Index ${idx} offset ${offset}`);
    log(`slice.size ${blob.size} expected ${expected}`);

    const rawRes = await sendRaw(blob, offset, expected);
    log(`RAW -> data_len ${rawRes.data_len} content_length ${rawRes.content_length}`);

    const formRes = await sendForm(blob, offset, expected);
    log(`FORM -> file_len ${formRes.file_len} content_length ${formRes.content_length}`);

    log("");
  }
};
</script>

Observed output with Werkzeug 3.1.4:

Index 24 offset 251658240
slice.size 10485760 expected 10485760
RAW -> data_len 10485760 content_length 10485760
FORM -> file_len 10485760 content_length 10486162

Index 25 offset 262144000
slice.size 10485760 expected 10485760
RAW -> data_len 10485760 content_length 10485760
FORM -> file_len 10485759 content_length 10486162

Index 26 offset 272629760
slice.size 10485760 expected 10485760
RAW -> data_len 10485760 content_length 10485760
FORM -> file_len 10485760 content_length 10486162

With Werkzeug 3.1.3, FORM -> file_len matches the expected size for all tested chunks, including index 25.

There is no exception/traceback; this is a silent data truncation of 1 byte.
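
Because nothing is raised, the only way to notice the problem is to compare the bytes themselves. The sketch below is one way to do that once both the raw body and the parsed file part for the same chunk are available as `bytes`. The function names (`first_difference`, `describe_mismatch`) are hypothetical helpers, not part of Werkzeug or Flask.

```python
import hashlib


def first_difference(expected, actual, block=65536):
    """Return the index of the first differing byte, or None if identical.

    If one input is a strict prefix of the other, the index returned is
    the length of the shorter one.
    """
    limit = min(len(expected), len(actual))
    # Compare block by block first so large chunks stay fast.
    for start in range(0, limit, block):
        stop = min(start + block, limit)
        if expected[start:stop] != actual[start:stop]:
            for i in range(start, stop):
                if expected[i] != actual[i]:
                    return i
    return None if len(expected) == len(actual) else limit


def describe_mismatch(expected, actual):
    """Summarize how the parsed bytes differ from the bytes that were sent."""
    return {
        "expected_len": len(expected),
        "actual_len": len(actual),
        "missing_bytes": len(expected) - len(actual),
        "first_difference": first_difference(expected, actual),
        "expected_sha256": hashlib.sha256(expected).hexdigest(),
        "actual_sha256": hashlib.sha256(actual).hexdigest(),
    }
```

If `first_difference` equals `actual_len`, the parsed part is a clean prefix and the missing byte was dropped from the end. A smaller index means the byte went missing somewhere in the middle of the chunk.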

Expected behavior:

Multipart/form-data parsing should yield the exact byte sequence sent by the client. The uploaded file part length should match the expected chunk length. Specifically, for offset 262144000 with a 10 MiB chunk size, the parsed file length should be 10485760, not 10485759.
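
The numbers above can be checked with a few lines of Python that mirror the slicing loop in the test page. The `file_size` value is a made-up example (any file large enough to contain chunk 26 gives the same result), and `chunk_range` is a hypothetical helper name.

```python
CHUNK_SIZE = 10 * 1024 * 1024  # 10 MiB, same as chunkSize in the test page


def chunk_range(index, file_size, chunk_size=CHUNK_SIZE):
    """Return (offset, expected_length) for a chunk index."""
    offset = index * chunk_size
    end = min(offset + chunk_size, file_size)
    return offset, end - offset


file_size = 300 * 1024 * 1024  # hypothetical 300 MiB file

for index in (24, 25, 26):
    print(index, chunk_range(index, file_size))
# 24 (251658240, 10485760)
# 25 (262144000, 10485760)
# 26 (272629760, 10485760)

# Multipart overhead reported by the form endpoint:
# content_length minus the chunk length.
print(10486162 - 10485760)  # 402
```

The 402 extra bytes are the boundaries, part headers, and the two small form fields, which is why `content_length` is identical for all three form requests even though one `file_len` differs.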

Environment:

  • Python version: 3.13.10
  • Werkzeug version: 3.1.4 (regression), 3.1.3 (works)
