[Platform] OpenAI file gives an error #961

@TNAJanssen

Describe Your Problem 🎯

What happened

The OpenAI document normalizer is sending files as base64-encoded file_data in message content, but the OpenAI API expects a file_id reference instead.

Expected behavior

Files should be uploaded to OpenAI's /v1/files endpoint first to obtain a file_id, then that file_id should be used in the message content format:

[
    'type' => 'file',
    'file' => [
        'file_id' => 'file-6F2ksmvXxt4VdoqmHRw6kL'
    ]
]

Actual behavior

The normalizer currently returns:

[
    'type' => 'file',
    'file' => [
        'filename' => 'document.pdf',
        'file_data' => 'base64encodedstring...'
    ]
]

Steps to reproduce

  1. Create a Document or File object with file content
  2. Use it in a message sent to OpenAI via the Symfony AI Platform
  3. Observe that the request contains base64 file_data instead of a file_id
  4. The API rejects the request with "Missing required parameter: 'messages[1].content[1].file.file_id'." (surfaced as a BadRequestException)

Provide Detailed Information 📋

  #file: "./src/InvoiceProcessing/Infrastructure/Agent/AbstractSymfonyAiAgent.php"
  #line: 77
  -previous: Symfony\AI\Platform\Exception\BadRequestException^ {#17750
    #message: "Missing required parameter: 'messages[1].content[1].file.file_id'."
    #code: 0
    #file: "./vendor/symfony/ai-platform/src/Bridge/OpenAi/Gpt/ResultConverter.php"

Component version

  • Symfony AI Platform: dev-main

Additional context

According to OpenAI's API documentation, files should be:

  1. Uploaded to POST https://api.openai.com/v1/files with:
    • purpose: "user_data" (or appropriate purpose)
    • file: multipart file upload
    • expires_after: optional expiration setting
  2. The returned file_id should then be used in message content
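
The two steps above could be sketched roughly as follows, using PHP's cURL extension. This is a minimal illustration, not Symfony AI's implementation: the helper names uploadFileToOpenAi() and fileContentPart() are hypothetical, the upload is assumed to return a JSON object whose id field holds the file_id, and the optional expires_after setting is left out.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: uploads a local file to POST /v1/files and
 * returns the file_id from the response. Not part of Symfony AI.
 */
function uploadFileToOpenAi(string $apiKey, string $path, string $purpose = 'user_data'): string
{
    $ch = curl_init('https://api.openai.com/v1/files');
    curl_setopt_array($ch, [
        CURLOPT_POST => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER => ['Authorization: Bearer '.$apiKey],
        // Passing an array makes cURL send multipart/form-data.
        CURLOPT_POSTFIELDS => [
            'purpose' => $purpose,
            'file' => new CURLFile($path),
        ],
    ]);

    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);

    if (false === $body) {
        throw new RuntimeException('Upload failed: '.curl_error($ch));
    }

    $data = json_decode($body, true);
    if (200 !== $status || !is_array($data) || !isset($data['id'])) {
        throw new RuntimeException(sprintf('Unexpected response (HTTP %d): %s', $status, $body));
    }

    return $data['id'];
}

/**
 * Hypothetical helper: builds the message content part that
 * references an uploaded file by its file_id.
 */
function fileContentPart(string $fileId): array
{
    return [
        'type' => 'file',
        'file' => ['file_id' => $fileId],
    ];
}
```

A normalizer could call uploadFileToOpenAi() once per document and pass the result to fileContentPart(), so the message content carries only the reference. Where the upload should live in the Platform component (normalizer, a dedicated client, or a cache keyed by file hash) is a design question this sketch does not answer.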

Impact

  • Large files cause oversized request payloads
  • May hit API size limits
  • Doesn't follow OpenAI's recommended file handling approach
  • Inefficient for repeated file usage (file must be re-embedded each time)
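
To put a rough number on the payload concern: base64 encodes every 3 raw bytes as 4 characters, so an embedded file adds about a third to its own size before any JSON overhead. The helper name base64EncodedSize() below is hypothetical and only illustrates the arithmetic.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: length in bytes of the padded base64 encoding
 * of $rawBytes bytes of input (4 output characters per 3 input bytes,
 * rounded up to a full group).
 */
function base64EncodedSize(int $rawBytes): int
{
    return 4 * intdiv($rawBytes + 2, 3);
}

// Cross-check the formula against PHP's own encoder for a small input.
$sample = str_repeat('a', 3000);
printf(
    "raw: %d bytes, formula: %d, base64_encode: %d\n",
    strlen($sample),
    base64EncodedSize(strlen($sample)),
    strlen(base64_encode($sample))
);
```

The same overhead is paid on every request that re-embeds the file, whereas a file_id reference stays a few dozen bytes regardless of file size.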
