@hardyliao85 hardyliao85 commented Sep 22, 2025

Description

Enhance Vision Service for label generation with configurable languages (en and zh_TW).

Changes:

  1. Added --timeout 120 in the Dockerfile to prevent timeouts while models load.
  2. Updated requirements.txt to pin torch==2.8.0+cu126 (the CUDA 12.6 build) for GPU inference support. Requires GPU driver: Linux >=525.60.13, Windows >=527.41.
  3. Added 5 Hugging Face image classification models so that one can be selected based on the available hardware.
  4. Made LABELS_LOCALE configurable via Docker Compose.
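
For reviewers, here is a minimal sketch of how a service could read LABELS_LOCALE from the environment. It is only an illustration and not necessarily the code in this PR: the function name, the constant names, and the fallback to English are assumptions. Only the variable name and the two locales (en, zh_TW) come from the description above.

```python
import os

# The two label languages named in this PR description.
SUPPORTED_LOCALES = ("en", "zh_TW")
# Assumption: fall back to English when the variable is unset or unsupported.
DEFAULT_LOCALE = "en"


def resolve_labels_locale(env=None):
    """Return the locale from LABELS_LOCALE, or the default if unset or unsupported.

    `env` is any mapping; it defaults to os.environ (hypothetical helper).
    """
    env = os.environ if env is None else env
    value = env.get("LABELS_LOCALE", "").strip()
    return value if value in SUPPORTED_LOCALES else DEFAULT_LOCALE


if __name__ == "__main__":
    print(resolve_labels_locale())
```

Passing the mapping explicitly keeps the helper easy to test without touching the real environment.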

Below is my test configuration.

compose:

# Only the key parts are shown
services:
  photoprism-vision:
    image: photoprism/vision:develop
    ports:
      - "5000:5000"
    environment:
      NVIDIA_VISIBLE_DEVICES: "all"
      NVIDIA_DRIVER_CAPABILITIES: "compute,utility"
      LABELS_LOCALE: "zh_TW"
    volumes:
      - ./photoprism_vision_models:/app/models
      - ./venv:/app/venv
    deploy:
      resources:
        reservations:
          devices:
            - driver: "nvidia"
              capabilities: [ gpu ]
              count: "all"
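
To confirm that the GPU reservation is actually visible from inside the container, a quick check along these lines may help. This is a sketch, assuming PyTorch is installed in the container; the function name pick_device is hypothetical and not part of this PR.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    # Prints the device that inference would likely use in this environment.
    print(pick_device())
```

If this prints cpu on a GPU host, the driver version or the device reservation in the compose file is the first thing to check.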

vision.yml:

- Type: labels
  Name: convnextv2_huge.fcmae_ft_in22k_in1k_384
  Resolution: 384
  Service:
    Uri: http://IP:5000/api/v1/vision/labels
    FileScheme: data
    RequestFormat: vision
    ResponseFormat: vision

Related Issues

Partially addresses #19 (main functionality implemented; language support pending)

Acceptance Criteria

  • New features or enhancements are fully implemented and do not break existing functionality, so that they can be released at any time without requiring additional work
  • Automated unit and/or acceptance tests are included to ensure that changes work as expected and to reduce repetitive manual work
  • Documentation has been / will be updated, especially as it relates to new configuration options or potentially disruptive changes

Note: Documentation has been updated only in the README; I have no plans to update other official documents.
Note: I don't have much experience with PRs. I apologize if anything is unclear or not done properly. Please feel free to give me feedback, and I am happy to make any necessary corrections. Thank you very much.
