Support for distributed feature extraction / training #15

@pierotofy

  • Modify pctrain by adding a --extract-features <path>.opcfeat.bin parameter. When it is set, execution should stop after the features are created, at https://github.com/uav4geo/OpenPointClass/blob/main/randomforest.cpp#L30 (RF) and https://github.com/uav4geo/OpenPointClass/blob/main/gbm.cpp#L45 (GBT).
  • Serialize the required vectors (for RF these are gt and ft; GBT populates its structures similarly, although not identically). It might also be possible to serialize to a single format regardless of RF or GBT by creating a new function that only does the serialization (like train, but stopping after the features are created). Consider encoding the scale, radius, treeDepth, etc. parameters into the serialized output, both to avoid passing them again and to validate them against other serialized outputs: the parameters in all serialized outputs from different processes need to match.
  • Modify pctrain to check for the .opcfeat.bin extension on input files; if all files passed as input are .opcfeat.bin, read the features directly instead of computing them, adapting the rf::train and gbt::train functions accordingly. If the scale, radius, etc. parameters have been serialized, they can be read from the serialized files instead of being passed manually.
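
To make the proposal more concrete, here is a rough sketch of what a single, classifier-independent feature file could look like. Everything in it is an assumption for illustration, not existing OpenPointClass code: the `TrainParams` and `FeatureSet` structs, the `OPCFEAT1` magic string, the field order, and the `saveFeatures`, `loadFeatures`, `isFeatureFile` and `mergeFeatures` helpers are all hypothetical names. It also assumes gt is a vector of integer labels and ft a flat, row-major vector of floats, and it writes values in native byte order, so files would only be portable between machines with the same endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical parameter block stored in every feature file.
struct TrainParams {
    int32_t scales = 0;
    double radius = 0.0;
    int32_t treeDepth = 0;
    bool operator==(const TrainParams &o) const {
        return scales == o.scales && radius == o.radius && treeDepth == o.treeDepth;
    }
};

// Hypothetical container: gt holds one label per sample,
// ft holds numFeatures floats per sample (row-major).
struct FeatureSet {
    TrainParams params;
    uint32_t numFeatures = 0;
    std::vector<int32_t> gt;
    std::vector<float> ft;
};

const char MAGIC[8] = {'O', 'P', 'C', 'F', 'E', 'A', 'T', '1'};

template <typename T>
void writePod(std::ostream &out, const T &v) {
    out.write(reinterpret_cast<const char *>(&v), sizeof(T));
}

template <typename T>
T readPod(std::istream &in) {
    T v{};
    in.read(reinterpret_cast<char *>(&v), sizeof(T));
    if (!in) throw std::runtime_error("Unexpected end of feature file");
    return v;
}

void saveFeatures(const std::string &path, const FeatureSet &fs) {
    assert(fs.ft.size() == fs.gt.size() * fs.numFeatures);
    std::ofstream out(path, std::ios::binary);
    if (!out) throw std::runtime_error("Cannot write " + path);
    out.write(MAGIC, sizeof(MAGIC));
    writePod(out, fs.params.scales);
    writePod(out, fs.params.radius);
    writePod(out, fs.params.treeDepth);
    writePod(out, fs.numFeatures);
    writePod(out, static_cast<uint64_t>(fs.gt.size()));
    out.write(reinterpret_cast<const char *>(fs.gt.data()), fs.gt.size() * sizeof(int32_t));
    out.write(reinterpret_cast<const char *>(fs.ft.data()), fs.ft.size() * sizeof(float));
}

FeatureSet loadFeatures(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("Cannot read " + path);
    char magic[8];
    in.read(magic, sizeof(magic));
    if (!in || std::memcmp(magic, MAGIC, sizeof(MAGIC)) != 0)
        throw std::runtime_error("Not a feature file: " + path);
    FeatureSet fs;
    fs.params.scales = readPod<int32_t>(in);
    fs.params.radius = readPod<double>(in);
    fs.params.treeDepth = readPod<int32_t>(in);
    fs.numFeatures = readPod<uint32_t>(in);
    const uint64_t n = readPod<uint64_t>(in);
    fs.gt.resize(n);
    fs.ft.resize(n * fs.numFeatures);
    in.read(reinterpret_cast<char *>(fs.gt.data()), fs.gt.size() * sizeof(int32_t));
    in.read(reinterpret_cast<char *>(fs.ft.data()), fs.ft.size() * sizeof(float));
    if (!in) throw std::runtime_error("Truncated feature file: " + path);
    return fs;
}

// True if the path ends with the .opcfeat.bin extension.
bool isFeatureFile(const std::string &path) {
    const std::string ext = ".opcfeat.bin";
    return path.size() >= ext.size() &&
           path.compare(path.size() - ext.size(), ext.size(), ext) == 0;
}

// Loads all files, rejects any whose parameters differ from the first,
// and concatenates the samples into one set ready for training.
FeatureSet mergeFeatures(const std::vector<std::string> &paths) {
    if (paths.empty()) throw std::invalid_argument("No feature files given");
    FeatureSet merged = loadFeatures(paths[0]);
    for (size_t i = 1; i < paths.size(); i++) {
        FeatureSet fs = loadFeatures(paths[i]);
        if (!(fs.params == merged.params) || fs.numFeatures != merged.numFeatures)
            throw std::runtime_error("Parameter mismatch in " + paths[i]);
        merged.gt.insert(merged.gt.end(), fs.gt.begin(), fs.gt.end());
        merged.ft.insert(merged.ft.end(), fs.ft.begin(), fs.ft.end());
    }
    return merged;
}
```

With something like this, pctrain could call `isFeatureFile` on every input, and if all of them match, pass the result of `mergeFeatures` to adapted versions of rf::train and gbt::train. Storing the parameters in each file is what lets the merge step reject outputs produced by processes that ran with different settings.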

Labels: enhancement (New feature or request)
