fix: add code ops and extract to python api #462
Merged
+349
−4
PR Description
Adds the code ops and `ExtractOp` per Python API expectations: `code_map`, `code_reduce`, `code_filter`, and `extract` now work with `docetl.api.Pipeline` and typed schemas.
Summary of changes
- Add `CodeMapOp`, `CodeReduceOp`, `CodeFilterOp`, `ExtractOp` to `docetl.schemas.OpType`.
- Export the new ops in `docetl.api.__all__`.
- Update `Pipeline._update_from_dict` to recognize `code_map`, `code_reduce`, `code_filter`, `extract`.
- `docetl.operations.code_operations.*.schema.code` now accepts a callable; it is converted to source with `inspect.getsource` and bound as `transform`.
- Add tests in `tests/test_api.py` for:
  - `CodeMapOp` with callable transform
  - `CodeFilterOp` with callable predicate
  - `CodeReduceOp` with callable group reducer
  - `ExtractOp`
Usage example
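A minimal, stdlib-only sketch of the callable-to-source conversion described in the summary. It does not import docetl and is not the PR's implementation: `callable_to_transform_source` and `add_length` are hypothetical names, and binding by a trailing `transform = <name>` alias is an assumption about what "bound as `transform`" means. The function is loaded from a temporary file because `inspect.getsource` needs the source text to be retrievable, typically from a file on disk.

```python
import importlib.util
import inspect
import tempfile
import textwrap
from pathlib import Path


def callable_to_transform_source(fn) -> str:
    """Hypothetical helper: return source code that defines `transform`."""
    # getsource returns the text of the `def` block; dedent handles nested defs.
    source = textwrap.dedent(inspect.getsource(fn))
    if not source.startswith("def "):
        # Lambdas and decorated functions do not yield a clean `def` block.
        raise ValueError("expected a function defined with a plain `def`")
    # Assumption: the operation looks up a name called `transform`,
    # so alias the user's function to that name.
    return f"{source}\ntransform = {fn.__name__}\n"


# A user-defined function living in a real file (hypothetical example).
MODULE_SOURCE = '''
def add_length(doc: dict) -> dict:
    return {"length": len(doc["text"])}
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "user_ops.py"
    path.write_text(MODULE_SOURCE)
    spec = importlib.util.spec_from_file_location("user_ops", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # Must run while the file still exists, since the source is read from it.
    code = callable_to_transform_source(module.add_length)

# Execute the captured source the way a string-based `code` field could be run.
namespace = {}
exec(code, namespace)
result = namespace["transform"]({"text": "hello"})
print(result)
```

Because only the function's own text is captured, anything it references from an enclosing scope would be missing when the source is executed later, which is consistent with the restriction to plain `def` functions.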
Files touched
- `docetl/schemas.py`: add and export new op schemas; extend `OpType`
- `docetl/api.py`: export new ops; handle new op types in `Pipeline._update_from_dict`
- `docetl/operations/code_operations.py`: accept callables for `code` via schema validators
- `tests/test_api.py`: add tests for code ops and `ExtractOp`
Backward compatibility
Docs
Notes
- Callables must be defined with `def` (no lambda/closures); source is captured via `inspect.getsource`.
- `code_map`: `transform(doc: dict) -> dict`
- `code_filter`: `transform(doc: dict) -> bool`
- `code_reduce`: `transform(group: list[dict]) -> dict`
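The three signatures above can be illustrated with plain functions applied by hand. This is a sketch only: the function names and sample data are hypothetical, docetl is not used, and merging the map output into the original document is an assumption made for the demonstration rather than a statement about how the pipeline behaves.

```python
# Map-style callable: takes one document, returns a dict of new fields.
def add_word_count(doc: dict) -> dict:
    return {"word_count": len(doc["text"].split())}


# Filter-style callable: takes one document, returns True to keep it.
def is_long(doc: dict) -> bool:
    return doc["word_count"] >= 3


# Reduce-style callable: takes a group of documents, returns one dict.
def summarize_group(group: list[dict]) -> dict:
    return {
        "count": len(group),
        "total_words": sum(doc["word_count"] for doc in group),
    }


docs = [
    {"text": "a b c"},
    {"text": "short"},
    {"text": "one two three four"},
]

# Assumption for this sketch: map output is merged into each document.
mapped = [{**doc, **add_word_count(doc)} for doc in docs]
kept = [doc for doc in mapped if is_long(doc)]
summary = summarize_group(kept)
print(summary)
```

All three are module-level `def` functions with no captured variables, which is the shape the note about `inspect.getsource` calls for.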