cloud-convert is a command-line tool for processing geospatial files. It
supports inspecting file metadata, converting raster files to Cloud-Optimized
GeoTIFFs (COGs), and converting vector files to GeoParquet. Batch processing is
automatically handled when directories are provided as input.
Important
This tool should be considered beta. So far, it has only been tested on Linux. It has seen a lot of internal use and generally works well, but there may be edge cases that have not been tested. It is unlikely to cause data loss, but be careful when overwriting files during COG conversion. I do not claim to be a Rust developer; this started as a hobby/learning project before finding use on the Adaptation Atlas. Suggestions, code reviews, and bug reports are therefore very welcome. That aside, hopefully this tool will be useful to others in promoting cloud optimization and better data management practices!
In preliminary testing, converting a folder of 340 GeoJSON files ranging in size from under 1 MB to 600 MB took:
- R terra with multicore: 4 minutes 3 seconds
- cloud-convert: 1 minute 15 seconds
- Inspect metadata of raster and vector files
- Calculate quality and summary stats for raster files
- Convert raster files to Cloud-Optimized GeoTIFF (COG)
- Convert vector files to GeoParquet
- Automatically detects and processes directories in batch
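The batch behavior in the list above can be pictured with a small sketch. This is not cloud-convert's actual implementation; `collect_inputs` is a hypothetical helper, and the non-recursive, extension-based matching is an assumption made for illustration. It only shows the general idea of treating a single file and a directory of files through the same entry point:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of cloud-convert): expand an input path
/// into the list of files to process.
fn collect_inputs(input: &Path, extensions: &[&str]) -> io::Result<Vec<PathBuf>> {
    // A single file is passed through unchanged.
    if input.is_file() {
        return Ok(vec![input.to_path_buf()]);
    }

    // Otherwise treat the input as a directory and keep matching files.
    // Assumption: only the top level is scanned, with no recursion.
    let mut files = Vec::new();
    for entry in fs::read_dir(input)? {
        let path = entry?.path();
        let matches = path
            .extension()
            .and_then(|e| e.to_str())
            .map(|e| extensions.iter().any(|want| want.eq_ignore_ascii_case(e)))
            .unwrap_or(false);
        if path.is_file() && matches {
            files.push(path);
        }
    }

    // Sort so the processing order is deterministic.
    files.sort();
    Ok(files)
}

fn main() -> io::Result<()> {
    // Use the first CLI argument as the input path, defaulting to the current directory.
    let input = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    for file in collect_inputs(Path::new(&input), &["tif", "tiff"])? {
        println!("{}", file.display());
    }
    Ok(())
}
```

Matching extensions case-insensitively means files such as `scene.TIF` are picked up alongside `scene.tif`.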
Install locally using Cargo:
cargo install --path .
Or build manually:
cargo build --release
This will create the binary at ./target/release/cloud_convert.
This binary links against GDAL, so it will not work if GDAL is not installed, and it may have issues if it was built against a different version of GDAL than the one installed. In addition, some features of cloud-convert require newer versions of GDAL and require that GDAL is built with Arrow and Parquet support.
To avoid these issues, we have included a Dockerfile and a script that compile a fully standalone binary, which does not require GDAL to be installed on the user's device. This seems to work very well, but compiling the binary takes a significant amount of time.
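When using the non-standalone build, it can help to check which GDAL is installed before running conversions. The sketch below is not part of cloud-convert: `parse_gdal_version` is a hypothetical helper, and the check relies only on the standard GDAL utilities `gdalinfo --version` and `ogrinfo --formats`. The minimum GDAL version required is not stated here, so the sketch only reports what it finds:

```rust
use std::process::Command;

/// Hypothetical helper: parse a string such as
/// "GDAL 3.8.4, released 2024/02/08" into (3, 8, 4).
fn parse_gdal_version(text: &str) -> Option<(u32, u32, u32)> {
    // The version is the second whitespace-separated token, e.g. "3.8.4,".
    let token = text.split_whitespace().nth(1)?;
    let token = token.trim_end_matches(',');
    // Keep only the leading digits of each part so "0dev-abc" parses as 0.
    let mut parts = token.split('.').map(|p| {
        p.chars()
            .take_while(|c| c.is_ascii_digit())
            .collect::<String>()
            .parse::<u32>()
            .ok()
    });
    Some((parts.next()??, parts.next()??, parts.next()??))
}

fn main() {
    // Report the installed GDAL version, if the utility is on PATH.
    match Command::new("gdalinfo").arg("--version").output() {
        Ok(out) => {
            let text = String::from_utf8_lossy(&out.stdout);
            match parse_gdal_version(&text) {
                Some((major, minor, patch)) => println!("Found GDAL {major}.{minor}.{patch}"),
                None => println!("Could not parse GDAL version from: {}", text.trim()),
            }
        }
        Err(err) => println!("gdalinfo not found or not runnable: {err}"),
    }

    // Look for a Parquet driver in the list of vector formats.
    if let Ok(out) = Command::new("ogrinfo").arg("--formats").output() {
        let formats = String::from_utf8_lossy(&out.stdout).to_lowercase();
        let has_parquet = formats.lines().any(|line| line.contains("parquet"));
        println!("Parquet driver listed: {has_parquet}");
    }
}
```

If the Parquet driver is not listed, converting to GeoParquet with a GDAL-linked build is likely to fail, and the standalone binary is the safer option.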
cloud-convert info path/to/file.tif
cloud-convert info path/to/file.gpkg
Convert a single raster file:
cloud-convert to-cog path/to/file.tif --overwrite
Convert all .tif files in a directory:
cloud-convert to-cog path/to/folder --out path/to/output_dir --overwrite
Convert a single vector file:
cloud-convert to-gpq path/to/file.gpkg --out output.parquet
Convert all vector files in a directory:
cloud-convert to-gpq path/to/folder --out path/to/output_dir
NOTE: This does not yet work for NetCDF files.
Run QAQC on a single raster file and print output to stdout:
cloud-convert run-qaqc path/to/file.tif --quantiles
Run QAQC across all raster files in a directory:
cloud-convert run-qaqc path/to/folder --quantiles --output-format parquet
Run all unit tests:
cargo test