| 1 | +# Pipeline Generator |
| 2 | + |
| 3 | +Python replacement for all three Jinja templates that generate Buildkite CI pipelines for vLLM: |
| 4 | +- `test-template-ci.j2` (full CI pipeline) |
| 5 | +- `test-template-fastcheck.j2` (fast pre-merge checks) |
| 6 | +- `test-template-amd.j2` (AMD-only pipeline) |
| 7 | + |
| 8 | +## Quick Start |
| 9 | + |
| 10 | +```bash |
| 11 | +# CI mode (default) |
| 12 | +python pipeline_generator.py --pipeline_mode ci |
| 13 | + |
| 14 | +# Fastcheck mode |
| 15 | +python pipeline_generator.py --pipeline_mode fastcheck |
| 16 | + |
| 17 | +# AMD mode |
| 18 | +python pipeline_generator.py --pipeline_mode amd |
| 19 | +``` |
| 20 | + |
| 21 | + |
| 22 | +## Directory Structure |
| 23 | + |
| 24 | +``` |
| 25 | +pipeline_generator/ |
| 26 | +├── pipeline_generator.py # Main entry point |
| 27 | +├── config.py # Configuration |
| 28 | +├── build_config.py # Build step configs |
| 29 | +├── hardware_config.py # Hardware test configs |
| 30 | +│ |
| 31 | +├── models/ # Data models |
| 32 | +│ ├── test_step.py # Input from test-pipeline.yaml |
| 33 | +│ ├── buildkite_step.py # Output for Buildkite |
| 34 | +│ └── docker_config.py # Docker/K8s configs |
| 35 | +│ |
| 36 | +├── steps/ # Step generators (organized by type) |
| 37 | +│ ├── build_steps.py # Docker image builds |
| 38 | +│ ├── test_steps.py # Regular test steps |
| 39 | +│ ├── hardware_steps.py # External hardware (Neuron, TPU, Intel, etc.) |
| 40 | +│ └── group_steps.py # Special groups (AMD, Torch Nightly) |
| 41 | +│ |
| 42 | +├── transformers/ # Command transformation pipeline |
| 43 | +│ ├── base.py # Base transformer interface |
| 44 | +│ ├── normalizer.py # Flatten & normalize commands |
| 45 | +│ ├── test_targeting.py # Intelligent test targeting |
| 46 | +│ └── coverage.py # Coverage injection |
| 47 | +│ |
| 48 | +├── selection/ # Test selection logic |
| 49 | +│ ├── filtering.py # Should run/skip decisions |
| 50 | +│ └── blocking.py # Block step (manual trigger) logic |
| 51 | +│ |
| 52 | +├── docker/ # Docker plugin builders |
| 53 | +│ └── plugin_builder.py # Builds Docker/K8s plugins |
| 54 | +│ |
| 55 | +├── utils/ # Utilities |
| 56 | +│ ├── constants.py # Enums, constants |
| 57 | +│ ├── agents.py # Agent queue selection |
| 58 | +│ └── commands.py # Command helpers |
| 59 | +│ |
| 60 | +└── tests/ # Test suite |
| 61 | + ├── test_*.py # 125 unit tests |
| 62 | + └── test_integration_comprehensive.py # 56 integration scenarios |
| 63 | +``` |
| 64 | + |
| 65 | +## Main Flow |
| 66 | + |
| 67 | +`pipeline_generator.py` orchestrates the whole flow: |
| 68 | + |
| 69 | +```python |
| 70 | +def generate(self, test_steps): |
| 71 | +    steps = [] |
| 72 | + |
| 73 | +    # Build Docker images |
| 74 | +    steps.append(generate_main_build_step(self.config)) |
| 75 | +    steps.extend(generate_cu118_build_steps(self.config)) |
| 76 | +    steps.append(generate_cpu_build_step(self.config)) |
| 77 | + |
| 78 | +    # Generate test steps |
| 79 | +    steps.extend(self.generate_test_steps(test_steps)) |
| 80 | + |
| 81 | +    # Add special groups |
| 82 | +    steps.append(generate_torch_nightly_group(test_steps, self.config)) |
| 83 | +    steps.append(generate_amd_group(test_steps, self.config)) |
| 84 | + |
| 85 | +    # Add external hardware tests |
| 86 | +    steps.extend(generate_all_hardware_tests(self.config.branch, self.config.nightly)) |
| 87 | + |
| 88 | +    return steps |
| 89 | +``` |
| 90 | + |
| 91 | +This structure keeps the high-level flow readable while organizing details into focused modules. |
| 92 | + |
| 93 | +## Command Transformation Pipeline |
| 94 | + |
| 95 | +One key improvement over Jinja is making command transformations explicit. When converting a test step to a Buildkite step, commands go through: |
| 96 | + |
| 97 | +1. **Flatten** - Multi-node commands (a list of lists) become a single list |
| 98 | +2. **Normalize** - Remove backslashes left over from YAML line continuations |
| 99 | +3. **Test Targeting** - If only test files changed, run just those tests |
| 100 | +4. **Coverage** - Inject coverage collection if enabled |
| 101 | +5. **Join** - Combine into a single command string |
| 102 | + |
| 103 | +This happens in `docker/plugin_builder.py::build_docker_command()`. Adding a new transformation is straightforward - just create a new transformer in `transformers/`. |
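
As a rough illustration of steps 1, 2 and 5 above, here is a minimal sketch. The helper names (`flatten`, `normalize`, `join`, `build_command`) are hypothetical and are not taken from `docker/plugin_builder.py`; joining with `" && "` and stripping whitespace are assumptions. Test targeting and coverage injection are left out because they depend on project configuration.

```python
from typing import List, Union

# A command list is either flat, or a list of per-node lists (multi-node tests).
Commands = Union[List[str], List[List[str]]]


def flatten(commands: Commands) -> List[str]:
    """Collapse multi-node commands (a list of lists) into one flat list."""
    flat: List[str] = []
    for item in commands:
        if isinstance(item, list):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


def normalize(commands: List[str]) -> List[str]:
    """Drop backslashes left over from YAML line continuations."""
    return [cmd.replace("\\", "").strip() for cmd in commands]


def join(commands: List[str]) -> str:
    """Combine the commands into one shell string (separator is an assumption)."""
    return " && ".join(commands)


def build_command(commands: Commands) -> str:
    """Hypothetical end-to-end helper: flatten, normalize, then join."""
    return join(normalize(flatten(commands)))
```

For example, `build_command([["pytest a \\", "pytest b"], ["pytest c"]])` yields one string with the three commands chained by `&&`.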
| 104 | + |
| 105 | +## Where to Find Things |
| 106 | + |
| 107 | +Coming from the Jinja template? Here's where logic moved: |
| 108 | + |
| 109 | +**Build steps** (lines 14-179 in Jinja) |
| 110 | +Now in: `steps/build_steps.py` |
| 111 | + |
| 112 | +**Test step conversion** (lines 180-550 in Jinja) |
| 113 | +Now in: `steps/test_steps.py` and `docker/plugin_builder.py` |
| 114 | + |
| 115 | +**Test selection/blocking** (lines 508-530, 600-621 in Jinja) |
| 116 | +Now in: `selection/blocking.py` and `selection/filtering.py` |
| 117 | + |
| 118 | +**Coverage injection** (lines 33-158 in Jinja) |
| 119 | +Now in: `transformers/coverage.py` |
| 120 | + |
| 121 | +**Intelligent test targeting** (lines 20-158 in Jinja) |
| 122 | +Now in: `transformers/test_targeting.py` and `selection/filtering.py` |
| 123 | + |
| 124 | +**AMD tests** (lines 662-727 in Jinja) |
| 125 | +Now in: `steps/group_steps.py::generate_amd_group()` |
| 126 | + |
| 127 | +**Torch Nightly tests** (lines 579-658 in Jinja) |
| 128 | +Now in: `steps/group_steps.py::generate_torch_nightly_group()` |
| 129 | + |
| 130 | +**Hardware tests** (lines 729-863 in Jinja) |
| 131 | +Now in: `steps/hardware_steps.py` |
| 132 | + |
| 133 | +## Common Tasks |
| 134 | + |
| 135 | +### Adding a new build variant |
| 136 | +Edit `steps/build_steps.py`. Follow the pattern of existing build steps. |
| 137 | + |
| 138 | +### Adding command transformation logic |
| 139 | +Create a new transformer in `transformers/`: |
| 140 | + |
| 141 | +```python |
| 142 | +from .base import CommandTransformer |
| 143 | + |
| 144 | +class MyTransformer(CommandTransformer): |
| 145 | +    def transform(self, commands, test_step, config): |
| 146 | +        if should_apply():  # placeholder for your own condition |
| 147 | +            return modified_commands  # placeholder for the rewritten command list |
| 148 | +        return None  # Falls through to next transformer |
| 149 | +``` |
| 150 | + |
| 151 | +Then use it in `docker/plugin_builder.py::build_docker_command()`. |
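
One way to read the "falls through" convention is that the first transformer returning something other than `None` wins. The sketch below illustrates that reading only. The `CommandTransformer` class here is a stand-in for the interface in `transformers/base.py`, and `VerbosePytest`, `apply_transformers`, and the `verbose` config key are hypothetical names invented for the example.

```python
from typing import List, Optional


class CommandTransformer:
    """Stand-in for the base interface (assumed shape, not the real class)."""

    def transform(self, commands, test_step, config) -> Optional[List[str]]:
        raise NotImplementedError


class VerbosePytest(CommandTransformer):
    """Hypothetical transformer: appends -v to pytest commands when asked to."""

    def transform(self, commands, test_step, config):
        if not config.get("verbose"):
            return None  # does not apply, let the next transformer try
        return [c + " -v" if c.startswith("pytest") else c for c in commands]


def apply_transformers(commands, test_step, config, transformers):
    """Return the first non-None result, or the original commands if none apply."""
    for transformer in transformers:
        result = transformer.transform(commands, test_step, config)
        if result is not None:
            return result
    return commands
```

If the real pipeline instead applies every transformer in sequence, the loop would feed each result into the next call rather than returning early.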
| 152 | + |
| 153 | +### Adding a new hardware platform |
| 154 | +Add configuration to `hardware_config.py` and generation logic to `steps/hardware_steps.py`. |
| 155 | + |
| 156 | +### Adjusting Docker plugin configuration |
| 157 | +Look in `docker/plugin_builder.py` for the plugin builder logic, or `models/docker_config.py` for the config dataclasses. |
| 158 | + |
| 159 | +### Changing test selection rules |
| 160 | +- Run/skip decisions: `selection/filtering.py` |
| 161 | +- Block (manual trigger) decisions: `selection/blocking.py` |
| 162 | + |
| 163 | +## Testing |
| 164 | + |
| 165 | +Run unit tests: |
| 166 | +```bash |
| 167 | +python -m pytest tests/ -k "not integration" -v |
| 168 | +``` |
| 169 | + |
| 170 | +Run integration tests (verifies 100% compatibility with Jinja): |
| 171 | +```bash |
| 172 | +# CI mode (56 scenarios) |
| 173 | +python tests/test_integration_comprehensive.py |
| 174 | + |
| 175 | +# Fastcheck mode (8 scenarios) |
| 176 | +python tests/test_integration_fastcheck.py |
| 177 | + |
| 178 | +# AMD mode (not yet implemented) |
| 179 | +python tests/test_integration_amd.py |
| 180 | +``` |
| 181 | + |
| 182 | +**Status**: CI and Fastcheck modes achieve 100% YAML compatibility with their respective Jinja templates. |
| 183 | + |
| 184 | +## How It Works |
| 185 | + |
| 186 | +### Input: test-pipeline.yaml |
| 187 | + |
| 188 | +```yaml |
| 189 | +steps: |
| 190 | +  - label: "Basic Tests" |
| 191 | +    commands: |
| 192 | +      - pytest tests/basic/ |
| 193 | +    source_file_dependencies: |
| 194 | +      - vllm/engine/ |
| 195 | +``` |
| 196 | + |
| 197 | +### Processing |
| 198 | + |
| 199 | +1. Parse YAML into `TestStep` objects (Pydantic models) |
| 200 | +2. For each test, decide if it should run or be blocked |
| 201 | +3. Convert to `BuildkiteStep` with appropriate Docker plugin |
| 202 | +4. Apply command transformations |
| 203 | +5. Add build steps, special groups, hardware tests |
| 204 | +6. Write final pipeline YAML |
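
Steps 1 and 2 above can be sketched roughly as follows. This is a simplified illustration, not the project's code: it uses a standard-library dataclass instead of the Pydantic model, takes an already-parsed dict instead of reading YAML, and assumes that a step runs when any changed file falls under one of its `source_file_dependencies`. The `should_run` helper and its `run_all` flag are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TestStep:
    """Simplified stand-in for the model in models/test_step.py."""
    label: str
    commands: List[str]
    source_file_dependencies: List[str] = field(default_factory=list)


def parse_steps(raw: dict) -> List[TestStep]:
    """Build TestStep objects from the parsed contents of test-pipeline.yaml."""
    return [TestStep(**step) for step in raw["steps"]]


def should_run(step: TestStep, changed_files: List[str], run_all: bool = False) -> bool:
    """Assumed rule: run if forced, if no dependencies are declared,
    or if a changed file lives under a declared dependency path."""
    if run_all or not step.source_file_dependencies:
        return True
    return any(
        path.startswith(dep)
        for path in changed_files
        for dep in step.source_file_dependencies
    )
```

With the input shown earlier, a change to a file under `vllm/engine/` would select "Basic Tests", while a docs-only change would not.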
| 205 | + |
| 206 | +### Output: pipeline.yaml |
| 207 | + |
| 208 | +```yaml |
| 209 | +steps: |
| 210 | +  - label: "Build vLLM Image" |
| 211 | +    key: "image-build" |
| 212 | +    # ... build configuration |
| 213 | + |
| 214 | +  - label: "Basic Tests" |
| 215 | +    depends_on: "image-build" |
| 216 | +    agents: {queue: "gpu_1_queue"} |
| 217 | +    plugins: |
| 218 | +      - docker#v5.2.0: |
| 219 | +          image: "public.ecr.aws/..." |
| 220 | +          command: ["bash", "-xc", "cd /vllm-workspace/tests && pytest tests/basic/"] |
| 221 | +``` |
| 222 | + |
| 223 | +## Backward Compatibility |
| 224 | + |
| 225 | +The Python generator produces YAML identical to the Jinja template's output. This is verified by `test_integration_comprehensive.py`, which runs 56 test scenarios covering: |
| 226 | + |
| 227 | +- Different branches (main vs PR) |
| 228 | +- Run all vs selective testing |
| 229 | +- Nightly mode |
| 230 | +- File change detection |
| 231 | +- Coverage injection |
| 232 | +- Intelligent test filtering |
| 233 | +- Multi-node/GPU configurations |
| 234 | +- Optional tests |
| 235 | +- And more... |
| 236 | + |
| 237 | +All 56 scenarios must produce byte-for-byte identical YAML. |
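
A byte-for-byte check of this kind could look like the sketch below. It is not the code in `test_integration_comprehensive.py`; the `compare_yaml` helper and the file labels are hypothetical, and it only shows how a strict comparison can still report a readable diff on failure.

```python
import difflib
from typing import List


def compare_yaml(expected: str, actual: str) -> List[str]:
    """Return an empty list if the two documents match byte for byte,
    otherwise a unified diff describing where they differ."""
    if expected.encode("utf-8") == actual.encode("utf-8"):
        return []
    return list(
        difflib.unified_diff(
            expected.splitlines(keepends=True),
            actual.splitlines(keepends=True),
            fromfile="jinja.yaml",   # hypothetical label for the Jinja output
            tofile="python.yaml",    # hypothetical label for the Python output
        )
    )
```

Comparing the raw text rather than the parsed YAML means that differences in key order, quoting, or whitespace also count as failures.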
| 238 | + |
| 239 | +## Contributing |
| 240 | + |
| 241 | +When making changes: |
| 242 | + |
| 243 | +1. Write tests first (in `tests/`) |
| 244 | +2. Make your changes |
| 245 | +3. Run unit tests: `python -m pytest tests/ -k "not integration"` |
| 246 | +4. Run integration tests: `python tests/test_integration_comprehensive.py` |
| 247 | +5. Both must pass before merging |
| 248 | + |
| 249 | +The integration test is non-negotiable - it ensures we don't break existing pipelines. |