Bug: compact ooo head: chunk iter: cannot populate chunk X from block XX: invalid head chunk: not found #13683

### What is the bug?

We are running Mimir 2.17 and recently noticed compaction failures on the Mimir ingesters caused by missing head chunks.

```
compact ooo head: chunk iter: cannot populate chunk 1744774978 from block 0000000000XX000FZMPACTHEAD: invalid head chunk: not found
```

The error appears to originate from [`ooo_head_read.go#L276-L280`](https://github.com/grafana/mimir/blob/mimir-2.17.1/vendor/github.com/prometheus/prometheus/tsdb/ooo_head_read.go#L276-L280).

While looking into the issue, it seems possible that a race condition is involved, as the head block is being GC'd before it's compacted. I'm wondering whether it would be reasonable to skip over blocks that no longer exist.

```
diff --git i/vendor/github.com/prometheus/prometheus/tsdb/ooo_head_read.go w/vendor/github.com/prometheus/prometheus/tsdb/ooo_head_read.go
index 21fedc8cf6..d47bc3980b 100644
--- i/vendor/github.com/prometheus/prometheus/tsdb/ooo_head_read.go
+++ w/vendor/github.com/prometheus/prometheus/tsdb/ooo_head_read.go
@@ -279,7 +279,11 @@ func (cr *HeadAndOOOChunkReader) chunkOrIterable(meta chunks.Meta, copyLastChunk
                default:
                        _, cid, isOOO := unpackHeadChunkRef(m.Ref)
                        iterable, _, err := cr.head.chunkFromSeries(s, cid, isOOO, m.MinTime, m.MaxTime, isoState, copyLastChunk)
-                       if err != nil {
+                       // During compaction, the series can be garbage collected.
+                       // In that case, we should not error, but just ignore the chunk.
+                       if errors.Is(err, storage.ErrNotFound) {
+                               continue
+                       } else if err != nil {
                                return nil, nil, 0, fmt.Errorf("invalid head chunk: %w", err)
                        }
```


### How to reproduce it?

I haven't been able to reproduce this in isolation. However, it seems to trigger constantly in our Kubernetes environment.

### What did you think would happen?

The compaction process should continue compacting other blocks that are present.

### What was your environment?

Kubernetes

### Any additional context to share?

_No response_
