Conversation


@igoramf igoramf commented Oct 23, 2025

What is this contribution about?

This PR adds automated Pinecone documentation ingestion to the CI/CD pipeline. The
changes include:

  • New step in the deploy-docs workflow for ingesting Deco Chat documentation into Pinecone.
  • TypeScript script (scripts/ingest-docs-pinecone.ts) that handles the ingestion process into the Pinecone vector database.

This enables automatic synchronization of documentation into Pinecone whenever changes
are deployed.

Screenshots/Demonstration

N/A - Backend/CI changes only

Review Checklist

  • PR title is clear and descriptive
  • Changes are tested and working
  • Documentation is updated (if needed)
  • No breaking changes

Summary by CodeRabbit

  • Chores
    • Updated documentation deployment workflow to automate indexing/ingestion into the docs assistant and improved deployment authentication.
    • Added robust document ingestion with MDX conversion support, file validation, retry handling, optional replacement of existing indexed docs, and upload progress logging.


coderabbitai bot commented Oct 23, 2025

Walkthrough

Adds a new TypeScript ingestion script that uploads local docs into a Pinecone assistant with configurable behavior (file filtering, MDX conversion, retries, optional deletion), and updates the CI workflow to install dependencies and run that script during deployment using environment variables.

Changes

  • CI Workflow (.github/workflows/deploy-docs.yaml): Added a step to install the rag-docs script dependencies (@pinecone-database/pinecone, fast-glob, tsx) and a step to run the Pinecone ingestion script via tsx, passing environment variables (PINECONE_API_KEY, PINECONE_ASSISTANT_NAME, DOCS_DIR, DELETE_ALL, UPLOAD_DELAY_MS, CONVERT_MDX).
  • Pinecone Ingest Script (scripts/ingest-docs-pinecone.ts): New TypeScript script that scans DOCS_DIR for files (configurable extensions), enforces size/non-empty checks, optionally converts .mdx to .md, deletes matching remote records when DELETE_ALL is true, and uploads files to a Pinecone assistant with metadata. Implements retry/backoff (sketched below), per-file logging, temp-file cleanup, and exits non-zero on failure.
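
The retry/backoff mentioned above can be pictured with a minimal sketch; the helper name, delay schedule, and operation callback are illustrative assumptions, not the script's actual identifiers:

// Hedged sketch of an upload retry loop with exponential backoff.
// MAX_RETRIES and withRetry are invented names for illustration.
const MAX_RETRIES = 3;

async function withRetry<T>(
  operation: () => Promise<T>,
  label: string,
  retries = MAX_RETRIES,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= retries) throw error;
      const delayMs = 1000 * 2 ** (attempt - 1); // 1s, 2s, 4s, ...
      console.warn(`Attempt ${attempt}/${retries} failed for ${label}; retrying in ${delayMs}ms`);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}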

Sequence Diagram(s)

sequenceDiagram
    participant GHA as GitHub Actions
    participant Installer as Install deps (fast-glob, tsx, pinecone)
    participant Ingest as ingest-docs-pinecone.ts
    participant Pinecone as Pinecone Assistant API
    Note over GHA,Installer: CI run for docs deploy
    GHA->>Installer: install dependencies
    Installer-->>GHA: deps installed
    GHA->>Ingest: run TSX script with env vars
    Note over Ingest,Pinecone: Ingestion flow
    Ingest->>Ingest: scan DOCS_DIR (fast-glob)
    Ingest->>Ingest: filter/size-check files
    alt CONVERT_MDX = true
        Ingest->>Ingest: convert .mdx -> temp .md
    end
    alt DELETE_ALL = true
        Ingest->>Pinecone: list & delete remote records matching source_path
        Pinecone-->>Ingest: deletion ack
    end
    loop for each file
        Ingest->>Pinecone: upsert file (with metadata), retry on transient errors
        Pinecone-->>Ingest: upsert ack / error
    end
    Ingest-->>GHA: logs summary (uploaded vs skipped)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Potential attention areas:

  • scripts/ingest-docs-pinecone.ts: MDX conversion flow, temp-file lifecycle, file-size checks, and retry/backoff correctness.
  • Env var handling and CI invocation in .github/workflows/deploy-docs.yaml: correctness of required/optional variables and secret usage.
  • Error handling paths that cause non-zero exit and logging clarity.

Poem

🐇 I hopped through folders, sniffed each page,
Converted MDX, prepared the stage.
I nudged old records, then uploaded new,
Pinecone now holds what once only grew.
Tiny paws, big search—hop, deploy, woo! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning. Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Title Check ✅ Passed. The PR title "feat(ci): add Pinecone docs ingestion to deploy workflow" directly and clearly reflects the main changes in the changeset: a new Pinecone documentation ingestion feature in the CI/CD deploy workflow, matching the modified deploy-docs workflow file and the new ingest-docs-pinecone script. The title is concise, specific, and immediately clear to someone scanning the project history.
  • Description Check ✅ Passed. The PR description follows the repository's template: a "What is this contribution about?" section explaining the automated Pinecone ingestion with specific details about the workflow changes and new script, a "Screenshots/Demonstration" section appropriately marked N/A for backend/CI-only changes, and a Review Checklist with all required items checked, covering PR title clarity, testing completion, documentation status, and breaking-changes confirmation.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (4)
.github/workflows/deploy-docs.yaml (1)

32-40: Consider adding error handling for the ingestion step.

If the Pinecone ingestion fails, the deploy could proceed with docs out of sync with the Pinecone index. Ensure continue-on-error is not set to true (false is the default, so the step fails the workflow), and consider adding a retry mechanism or a failure notification.

scripts/ingest-docs-pinecone.ts (3)

1-22: Consider standardizing comment language.

The comments mix Portuguese ("obrigatórios", "por tamanho") and English. For consistency and broader accessibility, consider using English throughout.


50-53: Validate numeric environment variables.

Using Number() to parse environment variables will silently return NaN for invalid input, which could cause unexpected behavior or runtime errors. For example, Number("abc") returns NaN, which would then be used in chunking calculations.

Consider adding validation:

function parseNumber(value: string | undefined, defaultValue: number, name: string): number {
  if (!value) return defaultValue;
  const parsed = Number(value);
  if (isNaN(parsed)) {
    throw new Error(`Invalid number for ${name}: ${value}`);
  }
  return parsed;
}

const MAX_FILE_MB = parseNumber(process.env.MAX_FILE_MB, 2, "MAX_FILE_MB");
const CHUNK_SIZE = parseNumber(process.env.CHUNK_SIZE, 2800, "CHUNK_SIZE");
const CHUNK_OVERLAP = parseNumber(process.env.CHUNK_OVERLAP, 300, "CHUNK_OVERLAP");
const BATCH_SIZE = parseNumber(process.env.BATCH_SIZE, 500, "BATCH_SIZE");

118-122: Consider adding error handling for batch upserts.

The batch upsert loop doesn't handle potential failures from individual upsert operations. If an upsert fails, the entire script will fail without indicating which batch caused the issue.

Consider wrapping the upsert in a try-catch to log which batch failed:

for (let i = 0; i < records.length; i += BATCH_SIZE) {
  const batch = records.slice(i, i + BATCH_SIZE);
  try {
    await index.namespace(NAMESPACE).upsertRecords(batch);
    console.log(`Upserted ${Math.min(i + BATCH_SIZE, records.length)}/${records.length}`);
  } catch (error) {
    console.error(`Failed to upsert batch ${i}-${i + BATCH_SIZE}:`, error);
    throw error;
  }
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc4612b and a83ca24.

📒 Files selected for processing (2)
  • .github/workflows/deploy-docs.yaml (1 hunks)
  • scripts/ingest-docs-pinecone.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/data-flow.mdc)

**/*.{ts,tsx}: MCP tools must use Zod schemas for input and, when applicable, output validation
Register tools with server.registerTool providing description, inputSchema.shape, and optional outputSchema.shape
In every MCP tool handler, perform authorization checks first and then call context.resourceAccess.grant() before business logic
Name tools using the {RESOURCE}_{ACTION} pattern (e.g., AGENTS_CREATE, THREADS_LIST)
Group related tools into typed collections (e.g., GLOBAL_TOOLS, WORKSPACE_TOOLS) and export them as const
Always check workspace/team access before executing operations that touch workspace resources
Return proper authorization errors and handle forbidden operations explicitly

Prefer specific TypeScript types over any

**/*.{ts,tsx}: Write concise, maintainable, and technically accurate TypeScript code
Use functional and declarative programming patterns; avoid classes
Favor iteration and modularization to follow DRY and avoid code duplication
Use descriptive variable names with auxiliary verbs (e.g., isLoading, hasError)
Organize files so each contains only related content (components, subcomponents, helpers, static content, types)
Prefer interfaces over types for object shapes
Avoid enums; use maps (objects/records) instead
Use the function keyword for pure functions to benefit from hoisting and clarity
Ensure dependency arrays use stable references; do not inline new objects/arrays/functions
Memoize options/objects used in hooks (e.g., useMemo for options passed to useMemo/useCallback)
Favor named exports for functions

Files:

  • scripts/ingest-docs-pinecone.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/data-flow.mdc)

Log authorization failures to aid debugging and auditing

Files:

  • scripts/ingest-docs-pinecone.ts
scripts/**

📄 CodeRabbit inference engine (AGENTS.md)

Keep infrastructure helper scripts under scripts/

Files:

  • scripts/ingest-docs-pinecone.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Biome formatting: use two-space indentation and double quotes
Keep imports sorted
Name hooks and utility functions using camelCase

Files:

  • scripts/ingest-docs-pinecone.ts
{apps/api,scripts}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use explicit file extensions in imports where Deno requires them

Files:

  • scripts/ingest-docs-pinecone.ts
🔇 Additional comments (8)
.github/workflows/deploy-docs.yaml (2)

41-46: LGTM!

The deploy step is correctly configured with the deployment token from secrets.


29-30: The original review comment's concern is not valid.

The two install steps are intentionally in different contexts:

  • Lines 24-27 install docs app dependencies into docs/node_modules
  • Lines 29-30 install root-level script dependencies into the root node_modules

The script scripts/ingest-docs-pinecone.ts is located at the repository root and requires its dependencies installed there. This is correct monorepo behavior with no risk of version conflicts—each context has its own isolated node_modules.

Minor note: The step name "Install rag-docs script dependencies" is misleading since the script is actually ingest-docs-pinecone.ts.

Likely an incorrect or invalid review comment.

scripts/ingest-docs-pinecone.ts (6)

58-61: LGTM!

The extension filtering logic is correct and straightforward.


78-81: LGTM!

The text splitter is correctly initialized with the configured chunk size and overlap.


87-90: Verify deleteAll behavior.

The script calls deleteAll() on a namespace, which will remove all existing documents. Ensure this aligns with your intended behavior, especially since it runs on every deployment (per the workflow configuration).

Consider adding a confirmation log or safety check before deletion, especially if this runs automatically.
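
A minimal sketch of such a guard, assuming the quoted index/namespace setup; the FORCE_DELETE escape hatch is an invented illustration, not part of this PR:

import { Pinecone } from "@pinecone-database/pinecone";

// Hypothetical safety check before the destructive deleteAll() discussed
// above. DELETE_ALL and NAMESPACE mirror the workflow env vars; FORCE_DELETE
// is an invented opt-in for running outside CI.
async function maybeWipeNamespace(pc: Pinecone, indexName: string): Promise<void> {
  const namespace = process.env.NAMESPACE ?? "docs";
  if (process.env.DELETE_ALL !== "true") return;
  if (process.env.CI !== "true" && process.env.FORCE_DELETE !== "true") {
    throw new Error(`Refusing to wipe namespace "${namespace}" outside CI without FORCE_DELETE=true`);
  }
  console.warn(`DELETE_ALL=true: wiping namespace "${namespace}" in index "${indexName}"`);
  await pc.index(indexName).namespace(namespace).deleteAll();
}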


92-96: LGTM!

The file discovery and filtering logic is correct. Using fast-glob with the normalized pattern and then filtering by extension is a clean approach.


100-114: LGTM!

The file processing logic is well-structured:

  • Skips empty/large files gracefully
  • Normalizes paths for consistency
  • Chunks documents with metadata
  • Creates unique IDs using path and chunk index
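
For instance, the unique-ID scheme could take this shape (a sketch under assumed names; the script's actual identifiers may differ):

import path from "node:path";

// Illustrative record construction: one stable ID per (file, chunk) pair.
function toRecords(filePath: string, chunks: string[]) {
  const normalizedPath = filePath.split(path.sep).join("/");
  return chunks.map((text, chunkIndex) => ({
    id: `${normalizedPath}#${chunkIndex}`, // unique per path and chunk index
    text,
    source_path: normalizedPath, // metadata kept for later matching/deletion
  }));
}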

127-130: LGTM!

The top-level error handling correctly logs errors and exits with a non-zero code, which will properly fail the CI workflow if something goes wrong.
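
A typical shape for that top-level handler, assuming a main() entry point (the actual function name in the script may differ):

// Sketch of the log-and-exit-non-zero error path described above.
async function main(): Promise<void> {
  // ... scan, chunk, and upsert documents ...
}

main().catch((error: unknown) => {
  console.error("Ingestion failed:", error);
  process.exit(1); // non-zero exit fails the CI job
});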

PINECONE_INDEX_NAME: ${{ secrets.PINECONE_INDEX_NAME }}
NAMESPACE: docs
DOCS_DIR: docs/view/src/content/pt-br
DELETE_ALL: "true"

@coderabbitai coderabbitai bot Oct 23, 2025


⚠️ Potential issue | 🟠 Major

Reconsider DELETE_ALL default in production.

Setting DELETE_ALL: "true" means every deployment wipes all existing documents from Pinecone before re-ingesting. This could cause temporary data unavailability and potential data loss if the ingestion step fails partway through.

Consider either:

  • Setting DELETE_ALL: "false" and relying on upserts to update existing chunks
  • Moving deletion to a separate manual workflow
  • Adding a conditional based on branch or tag
🤖 Prompt for AI Agents
.github/workflows/deploy-docs.yaml around line 39: the workflow currently sets
DELETE_ALL: "true" which force-deletes all Pinecone documents on every
deployment; change this to a safer default and/or add conditional control: set
DELETE_ALL: "false" by default, or gate deletion behind a branch/tag condition
(e.g., only true for main/release or when a manual input/secret is present), or
move the delete step to a separate manual workflow; implement one of these
options so routine CI deploys use upserts without wiping data unless explicitly
requested.

@viktormarinho commented

better than nothing, we can make it good later

coderabbitai bot replied

@viktormarinho Sounds good! Feel free to circle back to this when you're ready to make it more robust. 👍

Comment on lines 37 to 42
const PINECONE_API_KEY = process.env.PINECONE_API_KEY!;
const PINECONE_INDEX_NAME = process.env.PINECONE_INDEX_NAME!;

if (!PINECONE_API_KEY) throw new Error("Define PINECONE_API_KEY");
if (!PINECONE_INDEX_NAME) throw new Error("Define PINECONE_INDEX_NAME (index name)");


🛠️ Refactor suggestion | 🟠 Major

Remove redundant non-null assertions.

The non-null assertion operator (!) on lines 37-38 tells TypeScript these values are never null/undefined, but then lines 40-41 immediately check if they're falsy. This is redundant.

Apply this diff to validate before assignment:

-const PINECONE_API_KEY = process.env.PINECONE_API_KEY!;
-const PINECONE_INDEX_NAME = process.env.PINECONE_INDEX_NAME!;
-
-if (!PINECONE_API_KEY) throw new Error("Define PINECONE_API_KEY");
-if (!PINECONE_INDEX_NAME) throw new Error("Define PINECONE_INDEX_NAME (index name)");
+if (!process.env.PINECONE_API_KEY) throw new Error("Define PINECONE_API_KEY");
+if (!process.env.PINECONE_INDEX_NAME) throw new Error("Define PINECONE_INDEX_NAME (index name)");
+
+const PINECONE_API_KEY = process.env.PINECONE_API_KEY;
+const PINECONE_INDEX_NAME = process.env.PINECONE_INDEX_NAME;
🤖 Prompt for AI Agents
In scripts/ingest-docs-pinecone.ts around lines 37 to 42, remove the redundant
non-null assertions on PINECONE_API_KEY and PINECONE_INDEX_NAME and instead
validate the environment variables before assigning: read the raw values from
process.env without "!", check if each is present (throwing the existing errors
if missing), and only then assign the consts (or use a guaranteed-typed value).
This eliminates the pointless "!" while preserving the runtime checks and
type-safety.

Comment on lines 63 to 76
async function readIfSmall(file: string): Promise<string | null> {
try {
const stat = await fs.stat(file);
if (stat.size > bytesLimit) {
console.warn(`Skip por tamanho (${(stat.size / 1024 / 1024).toFixed(2)}MB): ${file}`);
return null;
}
const raw = await fs.readFile(file, "utf8");
if (!raw.trim()) return null;
return raw;
} catch {
return null;
}
}

🛠️ Refactor suggestion | 🟠 Major

Improve error handling in readIfSmall.

The catch block silently swallows all errors (line 73-74), which could mask legitimate issues like permission errors, disk failures, or file system problems. While returning null for unreadable files is acceptable, the errors should be logged for debugging.

Apply this diff to add error logging:

   } catch {
+    console.warn(`Failed to read file: ${file}`);
     return null;
   }

Or for more detailed logging:

-  } catch {
+  } catch (error) {
+    console.warn(`Failed to read file: ${file}`, error);
     return null;
   }
🤖 Prompt for AI Agents
In scripts/ingest-docs-pinecone.ts around lines 63 to 76, the readIfSmall
function currently swallows all errors in its catch block; update the catch to
log the error (including the file path and error message/stack) before returning
null so permission/IO errors are visible during debugging, e.g., capture the
caught error and call console.error (or the project's logger) with a clear
message and the error, then return null as before.

// - Embedding: Pinecone automatically generates via integrated embedding
// - Upsert: uses upsertRecords() sending text + metadata
//
// ENV obrigatórios:

pt-br -> en


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
scripts/ingest-docs-pinecone.ts (1)

26-31: Remove redundant non-null assertions.

The non-null assertions on lines 26-27 followed by validation checks on lines 29-30 are redundant. Validate before assignment instead.

Apply this diff:

-const PINECONE_API_KEY = process.env.PINECONE_API_KEY!;
-const PINECONE_ASSISTANT_NAME = process.env.PINECONE_ASSISTANT_NAME!;
-
-if (!PINECONE_API_KEY) throw new Error("Define PINECONE_API_KEY");
-if (!PINECONE_ASSISTANT_NAME) throw new Error("Define PINECONE_ASSISTANT_NAME (assistant name)");
+if (!process.env.PINECONE_API_KEY) throw new Error("Define PINECONE_API_KEY");
+if (!process.env.PINECONE_ASSISTANT_NAME) throw new Error("Define PINECONE_ASSISTANT_NAME (assistant name)");
+
+const PINECONE_API_KEY = process.env.PINECONE_API_KEY;
+const PINECONE_ASSISTANT_NAME = process.env.PINECONE_ASSISTANT_NAME;
.github/workflows/deploy-docs.yaml (1)

38-38: Reconsider DELETE_ALL in production deployments.

Setting DELETE_ALL: "true" on every deployment wipes all existing documents before re-ingesting, which could cause temporary data unavailability if ingestion fails mid-process. Consider making deletion opt-in or conditional.

🧹 Nitpick comments (4)
scripts/ingest-docs-pinecone.ts (4)

20-23: Add explicit file extensions to imports.

Scripts under scripts/** may be executed in a Deno-like environment where explicit file extensions are required. The imports from node:fs/promises and node:path are correct; it is relative file imports that need explicit extensions under this rule.

As per coding guidelines
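
For illustration, the convention looks like this ("./env-utils.ts" is a hypothetical local module, not part of this PR):

// Relative imports carry explicit extensions under the Deno-style rule:
import { parseNumber } from "./env-utils.ts";
// Bare package and node: specifiers are unaffected:
import { Pinecone } from "@pinecone-database/pinecone";
import fs from "node:fs/promises";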


67-69: Log errors for debugging.

The catch block silently swallows all errors, which could mask permission errors, disk failures, or other file system issues. Add error logging to aid debugging.

Apply this diff:

-  } catch {
+  } catch (error) {
+    console.warn(`Failed to stat file: ${file}`, error);
     return false;
   }

130-135: Log cleanup failures for debugging.

The catch block in the finally section silently swallows errors when deleting temporary files. While it's acceptable to continue on cleanup failures, logging these errors helps diagnose file system issues.

Apply this diff:

     if (tempFile) {
       try {
         await fs.unlink(tempFile);
-      } catch {
+      } catch (error) {
+        console.warn(`Failed to delete temporary file: ${tempFile}`, error);
       }
     }

139-198: Consider adding progress feedback for deletion phase.

The deletion phase (lines 150-169) iterates through all existing files but only logs when a file is deleted. For large assistants, this phase could appear to hang. Consider adding periodic progress updates.

Example enhancement:

console.log(`Checking ${existingFiles.length} existing file(s)...`);
let deleted = 0;
for (const file of existingFiles) {
  const sourcePath = file.metadata?.source_path;
  if (sourcePath && localPaths.has(sourcePath)) {
    await assistant.deleteFile(file.id);
    console.log(`Deleted: ${file.name} (${sourcePath})`);
    deleted++;
  }
}
console.log(`Deleted ${deleted} file(s)`);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a83ca24 and ad0c54c.

📒 Files selected for processing (2)
  • .github/workflows/deploy-docs.yaml (1 hunks)
  • scripts/ingest-docs-pinecone.ts (1 hunks)
🔇 Additional comments (2)
.github/workflows/deploy-docs.yaml (2)

29-30: LGTM: Dependencies installed correctly.

The dependencies are installed at the repository root, which is appropriate since the script is located at scripts/ingest-docs-pinecone.ts (root level).


37-37: Verify if pt-br is your intended documentation language, or if it should be en.

Both docs/view/src/content/pt-br/ and docs/view/src/content/en/ directories exist in the repository. The current workflow targets Portuguese (pt-br), but a past review comment suggested using English (en) instead. Confirm this is your intended documentation source, or update to en if that's preferred.

}

async function uploadWithRetry(
  assistant: any,

🛠️ Refactor suggestion | 🟠 Major

Replace any with specific type.

The assistant parameter is typed as any, which bypasses type safety. Use the specific Pinecone Assistant type from the SDK.

As per coding guidelines

Apply this diff:

+import type { Assistant } from "@pinecone-database/pinecone";
+
 async function uploadWithRetry(
-  assistant: any,
+  assistant: Assistant,
   file: string,
   normalizedPath: string,
   retries = MAX_RETRIES
🤖 Prompt for AI Agents
In scripts/ingest-docs-pinecone.ts around line 77, the assistant parameter is
currently typed as "any"; replace it with the specific Assistant type exported
by the Pinecone SDK (use the exact exported type name from the SDK you depend
on), update the function signature to use that type, and add the corresponding
import at the top of the file. Ensure you remove the "any" usage everywhere this
parameter is referenced so TypeScript enforces proper typings (adjust related
code if the stricter type surfaces required changes).


viktormarinho commented Nov 14, 2025

plz run format, lint check etc @igoramf
