New ServiceRoute for html to pdf conversion#222
Conversation
|
Visit the preview URL for this PR (updated for commit 7b8855c):
(expires Thu, 19 Feb 2026 18:00:43 GMT) 🔥 via Firebase Hosting GitHub Action 🌎 Sign: 0c15c45ea5a4c54095387eacf30c3755c9260f22 |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
There was a problem hiding this comment.
Pull request overview
Adds/extends the Service API PDF conversion functionality by introducing shared PDF settings utilities and a new endpoint to convert raw HTML strings to a merged PDF.
Changes:
- Refactors PDF settings handling into shared defaults + merge helpers (
DEFAULT_SETTINGS,mergeSettings,setupPage,generatePdfFromPage). - Reuses a dedicated TypeBox schema for PDF settings (
pdfSettingsSchema) across routes. - Adds a new
/v1/pdf/convert-htmlroute to convert multiple HTML strings into a single merged PDF.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const page = await setupPage(browser, mergedSettings) | ||
| await page.setContent(html, { waitUntil: 'networkidle0' }) | ||
| const pdfBuffer = await generatePdfFromPage(page, mergedSettings) | ||
| await merger.add(pdfBuffer as Uint8Array) | ||
| await page.close() |
There was a problem hiding this comment.
As with /v1/pdf/convert, browser/page are only closed on the success path. If setContent, PDF generation, or merging throws, Chromium can be left running. Use try/finally to always close browser, and close each page in a per-iteration finally to avoid leaks.
| const page = await setupPage(browser, mergedSettings) | |
| await page.setContent(html, { waitUntil: 'networkidle0' }) | |
| const pdfBuffer = await generatePdfFromPage(page, mergedSettings) | |
| await merger.add(pdfBuffer as Uint8Array) | |
| await page.close() | |
| let page: Page | null = null | |
| try { | |
| page = await setupPage(browser, mergedSettings) | |
| await page.setContent(html, { waitUntil: 'networkidle0' }) | |
| const pdfBuffer = await generatePdfFromPage(page, mergedSettings) | |
| await merger.add(pdfBuffer as Uint8Array) | |
| } finally { | |
| if (page) { | |
| await page.close() | |
| } | |
| } |
| if (!urls.length) { | ||
| return reply.status(400).send( | ||
| JSON.stringify({ | ||
| error: 'At least one URL is required', | ||
| }) | ||
| ) | ||
| return reply.status(400).send(JSON.stringify({ error: 'At least one URL is required' })) | ||
| } |
There was a problem hiding this comment.
These error responses are JSON-stringified but no content-type: application/json header is set here, so Fastify will typically treat the payload as text/plain. Consider sending an object (letting Fastify serialize) or explicitly setting the JSON content-type for consistency with the global error handler.
| }, | ||
| async (request, reply) => { | ||
| const { htmlContents, settings } = request.body |
There was a problem hiding this comment.
This error response is JSON-stringified but no content-type: application/json header is set, so the response may be treated as plain text. Consider sending an object or explicitly setting the JSON content-type (consistent with fastifyErrorHandler).
| @@ -79,90 +167,101 @@ export const pdfRoute = (fastify: FastifyInstance, options: any, done: () => any | |||
| }), | |||
There was a problem hiding this comment.
The 200-response schema documents a JSON object, but the handler actually returns a binary PDF (Buffer) with Content-Type: application/pdf. This will make Swagger/OpenAPI docs and any generated clients incorrect. Update the route schema to describe a PDF/binary response (and content-type) or change the handler to return the documented JSON payload.
| const page = await setupPage(browser, mergedSettings) | ||
| await page.goto(url, { waitUntil: 'networkidle0' }) | ||
| const pdfBuffer = await generatePdfFromPage(page, mergedSettings) | ||
| await merger.add(pdfBuffer as Uint8Array) | ||
| await page.close() |
There was a problem hiding this comment.
The browser (and potentially the current page) will not be closed if an error occurs anywhere in the loop (e.g., navigation timeout, invalid URL, PDF generation failure). In a long-lived/serverless runtime this can leak Chromium processes and memory. Use a try/finally (or try { ... } finally { await browser.close() }) and ensure each page.close() runs in a per-iteration finally as well.
| const page = await setupPage(browser, mergedSettings) | |
| await page.goto(url, { waitUntil: 'networkidle0' }) | |
| const pdfBuffer = await generatePdfFromPage(page, mergedSettings) | |
| await merger.add(pdfBuffer as Uint8Array) | |
| await page.close() | |
| let page: Page | null = null | |
| try { | |
| page = await setupPage(browser, mergedSettings) | |
| await page.goto(url, { waitUntil: 'networkidle0' }) | |
| const pdfBuffer = await generatePdfFromPage(page, mergedSettings) | |
| await merger.add(pdfBuffer as Uint8Array) | |
| } finally { | |
| if (page) { | |
| try { | |
| await page.close() | |
| } catch { | |
| // Ignore errors on page.close to avoid masking prior failures | |
| } | |
| } | |
| } |
| landscape: settings.pdf.landscape, | ||
| margin: settings.pdf.margin, | ||
| }) | ||
| return Buffer.from(pdfBuffer) |
There was a problem hiding this comment.
page.pdf() in Puppeteer returns a Buffer (which is already a Uint8Array). Wrapping it in new Uint8Array(pdfBuffer) will allocate and copy the entire PDF for every page, increasing CPU/memory usage. Return the Buffer directly (or keep it as Uint8Array without copying) and adjust the return type accordingly.
| return Buffer.from(pdfBuffer) | |
| return pdfBuffer |
| body: Type.Object({ | ||
| htmlContents: Type.Array(Type.String(), { description: 'Array of HTML content strings' }), | ||
| settings: pdfSettingsSchema, | ||
| }), | ||
| response: { |
There was a problem hiding this comment.
The 200-response schema documents a JSON object, but this handler returns a binary PDF (Buffer) with Content-Type: application/pdf. Update the route schema to describe a PDF/binary response (and content-type) or change the handler to return the documented JSON payload.
Note
Medium Risk
Introduces a new PDF-generation endpoint and refactors Puppeteer flow/shared settings; risk is mainly around runtime behavior (resource cleanup, PDF rendering differences) and handling untrusted HTML/URLs in a headless browser.
Overview
Adds a new authenticated
POST /v1/pdf/convert-htmlroute that converts an array of HTML strings into a single merged PDF response.Refactors the existing URL conversion route to share
DEFAULT_SETTINGS, settings-schema reuse (pdfSettingsSchema), and helper functions (mergeSettings,setupPage,generatePdfFromPage) for consistent viewport/timezone/PDF output, plus minor error/response cleanup (ensuring pages and the browser are closed and returning more specific error messages).Written by Cursor Bugbot for commit 947e362. Configure here.