Puppeteer, Node.js, and a 1500+ article migration
Notes from automating a large WordPress article migration with scripts instead of repeating manual browser work.
Large content migrations are rarely technically glamorous, but they are good tests of engineering judgment. A manual migration can look faster at the start, then become slow, error-prone, and difficult to verify once the article count grows.
One automation project involved migrating 1500+ WordPress articles. The useful decision was to stop treating it as repeated browser work and start treating it as a controlled workflow.
The shape of the automation
The script work centered on a few practical needs:
- reading source content consistently
- preserving titles, body structure, categories, and metadata where possible
- handling browser sessions with Puppeteer
- recording progress so failed runs could resume
- separating extraction, transformation, and publishing steps (sketched below)
The goal was not to write a clever script. The goal was to reduce manual error and make the migration observable.
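To make those boundaries concrete, here is a minimal sketch of the three steps as separate functions. Everything specific in it is an assumption for illustration: the selectors, the inline-style cleanup, the REST call shape, and the parameter names are not the original script, and real category handling would need mapping names to WordPress term IDs.

```js
// migrate.js — illustrative sketch only; selectors, cleanup rule, and the
// publishing call are assumptions, not the original migration script.
// `page` is a Puppeteer Page created by the caller (launch and login handled elsewhere).

// Extraction: browser work only; returns raw data from the source page.
async function extractArticle(page, url) {
  await page.goto(url, { waitUntil: 'networkidle2' });
  return page.evaluate(() => ({
    title: document.querySelector('h1.entry-title')?.textContent ?? '',
    bodyHtml: document.querySelector('.entry-content')?.innerHTML ?? '',
    categories: [...document.querySelectorAll('.cat-links a')].map(a => a.textContent.trim()),
  }));
}

// Transformation: a pure function with no browser dependency, easy to test.
// (Category names would still need mapping to WordPress term IDs; omitted here.)
function transformArticle(raw) {
  return {
    title: raw.title.trim(),
    content: raw.bodyHtml.replace(/\s+style="[^"]*"/g, ''), // e.g. strip inline styles
    status: 'draft',
  };
}

// Publishing: a separate step; here a simplified WordPress REST API call
// (Node 18+ global fetch).
async function publishArticle(post, { baseUrl, auth }) {
  const res = await fetch(`${baseUrl}/wp-json/wp/v2/posts`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: auth },
    body: JSON.stringify(post),
  });
  if (!res.ok) throw new Error(`publish failed with HTTP ${res.status}`);
  return res.json();
}

// One article end to end, with each step behind its own boundary.
async function migrateOne(page, url, target) {
  const raw = await extractArticle(page, url);
  return publishArticle(transformArticle(raw), target);
}

module.exports = { extractArticle, transformArticle, publishArticle, migrateOne };
```

Keeping the transformation step free of browser calls is what makes the testing and retry pieces later in these notes straightforward.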
What mattered most
Retries and logging mattered more than speed. A fast migration that fails silently is worse than a slower one that tells you exactly which article failed and why.
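A minimal version of that idea is a retry wrapper that records every failure before giving up. The log file name, attempt limit, and backoff values below are placeholders, not the original script's settings.

```js
// retry.js — a minimal sketch; log file name and attempt limits are placeholders.
const fs = require('node:fs');

async function withRetries(label, fn, { attempts = 3, baseDelayMs = 2000 } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Record exactly which article failed, on which attempt, and why.
      const line = `${new Date().toISOString()}\t${label}\tattempt ${attempt}\t${err.message}\n`;
      fs.appendFileSync('migration-errors.log', line);
      if (attempt === attempts) throw err; // surface the failure, never swallow it
      await new Promise(r => setTimeout(r, baseDelayMs * attempt)); // simple linear backoff
    }
  }
}

// Usage: await withRetries(articleUrl, () => migrateOne(page, articleUrl, target));
module.exports = { withRetries };
```

Linear backoff is enough here; the point is that every failed attempt leaves a trace that can be reviewed after the run.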
I also learned to keep transformation logic separate from browser automation. Browser automation is already fragile because it depends on page structure, timing, and logged-in state. Mixing it with content cleanup makes the whole script harder to reason about.
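Because the transformation step has no browser dependency, it can be exercised with a plain assert run and a made-up sample, without launching Puppeteer at all. The sample input below is invented for illustration and checks the hypothetical `transformArticle` from the earlier sketch.

```js
// transform.test.js — the transform is pure, so it can be checked without a browser.
const assert = require('node:assert');
const { transformArticle } = require('./migrate');

const sample = {
  title: '  Hello World  ',
  bodyHtml: '<p style="color:red">Body</p>',
  categories: ['News '],
};

const out = transformArticle(sample);
assert.strictEqual(out.title, 'Hello World');
assert.ok(!out.content.includes('style='));
console.log('transform ok');
```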
The practical lesson
Automation becomes valuable when it turns repeated work into a reviewable process. For migration work, that means checkpoints, logs, resumability, and clear boundaries between steps.
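Concretely, resumability can be as simple as an append-only progress file that a re-run consults before touching anything. The file name and record fields below are assumptions for illustration, not the original checkpoint format.

```js
// checkpoint.js — sketch of resumable progress tracking; file name and record
// fields are assumptions.
const fs = require('node:fs');

const CHECKPOINT_FILE = 'migration-progress.jsonl';

// Read back every article that previous runs already finished.
function loadCompleted() {
  if (!fs.existsSync(CHECKPOINT_FILE)) return new Set();
  return new Set(
    fs.readFileSync(CHECKPOINT_FILE, 'utf8')
      .split('\n')
      .filter(Boolean)
      .map(line => JSON.parse(line).sourceUrl)
  );
}

// Append one record per successful article, immediately after it is published.
function markCompleted(sourceUrl, targetId) {
  const record = { sourceUrl, targetId, at: new Date().toISOString() };
  fs.appendFileSync(CHECKPOINT_FILE, JSON.stringify(record) + '\n');
}

module.exports = { loadCompleted, markCompleted };

// A resumed run then just skips what is already recorded:
//   const done = loadCompleted();
//   for (const url of articleUrls) {
//     if (done.has(url)) continue;
//     const result = await withRetries(url, () => migrateOne(page, url, target));
//     markCompleted(url, result.id);
//   }
```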