Zotero Paper Fetcher (Automator)
This script automates saving an academic paper (or any URL) to Zotero using the official Zotero Connector Google Chrome extension, driven by Playwright in Python.
Why this approach?
Zotero Connector is a Chrome extension and the official, most robust way to get high-quality metadata and full-text PDFs into Zotero (when proxy or site access allows). However, standard browser automation with Chrome's old headless mode does not run Chrome extensions.
The script works around this by:
- Automatically downloading the latest Zotero Connector extension.
- Unpacking it from its .crx format.
- Launching Chromium using Playwright in --headless=new mode (which DOES allow extensions, unlike the old headless mode) with a persistent user data directory.
- Auto-closing setup tabs.
- Invoking the Zotero Connector programmatically by accessing its background service worker (Zotero.Connector_Browser.saveWithTranslator(...)).
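The launch and service-worker lookup steps above could be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper names build_launch_args and open_context_with_extension are hypothetical. It relies on Playwright's launch_persistent_context together with Chromium's --disable-extensions-except and --load-extension switches, and assumes the extension has already been unpacked to a local directory.

```python
from pathlib import Path


def build_launch_args(extension_dir: str, headed: bool = False) -> list[str]:
    """Chromium switches that load one unpacked extension (hypothetical helper)."""
    ext = str(Path(extension_dir).resolve())
    args = [
        f"--disable-extensions-except={ext}",
        f"--load-extension={ext}",
    ]
    if not headed:
        # The new headless mode runs extensions; the old one did not.
        args.append("--headless=new")
    return args


async def open_context_with_extension(extension_dir: str, profile_dir: str, headed: bool = False):
    # Imported lazily so build_launch_args works without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        context = await p.chromium.launch_persistent_context(
            profile_dir,
            headless=False,  # headless mode is requested through the switch instead
            args=build_launch_args(extension_dir, headed),
        )
        # A Manifest V3 extension runs its background script as a service worker.
        workers = context.service_workers
        worker = workers[0] if workers else await context.wait_for_event("serviceworker")
        print("extension service worker:", worker.url)
        await context.close()
```

The coroutine would be driven with asyncio.run(open_context_with_extension("./zotero_extension", "./chrome_profile")). Keeping the switch list in its own function makes the headless versus headed choice easy to check without starting a browser.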
Prerequisites
- uv installed.
- Zotero Desktop must be running on your machine (the extension communicates with the Zotero desktop app over a local HTTP connection, by default on port 23119).
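Since Zotero Desktop has to be running, a script like this could check that the local port is reachable before launching the browser. The sketch below is an assumption about how such a check might look, not something the script is documented to do; is_port_open is a hypothetical helper, and the port number is supplied by the caller.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Calling it with "127.0.0.1" and the port Zotero listens on gives a quick yes or no, so the script can fail early with a clear message instead of timing out mid-save.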
Setup
First, install the dependencies and set up the Playwright environment using uv:
uv sync
uv run playwright install chromium
Usage
Simply pass the URL of the paper you want to add to Zotero:
uv run zotero_automator.py "https://arxiv.org/abs/1706.03762"
If you want to watch the browser process visually (helpful for debugging if a site requires a captcha or login, or just to verify the extension is working), pass the --headed flag:
uv run zotero_automator.py "https://arxiv.org/abs/1706.03762" --headed
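For illustration, a command line of this shape (one positional URL plus an optional --headed switch) could be parsed with argparse as below. This is a hypothetical sketch; the actual argument handling in zotero_automator.py may differ.

```python
import argparse


def parse_args(argv=None):
    """Parse a URL and an optional --headed flag (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Save a URL to Zotero via the Zotero Connector."
    )
    parser.add_argument("url", help="URL of the paper or page to save")
    parser.add_argument(
        "--headed",
        action="store_true",
        help="show the browser window instead of running headless",
    )
    return parser.parse_args(argv)
```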
How It Works
- setup_extension(): Locates the EKHAGK... identifier for the Zotero extension on the Chrome Web Store and downloads the raw .crx payload. It unpacks the contents into ./zotero_extension/.
- save_to_zotero(): Starts an async_playwright session pointing to a local profile folder (./chrome_profile/). The extension injects its translator scripts on network idle. We find the extension's background service worker, trigger the programmatic save, and then poll sessionProgress until Zotero finishes downloading the PDFs and metadata.
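The unpacking step can be sketched as follows. A .crx file is a ZIP archive preceded by a binary header, so skipping the header leaves data that zipfile can read. This is a minimal sketch assuming the CRX2 and CRX3 header layouts; the function names crx_zip_offset and unpack_crx are hypothetical, and the download step is omitted.

```python
import io
import struct
import zipfile
from pathlib import Path


def crx_zip_offset(data: bytes) -> int:
    """Return the byte offset at which the embedded ZIP archive starts."""
    if data[:4] != b"Cr24":
        raise ValueError("not a CRX file (missing 'Cr24' magic)")
    (version,) = struct.unpack_from("<I", data, 4)
    if version == 3:
        # CRX3: magic, version, header length, header, ZIP.
        (header_len,) = struct.unpack_from("<I", data, 8)
        return 12 + header_len
    if version == 2:
        # CRX2: magic, version, key length, signature length, key, signature, ZIP.
        key_len, sig_len = struct.unpack_from("<II", data, 8)
        return 16 + key_len + sig_len
    raise ValueError(f"unsupported CRX version: {version}")


def unpack_crx(data: bytes, dest: str) -> list[str]:
    """Extract the ZIP payload of a CRX blob into dest and return the member names."""
    payload = data[crx_zip_offset(data):]
    with zipfile.ZipFile(io.BytesIO(payload)) as zf:
        zf.extractall(Path(dest))
        return zf.namelist()
```

With the bytes of a downloaded .crx in hand, unpack_crx(data, "./zotero_extension") would leave an unpacked extension directory that Chromium can load.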