# Zotero Paper Fetcher (Automator)

This script automates saving an academic paper (or any URL) to Zotero using the official **Zotero Connector** Google Chrome extension, driven by Playwright in Python.

## Why this approach?

Zotero Connector is a Chrome extension that provides the official, most robust way of getting high-quality metadata and full-text PDFs (if proxy or site access allows). However, browser automation that relies on the old headless Chrome mode cannot load Chrome extensions. This script works around the problem by:

1. Automatically downloading the latest Zotero Connector extension.
2. Unpacking it from its `.crx` format.
3. Launching Chromium using Playwright in `--headless=new` mode (which does allow extensions, unlike the old headless mode) with a persistent user data directory.
4. Auto-closing setup tabs.
5. Invoking the Zotero Connector programmatically by accessing its background service worker (`Zotero.Connector_Browser.saveWithTranslator(...)`).

## Prerequisites

1. [**uv**](https://github.com/astral-sh/uv) installed.
2. **Zotero Desktop** must be running on your machine (the extension communicates with the Zotero desktop app over a local connection, by default on port `23119`).

## Setup

First, install the dependencies and set up the Playwright environment using `uv`:

```bash
uv sync
uv run playwright install chromium
```

## Usage

Pass the URL of the paper you want to add to Zotero:

```bash
uv run zotero_automator.py "https://arxiv.org/abs/1706.03762"
```

To watch the browser while it works (helpful for debugging if a site requires a captcha or login, or just to verify that the extension is working), pass the `--headed` flag:

```bash
uv run zotero_automator.py "https://arxiv.org/abs/1706.03762" --headed
```

## How It Works

- **`setup_extension()`**: Locates the `EKHAGK...` identifier for the Zotero extension on the Chrome Web Store and downloads the raw `.crx` payload. It unpacks the contents into `./zotero_extension/`.
- **`save_to_zotero()`**: Starts an `async_playwright` session that uses a persistent local profile folder (`./chrome_profile/`). Once the page reaches network idle, the extension injects its translator scripts. The script then finds the extension's background service worker, triggers the programmatic save, and polls `sessionProgress` until Zotero finishes downloading the metadata and PDFs.
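
The unpacking step in `setup_extension()` can be illustrated with a minimal sketch. It assumes the standard CRX layout, where a short header beginning with the `Cr24` magic precedes an ordinary ZIP archive. The function names `strip_crx_header`, `unpack_crx`, and `make_fake_crx3` are hypothetical and are not taken from `zotero_automator.py`; the real script may unpack the file differently.

```python
import io
import struct
import zipfile
from pathlib import Path


def strip_crx_header(data: bytes) -> bytes:
    """Return the ZIP payload embedded in a CRX2 or CRX3 file."""
    if data[:4] != b"Cr24":
        raise ValueError("not a CRX file: missing 'Cr24' magic")
    (version,) = struct.unpack_from("<I", data, 4)
    if version == 3:
        # CRX3 layout: magic, version, header length, header, ZIP payload
        (header_len,) = struct.unpack_from("<I", data, 8)
        zip_start = 12 + header_len
    elif version == 2:
        # CRX2 layout: magic, version, key length, signature length,
        # public key, signature, ZIP payload
        key_len, sig_len = struct.unpack_from("<II", data, 8)
        zip_start = 16 + key_len + sig_len
    else:
        raise ValueError(f"unsupported CRX version: {version}")
    return data[zip_start:]


def unpack_crx(data: bytes, dest: Path) -> list[str]:
    """Extract the extension files into dest and return the member names."""
    with zipfile.ZipFile(io.BytesIO(strip_crx_header(data))) as archive:
        archive.extractall(dest)
        return archive.namelist()


def make_fake_crx3(files: dict[str, str], header: bytes = b"\x00" * 8) -> bytes:
    """Build a tiny CRX3-shaped blob for demonstration (not a signed extension)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return b"Cr24" + struct.pack("<II", 3, len(header)) + header + buf.getvalue()
```

Calling `unpack_crx(crx_bytes, Path("./zotero_extension"))` on the downloaded bytes would then leave a directory that Chromium can load as an unpacked extension.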
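
The polling step in `save_to_zotero()` follows a common pattern: query the progress repeatedly, stop when it reports completion, and give up after a timeout. The sketch below shows only that pattern. The helper `poll_until_done`, the `fetch_progress` callable, and the `"done"` key are assumptions made for illustration; the actual shape of the data returned by `sessionProgress` is not documented here, and in the real script the callable would presumably wrap an evaluation inside the extension's service worker.

```python
import asyncio
from typing import Awaitable, Callable


async def poll_until_done(
    fetch_progress: Callable[[], Awaitable[dict]],
    interval: float = 0.5,
    timeout: float = 60.0,
) -> dict:
    """Call fetch_progress until it reports completion or the timeout expires."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        progress = await fetch_progress()
        # "done" is a hypothetical flag; adapt this check to the real progress data.
        if progress.get("done"):
            return progress
        if loop.time() >= deadline:
            raise TimeoutError("Zotero save did not finish in time")
        await asyncio.sleep(interval)


async def demo() -> dict:
    """Run the poller against a fake progress source that finishes on call 3."""
    calls = {"n": 0}

    async def fake_progress() -> dict:
        calls["n"] += 1
        return {"done": calls["n"] >= 3, "calls": calls["n"]}

    return await poll_until_done(fake_progress, interval=0.01, timeout=1.0)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping the timeout explicit means a stalled PDF download makes the script fail with a clear error instead of hanging indefinitely.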