2026-03-08 00:26:05 -06:00

# Zotero Paper Fetcher (Automator)
This script automates saving an academic paper (or any URL) to Zotero using the official **Zotero Connector** Google Chrome extension, driven by Playwright in Python.
## Why this approach?
Zotero Connector is a Chrome extension that provides the official, most robust way of getting high-quality metadata and full-text PDFs (if proxy or site access allows). However, Chrome's legacy headless mode, which browser automation has traditionally relied on, does not load extensions.
This script works around that limitation by:
1. Automatically downloading the latest Zotero Connector extension.
2. Unpacking it from its `.crx` format.
3. Launching Chromium using Playwright in `--headless=new` mode (which DOES allow extensions, unlike the old headless mode) with a persistent user data directory.
4. Auto-closing any setup tabs that open alongside the target page.
5. Emulating the official Zotero Connector keyboard shortcut (e.g. `Cmd+Shift+S` on Mac) once the page is fully loaded.
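
Steps 3 and 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: the helper names `build_launch_args`, `save_shortcut`, and `launch_with_extension` are hypothetical, `playwright` is assumed to be the object yielded by `async_playwright()`, and `Control+Shift+S` as the non-Mac shortcut is an assumption based on the Connector's usual default.

```python
import sys
from pathlib import Path


def build_launch_args(extension_dir, headed=False):
    """Return Chromium flags that load a single unpacked extension."""
    ext = str(Path(extension_dir).resolve())
    args = [
        f"--disable-extensions-except={ext}",
        f"--load-extension={ext}",
    ]
    if not headed:
        # Request the new headless mode explicitly via a Chromium flag.
        args.append("--headless=new")
    return args


def save_shortcut(platform=sys.platform):
    """Key combination in the notation used by Playwright's keyboard.press()."""
    # Assumption: Cmd+Shift+S on macOS, Ctrl+Shift+S everywhere else.
    return "Meta+Shift+S" if platform == "darwin" else "Control+Shift+S"


async def launch_with_extension(playwright, extension_dir, profile_dir, headed=False):
    """Open a persistent Chromium context with the extension loaded."""
    # headless=False keeps Playwright from adding its own headless flag;
    # the mode is then decided by the args built above.
    return await playwright.chromium.launch_persistent_context(
        str(profile_dir),
        headless=False,
        args=build_launch_args(extension_dir, headed),
    )
```

Keeping the flag and shortcut logic in plain functions makes it possible to check them without starting a browser.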
## Prerequisites
1. **Python 3.8+** installed.
2. **Zotero Desktop** must be running on your machine (the extension talks to the desktop app over a local HTTP connection on port `23119`).
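
To confirm that Zotero Desktop is reachable before launching the browser, a small check like the one below may help. It is a sketch only: the function name `zotero_is_running` is hypothetical, and it assumes the desktop app's connector server listens on `127.0.0.1:23119` and answers `GET /connector/ping`. Both are assumptions about the desktop app rather than something this script guarantees, so pass a different `port` if your setup differs.

```python
import http.client
import urllib.error
import urllib.request


def zotero_is_running(host="127.0.0.1", port=23119, timeout=2.0):
    """Return True if an HTTP 200 comes back from the assumed ping endpoint."""
    url = f"http://{host}:{port}/connector/ping"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not running".
        return False
```

For example, `if not zotero_is_running(): raise SystemExit("Start Zotero Desktop first")` would fail early with a readable message.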
## Setup
First, initialize your virtual environment and install dependencies:
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
playwright install chromium
```
## Usage
Pass the URL of the paper you want to add to Zotero:
```bash
source venv/bin/activate
python zotero_automator.py "https://arxiv.org/abs/1706.03762"
```
To watch the browser window while the script runs (helpful for debugging if a site requires a captcha or login, or just to verify the extension is working), pass the `--headed` flag:
```bash
python zotero_automator.py "https://arxiv.org/abs/1706.03762" --headed
```
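
The command-line interface shown above (one positional URL plus an optional `--headed` flag) could be parsed along these lines. This is a hedged sketch using `argparse`; the function name `build_parser` and the help strings are hypothetical and may not match what `zotero_automator.py` actually does.

```python
import argparse


def build_parser():
    """Build a parser for: zotero_automator.py URL [--headed]."""
    parser = argparse.ArgumentParser(
        description="Save a URL to Zotero via the Zotero Connector."
    )
    parser.add_argument("url", help="URL of the paper or page to save")
    parser.add_argument(
        "--headed",
        action="store_true",
        help="show the browser window instead of running headless",
    )
    return parser
```

Calling `build_parser().parse_args()` then yields an object with `url` and `headed` attributes.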
## How It Works
- **`setup_extension()`**: Looks up the Zotero Connector's extension identifier (`EKHAGK...`) on the Chrome Web Store, downloads the raw `.crx` payload, and unpacks its contents into `./zotero_extension/`.
- **`save_to_zotero()`**: Starts an `async_playwright` session pointing to a local profile folder (`./chrome_profile/`). Once the page reaches network idle, the extension's translator scripts should be in place, so the script fires the platform-specific shortcut for "Save to Zotero" and then waits a fixed 10 seconds for the backend download to finish.
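
The two functions above can be approximated with the sketch below. It is not the script's real implementation: `crx_to_zip_bytes`, `unpack_crx`, and `save_current_page` are hypothetical names, the CRX handling assumes the file is a ZIP archive preceded by a `Cr24` header (version 2 or 3 layout), and `context` is assumed to be a Playwright `BrowserContext` that already has the extension loaded. Downloading the `.crx` is left out on purpose.

```python
import io
import struct
import zipfile


def crx_to_zip_bytes(crx_bytes):
    """Strip the CRX header and return the embedded ZIP archive bytes."""
    if crx_bytes[:4] != b"Cr24":
        raise ValueError("missing 'Cr24' magic; not a CRX file")
    (version,) = struct.unpack_from("<I", crx_bytes, 4)
    if version == 3:
        # CRX3: magic, version, header length, header, then the ZIP data.
        (header_len,) = struct.unpack_from("<I", crx_bytes, 8)
        zip_start = 12 + header_len
    elif version == 2:
        # CRX2: magic, version, key length, signature length, key, signature, ZIP.
        key_len, sig_len = struct.unpack_from("<II", crx_bytes, 8)
        zip_start = 16 + key_len + sig_len
    else:
        raise ValueError(f"unsupported CRX version: {version}")
    return crx_bytes[zip_start:]


def unpack_crx(crx_bytes, dest_dir):
    """Extract the extension files into dest_dir."""
    with zipfile.ZipFile(io.BytesIO(crx_to_zip_bytes(crx_bytes))) as archive:
        archive.extractall(dest_dir)


async def save_current_page(context, url, shortcut, wait_ms=10_000):
    """Open url, trigger the save shortcut, then wait a fixed interval."""
    page = await context.new_page()
    # Close every other tab, such as setup tabs opened on first run.
    for other in list(context.pages):
        if other is not page:
            await other.close()
    await page.goto(url, wait_until="networkidle")
    await page.bring_to_front()
    await page.keyboard.press(shortcut)
    # Fixed wait; there is no signal here that the save actually completed.
    await page.wait_for_timeout(wait_ms)
```

A fixed wait is simple but blind: a slow PDF download may need more than 10 seconds, so `wait_ms` is exposed as a parameter.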