# Zotero Paper Fetcher (Automator)

This script automates saving an academic paper (or any URL) to Zotero using the official **Zotero Connector** Google Chrome extension, driven by Playwright in Python.

## Why this approach?

Zotero Connector is the official Chrome extension for saving items to Zotero, and the most robust way to capture high-quality metadata and full-text PDFs (when your proxy or site access allows). However, the legacy headless Chrome mode that browser automation usually relies on cannot load Chrome extensions.

This script works around the problem by:
1. Automatically downloading the latest Zotero Connector extension.
2. Unpacking it from its `.crx` format.
3. Launching Chromium through Playwright in `--headless=new` mode (which, unlike the old headless mode, does allow extensions) with a persistent user data directory.
4. Auto-closing setup tabs.
5. Emulating the official Zotero Connector keyboard shortcut (e.g. `Cmd+Shift+S` on Mac) once the page is fully loaded.

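
As a rough illustration of steps 3 and 5, the sketch below builds the Chromium flags for loading an unpacked extension and picks a shortcut string per platform. It is not the script's actual code: the helper names `build_launch_args`, `save_shortcut`, and `launch_with_extension` are hypothetical, and the `Control+Shift+S` fallback for non-Mac platforms is an assumption.

```python
import sys
from pathlib import Path


def build_launch_args(extension_dir: str, headed: bool = False) -> list:
    """Chromium flags that load a single unpacked extension (hypothetical helper)."""
    ext = str(Path(extension_dir).resolve())
    args = [
        f"--disable-extensions-except={ext}",
        f"--load-extension={ext}",
    ]
    if not headed:
        # New headless mode can run extensions; legacy headless cannot.
        args.append("--headless=new")
    return args


def save_shortcut(platform: str = sys.platform) -> str:
    """Playwright key string for the save shortcut (non-Mac value is assumed)."""
    modifier = "Meta" if platform == "darwin" else "Control"
    return f"{modifier}+Shift+S"


async def launch_with_extension(extension_dir: str, profile_dir: str, headed: bool = False):
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.async_api import async_playwright

    playwright = await async_playwright().start()
    context = await playwright.chromium.launch_persistent_context(
        profile_dir,
        headless=False,  # headless is requested via --headless=new instead
        args=build_launch_args(extension_dir, headed),
    )
    return playwright, context
```

Keeping the flag-building separate from the launch call makes it easy to print the arguments when an extension fails to load.
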
## Prerequisites

1. **Python 3.8+** installed.
2. **Zotero Desktop** running on your machine (the extension talks to the desktop app over a local HTTP connection, on port `23119` by default).

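
If a save silently does nothing, a quick check that Zotero Desktop is actually listening can help. The helper below is a minimal sketch using only the standard library; `is_port_open` is a hypothetical name, and the port is passed in by the caller rather than hard-coded.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Call it with `127.0.0.1` and the port your Zotero installation listens on before launching the browser, and exit early with a clear message if it returns `False`.
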
## Setup

First, initialize your virtual environment and install dependencies:

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
playwright install chromium
```

## Usage

Pass the URL of the paper you want to add to Zotero:

```bash
source venv/bin/activate
python zotero_automator.py "https://arxiv.org/abs/1706.03762"
```

To watch the browser while it works (helpful for debugging when a site requires a captcha or login, or just to verify the extension is working), pass the `--headed` flag:

```bash
python zotero_automator.py "https://arxiv.org/abs/1706.03762" --headed
```

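
For reference, a command line of this shape (one positional URL plus an optional `--headed` switch) can be parsed with `argparse` roughly as follows. This is a sketch of one way to do it, not necessarily how `zotero_automator.py` is written; `parse_args` is a hypothetical name.

```python
import argparse


def parse_args(argv=None):
    """Parse a URL and an optional --headed flag (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Save a URL to Zotero via the Zotero Connector."
    )
    parser.add_argument("url", help="URL of the paper or page to save")
    parser.add_argument(
        "--headed",
        action="store_true",
        help="show the browser window instead of running headless",
    )
    return parser.parse_args(argv)
```
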
## How It Works

- **`setup_extension()`**: Locates the Zotero extension on the Chrome Web Store by its `EKHAGK...` identifier, downloads the raw `.crx` payload, and unpacks the contents into `./zotero_extension/`.
- **`save_to_zotero()`**: Starts an `async_playwright` session pointing to a local profile folder (`./chrome_profile/`). The script waits for network idle so the extension can inject its translator scripts, then fires the platform-specific shortcut for "Save to Zotero" and waits 10 seconds for the backend download to finish.
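
The unpacking step in `setup_extension()` relies on the fact that a `.crx` file is a ZIP archive with a small binary header in front. The sketch below shows one way to strip that header, assuming the standard CRX2/CRX3 layouts; `crx_to_zip_bytes` and `unpack_crx` are hypothetical names, not the script's own functions.

```python
import io
import struct
import zipfile


def crx_to_zip_bytes(crx: bytes) -> bytes:
    """Strip the CRX header and return the embedded ZIP archive bytes."""
    if crx[:4] != b"Cr24":
        raise ValueError("not a CRX file (missing 'Cr24' magic)")
    (version,) = struct.unpack("<I", crx[4:8])
    if version == 3:
        # CRX3: magic, version, header length, header, then the ZIP.
        (header_len,) = struct.unpack("<I", crx[8:12])
        offset = 12 + header_len
    elif version == 2:
        # CRX2: magic, version, key length, signature length, key, signature, ZIP.
        key_len, sig_len = struct.unpack("<II", crx[8:16])
        offset = 16 + key_len + sig_len
    else:
        raise ValueError(f"unsupported CRX version: {version}")
    return crx[offset:]


def unpack_crx(crx: bytes, dest: str) -> None:
    """Extract the extension files from CRX bytes into the dest directory."""
    with zipfile.ZipFile(io.BytesIO(crx_to_zip_bytes(crx))) as archive:
        archive.extractall(dest)
```

For example, `unpack_crx(open("connector.crx", "rb").read(), "./zotero_extension")` would extract a downloaded payload, where `connector.crx` is a placeholder filename.
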