amem.sh
Shared knowledge for agents and humans.
amem is a local-first knowledge engine. It captures references (arXiv papers, PDFs, web pages), compiles them into structured wiki notes with SHA-256 provenance, and serves them to AI agents over MCP. Everything runs on your Mac — no cloud.
Why amem
Frontier models write better papers when they can cite real sources. Humans maintain better notes when capture is friction-free. amem serves both audiences from one knowledge base:
- For agents: MCP tools (amem_capture, amem_compile, amem_cite, amem_recall) that provide grounded citations with verifiable provenance
- For humans: a CLI and a Chrome extension that drop into your existing workflow, storing everything as readable markdown under ~/.amem/
Relationship to aide.sh
aide orchestrates agents; amem gives them memory. The two are complementary:
| Layer | Tool | Concern |
|---|---|---|
| Orchestration | aide.sh | Dispatch, budgets, teams |
| Knowledge | amem.sh | Capture, compile, recall |
aide can use amem as its memory sync method ([sync.memory] method = "amem" in aide.toml).
Quick links
- Install
- Quick Start
- MCP Tools
- GitHub: amem-sh · amem-clipper · amem-site · amem-hq
Installation
Prerequisites
- macOS (primary target) or Linux
- Rust toolchain (cargo)
- Ollama running locally, required for the compile step
- pdftotext (from poppler): brew install poppler
Install the CLI
cargo install --git https://github.com/yiidtw/amem-sh
Or from a local checkout:
git clone git@github.com:yiidtw/amem-sh.git
cd amem-sh
cargo install --path .
Verify
amem --version
amem --help
Register with Claude Code
cargo install does not auto-register the MCP server. Run this once:
amem mcp install
This registers amem under your user scope (~/.claude.json). After restarting Claude Code, /mcp will show amem · ✔ connected, giving any agent access to amem_capture, amem_compile, amem_cite, amem_recall.
The one-liner shell installer (install.sh) runs this automatically if claude is on PATH.
To undo: amem mcp uninstall.
Pull an Ollama model
ollama pull llama3.1
Override the default model with AMEM_OLLAMA_MODEL=<name> if needed.
Next
- Quick Start — capture your first paper
- Concepts — how amem thinks
Quick Start
Capture a paper, compile it into a wiki note, and recall it.
1. Capture
amem capture https://arxiv.org/abs/1706.03762
Downloads the PDF to ~/.amem/raw/ and prints a cite key (e.g., vaswani2017attention).
2. Compile
amem compile vaswani2017attention
Parses the PDF, chunks it with SHA-256 provenance, runs Ollama paraphrase passes, and writes ~/.amem/wiki/{ts}_vaswani2017attention.md.
3. Recall
amem recall "attention mechanism"
Grep-searches your wiki and returns excerpts with cite keys.
4. Cite
amem cite vaswani2017attention --format bibtex
Prints a formatted citation.
5. Hook it up to Claude
claude mcp add amem -- amem mcp serve
Then ask Claude: “Cite vaswani2017attention in APA format using the amem MCP server.”
Next
- MCP Tools — the four tools agents use
- Storage Layout: what’s in ~/.amem/
Concepts
Knowledge, not chat logs
Most “second brain” tools store a stream of captures — articles, notes, clippings — and hope you can find them later. amem does something different: it compiles captures into wiki notes with verbatim quotes and verifiable provenance. The wiki is the product; the raw captures are just source material.
This follows Andrej Karpathy’s recommendation of maintaining a personal wiki as the substrate for long-term thinking.
Dual audience
amem serves both agents (over MCP) and humans (via CLI + extension) from the same store. An agent citing a paper sees the same markdown a human sees. There’s no hidden “agent memory” that drifts from what’s on disk.
Provenance by construction
Every chunk carries a SHA-256 hash of its source. If the original file changes, amem verify detects drift. Citations stay grounded.
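A minimal sketch of the idea, not amem's actual implementation: each chunk stores the SHA-256 of the source it was cut from, and a drift check re-hashes the source and compares. The `Chunk` type, the fixed-size chunking, and the function names are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    index: int
    text: str
    source_sha256: str  # hash of the whole source this chunk was cut from


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def chunk_source(source: bytes, size: int = 200) -> list:
    """Split the decoded source into fixed-size chunks tagged with the source hash."""
    digest = sha256_hex(source)
    text = source.decode("utf-8", errors="replace")
    return [
        Chunk(i, text[start:start + size], digest)
        for i, start in enumerate(range(0, len(text), size))
    ]


def has_drifted(chunk: Chunk, current_source: bytes) -> bool:
    """True when the source on disk no longer matches the hash the chunk recorded."""
    return sha256_hex(current_source) != chunk.source_sha256
```

Hashing the whole source rather than each chunk keeps the check cheap: one hash per file tells you whether any citation into that file can still be trusted.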
Offline-first
Zero cloud dependencies by default. Ollama runs locally. The only network calls are to fetch papers from their public URLs (arXiv, DOI resolvers). You can operate amem entirely air-gapped after initial capture.
Three interfaces
- CLI: amem capture, amem compile, amem recall, amem cite
- MCP server: amem_capture, amem_compile, amem_cite, amem_recall for agents
- Chrome extension: one-click web page capture + self-recorded demos
MCP Tools
amem exposes four MCP tools over stdio. Start the server with amem mcp serve.
amem_capture
Download a paper and generate a cite key.
Input: url (string) — arXiv URL, DOI, PDF URL, or local file path
Output: cite_key (string), raw_path (string)
amem_compile
Parse + chunk + paraphrase a captured source into a wiki note.
Input: cite_key (string)
Output: wiki_path (string), chunk_count (number)
amem_cite
Format a citation in a supported style.
Input: cite_key (string), format (string, one of bibtex / apa / mla / chicago / ieee)
Output: citation (string)
amem_recall
Search the wiki for matching chunks.
Input: query (string), limit (number, optional, default 10)
Output: list of {cite_key, excerpt, sha256, score}
Register with Claude
claude mcp add amem -- amem mcp serve
Register with other MCP clients
Any MCP client that supports stdio transport works. Point it at the amem mcp serve command.
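For clients without a ready-made MCP library, the sketch below shows roughly what a tool invocation looks like on the wire. It assumes the standard MCP `tools/call` JSON-RPC 2.0 request shape; the helper name `tools_call` is hypothetical, and a real client must complete the MCP `initialize` handshake before sending tool calls.

```python
import json


def tools_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a single line of JSON-RPC 2.0."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # Compact separators keep the message on one line for the stdio transport.
    return json.dumps(message, separators=(",", ":"))


line = tools_call(1, "amem_recall", {"query": "attention mechanism", "limit": 3})
```

Writing `line` plus a newline to the stdin of an `amem mcp serve` process would be the next step; the response arrives as JSON on its stdout.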
Storage Layout
Everything lives under ~/.amem/.
~/.amem/
├── raw/ # original captures
│ ├── vaswani2017attention.pdf
│ └── ...
├── wiki/ # compiled notes
│ ├── 20260418_vaswani2017attention.md
│ └── ...
└── index.md # auto-maintained TOC
raw/
Original files exactly as downloaded. amem verify re-hashes against these.
wiki/
Compiled markdown notes. One file per cite key. The filename prefix is the compile timestamp, so recompiles preserve history (if you want versioning, you can also run git init inside ~/.amem/wiki, then git add and git commit).
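Because the timestamp is a filename prefix, finding the most recent note for each cite key is a matter of sorting names. The sketch below assumes fixed-width, digits-only timestamps such as `20260418`; the function name `latest_notes` is hypothetical.

```python
from pathlib import Path


def latest_notes(wiki_dir: Path) -> dict:
    """Map each cite key to the path of its most recently compiled note."""
    latest = {}
    for path in sorted(wiki_dir.glob("*_*.md")):
        timestamp, _, cite_key = path.stem.partition("_")
        if not timestamp.isdigit():
            continue  # not a {timestamp}_{cite_key}.md file
        # sorted() visits older timestamps first, so newer files overwrite older ones.
        latest[cite_key] = path
    return latest
```

If the timestamps were not all the same width, lexicographic order would no longer match chronological order and the prefix would have to be parsed instead.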
index.md
Auto-regenerated on every amem compile — a flat list of all cite keys with paths. Never edit by hand.
Custom location
Override with AMEM_HOME=<path> (planned — not yet wired up).
Citation Formats
amem cite <cite_key> --format <fmt> supports:
| Format | Flag | Use |
|---|---|---|
| BibTeX | bibtex | LaTeX papers |
| APA | apa | Social sciences |
| MLA | mla | Humanities |
| Chicago | chicago | History, arts |
| IEEE | ieee | Engineering |
Default: bibtex.
Example
$ amem cite vaswani2017attention --format apa
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N.,
Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need.
Metadata sources
- arXiv API (for arXiv papers)
- CrossRef (for DOIs)
- Extracted from PDF metadata as fallback
amem Clipper (Chrome Extension)
amem Clipper is the browser-side companion to the native amem CLI — a Chrome MV3 side-panel extension for capturing web pages into your amem knowledge base. Positioning and install flow are defined by RFC-001.
Status
Bridge mode (Day 2 — in progress): extension talks to amem locally over WebSocket on port 7600. Extension captures → bridge forwards → amem stores.
Standalone mode (Day 3 — planned): when amem isn’t running, the extension falls back to Google Drive backup. Requires a new OAuth client (not shipped yet).
Install
The extension will be published on the Chrome Web Store as a new listing under the same developer account as crossmem (the v0 prototype). crossmem remains published and unchanged.
Published URL will appear here after approval.
Self-recording workflow
The extension ships with a self-recording skeleton: chrome.runtime.sendMessage({cmd: 'start_recording'}) triggers a tabCapture session via the offscreen document, producing amem-recording-<iso>.webm in Downloads.
This is how amem produces its own demo videos — the extension records itself being used. See Self-Recording Workflow.
Self-Recording Workflow
One of amem’s foundational features: amem is its own best demo. The extension records itself being used, producing the marketing video and Chrome Web Store listing assets automatically.
How it works
- An orchestrator (human or agent) sends {cmd: 'start_recording'} to the extension’s background service worker.
- The background worker opens an offscreen.html document with a MediaRecorder consuming tabCapture.
- The agent then drives the extension UI through the bridge: clicking the side panel, capturing pages, showing the wiki.
- {cmd: 'stop_recording'} finalizes the .webm and saves it to Downloads as amem-recording-<iso>.webm.
Why this matters
- Demos stay in sync with reality — the recording shows the current UI, not a stale screenshot
- Chrome Web Store listings can be refreshed in minutes
- Agents learn to operate amem by watching their own recordings
Status
The recording skeleton is shipped. The agent-driven recording scenarios are Day 2 work. The complete workflow ships alongside the Chrome Web Store submission.
amem capture
Download a paper or PDF into ~/.amem/raw/ and generate a cite key.
Usage
amem capture <url> [--doi <doi>] [--cite-key <key>]
Arguments
- <url>: arXiv URL, DOI, PDF URL, or local file path
- --doi <doi>: override DOI lookup (useful when a PDF URL doesn’t resolve)
- --cite-key <key>: override the auto-generated cite key
Examples
amem capture https://arxiv.org/abs/1706.03762
amem capture https://example.com/paper.pdf
amem capture 10.1038/nature14539 --doi 10.1038/nature14539
amem capture ./local-paper.pdf --cite-key smith2024local
Output
Prints the cite key on success. Exits non-zero on failure.
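The examples in these docs (vaswani2017attention, bahdanau2014neural) are consistent with the common "surname + year + first significant title word" convention. The sketch below illustrates that convention only; it is an assumption, not a description of how amem derives keys, and `make_cite_key` and the stopword list are hypothetical.

```python
import re
import unicodedata

# Hypothetical stopword list; words skipped when picking the title word.
STOPWORDS = {"a", "an", "the", "on", "of", "in", "for", "to", "and", "is"}


def _ascii_lower(value: str) -> str:
    """Strip accents and lowercase, e.g. 'Łukasz' style names become plain ASCII."""
    normalized = unicodedata.normalize("NFKD", value)
    return normalized.encode("ascii", "ignore").decode("ascii").lower()


def make_cite_key(first_author_surname: str, year: int, title: str) -> str:
    surname = re.sub(r"[^a-z0-9]", "", _ascii_lower(first_author_surname))
    words = re.findall(r"[a-z0-9]+", _ascii_lower(title))
    first_word = next((w for w in words if w not in STOPWORDS), "")
    return f"{surname}{year}{first_word}"
```

When the generated key collides or reads poorly, `--cite-key` is the documented way to override it.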
See also
- amem compile — next step after capture
amem compile
Parse a captured source, chunk with SHA-256 provenance, run Ollama paraphrase, and emit a wiki note.
Usage
amem compile <cite_key>
Environment
- AMEM_OLLAMA_MODEL: override the paraphrase model (default: llama3.1)
Output
Writes ~/.amem/wiki/{timestamp}_{cite_key}.md and updates ~/.amem/index.md.
Example
$ amem compile vaswani2017attention
Chunked into 47 chunks. Ollama paraphrase 47/47 ok.
Wrote ~/.amem/wiki/20260418_vaswani2017attention.md
See also
- amem recall — search compiled notes
amem recall
Grep-search compiled wiki notes and return excerpts.
Usage
amem recall <query> [--limit N]
Arguments
- <query>: free-text search term
- --limit N: max results (default 10)
Example
$ amem recall "attention mechanism" --limit 3
vaswani2017attention (sha256:a1b2c3…): "…the Transformer, based solely on attention mechanisms…"
bahdanau2014neural (sha256:d4e5f6…): "…a neural machine translation model learns to pay attention to…"
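To make "grep-search" concrete, here is a rough sketch of a case-insensitive substring search over a wiki directory that returns one excerpt per matching note. It is an illustration under assumptions: the function name, the excerpt width, and taking the cite key from the `{timestamp}_{cite_key}.md` filename are all hypothetical choices, and the real command also reports a SHA-256 per hit.

```python
from pathlib import Path


def recall(wiki_dir: Path, query: str, limit: int = 10, width: int = 40) -> list:
    """Return up to `limit` hits, each with a cite key and a short excerpt."""
    needle = query.lower()
    hits = []
    for path in sorted(wiki_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8")
        pos = text.lower().find(needle)
        if pos == -1:
            continue
        start = max(0, pos - width)
        end = min(len(text), pos + len(needle) + width)
        # Cite key is whatever follows the first underscore in the filename.
        cite_key = path.stem.partition("_")[2] or path.stem
        hits.append({"cite_key": cite_key, "excerpt": text[start:end].strip()})
        if len(hits) >= limit:
            break
    return hits
```

Since the notes are plain markdown, the same lookup can be reproduced by hand with grep over ~/.amem/wiki/.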
See also
- MCP Tools — same backend exposed to agents
amem cite
Print a formatted citation for a captured source.
Usage
amem cite <cite_key> [--format bibtex|apa|mla|chicago|ieee]
Default format: bibtex.
See also
- Citation Formats — full format reference
amem mcp
Subcommands for the MCP stdio server and its Claude Code registration.
amem mcp serve
Start the MCP stdio server. Exposes amem_capture, amem_compile, amem_cite, amem_recall to any MCP client.
amem mcp serve
amem mcp install
Register the current amem binary with Claude Code at user scope. Idempotent — safe to re-run; it removes any stale registration first. Requires claude on PATH.
amem mcp install
install.sh calls this automatically if Claude Code is installed.
amem mcp uninstall
Remove the registration from Claude Code.
amem mcp uninstall
See also
- MCP Tools — tool schemas
- MCP Protocol — implementation details
Philosophy
Wiki as substrate
Karpathy argues that a personal wiki — verbatim quotes, paraphrase, cross-links — is the durable substrate of long-term thinking. Apps come and go; markdown with provenance stays readable for decades.
amem treats the wiki as the product, not a byproduct. Capture is just staging; compile is the commitment.
Offline-first, always
Cloud dependencies are a liability. amem runs entirely on a user’s Mac: Ollama for LLM, pdftotext for PDF parsing, local disk for storage. The only network calls fetch papers from their canonical URLs.
This matters for three reasons:
- Sovereignty — your knowledge base doesn’t evaporate when a vendor pivots
- Speed — local I/O beats round-trips
- Reproducibility — SHA-256 provenance only means something if the source sits next to the hash
Dual audience
MCP for agents, CLI + extension for humans — same store, one source of truth. If an agent cites a chunk, a human can read that exact chunk by grep-ing ~/.amem/wiki/.
Complement, don’t compete
aide orchestrates agents. amem gives them memory. Neither duplicates the other. [sync.memory] method = "amem" in an aide.toml is the contract.
No legacy
amem is a clean v1 rewrite of crossmem. No code was ported verbatim; every module was reconsidered against these principles. crossmem remains published as v0 for historical continuity.
MCP Protocol
amem implements the Model Context Protocol over stdio using the rmcp Rust crate.
Transport
- Stdio only (Day 1)
- HTTP/WebSocket bridge for the Chrome extension is Day 2 work
Server identification
- server_name: amem
- server_version: matches the crate version
Tools
See MCP Tools for input/output schemas.
Error conventions
- Malformed inputs → InvalidParams
- Missing cite_key → ResourceNotFound
- Ollama unreachable → Internal with hint start ollama
- File IO errors → Internal with the OS error
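The conventions above amount to a small classification function. The sketch below mirrors them using Python exception types as stand-ins for the server's internal failures; the exception-to-error mapping and the names `McpError` and `classify` are assumptions for illustration, not the rmcp API.

```python
from enum import Enum


class McpError(Enum):
    INVALID_PARAMS = "InvalidParams"
    RESOURCE_NOT_FOUND = "ResourceNotFound"
    INTERNAL = "Internal"


def classify(exc: Exception) -> tuple:
    """Map a failure to (error kind, message) following the listed conventions."""
    if isinstance(exc, KeyError):  # stand-in for "cite_key not found"
        return McpError.RESOURCE_NOT_FOUND, f"unknown cite_key: {exc.args[0]}"
    # ConnectionError subclasses OSError, so it must be checked first.
    if isinstance(exc, ConnectionError):  # stand-in for "Ollama unreachable"
        return McpError.INTERNAL, "start ollama"
    if isinstance(exc, OSError):  # file IO errors carry the OS error text
        return McpError.INTERNAL, str(exc)
    if isinstance(exc, (TypeError, ValueError)):  # malformed inputs
        return McpError.INVALID_PARAMS, str(exc)
    return McpError.INTERNAL, str(exc)
```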
Session model
Each amem mcp serve process is one session. No state shared between sessions beyond what’s on disk under ~/.amem/.
Relationship to aide.sh
amem and aide are complementary layers of the same system.
aide — orchestration
aide.sh dispatches work between agents. An Aidefile defines each agent’s persona, budget, triggers, and skills. aide dispatch <agent> <task> spawns an isolated Claude Code process and returns a bounded summary.
amem — knowledge
amem captures, compiles, and serves references. An agent running under aide can call amem_recall over MCP to ground its reasoning in real sources.
Integration point
In aide.toml:
[sync.memory]
method = "amem"
conflict = "causal"
When this is set, aide uses amem as its cross-agent memory substrate. Agents share a wiki; a fact learned by one agent is available to all.
Division of labor
| Concern | aide | amem |
|---|---|---|
| Which agent runs? | ✓ | — |
| How much budget? | ✓ | — |
| What does the agent read? | — | ✓ |
| Where are citations stored? | — | ✓ |
| How are secrets gated? | ✓ | — |
| What’s the SHA-256 of chunk 7? | — | ✓ |
Using one without the other
- amem alone: CLI + extension + MCP. Useful for human-led research and single-agent setups.
- aide alone: works fine, just without shared knowledge. Agents reason from their own seed context.
- Both together: the full stack.
RFC-001 — Bridge-first architecture + settings/feature-flags
- Status: Draft
- Authors: @yiidtw
- Created: 2026-04-21
- Supersedes: SPEC.md § Roadmap Day 3 item “standalone mode”
- Related: RFC-002 (RSS), amem-sh youtube pipeline
TL;DR
Reposition the Chrome extension as amem Clipper (working name, not yet branded) — a thin companion to the native amem binary, explicitly modelled on the Apple Watch → iPhone relationship. Drop standalone mode from the roadmap: the binary is a hard prerequisite. When the bridge is unreachable or a feature flag is off, the relevant UI is grayed out in place with a single-click enable/install affordance, not hidden. Heavy capture features (YouTube, RSS, Drive) are feature-flagged in ~/.amem/config.toml and lazy-loaded on demand.
Positioning and naming
The extension and the CLI are not peers. The CLI is amem; the extension is a peripheral that surfaces two capture moments (a browser tab, a YouTube video) that would otherwise require copying a URL into a terminal. Everything downstream of capture — storage, compile, transcription, MCP, recall — lives in the binary.
Working name: amem Clipper. Inherits the Evernote Web Clipper lineage (users know what “clipper” means), avoids collision with the internal “bridge” process name (amem-bridge server on WS 7600), and leaves the brand “amem” attached to the core product rather than the peripheral. The name is a placeholder — final branding lands with the CWS submission.
Product model:
| Layer | Name | Role |
|---|---|---|
| Core | amem (CLI + library + MCP) | The product. Storage, compile, recall, agent API. |
| Peripheral | amem Clipper (Chrome extension) | Capture UI for the browser. Cannot function without the core. |
| Internal | amem-bridge (the WS 7600 process) | Implementation detail — users rarely name this. |
This positioning reshapes every downstream decision:
- “Does the extension need feature X?” → Only if feature X is a capture moment. Storage, recall, etc. always live server-side.
- “What happens with no binary?” → Clipper is bricked, like an Apple Watch with no paired iPhone. Frontload install; don’t half-ship.
- “Where do settings live?” → In ~/.amem/config.toml on disk. Clipper renders them via a bridge RPC; it never holds authoritative state.
Motivation
The original SPEC envisioned two extension modes:
| Mode | Day | Requirement |
|---|---|---|
| Bridge | 2 | Native amem binary + WS 7600 |
| Standalone | 3 | No native binary; cloud sync via Drive |
After shipping the YouTube pipeline we hit three things that make standalone look worse than we expected:
- Standalone is structurally crippled. Compile requires Ollama, transcription requires whisper-rs — neither runs in a Chrome MV3 extension. A standalone capture can store URLs but cannot compile, so the wiki never builds. Agent-side MCP is also dark because MCP is native-only.
- Feature cost is real. YouTube alone needs yt-dlp (~20 MB), ffmpeg (~60 MB), whisper model (75 MB–3 GB depending on size). Bundling all of this into default install breaks “offline-first with zero cloud dependencies by default” by shifting the pain from network to disk.
- Dual code paths are a maintenance tax. Standalone + bridge would mean two storage backends, two capture pipelines, two sets of bugs. The SPEC’s principle 4 (“complement aide, don’t duplicate”) applies to our own internals too.
Meanwhile, bridge mode is already the richer experience. A 30-second curl amem.sh/install | sh is less friction than a crippled standalone fork.
Proposal
1. Bridge is the only mode — UI degrades by graying out
Clipper on cold-start pings ws://127.0.0.1:7600/status. Based on the response, individual UI regions render as enabled, gray-disabled with a one-click enable, or gray-disabled with install CTA:
| State | UI for capture-web | UI for YouTube | UI for RSS | Global banner |
|---|---|---|---|---|
| Bridge unreachable | gray, tooltip “amem not running” | gray | gray | “Install amem → curl amem.sh/install \| sh” with copy button |
| Bridge OK, all features off | enabled | gray, inline “Enable YouTube (~95 MB)” button | gray, inline “Enable RSS” button | — |
| Bridge OK, YT enabled | enabled | enabled | gray + enable | — |
| Bridge OK, all on | enabled | enabled | enabled | — |
Rationale: grayed controls are discoverable (user sees the feature exists, understands why it’s off) and honest (no hidden states). All-or-nothing install cards punish curious first-run users; per-feature gray-out is the Apple Watch “grey watch face when disconnected” pattern the positioning promises.
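The state table reduces to a small pure function, sketched below as one possible reading of it. The region names, the `Control` enum, and `ui_state` are hypothetical; the only inputs are whether the bridge answered the status probe and the `[features]` flags.

```python
from enum import Enum


class Control(Enum):
    ENABLED = "enabled"
    GRAY_ENABLE = "gray, inline enable button"
    GRAY_NO_BRIDGE = "gray, amem not running"


def ui_state(bridge_ok: bool, features: dict) -> dict:
    """Decide how each UI region renders for a given bridge and flag state."""
    regions = ("capture_web", "youtube", "rss")
    if not bridge_ok:
        # Nothing works without the binary; the global banner offers install.
        return {region: Control.GRAY_NO_BRIDGE for region in regions}
    state = {"capture_web": Control.ENABLED}
    for name in ("youtube", "rss"):
        on = features.get(name, False)
        state[name] = Control.ENABLED if on else Control.GRAY_ENABLE
    return state
```

Keeping this logic free of side effects makes every row of the table a one-line unit test.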
2. Feature flags live in ~/.amem/config.toml
Clipper renders a Settings page that maps to keys in this file via a new bridge RPC (settings_get / settings_set). The file is the single source of truth — both CLI and Clipper read/write the same keys. Clipper holds no authoritative config of its own, consistent with the peripheral positioning.
# ~/.amem/config.toml
version = 1
[features]
youtube = false # enables YT capture + compile (lazy-downloads yt-dlp + whisper model)
rss = false # enables RSS subscription ingestion (see RFC-002)
drive = false # enables Google Drive backup (Day 3)
[youtube]
whisper_model = "tiny.en" # tiny.en | base.en | small.en | medium.en
[bridge]
host = "127.0.0.1" # MUST be loopback (see Security)
port = 7600
token_file = "~/.amem/bridge.token"
3. Lazy-load on feature enable
Enabling a flag from Clipper or CLI triggers a setup routine:
amem youtube setup # CLI: downloads yt-dlp + whisper model + checks ffmpeg
amem rss setup # (RFC-002)
amem drive setup # (Day 3)
Clipper setup button → bridge RPC feature_setup({name}) → server runs the corresponding amem <name> setup, streams progress back over WS so Clipper can show a progress bar inline next to the (still grayed) control.
Graceful degradation in core flows. If a user runs amem capture <youtube-url> when features.youtube = false, the CLI prints:
YouTube capture is not enabled. To turn it on:
amem youtube setup
This will download yt-dlp (~20 MB) and the tiny.en whisper model (~75 MB).
The MCP tool amem_capture returns an analogous structured error, so agents can surface it to their user.
4. Bridge auto-start
On first install, amem install (the curl|sh script) registers a per-user background service:
- macOS: launchctl user agent (~/Library/LaunchAgents/sh.amem.bridge.plist)
- Linux: systemctl --user unit (~/.config/systemd/user/amem-bridge.service)
- Windows: Task Scheduler at-logon task (deferred; Windows support is a follow-up)
Goal: after first-run setup, the bridge is as available as Ollama is today — it just runs.
Security
Bridge-always means a persistent localhost WebSocket. Three defences, all MUST land before the “always on” posture ships:
| Defence | Mechanism |
|---|---|
| Loopback binding | Server binds 127.0.0.1, never 0.0.0.0. Reject --bind CLI flags that widen this. |
| Origin allow-list | WS handshake rejects connections whose Origin: header is not chrome-extension://<amem-clipper-prod-id> (production ID) or chrome-extension://<amem-clipper-dev-id> (dev build). |
| Token auth | On bridge start we mint a 32-byte random token to ~/.amem/bridge.token (mode 0600). Extension retrieves it via native-messaging handshake at install time. All WS messages must carry {"token": "..."} in their envelope. Tokens rotate on bridge restart. |
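The token and origin defences can be sketched in a few lines. This is an illustration under assumptions, not the bridge's code: the function names are hypothetical, the token is hex-encoded for convenience, and the allowed origin in any real deployment would be the extension IDs the table leaves as placeholders.

```python
import hmac
import os
import secrets
from pathlib import Path


def mint_token(path: Path) -> str:
    """Write a fresh 32-byte random token (hex-encoded) readable only by the owner."""
    token = secrets.token_hex(32)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed
    return token


def accept(origin, presented, expected: str, allowed_origins: set) -> bool:
    """Admit a connection only with an allow-listed Origin and the right token."""
    if origin not in allowed_origins:
        return False
    if not isinstance(presented, str):
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Calling `mint_token` on every bridge start gives the rotation behaviour the table describes.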
Threat model
| Threat | Impact | Mitigation |
|---|---|---|
| Another local program connects to WS | Could trigger amem_capture → write files to ~/.amem/raw/ | Token auth rejects clients that cannot read ~/.amem/bridge.token (mode 0600); the origin check filters browser-originated connections |
| Disk-fill DoS via repeated capture | Fill user’s disk | Rate-limit captures per minute; refuse when ~/.amem/ exceeds configurable quota |
| Malicious browser extension connects as us | Impersonates our extension_id | Chrome sets Origin: itself, so an extension cannot claim a different extension_id |
| RCE via yt-dlp / ffmpeg CVE | Arbitrary code execution | Use pinned versions, track security advisories; same posture as Ollama |
| Prompt injection in captured content | Poisons MCP amem_recall output | Same risk as today’s arxiv/PDF pipeline; not new from bridge-always |
Net risk: slightly higher than CLI-only (persistent WS endpoint exists), lower than an HTTP server accepting remote connections. Comparable to VS Code’s language-server loopback.
Migration
- SPEC.md §Roadmap: strike Day 3 “standalone mode”; add “bridge security hardening + install polish” and “Drive backup” (Drive stays).
- amem-clipper (repo renamed from amem-extension 2026-04-21): delete any standalone-only code paths (none should exist yet; the Day 2 skeleton is bridge-only, so this is a no-op today). Rename the product surface to “amem Clipper” in README, store listing, manifest name, and UI chrome.
- docs: guide/extension.md renamed to guide/clipper.md; its “Standalone vs Bridge” section is being rewritten as “How install works”.
- README (amem-hq): clarify install-first story on all public pages, label the repo as “amem Clipper (Chrome MV3 extension)”.
Rejected alternatives
- “Pure cloud standalone” — extension + Drive only, no native binary. Breaks offline-first and agent-MCP. Also introduces OAuth complexity earlier than Day 3.
- Bundle everything in the default install. Ships ~3 GB of whisper models most users never use. Opposite of the lazy-load principle.
- Run whisper.wasm in the browser. Early 2026 performance is still 3–10× slower than native for base.en; model download in the extension also hits MV3 storage limits.
Concrete work
See GitHub issues linked from this RFC.
- amem youtube setup subcommand + graceful-degradation prompt in capture (amem-sh)
- Bridge: loopback binding + Origin: check + token auth (amem-sh)
- Bridge: auto-start service installers (macOS launchd, Linux systemd) (amem-sh)
- Clipper: bridge status probe, per-region gray-out UI, Settings page backed by config.toml via bridge RPC (amem-clipper)
- Clipper: rename product surface to “amem Clipper” (manifest, store listing copy, in-UI strings); repo slug already renamed to amem-clipper 2026-04-21 (amem-clipper)
- SPEC.md + docs: drop standalone, document bridge-first (amem-hq)
Open questions
- Do we treat Ollama as a similarly lazy-loaded feature? Arguably yes — PDF compile also blocks without it. Worth a follow-up RFC if so.
- Drive backup (Day 3): should it require Pro/auth once shipped, or stay free? Product decision, not in scope here.
RFC-002 — RSS / Atom subscription management
- Status: Draft
- Authors: @yiidtw
- Created: 2026-04-21
- Related: RFC-001 (feature flags, amem rss setup)
TL;DR
Add a first-class subscription layer on top of the existing capture pipeline. Users run amem sub add <feed> to follow a source (arxiv category, blog, YouTube channel); a polling loop fetches the feed, dedups by GUID, and routes each new item through the existing capture + (optional) compile flow. MCP exposes amem_sub_add / amem_sub_list so agents can manage the user’s reading queue.
Motivation
amem’s capture flow is reactive: it only runs when a human (or agent) hands it a URL. That makes it useless for tracking ongoing sources:
- Following arxiv cs.CL as new papers drop
- Karpathy / Willison / lesswrong blog posts
- A YouTube channel’s new uploads (YouTube publishes per-channel RSS natively)
Every knowledge worker we’ve talked to does some version of this manually today — Feedly/NetNewsWire for reading, then copy-paste URLs into whatever capture tool they use. amem can collapse both steps.
This also inverts the self-recording story: amem is designed for you to produce content into. RSS lets other people’s content flow in on the same rails, so the wiki grows continuously rather than only after active capture.
Proposal
1. Subscription storage
# ~/.amem/subscriptions.toml
version = 1
[[subscription]]
id = "arxiv-cs-cl"
url = "http://export.arxiv.org/rss/cs.CL"
title = "arXiv cs.CL (Computation and Language)"
auto_compile = false # capture-only by default; compile is opt-in
poll_minutes = 60
enabled = true
added_at = "2026-04-21T00:00:00Z"
last_polled = "2026-04-21T00:30:00Z"
[[subscription]]
id = "3b1b-channel"
url = "https://www.youtube.com/feeds/videos.xml?channel_id=UCYO_jab_esuFRV4b17AJtAw"
title = "3Blue1Brown"
auto_compile = true # small channel, OK to auto-transcribe
poll_minutes = 240
enabled = true
2. Dedup ledger
~/.amem/subscriptions/
ledger.jsonl # append-only, one JSON object per seen item
state/{sub_id}/last_etag # HTTP caching
Each ledger line:
{"sub_id":"3b1b-channel","guid":"yt:video:aircAruvnKk","captured_at":"2026-04-21T00:30:00Z","cite_key":"3blue1brown2017neural"}
Dedup is GUID-based. If a feed republishes an item (edit, repost), the existing capture wins; we don’t re-download.
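A sketch of the append-only ledger with GUID dedup, assuming the line format shown above. The helper names are hypothetical, and re-reading the whole ledger on every check is a simplification; a long-running poller would more likely keep the seen set in memory.

```python
import json
from pathlib import Path


def seen_guids(ledger: Path) -> set:
    """Collect every GUID already recorded in the JSONL ledger."""
    if not ledger.exists():
        return set()
    seen = set()
    with ledger.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                seen.add(json.loads(line)["guid"])
    return seen


def record_if_new(ledger: Path, entry: dict) -> bool:
    """Append the entry unless its GUID was seen before; the first capture wins."""
    if entry["guid"] in seen_guids(ledger):
        return False
    with ledger.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, separators=(",", ":")) + "\n")
    return True
```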
3. CLI
amem sub add <url> [--auto-compile] [--poll-minutes N] [--title "..."]
amem sub list [--json]
amem sub remove <id>
amem sub enable|disable <id>
amem sub fetch [<id>] # one-shot poll, honours etag
amem sub daemon # long-running poller (used by service unit)
amem rss setup # install daemon (macOS launchd / Linux systemd user)
4. Poll algorithm
For each enabled sub whose now - last_polled >= poll_minutes:
- GET feed with If-None-Match: {last_etag} and If-Modified-Since: {last_polled}
- 304 → update last_polled, skip
- 200 → parse via feed-rs, iterate items
- For each item not in ledger:
  - Route to existing cite::cmd_capture(item.link) (auto-picks arxiv / PDF / YouTube based on URL)
  - If auto_compile = true → also call the appropriate cmd_compile
  - Write ledger line
- Update ledger + state
Failures per-item don’t block the rest of the feed. Aggregate failures re-queue with exponential backoff (15 min → 2 h cap).
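The two timing rules here, the due check and the retry backoff, can be sketched as follows. Doubling per consecutive failure is an assumption; the text only fixes the 15-minute start and the 2-hour cap. Function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_due(last_polled, poll_minutes: int, now: datetime) -> bool:
    """A subscription is due when it was never polled or its interval has elapsed."""
    if last_polled is None:
        return True
    return now - last_polled >= timedelta(minutes=poll_minutes)


def backoff_minutes(consecutive_failures: int, base: int = 15, cap: int = 120) -> int:
    """Delay before retrying a failing feed: 15, 30, 60, then capped at 120 minutes."""
    if consecutive_failures < 1:
        return 0
    return min(cap, base * 2 ** (consecutive_failures - 1))


now = datetime(2026, 4, 21, 1, 30, tzinfo=timezone.utc)
last = datetime(2026, 4, 21, 0, 30, tzinfo=timezone.utc)
due = is_due(last, 60, now)
```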
5. MCP surface
amem_sub_add(url, auto_compile?) -> sub_id
amem_sub_list() -> [{ id, title, last_polled, enabled, ... }]
amem_sub_remove(id)
amem_sub_fetch(id?) -> {fetched: N, captured: M, errors: [...] }
This lets an agent maintain its own research feed without a human in the loop: “follow every arxiv paper that cites Vaswani 2017” becomes a single MCP call.
6. Gated behind features.rss
Disabled by default. amem rss setup enables it, installs the daemon, and writes features.rss = true to ~/.amem/config.toml (per RFC-001).
Non-goals
- Rich reader UI. amem is not Feedly. Reading lives in the wiki + amem recall. If people want visual unread counts, that belongs in an extension page, not the core.
- OPML import on day 1. Easy add later; skip for MVP to keep surface small.
- Arbitrary scheduling cron. poll_minutes is enough; cron-syntax scheduling is out of scope.
- Podcast audio-only feeds. These would need whisper anyway, so treat them as RFC-002b when YouTube pipeline is stable on more models.
Risks
| Risk | Mitigation |
|---|---|
| A popular arxiv category fills disk (dozens of papers/day) | poll_minutes default 120 + per-feed disk quota + user confirmation on first-time auto_compile = true |
| Feed publisher rate-limits us | Honour Retry-After, respect 429; back off to 6 h for repeat offenders |
| Duplicate captures when arxiv updates a paper’s version | Keep first ingest; subsequent versions append a note to the existing wiki entry rather than creating a new cite_key |
| RSS spec is loose — malformed feeds break parser | feed-rs handles common variants; log + skip malformed entries, do not abort the poll |
Concrete work
- Rust crate additions: feed-rs = "2", toml_edit = "0.22" (config writes preserve comments) (amem-sh)
- amem sub subcommand family (amem-sh)
- amem rss setup installer and amem sub daemon long-runner (amem-sh)
- MCP tools (amem-sh)
- SPEC.md: add subscription to the storage layout section (amem-hq)
- Docs: new guide/subscriptions.md page (amem-hq)
Extension UI for managing subscriptions is deferred — CLI first.
Privacy Policy
Last updated: 2026-04-21
amem is a local-first knowledge capture system. This document covers the Chrome extension (“amem Clipper”) and the native CLI (“amem”). Companion repos: amem-clipper, amem-sh.
What amem does
amem captures web pages, papers, and recordings into your personal, local-first knowledge base. Everything stays on your machine unless you explicitly opt into a sync destination.
Data collection
amem does not collect, transmit, or store any user data on external servers. There is no backend, no database, no analytics, no telemetry.
What amem Clipper accesses
- Active tab when you click the amem action, open the side panel, or pick a context-menu entry — to read the page title, URL, and (when asked) visible text.
- Local storage (chrome.storage.local): for capture history and preferences.
- Local WebSocket at ws://127.0.0.1:7600 (the amem bridge): only when running on your own machine, to write captures to your filesystem. No external network calls.
- Tab recording (tabCapture): only while you are actively recording from the side panel. Recordings are saved directly to your browser Downloads folder as .webm files.
What amem Clipper does NOT do
- Does not read your browsing history.
- Does not track your activity or behavior.
- Does not send data to any third-party server.
- Does not use Google Drive, OAuth, or any cloud provider in this release.
- Does not record audio.
Permissions explained
| Permission | Why |
|---|---|
| storage | Save capture history and settings locally in the browser. |
| activeTab | Access the current tab when you initiate a capture. |
| scripting | Inject capture helpers into the active tab when you press capture. |
| sidePanel | Render the amem side panel UI. |
| alarms | Periodically retry the local bridge connection. |
| tabs | Read the current tab’s URL and title for the capture record. |
| tabCapture | Record the active tab’s video stream (explicit user action only). |
| offscreen | Host MediaRecorder in an offscreen document (service workers can’t). |
| contextMenus | Offer right-click capture entries. |
| downloads | Save recordings to your Downloads folder as .webm files. |
| host_permissions: <all_urls> | Capture works on whatever page you are viewing. Nothing is read without an explicit capture action. |
amem CLI + crate
The native amem binary (installed via amem install, brew, or cargo) runs entirely on your machine. It reads and writes only ~/.amem/ and (optionally) paths you pass as arguments. It makes outbound HTTPS requests only when you run amem capture <url> — strictly to fetch the URL you asked for (e.g., arXiv PDF, YouTube audio via yt-dlp). No analytics, no telemetry.
Contact
Questions about this policy: open an issue at https://github.com/yiidtw/amem-clipper/issues or https://github.com/yiidtw/amem-sh/issues.