amem.sh

Shared knowledge for agents and humans.

amem is a local-first knowledge engine. It captures references (arXiv papers, PDFs, web pages), compiles them into structured wiki notes with SHA-256 provenance, and serves them to AI agents over MCP. Everything runs on your own machine (macOS first, Linux supported), with no cloud dependency.

Why amem

Frontier models write better papers when they can cite real sources. Humans maintain better notes when capture is friction-free. amem serves both audiences from one knowledge base:

For agents: MCP tools amem_capture, amem_compile, amem_cite, amem_recall — grounded citations with verifiable provenance
For humans: a CLI and a Chrome extension that drop into your existing workflow, storing everything as readable markdown under ~/.amem/

Relationship to aide.sh

aide orchestrates agents; amem gives them memory. The two are complementary:

Layer	Tool	Concern
Orchestration	aide.sh	Dispatch, budgets, teams
Knowledge	amem.sh	Capture, compile, recall

aide can use amem as its memory sync method ([sync.memory] method = "amem" in aide.toml).

Quick links

Install
Quick Start
MCP Tools
GitHub: amem-sh · amem-clipper · amem-site · amem-hq

Installation

Prerequisites

macOS (primary target) or Linux
Rust toolchain (cargo)
Ollama running locally — required for the compile step
pdftotext (from poppler) — brew install poppler

Install the CLI

cargo install --git https://github.com/yiidtw/amem-sh

Or from a local checkout:

git clone git@github.com:yiidtw/amem-sh.git
cd amem-sh
cargo install --path .

Verify

amem --version
amem --help

Register with Claude Code

cargo install does not auto-register the MCP server. Run this once:

amem mcp install

This registers amem under your user scope (~/.claude.json). After restarting Claude Code, /mcp will show amem · ✔ connected, giving any agent access to amem_capture, amem_compile, amem_cite, amem_recall.

The one-liner shell installer (install.sh) runs this automatically if claude is on PATH.

To undo: amem mcp uninstall.

Pull an Ollama model

ollama pull llama3.1

Override the default model with AMEM_OLLAMA_MODEL=<name> if needed.

Quick Start — capture your first paper
Concepts — how amem thinks

Quick Start

Capture a paper, compile it into a wiki note, and recall it.

1. Capture

amem capture https://arxiv.org/abs/1706.03762

Downloads the PDF to ~/.amem/raw/ and prints a cite key (e.g., vaswani2017attention).

2. Compile

amem compile vaswani2017attention

Parses the PDF, chunks it with SHA-256 provenance, runs Ollama paraphrase passes, and writes ~/.amem/wiki/{timestamp}_vaswani2017attention.md.

3. Recall

amem recall "attention mechanism"

Grep-searches your wiki and returns excerpts with cite keys.

4. Cite

amem cite vaswani2017attention --format bibtex

Prints a formatted citation.

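5. Hook it up to Claude

Skip this step if you already ran amem mcp install during installation; otherwise register the server manually:

claude mcp add amem -- amem mcp serve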

Then ask Claude: “Cite vaswani2017attention in APA format using the amem MCP server.”

MCP Tools — the four tools agents use
Storage Layout — what’s in ~/.amem/

Concepts

Knowledge, not chat logs

Most “second brain” tools store a stream of captures — articles, notes, clippings — and hope you can find them later. amem does something different: it compiles captures into wiki notes with verbatim quotes and verifiable provenance. The wiki is the product; the raw captures are just source material.

This follows Andrej Karpathy’s recommendation of maintaining a personal wiki as the substrate for long-term thinking.

Dual audience

amem serves both agents (over MCP) and humans (via CLI + extension) from the same store. An agent citing a paper sees the same markdown a human sees. There’s no hidden “agent memory” that drifts from what’s on disk.

Provenance by construction

Every chunk carries a SHA-256 hash of its source. If the original file changes, amem verify detects drift. Citations stay grounded.
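
As a rough illustration of the drift check, the sketch below hashes a source with SHA-256 and compares it against a recorded digest. The function names are hypothetical, and whether amem hashes whole files or individual chunks is not specified here; this only shows the general technique.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def has_drifted(recorded_hash: str, current_bytes: bytes) -> bool:
    """True when the source no longer matches the hash stored with a chunk."""
    return sha256_hex(current_bytes) != recorded_hash


original = b"The Transformer is based solely on attention mechanisms."
recorded = sha256_hex(original)  # stored alongside the chunk at compile time

unchanged = has_drifted(recorded, original)              # same bytes
edited = has_drifted(recorded, original + b" (edited)")  # source modified
```

Any single-byte change to the source produces a different digest, which is what lets a verify step flag drift without keeping a second copy of the file.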

Offline-first

Zero cloud dependencies by default. Ollama runs locally. The only network calls are to fetch papers from their public URLs (arXiv, DOI resolvers). You can operate amem entirely air-gapped after initial capture.

Three interfaces

CLI — amem capture, amem compile, amem recall, amem cite
MCP server — amem_capture, amem_compile, amem_cite, amem_recall for agents
Chrome extension — one-click web page capture + self-recorded demos

MCP Tools

amem exposes four MCP tools over stdio. Start the server with amem mcp serve.

amem_capture

Download a paper and generate a cite key.

Input: url (string) — arXiv URL, DOI, PDF URL, or local file path

Output: cite_key (string), raw_path (string)

amem_compile

Parse + chunk + paraphrase a captured source into a wiki note.

Input: cite_key (string)

Output: wiki_path (string), chunk_count (number)

amem_cite

Format a citation in a supported style.

Input: cite_key (string), format (string, one of bibtex / apa / mla / chicago / ieee)

Output: citation (string)

amem_recall

Search the wiki for matching chunks.

Input: query (string), limit (number, optional, default 10)

Output: list of {cite_key, excerpt, sha256, score}
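
For clients that speak MCP directly, a tool invocation is a JSON-RPC 2.0 request with the method tools/call. The sketch below only builds that message for amem_recall; the helper name is hypothetical, and a real client would first complete the MCP initialize handshake before sending it over stdio.

```python
import json


def tools_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build one JSON-RPC 2.0 `tools/call` request as a single line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Arguments follow the amem_recall input schema listed above.
request_line = tools_call(1, "amem_recall", {"query": "attention mechanism", "limit": 3})
```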

Register with Claude

claude mcp add amem -- amem mcp serve

Register with other MCP clients

Any MCP client that supports stdio transport works. Point it at the amem mcp serve command.

Storage Layout

Everything lives under ~/.amem/.

~/.amem/
├── raw/                          # original captures
│   ├── vaswani2017attention.pdf
│   └── ...
├── wiki/                         # compiled notes
│   ├── 20260418_vaswani2017attention.md
│   └── ...
└── index.md                      # auto-maintained TOC

raw/

Original files exactly as downloaded. amem verify re-hashes against these.

wiki/

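Compiled markdown notes. Each compile writes one file per cite key, and the filename prefix is the compile timestamp, so a recompile adds a new file instead of overwriting the previous one. For full versioning, run git init inside ~/.amem/wiki, then git add and git commit after each compile.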
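
A minimal sketch of working with this naming scheme: split each filename into its timestamp and cite key, then pick the most recent note per key. It assumes the timestamp contains no underscore and sorts lexicographically (as 20260418 does); the helper names are hypothetical.

```python
from collections import defaultdict


def split_note_name(filename: str) -> tuple[str, str]:
    """Split '{timestamp}_{cite_key}.md' into (timestamp, cite_key)."""
    stem = filename.removesuffix(".md")
    timestamp, _, cite_key = stem.partition("_")
    return timestamp, cite_key


def latest_notes(filenames: list[str]) -> dict[str, str]:
    """Map each cite key to the filename of its most recent compile."""
    stamps_by_key = defaultdict(list)
    for name in filenames:
        timestamp, cite_key = split_note_name(name)
        stamps_by_key[cite_key].append(timestamp)
    return {key: f"{max(stamps)}_{key}.md" for key, stamps in stamps_by_key.items()}


names = [
    "20260418_vaswani2017attention.md",
    "20260420_vaswani2017attention.md",  # a later recompile of the same source
    "20260419_bahdanau2014neural.md",
]
latest = latest_notes(names)
```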

index.md

Auto-regenerated on every amem compile — a flat list of all cite keys with paths. Never edit by hand.

Custom location

Override with AMEM_HOME=<path> (planned — not yet wired up).

Citation Formats

amem cite <cite_key> --format <fmt> supports:

Format	`--format` value	Use
BibTeX	`bibtex`	LaTeX papers
APA	`apa`	Social sciences
MLA	`mla`	Humanities
Chicago	`chicago`	History, arts
IEEE	`ieee`	Engineering

Default: bibtex.

Example

$ amem cite vaswani2017attention --format apa
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N.,
Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need.

Metadata sources

arXiv API (for arXiv papers)
CrossRef (for DOIs)
Extracted from PDF metadata as fallback

amem Clipper (Chrome Extension)

amem Clipper is the browser-side companion to the native amem CLI — a Chrome MV3 side-panel extension for capturing web pages into your amem knowledge base. Positioning and install flow are defined by RFC-001.

Status

Bridge mode (Day 2 — in progress): extension talks to amem locally over WebSocket on port 7600. Extension captures → bridge forwards → amem stores.

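Standalone mode (Day 3, planned in the original SPEC): when amem isn’t running, the extension would fall back to Google Drive backup. This requires a new OAuth client (not shipped yet). RFC-001 (draft) proposes dropping standalone mode and making the native binary a hard prerequisite.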

Install

The extension will be published on the Chrome Web Store as a new listing under the same developer account as crossmem (the v0 prototype). crossmem remains published and unchanged.

Published URL will appear here after approval.

Self-recording workflow

The extension ships with a self-recording skeleton: chrome.runtime.sendMessage({cmd: 'start_recording'}) triggers a tabCapture session via the offscreen document, producing amem-recording-<iso>.webm in Downloads.

This is how amem produces its own demo videos — the extension records itself being used. See Self-Recording Workflow.

Self-Recording Workflow

One of amem’s foundational features: amem is its own best demo. The extension records itself being used, producing the marketing video and Chrome Web Store listing assets automatically.

How it works

1. An orchestrator (human or agent) sends {cmd: 'start_recording'} to the extension’s background service worker.
2. The background worker opens an offscreen.html document with a MediaRecorder consuming tabCapture.
3. The agent then drives the extension UI through the bridge: clicking the side panel, capturing pages, showing the wiki.
4. {cmd: 'stop_recording'} finalizes the .webm and saves it to Downloads as amem-recording-<iso>.webm.

Why this matters

Demos stay in sync with reality — the recording shows the current UI, not a stale screenshot
Chrome Web Store listings can be refreshed in minutes
Agents learn to operate amem by watching their own recordings

Status

The recording skeleton is shipped. The agent-driven recording scenarios are Day 2 work. The complete workflow ships alongside the Chrome Web Store submission.

amem capture

Download a paper or PDF into ~/.amem/raw/ and generate a cite key.

Usage

amem capture <url> [--doi <doi>] [--cite-key <key>]

Arguments

<url> — arXiv URL, DOI, PDF URL, or local file path
--doi <doi> — override DOI lookup (useful when a PDF URL doesn’t resolve)
--cite-key <key> — override the auto-generated cite key

Examples

amem capture https://arxiv.org/abs/1706.03762
amem capture https://example.com/paper.pdf
amem capture 10.1038/nature14539 --doi 10.1038/nature14539
amem capture ./local-paper.pdf --cite-key smith2024local

Output

Prints the cite key on success. Exits non-zero on failure.
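
The cite keys shown in this documentation (vaswani2017attention, bahdanau2014neural) look like first-author surname + year + first significant title word. The sketch below reproduces that pattern; it is a guess inferred from the examples, not amem's documented algorithm, and the stopword list and function name are assumptions.

```python
import re

# Assumed stopword list; amem's real rules are not documented here.
STOPWORDS = {"a", "an", "the", "on", "of", "in", "for", "to", "and", "is"}


def make_cite_key(first_author_surname: str, year: int, title: str) -> str:
    """Build '<surname><year><first significant title word>' in lowercase."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    keyword = next((word for word in words if word not in STOPWORDS), "untitled")
    surname = re.sub(r"[^a-z0-9]", "", first_author_surname.lower())
    return f"{surname}{year}{keyword}"


key = make_cite_key("Vaswani", 2017, "Attention Is All You Need")
```

When the generated key is not what you want, --cite-key overrides it.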

amem compile

Parse a captured source, chunk with SHA-256 provenance, run Ollama paraphrase, and emit a wiki note.

Usage

amem compile <cite_key>

Environment

AMEM_OLLAMA_MODEL — override the paraphrase model (default: llama3.1)

Output

Writes ~/.amem/wiki/{timestamp}_{cite_key}.md and updates ~/.amem/index.md.

Example

$ amem compile vaswani2017attention
Chunked into 47 chunks. Ollama paraphrase 47/47 ok.
Wrote ~/.amem/wiki/20260418_vaswani2017attention.md

amem recall

Grep-search compiled wiki notes and return excerpts.

Usage

amem recall <query> [--limit N]

Arguments

<query> — free-text search term
--limit N — max results (default 10)

Example

$ amem recall "attention mechanism" --limit 3
vaswani2017attention (sha256:a1b2c3…): "…the Transformer, based solely on attention mechanisms…"
bahdanau2014neural  (sha256:d4e5f6…): "…a neural machine translation model learns to pay attention to…"
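
To make "grep-search" concrete, here is a small sketch of a case-insensitive substring search over a wiki directory. It assumes the {timestamp}_{cite_key}.md naming described under Storage Layout; the function is hypothetical and does not reproduce amem's actual ranking or hash output.

```python
import tempfile
from pathlib import Path


def recall(wiki_dir: Path, query: str, limit: int = 10) -> list[tuple[str, str]]:
    """Return up to `limit` (cite_key, matching_line) pairs."""
    needle = query.lower()
    hits = []
    for note in sorted(wiki_dir.glob("*.md")):
        cite_key = note.stem.partition("_")[2]  # drop the timestamp prefix
        for line in note.read_text(encoding="utf-8").splitlines():
            if needle in line.lower():
                hits.append((cite_key, line.strip()))
                if len(hits) >= limit:
                    return hits
    return hits


# Demo against a throwaway directory instead of ~/.amem/wiki.
with tempfile.TemporaryDirectory() as tmp:
    wiki = Path(tmp)
    (wiki / "20260418_vaswani2017attention.md").write_text(
        "The Transformer, based solely on attention mechanisms.\nUnrelated line.\n",
        encoding="utf-8",
    )
    results = recall(wiki, "Attention Mechanism", limit=3)
```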

amem cite

Print a formatted citation for a captured source.

Usage

amem cite <cite_key> [--format bibtex|apa|mla|chicago|ieee]

Default format: bibtex.

amem mcp

Subcommands for the MCP stdio server and its Claude Code registration.

amem mcp serve

Start the MCP stdio server. Exposes amem_capture, amem_compile, amem_cite, amem_recall to any MCP client.

amem mcp serve

amem mcp install

Register the current amem binary with Claude Code at user scope. Idempotent — safe to re-run; it removes any stale registration first. Requires claude on PATH.

amem mcp install

install.sh calls this automatically if Claude Code is installed.

amem mcp uninstall

Remove the registration from Claude Code.

amem mcp uninstall

Philosophy

Wiki as substrate

Karpathy argues that a personal wiki — verbatim quotes, paraphrase, cross-links — is the durable substrate of long-term thinking. Apps come and go; markdown with provenance stays readable for decades.

amem treats the wiki as the product, not a byproduct. Capture is just staging; compile is the commitment.

Offline-first, always

Cloud dependencies are a liability. amem runs entirely on the user’s own machine (macOS first, Linux supported): Ollama for LLM, pdftotext for PDF parsing, local disk for storage. The only network calls fetch papers from their canonical URLs.

This matters for three reasons:

Sovereignty — your knowledge base doesn’t evaporate when a vendor pivots
Speed — local I/O beats round-trips
Reproducibility — SHA-256 provenance only means something if the source sits next to the hash

Dual audience

MCP for agents, CLI + extension for humans — same store, one source of truth. If an agent cites a chunk, a human can read that exact chunk by grep-ing ~/.amem/wiki/.

Complement, don’t compete

aide orchestrates agents. amem gives them memory. Neither duplicates the other. [sync.memory] method = "amem" in an aide.toml is the contract.

No legacy

amem is a clean v1 rewrite of crossmem. No code was ported verbatim; every module was reconsidered against these principles. crossmem remains published as v0 for historical continuity.

MCP Protocol

amem implements the Model Context Protocol over stdio using the rmcp Rust crate.

Transport

Stdio only (Day 1)
HTTP/WebSocket bridge for the Chrome extension is Day 2 work

Server identification

server_name: amem
server_version: matches the crate version

Tools

See MCP Tools for input/output schemas.

Error conventions

Malformed inputs → InvalidParams
Unknown cite_key (no matching capture) → ResourceNotFound
Ollama unreachable → Internal, with the hint “start ollama”
File IO errors → Internal with the OS error
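
The conventions above amount to a small classification table. The sketch below expresses it in Python purely for illustration: the real server is written in Rust with rmcp, and the choice of Python exception types to stand in for each failure is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ToolError:
    kind: str      # one of the error kinds listed above
    message: str


def classify(exc: Exception) -> ToolError:
    """Map a failure to the error kind a client would see."""
    if isinstance(exc, (TypeError, ValueError)):
        return ToolError("InvalidParams", str(exc))
    if isinstance(exc, KeyError):
        return ToolError("ResourceNotFound", f"unknown cite_key: {exc.args[0]}")
    # ConnectionError is a subclass of OSError, so it must be checked first.
    if isinstance(exc, ConnectionError):
        return ToolError("Internal", "Ollama unreachable; hint: start ollama")
    if isinstance(exc, OSError):
        return ToolError("Internal", str(exc))
    return ToolError("Internal", repr(exc))


not_found = classify(KeyError("smith2024local"))
ollama_down = classify(ConnectionRefusedError())
```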

Session model

Each amem mcp serve process is one session. No state shared between sessions beyond what’s on disk under ~/.amem/.

Relationship to aide.sh

amem and aide are complementary layers of the same system.

aide — orchestration

aide.sh dispatches work between agents. An Aidefile defines each agent’s persona, budget, triggers, and skills. aide dispatch <agent> <task> spawns an isolated Claude Code process and returns a bounded summary.

amem — knowledge

amem captures, compiles, and serves references. An agent running under aide can call amem_recall over MCP to ground its reasoning in real sources.

Integration point

In aide.toml:

[sync.memory]
method = "amem"
conflict = "causal"

When this is set, aide uses amem as its cross-agent memory substrate. Agents share a wiki; a fact learned by one agent is available to all.

Division of labor

Concern	aide	amem
Which agent runs?	✓	—
How much budget?	✓	—
What does the agent read?	—	✓
Where are citations stored?	—	✓
How are secrets gated?	✓	—
What’s the SHA-256 of chunk 7?	—	✓

Using one without the other

amem alone: CLI + extension + MCP. Useful for human-led research and single-agent setups.
aide alone: works fine, just without shared knowledge. Agents reason from their own seed context.
Both together: the full stack.

RFC-001 — Bridge-first architecture + settings/feature-flags

Status: Draft
Authors: @yiidtw
Created: 2026-04-21
Supersedes: SPEC.md § Roadmap Day 3 item “standalone mode”
Related: RFC-002 (RSS), amem-sh youtube pipeline

TL;DR

Reposition the Chrome extension as amem Clipper (working name, not yet branded) — a thin companion to the native amem binary, explicitly modelled on the Apple Watch → iPhone relationship. Drop standalone mode from the roadmap: the binary is a hard prerequisite. When the bridge is unreachable or a feature flag is off, the relevant UI is grayed out in place with a single-click enable/install affordance, not hidden. Heavy capture features (YouTube, RSS, Drive) are feature-flagged in ~/.amem/config.toml and lazy-loaded on demand.

Positioning and naming

The extension and the CLI are not peers. The CLI is amem; the extension is a peripheral that surfaces two capture moments (a browser tab, a YouTube video) that would otherwise require copying a URL into a terminal. Everything downstream of capture — storage, compile, transcription, MCP, recall — lives in the binary.

Working name: amem Clipper. Inherits the Evernote Web Clipper lineage (users know what “clipper” means), avoids collision with the internal “bridge” process name (amem-bridge server on WS 7600), and leaves the brand “amem” attached to the core product rather than the peripheral. The name is a placeholder — final branding lands with the CWS submission.

Product model:

Layer	Name	Role
Core	`amem` (CLI + library + MCP)	The product. Storage, compile, recall, agent API.
Peripheral	`amem Clipper` (Chrome extension)	Capture UI for the browser. Cannot function without the core.
Internal	`amem-bridge` (the WS 7600 process)	Implementation detail — users rarely name this.

This positioning reshapes every downstream decision:

“Does the extension need feature X?” → Only if feature X is a capture moment. Storage, recall, etc. always live server-side.
“What happens with no binary?” → Clipper is bricked, like an Apple Watch with no paired iPhone. Frontload install; don’t half-ship.
“Where do settings live?” → In ~/.amem/config.toml on disk. Clipper renders them via a bridge RPC; it never holds authoritative state.

Motivation

The original SPEC envisioned two extension modes:

Mode	Day	Requirement
Bridge	2	Native `amem` binary + WS 7600
Standalone	3	No native binary; cloud sync via Drive

After shipping the YouTube pipeline we hit three things that make standalone look worse than we expected:

Standalone is structurally crippled. Compile requires Ollama, transcription requires whisper-rs — neither runs in a Chrome MV3 extension. A standalone capture can store URLs but cannot compile, so the wiki never builds. Agent-side MCP is also dark because MCP is native-only.
Feature cost is real. YouTube alone needs yt-dlp (~20 MB), ffmpeg (~60 MB), whisper model (75 MB–3 GB depending on size). Bundling all of this into default install breaks “offline-first with zero cloud dependencies by default” by shifting the pain from network to disk.
Dual code paths are a maintenance tax. Standalone + bridge would mean two storage backends, two capture pipelines, two sets of bugs. The SPEC’s principle 4 (“complement aide, don’t duplicate”) applies to our own internals too.

Meanwhile, bridge mode is already the richer experience. A 30-second curl amem.sh/install | sh is less friction than a crippled standalone fork.

Proposal

1. Bridge is the only mode — UI degrades by graying out

Clipper on cold-start pings ws://127.0.0.1:7600/status. Based on the response, individual UI regions render as enabled, gray-disabled with a one-click enable, or gray-disabled with install CTA:

State	UI for capture-web	UI for YouTube	UI for RSS	Global banner
Bridge unreachable	gray, tooltip “amem not running”	gray	gray	“Install amem → `curl amem.sh/install | sh`” with copy button
Bridge OK, all features off	enabled	gray, inline “Enable YouTube (~95 MB)” button	gray, inline “Enable RSS” button	—
Bridge OK, YT enabled	enabled	enabled	gray + enable	—
Bridge OK, all on	enabled	enabled	enabled	—

Rationale: grayed controls are discoverable (user sees the feature exists, understands why it’s off) and honest (no hidden states). All-or-nothing install cards punish curious first-run users; per-feature gray-out is the Apple Watch “grey watch face when disconnected” pattern the positioning promises.
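
The state table above can be read as a pure function from bridge status and feature flags to per-region UI state. The sketch below is one way to express it; the state labels and function name are illustrative, not part of the RFC.

```python
def region_states(bridge_ok: bool, features: dict[str, bool]) -> dict[str, str]:
    """Derive the UI state of each Clipper region from the bridge status."""
    if not bridge_ok:
        # Nothing works without the native binary; show the install banner.
        return {"capture_web": "gray", "youtube": "gray", "rss": "gray",
                "banner": "install"}
    return {
        "capture_web": "enabled",
        "youtube": "enabled" if features.get("youtube") else "gray+enable",
        "rss": "enabled" if features.get("rss") else "gray+enable",
        "banner": "none",
    }


unreachable = region_states(False, {})
youtube_only = region_states(True, {"youtube": True, "rss": False})
```

Keeping this mapping in one function means every region is derived from the same status response, so no region can show a state the table does not define.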

2. Feature flags live in `~/.amem/config.toml`

Clipper renders a Settings page that maps to keys in this file via a new bridge RPC (settings_get / settings_set). The file is the single source of truth — both CLI and Clipper read/write the same keys. Clipper holds no authoritative config of its own, consistent with the peripheral positioning.

# ~/.amem/config.toml
version = 1

[features]
youtube = false         # enables YT capture + compile (lazy-downloads yt-dlp + whisper model)
rss     = false         # enables RSS subscription ingestion (see RFC-002)
drive   = false         # enables Google Drive backup (Day 3)

[youtube]
whisper_model = "tiny.en"  # tiny.en | base.en | small.en | medium.en

[bridge]
host = "127.0.0.1"      # MUST be loopback (see Security)
port = 7600
token_file = "~/.amem/bridge.token"

3. Lazy-load on feature enable

Enabling a flag from Clipper or CLI triggers a setup routine:

amem youtube setup      # CLI: downloads yt-dlp + whisper model + checks ffmpeg
amem rss setup          # (RFC-002)
amem drive setup        # (Day 3)

Clipper setup button → bridge RPC feature_setup({name}) → server runs the corresponding amem <name> setup, streams progress back over WS so Clipper can show a progress bar inline next to the (still grayed) control.

Graceful degradation in core flows. If a user runs amem capture <youtube-url> when features.youtube = false, the CLI prints:

YouTube capture is not enabled. To turn it on:
  amem youtube setup
This will download yt-dlp (~20 MB) and the tiny.en whisper model (~75 MB).

The MCP tool amem_capture returns an analogous structured error, so agents can surface it to their user.

4. Bridge auto-start

On first install, amem install (the curl|sh script) registers a per-user background service:

macOS: launchd user agent (~/Library/LaunchAgents/sh.amem.bridge.plist)
Linux: systemd user unit (~/.config/systemd/user/amem-bridge.service)
Windows: Task Scheduler at-logon task (deferred; Windows support is a follow-up)

Goal: after first-run setup, the bridge is as available as Ollama is today — it just runs.

Security

Bridge-always means a persistent localhost WebSocket. Three defences, all MUST land before the “always on” posture ships:

Defence	Mechanism
Loopback binding	Server binds `127.0.0.1`, never `0.0.0.0`. Reject `--bind` CLI flags that widen this.
Origin allow-list	WS handshake rejects connections whose `Origin:` header is not `chrome-extension://<amem-clipper-prod-id>` (production ID) or `chrome-extension://<amem-clipper-dev-id>` (dev build).
Token auth	On bridge start we mint a 32-byte random token to `~/.amem/bridge.token` (mode `0600`). Extension retrieves it via native-messaging handshake at install time. All WS messages must carry `{"token": "..."}` in their envelope. Tokens rotate on bridge restart.
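
The token and origin defences can be sketched as follows. This is an illustration of the technique under stated assumptions, not the bridge's implementation: the function names are hypothetical, the token is hex-encoded for convenience, and the extension IDs are the same placeholders the table uses.

```python
import hmac
import os
import secrets
import tempfile
from pathlib import Path

ALLOWED_ORIGINS = {
    "chrome-extension://<amem-clipper-prod-id>",  # placeholder, as in the RFC
    "chrome-extension://<amem-clipper-dev-id>",   # placeholder, as in the RFC
}


def mint_token(path: Path) -> str:
    """Write a fresh 32-byte random token (hex) readable only by its owner."""
    token = secrets.token_hex(32)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed
    return token


def accept(origin: str, presented: str, expected: str) -> bool:
    """Allow a connection only with an allow-listed Origin and matching token."""
    token_ok = hmac.compare_digest(presented.encode(), expected.encode())
    return origin in ALLOWED_ORIGINS and token_ok


with tempfile.TemporaryDirectory() as tmp:
    token_path = Path(tmp) / "bridge.token"
    token = mint_token(token_path)
    mode = token_path.stat().st_mode & 0o777
```

hmac.compare_digest is used instead of == so that the comparison time does not leak how many leading characters of a guessed token were correct.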

Threat model

Threat	Impact	Mitigation
Another local program connects to WS	Could trigger `amem_capture` → write files to `~/.amem/raw/`	Token auth + origin check reject any client that lacks the current token or presents a non-allow-listed `Origin:`
Disk-fill DoS via repeated capture	Fill user’s disk	Rate-limit captures per minute; refuse when `~/.amem/` exceeds configurable quota
Malicious browser extension connects as us	Impersonates our extension_id	Chrome sets `Origin:` itself, so an extension cannot present another extension’s ID
RCE via yt-dlp / ffmpeg CVE	Arbitrary code execution	Use pinned versions, track security advisories; same posture as Ollama
Prompt injection in captured content	Poisons MCP `amem_recall` output	Same risk as today’s arxiv/PDF pipeline; not new from bridge-always

Net risk: slightly higher than CLI-only (persistent WS endpoint exists), lower than an HTTP server accepting remote connections. Comparable to VS Code’s language-server loopback.
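
The disk-fill mitigation combines a per-minute rate limit with a storage quota. A minimal sliding-window sketch is shown below; the class name, the limits, and the choice to pass the current store size in as a number are all assumptions made for illustration.

```python
from collections import deque


class CaptureGuard:
    """Refuse captures that exceed a per-minute rate or a disk quota."""

    def __init__(self, max_per_minute: int, quota_bytes: int):
        self.max_per_minute = max_per_minute
        self.quota_bytes = quota_bytes
        self._recent = deque()  # timestamps (seconds) of accepted captures

    def allow(self, now: float, store_bytes: int) -> bool:
        # Forget captures that fell out of the 60-second window.
        while self._recent and now - self._recent[0] >= 60:
            self._recent.popleft()
        if store_bytes >= self.quota_bytes:
            return False
        if len(self._recent) >= self.max_per_minute:
            return False
        self._recent.append(now)
        return True


guard = CaptureGuard(max_per_minute=2, quota_bytes=1000)
decisions = [guard.allow(0, 0), guard.allow(1, 0), guard.allow(2, 0), guard.allow(61, 0)]
```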

Migration

SPEC.md §Roadmap: strike Day 3 “standalone mode”; add “bridge security hardening + install polish” and “Drive backup” (Drive stays).
amem-clipper (repo renamed from amem-extension 2026-04-21): delete any standalone-only code paths (none should exist yet — Day 2 skeleton is bridge-only; this is a no-op today). Rename the product surface to “amem Clipper” in README, store listing, manifest name, and UI chrome.
docs: guide/extension.md renamed to guide/clipper.md; its “Standalone vs Bridge” section is being rewritten as “How install works”.
README (amem-hq): clarify install-first story on all public pages, label the repo as “amem Clipper (Chrome MV3 extension)”.

Rejected alternatives

“Pure cloud standalone” — extension + Drive only, no native binary. Breaks offline-first and agent-MCP. Also introduces OAuth complexity earlier than Day 3.
Bundle everything in the default install. Ships ~3 GB of whisper models most users never use. Opposite of the lazy-load principle.
Run whisper.wasm in the browser. Early 2026 performance is still 3–10× slower than native for base.en; model download in the extension also hits MV3 storage limits.

Concrete work

See GitHub issues linked from this RFC.

amem youtube setup subcommand + graceful-degradation prompt in capture (amem-sh)
Bridge: loopback binding + Origin: check + token auth (amem-sh)
Bridge: auto-start service installers (macOS launchd, Linux systemd) (amem-sh)
Clipper: bridge status probe, per-region gray-out UI, Settings page backed by config.toml via bridge RPC (amem-clipper)
Clipper: rename product surface to “amem Clipper” (manifest, store listing copy, in-UI strings) — repo slug already renamed to amem-clipper 2026-04-21 (amem-clipper)
SPEC.md + docs: drop standalone, document bridge-first (amem-hq)

Open questions

Do we treat Ollama as a similarly lazy-loaded feature? Arguably yes — PDF compile also blocks without it. Worth a follow-up RFC if so.
Drive backup (Day 3): should it require Pro/auth once shipped, or stay free? Product decision, not in scope here.

RFC-002 — RSS / Atom subscription management

Status: Draft
Authors: @yiidtw
Created: 2026-04-21
Related: RFC-001 (feature flags, amem rss setup)

TL;DR

Add a first-class subscription layer on top of the existing capture pipeline. Users run amem sub add <feed> to follow a source (arXiv category, blog, YouTube channel); a polling loop fetches the feed, dedups by GUID, and routes each new item through the existing capture + (optional) compile flow. MCP exposes amem_sub_add / amem_sub_list so agents can manage the user’s reading queue.

Motivation

amem’s capture flow is reactive: it only runs when a human (or agent) hands it a URL. That makes it useless for tracking ongoing sources:

Following arxiv cs.CL as new papers drop
Karpathy / Willison / LessWrong blog posts
A YouTube channel’s new uploads (YouTube publishes per-channel RSS natively)

Every knowledge worker we’ve talked to does some version of this manually today — Feedly/NetNewsWire for reading, then copy-paste URLs into whatever capture tool they use. amem can collapse both steps.

This is the inverse of the self-recording story. There, amem helps you produce content; here, RSS lets other people’s content flow in on the same rails, so the wiki grows continuously rather than only after active capture.

Proposal

1. Subscription storage

# ~/.amem/subscriptions.toml
version = 1

[[subscription]]
id            = "arxiv-cs-cl"
url           = "http://export.arxiv.org/rss/cs.CL"
title         = "arXiv cs.CL (Computation and Language)"
auto_compile  = false         # capture-only by default; compile is opt-in
poll_minutes  = 60
enabled       = true
added_at      = "2026-04-21T00:00:00Z"
last_polled   = "2026-04-21T00:30:00Z"

[[subscription]]
id            = "3b1b-channel"
url           = "https://www.youtube.com/feeds/videos.xml?channel_id=UCYO_jab_esuFRV4b17AJtAw"
title         = "3Blue1Brown"
auto_compile  = true          # small channel, OK to auto-transcribe
poll_minutes  = 240
enabled       = true

2. Dedup ledger

~/.amem/subscriptions/
  ledger.jsonl                 # append-only, one JSON object per seen item
  state/{sub_id}/last_etag     # HTTP caching

Each ledger line:

{"sub_id":"3b1b-channel","guid":"yt:video:aircAruvnKk","captured_at":"2026-04-21T00:30:00Z","cite_key":"3blue1brown2017neural"}

Dedup is GUID-based. If a feed republishes an item (edit, repost), the existing capture wins; we don’t re-download.
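
A sketch of the append-only ledger, assuming the JSON line format shown above. Keying the seen-set on (sub_id, guid) rather than guid alone is an assumption; the RFC only says dedup is GUID-based. Function names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_seen(ledger: Path) -> set[tuple[str, str]]:
    """Read the ledger and return the (sub_id, guid) pairs already captured."""
    if not ledger.exists():
        return set()
    seen = set()
    for line in ledger.read_text(encoding="utf-8").splitlines():
        if line.strip():
            entry = json.loads(line)
            seen.add((entry["sub_id"], entry["guid"]))
    return seen


def record_if_new(ledger: Path, seen: set, entry: dict) -> bool:
    """Append the entry unless its GUID was seen; return True when appended."""
    key = (entry["sub_id"], entry["guid"])
    if key in seen:
        return False  # existing capture wins; nothing is re-downloaded
    with ledger.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    seen.add(key)
    return True


with tempfile.TemporaryDirectory() as tmp:
    ledger_path = Path(tmp) / "ledger.jsonl"
    seen = load_seen(ledger_path)
    item = {"sub_id": "3b1b-channel", "guid": "yt:video:aircAruvnKk",
            "captured_at": "2026-04-21T00:30:00Z", "cite_key": "3blue1brown2017neural"}
    first = record_if_new(ledger_path, seen, item)
    second = record_if_new(ledger_path, seen, item)  # republished item
    reloaded = load_seen(ledger_path)
```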

3. CLI

amem sub add <url> [--auto-compile] [--poll-minutes N] [--title "..."]
amem sub list [--json]
amem sub remove <id>
amem sub enable|disable <id>
amem sub fetch [<id>]          # one-shot poll, honours etag
amem sub daemon                # long-running poller (used by service unit)
amem rss setup                 # install daemon (macOS launchd / Linux systemd user)

4. Poll algorithm

For each enabled sub where now - last_polled >= poll_minutes:

1. GET the feed with If-None-Match: {last_etag} and If-Modified-Since: {last_polled}.
2. On 304: update last_polled and skip.
3. On 200: parse via feed-rs and iterate over the items.
4. For each item not in the ledger:
   - Route to the existing cite::cmd_capture(item.link) (auto-picks arxiv / PDF / YouTube based on URL)
   - If auto_compile = true, also call the appropriate cmd_compile
   - Write a ledger line
5. Update the ledger and state.

Failures per-item don’t block the rest of the feed. Aggregate failures re-queue with exponential backoff (15 min → 2 h cap).

5. MCP surface

amem_sub_add(url, auto_compile?) -> sub_id
amem_sub_list() -> [{ id, title, last_polled, enabled, ... }]
amem_sub_remove(id)
amem_sub_fetch(id?) -> {fetched: N, captured: M, errors: [...] }

This lets an agent maintain its own research feed without a human in the loop: given a feed URL for the source (for example, a feed of arxiv papers that cite Vaswani 2017), following it becomes a single MCP call.

6. Gated behind `features.rss`

Disabled by default. amem rss setup enables it, installs the daemon, and writes features.rss = true to ~/.amem/config.toml (per RFC-001).

Non-goals

Rich reader UI. amem is not Feedly. Reading lives in the wiki + amem recall. If people want visual unread counts, that belongs in an extension page, not the core.
OPML import on day 1. Easy add later; skip for MVP to keep surface small.
Arbitrary scheduling cron. poll_minutes is enough; cron-syntax scheduling is out of scope.
Podcast audio-only feeds. These would need whisper anyway — treat them as RFC-002b when YouTube pipeline is stable on more models.

Risks

Risk	Mitigation
A popular arxiv category fills disk (dozens of papers/day)	`poll_minutes` default 120 + per-feed disk quota + user confirmation on first-time `auto_compile = true`
Feed publisher rate-limits us	Honour `Retry-After`, respect 429; back off to 6 h for repeat offenders
Duplicate captures when arxiv updates a paper’s version	Keep first ingest; subsequent versions append a note to the existing wiki entry rather than creating a new cite_key
RSS spec is loose — malformed feeds break parser	`feed-rs` handles common variants; log + skip malformed entries, do not abort the poll

Concrete work

Rust crate additions: feed-rs = "2", toml_edit = "0.22" (config writes preserve comments) (amem-sh)
amem sub subcommand family (amem-sh)
amem rss setup installer and amem sub daemon long-runner (amem-sh)
MCP tools (amem-sh)
SPEC.md: add subscription to the storage layout section (amem-hq)
Docs: new guide/subscriptions.md page (amem-hq)

Extension UI for managing subscriptions is deferred — CLI first.

Privacy Policy

Last updated: 2026-04-21

amem is a local-first knowledge capture system. This document covers the Chrome extension (“amem Clipper”) and the native CLI (“amem”). Companion repos: amem-clipper, amem-sh.

What amem does

amem captures web pages, papers, and recordings into your personal, local-first knowledge base. Everything stays on your machine unless you explicitly opt into a sync destination.

Data collection

amem does not collect, transmit, or store any user data on external servers. There is no backend, no database, no analytics, no telemetry.

What amem Clipper accesses

Active tab when you click the amem action, open the side panel, or pick a context-menu entry — to read the page title, URL, and (when asked) visible text.
Local storage (chrome.storage.local) — for capture history and preferences.
Local WebSocket at ws://127.0.0.1:7600 (the amem bridge) — only when running on your own machine, to write captures to your filesystem. No external network calls.
Tab recording (tabCapture) — only while you are actively recording from the side panel. Recordings are saved directly to your browser Downloads folder as .webm files.

What amem Clipper does NOT do

Does not read your browsing history.
Does not track your activity or behavior.
Does not send data to any third-party server.
Does not use Google Drive, OAuth, or any cloud provider in this release.
Does not record audio.

Permissions explained

Permission	Why
`storage`	Save capture history and settings locally in the browser.
`activeTab`	Access the current tab when you initiate a capture.
`scripting`	Inject capture helpers into the active tab when you press capture.
`sidePanel`	Render the amem side panel UI.
`alarms`	Periodically retry the local bridge connection.
`tabs`	Read the current tab’s URL and title for the capture record.
`tabCapture`	Record the active tab’s video stream (explicit user action only).
`offscreen`	Host `MediaRecorder` in an offscreen document (service workers can’t).
`contextMenus`	Offer right-click capture entries.
`downloads`	Save recordings to your Downloads folder as `.webm` files.
`host_permissions: <all_urls>`	Capture works on whatever page you are viewing. Nothing is read without an explicit capture action.

amem CLI + crate

The native amem binary (installed via amem install, brew, or cargo) runs entirely on your machine. It reads and writes only ~/.amem/ and (optionally) paths you pass as arguments. It makes outbound HTTPS requests only when you run amem capture <url> — strictly to fetch the URL you asked for (e.g., arXiv PDF, YouTube audio via yt-dlp). No analytics, no telemetry.

Contact

Questions about this policy: open an issue at https://github.com/yiidtw/amem-clipper/issues or https://github.com/yiidtw/amem-sh/issues.
