knowd

by Community

Personal knowledge base: save web pages and search them semantically. Use when user shares a URL to save/bookmark/remember, asks "what did I save about...", wants to search saved articles, or asks about their knowledge base. Supports multiple embedding providers (OpenAI, Voyage, Cohere, Jina, Ollama).

1.0.0
$ npx skills add https://github.com/ianpcook/knowd-skill

Files

README.md
2.3 KB
# knowd 🧠

Personal knowledge base for AI agents. Save web pages, search them semantically.

No UI, just conversation. Tell your agent "save this URL" and it extracts, chunks, embeds, and stores it. Ask "what did I save about X?" and it finds the most relevant passages.

## Features

- **Multi-provider embeddings**: OpenAI, Voyage AI, Cohere, Jina, or Ollama (local/free)
- **Semantic search**: cosine similarity over embedded chunks
- **Content extraction**: trafilatura handles the messy web
- **SQLite storage**: single file, no external services
- **Provider lock-in protection**: the DB tracks which provider was used; you can't mix incompatible embeddings

## Install

### As an OpenClaw / Clawdbot skill

```bash
# From Skills N'at
# Visit https://skills-nat.vercel.app and install via repo URL

# Or manually
git clone https://github.com/ianpcook/knowd-skill.git skills/knowd
pip3 install -r skills/knowd/requirements.txt
```

### Standalone

```bash
git clone https://github.com/ianpcook/knowd-skill.git
cd knowd-skill
pip3 install -r requirements.txt
python3 scripts/knowd.py --help
```

## Setup

Set at least one embedding provider API key:

| Provider | Env Variable | Model |
|----------|-------------|-------|
| OpenAI (default) | `OPENAI_API_KEY` | text-embedding-3-small |
| Voyage AI | `VOYAGE_API_KEY` | voyage-3-lite |
| Cohere | `COHERE_API_KEY` | embed-v4 |
| Jina | `JINA_API_KEY` | jina-embeddings-v3 |
| Ollama (local) | (none) | nomic-embed-text |
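
For example, to use the default OpenAI provider (illustrative placeholder key) or to go keyless with local Ollama:

```bash
# Hosted provider: export the matching key before running knowd
export OPENAI_API_KEY="sk-your-key"

# Local/free alternative: have Ollama running and pull the embedding model
ollama pull nomic-embed-text
```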

## Usage

```bash
# Save a URL
python3 scripts/knowd.py save "https://example.com/article"

# Search your knowledge
python3 scripts/knowd.py search "machine learning best practices" -k 5

# List saved sources
python3 scripts/knowd.py list

# Stats
python3 scripts/knowd.py stats

# Delete
python3 scripts/knowd.py delete "https://example.com/article"

# List available providers
python3 scripts/knowd.py providers
```

## How it works

1. **Fetch**: downloads the page, extracts readable text via trafilatura
2. **Chunk**: splits into ~500-token overlapping chunks at sentence boundaries
3. **Embed**: sends chunks to your chosen embedding provider
4. **Store**: SQLite with binary embeddings (no vector DB dependency)
5. **Search**: embeds your query, computes cosine similarity against all chunks
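
A minimal sketch of the same pipeline, calling the helpers defined in `scripts/knowd.py` directly (assumes that module is importable as `knowd` and that `OPENAI_API_KEY` is set):

```python
# Illustrative only: the CLI wraps these same steps with SQLite persistence.
import os

import numpy as np

from knowd import chunk_text, cosine_sim, embed_texts, fetch_content

text, title = fetch_content("https://example.com/article")   # 1. fetch + extract
chunks = chunk_text(text)                                     # 2. ~500-token chunks
key = os.environ["OPENAI_API_KEY"]
embs = embed_texts(chunks, "openai", "text-embedding-3-small", key)  # 3. embed
blobs = [e.tobytes() for e in embs]                           # 4. what SQLite stores

query = embed_texts(["what is this article about?"], "openai",
                    "text-embedding-3-small", key)[0]
best_chunk, best_blob = max(                                  # 5. cosine ranking
    zip(chunks, blobs),
    key=lambda pair: cosine_sim(query, np.frombuffer(pair[1], dtype=np.float32)),
)
print(title, "->", best_chunk[:120])
```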

## License

MIT
SKILL.md
3.1 KB
---
name: knowd
description: Personal knowledge base: save web pages and search them semantically. Use when user shares a URL to save/bookmark/remember, asks "what did I save about...", wants to search saved articles, or asks about their knowledge base. Supports multiple embedding providers (OpenAI, Voyage, Cohere, Jina, Ollama).
metadata: {"clawdbot": {"emoji": "🧠", "requires": {"bins": ["python3"]}, "primaryEnv": "OPENAI_API_KEY"}}
---

# knowd: Personal Knowledge Base

Save web pages, search them semantically. No UI, just conversation.

## Setup

Install dependencies:
```bash
pip3 install -r <skill>/requirements.txt
```

Set ONE of these API keys (or use Ollama for local/free):
- `OPENAI_API_KEY`: OpenAI text-embedding-3-small (default)
- `VOYAGE_API_KEY`: Voyage AI voyage-3-lite
- `COHERE_API_KEY`: Cohere embed-v4
- `JINA_API_KEY`: Jina jina-embeddings-v3
- Ollama: no key needed; just have Ollama running locally

The provider is locked in on first use; you can't mix embedding spaces in the same DB.
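
For example (output abridged; the error text comes from `resolve_provider` in `scripts/knowd.py`):

```bash
# First save locks this DB to openai
python3 <skill>/scripts/knowd.py save "https://example.com/a"

# A later call with a different provider is rejected
python3 <skill>/scripts/knowd.py --provider voyage save "https://example.com/b"
# Error: This database uses 'openai' embeddings. Cannot switch to 'voyage' ...
```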

## Commands

```bash
# Save a URL
python3 <skill>/scripts/knowd.py save "<url>"

# Save with a specific provider (only honored on first use; the DB then locks it in)
python3 <skill>/scripts/knowd.py --provider voyage save "<url>"

# Semantic search
python3 <skill>/scripts/knowd.py search "<query>" -k 5

# List saved sources
python3 <skill>/scripts/knowd.py list

# Stats (includes provider info)
python3 <skill>/scripts/knowd.py stats

# Delete a source
python3 <skill>/scripts/knowd.py delete "<url-or-id>"

# List available providers and which keys are set
python3 <skill>/scripts/knowd.py providers
```

## When to Use

### Saving
When user shares a URL with intent to save ("save this", "remember this", "bookmark this", "add to my knowledge base"):
1. Run `knowd save "<url>"`
2. Report: title, chunk count, provider used
3. Be conversational: "Saved! Got 8 chunks from 'Article Title' via openai."

### Searching
When user asks about saved knowledge ("what did I save about...", "find that article about...", "search my knowledge base for..."):
1. Run `knowd search "<query>"`
2. Summarize results naturally: titles, relevant snippets, scores only if helpful
3. Don't dump raw output; synthesize

### Listing
When user asks what they've saved:
1. Run `knowd list`
2. Present as a clean list with titles and dates

### Provider Selection
- On first use, if user hasn't specified, auto-detect: use whichever API key is available in the environment
- If multiple keys exist, prefer OpenAI (most common)
- If user explicitly requests a provider: `--provider cohere`
- After first use, the provider is locked to that DB

## State

Database: `<workspace>/state/knowd.db` (SQLite)

The DB stores the embedding provider/model in metadata. Attempting to use a different provider on an existing DB will error with a clear explanation.
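
To check which provider a DB is locked to, use the `providers` command (output abridged):

```bash
python3 <skill>/scripts/knowd.py providers
# Available embedding providers:
#   openai      model: text-embedding-3-small  env: OPENAI_API_KEY  [✓]
#   ...
#   This database uses: openai/text-embedding-3-small
```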

## Auto-Detection

If the user doesn't specify a provider, check environment variables in this order:
1. OPENAI_API_KEY → openai
2. VOYAGE_API_KEY → voyage
3. COHERE_API_KEY → cohere
4. JINA_API_KEY → jina
5. Check if Ollama is running → ollama
6. Error: no provider available
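
A sketch of that precedence in Python (illustrative; this logic belongs to the agent workflow rather than `scripts/knowd.py`, and the Ollama probe endpoint is an assumption based on its standard REST API):

```python
import os
import urllib.request

def detect_provider():
    """Return a provider name per the documented precedence, or None."""
    order = [("OPENAI_API_KEY", "openai"), ("VOYAGE_API_KEY", "voyage"),
             ("COHERE_API_KEY", "cohere"), ("JINA_API_KEY", "jina")]
    for env, name in order:
        if os.environ.get(env):
            return name
    try:
        # Probe the local Ollama server; GET /api/tags lists installed models.
        urllib.request.urlopen("http://localhost:11434/api/tags", timeout=2)
        return "ollama"
    except OSError:
        return None  # step 6: no provider available
```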
requirements.txt
24 B
click
trafilatura
numpy
knowd.py
17.9 KB
#!/usr/bin/env python3
"""knowd โ€” personal knowledge base with semantic search. Multi-provider embeddings."""

import hashlib
import json
import os
import re
import sqlite3
import sys
import urllib.request
import urllib.error
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import urlparse

import click
import numpy as np
import trafilatura

# Defaults
DEFAULT_DB = Path(__file__).resolve().parent.parent.parent.parent / "state" / "knowd.db"
CHUNK_SIZE = 500  # ~tokens (approx 4 chars/token)
CHUNK_OVERLAP = 50

# --- Embedding Providers ---

PROVIDERS = {
    "openai": {
        "env": "OPENAI_API_KEY",
        "model": "text-embedding-3-small",
        "dims": 1536,
        "url": "https://api.openai.com/v1/embeddings",
    },
    "voyage": {
        "env": "VOYAGE_API_KEY",
        "model": "voyage-3-lite",
        "dims": 1024,
        "url": "https://api.voyageai.com/v1/embeddings",
    },
    "cohere": {
        "env": "COHERE_API_KEY",
        "model": "embed-v4",
        "dims": 1024,
        "url": "https://api.cohere.com/v2/embed",
    },
    "jina": {
        "env": "JINA_API_KEY",
        "model": "jina-embeddings-v3",
        "dims": 1024,
        "url": "https://api.jina.ai/v1/embeddings",
    },
    "ollama": {
        "env": None,
        "model": "nomic-embed-text",
        "dims": 768,
        "url": "http://localhost:11434/api/embed",
    },
}


def _http_post(url, headers, body, timeout=60):
    """Simple HTTP POST returning parsed JSON."""
    data = json.dumps(body).encode()
    req = urllib.request.Request(url, data=data, headers=headers, method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode())


def embed_texts(texts, provider, model, api_key=None):
    """Get embeddings from the specified provider. Returns list of np arrays."""
    if provider == "openai":
        return _embed_openai(texts, model, api_key)
    elif provider == "voyage":
        return _embed_voyage(texts, model, api_key)
    elif provider == "cohere":
        return _embed_cohere(texts, model, api_key)
    elif provider == "jina":
        return _embed_jina(texts, model, api_key)
    elif provider == "ollama":
        return _embed_ollama(texts, model)
    else:
        raise click.ClickException(f"Unknown provider: {provider}")


def _embed_openai(texts, model, api_key):
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    all_embs = []
    for i in range(0, len(texts), 2048):
        batch = texts[i:i + 2048]
        resp = _http_post(PROVIDERS["openai"]["url"], headers, {"model": model, "input": batch})
        for item in sorted(resp["data"], key=lambda x: x["index"]):
            all_embs.append(np.array(item["embedding"], dtype=np.float32))
    return all_embs


def _embed_voyage(texts, model, api_key):
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    all_embs = []
    for i in range(0, len(texts), 128):
        batch = texts[i:i + 128]
        resp = _http_post(PROVIDERS["voyage"]["url"], headers, {"model": model, "input": batch})
        for item in sorted(resp["data"], key=lambda x: x["index"]):
            all_embs.append(np.array(item["embedding"], dtype=np.float32))
    return all_embs


def _embed_cohere(texts, model, api_key):
    # input_type is fixed to "search_document"; Cohere's API also defines
    # "search_query" for query embeddings, so search queries are embedded
    # as documents here.
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    all_embs = []
    for i in range(0, len(texts), 96):
        batch = texts[i:i + 96]
        resp = _http_post(
            PROVIDERS["cohere"]["url"], headers,
            {"model": model, "texts": batch, "input_type": "search_document", "embedding_types": ["float"]},
        )
        for emb in resp["embeddings"]["float"]:
            all_embs.append(np.array(emb, dtype=np.float32))
    return all_embs


def _embed_jina(texts, model, api_key):
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    all_embs = []
    for i in range(0, len(texts), 2048):
        batch = texts[i:i + 2048]
        resp = _http_post(PROVIDERS["jina"]["url"], headers, {"model": model, "input": batch})
        for item in sorted(resp["data"], key=lambda x: x["index"]):
            all_embs.append(np.array(item["embedding"], dtype=np.float32))
    return all_embs


def _embed_ollama(texts, model):
    all_embs = []
    for text in texts:
        resp = _http_post(PROVIDERS["ollama"]["url"], {"Content-Type": "application/json"},
                          {"model": model, "input": text})
        all_embs.append(np.array(resp["embeddings"][0], dtype=np.float32))
    return all_embs


# --- DB ---

def get_db(db_path):
    db_path = Path(db_path)
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(db_path))
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA foreign_keys = ON")
    _init_db(conn)
    return conn


def _init_db(conn):
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT);
        CREATE TABLE IF NOT EXISTS sources (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            url TEXT UNIQUE NOT NULL,
            title TEXT,
            domain TEXT,
            saved_at TEXT NOT NULL,
            content_hash TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS chunks (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            source_id INTEGER NOT NULL REFERENCES sources(id) ON DELETE CASCADE,
            content TEXT NOT NULL,
            embedding BLOB,
            chunk_index INTEGER NOT NULL
        );
        CREATE INDEX IF NOT EXISTS idx_chunks_source ON chunks(source_id);
    """)
    conn.execute("INSERT OR IGNORE INTO meta VALUES ('schema_version', '1')")
    conn.commit()


def get_meta(conn, key):
    row = conn.execute("SELECT value FROM meta WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


def set_meta(conn, key, value):
    conn.execute("INSERT OR REPLACE INTO meta VALUES (?, ?)", (key, value))
    conn.commit()


def resolve_provider(conn, provider=None, model=None):
    """Resolve provider/model, enforcing DB lock-in after first use."""
    db_provider = get_meta(conn, "embedding_provider")
    db_model = get_meta(conn, "embedding_model")

    if db_provider:
        # DB already has a provider locked in
        if provider and provider != db_provider:
            raise click.ClickException(
                f"This database uses '{db_provider}' embeddings. "
                f"Cannot switch to '{provider}' โ€” vectors would be incompatible. "
                f"Use --db to create a separate database, or delete and re-save all sources."
            )
        provider = db_provider
        model = model or db_model
    else:
        # First use: set provider
        provider = provider or "openai"
        if provider not in PROVIDERS:
            raise click.ClickException(
                f"Unknown provider: {provider}. Choose from: {', '.join(PROVIDERS.keys())}"
            )
        model = model or PROVIDERS[provider]["model"]

    # Validate API key
    pinfo = PROVIDERS[provider]
    api_key = None
    if pinfo["env"]:
        api_key = os.environ.get(pinfo["env"])
        if not api_key:
            raise click.ClickException(
                f"Provider '{provider}' requires {pinfo['env']} environment variable."
            )

    # Lock in on first use
    if not db_provider:
        set_meta(conn, "embedding_provider", provider)
        set_meta(conn, "embedding_model", model)

    return provider, model, api_key


# --- Content extraction ---

def fetch_content(url):
    """Extract text and title from a URL."""
    downloaded = trafilatura.fetch_url(url)
    if not downloaded:
        # Fallback: raw urllib
        try:
            req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0 (compatible; knowd/1.0)"})
            with urllib.request.urlopen(req, timeout=30) as resp:
                downloaded = resp.read().decode("utf-8", errors="replace")
        except Exception:
            raise click.ClickException(f"Could not fetch: {url}")

    if not downloaded:
        raise click.ClickException(f"Could not fetch: {url}")

    text = trafilatura.extract(downloaded, include_comments=False, include_tables=True)
    title = None
    try:
        meta = trafilatura.metadata.extract_metadata(downloaded)
        if meta and meta.title:
            title = meta.title
    except Exception:
        pass

    if not text or len(text) < 50:
        raise click.ClickException(f"Could not extract meaningful content from: {url}")

    return text, title


# --- Chunking ---

def chunk_text(text, chunk_size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Split text into ~chunk_size-token chunks that break at sentence
    boundaries, carrying ~overlap tokens of the previous chunk forward as
    character-level overlap."""
    chars = chunk_size * 4  # token budget -> chars (4 chars/token heuristic)
    olap = overlap * 4

    sentences = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue
        parts = re.split(r"(?<=[.!?])\s+", line)
        sentences.extend(parts)

    chunks = []
    current = ""
    for s in sentences:
        if len(current) + len(s) + 1 > chars and current:
            chunks.append(current.strip())
            # Seed the next chunk with the tail of this one (character overlap).
            current = (current[-olap:] + " " + s) if olap else s
        else:
            current = current + " " + s if current else s
    if current.strip():
        chunks.append(current.strip())
    return chunks if chunks else [text[:chars]]


# --- Search ---

def cosine_sim(a, b):
    # Standard cosine similarity; the 1e-10 guards against division by zero.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-10))


# --- CLI ---

@click.group()
@click.option("--db", default=None, help="Override DB path")
@click.option("--json-output", "--json", "json_out", is_flag=True, help="JSON output")
@click.option("--provider", "-p", default=None, help=f"Embedding provider: {', '.join(PROVIDERS.keys())}")
@click.option("--model", "-m", default=None, help="Override embedding model")
@click.pass_context
def cli(ctx, db, json_out, provider, model):
    """knowd โ€” save and search your knowledge."""
    ctx.ensure_object(dict)
    ctx.obj["db_path"] = db or os.environ.get("KNOWD_DB", str(DEFAULT_DB))
    ctx.obj["json"] = json_out
    ctx.obj["provider"] = provider
    ctx.obj["model"] = model


@cli.command()
@click.pass_context
def init(ctx):
    """Initialize the database."""
    conn = get_db(ctx.obj["db_path"])
    conn.close()
    click.echo("Database initialized.")


@cli.command()
@click.argument("url")
@click.pass_context
def save(ctx, url):
    """Save a URL to the knowledge base."""
    conn = get_db(ctx.obj["db_path"])
    provider, model, api_key = resolve_provider(conn, ctx.obj["provider"], ctx.obj["model"])

    click.echo(f"Fetching {url}...")
    text, title = fetch_content(url)
    content_hash = hashlib.sha256(text.encode()).hexdigest()

    # Check for duplicate
    existing = conn.execute("SELECT id, content_hash FROM sources WHERE url = ?", (url,)).fetchone()
    if existing:
        if existing["content_hash"] == content_hash:
            if ctx.obj["json"]:
                click.echo(json.dumps({"status": "duplicate", "title": title, "url": url}))
            else:
                click.echo(f"Already saved (unchanged): {title or url}")
            conn.close()
            return
        conn.execute("DELETE FROM chunks WHERE source_id = ?", (existing["id"],))
        conn.execute(
            "UPDATE sources SET title=?, content_hash=?, saved_at=? WHERE id=?",
            (title, content_hash, datetime.now(timezone.utc).isoformat(), existing["id"]),
        )
        source_id = existing["id"]
        click.echo("Content updated, re-embedding...")
    else:
        domain = urlparse(url).netloc
        cur = conn.execute(
            "INSERT INTO sources (url, title, domain, saved_at, content_hash) VALUES (?,?,?,?,?)",
            (url, title, domain, datetime.now(timezone.utc).isoformat(), content_hash),
        )
        source_id = cur.lastrowid

    chunks = chunk_text(text)
    click.echo(f"Embedding {len(chunks)} chunks via {provider}/{model}...")
    embeddings = embed_texts(chunks, provider, model, api_key)

    for i, (chunk, emb) in enumerate(zip(chunks, embeddings)):
        conn.execute(
            "INSERT INTO chunks (source_id, content, embedding, chunk_index) VALUES (?,?,?,?)",
            (source_id, chunk, emb.tobytes(), i),
        )
    conn.commit()
    conn.close()

    if ctx.obj["json"]:
        click.echo(json.dumps({"status": "saved", "title": title, "url": url, "chunks": len(chunks), "provider": provider, "model": model}))
    else:
        click.echo(f"Saved: {title or url}")
        click.echo(f"  {len(chunks)} chunks, {len(text)} chars ({provider}/{model})")


@cli.command()
@click.argument("query")
@click.option("-k", default=5, help="Number of results")
@click.pass_context
def search(ctx, query, k):
    """Semantic search over saved knowledge."""
    conn = get_db(ctx.obj["db_path"])
    provider, model, api_key = resolve_provider(conn, ctx.obj["provider"], ctx.obj["model"])

    query_emb = embed_texts([query], provider, model, api_key)[0]

    rows = conn.execute(
        """
        SELECT c.id, c.content, c.embedding, c.chunk_index,
               s.url, s.title, s.saved_at, s.domain
        FROM chunks c JOIN sources s ON c.source_id = s.id
        WHERE c.embedding IS NOT NULL
    """
    ).fetchall()

    if not rows:
        click.echo("No saved content to search.")
        conn.close()
        return

    results = []
    for r in rows:
        emb = np.frombuffer(r["embedding"], dtype=np.float32)
        sim = cosine_sim(query_emb, emb)
        results.append({
            "title": r["title"] or r["url"],
            "url": r["url"],
            "domain": r["domain"],
            "chunk": r["content"],
            "score": round(sim, 4),
            "saved_at": r["saved_at"],
        })

    results.sort(key=lambda x: x["score"], reverse=True)
    results = results[:k]
    conn.close()

    if ctx.obj["json"]:
        click.echo(json.dumps(results, indent=2))
    else:
        for i, r in enumerate(results, 1):
            preview = r["chunk"][:300] + "..." if len(r["chunk"]) > 300 else r["chunk"]
            click.echo(f"\n{'โ”€' * 60}")
            click.echo(f"  [{i}] {r['title']}  (score: {r['score']:.3f})")
            click.echo(f"  {r['url']}")
            click.echo(f"  Saved: {r['saved_at'][:10]}")
            click.echo(f"  {preview}")
        click.echo(f"\n{'โ”€' * 60}")


@cli.command("list")
@click.option("--limit", default=20, help="Max sources to show")
@click.pass_context
def list_sources(ctx, limit):
    """List saved sources."""
    conn = get_db(ctx.obj["db_path"])
    rows = conn.execute(
        """
        SELECT s.id, s.url, s.title, s.domain, s.saved_at,
               COUNT(c.id) as chunk_count
        FROM sources s LEFT JOIN chunks c ON s.id = c.source_id
        GROUP BY s.id ORDER BY s.saved_at DESC LIMIT ?
    """,
        (limit,),
    ).fetchall()
    conn.close()

    if ctx.obj["json"]:
        click.echo(json.dumps([dict(r) for r in rows], indent=2))
    else:
        if not rows:
            click.echo("No saved sources.")
            return
        for r in rows:
            click.echo(f"  [{r['id']}] {r['title'] or r['url']}")
            click.echo(f"      {r['url']}")
            click.echo(f"      {r['saved_at'][:10]} ยท {r['chunk_count']} chunks")
            click.echo()


@cli.command()
@click.pass_context
def stats(ctx):
    """Show database stats."""
    conn = get_db(ctx.obj["db_path"])
    sources = conn.execute("SELECT COUNT(*) FROM sources").fetchone()[0]
    chunks = conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
    provider = get_meta(conn, "embedding_provider") or "not set"
    model = get_meta(conn, "embedding_model") or "not set"
    conn.close()

    db_path = Path(ctx.obj["db_path"])
    db_size = db_path.stat().st_size if db_path.exists() else 0
    size_str = f"{db_size / 1024:.1f} KB" if db_size < 1048576 else f"{db_size / 1048576:.1f} MB"

    if ctx.obj["json"]:
        click.echo(json.dumps({"sources": sources, "chunks": chunks, "db_size_bytes": db_size, "provider": provider, "model": model}))
    else:
        click.echo(f"  Sources:  {sources}")
        click.echo(f"  Chunks:   {chunks}")
        click.echo(f"  DB size:  {size_str}")
        click.echo(f"  Provider: {provider}")
        click.echo(f"  Model:    {model}")


@cli.command()
@click.argument("url_or_id")
@click.pass_context
def delete(ctx, url_or_id):
    """Delete a source and its chunks."""
    conn = get_db(ctx.obj["db_path"])
    try:
        sid = int(url_or_id)
        row = conn.execute("SELECT * FROM sources WHERE id = ?", (sid,)).fetchone()
    except ValueError:
        row = conn.execute("SELECT * FROM sources WHERE url = ?", (url_or_id,)).fetchone()

    if not row:
        raise click.ClickException(f"Source not found: {url_or_id}")

    conn.execute("DELETE FROM chunks WHERE source_id = ?", (row["id"],))
    conn.execute("DELETE FROM sources WHERE id = ?", (row["id"],))
    conn.commit()
    conn.close()

    if ctx.obj["json"]:
        click.echo(json.dumps({"status": "deleted", "title": row["title"], "url": row["url"]}))
    else:
        click.echo(f"Deleted: {row['title'] or row['url']}")


@cli.command()
@click.pass_context
def providers(ctx):
    """List available embedding providers."""
    if ctx.obj["json"]:
        click.echo(json.dumps({k: {"model": v["model"], "env": v["env"]} for k, v in PROVIDERS.items()}, indent=2))
    else:
        click.echo("Available embedding providers:\n")
        for name, info in PROVIDERS.items():
            env = info["env"] or "(none โ€” local)"
            key_set = "โœ“" if (not info["env"] or os.environ.get(info["env"])) else "โœ—"
            click.echo(f"  {name:10s}  model: {info['model']:30s}  env: {env:20s} [{key_set}]")
        click.echo()
        db_path = Path(ctx.obj["db_path"])
        if db_path.exists():
            conn = get_db(str(db_path))
            p = get_meta(conn, "embedding_provider")
            m = get_meta(conn, "embedding_model")
            conn.close()
            if p:
                click.echo(f"  This database uses: {p}/{m}")


if __name__ == "__main__":
    cli()
skill.json
561 B
{
  "name": "knowd",
  "version": "1.0.0",
  "description": "Personal knowledge base โ€” save web pages and search them semantically. Multi-provider embeddings (OpenAI, Voyage, Cohere, Jina, Ollama).",
  "author": "Ian Cook",
  "license": "MIT",
  "category": "knowledge",
  "agents": ["claude-code", "openclaw"],
  "requires": {
    "bins": ["python3"],
    "env": {
      "oneOf": ["OPENAI_API_KEY", "VOYAGE_API_KEY", "COHERE_API_KEY", "JINA_API_KEY"]
    }
  },
  "keywords": ["knowledge-base", "semantic-search", "embeddings", "bookmarks", "web-clipper"]
}

Compatible Agents

Claude Code · Codex · OpenClaw · Antigravity · Gemini

Details

Category: Uncategorized
Version: 1.0.0
Stars: 0
Added: February 11, 2026
Updated: February 11, 2026

Actions

Download .zip

Upload this .zip to Claude Desktop via Settings → Capabilities → Skills
