Sync Your Folder to ChatGPT: the story behind `openai-folder-sync`

created: Friday, Aug 29, 2025
[Illustration: a local directory being synced into a chat bubble with the OpenAI logo]

A while back I kept running into the same problem: I had a bunch of notes, specs, and markdown docs on my laptop, but when I opened ChatGPT and asked questions about them, the model obviously didn’t know they existed. Copy-pasting files was tedious; manually uploading them one by one wasn’t much better. So I built a tiny CLI that does the boring bit for me: `openai-folder-sync`, a command-line tool that syncs a local directory to an OpenAI Vector Store so those files become searchable context for ChatGPT.

Why a vector store?

OpenAI’s file search is built around a vector store. You add files to a store, they’re processed into embeddings, and assistants (or ChatGPT via connected tools) can retrieve and ground answers using those files. It’s a clean model, but it assumes you’ll remember to upload and maintain your file set. A sync step—which mirrors a folder you already care about—is closer to how we work day to day.
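To make that concrete, a store can be created with a single API call. This is a minimal sketch against the OpenAI REST API; the store name is made up, and required headers can change between API versions, so check the current API reference:

```shell
# Create an empty vector store; the response includes its vs_... ID,
# which is what you later pass to openai-folder-sync.
curl https://api.openai.com/v1/vector_stores \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "Company Wiki"}'
```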

What the tool does

openai-folder-sync scans a directory on your machine, filters files by extension, and pushes them to an existing vector store. You can run it once to populate a store, or add it to your routine (e.g., a cron job) to keep your knowledge base fresh. There’s also a switch to embed git metadata (branch and commit) into the uploaded content, so when ChatGPT cites something, you know exactly which version it came from.
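For the cron route, an entry along these lines would do. This is a hypothetical schedule; the binary path, store ID, and folder are placeholders, and the API key has to be visible to cron (here it's set in the crontab itself):

```shell
# Hypothetical crontab entry: re-sync the wiki folder every night at 02:00.
OPENAI_API_KEY=sk-...
0 2 * * * /usr/local/bin/openai-folder-sync --vector-store vs_ABCDEFGHIJK --local-dir /Users/jens/tmp/wiki/content --extensions md
```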

Repo: https://github.com/JensWalter/openai-folder-sync

Installation

It’s written in Rust and installs via Cargo:

cargo install --git https://github.com/JensWalter/openai-folder-sync.git

Cargo will build and place the openai-folder-sync binary on your PATH.

Usage

You’ll need an OpenAI API key and an existing vector store ID (looks like vs_…). Then point the CLI at your folder:

openai-folder-sync \
  --vector-store 'vs_ABCDEFGHIJK' \
  --local-dir '/Users/jens/tmp/wiki/content' \
  --extensions md

Prefer running through Cargo during development?

cargo run -- \
  --vector-store 'vs_ABCDEFGHIJK' \
  --local-dir '/Users/jens/tmp/wiki/content' \
  --extensions md

Most flags can also be set via environment variables:

export OPENAI_API_KEY=sk-...
export VECTOR_STORE=vs_ABCDEFGHIJK
export LOCAL_DIR=/Users/jens/tmp/wiki/content
export EXTENSIONS=md,txt,pdf
export GIT_INFO=true

openai-folder-sync

Common flags
  --vector-store / VECTOR_STORE – target store ID
  --local-dir / LOCAL_DIR – folder to sync
  --extensions / EXTENSIONS – comma-separated list
  --git-info / GIT_INFO – include git metadata (true|false)
  --help – show all options

How this fits with ChatGPT

Once the files are in a vector store, ChatGPT (or your Assistants API apps) can retrieve relevant snippets from those files while answering your questions. That means you can ask things like “What did the proposal say about the migration timeline?” and the model will pull the relevant context from your synced docs—no copy-paste required. Your folder effectively becomes a living knowledge base.
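Outside the ChatGPT UI, the same retrieval works over the API. Here's a sketch using the Responses API's file_search tool; the model name and store ID are placeholders, and the exact request shape is worth verifying against the current API reference:

```shell
# Ask a question grounded in the synced files via file_search.
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-4o-mini",
        "input": "What did the proposal say about the migration timeline?",
        "tools": [{"type": "file_search", "vector_store_ids": ["vs_ABCDEFGHIJK"]}]
      }'
```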

Example workflow

  1. Create one vector store per knowledge domain (e.g., “Company Wiki”, “Research Notes”).
  2. Point openai-folder-sync at the matching local folder.
  3. Run it as part of your writing routine (for me: after pushing to main).
  4. Ask ChatGPT questions and let retrieval do the heavy lifting.
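Step 3 can be automated with a git hook. Here's a hypothetical `.git/hooks/pre-push` script that re-syncs only when pushing from main; it assumes `VECTOR_STORE` and `LOCAL_DIR` are already exported, and uses the flags described above:

```shell
#!/usr/bin/env sh
# Hypothetical pre-push hook: refresh the vector store when pushing main.
set -eu
branch=$(git rev-parse --abbrev-ref HEAD)
if [ "$branch" = "main" ]; then
  openai-folder-sync \
    --vector-store "$VECTOR_STORE" \
    --local-dir "$LOCAL_DIR" \
    --extensions md \
    --git-info true
fi
```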

Roadmap

I currently have no further plans for this tool. If you think something is still missing, just open an issue or PR.

If you want to try it, the repository includes install and usage details plus help output. I built it to make my own workflow simpler; if it saves you from a few copy-paste marathons and helps ChatGPT answer from your real documents, that’s a win. Happy syncing!