Repo Mind Light is follow-up work to Repo Mind. It keeps the same core ambition: giving humans and coding agents a reliable, holistic understanding of a repository rather than isolated search hits, in a form that is easier to operationalize.
Context - Distributed Repository Memory
Large repositories are full of information that is individually searchable but hard to assemble into a useful mental model. Source code explains only part of the story. Equally important details often live in pull requests, issues, review threads, and old comments where design intent, operational tradeoffs, and team memory accumulate over time.
Repo Mind Light is built for that broader form of understanding rather than for narrow snippet retrieval alone.
The Problem - Scattered Context
Repository questions often depend on context that is spread across code, discussions, and history. Those questions include:
- architectural questions about how the repository is organized
- questions about which feature is implemented by which subsystem or component
- ownership and contact questions, such as who is most likely to know a part of the system well
- historical questions about why something works the way it does and where that reasoning was previously discussed
One concrete example is incident response. When an incident lands, first responders need to get oriented quickly: which pull requests, issues, subsystems, and people are relevant, and what buried history matters before the investigation can even begin. Repo Mind Light helps with that, but incident response is only one example of the broader repository-understanding problem the project addresses.
Solution Approach - Focused Architecture
| Dimension | Repo Mind | Repo Mind Light |
|---|---|---|
| Core goal | Holistic repository understanding | Holistic repository understanding (same goal) |
| Indexed content | Code, docs, issues, PRs, summaries | Issues and PRs indexed locally |
| Code and docs access | Preprocessed as part of the wider system | Retrieved live through Blackbird |
| Retrieval modes | Multiple variants | GraphRAG Zero only |
| Deployment shape | Remote service | Standalone tool or Docker image plus local MCP server |
| Workflow fit | Integrable anywhere | Integrable anywhere, with a strong fit for Agentic Workflows |
Repo Mind Light has a focused architecture built around three pieces:
- incremental indexing of GitHub issues and pull requests into local on-disk index files
- GraphRAG Zero retrieval over that indexed issue and pull request corpus at query time
- live retrieval of source-code chunks and documentation files from GitHub Code Search, also known internally as Blackbird
This design keeps the system grounded in repository-wide context while making it straightforward to run as a standalone tool or embed into a larger workflow.
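The incremental indexing piece can be illustrated with a minimal sketch. It assumes each issue or pull request carries an `updated_at` timestamp and that an item is reprocessed only when it is new or that timestamp has changed. The `Item` type, the `refresh_index` function, and the in-memory dictionary standing in for the on-disk index files are all hypothetical names chosen for illustration, not the actual Repo Mind Light implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A GitHub issue or pull request, reduced to what this sketch needs."""
    number: int
    updated_at: str  # ISO 8601 UTC timestamp, e.g. "2024-05-01T12:00:00Z"
    text: str


def refresh_index(index: dict[int, Item], fetched: list[Item]) -> list[int]:
    """Update `index` in place and return the numbers that were reprocessed.

    An item is reprocessed only if it is new or its `updated_at` differs
    from the copy already stored in the index (hypothetical policy).
    """
    changed = []
    for item in fetched:
        known = index.get(item.number)
        if known is None or known.updated_at != item.updated_at:
            index[item.number] = item  # stand-in for re-chunking and re-embedding
            changed.append(item.number)
    return changed


# Previously indexed state (stand-in for the local on-disk index files).
index = {
    1: Item(1, "2024-05-01T12:00:00Z", "Crash on startup"),
    2: Item(2, "2024-05-02T09:30:00Z", "Fix cache eviction"),
}

# What a later run sees for the same repository.
fetched = [
    Item(1, "2024-05-01T12:00:00Z", "Crash on startup"),       # unchanged
    Item(2, "2024-05-03T08:00:00Z", "Fix cache eviction v2"),  # edited
    Item(3, "2024-05-03T10:15:00Z", "Add retry logic"),        # new
]

changed = refresh_index(index, fetched)
print(changed)  # [2, 3]
```

Only the edited and the new item are reprocessed, so the cost of a refresh follows the amount of recent activity rather than the size of the repository's history.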
GraphRAG Zero is the third generation in a family of GraphRAG retrieval approaches from Microsoft Research. In Repo Mind Light it is used in a mode without cluster summarization: retrieval stays grounded in embeddings and chunk retrieval, with graph structure guiding candidate selection rather than relying on precomputed cluster summaries. The current GraphRAG Zero implementation is proprietary.
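Because the GraphRAG Zero implementation is proprietary, the following is only a generic illustration of the idea described above: embedding similarity picks seed chunks, graph links widen the candidate set, and no cluster summaries are involved. It is not the GraphRAG Zero algorithm. The function names, the two-dimensional toy embeddings, and the `links` structure are assumptions made for this sketch.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, embeddings, links, seeds=1, top_k=3):
    """Return up to `top_k` chunk ids ranked by similarity to the query.

    Candidates are the `seeds` most similar chunks plus their direct
    graph neighbours, so the graph decides which chunks are considered.
    """
    def score(chunk_id):
        return cosine(query_vec, embeddings[chunk_id])

    by_score = sorted(embeddings, key=score, reverse=True)
    candidates = set(by_score[:seeds])
    for chunk_id in by_score[:seeds]:
        candidates.update(links.get(chunk_id, ()))
    return sorted(candidates, key=score, reverse=True)[:top_k]


# Toy corpus: chunk id -> embedding (hypothetical values).
embeddings = {
    "issue-12": [1.0, 0.0],
    "issue-7": [0.9, 0.1],
    "pr-40": [0.6, 0.8],
    "pr-55": [0.0, 1.0],
}
# Graph edges, e.g. an issue that is referenced by the PR that fixed it.
links = {"issue-12": ["pr-40"]}

result = retrieve([1.0, 0.0], embeddings, links, seeds=1, top_k=2)
print(result)  # ['issue-12', 'pr-40']
```

In this toy example `pr-40` is returned even though `issue-7` is closer to the query in embedding space, because the link from `issue-12` brings it into the candidate set. That is the sense in which graph structure guides candidate selection here.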
At query time, Repo Mind Light combines the indexed issue and pull request context with live code and docs retrieval, then exposes that combined understanding through an MCP server that coding agents can call directly.
Repo Mind Light can be used as a standalone repository question-answering system. It can also be embedded naturally into Agentic Workflows, where an agent needs high-quality repository context as part of a larger task. That workflow integration is important, but it is an application of Repo Mind Light rather than its whole purpose.
In workflow settings, the integration looks like this:
- a GitHub Actions job restores the most recent local index snapshot from GitHub Actions cache
- Repo Mind Light refreshes that index incrementally for the current repository
- the workflow starts a local MCP server from the published Docker image
- an agent queries that MCP server for repository understanding and contact-finding questions
In practice, those local index files stay small. Even for repositories with thousands of issues and pull requests, the Repo Mind Light index is only around 100 MB, which fits comfortably inside GitHub Actions’ 10 GB cache budget. Because GitHub Actions evicts the least recently used cache entries first and index maintenance is incremental, old snapshots fall out naturally, and each refreshed run only needs to process changed issues and pull requests.
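The arithmetic behind that claim can be checked with a small simulation. This is a simplified model, assuming every run saves a fresh snapshot of about 100 MB under a new key and that the cache evicts purely by least recent use once the 10 GB budget is exceeded. The `LruCache` class and the key names are hypothetical, and the model ignores other GitHub Actions cache rules such as time-based expiry.

```python
from collections import OrderedDict

BUDGET_MB = 10_000   # 10 GB cache budget, in decimal megabytes
SNAPSHOT_MB = 100    # approximate index size quoted above


class LruCache:
    """Toy size-bounded cache that evicts least recently used entries."""

    def __init__(self, budget_mb):
        self.budget_mb = budget_mb
        self.entries = OrderedDict()  # key -> size in MB, least recent first

    def used_mb(self):
        return sum(self.entries.values())

    def save(self, key, size_mb):
        """Store an entry, then evict old entries until within budget."""
        self.entries[key] = size_mb
        self.entries.move_to_end(key)
        evicted = []
        while self.used_mb() > self.budget_mb:
            old_key, _ = self.entries.popitem(last=False)
            evicted.append(old_key)
        return evicted

    def restore(self, key):
        """Mark an entry as recently used; return whether it exists."""
        if key not in self.entries:
            return False
        self.entries.move_to_end(key)
        return True


max_snapshots = BUDGET_MB // SNAPSHOT_MB
print(max_snapshots)  # 100

cache = LruCache(BUDGET_MB)
evicted = []
for run in range(1, 103):  # 102 workflow runs, one snapshot each
    evicted += cache.save(f"index-run-{run}", SNAPSHOT_MB)

print(evicted)             # ['index-run-1', 'index-run-2']
print(len(cache.entries))  # 100
```

Under these assumptions roughly 100 snapshots fit before anything is evicted, and the oldest ones are the first to go, so a workflow that always restores the newest snapshot is unaffected.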
This lets repository understanding travel with the workflow instead of depending on a separately managed service, while still preserving the main capability: strong answers grounded in the repository’s code, discussions, and history.
Repo Mind Light is designed to give agents and humans the same kind of broad repository understanding explored in Repo Mind, but in a form that can run standalone or plug directly into operational and agentic workflows.
Evaluation - Internal Workflows And Agents
The most interesting evaluation for Repo Mind Light is not just whether it retrieves relevant context in the abstract, but whether it helps people and agents in live operational tasks.
One current use case is incident response for internal teams at GitHub. When incidents come in, those teams use agentic workflows backed by Repo Mind Light to mine repositories for related issues, likely contacts, code references, and historical discussions that can help them get started faster.
Feedback from those teams has been especially useful because it highlights what they actually value:
- related issues that reveal prior investigation and adjacent failures
- code references that anchor the problem in the current implementation
- old issue comments and discussion threads that contain tribal knowledge that ordinary search often misses
That last category is particularly important. Large repositories accumulate a lot of operational memory in places that are technically searchable but practically hard to find at the right moment. Repo Mind Light is useful when it brings that hidden repository memory forward early enough to shape the response.
The same underlying capability is valuable for coding agents. Repo Mind showed that better repository context improves consistency and reliability, especially on broader multi-file tasks. Repo Mind Light is meant to preserve that same practical benefit for agents while moving to a simpler architecture that is easier to run inside real workflows.
This is where the shape of the system matters. Retrieval quality is necessary, but the real test is whether the system fits naturally into a live workflow, where reliability and ease of integration matter as much as raw retrieval quality.
Availability - Public Access
Repo Mind Light is available as a public Docker image that you can pull as ghcr.io/githubnext/repo-mind-light.
The public package page is github.com/orgs/githubnext/packages/container/package/repo-mind-light.
Public documentation lives at githubnext.com/projects/repo-mind-light/.
That image is the main public way to use the system today.
The source repository remains private because the current GraphRAG Zero implementation is proprietary. The public Docker image exposes the practical interface to Repo Mind Light without exposing that source.