Cogniscape stores semantic descriptions of what happened in your codebase — never the code itself.
How we process developer activity
Every event that enters Cogniscape passes through four stages before it reaches the knowledge graph. Each stage reduces the payload to only the semantic information needed for developer intelligence.

Every event type (pull requests, reviews, comments, issues, pushes) has a dedicated processing path that explicitly selects which fields to include. Unknown event types fall back to a conservative default that still excludes all code and sensitive fields.
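The per-event-type selection with a conservative fallback can be pictured as a dispatch table of allow-lists. The following is only a sketch under assumptions: the field keys (`developer`, `commit_messages`, and so on), the selector functions, and the choice of fallback fields are hypothetical names for illustration, not Cogniscape's actual schema or code.

```python
from typing import Any, Callable, Dict

Event = Dict[str, Any]


def _keep(event: Event, allowed: tuple) -> Event:
    # Allow-list projection: a field proceeds only if it is named explicitly.
    return {key: event[key] for key in allowed if key in event}


def select_pull_request(event: Event) -> Event:
    # Hypothetical field keys mirroring a few rows of the pull request table.
    return _keep(event, ("developer", "repository", "action", "number", "title", "state"))


def select_push(event: Event) -> Event:
    # Commit identifiers and file lists are simply never named, so they drop out.
    return _keep(event, ("developer", "repository", "branch", "commit_messages"))


def select_default(event: Event) -> Event:
    # Assumed conservative fallback for unknown event types.
    return _keep(event, ("developer", "repository", "action"))


SELECTORS: Dict[str, Callable[[Event], Event]] = {
    "pull_request": select_pull_request,
    "push": select_push,
}


def process(event_type: str, event: Event) -> Event:
    # Unknown types fall through to the default selector instead of passing raw data.
    return SELECTORS.get(event_type, select_default)(event)
```

Because every path is an allow-list rather than a deny-list, a new upstream field (for example a diff or patch) is excluded unless someone adds it deliberately.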
What we store vs. what we don’t
The tables below show exactly which fields from common developer events are kept and which are discarded.
Pull request events
| Field | Stored | Example of what reaches the graph |
|---|---|---|
| Developer | Yes | "alice" |
| Repository | Yes | "acme/backend" |
| Action | Yes | "opened", "merged" |
| PR number | Yes | 42 |
| Title | Yes | "Add retry logic to payment service" |
| State | Yes | "open", "closed" |
| Branch names | Yes | "feat/retry-payments" |
| Labels | Yes | ["bug", "priority:high"] |
| Assignees / Reviewers | Yes | ["bob", "carol"] |
| PR body (description) | Sanitized | Code blocks and inline code removed; surrounding text kept |
| Diff / changed files content | No | Never captured |
| Raw event payload | No | Always excluded |
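To make the "Sanitized" rows concrete, here is a minimal sketch of stripping fenced code blocks and inline code from a Markdown body while keeping the surrounding prose. It assumes backtick-delimited Markdown; the function name `sanitize_body` and the regular expressions are illustrative assumptions, not the production sanitizer.

```python
import re

# Built from single backticks so this example does not contain a literal fence.
FENCE = "`" * 3

# A fenced block: opening fence, anything (including newlines), closing fence.
FENCED_BLOCK = re.compile(FENCE + r".*?" + FENCE, re.DOTALL)
# An inline code span: single backticks on one line.
INLINE_CODE = re.compile(r"`[^`\n]*`")


def sanitize_body(text: str) -> str:
    """Remove fenced code blocks and inline code; keep the surrounding text."""
    text = FENCED_BLOCK.sub("", text)   # fenced blocks first
    text = INLINE_CODE.sub("", text)    # then remaining inline spans
    text = re.sub(r"[ \t]{2,}", " ", text)   # collapse gaps left by removals
    text = re.sub(r"\n{3,}", "\n\n", text)   # collapse runs of blank lines
    return text.strip()


body = (
    "Adds retry logic.\n\n"
    + FENCE + "ts\nconst x = 1;\n" + FENCE
    + "\n\nUses `retry()` helper with backoff."
)
print(sanitize_body(body))
```

This sketch does not handle an unterminated fence or indented code blocks; a real sanitizer would need to treat those cases conservatively as well.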
Review and comment events
| Field | Stored | Notes |
|---|---|---|
| Review state | Yes | "approved", "changes_requested" |
| Review body | Sanitized | Code blocks removed |
| Comment body | Sanitized | Code blocks removed |
| Code diffs | No | Contains actual code — always excluded |
| File path | Yes | "src/payments/retry.ts" (path only, not content) |
Push events
| Field | Stored | Notes |
|---|---|---|
| Branch | Yes | "main" |
| Commit messages | Yes | Human-written text describing intent |
| Commit identifiers | No | Excluded |
| File lists (added/modified/removed) | No | Excluded |
| File contents | No | Not included in event payloads |
Understanding LLM-reconstructed content
This is the most important section of this document. Even with all code stripped from stored data, you may occasionally see what looks like source code in a Cogniscape MCP response. Here is why.
What the knowledge graph actually contains
When Cogniscape processes a sanitized event, our AI engine extracts entities and facts in natural language. For example, from a pull request review that discusses a timestamp bug fix, the graph might store the following (function and variable names are extracted from PR discussions, not from source code):
Entities:
- addNotification: “A helper function that captures the current ID before incrementing to ensure correct timestamp alignment in notification creation.”
- currentId: “A variable used to generate sequential notification IDs and corresponding timestamp offsets.”
Facts:
- “The addNotification helper was introduced to fix an off-by-one bug where the template literal evaluated before the ID increment.”
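One way to picture these records is as plain data holding only names and natural-language text. The `Entity` and `Fact` classes and the `entities_mentioned` helper below are hypothetical, since the actual graph schema is not described here; the descriptions are the ones from the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    name: str          # identifier as it appeared in PR discussion
    description: str   # natural-language summary, never source code


@dataclass(frozen=True)
class Fact:
    statement: str     # natural-language statement about what happened


entities = [
    Entity(
        "addNotification",
        "A helper function that captures the current ID before incrementing "
        "to ensure correct timestamp alignment in notification creation.",
    ),
    Entity(
        "currentId",
        "A variable used to generate sequential notification IDs and "
        "corresponding timestamp offsets.",
    ),
]

fact = Fact(
    "The addNotification helper was introduced to fix an off-by-one bug "
    "where the template literal evaluated before the ID increment."
)


def entities_mentioned(fact: Fact, known: List[Entity]) -> List[str]:
    # Link a fact to entities by name only; no code is consulted.
    return [e.name for e in known if e.name in fact.statement]
```

Note that nothing in these records is executable: the only code-like content is the identifier names themselves.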
How code-like content appears in responses
When you query the Cogniscape MCP — for example, asking “What technical issues were found in the notifications PR?” — the following happens:
- The Cogniscape MCP searches the knowledge graph and retrieves relevant entities and facts
- These results are passed to your LLM (the one powering your Claude Code, Claude Desktop, or other MCP client)
- Your LLM synthesizes a response from the semantic descriptions
When the retrieved entities carry code identifiers as names (addNotification, currentId) and the fact descriptions are detailed enough to convey the logic, your LLM can reconstruct plausible code as part of its response. It is doing what LLMs do: generating the most helpful answer from the context it received.
The code in such responses is generated by your own LLM at query time, not retrieved from the Cogniscape database. It may not even match your actual implementation — it is the LLM’s best interpretation of the semantic descriptions.
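The retrieval half of this flow can be sketched as follows. This is an illustration under assumptions: a naive keyword match stands in for the real semantic search, and `Record`, `search_graph`, and `build_context` are hypothetical names. The point it shows is that the text handed to the client's LLM consists of descriptions only.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    kind: str   # "entity" or "fact"
    text: str   # natural-language description only


def search_graph(records: List[Record], query: str) -> List[Record]:
    """Naive keyword overlap, standing in for the real semantic search."""
    terms = {term.lower() for term in query.split() if len(term) > 3}
    return [r for r in records if any(t in r.text.lower() for t in terms)]


def build_context(results: List[Record]) -> str:
    """Format the retrieved descriptions as the text passed to the client's LLM."""
    return "\n".join(f"[{r.kind}] {r.text}" for r in results)


records = [
    Record("entity", "addNotification: A helper function that captures the current ID "
                     "before incrementing to ensure correct timestamp alignment in "
                     "notification creation."),
    Record("fact", "The addNotification helper was introduced to fix an off-by-one bug "
                   "where the template literal evaluated before the ID increment."),
    Record("fact", "The billing service was migrated to a new region."),
]

context = build_context(search_graph(records, "addNotification off-by-one bug"))
print(context)
```

Any function body that later appears in an answer is produced by the LLM from this prose context, which is why it can differ from the real implementation.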
A concrete example
Here is what is stored in the graph versus what your LLM might generate:
- What Cogniscape stores
- What your LLM might generate
Security by design
Cogniscape’s data protection is enforced at multiple layers, ensuring that no single point of failure can expose source code.

| Layer | Protection |
|---|---|
| Event reception | Only selected event types are accepted; others are rejected |
| Normalization | Raw payloads are discarded — only structured metadata fields proceed |
| Sanitization | Code blocks, inline code, diff hunks, and sensitive fields are stripped |
| Knowledge graph | AI extracts natural-language entities and facts, not code |
| Cogniscape MCP | Returns semantic search results; any code in the final response is generated by the client’s own LLM |
Questions?
If you have questions about how Cogniscape handles your data, contact us at [email protected].