Setup Context Based Collision Resolution V1#2580
Draft
GeorgeNgMsft wants to merge 12 commits into
Draft
Conversation
…, scorer, decision) Deterministic, LLM-free core for context-weighted collision resolution (design: docs/architecture/collision/context-weighted-collision-resolution-design.md). All pure/leaf modules under src/context/contextSelector/, not yet wired in. Two extensibility seams the design centers on: - ConversationSignalSource (context-vector creation) - v1 RingBufferSignalSource (raw-token ring buffer, lambda=0.9/N=20 decay) swappable for knowPro entities. - CollisionScorer (scoring) - v1 TfIdfScorer (candidate-local IDF) swappable for knowPro-entity / embedding-similarity scorers; same CandidateScore contract. Modules: - tokenize.ts: pinned canonicalizer/tokenizer with protected patterns (C#, .NET, A1:B2). - conversationSignal.ts: ContextVector + source seam + decayed ring buffer (history-only). - keywordVector/keywordExtractor/keywordSidecar/keywordIndex: derived lexical floor plus live-tunable collision-keywords.json overrides (effective = derived + add - remove). - scorer.ts: candidate-local IDF TF-IDF over flattened keyword sets. - decision.ts: coverage + evidence gate + margin with quantization and total ordering. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Integrate the engine as a deterministic pre-strategy tier on the grammarMatch path (design section 11): a confident topical pick resolves on the cache path (no LLM); abstain falls through to the configured grammar strategy (default) or escalates to LLM translation. - session.ts: add collision.contextSelector config block + defaults (detect off, windowTurns 20, decay 0.9, minUniqueTokens 2, minMass 1.0, margin 1.0, abstainFallback defer-to-strategy). Only detect is @config-exposed. - matchContextSelector.ts: orchestrator - adapts validated MatchResults to scorer candidates, runs signal -> keywords -> scorer -> decision, emits telemetry, returns the winning match + U-2 affordance note or abstains. - collisionTelemetry.ts: add "context-weight" strategy label + optional per-candidate matchedTokens for explainable events. - commandHandlerContext.ts: construct conversationSignal (ring buffer), contextSelectorSidecar, contextSelectorKeywords (KeywordIndex); reset signal + invalidate keyword cache on session switch. - matchRequest.ts: insert the tier between registry-first and the grammar strategy. - interpretRequest.ts: record each completed turn into the signal (history-only). - historyCommandHandler.ts / systemAgent.ts: reset the signal on `@history clear` and clear-deep. Off by default; detect:false preserves byte-identical legacy behavior. Builds green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ords surface - configCommandHandlers.ts: `@config collision contextSelector detect [on|off]` toggle (only detect is exposed, per §11.3); render the contextSelector row in both the HTML and text `@config collision` views; include it in the anyOn summary. - collisionKeywordHandlers.ts: `@collision keywords <schema.action> add|remove|list|clear` to inspect derived + override keywords and edit the collision-keywords.json sidecar (§5.3). `list` with no arg lists all overrides; with a target shows derived, delta, and merged effective sets. - collisionCommandHandlers.ts: register the `keywords` sub-command. The U-2 reroute affordance (displayInfo) shipped with the integration commit. Builds green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rator 48 specs across 6 suites, all deterministic and LLM-free: - contextSelectorTokenize: protected patterns, stopword/verb drops, de-camel, NFKC. - contextSelectorSignal: decay math (design §14 Scenario 1), window cap, history-only, reset, within-turn multiplicity. - contextSelectorScorer: candidate-local IDF (disc=1 unique, 0 shared, graduated), matched-token detail, stable ordering. - contextSelectorDecision: coverage / no-signal / min-unique-tokens / min-mass / margin gates, total ordering, quantize; §14 Scenario 1 (resolve) + Scenario 2 (abstain). - contextSelectorKeywords: lexical extraction, param walk (arrays/refs), sidecar add/remove/clear + canonicalization + disk reload + malformed-degrades-to-empty, index derive/memoize/effective/invalidate. - contextSelectorResolve: end-to-end resolve + abstain (coverage/margin/no-signal) with telemetry assertions. Full package suite green: 66 suites / 1032 tests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update the dispatcher README "Action Collision Detection" section: - add the contextSelector config block to the schema listing; - new "Context-weighted resolution (contextSelector)" subsection covering the per-collision pipeline (context vector -> keywords -> TF-IDF -> decide), the two extensibility seams, telemetry/affordance, and the abstain fallback; - add the `@config collision contextSelector detect` and `@collision keywords` rows to the shell command table. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ector
Addresses two independent Round-1 reviews (code-review + rubber-duck).
Correctness (code-review, Medium):
- keywordIndex: stop memoizing a missing definition (queried-before-load actions
were cached empty forever); invalidate the derived-keyword cache for an agent's
schemas on in-place `reloadAgentSchema` (learn/remember-how-to flows) so the
scorer re-extracts from reloaded schema text instead of stale keywords.
Extensibility seam (rubber-duck, blocking):
- Add ContextResolutionStrategy (strategy.ts) bundling scoring + decision policy
so an embedding / knowPro scorer swaps as one unit; the orchestrator now depends
only on the seam, not on TF-IDF internals. TfIdfStrategy = TfIdfScorer + the
count-based evidence gate.
Tuning / scope (rubber-duck, non-blocking):
- Retune conservative defaults to the λ=0.9 scale: minMass 1.0->0.75, margin
1.0->0.5 (margin 1.0 pathologically abstained "2 fresh vs 1" cases); add
boundary fixtures.
- Stop dropping domain nouns that double as verbs ("list", "search") as generic
verbs — "list" is the named excel↔list scenario keyword.
- Document that ContextVector is the knowPro projection target (design §9 step 2)
and that v1 keyword source = lexical floor + manual sidecar (distillation +
auto-derived layers are follow-ups) in README + code comments.
Build green; contextSelector suite 52/52.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ector
Addresses Round-2 code-review + rubber-duck.
Correctness:
- matchContextSelector now returns a 3-way outcome (resolve/abstain/skip). "skip"
(< 2 distinct candidates — e.g. a tiedHeuristics tie between two constructions of
the SAME action) no longer collapses into "abstain", so `escalate-to-llm` can't
send a confident single-action match to the LLM (code-review, Medium).
- matchRequest: `defer-to-strategy` on abstain now runs the configured grammar
strategy even when `grammarMatch.detect` is off (design §11.1 abstain semantics);
"skip" falls through to today's behavior, never escalates (rubber-duck, blocking).
- Emit `firstMatchCandidate` on contextSelector events (reuse exported toCandidate)
so the rollout can compare treatment vs first-match control when contextSelector
short-circuits the strategy (rubber-duck, blocking).
- On duplicate (schema, action), keep the heuristically-best MatchResult, not the
first seen (rubber-duck).
Extensibility seam (rubber-duck, blocking):
- Stop the strategy leaking TF-IDF vocabulary into the orchestrator: CandidateScore
`uniqueTokenCount`/`matched` are now optional evidence; the strategy returns a
`winnerNote` phrase; the user affordance is generic ("↪ routed to X — recent
topic"). An embedding strategy is now a clean drop-in (score-only + its own note).
Tuning:
- minMass 0.75 -> 1.0 (Round 1 over-corrected); at λ=0.9 this bounds a stale two-token
turn to ~age 7 so an old topic stops silently resolving; added staleness + dedup
fixtures.
Build green; contextSelector 54/54; full dispatcher suite 66 suites / 1038 tests.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…genericity
Round-3 code-review found no feature issues; round-3 rubber-duck raised one
non-blocking residual: the strategy seam was clean at the orchestrator/telemetry
layer but the shared decision types still carried TF-IDF vocabulary.
- ContextResolutionStrategy is now generic over its config type
(`<C = DecisionConfig>`), so a non-lexical strategy defines its own thresholds
rather than being handed TF-IDF's minUniqueTokens/minMass/margin.
- ContextSelectorDecision.reason widened to `string` (AbstainReason documents the
count-based strategy's values) so a strategy can emit its own abstain reason.
- Add contextSelectorStrategy.spec.ts proving an alternate "similarity" strategy
with its own config type, reason ("similarity-floor"), and no lexical evidence
fields satisfies the seam with zero changes to the engine/decision/orchestrator.
contextSelector 57/57; full dispatcher suite 67 suites / 1041 tests green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a deterministic, LLM-free tier to the dispatcher's grammar/cache collision path. When two or more agents match the same input,
contextSelectorranks them by how well their keywords match the recent conversation and either resolves the collision on the cache path (skipping the LLM translation) or abstains to today's behavior. Off by default (collision.contextSelector.detect: false) — with detection off, behavior is byte-identical to legacy first-match.How it works
Per collision: build a recency-decayed keyword map of recent user turns (history-only); score each candidate's keywords against it with candidate-local IDF (tokens all candidates share cancel, unique ones distinguish); resolve only when a coverage guard, an evidence gate, and a clear-winner margin all pass — otherwise abstain. Every decision is explainable from counts and emitted as
context-weighttelemetry.Extensibility (two seams)
ConversationSignalSource) — swap the v1 raw-token ring buffer for knowPro topics/entities behind the sameContextVector.ContextResolutionStrategy) — swap TF-IDF for a knowPro-entity or embedding-similarity scorer as a self-contained unit.Surfaces
@config collision contextSelector detect on;@collision keywords <schema.action> add|remove|list|clear;context-weighttelemetry via@collision events/ per-session JSONL / Cosmos (when enabled).Scope & safety
Abstain never routes worse than today. v1 keyword source is a lexical floor + manual sidecar (LLM distillation and auto-derived layers are follow-ups). Thresholds are conservative, unbenchmarked defaults to calibrate on fixtures.
Testing & review
Full dispatcher suite green (67 suites / 1041 tests, 57 new). Three adversarial review rounds; all blocking findings fixed, final round clean.
Rollout
Merge with
detect: false→ calibrate thresholds on fixtures → flipdetect: on.