Setup Context Based Collision Resolution V1 #2580

Draft
GeorgeNgMsft wants to merge 12 commits into main from dev/georgeng/contxt_detection_v1
@GeorgeNgMsft

Summary

Adds a deterministic, LLM-free tier to the dispatcher's grammar/cache collision path. When two or more agents match the same input, contextSelector ranks them by how well their keywords match the recent conversation. It then either resolves the collision on the cache path (skipping the LLM translation) or abstains and falls back to today's behavior. Off by default (collision.contextSelector.detect: false). With detection off, behavior is byte-identical to legacy first-match.

How it works

Per collision:

  • Build a recency-decayed keyword map of recent user turns (history-only).
  • Score each candidate's keywords against it with candidate-local IDF: tokens that all candidates share cancel out, and tokens unique to one candidate distinguish it.
  • Resolve only when a coverage guard, an evidence gate, and a clear-winner margin all pass; otherwise abstain.

Every decision is explainable from counts and emitted as context-weight telemetry.
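
For reviewers who want the scoring step in concrete terms, here is a minimal sketch of candidate-local IDF. It is not the code in scorer.ts: the type shapes, the `discriminator` helper, and its linear form (1 for a token owned by one candidate, 0 for a token every candidate shares) are assumptions chosen to match the "disc=1 unique, 0 shared, graduated" behavior named in the test commit.

```typescript
// Illustrative sketch only; names and shapes are assumptions, not the PR's API.
type ContextVector = Map<string, number>; // token -> recency-decayed weight

interface Candidate {
    id: string; // e.g. "excel.addRow" (hypothetical id)
    keywords: string[]; // flattened keyword set for the action
}

interface ScoredCandidate {
    id: string;
    score: number;
    matched: string[]; // tokens that contributed to the score
}

// Assumed discriminator: 1 when only one candidate has the token,
// 0 when all n candidates share it, linear in between.
function discriminator(df: number, n: number): number {
    return n <= 1 ? 0 : (n - df) / (n - 1);
}

function scoreCandidates(
    context: ContextVector,
    candidates: Candidate[],
): ScoredCandidate[] {
    const n = candidates.length;

    // Document frequency is computed over the colliding candidates only,
    // which is what makes the IDF "candidate-local".
    const df = new Map<string, number>();
    for (const c of candidates) {
        new Set(c.keywords).forEach((k) => df.set(k, (df.get(k) ?? 0) + 1));
    }

    return candidates.map((c) => {
        let score = 0;
        const matched: string[] = [];
        new Set(c.keywords).forEach((k) => {
            const weight = context.get(k) ?? 0;
            const disc = discriminator(df.get(k) ?? n, n);
            if (weight > 0 && disc > 0) {
                score += weight * disc;
                matched.push(k);
            }
        });
        return { id: c.id, score, matched };
    });
}
```

With two candidates that both carry "list", that token contributes nothing to either score, so only the tokens that separate the candidates can produce a winner.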

Extensibility (two seams)

  • Context source (ConversationSignalSource) — swap the v1 raw-token ring buffer for knowPro topics/entities behind the same ContextVector.
  • Scorer (ContextResolutionStrategy) — swap TF-IDF for a knowPro-entity or embedding-similarity scorer as a self-contained unit.

Surfaces

@config collision contextSelector detect on; @collision keywords <schema.action> add|remove|list|clear; context-weight telemetry via @collision events / per-session JSONL / Cosmos (when enabled).

Scope & safety

An abstain never routes worse than today, because it falls back to the existing behavior. The v1 keyword source is a lexical floor plus a manual sidecar (LLM distillation and auto-derived layers are follow-ups). Thresholds are conservative, unbenchmarked defaults, to be calibrated on fixtures.

Testing & review

Full dispatcher suite green (67 suites / 1041 tests, 57 new). Three adversarial review rounds; all blocking findings fixed, final round clean.

Rollout

Merge with detect: false → calibrate thresholds on fixtures → flip detect: on.

GeorgeNgMsft and others added 12 commits July 1, 2026 01:04
…, scorer, decision)

Deterministic, LLM-free core for context-weighted collision resolution
(design: docs/architecture/collision/context-weighted-collision-resolution-design.md).
All pure/leaf modules under src/context/contextSelector/, not yet wired in.

Two extensibility seams the design centers on:
- ConversationSignalSource (context-vector creation) - v1 RingBufferSignalSource
  (raw-token ring buffer, lambda=0.9/N=20 decay) swappable for knowPro entities.
- CollisionScorer (scoring) - v1 TfIdfScorer (candidate-local IDF) swappable for
  knowPro-entity / embedding-similarity scorers; same CandidateScore contract.
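
A rough sketch of the context-vector seam and a decayed ring buffer behind it, for orientation. The interface methods, the class name `DecayedRingBuffer`, and the weighting convention (newest recorded turn has age 0 and weight 1; repeated tokens in a turn add up) are assumptions for illustration, not the signatures in conversationSignal.ts.

```typescript
// Illustrative sketch only; the real seam in the PR may look different.
type ContextVector = Map<string, number>; // token -> decayed weight

interface ConversationSignalSource {
    recordTurn(tokens: string[]): void; // called after a turn completes (history-only)
    snapshot(): ContextVector;
    reset(): void;
}

class DecayedRingBuffer implements ConversationSignalSource {
    private turns: string[][] = [];
    private readonly windowTurns: number;
    private readonly decay: number;

    constructor(windowTurns = 20, decay = 0.9) {
        this.windowTurns = windowTurns;
        this.decay = decay;
    }

    recordTurn(tokens: string[]): void {
        this.turns.push(tokens.slice());
        // Drop the oldest turn once the window is full.
        if (this.turns.length > this.windowTurns) {
            this.turns.shift();
        }
    }

    snapshot(): ContextVector {
        const vector: ContextVector = new Map();
        const newest = this.turns.length - 1;
        this.turns.forEach((tokens, index) => {
            const age = newest - index; // 0 for the most recent turn
            const weight = Math.pow(this.decay, age);
            for (const token of tokens) {
                vector.set(token, (vector.get(token) ?? 0) + weight);
            }
        });
        return vector;
    }

    reset(): void {
        this.turns = [];
    }
}
```

A knowPro-backed source would implement the same three methods and project topics or entities into the same `ContextVector`, leaving the scorer untouched.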

Modules:
- tokenize.ts: pinned canonicalizer/tokenizer with protected patterns (C#, .NET, A1:B2).
- conversationSignal.ts: ContextVector + source seam + decayed ring buffer (history-only).
- keywordVector/keywordExtractor/keywordSidecar/keywordIndex: derived lexical floor
  plus live-tunable collision-keywords.json overrides (effective = derived + add - remove).
- scorer.ts: candidate-local IDF TF-IDF over flattened keyword sets.
- decision.ts: coverage + evidence gate + margin with quantization and total ordering.
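
The override rule "effective = derived + add - remove" can be sketched as a small set operation. The `KeywordOverride` shape is a hypothetical stand-in for an entry in collision-keywords.json, and letting `remove` win when a token appears in both lists is an assumption of this sketch.

```typescript
// Illustrative sketch only; the sidecar's real schema is not shown in this PR text.
interface KeywordOverride {
    add?: string[];
    remove?: string[];
}

function effectiveKeywords(
    derived: string[],
    override?: KeywordOverride,
): string[] {
    const result = new Set(derived);
    for (const token of override?.add ?? []) {
        result.add(token);
    }
    // Applied last, so a token listed in both add and remove ends up removed.
    for (const token of override?.remove ?? []) {
        result.delete(token);
    }
    return Array.from(result).sort();
}
```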

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Integrate the engine as a deterministic pre-strategy tier on the grammarMatch
path (design section 11): a confident topical pick resolves on the cache path
(no LLM); abstain falls through to the configured grammar strategy (default) or
escalates to LLM translation.

- session.ts: add collision.contextSelector config block + defaults
  (detect off, windowTurns 20, decay 0.9, minUniqueTokens 2, minMass 1.0,
  margin 1.0, abstainFallback defer-to-strategy). Only detect is @config-exposed.
- matchContextSelector.ts: orchestrator - adapts validated MatchResults to
  scorer candidates, runs signal -> keywords -> scorer -> decision, emits
  telemetry, returns the winning match + U-2 affordance note or abstains.
- collisionTelemetry.ts: add "context-weight" strategy label + optional
  per-candidate matchedTokens for explainable events.
- commandHandlerContext.ts: construct conversationSignal (ring buffer),
  contextSelectorSidecar, contextSelectorKeywords (KeywordIndex); reset signal +
  invalidate keyword cache on session switch.
- matchRequest.ts: insert the tier between registry-first and the grammar strategy.
- interpretRequest.ts: record each completed turn into the signal (history-only).
- historyCommandHandler.ts / systemAgent.ts: reset the signal on `@history clear`
  and clear-deep.

Off by default; detect:false preserves byte-identical legacy behavior. Builds green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ords surface

- configCommandHandlers.ts: `@config collision contextSelector detect [on|off]`
  toggle (only detect is exposed, per §11.3); render the contextSelector row in
  both the HTML and text `@config collision` views; include it in the anyOn summary.
- collisionKeywordHandlers.ts: `@collision keywords <schema.action> add|remove|list|clear`
  to inspect derived + override keywords and edit the collision-keywords.json sidecar
  (§5.3). `list` with no arg lists all overrides; with a target shows derived, delta,
  and merged effective sets.
- collisionCommandHandlers.ts: register the `keywords` sub-command.

The U-2 reroute affordance (displayInfo) shipped with the integration commit. Builds green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rator

48 specs across 6 suites, all deterministic and LLM-free:
- contextSelectorTokenize: protected patterns, stopword/verb drops, de-camel, NFKC.
- contextSelectorSignal: decay math (design §14 Scenario 1), window cap, history-only,
  reset, within-turn multiplicity.
- contextSelectorScorer: candidate-local IDF (disc=1 unique, 0 shared, graduated),
  matched-token detail, stable ordering.
- contextSelectorDecision: coverage / no-signal / min-unique-tokens / min-mass / margin
  gates, total ordering, quantize; §14 Scenario 1 (resolve) + Scenario 2 (abstain).
- contextSelectorKeywords: lexical extraction, param walk (arrays/refs), sidecar
  add/remove/clear + canonicalization + disk reload + malformed-degrades-to-empty,
  index derive/memoize/effective/invalidate.
- contextSelectorResolve: end-to-end resolve + abstain (coverage/margin/no-signal) with
  telemetry assertions.
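
To make the gate order in the decision specs easier to follow, here is a hedged sketch of a count-based decision. The abstain reasons come from the list above; everything else (the `keywordCount` field, treating coverage as "every candidate has keywords", three-decimal quantization, the id tie-break) is an assumption and may not match decision.ts.

```typescript
// Illustrative sketch only; gate semantics here are assumptions.
interface CandidateScore {
    id: string;
    score: number; // decayed mass matched by this candidate
    uniqueTokenCount: number; // distinct discriminating tokens matched
    keywordCount: number; // size of the candidate's keyword set (hypothetical field)
}

interface DecisionConfig {
    minUniqueTokens: number;
    minMass: number;
    margin: number;
}

type AbstainReason =
    | "coverage"
    | "no-signal"
    | "min-unique-tokens"
    | "min-mass"
    | "margin";

type Decision =
    | { kind: "resolve"; winner: string }
    | { kind: "abstain"; reason: AbstainReason };

// Quantize so float noise cannot flip a comparison between runs.
const quantize = (x: number): number => Math.round(x * 1000) / 1000;

function decide(scores: CandidateScore[], config: DecisionConfig): Decision {
    // Coverage guard: a candidate with no keywords cannot be compared fairly.
    if (scores.some((s) => s.keywordCount === 0)) {
        return { kind: "abstain", reason: "coverage" };
    }
    // Total ordering: score descending, then id ascending as a tie-break.
    const ranked = scores
        .map((s) => ({ ...s, score: quantize(s.score) }))
        .sort((a, b) => b.score - a.score || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
    const top = ranked[0];
    const runnerUp = ranked[1];
    if (top === undefined || top.score === 0) {
        return { kind: "abstain", reason: "no-signal" };
    }
    if (top.uniqueTokenCount < config.minUniqueTokens) {
        return { kind: "abstain", reason: "min-unique-tokens" };
    }
    if (top.score < config.minMass) {
        return { kind: "abstain", reason: "min-mass" };
    }
    if (quantize(top.score - (runnerUp?.score ?? 0)) < config.margin) {
        return { kind: "abstain", reason: "margin" };
    }
    return { kind: "resolve", winner: top.id };
}
```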

Full package suite green: 66 suites / 1032 tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update the dispatcher README "Action Collision Detection" section:
- add the contextSelector config block to the schema listing;
- new "Context-weighted resolution (contextSelector)" subsection covering the
  per-collision pipeline (context vector -> keywords -> TF-IDF -> decide), the two
  extensibility seams, telemetry/affordance, and the abstain fallback;
- add the `@config collision contextSelector detect` and `@collision keywords`
  rows to the shell command table.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ector

Addresses two independent Round-1 reviews (code-review + rubber-duck).

Correctness (code-review, Medium):
- keywordIndex: stop memoizing a missing definition (queried-before-load actions
  were cached empty forever); invalidate the derived-keyword cache for an agent's
  schemas on in-place `reloadAgentSchema` (learn/remember-how-to flows) so the
  scorer re-extracts from reloaded schema text instead of stale keywords.

Extensibility seam (rubber-duck, blocking):
- Add ContextResolutionStrategy (strategy.ts) bundling scoring + decision policy
  so an embedding / knowPro scorer swaps as one unit; the orchestrator now depends
  only on the seam, not on TF-IDF internals. TfIdfStrategy = TfIdfScorer + the
  count-based evidence gate.

Tuning / scope (rubber-duck, non-blocking):
- Retune conservative defaults to the λ=0.9 scale: minMass 1.0->0.75, margin
  1.0->0.5 (margin 1.0 pathologically abstained "2 fresh vs 1" cases); add
  boundary fixtures.
- Stop dropping domain nouns that double as verbs ("list", "search") as generic
  verbs — "list" is the named excel↔list scenario keyword.
- Document that ContextVector is the knowPro projection target (design §9 step 2)
  and that v1 keyword source = lexical floor + manual sidecar (distillation +
  auto-derived layers are follow-ups) in README + code comments.

Build green; contextSelector suite 52/52.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ector

Addresses Round-2 code-review + rubber-duck.

Correctness:
- matchContextSelector now returns a 3-way outcome (resolve/abstain/skip). "skip"
  (< 2 distinct candidates — e.g. a tiedHeuristics tie between two constructions of
  the SAME action) no longer collapses into "abstain", so `escalate-to-llm` can't
  send a confident single-action match to the LLM (code-review, Medium).
- matchRequest: `defer-to-strategy` on abstain now runs the configured grammar
  strategy even when `grammarMatch.detect` is off (design §11.1 abstain semantics);
  "skip" falls through to today's behavior, never escalates (rubber-duck, blocking).
- Emit `firstMatchCandidate` on contextSelector events (reuse exported toCandidate)
  so the rollout can compare treatment vs first-match control when contextSelector
  short-circuits the strategy (rubber-duck, blocking).
- On duplicate (schema, action), keep the heuristically-best MatchResult, not the
  first seen (rubber-duck).
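
The resolve/abstain/skip split can be pictured as a discriminated union plus a small routing function. This is a sketch of the semantics described in this commit message, not the code in matchRequest.ts; the `NextStep` labels and both helper names are hypothetical.

```typescript
// Illustrative sketch only; type and function names are assumptions.
type SelectorOutcome =
    | { kind: "resolve"; winner: string }
    | { kind: "abstain"; reason: string }
    | { kind: "skip" }; // fewer than 2 distinct (schema, action) candidates

type AbstainFallback = "defer-to-strategy" | "escalate-to-llm";

type NextStep =
    | "use-winner" // resolve on the cache path, no LLM
    | "grammar-strategy" // run the configured grammar strategy
    | "llm-translation" // escalate to LLM translation
    | "legacy-first-match"; // today's behavior

interface MatchKey {
    schema: string;
    action: string;
}

// Two constructions of the same action count as one candidate.
function distinctActionCount(matches: MatchKey[]): number {
    return new Set(matches.map((m) => `${m.schema}.${m.action}`)).size;
}

function nextStep(outcome: SelectorOutcome, fallback: AbstainFallback): NextStep {
    switch (outcome.kind) {
        case "resolve":
            return "use-winner";
        case "skip":
            // Never escalates: there was no real collision to resolve.
            return "legacy-first-match";
        case "abstain":
            return fallback === "escalate-to-llm"
                ? "llm-translation"
                : "grammar-strategy";
    }
}
```

Keeping "skip" separate is what prevents `escalate-to-llm` from firing on a confident single-action match.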

Extensibility seam (rubber-duck, blocking):
- Stop the strategy leaking TF-IDF vocabulary into the orchestrator: CandidateScore
  `uniqueTokenCount`/`matched` are now optional evidence; the strategy returns a
  `winnerNote` phrase; the user affordance is generic ("↪ routed to X — recent
  topic"). An embedding strategy is now a clean drop-in (score-only + its own note).

Tuning:
- minMass 0.75 -> 1.0 (Round 1 over-corrected); at λ=0.9 this bounds a stale two-token
  turn to ~age 7 so an old topic stops silently resolving; added staleness + dedup
  fixtures.
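
The staleness bound can be checked with a few lines of arithmetic. This sketch assumes the mass of a turn is tokenCount × λ^age, with each matched token fully discriminating and weighted 1 at age 0; `lastPassingAge` is a hypothetical helper, not part of the PR.

```typescript
// Illustrative sketch only; assumes mass = tokenCount * decay^age.
function lastPassingAge(tokenCount: number, decay: number, minMass: number): number {
    if (!(decay > 0 && decay < 1) || minMass <= 0) {
        throw new RangeError("expected 0 < decay < 1 and minMass > 0");
    }
    // Returns -1 when even a fresh turn (age 0) does not reach minMass.
    let age = -1;
    while (tokenCount * Math.pow(decay, age + 1) >= minMass) {
        age += 1;
    }
    return age;
}

// Two tokens at decay 0.9:
//   age 6 -> 2 * 0.9^6 ≈ 1.063 (clears minMass 1.0)
//   age 7 -> 2 * 0.9^7 ≈ 0.957 (falls below minMass 1.0)
const lastAgeAtMinMassOne = lastPassingAge(2, 0.9, 1.0); // 6
const lastAgeAtMinMassThreeQuarters = lastPassingAge(2, 0.9, 0.75); // 9
```

Under these assumptions, raising minMass from 0.75 back to 1.0 moves the cutoff for a two-token turn from age 9 to age 6, so the turn first fails the gate at age 7.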

Build green; contextSelector 54/54; full dispatcher suite 66 suites / 1038 tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…genericity

Round-3 code-review found no feature issues; round-3 rubber-duck raised one
non-blocking residual: the strategy seam was clean at the orchestrator/telemetry
layer but the shared decision types still carried TF-IDF vocabulary.

- ContextResolutionStrategy is now generic over its config type
  (`<C = DecisionConfig>`), so a non-lexical strategy defines its own thresholds
  rather than being handed TF-IDF's minUniqueTokens/minMass/margin.
- ContextSelectorDecision.reason widened to `string` (AbstainReason documents the
  count-based strategy's values) so a strategy can emit its own abstain reason.
- Add contextSelectorStrategy.spec.ts proving an alternate "similarity" strategy
  with its own config type, reason ("similarity-floor"), and no lexical evidence
  fields satisfies the seam with zero changes to the engine/decision/orchestrator.
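
A sketch of what the generic seam allows, in the spirit of the new spec. Only the names `ContextResolutionStrategy`, `DecisionConfig`, `CandidateScore`, `ContextSelectorDecision`, `winnerNote`, and the "similarity-floor" reason come from this PR; the field layout, the `resolve` method, and `SimilarityConfig` are assumptions for illustration.

```typescript
// Illustrative sketch only; member names and shapes are assumptions.
interface DecisionConfig {
    minUniqueTokens: number;
    minMass: number;
    margin: number;
}

interface CandidateScore {
    id: string;
    score: number;
    uniqueTokenCount?: number; // optional lexical evidence
    matched?: string[]; // optional lexical evidence
}

interface ContextSelectorDecision {
    resolved: boolean;
    winner?: string;
    reason?: string; // free-form so each strategy can name its own abstain reason
    winnerNote?: string;
}

// Generic over its config, defaulting to the count-based thresholds.
interface ContextResolutionStrategy<C = DecisionConfig> {
    readonly name: string;
    resolve(candidates: CandidateScore[], config: C): ContextSelectorDecision;
}

// A non-lexical strategy brings its own config type and abstain reason.
interface SimilarityConfig {
    floor: number;
}

const similarityStrategy: ContextResolutionStrategy<SimilarityConfig> = {
    name: "similarity",
    resolve(candidates, config) {
        const ranked = candidates.slice().sort((a, b) => b.score - a.score);
        const top = ranked[0];
        if (top === undefined || top.score < config.floor) {
            return { resolved: false, reason: "similarity-floor" };
        }
        return { resolved: true, winner: top.id, winnerNote: "recent topic" };
    },
};
```

The strategy never touches `minUniqueTokens`, `minMass`, or `margin`, which is the property the generic parameter is meant to guarantee.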

contextSelector 57/57; full dispatcher suite 67 suites / 1041 tests green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>