feat(detectors): add YahooOAuth detector #5094

Open
deerajcm wants to merge 4 commits into trufflesecurity:main from deerajcm:add-yahoooauth-detector

deerajcm commented Jun 30, 2026

Summary

Add detector for Yahoo OAuth access and refresh tokens.

Changes

  • Add YahooOAuth detector for OAuth tokens
  • Pattern: Long access tokens (800-1500 chars) and refresh tokens (60-120 chars)
  • Keywords: "yahoo", "oauth", "access_token", "yahoo_token"
  • API verification via https://api.login.yahoo.com/openid/v1/userinfo
  • Add enum YahooOAuth = 1066 to proto definitions
  • Register detector in defaults.go
  • Add comprehensive unit tests

Detector Details

Type: Yahoo OAuth Tokens
Patterns:

  • Access Token: Very long tokens (800-1500 chars) with alphanumeric, dots, underscores, hyphens
  • Refresh Token: Shorter tokens (60-120 chars) with alphanumeric, dots, underscores, hyphens, tildes

Keywords: "yahoo", "oauth", "access_token", "yahoo_token", "yahoooauth"

Verification:

  • Access tokens: Verified via Yahoo OpenID userinfo endpoint
  • HTTP 200 = valid token
  • HTTP 401 = invalid/expired token

Use Case: Detect exposed Yahoo OAuth tokens from config files, environment variables, or code

Token Types Detected

Access Token

  • Format: OfV5iMac7gx6SGNFLAmsFVMTP17EmgpfI4nFJTDaFvHur3Oxg6mVni4Lt... (very long)
  • Length: 800-1500 characters
  • Use: API authentication, accessing user data
  • Verification: Live API call to Yahoo userinfo endpoint

Refresh Token

  • Format: AOahQ2qfcSxRRa1r4EDFhCDdsx0y~001~Fj.vO_OAW2IXbqFqc8gK3e0wJdTsx6kulrM-
  • Length: 60-120 characters
  • Use: Obtaining new access tokens
  • Verification: Format validation (API verification requires client credentials)

Test Results

✅ valid_yahoo_oauth_access_token - PASS
✅ valid_yahoo_oauth_refresh_token - PASS
✅ invalid_too_short - PASS

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Yahoo verification performs live outbound auth calls with discovered tokens, and the proto enum renumbering can break compatibility if anything still expects the old 1053–1055 detector IDs.
> 
> **Overview**
> Adds two new default secret detectors and wires them through the detector type registry.
> 
> A **BasicAuth** scanner finds `Authorization` / `auth` **Basic** base64 blobs, decodes them, requires a non-empty `username:password`, and reports **SecretParts** while leaving findings **unverified** (no target URL to probe). A **YahooOAuth** scanner matches long access tokens (800–1500 chars) and shorter refresh tokens (60–120 chars), dedupes refresh substrings of access tokens, and optionally **verifies** access tokens against Yahoo’s OpenID **userinfo** endpoint with the no-local-address HTTP client.
> 
> Both scanners are registered in `defaults.go`. **`detector_type.proto`** / generated enums gain **`BasicAuth` (1057)** and **`YahooOAuth` (1072)** and reassign numeric IDs **1053–1059** (e.g. replacing prior **1053–1055** names like BrainTrust/PgAnalyze/RedHat with Shippo, VisibleNpm, Bcrypt, Base64PrivateKey, Duo, DockerSwarm, etc.).
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit fa5ca2b6942f4aec977aade6ca92c9c4c8d18801. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Deeraj CM and others added 3 commits June 22, 2026 10:26
Add detector for HTTP Basic Authentication tokens (BSCAU002).
Detects Authorization: Basic <base64> patterns and decodes them
to extract username:password credentials.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
feat(detectors): add BasicAuth detector
Add detector for Yahoo OAuth access and refresh tokens.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
deerajcm requested a review from a team June 30, 2026 11:33
deerajcm requested review from a team as code owners June 30, 2026 11:33
CLAassistant commented Jun 30, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ deerajcm
❌ Deeraj CM


Deeraj CM does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit d3843f2. Configure here.

// Yahoo OAuth Refresh Token pattern
// Shorter tokens (60-120 chars) with alphanumeric, dots, underscores, hyphens, tildes
// Example: AOahQ2qfcSxRRa1r4EDFhCDdsx0y~001~Fj.vO_OAW2IXbqFqc8gK3e0wJdTsx6kulrM-
refreshTokenPat = regexp.MustCompile(`\b([A-Za-z0-9][A-Za-z0-9._~-]{59,119})\b`)

Refresh token regex truncates tokens ending with non-word chars

High Severity

The \b word boundary at the end of refreshTokenPat is incompatible with the character class [A-Za-z0-9._~-], which includes non-word characters (-, ., ~). In RE2, \b only considers [0-9A-Za-z_] as word characters. When a token ends with -, ., or ~ and is followed by a common delimiter like ", }, or whitespace, no word boundary exists at that position, so the match ends earlier, at the last position that is a word boundary, and the trailing non-word characters are dropped. The PR description's own example token AOahQ2qfcSxRRa1r4EDFhCDdsx0y~001~Fj.vO_OAW2IXbqFqc8gK3e0wJdTsx6kulrM- would be captured without the trailing -. The test avoids this by using a token ending in ABC.
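
The truncation can be reproduced with the quoted pattern and the example token from the PR description. The sketch below also shows one possible alternative, which is an assumption and not the PR's fix: `noTrailingBoundaryPat` and `firstCapture` are hypothetical names, and the alternative replaces the trailing \b with a check that the next character is outside the token alphabet or that the input ends.

```go
package main

import (
	"fmt"
	"regexp"
)

// refreshTokenPat is the pattern quoted in the review comment.
var refreshTokenPat = regexp.MustCompile(`\b([A-Za-z0-9][A-Za-z0-9._~-]{59,119})\b`)

// noTrailingBoundaryPat is a hypothetical alternative: instead of a trailing
// \b it requires a character outside the token alphabet, or end of input.
var noTrailingBoundaryPat = regexp.MustCompile(`\b([A-Za-z0-9][A-Za-z0-9._~-]{59,119})(?:[^A-Za-z0-9._~-]|$)`)

// firstCapture returns capture group 1 of the first match, or "" if none.
func firstCapture(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// Example refresh token from the PR description; it ends with '-'.
	token := "AOahQ2qfcSxRRa1r4EDFhCDdsx0y~001~Fj.vO_OAW2IXbqFqc8gK3e0wJdTsx6kulrM-"
	input := `{"refresh_token": "` + token + `"}`

	original := firstCapture(refreshTokenPat, input)
	alternative := firstCapture(noTrailingBoundaryPat, input)

	fmt.Println("original keeps full token:   ", original == token)
	fmt.Println("alternative keeps full token:", alternative == token)
}
```

The alternative consumes one delimiter character after the token, which is harmless when only the capture group is used, but it does not address runs longer than 120 characters; that case would need separate handling.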

}
}
return true, nil
}

Body parsing in verification is dead code with no effect on the result

Low Severity

Every code path inside the if resp.StatusCode == 200 block returns true, nil regardless of whether the body can be read or parsed, or whether it contains email or sub fields. The io.ReadAll, json.Unmarshal, and field checks are dead code that has no effect on behavior. The encoding/json and io imports exist solely for this unused logic.
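
One way to resolve this is to delete the parsing; another is to let the parsed body decide the result. The sketch below illustrates the second option under stated assumptions: `verifiedFromResponse`, `userinfo`, and `fakeResponse` are hypothetical names, the `sub` and `email` fields come from the review comment, and requiring at least one of them is a design choice the PR does not confirm.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// userinfo holds the two fields the review comment mentions.
type userinfo struct {
	Sub   string `json:"sub"`
	Email string `json:"email"`
}

// verifiedFromResponse is a hypothetical helper in which the body parsing
// affects the outcome: a 200 counts only if sub or email is present.
func verifiedFromResponse(resp *http.Response) (bool, error) {
	if resp.StatusCode != http.StatusOK {
		return false, nil
	}
	// Cap the read so an oversized body cannot exhaust memory.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
	if err != nil {
		return false, err
	}
	var info userinfo
	if err := json.Unmarshal(body, &info); err != nil {
		return false, err
	}
	return info.Sub != "" || info.Email != "", nil
}

// fakeResponse builds an in-memory response so the sketch runs offline.
func fakeResponse(status int, body string) *http.Response {
	return &http.Response{
		StatusCode: status,
		Body:       io.NopCloser(strings.NewReader(body)),
	}
}

func main() {
	cases := []struct {
		status int
		body   string
	}{
		{200, `{"sub":"abc123"}`},
		{200, `{}`},
		{200, `not json`},
		{401, ``},
	}
	for _, c := range cases {
		ok, err := verifiedFromResponse(fakeResponse(c.status, c.body))
		fmt.Printf("status=%d body=%q verified=%v err=%v\n", c.status, c.body, ok, err)
	}
}
```

If a 200 status alone is considered sufficient proof, dropping the parsing and the `encoding/json` and `io` imports is the simpler change.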

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>