feat(similarity): script similarity detection and integrity verification system #3
Open
CodFrm wants to merge 87 commits into main from test/hotfix
87 commits
5a1d1b7 feat(similarity): scaffold similarity_svc package with goja/xxhash deps (CodFrm)
b49dac1 feat(similarity): define core fingerprint types and TokenKind enum (CodFrm)
9320821 test(similarity): cover KindUnknown in TokenKind_String table (CodFrm)
eb8a092 feat(similarity): implement parseAndNormalize via goja AST walk (CodFrm)
b6ff6c8 fix(similarity): nil guards on Expression walks; walk ObjectLiteral p… (CodFrm)
cfad561 feat(similarity): implement k-gram xxhash64 sliding window (CodFrm)
eeb5f67 test(similarity): cover operator value distinction and zero/negative k (CodFrm)
4758845 feat(similarity): implement winnowing with monotonic deque (CodFrm)
4230617 test(similarity): add winnow brute-force invariant test; clarify comp… (CodFrm)
3e3586f style(similarity): use range over int in property test loop (CodFrm)
aee6188 feat(similarity): wire ExtractFingerprints public API (CodFrm)
980e613 test(similarity): tighten ExtractFingerprints test contracts; documen… (CodFrm)
c9b9d66 feat(similarity): implement pure set-based Jaccard similarity (CodFrm)
f022d3d test(similarity): add Jaccard symmetry and proper-subset cases (CodFrm)
179d61e test(similarity): golden test for rename invariance (CodFrm)
acd98ce test(similarity): golden test for code reorder similarity (CodFrm)
4c8b66d test(similarity): golden test for unrelated code disjointness (CodFrm)
b4b8e4b chore(similarity): lint-fix polish (CodFrm)
3e1e888 update .gitignore (CodFrm)
4b8462e feat(similarity): add Phase 2 entities + gormigrate for six tables (CodFrm)
f9e1007 refactor(similarity): convert entity status enums to named types + po… (CodFrm)
68910c1 feat(similarity): add FingerprintRepo with upsert + parse-status helpers (CodFrm)
e8eaed2 refactor(similarity): apply repo conventions to FingerprintRepo (CodFrm)
f9084c3 feat(similarity): add SimilarPairRepo with normalized-pair upsert (CodFrm)
c2d9979 feat(similarity): add SuspectSummaryRepo with upsert (CodFrm)
b6fe291 feat(similarity): add SimilarityWhitelistRepo (pair-level) (CodFrm)
07843e6 feat(similarity): add IntegrityWhitelistRepo (script-level) (CodFrm)
e1f6e67 feat(similarity): add IntegrityReviewRepo with code-id upsert (CodFrm)
9e624c1 feat(similarity): add error codes + zh_CN i18n at 114000 range (CodFrm)
9d8e41f feat(similarity): add similarity.* config keys + Validate() check (CodFrm)
443b726 fix(similarity): default ScanEnabled/IntegrityEnabled to true so omit… (CodFrm)
aeeac1a feat(script): add ScriptCode.FindByIDIncludeDeleted for similarity ev… (CodFrm)
b7fecb4 feat(similarity): add FingerprintESRepo with bulk/search/agg + index … (CodFrm)
494a9ff fix(similarity): harden FingerprintESRepo error handling + add body t… (CodFrm)
5e09d25 feat(similarity): add pending-warning context helper for script_svc h… (CodFrm)
87b9fcc feat(similarity): add integrity signal detectors (Cat A/B/C/D) (CodFrm)
696fd95 feat(similarity): add IntegritySvc with Check/IsWhitelisted/RecordWar… (CodFrm)
4ba44e6 test(similarity): integrity golden tests cover normal/minified/obfusc… (CodFrm)
57bf27d feat(similarity): add NSQ producers for similarity.scan + integrity.w… (CodFrm)
bfad8fb feat(similarity): implement ScanSvc.Scan orchestration (lock + ES + p… (CodFrm)
c109a4d refactor(similarity): inject Scan dependencies via function vars for … (CodFrm)
1af7bf2 test(similarity): cover ScanSvc.Scan branches with mocked repos (CodFrm)
ed23599 feat(similarity): NSQ consumer for similarity.scan -> ScanSvc.Scan (CodFrm)
b90ebe2 feat(similarity): NSQ consumer for integrity.warning -> IntegritySvc.… (CodFrm)
8026457 feat(similarity): crontab handler refreshes Redis stop-fp set from ES… (CodFrm)
6e9ab66 feat(similarity): integrate Integrity check + similarity scan publish… (CodFrm)
cba9ea5 feat(similarity): integrate Integrity check + similarity scan publish… (CodFrm)
4dd3aa1 feat(similarity): register NSQ consumers + stop-fp crontab (CodFrm)
d10c043 feat(similarity): register similarity repos, services, and ES index init (CodFrm)
a0111b6 chore(similarity): lint-fix polish (CodFrm)
d678858 fix(similarity): wire stop-fp crontab to similarity.stop_fp_refresh_s… (CodFrm)
1a10e4d feat(similarity): declare Phase 3 admin + evidence API request/respon… (CodFrm)
4def300 feat(similarity): add list/find/resolve methods + ES position lookup … (CodFrm)
00e1cc9 feat(similarity): add RequireSimilarityPairAccess middleware for Phase 3 (CodFrm)
6bce562 feat(similarity): scaffold AdminSvc interface with stub methods (CodFrm)
d0f6fd3 feat(similarity): implement AdminSvc.ListPairs + ListSuspects (CodFrm)
6b67087 feat(similarity): implement GetPairDetail + MatchSegments builder (CodFrm)
caafa47 feat(similarity): implement pair + integrity whitelist and review end… (CodFrm)
e7c612e feat(similarity): wire Phase 3 admin + evidence routes (CodFrm)
ba2479e chore(similarity): silence errcheck on ES resp.Body.Close deferred calls (CodFrm)
d8c62ce feat(similarity): add DELETE /admin/similarity/whitelist/:id endpoint (CodFrm)
20367af fix(similarity): silence errcheck on new ES resp.Body.Close deferred … (CodFrm)
38342bd chore: exclude auto-generated docs/ dir from golangci-lint (CodFrm)
b3c2699 feat(similarity): Phase 4 patrol + backfill + stop-fp manual refresh (CodFrm)
ff96dda fix(similarity): close four gaps against design spec (CodFrm)
9d5099a fix(similarity): defer cancel in patrol context cancellation test (CodFrm)
1621b1d perf(similarity): share scans across Integrity.Check signals (CodFrm)
c16fd7a fix(similarity): retry fingerprint parse wrapped in async function (CodFrm)
62605f3 feat(similarity): make max_code_size=0 disable the fingerprint size gate (CodFrm)
cf2a665 feat(similarity): admin endpoint listing fingerprint parse failures (CodFrm)
33fcdf7 chore(similarity): silence gosec G304 on bench test fixture read (CodFrm)
273cd26 fix(similarity): make Reset backfill truly force a full rescan (CodFrm)
18d0896 chore(similarity): add debug logging to integrity check and match seg… (CodFrm)
8e8b388 fix(similarity): mark deleted scripts and purge stale pending pairs (CodFrm)
ed1c7e4 fix(similarity): cover ES6+ syntax in fingerprint walker (CodFrm)
ff40518 perf(similarity): move time-consuming integrity check signals to the async scan consumer (CodFrm)
2a79ba8 refactor(similarity): remove AAEncode and JJEncode integrity signals (CodFrm)
074e812 style: fix gofmt formatting in integrity signals (CodFrm)
248fc59 fix(similarity): use Similarity().ScanEnabled in Validate() for consi… (CodFrm)
47dcd93 fix(similarity): propagate Redis errors in RunBackfill instead of swa… (CodFrm)
2549935 refactor(similarity): deduplicate stopFpRedisKey into single exported… (CodFrm)
8c81038 fix(similarity): clamp Jaccard score and document stop-fp approximation (CodFrm)
3cb51cf fix(similarity): use format directive in integrity rejection message (CodFrm)
5274a96 fix(similarity): handle FindLatest and IsWhitelisted errors in Update… (CodFrm)
5485bb9 fix(similarity): skip integrity pre-check for auto-sync code updates (CodFrm)
e8bdeeb refactor(similarity): move backfill state data access to repository l… (CodFrm)
4073a28 refactor(similarity): split admin.go into evidence and whitelist files (CodFrm)
@@ -10,6 +10,7 @@
.claude/settings.local.json
CLAUDE.md
.omc

# ip2region data files (download separately)
/data/*.xdb

@@ -30,6 +30,10 @@ linters:
      misspell:
        locale: US

  exclusions:
    paths:
      - docs

formatters:
  enable:
    - gofmt
Validate() checks cfg.Bool("similarity.scan_enabled") to decide whether Elasticsearch must be configured, but Similarity() defaults ScanEnabled=true even when the YAML key is absent. This can let the app start without elasticsearch.address while similarity scanning is effectively enabled (and main.go later calls EnsureFingerprintIndex based on Similarity().ScanEnabled). Consider basing this check on Similarity().ScanEnabled (or otherwise applying the same defaulting logic as Similarity()) so startup validation matches runtime behavior.
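The suggestion above can be illustrated with a minimal sketch. It does not reproduce the project's config package: `rawConfig`, its fields, and `errMissingES` are hypothetical stand-ins, and a nil pointer is assumed to model a YAML key that is absent. The point is only that `Validate()` reads the same defaulted value that `Similarity()` returns, so an omitted `similarity.scan_enabled` is treated as enabled in both places.

```go
package main

import "errors"

// rawConfig is a hypothetical stand-in for the parsed YAML.
// ScanEnabled is a pointer so that nil can model an absent key.
type rawConfig struct {
	ScanEnabled          *bool
	ElasticsearchAddress string
}

// SimilarityConfig holds the resolved settings after defaulting.
type SimilarityConfig struct {
	ScanEnabled bool
}

// Similarity applies the default described in the review comment:
// scanning is enabled unless the key is explicitly set to false.
func (c *rawConfig) Similarity() SimilarityConfig {
	enabled := true
	if c.ScanEnabled != nil {
		enabled = *c.ScanEnabled
	}
	return SimilarityConfig{ScanEnabled: enabled}
}

// errMissingES is an illustrative error value, not the project's error code.
var errMissingES = errors.New("elasticsearch.address is required when similarity scanning is enabled")

// Validate checks the resolved value rather than the raw key, so startup
// validation agrees with what the runtime will actually do.
func (c *rawConfig) Validate() error {
	if c.Similarity().ScanEnabled && c.ElasticsearchAddress == "" {
		return errMissingES
	}
	return nil
}
```

Under these assumptions, a config that omits both keys fails validation, while one that explicitly sets scanning to false passes without an Elasticsearch address.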