feat(similarity): script similarity detection and integrity verification system #3

Open
CodFrm wants to merge 87 commits into main from test/hotfix
CodFrm commented Apr 16, 2026

No description provided.

CodFrm added 30 commits April 13, 2026 13:42
Implements parseAndNormalize(code) which parses JavaScript using goja's
parser and walks the resulting AST to produce a normalized Token stream.
Identifiers collapse to KindVar, literals to their typed kinds, and
structural constructs emit KindKeyword/KindPunct tokens so the stream
still encodes program structure. Unknown node types fall through to
KindUnknown to keep the walker deterministic as more JS constructs
surface in subsequent tasks.

Promotes github.com/dop251/goja from indirect to direct dependency.
Scaffolds the data-layer entities and gormigrate migration for the
Phase 2 code similarity detection and integrity review feature:

- Fingerprint / SimilarPair / SuspectSummary
- SimilarityWhitelist / IntegrityWhitelist / IntegrityReview

Registers T20260414 in migrations/init.go.
- Replace errors.Is(err, gorm.ErrRecordNotFound) with db.RecordNotFound(err)
  to match the canonical CaGo helper used in 64 other repo sites.
- Drop createtime/updatetime mutation from Upsert; service layer owns the
  clock per existing repo convention (see internal/service/* sites).
- UpdateParseStatus now takes scannedAt int64 explicitly so the repo stays
  clock-free; regenerate mock to reflect the new signature.
CodFrm added 17 commits April 14, 2026 14:33
docs/docs.go is regenerated by swag and its legacy strings.Replace calls
trigger staticcheck QF1004 — we don't own the template, so exclude the
entire directory rather than chase transient hits.
Adds the Phase 4 operations tooling required by §4.5 and §8.5 bootstrap:

- similarity_patrol crontab handler with two modes: daily Patrol() for
  incremental catch-up of scripts whose latest code is newer than their
  last fingerprint scan, and RunBackfill() kicked off by the admin
  endpoint to iterate every active script from a persisted cursor with
  rate-limiting and resumable state.
- similarity_repo.PatrolQueryRepo with ListStaleScriptIDs /
  ListScriptIDsFromCursor / CountScripts for the two modes.
- similarity_svc.BackfillState helpers persisted via system_config:
  TryAcquireBackfillLock, SetBackfillCursor, FinishBackfill,
  ResetBackfillCursor. State survives restarts and prevents
  simultaneous admin clicks from double-starting.
- Admin endpoints POST /admin/similarity/backfill (with reset flag for
  §8.5 step 9), GET /admin/similarity/backfill/status, POST
  /admin/similarity/scan/:script_id, and POST
  /admin/similarity/stop-fp/refresh (§8.5 step 8 on-demand refresh).
- RegisterBackfillRunner + RegisterStopFpRefresher function-injection
  seams wire the crontab handler methods into admin_svc without an
  import cycle.
- Production code uses function-typed fields / package vars for all
  Redis, NSQ producer, system_config, and PatrolQuery dependencies so
  unit tests can substitute fakes (matches existing similarity_stop_fp
  pattern).
- 31 new unit tests covering backfill state (9), admin_backfill + stop-fp
  refresh (11), and patrol handler (11) — including resume-from-cursor,
  ctx cancellation during rate-limit sleep, re-entry guards, and
  publish-failure continuation.
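
The resumable backfill loop described above (iterate from a persisted cursor, rate-limit, stop on cancellation, keep going on publish failure) can be sketched roughly as follows. This is a minimal illustration, not the handler's actual code: `runFromCursor`, its parameters, and the use of a plain `done` channel in place of `ctx.Done()` are all assumptions, and the IDs are assumed to arrive in ascending order.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// runFromCursor is a hypothetical sketch of a resumable backfill loop.
// It skips IDs at or below the cursor, publishes each remaining ID,
// persists the cursor after every ID, and sleeps between IDs.
// It returns the last cursor and whether the run finished.
func runFromCursor(
	done <-chan struct{}, // in real code this would be ctx.Done()
	ids []int64, // assumed sorted ascending
	cursor int64,
	delay time.Duration,
	publish func(id int64) error,
	save func(cursor int64),
) (int64, bool) {
	for _, id := range ids {
		if id <= cursor {
			continue // already handled before the restart
		}
		if err := publish(id); err != nil {
			// A publish failure does not abort the run.
			fmt.Println("publish failed, continuing:", id, err)
		}
		cursor = id
		save(cursor)
		select {
		case <-done:
			return cursor, false // cancelled during the rate-limit sleep
		case <-time.After(delay):
		}
	}
	return cursor, true
}

func main() {
	publish := func(id int64) error {
		if id == 4 {
			return errors.New("nsq unavailable")
		}
		return nil
	}
	save := func(c int64) { fmt.Println("cursor saved:", c) }
	last, finished := runFromCursor(nil, []int64{1, 2, 3, 4, 5}, 2, time.Millisecond, publish, save)
	fmt.Println(last, finished)
}
```

Persisting the cursor after every ID keeps the sketch simple; a real handler would more likely save once per batch.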
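- Replace Σ CommonCount with an ES cardinality aggregation to correct the coverage
  calculation, eliminating double counting of fingerprints across candidates
  (spec §4.1 Step 5)
- Add PurgeScriptData to cascade cleanup of ES/fingerprint/pair/summary data,
  hooked into the hard-delete path through a new consumer driven by the
  ScriptDeleteMsg.HardDelete field (spec §4.6)
- Switch the backfill running flag to an atomic Redis SETNX CAS, removing the race
  when two admins click start at the same time; metadata still lives in
  system_config (spec §2.3/§4.5)
- DBConfigProvider gains GetBool/GetFloat/GetInt; Similarity() layers
  pre_system_config dynamic overrides on top of YAML so the admin console can tune
  the 14 similarity.* thresholds and switches in real time (spec §1.1/§6.1)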
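
The double counting fixed by the first bullet can be illustrated with a small sketch. It assumes coverage means "fraction of this script's fingerprints that appear in at least one candidate"; that definition, the function names, and the in-memory maps standing in for the ES aggregation are assumptions for illustration only.

```go
package main

import "fmt"

// coverageBySum mimics the old approach: add up each candidate's common
// count. A fingerprint shared with several candidates is counted once
// per candidate, so the result can exceed 1.0.
func coverageBySum(total int, common map[int64][]uint64) float64 {
	if total == 0 {
		return 0
	}
	sum := 0
	for _, fps := range common {
		sum += len(fps)
	}
	return float64(sum) / float64(total)
}

// coverageByCardinality counts each distinct fingerprint once, which is
// what a cardinality aggregation over the matched fingerprints yields
// (exactly here; ES cardinality is approximate for large sets).
func coverageByCardinality(total int, common map[int64][]uint64) float64 {
	if total == 0 {
		return 0
	}
	seen := make(map[uint64]struct{})
	for _, fps := range common {
		for _, fp := range fps {
			seen[fp] = struct{}{}
		}
	}
	return float64(len(seen)) / float64(total)
}

func main() {
	// Script under scan has 4 fingerprints: 1, 2, 3, 4.
	// Candidate 100 shares {1,2,3}; candidate 200 shares {2,3,4}.
	common := map[int64][]uint64{
		100: {1, 2, 3},
		200: {2, 3, 4},
	}
	fmt.Println(coverageBySum(4, common))
	fmt.Println(coverageByCardinality(4, common))
}
```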
gosec (G118) flagged the missing defer even though the goroutine fires
cancel on the happy path — defer guards the early-return paths.
Introduce a codeFeatures struct computed once per Check so the four
Category-A signals share a single rune pass (line count, max line, whitespace,
comment bytes) and the two Category-B signals share one collectIdents call.
Adds a benchmark covering 1MB obfuscated and 256KB plain samples.

On M1: obfuscated 1MB goes 142ms → 80ms (1.78x, allocs halved), plain 256KB
64ms → 50ms (1.29x, allocs -40%).
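
A minimal sketch of the "compute once, share across signals" idea follows. The field names and the `whitespaceRatio` helper are assumptions, and comment-byte counting is left out to keep the example short; the real `codeFeatures` struct is not shown in this PR text.

```go
package main

import (
	"fmt"
	"unicode"
)

// codeFeatures is a hypothetical, trimmed-down version of the struct the
// commit describes: everything here comes from one pass over the runes.
type codeFeatures struct {
	LineCount  int
	MaxLineLen int // longest line, in runes, excluding the newline
	Whitespace int // whitespace runes, newlines included
	TotalRunes int
}

// computeFeatures walks the source exactly once.
func computeFeatures(code string) codeFeatures {
	var f codeFeatures
	if code == "" {
		return f
	}
	f.LineCount = 1
	cur := 0
	for _, r := range code {
		f.TotalRunes++
		if unicode.IsSpace(r) {
			f.Whitespace++
		}
		if r == '\n' {
			f.LineCount++
			if cur > f.MaxLineLen {
				f.MaxLineLen = cur
			}
			cur = 0
			continue
		}
		cur++
	}
	if cur > f.MaxLineLen {
		f.MaxLineLen = cur
	}
	return f
}

// whitespaceRatio is one example of a signal derived from the shared
// features without rescanning the source.
func whitespaceRatio(f codeFeatures) float64 {
	if f.TotalRunes == 0 {
		return 0
	}
	return float64(f.Whitespace) / float64(f.TotalRunes)
}

func main() {
	f := computeFeatures("var a = 1;\nfoo();\n")
	fmt.Printf("%+v ratio=%.2f\n", f, whitespaceRatio(f))
}
```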
ScriptCat wraps background/cron scripts in (async function(){ ... })() at
runtime, making top-level return and await legal. The fingerprint parser
treated the source as a standalone ECMAScript Script, so scripts using
either feature were rejected with "Illegal return statement" and marked
parse_status=failed, falling out of the similarity index.

parseAndNormalize now retries wrapped on parse failure and shifts token
positions back by the wrapper prefix length (clamped into the original
source range) so downstream match segments still point at real bytes.
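
The position shift can be sketched as below. The exact wrapper text the parser prepends is not shown in this PR, so `wrapPrefix` and `unwrapPos` are assumptions; only the subtract-then-clamp logic is taken from the commit message.

```go
package main

import "fmt"

// wrapPrefix is an assumed wrapper prefix; the real one may differ.
const wrapPrefix = "(async function(){\n"

// unwrapPos maps a byte offset in the wrapped source back to the
// original source, clamping into [0, srcLen] so that tokens belonging
// to the wrapper itself never point outside the real code.
func unwrapPos(wrappedPos, srcLen int) int {
	p := wrappedPos - len(wrapPrefix)
	if p < 0 {
		return 0
	}
	if p > srcLen {
		return srcLen
	}
	return p
}

func main() {
	src := "return 1;"
	wrapped := wrapPrefix + src + "\n})()"
	// Offset of the literal "1" inside the wrapped text.
	pos := len(wrapPrefix) + 7
	fmt.Println(string(wrapped[pos]), unwrapPos(pos, len(src)))
}
```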
Drop the 512KB auto-default on MaxCodeSize. scan.go already guards on
`MaxCodeSize > 0`, so zero now means unlimited (bounded only by the
API-level 10MB cap on script code). Default config example updated to 0 so
fresh deployments index all scripts the backend will accept.
Adds GET /admin/similarity/parse-failures so operators can triage scripts
that are invisible to similarity comparison. Default filter is
parse_status=failed; pass status=2 to see skipped rows. Rescan uses the
existing POST /admin/similarity/scan/:script_id, no new action required.

Introduces FingerprintRepo.ListByParseStatus with ParseFailureFilter, the
adminSvc.ListParseFailures handler composing script + user briefs, and
wires the route into the admin middleware group.
Reset=true on /admin/similarity/backfill previously only zeroed the cursor
but left the Scan code_hash short-circuit intact, so every rescanned script
no-oped with "code unchanged, skipping" and the admin saw no effect.

Thread a force flag from TriggerBackfill → RunBackfill → SimilarityScanMsg
→ consumer → ScanSvc.Scan. When force=true the short-circuit is bypassed
so extraction, ES indexing, and pair upsert all run again. Patrol and the
publish/update script events keep force=false to stay idempotent.
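
In rough terms, the force flag turns the hash short-circuit into a two-condition check. The sketch below uses hypothetical names (`shouldScan`, `sha256Hex`); the real Scan method is not reproduced here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Hex returns the hex-encoded SHA-256 of the code.
func sha256Hex(code string) string {
	sum := sha256.Sum256([]byte(code))
	return hex.EncodeToString(sum[:])
}

// shouldScan is a hypothetical stand-in for the short-circuit: an
// unchanged code hash skips the scan unless force is set.
func shouldScan(storedHash, code string, force bool) bool {
	if force {
		return true // backfill with reset=true: always rescan
	}
	return sha256Hex(code) != storedHash
}

func main() {
	stored := sha256Hex("console.log(1)")
	fmt.Println(shouldScan(stored, "console.log(1)", false)) // unchanged, no force
	fmt.Println(shouldScan(stored, "console.log(1)", true))  // unchanged, forced
	fmt.Println(shouldScan(stored, "console.log(2)", false)) // changed
}
```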
…ments

IntegritySvc.Check now logs final score, per-category breakdown, and hit
signal names so ops can trace why a given script landed in a specific zone.
RecordWarning surfaces marshal/upsert failures with full context.

BuildMatchSegments logs each load step (fingerprint row, ES positions) and
the final segment count, making the evidence-page build path debuggable
without attaching a debugger.
Two related issues on the admin similar-pairs view:

1. Soft-deleted scripts kept showing in the pair list with no indication.
   Per spec §4.6 we deliberately preserve the underlying fingerprint as
   evidence, so instead of cascading the delete we surface the state:
   ScriptBrief now exposes IsDeleted, and ListPairsRequest accepts an
   ExcludeDeleted toggle that JOINs cdb_tampermonkey_script to filter
   pairs whose either side is in DELETE status.

2. After a script's code changes such that an old pair drops below the
   Jaccard threshold, the row in pre_script_similar_pair was never
   touched again and lingered as a zombie. Scan() now calls
   DeletePendingByScriptID right after candidate lookup so any pair
   that's still similar gets re-Upserted by step 11 while obsolete
   pending rows disappear. Whitelisted / reviewed pairs are preserved
   because those statuses are explicit admin decisions.
walkNode only handled an ES5 subset and dropped to a single KindUnknown
for any unrecognized AST node, so any modern userscript starting with a
top-level `class` (or built around let/const, arrows, async/await,
template literals, destructuring, etc.) collapsed to under 14 tokens and
tripped the `too_few_fingerprints` skip in scan — leaving stale similar
pairs frozen forever.

Rewrite walkNode to cover the full goja AST: classes (including private
fields, static blocks, methods, getters), lexical declarations,
arrow functions, template literals, await/yield, try/catch/throw,
switch/case, for-in/for-of, do-while, with, optional chaining,
spread/rest, destructuring patterns, sequence + conditional + unary
expressions, new, super, this, meta-property, and PropertyKeyed/Short
(which the old object-literal walker had been silently turning into
KindUnknown).

Also plug the scan early-exit cleanup hole: when scan bails out at any
of the five guard paths (soft-deleted / oversized / parse-failed /
too-few-fingerprints / non-active), still purge pending pairs touching
this script. Otherwise scripts that *used to* match leave their old
pairs visible forever, since no later scan reaches step 10b for them.

Tested with testdata/1.js (ScriptCat OCS helper, 59KB, 1335 lines):
fingerprints went from 1 to 866, total tokens from <14 to 4705.

walkNode coverage 63.5% -> 85.6%, purgePendingPairs 50% -> 100%.
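HTTP requests run only the fast signals (precomputed Cat A + known-packer Cat D);
the expensive regex signals (identifier extraction, comment statistics,
string-array detection, etc.) are handled asynchronously by the similarity.scan
NSQ consumer, so publishing a large script does not time out.

- Add CheckFast(); a known-packer signature match blocks immediately (score=1.0)
- scan.go step 2b: asynchronous integrity check + auto-archive + warning record
- Remove the deprecated integrity.warning message-queue flow
- Add the integrity_async_auto_archive config option
These two encodings are extremely rare; real malicious scripts almost never use
them, and their code characteristics are already covered by other signals
(single-character identifier ratio, whitespace ratio, etc.), so no dedicated
detection is needed.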
Copilot AI left a comment

Pull request overview

This PR introduces the plumbing for a script similarity detection system plus an integrity (minify/obfuscation) pre-check, including persistence tables, NSQ topics/consumers, cron-driven patrol/backfill jobs, and admin/evidence endpoints.

Changes:

  • Add DB migrations + new similarity/integrity entities and repositories (MySQL + Elasticsearch index init).
  • Add similarity.scan producer/consumer, hard-delete purge consumer, and cron handlers for patrol/backfill + stop-fingerprint refresh.
  • Add integrity fast pre-check into script create/update flows, plus config (YAML + DB overrides) and routing/controller wiring.

Reviewed changes

Copilot reviewed 103 out of 107 changed files in this pull request and generated 3 comments.

File | Description
--- | ---
migrations/init.go | Registers new migration.
migrations/20260414.go | Creates/drops similarity tables.
internal/task/producer/topic.go | Adds similarity.scan topic.
internal/task/producer/similarity.go | Producer + subscribe helpers.
internal/task/producer/similarity_test.go | Round-trip msg parsing test.
internal/task/producer/script.go | Adds HardDelete flag to delete msg.
internal/task/crontab/handler/similarity_stop_fp.go | Stop-fingerprint refresh job.
internal/task/crontab/handler/similarity_stop_fp_test.go | Stop-fp handler unit tests.
internal/task/crontab/handler/similarity_patrol.go | Patrol + backfill cron handler.
internal/task/crontab/handler/similarity_patrol_test.go | Patrol/backfill unit tests.
internal/task/crontab/crontab.go | Registers similarity cron handlers.
internal/task/consumer/subscribe/similarity_scan.go | Consumer for similarity.scan.
internal/task/consumer/subscribe/similarity_scan_test.go | Consumer dispatch/force tests.
internal/task/consumer/subscribe/similarity_purge.go | Hard-delete purge consumer.
internal/task/consumer/subscribe/similarity_purge_test.go | Purge consumer tests.
internal/task/consumer/consumer.go | Registers new subscribers.
internal/service/similarity_svc/testdata/reorder_pair/original.js | Similarity fixtures.
internal/service/similarity_svc/testdata/reorder_pair/reordered.js | Similarity fixtures.
internal/service/similarity_svc/testdata/rename_pair/original.js | Similarity fixtures.
internal/service/similarity_svc/testdata/rename_pair/renamed.js | Similarity fixtures.
internal/service/similarity_svc/testdata/different_pair/a.js | Similarity fixtures.
internal/service/similarity_svc/testdata/different_pair/b.js | Similarity fixtures.
internal/service/similarity_svc/testdata/integrity/normal/plain_userscript.js | Integrity fixtures.
internal/service/similarity_svc/testdata/integrity/normal/embedded_small_lib.js | Integrity fixtures.
internal/service/similarity_svc/testdata/integrity/minified/uglify_output.js | Integrity fixtures.
internal/service/similarity_svc/testdata/integrity/minified/terser_output.js | Integrity fixtures.
internal/service/similarity_svc/testdata/integrity/packed/dean_edwards_packer.js | Integrity fixtures.
internal/service/similarity_svc/testdata/integrity/obfuscated/obfuscator_io_level1.js | Integrity fixtures.
internal/service/similarity_svc/testdata/integrity/obfuscated/obfuscator_io_level4.js | Integrity fixtures.
internal/service/similarity_svc/testdata/integrity/borderline/has_vendored_json.js | Integrity fixtures.
internal/service/similarity_svc/purge.go | Purge cascade implementation.
internal/service/similarity_svc/purge_test.go | Purge cascade tests.
internal/service/similarity_svc/pending_warning.go | Integrity result types.
internal/service/similarity_svc/mock/scan.go | Generated scan mock.
internal/service/similarity_svc/mock/integrity.go | Generated integrity mock.
internal/service/similarity_svc/match_segments.go | Build UI match segments.
internal/service/similarity_svc/match_segments_test.go | Match segment tests.
internal/service/similarity_svc/integrity_signals.go | Integrity signals implementation.
internal/service/similarity_svc/integrity_signals_test.go | Signal unit tests.
internal/service/similarity_svc/integrity.go | Integrity service + messaging.
internal/service/similarity_svc/integrity_test.go | Integrity end-to-end tests.
internal/service/similarity_svc/integrity_bench_test.go | Integrity benchmarks.
internal/service/similarity_svc/doc.go | Package-level docs.
internal/service/similarity_svc/backfill_state.go | Backfill state + Redis lock.
internal/service/similarity_svc/backfill_state_test.go | Backfill state tests.
internal/service/similarity_svc/admin_backfill.go | Admin backfill/manual scan hooks.
internal/service/similarity_svc/access.go | Evidence access middleware.
internal/service/similarity_svc/access_test.go | Access service smoke test.
internal/service/script_svc/script.go | Integrates integrity gate + scan publish.
internal/repository/similarity_repo/fingerprint.go | Fingerprint MySQL repo.
internal/repository/similarity_repo/fingerprint_test.go | Repo shape test.
internal/repository/similarity_repo/fingerprint_es_init.go | ES index create helper.
internal/repository/similarity_repo/fingerprint_es_test.go | ES query-body tests.
internal/repository/similarity_repo/patrol_query.go | Patrol/backfill SQL repo.
internal/repository/similarity_repo/similar_pair.go | Pair repo + normalization.
internal/repository/similarity_repo/suspect_summary.go | Suspect summary repo.
internal/repository/similarity_repo/similarity_whitelist.go | Pair whitelist repo.
internal/repository/similarity_repo/integrity_whitelist.go | Integrity whitelist repo.
internal/repository/similarity_repo/integrity_review.go | Integrity review queue repo.
internal/repository/similarity_repo/*_test.go | Repo/interface shape tests.
internal/repository/similarity_repo/mock/*.go | Generated repo mocks.
internal/repository/similarity_repo/doc.go | Repo package docs.
internal/repository/script_repo/script_code.go | Adds FindByIDIncludeDeleted.
internal/repository/script_repo/mock/script_code.go | Updates mock for new method.
internal/pkg/code/code.go | Adds similarity error codes.
internal/pkg/code/zh_cn.go | Adds similarity zh-CN messages.
internal/model/entity/similarity_entity/*.go | New similarity/integrity entities.
internal/controller/similarity_ctr/similarity.go | Similarity controller methods.
internal/api/router.go | Wires admin + evidence routes.
configs/db_provider.go | Adds typed getters (bool/float/int).
configs/db_provider_test.go | Adds DB provider tests.
configs/config.go | Adds SimilarityConfig + defaults/overrides + validate hook.
configs/config.yaml.example | Adds similarity config examples.
cmd/app/main.go | Registers similarity repos/services + ensures ES index.
go.mod | Adds deps for similarity/integrity.
go.sum | Updates dependency checksums.
.golangci.yml | Updates lint exclusions.
.gitignore | Ignores .omc directory.

Comment thread configs/config.go
Comment on lines +239 to +246
// similarity.scan_enabled=true requires an elasticsearch address (cago reads the elasticsearch.address list)
if cfg.Bool(ctx, "similarity.scan_enabled") {
    var esAddress []string
    _ = cfg.Scan(ctx, "elasticsearch.address", &esAddress)
    if len(esAddress) == 0 {
        return fmt.Errorf("similarity.scan_enabled=true requires elasticsearch.address to be set")
    }
}

Copilot AI Apr 16, 2026

Validate() checks cfg.Bool("similarity.scan_enabled") to decide whether Elasticsearch must be configured, but Similarity() defaults ScanEnabled=true even when the YAML key is absent. This can let the app start without elasticsearch.address while similarity scanning is effectively enabled (and main.go later calls EnsureFingerprintIndex based on Similarity().ScanEnabled). Consider basing this check on Similarity().ScanEnabled (or otherwise applying the same defaulting logic as Similarity()) so startup validation matches runtime behavior.
Comment on lines +452 to +462
// Integrity pre-check (fast signals only; expensive signals are handled asynchronously by the similarity scan consumer)
if similarity_svc.IntegrityEnabled() && similarity_svc.Integrity() != nil && req.Code != "" {
    latest, _ := script_repo.ScriptCode().FindLatest(ctx, script.ID, 0, true)
    var existingHash string
    if latest != nil {
        existingHash = sha256HexString(latest.Code)
    }
    newHash := sha256HexString(req.Code)
    if newHash != existingHash {
        whitelisted, _ := similarity_svc.Integrity().IsWhitelisted(ctx, script.ID)
        if !whitelisted {

Copilot AI Apr 16, 2026

UpdateCode's integrity pre-check ignores errors from ScriptCode().FindLatest and Integrity().IsWhitelisted (both assigned to _). If either call fails transiently, the code may treat the script as not whitelisted / changed and incorrectly block an update (400) instead of surfacing a server error or skipping the integrity gate. Handle these errors explicitly (e.g., return the error, or fail-open with a warning log depending on desired policy) to avoid false rejections.
Comment on lines +169 to +174
ok, release, err := h.acquireBackfillLock(ctx)
if err != nil || !ok {
    logger.Ctx(ctx).Warn("similarity backfill: redis lock unavailable",
        zap.Bool("ok", ok), zap.Error(err))
    return nil
}

Copilot AI Apr 16, 2026

RunBackfill returns nil when acquireBackfillLock returns an error (it checks if err != nil || !ok { ...; return nil }). This suppresses real Redis failures and makes backfill runs silently no-op, which is hard to detect/alert on. Consider returning the error when err != nil, and only treating !ok (lock held) as a nil/no-op path.
CodFrm commented Apr 16, 2026

Code review

Found 7 issues (issues 1-3 confirm Copilot's findings, 4-7 are additional):

  1. Config validation mismatch: Validate() defaults scan_enabled=false but Similarity() defaults true (CLAUDE.md says "New required config keys must be added to Validate() function (fails fast at startup)")

Validate() reads cfg.Bool(ctx, "similarity.scan_enabled") which returns false when the YAML key is absent. But Similarity() initializes ScanEnabled: true before applying YAML. If a user deploys without the similarity block in config.yaml, Validate() skips the ES check, but Similarity().ScanEnabled returns true at runtime — the app starts cleanly then crashes on ES operations.

Fix: use Similarity().ScanEnabled in Validate() instead of cfg.Bool().

// similarity.scan_enabled=true requires an elasticsearch address (cago reads the elasticsearch.address list)
if cfg.Bool(ctx, "similarity.scan_enabled") {
    var esAddress []string
    _ = cfg.Scan(ctx, "elasticsearch.address", &esAddress)
    if len(esAddress) == 0 {
        return fmt.Errorf("similarity.scan_enabled=true requires elasticsearch.address to be set")
    }
}
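
One way to keep validation and runtime in agreement is to route both through a single defaulting helper. The sketch below is an illustration under assumptions: `yamlConfig`, `scanEnabled`, and `validate` are hypothetical stand-ins for cago's config API and the project's `Validate()`/`Similarity()` functions.

```go
package main

import (
	"errors"
	"fmt"
)

// yamlConfig is a hypothetical stand-in for the parsed YAML: a key is
// present in the map only if the operator wrote it in config.yaml.
type yamlConfig map[string]bool

// scanEnabled applies the default in exactly one place. An absent key
// means "enabled", mirroring the default the review says Similarity() uses.
func scanEnabled(cfg yamlConfig) bool {
	if v, ok := cfg["similarity.scan_enabled"]; ok {
		return v
	}
	return true
}

// validate uses the same helper as the runtime path, so a missing
// similarity block can no longer skip the Elasticsearch check.
func validate(cfg yamlConfig, esAddress []string) error {
	if scanEnabled(cfg) && len(esAddress) == 0 {
		return errors.New("similarity.scan_enabled=true requires elasticsearch.address to be set")
	}
	return nil
}

func main() {
	// No similarity block and no ES address: startup now fails fast.
	fmt.Println(validate(yamlConfig{}, nil))
	// Scanning explicitly disabled: ES is not required.
	fmt.Println(validate(yamlConfig{"similarity.scan_enabled": false}, nil))
}
```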

  2. Silently discarded errors in UpdateCode integrity pre-check (bug: transient DB/Redis failures cause false rejections)

FindLatest() and IsWhitelisted() errors are assigned to _. If FindLatest fails, existingHash stays empty so the hash comparison always triggers — every update runs integrity check even when code is unchanged. If IsWhitelisted fails, whitelisted scripts are treated as non-whitelisted and can be spuriously rejected with HTTP 400.

if similarity_svc.IntegrityEnabled() && similarity_svc.Integrity() != nil && req.Code != "" {
    latest, _ := script_repo.ScriptCode().FindLatest(ctx, script.ID, 0, true)
    var existingHash string
    if latest != nil {
        existingHash = sha256HexString(latest.Code)
    }
    newHash := sha256HexString(req.Code)
    if newHash != existingHash {
        whitelisted, _ := similarity_svc.Integrity().IsWhitelisted(ctx, script.ID)
        if !whitelisted {
            result := similarity_svc.Integrity().CheckFast(ctx, req.Code)
            if result.Score >= similarity_svc.IntegrityBlockThreshold() {
                return nil, i18n.NewErrorWithStatus(
                    ctx, http.StatusBadRequest,
                    code.SimilarityIntegrityRejected,
                    result.BuildUserMessage(),
                )
            }
        }
    }
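
A sketch of the pre-check with both errors handled explicitly is shown below. It picks the "return the error" policy, which is only one of the options Copilot mentions (fail-open with a warning log is the other). The `integrityDeps` struct, `preCheck`, and the two sentinel errors are hypothetical; the repository and service calls are replaced by function fields.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errRejected  = errors.New("integrity check rejected the code")
	errTransient = errors.New("transient storage failure")
)

// integrityDeps holds hypothetical stand-ins for FindLatest,
// IsWhitelisted, and CheckFast.
type integrityDeps struct {
	latestHash    func() (string, error)
	isWhitelisted func() (bool, error)
	checkFast     func(code string) float64
}

// preCheck surfaces lookup failures instead of discarding them, so a
// transient error is reported as such rather than as a rejection.
func preCheck(d integrityDeps, newHash, code string, blockThreshold float64) error {
	existing, err := d.latestHash()
	if err != nil {
		return fmt.Errorf("integrity pre-check: load latest code: %w", err)
	}
	if newHash == existing {
		return nil // code unchanged, nothing to check
	}
	whitelisted, err := d.isWhitelisted()
	if err != nil {
		return fmt.Errorf("integrity pre-check: whitelist lookup: %w", err)
	}
	if whitelisted {
		return nil
	}
	if d.checkFast(code) >= blockThreshold {
		return errRejected
	}
	return nil
}

func main() {
	d := integrityDeps{
		latestHash:    func() (string, error) { return "old", nil },
		isWhitelisted: func() (bool, error) { return false, errTransient },
		checkFast:     func(string) float64 { return 1.0 },
	}
	err := preCheck(d, "new", "code", 0.8)
	// The caller can now tell a server-side failure from a rejection.
	fmt.Println(errors.Is(err, errRejected), errors.Is(err, errTransient))
}
```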

  3. RunBackfill swallows Redis errors and leaves backfill lock held for 2 hours (bug: err != nil and !ok collapsed into one branch)

When acquireBackfillLock returns a Redis error, the function returns nil (no error) and exits before the defer finishBackfill() on line 176. The backfillRunningRedisKey stays held for its 2-hour TTL, causing all subsequent TriggerBackfill calls to return 409 Conflict.

Fix: split the condition — return err when err != nil, only return nil when !ok && err == nil.

// Redis lock is belt-and-suspenders.
ok, release, err := h.acquireBackfillLock(ctx)
if err != nil || !ok {
    logger.Ctx(ctx).Warn("similarity backfill: redis lock unavailable",
        zap.Bool("ok", ok), zap.Error(err))
    return nil
}
defer release()
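
The split suggested in the fix can be sketched as follows. `runBackfill`, `acquireFunc`, and `errRedisDown` are hypothetical names used for illustration; the real handler method and its lock helper are only partially visible in the excerpt above.

```go
package main

import (
	"errors"
	"fmt"
)

var errRedisDown = errors.New("redis: connection refused")

// acquireFunc mirrors the shape of acquireBackfillLock in the excerpt:
// ok reports whether the lock was obtained, release frees it.
type acquireFunc func() (ok bool, release func(), err error)

// runBackfill separates the two outcomes: a Redis failure is returned
// to the caller, while "lock already held" stays a quiet no-op.
func runBackfill(acquire acquireFunc, work func() error) error {
	ok, release, err := acquire()
	if err != nil {
		return fmt.Errorf("similarity backfill: acquire lock: %w", err)
	}
	if !ok {
		return nil // another run holds the lock
	}
	defer release()
	return work()
}

func main() {
	failing := func() (bool, func(), error) { return false, nil, errRedisDown }
	held := func() (bool, func(), error) { return false, nil, nil }
	noop := func() error { return nil }
	fmt.Println(runBackfill(failing, noop))
	fmt.Println(runBackfill(held, noop))
}
```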

  4. stopFpRedisKey constant duplicated across two packages (bug: silent decoupling risk)

The same Redis key "similarity:stop_fp" is independently defined in scan.go (reader) and similarity_stop_fp.go (writer). If either is changed independently, the system silently breaks — scans read an empty set and treat all fingerprints as non-stop.

// stopFpRedisKey holds the current stop-fingerprint set (populated by the
// Task 20 crontab). It is a Redis SET of hex-encoded uint64 fingerprints.
const stopFpRedisKey = "similarity:stop_fp"

const (
    stopFpRedisKey = "similarity:stop_fp"
    stopFpLockKey = "crontab:similarity:stop_fp_refresh:lock"

  5. BuildUserMessage() result silently dropped: zh_cn message has no format directives (bug: users see generic error instead of signal details)

Both Create and UpdateCode pass result.BuildUserMessage() as an extra argument to i18n.NewErrorWithStatus, but the registered message "代码未通过完整性检查,请勿提交压缩或混淆后的代码" ("The code failed the integrity check; do not submit minified or obfuscated code") has no %s format directives. The detailed signal breakdown from BuildUserMessage() is silently discarded by fmt.Sprintf.

if result.Score >= similarity_svc.IntegrityBlockThreshold() {
    return nil, i18n.NewErrorWithStatus(
        ctx, http.StatusBadRequest,
        code.SimilarityIntegrityRejected,
        result.BuildUserMessage(),
    )
}

SimilarityAccessDenied: "无权访问该相似对",
SimilarityIntegrityRejected: "代码未通过完整性检查,请勿提交压缩或混淆后的代码",

  6. Jaccard denominator uses stale FingerprintCntEffective from script B (bug: score can exceed 1.0 or go negative)

denom := effective + other.FingerprintCntEffective - c.CommonCount mixes the current stop-fp-filtered count for script A with a stored count for script B that was computed under a potentially different stop-fp set. If the stop-fp set changed since B was last scanned, the denominator is inconsistent — it can go negative (pair silently dropped) or produce Jaccard > 1.0.

// Jaccard = |A ∩ B| / |A ∪ B| where |A ∪ B| = |A| + |B| - |A ∩ B|.
denom := effective + other.FingerprintCntEffective - c.CommonCount
if denom <= 0 {
    continue
}
jaccard := float64(c.CommonCount) / float64(denom)
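
The effect of an inconsistent denominator is easy to reproduce in isolation. The sketch below reuses the formula from the excerpt with made-up counts; the `jaccard` helper and the numbers are illustrative assumptions, not values from the repository.

```go
package main

import "fmt"

// jaccard applies the excerpt's formula: common / (a + b - common).
// The second return value is false when the pair would be skipped
// because the denominator is not positive.
func jaccard(a, b, common int) (float64, bool) {
	denom := a + b - common
	if denom <= 0 {
		return 0, false
	}
	return float64(common) / float64(denom), true
}

func main() {
	// Consistent counts: |A|=10, |B|=10, |A ∩ B|=5, so |A ∪ B|=15.
	j, ok := jaccard(10, 10, 5)
	fmt.Printf("consistent: %.3f %v\n", j, ok)

	// Stale |B|: B was stored as 4 under an older stop-fp set, while the
	// current scan finds 8 fingerprints in common. The "union" shrinks
	// to 6 and the score leaves the [0, 1] range.
	j, ok = jaccard(10, 4, 8)
	fmt.Printf("stale:      %.3f %v\n", j, ok)

	// Extreme staleness: the denominator collapses and the pair is dropped.
	j, ok = jaccard(5, 0, 5)
	fmt.Printf("collapsed:  %.3f %v\n", j, ok)
}
```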

  7. SyncOnce auto-sync now subject to integrity block with no bypass (bug: upstream code quality change silently breaks sync)

SyncOnce calls UpdateCode() which now includes the integrity pre-check. If an upstream sync URL starts serving minified/obfuscated code, auto-sync is silently rejected with SimilarityIntegrityRejected. There is no mechanism to distinguish system-sync from user submission, and no per-script exemption for the sync path.

}
if _, err := s.UpdateCode(ctx, req); err != nil {
    logger.Error("更新代码失败", zap.String("sync_url", script.SyncUrl), zap.Error(err))
    return err
}


🤖 Generated with Claude Code

CodFrm added 9 commits April 16, 2026 16:03
… constant

Export StopFpRedisKey from similarity_svc (the domain owner) and remove
the duplicate local definition from the stop-fp crontab handler, so a
key change in either place can no longer silently break the other.
…ayer

Extract Redis lock and system_config cursor primitives from similarity_svc
into a new BackfillStateRepo interface in similarity_repo, following the
project's service-locator convention. Service layer retains business logic;
tests updated to use MockBackfillStateRepo instead of function-var faking.

2 participants