Fix: Improve migration safety and performance for chat messages#411
Fix: Improve migration safety and performance for chat messages#411yaojin3616 wants to merge 9 commits intomainfrom
Conversation
- Add idempotency to 'increase_api_key_length' migration - Add database index to ChatMessage.conversation_id - Clean up imports in WeCom API and ChannelUserService
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 77e1cec1d3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| open_id = open_id or external_user_id | ||
| # Feishu external_id must remain tenant-stable user_id only. | ||
| # Never backfill it from open_id. | ||
| external_id = (extra_info.get("external_id") or "").strip() or None |
There was a problem hiding this comment.
Use external_user_id fallback for Feishu external_id
In the Feishu branch, external_id is now derived only from extra_info["external_id"] and no longer falls back to the external_user_id argument. When enrichment data is missing (for example if the contact lookup fails), _find_org_member gets no usable Feishu identifiers and returns None, so existing OrgMember links are skipped and the code repeatedly creates new channel users/shell members for the same person.
Useful? React with 👍 / 👎.
backend/app/api/feishu.py
Outdated
| channel_type="feishu", | ||
| external_user_id=sender_open_id, | ||
| # For Feishu, external_user_id is strictly user_id. | ||
| external_user_id=sender_user_id_feishu or None, |
There was a problem hiding this comment.
Keep deterministic fallback in Feishu file user lookup
_handle_feishu_file now passes sender_user_id_feishu or None as external_user_id, but sender_user_id_feishu is initialized empty and only filled inside a best-effort API call that can fail silently. In those failure cases, file/image messages are processed without a stable external identifier, which breaks identity matching and can create duplicate platform users across repeated messages/retries.
Useful? React with 👍 / 👎.
- Optimize DingTalk/Feishu lookup logic in channel_user_service.py - Fix code indentation and formatting in AgentDetail.tsx
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 433799971a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| conditions.append( | ||
| (OrgMember.unionid == external_user_id) | | ||
| (OrgMember.external_id == external_user_id) | ||
| (OrgMember.unionid == unionid) | | ||
| (OrgMember.external_id == external_id) | ||
| ) |
There was a problem hiding this comment.
Reject empty DingTalk identifiers before OR lookup
In the DingTalk branch, external_user_id is now optional and _get_channel_ids converts empty values to None, but this query still appends (OrgMember.unionid == unionid) | (OrgMember.external_id == external_id) unconditionally. If a webhook arrives without a usable staff ID, this becomes IS NULL OR IS NULL and can match an unrelated active OrgMember for the same provider, incorrectly binding messages to the wrong platform user. Add a guard to return None when both identifiers are missing (like the WeCom/Feishu branches do).
Useful? React with 👍 / 👎.
| if 'ix_chat_messages_conversation_id' not in indexes: | ||
| op.create_index( | ||
| 'ix_chat_messages_conversation_id', | ||
| 'chat_messages', | ||
| ['conversation_id'], | ||
| unique=False | ||
| ) |
There was a problem hiding this comment.
Move chat_messages index into a new Alembic revision
This change adds op.create_index(...) inside the existing increase_api_key_length revision, which keeps the same revision ID. Databases that already applied this revision will not rerun upgrade(), so they will never receive ix_chat_messages_conversation_id and won’t get the intended history-query performance improvement. The index should be introduced via a new migration revision so existing environments can apply it through normal upgrades.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b99433fb00
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if unionid: | ||
| lookup_conditions.append(OrgMember.unionid == unionid) | ||
| if external_id: | ||
| lookup_conditions.append(OrgMember.external_id == external_id) | ||
| if not lookup_conditions: |
There was a problem hiding this comment.
Match Feishu members by open_id during org lookup
The Feishu branch now builds lookup predicates from only unionid and external_id, but drops open_id entirely, so messages that only carry open_id (for example when Contact API enrichment fails) can no longer match existing OrgMember rows keyed by open_id. In that case resolve_channel_user falls through to ChannelUserResolutionError and skips processing for valid users. Fresh evidence in this commit: test_channel_user_service_uses_feishu_open_id_for_existing_member_lookup expects this lookup path, but these lines never add an open_id condition.
Useful? React with 👍 / 👎.
| return ExternalUserInfo( | ||
| provider_type=self.provider_type, | ||
| provider_user_id=info_data.get("open_id", ""), | ||
| provider_union_id=info_data.get("union_id"), | ||
| name=info_data.get("name", ""), |
There was a problem hiding this comment.
Fill Feishu provider_user_id before returning user info
FeishuAuthProvider.get_user_info now returns ExternalUserInfo without setting provider_user_id, while the same commit changed identity binding paths to prefer provider_user_id (BaseAuthProvider.find_or_create_user and /auth/{provider}/bind). This leaves Feishu SSO flows dependent on optional fallback fields and can miss identity linkage when union_id is absent/unstable, causing duplicate or unbound identities. Fresh evidence in this commit: test_feishu_auth_provider_prefers_contact_user_id_over_open_id expects a non-null provider_user_id, but this return payload cannot provide one.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 29af7ab2e6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| return ExternalUserInfo( | ||
| provider_type=self.provider_type, | ||
| provider_user_id=info_data.get("openId", ""), | ||
| provider_union_id=info_data.get("unionId"), | ||
| name=info_data.get("nick", ""), |
There was a problem hiding this comment.
Populate DingTalk provider_user_id before returning user info
DingTalkAuthProvider.get_user_info now returns ExternalUserInfo without provider_user_id, but downstream SSO code still treats that value as a required string (for example, BaseAuthProvider.find_or_create_user passes it into sso_service.link_identity, whose shell-member path uses provider_user_id[:8]). For first-time DingTalk logins/binds where no existing OrgMember is found, this becomes a NoneType slice and the identity-link flow fails instead of completing login.
Useful? React with 👍 / 👎.
Summary
This PR addresses several areas of codebase hygiene and performance:
Checklist