i18n(ja): standardize SQL type names to English #23145
[APPROVALNOTIFIER] This PR is NOT APPROVED
Code Review
This pull request standardizes database type names (such as SMALLINT, BIGINT, float, double, and VARCHAR) across several Japanese documentation files, replacing their Japanese transliterations with standard English technical terms. The review feedback identifies three issues: a duplicate table header row in ticdc-avro-protocol.md, a typo (VARCHAR々) in ticdc-canal-json.md that should be corrected to VARCHAR, and an incorrect uppercase Golang type (FLOAT64) in ticdc-simple-protocol.md that should be changed to lowercase float64 for technical accuracy.
…lapping files) Cherry-picked commit bdbb6f2 but excluded the 8 files that PR pingcap#23145 (i18n-ja-fix-type-names) handles independently. Kept changes to: data-type-json.md, ticdc/ticdc-debezium.md, tidb-cloud/data-service-app-config-files.md, tidb-cloud/tidb-cloud-console-auditing.md, ai/integrations/vector-search-integrate-with-langchain.md, functions-and-operators/numeric-functions-and-operators.md
ff8ff2a to 1775cf7
Hi @yahonda, would you please resolve the conflicts in this PR? Thanks.
Replace Japanese transliterations/katakana of SQL/programming type identifiers with canonical English names in all documentation files where they appeared as type labels in mapping tables:

SQL TYPE column (uppercase, matching EN source)
- スモールイント → SMALLINT
- ミディアムミント → MEDIUMINT
- ビッグイント → BIGINT
- フロート → FLOAT
- ダブル → DOUBLE
- タイニーイント → TINYINT
- 十進数 → DECIMAL
- チャー / チャール → CHAR
- バイナリ → BINARY
- 二進法 → VARBINARY
- タイニーブロブ → TINYBLOB
- ミディアムブロブ → MEDIUMBLOB
- ロングブロブ → LONGBLOB
- 小さなテキスト / 小さな文字 → TINYTEXT
- 中テキスト → MEDIUMTEXT
- 長文 → LONGTEXT
- ヴァルチャー → VARCHAR
- 可変長文字 → VARCHAR
- 列挙型 → ENUM
- タイムスタンプ → TIMESTAMP
- 日付 → DATE
- 日時 → DATETIME
- 時間 → TIME
- 年 → YEAR
- 少し → BIT
- ブール / ブール値 → BOOL / BOOLEAN
- 署名なし / 未署名 / 符号なし → UNSIGNED

JAVASCRIPT TYPE column (lowercase, matching EN source)
- 番号 → number
- 文字列 → string
- ヌル → null
- 整数 → int
- 長さ → long
- バイト → bytes

PARQUET TYPE column
- バイト配列 → BYTE_ARRAY
- 固定長バイト配列 → FIXED_LEN_BYTE_ARRAY
- タイムスタンプマイクロ → TIMESTAMP_MICROS

Also fixed column headers to Japanese where tables were fully replaced (e.g. 'TiDB Cloud Serverlessの型 / JavaScriptの型', 'Parquet プリミティブ型 / Parquet 論理型 / TiDBまたはMySQLの型').

Affected files: ticdc-avro-protocol, ticdc-canal-json, ticdc-csv, ticdc-open-protocol, ticdc-simple-protocol, serverless-driver.md, serverless-export.md, import-parquet-files*.md, bookshop-schema-design.md, unique-serial-number-generation.md, system-variables.md (タイプ: フロート → タイプ: float), tidb-configuration-file.md, tidb-cloud-auditing.md

This completes the standardization of SQL/configuration type names across the entire i18n-ja-release-8.5 branch.
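The mapping above is applied only to type labels in mapping tables, not to running text. A minimal sketch of that idea in Python follows. The function name `fix_table_row`, the whole-cell matching rule, and the reduced `TYPE_NAMES` subset are assumptions made for illustration. This is not the tooling actually used for this PR.

```python
# Hypothetical subset of the mapping listed in the commit message above.
TYPE_NAMES = {
    "スモールイント": "SMALLINT",
    "ビッグイント": "BIGINT",
    "ヴァルチャー": "VARCHAR",
    "タイムスタンプ": "TIMESTAMP",
    "日時": "DATETIME",
    "日付": "DATE",
}


def fix_table_row(line: str) -> str:
    """Replace transliterated type names that fill a whole Markdown table cell.

    Lines that are not table rows are returned unchanged, so words such as
    日付 in running text are never touched.
    """
    if not line.lstrip().startswith("|"):
        return line
    fixed_cells = []
    for cell in line.split("|"):
        stripped = cell.strip()
        if stripped in TYPE_NAMES:
            # Swap only the cell content and keep the original padding.
            cell = cell.replace(stripped, TYPE_NAMES[stripped])
        fixed_cells.append(cell)
    return "|".join(fixed_cells)
```

Matching whole cells rather than substrings avoids rewriting a cell that merely contains one of these words inside a longer Japanese description.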
- ticdc-avro-protocol: remove duplicate table header row
- ticdc-canal-json: fix VARCHARル leftover typo
- ticdc-simple-protocol: fix FLOAT64 → float64 (Go convention)
…ma tables

Fix 6 schema description tables in dev-guide-bookshop-schema-design.md:
- Column header: タイプ → 型
- Field names: restore Japanese translations to English (e.g., タイトル → title, ストック → stock, 名前 → name, etc.)
- Type values: restore katakana to canonical SQL types (e.g., ビギント → BIGINT, 小さな整数 → TINYINT, etc.)
- Descriptions: kept in Japanese as-is

i18n(ja): fix 整数 → int in sequence table

i18n(ja): フィールドタイプ → フィールドの型
i18n(ja): fix remaining type names in ticdc-canal-json type tables
- First table (MySQL Type mapping): binary, varbinary, text variants, blob variants, date/time types, SET, BIT, TiDBVectorFloat32
- Second table (Integer types): SMALLINT, MEDIUMINT, INT, BIGINT, UNSIGNED variants
- Third table (Java SQL Type): INTEGER, REAL, VARCHAR, CLOB, BIT, DATE, TIME, TIMESTAMP, BLOB
i18n(ja): fix remaining integer type names in canal-json
- tinyint unsigned → TINYINT UNSIGNED
- mediumint unsigned → MEDIUMINT UNSIGNED
- 整数 → INT
- [128、255] → [128, 255] (Japanese comma → ASCII comma)

i18n(ja): fix column type code table in ticdc-open-protocol
- Header: タイプ → 型
- ヌル → NULL, タイムスタンプ → TIMESTAMP
- 日付 → DATE, 時間 → TIME, 日時 → DATETIME, 年 → YEAR
- ブール値 → BOOLEAN, 少し → BIT
- 列挙型 → ENUM, セット → SET, 幾何学 → GEOMETRY
- 文字/バイナリ → CHAR/BINARY
- TiDBベクトルFLOAT32 → TiDBVectorFloat32
- 10月14日 → 10/14 (MT mistakenly translated the code as a date)

i18n(ja): fix 少し → Bit in bit flags table header
- Header: mysqlタイプ → mysqlType, and all column headers to EN
- All Japanese type names → canonical SQL types (lowercase like EN)
- 長さ → long, 弦 → string, バイト → bytes, FLOAT → float, DOUBLE → double
- 少し → BIT, ブール → BOOL, 列挙型 → ENUM, etc.
- TiDBベクトルFLOAT32 → TiDBVectorFloat32
Fix 3 audit log field tables in tidb-cloud-auditing.md:
- Field names restored to EN source (EVENT_CLASS, COST_TIME, etc.)
- Type names were already INTEGER/VARCHAR/TIMESTAMP/FLOAT
- Descriptions kept in Japanese as-is
- Additional CONNECTION and TABLE_ACCESS/GENERAL tables also fixed

i18n(ja): 社内使用 → 内部使用 for 'internal use'

i18n(ja): fix bit flags table
- 価値 → Value, name column to English
i18n(ja): Others → その他 (label, not a type name)
- data-type-json.md: JSON value type table (タイプ → 型, type values to EN)
- data-type-date-and-time.md: zero value date type names to EN
- tidb-limitations.md: CHAR/BINARY/VARCHAR/BLOB type names to EN
- tidb-cloud/tidb-cloud-console-auditing.md: audit event field and type names to EN
- data-type-numeric.md: UNSIGNED/ZEROFILL syntax elements to EN
- develop/dev-guide-create-secondary-indexes.md: bookshop schema table (same pattern)

i18n(ja): fix programming type names in protocol field tables

Change Japanese programming type names to English in protocol field definition tables across 6 files:
- 弦 → string
- 番号 → number
- 物体 → object
- ブール値/ブール → boolean (JavaScript/JSON types)
- 整数 → integer (config param types)

Affected: ticdc-simple-protocol, ticdc-canal-json, ticdc-open-protocol, ticdc-debezium, develop/serverless-driver (config table), tidb-cloud/data-service-app-config-files (config table)

i18n(ja): 関数 → function in config table type column

i18n(ja): タイプ → 型 in SQL level options table header
i18n(ja): タイプ → 型 in system-variables with English type values
- タイプ: ブール値 → 型: Boolean (161)
- タイプ: 列挙型 → 型: Enumeration (31)
- タイプ: 時間 → 型: Time (6)
- タイプ: float → 型: Float (40)
- タイプ:期間 → 型: Duration (5)

i18n(ja): タイプ: float → 型: Float in tidb-configuration-file

i18n(ja): 型: 整数 → 型: Integer in tidb-configuration-file
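The `タイプ: X → 型: Y` rewrites for system-variables listed above could be scripted roughly as follows. This is a sketch only. The helper name `fix_type_label`, the regular expression, and the tolerance for a full-width colon are assumptions, and the `LABELS` table covers just the five values named in that commit message.

```python
import re

# Values taken from the system-variables commit message above.
LABELS = {
    "ブール値": "Boolean",
    "列挙型": "Enumeration",
    "時間": "Time",
    "float": "Float",
    "期間": "Duration",
}

# Assumption: the label may be followed by an ASCII or full-width colon,
# with or without surrounding spaces (the source shows "タイプ:期間").
TYPE_LABEL_RE = re.compile(r"タイプ\s*[::]\s*(\S+)")


def fix_type_label(line: str) -> str:
    """Rewrite 'タイプ: <value>' to '型: <English value>' for known values."""

    def repl(match: re.Match) -> str:
        value = match.group(1)
        if value not in LABELS:
            return match.group(0)  # unknown values are left for manual review
        return f"型: {LABELS[value]}"

    return TYPE_LABEL_RE.sub(repl, line)
```

Leaving unknown values untouched keeps the script from silently producing a half-translated label.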
- Remove duplicated SET/RESOURCE_GROUP/CREATE/ADMIN/AS/VEC_COSINE_DISTANCE words that were left over from English sentence structure
- コネクテッドケア → Connected Care (service name, 9 files)
- コネクテッド:/接続済み:クリニックサービス → Connected: Clinic Service
- PingCAPクリニックサービス → PingCAP Clinic Service
- クリニック → Clinic (in connected care context)
- クリニックサーバー → Clinic Server (4 files, 20 occurrences)
- TiDB Cloudクリニック → TiDB Cloud Clinic
- PingCAPクリニック → PingCAP Clinic
- クリニック → Clinic (in service/feature references)
- クリニック サービス → Clinic Service
This reverts commit dc0e1b5.
…etail.md" This reverts commit f0ace21.
….0-dmr.md" This reverts commit 5ae7391.
This reverts commit 7801cdf.
This reverts commit 3c07fb9.
The link text was `VEC_COSINE_DISTANCE()()` instead of `VEC_COSINE_DISTANCE()`.
These changes belong in PR pingcap#23159 (i18n-ja-fix-config-category-names) which already includes:
- TiDB Cloud Clinic (PingCAP クリニック → PingCAP Clinic)
- Connected Care (コネクテッドケア → Connected Care)
- Config category heading translations
- TOC entries with the same Clinic/Connected Care changes

This PR (pingcap#23145) is now scoped to SQL and programming type names only.
Several links had botched edits where the text outside the link was
accidentally duplicated inside the link, similar to the VEC_COSINE_DISTANCE
fix from the previous commit. The original lines had patterns like:
WORD [`WORD rest`](url)
which were changed to:
[`WORD rest rest`](url)
instead of the correct:
[`WORD rest`](url)
Fixed:
- `CREATE USER USER` → `CREATE USER` (release-7.0.0.md)
- `AS OF TIMESTAMP OF TIMESTAMP` → `AS OF TIMESTAMP` (release-6.6.0.md)
- `SHOW STATS_META STATS_META` → `SHOW STATS_META` (statistics.md)
- `ADMIN RESUME DDL JOBS RESUME DDL JOBS` → `ADMIN RESUME DDL JOBS` (sql-statement-admin.md)
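The duplicated-tail pattern in the list above can be detected mechanically. The following is a minimal sketch, assuming the link text is a backtick-quoted run of whitespace-separated words. The names `dedupe_tail` and `fix_links` are hypothetical, and the `VEC_COSINE_DISTANCE()()` case is not covered because it has no whitespace between the repeated parts.

```python
import re

# Matches Markdown links whose text is a single code span: [`text`](url)
LINK_RE = re.compile(r"\[`([^`]+)`\]\(([^)]+)\)")


def dedupe_tail(text: str) -> str:
    """Drop a trailing run of words that immediately repeats the run before it."""
    tokens = text.split()
    # Try the longest possible repeated run first.
    for k in range(len(tokens) // 2, 0, -1):
        if tokens[-k:] == tokens[-2 * k:-k]:
            return " ".join(tokens[:-k])
    return text


def fix_links(line: str) -> str:
    """Apply dedupe_tail to the text of every code-span link in a line."""
    return LINK_RE.sub(
        lambda m: f"[`{dedupe_tail(m.group(1))}`]({m.group(2)})", line
    )
```

A statement that legitimately repeats a word at its end would also be shortened, so the output is best treated as a list of candidates to compare against the EN source rather than applied blindly.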
@qiancai: Your lgtm message is repeated, so it is ignored.
What is changed, added or deleted? (Required)
Standardize SQL and programming type names from Japanese transliterations to canonical English forms across 32 files (~916 changes).
All SQL type identifiers in mapping tables are now in English (matching the EN source), while table column headers are kept in Japanese for readability.
Type name changes
The main categories of changes:
Broken link fixes
Several lines had patterns like WORD [`WORD rest`](url) which were botched into [`WORD rest rest`](url) (duplicate text inside the link). Fixed to [`WORD rest`](url) to match the EN source:
- `VEC_COSINE_DISTANCE()()` → `VEC_COSINE_DISTANCE()` (ai/integrations/vector-search-auto-embedding-overview.md)
- `CREATE USER USER` → `CREATE USER` (release-7.0.0.md)
- `AS OF TIMESTAMP OF TIMESTAMP` → `AS OF TIMESTAMP` (release-6.6.0.md)
- `SHOW STATS_META STATS_META` → `SHOW STATS_META` (statistics.md)
- `ADMIN RESUME DDL JOBS RESUME DDL JOBS` → `ADMIN RESUME DDL JOBS` (sql-statement-admin.md)
Not changed (out of scope)
- ダブルクリック (double click), ダブルクォート (double quote), バイト配列 (byte array) in running text
Which TiDB version(s) do your changes apply to? (Required)
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?