Hi TN: Implement Serial tagger #440
…res, number chains, and mathematical powers Signed-off-by: Shreyas Pawar <shrpawar@nvidia.com>
for more information, see https://pre-commit.ci
@@ -0,0 +1,68 @@
अ अ
if these chars don't go through any changes, let's not map them as a transformation
Fixed: the file is now loaded as a plain acceptor via pynini.string_file directly instead of pynini.project(..., "input"), and the TSV file has been updated accordingly.
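For reference, the TSV cleanup described above could be checked with a small script along these lines. This is only a sketch using the standard library: the helper name `split_identity_rows` and the sample row `१	1` are hypothetical and not taken from the PR. It separates rows whose two columns are identical (candidates for a single-column acceptor file) from rows that actually transform their input.

```python
def split_identity_rows(tsv_text):
    """Split TSV rows into acceptor entries and true input -> output mappings.

    A row counts as an acceptor entry when it has a single column or when
    both columns are identical, i.e. the character does not change.
    """
    acceptor_rows = []
    mapping_rows = []
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        cols = line.split("\t")
        if len(cols) == 1 or cols[0] == cols[1]:
            acceptor_rows.append(cols[0])
        else:
            mapping_rows.append((cols[0], cols[1]))
    return acceptor_rows, mapping_rows


# Hypothetical sample: two identity rows and one real transformation.
sample = "अ\tअ\nआ\tआ\n१\t1\n"
acceptors, mappings = split_identity_rows(sample)

# The single-column content that an acceptor-style TSV would hold.
single_column_tsv = "\n".join(acceptors)
```

The single-column output is the kind of file that can be loaded as an acceptor without projecting away an output side.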
pure_cardinal_words = pynini.compose(cardinal.fst, strip_cardinal_tags).optimize()

length_filter = pynini.closure(any_digit, 1, 3)
limited_cardinal = pynini.compose(length_filter, pure_cardinal_words).optimize()
Instead of using cardinal.fst and then stripping tags, look at how the English serial tagger accesses the cardinal graph.
Updated to reuse cardinal's exposed subgraphs directly (cardinal.digit | cardinal.zero | cardinal.teens_and_ties | cardinal.graph_hundreds), same pattern as the EN serial tagger and our own ordinal tagger. Removed the cardinal.fst + tag-stripping logic entirely.
Signed-off-by: Shreyas Pawar <shrpawar@nvidia.com>
for more information, see https://pre-commit.ci
What does this PR do?
Hi TN: Implement Serial tagger for Devanagari-numeric mixtures, number chains, and mathematical powers
Before your PR is "Ready for review"
Pre checks:
git commit -s to sign.
pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly @pytest.mark.run_only_on('CPU')).
bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
pytest and Sparrowhawk here.
__init__.py for every folder and subfolder, including data folder which has .TSV files?
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
Copyright 2015 and onwards Google, Inc.. See an example here.
try import: ... except: ...) if not already done.
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.