EN TN Fixes for Issue 166 #207

Merged: tbartley94 merged 15 commits into main from EN-TN-Fixes-081524 (Aug 19, 2024)

Conversation

@zoobereq commented Aug 15, 2024

What does this PR do?

This PR provides a fix for Issue #166 for English TN.

Before your PR is "Ready for review"

Pre checks:

  • Have you signed off your commits? Use git commit -s to add the Signed-off-by line.
  • Do all unit tests finish successfully before sending the PR?
    1. Run pytest or, if your machine does not have a GPU, pytest --cpu from the root folder (provided you marked your test cases accordingly with @pytest.mark.run_only_on('CPU')).
    2. Run the Sparrowhawk tests: bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
  • If you are adding a new feature: have you added test cases for both pytest and Sparrowhawk here?
  • Have you added __init__.py to every folder and subfolder, including data folders that contain .TSV files?
  • Have you followed the CodeQL results and removed unused variables and imports (the report is at the bottom of the PR in the GitHub review box)?
  • Have you added the correct license header ("Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.") to all newly added Python files?
  • If you copied nemo_text_processing/text_normalization/en/graph_utils.py, your header's second line should be "Copyright 2015 and onwards Google, Inc.". See an example here.
  • Remove import guards (try import: ... except: ...) if not already done.
  • If you added a new language or a new feature, please update the NeMo documentation (it lives in a different repo).
  • Have you added your language support to tools/text_processing_deployment/pynini_export.py?
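
The __init__.py item in the checklist above can be checked mechanically. The following is a minimal sketch only: dirs_missing_init is a hypothetical helper written for illustration and is not part of NeMo-text-processing, and the choice to skip hidden folders and __pycache__ is an assumption.

```python
from pathlib import Path
from typing import List


def dirs_missing_init(root: str) -> List[str]:
    """Return folders under `root` (root included) that lack an __init__.py.

    Hypothetical helper for illustration; paths are relative to `root`,
    and the root itself is reported as ".".
    """
    root_path = Path(root)
    missing = []
    for folder in [root_path, *root_path.rglob("*")]:
        if not folder.is_dir():
            continue
        rel = folder.relative_to(root_path)
        # Assumption: hidden folders and bytecode caches are not packages.
        if any(part.startswith(".") or part == "__pycache__" for part in rel.parts):
            continue
        if not (folder / "__init__.py").is_file():
            missing.append(rel.as_posix())
    return sorted(missing)
```

Calling dirs_missing_init on a package folder returns an empty list when every folder and subfolder, data folders included, already has an __init__.py.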

PR Type:

  • New Feature
  • Bugfix
  • Documentation
  • Test

If you haven't finished some of the above items, you can still open a "Draft" PR.

@zoobereq marked this pull request as ready for review August 19, 2024 15:25
@zoobereq requested review from ekmb and tbartley94 August 19, 2024 15:25

@tbartley94 left a comment

LGTM

@tbartley94 merged commit 6048ce4 into main Aug 19, 2024
BuyuanCui pushed a commit that referenced this pull request Aug 20, 2024
* Rebases the updated main

Signed-off-by: Simon Zuberek <[email protected]>

* Passes Pynini fails SP

Signed-off-by: Simon Zuberek <[email protected]>

* Adjusts the weights on the domain graph

Signed-off-by: Simon Zuberek <[email protected]>

* Enables semiotic classes for SP tests

Signed-off-by: Simon Zuberek <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Reweights the tokenizer

Signed-off-by: Simon Zuberek <[email protected]>

* Updates test cases

Signed-off-by: Simon Zuberek <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updates test cases

Signed-off-by: Simon Zuberek <[email protected]>

* Cleans up ELECTRONIC tagger

Signed-off-by: Simon Zuberek <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updates test cases

Signed-off-by: Simon Zuberek <[email protected]>

* Updates Jenkins

Signed-off-by: Simon Zuberek <[email protected]>

* Enables all CI tests

Signed-off-by: Simon Zuberek <[email protected]>

* Updates EN TN Cache

Signed-off-by: Simon Zuberek <[email protected]>

---------

Signed-off-by: Simon Zuberek <[email protected]>
Co-authored-by: Simon Zuberek <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui pushed a commit that referenced this pull request Sep 19, 2024
BuyuanCui pushed a commit that referenced this pull request Sep 26, 2024
BuyuanCui pushed a commit that referenced this pull request Sep 26, 2024
BuyuanCui pushed a commit that referenced this pull request Oct 16, 2024
ngachchi pushed a commit to ngachchi/NeMo-text-processing that referenced this pull request Jun 23, 2025
Signed-off-by: Namrata Gachchi <[email protected]>
FredHaa pushed a commit to FredHaa/NeMo-text-processing that referenced this pull request Aug 15, 2025
mgrafu pushed a commit that referenced this pull request Mar 13, 2026
Labels

None yet

Projects

None yet

2 participants