Trigger unit tests for docker images upload workflow #3329

Draft
xibinliu wants to merge 1 commit into xibin/ci from xibin/ci2

Collaborator

xibinliu commented Mar 6, 2026

Description

  • Images are tagged with the current date only when unit tests pass
  • Images are tagged "latest" only when unit tests pass

Tests

See this manually triggered workflow run

Old:
Another old workflow run - by calling tests from another file.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

- all
- tpu
- gpu
for_dev_test:
Collaborator

What is the use case for this?

Collaborator Author

This keeps the images generated by development tests of this workflow in a separate image repo, so the production image repo is not polluted.
(From our previous discussion: "For testing purposes, can you change the name of the docker image so that it doesn't pollute our production images.")
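
For illustration, one way an input like `for_dev_test` could redirect pushes is sketched below. This is only a sketch under assumptions: the input's type, description, and default, the `IMAGE_REPO` variable, and the `gcr.io/example-dev-project` path are hypothetical and not taken from this PR. Only `for_dev_test` and `gcr.io/tpu-prod-env-multipod` appear in the discussion.

```yaml
# Sketch only. Everything except `for_dev_test` and the production
# registry path is a hypothetical placeholder.
on:
  workflow_dispatch:
    inputs:
      for_dev_test:
        description: Push images to a separate repo for development tests
        type: boolean
        default: false

env:
  # In the `inputs` context a boolean input keeps its boolean type, so the
  # `&& ... || ...` idiom selects the dev repo only when the box is checked.
  IMAGE_REPO: ${{ inputs.for_dev_test && 'gcr.io/example-dev-project' || 'gcr.io/tpu-prod-env-multipod' }}
```

On runs that do not supply the input (for example a schedule trigger), the expression should fall through to the production path, so the default behavior stays unchanged.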

xibinliu mentioned this pull request Mar 6, 2026 (4 tasks)
xibinliu force-pushed the xibin/ci2 branch 2 times, most recently from ffd00f3 to ea56990 on March 6, 2026 21:33
context: .
file: ${{ inputs.dockerfile }}
tags: gcr.io/tpu-prod-env-multipod/${{ inputs.image_name }}:latest
tags: gcr.io/tpu-prod-env-multipod/${{ inputs.image_name }}:${{ inputs.image_date }}-build-${{ github.run_id }}
Collaborator Author

xibinliu commented Mar 6, 2026

@SurbhiJainUSC we need to discuss which tags we want to add to the images before and after testing.

Current:

  • before tests: ${{ github.run_id }}, and hashes of the repos
  • after tests: "latest", and "<image_date>"
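
The before/after split above could look roughly like the following job. This is a sketch under assumptions, not the PR's actual workflow: the job name, runner, `REPO` and `CANDIDATE` variables, the use of `docker/build-push-action`, and the test command are hypothetical, and registry authentication is omitted. The repo-hash tags mentioned above are left out for brevity.

```yaml
# Sketch only: job name, runner, test command, and the REPO/CANDIDATE
# variables are assumptions. Registry authentication is omitted.
jobs:
  build_test_tag:
    runs-on: ubuntu-latest
    env:
      REPO: gcr.io/tpu-prod-env-multipod/${{ inputs.image_name }}
      CANDIDATE: gcr.io/tpu-prod-env-multipod/${{ inputs.image_name }}:${{ inputs.image_date }}-build-${{ github.run_id }}
    steps:
      - uses: actions/checkout@v4

      # Before tests: push under a run-scoped tag only.
      - name: Build and push Docker image
        uses: docker/build-push-action@v6
        with:
          context: .
          file: ${{ inputs.dockerfile }}
          push: true
          tags: ${{ env.CANDIDATE }}

      # A failing step stops the job, so the tagging step below is skipped.
      - name: Test Docker image
        run: docker run --rm "$CANDIDATE" python3 -m pytest  # hypothetical test command

      # After tests: add "latest" and the date tag to the same image.
      - name: Add tags to Docker image
        run: |
          gcloud container images add-tag "$CANDIDATE" \
            "$REPO:latest" "$REPO:${{ inputs.image_date }}" --quiet
```

Because steps run in order and later steps are skipped after a failure by default, an image that fails its tests keeps only the run-scoped tag and never becomes `latest`.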

xibinliu force-pushed the xibin/ci2 branch 2 times, most recently from 3b2089d to 8ca7650 on March 7, 2026 00:01

codecov bot commented Mar 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

xibinliu force-pushed the xibin/ci2 branch 2 times, most recently from be50e5b to 71076a6 on March 12, 2026 18:42
Collaborator

Why do we need a different workflow for testing the docker image?

Can we update build_and_push_docker_image.yml to something like this:

# https://github.com/AI-Hypercomputer/maxtext/blob/main/.github/workflows/build_and_push_docker_image.yml#L111
- name: Build and push Docker image

# Runs tests on the Docker image tagged with ${{ github.run_id }}
- name: Test Docker Image

# Add tags such as https://github.com/AI-Hypercomputer/maxtext/blob/main/.github/workflows/build_and_push_docker_image.yml#L129
- name: Add tags to Docker Image

Collaborator Author

Merged the jobs into build_and_push_docker_image.yml; tests now run immediately after the image has been pushed.

One caveat: since tpu-post-training-nightly depends on tpu-post-training-stable, the whole workflow takes longer to finish because the nightly build now waits for the stable image's tests to complete.
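
The dependency behind this caveat can be pictured as follows. This is a hedged sketch: it assumes both jobs call build_and_push_docker_image.yml as a reusable workflow (declaring `on: workflow_call`), and the `with:` values are hypothetical placeholders. Other inputs the real workflow may require, such as `image_date`, are not shown.

```yaml
# Sketch only: the `with:` values are hypothetical, and the called workflow
# is assumed to declare `on: workflow_call` with these inputs.
jobs:
  tpu-post-training-stable:
    uses: ./.github/workflows/build_and_push_docker_image.yml
    with:
      image_name: example_stable_image       # hypothetical
      dockerfile: path/to/stable.Dockerfile  # hypothetical

  tpu-post-training-nightly:
    # Starts only after the stable job, including its tests, has succeeded.
    needs: tpu-post-training-stable
    uses: ./.github/workflows/build_and_push_docker_image.yml
    with:
      image_name: example_nightly_image       # hypothetical
      dockerfile: path/to/nightly.Dockerfile  # hypothetical
```

With `needs`, the nightly job's start time now includes the stable image's build and test time, which is the extra wall-clock cost described above.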
