Newly added sparse checkout functionality is not cleaning up after itself #1475

Closed
markjm opened this issue Sep 15, 2023 · 17 comments · Fixed by #1598

Comments

@markjm

markjm commented Sep 15, 2023

Thanks @dscho for contributing #1369. We have started using it and noticed the following issue:

repo-a / .github/workflows/workflow-a.yml

jobs:
  job-1:
    runs-on: internal-runner
    steps:
      - uses: actions/checkout@v3
        with:
          sparse-checkout: |
            .github
            src

Running just the above works as expected. But consider another workflow in the same repo:

repo-a / .github/workflows/workflow-b.yml

jobs:
  job-12:
    runs-on: internal-runner
    steps:
      - uses: actions/checkout@v3
      - run: ls ./not-src

When workflow-a runs first and workflow-b then runs on the same runner, workflow-b will not find any files in ./not-src.

This is because workflow workflow-b runs in the same _work directory as workflow-a. On the first run of workflow-a, we have polluted the repo in a way this action doesn't know about (and thus can't clean up).

Notice that, even in workflow-b:

* git config --list will show core.sparseCheckout true
* git sparse-checkout list will still include the listed paths
* and a simple ls will only show the sparse directories for workflow-a
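As a minimal sketch of the first check above, a step on the self-hosted runner could ask Git directly whether the reused worktree is still sparse. The helper name `report_sparse_state` is made up for illustration and is not part of actions/checkout.

```shell
#!/usr/bin/env bash
# report_sparse_state is a hypothetical helper, not part of actions/checkout.
# Prints "sparse" if core.sparseCheckout is true in the given worktree,
# otherwise prints "full".
report_sparse_state() {
  local dir="${1:-.}"
  # `git config --type bool` normalizes the value to "true"/"false" and
  # prints nothing (with a non-zero exit status) when the key is unset.
  if test true = "$(git -C "$dir" config --type bool core.sparseCheckout 2>/dev/null)"
  then
    echo "sparse"
  else
    echo "full"
  fi
}
```

When this prints "sparse", running `git sparse-checkout list` in the same directory should show the patterns left behind by the earlier workflow.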

It seems that, similar to the auth cleanup (and other pre-run steps) this action does, we need to disable sparse checkout in the pre steps of this action.

Note: this is likely only an issue for self-hosted runners, which don't get the same cleaning/re-init process that GitHub-hosted runners get.

Similar-ish issue to recently reported 47fbe2d (for submodules, instead of sparse checkout). @megamanics

@dscho
Contributor

dscho commented Sep 18, 2023

> this is likely only an issue for self-hosted runners

Indeed.

The working directory's path typically contains the repository name, e.g. D:\a\_work\repo-a\repo-a, but not the workflow name.

A solid work-around would be to scorch the worktree (if it exists), right before the actions/checkout step.
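A rough sketch of that "scorch" approach, assuming it is acceptable to throw away the whole worktree (including `.git`) and pay for a fresh clone on every run. The function name `scorch_worktree` is hypothetical; `GITHUB_WORKSPACE` is the runner's default workspace variable.

```shell
#!/usr/bin/env bash
# scorch_worktree is a hypothetical helper name.
# Deletes everything inside the directory (including .git and dotfiles)
# but keeps the directory itself, so the next checkout starts from scratch.
scorch_worktree() {
  local dir="${1:?usage: scorch_worktree <directory>}"
  # Nothing to do if the worktree does not exist yet.
  [ -d "$dir" ] || return 0
  # -mindepth 1 spares the directory itself; -delete removes depth-first.
  find "$dir" -mindepth 1 -delete
}

# In a workflow step placed before actions/checkout, this might be called as:
#   scorch_worktree "$GITHUB_WORKSPACE"
```

The trade-off is that nothing from the previous run is reused, so every job fetches the repository again.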

A more surgical work-around would be to use a shell script like this:

- shell: bash
  run: |
    if test true = "$(git config --type bool core.sparsecheckout)"
    then
      rm "$(git rev-parse --git-path info/sparse-checkout)" &&
      git sparse-checkout disable
    fi

@markjm
Author

markjm commented Sep 19, 2023

We are looking to use such a workaround in the interim, but it seems this is something that should be fixed here within actions/checkout, no?

@jinhyoo-mp

This caused CI issues for us, so we had to remove sparse checkout.

@vibro

vibro commented Jan 17, 2024

I also ran into this with two different workflows, one of which uses sparse checkout and one of which doesn't. I ended up removing the contents of the directory prior to checkout to force a clean checkout every time.

@aamagda

aamagda commented Jan 29, 2024

@dscho Hi! Do you have plans to fix this issue?

@dscho
Contributor

dscho commented Jan 29, 2024

@aamagda time constraints force me to say "no".

Having said that, if you can start implementing that (probably by specifically resetting the sparse checkout if none is desired), and verify that it works, I can take it from there.
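The idea of resetting the sparse checkout when none is requested could be sketched in shell roughly as follows. This is only an illustration of the logic, not the code of actions/checkout (which is written in TypeScript); the function name `apply_sparse_setting` and its two parameters are assumptions made for the example.

```shell
#!/usr/bin/env bash
# apply_sparse_setting is a hypothetical helper; it is not the action's code.
# $1: path to an existing worktree
# $2: newline-separated sparse-checkout patterns (may be empty)
apply_sparse_setting() {
  local dir="$1" patterns="$2"
  if [ -n "$patterns" ]; then
    # A sparse checkout was requested: (re)apply the requested patterns.
    printf '%s\n' "$patterns" | git -C "$dir" sparse-checkout set --stdin
  elif test true = "$(git -C "$dir" config --type bool core.sparseCheckout 2>/dev/null)"
  then
    # No sparse checkout requested, but an earlier run left one behind.
    git -C "$dir" sparse-checkout disable
  fi
}
```

Calling `apply_sparse_setting "$GITHUB_WORKSPACE" ""` on a worktree that an earlier job made sparse should bring the excluded files back, assuming the needed objects are available locally or can be fetched.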

dscho added a commit to dscho/checkout that referenced this issue Jan 31, 2024
This should allow users to reuse existing folders when running
`actions/checkout` where a previous run asked for a sparse checkout but
the current run does not ask for a sparse checkout.

This fixes actions#1475

Signed-off-by: Johannes Schindelin <[email protected]>
jww3 added a commit that referenced this issue Feb 21, 2024
When a worktree is reused by actions/checkout and the first time sparse checkout was enabled, we need to ensure that the second time it is only a sparse checkout if explicitly asked for. Otherwise, we need to disable the sparse checkout so that a full checkout is the outcome of this Action.

## Details
* If no `sparse-checkout` parameter is specified, disable it

This should allow users to reuse existing folders when running
`actions/checkout` where a previous run asked for a sparse checkout but
the current run does not ask for a sparse checkout.

This fixes #1475

There are use cases in particular with non-ephemeral (self-hosted) runners where an
existing worktree (that has been initialized as a sparse checkout) is
reused in subsequent CI runs (where `actions/checkout` is run _without_
any `sparse-checkout` parameter).

In these scenarios, we need to make sure that the sparse checkout is
disabled before checking out the files.

### Also includes:

* npm run build
* ci: verify that an existing sparse checkout can be made unsparse
* Added a clarifying comment about test branches.
* `test-proxy` now uses newly-minted `test-ubuntu-git` container image from ghcr.io

---------

Signed-off-by: Johannes Schindelin <[email protected]>
Co-authored-by: John Wesley Walker III <[email protected]>
@justin-newman

Assuming v4.1.2 works when explicitly setting it, I am still having the same issue with v4.1.2 on a self-hosted server

@dscho
Contributor

dscho commented Apr 8, 2024

> Assuming v4.1.2 works when explicitly setting it, I am still having the same issue with v4.1.2 on a self-hosted server

@justin-newman could you verify in the logs that the git sparse-checkout disable command was run, like so (line 51)?

@justin-newman

> > Assuming v4.1.2 works when explicitly setting it, I am still having the same issue with v4.1.2 on a self-hosted server
>
> @justin-newman could you verify in the logs that the git sparse-checkout disable command was run, like so (line 51)?

Yes, it is doing that

@dscho
Contributor

dscho commented Apr 12, 2024

> > could you verify in the logs that the git sparse-checkout disable command was run, like so (line 51)?
>
> Yes, it is doing that

@justin-newman can you find out whether that command simply does not work in your setup? I am somewhat surprised to read that it was called yet did not fix the problem.

@sandro-meier

> > > could you verify in the logs that the git sparse-checkout disable command was run, like so (line 51)?
> >
> > Yes, it is doing that
>
> @justin-newman can you find out whether that command simply does not work in your setup? I am somewhat surprised to read that it was called yet did not fix the problem.

I can confirm the same issue on our self-hosted runner with version 4.1.6. The git sparse-checkout disable command is called:

[two screenshots, not reproduced here]

@sandro-meier

If I add another step in the action where I run git sparse-checkout disable then it works.

@major-mayer

I can confirm that this is still an issue.
The git config still shows:

git config --list
core.repositoryformatversion=1
core.filemode=true
core.bare=false
core.logallrefupdates=true
core.sparsecheckout=true
remote.origin.url=https://github.com/xxx
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
remote.origin.promisor=true
remote.origin.partialclonefilter=blob:none
gc.auto=0

... even though the action reports the following:

Determining the checkout info
/usr/bin/git sparse-checkout disable
/usr/bin/git config --local --unset-all extensions.worktreeConfig 

But apparently this git sparse-checkout disable requires a username and password on git version 2.39.2, which is the default version in Debian Bookworm.
Thus, the command fails when I execute it manually, since no input is given.
Instead, I use the following workaround:

git config core.sparseCheckout false

@LuvForAirplanes

> git config core.sparseCheckout false

This is the current way to fix this issue since newer versions of git require GitHub Authentication, and I'm not usually logged in to my GitHub account on my production machines. (hello world 🙄)
It would really be nice if this could be fixed.

@major-mayer

Ahh, I thought the Git version shipped with Debian is too old, not too new, but that's very interesting to hear.
I really don't understand why it should be necessary to log in to change a configuration variable...

@dscho
Contributor

dscho commented Jun 24, 2024

> apparently this git sparse-checkout disable requires a username and password on git version 2.39.2

Care to go into more details?

Is this because the worktree is sparsely-checked out and it's a partial clone and the command needs to fetch Git objects in order to turn the sparse checkout into a full one?

If that is the case, I really do not understand because the sparse checkout is disabled after the authentication is set up.

Puzzled!

@major-mayer

Ahh, I think the authentication problem isn't really the issue here, since it only happens when I try to manually execute git sparse-checkout disable or do it before the checkout action is executed.
In the checkout action logs, there are no obvious errors:


Setting up auth
  /usr/bin/git config --local --name-only --get-regexp core\.sshCommand
  /usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"
  /usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader
  /usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :"
  /usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic ***
Fetching the repository
  /usr/bin/git -c protocol.version=2 fetch --no-tags --prune --no-recurse-submodules --depth=1 origin +16624a5b4b2b4fcd31b683ea4533d447f4d9d459:refs/remotes/origin/dev
  From https://github.com/xxx
   + 121c72e...16624a5 16624a5b4b2b4fcd31b683ea4533d447f4d9d459 -> origin/dev  (forced update)
Determining the checkout info
/usr/bin/git sparse-checkout disable
/usr/bin/git config --local --unset-all extensions.worktreeConfig

And I can't tell you why authentication would be required.
Your explanation sounds reasonable, but only if git sparse-checkout disable would actually fetch objects, while git config core.sparseCheckout false just changes the setting.
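One way to tell "the setting was flipped" apart from "the files are actually back" might be to count the index entries that still carry the skip-worktree flag, which `git ls-files -t` marks with an `S`. The helper name `count_skipped_files` is hypothetical, and this is only a diagnostic sketch, not something the action itself does.

```shell
#!/usr/bin/env bash
# count_skipped_files is a hypothetical helper name.
# Prints the number of index entries flagged skip-worktree ("S" in the
# output of `git ls-files -t`), i.e. tracked files that are not checked out.
count_skipped_files() {
  local dir="${1:-.}"
  # grep -c still prints 0 when nothing matches but exits non-zero,
  # so its exit status is ignored here.
  git -C "$dir" ls-files -t | grep -c '^S ' || true
}
```

A result greater than zero after the checkout step would suggest that some tracked files are still excluded from the worktree, whatever `git config --list` reports.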


9 participants