Missing files from docker image #2299

Closed
csgergo123 opened this issue Mar 10, 2021 · 13 comments
Labels: bug, os/windows, priority:high, regression

Comments

@csgergo123

Bug

Current Behavior

Garden sometimes builds the Docker image incorrectly: a file that exists in the source folder on my machine is missing from the image. The error log shows which file is missing, and it is not always the same one.

garden build -> The build succeeds.
garden deploy -> The deploy sometimes works and sometimes fails with a custom error (caused by the missing files). Even when the deploy succeeds, the running pod sometimes crashes with the same error. So the error shows up either during deployment or in the running container afterwards.

Sometimes everything works, and then a few minutes later the error occurs even though nothing changed in the source code that could cause it. It seems random to me.

When I look inside a container started from the image that Garden built, the files that should be there really are missing.

Expected behavior

Workaround

I need to run garden deploy or garden dev more than once, sometimes up to 10 times.
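
For anyone who wants to script this workaround, a minimal sketch of the retry loop could look like the following. It assumes the `garden` CLI is on the PATH; the helper names `run_until_success` and `garden_deploy` are hypothetical and not part of Garden. Note that a zero exit code only means the deploy command succeeded. As described above, the pod can still crash later if the image is missing a file.

```python
import subprocess
from typing import Callable


def run_until_success(run_once: Callable[[], int], max_attempts: int = 10) -> int:
    """Call run_once until it returns exit code 0.

    Returns the number of the attempt that succeeded (starting at 1),
    or 0 if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if run_once() == 0:
            return attempt
    return 0


def garden_deploy() -> int:
    # Assumption: `garden` is installed and available on the PATH.
    return subprocess.run(["garden", "deploy"]).returncode


# Example (not executed here): retry the deploy up to 10 times.
# attempt = run_until_success(garden_deploy, max_attempts=10)
```

The command runner is passed in as a callable so the loop can be tried out without a cluster.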

Your environment

  • OS: Windows 10
  • Garden.io version: 0.12.18
  • How I'm running Kubernetes: Remote cluster
  • buildMode: cluster-docker
@edvald
Collaborator

edvald commented Mar 10, 2021

Hey @csgergo123. That's quite odd. We'd need quite a bit more information to go on though to try and debug. Can you share the configuration for the module in question?

@ITHedgeHog
Contributor

I run into something that sounds like the same issue. The next time it happens while I'm debugging, I'll grab some repro steps (the project is open source, so hopefully you can replicate it).

My workaround at the moment is to make a whitespace change to the Dockerfile and save it, which seems to resolve it.

@csgergo123
Author

Hey @edvald .
I have a repo called admin.

Most of the time the process goes well, but sometimes the Node application crashes.
Right now the exact error is:

Error: Cannot find module '../../db/knexfile'

Screenshot. https://ibb.co/bBrYC9b
This screenshot clearly shows that all the files are in the original directory, but the container which was launched from the faulty image is missing the knexfile.

The application searches for the knexfile in the right place, but the file is not there.
I tried running on both a remote Kubernetes cluster and local Docker. I also tried rerunning from the built image.
Screenshot. https://ibb.co/GRbsnNB

Sometimes the knexfile is missing, sometimes the whole db directory, sometimes a completely different file. It seems random to me.
Screenshot. https://ibb.co/KG8CspG

Then, when I edit something in a file (even whitespace) and rebuild and redeploy with Garden, it occasionally works well for a while.

I'm copying some files here to help with debugging.

/garden.yml

kind: Project
name: admin
defaultEnvironment: "remote"
environments:
  - name: remote
providers:
  - name: kubernetes
    environments: [remote]
    context: ****
    namespace: admin
    defaultHostname:  ****
    # buildMode: cluster-docker       # Remote build mode
    deploymentRegistry:
      hostname: ****
      namespace: garden-system
    clusterDocker:
      enableBuildKit: false      # Tried true also

/admin/garden.yml

kind: Module
name: admin
type: container
hotReload:
  sync:
    - target: /usr/src/app
  postSyncCommand: [touch, /usr/src/app/hotreloadfile ]
services:
  - name: admin
    ports:
      - name: http
        containerPort: 8000
        servicePort: 80
    ingresses:
      - path: /
        port: http
        hostname: "****"
    environments:
      - varfile: './env-files/development.env'
    hotReloadArgs: [npm, run, start:dev]
build:
  timeout: 3600

/admin/Dockerfile

FROM node:14.15

EXPOSE 8080

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

COPY package*.json /usr/src/app/

RUN npm install
RUN npm install -g knex 

COPY . /usr/src/app

RUN npm run gulp build

CMD ["node", "app.js"]

@andreygolev

andreygolev commented Mar 24, 2021

Maybe that's related to my issue as well.

I have a PHP application.

I run "composer install" locally to fetch all dependencies into the "vendor" directory.
After that I run either "garden build" or "garden dev".
It syncs the sources to BuildKit and builds the image.
After the build, some dependencies are missing from the image.

The Dockerfile is as simple as:
COPY . /www/default

If I build the image locally with plain Docker, everything is in place and nothing is missing, but Garden somehow excludes some of the dependencies while syncing the sources to the in-cluster buildkitd.

One more observation: these files are also missing from the local .garden/build/projectName directory.
So I guess it's not BuildKit's fault.

Garden does warn: Large number of files (11375) found in module X. You may need to configure file exclusions.

I'm fine with this huge number of files :)

Please disregard. It appears that Garden skips syncing directories that contain a .git directory. I can understand that :)

@csgergo123
Author

Yes, when this error occurs the problematic files are missing from the projectName/.garden/build/projectName directory too.
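
To check this systematically rather than by eye, a small script can list the files that exist in the module's source directory but not in the staging directory. This is only a sketch: the function name `missing_from_staging` and the default `skip_dirs` are hypothetical, and the paths must be adapted to your project. Files that Garden leaves out on purpose (for example through ignore files or exclusion settings, or directories containing a .git directory as noted above) will also appear in the output, so not every entry indicates the bug.

```python
from pathlib import Path


def missing_from_staging(source_dir, staging_dir, skip_dirs=(".git", ".garden")):
    """Return sorted relative paths of files present in source_dir
    but absent from staging_dir."""
    source = Path(source_dir)
    staging = Path(staging_dir)
    missing = []
    for path in source.rglob("*"):
        rel = path.relative_to(source)
        # Ignore anything inside a directory we never expect to be staged.
        if any(part in skip_dirs for part in rel.parts):
            continue
        if path.is_file() and not (staging / rel).is_file():
            missing.append(rel.as_posix())
    return sorted(missing)


# Example (paths are placeholders following the layout described above):
# for name in missing_from_staging("projectName", ".garden/build/projectName"):
#     print(name)
```

Running it right after a failing `garden build` and again after a working one shows whether the set of missing files changes between runs.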

@edvald
Collaborator

edvald commented Mar 26, 2021

Thanks for the added info. This tracks with my suspicion, which is that it's a concurrency problem in how we're doing build staging in the .garden/build directory.

Could you try running with GARDEN_LEGACY_BUILD_STAGE=true in your environment, to see if everything works fine then? This would confirm my suspicion. Still need to work out why exactly this is happening, but I'm digging into that as well.

@edvald self-assigned this Mar 26, 2021
@edvald added the bug, priority:high, regression, and os/windows labels Mar 26, 2021
@csgergo123
Author

@edvald Where exactly should I set the GARDEN_LEGACY_BUILD_STAGE option?

@edvald
Collaborator

edvald commented Mar 26, 2021

Sorry, should have clarified, that's an environment variable in your shell.
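
If you launch Garden from a script instead of typing the command in a shell, the same variable can be set on the child process. A minimal sketch, assuming the `garden` CLI is on the PATH; the helper names `legacy_stage_env` and `run_garden` are hypothetical:

```python
import os
import subprocess


def legacy_stage_env(base=None):
    """Return a copy of the environment with GARDEN_LEGACY_BUILD_STAGE=true.

    Uses the current process environment when no base mapping is given.
    """
    env = dict(os.environ if base is None else base)
    env["GARDEN_LEGACY_BUILD_STAGE"] = "true"
    return env


def run_garden(args):
    # Assumption: `garden` is installed and available on the PATH.
    return subprocess.run(["garden", *args], env=legacy_stage_env()).returncode


# Example (not executed here):
# exit_code = run_garden(["deploy"])
```

The variable only applies to the process started this way; it does not change the environment of the shell you run the script from.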

@edvald
Collaborator

edvald commented Mar 26, 2021

Also an update: We'll revert to make that the default as of next release, while we figure out what's causing the issue.

edvald added a commit that referenced this issue Mar 26, 2021
This temporarily mitigates #2299 by reverting back to using rsync for
build staging on Windows, while we work out why exactly it fails for
some users/projects.

Users can still opt into the newer rsync-less mode by setting
`GARDEN_EXPERIMENTAL_BUILD_STAGE=true` in their shell environment.
thsig pushed a commit with the same message that referenced this issue Mar 29, 2021
@csgergo123
Author

@edvald It seems v0.12.20 solves the error with the default rsync setting. Thank you.

@edvald closed this as completed Jan 13, 2022
@edvald reopened this Jan 13, 2022
@edvald
Collaborator

edvald commented Jan 13, 2022

Oops, this is still an active issue, misread it.

@gustaff-weldon

gustaff-weldon commented Oct 7, 2022

@edvald I think I'm facing a similar problem. garden deploy (v0.12.44) fails with

> [development  2/14] COPY   tmp/docker-build/external-files/avro-schemas/   /avro-schemas/:
------
failed to compute cache key: "/tmp/docker-build/external-files/avro-schemas" not found: not found

1 deploy action(s) failed!

I have tried GARDEN_LEGACY_BUILD_STAGE=true garden deploy, same error.

My garden.yml is very minimal:

kind: Module
type: container
name: foo

build:
  targetImage: development

dockerfile: Dockerfile

services:
  - name: foo
    ports:
      - name: http
        containerPort: 3000

My Dockerfile.dockerignore:

*

!/app/
...
<snip>
!tmp/docker-build/external-files/

I've checked .garden/build/<project>/ in the monorepo root folder, and it is missing the tmp folder, which exists in my project directory and is not excluded in Dockerfile.dockerignore.
The .garden/build/<project>/ folder is also missing the aforementioned dockerignore file.

What drives the list of files that are copied to the build folder?

Also, I wonder why Garden even needs to copy all the module files to that .garden/build/ folder. It looks inefficient. If I were to use Garden in all the projects in my monorepo (I'm currently evaluating Garden), it seems I would end up with almost a full copy of my monorepo in that .garden/build folder...

@vvagaytsev
Collaborator

@csgergo123 this was fixed in #4434 and #4438. Feel free to reopen if it's still an issue.
