[CT-284] [Bug] Unable to run dbt inside a docker container due to missing dependencies #4784
Comments
OK, I think this is because of this line: https://github.com/dbt-labs/dbt-core/blob/main/docker/Dockerfile#L53 it makes the default directory immutable at build time, which is super not-obvious for the user. Spent a good few hours digging around for this! Here's how I'm getting around it:
FROM ghcr.io/dbt-labs/dbt-snowflake:1.0.latest
RUN dbt --version
WORKDIR /usr/my-project
COPY . .
RUN dbt deps
# TODO obviously this kind of thing needs to be parameterized in future.
ENTRYPOINT ["dbt", "--log-format", "json", "--warn-error", "run"] |
Hmm. It seems to be working correctly for me:
If it's possible can you provide your |
Thank you so much for taking a look @iknox-fa. What's really strange is it works for me on my local machine, but not in my CI job. However, my CI job runs the same commands... as per the log file I posted. Here's my packages file, it's rather simple:
packages:
  - package: dbt-labs/dbt_utils
    version: 0.8.1
  - package: calogica/dbt_expectations
    version: 0.5.2
|
also, here's my
|
Lol those are two of the same deps I used to test. Very strange. I've reached the end of my day so I'll take another look with fresh eyes in the AM. In the meantime I'm glad you have a workaround. |
Yeah, I'm fried too I've been working on this issue all day haha. I hope we can get a consistent reproduction. Using the different directory seems to work, and I got my first successful deployed run of dbt with it so that's at least something! Thanks again for the help. |
Btw, I tested this a tiny bit further - it definitely is because of the
FROM ghcr.io/dbt-labs/dbt-snowflake:1.0.latest
RUN dbt --version
RUN touch test.txt
WORKDIR /usr/my-project
COPY . .
RUN dbt deps
ENTRYPOINT ["dbt", "--log-format", "json", "--warn-error", "run"]
Then, playing around in the resulting container:
|
signing off for reals now 👋 catch'ya on this issue tomorrow or something |
@alexrosenfeld10 thanks for reporting! At the moment we are not going to get time in the near future to dig into this more unfortunately. I'm going to add the |
@leahwicz ok, fair enough. I think it might be as simple as removing the |
at the very least, hopefully this issue serves as a searchable destination for users debugging the same weird behavior |
it does) So is there any workaround? |
Btw, it can change the volume state on Mac OS that's why it is even more confusing. @jtcohen6 sorry for bugging, but what is the intended way to setup own dbt deps using a dbt docker image? |
@eugene-nikolaev yes, the workaround is what I said here: #4784 (comment) |
(whoops didn't mean to close) |
@alexrosenfeld10 you mean to copypaste a Dockerfile from dbt-core and throw away that line? Well, that will work, thanks |
No, I don't. I mean do this:
FROM ghcr.io/dbt-labs/dbt-snowflake:1.1.latest
# We have to move out of the default dir because of https://github.com/dbt-labs/dbt-core/issues/4784
WORKDIR /wherever/you/want
COPY . .
RUN dbt deps
# warnings as errors - https://docs.getdbt.com/reference/global-configs#warnings-as-errors
# further arguments are passed in the kubernetes config as "args"
ENTRYPOINT ["dbt","--log-format", "json", "--warn-error", "run"] |
ugh, the default action on command + enter is "comment and close". Obviously there are workarounds, but the core issue isn't fixed so i'm gonna reopen |
@alexrosenfeld10, thanks a lot, works fine! |
Spent a whole day on the same problem, and finally I found this issue. |
I also spent a lot of time on this problem before finding the root cause, and after that, this issue. The reason for this differing from e.g. local machine to CI is because of the builder used. I can reproduce the same behaviour locally with the following Dockerfile:
FROM ghcr.io/dbt-labs/dbt-snowflake:1.0.latest
COPY . .
RUN dbt deps
with the following packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: 0.8.1
  - package: calogica/dbt_expectations
    version: 0.5.2
And a valid dbt_project.yml (because this is needed for
Using
Then, building an image using Docker (buildkit enabled) instead yields the following result:
And building with docker but without buildkit enabled yields the initial result/"error":
So, the behaviour is inconsistent across docker engines, and only docker buildkit produces the "expected" result.
Personally, I'm in favour of removing the
The way this Dockerfile is currently set up, it is hard to use it as a source to build upon -- because the WORKDIR is in a VOLUME, meaning that any changes (e.g. RUN commands) to the same directory are not persisted. As mentioned previously in this thread, using another directory works fine (e.g.
Furthermore, on the VOLUME train, there should be few affected by removing this directive. Specifying the
To try to summarise: The current Dockerfile is hard to use without digging relatively deep into how Docker works (or finding this issue). The main pain point is the VOLUME directive. Removing it should be relatively straightforward for existing users as far as I can tell. It will also resolve this issue, and make it a lot easier to use "out of the box" without having some documentation telling you to use another WORKDIR for things to work (and if that's the solution... why is there then a VOLUME specified which is not in use?).
|
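The builder difference described in the comment above can be illustrated with a minimal sketch. It is not taken from the dbt image: the `alpine:3.19` base, the `/data` and `/app` paths, and the `marker.txt` file are all hypothetical stand-ins chosen for illustration. It relies on the behaviour documented for the `VOLUME` instruction: build steps that change data inside a volume after it has been declared are discarded by the legacy builder, while BuildKit keeps them.

```dockerfile
# Hypothetical reproduction: the base image and all paths are illustrative
# stand-ins, not the layout of the dbt image.
FROM alpine:3.19

# Declare /data as a volume, the way a base image might do for its
# default working directory.
VOLUME /data
WORKDIR /data

# Written inside the volume after the VOLUME declaration:
# discarded by the legacy builder, kept by BuildKit.
RUN echo "installed" > marker.txt

# Written outside the volume: persisted by both builders.
WORKDIR /app
RUN echo "installed" > marker.txt
```

Assuming a Docker version that still ships the legacy builder, building this once with `DOCKER_BUILDKIT=0 docker build .` and once with `DOCKER_BUILDKIT=1 docker build .`, then listing `/data` and `/app` in each resulting image, should show the difference: `/app/marker.txt` is present in both, while `/data/marker.txt` should only survive the BuildKit build.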
There is a different approach to solving this volume issue affecting I have built my own custom Dockerfile, but faced similar issues when trying to mount a I was able to come up with a different approach to this problem that solved my use case. You can simply update where In the case of the current Dockerfile which has a ✅ TLDR SOLUTION ✅
The benefit here is this is just a modification to your Frankly, dbt labs should just update their image to place their packages outside of that directory to avoid this problem for everyone. |
Running into the same issue, thanks everyone for your help. Just to add on to the previous solution, here's what I'm doing, using an env var (see the jinja dbt docs) to set the packages-install-path and by default use the default one of In
clean-targets:
  - "target"
  - "{{ env_var('MYCOMPANY_DBT_PACKAGE_INSTALL_PATH', 'dbt_packages') }}"
packages-install-path: "{{ env_var('MYCOMPANY_DBT_PACKAGE_INSTALL_PATH', 'dbt_packages') }}"
My Dockerfile
FROM ghcr.io/dbt-labs/dbt-bigquery:1.5.0
ENV MYCOMPANY_DBT_PACKAGE_INSTALL_PATH=/home/dbt_packages
RUN mkdir -m 777 -p ${MYCOMPANY_DBT_PACKAGE_INSTALL_PATH}
COPY ./my_package /usr/app
COPY ./profiles /root/.dbt/
RUN dbt deps |
@jtcohen6 maybe it's time to revisit this issue? Seems like a lot of folks have this problem, and I don't think it'd be a major lift for dbt Labs to fix |
@alexrosenfeld10 would you be interested in raising a pull request, by any chance? e.g., a PR that removes this line (assuming that is the solution you are suggesting): Line 53 in 7740bd6
The other key thing we'd need to do is write up an explanation of what users can do to restore the previous behavior (if needed). @sklirg may have already provided that here:
|
Sure, I can, but have little context into the impact / test cases / process needed. |
Here's a rough outline of the most important pieces when opening the PR:
You've already done most of those in #8069 -- it's looking great 😎 The main part that is missing is the changelog part -- I'll follow-up within the PR itself for next steps there. Then a member of the dbt Labs engineering team will review the PR once it's open. They'll help figure out what kind of testing is needed and give any other feedback that is needed prior to merge. |
yep, have done that before for actual code changes in here, just don't have the time on hand right now to get my local set up again (new machine). If you or someone else wants to push it over the line, that'd be fine, otherwise I'll get to it.. sometime |
No prob @alexrosenfeld10 -- I just added the changelog entry to the PR ✅ |
Thanks @dbeatty10 |
Is there an existing issue for this?
Current Behavior
Running dbt deps inside a Dockerfile has no effect on the final resulting image. This is pretty much a full blocker for me, and I feel like I'm going crazy that this isn't already an issue anyone has raised. I must be missing something obvious, right?
Expected Behavior
Running dbt deps should persist the generated dbt_packages in the final resulting image. Immutable deploys are super important so you know exactly what you're getting every time you ship code. This means the deps are bundled in the image at build time. That way if anything is down (like the servers where the dbt dependencies are hosted), my image still has everything it needs inside it to start up. Or, if I need to ship the exact same image again, I can.
Steps To Reproduce
Set up this docker file:
The resulting image will not contain the dbt_packages directory, even though the dbt deps command was run. For example:
There is no dbt_packages directory.
Relevant log output
Environment
What database are you using dbt with?
No response
Additional Context
other folks are having the same issue: https://getdbt.slack.com/archives/C2JRRQDTL/p1645743857126709