Hugo stuck on "Start building sites …" #8166

Closed
ybizeul opened this issue Jan 21, 2021 · 24 comments

@ybizeul

ybizeul commented Jan 21, 2021

What version of Hugo are you using (hugo version)?

$ hugo version
Hugo Static Site Generator v0.80.0-792EF0F4 linux/amd64 BuildDate: 2020-12-31T13:37:58Z

Does this issue reproduce with the latest release?

Yes

Context

Hugo runs as a container in GitLab Pages. The container base is klakegg/hugo:debian, but that doesn't seem to matter.

Description

Every once in a while (actually, every single time when run in GitLab), hugo gets stuck on the "Start building sites …" message.

I was able to run it with --debug and got the following:

INFO 2021/01/21 17:14:55 Using config file:
Start building sites …
INFO 2021/01/21 17:14:55 syncing static files to /public/

Then I tried to reproduce the issue at home. That was a little trickier because it happens less often, but running site generation in a while loop does the trick. Here is what I was able to catch with strace: the process is stuck repeating FUTEX_WAIT_PRIVATE.

[...]
newfstatat(AT_FDCWD, "/public/", {st_mode=S_IFDIR|0755, st_size=104, ...}, 0) = 0
futex(0x3134988, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x3134988, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x3134988, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xc000058548, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x3134988, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x3134988, FUTEX_WAIT_PRIVATE, 0, NULL^Cstrace: Process 101 detached

I'm running hugo with the following flags to test:

hugo --debug --noChmod --noTimes -d ../public

Also, the site is bound to the container from a local directory.

I'm not sure what other parts of strace would be relevant.
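The intermittent hang described above can be counted rather than eyeballed by wrapping each build in a time limit. A minimal sketch, assuming the `timeout` command from GNU coreutils is available; `count_hangs` is a hypothetical helper name and not part of Hugo:

```shell
# Hypothetical helper (not part of Hugo): run a command repeatedly and
# print how many runs exceeded the time limit.
# Usage: count_hangs <runs> <limit-seconds> <command> [args...]
# The body runs in a subshell, so its variables do not leak to the caller.
count_hangs() (
  runs=$1
  limit=$2
  shift 2
  hangs=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    status=0
    # GNU timeout exits with status 124 when it had to stop the command.
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 124 ]; then
      hangs=$((hangs + 1))
    fi
    i=$((i + 1))
  done
  echo "$hangs"
)
```

For example, `count_hangs 50 30 hugo --noChmod --noTimes -d ../public` would report how many of 50 builds were still running after 30 seconds. The 30-second limit is an arbitrary choice and should sit well above a normal build time for the site.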

@moorereason
Contributor

  • How large is /static? Does the size of /static matter?
  • Running in a loop, how often does it fail (about what percentage of builds hang)?
  • How many CPUs does the host have?
  • If you run GOMAXPROCS=1 hugo --debug --noChmod --noTimes -d ../public, what happens?
  • If possible, can you create a small demo repo that reproduces the issue?

@ybizeul
Author

ybizeul commented Jan 22, 2021

Interestingly enough, after a few minutes I couldn't reproduce it in my lab, but it was still failing in Gitlab.

So I downgraded my image to 0.70.0 and it has been working without issues since then.

  • /static is minuscule (296K)
  • about 20% when it was failing
  • 2
  • well, now that's interesting: GOMAXPROCS=1 hugo --debug --noChmod --noTimes -d ../public actually blocked execution once, but never again, even in a loop

This is the currently failing site:

site.zip

@DarrienG

DarrienG commented Jan 26, 2021

This has started happening to me too. Normally I invoke hugo from a web server, but when I tested manually I found that it also happens occasionally when I run the command directly. It too gets stuck at "syncing static files to /public/".

/static is larger in my case (1.6MB), but has worked fine up until now. I'm running on a system with 1 CPU and you can see the whole site here.

I find I am able to reproduce this with 100% consistency on my production server after deleting the public folder and running hugo again. My findings are consistent when running with GOMAXPROCS=1 hugo --debug --noChmod --noTimes -d ../public as well as normally.

I am unable to reproduce this on my dev machine.

Update: In my manual testing, if I mkdir the public folder before running hugo, even if it is empty, I am unable to get it to hang. I will further test, but if I had to guess, it looks like the process of syncing to public is probably deadlocked behind waiting for the public folder to be made, and is never signalled that it is created.

Update: Now that I am manually creating public I am unable to reproduce this on my dev machine and CI is passing again.

@moorereason
Contributor

Using the site.zip from @ybizeul, I'm able to reproduce the issue. My tests show that removing all images and CSS from the root of the content folder resolves the issue. Running with the Go race detector doesn't trigger a race for me. Here's what I'm seeing:

$ cat /proc/cpuinfo | grep ^processor | wc -l
2

$ hugo env
Hugo Static Site Generator v0.81.0-DEV-4D2B6FC4 linux/amd64 BuildDate: 2021-01-22T11:30:16-0600
GOOS="linux"
GOARCH="amd64"
GOVERSION="go1.16beta1"

$ unzip site.zip
$ cd site/
# Add the files to a git repo to restore deleted items later
$ git init
$ git add *
$ git commit -m "foo"

$ find content/ -maxdepth 1 -type f -name "*.css" -or -name "*.png"
content/custom.css
content/OBS.png
content/community.png
content/[email protected]

$ while true; do rm -rf public && hugo --debug; done
# ^^ hangs EVENTUALLY at:
INFO 2021/01/26 09:24:57 syncing static files to /home/x/src/github.com/moorereason/hugoscratch/iss8166/site/public/

$ while true; do rm -rf public && GOMAXPROCS=1 hugo --debug; done
# ^^ hangs EVERY TIME at:
INFO 2021/01/26 09:24:57 syncing static files to /home/x/src/github.com/moorereason/hugoscratch/iss8166/site/public/

$ rm -f content/*.png
$ find content/ -maxdepth 1 -type f -name "*.css" -or -name "*.png"
content/custom.css

$ while true; do rm -rf public && GOMAXPROCS=1 hugo --debug; done
# ^^ hangs EVERY TIME at:
INFO 2021/01/26 09:24:57 syncing static files to /home/x/src/github.com/moorereason/hugoscratch/iss8166/site/public/

$ rm -f content/custom.css
$ find content/ -maxdepth 1 -type f -name "*.css" -or -name "*.png"

$ while true; do rm -rf public && GOMAXPROCS=1 hugo --debug; done
# ^^ works

$ git checkout content/
$ rm -f content/custom.css content/[email protected] content/OBS.png
$ find content/ -maxdepth 1 -type f -name "*.css" -or -name "*.png"
content/community.png

$ while true; do rm -rf public && GOMAXPROCS=1 hugo --debug; done
# ^^ hangs EVERY TIME at:
INFO 2021/01/26 09:24:57 syncing static files to /home/x/src/github.com/moorereason/hugoscratch/iss8166/site/public/

$ rm -f content/community.png
$ find content/ -maxdepth 1 -type f -name "*.css" -or -name "*.png"

$ while true; do rm -rf public && GOMAXPROCS=1 hugo --debug; done
# ^^ works

@DarrienG, thanks for the additional info. The link to your site source doesn't work for me: 404. Can you confirm that your site is similar to what I describe above?

@DarrienG

DarrienG commented Jan 26, 2021

Ah sorry, I forgot I left it private. I'll check after work, but my site is here if you'd like to test in the meantime:

site.zip

I'm guessing it will be the same for me though.

@DarrienG

DarrienG commented Jan 27, 2021

I'm back. I cannot confirm this is the same. There are no .png or .css files in my /content folder; however, I do have a .svg file there. Removing it does the trick, which matches when I first hit this bug: while I was adding a favicon to my website.

If this is the case, I believe this bug is also on 0.74 and possibly the releases in between as well, which is why I originally upgraded, thinking the upgrade would fix it.

@DarrienG

DarrienG commented Jan 27, 2021

Confirming that this bug goes back at least as far as 0.74.0:

[root@darrien blog]# /tmp/hugo env
Hugo Static Site Generator v0.74.0-D2B11626 linux/amd64 BuildDate: 2020-07-13T10:30:21Z
GOOS="linux"
GOARCH="amd64"
GOVERSION="go1.14.3"
[root@darrien blog]# cp /opt/blog/content/favicon.svg content/
[root@darrien blog]# while true; do rm -rf public && GOMAXPROCS=1 /opt/hugo --debug; sleep 1; done
while true; do rm -rf public && GOMAXPROCS=1 /tmp/hugo --debug; sleep 1; done
INFO 2021/01/27 00:44:31 No translation bundle found for default language "en"
INFO 2021/01/27 00:44:31 Translation func for language en not found, use default.
INFO 2021/01/27 00:44:31 i18n not initialized; if you need string translations, check that you have a bundle in /i18n that matches the site language or the default language.
INFO 2021/01/27 00:44:31 Using config file: 
Building sites … INFO 2021/01/27 00:44:31 syncing static files to /tmp/blog/public/

^C
[root@darrien blog]# rm content/favicon.svg 
rm: remove regular file 'content/favicon.svg'? y
[root@darrien blog]# rm -rf public/
[root@darrien blog]# while true; do rm -rf public && GOMAXPROCS=1 /tmp/hugo --debug; sleep 1; done
INFO 2021/01/27 00:45:17 No translation bundle found for default language "en"
INFO 2021/01/27 00:45:17 Translation func for language en not found, use default.
INFO 2021/01/27 00:45:17 i18n not initialized; if you need string translations, check that you have a bundle in /i18n that matches the site language or the default language.
INFO 2021/01/27 00:45:17 Using config file: 
Building sites … DEBUG 2021/01/27 00:45:17 Render page Error handling in Java is error prone to "/posts/error-handling-in-java-is-error-prone/index.html"
INFO 2021/01/27 00:45:17 syncing static files to /tmp/blog/public/
DEBUG 2021/01/27 00:45:17 found menu: "main", in site config
DEBUG 2021/01/27 00:45:17 Render page Fireworks for your terminal to "/posts/fireworks-for-your-terminal/index.html"
...

My site uses a theme that is not compatible with Hugo <0.74.0 so I can't test any further.

@SMUsamaShah

@jmooring
Member

@SMUsamaShah It looks like you can work around this by:

mv content/me.jpg static/images/

Then update the image link in content/about_me.md to ![me](/images/me.jpg).

@SMUsamaShah

@jmooring
I only added the --debug flag to the Hugo action before seeing your suggestion, and it worked fine.
This issue is random, and the image in the content dir is definitely not causing it. I also run the site locally on my Windows machine.

@jmooring
Member

@SMUsamaShah From what I can see, the image in the content dir is causing the problem. Try it yourself:

git clone --recurse-submodules https://github.com/AbdulRafayZaidi/AbdulRafayZaidi.github.io
cd AbdulRafayZaidi.github.io
while true; do rm -rf public && GOMAXPROCS=1 hugo --debug; done

The build hangs.

Now move the image and try again.

mv content/me.jpg static/images/
while true; do rm -rf public && GOMAXPROCS=1 hugo --debug; done

The build completes.

@SMUsamaShah

I get it now. It's only visible in the loop the way you did it.

Start building sites …
INFO 2021/04/17 16:42:51 syncing static files to /d/misc/blog/rafay_blog/public/
Total in 104 ms
Error: Error building site: process: readAndProcessContent: open /d/misc/blog/rafay_blog/public/me.jpg: no such file or directory
INFO 2021/04/17 16:42:51 Using config file:
Start building sites …
INFO 2021/04/17 16:42:51 syncing static files to /d/misc/blog/rafay_blog/public/
Total in 103 ms
Error: Error building site: process: readAndProcessContent: open /d/misc/blog/rafay_blog/public/me.jpg: no such file or directory
INFO 2021/04/17 16:42:52 Using config file:

Isn't it a bug?

@jmooring
Member

@SMUsamaShah

Isn't it a bug?

Yes, that is why this issue has a "Bug" label.

@sytone

sytone commented Jun 14, 2021

I hit this issue with a txt file and some json files in the content folder. Moving them out enabled the build to succeed again. This was for an Azure static web site, as I was adding routes. This is related to Azure/static-web-apps#412 linked above.
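The reports so far point at non-page files sitting directly in the content root (.png, .css, .svg, .jpg, .txt, .json). A minimal sketch for listing such candidates, assuming pages are Markdown or HTML and that `find` supports `-maxdepth` (GNU and BSD find do); `list_root_assets` is a hypothetical helper name:

```shell
# Hypothetical helper (not part of Hugo): list regular files that sit
# directly in the content root and are not Markdown or HTML pages.
# Usage: list_root_assets [content-dir]   (defaults to ./content)
list_root_assets() {
  # -maxdepth 1 restricts the search to the content root itself;
  # files in subdirectories are not reported.
  find "${1:-content}" -maxdepth 1 -type f \
    ! -name '*.md' ! -name '*.html' | sort
}
```

Running `list_root_assets content` from the site root prints one path per line. Extend the excluded extensions if the site uses other page formats.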

@keslerm

keslerm commented Aug 21, 2021

Hitting this issue as well.

@knadh

knadh commented Sep 17, 2021

This bug still exists. On Windows machines, the hidden desktop.ini that's found in directories causes this issue.

PS: In the website's root directory, run del /s /q /f /a ".\desktop.ini" to delete all instances of the file recursively across all sub-directories.
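For anyone working from a POSIX-style shell on Windows (Git Bash or WSL, for instance), a rough equivalent of the command above is sketched here. It assumes a `find` that supports `-iname` and `-delete` (GNU and BSD find do); `remove_desktop_ini` is a hypothetical helper name:

```shell
# Hypothetical helper: delete every desktop.ini below a directory.
# Usage: remove_desktop_ini [site-root]   (defaults to the current directory)
remove_desktop_ini() {
  # -iname matches Desktop.ini and desktop.ini alike;
  # -delete removes each matching file and leaves everything else alone.
  find "${1:-.}" -type f -iname 'desktop.ini' -delete
}
```

Calling `remove_desktop_ini .` in the website's root directory removes the files recursively. Windows may recreate them later, so the step might need repeating before each build.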

@somethingSTRANGE

When Desktop.ini is in the "<root>/data/" folder, hugo server generates the following error:

ERROR <date> <time> failed to load data: failed to load data: "<root>\data\desktop.ini:1:1": unmarshal of format "" is not supported

It would be ideal if Hugo server would simply ignore Desktop.ini files.

Desktop.ini normally has the "System" and "Hidden" file attributes. Could either or both of those on a given file indicate the file should be ignored by Hugo?

@jmooring
Member

@somethingSTRANGE

When Desktop.ini is in the "/data/" folder ...

This is an unrelated issue. See:
https://gohugo.io/getting-started/configuration/#ignore-content-and-data-files-when-rendering

@chengjun

Hitting this issue as well.

@jmooring
Member

@bep self-assigned this Dec 22, 2021
@bep added this to the v0.91.2 milestone Dec 22, 2021
@jmooring
Member

jmooring commented Dec 22, 2021

@bep If you do this, no problem:

while true; do rm -rf public && mkdir public && GOMAXPROCS=1 hugo --debug; done

@bep
Member

bep commented Dec 22, 2021

@jmooring I have found the stupid culprits (2) ... Will create a PR soon.

bep added a commit to bep/hugo that referenced this issue Dec 22, 2021
bep added a commit to bep/hugo that referenced this issue Dec 22, 2021
bep added a commit to bep/hugo that referenced this issue Dec 23, 2021
This is related to gohugoio#8166 – if the /public folder did not exist, you had no /static files and had static files in /content root, then you would get a "no such file or directory" error.
bep added a commit to bep/hugo that referenced this issue Dec 23, 2021
* Before this commit, when you had static files in the root of /content and no /public folder, that folder would not be created unless the /static syncer had already run.
* So a common pattern such as `rm -rf public && hugo` would fail now and then, because /static and /content are processed in parallel (unless you have cleanDestinationDir=true)
* This was even worse before commit 0b918e1 – a frozen build.

Closes gohugoio#8166
@bep
Member

bep commented Dec 23, 2021

Note that a workaround for the above would be to always make sure that /public exists before you start the build.
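For CI scripts, that workaround can be wrapped so the destination is always created before the build starts. A minimal sketch; `build_with_dest` is a hypothetical wrapper name, and the build command is passed in unchanged:

```shell
# Hypothetical wrapper (not part of Hugo): make sure the destination
# directory exists, then run the given build command.
# Usage: build_with_dest <dest-dir> <command> [args...]
# The body runs in a subshell, so its variables do not leak to the caller.
build_with_dest() (
  dest=$1
  shift
  # mkdir -p is a no-op when the directory already exists.
  mkdir -p "$dest" && "$@"
)
```

With this in place, `rm -rf public && build_with_dest public hugo --debug` starts every build with an empty but existing /public. The wrapper returns the build command's exit status, so a failing build still fails the CI job.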

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 13, 2022