
docker manageiq do not start after adding/removing network interface - memcached connectivity #17274

Closed
gaetanquentin opened this issue Apr 10, 2018 · 13 comments · Fixed by #19463

@gaetanquentin

I launched the latest docker manageiq container (20180404).
It was working fine.

Then I had the "strange" idea of changing the network configuration: adding a second network interface and removing it afterwards (docker network connect / docker network disconnect).

Now the container does not start, and I don't know why, since the network configuration looks the same as before.

The log error is:

Error something seems wrong, we need at least two parameters to check service status
== Checking MIQ database status ==
** DB already initialized

{"@timestamp":"2018-04-05T15:47:32.690234 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for evm.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.690894 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for vim.log has been changed to [WARN]"}
{"@timestamp":"2018-04-05T15:47:32.691549 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for rhevm.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.692014 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for aws.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.692380 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for kubernetes.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.692814 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for datawarehouse.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.693234 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for container_monitoring.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.693607 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for scvmm.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.693993 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for api.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.694350 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for fog.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.694738 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for azure.log has been changed to [WARN]"}
{"@timestamp":"2018-04-05T15:47:32.695077 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for lenovo.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.695500 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for websocket.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.695935 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for vcloud.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.696293 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for nuage.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:33.068484 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(SessionStore) Using session_store: ActionDispatch::Session::MemCacheStore"}
{"@timestamp":"2018-04-05T15:47:33.451305 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"warning","message":"127.0.0.1:11211 failed (count: 0) Errno::ECONNREFUSED: Connection refused - connect(2) for "127.0.0.1" port 11211"}

/usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/dalli/ring.rb:45:in `server_for_key': No server available (Dalli::RingError)
    from /usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/dalli/client.rb:236:in `alive!'
    from /usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/rack/session/dalli.rb:19:in `initialize'
    from /usr/local/lib/ruby/gems/2.3.0/gems/actionpack-5.0.6/lib/action_dispatch/middleware/session/abstract_store.rb:32:in `initialize'

It looks like a memcached network connectivity problem.
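To double-check, one can verify from the host whether memcached is actually listening inside the container (container name is a placeholder, and this assumes ps and nc are available in the image):

```sh
docker exec -it manageiq bash -c 'ps aux | grep [m]emcached'            # is the process running?
docker exec -it manageiq bash -c 'echo stats | nc 127.0.0.1 11211 | head'  # is it answering on 11211?
```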

What should I modify to make it start again?

Regards.

@dalareo

dalareo commented Aug 7, 2018

I have a similar issue: not even able to restart the container after stopping it:

/usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/dalli/ring.rb:45:in `server_for_key': No server available (Dalli::RingError)


"message":"127.0.0.1:11211 failed (count: 0) Errno::ECONNREFUSED: Connection refused - connect(2) for \"127.0.0.1\" port 11211"}

@ngoduykhanh

I faced the same issue with the manageiq docker image.

{"@timestamp":"2018-08-16T05:06:01.028087 ","hostname":"fae6893c4776","pid":5,"tid":"c0d104","level":"warning","message":"127.0.0.1:11211 failed (count: 0) Errno::ECONNREFUSED: Connection refused - connect(2) for \"127.0.0.1\" port 11211"}

@adamjpavlik

The default /manageiq/docker-assets/appliance-initialize.sh only starts memcached and postgresql if the DB has not been initialized. See line 18.

I created a local appliance-initialize.sh that includes start commands for the aforementioned services when the DB is already initialized. My local Dockerfile includes a COPY of this new file that overwrites the base image's default.
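For illustration, the added branch would look something like this (a sketch only: the "already initialized" check, variable names, and exact service start commands are placeholders, not the stock script's code):

```sh
# Hypothetical addition to appliance-initialize.sh: also start the services
# when the database already exists, so the container can be restarted.
if [ -f "${PGDATA}/postgresql.conf" ]; then      # placeholder "DB already initialized" check
  echo "== DB already initialized, starting services =="
  memcached -d -u memcached                      # daemonized memcached
  pg_ctl -D "${PGDATA}" start                    # placeholder start of the existing PostgreSQL cluster
fi
```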

guilrom added a commit to guilrom/manageiq that referenced this issue Oct 15, 2018
Allowing to restart MIQ Container after stopping it (cf. ManageIQ#17274 (comment))
@isaaccarrington

I just pulled today and got the same issue on restart:

{"@timestamp":"2018-10-17T06:24:27.703264 ","hostname":"0fa2b870b67d","pid":9,"tid":"1005108","level":"warning","message":"127.0.0.1:11211 failed (count: 0) Errno::ECONNREFUSED: Connection refused - connect(2) for \"127.0.0.1\" port 11211"}
Error something seems wrong, we need at least two parameters to check service status
== Checking MIQ database status ==
** DB already initialized
/usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/dalli/ring.rb:45:in `server_for_key': No server available (Dalli::RingError)

@eselvam

eselvam commented Oct 23, 2018

You need to change the file in two locations in the docker container: /usr/bin and /var/www/miq/vmdb/docker-assets. First spin up ManageIQ with docker, then use docker cp to copy the original file to the host, modify it as per guilrom's change, and copy it back into the container with docker cp. Once it is copied back, restart the container. It should work; I tested this approach and it works fine.
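Roughly (container name and local file name are placeholders):

```sh
# Copy the stock script out, edit it, and push it back to both locations.
docker cp manageiq:/usr/bin/appliance-initialize.sh ./appliance-initialize.sh
# ... edit ./appliance-initialize.sh as described above ...
docker cp ./appliance-initialize.sh manageiq:/usr/bin/appliance-initialize.sh
docker cp ./appliance-initialize.sh manageiq:/var/www/miq/vmdb/docker-assets/appliance-initialize.sh
docker restart manageiq
```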

@eselvam

eselvam commented Oct 23, 2018

After making the above changes, httpd does not start. We have to start it manually to access the ManageIQ web page:

/usr/sbin/httpd -DFOREGROUND &

I think we are missing some config here; I am not sure how to find it and am still checking. This only happens with the docker image. The VMware image (OVA/OVF) works fine.

@miq-bot miq-bot added the stale label Apr 29, 2019
@miq-bot
Member

miq-bot commented Apr 29, 2019

This issue has been automatically marked as stale because it has not been updated for at least 6 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions!

@keuko

keuko commented Aug 20, 2019

Hello,

I have the same issue... is there any progress?

Thanks,
Michal

@eselvam

eselvam commented Aug 21, 2019 via email

@gaetanquentin
Author

gaetanquentin commented Aug 21, 2019

@eselvam None. This docker image is a one-shot launch... ;-) After you stop it, you can't start it again. And they don't care.

Tested with hammer-6 and hammer-10.

@JPrause
Member

JPrause commented Sep 25, 2019

@miq-bot remove_label stale
@miq-bot remove_label pinned

@miq-bot miq-bot removed the stale label Sep 25, 2019
@chessbyte
Member

@carbonin can you take a look at this?

@carbonin carbonin self-assigned this Nov 5, 2019
carbonin added a commit to carbonin/manageiq that referenced this issue Nov 5, 2019
To do this I overwrote the entrypoint from the base image with
what is mostly the previous appliance initialize script.

The main changes I made were to add the server start at the end
and to remove the references to the old container scripts by pasting
the v2 key writing function in where it was previously called.

Additionally I removed starting memcached from the block that only
gets called if the database doesn't exist. We should start memcached
regardless.

This should allow the container to be started after a clean stop
Fixes ManageIQ#17274
@carbonin
Member

carbonin commented Nov 5, 2019

After #19463 is merged this should work for the most part. The only issue I was still having was that httpd might not start sometimes, but I was able to exec into the container and start it to get the UI accessible.

I think the issue is that even if you wait for the server to stop cleanly, the other processes in the container are killed uncleanly, which I suspect is what's causing the issue I'm seeing with httpd. But I think the main problem here should be fixed by my PR.
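For reference, here is a generic sketch of the kind of entrypoint that addresses both points: starting memcached on every boot and trapping SIGTERM so the auxiliary services stop cleanly. This is not the actual ManageIQ entrypoint or the content of #19463; commands and paths are illustrative only.

```sh
#!/bin/bash
# Illustrative entrypoint sketch, not the real ManageIQ one.
set -e

memcached -d -u memcached           # start memcached unconditionally, not only on first boot

cleanup() {
  /usr/sbin/httpd -k graceful-stop || true   # let httpd finish in-flight requests
  pkill memcached || true                    # stop memcached cleanly
}
trap cleanup TERM INT

/usr/sbin/httpd -DFOREGROUND &      # keep httpd as a child process we can wait on
wait $!                             # returns when httpd exits or a signal arrives, then the trap runs
```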
