This repository has been archived by the owner on Jun 30, 2024. It is now read-only.

WiP: Dockerfile for turnkey deployments #1178

Merged 6 commits on Apr 30, 2019


yarikoptic
Contributor

@yarikoptic yarikoptic commented Jan 4, 2019

At the moment this is just an initial version of the Dockerfile, which only installs the needed components.

Setup

  • Have a books/ directory with books (due to #1180, Books are "hardcoded" through the code base, should be "discovered" instead, I guess these should only be known books)
  • Optionally have configs/{instructors,students}.csv and bind-mount that configs directory as /srv/configs to get the DB populated and locked down
  • Run docker with the needed bind mounts and only a few options (e.g. DB password) to get the whole machinery set up and running
  • The running container should then be cherished and not removed, since it would contain the built/deployed books and the active DB (see below about possibly bind-mounting the DB itself)

TODOs

  • Add a runestone user and switch installation of all non-system-wide components to be done under that user. I guess the server could be run on e.g. port 8080 and mapped by docker itself upon run to port 80
  • Decide on how to organize it for proper balance of flexibility (e.g. having books bind mountable outside, or in principle could be an ARG on which books to deploy)
  • See if the entire beast could be refactored into a neurodocker call, so it would also be easy to generate Singularity images and overall have a more concise recipe
  • Consider versioning all git repos and use pipenv to create fixed tested server deployment.
  • Make its entry point to be the process to start the server
  • Minimize footprint by joining all apt calls, and doing proper cleanup after
  • Someone who actually has a clue in DBA should look at the whole setup. To me, the "superuser" for the DB looks too dangerous
  • Someone who actually has a clue in Docker should look at the Dockerfile and probably suggest a proper Docker compose recipe with 2 docker images, where DB will be provided by one and the whole app by another one talking to the DB in the first one
  • I have looked into this external DB option and will soon push a version with some code for it. But it was not sufficient -- some data seems to be stored outside of that /var directory, so it didn't fully work. Someone who has a clue in postgres might want to look into bind-mounting the location of the DB itself (/var/lib/postgresql/9.6/main/), so the DB could then reside outside of docker alongside the books. It should probably be bind-mounted somewhere aside, and then startpoint.sh should do the needed initial rsync and adjust the config to use it instead of the blank installed one.
  • If /srv/configs/{instructors,students}.csv files are provided, the DB is populated with those accounts and new signups are disabled

Not sure if I will accomplish all of the above TODOs, but I want to first reach a state that is usable for myself and consider it at least "done enough"; the unfulfilled TODOs could then be migrated to a separate issue.

Initial working version

Ok, initial version is pushed and here is how you could try it yourself after you get docker installed (let me know if I should just push that image, but better if you just build it yourself for now):

$> docker build --tag yarikoptic-runestone:0.1.0 .
...
$> docker run -it -p 8081:8080 -e DB_PASSWORD=123 -v $HOME/runestone/books:/srv/web2py/applications/runestone/books  yarikoptic-runestone:0.1.0
...

where I have one book (fopp) subfolder under $HOME/runestone/books. And that is it! You should then have a docker container running that exposes runestone on port 8081 (internally, within the container, the port is hardcoded to 8080), so it could be mapped to 80 etc. If you add --rm to the run invocation, the created/configured container is removed after a "trial run". If you don't (as it is now), you can look up the container's id using e.g. docker ps, and then use docker stop and docker start to stop and restart it. It shouldn't then reinitialize anything -- it would pretty much just restart the services.
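
The docker run invocation above can also be assembled from a small helper script, e.g. to fire up a trial container on another port. What follows is only a minimal sketch in Python: the image tag, port and bind-mount target are taken from the command above, while `build_run_command` and its parameters are hypothetical names and not part of this PR.

```python
import os
import shlex


def build_run_command(image, host_port, db_password, books_dir,
                      container_port=8080, remove_after=False):
    """Return the argv list for a `docker run` call like the one above."""
    cmd = ["docker", "run", "-it"]
    if remove_after:
        # --rm removes the container once the trial run exits
        cmd.append("--rm")
    cmd += [
        "-p", f"{host_port}:{container_port}",
        "-e", f"DB_PASSWORD={db_password}",
        "-v", f"{books_dir}:/srv/web2py/applications/runestone/books",
        image,
    ]
    return cmd


if __name__ == "__main__":
    books = os.path.join(os.path.expanduser("~"), "runestone", "books")
    argv = build_run_command("yarikoptic-runestone:0.1.0", 8081, "123", books,
                             remove_after=True)
    # Only print the command; hand argv to subprocess.run() to really start it.
    print(shlex.join(argv))
```

Building the argument list instead of a shell string avoids quoting problems when the books path contains spaces.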

And that is where I would need your help -- to actually make it work

I would really appreciate you giving it a try and helping out to resolve those few issues to get it fully working ;-)

Closes #1176

@vsoch
Contributor

vsoch commented Apr 26, 2019

@yarikoptic ! How nuts that I find you here! I literally searched for docker, and then this was the first link that I clicked.

I just found Runestone and I'm pretty excited about it :) Why are you using neurodocker instead of more (typically) base images? Also, since we have a database and other components, why not try something like docker-compose?

I think a docker deployment (especially for developers) would be great - I'd like to hack around a bit :)

@vsoch
Contributor

vsoch commented Apr 26, 2019

Oup. now I see:

Someone who actually has a clue in DBA should look at the whole setup. To me, the "superuser" for the DB looks too dangerous

Someone who actually has a clue in Docker should look at the Dockerfile and probably suggest a proper Docker compose recipe with 2 docker images, where DB will be provided by one and the whole app by another one talking to the DB in the first one

+1.

@bnmnetp
Member

bnmnetp commented Apr 26, 2019

@yarikoptic @vsoch with summer coming it would be great if we could get a really solid Docker setup finished. I would definitely be able to devote some time to this now.

Also @vsoch feel free to join the discussion on our slack channel: http://runestoneinteractive.org/pages/support.html

@yarikoptic
Contributor Author

Indeed it is funny how small the world is, @vsoch. Glad to see you here.
There is also https://github.com/dartmouth-pbs/psyc161-wi19-srv, which is intended to provide (in the public branch) a complete sample to start the beast with a sample students/instructors list and sample books (but you need a docker image, which I never pushed). All linked via submodules - you know I like git ;-). I might need to push a few more commits I have locally. In the master branch I have my real setup, with sensitive files (names and passwords) under git annex v7 in unlocked mode. So you won't get them ;-)

Having just a single docker image and no compose:

  • the DB is somewhat tightly related to the actual books used by the course
  • with everything in a single container I could just commit it, fire up a container from that image on another port (the repo above has a helper for that) and test the needed changes before putting them in production. So it was convenient, and I had a clear idea of what versions of things I was using

What benefits would you get from separating into multiple containers and using compose?

@yarikoptic
Contributor Author

Re neurodocker - I guess it came out too custom to benefit from it, so comment could be removed ;-)

@yarikoptic
Contributor Author

BTW, you might hit #1191 whenever your db gets useful ;-) so a fix is still due

@vsoch
Contributor

vsoch commented Apr 26, 2019

Well, modern developer bros would want to use kubernetes, but docker-compose is better suited here for (likely) deployment on a single server.

The benefit we get from decoupling the database from the web server and application is that it's more flexible across different kinds of deployment. If the entrypoint to the application assumes that a web server is ready, and a database too (with the username / password passed in from the environment), we aren't limited in how we deploy, including potentially scaling deployment of the application while using just one database. For Singularity Hub, for example, I develop with a postgres image, but the production database is Google managed postgres.

If we hard code the database in the same container, well, then we're stuck with it. There is an alternative - you can do a "one container approach" and box the web server and application in a container, and then have the application start via a single docker container and a database url. I do this with expfactory to allow for filesystem, sqlite3, postgres, or mysql databases. https://expfactory.github.io/usage#custom-databases
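
The "database url" idea described above can be sketched as follows: the application reads its connection settings from the environment and never assumes the database lives in the same container. This is only a minimal sketch. `POSTGRES_PASSWORD` appears elsewhere in this thread, but `DBURL`, `POSTGRES_USER`, `POSTGRES_HOST`, `POSTGRES_DB`, the defaults, and the helper `database_url` are assumptions for illustration, not Runestone's actual configuration.

```python
import os


def database_url(env=None):
    """Build a postgres connection URL from environment-style settings."""
    env = os.environ if env is None else env
    # An explicit URL wins, so production can point at a managed database.
    if env.get("DBURL"):
        return env["DBURL"]
    user = env.get("POSTGRES_USER", "runestone")       # assumed default
    password = env["POSTGRES_PASSWORD"]                # required: KeyError if unset
    host = env.get("POSTGRES_HOST", "db")              # assumed compose service name
    name = env.get("POSTGRES_DB", "runestone")         # assumed default
    # Note: credentials with special characters (e.g. "@") are not escaped here.
    return f"postgres://{user}:{password}@{host}/{name}"
```

With this shape, a compose setup and a single-container setup differ only in the environment they pass in, not in the application code.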

@vsoch
Contributor

vsoch commented Apr 27, 2019

@yarikoptic your setup works okay with

$ docker run -e DB_PASSWORD=runestone -p 8080:8080 runstone/server

What issues do you have? I don't have a lot of experience with web2py, but it seems a lot more complicated than flask or Django, so my (after a couple hours of work today) attempts to get it docker-compose-ized have thus far failed. Is it really a good long term strategy to try and use this deprecated module?

@bnmnetp
Member

bnmnetp commented Apr 27, 2019

What module are you referring to? Web2py? It is still under active development, and works just fine under Python 3.6/3.7. We should have our own port to 3.7 complete in the near future. I've talked about moving to flask over the years, but it's a big lift which seems much less important than all of the other things I could be working on. If there is another module that we are using that is deprecated please let me know, so we can take action.

@vsoch
Contributor

vsoch commented Apr 27, 2019

Ah okay, I was looking at the README - it has Python 2.7 written all over it.

Flask is a lot more popular so you'd likely get more contributors, but I understand that's not your priority.

@bnmnetp
Member

bnmnetp commented Apr 27, 2019

Good point, I updated the README.

I use flask in other projects and I really like it. I'm not sure whether we would get very many more contributors. I learned early on that the vast majority of our users don't want to (or don't have the skills to) mess with web programming or even spinning up a server. (maybe an easy Docker recipe would change that but I'm not sure...) Hence https://runestone.academy fills a need for instructors who want/need access to good free resources for teaching CS.

I was looking at this: https://medium.freecodecamp.org/docker-development-workflow-a-guide-with-flask-and-postgres-db1a1843044a today, and I don't see why something pretty simple like this wouldn't work for us. I've also seen a bunch of web2py images that look like they could be good starting points.

@vsoch
Contributor

vsoch commented Apr 27, 2019

I can give you a start with a docker-compose file, but I don't understand web2py enough to have a full working thing.

@bnmnetp
Member

bnmnetp commented Apr 27, 2019

Cool. Thanks

@bnmnetp
Member

bnmnetp commented Apr 27, 2019

Also, our .travis.yml file may shed some light on the hoops we have to jump through to get web2py and runestone running automatically.

@vsoch
Contributor

vsoch commented Apr 27, 2019

Cool, thanks! I am taking a break and I'll pick up later today and at least share a (not fully working) thing - hopefully we can combine expertise and turn it into a working thing :)

@vsoch
Contributor

vsoch commented Apr 27, 2019

okay, I have a postgres, runestone, nginx docker-compose started up - I'm trying to figure out the right way to start web2py. The command I'm using is:

python web2py.py --ip=0.0.0.0 --port=8080 --password="${POSTGRES_PASSWORD}" -K runestone --nogui -X runestone

When I don't set WEB2PY_MIGRATE to Yes, it's clear that there are tables missing:

# python web2py.py --ip=0.0.0.0 --port=8080 --password="${POSTGRES_PASSWORD}" -K runestone --nogui -X runestone 
web2py Web Framework
Created by Massimo Di Pierro, Copyright 2007-2019
Version 2.18.4-stable+timestamp.2019.03.13.05.27.54
Database drivers available: sqlite3, psycopg2, pymysql, imaplib

please visit:
	http://127.0.0.1:8080/
use "kill -SIGTERM 35" to shutdown the web2py server


starting scheduler for "runestone"...
Currently running 1 scheduler processes
ERROR:web2py.scheduler.061079fb7082#42:    error popping tasks
ERROR:web2py.scheduler.061079fb7082#42:Error retrieving status
Processes started
ERROR:web2py.scheduler.061079fb7082#42:    error popping tasks
ERROR:web2py.scheduler.061079fb7082#42:    error popping tasks
ERROR:web2py.scheduler.061079fb7082#42:    error popping tasks
ERROR:web2py.scheduler.061079fb7082#42:    error popping tasks
ERROR:web2py.scheduler.061079fb7082#42:    error popping tasks
ERROR:web2py.scheduler.061079fb7082#42:    error popping tasks
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/srv/web2py/gluon/scheduler.py", line 635, in run
    self.send_heartbeat(counter)
  File "/srv/web2py/gluon/scheduler.py", line 1161, in send_heartbeat
    self.db_thread(query).count()
  File "/srv/web2py/gluon/packages/dal/pydal/objects.py", line 2368, in count
    return db._adapter.count(self.query, distinct)
  File "/srv/web2py/gluon/packages/dal/pydal/adapters/base.py", line 791, in count
    self.execute(self._count(query, distinct))
  File "/srv/web2py/gluon/packages/dal/pydal/adapters/__init__.py", line 66, in wrap
    return f(*args, **kwargs)
  File "/srv/web2py/gluon/packages/dal/pydal/adapters/base.py", line 413, in execute
    rv = self.cursor.execute(command, *args[1:], **kwargs)
ProgrammingError: relation "scheduler_worker" does not exist
LINE 1: SELECT COUNT(*) FROM "scheduler_worker" WHERE ("scheduler_wo...
                             ^

When I set the variable to "Yes" it tells me I'm missing some sql.log (likely where it would migrate from?)

# python web2py.py --ip=0.0.0.0 --port=8080 --password="${POSTGRES_PASSWORD}" -K runestone --nogui -X runestone 
web2py Web Framework
Created by Massimo Di Pierro, Copyright 2007-2019
Version 2.18.4-stable+timestamp.2019.03.13.05.27.54
Database drivers available: sqlite3, psycopg2, pymysql, imaplib

please visit:
	http://127.0.0.1:8080/
use "kill -SIGTERM 56" to shutdown the web2py server


starting scheduler for "runestone"...
Currently running 1 scheduler processes
Traceback (most recent call last):
  File "/srv/web2py/gluon/restricted.py", line 219, in restricted
    exec(ccode, environment)
  File "applications/runestone/models/db.py", line 24, in <module>
    session.connect(request, response, masterapp='runestone', db=db, migrate='runestone_web2py_sessions.table')
  File "/srv/web2py/gluon/globals.py", line 953, in connect
    migrate=table_migrate,
  File "/srv/web2py/gluon/packages/dal/pydal/base.py", line 590, in define_table
    table = self.lazy_define_table(tablename, *fields, **kwargs)
  File "/srv/web2py/gluon/packages/dal/pydal/base.py", line 624, in lazy_define_table
    polymodel=polymodel)
  File "/srv/web2py/gluon/packages/dal/pydal/adapters/base.py", line 798, in create_table
    return self.migrator.create_table(*args, **kwargs)
  File "/srv/web2py/gluon/packages/dal/pydal/migrator.py", line 279, in create_table
    query), table)
  File "/srv/web2py/gluon/packages/dal/pydal/migrator.py", line 487, in log
    logfile = self.file_open(table._loggername, 'ab')
  File "/srv/web2py/gluon/packages/dal/pydal/migrator.py", line 495, in file_open
    fileobj = portalocker.LockedFile(filename, mode)
  File "/srv/web2py/gluon/packages/dal/pydal/contrib/portalocker.py", line 185, in __init__
    self.file = open_file(filename, mode.replace('w', 'a'))
  File "/srv/web2py/gluon/packages/dal/pydal/contrib/portalocker.py", line 170, in open_file
    f = open(filename, mode)
IOError: [Errno 2] No such file or directory: 'applications/runestone/databases/sql.log'

I noticed that @yarikoptic is reading in a bunch of csv, etc. and I'm wondering where this is from? What is the proper way to start web2py in terms of preparing the database?
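
A note on the second traceback: the file is opened in append mode ('ab'), which creates a missing file, so "No such file or directory" suggests that the applications/runestone/databases directory (or another component of that relative path, as seen from the working directory) does not exist, rather than sql.log itself. A minimal sketch of an entrypoint step that creates the directory before web2py starts; the /srv/web2py root is taken from the traceback paths, and `ensure_databases_dir` is a hypothetical helper, not something this PR provides.

```python
import os


def ensure_databases_dir(web2py_root="/srv/web2py", app="runestone"):
    """Create applications/<app>/databases under web2py_root if it is missing."""
    path = os.path.join(web2py_root, "applications", app, "databases")
    # exist_ok=True makes this safe to call on every container start.
    os.makedirs(path, exist_ok=True)
    return path
```

This only addresses the missing directory; creating the tables themselves is a separate step.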

@vsoch
Contributor

vsoch commented Apr 27, 2019

And note that (for now) I'm running the containers via docker compose, but I've disabled the actual entrypoint (what I'm testing above) in favor of a working entrypoint to keep the container running so I can then shell inside and test interactively.

@vsoch
Contributor

vsoch commented Apr 27, 2019

OO I forgot to run:

rsmanage initdb --

hehe :) Reading instructions is good!

@bnmnetp
Member

bnmnetp commented Apr 27, 2019

Whew!

@bnmnetp
Member

bnmnetp commented Apr 27, 2019

The CSV files that @yarikoptic is reading and feeding to rsmanage commands are not needed for a dev environment. They are there so that an instructor could have a csv file of students to pre-register and a csv file of registered users that get promoted to being TAs or co-instructors.

@vsoch
Contributor

vsoch commented Apr 27, 2019

ahh okay :) I was guessing that. Can you tell me what the scheduler is? It might be appropriate to have it run via a clone of the main container, as opposed to the &&

@bnmnetp
Member

bnmnetp commented Apr 27, 2019

The scheduler is an 'extra' copy of web2py that handles long-running requests. In our case every course has its own static textbook, so when an instructor creates a new course a task gets scheduled to build that book. Building a book would take longer than any sane browser would wait for a response. The scheduler sits there and handles these long-running tasks, and we have javascript code that runs to check the status of the build while the instructor waits.

This whole mess will go away in a month or so, so I would just leave it as is instead of starting a separate container for something that will disappear.
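
The pattern described above (queue a long-running build, return right away, let the client poll for status) can be sketched in plain Python. This is not web2py's scheduler, only a minimal standard-library illustration of the idea; `TaskRunner`, `submit`, `status` and `wait` are hypothetical names.

```python
import threading
import uuid


class TaskRunner:
    """Run long tasks in background threads and expose their status for polling."""

    def __init__(self):
        self._status = {}
        self._threads = {}
        self._lock = threading.Lock()

    def submit(self, func, *args):
        """Start func(*args) in the background and return a task id immediately."""
        task_id = str(uuid.uuid4())
        with self._lock:
            self._status[task_id] = "running"
        thread = threading.Thread(target=self._run, args=(task_id, func, args))
        self._threads[task_id] = thread
        thread.start()
        return task_id

    def _run(self, task_id, func, args):
        try:
            func(*args)
            result = "done"
        except Exception:
            result = "failed"
        with self._lock:
            self._status[task_id] = result

    def status(self, task_id):
        """What a polling client would ask for: 'running', 'done' or 'failed'."""
        with self._lock:
            return self._status[task_id]

    def wait(self, task_id):
        """Block until the task has finished (handy in scripts and tests)."""
        self._threads[task_id].join()
```

A request handler would call `submit` with the book-build function and hand the task id back to the browser, which then polls `status` until it is no longer "running".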

@vsoch
Contributor

vsoch commented Apr 27, 2019

Should this be an environment secret, or something the container can generate when it's created?

- mkdir private
- echo "sha512:16492eda-ba33-48d4-8748-98d9bbdf8d33" > private/auth.key

@bnmnetp
Member

bnmnetp commented Apr 27, 2019

It is important to keep auth.key consistent across restarts, otherwise passwords will break every time you make a new container. But I'm not sure it needs to be secret.
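
One way to get both (generated by the container, yet stable across restarts) would be to create the key only when it is missing, inside a directory that is bind-mounted or otherwise persisted. A minimal sketch: the `sha512:<uuid>` format mirrors the .travis.yml lines quoted above, while the helper name `ensure_auth_key` and the idea of keeping the private directory on a volume are assumptions.

```python
import os
import uuid


def ensure_auth_key(private_dir):
    """Return the contents of private_dir/auth.key, generating it once if absent."""
    key_path = os.path.join(private_dir, "auth.key")
    if os.path.exists(key_path):
        # Reuse the existing key so that existing passwords keep working.
        with open(key_path) as f:
            return f.read().strip()
    os.makedirs(private_dir, exist_ok=True)
    key = f"sha512:{uuid.uuid4()}"
    with open(key_path, "w") as f:
        f.write(key + "\n")
    return key
```

Called from the entrypoint on every start, this writes a fresh key on the first run and leaves it untouched afterwards, as long as the directory survives container replacement.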

@bnmnetp bnmnetp merged commit 6a65bfb into RunestoneInteractive:master Apr 30, 2019
