TO BE REPLACE: Discussions about Docker + originally hard copy fixes / extensions for WW 2.15 branch + move Docker base image to Ubuntu 18.04 LTS #972
as it was (accidentally) removed in commit openwebwork@bea85e1
I think this is a critical fix.
|
On WW running in Docker - after defining the setting, generating PDF fails due to
|
How about using |
The main goal was to support multilingual PDF generation using XeLaTeX and polyglossia. My test case was English+Hebrew, so I have added XeLaTeX conf/snippets/hardcopyThemes directories for using polyglossia for English and English+Hebrew. Fonts (culmus) were installed in the Docker image to provide the Hebrew fonts needed to test the English+Hebrew xelatex compilation.
conf/localOverrides.conf.dist contains sample code to use the new themes. conf/site.conf.dist contains a commented out option to use xelatex instead of pdflatex.
The docker-config/ directory has additional sample config files, and docker-entrypoint.sh was moved into docker-config/. In particular, there are very rough sample files for using WW with SSL in Docker, but the placeholder certificate and key files would need to be put in place, and various settings adjusted in the config files.
The code in lib/WeBWorK/ContentGenerator/Hardcopy.pm which displays the available hardcopyThemes options was changed to avoid the use of maketext. Local installations may have local hardcopyThemes which will not have entries in the PO translation files, and the prior code would trigger errors as a result.
The version of xelatex and various latex packages in Ubuntu 16.04 did not manage to compile the Hebrew+English latex files I tested, but they compile under Ubuntu 18.04. As a result, the Dockerfile was changed to build upon Ubuntu 18.04 (LTS), which had originally been intended to be done in a different PR. Along the way, many Perl modules are now installed from the Ubuntu repositories, which saved on "cpanm" local installation. Several additional tools, as well as some packages for SSL support, were added to the image.
The docker-compose.yml was updated, and includes sample code to use various external resources and to store Apache logs outside the image.
At present, the docker config files are set up under the assumption that the OPL is mounted from OUTSIDE the docker image. That saves on build time and allows sharing it between different images (but may slow things down somewhat). There is a commented out block in the Dockerfile to put the OPL in the image instead.
The Dockerfile also updates the base Ubuntu image, creates locales, sets the timezone, etc. Several lines for SSL support are commented out, and should be enabled for sites using SSL once the relevant customized configuration files and certificates are provided to the image (possibly via external mounts).
The settings for the SQL passwords in docker-compose.yml take effect only if they are set when the MariaDB container boots up with an empty database directory/volume. Thus, those passwords should be set BEFORE the first "docker-compose up", or after the mysql data volume is deleted.
@heiderich - I had hoped to make a change to Along the way, more Perl modules are installed via Ubuntu, additional tools and packages are added, including some needed for SSL support. The result is that this is a pretty "heavy" PR in terms of changes. See the commit message from the second commit, which has lots of explanation. On 2 machines I tested on using Docker I can generate PDF files using xelatex both with the English+Hebrew hardcopy theme and with an English only hardcopy theme for all English problem sets. Making use of the new hardcopy themes requires changes in The commit message also explains about when/how to set the SQL passwords (= before first DB startup). |
I've successfully built webwork-2.15 with PR #972 and #973 merged both in a docker environment I haven't done much testing yet. I probably won't be able to test the handling of Hebrew easily, but I can run these for a while and see if there are any more issues with just using 2.15 for existing courses and problems. |
After running |
I'm puzzled by the intended "out of the box" operation of docker and the OPL. Is it intended that the OPL be installed manually inside the container after it is set up? Linking to an external OPL was commented out but there were no instructions for cloning the OPL in the docker-compose.yml |
The mysql passwords are not really needed in the Dockerfile, as far as I understand, as that is used only to build the webwork app image; however, that may have been the original manner in which they were put into the environment variables for the set of images. Instead, they should now be set via |
Note: This is also a PR against the WW 2.15 tree... Yes - the older Docker builds did not have the package with To get support for additional languages with Unicode: you could try pdflatex + babel, but xelatex + polyglossia are far more Unicode friendly. However, the hardcopy theme using either option needs to load the correct languages, provide fonts, etc. Polyglossia + fontspec under xelatex use system-installed fonts, which require far less work to get functioning than using fonts with pdflatex and babel. References:
The critical bits are more-or-less something like
but with the languages needed and suitable font names. As explained above, the versions of various latex/xelatex packages/code under Ubuntu 16.04 did not support compiling the same tex files which worked under Ubuntu 18.04... Here is a xelatex compiled file from my server: The source LaTeX code was renamed to end with .txt so it can be uploaded: I will email you a tgz file with the PG files of this assignment and the set definition file. |
I think this approach is going to work if you do the right header setup (either for babel or polyglossia), but it is going to depend on your "locale". So the need for personalised headers cannot be bypassed. |
Regarding the specifics of the error on the second page of ww2-test002.pdf:
|
Yes, that is currently the case for any non-English latex document. 😄 @nmoller - If you want to prepare hardcopy theme headers for French - that would be nice to add before WW 2.15 is released. It could be that the CCDMD team already have one somewhere in there, possibly CCDMD@b289cf7#diff-f69176fb70ee3fea438b14262d26ae16 @heiderich - do you want to tackle German? Would your colleague who helped write a sample Arabic PG problem maybe help with an Arabic hardcopy theme?
This has been a long team effort, and I joined the party very late. Almost all the core UTF-8 support in the WW 2.15 branch was added by:
@mgage seems to be ready to close up WW 2.15 with the last few things soon, so it can be formally released later in the summer. |
I think this approach is going to work if you do the right header setup (either for babel or polyglossia), but it is going to depend on your "locale". So the need for personalised headers cannot be bypassed.
At a more generic level, the database works perfectly (a fresh, clean one; I have not tested an update as described in https://mathiasbynens.be/notes/mysql-utf8mb4 or https://medium.com/@manish_demblani/breaking-out-from-the-mysql-character-set-hell-24c6a306e1e5).
Great work!
One more comment: are you able to separate this Docker change and the hard copy fixes into two different commits? It will be better for the git history and much easier for people who need to look at this change in the future.
RUN apt-get update \
&& apt-get -y upgrade \
I would remove this, as per the Dockerfile best practices (https://docs.docker.com/develop/develop-images/dockerfile_best-practices/):
Avoid RUN apt-get upgrade and dist-upgrade, as many of the “essential” packages from the parent images cannot upgrade inside an unprivileged container. If a package contained in the parent image is out-of-date, contact its maintainers. If you know there is a particular package, foo, that needs to be updated, use apt-get install -y foo to update automatically.
libuniversal-can-perl \
libuniversal-isa-perl \
libtest-fatal-perl \
libjson-xs-perl \
I find that using system packages instead of CPAN can sometimes be troublesome. Oftentimes, system packages are not up-to-date. What packages are people using when doing webwork core development? I suggest we use the same packages for consistency and stability.
I hope most developers will now be working using the "current" Docker image + their local modifications, so standardizing on what is packed by our root OS image is convenient, for what is available. As I understand it, that is also how @mgage would like to see core system developers working.
If there are specific CPAN packages where a newer version is needed, we should use them and document it in the WW install instructions, as such issues are likely to affect people on some Linux distributions (ex. RHEL/CentOS) where changes to newer versions are very slow to happen.
I think @mgage and I ran into such an issue with the DBI or DBD on an older machine.
Having started out using CPAN packages almost exclusively, I have found system packages a bit quicker and sometimes easier. We have also run into problems where the system files are out of date. On the other hand, we have run into several CPAN modules that are unreliable to download and install: XML-Simple and SOAP-Lite (packaged as libxml-simple-perl and libsoap-lite-perl) were two examples where loading the CPAN modules would sometimes work and sometimes not.
I haven't yet found a universal rule for preferring system install over cpan install.
ca-certificates \
culmus \
fonts-linuxlibertine \
lmodern \
I would avoid including tools that are not necessary to run webwork, e.g. vim, telnet, etc. The goal is to build a smaller image that doesn't necessarily include everything.
Point accepted. I found telnet helpful when there were issues with outgoing connections, and some reasonable editor is sometimes needed when working/debugging in the running container. However, they can be installed on the running container as needed.
I also found such tools very useful. Especially during development, it is handy not to have to install them after every rebuild.
I did not catch up with all the discussions going on, but maybe one option could be to have two Dockerfiles, one mainly for production and one mainly for development. Of course, this would require maintaining two Dockerfiles and probably some code duplication.
I just saw #972 (comment). I agree with @xcompass that having a single Dockerfile / image for dev and ops would be desirable. @xcompass, what does your development cycle look like? When you change parts of webwork2 or pg, do you do this on the host system and rebuild the container? If this is the recommended way, then I think we should try to ensure that such changes result in short rebuild times.
In trying to understand how to speed up Docker performance on the Mac, I have found that Docker allows for "cascading" files, at least for docker-compose.yml.
You can have docker-compose.yml (for production) and docker-compose.dev.yml (for development), with the latter containing only the additions needed for development.
Running docker-compose -f docker-compose.yml -f docker-compose.dev.yml up then creates the development environment.
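The layering described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the service name `app`, the mount path and the `DEV` variable are hypothetical and are not taken from the files in this PR, and the `version` value would have to match the one used in the real docker-compose.yml. The script writes the override into a scratch directory and prints the command rather than starting any containers.

```shell
#!/bin/sh
# Sketch only: service name, mount path and variable are hypothetical.
scratch=$(mktemp -d)

# Development-only additions; docker-compose.yml itself stays untouched.
cat > "$scratch/docker-compose.dev.yml" <<'EOF'
# The version must match the one declared in docker-compose.yml.
version: '3'
services:
  app:
    volumes:
      # Mount the local checkout over the copy baked into the image.
      - "./:/opt/webwork/webwork2"
    environment:
      - DEV=1
EOF

# Files given with later -f options are merged on top of earlier ones.
compose_cmd="docker-compose -f docker-compose.yml -f $scratch/docker-compose.dev.yml up"
echo "$compose_cmd"
```

Keeping the development additions in a separate file means the production file never has to carry commented-out development settings.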
RUN echo 'America/New_York' > /etc/timezone \
I would use UTC for the container timezone, as that is best practice for ops. People often mount the /etc/timezone file from the host to keep the container in sync with the host.
Sorry, I'm not a devops expert (yet). That makes lots of sense and will be done, and I will add a note somewhere in docker-compose.yml about maybe mounting the system file.
@xcompass - what do you recommend about the generation of locale files? I think the same logic regarding timezone means that the code to generate locales might best be moved to the very end of the Dockerfile or to the startup file.
Makes sense 😄
@@ -104,72 +168,90 @@ RUN mkdir -p $APP_ROOT/courses $APP_ROOT/libraries $APP_ROOT/webwork2
# && mv webwork2-${WEBWORK_BRANCH} $APP_ROOT/webwork2 \
# && rm -rf /tmp/${WEBWORK_BRANCH}.tar.gz /tmp/webwork2-${WEBWORK_BRANCH}
Does the container not include the webwork source anymore? What if I'm an OPL developer and have no interest in changing the webwork core code, or I just want to test webwork out? I would still need to download a copy of the webwork source code outside the container.
Good point. For any operational use it should be included, even if system developers don't need it.
ENTRYPOINT ["docker-entrypoint.sh"]

EXPOSE 80

# Comment out the next line if SSL is not needed
I would change it to "Uncomment the next line if SSL is needed".
# setup apache
# if no SSL needed - comment out the line && a2enmod ssl && a2ensite default-ssl \
I would change it to "if SSL is needed - uncomment the line && a2enmod ssl && a2ensite default-ssl".
Please see the revised PR #980 which focuses on the Docker changes only. I tried to implement all the changes suggested.
WORKDIR $APP_ROOT
(several empty lines follow)
Why so many empty lines?
build: .
image: webwork

image: webwork_215_ubuntu_1804
I would not change this. In the future (or maybe we should do it now), we should set up dockerhub to automatically build the image, so people don't need to build the image locally. It is much faster to pull down an image from dockerhub than building it locally.
@xcompass - Thanks for all the feedback about the Docker files and best practices.
- I think that at the least we should somehow "tag" for WW release version, so that people can depend on getting the version they expect.
- @xcompass - How should that be done "correctly"?
- I will work on splitting this into two different PRs: one only for the WW hardcopy code without any of the Docker changes, and one with the Docker changes taking your comments into account.
- I don't know if I will have it ready until early next week.
@mgage - Can we merge in #973 soon, as that had a very minor change to the Dockerfile and working in steps will avoid merge conflicts? Then I will build the "hardcopy only" PR based on a branch with that merged in, and then the major Docker changes from this PR as a step after the revised "hardcopy" PR. We probably want to consider @heiderich's suggestion of moving all the Docker files out of webwork2 and having webwork2 and pg as submodules of the "ww-docker" module. I don't know if the people able to work on that task have time to try to do it before WW 2.15 becomes an official release candidate.
Sorry for the delay, we were busy with a few deadlines in the past few weeks. Here are my comments:
Sorry for the wall of text. I do both dev and ops, so I tend to think about what the best approach is for both sides. |
@xcompass Thanks.
|
@xcompass Thanks for your 'wall of text'. We need comments from those experienced in both dev and ops. While we are continuously learning, the fact that many of us are mathematicians and don't work on many different types of software development limits the breadth of our experience. One of the time-consuming aspects of installing webwork2 quickly is building the index for the OPL in the database (running OPL-update). Is there some way to keep that part of the database in an image so that it loads quickly even if it's a bit out of date? Perhaps the better long-term fix would be to consider how to improve OPL-update so that it does updates as opposed to starting from scratch. |
Why don't we figure out how to distribute a tarball of the OPL SQL tables (sort of like the SQL portion of the course archives) which can be installed using the admin course? That would allow installing a "recent" set of OPL tables without any need to scan the entire OPL file tree and parse the PG files. It might be necessary to provide GPG signatures and/or SHA-256 sums of the internal data/tarball to make this a bit more secure. Such an approach would mean that "most people" could avoid running OPL-update at all, so long as they don't have "local code" merged into the OPL for search purposes. Once we finish and release multi-library support, people will no longer need that hack.
I suspect that making OPL-update incremental might be hard to do, as there is a dependence on "id numbers" of objects in the SQL tables, which get created when the tables are created. Determining what they were from a modified OPL might not be simple, nor would detecting deletions of files or changes to tags. |
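As a rough illustration of the SHA-256 idea mentioned above, the sketch below builds a stand-in tarball, records its checksum, and verifies it before use. The file names are hypothetical (no such tarball is published by the project), `sha256sum` from GNU coreutils is assumed to be available, and GPG signing is not covered.

```shell
#!/bin/sh
# Sketch only: file names are hypothetical stand-ins for an OPL table dump.

# verify_tarball TARBALL: succeed only if TARBALL matches TARBALL.sha256.
verify_tarball() {
    dir=$(dirname "$1")
    name=$(basename "$1")
    # sha256sum -c resolves the recorded file name relative to the current directory.
    (cd "$dir" && sha256sum -c "$name.sha256" >/dev/null 2>&1)
}

workdir=$(mktemp -d)
printf 'placeholder OPL table dump\n' > "$workdir/opl-tables.sql"
tar -czf "$workdir/opl-tables.tar.gz" -C "$workdir" opl-tables.sql

# Publisher side: record the checksum next to the tarball.
(cd "$workdir" && sha256sum opl-tables.tar.gz > opl-tables.tar.gz.sha256)

# Consumer side: load the tables only if the checksum matches.
if verify_tarball "$workdir/opl-tables.tar.gz"; then
    echo "checksum ok"
else
    echo "checksum mismatch"
fi
```

A checksum only detects corruption or tampering if the checksum file itself comes from a trusted place, which is where the GPG signatures suggested in the comment would come in.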
Creating an SQL dump of the library is an excellent idea. We could use some checks to make sure that the database version matches the current OPL repos. Even if it doesn't, the match would probably be good enough for most development purposes. For production sites, which change infrequently, the overhead of building the OPL database is not so bad. We could create a github hook that creates a new OPL database when changes are made (or once a day if changes are made). |
@mgage @taniwallach @xcompass: Before embarking on a long process, I think it could be helpful to establish where we are heading. Let me explain. Pan has set up a way to get an image that is as light as possible, with all the setup done, so that everything works out of the box; that is why the OPL update is included in docker-entrypoint.sh. I believe the 10 minutes it takes are worth it :) because somebody can use the image as it is without any hacking. If you are thinking of a DEV environment without the OPL update option, it is easier to do that in the entrypoint script. |
The Docker configuration changes (after additional modifications) were moved to #980 |
Replaced by:
Much discussion about revisions to the Docker configuration appears in the conversation below.
Original:
Reinsert the $externalPrograms{pdflatex} setting in conf/site.conf.dist, as it was removed in commit bea85e1, apparently by accident. The hard copy generation code in lib/WeBWorK/ContentGenerator/Hardcopy.pm and the web service hard copy PDF generation code in lib/WebworkWebservice/MathTranslators.pm both expect this setting to exist.