-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[4.0] Restructuring Installation, splitting up SQL #28350
Conversation
…4install # Conflicts: # installation/sql/mysql/joomla.sql # installation/sql/postgresql/base.sql
SQL seems dodgy - we're going to add assets and extension table rows for extensions that don't have a valid DB table :/ Also not sure what happens during rollback here. Currently if the tables fail to populate or the db fails to create we call the configurationmodel::deleteconfiguration method. Which your now trashing which I think would mean if something fails would now end up with us stuck (as the trigger for the postinstallation view is the configuration.php file existence). Overall not against the concept of splitting some of this up (assume this would also help with drone installation failures). But I think there's still quite some work to be done in this PR before it's ready |
Concerning the "dodgy" SQL: Yes, we could split this up further and have stuff like the extension entries moved to the extensions SQL file, but I wanted to keep it simple in the first iteration. Especially since stuff like nested sets require quite a set of queries to insert new rows... Regarding the rollback: Right now we are writing the configuration.php and executing all the SQL stuff plus some more stuff in just that config step and that is pretty crappy. The whole code structure is quite a mess. Anyway: Right now we are first writing the configuration.php and then we are executing all the queries. This changes that to first setup the DB and THEN to write the configuration.php. So in case of a "rollback", we are simply starting in the installation screen again. |
TBH, I am not sure if this is the right approach. If I'd split up joomla.sql, I'd first split it up into two sql files only for the beginning: One for the schema and one for data. Then we could see if that is already enough. Then, as far as I remember each SQL statement is executed separately in PHP because they are checked and if necessary modified for an utf8mb4 to utf8 downgrade if necessary. So it should be easy to add some error log statement in PHP for each statement to see by the time stamps of these logs where we lose so much time. If necessary then we could change the installation of extensions so that the extensions installer is used, using extension-specific installation sql, like it would be with a discovery installation, so PHP code would care for stuff like insert into extensions table, menus and assets and so on, that would save us some SQL statements in the data file. Possibly it could also help with time to create indexes first with offline status and later after having inserted initial data set them online. I will try to find time on weekend to do that, find out where we lose the time with use of some error log or debug output, then I'll report back here the results and my recommendations and eventually make an alternative PR. |
plus don't forget that mysql is not able to rollback DDL |
To show what advantages it would have to split schema and data, I've made a branch where I have split the joomla.sql file into joomla_schema.sql and joomla_data.sql. Up to now it has only this change, i.e. nothing changed in PHP yet: https://github.com/richard67/joomla-cms/tree/4.0-dev-split-joomla-sql-1. But you can immediately see the advantage when comparing the 2 joomla_data.sql files of the 2 database types with a good comparison tool, e.g. BeyondCompare, TotalCommander or whatever: You see more quickly the differences between them, and there are a lot, e.g. for PostgreSQL we do not set assed_id for diverse table data. And there are other differences. So it seems I have to make at least 1 PR to fix PostgreSQL data. And I think maintenance for the pure data file will be easier, no need to scoll up and down miles to look up some asset_id for some menu. Another advantage of separate file(s) for data and schema is that for the data files we don't need to check for stuff like utf8mb4 to utf8 query downgrade. When I execute the schemal sql in PhpMyAdmin here, it takes some 1 second for all. For the data script it takes fractions of a second. But when I install J4, using the same database server, the database step takes some 30 to 40 seconds. So I think we don't lose the time in SQL, we lose it in PHP. This all just for discussion. @Hackwar I think it is related to this PR, but if you don't want that to be discussed here I will stop and continue elsewhere. |
At the beginning of this PR, the goal was to split up the SQL to make it run on underpowered hosts. But now that it is "done", it is rather a cleanup of the installation. The installation currently has a semi-sophisticated system to process a list of tasks for the actual installation and we are first dbcheck and then... mash everything else into the task "config". As if that wasn't opaque enough, that config step uses methods from several models and those are basically distributed randomly between these models. I'm not claiming that this is perfect already, but at least a lot better than before. How to split up the SQL I don't really care. |
What do you mean with it is done? Here in this PR?
But with this PR you determine how it will be split. And I think it's not a good way. |
I mean the way the PHP has been rearranged.
Feel free to do a PR against this PR to re-order this. But I have to say that I don't think the distinction between schema and content of a table is a good one either. We will at one point come to a situation where we are supporting installations with different feature sets and that will mean for example a reduced number of tables what in here is in the extensions.sql. So a split along the lines of schema and data will not help here. |
@Hackwar Not sure if that would make sense because too different approach, but I will check and do so if useful. Regarding the conflicts: They come from my recently merged PR #28345 . Shall I help you with solving the conflicts? The changes from my PR will still be necessary with your PR, I think. |
That would be good, yes. |
@Hackwar Hmm, it's not simply solving conflicts at the same place. Because you restructured things in your PR, my change has to be applied differently. In general the change was that the createDatabase function of the databae model throws exceptions, but in the calling functions those were not handled, and so the configuration file was not deleted after failure. If you think you can apply this, do so and then just solve conflicts by taking yours, otherwise I will have to check later after work when I'm at home if I can make a PR for your branch for that purpose. |
@Hackwar Have done my best, see Hackwar#37. |
@strawhatgami You have tested with the last changes? Those should have fixed that. |
@strawhatgami If that helps you: You can download full installation packages "Joomla_4.0.0-beta1-dev+pr.28350-Development-Full_Package" here: https://ci.joomla.org:444/artifacts/joomla/joomla-cms/4.0-dev/28350/downloads/30772/. Just download the package again, or if you use a git clone or Patchtester, revert the patch of this PR if you still have it applied, then fetch the latest changes and apply the patch again. The last change should have solved the issue with the falsely reported database error related to UTF-8 Multibyte conversion. |
@richard67 I download the full updated package for each test, so I used your link, thanks. |
@strawhatgami As you can see it doesn't show the error about missing UTF-8 Multibyte conversion. The error about wrong schema version is normal for these packages build for the PR's. So if you want you can give this PR here a good test result. |
Merge upstream/4.0-dev into origin/j4install (2)
I have tested this item ✅ successfully on 5feaae1 This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/28350. |
I have tested this item ✅ successfully on 5feaae1 I have tested new installation with and without the patch of this PR applied, and both of this on MySQL 5.7 and PostgreSQL 11. In each case I have exported structure and data of the database into an SQL file. I have verified that there is no difference in structure and data between a new install with and without this PR, except of those differences in data which are expected for different installations (e.g. id of super user, update times, ...). I can also say that the database installation part feels a bit faster with the patch of this PR applied in case of new installation. This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/28350. |
RTC This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/28350. |
without a strong separation between DDL and DML |
@alikon I think about doing that, separate DDL and DML oir at least reorganise DML so it is easier to be maintained. But when I have tested this PR here, it really seemed to me that database installation was faster than before, and PHP has been cleaned up a bit, so I think it wasn't only cosmetic. |
Again, the goal wasn't to get a perfectly split up SQL, but to fix our installation steps. A next step now is to visualise the progress. So instead of just having a rotating logo, it should display a modal with a list of steps and a checkmark next to them. |
@Hackwar Exactly what I want. Will you do that, or should we ping some JS guru? Update: Maybe not a modal. |
I also disagree that we should split this into structure and data. Instead I want minimal viable system and then the different extensions. The current split is an attempt at that. But if you are unhappy with the current split, feel free to propose a better one in another PR. I don't mind. |
We can have both, split into minimal and extended functionality and in addition a de-coupling of the order of processing for DDL and DML, which never had another reason than our tradition and maybe the ability to look up column names for insert statements in the create table statements above, but that never was a good reason. E.g. I think instead of hard-coding asset_id columns we could get them by clever select, such things. |
I think this PR here was a good step and later changes I have in mind will be small optimizations maybe, nothing fundamental. |
as you may already knows Mysql doesn't allow to rollback for DDL instructions |
@alikon I disagree here. This is about performance, about user experience and about error handling. Before this was merged, the installation had just 2 steps (checking the DB and everything else) and that "everything else" step was slightly resource intensive. On a slow machine this could time out. By splitting this up into far more and more logical steps, this practically prevents the PHP timeouts that some people experienced in the past. By splitting this up, you also have ways to identify an error. Right now it could fail basically everywhere and the error response would always be identical. Did it fail when populating the tables or when converting tables to utf8mb4 or when creating the configuration.php? Since it is just one big lump of a step, you wouldn't really know. Last but not least, even on faster hosts, the installation takes some time. It isn't obvious if the process is still running or if it hangs. With the above mentioned modal, we would now be able to give the user feedback, that it is still processing the installation. |
Anyway, thanks @rdeutz for merging. :-) |
i'm not questioning if this is better or worst than before----- i trust you and people who made test on it |
Pull Request for Issue #28332.
When the installation was redone for 4.0, it seems as if the serverside scripts were more or less bypassed and partially duplicated. I tried to clean this up and also split up the huge SQL file into smaller, more manageable chunks.
This PR is "final", but if we follow this path, we would have to do some more work. Instead of the loader animation, we would have to add a modal that displays what is currently done. (Base SQL, Support SQL, Extensions, custom SQL, localisation SQL, writing configuration) and especially the last step, the config step, would have to be split up further.