Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkp/pkp-lib#2163 Fix PHP Notices for missing article columns #1483

Merged
merged 2 commits into from
Oct 13, 2017

Conversation

NateWr
Copy link
Contributor

@NateWr NateWr commented Aug 10, 2017

@asmecher Ok here's a proper fix for the missing article columns that uses the query builder. All credit to @kaschioudi on this one as he had to lift me over all the SQL hurdles. :)

@NateWr NateWr requested a review from asmecher August 10, 2017 10:20
$q->leftJoin('section_settings as stl', function($join) use($locale) {
$join->on('s.section_id', '=', Capsule::raw("'stl.section_id'"));
$join->on('stl.setting_name', '=', Capsule::raw("'section_title'"));
$join->on('stl.locale', '=', Capsule::raw("'{$locale}'"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this defeat string escaping? (Also below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does. Unfortunately, the bound params get compiled out of order otherwise, so we lose some of the benefits of the query builder here. But... (see below)

Copy link
Member

@asmecher asmecher Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one would work by just changing out Capsule::raw("'{$locale}'") for $locale. (And then you wouldn't have to worry about malicious values of $locale breaking out of their cages using ' characters embedded within.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I do that, I get the following error:

DB Error: Unknown column 'en_US' in 'on clause'

This comment walks through the debug process on that line and how we ended up there if that gives any clue to what's going on: #1483 (comment)

$primaryLocale = \AppLocale::getPrimaryLocale();
$locale = \AppLocale::getLocale();

$this->columns[] = Capsule::raw('COALESCE(stl.setting_value, stpl.setting_value) AS section_title');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me like heavy use of Capsule::raw essentially reverts us to gluing together SQL strings, negating the usefulness of the query builder, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do lose much of the expressiveness of the query builder. But it does allow us to piece together bits of a query out of order, which is really helpful for building extensible queries to share code between OJS/OMP, and to allow plugins to hook in.

This method of gluing SQL strings together seems easier and less fragile to me than direct string manipulations. But I'd be open to an alternate approach. I just don't think I'm the one to formulate that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's talk about this during the dev call. Everywhere where we need to split SQL construction between classes (most typically app/pkp-lib) it's currently spaghetti, so this is an improvement -- but it might be nice to find a way to establish this new pattern with less use of the raw construct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NateWr, can you walk me through an example of where the ordering is important?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asmecher Ok I tried to come up with a reproducible case of the issue. I have to say I don't really understand what's going on here too well. But let's take the following code here:

		$q->leftJoin('section_settings as stpl', function($join) use($primaryLocale) {
			$join->on('s.section_id', '=', Capsule::raw("'stpl.section_id'"));
			$join->on('stpl.setting_name', '=', Capsule::raw("'section_title'"));
			$join->on('stpl.locale', '=', Capsule::raw("'{$primaryLocale}'"));
		});

My first attempt looked like this:

		$q->leftJoin('section_settings as stpl', function($join) use($primaryLocale) {
			$join->on('s.section_id', '=', 'stpl.section_id');
			$join->where('stpl.setting_name', '=', 'section_title');
			$join->where('stpl.locale', '=', $primaryLocale);
		});

If you look at the SQL query and the bound params, they look like this:

select `s`.*, COALESCE(stl.setting_value, stpl.setting_value) AS section_title, COALESCE(sal.setting_value, sapl.setting_value) AS section_abbrev from `submissions` as `s` left join `section_settings` as `stpl` on `s`.`section_id` = `stpl`.`section_id` and `stpl`.`setting_name` = ? and `stpl`.`locale` = ? left join `section_settings` as `stl` on `s`.`section_id` = 'stl.section_id' and `stl`.`setting_name` = 'section_title' and `stl`.`locale` = 'en_US' left join `section_settings` as `sapl` on `s`.`section_id` = 'sapl.section_id' and `sapl`.`setting_name` = 'section_abbrev' and `sapl`.`locale` = 'en_US' left join `section_settings` as `sal` on `s`.`section_id` = 'sal.section_id' and `sal`.`setting_name` = 'section_abbrev' and `sal`.`locale` = 'en_US' where `s`.`context_id` = ? and `s`.`stage_id` in (?, ?, ?) and `s`.`section_id` in (?) group by `s`.`submission_id`, COALESCE(stl.setting_value, stpl.setting_value), COALESCE(sal.setting_value, sapl.setting_value) order by `s`.`date_submitted` desc

Array
(
    [0] => 1
    [1] => 1
    [2] => 2
    [3] => 3
    [4] => section_title
    [5] => en_US
    [6] => 2
)

So the $join->where statements were getting set as bound parameters, but were out of order with the eventual compilation of the query string.

The fix was to use multiple $join->on statements, like this:

		$q->leftJoin('section_settings as stpl', function($join) use($primaryLocale) {
			$join->on('s.section_id', '=', 'stpl.section_id');
			$join->on('stpl.setting_name', '=', 'section_title');
			$join->on('stpl.locale', '=', $primaryLocale);
		});

For some reason, those on statements don't get set as bound parameters. They get rendered into the SQL statement. It's not clear to me whether the parameter binding/sanitizing is happening within the leftJoin, so it's still sanitizing the parameter. Of it it's just ignoring it and not doing any binding/sanitizing on it. That leads the the following SQL and params output:

select `s`.*, COALESCE(stl.setting_value, stpl.setting_value) AS section_title, COALESCE(sal.setting_value, sapl.setting_value) AS section_abbrev from `submissions` as `s` left join `section_settings` as `stpl` on `s`.`section_id` = `stpl`.`section_id` and `stpl`.`setting_name` = `section_title` and `stpl`.`locale` = `en_US` left join `section_settings` as `stl` on `s`.`section_id` = 'stl.section_id' and `stl`.`setting_name` = 'section_title' and `stl`.`locale` = 'en_US' left join `section_settings` as `sapl` on `s`.`section_id` = 'sapl.section_id' and `sapl`.`setting_name` = 'section_abbrev' and `sapl`.`locale` = 'en_US' left join `section_settings` as `sal` on `s`.`section_id` = 'sal.section_id' and `sal`.`setting_name` = 'section_abbrev' and `sal`.`locale` = 'en_US' where `s`.`context_id` = ? and `s`.`stage_id` in (?, ?, ?) and `s`.`section_id` in (?) group by `s`.`submission_id`, COALESCE(stl.setting_value, stpl.setting_value), COALESCE(sal.setting_value, sapl.setting_value) order by `s`.`date_submitted` desc

Array
(
    [0] => 1
    [1] => 1
    [2] => 2
    [3] => 3
    [4] => 2
)

However, that leads to an OJS database error:

<h1>DB Error: Unknown column 'section_title' in 'on clause'</h1>

Finally, Kassim recommended the use of Capsule::raw, which seems to fix that problem:

		$q->leftJoin('section_settings as stpl', function($join) use($primaryLocale) {
			$join->on('s.section_id', '=', Capsule::raw("'stpl.section_id'"));
			$join->on('stpl.setting_name', '=', Capsule::raw("'section_title'"));
			$join->on('stpl.locale', '=', Capsule::raw("'{$primaryLocale}'"));
		});

So it's not clear to me if this is really a case where order is causing problems. Maybe where methods are to be avoided inside leftJoin closures. It's not the only place I've run into this issue, but it's not exactly clear to me when the params get bound of order or why.

Maybe @kaschioudi has a better idea of a more specific example or use-case to illustrate the issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapping column names in Capsule::raw is OK when it's just a column name -- it's more when whole clauses of SQL get wrapped in it that we lose the benefits of the query builder, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so I think this is the largest chunk that gets passed through Capsule::raw:

$this->columns[] = Capsule::raw('COALESCE(stl.setting_value, stpl.setting_value) AS section_title');

When I take that out of the Capsule::raw it tells me:

<h1>DB Error: Unknown column 'section_title' in 'on clause'</h1>

Maybe you or @kaschioudi have an idea for getting around this, but it's not entirely clear to me what the issue is, or why wrapping it in Capsule::raw solves the problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's OK to wrap Capsule::raw around this example, as I suspect it'll try to interpret it as a string value or something without the wrapper.

@@ -44,6 +44,38 @@ public function filterBySections($sectionIds) {
* @return object Query object
*/
public function appGet($q) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking broadly about the division of SQL between the DAOs and the service layer classes, what is the end-game for the DAOs? For example, would they retain insert/delete/fetch, leaving the more business-logic-intensive joins to service classes? (That's as much for @kaschioudi as anyone)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to @kaschioudi on this one. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaschioudi, pinging you on the above comment!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alec for the reminder. I totally forgot about this. So here's my point of view.

Generally services are helper objects which handle complex business logic usually by controlling multiple domain objects. For OJS that means Services should orchestrate interactions between group of DAO and/or query builders. CRUD and simple operations remain on DAOs and complex queries go into entity query builders. Once we replace ADODB (pkp/pkp-lib#2493), we will probably introduce model (ActiveRecord implementation) objects which will centralize CRUD and complex queries under a single class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that makes sense -- and provides a good place for cross-object queries to live, which we've never had before.

NateWr referenced this pull request in pkp/pkp-lib Aug 28, 2017
This commit integrates some updates to the REST API infrastructure
from kaschioudi/pkp-lib's rest API branch. These changes include
a new getParameter approach for retrieving params passed to API
endpoints, as well as splitting off a /backend/ API namespace for
specifically serving our own app needs.

It also introduces the services classes introduced in kaschioudi's
services branch. The submission service class is a central place
for managing business logic. At the moment, it provides a common
point for retrieving a list of submissions and deleting
submissions.

These two architectural changes were integrated with the
submissions list panel component that is powered by Vue.js. As a
result, this commit also strips out the large method bundled in
the SubmissionDAO->get() from previous commits. It also strips
out the DBQueryBuilder class that was added, in favor of the
query builder class that kaschioudi has introduced. That builder
is under classes/services/QueryBuilders and is based on the lib
which powers Laravel.
@NateWr NateWr force-pushed the i2163_add_article_columns branch from 2e0a1ab to bf62f93 Compare September 15, 2017 14:51
@NateWr NateWr force-pushed the i2163_add_article_columns branch from bf62f93 to 335b519 Compare September 27, 2017 11:25
@NateWr NateWr force-pushed the i2163_add_article_columns branch 3 times, most recently from faf9133 to 83e8689 Compare October 9, 2017 12:28
$q->groupBy(Capsule::raw('COALESCE(sal.setting_value, sapl.setting_value)'));

$q->leftJoin('section_settings as stpl', function($join) use($primaryLocale) {
$join->on('s.section_id', '=', Capsule::raw('`stpl`.`section_id`'));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaschioudi and @asmecher I'm running into some problems with this line and 63, 69 and 75 below. When written like the following, it passes mysql tests but fails postgres tests:

'`stpl`.`section_id`'

Same with:

"`stpl`.`section_id`"

I've tried other things which pass postgres but cause problems with mysql. I've tried:

'"stpl"."section_id"'

And

"'stpl'.'section_id'"

And

'stpl.section_id'

Do either of you have an idea for what the syntax should be for this to pass muster with postgres and mysql?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking into this now...

Copy link
Member

@asmecher asmecher Oct 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any quoting is needed here. This works for me with PostgreSQL: asmecher@9ca1c8b

@NateWr NateWr force-pushed the i2163_add_article_columns branch from 83e8689 to 886c30f Compare October 12, 2017 10:56
@NateWr NateWr merged commit e8711da into pkp:master Oct 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants