-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow "" to filter for explicite empty fileds #1798
Conversation
The Appveyor failure is unrelated to this PR. |
I've tested this on Mac, where it appears to work as intended with no unintended side effects. |
Thank you for testing. |
I would, but I don't think I know C++ or the syntax well enough for it to be a meaningful review. |
src/library/searchquery.cpp
Outdated
bool NullTextFilterNode::match(const TrackPointer& pTrack) const { | ||
for (const auto& sqlColumn: m_sqlColumns) { | ||
QVariant value = getTrackValueForColumn(pTrack, sqlColumn); | ||
if (!value.isValid() || !qVariantCanConvert<QString>(value)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition might cause that different columns are picked by match() and toSql(). toSql() simply picks the first column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. This is only relevant for "artist" which has two columns. So I think in our case if the "artist" column is empty or invalid it should be enough to match.
src/library/searchquery.cpp
Outdated
if (!value.isValid() || !qVariantCanConvert<QString>(value)) { | ||
continue; | ||
} | ||
// only use the major coloumn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
src/library/searchquery.cpp
Outdated
|
||
QString NullTextFilterNode::toSql() const { | ||
for (const auto& sqlColumn: m_sqlColumns) { | ||
// only use the major coloumn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
src/library/searchquery.cpp
Outdated
} | ||
|
||
QString NullTextFilterNode::toSql() const { | ||
for (const auto& sqlColumn: m_sqlColumns) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop is confusing if you just want to pick the first column from a non-empty list.
src/library/searchquery.cpp
Outdated
} | ||
|
||
QString NoCrateFilterNode::toSql() const { | ||
return QString("id NOT IN (%1)").arg(CrateStorage::formatQueryForTrackIdsWithCrate()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRATETABLE_ID
src/library/searchqueryparser.cpp
Outdated
@@ -153,8 +170,13 @@ void SearchQueryParser::parseTokens(QStringList tokens, | |||
mixxx::track::io::key::ChromaticKey key = | |||
KeyUtils::guessKeyFromText(argument); | |||
if (key == mixxx::track::io::key::INVALID) { | |||
pNode = std::make_unique<TextFilterNode>( | |||
m_pTrackCollection->database(), m_fieldToSqlColumns[field], argument); | |||
if (argument == "\"\"") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""" should become a constant
src/library/searchquery.cpp
Outdated
m_pCrateStorage->selectAllTracksSorted()); | ||
|
||
while (crateTracks.next()) { | ||
m_matchingTrackIds.push_back(crateTracks.trackId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Track IDs might be pushed multiple times, which is not necessary. The result contains tuples of (track id, crate id) sorted by track id, but we only need distinct track ids here.
We can also think about adding a new query instead of post-filtering the result.
Using "" for empty/missing numeric fields feels strange t first, but at least it is consistent. It might even be more intuitive for the common user, who does not consider the restrictions of a strongly typed programming language. |
Done. Thank you for review. |
src/library/searchquery.h
Outdated
@@ -16,6 +16,8 @@ | |||
#include "util/memory.h" | |||
#include "library/crate/cratestorage.h" | |||
|
|||
const QString kExpliciteEmpty = "\"\""; // "" searches for an empty string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of the constant is not a valid English term and needs to be explained. The comment is misleading, because it applies to all kind of fields and not only strings.
- Maybe kMissingFieldSearchTerm or something else?
- The SQL uses "IS NULL" which does not match fields with empty strings!? What is the desired result for string fields? I suggest to cover both NULL and empty strings with this filter.
src/library/searchquery.h
Outdated
@@ -91,6 +93,23 @@ class TextFilterNode : public QueryNode { | |||
QString m_argument; | |||
}; | |||
|
|||
class NullTextFilterNode : public QueryNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming is inappropriate if we want to filter for both NULL and empty strings.
As expected empty string fields are not covered by this filter yet. |
Done! |
After a closer look at the parser I really have to asked a question: Why don't we evaluate empty arguments like they are supposed to be?? The odd special case handling at the beginning of SearchQueryParser::getTextArgument() prevents us from using empty arguments for this use case. Instead we now add another special case by introducing this counterintuitive empty quoted argument workaround. We should instead remove the existing special case and use empty arguments to search for missing or empty fields. Did you notice that only spaces inserted on the right side get this special case handling, but not those on the left side? One more reason for removing this ugly legacy code, even if it changes the behaviour. Only the regex should be responsible for tokenizing the filter strings into field and argument. |
I think there can be a way to keep some of the special case handling and allow to use an empty argument for searching for empty strings. As a last resort, we can also ditch the feature to allow a space after the token. I think we need to define the big goal and adjust our small steps towards this. I also like to stick with the quoted search as an "exact match" like Google does. After the discussion how to find BPM values, I am convinced that the fuzzy match should be the default and if one wants to have the exact match, he has to use quotes. By the way a singles space after the token like "title: " may also search for all titles that contain a space aka two words. What do you think? |
Conflicts: src/library/searchquery.cpp
The idea for distinguishing between an exact and a fuzzy search not reflected in the code. Now it's just one more special case that makes refactoring the code harder when implementing the full-featured solution. That's my only (minor) concern. But the code has to be rewritten anyway. Usability (also a minor issue): The feature is difficult to discover without reading the manual. It's ok for this small change. But I don't like this way of piling up new features on top without cleaning up the mess below, even if this requires dropping old, inconsistent features. |
Ok, so how to continue here? The first candidate for exact and fuzy match can be BPM. |
src/library/crate/cratestorage.cpp
Outdated
@@ -535,6 +547,17 @@ CrateTrackSelectResult CrateStorage::selectTracksSortedByCrateNameLike(const QSt | |||
} | |||
} | |||
|
|||
TrackSelectResult CrateStorage::selectAllTracksSorted() const { | |||
FwdSqlQuery query(m_database, QString( | |||
"SELECT %1 FROM %2 GROUP BY %1 ORDER BY %1").arg( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you use GROUP BY instead of DISTINCT?
bool NoCrateFilterNode::match(const TrackPointer& pTrack) const { | ||
if (!m_matchInitialized) { | ||
TrackSelectResult tracks( | ||
m_pCrateStorage->selectAllTracksSorted()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worried, because generally the size of this query's result if of the same magnitude as the size of the whole library.
The only way for preventing this would be to use an adaptive strategy depending on the actual contents of the crate tracks table, i.e. for deciding if most tracks are organized in crates or not. This comes at the cost of one or two additional queries for determining the sizes upfront to chose the optimum query that will return either a blacklist or a whitelist.
But the whole search thing does not scale very well, so let's just keep it as is without introducing more complexity.
@@ -91,6 +93,23 @@ class TextFilterNode : public QueryNode { | |||
QString m_argument; | |||
}; | |||
|
|||
class NullOrEmptyTextFilterNode : public QueryNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have separate NullOrEmptyTextFilterNode/TextFilterNode and NoCrateFilterNode/FilterNode and the decision which one to use is done by the parser. But there is only one NumericFilterNode which does this decision internally.
I understand the motivation behind this, i.e. to separate and modularize the code for different use cases. But now the design has become inconsistent and responsibilities are scattered across different levels of abstraction.
We can keep the functionality as is. But we have to at least fix the design issues to keep this maintainable. |
OK, done. |
Conflicts: src/library/searchqueryparser.cpp
@uklotzde, All comments are addressed, Can we merge this? |
src/library/searchquery.h
Outdated
QString m_operator; | ||
double m_dOperatorArgument; | ||
bool m_bRangeQuery; | ||
double m_dRangeLow; | ||
double m_dRangeHigh; | ||
}; | ||
|
||
class NullNumericFilterNode : public QueryNode { | ||
public: | ||
NullNumericFilterNode(const QStringList& sqlColumns); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicit
src/library/searchquery.h
Outdated
@@ -102,6 +121,20 @@ class CrateFilterNode : public QueryNode { | |||
mutable std::vector<TrackId> m_matchingTrackIds; | |||
}; | |||
|
|||
class NoCrateFilterNode : public QueryNode { | |||
public: | |||
NoCrateFilterNode(const CrateStorage* pCrateStorage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add explicit to all single argument constructors. Otherwise I think we can merge this. LGTM.
Thank you for review. |
This feature was added in mixxxdj/mixxx#1798
This allows to filter all Tracks without a genre like that genre:""
It works for all text fields, including crate:""
https://bugs.launchpad.net/mixxx/+bug/1788086