Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function resolution #12743

Merged
merged 11 commits into from
Sep 4, 2019
Merged

Function resolution #12743

merged 11 commits into from
Sep 4, 2019

Conversation

rongrong
Copy link
Contributor

No description provided.

@rongrong rongrong force-pushed the function-resolution branch 11 times, most recently from e0854ad to b455821 Compare May 2, 2019 23:45
@rongrong rongrong force-pushed the function-resolution branch 2 times, most recently from 7f96068 to 2d8b8a9 Compare July 23, 2019 17:50
@rongrong rongrong force-pushed the function-resolution branch from 2d8b8a9 to 13b5fd8 Compare August 7, 2019 00:21
@rongrong rongrong force-pushed the function-resolution branch 12 times, most recently from 44f40b4 to 9e9667a Compare August 19, 2019 22:24
@rongrong rongrong requested review from highker and arhimondr August 19, 2019 22:48
@caithagoras
Copy link
Contributor

I'll take a look today/tomorrow.

@rongrong rongrong requested a review from hellium01 August 23, 2019 02:05
Copy link
Contributor

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything up to "Move FunctionNamespaceManager to SPI" LGTM

/**
* Ideally function namespaces should support transactions like connectors do, and getCandidates should be transaction-aware.
* queryId serves as a transaction ID before proper support for transaction is introduced.
* TODO Support transaction in function namespaces
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might need to add @Experimental for this interface

* for a given queryId, getFunctionHandle with the same queryId should return a valid FunctionHandle, even if the function
* is deleted. Multiple calls of this function with the same parameters should return the same FunctionHandle.
* queryId serves as a transaction ID before proper support for transaction is introduced.
* TODO Support transaction in function namespaces
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might need to add @Experimental for this interface

.map(TypeSignatureProvider::getTypeSignature)
.collect(toImmutableList()));
}
else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else is redundant

private final List<String> originalParts;

@JsonCreator
public static FullyQualifiedName of(String dottedName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method has different assumption for the parameter from other overloaded of method. Should we give it a different name to avoid confusion?

It is also less readable this way since both this method and the next method have the same name while both can take one String parameter as their inputs.

return of(asList(parts));
}

public static FullyQualifiedName of(String... parts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Supplying 2 parts always throw. Instead we can do

public static FullyQualifiedName of(String first, String second, String third, String... rest)

Similar to https://github.com/prestodb/presto/blob/master/presto-parser/src/main/java/com/facebook/presto/sql/tree/QualifiedName.java#L35

return new FullyQualifiedName(originalParts, parts);
}

public static FullyQualifiedName of(FullyQualifiedName.Prefix prefix, String name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be only use case of Prefix. Should we add a similar constraints to Prefix to only allow 2 or more parts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I added that in later commits. But here since the constructor is private I don't see the need.

@rongrong rongrong force-pushed the function-resolution branch 2 times, most recently from 352533c to d8b28d0 Compare August 27, 2019 02:25
public StaticFunctionNamespaceStore(FunctionManager functionManager, StaticFunctionNamespaceStoreConfig config)
{
this.functionManager = functionManager;
this.configDir = config.getFunctionNamespaceConfigurationDir();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this allowed to be null? If so, use Optional instead.


private static List<File> listFiles(File dir)
{
if (dir != null && dir.isDirectory()) {
Copy link
Contributor

@caithagoras caithagoras Aug 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: return empty list early:

if (...) {
   return ImmutableList.of();
}
...

import static com.facebook.presto.util.PropertiesUtil.loadProperties;
import static com.google.common.base.Preconditions.checkState;

public class StaticFunctionNamespaceStore
Copy link
Contributor

@caithagoras caithagoras Aug 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is called StaticFunctionNamespaceStore so I expect it to be relevant only to the built-in functions and StiatcFunctionNamespaceManager, but instead, this class serves as an entry point to load all the function namespace managers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of the StaticFunctionNamespaceStore is more or less a copy of StaticCatalogStore. Here "static" is more referring to the available function namespaces and how they are served are configured statically. I'm open to name suggestions. Maybe we can change StaticFunctionNamespace to BuiltinFunctionNamespace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. I think that would avoid confusion from overloading the meaning of static

@caithagoras
Copy link
Contributor

lgtm overall, but since I'm not yet familiar with the code, I'll let @highker approve.

@highker highker self-requested a review August 28, 2019 06:37
Copy link
Contributor

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One high-level comment regarding config

createTestingViewCodec(),
blockEncodingSerde,
sessionPropertyManager,
schemaPropertyManager,
tablePropertyManager,
columnPropertyManager,
analyzePropertyManager,
transactionManager);
transactionManager,
new FunctionManager(typeManager, blockEncodingSerde, featuresConfig));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FunctionManager can be injected right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constructor is used for testing purposes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you annotate the constructor of MetadataManager with for test only as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called in LocalQueryRunner which is used for testing but is not in the /test directory. Also it's not about visibility, so not sure whether @VisibleForTesting is appropriate. Is there other annotations for testing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be fine. XXXQueryRunner is for testing only (as a convention) anyway....

private static final Splitter SPLITTER = Splitter.on(',').trimResults().omitEmptyStrings();

private final FunctionManager functionManager;
private final File configDir;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need a directory? We have been gradually ditching file-based configs. The only place I remember we still use is the catalog directory that is used for installing connector plugins and the other file-based configs are legacy logic. If the config is just some attributes, shall we use the normal configuration than a file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm basically copying the configuration of catalogs here because the concepts kinda overlap so I think it's nice to keep them the same way. Happy to take other alternatives if they makes more sense. What do you mean by "normal configuration"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should understand more about how function namespaces will be configured. Will it be something similar to connectors, where each connector has its own config? Or it will be something simple like log.properties below. The former may need a file-based config but the latter can be just put into a new class like LogConfig.java.

Screen Shot 2019-08-30 at 1 54 06 PM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I see this, each config would be configuring a different function namespace manager, which can manage 1 or many function namespaces. Some function namespaces are just logical separations but physically could be managed the same way. For example, you might want to have separate function namespaces for different teams, but they are all SQL functions and can be managed using the same manager with the same configuration. On the other hand, other function namespaces can be physically different. For example, you may want to support MySQL functions as a different function namespace, so people can directly use mysql.function.foo in their queries. This kind of function namespaces would have a 1:1 mapping to the connector they serve.

Let me know whether you think this is reasonable. We can discuss this further.

@@ -400,7 +400,7 @@ protected void setup(Binder binder)
// connector plan optimizer manager
binder.bind(ConnectorPlanOptimizerManager.class).in(Scopes.SINGLETON);

// index manager
// index managerTestVariableExtractor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like an accidental change

@rongrong rongrong force-pushed the function-resolution branch from d8b28d0 to ced4deb Compare August 30, 2019 19:21
Copy link
Contributor

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rongrong rongrong force-pushed the function-resolution branch from ced4deb to 7792381 Compare September 3, 2019 19:15
@rongrong
Copy link
Contributor Author

rongrong commented Sep 3, 2019

@highker I added an additional commit to rename StaticFunctionNamespaceManager to BuiltInFunctionNamespaceManager to avoid confusion between function namespace manager and function namespace config. Do you want to take another look?

@rongrong rongrong force-pushed the function-resolution branch from 7792381 to df42c75 Compare September 4, 2019 00:08
StaticFunctionNamespace will be handled separately. It won't need to be
dynamically registered nor will it be used to support dynamic namespaces
so the factory is not necessary.
Break the resolveFunction API into getCandidates and getFunctionHandle
so the function resolution logic can be managed consistently within the
query engine.
We enforce explicit naming for dynamic function namespaces. All
unqualified function names will only be resolved against the built-in
static function namespace. While it is possible to define an ordering
(through SQL path or other means) and convention (best match / first
match), in reality when complicated namespaces are involved such
implicit resolution might hide errors and cause confusion.
@rongrong rongrong force-pushed the function-resolution branch from df42c75 to 3379b4f Compare September 4, 2019 18:34
@rongrong rongrong merged commit 3379b4f into prestodb:master Sep 4, 2019
@rongrong rongrong deleted the function-resolution branch September 4, 2019 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants