Influx Connector #2397

Closed
wants to merge 20 commits into from

@williame commented Jan 3, 2020

InfluxDB is a NoSQL time-series database: https://www.influxdata.com/

@cla-bot cla-bot bot added the cla-signed label Jan 3, 2020

@ebyhr (Member) left a comment

I left only basic comments. In case you haven't read it yet, this is the code style guideline for this project: https://github.com/prestosql/presto#code-style

@ebyhr (Member) commented Jan 8, 2020

Before starting more detailed reviews, let's discuss whether we really want to use okhttp instead of the existing library.

@williame (Author) commented Jan 8, 2020

> Before starting more detailed reviews, let's talk if we really want to use okhttp instead of the existing library

One issue is dependency versions. Adding this to the pom:

    <dependency>
        <groupId>org.influxdb</groupId>
        <artifactId>influxdb-java</artifactId>
        <version>2.17</version>
    </dependency>

will cause okhttp3 version conflicts with the rest of Presto. How would we best solve that?

The other issue is how to build the query string. At the moment, the connector in this PR builds it manually with a custom InfluxQL class. Obviously, it would smell less if we could use the QueryBuilder offered by the influx java lib instead.

However, the QueryBuilder (https://github.com/influxdata/influxdb-java/blob/master/QUERY_BUILDER.md) doesn't support column names that need quoting. Deep in the implementation, the columns are appended to a StringBuilder unescaped (https://github.com/influxdata/influxdb-java/blob/master/src/main/java/org/influxdb/querybuilder/Appender.java#L76). The API doc examples rely on this behavior, and abuse it, so quoting support probably can't ever be added.

Another issue is the quoting of parameters. Newer versions of the influx java library do support ? to denote a parameter, rather like JDBC's PreparedStatement. However, the library always passes these parameters to the InfluxDB server, even though early versions of InfluxDB didn't support this feature.

So, if we sort out the dependency problem, it looks like we'd still be building our own InfluxQL strings manually rather than using a tried and tested QueryBuilder.
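
For illustration, manual quoting of the kind discussed above might look like the following minimal sketch. The class name `InfluxQlQuoting`, its method names, and the abbreviated keyword set are assumptions made for this example; they are not the PR's actual InfluxQL class. The sketch assumes InfluxQL's convention of double-quoted identifiers and single-quoted string literals, each with backslash escapes.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper, not the PR's InfluxQL class
final class InfluxQlQuoting
{
    // Illustrative subset only; the real InfluxQL keyword list is longer
    private static final Set<String> SOME_KEYWORDS = Set.of("FROM", "SELECT", "WHERE", "GROUP", "BY", "TO");

    // Identifiers made only of ASCII letters, digits and underscores,
    // and not starting with a digit, can be left unquoted
    private static final Pattern SAFE_IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private InfluxQlQuoting() {}

    static String quoteIdentifier(String name)
    {
        boolean safe = SAFE_IDENTIFIER.matcher(name).matches()
                && !SOME_KEYWORDS.contains(name.toUpperCase(Locale.ROOT));
        if (safe) {
            return name;
        }
        // Escape backslashes first, then double quotes, and wrap in double quotes
        return "\"" + name.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static String quoteString(String value)
    {
        // String literals use single quotes; escape backslashes and single quotes
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }
}
```

With this sketch, `quoteIdentifier("frOm") + " = " + quoteString("to")` yields `"frOm" = 'to'`, while a plain name such as `cpu_load` is passed through unquoted.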

@ebyhr (Member) left a comment

We may want to switch to the official library later, but I left initial comments on the HTTP client.
I suppose influxdb, not influx, is the expected name, as with mongodb. This is not a huge problem, but it will affect the config properties.

presto-influx/pom.xml (outdated)
private final OkHttpClient httpClient;
private final String baseUrl;
// the various metadata are cached for a configurable number of milliseconds so we don't hammer the server
private final CachedMetaData<Map<String, String>> retentionPolicies; // schema name (lower-case) -> retention policy (case-sensitive)

We generally use Guava for caching. You can see the existing code in MongoSession.

While I think it is worth caching table definitions, I'm not sure we should cache schema and table names. Are those operations slow in InfluxDB?

// Influx tracks the tags in each measurement, but not which retention-policy they are used in
private Map<String, InfluxColumn> getTags(String tableName)
{
return tagKeys.computeIfAbsent(tableName,

This method and getFields are too deeply nested. Please try to flatten them.


public List<InfluxColumn> getColumns(String schemaName, String tableName)
{
List<InfluxColumn> columns = InfluxTpchTestSupport.getColumns(config.getDatabase(), schemaName, tableName);

@ebyhr (Member) commented Jan 8, 2020

My understanding is that this is a temporary hack to run the TPCH tests. Please correct me if I'm wrong.

The Cassandra connector has the same behavior: the column order is arranged automatically, so we added extra information to hold the expected order. Could you try adding the same feature? You can refer to the CassandraSession class.

if (series == null) {
return Collections.emptyMap();
}
InfluxError.GENERAL.check(series.getNodeType().equals(JsonNodeType.ARRAY), "expecting an array, not " + series, query);

Please avoid using the InfluxError check and fail methods. We can use the checkArgument method, the PrestoException class, and so on.

public InfluxQueryRunner()
throws Exception
{
dockerContainer = new InfluxDBContainer()

Please extract dockerContainer from this class. You can refer to TestingMySqlServer class.

ConnectorFactory factory = getOnlyElement(plugin.getConnectorFactories());
assertInstanceOf(factory, InfluxConnectorFactory.class);

Connector c = factory.create(

Please avoid abbreviations.

Suggested change:
- Connector c = factory.create(
+ Connector connector = factory.create(


// from is a reserved word so must be quoted
test.addIdentifier("frOm").append(" = ").add("to");
assertEquals(test.toString(), "\"frOm\" = 'to'");

Could you add more test cases?


public class InfluxClient
{
final Logger logger;

Please make it private static final

``influx.database=`` The database name must be specified. Each instance of the connector
can only connect to a single database on a server
``influx.username=`` Optional
``influx.password=`` Option

Fix typo

@ofekby commented Feb 28, 2020

Hi, is this pull request still ongoing?

@RonC-LL commented Apr 17, 2020

I subscribed to this pull request, but I just wanted to echo @ofekby's enthusiasm and interest in the final product. Thank you for your work. (I'd maybe try building what you have, but I'd rather wait until it's official.)

@RonC-LL commented May 7, 2020

FWIW, I built an image from Will's source - currently at Presto version 330-SNAPSHOT - and it works beautifully.

I hope this can get past the finish line and merged into the official image. It's a great connector.

UPDATE (13-May-2020): In my development environment, I forked a version of the PR, merged it with Presto 333, added some logging (the query & url strings), and added a new influx.check-collision boolean config that tells the connector whether or not it should throw an exception if it detects an Influx field name "collision" (field names are the same when lower-cased).
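
The collision check described in the update could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `FieldNameCollisions`, both method names, and the choice of `IllegalStateException` are hypothetical, since the fork's actual implementation is not shown in this thread. The `checkCollision` flag stands in for the `influx.check-collision` config value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not taken from the PR or the fork
final class FieldNameCollisions
{
    private FieldNameCollisions() {}

    // Groups field names by their lower-cased form and keeps only
    // the groups that contain more than one original name
    static Map<String, List<String>> findCollisions(List<String> fieldNames)
    {
        Map<String, List<String>> byLowerCase = new TreeMap<>();
        for (String name : fieldNames) {
            byLowerCase.computeIfAbsent(name.toLowerCase(Locale.ENGLISH), key -> new ArrayList<>()).add(name);
        }
        byLowerCase.values().removeIf(names -> names.size() < 2);
        return byLowerCase;
    }

    // Throws only when the check is enabled and at least one collision exists
    static void checkNoCollisions(List<String> fieldNames, boolean checkCollision)
    {
        Map<String, List<String>> collisions = findCollisions(fieldNames);
        if (checkCollision && !collisions.isEmpty()) {
            throw new IllegalStateException("Field names collide when lower-cased: " + collisions);
        }
    }
}
```

For the field names `Temp`, `temp` and `humidity`, `findCollisions` reports a single group under the key `temp`; with the flag disabled, `checkNoCollisions` returns without throwing.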

@hrishy commented Jun 28, 2020

Hi

Is this connector for InfluxDB available for download?

@deepak-k-meesho

Hi

When will this connector be officially available?

@StalkerOne

@williame Hello, does the connector support InfluxDB now?

@bitsondatadev (Member)

👋🏻 This PR looks really interesting. It looks like there's a lot of demand for it. @williame, are you still interested in working on this?

@cpard, any interest in looking into this feature?

Unfortunately, it looks like this is inactive and nobody is working on it. We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.

@mosabua mosabua closed this Nov 4, 2022