provide "light" version of this package #1142

Closed
tswast opened this issue Feb 17, 2022 · 4 comments
@tswast
Contributor

tswast commented Feb 17, 2022

In certain use cases (limited compute resources), it would be useful to have just the REST API pieces without any protobuf, grpcio, pyarrow, pandas or other heavier dependencies.

CC @parthea
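
For anyone checking whether their environment carries the heavier dependencies listed above, here is a minimal sketch using only the standard library. The helper names (`is_importable`, `installed_heavy_modules`) are hypothetical and not part of google-cloud-bigquery; the module list simply mirrors the packages named in this issue.

```python
import importlib.util

# Import names of the heavier dependencies named in this issue. The grpcio
# distribution is imported as "grpc" and protobuf as "google.protobuf".
HEAVY_MODULES = ("google.protobuf", "grpc", "pyarrow", "pandas")


def is_importable(module_name):
    """Return True if module_name can be located on the import path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError (an ImportError) when a parent
        # package of a dotted name is missing.
        return False


def installed_heavy_modules(module_names=HEAVY_MODULES):
    """Return the subset of module_names present in this environment."""
    return [name for name in module_names if is_importable(name)]


if __name__ == "__main__":
    print(installed_heavy_modules())
```

An empty list would suggest the environment is already free of these packages; it says nothing about whether a given client version will work without them.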

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Feb 17, 2022
@tswast tswast added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Feb 17, 2022
@majorgilles

I concur with the post. We were using this as a replacement for the now-bloated google-api-python-client, and we will be forced to pin it to the latest non-problematic version because we run it in an AWS Lambda environment, which has very specific restrictions on deployment size.

@benthorner

Would love to see a "light" version of this package.

v3 won't install on a Raspberry Pi because of the pyarrow dependency: there seems to be no pre-compiled version for that architecture, so the install tries (and fails) to compile it from source.

I'm only using the client to insert some rows in a table, so I really don't need all the extra bits. For now I've got it working by pinning to the last v2 version, i.e. pip install "google-cloud-bigquery<3.0.0dev" (the quotes stop the shell from treating < as a redirect).
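
To illustrate what "just the REST API pieces" could look like for the insert-rows case, here is a rough sketch that builds a request for the BigQuery `tabledata.insertAll` REST method using only the standard library. This is not how google-cloud-bigquery implements it; `build_insert_all_request` and its parameters are hypothetical names, and obtaining the OAuth access token is assumed to happen elsewhere.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://bigquery.googleapis.com/bigquery/v2"


def build_insert_all_request(project, dataset, table, rows, access_token):
    """Build (but do not send) a tabledata.insertAll request.

    rows is a list of dicts mapping column names to JSON-compatible values.
    access_token is assumed to be a valid OAuth 2.0 bearer token.
    """
    url = (
        f"{BASE_URL}/projects/{project}/datasets/{dataset}"
        f"/tables/{table}/insertAll"
    )
    body = {
        # A per-row insertId lets BigQuery de-duplicate retried rows
        # on a best-effort basis.
        "rows": [{"insertId": str(uuid.uuid4()), "json": row} for row in rows],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. A real caller would also want to inspect the JSON response for an `insertErrors` field, since per-row failures can be reported there rather than as an HTTP error.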

benthorner added a commit to benthorner/snsary that referenced this issue May 4, 2022
This follows a similar pattern to other output tests by mocking the
end-to-end API calls. While this was a bit harder for the BigQuery
client, in the absence of functional tests the advantages are:

- It clarifies what the error behaviours are.
- It thoroughly verifies the calls are valid.

The documentation is a bit fragmented:

- The old Grafana data source has more extensive guidance on how to
set up the service user for querying data [^1].

- I can't find a clear reference to the environment variable for the
credentials file path - just the error message.

Note that v3 of the Google Cloud BigQuery client won't install on a
Raspberry Pi right now, so we need to pin it to a lower version [^2].

[^1]: https://grafana.com/grafana/plugins/doitintl-bigquery-datasource/
[^2]: googleapis/python-bigquery#1142 (comment)
benthorner added a commit to benthorner/snsary that referenced this issue May 5, 2022
benthorner added a commit to benthorner/snsary that referenced this issue May 5, 2022
benthorner added a commit to benthorner/snsary that referenced this issue May 6, 2022
@chalmerlowe
Collaborator

Gonna close this item.

Rationale:

  • By comparison, there are a number of open issues that currently have a higher priority/severity
  • BigQuery itself consumes a lot of memory and is often not very well suited for IoT devices in general
  • As we continue to migrate BigQuery toward a gRPC world, the optional dependencies will become less and less optional
  • IoT is often about streaming data and a commonly recommended use case is to stream to a central application that then pushes data to BigQuery versus streaming directly from dozens/hundreds/thousands of small devices to BigQuery
  • Having auth secrets stored directly on each IoT device is probably not the best idea

@tswast
Contributor Author

tswast commented Nov 22, 2023

I believe this is fixed by #1721
