RecordHandler
To execute a federated query, Athena needs a way to read row data from your source. Athena calls your connector to read each of the 'splits' generated by your MetadataHandler and expects zero or more rows to be returned for each one.
The Athena Query Federation SDK provides RecordHandler, an abstract class that you can extend to implement this functionality.
The only required method when extending RecordHandler is readWithConstraint(...), which is called for each split. If you want more control over how your connector uses Apache Arrow to send row data to Athena, you can instead override doReadRecords(...). We do not recommend overriding doReadRecords(...) unless you have experience with Apache Arrow or need to use it directly for performance reasons.
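The relationship between the two methods can be pictured with a minimal, self-contained sketch. It is not the SDK's code: SketchSplit, SketchRowWriter, SketchRecordHandler, and RangeRecordHandler are hypothetical stand-ins, the signatures are simplified, and Apache Arrow is replaced by a plain list of maps. It only illustrates the idea that an outer entry point handles buffering for each split while the subclass supplies the rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one unit of work produced by a MetadataHandler.
final class SketchSplit {
    final Map<String, String> properties;

    SketchSplit(Map<String, String> properties) {
        this.properties = properties;
    }
}

// Hypothetical stand-in for whatever collects rows on their way to Athena.
interface SketchRowWriter {
    void writeRow(Map<String, Object> row);
}

// Simplified model of the abstract class; NOT the SDK's RecordHandler.
abstract class SketchRecordHandler {
    // Called once per split. In this model it owns the row buffer and
    // delegates row production to the subclass.
    List<Map<String, Object>> doReadRecords(SketchSplit split) {
        List<Map<String, Object>> rows = new ArrayList<>();
        readWithConstraint(rows::add, split);
        return rows;
    }

    // The only method a subclass must implement: emit zero or more rows.
    protected abstract void readWithConstraint(SketchRowWriter writer, SketchSplit split);
}

// Example connector whose splits describe a half-open integer range.
final class RangeRecordHandler extends SketchRecordHandler {
    @Override
    protected void readWithConstraint(SketchRowWriter writer, SketchSplit split) {
        int start = Integer.parseInt(split.properties.get("start"));
        int end = Integer.parseInt(split.properties.get("end"));
        for (int i = start; i < end; i++) {
            writer.writeRow(Map.<String, Object>of("id", i));
        }
    }
}
```

In this sketch, a connector author only writes RangeRecordHandler. Overriding doReadRecords(...) would mean taking over the buffering step as well, which in the real SDK is where Apache Arrow comes in.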
In most cases you will deploy a MetadataHandler and RecordHandler together in the same Lambda function by using a CompositeHandler. There are, however, some cases where you may want to deploy them independently. Athena supports this, and it is most often done for one of the following reasons:
- You have a centralized source of metadata for all your data sources (e.g. a single source of truth) that is in its own VPC.
- Your data sources are in separate VPCs that do not contain the metadata source.
- Your metadata operations and data reads require different scale or different languages in their Lambda functions.
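For the combined deployment described above, the idea of a composite handler can be sketched as a small dispatcher that routes each incoming request to the metadata or record side. This is a simplified illustration under stated assumptions, not the SDK's CompositeHandler: SketchRequestKind, SketchRequest, and SketchCompositeHandler are hypothetical names, and requests and responses are reduced to strings.

```java
import java.util.function.Function;

// Hypothetical request kinds; the real SDK defines its own request types.
enum SketchRequestKind { METADATA, RECORDS }

// Hypothetical request: a kind plus an opaque payload.
final class SketchRequest {
    final SketchRequestKind kind;
    final String payload;

    SketchRequest(SketchRequestKind kind, String payload) {
        this.kind = kind;
        this.payload = payload;
    }
}

// Routes each request to the handler responsible for it, so that both
// handlers can live behind a single entry point.
final class SketchCompositeHandler {
    private final Function<String, String> metadataHandler;
    private final Function<String, String> recordHandler;

    SketchCompositeHandler(Function<String, String> metadataHandler,
                           Function<String, String> recordHandler) {
        this.metadataHandler = metadataHandler;
        this.recordHandler = recordHandler;
    }

    String handle(SketchRequest request) {
        switch (request.kind) {
            case METADATA:
                return metadataHandler.apply(request.payload);
            case RECORDS:
                return recordHandler.apply(request.payload);
            default:
                throw new IllegalArgumentException("Unknown request kind: " + request.kind);
        }
    }
}
```

Deploying the handlers independently corresponds to dropping the dispatcher and exposing each handler as its own entry point, which is what allows them to sit in different VPCs or scale separately.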