diff --git a/README.md b/README.md index 753b8378..fb06795b 100644 --- a/README.md +++ b/README.md @@ -81,40 +81,48 @@ Visualization tools are available to help both programmers and non-programmers s
#### Basic queries -The `afscgap.query` method is the main entry point into Python-based utilization. Calls can be written manually or generated in the [visual analytics tool](https://app.pyafscgap.org). For example, this requests all records of Pasiphaea pacifica in 2021 from the Gulf of Alaska to get the median bottom temperature when they were observed: +The `afscgap.Query` object is the main entry point into Python-based utilization. Calls can be written manually or generated in the [visual analytics tool](https://app.pyafscgap.org). For example, this requests all records of Pasiphaea pacifica in 2021 from the Gulf of Alaska to get the median bottom temperature when they were observed: ``` import statistics import afscgap -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Pasiphaea pacifica' -) +# Build query +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') +results = query.execute() + +# Get temperatures in Celsius +temperatures = [record.get_bottom_temperature(units='c') for record in results] -temperatures = [record.get_bottom_temperature_c() for record in results] +# Take the median print(statistics.median(temperatures)) ``` -Note that `afscgap.query` returns a [Cursor](https://pyafscgap.org/devdocs/afscgap/cursor.html#Cursor). One can iterate over this `Cursor` to access [Record](https://pyafscgap.org/devdocs/afscgap/model.html#Record) objects. You can do this with list comprehensions, maps, etc or with a good old for loop like in this example which gets a histogram of haul temperatures: +Note that `afscgap.Query.execute` returns a [Cursor](https://pyafscgap.org/devdocs/afscgap/cursor.html#Cursor). One can iterate over this `Cursor` to access [Record](https://pyafscgap.org/devdocs/afscgap/model.html#Record) objects. 
You can do this with list comprehensions, maps, etc or with a good old for loop like in this example which gets a histogram of haul temperatures: ``` +# Mapping from temperature in Celsius to count count_by_temperature_c = {} -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Pasiphaea pacifica' -) +# Build query +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') +results = query.execute() +# Iterate through results and count for record in results: - temp = record.get_bottom_temperature_c() + temp = record.get_bottom_temperature(units='c') temp_rounded = round(temp) count = count_by_temperature_c.get(temp_rounded, 0) + 1 count_by_temperature_c[temp_rounded] = count +# Print the result print(count_by_temperature_c) ``` @@ -125,24 +133,24 @@ See [data structure section](#data-structure). Using an iterator will have the l #### Enable absence data One of the major limitations of the official API is that it only provides presence data. However, this library can optionally infer absence or "zero catch" records using a separate static file produced by NOAA AFSC GAP. The [algorithm and details for absence inference](#absence-vs-presence-data) is further discussed below. -Absence data / "zero catch" records inference can be turned on by setting `presence_only` to false in `query`. To demonstrate, this example finds total area swept and total weight for Gadus macrocephalus from the Aleutian Islands in 2021: +Absence data / "zero catch" records inference can be turned on by passing `False` to `set_presence_only` in `Query`. 
To demonstrate, this example finds total area swept and total weight for Gadus macrocephalus from the Aleutian Islands in 2021: ``` import afscgap -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Gadus macrocephalus', - presence_only=False -) +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Gadus macrocephalus') +query.set_presence_only(False) +results = query.execute() total_area = 0 total_weight = 0 for record in results: - total_area += record.get_area_swept_ha() - total_weight += record.get_weight() + total_area += record.get_area_swept(units='ha') + total_weight += record.get_weight(units='kg') template = '%.2f kg / hectare swept (%.1f kg, %.1f hectares' weight_per_area = total_weight / total_area @@ -155,26 +163,82 @@ For more [details on the zero catch record feature](#absence-vs-presence-data),
+#### Chaining +It is possible to use the Query object for method chaining. + +``` +import statistics + +import afscgap + +# Build query +results = afscgap.Query() \ + .filter_year(eq=2021) \ + .filter_srvy(eq='GOA') \ + .filter_scientific_name(eq='Pasiphaea pacifica') \ + .execute() + +# Get temperatures in Celsius +temperatures = [record.get_bottom_temperature(units='c') for record in results] + +# Take the median +print(statistics.median(temperatures)) +``` + +Each filter and set method on Query returns the same query object. + +
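Chaining of this kind usually works because each builder method returns `self`. The following is a minimal, self-contained sketch of that pattern. `SketchQuery` and its `describe` method are hypothetical stand-ins for illustration and are not the `afscgap.Query` implementation.

```python
class SketchQuery:
    """Hypothetical stand-in illustrating chaining; not afscgap.Query."""

    def __init__(self):
        # Field name -> filter value
        self._filters = {}

    def filter_year(self, eq=None):
        self._filters['year'] = eq
        return self  # Returning self is what enables chaining

    def filter_srvy(self, eq=None):
        self._filters['srvy'] = eq
        return self

    def describe(self):
        # Return a copy so callers cannot mutate internal state
        return dict(self._filters)


query = SketchQuery()
chained = query.filter_year(eq=2021).filter_srvy(eq='GOA')

# The chained expression evaluates to the original object
print(chained is query)
print(chained.describe())
```

Because every call hands back the same object, the chained form and the statement-per-line form build the same query.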
+ +#### Builder operations +Note that Query is a builder. So, it may be used to execute a search and then execute another search with slightly modified parameters: + +``` +import statistics + +import afscgap + +# Build query +query = afscgap.Query() +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') + +# Get temperatures in Celsius for 2021 +query.filter_year(eq=2021) +results = query.execute() +temperatures = [record.get_bottom_temperature(units='c') for record in results] +print(statistics.median(temperatures)) + +# Get temperatures in Celsius for 2019 +query.filter_year(eq=2019) +results = query.execute() +temperatures = [record.get_bottom_temperature(units='c') for record in results] +print(statistics.median(temperatures)) +``` + +When calling filter, all prior filters on the query object for that field are overwritten. + +
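The overwrite rule described above can be pictured as one stored value per field. The following sketch assumes a hypothetical `SketchBuilder` whose `execute` only snapshots the current parameters; it is an illustration of the semantics and not how `afscgap.Query` is implemented. How `eq` interacts with `min_val` / `max_val` here is also an assumption.

```python
class SketchBuilder:
    """Hypothetical builder showing overwrite-per-field semantics only."""

    def __init__(self):
        # One entry per field, so a later filter replaces an earlier one
        self._filters = {}

    def filter_year(self, eq=None, min_val=None, max_val=None):
        # Assumption for this sketch: eq wins, otherwise store a range
        if eq is not None:
            self._filters['year'] = eq
        else:
            self._filters['year'] = (min_val, max_val)
        return self

    def filter_srvy(self, eq=None):
        self._filters['srvy'] = eq
        return self

    def execute(self):
        # Snapshot the parameters; a real query would call the API here
        return dict(self._filters)


builder = SketchBuilder()
builder.filter_srvy(eq='GOA')

# Reuse the same builder for two searches, changing only the year
first = builder.filter_year(eq=2021).execute()
second = builder.filter_year(eq=2019).execute()

print(first)
print(second)
```

Only the `year` entry changes between the two searches; the `srvy` filter set earlier carries over.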
+ #### Serialization Users may request a dictionary representation: ``` -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Pasiphaea pacifica' -) +import afscgap + +# Create a query +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') +results = query.execute() # Get dictionary from individual record for record in results: dict_representation = record.to_dict() print(dict_representation['bottom_temperature_c']) -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Pasiphaea pacifica' -) +# Execute again +results = query.execute() # Get dictionary for all records results_dicts = results.to_dicts() @@ -195,11 +259,11 @@ import pandas import afscgap -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Pasiphaea pacifica' -) +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') +results = query.execute() pandas.DataFrame(results.to_dicts()) ``` @@ -212,31 +276,55 @@ Note that Pandas is not required to use this library. You can provide range queries which translate to ORDS or Python emulated filters. 
For example, the following requests before and including 2019: ``` -results = afscgap.query( - year=(None, 2019), - srvy='GOA', - scientific_name='Pasiphaea pacifica' -) +import afscgap + +# Build query +query = afscgap.Query() +query.filter_year(max_val=2019) # Note max_val +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') +results = query.execute() + +# Sum weight +weights = map(lambda x: x.get_weight(units='kg'), results) +total_weight = sum(weights) +print(total_weight) ``` The following requests data after and including 2019: ``` -results = afscgap.query( - year=(2019, None), - srvy='GOA', - scientific_name='Pasiphaea pacifica' -) +import afscgap + +# Build query +query = afscgap.Query() +query.filter_year(min_val=2019) # Note min_val +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') +results = query.execute() + +# Sum weight +weights = map(lambda x: x.get_weight(units='kg'), results) +total_weight = sum(weights) +print(total_weight) ``` Finally, the following requests data between 2015 and 2019 (includes 2015 and 2019): ``` -results = afscgap.query( - year=(2015, 2019), - srvy='GOA', - scientific_name='Pasiphaea pacifica' -) +import afscgap + +# Build query +query = afscgap.Query() +query.filter_year(min_val=2015, max_val=2019) # Note min/max_val +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') +results = query.execute() + +# Sum weight +weights = map(lambda x: x.get_weight(units='kg'), results) +total_weight = sum(weights) +print(total_weight) ``` For more advanced filters, please see manual filtering below. @@ -249,12 +337,14 @@ Users may provide advanced queries using Oracle's REST API query parameters.
For ``` import afscgap -results = afscgap.query( - year=2021, - latitude_dd={'$between': [56, 57]}, - longitude_dd={'$between': [-161, -160]} -) +# Query with ORDS syntax +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_latitude({'$between': [56, 57]}) +query.filter_longitude({'$between': [-161, -160]}) +results = query.execute() +# Summarize count_by_common_name = {} for record in results: @@ -262,6 +352,9 @@ for record in results: new_count = record.get_count() count = count_by_common_name.get(common_name, 0) + new_count count_by_common_name[common_name] = count + +# Print +print(count_by_common_name['walleye pollock']) ``` For more info about the options available, consider the [Oracle docs](https://docs.oracle.com/en/database/oracle/oracle-rest-data-services/19.2/aelig/developing-REST-applications.html#GUID-F0A4D4F9-443B-4EB9-A1D3-1CDE0A8BAFF2) or a helpful unaffiliated [getting started tutorial from Jeff Smith](https://www.thatjeffsmith.com/archive/2019/09/some-query-filtering-examples-in-ords/). @@ -272,29 +365,34 @@ For more info about the options available, consider the [Oracle docs](https://do By default, the library will iterate through all results and handle pagination behind the scenes. 
However, one can also request an individual page: ``` -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Pasiphaea pacifica' -) +import afscgap + +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Gadus macrocephalus') +results = query.execute() -results_for_page = results.get_page(offset=20, limit=100) -print(len(results_for_page)) # Will print 32 (results contains 52 records) +results_for_page = results.get_page(offset=20, limit=53) +print(len(results_for_page)) ``` Client code can also change the pagination behavior used when iterating: ``` -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Pasiphaea pacifica', - start_offset=10, - limit=200 -) +import afscgap + +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Gadus macrocephalus') +query.set_start_offset(10) +query.set_limit(200) +query.set_filter_incomplete(True) +results = query.execute() for record in results: - print(record.get_bottom_temperature_c()) + print(record.get_bottom_temperature(units='c')) ``` Note that records are only requested once during iteration and only after the prior page has been returned via the iterator ("lazy" loading). @@ -357,45 +455,45 @@ For more information on the schema, see the [metadata](https://github.com/afsc-g These fields are available as getters on `afscgap.model.Record` (`result.get_srvy()`) and may be used as optional filters on the query `asfcgagp.query(srvy='GOA')`. Fields which are `Optional` have two getters. First, the "regular" getter (`result.get_count()`) will assert that the field is not None before returning a non-optional. The second "maybe" getter (`result.get_count_maybe()`) will return None if the value was not provided or could not be parsed. 
-| **Filter keyword** | **Regular Getter** | **Maybe Getter** | -|-----------------------|--------------------------------------|------------------------------------------------------| -| year | get_year() -> float | | -| srvy | get_srvy() -> str | | -| survey | get_survey() -> str | | -| survey_id | get_survey_id() -> float | | -| cruise | get_cruise() -> float | | -| haul | get_haul() -> float | | -| stratum | get_stratum() -> float | | -| station | get_station() -> str | | -| vessel_name | get_vessel_name() -> str | | -| vessel_id | get_vessel_id() -> float | | -| date_time | get_date_time() -> str | | -| latitude_dd | get_latitude_dd() -> float | | -| longitude_dd | get_longitude_dd() -> float | | -| species_code | get_species_code() -> float | | -| common_name | get_common_name() -> str | | -| scientific_name | get_scientific_name() -> str | | -| taxon_confidence | get_taxon_confidence() -> str | | -| cpue_kgha | get_cpue_kgha() -> float | get_cpue_kgha_maybe() -> Optional[float] | -| cpue_kgkm2 | get_cpue_kgkm2() -> float | get_cpue_kgkm2_maybe() -> Optional[float] | -| cpue_kg1000km2 | get_cpue_kg1000km2() -> float | get_cpue_kg1000km2_maybe() -> Optional[float] | -| cpue_noha | get_cpue_noha() -> float | get_cpue_noha_maybe() -> Optional[float] | -| cpue_nokm2 | get_cpue_nokm2() -> float | get_cpue_nokm2_maybe() -> Optional[float] | -| cpue_no1000km2 | get_cpue_no1000km2() -> float | get_cpue_no1000km2_maybe() -> Optional[float] | -| weight_kg | get_weight_kg() -> float | get_weight_kg_maybe() -> Optional[float] | -| count | get_count() -> float | get_count_maybe() -> Optional[float] | -| bottom_temperature_c | get_bottom_temperature_c() -> float | get_bottom_temperature_c_maybe() -> Optional[float] | -| surface_temperature_c | get_surface_temperature_c() -> float | get_surface_temperature_c_maybe() -> Optional[float] | -| depth_m | get_depth_m() -> float | | -| distance_fished_km | get_distance_fished_km() -> float | | -| net_width_m | get_net_width_m() -> 
float | get_net_width_m_maybe() -> Optional[float] | -| net_height_m | get_net_height_m() -> float | get_net_height_m_maybe() -> Optional[float] | -| area_swept_ha | get_area_swept_ha() -> float | | -| duration_hr | get_duration_hr() -> float | | -| tsn | get_tsn() -> int | get_tsn_maybe() -> Optional[int] | -| ak_survey_id | get_ak_survey_id() -> int | | - -`Record` objects also have a `is_complete` method which returns true if all the fields with an `Optional` type are non-None and the `date_time` could be parsed and made into an ISO 8601 string. +| **API Field** | **Filter on Query** | **Regular Getter** | **Maybe Getter** | +|-----------------------|------------------------------------------|------------------------------------------------|-----------------------------------------------------------------| +| year | filter_year() | get_year() -> float | | +| srvy | filter_srvy() | get_srvy() -> str | | +| survey | filter_survey() | get_survey() -> str | | +| survey_id | filter_survey_id() | get_survey_id() -> float | | +| cruise | filter_cruise() | get_cruise() -> float | | +| haul | filter_haul() | get_haul() -> float | | +| stratum | filter_stratum() | get_stratum() -> float | | +| station | filter_station() | get_station() -> str | | +| vessel_name | filter_vessel_name() | get_vessel_name() -> str | | +| vessel_id | filter_vessel_id() | get_vessel_id() -> float | | +| date_time | filter_date_time() | get_date_time() -> str | | +| latitude_dd | filter_latitude(units='dd') | get_latitude(units='dd') -> float | | +| longitude_dd | filter_longitude(units='dd') | get_longitude(units='dd') -> float | | +| species_code | filter_species_code() | get_species_code() -> float | | +| common_name | filter_common_name() | get_common_name() -> str | | +| scientific_name | filter_scientific_name() | get_scientific_name() -> str | | +| taxon_confidence | filter_taxon_confidence() | get_taxon_confidence() -> str | | +| cpue_kgha | filter_cpue_weight(units='kg/ha') | 
get_cpue_weight(units='kg/ha') -> float | get_cpue_weight_maybe(units='kg/ha') -> Optional[float] | +| cpue_kgkm2 | filter_cpue_weight(units='kg/km2') | get_cpue_weight(units='kg/km2') -> float | get_cpue_weight_maybe(units='kg/km2') -> Optional[float] | +| cpue_kg1000km2 | filter_cpue_weight(units='kg1000/km2') | get_cpue_weight(units='kg1000/km2') -> float | get_cpue_weight_maybe(units='kg1000/km2') -> Optional[float] | +| cpue_noha | filter_cpue_count(units='count/ha') | get_cpue_count(units='count/ha') -> float | get_cpue_count_maybe(units='count/ha') -> Optional[float] | +| cpue_nokm2 | filter_cpue_count(units='count/km2') | get_cpue_count(units='count/km2') -> float | get_cpue_count_maybe(units='count/km2') -> Optional[float] | +| cpue_no1000km2 | filter_cpue_count(units='count1000/km2') | get_cpue_count(units='count1000/km2') -> float | get_cpue_count_maybe(units='count1000/km2') -> Optional[float] | +| weight_kg | filter_weight(units='kg') | get_weight(units='kg') -> float | get_weight_maybe() -> Optional[float] | +| count | filter_count() | get_count() -> float | get_count_maybe() -> Optional[float] | +| bottom_temperature_c | filter_bottom_temperature(units='c') | get_bottom_temperature(units='c') -> float | get_bottom_temperature_maybe(units='c') -> Optional[float] | +| surface_temperature_c | filter_surface_temperature(units='c') | get_surface_temperature(units='c') -> float | get_surface_temperature_maybe() -> Optional[float] | +| depth_m | filter_depth(units='m') | get_depth(units='m') -> float | | +| distance_fished_km | filter_distance_fished(units='km') | get_distance_fished(units='km') -> float | | +| net_width_m | filter_net_width(units='m') | get_net_width(units='m') -> float | get_net_width_maybe(units='m') -> Optional[float] | +| net_height_m | filter_net_height(units='m') | get_net_height(units='m') -> float | get_net_height_maybe(units='m') -> Optional[float] | +| area_swept_ha | filter_area_swept(units='ha') | get_area_swept(units='ha') -> float | | +|
duration_hr | filter_duration(units='hr') | get_duration(units='hr') -> float | | +| tsn | filter_tsn() | get_tsn() -> int | get_tsn_maybe() -> Optional[int] | +| ak_survey_id | filter_ak_survey_id() | get_ak_survey_id() -> int | | + +Support for additional units is available for some fields, and converted values are calculated on the fly within the `afscgap` library when requested. `Record` objects also have an `is_complete` method which returns true if all the fields with an `Optional` type are non-None and the `date_time` could be parsed and made into an ISO 8601 string.
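As a rough illustration of the regular getter, maybe getter, and `is_complete` semantics described in this section, the sketch below uses a hypothetical `SketchRecord` with a single optional field. It is not `afscgap.model.Record`, and it uses `datetime.datetime.fromisoformat` as a stand-in for whatever date parsing the library actually performs.

```python
import datetime
from typing import Optional


class SketchRecord:
    """Hypothetical record with one optional field; not afscgap's Record."""

    def __init__(self, date_time: str, count: Optional[float]):
        self._date_time = date_time
        self._count = count

    def get_count_maybe(self) -> Optional[float]:
        # "Maybe" getter: returns None without error if the value is missing
        return self._count

    def get_count(self) -> float:
        # Regular getter: asserts the value is present before returning it
        assert self._count is not None
        return self._count

    def is_complete(self) -> bool:
        # Complete only if the date parses and the optional field is present
        try:
            datetime.datetime.fromisoformat(self._date_time)
        except ValueError:
            return False
        return self._count is not None


complete = SketchRecord('2021-07-16T11:30:22', 3.0)
incomplete = SketchRecord('2021-07-16T11:30:22', None)

print(complete.is_complete())
print(incomplete.is_complete())
print(incomplete.get_count_maybe())
```

Calling `incomplete.get_count()` in this sketch raises an `AssertionError`, which is the trade-off between the two getter styles: the regular getter gives a non-optional type at the cost of failing on missing values.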

@@ -415,22 +513,22 @@ import toolz.itertoolz import afscgap -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Gadus macrocephalus', - presence_only=False -) +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Gadus macrocephalus') +query.set_presence_only(False) +results = query.execute() def simplify_record(full_record): - latitude = full_record.get_latitude_dd() - longitude = full_record.get_longitude_dd() + latitude = full_record.get_latitude(units='dd') + longitude = full_record.get_longitude(units='dd') geohash = geolib.geohash.encode(latitude, longitude, 5) return { 'geohash': geohash, - 'area': full_record.get_area_swept_ha(), - 'weight': full_record.get_weight_kg() + 'area': full_record.get_area_swept(units='ha'), + 'weight': full_record.get_weight(units='kg') } def combine_record(a, b): @@ -501,13 +599,13 @@ with open('hauls.csv') as f: rows = csv.DictReader(f) hauls = [afscgap.inference.parse_haul(row) for row in rows] -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Gadus macrocephalus', - presence_only=False, - hauls_prefetch=hauls -) +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Gadus macrocephalus') +query.set_presence_only(False) +query.set_hauls_prefetch(hauls) +results = query.execute() ``` This can be helpful when executing a lot of queries and the bandwidth to download the [hauls metadata file](https://pyafscgap.org/community/hauls.csv) multiple times may not be desireable. @@ -523,18 +621,20 @@ There are a few caveats for working with these data that are important for resea #### Incomplete or invalid records Metadata fields such as `year` are always required to make a `Record` whereas others such as catch weight `cpue_kgkm2` are not present on all records returned by the API and are optional. See the [data structure section](#data-structure) for additional details. 
For fields with optional values: - - A maybe getter (like `get_cpue_kgkm2_maybe`) is provided which will return None without error if the value is not provided or could not be parsed. - - A regular getter (like `get_cpue_kgkm2`) is provided which asserts the value is not None before it is returned. + - A maybe getter (like `get_cpue_weight_maybe`) is provided which will return None without error if the value is not provided or could not be parsed. + - A regular getter (like `get_cpue_weight`) is provided which asserts the value is not None before it is returned. `Record` objects also have an `is_complete` method which returns true if both all optional fields on the `Record` are non-None and the `date_time` field on the `Record` is a valid ISO 8601 string. By default, records for which `is_complete` are false are returned when iterating or through `get_page` but this can be overridden by with the `filter_incomplete` keyword argument like so: ``` -results = afscgap.query( - year=2021, - srvy='GOA', - scientific_name='Pasiphaea pacifica', - filter_incomplete=True -) +import afscgap + +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') +query.set_filter_incomplete(True) +results = query.execute() for result in results: assert result.is_complete() @@ -543,7 +643,14 @@ for result in results: Results returned by the API for which non-Optional fields could not be parsed (like missing `year`) are considered "invalid" and always excluded during iteration when those raw unreadable records are kept in a `queue.Queue[dict]` that can be accessed via `get_invalid` like so: ``` -results = afscgap.query(year=2021, srvy='GOA') +import afscgap + +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_srvy(eq='GOA') +query.filter_scientific_name(eq='Pasiphaea pacifica') +results = query.execute() + valid = list(results) invalid_queue = results.get_invalid() @@ -602,11 +709,13 @@ We invite 
contributions via [our project Github](https://github.com/SchmidtDSE/a While participating in the community, you may need to debug URL generation. Therefore, for investigating issues or evaluating the underlying operations, you can also request a full URL for a query: ``` -results = afscgap.query( - year=2021, - latitude_dd={'$between': [56, 57]}, - longitude_dd={'$between': [-161, -160]} -) +import afscgap + +query = afscgap.Query() +query.filter_year(eq=2021) +query.filter_latitude(eq={'$between': [56, 57]}) +query.filter_longitude(eq={'$between': [-161, -160]}) +results = query.execute() print(results.get_page_url(limit=10, offset=0)) ``` diff --git a/afscgap/__init__.py b/afscgap/__init__.py index 7a053057..3e8d7788 100644 --- a/afscgap/__init__.py +++ b/afscgap/__init__.py @@ -32,6 +32,7 @@ from afscgap.typesdef import INT_PARAM from afscgap.typesdef import STR_PARAM +from afscgap.typesdef import OPT_FLOAT from afscgap.typesdef import OPT_INT from afscgap.typesdef import OPT_STR from afscgap.typesdef import OPT_REQUESTOR @@ -46,244 +47,1269 @@ ]) -def query( - year: FLOAT_PARAM = None, - srvy: STR_PARAM = None, - survey: STR_PARAM = None, - survey_id: FLOAT_PARAM = None, - cruise: FLOAT_PARAM = None, - haul: FLOAT_PARAM = None, - stratum: FLOAT_PARAM = None, - station: STR_PARAM = None, - vessel_name: STR_PARAM = None, - vessel_id: FLOAT_PARAM = None, - date_time: STR_PARAM = None, - latitude_dd: FLOAT_PARAM = None, - longitude_dd: FLOAT_PARAM = None, - species_code: FLOAT_PARAM = None, - common_name: STR_PARAM = None, - scientific_name: STR_PARAM = None, - taxon_confidence: STR_PARAM = None, - cpue_kgha: FLOAT_PARAM = None, - cpue_kgkm2: FLOAT_PARAM = None, - cpue_kg1000km2: FLOAT_PARAM = None, - cpue_noha: FLOAT_PARAM = None, - cpue_nokm2: FLOAT_PARAM = None, - cpue_no1000km2: FLOAT_PARAM = None, - weight_kg: FLOAT_PARAM = None, - count: FLOAT_PARAM = None, - bottom_temperature_c: FLOAT_PARAM = None, - surface_temperature_c: FLOAT_PARAM = None, - depth_m: 
FLOAT_PARAM = None, - distance_fished_km: FLOAT_PARAM = None, - net_width_m: FLOAT_PARAM = None, - net_height_m: FLOAT_PARAM = None, - area_swept_ha: FLOAT_PARAM = None, - duration_hr: FLOAT_PARAM = None, - tsn: INT_PARAM = None, - ak_survey_id: INT_PARAM = None, - limit: OPT_INT = None, - start_offset: OPT_INT = None, - base_url: OPT_STR = None, - requestor: OPT_REQUESTOR = None, - filter_incomplete: bool = False, - presence_only: bool = True, - suppress_large_warning: bool = False, - hauls_url: OPT_STR = None, - warn_function: WARN_FUNCTION = None, - hauls_prefetch: OPT_HAUL_LIST = None) -> afscgap.cursor.Cursor: - """Execute a query against the AFSC GAP API. - - Args: - year: Filter on year for the survey in which this observation was made. - Pass None if no filter should be applied. Defaults to None. - srvy: Filter on the short name of the survey in which this observation - was made. Pass None if no filter should be applied. Defaults to - None. Note that common values include: NBS (N Bearing Sea), EBS (SE - Bearing Sea), BSS (Bearing Sea Slope), GOA (Gulf of Alaska), and - AI (Aleutian Islands). - survey: Filter on long form description of the survey in which the - observation was made. Pass None if no filter should be applied. - Defaults to None. - survey_id: Filter on unique numeric ID for the survey. Pass None if no - filter should be applied. Defaults to None. - cruise: Filter on an ID uniquely identifying the cruise in which the - observation was made. Pass None if no filter should be applied. - Defaults to None. - haul: Filter on an ID uniquely identifying the haul in which this - observation was made. Pass None if no filter should be applied. - Defaults to None. - stratum: Filter on unique ID for statistical area / survey combination. - Pass None if no filter should be applied. Defaults to None. - station: Filter on station associated with the survey. Pass None if no - filter should be applied. Defaults to None. 
- vessel_name: Filter on unique ID describing the vessel that made this - observation. Pass None if no filter should be applied. Defaults to - None. - vessel_id: Filter on name of the vessel at the time the observation was - made. Pass None if no filter should be applied. Defaults to None. - date_time: Filter on the date and time of the haul as an ISO 8601 - string. Pass None if no filter should be applied. Defaults to None. - If given an ISO 8601 string, will convert from ISO 8601 to the API - datetime string format. Similarly, if given a dictionary, all values - matching an ISO 8601 string will be converted to the API datetime - string format. - latitude_dd: Filter on latitude in decimal degrees associated with the - haul. Pass None if no filter should be applied. Defaults to None. - longitude_dd: Filter on longitude in decimal degrees associated with the - haul. Pass None if no filter should be applied. Defaults to None. - species_code: Filter on unique ID associated with the species observed. - Pass None if no filter should be applied. Defaults to None. - common_name: Filter on the “common name” associated with the species - observed. Pass None if no filter should be applied. Defaults to - None. - scientific_name: Filter on the “scientific name” associated with the - species observed. Pass None if no filter should be applied. Defaults - to None. - taxon_confidence: Filter on confidence flag regarding ability to - identify species. Pass None if no filter should be applied. Defaults - to None. - cpue_kgha: Filter on catch weight divided by net area (kg / hectares) if - available. Pass None if no filter should be applied. Defaults to - None. - cpue_kgkm2: Filter on catch weight divided by net area (kg / km^2) if - available. Pass None if no filter should be applied. Defaults to - None. - cpue_kg1000km2: Filter on catch weight divided by net area (kg / km^2 * - 1000) if available. Pass None if no filter should be applied. - Defaults to None. 
- cpue_noha: Filter on catch number divided by net sweep area if available - (count / hectares). Pass None if no filter should be applied. - Defaults to None. - cpue_nokm2: Filter on catch number divided by net sweep area if - available (count / km^2). Pass None if no filter should be applied. - Defaults to None. - cpue_no1000km2: Filter on catch number divided by net sweep area if - available (count / km^2 * 1000). Pass None if no filter should be - applied. Defaults to None. - weight_kg: Filter on taxon weight (kg) if available. Pass None if no - filter should be applied. Defaults to None. - count: Filter on total number of organism individuals in haul. Pass None - if no filter should be applied. Defaults to None. - bottom_temperature_c: Filter on bottom temperature associated with - observation if available in Celsius. Pass None if no filter should - be applied. Defaults to None. - surface_temperature_c: Filter on surface temperature associated with - observation if available in Celsius. Pass None if no filter should - be applied. Defaults to None. - depth_m: Filter on depth of the bottom in meters. Pass None if no filter - should be applied. Defaults to None. - distance_fished_km: Filter on distance of the net fished as km. Pass - None if no filter should be applied. Defaults to None. - net_width_m: Filter on distance of the net fished as m. Pass None if no - filter should be applied. Defaults to None. - net_height_m: Filter on height of the net fished as m. Pass None if no - filter should be applied. Defaults to None. - area_swept_ha: Filter on area covered by the net while fishing in - hectares. Pass None if no filter should be applied. Defaults to - None. - duration_hr: Filter on duration of the haul as number of hours. Pass - None if no filter should be applied. Defaults to None. - tsn: Filter on taxonomic information system species code. Pass None if - no filter should be applied. Defaults to None. - ak_survey_id: Filter on AK identifier for the survey. 
Pass None if no - filter should be applied. Defaults to None. - limit: The maximum number of results to retrieve per HTTP request. If - None or not provided, will use API's default. - start_offset: The number of initial results to skip in retrieving - results. If None or not provided, none will be skipped. - base_url: The URL at which the API can be found. If None, will use - default (offical URL at time of release). See - afscgap.client.DEFAULT_URL. - requestor: Strategy to use for making HTTP requests. If None, will use - a default as defined by afscgap.client.Cursor. - filter_incomplete: Flag indicating if "incomplete" records should be - filtered. If true, "incomplete" records are silently filtered from - the results, putting them in the invalid records queue. If false, - they are included and their is_complete() will return false. - Defaults to false. - presence_only: Flag indicating if abscence / zero catch data should be - inferred. If false, will run abscence data inference. If true, will - return presence only data as returned by the NOAA API service. - Defaults to true. - suppress_large_warning: Indicate if the library should warn when an - operation may consume a large amount of memory. If true, the warning - will not be emitted. Defaults to true. - hauls_url: The URL at which the flat file with hauls metadata can be - found or None if a default should be used. Defaults to None. - warn_function: Function to call with a message describing warnings - encountered. If None, will use warnings.warn. Defaults to None. - hauls_prefetch: If using presence_only=True, this is ignored. Otherwise, - if None, will instruct the library to download hauls metadata. If - not None, will use this as the hauls list for zero catch record - inference. - - Returns: - Cursor to manage HTTP requests and query results. +class Query: + """Entrypoint for the AFSC GAP Python library. + + Facade for executing queries against AFSC GAP and builder to create those + queries. 
""" - all_dict_raw = { - 'year': year, - 'srvy': srvy, - 'survey': survey, - 'survey_id': survey_id, - 'cruise': cruise, - 'haul': haul, - 'stratum': stratum, - 'station': station, - 'vessel_name': vessel_name, - 'vessel_id': vessel_id, - 'date_time': date_time, - 'latitude_dd': latitude_dd, - 'longitude_dd': longitude_dd, - 'species_code': species_code, - 'common_name': common_name, - 'scientific_name': scientific_name, - 'taxon_confidence': taxon_confidence, - 'cpue_kgha': cpue_kgha, - 'cpue_kgkm2': cpue_kgkm2, - 'cpue_kg1000km2': cpue_kg1000km2, - 'cpue_noha': cpue_noha, - 'cpue_nokm2': cpue_nokm2, - 'cpue_no1000km2': cpue_no1000km2, - 'weight_kg': weight_kg, - 'count': count, - 'bottom_temperature_c': bottom_temperature_c, - 'surface_temperature_c': surface_temperature_c, - 'depth_m': depth_m, - 'distance_fished_km': distance_fished_km, - 'net_width_m': net_width_m, - 'net_height_m': net_height_m, - 'area_swept_ha': area_swept_ha, - 'duration_hr': duration_hr, - 'tsn': tsn, - 'ak_survey_id': ak_survey_id - } - - api_cursor = afscgap.client.build_api_cursor( - all_dict_raw, - limit=limit, - start_offset=start_offset, - filter_incomplete=filter_incomplete, - requestor=requestor, - base_url=base_url - ) - - if presence_only: - return api_cursor - - decorated_cursor = afscgap.inference.build_inference_cursor( - all_dict_raw, - api_cursor, - requestor=requestor, - hauls_url=hauls_url, - hauls_prefetch=hauls_prefetch - ) - - if not suppress_large_warning: - if not warn_function: - warn_function = lambda x: warnings.warn(x) - - warn_function(LARGE_WARNING) - - return decorated_cursor + def __init__(self, base_url: OPT_STR = None, hauls_url: OPT_STR = None, + requestor: OPT_REQUESTOR = None): + """Create a new Query. + + Args: + base_url: The URL at which the API can be found. If None, will use + default (offical URL at time of release). See + afscgap.client.DEFAULT_URL. 
+ hauls_url: The URL at which the flat file with hauls metadata can be + found or None if a default should be used. Defaults to None. + requestor: Strategy to use for making HTTP requests. If None, will + use a default as defined by afscgap.client.Cursor. + """ + # URLs for data + self._base_url = base_url + self._hauls_url = hauls_url + self._requestor = requestor + + # Filter parameters + self._year: FLOAT_PARAM = None + self._srvy: STR_PARAM = None + self._survey: STR_PARAM = None + self._survey_id: FLOAT_PARAM = None + self._cruise: FLOAT_PARAM = None + self._haul: FLOAT_PARAM = None + self._stratum: FLOAT_PARAM = None + self._station: STR_PARAM = None + self._vessel_name: STR_PARAM = None + self._vessel_id: FLOAT_PARAM = None + self._date_time: STR_PARAM = None + self._latitude_dd: FLOAT_PARAM = None + self._longitude_dd: FLOAT_PARAM = None + self._species_code: FLOAT_PARAM = None + self._common_name: STR_PARAM = None + self._scientific_name: STR_PARAM = None + self._taxon_confidence: STR_PARAM = None + self._cpue_kgha: FLOAT_PARAM = None + self._cpue_kgkm2: FLOAT_PARAM = None + self._cpue_kg1000km2: FLOAT_PARAM = None + self._cpue_noha: FLOAT_PARAM = None + self._cpue_nokm2: FLOAT_PARAM = None + self._cpue_no1000km2: FLOAT_PARAM = None + self._weight_kg: FLOAT_PARAM = None + self._count: FLOAT_PARAM = None + self._bottom_temperature_c: FLOAT_PARAM = None + self._surface_temperature_c: FLOAT_PARAM = None + self._depth_m: FLOAT_PARAM = None + self._distance_fished_km: FLOAT_PARAM = None + self._net_width_m: FLOAT_PARAM = None + self._net_height_m: FLOAT_PARAM = None + self._area_swept_ha: FLOAT_PARAM = None + self._duration_hr: FLOAT_PARAM = None + self._tsn: INT_PARAM = None + self._ak_survey_id: INT_PARAM = None + + # Query parameters + self._limit: OPT_INT = None + self._start_offset: OPT_INT = None + self._filter_incomplete: bool = False + self._presence_only: bool = True + self._suppress_large_warning: bool = False + self._warn_function: WARN_FUNCTION =
None + self._hauls_prefetch: OPT_HAUL_LIST = None + + def filter_year(self, eq: FLOAT_PARAM = None, min_val: OPT_FLOAT = None, + max_val: OPT_FLOAT = None) -> 'Query': + """Filter on year for the survey in which this observation was made. + + Filter on year for the survey in which this observation was made, + overwriting all prior year filters on this Query if one was previously + set. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._year = self._create_float_param(eq, min_val, max_val) + return self + + def filter_srvy(self, eq: STR_PARAM = None, min_val: OPT_STR = None, + max_val: OPT_STR = None) -> 'Query': + """Filter on haul survey short name. + + Filter on the short name of the survey in which this observation was + made. Pass None if no filter should be applied. Defaults to None. Note + that common values include: NBS (N Bering Sea), EBS (SE Bering Sea), + BSS (Bering Sea Slope), GOA (Gulf of Alaska), and AI (Aleutian + Islands). Overwrites all prior srvy filters if set on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided.
+ max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._srvy = self._create_str_param(eq, min_val, max_val) + return self + + def filter_survey(self, eq: STR_PARAM = None, min_val: OPT_STR = None, + max_val: OPT_STR = None) -> 'Query': + """Filter on survey long name. + + Filter on long form description of the survey in which the observation + was made. Overwrites all prior survey filters if set on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._survey = self._create_str_param(eq, min_val, max_val) + return self + + def filter_survey_id(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None) -> 'Query': + """Filter on unique numeric ID for the survey. + + Filter on unique numeric ID for the survey, overwriting prior survey ID + filters if set on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided.
+ max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._survey_id = self._create_float_param(eq, min_val, max_val) + return self + + def filter_cruise(self, eq: FLOAT_PARAM = None, min_val: OPT_FLOAT = None, + max_val: OPT_FLOAT = None) -> 'Query': + """Filter on cruise ID. + + Filter on an ID uniquely identifying the cruise in which the observation + was made. Overwrites all prior cruise filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._cruise = self._create_float_param(eq, min_val, max_val) + return self + + def filter_haul(self, eq: FLOAT_PARAM = None, min_val: OPT_FLOAT = None, + max_val: OPT_FLOAT = None) -> 'Query': + """Filter on haul identifier. + + Filter on an ID uniquely identifying the haul in which this observation + was made. Overwrites all prior haul filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided.
+ max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._haul = self._create_float_param(eq, min_val, max_val) + return self + + def filter_stratum(self, eq: FLOAT_PARAM = None, min_val: OPT_FLOAT = None, + max_val: OPT_FLOAT = None) -> 'Query': + """Filter on unique ID for statistical area / survey combination. + + Filter on unique ID for statistical area / survey combination, + overwriting all prior stratum filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._stratum = self._create_float_param(eq, min_val, max_val) + return self + + def filter_station(self, eq: STR_PARAM = None, min_val: OPT_STR = None, + max_val: OPT_STR = None) -> 'Query': + """Filter on station associated with the survey. + + Filter on station associated with the survey, overwriting all prior + station filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided.
+ max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._station = self._create_str_param(eq, min_val, max_val) + return self + + def filter_vessel_name(self, eq: STR_PARAM = None, + min_val: OPT_STR = None, max_val: OPT_STR = None) -> 'Query': + """Filter on name of the vessel at the time the observation was made. + + Filter on name of the vessel at the time the observation was made, + overwriting all prior vessel name filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._vessel_name = self._create_str_param(eq, min_val, max_val) + return self + + def filter_vessel_id(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None) -> 'Query': + """Filter on unique ID describing the vessel that made this observation. + + Filter on unique ID describing the vessel that made this observation, + overwriting all prior vessel ID filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied.
Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._vessel_id = self._create_float_param(eq, min_val, max_val) + return self + + def filter_date_time(self, eq: STR_PARAM = None, min_val: OPT_STR = None, + max_val: OPT_STR = None) -> 'Query': + """Filter on the date and time of the haul. + + Filter on the date and time of the haul as an ISO 8601 string. If given + an ISO 8601 string, will convert from ISO 8601 to the API + datetime string format. Similarly, if given a dictionary, all values + matching an ISO 8601 string will be converted to the API + datetime string format. Overwrites all prior date time filters on + this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._date_time = self._create_str_param(eq, min_val, max_val) + return self + + def filter_latitude(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'dd') -> 'Query': + """Filter on latitude in decimal degrees associated with the haul. + + Filter on latitude in decimal degrees associated with the haul, + overwriting all prior latitude filters on this Query.
+ + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the filter values are provided. Currently + only dd supported. Ignored if given eq value containing ORDS + query. + + Returns: + This object for chaining if desired. + """ + self._latitude_dd = self._create_float_param( + afscgap.convert.unconvert_degrees(eq, units), + afscgap.convert.unconvert_degrees(min_val, units), + afscgap.convert.unconvert_degrees(max_val, units) + ) + return self + + def filter_longitude(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'dd') -> 'Query': + """Filter on longitude in decimal degrees associated with the haul. + + Filter on longitude in decimal degrees associated with the haul, + overwriting all prior longitude filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the filter values are provided. Currently + only dd supported.
+ + Returns: + This object for chaining if desired. + """ + self._longitude_dd = self._create_float_param( + afscgap.convert.unconvert_degrees(eq, units), + afscgap.convert.unconvert_degrees(min_val, units), + afscgap.convert.unconvert_degrees(max_val, units) + ) + return self + + def filter_species_code(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None) -> 'Query': + """Filter on unique ID associated with the species observed. + + Filter on unique ID associated with the species observed, overwriting + all prior species code filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._species_code = self._create_float_param(eq, min_val, max_val) + return self + + def filter_common_name(self, eq: STR_PARAM = None, min_val: OPT_STR = None, + max_val: OPT_STR = None) -> 'Query': + """Filter on the "common name" associated with the species observed. + + Filter on the "common name" associated with the species observed, + overwriting all prior common name filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None.
Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._common_name = self._create_str_param(eq, min_val, max_val) + return self + + def filter_scientific_name(self, eq: STR_PARAM = None, + min_val: OPT_STR = None, max_val: OPT_STR = None) -> 'Query': + """Filter on the "scientific name" associated with the species observed. + + Filter on the "scientific name" associated with the species observed, + overwriting all prior scientific name filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._scientific_name = self._create_str_param(eq, min_val, max_val) + return self + + def filter_taxon_confidence(self, eq: STR_PARAM = None, + min_val: OPT_STR = None, max_val: OPT_STR = None) -> 'Query': + """Filter on confidence flag regarding ability to identify species. + + Filter on confidence flag regarding ability to identify species, + overwriting all prior taxon confidence filters on this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive.
Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._taxon_confidence = self._create_str_param(eq, min_val, max_val) + return self + + def filter_cpue_weight(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'kg/ha') -> 'Query': + """Filter on catch per unit effort. + + Filter on catch per unit effort as weight divided by net area if + available. Overwrites all prior CPUE weight filters applied to this + Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units for the catch per unit effort provided. Options: + kg/ha, kg/km2, kg1000/km2. Defaults to kg/ha. Ignored if given + eq value containing ORDS query. + + Returns: + This object for chaining if desired.
+ """ + param = self._create_float_param(eq, min_val, max_val) + + self._cpue_kgha = None + self._cpue_kgkm2 = None + self._cpue_kg1000km2 = None + + if units == 'kg/ha': + self._cpue_kgha = param + elif units == 'kg/km2': + self._cpue_kgkm2 = param + elif units == 'kg1000/km2': + self._cpue_kg1000km2 = param + else: + raise RuntimeError('Unrecognized units ' + units) + + return self + + def filter_cpue_count(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'count/ha') -> 'Query': + """Filter catch per unit effort as count over area in hectares. + + Filter on catch number divided by net sweep area if available (count / + hectares). Overwrites all prior CPUE count filters applied to this + Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units for the given catch per unit effort. Options: + count/ha, count/km2, and count1000/km2. Defaults to count/ha. + Ignored if given eq value containing ORDS query. + + Returns: + This object for chaining if desired.
+ """ + param = self._create_float_param(eq, min_val, max_val) + + self._cpue_noha = None + self._cpue_nokm2 = None + self._cpue_no1000km2 = None + + if units == 'count/ha': + self._cpue_noha = param + elif units == 'count/km2': + self._cpue_nokm2 = param + elif units == 'count1000/km2': + self._cpue_no1000km2 = param + else: + raise RuntimeError('Unrecognized units ' + units) + + return self + + def filter_weight(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'kg') -> 'Query': + """Filter on taxon weight (kg) if available. + + Filter on taxon weight (kg) if available, overwriting all prior weight + filters applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the weight is given. Options are + g, kg for grams and kilograms respectively. Defaults to kg. + Ignored if given eq value containing ORDS query. + + Returns: + This object for chaining if desired. + """ + self._weight_kg = self._create_float_param( + afscgap.convert.unconvert_weight(eq, units), + afscgap.convert.unconvert_weight(min_val, units), + afscgap.convert.unconvert_weight(max_val, units) + ) + return self + + def filter_count(self, eq: FLOAT_PARAM = None, min_val: OPT_FLOAT = None, + max_val: OPT_FLOAT = None) -> 'Query': + """Filter on total number of organism individuals in haul.
+ + Filter on total number of organism individuals in haul, overwriting all + prior count filters applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._count = self._create_float_param(eq, min_val, max_val) + return self + + def filter_bottom_temperature(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'c') -> 'Query': + """Filter on bottom temperature. + + Filter on bottom temperature associated with observation if available in + the units given. Overwrites all prior bottom temperature filters applied + to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the temperature filter values are given. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. Ignored if given eq value containing ORDS query.
+ + Returns: + This object for chaining if desired. + """ + self._bottom_temperature_c = self._create_float_param( + afscgap.convert.unconvert_temperature(eq, units), + afscgap.convert.unconvert_temperature(min_val, units), + afscgap.convert.unconvert_temperature(max_val, units) + ) + return self + + def filter_surface_temperature(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'c') -> 'Query': + """Filter on surface temperature. + + Filter on surface temperature associated with observation if available + in the units given. Overwrites all prior surface temperature filters + applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the temperature filter values are given. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. Ignored if given eq value containing ORDS query. + + Returns: + This object for chaining if desired. + """ + self._surface_temperature_c = self._create_float_param( + afscgap.convert.unconvert_temperature(eq, units), + afscgap.convert.unconvert_temperature(min_val, units), + afscgap.convert.unconvert_temperature(max_val, units) + ) + return self + + def filter_depth(self, eq: FLOAT_PARAM = None, min_val: OPT_FLOAT = None, + max_val: OPT_FLOAT = None, units: str = 'm') -> 'Query': + """Filter on depth of the bottom in meters.
+ + Filter on depth of the bottom in meters, overwriting all prior depth + filters applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the distance values are given. Options: + m or km for meters and kilometers respectively. Defaults to m. + Ignored if given eq value containing ORDS query. + + Returns: + This object for chaining if desired. + """ + self._depth_m = self._create_float_param( + afscgap.convert.unconvert_distance(eq, units), + afscgap.convert.unconvert_distance(min_val, units), + afscgap.convert.unconvert_distance(max_val, units) + ) + return self + + def filter_distance_fished(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'm') -> 'Query': + """Filter on distance of the net fished. + + Filter on distance of the net fished, overwriting prior distance fished + filters applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided.
+ units: The units in which the distance values are given. Options: + m or km for meters and kilometers respectively. Defaults to m. + Ignored if given eq value containing ORDS query. + + Returns: + This object for chaining if desired. + """ + def convert_to_km(target, units): + in_meters = afscgap.convert.unconvert_distance(target, units) + return afscgap.convert.convert_distance(in_meters, 'km') + + self._distance_fished_km = self._create_float_param( + convert_to_km(eq, units), + convert_to_km(min_val, units), + convert_to_km(max_val, units) + ) + return self + + def filter_net_width(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'm') -> 'Query': + """Filter on width of the net fished. + + Filter on width of the net fished, overwriting prior net width + filters applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the width values are given. Options: + m or km for meters and kilometers respectively. Defaults to m. + + Returns: + This object for chaining if desired.
+ """ + self._net_width_m = self._create_float_param( + afscgap.convert.unconvert_distance(eq, units), + afscgap.convert.unconvert_distance(min_val, units), + afscgap.convert.unconvert_distance(max_val, units) + ) + return self + + def filter_net_height(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'm') -> 'Query': + """Filter on height of the net fished. + + Filter on height of the net fished, overwriting prior net height + filters applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the height values are given. Options: + m or km for meters and kilometers respectively. Defaults to m. + + Returns: + This object for chaining if desired. + """ + self._net_height_m = self._create_float_param( + afscgap.convert.unconvert_distance(eq, units), + afscgap.convert.unconvert_distance(min_val, units), + afscgap.convert.unconvert_distance(max_val, units) + ) + return self + + def filter_area_swept(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'ha') -> 'Query': + """Filter on area covered by the net while fishing. + + Filter on area covered by the net while fishing, overwriting prior + area swept filters applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided.
May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the area values are given. Options: + ha, m2, km2. Defaults to ha. + + Returns: + This object for chaining if desired. + """ + self._area_swept_ha = self._create_float_param( + afscgap.convert.unconvert_area(eq, units), + afscgap.convert.unconvert_area(min_val, units), + afscgap.convert.unconvert_area(max_val, units) + ) + return self + + def filter_duration(self, eq: FLOAT_PARAM = None, + min_val: OPT_FLOAT = None, max_val: OPT_FLOAT = None, + units: str = 'hr') -> 'Query': + """Filter on duration of the haul. + + Filter on duration of the haul, overwriting all prior duration filters + applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + units: The units in which the duration values are given. Options: + day, hr, min. Defaults to hr. + + Returns: + This object for chaining if desired.
+ """ + self._duration_hr = self._create_float_param( + afscgap.convert.unconvert_time(eq, units), + afscgap.convert.unconvert_time(min_val, units), + afscgap.convert.unconvert_time(max_val, units) + ) + return self + + def filter_tsn(self, eq: INT_PARAM = None, min_val: OPT_INT = None, + max_val: OPT_INT = None) -> 'Query': + """Filter on taxonomic information system species code. + + Filter on taxonomic information system species code, overwriting all + prior TSN filters applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._tsn = self._create_int_param(eq, min_val, max_val) + return self + + def filter_ak_survey_id(self, eq: INT_PARAM = None, min_val: OPT_INT = None, + max_val: OPT_INT = None) -> 'Query': + """Filter on AK identifier for the survey. + + Filter on AK identifier for the survey, overwriting all prior AK ID + filters applied to this Query. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None.
Error + thrown if eq also provided. + + Returns: + This object for chaining if desired. + """ + self._ak_survey_id = self._create_int_param(eq, min_val, max_val) + return self + + def set_limit(self, limit: OPT_INT) -> 'Query': + """Set the max number of results. + + Set the max number of results, overwriting prior limit settings on this + Query. + + Args: + limit: The maximum number of results to retrieve per HTTP request. + If None or not provided, will use API's default. + + Returns: + This object for chaining if desired. + """ + self._limit = limit + return self + + def set_start_offset(self, start_offset: OPT_INT) -> 'Query': + """Indicate how many results to skip. + + Indicate how many results to skip, overwriting prior offset settings on + this Query. + + Args: + start_offset: The number of initial results to skip in retrieving + results. If None or not provided, none will be skipped. + + Returns: + This object for chaining if desired. + """ + self._start_offset = start_offset + return self + + def set_filter_incomplete(self, filter_incomplete: bool) -> 'Query': + """Indicate if incomplete records should be filtered out. + + Indicate if incomplete records should be filtered out, overwriting + prior incomplete filter settings on this Query. + + Args: + filter_incomplete: Flag indicating if "incomplete" records should be + filtered. If true, "incomplete" records are silently filtered + from the results, putting them in the invalid records queue. If + false, they are included and their is_complete() will return + false. Defaults to false. + + Returns: + This object for chaining if desired. + """ + self._filter_incomplete = filter_incomplete + return self + + def set_presence_only(self, presence_only: bool) -> 'Query': + """Indicate if zero catch inference should be enabled. + + Indicate if zero catch inference should be enabled, overwriting prior + absence / zero catch data settings on this Query.
+ + Args: + presence_only: Flag indicating if absence / zero catch data should + be inferred. If false, will run absence data inference. If + true, will return presence only data as returned by the NOAA API + service. Defaults to true. + + Returns: + This object for chaining if desired. + """ + self._presence_only = presence_only + return self + + def set_suppress_large_warning(self, supress: bool) -> 'Query': + """Indicate if the large results warning should be suppressed. + + Indicate if the large results warning should be suppressed, overwriting + prior large results warning suppression settings on this Query. + + Args: + supress: Flag indicating if the warning emitted when an operation + may consume a large amount of memory should be suppressed. If + true, the warning will not be emitted. Defaults to true. + + Returns: + This object for chaining if desired. + """ + self._suppress_large_warning = supress + return self + + def set_warn_function(self, warn_function: WARN_FUNCTION) -> 'Query': + """Indicate how warnings should be emitted. + + Indicate how warnings should be emitted, overwriting the prior warning + function settings on this Query. + + Args: + warn_function: Function to call with a message describing warnings + encountered. If None, will use warnings.warn. Defaults to None. + + Returns: + This object for chaining if desired. + """ + self._warn_function = warn_function + return self + + def set_hauls_prefetch(self, hauls_prefetch: OPT_HAUL_LIST) -> 'Query': + """Indicate if hauls' data were prefetched. + + Indicate if hauls' data were prefetched, overwriting prior prefetch + settings on this Query. + + Args: + hauls_prefetch: If using presence_only=True, this is ignored. + Otherwise, if None, will instruct the library to download hauls + metadata. If not None, will use this as the hauls list for zero + catch record inference. + + Returns: + This object for chaining if desired.
+ """ + self._hauls_prefetch = hauls_prefetch + return self + + def execute(self) -> afscgap.cursor.Cursor: + """Execute the query built up in this object. + + Execute the query built up in this object using its current state. Note + that later changes to this builder will not impact prior returned + Cursors from execute. + + Returns: + Cursor to manage HTTP requests and query results. + """ + all_dict_raw = { + 'year': self._year, + 'srvy': self._srvy, + 'survey': self._survey, + 'survey_id': self._survey_id, + 'cruise': self._cruise, + 'haul': self._haul, + 'stratum': self._stratum, + 'station': self._station, + 'vessel_name': self._vessel_name, + 'vessel_id': self._vessel_id, + 'date_time': self._date_time, + 'latitude_dd': self._latitude_dd, + 'longitude_dd': self._longitude_dd, + 'species_code': self._species_code, + 'common_name': self._common_name, + 'scientific_name': self._scientific_name, + 'taxon_confidence': self._taxon_confidence, + 'cpue_kgha': self._cpue_kgha, + 'cpue_kgkm2': self._cpue_kgkm2, + 'cpue_kg1000km2': self._cpue_kg1000km2, + 'cpue_noha': self._cpue_noha, + 'cpue_nokm2': self._cpue_nokm2, + 'cpue_no1000km2': self._cpue_no1000km2, + 'weight_kg': self._weight_kg, + 'count': self._count, + 'bottom_temperature_c': self._bottom_temperature_c, + 'surface_temperature_c': self._surface_temperature_c, + 'depth_m': self._depth_m, + 'distance_fished_km': self._distance_fished_km, + 'net_width_m': self._net_width_m, + 'net_height_m': self._net_height_m, + 'area_swept_ha': self._area_swept_ha, + 'duration_hr': self._duration_hr, + 'tsn': self._tsn, + 'ak_survey_id': self._ak_survey_id + } + + api_cursor = afscgap.client.build_api_cursor( + all_dict_raw, + limit=self._limit, + start_offset=self._start_offset, + filter_incomplete=self._filter_incomplete, + requestor=self._requestor, + base_url=self._base_url + ) + + if self._presence_only: + return api_cursor + + decorated_cursor = afscgap.inference.build_inference_cursor( + all_dict_raw, + api_cursor, + 
requestor=self._requestor, + hauls_url=self._hauls_url, + hauls_prefetch=self._hauls_prefetch + ) + + if not self._suppress_large_warning: + warn_function = self._warn_function + if not warn_function: + warn_function = lambda x: warnings.warn(x) + + warn_function(LARGE_WARNING) + + return decorated_cursor + + def _create_str_param(self, eq: STR_PARAM = None, min_val: OPT_STR = None, + max_val: OPT_STR = None) -> STR_PARAM: + """Create a new string parameter. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + Compatible param representation. + """ + return self._create_param(eq, min_val, max_val) # type: ignore + + def _create_float_param(self, eq: FLOAT_PARAM = None, + min_val: FLOAT_PARAM = None, + max_val: FLOAT_PARAM = None) -> FLOAT_PARAM: + """Create a new float parameter. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided.
+ """ + return self._create_param(eq, min_val, max_val) # type: ignore + + def _create_int_param(self, eq: INT_PARAM = None, min_val: OPT_INT = None, + max_val: OPT_INT = None) -> INT_PARAM: + """Create a new int parameter. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + Compatible param representation. + """ + return self._create_param(eq, min_val, max_val) # type: ignore + + def _create_param(self, eq=None, min_val=None, max_val=None): + """Create a new parameter. + + Args: + eq: The exact value that must be matched for a record to be + returned. Pass None if no equality filter should be applied. + Error thrown if min_val or max_val also provided. May also be + a dictionary representing an ORDS query. + min_val: The minimum allowed value, inclusive. Pass None if no + minimum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + max_val: The maximum allowed value, inclusive. Pass None if no + maximum value filter should be applied. Defaults to None. Error + thrown if eq also provided. + + Returns: + Compatible param representation.
+ """ + eq_given = eq is not None + min_val_given = min_val is not None + max_val_given = max_val is not None + + if eq_given and (min_val_given or max_val_given): + raise RuntimeError('Cannot query with both eq and min/max val.') + + if eq_given: + return eq + elif min_val_given or max_val_given: + return [min_val, max_val] + else: + return None diff --git a/afscgap/client.py b/afscgap/client.py index 9312d5bf..40fcbf96 100644 --- a/afscgap/client.py +++ b/afscgap/client.py @@ -555,21 +555,33 @@ def get_date_time(self) -> str: """ return self._date_time - def get_latitude_dd(self) -> float: + def get_latitude(self, units: str = 'dd') -> float: """Get the field labeled as latitude_dd in the API. + Args: + units: The units to return this value in. Only dd (decimal degrees) + is supported. Defaults to dd. + Returns: Latitude in decimal degrees associated with the haul. """ - return self._latitude_dd + return afscgap.model.assert_float_present( + afscgap.convert.convert_degrees(self._latitude_dd, units) + ) - def get_longitude_dd(self) -> float: + def get_longitude(self, units: str = 'dd') -> float: """Get the field labeled as longitude_dd in the API. + Args: + units: The units to return this value in. Only dd (decimal degrees) + is supported. Defaults to dd. + Returns: Longitude in decimal degrees associated with the haul. """ - return self._longitude_dd + return afscgap.model.assert_float_present( + afscgap.convert.convert_degrees(self._longitude_dd, units) + ) def get_species_code(self) -> float: """Get the field labeled as species_code in the API. @@ -606,68 +618,58 @@ def get_taxon_confidence(self) -> str: """ return self._taxon_confidence - def get_cpue_kgha_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_kgha in the API. - - Returns: - Catch weight divided by net area (kg / hectares) if available. See - metadata. None if could not interpret as a float.
- """ - return self._cpue_kgha - - def get_cpue_kgkm2_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_kgkm2 in the API. - - Returns: - Catch weight divided by net area (kg / km^2) if available. See - metadata. None if could not interpret as a float. - """ - return self._cpue_kgkm2 - - def get_cpue_kg1000km2_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_kg1000km2 in the API. - - Returns: - Catch weight divided by net area (kg / km^2 * 1000) if available. - See metadata. None if could not interpret as a float. - """ - return self._cpue_kg1000km2 + def get_cpue_weight_maybe(self, units: str = 'kg/ha') -> OPT_FLOAT: + """Get a field labeled as cpue_* in the API. - def get_cpue_noha_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_noha in the API. + Args: + units: The desired units for the catch per unit effort. Options: + kg/ha, kg/km2, kg1000/km2. Defaults to kg/ha. Returns: - Catch number divided by net sweep area if available (count / - hectares). See metadata. None if could not interpret as a float. + Catch weight divided by net area (in given units) if available. See + metadata. None if could not interpret as a float. If an inferred + zero catch record, will be zero. """ - return self._cpue_noha + return { + 'kg/ha': self._cpue_kgha, + 'kg/km2': self._cpue_kgkm2, + 'kg1000/km2': self._cpue_kg1000km2 + }[units] - def get_cpue_nokm2_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_nokm2 in the API. + def get_cpue_count_maybe(self, units: str = 'count/ha') -> OPT_FLOAT: + """Get the field labeled as cpue_* in the API. - Returns: - Catch number divided by net sweep area if available (count / km^2). - See metadata. None if could not interpret as a float. - """ - return self._cpue_nokm2 + Get the catch per unit effort from the record with one of the following + units: count/ha, count/km2, count1000/km2. - def get_cpue_no1000km2_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_no1000km2 in the API.
+ Args: + units: The desired units for the catch per unit effort. Options: + count/ha, count/km2, and count1000/km2. Defaults to count/ha. Returns: - Catch number divided by net sweep area if available (count / km^2 * - 1000). See metadata. None if could not interpret as a float. + Catch number divided by net sweep area (in given units) if available. + See metadata. None if could not interpret as a float. If an inferred + zero catch record, will be zero. """ - return self._cpue_no1000km2 + return { + 'count/ha': self._cpue_noha, + 'count/km2': self._cpue_nokm2, + 'count1000/km2': self._cpue_no1000km2 + }[units] - def get_weight_kg_maybe(self) -> OPT_FLOAT: + def get_weight_maybe(self, units: str = 'kg') -> OPT_FLOAT: """Get the field labeled as weight_kg in the API. + Args: + units: The units in which the weight should be returned. Options are + g, kg for grams and kilograms respectively. Defaults to kg. + Returns: - Taxon weight (kg) if available. See metadata. None if could not - interpret as a float. + Taxon weight if available. See metadata. None if could not + interpret as a float. If an inferred zero catch record, will be + zero. """ - return self._weight_kg + return afscgap.convert.convert_weight(self._weight_kg, units) def get_count_maybe(self) -> OPT_FLOAT: """Get the field labeled as count in the API. @@ -678,87 +680,157 @@ def get_count_maybe(self) -> OPT_FLOAT: """ return self._count - def get_bottom_temperature_c_maybe(self) -> OPT_FLOAT: + def get_bottom_temperature_maybe(self, units: str = 'c') -> OPT_FLOAT: """Get the field labeled as bottom_temperature_c in the API. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. + Returns: - Bottom temperature associated with observation if available in - Celsius. None if not given or could not interpret as a float. + Bottom temperature associated with observation / inference if + available in desired units.
None if not given or could not interpret + as a float. """ - return self._bottom_temperature_c + return afscgap.convert.convert_temperature( + self._bottom_temperature_c, + units + ) - def get_surface_temperature_c_maybe(self) -> OPT_FLOAT: + def get_surface_temperature_maybe(self, units: str = 'c') -> OPT_FLOAT: """Get the field labeled as surface_temperature_c in the API. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. + Returns: - Surface temperature associated with observation if available in - Celsius. None if not given or could not interpret as a float. + Surface temperature associated with observation / inference if + available. None if not given or could not interpret as a float. """ - return self._surface_temperature_c + return afscgap.convert.convert_temperature( + self._surface_temperature_c, + units + ) - def get_depth_m(self) -> float: + def get_depth(self, units: str = 'm') -> float: """Get the field labeled as depth_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Depth of the bottom in meters. + Depth of the bottom. """ - return self._depth_m + return afscgap.model.assert_float_present( + afscgap.convert.convert_distance(self._depth_m, units) + ) - def get_distance_fished_km(self) -> float: + def get_distance_fished(self, units: str = 'm') -> float: """Get the field labeled as distance_fished_km in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Distance of the net fished as km. + Distance of the net fished.
""" - return self._distance_fished_km + return afscgap.model.assert_float_present( + afscgap.convert.convert_distance( + self._distance_fished_km * 1000, + units + ) + ) - def get_net_width_m_maybe(self) -> OPT_FLOAT: + def get_net_width_maybe(self, units: str = 'm') -> OPT_FLOAT: """Get the field labeled as net_width_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Distance of the net fished as m or None if not given. + Distance of the net fished or None if not given. """ - return self._net_width_m + return afscgap.convert.convert_distance( + self._net_width_m, + units + ) - def get_net_height_m_maybe(self) -> OPT_FLOAT: + def get_net_height_maybe(self, units: str = 'm') -> OPT_FLOAT: """Get the field labeled as net_height_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Height of the net fished as m or None if not given. + Height of the net fished or None if not given. """ - return self._net_height_m + return afscgap.convert.convert_distance( + self._net_height_m, + units + ) - def get_net_width_m(self) -> float: + def get_net_width(self, units: str = 'm') -> float: """Get the field labeled as net_width_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Distance of the net fished as m after asserting it is given. + Distance of the net fished after asserting it is given. """ - return afscgap.model.assert_float_present(self._net_width_m) + return afscgap.model.assert_float_present( + self.get_net_width_maybe(units=units) + ) - def get_net_height_m(self) -> float: + def get_net_height(self, units: str = 'm') -> float: """Get the field labeled as net_height_m in the API. 
+ Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Height of the net fished as m after asserting it is given. + Height of the net fished after asserting it is given. """ - return afscgap.model.assert_float_present(self._net_height_m) + return afscgap.model.assert_float_present( + self.get_net_height_maybe(units=units) + ) - def get_area_swept_ha(self) -> float: + def get_area_swept(self, units: str = 'ha') -> float: """Get the field labeled as area_swept_ha in the API. + Args: + units: The units in which the area should be returned. Options: + ha, m2, km2. Defaults to ha. + Returns: - Area covered by the net while fishing in hectares. + Area covered by the net while fishing in desired units. """ - return self._area_swept_ha + return afscgap.model.assert_float_present( + afscgap.convert.convert_area(self._area_swept_ha, units) + ) - def get_duration_hr(self) -> float: + def get_duration(self, units: str = 'hr') -> float: """Get the field labeled as duration_hr in the API. + Args: + units: The units in which the duration should be returned. Options: + day, hr, min. Defaults to hr. + Returns: - Duration of the haul as number of hours. + Duration of the haul. """ - return self._duration_hr + return afscgap.model.assert_float_present( + afscgap.convert.convert_time(self._duration_hr, units) + ) def get_tsn(self) -> int: """Get the field labeled as tsn in the API. @@ -793,48 +865,32 @@ def get_ak_survey_id_maybe(self) -> OPT_INT: """ return self._ak_survey_id - def get_cpue_kgha(self) -> float: + def get_cpue_weight(self, units: str = 'kg/ha') -> float: """Get the value of field cpue_kgha with validity assert. - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Catch weight divided by net area (kg / hectares) if available. See - metadata. 
- """ - return afscgap.model.assert_float_present(self._cpue_kgha) - - def get_cpue_kgkm2(self) -> float: - """Get the value of field cpue_kgkm2 with validity assert. + Args: + units: The desired units for the catch per unit effort. Options: + kg/ha, kg/km2, kg1000/km2. Defaults to kg/ha. Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: - Catch weight divided by net area (kg / km^2) if available. See + Catch weight divided by net area (kg / hectares) if available. See metadata. """ - return afscgap.model.assert_float_present(self._cpue_kgkm2) - - def get_cpue_kg1000km2(self) -> float: - """Get the value of field cpue_kg1000km2 with validity assert. - - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Catch weight divided by net area (kg / km^2 * 1000) if available. - See metadata. - """ - return afscgap.model.assert_float_present(self._cpue_kg1000km2) + return afscgap.model.assert_float_present( + self.get_cpue_weight_maybe(units=units) + ) - def get_cpue_noha(self) -> float: + def get_cpue_count(self, units: str = 'count/ha') -> float: """Get the value of field cpue_noha with validity assert. + Args: + units: The desired units for the catch per unit effort. Options: + count/ha, count/km2, and count1000/km2. Defaults to count/ha. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. @@ -843,37 +899,17 @@ def get_cpue_noha(self) -> float: Catch number divided by net sweep area if available (count / hectares). See metadata. """ - return afscgap.model.assert_float_present(self._cpue_noha) - - def get_cpue_nokm2(self) -> float: - """Get the value of field cpue_nokm2 with validity assert. - - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. 
- - Returns: - Catch number divided by net sweep area if available (count / km^2). - See metadata. - """ - return afscgap.model.assert_float_present(self._cpue_nokm2) - - def get_cpue_no1000km2(self) -> float: - """Get the value of field cpue_no1000km2 with validity assert. - - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Catch number divided by net sweep area if available (count / km^2 * - 1000). See metadata. - """ - return afscgap.model.assert_float_present(self._cpue_no1000km2) + return afscgap.model.assert_float_present( + self.get_cpue_count_maybe(units=units) + ) - def get_weight_kg(self) -> float: + def get_weight(self, units: str = 'kg') -> float: """Get the value of field weight_kg with validity assert. + Args: + units: The units in which the weight should be returned. Options are + g, kg for grams and kilograms respectively. Defaults to kg. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. @@ -881,7 +917,9 @@ def get_weight_kg(self) -> float: Returns: Taxon weight (kg) if available. See metadata. """ - return afscgap.model.assert_float_present(self._weight_kg) + return afscgap.model.assert_float_present( + self.get_weight_maybe(units=units) + ) def get_count(self) -> float: """Get the value of field count with validity assert. @@ -895,31 +933,45 @@ def get_count(self) -> float: """ return afscgap.model.assert_float_present(self._count) - def get_bottom_temperature_c(self) -> float: + def get_bottom_temperature(self, units: str = 'c') -> float: """Get the value of field bottom_temperature_c with validity assert. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected.
Returns: - Bottom temperature associated with observation if available in - Celsius. + Bottom temperature associated with observation / inference if + available, in the requested units. """ - return afscgap.model.assert_float_present(self._bottom_temperature_c) + return afscgap.model.assert_float_present( + self.get_bottom_temperature_maybe(units=units) + ) - def get_surface_temperature_c(self) -> float: + def get_surface_temperature(self, units='c') -> float: """Get the value of field surface_temperature_c with validity assert. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: - Surface temperature associated with observation if available in - Celsius. None if not + Surface temperature associated with observation / inference if + available, in the requested units. """ - return afscgap.model.assert_float_present(self._surface_temperature_c) + return afscgap.model.assert_float_present( + self.get_surface_temperature_maybe(units=units) + ) def is_complete(self) -> bool: """Determine if this record has all of its values filled in. @@ -948,51 +1000,6 @@ def is_complete(self) -> bool: return all_fields_present and has_valid_date_time - def to_dict(self) -> dict: - """Serialize this Record to a dictionary form. - - Returns: - Dictionary with field names matching those found in the API results - with incomplete records having some values as None. 
- """ - return { - 'year': self._year, - 'srvy': self._srvy, - 'survey': self._survey, - 'survey_id': self._survey_id, - 'cruise': self._cruise, - 'haul': self._haul, - 'stratum': self._stratum, - 'station': self._station, - 'vessel_name': self._vessel_name, - 'vessel_id': self._vessel_id, - 'date_time': self._date_time, - 'latitude_dd': self._latitude_dd, - 'longitude_dd': self._longitude_dd, - 'species_code': self._species_code, - 'common_name': self._common_name, - 'scientific_name': self._scientific_name, - 'taxon_confidence': self._taxon_confidence, - 'cpue_kgha': self._cpue_kgha, - 'cpue_kgkm2': self._cpue_kgkm2, - 'cpue_kg1000km2': self._cpue_kg1000km2, - 'cpue_noha': self._cpue_noha, - 'cpue_nokm2': self._cpue_nokm2, - 'cpue_no1000km2': self._cpue_no1000km2, - 'weight_kg': self._weight_kg, - 'count': self._count, - 'bottom_temperature_c': self._bottom_temperature_c, - 'surface_temperature_c': self._surface_temperature_c, - 'depth_m': self._depth_m, - 'distance_fished_km': self._distance_fished_km, - 'net_width_m': self._net_width_m, - 'net_height_m': self._net_height_m, - 'area_swept_ha': self._area_swept_ha, - 'duration_hr': self._duration_hr, - 'tsn': self._tsn, - 'ak_survey_id': self._ak_survey_id, - } - def parse_record(target: dict) -> afscgap.model.Record: """Parse a record from a returned item dictionary. 
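Reviewer note: the getters changed in this file all follow one pattern. A `*_maybe(units=...)` method returns an optional, unit-converted value, and the strict getter wraps it in `afscgap.model.assert_float_present`. The following is a minimal, self-contained sketch of that pattern, not the library's implementation. `ExampleRecord` is a hypothetical class, and the local `assert_float_present` and `TEMPERATURE_CONVERTERS` are stand-ins written for illustration, since the body of the real helper is not shown in this diff.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for afscgap.model.assert_float_present; the real
# helper's body is not part of this diff, so this is an assumption.
def assert_float_present(target: Optional[float]) -> float:
    assert target is not None
    return target

# Illustrative converter table keyed by unit name. Inputs are in Celsius.
TEMPERATURE_CONVERTERS: Dict[str, Callable[[float], float]] = {
    'c': lambda x: x,
    'f': lambda x: x * 9 / 5 + 32,
}

class ExampleRecord:
    """Hypothetical record holding one optional field in Celsius."""

    def __init__(self, bottom_temperature_c: Optional[float]):
        self._bottom_temperature_c = bottom_temperature_c

    def get_bottom_temperature_maybe(self, units: str = 'c') -> Optional[float]:
        # Pass None through untouched, otherwise convert from Celsius.
        if self._bottom_temperature_c is None:
            return None
        return TEMPERATURE_CONVERTERS[units](self._bottom_temperature_c)

    def get_bottom_temperature(self, units: str = 'c') -> float:
        # The strict getter delegates to the optional one, then asserts.
        return assert_float_present(
            self.get_bottom_temperature_maybe(units=units)
        )

record = ExampleRecord(10.0)
print(record.get_bottom_temperature(units='f'))  # 50.0
```

Keeping the conversion in the `*_maybe` variant means the strict getter needs no unit logic of its own, and a missing value surfaces as an `AssertionError` in one place.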
diff --git a/afscgap/convert.py b/afscgap/convert.py index aaaf1f43..cba66415 100644 --- a/afscgap/convert.py +++ b/afscgap/convert.py @@ -9,6 +9,8 @@ """ import re +from afscgap.typesdef import FLOAT_PARAM +from afscgap.typesdef import OPT_FLOAT from afscgap.typesdef import STR_PARAM DATE_REGEX = re.compile('(?P\\d{2})\\/(?P\\d{2})\\/' + \ @@ -20,6 +22,60 @@ '(?P\\d{2})') ISO_8601_TEMPLATE = '%s-%s-%sT%s:%s:%s' +AREA_CONVERTERS = { + 'ha': lambda x: x, + 'm2': lambda x: x * 10000, + 'km2': lambda x: x * 0.01 +} + +AREA_UNCONVERTERS = { + 'ha': lambda x: x, + 'm2': lambda x: x / 10000, + 'km2': lambda x: x / 0.01 +} + +DISTANCE_CONVERTERS = { + 'm': lambda x: x, + 'km': lambda x: x / 1000 +} + +DISTANCE_UNCONVERTERS = { + 'm': lambda x: x, + 'km': lambda x: x * 1000 +} + +TEMPERATURE_CONVERTERS = { + 'c': lambda x: x, + 'f': lambda x: x * 9 / 5 + 32 +} + +TEMPERATURE_UNCONVERTERS = { + 'c': lambda x: x, + 'f': lambda x: (x - 32) * 5 / 9 +} + +TIME_CONVERTERS = { + 'day': lambda x: x / 24, + 'hr': lambda x: x, + 'min': lambda x: x * 60 +} + +TIME_UNCONVERTERS = { + 'day': lambda x: x * 24, + 'hr': lambda x: x, + 'min': lambda x: x / 60 +} + +WEIGHT_CONVERTERS = { + 'g': lambda x: x * 1000, + 'kg': lambda x: x +} + +WEIGHT_UNCONVERTERS = { + 'g': lambda x: x / 1000, + 'kg': lambda x: x +} + def convert_from_iso8601(target: STR_PARAM) -> STR_PARAM: """Convert strings from ISO 8601 format to API format. @@ -112,3 +168,208 @@ def is_iso8601(target: str) -> bool: True if it matches the expected format and false otherwise. """ return ISO_8601_REGEX.match(target) is not None + + +def convert_area(target: OPT_FLOAT, units: str) -> OPT_FLOAT: + """Convert an area. + + Args: + target: The value to convert in hectares. + units: Desired units. + + Returns: + The converted value. Note that, if target is None, will return None. 
+ """ + if target is None: + return None + + return AREA_CONVERTERS[units](target) + + +def unconvert_area(target: FLOAT_PARAM, units: str) -> FLOAT_PARAM: + """Standardize an area to the API-native units (hectare). + + Args: + target: The value to convert, expressed in the given units. + units: The units of target. + + Returns: + The converted value. Note that, if target is None, will return None. + """ + if target is None: + return None + + if isinstance(target, dict): + return target + + return AREA_UNCONVERTERS[units](target) + + +def convert_degrees(target: OPT_FLOAT, units: str) -> OPT_FLOAT: + """Convert targets from degrees to other units. + + Args: + target: The value to convert which may be None. + units: Desired units. + + Returns: + The same value input after asserting that units are dd, the only + supported units. + """ + assert units == 'dd' + return target + + +def unconvert_degrees(target: FLOAT_PARAM, units: str) -> FLOAT_PARAM: + """Standardize a degree to the API-native units (degrees). + + Args: + target: The value to convert which may be None. + units: The units of target. + + Returns: + The same value input after asserting that units are dd, the only + supported units. + """ + assert units == 'dd' + return target + + +def convert_distance(target: OPT_FLOAT, units: str) -> OPT_FLOAT: + """Convert a linear distance. + + Args: + target: The value to convert in meters. + units: Desired units. + + Returns: + The converted value. Note that, if target is None, will return None. + """ + if target is None: + return None + + return DISTANCE_CONVERTERS[units](target) + + +def unconvert_distance(target: FLOAT_PARAM, units: str) -> FLOAT_PARAM: + """Convert a linear distance to the API-native units (meters). + + Args: + target: The value to convert, expressed in the given units. + units: The units of target. + + Returns: + The converted value. Note that, if target is None, will return None. 
+ """ + if target is None: + return None + + if isinstance(target, dict): + return target + + return DISTANCE_UNCONVERTERS[units](target) + + +def convert_temperature(target: OPT_FLOAT, units: str) -> OPT_FLOAT: + """Convert a temperature. + + Args: + target: The value to convert in Celsius. + units: Desired units. + + Returns: + The converted value. Note that, if target is None, will return None. + """ + if target is None: + return None + + return TEMPERATURE_CONVERTERS[units](target) + + +def unconvert_temperature(target: FLOAT_PARAM, units: str) -> FLOAT_PARAM: + """Convert a temperature to the API-native units (Celsius). + + Args: + target: The value to convert, expressed in the given units. + units: The units of target. + + Returns: + The converted value. Note that, if target is None, will return None. + """ + if target is None: + return None + + if isinstance(target, dict): + return target + + return TEMPERATURE_UNCONVERTERS[units](target) + + +def convert_time(target: OPT_FLOAT, units: str) -> OPT_FLOAT: + """Convert a time. + + Args: + target: The value to convert in hours. + units: Desired units. + + Returns: + The converted value. Note that, if target is None, will return None. + """ + if target is None: + return None + + return TIME_CONVERTERS[units](target) + + +def unconvert_time(target: FLOAT_PARAM, units: str) -> FLOAT_PARAM: + """Convert a time to the API-native units (hours). + + Args: + target: The value to convert, expressed in the given units. + units: The units of target. + + Returns: + The converted value. Note that, if target is None, will return None. + """ + if target is None: + return None + + if isinstance(target, dict): + return target + + return TIME_UNCONVERTERS[units](target) + + +def convert_weight(target: OPT_FLOAT, units: str) -> OPT_FLOAT: + """Convert a weight. + + Args: + target: The value to convert in kilograms. + units: Desired units. + + Returns: + The converted value. Note that, if target is None, will return None. 
+ """ + if target is None: + return None + + return WEIGHT_CONVERTERS[units](target) + + +def unconvert_weight(target: FLOAT_PARAM, units: str) -> FLOAT_PARAM: + """Convert a weight to the API-native units (kilograms). + + Args: + target: The value to convert, expressed in the given units. + units: The units of target. + + Returns: + The converted value. Note that, if target is None, will return None. + """ + if target is None: + return None + + if isinstance(target, dict): + return target + + return WEIGHT_UNCONVERTERS[units](target) diff --git a/afscgap/inference.py b/afscgap/inference.py index 87c0dbb3..0415ac9e 100644 --- a/afscgap/inference.py +++ b/afscgap/inference.py @@ -604,21 +604,39 @@ def get_date_time(self) -> str: """ return self._haul.get_date_time() - def get_latitude_dd(self) -> float: + def get_latitude(self, units: str = 'dd') -> float: """Get the field labeled as latitude_dd in the API. + Args: + units: The units to return this value in. Only dd (decimal + degrees) is supported. Defaults to dd. + Returns: Latitude in decimal degrees associated with the haul. """ - return self._haul.get_latitude_dd() + return afscgap.model.assert_float_present( + afscgap.convert.convert_degrees( + self._haul.get_latitude_dd(), + units + ) + ) - def get_longitude_dd(self) -> float: + def get_longitude(self, units: str = 'dd') -> float: """Get the field labeled as longitude_dd in the API. + Args: + units: The units to return this value in. Only dd (decimal + degrees) is supported. Defaults to dd. + Returns: Longitude in decimal degrees associated with the haul. """ - return self._haul.get_longitude_dd() + return afscgap.model.assert_float_present( + afscgap.convert.convert_degrees( + self._haul.get_longitude_dd(), + units + ) + ) def get_species_code(self) -> float: """Get the field labeled as species_code in the API. 
@@ -654,59 +672,48 @@ def get_taxon_confidence(self) -> str: """ return 'Unassessed' - def get_cpue_kgha_maybe(self) -> OPT_FLOAT: - """Get catch weight divided by net area (kg / hectares). + def get_cpue_weight_maybe(self, units: str = 'kg/ha') -> OPT_FLOAT: + """Get a field labeled as cpue_* in the API. - Returns: - Always returns 0. - """ - return 0 - - def get_cpue_kgkm2_maybe(self) -> OPT_FLOAT: - """Get catch weight divided by net area (kg / km^2). - - Returns: - Always returns 0. - """ - return 0 - - def get_cpue_kg1000km2_maybe(self) -> OPT_FLOAT: - """Get catch weight divided by net area (kg / km^2 * 1000). + Args: + units: The desired units for the catch per unit effort. Options: + kg/ha, kg/km2, kg1000/km2. Defaults to kg/ha. Returns: - Always returns 0. + Catch weight divided by net area (in given units) if available. See + metadata. None if could not interpret as a float. If an inferred + zero catch record, will be zero. """ return 0 - def get_cpue_noha_maybe(self) -> OPT_FLOAT: - """Get catch number divided by net sweep area. + def get_cpue_count_maybe(self, units: str = 'count/ha') -> OPT_FLOAT: + """Get the field labeled as cpue_* in the API. - Returns: - Always returns 0. - """ - return 0 + Get the catch per unit effort from the record with one of the following + units: count/ha, count/km2, count1000/km2. - def get_cpue_nokm2_maybe(self) -> OPT_FLOAT: - """Get catch number divided by net sweep area. + Args: + units: The desired units for the catch per unit effort. Options: + count/ha, count/km2, and count1000/km2. Defaults to count/ha. Returns: - Always returns 0. + Catch number divided by net sweep area (in given units) if available. See + metadata. None if could not interpret as a float. If an inferred + zero catch record, will be zero. """ return 0 - def get_cpue_no1000km2_maybe(self) -> OPT_FLOAT: - """Get catch number divided by net sweep area. + def get_weight_maybe(self, units='kg') -> OPT_FLOAT: + """Get the field labeled as weight_kg in the API. 
- Returns: - Always returns 0. - """ - return 0 - - def get_weight_kg_maybe(self) -> OPT_FLOAT: - """Get taxon weight (kg). + Args: + units: The units in which the weight should be returned. Options are + g, kg for grams and kilograms respectively. Defaults to kg. Returns: - Always returns 0. + Taxon weight if available. See metadata. None if could not + interpret as a float. If an inferred zero catch record, will be + zero. """ return 0 @@ -718,87 +725,160 @@ def get_count_maybe(self) -> OPT_FLOAT: """ return 0 - def get_bottom_temperature_c_maybe(self) -> OPT_FLOAT: + def get_bottom_temperature_maybe(self, units: str = 'c') -> OPT_FLOAT: """Get the field labeled as bottom_temperature_c in the API. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. + Returns: - Bottom temperature associated with haul if available in - Celsius. None if not given or could not interpret as a float. + Bottom temperature associated with observation / inference if + available in desired units. None if not given or could not interpret + as a float. """ - return self._haul.get_bottom_temperature_c_maybe() + return afscgap.convert.convert_temperature( + self._haul.get_bottom_temperature_c_maybe(), + units + ) - def get_surface_temperature_c_maybe(self) -> OPT_FLOAT: + def get_surface_temperature_maybe(self, units: str = 'c') -> OPT_FLOAT: """Get the field labeled as surface_temperature_c in the API. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. + Returns: - Surface temperature associated with haul if available in - Celsius. None if not given or could not interpret as a float. + Surface temperature associated with observation / inference if + available. None if not given or could not interpret as a float. 
""" - return self._haul.get_surface_temperature_c_maybe() + return afscgap.convert.convert_temperature( + self._haul.get_surface_temperature_c_maybe(), + units + ) - def get_depth_m(self) -> float: + def get_depth(self, units: str = 'm') -> float: """Get the field labeled as depth_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Depth of the bottom in meters. + Depth of the bottom. """ - return self._haul.get_depth_m() + return afscgap.model.assert_float_present( + afscgap.convert.convert_distance(self._haul.get_depth_m(), units) + ) - def get_distance_fished_km(self) -> float: + def get_distance_fished(self, units: str = 'm') -> float: """Get the field labeled as distance_fished_km in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Distance of the net fished as km. + Distance of the net fished. """ - return self._haul.get_distance_fished_km() + return afscgap.model.assert_float_present( + afscgap.convert.convert_distance( + self._haul.get_distance_fished_km() * 1000, + units + ) + ) - def get_net_width_m_maybe(self) -> OPT_FLOAT: + def get_net_width_maybe(self, units: str = 'm') -> OPT_FLOAT: """Get the field labeled as net_width_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Distance of the net fished as m or None if not given. + Width of the net fished or None if not given. """ - return self._haul.get_net_width_m_maybe() + return afscgap.convert.convert_distance( + self._haul.get_net_width_m_maybe(), + units + ) - def get_net_height_m_maybe(self) -> OPT_FLOAT: + def get_net_height_maybe(self, units: str = 'm') -> OPT_FLOAT: """Get the field labeled as net_height_m in the API. 
+ Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Height of the net fished as m or None if not given. + Height of the net fished or None if not given. """ - return self._haul.get_net_height_m_maybe() + return afscgap.convert.convert_distance( + self._haul.get_net_height_m_maybe(), + units + ) - def get_net_width_m(self) -> float: + def get_net_width(self, units: str = 'm') -> float: """Get the field labeled as net_width_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Distance of the net fished as m after asserting it is given. + Distance of the net fished after asserting it is given. """ - return self._haul.get_net_width_m() + return afscgap.model.assert_float_present( + self.get_net_width_maybe(units=units) + ) - def get_net_height_m(self) -> float: + def get_net_height(self, units: str = 'm') -> float: """Get the field labeled as net_height_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Height of the net fished as m after asserting it is given. + Height of the net fished after asserting it is given. """ - return self._haul.get_net_height_m() + return afscgap.model.assert_float_present( + self.get_net_height_maybe(units=units) + ) - def get_area_swept_ha(self) -> float: + def get_area_swept(self, units: str = 'ha') -> float: """Get the field labeled as area_swept_ha in the API. + Args: + units: The units in which the area should be returned. Options: + ha, m2, km2. Defaults to ha. + Returns: - Area covered by the net while fishing in hectares. + Area covered by the net while fishing in desired units. 
""" - return self._haul.get_area_swept_ha() + return afscgap.model.assert_float_present( + afscgap.convert.convert_area( + self._haul.get_area_swept_ha(), + units + ) + ) - def get_duration_hr(self) -> float: + def get_duration(self, units: str = 'hr') -> float: """Get the field labeled as duration_hr in the API. + Args: + units: The units in which the duration should be returned. Options: + day, hr, min. Defaults to hr. + Returns: - Duration of the haul as number of hours. + Duration of the haul. """ - return self._haul.get_duration_hr() + return afscgap.model.assert_float_present( + afscgap.convert.convert_time(self._haul.get_duration_hr(), units) + ) def get_tsn(self) -> int: """Get taxonomic information system species code. @@ -832,87 +912,53 @@ def get_ak_survey_id_maybe(self) -> OPT_INT: """ return self._ak_survey_id - def get_cpue_kgha(self) -> float: + def get_cpue_weight(self, units: str = 'kg/ha') -> float: """Get the value of field cpue_kgha with validity assert. - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Always returns 0. - """ - return 0 - - def get_cpue_kgkm2(self) -> float: - """Get the value of field cpue_kgkm2 with validity assert. - - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Always returns 0 - """ - return 0 - - def get_cpue_kg1000km2(self) -> float: - """Get the value of field cpue_kg1000km2 with validity assert. + Args: + units: The desired units for the catch per unit effort. Options: + kg/ha, kg/km2, kg1000/km2. Defaults to kg/ha. Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: - Always returns 0 + Catch weight divided by net area (kg / hectares) if available. See + metadata. Always returns 0. 
""" return 0 - def get_cpue_noha(self) -> float: + def get_cpue_count(self, units: str = 'count/ha') -> float: """Get the value of field cpue_noha with validity assert. - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Always returns 0 - """ - return 0 - - def get_cpue_nokm2(self) -> float: - """Get the value of field cpue_nokm2 with validity assert. - - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Always returns 0 - """ - return 0 - - def get_cpue_no1000km2(self) -> float: - """Get the value of field cpue_no1000km2 with validity assert. + Args: + units: The desired units for the catch per unit effort. Options: + count/ha, count/km2, and count1000/km2. Defaults to count/ha. Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: - Always returns 0 + Catch number divided by net sweep area if available (count / + hectares). See metadata. Always returns 0. """ return 0 - def get_weight_kg(self) -> float: + def get_weight(self, units: str = 'kg') -> float: """Get the value of field weight_kg with validity assert. + Args: + units: The units in which the weight should be returned. Options are + g, kg for grams and kilograms respectively. Deafults to kg. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: - Always returns 0 + Taxon weight (kg) if available. See metadata. Always returns 0. """ return 0 @@ -928,31 +974,45 @@ def get_count(self) -> float: """ return 0 - def get_bottom_temperature_c(self) -> float: + def get_bottom_temperature(self, units='c') -> float: """Get the value of field bottom_temperature_c with validity assert. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celcius and Fahrenheit respectively. + Defaults to c. 
+ Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: - Bottom temperature associated with observation if available in - Celsius. + Bottom temperature associated with observation / inferrence if + available. """ - return self._haul.get_bottom_temperature_c() + return afscgap.model.assert_float_present( + self.get_bottom_temperature_maybe(units=units) + ) - def get_surface_temperature_c(self) -> float: + def get_surface_temperature(self, units='c') -> float: """Get the value of field surface_temperature_c with validity assert. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celcius and Fahrenheit respectively. + Defaults to c. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: - Surface temperature associated with observation if available in - Celsius. None if not + Surface temperature associated with observation / inferrence if + available. """ - return self._haul.get_surface_temperature_c() + return afscgap.model.assert_float_present( + self.get_surface_temperature_maybe(units=units) + ) def is_complete(self) -> bool: """Determine if this record has all of its values filled in. @@ -965,51 +1025,6 @@ def is_complete(self) -> bool: ak_survey_id_given = self._ak_survey_id is not None return tsn_given and ak_survey_id_given and self._haul.is_complete() - def to_dict(self) -> dict: - """Serialize this Record to a dictionary form. - - Returns: - Dictionary with field names matching those found in the API results - with incomplete records having some values as None. 
- """ - return { - 'year': self.get_year(), - 'srvy': self.get_srvy(), - 'survey': self.get_survey(), - 'survey_id': self.get_survey_id(), - 'cruise': self.get_cruise(), - 'haul': self.get_haul(), - 'stratum': self.get_stratum(), - 'station': self.get_station(), - 'vessel_name': self.get_vessel_name(), - 'vessel_id': self.get_vessel_id(), - 'date_time': self.get_date_time(), - 'latitude_dd': self.get_latitude_dd(), - 'longitude_dd': self.get_longitude_dd(), - 'species_code': self.get_species_code(), - 'common_name': self.get_common_name(), - 'scientific_name': self.get_scientific_name(), - 'taxon_confidence': self.get_taxon_confidence(), - 'cpue_kgha': self.get_cpue_kgha(), - 'cpue_kgkm2': self.get_cpue_kgkm2(), - 'cpue_kg1000km2': self.get_cpue_kg1000km2(), - 'cpue_noha': self.get_cpue_noha(), - 'cpue_nokm2': self.get_cpue_nokm2(), - 'cpue_no1000km2': self.get_cpue_no1000km2(), - 'weight_kg': self.get_weight_kg(), - 'count': self.get_count(), - 'bottom_temperature_c': self.get_bottom_temperature_c_maybe(), - 'surface_temperature_c': self.get_surface_temperature_c_maybe(), - 'depth_m': self.get_depth_m(), - 'distance_fished_km': self.get_distance_fished_km(), - 'net_width_m': self.get_net_width_m(), - 'net_height_m': self.get_net_height_m(), - 'area_swept_ha': self.get_area_swept_ha(), - 'duration_hr': self.get_duration_hr(), - 'tsn': self.get_tsn_maybe(), - 'ak_survey_id': self.get_ak_survey_id() - } - def parse_haul(target: dict) -> afscgap.model.Haul: """Parse a Haul record from a row in the community Hauls flat file. diff --git a/afscgap/model.py b/afscgap/model.py index 7bcb5cc4..f5778c8a 100644 --- a/afscgap/model.py +++ b/afscgap/model.py @@ -175,17 +175,25 @@ def get_date_time(self) -> str: """ raise NotImplementedError('Use implementor.') - def get_latitude_dd(self) -> float: + def get_latitude(self, units: str = 'dd') -> float: """Get the field labeled as latitude_dd in the API. + Args: + units: The units to return this value in. 
Only supported is dd for + degrees. Deafults to dd. + Returns: Latitude in decimal degrees associated with the haul. """ raise NotImplementedError('Use implementor.') - def get_longitude_dd(self) -> float: + def get_longitude(self, units: str = 'dd') -> float: """Get the field labeled as longitude_dd in the API. + Args: + units: The units to return this value in. Only supported is dd for + degrees. Deafults to dd. + Returns: Longitude in decimal degrees associated with the haul. """ @@ -227,71 +235,46 @@ def get_taxon_confidence(self) -> str: """ raise NotImplementedError('Use implementor.') - def get_cpue_kgha_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_kgha in the API. - - Returns: - Catch weight divided by net area (kg / hectares) if available. See - metadata. None if could not interpret as a float. If an inferred - zero catch record, will be zero. - """ - raise NotImplementedError('Use implementor.') + def get_cpue_weight_maybe(self, units: str = 'kg/ha') -> OPT_FLOAT: + """Get a field labeled as cpue_* in the API. - def get_cpue_kgkm2_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_kgkm2 in the API. + Args: + units: The desired units for the catch per unit effort. Options: + kg/ha, kg/km2, kg1000/km2. Defaults to kg/ha. Returns: - Catch weight divided by net area (kg / km^2) if available. See + Catch weight divided by net area (in given units) if available. See metadata. None if could not interpret as a float. If an inferred zero catch record, will be zero. """ raise NotImplementedError('Use implementor.') - def get_cpue_kg1000km2_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_kg1000km2 in the API. + def get_cpue_count_maybe(self, units: str = 'count/ha') -> OPT_FLOAT: + """Get the field labeled as cpue_* in the API. - Returns: - Catch weight divided by net area (kg / km^2 * 1000) if available. - See metadata. None if could not interpret as a float. If an inferred - zero catch record, will be zero. 
- """ - raise NotImplementedError('Use implementor.') - - def get_cpue_noha_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_noha in the API. - - Returns: - Catch number divided by net sweep area if available (count / - hectares). See metadata. None if could not interpret as a float. If - an inferred zero catch record, will be zero. - """ - raise NotImplementedError('Use implementor.') + Get the catch per unit effort from the record with one of the following + units: count/ha, count/km2, count1000/km2. - def get_cpue_nokm2_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_nokm2 in the API. + Args: + units: The desired units for the catch per unit effort. Options: + count/ha, count/km2, and count1000/km2. Defaults to count/ha. Returns: - Catch number divided by net sweep area if available (count / km^2). - See metadata. None if could not interpret as a float. If an inferred + Catch number divided by net sweep area (in given units) if available. See + metadata. None if could not interpret as a float. If an inferred zero catch record, will be zero. """ raise NotImplementedError('Use implementor.') - def get_cpue_no1000km2_maybe(self) -> OPT_FLOAT: - """Get the field labeled as cpue_no1000km2 in the API. - - Returns: - Catch number divided by net sweep area if available (count / km^2 * - 1000). See metadata. None if could not interpret as a float. If an - inferred zero catch record, will be zero. - """ - raise NotImplementedError('Use implementor.') - - def get_weight_kg_maybe(self) -> OPT_FLOAT: + def get_weight_maybe(self, units: str = 'kg') -> OPT_FLOAT: """Get the field labeled as weight_kg in the API. + Args: + units: The units in which the weight should be returned. Options are + g, kg for grams and kilograms respectively. Defaults to kg. + Returns: - Taxon weight (kg) if available. See metadata. None if could not + Taxon weight if available. See metadata. None if could not interpret as a float. If an inferred zero catch record, will be zero. 
""" @@ -307,87 +290,128 @@ def get_count_maybe(self) -> OPT_FLOAT: """ raise NotImplementedError('Use implementor.') - def get_bottom_temperature_c_maybe(self) -> OPT_FLOAT: + def get_bottom_temperature_maybe(self, units: str = 'c') -> OPT_FLOAT: """Get the field labeled as bottom_temperature_c in the API. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celcius and Fahrenheit respectively. + Defaults to c. + Returns: Bottom temperature associated with observation / inferrence if - available in Celsius. None if not given or could not interpret as a - float. + available in desired units. None if not given or could not interpret + as a float. """ raise NotImplementedError('Use implementor.') - def get_surface_temperature_c_maybe(self) -> OPT_FLOAT: + def get_surface_temperature_maybe(self, units: str = 'c') -> OPT_FLOAT: """Get the field labeled as surface_temperature_c in the API. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celcius and Fahrenheit respectively. + Defaults to c. + Returns: Surface temperature associated with observation / inferrence if - available in Celsius. None if not given or could not interpret as a - float. + available. None if not given or could not interpret as a float. """ raise NotImplementedError('Use implementor.') - def get_depth_m(self) -> float: + def get_depth(self, units: str = 'm') -> float: """Get the field labeled as depth_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Depth of the bottom in meters. + Depth of the bottom. """ raise NotImplementedError('Use implementor.') - def get_distance_fished_km(self) -> float: + def get_distance_fished(self, units: str = 'm') -> float: """Get the field labeled as distance_fished_km in the API. + Args: + units: The units in which the distance should be returned. 
Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Distance of the net fished as km. + Distance of the net fished. """ raise NotImplementedError('Use implementor.') - def get_net_width_m(self) -> float: + def get_net_width(self, units: str = 'm') -> float: """Get the field labeled as net_width_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Distance of the net fished as m after asserting it is given. + Distance of the net fished after asserting it is given. """ raise NotImplementedError('Use implementor.') - def get_net_height_m(self) -> float: + def get_net_height(self, units: str = 'm') -> float: """Get the field labeled as net_height_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Height of the net fished as m after asserting it is given. + Height of the net fished after asserting it is given. """ raise NotImplementedError('Use implementor.') - def get_net_width_m_maybe(self) -> OPT_FLOAT: + def get_net_width_maybe(self, units: str = 'm') -> OPT_FLOAT: """Get the field labeled as net_width_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Distance of the net fished as m or None if not given. + Distance of the net fished or None if not given. """ raise NotImplementedError('Use implementor.') - def get_net_height_m_maybe(self) -> OPT_FLOAT: + def get_net_height_maybe(self, units: str = 'm') -> OPT_FLOAT: """Get the field labeled as net_height_m in the API. + Args: + units: The units in which the distance should be returned. Options: + m or km for meters and kilometers respectively. Defaults to m. + Returns: - Height of the net fished as m or None if not given. 
+ Height of the net fished or None if not given. """ raise NotImplementedError('Use implementor.') - def get_area_swept_ha(self) -> float: + def get_area_swept(self, units: str = 'ha') -> float: """Get the field labeled as area_swept_ha in the API. + Args: + units: The units in which the area should be returned. Options: + ha, m2, km2. Defaults to ha. + Returns: - Area covered by the net while fishing in hectares. + Area covered by the net while fishing in desired units. """ raise NotImplementedError('Use implementor.') - def get_duration_hr(self) -> float: + def get_duration(self, units: str = 'hr') -> float: """Get the field labeled as duration_hr in the API. + Args: + units: The units in which the duration should be returned. Options: + day, hr, min. Defaults to hr. + Returns: - Duration of the haul as number of hours. + Duration of the haul. """ raise NotImplementedError('Use implementor.') @@ -424,48 +448,30 @@ def get_ak_survey_id_maybe(self) -> OPT_INT: """ raise NotImplementedError('Use implementor.') - def get_cpue_kgha(self) -> float: + def get_cpue_weight(self, units: str = 'kg/ha') -> float: """Get the value of field cpue_kgha with validity assert. - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Catch weight divided by net area (kg / hectares) if available. See - metadata. Will be zero if a zero catch record. - """ - raise NotImplementedError('Use implementor.') - - def get_cpue_kgkm2(self) -> float: - """Get the value of field cpue_kgkm2 with validity assert. + Args: + units: The desired units for the catch per unit effort. Options: + kg/ha, kg/km2, kg1000/km2. Defaults to kg/ha. Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: - Catch weight divided by net area (kg / km^2) if available. See + Catch weight divided by net area (kg / hectares) if available. See metadata. Will be zero if a zero catch record. 
""" raise NotImplementedError('Use implementor.') - def get_cpue_kg1000km2(self) -> float: - """Get the value of field cpue_kg1000km2 with validity assert. - - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Catch weight divided by net area (kg / km^2 * 1000) if available. - See metadata. Will be zero if a zero catch record. - """ - raise NotImplementedError('Use implementor.') - - def get_cpue_noha(self) -> float: + def get_cpue_count(self, units: str = 'count/ha') -> float: """Get the value of field cpue_noha with validity assert. + Args: + units: The desired units for the catch per unit effort. Options: + count/ha, count/km2, and count1000/km2. Defaults to count/ha. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. @@ -476,35 +482,13 @@ def get_cpue_noha(self) -> float: """ raise NotImplementedError('Use implementor.') - def get_cpue_nokm2(self) -> float: - """Get the value of field cpue_nokm2 with validity assert. - - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Catch number divided by net sweep area if available (count / km^2). - See metadata. Will be zero if a zero catch record. - """ - raise NotImplementedError('Use implementor.') - - def get_cpue_no1000km2(self) -> float: - """Get the value of field cpue_no1000km2 with validity assert. - - Raises: - AssertionError: Raised if this field was not given by the API or - could not be parsed as expected. - - Returns: - Catch number divided by net sweep area if available (count / km^2 * - 1000). See metadata. Will be zero if a zero catch record. - """ - raise NotImplementedError('Use implementor.') - - def get_weight_kg(self) -> float: + def get_weight(self, units: str = 'kg') -> float: """Get the value of field weight_kg with validity assert. 
+ Args: + units: The units in which the weight should be returned. Options are + g, kg for grams and kilograms respectively. Defaults to kg. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. @@ -528,29 +512,39 @@ def get_count(self) -> float: """ raise NotImplementedError('Use implementor.') - def get_bottom_temperature_c(self) -> float: + def get_bottom_temperature(self, units='c') -> float: """Get the value of field bottom_temperature_c with validity assert. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: Bottom temperature associated with observation / inferrence if - available in Celsius. + available. """ raise NotImplementedError('Use implementor.') - def get_surface_temperature_c(self) -> float: + def get_surface_temperature(self, units='c') -> float: """Get the value of field surface_temperature_c with validity assert. + Args: + units: The units in which the temperature should be returned. + Options: c or f for Celsius and Fahrenheit respectively. + Defaults to c. + Raises: AssertionError: Raised if this field was not given by the API or could not be parsed as expected. Returns: Surface temperature associated with observation / inferrence if - available in Celsius. None if not + available. """ raise NotImplementedError('Use implementor.') @@ -566,11 +560,54 @@ def is_complete(self) -> bool: def to_dict(self) -> dict: """Serialize this Record to a dictionary form. + Serialize this Record to a dictionary form, including only field names + that would be found on records returned from the API service. + Returns: Dictionary with field names matching those found in the API results with incomplete records having some values as None.
""" - raise NotImplementedError('Use implementor.') + return { + 'year': self.get_year(), + 'srvy': self.get_srvy(), + 'survey': self.get_survey(), + 'survey_id': self.get_survey_id(), + 'cruise': self.get_cruise(), + 'haul': self.get_haul(), + 'stratum': self.get_stratum(), + 'station': self.get_station(), + 'vessel_name': self.get_vessel_name(), + 'vessel_id': self.get_vessel_id(), + 'date_time': self.get_date_time(), + 'latitude_dd': self.get_latitude(), + 'longitude_dd': self.get_longitude(), + 'species_code': self.get_species_code(), + 'common_name': self.get_common_name(), + 'scientific_name': self.get_scientific_name(), + 'taxon_confidence': self.get_taxon_confidence(), + 'cpue_kgha': self.get_cpue_weight_maybe(units='kg/ha'), + 'cpue_kgkm2': self.get_cpue_weight_maybe(units='kg/km2'), + 'cpue_kg1000km2': self.get_cpue_weight_maybe(units='kg1000/km2'), + 'cpue_noha': self.get_cpue_count_maybe(units='count/ha'), + 'cpue_nokm2': self.get_cpue_count_maybe(units='count/km2'), + 'cpue_no1000km2': self.get_cpue_count_maybe(units='count1000/km2'), + 'weight_kg': self.get_weight(units='kg'), + 'count': self.get_count(), + 'bottom_temperature_c': self.get_bottom_temperature_maybe( + units='c' + ), + 'surface_temperature_c': self.get_surface_temperature_maybe( + units='c' + ), + 'depth_m': self.get_depth(units='m'), + 'distance_fished_km': self.get_distance_fished(units='km'), + 'net_width_m': self.get_net_width(units='m'), + 'net_height_m': self.get_net_height(units='m'), + 'area_swept_ha': self.get_area_swept(units='ha'), + 'duration_hr': self.get_duration(units='hr'), + 'tsn': self.get_tsn_maybe(), + 'ak_survey_id': self.get_ak_survey_id() + } class Haul(HaulKeyable): diff --git a/afscgap/query_util.py b/afscgap/query_util.py index d0d4b52e..375d7413 100644 --- a/afscgap/query_util.py +++ b/afscgap/query_util.py @@ -20,7 +20,7 @@ def interpret_value(value): if value is None: return None - if not isinstance(value, tuple): + if not (isinstance(value, tuple) or 
isinstance(value, list)): return value if len(value) != 2: diff --git a/afscgap/test/test_client.py b/afscgap/test/test_client.py index 5aa2bb00..e3d883dc 100644 --- a/afscgap/test/test_client.py +++ b/afscgap/test/test_client.py @@ -140,7 +140,7 @@ def test_parse_record(self): self.assertEqual(parsed.get_srvy(), 'GOA') self.assertAlmostEquals(parsed.get_vessel_id(), 148) self.assertAlmostEquals( - parsed.get_cpue_kg1000km2(), + parsed.get_cpue_weight(units='kg1000/km2'), 40.132273, places=5 ) diff --git a/afscgap/test/test_convert.py b/afscgap/test/test_convert.py index 335cda13..c4a24fb4 100644 --- a/afscgap/test/test_convert.py +++ b/afscgap/test/test_convert.py @@ -56,3 +56,75 @@ def test_is_iso8601_success(self): def test_is_iso8601_fail(self): self.assertFalse(afscgap.convert.is_iso8601('07/16/2021 11:30:22')) + + def test_convert_area(self): + self.assertAlmostEquals( + afscgap.convert.convert_area(123, 'm2'), + 1230000 + ) + + def test_unconvert_area(self): + self.assertAlmostEquals( + afscgap.convert.unconvert_area(1.23, 'km2'), + 123 + ) + + def test_convert_degrees(self): + self.assertAlmostEquals( + afscgap.convert.convert_degrees(123, 'dd'), + 123 + ) + + def test_unconvert_degrees(self): + self.assertAlmostEquals( + afscgap.convert.unconvert_degrees(123, 'dd'), + 123 + ) + + def test_convert_distance(self): + self.assertAlmostEquals( + afscgap.convert.convert_distance(123, 'km'), + 0.123 + ) + + def test_unconvert_distance(self): + self.assertAlmostEquals( + afscgap.convert.unconvert_distance(123, 'km'), + 123000 + ) + + def test_convert_temperature(self): + self.assertAlmostEquals( + afscgap.convert.convert_temperature(12, 'f'), + 53.6 + ) + + def test_unconvert_temperature(self): + self.assertAlmostEquals( + afscgap.convert.unconvert_temperature(12, 'f'), + -11.111111111 + ) + + def test_convert_time(self): + self.assertAlmostEquals( + afscgap.convert.convert_time(123, 'day'), + 5.125 + ) + + def test_unconvert_time(self): + self.assertAlmostEquals( 
+ afscgap.convert.unconvert_time(123, 'min'), + 2.05 + ) + + def test_convert_weight(self): + self.assertAlmostEquals( + afscgap.convert.convert_weight(12, 'g'), + 12000 + ) + + def test_unconvert_weight(self): + self.assertAlmostEquals( + afscgap.convert.unconvert_weight(12, 'g'), + 0.012 + ) diff --git a/afscgap/test/test_entry.py b/afscgap/test/test_entry.py index dad6ba3c..db569fd9 100644 --- a/afscgap/test/test_entry.py +++ b/afscgap/test/test_entry.py @@ -9,6 +9,7 @@ """ import csv import unittest +import unittest.mock import afscgap.test.test_tools @@ -29,57 +30,76 @@ def setUp(self): ) def test_query_primitive(self): - result = afscgap.query( - year=2021, - srvy='BSS', - requestor=self._mock_requestor - ) - results = list(result) + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.filter_srvy(eq='BSS') + results = list(query.execute()) self.assertEquals(len(results), 20) def test_query_dict(self): - result = afscgap.query( - year=2021, - latitude_dd={'$gte': 56.99, '$lte': 57.04}, - longitude_dd={'$gte': -143.96, '$lte': -144.01}, - requestor=self._mock_requestor - ) - results = list(result) + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.filter_latitude(eq={'$gte': 56.99, '$lte': 57.04}) + query.filter_longitude(eq={'$gte': -143.96, '$lte': -144.01}) + results = list(query.execute()) + self.assertEquals(len(results), 20) + + def test_query_keywords(self): + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.filter_latitude(min_val=56.99, max_val=57.04) + query.filter_longitude(min_val=-143.96, max_val=-144.01) + results = list(query.execute()) self.assertEquals(len(results), 20) def test_query_dict_filter_incomplete(self): - result = afscgap.query( - year=2021, - latitude_dd={'$gte': 56.99, '$lte': 57.04}, - longitude_dd={'$gte': -143.96, '$lte': -144.01}, - requestor=self._mock_requestor, - filter_incomplete=True - ) - results = 
list(result) + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.filter_latitude(eq={'$gte': 56.99, '$lte': 57.04}) + query.filter_longitude(eq={'$gte': -143.96, '$lte': -144.01}) + query.set_filter_incomplete(True) + results = list(query.execute()) self.assertEquals(len(results), 19) def test_query_dict_invalid_filter_incomplete(self): - result = afscgap.query( - year=2021, - latitude_dd={'$gte': 56.99, '$lte': 57.04}, - longitude_dd={'$gte': -143.96, '$lte': -144.01}, - requestor=self._mock_requestor, - filter_incomplete=True - ) + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.filter_latitude(eq={'$gte': 56.99, '$lte': 57.04}) + query.filter_longitude(eq={'$gte': -143.96, '$lte': -144.01}) + query.set_filter_incomplete(True) + result = query.execute() list(result) self.assertEquals(result.get_invalid().qsize(), 2) def test_query_dict_invalid_keep_incomplete(self): - result = afscgap.query( - year=2021, - latitude_dd={'$gte': 56.99, '$lte': 57.04}, - longitude_dd={'$gte': -143.96, '$lte': -144.01}, - requestor=self._mock_requestor, - filter_incomplete=False - ) + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.filter_latitude(eq={'$gte': 56.99, '$lte': 57.04}) + query.filter_longitude(eq={'$gte': -143.96, '$lte': -144.01}) + query.set_filter_incomplete(False) + result = query.execute() list(result) self.assertEquals(result.get_invalid().qsize(), 1) + def test_create_param_eq(self): + query = afscgap.Query(requestor=self._mock_requestor) + self.assertEqual(query._create_param(2021), 2021) + + def test_create_param_lt(self): + query = afscgap.Query(requestor=self._mock_requestor) + self.assertEqual(query._create_param(min_val=2021), [2021, None]) + + def test_create_param_gt(self): + query = afscgap.Query(requestor=self._mock_requestor) + self.assertEqual(query._create_param(max_val=2021), [None, 2021]) + + def 
test_create_param_between(self): + query = afscgap.Query(requestor=self._mock_requestor) + self.assertEqual( + query._create_param(min_val=2020, max_val=2021), + [2020, 2021] + ) + class EntryPointInferenceTests(unittest.TestCase): @@ -99,45 +119,47 @@ def test_query_keep_presence_only(self): side_effect=[self._api_result] ) - result = afscgap.query( - year=2021, - requestor=mock_requestor, - presence_only=True - ) + query = afscgap.Query(requestor=mock_requestor) + query.filter_year(eq=2021) + query.set_presence_only(True) + result = query.execute() + results = list(result) self.assertEquals(len(results), 2) def test_query_primitive(self): warn_function = unittest.mock.MagicMock() - result = afscgap.query( - year=2021, - requestor=self._mock_requestor, - presence_only=False, - warn_function=warn_function - ) + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.set_presence_only(False) + query.set_warn_function(warn_function) + result = query.execute() + results = list(result) self.assertEquals(len(results), 4) def test_query_primitive_warning(self): warn_function = unittest.mock.MagicMock() - result = afscgap.query( - year=2021, - requestor=self._mock_requestor, - presence_only=False, - warn_function=warn_function - ) + + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.set_presence_only(False) + query.set_warn_function(warn_function) + result = query.execute() + warn_function.assert_called() def test_query_primitive_suppress(self): warn_function = unittest.mock.MagicMock() - result = afscgap.query( - year=2021, - requestor=self._mock_requestor, - presence_only=False, - suppress_large_warning=True, - warn_function=warn_function - ) + + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.set_presence_only(False) + query.set_warn_function(warn_function) + query.set_suppress_large_warning(True) + result = query.execute() + 
warn_function.assert_not_called() def test_prefetch(self): @@ -147,12 +169,13 @@ def test_prefetch(self): hauls = [afscgap.inference.parse_haul(row) for row in rows] warn_function = unittest.mock.MagicMock() - result = afscgap.query( - year=2021, - requestor=self._mock_requestor, - presence_only=False, - suppress_large_warning=True, - warn_function=warn_function, - hauls_prefetch=hauls - ) + + query = afscgap.Query(requestor=self._mock_requestor) + query.filter_year(eq=2021) + query.set_presence_only(False) + query.set_warn_function(warn_function) + query.set_suppress_large_warning(True) + query.set_hauls_prefetch(hauls) + result = query.execute() + self._mock_requestor.assert_not_called() diff --git a/afscgap/test/test_inference.py b/afscgap/test/test_inference.py index 0d04fa59..658fe3da 100644 --- a/afscgap/test/test_inference.py +++ b/afscgap/test/test_inference.py @@ -89,7 +89,7 @@ def test_decorator_infer_override(self): 34, 567 ) - self.assertEquals(decorated.get_cpue_kgha_maybe(), 0) + self.assertEquals(decorated.get_cpue_weight_maybe(units='kg/ha'), 0) def test_decorator_infer_given(self): data_2021 = filter(lambda x: x.get_year() == 2021, self._all_hauls_data) @@ -102,7 +102,7 @@ def test_decorator_infer_given(self): 34, 567 ) - self.assertEquals(decorated.get_cpue_kgha_maybe(), 0) + self.assertEquals(decorated.get_cpue_weight_maybe(units='kg/ha'), 0) def test_decorator_dict(self): data_2021 = filter(lambda x: x.get_year() == 2021, self._all_hauls_data) diff --git a/afscgapviz/build_database.py b/afscgapviz/build_database.py index 3b90ccad..30671897 100644 --- a/afscgapviz/build_database.py +++ b/afscgapviz/build_database.py @@ -99,19 +99,21 @@ def simplify_record(target: afscgap.model.Record, Returns: Record with information needed for the web application. 
""" - latitude = target.get_latitude_dd() - longitude = target.get_longitude_dd() + latitude = target.get_latitude(units='dd') + longitude = target.get_longitude(units='dd') geohash = geolib.geohash.encode(latitude, longitude, geohash_size) - surface_temperature_c_maybe = target.get_surface_temperature_c_maybe() + surface_temperature_c_maybe = target.get_surface_temperature_maybe( + units='c' + ) if surface_temperature_c_maybe is None: return None - bottom_temperature_c_maybe = target.get_bottom_temperature_c_maybe() + bottom_temperature_c_maybe = target.get_bottom_temperature_maybe(units='c') if bottom_temperature_c_maybe is None: return None - weight_kg_maybe = target.get_weight_kg_maybe() + weight_kg_maybe = target.get_weight_maybe(units='kg') if weight_kg_maybe is None: return None @@ -129,7 +131,7 @@ def simplify_record(target: afscgap.model.Record, bottom_temperature_c_maybe, weight_kg_maybe, count_maybe, - target.get_area_swept_ha(), + target.get_area_swept(units='ha'), 1 ) @@ -179,11 +181,12 @@ def get_year(survey: str, year: int, geohash_size: int) -> SIMPLIFIED_RECORDS: Returns: Iterable over SimplifiedRecords generated / downloaded. 
""" - results = afscgap.query( - srvy=survey, - year=year, - presence_only=False - ) + query = afscgap.Query() + query.filter_srvy(eq=survey) + query.filter_year(eq=year) + query.set_presence_only(False) + + results = query.execute() simplified_records_maybe = map( lambda x: simplify_record(x, geohash_size), diff --git a/afscgapviz/templates/example.py_html b/afscgapviz/templates/example.py_html index 6534a58e..58f1e248 100644 --- a/afscgapviz/templates/example.py_html +++ b/afscgapviz/templates/example.py_html @@ -1,17 +1,16 @@ import afscgap -results_1 = afscgap.query( - survey='{{ survey }}', - year={{ year }}, - {% if species %}scientific_name='{{ species }}'{% else %}common_name='{{ common_name }}'{% endif %}, - presence_only=False -) - +query_1 = afscgap.Query() +query_1.filter_srvy(eq='{{ survey }}') +query_1.filter_year(eq={{ year }}) +{% if species %}query_1.filter_scientific_name(eq='{{ species }}'){% else %}query_1.filter_common_name(eq='{{ common_name }}'){% endif %} +query_1.set_presence_only(False) +results_1 = query_1.execute() {% if is_comparison %} -results_2 = afscgap.query( - survey='{{ survey }}', - year={{ other_year }}, - {% if other_species %}scientific_name='{{ other_species }}'{% else %}common_name='{{ other_common_name }}'{% endif %}, - presence_only=False -) +query_2 = afscgap.Query() +query_2.filter_srvy(eq='{{ survey }}') +query_2.filter_year(eq={{ other_year }}) +{% if other_species %}query_2.filter_scientific_name(eq='{{ other_species }}'){% else %}query_2.filter_common_name(eq='{{ other_common_name }}'){% endif %} +query_2.set_presence_only(False) +results_2 = query_2.execute() {% endif %} diff --git a/inst/architecture.drawio b/inst/architecture.drawio new file mode 100644 index 00000000..245e9fb8 --- /dev/null +++ b/inst/architecture.drawio @@ -0,0 +1,178 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/inst/library.png b/inst/library.png index ec68fcb6..538edbab 100644 Binary files a/inst/library.png and b/inst/library.png differ diff --git a/inst/paper.bib b/inst/paper.bib index c62f63c2..6b8682b0 100644 --- a/inst/paper.bib +++ b/inst/paper.bib @@ -38,7 +38,7 @@ @misc{diagrams url={https://github.com/jgraph/drawio}, journal={GitHub}, publisher={GitHub, Inc}, - uthor={JGraph Ltd and draw.io AG}, + author={JGraph Ltd and draw.io AG}, year={2023} } diff --git a/inst/paper.md b/inst/paper.md index c791db9b..f208366d 100644 --- a/inst/paper.md +++ b/inst/paper.md @@ -34,16 +34,16 @@ bibliography: paper.bib --- # Summary -The Resource Assessment and Conservation Engineering Division of the National Oceanic and Atmospheric Administration's Alaska Fisheries Science Center (NOAA AFSC RACE) runs the [Groundfish Assessment Program](https://www.fisheries.noaa.gov/contact/groundfish-assessment-program) (GAP) which produces longitudinal catch data [@afscgap]. These "hauls" report where marine species are found during bottom trawl surveys and in what quantities, empowering ocean health research and fisheries management [@example]. Increasing accessibility of these important data (RACEBASE/FOSS) through tools for individuals of diverse programming experience, Pyafscgap.org offers not just easier access to the dataset's REST API through query compilation but provides both memory-efficient algorithms for "zero-catch inference" and interactive visual analytics tools [@inport]. Altogether, this toolset supports investigatory tasks not easily executable using the API service alone and, leveraging game and information design, offers these data to a broader audience. 
+The Resource Assessment and Conservation Engineering Division of the National Oceanic and Atmospheric Administration's Alaska Fisheries Science Center (NOAA AFSC RACE) runs the [Groundfish Assessment Program](https://www.fisheries.noaa.gov/contact/groundfish-assessment-program) (GAP) which produces longitudinal catch data [@afscgap]. These "hauls" report where marine species are found during bottom trawl surveys and in what quantities, empowering ocean health research and fisheries management [@example]. Increasing accessibility of these important data through tools for individuals of diverse programming experience, Pyafscgap.org offers not just easier access to the dataset's REST API through query compilation but provides both memory-efficient algorithms for "zero-catch inference" and interactive visual analytics tools [@inport]. Altogether, this toolset supports investigatory tasks not easily executable using the API service alone and, leveraging game and information design, offers these data to a broader audience. # Statement of need -Pyafscgap.org reduces barriers for use of GAP data, offering open source solutions for addressing the dataset's presence-only nature, use of proprietary databases, and size / complexity [@inport]. +Pyafscgap.org reduces barriers for use of GAP data, offering open source solutions for addressing the dataset's use of proprietary technology, presence-only nature, and size / complexity [@inport]. ## Developer accessibility -First, working with these data requires knowledge of tools ouside the Python "standard toolchest" like the closed-source Oracle REST Data Services (ORDS) [@ords]. The `afscgap` package offers easier developer access to the official REST service with automated pagination, query language compilation, and documented types. Together, these tools enable Python developers to use familiar patterns to interact with these data like type checking, standard documentation, and other Python data-related libraries. 
+Working with these data requires knowledge of tools outside the Python "standard toolchest" like the closed-source Oracle REST Data Services (ORDS) [@ords]. The `afscgap` package offers easier open source-based developer access to the official REST service with automated pagination, Python to ORDS syntax compilation, and documented types. Together, these tools enable Python developers to use familiar patterns to interact with these data like type checking, standard documentation, and other Python data-related libraries. ## Common analysis -That being said, access to the API alone cannot support some investigations as the API provides "presence-only" data [@inport]. Many types of analysis (like geohash-aggregated species catch per unit effort) require information not just about where a species was present but also where it was not [@geohash]. In other words, while the presence-only dataset may provide a total weight or count, the total area swept for a region may not necessarily be easily available but required [@notebook]. The `afscgap` Python package can, with memory efficiency, algorithmically infer those needed "zero catch" records for developers. +Access to the API alone cannot support some investigations as the API provides "presence-only" data [@inport]. Many types of analysis require information not just about where a species was present but also where it was not. For example, consider geohash-aggregated species catch per unit effort: while the presence-only dataset may provide a total weight or count, the total area swept for a region may not necessarily be easily available but required [@geohash; @notebook]. The `afscgap` Python package can, with memory efficiency, algorithmically infer those needed "zero catch" records.
## General public accessibility Though the `afscgap` Python package makes GAP catch data more accessible to developers, the size and complexity of this dataset requires non-trivial engineering for comparative analysis between species, years, and/or geographic areas [@notebook]. Without deep developer experience, it can be difficult to get started. To address a broader audience, this project also offers a no-code visualization tool sitting on top of `afscgap` to begin investigations with CSV and Python code export as a bridge to further analysis. @@ -52,12 +52,10 @@ Though the `afscgap` Python package makes GAP catch data more accessible to deve This project aims to improve accessibility of GAP catch data, democratizing developer access and offering inclusive approachable tools to kickstart analysis. ## Lazy querying facade -Starting with the `afscgap` library, lazy "generator iterables" increase accessibility by encapsulating logic for memory-efficient pagination and "data munging" behind a familiar interface [@lazy]. Furthermore, to support zero catch data, decorators adapt diverse structures to common interfaces, freeing client code from understanding the full complexities of `afscgap`'s type system [@decorators]. +The `afscgap` library manages significant data structure complexity to offer a simple familiar interface to Python developers. First, lazy "generator iterables" increase accessibility by encapsulating logic for memory-efficient pagination and "data munging" behind Python-standard iterators [@lazy]. Furthermore, to support zero catch data, decorators adapt diverse structures to common interfaces, offering polymorphism [@decorators]. Finally, offering a single object entry-point into the library, a "facade" approach allows the user to interact with these systems without requiring deep understanding of the library's types, a goal furthered by compilation of "standard" Python types to Oracle REST Data Service queries [@facade]. 
![Diagram of simplified afscgap operation [@diagrams].\label{fig:library}](library.png) -Finally, offering a single function entry-point into the library, this "facade" approach allows the user to interact with these systems without requiring client code to reflect deep understanding of the library's mechanics, a goal furthered by compilation of "standard" Python types to Oracle REST Data Service queries [@facade]. - ## Zero catch inference "Negative" or "zero catch" inference enables scientists to conduct a broader range of analysis. To achieve this, the package uses the following algorithm: @@ -80,15 +78,15 @@ Of course, building competency in a sophisticated interface like this presents u - **Introduction**: The player sees information about Pacific cod with pre-filled elements used to achieve that analysis gradually fading in. - **Development**: Using the mechanics introduced moments prior, the tool invites the player to change the analysis to compare different regions with temperature data. - **Twist**: Overlays on the same display are enabled, allowing the player to leverage mechanics they just exercised in a now more complex interface. - - **Conclusion**: The tool ends by giving the player an opportunity to demonstrate all of the skills acquired in a new problem. + - **Conclusion**: End with giving the player an opportunity to demonstrate all of the skills acquired in a new problem. -Finally, while this interface uses game / information design techniques to offer an accessible on-ramp to quickly learn a sophisticated interface, it also serves as a starting point for continued analysis by generating either CSV or Python code to take work into other tools. Examined via Thinking-aloud Method [@thinkaloud]. 
+While this interface uses game / information design techniques to offer an accessible on-ramp to quickly learn a sophisticated interface, it also serves as a starting point for continued analysis by generating either CSV or Python code to take work into other tools. Examined via Thinking-aloud Method [@thinkaloud]. ## Limitations Notable current limitations: - Single-threaded and non-asynchoronous. - - Visualization aggregation of hauls happens on a point due to dataset limitations which may cause some approximation in regional CPUE [@readme]. + - Due to dataset limitations, hauls are represented by points, not areas, in visualization aggregation [@readme]. # Acknowledgements Thanks to the following for feedback / testing on these components: diff --git a/inst/preview_paper.sh b/inst/preview_paper.sh index be84ee91..545d35ea 100644 --- a/inst/preview_paper.sh +++ b/inst/preview_paper.sh @@ -1,6 +1,8 @@ wget https://raw.githubusercontent.com/pandoc/lua-filters/master/author-info-blocks/author-info-blocks.lua wget https://raw.githubusercontent.com/pandoc/lua-filters/master/scholarly-metadata/scholarly-metadata.lua +sudo apt-get update + sudo apt-get install pandoc pandoc-citeproc texlive-extra-utils texlive-fonts-recommended texlive-latex-base texlive-latex-extra pandoc paper.md --bibliography=paper.bib --lua-filter=scholarly-metadata.lua --lua-filter=author-info-blocks.lua -o ../website/static/paper.pdf diff --git a/notebooks/cod.ipynb b/notebooks/cod.ipynb index 55baf9ac..e0b6de52 100644 --- a/notebooks/cod.ipynb +++ b/notebooks/cod.ipynb @@ -26,7 +26,7 @@ "\n", "## Summary / Abstract\n", "\n", - "In demonstrating the use of `afscgap` and providing a community tutorial, this notebook looks at a sharp decline in Pacific cod (Gadus macrocephalus) presence in the Gulf of Alaska before and after \"The Blob\" heating event. To facilitate this comparison, it looks at 2013 vs 2021.
Using data from [NOAA AFSC GAP](https://www.fisheries.noaa.gov/foss/f?p=215:28), this notebook finds further confirmatory evidence of that [well documented species decline](https://www.npr.org/2019/12/08/785634169/alaska-cod-fishery-closes-and-industry-braces-for-ripple-effect) with geospatial visualizations showing areas of reduced catch during bottom trawl surveys. Furthermore, this work finds that reduced stock still persists despite the warming event having \"ended\" some years prior. Altogether, this notebook joins other anlysis in warning of ecological and economic threats in the area caused by climate change." + "To demonstrate the use of the `afscgap` Python library and provide a community tutorial, this notebook looks at a sharp decline in Pacific cod (Gadus macrocephalus) presence in the Gulf of Alaska before and after \"The Blob\" heating event. To facilitate this comparison, it looks at 2013 vs 2021. Using data from [NOAA AFSC GAP](https://www.fisheries.noaa.gov/foss/f?p=215:28), this notebook finds further confirmatory evidence of that [well documented species decline](https://www.npr.org/2019/12/08/785634169/alaska-cod-fishery-closes-and-industry-braces-for-ripple-effect) with geospatial visualizations showing areas of reduced catch during bottom trawl surveys. Furthermore, this work finds that reduced stock still persists despite the warming event having \"ended\" some years prior. Altogether, this notebook joins other analysis in warning of ecological and economic threats in the area caused by climate change." ] }, { @@ -134,40 +134,97 @@ "import afscgap" ] }, + { + "cell_type": "markdown", + "id": "446610c6", + "metadata": {}, + "source": [ + "This notebook starts by building up a query with the filters to be used for both the `BEFORE_YEAR` and `AFTER_YEAR`." 
+ ] + }, { "cell_type": "code", "execution_count": 6, + "id": "d260274b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "query = afscgap.Query()\n", + "query.filter_srvy(eq='GOA')\n", + "query.filter_scientific_name(eq=SPECIES)" + ] + }, + { + "cell_type": "markdown", + "id": "75975a1b", + "metadata": {}, + "source": [ + "This notebook uses two executions: one with `BEFORE_YEAR` and one with `AFTER_YEAR`. This next snippet filters and executes for the first year." + ] + }, + { + "cell_type": "code", + "execution_count": 7, "id": "b5abf34c", "metadata": {}, "outputs": [], "source": [ - "query_before = afscgap.query(\n", - " srvy='GOA',\n", - " year=BEFORE_YEAR,\n", - " scientific_name=SPECIES\n", - ")\n", - "\n", - "query_after = afscgap.query(\n", - " srvy='GOA',\n", - " year=AFTER_YEAR,\n", - " scientific_name=SPECIES\n", - ")" + "query.filter_year(eq=BEFORE_YEAR)\n", + "results_before = query.execute()" + ] + }, + { + "cell_type": "markdown", + "id": "ee76b46c", + "metadata": {}, + "source": [ + "Next, an execution for the second year." ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 8, + "id": "b0cf526c", + "metadata": {}, + "outputs": [], + "source": [ + "query.filter_year(eq=AFTER_YEAR)\n", + "results_after = query.execute()" + ] + }, + { + "cell_type": "markdown", + "id": "7b83d34c", + "metadata": {}, + "source": [ + "Query's `execute` uses the latest provided values at the time it is called. In other words, calling `filter_year` a second time overwrites the year filter created by the first call to `filter_year`. This lets one use the same Query object to make multiple related searches without needing to re-specify all of the filters from scratch. Regardless, with these two result sets in place, the notebook can get the overall counts." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, "id": "337fbcdb", "metadata": {}, "outputs": [], "source": [ - "count_before = sum(map(lambda x: x.get_count(), query_before))\n", - "count_after = sum(map(lambda x: x.get_count(), query_after))" + "count_before = sum(map(lambda x: x.get_count(), results_before))\n", + "count_after = sum(map(lambda x: x.get_count(), results_after))" ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 10, "id": "35cb0066", "metadata": {}, "outputs": [ @@ -206,7 +263,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 11, "id": "38a30901", "metadata": {}, "outputs": [], @@ -228,25 +285,26 @@ "source": [ "#### Query\n", "\n", - "This notebook starts by making a method to execute a query for a given year. Note that this uses `presence_only=False`. This enables [zero catch record inference](https://pyafscgap.org/devdocs/afscgap/afscgap.html#absence-vs-presence-data) and is necessary in this case becuase the official API service will get \"presence-only\" data that, while much more compact, can't report on hauls when the species of interest was not found." + "This notebook starts by making a method to execute a query for a given year. Note that this uses `set_presence_only(False)`. This enables [zero catch record inference](https://pyafscgap.org/devdocs/afscgap/afscgap.html#absence-vs-presence-data) and is necessary in this case because the official API service will get \"presence-only\" data that, while much more compact, can't report on hauls when the species of interest was not found." 
] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 12, "id": "5f083c81", "metadata": {}, "outputs": [], "source": [ "def make_query(year):\n", - " query = afscgap.query(\n", - " srvy='GOA',\n", - " year=year,\n", - " scientific_name=SPECIES,\n", - " presence_only=False,\n", - " suppress_large_warning=True\n", - " )\n", - " complete_results = filter(lambda x: x.is_complete(), query)\n", + " query = afscgap.Query()\n", + " query.filter_srvy(eq='GOA')\n", + " query.filter_year(eq=year)\n", + " query.filter_scientific_name(eq=SPECIES)\n", + " query.set_presence_only(False)\n", + " query.set_suppress_large_warning(True)\n", + " results = query.execute()\n", + " \n", + " complete_results = filter(lambda x: x.is_complete(), results)\n", " simplified_results = map(simplify_record, complete_results)\n", " aggregated_results = aggregate_geohashes(simplified_results)\n", " return aggregated_results" @@ -278,22 +336,22 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 13, "id": "69f9f051", "metadata": {}, "outputs": [], "source": [ "def simplify_record(full_record):\n", - " latitude = full_record.get_latitude_dd()\n", + " latitude = full_record.get_latitude(units='dd')\n", " \n", - " longitude = full_record.get_longitude_dd()\n", + " longitude = full_record.get_longitude(units='dd')\n", " if longitude > 0:\n", " longitude = longitude * -1\n", " \n", " return {\n", " 'geohash': geolib.geohash.encode(latitude, longitude, GEOHASH_SIZE),\n", - " 'area': full_record.get_area_swept_ha(),\n", - " 'weight': full_record.get_weight_kg()\n", + " 'area': full_record.get_area_swept(units='ha'),\n", + " 'weight': full_record.get_weight(units='kg')\n", " }" ] }, @@ -307,7 +365,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 14, "id": "476f7a27", "metadata": {}, "outputs": [], @@ -343,7 +401,7 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 15, "id": "bac6fa5a", "metadata": {}, "outputs": 
[], @@ -371,7 +429,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 16, "id": "963b38a2", "metadata": {}, "outputs": [], @@ -413,7 +471,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 17, "id": "7600c7c5", "metadata": {}, "outputs": [], @@ -425,7 +483,7 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 18, "id": "8e780d5d", "metadata": {}, "outputs": [], @@ -455,7 +513,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 19, "id": "753b4d74", "metadata": {}, "outputs": [ @@ -474,7 +532,7 @@ }, { "cell_type": "code", - "execution_count": 35, + "execution_count": 20, "id": "c1352f52", "metadata": {}, "outputs": [ @@ -493,7 +551,7 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 21, "id": "be4ccd6e", "metadata": {}, "outputs": [ @@ -548,7 +606,7 @@ }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 22, "id": "eced0035", "metadata": {}, "outputs": [], @@ -558,7 +616,7 @@ }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 23, "id": "20b1dff1", "metadata": {}, "outputs": [ @@ -627,7 +685,7 @@ }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 24, "id": "c09e024f", "metadata": {}, "outputs": [], @@ -641,7 +699,7 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 25, "id": "b9af205f", "metadata": {}, "outputs": [], @@ -688,7 +746,7 @@ }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 26, "id": "2803fab3", "metadata": {}, "outputs": [], @@ -709,7 +767,7 @@ }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 27, "id": "ffeab92b", "metadata": {}, "outputs": [ @@ -755,7 +813,7 @@ }, { "cell_type": "code", - "execution_count": 43, + "execution_count": 28, "id": "7963307f", "metadata": {}, "outputs": [ @@ -801,7 +859,7 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 29, "id": "d267303a", 
"metadata": {}, "outputs": [ @@ -847,7 +905,7 @@ }, { "cell_type": "code", - "execution_count": 45, + "execution_count": 30, "id": "70c212f4", "metadata": {}, "outputs": [], @@ -857,7 +915,7 @@ }, { "cell_type": "code", - "execution_count": 46, + "execution_count": 31, "id": "5281ea41", "metadata": {}, "outputs": [ @@ -894,7 +952,38 @@ "\n", "Of course, these data are not perfect. It's possible that the same species shifted geographically outside the surveyed area or some other difference between the surveys exists to cause lower reporting. Furthermore, aggregation uses a single latitude / longitude point and it remains possible that a haul may spill out of a geohash causing some approximation in CPUE calculation. However, in combination with [research on mechanistic understanding of Pacific cod reductions due to warming](https://cdnsciencepub.com/doi/full/10.1139/cjfas-2019-0238), the case for warming itself explaining these findings remains strong.\n", "\n", - "Given their importance to human communities along with continued climate change, these results may suggest that a mechanism may exist for similar warming events in the region to inflict not just further ecological but also economic harm to the region." + "Given their importance to human communities along with continued climate change, these results may suggest that a mechanism may exist for similar warming events in the region to inflict not just further ecological but also economic harm to the region.\n", + "\n", + "All this in mind, this short exploration ends with a look at sample sizes." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2ac02c3f", + "metadata": {}, + "outputs": [], + "source": [ + "def get_frame_sizes(target):\n", + " total = target.shape[0]\n", + " zero_records = target[target['weightPerArea'] == 0].shape[0]\n", + " return {'total': total, 'zeroRecords': zero_records}\n", + "\n", + "sizes_before = get_frame_sizes(frame_before)\n", + "before_params = (BEFORE_YEAR, sizes_before['total'], sizes_before['zeroRecords'])\n", + "print('%d: %d geohashes of which %d had zero catch' % before_params)\n", + "\n", + "sizes_after = get_frame_sizes(frame_after)\n", + "after_params = (AFTER_YEAR, sizes_after['total'], sizes_after['zeroRecords'])\n", + "print('%d: %d geohashes of which %d had zero catch' % after_params)" + ] + }, + { + "cell_type": "markdown", + "id": "436cfa31", + "metadata": {}, + "source": [ + "Note that those geohashes with zero catch would not have been included without `set_presence_only(False)` which enables zero catch inference." ] }, { diff --git a/notebooks/requirements.txt b/notebooks/requirements.txt index f950c36e..ecfb1077 100644 --- a/notebooks/requirements.txt +++ b/notebooks/requirements.txt @@ -1,4 +1,4 @@ -afscgap ~= 0.0.6 +afscgap ~= 0.0.8 Cartopy ~= 0.21.1 geolib ~= 1.0.7 matplotlib ~= 3.7.0 diff --git a/pyproject.toml b/pyproject.toml index 94eb5184..da869071 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "afscgap" -version = "0.0.7" +version = "0.0.8" authors = [ { name="A Samuel Pottinger", email="sam.pottinger@berkeley.edu" }, ]