Generate unified Python/C++ docs #13846

Merged on Jan 17, 2024 (80 commits)
Showing changes from 59 commits

Commits
2ebf609
Add breathe dep
vyasr Aug 9, 2023
efdd554
Add all libcudf doc pages
vyasr Aug 9, 2023
a2a5598
Remove extraneous file
vyasr Aug 9, 2023
91db8c7
Temporarily allow building with warnings so that CI can complete and …
vyasr Aug 10, 2023
0ec51ea
Temporarily disable CI doxygen check
vyasr Aug 10, 2023
0aee382
Fix cmake format
vyasr Aug 10, 2023
fd6f9e7
Start handling more missing refs
vyasr Nov 8, 2023
3229d52
Add lexer for pseudocode
vyasr Nov 8, 2023
ec0d1ba
Add the default stream to a group and link it
vyasr Nov 8, 2023
fe34bf3
Add a few more types to ignore
vyasr Nov 9, 2023
86ca9d5
Add extra intersphinx lookup step
vyasr Nov 9, 2023
884c5b8
Add more types to ignore
vyasr Nov 9, 2023
269f7ba
Add more robust logic for parsing namespaces
vyasr Nov 9, 2023
4e25e15
Add expressions to Sphinx
vyasr Nov 10, 2023
81eb699
Ignore detail APIs for now
vyasr Nov 10, 2023
327ec95
Add io_types to Sphinx docs
vyasr Nov 10, 2023
e100ee4
Also ignore md_regex
vyasr Nov 21, 2023
740c4f7
Breathe doesn't support deprecated tag.
vyasr Nov 21, 2023
3c610ed
Add anchors for namespaces
vyasr Nov 21, 2023
fbb1396
Add md_regex page
vyasr Nov 30, 2023
2952b4b
Also search the numeric namespace
vyasr Nov 30, 2023
7383e92
Add range_window_bounds to group
vyasr Nov 30, 2023
4ba01d3
Add nvtext namespace and clean up namespace logic
vyasr Nov 30, 2023
fcc9fc3
Ignore kafka objects
vyasr Nov 30, 2023
a182b49
Make sure to use the template-stripped reftarget when searching inter…
vyasr Nov 30, 2023
5ed9a0d
Add spans to Sphinx
vyasr Nov 30, 2023
0850b1c
Ignore dlmanagedtensor for now
vyasr Nov 30, 2023
7dcf8f9
Add tdigest to Sphinx
vyasr Nov 30, 2023
2c111eb
Ignore char_utf8
vyasr Nov 30, 2023
7b690f8
Also account for std namespaced objects
vyasr Dec 1, 2023
de3e5de
Add a couple more specific names to remap
vyasr Dec 1, 2023
f0e030e
Add io::datasource
vyasr Dec 1, 2023
153a695
Repoint intersphinx to the online docs
vyasr Dec 1, 2023
6a72a57
Add missing orc types
vyasr Dec 1, 2023
b3c4157
Ignore bpe pairs impl
vyasr Dec 1, 2023
6ddaa4e
Add newly added doxygen namespaces to Sphinx
vyasr Dec 1, 2023
7ceb302
Remove unused ingroup from src file and ignore symbol instead
vyasr Dec 2, 2023
4f19a29
Ignore TypeKind
vyasr Dec 2, 2023
b224641
Fix header
vyasr Dec 2, 2023
93a432d
Add script to parse xml and fix known issues
vyasr Dec 15, 2023
6817217
Parse more precisely and remove potential SFINAE duplicates
vyasr Dec 15, 2023
9ea9c75
Remove nonexistent group from Sphinx
vyasr Dec 15, 2023
3ed3a2f
Simplify script
vyasr Dec 15, 2023
8151aad
Make checks strict again
vyasr Dec 15, 2023
da266dc
Temporarily move parsing script
vyasr Dec 15, 2023
06625bf
Moving parsing into conf.py
vyasr Dec 15, 2023
527181f
Remove outdated reference
vyasr Dec 15, 2023
59c5844
Remove ignores that are no longer necessary
vyasr Dec 15, 2023
90c63e5
Add links for dlpack
vyasr Dec 15, 2023
238a553
Remove old test changes
vyasr Dec 15, 2023
895caf8
Put back detail ignore
vyasr Dec 15, 2023
15942d4
Temporarily disable text docs for cudf
vyasr Dec 16, 2023
0663d2e
Make table compatible with text output
vyasr Dec 16, 2023
90c89c5
Optimize missing reference hook
vyasr Dec 17, 2023
5161a3a
Reenable notebooks
vyasr Dec 17, 2023
b4ccc3b
Reenable text builds
vyasr Dec 17, 2023
a90679b
Address PR feedback
vyasr Dec 18, 2023
2edd7ad
Add one more note
vyasr Dec 18, 2023
bd3a9e1
Merge remote-tracking branch 'origin/branch-24.02' into feat/unify_docs
vyasr Dec 18, 2023
42604fa
Match group layout of modules from doxygen HTML
vyasr Dec 18, 2023
5ece824
Reorganize to add in non-API pages
vyasr Dec 18, 2023
0ede73e
Require new Breathe
vyasr Dec 18, 2023
beaceaf
Fix issues with developer guide links
vyasr Dec 18, 2023
7a9581f
Merge remote-tracking branch 'origin/branch-24.02' into feat/unify_docs
vyasr Dec 19, 2023
89ce4dc
Test parallel builds
vyasr Dec 19, 2023
08797fb
Move parallelism flag to build script so that it's not hardcoded in M…
vyasr Dec 19, 2023
cf40777
More optimizations
vyasr Dec 19, 2023
6c2fa6b
Merge branch 'branch-24.02' into feat/unify_docs
vyasr Dec 19, 2023
39bfa1a
Merge branch 'branch-24.02' into feat/unify_docs
vyasr Jan 9, 2024
8ea7a61
Merge remote-tracking branch 'origin/branch-24.02' into feat/unify_docs
vyasr Jan 9, 2024
aab3e86
Fix style
vyasr Jan 9, 2024
ff12064
Merge remote-tracking branch 'origin/branch-24.02' into feat/unify_docs
vyasr Jan 11, 2024
0efd3bc
Put back doxygen HTML generation for now.
vyasr Jan 11, 2024
3f598a7
Fix typo
vyasr Jan 12, 2024
2bcb370
Merge remote-tracking branch 'origin/branch-24.02' into feat/unify_docs
vyasr Jan 16, 2024
7445e50
Merge remote-tracking branch 'origin/branch-24.02' into feat/unify_docs
vyasr Jan 17, 2024
a5bc91a
Fix one more doxygen error
vyasr Jan 17, 2024
b2dae66
Revert all changes that break the doxygen build
vyasr Jan 17, 2024
7f8a50d
Fix a typo
vyasr Jan 17, 2024
13cd4c0
Disable APIs containing tables for now due to failing text builds
vyasr Jan 17, 2024
3 changes: 0 additions & 3 deletions ci/build_docs.sh
@@ -32,10 +32,7 @@ export RAPIDS_DOCS_DIR="$(mktemp -d)"

rapids-logger "Build CPP docs"
pushd cpp/doxygen
aws s3 cp s3://rapidsai-docs/librmm/html/${RAPIDS_VERSION_NUMBER}/rmm.tag . || echo "Failed to download rmm Doxygen tag"
doxygen Doxyfile
mkdir -p "${RAPIDS_DOCS_DIR}/libcudf/html"
mv html/* "${RAPIDS_DOCS_DIR}/libcudf/html"
popd

rapids-logger "Build Python docs"
1 change: 1 addition & 0 deletions conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -12,6 +12,7 @@ dependencies:
 - benchmark==1.8.0
 - boto3>=1.21.21
 - botocore>=1.24.21
+- breathe
 - c-compiler
 - cachetools
 - clang-tools=16.0.6
1 change: 1 addition & 0 deletions conda/environments/all_cuda-120_arch-x86_64.yaml
@@ -12,6 +12,7 @@ dependencies:
 - benchmark==1.8.0
 - boto3>=1.21.21
 - botocore>=1.24.21
+- breathe
 - c-compiler
 - cachetools
 - clang-tools=16.0.6
2 changes: 1 addition & 1 deletion cpp/doxygen/Doxyfile
@@ -1145,7 +1145,7 @@ IGNORE_PREFIX =
 # If the GENERATE_HTML tag is set to YES, doxygen will generate HTML output
 # The default value is: YES.

-GENERATE_HTML = YES
+GENERATE_HTML = NO

 # The HTML_OUTPUT tag is used to specify where the HTML docs will be put. If a
 # relative path is entered the value of OUTPUT_DIRECTORY will be put in front of
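Breathe, the dependency added in this PR, reads Doxygen's XML output rather than its HTML, which is consistent with turning `GENERATE_HTML` off here. As a rough sketch, a Sphinx `conf.py` could wire the two together as shown. The project name `libcudf` and the relative XML path are illustrative assumptions, not values taken from this PR; `breathe_projects` and `breathe_default_project` are Breathe's standard settings.

```python
# Hypothetical Sphinx conf.py fragment. The project name and the XML path are
# assumptions for illustration only; they are not taken from this PR.
extensions = ["breathe"]

# Map a Breathe project name to the directory holding Doxygen's XML output.
# Doxygen only writes that directory when GENERATE_XML = YES in the Doxyfile.
breathe_projects = {"libcudf": "../../../cpp/doxygen/xml"}

# Project used by Breathe directives that do not name one explicitly.
breathe_default_project = "libcudf"
```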
5 changes: 4 additions & 1 deletion cpp/include/cudf/io/json.hpp
@@ -72,8 +72,10 @@ enum class json_recovery_mode_t {
  *
  * Parameters in PANDAS that are unavailable or in cudf:
  *
+ *
+ * +----------------------+--------------------------------------------------+
  * | Name                 | Description                                      |
- * | -------------------- | ------------------------------------------------ |
+ * +======================+==================================================+
  * | `orient`             | currently fixed-format                           |
  * | `typ`                | data is always returned as a cudf::table         |
  * | `convert_axes`       | use column functions for axes operations instead |
@@ -84,6 +86,7 @@ enum class json_recovery_mode_t {
  * | `date_unit`          | only millisecond units are supported             |
  * | `encoding`           | only ASCII-encoded data is supported             |
  * | `chunksize`          | use `byte_range_xxx` for chunking instead        |
+ * +----------------------+--------------------------------------------------+
  */
 class json_reader_options {
   source_info _source;
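The change above swaps a Markdown-style pipe table for a grid-style layout: a `-` border at the top and bottom and an `=` rule under the header. As a sketch of how that rewrite could be done mechanically, the hypothetical helper `pipe_to_grid` below reproduces the layout of the added lines. It is not part of this PR, and it assumes bare table lines (no ` * ` comment prefix) with no `|` characters inside cells.

```python
def pipe_to_grid(lines):
    """Convert Markdown pipe-table lines to the grid layout used in the diff."""
    rows, widths = [], []
    for line in lines:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Track the widest cell per column, separator row included, so the
        # output keeps the column widths of the original table.
        widths = [max(w, len(c)) for w, c in zip(widths or [0] * len(cells), cells)]
        # Drop the Markdown header separator (cells made only of '-' and ':').
        if all(c and set(c) <= set("-:") for c in cells):
            continue
        rows.append(cells)

    def border(ch):
        return "+" + "+".join(ch * (w + 2) for w in widths) + "+"

    def fmt(row):
        return "|" + "|".join(" " + c.ljust(w) + " " for c, w in zip(row, widths)) + "|"

    # Header, '=' rule, body rows, closing border (no rule between body rows,
    # mirroring the added lines in this diff).
    return [border("-"), fmt(rows[0]), border("=")] + [fmt(r) for r in rows[1:]] + [border("-")]


if __name__ == "__main__":
    table = [
        "| Name | Description |",
        "| ---- | ----------- |",
        "| `orient` | currently fixed-format |",
    ]
    print("\n".join(pipe_to_grid(table)))
```

Taking the column width from the widest cell, separator included, keeps every output line the same length, which is what the hand-written borders in the diff rely on.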
134 changes: 73 additions & 61 deletions cpp/include/cudf/strings/convert/convert_datetime.hpp
@@ -31,30 +31,33 @@ namespace strings {
  * @file
  */

+// clang-format off
 /**
  * @brief Returns a new timestamp column converting a strings column into
  * timestamps using the provided format pattern.
  *
  * The format pattern can include the following specifiers: "%Y,%y,%m,%d,%H,%I,%p,%M,%S,%f,%z"
  *
- * | Specifier | Description |
- * | :-------: | ----------- |
- * | \%d | Day of the month: 01-31 |
- * | \%m | Month of the year: 01-12 |
- * | \%y | Year without century: 00-99. [0,68] maps to [2000,2068] and [69,99] maps to [1969,1999] |
- * | \%Y | Year with century: 0001-9999 |
- * | \%H | 24-hour of the day: 00-23 |
- * | \%I | 12-hour of the day: 01-12 |
- * | \%M | Minute of the hour: 00-59 |
- * | \%S | Second of the minute: 00-59. Leap second is not supported. |
- * | \%f | 6-digit microsecond: 000000-999999 |
- * | \%z | UTC offset with format ±HHMM Example +0500 |
- * | \%j | Day of the year: 001-366 |
- * | \%p | Only 'AM', 'PM' or 'am', 'pm' are recognized |
- * | \%W | Week of the year with Monday as the first day of the week: 00-53 |
- * | \%w | Day of week: 0-6 = Sunday-Saturday |
- * | \%U | Week of the year with Sunday as the first day of the week: 00-53 |
- * | \%u | Day of week: 1-7 = Monday-Sunday |
+ * +-----------+-----------------------------------------------------------------------------------------+
+ * | Specifier | Description                                                                             |
+ * +===========+=========================================================================================+
+ * | ``%d``    | Day of the month: 01-31                                                                 |
+ * | ``%m``    | Month of the year: 01-12                                                                |
+ * | ``%y``    | Year without century: 00-99. [0,68] maps to [2000,2068] and [69,99] maps to [1969,1999] |
+ * | ``%Y``    | Year with century: 0001-9999                                                            |
+ * | ``%H``    | 24-hour of the day: 00-23                                                               |
+ * | ``%I``    | 12-hour of the day: 01-12                                                               |
+ * | ``%M``    | Minute of the hour: 00-59                                                               |
+ * | ``%S``    | Second of the minute: 00-59. Leap second is not supported.                              |
+ * | ``%f``    | 6-digit microsecond: 000000-999999                                                      |
+ * | ``%z``    | UTC offset with format ±HHMM Example +0500                                              |
+ * | ``%j``    | Day of the year: 001-366                                                                |
+ * | ``%p``    | Only 'AM', 'PM' or 'am', 'pm' are recognized                                            |
+ * | ``%W``    | Week of the year with Monday as the first day of the week: 00-53                        |
+ * | ``%w``    | Day of week: 0-6 = Sunday-Saturday                                                      |
+ * | ``%U``    | Week of the year with Sunday as the first day of the week: 00-53                        |
+ * | ``%u``    | Day of week: 1-7 = Monday-Sunday                                                        |
+ * +-----------+-----------------------------------------------------------------------------------------+
  *
  * Other specifiers are not currently supported.
  *
@@ -84,37 +87,41 @@ namespace strings {
  * @param mr Device memory resource used to allocate the returned column's device memory
  * @return New datetime column
  */
+// clang-format on
 std::unique_ptr<column> to_timestamps(
   strings_column_view const& input,
   data_type timestamp_type,
   std::string_view format,
   rmm::cuda_stream_view stream = cudf::get_default_stream(),
   rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

+// clang-format off
 /**
  * @brief Verifies the given strings column can be parsed to timestamps using the provided format
  * pattern.
  *
  * The format pattern can include the following specifiers: "%Y,%y,%m,%d,%H,%I,%p,%M,%S,%f,%z"
  *
- * | Specifier | Description |
- * | :-------: | ----------- |
- * | \%d | Day of the month: 01-31 |
- * | \%m | Month of the year: 01-12 |
- * | \%y | Year without century: 00-99. [0,68] maps to [2000,2068] and [69,99] maps to [1969,1999] |
- * | \%Y | Year with century: 0001-9999 |
- * | \%H | 24-hour of the day: 00-23 |
- * | \%I | 12-hour of the day: 01-12 |
- * | \%M | Minute of the hour: 00-59|
- * | \%S | Second of the minute: 00-59. Leap second is not supported. |
- * | \%f | 6-digit microsecond: 000000-999999 |
- * | \%z | UTC offset with format ±HHMM Example +0500 |
- * | \%j | Day of the year: 001-366 |
- * | \%p | Only 'AM', 'PM' or 'am', 'pm' are recognized |
- * | \%W | Week of the year with Monday as the first day of the week: 00-53 |
- * | \%w | Day of week: 0-6 = Sunday-Saturday |
- * | \%U | Week of the year with Sunday as the first day of the week: 00-53 |
- * | \%u | Day of week: 1-7 = Monday-Sunday |
+ * +-----------+-----------------------------------------------------------------------------------------+
+ * | Specifier | Description                                                                             |
+ * +===========+=========================================================================================+
+ * | ``%d``    | Day of the month: 01-31                                                                 |
+ * | ``%m``    | Month of the year: 01-12                                                                |
+ * | ``%y``    | Year without century: 00-99. [0,68] maps to [2000,2068] and [69,99] maps to [1969,1999] |
+ * | ``%Y``    | Year with century: 0001-9999                                                            |
+ * | ``%H``    | 24-hour of the day: 00-23                                                               |
+ * | ``%I``    | 12-hour of the day: 01-12                                                               |
+ * | ``%M``    | Minute of the hour: 00-59                                                               |
+ * | ``%S``    | Second of the minute: 00-59. Leap second is not supported.                              |
+ * | ``%f``    | 6-digit microsecond: 000000-999999                                                      |
+ * | ``%z``    | UTC offset with format ±HHMM Example +0500                                              |
+ * | ``%j``    | Day of the year: 001-366                                                                |
+ * | ``%p``    | Only 'AM', 'PM' or 'am', 'pm' are recognized                                            |
+ * | ``%W``    | Week of the year with Monday as the first day of the week: 00-53                        |
+ * | ``%w``    | Day of week: 0-6 = Sunday-Saturday                                                      |
+ * | ``%U``    | Week of the year with Sunday as the first day of the week: 00-53                        |
+ * | ``%u``    | Day of week: 1-7 = Monday-Sunday                                                        |
+ * +-----------+-----------------------------------------------------------------------------------------+
  *
  * Other specifiers are not currently supported.
  * The "%f" supports a precision value to read the numeric digits. Specify the
@@ -132,43 +139,47 @@ std::unique_ptr<column> to_timestamps(
  * @param mr Device memory resource used to allocate the returned column's device memory
  * @return New BOOL8 column
  */
+// clang-format on
 std::unique_ptr<column> is_timestamp(
   strings_column_view const& input,
   std::string_view format,
   rmm::cuda_stream_view stream = cudf::get_default_stream(),
   rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

+// clang-format off
 /**
  * @brief Returns a new strings column converting a timestamp column into
  * strings using the provided format pattern.
  *
  * The format pattern can include the following specifiers: "%Y,%y,%m,%d,%H,%I,%p,%M,%S,%f,%z,%Z"
  *
- * | Specifier | Description |
- * | :-------: | ----------- |
- * | \%d | Day of the month: 01-31 |
- * | \%m | Month of the year: 01-12 |
- * | \%y | Year without century: 00-99 |
- * | \%Y | Year with century: 0001-9999 |
- * | \%H | 24-hour of the day: 00-23 |
- * | \%I | 12-hour of the day: 01-12 |
- * | \%M | Minute of the hour: 00-59|
- * | \%S | Second of the minute: 00-59 |
- * | \%f | 6-digit microsecond: 000000-999999 |
- * | \%z | Always outputs "+0000" |
- * | \%Z | Always outputs "UTC" |
- * | \%j | Day of the year: 001-366 |
- * | \%u | ISO weekday where Monday is 1 and Sunday is 7 |
- * | \%w | Weekday where Sunday is 0 and Saturday is 6 |
- * | \%U | Week of the year with Sunday as the first day: 00-53 |
- * | \%W | Week of the year with Monday as the first day: 00-53 |
- * | \%V | Week of the year per ISO-8601 format: 01-53 |
- * | \%G | Year based on the ISO-8601 weeks: 0000-9999 |
- * | \%p | AM/PM from `timestamp_names::am_str/pm_str` |
- * | \%a | Weekday abbreviation from the `names` parameter |
- * | \%A | Weekday from the `names` parameter |
- * | \%b | Month name abbreviation from the `names` parameter |
- * | \%B | Month name from the `names` parameter |
+ * +-----------+-----------------------------------------------------------------------------------------+
+ * | Specifier | Description                                                                             |
+ * +===========+=========================================================================================+
+ * | ``%d``    | Day of the month: 01-31                                                                 |
+ * | ``%m``    | Month of the year: 01-12                                                                |
+ * | ``%y``    | Year without century: 00-99. [0,68] maps to [2000,2068] and [69,99] maps to [1969,1999] |
+ * | ``%Y``    | Year with century: 0001-9999                                                            |
+ * | ``%H``    | 24-hour of the day: 00-23                                                               |
+ * | ``%I``    | 12-hour of the day: 01-12                                                               |
+ * | ``%M``    | Minute of the hour: 00-59                                                               |
+ * | ``%S``    | Second of the minute: 00-59. Leap second is not supported.                              |
+ * | ``%f``    | 6-digit microsecond: 000000-999999                                                      |
+ * | ``%z``    | Always outputs "+0000"                                                                  |
+ * | ``%Z``    | Always outputs "UTC"                                                                    |
+ * | ``%j``    | Day of the year: 001-366                                                                |
+ * | ``%u``    | ISO weekday where Monday is 1 and Sunday is 7                                           |
+ * | ``%w``    | Weekday where Sunday is 0 and Saturday is 6                                             |
+ * | ``%U``    | Week of the year with Sunday as the first day: 00-53                                    |
+ * | ``%W``    | Week of the year with Monday as the first day: 00-53                                    |
+ * | ``%V``    | Week of the year per ISO-8601 format: 01-53                                             |
+ * | ``%G``    | Year based on the ISO-8601 weeks: 0000-9999                                             |
+ * | ``%p``    | AM/PM from `timestamp_names::am_str/pm_str`                                             |
+ * | ``%a``    | Weekday abbreviation from the `names` parameter                                         |
+ * | ``%A``    | Weekday from the `names` parameter                                                      |
+ * | ``%b``    | Month name abbreviation from the `names` parameter                                      |
+ * | ``%B``    | Month name from the `names` parameter                                                   |
+ * +-----------+-----------------------------------------------------------------------------------------+
  *
  * Additional descriptions can be found here:
  * https://en.cppreference.com/w/cpp/chrono/system_clock/formatter
@@ -244,6 +255,7 @@ std::unique_ptr<column> is_timestamp(
  * @param mr Device memory resource used to allocate the returned column's device memory
  * @return New strings column with formatted timestamps
  */
+// clang-format on
 std::unique_ptr<column> from_timestamps(
   column_view const& timestamps,
   std::string_view format = "%Y-%m-%dT%H:%M:%SZ",
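The `%y` rule documented in these tables ([0,68] maps to 2000-2068, [69,99] maps to 1969-1999) is the same pivot that Python's `datetime.strptime` applies, so the standard library can serve as a quick CPU-side illustration of the specifiers. This is a sketch only, not libcudf code: the helper `expand_two_digit_year` is hypothetical, and libcudf's behavior may differ in other details.

```python
from datetime import datetime, timezone


def expand_two_digit_year(yy: int) -> int:
    """Map a %y value to a full year: [0,68] -> 2000-2068, [69,99] -> 1969-1999."""
    if not 0 <= yy <= 99:
        raise ValueError("two-digit year must be in [0, 99]")
    return 2000 + yy if yy <= 68 else 1900 + yy


# Python's strptime applies the same pivot for %y, so the two should agree.
for text in ("00", "68", "69", "99"):
    assert datetime.strptime(text, "%y").year == expand_two_digit_year(int(text))

# The default pattern of from_timestamps shown in the diff, applied with strftime.
ts = datetime(2024, 1, 17, 12, 30, 45, tzinfo=timezone.utc)
formatted = ts.strftime("%Y-%m-%dT%H:%M:%SZ")
print(formatted)  # 2024-01-17T12:30:45Z
```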