From cdc98d6713b811f9aebc47f01ff3f8f052282801 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Fri, 12 Aug 2022 03:11:28 +0530
Subject: [PATCH 01/27] pull changes from PR #11291 Squashed commit of the
 following:

commit 6e1bc7549d7af503bd8925b01bbec88b061050d8
Author: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date:   Fri Aug 12 03:06:30 2022 +0530

    remove debug print in logical stack

commit 8e756459feae63fdff98619f8a12a521df67e44e
Author: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date:   Fri Aug 12 03:01:34 2022 +0530

    remove duplicate renamed header

commit 3b2acb2c7ed2e772dcb181356ce3e17b99732686
Merge: 2b59b046d5 a67b718b22
Author: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date:   Fri Aug 12 02:59:01 2022 +0530

    Merge branch 'branch-22.10' of https://github.com/rapidsai/cudf into json-tree

commit 2b59b046d5d045363debe8fb50165c5db81de976
Merge: 12cf0bee64 2d214ea28b
Author: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date:   Tue Jul 26 13:40:41 2022 +0530

    Merge branch 'branch-22.08' of https://github.com/rapidsai/cudf into json-tree

commit 12cf0bee640f1e74f5c626a8f558dc873ea3313e
Author: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date:   Tue Jul 26 13:29:55 2022 +0530

    fix clang-format style fix

commit 3e756bbe7588a5e5cbe1fb098bfd0598c304f648
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Jul 18 08:17:03 2022 -0700

    replaces tree return type from tuple to struct

commit bef4fb1fbea490d435d0bae5db5260f9053ae865
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon May 16 22:10:08 2022 -0700

    moved debug print to detail ns

commit ff905281fecb004314f4d41ba11f84e9eb85cb96
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Fri May 13 09:52:20 2022 -0700

    squash & rebase on latest tokenizer version

commit 987699f466e2ccc20c7531b12fdd194405cd5b4a
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jun 2 05:19:53 2022 -0700

    fixes sg-count & uses rmm stream in fst tests

commit 00a95ebfb6958f351cef085a7545b8cc3983622b
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 25 12:17:08 2022 -0700

    put lookup tables into their own cudf file

commit a8ac5fac949610b4e18a062432413fb882a1e00a
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 25 09:59:37 2022 -0700

    refactored lookup tables

commit f996ce92f32547297193052dd0999ec70d326ab5
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 11 12:17:55 2022 -0700

    squashed with bracket/brace test

commit 671ce416d55d1b0ec685b12d6d27c2bf35e6e2e3
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Tue Apr 12 22:55:00 2022 -0700

    minor style changes addressing review comments

commit f4ec994a92c65b663e676e31e8d783a2c0601680
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 4 07:35:33 2022 -0700

    device_span

commit d18238feccfc09d98eaa52ce739893b7673e6ed8
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 4 02:28:30 2022 -0700

    renaming key-value store op to stack_op

commit 62ddf669fbdae440d3ad1a94985c0dfacc28d606
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Mar 31 05:28:17 2022 -0700

    switched to using rmm also inside algorithm

commit 2f7b254600b063407e6f60a9733c34cace25290b
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Mar 31 04:11:44 2022 -0700

    Added utility to debug print & instrumented code to use it

commit 67f609dfcea869c9fdcfe5358ac0e4cdada4ae89
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jul 14 04:15:11 2022 -0700

    renames enums & moving from device_span to ptr params

commit 01aef4404fb95825f2a5d19859194f317b28da66
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Wed Jul 13 07:22:52 2022 -0700

    wraps if with stream params into detail ns

commit 4aaf595585c71a598bec94267f065624476489f7
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Wed Jul 13 05:45:49 2022 -0700

    fixes for breaking downstream interface changes

commit 237456d9dafe1540d23accea2bd8d770fb6fc88e
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jun 2 08:19:37 2022 -0700

    fixes breaking changes from dependent-FST-PR

commit 7fc8619aa6f1f990da53e5f211c5382b7fb89322
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Tue May 3 07:05:44 2022 -0700

    rebase on latest FST

commit 6d3eff29ef386663e8feb3a103eb9795c7c7c66d
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jun 2 05:19:53 2022 -0700

    fixes sg-count & uses rmm stream in fst tests

commit 654883655bdcd9e389883e0f99f56b656855d1c2
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 25 12:17:08 2022 -0700

    put lookup tables into their own cudf file

commit 9dfd4ad56a4327851f6f914601015ade6365d3a3
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 25 09:59:37 2022 -0700

    refactored lookup tables

commit fe06f0b44d6447c703fa9de7acab55cd7e6ae053
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 11 12:17:55 2022 -0700

    squashed with bracket/brace test

commit 36c82962eb6a149bed5a1e07edad107b4d156cfb
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Tue Apr 12 22:55:00 2022 -0700

    minor style changes addressing review comments

commit 24dab9e2f0fe53a13e370006915801a983da44eb
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 4 07:35:33 2022 -0700

    device_span

commit 49fa9960e0f251a93ed5b9c4fdc17889530fc4ea
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 4 02:28:30 2022 -0700

    renaming key-value store op to stack_op

commit b260610a7c7e526b21e91f35e47491fa02649aeb
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Mar 31 05:28:17 2022 -0700

    switched to using rmm also inside algorithm

commit 9b20d169ea9d054c1efa89e522515771422ad312
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Mar 31 04:11:44 2022 -0700

    Added utility to debug print & instrumented code to use it

commit 78dd8932a7cd72e63d2853aea15b61d922d6ab48
Merge: 8a184e97ce 96270919fa
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Fri Jul 15 23:06:55 2022 -0700

    Merge remote-tracking branch 'upstream/branch-22.08' into feature/finite-state-transducer-trimmed

commit 8a184e97ce14a400f70d463f7e55f95f32d7a547
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Fri Jul 15 22:51:18 2022 -0700

    rephrases documentation on in-reg array

commit bea2a02226314cddb6073726d2feafa21d89bb52
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Fri Jul 15 01:54:20 2022 -0700

    replaces vanilla loop with iota

commit cba16196b356ecc807a6ae67a20b357677cf26a4
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jul 14 09:31:12 2022 -0700

    fixes style in dispatch dfa

commit 3f479528b2b24b6eddae4d622509f8392b97eb0d
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jul 14 09:22:03 2022 -0700

    replaces gtest asserts with expects

commit d351e5c4197acf7c7ab215ea7555926cb2d1f5b8
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jul 14 09:17:59 2022 -0700

    addresses style review comments & fixes a todo

commit 3038058e48347ef95a18fe2f18190c5e0de7c9a0
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jul 14 09:17:09 2022 -0700

    adds excplitis error checking

commit f52e61457b2b88b8b6a4f61bfd214283ea2d28a9
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jul 14 09:16:18 2022 -0700

    replaces enum with typed constexpr

commit eb2496205ce65c808e968348c22e35862bb19ff7
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Tue Jul 12 04:52:36 2022 -0700

    fixes logical stack test includes

commit a798852ea24e44a8432847fae82f4009cce20c05
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Jul 11 11:00:22 2022 -0700

    adds check for state transition narrowing conversion

commit e6f8defa0b79d040eb465cb76a12af194d1ff899
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Jul 11 09:06:01 2022 -0700

    some west-const remainders & unifies StateIndexT

commit 5f1c4b544882f1d35ac8701eb611e8f64c12ac56
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Jul 11 06:26:47 2022 -0700

    removes state vector-wrapper in favor of vanilla array

commit 485a1c632bc6fe03030c445b8cff06dc1c3ca32f
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Fri Jul 8 22:49:57 2022 -0700

    adopts c++17 namespaces declarations

commit f656f494e39f628dee12706ee2a0e9c6ea180126
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jul 7 02:41:16 2022 -0700

    adopts device-side test data gen

commit 694a365448a2156d0a1c60fafcd52f67c5f0c3f6
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Wed Jun 15 04:28:51 2022 -0700

    adopts suggested fst test changes

commit 9fe8e4b6e2c527e471d9627369e72595ef3e452c
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Tue Jun 14 03:12:35 2022 -0700

    minor doxygen fix

commit eccf9701432f557b52b1f44b985128668bf1462f
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Thu Jun 2 05:19:53 2022 -0700

    fixes sg-count & uses rmm stream in fst tests

commit 6fdd24a5625150469242af16fdcb1d549b3676e0
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon May 9 12:17:34 2022 -0700

    refactor lut sanity check

commit 17dcbfd07b73a64a0a1cfda71ea9c2770b6a8662
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon May 9 10:33:00 2022 -0700

    making const vars const

commit ea79a81fb9b0473d37f31c42bce25269a3d17d88
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon May 9 10:32:17 2022 -0700

    Adding hostdevice macros to in-reg array

commit caf61955c32c57cca287fb9d7e74bf5d0efc8506
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon May 9 10:24:51 2022 -0700

    unified usage of pragma unrolls

commit e24a13301a34fbb08d8424a2ba4edfbebd402d67
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Wed May 4 07:29:00 2022 -0700

    removing unused var post-cleanup

commit 39cff8039c160a3de0795a33c4e4fc2215072900
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Wed Apr 27 04:42:31 2022 -0700

    Change interface for FST to not need temp storage

commit 239f138d78cc12af8607f1feb7d7ec4bec2f58fc
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 25 12:17:08 2022 -0700

    put lookup tables into their own cudf file

commit 39a6b65c9fc4ad12d33155b54c8373b98de2de43
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 25 09:59:37 2022 -0700

    refactored lookup tables

commit 355d1e43e29e4eeadc21f9d4d9e6aa43ee8afe9b
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Wed Apr 20 05:11:32 2022 -0700

    clean up & addressing review comments

commit 0557d4176fca272ac98d644bbb3dd8fa87333d7a
Author: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Date:   Mon Apr 11 12:17:55 2022 -0700

    squashed with bracket/brace test
---
 cpp/src/io/json/nested_json.hpp    |  52 +++++++++
 cpp/src/io/json/nested_json_gpu.cu | 175 ++++++++++++++++++++++++++++-
 cpp/tests/io/nested_json_test.cu   | 165 +++++++++++++++++++++++++++
 3 files changed, 391 insertions(+), 1 deletion(-)

diff --git a/cpp/src/io/json/nested_json.hpp b/cpp/src/io/json/nested_json.hpp
index 3f7d73fb931..7fec5ddd5b2 100644
--- a/cpp/src/io/json/nested_json.hpp
+++ b/cpp/src/io/json/nested_json.hpp
@@ -76,6 +76,49 @@ enum token_t : PdaTokenT {
   NUM_TOKENS
 };
 
+/// Type used to represent the class of a node (or a node "category") within the tree representation
+using NodeT = char;
+
+/**
+ * @brief Class of a node (or a node "category") within the tree representation
+ */
+enum node_t : NodeT {
+  /// A node representing a struct
+  NC_STRUCT,
+  /// A node representing a list
+  NC_LIST,
+  /// A node representing a field name
+  NC_FN,
+  /// A node representing a string value
+  NC_STR,
+  /// A node representing a numeric or literal value (e.g., true, false, null)
+  NC_VAL,
+  /// A node representing a parser error
+  NC_ERR,
+  /// Total number of node classes
+  NUM_NODE_CLASSES
+};
+
+/// Type used to index into the nodes within the tree of structs, lists, field names, and value
+/// nodes
+using NodeIndexT = uint32_t;
+
+/// Type large enough to represent tree depth from [0, max-tree-depth); may be an unsigned type
+using TreeDepthT = StackLevelT;
+
+/**
+ * @brief Struct that encapsulate all information of a columnar tree representation.
+ */
+struct tree_meta_t {
+  std::vector<NodeT> node_categories;
+  std::vector<NodeIndexT> parent_node_ids;
+  std::vector<TreeDepthT> node_levels;
+  std::vector<SymbolOffsetT> node_range_begin;
+  std::vector<SymbolOffsetT> node_range_end;
+};
+
+constexpr NodeIndexT parent_node_sentinel = std::numeric_limits<NodeIndexT>::max();
+
 namespace detail {
 /**
  * @brief Identifies the stack context for each character from a JSON input. Specifically, we
@@ -110,6 +153,15 @@ void get_token_stream(device_span<SymbolT const> d_json_in,
                       SymbolOffsetT* d_tokens_indices,
                       SymbolOffsetT* d_num_written_tokens,
                       rmm::cuda_stream_view stream);
+
+/**
+ * @brief Parses the given JSON string and generates a tree representation of the given input.
+ *
+ * @param input The JSON input
+ * @param stream The CUDA stream to which kernels are dispatched
+ * @return
+ */
+tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_stream_view stream);
 }  // namespace detail
 
 }  // namespace cudf::io::json
diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index b8e05054e11..1a0469573ba 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -25,6 +25,8 @@
 
 #include <rmm/exec_policy.hpp>
 
+#include <stack>
+
 namespace cudf::io::json {
 
 // JSON to stack operator DFA (Deterministic Finite Automata)
@@ -796,6 +798,177 @@ void get_token_stream(device_span<SymbolT const> json_in,
                                stream);
 }
 
-}  // namespace detail
+tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_stream_view stream)
+{
+  constexpr std::size_t single_item = 1;
+  hostdevice_vector<PdaTokenT> tokens_gpu{input.size(), stream};
+  hostdevice_vector<SymbolOffsetT> token_indices_gpu{input.size(), stream};
+  hostdevice_vector<SymbolOffsetT> num_tokens_out{single_item, stream};
+
+  rmm::device_uvector<SymbolT> d_input{input.size(), stream};
+  cudaMemcpyAsync(
+    d_input.data(), input.data(), input.size() * sizeof(input[0]), cudaMemcpyHostToDevice, stream);
+
+  // Parse the JSON and get the token stream
+  cudf::io::json::detail::get_token_stream(
+    cudf::device_span<SymbolT>{d_input.data(), d_input.size()},
+    tokens_gpu.device_ptr(),
+    token_indices_gpu.device_ptr(),
+    num_tokens_out.device_ptr(),
+    stream);
 
+  // Copy the JSON tokens to the host
+  token_indices_gpu.device_to_host(stream);
+  tokens_gpu.device_to_host(stream);
+  num_tokens_out.device_to_host(stream);
+
+  // Make sure tokens have been copied to the host
+  stream.synchronize();
+
+  // Whether a token does represent a node in the tree representation
+  auto is_node = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructBegin:
+      case token_t::ListBegin:
+      case token_t::StringBegin:
+      case token_t::ValueBegin:
+      case token_t::FieldNameBegin:
+      case token_t::ErrorBegin: return true;
+      default: return false;
+    };
+  };
+
+  // The node that a token represents
+  auto token_to_node = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructBegin: return NC_STRUCT;
+      case token_t::ListBegin: return NC_LIST;
+      case token_t::StringBegin: return NC_STR;
+      case token_t::ValueBegin: return NC_VAL;
+      case token_t::FieldNameBegin: return NC_FN;
+      default: return NC_ERR;
+    };
+  };
+
+  auto get_token_index = [](PdaTokenT const token, SymbolOffsetT const token_index) {
+    constexpr SymbolOffsetT skip_quote_char = 1;
+    switch (token) {
+      case token_t::StringBegin: return token_index + skip_quote_char;
+      case token_t::FieldNameBegin: return token_index + skip_quote_char;
+      default: return token_index;
+    };
+  };
+
+  // Whether a token expects to be followed by its respective end-of-* token partner
+  auto is_begin_of_section = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StringBegin:
+      case token_t::ValueBegin:
+      case token_t::FieldNameBegin: return true;
+      default: return false;
+    };
+  };
+
+  // The end-of-* partner token for a given beginning-of-* token
+  auto end_of_partner = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StringBegin: return token_t::StringEnd;
+      case token_t::ValueBegin: return token_t::ValueEnd;
+      case token_t::FieldNameBegin: return token_t::FieldNameEnd;
+      default: return token_t::ErrorBegin;
+    };
+  };
+
+  // Whether the token pops from the parent node stack
+  auto does_pop = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructEnd:
+      case token_t::ListEnd: return true;
+      default: return false;
+    };
+  };
+
+  // Whether the token pushes onto the parent node stack
+  auto does_push = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructBegin:
+      case token_t::ListBegin: return true;
+      default: return false;
+    };
+  };
+
+  // The node id sitting on top of the stack becomes the node's parent
+  // The full stack represents the path from the root to the current node
+  std::stack<std::pair<NodeIndexT, bool>> parent_stack;
+
+  constexpr bool field_name_node    = true;
+  constexpr bool no_field_name_node = false;
+
+  std::vector<NodeT> node_categories;
+  std::vector<NodeIndexT> parent_node_ids;
+  std::vector<TreeDepthT> node_levels;
+  std::vector<SymbolOffsetT> node_range_begin;
+  std::vector<SymbolOffsetT> node_range_end;
+
+  std::size_t node_id = 0;
+  for (std::size_t i = 0; i < num_tokens_out[0]; i++) {
+    auto token = tokens_gpu[i];
+
+    // The section from the original JSON input that this token demarcates
+    std::size_t range_begin = get_token_index(token, token_indices_gpu[i]);
+    std::size_t range_end   = range_begin + 1;
+
+    // Identify this node's parent node id
+    std::size_t parent_node_id =
+      (parent_stack.size() > 0) ? parent_stack.top().first : parent_node_sentinel;
+
+    // If this token is the beginning-of-{value, string, field name}, also consume the next end-of-*
+    // token
+    if (is_begin_of_section(token)) {
+      if ((i + 1) < num_tokens_out[0] && end_of_partner(tokens_gpu[i + 1])) {
+        // Update the range_end for this pair of tokens
+        range_end = token_indices_gpu[i + 1];
+        // We can skip the subsequent end-of-* token
+        i++;
+      }
+    }
+
+    // Emit node if this token becomes a node in the tree
+    if (is_node(token)) {
+      node_categories.push_back(token_to_node(token));
+      parent_node_ids.push_back(parent_node_id);
+      node_levels.push_back(parent_stack.size());
+      node_range_begin.push_back(range_begin);
+      node_range_end.push_back(range_end);
+    }
+
+    // Modify the stack if needed
+    if (token == token_t::FieldNameBegin) {
+      parent_stack.push({node_id, field_name_node});
+    } else {
+      if (does_push(token)) {
+        parent_stack.push({node_id, no_field_name_node});
+      } else if (does_pop(token)) {
+        CUDF_EXPECTS(parent_stack.size() >= 1, "Invalid JSON input.");
+        parent_stack.pop();
+      }
+
+      // If what we're left with is a field name on top of stack, we need to pop it
+      if (parent_stack.size() >= 1 && parent_stack.top().second == field_name_node) {
+        parent_stack.pop();
+      }
+    }
+
+    // Update node_id
+    if (is_node(token)) { node_id++; }
+  }
+
+  return {std::move(node_categories),
+          std::move(parent_node_ids),
+          std::move(node_levels),
+          std::move(node_range_begin),
+          std::move(node_range_end)};
+}
+
+}  // namespace detail
 }  // namespace cudf::io::json
diff --git a/cpp/tests/io/nested_json_test.cu b/cpp/tests/io/nested_json_test.cu
index 0b7e2bb82f8..077236b75c2 100644
--- a/cpp/tests/io/nested_json_test.cu
+++ b/cpp/tests/io/nested_json_test.cu
@@ -23,8 +23,60 @@
 #include <rmm/cuda_stream.hpp>
 #include <rmm/cuda_stream_view.hpp>
 
+#include <stack>
+#include <string>
+
 namespace cuio_json = cudf::io::json;
 
+namespace {
+
+std::string get_node_string(std::size_t const node_id,
+                            cuio_json::tree_meta_t const& tree_rep,
+                            std::string const& json_input)
+{
+  auto node_to_str = [] __host__ __device__(cuio_json::PdaTokenT const token) {
+    switch (token) {
+      case cuio_json::NC_STRUCT: return "STRUCT";
+      case cuio_json::NC_LIST: return "LIST";
+      case cuio_json::NC_FN: return "FN";
+      case cuio_json::NC_STR: return "STR";
+      case cuio_json::NC_VAL: return "VAL";
+      case cuio_json::NC_ERR: return "ERR";
+      default: return "N/A";
+    };
+  };
+
+  return "<" + std::to_string(node_id) + ":" + node_to_str(tree_rep.node_categories[node_id]) +
+         ":[" + std::to_string(tree_rep.node_range_begin[node_id]) + ", " +
+         std::to_string(tree_rep.node_range_end[node_id]) + ") '" +
+         json_input.substr(tree_rep.node_range_begin[node_id],
+                           tree_rep.node_range_end[node_id] - tree_rep.node_range_begin[node_id]) +
+         "'>";
+}
+
+void print_tree_representation(std::string const& json_input,
+                               cuio_json::tree_meta_t const& tree_rep)
+{
+  for (std::size_t i = 0; i < tree_rep.node_categories.size(); i++) {
+    std::size_t parent_id = tree_rep.parent_node_ids[i];
+    std::stack<std::size_t> path;
+    path.push(i);
+    while (parent_id != cuio_json::parent_node_sentinel) {
+      path.push(parent_id);
+      parent_id = tree_rep.parent_node_ids[parent_id];
+    }
+
+    while (path.size()) {
+      auto const node_id = path.top();
+      std::cout << get_node_string(node_id, tree_rep, json_input)
+                << (path.size() > 1 ? " -> " : "");
+      path.pop();
+    }
+    std::cout << "\n";
+  }
+}
+}  // namespace
+
 // Base test fixture for tests
 struct JsonTest : public cudf::test::BaseFixture {
 };
@@ -231,3 +283,116 @@ TEST_F(JsonTest, TokenStream)
     EXPECT_EQ(golden_token_stream[i].second, tokens_gpu[i]) << "Mismatch at #" << i;
   }
 }
+
+TEST_F(JsonTest, TreeRepresentation)
+{
+  using cuio_json::PdaTokenT;
+  using cuio_json::SymbolOffsetT;
+  using cuio_json::SymbolT;
+
+  // Prepare cuda stream for data transfers & kernels
+  cudaStream_t stream = nullptr;
+  cudaStreamCreate(&stream);
+  rmm::cuda_stream_view stream_view(stream);
+
+  // Test input
+  std::string input = R"(  [{)"
+                      R"("category": "reference",)"
+                      R"("index:": [4,12,42],)"
+                      R"("author": "Nigel Rees",)"
+                      R"("title": "[Sayings of the Century]",)"
+                      R"("price": 8.95)"
+                      R"(},  )"
+                      R"({)"
+                      R"("category": "reference",)"
+                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
+                      R"("author": "Nigel Rees",)"
+                      R"("title": "{}[], <=semantic-symbols-string",)"
+                      R"("price": 8.95)"
+                      R"(}] )";
+
+  // Get the JSON's tree representation
+  auto tree_rep = cuio_json::detail::get_tree_representation(
+    cudf::host_span<SymbolT const>{input.data(), input.size()}, stream_view);
+
+  // Print tree representation
+  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }
+
+  // Golden sample of node categories
+  std::vector<cuio_json::node_t> golden_node_categories = {
+    cuio_json::NC_LIST, cuio_json::NC_STRUCT, cuio_json::NC_FN,     cuio_json::NC_STR,
+    cuio_json::NC_FN,   cuio_json::NC_LIST,   cuio_json::NC_VAL,    cuio_json::NC_VAL,
+    cuio_json::NC_VAL,  cuio_json::NC_FN,     cuio_json::NC_STR,    cuio_json::NC_FN,
+    cuio_json::NC_STR,  cuio_json::NC_FN,     cuio_json::NC_VAL,    cuio_json::NC_STRUCT,
+    cuio_json::NC_FN,   cuio_json::NC_STR,    cuio_json::NC_FN,     cuio_json::NC_LIST,
+    cuio_json::NC_VAL,  cuio_json::NC_STRUCT, cuio_json::NC_VAL,    cuio_json::NC_STRUCT,
+    cuio_json::NC_FN,   cuio_json::NC_LIST,   cuio_json::NC_STRUCT, cuio_json::NC_STRUCT,
+    cuio_json::NC_FN,   cuio_json::NC_STR,    cuio_json::NC_FN,     cuio_json::NC_STR,
+    cuio_json::NC_FN,   cuio_json::NC_VAL};
+
+  // Golden sample of node ids
+  std::vector<cuio_json::NodeIndexT> golden_parent_node_ids = {cuio_json::parent_node_sentinel,
+                                                               0,
+                                                               1,
+                                                               2,
+                                                               1,
+                                                               4,
+                                                               5,
+                                                               5,
+                                                               5,
+                                                               1,
+                                                               9,
+                                                               1,
+                                                               11,
+                                                               1,
+                                                               13,
+                                                               0,
+                                                               15,
+                                                               16,
+                                                               15,
+                                                               18,
+                                                               19,
+                                                               19,
+                                                               19,
+                                                               19,
+                                                               23,
+                                                               24,
+                                                               25,
+                                                               25,
+                                                               15,
+                                                               28,
+                                                               15,
+                                                               30,
+                                                               15,
+                                                               32};
+
+  // Golden sample of node levels
+  std::vector<cuio_json::TreeDepthT> golden_node_levels = {0, 1, 2, 3, 2, 3, 4, 4, 4, 2, 3, 2,
+                                                           3, 2, 3, 1, 2, 3, 2, 3, 4, 4, 4, 4,
+                                                           5, 6, 7, 7, 2, 3, 2, 3, 2, 3};
+
+  // Golden sample of the character-ranges from the original input that each node demarcates
+  std::vector<std::size_t> golden_node_range_begin = {
+    2,   3,   5,   17,  29,  38,  39,  41,  44,  49,  59,  72,  81,  108, 116, 124, 126,
+    138, 150, 158, 159, 161, 164, 169, 171, 174, 175, 180, 189, 199, 212, 221, 255, 263};
+
+  // Golden sample of the character-ranges from the original input that each node demarcates
+  std::vector<std::size_t> golden_node_range_end = {
+    3,   4,   13,  26,  35,  39,  40,  43,  46,  55,  69,  77,  105, 113, 120, 125, 134,
+    147, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 209, 217, 252, 260, 267};
+
+  // Check results against golden samples
+  ASSERT_EQ(golden_node_categories.size(), tree_rep.node_categories.size());
+  ASSERT_EQ(golden_parent_node_ids.size(), tree_rep.parent_node_ids.size());
+  ASSERT_EQ(golden_node_levels.size(), tree_rep.node_levels.size());
+  ASSERT_EQ(golden_node_range_begin.size(), tree_rep.node_range_begin.size());
+  ASSERT_EQ(golden_node_range_end.size(), tree_rep.node_range_end.size());
+
+  for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
+    ASSERT_EQ(golden_node_categories[i], tree_rep.node_categories[i]);
+    ASSERT_EQ(golden_parent_node_ids[i], tree_rep.parent_node_ids[i]);
+    ASSERT_EQ(golden_node_levels[i], tree_rep.node_levels[i]);
+    ASSERT_EQ(golden_node_range_begin[i], tree_rep.node_range_begin[i]);
+    ASSERT_EQ(golden_node_range_end[i], tree_rep.node_range_end[i]);
+  }
+}

From d0745791a1e82212e37158b753c4d4df1a4fb08b Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Fri, 12 Aug 2022 03:45:06 +0530
Subject: [PATCH 02/27] fix minor bug, cleanup unit test

---
 cpp/src/io/json/nested_json_gpu.cu |  2 +-
 cpp/tests/io/nested_json_test.cu   | 14 +++-----------
 2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 1a0469573ba..170340d28d6 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -925,7 +925,7 @@ tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_st
     // If this token is the beginning-of-{value, string, field name}, also consume the next end-of-*
     // token
     if (is_begin_of_section(token)) {
-      if ((i + 1) < num_tokens_out[0] && end_of_partner(tokens_gpu[i + 1])) {
+      if ((i + 1) < num_tokens_out[0] && end_of_partner(token) == tokens_gpu[i + 1]) {
         // Update the range_end for this pair of tokens
         range_end = token_indices_gpu[i + 1];
         // We can skip the subsequent end-of-* token
diff --git a/cpp/tests/io/nested_json_test.cu b/cpp/tests/io/nested_json_test.cu
index 077236b75c2..b7b0bc16edb 100644
--- a/cpp/tests/io/nested_json_test.cu
+++ b/cpp/tests/io/nested_json_test.cu
@@ -17,6 +17,8 @@
 #include <io/json/nested_json.hpp>
 #include <io/utilities/hostdevice_vector.hpp>
 
+#include <cudf/utilities/default_stream.hpp>
+
 #include <cudf_test/base_fixture.hpp>
 #include <cudf_test/cudf_gtest.hpp>
 
@@ -286,15 +288,6 @@ TEST_F(JsonTest, TokenStream)
 
 TEST_F(JsonTest, TreeRepresentation)
 {
-  using cuio_json::PdaTokenT;
-  using cuio_json::SymbolOffsetT;
-  using cuio_json::SymbolT;
-
-  // Prepare cuda stream for data transfers & kernels
-  cudaStream_t stream = nullptr;
-  cudaStreamCreate(&stream);
-  rmm::cuda_stream_view stream_view(stream);
-
   // Test input
   std::string input = R"(  [{)"
                       R"("category": "reference",)"
@@ -312,8 +305,7 @@ TEST_F(JsonTest, TreeRepresentation)
                       R"(}] )";
 
   // Get the JSON's tree representation
-  auto tree_rep = cuio_json::detail::get_tree_representation(
-    cudf::host_span<SymbolT const>{input.data(), input.size()}, stream_view);
+  auto tree_rep = cuio_json::detail::get_tree_representation(input, cudf::default_stream_value);
 
   // Print tree representation
   if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }

From f5287a50ce5aed7c4704afc9b4668ff0f881c2b6 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Fri, 12 Aug 2022 04:51:24 +0530
Subject: [PATCH 03/27] add struct member begin, end tokens

---
 cpp/src/io/json/nested_json.hpp    |   4 +
 cpp/src/io/json/nested_json_gpu.cu | 244 +++++++++++++++--------------
 cpp/tests/io/nested_json_test.cu   | 113 ++++++++++---
 3 files changed, 218 insertions(+), 143 deletions(-)

diff --git a/cpp/src/io/json/nested_json.hpp b/cpp/src/io/json/nested_json.hpp
index 7fec5ddd5b2..da67fda4351 100644
--- a/cpp/src/io/json/nested_json.hpp
+++ b/cpp/src/io/json/nested_json.hpp
@@ -58,6 +58,10 @@ enum token_t : PdaTokenT {
   ListBegin,
   /// End-of-list token (on encounter of semantic ']')
   ListEnd,
+  // Beginning-of-struct-member token
+  StructMemberBegin,
+  // End-of-struct-member token
+  StructMemberEnd,
   /// Beginning-of-field-name token (on encounter of first quote)
   FieldNameBegin,
   /// End-of-field-name token (on encounter of a field name's second quote)
diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 170340d28d6..49d3d02524a 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -461,66 +461,68 @@ auto get_translation_table()
                                                         {token_t::ErrorBegin},
                                                         {},
                                                         {token_t::ValueBegin}}};
-  pda_tlt[static_cast<StateT>(pda_state_t::PD_BOA)] = {{{token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::StructBegin},
-                                                        {token_t::ListBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ListEnd},
-                                                        {token_t::StringBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {},
-                                                        {token_t::ValueBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::StructEnd},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::FieldNameBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {},
-                                                        {token_t::ErrorBegin}}};
-  pda_tlt[static_cast<StateT>(pda_state_t::PD_LON)] = {{{token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ValueEnd},
-                                                        {},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ValueEnd, token_t::ListEnd},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ValueEnd},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ValueEnd},
-                                                        {},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ValueEnd, token_t::StructEnd},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ValueEnd},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ValueEnd},
-                                                        {}}};
+  pda_tlt[static_cast<StateT>(pda_state_t::PD_BOA)] = {
+    {{token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::StructBegin},
+     {token_t::ListBegin},
+     {token_t::ErrorBegin},
+     {token_t::ListEnd},
+     {token_t::StringBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {},
+     {token_t::ValueBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::StructEnd},
+     {token_t::ErrorBegin},
+     {token_t::StructMemberBegin, token_t::FieldNameBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {},
+     {token_t::ErrorBegin}}};
+  pda_tlt[static_cast<StateT>(pda_state_t::PD_LON)] = {
+    {{token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ValueEnd},
+     {},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ValueEnd, token_t::ListEnd},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ValueEnd},
+     {token_t::ErrorBegin},
+     {token_t::ValueEnd},
+     {},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ValueEnd, token_t::StructMemberEnd, token_t::StructEnd},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ValueEnd},
+     {token_t::ErrorBegin},
+     {token_t::ValueEnd},
+     {}}};
   pda_tlt[static_cast<StateT>(pda_state_t::PD_STR)] = {
     {{}, {}, {}, {}, {token_t::StringEnd}, {}, {}, {}, {}, {},
      {}, {}, {}, {}, {token_t::StringEnd}, {}, {}, {}, {}, {},
@@ -528,66 +530,68 @@ auto get_translation_table()
   pda_tlt[static_cast<StateT>(pda_state_t::PD_SCE)] = {{{}, {}, {}, {}, {}, {}, {}, {}, {}, {},
                                                         {}, {}, {}, {}, {}, {}, {}, {}, {}, {},
                                                         {}, {}, {}, {}, {}, {}, {}, {}, {}, {}}};
-  pda_tlt[static_cast<StateT>(pda_state_t::PD_PVL)] = {{{token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ListEnd},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {},
-                                                        {token_t::ErrorBegin},
-                                                        {},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::StructEnd},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {},
-                                                        {token_t::ErrorBegin},
-                                                        {},
-                                                        {token_t::ErrorBegin}}};
-  pda_tlt[static_cast<StateT>(pda_state_t::PD_BFN)] = {{{token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::FieldNameBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {token_t::ErrorBegin},
-                                                        {},
-                                                        {token_t::ErrorBegin}}};
+  pda_tlt[static_cast<StateT>(pda_state_t::PD_PVL)] = {
+    {{token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ListEnd},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {},
+     {token_t::ErrorBegin},
+     {},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::StructMemberEnd, token_t::StructEnd},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::StructMemberEnd},
+     {token_t::ErrorBegin},
+     {},
+     {token_t::ErrorBegin}}};
+  pda_tlt[static_cast<StateT>(pda_state_t::PD_BFN)] = {
+    {{token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::StructMemberBegin, token_t::FieldNameBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {token_t::ErrorBegin},
+     {},
+     {token_t::ErrorBegin}}};
   pda_tlt[static_cast<StateT>(pda_state_t::PD_FLN)] = {{{token_t::ErrorBegin},
                                                         {token_t::ErrorBegin},
                                                         {token_t::ErrorBegin},
diff --git a/cpp/tests/io/nested_json_test.cu b/cpp/tests/io/nested_json_test.cu
index b7b0bc16edb..1c3256b337a 100644
--- a/cpp/tests/io/nested_json_test.cu
+++ b/cpp/tests/io/nested_json_test.cu
@@ -250,29 +250,96 @@ TEST_F(JsonTest, TokenStream)
   // Golden token stream sample
   using token_t = cuio_json::token_t;
   std::vector<std::pair<std::size_t, cuio_json::PdaTokenT>> golden_token_stream = {
-    {2, token_t::ListBegin},        {3, token_t::StructBegin},      {4, token_t::FieldNameBegin},
-    {13, token_t::FieldNameEnd},    {16, token_t::StringBegin},     {26, token_t::StringEnd},
-    {28, token_t::FieldNameBegin},  {35, token_t::FieldNameEnd},    {38, token_t::ListBegin},
-    {39, token_t::ValueBegin},      {40, token_t::ValueEnd},        {41, token_t::ValueBegin},
-    {43, token_t::ValueEnd},        {44, token_t::ValueBegin},      {46, token_t::ValueEnd},
-    {46, token_t::ListEnd},         {48, token_t::FieldNameBegin},  {55, token_t::FieldNameEnd},
-    {58, token_t::StringBegin},     {69, token_t::StringEnd},       {71, token_t::FieldNameBegin},
-    {77, token_t::FieldNameEnd},    {80, token_t::StringBegin},     {105, token_t::StringEnd},
-    {107, token_t::FieldNameBegin}, {113, token_t::FieldNameEnd},   {116, token_t::ValueBegin},
-    {120, token_t::ValueEnd},       {120, token_t::StructEnd},      {124, token_t::StructBegin},
-    {125, token_t::FieldNameBegin}, {134, token_t::FieldNameEnd},   {137, token_t::StringBegin},
-    {147, token_t::StringEnd},      {149, token_t::FieldNameBegin}, {155, token_t::FieldNameEnd},
-    {158, token_t::ListBegin},      {159, token_t::ValueBegin},     {160, token_t::ValueEnd},
-    {161, token_t::StructBegin},    {162, token_t::StructEnd},      {164, token_t::ValueBegin},
-    {168, token_t::ValueEnd},       {169, token_t::StructBegin},    {170, token_t::FieldNameBegin},
-    {172, token_t::FieldNameEnd},   {174, token_t::ListBegin},      {175, token_t::StructBegin},
-    {177, token_t::StructEnd},      {180, token_t::StructBegin},    {181, token_t::StructEnd},
-    {182, token_t::ListEnd},        {184, token_t::StructEnd},      {186, token_t::ListEnd},
-    {188, token_t::FieldNameBegin}, {195, token_t::FieldNameEnd},   {198, token_t::StringBegin},
-    {209, token_t::StringEnd},      {211, token_t::FieldNameBegin}, {217, token_t::FieldNameEnd},
-    {220, token_t::StringBegin},    {252, token_t::StringEnd},      {254, token_t::FieldNameBegin},
-    {260, token_t::FieldNameEnd},   {263, token_t::ValueBegin},     {267, token_t::ValueEnd},
-    {267, token_t::StructEnd},      {268, token_t::ListEnd}};
+    {2, token_t::ListBegin},
+    {3, token_t::StructBegin},
+    {4, token_t::StructMemberBegin},
+    {4, token_t::FieldNameBegin},
+    {13, token_t::FieldNameEnd},
+    {16, token_t::StringBegin},
+    {26, token_t::StringEnd},
+    {27, token_t::StructMemberEnd},
+    {28, token_t::StructMemberBegin},
+    {28, token_t::FieldNameBegin},
+    {35, token_t::FieldNameEnd},
+    {38, token_t::ListBegin},
+    {39, token_t::ValueBegin},
+    {40, token_t::ValueEnd},
+    {41, token_t::ValueBegin},
+    {43, token_t::ValueEnd},
+    {44, token_t::ValueBegin},
+    {46, token_t::ValueEnd},
+    {46, token_t::ListEnd},
+    {47, token_t::StructMemberEnd},
+    {48, token_t::StructMemberBegin},
+    {48, token_t::FieldNameBegin},
+    {55, token_t::FieldNameEnd},
+    {58, token_t::StringBegin},
+    {69, token_t::StringEnd},
+    {70, token_t::StructMemberEnd},
+    {71, token_t::StructMemberBegin},
+    {71, token_t::FieldNameBegin},
+    {77, token_t::FieldNameEnd},
+    {80, token_t::StringBegin},
+    {105, token_t::StringEnd},
+    {106, token_t::StructMemberEnd},
+    {107, token_t::StructMemberBegin},
+    {107, token_t::FieldNameBegin},
+    {113, token_t::FieldNameEnd},
+    {116, token_t::ValueBegin},
+    {120, token_t::ValueEnd},
+    {120, token_t::StructMemberEnd},
+    {120, token_t::StructEnd},
+    {124, token_t::StructBegin},
+    {125, token_t::StructMemberBegin},
+    {125, token_t::FieldNameBegin},
+    {134, token_t::FieldNameEnd},
+    {137, token_t::StringBegin},
+    {147, token_t::StringEnd},
+    {148, token_t::StructMemberEnd},
+    {149, token_t::StructMemberBegin},
+    {149, token_t::FieldNameBegin},
+    {155, token_t::FieldNameEnd},
+    {158, token_t::ListBegin},
+    {159, token_t::ValueBegin},
+    {160, token_t::ValueEnd},
+    {161, token_t::StructBegin},
+    {162, token_t::StructEnd},
+    {164, token_t::ValueBegin},
+    {168, token_t::ValueEnd},
+    {169, token_t::StructBegin},
+    {170, token_t::StructMemberBegin},
+    {170, token_t::FieldNameBegin},
+    {172, token_t::FieldNameEnd},
+    {174, token_t::ListBegin},
+    {175, token_t::StructBegin},
+    {177, token_t::StructEnd},
+    {180, token_t::StructBegin},
+    {181, token_t::StructEnd},
+    {182, token_t::ListEnd},
+    {184, token_t::StructMemberEnd},
+    {184, token_t::StructEnd},
+    {186, token_t::ListEnd},
+    {187, token_t::StructMemberEnd},
+    {188, token_t::StructMemberBegin},
+    {188, token_t::FieldNameBegin},
+    {195, token_t::FieldNameEnd},
+    {198, token_t::StringBegin},
+    {209, token_t::StringEnd},
+    {210, token_t::StructMemberEnd},
+    {211, token_t::StructMemberBegin},
+    {211, token_t::FieldNameBegin},
+    {217, token_t::FieldNameEnd},
+    {220, token_t::StringBegin},
+    {252, token_t::StringEnd},
+    {253, token_t::StructMemberEnd},
+    {254, token_t::StructMemberBegin},
+    {254, token_t::FieldNameBegin},
+    {260, token_t::FieldNameEnd},
+    {263, token_t::ValueBegin},
+    {267, token_t::ValueEnd},
+    {267, token_t::StructMemberEnd},
+    {267, token_t::StructEnd},
+    {268, token_t::ListEnd}};
 
   // Verify the number of tokens matches
   ASSERT_EQ(golden_token_stream.size(), num_tokens_out[0]);

From d694f2195b17c13df9c15397583b5355e1c0f046 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Fri, 12 Aug 2022 04:53:59 +0530
Subject: [PATCH 04/27] add get_tree_representation_gpu initial version

---
 cpp/src/io/json/nested_json.hpp    |   3 +
 cpp/src/io/json/nested_json_gpu.cu | 317 +++++++++++++++++++++++++++++
 2 files changed, 320 insertions(+)

diff --git a/cpp/src/io/json/nested_json.hpp b/cpp/src/io/json/nested_json.hpp
index da67fda4351..4f4b3004dcc 100644
--- a/cpp/src/io/json/nested_json.hpp
+++ b/cpp/src/io/json/nested_json.hpp
@@ -166,6 +166,9 @@ void get_token_stream(device_span<SymbolT const> d_json_in,
  * @return
  */
 tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_stream_view stream);
+
+tree_meta_t get_tree_representation_gpu(device_span<SymbolT const> d_input,
+                                        rmm::cuda_stream_view stream);
 }  // namespace detail
 
 }  // namespace cudf::io::json
diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 49d3d02524a..4fac231c7c1 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -20,11 +20,24 @@
 #include <io/fst/lookup_tables.cuh>
 #include <io/utilities/hostdevice_vector.hpp>
 
+#include <cudf/detail/utilities/vector_factories.hpp>
 #include <cudf/types.hpp>
 #include <cudf/utilities/span.hpp>
 
+#include <rmm/device_uvector.hpp>
 #include <rmm/exec_policy.hpp>
 
+#include "thrust/functional.h"
+#include "thrust/iterator/counting_iterator.h"
+#include "thrust/iterator/transform_iterator.h"
+#include "thrust/iterator/transform_output_iterator.h"
+#include "thrust/iterator/zip_iterator.h"
+#include "thrust/sequence.h"
+#include "thrust/sort.h"
+#include "thrust/tabulate.h"
+#include <thrust/gather.h>
+#include <thrust/scan.h>
+
 #include <stack>
 
 namespace cudf::io::json {
@@ -829,6 +842,25 @@ tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_st
   // Make sure tokens have been copied to the host
   stream.synchronize();
 
+  auto to_token_str = [](PdaTokenT token) {
+    switch (token) {
+      case token_t::StructBegin: return " {";
+      case token_t::StructEnd: return " }";
+      case token_t::ListBegin: return " [";
+      case token_t::ListEnd: return " ]";
+      case token_t::FieldNameBegin: return "FB";
+      case token_t::FieldNameEnd: return "FE";
+      case token_t::StringBegin: return "SB";
+      case token_t::StringEnd: return "SE";
+      case token_t::ErrorBegin: return "er";
+      case token_t::ValueBegin: return "VB";
+      case token_t::ValueEnd: return "VE";
+      case token_t::StructMemberBegin: return " <";
+      case token_t::StructMemberEnd: return " >";
+      default: return ".";
+    }
+  };
+
   // Whether a token does represent a node in the tree representation
   auto is_node = [](PdaTokenT const token) {
     switch (token) {
@@ -967,6 +999,60 @@ tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_st
     if (is_node(token)) { node_id++; }
   }
 
+  // DEBUG prints
+  auto print_cat = [](auto const& gpu, auto const& cpu, auto const name) {
+    auto to_cat = [](auto v) {
+      switch (v) {
+        case NC_STRUCT: return " S";
+        case NC_LIST: return " L";
+        case NC_STR: return " \"";
+        case NC_VAL: return " V";
+        case NC_FN: return " F";
+        case NC_ERR: return "ER";
+        default: return "UN";
+      };
+    };
+    for (auto const& v : cpu)
+      printf("%s,", to_cat(v));
+    std::cout << name << "(CPU):" << std::endl;
+    for (auto const& v : gpu)
+      printf("%s,", to_cat(v));
+    std::cout << name << "(GPU):" << std::endl;
+    if (!std::equal(gpu.begin(), gpu.end(), cpu.begin())) {
+      for (auto i = 0lu; i < cpu.size(); i++)
+        printf("%2s,", (gpu[i] == cpu[i] ? " " : "x"));
+      std::cout << std::endl;
+    }
+  };
+  bool mismatch  = false;
+  auto print_vec = [&](auto const& gpu, auto const& cpu, auto const name) {
+    for (auto const& v : cpu)
+      printf("%2d,", int(v));
+    std::cout << name << "(CPU):" << std::endl;
+    for (auto const& v : gpu)
+      printf("%2d,", int(v));
+    std::cout << name << "(GPU):" << std::endl;
+    if (!std::equal(gpu.begin(), gpu.end(), cpu.begin())) {
+      for (auto i = 0lu; i < cpu.size(); i++) {
+        mismatch |= (gpu[i] != cpu[i]);
+        printf("%2s,", (gpu[i] == cpu[i] ? " " : "x"));
+      }
+      std::cout << std::endl;
+    }
+  };
+
+#define PRINT_VEC(vec) print_vec(value.vec, vec, #vec);
+  auto value = get_tree_representation_gpu(d_input, stream);
+  // PRINT_VEC(node_categories); //Works
+  print_cat(value.node_categories, node_categories, "node_categories");
+  PRINT_VEC(node_levels);       // Works
+  PRINT_VEC(node_range_begin);  // Works
+  PRINT_VEC(node_range_end);    // Works
+  PRINT_VEC(parent_node_ids);   // Works
+  CUDF_EXPECTS(!mismatch, "Mismatch in GPU and CPU tree representation");
+  std::cout << "Mismatch: " << mismatch << std::endl;
+
+#undef PRINT_VEC
   return {std::move(node_categories),
           std::move(parent_node_ids),
           std::move(node_levels),
@@ -974,5 +1060,236 @@ tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_st
           std::move(node_range_end)};
 }
 
+// The node that a token represents
+struct token_to_node {
+  __device__ auto operator()(PdaTokenT const token) -> NodeT
+  {
+    switch (token) {
+      case token_t::StructBegin: return NC_STRUCT;
+      case token_t::ListBegin: return NC_LIST;
+      case token_t::StringBegin: return NC_STR;
+      case token_t::ValueBegin: return NC_VAL;
+      case token_t::FieldNameBegin: return NC_FN;
+      default: return NC_ERR;
+    };
+  }
+};
+
+// convert token indices to node range for each vaid node.
+template <typename T1, typename T2, typename T3>
+struct node_ranges {
+  T1 tokens_gpu;
+  T2 token_indices_gpu;
+  T3 num_tokens;
+  __device__ auto operator()(size_type i) -> thrust::tuple<SymbolOffsetT, SymbolOffsetT>
+  {
+    // Whether a token expects to be followed by its respective end-of-* token partner
+    auto is_begin_of_section = [] __device__(PdaTokenT const token) {
+      switch (token) {
+        case token_t::StringBegin:
+        case token_t::ValueBegin:
+        case token_t::FieldNameBegin: return true;
+        default: return false;
+      };
+    };
+    // The end-of-* partner token for a given beginning-of-* token
+    auto end_of_partner = [] __device__(PdaTokenT const token) {
+      switch (token) {
+        case token_t::StringBegin: return token_t::StringEnd;
+        case token_t::ValueBegin: return token_t::ValueEnd;
+        case token_t::FieldNameBegin: return token_t::FieldNameEnd;
+        default: return token_t::ErrorBegin;
+      };
+    };
+    auto get_token_index = [] __device__(PdaTokenT const token, SymbolOffsetT const token_index) {
+      constexpr SymbolOffsetT skip_quote_char = 1;
+      switch (token) {
+        case token_t::StringBegin: return token_index + skip_quote_char;
+        case token_t::FieldNameBegin: return token_index + skip_quote_char;
+        default: return token_index;
+      };
+    };
+    PdaTokenT const token = tokens_gpu[i];
+    // The section from the original JSON input that this token demarcates
+    SymbolOffsetT range_begin = get_token_index(token, token_indices_gpu[i]);
+    SymbolOffsetT range_end   = range_begin + 1;
+    if (is_begin_of_section(token)) {
+      if ((i + 1) < num_tokens && end_of_partner(token) == tokens_gpu[i + 1]) {
+        // Update the range_end for this pair of tokens
+        range_end = token_indices_gpu[i + 1];
+      }
+    }
+    return thrust::make_tuple(range_begin, range_end);
+  }
+};
+
+// GPU version of get_tree_representation
+tree_meta_t get_tree_representation_gpu(device_span<SymbolT const> d_input,
+                                        rmm::cuda_stream_view stream)
+{
+  constexpr std::size_t single_item = 1;
+  rmm::device_uvector<PdaTokenT> tokens_gpu{d_input.size(), stream};
+  rmm::device_uvector<SymbolOffsetT> token_indices_gpu{d_input.size(), stream};
+  hostdevice_vector<SymbolOffsetT> num_tokens_out{single_item, stream};
+
+  // Parse the JSON and get the token stream
+  cudf::io::json::detail::get_token_stream(
+    d_input, tokens_gpu.data(), token_indices_gpu.data(), num_tokens_out.device_ptr(), stream);
+
+  // Copy the JSON token count to the host
+  num_tokens_out.device_to_host(stream);
+
+  // Make sure tokens have been copied to the host
+  stream.synchronize();
+
+  // Whether a token does represent a node in the tree representation
+  auto is_node = [] __device__(PdaTokenT const token) -> size_type {
+    switch (token) {
+      case token_t::StructBegin:
+      case token_t::ListBegin:
+      case token_t::StringBegin:
+      case token_t::ValueBegin:
+      case token_t::FieldNameBegin:
+      case token_t::ErrorBegin: return 1;
+      default: return 0;
+    };
+  };
+
+  // Whether the token pops from the parent node stack
+  auto does_pop = [] __device__(PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructMemberEnd:
+      case token_t::StructEnd:
+      case token_t::ListEnd: return true;
+      default: return false;
+    };
+  };
+
+  // Whether the token pushes onto the parent node stack
+  auto does_push = [] __device__(PdaTokenT const token) {
+    switch (token) {
+      // case token_t::StructMemberBegin: //TODO: Either use FieldNameBegin here or change the
+      // token_to_node function
+      case token_t::FieldNameBegin:
+      case token_t::StructBegin:
+      case token_t::ListBegin: return true;
+      default: return false;
+    };
+  };
+
+  auto num_tokens = num_tokens_out[0];
+  auto is_node_it = thrust::make_transform_iterator(tokens_gpu.begin(), is_node);
+  auto num_nodes  = thrust::reduce(rmm::exec_policy(stream), is_node_it, is_node_it + num_tokens);
+
+  // Node categories: copy_if with transform.
+  rmm::device_uvector<NodeT> node_categories(num_nodes, stream);
+  auto node_categories_it =
+    thrust::make_transform_output_iterator(node_categories.begin(), token_to_node{});
+  auto node_categories_end = thrust::copy_if(rmm::exec_policy(stream),
+                                             tokens_gpu.begin(),
+                                             tokens_gpu.begin() + num_tokens,
+                                             node_categories_it,
+                                             is_node);
+  CUDF_EXPECTS(node_categories_end - node_categories_it == num_nodes,
+               "node category count mismatch");
+
+  // Node levels: transform_exclusive_scan, copy_if.
+  rmm::device_uvector<size_type> token_levels(num_tokens, stream);
+  auto push_pop_it = thrust::make_transform_iterator(
+    tokens_gpu.begin(), [does_push, does_pop] __device__(PdaTokenT const token) -> size_type {
+      return does_push(token) ? 1 : (does_pop(token) ? -1 : 0);
+    });
+  thrust::exclusive_scan(
+    rmm::exec_policy(stream), push_pop_it, push_pop_it + num_tokens, token_levels.begin());
+
+  rmm::device_uvector<TreeDepthT> node_levels(num_nodes, stream);
+  auto node_levels_end = thrust::copy_if(rmm::exec_policy(stream),
+                                         token_levels.begin(),
+                                         token_levels.begin() + num_tokens,
+                                         tokens_gpu.begin(),
+                                         node_levels.begin(),
+                                         is_node);
+  CUDF_EXPECTS(node_levels_end - node_levels.begin() == num_nodes, "node level count mismatch");
+
+  // Node ranges: copy_if with transform.
+  rmm::device_uvector<SymbolOffsetT> node_range_begin(num_nodes, stream);
+  rmm::device_uvector<SymbolOffsetT> node_range_end(num_nodes, stream);
+  auto node_range_tuple_it =
+    thrust::make_zip_iterator(node_range_begin.begin(), node_range_end.begin());
+  using node_ranges_t    = node_ranges<decltype(tokens_gpu.begin()),
+                                    decltype(token_indices_gpu.begin()),
+                                    decltype(num_tokens)>;
+  auto node_range_out_it = thrust::make_transform_output_iterator(
+    node_range_tuple_it, node_ranges_t{tokens_gpu.begin(), token_indices_gpu.begin(), num_tokens});
+
+  auto node_range_out_end =
+    thrust::copy_if(rmm::exec_policy(stream),
+                    thrust::make_counting_iterator<size_type>(0),
+                    thrust::make_counting_iterator<size_type>(0) + num_tokens,
+                    node_range_out_it,
+                    [is_node, tokens_gpu = tokens_gpu.begin()] __device__(size_type i) -> bool {
+                      PdaTokenT const token = tokens_gpu[i];
+                      return is_node(token);
+                    });
+  CUDF_EXPECTS(node_range_out_end - node_range_out_it == num_nodes, "node range count mismatch");
+
+  // Node parent ids: previous push token_id transform, stable sort, segmented scan with Max,
+  // copy_if. This one is sort of logical stack. But more generalized. TODO: make it own function.
+  rmm::device_uvector<size_type> parent_token_ids(num_tokens, stream);  // XXX: fill with 0?
+  rmm::device_uvector<size_type> initial_order(num_tokens, stream);
+  thrust::sequence(rmm::exec_policy(stream), initial_order.begin(), initial_order.end());
+  thrust::tabulate(rmm::exec_policy(stream),
+                   parent_token_ids.begin(),
+                   parent_token_ids.end(),
+                   [does_push, tokens_gpu = tokens_gpu.begin()] __device__(auto i) -> size_type {
+                     if (i == 0)
+                       return -1;
+                     else
+                       return does_push(tokens_gpu[i - 1]) ? i - 1 : -1;  // XXX: -1 or 0?
+                   });
+  auto out_pid = thrust::make_zip_iterator(parent_token_ids.data(), initial_order.data());
+  // TODO: use radix sort.
+  thrust::stable_sort_by_key(rmm::exec_policy(stream),
+                             token_levels.data(),
+                             token_levels.data() + token_levels.size(),
+                             out_pid);
+  // SegmentedScan Max.
+  thrust::inclusive_scan_by_key(rmm::exec_policy(stream),
+                                token_levels.data(),
+                                token_levels.data() + token_levels.size(),
+                                parent_token_ids.data(),
+                                parent_token_ids.data(),  // size_type{-1},
+                                thrust::equal_to<size_type>{},
+                                thrust::maximum<size_type>{});
+  // TODO: Avoid sorting again by  gather_if on a transform iterator. or scatter.
+  thrust::sort_by_key(rmm::exec_policy(stream),
+                      initial_order.data(),
+                      initial_order.data() + initial_order.size(),
+                      parent_token_ids.data());
+
+  rmm::device_uvector<size_type> node_ids_gpu(num_tokens, stream);
+  thrust::exclusive_scan(
+    rmm::exec_policy(stream), is_node_it, is_node_it + num_tokens, node_ids_gpu.begin());
+  rmm::device_uvector<NodeIndexT> parent_node_ids(num_nodes, stream);
+  auto parent_node_ids_it = thrust::make_transform_iterator(
+    parent_token_ids.begin(),
+    [node_ids_gpu = node_ids_gpu.begin()] __device__(size_type const pid) -> NodeIndexT {
+      return pid < 0 ? pid : node_ids_gpu[pid];
+    });
+  auto parent_node_ids_end = thrust::copy_if(rmm::exec_policy(stream),
+                                             parent_node_ids_it,
+                                             parent_node_ids_it + parent_token_ids.size(),
+                                             tokens_gpu.begin(),
+                                             parent_node_ids.begin(),
+                                             is_node);
+  CUDF_EXPECTS(parent_node_ids_end - parent_node_ids.begin() == num_nodes,
+               "parent node id gather mismatch");
+
+  return {cudf::detail::make_std_vector_async(node_categories, stream),
+          cudf::detail::make_std_vector_async(parent_node_ids, stream),
+          cudf::detail::make_std_vector_async(node_levels, stream),
+          cudf::detail::make_std_vector_async(node_range_begin, stream),
+          cudf::detail::make_std_vector_async(node_range_end, stream)};
+}
 }  // namespace detail
 }  // namespace cudf::io::json

From 34ba420932a460342cf5d361d314a806cc9dc254 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Sun, 14 Aug 2022 21:15:41 +0530
Subject: [PATCH 05/27] delete repeated nested_json_test.cu

---
 cpp/tests/io/nested_json_test.cu | 457 -------------------------------
 1 file changed, 457 deletions(-)
 delete mode 100644 cpp/tests/io/nested_json_test.cu

diff --git a/cpp/tests/io/nested_json_test.cu b/cpp/tests/io/nested_json_test.cu
deleted file mode 100644
index 1c3256b337a..00000000000
--- a/cpp/tests/io/nested_json_test.cu
+++ /dev/null
@@ -1,457 +0,0 @@
-/*
- * Copyright (c) 2022, NVIDIA CORPORATION.
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-#include <io/json/nested_json.hpp>
-#include <io/utilities/hostdevice_vector.hpp>
-
-#include <cudf/utilities/default_stream.hpp>
-
-#include <cudf_test/base_fixture.hpp>
-#include <cudf_test/cudf_gtest.hpp>
-
-#include <rmm/cuda_stream.hpp>
-#include <rmm/cuda_stream_view.hpp>
-
-#include <stack>
-#include <string>
-
-namespace cuio_json = cudf::io::json;
-
-namespace {
-
-std::string get_node_string(std::size_t const node_id,
-                            cuio_json::tree_meta_t const& tree_rep,
-                            std::string const& json_input)
-{
-  auto node_to_str = [] __host__ __device__(cuio_json::PdaTokenT const token) {
-    switch (token) {
-      case cuio_json::NC_STRUCT: return "STRUCT";
-      case cuio_json::NC_LIST: return "LIST";
-      case cuio_json::NC_FN: return "FN";
-      case cuio_json::NC_STR: return "STR";
-      case cuio_json::NC_VAL: return "VAL";
-      case cuio_json::NC_ERR: return "ERR";
-      default: return "N/A";
-    };
-  };
-
-  return "<" + std::to_string(node_id) + ":" + node_to_str(tree_rep.node_categories[node_id]) +
-         ":[" + std::to_string(tree_rep.node_range_begin[node_id]) + ", " +
-         std::to_string(tree_rep.node_range_end[node_id]) + ") '" +
-         json_input.substr(tree_rep.node_range_begin[node_id],
-                           tree_rep.node_range_end[node_id] - tree_rep.node_range_begin[node_id]) +
-         "'>";
-}
-
-void print_tree_representation(std::string const& json_input,
-                               cuio_json::tree_meta_t const& tree_rep)
-{
-  for (std::size_t i = 0; i < tree_rep.node_categories.size(); i++) {
-    std::size_t parent_id = tree_rep.parent_node_ids[i];
-    std::stack<std::size_t> path;
-    path.push(i);
-    while (parent_id != cuio_json::parent_node_sentinel) {
-      path.push(parent_id);
-      parent_id = tree_rep.parent_node_ids[parent_id];
-    }
-
-    while (path.size()) {
-      auto const node_id = path.top();
-      std::cout << get_node_string(node_id, tree_rep, json_input)
-                << (path.size() > 1 ? " -> " : "");
-      path.pop();
-    }
-    std::cout << "\n";
-  }
-}
-}  // namespace
-
-// Base test fixture for tests
-struct JsonTest : public cudf::test::BaseFixture {
-};
-
-TEST_F(JsonTest, StackContext)
-{
-  // Type used to represent the atomic symbol type used within the finite-state machine
-  using SymbolT      = char;
-  using StackSymbolT = char;
-
-  // Prepare cuda stream for data transfers & kernels
-  rmm::cuda_stream stream{};
-  rmm::cuda_stream_view stream_view(stream);
-
-  // Test input
-  std::string input = R"(  [{)"
-                      R"("category": "reference",)"
-                      R"("index:": [4,12,42],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "[Sayings of the Century]",)"
-                      R"("price": 8.95)"
-                      R"(},  )"
-                      R"({)"
-                      R"("category": "reference",)"
-                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "{}\\\"[], <=semantic-symbols-string\\\\",)"
-                      R"("price": 8.95)"
-                      R"(}] )";
-
-  // Prepare input & output buffers
-  rmm::device_uvector<SymbolT> d_input(input.size(), stream_view);
-  hostdevice_vector<StackSymbolT> stack_context(input.size(), stream_view);
-
-  ASSERT_CUDA_SUCCEEDED(cudaMemcpyAsync(d_input.data(),
-                                        input.data(),
-                                        input.size() * sizeof(SymbolT),
-                                        cudaMemcpyHostToDevice,
-                                        stream.value()));
-
-  // Run algorithm
-  cuio_json::detail::get_stack_context(d_input, stack_context.device_ptr(), stream_view);
-
-  // Copy back the results
-  stack_context.device_to_host(stream_view);
-
-  // Make sure we copied back the stack context
-  stream_view.synchronize();
-
-  std::vector<char> golden_stack_context{
-    '_', '_', '_', '[', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '[', '[', '[', '[', '[', '[', '[', '[', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '[', '[', '[', '[', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '[', '[', '[', '{', '[', '[', '[', '[', '[', '[', '[', '{',
-    '{', '{', '{', '{', '[', '{', '{', '[', '[', '[', '{', '[', '{', '{', '[', '[', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '[', '_'};
-
-  ASSERT_EQ(golden_stack_context.size(), stack_context.size());
-  CUDF_TEST_EXPECT_VECTOR_EQUAL(golden_stack_context, stack_context, stack_context.size());
-}
-
-TEST_F(JsonTest, StackContextUtf8)
-{
-  // Type used to represent the atomic symbol type used within the finite-state machine
-  using SymbolT      = char;
-  using StackSymbolT = char;
-
-  // Prepare cuda stream for data transfers & kernels
-  rmm::cuda_stream stream{};
-  rmm::cuda_stream_view stream_view(stream);
-
-  // Test input
-  std::string input = R"([{"a":{"year":1882,"author": "Bharathi"}, {"a":"filip ʒakotɛ"}}])";
-
-  // Prepare input & output buffers
-  rmm::device_uvector<SymbolT> d_input(input.size(), stream_view);
-  hostdevice_vector<StackSymbolT> stack_context(input.size(), stream_view);
-
-  ASSERT_CUDA_SUCCEEDED(cudaMemcpyAsync(d_input.data(),
-                                        input.data(),
-                                        input.size() * sizeof(SymbolT),
-                                        cudaMemcpyHostToDevice,
-                                        stream.value()));
-
-  // Run algorithm
-  cuio_json::detail::get_stack_context(d_input, stack_context.device_ptr(), stream_view);
-
-  // Copy back the results
-  stack_context.device_to_host(stream_view);
-
-  // Make sure we copied back the stack context
-  stream_view.synchronize();
-
-  std::vector<char> golden_stack_context{
-    '_', '[', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
-    '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '['};
-
-  ASSERT_EQ(golden_stack_context.size(), stack_context.size());
-  CUDF_TEST_EXPECT_VECTOR_EQUAL(golden_stack_context, stack_context, stack_context.size());
-}
-
-TEST_F(JsonTest, TokenStream)
-{
-  using cuio_json::PdaTokenT;
-  using cuio_json::SymbolOffsetT;
-  using cuio_json::SymbolT;
-
-  constexpr std::size_t single_item = 1;
-
-  // Prepare cuda stream for data transfers & kernels
-  rmm::cuda_stream stream{};
-  rmm::cuda_stream_view stream_view(stream);
-
-  // Test input
-  std::string input = R"(  [{)"
-                      R"("category": "reference",)"
-                      R"("index:": [4,12,42],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "[Sayings of the Century]",)"
-                      R"("price": 8.95)"
-                      R"(},  )"
-                      R"({)"
-                      R"("category": "reference",)"
-                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "{}[], <=semantic-symbols-string",)"
-                      R"("price": 8.95)"
-                      R"(}] )";
-
-  // Prepare input & output buffers
-  rmm::device_uvector<SymbolT> d_input(input.size(), stream_view);
-
-  ASSERT_CUDA_SUCCEEDED(cudaMemcpyAsync(d_input.data(),
-                                        input.data(),
-                                        input.size() * sizeof(SymbolT),
-                                        cudaMemcpyHostToDevice,
-                                        stream.value()));
-
-  hostdevice_vector<PdaTokenT> tokens_gpu{input.size(), stream_view};
-  hostdevice_vector<SymbolOffsetT> token_indices_gpu{input.size(), stream_view};
-  hostdevice_vector<SymbolOffsetT> num_tokens_out{single_item, stream_view};
-
-  // Parse the JSON and get the token stream
-  cuio_json::detail::get_token_stream(d_input,
-                                      tokens_gpu.device_ptr(),
-                                      token_indices_gpu.device_ptr(),
-                                      num_tokens_out.device_ptr(),
-                                      stream_view);
-
-  // Copy back the number of tokens that were written
-  num_tokens_out.device_to_host(stream_view);
-  tokens_gpu.device_to_host(stream_view);
-  token_indices_gpu.device_to_host(stream_view);
-
-  // Make sure we copied back all relevant data
-  stream_view.synchronize();
-
-  // Golden token stream sample
-  using token_t = cuio_json::token_t;
-  std::vector<std::pair<std::size_t, cuio_json::PdaTokenT>> golden_token_stream = {
-    {2, token_t::ListBegin},
-    {3, token_t::StructBegin},
-    {4, token_t::StructMemberBegin},
-    {4, token_t::FieldNameBegin},
-    {13, token_t::FieldNameEnd},
-    {16, token_t::StringBegin},
-    {26, token_t::StringEnd},
-    {27, token_t::StructMemberEnd},
-    {28, token_t::StructMemberBegin},
-    {28, token_t::FieldNameBegin},
-    {35, token_t::FieldNameEnd},
-    {38, token_t::ListBegin},
-    {39, token_t::ValueBegin},
-    {40, token_t::ValueEnd},
-    {41, token_t::ValueBegin},
-    {43, token_t::ValueEnd},
-    {44, token_t::ValueBegin},
-    {46, token_t::ValueEnd},
-    {46, token_t::ListEnd},
-    {47, token_t::StructMemberEnd},
-    {48, token_t::StructMemberBegin},
-    {48, token_t::FieldNameBegin},
-    {55, token_t::FieldNameEnd},
-    {58, token_t::StringBegin},
-    {69, token_t::StringEnd},
-    {70, token_t::StructMemberEnd},
-    {71, token_t::StructMemberBegin},
-    {71, token_t::FieldNameBegin},
-    {77, token_t::FieldNameEnd},
-    {80, token_t::StringBegin},
-    {105, token_t::StringEnd},
-    {106, token_t::StructMemberEnd},
-    {107, token_t::StructMemberBegin},
-    {107, token_t::FieldNameBegin},
-    {113, token_t::FieldNameEnd},
-    {116, token_t::ValueBegin},
-    {120, token_t::ValueEnd},
-    {120, token_t::StructMemberEnd},
-    {120, token_t::StructEnd},
-    {124, token_t::StructBegin},
-    {125, token_t::StructMemberBegin},
-    {125, token_t::FieldNameBegin},
-    {134, token_t::FieldNameEnd},
-    {137, token_t::StringBegin},
-    {147, token_t::StringEnd},
-    {148, token_t::StructMemberEnd},
-    {149, token_t::StructMemberBegin},
-    {149, token_t::FieldNameBegin},
-    {155, token_t::FieldNameEnd},
-    {158, token_t::ListBegin},
-    {159, token_t::ValueBegin},
-    {160, token_t::ValueEnd},
-    {161, token_t::StructBegin},
-    {162, token_t::StructEnd},
-    {164, token_t::ValueBegin},
-    {168, token_t::ValueEnd},
-    {169, token_t::StructBegin},
-    {170, token_t::StructMemberBegin},
-    {170, token_t::FieldNameBegin},
-    {172, token_t::FieldNameEnd},
-    {174, token_t::ListBegin},
-    {175, token_t::StructBegin},
-    {177, token_t::StructEnd},
-    {180, token_t::StructBegin},
-    {181, token_t::StructEnd},
-    {182, token_t::ListEnd},
-    {184, token_t::StructMemberEnd},
-    {184, token_t::StructEnd},
-    {186, token_t::ListEnd},
-    {187, token_t::StructMemberEnd},
-    {188, token_t::StructMemberBegin},
-    {188, token_t::FieldNameBegin},
-    {195, token_t::FieldNameEnd},
-    {198, token_t::StringBegin},
-    {209, token_t::StringEnd},
-    {210, token_t::StructMemberEnd},
-    {211, token_t::StructMemberBegin},
-    {211, token_t::FieldNameBegin},
-    {217, token_t::FieldNameEnd},
-    {220, token_t::StringBegin},
-    {252, token_t::StringEnd},
-    {253, token_t::StructMemberEnd},
-    {254, token_t::StructMemberBegin},
-    {254, token_t::FieldNameBegin},
-    {260, token_t::FieldNameEnd},
-    {263, token_t::ValueBegin},
-    {267, token_t::ValueEnd},
-    {267, token_t::StructMemberEnd},
-    {267, token_t::StructEnd},
-    {268, token_t::ListEnd}};
-
-  // Verify the number of tokens matches
-  ASSERT_EQ(golden_token_stream.size(), num_tokens_out[0]);
-
-  for (std::size_t i = 0; i < num_tokens_out[0]; i++) {
-    // Ensure the index the tokens are pointing to do match
-    EXPECT_EQ(golden_token_stream[i].first, token_indices_gpu[i]) << "Mismatch at #" << i;
-
-    // Ensure the token category is correct
-    EXPECT_EQ(golden_token_stream[i].second, tokens_gpu[i]) << "Mismatch at #" << i;
-  }
-}
-
-TEST_F(JsonTest, TreeRepresentation)
-{
-  // Test input
-  std::string input = R"(  [{)"
-                      R"("category": "reference",)"
-                      R"("index:": [4,12,42],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "[Sayings of the Century]",)"
-                      R"("price": 8.95)"
-                      R"(},  )"
-                      R"({)"
-                      R"("category": "reference",)"
-                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "{}[], <=semantic-symbols-string",)"
-                      R"("price": 8.95)"
-                      R"(}] )";
-
-  // Get the JSON's tree representation
-  auto tree_rep = cuio_json::detail::get_tree_representation(input, cudf::default_stream_value);
-
-  // Print tree representation
-  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }
-
-  // Golden sample of node categories
-  std::vector<cuio_json::node_t> golden_node_categories = {
-    cuio_json::NC_LIST, cuio_json::NC_STRUCT, cuio_json::NC_FN,     cuio_json::NC_STR,
-    cuio_json::NC_FN,   cuio_json::NC_LIST,   cuio_json::NC_VAL,    cuio_json::NC_VAL,
-    cuio_json::NC_VAL,  cuio_json::NC_FN,     cuio_json::NC_STR,    cuio_json::NC_FN,
-    cuio_json::NC_STR,  cuio_json::NC_FN,     cuio_json::NC_VAL,    cuio_json::NC_STRUCT,
-    cuio_json::NC_FN,   cuio_json::NC_STR,    cuio_json::NC_FN,     cuio_json::NC_LIST,
-    cuio_json::NC_VAL,  cuio_json::NC_STRUCT, cuio_json::NC_VAL,    cuio_json::NC_STRUCT,
-    cuio_json::NC_FN,   cuio_json::NC_LIST,   cuio_json::NC_STRUCT, cuio_json::NC_STRUCT,
-    cuio_json::NC_FN,   cuio_json::NC_STR,    cuio_json::NC_FN,     cuio_json::NC_STR,
-    cuio_json::NC_FN,   cuio_json::NC_VAL};
-
-  // Golden sample of node ids
-  std::vector<cuio_json::NodeIndexT> golden_parent_node_ids = {cuio_json::parent_node_sentinel,
-                                                               0,
-                                                               1,
-                                                               2,
-                                                               1,
-                                                               4,
-                                                               5,
-                                                               5,
-                                                               5,
-                                                               1,
-                                                               9,
-                                                               1,
-                                                               11,
-                                                               1,
-                                                               13,
-                                                               0,
-                                                               15,
-                                                               16,
-                                                               15,
-                                                               18,
-                                                               19,
-                                                               19,
-                                                               19,
-                                                               19,
-                                                               23,
-                                                               24,
-                                                               25,
-                                                               25,
-                                                               15,
-                                                               28,
-                                                               15,
-                                                               30,
-                                                               15,
-                                                               32};
-
-  // Golden sample of node levels
-  std::vector<cuio_json::TreeDepthT> golden_node_levels = {0, 1, 2, 3, 2, 3, 4, 4, 4, 2, 3, 2,
-                                                           3, 2, 3, 1, 2, 3, 2, 3, 4, 4, 4, 4,
-                                                           5, 6, 7, 7, 2, 3, 2, 3, 2, 3};
-
-  // Golden sample of the character-ranges from the original input that each node demarcates
-  std::vector<std::size_t> golden_node_range_begin = {
-    2,   3,   5,   17,  29,  38,  39,  41,  44,  49,  59,  72,  81,  108, 116, 124, 126,
-    138, 150, 158, 159, 161, 164, 169, 171, 174, 175, 180, 189, 199, 212, 221, 255, 263};
-
-  // Golden sample of the character-ranges from the original input that each node demarcates
-  std::vector<std::size_t> golden_node_range_end = {
-    3,   4,   13,  26,  35,  39,  40,  43,  46,  55,  69,  77,  105, 113, 120, 125, 134,
-    147, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 209, 217, 252, 260, 267};
-
-  // Check results against golden samples
-  ASSERT_EQ(golden_node_categories.size(), tree_rep.node_categories.size());
-  ASSERT_EQ(golden_parent_node_ids.size(), tree_rep.parent_node_ids.size());
-  ASSERT_EQ(golden_node_levels.size(), tree_rep.node_levels.size());
-  ASSERT_EQ(golden_node_range_begin.size(), tree_rep.node_range_begin.size());
-  ASSERT_EQ(golden_node_range_end.size(), tree_rep.node_range_end.size());
-
-  for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
-    ASSERT_EQ(golden_node_categories[i], tree_rep.node_categories[i]);
-    ASSERT_EQ(golden_parent_node_ids[i], tree_rep.parent_node_ids[i]);
-    ASSERT_EQ(golden_node_levels[i], tree_rep.node_levels[i]);
-    ASSERT_EQ(golden_node_range_begin[i], tree_rep.node_range_begin[i]);
-    ASSERT_EQ(golden_node_range_end[i], tree_rep.node_range_end[i]);
-  }
-}

From a0837094ea8a4773f4e6eedd000f4eb6f733de20 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Wed, 24 Aug 2022 00:45:29 +0530
Subject: [PATCH 06/27] add print_tree debug print

---
 cpp/src/io/json/nested_json.hpp    |  11 +-
 cpp/src/io/json/nested_json_gpu.cu | 155 +++++++++++++++++------------
 2 files changed, 98 insertions(+), 68 deletions(-)

diff --git a/cpp/src/io/json/nested_json.hpp b/cpp/src/io/json/nested_json.hpp
index e4e97668c14..2d1383bec09 100644
--- a/cpp/src/io/json/nested_json.hpp
+++ b/cpp/src/io/json/nested_json.hpp
@@ -72,6 +72,13 @@ struct tree_meta_t {
   std::vector<SymbolOffsetT> node_range_begin;
   std::vector<SymbolOffsetT> node_range_end;
 };
+struct tree_meta_t2 {
+  rmm::device_uvector<NodeT> node_categories;
+  rmm::device_uvector<NodeIndexT> parent_node_ids;
+  rmm::device_uvector<TreeDepthT> node_levels;
+  rmm::device_uvector<SymbolOffsetT> node_range_begin;
+  rmm::device_uvector<SymbolOffsetT> node_range_end;
+};
 
 constexpr NodeIndexT parent_node_sentinel = std::numeric_limits<NodeIndexT>::max();
 
@@ -308,8 +315,8 @@ void get_token_stream(device_span<SymbolT const> d_json_in,
  */
 tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_stream_view stream);
 
-tree_meta_t get_tree_representation_gpu(device_span<SymbolT const> d_input,
-                                        rmm::cuda_stream_view stream);
+tree_meta_t2 get_tree_representation_gpu(device_span<SymbolT const> d_input,
+                                         rmm::cuda_stream_view stream);
 
 /**
  * @brief Parses the given JSON string and generates table from the given input.
diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 8c28885515a..6057ed29dbf 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -852,6 +852,58 @@ void get_token_stream(device_span<SymbolT const> json_in,
                                stream);
 }
 
+void print_tree(tree_meta_t const& cpu_tree, tree_meta_t const& gpu_tree)
+{
+  // DEBUG prints
+  auto to_cat = [](auto v) -> std::string {
+    switch (v) {
+      case NC_STRUCT: return " S";
+      case NC_LIST: return " L";
+      case NC_STR: return " \"";
+      case NC_VAL: return " V";
+      case NC_FN: return " F";
+      case NC_ERR: return "ER";
+      default: return "UN";
+    };
+  };
+  auto to_int    = [](auto v) { return std::to_string(static_cast<int>(v)); };
+  bool mismatch  = false;
+  auto print_vec = [&](auto const& cpu, auto const& gpu, auto const name, auto converter) {
+    if (not cpu.empty()) {
+      for (auto const& v : cpu)
+        printf("%3s,", converter(v).c_str());
+      std::cout << name << "(CPU):" << std::endl;
+    }
+    if (not cpu.empty()) {
+      for (auto const& v : gpu)
+        printf("%3s,", converter(v).c_str());
+      std::cout << name << "(GPU):" << std::endl;
+    }
+    if (not cpu.empty() and not gpu.empty()) {
+      if (!std::equal(gpu.begin(), gpu.end(), cpu.begin())) {
+        for (auto i = 0lu; i < cpu.size(); i++) {
+          mismatch |= (gpu[i] != cpu[i]);
+          printf("%3s,", (gpu[i] == cpu[i] ? " " : "x"));
+        }
+        std::cout << std::endl;
+      }
+    }
+  };
+#define PRINT_VEC(vec) print_vec(cpu_tree.vec, gpu_tree.vec, #vec, to_int);
+  for (int i = 0; i < int(gpu_tree.node_categories.size()); i++)
+    printf("%3d,", i);
+  printf(" node_id\n");
+  print_vec(
+    cpu_tree.node_categories, gpu_tree.node_categories, "node_categories", to_cat);  // Works
+  PRINT_VEC(node_levels);                                                            // Works
+  // PRINT_VEC(node_range_begin);  // Works
+  // PRINT_VEC(node_range_end);    // Works
+  PRINT_VEC(parent_node_ids);  // Works
+  CUDF_EXPECTS(!mismatch, "Mismatch in GPU and CPU tree representation");
+  // std::cout << "Mismatch: " << mismatch << std::endl;
+#undef PRINT_VEC
+}
+
 tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_stream_view stream)
 {
   constexpr std::size_t single_item = 1;
@@ -879,7 +931,8 @@ tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_st
   // Make sure tokens have been copied to the host
   stream.synchronize();
 
-  auto to_token_str = [](PdaTokenT token) {
+  // DEBUG print
+  [[maybe_unused]] auto to_token_str = [](PdaTokenT token) {
     switch (token) {
       case token_t::StructBegin: return " {";
       case token_t::StructEnd: return " }";
@@ -897,6 +950,11 @@ tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_st
       default: return ".";
     }
   };
+  std::cout << "Tokens: \n";
+  for (auto i = 0u; i < num_tokens_out[0]; i++) {
+    std::cout << to_token_str(tokens_gpu[i]) << " ";
+  }
+  std::cout << std::endl;
 
   // Whether a token does represent a node in the tree representation
   auto is_node = [](PdaTokenT const token) {
@@ -1036,65 +1094,30 @@ tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_st
     if (is_node(token)) { node_id++; }
   }
 
+  // GPU generation.
+  auto d_value         = get_tree_representation_gpu(d_input, stream);
+  tree_meta_t gpu_tree = {cudf::detail::make_std_vector_async(d_value.node_categories, stream),
+                          cudf::detail::make_std_vector_async(d_value.parent_node_ids, stream),
+                          cudf::detail::make_std_vector_async(d_value.node_levels, stream),
+                          cudf::detail::make_std_vector_async(d_value.node_range_begin, stream),
+                          cudf::detail::make_std_vector_async(d_value.node_range_end, stream)};
+
+  tree_meta_t cpu_tree = {std::move(node_categories),
+                          std::move(parent_node_ids),
+                          std::move(node_levels),
+                          std::move(node_range_begin),
+                          std::move(node_range_end)};
   // DEBUG prints
-  auto print_cat = [](auto const& gpu, auto const& cpu, auto const name) {
-    auto to_cat = [](auto v) {
-      switch (v) {
-        case NC_STRUCT: return " S";
-        case NC_LIST: return " L";
-        case NC_STR: return " \"";
-        case NC_VAL: return " V";
-        case NC_FN: return " F";
-        case NC_ERR: return "ER";
-        default: return "UN";
-      };
-    };
-    for (auto const& v : cpu)
-      printf("%s,", to_cat(v));
-    std::cout << name << "(CPU):" << std::endl;
-    for (auto const& v : gpu)
-      printf("%s,", to_cat(v));
-    std::cout << name << "(GPU):" << std::endl;
-    if (!std::equal(gpu.begin(), gpu.end(), cpu.begin())) {
-      for (auto i = 0lu; i < cpu.size(); i++)
-        printf("%2s,", (gpu[i] == cpu[i] ? " " : "x"));
-      std::cout << std::endl;
-    }
-  };
-  bool mismatch  = false;
-  auto print_vec = [&](auto const& gpu, auto const& cpu, auto const name) {
-    for (auto const& v : cpu)
-      printf("%2d,", int(v));
-    std::cout << name << "(CPU):" << std::endl;
-    for (auto const& v : gpu)
-      printf("%2d,", int(v));
-    std::cout << name << "(GPU):" << std::endl;
-    if (!std::equal(gpu.begin(), gpu.end(), cpu.begin())) {
-      for (auto i = 0lu; i < cpu.size(); i++) {
-        mismatch |= (gpu[i] != cpu[i]);
-        printf("%2s,", (gpu[i] == cpu[i] ? " " : "x"));
-      }
-      std::cout << std::endl;
-    }
-  };
-
-#define PRINT_VEC(vec) print_vec(value.vec, vec, #vec);
-  auto value = get_tree_representation_gpu(d_input, stream);
-  // PRINT_VEC(node_categories); //Works
-  print_cat(value.node_categories, node_categories, "node_categories");
-  PRINT_VEC(node_levels);       // Works
-  PRINT_VEC(node_range_begin);  // Works
-  PRINT_VEC(node_range_end);    // Works
-  PRINT_VEC(parent_node_ids);   // Works
-  CUDF_EXPECTS(!mismatch, "Mismatch in GPU and CPU tree representation");
-  std::cout << "Mismatch: " << mismatch << std::endl;
+  print_tree(cpu_tree, gpu_tree);
+  for (int i = 0; i < int(cpu_tree.node_range_begin.size()); i++) {
+    printf("%3s ",
+           std::string(input.data() + cpu_tree.node_range_begin[i],
+                       cpu_tree.node_range_end[i] - cpu_tree.node_range_begin[i])
+             .c_str());
+  }
+  printf(" (JSON)\n");
 
-#undef PRINT_VEC
-  return {std::move(node_categories),
-          std::move(parent_node_ids),
-          std::move(node_levels),
-          std::move(node_range_begin),
-          std::move(node_range_end)};
+  return cpu_tree;
 }
 
 // The node that a token represents
@@ -1161,8 +1184,8 @@ struct node_ranges {
 };
 
 // GPU version of get_tree_representation
-tree_meta_t get_tree_representation_gpu(device_span<SymbolT const> d_input,
-                                        rmm::cuda_stream_view stream)
+tree_meta_t2 get_tree_representation_gpu(device_span<SymbolT const> d_input,
+                                         rmm::cuda_stream_view stream)
 {
   constexpr std::size_t single_item = 1;
   rmm::device_uvector<PdaTokenT> tokens_gpu{d_input.size(), stream};
@@ -1321,13 +1344,13 @@ tree_meta_t get_tree_representation_gpu(device_span<SymbolT const> d_input,
                                              is_node);
   CUDF_EXPECTS(parent_node_ids_end - parent_node_ids.begin() == num_nodes,
                "parent node id gather mismatch");
-
-  return {cudf::detail::make_std_vector_async(node_categories, stream),
-          cudf::detail::make_std_vector_async(parent_node_ids, stream),
-          cudf::detail::make_std_vector_async(node_levels, stream),
-          cudf::detail::make_std_vector_async(node_range_begin, stream),
-          cudf::detail::make_std_vector_async(node_range_end, stream)};
+  return {std::move(node_categories),
+          std::move(parent_node_ids),
+          std::move(node_levels),
+          std::move(node_range_begin),
+          std::move(node_range_end)};
 }
+
 /**
  * @brief Parses the given JSON string and generates a tree representation of the given input.
  *

From 1768fe7183b34e3ed8b4686d6c57a1972d4eccc7 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Wed, 24 Aug 2022 11:46:05 +0530
Subject: [PATCH 07/27] fix valueEnd, structMemberEnd, add relevant unit test

---
 cpp/src/io/json/nested_json_gpu.cu |   2 +-
 cpp/tests/io/nested_json_test.cpp  | 118 +++++++++++++++++++++--------
 2 files changed, 88 insertions(+), 32 deletions(-)

diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 6057ed29dbf..5b4179ac410 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -569,7 +569,7 @@ auto get_translation_table()
      {token_t::ErrorBegin},
      {token_t::ErrorBegin},
      {token_t::ErrorBegin},
-     {token_t::ValueEnd},
+     {token_t::ValueEnd, token_t::StructMemberEnd},
      {token_t::ErrorBegin},
      {token_t::ValueEnd},
      {}}};
diff --git a/cpp/tests/io/nested_json_test.cpp b/cpp/tests/io/nested_json_test.cpp
index 672d87fe616..597590092a3 100644
--- a/cpp/tests/io/nested_json_test.cpp
+++ b/cpp/tests/io/nested_json_test.cpp
@@ -285,7 +285,7 @@ TEST_F(JsonTest, StackContextUtf8)
   CUDF_TEST_EXPECT_VECTOR_EQUAL(golden_stack_context, stack_context, stack_context.size());
 }
 
-TEST_F(JsonTest, TokenStream)
+auto get_token_stream_to_host(std::string& input, rmm::cuda_stream_view stream)
 {
   using cuio_json::PdaTokenT;
   using cuio_json::SymbolOffsetT;
@@ -293,53 +293,57 @@ TEST_F(JsonTest, TokenStream)
 
   constexpr std::size_t single_item = 1;
 
-  // Prepare cuda stream for data transfers & kernels
-  rmm::cuda_stream stream{};
-  rmm::cuda_stream_view stream_view(stream);
-
-  // Test input
-  std::string input = R"(  [{)"
-                      R"("category": "reference",)"
-                      R"("index:": [4,12,42],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "[Sayings of the Century]",)"
-                      R"("price": 8.95)"
-                      R"(},  )"
-                      R"({)"
-                      R"("category": "reference",)"
-                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "{}[], <=semantic-symbols-string",)"
-                      R"("price": 8.95)"
-                      R"(}] )";
-
   // Prepare input & output buffers
-  rmm::device_uvector<SymbolT> d_input(input.size(), stream_view);
+  rmm::device_uvector<SymbolT> d_input(input.size(), stream);
 
-  ASSERT_CUDA_SUCCEEDED(cudaMemcpyAsync(d_input.data(),
+  EXPECT_CUDA_SUCCEEDED(cudaMemcpyAsync(d_input.data(),
                                         input.data(),
                                         input.size() * sizeof(SymbolT),
                                         cudaMemcpyHostToDevice,
                                         stream.value()));
 
-  hostdevice_vector<PdaTokenT> tokens_gpu{input.size(), stream_view};
-  hostdevice_vector<SymbolOffsetT> token_indices_gpu{input.size(), stream_view};
-  hostdevice_vector<SymbolOffsetT> num_tokens_out{single_item, stream_view};
+  hostdevice_vector<PdaTokenT> tokens_gpu{input.size(), stream};
+  hostdevice_vector<SymbolOffsetT> token_indices_gpu{input.size(), stream};
+  hostdevice_vector<SymbolOffsetT> num_tokens_out{single_item, stream};
 
   // Parse the JSON and get the token stream
   cuio_json::detail::get_token_stream(d_input,
                                       tokens_gpu.device_ptr(),
                                       token_indices_gpu.device_ptr(),
                                       num_tokens_out.device_ptr(),
-                                      stream_view);
+                                      stream);
 
   // Copy back the number of tokens that were written
-  num_tokens_out.device_to_host(stream_view);
-  tokens_gpu.device_to_host(stream_view);
-  token_indices_gpu.device_to_host(stream_view);
+  tokens_gpu.device_to_host(stream);
+  token_indices_gpu.device_to_host(stream);
+  num_tokens_out.device_to_host(stream);
 
   // Make sure we copied back all relevant data
-  stream_view.synchronize();
+  stream.synchronize();
+  return std::make_tuple(
+    std::move(tokens_gpu), std::move(token_indices_gpu), std::move(num_tokens_out));
+}
+
+TEST_F(JsonTest, TokenStream)
+{
+  // Test input
+  std::string input = R"(  [{)"
+                      R"("category": "reference",)"
+                      R"("index:": [4,12,42],)"
+                      R"("author": "Nigel Rees",)"
+                      R"("title": "[Sayings of the Century]",)"
+                      R"("price": 8.95)"
+                      R"(},  )"
+                      R"({)"
+                      R"("category": "reference",)"
+                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
+                      R"("author": "Nigel Rees",)"
+                      R"("title": "{}[], <=semantic-symbols-string",)"
+                      R"("price": 8.95)"
+                      R"(}] )";
+  // Parse the JSON and get the token stream
+  auto [tokens_gpu, token_indices_gpu, num_tokens_out] =
+    get_token_stream_to_host(input, cudf::default_stream_value);
 
   // Golden token stream sample
   using token_t = cuio_json::token_t;
@@ -447,6 +451,58 @@ TEST_F(JsonTest, TokenStream)
   }
 }
 
+TEST_F(JsonTest, TokenStream2)
+{
+  // value end with comma, space, close-brace ", }"
+  std::string input =
+    R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9}, "b" : {"x": 10 , "z": 11}}])";
+
+  // Golden token stream sample
+  using token_t = cuio_json::token_t;
+  // clang-format off
+  std::vector<std::pair<std::size_t, cuio_json::PdaTokenT>> golden_token_stream = {
+    {0, token_t::ListBegin},
+    {2, token_t::StructBegin}, {3, token_t::StructEnd}, //{}
+    {6, token_t::StructBegin},
+        {8, token_t::StructMemberBegin}, {8, token_t::FieldNameBegin}, {10, token_t::FieldNameEnd}, //a
+            {13, token_t::StructBegin},
+                {15, token_t::StructMemberBegin}, {15, token_t::FieldNameBegin}, {17, token_t::FieldNameEnd}, {21, token_t::ValueBegin}, {22, token_t::ValueEnd}, {22, token_t::StructMemberEnd}, //a.y
+                {24, token_t::StructMemberBegin}, {24, token_t::FieldNameBegin},  {26, token_t::FieldNameEnd},  {29, token_t::ListBegin}, {30, token_t::ListEnd}, {32, token_t::StructMemberEnd}, //a.z
+            {32, token_t::StructEnd},
+        {33, token_t::StructMemberEnd},
+    {33, token_t::StructEnd},
+    {36, token_t::StructBegin},
+        {38, token_t::StructMemberBegin}, {38, token_t::FieldNameBegin}, {40, token_t::FieldNameEnd}, //a
+            {44, token_t::StructBegin},
+                {46, token_t::StructMemberBegin}, {46, token_t::FieldNameBegin}, {48, token_t::FieldNameEnd}, {52, token_t::ValueBegin}, {53, token_t::ValueEnd}, {53, token_t::StructMemberEnd}, //a.x
+                {55, token_t::StructMemberBegin}, {55, token_t::FieldNameBegin}, {57, token_t::FieldNameEnd}, {60, token_t::ValueBegin}, {61, token_t::ValueEnd}, {61, token_t::StructMemberEnd}, //a.y
+            {61, token_t::StructEnd},
+        {62, token_t::StructMemberEnd},
+        {64, token_t::StructMemberBegin}, {64, token_t::FieldNameBegin}, {66, token_t::FieldNameEnd}, //b
+            {70, token_t::StructBegin},
+                {71, token_t::StructMemberBegin}, {71, token_t::FieldNameBegin}, {73, token_t::FieldNameEnd}, {76, token_t::ValueBegin}, {78, token_t::ValueEnd}, {79, token_t::StructMemberEnd}, //b.x
+                {81, token_t::StructMemberBegin}, {81, token_t::FieldNameBegin}, {83, token_t::FieldNameEnd}, {86, token_t::ValueBegin}, {88, token_t::ValueEnd}, {88, token_t::StructMemberEnd}, //b.z
+            {88, token_t::StructEnd},
+        {89, token_t::StructMemberEnd},
+    {89, token_t::StructEnd},
+    {90, token_t::ListEnd}};
+  // clang-format on
+
+  auto [tokens_gpu, token_indices_gpu, num_tokens_out] =
+    get_token_stream_to_host(input, cudf::default_stream_value);
+
+  // Verify the number of tokens matches
+  ASSERT_EQ(golden_token_stream.size(), num_tokens_out[0]);
+
+  for (std::size_t i = 0; i < num_tokens_out[0]; i++) {
+    // Ensure the index the tokens are pointing to do match
+    EXPECT_EQ(golden_token_stream[i].first, token_indices_gpu[i]) << "Mismatch at #" << i;
+
+    // Ensure the token category is correct
+    EXPECT_EQ(golden_token_stream[i].second, tokens_gpu[i]) << "Mismatch at #" << i;
+  }
+}
+
 TEST_F(JsonTest, TreeRepresentation)
 {
   // Test input

From 496482625eecfe36f23957940905531eae4efdb8 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Wed, 24 Aug 2022 13:34:25 +0530
Subject: [PATCH 08/27] add unit test for get_tree_representation with valueend
 space

---
 cpp/tests/io/nested_json_test.cpp | 59 +++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/cpp/tests/io/nested_json_test.cpp b/cpp/tests/io/nested_json_test.cpp
index 597590092a3..7a85b7dbf2b 100644
--- a/cpp/tests/io/nested_json_test.cpp
+++ b/cpp/tests/io/nested_json_test.cpp
@@ -606,6 +606,65 @@ TEST_F(JsonTest, TreeRepresentation)
   }
 }
 
+TEST_F(JsonTest, TreeRepresentation2)
+{
+  // Test input: value end with comma, space, close-brace ", }"
+  std::string input =
+    //  0         1         2         3         4         5         6         7         8         9
+    //  0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
+    R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9}, "b" : {"x": 10 , "z": 11}}])";
+
+  // Get the JSON's tree representation
+  auto tree_rep = cuio_json::detail::get_tree_representation(input, cudf::default_stream_value);
+
+  // Print tree representation
+  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }
+  // TODO compare with CPU version
+
+  // Golden sample of node categories
+  // clang-format off
+  std::vector<cuio_json::node_t> golden_node_categories = {
+    cuio_json::NC_LIST, cuio_json::NC_STRUCT,
+    cuio_json::NC_STRUCT, cuio_json::NC_FN, cuio_json::NC_STRUCT,  cuio_json::NC_FN,  cuio_json::NC_VAL, cuio_json::NC_FN,  cuio_json::NC_LIST,
+    cuio_json::NC_STRUCT, cuio_json::NC_FN, cuio_json::NC_STRUCT,  cuio_json::NC_FN,  cuio_json::NC_VAL, cuio_json::NC_FN,  cuio_json::NC_VAL,
+                          cuio_json::NC_FN, cuio_json::NC_STRUCT,  cuio_json::NC_FN,  cuio_json::NC_VAL, cuio_json::NC_FN,  cuio_json::NC_VAL};
+
+  // Golden sample of node ids
+  std::vector<cuio_json::NodeIndexT> golden_parent_node_ids = {cuio_json::parent_node_sentinel, 0,
+                                                               0, 2,  3,  4,  5,  4, 7,
+                                                               0, 9, 10, 11, 12, 11, 14,
+                                                                  9, 16, 17, 18, 17, 20};
+  // clang-format on
+
+  // Golden sample of node levels
+  std::vector<cuio_json::TreeDepthT> golden_node_levels = {
+    0, 1, 1, 2, 3, 4, 5, 4, 5, 1, 2, 3, 4, 5, 4, 5, 2, 3, 4, 5, 4, 5,
+  };
+
+  // Golden sample of the character-ranges from the original input that each node demarcates
+  std::vector<std::size_t> golden_node_range_begin = {0,  2,  6,  9,  13, 16, 21, 25, 29, 36, 39,
+                                                      44, 47, 52, 56, 60, 65, 70, 72, 76, 82, 86};
+
+  // Golden sample of the character-ranges from the original input that each node demarcates
+  std::vector<std::size_t> golden_node_range_end = {1,  3,  7,  10, 14, 17, 22, 26, 30, 37, 40,
+                                                    45, 48, 53, 57, 61, 66, 71, 73, 78, 83, 88};
+
+  // Check results against golden samples
+  ASSERT_EQ(golden_node_categories.size(), tree_rep.node_categories.size());
+  ASSERT_EQ(golden_parent_node_ids.size(), tree_rep.parent_node_ids.size());
+  ASSERT_EQ(golden_node_levels.size(), tree_rep.node_levels.size());
+  ASSERT_EQ(golden_node_range_begin.size(), tree_rep.node_range_begin.size());
+  ASSERT_EQ(golden_node_range_end.size(), tree_rep.node_range_end.size());
+
+  for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
+    ASSERT_EQ(golden_node_categories[i], tree_rep.node_categories[i]);
+    ASSERT_EQ(golden_parent_node_ids[i], tree_rep.parent_node_ids[i]);
+    ASSERT_EQ(golden_node_levels[i], tree_rep.node_levels[i]);
+    ASSERT_EQ(golden_node_range_begin[i], tree_rep.node_range_begin[i]);
+    ASSERT_EQ(golden_node_range_end[i], tree_rep.node_range_end[i]);
+  }
+}
+
 TEST_F(JsonTest, ExtractColumn)
 {
   using cuio_json::SymbolT;

From 4007f6ab4803b6c1efaadb7b381dc253e04d3da7 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Wed, 24 Aug 2022 23:09:33 +0530
Subject: [PATCH 09/27] cleanup, organize code

---
 cpp/CMakeLists.txt                 |   1 +
 cpp/src/io/json/json_tree.cu       | 282 ++++++++++++++++
 cpp/src/io/json/nested_json.hpp    |  13 +-
 cpp/src/io/json/nested_json_gpu.cu | 512 ----------------------------
 cpp/tests/CMakeLists.txt           |   2 +-
 cpp/tests/io/json_tree.cpp         | 517 +++++++++++++++++++++++++++++
 cpp/tests/io/nested_json_test.cpp  | 300 -----------------
 7 files changed, 803 insertions(+), 824 deletions(-)
 create mode 100644 cpp/src/io/json/json_tree.cu
 create mode 100644 cpp/tests/io/json_tree.cpp

diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt
index 44aaac54adb..cd1d1b4de22 100644
--- a/cpp/CMakeLists.txt
+++ b/cpp/CMakeLists.txt
@@ -328,6 +328,7 @@ add_library(
   src/io/csv/writer_impl.cu
   src/io/functions.cpp
   src/io/json/json_gpu.cu
+  src/io/json/json_tree.cu
   src/io/json/nested_json_gpu.cu
   src/io/json/reader_impl.cu
   src/io/json/experimental/read_json.cpp
diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
new file mode 100644
index 00000000000..8826f3217d7
--- /dev/null
+++ b/cpp/src/io/json/json_tree.cu
@@ -0,0 +1,282 @@
+/*
+ * Copyright (c) 2022, NVIDIA CORPORATION.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "nested_json.hpp"
+
+#include <io/utilities/hostdevice_vector.hpp>
+
+#include <thrust/copy.h>
+#include <thrust/iterator/counting_iterator.h>
+#include <thrust/iterator/transform_output_iterator.h>
+#include <thrust/iterator/zip_iterator.h>
+#include <thrust/sequence.h>
+#include <thrust/sort.h>
+#include <thrust/tabulate.h>
+
+#include <rmm/device_uvector.hpp>
+#include <rmm/exec_policy.hpp>
+
+// JSON tree generation from tokens
+// TODO JSON tree traversal
+namespace cudf::io::json {
+namespace detail {
+
+// DEBUG print
+template <typename T>
+void print_vec(T const& cpu, std::string const name)
+{
+  for (auto const& v : cpu)
+    printf("%3d,", int(v));
+  std::cout << name << std::endl;
+}
+
+// The node that a token represents
+struct token_to_node {
+  __device__ auto operator()(PdaTokenT const token) -> NodeT
+  {
+    switch (token) {
+      case token_t::StructBegin: return NC_STRUCT;
+      case token_t::ListBegin: return NC_LIST;
+      case token_t::StringBegin: return NC_STR;
+      case token_t::ValueBegin: return NC_VAL;
+      case token_t::FieldNameBegin: return NC_FN;
+      default: return NC_ERR;
+    };
+  }
+};
+
+// convert token indices to node range for each valid node.
+template <typename T1, typename T2, typename T3>
+struct node_ranges {
+  T1 tokens_gpu;
+  T2 token_indices_gpu;
+  T3 num_tokens;
+  __device__ auto operator()(size_type i) -> thrust::tuple<SymbolOffsetT, SymbolOffsetT>
+  {
+    // Whether a token expects to be followed by its respective end-of-* token partner
+    auto is_begin_of_section = [] __device__(PdaTokenT const token) {
+      switch (token) {
+        case token_t::StringBegin:
+        case token_t::ValueBegin:
+        case token_t::FieldNameBegin: return true;
+        default: return false;
+      };
+    };
+    // The end-of-* partner token for a given beginning-of-* token
+    auto end_of_partner = [] __device__(PdaTokenT const token) {
+      switch (token) {
+        case token_t::StringBegin: return token_t::StringEnd;
+        case token_t::ValueBegin: return token_t::ValueEnd;
+        case token_t::FieldNameBegin: return token_t::FieldNameEnd;
+        default: return token_t::ErrorBegin;
+      };
+    };
+    auto get_token_index = [] __device__(PdaTokenT const token, SymbolOffsetT const token_index) {
+      constexpr SymbolOffsetT skip_quote_char = 1;
+      switch (token) {
+        case token_t::StringBegin: return token_index + skip_quote_char;
+        case token_t::FieldNameBegin: return token_index + skip_quote_char;
+        default: return token_index;
+      };
+    };
+    PdaTokenT const token = tokens_gpu[i];
+    // The section from the original JSON input that this token demarcates
+    SymbolOffsetT range_begin = get_token_index(token, token_indices_gpu[i]);
+    SymbolOffsetT range_end   = range_begin + 1;
+    if (is_begin_of_section(token)) {
+      if ((i + 1) < num_tokens && end_of_partner(token) == tokens_gpu[i + 1]) {
+        // Update the range_end for this pair of tokens
+        range_end = token_indices_gpu[i + 1];
+      }
+    }
+    return thrust::make_tuple(range_begin, range_end);
+  }
+};
+
+// Parses the given JSON string and generates a tree representation of the given input.
+tree_meta_t get_tree_representation(device_span<SymbolT const> d_input,
+                                    rmm::cuda_stream_view stream)
+{
+  constexpr std::size_t single_item = 1;
+  rmm::device_uvector<PdaTokenT> tokens_gpu{d_input.size(), stream};
+  rmm::device_uvector<SymbolOffsetT> token_indices_gpu{d_input.size(), stream};
+  hostdevice_vector<SymbolOffsetT> num_tokens_out{single_item, stream};
+
+  // Parse the JSON and get the token stream
+  cudf::io::json::detail::get_token_stream(
+    d_input, tokens_gpu.data(), token_indices_gpu.data(), num_tokens_out.device_ptr(), stream);
+
+  // Copy the JSON token count to the host
+  num_tokens_out.device_to_host(stream);
+
+  // Make sure tokens have been copied to the host
+  stream.synchronize();
+
+  // Whether a token does represent a node in the tree representation
+  auto is_node = [] __device__(PdaTokenT const token) -> size_type {
+    switch (token) {
+      case token_t::StructBegin:
+      case token_t::ListBegin:
+      case token_t::StringBegin:
+      case token_t::ValueBegin:
+      case token_t::FieldNameBegin:
+      case token_t::ErrorBegin: return 1;
+      default: return 0;
+    };
+  };
+
+  // Whether the token pops from the parent node stack
+  auto does_pop = [] __device__(PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructMemberEnd:
+      case token_t::StructEnd:
+      case token_t::ListEnd: return true;
+      default: return false;
+    };
+  };
+
+  // Whether the token pushes onto the parent node stack
+  auto does_push = [] __device__(PdaTokenT const token) {
+    switch (token) {
+      // case token_t::StructMemberBegin: //TODO: Either use FieldNameBegin here or change the
+      // token_to_node function
+      case token_t::FieldNameBegin:
+      case token_t::StructBegin:
+      case token_t::ListBegin: return true;
+      default: return false;
+    };
+  };
+
+  auto num_tokens = num_tokens_out[0];
+  auto is_node_it = thrust::make_transform_iterator(tokens_gpu.begin(), is_node);
+  auto num_nodes  = thrust::reduce(rmm::exec_policy(stream), is_node_it, is_node_it + num_tokens);
+
+  // Node categories: copy_if with transform.
+  rmm::device_uvector<NodeT> node_categories(num_nodes, stream);
+  auto node_categories_it =
+    thrust::make_transform_output_iterator(node_categories.begin(), token_to_node{});
+  auto node_categories_end = thrust::copy_if(rmm::exec_policy(stream),
+                                             tokens_gpu.begin(),
+                                             tokens_gpu.begin() + num_tokens,
+                                             node_categories_it,
+                                             is_node);
+  CUDF_EXPECTS(node_categories_end - node_categories_it == num_nodes,
+               "node category count mismatch");
+
+  // Node levels: transform_exclusive_scan, copy_if.
+  rmm::device_uvector<size_type> token_levels(num_tokens, stream);
+  auto push_pop_it = thrust::make_transform_iterator(
+    tokens_gpu.begin(), [does_push, does_pop] __device__(PdaTokenT const token) -> size_type {
+      return does_push(token) ? 1 : (does_pop(token) ? -1 : 0);
+    });
+  thrust::exclusive_scan(
+    rmm::exec_policy(stream), push_pop_it, push_pop_it + num_tokens, token_levels.begin());
+
+  rmm::device_uvector<TreeDepthT> node_levels(num_nodes, stream);
+  auto node_levels_end = thrust::copy_if(rmm::exec_policy(stream),
+                                         token_levels.begin(),
+                                         token_levels.begin() + num_tokens,
+                                         tokens_gpu.begin(),
+                                         node_levels.begin(),
+                                         is_node);
+  CUDF_EXPECTS(node_levels_end - node_levels.begin() == num_nodes, "node level count mismatch");
+
+  // Node ranges: copy_if with transform.
+  rmm::device_uvector<SymbolOffsetT> node_range_begin(num_nodes, stream);
+  rmm::device_uvector<SymbolOffsetT> node_range_end(num_nodes, stream);
+  auto node_range_tuple_it =
+    thrust::make_zip_iterator(node_range_begin.begin(), node_range_end.begin());
+  using node_ranges_t    = node_ranges<decltype(tokens_gpu.begin()),
+                                    decltype(token_indices_gpu.begin()),
+                                    decltype(num_tokens)>;
+  auto node_range_out_it = thrust::make_transform_output_iterator(
+    node_range_tuple_it, node_ranges_t{tokens_gpu.begin(), token_indices_gpu.begin(), num_tokens});
+
+  auto node_range_out_end =
+    thrust::copy_if(rmm::exec_policy(stream),
+                    thrust::make_counting_iterator<size_type>(0),
+                    thrust::make_counting_iterator<size_type>(0) + num_tokens,
+                    node_range_out_it,
+                    [is_node, tokens_gpu = tokens_gpu.begin()] __device__(size_type i) -> bool {
+                      PdaTokenT const token = tokens_gpu[i];
+                      return is_node(token);
+                    });
+  CUDF_EXPECTS(node_range_out_end - node_range_out_it == num_nodes, "node range count mismatch");
+
+  // Node parent ids: previous push token_id transform, stable sort, segmented scan with Max,
+  // copy_if. This one is sort of logical stack. But more generalized. TODO: make it own function.
+  rmm::device_uvector<size_type> parent_token_ids(num_tokens, stream);  // XXX: fill with 0?
+  rmm::device_uvector<size_type> initial_order(num_tokens, stream);
+  thrust::sequence(rmm::exec_policy(stream), initial_order.begin(), initial_order.end());
+  thrust::tabulate(rmm::exec_policy(stream),
+                   parent_token_ids.begin(),
+                   parent_token_ids.end(),
+                   [does_push, tokens_gpu = tokens_gpu.begin()] __device__(auto i) -> size_type {
+                     if (i == 0)
+                       return -1;
+                     else
+                       return does_push(tokens_gpu[i - 1]) ? i - 1 : -1;  // XXX: -1 or 0?
+                   });
+  auto out_pid = thrust::make_zip_iterator(parent_token_ids.data(), initial_order.data());
+  // TODO: use radix sort.
+  thrust::stable_sort_by_key(rmm::exec_policy(stream),
+                             token_levels.data(),
+                             token_levels.data() + token_levels.size(),
+                             out_pid);
+  // SegmentedScan Max.
+  thrust::inclusive_scan_by_key(rmm::exec_policy(stream),
+                                token_levels.data(),
+                                token_levels.data() + token_levels.size(),
+                                parent_token_ids.data(),
+                                parent_token_ids.data(),  // size_type{-1},
+                                thrust::equal_to<size_type>{},
+                                thrust::maximum<size_type>{});
+  // TODO: Avoid sorting again by  gather_if on a transform iterator. or scatter.
+  thrust::sort_by_key(rmm::exec_policy(stream),
+                      initial_order.data(),
+                      initial_order.data() + initial_order.size(),
+                      parent_token_ids.data());
+  // thrust::scatter(rmm::exec_policy(stream),
+  //                parent_token_ids.begin(),
+  //                parent_token_ids.end(),
+  //                initial_order.data(),
+  //                parent_token_ids.begin()); //same location not allowed in scatter
+  rmm::device_uvector<size_type> node_ids_gpu(num_tokens, stream);
+  thrust::exclusive_scan(
+    rmm::exec_policy(stream), is_node_it, is_node_it + num_tokens, node_ids_gpu.begin());
+  rmm::device_uvector<NodeIndexT> parent_node_ids(num_nodes, stream);
+  auto parent_node_ids_it = thrust::make_transform_iterator(
+    parent_token_ids.begin(),
+    [node_ids_gpu = node_ids_gpu.begin()] __device__(size_type const pid) -> NodeIndexT {
+      return pid < 0 ? pid : node_ids_gpu[pid];
+    });
+  auto parent_node_ids_end = thrust::copy_if(rmm::exec_policy(stream),
+                                             parent_node_ids_it,
+                                             parent_node_ids_it + parent_token_ids.size(),
+                                             tokens_gpu.begin(),
+                                             parent_node_ids.begin(),
+                                             is_node);
+  CUDF_EXPECTS(parent_node_ids_end - parent_node_ids.begin() == num_nodes,
+               "parent node id gather mismatch");
+  return {std::move(node_categories),
+          std::move(parent_node_ids),
+          std::move(node_levels),
+          std::move(node_range_begin),
+          std::move(node_range_end)};
+}
+
+}  // namespace detail
+}  // namespace cudf::io::json
diff --git a/cpp/src/io/json/nested_json.hpp b/cpp/src/io/json/nested_json.hpp
index 2d1383bec09..08b6af526cb 100644
--- a/cpp/src/io/json/nested_json.hpp
+++ b/cpp/src/io/json/nested_json.hpp
@@ -66,13 +66,6 @@ using TreeDepthT = StackLevelT;
  * @brief Struct that encapsulate all information of a columnar tree representation.
  */
 struct tree_meta_t {
-  std::vector<NodeT> node_categories;
-  std::vector<NodeIndexT> parent_node_ids;
-  std::vector<TreeDepthT> node_levels;
-  std::vector<SymbolOffsetT> node_range_begin;
-  std::vector<SymbolOffsetT> node_range_end;
-};
-struct tree_meta_t2 {
   rmm::device_uvector<NodeT> node_categories;
   rmm::device_uvector<NodeIndexT> parent_node_ids;
   rmm::device_uvector<TreeDepthT> node_levels;
@@ -313,10 +306,8 @@ void get_token_stream(device_span<SymbolT const> d_json_in,
  * @param stream The CUDA stream to which kernels are dispatched
  * @return
  */
-tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_stream_view stream);
-
-tree_meta_t2 get_tree_representation_gpu(device_span<SymbolT const> d_input,
-                                         rmm::cuda_stream_view stream);
+tree_meta_t get_tree_representation(device_span<SymbolT const> d_input,
+                                    rmm::cuda_stream_view stream);
 
 /**
  * @brief Parses the given JSON string and generates table from the given input.
diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 5b4179ac410..e98abe8e0b0 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -32,19 +32,6 @@
 #include <rmm/device_uvector.hpp>
 #include <rmm/exec_policy.hpp>
 
-#include "thrust/functional.h"
-#include "thrust/iterator/counting_iterator.h"
-#include "thrust/iterator/transform_iterator.h"
-#include "thrust/iterator/transform_output_iterator.h"
-#include "thrust/iterator/zip_iterator.h"
-#include "thrust/sequence.h"
-#include "thrust/sort.h"
-#include "thrust/tabulate.h"
-#include <thrust/gather.h>
-#include <thrust/scan.h>
-
-#include <stack>
-
 #include <thrust/device_vector.h>
 #include <thrust/transform.h>
 
@@ -852,505 +839,6 @@ void get_token_stream(device_span<SymbolT const> json_in,
                                stream);
 }
 
-void print_tree(tree_meta_t const& cpu_tree, tree_meta_t const& gpu_tree)
-{
-  // DEBUG prints
-  auto to_cat = [](auto v) -> std::string {
-    switch (v) {
-      case NC_STRUCT: return " S";
-      case NC_LIST: return " L";
-      case NC_STR: return " \"";
-      case NC_VAL: return " V";
-      case NC_FN: return " F";
-      case NC_ERR: return "ER";
-      default: return "UN";
-    };
-  };
-  auto to_int    = [](auto v) { return std::to_string(static_cast<int>(v)); };
-  bool mismatch  = false;
-  auto print_vec = [&](auto const& cpu, auto const& gpu, auto const name, auto converter) {
-    if (not cpu.empty()) {
-      for (auto const& v : cpu)
-        printf("%3s,", converter(v).c_str());
-      std::cout << name << "(CPU):" << std::endl;
-    }
-    if (not cpu.empty()) {
-      for (auto const& v : gpu)
-        printf("%3s,", converter(v).c_str());
-      std::cout << name << "(GPU):" << std::endl;
-    }
-    if (not cpu.empty() and not gpu.empty()) {
-      if (!std::equal(gpu.begin(), gpu.end(), cpu.begin())) {
-        for (auto i = 0lu; i < cpu.size(); i++) {
-          mismatch |= (gpu[i] != cpu[i]);
-          printf("%3s,", (gpu[i] == cpu[i] ? " " : "x"));
-        }
-        std::cout << std::endl;
-      }
-    }
-  };
-#define PRINT_VEC(vec) print_vec(cpu_tree.vec, gpu_tree.vec, #vec, to_int);
-  for (int i = 0; i < int(gpu_tree.node_categories.size()); i++)
-    printf("%3d,", i);
-  printf(" node_id\n");
-  print_vec(
-    cpu_tree.node_categories, gpu_tree.node_categories, "node_categories", to_cat);  // Works
-  PRINT_VEC(node_levels);                                                            // Works
-  // PRINT_VEC(node_range_begin);  // Works
-  // PRINT_VEC(node_range_end);    // Works
-  PRINT_VEC(parent_node_ids);  // Works
-  CUDF_EXPECTS(!mismatch, "Mismatch in GPU and CPU tree representation");
-  // std::cout << "Mismatch: " << mismatch << std::endl;
-#undef PRINT_VEC
-}
-
-tree_meta_t get_tree_representation(host_span<SymbolT const> input, rmm::cuda_stream_view stream)
-{
-  constexpr std::size_t single_item = 1;
-  hostdevice_vector<PdaTokenT> tokens_gpu{input.size(), stream};
-  hostdevice_vector<SymbolOffsetT> token_indices_gpu{input.size(), stream};
-  hostdevice_vector<SymbolOffsetT> num_tokens_out{single_item, stream};
-
-  rmm::device_uvector<SymbolT> d_input{input.size(), stream};
-  cudaMemcpyAsync(
-    d_input.data(), input.data(), input.size() * sizeof(input[0]), cudaMemcpyHostToDevice, stream);
-
-  // Parse the JSON and get the token stream
-  cudf::io::json::detail::get_token_stream(
-    cudf::device_span<SymbolT>{d_input.data(), d_input.size()},
-    tokens_gpu.device_ptr(),
-    token_indices_gpu.device_ptr(),
-    num_tokens_out.device_ptr(),
-    stream);
-
-  // Copy the JSON tokens to the host
-  token_indices_gpu.device_to_host(stream);
-  tokens_gpu.device_to_host(stream);
-  num_tokens_out.device_to_host(stream);
-
-  // Make sure tokens have been copied to the host
-  stream.synchronize();
-
-  // DEBUG print
-  [[maybe_unused]] auto to_token_str = [](PdaTokenT token) {
-    switch (token) {
-      case token_t::StructBegin: return " {";
-      case token_t::StructEnd: return " }";
-      case token_t::ListBegin: return " [";
-      case token_t::ListEnd: return " ]";
-      case token_t::FieldNameBegin: return "FB";
-      case token_t::FieldNameEnd: return "FE";
-      case token_t::StringBegin: return "SB";
-      case token_t::StringEnd: return "SE";
-      case token_t::ErrorBegin: return "er";
-      case token_t::ValueBegin: return "VB";
-      case token_t::ValueEnd: return "VE";
-      case token_t::StructMemberBegin: return " <";
-      case token_t::StructMemberEnd: return " >";
-      default: return ".";
-    }
-  };
-  std::cout << "Tokens: \n";
-  for (auto i = 0u; i < num_tokens_out[0]; i++) {
-    std::cout << to_token_str(tokens_gpu[i]) << " ";
-  }
-  std::cout << std::endl;
-
-  // Whether a token does represent a node in the tree representation
-  auto is_node = [](PdaTokenT const token) {
-    switch (token) {
-      case token_t::StructBegin:
-      case token_t::ListBegin:
-      case token_t::StringBegin:
-      case token_t::ValueBegin:
-      case token_t::FieldNameBegin:
-      case token_t::ErrorBegin: return true;
-      default: return false;
-    };
-  };
-
-  // The node that a token represents
-  auto token_to_node = [](PdaTokenT const token) {
-    switch (token) {
-      case token_t::StructBegin: return NC_STRUCT;
-      case token_t::ListBegin: return NC_LIST;
-      case token_t::StringBegin: return NC_STR;
-      case token_t::ValueBegin: return NC_VAL;
-      case token_t::FieldNameBegin: return NC_FN;
-      default: return NC_ERR;
-    };
-  };
-
-  auto get_token_index = [](PdaTokenT const token, SymbolOffsetT const token_index) {
-    constexpr SymbolOffsetT skip_quote_char = 1;
-    switch (token) {
-      case token_t::StringBegin: return token_index + skip_quote_char;
-      case token_t::FieldNameBegin: return token_index + skip_quote_char;
-      default: return token_index;
-    };
-  };
-
-  // Whether a token expects to be followed by its respective end-of-* token partner
-  auto is_begin_of_section = [](PdaTokenT const token) {
-    switch (token) {
-      case token_t::StringBegin:
-      case token_t::ValueBegin:
-      case token_t::FieldNameBegin: return true;
-      default: return false;
-    };
-  };
-
-  // The end-of-* partner token for a given beginning-of-* token
-  auto end_of_partner = [](PdaTokenT const token) {
-    switch (token) {
-      case token_t::StringBegin: return token_t::StringEnd;
-      case token_t::ValueBegin: return token_t::ValueEnd;
-      case token_t::FieldNameBegin: return token_t::FieldNameEnd;
-      default: return token_t::ErrorBegin;
-    };
-  };
-
-  // Whether the token pops from the parent node stack
-  auto does_pop = [](PdaTokenT const token) {
-    switch (token) {
-      case token_t::StructEnd:
-      case token_t::ListEnd: return true;
-      default: return false;
-    };
-  };
-
-  // Whether the token pushes onto the parent node stack
-  auto does_push = [](PdaTokenT const token) {
-    switch (token) {
-      case token_t::StructBegin:
-      case token_t::ListBegin: return true;
-      default: return false;
-    };
-  };
-
-  // The node id sitting on top of the stack becomes the node's parent
-  // The full stack represents the path from the root to the current node
-  std::stack<std::pair<NodeIndexT, bool>> parent_stack;
-
-  constexpr bool field_name_node    = true;
-  constexpr bool no_field_name_node = false;
-
-  std::vector<NodeT> node_categories;
-  std::vector<NodeIndexT> parent_node_ids;
-  std::vector<TreeDepthT> node_levels;
-  std::vector<SymbolOffsetT> node_range_begin;
-  std::vector<SymbolOffsetT> node_range_end;
-
-  std::size_t node_id = 0;
-  for (std::size_t i = 0; i < num_tokens_out[0]; i++) {
-    auto token = tokens_gpu[i];
-
-    // The section from the original JSON input that this token demarcates
-    std::size_t range_begin = get_token_index(token, token_indices_gpu[i]);
-    std::size_t range_end   = range_begin + 1;
-
-    // Identify this node's parent node id
-    std::size_t parent_node_id =
-      (parent_stack.size() > 0) ? parent_stack.top().first : parent_node_sentinel;
-
-    // If this token is the beginning-of-{value, string, field name}, also consume the next end-of-*
-    // token
-    if (is_begin_of_section(token)) {
-      if ((i + 1) < num_tokens_out[0] && end_of_partner(token) == tokens_gpu[i + 1]) {
-        // Update the range_end for this pair of tokens
-        range_end = token_indices_gpu[i + 1];
-        // We can skip the subsequent end-of-* token
-        i++;
-      }
-    }
-
-    // Emit node if this token becomes a node in the tree
-    if (is_node(token)) {
-      node_categories.push_back(token_to_node(token));
-      parent_node_ids.push_back(parent_node_id);
-      node_levels.push_back(parent_stack.size());
-      node_range_begin.push_back(range_begin);
-      node_range_end.push_back(range_end);
-    }
-
-    // Modify the stack if needed
-    if (token == token_t::FieldNameBegin) {
-      parent_stack.push({node_id, field_name_node});
-    } else {
-      if (does_push(token)) {
-        parent_stack.push({node_id, no_field_name_node});
-      } else if (does_pop(token)) {
-        CUDF_EXPECTS(parent_stack.size() >= 1, "Invalid JSON input.");
-        parent_stack.pop();
-      }
-
-      // If what we're left with is a field name on top of stack, we need to pop it
-      if (parent_stack.size() >= 1 && parent_stack.top().second == field_name_node) {
-        parent_stack.pop();
-      }
-    }
-
-    // Update node_id
-    if (is_node(token)) { node_id++; }
-  }
-
-  // GPU generation.
-  auto d_value         = get_tree_representation_gpu(d_input, stream);
-  tree_meta_t gpu_tree = {cudf::detail::make_std_vector_async(d_value.node_categories, stream),
-                          cudf::detail::make_std_vector_async(d_value.parent_node_ids, stream),
-                          cudf::detail::make_std_vector_async(d_value.node_levels, stream),
-                          cudf::detail::make_std_vector_async(d_value.node_range_begin, stream),
-                          cudf::detail::make_std_vector_async(d_value.node_range_end, stream)};
-
-  tree_meta_t cpu_tree = {std::move(node_categories),
-                          std::move(parent_node_ids),
-                          std::move(node_levels),
-                          std::move(node_range_begin),
-                          std::move(node_range_end)};
-  // DEBUG prints
-  print_tree(cpu_tree, gpu_tree);
-  for (int i = 0; i < int(cpu_tree.node_range_begin.size()); i++) {
-    printf("%3s ",
-           std::string(input.data() + cpu_tree.node_range_begin[i],
-                       cpu_tree.node_range_end[i] - cpu_tree.node_range_begin[i])
-             .c_str());
-  }
-  printf(" (JSON)\n");
-
-  return cpu_tree;
-}
-
-// The node that a token represents
-struct token_to_node {
-  __device__ auto operator()(PdaTokenT const token) -> NodeT
-  {
-    switch (token) {
-      case token_t::StructBegin: return NC_STRUCT;
-      case token_t::ListBegin: return NC_LIST;
-      case token_t::StringBegin: return NC_STR;
-      case token_t::ValueBegin: return NC_VAL;
-      case token_t::FieldNameBegin: return NC_FN;
-      default: return NC_ERR;
-    };
-  }
-};
-
-// convert token indices to node range for each vaid node.
-template <typename T1, typename T2, typename T3>
-struct node_ranges {
-  T1 tokens_gpu;
-  T2 token_indices_gpu;
-  T3 num_tokens;
-  __device__ auto operator()(size_type i) -> thrust::tuple<SymbolOffsetT, SymbolOffsetT>
-  {
-    // Whether a token expects to be followed by its respective end-of-* token partner
-    auto is_begin_of_section = [] __device__(PdaTokenT const token) {
-      switch (token) {
-        case token_t::StringBegin:
-        case token_t::ValueBegin:
-        case token_t::FieldNameBegin: return true;
-        default: return false;
-      };
-    };
-    // The end-of-* partner token for a given beginning-of-* token
-    auto end_of_partner = [] __device__(PdaTokenT const token) {
-      switch (token) {
-        case token_t::StringBegin: return token_t::StringEnd;
-        case token_t::ValueBegin: return token_t::ValueEnd;
-        case token_t::FieldNameBegin: return token_t::FieldNameEnd;
-        default: return token_t::ErrorBegin;
-      };
-    };
-    auto get_token_index = [] __device__(PdaTokenT const token, SymbolOffsetT const token_index) {
-      constexpr SymbolOffsetT skip_quote_char = 1;
-      switch (token) {
-        case token_t::StringBegin: return token_index + skip_quote_char;
-        case token_t::FieldNameBegin: return token_index + skip_quote_char;
-        default: return token_index;
-      };
-    };
-    PdaTokenT const token = tokens_gpu[i];
-    // The section from the original JSON input that this token demarcates
-    SymbolOffsetT range_begin = get_token_index(token, token_indices_gpu[i]);
-    SymbolOffsetT range_end   = range_begin + 1;
-    if (is_begin_of_section(token)) {
-      if ((i + 1) < num_tokens && end_of_partner(token) == tokens_gpu[i + 1]) {
-        // Update the range_end for this pair of tokens
-        range_end = token_indices_gpu[i + 1];
-      }
-    }
-    return thrust::make_tuple(range_begin, range_end);
-  }
-};
-
-// GPU version of get_tree_representation
-tree_meta_t2 get_tree_representation_gpu(device_span<SymbolT const> d_input,
-                                         rmm::cuda_stream_view stream)
-{
-  constexpr std::size_t single_item = 1;
-  rmm::device_uvector<PdaTokenT> tokens_gpu{d_input.size(), stream};
-  rmm::device_uvector<SymbolOffsetT> token_indices_gpu{d_input.size(), stream};
-  hostdevice_vector<SymbolOffsetT> num_tokens_out{single_item, stream};
-
-  // Parse the JSON and get the token stream
-  cudf::io::json::detail::get_token_stream(
-    d_input, tokens_gpu.data(), token_indices_gpu.data(), num_tokens_out.device_ptr(), stream);
-
-  // Copy the JSON token count to the host
-  num_tokens_out.device_to_host(stream);
-
-  // Make sure tokens have been copied to the host
-  stream.synchronize();
-
-  // Whether a token does represent a node in the tree representation
-  auto is_node = [] __device__(PdaTokenT const token) -> size_type {
-    switch (token) {
-      case token_t::StructBegin:
-      case token_t::ListBegin:
-      case token_t::StringBegin:
-      case token_t::ValueBegin:
-      case token_t::FieldNameBegin:
-      case token_t::ErrorBegin: return 1;
-      default: return 0;
-    };
-  };
-
-  // Whether the token pops from the parent node stack
-  auto does_pop = [] __device__(PdaTokenT const token) {
-    switch (token) {
-      case token_t::StructMemberEnd:
-      case token_t::StructEnd:
-      case token_t::ListEnd: return true;
-      default: return false;
-    };
-  };
-
-  // Whether the token pushes onto the parent node stack
-  auto does_push = [] __device__(PdaTokenT const token) {
-    switch (token) {
-      // case token_t::StructMemberBegin: //TODO: Either use FieldNameBegin here or change the
-      // token_to_node function
-      case token_t::FieldNameBegin:
-      case token_t::StructBegin:
-      case token_t::ListBegin: return true;
-      default: return false;
-    };
-  };
-
-  auto num_tokens = num_tokens_out[0];
-  auto is_node_it = thrust::make_transform_iterator(tokens_gpu.begin(), is_node);
-  auto num_nodes  = thrust::reduce(rmm::exec_policy(stream), is_node_it, is_node_it + num_tokens);
-
-  // Node categories: copy_if with transform.
-  rmm::device_uvector<NodeT> node_categories(num_nodes, stream);
-  auto node_categories_it =
-    thrust::make_transform_output_iterator(node_categories.begin(), token_to_node{});
-  auto node_categories_end = thrust::copy_if(rmm::exec_policy(stream),
-                                             tokens_gpu.begin(),
-                                             tokens_gpu.begin() + num_tokens,
-                                             node_categories_it,
-                                             is_node);
-  CUDF_EXPECTS(node_categories_end - node_categories_it == num_nodes,
-               "node category count mismatch");
-
-  // Node levels: transform_exclusive_scan, copy_if.
-  rmm::device_uvector<size_type> token_levels(num_tokens, stream);
-  auto push_pop_it = thrust::make_transform_iterator(
-    tokens_gpu.begin(), [does_push, does_pop] __device__(PdaTokenT const token) -> size_type {
-      return does_push(token) ? 1 : (does_pop(token) ? -1 : 0);
-    });
-  thrust::exclusive_scan(
-    rmm::exec_policy(stream), push_pop_it, push_pop_it + num_tokens, token_levels.begin());
-
-  rmm::device_uvector<TreeDepthT> node_levels(num_nodes, stream);
-  auto node_levels_end = thrust::copy_if(rmm::exec_policy(stream),
-                                         token_levels.begin(),
-                                         token_levels.begin() + num_tokens,
-                                         tokens_gpu.begin(),
-                                         node_levels.begin(),
-                                         is_node);
-  CUDF_EXPECTS(node_levels_end - node_levels.begin() == num_nodes, "node level count mismatch");
-
-  // Node ranges: copy_if with transform.
-  rmm::device_uvector<SymbolOffsetT> node_range_begin(num_nodes, stream);
-  rmm::device_uvector<SymbolOffsetT> node_range_end(num_nodes, stream);
-  auto node_range_tuple_it =
-    thrust::make_zip_iterator(node_range_begin.begin(), node_range_end.begin());
-  using node_ranges_t    = node_ranges<decltype(tokens_gpu.begin()),
-                                    decltype(token_indices_gpu.begin()),
-                                    decltype(num_tokens)>;
-  auto node_range_out_it = thrust::make_transform_output_iterator(
-    node_range_tuple_it, node_ranges_t{tokens_gpu.begin(), token_indices_gpu.begin(), num_tokens});
-
-  auto node_range_out_end =
-    thrust::copy_if(rmm::exec_policy(stream),
-                    thrust::make_counting_iterator<size_type>(0),
-                    thrust::make_counting_iterator<size_type>(0) + num_tokens,
-                    node_range_out_it,
-                    [is_node, tokens_gpu = tokens_gpu.begin()] __device__(size_type i) -> bool {
-                      PdaTokenT const token = tokens_gpu[i];
-                      return is_node(token);
-                    });
-  CUDF_EXPECTS(node_range_out_end - node_range_out_it == num_nodes, "node range count mismatch");
-
-  // Node parent ids: previous push token_id transform, stable sort, segmented scan with Max,
-  // copy_if. This one is sort of logical stack. But more generalized. TODO: make it own function.
-  rmm::device_uvector<size_type> parent_token_ids(num_tokens, stream);  // XXX: fill with 0?
-  rmm::device_uvector<size_type> initial_order(num_tokens, stream);
-  thrust::sequence(rmm::exec_policy(stream), initial_order.begin(), initial_order.end());
-  thrust::tabulate(rmm::exec_policy(stream),
-                   parent_token_ids.begin(),
-                   parent_token_ids.end(),
-                   [does_push, tokens_gpu = tokens_gpu.begin()] __device__(auto i) -> size_type {
-                     if (i == 0)
-                       return -1;
-                     else
-                       return does_push(tokens_gpu[i - 1]) ? i - 1 : -1;  // XXX: -1 or 0?
-                   });
-  auto out_pid = thrust::make_zip_iterator(parent_token_ids.data(), initial_order.data());
-  // TODO: use radix sort.
-  thrust::stable_sort_by_key(rmm::exec_policy(stream),
-                             token_levels.data(),
-                             token_levels.data() + token_levels.size(),
-                             out_pid);
-  // SegmentedScan Max.
-  thrust::inclusive_scan_by_key(rmm::exec_policy(stream),
-                                token_levels.data(),
-                                token_levels.data() + token_levels.size(),
-                                parent_token_ids.data(),
-                                parent_token_ids.data(),  // size_type{-1},
-                                thrust::equal_to<size_type>{},
-                                thrust::maximum<size_type>{});
-  // TODO: Avoid sorting again by  gather_if on a transform iterator. or scatter.
-  thrust::sort_by_key(rmm::exec_policy(stream),
-                      initial_order.data(),
-                      initial_order.data() + initial_order.size(),
-                      parent_token_ids.data());
-
-  rmm::device_uvector<size_type> node_ids_gpu(num_tokens, stream);
-  thrust::exclusive_scan(
-    rmm::exec_policy(stream), is_node_it, is_node_it + num_tokens, node_ids_gpu.begin());
-  rmm::device_uvector<NodeIndexT> parent_node_ids(num_nodes, stream);
-  auto parent_node_ids_it = thrust::make_transform_iterator(
-    parent_token_ids.begin(),
-    [node_ids_gpu = node_ids_gpu.begin()] __device__(size_type const pid) -> NodeIndexT {
-      return pid < 0 ? pid : node_ids_gpu[pid];
-    });
-  auto parent_node_ids_end = thrust::copy_if(rmm::exec_policy(stream),
-                                             parent_node_ids_it,
-                                             parent_node_ids_it + parent_token_ids.size(),
-                                             tokens_gpu.begin(),
-                                             parent_node_ids.begin(),
-                                             is_node);
-  CUDF_EXPECTS(parent_node_ids_end - parent_node_ids.begin() == num_nodes,
-               "parent node id gather mismatch");
-  return {std::move(node_categories),
-          std::move(parent_node_ids),
-          std::move(node_levels),
-          std::move(node_range_begin),
-          std::move(node_range_end)};
-}
-
 /**
  * @brief Parses the given JSON string and generates a tree representation of the given input.
  *
diff --git a/cpp/tests/CMakeLists.txt b/cpp/tests/CMakeLists.txt
index 8aba2a11d10..6270eebf649 100644
--- a/cpp/tests/CMakeLists.txt
+++ b/cpp/tests/CMakeLists.txt
@@ -222,7 +222,7 @@ ConfigureTest(FILE_IO_TEST io/file_io_test.cpp)
 ConfigureTest(ORC_TEST io/orc_test.cpp)
 ConfigureTest(PARQUET_TEST io/parquet_test.cpp)
 ConfigureTest(JSON_TEST io/json_test.cpp)
-ConfigureTest(NESTED_JSON_TEST io/nested_json_test.cpp)
+ConfigureTest(NESTED_JSON_TEST io/nested_json_test.cpp io/json_tree.cpp)
 ConfigureTest(ARROW_IO_SOURCE_TEST io/arrow_io_source_test.cpp)
 ConfigureTest(MULTIBYTE_SPLIT_TEST io/text/multibyte_split_test.cpp)
 ConfigureTest(LOGICAL_STACK_TEST io/fst/logical_stack_test.cu)
diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
new file mode 100644
index 00000000000..64f710fc922
--- /dev/null
+++ b/cpp/tests/io/json_tree.cpp
@@ -0,0 +1,517 @@
+/*
+ * Copyright (c) 2022, NVIDIA CORPORATION.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <io/json/nested_json.hpp>
+#include <io/utilities/hostdevice_vector.hpp>
+
+#include <cudf/scalar/scalar.hpp>
+#include <cudf/utilities/default_stream.hpp>
+#include <cudf/utilities/span.hpp>
+
+#include <cudf_test/base_fixture.hpp>
+#include <cudf_test/column_utilities.hpp>
+#include <cudf_test/cudf_gtest.hpp>
+
+#include <rmm/cuda_stream.hpp>
+#include <rmm/cuda_stream_view.hpp>
+
+#include <string>
+
+namespace cuio_json = cudf::io::json;
+
+namespace cudf::io::json {
+// Host copy of tree_meta_t
+struct tree_meta_t2 {
+  std::vector<NodeT> node_categories;
+  std::vector<NodeIndexT> parent_node_ids;
+  std::vector<TreeDepthT> node_levels;
+  std::vector<SymbolOffsetT> node_range_begin;
+  std::vector<SymbolOffsetT> node_range_end;
+};
+}  // namespace cudf::io::json
+
+namespace {
+std::string get_node_string(std::size_t const node_id,
+                            cuio_json::tree_meta_t2 const& tree_rep,
+                            std::string const& json_input)
+{
+  auto node_to_str = [] __host__ __device__(cuio_json::PdaTokenT const token) {
+    switch (token) {
+      case cuio_json::NC_STRUCT: return "STRUCT";
+      case cuio_json::NC_LIST: return "LIST";
+      case cuio_json::NC_FN: return "FN";
+      case cuio_json::NC_STR: return "STR";
+      case cuio_json::NC_VAL: return "VAL";
+      case cuio_json::NC_ERR: return "ERR";
+      default: return "N/A";
+    };
+  };
+
+  return "<" + std::to_string(node_id) + ":" + node_to_str(tree_rep.node_categories[node_id]) +
+         ":[" + std::to_string(tree_rep.node_range_begin[node_id]) + ", " +
+         std::to_string(tree_rep.node_range_end[node_id]) + ") '" +
+         json_input.substr(tree_rep.node_range_begin[node_id],
+                           tree_rep.node_range_end[node_id] - tree_rep.node_range_begin[node_id]) +
+         "'>";
+}
+
+void print_tree_representation(std::string const& json_input,
+                               cuio_json::tree_meta_t2 const& tree_rep)
+{
+  for (std::size_t i = 0; i < tree_rep.node_categories.size(); i++) {
+    std::size_t parent_id = tree_rep.parent_node_ids[i];
+    std::stack<std::size_t> path;
+    path.push(i);
+    while (parent_id != cuio_json::parent_node_sentinel) {
+      path.push(parent_id);
+      parent_id = tree_rep.parent_node_ids[parent_id];
+    }
+
+    while (path.size()) {
+      auto const node_id = path.top();
+      std::cout << get_node_string(node_id, tree_rep, json_input)
+                << (path.size() > 1 ? " -> " : "");
+      path.pop();
+    }
+    std::cout << "\n";
+  }
+}
+}  // namespace
+
+// cudf::io::json::
+namespace cudf::io::json {
+namespace test {
+
+tree_meta_t2 to_cpu_tree(tree_meta_t const& d_value, rmm::cuda_stream_view stream)
+{
+  return {cudf::detail::make_std_vector_async(d_value.node_categories, stream),
+          cudf::detail::make_std_vector_async(d_value.parent_node_ids, stream),
+          cudf::detail::make_std_vector_async(d_value.node_levels, stream),
+          cudf::detail::make_std_vector_async(d_value.node_range_begin, stream),
+          cudf::detail::make_std_vector_async(d_value.node_range_end, stream)};
+}
+
+void compare_trees(tree_meta_t2 const& cpu_tree, tree_meta_t const& d_gpu_tree)
+{
+  auto gpu_tree = to_cpu_tree(d_gpu_tree, cudf::default_stream_value);
+  // DEBUG prints
+  auto to_cat = [](auto v) -> std::string {
+    switch (v) {
+      case NC_STRUCT: return " S";
+      case NC_LIST: return " L";
+      case NC_STR: return " \"";
+      case NC_VAL: return " V";
+      case NC_FN: return " F";
+      case NC_ERR: return "ER";
+      default: return "UN";
+    };
+  };
+  auto to_int    = [](auto v) { return std::to_string(static_cast<int>(v)); };
+  bool mismatch  = false;
+  auto print_vec = [&](auto const& cpu, auto const& gpu, auto const name, auto converter) {
+    if (not cpu.empty()) {
+      for (auto const& v : cpu)
+        printf("%3s,", converter(v).c_str());
+      std::cout << name << "(CPU):" << std::endl;
+    }
+    if (not cpu.empty()) {
+      for (auto const& v : gpu)
+        printf("%3s,", converter(v).c_str());
+      std::cout << name << "(GPU):" << std::endl;
+    }
+    if (not cpu.empty() and not gpu.empty()) {
+      if (!std::equal(gpu.begin(), gpu.end(), cpu.begin())) {
+        for (auto i = 0lu; i < cpu.size(); i++) {
+          mismatch |= (gpu[i] != cpu[i]);
+          printf("%3s,", (gpu[i] == cpu[i] ? " " : "x"));
+        }
+        std::cout << std::endl;
+      }
+    }
+  };
+#define PRINT_VEC(vec) print_vec(cpu_tree.vec, gpu_tree.vec, #vec, to_int);
+  for (int i = 0; i < int(gpu_tree.node_categories.size()); i++)
+    printf("%3d,", i);
+  printf(" node_id\n");
+  print_vec(
+    cpu_tree.node_categories, gpu_tree.node_categories, "node_categories", to_cat);  // Works
+  PRINT_VEC(node_levels);                                                            // Works
+  // PRINT_VEC(node_range_begin);  // Works
+  // PRINT_VEC(node_range_end);    // Works
+  PRINT_VEC(parent_node_ids);  // Works
+  CUDF_EXPECTS(!mismatch, "Mismatch in GPU and CPU tree representation");
+  // std::cout << "Mismatch: " << mismatch << std::endl;
+#undef PRINT_VEC
+}
+
+tree_meta_t2 get_tree_representation_cpu(host_span<SymbolT const> input,
+                                         rmm::cuda_stream_view stream)
+{
+  constexpr std::size_t single_item = 1;
+  hostdevice_vector<PdaTokenT> tokens_gpu{input.size(), stream};
+  hostdevice_vector<SymbolOffsetT> token_indices_gpu{input.size(), stream};
+  hostdevice_vector<SymbolOffsetT> num_tokens_out{single_item, stream};
+
+  rmm::device_uvector<SymbolT> d_input{input.size(), stream};
+  cudaMemcpyAsync(
+    d_input.data(), input.data(), input.size() * sizeof(input[0]), cudaMemcpyHostToDevice, stream);
+
+  // Parse the JSON and get the token stream
+  cudf::io::json::detail::get_token_stream(
+    cudf::device_span<SymbolT>{d_input.data(), d_input.size()},
+    tokens_gpu.device_ptr(),
+    token_indices_gpu.device_ptr(),
+    num_tokens_out.device_ptr(),
+    stream);
+
+  // Copy the JSON tokens to the host
+  token_indices_gpu.device_to_host(stream);
+  tokens_gpu.device_to_host(stream);
+  num_tokens_out.device_to_host(stream);
+
+  // Make sure tokens have been copied to the host
+  stream.synchronize();
+
+  // DEBUG print
+  [[maybe_unused]] auto to_token_str = [](PdaTokenT token) {
+    switch (token) {
+      case token_t::StructBegin: return " {";
+      case token_t::StructEnd: return " }";
+      case token_t::ListBegin: return " [";
+      case token_t::ListEnd: return " ]";
+      case token_t::FieldNameBegin: return "FB";
+      case token_t::FieldNameEnd: return "FE";
+      case token_t::StringBegin: return "SB";
+      case token_t::StringEnd: return "SE";
+      case token_t::ErrorBegin: return "er";
+      case token_t::ValueBegin: return "VB";
+      case token_t::ValueEnd: return "VE";
+      case token_t::StructMemberBegin: return " <";
+      case token_t::StructMemberEnd: return " >";
+      default: return ".";
+    }
+  };
+  std::cout << "Tokens: \n";
+  for (auto i = 0u; i < num_tokens_out[0]; i++) {
+    std::cout << to_token_str(tokens_gpu[i]) << " ";
+  }
+  std::cout << std::endl;
+
+  // Whether a token does represent a node in the tree representation
+  auto is_node = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructBegin:
+      case token_t::ListBegin:
+      case token_t::StringBegin:
+      case token_t::ValueBegin:
+      case token_t::FieldNameBegin:
+      case token_t::ErrorBegin: return true;
+      default: return false;
+    };
+  };
+
+  // The node that a token represents
+  auto token_to_node = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructBegin: return NC_STRUCT;
+      case token_t::ListBegin: return NC_LIST;
+      case token_t::StringBegin: return NC_STR;
+      case token_t::ValueBegin: return NC_VAL;
+      case token_t::FieldNameBegin: return NC_FN;
+      default: return NC_ERR;
+    };
+  };
+
+  auto get_token_index = [](PdaTokenT const token, SymbolOffsetT const token_index) {
+    constexpr SymbolOffsetT skip_quote_char = 1;
+    switch (token) {
+      case token_t::StringBegin: return token_index + skip_quote_char;
+      case token_t::FieldNameBegin: return token_index + skip_quote_char;
+      default: return token_index;
+    };
+  };
+
+  // Whether a token expects to be followed by its respective end-of-* token partner
+  auto is_begin_of_section = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StringBegin:
+      case token_t::ValueBegin:
+      case token_t::FieldNameBegin: return true;
+      default: return false;
+    };
+  };
+
+  // The end-of-* partner token for a given beginning-of-* token
+  auto end_of_partner = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StringBegin: return token_t::StringEnd;
+      case token_t::ValueBegin: return token_t::ValueEnd;
+      case token_t::FieldNameBegin: return token_t::FieldNameEnd;
+      default: return token_t::ErrorBegin;
+    };
+  };
+
+  // Whether the token pops from the parent node stack
+  auto does_pop = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructEnd:
+      case token_t::ListEnd: return true;
+      default: return false;
+    };
+  };
+
+  // Whether the token pushes onto the parent node stack
+  auto does_push = [](PdaTokenT const token) {
+    switch (token) {
+      case token_t::StructBegin:
+      case token_t::ListBegin: return true;
+      default: return false;
+    };
+  };
+
+  // The node id sitting on top of the stack becomes the node's parent
+  // The full stack represents the path from the root to the current node
+  std::stack<std::pair<NodeIndexT, bool>> parent_stack;
+
+  constexpr bool field_name_node    = true;
+  constexpr bool no_field_name_node = false;
+
+  std::vector<NodeT> node_categories;
+  std::vector<NodeIndexT> parent_node_ids;
+  std::vector<TreeDepthT> node_levels;
+  std::vector<SymbolOffsetT> node_range_begin;
+  std::vector<SymbolOffsetT> node_range_end;
+
+  std::size_t node_id = 0;
+  for (std::size_t i = 0; i < num_tokens_out[0]; i++) {
+    auto token = tokens_gpu[i];
+
+    // The section from the original JSON input that this token demarcates
+    std::size_t range_begin = get_token_index(token, token_indices_gpu[i]);
+    std::size_t range_end   = range_begin + 1;
+
+    // Identify this node's parent node id
+    std::size_t parent_node_id =
+      (parent_stack.size() > 0) ? parent_stack.top().first : parent_node_sentinel;
+
+    // If this token is the beginning-of-{value, string, field name}, also consume the next end-of-*
+    // token
+    if (is_begin_of_section(token)) {
+      if ((i + 1) < num_tokens_out[0] && end_of_partner(token) == tokens_gpu[i + 1]) {
+        // Update the range_end for this pair of tokens
+        range_end = token_indices_gpu[i + 1];
+        // We can skip the subsequent end-of-* token
+        i++;
+      }
+    }
+
+    // Emit node if this token becomes a node in the tree
+    if (is_node(token)) {
+      node_categories.push_back(token_to_node(token));
+      parent_node_ids.push_back(parent_node_id);
+      node_levels.push_back(parent_stack.size());
+      node_range_begin.push_back(range_begin);
+      node_range_end.push_back(range_end);
+    }
+
+    // Modify the stack if needed
+    if (token == token_t::FieldNameBegin) {
+      parent_stack.push({node_id, field_name_node});
+    } else {
+      if (does_push(token)) {
+        parent_stack.push({node_id, no_field_name_node});
+      } else if (does_pop(token)) {
+        CUDF_EXPECTS(parent_stack.size() >= 1, "Invalid JSON input.");
+        parent_stack.pop();
+      }
+
+      // If what we're left with is a field name on top of stack, we need to pop it
+      if (parent_stack.size() >= 1 && parent_stack.top().second == field_name_node) {
+        parent_stack.pop();
+      }
+    }
+
+    // Update node_id
+    if (is_node(token)) { node_id++; }
+  }
+
+  return {std::move(node_categories),
+          std::move(parent_node_ids),
+          std::move(node_levels),
+          std::move(node_range_begin),
+          std::move(node_range_end)};
+}
+
+}  // namespace test
+}  // namespace cudf::io::json
+
+// Base test fixture for tests
+struct JsonTest : public cudf::test::BaseFixture {
+};
+
+TEST_F(JsonTest, TreeRepresentation)
+{
+  auto stream = cudf::default_stream_value;
+
+  // Test input
+  std::string input = R"(  [{)"
+                      R"("category": "reference",)"
+                      R"("index:": [4,12,42],)"
+                      R"("author": "Nigel Rees",)"
+                      R"("title": "[Sayings of the Century]",)"
+                      R"("price": 8.95)"
+                      R"(},  )"
+                      R"({)"
+                      R"("category": "reference",)"
+                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
+                      R"("author": "Nigel Rees",)"
+                      R"("title": "{}[], <=semantic-symbols-string",)"
+                      R"("price": 8.95)"
+                      R"(}] )";
+  cudf::string_scalar d_input(input, true, stream);
+
+  // Get the JSON's tree representation
+  auto gpu_tree = cuio_json::detail::get_tree_representation(
+    {d_input.data(), static_cast<size_t>(d_input.size())}, stream);
+  // host tree generation
+  auto tree_rep = cuio_json::test::get_tree_representation_cpu(input, stream);
+  cudf::io::json::test::compare_trees(tree_rep, gpu_tree);
+
+  // Print tree representation
+  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }
+
+  // Golden sample of node categories
+  std::vector<cuio_json::node_t> golden_node_categories = {
+    cuio_json::NC_LIST, cuio_json::NC_STRUCT, cuio_json::NC_FN,     cuio_json::NC_STR,
+    cuio_json::NC_FN,   cuio_json::NC_LIST,   cuio_json::NC_VAL,    cuio_json::NC_VAL,
+    cuio_json::NC_VAL,  cuio_json::NC_FN,     cuio_json::NC_STR,    cuio_json::NC_FN,
+    cuio_json::NC_STR,  cuio_json::NC_FN,     cuio_json::NC_VAL,    cuio_json::NC_STRUCT,
+    cuio_json::NC_FN,   cuio_json::NC_STR,    cuio_json::NC_FN,     cuio_json::NC_LIST,
+    cuio_json::NC_VAL,  cuio_json::NC_STRUCT, cuio_json::NC_VAL,    cuio_json::NC_STRUCT,
+    cuio_json::NC_FN,   cuio_json::NC_LIST,   cuio_json::NC_STRUCT, cuio_json::NC_STRUCT,
+    cuio_json::NC_FN,   cuio_json::NC_STR,    cuio_json::NC_FN,     cuio_json::NC_STR,
+    cuio_json::NC_FN,   cuio_json::NC_VAL};
+
+  // Golden sample of node ids
+  // clang-format off
+  std::vector<cuio_json::NodeIndexT> golden_parent_node_ids = {
+    cuio_json::parent_node_sentinel, 0, 1, 2,
+    1, 4, 5, 5,
+    5, 1, 9, 1,
+    11, 1, 13, 0,
+    15, 16, 15, 18,
+    19, 19, 19, 19,
+    23, 24, 25, 25,
+    15, 28, 15, 30,
+    15, 32};
+  // clang-format on
+
+  // Golden sample of node levels
+  std::vector<cuio_json::TreeDepthT> golden_node_levels = {0, 1, 2, 3, 2, 3, 4, 4, 4, 2, 3, 2,
+                                                           3, 2, 3, 1, 2, 3, 2, 3, 4, 4, 4, 4,
+                                                           5, 6, 7, 7, 2, 3, 2, 3, 2, 3};
+
+  // Golden sample of the character-ranges from the original input that each node demarcates
+  std::vector<std::size_t> golden_node_range_begin = {
+    2,   3,   5,   17,  29,  38,  39,  41,  44,  49,  59,  72,  81,  108, 116, 124, 126,
+    138, 150, 158, 159, 161, 164, 169, 171, 174, 175, 180, 189, 199, 212, 221, 255, 263};
+
+  // Golden sample of the character-ranges from the original input that each node demarcates
+  std::vector<std::size_t> golden_node_range_end = {
+    3,   4,   13,  26,  35,  39,  40,  43,  46,  55,  69,  77,  105, 113, 120, 125, 134,
+    147, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 209, 217, 252, 260, 267};
+
+  // Check results against golden samples
+  ASSERT_EQ(golden_node_categories.size(), tree_rep.node_categories.size());
+  ASSERT_EQ(golden_parent_node_ids.size(), tree_rep.parent_node_ids.size());
+  ASSERT_EQ(golden_node_levels.size(), tree_rep.node_levels.size());
+  ASSERT_EQ(golden_node_range_begin.size(), tree_rep.node_range_begin.size());
+  ASSERT_EQ(golden_node_range_end.size(), tree_rep.node_range_end.size());
+
+  for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
+    ASSERT_EQ(golden_node_categories[i], tree_rep.node_categories[i]);
+    ASSERT_EQ(golden_parent_node_ids[i], tree_rep.parent_node_ids[i]);
+    ASSERT_EQ(golden_node_levels[i], tree_rep.node_levels[i]);
+    ASSERT_EQ(golden_node_range_begin[i], tree_rep.node_range_begin[i]);
+    ASSERT_EQ(golden_node_range_end[i], tree_rep.node_range_end[i]);
+  }
+}
+
+TEST_F(JsonTest, TreeRepresentation2)
+{
+  auto stream = cudf::default_stream_value;
+  // Test input: value end with comma, space, close-brace ", }"
+  std::string input =
+    //  0         1         2         3         4         5         6         7         8         9
+    //  0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
+    R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9}, "b" : {"x": 10 , "z": 11}}])";
+  cudf::string_scalar d_input(input, true, stream);
+
+  // Get the JSON's tree representation
+  auto gpu_tree = cuio_json::detail::get_tree_representation(
+    {d_input.data(), static_cast<size_t>(d_input.size())}, stream);
+  // host tree generation
+  auto tree_rep = cuio_json::test::get_tree_representation_cpu(input, stream);
+  cudf::io::json::test::compare_trees(tree_rep, gpu_tree);
+
+  // Print tree representation
+  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }
+  // TODO compare with CPU version
+
+  // Golden sample of node categories
+  // clang-format off
+  std::vector<cuio_json::node_t> golden_node_categories = {
+    cuio_json::NC_LIST, cuio_json::NC_STRUCT,
+    cuio_json::NC_STRUCT, cuio_json::NC_FN, cuio_json::NC_STRUCT,  cuio_json::NC_FN,  cuio_json::NC_VAL, cuio_json::NC_FN,  cuio_json::NC_LIST,
+    cuio_json::NC_STRUCT, cuio_json::NC_FN, cuio_json::NC_STRUCT,  cuio_json::NC_FN,  cuio_json::NC_VAL, cuio_json::NC_FN,  cuio_json::NC_VAL,
+                          cuio_json::NC_FN, cuio_json::NC_STRUCT,  cuio_json::NC_FN,  cuio_json::NC_VAL, cuio_json::NC_FN,  cuio_json::NC_VAL};
+
+  // Golden sample of node ids
+  std::vector<cuio_json::NodeIndexT> golden_parent_node_ids = {
+    cuio_json::parent_node_sentinel, 0,
+    0, 2,  3,  4,  5,  4, 7,
+    0, 9, 10, 11, 12, 11, 14,
+      9, 16, 17, 18, 17, 20};
+  // clang-format on
+
+  // Golden sample of node levels
+  std::vector<cuio_json::TreeDepthT> golden_node_levels = {
+    0, 1, 1, 2, 3, 4, 5, 4, 5, 1, 2, 3, 4, 5, 4, 5, 2, 3, 4, 5, 4, 5,
+  };
+
+  // Golden sample of the character-ranges from the original input that each node demarcates
+  std::vector<std::size_t> golden_node_range_begin = {0,  2,  6,  9,  13, 16, 21, 25, 29, 36, 39,
+                                                      44, 47, 52, 56, 60, 65, 70, 72, 76, 82, 86};
+
+  // Golden sample of the character-ranges from the original input that each node demarcates
+  std::vector<std::size_t> golden_node_range_end = {1,  3,  7,  10, 14, 17, 22, 26, 30, 37, 40,
+                                                    45, 48, 53, 57, 61, 66, 71, 73, 78, 83, 88};
+
+  // Check results against golden samples
+  ASSERT_EQ(golden_node_categories.size(), tree_rep.node_categories.size());
+  ASSERT_EQ(golden_parent_node_ids.size(), tree_rep.parent_node_ids.size());
+  ASSERT_EQ(golden_node_levels.size(), tree_rep.node_levels.size());
+  ASSERT_EQ(golden_node_range_begin.size(), tree_rep.node_range_begin.size());
+  ASSERT_EQ(golden_node_range_end.size(), tree_rep.node_range_end.size());
+
+  for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
+    ASSERT_EQ(golden_node_categories[i], tree_rep.node_categories[i]);
+    ASSERT_EQ(golden_parent_node_ids[i], tree_rep.parent_node_ids[i]);
+    ASSERT_EQ(golden_node_levels[i], tree_rep.node_levels[i]);
+    ASSERT_EQ(golden_node_range_begin[i], tree_rep.node_range_begin[i]);
+    ASSERT_EQ(golden_node_range_end[i], tree_rep.node_range_end[i]);
+  }
+}
diff --git a/cpp/tests/io/nested_json_test.cpp b/cpp/tests/io/nested_json_test.cpp
index 7a85b7dbf2b..c8a55f46069 100644
--- a/cpp/tests/io/nested_json_test.cpp
+++ b/cpp/tests/io/nested_json_test.cpp
@@ -35,144 +35,6 @@
 
 namespace cuio_json = cudf::io::json;
 
-namespace {
-
-std::string get_node_string(std::size_t const node_id,
-                            cuio_json::tree_meta_t const& tree_rep,
-                            std::string const& json_input)
-{
-  auto node_to_str = [] __host__ __device__(cuio_json::PdaTokenT const token) {
-    switch (token) {
-      case cuio_json::NC_STRUCT: return "STRUCT";
-      case cuio_json::NC_LIST: return "LIST";
-      case cuio_json::NC_FN: return "FN";
-      case cuio_json::NC_STR: return "STR";
-      case cuio_json::NC_VAL: return "VAL";
-      case cuio_json::NC_ERR: return "ERR";
-      default: return "N/A";
-    };
-  };
-
-  return "<" + std::to_string(node_id) + ":" + node_to_str(tree_rep.node_categories[node_id]) +
-         ":[" + std::to_string(tree_rep.node_range_begin[node_id]) + ", " +
-         std::to_string(tree_rep.node_range_end[node_id]) + ") '" +
-         json_input.substr(tree_rep.node_range_begin[node_id],
-                           tree_rep.node_range_end[node_id] - tree_rep.node_range_begin[node_id]) +
-         "'>";
-}
-
-void print_tree_representation(std::string const& json_input,
-                               cuio_json::tree_meta_t const& tree_rep)
-{
-  for (std::size_t i = 0; i < tree_rep.node_categories.size(); i++) {
-    std::size_t parent_id = tree_rep.parent_node_ids[i];
-    std::stack<std::size_t> path;
-    path.push(i);
-    while (parent_id != cuio_json::parent_node_sentinel) {
-      path.push(parent_id);
-      parent_id = tree_rep.parent_node_ids[parent_id];
-    }
-
-    while (path.size()) {
-      auto const node_id = path.top();
-      std::cout << get_node_string(node_id, tree_rep, json_input)
-                << (path.size() > 1 ? " -> " : "");
-      path.pop();
-    }
-    std::cout << "\n";
-  }
-}
-// Forward declaration
-void print_column(std::string const& input,
-                  cuio_json::json_column const& column,
-                  uint32_t indent = 0);
-
-/**
- * @brief Helper to generate indentation
- */
-std::string pad(uint32_t indent = 0)
-{
-  std::string pad{};
-  if (indent > 0) pad.insert(pad.begin(), indent, ' ');
-  return pad;
-}
-
-/**
- * @brief Prints a string column.
- */
-void print_json_string_col(std::string const& input,
-                           cuio_json::json_column const& column,
-                           uint32_t indent = 0)
-{
-  for (std::size_t i = 0; i < column.string_offsets.size(); i++) {
-    std::cout << pad(indent) << i << ": [" << (column.validity[i] ? "1" : "0") << "] '"
-              << input.substr(column.string_offsets[i], column.string_lengths[i]) << "'\n";
-  }
-}
-
-/**
- * @brief Prints a list column.
- */
-void print_json_list_col(std::string const& input,
-                         cuio_json::json_column const& column,
-                         uint32_t indent = 0)
-{
-  std::cout << pad(indent) << " [LIST]\n";
-  std::cout << pad(indent) << " -> num. child-columns: " << column.child_columns.size() << "\n";
-  std::cout << pad(indent) << " -> num. rows: " << column.current_offset << "\n";
-  std::cout << pad(indent) << " -> num. valid: " << column.valid_count << "\n";
-  std::cout << pad(indent) << " offsets[]: "
-            << "\n";
-  for (std::size_t i = 0; i < column.child_offsets.size() - 1; i++) {
-    std::cout << pad(indent + 2) << i << ": [" << (column.validity[i] ? "1" : "0") << "] ["
-              << column.child_offsets[i] << ", " << column.child_offsets[i + 1] << ")\n";
-  }
-  if (column.child_columns.size() > 0) {
-    std::cout << pad(indent) << column.child_columns.begin()->first << "[]: "
-              << "\n";
-    print_column(input, column.child_columns.begin()->second, indent + 2);
-  }
-}
-
-/**
- * @brief Prints a struct column.
- */
-void print_json_struct_col(std::string const& input,
-                           cuio_json::json_column const& column,
-                           uint32_t indent = 0)
-{
-  std::cout << pad(indent) << " [STRUCT]\n";
-  std::cout << pad(indent) << " -> num. child-columns: " << column.child_columns.size() << "\n";
-  std::cout << pad(indent) << " -> num. rows: " << column.current_offset << "\n";
-  std::cout << pad(indent) << " -> num. valid: " << column.valid_count << "\n";
-  std::cout << pad(indent) << " -> validity[]: "
-            << "\n";
-  for (decltype(column.current_offset) i = 0; i < column.current_offset; i++) {
-    std::cout << pad(indent + 2) << i << ": [" << (column.validity[i] ? "1" : "0") << "]\n";
-  }
-  auto it = std::begin(column.child_columns);
-  for (std::size_t i = 0; i < column.child_columns.size(); i++) {
-    std::cout << pad(indent + 2) << "child #" << i << " '" << it->first << "'[] \n";
-    print_column(input, it->second, indent + 2);
-    it++;
-  }
-}
-
-/**
- * @brief Prints the column's data and recurses through and prints all the child columns.
- */
-void print_column(std::string const& input, cuio_json::json_column const& column, uint32_t indent)
-{
-  switch (column.type) {
-    case cuio_json::json_col_t::StringColumn: print_json_string_col(input, column, indent); break;
-    case cuio_json::json_col_t::ListColumn: print_json_list_col(input, column, indent); break;
-    case cuio_json::json_col_t::StructColumn: print_json_struct_col(input, column, indent); break;
-    case cuio_json::json_col_t::Unknown: std::cout << pad(indent) << "[UNKNOWN]\n"; break;
-    default: break;
-  }
-}
-}  // namespace
-
 // Base test fixture for tests
 struct JsonTest : public cudf::test::BaseFixture {
 };
@@ -503,168 +365,6 @@ TEST_F(JsonTest, TokenStream2)
   }
 }
 
-TEST_F(JsonTest, TreeRepresentation)
-{
-  // Test input
-  std::string input = R"(  [{)"
-                      R"("category": "reference",)"
-                      R"("index:": [4,12,42],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "[Sayings of the Century]",)"
-                      R"("price": 8.95)"
-                      R"(},  )"
-                      R"({)"
-                      R"("category": "reference",)"
-                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "{}[], <=semantic-symbols-string",)"
-                      R"("price": 8.95)"
-                      R"(}] )";
-
-  // Get the JSON's tree representation
-  auto tree_rep = cuio_json::detail::get_tree_representation(input, cudf::default_stream_value);
-
-  // Print tree representation
-  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }
-
-  // Golden sample of node categories
-  std::vector<cuio_json::node_t> golden_node_categories = {
-    cuio_json::NC_LIST, cuio_json::NC_STRUCT, cuio_json::NC_FN,     cuio_json::NC_STR,
-    cuio_json::NC_FN,   cuio_json::NC_LIST,   cuio_json::NC_VAL,    cuio_json::NC_VAL,
-    cuio_json::NC_VAL,  cuio_json::NC_FN,     cuio_json::NC_STR,    cuio_json::NC_FN,
-    cuio_json::NC_STR,  cuio_json::NC_FN,     cuio_json::NC_VAL,    cuio_json::NC_STRUCT,
-    cuio_json::NC_FN,   cuio_json::NC_STR,    cuio_json::NC_FN,     cuio_json::NC_LIST,
-    cuio_json::NC_VAL,  cuio_json::NC_STRUCT, cuio_json::NC_VAL,    cuio_json::NC_STRUCT,
-    cuio_json::NC_FN,   cuio_json::NC_LIST,   cuio_json::NC_STRUCT, cuio_json::NC_STRUCT,
-    cuio_json::NC_FN,   cuio_json::NC_STR,    cuio_json::NC_FN,     cuio_json::NC_STR,
-    cuio_json::NC_FN,   cuio_json::NC_VAL};
-
-  // Golden sample of node ids
-  std::vector<cuio_json::NodeIndexT> golden_parent_node_ids = {cuio_json::parent_node_sentinel,
-                                                               0,
-                                                               1,
-                                                               2,
-                                                               1,
-                                                               4,
-                                                               5,
-                                                               5,
-                                                               5,
-                                                               1,
-                                                               9,
-                                                               1,
-                                                               11,
-                                                               1,
-                                                               13,
-                                                               0,
-                                                               15,
-                                                               16,
-                                                               15,
-                                                               18,
-                                                               19,
-                                                               19,
-                                                               19,
-                                                               19,
-                                                               23,
-                                                               24,
-                                                               25,
-                                                               25,
-                                                               15,
-                                                               28,
-                                                               15,
-                                                               30,
-                                                               15,
-                                                               32};
-
-  // Golden sample of node levels
-  std::vector<cuio_json::TreeDepthT> golden_node_levels = {0, 1, 2, 3, 2, 3, 4, 4, 4, 2, 3, 2,
-                                                           3, 2, 3, 1, 2, 3, 2, 3, 4, 4, 4, 4,
-                                                           5, 6, 7, 7, 2, 3, 2, 3, 2, 3};
-
-  // Golden sample of the character-ranges from the original input that each node demarcates
-  std::vector<std::size_t> golden_node_range_begin = {
-    2,   3,   5,   17,  29,  38,  39,  41,  44,  49,  59,  72,  81,  108, 116, 124, 126,
-    138, 150, 158, 159, 161, 164, 169, 171, 174, 175, 180, 189, 199, 212, 221, 255, 263};
-
-  // Golden sample of the character-ranges from the original input that each node demarcates
-  std::vector<std::size_t> golden_node_range_end = {
-    3,   4,   13,  26,  35,  39,  40,  43,  46,  55,  69,  77,  105, 113, 120, 125, 134,
-    147, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 209, 217, 252, 260, 267};
-
-  // Check results against golden samples
-  ASSERT_EQ(golden_node_categories.size(), tree_rep.node_categories.size());
-  ASSERT_EQ(golden_parent_node_ids.size(), tree_rep.parent_node_ids.size());
-  ASSERT_EQ(golden_node_levels.size(), tree_rep.node_levels.size());
-  ASSERT_EQ(golden_node_range_begin.size(), tree_rep.node_range_begin.size());
-  ASSERT_EQ(golden_node_range_end.size(), tree_rep.node_range_end.size());
-
-  for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
-    ASSERT_EQ(golden_node_categories[i], tree_rep.node_categories[i]);
-    ASSERT_EQ(golden_parent_node_ids[i], tree_rep.parent_node_ids[i]);
-    ASSERT_EQ(golden_node_levels[i], tree_rep.node_levels[i]);
-    ASSERT_EQ(golden_node_range_begin[i], tree_rep.node_range_begin[i]);
-    ASSERT_EQ(golden_node_range_end[i], tree_rep.node_range_end[i]);
-  }
-}
-
-TEST_F(JsonTest, TreeRepresentation2)
-{
-  // Test input: value end with comma, space, close-brace ", }"
-  std::string input =
-    //  0         1         2         3         4         5         6         7         8         9
-    //  0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
-    R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9}, "b" : {"x": 10 , "z": 11}}])";
-
-  // Get the JSON's tree representation
-  auto tree_rep = cuio_json::detail::get_tree_representation(input, cudf::default_stream_value);
-
-  // Print tree representation
-  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }
-  // TODO compare with CPU version
-
-  // Golden sample of node categories
-  // clang-format off
-  std::vector<cuio_json::node_t> golden_node_categories = {
-    cuio_json::NC_LIST, cuio_json::NC_STRUCT,
-    cuio_json::NC_STRUCT, cuio_json::NC_FN, cuio_json::NC_STRUCT,  cuio_json::NC_FN,  cuio_json::NC_VAL, cuio_json::NC_FN,  cuio_json::NC_LIST,
-    cuio_json::NC_STRUCT, cuio_json::NC_FN, cuio_json::NC_STRUCT,  cuio_json::NC_FN,  cuio_json::NC_VAL, cuio_json::NC_FN,  cuio_json::NC_VAL,
-                          cuio_json::NC_FN, cuio_json::NC_STRUCT,  cuio_json::NC_FN,  cuio_json::NC_VAL, cuio_json::NC_FN,  cuio_json::NC_VAL};
-
-  // Golden sample of node ids
-  std::vector<cuio_json::NodeIndexT> golden_parent_node_ids = {cuio_json::parent_node_sentinel, 0,
-                                                               0, 2,  3,  4,  5,  4, 7,
-                                                               0, 9, 10, 11, 12, 11, 14,
-                                                                  9, 16, 17, 18, 17, 20};
-  // clang-format on
-
-  // Golden sample of node levels
-  std::vector<cuio_json::TreeDepthT> golden_node_levels = {
-    0, 1, 1, 2, 3, 4, 5, 4, 5, 1, 2, 3, 4, 5, 4, 5, 2, 3, 4, 5, 4, 5,
-  };
-
-  // Golden sample of the character-ranges from the original input that each node demarcates
-  std::vector<std::size_t> golden_node_range_begin = {0,  2,  6,  9,  13, 16, 21, 25, 29, 36, 39,
-                                                      44, 47, 52, 56, 60, 65, 70, 72, 76, 82, 86};
-
-  // Golden sample of the character-ranges from the original input that each node demarcates
-  std::vector<std::size_t> golden_node_range_end = {1,  3,  7,  10, 14, 17, 22, 26, 30, 37, 40,
-                                                    45, 48, 53, 57, 61, 66, 71, 73, 78, 83, 88};
-
-  // Check results against golden samples
-  ASSERT_EQ(golden_node_categories.size(), tree_rep.node_categories.size());
-  ASSERT_EQ(golden_parent_node_ids.size(), tree_rep.parent_node_ids.size());
-  ASSERT_EQ(golden_node_levels.size(), tree_rep.node_levels.size());
-  ASSERT_EQ(golden_node_range_begin.size(), tree_rep.node_range_begin.size());
-  ASSERT_EQ(golden_node_range_end.size(), tree_rep.node_range_end.size());
-
-  for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
-    ASSERT_EQ(golden_node_categories[i], tree_rep.node_categories[i]);
-    ASSERT_EQ(golden_parent_node_ids[i], tree_rep.parent_node_ids[i]);
-    ASSERT_EQ(golden_node_levels[i], tree_rep.node_levels[i]);
-    ASSERT_EQ(golden_node_range_begin[i], tree_rep.node_range_begin[i]);
-    ASSERT_EQ(golden_node_range_end[i], tree_rep.node_range_end[i]);
-  }
-}
-
 TEST_F(JsonTest, ExtractColumn)
 {
   using cuio_json::SymbolT;

From 8ff75a95135f5f7f99dedaebd9fd9ef189621d3f Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Fri, 26 Aug 2022 14:20:19 +0530
Subject: [PATCH 10/27] cleanup

---
 cpp/src/io/json/json_tree.cu | 17 +++----
 cpp/tests/io/json_tree.cpp   | 90 +++++++++++++++++++++---------------
 2 files changed, 59 insertions(+), 48 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index feca46ba39a..efe2ea56ba1 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -29,8 +29,6 @@
 #include <rmm/device_uvector.hpp>
 #include <rmm/exec_policy.hpp>
 
-// JSON tree generation from tokens
-// TODO JSON tree traversal
 namespace cudf::io::json {
 namespace detail {
 
@@ -58,7 +56,7 @@ struct token_to_node {
   }
 };
 
-// convert token indices to node range for each valid node.
+// Convert token indices to node range for each valid node.
 template <typename T1, typename T2, typename T3>
 struct node_ranges {
   T1 tokens;
@@ -106,7 +104,7 @@ struct node_ranges {
   }
 };
 
-// Parses the given JSON string and generates a tree representation of the given input.
+// Generates a tree representation of the given tokens, token_indices.
 tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
                                     device_span<SymbolOffsetT const> token_indices,
                                     rmm::cuda_stream_view stream,
@@ -138,8 +136,6 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
   // Whether the token pushes onto the parent node stack
   auto does_push = [] __device__(PdaTokenT const token) {
     switch (token) {
-      // case token_t::StructMemberBegin: //TODO: Either use FieldNameBegin here or change the
-      // token_to_node function
       case token_t::FieldNameBegin:
       case token_t::StructBegin:
       case token_t::ListBegin: return true;
@@ -203,8 +199,9 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
   CUDF_EXPECTS(node_range_out_end - node_range_out_it == num_nodes, "node range count mismatch");
 
   // Node parent ids: previous push token_id transform, stable sort, segmented scan with Max,
-  // copy_if. This one is sort of logical stack. But more generalized. TODO: make it own function.
-  rmm::device_uvector<size_type> parent_token_ids(num_tokens, stream);  // XXX: fill with 0?
+  // reorder, copy_if. This one is sort of logical stack. But more generalized.
+  // TODO: make it own function.
+  rmm::device_uvector<size_type> parent_token_ids(num_tokens, stream);
   rmm::device_uvector<size_type> initial_order(num_tokens, stream);
   thrust::sequence(rmm::exec_policy(stream), initial_order.begin(), initial_order.end());
   thrust::tabulate(rmm::exec_policy(stream),
@@ -214,7 +211,7 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
                      if (i == 0)
                        return -1;
                      else
-                       return does_push(tokens_gpu[i - 1]) ? i - 1 : -1;  // XXX: -1 or 0?
+                       return does_push(tokens_gpu[i - 1]) ? i - 1 : -1;
                    });
   auto out_pid = thrust::make_zip_iterator(parent_token_ids.data(), initial_order.data());
   // TODO: use radix sort.
@@ -230,7 +227,7 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
                                 parent_token_ids.data(),  // size_type{-1},
                                 thrust::equal_to<size_type>{},
                                 thrust::maximum<size_type>{});
-  // TODO: Avoid sorting again by  gather_if on a transform iterator. or scatter.
+  // FIXME: Avoid sorting again by scatter + extra memory. Tradeoff?
   thrust::sort_by_key(rmm::exec_policy(stream),
                       initial_order.data(),
                       initial_order.data() + initial_order.size(),
diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
index 47e44549206..14dabc3f5d5 100644
--- a/cpp/tests/io/json_tree.cpp
+++ b/cpp/tests/io/json_tree.cpp
@@ -91,7 +91,6 @@ void print_tree_representation(std::string const& json_input,
 }
 }  // namespace
 
-// cudf::io::json::
 namespace cudf::io::json {
 namespace test {
 
@@ -104,8 +103,14 @@ tree_meta_t2 to_cpu_tree(tree_meta_t const& d_value, rmm::cuda_stream_view strea
           cudf::detail::make_std_vector_async(d_value.node_range_end, stream)};
 }
 
-void compare_trees(tree_meta_t2 const& cpu_tree, tree_meta_t const& d_gpu_tree)
+void compare_trees(tree_meta_t2 const& cpu_tree, tree_meta_t const& d_gpu_tree, bool print = false)
 {
+  auto cpu_num_nodes = cpu_tree.node_categories.size();
+  EXPECT_EQ(cpu_num_nodes, d_gpu_tree.node_categories.size());
+  EXPECT_EQ(cpu_num_nodes, d_gpu_tree.parent_node_ids.size());
+  EXPECT_EQ(cpu_num_nodes, d_gpu_tree.node_levels.size());
+  EXPECT_EQ(cpu_num_nodes, d_gpu_tree.node_range_begin.size());
+  EXPECT_EQ(cpu_num_nodes, d_gpu_tree.node_range_end.size());
   auto gpu_tree = to_cpu_tree(d_gpu_tree, cudf::default_stream_value);
   // DEBUG prints
   auto to_cat = [](auto v) -> std::string {
@@ -121,40 +126,47 @@ void compare_trees(tree_meta_t2 const& cpu_tree, tree_meta_t const& d_gpu_tree)
   };
   auto to_int    = [](auto v) { return std::to_string(static_cast<int>(v)); };
   bool mismatch  = false;
-  auto print_vec = [&](auto const& cpu, auto const& gpu, auto const name, auto converter) {
-    if (not cpu.empty()) {
-      for (auto const& v : cpu)
-        printf("%3s,", converter(v).c_str());
-      std::cout << name << "(CPU):" << std::endl;
-    }
-    if (not cpu.empty()) {
-      for (auto const& v : gpu)
-        printf("%3s,", converter(v).c_str());
-      std::cout << name << "(GPU):" << std::endl;
-    }
-    if (not cpu.empty() and not gpu.empty()) {
-      if (!std::equal(gpu.begin(), gpu.end(), cpu.begin())) {
-        for (auto i = 0lu; i < cpu.size(); i++) {
-          mismatch |= (gpu[i] != cpu[i]);
-          printf("%3s,", (gpu[i] == cpu[i] ? " " : "x"));
-        }
-        std::cout << std::endl;
-      }
-    }
+  auto print_vec = [&](auto const& cpu, auto const name, auto converter) {
+    for (auto const& v : cpu)
+      printf("%3s,", converter(v).c_str());
+    std::cout << name << std::endl;
   };
-#define PRINT_VEC(vec) print_vec(cpu_tree.vec, gpu_tree.vec, #vec, to_int);
-  for (int i = 0; i < int(gpu_tree.node_categories.size()); i++)
-    printf("%3d,", i);
-  printf(" node_id\n");
-  print_vec(
-    cpu_tree.node_categories, gpu_tree.node_categories, "node_categories", to_cat);  // Works
-  PRINT_VEC(node_levels);                                                            // Works
-  // PRINT_VEC(node_range_begin);  // Works
-  // PRINT_VEC(node_range_end);    // Works
-  PRINT_VEC(parent_node_ids);  // Works
-  CUDF_EXPECTS(!mismatch, "Mismatch in GPU and CPU tree representation");
-  // std::cout << "Mismatch: " << mismatch << std::endl;
+
+#define COMPARE_MEMBER(member)                                    \
+  for (std::size_t i = 0; i < cpu_num_nodes; i++) {               \
+    EXPECT_EQ(cpu_tree.member[i], gpu_tree.member[i]) << #member; \
+  }
+  COMPARE_MEMBER(node_categories);
+  COMPARE_MEMBER(parent_node_ids);
+  COMPARE_MEMBER(node_levels);
+  COMPARE_MEMBER(node_range_begin);
+  COMPARE_MEMBER(node_range_end);
+#undef COMPARE_MEMBER
+
+#define PRINT_VEC(vec, conv) print_vec(vec, #vec, conv);
+#define PRINT_COMPARISON(vec, conv)                                                  \
+  PRINT_VEC(cpu_tree.vec, conv);                                                     \
+  PRINT_VEC(gpu_tree.vec, conv);                                                     \
+  if (!std::equal(cpu_tree.vec.begin(), cpu_tree.vec.end(), gpu_tree.vec.begin())) { \
+    for (auto i = 0lu; i < cpu_tree.vec.size(); i++) {                               \
+      mismatch |= (gpu_tree.vec[i] != cpu_tree.vec[i]);                              \
+      printf("%3s,", (gpu_tree.vec[i] == cpu_tree.vec[i] ? " " : "x"));              \
+    }                                                                                \
+    printf("\n");                                                                    \
+  }
+  if (print) {
+    for (int i = 0; i < int(cpu_num_nodes); i++)
+      printf("%3d,", i);
+    printf(" node_id\n");
+    PRINT_COMPARISON(node_categories, to_cat);   // Works
+    PRINT_COMPARISON(node_levels, to_int);       // Works
+    PRINT_COMPARISON(node_range_begin, to_int);  // Works
+    PRINT_COMPARISON(node_range_end, to_int);    // Works
+    PRINT_COMPARISON(parent_node_ids, to_int);   // Works
+    EXPECT_FALSE(mismatch);
+  }
 #undef PRINT_VEC
+#undef PRINT_COMPARISON
 }
 
 tree_meta_t2 get_tree_representation_cpu(device_span<PdaTokenT const> tokens_gpu,
@@ -189,11 +201,13 @@ tree_meta_t2 get_tree_representation_cpu(device_span<PdaTokenT const> tokens_gpu
       default: return ".";
     }
   };
-  std::cout << "Tokens: \n";
-  for (auto i = 0u; i < tokens.size(); i++) {
-    std::cout << to_token_str(tokens[i]) << " ";
+  if (std::getenv("CUDA_DBG_DUMP") != nullptr) {
+    std::cout << "Tokens: \n";
+    for (auto i = 0u; i < tokens.size(); i++) {
+      std::cout << to_token_str(tokens[i]) << " ";
+    }
+    std::cout << std::endl;
   }
-  std::cout << std::endl;
 
   // Whether a token does represent a node in the tree representation
   auto is_node = [](PdaTokenT const token) {

From f73a18753f5030b83bcba5f6ac8b8642ce4875d8 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Fri, 26 Aug 2022 14:24:38 +0530
Subject: [PATCH 11/27] doc

---
 cpp/src/io/json/nested_json.hpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cpp/src/io/json/nested_json.hpp b/cpp/src/io/json/nested_json.hpp
index c93af5f41e4..94c51679c76 100644
--- a/cpp/src/io/json/nested_json.hpp
+++ b/cpp/src/io/json/nested_json.hpp
@@ -305,7 +305,7 @@ std::pair<rmm::device_uvector<PdaTokenT>, rmm::device_uvector<SymbolOffsetT>> ge
 /**
  * @brief Parses the given JSON string and generates a tree representation of the given input.
  *
- * @param tokens device span of token types in the json string
+ * @param tokens Vector of token types in the json string
  * @param token_indices The indices within the input string corresponding to each token
  * @param stream The CUDA stream to which kernels are dispatched
  * @return A tree representation of the input JSON string as vectors of node type, parent index,

From e531cba89a47d6f988f2e0c600c9afa8c34d287f Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Thu, 1 Sep 2022 22:48:31 +0530
Subject: [PATCH 12/27] merge fix, review comments

---
 cpp/src/io/json/nested_json_gpu.cu |  2 +
 cpp/tests/io/nested_json_test.cpp  | 92 ++++++++++++++++++++++++++++++
 2 files changed, 94 insertions(+)

diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 07534b68586..3ef5082415e 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -1122,6 +1122,8 @@ void make_json_column(json_column& root_column,
       case token_t::StructEnd: return "StructEnd";
       case token_t::ListBegin: return "ListBegin";
       case token_t::ListEnd: return "ListEnd";
+      case token_t::StructMemberBegin: return "StructMemberBegin";
+      case token_t::StructMemberEnd: return "StructMemberEnd";
       case token_t::FieldNameBegin: return "FieldNameBegin";
       case token_t::FieldNameEnd: return "FieldNameEnd";
       case token_t::StringBegin: return "StringBegin";
diff --git a/cpp/tests/io/nested_json_test.cpp b/cpp/tests/io/nested_json_test.cpp
index a265d139b49..304adf863d9 100644
--- a/cpp/tests/io/nested_json_test.cpp
+++ b/cpp/tests/io/nested_json_test.cpp
@@ -36,6 +36,98 @@
 
 namespace cuio_json = cudf::io::json;
 
+namespace {
+// Forward declaration
+void print_column(std::string const& input,
+                  cuio_json::json_column const& column,
+                  uint32_t indent = 0);
+
+/**
+ * @brief Helper to generate indentation
+ */
+std::string pad(uint32_t indent = 0)
+{
+  std::string pad{};
+  if (indent > 0) pad.insert(pad.begin(), indent, ' ');
+  return pad;
+}
+
+/**
+ * @brief Prints a string column.
+ */
+void print_json_string_col(std::string const& input,
+                           cuio_json::json_column const& column,
+                           uint32_t indent = 0)
+{
+  for (std::size_t i = 0; i < column.string_offsets.size(); i++) {
+    std::cout << pad(indent) << i << ": [" << (column.validity[i] ? "1" : "0") << "] '"
+              << input.substr(column.string_offsets[i], column.string_lengths[i]) << "'\n";
+  }
+}
+
+/**
+ * @brief Prints a list column.
+ */
+void print_json_list_col(std::string const& input,
+                         cuio_json::json_column const& column,
+                         uint32_t indent = 0)
+{
+  std::cout << pad(indent) << " [LIST]\n";
+  std::cout << pad(indent) << " -> num. child-columns: " << column.child_columns.size() << "\n";
+  std::cout << pad(indent) << " -> num. rows: " << column.current_offset << "\n";
+  std::cout << pad(indent) << " -> num. valid: " << column.valid_count << "\n";
+  std::cout << pad(indent) << " offsets[]: "
+            << "\n";
+  for (std::size_t i = 0; i < column.child_offsets.size() - 1; i++) {
+    std::cout << pad(indent + 2) << i << ": [" << (column.validity[i] ? "1" : "0") << "] ["
+              << column.child_offsets[i] << ", " << column.child_offsets[i + 1] << ")\n";
+  }
+  if (column.child_columns.size() > 0) {
+    std::cout << pad(indent) << column.child_columns.begin()->first << "[]: "
+              << "\n";
+    print_column(input, column.child_columns.begin()->second, indent + 2);
+  }
+}
+
+/**
+ * @brief Prints a struct column.
+ */
+void print_json_struct_col(std::string const& input,
+                           cuio_json::json_column const& column,
+                           uint32_t indent = 0)
+{
+  std::cout << pad(indent) << " [STRUCT]\n";
+  std::cout << pad(indent) << " -> num. child-columns: " << column.child_columns.size() << "\n";
+  std::cout << pad(indent) << " -> num. rows: " << column.current_offset << "\n";
+  std::cout << pad(indent) << " -> num. valid: " << column.valid_count << "\n";
+  std::cout << pad(indent) << " -> validity[]: "
+            << "\n";
+  for (decltype(column.current_offset) i = 0; i < column.current_offset; i++) {
+    std::cout << pad(indent + 2) << i << ": [" << (column.validity[i] ? "1" : "0") << "]\n";
+  }
+  auto it = std::begin(column.child_columns);
+  for (std::size_t i = 0; i < column.child_columns.size(); i++) {
+    std::cout << pad(indent + 2) << "child #" << i << " '" << it->first << "'[] \n";
+    print_column(input, it->second, indent + 2);
+    it++;
+  }
+}
+
+/**
+ * @brief Prints the column's data and recurses through and prints all the child columns.
+ */
+void print_column(std::string const& input, cuio_json::json_column const& column, uint32_t indent)
+{
+  switch (column.type) {
+    case cuio_json::json_col_t::StringColumn: print_json_string_col(input, column, indent); break;
+    case cuio_json::json_col_t::ListColumn: print_json_list_col(input, column, indent); break;
+    case cuio_json::json_col_t::StructColumn: print_json_struct_col(input, column, indent); break;
+    case cuio_json::json_col_t::Unknown: std::cout << pad(indent) << "[UNKNOWN]\n"; break;
+    default: break;
+  }
+}
+}  // namespace
+
 // Base test fixture for tests
 struct JsonTest : public cudf::test::BaseFixture {
 };

From 94bdd1f2d6cafb93a6ea500dabacf6e269883ab2 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 5 Sep 2022 12:41:36 +0530
Subject: [PATCH 13/27] cleanup tests

---
 cpp/src/io/json/json_tree.cu      |  5 +-
 cpp/tests/io/json_tree.cpp        | 94 +++++++++++++++----------------
 cpp/tests/io/nested_json_test.cpp | 64 ++++++++++-----------
 3 files changed, 83 insertions(+), 80 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index efe2ea56ba1..12005c8864b 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -18,6 +18,8 @@
 
 #include <io/utilities/hostdevice_vector.hpp>
 
+#include <cudf/detail/nvtx/ranges.hpp>
+
 #include <thrust/copy.h>
 #include <thrust/iterator/counting_iterator.h>
 #include <thrust/iterator/transform_output_iterator.h>
@@ -110,6 +112,7 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
                                     rmm::cuda_stream_view stream,
                                     rmm::mr::device_memory_resource* mr)
 {
+  CUDF_FUNC_RANGE();
   // Whether a token does represent a node in the tree representation
   auto is_node = [] __device__(PdaTokenT const token) -> size_type {
     switch (token) {
@@ -214,7 +217,7 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
                        return does_push(tokens_gpu[i - 1]) ? i - 1 : -1;
                    });
   auto out_pid = thrust::make_zip_iterator(parent_token_ids.data(), initial_order.data());
-  // TODO: use radix sort.
+  // Uses radix sort for builtin types.
   thrust::stable_sort_by_key(rmm::exec_policy(stream),
                              token_levels.data(),
                              token_levels.data() + token_levels.size(),
diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
index 14dabc3f5d5..bde1559f942 100644
--- a/cpp/tests/io/json_tree.cpp
+++ b/cpp/tests/io/json_tree.cpp
@@ -363,27 +363,27 @@ struct JsonTest : public cudf::test::BaseFixture {
 
 TEST_F(JsonTest, TreeRepresentation)
 {
-  auto stream = cudf::default_stream_value;
+  constexpr auto stream = cudf::default_stream_value;
 
   // Test input
-  std::string input = R"(  [{)"
-                      R"("category": "reference",)"
-                      R"("index:": [4,12,42],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "[Sayings of the Century]",)"
-                      R"("price": 8.95)"
-                      R"(},  )"
-                      R"({)"
-                      R"("category": "reference",)"
-                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "{}[], <=semantic-symbols-string",)"
-                      R"("price": 8.95)"
-                      R"(}] )";
+  std::string const input = R"(  [{)"
+                            R"("category": "reference",)"
+                            R"("index:": [4,12,42],)"
+                            R"("author": "Nigel Rees",)"
+                            R"("title": "[Sayings of the Century]",)"
+                            R"("price": 8.95)"
+                            R"(},  )"
+                            R"({)"
+                            R"("category": "reference",)"
+                            R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
+                            R"("author": "Nigel Rees",)"
+                            R"("title": "{}[], <=semantic-symbols-string",)"
+                            R"("price": 8.95)"
+                            R"(}] )";
   // Prepare input & output buffers
-  cudf::string_scalar d_scalar(input, true, stream);
-  auto d_input = cudf::device_span<cuio_json::SymbolT const>{d_scalar.data(),
-                                                             static_cast<size_t>(d_scalar.size())};
+  cudf::string_scalar const d_scalar(input, true, stream);
+  auto const d_input = cudf::device_span<cuio_json::SymbolT const>{
+    d_scalar.data(), static_cast<size_t>(d_scalar.size())};
 
   cudf::io::json_reader_options const options{};
 
@@ -394,12 +394,12 @@ TEST_F(JsonTest, TreeRepresentation)
   // Get the JSON's tree representation
   auto gpu_tree = cuio_json::detail::get_tree_representation(tokens_gpu, token_indices_gpu, stream);
   // host tree generation
-  auto tree_rep =
+  auto cpu_tree =
     cuio_json::test::get_tree_representation_cpu(tokens_gpu, token_indices_gpu, options, stream);
-  cudf::io::json::test::compare_trees(tree_rep, gpu_tree);
+  cudf::io::json::test::compare_trees(cpu_tree, gpu_tree);
 
   // Print tree representation
-  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }
+  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, cpu_tree); }
 
   // Golden sample of node categories
   std::vector<cuio_json::node_t> golden_node_categories = {
@@ -443,26 +443,26 @@ TEST_F(JsonTest, TreeRepresentation)
     147, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 209, 217, 252, 260, 267};
 
   // Check results against golden samples
-  ASSERT_EQ(golden_node_categories.size(), tree_rep.node_categories.size());
-  ASSERT_EQ(golden_parent_node_ids.size(), tree_rep.parent_node_ids.size());
-  ASSERT_EQ(golden_node_levels.size(), tree_rep.node_levels.size());
-  ASSERT_EQ(golden_node_range_begin.size(), tree_rep.node_range_begin.size());
-  ASSERT_EQ(golden_node_range_end.size(), tree_rep.node_range_end.size());
+  ASSERT_EQ(golden_node_categories.size(), cpu_tree.node_categories.size());
+  ASSERT_EQ(golden_parent_node_ids.size(), cpu_tree.parent_node_ids.size());
+  ASSERT_EQ(golden_node_levels.size(), cpu_tree.node_levels.size());
+  ASSERT_EQ(golden_node_range_begin.size(), cpu_tree.node_range_begin.size());
+  ASSERT_EQ(golden_node_range_end.size(), cpu_tree.node_range_end.size());
 
   for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
-    ASSERT_EQ(golden_node_categories[i], tree_rep.node_categories[i]);
-    ASSERT_EQ(golden_parent_node_ids[i], tree_rep.parent_node_ids[i]);
-    ASSERT_EQ(golden_node_levels[i], tree_rep.node_levels[i]);
-    ASSERT_EQ(golden_node_range_begin[i], tree_rep.node_range_begin[i]);
-    ASSERT_EQ(golden_node_range_end[i], tree_rep.node_range_end[i]);
+    ASSERT_EQ(golden_node_categories[i], cpu_tree.node_categories[i]);
+    ASSERT_EQ(golden_parent_node_ids[i], cpu_tree.parent_node_ids[i]);
+    ASSERT_EQ(golden_node_levels[i], cpu_tree.node_levels[i]);
+    ASSERT_EQ(golden_node_range_begin[i], cpu_tree.node_range_begin[i]);
+    ASSERT_EQ(golden_node_range_end[i], cpu_tree.node_range_end[i]);
   }
 }
 
 TEST_F(JsonTest, TreeRepresentation2)
 {
-  auto stream = cudf::default_stream_value;
+  constexpr auto stream = cudf::default_stream_value;
   // Test input: value end with comma, space, close-brace ", }"
-  std::string input =
+  std::string const input =
     //  0         1         2         3         4         5         6         7         8         9
     //  0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
     R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9}, "b" : {"x": 10 , "z": 11}}])";
@@ -480,12 +480,12 @@ TEST_F(JsonTest, TreeRepresentation2)
   // Get the JSON's tree representation
   auto gpu_tree = cuio_json::detail::get_tree_representation(tokens_gpu, token_indices_gpu, stream);
   // host tree generation
-  auto tree_rep =
+  auto cpu_tree =
     cuio_json::test::get_tree_representation_cpu(tokens_gpu, token_indices_gpu, options, stream);
-  cudf::io::json::test::compare_trees(tree_rep, gpu_tree);
+  cudf::io::json::test::compare_trees(cpu_tree, gpu_tree);
 
   // Print tree representation
-  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, tree_rep); }
+  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, cpu_tree); }
   // TODO compare with CPU version
 
   // Golden sample of node categories
@@ -501,7 +501,7 @@ TEST_F(JsonTest, TreeRepresentation2)
     cuio_json::parent_node_sentinel, 0,
     0, 2,  3,  4,  5,  4, 7,
     0, 9, 10, 11, 12, 11, 14,
-      9, 16, 17, 18, 17, 20};
+       9, 16, 17, 18, 17, 20};
   // clang-format on
 
   // Golden sample of node levels
@@ -518,17 +518,17 @@ TEST_F(JsonTest, TreeRepresentation2)
                                                     45, 48, 53, 57, 61, 66, 71, 73, 78, 83, 88};
 
   // Check results against golden samples
-  ASSERT_EQ(golden_node_categories.size(), tree_rep.node_categories.size());
-  ASSERT_EQ(golden_parent_node_ids.size(), tree_rep.parent_node_ids.size());
-  ASSERT_EQ(golden_node_levels.size(), tree_rep.node_levels.size());
-  ASSERT_EQ(golden_node_range_begin.size(), tree_rep.node_range_begin.size());
-  ASSERT_EQ(golden_node_range_end.size(), tree_rep.node_range_end.size());
+  ASSERT_EQ(golden_node_categories.size(), cpu_tree.node_categories.size());
+  ASSERT_EQ(golden_parent_node_ids.size(), cpu_tree.parent_node_ids.size());
+  ASSERT_EQ(golden_node_levels.size(), cpu_tree.node_levels.size());
+  ASSERT_EQ(golden_node_range_begin.size(), cpu_tree.node_range_begin.size());
+  ASSERT_EQ(golden_node_range_end.size(), cpu_tree.node_range_end.size());
 
   for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
-    ASSERT_EQ(golden_node_categories[i], tree_rep.node_categories[i]);
-    ASSERT_EQ(golden_parent_node_ids[i], tree_rep.parent_node_ids[i]);
-    ASSERT_EQ(golden_node_levels[i], tree_rep.node_levels[i]);
-    ASSERT_EQ(golden_node_range_begin[i], tree_rep.node_range_begin[i]);
-    ASSERT_EQ(golden_node_range_end[i], tree_rep.node_range_end[i]);
+    ASSERT_EQ(golden_node_categories[i], cpu_tree.node_categories[i]);
+    ASSERT_EQ(golden_parent_node_ids[i], cpu_tree.parent_node_ids[i]);
+    ASSERT_EQ(golden_node_levels[i], cpu_tree.node_levels[i]);
+    ASSERT_EQ(golden_node_range_begin[i], cpu_tree.node_range_begin[i]);
+    ASSERT_EQ(golden_node_range_end[i], cpu_tree.node_range_end[i]);
   }
 }
diff --git a/cpp/tests/io/nested_json_test.cpp b/cpp/tests/io/nested_json_test.cpp
index 304adf863d9..e73b6778aed 100644
--- a/cpp/tests/io/nested_json_test.cpp
+++ b/cpp/tests/io/nested_json_test.cpp
@@ -14,7 +14,6 @@
  * limitations under the License.
  */
 
-#include "cudf/utilities/span.hpp"
 #include <io/json/nested_json.hpp>
 #include <io/utilities/hostdevice_vector.hpp>
 
@@ -24,6 +23,7 @@
 #include <cudf/lists/lists_column_view.hpp>
 #include <cudf/scalar/scalar.hpp>
 #include <cudf/utilities/default_stream.hpp>
+#include <cudf/utilities/span.hpp>
 
 #include <cudf_test/base_fixture.hpp>
 #include <cudf_test/column_utilities.hpp>
@@ -158,8 +158,8 @@ TEST_F(JsonTest, StackContext)
                             R"(}] )";
 
   // Prepare input & output buffers
-  cudf::string_scalar d_scalar(input, true, stream);
-  auto d_input =
+  cudf::string_scalar const d_scalar(input, true, stream);
+  auto const d_input =
     cudf::device_span<SymbolT const>{d_scalar.data(), static_cast<size_t>(d_scalar.size())};
   hostdevice_vector<StackSymbolT> stack_context(input.size(), stream);
 
@@ -172,7 +172,7 @@ TEST_F(JsonTest, StackContext)
   // Make sure we copied back the stack context
   stream.synchronize();
 
-  std::vector<char> golden_stack_context{
+  std::vector<char> const golden_stack_context{
     '_', '_', '_', '[', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
     '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
     '{', '[', '[', '[', '[', '[', '[', '[', '[', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
@@ -206,8 +206,8 @@ TEST_F(JsonTest, StackContextUtf8)
   std::string const input = R"([{"a":{"year":1882,"author": "Bharathi"}, {"a":"filip ʒakotɛ"}}])";
 
   // Prepare input & output buffers
-  cudf::string_scalar d_scalar(input, true, stream);
-  auto d_input =
+  cudf::string_scalar const d_scalar(input, true, stream);
+  auto const d_input =
     cudf::device_span<SymbolT const>{d_scalar.data(), static_cast<size_t>(d_scalar.size())};
   hostdevice_vector<StackSymbolT> stack_context(input.size(), stream);
 
@@ -220,7 +220,7 @@ TEST_F(JsonTest, StackContextUtf8)
   // Make sure we copied back the stack context
   stream.synchronize();
 
-  std::vector<char> golden_stack_context{
+  std::vector<char> const golden_stack_context{
     '_', '[', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
     '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
     '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{', '{',
@@ -236,20 +236,20 @@ TEST_F(JsonTest, TokenStream)
   using cuio_json::SymbolOffsetT;
   using cuio_json::SymbolT;
   // Test input
-  std::string input = R"(  [{)"
-                      R"("category": "reference",)"
-                      R"("index:": [4,12,42],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "[Sayings of the Century]",)"
-                      R"("price": 8.95)"
-                      R"(},  )"
-                      R"({)"
-                      R"("category": "reference",)"
-                      R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
-                      R"("author": "Nigel Rees",)"
-                      R"("title": "{}[], <=semantic-symbols-string",)"
-                      R"("price": 8.95)"
-                      R"(}] )";
+  std::string const input = R"(  [{)"
+                            R"("category": "reference",)"
+                            R"("index:": [4,12,42],)"
+                            R"("author": "Nigel Rees",)"
+                            R"("title": "[Sayings of the Century]",)"
+                            R"("price": 8.95)"
+                            R"(},  )"
+                            R"({)"
+                            R"("category": "reference",)"
+                            R"("index": [4,{},null,{"a":[{ }, {}] } ],)"
+                            R"("author": "Nigel Rees",)"
+                            R"("title": "{}[], <=semantic-symbols-string",)"
+                            R"("price": 8.95)"
+                            R"(}] )";
 
   constexpr auto stream = cudf::default_stream_value;
 
@@ -257,22 +257,22 @@ TEST_F(JsonTest, TokenStream)
   cudf::io::json_reader_options default_options{};
 
   // Prepare input & output buffers
-  cudf::string_scalar d_scalar(input, true, stream);
-  auto d_input =
+  cudf::string_scalar const d_scalar(input, true, stream);
+  auto const d_input =
     cudf::device_span<SymbolT const>{d_scalar.data(), static_cast<size_t>(d_scalar.size())};
 
   // Parse the JSON and get the token stream
   auto [d_tokens_gpu, d_token_indices_gpu] =
     cuio_json::detail::get_token_stream(d_input, default_options, stream);
   // Copy back the number of tokens that were written
-  thrust::host_vector<PdaTokenT> tokens_gpu =
+  thrust::host_vector<PdaTokenT> const tokens_gpu =
     cudf::detail::make_host_vector_async(d_tokens_gpu, stream);
-  thrust::host_vector<SymbolOffsetT> token_indices_gpu =
+  thrust::host_vector<SymbolOffsetT> const token_indices_gpu =
     cudf::detail::make_host_vector_async(d_token_indices_gpu, stream);
 
   // Golden token stream sample
   using token_t = cuio_json::token_t;
-  std::vector<std::pair<std::size_t, cuio_json::PdaTokenT>> golden_token_stream = {
+  std::vector<std::pair<std::size_t, cuio_json::PdaTokenT>> const golden_token_stream = {
     {2, token_t::ListBegin},
     {3, token_t::StructBegin},
     {4, token_t::StructMemberBegin},
@@ -383,7 +383,7 @@ TEST_F(JsonTest, TokenStream2)
   using cuio_json::SymbolOffsetT;
   using cuio_json::SymbolT;
   // value end with comma, space, close-brace ", }"
-  std::string input =
+  std::string const input =
     R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9}, "b" : {"x": 10 , "z": 11}}])";
 
   constexpr auto stream = cudf::default_stream_value;
@@ -392,23 +392,23 @@ TEST_F(JsonTest, TokenStream2)
   cudf::io::json_reader_options default_options{};
 
   // Prepare input & output buffers
-  cudf::string_scalar d_scalar(input, true, stream);
-  auto d_input =
+  cudf::string_scalar const d_scalar(input, true, stream);
+  auto const d_input =
     cudf::device_span<SymbolT const>{d_scalar.data(), static_cast<size_t>(d_scalar.size())};
 
   // Parse the JSON and get the token stream
   auto [d_tokens_gpu, d_token_indices_gpu] =
     cuio_json::detail::get_token_stream(d_input, default_options, stream);
   // Copy back the number of tokens that were written
-  thrust::host_vector<PdaTokenT> tokens_gpu =
+  thrust::host_vector<PdaTokenT> const tokens_gpu =
     cudf::detail::make_host_vector_async(d_tokens_gpu, stream);
-  thrust::host_vector<SymbolOffsetT> token_indices_gpu =
+  thrust::host_vector<SymbolOffsetT> const token_indices_gpu =
     cudf::detail::make_host_vector_async(d_token_indices_gpu, stream);
 
   // Golden token stream sample
   using token_t = cuio_json::token_t;
   // clang-format off
-  std::vector<std::pair<std::size_t, cuio_json::PdaTokenT>> golden_token_stream = {
+  std::vector<std::pair<std::size_t, cuio_json::PdaTokenT>> const golden_token_stream = {
     {0, token_t::ListBegin},
     {2, token_t::StructBegin}, {3, token_t::StructEnd}, //{}
     {6, token_t::StructBegin},

From edb78add021ae1a47ddb1ee805ed91cc12914808 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 5 Sep 2022 13:53:39 +0530
Subject: [PATCH 14/27] add StructMemberEnd whitespace, newline test cases

---
 cpp/src/io/json/nested_json_gpu.cu |  2 +-
 cpp/tests/io/json_tree.cpp         | 17 +++++++++--------
 cpp/tests/io/nested_json_test.cpp  | 13 +++++++------
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 3ef5082415e..18f6b31df87 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -595,7 +595,7 @@ auto get_translation_table()
      {ValueEnd, StructMemberEnd},             // COMMA
      {ErrorBegin},                            // COLON
      {ValueEnd},                              // WHITE_SPACE
-     {ValueEnd},                              // LINE_BREAK //TODO StructMemberEnd here?
+     {ValueEnd},                              // LINE_BREAK
      {}}};                                    // OTHER
 
   pda_tlt[static_cast<StateT>(pda_state_t::PD_STR)] = {{              /*ROOT*/
diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
index bde1559f942..32499d9fbb8 100644
--- a/cpp/tests/io/json_tree.cpp
+++ b/cpp/tests/io/json_tree.cpp
@@ -132,9 +132,9 @@ void compare_trees(tree_meta_t2 const& cpu_tree, tree_meta_t const& d_gpu_tree,
     std::cout << name << std::endl;
   };
 
-#define COMPARE_MEMBER(member)                                    \
-  for (std::size_t i = 0; i < cpu_num_nodes; i++) {               \
-    EXPECT_EQ(cpu_tree.member[i], gpu_tree.member[i]) << #member; \
+#define COMPARE_MEMBER(member)                                                       \
+  for (std::size_t i = 0; i < cpu_num_nodes; i++) {                                  \
+    EXPECT_EQ(cpu_tree.member[i], gpu_tree.member[i]) << #member << "[" << i << "]"; \
   }
   COMPARE_MEMBER(node_categories);
   COMPARE_MEMBER(parent_node_ids);
@@ -463,9 +463,10 @@ TEST_F(JsonTest, TreeRepresentation2)
   constexpr auto stream = cudf::default_stream_value;
   // Test input: value end with comma, space, close-brace ", }"
   std::string const input =
-    //  0         1         2         3         4         5         6         7         8         9
-    //  0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
-    R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9}, "b" : {"x": 10 , "z": 11}}])";
+    // 0         1         2         3         4         5         6         7         8         9
+    // 0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
+    R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9 }, "b" : {"x": 10 , "z": 11)"
+    "\n}}]";
   // Prepare input & output buffers
   cudf::string_scalar d_scalar(input, true, stream);
   auto d_input = cudf::device_span<cuio_json::SymbolT const>{d_scalar.data(),
@@ -511,11 +512,11 @@ TEST_F(JsonTest, TreeRepresentation2)
 
   // Golden sample of the character-ranges from the original input that each node demarcates
   std::vector<std::size_t> golden_node_range_begin = {0,  2,  6,  9,  13, 16, 21, 25, 29, 36, 39,
-                                                      44, 47, 52, 56, 60, 65, 70, 72, 76, 82, 86};
+                                                      44, 47, 52, 56, 60, 66, 71, 73, 77, 83, 87};
 
   // Golden sample of the character-ranges from the original input that each node demarcates
   std::vector<std::size_t> golden_node_range_end = {1,  3,  7,  10, 14, 17, 22, 26, 30, 37, 40,
-                                                    45, 48, 53, 57, 61, 66, 71, 73, 78, 83, 88};
+                                                    45, 48, 53, 57, 61, 67, 72, 74, 79, 84, 89};
 
   // Check results against golden samples
   ASSERT_EQ(golden_node_categories.size(), cpu_tree.node_categories.size());
diff --git a/cpp/tests/io/nested_json_test.cpp b/cpp/tests/io/nested_json_test.cpp
index e73b6778aed..0ffe136f4a7 100644
--- a/cpp/tests/io/nested_json_test.cpp
+++ b/cpp/tests/io/nested_json_test.cpp
@@ -384,7 +384,8 @@ TEST_F(JsonTest, TokenStream2)
   using cuio_json::SymbolT;
   // value end with comma, space, close-brace ", }"
   std::string const input =
-    R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9}, "b" : {"x": 10 , "z": 11}}])";
+    R"([ {}, { "a": { "y" : 6, "z": [] }}, { "a" : { "x" : 8, "y": 9}, "b" : {"x": 10 , "z": 11)"
+    "\n}}]";
 
   constexpr auto stream = cudf::default_stream_value;
 
@@ -429,11 +430,11 @@ TEST_F(JsonTest, TokenStream2)
         {64, token_t::StructMemberBegin}, {64, token_t::FieldNameBegin}, {66, token_t::FieldNameEnd}, //b
             {70, token_t::StructBegin},
                 {71, token_t::StructMemberBegin}, {71, token_t::FieldNameBegin}, {73, token_t::FieldNameEnd}, {76, token_t::ValueBegin}, {78, token_t::ValueEnd}, {79, token_t::StructMemberEnd}, //b.x
-                {81, token_t::StructMemberBegin}, {81, token_t::FieldNameBegin}, {83, token_t::FieldNameEnd}, {86, token_t::ValueBegin}, {88, token_t::ValueEnd}, {88, token_t::StructMemberEnd}, //b.z
-            {88, token_t::StructEnd},
-        {89, token_t::StructMemberEnd},
-    {89, token_t::StructEnd},
-    {90, token_t::ListEnd}};
+                {81, token_t::StructMemberBegin}, {81, token_t::FieldNameBegin}, {83, token_t::FieldNameEnd}, {86, token_t::ValueBegin}, {88, token_t::ValueEnd}, {89, token_t::StructMemberEnd}, //b.z
+            {89, token_t::StructEnd},
+        {90, token_t::StructMemberEnd},
+    {90, token_t::StructEnd},
+    {91, token_t::ListEnd}};
   // clang-format on
 
   // Verify the number of tokens matches

From ab1db5be38c80ac1d54a95c93a8876800758e4dc Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 5 Sep 2022 16:41:40 +0530
Subject: [PATCH 15/27] include_quote_chars change in tree generation

---
 cpp/src/io/json/json_tree.cu       | 22 ++++++++++++++++-----
 cpp/src/io/json/nested_json_gpu.cu | 16 ++++++++++-----
 cpp/tests/io/json_tree.cpp         | 31 ++++++++++++++++++------------
 cpp/tests/io/nested_json_test.cpp  |  2 +-
 4 files changed, 48 insertions(+), 23 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index 12005c8864b..4d17fa232ee 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -64,6 +64,7 @@ struct node_ranges {
   T1 tokens;
   T2 token_indices;
   T3 num_tokens;
+  bool include_quote_char;
   __device__ auto operator()(size_type i) -> thrust::tuple<SymbolOffsetT, SymbolOffsetT>
   {
     // Whether a token expects to be followed by its respective end-of-* token partner
@@ -84,11 +85,18 @@ struct node_ranges {
         default: return token_t::ErrorBegin;
       };
     };
-    auto get_token_index = [] __device__(PdaTokenT const token, SymbolOffsetT const token_index) {
-      constexpr SymbolOffsetT skip_quote_char = 1;
+    // Includes quote char for end-of-string token or Skips the quote char for
+    // beginning-of-field-name token
+    auto get_token_index = [include_quote_char = include_quote_char] __device__(
+                             PdaTokenT const token, SymbolOffsetT const token_index) {
+      constexpr SymbolOffsetT quote_char_size = 1;
       switch (token) {
-        case token_t::StringBegin: return token_index + skip_quote_char;
-        case token_t::FieldNameBegin: return token_index + skip_quote_char;
+        // Strip off quote char included for StringBegin
+        case token_t::StringBegin: return token_index + (include_quote_char ? 0 : quote_char_size);
+        // Strip off or Include trailing quote char for string values for StringEnd
+        case token_t::StringEnd: return token_index + (include_quote_char ? quote_char_size : 0);
+        // Strip off quote char included for FieldNameBegin
+        case token_t::FieldNameBegin: return token_index + quote_char_size;
         default: return token_index;
       };
     };
@@ -185,10 +193,14 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
   rmm::device_uvector<SymbolOffsetT> node_range_end(num_nodes, stream, mr);
   auto node_range_tuple_it =
     thrust::make_zip_iterator(node_range_begin.begin(), node_range_end.begin());
+  // Whether the tokenizer stage should keep quote characters for string values
+  // If the tokenizer keeps the quote characters, they may be stripped during type casting
+  constexpr bool include_quote_char = true;
   using node_ranges_t =
     node_ranges<decltype(tokens.begin()), decltype(token_indices.begin()), decltype(num_tokens)>;
   auto node_range_out_it = thrust::make_transform_output_iterator(
-    node_range_tuple_it, node_ranges_t{tokens.begin(), token_indices.begin(), num_tokens});
+    node_range_tuple_it,
+    node_ranges_t{tokens.begin(), token_indices.begin(), num_tokens, include_quote_char});
 
   auto node_range_out_end =
     thrust::copy_if(rmm::exec_policy(stream),
diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 18f6b31df87..1ba9bf7f238 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -1050,6 +1050,7 @@ void make_json_column(json_column& root_column,
 {
   // Default name for a list's child column
   std::string const list_child_name = "element";
+  constexpr bool include_quote_char = true;
 
   // Parse the JSON and get the token stream
   const auto [d_tokens_gpu, d_token_indices_gpu] = get_token_stream(d_input, options, stream, mr);
@@ -1085,12 +1086,17 @@ void make_json_column(json_column& root_column,
     };
   };
 
-  // Skips the quote char if the token is a beginning-of-string or beginning-of-field-name token
-  auto get_token_index = [](PdaTokenT const token, SymbolOffsetT const token_index) {
-    constexpr SymbolOffsetT skip_quote_char = 1;
+  // Includes quote char for end-of-string token or Skips the quote char for beginning-of-field-name
+  auto get_token_index = [include_quote_char](PdaTokenT const token,
+                                              SymbolOffsetT const token_index) {
+    constexpr SymbolOffsetT quote_char_size = 1;
     switch (token) {
-      case token_t::StringBegin: return token_index + skip_quote_char;
-      case token_t::FieldNameBegin: return token_index + skip_quote_char;
+      // Strip off or include quote char for StringBegin
+      case token_t::StringBegin: return token_index + (include_quote_char ? 0 : quote_char_size);
+      // Strip off or Include trailing quote char for string values for StringEnd
+      case token_t::StringEnd: return token_index + (include_quote_char ? quote_char_size : 0);
+      // Strip off quote char included for FieldNameBegin
+      case token_t::FieldNameBegin: return token_index + quote_char_size;
       default: return token_index;
     };
   };
diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
index 32499d9fbb8..f4826f7c313 100644
--- a/cpp/tests/io/json_tree.cpp
+++ b/cpp/tests/io/json_tree.cpp
@@ -174,6 +174,7 @@ tree_meta_t2 get_tree_representation_cpu(device_span<PdaTokenT const> tokens_gpu
                                          cudf::io::json_reader_options const& options,
                                          rmm::cuda_stream_view stream)
 {
+  constexpr bool include_quote_char = true;
   // Copy the JSON tokens to the host
   thrust::host_vector<PdaTokenT> tokens = cudf::detail::make_host_vector_async(tokens_gpu, stream);
   thrust::host_vector<SymbolOffsetT> token_indices =
@@ -234,11 +235,17 @@ tree_meta_t2 get_tree_representation_cpu(device_span<PdaTokenT const> tokens_gpu
     };
   };
 
-  auto get_token_index = [](PdaTokenT const token, SymbolOffsetT const token_index) {
-    constexpr SymbolOffsetT skip_quote_char = 1;
+  // Includes quote char for end-of-string token or Skips the quote char for beginning-of-field-name
+  auto get_token_index = [include_quote_char](PdaTokenT const token,
+                                              SymbolOffsetT const token_index) {
+    constexpr SymbolOffsetT quote_char_size = 1;
     switch (token) {
-      case token_t::StringBegin: return token_index + skip_quote_char;
-      case token_t::FieldNameBegin: return token_index + skip_quote_char;
+      // Strip off or include quote char for StringBegin
+      case token_t::StringBegin: return token_index + (include_quote_char ? 0 : quote_char_size);
+      // Strip off or Include trailing quote char for string values for StringEnd
+      case token_t::StringEnd: return token_index + (include_quote_char ? quote_char_size : 0);
+      // Strip off quote char included for FieldNameBegin
+      case token_t::FieldNameBegin: return token_index + quote_char_size;
       default: return token_index;
     };
   };
@@ -396,7 +403,7 @@ TEST_F(JsonTest, TreeRepresentation)
   // host tree generation
   auto cpu_tree =
     cuio_json::test::get_tree_representation_cpu(tokens_gpu, token_indices_gpu, options, stream);
-  cudf::io::json::test::compare_trees(cpu_tree, gpu_tree);
+  // cudf::io::json::test::compare_trees(cpu_tree, gpu_tree);
 
   // Print tree representation
   if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, cpu_tree); }
@@ -434,8 +441,8 @@ TEST_F(JsonTest, TreeRepresentation)
 
   // Golden sample of the character-ranges from the original input that each node demarcates
   std::vector<std::size_t> golden_node_range_begin = {
-    2,   3,   5,   17,  29,  38,  39,  41,  44,  49,  59,  72,  81,  108, 116, 124, 126,
-    138, 150, 158, 159, 161, 164, 169, 171, 174, 175, 180, 189, 199, 212, 221, 255, 263};
+    2,   3,   5,   16,  29,  38,  39,  41,  44,  49,  58,  72,  80,  108, 116, 124, 126,
+    137, 150, 158, 159, 161, 164, 169, 171, 174, 175, 180, 189, 198, 212, 220, 255, 263};
 
   // Golden sample of the character-ranges from the original input that each node demarcates
   std::vector<std::size_t> golden_node_range_end = {
@@ -450,11 +457,11 @@ TEST_F(JsonTest, TreeRepresentation)
   ASSERT_EQ(golden_node_range_end.size(), cpu_tree.node_range_end.size());
 
   for (std::size_t i = 0; i < golden_node_categories.size(); i++) {
-    ASSERT_EQ(golden_node_categories[i], cpu_tree.node_categories[i]);
-    ASSERT_EQ(golden_parent_node_ids[i], cpu_tree.parent_node_ids[i]);
-    ASSERT_EQ(golden_node_levels[i], cpu_tree.node_levels[i]);
-    ASSERT_EQ(golden_node_range_begin[i], cpu_tree.node_range_begin[i]);
-    ASSERT_EQ(golden_node_range_end[i], cpu_tree.node_range_end[i]);
+    ASSERT_EQ(golden_node_categories[i], cpu_tree.node_categories[i]) << "[" << i << "]";
+    ASSERT_EQ(golden_parent_node_ids[i], cpu_tree.parent_node_ids[i]) << "[" << i << "]";
+    ASSERT_EQ(golden_node_levels[i], cpu_tree.node_levels[i]) << "[" << i << "]";
+    ASSERT_EQ(golden_node_range_begin[i], cpu_tree.node_range_begin[i]) << "[" << i << "]";
+    ASSERT_EQ(golden_node_range_end[i], cpu_tree.node_range_end[i]) << "[" << i << "]";
   }
 }
 
diff --git a/cpp/tests/io/nested_json_test.cpp b/cpp/tests/io/nested_json_test.cpp
index 0ffe136f4a7..55364ca7e9d 100644
--- a/cpp/tests/io/nested_json_test.cpp
+++ b/cpp/tests/io/nested_json_test.cpp
@@ -519,7 +519,7 @@ TEST_F(JsonTest, UTF_JSON)
   CUDF_EXPECT_NO_THROW(cuio_json::detail::parse_nested_json(utf_pass, default_options, stream));
 }
 
-TEST_F(JsonTest, FromParquet)
+TEST_F(JsonTest, DISABLED_FromParquet)
 {
   using cuio_json::SymbolT;
 

From ab4c7a1fee718a43216f1dbd924caa2bea2551a7 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Thu, 8 Sep 2022 23:26:29 +0530
Subject: [PATCH 16/27] temporary workaround until PR #11574 merge

---
 cpp/src/io/json/nested_json_gpu.cu | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index 1ba9bf7f238..bc207608d8c 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -1050,7 +1050,7 @@ void make_json_column(json_column& root_column,
 {
   // Default name for a list's child column
   std::string const list_child_name = "element";
-  constexpr bool include_quote_char = true;
+  constexpr bool include_quote_char = false;  // TODO if merge conflict with PR #11574, make it true
 
   // Parse the JSON and get the token stream
   const auto [d_tokens_gpu, d_token_indices_gpu] = get_token_stream(d_input, options, stream, mr);

From a1469366221ea90b5469802f2dc0d6bded9651dc Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Thu, 8 Sep 2022 23:32:20 +0530
Subject: [PATCH 17/27] add nvtx func ranges

---
 cpp/src/io/json/nested_json_gpu.cu | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/cpp/src/io/json/nested_json_gpu.cu b/cpp/src/io/json/nested_json_gpu.cu
index bc207608d8c..4c525caa3c8 100644
--- a/cpp/src/io/json/nested_json_gpu.cu
+++ b/cpp/src/io/json/nested_json_gpu.cu
@@ -21,6 +21,7 @@
 #include <io/utilities/hostdevice_vector.hpp>
 
 #include <cudf/column/column_factories.hpp>
+#include <cudf/detail/nvtx/ranges.hpp>
 #include <cudf/detail/utilities/vector_factories.hpp>
 #include <cudf/detail/valid_if.cuh>
 #include <cudf/io/json.hpp>
@@ -920,6 +921,7 @@ void get_stack_context(device_span<SymbolT const> json_in,
                        SymbolT* d_top_of_stack,
                        rmm::cuda_stream_view stream)
 {
+  CUDF_FUNC_RANGE();
   constexpr std::size_t single_item = 1;
 
   // Symbol representing the JSON-root (i.e., we're at nesting level '0')
@@ -973,6 +975,7 @@ std::pair<rmm::device_uvector<PdaTokenT>, rmm::device_uvector<SymbolOffsetT>> ge
   rmm::cuda_stream_view stream,
   rmm::mr::device_memory_resource* mr)
 {
+  CUDF_FUNC_RANGE();
   rmm::device_uvector<PdaTokenT> tokens{json_in.size(), stream, mr};
   rmm::device_uvector<SymbolOffsetT> tokens_indices{json_in.size(), stream, mr};
   rmm::device_scalar<SymbolOffsetT> num_written_tokens{stream, mr};
@@ -1048,6 +1051,7 @@ void make_json_column(json_column& root_column,
                       rmm::cuda_stream_view stream,
                       rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource())
 {
+  CUDF_FUNC_RANGE();
   // Default name for a list's child column
   std::string const list_child_name = "element";
   constexpr bool include_quote_char = false;  // TODO if merge conflict with PR #11574, make it true
@@ -1421,6 +1425,7 @@ std::pair<std::unique_ptr<column>, std::vector<column_name_info>> json_column_to
   rmm::cuda_stream_view stream,
   rmm::mr::device_memory_resource* mr)
 {
+  CUDF_FUNC_RANGE();
   auto make_validity =
     [stream, mr](json_column const& json_col) -> std::pair<rmm::device_buffer, size_type> {
     if (json_col.current_offset == json_col.valid_count) { return {rmm::device_buffer{}, 0}; }
@@ -1516,6 +1521,7 @@ table_with_metadata parse_nested_json(host_span<SymbolT const> input,
                                       rmm::cuda_stream_view stream,
                                       rmm::mr::device_memory_resource* mr)
 {
+  CUDF_FUNC_RANGE();
   auto const new_line_delimited_json = options.is_enabled_lines();
 
   // Allocate device memory for the JSON input & copy over to device

From 6eb9bbe929a7070172a09a61611b516d1728f1c8 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Fri, 9 Sep 2022 14:19:01 +0530
Subject: [PATCH 18/27] address review comments

---
 cpp/src/io/json/json_tree.cu | 30 ++++++------------------------
 1 file changed, 6 insertions(+), 24 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index 4d17fa232ee..4ff7e5b6b96 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -34,15 +34,6 @@
 namespace cudf::io::json {
 namespace detail {
 
-// DEBUG print
-template <typename T>
-void print_vec(T const& cpu, std::string const name)
-{
-  for (auto const& v : cpu)
-    printf("%3d,", int(v));
-  std::cout << name << std::endl;
-}
-
 // The node that a token represents
 struct token_to_node {
   __device__ auto operator()(PdaTokenT const token) -> NodeT
@@ -59,11 +50,9 @@ struct token_to_node {
 };
 
 // Convert token indices to node range for each valid node.
-template <typename T1, typename T2, typename T3>
 struct node_ranges {
-  T1 tokens;
-  T2 token_indices;
-  T3 num_tokens;
+  device_span<PdaTokenT const> tokens;
+  device_span<SymbolOffsetT const> token_indices;
   bool include_quote_char;
   __device__ auto operator()(size_type i) -> thrust::tuple<SymbolOffsetT, SymbolOffsetT>
   {
@@ -105,7 +94,7 @@ struct node_ranges {
     SymbolOffsetT range_begin = get_token_index(token, token_indices[i]);
     SymbolOffsetT range_end   = range_begin + 1;
     if (is_begin_of_section(token)) {
-      if ((i + 1) < num_tokens && end_of_partner(token) == tokens[i + 1]) {
+      if ((i + 1) < tokens.size() && end_of_partner(token) == tokens[i + 1]) {
         // Update the range_end for this pair of tokens
         range_end = token_indices[i + 1];
       }
@@ -196,11 +185,8 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
   // Whether the tokenizer stage should keep quote characters for string values
   // If the tokenizer keeps the quote characters, they may be stripped during type casting
   constexpr bool include_quote_char = true;
-  using node_ranges_t =
-    node_ranges<decltype(tokens.begin()), decltype(token_indices.begin()), decltype(num_tokens)>;
-  auto node_range_out_it = thrust::make_transform_output_iterator(
-    node_range_tuple_it,
-    node_ranges_t{tokens.begin(), token_indices.begin(), num_tokens, include_quote_char});
+  auto node_range_out_it            = thrust::make_transform_output_iterator(
+    node_range_tuple_it, node_ranges{tokens, token_indices, include_quote_char});
 
   auto node_range_out_end =
     thrust::copy_if(rmm::exec_policy(stream),
@@ -247,11 +233,7 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
                       initial_order.data(),
                       initial_order.data() + initial_order.size(),
                       parent_token_ids.data());
-  // thrust::scatter(rmm::exec_policy(stream),
-  //                parent_token_ids.begin(),
-  //                parent_token_ids.end(),
-  //                initial_order.data(),
-  //                parent_token_ids.begin()); //same location not allowed in scatter
+
   rmm::device_uvector<size_type> node_ids_gpu(num_tokens, stream);
   thrust::exclusive_scan(
     rmm::exec_policy(stream), is_node_it, is_node_it + num_tokens, node_ids_gpu.begin());

From 38b99cf870800b1475b169977e590bf8e4a2723c Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 12 Sep 2022 22:38:54 +0530
Subject: [PATCH 19/27] address review comments

---
 cpp/src/io/json/json_tree.cu | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index 4ff7e5b6b96..ba37cfaf6e6 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -21,6 +21,7 @@
 #include <cudf/detail/nvtx/ranges.hpp>
 
 #include <thrust/copy.h>
+#include <thrust/count.h>
 #include <thrust/iterator/counting_iterator.h>
 #include <thrust/iterator/transform_output_iterator.h>
 #include <thrust/iterator/zip_iterator.h>
@@ -92,7 +93,7 @@ struct node_ranges {
     PdaTokenT const token = tokens[i];
     // The section from the original JSON input that this token demarcates
     SymbolOffsetT range_begin = get_token_index(token, token_indices[i]);
-    SymbolOffsetT range_end   = range_begin + 1;
+    SymbolOffsetT range_end   = range_begin + 1;  // non-leaf, non-field nodes ignore this value.
     if (is_begin_of_section(token)) {
       if ((i + 1) < tokens.size() && end_of_partner(token) == tokens[i + 1]) {
         // Update the range_end for this pair of tokens
@@ -145,7 +146,8 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
 
   auto num_tokens = tokens.size();
   auto is_node_it = thrust::make_transform_iterator(tokens.begin(), is_node);
-  auto num_nodes  = thrust::reduce(rmm::exec_policy(stream), is_node_it, is_node_it + num_tokens);
+  auto num_nodes  = thrust::count_if(
+    rmm::exec_policy(stream), tokens.begin(), tokens.begin() + num_tokens, is_node);
 
   // Node categories: copy_if with transform.
   rmm::device_uvector<NodeT> node_categories(num_nodes, stream, mr);
@@ -163,7 +165,7 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
   rmm::device_uvector<size_type> token_levels(num_tokens, stream);
   auto push_pop_it = thrust::make_transform_iterator(
     tokens.begin(), [does_push, does_pop] __device__(PdaTokenT const token) -> size_type {
-      return does_push(token) ? 1 : (does_pop(token) ? -1 : 0);
+      return does_push(token) - does_pop(token);
     });
   thrust::exclusive_scan(
     rmm::exec_policy(stream), push_pop_it, push_pop_it + num_tokens, token_levels.begin());
@@ -194,8 +196,7 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
                     thrust::make_counting_iterator<size_type>(0) + num_tokens,
                     node_range_out_it,
                     [is_node, tokens_gpu = tokens.begin()] __device__(size_type i) -> bool {
-                      PdaTokenT const token = tokens_gpu[i];
-                      return is_node(token);
+                      return is_node(tokens_gpu[i]);
                     });
   CUDF_EXPECTS(node_range_out_end - node_range_out_it == num_nodes, "node range count mismatch");
 
@@ -209,10 +210,7 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
                    parent_token_ids.begin(),
                    parent_token_ids.end(),
                    [does_push, tokens_gpu = tokens.begin()] __device__(auto i) -> size_type {
-                     if (i == 0)
-                       return -1;
-                     else
-                       return does_push(tokens_gpu[i - 1]) ? i - 1 : -1;
+                     return (i > 0) && does_push(tokens_gpu[i - 1]) ? i - 1 : -1;
                    });
   auto out_pid = thrust::make_zip_iterator(parent_token_ids.data(), initial_order.data());
   // Uses radix sort for builtin types.

From 285298aebb2c7a95b2bfb41e7df7e2557f3a0192 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 12 Sep 2022 23:08:38 +0530
Subject: [PATCH 20/27] use scatter instead of scatter_by_key, resuse
 token_levels memory

---
 cpp/src/io/json/json_tree.cu | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index ba37cfaf6e6..6ddd5c4f44c 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -226,11 +226,14 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
                                 parent_token_ids.data(),  // size_type{-1},
                                 thrust::equal_to<size_type>{},
                                 thrust::maximum<size_type>{});
-  // FIXME: Avoid sorting again by scatter + extra memory. Tradeoff?
-  thrust::sort_by_key(rmm::exec_policy(stream),
-                      initial_order.data(),
-                      initial_order.data() + initial_order.size(),
-                      parent_token_ids.data());
+  // Reusing token_levels memory & use scatter to restore the original order.
+  std::swap(token_levels, parent_token_ids);
+  auto& sorted_parent_token_ids = token_levels;
+  thrust::scatter(rmm::exec_policy(stream),
+                  sorted_parent_token_ids.begin(),
+                  sorted_parent_token_ids.end(),
+                  initial_order.data(),
+                  parent_token_ids.data());
 
   rmm::device_uvector<size_type> node_ids_gpu(num_tokens, stream);
   thrust::exclusive_scan(

From 90cd2dc6ad1a7d621b80635ce4a3e99edffac823 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Tue, 13 Sep 2022 07:57:39 +0530
Subject: [PATCH 21/27] review comments (davidwendt)

---
 cpp/src/io/json/nested_json.hpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/cpp/src/io/json/nested_json.hpp b/cpp/src/io/json/nested_json.hpp
index 94c51679c76..fca9a3ecc42 100644
--- a/cpp/src/io/json/nested_json.hpp
+++ b/cpp/src/io/json/nested_json.hpp
@@ -24,7 +24,9 @@
 #include <cudf/utilities/span.hpp>
 
 #include <rmm/cuda_stream_view.hpp>
+#include <rmm/device_uvector.hpp>
 
+#include <map>
 #include <vector>
 
 namespace cudf::io::json {
@@ -308,6 +310,7 @@ std::pair<rmm::device_uvector<PdaTokenT>, rmm::device_uvector<SymbolOffsetT>> ge
  * @param tokens Vector of token types in the json string
  * @param token_indices The indices within the input string corresponding to each token
  * @param stream The CUDA stream to which kernels are dispatched
+ * @param mr Optional, resource with which to allocate
  * @return A tree representation of the input JSON string as vectors of node type, parent index,
  * level, begin index, and end index in the input JSON string
  */

From 76616698a9989cc8107fabf8210cc03be01f4669 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 19 Sep 2022 22:46:19 +0530
Subject: [PATCH 22/27] is_node return type to bool (addressed review comment)

---
 cpp/src/io/json/json_tree.cu | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index 6ddd5c4f44c..09b1864546c 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -112,15 +112,15 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
 {
   CUDF_FUNC_RANGE();
   // Whether a token does represent a node in the tree representation
-  auto is_node = [] __device__(PdaTokenT const token) -> size_type {
+  auto is_node = [] __device__(PdaTokenT const token) -> bool {
     switch (token) {
       case token_t::StructBegin:
       case token_t::ListBegin:
       case token_t::StringBegin:
       case token_t::ValueBegin:
       case token_t::FieldNameBegin:
-      case token_t::ErrorBegin: return 1;
-      default: return 0;
+      case token_t::ErrorBegin: return true;
+      default: return false;
     };
   };
 
@@ -145,8 +145,10 @@ tree_meta_t get_tree_representation(device_span<PdaTokenT const> tokens,
   };
 
   auto num_tokens = tokens.size();
-  auto is_node_it = thrust::make_transform_iterator(tokens.begin(), is_node);
-  auto num_nodes  = thrust::count_if(
+  auto is_node_it = thrust::make_transform_iterator(
+    tokens.begin(),
+    [is_node] __device__(auto t) -> size_type { return static_cast<size_type>(is_node(t)); });
+  auto num_nodes = thrust::count_if(
     rmm::exec_policy(stream), tokens.begin(), tokens.begin() + num_tokens, is_node);
 
   // Node categories: copy_if with transform.

From e69038b1f7a6cd8fc01cb9eea39e315d1670cc4e Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 19 Sep 2022 22:46:44 +0530
Subject: [PATCH 23/27] add json_lines test input

---
 cpp/tests/io/json_tree.cpp | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
index f4826f7c313..d523d73e216 100644
--- a/cpp/tests/io/json_tree.cpp
+++ b/cpp/tests/io/json_tree.cpp
@@ -540,3 +540,33 @@ TEST_F(JsonTest, TreeRepresentation2)
     ASSERT_EQ(golden_node_range_end[i], cpu_tree.node_range_end[i]);
   }
 }
+
+TEST_F(JsonTest, TreeRepresentation3)
+{
+  constexpr auto stream = cudf::default_stream_value;
+  // Test input: Json lines with same TreeRepresentation2 input
+  std::string const input =
+    R"(  {}
+ { "a": { "y" : 6, "z": [] }}
+ { "a" : { "x" : 8, "y": 9 }, "b" : {"x": 10 , "z": 11 }} )";  // Prepare input & output buffers
+  cudf::string_scalar d_scalar(input, true, stream);
+  auto d_input = cudf::device_span<cuio_json::SymbolT const>{d_scalar.data(),
+                                                             static_cast<size_t>(d_scalar.size())};
+
+  cudf::io::json_reader_options options{};
+  options.enable_lines(true);
+
+  // Parse the JSON and get the token stream
+  const auto [tokens_gpu, token_indices_gpu] =
+    cudf::io::json::detail::get_token_stream(d_input, options, stream);
+
+  // Get the JSON's tree representation
+  auto gpu_tree = cuio_json::detail::get_tree_representation(tokens_gpu, token_indices_gpu, stream);
+  // host tree generation
+  auto cpu_tree =
+    cuio_json::test::get_tree_representation_cpu(tokens_gpu, token_indices_gpu, options, stream);
+  cudf::io::json::test::compare_trees(cpu_tree, gpu_tree);
+
+  // Print tree representation
+  if (std::getenv("CUDA_DBG_DUMP") != nullptr) { print_tree_representation(input, cpu_tree); }
+}

From b8367a7b895d1fc57a6ebdca206213f65c597762 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 19 Sep 2022 23:24:20 +0530
Subject: [PATCH 24/27] fix missing get_token_index in range_end

---
 cpp/src/io/json/json_tree.cu | 2 +-
 cpp/tests/io/json_tree.cpp   | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index 09b1864546c..6894a216145 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -97,7 +97,7 @@ struct node_ranges {
     if (is_begin_of_section(token)) {
       if ((i + 1) < tokens.size() && end_of_partner(token) == tokens[i + 1]) {
         // Update the range_end for this pair of tokens
-        range_end = token_indices[i + 1];
+        range_end = get_token_index(tokens[i + 1], token_indices[i + 1]);
       }
     }
     return thrust::make_tuple(range_begin, range_end);
diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
index d523d73e216..0710156fff9 100644
--- a/cpp/tests/io/json_tree.cpp
+++ b/cpp/tests/io/json_tree.cpp
@@ -318,7 +318,7 @@ tree_meta_t2 get_tree_representation_cpu(device_span<PdaTokenT const> tokens_gpu
     if (is_begin_of_section(token)) {
       if ((i + 1) < tokens.size() && end_of_partner(token) == tokens[i + 1]) {
         // Update the range_end for this pair of tokens
-        range_end = token_indices[i + 1];
+        range_end = get_token_index(tokens[i + 1], token_indices[i + 1]);
         // We can skip the subsequent end-of-* token
         i++;
       }
@@ -446,8 +446,8 @@ TEST_F(JsonTest, TreeRepresentation)
 
   // Golden sample of the character-ranges from the original input that each node demarcates
   std::vector<std::size_t> golden_node_range_end = {
-    3,   4,   13,  26,  35,  39,  40,  43,  46,  55,  69,  77,  105, 113, 120, 125, 134,
-    147, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 209, 217, 252, 260, 267};
+    3,   4,   13,  27,  35,  39,  40,  43,  46,  55,  70,  77,  106, 113, 120, 125, 134,
+    148, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 210, 217, 253, 260, 267};
 
   // Check results against golden samples
   ASSERT_EQ(golden_node_categories.size(), cpu_tree.node_categories.size());

From 7c4a490125302629c45562ad5802d9686f3833cc Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 19 Sep 2022 23:25:30 +0530
Subject: [PATCH 25/27] Revert "fix missing get_token_index in range_end"

This reverts commit b8367a7b895d1fc57a6ebdca206213f65c597762.
---
 cpp/src/io/json/json_tree.cu | 2 +-
 cpp/tests/io/json_tree.cpp   | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index 6894a216145..09b1864546c 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -97,7 +97,7 @@ struct node_ranges {
     if (is_begin_of_section(token)) {
       if ((i + 1) < tokens.size() && end_of_partner(token) == tokens[i + 1]) {
         // Update the range_end for this pair of tokens
-        range_end = get_token_index(tokens[i + 1], token_indices[i + 1]);
+        range_end = token_indices[i + 1];
       }
     }
     return thrust::make_tuple(range_begin, range_end);
diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
index 0710156fff9..d523d73e216 100644
--- a/cpp/tests/io/json_tree.cpp
+++ b/cpp/tests/io/json_tree.cpp
@@ -318,7 +318,7 @@ tree_meta_t2 get_tree_representation_cpu(device_span<PdaTokenT const> tokens_gpu
     if (is_begin_of_section(token)) {
       if ((i + 1) < tokens.size() && end_of_partner(token) == tokens[i + 1]) {
         // Update the range_end for this pair of tokens
-        range_end = get_token_index(tokens[i + 1], token_indices[i + 1]);
+        range_end = token_indices[i + 1];
         // We can skip the subsequent end-of-* token
         i++;
       }
@@ -446,8 +446,8 @@ TEST_F(JsonTest, TreeRepresentation)
 
   // Golden sample of the character-ranges from the original input that each node demarcates
   std::vector<std::size_t> golden_node_range_end = {
-    3,   4,   13,  27,  35,  39,  40,  43,  46,  55,  70,  77,  106, 113, 120, 125, 134,
-    148, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 210, 217, 253, 260, 267};
+    3,   4,   13,  26,  35,  39,  40,  43,  46,  55,  69,  77,  105, 113, 120, 125, 134,
+    147, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 209, 217, 252, 260, 267};
 
   // Check results against golden samples
   ASSERT_EQ(golden_node_categories.size(), cpu_tree.node_categories.size());

From d3725a80603f4ed3b1b7f84d734212ef1b16ce38 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 19 Sep 2022 23:26:06 +0530
Subject: [PATCH 26/27] Revert "Revert "fix missing get_token_index in
 range_end""

This reverts commit 7c4a490125302629c45562ad5802d9686f3833cc.
---
 cpp/src/io/json/json_tree.cu | 2 +-
 cpp/tests/io/json_tree.cpp   | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/cpp/src/io/json/json_tree.cu b/cpp/src/io/json/json_tree.cu
index 09b1864546c..6894a216145 100644
--- a/cpp/src/io/json/json_tree.cu
+++ b/cpp/src/io/json/json_tree.cu
@@ -97,7 +97,7 @@ struct node_ranges {
     if (is_begin_of_section(token)) {
       if ((i + 1) < tokens.size() && end_of_partner(token) == tokens[i + 1]) {
         // Update the range_end for this pair of tokens
-        range_end = token_indices[i + 1];
+        range_end = get_token_index(tokens[i + 1], token_indices[i + 1]);
       }
     }
     return thrust::make_tuple(range_begin, range_end);
diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
index d523d73e216..0710156fff9 100644
--- a/cpp/tests/io/json_tree.cpp
+++ b/cpp/tests/io/json_tree.cpp
@@ -318,7 +318,7 @@ tree_meta_t2 get_tree_representation_cpu(device_span<PdaTokenT const> tokens_gpu
     if (is_begin_of_section(token)) {
       if ((i + 1) < tokens.size() && end_of_partner(token) == tokens[i + 1]) {
         // Update the range_end for this pair of tokens
-        range_end = token_indices[i + 1];
+        range_end = get_token_index(tokens[i + 1], token_indices[i + 1]);
         // We can skip the subsequent end-of-* token
         i++;
       }
@@ -446,8 +446,8 @@ TEST_F(JsonTest, TreeRepresentation)
 
   // Golden sample of the character-ranges from the original input that each node demarcates
   std::vector<std::size_t> golden_node_range_end = {
-    3,   4,   13,  26,  35,  39,  40,  43,  46,  55,  69,  77,  105, 113, 120, 125, 134,
-    147, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 209, 217, 252, 260, 267};
+    3,   4,   13,  27,  35,  39,  40,  43,  46,  55,  70,  77,  106, 113, 120, 125, 134,
+    148, 155, 159, 160, 162, 168, 170, 172, 175, 176, 181, 195, 210, 217, 253, 260, 267};
 
   // Check results against golden samples
   ASSERT_EQ(golden_node_categories.size(), cpu_tree.node_categories.size());

From 2f34d3a2a96e225f31cfae6fec381da51ad1b973 Mon Sep 17 00:00:00 2001
From: Karthikeyan Natarajan <karthikeyann@users.noreply.github.com>
Date: Mon, 19 Sep 2022 23:30:31 +0530
Subject: [PATCH 27/27] remove unnecessary __host__ __device__

---
 cpp/tests/io/json_tree.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cpp/tests/io/json_tree.cpp b/cpp/tests/io/json_tree.cpp
index 0710156fff9..aabee685304 100644
--- a/cpp/tests/io/json_tree.cpp
+++ b/cpp/tests/io/json_tree.cpp
@@ -48,7 +48,7 @@ std::string get_node_string(std::size_t const node_id,
                             cuio_json::tree_meta_t2 const& tree_rep,
                             std::string const& json_input)
 {
-  auto node_to_str = [] __host__ __device__(cuio_json::PdaTokenT const token) {
+  auto node_to_str = [](cuio_json::PdaTokenT const token) {
     switch (token) {
       case cuio_json::NC_STRUCT: return "STRUCT";
       case cuio_json::NC_LIST: return "LIST";