add is_valid_integer format check API #7094

chenrui17 · 2021-01-07T12:39:50Z

@revans2 this is my first commit toward closing #7080. So far I have only finished the allow_decimal part; the overflow checks are not done yet, so please give me some suggestions on that.
In addition, this feature improves my Spark query performance by 30%~50%, so any ideas on how to check for overflow would be very welcome. Thank you very much.

GPUtester · 2021-01-07T12:39:52Z

Can one of the admins verify this patch?

davidwendt

There are 8 types of integers.

cudf/cpp/include/cudf/types.hpp

Lines 198 to 205 in f768da7

    
    INT8,                    ///< 1 byte signed integer
    INT16,                   ///< 2 byte signed integer
    INT32,                   ///< 4 byte signed integer
    INT64,                   ///< 8 byte signed integer
    UINT8,                   ///< 1 byte unsigned integer
    UINT16,                  ///< 2 byte unsigned integer
    UINT32,                  ///< 4 byte unsigned integer
    UINT64,                  ///< 8 byte unsigned integer

Both the to_integers() and the to_floats() allow you to specify the output data type so it seems we should take the data type as an input parameter here and use that when checking for overflow.

davidwendt · 2021-01-07T13:43:05Z

cpp/include/cudf/strings/char_types/char_types.hpp

+  strings_column_view const& strings,
+  bool allow_decimal,
+  rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());
+)


This should not go in this header file. I would suggest creating a new header file for this function.
/cpp/include/cudf/strings/convert/is_valid_integer.hpp

davidwendt · 2021-01-07T13:44:29Z

cpp/include/cudf/strings/char_types/char_types.hpp

+ * @param allow_decimal identification whether the format is decimal or not
+ * @param mr Device memory resource used to allocate the returned column's device memory.
+ * @return New column of boolean results for each string.
+ */


This doxygen does not match this function. The doxygen in the string.cuh file looks correct and should be moved here instead. You can use the @copydoc in the .cu to refer to the doxygen here.
https://github.com/rapidsai/cudf/blob/branch-0.18/cpp/docs/DOCUMENTATION.md#copydoc

davidwendt · 2021-01-07T13:49:30Z

cpp/src/strings/char_types/char_types.cu

@@ -213,6 +213,35 @@ std::unique_ptr<column> is_integer(
  return results;
 }

+std::unique_ptr<column> is_valid_integer(


This function should be moved into a new source file instead
/cpp/src/strings/convert/is_valid_integer.cu

This source file should also contain all the decimal integer character checking as well as the overflow checking.

davidwendt · 2021-01-07T13:54:03Z

cpp/include/cudf/strings/string.cuh

+ * @param allow_decimal Decimal format or not
+ * @return true if string has valid integer characters
+ */
+__device__ bool is_valid_integer(string_view const& d_str ,bool allow_decimal)


This function should be moved into the is_valid_integer.cu source file as mentioned in another review comment. It should also be updated to check for overflow as well.

jrhemstad · 2021-01-07T14:48:03Z

cpp/include/cudf/strings/char_types/char_types.hpp

+ */
+std::unique_ptr<column> is_valid_integer(
+  strings_column_view const& strings,
+  bool allow_decimal,


This parameter doesn't seem like it belongs. Fixed-point decimal numbers aren't integers. This seems like it should be is_valid_fixed_point or something similar.

I agree. I've been looking up is_numeric(), is_decimal(), etc type APIs in other languages and cannot find one that matches the requirements for #7080 . Perhaps is_valid_numeric() with allow-decimal and allow-exponent flags. Perhaps this is too Spark specific.

The name does not matter too much to me. is_valid_numeric is OK, but I think I like is_valid_fixed_point best because it does appear to match with a fixed point number.

Instead of all of these type specific APIs, couldn't we just have:

unique_ptr<column> is_valid_element(strings_column_view strings, data_type type);

i.e., verifies that the string elements in strings can be parsed as the specified type?

is_valid_element is fine with me. The main issue would be decimal vs not decimal for an integer. I think we could address that with some flags passed to is_valid_element. I also assume that this PR would only be for integer types and other types would be added later?

The main issue would be decimal vs not decimal for an integer

I don't understand the problem. This is what I was thinking:

If you want to know if "12345" is a valid integer you'd do:

is_valid_element("12345", data_type{INT32})

If you want to know if it's a valid fixed-point decimal, you do:

is_valid_element("12345", data_type{decimal64, scale, etc.})

And yes, is_valid_element can be a different PR.

revans2

@chenrui17 thanks for jumping on this.

revans2 · 2021-01-07T15:01:26Z

cpp/include/cudf/strings/char_types/char_types.hpp

+ */
+std::unique_ptr<column> is_valid_integer(
+  strings_column_view const& strings,
+  bool allow_decimal,


The name does not matter too much to me. is_valid_numeric is OK, but I think I like is_valid_fixed_point best because it does appear to match with a fixed point number.

revans2 · 2021-01-07T15:11:13Z

cpp/include/cudf/strings/string.cuh

+__device__ bool is_valid_integer(string_view const& d_str ,bool allow_decimal)
+{
+  bool decimal_found  = false;
+  if (!allow_decimal) return is_integer(d_str);


nit: I think we are not going to be able to use is_integer here in the long term. Especially if we want to support range checking. I think the spark code that does this is a decent starting point.

https://github.com/apache/spark/blob/7b06acc28b5c37da6c48bc44c3d921309d4ad3a8/common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java#L1120-L1194

You would want to pass in a min_value that corresponds to the type you are checking. You also would never return the resulting value in any way, just the true/false.
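
A minimal host-side sketch of the range check described above, assuming plain C++ rather than a CUDA device function. The name `fits_signed_range` is hypothetical and is not part of cudf or Spark. Like the linked Spark code, it accumulates the digits as a negative number so that the type's minimum stays representable, and it returns only true/false. Unlike the Java original, it tests the bound before subtracting, because signed overflow is undefined behavior in C++.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string_view>

// Hypothetical helper (not a cudf API): true when `s` is an optionally signed
// run of decimal digits whose value fits the signed type whose minimum is
// `min_value`, i.e. the range [min_value, -(min_value + 1)].
bool fits_signed_range(std::string_view s, std::int64_t min_value)
{
  if (s.empty()) return false;
  std::size_t pos     = 0;
  bool const negative = (s[0] == '-');
  if (s[0] == '-' || s[0] == '+') ++pos;
  if (pos == s.size()) return false;  // a lone sign is not a number

  constexpr std::int64_t radix  = 10;
  std::int64_t const stop_value = min_value / radix;  // truncates toward zero
  std::int64_t result           = 0;                  // always <= 0

  for (; pos < s.size(); ++pos) {
    char const c = s[pos];
    if (c < '0' || c > '9') return false;
    std::int64_t const digit = c - '0';
    // result * radix would already be below min_value.
    if (result < stop_value) return false;
    std::int64_t const scaled = result * radix;  // safe after the check above
    // scaled - digit would be below min_value; compare without subtracting.
    if (scaled < min_value + digit) return false;
    result = scaled - digit;
  }
  // A non-negative input equal in magnitude to min_value has no positive counterpart.
  return negative || result != min_value;
}
```

For INT8 one would pass `std::numeric_limits<std::int8_t>::min()` as `min_value`: `"127"` and `"-128"` are accepted, while `"128"` and `"-129"` are rejected.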

revans2 · 2021-01-08T16:52:29Z

cpp/include/cudf/strings/convert/is_valid_fixed_point.hpp

@@ -0,0 +1,47 @@
+/*
+ * Copyright (c) 2019, NVIDIA CORPORATION.


copyright needs to be 2021, and I am not sure you want to give the copyright to NVIDIA for the code that you have written. You should probably include the copyright for the company you work for instead (NOTE: I am not a lawyer)

I changed this to my company's copyright. I don't know if this is appropriate. Are all the other contributors from NVIDIA? What did they do?

revans2 · 2021-01-08T16:53:35Z

cpp/src/strings/convert/is_valid_fixed_point.cu

@@ -0,0 +1,173 @@
+/*
+ * Copyright (c) 2019-2020, NVIDIA CORPORATION.


Same copyright issue. Should be 2021 and probably want to not have it be for NVIDIA unless you work for NVIDIA (again not a lawyer)

Also, this is a new file so it does not need the 2019- part.

revans2 · 2021-01-08T16:54:02Z

cpp/src/strings/convert/is_valid_fixed_point.cu

+namespace strings {
+
+/**
+   * Parses this UTF8String(trimmed if needed) to INT8/16/32/64...


nit: indentation in the doxygen comments appears to be off.

revans2 · 2021-01-08T16:58:04Z

cpp/src/strings/convert/is_valid_fixed_point.cu

+  int offset = 0;
+  size_type bytes = d_str.size_bytes();
+  const char* data    = d_str.data();
+  while (offset < bytes && data[offset] == ' ') ++offset;


Do we want to support stripping leading white space? I understand why we might want to do it, but if we do then we need to have it documented for this function and documented for the public API.

The cudf::strings::to_integers() does not ignore the leading whitespace.
If this is used to validate a strings column before calling to_integers() then this should not ignore whitespace either.

revans2 · 2021-01-08T16:59:12Z

cpp/src/strings/convert/is_valid_fixed_point.cu

+  if (offset == bytes)  return false;
+
+  int end = bytes - 1;
+  while (end > offset && data[end] == ' ') --end;


Same comment as above. Do we want to strip trailing white space (Specifically only the space character?) If so that is fine, but we need to be sure that it is documented clearly.

revans2 · 2021-01-08T17:05:07Z

cpp/src/strings/convert/is_valid_fixed_point.cu

+    }
+
+    // We are going to process the new digit and accumulate the result. However, before doing
+    // this, if the result is already smaller than the stopValue(Long.MIN_VALUE / radix), then


nit: Long.MIN_VALUE is no longer correct for this comment.

revans2 · 2021-01-08T17:05:44Z

cpp/src/strings/convert/is_valid_fixed_point.cu

+
+    result = result * radix - digit;
+
+    // Since the previous result is less than or equal to stopValue(Long.MIN_VALUE / radix), we


nit: Same here Long.MIN_VALUE should probably be updated in the comment.

jrhemstad · 2021-01-08T20:37:58Z

cpp/include/cudf/strings/convert/is_valid_fixed_point.hpp

+ */
+std::unique_ptr<column> is_valid_fixed_point(
+  strings_column_view const& strings,
+  bool allow_decimal,


I don't understand the purpose of this parameter anymore.

The difference is [0-9] vs [0-9](\.[0-9])?. If allow_decimal is true, then we allow the number to be followed by a decimal point and more digits. If it is false, we do not.
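
A small host-side sketch of that format difference, under stated assumptions: the helper name `has_integer_format` is hypothetical, an optional leading sign is accepted, at least one digit must precede the decimal point, and zero or more digits may follow it. It checks only the format and does no overflow checking.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not a cudf API): true when `s` is an optional sign and
// one or more digits, optionally followed (only when allow_decimal is true)
// by '.' and zero or more digits.
bool has_integer_format(std::string_view s, bool allow_decimal)
{
  std::size_t pos = 0;
  if (!s.empty() && (s[0] == '+' || s[0] == '-')) ++pos;

  std::size_t const first_digit = pos;
  while (pos < s.size() && s[pos] >= '0' && s[pos] <= '9') ++pos;
  if (pos == first_digit) return false;  // no digits before any '.'
  if (pos == s.size()) return true;      // digits only

  // Anything other than an allowed decimal point ends the match.
  if (!allow_decimal || s[pos] != '.') return false;
  ++pos;
  while (pos < s.size() && s[pos] >= '0' && s[pos] <= '9') ++pos;
  return pos == s.size();  // the fractional part must be digits only
}
```

With this sketch, `"9.8"` passes only when `allow_decimal` is true, while `"17+2"`, `"+-14"`, and `"1e10"` fail either way.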

I understood that much, I just don't understand why this is a desired option. In my mind this would be like having an allow_decimal flag to the is_floating_point API---both strike me as odd.

bool flags are always a code smell for me as they are warning sign of a single responsibility violation.

This is to match what Spark and Hive do when casting strings to int values. Floating point allows for NaN, -Inf, etc., which I guess we could special-case yet again, but then the overflow would not be properly checked, which is another feature that we are looking for.

This is to match what Spark and Hive do when casting strings to int values.

What behavior are you looking for? From my perspective, if allow_decimal is False, wouldn't that effectively make this function is_valid_integer?

So here we need 2 functions instead of 1?

Yes, I think is_valid_integer should be distinct from is_valid_fixed_point. And ultimately we should just have is_valid_element.

Based on what you have discussed, I think we should do this:
@jrhemstad
This PR is for

unique_ptr<column> is_valid_integer(strings_column_view strings, data_type type);

and I will file a new PR for

unique_ptr<column> is_valid_fixed_point(strings_column_view strings, data_type type);

@revans2
The APIs above take a data_type parameter, which will help us check for overflow.

As for the allow_decimal requirement you mentioned in #7080, we can evaluate allow_decimal in spark-rapids: if allow_decimal is true, we call is_valid_fixed_point in the cudf C++ module through the JNI layer; if allow_decimal is false, we call is_valid_integer instead. Is that right? This way, the allow_decimal parameter never enters cudf.

Please give me some advice.

For CUDF JNI I would rather mirror the underlying C++ API as much as possible. The Spark plugin can decide which CUDF API to call instead of pushing that into cudf.

@jrhemstad I'm not sure is_valid_element will do what we want. We are asking for two separate things, and perhaps it should be two separate API calls then. First we want to check if the format of the string matches what we expect, next we want to check if when we parse the string as an integer if the number will overflow. In the current set of APIs is_valid_fixed_point and is_valid_integer the name of the function indicates the format of the string while the type we pass in, indicates the overflow value. If we just have is_valid_element then we would need that same split in roles. Does the format match fixed point and we don't overflow parsing to a byte? If that is too much for a single API to do, then perhaps what we want is to add is_fixed_point along with is_integer and is_float, and also add an API to check if a strings column would overflow when parsing to an integer of type X.

davidwendt · 2021-01-11T16:54:47Z

cpp/src/strings/convert/is_valid_fixed_point.cu

+
+  // ready a min_value corresponds to the input type in order to check overflow
+  long d_min_value = 0;
+  switch (input_type.id()) {


This should be coded using the type_dispatcher instead of a switch statement. Also this should use the std::numeric_limits class instead of hardcoded values.
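
A standalone sketch of the pattern being requested, assuming nothing from cudf: `int_type_id`, `min_value_fn`, and `dispatch_int_type` are hypothetical stand-ins written for illustration. In libcudf the id-to-type mapping would come from the library's own type_dispatcher, so the calling code contains no switch and no hardcoded limits.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Stand-in for a runtime type id, limited to the signed integer types.
enum class int_type_id { INT8, INT16, INT32, INT64 };

// Functor invoked with the C++ type that corresponds to the runtime id.
struct min_value_fn {
  template <typename T>
  std::int64_t operator()() const
  {
    return static_cast<std::int64_t>(std::numeric_limits<T>::min());
  }
};

// Minimal dispatcher: the only place where ids are mapped to types.
template <typename Functor>
auto dispatch_int_type(int_type_id id, Functor f)
{
  switch (id) {
    case int_type_id::INT8: return f.template operator()<std::int8_t>();
    case int_type_id::INT16: return f.template operator()<std::int16_t>();
    case int_type_id::INT32: return f.template operator()<std::int32_t>();
    case int_type_id::INT64: return f.template operator()<std::int64_t>();
  }
  throw std::invalid_argument("unsupported type id");
}
```

Calling `dispatch_int_type(int_type_id::INT8, min_value_fn{})` yields -128 without the caller ever spelling out a numeric limit.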

davidwendt · 2021-01-11T17:01:24Z

cpp/src/strings/convert/is_valid_fixed_point.cu

+  // This is the case when we've encountered a decimal separator. The fractional
+  // part will not change the number, but we will verify that the fractional part
+  // is well formed.
+  while (offset <= end) {


Why are we doing this?

To verify that the fractional part is well formed. If allow_decimal is true and the string contains the character '.', the previous while loop breaks at that character, so here we need to verify the fractional part after the decimal point.

davidwendt · 2021-01-11T17:01:35Z

cpp/src/strings/convert/is_valid_fixed_point.cu

+    ++offset;
+    // We allow decimals and will return a truncated integral in that case.
+    // Therefore we won't throw an exception here (checking the fractional
+    // part happens below.


Suggested change

- // part happens below.
+ // part happens below).

davidwendt · 2021-01-11T17:04:24Z

cpp/src/strings/convert/is_valid_fixed_point.cu

@@ -0,0 +1,173 @@
+/*
+ * Copyright (c) 2019-2020, NVIDIA CORPORATION.


Also, this is a new file so it does not need the 2019- part.

davidwendt · 2021-01-11T17:04:29Z

cpp/include/cudf/strings/convert/is_valid_fixed_point.hpp

+/**
+ * @brief Returns a boolean column identifying strings in which all
+ * characters are valid for conversion to integers.
+ * 


This description should include how the values after the decimal point (if allowed) must all be valid decimal characters.
Also, the description should mention how the overflow depends on the input_type.

davidwendt · 2021-01-11T17:08:28Z

cpp/src/strings/convert/is_valid_fixed_point.cu

+  int offset = 0;
+  size_type bytes = d_str.size_bytes();
+  const char* data    = d_str.data();
+  while (offset < bytes && data[offset] == ' ') ++offset;


The cudf::strings::to_integers() does not ignore the leading whitespace.
If this is used to validate a strings column before calling to_integers() then this should not ignore whitespace either.

davidwendt · 2021-01-12T13:23:13Z

cpp/tests/strings/chars_types_tests.cpp

@@ -14,10 +14,14 @@
 * limitations under the License.
 */

+#include <cudf/strings/convert/is_valid_element.hpp>


I would suggest moving these to a new test source file /cpp/tests/strings/valid_element.cpp

davidwendt · 2021-01-12T13:27:40Z

cpp/src/strings/convert/is_valid_element.cu

+namespace strings {
+
+/**
+ * Check whether the UTF8String is valid when convert data from string to all kinds of integers,


Suggested change

- * Check whether the UTF8String is valid when convert data from string to all kinds of integers,
+ * Check whether the string is valid when convert string to signed integers

davidwendt · 2021-01-12T13:31:12Z

cpp/src/strings/convert/is_valid_element.cu

+ * like INT8/16/32/64. for example, if allow_decimal is true, this will return `true, true` when input string 
+ * is `1.23, 123`, or this function will return `false, true`, it means firt element is invalid, 
+ * while second data is valid. 


Suggested change

- * like INT8/16/32/64. for example, if allow_decimal is true, this will return `true, true` when input string
- * is `1.23, 123`, or this function will return `false, true`, it means firt element is invalid,
- * while second data is valid.
+ * like INT8/16/32/64. For example, if `allow_decimal` is true, then strings `['1.23', '123']` will return `[true, true]`.
+ * If `allow_decimal` is false, then this function will return `[false, true]`.

davidwendt · 2021-01-12T13:37:26Z

cpp/src/strings/convert/is_valid_element.cu

+#include <strings/utilities.cuh>
+#include <strings/utilities.hpp>
+#include <strings/utilities.cuh>


Suggested change

- #include <strings/utilities.cuh>
- #include <strings/utilities.hpp>
- #include <strings/utilities.cuh>

These are not needed.

davidwendt · 2021-01-12T13:37:54Z

cpp/src/strings/convert/is_valid_element.cu

+#include <cudf/column/column_factories.hpp>
+#include <cudf/detail/null_mask.hpp>
+#include <cudf/detail/nvtx/ranges.hpp>
+#include <cudf/strings/detail/utilities.hpp>


Suggested change

- #include <cudf/strings/detail/utilities.hpp>

This header is not required.

davidwendt · 2021-01-12T13:50:08Z

cpp/src/strings/convert/is_valid_element.cu

+    }
+
+    // We are going to process the new digit and accumulate the result. However, 
+    // before doing this, if the result is already smaller than the stopValue which is


Suggested change

- // before doing this, if the result is already smaller than the stopValue which is
+ // before doing this, if the result is already smaller than the stop_value which is

davidwendt · 2021-01-12T13:50:55Z

cpp/src/strings/convert/is_valid_element.cu

+    // We are going to process the new digit and accumulate the result. However, 
+    // before doing this, if the result is already smaller than the stopValue which is
+    // (std::numeric_limits<data_type>::min() / radix), then result * 10 will definitely 
+    // be smaller than minValue, and we can stop.


Suggested change

- // be smaller than minValue, and we can stop.
+ // be smaller than the min value, and we can stop.

davidwendt · 2021-01-12T13:53:22Z

cpp/src/strings/convert/is_valid_element.cu

+ * @brief The dispatch functions for calculate the min value of input data type
+ * to check overflow.


Suggested change

- * @brief The dispatch functions for calculate the min value of input data type
- * to check overflow.
+ * @brief The dispatch functions returns the min value of the input data type used
+ * for checking overflow.

davidwendt · 2021-01-12T13:53:33Z

cpp/src/strings/convert/is_valid_element.cu

+ * @brief The dispatch functions for calculate the min value of input data type
+ * to check overflow.
+ *
+ * The output is the min value of spicified type.


Suggested change

- * The output is the min value of spicified type.
+ * The output is the min value of specified type.

davidwendt · 2021-01-12T13:57:51Z

cpp/src/strings/convert/is_valid_element.cu

+namespace detail {
+namespace {


Move these two lines up so that it includes the is_valid_element internal device function.

harrism · 2021-01-13T22:38:03Z

Reviewers, please make sure the title and description of this PR are made more descriptive before merging.

davidwendt · 2021-01-14T16:20:33Z

cpp/tests/strings/valid_element.cpp

+{
+  // allow_decimal = true
+  cudf::test::strings_column_wrapper strings1(
+    {"+175", "-34", "9.8", "17+2", "+-14", "1234567890", "67de", "", "1e10", "-", "++", "", "21474836482222"});


Should we check for a too-large negative number as well?

davidwendt · 2021-01-14T16:21:06Z

cpp/src/strings/convert/is_valid_element.cu

+namespace detail {
+namespace {
+/**
+ * Check whether the string is valid when convert string to signed integers,


Missing an @brief tag.

davidwendt · 2021-01-14T16:28:14Z

cpp/include/cudf/strings/convert/is_valid_element.hpp

+ * input_type is used to check whether the data overflows, for example, if input_type is 
+ * `int8_t` and input string data is `128`, then it will return false ,because it out of ranges
+ * [-128, 127] and overflows.


Suggested change

- * input_type is used to check whether the data overflows, for example, if input_type is
- * `int8_t` and input string data is `128`, then it will return false ,because it out of ranges
- * [-128, 127] and overflows.
+ * `input_type` is used to check whether the data causes an integer overflow. For example, if `input_type` is
+ * `INT8` and the input `strings[i]` is "128", then the `output[i]` will be `false` since resulting integer
+ * would be out of range `[-128, 127]` for `int8_t`.

harrism · 2021-02-02T22:45:16Z

Let's move this to 0.19 since it needs significant work.

harrism · 2021-02-02T22:46:27Z

@chenrui17 you will need to click "edit" at the top and switch the target branch to branch-0.19, and then merge branch-0.19 into your local PR branch before proceeding. Thanks!

ttnghia · 2021-03-03T15:17:50Z

How is this going? It was suggested that I take this over if necessary.

davidwendt · 2021-03-03T15:32:11Z

So is_fixed_point() is available now if that helps this
https://docs.rapids.ai/api/libcudf/nightly/group__strings__convert.html#ga492da4125dd774bf90c458840779b746
Overflow is checked but the decimal point is always allowed.

ttnghia · 2021-03-10T20:46:33Z

So is_fixed_point() is available now if that helps this
https://docs.rapids.ai/api/libcudf/nightly/group__strings__convert.html#ga492da4125dd774bf90c458840779b746
Overflow is checked but the decimal point is always allowed.

@davidwendt is_fixed_point is fantastic. Since that function does a range check, I think we should not add the new function is_valid_element; instead, we should modify strings::is_integer to do the same thing: check both the pattern and the range. By doing so, we can maintain consistency between these similar functions.

davidwendt · 2021-03-10T21:06:15Z

So is_fixed_point() is available now if that helps this
https://docs.rapids.ai/api/libcudf/nightly/group__strings__convert.html#ga492da4125dd774bf90c458840779b746
Overflow is checked but the decimal point is always allowed.

@davidwendt is_fixed_point is fantastic. Since that function does a range check, I think we should not add the new function is_valid_element; instead, we should modify strings::is_integer to do the same thing: check both the pattern and the range. By doing so, we can maintain consistency between these similar functions.

Sorry, I don't understand why we need to change cudf::strings::is_integer().
Can you just call cudf::strings::is_fixed_point() and then we don't need this PR?

ttnghia · 2021-03-10T21:08:42Z

Sorry, I don't understand why we need to change cudf::strings::is_integer().
Can you just call cudf::strings::is_fixed_point() and then we don't need this PR?

As you said, decimal point is always allowed. is_integer should not allow decimal point.

davidwendt · 2021-03-10T21:13:27Z

Sorry, I don't understand why we need to change cudf::strings::is_integer().
Can you just call cudf::strings::is_fixed_point() and then we don't need this PR?

As you said, decimal point is always allowed. is_integer should not allow decimal point.

cudf::strings::is_integer() does not allow a decimal point.
I think you are asking for cudf::strings::is_integer() to check for overflow?

ttnghia · 2021-03-10T21:14:20Z

Sorry, I don't understand why we need to change cudf::strings::is_integer().
Can you just call cudf::strings::is_fixed_point() and then we don't need this PR?

As you said, decimal point is always allowed. is_integer should not allow decimal point.

cudf::strings::is_integer() does not allow a decimal point.
I think you are asking for cudf::strings::is_integer() to check for overflow?

Correct. Please also see here: #7557

ttnghia · 2021-03-18T20:48:17Z

Hi @chenrui17.
Sorry, but I will not adopt this PR as it is very outdated. I have pushed a new PR addressing the issue of converting strings to integers with a bounds check (#7642). My code only addresses integer types without decimals. For decimal types, there is separate work (is_fixed_point()).

@ttnghia

…eger conversion (#7642)

This PR addresses #5110, #7080, and rework #7094. It adds the function `cudf::strings::is_integer` that can check if strings can be correctly converted into integer values. Underflow and overflow are also taken into account. Note that this `cudf::strings::is_integer` is different from the existing `cudf::strings::string::is_integer`, which only checks for pattern and does not care about under/overflow.

Examples:
```
s = { "eee", "-200", "-100", "127", "128", "1.5", NULL}

is_integer(s, INT8) = { 0, 0, 1, 1, 0, 0, NULL}
is_integer(s, INT32) = { 0, 1, 1, 1, 1, 0, NULL}
```

Authors:
  - Nghia Truong (@ttnghia)

Approvers:
  - David (@davidwendt)
  - Jake Hemstad (@jrhemstad)
  - Mark Harris (@harrism)

URL: #7642
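
The example table in the commit message can be reproduced on the host with a short sketch. This is an illustration only, not the libcudf implementation: `is_integer_host` is a hypothetical name, rows are modeled as `std::optional<std::string>` so that a null stays null, and the parsing is delegated to `std::from_chars` from C++17. One known difference from the formats discussed in this thread is that `std::from_chars` rejects a leading `+`.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical host-side analogue of a column-level check: a null input row
// stays null; every other row is true only if the whole string parses as T
// without going out of range.
template <typename T>
std::vector<std::optional<bool>> is_integer_host(
  std::vector<std::optional<std::string>> const& rows)
{
  std::vector<std::optional<bool>> out;
  out.reserve(rows.size());
  for (auto const& row : rows) {
    if (!row) {
      out.push_back(std::nullopt);
      continue;
    }
    T value{};
    char const* const first = row->data();
    char const* const last  = first + row->size();
    // ec reports invalid input or out-of-range; ptr shows how far parsing got.
    auto const [ptr, ec] = std::from_chars(first, last, value);
    out.push_back(ec == std::errc{} && ptr == last);
  }
  return out;
}
```

Calling `is_integer_host<std::int8_t>` and `is_integer_host<std::int32_t>` on the seven rows from the example gives the same two result rows as the commit message.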

chenrui17 requested a review from a team as a code owner January 7, 2021 12:39

chenrui17 requested review from vuule and davidwendt January 7, 2021 12:39

davidwendt requested changes Jan 7, 2021

jrhemstad requested changes Jan 7, 2021

revans2 reviewed Jan 7, 2021

revans2 assigned chenrui17 Jan 7, 2021

revans2 added the labels 2 - In Progress (currently a work in progress); improvement (improvement / enhancement to an existing function); libcudf (affects libcudf C++/CUDA code); non-breaking (non-breaking change); Spark (functionality that helps Spark RAPIDS); strings (strings issues, C++ and Python) Jan 7, 2021

chenrui17 force-pushed the branch-0.18 branch 4 times, most recently from 53cf088 to 537f92c Compare January 8, 2021 10:24

add is_valid_fixed_point function

e00b126

chenrui17 force-pushed the branch-0.18 branch from 537f92c to e00b126 Compare January 8, 2021 10:30

chenrui17 requested review from revans2 and davidwendt January 8, 2021 11:58

revans2 reviewed Jan 8, 2021

jrhemstad reviewed Jan 8, 2021

davidwendt requested changes Jan 11, 2021

review changes

87e8430

chenrui17 requested a review from revans2 January 12, 2021 10:59

chenrui17 requested review from kkraus14, davidwendt and jrhemstad January 12, 2021 10:59

davidwendt requested changes Jan 12, 2021

review changes again

a0045bd

davidwendt requested changes Jan 14, 2021

revans2 mentioned this pull request Feb 2, 2021

[FEA] string conversion to/from decimal values #7285

Closed

ttnghia mentioned this pull request Mar 10, 2021

[FEA] Refactor string conversion check #7557

Closed

ttnghia mentioned this pull request Mar 18, 2021

Add is_integer API that can check for the validity of a string-to-integer conversion #7642

Merged

ttnghia closed this Mar 25, 2021
