This repository has been archived by the owner on Apr 10, 2024. It is now read-only.

Adding operator+[=] and operator/[=] for FloatingArray and IntegerArray. #58

Closed
wants to merge 7 commits

Conversation

joshuastorck

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master | flake8 --diff
  • whatsnew entry

This includes a design change that obviates the need for an ArrayView.
Instead, every array has an internal offset. Shallow copy is achieved by the
copy constructor, though the current set of copy constructors doesn't yet
support a slice. Deep copy is still achieved through the Copy virtual
function. (A minimal sketch of this layout follows the list below.)

More detailed explanation of the changes:

  • Adding copy/move constructors for {Floating,Integer,Numeric}Array
  • Adding various methods for marking/getting nulls (valid bits) in
    integer arrays
  • Changing data() and mutable_data() in NumericArray so that they
    return a pointer that starts at the array's offset
  • Addition of Addable/Divisable classes (similar to Boost operators)
    for easy support of operator+ and operator/
  • Unit test scaffolding for testing permutations of left/right hand
    side types on arithmetic operators
  • Implementing IntegerArray::operator/, IntegerArray::operator+=,
    FloatingArray::operator/=, FloatingArray::operator+=
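
For orientation, a minimal sketch of the offset-based layout described above (illustrative names only, not the PR's actual classes):

#include <cstdint>
#include <memory>
#include <vector>

// The buffer is shared; a shallow copy shares it and copies offset/length,
// while a deep copy (the PR's virtual Copy) would clone the buffer itself.
struct Buffer {
  std::vector<double> data;
};

class ArraySketch {
 public:
  ArraySketch(std::shared_ptr<Buffer> buffer, int64_t offset, int64_t length)
      : buffer_(std::move(buffer)), offset_(offset), length_(length) {}

  // Shallow copy via the copy constructor, as in the PR description.
  ArraySketch(const ArraySketch&) = default;

  // data() returns a pointer that starts at the array's offset.
  const double* data() const { return buffer_->data.data() + offset_; }
  int64_t length() const { return length_; }

 private:
  std::shared_ptr<Buffer> buffer_;
  int64_t offset_;
  int64_t length_;
};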

wesm and others added 6 commits October 12, 2016 15:07
…nippets and cmake modules

* Build native.pyx stub and libpandas.so
* In place cmake build + rpath on linux, at least
* Library scaffolding and build infrastructure for libpandas native core
  experiments to evaluate replacing Series and DataFrame internals and custom
  dtypes with a consistent set of wrapper types, array, and table data
  structures. Incorporates several bits of BSD / MIT / Apache-licensed code and
  various copyright notices have been added to LICENSES.
* Do not include pandas/
* Array API stubs
* Import bunch of utility code from Arrow, refactoring
Please leave code comments on Gerrit:
https://review.gerrithub.io/#/c/298035/

Author: Wes McKinney <[email protected]>

Closes wesm#40 from wesm/array-views and squashes the following commits:

27a194b [Wes McKinney] Change merge-pr script to target pandas-dev/pandas2
ea2017e [Wes McKinney] * Prototype array view interface * Buffer slicing and copying * Test ArrayView constructors and copy/move assignment * Test ArrayView::Slice, EnsureMutable, and some array attrs

Change-Id: I1f942ec1160590c33b68a12eb88fe923dd135911
Help keep things clean. This style is based on LLVM and Google C++
styles -- we can always tweak a bit later.

Author: Wes McKinney <[email protected]>

Closes wesm#48 from wesm/clang-format and squashes the following commits:

eed45dd [Wes McKinney] Built "make format" target using clang-format based on arrow-cpp and clean up existing codebase

Change-Id: Id697c3805d18107ad7e20e9b81ba09350d7cd219
…or reporting, bit manipulation) with libarrow

Closes wesm#44. This will spare us developing similar functionality in two
places for the time being -- we can keep an eye on the relationship
between the libraries in case it does make sense to split in the
future (but it may not, particularly if we use Arrow to implement
nested data natively in pandas).

Author: Wes McKinney <[email protected]>

Closes wesm#49 from wesm/link-libarrow and squashes the following commits:

1d176b1 [Wes McKinney] Refactor libpandas to use common constructs and utility code from libarrow: Status, MemoryPool, Buffer, etc.

Change-Id: I3140a3ed8dd17db376be21241fc259523a16cc88
For test automation for now. This won't pass until the libarrow
artifacts are updated in conda-forge. Closes wesm#41

Author: Wes McKinney <[email protected]>

Closes wesm#50 from wesm/circle-ci and squashes the following commits:

c68b56b [Wes McKinney] Skip Cython extensions for now
e2bdb62 [Wes McKinney] Add external project include, google version numbers
5d4a536 [Wes McKinney] conda create fix
5986e72 [Wes McKinney] conda create with env path
af44d4d [Wes McKinney] Docker mysteries
1391c87 [Wes McKinney] Tweaks, not sure why failing
3fc6219 [Wes McKinney] Add basic Circle CI setup

Change-Id: I2fefc2548d5e60a8e6f9ea39fa57efb9cd0b2e79
* Changing PrimitiveType to NumericType
* Removing the macro for declaring sub-types of NumericType and
  instead using template arguments
* Adding a static SINGLETON member to the NumericType base class
  instead of the macros for methods for creating singletons for
  each numeric type
* Removing NullType's inheritance of PrimitiveType
* Removing the intermediate base class between {Integer,Floating}ArrayImpl
  and PrimitiveType and just making a concrete template-based class
  {Integer,Floating}Array
* Changing FLOAT and DOUBLE to FLOAT32 and FLOAT64

Author: Joshua Storck <[email protected]>

Closes wesm#55 from joshuastorck/pandas-2.0 and squashes the following commits:

1614d44 [Joshua Storck] * Fixing native.p{xd,yx} so that it matches the latest C++ code * Changing PrimitiveType to NumericType * Removing the macro for declaring sub-types of NumericType and instead using template arguments * Adding a static SINGLETON member to the NumericType base class instead of the macros for methods for creating singletons for each numeric type * Removing NullType's inheritance of PrimitiveType * Removing the intermediate base class between {Integer,Floating}ArrayImpl and PrimitiveType and just making a concrete template-based class {Integer,Floating}Array * Changing FLOAT and DOUBLE to FLOAT32 and FLOAT64
auto other_data = other.data();
if (HasNulls()) {
  for (auto ii = 0; ii < length; ++ii, ++this_data, ++other_data) {
    if (!IsNull(ii)) { *this_data += *other_data; }
@shoyer commented Oct 27, 2016:
This check may or may not be a good idea from a performance perspective. Are you sure the conditional check is cheaper than just doing the arithmetic always? Even when there are only a few null values?

wesm (Owner):
I would say to do the arithmetic always and use byte/word-wise & with the respective bitmaps (if they are byte-aligned, otherwise do it bitwise) to propagate nulls separately
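
A sketch of what that could look like under the byte-aligned assumption (a hypothetical free function, not code from this PR):

#include <cstdint>

void AddAndPropagateNulls(const int64_t* left, const uint8_t* left_valid,
                          const int64_t* right, const uint8_t* right_valid,
                          int64_t length, int64_t* out, uint8_t* out_valid) {
  // Branch-free arithmetic: add every slot, null or not.
  for (int64_t i = 0; i < length; ++i) {
    out[i] = left[i] + right[i];
  }
  // Byte-wise AND of the validity bitmaps (1 = valid) propagates nulls
  // separately; a bit-offset version would have to shift as it goes.
  const int64_t nbytes = (length + 7) / 8;
  for (int64_t i = 0; i < nbytes; ++i) {
    out_valid[i] = left_valid[i] & right_valid[i];
  }
}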

joshuastorck (Author):
That would have been my preference. It just wasn't clear from the existing code if floating point arrays intentionally didn't have bitmaps.

// Copy the list of nulls (aka valid bits) to a buffer
Status CopyNulls(int64_t length, std::shared_ptr<Buffer>* out) const {
  // Pre-condition is that we already have valid_bits_ set
  return valid_bits_->Copy(this->offset_, BitUtil::BytesForBits(length), out);
wesm (Owner):
Offset needs to be used as a bit offset. Can use CopyBitmap here (which hasn't been properly implemented yet, but the signature is okay)
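
To illustrate, a bit-offset-aware copy has to address individual bits rather than bytes; a naive sketch (not Arrow's actual CopyBitmap implementation):

#include <cstdint>

inline bool GetBit(const uint8_t* bits, int64_t i) {
  return (bits[i / 8] >> (i % 8)) & 1;
}

inline void SetBitTo(uint8_t* bits, int64_t i, bool value) {
  if (value) {
    bits[i / 8] |= static_cast<uint8_t>(1 << (i % 8));
  } else {
    bits[i / 8] &= static_cast<uint8_t>(~(1 << (i % 8)));
  }
}

// Copy `length` bits starting at bit `offset` of `src` into the front of
// `dst`, bit by bit for clarity; a real version works a word at a time.
void CopyBitmap(const uint8_t* src, int64_t offset, int64_t length, uint8_t* dst) {
  for (int64_t i = 0; i < length; ++i) {
    SetBitTo(dst, i, GetBit(src, offset + i));
  }
}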

    plus, plus_inplace);
OperatorTest<IntegerArray, UInt32Type, IntegerArray, UInt32Type>().TestInplaceOperator(
    plus, plus_inplace);
OperatorTest<IntegerArray, UInt64Type, IntegerArray, UInt64Type>().TestInplaceOperator(
wesm (Owner):
Guess we may want some macros (or other templates) to be able to say "Do this thing for the cartesian product of certain types"
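
One macro-free shape for that, sketched with hypothetical names: a TypeList plus recursive helpers that instantiate a two-parameter test template for every (left, right) pair.

template <typename... TYPES>
struct TypeList {};

// Run F<L, R>::Run() for a fixed L against every R in a TypeList.
template <typename L, typename LIST, template <typename, typename> class F>
struct ForEachRight;

template <typename L, template <typename, typename> class F>
struct ForEachRight<L, TypeList<>, F> {
  static void Run() {}
};

template <typename L, typename R, typename... RS,
          template <typename, typename> class F>
struct ForEachRight<L, TypeList<R, RS...>, F> {
  static void Run() {
    F<L, R>::Run();
    ForEachRight<L, TypeList<RS...>, F>::Run();
  }
};

// Run F for every (L, R) in the cartesian product LEFTS x RIGHTS.
template <typename LEFTS, typename RIGHTS, template <typename, typename> class F>
struct ForEachPair;

template <typename RIGHTS, template <typename, typename> class F>
struct ForEachPair<TypeList<>, RIGHTS, F> {
  static void Run() {}
};

template <typename L, typename... LS, typename RIGHTS,
          template <typename, typename> class F>
struct ForEachPair<TypeList<L, LS...>, RIGHTS, F> {
  static void Run() {
    ForEachRight<L, RIGHTS, F>::Run();
    ForEachPair<TypeList<LS...>, RIGHTS, F>::Run();
  }
};

A test would then wrap the OperatorTest call in a two-parameter class template (say, PlusTest) and invoke ForEachPair<IntTypes, IntTypes, PlusTest>::Run() once instead of spelling out each permutation.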

wesm (Owner):
We could also use type traits to associate a class template (e.g. IntegerArray<Int32Type>) with just the type Int32Type
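
Roughly, assuming the PR's logical types and array templates exist:

// Assumed from the PR's codebase:
struct Int32Type;
struct FloatType;
template <typename T> class IntegerArray;
template <typename T> class FloatingArray;

// Trait associating a logical type with its array class template.
template <typename T>
struct ArrayTypeTraits;  // primary template intentionally undefined

template <>
struct ArrayTypeTraits<Int32Type> {
  using ArrayType = IntegerArray<Int32Type>;
};

template <>
struct ArrayTypeTraits<FloatType> {
  using ArrayType = FloatingArray<FloatType>;
};

A test could then be parameterized on Int32Type alone and recover the array class as typename ArrayTypeTraits<Int32Type>::ArrayType.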

}

template <typename TYPE>
Status NumericArray<TYPE>::EnsureMutableAndCheckChange(bool& changed) {
wesm (Owner):
Google C++ guide discourages mutable references, preferring bool* changed
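
For reference, the style-guide-preferred shape would be (a hypothetical revision, not code from the PR):

// Out-parameter by pointer, so mutation is visible at the call site.
Status EnsureMutableAndCheckChange(bool* changed);

// Call site:
//   bool changed = false;
//   RETURN_NOT_OK(array.EnsureMutableAndCheckChange(&changed));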

IntegerArray<TYPE>& operator+=(const IntegerArray<OTHER_TYPE>& other) {
  // TODO: check Status and throw exception
  EnsureMutable();
  auto length = std::min(this->length_, other.length());
wesm (Owner):
Thinking out loud here. I have been thinking about a slightly different approach to implementing operators in generality, by unpacking the containers into simpler data structures, like:

// SFINAE
template <typename T, typename ENABLED = void>
struct TypeData {};

template <typename T>
struct TypeData<T, typename std::enable_if<std::is_integral<T>::value>::type> {
  T* data;
  uint8_t* valid_bits;
  int64_t length;
  int64_t offset;
};

and then have arithmetic methods

template <typename LEFT_TYPE, typename RIGHT_TYPE, typename OUT_TYPE>
void add(TypeData<LEFT_TYPE> left, TypeData<RIGHT_TYPE> right, TypeData<OUT_TYPE> out) {
  ...
  for (int64_t i = 0; i < left.length; ++i) {
    out.data[i] = left.data[i] + right.data[i];
  } 
  ...
}

In the case of inplace-add, in the operator dispatcher, you would have:

TypeData<LEFT_TYPE> left(...);
TypeData<RIGHT_TYPE> right(...);
add(left, right, left);

You would have to deal with whether to allocate the validity/null bitmap prior to invoking this function, but I think it's probably better to handle the mutability checks / memory allocation / etc. separate from the analytics. Then you can box the output as you wish.

Of course, you need to have a forward declaration someplace

template <typename LEFT_TYPE, typename RIGHT_TYPE, typename OUT_TYPE>
void add(TypeData<LEFT_TYPE> left, TypeData<RIGHT_TYPE> right, TypeData<OUT_TYPE> out);

and then instantiate the matrix of types you're choosing to support. Some analytical kernels might need to allocate memory based on what's in the data, and so we could do something like

void kernel_name(OperatorContext* ctx, TypeData<ARG1_TYPE> arg1, ...);

I'd also like to maximize code sharing amongst related operator implementations.

I'm going to take a crack at implementing part of Take under this regime (see https://github.com/pandas-dev/pandas/blob/master/pandas/src/algos_take_helper.pxi). Will keep thinking about this.

Also, per #46 we are contemplating going to bitmaps for all null handling, which would make the code for integers and floats identical (outside some type promotion details, maybe)

joshuastorck (Author):
If you assume that all numeric types will have a null bitmap, then I think NumericArray in the current code would probably hold something equivalent to what you are suggesting with TypeData.

I'm not following your SFINAE code, but I think you are driving at a design that looks a bit more like this:

template <typename KERNEL,
          typename LEFT_TYPE, 
          typename RIGHT_TYPE, 
          typename OUT_TYPE,  
          typename... KERNEL_ARGS>
void RunKernel(TypeData<LEFT_TYPE> const& left, 
               TypeData<RIGHT_TYPE> const& right,
               TypeData<OUT_TYPE>& out,                    
               KERNEL_ARGS&&... kernel_args)
{
    KERNEL kernel(std::forward<KERNEL_ARGS>(kernel_args)...);
    kernel(left, right, out);
}

class GenericKernel
{

public:

    template <typename LEFT_TYPE, typename RIGHT_TYPE, typename OUT_TYPE, typename FUNCTOR>
    void Run(const TypeData<LEFT_TYPE>& left, 
             FUNCTOR& functor,
             const TypeData<RIGHT_TYPE>& right,
             TypeData<OUT_TYPE>& out)
    {
        // Do a bitmap OR for nulls first and then do ffs
        // and branch 
        for (auto ii = 0; ii < left.length; ++ii)
        {
            functor(left.data[ii], right.data[ii], out.data[ii]);
        }
    }    

};

// This would get initialized once at startup and would 
// have information about the instruction set available
// on the machine (AVX vs. AVX2)
class SIMDConfiguration
{
};

template <typename LEFT_TYPE, typename RIGHT_TYPE>
class SIMDSinglePrecisionPossible;

template <typename LEFT_TYPE, typename RIGHT_TYPE>
class SIMDSinglePrecisionPossible : public std::false_type
{
};

template <>
class SIMDSinglePrecisionPossible<float, float> : public std::true_type
{
};

template <typename LEFT_TYPE, typename RIGHT_TYPE>
class SIMDDoublePrecisionPossible;

template <typename LEFT_TYPE, typename RIGHT_TYPE>
class SIMDDoublePrecisionPossible : public std::false_type
{
};

template <>
class SIMDDoublePrecisionPossible<double, double> : public std::true_type
{
};

template <typename LEFT_TYPE, typename RIGHT_TYPE>
class SIMDImpossible 
{
public:
    static constexpr auto value = !(SIMDDoublePrecisionPossible<LEFT_TYPE, RIGHT_TYPE>::value ||
                                    SIMDSinglePrecisionPossible<LEFT_TYPE, RIGHT_TYPE>::value);
};


class AddKernel : public GenericKernel
{

public:

    AddKernel(SIMDConfiguration& config) : config_(config)
    {
    }

    template <typename LEFT_TYPE,
              typename RIGHT_TYPE,
              typename OUT_TYPE>
    void operator()(const TypeData<LEFT_TYPE>& left, 
                    const TypeData<RIGHT_TYPE>& right,
                    TypeData<OUT_TYPE>& out)
    {
        RunOptimized<LEFT_TYPE, RIGHT_TYPE>(left, right, out);
    }

    template <typename LEFT_TYPE, typename RIGHT_TYPE, typename OUT_TYPE>
    typename std::enable_if<SIMDSinglePrecisionPossible<LEFT_TYPE, RIGHT_TYPE>::value>::type
    RunOptimized(const TypeData<LEFT_TYPE>& left, 
                 const TypeData<RIGHT_TYPE>& right,
                 TypeData<OUT_TYPE>& out)
    {
        // Use _mm_add_ps or _mm256_add_ps depending on
        // config_, fallback to Run
    }

    template <typename LEFT_TYPE, typename RIGHT_TYPE, typename OUT_TYPE>
    typename std::enable_if<SIMDDoublePrecisionPossible<LEFT_TYPE, RIGHT_TYPE>::value>::type
    RunOptimized(const TypeData<LEFT_TYPE>& left, 
                 const TypeData<RIGHT_TYPE>& right,
                 TypeData<OUT_TYPE>& out)
    {
        // Use _mm_add_pd or _mm256_add_pd depending on 
        // config_, fallback to Run
    }

    template <typename LEFT_TYPE, typename RIGHT_TYPE, typename OUT_TYPE>
    typename std::enable_if<SIMDImpossible<LEFT_TYPE, RIGHT_TYPE>::value>::type
    RunOptimized(const TypeData<LEFT_TYPE>& left, 
                 const TypeData<RIGHT_TYPE>& right,
                 TypeData<OUT_TYPE>& out)
    {
        static auto add = [](auto const& left, auto const& right, auto& out) {
            out = left + right;
        };
        Run(left, add, right, out);
    }


private:

    SIMDConfiguration& config_;

};

int main(int argc, char* argv[])
{
    // Just to see if it compiles
    TypeData<float> a;
    TypeData<int> b;
    TypeData<int> c;
    SIMDConfiguration config;
    RunKernel<AddKernel>(a, b, c, config);
}

I would expect FloatingArray and IntegerArray to control what operations are allowed using SFINAE, in the same fashion that I've currently implemented it. The main difference would be that those functions would just invoke RunKernel and would set up the correct return type based on the desired rules for type promotion.
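
For instance, the promotion rules might be encoded in a trait that the SFINAE-gated operators consult; a sketch with hypothetical names, using std::common_type as a stand-in for whatever rules pandas settles on:

#include <type_traits>

// Hypothetical trait: the output C type for a binary op on LEFT and RIGHT.
template <typename LEFT, typename RIGHT>
struct PromotedType {
  using type = typename std::common_type<LEFT, RIGHT>::type;
};

// SFINAE gate: this alias is only well-formed for arithmetic operands, so
// operators constrained on it drop out of overload resolution otherwise.
template <typename LEFT, typename RIGHT,
          typename = typename std::enable_if<std::is_arithmetic<LEFT>::value &&
                                             std::is_arithmetic<RIGHT>::value>::type>
using AddResult = typename PromotedType<LEFT, RIGHT>::type;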

static constexpr auto needs_double =
    (max_value > std::numeric_limits<typename FloatType::c_type>::max());

using type = typename std::conditional<needs_double, DoubleType, FloatType>::type;
wesm (Owner):
In pure Python and NumPy at least, integer division always yields double

Reviewer:
We could support integer division with //, but that's certainly a lower priority.

joshuastorck (Author):

Just to be clear, this is just choosing the right size of floating point type based on the inputs.

@@ -74,7 +74,7 @@ endif()
 # GCC cannot always verify whether strict aliasing rules are indeed followed due to
 # fundamental limitations in escape analysis, which can result in subtle bad code generation.
 # This has a small perf hit but worth it to avoid hard to debug crashes.
-set(CXX_COMMON_FLAGS "-std=c++11 -fno-strict-aliasing -msse4.2 -Wall -Wno-sign-compare -Wno-deprecated -pthread -D__STDC_FORMAT_MACROS")
+set(CXX_COMMON_FLAGS "-std=c++1y -fno-strict-aliasing -msse4.2 -Wall -Wno-sign-compare -Wno-deprecated -pthread -D__STDC_FORMAT_MACROS")
Reviewer:

This will increase the minimum required GCC version to 5?

@wesm (Owner) commented Oct 28, 2016:

I think c++1y is required by clang, we can do c++14 on gcc 4.9 I believe

Reviewer:

The thing that comes to my mind is that the typical toolchains for packages that work on multiple Linux distros (i.e. manylinux1 wheels and the redhat toolchain in conda) all depend on code compiling with GCC 4.8.

wesm (Owner):

Redhat devtoolset-3 has gcc 4.9.x, so I imagine by the time pandas2 is ready for primetime this upgrade will have happened. There will be other packages that need C++14, might as well push the issue

Reviewer:

OK, let's hope for a manylinux2 standard then and enjoy the new C++14 features ;). Looking forward to std::make_unique.

This includes a design change that obviates the need for an ArrayView.
Instead, every array has an internal offset. Shallow copy is achieved by the
copy constructor, though the current set of copy constructors doesn't yet
support a slice. Deep copy is still achieved through the Copy virtual
function.

More detailed explanation of the changes:

* Adding copy/move constructors for {Floating,Integer,Numeric}Array
* Adding various methods for marking/getting nulls (valid bits) in
  integer arrays
* Changing data() and mutable_data() in NumericArray so that they
  return a pointer that starts at the array's offset
* Addition of Addable/Divisable classes (similar to Boost operators)
  for easy support of operator+ and operator/
* Implementing IntegerArray::operator/, IntegerArray::operator+=,
  FloatingArray::operator/=, FloatingArray::operator+=
* Adding the metaprogramming class TypeList, which is used as
  scaffolding in array-test to simplify declaring every permutation
  of arithmetic operation for all types.
@wesm (Owner) commented Dec 10, 2016:

There's a bunch of bits here that can be extracted and merged as useful utilities. I want to wait on pulling in operator implementations until the operator interface has been thought through some more: https://docs.google.com/document/d/1YmsV48iO6YNSxCIC84Xig5z-i9g4g7rzYTmCgUAxKRk/edit#heading=h.pzfqkdqgusm7

I'd like to take a crack at implementing some operators in a way that can be evaluated in either single- or multi-threaded mode, without excessive code duplication -- so the code that runs things in parallel won't be too knowledgeable about the details of the operator except how to invoke their virtual functions.
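
A minimal sketch of that separation, assuming a hypothetical Operator interface: the driver only splits the row range and calls a virtual method, knowing nothing else about the operator.

#include <algorithm>
#include <cstdint>
#include <thread>
#include <vector>

class Operator {
 public:
  virtual ~Operator() = default;
  // Evaluate rows [begin, end); implementations must be safe to call
  // concurrently on disjoint ranges.
  virtual void EvalRange(int64_t begin, int64_t end) = 0;
};

void Evaluate(Operator* op, int64_t length, int num_threads) {
  if (num_threads <= 1) {
    op->EvalRange(0, length);  // single-threaded mode: one direct call
    return;
  }
  const int64_t chunk = (length + num_threads - 1) / num_threads;
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    const int64_t begin = t * chunk;
    const int64_t end = std::min<int64_t>(length, begin + chunk);
    if (begin >= end) break;
    workers.emplace_back([op, begin, end] { op->EvalRange(begin, end); });
  }
  for (auto& w : workers) w.join();
}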

@wesm (Owner) commented Dec 13, 2016:

I cherry-picked some of this as a7f2a82

I'm going to do a little prototyping on operator evaluation generally, so will return to this hopefully soon
