Python Bindings for Pairwise Linestring Distance #521

Merged
Changes from 97 commits
101 commits
0f2c62c
Initial pass of linestring distance and test
isVoid Mar 31, 2022
29f1d62
Add more one pair linestring tests
isVoid Mar 31, 2022
954bbfc
More single pair testing
isVoid Apr 1, 2022
03748ea
update docstring
isVoid Apr 1, 2022
3b52203
Add medium test
isVoid Apr 7, 2022
6deb261
Wrapping debug prints
isVoid Apr 7, 2022
61716c6
Initial
isVoid Apr 7, 2022
95a803f
Merge branch 'fea/enable_nvbench' into feature/linestring_distance
isVoid Apr 7, 2022
3375b1b
Merge branch 'branch-22.06' of github.com:rapidsai/cuspatial into fea…
isVoid Apr 7, 2022
3cab72d
initial
isVoid Apr 7, 2022
1c6322e
Merge branch 'fix/update_cuda_try' into feature/linestring_distance_b…
isVoid Apr 7, 2022
f6b571a
Update synchonization.cpp
isVoid Apr 7, 2022
2c3a038
Merge branch 'fix/update_cuda_try' into fea/enable_nvbench
isVoid Apr 8, 2022
90de8ce
Replace one more usage of `CUDF_CUDA_TRY`
isVoid Apr 8, 2022
3efb6b0
Merge branch 'fix/update_cuda_try' into feature/linestring_distance_b…
isVoid Apr 8, 2022
a637664
Merge branch 'fea/enable_nvbench' into feature/linestring_distance_be…
isVoid Apr 8, 2022
d82f887
Add nvbench cpm
isVoid Apr 8, 2022
80e2d27
Merge branch 'fea/enable_nvbench' into feature/linestring_distance_be…
isVoid Apr 8, 2022
813178c
Add runnable nvbench
isVoid Apr 8, 2022
aa65c75
Add data gen synchronizer
isVoid Apr 8, 2022
511dd5d
Rewrites kernel to schedule on num points
isVoid Apr 9, 2022
cee2c1f
Change problem size in benchmarks
isVoid Apr 9, 2022
3bf3d44
In the middle of getting the compiler to include the file.
isVoid Apr 14, 2022
2006ac1
passes compilation
isVoid Apr 14, 2022
f463bd8
Revert "passes compilation"
isVoid Apr 14, 2022
08ef028
Revert "In the middle of getting the compiler to include the file."
isVoid Apr 14, 2022
53e822a
Code cleanups and completes docstrings
isVoid Apr 14, 2022
5dd6a4c
Revert "Change problem size in benchmarks"
isVoid Apr 14, 2022
2b1da4a
Revert "Add data gen synchronizer"
isVoid Apr 14, 2022
d8e6902
Revert "Add runnable nvbench"
isVoid Apr 14, 2022
2341301
Revert "Add nvbench cpm"
isVoid Apr 14, 2022
cc6f1a2
Revert benchmark cmake introduction of nvbench.
isVoid Apr 14, 2022
daa2966
Merge branch 'branch-22.06' of https://github.com/rapidsai/cuspatial …
isVoid Apr 14, 2022
decac94
Add small helper to retrieve endpoint index.
isVoid Apr 14, 2022
0f139e0
More cleanups
isVoid Apr 14, 2022
0fa013a
Add cython and python bindings
isVoid Apr 22, 2022
5ceccd3
add simple test case
isVoid Apr 22, 2022
30ac95c
Add single test sample from geolife
isVoid Apr 25, 2022
53a4135
Fix typo
isVoid Apr 25, 2022
7dcd1c8
Merge branch 'feature/linestring_distance' into feature/linestring_di…
isVoid Apr 25, 2022
b606956
Fix bug where collinear line segments are taken as intersect segments
isVoid Apr 26, 2022
6811cb6
Add test case for various collinear line segments and an geolife example
isVoid Apr 26, 2022
d83f2e2
Merge branch 'feature/linestring_distance' into feature/linestring_di…
isVoid Apr 26, 2022
90a5102
Multiple update with docstrings.
isVoid Apr 27, 2022
77b27a9
style fix
isVoid Apr 27, 2022
468a5e9
Switch to snake case.
isVoid Apr 27, 2022
c5df642
fix typo
isVoid Apr 27, 2022
777a01a
Rename coord_2d to vec_2d. Add vec_2d operators, rewrites `point_to_s…
isVoid Apr 28, 2022
aa93829
Merge master from upstream
isVoid Apr 28, 2022
e0a15d8
Migrate vec2d operators to vec_2d.cuh
isVoid Apr 28, 2022
877d74e
Add docstrings for operators
isVoid Apr 28, 2022
2a58b0e
Add macro for portable host_device code
isVoid Apr 28, 2022
6fdd25a
Revert "Merge master from upstream"
isVoid Apr 28, 2022
9fa8136
Add macro for portable host-device operators
isVoid Apr 28, 2022
39baadf
Merge branch 'branch-22.06' of https://github.com/rapidsai/cuspatial …
isVoid Apr 28, 2022
cc0f526
Optimize `segment_distance` method
isVoid Apr 28, 2022
f55db25
Avoid using `double` for tempeoraries
isVoid Apr 28, 2022
22ed7ec
Remove vec2d elementwise add
isVoid Apr 28, 2022
bc77e95
Merge branch 'feature/linestring_distance' into feature/linestring_di…
isVoid Apr 28, 2022
7718b7e
Add aka polylines
isVoid Apr 28, 2022
2c132c5
remove validation
isVoid Apr 28, 2022
a477f2b
remove validation
isVoid Apr 28, 2022
708906d
Add aka polylines
isVoid Apr 28, 2022
2dc421b
Add examples in docstring and test for it
isVoid Apr 28, 2022
497f1d1
update docstring of `det`
isVoid Apr 28, 2022
c09689a
Add denormalized determinants test
isVoid Apr 29, 2022
7023faa
adopt almost always auto
isVoid May 2, 2022
8004ec9
doc updates
isVoid May 2, 2022
654c08b
Several docstring updates
isVoid May 3, 2022
c864dca
update colinear names
isVoid May 3, 2022
ddf497e
fix typo
isVoid May 3, 2022
1203a86
Merge branch 'feature/linestring_distance' of github.com:isVoid/cuspa…
isVoid May 3, 2022
2da3a66
Template out the sizetype
isVoid May 3, 2022
338a5ad
Move 2d vector type definitions to `vec_2d.hpp`
isVoid May 3, 2022
94b78f3
Include vector types in haversine method
isVoid May 3, 2022
a3e84ee
update example and test
isVoid May 4, 2022
c40b81f
Lift sqrt outside of the segment distance.
isVoid May 5, 2022
65a835c
Update docstring LRAI links.
isVoid May 5, 2022
96f15ce
Merge branch 'feature/linestring_distance' into feature/linestring_di…
isVoid May 5, 2022
ba0aed1
Update naming to linestring
isVoid May 5, 2022
23875ee
Replace type declaration with `auto`
isVoid May 5, 2022
ac53f8d
Delay computations
isVoid May 5, 2022
b546e7a
Use constexpr for `tpb`
isVoid May 5, 2022
26fa8d4
Replace division with multiply denom reciprocals
isVoid May 5, 2022
b7dda2c
delay computing `bc`
isVoid May 5, 2022
2c60a5d
docstring fixes
isVoid May 5, 2022
3ef43db
Merge branch 'feature/linestring_distance' into feature/linestring_di…
isVoid May 5, 2022
c4c0817
Add more linestring python tests
isVoid May 6, 2022
e73cf68
add docs for host compute helper
isVoid May 12, 2022
523bee6
Rename pxd declaration file to hpp filename.
isVoid May 12, 2022
66b0fcf
Remove comment
isVoid May 12, 2022
11ca399
Merge branch 'branch-22.06' of https://github.com/rapidsai/cuspatial …
isVoid May 12, 2022
3fe876f
style
isVoid May 12, 2022
08e528e
update pythob/cpp docstrings
isVoid May 26, 2022
d27ff3e
doc
isVoid May 26, 2022
272d443
add example to python docs
isVoid May 27, 2022
60ed54e
docs style update
isVoid May 27, 2022
202e4e2
Merge remote-tracking branch 'origin/branch-22.06' into feature/lines…
vyasr May 28, 2022
f9b70e9
Add linestring_distance to CMakeLists.txt.
vyasr May 28, 2022
fcf44c2
Fix erroneous tests.
vyasr May 31, 2022
c2f0620
Attempt to fix style.
vyasr May 31, 2022
4 changes: 2 additions & 2 deletions cpp/include/cuspatial/distances/linestring_distance.hpp
@@ -83,8 +83,8 @@ namespace cuspatial {
* @param linestring1_points_x x-components of points in the first linestring of each pair.
* @param linestring1_points_y y-component of points in the first linestring of each pair.
* @param linestring2_offsets Indices of the first point of the second linestring of each pair.
- * @param linestring2_points_x x-component of points in the first linestring of each pair.
- * @param linestring2_points_y y-component of points in the first linestring of each pair.
+ * @param linestring2_points_x x-component of points in the second linestring of each pair.
+ * @param linestring2_points_y y-component of points in the second linestring of each pair.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return A column of shortest distances between each pair of linestrings.
*
1 change: 1 addition & 0 deletions python/cuspatial/cuspatial/__init__.py
@@ -7,6 +7,7 @@
    point_in_polygon,
    polygon_bounding_boxes,
    polyline_bounding_boxes,
    pairwise_linestring_distance,
)
from .core.indexing import quadtree_on_points
from .core.interpolate import CubicSpline
20 changes: 20 additions & 0 deletions python/cuspatial/cuspatial/_lib/cpp/linestring_distance.pxd
@@ -0,0 +1,20 @@
# Copyright (c) 2022, NVIDIA CORPORATION.

from libcpp.memory cimport unique_ptr

from cudf._lib.column cimport Column
from cudf._lib.cpp.column.column cimport column
from cudf._lib.cpp.column.column_view cimport column_view
from cudf._lib.cpp.types cimport size_type


cdef extern from "cuspatial/distances/linestring_distance.hpp" \
        namespace "cuspatial" nogil:
    cdef unique_ptr[column] pairwise_linestring_distance(
        const column_view linestring1_offsets,
        const column_view linestring1_points_x,
        const column_view linestring1_points_y,
        const column_view linestring2_offsets,
        const column_view linestring2_points_x,
        const column_view linestring2_points_y
    ) except +
39 changes: 39 additions & 0 deletions python/cuspatial/cuspatial/_lib/linestring_distance.pyx
@@ -0,0 +1,39 @@
from libcpp.memory cimport unique_ptr
from libcpp.utility cimport move

from cudf._lib.column cimport Column
from cudf._lib.cpp.column.column cimport column
from cudf._lib.cpp.column.column_view cimport column_view

from cuspatial._lib.cpp.linestring_distance cimport (
    pairwise_linestring_distance as cpp_pairwise_linestring_distance,
)


def pairwise_linestring_distance(
    Column linestring1_offsets,
    Column linestring1_points_x,
    Column linestring1_points_y,
    Column linestring2_offsets,
    Column linestring2_points_x,
    Column linestring2_points_y
):
    cdef column_view linestring1_offsets_view = linestring1_offsets.view()
    cdef column_view linestring1_points_x_view = linestring1_points_x.view()
    cdef column_view linestring1_points_y_view = linestring1_points_y.view()
    cdef column_view linestring2_offsets_view = linestring2_offsets.view()
    cdef column_view linestring2_points_x_view = linestring2_points_x.view()
    cdef column_view linestring2_points_y_view = linestring2_points_y.view()

    cdef unique_ptr[column] c_result
    with nogil:
        c_result = move(cpp_pairwise_linestring_distance(
            linestring1_offsets_view,
            linestring1_points_x_view,
            linestring1_points_y_view,
            linestring2_offsets_view,
            linestring2_points_x_view,
            linestring2_points_y_view
        ))

    return Column.from_unique_ptr(move(c_result))
104 changes: 103 additions & 1 deletion python/cuspatial/cuspatial/core/gis.py
@@ -1,11 +1,14 @@
# Copyright (c) 2019-2020, NVIDIA CORPORATION.

-from cudf import DataFrame
+from cudf import DataFrame, Series
from cudf.core.column import as_column

from cuspatial._lib.hausdorff import (
    directed_hausdorff_distance as cpp_directed_hausdorff_distance,
)
from cuspatial._lib.linestring_distance import (
    pairwise_linestring_distance as cpp_pairwise_linestring_distance,
)
from cuspatial._lib.point_in_polygon import (
    point_in_polygon as cpp_point_in_polygon,
)
@@ -335,3 +338,102 @@ def polyline_bounding_boxes(poly_offsets, xs, ys, expansion_radius):
    return DataFrame._from_data(
        *cpp_polyline_bounding_boxes(poly_offsets, xs, ys, expansion_radius)
    )


def pairwise_linestring_distance(offsets1, xs1, ys1, offsets2, xs2, ys2):
    """Compute shortest distance between pairs of linestrings (a.k.a. polylines)

    The shortest distance between two linestrings is defined as the shortest
    distance between all pairs of segments of the two linestrings. If any of
    the segments intersect, the distance is 0.

    Parameters
    ----------
    offsets1
        Indices of the first point of the first linestring of each pair.
    xs1
        x-component of points in the first linestring of each pair.
    ys1
        y-component of points in the first linestring of each pair.
    offsets2
        Indices of the first point of the second linestring of each pair.
    xs2
        x-component of points in the second linestring of each pair.
    ys2
        y-component of points in the second linestring of each pair.

    Returns
    -------
    distance : cudf.Series
        the distance between each pair of linestrings

    Examples
    --------
    The following example contains 4 pairs of linestrings.

    First pair::

        (0, 1) -> (1, 0) -> (-1, 0)
        (1, 1) -> (2, 1) -> (2, 0) -> (3, 0)

            |
            *   #####
            | *     #
        ----O---*---#####
            | *
            *
            |

    The shortest distance between the two linestrings is the distance
    from point ``(1, 1)`` to segment ``(0, 1) -> (1, 0)``, which is
    ``sqrt(2)/2``.

    Second pair::

        (0, 0) -> (0, 1)
        (1, 0) -> (1, 1) -> (1, 2)

    These linestrings are parallel. Their distance is 1 (point
    ``(0, 0)`` to point ``(1, 0)``).

    Third pair::

        (0, 0) -> (2, 2) -> (-2, 0)
        (2, 0) -> (0, 2)

    These linestrings intersect, so their distance is 0.

    Fourth pair::

        (2, 2) -> (-2, -2)
        (1, 1) -> (5, 5) -> (10, 0)

    These linestrings contain collinear and overlapping sections, so
    their distance is 0.

    The input of the above example is::

        linestring1_offsets:  {0, 3, 5, 8}
        linestring1_points_x: {0, 1, -1, 0, 0, 0, 2, -2, 2, -2}
        linestring1_points_y: {1, 0, 0, 0, 1, 0, 2, 0, 2, -2}
        linestring2_offsets:  {0, 4, 7, 9}
        linestring2_points_x: {1, 2, 2, 3, 1, 1, 1, 2, 0, 1, 5, 10}
        linestring2_points_y: {1, 1, 0, 0, 0, 1, 2, 0, 2, 1, 5, 0}

        Result: {sqrt(2.0)/2, 1, 0, 0}
    """
    xs1, ys1, xs2, ys2 = normalize_point_columns(
        as_column(xs1), as_column(ys1), as_column(xs2), as_column(ys2)
    )
    offsets1 = as_column(offsets1, dtype="int32")
    offsets2 = as_column(offsets2, dtype="int32")
    return Series._from_data(
        {
            None: cpp_pairwise_linestring_distance(
                offsets1, xs1, ys1, offsets2, xs2, ys2
            )
        }
    )
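
For readers who want to check the docstring's four example pairs by hand, here is a minimal CPU sketch in NumPy of the definition given above (minimum over all segment pairs, 0 if any two segments cross). It is an illustration only, not the CUDA kernel in this PR. The helper names `point_segment_sqdist`, `segments_cross`, `segment_sqdist` and `pairwise_linestring_distance_ref` are hypothetical. The sketch works on squared distances and takes a single square root per pair, which is an assumption about a sensible structure rather than a statement about the library's internals.

```python
import math

import numpy as np


def cross(u, v):
    """z-component of the 2D cross product u x v."""
    return float(u[0] * v[1] - u[1] * v[0])


def point_segment_sqdist(p, a, b):
    """Squared distance from point p to segment a-b."""
    ab = b - a
    denom = float(ab @ ab)
    if denom == 0.0:  # degenerate segment: a and b coincide
        return float((p - a) @ (p - a))
    # Clamp the projection parameter so the closest point stays on the segment.
    t = min(1.0, max(0.0, float((p - a) @ ab) / denom))
    d = p - (a + t * ab)
    return float(d @ d)


def segments_cross(a, b, c, d):
    """True if non-parallel segments a-b and c-d share a point."""
    r, s = b - a, d - c
    denom = cross(r, s)
    if denom == 0.0:
        # Parallel or collinear: the endpoint distances below cover this case.
        return False
    t = cross(c - a, s) / denom
    u = cross(c - a, r) / denom
    return 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0


def segment_sqdist(a, b, c, d):
    """Squared shortest distance between segments a-b and c-d."""
    if segments_cross(a, b, c, d):
        return 0.0
    return min(
        point_segment_sqdist(a, c, d),
        point_segment_sqdist(b, c, d),
        point_segment_sqdist(c, a, b),
        point_segment_sqdist(d, a, b),
    )


def pairwise_linestring_distance_ref(offsets1, xs1, ys1, offsets2, xs2, ys2):
    pts1 = np.column_stack([xs1, ys1]).astype(float)
    pts2 = np.column_stack([xs2, ys2]).astype(float)
    # Append the point count so bounds[i]:bounds[i + 1] delimits linestring i.
    bounds1 = np.append(offsets1, len(pts1))
    bounds2 = np.append(offsets2, len(pts2))
    out = []
    for i in range(len(offsets1)):
        l1 = pts1[bounds1[i]:bounds1[i + 1]]
        l2 = pts2[bounds2[i]:bounds2[i + 1]]
        best = min(
            segment_sqdist(l1[j], l1[j + 1], l2[k], l2[k + 1])
            for j in range(len(l1) - 1)
            for k in range(len(l2) - 1)
        )
        out.append(math.sqrt(best))  # one sqrt per pair
    return np.array(out)


# Input copied from the docstring example.
result = pairwise_linestring_distance_ref(
    [0, 3, 5, 8],
    [0, 1, -1, 0, 0, 0, 2, -2, 2, -2],
    [1, 0, 0, 0, 1, 0, 2, 0, 2, -2],
    [0, 4, 7, 9],
    [1, 2, 2, 3, 1, 1, 1, 2, 0, 1, 5, 10],
    [1, 1, 0, 0, 0, 1, 2, 0, 2, 1, 5, 0],
)
print(result)
```

The collinear fourth pair never reaches the crossing test with a nonzero denominator; it is resolved by the point-to-segment fallback, since an endpoint of one segment lies on the other.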
@@ -0,0 +1,152 @@
import cupy as cp
import pandas as pd
import shapely

import cudf
from cudf.testing._utils import assert_eq

import cuspatial


def shapely_pairwise_linestring_distance(data1, data2, offset1, offset2):
    """Compute pairwise linestring distances with shapely."""

    def make_linestring(group):
        return shapely.geometry.LineString([*zip(group["x"], group["y"])])

    ridx1 = pd.RangeIndex(len(data1))
    ridx2 = pd.RangeIndex(len(data2))
    groupid1 = ridx1.map(lambda i: offset1.searchsorted(i, side="right"))
    groupid2 = ridx2.map(lambda i: offset2.searchsorted(i, side="right"))

    data1["gid"] = groupid1
    data2["gid"] = groupid2

    linestrings1 = data1.groupby("gid").apply(make_linestring)
    linestrings2 = data2.groupby("gid").apply(make_linestring)

    linestring_pairs = pd.DataFrame({"s1": linestrings1, "s2": linestrings2})
    distances = linestring_pairs.apply(
        lambda row: row["s1"].distance(row["s2"]), axis=1
    )

    return distances.reset_index(drop=True)


def test_zero_pair():
    data1 = cudf.DataFrame(
        {
            "x": [],
            "y": [],
        }
    )
    data2 = cudf.DataFrame(
        {
            "x": [],
            "y": [],
        }
    )
    offset1 = cudf.Series([], dtype="int32")
    offset2 = cudf.Series([], dtype="int32")

    # Argument order follows the signature: offsets, xs, ys for each side.
    got = cuspatial.pairwise_linestring_distance(
        offset1, data1["x"], data1["y"], offset2, data2["x"], data2["y"]
    )
    expected = cudf.Series([], dtype="float64")

    assert_eq(got, expected)


def test_one_pair():
    data1 = cudf.DataFrame(
        {
            "x": [0.0, 1.0],
            "y": [0.0, 1.0],
        }
    )
    data2 = cudf.DataFrame(
        {
            "x": [2.0, 3.0],
            "y": [2.0, 3.0],
        }
    )
    offset1 = cudf.Series([0], dtype="int32")
    offset2 = cudf.Series([0], dtype="int32")

    got = cuspatial.pairwise_linestring_distance(
        offset1, data1["x"], data1["y"], offset2, data2["x"], data2["y"]
    )
    expected = shapely_pairwise_linestring_distance(
        data1.to_pandas(),
        data2.to_pandas(),
        offset1.to_pandas(),
        offset2.to_pandas(),
    )

    assert_eq(got, expected)


def test_two_pairs():
    data1 = cudf.DataFrame(
        {
            "x": [0.0, 1.0, 5.0, 7.0, 8.0],
            "y": [0.0, 1.0, 10.2, 11.4, 12.8],
        }
    )
    data2 = cudf.DataFrame(
        {
            "x": [2.0, 3.0, -8.0, -10.0, -13.0, -3.0],
            "y": [2.0, 3.0, -8.0, -5.0, -15.0, -6.0],
        }
    )
    offset1 = cudf.Series([0, 3], dtype="int32")
    offset2 = cudf.Series([0, 2], dtype="int32")

    got = cuspatial.pairwise_linestring_distance(
        offset1, data1["x"], data1["y"], offset2, data2["x"], data2["y"]
    )
    expected = shapely_pairwise_linestring_distance(
        data1.to_pandas(),
        data2.to_pandas(),
        offset1.to_pandas(),
        offset2.to_pandas(),
    )

    assert_eq(got, expected)


def test_100_randomized_input():
    rng = cp.random.RandomState(0)

    max_linestring_points = 10
    size = 100

    offset1 = rng.randint(2, max_linestring_points, size=(size,))
    offset2 = rng.randint(2, max_linestring_points, size=(size,))

    offset1 = cp.cumsum(offset1)
    offset2 = cp.cumsum(offset2)

    num_points_1 = int(offset1[-1])
    num_points_2 = int(offset2[-1])

    offset1 = cp.concatenate((cp.zeros((1,)), offset1[:-1]))
    offset2 = cp.concatenate((cp.zeros((1,)), offset2[:-1]))

    points1_x = rng.uniform(-1, 1, (num_points_1,))
    points1_y = rng.uniform(-1, 1, (num_points_1,))

    points2_x = rng.uniform(0.5, 2.5, (num_points_2,))
    points2_y = rng.uniform(0.5, 2.5, (num_points_2,))

    got = cuspatial.pairwise_linestring_distance(
        offset1, points1_x, points1_y, offset2, points2_x, points2_y
    )
    expected = shapely_pairwise_linestring_distance(
        pd.DataFrame({"x": points1_x.get(), "y": points1_y.get()}),
        pd.DataFrame({"x": points2_x.get(), "y": points2_y.get()}),
        pd.Series(offset1.get()),
        pd.Series(offset2.get()),
    )

    assert_eq(got, expected)
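
The randomized test and the shapely helper both depend on the offsets layout: each offset is the index of a linestring's first point, so the offsets are an exclusive prefix sum of the per-linestring point counts. The sketch below shows that bookkeeping on the CPU with NumPy standing in for CuPy. The variable names are illustrative, and the explicit `int32` cast is an assumption added here for clarity; the test above builds float offsets and relies on the Python wrapper to convert them.

```python
import numpy as np

rng = np.random.RandomState(0)
num_linestrings = 5

# Each linestring gets 2 to 9 points (randint's upper bound is exclusive).
lengths = rng.randint(2, 10, size=num_linestrings)

# Inclusive cumulative sum gives the end index of every linestring;
# shifting it right by one and prepending 0 gives the start offsets.
ends = np.cumsum(lengths)
offsets = np.concatenate(([0], ends[:-1])).astype("int32")
num_points = int(ends[-1])

# Recover, for every point index, the 1-based id of the linestring that
# owns it. side="right" makes a point sitting exactly on an offset count
# as the first point of the linestring starting there.
group_ids = np.searchsorted(offsets, np.arange(num_points), side="right")

print(lengths, offsets, group_ids)
```

Grouping points by `group_ids` yields one group per linestring, which is how the shapely helper rebuilds `LineString` objects from the flat coordinate columns.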