Databricks client - Add support for array type (#274)
* Support array types

* Increment package version

* Make test more readable

* Update documentation
Sondergaard authored Nov 17, 2023
1 parent e5d0229 commit 55e04ad
Showing 6 changed files with 57 additions and 2 deletions.
18 changes: 18 additions & 0 deletions source/Databricks/documents/documentation.md
@@ -79,6 +79,24 @@ await foreach (var record in records)
allSheldons.Add(new Person(record.name, record.date));
```

A query result can contain array columns. With the Apache Arrow format, an array value is returned as `object[]`. With the JsonArray format, it is returned as a string containing a JSON array of strings.

```c#
var statement = DatabricksStatement.FromRawSql(
@"SELECT a, b FROM VALUES
('one', array(0, 1)),
('two', array(2, 3)) AS data(a, b);").Build();

var result = client.ExecuteStatementAsync(statement, Format.ApacheArrow);
var row = await result.FirstAsync();

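// Apache Arrow: the array arrives as object[]
var arrowValues = ((object[])row.b).OfType<int>();

// JsonArray: the array arrives as a JSON string array
// (applies when the statement is executed with the JsonArray format instead)
var jsonValues = JsonConvert.DeserializeObject<string[]>((string)row.b).Select(int.Parse);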
```

#### Adhoc queries

It's possible to create adhoc queries from `DatabricksStatement` class.
5 changes: 5 additions & 0 deletions source/Databricks/documents/release-notes/release-notes.md
@@ -1,5 +1,10 @@
# Databricks Release Notes

## Version 8.2.0

- Add support for ListArray in Apache Arrow format
- Update documentation with usage guide for Apache Arrow format and JsonArray format for arrays

## Version 8.1.0

- Create objects from statements as an alternative to dynamic
2 changes: 1 addition & 1 deletion source/Databricks/source/Jobs/Jobs.csproj
@@ -31,7 +31,7 @@ limitations under the License.

<PropertyGroup>
<PackageId>Energinet.DataHub.Core.Databricks.Jobs</PackageId>
<PackageVersion>8.1.0$(VersionSuffix)</PackageVersion>
<PackageVersion>8.2.0$(VersionSuffix)</PackageVersion>
<Title>Databricks Jobs</Title>
<Company>Energinet-DataHub</Company>
<Authors>Energinet-DataHub</Authors>
@@ -28,6 +28,25 @@ public DatabricksStatementsTests(DatabricksSqlWarehouseFixture sqlWarehouseFixtu
_sqlWarehouseFixture = sqlWarehouseFixture;
}

[Fact]
public async Task CanHandleArrayTypes()
{
// Arrange
var client = _sqlWarehouseFixture.CreateSqlStatementClient();
var statement = DatabricksStatement.FromRawSql(
@"SELECT a, b FROM VALUES
('one', array(0, 1)),
('two', array(2, 3)) AS data(a, b);").Build();

// Act
var result = client.ExecuteStatementAsync(statement, Format.ApacheArrow);
var row = await result.FirstAsync();
var values = ((object[])row.b).OfType<int>();

// Assert
values.Should().BeEquivalentTo(new[] { 0, 1 });
}

[Fact]
public async Task ExecuteStatement_FromRawSqlWithParameters_ShouldReturnRows()
{
@@ -38,6 +38,19 @@ internal static class IArrowArrayExtensions
TimestampArray timestampArray => timestampArray.GetTimestamp(i),
Decimal128Array decimal128Array => decimal128Array.GetValue(i),
StringArray stringArray => stringArray.GetString(i),
ListArray listArray => ReadArray(listArray, i),
_ => throw new NotSupportedException($"Unsupported data type {arrowArray}"),
};

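private static object? ReadArray(ListArray array, int i)
{
// ValueOffsets[i] and ValueOffsets[i + 1] bound the elements of list i.
// array.Length is the number of lists in the column, not the length of list i.
var offset = array.ValueOffsets[i];
var length = array.ValueOffsets[i + 1] - offset;
var objectArray = new object?[length];
for (var j = 0; j < length; j++)
{
objectArray[j] = array.Values.GetValue(j + offset);
}

return objectArray;
}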
}
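
For readers unfamiliar with the layout `ReadArray` walks: an Arrow list column keeps every element in one flat child array and uses an offsets buffer, with one more entry than there are rows, to mark where each row's list starts and ends. The sketch below is not part of this commit. It imitates that layout with plain C# arrays, and `ReadList`, `values`, and `offsets` are hypothetical names chosen for illustration.

```csharp
using System;

// Hypothetical flattened layout for two rows: [0, 1] and [2, 3, 4].
int[] values = { 0, 1, 2, 3, 4 };
int[] offsets = { 0, 2, 5 }; // one more entry than there are rows

// Copies the elements of row i out of the flat values array.
static int[] ReadList(int[] values, int[] offsets, int i)
{
    var start = offsets[i];
    var length = offsets[i + 1] - start; // length of row i only
    var list = new int[length];
    Array.Copy(values, start, list, 0, length);
    return list;
}

Console.WriteLine(string.Join(",", ReadList(values, offsets, 0)));
Console.WriteLine(string.Join(",", ReadList(values, offsets, 1)));
```

The point of the sketch is that the length of a row is the difference between two adjacent offsets, so rows of different lengths can share one child array.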
@@ -30,7 +30,7 @@ limitations under the License.

<PropertyGroup>
<PackageId>Energinet.DataHub.Core.Databricks.SqlStatementExecution</PackageId>
<PackageVersion>8.1.0$(VersionSuffix)</PackageVersion>
<PackageVersion>8.2.0$(VersionSuffix)</PackageVersion>
<Title>Databricks SQL Statement Execution</Title>
<Company>Energinet-DataHub</Company>
<Authors>Energinet-DataHub</Authors>