Trouble getting this to work with BigQuery #89

Closed
halomakes opened this issue Nov 8, 2022 · 5 comments

@halomakes

What is the bug?
Unable to deserialize data from BigQuery Storage API

If I try to read the data using the OpenDeserializer method as shown below, I get the following exception:

var response = _bqStorage.ReadRows(stream.ReadStreamName, default);

await foreach (var avroEntry in response.GetResponseStream())
{
    using var ms = new MemoryStream(avroEntry.AvroRows.SerializedBinaryRows.ToByteArray());
    using var reader = AvroConvert.OpenDeserializer<TResult>(ms);
    while (reader.HasNext())
    {
        var item = reader.ReadNext();
        yield return item;
    }
}

SolTechnology.Avro.Infrastructure.Exceptions.AvroTypeMismatchException: Unable to deserialize [__root__] of schema [Record] to the target type [Ion.Reporting.Infrastructure.Reports.MeterReadings.MeterReadingEntry]. Inner exception:
 ---> SolTechnology.Avro.Infrastructure.Exceptions.AvroTypeMismatchException: Unable to deserialize [union] of schema [Union] to the target type [System.String]. Inner exception:
 ---> System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')
   at System.Collections.Generic.List`1.get_Item(Int32 index)
   at System.Collections.ObjectModel.ReadOnlyCollection`1.get_Item(Int32 index)
   at SolTechnology.Avro.AvroObjectServices.Read.Resolver.ResolveUnion(UnionSchema writerSchema, TypeSchema readerSchema, IReader d, Type type)
   at SolTechnology.Avro.AvroObjectServices.Read.Resolver.Resolve(TypeSchema writerSchema, TypeSchema readerSchema, IReader reader, Type type)
   --- End of inner exception stack trace ---
   at SolTechnology.Avro.AvroObjectServices.Read.Resolver.Resolve(TypeSchema writerSchema, TypeSchema readerSchema, IReader reader, Type type)
   at SolTechnology.Avro.AvroObjectServices.Read.Resolver.<>c__DisplayClass19_1.<ResolveRecord>b__1()
   at SolTechnology.Avro.AvroObjectServices.Read.Resolver.ResolveRecord(RecordSchema writerSchema, RecordSchema readerSchema, IReader dec, Type type)
   at SolTechnology.Avro.AvroObjectServices.Read.Resolver.Resolve(TypeSchema writerSchema, TypeSchema readerSchema, IReader reader, Type type)
   --- End of inner exception stack trace ---
   at SolTechnology.Avro.AvroObjectServices.Read.Resolver.Resolve(TypeSchema writerSchema, TypeSchema readerSchema, IReader reader, Type type)
   at SolTechnology.Avro.AvroObjectServices.Read.Resolver.Resolve[T](IReader reader, Int64 itemsCount)
   at SolTechnology.Avro.Features.DeserializeByLine.LineReaders.ListLineReader`1.ReadNext()
   at Ion.Reporting.Infrastructure.BigQuery.BigQueryService.QueryAsync[TResult](BqStorageQuery query, CancellationToken cancellationToken)+MoveNext() in /Users/alexgriffith/Code/Ion.Reporting/src/Infrastructure/BigQuery/BigQueryService.cs:line 105
   at Ion.Reporting.Infrastructure.BigQuery.BigQueryService.QueryAsync[TResult](BqStorageQuery query, CancellationToken cancellationToken)+MoveNext() in /Users/alexgriffith/Code/Ion.Reporting/src/Infrastructure/BigQuery/BigQueryService.cs:line 94
   at Ion.Reporting.Infrastructure.BigQuery.BigQueryService.QueryAsync[TResult](BqStorageQuery query, CancellationToken cancellationToken)+System.Threading.Tasks.Sources.IValueTaskSource<System.Boolean>.GetResult()
   at System.Linq.AsyncEnumerable.<ToListAsync>g__Core|424_0[TSource](IAsyncEnumerable`1 source, CancellationToken cancellationToken) in /_/Ix.NET/Source/System.Linq.Async/System/Linq/Operators/ToList.cs:line 36

Using the method mentioned in #69, as shown below, does not throw any exception, but AvroConvert.DeserializeHeadless<List<TResult>> always returns an empty list, even though I can see that avroEntry.AvroRows.SerializedBinaryRows holds a significant amount of content.

var response = _bqStorage.ReadRows(stream.ReadStreamName, default);

await foreach (var avroEntry in response.GetResponseStream())
{
    var items = AvroConvert.DeserializeHeadless<List<TResult>>(
        avroEntry.AvroRows.SerializedBinaryRows.ToByteArray(), readSession.AvroSchema.Schema);
    foreach (var item in items)
    {
        yield return item;
    }
}

I have tried both of these methods with my handwritten row class (with and without the DataContractAttribute) as well as with the example class generated on the AvroConvert site from the BigQuery schema, and got the same results.

My row class:

[DataContract(Name = "__root__", Namespace = null)]
public class MeterReadingEntry
{
    [DataMember(Name = "Property_RentRoll_ID")]
    public string? RentRollId { get; set; }

    [DataMember(Name = "PropertyID")] public string? PropertyId { get; set; }

    [DataMember(Name = "header_serial_number")]
    public string? SerialNumber { get; set; }

    [DataMember(Name = "Gallons")] public double? Gallons { get; set; }

    [DataMember(Name = "Events")] public double? Events { get; set; }

    [DataMember(Name = "Flowtime")] public double? FlowTime { get; set; }

    [DataMember(Name = "hour")] public DateTime? Hour { get; set; }

    [DataMember(Name = "Running_hours")] public long? RunningHours { get; set; }

    [DataMember(Name = "CatchupFlag")] public string? CatchupFlag { get; set; }

    [DataMember(Name = "Leak_Status")] public string? LeakStatus { get; set; }

    [DataMember(Name = "Leak_Details")] public string? LeakDetails { get; set; }

    [DataMember(Name = "toilet_leak")] public double? ToiletLeak { get; set; }

    [DataMember(Name = "Miscellaneous_Leaks")]
    public double? MiscellaneousLeak { get; set; }

    [DataMember(Name = "daily_leak_gallon_total")]
    public double? DailyGallonsLeaked { get; set; }

    [DataMember(Name = "daily_leak_status")]
    public string? DailyLeakStatus { get; set; }

    [DataMember(Name = "Leak_Gallons_22_hours")]
    public double? LeakGallons22Hours { get; set; }

    [DataMember(Name = "DateLeakStarted")] public DateTime? DateLeakStarted { get; set; }

    [DataMember(Name = "Total_gallons_since_leak")]
    public double? GallonsSinceLastLeak { get; set; }

    [DataMember(Name = "last_leaking_at")] public DateTime? LastLeakingAt { get; set; }

    [DataMember(Name = "DPOE_hourly_events_hot_cold")]
    public string? DpoeHourlyEvents { get; set; }

    [DataMember(Name = "DPOE_hourly_gallons_hot_cold")]
    public string? DpoeHourlyGallons { get; set; }

    [DataMember(Name = "DPOE_hourly_flowtime_hot_cold")]
    public string? DpoeHourlyFlowTime { get; set; }

    [DataMember(Name = "Developer")] public string? DeveloperName { get; set; }

    [DataMember(Name = "GPD_Filter")] public string? GpdFilter { get; set; }

    [DataMember(Name = "FlowtimeHHmm")] public string? FlowTimeFormatted { get; set; }

    [DataMember(Name = "ran_more_than_22h")]
    public string? RanMoreThan22Hours { get; set; }

    [DataMember(Name = "hourly_reading_status_num")]
    public long? HourlyReadingStatus { get; set; }

    [DataMember(Name = "Days_Repeating")] public long? DaysRepeating { get; set; }

    [DataMember(Name = "Unit_Details")] public string? UnitDetails { get; set; }

    [DataMember(Name = "Hours_running_55min_Daily")]
    public long? HoursRunningMoreThan55Minutes { get; set; }

    [DataMember(Name = "Leak_Gallons")] public double? LeakGallons { get; set; }

    [DataMember(Name = "Orange_Leak_Gallons")]
    public double? WarningLeakGallons { get; set; }

    [DataMember(Name = "Red_Leak_Gallons")]
    public double? UrgentLeakGallons { get; set; }

    [DataMember(Name = "Latest_Leak_Status")]
    public string? LatestLeakStatus { get; set; }

    [DataMember(Name = "duration_since_last_reading")]
    public string? TimeSinceLastReading { get; set; }
}

Output of AvroConvert.GenerateSchema(typeof(MeterReadingEntry)):

{
    "name": "__root__",
    "namespace": "Reporting.Infrastructure.MeterReadings",
    "type": "record",
    "fields": [
        {
            "name": "Property_RentRoll_ID",
            "aliases": [
                "RentRollId"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "PropertyID",
            "aliases": [
                "PropertyId"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "header_serial_number",
            "aliases": [
                "SerialNumber"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Gallons",
            "aliases": [
                "Gallons"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Events",
            "aliases": [
                "Events"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Flowtime",
            "aliases": [
                "FlowTime"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "hour",
            "aliases": [
                "Hour"
            ],
            "type": [
                "null",
                {
                    "type": "long",
                    "logicalType": "timestamp-millis"
                }
            ]
        },
        {
            "name": "Running_hours",
            "aliases": [
                "RunningHours"
            ],
            "type": [
                "null",
                "long"
            ]
        },
        {
            "name": "CatchupFlag",
            "aliases": [
                "CatchupFlag"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Leak_Status",
            "aliases": [
                "LeakStatus"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Leak_Details",
            "aliases": [
                "LeakDetails"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "toilet_leak",
            "aliases": [
                "ToiletLeak"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Miscellaneous_Leaks",
            "aliases": [
                "MiscellaneousLeak"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "daily_leak_gallon_total",
            "aliases": [
                "DailyGallonsLeaked"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "daily_leak_status",
            "aliases": [
                "DailyLeakStatus"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Leak_Gallons_22_hours",
            "aliases": [
                "LeakGallons22Hours"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "DateLeakStarted",
            "aliases": [
                "DateLeakStarted"
            ],
            "type": [
                "null",
                {
                    "type": "long",
                    "logicalType": "timestamp-millis"
                }
            ]
        },
        {
            "name": "Total_gallons_since_leak",
            "aliases": [
                "GallonsSinceLastLeak"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "last_leaking_at",
            "aliases": [
                "LastLeakingAt"
            ],
            "type": [
                "null",
                {
                    "type": "long",
                    "logicalType": "timestamp-millis"
                }
            ]
        },
        {
            "name": "DPOE_hourly_events_hot_cold",
            "aliases": [
                "DpoeHourlyEvents"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "DPOE_hourly_gallons_hot_cold",
            "aliases": [
                "DpoeHourlyGallons"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "DPOE_hourly_flowtime_hot_cold",
            "aliases": [
                "DpoeHourlyFlowTime"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Developer",
            "aliases": [
                "DeveloperName"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "GPD_Filter",
            "aliases": [
                "GpdFilter"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "FlowtimeHHmm",
            "aliases": [
                "FlowTimeFormatted"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "ran_more_than_22h",
            "aliases": [
                "RanMoreThan22Hours"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "hourly_reading_status_num",
            "aliases": [
                "HourlyReadingStatus"
            ],
            "type": [
                "null",
                "long"
            ]
        },
        {
            "name": "Days_Repeating",
            "aliases": [
                "DaysRepeating"
            ],
            "type": [
                "null",
                "long"
            ]
        },
        {
            "name": "Unit_Details",
            "aliases": [
                "UnitDetails"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Hours_running_55min_Daily",
            "aliases": [
                "HoursRunningMoreThan55Minutes"
            ],
            "type": [
                "null",
                "long"
            ]
        },
        {
            "name": "Leak_Gallons",
            "aliases": [
                "LeakGallons"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Orange_Leak_Gallons",
            "aliases": [
                "WarningLeakGallons"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Red_Leak_Gallons",
            "aliases": [
                "UrgentLeakGallons"
            ],
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Latest_Leak_Status",
            "aliases": [
                "LatestLeakStatus"
            ],
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "duration_since_last_reading",
            "aliases": [
                "TimeSinceLastReading"
            ],
            "type": [
                "null",
                "string"
            ]
        }
    ]
}

Schema provided by BigQuery:

{
    "type": "record",
    "name": "__root__",
    "fields": [
        {
            "name": "Property_RentRoll_ID",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "PropertyID",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "header_serial_number",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Gallons",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Events",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Flowtime",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "hour",
            "type": [
                "null",
                {
                    "type": "long",
                    "logicalType": "timestamp-micros"
                }
            ]
        },
        {
            "name": "Running_hours",
            "type": [
                "null",
                "long"
            ]
        },
        {
            "name": "CatchupFlag",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Leak_Status",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Leak_Details",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "toilet_leak",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Miscellaneous_Leaks",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "daily_leak_gallon_total",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "daily_leak_status",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Leak_Gallons_22_hours",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "DateLeakStarted",
            "type": [
                "null",
                {
                    "type": "int",
                    "logicalType": "date"
                }
            ]
        },
        {
            "name": "Total_gallons_since_leak",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "last_leaking_at",
            "type": [
                "null",
                {
                    "type": "long",
                    "logicalType": "timestamp-micros"
                }
            ]
        },
        {
            "name": "DPOE_hourly_events_hot_cold",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "DPOE_hourly_gallons_hot_cold",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "DPOE_hourly_flowtime_hot_cold",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Developer",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "GPD_Filter",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "FlowtimeHHmm",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "ran_more_than_22h",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "hourly_reading_status_num",
            "type": [
                "null",
                "long"
            ]
        },
        {
            "name": "Days_Repeating",
            "type": [
                "null",
                "long"
            ]
        },
        {
            "name": "Unit_Details",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "Hours_running_55min_Daily",
            "type": [
                "null",
                "long"
            ]
        },
        {
            "name": "Leak_Gallons",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Orange_Leak_Gallons",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Red_Leak_Gallons",
            "type": [
                "null",
                "double"
            ]
        },
        {
            "name": "Latest_Leak_Status",
            "type": [
                "null",
                "string"
            ]
        },
        {
            "name": "duration_since_last_reading",
            "type": [
                "null",
                "string"
            ]
        }
    ]
}
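
The two schemas above differ in their logical types: BigQuery declares `hour` and `last_leaking_at` as `timestamp-micros` and `DateLeakStarted` as an `int` `date`, while the schema generated from the row class uses `timestamp-millis` for all three. A minimal sketch of how those two Avro logical types map to .NET values, assuming the Avro specification's definitions (microseconds and days since the Unix epoch). The helper names are hypothetical and not part of AvroConvert, and the input values are chosen only for illustration:

```csharp
using System;
using System.Globalization;

// timestamp-micros: a long counting microseconds since 1970-01-01T00:00:00Z.
// One microsecond is 10 .NET ticks (a tick is 100 ns).
static DateTime FromTimestampMicros(long micros) => DateTime.UnixEpoch.AddTicks(micros * 10);

// date: an int counting days since 1970-01-01.
static DateTime FromAvroDate(int days) => DateTime.UnixEpoch.AddDays(days);

// Illustrative values only.
Console.WriteLine(FromTimestampMicros(1_667_786_400_000_000).ToString("o"));
// prints 2022-11-07T02:00:00.0000000Z
Console.WriteLine(FromAvroDate(19_303).ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
// prints 2022-11-07
```

Reading a `timestamp-micros` value as if it were `timestamp-millis` would be off by a factor of 1000, so the row class may need `long`/`int` members (or a conversion step like the one above) for these fields.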

Here is some small sample data:

{
    "stats": {
        "progress": {
            "atResponseEnd": 5.617501486199217e-08
        }
    },
    "avroRows": {
        "serializedBinaryRows": "AgY2OTUCCDEyODQCIDAwMEQ2RjAwMTY0RjQ1QzkAAAACgKD3/vi19gUCAAAAAAIAAAAAAAAAAAIAAAAAAAAAAAIAAAAAAAAAAAIWdG9pbGV0IGxlYWsAAAAAAgowIHwgMAIKMCB8IDACCjAgfCAwAhBEb21pbml1bQIOSW5jbHVkZQAAAgAAApQBQnJpZ2h0b24gT2FrcyB8IEZsb29yIDMgfCAzMjggfCBtYWluIGZlZWQgY29sZCAtIERQT0UgfCAyIEJkIC0gMiBCYSB8RjQ1QzkAAAAAAihObyByZWFkaW5nIHlldCB0b2RheQISMjQgZCAxNSBoAgY5OTACCDEwODQCIDAwMEQ2RjAwMTQ1ODBGMzQCAAAAAAAAAAACAAAAAAAAAAACAAAAAAAAAAACgKD3/vi19gUCAAACDk5vIExlYWsCDk5vIExlYWsCAAAAAAAAAAACAAAAAAAAAAACAAAAAAAAAAACFnRvaWxldCBsZWFrAgAAAAAAAAAAAAAAAgowIHwgMAIKMCB8IDACCjAgfCAwAhBEb21pbml1bQIOSW5jbHVkZQIUMCBIciAxIE1pbgACAgAClgFWaWxsYWdlIEdyZWVuIHwgU2VuaW9yIDQ2MC0wNCB8IDQxMSB8IGZ1bGwgYmF0aCB0b2lsZXQgfCAxIEJkIC0gMSBCYSB8ODBGMzQCAAIAAAAAAAAAAAIAAAAAAAAAAAIAAAAAAAAAAAIOTm8gTGVhawIGMSBo"
    },
    "rowCount": "2",
    "avroSchema": {
        "schema": "{\n    \"type\": \"record\",\n    \"name\": \"__root__\",\n    \"fields\": [\n        {\n            \"name\": \"Property_RentRoll_ID\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"PropertyID\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"header_serial_number\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"Gallons\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"Events\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"Flowtime\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"hour\",\n            \"type\": [\n                \"null\",\n                {\n                \"type\": \"long\",\n                \"logicalType\": \"timestamp-micros\"\n                }\n            ]\n        },\n        {\n            \"name\": \"Running_hours\",\n            \"type\": [\n                \"null\",\n                \"long\"\n            ]\n        },\n        {\n            \"name\": \"CatchupFlag\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"Leak_Status\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"Leak_Details\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"toilet_leak\",\n            
\"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"Miscellaneous_Leaks\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"daily_leak_gallon_total\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"daily_leak_status\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"Leak_Gallons_22_hours\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"DateLeakStarted\",\n            \"type\": [\n                \"null\",\n                {\n                \"type\": \"int\",\n                \"logicalType\": \"date\"\n                }\n            ]\n        },\n        {\n            \"name\": \"Total_gallons_since_leak\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"last_leaking_at\",\n            \"type\": [\n                \"null\",\n                {\n                \"type\": \"long\",\n                \"logicalType\": \"timestamp-micros\"\n                }\n            ]\n        },\n        {\n            \"name\": \"DPOE_hourly_events_hot_cold\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"DPOE_hourly_gallons_hot_cold\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"DPOE_hourly_flowtime_hot_cold\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            
\"name\": \"Developer\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"GPD_Filter\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"FlowtimeHHmm\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"ran_more_than_22h\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"hourly_reading_status_num\",\n            \"type\": [\n                \"null\",\n                \"long\"\n            ]\n        },\n        {\n            \"name\": \"Days_Repeating\",\n            \"type\": [\n                \"null\",\n                \"long\"\n            ]\n        },\n        {\n            \"name\": \"Unit_Details\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"Hours_running_55min_Daily\",\n            \"type\": [\n                \"null\",\n                \"long\"\n            ]\n        },\n        {\n            \"name\": \"Leak_Gallons\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"Orange_Leak_Gallons\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"Red_Leak_Gallons\",\n            \"type\": [\n                \"null\",\n                \"double\"\n            ]\n        },\n        {\n            \"name\": \"Latest_Leak_Status\",\n            \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        },\n        {\n            \"name\": \"duration_since_last_reading\",\n         
   \"type\": [\n                \"null\",\n                \"string\"\n            ]\n        }\n    ]\n}"
    }
}
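
As a sanity check on the sample above, the first few fields can be decoded by hand. This is a minimal sketch that assumes plain Avro binary encoding as described in the Avro specification (zig-zag varints for `int`/`long`, a union branch index before each value, length-prefixed UTF-8 strings). The helper names are hypothetical and not part of AvroConvert, and the base64 string is only the first 56 characters of `serializedBinaryRows`:

```csharp
#nullable enable
using System;
using System.Text;

// Avro int/long: zig-zag varint, 7 bits per byte, least-significant group first;
// the high bit is set on every byte except the last.
static long ReadLong(byte[] buf, ref int pos)
{
    ulong raw = 0;
    int shift = 0;
    byte b;
    do
    {
        b = buf[pos++];
        raw |= (ulong)(b & 0x7F) << shift;
        shift += 7;
    } while ((b & 0x80) != 0);
    return (long)(raw >> 1) ^ -(long)(raw & 1); // undo zig-zag
}

// ["null", "string"]: branch index first; branch 1 is a length-prefixed UTF-8 string.
static string? ReadNullableString(byte[] buf, ref int pos)
{
    if (ReadLong(buf, ref pos) == 0) return null;
    int length = (int)ReadLong(buf, ref pos);
    string value = Encoding.UTF8.GetString(buf, pos, length);
    pos += length;
    return value;
}

// ["null", long]: branch 1 is a zig-zag varint (used here for timestamp-micros).
static long? ReadNullableLong(byte[] buf, ref int pos)
{
    if (ReadLong(buf, ref pos) == 0) return null;
    return ReadLong(buf, ref pos);
}

// Prefix of the sample payload, ending just after the "hour" field.
byte[] rows = Convert.FromBase64String("AgY2OTUCCDEyODQCIDAwMEQ2RjAwMTY0RjQ1QzkAAAACgKD3/vi19gUC");
int pos = 0;

Console.WriteLine(ReadNullableString(rows, ref pos)); // Property_RentRoll_ID: 695
Console.WriteLine(ReadNullableString(rows, ref pos)); // PropertyID: 1284
Console.WriteLine(ReadNullableString(rows, ref pos)); // header_serial_number: 000D6F00164F45C9

// Gallons, Events, Flowtime: each is a single 0x00 byte in this row (the "null" branch).
for (int i = 0; i < 3; i++)
    Console.WriteLine(ReadLong(rows, ref pos) == 0 ? "null" : "non-null");

long? hourMicros = ReadNullableLong(rows, ref pos);
Console.WriteLine(hourMicros);                                                   // 1667786400000000
Console.WriteLine(DateTime.UnixEpoch.AddTicks(hourMicros!.Value * 10).ToString("o")); // 2022-11-07T02:00:00.0000000Z
```

If the fields decode in schema order like this, it suggests the payload is a run of consecutive records with no container header and no array block count in front, which may be why neither `OpenDeserializer` nor `DeserializeHeadless<List<TResult>>` reads it as-is.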

What is the expected behavior?
Rows are deserialized successfully

Thanks for any help

@AdrianStrugala
Owner

Hello,

Thank you for reporting the issue. I will debug your sample and come back with a conclusion.

Regards,
Adrian

AdrianStrugala added the bug label Nov 8, 2022
@AdrianStrugala
Owner

AdrianStrugala commented Nov 9, 2022

That's a tough one. I've tried combinations of different serializers and schemas with no success. The data just does not seem to match the schema, which I find hard to believe.

Do you know if any codec was used during serialization? Or, better still, could you provide the code snippet used to serialize this data?

@halomakes
Author

Unfortunately, the serialization happens in Google's cloud. This is how the data comes from the BigQuery Storage API, so I have no transparency into what's happening on the other end.

@AdrianStrugala
Owner

Hello,
After several more tries, here is what I've found:

  • the data does not contain the Avro header
  • it is neither Snappy- nor Deflate-encoded
  • it most probably does not match the schema

I have run out of ways to deserialize the file with AvroConvert, so I tried other libraries, still with no success.

To proceed with the issue, I need a readable version of the data (C# or JSON) together with its BigQuery Avro representation. Only then would I be able to debug the deserialization part. My feeling is that some additional part is embedded in serializedBinaryRows (it could be an array start, a sync interval, or something similar) that would have to be excluded from deserialization.
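
The missing header can be checked mechanically. Per the Avro specification, an object container file starts with the four magic bytes `O`, `b`, `j`, `1` (the last one being the byte value 1, not the character). A minimal sketch, with a hypothetical helper name and only the first 16 base64 characters of the sample payload:

```csharp
using System;

// True when the buffer starts with the Avro object container magic: 'O', 'b', 'j', 0x01.
static bool HasAvroContainerHeader(ReadOnlySpan<byte> data) =>
    data.Length >= 4
    && data[0] == (byte)'O'
    && data[1] == (byte)'b'
    && data[2] == (byte)'j'
    && data[3] == 1;

// Prefix of serializedBinaryRows from the sample in this issue.
byte[] sample = Convert.FromBase64String("AgY2OTUCCDEyODQC");
Console.WriteLine(HasAvroContainerHeader(sample));    // False

// A buffer that does start with the container magic, for comparison.
byte[] container = { (byte)'O', (byte)'b', (byte)'j', 1, 0 };
Console.WriteLine(HasAvroContainerHeader(container)); // True
```

A payload without this header carries no embedded schema or codec metadata, so it would have to be read with the schema that the read session supplies separately.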

Regards,
Adrian

AdrianStrugala added the investigation label and removed the bug label Dec 13, 2022
@AdrianStrugala
Owner

Closed due to inactivity.
