-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problem parsing large messages (i.e. 1MB) #151
Comments
I need some details about this. How do you use this method? Do you have UnitTest or example? |
I will try to give you more details or spend time building a UT for it. In the meantime, understand that the problem is very localized to the input.read call I am referring to. Since you do not check/use the return value (the actual number of bytes read or 0 (EOF) or -1 (error)), you basically assume input.read will always successfully read the full "maxLength" number of bytes. That is not always the case as explained in the InputStream.read() doc. |
Can you try last version of protobuf-swift repository? |
Thanks for bring the codebase up to speed with Swift 3.0, I've already moved to it. I haven't yet tried with large messages, I will soon, but I don't see how anything could be different, since the return value of inputStream.read() is still not being checked. |
You talk about "length-delimited encoding/decoding"... If your file/stream has a correct format, then all will be ok. Please look test case: func testDelimitedEncodingEncoding()
{
do {
let str = (NSBundle(forClass:TestUtilities.self).resourcePath! as NSString).stringByAppendingPathComponent("delimitedFile.dat")
let stream = NSOutputStream(toFileAtPath: str, append: false)
stream?.open()
for i in 1...100 {
let mes = try PBProtoPoint.Builder().setLatitude(1.0).setLongitude(Float(i)).build()
try mes.writeDelimitedToOutputStream(stream!)
}
stream?.close()
let input = NSInputStream(fileAtPath: str)
input?.open()
let message = try PBProtoPoint.parseArrayDelimitedFromInputStream(input!)
XCTAssertTrue(message[1].longitude == 2.0, "")
}
catch
{
XCTFail("Fail testDelimitedEncodingEncoding")
}
} I compare return value with "!= 1", then return nil(== Invalid ProtocolBuffers stream) value. |
Alex, Maybe my understanding is still off, but I do think there is a clear problem here. I think your test is only succeeding because your file-based InputStream is able to keep up and always read the "maxLength" requested fully... Let me try to describe it better. I have a very simple "Data" message (simplified here): message DataMessage {
bytes data = 1;
} The code I use to read/receive this message is like this: func processMessage() {
do {
if let msg = try DataMessage.parseDelimitedFrom(inputStream: inputStream!) {
onMessage?(msg)
}
print("No more message for now")
} catch {
print("Parsing error: \(error)")
}
} Important: My InputStream is a TCP connection here (Not a filesystem-based IO) When the message/data size is relatively small (1K to 30K), things work just fine.... When the size gets bigger, here is what happens,
//Delimited Encoding/Decoding
public func mergeDelimitedFrom(inputStream: InputStream) throws -> Self? {
var firstByte:UInt8 = 0
if inputStream.read(&firstByte, maxLength: 1) != 1 {
return nil
}
let rSize = try CodedInputStream.readRawVarint32(firstByte: firstByte, inputStream: inputStream)
let data = Data(bytes: [0],count: Int(rSize))
let pointer = UnsafeMutablePointer<UInt8>((data as NSData).bytes)
// Original
// inputStream.read(pointer, maxLength: Int(rSize))
// Now just printing the return value if different then rSize
let actuallyReadSize = inputStream.read(pointer, maxLength: Int(rSize))
if (actuallyReadSize != Int(rSize)) {
print("DANGER: actuallyReadSize(\(actuallyReadSize)) != rSize(\(rSize))")
}
return try mergeFrom(data: data)
} I do see the following printout:
Yes, clearly from this, you do the proper checking "!= 1" for the "firstByte" reading. No issue with that. The problem is that you do no such check on the second inputStream.read() and there is no guarantee that the full data can be read (in one shot)..... My understanding is that InputStream.read() is a non-blocking call that will only return what it has read (up to maxLength). Let me know if this makes it clearer or if I am still missing something. I will see how I can come up with a solution, but I don't know your codebase too well, so it may take some time. Cheers. G. |
@alexeyxo could you just confirm to me that you too consider this as a bug? |
FYI, I still "know" that there is a bug here, but for now, instead of decoding the message directly from the socket InputStream, I first make sure I read all the data for the message (into a Data buffer) and only then use CodedInputStream.... |
Hi @gegles Kind regards, |
I had tried something like
But still like it doesn't work. The protobufMessage sill be nil Do you have any idea? |
Version of protoc (
protoc --version
)3.0
Version of ProtocolBuffers.framework
ProtoBuf3.0-Swift3.0 branch (but most likely all the branches)
.proto
file to reproduceDescription
If this input.read is not able to read the full rSize of the message but only a subset, the function still returns a message with only the partial data.
At a minimum, this should give an error or a way for users to detect the issue, at best, a mechanism to progressively build the message (in multiple reads) should be provided.
Let me know I am completely off here or if more details are needed.
The text was updated successfully, but these errors were encountered: