Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize the loop for writing varints in ProtoBuf #1294

Merged
merged 1 commit into from
Jan 19, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion benchmark/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ dependencies {
compileOnly 'com.google.auto.value:auto-value:1.4.1'
compileOnly 'com.ryanharter.auto.value:auto-value-gson:0.4.6'

compile "org.openjdk.jmh:jmh-core:1.21"
compile "org.openjdk.jmh:jmh-core:1.22"
implementation 'com.google.guava:guava:24.1.1-jre'

compile 'com.squareup.okio:okio:1.13.0'
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ import kotlinx.serialization.*

internal class ByteArrayInput(private var array: ByteArray, private val endIndex: Int = array.size) {
private var position: Int = 0
public val availableBytes: Int get() = endIndex - position
private val availableBytes: Int get() = endIndex - position

fun slice(size: Int): ByteArrayInput {
ensureEnoughBytes(size)
Expand Down Expand Up @@ -124,6 +124,17 @@ internal class ByteArrayInput(private var array: ByteArray, private val endIndex
}

internal class ByteArrayOutput {

private companion object {
/*
* Map number of leading zeroes -> varint size
* See the explanation in this blogpost: https://richardstartin.github.io/posts/dont-use-protobuf-for-telemetry
*/
private val VAR_INT_LENGTHS = IntArray(65) {
(63 - it) / 7
}
}

private var array: ByteArray = ByteArray(32)
private var position: Int = 0

Expand All @@ -136,11 +147,11 @@ internal class ByteArrayOutput {
array = newArray
}

public fun size(): Int {
fun size(): Int {
return position
}

public fun toByteArray(): ByteArray {
fun toByteArray(): ByteArray {
val newArray = ByteArray(position)
array.copyInto(newArray, startIndex = 0, endIndex = this.position)
return newArray
Expand Down Expand Up @@ -189,42 +200,33 @@ internal class ByteArrayOutput {
}

fun encodeVarint32(value: Int) {
ensureCapacity(5)
// Fast-path: unrolled loop for single byte
ensureCapacity(5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why now for long we ensure dynamic length but here always 5?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because 32-bit varints are used on a regular basis for tags, it's always beneficial to allocate all the space.
Also, because tags are small, one-byte fast-path is used and we do not have to calculate varint length at all for such values

if (value and 0x7F.inv() == 0) {
array[position++] = value.toByte()
return
}
// Fast-path: unrolled loop for two bytes
var current = value
array[position++] = (current or 0x80).toByte()
current = current ushr 7
if (value and 0x7F.inv() == 0) {
array[position++] = value.toByte()
return
}
encodeVarint32SlowPath(current)
val length = varIntLength(value.toLong())
encodeVarint(value.toLong(), length)
}

fun encodeVarint64(value: Long) {
val length = varIntLength(value)
ensureCapacity(length + 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest do +1 inside varIntLength or VAR_INT_LENGTHS init block to match the name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it will slightly complicate the loop (have to do decrement), let's leave it as is

encodeVarint(value, length)
}

private fun encodeVarint32SlowPath(value: Int) {
private inline fun encodeVarint(value: Long, length: Int) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unlikely that someone now is using a 32-bit CPU but these calculations can be slow on them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate on what you mean here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current now have long type, in old encodeVarint32SlowPath it was int. So perhaps for 32-bit CPU and 32-bit varints operations and, or, ushr will be slower and total speed will degrade.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally do not care about 32-bit platforms

var current = value
while (current and 0x7F.inv() != 0) {
array[position++] = ((current and 0x7F) or 0x80).toByte()
for (i in 0 until length) {
array[position + i] = ((current and 0x7F) or 0x80).toByte()
current = current ushr 7
}
array[position++] = (current and 0x7F).toByte()
array[position + length] = current.toByte()
position += length + 1
}

fun encodeVarint64(value: Long) {
ensureCapacity(10)
var currentValue = value
while (true) {
if (currentValue and 0x7F.inv() == 0L) {
array[position++] = currentValue.toByte()
return
}
array[position++] = (currentValue.toInt() and 0x7F or 0x80).toByte()
currentValue = currentValue ushr 7
}
private fun varIntLength(value: Long): Int {
return VAR_INT_LENGTHS[value.countLeadingZeroBits()]
}
}