Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make string Injections safer. Fix #199 #237

Merged
merged 2 commits into from
Jan 25, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
package com.twitter.bijection

import java.util.concurrent.atomic.AtomicReference

/**
* This class is for sharing an object between threads
* when allocation of a new one is not that cheap, but
* better than blocking.
*/
private[bijection] class AtomicSharedState[T <: AnyRef](cons: () => T) {
private[this] val ref = new AtomicReference[T](cons())

def get: T = {
val borrow = ref.getAndSet(null.asInstanceOf[T])
if (null == borrow) cons()
else borrow
}
def release(t: T): Unit = {
ref.lazySet(t)
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -16,13 +16,13 @@ limitations under the License.

package com.twitter.bijection

import com.twitter.bijection.Inversion.attempt
import java.net.{ URLDecoder, URLEncoder, URL }
import java.nio.charset.{ Charset, CoderResult, CodingErrorAction }
import java.nio.{ ByteBuffer, CharBuffer }
import java.util.UUID

import scala.annotation.tailrec
import scala.collection.generic.CanBuildFrom

import com.twitter.bijection.Inversion.attempt
import scala.util.Try

trait StringInjections extends NumericInjections {
Expand All @@ -32,9 +32,40 @@ trait StringInjections extends NumericInjections {

def withEncoding(encoding: String): Injection[String, Array[Byte]] =
new AbstractInjection[String, Array[Byte]] {
private[this] val decRef =
new AtomicSharedState({ () =>
val dec = Charset.forName(encoding)
.newDecoder
.onUnmappableCharacter(CodingErrorAction.REPORT)
val buf = CharBuffer.allocate(1024) // something big enough, if not big enough, allocate
(dec, buf)
})

def apply(s: String) = s.getBytes(encoding)
override def invert(b: Array[Byte]) =
attempt(b) { new String(_, encoding) }
// toString on ByteBuffer is nicer, so use it
attempt(ByteBuffer.wrap(b)) { bb =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do an extra allocation of a ByteBuffer class here always now, should we maybe only put this in the error handling portion of things?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the decoder requires a ByteBuffer, so bb is that. I can't seem to find a way to avoid that allocation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok, from reading the comment i thought we were using ByteBuffer so the toString in the event of an error was nicer. It looks like we mean the toString function on the decoder is better?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the attempt function includes a toSting of the argument, which in this case is the ByteBuffer.

//these are mutable, so it can't be shared trivially
//avoid GC pressure and (probably) perform better
val decBuf = decRef.get
val dec = decBuf._1
val buf = decBuf._2
val maxSpace = (b.length * dec.maxCharsPerByte).toInt + 1
val thisBuf = if (maxSpace > buf.limit) CharBuffer.allocate(maxSpace) else buf

// this is the error free result
@inline def assertUnderFlow(cr: CoderResult) { if(!cr.isUnderflow) cr.throwException }
assertUnderFlow(dec.reset.decode(bb, thisBuf, true))
assertUnderFlow(dec.flush(thisBuf))
// set the limit to be the position
thisBuf.flip
val str = thisBuf.toString
// make sure the buffer we store is clear.
buf.clear
// we don't reset with the larger buffer to avoid memory leaks
decRef.release(decBuf)
str
}
}

// Some bijections with string from standard java/scala classes:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ object StringArbs extends BaseProperties {
class StringBijectionLaws extends CheckProperties with BaseProperties {

property("round trips string -> Array[String]") {
isLooseInjection[String, Array[Byte]]
isInjection[String, Array[Byte]]
}

implicit val symbol = arbitraryViaFn { (s: String) => Symbol(s) }
Expand Down