Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PLAT-33341] Add Caffeine interface to limit Sjsonnet worker cache size #128

Merged
merged 29 commits into from
Oct 27, 2021
Merged

[PLAT-33341] Add Caffeine interface to limit Sjsonnet worker cache size #128

merged 29 commits into from
Oct 27, 2021

Conversation

tanmay-db
Copy link
Contributor

@tanmay-db tanmay-db commented Sep 1, 2021

What changes are proposed in this pull request?

The parse cache currently uses hashmap and does not limit the entries due to which high memory is used. Instead of that, caffeine cache interface is added.

This PR also contains fixing the JS and Client build of Sjsonnet since some of its tests are failing.

How is this tested?

Unit tests already written using mill ./mill __.test

@tanmay-db tanmay-db marked this pull request as draft September 1, 2021 11:44
@tanmay-db tanmay-db requested a review from iuliand-db September 1, 2021 11:44
@tanmay-db tanmay-db changed the title [Plat 33341] [Plat 33341] Add Caffeine Interface to Limit Sjsonnet worker cache size Sep 1, 2021
@tanmay-db tanmay-db changed the title [Plat 33341] Add Caffeine Interface to Limit Sjsonnet worker cache size [PLAT 33341] Add Caffeine Interface to Limit Sjsonnet worker cache size Sep 1, 2021
@tanmay-db tanmay-db changed the title [PLAT 33341] Add Caffeine Interface to Limit Sjsonnet worker cache size [PLAT-33341] Add Caffeine Interface to Limit Sjsonnet worker cache size Sep 1, 2021
@tanmay-db tanmay-db changed the title [PLAT-33341] Add Caffeine Interface to Limit Sjsonnet worker cache size [PLAT-33341] Add Caffeine interface to limit Sjsonnet worker cache size Sep 1, 2021
@iuliand-db iuliand-db requested a review from szeiger September 6, 2021 10:20
@@ -7,8 +7,24 @@ import java.nio.file.NoSuchFileException
import scala.util.Try
import scala.util.control.NonFatal

// JsonnetWorker (in universe) extends this trait in the universe so that it can pass the cache based on Caffeine to main0 here
trait ParseCacheInterface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this ParseCache and leave Interface out of the name, it's simpler and there's no confusion around what it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review iulian, I have updated the name


// As JsonnetWorker (in universe) passes an object to main0 which extends the trait: ParseCacheInterface, class based on HashMap
// which extends this trait is needed to use the default HashMap
case class HashMapDefault () extends ParseCacheInterface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is a bit too detailed. What you could say is something like A default implementation based on a mutable HashMap. This implementation is not thread-safe.

I'd make this a simple class (not a case class, because it's not "data" and we don't pattern match on it). The name could be DefaultParseCache or something like that. The name should reflect what the class does (it's a parse cache), not only how it's implemented.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's make this ParseCache and DefaultParseCache. Adding Interface to a name is an old Java practice that never caught on in Scala.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review iulian and Stefan, I have updated them.

val cache = new collection.mutable.HashMap[(Path, String), Either[Error, (Expr, FileScope)]]()

override def getOrElseUpdate(key: (Path, String), defaultValue: => Either[Error, (Expr, FileScope)]): Either[Error, (Expr, FileScope)] = {
cache.getOrElseUpdate((key._1, key._2), defaultValue)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't want to allocate a new tuple for (key._1, key._2) here, just use key directly.

object SjsonnetMain {
def createParseCache() = collection.mutable.HashMap[(Path, String), Either[Error, (Expr, FileScope)]]()
// def createParseCache() = collection.mutable.HashMap[(Path, String), Either[Error, (Expr, FileScope)]]()
def createParseCache() = new HashMapDefault
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The presence or absence of an argument list at the call site should generally match the definition (and this is often required in Scala). In this case it's better to change the definition (class DefaultParseCache instead of class DefaultParseCache()). Since it's a constructor, you already know that it always allocates a new object, and there are no further side-effects.

@tanmay-db tanmay-db marked this pull request as ready for review September 9, 2021 13:49
@tanmay-db tanmay-db requested a review from szeiger September 9, 2021 14:03
@@ -43,8 +43,8 @@ class CachedImporter(parent: Importer) extends Importer {

class CachedResolver(
parentImporter: Importer,
val parseCache: mutable.HashMap[(Path, String), Either[Error, (Expr, FileScope)]] = new mutable.HashMap
) extends CachedImporter(parentImporter) {
val parseCache: ParseCache = new DefaultParseCache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this default value so we can make sure we pass the right parse cache from all code paths.

This means you'll have to find all places where this method is called and decide if it should pass a fresh DefaultParseCache (if it's on a code path that's NOT coming from Bazel, meaning mostly tests) or make sure we have an external parse cache.

@@ -18,8 +18,8 @@ class Interpreter(extVars: Map[String, ujson.Value],
preserveOrder: Boolean = false,
strict: Boolean = false,
storePos: Position => Unit = null,
val parseCache: mutable.HashMap[(Path, String), Either[Error, (Expr, FileScope)]] = new mutable.HashMap,
) { self =>
val parseCache: ParseCache = new DefaultParseCache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, let's not have a default argument for the parse cache

Copy link
Collaborator

@szeiger szeiger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be rebased/updated so we can merge it for the release

build.sc Outdated
@@ -1,5 +1,5 @@
import mill._, scalalib._, publish._, scalajslib._, scalanativelib._, scalanativelib.api._
val sjsonnetVersion = "0.4.1"
val sjsonnetVersion = "0.4.1-SNAPSHOT"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave this at 0.4.1, we're going to cut a 0.4.1 release from it anyway. (We could update to 0.4.2-SNAPSHOT afterwards, but it's probably not worth the effort)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay updated

Comment on lines 27 to 30
// [Ask] SjsonnetServerMain has a valid main method (args0: Array[String]): Unit,
// [warn] but sjsonnet.SjsonnetServerMain will not have an entry point on the JVM.
// [warn] Reason: companion is a trait, which means no static forwarder can be generated.
// [warn] object SjsonnetServerMain extends SjsonnetServerMain[DefaultParseCache]{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's with this comment? Is this an issue that should be addressed?

Copy link
Contributor Author

@tanmay-db tanmay-db Oct 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This got generated when I ran ./mill __.test, I am not sure why this came but after some changes this doesn't come anymore.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I also got the warning. I haven't seen it before. It's probably a real issue, but not caused by this PR. SjsonnetServerMain was already a trait with a companion object.

@@ -6,7 +6,7 @@ import scala.scalajs.js.annotation.{JSExport, JSExportTopLevel}

@JSExportTopLevel("SjsonnetMain")
object SjsonnetMain {
def createParseCache() = collection.mutable.HashMap[(Path, String), Either[String, (Expr, FileScope)]]()
def createParseCache() = new DefaultParseCache
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we deprecate this method? We have the default encapsulated nicely in new DefaultParseCache now and it is used directly in several places, so this method seems superfluous.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense. Updated

@@ -8,7 +8,7 @@ import scala.util.Try
import scala.util.control.NonFatal

object SjsonnetMain {
def createParseCache() = collection.mutable.HashMap[(Path, String), Either[Error, (Expr, FileScope)]]()
def createParseCache() = new DefaultParseCache
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@tanmay-db tanmay-db changed the title [PLAT-33341] Add Caffeine interface to limit Sjsonnet worker cache size [PLAT-33341] Add Caffeine interface to limit Sjsonnet worker cache size Oct 14, 2021
Copy link
Collaborator

@szeiger szeiger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmarks and profiler still need to be updated:

$ sbt
[info] Loading global plugins from /Users/stefan.zeiger/.sbt/1.0/plugins
[info] Loading settings for project sjsonnet-build from plugins.sbt ...
[info] Loading project definition from /Users/stefan.zeiger/code/sjsonnet/project
[info] Loading settings for project sjsonnet from build.sbt ...
[info] Set current project to sjsonnet (in build file:/Users/stefan.zeiger/code/sjsonnet/)
[info] sbt server started at local:///Users/stefan.zeiger/.sbt/1.0/server/e8b66a493a9fe60a98ae/sock
sbt:sjsonnet> bench/jmh:run -jvm /Users/stefan.zeiger/.sdkman/candidates/java/16.0.0-zulu/bin/java -jvmArgs "-XX:+UseStringDeduplication" sjsonnet.MainBenchmark
[info] Compiling 6 Scala sources to /Users/stefan.zeiger/code/sjsonnet/bench/target/scala-2.13/classes ...
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/MainBenchmark.scala:22:18: not enough arguments for constructor Interpreter: (extVars: Map[String,ujson.Value], tlaVars: Map[String,ujson.Value], wd: sjsonnet.Path, importer: sjsonnet.Importer, parseCache: sjsonnet.ParseCache, settings: sjsonnet.Settings, storePos: sjsonnet.Position => Unit, warnLogger: String => Unit): sjsonnet.Interpreter.
[error] Unspecified value parameter parseCache.
[error]     val interp = new Interpreter(
[error]                  ^
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/MainBenchmark.scala:53:34: polymorphic expression cannot be instantiated to expected type;
[error]  found   : [K, V]scala.collection.mutable.HashMap[K,V]
[error]  required: sjsonnet.ParseCache
[error]       collection.mutable.HashMap.empty,
[error]                                  ^
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/MaterializerBenchmark.scala:30:19: not enough arguments for constructor Interpreter: (extVars: Map[String,ujson.Value], tlaVars: Map[String,ujson.Value], wd: sjsonnet.Path, importer: sjsonnet.Importer, parseCache: sjsonnet.ParseCache, settings: sjsonnet.Settings, storePos: sjsonnet.Position => Unit, warnLogger: String => Unit): sjsonnet.Interpreter.
[error] Unspecified value parameter parseCache.
[error]     this.interp = new Interpreter(
[error]                   ^
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/RunProfiler.scala:11:20: not enough arguments for constructor Interpreter: (extVars: Map[String,ujson.Value], tlaVars: Map[String,ujson.Value], wd: sjsonnet.Path, importer: sjsonnet.Importer, parseCache: sjsonnet.ParseCache, settings: sjsonnet.Settings, storePos: sjsonnet.Position => Unit, warnLogger: String => Unit): sjsonnet.Interpreter.
[error] Unspecified value parameter parseCache.
[error]   val interp = new Interpreter(
[error]                    ^
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/RunProfiler.scala:39:33: value valuesIterator is not a member of sjsonnet.ParseCache
[error]   val roots = interp.parseCache.valuesIterator.map(_.getOrElse(???)).map(_._1).toSeq
[error]                                 ^
[error] 5 errors found
[error] (bench / Compile / compileIncremental) Compilation failed
[error] Total time: 4 s, completed Oct 22, 2021 3:15:43 PM
sbt:sjsonnet> bench/run
[info] Compiling 6 Scala sources to /Users/stefan.zeiger/code/sjsonnet/bench/target/scala-2.13/classes ...
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/MainBenchmark.scala:22:18: not enough arguments for constructor Interpreter: (extVars: Map[String,ujson.Value], tlaVars: Map[String,ujson.Value], wd: sjsonnet.Path, importer: sjsonnet.Importer, parseCache: sjsonnet.ParseCache, settings: sjsonnet.Settings, storePos: sjsonnet.Position => Unit, warnLogger: String => Unit): sjsonnet.Interpreter.
[error] Unspecified value parameter parseCache.
[error]     val interp = new Interpreter(
[error]                  ^
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/MainBenchmark.scala:53:34: polymorphic expression cannot be instantiated to expected type;
[error]  found   : [K, V]scala.collection.mutable.HashMap[K,V]
[error]  required: sjsonnet.ParseCache
[error]       collection.mutable.HashMap.empty,
[error]                                  ^
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/MaterializerBenchmark.scala:30:19: not enough arguments for constructor Interpreter: (extVars: Map[String,ujson.Value], tlaVars: Map[String,ujson.Value], wd: sjsonnet.Path, importer: sjsonnet.Importer, parseCache: sjsonnet.ParseCache, settings: sjsonnet.Settings, storePos: sjsonnet.Position => Unit, warnLogger: String => Unit): sjsonnet.Interpreter.
[error] Unspecified value parameter parseCache.
[error]     this.interp = new Interpreter(
[error]                   ^
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/RunProfiler.scala:11:20: not enough arguments for constructor Interpreter: (extVars: Map[String,ujson.Value], tlaVars: Map[String,ujson.Value], wd: sjsonnet.Path, importer: sjsonnet.Importer, parseCache: sjsonnet.ParseCache, settings: sjsonnet.Settings, storePos: sjsonnet.Position => Unit, warnLogger: String => Unit): sjsonnet.Interpreter.
[error] Unspecified value parameter parseCache.
[error]   val interp = new Interpreter(
[error]                    ^
[error] /Users/stefan.zeiger/code/sjsonnet/bench/src/main/scala/sjsonnet/RunProfiler.scala:39:33: value valuesIterator is not a member of sjsonnet.ParseCache
[error]   val roots = interp.parseCache.valuesIterator.map(_.getOrElse(???)).map(_._1).toSeq
[error]                                 ^
[error] 5 errors found
[error] (bench / Compile / compileIncremental) Compilation failed
[error] Total time: 1 s, completed Oct 22, 2021 3:15:49 PM
sbt:sjsonnet>

Comment on lines 8 to 9
def valuesIterator: Iterator[Either[Error, (Expr, FileScope)]]
def keySet: scala.collection.Set[(Path, String)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are new. Do we need them in universe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need them in the universe because the cache type in sjsonnet is the same as trait which it extends and the caffeine cache extends this trait. So these are needed for it to be compiled.

@tanmay-db tanmay-db requested a review from szeiger October 27, 2021 01:38
@szeiger szeiger merged commit 8193180 into databricks:master Oct 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants