Can't deserialize scala.List[M] if scala-library is not in the same classloader as M #9237
Comments
Imported From: https://issues.scala-lang.org/browse/SI-9237?orig=1 |
@retronym said: Here's a related SBT ticket: sbt/sbt#89 |
@retronym said: When we deserialize a List, we go through custom deserialization code in object:
// Java serialization calls this before readResolve during de-serialization.
// Read the whole list and store it in `orig`.
private def readObject(in: ObjectInputStream) {
  val builder = List.newBuilder[A]
  while (true) in.readObject match {
    case ListSerializeEnd =>
      orig = builder.result()
      return
    case a =>
      builder += a.asInstanceOf[A]
  }
}
The nested in.readObject call resolves each element's class through ObjectInputStream.resolveClass:
protected Class<?> resolveClass(ObjectStreamClass desc)
    throws IOException, ClassNotFoundException
{
    String name = desc.getName();
    try {
        return Class.forName(name, false, latestUserDefinedLoader());
    } catch (ClassNotFoundException ex) {
        Class cl = (Class) primClasses.get(name);
        if (cl != null) {
            return cl;
        } else {
            throw ex;
        }
    }
}
This ticket against OpenJDK suggests that this behaviour is buggy and instead the current thread's context class loader should be used. But the response is that the current behaviour is as specified.
Here's a further minimization:
// src/test/scala/issue/Meh.scala
class Meh
// src/test/scala/issue/Main.scala
package issue
import java.io.{ObjectOutputStream, ObjectInputStream, ByteArrayOutputStream, ByteArrayInputStream}
object Test {
  def main(args: Array[String]): Unit = {
    val obj = List(new Meh)
    val arr = serialize(obj)
    val obj2 = deserialize[List[Meh]](arr)
    assert(obj == obj2)
  }
  def serialize[A](obj: A): Array[Byte] = {
    val o = new ByteArrayOutputStream()
    val os = new ObjectOutputStream(o)
    os.writeObject(obj)
    o.toByteArray()
  }
  def deserialize[A](bytes: Array[Byte]): A = {
    val s = new ByteArrayInputStream(bytes)
    val is = new ObjectInputStream(s)
    is.readObject().asInstanceOf[A]
  }
}
I'm not sure if we can do anything to fix this in our custom serialization code. I would appreciate ideas from serialization experts. |
@hamnis said: |
@retronym said: |
@retronym said: |
@scottcarey said: This may happen in a lot more places than List. What happens if we change val is = new ObjectInputStream(s) to use a custom extended ObjectInputStream that either overrides that method or changes 'latestUserDefinedLoader'? If that works, it would be a work-around on the user side -- not much we can do here if something beyond our control is trying to load classes it can't see. |
@scottcarey said: Others have implemented things like a "ClassloaderAwareObjectInputStream". Essentially, this appears to be a flaw in ObjectInputStream's design. Also see the results when searching Google for 'latestUserDefinedLoader'. The common solution is for users to extend ObjectInputStream and override resolveClass. This turns out to be non-trivial as well; see bugs in other libraries such as: prevayler/prevayler#10 |
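For readers who want to try the user-side workaround described above, here is a minimal sketch of an ObjectInputStream subclass that overrides resolveClass. The names LoaderAwareObjectInputStream and LoaderAwareSerialization are hypothetical and are not part of scala-library or the JDK. The sketch assumes the caller can supply a class loader that sees both scala-library and the element classes, and it does not handle dynamic proxy classes (resolveProxyClass), which is one of the ways this gets non-trivial.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream,
  ObjectInputStream, ObjectOutputStream, ObjectStreamClass}

// Hypothetical helper: resolves classes against an explicitly supplied loader
// instead of the loader chosen by latestUserDefinedLoader().
class LoaderAwareObjectInputStream(in: InputStream, loader: ClassLoader)
    extends ObjectInputStream(in) {
  override protected def resolveClass(desc: ObjectStreamClass): Class[_] =
    try Class.forName(desc.getName, false, loader)
    catch {
      // Fall back to the default lookup, which also maps primitive
      // type names such as "int" to their classes.
      case _: ClassNotFoundException => super.resolveClass(desc)
    }
}

object LoaderAwareSerialization {
  def serialize[A](obj: A): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    bytes.toByteArray
  }

  // `loader` is an assumption of this sketch: it must be able to load
  // every class that appears in the serialized stream.
  def deserialize[A](bytes: Array[Byte], loader: ClassLoader): A = {
    val in = new LoaderAwareObjectInputStream(new ByteArrayInputStream(bytes), loader)
    try in.readObject().asInstanceOf[A] finally in.close()
  }
}
```

A caller would pass, for example, the class loader of the element type (classOf[Meh].getClassLoader in the minimization above). Because the nested in.readObject calls inside List's readObject go through the same stream, they should pick up the overridden resolveClass as well.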
@retronym said (edited on Apr 9, 2015 5:09:57 AM UTC): Here's a modified version of Erlend's test project that contrasts them. Output in the commit comment: |
@retronym said (edited on Apr 9, 2015 5:50:34 AM UTC):
/**
 * Returns the first non-null class loader (not counting class loaders of
 * generated reflection implementation classes) up the execution stack, or
 * null if only code from the null class loader is on the stack. This
 * method is also called via reflection by the following RMI-IIOP class:
 *
 * com.sun.corba.se.internal.util.JDKClassLoader
 *
 * This method should not be removed or its signature changed without
 * corresponding modifications to the above class.
 */
// REMIND: change name to something more accurate?
private static native ClassLoader latestUserDefinedLoader();
scala> classOf[java.util.TreeMap[_, _]].getClassLoader
res1: ClassLoader = null
So you can have scala-library.jar in the boot class loader, or in one that has access to the classes you want to put in your collection, but you can't have it in between. Delightful stuff! |
@Ichoran said: |
@scottcarey said: The old serialization code mutated both list and hd. I had plans to move tail to val as well, and stop using ListBuffer, but there was resistance since it requires changing the compiler in a couple places to not mutate it, and requires performance validation that the techniques used to become fully immutable do not hurt performance in various cases. I still feel that there will be about as many wins as losses if done right, as there are performance opportunities in places like map and fold. Without the serialization proxy pattern, Java serialization will create an instance of the object without executing its constructor, and then readObject can mutate its fields. If you have a val member, you can't use readObject because Scala won't let you mutate a val. Reversing the list on serialization will be slow on the other end when writing. |
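To illustrate the serialization proxy pattern mentioned above, here is a minimal sketch using a hypothetical immutable class. Pair, PairProxy and ProxyDemo are made up for this example, and this is not the code used by List or immutable.HashMap. The point it shows is that the real class keeps its val fields and is rebuilt through its constructor, so no readObject has to mutate anything.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream,
  ObjectInputStream, ObjectOutputStream}

// Hypothetical immutable class, used only to illustrate the pattern.
final class Pair(val left: String, val right: Int) extends Serializable {
  // Java serialization finds this method reflectively and writes the
  // returned proxy to the stream instead of this object.
  private def writeReplace(): AnyRef = new PairProxy(left, right)
}

// Plain data carrier; default field serialization is enough for it.
@SerialVersionUID(1L)
final class PairProxy(val left: String, val right: Int) extends Serializable {
  // Called after the proxy has been read; rebuilds the real object
  // through its ordinary constructor.
  private def readResolve(): AnyRef = new Pair(left, right)
}

object ProxyDemo {
  // Serializes a value to a byte array and reads it back.
  def roundTrip[A](value: A): A = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(value) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }
}
```

The cost of the pattern is one extra class in the stream. For a collection, that class's custom readObject is also where the nested reads of element classes happen, and that is the frame the class loader lookup in this ticket keys on.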
@Ichoran said: |
@scottcarey said (edited on Apr 29, 2015 12:34:37 AM UTC): To some extent, this is not a bug at all, it's just how Java Serialization works.
@Ichoran I can't quite parse 'we can achieve a similar level of safety by changing the var to a val and making sure that the class still compiles'. The old code would not compile after only changing var to val, and so the pattern used in immutable.HashMap (and elsewhere) was applied to List. This all started here: The last few messages on that thread by me cover a few other options -- reflection and/or Unsafe can modify a val (final field), but will break if there is a security context that does not allow it. Those options aren't so great.
Rex, you have a comment in that thread that not using ListBuffer (and using Array instead) is much faster, and that is part of my inspiration to make the tail a val as well, but replacing the use of ListBuffer will require a lot of convincing others with performance numbers. (Then the two other instances in the compiler / reflection that modify the val need to change.)
If we simply revert the change, we are either admitting that List will always be mutable and https://gist.github.com/jrudolph/6552186 will never be fixed, or deferring this 'bug' to the future when it becomes immutable.
I had a local branch where the above was fixed and List was immutable. The library worked fine, but needed performance work since ListBuffer was purposely broken (performance-wise) to test it. I did not continue any work because there was resistance to changing anything else at the time. The discussion here: scala/scala#3252 is the best place to see where things are at.
I would be interested in picking this back up some time, but worry that the compiler and library still do not have enough performance testing tools to convince people that I'm not breaking things, and I don't want to embark on a big chunk of work without feeling like there is a chance it will get anywhere. |
Hi all, This seems related to a bug I've recently reported to Apache Spark: https://issues.apache.org/jira/browse/SPARK-20525 Was hoping someone here might be able to help - Thanks in advance! |
putting this on the 2.13.0-RC1 milestone because in the 2.13 collections, most/all collections are affected, instead of only however, I also think we should just close this as "not a bug". as @scottcarey wrote, "this is not a bug at all, its just how Java Serialization works" but we do need to make sure we've documented the issue in e.g. the 2.13 collections migration guide |
that PR is marked with the release-notes label, which should serve as sufficient reminder to include it in migration doc |
List has some serialization problems.
Take a look at the minimized version here.
https://github.com/hamnis/minimized-list-serialization
This seems to be triggered by classloaders, not that I understand why that is the case.
If you uncomment the fork in Test setting in build.sbt, it works.