Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

org.clojure/clojure is failing to analyze #53

Closed
lread opened this issue Apr 20, 2022 · 7 comments · Fixed by #93
Closed

org.clojure/clojure is failing to analyze #53

lread opened this issue Apr 20, 2022 · 7 comments · Fixed by #93

Comments

@lread
Copy link
Member

lread commented Apr 20, 2022

Raised cljdoc/cljdoc#595, will track org.clojure/clojure here.

Testing with Clojure 1.11.1

Issue 1 - class cleanup

We try to cleanup classes from jars that we analyze.
The attempt is to delete any __init.classes that have .clj or .cljc sources.

Our technique does not pickup classes print generated by definterface and proxy.
In the clojure jar, these are generated to:

  • /clojure/pprint/PrettyFlush.class
  • /clojure/pprint/proxy$java/io/Writer*.class

To check: did our analyzer handle this case once, or did Clojure change?

Options:

  • Include the above classes in deletion cleanup. Can we reliably determine that they should be deleted? Or will this be custom handling for clojure analysis?
    • Clojure does set .class source file for generated interfaces, we could check that with the asm library, this would handle cases like PrettyFlush.class
    • For proxies, clojure generates under a proxy$... dir, we could assume that a class under a dir that has a proxy$ prefix is generated by Clojure.

Issue 2 - jsr166y.jar

There is some deprecated Clojure code that requires jsr166y.jar.

When the analyzer tries to load this namespace, it fails because jsr166y.jar is not on the classpath.

To check: this code has been around forever, how did this ever work?

Aha. If I look at the Clojure 1.9.0 pom I see:

    <dependency>
      <groupId>org.codehaus.jsr166-mirror</groupId>
      <artifactId>jsr166y</artifactId>
      <version>1.7.0</version>
      <scope>provided</scope>
    </dependency>

This is missing starting from Clojure 1.10 onwards as part of CLJ-2363.

Options:

  1. include jsr166y.jar on the classpath when analyzing clojure.
    • I'm not finding a compatible jar on the web, I get an error for a missing jsr166y.forkjoin.ParallelArrayWithBounds
    • If I dig deeper, I find that jsr166y restructured their classes around here but the org.codehaus.jsr166-mirror release is way after that.
    • If I clone git clone [email protected]:codehaus/jsr166-mirror.git, then git reset 17a5e808a900c0c8976a5cf4a1201067a0834894 --hard then ant jsr166yjar (with JDK8) then include build/jsr166ylib/jsr166y.jar on the classpath when building clojure then all is good.
  2. exclude deprecated namespace from analysis (I think this is what Clojure team does? I don't see it documented in their API docs).
@lread
Copy link
Member Author

lread commented Apr 20, 2022

Testing analyzing Clojure 1.9.0

Issue 3

If running under JDK11, we are affected by what is described in CLJ-2374 and get:

java.lang.IllegalArgumentException: Must hint overloaded method: toArray, compiling:(clojure/gvec.clj:131:1)

We do not get this error if analyzing on JDK8.
This is caused by a signature change in JDK11 that now needs a type hint.

Options

  1. Analyze under jdk8 - this might become more problematic as time marches on and other deps we want to use require > jdk8.
  2. Patch clojure 1.9.0 - this would mean custom handling for clojure 1.9.0

Issue 2 - variant b

As expected, I still get a failure (see #53 (comment))

ClassNotFoundException: jsr166y.forkjoin.ParallelArray

@lread
Copy link
Member Author

lread commented Apr 20, 2022

Testing analyzing Clojure 1.8.0

Issue 4 - using newer Clojure features to analyze

Right... analyzing Clojure with Clojure might be problematic.

Metagetta is using seqable? which was not in 1.8.0.

Pretty easy to adjust this to not use seqable?, let's see how far back we can go.

@lread
Copy link
Member Author

lread commented Apr 20, 2022

Hmmmm... maybe cljdoc/cljdoc#543 makes more sense for analyzing Clojure itself?

@lread
Copy link
Member Author

lread commented Aug 25, 2023

Had another peek at this. And can make some progress.
But a difficulty lies in analyzing old versions of clojure.
Because clojure is analyzing clojure, old versions of clojure are analyzed with old versions of clojure.

Metagetta does not have many deps but it does depend on an inlined version of tools.namespace which depends on tools.reader. These inlined versions depend on clojure features introduced in specific versions of clojure. One example is reader conditionals.

Metagetta also currently depends on a recent version of ClojureScript which will likely bring in unwanted things when analyzing old versions of clojure.

One way around this might be to MrAnderson the entirety of clojure and use that for dynamic analysis. But that would be, I think, difficult and hard to get reliably right.

I think static analysis (or maybe some other means?) would be a more fruitful path to explore.

@lread
Copy link
Member Author

lread commented Aug 30, 2023

Goal

Analyze the org.clojure/clojure API all the way back to version 1.1.0

Option 1 - make metagetta Clojure 1.1.0 compatible

This is a bit... rough.
Many, many, many nice things did not exist back in Clojure 1.1.0.
And metadata on nses/vars and as type hints were either not there yet or a bit different?
This would give us a hint that skip-wiki maybe came later in clojure's lifespan and should not be relied upon as an indicator of public API.

Option 2 - use static analysis

I gave this a whirl and recorded results here: cljdoc/cljdoc#543 (comment)

Not ideal as Clojure builds/alters parts of its API at load time.

Option 3 - use an existing source of analysis

Clojure has generated API docs since 1.1.0.
We could harvest from that data.
I'm not sure, but expect that the existing data would have everything we need.

This could be the pragmatic thing to do.
At the worst, we'd scrape generated docs, but preferably we'd use same autogen doc tooling for clojure version x to generate some edn that we can transform to cljdoc format.

I think this option means we'd not be analyzing the jar and would maybe lead us to thinking more about supporting custom analysis.

Option 4 - classpath isolation

When analyzing clojure itself can we use our preferred modern version of Clojure for most work, but then in an isolated classpath loader, load and query clojure version x?

Would need to experiment to see if this is viable at all.

Option 5 - shadowing

Use something like MrAnderson so clojure version x can analyze clojure version y without any collisions.

Not sure if the analyzer of analyze would be shadowed.
If analyzee then we'd need to unshadow our generated analysis data.

@martinklepsch
Copy link
Member

Analyze the org.clojure/clojure API all the way back to version 1.1.0

FWIW I think that this is a somewhat ambitious goal with diminishing returns. It would be nice but not sure it's needed.

Screenshot 2023-08-31 at 10 58 51

... obviously getting 1.11.1+ to work would be great.

As for the JSR166 thing, I'd also — in the interest of your time — suggest to just exclude the namespace, if it's not shown in official API docs then I don't see a reason to show it on cljdoc.

@lread
Copy link
Member Author

lread commented Aug 31, 2023

@martinklepsch! So nice to hear from you on this! ❤️

Yes, I think you are right about jsr166.

As you can imagine, as I experimented going back further and further down the option 1 path it got harder and harder. I was interested in going all the way back because it is such a historically important library and was imagining an day when we had API diffing in place and the curious could learn how the API evolved.

But... yes, a person can get easily carried away on a goal when working alone! Thanks for chiming in! I can do better than v1.11.1+, but when things start to get totally awkward, I'll stop. I'm pretty sure v1.7+ is not too bad but I might be able, without too much trouble, get back to v1.4+. But even if I were to stop at v1.11.1+ that would be an improvement over what we are doing today.

lread added a commit that referenced this issue Sep 2, 2023
Added support necessary to successfully analyze Clojure v1.10+
- jar cleanup now removes clojure generated .class files for both proxies and types
- exclude clojure.parallel from analysis

Contributes to #53

The :namespaces option is was applied after analysis. This meant that
namespaces that were to be excluded because they would cause analysis to fail
still caused analysis to fail. Moved exclusion logic to after namespaces
are found but before analysis.

Closes #92
lread added a commit that referenced this issue Sep 2, 2023
A signature change in JDK11 means gvec.clj needs a type hint
See https://clojure.atlassian.net/browse/CLJ-2374
This change patches gvec.clj with the necessary type hint.
An alternative would be to run analysis under JDK8, but cljdoc does not
currently support JDK selection at this time.

Contributes to #53
lread added a commit that referenced this issue Sep 2, 2023
Metagetta was using seqable? which was introduce in Clojure 1.9.
Compensate with a workable alternative.

Contributes to #53
lread added a commit that referenced this issue Sep 2, 2023
Metagetta assumed current and compatible versions of Clojure and
ClojureScript. When analyzing older versions of Clojure we can pick up a
version that is not compatible with our default ClojureScript.

Since we don't need ClojureScript to analyze Clojure, only bring in
its support when it is needed.

Also: clojure.string/starts-with? was introduced in Clojure 1.8.
Use JVM .startsWith instead.

Contributes to #53
@lread lread closed this as completed in #93 Sep 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants