Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include transitive target dependencies in export-dep-as-jar #8736

Closed
wants to merge 1 commit into from

Conversation

olafurpg
Copy link
Contributor

@olafurpg olafurpg commented Nov 29, 2019

Previously, export-dep-as-jar did not include the full list of
transitive target dependencies. This meant that client tools like the
IntelliJ compiler and Bloop export were left to traverse the build graph
in order to construct the full classpath of a target.

Now, the output of export-dep-as-jar includes a new field
transitive_targets field that is similar to targets except that it
includes transitive target dependencies respecting strict_deps. For
example, assuming c -> b -> a the value of transitive_targets is the
following:

++ new export-dep-as-jar output
 {
   "c:c": {
     "targets": ["b:b"],
+    "transitive_targets": ["b:b", "a:a"],
   }
 }

This change allows clients to faithfully reproduce the same classpath as
./pants compile without exposing strict_deps into the
export-dep-as-jar output.

@illicitonion
Copy link
Contributor

illicitonion commented Nov 29, 2019

Alternatives

Instead of delegating the logic to external tools for interpreting the meaning of strict_deps, the "targets" key in the ./pants export output could automatically include the transitive dependencies for targets where strict_deps=False.

I pretty strongly prefer this... Any thoughts?

@olafurpg
Copy link
Contributor Author

@illicitonion I also prefer that approach.

@blorente
Copy link
Contributor

I prefer that approach as well, but I'm not too familiar with the requirements metals has for this.
Also, on an abstract level, the information about implicitness should be included in the dependencies already, right?

if I have strict_deps enabled it will hold that for every dep d in a.dependencies, every dep of d is also in a.dependencies. I'm not sure what outputting strict_deps adds, as we still have to implement all the algorithms for compiling non-strict deps regardless.

The alternative seems reasonable and somewhat easy to implement, lmk if I can help.

@olafurpg
Copy link
Contributor Author

The only requirement from Metals is being able to faithfully reproduce the compilation outside of Pants. The same is true if we want to experiment with the IntelliJ compiler.

I hit on a case where removing strict_deps=True results in a compile error from an ambiguous reference where two wildcard imports bring the same identifier into scope. With strict_deps=True, I assume the classpath is smaller and one of the wildcard imports no longer brings the conflicting identifier into scope.

@olafurpg
Copy link
Contributor Author

Regardless of what approach we end up using for computing the classpath of a target, I think it's desirable to include strict_deps: true/false in the output of export. Example use-case:

  • programmatically find targets with fat classpaths and strict_deps=False to optimize compile times
  • display strict_deps targets with different colors in the IntelliJ/Metals UI

@illicitonion
Copy link
Contributor

Regardless of what approach we end up using for computing the classpath of a target, I think it's desirable to include strict_deps: true/false in the output of export. Example use-case:

  • programmatically find targets with fat classpaths and strict_deps=False to optimize compile times
  • display strict_deps targets with different colors in the IntelliJ/Metals UI

I think there's not really harm in exposing this, but it's worth being aware that strict_deps isn't an inherent concept; for instance, when we add dependency inference, we'll probably remove strict_deps as a concept. So it may be useful for now (though probably export isn't the best API to use for the programatic finding), but may not be long term...

Also, at least internally at Twitter, we've found that our strict_deps implementation for scala is error-prone enough that we don't recommend that teams enable it, so displaying prompts in the UI to nudge people to use strict_deps would probably be counter-productive.

@olafurpg
Copy link
Contributor Author

olafurpg commented Dec 3, 2019

Failing test looks unrelated

self = <pants_test.backend.python.tasks.native.test_ctypes.TestBuildLocalDistsWithCtypesNativeSources testMethod=test_ctypes_c_dist>

    def test_ctypes_c_dist(self):

      platform_specific_dist = self.target_dict['platform_specific_ctypes_c_dist']

      self._assert_dist_and_wheel_identity(

        expected_name='platform_specific_ctypes_c_dist',

        expected_version='0.0.0',

        expected_platform=self.ExpectedPlatformType.current,

        dist_target=platform_specific_dist,

>       extra_targets=[self.target_dict['ctypes_c_library']],

      )


@blorente
Copy link
Contributor

blorente commented Dec 3, 2019

There is a failing test that complains about not being able to resolve a pants plugin (checkstyle). I have restarted that shard because I've also seen it flake in an unrelated PR.

@olafurpg olafurpg force-pushed the strict_deps branch 2 times, most recently from 4c38f34 to 4594aee Compare December 3, 2019 14:42
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @illicitonion that even if we expose the option, we would want to render the strict deps list for the target in export, rather than attempting to compute them in the caller.

@wisechengyi : Thoughts?

@@ -276,6 +276,12 @@ def iter_transitive_jars(jar_lib):

if classpath_products:
info['libraries'] = [self._jar_id(lib) for lib in target_libraries]

strict_deps = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not correctly default to the underlying language's default value. Should use

def defaulted_property(self, target, option_name):
"""Computes a language property setting for the given JvmTarget.
:param selector A function that takes a target or platform and returns the boolean value of the
property for that target or platform, or None if that target or platform does
not directly define the property.
If the target does not override the language property, returns true iff the property
is true for any of the matched languages for the target.
"""
if target.has_sources('.java'):
matching_subsystem = Java.global_instance()
elif target.has_sources('.scala'):
matching_subsystem = ScalaPlatform.global_instance()
else:
return getattr(target, option_name)
instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried using defaulted_property with this diff

diff --git a/src/python/pants/backend/project_info/tasks/export.py b/src/python/pants/backend/project_info/tasks/export.py
index d2390d3..7fd9896 100644
--- a/src/python/pants/backend/project_info/tasks/export.py
+++ b/src/python/pants/backend/project_info/tasks/export.py
@@ -7,6 +7,7 @@ from collections import defaultdict
 
 from twitter.common.collections import OrderedSet
 
+from pants.backend.jvm.subsystems.dependency_context import DependencyContext
 from pants.backend.jvm.subsystems.jvm_platform import JvmPlatform
 from pants.backend.jvm.subsystems.resolve_subsystem import JvmResolveSubsystem
 from pants.backend.jvm.subsystems.scala_platform import ScalaPlatform
@@ -60,7 +61,7 @@ class ExportTask(ResolveRequirementsTaskBase, IvyTaskMixin, CoursierMixin):
   @classmethod
   def subsystem_dependencies(cls):
     return super().subsystem_dependencies() + (
-      DistributionLocator, JvmPlatform, PythonInterpreterCache, ScalaPlatform
+      DependencyContext, DistributionLocator, JvmPlatform, PythonInterpreterCache, ScalaPlatform
     )
 
   @staticmethod
@@ -277,7 +278,7 @@ class ExportTask(ResolveRequirementsTaskBase, IvyTaskMixin, CoursierMixin):
       if classpath_products:
         info['libraries'] = [self._jar_id(lib) for lib in target_libraries]
 
-      strict_deps = False
+      strict_deps = DependencyContext.global_instance().defaulted_property(current_target, 'strict_deps')
       if hasattr(current_target, 'strict_deps'):
         strict_deps = True == getattr(current_target, 'strict_deps')
       info['strict_deps'] = strict_deps

It crashes when I run the tests because the method calls getattr(target, 'strict_deps')

                         def defaulted_property(self, target, option_name):
                           """Computes a language property setting for the given JvmTarget.

                           :param selector A function that takes a target or platform and returns the boolean value of the
                                           property for that target or platform, or None if that target or platform does
                                           not directly define the property.

                           If the target does not override the language property, returns true iff the property
                           is true for any of the matched languages for the target.
                           """
                           if target.has_sources('.java'):
                             matching_subsystem = Java.global_instance()
                           elif target.has_sources('.scala'):
                             matching_subsystem = ScalaPlatform.global_instance()
                           else:
                     >       return getattr(target, option_name)
                     E       AttributeError: 'Target' object has no attribute 'strict_deps'

What is the correct usage of this method?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It assumes that it is being called on a JvmTarget, which always has that property.

One option would be to add an optional parameter to def defaulted_property(self, ..., default=None):, which would then be passed to getattr(target, option_name, default).

@wisechengyi
Copy link
Contributor

I agree with @illicitonion that even if we expose the option, we would want to render the strict deps list for the target in export, rather than attempting to compute them in the caller.

@wisechengyi : Thoughts?

Are you suggesting given

scala_library( 
  name = 'a',
  strict_deps = True,
  dependencies = ['b','c']
  exports = ['b']
)

Mark b to be strict rather than a?

If so, it feels like it's misrepresenting how the target is defined, unless it's super hard in metals to figure out the relations somehow.

Stepping back a bit, I feel this is likely optimizing a bit early, if we are able to deliver fast iteration time to users. I'm sure they won't have problem turning off strict deps, unless that's a correctness issue somehow.

Copy link
Contributor Author

@olafurpg olafurpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wisechengyi strict_deps affects correctness. Removing strict_deps=True from a BUILD file can introduce a compile error. Consider the following minimized example

// A.scala
package a
class Foo
// B.scala (depends on A.scala)
package b
class Foo
// C.scala (depends on B.scala)
import a._
import b._
new Foo()

The new Foo() expression resolves to b.Foo with strict_deps=True and it fails with a compile error under strict_deps=False due to an ambiguous reference.

@@ -276,6 +276,12 @@ def iter_transitive_jars(jar_lib):

if classpath_products:
info['libraries'] = [self._jar_id(lib) for lib in target_libraries]

strict_deps = False
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried using defaulted_property with this diff

diff --git a/src/python/pants/backend/project_info/tasks/export.py b/src/python/pants/backend/project_info/tasks/export.py
index d2390d3..7fd9896 100644
--- a/src/python/pants/backend/project_info/tasks/export.py
+++ b/src/python/pants/backend/project_info/tasks/export.py
@@ -7,6 +7,7 @@ from collections import defaultdict
 
 from twitter.common.collections import OrderedSet
 
+from pants.backend.jvm.subsystems.dependency_context import DependencyContext
 from pants.backend.jvm.subsystems.jvm_platform import JvmPlatform
 from pants.backend.jvm.subsystems.resolve_subsystem import JvmResolveSubsystem
 from pants.backend.jvm.subsystems.scala_platform import ScalaPlatform
@@ -60,7 +61,7 @@ class ExportTask(ResolveRequirementsTaskBase, IvyTaskMixin, CoursierMixin):
   @classmethod
   def subsystem_dependencies(cls):
     return super().subsystem_dependencies() + (
-      DistributionLocator, JvmPlatform, PythonInterpreterCache, ScalaPlatform
+      DependencyContext, DistributionLocator, JvmPlatform, PythonInterpreterCache, ScalaPlatform
     )
 
   @staticmethod
@@ -277,7 +278,7 @@ class ExportTask(ResolveRequirementsTaskBase, IvyTaskMixin, CoursierMixin):
       if classpath_products:
         info['libraries'] = [self._jar_id(lib) for lib in target_libraries]
 
-      strict_deps = False
+      strict_deps = DependencyContext.global_instance().defaulted_property(current_target, 'strict_deps')
       if hasattr(current_target, 'strict_deps'):
         strict_deps = True == getattr(current_target, 'strict_deps')
       info['strict_deps'] = strict_deps

It crashes when I run the tests because the method calls getattr(target, 'strict_deps')

                         def defaulted_property(self, target, option_name):
                           """Computes a language property setting for the given JvmTarget.

                           :param selector A function that takes a target or platform and returns the boolean value of the
                                           property for that target or platform, or None if that target or platform does
                                           not directly define the property.

                           If the target does not override the language property, returns true iff the property
                           is true for any of the matched languages for the target.
                           """
                           if target.has_sources('.java'):
                             matching_subsystem = Java.global_instance()
                           elif target.has_sources('.scala'):
                             matching_subsystem = ScalaPlatform.global_instance()
                           else:
                     >       return getattr(target, option_name)
                     E       AttributeError: 'Target' object has no attribute 'strict_deps'

What is the correct usage of this method?

@olafurpg
Copy link
Contributor Author

olafurpg commented Dec 4, 2019

One possible backwards compatible solution could be to introduce a new key 'transitive_targets': [...] that includes the dependencies that get pulled in via strict_deps=False 🤔

@stuhood
Copy link
Member

stuhood commented Dec 5, 2019

I agree with @illicitonion that even if we expose the option, we would want to render the strict deps list for the target in export, rather than attempting to compute them in the caller.
@wisechengyi : Thoughts?

Are you suggesting given

scala_library( 
  name = 'a',
  strict_deps = True,
  dependencies = ['b','c']
  exports = ['b']
)

Mark b to be strict rather than a?

No. I'm suggesting that I agree with #8736 (comment)

@wisechengyi
Copy link
Contributor

wisechengyi commented Dec 5, 2019

No. I'm suggesting that I agree with #8736 (comment)

Oh I see. E.g.
a -> b -> c
The status quo of ./pants export a looks like

targets: {
 'a':...
 'b':...
 'c':..
}

Now, if a -(strict)-> b -> c, what should the output look like?

Also to be clear there are unfortunately two targets due to historic naming reasons (e.g. https://github.com/pantsbuild/intellij-pants-plugin/blob/master/CONTRIBUTING.md#how-the-plugin-works), the outer one means the list of the targets, the inner one actually means dependencies, so I was assuming the outer one was mentioned.

Edit: in case the inner targets was mentioned, then specifying the transitive dependencies for each target would be a n^2 thing.

@wisechengyi
Copy link
Contributor

wisechengyi commented Dec 5, 2019

@wisechengyi strict_deps affects correctness. Removing strict_deps=True from a BUILD file can introduce a compile error. Consider the following minimized example

// A.scala
package a
class Foo
// B.scala (depends on A.scala)
package b
class Foo
// C.scala (depends on B.scala)
import a._
import b._
new Foo()

The new Foo() expression resolves to b.Foo with strict_deps=True and it fails with a compile error under strict_deps=False due to an ambiguous reference.

Thanks for the explanation! That makes sense. However in this case would it be an easier fix if explicit import was used instead of wildcard? Or is there anything else I didn't consider that could make this difficult?

Edit: also is it sort of anti-pattern to rely on strict_deps to enforce which Foo to compile against?

@olafurpg olafurpg force-pushed the strict_deps branch 2 times, most recently from f49cded to 22b83a0 Compare December 9, 2019 13:40
@olafurpg olafurpg changed the title Include strict_deps in export output. Include transitive target dependencies in export-dep-as-jar Dec 9, 2019
@olafurpg
Copy link
Contributor Author

olafurpg commented Dec 9, 2019

Together with @blorente, we updated this PR to

  • extend export-dep-as-jar instead of export
  • no longer expose strict_deps in the output, instead add a new field transtive_targets that respects strict_deps

Please let us know what you think

Previously, `export-dep-as-jar` did not include the full list of
transitive target dependencies. This meant that client tools like the
IntelliJ compiler and Bloop export were left to traverse the build graph
in order to construct the full classpath of a target.

Now, the output of `export-dep-as-jar` includes a new field
`transitive_targets` field that is similar to `targets` except that it
includes transitive target dependencies respecting `strict_deps`.  For
example, assuming `c -> b -> a` the value of `transitive_targets` is the
following:

```diff
++ new export-dep-as-jar output
 {
   "c:c": {
     "targets": ["b:b"],
+    "transitive_targets": ["b:b", "a:a"],
   }
 }
```
This change allows clients to faithfully reproduce the same classpath as
`./pants compile` without exposing `strict_deps` into the
`export-dep-as-jar` output.
Comment on lines +202 to +203
for dep in transitive_targets:
info['transitive_targets'].append(dep.address.spec)
Copy link
Contributor

@wisechengyi wisechengyi Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Did a sample run with this on dataproducts/aclservice/::, the resulting json went from 4MB to 14MB, so would recommend putting this behind a flag. I haven't tried with a even larger project, but if json gets to 50MB, then that's likely not ideal.

  2. Ideally we only need to put non-target-roots into 'transitive_targets'. Also relating to Borja's work to minimize the # of intellij modules. Do you think whether below makes sense, @blorente?
    For example:

Given
A -> B -> C -> D
A -> E -> F

$ ./pants export-dep-as-jar A
{
 A: {'transitive_targets', [B, C, D, E, F], is_target_root: True}
 B: // dependencies here doesn't matter because it's not target root, only need to show where the jar is
 C: // same
 D: // same
 E: // same
 F: // same
}

$ ./pants export-dep-as-jar A B
{ 
  A: { targets: [B], transitive_targets: [E,F], is_target_root: True}, // because B is also a target root
  B: { transitive_targets: [C, D], is_target_root: True}
  C: // dependencies here doesn't matter because it's not target root, only need to show where the jar is
  D: // same
  E: // same
  F: // same

Although the tricky part could be

A -> B -> C -> D
./pants export-dep-as-jar A C

In this case, we might have to force B into target root (intellij module rather than a library), because "3rdparty library" (B in this case from A's perspective) in intellij cannot depend back onto C.

To be clear, feel free to only do 1), and I'd be okay to ship this to unblock the work.

Copy link
Member

@stuhood stuhood Dec 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... now that you say that (apologies, should have noticed earlier), I think that what I was suggesting was related to #8736 (comment) , but different. Roughly:

  1. Render the strict dep boolean
  2. When a target is strict_deps=True, include the calculated direct strict dependencies set in the dependencies list rather than the non-strict dependencies (alternatively define a second field strict_dependencies=[...]). When strict_deps=False, change nothing: the dependencies list would be as it is.

This would avoid the exponential blowup (because we'd never render transitive dependencies) and would avoid the strict deps calculation (which is the actual challenging one: unlike the non-strict deps calculation). Sorry, should have noticed earlier.

Copy link
Contributor

@wisechengyi wisechengyi Dec 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I follow on 2). For example A -> B , B -> C, D

scala_library(
   name = A
   strict_dep = True
   dependencies= [B]
)

scala_library(
   name = B
   strict_dep = True
   dependencies= [C, D]
   export=[C]
)

scala_library(
   name = C
   dependencies = []
)

scala_library(
   name = D
   dependencies = []
)

What would the output look like?

A: { 
    strict_dep=True, 
    dependencies = [B, C], // or should this be [B] only?
  }
B: {..}
C: {..}
D: {..}

@blorente
Copy link
Contributor

Hi! I'm going to give this another shot.

I'm strongly leaning towards including the transitive_targets field in the output of export-dep-as-jar, possibly renaming it to flattened_dependency_closure.

This field would be included in the output of each modulizable target, and would contain the (strict_deps-sensitive) flat list of source-dependencies I depend on. Therefore, as a target, my classpath would contain:

  • Whatever is in my libraries field, and
  • The compiled products of whatever is in this list.

@wisechengyi : Since #8812, we expect the amount of modulizable targets to be linear on the number of target roots, so the exponential blowup of output size is not that much of a concern.

@olafurpg : Is #8736 (comment) still a concern? I'd personally like to avoid printing the strict_deps field, especially since we don't recommend it internally, but we can discuss it if you think it would be useful.

@olafurpg
Copy link
Contributor Author

I would love to see this picked up! I'm fine either approach we use, whether that's adding a strict_deps: Boolean or flattened_dependency_closure: List[String] field. I don't have enough context to say which approach is better.

@blorente
Copy link
Contributor

Re-opened this in the above PR: #9146

@olafurpg
Copy link
Contributor Author

Thank you @blorente! Superseded by #9146

@olafurpg olafurpg closed this Feb 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants