Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bazel hangs when compiling large no of databinding modules #12780

Closed
arunkumar9t2 opened this issue Jan 6, 2021 · 8 comments
Closed

Bazel hangs when compiling large no of databinding modules #12780

arunkumar9t2 opened this issue Jan 6, 2021 · 8 comments
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Android Issues for Android team type: bug

Comments

@arunkumar9t2
Copy link
Contributor

arunkumar9t2 commented Jan 6, 2021

Description of the problem / feature request:

We have a large app, 1000+ modules with many of them using databinding and encountered few issues due to usage of lists to propagate databinding data among dependencies. This leads to two issues mainly

  1. Argument list too long error is thrown when compiling large module with lot of dependencies that use databinding. See [ERROR] Android data-binding. 'Argument list too long' / 'Method too large' #12697
  2. Bazel build hangs during analysis phase and never completes. Taking a thread dump or leaving it to run for several hours throws the following error.
FATAL: bazel ran out of memory and crashed. Printing stack trace:
java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Unknown Source)
	at com.google.common.collect.ImmutableList$Builder.getReadyToExpandTo(ImmutableList.java:768)
	at com.google.common.collect.ImmutableList$Builder.add(ImmutableList.java:787)
	at com.google.devtools.build.lib.rules.android.databinding.DataBindingV2Context.processDeps(DataBindingV2Context.java:192)
	at com.google.devtools.build.lib.rules.android.AndroidCommon.init(AndroidCommon.java:500)
	at com.google.devtools.build.lib.rules.android.AndroidLibrary.create(AndroidLibrary.java:196)
	at com.google.devtools.build.lib.rules.android.AndroidLibrary.create(AndroidLibrary.java:41)
	at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createRule(ConfiguredTargetFactory.java:381)
	at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createConfiguredTarget(ConfiguredTargetFactory.java:194)
	at com.google.devtools.build.lib.skyframe.SkyframeBuildView.createConfiguredTarget(SkyframeBuildView.java:938)
	at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.createConfiguredTarget(ConfiguredTargetFunction.java:1013)
	at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.compute(ConfiguredTargetFunction.java:371)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:477)
	at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:398)
	at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)

Feature requests: what underlying problem are you trying to solve with this feature?

Analysing the issue, current implementation of DatabindingV2Provider uses lists to propagate databinding data namely classInfos and setterStores.

This has problems when exports are used in the project, as it is currently done in Kotlin support . When exports are used, the list can contain duplicates which are not filtered at each stage and we seem to hit the problem described here. This causes problems in two places.

  1. When generating command line args for GEN_BASE_CLASSES, the -dependencyClassInfoList argument can contain lot of duplicates which for a sufficiently large graph can cause argument length to exceed Kernel limit. See [ERROR] Android data-binding. 'Argument list too long' / 'Method too large' #12697. Even if it does not crash for small apps, I believe this has implications on performance as well, since Databinding exec would do lot of duplicate work in this case.
  2. By manually filtering duplicates in classInfos, we can avoid the above error. However we hit another due to setterStores duplicates which causes the analysis hang and eventual OOM described above.

Applying the same duplicate filter to setter stores fixed both of the issues and entire app can be built. Patch file for attempted duplicate filtering fix. However I understand filtering duplicates is not an ideal solution, and wanted to propose using NestedSets just like how transitiveLabelAndJavaPackages are currently propogated.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Consider this dependency graph.



                          +-------------+
                          |             |
                          +             v
    //app +-------> A+--->B+---->C+---->D
                    +            ^
                    |            |
                    +------------+


Here if B exports D and C also exports D then in A's action there are duplicates in both setterStores and classInfos which causes this issue.

Repro steps:

  1. Extract demo-databinding.zip
  2. Run bazelisk build -s app
  3. Notice the below command for target //:A
SUBCOMMAND: # //:A [action 'GenerateDataBindingBaseClasses databinding/A/class-info.zip', configuration: 672f8fa637289946446c295a70933ce1eb05be590e7e92ce482cc3178bd0039d, execution platform: @local_config_platform//:host]
(cd /private/var/tmp/_bazel_arun.sampathkumar/ac004ba0cda23e8560cd2994568bdffb/execroot/__main__ && \
  exec env - \
  bazel-out/darwin-opt-exec-2B5CBBC6/bin/external/bazel_tools/tools/android/databinding_exec GEN_BASE_CLASSES -layoutInfoFiles bazel-out/android-armeabi-v7a-fastbuild/bin/databinding/A/layout-info.zip -package com.google.android.databinding.libA -classInfoOut bazel-out/android-armeabi-v7a-fastbuild/bin/databinding/A/class-info.zip -sourceOut bazel-out/android-armeabi-v7a-fastbuild/bin/databinding/A/baseClassSrc.srcjar -zipSourceOutput true -useAndroidX false -dependencyClassInfoList bazel-out/android-armeabi-v7a-fastbuild/bin/databinding/B/class-info.zip -dependencyClassInfoList bazel-out/android-armeabi-v7a-fastbuild/bin/databinding/D/class-info.zip -dependencyClassInfoList bazel-out/android-armeabi-v7a-fastbuild/bin/databinding/C/class-info.zip -dependencyClassInfoList bazel-out/android-armeabi-v7a-fastbuild/bin/databinding/D/class-info.zip)

As you can see bazel-out/android-armeabi-v7a-fastbuild/bin/databinding/D/class-info.zip is being repeated twice, this problem is exacerbated on larger graphs eventually leading to kernel limit.

What operating system are you running Bazel on?

MacOS Mojave

What's the output of bazel info release?

release 4.0.0rc7

If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.

NA

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

NA

Have you found anything relevant by searching the web?

https://docs.bazel.build/versions/master/skylark/performance.html#use-depsets
https://stackoverflow.com/questions/11289551/argument-list-too-long-error-for-rm-cp-mv-commands

Any other information, logs, or outputs that you want to share?

.bazelrc

build --verbose_failures
build --define=android_incremental_dexing_tool=d8_dexbuilder
build --nouse_workers_with_dexbuilder
build --incompatible_disable_depset_items
build --strategy=Desugar=sandboxed

# Databinding flags
build --experimental_android_databinding_v2
build --android_databinding_use_v3_4_args
#build --android_databinding_use_androidx

# Flags to enable latest android providers in rules
build --experimental_google_legacy_api

try-import local.bazelrc
@aiuto aiuto added team-Android Issues for Android team untriaged labels Jan 7, 2021
@ahumesky ahumesky added P2 We'll consider working on this in future. (Assignee optional) type: bug and removed untriaged labels May 12, 2021
@Bencodes
Copy link
Contributor

We also have a large app with lots of modules and are seeing this as well during the analysis phase.

@arunkumar9t2 arunkumar9t2 changed the title Use NestedSets to propagate Databinding classInfo and setterStores. Bazel hangs when compiling large no of databinding modules Jul 6, 2021
@arunkumar9t2
Copy link
Contributor Author

I renamed the issue to better describe the problem. It seems even transitiveLabelAndJavaPackages contains lot of duplicates in them when exports are used - an unfortunate side effect of using kt_android_library which exports targets by default.

@Bencodes this hacky fix filters the duplicates and lets bazel proceed to execution phase.

Related #2694

@Bencodes
Copy link
Contributor

@ahumesky Have you had a look at this? This will probably be a blocker for folks trying to switch over to the rules_android alpha release for app bundles support seeing as the data binding flags are a pre-prerequisite.

@ahumesky
Copy link
Contributor

Thanks for the very detailed report, I took a look at the repro, and it shouldn't be too bad to fix this. If I recall correctly, the setterstore and classinfo files were supposed to be per-target only, so a list would do, but clearly in the case of a very long string of exports, it will grow quickly.

bazel-io pushed a commit that referenced this issue Jul 28, 2021
ahumesky added a commit to bazelbuild/rules_android that referenced this issue Jul 29, 2021
ahumesky added a commit to bazelbuild/rules_android that referenced this issue Jul 29, 2021
@ahumesky
Copy link
Contributor

I've submitted a couple changes that should address some of this problem.
You'll need a build of bazel that includes 876d48d

and android rules including bazelbuild/rules_android@9b6935e

(Also, bazel at head enables validation actions by default, and there's a problem with the aar_import_checks stub action in the android rules (the real aar_import_checks isn't open sourced yet), and until that's fixed, you'll need to pass --norun_validations)

@arunkumar9t2
Copy link
Contributor Author

Thanks @ahumesky will test and report back.

@arunkumar9t2
Copy link
Contributor Author

I tested in our project and can confirm Bazel does not hang anymore.

@ahumesky
Copy link
Contributor

ahumesky commented Aug 3, 2021

Excellent, thank you for confirming!

@ahumesky ahumesky closed this as completed Aug 3, 2021
luca-digrazia pushed a commit to luca-digrazia/DatasetCommitsDiffSearch that referenced this issue Sep 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Android Issues for Android team type: bug
Projects
None yet
Development

No branches or pull requests

4 participants