[INFRA] Close stale PRs #18780
Conversation
cc @srowen, @jiangxb1987, @gatorsmile, @ueshin, @jerryshao, @vanzin, @cloud-fan and @MLnick who I believe are interested in this and could double check this list.
Before closing these PRs, we need to figure out whether they should be taken over by someone else. If we simply close them, we might forget to fix the related issues.
@gatorsmile, I think we could still find others to take them over even after they are closed, if the point is to prevent forgetting. This list remains, and there are many in-progress JIRAs we could find someone to work on.
If the PRs are closed, we might not read them any more. Thus, my suggestion is to be careful when you want to close the PRs.
Is the point to leave them open to make reviewers read them? I think we already had a few discussions about this. The (I think) last resort, automatic closing, was already suggested before - @rxin. Also, see #13959. I personally don't like to close them, but this is my effort to prevent the last resort. I think we can identify in-progress JIRAs.
The number of JIRAs is too large. To be honest, I do not have the bandwidth to go over all the in-progress JIRAs. I suggest reviewing the inactive PRs one more time to ensure we do not miss any good/important PR.
BTW, I suggest letting the committers who know the component very well give the suggestion to close the PRs. For example, I will not suggest closing any ML PR.
@gatorsmile, sure, I will take them out if any ML committer suggests it. cc @jkbradley, @yanboliang and @mengxr who I know. I will also take a look and check them as far as I can once more. Also, what do you think about my sending an email to the user/dev mailing list so that other contributors can pick up PRs they are familiar with?
First, we need to identify a list of PRs that need to be taken over. For example, among the SQL PRs, #18515 should be fixed, even if the original PR creator might not do it.
But wouldn't we still be able to look through the PRs and make the list even after they are closed here? I think I am lost about why we should leave them open.
I will go over the SQL-related PRs in the next few days to identify which PRs should not be closed and for which we should find somebody else to take them over.
I guess it would be good to identify important and good PRs, but I am sure most of them have been inactive against review comments or Jenkins failures for more than a month, and I left a comment to check if they are still inactive. I think if the author is willing to proceed, they can easily reopen. If we find someone to take over, we can simply cc them.
@HyukjinKwon, can you please also add the closed PR titles?
Sure, let me try soon.
If you are asking for their opinions, it'd be easier if you ask more explicitly (A vs B) in one comment, rather than asking them to go through and read the entire thread ...
Test build #80068 has finished for PR 18780 at commit
Sure, I guess these should be:
A: Closing stale PRs that have been inactive against Jenkins test failures and review comments for more than a month, with a one-week comment to check whether they are really inactive, regardless of the importance or quality of the PRs.
B: Closing stale PRs that are carefully identified by committers in terms of quality and importance, regardless of inactivity (mainly from the author).
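As a rough illustration of option A's one-month inactivity criterion, the following is a minimal sketch of how such staleness could be checked, assuming the public GitHub REST API and the Python `requests` package; the repository name, 30-day cutoff, and helper name are illustrative assumptions rather than anything decided in this discussion:

```python
# Sketch: list open PRs with no updates for more than a month,
# roughly matching option A's inactivity criterion.
# Uses unauthenticated GitHub REST API calls, which are rate-limited.
from datetime import datetime, timedelta, timezone

import requests

REPO = "apache/spark"                                   # illustrative
CUTOFF = datetime.now(timezone.utc) - timedelta(days=30)  # illustrative

def stale_prs(repo=REPO, cutoff=CUTOFF):
    """Yield (number, title, last_updated) for PRs untouched since `cutoff`."""
    page = 1
    while True:
        resp = requests.get(
            f"https://api.github.com/repos/{repo}/pulls",
            params={"state": "open", "sort": "updated",
                    "direction": "asc", "per_page": 100, "page": page},
        )
        resp.raise_for_status()
        prs = resp.json()
        if not prs:
            break
        for pr in prs:
            updated = datetime.fromisoformat(
                pr["updated_at"].replace("Z", "+00:00"))
            if updated >= cutoff:
                # Results are sorted oldest-updated first, so we can stop.
                return
            yield pr["number"], pr["title"], updated
        page += 1

if __name__ == "__main__":
    for number, title, updated in stale_prs():
        print(f"#{number} (last updated {updated:%Y-%m-%d}): {title}")
```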
@gatorsmile @tdas, to rehash the logic here: I'd be careful of falling into the trap of thinking that, because something looks important but nobody has thought it important enough to look at in a long time, leaving it open somehow increases the chance that someone will work on it. It has never worked that way in the history of software projects. I personally would rather close more aggressively, and if that looks like bad news, address why all these important things aren't getting done: do we need more committers? Discouraged contributors? Too much complexity? How do we get long-time contributors to become more active again? This mechanism is just a messenger. But still, let's just stick to closing the ones that nobody objects to.
Sure, I added those and took out 18475.
I'm in favor of fairly aggressive closing of inactive pull requests. The cost of reopening them is small, and the benefits of having a clear view of what is in progress are large. I think as long as the notification is clear and polite, it won't discourage contribution.
+1 to what Michael said. List seems fine to me (YARN-related issues). You have a typo in the PR title though.
Please take [SPARK-21287] out.
Yes, I just took it out.
After we leave polite messages to close their PRs, I think we should still keep them open for at least one more week. Although it is trivial for the authors to reopen them themselves, the feeling is different.
@gatorsmile most or all of these have had pings on the PR already; Hyukjin generally pings them if there's a question, then tries to close 1-2 weeks later. I agree, I wouldn't surprise anyone with a close, but it seemed like they were all clearly left with the contributor, or the reason they should be closed was already given.
I added those and took out 11494.
Are we okay to go ahead with the current list?
I would go ahead. Anyone can reopen something if needed.
Thanks all. I merged this to master.
Hello, sorry, I'm quite new; where can I download the patch? I have an issue that has been corrected by #18127. Do I have to download the master branch, recompile the JAR, and replace it in my Maven .m2?
What changes were proposed in this pull request?
This PR proposes to close stale PRs, mostly the same instances as in #18017.
Closes #14085 - [SPARK-16408][SQL] SparkSQL Added file get Exception: is a directory …
Closes #14239 - [SPARK-16593] [CORE] [WIP] Provide a pre-fetch mechanism to accelerate shuffle stage.
Closes #14567 - [SPARK-16992][PYSPARK] Python Pep8 formatting and import reorganisation
Closes #14579 - [SPARK-16921][PYSPARK] RDD/DataFrame persist()/cache() should return Python context managers
Closes #14601 - [SPARK-13979][Core] Killed executor is re spawned without AWS key…
Closes #14830 - [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on Pyspark examples
Closes #14963 - [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in lint-python
Closes #15227 - [SPARK-17655][SQL]Remove unused variables declarations and definations in a WholeStageCodeGened stage
Closes #15240 - [SPARK-17556] [CORE] [SQL] Executor side broadcast for broadcast joins
Closes #15405 - [SPARK-15917][CORE] Added support for number of executors in Standalone [WIP]
Closes #16099 - [SPARK-18665][SQL] set statement state to "ERROR" after user cancel job
Closes #16445 - [SPARK-19043][SQL]Make SparkSQLSessionManager more configurable
Closes #16618 - [SPARK-14409][ML][WIP] Add RankingEvaluator
Closes #16766 - [SPARK-19426][SQL] Custom coalesce for Dataset
Closes #16832 - [SPARK-19490][SQL] ignore case sensitivity when filtering hive partition columns
Closes #17052 - [SPARK-19690][SS] Join a streaming DataFrame with a batch DataFrame which has an aggregation may not work
Closes #17267 - [SPARK-19926][PYSPARK] Make pyspark exception more user-friendly
Closes #17371 - [SPARK-19903][PYSPARK][SS] window operator miss the watermark metadata of time column
Closes #17401 - [SPARK-18364][YARN] Expose metrics for YarnShuffleService
Closes #17519 - [SPARK-15352][Doc] follow-up: add configuration docs for topology-aware block replication
Closes #17530 - [SPARK-5158] Access kerberized HDFS from Spark standalone
Closes #17854 - [SPARK-20564][Deploy] Reduce massive executor failures when executor count is large (>2000)
Closes #17979 - [SPARK-19320][MESOS][WIP]allow specifying a hard limit on number of gpus required in each spark executor when running on mesos
Closes #18127 - [SPARK-6628][SQL][Branch-2.1] Fix ClassCastException when executing sql statement 'insert into' on hbase table
Closes #18236 - [SPARK-21015] Check field name is not null and empty in GenericRowWit…
Closes #18269 - [SPARK-21056][SQL] Use at most one spark job to list files in InMemoryFileIndex
Closes #18328 - [SPARK-21121][SQL] Support changing storage level via the spark.sql.inMemoryColumnarStorage.level variable
Closes #18354 - [SPARK-18016][SQL][CATALYST][BRANCH-2.1] Code Generation: Constant Pool Limit - Class Splitting
Closes #18383 - [SPARK-21167][SS] Set kafka clientId while fetch messages
Closes #18414 - [SPARK-21169] [core] Make sure to update application status to RUNNING if executors are accepted and RUNNING after recovery
Closes #18432 - resolve com.esotericsoftware.kryo.KryoException
Closes #18490 - [SPARK-21269][Core][WIP] Fix FetchFailedException when enable maxReqSizeShuffleToMem and KryoSerializer
Closes #18585 - SPARK-21359
Closes #18609 - Spark SQL merge small files to big files Update InsertIntoHiveTable.scala
Added:
Closes #18308 - [SPARK-21099][Spark Core] INFO Log Message Using Incorrect Executor I…
Closes #18599 - [SPARK-21372] spark writes one log file even I set the number of spark_rotate_log to 0
Closes #18619 - [SPARK-21397][BUILD]Maven shade plugin adding dependency-reduced-pom.xml to …
Closes #18667 - Fix the simpleString used in error messages
Closes #18782 - Branch 2.1
Added:
Closes #17694 - [SPARK-12717][PYSPARK] Resolving race condition with pyspark broadcasts when using multiple threads
Added:
Closes #16456 - [SPARK-18994] clean up the local directories for application in future by annother thread
Closes #18683 - [SPARK-21474][CORE] Make number of parallel fetches from a reducer configurable
Closes #18690 - [SPARK-21334][CORE] Add metrics reporting service to External Shuffle Server
Added:
Closes #18827 - Merge pull request 1 from apache/master
How was this patch tested?
N/A