Use optional third argument as edge attribute. #901
Can one of the admins verify this patch?
This is definitely a good change.
Jenkins, test this please.
On further thought, it might be better to let users define how the third argument is parsed, so that arbitrary data types are supported. What do you think?
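One way to read the suggestion above is to have the caller pass a parsing function for the third column. The sketch below is only an illustration under that assumption: `EdgeLineParser`, `parseEdgeLine`, and `parseAttr` are hypothetical names, not part of GraphX, and the line format (whitespace-separated fields, `#` for comments) is assumed.

```scala
// Hypothetical helper, not part of GraphX.
object EdgeLineParser {
  // Parses one line of an edge list file into (srcId, dstId, attr).
  // The caller supplies `parseAttr`, so any attribute type ED can be produced.
  // Returns None for comment lines and for lines with fewer than three fields.
  def parseEdgeLine[ED](line: String, parseAttr: String => ED): Option[(Long, Long, ED)] = {
    val trimmed = line.trim
    val fields = trimmed.split("\\s+")
    if (trimmed.startsWith("#") || fields.length < 3) None
    else Some((fields(0).toLong, fields(1).toLong, parseAttr(fields(2))))
  }
}
```

A caller would pick the conversion explicitly, for example `EdgeLineParser.parseEdgeLine("1 2 0.5", _.toDouble)`.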
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15258/
Allowing arbitrary data types sounds like a good idea. I actually tried something like this: `def edgeListFile[@specialized(Long, Int, Double) ED: ClassTag](sc: SparkContext,` and `if (lineArray.length >= 3) lineArray(2).asInstanceOf[ED]`. But I am running into this error: `java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer`. I guess I will have to handle primitive types separately? Also, do you follow a Scala code style guide, so that I can follow it for future patches?
Unlike Python (but like Java), Scala doesn't use asInstanceOf for arbitrary type conversions. In this case, it won't work to do
[1] http://docs.scala-lang.org/tutorials/FAQ/finding-implicits.html#context_bounds
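The comment above is cut off, so the following is not necessarily what was proposed. It is a minimal sketch of how a context bound could replace the failing cast: a cast only reinterprets a reference, so a `String` never becomes an `Integer`, whereas a type class can carry the actual conversion. `AttrParser` and `ContextBoundExample` are hypothetical names introduced for illustration.

```scala
// Hypothetical type class: describes how to build an ED from a text field.
trait AttrParser[ED] {
  def parse(field: String): ED
}

object AttrParser {
  // Instances in the companion object are found by implicit search.
  implicit val intParser: AttrParser[Int] = new AttrParser[Int] {
    def parse(field: String): Int = field.toInt
  }
  implicit val doubleParser: AttrParser[Double] = new AttrParser[Double] {
    def parse(field: String): Double = field.toDouble
  }
  implicit val stringParser: AttrParser[String] = new AttrParser[String] {
    def parse(field: String): String = field
  }
}

object ContextBoundExample {
  // `ED: AttrParser` is a context bound: the compiler must find an
  // implicit AttrParser[ED] at the call site, or compilation fails.
  def parseAttr[ED: AttrParser](field: String): ED =
    implicitly[AttrParser[ED]].parse(field)
}
```

With this shape, an unsupported attribute type is rejected at compile time instead of failing with a `ClassCastException` at run time.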
ok to test
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15320/
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15329/
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15337/
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15340/
@ankurdave @rxin could you guys come to a decision on this one way or the other? Also @npanj mind adding |
@pwendell @npanj @rxin This isn't mergeable at the moment because it doesn't parse the edge attributes correctly. Since it's difficult to do this parsing in general, there are two choices for now:
I'd suggest closing this for now.
Can one of the admins verify this patch?
@npanj Mind closing this?
Sorry, I was out of the loop for a while... closing it.
This is probably the easiest way to include an edge attribute. Or have you been thinking about other approaches? I also want to add one more argument for a default edge attribute value; I will probably do that in another patch.
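The behavior described here, an optional third column plus a default value, could look roughly like the sketch below. It is not the patch itself: `EdgeLineWithDefault` and `defaultAttr` are hypothetical names, the attribute is assumed to be an `Int`, and the default of 1 is chosen only for illustration.

```scala
// Hypothetical helper, not the code from this patch.
object EdgeLineWithDefault {
  // Returns (srcId, dstId, attr). When the line has no third column,
  // `defaultAttr` is used as the edge attribute (assumed default: 1).
  def parse(line: String, defaultAttr: Int = 1): (Long, Long, Int) = {
    val fields = line.trim.split("\\s+")
    require(fields.length >= 2, s"Invalid line: $line")
    val attr = if (fields.length >= 3) fields(2).toInt else defaultAttr
    (fields(0).toLong, fields(1).toLong, attr)
  }
}
```

Files that list only source and destination IDs would keep working under this shape, since a two-column line simply falls back to the default.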