[SPARK-2650][SQL] Try to partially fix SPARK-2650 by adjusting initial buffer size and reducing memory allocation #1769

liancheng · 2014-08-04T19:01:14Z

Please refer to the comments on SPARK-2650 for further details.

This PR adjusts the initial in-memory columnar buffer size to 1MB, the same as the default value of Shark's shark.column.partitionSize.mb property when running in local mode. Shark-style partition size estimation will be added in a separate PR.

Also, before this PR, NullableColumnBuilder copied the whole buffer to add the null positions section, and then CompressibleColumnBuilder copied and compressed the buffer again, even when compression was disabled (the PassThrough compression scheme is used to disable compression). This PR eliminates the first buffer copy to reduce memory consumption.
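To illustrate the copy elimination described above, here is a minimal sketch in Python rather than the actual Scala code. The function names, the byte layout of the null positions section (a null count followed by the positions, as 4-byte integers), and the allocation counter are all assumptions made for illustration; they are not taken from the patch.

```python
import struct


def null_section(null_positions):
    # Assumed layout: null count, then each null position, as 4-byte little-endian ints.
    return struct.pack(f"<{len(null_positions) + 1}i", len(null_positions), *null_positions)


def pass_through_into(dst, offset, src):
    # Stand-in for a compression scheme that writes the bytes unchanged into dst.
    dst[offset:offset + len(src)] = src


def build_with_two_copies(values, null_positions):
    """Old flow (sketch): one buffer to prepend the null section, a second for the encoder."""
    header = null_section(null_positions)
    intermediate = bytearray(len(header) + len(values))  # first full-size buffer
    intermediate[:len(header)] = header
    intermediate[len(header):] = values
    final = bytearray(len(intermediate))                 # second full-size buffer
    pass_through_into(final, 0, intermediate)
    return bytes(final), len(intermediate) + len(final)


def build_with_one_copy(values, null_positions):
    """New flow (sketch): header and encoded values go straight into the final buffer."""
    header = null_section(null_positions)
    final = bytearray(len(header) + len(values))         # the only full-size buffer
    final[:len(header)] = header
    pass_through_into(final, len(header), values)
    return bytes(final), len(final)
```

Both functions produce the same bytes; the second simply avoids allocating the intermediate buffer, which is the kind of saving the description refers to. The second return value counts only the full-size builder buffers in this sketch, not every temporary Python object.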


SparkQA · 2014-08-04T19:04:32Z

QA tests have started for PR 1769. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17866/consoleFull

SparkQA · 2014-08-04T19:14:13Z

QA tests have started for PR 1769. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17868/consoleFull

SparkQA · 2014-08-04T20:19:49Z

QA results for PR 1769:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17866/consoleFull

SparkQA · 2014-08-04T20:37:31Z

QA results for PR 1769:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17868/consoleFull

marmbrus · 2014-08-06T01:51:47Z

Thanks, merged into master and 1.1.

…l buffer size and reducing memory allocation

JIRA issue: [SPARK-2650](https://issues.apache.org/jira/browse/SPARK-2650)

Please refer to [comments](https://issues.apache.org/jira/browse/SPARK-2650?focusedCommentId=14084397&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14084397) of SPARK-2650 for some other details.

This PR adjusts the initial in-memory columnar buffer size to 1MB, same as the default value of Shark's `shark.column.partitionSize.mb` property when running in local mode. Will add Shark style partition size estimation in another PR.

Also, before this PR, `NullableColumnBuilder` copies the whole buffer to add the null positions section, and then `CompressibleColumnBuilder` copies and compresses the buffer again, even if compression is disabled (`PassThrough` compression scheme is used to disable compression). In this PR the first buffer copy is eliminated to reduce memory consumption.

Author: Cheng Lian

Closes #1769 from liancheng/spark-2650 and squashes the following commits:

88a042e [Cheng Lian] Fixed method visibility and removed dead code
001f2e5 [Cheng Lian] Try fixing SPARK-2650 by adjusting initial buffer size and reducing memory allocation

(cherry picked from commit d0ae3f3)
Signed-off-by: Michael Armbrust


Try fixing SPARK-2650 by adjusting initial buffer size and reducing memory allocation

001f2e5

Fixed method visibility and removed dead code

88a042e

asfgit closed this in d0ae3f3 Aug 6, 2014

liancheng deleted the spark-2650 branch September 24, 2014 00:09
