MAPREDUCE-6827. Fixed bug: the second foreach-loop was not executed #177

Closed
@javeme javeme commented Dec 30, 2016

Traversing the Iterable values a second time with a foreach loop inside reduce() fails: the body of the second loop is never executed.

JIRA MAPREDUCE-6827

The following code is a reduce() method (of WordCount):

public static class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

	@Override
	protected void reduce(Text key, Iterable<IntWritable> values, Context context)
			throws IOException, InterruptedException {

		// print some logs
		List<String> vals = new LinkedList<>();
		for(IntWritable i : values) {
			vals.add(i.toString());
		}
		System.out.println(String.format(">>>> reduce(%s, [%s])",
				key, String.join(", ", vals)));

		// sum of values
		int sum = 0;
		for(IntWritable i : values) {
			sum += i.get();
		}
		System.out.println(String.format(">>>> reduced(%s, %s)",
				key, sum));

		context.write(key, new IntWritable(sum));
	}
}

When we ran it, the value of sum was always 0!

Debugging showed that the second foreach loop never executes. The root cause is the return value of Iterable.iterator(): the two calls made by the two foreach loops receive the same instance, which is already exhausted after the first loop. In general, Iterable.iterator() is expected to return a new instance on each call, as ArrayList.iterator() does. This patch fixes the bug.
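The effect can be reproduced without Hadoop. The following is a minimal sketch, not the actual Hadoop code: SinglePassIterable and countBothPasses are hypothetical names introduced only for illustration, and the class merely assumes an Iterable whose iterator() hands back the same instance on every call, as described above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SinglePassDemo {

	// Hypothetical stand-in for the reducer's values: iterator() returns
	// the same underlying iterator on every call.
	static final class SinglePassIterable<T> implements Iterable<T> {
		private final Iterator<T> shared;

		SinglePassIterable(Iterable<T> source) {
			this.shared = source.iterator();
		}

		@Override
		public Iterator<T> iterator() {
			return shared; // same instance every time
		}
	}

	// Runs two foreach loops over the same Iterable and returns how many
	// elements each loop visited.
	static int[] countBothPasses(Iterable<?> values) {
		int first = 0;
		for (Object ignored : values) {
			first++;
		}
		int second = 0;
		for (Object ignored : values) {
			second++;
		}
		return new int[] {first, second};
	}

	public static void main(String[] args) {
		List<Integer> data = Arrays.asList(1, 2, 3);
		int[] single = countBothPasses(new SinglePassIterable<>(data));
		int[] fresh = countBothPasses(data);
		System.out.println("single-pass: " + single[0] + ", " + single[1]); // single-pass: 3, 0
		System.out.println("list:        " + fresh[0] + ", " + fresh[1]);   // list:        3, 3
	}
}
```

The second loop over the single-pass wrapper visits nothing because hasNext() on the shared iterator is already false, which matches the sum of 0 observed in the reducer.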

Signed-off-by: Javeme [email protected]

javeme commented Dec 30, 2016

NOTE: The following is a test of foreach with int[] and ArrayList. The results are as expected: in both cases the second loop also executes correctly.

import java.util.ArrayList;

public class TestForeach {

	public static void main(String[] args) {
		
		// test foreach twice with int[]
		int[] list1 = new int[]{1, 2};
		
		System.out.println("==== int[] 1");
		for(int i : list1) {
			System.out.println(i);
		}
		
		System.out.println("===int[] 2");
		for(int i : list1) {
			System.out.println(i);
		}
		
		// test foreach twice with ArrayList
		ArrayList<String> list = new ArrayList<String>();
		list.add("1");
		list.add("2");
		Iterable<String> list2 = list;

		System.out.println();
		System.out.println("===ArrayList 1");
		for(String i : list2) {
			System.out.println(i);
		}
		
		System.out.println("===ArrayList 2");
		for(String i : list2) {
			System.out.println(i);
		}
	}

}

@javeme javeme changed the title MAPREDUCE-6827. Failed to traverse Iterable values the second time in… MAPREDUCE-6827. Fixed bug: the second foreach-loop was not executed Jan 4, 2017
javeme commented Jan 4, 2017

According to Daniel Templeton, this behavior is expected.
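Given that the values can only be traversed once, a reducer that needs them twice can buffer them during the first pass. The following is a minimal sketch under that assumption; it does not use Hadoop, and singlePass and logAndSum are hypothetical helpers that only mimic the reduce() method from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class BufferedValuesDemo {

	// Hypothetical single-pass source standing in for the reducer's values:
	// every call to iterator() returns the same instance.
	static Iterable<Integer> singlePass(List<Integer> data) {
		Iterator<Integer> shared = data.iterator();
		return () -> shared;
	}

	// Copies the values during the only pass available, then derives both
	// the log line and the sum from the copy.
	static int logAndSum(String key, Iterable<Integer> values) {
		List<Integer> buffered = new ArrayList<>();
		for (Integer v : values) {
			buffered.add(v);
		}
		System.out.println(String.format(">>>> reduce(%s, %s)", key, buffered));

		int sum = 0;
		for (int v : buffered) {
			sum += v;
		}
		System.out.println(String.format(">>>> reduced(%s, %s)", key, sum));
		return sum;
	}

	public static void main(String[] args) {
		logAndSum("hello", singlePass(Arrays.asList(1, 1, 1)));
	}
}
```

In a real reducer the buffer should probably hold copied primitives (for example the result of i.get()) rather than the IntWritable references themselves, on the assumption that the framework may reuse the same value object between iterations. Buffering also keeps all values for a key in memory, so it only suits keys with a modest number of values.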

@javeme javeme closed this Jan 4, 2017