Add new load and save ops for storing model params in a single file #7780

sidgoyal78 · 2018-01-23T07:18:52Z

This PR addresses issue #7722 by implementing the approach described in the issue.

TODO: As pointed out by @reyoung in the implementation of save_op, we need to move few file-related helper functions to a different namespace. I will do it maybe in this PR or maybe another. For the time being the helper functions are replicated and are static.

Xreki · 2018-01-23T11:41:32Z

paddle/operators/load_combine_op.cc

+ public:
+  LoadCombineOpProtoMaker(OpProto *proto, OpAttrChecker *op_checker)
+      : OpProtoAndCheckerMaker(proto, op_checker) {
+    AddOutput("Out", "(Tensor) The tensor need to be load_combineed");


Can we change the implementation of load_op? The load_combined operator is a little confusing.

Maybe we can change the output Out of load_op to a Duplicable:

AddOutput("Out", "(Tensor) The tensor need to be loaded").AsDuplicable();

Then Out will be a vector of Variables, and we can load all the Variables from a file through one load_op.

Or we can use offset in a file instead of the position_counter? And can we avoid to read the content of the file again and again?

I think we can combine it with load_op. But in my opinion, it will look more like:

//book keeping code if( combine_param ) { // do stuff from combine_op } else // do stuff from usual op }

Regarding your Duplicable suggestion: I am not sure I follow completely, but still I think we will need a deserialization logic over this merged together tensor (which maybe the same amount of computation).

Regarding your suggestion about offset: I think it has 2 aspects:

If we use the offset, we will have to save it in some variable, (either make a class and make a member variable assign the offset, or something like that)

I don't see how an offset helps much in computation saving. I think even if we have an offset, everytime we open the file to load, we still have to move the file pointer from starting to the actual location. And we still do it in this code by hopping through different chunks (and of course, in the current implementation, there is some extra overheard of new and copy to istringstream).

I think we can combine it with load_op

I agree with it. Maybe can do it in next PRs.

Regarding your suggestion about offset. I think even if we have an offset, everytime we open the file to load, we still have to move the file pointer from starting to the actual location.

The current LoadCombineOp only process one file each time. Can it process many files at the same time? That is to say, this op loads all parameters at the same time and saves them into one file, and it doesn't need offset or point_counter. Thus, it can avoid reading the content of the file again and again.

That is to say, this op loads all parameters at the same time and saves them into one file, and it doesn't need offset or point_counter. Thus, it can avoid reading the content of the file again and again.

@luotao1 : Can you please explain a bit? I am not sure if I understand the main idea. (Did you mean : can it process many tensors (instead of files) at the same time?)

To just explain my point of view a bit more, I thought that the op will be used in the manner as done currently in save_vars() in fluid/io.py (ref. So basically, i assumed that we will have some of usage which looks like this:

https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/io.py#L89-L97

(with save replaced with save_combine)

I think we can combine it with load_op

I agree.

That is to say, this op loads all parameters at the same time and saves them into one file, and it doesn't need offset or point_counter. @luotao1 : Can you please explain a bit?

I think what @luotao1 is saying is that you can construct a save/load program desc for loading/saving all parameters data at once. This program desc has only one load/save op that takes a list of var names as input/output and do the load/save tasks for once. This list can be obtained from the main program desc by traversing variables. By doing this, we can save a lot of time reading/write files. Thus I prefer this option.

luotao1 · 2018-01-24T02:26:22Z

paddle/operators/save_load_combine_op_test.cc

@@ -0,0 +1,125 @@
+/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.


2016->2018 in Copyright

luotao1 · 2018-01-24T03:18:43Z

paddle/operators/load_combine_op.cc

+ public:
+  LoadCombineOpProtoMaker(OpProto *proto, OpAttrChecker *op_checker)
+      : OpProtoAndCheckerMaker(proto, op_checker) {
+    AddOutput("Out", "(Tensor) The tensor need to be load_combineed");


I think we can combine it with load_op

I agree with it. Maybe can do it in next PRs.

Regarding your suggestion about offset. I think even if we have an offset, everytime we open the file to load, we still have to move the file pointer from starting to the actual location.

The current LoadCombineOp only process one file each time. Can it process many files at the same time? That is to say, this op loads all parameters at the same time and saves them into one file, and it doesn't need offset or point_counter. Thus, it can avoid reading the content of the file again and again.

luotao1 · 2018-01-24T03:25:01Z

paddle/operators/save_load_combine_op_test.cc

+      50, 20, lod4, "test_var4", 3, "tensor.save", scope, place, expect_lod4);
+
+  paddle::framework::LoD actual_lod1, actual_lod2, actual_lod3, actual_lod4;
+  int* actual1 = create_and_run_load_combine_op("out_var1", 0, "tensor.save",


When LoadCombineOp loads all parameters at the same time, line 112 maybe like:
create_and_run_load_combine_op("out_var_list", "tensor.save", scope, place) ?

Xreki · 2018-01-31T06:41:15Z

@sidgoyal78 since #7909 is merged, can you close this PR?

sidgoyal78 added 3 commits January 22, 2018 16:48

Add save_combine_op

7781287

Add load_combine_op and test

6adc36e

Add unit-test

99d1d07

sidgoyal78 requested review from Xreki and kexinzhao January 23, 2018 07:19

Add a delete to free buffer memory

a24fca3

Xreki added the 预测原名Inference，包含Capi预测问题等 label Jan 23, 2018

Xreki reviewed Jan 23, 2018

View reviewed changes

Xreki requested review from qingqing01 and luotao1 January 23, 2018 12:03

luotao1 reviewed Jan 24, 2018

View reviewed changes

sidgoyal78 mentioned this pull request Jan 27, 2018

Add variant of new load and save ops for storing model params in a single file #7909

Merged

sidgoyal78 closed this Jan 31, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add new load and save ops for storing model params in a single file #7780

Add new load and save ops for storing model params in a single file #7780

sidgoyal78 commented Jan 23, 2018

Xreki Jan 23, 2018

Xreki Jan 23, 2018

sidgoyal78 Jan 24, 2018 •

edited

Loading

luotao1 Jan 24, 2018

sidgoyal78 Jan 24, 2018 •

edited

Loading

sidgoyal78 Jan 24, 2018

kexinzhao Jan 25, 2018

luotao1 Jan 24, 2018

luotao1 Jan 24, 2018

luotao1 Jan 24, 2018

Xreki commented Jan 31, 2018

		@@ -0,0 +1,125 @@
		/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.

Add new load and save ops for storing model params in a single file #7780

Add new load and save ops for storing model params in a single file #7780

Conversation

sidgoyal78 commented Jan 23, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sidgoyal78 Jan 24, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sidgoyal78 Jan 24, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Xreki commented Jan 31, 2018

sidgoyal78 Jan 24, 2018 •

edited

Loading

sidgoyal78 Jan 24, 2018 •

edited

Loading