fix the state of message for evaluation #470
Merged
Fixes #462. The example below shows the modified output order.
Preliminary
- Configuration: fedavg_convnet2_on_femnist.yaml (simulation mode)
- eval.freq=10, so the server performs evaluation after the 9th training round (rounds 0-9)
Observations
From the figure we can observe that:
- The server broadcasts `evaluate` messages to all the clients. However, these `evaluate` messages are not handled at this moment, because the server's own handling operation has not finished yet and cannot be interrupted. These logs are generated from the perspective of the server.
- After broadcasting the `evaluate` messages, the server broadcasts the training request messages that start a new training round (i.e., the 10th round). The server then finishes its handling operation, and by this point some of the clients have received two messages from the server: `evaluate` (sent at the end of the 9th round) and the training request for the 10th round.
- The clients handle the `evaluate` and/or training request messages one by one. When handling the `evaluate` message, a client does not print any results locally; it sends the evaluation metrics to the server. When handling the training request, the client prints the training results, as shown in the 4th part of the provided example, and sends the updated model to the server after training. Thus, although only the training logs are visible in the 4th part, the clients also handle the `evaluate` message there (and return the metrics to the server). Note that these logs are generated from the perspective of the clients.
Summary
In summary, although the logs show the evaluation results of the 9th round (from the server) printed after the training results of the 10th round (from the clients), the messages are handled in the correct order, which matches our expectation:
Clients locally train at the 9th round (part 1)
-> Server starts evaluation at the end of the 9th round (part 2)
-> Server starts training for the 10th round (part 3)
-> Clients perform evaluation of the 9th round, and clients perform local training for the 10th round (part 4)
-> Server merges and prints the evaluation results of the 9th round (part 5)
-> Server starts training for the 11th round (part 6)
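
The ordering above can be reproduced with a small, self-contained toy simulation. This is only a sketch and does not use the project's code: the `simulate` function, the per-client inbox queues, the log strings, and the rule "evaluate when `(finished_round + 1) % eval_freq == 0`" are all assumptions made for illustration. It shows why the server's evaluation results for the 9th round appear in the log after the clients' training results for the 10th round, even though the `evaluate` messages were sent first.

```python
from collections import deque

EVAL_FREQ = 10  # mirrors eval.freq=10 from the example above


def simulate(num_clients=2, finished_round=9, eval_freq=EVAL_FREQ):
    """Return toy log lines in the order they would be printed."""
    log = []
    # One FIFO inbox per client (hypothetical stand-in for the message channel).
    inbox = {cid: deque() for cid in range(1, num_clients + 1)}
    pending_metrics = []
    next_round = finished_round + 1

    # Server-side handler: it runs to completion without interruption,
    # so both broadcasts happen before any client handles a message.
    # Assumed rule: evaluate when (finished_round + 1) is a multiple of eval_freq.
    if next_round % eval_freq == 0:
        log.append(f"server: start evaluation at end of round {finished_round}")
        for queue in inbox.values():
            queue.append(("evaluate", finished_round))
    log.append(f"server: start training for round {next_round}")
    for queue in inbox.values():
        queue.append(("train", next_round))

    # Clients handle their queued messages one by one, in arrival order.
    for cid, queue in inbox.items():
        while queue:
            msg_type, rnd = queue.popleft()
            if msg_type == "evaluate":
                # Metrics are returned to the server; nothing is printed locally.
                pending_metrics.append((cid, rnd))
            else:
                log.append(f"client {cid}: training results of round {rnd}")

    # The server merges and prints metrics only after the clients have replied.
    if pending_metrics:
        log.append(f"server: evaluation results of round {finished_round}")
    return log


if __name__ == "__main__":
    for line in simulate():
        print(line)
```

With the defaults, the "evaluation results of round 9" line is printed last, after both clients' "training results of round 10" lines, which matches parts 2 to 5 of the summary.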