Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TA3151]feat(snap_rebuild)Flush outstanding IOs before taking snapshot on rebuild clone. #123

Merged
merged 6 commits into from
Sep 27, 2018

Conversation

satbirchhikara
Copy link
Collaborator

@satbirchhikara satbirchhikara commented Sep 27, 2018

This PR is to flush all outstanding IOs before create snapshot on rebuild clone. This way we are making sure that all IOs will be flushed to clone and we are not missing any IO.
Signed-off-by: satbir [email protected]

@satbirchhikara satbirchhikara changed the title Flush outstanding IOs before taking snapshot on rebuild clone. [TA3151]feat(snap_rebuild)Flush outstanding IOs before taking snapshot on rebuild clone. Sep 27, 2018
#endif
while (1) {
if (!zinfo->quiesce_done)
sleep(1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a doubt. If replica gets disconnected from the target during rebuilding and quiescing was requested then will this thread exit?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current zfs takes care that - if either mgmt or data conn breaks, any rebuilding related threads also will be exited.
With current quiescing code also, I don't think there will be any issue, except that, it may be delayed to break rebuild threads.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I is fixed now.


/* IO to clone should be sent only when it is from app */
if (!is_rebuild && (zinfo->clone_zv != NULL)) {
if (!is_rebuild && !ZVOL_IS_HEALTHY(zinfo->main_zv) &&
(zinfo->clone_zv != NULL)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can remove check for clone_zv

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -334,6 +341,12 @@ uzfs_zvol_worker(void *arg)
}
}

/* App IOs should go to cloen_zv */
Copy link

@vishnuitta vishnuitta Sep 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can make this as 'else' for above 'if' without NULL check for clone_zv

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if (inject_error.delay.rebuid_io_quiesce_check_by_pass)
continue;
#endif
while (1) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this may need to be executed only in the thread where we had set these flags

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think after discussion this code have no side effect.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

if (!rebuild_cmd_req &&
!ZVOL_IS_HEALTHY(zinfo->main_zv) &&
(zinfo->clone_zv != NULL))
read_zv = zinfo->clone_zv;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should do the same for the sync also (ZVOL_OPCODE_SYNC) ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done sir.

mynktl
mynktl previously approved these changes Sep 27, 2018
@@ -213,6 +213,19 @@ taskq_wait_outstanding(taskq_t *tq, taskqid_t id)
taskq_wait(tq);
}

int

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment related to function

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sir I think this is not latest code. ....I have renamed it to taskq_check_active_ios(). Please do let me know if you meant else

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just few lines of comment explaining the i/p and o/p of this function

@@ -169,6 +169,7 @@ enum zvol_status {

typedef enum zvol_status zvol_status_t;

#define ZVOL_IS_HEALTHY(zv) (zv->zv_status == ZVOL_STATUS_HEALTHY)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets us uzfs_zinfo_get_status(zinfo) getter function

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some more macros which are also not as per guidelines.
#define ZINFO_IS_DEGRADED(zinfo) (ZVOL_IS_DEGRADED(zinfo->main_zv))
#define ZVOL_IS_DEGRADED(zv) (zv->zv_status == ZVOL_STATUS_DEGRADED)
Let me fix them with next PR. This PR seems to be little heavy weight.

@@ -348,6 +356,8 @@ uzfs_zvol_worker(void *arg)

case ZVOL_OPCODE_SYNC:
uzfs_flush_data(zinfo->main_zv);
if (zinfo->clone_zv)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets check !ZVOL_IS_HEALTHY(zinfo->main_zv) here also instead of checking for zinfo->clone_zv).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

if (zinfo->quiesce_done ||
!taskq_check_active_ios(
zinfo->uzfs_zvol_taskq)) {
zinfo->quiesce_done = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to set zinfo->quiesce_done to one? if condition already make sure of it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never mind, saw the || condition. looks ok.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -158,6 +158,11 @@ typedef struct zvol_info_s {

/* Will be used to singal ack-sender to exit */
uint8_t conn_closed;

/* Rebuild flags to quiesce IOs */
uint8_t quiesce_requested;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use these two as bits in 'flags' variable?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and create setter/getter macros around these variables

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.. we can't set them as bits, as there is no lock in reading these variables

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -348,6 +356,8 @@ uzfs_zvol_worker(void *arg)

case ZVOL_OPCODE_SYNC:
uzfs_flush_data(zinfo->main_zv);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also we should flush to main_zv if (ZVOL_IS_REBUILDING_AFS(zinfo->main_zv) || ZVOL_IS_HEALTHY(zinfo->main_zv).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@codecov-io
Copy link

Codecov Report

Merging #123 into zfs-0.7-release will decrease coverage by 0.44%.
The diff coverage is 80%.

Impacted file tree graph

@@                 Coverage Diff                 @@
##           zfs-0.7-release     #123      +/-   ##
===================================================
- Coverage             52.4%   51.96%   -0.45%     
===================================================
  Files                  241      241              
  Lines                77442    77437       -5     
===================================================
- Hits                 40587    40240     -347     
- Misses               36855    37197     +342

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1086c95...8d7e1d0. Read the comment docs.

Copy link

@vishnuitta vishnuitta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes looks good

@vishnuitta vishnuitta merged commit d0f0edd into mayadata-io:zfs-0.7-release Sep 27, 2018
@satbirchhikara satbirchhikara deleted the TA3151 branch September 27, 2018 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants