Skip to content

Commit

Permalink
HBASE-24885 STUCK RIT by hbck2 assigns (#2283)
Browse files Browse the repository at this point in the history
Adds region state check on hbck2 assigns/unassigns. Returns pid of -1
if in inappropriate state with logging explaination which suggests
passing override if operator wants to assign/unassign anyways. Here
is an example of what happens now if hbck2 tries an unassign and
Region already unassigned:

  2020-08-19 11:22:06,926 INFO  [RpcServer.default.FPBQ.Fifo.handler=1,queue=0,port=50086] assignment.AssignmentManager(820): Failed {ENCODED => d1112e553991e938b6852f87774c91ee, NAME => 'TestHbck,zzzzz,1597861310769.d1112e553991e938b6852f87774c91ee.', STARTKEY => 'zzzzz', ENDKEY => ''} unassign, override=false; set override to by-pass state checks.
  org.apache.hadoop.hbase.client.DoNotRetryRegionException: Unexpected state for state=CLOSED, location=null, table=TestHbck, region=d1112e553991e938b6852f87774c91ee
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:583)
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createOneUnassignProcedure(AssignmentManager.java:812)
          at org.apache.hadoop.hbase.master.MasterRpcServices.unassigns(MasterRpcServices.java:2616)
          at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$2.callBlockingMethod(MasterProtos.java)
          at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:397)
          at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)

Previous it would just create the unassign anyways. Now must pass override
to queue the procedure regardless. Safer.

hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
 javadoc on assigns/unassigns. Minor refactor in assigns/unassigns to cater to
 case where procedure may come back null (if override not set and fails state checks).

hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 checkstyle cleanups.
 Clarifying javadoc on how there is no state checking when bulk assigns creating/enabling
 tables.

 createOneAssignProcedure and createOneUnassignProcedure now handle exceptions which now
 can be thrown if no override and region state is not appropriate.

 Aggregation of createAssignProcedure and createUnassignProcedure instances adding in
 region state check invoked if override is NOT set.

hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateNode.java
 Change to setProcedure so it returns passed proc as result instead of void

Signed-off-by: Duo Zhang <[email protected]>
  • Loading branch information
saintstack committed Aug 24, 2020
1 parent e0c9f91 commit 6ad73b9
Show file tree
Hide file tree
Showing 4 changed files with 177 additions and 108 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,7 @@
import org.apache.hadoop.hbase.ipc.ServerNotRunningYetException;
import org.apache.hadoop.hbase.ipc.ServerRpcController;
import org.apache.hadoop.hbase.master.assignment.RegionStates;
import org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure;
import org.apache.hadoop.hbase.master.locking.LockProcedure;
import org.apache.hadoop.hbase.master.procedure.MasterProcedureEnv;
import org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil;
Expand Down Expand Up @@ -2582,30 +2583,40 @@ private RegionInfo getRegionInfo(HBaseProtos.RegionSpecifier rs) throws UnknownR
}

/**
* A 'raw' version of assign that does bulk and skirts Master state checks (assigns can be made
* during Master startup). For use by Hbck2.
* @throws ServiceException If no MasterProcedureExecutor
*/
@Override
public MasterProtos.AssignsResponse assigns(RpcController controller,
MasterProtos.AssignsRequest request)
throws ServiceException {
private void checkMasterProcedureExecutor() throws ServiceException {
if (this.master.getMasterProcedureExecutor() == null) {
throw new ServiceException("Master's ProcedureExecutor not initialized; retry later");
}
}

/**
* A 'raw' version of assign that does bulk and can skirt Master state checks if override
* is set; i.e. assigns can be forced during Master startup or if RegionState is unclean.
* Used by HBCK2.
*/
@Override
public MasterProtos.AssignsResponse assigns(RpcController controller,
MasterProtos.AssignsRequest request) throws ServiceException {
checkMasterProcedureExecutor();
MasterProtos.AssignsResponse.Builder responseBuilder =
MasterProtos.AssignsResponse.newBuilder();
MasterProtos.AssignsResponse.newBuilder();
try {
boolean override = request.getOverride();
LOG.info("{} assigns, override={}", master.getClientIdAuditPrefix(), override);
for (HBaseProtos.RegionSpecifier rs: request.getRegionList()) {
long pid = Procedure.NO_PROC_ID;
RegionInfo ri = getRegionInfo(rs);
if (ri == null) {
LOG.info("Unknown={}", rs);
responseBuilder.addPid(Procedure.NO_PROC_ID);
continue;
} else {
Procedure p = this.master.getAssignmentManager().createOneAssignProcedure(ri, override);
if (p != null) {
pid = this.master.getMasterProcedureExecutor().submitProcedure(p);
}
}
responseBuilder.addPid(this.master.getMasterProcedureExecutor().submitProcedure(this.master
.getAssignmentManager().createOneAssignProcedure(ri, override)));
responseBuilder.addPid(pid);
}
return responseBuilder.build();
} catch (IOException ioe) {
Expand All @@ -2614,30 +2625,31 @@ public MasterProtos.AssignsResponse assigns(RpcController controller,
}

/**
* A 'raw' version of unassign that does bulk and skirts Master state checks (unassigns can be
* made during Master startup). For use by Hbck2.
* A 'raw' version of unassign that does bulk and can skirt Master state checks if override
* is set; i.e. unassigns can be forced during Master startup or if RegionState is unclean.
* Used by HBCK2.
*/
@Override
public MasterProtos.UnassignsResponse unassigns(RpcController controller,
MasterProtos.UnassignsRequest request)
throws ServiceException {
if (this.master.getMasterProcedureExecutor() == null) {
throw new ServiceException("Master's ProcedureExecutor not initialized; retry later");
}
MasterProtos.UnassignsRequest request) throws ServiceException {
checkMasterProcedureExecutor();
MasterProtos.UnassignsResponse.Builder responseBuilder =
MasterProtos.UnassignsResponse.newBuilder();
try {
boolean override = request.getOverride();
LOG.info("{} unassigns, override={}", master.getClientIdAuditPrefix(), override);
for (HBaseProtos.RegionSpecifier rs: request.getRegionList()) {
long pid = Procedure.NO_PROC_ID;
RegionInfo ri = getRegionInfo(rs);
if (ri == null) {
LOG.info("Unknown={}", rs);
responseBuilder.addPid(Procedure.NO_PROC_ID);
continue;
} else {
Procedure p = this.master.getAssignmentManager().createOneUnassignProcedure(ri, override);
if (p != null) {
pid = this.master.getMasterProcedureExecutor().submitProcedure(p);
}
}
responseBuilder.addPid(this.master.getMasterProcedureExecutor().submitProcedure(this.master
.getAssignmentManager().createOneUnassignProcedure(ri, override)));
responseBuilder.addPid(pid);
}
return responseBuilder.build();
} catch (IOException ioe) {
Expand Down
Loading

0 comments on commit 6ad73b9

Please sign in to comment.