Can't refresh twice or more once pruning happens in the first refresh #5297

Closed
hjh1011 opened this issue Feb 10, 2020 · 0 comments · Fixed by #5335

hjh1011 commented Feb 10, 2020

The task that led me to this issue is something like this: say I have three data sets D1, D2, D3. I train my first model xgb_1 with D1, refresh its leaves with D2, and then incrementally grow trees with D2 to get a model xgb_2. When I try to refresh xgb_2 with D3, it raises an error. I did some digging and suspect it has something to do with node pruning during refresh.
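For reference, here is roughly that workflow, sketched with hypothetical slices of one data set standing in for D1, D2, D3 (not my exact code, but it shows the shape of it):

import xgboost as xgb
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data["data"], data["target"]

# Hypothetical stand-ins for D1, D2, D3: three slices of one data set.
DM_D1 = xgb.DMatrix(X[:200], label=y[:200])
DM_D2 = xgb.DMatrix(X[200:400], label=y[200:400])
DM_D3 = xgb.DMatrix(X[400:], label=y[400:])

train_params = {'objective': 'binary:logistic', 'max_depth': 2, 'eta': 0.05}
refresh_params = dict(train_params,
                      process_type='update',
                      updater='refresh,prune')

# Train xgb_1 on D1.
xgb_1 = xgb.train(train_params, DM_D1, num_boost_round=5)
# Refresh the leaves of xgb_1 with D2 (5 existing trees -> 5 update rounds).
xgb_1_refreshed = xgb.train(refresh_params, DM_D2, num_boost_round=5,
                            xgb_model=xgb_1)
# Incrementally grow 5 more trees with D2 to get xgb_2 (10 trees total).
xgb_2 = xgb.train(train_params, DM_D2, num_boost_round=5,
                  xgb_model=xgb_1_refreshed)
# Refreshing xgb_2 with D3 is the step that errors out for me
# (whenever pruning happened in the earlier refresh).
xgb_3 = xgb.train(refresh_params, DM_D3, num_boost_round=10,
                  xgb_model=xgb_2)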

To better illustrate what I ran into, I put together a toy example without incremental training. I tuned a few parameters so that the example demonstrates the problem clearly. I saw the same behavior under versions 0.72.1, 0.80, and 0.90.

import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.datasets import load_breast_cancer


param_dict = {'booster': 'gbtree',
              'objective': 'binary:logistic',
              'silent': 0,
              'eta': 0.05,
              'max_depth': 2,
              'min_child_weight': 1,
              'subsample': 1,
              'colsample_bytree': 1,
              'tree_method': 'exact',
              'eval_metric': 'logloss'}

refresh_dict = {'booster': 'gbtree',
                'objective': 'binary:logistic',
                'silent': 0,
                'eta': 0.05,
                'eval_metric': 'logloss',
                'process_type': 'update',
                'updater': 'refresh,prune'}

data = load_breast_cancer()

X = pd.DataFrame(data["data"])
X.columns = data["feature_names"]
y = data["target"]

DM_0 = xgb.DMatrix(X, label=y)

xgb_0 = xgb.train(params=param_dict,
                  dtrain=DM_0,
                  num_boost_round=5)

xgb_0.dump_model("xgb_0.txt")

I trained a model with only 5 trees of depth 2 so that we can inspect it easily:

booster[0]:
0:[worst radius<16.7950001] yes=1,no=2,missing=1
	1:[worst concave points<0.135800004] yes=3,no=4,missing=3
		3:leaf=0.0958456993
		4:leaf=-0.0200000014
	2:[mean texture<16.1100006] yes=5,no=6,missing=5
		5:leaf=0.00476190494
		6:leaf=-0.095480226
booster[1]:
0:[worst radius<16.7950001] yes=1,no=2,missing=1
	1:[worst concave points<0.135800004] yes=3,no=4,missing=3
		3:leaf=0.0913208351
		4:leaf=-0.0190817993
	2:[worst texture<19.9099998] yes=5,no=6,missing=5
		5:leaf=0.00504727243
		6:leaf=-0.0910744742
booster[2]:
0:[worst radius<16.7950001] yes=1,no=2,missing=1
	1:[worst concave points<0.160299987] yes=3,no=4,missing=3
		3:leaf=0.0828778669
		4:leaf=-0.0650591701
	2:[mean concavity<0.0721400008] yes=5,no=6,missing=5
		5:leaf=-0.00284839468
		6:leaf=-0.0886432752
booster[3]:
0:[worst concave points<0.142349988] yes=1,no=2,missing=1
	1:[worst area<957.450012] yes=3,no=4,missing=3
		3:leaf=0.0806099474
		4:leaf=-0.0552328601
	2:[worst area<729.549988] yes=5,no=6,missing=5
		5:leaf=0.0112194559
		6:leaf=-0.0840485916
booster[4]:
0:[worst perimeter<105.949997] yes=1,no=2,missing=1
	1:[worst concave points<0.158899993] yes=3,no=4,missing=3
		3:leaf=0.0797721893
		4:leaf=-0.0421395414
	2:[mean concave points<0.0488649979] yes=5,no=6,missing=5
		5:leaf=0.0206357557
		6:leaf=-0.0770028159

Now, if I refresh the model with exactly the data I trained it on, no pruning will happen given the hyperparameters I set ('subsample': 1, 'colsample_bytree': 1), and in that case you can refresh it as many times as you want:

xgb_1 = xgb.train(params=refresh_dict,
                  dtrain=DM_0,
                  num_boost_round=5,
                  xgb_model=xgb_0)

xgb_2 = xgb.train(params=refresh_dict,
                  dtrain=DM_0,
                  num_boost_round=5,
                  xgb_model=xgb_1)

However, if we take a subset of the training data such that tree pruning happens:

DM_1 = DM_0.slice(list(range(350, 450)))
xgb_1 = xgb.train(params=refresh_dict,
                  dtrain=DM_1,
                  num_boost_round=5,
                  xgb_model=xgb_0)

xgb_1.dump_model("xgb_1.txt")

In this specific case, the first pruning happens in the 4th tree (booster[3]):

booster[0]:
0:[worst radius<16.7950001] yes=1,no=2,missing=1
	1:[worst concave points<0.135800004] yes=3,no=4,missing=3
		3:leaf=0.0942857191
		4:leaf=0.00909090973
	2:[mean texture<16.1100006] yes=5,no=6,missing=5
		5:leaf=0
		6:leaf=-0.0785714313
booster[1]:
0:[worst radius<16.7950001] yes=1,no=2,missing=1
	1:[worst concave points<0.135800004] yes=3,no=4,missing=3
		3:leaf=0.0900324881
		4:leaf=0.00880176853
	2:[worst texture<19.9099998] yes=5,no=6,missing=5
		5:leaf=0
		6:leaf=-0.0761579573
booster[2]:
0:[worst radius<16.7950001] yes=1,no=2,missing=1
	1:[worst concave points<0.160299987] yes=3,no=4,missing=3
		3:leaf=0.0844815597
		4:leaf=0
	2:[mean concavity<0.0721400008] yes=5,no=6,missing=5
		5:leaf=0
		6:leaf=-0.0791416466
booster[3]:
0:[worst concave points<0.142349988] yes=1,no=2,missing=1
	1:leaf=0.0791857168
	2:[worst area<729.549988] yes=5,no=6,missing=5
		5:leaf=0
		6:leaf=-0.0771616027
booster[4]:
0:[worst perimeter<105.949997] yes=1,no=2,missing=1
	1:[worst concave points<0.158899993] yes=3,no=4,missing=3
		3:leaf=0.078185834
		4:leaf=0
	2:[mean concave points<0.0488649979] yes=5,no=6,missing=5
		5:leaf=0.0301180389
		6:leaf=-0.0634451061

So now, if you try to refresh the model again with:

xgb_2 = xgb.train(params=refresh_dict,
                  dtrain=DM_1,
                  num_boost_round=5,
                  xgb_model=xgb_1)

Here is the error you will run into (note that we are refreshing with the same data we used in the first refresh, so this run should involve no pruning at all):

---------------------------------------------------------------------------
XGBoostError                              Traceback (most recent call last)
<ipython-input-22-de83107b3495> in <module>()
      2                   dtrain = DM_1,
      3                   num_boost_round=5,
----> 4                   xgb_model = xgb_1)

/usr/lib/python2.7/site-packages/xgboost/training.pyc in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, xgb_model, callbacks, learning_rates)
    214                            evals=evals,
    215                            obj=obj, feval=feval,
--> 216                            xgb_model=xgb_model, callbacks=callbacks)
    217 
    218 

/usr/lib/python2.7/site-packages/xgboost/training.pyc in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks)
     72         # Skip the first update if it is a recovery step.
     73         if version % 2 == 0:
---> 74             bst.update(dtrain, i, obj)
     75             bst.save_rabit_checkpoint()
     76             version += 1

/usr/lib/python2.7/site-packages/xgboost/core.pyc in update(self, dtrain, iteration, fobj)
   1107         if fobj is None:
   1108             _check_call(_LIB.XGBoosterUpdateOneIter(self.handle, ctypes.c_int(iteration),
-> 1109                                                     dtrain.handle))
   1110         else:
   1111             pred = self.predict(dtrain)

/usr/lib/python2.7/site-packages/xgboost/core.pyc in _check_call(ret)
    174     """
    175     if ret != 0:
--> 176         raise XGBoostError(py_str(_LIB.XGBGetLastError()))
    177 
    178 

XGBoostError: [00:29:27] /workspace/include/xgboost/./tree_model.h:234: Check failed: nodes_[nodes_[rid].LeftChild() ].IsLeaf(): 
Stack trace:
  [bt] (0) /usr/xgboost/libxgboost.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x24) [0x7f0ae3aeecb4]
  [bt] (1) /usr/xgboost/libxgboost.so(xgboost::tree::TreePruner::DoPrune(xgboost::RegTree&)+0x4ce) [0x7f0ae3c441ae]
  [bt] (2) /usr/xgboost/libxgboost.so(xgboost::tree::TreePruner::Update(xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::DMatrix*, std::vector<xgboost::RegTree*, std::allocator<xgboost::RegTree*> > const&)+0x87) [0x7f0ae3c47a57]
  [bt] (3) /usr/xgboost/libxgboost.so(xgboost::gbm::GBTree::BoostNewTrees(xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::DMatrix*, int, std::vector<std::unique_ptr<xgboost::RegTree, std::default_delete<xgboost::RegTree> >, std::allocator<std::unique_ptr<xgboost::RegTree, std::default_delete<xgboost::RegTree> > > >*)+0xaeb) [0x7f0ae3b747fb]
  [bt] (4) /usr/xgboost/libxgboost.so(xgboost::gbm::GBTree::DoBoost(xgboost::DMatrix*, xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::ObjFunction*)+0xd65) [0x7f0ae3b75c95]
  [bt] (5) /usr/xgboost/libxgboost.so(xgboost::LearnerImpl::UpdateOneIter(int, xgboost::DMatrix*)+0x396) [0x7f0ae3b88556]
  [bt] (6) /usr/xgboost/libxgboost.so(XGBoosterUpdateOneIter+0x35) [0x7f0ae3aebaa5]
  [bt] (7) /lib64/libffi.so.6(ffi_call_unix64+0x4c) [0x7f0b611aadcc]
  [bt] (8) /lib64/libffi.so.6(ffi_call+0x1f5) [0x7f0b611aa6f5]

The reason I believe this is caused by pruning is that the first 3 trees are unpruned compared to the original trees, and if we refresh only the first 3 trees, everything works fine:

xgb_2 = xgb.train(params=refresh_dict,
                  dtrain=DM_1,
                  num_boost_round=3,
                  xgb_model=xgb_1)

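(For reference, I located the pruned trees by comparing the text dumps of the two models. A quick sketch of that check, assuming xgb_0 and xgb_1 from above:)

# Each entry of get_dump() is one tree's text dump, one node per line,
# so a tree whose dump got shorter lost nodes, i.e. it was pruned.
dump_before = xgb_0.get_dump()
dump_after = xgb_1.get_dump()
for i, (before, after) in enumerate(zip(dump_before, dump_after)):
    if after.count('\n') < before.count('\n'):
        print("tree %d was pruned: %d -> %d nodes"
              % (i, before.count('\n'), after.count('\n')))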
So far, this is the closest I can get to locating the source of the error. Any ideas?
