Standard outputs infos even when disabled [+ Usage question] #987

Closed

elDan101 opened this issue Oct 12, 2017 · 3 comments
elDan101 commented Oct 12, 2017

Sorry, I didn't realise I was on an older version (2.0.6). After updating to 2.0.7 the issue is resolved. I leave the original text below anyway.

Usage questions:

What does "Finished loading 120 models" mean? I get this output after incremental training.
Are these the individual trees (= models) in the boosted tree ensemble? The number increases every time I train incrementally (it never stays the same or decreases).

When training incrementally I get many warnings: [WARNING] No further splits with positive gain, best gain: -inf. Is there a "best practice" for dealing with this, so that these warnings can be avoided?
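For reference, a minimal sketch of what I assume such a "best practice" would look like: if the warning just means a tree ran out of splits with positive gain (e.g. num_leaves too large for the amount of data in one phase), then constraining tree growth and/or lowering the verbosity should reduce it. The parameter values below are purely illustrative.

import numpy as np
import lightgbm as lgb

# Small synthetic dataset standing in for one training phase.
X = np.random.rand(200, 5)
y = np.random.rand(200)
ds = lgb.Dataset(data=X, label=y)

# Assumption: the warning appears when a tree cannot find a split with
# positive gain, so smaller trees and a larger minimum leaf size should
# trigger it less often; verbose=-1 additionally hides the log lines.
params = {
    "learning_rate": 0.05,
    "num_leaves": 7,          # far fewer leaves than the default 31
    "min_data_in_leaf": 20,   # do not split leaves below this size
    "verbose": -1,
}

booster = lgb.train(params=params, train_set=ds, num_boost_round=100)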

Thanks a lot. I will leave the issue open for the usage questions and close it afterwards.

Obsolete issue:

Hi,

my target function is a time series. I split my dataset into multiple training/testing phases.

At each training phase I want to continue training from previous models (incremental learning).

Inside a training phase I do cross-validation with TimeSeriesSplit, testing multiple parameter settings; a simplified sketch of one tuning step follows.
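The sketch below is simplified, and the parameter grid, number of splits, and error metric are placeholders rather than my real settings.

import numpy as np
import lightgbm as lgb
from sklearn.model_selection import ParameterGrid, TimeSeriesSplit

def tune_phase(X, y, init_model=None):
    # Try every parameter combination with time-series cross-validation and
    # keep the one with the lowest mean RMSE. init_model is the booster from
    # the previous phase (None in phase 1).
    grid = ParameterGrid({"learning_rate": [0.01, 0.02], "num_leaves": [10, 20]})
    tscv = TimeSeriesSplit(n_splits=3)
    best_params, best_error = None, np.inf
    for params in grid:
        fold_errors = []
        for train_idx, test_idx in tscv.split(X):
            ds = lgb.Dataset(data=X[train_idx], label=y[train_idx])
            booster = lgb.train(params=params, train_set=ds,
                                num_boost_round=100, init_model=init_model)
            pred = booster.predict(X[test_idx])
            fold_errors.append(np.sqrt(np.mean((pred - y[test_idx]) ** 2)))
        if np.mean(fold_errors) < best_error:
            best_params, best_error = dict(params), np.mean(fold_errors)
    return best_params, best_error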

At the moment I am having trouble with the console output. Here is a sample:

Phase 1 refers to the first training phase (there is no existing model yet).


******************************************************
INFO: fitting LightGBM in phase 1

--------------------------------------------

INFO: Start parameter tuning

INFO: Handling level0
INFO: **improved cv_error** with params={'verbose': 0, 'learning_rate': 0.01, 'num_leaves': 10, 'max_bin': 255} || current best result: 8207.489200311997
INFO: **improved cv_error** with params={'verbose': 0, 'learning_rate': 0.01, 'num_leaves': 20, 'max_bin': 255} || current best result: 8090.180625524847
INFO: tested for params={'verbose': 0, 'learning_rate': 0.01, 'num_leaves': 30, 'max_bin': 255} || result: 8252.27902379209
INFO: tested for params={'verbose': 0, 'learning_rate': 0.02, 'num_leaves': 10, 'max_bin': 255} || result: 8195.467582308242
INFO: tested for params={'verbose': 0, 'learning_rate': 0.02, 'num_leaves': 20, 'max_bin': 255} || result: 8136.15878262733
[...]

But when I enter phase 2, I get the following:

******************************************************
INFO: fitting LightGBM in phase 2
--------------------------------------------
INFO: Start parameter tuning
INFO: Handling level0
[LightGBM] [Info] Finished loading 120 models
[LightGBM] [Info] Trained a tree with leaves=10 and max_depth=6
[LightGBM] [Info] Trained a tree with leaves=10 and max_depth=6
[LightGBM] [Info] Trained a tree with leaves=10 and max_depth=6
[LightGBM] [Info] Trained a tree with leaves=10 and max_depth=6
[... Many more of these outputs...]
[LightGBM] [Info] Trained a tree with leaves=20 and max_depth=9
[LightGBM] [Info] Trained a tree with leaves=20 and max_depth=7
[LightGBM] [Info] Trained a tree with leaves=20 and max_depth=7
[LightGBM] [Info] Trained a tree with leaves=20 and max_depth=11
[LightGBM] [Info] Trained a tree with leaves=20 and max_depth=8
INFO: tested for params={'verbose': 0, 'learning_rate': 0.01, 'num_leaves': 20, 'max_bin': 255} || result: 5529.753655247278

When I set init_model=None, the "Trained a tree with leaves=XX and max_depth=XX" logs in round 2 disappear, so I think this is a bug.

Environment info

Operating System: Windows 7 Professional Service Pack 1
CPU: Intel(R) Core(TM) i7-2640M CPU @ 2.8 GHz
Python version: 3.6.1, Anaconda 4.4.0, 64 bit
LightGBM: 2.0.6
Further note: I just tried the code below on Linux Mint as well, and I do not see the extra standard output there, so this may be Windows-specific.

Reproducible code:

import numpy as np
import lightgbm as lgb

# Random training data for two consecutive training phases.
X1 = np.random.rand(100, 5)
Y1 = np.random.rand(100)

X2 = np.random.rand(100, 5)
Y2 = np.random.rand(100)

# Random validation data used for early stopping in each phase.
vX1 = np.random.rand(50, 5)
vY1 = np.random.rand(50)
vX2 = np.random.rand(50, 5)
vY2 = np.random.rand(50)

ds1 = lgb.Dataset(data=X1, label=Y1, silent=True)
ds2 = lgb.Dataset(data=X2, label=Y2, silent=True)

vds1 = lgb.Dataset(data=vX1, label=vY1, silent=True)
vds2 = lgb.Dataset(data=vX2, label=vY2, silent=True)

params = {"learning_rate": 0.1, "num_leaves": 20}

def feval(pred, target):
    # Custom evaluation metric (RMSE); either argument may arrive as a Dataset.
    if isinstance(pred, lgb.Dataset):
        pred = pred.label

    if isinstance(target, lgb.Dataset):
        target = target.label

    diff = pred - target
    n_samples = pred.shape[0]
    rmse = np.sqrt(np.sum(diff * diff) / n_samples)
    return "rmse", rmse, False

# Phase 1: train from scratch (init_model=None).
m1 = lgb.train(params=params, train_set=ds1, num_boost_round=1000,
               early_stopping_rounds=30, valid_names=["early_stopping_valid"],
               valid_sets=[vds1], feval=feval, init_model=None, verbose_eval=False,
               keep_training_booster=True)

print("FINISHED training model 1")

# Phase 2: continue training from m1 (init_model=m1); this is where the
# extra [LightGBM] [Info] lines appear.
m2 = lgb.train(params=params, train_set=ds2, num_boost_round=1000,
               early_stopping_rounds=30, valid_names=["early_stopping_valid"],
               valid_sets=[vds2], feval=feval, init_model=m1, verbose_eval=False,
               keep_training_booster=True)

print("FINISHED training model 2")

Output:

[LightGBM] [Info] Total Bins 105
[LightGBM] [Info] Number of data: 100, number of used features: 5
FINISHED training model 1
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=4 and max_depth=3
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=3 and max_depth=2
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=3 and max_depth=2
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[...]
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=4 and max_depth=2
FINISHED training model 2
@guolinke
Collaborator

We reduced this output in the latest version; please try it.

@elDan101
Author

Yes, the output was reduced after updating (sorry, I didn't realise that I was on an old version).
But I still get the "Finished loading X models" message. Can you give me a hint what it means? I get it after each incremental training run, and the number seems to keep increasing.

@guolinke
Collaborator

@elDan101 it is output when continued training is used.
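In other words (a minimal sketch, assuming the 2.x Python API; the printed counts are what one would expect rather than verified output): every call that passes init_model first loads the trees of the previous booster, which is what the "Finished loading N models" line reports, and then appends the new boosting rounds on top, so the count grows with each phase.

import numpy as np
import lightgbm as lgb

X, y = np.random.rand(100, 5), np.random.rand(100)
ds = lgb.Dataset(data=X, label=y)
params = {"learning_rate": 0.1, "num_leaves": 20, "verbose": -1}

# Phase 1: train 40 trees from scratch.
m1 = lgb.train(params, ds, num_boost_round=40)
print(m1.num_trees())  # expected: 40

# Phase 2: continued training loads the existing 40 trees
# ("Finished loading 40 models") and appends up to 40 more.
m2 = lgb.train(params, ds, num_boost_round=40, init_model=m1)
print(m2.num_trees())  # expected: around 80 (loaded trees plus new rounds)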
