I have hit this issue many times. The application sometimes appears to hang for more than 10 hours, while a normal run takes only 1 hour to succeed.
The issue occurs with both a small dataset (100w+ samples, i.e. more than 1 million) and a big dataset (1000w+ samples, i.e. more than 10 million).
Compared with the logs of a successful application, the logs of the stuck application lack these lines: "24/09/02 12:04:09 INFO MemoryStore: Block rdd_43_0 stored as values in memory (estimated size 984.0 B, free 7.8 GB)
24/09/02 12:04:11 INFO Executor: 1 block locks were not released by TID = 3005:
[rdd_43_0]".
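The log comparison above can be automated. The following is a minimal sketch, assuming the executor logs of a successful run and a stuck run are available as plain text. The helper names `summarize` and `missing_blocks` are hypothetical, and the patterns are modeled only on the two log lines quoted above, so other Spark versions may word them differently.

```python
import re

# Patterns modeled on the two log lines quoted in this issue (an assumption:
# other Spark versions may format these messages differently).
STORED = re.compile(r"INFO MemoryStore: Block (rdd_\d+_\d+) stored as values in memory")
LOCKS = re.compile(r"INFO Executor: \d+ block locks were not released by TID = (\d+)")


def summarize(log_text):
    """Collect cached RDD block IDs and task IDs that left block locks unreleased."""
    return {
        "stored_blocks": STORED.findall(log_text),
        "unreleased_lock_tids": [int(t) for t in LOCKS.findall(log_text)],
    }


def missing_blocks(ok_log, stuck_log):
    """Blocks stored in the successful run that never appear in the stuck run."""
    ok = set(summarize(ok_log)["stored_blocks"])
    stuck = set(summarize(stuck_log)["stored_blocks"])
    return sorted(ok - stuck)
```

Calling `missing_blocks(ok_text, stuck_text)` on the two log files lists the block IDs that the stuck run never stored, which narrows down the partition or task where it stalled.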
XGBoostSpark: Running XGBoost 1.0.0 with parameters:
alpha -> 0.0
min_child_weight -> 300.0
sample_type -> uniform
base_score -> 0.5
weight_col ->
rabit_timeout -> -1
colsample_bylevel -> 1.0
grow_policy -> depthwise
skip_drop -> 0.0
lambda_bias -> 0.0
silent -> 0
scale_pos_weight -> 1.0
seed -> 0
cache_training_set -> false
features_col -> features
num_early_stopping_rounds -> 0
label_col -> label
num_workers -> 200
subsample -> 1.0
lambda -> 1.0
max_depth -> 6
probability_col -> probability
raw_prediction_col -> rawPrediction
tree_limit -> 0
custom_eval -> null
dmlc_worker_connect_retry -> 5
rate_drop -> 0.0
max_bin -> 16
train_test_ratio -> 1.0
use_external_memory -> false
objective -> binary:logistic
eval_metric -> auc
num_round -> 200
timeout_request_workers -> 1800000
missing -> 0.0
rabit_ring_reduce_threshold -> 32768
checkpoint_path ->
tracker_conf -> TrackerConf(0,python)
tree_method -> hist
max_delta_step -> 0.0
eta -> 0.15
verbosity -> 1
colsample_bytree -> 1.0
normalize_type -> tree
allow_non_zero_for_missing -> false
custom_obj -> null
gamma -> 0.0
sketch_eps -> 0.03
nthread -> 1
prediction_col -> prediction
checkpoint_interval -> -1
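To rule out configuration differences between a successful and a stuck application, the parameter dump above can be parsed and diffed. This is a rough sketch, assuming each run's dump has been copied out of the driver log in the same `key -> value` form. `parse_params` and `diff_params` are hypothetical helper names, not part of XGBoost.

```python
def parse_params(dump):
    """Parse 'key -> value' lines into a dict; lines without '->' are skipped."""
    params = {}
    for line in dump.splitlines():
        key, sep, value = line.partition("->")
        if sep:  # only lines that actually contain the arrow
            params[key.strip()] = value.strip()  # empty values become ''
    return params


def diff_params(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

An empty result from `diff_params(parse_params(ok_dump), parse_params(stuck_dump))` would suggest the hang is not caused by a parameter change between the two runs.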