This started 125 instances. Unfortunately, the job crashed after 2 hours but kept running for another ~7 hours before quitting.
The logs are quite useless:
controller
2023-08-28T01:27:49.231Z INFO Ensure step 1 jar file command-runner.jar
2023-08-28T01:27:49.232Z INFO StepRunner: Created Runner for step 1
INFO startExec 'hadoop jar /var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar spark-submit --class ldbc.snb.datagen.LdbcDatagen s3://ldbc-snb-datagen-bi-2021-07/jars/ldbc_snb_datagen_2.12_spark3.2-0.5.1+16-d6bfc51f-jar-with-dependencies.jar --output-dir /ldbc_snb_datagen/build --scale-factor 30000 --num-threads 3000 --mode bi --format csv --explode-edges --format-options compression=gzip --generate-factors'
INFO Environment:
PATH=/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/aws/puppet/bin/
SECURITY_PROPERTIES=/emr/instance-controller/lib/security.properties
HISTCONTROL=ignoredups
HISTSIZE=1000
HADOOP_ROOT_LOGGER=INFO,DRFA
JAVA_HOME=/etc/alternatives/jre
AWS_DEFAULT_REGION=us-east-2
LANG=en_US.UTF-8
MAIL=/var/spool/mail/hadoop
LOGNAME=hadoop
PWD=/
HADOOP_CLIENT_OPTS=-Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/s-090045419NSP6RN3ALBQ/tmp
_=/etc/alternatives/jre/bin/java
LESSOPEN=||/usr/bin/lesspipe.sh %s
SHELL=/bin/bash
QTINC=/usr/lib64/qt-3.3/include
USER=hadoop
HADOOP_LOGFILE=syslog
HOSTNAME=ip-172-31-27-148
QTDIR=/usr/lib64/qt-3.3
HADOOP_LOG_DIR=/mnt/var/log/hadoop/steps/s-090045419NSP6RN3ALBQ
EMR_STEP_ID=s-090045419NSP6RN3ALBQ
QTLIB=/usr/lib64/qt-3.3/lib
HOME=/home/hadoop
SHLVL=1
HADOOP_IDENT_STRING=hadoop
INFO redirectOutput to /mnt/var/log/hadoop/steps/s-090045419NSP6RN3ALBQ/stdout
INFO redirectError to /mnt/var/log/hadoop/steps/s-090045419NSP6RN3ALBQ/stderr
INFO Working dir /mnt/var/lib/hadoop/steps/s-090045419NSP6RN3ALBQ
INFO ProcessRunner started child process 19181
2023-08-28T01:27:49.234Z INFO HadoopJarStepRunner.Runner: startRun() called for s-090045419NSP6RN3ALBQ Child Pid: 19181
INFO Synchronously wait child process to complete : hadoop jar /var/lib/aws/emr/step-runner/hadoop-...
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO Process 19181 still running
INFO waitProcessCompletion ended with exit code 1 : hadoop jar /var/lib/aws/emr/step-runner/hadoop-...
INFO total process run time: 31543 seconds
2023-08-28T10:13:33.044Z INFO Step created jobs:
2023-08-28T10:13:33.044Z WARN Step failed with exitCode 1 and took 31543 seconds
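As a sanity check on the timeline, the 31543 seconds reported by the controller can be compared with the first and last timestamps in the log. The following is a minimal sketch, not part of the original run; the constant and function names are made up for illustration, and the values are copied from the controller log above.

```python
from datetime import datetime, timedelta

# Values copied from the controller log above.
START = "2023-08-28T01:27:49.234Z"  # startRun() called
END = "2023-08-28T10:13:33.044Z"    # "Step failed with exitCode 1"
REPORTED_SECONDS = 31543            # "total process run time"


def parse_ts(ts: str) -> datetime:
    """Parse a controller-log timestamp; the trailing 'Z' (UTC) is dropped."""
    return datetime.strptime(ts.rstrip("Z"), "%Y-%m-%dT%H:%M:%S.%f")


elapsed = parse_ts(END) - parse_ts(START)
reported = timedelta(seconds=REPORTED_SECONDS)

print("from timestamps:", elapsed)
print("reported:       ", reported)
```

Both values come out at roughly 8 hours 46 minutes, which matches the "2 hours, then another ~7 hours" description: the step held the cluster for the full duration even though the job had effectively failed early.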
stdout
Reading scale factors..
Available scale factor configuration set 0.003
Available scale factor configuration set 0.1
Available scale factor configuration set 0.3
Available scale factor configuration set 1
Available scale factor configuration set 3
Available scale factor configuration set 10
Available scale factor configuration set 30
Available scale factor configuration set 100
Available scale factor configuration set 300
Available scale factor configuration set 1000
Available scale factor configuration set 3000
Available scale factor configuration set 10000
Available scale factor configuration set 30000
Number of scale factors read 13
Applied configuration of scale factor 30000
... Num Persons 77000000
... Start Year 2010
... Num Years 3
Done ... 49558 surnames were extracted
Done ... 42970 given names were extracted
I ran the generator with the following configuration:
stderr
See https://gist.github.com/szarnyasg/28673ecdda0325b59295a3fc8c70cc14