I'm using better-queue in production with a PostgreSQL store.
We had an issue in the service running the queue, and a significant amount of data piled up in the store (2,312,233 tasks).
This caused the application to crash about 1 minute after startup. The crash was completely silent: no `unhandledRejection` or `uncaughtException` was raised, and none of the signal traps fired either (`SIGTERM`, `SIGINT`, `SIGUSR2`). The machine's resources were not exhausted: the CPU was at 40% and memory consumption was very low.
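For context, the process-level hooks I'm referring to look roughly like this (an illustrative sketch, not my exact code); none of them fired before the process died:

```ts
// Illustrative sketch of the process-level hooks mentioned above.
// `logger` is the same logger instance used in the queue code below.
process.on('unhandledRejection', (reason) => {
  logger.error('Unhandled rejection', reason);
});

process.on('uncaughtException', (err) => {
  logger.error('Uncaught exception', err);
});

(['SIGTERM', 'SIGINT', 'SIGUSR2'] as NodeJS.Signals[]).forEach((signal) => {
  process.on(signal, () => {
    logger.warn(`Received ${signal}`);
  });
});
```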
I have logging set up at the beginning of the queue processor and on error events, as you can see in the following code:
```ts
const queueOptions = {
  batchSize: 250000,
  batchDelay: 3600000,
  concurrent: 1,
  maxRetries: Infinity,
  autoResume: false, // I tested with and without the `autoResume`
  retryDelay: 3600000 + 10,
  afterProcessDelay: 3600000,
  precondition: async (cb: any) => {
    try {
      const lock = await cacheInstance.getValue(LOCK_KEY);
      if (lock) {
        logger.info('Precondition failed, resources still locked');
        cb(null, false);
      } else {
        cb(null, true);
      }
    } catch (err) {
      logger.warn('Couldn\'t check the queue precondition', err);
      cb(err);
    }
  },
  preconditionRetryTimeout: 3600000,
};

if (config.env === 'production') {
  // @ts-ignore
  queueOptions.store = {
    type: 'sql',
    dialect: 'postgres',
    host: process.env.DATABASE_HOST,
    port: process.env.DATABASE_PORT,
    username: process.env.DATABASE_USERNAME,
    password: process.env.DATABASE_PASSWORD,
    dbname: process.env.DATABASE_NAME,
    tableName: 'my_queue_store', // The table will be created automatically.
  };
}

const myQueue = new Queue(async (payload: any, cb: any) => {
  try {
    const lock = await cacheInstance.lock(LOCK_KEY, 3600000);
    // await doTheProcessing() and release the lock.
    cb(null, 'queue_processed');
  } catch (err) {
    // Release the lock.
    cb(err);
  }
}, queueOptions);

// Queue logs
myQueue.on('batch_failed', (error: string) => {
  logger.warn(`Failed to process the queue`, error);
});
myQueue.on('batch_finish', () => {
  logger.info(`Processed the queue batch`);
});
```
I also have the following logging when I push data to the queue:
```ts
myQueue.push(payload)
  .on('finish', (result) => {
    logger.verbose(`Pushed an event to the queue`, result);
  })
  .on('failed', (err) => {
    logger.warn(`Failed to push an event to queue`, err);
  });
```
The absence of logs made it very hard to find the issue. I only discovered it after disabling the SQL store and falling back to the default memory store, at which point the app stopped crashing.
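For reference, the debugging change amounted to dropping the `store` option so the queue uses its default in-memory store (a sketch, reusing `queueOptions` from the snippet above):

```ts
// Debugging sketch: remove the SQL store so better-queue falls back to its
// default in-memory store. With this variant the app no longer crashed.
// `queueOptions` is the object defined in the snippet above.
const { store: _ignoredStore, ...memoryOnlyOptions } = queueOptions as any;

const debugQueue = new Queue(async (payload: any, cb: any) => {
  // Same processing logic as the production queue.
  cb(null, 'queue_processed');
}, memoryOnlyOptions);
```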
My only solution, for now, was to back up the `my_queue_store` table and truncate it.
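For completeness, this is roughly what the backup/truncate amounted to (a hypothetical sketch using `pg` directly; the backup table and function names are mine, and in practice I ran the equivalent SQL manually):

```ts
import { Client } from 'pg';

// Hypothetical recovery sketch: copy the piled-up tasks into a backup table,
// then empty the store table so the queue starts from a clean state.
async function backupAndTruncateStore(): Promise<void> {
  const client = new Client({
    host: process.env.DATABASE_HOST,
    port: Number(process.env.DATABASE_PORT),
    user: process.env.DATABASE_USERNAME,
    password: process.env.DATABASE_PASSWORD,
    database: process.env.DATABASE_NAME,
  });
  await client.connect();
  try {
    await client.query('CREATE TABLE my_queue_store_backup AS TABLE my_queue_store');
    await client.query('TRUNCATE TABLE my_queue_store');
  } finally {
    await client.end();
  }
}
```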
My tech stack is the following:
- OS: 64bit Amazon Linux 2/5.2.1 running in EBS
- Node version: Node.js 12
- better-queue: 3.8.2
- better-queue-sql: 1.0.3
How can this be avoided?
How can I improve logging for a similar situation?
Thank you 💗