-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Winston 3.0.0 still halts logging in high stress. Makes it unusable for us. #1364
Comments
Using while(true) in this example is going to starve the event loop and prevent asynchronous code (writing to your log file) from running. A modified version of this that is more node friendly could look something like this:
|
@mempf is right, if you block the event loop like with while(true) then any async node code (not winston-specific) will eventually hang. Does his^ example above work fine for you? |
@mempf We had the same problem within our project. We use "bottleneck" to prevent the problem. In our case, we sometimes log a bunch of entries at once and winston hung up in the past. We use bottleneck to "queue" calls of the log function. This fixed our problem. Our LoggerService looks like this (typescript): import * as winston from 'winston';
import * as bottleneck from 'bottleneck';
const { printf, combine, colorize } = winston.format;
const logLevels = {
levels: {
error: 0,
warn: 1,
info: 2,
http: 3,
sql: 4,
debug: 5
},
colors: {
error: 'red',
warn: 'yellow',
info: 'green',
http: 'blue',
sql: 'blue',
debug: 'gray'
}
};
winston.addColors(logLevels);
export const jsonFormatter = (logEntry: any) => {
const base = { timestamp: new Date() };
const json = Object.assign(base, logEntry);
const today = new Date();
const day = today.toISOString().split('T')[0];
const hours = today.getUTCHours();
const minutes = today.getUTCMinutes();
logEntry['message'] = `[${day} ${hours}:${minutes} UTC]: ${json.message}`;
return logEntry;
};
/**
* Logger Service. Handles logging with winston. Service handles logging time intervals with bottleneck.
* Bottleneck, because winston will hang up under heavy load.
*/
export const LoggerService = {
/** "Bottleneck" Limiter for Logger */
bottleneck: new bottleneck.default({
maxConcurrent: 1,
minTime: 5
}),
/** Winston logger. See https://github.com/winstonjs/winston#creating-your-own-logger for more details */
logger: winston.createLogger({
level: 'info',
format: combine(
colorize(),
winston.format(jsonFormatter)(),
printf((info: any) => `${info.level}: ${info.message}`)
),
transports: [
new winston.transports.File({ filename: 'error.log', level: 'error' }),
new winston.transports.File({ filename: 'combined.log' })
],
exceptionHandlers: [
new winston.transports.File({ filename: 'exceptions.log' })
]
}),
/** Returns winston logger */
getLogger() {
return LoggerService.logger;
},
/**
* Logs an event with winston
* @param param0.level Log level. e.g. warn, info, error
* @param param0.message Log message
*/
log({ level, message } : {level: string, message: string }) {
LoggerService.bottleneck.schedule({}, () => {
return Promise.resolve(LoggerService.getLogger().log({ level, message }));
});
}
};
LoggerService.logger.add(new winston.transports.Console({
format: winston.format.simple()
})); |
We have the same problem, winston stops outputting after receiving a lot of input. I'm not familiar enough with node to implement the above example into my project. So any help would be appreciated. |
This is our code:
When we try to log anything, it causes a delay of multiple seconds which is unacceptable. Any thoughts? |
It is interesting, because even if we remove all transports, this still causes a huge spike in CPU (90% vs 4%) and slows down our code substantially (less than a second to run a method vs over 10). |
I am experiencing the same behaviour and I think it makes no sense to work around it. |
@kollerebbe - Thanks for your tip on bottleneck. We've been testing it for a while now and it more or less solved all our problems. And using the actual working logging we were also able to fix a memory leak.. 👍 For me that solves the problem, which means I could close this issue. I've got the feeling that more people in here have the same issue though. @the other people in here (@samothx , @tonestrike @gnorbsl ) did you guys also try the bottleneck code? If not, try it out and post here if that solves your problems.. or not |
We have not tried the bottleneck code. A lot of our issues were based around the size of the logs. If a particular log had a large JSON, Winston would spike our CPU to 90%+. However, if we stringified the JSON prior to passing it into Winston, we do not see such a huge spike. If we run into further issues, we will check out the bottleneck code above. |
Same here... |
@kramer65 would you know what was bringing down your node instance? Memory use? CPU? We're experiencing a memory leak with Winston 3.0.0. Unfortunately we cannot reproduce it. It just happens every now and then (see #1445). Therefore we cannot easily test whether Bottleneck would help us. Would be great if some of you could confirm it's helping with memory leaks, too. Thanks! |
Seeing the same issue in our application. Can't reproduce yet but do see the memory climb up and the CPU go to max right around the time when it stops logging. |
@mherger - We experienced high CPU (~100%) and when the CPU spiked, the memory would rise as well. If at some point the CPU lowered to less than 100% (due to temporary less logging demands), Winston would use that breathing time to store all the logs from the queue and the memory would also lower to the normal level. If the CPU spiked for too long however, the memory would rise until it would reach 100% and then node would stop because of a memory error. Bottleneck solved it for us because the logging would not interfere with the normal running of the program. This did have the effect that in case of high CPU, the logs would be written way after they were initiated. I tested this by writing the moment the log was initiated as well as the moment the log was written. At some points I saw a half an hour delay between the moment the log was initiated and when it was written. Since our logs are normally used for analysis at a later point that is not a problem for us. We've decided to rewrite our whole application into golang though, since we simply need higher performance. I hope this helps you to find the problem in your application. Let me know if I can be of any more help. |
Another observation that might be related: Logging in long running processes eventually stops We use winston together with express-winston. Some services run for days without restarting and we're now observing that there are no logs anymore after about 4-5 days or so even without a high load. |
This is still an issue with 3.1.0. I can actively reproduce it with apache bench after attempting to write just 10 log messages to a file. |
Hello All , I am facing similar issue . Can anyone help if there is any workaround or Is there any fix given for this in 3.2.0 |
I am still facing this issue with 3.2.1 #1624 |
Same issue for me... winston 3.2.1 and node 10.15.1 |
Same issue with winston 3.2.1, node 8.16 |
Any updates on this? |
@avik-so I would recommend switching to pino (https://github.com/pinojs/pino) - we are no longer experiencing this problem after migration. |
Happened to my app yesterday as well with 3.2.1. I had to restart my server to get back the logs. |
Actively facing the issue.Please suggest fixes. |
It seems that if a transport causes an error (i.e. it passes an error to the "log" callback function), it is removed from the logger! const winston = require("winston")
describe("transports issue", () => {
let logger
let errorMessage
let counter
let maxCounter
let transport
let logMock
let logError
beforeEach(() => {
counter = 0
maxCounter = 1
jest.clearAllMocks()
logError = undefined
logger = winston.createLogger({
level: "info",
transports: [transport]
// transports: [loggingWinston]
})
logger.on("error", error => {
logError = error
})
})
beforeAll(() => {
errorMessage = "Error logging!"
logMock = jest.fn((info, next) => {
if (counter === maxCounter) {
next(new Error(errorMessage))
} else {
counter = counter + 1
}
next()
})
transport = Object.assign(new winston.transports.Console(), {
log: logMock
})
})
describe("only log once", () => {
beforeEach(() => {
logger.info("log once")
})
test("pipes is transport", () => {
expect(logger.transports).toEqual([transport])
})
test("error didn't", () => {
expect(logError).toBeUndefined()
})
})
describe("log twice", () => {
beforeEach(() => {
logger.info("log twice 1")
logger.info("log twice 2") // this returns an error for the transport
})
// THIS TEST FAILS
test("pipes is transport", () => {
expect(logger.transports).toEqual([transport]) //empty array received
})
test("error occured", () => {
expect(logError).toHaveProperty("message", errorMessage)
})
})
}) |
I believe I have found a work around and it's a lot simpler then you might think. It turns out that "readable-stream" automatically "unpipe"s the writable stream on any transport error. Workaround: const winston = require("winston")
const transport = newErrorProneTransport() // FileTransport is only one, there are many
const logger = winston.createLogger({
level: "info",
transports: [transport]
})
logger.on("error", error => {
//if your JS version doesn't have "includes(transport)", use another method like `.some(t => t===transport)`
if (!logger.transports.includes(transport)) {
logger.add(transport)
}
}) Here is the test now correctly passing const winston = require("winston")
describe("transports issue", () => {
const mainError = "Error logging!"
const otherError = "Other error"
let logger
let errorMessage
let counter
let maxCounter
let logError
let transport
const newTransport = () =>
Object.assign(new winston.transports.Console(), {
log: (info, next) => {
if (counter === maxCounter) {
next(new Error(errorMessage))
} else {
if (logError !== undefined) {
errorMessage = otherError
}
counter = counter + 1
next()
}
}
})
beforeEach(() => {
errorMessage = mainError
counter = 0
maxCounter = 1
logError = undefined
transport = newTransport()
logger = winston.createLogger({
level: "info",
transports: [transport]
})
logger.on("error", error => {
if (!logger.transports.includes(transport)) {
logger.add(transport)
counter = 0
}
logError = error
})
})
describe("only log once", () => {
beforeEach(() => {
logger.info("log once")
})
test("pipes is transport", () => {
expect(logger.transports).toEqual([transport])
})
test("error didn't", () => {
expect(logError).toBeUndefined()
})
})
describe("log twice", () => {
beforeEach(() => {
logger.info("log twice 1")
logger.info("log twice 2") // this raises the `mainError` for the transport
})
test("pipes is transport", () => {
expect(logger.transports).toEqual([transport])
})
test("error occurred", () => {
expect(logError).toHaveProperty("message", mainError)
})
})
describe("log thrice", () => {
beforeEach(() => {
logger.info("log thrice 1")
logger.info("log thrice 2") // this raises the `mainError` for the transport
logger.info("log thrice 3")
})
test("pipes is transport", () => {
expect(logger.transports).toEqual([transport])
})
test("error occurred", () => {
expect(logError).toHaveProperty("message", mainError)
})
})
describe("log four times", () => {
beforeEach(() => {
logger.info("log four times 1")
logger.info("log four times 2") // this raises the `mainError` for the transport
logger.info("log four times 3")
logger.info("log four times 4") // this raises the `otherError` for the transport
})
test("pipes is transport", () => {
expect(logger.transports).toEqual([transport])
})
test("other error occurred", () => {
expect(logError).toHaveProperty("message", otherError)
})
})
}) |
@yinzara Can you make a PR for this fix and reference this issue in it? |
I'm not 100% sure this is something that's just a PR kind of fix unless you want to call it a "breaking change". Currently if a transport error is thrown during the logging process and you have not added a "error" listener to the log, you would get a "on unhandled promise rejection". If I "fix" this, you would get no warning or error of any sort when a transport throws an error (you wouldn't even get a log or console message about it) however at least the log system would keep functioning. If you had added an error listener to the log, you would still get the same behavior however it's unlikely anyone was doing this as it was not suggested in the readme. Maybe adding a configuration variable to the logger options that says "autoHandleTransportErrors" that gets set to false now (or maybe true with documentation about the breaking change)? Thoughts by others? |
My two cents is that if the logger stops logging then we have a problem that needs to be fixed. Could we re-add the transport and then log that we just encountered an error? Perhaps something like: logger.on("error", error => {
//if your JS version doesn't have "includes(transport)", use another method like `.some(t => t===transport)`
if (!logger.transports.includes(transport)) {
logger.add(transport)
logger.log("error", error)
}
}) |
Yes but that's just another version of a breaking change (i.e. it changes the functionality of the existing library) and I would surmise others would want that as a config option. Additionally, you may have multiple transports in which some you may want to keep the original functionality of removing the transport on error (I'm not exactly sure why you would, but it would be a change from current functionality). Maybe this "fix" for this is to provide this option and either default it to true for now, or release a minor version like 3.3 with this option disabled and 4.0 with the option enabled? Opinions from others? The "breaking change" is really only a change in undocumented behavior so I'd feel ok with making it true and making a minor revision change. |
@yinzara - is this fixed yet? I may be running into the issue as well but I'm not sure |
I even added the |
Did you try the small snippet of code in the referenced PR? |
Which snippet? I added a hook with logger.on('error', (error) => {
// stuff
console.log('BlaBlaBla');
) But it's not called! |
Change your dependency version for winston to: |
I'll check if that fixes it for me! |
That version didn't fix it for me - so maybe it's something else which I don't see... |
I have seen some examples out there that stringify the message before sending it, are there any instructions anywhere about doing this or this this something that people just found out the hard way? |
Hy guys. Been looking into solutions and makes sense that i'm having this kind of issues. 4months ago the PR was approved, but turns out that any NPM version was published with this updated, turns out that this code isn't live. The package.json as example was last edited I gonna try today the fix mentioned here, about reinserting the transport onError but i think this could be published so everyone could have their logs working as expect. |
Still facing this issue. Anyone can help? |
…instonjs#1714) * Fix Issue where winston removes transport on error (winstonjs#1364) * Fix lint error to cause rebuild
any updates on this? |
I am using 3.3.3 and I still see the issue. My production logs are not getting displayed. Is someone working on it ? |
@aniketkalamkar Not really. In 2020 pino is definitely a much better choice as a logger, and it doesn't have this problem. |
Is this issue being investigated ? This keeps happening a lot when a sudden flood of logs goes to winston. |
The same issue I have raised 1.5 year ago. I don't think someone is looking into it. People might have some other priority task in hand. |
I migrated to PinoJS and it's way better option. Sorry about the maintainers, i dont like to make this kind of posts but i've invested way time trying to make this work and had several issues, thats why im mentioning to other devs. |
Another commenting asking if this issue is being investigated, why the patched version hasn't been published via npm (has it?), and if moving to pino is the only solution? |
We had this issue. Replaced |
I also faced similar issue. When logging too much to a file with Winston, it halts. I tried to dig a little deeper to find out the issue and how Pino is handling high volume log. Based on @kramer65 code I experimented some things // winston-too-much-log.js
const winston = require('winston')
const fs = require('fs')
const myFormat = winston.format.printf(info => {
return `${info.timestamp} ${info.level}: ${info.message}`
})
const logger = winston.createLogger({
level: 'info',
format: winston.format.timestamp(),
transports: [
new winston.transports.File({
filename: './some.log',
format: winston.format.json()
})
]
})
function mylog () {
let random_string = Math.random().toString(36).substring(7)
logger.info(random_string)
console.log(random_string)
}
for (let i =0; i < 1000000; i++) {
mylog()
} As known, in the above code Winston stops writing to file for some time. The reason explained by @samothx seems correct.It's due to stream buffer. I tried to write to a file without Winston and it also halts some time. // write-file-too-much.js
const fs = require('fs')
var util = require('util')
var stream = fs.createWriteStream('./some.log', {flags: 'a'})
function mylog () {
let random_string = Math.random().toString(36).substring(7)
stream.write('file: ' + random_string + '\n')
console.log(random_string)
}
for (let i =0; i < 1000000; i++) {
mylog()
} So the solution suggested by @kollerebbe using "bottleneck" will work very well. I used p-queue and it worked very well in both of my previous examples. // winston-too-much-log-with-queue.js
const {default: PQueue} = require('p-queue')
const winston = require('winston')
const fs = require('fs')
const queue = new PQueue({
concurrency: 1,
intervalCap: 1,
interval: 10
})
const myFormat = winston.format.printf(info => {
return `${info.timestamp} ${info.level}: ${info.message}`
})
const logger = winston.createLogger({
level: 'info',
format: winston.format.timestamp(),
transports: [
new winston.transports.File({
filename: './some.log',
format: winston.format.json()
})
]
})
function mylog () {
let random_string = Math.random().toString(36).substring(7)
logger.info(random_string)
console.log(random_string)
return Promise.resolve(true)
}
for (let i =0; i < 100000; i++) {
queue.add(mylog)
} // write-file-too-much-with-queue.js
const {default: PQueue} = require('p-queue')
const fs = require('fs')
var util = require('util')
var stream = fs.createWriteStream('./some.log', {flags: 'a'})
const queue = new PQueue({
concurrency: 1,
intervalCap: 1,
interval: 1
})
function mylog () {
let random_string = Math.random().toString(36).substring(7)
stream.write('file: ' + random_string + '\n')
console.log(random_string)
return Promise.resolve()
}
for (let i =0; i < 1000000; i++) {
queue.add(mylog)
} So now how is Pino is handling this This is from Readme
Pino is using Pino-Tee in a separate process to write the logs to files. So logging high volume data in same app process is not suggestible. So while using Winston if we do not log to file then there would not be any issue. With Winston I am logging to console and with PM2 I am writing to files. I used PM2 for logs channelizing to files, as I am already using PM2 to manage my services/apps. |
Please tell us about your environment:
[email protected]
node -v
outputs:v8.10.0
Ubuntu 18.04
ES6
This issue has been closed, but running the following code (which I also posted here) still stops Winston logs from working within a second or so:
The text was updated successfully, but these errors were encountered: