Replies: 6 comments 4 replies
-
Thanks for taking the time to write up your findings. My thoughts:
I'm not sure about the other two questions you asked, but those are my initial thoughts.
-
This is something I'm hoping to eliminate with the new data models. When the data model was all JSON objects and the project was very small, it was easier to put the media progress on the user. Now that the data model is relational, we don't want to have to pull media progresses every time we load a user.
This is a good question that I hadn't thought of. We don't need to load the user in a lot of cases if we store the user's account type in the serialized user data. In that case we would create a new API token when changing the user's account type.
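To illustrate the idea, here is a minimal sketch. The function names and the shape of the serialized data are assumptions for illustration, not the actual Auth.js code: the point is that keeping the account type next to the id lets permission checks read it without a database query, at the cost of having to issue a new token when the type changes.

```js
// Sketch: store the account type alongside the id in the serialized user
// data. Names and shapes here are illustrative, not the actual Auth.js API.
function serializeUserData(user) {
  // id for DB lookups, type for fast permission checks without a DB query
  return JSON.stringify({ id: user.id, type: user.type })
}

function parseUserData(serialized) {
  return JSON.parse(serialized)
}

// If the account type changes, the stored data is stale, so a new
// API token/session would have to be issued at that point.
const serialized = serializeUserData({ id: 'u1', type: 'admin', username: 'root' })
const info = parseUserData(serialized)
```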
I'm not sure why they have it in their example. https://www.passportjs.org/concepts/authentication/sessions/
I've seen some things about improving concurrency for node sqlite3 but haven't done any testing. It will be good to look into that, even though it looks like we can make some improvements that would reduce how many concurrent queries we make.
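One way to make fewer concurrent queries is to coalesce simultaneous lookups of the same user. This is a sketch of that idea only; `fetchUser` below is a stand-in for `Database.userModel.getUserById`, and nothing here is from the actual codebase.

```js
// Sketch: coalesce concurrent lookups of the same user id, so N simultaneous
// requests trigger a single query. fetchUser is a stand-in for the real
// Database.userModel.getUserById.
const inFlight = new Map()

function dedupedGetUserById(userId, fetchUser) {
  if (inFlight.has(userId)) return inFlight.get(userId)
  const promise = fetchUser(userId).finally(() => inFlight.delete(userId))
  inFlight.set(userId, promise)
  return promise
}

let queryCount = 0
const fakeFetch = (id) => { queryCount += 1; return Promise.resolve({ id }) }
const p1 = dedupedGetUserById('u1', fakeFetch)
const p2 = dedupedGetUserById('u1', fakeFetch)
// p1 and p2 are the same promise; only one query was issued
```

Whether this helps depends on how often concurrent requests really do hit the same user, which the logs above suggest is common for cover bursts.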
-
Performance analysis, day 2

process.nextTick removal

I removed the `process.nextTick` call from `deserializeUser`, which now looks like this (with the timing instrumentation still in place):

```js
let liveRequests = 0
passport.deserializeUser(
  function (req, user, cb) {
    liveRequests += 1
    BufferedLogger.log(`[${req.path}] start. pending: ${liveRequests}`)
    const start = performance.now()
    const parsedUserInfo = JSON.parse(user)
    // load the user by ID that is stored in the session
    Database.userModel.getUserById(parsedUserInfo.id).then((dbUser) => {
      const end = performance.now()
      const timeInMs = end - start
      liveRequests -= 1
      BufferedLogger.log(`[${req.path}] end. pending: ${liveRequests}. time: ${timeInMs.toFixed(3)}`)
      return cb(null, dbUser)
    })
  }.bind(this)
)
passport.passReqToCallback = true
}
```

This had no visible effect on the timing and bottleneck. Bottom line: I don't think using `process.nextTick` was the source of the bottleneck.

mediaProgress exclusion

I then tried changing `getUserById` to stop eagerly including media progress:

```js
static async getUserById(userId) {
  if (!userId) return null
  return this.findByPk(userId)
  /*
  return this.findByPk(userId, {
    include: this.sequelize.models.mediaProgress
  })
  */
}
```

This had a massive impact on the deserializeUser timing. These are the log messages:

Although we still see concurrency issues, the bottom line is that there's absolutely no justification to populate media progress in `req.user` for any request. If an API call requires the user's media progress, it can perform a separate query for it.
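That separate query could look roughly like the following sketch. The model reference follows the `this.sequelize.models.mediaProgress` line in the commented-out code above, and `findAll` is the standard Sequelize query method, but the helper name and the route usage are assumptions of mine, not actual server code.

```js
// Sketch: routes that need media progress fetch it explicitly, instead of it
// being eagerly included with every deserialized user. The model is passed in
// so the helper is easy to test with a stub.
async function getMediaProgressForUser(mediaProgressModel, userId) {
  return mediaProgressModel.findAll({ where: { userId } })
}

// Hypothetical usage inside an API handler:
//   const progresses = await getMediaProgressForUser(
//     Database.sequelize.models.mediaProgress, req.user.id)
```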
-
Look into 11ty's image plugin. Zach left nothing on the table when it comes to image optimization; see the blog post about it.
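For reference, a minimal sketch of wiring that plugin into an Eleventy config, assuming the `@11ty/eleventy-img` package; the widths, formats, and output path below are placeholder values, not a recommendation:

```js
// .eleventy.js (sketch, assuming @11ty/eleventy-img is installed)
const Image = require("@11ty/eleventy-img")

async function imageShortcode(src, alt) {
  // generate responsive variants once at build time
  const metadata = await Image(src, {
    widths: [300, 600, 900],
    formats: ["avif", "webp", "jpeg"],
    outputDir: "./_site/img/"
  })
  // emit a <picture> element with srcset/sizes filled in
  return Image.generateHTML(metadata, {
    alt,
    sizes: "100vw",
    loading: "lazy",
    decoding: "async"
  })
}

module.exports = function (eleventyConfig) {
  eleventyConfig.addAsyncShortcode("image", imageShortcode)
}
```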
-
More experiments

No req.user for cover requests

In the next experiment, if the request is a cover request, we don't perform user deserialization at all, and `req.user` is set to null. This means access checks that rely on `req.user`, like this one, have to be bypassed for cover requests:

```js
if (!req.user.checkCanAccessLibraryItem(libraryItem)) {
  return res.sendStatus(403)
}
```

This of course reduces …

After some further digging, I found the next choke point to be in `jwtAuthCheck`:

```js
async jwtAuthCheck(jwt_payload, done) {
  // load user by id from the jwt token
  const user = await Database.userModel.getUserByIdOrOldId(jwt_payload.userId)
  if (!user?.isActive) {
    // deny login
    done(null, null)
    return
  }
  // approve login
  done(null, user)
  return
}
```

No authentication for cover requests

So for the next experiment, I disabled authentication for cover requests by modifying `isAuthenticated`:

```js
isAuthenticated(req, res, next) {
  // check if session cookie says that we are authenticated
  if (req.isAuthenticated() || req.path.includes('/cover')) {
    next()
  } else {
    // try JWT to authenticate
    passport.authenticate('jwt')(req, res, next)
  }
}
```

Now the total serving time went down to ~60ms on average, and finally most of that time is spent in …

Using …
-
Conclusions

I'm really happy to have gone through these experiments. They've taught me a lot about the server's behavior and its hidden costs. My recommendations follow pretty obviously from the experiments:

1-3 will vastly improve response time for the homepage, library, and series pages. They will also allow us to remove the token from the URL, which is quite a glaring security issue. 4 and 5 are less important if 1-3 are implemented, but I think they should also be done. @advplyr also suggested storing the required user information in the JWT, which may obviate the need to go to the database at all for populating req.user. So unless I hear other suggestions, my first focus will be 1-3.
-
I'm starting this to describe my experiments and analysis of cover image loading.
Background
For some time now I've noticed that cold rendering (i.e. without browser caching) of pages containing many cover images (e.g. the Homepage, Library, and especially the Series page) takes a long time, especially when the server is accessed through https. So I wanted to investigate a bit more.
There was also a lengthy discussion about this on #dev-chat on Oct 14, where a number of interesting ideas were raised.
But I felt like we first needed to get a clear sense of where the issues lay.
Initial observations
Setup:

In the browser devtools network tab, I checked "Disable cache" and cleared the request list.

Result:

Observations:
So, the main conclusion from this was that there's certainly something on the server side that delays the cover image requests.
Server experiments
So the first thing I wanted to understand was where cover image requests spend most of their time.
For that, I first revived some code I wrote a while ago to debug some other issue:
`measureRequestServingTime()` is an Express middleware that measures the total time, average total time, and number of pending requests for requests matching some pattern (e.g. `/cover`). It is used by inserting it as the first middleware.

`measureMiddlewareTime()` is a wrapper that you can wrap around a specific middleware to measure the time, average time, and pending requests for requests passing through that middleware. The wrapper is a middleware itself, so you can replace an existing middleware registration in the code with a wrapped version to enable time measurements.
`BufferedLogger` is a special logger that doesn't immediately print to the console or write to a file. I use it instead of the standard `Logger` to make sure that logging itself (which in the case of `Logger` might take a few milliseconds) doesn't significantly impact the measurement. The BufferedLogger keeps the logged messages in an array, and can be flushed on command.

I started by adding `app.use(measureRequestServingTime(/\/cover/))` as the first middleware, and made sure the times I saw in the devtools roughly matched the times measured on the server.

Then I started going through the registered middlewares one by one. For each middleware, I:
wrapped it with `measureMiddlewareTime` and examined the measurements.

I found the first choke point at `app.use(passport.session())`. It looked like many requests were spending most of their time there.

After some reading about what `passport.session` does, I found that likely the heaviest thing it does is running `deserializeUser` (a user-provided function that reads the user data from storage, given a user info record extracted from the session cookie), in order to put its result in `req.user`.

In our case, `deserializeUser` (in `Auth.js`) looks like this:

I instrumented it to measure time and pending requests, like this:
These are the log messages from the buffered logger:
Observations:
So it looks like even though the sequelize calls are asynchronous and non-blocking, sequelize doesn't seem to handle concurrency very well.
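One hypothesis for this (an assumption on my part, not something verified against sequelize or sqlite internals): if queries are funneled through a single connection, "concurrent" queries actually serialize, so the last of N simultaneous queries waits roughly N times the single-query time. A toy model of that effect:

```js
// Toy model of queries serializing behind a single connection. This only
// illustrates the hypothesis; it is not sequelize's actual behavior.
class SingleConnectionPool {
  constructor() {
    this.tail = Promise.resolve()
  }
  run(task) {
    // each task starts only after the previous one settles
    const result = this.tail.then(task)
    this.tail = result.catch(() => {}) // keep the chain alive on errors
    return result
  }
}

const pool = new SingleConnectionPool()
const order = []
const issued = [1, 2, 3].map((n) => pool.run(async () => order.push(n)))
// all three were issued "concurrently", but they execute one at a time
```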
Many questions come to mind:
This is what I have right now. Will continue tomorrow.