curious what you would recommend for real-time training + prediction models? #491

victusfate · 2021-11-09T19:03:54Z

I admire the api, efficiency, and results of implicit.

I'm finding a need for real time training + prediction in some of my company's systems, and started searching around for ideas/implementations. Has anyone had experience working with this?

Realize this is off topic from implicit (totally understand if it's closed).
Starting to look for ideas here:

victusfate · 2021-11-23T19:07:31Z

After spending some time looking at HRNN and its implementations, I switched gears to something simpler that supports continuous learning: https://github.com/online-ml/river

victusfate · 2021-11-30T21:05:39Z

If anyone's curious I'm building an open source version here https://github.com/victusfate/concierge
Just hooked up redis pubsub events into updating the model today

Todo: on server startup get all events since last model training and update each model

benfred · 2022-01-25T18:56:05Z

There are two different things you can do with implicit to get near-real-time updates with the ALS model:

- Set the recalculate_user flag on the model.recommend calls to automatically regenerate the user representation. This lets your recommendations react, at inference time, to changes in what the user has interacted with.
- Use incremental retraining. I've just added support for it for ALS models in PR #527 (Add incremental retraining support for ALS models), which lets you update the model with new items or users, as well as recalculate existing items with new interactions.
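
For intuition, recalculating a user at inference time amounts to solving the ALS least-squares problem for that one user while the item factors stay fixed. The sketch below shows that step in plain Python under the usual implicit-feedback ALS formulation. It is not implicit's actual code, and the names `solve`, `fold_in_user`, and `score_items` are made up for illustration; `confidences` is assumed to map an item index to a positive confidence value.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def fold_in_user(item_factors: Matrix, confidences: Dict[int, float],
                 regularization: float = 0.01) -> Vector:
    """Least-squares user vector for fixed item factors.

    Solves (Y^T Y + sum_i (c_i - 1) y_i y_i^T + reg * I) x = sum_i c_i y_i,
    where the sums run over the items the user interacted with.
    """
    k = len(item_factors[0])
    # Y^T Y over all items, with the regularization added on the diagonal.
    a = [[sum(y[r] * y[c] for y in item_factors)
          + (regularization if r == c else 0.0)
          for c in range(k)] for r in range(k)]
    b = [0.0] * k
    for item, conf in confidences.items():
        y = item_factors[item]
        for r in range(k):
            b[r] += conf * y[r]
            for c in range(k):
                a[r][c] += (conf - 1.0) * y[r] * y[c]
    return solve(a, b)


def score_items(user_vector: Vector, item_factors: Matrix) -> Vector:
    """Dot product of the user vector with every item vector."""
    return [sum(u * v for u, v in zip(user_vector, y)) for y in item_factors]
```

Since only a small k-by-k system is solved per request, this is cheap enough to run at request time. Adding a brand-new user in an incremental retrain is the same computation, with the result appended to the user factor matrix.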

victusfate · 2022-01-26T16:12:02Z

This is great news, I'd love to compare the results to river-ml since I have more experience with implicit.
~~When it's ready for review, it'd be great to see a small sample program/example with live updates to the model for recommendations~~ Oh it's already ready to try out, I'll get this on my schedule.

Also worth noting I got the deployed system to work great.

I gather all user item ratings hourly for a full training (snapshot model). When new servers come up they load this model and then delta train from a redis ordered set of all user item ratings since the last model snapshot. In addition live models receive real time updates via redis pubsub.

This way at scale, I can have multiple predictor http servers all yielding similar results (can't guarantee they all receive all updates in the same order), but they are generally convergent.
online-ml/river#803
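
A rough sketch of the snapshot, delta-replay, and live-update flow described above, with everything in memory so it runs standalone. A sorted Python list stands in for the redis ordered set and plain method calls stand in for pubsub delivery. `RatingModel`, `train_snapshot`, `catch_up`, and the event tuple shape are all hypothetical, not taken from the deployed system.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (timestamp, user_id, item_id, rating); hypothetical event shape.
Event = Tuple[float, str, str, float]


class RatingModel:
    """Toy stand-in for a real recommender: sums ratings per (user, item)."""

    def __init__(self, trained_at: float = 0.0) -> None:
        self.trained_at = trained_at
        self.totals: Dict[Tuple[str, str], float] = defaultdict(float)

    def update(self, event: Event) -> None:
        """Apply one rating event (what a live pubsub handler would call)."""
        _, user, item, rating = event
        self.totals[(user, item)] += rating


def train_snapshot(events: Iterable[Event], now: float) -> RatingModel:
    """Full (e.g. hourly) training over every event up to `now`."""
    model = RatingModel(trained_at=now)
    for event in events:
        if event[0] <= now:
            model.update(event)
    return model


def catch_up(model: RatingModel, event_log: List[Event]) -> int:
    """Replay events newer than the snapshot on server startup.

    `event_log` must be sorted by timestamp, like a sorted set scored by time.
    Returns the number of events replayed.
    """
    timestamps = [event[0] for event in event_log]
    start = bisect_right(timestamps, model.trained_at)
    for event in event_log[start:]:
        model.update(event)
    return len(event_log) - start
```

Because the toy update is a commutative sum, two servers that receive the same live events in different orders end up identical. A real model's incremental update is usually only approximately order-independent, which matches the "generally convergent" behavior described above.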

sorenrife · 2022-02-14T00:07:34Z

When a user is new but the server can't fit them into the model yet (as @victusfate explained, a pub/sub flow that adds new users/items should preferably have some delay for performance reasons), how can I recommend to this new user?

Should I call the recommend method with a random userid and pass the new user's few interactions as user_items? If so, would it make sense to make the userid parameter optional?

(I'm assuming this because I don't know how much the userid actually matters in the recommend method when the recalculate_user flag is true.)

victusfate · 2022-02-15T16:29:09Z

@sorenrife I ended up using popular results for new users in my current deployment using implicit (just hourly trained atm), and I think you can take the same approach with live model updates (keep an active popularity rank going as ratings come in)

Something like this (code snippets from my hourly training; df is a pandas DataFrame):

    # Sum ratings per item, then min-max normalize the totals to [0, 1].
    # Note: this yields NaN if every item has the same total (max == min).
    pr = df.groupby([constants.ITEM_COLUMN])[constants.RATING_COLUMN].sum()
    pr = (pr - pr.min()) / (pr.max() - pr.min())
    # Store as a dict ordered from most to least popular.
    self.item_popularity_map = dict(
        sorted(pr.to_dict().items(), key=lambda kv: kv[1], reverse=True)
    )

and in the rankings method

  def rankings(self, user_id: str, selected_items):
    ranks = {}
    # Translate the requested items to the model's internal item indices.
    selected_idx = [self.inv_item_map[item] for item in selected_items]

    # handle novel / unknown users with popularity rank
    if user_id not in self.inv_user_map:
      try:
        for k in selected_idx:
          item_name = self.item_map[k]
          score = self.item_popularity_map[k]
          ranks[item_name] = float(score)
      except Exception as e:
        print('ImplicitPredictor.rankings popularity exception', e)
    else:
      user_idx = self.inv_user_map[user_id]
      try:
        rankings = self.model.rank_items(user_idx, self.user_items, selected_idx)
        for item_idx, prob in rankings:
          item_name = self.item_map[item_idx]
          ranks[item_name] = float(prob)
      except Exception as e:
        print('rankings exception', e)
    return ranks
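
For the live-update case, "keeping an active popularity rank going as ratings come in" could look roughly like the sketch below. It keeps running rating totals per item and applies the same min-max normalization on read. `PopularityTracker` and its methods are hypothetical names, and returning 1.0 for every item when all totals are equal is an assumption made here to avoid a division by zero.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


class PopularityTracker:
    """Running per-item rating totals, normalized to [0, 1] on demand."""

    def __init__(self) -> None:
        self.totals: Dict[str, float] = defaultdict(float)

    def add(self, item: str, rating: float) -> None:
        """Record one incoming rating event for an item."""
        self.totals[item] += rating

    def scores(self) -> Dict[str, float]:
        """Min-max normalized popularity for every item seen so far."""
        if not self.totals:
            return {}
        lo = min(self.totals.values())
        hi = max(self.totals.values())
        span = hi - lo
        if span == 0:
            # Assumption: treat all items as equally popular.
            return {item: 1.0 for item in self.totals}
        return {item: (total - lo) / span for item, total in self.totals.items()}

    def rank(self, candidates: Iterable[str]) -> List[Tuple[str, float]]:
        """Rank known candidates from most to least popular; skip unknown ones."""
        scores = self.scores()
        known = [(item, scores[item]) for item in candidates if item in scores]
        return sorted(known, key=lambda pair: pair[1], reverse=True)
```

Normalizing on read keeps `add` constant-time per event. If reads dominate, the normalized scores could be cached and refreshed periodically instead.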
