# Prepare multimodal information

Let's take text information as an example. First, apply a pre-trained text encoder (such as BERT) to encode each item's text and write the results to `text_emb.txt`, formatted like this (a sketch of how such a file can be produced follows the example):

```
item_id:token	txt_emb:float_seq
1234	-0.24010642 -0.01735022 -1.067751...
1235	0.029363764 -0.059917185 -1.0038725...
...
```
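If you are unsure how to produce such a file, here is a minimal sketch using the Hugging Face `transformers` library. The `bert-base-uncased` checkpoint, the example item texts, and the mean-pooling step are illustrative assumptions, not requirements of this repo:

```python
# Minimal sketch: encode item texts with a pre-trained BERT and write text_emb.txt.
# Model name, example items, and mean pooling are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

# item_id -> item text (replace with your own items)
items = {
    "1234": "A red cotton t-shirt",
    "1235": "Wireless noise-cancelling headphones",
}

with open("text_emb.txt", "w") as out:
    out.write("item_id:token\ttxt_emb:float_seq\n")  # header line
    with torch.no_grad():
        for item_id, text in items.items():
            inputs = tokenizer(text, return_tensors="pt", truncation=True)
            hidden = model(**inputs).last_hidden_state  # [1, seq_len, hidden_dim]
            emb = hidden.mean(dim=1).squeeze(0)         # mean-pool over tokens
            out.write(item_id + "\t" + " ".join(str(v.item()) for v in emb) + "\n")
```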

Then run the following Python script to convert `text_emb.txt` into a `.pth` file:

```python
import torch

# Read the tab-separated embedding file and build an {item_id: tensor} dict.
emb_dict = {}
with open("text_emb.txt", "r") as f:
    next(f)  # skip the header line (item_id:token\ttxt_emb:float_seq)
    for line in f:
        item_id, emb_str = line.strip().split("\t")
        emb = [float(x) for x in emb_str.split(" ")]
        emb_dict[item_id] = torch.tensor(emb)

# Save the dict so it can be loaded later with torch.load.
torch.save(emb_dict, "text_emb.pth")
```
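As an optional sanity check, you can load the saved file back and inspect it (the item id `"1234"` comes from the example above):

```python
emb_dict = torch.load("text_emb.pth")
print(len(emb_dict))            # number of items
print(emb_dict["1234"].shape)   # embedding dimension, e.g. torch.Size([768])
```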

You now have `text_emb.pth`. Finally, set `item_feat_emb` in the config file (a `.yaml` file) like this:

```yaml
item_feat_emb: YOUR_FILE_PATH/text_emb.pth
```