Dumping and loading : no documents #2

Open
dav009 opened this issue Sep 9, 2020 · 2 comments

dav009 commented Sep 9, 2020

I am loading some random vectors into the engine and then running a few dummy searches, which work fine.

However, when I dump the engine and load it again, the engine reports zero documents and every query returns zero results.

Generating the dump:

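import numpy as np
import vearch

# dimension and max_doc_size are defined elsewhere; their values are not shown here.

def generate_vector(n=520):
    # Build n random float32 vectors and wrap each one as a document dict.
    features = np.random.rand(n, dimension).astype('float32')
    doc_items = []
    print(n)
    for i in range(n):
        profiles = {}
        profiles["id"] = i
        profiles["embedding"] = features[i, :].tolist()
        doc_items.append(profiles)
    return doc_items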

engine = vearch.Engine("dummy_data", max_doc_size)
engine.init_log_dir("dummy_logs")
table = {
    "name": "test_table",
    "index_size":10000,
    "model": {
        "name": "IVFPQ",
        "nprobe": -1,
        "metric_type": "L2",
        "ncentroids": -1,
        "nsubvector": -1
    },
    "properties": {
        "id": {
            "type": "integer",
            "index": "true"
        },
        "embedding": {
            "index": "true",
            "type": "vector",
            "dimension": dimension,
            "store_type": "Mmap",
            "store_param": {"cache_size": 2000}
        },
    },
}
engine.create_table(table)
doc_items = generate_vector(n=9984)
engine.build_index()
engine.dump()
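
One way to narrow this down is to check whether the dump wrote anything to disk at all. The sketch below only lists files under a directory with their sizes. It assumes the dump ends up somewhere under the path passed to vearch.Engine ("dummy_data" here), which nothing in this thread confirms; list_dump_files is a hypothetical helper name, not part of vearch.

```python
from pathlib import Path


def list_dump_files(data_dir):
    """Return sorted (relative path, size in bytes) pairs for files under data_dir.

    Assumption: the engine writes its dump under this directory.
    Returns an empty list if the directory does not exist.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    # "dummy_data" is the path used in the report above.
    for name, size in list_dump_files("dummy_data"):
        print(f"{size:>12}  {name}")
```

An empty listing, or only zero-byte files, would point at the dump step rather than the load step.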

Loading:

engine2 = vearch.Engine("dummy_data", max_doc_size)
engine2.init_log_dir("dummy_logs")
engine2.load()
total_num = engine2.get_doc_num()
print("total docs")
# prints 0
print(total_num)
# Searching for any vector also returns empty results.
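
To make the symptom easier to reproduce, the dump and load steps can be wrapped in one check that compares document counts before and after. This is a minimal sketch, not vearch's documented workflow: check_round_trip is a hypothetical helper, it expects two engine objects pointed at the same data directory with the table already created on the first one, and the add() call is an assumption because the snippets above never show how the documents were inserted. The other method names are the ones used in the report.

```python
def check_round_trip(writer, reader, docs):
    """Add docs to writer, dump it, load into reader, and compare doc counts.

    Returns (count_before_dump, count_after_load, counts_match).
    """
    writer.add(docs)  # assumed insertion call; not shown in the report
    writer.build_index()
    before = writer.get_doc_num()
    writer.dump()

    reader.load()
    after = reader.get_doc_num()
    return before, after, before == after
```

If the count is already 0 before the dump, the documents never reached the engine and the problem is on the insertion side; if it only drops to 0 after load(), the dump or load step is losing them.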
@mk-michal

I have the same issue when loading an Engine from file. I tried the example described in the docs and dumped the model, but on loading I get these error messages from table.cc and gamma_engine.cc:

ERROR 2021-03-04 16:13:21,893 table.cc:290 Duplicate field _id
ERROR 2021-03-04 16:13:21,893 gamma_engine.cc:444 Cannot create table!
ERROR 2021-03-04 16:13:21,893 gamma_engine.cc:901 create table error when loading
ERROR 2021-03-04 16:13:21,893 gamma_engine.cc:915 create table from local error

How can I resolve the "Duplicate field _id" error?
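
The four log lines above share one timestamp and read like a single failure propagating upward, so the first one (table.cc:290) is probably the one to investigate. A small parser can pull that entry out of a longer log. This is a sketch only: the line format is inferred from the four lines quoted here, not from any vearch documentation, and parse_log and first_error are hypothetical helper names.

```python
import re

# Format inferred from the quoted lines:
# LEVEL YYYY-MM-DD HH:MM:SS,mmm file.cc:LINE message
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+)\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<file>[\w.]+):(?P<line>\d+)\s+"
    r"(?P<message>.*)$"
)


def parse_log(text):
    """Parse log text into a list of dicts; lines that do not match are skipped."""
    entries = []
    for raw in text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match:
            entry = match.groupdict()
            entry["line"] = int(entry["line"])
            entries.append(entry)
    return entries


def first_error(text):
    """Return the earliest ERROR entry in the text, or None if there is none."""
    for entry in parse_log(text):
        if entry["level"] == "ERROR":
            return entry
    return None
```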

dav009 commented Mar 5, 2021

@mk-michal This is not very helpful, but I never figured it out and ended up moving to another tool: Milvus.
