Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the index creation as per data model #2204

Closed
kajal583 opened this issue Nov 9, 2022 · 5 comments
Closed

Improve the index creation as per data model #2204

kajal583 opened this issue Nov 9, 2022 · 5 comments
Milestone

Comments

@kajal583
Copy link
Contributor

kajal583 commented Nov 9, 2022

Version:
・Orion : 2.1.0
・STH-Comet : 2.3.0
・Cygnus : 1.15.0
・MongoDB : 3.4

Bug Description:
An index is automatically created when creating an STH-Comet collection in MongoDB.
However, in STH-Comet's 'collection-per-entity' data model, when searching, STH-Comet does'nt use index and a full scan is performed.
So, after investigating we have found that cygnus in creating indexes . And for all data model same index is created so we need to improve index as per the data model.

Steps to reproduce:
After Integrating cygnus with orion, STH-Comet, MongoDBand executing below comands
[Collection confirmation]

> show collections
sth_x002ftestxfffftid1xfffftype
sth_x002ftestxfffftid1xfffftype.aggr 

[Index set on collection]
The following indexes are automatically created when creating a collection.
 ・entityId
 ・entityType
 ・attrName
 ・recvTime

> db['sth_x002ftestxfffftid1xfffftype'].getIndexes()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "sth_test.sth_x002ftestxfffftid1xfffftype" 
        },
        {
                "v" : 2,
                "key" : {
                        "entityId" : 1,
                        "entityType" : 1,
                        "attrName" : 1,
                        "recvTime" : 1
                },
                "name" : "entityId_1_entityType_1_attrName_1_recvTime_1",
                "ns" : "sth_test.sth_x002ftestxfffftid1xfffftype" 
        }
]

[Collection data]
The 'entityId' and 'entityType' values for the index are not present in the registration data.

> db['sth_x002ftestxfffftid1xfffftype'].find()
{ "_id" : ObjectId("633fc5d51959bd0017145b1b"), "recvTime" : ISODate("2022-10-07T06:23:11.345Z"), "attrName" : "pressure", "attrType" : "Float", "attrValue" : "130" }
{ "_id" : ObjectId("633fc8151959bd0017145b1c"), "recvTime" : ISODate("2022-10-07T06:32:53.570Z"), "attrName" : "pressure", "attrType" : "Float", "attrValue" : "100" }
{ "_id" : ObjectId("633fc8d81959bd0017145b1d"), "recvTime" : ISODate("2022-10-07T06:36:07.866Z"), "attrName" : "pressure", "attrType" : "Float", "attrValue" : "101" }

Expected behaviour:
Below command should not give entityId and entityType of index :
db['sth_x002ftestxfffftid1xfffftype'].getIndexes()

@fgalan
Copy link
Member

fgalan commented Nov 10, 2022

Thanks for your feedback!

Could you test with a newer stack, please? Some of the versions you are using at at least 4 years old...

In particular, considering the last versions at the time of writing this, it would be:

  • Orion : 3.7.0
  • STH-Comet : 2.10.0
  • Cygnus : 2.20.0
  • MongoDB : 4.4

The index management behaviour has changed in newest versions, and although I cannot assure that it is working as expected with the current ones, it's better to check (and eventually debug and fix) with the last version of the software.

@kajal583
Copy link
Contributor Author

Could you test with a newer stack, please? Some of the versions you are using at at least 4 years old...

In particular, considering the last versions at the time of writing this, it would be:

  • Orion : 3.7.0
  • STH-Comet : 2.10.0
  • Cygnus : 2.20.0
  • MongoDB : 4.4

The index management behaviour has changed in newest versions, and although I cannot assure that it is working as expected with the current ones, it's better to check (and eventually debug and fix) with the last version of the software.

I have tested with a newer version stack which is shared by you as below:

  • Orion : 3.7.0
  • STH-Comet : 2.10.0
  • Cygnus : 2.20.0
  • MongoDB : 4.4

But the index creation as per data model is not applied in cygnus:2.20.0 also.

@mapedraza
Copy link
Collaborator

The index are all created at same time, having all of them placed even if
due to a matter of simplicity and homogeneity in the code. As we were not having big performance issues in placing extra indexes.

Are you facing any performance issue with this? Can you provide some tests comparing the performance between a table just with 2 indexes and a table with the 4 indexes?

The code part related with the index creation is placed here
(Note that there are two homonym createCollection method that differs one from another)

https://github.com/telefonicaid/fiware-cygnus/blob/master/cygnus-common/src/main/java/com/telefonica/iot/cygnus/backends/mongo/MongoBackendImpl.java#L86-L187

@kajal583
Copy link
Contributor Author

kajal583 commented Nov 16, 2022

Hi @fgalan , @mapedraza

The index are all created at same time, having all of them placed even if due to a matter of simplicity and homogeneity in the code. As we were not having big performance issues in placing extra indexes.

Are you facing any performance issue with this? Can you provide some tests comparing the performance between a table just with 2 indexes and a table with the 4 indexes?

The code part related with the index creation is placed here (Note that there are two homonym createCollection method that differs one from another)

https://github.com/telefonicaid/fiware-cygnus/blob/master/cygnus-common/src/main/java/com/telefonica/iot/cygnus/backends/mongo/MongoBackendImpl.java#L86-L187

We are having below performance issues in testing:

  • An index is automatically created when creating collection in MongoDB. However, in Collection per entity data model, when searching, it does not use index and full scan is performed.

  • When we execute 'explain()' function we are getting stage as 'COLLSCAN' as shown in below output. However, it should be 'IXSCAN' when using index.

[Result of 'explain()']

> db["sth_x002ftestxfffftid1xfffftype"].find({"attrName" : "pressure", "recvTime" : {"$lte" : ISODate("2022-10-10T14:59:59.999Z"), "$gte" : ISODate("2022-10-07T15:00:00Z") }}).explain('executionStats')
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "sth_test.sth_x002ftestxfffftid1xfffftype",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "$and" : [
                                {
                                        "attrName" : {
                                                "$eq" : "pressure" 
                                        }
                                },
                                {
                                        "recvTime" : {
                                                "$lte" : ISODate("2022-10-10T14:59:59.999Z")
                                        }
                                },
                                {
                                        "recvTime" : {
                                                "$gte" : ISODate("2022-10-07T15:00:00Z")
                                        }
                                }
                        ]
                },
                "winningPlan" : {
                        "stage" : "COLLSCAN",
                        "filter" : {
                                "$and" : [
                                        {
                                                "attrName" : {
                                                        "$eq" : "pressure" 
                                                }
                                        },
                                        {
                                                "recvTime" : {
                                                        "$lte" : ISODate("2022-10-10T14:59:59.999Z")
                                                }
                                        },
                                        {
                                                "recvTime" : {
                                                        "$gte" : ISODate("2022-10-07T15:00:00Z")
                                                }
                                        }
                                ]
                        },
                        "direction" : "forward" 
                },
                "rejectedPlans" : [ ]
        },
        "executionStats" : {
                "executionSuccess" : true,
                "nReturned" : 0,
                "executionTimeMillis" : 1,
                "totalKeysExamined" : 0,
                "totalDocsExamined" : 169,
                "executionStages" : {
                        "stage" : "COLLSCAN",
                        "filter" : {
                                "$and" : [
                                        {
                                                "attrName" : {
                                                        "$eq" : "pressure" 
                                                }
                                        },
                                        {
                                                "recvTime" : {
                                                        "$lte" : ISODate("2022-10-10T14:59:59.999Z")
                                                }
                                        },
                                        {
                                                "recvTime" : {
                                                        "$gte" : ISODate("2022-10-07T15:00:00Z")
                                                }
                                        }
                                ]
                        },
                        "nReturned" : 0,
                        "executionTimeMillisEstimate" : 0,
                        "works" : 171,
                        "advanced" : 0,
                        "needTime" : 170,
                        "needYield" : 0,
                        "saveState" : 1,
                        "restoreState" : 1,
                        "isEOF" : 1,
                        "invalidates" : 0,
                        "direction" : "forward",
                        "docsExamined" : 169
                }
        },
        "serverInfo" : {
                "host" : "mongo-common-deployment-7fcd9788c4-txjz4",
                "port" : 27017,
                "version" : "3.4.23",
                "gitVersion" : "324017ede1dbb1c9554dd2dceb15f8da3c59d0e8" 
        },
        "ok" : 1
}


Also in STH-Comet, indexes are created based on data model https://github.com/telefonicaid/fiware-sth-comet/blob/master/lib/database/sthDatabase.js#L159. The index creation logic at cygnus side should be in sync with sth-comet.
Currently, index is created on below keys for every datamodel.

  • entityId
  • entityType
  • attrName
  • recvTime

In case of 'collection per entity' datamodel, above key will not be used as data will not contain entityId. Same is with 'collection per attribute' datamodel.

So, we should create an index which will work in case of ' collection per entity ' datamodel and 'collection per attribute ' datamodel.
If you have any query related to this, please let me know.
I would like to work on this issue. Please assign this issue to me.

@fgalan
Copy link
Member

fgalan commented Dec 13, 2022

Fixed by PR #2205

@fgalan fgalan closed this as completed Dec 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants