Kundera performance slow down with load increase #562

Closed
ansarrafique opened this issue Apr 2, 2014 · 6 comments

@ansarrafique

I have a 3-node Cassandra cluster and I am generating load on all 3 nodes using the Kundera client. Performance drops quickly as I increase the load. I ran the same experiment with the Datastax driver 2.0.1 and saw no performance issue. It is possible that I have not configured Kundera correctly: I only have a persistence.xml file and no other configuration file. I am using kundera-cassandra 2.9.

    <properties>
        <property name="kundera.nodes" value="192.168.121.12"/>
        <property name="kundera.port" value="9160"/>
        <property name="kundera.keyspace" value="kundera"/>
        <property name="kundera.dialect" value="cassandra"/>
        <property name="kundera.client.lookup.class" value="com.impetus.client.cassandra.thrift.ThriftClientFactory" />
        <property name="kundera.ddl.auto.prepare" value="update" />
    </properties>

I have edited the cassandra.yaml configuration file to define the cluster. Do I also have to define the cluster in Kundera? If so, how, and is there an example? Is there any clue as to why performance is slow? Also, does Kundera execute calls synchronously or asynchronously underneath? And why does the Kundera Thrift client perform better than the Kundera Pelops client?

@mevivs
Collaborator

mevivs commented Apr 2, 2014

  1. What data volume are you testing with?
  2. Please call em.flush() or em.clear() to clear the persistence cache (first-level cache).
  3. Try the following connection pooling properties:
            <property name="kundera.pool.size.max.active" value="50" />
            <property name="kundera.pool.size.max.total" value="50" />
  1. Pelops is built on top of Thrift and is therefore relatively slow. There is also no active development on Pelops, which is why Kundera introduced its own Thrift client.
  2. Kundera also supports the Datastax Java driver.
  3. Calls are synchronous.

-Vivek

@ansarrafique
Author

  1. I am inserting 1 million records in a single session. There is a single entity with no associations and only 10 columns.
  2. em.persist() is called for each entity. Do I have to call em.flush() for each entity, and why?
  3. I don't want to use connection pooling at this point; I want to evaluate performance without it.
  4. If calls are synchronous, how does Kundera perform well compared to the Datastax API? Does the Datastax API use the Thrift API underneath, with a thin abstraction built on top?

Thanks,

@mevivs
Collaborator

mevivs commented Apr 2, 2014

2. em.persist() is called for each entity. Do I have to call em.flush() for each entity, and why?

Because Kundera keeps an in-memory cache of managed entities, as required by JPA. There is no need to call it for each entity; flush periodically, say once the entity count reaches 5000. We tested 1 million records on a single thread and on multiple threads as well.

I assume you are using a single entity manager factory, since you mentioned a single session.

If calls are synchronous, how does Kundera perform well compared to the Datastax API? Does the Datastax API use the Thrift API underneath, with a thin abstraction built on top?

The Datastax Java driver is based on the CQL binary protocol, not on Thrift. As an object mapper, Kundera adds no more than 7% overhead on top of Thrift or Datastax.

-Vivek
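
The periodic flush suggested above can be sketched as follows. This is a minimal sketch, not Kundera's own code: `EntityStore` is a hypothetical stand-in that mirrors the `persist`, `flush`, and `clear` methods of `javax.persistence.EntityManager` so the example compiles without a JPA provider, and `BatchInserter` and the batch size are illustrative. In real code the `EntityManager` would be passed in directly.

```java
/** Hypothetical stand-in for the three EntityManager methods used below. */
interface EntityStore {
    void persist(Object entity);
    void flush();
    void clear();
}

final class BatchInserter {
    private BatchInserter() {}

    /**
     * Persists every entity, then flushes and clears after each full batch
     * so the first-level cache cannot grow without bound.
     * Returns the number of flush/clear cycles performed.
     */
    static int insertAll(EntityStore em, Iterable<?> entities, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0;
        int cycles = 0;
        for (Object entity : entities) {
            em.persist(entity);
            pending++;
            if (pending == batchSize) {
                em.flush();  // push pending writes to the datastore
                em.clear();  // detach managed entities to free memory
                pending = 0;
                cycles++;
            }
        }
        if (pending > 0) {   // handle the final partial batch
            em.flush();
            em.clear();
            cycles++;
        }
        return cycles;
    }
}
```

With a batch size of 5000, a 1 million record load would go through 200 flush/clear cycles, so the cache would hold at most 5000 managed entities at any time.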

@mevivs mevivs added the Discuss label Apr 2, 2014
@ansarrafique
Author

Yes, I am using a single entity manager factory. I create only one entity manager and persist data through it. It works fine for the first half million records but eventually becomes slow beyond that. I ran the same test with the Datastax API and it works fine.

I am inserting data with Kundera Thrift, Kundera Pelops, and the Datastax API. Kundera performs better than the Datastax API when I execute queries synchronously with the Datastax API, and Datastax performs 10 times better than Kundera when queries are executed asynchronously.

As an object mapper, Kundera should add overhead, so I am surprised that it performs better than the Datastax API.

@mevivs
Collaborator

mevivs commented Apr 3, 2014

It works fine for the first half million records but eventually becomes slow beyond that.

Please add em.clear() and try again. This is clearly caused by the first-level cache: memory is not being freed.

Kundera performs better than the Datastax API when I execute queries synchronously with the Datastax API, and Datastax performs 10 times better than Kundera when queries are executed asynchronously.

Kundera only supports synchronous execution.

We used YCSB to compare Kundera against the raw Thrift API. Have a look at:
https://github.com/impetus-opensource/Kundera/blob/trunk/test/benchmark/ycsb/src/main/java/com/impetus/kundera/ycsb/benchmark/KunderaThriftClient.java

HTH,
-Vivek
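
The gap between synchronous and asynchronous execution reported above can be illustrated with a small simulation. This is a sketch under stated assumptions, not Kundera or Datastax driver code: `fakeWrite` is a hypothetical stand-in that sleeps to imitate one network round trip, and the asynchronous variant uses a thread pool for simplicity, whereas a real asynchronous driver would typically use non-blocking I/O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SyncVsAsyncDemo {
    private SyncVsAsyncDemo() {}

    /** Simulated write (assumption): sleeps for one round trip, returns the id. */
    static int fakeWrite(int id, long latencyMillis) {
        try {
            Thread.sleep(latencyMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return id;
    }

    /** Issues n writes one after another; each waits for the previous one. */
    static long runSync(int n, long latencyMillis) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            fakeWrite(i, latencyMillis);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Issues n writes with up to inFlight of them outstanding at once. */
    static long runAsync(int n, long latencyMillis, int inFlight) {
        ExecutorService pool = Executors.newFixedThreadPool(inFlight);
        try {
            long start = System.nanoTime();
            List<CompletableFuture<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int id = i;
                futures.add(CompletableFuture.supplyAsync(
                        () -> fakeWrite(id, latencyMillis), pool));
            }
            // Wait for all outstanding writes to complete.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }
}
```

In the synchronous case total time grows as roughly the number of writes times the round-trip latency, while the asynchronous case overlaps the round trips, which is one plausible reason for the large difference observed in the thread.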

@ansarrafique
Author

The problem was solved.
