Test MyRocks with concurrent transactions through replication #107

Closed
spetrunia opened this issue Nov 17, 2015 · 7 comments

@spetrunia
Contributor

(This is a write-up from last week's call. The task is for Daniel Lee, @dleeyh.)

We would like to test MyRocks behavior with concurrent transactions, both for crashes and for correctness. There is no ready-made tool for that. Here is an idea for how to achieve this:

  1. Run a master-slave setup with replication (row-based).
  2. Run a concurrent load against the master.
  3. Let the slave catch up.
  4. Compare the data on the master and slave.

This way, the master runs transactions in parallel while the slave applies them sequentially (and hopefully correctly), which gives us the ability to check the master against the slave.
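The four steps above could be wired together roughly as follows. This is only a sketch: the function name and the four callables (`run_load`, `wait_for_slave`, `fetch_master`, `fetch_slave`) are hypothetical placeholders for whatever load generator and database client are actually used, and it assumes the test tables are small enough to compare in memory.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[Tuple]


def check_master_against_slave(
    run_load: Callable[[], None],
    wait_for_slave: Callable[[], None],
    fetch_master: Callable[[], Dict[str, Rows]],
    fetch_slave: Callable[[], Dict[str, Rows]],
) -> List[str]:
    """Return the names of tables whose contents differ after the load.

    All four callables are hypothetical hooks supplied by the test harness.
    """
    run_load()        # step 2: concurrent load against the master
    wait_for_slave()  # step 3: block until the slave has applied everything
    master = fetch_master()
    slave = fetch_slave()

    mismatched = []
    for table in sorted(set(master) | set(slave)):
        # Sort by repr so the comparison ignores physical row order and
        # does not fail when rows contain None (SQL NULL).
        master_rows = sorted(master.get(table, []), key=repr)
        slave_rows = sorted(slave.get(table, []), key=repr)
        if master_rows != slave_rows:
            mismatched.append(table)
    return mismatched
```

An empty result means every table matched. A non-empty result only says which tables to inspect more closely, not which side is wrong.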

@mariadb-DanielLee

Completed the following items:

  1. Built a single-server project and a simple test
  2. Set up a replication stack (one master and one slave)
  3. Verified data (table row counts and checksums) between master and slave nodes
  4. Random Query Generator test
  5. InnoDB stress test (need to configure it to run on MyRocks)
  6. Wrote scripts to automate the above tasks

@mariadb-DanielLee

MyRocks test result update.

Please see the MyRocks test results so far in the attached .csv file. I also have an OS X Numbers workbook file, but unfortunately it is not a supported file type for attachments here. Let me know if you would prefer to have that and I can email it to you.

MyRocks Test Results.txt

A couple of things observed:

  1. Bug #111 ("mysqld some times crashes at the end of Random Query Generator test") seems to occur more often when testing in a single-server environment than in a replication environment. Both tests were performed in the same VM, and replication was configured to use two instances of the same binary with separate data directories. The issue could be timing-sensitive, since the single-server test executes faster, using all available resources. In that case, it would be difficult to reproduce in other setups, such as your development environments.

  2. After testing with MyRocks as the only engine in the replication test, I decided to use MyRocks for the master and InnoDB for the slave. In this configuration, I got mismatched results between master and slave when the number of concurrent users was increased from 25 to 50. Row counts for all 4 test tables (with partition support disabled) did not match. I have not checked whether the row contents matched, and I don't know what the cause is at this time.

  3. I picked up the latest code checked in for issue #105 ("key_info[secondary_key].actual_key_parts does not include primary key on partitioned tables") and performed a few tests with partition support enabled. So far so good. I will perform more tests today.

At this time, I would like to sync up with both the Facebook and MariaDB teams to make sure my testing meets your expectations. Please let me know if you have a preference on which areas I should focus my testing effort on. My next test plan is to modify the SQL statement syntax (the grammar in RQG) and run tests on the MyRocks-InnoDB replication setup.

Thanks

Daniel

@yoshinorim
Contributor

@dleeyh: I tested MyRocks-InnoDB master-slave replication but couldn't reproduce the row count mismatch. Could you double-check whether the InnoDB slave was lagging? When I tested, the InnoDB slave lagged a lot (though setting innodb_flush_log_at_trx_commit=0 mitigated the lag), so I had to wait a while for it to catch up.
Also, how did you check for data mismatches between InnoDB and MyRocks? CHECKSUM TABLE is not compatible between InnoDB and MyRocks, so tools relying on this command (e.g. pt-table-checksum v2) don't work.
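One way to rule out lag before comparing is to check the slave's applied binlog position against the master's. The sketch below assumes the load on the master has already stopped, that the two dictionaries hold one row each from `SHOW MASTER STATUS` and `SHOW SLAVE STATUS` (fetching them is left to whatever client is in use), and that binlog file names carry a fixed-width numeric suffix so they compare correctly as strings. The function name is hypothetical.

```python
from typing import Mapping


def slave_caught_up(
    master_status: Mapping[str, object],
    slave_status: Mapping[str, object],
) -> bool:
    """Return True once the slave has applied everything the master wrote.

    master_status: one row of SHOW MASTER STATUS (File, Position).
    slave_status:  one row of SHOW SLAVE STATUS.
    """
    # Both replication threads must be running, otherwise the slave
    # will never advance and any comparison is meaningless.
    if slave_status.get("Slave_IO_Running") != "Yes":
        return False
    if slave_status.get("Slave_SQL_Running") != "Yes":
        return False

    master_pos = (str(master_status["File"]), int(master_status["Position"]))
    applied_pos = (
        str(slave_status["Relay_Master_Log_File"]),
        int(slave_status["Exec_Master_Log_Pos"]),
    )
    # Assumption: fixed-width binlog suffixes (e.g. mysql-bin.000003),
    # so tuple comparison orders positions correctly.
    return applied_pos >= master_pos
```

A test script would poll this in a loop with a short sleep and only start comparing tables once it returns True.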

@mariadb-DanielLee

Ok. I will check to see if InnoDB is lagging after my current test.

I have a SQL script that dumps the contents of the tables to text files, and I then diff those files.
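For illustration only (this is not the actual script), a dump-and-diff comparison that does not depend on CHECKSUM TABLE might look like the sketch below. The helper names are hypothetical, and it assumes the rows of each table have already been fetched from both servers.

```python
import difflib
from typing import Iterable, List, Sequence


def dump_lines(rows: Iterable[Sequence[object]]) -> List[str]:
    """Render rows as tab-separated text lines in a deterministic order."""
    return sorted(
        "\t".join("NULL" if value is None else str(value) for value in row)
        for row in rows
    )


def diff_dumps(
    master_rows: Iterable[Sequence[object]],
    slave_rows: Iterable[Sequence[object]],
) -> List[str]:
    """Return a unified diff of the two dumps; an empty list means they match."""
    return list(
        difflib.unified_diff(
            dump_lines(master_rows),
            dump_lines(slave_rows),
            fromfile="master",
            tofile="slave",
            lineterm="",
        )
    )
```

Sorting the rendered lines keeps the result independent of the order in which each engine returns rows. When dumping straight from SQL, an ORDER BY on the primary key would serve the same purpose.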

@mariadb-DanielLee

With the latest release build (without debugging), I ran the test again with threads=20. I did not see the InnoDB slave lagging, and I did not see the mismatched results that I reported earlier. I will do more tests, possibly with a debug build again.

@mariadb-DanielLee

Correction: My last test was on the same debug build that I used before.

@mariadb-DanielLee

I increased the threads to 50 and ran more tests. I saw both the InnoDB slave lagging and the mismatched results. Once the InnoDB slave had caught up, the results matched. Thanks, Yoshinori.
