Initial Code for removing data provider dups #1652

Merged
merged 21 commits into from
Sep 3, 2024

Conversation

oblodgett
Member

No description provided.

@oblodgett
Member Author

oblodgett commented Aug 29, 2024

Chose to split the migrations into multiple files. Each file is basically its own transaction, and putting everything into one transaction can run the server out of memory. The deletes are broken up for the same reason: they were MUCH faster in batches than as one lump query. The single queries are run at the end just in case any rows were missed.

Still need to run the GFF loads to test that this works. The merge can happen after the testing is done locally.
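
To illustrate the batching idea described above, here is a minimal sketch that generates one DELETE per id range plus a final catch-all statement. It is not the SQL from this PR: the class name, the table name, the id column, and the filter condition are all hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not code from this PR. Table, column and condition
// names are placeholders chosen only for illustration.
final class BatchedDeleteSketch {

    // Builds one DELETE per id range [start, end) so that each statement
    // touches at most batchSize ids, followed by one unbatched statement
    // that catches any matching rows the ranges did not cover.
    static List<String> buildBatchedDeletes(String table, String idColumn, String condition,
                                            long minId, long maxId, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> statements = new ArrayList<>();
        for (long start = minId; start <= maxId; start += batchSize) {
            long end = Math.min(start + batchSize, maxId + 1);
            statements.add("DELETE FROM " + table + " WHERE " + condition
                + " AND " + idColumn + " >= " + start
                + " AND " + idColumn + " < " + end + ";");
        }
        // Catch-all, mirroring the "single queries run at the end" step.
        statements.add("DELETE FROM " + table + " WHERE " + condition + ";");
        return statements;
    }
}
```

Placing groups of these statements in separate migration files would keep each transaction small, which is the memory argument made above; how the real migrations choose their ranges is not shown in this conversation.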

@oblodgett
Member Author

Here are the timings on my local system... basically it ran ALL day.
2024-08-28 08:20:50,053 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Current version of schema "public": 0.37.0.6
2024-08-28 08:20:50,078 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.7 - remove dataprovider dups"
2024-08-28 08:24:59,231 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.8 - remove dataprovider dups"
2024-08-28 08:48:12,741 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.9 - remove dataprovider dups"
2024-08-28 08:50:30,722 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.10 - remove dataprovider dups"
2024-08-28 09:31:38,553 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.11 - remove dataprovider dups"
2024-08-28 09:36:06,771 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.12 - remove dataprovider dups"
2024-08-28 09:39:27,132 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.13 - remove dataprovider dups"
2024-08-28 10:21:22,436 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.14 - remove dataprovider dups"
2024-08-28 10:33:49,684 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.15 - remove dataprovider dups"
2024-08-28 10:33:49,772 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.16 - remove dataprovider dups"
2024-08-28 10:44:47,021 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.17 - remove dataprovider dups"
2024-08-28 10:44:47,145 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.18 - remove dataprovider dups"
2024-08-28 10:47:08,591 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.19 - remove dataprovider dups"
2024-08-28 11:18:43,031 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.20 - remove dataprovider dups"
2024-08-28 11:22:54,690 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.21 - remove dataprovider dups"
2024-08-28 11:28:16,288 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.22 - remove dataprovider dups"
2024-08-28 11:39:18,748 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.23 - remove dataprovider dups"
2024-08-28 11:41:47,340 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.24 - remove dataprovider dups"
2024-08-28 11:43:21,762 ERROR [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migration of schema "public" to version "0.37.0.24 - remove dataprovider dups" failed! Changes successfully rolled back.
2024-08-28 12:28:49,399 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.24 - remove dataprovider dups"
2024-08-28 12:40:29,285 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.25 - remove dataprovider dups"
2024-08-28 12:51:53,262 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.26 - remove dataprovider dups"
2024-08-28 13:02:51,569 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.27 - remove dataprovider dups"
2024-08-28 13:20:07,826 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.28 - remove dataprovider dups"
2024-08-28 13:43:55,823 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.29 - remove dataprovider dups"
2024-08-28 14:06:00,491 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.30 - remove dataprovider dups"
2024-08-28 14:25:40,790 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.31 - remove dataprovider dups"
2024-08-28 14:43:45,979 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.32 - remove dataprovider dups"
2024-08-28 15:01:02,102 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.33 - remove dataprovider dups"
2024-08-28 15:10:57,686 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.34 - remove dataprovider dups"
2024-08-28 15:13:06,906 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.35 - remove dataprovider dups"
2024-08-28 15:14:08,119 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.36 - remove dataprovider dups"
2024-08-28 15:22:47,710 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.37 - remove dataprovider dups"
2024-08-28 15:31:37,141 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.38 - remove dataprovider dups"
2024-08-28 15:40:49,514 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.39 - remove dataprovider dups"
2024-08-28 16:04:08,633 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.40 - remove dataprovider dups"
2024-08-28 16:28:00,989 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.41 - remove dataprovider dups"
2024-08-28 16:49:54,621 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.42 - remove dataprovider dups"
2024-08-28 17:10:01,377 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.43 - remove dataprovider dups"
2024-08-28 17:27:19,475 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.44 - remove dataprovider dups"
2024-08-28 17:36:39,674 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.45 - remove dataprovider dups"
2024-08-28 17:38:47,192 INFO [org.fly.cor.int.com.DbMigrate] (Quarkus Main Thread) Migrating schema "public" to version "0.37.0.46 - remove dataprovider dups"
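
Each log entry marks when a migration starts, so the gap between two consecutive timestamps approximates how long the earlier step ran. The sketch below computes those gaps; it assumes every entry begins with a 23-character timestamp in the format shown in the log, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for reading the timings above; not part of this PR.
final class MigrationTimingSketch {

    // Matches timestamps such as "2024-08-28 08:20:50,078".
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    // Parses the leading 23-character timestamp of one log entry.
    static LocalDateTime timestampOf(String logEntry) {
        return LocalDateTime.parse(logEntry.substring(0, 23), STAMP);
    }

    // Returns the elapsed time between each pair of consecutive entries.
    static List<Duration> gaps(List<String> logEntries) {
        List<Duration> result = new ArrayList<>();
        for (int i = 1; i < logEntries.size(); i++) {
            result.add(Duration.between(
                timestampOf(logEntries.get(i - 1)),
                timestampOf(logEntries.get(i))));
        }
        return result;
    }
}
```

Note that the gap after the ERROR entry includes whatever time passed before the migration was restarted, and the last entry only records the start of 0.37.0.46, so its run time cannot be read from this log.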

@oblodgett
Member Author

Added to this is the reorg of BulkLoad -> BulkLoadFile -> BulkLoadFileHistory. Instead of both relationships being O2O, there is now a BulkLoad - O2M -> BulkLoadFileHistory <- M2O - BulkLoadFile. This allows one file to be under multiple loads, and lets loads be split up into their parts. This PR also does that with the GFF loads, breaking them out into 9 separate loads per mod. This might be a bit excessive, however it makes the loads run very fast.
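
One reading of the structure described above is that BulkLoadFileHistory acts as the link between a load and a file. The sketch below models that reading with plain Java classes. These are simplified stand-ins rather than the project's real entities, and every field and method name in them is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration only; the real entities in the
// project carry persistence mappings and many more fields.
final class BulkLoad {
    final String name;                       // hypothetical field
    final List<BulkLoadFileHistory> history = new ArrayList<>();
    BulkLoad(String name) { this.name = name; }
}

final class BulkLoadFile {
    final String fileName;                   // hypothetical field
    final List<BulkLoadFileHistory> history = new ArrayList<>();
    BulkLoadFile(String fileName) { this.fileName = fileName; }
}

// Each history row points at exactly one load and exactly one file, so a
// file can appear under several loads and a load can have several runs.
final class BulkLoadFileHistory {
    final BulkLoad bulkLoad;
    final BulkLoadFile bulkLoadFile;

    private BulkLoadFileHistory(BulkLoad bulkLoad, BulkLoadFile bulkLoadFile) {
        this.bulkLoad = bulkLoad;
        this.bulkLoadFile = bulkLoadFile;
    }

    // Creates a history row and registers it on both sides of the link.
    static BulkLoadFileHistory record(BulkLoad load, BulkLoadFile file) {
        BulkLoadFileHistory row = new BulkLoadFileHistory(load, file);
        load.history.add(row);
        file.history.add(row);
        return row;
    }
}
```

With this shape, a single GFF file can be recorded once under an exon load and once under a transcript load without duplicating the file itself.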

@oblodgett
Member Author

@markquintontulloch if you can look at this PR when you get a chance.

@oblodgett oblodgett merged commit 17f7dbe into alpha Sep 3, 2024
10 checks passed
@oblodgett oblodgett deleted the remove_dataprovider_dups branch September 3, 2024 14:52
@@ -45,7 +45,7 @@ export const DataLoadsComponent = () => {
  const [bulkLoadDialog, setBulkLoadDialog] = useState(false);
  const [expandedGroupRows, setExpandedGroupRows] = useState(null);
  const [expandedLoadRows, setExpandedLoadRows] = useState(null);
- const [expandedFileRows, setExpandedFileRows] = useState(null);
+ //const [expandedFileRows, setExpandedFileRows] = useState(null);
Contributor

Should that be removed? Or is that commenting out only temporary?

Member Author

Ya, the tables need to get reworked in order to support the new structure.

  public APIResponse updateTranscripts(String dataProvider, String assembly, List<Gff3DTO> gffData) {
- return gff3Executor.runLoadApi(dataProvider, assembly, gffData);
+ BulkLoadFileHistory history = new BulkLoadFileHistory();
Contributor

This looks very much like the code for the updateExons() method. Could that common code be centralized and called from both?
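
As a rough illustration of that suggestion, the sketch below pulls the shared setup into one helper that both methods call. It does not reproduce the real method bodies, which are not shown in this conversation: the History class, the runWithHistory helper, and the row-counting logic are all hypothetical stand-ins.

```java
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical sketch of centralizing shared load code; none of these
// names or bodies come from the PR.
final class GffLoadSketch {

    // Stand-in for a history record that tracks one load step.
    static final class History {
        final String step;
        int processed;
        History(String step) { this.step = step; }
    }

    // Shared routine: creates the history, runs the step-specific work,
    // and stores how many rows that step reported as processed.
    static <T> History runWithHistory(String stepName, List<T> gffData,
                                      ToIntFunction<List<T>> step) {
        History history = new History(stepName);
        history.processed = step.applyAsInt(gffData);
        return history;
    }

    // Each public method now only supplies what differs between steps.
    static History updateTranscripts(List<String> gffData) {
        return runWithHistory("transcripts", gffData,
            data -> (int) data.stream().filter(row -> row.startsWith("transcript")).count());
    }

    static History updateExons(List<String> gffData) {
        return runWithHistory("exons", gffData,
            data -> (int) data.stream().filter(row -> row.startsWith("exon")).count());
    }
}
```

Whether a lambda, a strategy object, or a plain private method fits best would depend on how much the two real methods actually share.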
