
Feature/cargo sqlx migrate #171

Merged

Conversation

JesperAxelsson
Contributor

MVP migrator.
Can't create databases yet, and no down migrations.

@mehcode
Member

mehcode commented Mar 31, 2020

Thanks for the initial push @JesperAxelsson.


I plan to get this initially merged after we squash a couple of bugs and push out v0.3.1 in a couple of days.

@D1plo1d
Contributor

D1plo1d commented Apr 4, 2020

Excited to see this feature, so I tried it out from your branch. I found that it works great for migrations that create a single table, but trying to run a migration with multiple CREATE TABLE statements gives the following error:

Applying migration: '2020-04-04_14-29-08_create_initial_schema.sql'
thread 'main' panicked at 'Failed to run migration "2020-04-04_14-29-08_create_initial_schema.sql": Database(DatabaseError { message: "cannot insert multiple commands into a prepared statement", details: None, hint: None, table_name: None, column_name: None, constraint_name: None })', /home/[redacted]/.cargo/git/checkouts/sqlx-85e146549543ea70/fcd6ef4/cargo-sqlx/src/main.rs:159:9

@JesperAxelsson
Contributor Author

Good find! The issue is probably that I run the migration in a transaction; this causes issues with CREATE DATABASE as well. Will take a look as soon as I have some time.



##### Limitations
- No down migrations! If you need down migrations, there are other more feature complete migrators to use.
Member

I don't think down migrations should exist.

Generally I would recommend leap-frogging migrations, as a migration should never be breaking on its own: that would mean downtime and make automatic migrations mostly pointless anyway.

If you made a mistake in a schema migration, make another to roll it back.

Contributor Author

I agree. The only place where I use down migrations is during development. But that can be worked around if you have a good seed. The refresh feature mentioned in the podcast could be useful though.

Member

I'll open an issue to talk about the commands but we should definitely have a refresh, recycle, or otherwise command that drops, creates, and then runs all migrations. Infinitely useful in development.

Contributor

> I'll open an issue to talk about the commands but we should definitely have a refresh, recycle, or otherwise command that drops, creates, and then runs all migrations. Infinitely useful in development.

I broadly agree, but want to add a caveat here around seed data.

> But that can be worked around if you have a good seed

While this is an important point, good seed data typically targets the latest migration. If you specifically want to test and refine a migration against known heterogeneous data (e.g. adding a column derived from data in an existing column), this becomes quite clumsy. Down migrations, even if they were only an option in dev somehow, are really a nice option for this kind of thing. In my experience working with larger and/or older datasets (1TB+ in my most recent case, larger before that), these kinds of migrations are common and laborious, and having to blow away your entire DB as the only option for testing a migration in the presence of such data would be a total PITA.


##### Limitations
- No down migrations! If you need down migrations, there are other more feature complete migrators to use.
- Only supports Postgres. Could be convinced to add other databases if there is need and easy-to-use database connection libs.
Member

We need to lift the same style of database connection that is done in the macros crate:

https://github.com/launchbadge/sqlx/blob/master/sqlx-macros/src/lib.rs#L63

Contributor Author

Is there a way to be general over a connection in sqlx?

Member

No. It wouldn't be easy to allow until lazy normalization is finished in the Rust compiler, due to our use of higher-kinded trait bounds.

See: rust-lang/rust#60471
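In the meantime, one workaround, sketched here with hypothetical names (none of this is sqlx API or code from this PR), is to dispatch over database kinds with a plain enum instead of a generic connection type:

```rust
// Hypothetical sketch: an enum per supported database, with per-database
// behavior selected by a match instead of a generic bound on the connection.
// The DDL strings below are illustrative only.
enum DatabaseKind {
    Postgres,
    Sqlite,
}

impl DatabaseKind {
    /// DDL for the migration bookkeeping table, per database.
    fn migration_table_ddl(&self) -> &'static str {
        match self {
            DatabaseKind::Postgres => {
                "CREATE TABLE IF NOT EXISTS __migrations \
                 (name TEXT PRIMARY KEY, applied_at TIMESTAMPTZ DEFAULT now())"
            }
            DatabaseKind::Sqlite => {
                "CREATE TABLE IF NOT EXISTS __migrations \
                 (name TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
            }
        }
    }
}

fn main() {
    let db = DatabaseKind::Postgres;
    println!("{}", db.migration_table_ddl());
}
```

This trades static dispatch for a match arm per database, which is easy to extend as new backends are added.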

// }

fn add_migration_file(name: &str) {
use chrono::prelude::*;
Member

I'd prefer to use time over chrono in here

Contributor Author

Me too; unfortunately, formatting is not something std::time does.

Stackoverflow: Format std::time output

If there is a way to do it with std::time I would be happy to change it.


use std::path::Path;
use std::path::PathBuf;

if !Path::new(MIGRATION_FOLDER).exists() {
Member

You probably want to drop the exists check and just use fs::create_dir_all. This would support nested migration paths (once that is configurable) and remove the race condition here.
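The suggested change is roughly this (a sketch, using the `MIGRATION_FOLDER` constant from the PR):

```rust
use std::fs;

const MIGRATION_FOLDER: &str = "migrations";

/// `create_dir_all` is a no-op if the directory already exists, so there is
/// no need for a racy `exists()` check, and nested migration paths work
/// for free once the folder becomes configurable.
fn ensure_migration_folder() -> std::io::Result<()> {
    fs::create_dir_all(MIGRATION_FOLDER)
}

fn main() -> std::io::Result<()> {
    ensure_migration_folder()
}
```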

async fn main() {
let opt = Opt::from_args();

match opt {
Member

commands should return anyhow::Result and we should catch and print a pretty error: ... message

}

let dt = Utc::now();
let mut file_name = dt.format("%Y-%m-%d_%H-%M-%S").to_string();
Member

We should probably have a larger discussion for file formats. I'd like some research into how other tools do it.

Contributor Author

Sure, this is almost how EF Core does it. They mush the numbers together, like %Y%m%d%H%M%S. I think knex does something similar. Most other tools I have seen use some variation, like Unix timestamps.

}

fn load_migrations() -> Vec<Migration> {
let entries = fs::read_dir(&MIGRATION_FOLDER).expect("Could not find 'migrations' dir");
Member

Maybe a glob crate? https://crates.io/crates/glob

Contributor Author

I was thinking about it, but glob is a pretty big dependency that I would rather skip if it is not needed. If we start adding more complicated rules, I would be all for using a glob lib.
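For the simple case of "every .sql file in one folder", plain fs::read_dir filtering is enough without the glob crate. A sketch (error handling simplified; `migration_files` is an illustrative name):

```rust
use std::fs;
use std::path::PathBuf;

const MIGRATION_FOLDER: &str = "migrations";

/// List `.sql` files in the migration folder, sorted by file name.
fn migration_files() -> std::io::Result<Vec<PathBuf>> {
    let mut files: Vec<PathBuf> = fs::read_dir(MIGRATION_FOLDER)?
        .filter_map(|entry| entry.ok().map(|e| e.path()))
        .filter(|p| p.extension().map_or(false, |ext| ext == "sql"))
        .collect();
    // The timestamp prefix makes lexical order equal chronological order.
    files.sort();
    Ok(files)
}

fn main() -> std::io::Result<()> {
    for f in migration_files()? {
        println!("{}", f.display());
    }
    Ok(())
}
```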

Member

That's fair. The dependency tree is already pretty big because of async.

migrations
}

async fn run_migrations() {
Member

Use -> anyhow::Result everywhere with ? or .context("")? where needed

dotenv().ok();
let db_url = env::var("DATABASE_URL").expect("Failed to find 'DATABASE_URL'");

let mut pool = PgPool::new(&db_url)
Member

Use a single connection, not a pool. No need for a migrator.

}
println!("Applying migration: '{}'", mig.name);

sqlx::query(&mig.sql)
Member

You want to use raw or unprepared queries for this as they will allow batch execution.

tx.execute(&*mig.sql).await

}

async fn create_migration_table(mut pool: &PgPool) {
sqlx::query(
Member

Same as above: use a raw query, `conn.execute(string)`, instead of `query(string).execute(conn)`.
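Taken together, the two suggestions look roughly like this. A sketch only: it assumes an API in the shape of the 0.3-era sqlx crate, and exact trait and method names (`Connection::connect`, `Executor::execute`) may differ between versions:

```rust
use sqlx::{Connection, Executor, PgConnection};

// Sketch: a single plain connection instead of a pool, and unprepared
// execution so one migration file can contain several statements.
// Prepared statements reject multiple commands, which is what produced
// the "cannot insert multiple commands into a prepared statement" error.
async fn apply_migration(db_url: &str, sql: &str) -> Result<(), sqlx::Error> {
    let mut conn = PgConnection::connect(db_url).await?;
    let mut tx = conn.begin().await?;
    // Passing a plain &str to `execute` sends it unprepared, so batches
    // like `CREATE TABLE a (...); CREATE TABLE b (...);` are allowed.
    tx.execute(sql).await?;
    tx.commit().await?;
    Ok(())
}
```

Note that CREATE DATABASE still cannot run inside a transaction, so it would need a separate, non-transactional path.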

@mehcode mehcode merged commit 7038721 into launchbadge:master Apr 7, 2020
@mehcode
Member

mehcode commented Apr 7, 2020

Thanks for the awesome start @JesperAxelsson. Let me know how much or how little you want to contribute moving forward. I know you had the repo listed as "passively maintained" before. I added you as a collaborator.

I left a ton of comments here if you or someone else wants to tackle them to iterate on this. I'm going to circle back to this, likely within a week or so, if the notes haven't been looked at yet.

@JesperAxelsson
Contributor Author

Thanks for the awesome comments! Really learning a lot here. I will try to put some time in every now and then. Of course, feel free to make any changes you feel are necessary.

How do you want to handle commits? Should I work via pull requests from a fork, or are you comfortable with me working in the main repo?

@mehcode
Member

mehcode commented Apr 10, 2020

> How do you want to handle commits? Should I work via pull requests from a fork, or are you comfortable with me working in the main repo?

I want this to be Rust's SQL toolkit. Design stuff we should discuss. But feel free to push directly unless you want to formally talk about the changes.
