Feature/cargo sqlx migrate #171

Merged
15 changes: 15 additions & 0 deletions Cargo.lock

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
@@ -7,6 +7,7 @@ members = [
"examples/listen-postgres",
"examples/realworld-postgres",
"examples/todos-postgres",
"cargo-sqlx",
]

[package]
4 changes: 4 additions & 0 deletions cargo-sqlx/.gitignore
@@ -0,0 +1,4 @@
/target
/migrations
Cargo.lock
.env
26 changes: 26 additions & 0 deletions cargo-sqlx/Cargo.toml
@@ -0,0 +1,26 @@
[package]
name = "cargo-sqlx"
version = "0.1.0"
description = "Simple postgres migrator without support for down migration"
authors = ["Jesper Axelsson <[email protected]>"]
edition = "2018"
readme = "README.md"
homepage = "https://github.com/launchbadge/sqlx"
repository = "https://github.com/launchbadge/sqlx"
keywords = ["database", "postgres", "database-management", "migration"]
categories = ["database", "command-line-utilities"]

[[bin]]
name = "sqlx"
path = "src/main.rs"

[dependencies]
dotenv = "0.15"

tokio = { version = "0.2", features = ["macros"] }
# sqlx = { path = "..", default-features = false, features = [ "runtime-tokio", "macros", "postgres" ] }
sqlx = { version = "0.2", default-features = false, features = [ "runtime-tokio", "macros", "postgres" ] }
futures="0.3"

structopt = "0.3"
chrono = "0.4"
14 changes: 14 additions & 0 deletions cargo-sqlx/README.md
@@ -0,0 +1,14 @@
# cargo-sqlx

Sqlx migrator runs all `*.sql` files under the `migrations` folder and remembers which ones have been run.

Database url is supplied through either env variable or `.env` file containing `DATABASE_URL="postgres://postgres:postgres@localhost/realworld"`.

##### Commands
- `add <name>` - add new migration to your migrations folder named `<timestamp>_<name>.sql`
- `run` - Runs all migrations in your migrations folder


##### Limitations
- No down migrations! If you need down migrations, there are other more feature complete migrators to use.
Member

I don't think down migrations should exist.

Generally I would recommend leap-frogging migrations: a migration should never be breaking on its own, since that would mean downtime and make automatic migrations mostly pointless anyway.

If you made a mistake in a schema migration, make another to roll it back.

Contributor Author

I agree. The only place where I use down migrations is during development. But that can be worked around if you have a good seed. The refresh feature mentioned in the podcast could be useful though.

Member

I'll open an issue to talk about the commands, but we should definitely have a refresh, recycle, or similar command that drops, creates, and then runs all migrations. Infinitely useful in development.

Contributor

> I'll open an issue to talk about the commands, but we should definitely have a refresh, recycle, or similar command that drops, creates, and then runs all migrations. Infinitely useful in development.

I broadly agree, but want to add a caveat around seed data.

> But that can be worked around if you have a good seed

While this is an important point, good seed data typically targets the latest migration. If you specifically want to test and refine a migration against known heterogeneous data (e.g. adding a column derived from data in an existing column), this becomes quite clumsy. Down migrations, even if they were somehow only an option in dev, are a really nice option for this kind of thing. In my experience working with larger and/or older datasets (1TB+ in my most recent case, larger before that), these kinds of migrations are common and laborious, and having to blow away your entire DB as the only way to test a migration against such data would be a total PITA.

- Only support postgres. Could be convinced to add other databases if there is need and easy to use database connection libs.
Member

We need to lift the same style of database connection that is done in the macros crate:

https://github.com/launchbadge/sqlx/blob/master/sqlx-macros/src/lib.rs#L63

Contributor Author

Is there a way to be generic over a connection in sqlx?

Member

No. It wouldn't be easy to allow until lazy normalization is finished in the Rust compiler, due to our use of higher-ranked trait bounds.

See: rust-lang/rust#60471

206 changes: 206 additions & 0 deletions cargo-sqlx/src/main.rs
@@ -0,0 +1,206 @@
use std::env;
use std::fs;
use std::fs::File;
use std::io::prelude::*;

use dotenv::dotenv;

use sqlx::PgConnection;
use sqlx::PgPool;

use structopt::StructOpt;

const MIGRATION_FOLDER: &'static str = "migrations";

/// Sqlx commandline tool
#[derive(StructOpt, Debug)]
#[structopt(name = "Sqlx")]
enum Opt {
    // #[structopt(subcommand)]
    Migrate(MigrationCommand),
}

/// Simple postgres migrator
#[derive(StructOpt, Debug)]
#[structopt(name = "Sqlx migrator")]
enum MigrationCommand {
    /// Initializes new migration directory with db create script
    // Init {
    //     // #[structopt(long)]
    //     database_name: String,
    // },

    /// Add new migration with name <timestamp>_<migration_name>.sql
    Add {
        // #[structopt(long)]
        name: String,
    },

    /// Run all migrations
    Run,
}

#[tokio::main]
async fn main() {
    let opt = Opt::from_args();

    match opt {
Member

Commands should return `anyhow::Result`, and we should catch errors and print a pretty `error: ...` message.
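A minimal sketch of the shape this suggestion describes, under stated assumptions: it uses only the standard library, with `Box<dyn Error>` standing in for `anyhow::Error`, and the `Command` enum, `dispatch`, and `report` names are hypothetical rather than part of this PR. Each command returns a `Result`, and a single place turns a failure into an `error: ...` line plus a non-zero exit code.

```rust
use std::error::Error;

// Stand-in for `anyhow::Result`; with the `anyhow` crate this alias would be unnecessary.
type CliResult<T> = Result<T, Box<dyn Error>>;

// Hypothetical, simplified version of the subcommands in this PR.
enum Command {
    Add { name: String },
    Run,
}

// Each command returns a Result instead of panicking via `expect`.
fn add_migration_file(name: &str) -> CliResult<String> {
    if name.is_empty() {
        return Err("migration name must not be empty".into());
    }
    Ok(format!("created migration '{}'", name))
}

fn run_migrations() -> CliResult<String> {
    Ok("all migrations applied".to_string())
}

fn dispatch(command: Command) -> CliResult<String> {
    match command {
        Command::Add { name } => add_migration_file(&name),
        Command::Run => run_migrations(),
    }
}

/// Single place where errors are caught: returns the line to print and the exit code.
fn report(command: Command) -> (String, i32) {
    match dispatch(command) {
        Ok(message) => (message, 0),
        Err(err) => (format!("error: {}", err), 1),
    }
}
```

In a real `main`, the returned line would go to stdout or stderr depending on the code, followed by `std::process::exit(code)`.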

        Opt::Migrate(command) => match command {
            // Opt::Init { database_name } => init_migrations(&database_name),
            MigrationCommand::Add { name } => add_migration_file(&name),
            MigrationCommand::Run => run_migrations().await,
        },
    }

    println!("All done!");
}

// fn init_migrations(db_name: &str) {
//     println!("Initing the migrations so hard! db: {:#?}", db_name);
// }

fn add_migration_file(name: &str) {
    use chrono::prelude::*;
Member

I'd prefer to use `time` over `chrono` here.

Contributor Author

Me too. Unfortunately, formatting is not something `std::time` does.

Stack Overflow: Format std::time output

If there is a way to do it with `std::time`, I would be happy to change it.

    use std::path::Path;
    use std::path::PathBuf;

    if !Path::new(MIGRATION_FOLDER).exists() {
Member

You probably want to drop the exists check and just use fs::create_dir_all. This would support nested migration paths (once that is configurable) and remove the race condition here.
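As a rough illustration of this suggestion (the helper name `ensure_migration_dir` is hypothetical): `fs::create_dir_all` creates any missing parent directories and succeeds if the directory already exists, so no separate `exists()` check is needed and there is no window between the check and the creation.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Ensures the migrations directory exists, creating missing parents as needed.
/// Calling it again on an existing directory is not an error.
fn ensure_migration_dir(dir: &Path) -> io::Result<()> {
    fs::create_dir_all(dir)
}
```

Calling `ensure_migration_dir(Path::new("db/migrations"))` would work whether or not `db` exists yet, which is what a configurable, nested migration path would require.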

        fs::create_dir(MIGRATION_FOLDER).expect("Failed to create 'migrations' dir")
    }

    let dt = Utc::now();
    let mut file_name = dt.format("%Y-%m-%d_%H-%M-%S").to_string();
Member

We should probably have a larger discussion for file formats. I'd like some research into how other tools do it.

Contributor Author

Sure, this is almost how EF Core does it. They mush the numbers together like: %Y%m%d%H%M%S. I think knex does something similar. Most other tools I have seen use some variation, like unix timestamps.
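For comparison, here is a sketch of the unix-timestamp variation mentioned above, using only `std::time`. The function names are hypothetical, and zero-padding to ten digits is an assumption made here so that sorting the file names as strings matches chronological order.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Builds `<unix_seconds>_<name>.sql`, zero-padded to 10 digits so that
/// lexicographic order of file names matches chronological order.
fn migration_file_name(unix_secs: u64, name: &str) -> String {
    format!("{:010}_{}.sql", unix_secs, name)
}

/// Seconds since the unix epoch, or 0 if the system clock is set before it.
fn now_unix_secs() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|elapsed| elapsed.as_secs())
        .unwrap_or(0)
}
```

`migration_file_name(now_unix_secs(), "create_users")` needs no date-formatting dependency, at the cost of a prefix that is harder to read than `%Y%m%d%H%M%S`.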

    file_name.push_str("_");
    file_name.push_str(name);
    file_name.push_str(".sql");

    let mut path = PathBuf::new();
    path.push(MIGRATION_FOLDER);
    path.push(&file_name);

    if path.exists() {
Member

Don't check whether the path exists first. It'll die nicely on creation.

Contributor Author

My intent was to differentiate between the file already existing and other issues like unauthorized access. Might be overkill? I'm just a bit tired of using programs that only tell you they failed, with no hint about what actually failed :)

Contributor Author

On the other hand it shouldn't really be possible to accidentally create two with the same name...
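One way to reconcile both positions in this thread, sketched with the standard library only (the function name and the error strings are hypothetical): open the file with `OpenOptions::create_new(true)`, which fails if the file already exists, and inspect the `io::ErrorKind` to report the specific cause. Note that `File::create` alone would truncate an existing file rather than fail.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Creates a new migration file without a prior `exists()` check.
/// `create_new(true)` makes the open fail if the file is already there,
/// so the distinct failure causes can be reported from the error kind.
fn create_migration_file(path: &Path) -> Result<(), String> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(path)
        .map_err(|err| match err.kind() {
            io::ErrorKind::AlreadyExists => {
                format!("migration already exists: {}", path.display())
            }
            io::ErrorKind::PermissionDenied => {
                format!("permission denied: {}", path.display())
            }
            _ => format!("failed to create {}: {}", path.display(), err),
        })?;

    file.write_all(b"-- Add migration script here\n")
        .map_err(|err| format!("could not write to {}: {}", path.display(), err))
}
```

Because the existence check and the creation happen in one operation, there is no gap in which another process could create the same file.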

        eprintln!("Migration already exists!");
        return;
    }

    let mut file = File::create(path).expect("Failed to create file");
    file.write_all(b"-- Add migration script here")
        .expect("Could not write to file");

    println!("Created migration: '{}'", file_name);
}

pub struct Migration {
    pub name: String,
    pub sql: String,
}

fn load_migrations() -> Vec<Migration> {
    let entries = fs::read_dir(&MIGRATION_FOLDER).expect("Could not find 'migrations' dir");
Member

Maybe a glob crate? https://crates.io/crates/glob

Contributor Author

I was thinking about it, but glob is a pretty big dependency that I would rather skip if it is not needed. If we start adding more complicated rules, I would be all for using a glob lib.

Member

That's fair. The dependency tree is already pretty big because of async.
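To illustrate the dependency-free route discussed here, a compact sketch of selecting `*.sql` files with `fs::read_dir` alone. The `sql_files` helper is hypothetical and returns paths only; it is not the PR's `load_migrations`, which also reads file contents.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the regular files in `dir` whose extension is `sql`, sorted by path.
/// Directories and files with other extensions are skipped.
fn sql_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let is_sql = path.extension().map_or(false, |ext| ext == "sql");
        if path.is_file() && is_sql {
            paths.push(path);
        }
    }
    // read_dir gives no ordering guarantee, so sort explicitly.
    paths.sort();
    Ok(paths)
}
```

A single fixed rule like this stays short; a glob library becomes more attractive once patterns are user-configurable or need to match nested directories.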


    let mut migrations = Vec::new();

    for e in entries {
        if let Ok(e) = e {
            if let Ok(meta) = e.metadata() {
                if !meta.is_file() {
                    continue;
                }

                if let Some(ext) = e.path().extension() {
                    if ext != "sql" {
                        println!("Wrong ext: {:?}", ext);
                        continue;
                    }
                } else {
                    continue;
                }

                let mut file =
                    File::open(e.path()).expect(&format!("Failed to open: '{:?}'", e.file_name()));
                let mut contents = String::new();
                file.read_to_string(&mut contents)
                    .expect(&format!("Failed to read: '{:?}'", e.file_name()));

                migrations.push(Migration {
                    name: e.file_name().to_str().unwrap().to_string(),
                    sql: contents,
                });
            }
        }
    }

    migrations.sort_by(|a, b| a.name.partial_cmp(&b.name).unwrap());

    migrations
}

async fn run_migrations() {
Member

Use `-> anyhow::Result` everywhere, with `?` or `.context("")?` where needed.
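A small standard-library approximation of the `?` plus context pattern, assuming `Box<dyn Error>` in place of `anyhow::Error` and `map_err` in place of anyhow's `.context(...)`. The `read_migration` helper is hypothetical.

```rust
use std::error::Error;
use std::fs;

// Stand-in for `anyhow::Result`.
type CliResult<T> = Result<T, Box<dyn Error>>;

/// Reads a migration file, attaching the path to any I/O error
/// instead of panicking with `expect`.
fn read_migration(path: &str) -> CliResult<String> {
    let sql = fs::read_to_string(path)
        .map_err(|err| format!("failed to read '{}': {}", path, err))?;
    Ok(sql)
}
```

With `anyhow`, the `map_err` call would typically become `.with_context(|| format!("failed to read '{}'", path))?`, which keeps the underlying I/O error as the source of the new one.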

    dotenv().ok();
    let db_url = env::var("DATABASE_URL").expect("Failed to find 'DATABASE_URL'");

    let mut pool = PgPool::new(&db_url)
Member

Use a single connection, not a pool. A migrator has no need for a pool.

        .await
        .expect("Failed to connect to pool");

    create_migration_table(&mut pool).await;

    let migrations = load_migrations();

    for mig in migrations.iter() {
        let mut tx = pool.begin().await.unwrap();

        if check_if_applied(&mut tx, &mig.name).await {
            println!("Already applied migration: '{}'", mig.name);
            continue;
        }
        println!("Applying migration: '{}'", mig.name);

        sqlx::query(&mig.sql)
Member

You want to use raw or unprepared queries for this as they will allow batch execution.

tx.execute(&*mig.sql).await

            .execute(&mut tx)
            .await
            .expect(&format!("Failed to run migration {:?}", &mig.name));

        save_applied_migration(&mut tx, &mig.name).await;

        tx.commit().await.unwrap();
    }
}

async fn create_migration_table(mut pool: &PgPool) {
    sqlx::query(
Member

Same as above: use a raw query, `conn.execute(string)`, instead of `query(string).execute(conn)`.

        r#"
    CREATE TABLE IF NOT EXISTS __migrations (
        migration VARCHAR (255) PRIMARY KEY,
        created TIMESTAMP NOT NULL DEFAULT current_timestamp
    );
    "#,
    )
    .execute(&mut pool)
    .await
    .expect("Failed to create migration table");
}

async fn check_if_applied(pool: &mut PgConnection, migration: &str) -> bool {
    use sqlx::row::Row;

    let row = sqlx::query(
        "select exists(select migration from __migrations where migration = $1) as exists",
    )
    .bind(migration.to_string())
    .fetch_one(pool)
    .await
    .expect("Failed to check migration table");

    let exists: bool = row.get("exists");

    exists
}

async fn save_applied_migration(pool: &mut PgConnection, migration: &str) {
    sqlx::query("insert into __migrations (migration) values ($1)")
        .bind(migration.to_string())
        .execute(pool)
        .await
        .expect("Failed to insert migration ");
}