Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for multiple databases in the same model #4898

Closed
laskoviymishka opened this issue Mar 24, 2016 · 20 comments
Closed

Support for multiple databases in the same model #4898

laskoviymishka opened this issue Mar 24, 2016 · 20 comments

Comments

@laskoviymishka
Copy link
Contributor

Issue based on #4019 and PR #4672
It nice to have a feature to access multiple database from same model context.
This feature pretty massive. We need to modify following areas:

  1. Add database property to relational type annotations and follow this property from explicit mapping to default query sql generator (to create proper table alias including database name)
  2. Enhance data model with new field for catalogue name
  3. Implement sql generator and model validator for code first migrations from model with multiple databases (it means we should modify all generators and migration operators)
  4. Implement additional annotations to support attribute database generation, not only explicit

There are still some open design questions that needs to be resolved:

  1. There is no database/catalog propery in TableAttribute

So we could not implement data annotation mapping to different database. There is a possible solution by introducing additional attribute for this need (like DatabaseAttribute) with single proprte Name. In this case usage could be fairly simple

[Database("Pink")]
[Table("Unicorns", Schema="Fluffy")]
class DifferentDatabaseItem{ }

Also there is should be modelBuilder interface to setup database catalogue manually. Something like this:

 modelBuilder.Entity<MappingQueryTestBase.DifferentDatabaseItem>(e =>
            {
                e.Metadata.Relational().TableName = "Unicorns";
                e.Metadata.Relational().Schema = "Fluffy";
                e.Metadata.Relational().Database = "Pink";
            });

For such models we should generate additional catalogue prefix for FROM and JOIN statements.

  1. There is issue with navigation property building. We could easily generate proper join statement cross catalogues, but there is no proper way to add foreign key for such properties.
    There is couple possible solutions:
  • Forbid adding navigation properties between databases. This aproach is not really healthy. If we implement this - there is no particullar reason to keep multiple catalogues in single model context.
  • Allow adding navigation properties, but enhance migration to prevent generating foreighn key operations for this properties. In this case we could work with our entities without any penalties, but from data storage perspective we will not have any "real" foreighn keys between them.
  1. There is open question regards to should this be a relational concept, or specific to SQL Server. As far as i know this behaivior supported in MySql and Oracle (if placed on same server). Not sure for postgre SQL and sql lite.
@laskoviymishka laskoviymishka changed the title ModelBuilder: Allow specifying multiple data datalogues in context ModelBuilder: Allow specifying multiple data сatalogues in context Mar 24, 2016
@rowanmiller rowanmiller added this to the 1.0.0 milestone Mar 25, 2016
@rowanmiller
Copy link
Contributor

@anpete will help with guidance on this one

@divega
Copy link
Contributor

divega commented Jun 7, 2016

From @GSPP' comment in #5626:

In the age of cloud it becomes more common to shard across databases based on tenant, geography and scaling needs. EF should support querying across databases better.

It's not enough to change the connection string because one query might need to pull from multiple databases. Example: Join customer-specific data with master data.

I think it's really required that EF allows us to specify the database name per table reference. A strawman:

from o in db.Orders.WithDatabase("NewYork-Tenant1234")
join c in db.Customers.WithDatabase("NewYork-master") on ...
select new { o, c }

This is supposed to be a scenario where the database name depends on a geographical sharding as well as on the tenant name. Both are dynamic and can change for every query. Connection strings cannot do this. Statically configured database names cannot do that either.

This query would translate to:

 select *
 from [NewYork-Tenant1234].dbo.Orders o
 join [NewYork-master].dbo.Customers c on ...

The same configuration mechanism would lend itself well to configuring hints such as NOLOCK or the common UPDLOCK, ROWLOCK, HOLDLOCK combination. This is unrelated to this issue but I'm mentioning this to justify with the WithDatabase syntax makes sense.

A simple extension to this would be to allow completely dynamic table names. WithTableName(database, schema, name) would map to database.schema.name on the SQL side. This is useful for example when sharding by date with one table per day. SQL Server Standard Edition does not have partitioning.

What is the team's stance on supporting such scenarios?

@divega divega changed the title ModelBuilder: Allow specifying multiple data сatalogues in context Support for multiple databases in the same model Jun 7, 2016
@laskoviymishka
Copy link
Contributor Author

This is very usefull. Small question about linq API. Will it looks like this?

var result = db.Order
               .WithDatabase("NewYork-Tenant1234")
               .Join(t => t.Customers.WithDatabase("NewYork-Tenant1234"))
               .Select(o => new { o.OrderLiner, o.Customer }

If so this information probably should be added as extension to DbSet. All these logic related only for relation concept. So we need to reorgonize current DbSet implementation to DbSet and RelationalDbSet (which extend base DbSet with relational concepts).

For the configuration hint (nolock etc) it probably should be separated in different issue.
And since this really hard coupling with MS SQL should be implemented as part of Microsoft.EntityFrameworkCore.SqlServer, so it require new implementation of RelationalDbSet like SqlServerDbSet which could contain any tweek you like (and maybe some api for extension).

Any thoughts on this?

@GSPP
Copy link

GSPP commented Jun 7, 2016

A few more points:

  1. Since my issue has been merged into this (I agree with that) I want to stress this point: The table location (db, schema, name) must be changeable per table reference. Not per query or per application run. Not even per query is enough because I might want to union over all existing tenants (e.g. db.Orders.WithDatabase("Tenant1").Concat(db.Orders.WithDatabase("Tenant2")).Concat(db.Orders.WithDatabase("Tenant3"))...; this expression would, of course, be built through a loop and not hard-coded like that).
  2. Navigation properties should be supported across databases. I am using this pattern right now with a large L2S application. The use case is that one database is small and on different storage. The other one is for "bulk data" (all kinds of logs) and is on different storage. It is super convenient to hide this split from the application. Of course there can't be any FKs spanning the databases.
  3. Navigation properties should be configurable as well. If not we'd be forced to use explicit joins a lot. Strawman syntax: from o in db.Orders.WithDatabase("Tenant") select WithDatabase(o.Customer, "Master"). Semantically, a WithDatabase wrapping call would be interpreted as returning the first argument unmodified but with location meta-data attached. This function call would only be valid inside of an expression tree. The same wrapping could be done for sequence navigation properties. If extension syntax is wanted this requires an extension method on System.Object alas. This smells, I admit. So it could be a static method QueryHelpers.WithDatabase.
  4. There's still a need to configure location statically. This is useful in the scenario described under (2). Sometimes, there's a fixed number of databases and each table resides in exactly one database. This should be conveniently supported.
  5. I think this is a relational concept that not every provider can support. It can be a runtime error to use this with a provider not liking this.

Since there are so many different use cases there should be a hierarchy of location configuation:

  1. An explicit WithDatabase inside of the query takes precedence over everything.
  2. There should be a per-query default that places every table in a specific database and schema (e.g. myQuery.WithDefaultTableLocation(db, schema).ToList()).
  3. There should be a per-context default.
  4. Per-table static configuration.
  5. Connection string.

For each of these levels I can present you a real-world use case that I have had... If that hierarchy is properly document I don't think it's hard to understand or hard to work with.

Is there not an issue tracking table hints such as NOLOCK? There should be. It's important. Surely, the team is tracking it internally.

@laskoviymishka
Copy link
Contributor Author

@GSPP the use case is pretty clear here.
But question about API still open. Static QueryHelper can solve issue for sure.
Migrations also pretty intresting question. We should determine where table locate and could we define a REAL foreighn key or just virtual one.

@tillig
Copy link

tillig commented Jun 17, 2016

I'm not sure if it falls into this category (if not, I'll open a new issue - let me know), but the ability for a single database to have a different table per tenant for the same entity would be good, too. Using the example above, it'd be like

select *
from [Orders-Tenant1234] o
join [Products] p on ...

Master/shared table for some things, tenant-specific table for others.

@laskoviymishka
Copy link
Contributor Author

You mean something like have different order table per tenant?

@tillig
Copy link

tillig commented Jun 17, 2016

@laskoviymishka Exactly. Rather than partition by database as noted in the earlier example, partition by table in a single database. (I'd like EF to be able to support that. I have a situation like this right now.)

@laskoviymishka
Copy link
Contributor Author

laskoviymishka commented Jun 17, 2016

As far as I know there is limitation for tables count in azure. It may be not so scalable as it looks :)

@tillig
Copy link

tillig commented Jun 17, 2016

Maybe I'm not hosting in Azure. Or using SQL Server. Point being, the ability to have dynamic table names for an entity seems just as interesting to me as having different databases. If this isn't the right place to note that, like I said, I'm happy to start a different issue. It just seemed like similar concerns - the ability to dynamically change where an entity comes from.

@laskoviymishka
Copy link
Contributor Author

laskoviymishka commented Jun 17, 2016

It may have sense

@GSPP
Copy link

GSPP commented Jun 18, 2016

@tillig this is absolutely part of this issue (at least the part that's by me and was merged into this one).

@rowanmiller
Copy link
Contributor

Duplicate of #4019

@GSPP
Copy link

GSPP commented Jul 6, 2016

@rowanmiller this ticket goes far beyond the other ticket. The other ticket is about statically specifying a database name. This ticket includes dynamic options. If you want to deduplicate I suggest closing the other way around.

@rowanmiller rowanmiller removed this from the 1.1.0 milestone Jul 7, 2016
@rowanmiller rowanmiller reopened this Jul 7, 2016
@rowanmiller
Copy link
Contributor

@GSPP fair point. The initial description does line up with #4019 but I agree the discussion in the comments talks about a lot more. Remarking for triage so that we can discuss this scenario.

@GSPP
Copy link

GSPP commented Jul 7, 2016

Good call! I think this is important.

@rowanmiller
Copy link
Contributor

Given that this issue ended up being about a couple of features, we decided to track them separately:

Neither of these are things we are planning to tackle right away, so they are both on the backlog for the moment.

@narendrasinghrathore
Copy link

@rowanmiller

Given that this issue ended up being about a couple of features, we decided to track them separately:

Support specifying catalog for table mapping #4019
Dynamically Connect to Multiple Databases #6089

Any update. How can one use one entity model to connect multiple database provider like mssql, mysql,oracle ,mongodb etc . Found this but not helpful https://www.codeproject.com/Articles/82017/Preparing-an-Entity-Framework-model-for-multi-prov https://www.codeproject.com/Tips/1062375/Entity-Framework-Multi-DB-Support
Please can you help on this.

@ajcvickers
Copy link
Contributor

@narendrasinghrathore This issue is about accessing multiple databases of the same kind--for example, multiple SQL Server databases. It's not about different types of databases. You seem to be wanting to access different types of database using the same model. There are several dimensions to this question, so it's hard to give an answer here. I would suggest either opening a new issue explaining clearly what it is you need to do, or perhaps doing the same thing but in a Stack Overflow question--you'll get more knowledgeable eyes on it on Stack Overflow.

Finally, keep in mind this GitHub repo is for EF Core. The articles you referenced are for earlier versions of EF, for which the best place to log issues is https://github.com/aspnet/EntityFramework6

@narendrasinghrathore
Copy link

@ajcvickers Thanks, https://github.com/aspnet/EntityFramework6/issues/269

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants