From b0d934765fae25a072e77b3e3cd312b95af557f8 Mon Sep 17 00:00:00 2001
From: Asif Saifuddin Auvi
Date: Mon, 6 Mar 2017 01:36:30 +0600
Subject: [PATCH 01/80] Initial draft proposal WIP

---
 draft/orm-improvements-for-composite-pk.rst | 437 ++++++++++++++++++++
 1 file changed, 437 insertions(+)
 create mode 100644 draft/orm-improvements-for-composite-pk.rst

diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst
new file mode 100644
index 00000000..9e7abb65
--- /dev/null
+++ b/draft/orm-improvements-for-composite-pk.rst
@@ -0,0 +1,437 @@
+=========================================================
+DEP : ORM Fields and related improvement for composite PK
+=========================================================
+
+:DEP: 0201
+:Author: Asif Saif Uddin
+:Implementation Team: Asif Saif Uddin, django core team
+:Shepherd: Django Core Team
+:Status: Draft
+:Type: Feature
+:Created: 2017-3-2
+:Last-Modified: 2017-00-00
+
+.. contents:: Table of Contents
+   :depth: 3
+   :local:
+
+
+Abstract
+========
+
+This DEP aims to improve different part of django ORM and other associated parts of django to support composite primary key in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works.
+
+Key concerns of New Approach to implement ``CompositeField``
+==============================================================
+
+1. Change ForeignObjectRel subclasses to real field instances. (For example,
+   ForeignKey generates a ManyToOneRel in the related model). The Rel instances are already returned from get_field(), but they aren't yet field subclasses.
+2. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be advantageous to be able to define reverse relations directly.
For example, see https://github.com/akaariai/django-reverse-unique.
+
+3. Partition ForeignKey to virtual relation field, and concrete data field. The former is the model.author, the latter model.author_id's backing implementation.
+Consider other cases where true virtual fields are needed.
+
+4. Introduce new standalone ``VirtualField``
+5. Incorporate ``VirtualField`` related changes in django
+6. Split out existing Fields API into ``ConcreteField`` and BaseField
+   to utilize ``VirtualField``.
+7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField``
+8. Refactor all RelationFields based on ``VirtualField`` based ForeignKey
+9. Refactor GenericForeignKey based on ``VirtualField`` based ForeignKey
+
+
+
+Summary of ``CompositeField``
+=============================
+
+This section summarizes the basic API as established in the proposal for
+GSoC 2011 [1]_.
+
+A ``CompositeField`` requires a list of enclosed regular model fields as
+positional arguments, as shown in this example::
+
+    class SomeModel(models.Model):
+        first_field = models.IntegerField()
+        second_field = models.CharField(max_length=100)
+        composite = models.CompositeField(first_field, second_field)
+
+The model class then contains a descriptor for the composite field, which
+returns a ``CompositeValue`` which is a customized namedtuple, the
+descriptor accepts any iterable of the appropriate length. An example
+interactive session::
+
+    >>> instance = new SomeModel(first_field=47, second_field="some string")
+    >>> instance.composite
+    CompositeObject(first_field=47, second_field='some string')
+    >>> instance.composite.first_field
+    47
+    >>> instance.composite[1]
+    'some string'
+    >>> instance.composite = (74, "other string")
+    >>> instance.first_field, instance.second_field
+    (74, 'other string')
+
+``CompositeField`` supports the following standard field options:
+``unique``, ``db_index``, ``primary_key``.
The first two will simply add a
+corresponding tuple to ``model._meta.unique_together`` or
+``model._meta.index_together``. Other field options don't make much sense
+in the context of composite fields.
+
+Supported ``QuerySet`` filters will be ``exact`` and ``in``. The former
+should be clear enough, the latter is elaborated in a separate section.
+
+It will be possible to use a ``CompositeField`` as a target field of
+``ForeignKey``, ``OneToOneField`` and ``ManyToManyField``. This is
+described in more detail in the following section.
+
+Changes in ``ForeignKey``
+=========================
+
+Currently ``ForeignKey`` is a regular concrete field which manages both
+the raw value stored in the database and the higher-level relationship
+semantics. Managing the raw value is simple enough for simple
+(single-column) targets. However, in the case of a composite target field,
+this task becomes more complex. The biggest problem is that many parts of
+the ORM work under the assumption that for each database column there is a
+model field it can assign the value from the column to. While it might be
+possible to lift this restriction, it would be a really complex project by
+itself.
+
+On the other hand, there is the abstraction of virtual fields working on
+top of other fields which is required for this project anyway. The way
+forward would be to use this abstraction for relationship fields.
+Currently, ``ForeignKey`` (and by extension ``OneToOneField``) is the only
+field whose ``name`` and ``attname`` differ, where ``name`` stores the
+value dictated by the semantics of the field and ``attname`` stores the
+raw value from the database.
+
+We can use this to our advantage and put an auxiliary field into the
+``attname`` of each ``ForeignKey``, which would be of the same database
+type as the target field, and turn ``ForeignKey`` into a virtual field on
+top of the auxiliary field.
This solution has the advantage that it
+offloads the need to manage the raw database value off ``ForeignKey`` and
+uses a field specifically intended for the task.
+
+In order to keep this backwards compatible and avoid the need to
+explicitly create two fields for each ``ForeignKey``, the auxiliary field
+needs to be created automatically during the phase where a model class is
+created by its metaclass. Initially I implemented this as a method on
+``ForeignKey`` which takes the target field and creates its copy, touches
+it up and adds it to the model class. However, this requires performing
+special tasks with certain types of fields, such as ``AutoField`` which
+needs to be turned into an ``IntegerField`` or ``CompositeField`` which
+requires copying its enclosed fields as well.
+
+A better approach is to add a method such as ``create_auxiliary_copy`` on
+``Field`` which would create all new field instances and add them to the
+appropriate model class.
+
+One possible problem with these changes is that they change the contents
+of ``_meta.fields`` in each model out there that contains a relationship
+field. For example, if a model contains the following fields::
+
+    ['id',
+     'name',
+     'address',
+     'place_ptr',
+     'rating',
+     'serves_hot_dogs',
+     'serves_pizza',
+     'chef']
+
+where ``place_ptr`` is a ``OneToOneField`` and ``chef`` is a
+``ForeignKey``, after the change it will contain the following list::
+
+    ['id',
+     'name',
+     'address',
+     'place_ptr',
+     'place_ptr_id',
+     'rating',
+     'serves_hot_dogs',
+     'serves_pizza',
+     'chef',
+     'chef_id']
+
+This causes a lot of failures in the Django test suite, because there are
+a lot of tests relying on the contents of ``_meta.fields`` or other
+related attributes/properties. (Actually, this example is taken from one
+of these tests,
+``model_inheritance.tests.ModelInheritanceTests.test_multiple_table``.)
+Fixing these is fairly simple, all they need is to add the appropriate
+``__id`` fields.
However, this raises a concern of how ``_meta`` is
+regarded. It has always been a private API officially, but everyone uses
+it in their projects anyway. I still think the change is worth it, but it
+might be a good idea to include a note about the change in the release
+notes.
+
+Porting previous work on top of master
+======================================
+
+The first major task of this project is to take the code I wrote as part
+of GSoC 2011 and sync it with the current state of master. The order in
+which I implemented things two years ago was to implement
+``CompositeField`` first and then I did a refactor of ``ForeignKey`` which
+is required to make it support ``CompositeField``. This turned out to be
+inefficient with respect to the development process, because some parts of
+the refactor broke the introduced ``CompositeField`` functionality,
+meaning I had to effectively reimplement parts of it again. Also, some
+abstractions introduced by the refactor made it possible to rewrite
+certain parts in a cleaner way than what was necessary for
+``CompositeField`` alone (e.g. database creation or certain features of
+``model._meta``).
+
+In light of these findings I am convinced that a better approach would be
+to first do the required refactor of ``ForeignKey`` and implement
+CompositeField as the next step. This will result in a better maintainable
+development branch and a cleaner revision history, making it easier to
+review the work before its eventual inclusion into Django.
+
+``__in`` lookups for ``CompositeField``
+=======================================
+
+The existing implementation of ``CompositeField`` handles ``__in`` lookups
+in the generic, backend-independent ``WhereNode`` class and uses a
+disjunctive normal form expression as in the following example::
+
+    SELECT a, b, c FROM tbl1, tbl2
+        WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 5 AND c = 6);
+
+The problem with this solution is that in cases where the list of values
+contains tens or hundreds of tuples, this DNF expression will be extremely
+long and the database will have to evaluate it for each and every row,
+without a possibility of optimizing the query.
+
+Certain database backends support the following alternative::
+
+    SELECT a, b, c FROM tbl1, tbl2
+        WHERE (a, b, c) IN [(1, 2, 3), (4, 5, 6)];
+
+This would probably be the best option, but it can't be used by SQLite,
+for instance. This is also the reason why the DNF expression was
+implemented in the first place.
+
+In order to support this more natural syntax, the ``DatabaseOperations``
+needs to be extended with a method such as ``composite_in_sql``.
+
+However, this leaves the issue of the inefficient DNF unresolved for
+backends without support for tuple literals. For such backends, the
+following expression is proposed::
+
+    SELECT a, b, c FROM tbl1, tbl2
+        WHERE EXISTS (SELECT a1, b1, c1, FROM (SELECT 1 as a, 2 as b, 3 as c
+                                               UNION SELECT 4, 5, 6)
+                      WHERE a1=1 AND b1=b AND c1=c);
+
+Since both syntaxes are rather generic and at least one of them should fit
+any database backend directly, a new flag will be introduced,
+``DatabaseFeatures.supports_tuple_literals`` which the default
+implementation of ``composite_in_sql`` will consult in order to choose
+between the two options.
+
+``contenttypes`` and ``GenericForeignKey``
+==========================================
+
+
+It's fairly easy to represent composite values as strings.
Given an
+``escape`` function which uniquely escapes commas, something like the
+following works quite well::
+
+    ",".join(escape(value) for value in composite_value)
+
+However, in order to support JOINs generated by ``GenericRelation``, we
+need to be able to reproduce exactly the same encoding using an SQL
+expression which would be used in the JOIN condition.
+
+Luckily, while thus encoded strings need to be possible to decode in
+Python (for example, when retrieving the related object using
+``GenericForeignKey`` or when the admin decodes the primary key from URL),
+this isn't necessary at the database level. Using SQL we only ever need to
+perform this in one direction, that is from a tuple of values into a
+string.
+
+That means we can use a generalized version of the function
+``django.contrib.admin.utils.quote`` which replaces each unsafe
+character with its ASCII value in hexadecimal base, preceded by an escape
+character. In this case, only two characters are unsafe -- comma (which is
+used to separate the values) and an escape character (which I arbitrarily
+chose as '~').
+
+To reproduce this encoding, all values need to be cast to strings and then
+for each such string two calls to the ``replace`` functions are made::
+
+    replace(replace(CAST (`column` AS text), '~', '~7E'), ',', '~2C')
+
+According to available documentation, all four supported database backends
+provide the ``replace`` function. [2]_ [3]_ [4]_ [5]_
+
+Even though the ``replace`` function seems to be available in all major
+database servers (even ones not officially supported by Django, including
+MSSQL, DB2, Informix and others), this is still probably best left to the
+database backend and will be implemented as
+``DatabaseOperations.composite_value_to_text_sql``.
+
+One possible pitfall of this implementation might be that it may not work
+with any column type that isn't an integer or a text string due to a
+simple fact – the string the database would cast it to will probably
+differ from the one Python will use. However, I'm not sure there's
+anything we can do about this, especially since the string representation
+chosen by the database may be specific for each database server. Therefore
+I'm inclined to declare ``GenericRelation`` unsupported for models with a
+composite primary key containing any special columns. This should be
+extremely rare anyway.
+
+Database introspection, ``inspectdb``
+=====================================
+
+There are three main goals concerning database introspection in this
+project. The first is to ensure the output of ``inspectdb`` remains the
+same as it is now for models with simple primary keys and simple foreign
+key references, or at least equivalent. While this shouldn't be too
+difficult to achieve, it will still be regarded with high importance.
+
+The second goal is to extend ``inspectdb`` to also create a
+``CompositeField`` in models where the table contains a composite primary
+key. This part shouldn't be too difficult,
+``DatabaseIntrospection.get_primary_key_column`` will be renamed to
+``get_primary_key`` which will return a tuple of columns and in case the
+tuple contains more than one element, an appropriate ``CompositeField``
+will be added. This will also require updating
+``DatabaseWrapper.check_constraints`` for certain backends since it uses
+``get_primary_key_column``.
+
+The third goal is to also make ``inspectdb`` aware of composite foreign
+keys. This will need a rewrite of ``get_relations`` which will have to
+return a mapping between tuples of columns instead of single columns. It
+should also ensure each tuple of columns pointed to by a foreign key gets
+a ``CompositeField``.
This part will also probably require some changes in
+other backend methods as well, especially since each backend has a unique
+tangle of introspection methods.
+
+This part requires a tremendous amount of work, because practically every
+single change needs to be done four times and needs separate research of
+the specific backend in question. Therefore I can't promise to deliver full support
+for all features mentioned in this section for all backends. I'd say
+backwards compatibility is a requirement, recognition of composite primary
+keys is a highly wanted feature that I'll try to implement for as many
+backends as possible and recognition of composite foreign keys would be a
+nice extra to have for at least one or two backends.
+
+I'll be implementing the features for the individual backends in the
+following order: PostgreSQL, MySQL, SQLite and Oracle. I put PostgreSQL
+first because, well, this is the backend with the best support in Django
+(and also because it is the one where I'd actually use the features I'm
+proposing). Oracle comes last because I don't have any way to test it and
+I'm afraid I'd be stabbing in the dark anyway. Of the two remaining
+backends I put MySQL first for two reasons. First, I don't think people
+need to run ``inspectdb`` on SQLite databases too often (if ever). Second,
+on MySQL the task seems marginally easier as the database has
+introspection features other than just “give me the SQL statement used to
+create this table”, whose parsing is most likely going to be a complete
+mess.
+
+All in all, extending ``inspectdb`` features is a tedious and difficult
+task with shady outcome, which I'm well aware of. Still, I would like to
+try to at least implement the easier parts for the most used backends. It
+might quite possibly turn out that I won't manage to implement more than
+composite primary key detection for PostgreSQL.
This is the reason I keep
+this as one of the last features I intend to work on, as shown in the
+timeline. It isn't a necessity, we can always just add a note to the docs
+that ``inspectdb`` just can't detect certain scenarios and ask people to
+edit their models manually.
+
+Updatable primary keys in models
+================================
+
+The algorithm that determines what kind of database query to issue on
+``model.save()`` is a fairly simple and well-documented one [6]_. If a row
+exists in the database with the value of its primary key equal to the
+saved object, it is updated, otherwise a new row is inserted. This
+behavior is intuitive and works well for models where the primary key is
+automatically created by the framework (be it an ``AutoField`` or a parent
+link in the case of model inheritance).
+
+However, as soon as the primary key is explicitly created, the behavior
+becomes less intuitive and might be confusing, for example, to users of the
+admin. For instance, say we have the following model::
+
+    class Person(models.Model):
+        first_name = models.CharField(max_length=47)
+        last_name = models.CharField(max_length=47)
+        shoe_size = models.PositiveSmallIntegerField()
+
+        full_name = models.CompositeField(first_name, last_name,
+                                          primary_key=True)
+
+Then we register the model in the admin using the standard one-liner::
+
+    admin.site.register(Person)
+
+Since we haven't excluded any fields, all three fields will be editable in
+the admin. Now, suppose there's an instance whose ``full_name`` is
+``CompositeValue(first_name='Darth', last_name='Vadur')``. A user decides
+to fix the last name using the admin, hits the “Save” button and instead
+of fixing an existing record, a new one will appear with the new value,
+while the old one remains untouched. This behavior is clearly broken from
+the point of view of the user.
+
+It can be argued that it is the developer's fault that the database schema
+is poorly chosen and that they expose the primary key to their users.
+While this may be true in some cases, it is still to some extent a
+subjective matter.
+
+Therefore I propose a new behavior for ``model.save()`` where it would
+detect a change in the instance's primary key and in that case issue an
+``UPDATE`` for the right row, i.e. ``WHERE primary_key = previous_value``.
+
+Of course, just going ahead and changing the behavior in this way for all
+models would be backwards incompatible. To do this properly, we would need
+to make this an opt-in feature. This can be achieved in multiple ways.
+
+1) add a keyword argument such as ``update_pk`` to ``Model.save``
+2) add a new option to ``Model.Meta``, ``updatable_pk``
+3) make this a project-wide setting
+
+Option 3 doesn't look pleasant and I think I can safely eliminate that.
+Option 2 is somewhat better, although it adds a new ``Meta`` option.
+Option 1 is the most flexible solution, however, it does not change the
+behavior of the admin, at least not by default. This can be worked around
+by overriding the ``save`` method to use a different default::
+
+    class MyModel(models.Model):
+        def save(self, update_pk=True, **kwargs):
+            kwargs['update_pk'] = update_pk
+            return super(MyModel, self).save(**kwargs)
+
+To avoid the need to repeat this for each model, a class decorator might
+be provided to perform this automatically.
+
+In order to implement this new behavior a little bit of extra complexity
+would have to be added to models. Model instances would need to store the
+last known value of the primary key as retrieved from the database. On
+save it would just find out whether the last known value is present and in
+that case issue an ``UPDATE`` using the old value in the ``WHERE``
+condition.
+
+So far so good, this could be implemented fairly easily.
However, the
+problem becomes considerably more difficult as soon as we take into
+account the fact that updating a primary key value may break foreign key
+references. In order to avoid breaking references the ``on_delete``
+mechanism of ``ForeignKey`` would have to be extended to support updates
+as well. This means that the collector used by deletion will need to be
+extended as well.
+
+The problem becomes particularly nasty if we realize that a ``ForeignKey``
+might be part of a primary key, which means the collector needs to keep
+track of which field depends on which in a graph of potentially unlimited
+size. Compared to this, deletion is simpler as it only needs to find a
+list of all affected model instances as opposed to having to keep track of
+which field to update using which value.
+
+Given the complexity of this problem and the fact that it is not directly
+related to composite fields, this is left as the last feature which will
+be implemented only if I manage to finish everything else on time.
+
+
+# https://people.ksp.sk/~johnny64/GSoC-full-proposal
+
From 6a0b9fab5cbbfeb01b5a4810843ccf5717e64603 Mon Sep 17 00:00:00 2001
From: Asif Saifuddin Auvi
Date: Thu, 9 Mar 2017 01:32:17 +0600
Subject: [PATCH 02/80] changes

---
 draft/orm-improvements-for-composite-pk.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst
index 9e7abb65..67091693 100644
--- a/draft/orm-improvements-for-composite-pk.rst
+++ b/draft/orm-improvements-for-composite-pk.rst
@@ -38,6 +38,8 @@ Consider other cases where true virtual fields are needed.
 7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField``
 8. Refactor all RelationFields based on ``VirtualField`` based ForeignKey
 9. Refactor GenericForeignKey based on ``VirtualField`` based ForeignKey
+10. Make changes to migrations framework to work properly with Reafctored Field
+    API.
From 4c89d85423c1bc6c4212618ee17feacede156c27 Mon Sep 17 00:00:00 2001
From: Asif Saifuddin Auvi
Date: Thu, 9 Mar 2017 02:30:14 +0600
Subject: [PATCH 03/80] re order

---
 draft/orm-improvements-for-composite-pk.rst | 121 +++++++++++---------
 1 file changed, 69 insertions(+), 52 deletions(-)

diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst
index 67091693..3df41f5a 100644
--- a/draft/orm-improvements-for-composite-pk.rst
+++ b/draft/orm-improvements-for-composite-pk.rst
@@ -23,68 +23,32 @@ This DEP aims to improve different part of django ORM and other associated parts

 Key concerns of New Approach to implement ``CompositeField``
 ==============================================================
-
-1. Change ForeignObjectRel subclasses to real field instances. (For example,
+1. Split out Field API to ConcreteField, BaseField etc and change on ORM based on the splitted API.
+2. Introduce new standalone well defined ``VirtualField``
+3. Incorporate ``VirtualField`` related changes in django
+4. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API
+5. Figure out other cases where true virtual fields are needed.
+6. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey
+7. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey
+8. Change ForeignObjectRel subclasses to real field instances. (For example,
    ForeignKey generates a ManyToOneRel in the related model). The Rel instances are already returned from get_field(), but they aren't yet field subclasses.
-2. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be advantageous to be able to define reverse relations directly. For example, see https://github.com/akaariai/django-reverse-unique.
-
-3. Partition ForeignKey to virtual relation field, and concrete data field.
The former is the model.author, the latter model.author_id's backing implementation.
-Consider other cases where true virtual fields are needed.
-
-4. Introduce new standalone ``VirtualField``
-5. Incorporate ``VirtualField`` related changes in django
-6. Split out existing Fields API into ``ConcreteField`` and BaseField
-   to utilize ``VirtualField``.
-7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField``
-8. Refactor all RelationFields based on ``VirtualField`` based ForeignKey
-9. Refactor GenericForeignKey based on ``VirtualField`` based ForeignKey
+9. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be advantageous to be able to define reverse relations directly. For example, see https://github.com/akaariai/django-reverse-unique.
+
 10. Make changes to migrations framework to work properly with Reafctored Field
     API.
+11. Consider Database Contraints work of lan-foote and
+12. Changes in AutoField



-Summary of ``CompositeField``
-=============================
-
-This section summarizes the basic API as established in the proposal for
-GSoC 2011 [1]_.
-
-A ``CompositeField`` requires a list of enclosed regular model fields as
-positional arguments, as shown in this example::
-
-    class SomeModel(models.Model):
-        first_field = models.IntegerField()
-        second_field = models.CharField(max_length=100)
-        composite = models.CompositeField(first_field, second_field)
+New split out Field API
+=========================

-The model class then contains a descriptor for the composite field, which
-returns a ``CompositeValue`` which is a customized namedtuple, the
-descriptor accepts any iterable of the appropriate length.
An example
-interactive session::

-    >>> instance = new SomeModel(first_field=47, second_field="some string")
-    >>> instance.composite
-    CompositeObject(first_field=47, second_field='some string')
-    >>> instance.composite.first_field
-    47
-    >>> instance.composite[1]
-    'some string'
-    >>> instance.composite = (74, "other string")
-    >>> instance.first_field, instance.second_field
-    (74, 'other string')
+Introduce ``VirtualField``
+=========================

-``CompositeField`` supports the following standard field options:
-``unique``, ``db_index``, ``primary_key``. The first two will simply add a
-corresponding tuple to ``model._meta.unique_together`` or
-``model._meta.index_together``. Other field options don't make much sense
-in the context of composite fields.

-Supported ``QuerySet`` filters will be ``exact`` and ``in``. The former
-should be clear enough, the latter is elaborated in a separate section.
-
-It will be possible to use a ``CompositeField`` as a target field of
-``ForeignKey``, ``OneToOneField`` and ``ManyToManyField``. This is
-described in more detail in the following section.

 Changes in ``ForeignKey``
 =========================
@@ -167,6 +131,57 @@ it in their projects anyway. I still think the change is worth it, but it
 might be a good idea to include a note about the change in the release
 notes.

+
+
+Summary of ``CompositeField``
+=============================
+
+This section summarizes the basic API as established in the proposal for
+GSoC 2011 [1]_.
+
+A ``CompositeField`` requires a list of enclosed regular model fields as
+positional arguments, as shown in this example::
+
+    class SomeModel(models.Model):
+        first_field = models.IntegerField()
+        second_field = models.CharField(max_length=100)
+        composite = models.CompositeField(first_field, second_field)
+
+The model class then contains a descriptor for the composite field, which
+returns a ``CompositeValue`` which is a customized namedtuple, the
+descriptor accepts any iterable of the appropriate length. An example
+interactive session::
+
+    >>> instance = new SomeModel(first_field=47, second_field="some string")
+    >>> instance.composite
+    CompositeObject(first_field=47, second_field='some string')
+    >>> instance.composite.first_field
+    47
+    >>> instance.composite[1]
+    'some string'
+    >>> instance.composite = (74, "other string")
+    >>> instance.first_field, instance.second_field
+    (74, 'other string')
+
+``CompositeField`` supports the following standard field options:
+``unique``, ``db_index``, ``primary_key``. The first two will simply add a
+corresponding tuple to ``model._meta.unique_together`` or
+``model._meta.index_together``. Other field options don't make much sense
+in the context of composite fields.
+
+Supported ``QuerySet`` filters will be ``exact`` and ``in``. The former
+should be clear enough, the latter is elaborated in a separate section.
+
+It will be possible to use a ``CompositeField`` as a target field of
+``ForeignKey``, ``OneToOneField`` and ``ManyToManyField``. This is
+described in more detail in the following section.
+
+
+
+Alternative Approach of compositeFiled
+=======================================
+
+
 Porting previous work on top of master
 ======================================

@@ -231,6 +246,7 @@ any database backend directly, a new flag will be introduced,
 implementation of ``composite_in_sql`` will consult in order to choose
 between the two options.
+
 ``contenttypes`` and ``GenericForeignKey``
 ==========================================

@@ -283,6 +299,7 @@ I'm inclined to declare ``GenericRelation`` unsupported for models with a
 composite primary key containing any special columns. This should be
 extremely rare anyway.

+
 Database introspection, ``inspectdb``
 =====================================

From 8dbd2114a6f7768914e1998027028027d84a7916 Mon Sep 17 00:00:00 2001
From: Asif Saifuddin Auvi
Date: Sat, 11 Mar 2017 18:27:49 +0600
Subject: [PATCH 04/80] modifications

---
 draft/orm-improvements-for-composite-pk.rst | 59 ++++++++++-----------
 1 file changed, 28 insertions(+), 31 deletions(-)

diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst
index 3df41f5a..0e9caf98 100644
--- a/draft/orm-improvements-for-composite-pk.rst
+++ b/draft/orm-improvements-for-composite-pk.rst
@@ -41,6 +41,30 @@ Key concerns of New Approach to implement ``CompositeField``
 12. Changes in AutoField


+Porting previous work on top of master
+======================================
+
+The first major task of this project is to take the code written as part
+of GSoC 2013 and sync it with the current state of master. The order in
+which It was implemented two years ago was to implement
+``CompositeField`` first and then a refactor of ``ForeignKey`` which
+is required to make it support ``CompositeField``. This turned out to be
+inefficient with respect to the development process, because some parts of
+the refactor broke the introduced ``CompositeField`` functionality,
+meaning that it was needed effectively reimplement parts of it again.
+
+Also, some abstractions introduced by the refactor made it possible to
+rewrite certain parts in a cleaner way than what was necessary for
+``CompositeField`` alone (e.g. database creation or certain features of
+``model._meta``).
+
+In light of these findings I am convinced that a better approach would be
+to first do the required refactor of ``ForeignKey`` and implement
+CompositeField as the next step. This will result in a better maintainable
+development branch and a cleaner revision history, making it easier to
+review the work before its eventual inclusion into Django.
+
+

 New split out Field API
 =========================
@@ -182,27 +206,6 @@ Alternative Approach of compositeFiled
 =======================================


-Porting previous work on top of master
-======================================
-
-The first major task of this project is to take the code I wrote as part
-of GSoC 2011 and sync it with the current state of master. The order in
-which I implemented things two years ago was to implement
-``CompositeField`` first and then I did a refactor of ``ForeignKey`` which
-is required to make it support ``CompositeField``. This turned out to be
-inefficient with respect to the development process, because some parts of
-the refactor broke the introduced ``CompositeField`` functionality,
-meaning I had to effectively reimplement parts of it again. Also, some
-abstractions introduced by the refactor made it possible to rewrite
-certain parts in a cleaner way than what was necessary for
-``CompositeField`` alone (e.g. database creation or certain features of
-``model._meta``).
-
-In light of these findings I am convinced that a better approach would be
-to first do the required refactor of ``ForeignKey`` and implement
-CompositeField as the next step. This will result in a better maintainable
-development branch and a cleaner revision history, making it easier to
-review the work before its eventual inclusion into Django.

 ``__in`` lookups for ``CompositeField``
 =======================================
@@ -359,13 +362,14 @@ timeline.
It isn't a necessity, we can always just add a note to the docs that ``inspectdb`` just can't detect certain scenarios and ask people to edit their models manually. + Updatable primary keys in models ================================ The algorithm that determines what kind of database query to issue on -``model.save()`` is a fairly simple and well-documented one [6]_. If a row -exists in the database with the value of its primary key equal to the -saved object, it is updated, otherwise a new row is inserted. This +``model.save()`` is a fairly simple and well-documented one [6]_. If a +row exists in the database with the value of its primary key equal to +the saved object, it is updated, otherwise a new row is inserted. This behavior is intuitive and works well for models where the primary key is automatically created by the framework (be it an ``AutoField`` or a parent link in the case of model inheritance). @@ -447,10 +451,3 @@ size. Compared to this, deletion is simpler as it only needs to find a list of all affected model instances as opposed to having to keep track of which field to update using which value. -Given the complexity of this problem and the fact that it is not directly -related to composite fields, this is left as the last feature which will -be implemented only if I manage to finish everything else on time. 
- - -# https://people.ksp.sk/~johnny64/GSoC-full-proposal - From 85eb8ef0a606001001ec9538bd891865ea7b5da6 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 11 Mar 2017 20:08:25 +0600 Subject: [PATCH 05/80] more modifications --- draft/orm-improvements-for-composite-pk.rst | 39 ++++++++++++++++++--- 1 file changed, 34 insertions(+), 5 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 0e9caf98..6748bb78 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -18,8 +18,31 @@ DEP : ORM Fields and related improvement for composite PK Abstract ======== +Django's ORM is a powerful tool which suits perfectly most use-cases, +however, there are cases where having exactly one primary key column per +table induces unnecessary redundancy. + +One such case is the many-to-many intermediary model. Even though the pair +of ForeignKeys in this model identifies uniquely each relationship, an +additional field is required by the ORM to identify individual rows. While +this isn't a real problem when the underlying database schema is created +by Django, it becomes an obstacle as soon as one tries to develop a Django +application using a legacy database. + +Since there is already a lot of code relying on the pk property of model +instances and the ability to use it in QuerySet filters, it is necessary +to implement a mechanism to allow filtering of several actual fields by +specifying a single filter. + +The proposed solution is using Virtualfield type, CompositeField. This field +type will enclose several real fields within one single object. + + +Motivation +========== +This DEP aims to improve different part of django ORM and other associated parts of django to support composite primary key in django. There were several attempt to fix this problem and several ways to implement this. 
There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API +and design everything as much simple and small as possible to be able to implement separately. -This DEP aims to improve different part of django ORM and other associated parts of django to support composite primary key in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. Key concerns of New Approach to implement ``CompositeField`` ============================================================== @@ -36,9 +59,15 @@ Key concerns of New Approach to implement ``CompositeField`` 10. Make changes to migrations framework to work properly with Reafctored Field API. -11. Consider Database Contraints work of lan-foote and -12. Changes in AutoField +11. Make sure new class based Index API ise used properly with refactored Field + API. + +12. Consider Database Contraints work of lan-foote and + +13. SubField/AuxilaryField + +14. 
Update in AutoField Porting previous work on top of master @@ -69,8 +98,8 @@ New split out Field API ========================= -Introduce ``VirtualField`` -========================= +Introduce standalone ``VirtualField`` +===================================== From 17b34bd2b799326d741774011deb5927ca5e143a Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 11 Mar 2017 20:37:48 +0600 Subject: [PATCH 06/80] more modifications --- draft/orm-improvements-for-composite-pk.rst | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 6748bb78..3dfab7e5 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -74,8 +74,9 @@ Porting previous work on top of master ====================================== The first major task of this project is to take the code written as part -of GSoC 2013 and sync it with the current state of master. The order in -which It was implemented two years ago was to implement +of GSoC 2013 and compare it aganist master to have Idea of valid part. + +The order in which It was implemented few years ago was to implement ``CompositeField`` first and then a refactor of ``ForeignKey`` which is required to make it support ``CompositeField``. This turned out to be inefficient with respect to the development process, because some parts of @@ -87,11 +88,11 @@ rewrite certain parts in a cleaner way than what was necessary for ``CompositeField`` alone (e.g. database creation or certain features of ``model._meta``). -In light of these findings I am convinced that a better approach would be -to first do the required refactor of ``ForeignKey`` and implement -CompositeField as the next step. This will result in a better maintainable -development branch and a cleaner revision history, making it easier to -review the work before its eventual inclusion into Django. 
+I am convinced that a better approach would be to Improve Field API and later +imlement VirtualField type to first do the required refactor of ``ForeignKey`` +and implement CompositeField as the next step. This will result in a better +maintainable development branch and a cleaner revision history, making it easier +to review the work before its eventual inclusion into Django. New split out Field API From 438c85a6cc0f26a4c78b891456c593c555d730ee Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 11 Mar 2017 23:34:59 +0600 Subject: [PATCH 07/80] more modifications --- draft/orm-improvements-for-composite-pk.rst | 30 +++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 3dfab7e5..c9ebd82e 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -97,6 +97,33 @@ to review the work before its eventual inclusion into Django. New split out Field API ========================= +1. BaseField: +------------- +Base structure for all Field types in django ORM wheather it is Concrete +or VirtualField + +2. ConcreteField: +----------------- +ConcreteField will have all the common attributes of a Regular concrete field + +3. Field: +--------- +Presence base Field class with should refactored using BaseField and ConcreteField. +If it is decided to provide the optional virtual type to regular fields then VirtualField's features can also be added to specific fields. + +4. VirtualField: +---------------- +A true stand alone virtula field will be added to the system to be used to solve some long standing design limitations of django orm. initially RelationFields, GenericRelations etc will be benefitted by using VirtualFields and later CompositeField +or any virtual type field can be benefitted from VirtualField. + +5. RelationField: +----------------- + + +6. 
CompositeField: +------------------ +A composite field can be implemented based on BaseField and VirtualField to solve +the CompositeKey/Multi column PrimaryKey issue. Introduce standalone ``VirtualField`` @@ -186,6 +213,9 @@ might be a good idea to include a note about the change in the release notes. +Changes in ``RelationField`` +============================= + Summary of ``CompositeField`` ============================= From 2c6eec514742d11f067b3fd9776322f0f78c48ad Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 12 Mar 2017 00:04:31 +0600 Subject: [PATCH 08/80] more detail break down from older references --- draft/orm-improvements-for-composite-pk.rst | 231 ++++++++++++++++++++ 1 file changed, 231 insertions(+) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index c9ebd82e..b53d6103 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -266,6 +266,212 @@ Alternative Approach of compositeFiled ======================================= +Implementation +-------------- + +Specifying a CompositeField in a Model +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The constructor of a CompositeField will accept the supported options as +keyword parameters and the enclosed fields will be specified as positional +parameters. The order in which they are specified will determine their +order in the namedtuple representing the CompositeField value (i. e. when +retrieving and assigning the CompositeField's value; see example below). + +unique and db_index +~~~~~~~~~~~~~~~~~~~ +Implementing these will require some modifications in the backend code. +The table creation code will have to handle virtual fields as well as +local fields in the table creation and index creation routines +respectively. 
+
+When the code handling CompositeField.unique is finished, the
+models.options.Options class will have to be modified to create a unique
+CompositeField for each tuple in the Meta.unique_together attribute. The
+code handling unique checks in models.Model will also have to be updated
+to reflect the change.
+
+Retrieval and assignment
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Jacob has actually already provided a skeleton of the code that takes care
+of this as seen in [1]. I'll only summarize the behaviour in a brief
+example of my own::
+
+    class SomeModel(models.Model):
+        first_field = models.IntegerField()
+        second_field = models.CharField(max_length=100)
+        composite = models.CompositeField(first_field, second_field)
+
+    >>> instance = SomeModel(first_field=47, second_field="some string")
+    >>> instance.composite
+    CompositeObject(first_field=47, second_field='some string')
+    >>> instance.composite.first_field
+    47
+    >>> instance.composite[1]
+    'some string'
+    >>> instance.composite = (74, "other string")
+    >>> instance.first_field, instance.second_field
+    (74, 'other string')
+
+Accessing the field attribute will create a CompositeObject instance which
+will behave like a tuple but also with direct access to enclosed field
+values via appropriately named attributes.
+
+Assignment will be possible using any iterable. The order of the values in
+the iterable will have to be the same as the order in which underlying
+fields have been specified to the CompositeField.
+
+QuerySet filtering
+~~~~~~~~~~~~~~~~~~
+
+This is where the real fun begins.
+
+The fundamental problem here is that Q objects which are used all over the
+code that handles filtering are designed to describe single field lookups.
+On the other hand, CompositeFields will require a way to describe several
+individual field lookups by a single expression.
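The idea of one filter standing for several per-field lookups can be
illustrated without any ORM machinery. The following is a minimal sketch in
plain Python, not Django code: ``expand_composite_lookup`` is a hypothetical
helper, and the ``__exact`` keys only mirror the lookup naming used in this
proposal.

```python
def expand_composite_lookup(field_names, value):
    """Expand one composite value into per-field ``__exact`` lookups.

    Hypothetical helper for illustration only; it is not part of Django.
    """
    values = tuple(value)  # any iterable is accepted, as for assignment
    if len(values) != len(field_names):
        raise ValueError(
            "expected %d values, got %d" % (len(field_names), len(values))
        )
    # The conjunction of these lookups describes the same rows as the
    # single composite filter would.
    return {
        "%s__exact" % name: item
        for name, item in zip(field_names, values)
    }


lookups = expand_composite_lookup(
    ("first_field", "second_field"), (47, "some string")
)
print(lookups)
```

A dictionary of this shape could be passed as keyword arguments to an
ordinary filter call, which is one way the conjunction could be expressed
with existing single-field lookups.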
+
+Since the Q objects themselves have no idea about fields at all and the
+actual field resolution from the filter conditions happens deeper down the
+line, inside models.sql.query.Query, this is where we can handle the
+filters properly.
+
+There is already some basic machinery inside Query.add_filter and
+Query.setup_joins that is in use by GenericRelations, this is
+unfortunately not enough. The optional extra_filters field method will be
+of great use here, though it will have to be extended.
+
+Currently the only parameters it gets are the list of joins the
+filter traverses, the position in the list and a negate parameter
+specifying whether the filter is negated. The GenericRelation instance can
+determine the value of the content type (which is what the extra_filters
+method is used for) easily based on the model it belongs to.
+
+This is not the case for a CompositeField -- it doesn't have any idea
+about the values used in the query. Therefore a new parameter has to be
+added to the method so that the CompositeField can construct all the
+actual filters from the iterable containing the values.
+
+Afterwards the handling inside Query is pretty straightforward. For
+CompositeFields (and virtual fields in general) there is no value to be
+used in the where node, the extra_filters are responsible for all
+filtering, but since the filter should apply to a single object even after
+join traversals, the aliases will be set up while handling the "root"
+filter and then reused for each one of the extra_filters.
+
+This way of extending the extra_filters mechanism will allow the field
+class to create conjunctions of atomic conditions. This is sufficient for
+the "__exact" lookup type which will be implemented.
+
+Of the other lookup types, the only one that looks reasonable is "__in".
+This will, however, have to be represented as a disjunction of multiple
+"__exact" conditions since not all database backends support tuple
+construction inside expressions. Therefore this lookup type will be left
+out of this project as the mechanism would need much more work to make it
+possible.
+
+CompositeField.primary_key
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As with db_index and unique, the backend table generating code will have
+to be updated to set the PRIMARY KEY to a tuple. In this case, however,
+the impact on the rest of the ORM and some other parts of Django is more
+serious.
+
+A (hopefully) complete list of things affected by this is:
+- the admin: the possibility to pass the value of the primary key as a
+  parameter inside the URL is a necessity to be able to work with a model
+- contenttypes: since the admin uses GenericForeignKeys to log activity,
+  there will have to be some support
+- forms: more precisely, ModelForms and their ModelChoiceFields
+- relationship fields: ForeignKey, ManyToManyField and OneToOneField will
+  need a way to point to a model with a CompositeField as its primary key
+
+Let's look at each one of them in more detail.
+
+Admin
+~~~~~
+
+The solution that has been proposed so many times in the past [2], [3] is
+to extend the quote function used in the admin to also quote the comma and
+then use an unquoted comma as the separator. Even though this solution
+looks ugly to some, I don't think there is much choice -- there needs to
+be a way to separate the values and in theory, any character could be
+contained inside a value so we can't really avoid choosing one and
+escaping it.
+
+GenericForeignKeys
+~~~~~~~~~~~~~~~~~~
+
+Even though the admin uses the contenttypes framework to log the history
+of actions, it turns out proper handling on the admin side will make
+things work without the need to modify GenericForeignKey code at all. This
+is thanks to the fact that the admin uses only the ContentType field and
+handles the relations on its own. Making sure the unquoting function
+recreates the whole CompositeObjects where necessary should suffice.
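The comma-separated quoting scheme described for the admin can be sketched
as follows. This is an illustration under assumptions: the helper names and
the exact set of escaped characters are hypothetical, and the ``_XX`` hex
escape is only modelled on the style of the admin's existing quote function.
Values come back as strings, so converting them to the Python types of the
enclosed fields would still be left to those fields.

```python
# Characters escaped inside a single value. The comma is included so that
# an unquoted comma can serve as the separator between values.
# (Hypothetical character set, chosen for illustration.)
SPECIAL = set(':/_#?;@&=+$,"[]<>%\n\\')


def quote_part(value):
    """Escape special characters in one value as ``_XX`` hex codes."""
    return "".join(
        "_%02X" % ord(ch) if ch in SPECIAL else ch for ch in str(value)
    )


def unquote_part(text):
    """Reverse ``quote_part``.

    Every "_" in quoted text starts a two-digit hex escape, because a
    literal underscore is itself escaped as ``_5F``.
    """
    head, *chunks = text.split("_")
    return head + "".join(
        chr(int(chunk[:2], 16)) + chunk[2:] for chunk in chunks
    )


def quote_composite(values):
    """Join the quoted values with an unquoted comma."""
    return ",".join(quote_part(v) for v in values)


def unquote_composite(text):
    """Split on unquoted commas and unquote each part."""
    return tuple(unquote_part(part) for part in text.split(","))


url_fragment = quote_composite((74, "other, string"))
restored = unquote_composite(url_fragment)
print(url_fragment, restored)
```

Because the comma inside a value is always escaped, splitting the URL
fragment on plain commas recovers exactly one part per enclosed field.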
+
+At a later stage, however, GenericForeignKeys could also be improved to
+support composite primary keys. Using the same quoting solution as in the
+admin could work in theory, although it would only allow fields capable of
+storing arbitrary strings to be usable for object_id storage. This has
+been left out of the scope of this project, though.
+
+ModelChoiceFields
+~~~~~~~~~~~~~~~~~
+
+Again, we need a way to specify the value as a parameter passed in the
+form. The same escaping solution can be used even here.
+
+Relationship fields
+~~~~~~~~~~~~~~~~~~~
+
+This turns out to be, not too surprisingly, the toughest problem. The fact
+that related fields are spread across about fifteen different classes,
+most of which are quite nontrivial, makes the whole bundle pretty fragile,
+which means the changes have to be made carefully not to break anything.
+
+What we need to achieve is that the ForeignKey, ManyToManyField and
+OneToOneField detect when their target field is a CompositeField in
+several situations and act accordingly since this will require different
+handling than regular fields that map directly to database columns.
+
+The first one to look at is ForeignKey since the other two rely on its
+functionality, OneToOneField being its descendant and ManyToManyField
+using ForeignKeys in the intermediary model. Once the ForeignKeys work,
+OneToOneField should require minimal to no changes since it inherits
+almost everything from ForeignKey.
+
+The easiest part is that for composite related fields, the db_type will be
+None since the data will be stored elsewhere.
+
+ForeignKey and OneToOneField will also be able to create the underlying
+fields automatically when added to the model. I'm proposing the following
+default names: "fkname_targetname" where "fkname" is the name of the
+ForeignKey field and "targetname" is the name of the remote field
+corresponding to the local one. I'm open to other suggestions on this.
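The proposed default naming can be written down in a few lines. A minimal
sketch, assuming a hypothetical helper ``default_enclosed_names`` that
receives the ForeignKey's name and the names of the fields enclosed by the
target CompositeField; neither the helper nor the example field names come
from Django itself.

```python
def default_enclosed_names(fk_name, target_field_names):
    """Build "fkname_targetname" names, one per enclosed remote field.

    Hypothetical helper for illustration only. The order of the result
    follows the order of the fields in the target CompositeField.
    """
    return tuple(
        "%s_%s" % (fk_name, target_name)
        for target_name in target_field_names
    )


names = default_enclosed_names("owner", ("first_field", "second_field"))
print(names)
```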
+ +There will also be a way to override the default names using a new field +option "enclosed_fields". This option will expect a tuple of fields each +of which corresponds to one individual field in the same order as +specified in the target CompositeField. This option will be ignored for +non-composite ForeignKeys. + +The trickiest part, however, will be relation traversals in QuerySet +lookups. Currently the code in models.sql.query.Query that creates joins +only joins on single columns. To be able to span a composite relationship +the code that generates joins will have to recognize column tuples and add +a constraint for each pair of corresponding columns with the same aliases +in all conditions. + +For the sake of completeness, ForeignKey will also have an extra_filters +method allowing to filter by a related object or its primary key. + +With all this infrastructure set up, ManyToMany relationships using +composite fields will be easy enough. Intermediary model creation will +work thanks to automatic underlying field creation for composite fields +and traversal in both directions will be supported by the query code. + ``__in`` lookups for ``CompositeField`` ======================================= @@ -423,6 +629,31 @@ that ``inspectdb`` just can't detect certain scenarios and ask people to edit their models manually. + + +Other considerations +-------------------- + +This infrastructure will allow reimplementing the GenericForeignKey as a +CompositeField at a later stage. Thanks to the modifications in the +joining code it should also be possible to implement bidirectional generic +relationship traversal in QuerySet filters. This is, however, out of scope +of this project. + +CompositeFields will have the serialize option set to False to prevent +their serialization. Otherwise the enclosed fields would be serialized +twice which would not only introduce redundancy but also ambiguity.
+ +Also CompositeFields will be ignored in ModelForms by default, for two +reasons: +- otherwise the same field would be inside the form twice +- there aren't really any form fields usable for tuples and a fieldset + would require even more out-of-scope machinery + +The CompositeField will not allow enclosing other CompositeFields. The +only exception might be the case of composite ForeignKeys which could also +be implemented after successful finish of this project. With this feature +the autogenerated intermediary M2M model could make the two ForeignKeys +its primary key, dropping the need to have a redundant id AutoField. + Updatable primary keys in models ================================ From b081a971af0c1b002fd5302f7df1fd84d465f506 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Fri, 17 Mar 2017 00:04:04 +0600 Subject: [PATCH 09/80] modification --- draft/orm-improvements-for-composite-pk.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index b53d6103..6e620891 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -22,6 +22,8 @@ Django's ORM is a powerful tool which suits perfectly most use-cases, however, there are cases where having exactly one primary key column per table induces unnecessary redundancy. +Django ORM fields does have some historical design decisions like + One such case is the many-to-many intermediary model. Even though the pair of ForeignKeys in this model identifies uniquely each relationship, an additional field is required by the ORM to identify individual rows. 
While From dc4581969e02ae130fafc4b8abb4424d0ba50d07 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 11:19:54 +0600 Subject: [PATCH 10/80] modification --- draft/orm-improvements-for-composite-pk.rst | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 6e620891..f161d1f8 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -18,12 +18,15 @@ DEP : ORM Fields and related improvement for composite PK Abstract ======== -Django's ORM is a powerful tool which suits perfectly most use-cases, -however, there are cases where having exactly one primary key column per -table induces unnecessary redundancy. - -Django ORM fields does have some historical design decisions like +Django's ORM is a simple & powerful tool which suits most use-cases, +however, there are some historical design decisions like all the fields are +concreteField by default. This type of design limitation made it difficult +to add support for composite primarykey or working with relationField/genericRelations +very inconsistant behaviour. +cases where having exactly one primary key column per +table induces unnecessary redundancy. + One such case is the many-to-many intermediary model. Even though the pair of ForeignKeys in this model identifies uniquely each relationship, an additional field is required by the ORM to identify individual rows. While @@ -36,13 +39,13 @@ instances and the ability to use it in QuerySet filters, it is necessary to implement a mechanism to allow filtering of several actual fields by specifying a single filter. -The proposed solution is using Virtualfield type, CompositeField. This field -type will enclose several real fields within one single object. +The proposed solution is using Virtualfield type, and necessary VirtualField desendent +Fields[CompositeField]. 
The Virtual field type will enclose several real fields within one single object. Motivation ========== -This DEP aims to improve different part of django ORM and other associated parts of django to support composite primary key in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API +This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. From 615d89d6196e038eafe2bc7f311bf190651d149c Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 13:08:37 +0600 Subject: [PATCH 11/80] modification --- draft/orm-improvements-for-composite-pk.rst | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index f161d1f8..f8339055 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -93,8 +93,7 @@ rewrite certain parts in a cleaner way than what was necessary for ``CompositeField`` alone (e.g. database creation or certain features of ``model._meta``). 
-I am convinced that a better approach would be to Improve Field API and later -imlement VirtualField type to first do the required refactor of ``ForeignKey`` +I am convinced that a better approach would be to Improve Field API and RealtionField API and later imlement VirtualField type to first do the required refactor of ``ForeignKey`` and implement CompositeField as the next step. This will result in a better maintainable development branch and a cleaner revision history, making it easier to review the work before its eventual inclusion into Django. From 3c5ba6f963937fe0bac1efe98987cf4ac021b405 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 13:34:39 +0600 Subject: [PATCH 12/80] rename draft --- ...for-composite-pk.rst => orm-field-api-related-improvement.rst} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename draft/{orm-improvements-for-composite-pk.rst => orm-field-api-related-improvement.rst} (100%) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-field-api-related-improvement.rst similarity index 100% rename from draft/orm-improvements-for-composite-pk.rst rename to draft/orm-field-api-related-improvement.rst From a88bc86db44894282a373a7df93a1c80c95889a0 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 13:44:28 +0600 Subject: [PATCH 13/80] ORM Fields API and Related Improvements --- draft/orm-field-api-related-improvement.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index f8339055..28c05eaa 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -1,5 +1,5 @@ ========================================================= -DEP : ORM Fields and related improvement for composite PK +DEP : ORM Fields API & Related Improvements ========================================================= :DEP: 0201 From 
297885690e976465cb8b8e62ea1e1c7ea83cfc0b Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 14:45:55 +0600 Subject: [PATCH 14/80] adjustments --- draft/orm-field-api-related-improvement.rst | 72 ++++++++++++--------- 1 file changed, 41 insertions(+), 31 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 28c05eaa..bb708d3f 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -7,8 +7,8 @@ DEP : ORM Fields API & Related Improvements :Implementation Team: Asif Saif Uddin, django core team :Shepherd: Django Core Team :Status: Draft -:Type: Feature -:Created: 2017-3-2 +:Type: Feature/Cleanup/Optimization +:Created: 2017-3-18 :Last-Modified: 2017-00-00 .. contents:: Table of Contents @@ -16,20 +16,19 @@ DEP : ORM Fields API & Related Improvements :local: -Abstract -======== +Background: +=========== Django's ORM is a simple & powerful tool which suits most use-cases, -however, there are some historical design decisions like all the fields are -concreteField by default. This type of design limitation made it difficult -to add support for composite primarykey or working with relationField/genericRelations -very inconsistant behaviour. +however, there are some historical design limitations and many inconsistant +implementation in orm relation fields API which produce many inconsistant +behaviour -cases where having exactly one primary key column per -table induces unnecessary redundancy. - -One such case is the many-to-many intermediary model. Even though the pair -of ForeignKeys in this model identifies uniquely each relationship, an -additional field is required by the ORM to identify individual rows. 
While +This type of design limitation made it difficult to add support for composite primarykey or working with relationField/genericRelations very annoying as it +produces inconsistant behaviour and a very hard implementation to maintain. + +Also there are such case is the many-to-many intermediary model. Even though +the pair of ForeignKeys in this model identifies uniquely each relationship, +an additional field is required by the ORM to identify individual rows. While this isn't a real problem when the underlying database schema is created by Django, it becomes an obstacle as soon as one tries to develop a Django application using a legacy database. @@ -39,11 +38,11 @@ instances and the ability to use it in QuerySet filters, it is necessary to implement a mechanism to allow filtering of several actual fields by specifying a single filter. -The proposed solution is using Virtualfield type, and necessary VirtualField desendent -Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. +The proposed solution is using Virtualfield type, and necessary VirtualField +desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. -Motivation +Abstract ========== This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. 
@@ -51,28 +50,39 @@ and design everything as much simple and small as possible to be able to impleme Key concerns of New Approach to implement ``CompositeField`` ============================================================== -1. Split out Field API to ConcreteField, BaseField etc and change on ORM based on the splitted API. -2. Introduce new standalone well defined ``VirtualField`` -3. Incorporate ``VirtualField`` related changes in django -4. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API -5. Figure out other cases where true virtual fields are needed. -6. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey -7. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey -8. Change ForeignObjectRel subclasses to real field instances. (For example, +1. Split out Field API logically to separate ConcreteField, + BaseField etc and change on ORM based on the splitted API. + +2. Change ForeignObjectRel subclasses to real field instances. (For example, ForeignKey generates a ManyToOneRel in the related model). The Rel instances are already returned from get_field(), but they aren't yet field subclasses. -9. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be advantageous to be able to define reverse relations directly. For example, see ​https://github.com/akaariai/django-reverse-unique. + +3. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be + advantageous to be able to define reverse relations directly. For example, + see ​https://github.com/akaariai/django-reverse-unique. + +5. Introduce new standalone well defined ``VirtualField`` + +6. Incorporate ``VirtualField`` related changes in django + +7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API + +8. Figure out other cases where true virtual fields are needed. + +9. 
Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey + +10. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey -10. Make changes to migrations framework to work properly with Reafctored Field +11. Make changes to migrations framework to work properly with Reafctored Field API. -11. Make sure new class based Index API ise used properly with refactored Field +12. Make sure new class based Index API ise used properly with refactored Field API. -12. Consider Database Contraints work of lan-foote and +13. Consider Database Contraints work of lan-foote and -13. SubField/AuxilaryField +14. SubField/AuxilaryField -14. Update in AutoField +15. Update in AutoField Porting previous work on top of master From 528baeba88205c76f5a197fff1bced72d0bb1abc Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 14:58:24 +0600 Subject: [PATCH 15/80] major steps --- draft/orm-field-api-related-improvement.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index bb708d3f..dcdfd2e2 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -47,6 +47,11 @@ Abstract This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. +To keep thing sane I will try to split the Dep in 3 major Part: +1. 
Logical refactor of present Field API and RelationField API +2. VirtualField Based refactor +3. CompositeField API formalization + Key concerns of New Approach to implement ``CompositeField`` ============================================================== From 73e6943b93f450d137b0c1798246a21dc478f5d9 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 15:16:55 +0600 Subject: [PATCH 16/80] keep thing simple --- draft/orm-field-api-related-improvement.rst | 280 +++++++------------- 1 file changed, 101 insertions(+), 179 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index dcdfd2e2..3d492b71 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -114,6 +114,9 @@ maintainable development branch and a cleaner revision history, making it easier to review the work before its eventual inclusion into Django. +Specification: +=============== + New split out Field API ========================= 1. BaseField: @@ -139,6 +142,12 @@ or any virtual type field can be benefitted from VirtualField. ----------------- + + + + + + 6. CompositeField: ------------------ A composite field can be implemented based on BaseField and VirtualField to solve @@ -239,45 +248,6 @@ Changes in ``RelationField`` Summary of ``CompositeField`` ============================= -This section summarizes the basic API as established in the proposal for -GSoC 2011 [1]_. 
- -A ``CompositeField`` requires a list of enclosed regular model fields as -positional arguments, as shown in this example:: - - class SomeModel(models.Model): - first_field = models.IntegerField() - second_field = models.CharField(max_length=100) - composite = models.CompositeField(first_field, second_field) - -The model class then contains a descriptor for the composite field, which -returns a ``CompositeValue`` which is a customized namedtuple, the -descriptor accepts any iterable of the appropriate length. An example -interactive session:: - - >>> instance = new SomeModel(first_field=47, second_field="some string") - >>> instance.composite - CompositeObject(first_field=47, second_field='some string') - >>> instance.composite.first_field - 47 - >>> instance.composite[1] - 'some string' - >>> instance.composite = (74, "other string") - >>> instance.first_field, instance.second_field - (74, 'other string') - -``CompositeField`` supports the following standard field options: -``unique``, ``db_index``, ``primary_key``. The first two will simply add a -corresponding tuple to ``model._meta.unique_together`` or -``model._meta.index_together``. Other field options don't make much sense -in the context of composite fields. - -Supported ``QuerySet`` filters will be ``exact`` and ``in``. The former -should be clear enough, the latter is elaborated in a separate section. - -It will be possible to use a ``CompositeField`` as a target field of -``ForeignKey``, ``OneToOneField`` and ``ManyToManyField``. This is -described in more detail in the following section. @@ -288,107 +258,8 @@ Alternative Approach of compositeFiled Implementation -------------- -Specifying a CompositeField in a Model -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The constructor of a CompositeField will accept the supported options as -keyword parameters and the enclosed fields will be specified as positional -parameters. 
The order in which they are specified will determine their -order in the namedtuple representing the CompositeField value (i. e. when -retrieving and assigning the CompositeField's value; see example below). - -unique and db_index -~~~~~~~~~~~~~~~~~~~ -Implementing these will require some modifications in the backend code. -The table creation code will have to handle virtual fields as well as -local fields in the table creation and index creation routines -respectively. - -When the code handling CompositeField.unique is finished, the -models.options.Options class will have to be modified to create a unique -CompositeField for each tuple in the Meta.unique_together attribute. The -code handling unique checks in models.Model will also have to be updated -to reflect the change. - -Retrieval and assignment -~~~~~~~~~~~~~~~~~~~~~~~~ - -Jacob has actually already provided a skeleton of the code that takes care -of this as seen in [1]. I'll only summarize the behaviour in a brief -example of my own. - - class SomeModel(models.Model): - first_field = models.IntegerField() - second_field = models.CharField(max_length=100) - composite = models.CompositeField(first_field, second_field) - - >>> instance = new SomeModel(first_field=47, second_field="some string") - >>> instance.composite - CompositeObject(first_field=47, second_field='some string') - >>> instance.composite.first_field - 47 - >>> instance.composite[1] - 'some string' - >>> instance.composite = (74, "other string") - >>> instance.first_field, instance.second_field - (74, 'other string') - -Accessing the field attribute will create a CompositeObject instance which -will behave like a tuple but also with direct access to enclosed field -values via appropriately named attributes. - -Assignment will be possible using any iterable. The order of the values in -the iterable will have to be the same as the order in which undelying -fields have been specified to the CompositeField. 
- -QuerySet filtering -~~~~~~~~~~~~~~~~~~ - -This is where the real fun begins. - -The fundamental problem here is that Q objects which are used all over the -code that handles filtering are designed to describe single field lookups. -On the other hand, CompositeFields will require a way to describe several -individual field lookups by a single expression. - -Since the Q objects themselves have no idea about fields at all and the -actual field resolution from the filter conditions happens deeper down the -line, inside models.sql.query.Query, this is where we can handle the -filters properly. -There is already some basic machinery inside Query.add_filter and -Query.setup_joins that is in use by GenericRelations, this is -unfortunately not enough. The optional extra_filters field method will be -of great use here, though it will have to be extended. -Currently the only parameters it gets are the list of joins the -filter traverses, the position in the list and a negate parameter -specifying whether the filter is negated. The GenericRelation instance can -determine the value of the content type (which is what the extra_filters -method is used for) easily based on the model it belongs to. - -This is not the case for a CompositeField -- it doesn't have any idea -about the values used in the query. Therefore a new parameter has to be -added to the method so that the CompositeField can construct all the -actual filters from the iterable containing the values. - -Afterwards the handling inside Query is pretty straightforward. For -CompositeFields (and virtual fields in general) there is no value to be -used in the where node, the extra_filters are responsible for all -filtering, but since the filter should apply to a single object even after -join traversals, the aliases will be set up while handling the "root" -filter and then reused for each one of the extra_filters. 
- -This way of extending the extra_filters mechanism will allow the field -class to create conjunctions of atomic conditions. This is sufficient for -the "__exact" lookup type which will be implemented. - -Of the other lookup types, the only one that looks reasonable is "__in". -This will, however, have to be represented as a disjunction of multiple -"__exact" conditions since not all database backends support tuple -construction inside expressions. Therefore this lookup type will be left -out of this project as the mechanism would need much more work to make it -possible. CompositeField.primary_key ~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -492,47 +363,6 @@ work thanks to automatic underlying field creation for composite fields and traversal in both directions will be supported by the query code. -``__in`` lookups for ``CompositeField`` -======================================= - -The existing implementation of ``CompositeField`` handles ``__in`` lookups -in the generic, backend-independent ``WhereNode`` class and uses a -disjunctive normal form expression as in the following example:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 5 AND c = 6); - -The problem with this solution is that in cases where the list of values -contains tens or hundreds of tuples, this DNF expression will be extremely -long and the database will have to evaluate it for each and every row, -without a possibility of optimizing the query. - -Certain database backends support the following alternative:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE (a, b, c) IN [(1, 2, 3), (4, 5, 6)]; - -This would probably be the best option, but it can't be used by SQLite, -for instance. This is also the reason why the DNF expression was -implemented in the first place. - -In order to support this more natural syntax, the ``DatabaseOperations`` -needs to be extended with a method such as ``composite_in_sql``. 
- -However, this leaves the issue of the inefficient DNF unresolved for -backends without support for tuple literals. For such backends, the -following expression is proposed:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE EXISTS (SELECT a1, b1, c1, FROM (SELECT 1 as a, 2 as b, 3 as c - UNION SELECT 4, 5, 6) - WHERE a1=1 AND b1=b AND c1=c); - -Since both syntaxes are rather generic and at least one of them should fit -any database backend directly, a new flag will be introduced, -``DatabaseFeatures.supports_tuple_literals`` which the default -implementation of ``composite_in_sql`` will consult in order to choose -between the two options. ``contenttypes`` and ``GenericForeignKey`` @@ -588,6 +418,98 @@ composite primary key containing any special columns. This should be extremely rare anyway. +QuerySet filtering +~~~~~~~~~~~~~~~~~~ + +This is where the real fun begins. + +The fundamental problem here is that Q objects which are used all over the +code that handles filtering are designed to describe single field lookups. +On the other hand, CompositeFields will require a way to describe several +individual field lookups by a single expression. + +Since the Q objects themselves have no idea about fields at all and the +actual field resolution from the filter conditions happens deeper down the +line, inside models.sql.query.Query, this is where we can handle the +filters properly. + +There is already some basic machinery inside Query.add_filter and +Query.setup_joins that is in use by GenericRelations, this is +unfortunately not enough. The optional extra_filters field method will be +of great use here, though it will have to be extended. + +Currently the only parameters it gets are the list of joins the +filter traverses, the position in the list and a negate parameter +specifying whether the filter is negated. 
The GenericRelation instance can +determine the value of the content type (which is what the extra_filters +method is used for) easily based on the model it belongs to. + +This is not the case for a CompositeField -- it doesn't have any idea +about the values used in the query. Therefore a new parameter has to be +added to the method so that the CompositeField can construct all the +actual filters from the iterable containing the values. + +Afterwards the handling inside Query is pretty straightforward. For +CompositeFields (and virtual fields in general) there is no value to be +used in the where node, the extra_filters are responsible for all +filtering, but since the filter should apply to a single object even after +join traversals, the aliases will be set up while handling the "root" +filter and then reused for each one of the extra_filters. + +This way of extending the extra_filters mechanism will allow the field +class to create conjunctions of atomic conditions. This is sufficient for +the "__exact" lookup type which will be implemented. + +Of the other lookup types, the only one that looks reasonable is "__in". +This will, however, have to be represented as a disjunction of multiple +"__exact" conditions since not all database backends support tuple +construction inside expressions. Therefore this lookup type will be left +out of this project as the mechanism would need much more work to make it +possible. 
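The expansion described above can be sketched in plain Python, without touching the ORM. This is only an illustration under stated assumptions: ``expand_exact`` and ``expand_in`` are hypothetical helper names, not existing Django API, and the real work would happen inside ``Query`` via the extended ``extra_filters`` hook. An ``__exact`` lookup becomes a conjunction of per-field conditions, and ``__in`` becomes a disjunction of such conjunctions.

```python
from typing import Any, Dict, Iterable, List, Sequence


def expand_exact(field_names: Sequence[str], value: Iterable[Any]) -> Dict[str, Any]:
    """Expand one composite ``__exact`` lookup into per-field lookups.

    The returned dict is a conjunction: every entry must match.
    (Hypothetical helper, for illustration only.)
    """
    value = tuple(value)  # any iterable of the right length is accepted
    if len(value) != len(field_names):
        raise ValueError(
            "expected %d values, got %d" % (len(field_names), len(value))
        )
    return {"%s__exact" % name: v for name, v in zip(field_names, value)}


def expand_in(
    field_names: Sequence[str], values: Iterable[Iterable[Any]]
) -> List[Dict[str, Any]]:
    """Expand a composite ``__in`` lookup into a disjunction of conjunctions.

    Each list item is one ``__exact`` conjunction; the items are meant to be
    OR-ed together by the caller.
    """
    return [expand_exact(field_names, value) for value in values]
```

In a Django project, each returned dict could presumably be wrapped as ``Q(**conditions)`` and the results combined with ``|``, which is the disjunctive form whose cost is discussed in the next section.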
+ +``__in`` lookups for ``CompositeField`` +======================================= + +The existing implementation of ``CompositeField`` handles ``__in`` lookups +in the generic, backend-independent ``WhereNode`` class and uses a +disjunctive normal form expression as in the following example:: + + SELECT a, b, c FROM tbl1, tbl2 + WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 5 AND c = 6); + +The problem with this solution is that in cases where the list of values +contains tens or hundreds of tuples, this DNF expression will be extremely +long and the database will have to evaluate it for each and every row, +without a possibility of optimizing the query. + +Certain database backends support the following alternative:: + + SELECT a, b, c FROM tbl1, tbl2 + WHERE (a, b, c) IN ((1, 2, 3), (4, 5, 6)); + +This would probably be the best option, but it can't be used by SQLite, +for instance. This is also the reason why the DNF expression was +implemented in the first place. + +In order to support this more natural syntax, the ``DatabaseOperations`` +needs to be extended with a method such as ``composite_in_sql``. + +However, this leaves the issue of the inefficient DNF unresolved for +backends without support for tuple literals. For such backends, the +following expression is proposed:: + + SELECT a, b, c FROM tbl1, tbl2 + WHERE EXISTS (SELECT a1, b1, c1 FROM (SELECT 1 AS a1, 2 AS b1, 3 AS c1 + UNION SELECT 4, 5, 6) AS vals + WHERE a1=a AND b1=b AND c1=c); + +Since both syntaxes are rather generic and at least one of them should fit +any database backend directly, a new flag will be introduced, +``DatabaseFeatures.supports_tuple_literals`` which the default +implementation of ``composite_in_sql`` will consult in order to choose +between the two options.
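A minimal sketch of how such a method might choose between the two forms is given below. It is a standalone function rather than a ``DatabaseOperations`` method, and its signature (``columns``, ``rows``, ``supports_tuple_literals``, ``placeholder``) is an assumption made for illustration; only the name ``composite_in_sql`` and the flag come from the text above. Column names are interpolated unquoted, values are passed as parameters, and the fallback uses ``UNION ALL`` with generated aliases ``v0``, ``v1``, ... which are assumed not to clash with real column names.

```python
import sqlite3
from typing import Any, Iterable, List, Sequence, Tuple


def composite_in_sql(
    columns: Sequence[str],
    rows: Iterable[Iterable[Any]],
    supports_tuple_literals: bool,
    placeholder: str = "%s",
) -> Tuple[str, List[Any]]:
    """Build a WHERE fragment for a multi-column ``IN`` lookup.

    Returns ``(sql, params)``. Hypothetical helper, not existing Django API.
    """
    rows = [tuple(row) for row in rows]
    if not rows:
        raise ValueError("at least one value tuple is required")
    if any(len(row) != len(columns) for row in rows):
        raise ValueError("every value tuple must match the number of columns")

    # Flatten the values in the same order the placeholders are emitted.
    params = [value for row in rows for value in row]
    row_placeholders = ", ".join([placeholder] * len(columns))

    if supports_tuple_literals:
        # (a, b) IN ((%s, %s), (%s, %s))
        tuples = ", ".join(["(%s)" % row_placeholders] * len(rows))
        return "(%s) IN (%s)" % (", ".join(columns), tuples), params

    # Fallback: correlate the outer row against an inline table of values.
    aliases = ["v%d" % i for i in range(len(columns))]
    first = "SELECT " + ", ".join(
        "%s AS %s" % (placeholder, alias) for alias in aliases
    )
    rest = ["SELECT " + row_placeholders] * (len(rows) - 1)
    values_sql = " UNION ALL ".join([first] + rest)
    match = " AND ".join(
        "vals.%s = %s" % (alias, column) for alias, column in zip(aliases, columns)
    )
    return "EXISTS (SELECT 1 FROM (%s) AS vals WHERE %s)" % (values_sql, match), params


def demo() -> List[Tuple[Any, ...]]:
    """Run the fallback form against an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (a INTEGER, b TEXT)")
    conn.executemany(
        "INSERT INTO tbl (a, b) VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")]
    )
    where, params = composite_in_sql(
        ["a", "b"],
        [(1, "x"), (3, "z"), (2, "nope")],
        supports_tuple_literals=False,
        placeholder="?",
    )
    result = conn.execute(
        "SELECT a, b FROM tbl WHERE " + where + " ORDER BY a", params
    ).fetchall()
    conn.close()
    return result
```

Keeping the decision in one function means a backend only has to set the feature flag, or override the method entirely if neither form fits.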
+ + Database introspection, ``inspectdb`` ===================================== From 710639a25768e787298a044f371b94097ed85c5f Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 15:47:53 +0600 Subject: [PATCH 17/80] organization --- draft/orm-field-api-related-improvement.rst | 171 +++----------------- 1 file changed, 23 insertions(+), 148 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 3d492b71..a58fcfc9 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -42,6 +42,21 @@ The proposed solution is using Virtualfield type, and necessary VirtualField desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. +Notes on Porting previous work on top of master: +================================================ +Considering the huge changes in ORM internals it is not practical and trivial +to try and rebase the previous works related to ForeignKey refactor and +CompositeKey without figuring out new approach on top of master and present +ORM internals design. + +A better approach would be to Improve Field API, RealtionField API and model._meta +first. +Later imlement VirtualField type to first and star refactor of ``ForeignKey`` +and implement CompositeField as the next step. This will result in a better +maintainable development branch and a cleaner revision history, making it easier +to review the work before its eventual inclusion into Django. + + Abstract ========== This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. 
There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API @@ -53,7 +68,7 @@ To keep thing sane I will try to split the Dep in 3 major Part: 3. CompositeField API formalization -Key concerns of New Approach to implement ``CompositeField`` +Key steps of New Approach to improve ORM Field API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, BaseField etc and change on ORM based on the splitted API. @@ -90,33 +105,14 @@ Key concerns of New Approach to implement ``CompositeField`` 15. Update in AutoField -Porting previous work on top of master -====================================== -The first major task of this project is to take the code written as part -of GSoC 2013 and compare it aganist master to have Idea of valid part. -The order in which It was implemented few years ago was to implement -``CompositeField`` first and then a refactor of ``ForeignKey`` which -is required to make it support ``CompositeField``. This turned out to be -inefficient with respect to the development process, because some parts of -the refactor broke the introduced ``CompositeField`` functionality, -meaning that it was needed effectively reimplement parts of it again. - -Also, some abstractions introduced by the refactor made it possible to -rewrite certain parts in a cleaner way than what was necessary for -``CompositeField`` alone (e.g. database creation or certain features of -``model._meta``). - -I am convinced that a better approach would be to Improve Field API and RealtionField API and later imlement VirtualField type to first do the required refactor of ``ForeignKey`` -and implement CompositeField as the next step. 
This will result in a better -maintainable development branch and a cleaner revision history, making it easier -to review the work before its eventual inclusion into Django. - - -Specification: +Specifications: =============== +Part-1: +======= + New split out Field API ========================= 1. BaseField: @@ -141,19 +137,15 @@ or any virtual type field can be benefitted from VirtualField. 5. RelationField: ----------------- - - - - - - - 6. CompositeField: ------------------ A composite field can be implemented based on BaseField and VirtualField to solve the CompositeKey/Multi column PrimaryKey issue. +Part-2: +======= + Introduce standalone ``VirtualField`` ===================================== @@ -245,41 +237,11 @@ Changes in ``RelationField`` ============================= -Summary of ``CompositeField`` -============================= - - - - -Alternative Approach of compositeFiled -======================================= Implementation -------------- - - - -CompositeField.primary_key -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -As with db_index and unique, the backend table generating code will have -to be updated to set the PRIMARY KEY to a tuple. In this case, however, -the impact on the rest of the ORM and some other parts of Django is more -serious. - -A (hopefully) complete list of things affected by this is: -- the admin: the possibility to pass the value of the primary key as a - parameter inside the URL is a necessity to be able to work with a model -- contenttypes: since the admin uses GenericForeignKeys to log activity, - there will have to be some support -- forms: more precisely, ModelForms and their ModelChoiceFields -- relationship fields: ForeignKey, ManyToManyField and OneToOneField will - need a way to point to a model with a CompositeField as its primary key - -Let's look at each one of them in more detail. - Admin ~~~~~ @@ -595,91 +557,4 @@ be implemented after successful finish of this project. 
With this feature the autogenerated intermediary M2M model could make the two ForeignKeys its primary key, dropping the need to have a redundant id AutoField. -Updatable primary keys in models -================================ - -The algorithm that determines what kind of database query to issue on -``model.save()`` is a fairly simple and well-documented one [6]_. If a -row exists in the database with the value of its primary key equal to -the saved object, it is updated, otherwise a new row is inserted. This -behavior is intuitive and works well for models where the primary key is -automatically created by the framework (be it an ``AutoField`` or a parent -link in the case of model inheritance). - -However, as soon as the primary key is explicitly created, the behavior -becomes less intuitive and might be confusing, for example, to users of the -admin. For instance, say we have the following model:: - - class Person(models.Model): - first_name = models.CharField(max_length=47) - last_name = models.CharField(max_length=47) - shoe_size = models.PositiveSmallIntegerField() - - full_name = models.CompositeField(first_name, last_name, - primary_key=True) - -Then we register the model in the admin using the standard one-liner:: - - admin.site.register(Person) - -Since we haven't excluded any fields, all three fields will be editable in -the admin. Now, suppose there's an instance whose ``full_name`` is -``CompositeValue(first_name='Darth', last_name='Vadur')``. A user decides -to fix the last name using the admin, hits the “Save” button and instead -of fixing an existing record, a new one will appear with the new value, -while the old one remains untouched. This behavior is clearly broken from -the point of view of the user. - -It can be argued that it is the developer's fault that the database schema -is poorly chosen and that they expose the primary key to their users. -While this may be true in some cases, it is still to some extent a -subjective matter. 
- -Therefore I propose a new behavior for ``model.save()`` where it would -detect a change in the instance's primary key and in that case issue an -``UPDATE`` for the right row, i.e. ``WHERE primary_key = previous_value``. - -Of course, just going ahead and changing the behavior in this way for all -models would be backwards incompatible. To do this properly, we would need -to make this an opt-in feature. This can be achieved in multiple ways. - -1) add a keyword argument such as ``update_pk`` to ``Model.save`` -2) add a new option to ``Model.Meta``, ``updatable_pk`` -3) make this a project-wide setting - -Option 3 doesn't look pleasant and I think I can safely eliminate that. -Option 2 is somewhat better, although it adds a new ``Meta`` option. -Option 1 is the most flexible solution, however, it does not change the -behavior of the admin, at least not by default. This can be worked around -by overriding the ``save`` method to use a different default:: - - class MyModel(models.Model): - def save(self, update_pk=True, **kwargs): - kwargs['update_pk'] = update_pk - return super(MyModel, self).save(**kwargs) - -To avoid the need to repeat this for each model, a class decorator might -be provided to perform this automatically. - -In order to implement this new behavior a little bit of extra complexity -would have to be added to models. Model instances would need to store the -last known value of the primary key as retrieved from the database. On -save it would just find out whether the last known value is present and in -that case issue an ``UPDATE`` using the old value in the ``WHERE`` -condition. - -So far so good, this could be implemented fairly easily. However, the -problem becomes considerably more difficult as soon as we take into -account the fact that updating a primary key value may break foreign key -references. In order to avoid breaking references the ``on_delete`` -mechanism of ``ForeignKey`` would have to be extended to support updates -as well. 
This means that the collector used by deletion will need to be -extended as well. - -The problem becomes particularly nasty if we realize that a ``ForeignKey`` -might be part of a primary key, which means the collector needs to keep -track of which field depends on which in a graph of potentially unlimited -size. Compared to this, deletion is simpler as it only needs to find a -list of all affected model instances as opposed to having to keep track of -which field to update using which value. From 3f48c64e26f906765c2b11cb44b1bfc4f82de0b0 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 16:04:48 +0600 Subject: [PATCH 18/80] simple and focused --- draft/orm-field-api-related-improvement.rst | 83 +-------------------- 1 file changed, 3 insertions(+), 80 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index a58fcfc9..8a885b2d 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -26,20 +26,7 @@ behaviour This type of design limitation made it difficult to add support for composite primarykey or working with relationField/genericRelations very annoying as it produces inconsistant behaviour and a very hard implementation to maintain. -Also there are such case is the many-to-many intermediary model. Even though -the pair of ForeignKeys in this model identifies uniquely each relationship, -an additional field is required by the ORM to identify individual rows. While -this isn't a real problem when the underlying database schema is created -by Django, it becomes an obstacle as soon as one tries to develop a Django -application using a legacy database. - -Since there is already a lot of code relying on the pk property of model -instances and the ability to use it in QuerySet filters, it is necessary -to implement a mechanism to allow filtering of several actual fields by -specifying a single filter. 
- -The proposed solution is using Virtualfield type, and necessary VirtualField -desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. +The proposed solution is using Cleanup/provisional RealatedField API, Virtualfield type, and necessary VirtualField desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. Notes on Porting previous work on top of master: @@ -143,6 +130,7 @@ A composite field can be implemented based on BaseField and VirtualField to solv the CompositeKey/Multi column PrimaryKey issue. + Part-2: ======= @@ -236,12 +224,6 @@ notes. Changes in ``RelationField`` ============================= - - - -Implementation --------------- - Admin ~~~~~ @@ -429,7 +411,7 @@ construction inside expressions. Therefore this lookup type will be left out of this project as the mechanism would need much more work to make it possible. -``__in`` lookups for ``CompositeField`` +``__in`` lookups for ``VirtualField`` ======================================= The existing implementation of ``CompositeField`` handles ``__in`` lookups @@ -472,65 +454,6 @@ implementation of ``composite_in_sql`` will consult in order to choose between the two options. -Database introspection, ``inspectdb`` -===================================== - -There are three main goals concerning database introspection in this -project. The first is to ensure the output of ``inspectdb`` remains the -same as it is now for models with simple primary keys and simple foreign -key references, or at least equivalent. While this shouldn't be too -difficult to achieve, it will still be regarded with high importance. - -The second goal is to extend ``inspectdb`` to also create a -``CompositeField`` in models where the table contains a composite primary -key. 
This part shouldn't be too difficult, -``DatabaseIntrospection.get_primary_key_column`` will be renamed to -``get_primary_key`` which will return a tuple of columns and in case the -tuple contains more than one element, an appropriate ``CompositeField`` -will be added. This will also require updating -``DatabaseWrapper.check_constraints`` for certain backends since it uses -``get_primary_key_column``. - -The third goal is to also make ``inspectdb`` aware of composite foreign -keys. This will need a rewrite of ``get_relations`` which will have to -return a mapping between tuples of columns instead of single columns. It -should also ensure each tuple of columns pointed to by a foreign key gets -a ``CompositeField``. This part will also probably require some changes in -other backend methods as well, especially since each backend has a unique -tangle of introspection methods. - -This part requires a tremendous amount of work, because practically every -single change needs to be done four times and needs separate research of -the specific backend in question. Therefore I can't promise to deliver full support -for all features mentioned in this section for all backends. I'd say -backwards compatibility is a requirement, recognition of composite primary -keys is a highly wanted feature that I'll try to implement for as many -backends as possible and recognition of composite foreign keys would be a -nice extra to have for at least one or two backends. - -I'll be implementing the features for the individual backends in the -following order: PostgreSQL, MySQL, SQLite and Oracle. I put PostgreSQL -first because, well, this is the backend with the best support in Django -(and also because it is the one where I'd actually use the features I'm -proposing). Oracle comes last because I don't have any way to test it and -I'm afraid I'd be stabbing in the dark anyway. Of the two remaining -backends I put MySQL first for two reasons. 
First, I don't think people -need to run ``inspectdb`` on SQLite databases too often (if ever). Second, -on MySQL the task seems marginally easier as the database has -introspection features other than just “give me the SQL statement used to -create this table”, whose parsing is most likely going to be a complete -mess. - -All in all, extending ``inspectdb`` features is a tedious and difficult -task with shady outcome, which I'm well aware of. Still, I would like to -try to at least implement the easier parts for the most used backends. It -might quite possibly turn out that I won't manage to implement more than -composite primary key detection for PostgreSQL. This is the reason I keep -this as one of the last features I intend to work on, as shown in the -timeline. It isn't a necessity, we can always just add a note to the docs -that ``inspectdb`` just can't detect certain scenarios and ask people to -edit their models manually. - Other considerations -------------------- From f00713d9ca0085f1ae74d91aee369ec7649f1875 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 19 Mar 2017 14:45:13 +0600 Subject: [PATCH 19/80] modifications --- draft/orm-field-api-related-improvement.rst | 111 ++++++++++---------- 1 file changed, 58 insertions(+), 53 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 8a885b2d..67058408 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -8,7 +8,7 @@ DEP : ORM Fields API & Related Improvements :Shepherd: Django Core Team :Status: Draft :Type: Feature/Cleanup/Optimization -:Created: 2017-3-18 +:Created: 2017-3-5 :Last-Modified: 2017-00-00 .. 
contents:: Table of Contents @@ -23,33 +23,41 @@ however, there are some historical design limitations and many inconsistant implementation in orm relation fields API which produce many inconsistant behaviour -This type of design limitation made it difficult to add support for composite primarykey or working with relationField/genericRelations very annoying as it -produces inconsistant behaviour and a very hard implementation to maintain. +This type of design limitation made it difficult to add support for composite primarykey or working +with relationField/genericRelations very annoying as they produces inconsistant behaviour and a +their implementaion is hard to maintain sue to many special casing. -The proposed solution is using Cleanup/provisional RealatedField API, Virtualfield type, and necessary VirtualField desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. +In order to fix this design limitations and inconsistant API's the proposed solution is to introduce REAL +VirtualField types and refactor Fields/RelationFields API based on virtualFields type. Notes on Porting previous work on top of master: ================================================ -Considering the huge changes in ORM internals it is not practical and trivial -to try and rebase the previous works related to ForeignKey refactor and -CompositeKey without figuring out new approach on top of master and present -ORM internals design. +Considering the huge changes in ORM internals it is neither practical nor trivial +to rebase & port previous works related to ForeignKey refactor and CompositeKey without +figuring out new approach based on present ORM internals design on top of master. -A better approach would be to Improve Field API, RealtionField API and model._meta -first. -Later imlement VirtualField type to first and star refactor of ``ForeignKey`` -and implement CompositeField as the next step. 
This will result in a better -maintainable development branch and a cleaner revision history, making it easier -to review the work before its eventual inclusion into Django. +A better approach would be to Improve Field API, major cleanup of RealtionField API, model._meta, +and internal field_valaue_cache and related areas first. +Later after completing the major clean ups of Fields/RelationFields a REAL VirtualField type should be +introduced and VirtualField based refactor of ForeignKey and relationFields should take place. + +This appraoch should keep things sane and easier to approach on smaller chunks. + +Later any VirtualField derived Field like CompositeField implementation should be less complex after the completion of virtualField based refactors. Abstract ========== -This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API -and design everything as much simple and small as possible to be able to implement separately. +This DEP aims to improve different part of django ORM and ot associated parts of django to support Real VirtualField +type in django. There were several attempt to fix this problem before. So in this Dep we will try to follow the suggested +approaches from Michal Patrucha's previous works and suggestions in tickets and IRC chat/mailing list. Few other related +tickets were also analyzed to find out the proper ways and API design. 
+ +The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. + +To keep thing sane it would be bette to split the Dep in 3 major Part: -To keep thing sane I will try to split the Dep in 3 major Part: 1. Logical refactor of present Field API and RelationField API 2. VirtualField Based refactor 3. CompositeField API formalization @@ -89,8 +97,6 @@ Key steps of New Approach to improve ORM Field API internals: 14. SubField/AuxilaryField -15. Update in AutoField - @@ -223,40 +229,6 @@ notes. Changes in ``RelationField`` ============================= - -Admin -~~~~~ - -The solution that has been proposed so many times in the past [2], [3] is -to extend the quote function used in the admin to also quote the comma and -then use an unquoted comma as the separator. Even though this solution -looks ugly to some, I don't think there is much choice -- there needs to -be a way to separate the values and in theory, any character could be -contained inside a value so we can't really avoid choosing one and -escaping it. - -GenericForeignKeys -~~~~~~~~~~~~~~~~~~ - -Even though the admin uses the contenttypes framework to log the history -of actions, it turns out proper handling on the admin side will make -things work without the need to modify GenericForeignKey code at all. This -is thanks to the fact that the admin uses only the ContentType field and -handles the relations on its own. Making sure the unquoting function -recreates the whole CompositeObjects where necessary should suffice. - -At a later stage, however, GenericForeignKeys could also be improved to -support composite primary keys. Using the same quoting solution as in the -admin could work in theory, although it would only allow fields capable of -storing arbitrary strings to be usable for object_id storage. This has -been left out of the scope of this project, though. 
- -ModelChoiceFields -~~~~~~~~~~~~~~~~~ - -Again, we need a way to specify the value as a parameter passed in the -form. The same escaping solution can be used even here. - Relationship fields ~~~~~~~~~~~~~~~~~~~ @@ -362,6 +334,23 @@ composite primary key containing any special columns. This should be extremely rare anyway. +GenericForeignKeys +~~~~~~~~~~~~~~~~~~ + +Even though the admin uses the contenttypes framework to log the history +of actions, it turns out proper handling on the admin side will make +things work without the need to modify GenericForeignKey code at all. This +is thanks to the fact that the admin uses only the ContentType field and +handles the relations on its own. Making sure the unquoting function +recreates the whole CompositeObjects where necessary should suffice. + +At a later stage, however, GenericForeignKeys could also be improved to +support composite primary keys. Using the same quoting solution as in the +admin could work in theory, although it would only allow fields capable of +storing arbitrary strings to be usable for object_id storage. This has +been left out of the scope of this project, though. + + QuerySet filtering ~~~~~~~~~~~~~~~~~~ @@ -453,6 +442,22 @@ any database backend directly, a new flag will be introduced, implementation of ``composite_in_sql`` will consult in order to choose between the two options. +ModelChoiceFields +~~~~~~~~~~~~~~~~~ + +Again, we need a way to specify the value as a parameter passed in the +form. The same escaping solution can be used even here. + +Admin +~~~~~ + +The solution that has been proposed so many times in the past [2], [3] is +to extend the quote function used in the admin to also quote the comma and +then use an unquoted comma as the separator. 
Even though this solution +looks ugly to some, I don't think there is much choice -- there needs to +be a way to separate the values and in theory, any character could be +contained inside a value so we can't really avoid choosing one and +escaping it. Other considerations From f6d08f9cce1e7abaa8e6fab2d30f72d65107983b Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Wed, 22 Mar 2017 12:32:26 +0600 Subject: [PATCH 20/80] changes about related field clean up --- draft/orm-field-api-related-improvement.rst | 108 ++++++++++++++++++-- 1 file changed, 98 insertions(+), 10 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 67058408..022e8dad 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -58,9 +58,11 @@ The main motivation of this Dep's approach is to improve django ORM's Field API To keep thing sane it would be bette to split the Dep in 3 major Part: -1. Logical refactor of present Field API and RelationField API +1. Logical refactor of present Field API and RelationField API and make + them consistant + 2. VirtualField Based refactor -3. CompositeField API formalization + Key steps of New Approach to improve ORM Field API internals: @@ -93,7 +95,7 @@ Key steps of New Approach to improve ORM Field API internals: 12. Make sure new class based Index API ise used properly with refactored Field API. -13. Consider Database Contraints work of lan-foote and +13. Consider Database Contraints work 14. SubField/AuxilaryField @@ -122,18 +124,104 @@ ConcreteField will have all the common attributes of a Regular concrete field Presence base Field class with should refactored using BaseField and ConcreteField. If it is decided to provide the optional virtual type to regular fields then VirtualField's features can also be added to specific fields. -4. VirtualField: +4. RelationField: +----------------- + +5. 
VirtualField: ---------------- A true stand alone virtula field will be added to the system to be used to solve some long standing design limitations of django orm. initially RelationFields, GenericRelations etc will be benefitted by using VirtualFields and later CompositeField or any virtual type field can be benefitted from VirtualField. -5. RelationField: ------------------ +Relation Field API clean up: +============================ + +How relation works in django now: +================================= +Before defining clean up mechanism, lets jump into how relations work in django + +A relation in Django consits of: + - The originating field itself + - A descriptor to access the objects of the relation + - The descriptor might need a custom manager + - Possibly a remote relation field (the field to travel the relation in other direction) + Note that this is different from the target and source fields, which define which concrete fields this relation use (essentially, which columns to equate in the JOIN condition) + - The remote field can also contain a descriptor and a manager. + - For deprecation period, field.rel is a bit like the remote field, but without + actually being a field instance. This is created only in the origin field, the remote field doesn't have a rel (as we don't need backwards compatibility + for the remote fields) + + The loading order is as follows: + - The origin field is created as part of importing the class (or separately + by migrations). + - The origin field is added to the origin model's meta (the field's contribute_to_class is called). + - When both the origin and the remote classes are loaded, the remote field is created and the descriptors are created. The remote field is added to the + target class' _meta + - For migrations it is possible that a model is replaced live in the app-cache. For example, + assume model Author is changed, and it is thus reloaded. 
Model Book has foreign key to + Author, so its reverse field must be recreated in the Author model, too. The way this is + done is that we collect all fields that have been auto-created as relationships into the + Author model, and recreate the related field once Author has been reloaded. + + Example: + + class Author(models.Model): + pass + + class Book(models.Model): + author = models.ForeignKey(Author) + + 1. Author is seen, and thus added to the appconfig. + 2. Book is seen, the field author is seen. + - The author field is created and assigned to Book's class level variable author. + - The author field's rel instance is created at the same time the field is created. + - The metaclass loading for models sees the field instance in Book's attrs, + and the field is added the class, that is author's contribute_to_class is called. + - In the contribute_to_class method, the field is added to Book's meta. + - As last step of contribut_to_class method the prepare_remote() method + is added as a lazy loaded method. It will be called when both Book and + Author are ready. As it happens, they are both ready in the example, + so the method is called immediately.If the Author model was defined later + than Book, and Book had a string reference to Author, then the method would + be called only after Author was ready. + 3. The prepare_remote() method is called. + - The remote field is created based on attributes of the origin field. + The field is added to the remote model (the field's contribute_to_class + is called) + - The post_relation_ready() method is called for both the origin and the remote field. 
This will create the descriptor on both the origin and remote field + (unless the remote relation is hidden, in which case no descriptor is created) + +Clean up Relation API to make it consistant: +============================================ +The problem is that when using get_fields(), you'll get either a +field.rel instance (for the reverse side of user-defined fields), or +a real field instance (for example ForeignKey). These behave +differently, so the user must always remember which one +they are dealing with. This creates a lot of unnecessary conditional code +in multiple places in +Django. + +For example, the select_related descent has one branch for descending foreign +keys and one-to-one fields, and another branch for descending to reverse +one-to-one fields. Conceptually, one-to-one and reverse one-to-one fields +are very similar, so this complication is unnecessary. + +The idea is to deprecate field.rel, and instead add field.remote_field. +The remote_field is just a field subclass, just like everything else +in Django. + +The benefits are: +Conceptual simplicity - dealing with both fields and rels is unnecessary and confusing. Everything from get_fields() should be a field. +Code simplicity - no special casing based on whether a given relation is described +by a rel or not. +Code reuse - ReverseManyToManyField is in most regards exactly like +ManyToManyField. + +The expected problems are mostly from 3rd party code. Users of _meta that +already work on expectation of getting rel instances will likely need updating. +Those users who subclass Django's fields (or duck-type Django's fields) will +need updating. Examples of such projects include django-rest-framework and django-taggit. + + -6. CompositeField: ------------------- -A composite field can be implemented based on BaseField and VirtualField to solve -the CompositeKey/Multi column PrimaryKey issue.
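Note on the relation API cleanup proposed in the patch above: the following is a minimal plain-Python sketch of the special casing it describes, not Django code. ``Field``, ``Rel``, ``ReverseField``, ``make_relation`` and the two ``describe_*`` helpers are hypothetical stand-ins invented for illustration, and models are represented by plain strings. It assumes only what the text states: today the reverse side is a rel object that is not a field, while the proposal makes the reverse side a field reachable through ``remote_field``.

```python
# Hypothetical stand-ins for illustration only; these are not Django classes.


class Field:
    """A forward field declared on a model, e.g. Book.author."""

    def __init__(self, name, model, related_model):
        self.name = name
        self.model = model
        self.related_model = related_model
        self.remote_field = None


class Rel:
    """Legacy reverse-side object: describes a relation but is not a Field."""

    def __init__(self, field, accessor_name):
        self.field = field
        self.accessor_name = accessor_name


def describe_legacy(item):
    """Callers have to branch on the type of every item they receive."""
    if isinstance(item, Rel):
        # Reverse side: everything is read through the originating field.
        return (item.accessor_name, item.field.related_model, item.field.model)
    return (item.name, item.model, item.related_model)


class ReverseField(Field):
    """Proposed shape: the reverse side is itself a Field subclass."""


def make_relation(name, model, related_model, reverse_name):
    """Create both sides of a relation and link them via remote_field."""
    forward = Field(name, model, related_model)
    reverse = ReverseField(reverse_name, related_model, model)
    forward.remote_field = reverse
    reverse.remote_field = forward
    return forward, reverse


def describe_unified(field):
    """With fields on both sides, one code path is enough."""
    return (field.name, field.model, field.related_model)
```

With this shape, code that walks relations (the select_related example in the text) could treat both directions the same way, which is the "code simplicity" benefit the patch lists.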
From 57107adf84f320708ff9cb957a0b6a49da94f833 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 23 Mar 2017 00:50:39 +0600 Subject: [PATCH 21/80] define virtualfield --- draft/orm-field-api-related-improvement.rst | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 022e8dad..ba27a00c 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -221,6 +221,9 @@ already work on expectation of getting rel instances will likely need updating. Those users who subclass Django's fields (or duck-type Django's fields) will need updating. Examples of such projects include django-rest-framework and django-taggit. +Proposed API and workflow for cleanups: +========================================== + @@ -230,6 +233,10 @@ Part-2: Introduce standalone ``VirtualField`` ===================================== +What is ``VirtualField``? +------------------------- +"A virtual field is a model field that correlates to one or multiple +concrete fields, but doesn't add or alter columns in the database."
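Note on the ``VirtualField`` definition added in the patch above: the idea can be illustrated without Django. The following is a minimal sketch, assuming a virtual field behaves like a descriptor that reads and writes its enclosed concrete attributes and stores nothing itself. ``VirtualPair`` and ``Record`` are hypothetical names; the namedtuple return value mirrors the ``CompositeValue`` behaviour described in the ``CompositeField`` summary earlier in this series.

```python
from collections import namedtuple


class VirtualPair:
    """Hypothetical virtual field: it maps onto concrete attributes and
    owns no storage (no "column") of its own."""

    def __init__(self, *names):
        self.names = names
        self.value_type = namedtuple("CompositeValue", names)

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Build the value on the fly from the concrete attributes.
        return self.value_type(*(getattr(instance, n) for n in self.names))

    def __set__(self, instance, values):
        values = tuple(values)  # accept any iterable of the right length
        if len(values) != len(self.names):
            raise ValueError("expected %d values" % len(self.names))
        for name, value in zip(self.names, values):
            setattr(instance, name, value)


class Record:
    # The virtual field only refers to the two concrete attributes below.
    key = VirtualPair("first_field", "second_field")

    def __init__(self, first_field, second_field):
        self.first_field = first_field
        self.second_field = second_field
```

Assigning to ``record.key`` updates ``first_field`` and ``second_field``, and ``vars(record)`` never gains a ``key`` entry, which is the plain-Python analogue of "doesn't add or alter columns in the database".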
From cb761e91b33612a2a38c752af81acdf1b77d073e Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 23 Mar 2017 12:24:39 +0600 Subject: [PATCH 22/80] rewording --- draft/orm-field-api-related-improvement.rst | 69 +++++++++++---------- 1 file changed, 35 insertions(+), 34 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index ba27a00c..3e9011b8 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -1,6 +1,6 @@ -========================================================= -DEP : ORM Fields API & Related Improvements -========================================================= +============================================================== +DEP : ORM Relation Fields API Improvements using VirtualField +============================================================== :DEP: 0201 :Author: Asif Saif Uddin @@ -19,49 +19,60 @@ DEP : ORM Fields API & Related Improvements Background: =========== Django's ORM is a simple & powerful tool which suits most use-cases, -however, there are some historical design limitations and many inconsistant +however, there are some historical design limitations and inconsistant implementation in orm relation fields API which produce many inconsistant -behaviour +behaviour. -This type of design limitation made it difficult to add support for composite primarykey or working -with relationField/genericRelations very annoying as they produces inconsistant behaviour and a -their implementaion is hard to maintain sue to many special casing. +This type of design limitation made it difficult to add support for +composite primarykey or working with relationField/genericRelations +very annoying as they produces inconsistant behaviour and their +implementaion is hard to maintain due to many special casing. 
-In order to fix this design limitations and inconsistant API's the proposed solution is to introduce REAL -VirtualField types and refactor Fields/RelationFields API based on virtualFields type. +In order to fix this design limitations and inconsistant API's the proposed +solution is to introduce REAL VirtualField types and refactor +Fields/RelationFields API based on virtualFields type. Notes on Porting previous work on top of master: ================================================ -Considering the huge changes in ORM internals it is neither practical nor trivial -to rebase & port previous works related to ForeignKey refactor and CompositeKey without -figuring out new approach based on present ORM internals design on top of master. +Considering the huge changes in ORM internals it is neither practical nor +trivial to rebase & port previous works related to ForeignKey refactor and CompositeKey without figuring out new approach based on present ORM internals +design on top of master. -A better approach would be to Improve Field API, major cleanup of RealtionField API, model._meta, -and internal field_valaue_cache and related areas first. +A better approach would be to Improve Field API, major cleanup of RealtionField +API, model._meta and internal field_valaue_cache and related areas first. -Later after completing the major clean ups of Fields/RelationFields a REAL VirtualField type should be -introduced and VirtualField based refactor of ForeignKey and relationFields should take place. +After completing the major clean ups of Fields/RelationFields a REAL +VirtualField type should be introduced and VirtualField based refactor +of ForeignKey and relationFields could done. This appraoch should keep things sane and easier to approach on smaller chunks. -Later any VirtualField derived Field like CompositeField implementation should be less complex after the completion of virtualField based refactors. 
+Later any VirtualField derived Field like CompositeField implementation +should be less complex after the completion of virtualField based refactors. + Abstract ========== -This DEP aims to improve different part of django ORM and ot associated parts of django to support Real VirtualField -type in django. There were several attempt to fix this problem before. So in this Dep we will try to follow the suggested -approaches from Michal Patrucha's previous works and suggestions in tickets and IRC chat/mailing list. Few other related +This DEP aims to improve different part of django ORM and ot associated +parts of django to support Real VirtualFieldtype in django. There were +several attempt to fix this problem before. So in this Dep we will try +to follow the suggested approaches from Michal Patrucha's previous works +and suggestions in tickets and IRC chat/mailing list. Few other related tickets were also analyzed to find out the proper ways and API design. -The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. +The main motivation of this Dep's approach is to improve django ORM's +Field API and design everything as much simple and small as possible +to be able to implement separately. -To keep thing sane it would be bette to split the Dep in 3 major Part: +To keep thing sane it would be better to split the Dep in 3 major Part: 1. Logical refactor of present Field API and RelationField API and make them consistant -2. VirtualField Based refactor +2. Fields internal value cache refactor for relation fields + +3. VirtualField Based refactor @@ -101,7 +112,6 @@ Key steps of New Approach to improve ORM Field API internals: - Specifications: =============== @@ -375,11 +385,9 @@ and traversal in both directions will be supported by the query code. 
- ``contenttypes`` and ``GenericForeignKey`` ========================================== - It's fairly easy to represent composite values as strings. Given an ``escape`` function which uniquely escapes commas, something like the following works quite well:: @@ -574,10 +582,3 @@ reasons: - there aren't really any form fields usable for tuples and a fieldset would require even more out-of-scope machinery -The CompositeField will not allow enclosing other CompositeFields. The -only exception might be the case of composite ForeignKeys which could also -be implemented after successful finish of this project. With this feature -the autogenerated intermediary M2M model could make the two ForeignKeys -its primary key, dropping the need to have a redundant id AutoField. - - From 06cf082ed073467df4d9ea98e5dc50dc0ecb011f Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 23 Mar 2017 12:52:40 +0600 Subject: [PATCH 23/80] addressed another limitation of present related api related to dorect reverse relation --- draft/orm-field-api-related-improvement.rst | 91 +++++++++++++++++---- 1 file changed, 76 insertions(+), 15 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 3e9011b8..2d0817cb 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -200,6 +200,81 @@ A relation in Django consits of: - The post_relation_ready() method is called for both the origin and the remote field. This will create the descriptor on both the origin and remote field (unless the remote relation is hidden, in which case no descriptor is created) +Another limitation is, + +Django supports many-to-one relationships -- the foreign keys live on +the "many", and point to the "one". 
So, in a simple app where you +have Comments that can get Flagged, one Comment can have many Flag's, +but each Flag refers to one and only one Comment: + +class Comment(models.Model): + text = models.TextField() + +class Flag(models.Model): + comment = models.ForeignKey(Comment) + +However, there are circumstances where it's much more convenient to +express the relationship as a one-to-many relationship. Suppose, for +example, you want to have a generic "flagging" app which other models +can use: + +class Comment(models.Model): + text = models.TextField() + flags = models.OneToMany(Flag) + +That way, if you had a new content type (say, a "Post"), it could also +participate in flagging, without having to modify the model definition +of "Flag" to add a new foreign key. Without baking in migrations, +there's obviously no way to make the underlying SQL play nice in this +circumstance: one-to-many relationships with just two tables can only +be expressed in SQL with a reverse foreign key relationship. However, +it's possible to describe OneToMany as a subset of ManyToMany, with a +uniqueness constraint on the "One" -- we rely on the join table to +handle the relationship: + +class Comment(models.Model): + text = models.TextField() + flags = models.ManyToMany(Flag, through=CommentFlag) + +class CommentFlag(models.Model): + comment = models.ForeignKey(Comment) + flag = models.ForeignKey(Flag, unique=True) + +While this works, the query interface remains cumbersome. To access +the comment from a flag, I have to call: + +comment = flag.comment_set.all()[0] + +as the ORM doesn't know for a fact that each flag could only have one +comment. 
But Django _could_ implement a OneToManyField in this way +(using the underlying ManyToMany paradigm), and provide sugar such +that this would all be nice and flexible, without having to do cumbersome +ORM calls or explicitly define extra join tables: + +class Comment(models.Model): + text = models.TextField() + flags = models.OneToMany(Flag) + +class Post(models.Model): + body = models.TextField() + flags = models.OneToMany(Flag) + +# in a separate reusable app... +class Flag(models.Model): + reason = models.TextField() + resolved = models.BooleanField() + +# in a view... +comment = flag.comment +post = flag.post + +It's obviously less database efficient than simple 2-table reverse +ForeignKey relationships, as you have to do an extra join on the third +table; but you gain semantic clarity and a nice way to use it in +reusable apps, so in many circumstances it's worth it. And it's a +fair shake clearer than the existing generic foreign key solutions. + + Clean up Relation API to make it consistant: ============================================ The problem is that when using get_fields(), you'll get either a @@ -503,6 +578,7 @@ construction inside expressions. Therefore this lookup type will be left out of this project as the mechanism would need much more work to make it possible. + ``__in`` lookups for ``VirtualField`` ======================================= @@ -566,19 +642,4 @@ escaping it. Other considerations -------------------- -This infrastructure will allow reimplementing the GenericForeignKey as a -CompositeField at a later stage. Thanks to the modifications in the -joining code it should also be possible to implement bidirectional generic -relationship traversal in QuerySet filters. This is, however, out of scope -of this project. - -CompositeFields will have the serialize option set to False to prevent -their serialization. Otherwise the enclosed fields would be serialized -twice which would not only infer redundancy but also ambiguity.
- -Also CompositeFields will be ignored in ModelForms by default, for two -reasons: -- otherwise the same field would be inside the form twice -- there aren't really any form fields usable for tuples and a fieldset - would require even more out-of-scope machinery From 87cf4c078b5e91ec80c6372c2800e9099d98f25f Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Fri, 31 Mar 2017 01:06:52 +0600 Subject: [PATCH 24/80] modifications --- draft/orm-field-api-related-improvement.rst | 56 +++++++++------------ 1 file changed, 25 insertions(+), 31 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 2d0817cb..45717fac 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -33,50 +33,28 @@ solution is to introduce REAL VirtualField types and refactor Fields/RelationFields API based on virtualFields type. -Notes on Porting previous work on top of master: -================================================ -Considering the huge changes in ORM internals it is neither practical nor -trivial to rebase & port previous works related to ForeignKey refactor and CompositeKey without figuring out new approach based on present ORM internals -design on top of master. - -A better approach would be to Improve Field API, major cleanup of RealtionField -API, model._meta and internal field_valaue_cache and related areas first. - -After completing the major clean ups of Fields/RelationFields a REAL -VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could done. - -This appraoch should keep things sane and easier to approach on smaller chunks. - -Later any VirtualField derived Field like CompositeField implementation -should be less complex after the completion of virtualField based refactors. 
- - Abstract ========== This DEP aims to improve different part of django ORM and ot associated -parts of django to support Real VirtualFieldtype in django. There were +parts of django to support Real VirtualField type in django. There were several attempt to fix this problem before. So in this Dep we will try to follow the suggested approaches from Michal Patrucha's previous works and suggestions in tickets and IRC chat/mailing list. Few other related -tickets were also analyzed to find out the proper ways and API design. +tickets were also analyzed to find out possible way's of API design. -The main motivation of this Dep's approach is to improve django ORM's -Field API and design everything as much simple and small as possible -to be able to implement separately. -To keep thing sane it would be better to split the Dep in 3 major Part: +To keep thing sane it would be better to split the Dep in some major Parts: -1. Logical refactor of present Field API and RelationField API and make - them consistant +1. Logical refactor of present Field API and RelationField API, to make + them sipler and consistant with _meta API calls -2. Fields internal value cache refactor for relation fields +2. Fields internal value cache refactor for relation fields (may be) -3. VirtualField Based refactor +3. VirtualField Based refactor of RelationFields API -Key steps of New Approach to improve ORM Field API internals: +Key steps of to follow to improve ORM Field API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, BaseField etc and change on ORM based on the splitted API. @@ -108,7 +86,7 @@ Key steps of New Approach to improve ORM Field API internals: 13. Consider Database Contraints work -14. SubField/AuxilaryField +14. SubField/AuxilaryField [may be] @@ -642,4 +620,20 @@ escaping it. 
Other considerations -------------------- +Notes on Porting previous work on top of master: +================================================ +Considering the huge changes in ORM internals it is neither practical nor +trivial to rebase & port previous works related to ForeignKey refactor and CompositeKey without figuring out new approach based on present ORM internals +design on top of master. + +A better approach would be to Improve Field API, major cleanup of RealtionField +API, model._meta and internal field_valaue_cache and related areas first. +After completing the major clean ups of Fields/RelationFields a REAL +VirtualField type should be introduced and VirtualField based refactor +of ForeignKey and relationFields could done. + +This appraoch should keep things sane and easier to approach on smaller chunks. + +Later any VirtualField derived Field like CompositeField implementation +should be less complex after the completion of virtualField based refactors. From 76e3038076e62fe692c7574773b92c14232673a3 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Fri, 31 Mar 2017 01:41:48 +0600 Subject: [PATCH 25/80] more modifications --- draft/orm-field-api-related-improvement.rst | 61 +++++++++++---------- 1 file changed, 32 insertions(+), 29 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 45717fac..a9c85003 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -11,31 +11,28 @@ DEP : ORM Relation Fields API Improvements using VirtualField :Created: 2017-3-5 :Last-Modified: 2017-00-00 -.. contents:: Table of Contents - :depth: 3 - :local: Background: =========== -Django's ORM is a simple & powerful tool which suits most use-cases, -however, there are some historical design limitations and inconsistant -implementation in orm relation fields API which produce many inconsistant -behaviour. 
+Django's ORM is a simple & powerful tool which suits most use-cases. +However, historicaly it has some design limitations and complex internal +API which makes it not only hard to maintain but also produce inconsistant +behaviours. This type of design limitation made it difficult to add support for composite primarykey or working with relationField/genericRelations -very annoying as they produces inconsistant behaviour and their +very annoying as they don't produce consistant behaviour and their implementaion is hard to maintain due to many special casing. -In order to fix this design limitations and inconsistant API's the proposed -solution is to introduce REAL VirtualField types and refactor -Fields/RelationFields API based on virtualFields type. +In order to fix these design limitations and inconsistancies, the proposed +solution is to refactor Fields/RelationFields to new simpler API and +incorporate virtualField type based refctors of RelationFields. Abstract ========== -This DEP aims to improve different part of django ORM and ot associated +This DEP aims to improve different part of django ORM and associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem before. So in this Dep we will try to follow the suggested approaches from Michal Patrucha's previous works @@ -46,11 +43,13 @@ tickets were also analyzed to find out possible way's of API design. To keep thing sane it would be better to split the Dep in some major Parts: 1. Logical refactor of present Field API and RelationField API, to make - them sipler and consistant with _meta API calls + them simpler and consistant with _meta API calls -2. Fields internal value cache refactor for relation fields (may be) +2. Introduce new sane API for RelationFields [internal/provisional] -3. VirtualField Based refactor of RelationFields API +3. Fields internal value cache refactor for relation fields (may be) + +4. 
VirtualField Based refactor of RelationFields API @@ -60,33 +59,37 @@ Key steps of to follow to improve ORM Field API internals: BaseField etc and change on ORM based on the splitted API. 2. Change ForeignObjectRel subclasses to real field instances. (For example, - ForeignKey generates a ManyToOneRel in the related model). The Rel instances are already returned from get_field(), but they aren't yet field subclasses. + ForeignKey generates a ManyToOneRel in the related model). The Rel instances + are already returned from get_field(), but they aren't yet field subclasses. -3. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be - advantageous to be able to define reverse relations directly. For example, +3. Allow direct usage of ForeignObjectRel subclasses. In certain cases it + can be advantageous to be able to define reverse relations directly. For + example, see ​https://github.com/akaariai/django-reverse-unique. -5. Introduce new standalone well defined ``VirtualField`` +4. Introduce new standalone well defined ``VirtualField`` -6. Incorporate ``VirtualField`` related changes in django +5. Incorporate ``VirtualField`` related changes in django -7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API +6. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API -8. Figure out other cases where true virtual fields are needed. +7. Figure out other cases where true virtual fields are needed. -9. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey +8. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey -10. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey +9. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey -11. Make changes to migrations framework to work properly with Reafctored Field +10. 
Make changes to migrations framework to work properly with Reafctored Field API. +11. Migrations work well with VirtualField based refactored API + 12. Make sure new class based Index API ise used properly with refactored Field API. -13. Consider Database Contraints work +13. Query/QuerySets/Expressions work well with new refactored API's -14. SubField/AuxilaryField [may be] +14. ContentTypes/GenericRelations/GenericForeginKey works well with new Fields API @@ -605,8 +608,8 @@ ModelChoiceFields Again, we need a way to specify the value as a parameter passed in the form. The same escaping solution can be used even here. -Admin -~~~~~ +Admin/ModelForms +================ The solution that has been proposed so many times in the past [2], [3] is to extend the quote function used in the admin to also quote the comma and From 0ec6242d28b8d103b7ef50ffb9e3cd5331cc794a Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 1 Apr 2017 13:50:35 +0600 Subject: [PATCH 26/80] modifications --- draft/orm-field-api-related-improvement.rst | 70 ++++++++++++--------- 1 file changed, 42 insertions(+), 28 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index a9c85003..4a84bd79 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -30,8 +30,8 @@ solution is to refactor Fields/RelationFields to new simpler API and incorporate virtualField type based refctors of RelationFields. -Abstract -========== +Aim of the Proposal: +==================== This DEP aims to improve different part of django ORM and associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem before. 
So in this Dep we will try @@ -56,29 +56,31 @@ To keep thing sane it would be better to split the Dep in some major Parts: Key steps of to follow to improve ORM Field API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, - BaseField etc and change on ORM based on the splitted API. + BaseField, RelationField etc and adjust codes based on that API. -2. Change ForeignObjectRel subclasses to real field instances. (For example, - ForeignKey generates a ManyToOneRel in the related model). The Rel instances - are already returned from get_field(), but they aren't yet field subclasses. +2. Change ForeignObjectRel subclasses to real field instances. + The Rel instances are already returned from get_field(), but they + aren't yet field subclasses. (For example, ForeignKey generates + a ManyToOneRel in the related model). -3. Allow direct usage of ForeignObjectRel subclasses. In certain cases it - can be advantageous to be able to define reverse relations directly. For - example, - see ​https://github.com/akaariai/django-reverse-unique. +3. Allow direct usage of ForeignObjectRel subclasses. In certain cases + it could be advantageous to be able to define reverse relations directly. + For example, ​https://github.com/akaariai/django-reverse-unique. -4. Introduce new standalone well defined ``VirtualField`` +4. Introduce new standalone well defined ``VirtualField``. -5. Incorporate ``VirtualField`` related changes in django +5. Incorporate ``VirtualField`` related changes in django. -6. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API +6. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` + etc new Field API. -7. Figure out other cases where true virtual fields are needed. +7. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` + and new Field API based ForeignKey. -8. 
Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey - -9. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey +8. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey +9. ContentTypes/GenericRelations/GenericForeginKey works well with new Fields API + 10. Make changes to migrations framework to work properly with Reafctored Field API. @@ -89,7 +91,9 @@ Key steps of to follow to improve ORM Field API internals: 13. Query/QuerySets/Expressions work well with new refactored API's -14. ContentTypes/GenericRelations/GenericForeginKey works well with new Fields API +14. refactor GIS framework based on the changes in ORM + +15. ModelForms/Admin work well with posposed changes @@ -103,8 +107,8 @@ New split out Field API ========================= 1. BaseField: ------------- -Base structure for all Field types in django ORM wheather it is Concrete -or VirtualField +Base structure for all Field types in django ORM wheather it is Concrete, +relation or VirtualField 2. ConcreteField: ----------------- @@ -113,15 +117,21 @@ ConcreteField will have all the common attributes of a Regular concrete field 3. Field: --------- Presence base Field class with should refactored using BaseField and ConcreteField. -If it is decided to provide the optional virtual type to regular fields then VirtualField's features can also be added to specific fields. +If it is decided to provide the optional virtual type to regular fields then +VirtualField's features can also be added to specific fields. 4. RelationField: ----------------- +Based Field for All relation fields. 5. VirtualField: ---------------- -A true stand alone virtula field will be added to the system to be used to solve some long standing design limitations of django orm. 
initially RelationFields, GenericRelations etc will be benefitted by using VirtualFields and later CompositeField -or any virtual type field can be benefitted from VirtualField. +A true stand alone virtula field will be added to the system to be used to solve +some long standing design limitations of django orm. initially RelationFields, +GenericRelations etc will be benefitted by using VirtualFields and later +CompositeField or any virtual type field can be benefitted from VirtualField. + + Relation Field API clean up: ============================ @@ -135,10 +145,13 @@ A relation in Django consits of: - A descriptor to access the objects of the relation - The descriptor might need a custom manager - Possibly a remote relation field (the field to travel the relation in other direction) - Note that this is different from the target and source fields, which define which concrete fields this relation use (essentially, which columns to equate in the JOIN condition) + Note that this is different from the target and source fields, which define which + concrete fields this relation use (essentially, which columns to equate in the + JOIN condition) - The remote field can also contain a descriptor and a manager. - For deprecation period, field.rel is a bit like the remote field, but without - actually being a field instance. This is created only in the origin field, the remote field doesn't have a rel (as we don't need backwards compatibility + actually being a field instance. This is created only in the origin field, + the remote field doesn't have a rel (as we don't need backwards compatibility for the remote fields) The loading order is as follows: @@ -620,13 +633,14 @@ contained inside a value so we can't really avoid choosing one and escaping it. 
-Other considerations --------------------- +GIS Framework: +============== Notes on Porting previous work on top of master: ================================================ Considering the huge changes in ORM internals it is neither practical nor -trivial to rebase & port previous works related to ForeignKey refactor and CompositeKey without figuring out new approach based on present ORM internals +trivial to rebase & port previous works related to ForeignKey refactor and +CompositeKey without figuring out new approach based on present ORM internals design on top of master. A better approach would be to Improve Field API, major cleanup of RealtionField From 9c0b532470d8e517c5eedfd7dbcc514cf0aefa7d Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 1 Apr 2017 22:12:45 +0600 Subject: [PATCH 27/80] drfat relational field api --- draft/orm-field-api-related-improvement.rst | 67 +++++++++++++++++++++ 1 file changed, 67 insertions(+) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 4a84bd79..d9678add 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -302,7 +302,74 @@ need updating. Examples of such projects include django-rest-framework and djang Proposed API and workd flow for clean ups: ========================================== +Relational field API +==================== +Currently the main use case is that we have a single place where I +can check that we don't define redundant APIs for related fields. + +Structure of a relational field +------------------------------- +A relational field consist of: + + - The user created field + - Possibly of a remote field, which is auto-created by the user created field + + Both the created field and the remote field can possibly add a descriptor to + the field's model. + + Both the remote field and the user created field have (mostly) matching API. 
+    The API consists of the following attributes and methods:
+
+    .. attribute:: name
+
+        The name of the field. This is the key of the field in _meta.get_field() calls, and
+        thus this is also the name used in ORM queries.
+
+    .. attribute:: attname
+
+        ForeignKeys have the concrete value in field.attname, and the model instance in
+        field.name. For example Author.book_id contains an integer, and Author.book contains
+        a book instance. Attname is the book_id value.
+
+    .. method:: get_query_name()
+
+        A method that generates the name used for the field in queries. Only needed for
+        remote fields.
+
+    .. method:: get_accessor_name()
+
+        A method that generates the name the field's descriptor should be placed into.
+
+        For remote fields, get_query_name() is essentially similar to the related_query_name
+        parameter, and get_accessor_name() is similar to the related_name parameter.
+
+    .. method:: get_path_info()
+
+        Tells Django which relations to travel when this field is queried. Essentially
+        returns one PathInfo structure for each join needed by this field.
+
+    .. method:: get_extra_restriction()
+
+        Tells Django which extra restrictions should be placed onto joins generated.
+
+    .. attribute:: model
+
+        The originating model of this field.
+
+    .. attribute:: remote_field
+
+        The remote field paired with this field.
+
+    .. attribute:: remote_model
+
+        Same as self.remote_field.model.
+
+
+    ******************************** RANDOM DESIGN DOCUMENTATION ***********************
+    Abstract models and relational fields:
+    - If an abstract model defines a relation to a non-abstract model, we must not add the remote
+      field.
+    - If a model defines a relation to an abstract model, this should just fail (check this!)
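
The attribute list above can be illustrated with a small plain-Python sketch. It is not Django code: ``RelationFieldSketch``, ``RemoteFieldSketch`` and ``PathInfoSketch`` are hypothetical names, models are represented by plain strings, and the fallback rules for the query and accessor names are assumptions modelled on how ``related_query_name`` and ``related_name`` default today.

```python
from collections import namedtuple

# Hypothetical stand-in for a PathInfo structure; only two members are modelled.
PathInfoSketch = namedtuple("PathInfoSketch", ["from_model", "to_model"])


class RelationFieldSketch:
    """Toy user-created relation field exposing the attributes listed above."""

    def __init__(self, name, model, related_name=None, related_query_name=None):
        self.name = name
        self.attname = name + "_id"  # where the concrete key value would live
        self.model = model           # originating model (a string in this sketch)
        self.related_name = related_name
        self.related_query_name = related_query_name
        self.remote_field = None     # filled in once the remote side is created

    @property
    def remote_model(self):
        # Same as self.remote_field.model, as stated in the attribute list.
        return self.remote_field.model

    def get_path_info(self):
        # One entry per join needed; a plain foreign key needs exactly one.
        return [PathInfoSketch(self.model, self.remote_model)]


class RemoteFieldSketch(RelationFieldSketch):
    """Auto-created reverse side: a field as well, not a separate rel object."""

    def __init__(self, origin, model):
        self.origin = origin
        self.model = model
        self.remote_field = origin
        self.name = self.get_query_name()
        origin.remote_field = self

    def get_query_name(self):
        # Assumed fallback order: related_query_name, related_name, model name.
        o = self.origin
        return o.related_query_name or o.related_name or o.model.lower()

    def get_accessor_name(self):
        # Assumed default accessor: "<model>_set" unless related_name is given.
        return self.origin.related_name or self.origin.model.lower() + "_set"


# Book.author is the user-created field; the reverse side lands on Author.
book_author = RelationFieldSketch("author", "Book")
author_books = RemoteFieldSketch(book_author, "Author")
```

In this sketch both sides answer the same questions (``model``, ``remote_model``, ``get_path_info()``), which is the point of making the remote side a real field rather than a rel object.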
From 828d628a6a38e66d959d5f6e6da5ce730165416b Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 1 Apr 2017 22:21:14 +0600 Subject: [PATCH 28/80] problem section --- draft/orm-field-api-related-improvement.rst | 194 ++++++++++---------- 1 file changed, 97 insertions(+), 97 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index d9678add..f33a611b 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -30,6 +30,83 @@ solution is to refactor Fields/RelationFields to new simpler API and incorporate virtualField type based refctors of RelationFields. +Limitations of ORM that will be taken care of: +============================================== +One limitation is, + +Django supports many-to-one relationships -- the foreign keys live on +the "many", and point to the "one". So, in a simple app where you +have Comments that can get Flagged, one Comment can have many Flag's, +but each Flag refers to one and only one Comment: + +class Comment(models.Model): + text = models.TextField() + +class Flag(models.Model): + comment = models.ForeignKey(Comment) + +However, there are circumstances where it's much more convenient to +express the relationship as a one-to-many relationship. Suppose, for +example, you want to have a generic "flagging" app which other models +can use: + +class Comment(models.Model): + text = models.TextField() + flags = models.OneToMany(Flag) + +That way, if you had a new content type (say, a "Post"), it could also +participate in flagging, without having to modify the model definition +of "Flag" to add a new foreign key. Without baking in migrations, +there's obviously no way to make the underlying SQL play nice in this +circumstance: one-to-many relationships with just two tables can only +be expressed in SQL with a reverse foreign key relationship. 
However,
+it's possible to describe OneToMany as a subset of ManyToMany, with a
+uniqueness constraint on the "One" -- we rely on the join table to
+handle the relationship:
+
+class Comment(models.Model):
+    text = models.TextField()
+    flags = models.ManyToManyField(Flag, through='CommentFlag')
+
+class CommentFlag(models.Model):
+    comment = models.ForeignKey(Comment)
+    flag = models.ForeignKey(Flag, unique=True)
+
+While this works, the query interface remains cumbersome. To access
+the comment from a flag, I have to call:
+
+comment = flag.comment_set.all()[0]
+
+as the ORM doesn't know for a fact that each flag could only have one
+comment. But Django _could_ implement a OneToManyField in this way
+(using the underlying ManyToMany paradigm), and provide sugar such
+that this would all be nice and flexible, without having to do cumbersome
+ORM calls or explicitly define extra join tables:
+
+class Comment(models.Model):
+    text = models.TextField()
+    flags = models.OneToMany(Flag)
+
+class Post(models.Model):
+    body = models.TextField()
+    flags = models.OneToMany(Flag)
+
+# in a separate reusable app...
+class Flag(models.Model):
+    reason = models.TextField()
+    resolved = models.BooleanField()
+
+# in a view...
+comment = flag.comment
+post = flag.post
+
+It's obviously less database efficient than simple 2-table reverse
+ForeignKey relationships, as you have to do an extra join on the third
+table; but you gain semantic clarity and a nice way to use it in
+reusable apps, so in many circumstances it's worth it. And it's a
+fair shake clearer than the existing generic foreign key solutions.
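
The idea of a one-to-many relation built on a join table with a uniqueness constraint can be sketched without Django at all. The class below is a hypothetical in-memory stand-in for a ``CommentFlag``-style table, not an ORM implementation; ``JoinTableSketch`` and its method names are invented for illustration.

```python
class JoinTableSketch:
    """In-memory stand-in for a join table with a unique constraint on the flag."""

    def __init__(self):
        # flag -> owner; using the flag as a dict key mirrors unique=True.
        self._owner_by_flag = {}

    def add(self, owner, flag):
        if flag in self._owner_by_flag:
            raise ValueError("flag is already attached to an owner")
        self._owner_by_flag[flag] = owner

    def owners_of(self, flag):
        # Many-to-many style access: always a list, so callers index with [0].
        if flag in self._owner_by_flag:
            return [self._owner_by_flag[flag]]
        return []

    def owner_of(self, flag):
        # One-to-many sugar: the constraint guarantees at most one owner.
        return self._owner_by_flag[flag]

    def flags_of(self, owner):
        return [f for f, o in self._owner_by_flag.items() if o == owner]


table = JoinTableSketch()
table.add("comment-1", "flag-a")
table.add("comment-1", "flag-b")
```

Here ``owners_of(flag)[0]`` plays the role of ``flag.comment_set.all()[0]``, while ``owner_of(flag)`` is the direct accessor that the uniqueness constraint makes safe to offer.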
+ + Aim of the Proposal: ==================== This DEP aims to improve different part of django ORM and associated @@ -53,6 +130,26 @@ To keep thing sane it would be better to split the Dep in some major Parts: +Notes on Porting previous work on top of master: +================================================ +Considering the huge changes in ORM internals it is neither practical nor +trivial to rebase & port previous works related to ForeignKey refactor and +CompositeKey without figuring out new approach based on present ORM internals +design on top of master. + +A better approach would be to Improve Field API, major cleanup of RealtionField +API, model._meta and internal field_valaue_cache and related areas first. + +After completing the major clean ups of Fields/RelationFields a REAL +VirtualField type should be introduced and VirtualField based refactor +of ForeignKey and relationFields could done. + +This appraoch should keep things sane and easier to approach on smaller chunks. + +Later any VirtualField derived Field like CompositeField implementation +should be less complex after the completion of virtualField based refactors. + + Key steps of to follow to improve ORM Field API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, @@ -194,79 +291,7 @@ A relation in Django consits of: - The post_relation_ready() method is called for both the origin and the remote field. This will create the descriptor on both the origin and remote field (unless the remote relation is hidden, in which case no descriptor is created) -Another limitation is, - -Django supports many-to-one relationships -- the foreign keys live on -the "many", and point to the "one". 
So, in a simple app where you -have Comments that can get Flagged, one Comment can have many Flag's, -but each Flag refers to one and only one Comment: - -class Comment(models.Model): - text = models.TextField() - -class Flag(models.Model): - comment = models.ForeignKey(Comment) - -However, there are circumstances where it's much more convenient to -express the relationship as a one-to-many relationship. Suppose, for -example, you want to have a generic "flagging" app which other models -can use: - -class Comment(models.Model): - text = models.TextField() - flags = models.OneToMany(Flag) - -That way, if you had a new content type (say, a "Post"), it could also -participate in flagging, without having to modify the model definition -of "Flag" to add a new foreign key. Without baking in migrations, -there's obviously no way to make the underlying SQL play nice in this -circumstance: one-to-many relationships with just two tables can only -be expressed in SQL with a reverse foreign key relationship. However, -it's possible to describe OneToMany as a subset of ManyToMany, with a -uniqueness constraint on the "One" -- we rely on the join table to -handle the relationship: -class Comment(models.Model): - text = models.TextField() - flags = models.ManyToMany(Flag, through=CommentFlag) - -class CommentFlag(models.Model): - comment = models.ForeignKey(Comment) - flag = models.ForeignKey(Flag, unique=True) - -While this works, the query interface remains cumbersome. To access -the comment from a flag, I have to call: - -comment = flag.comment_set.all()[0] - -as the ORM doesn't know for a fact that each flag could only have one -comment. 
But Django _could_ implement a OneToManyField in this way -(using the underlying ManyToMany paradigm), and provide sugar such -that this would all be nice and flexible, without having to do cumbersome -ORM calls or explicitly define extra join tables: - -class Comment(models.Model): - text = models.TextField() - flags = models.OneToMany(Flag) - -class Post(models.Model): - body = models.TextField() - flags = models.OneToMany(Flag) - -# in a separate reusable app... -class Flag(models.Model) - reason = models.TextField() - resolved = models.BooleanField() - -# in a view... -comment = flag.comment -post = flag.post - -It's obviously less database efficient than simple 2-table reverse -ForeignKey relationships, as you have to do an extra join on the third -table; but you gain semantic clarity and a nice way to use it in -reusable apps, so in many circumstances it's worth it. And it's a -fair shake clearer than the existing generic foreign key solutions. Clean up Relation API to make it consistant: @@ -691,33 +716,8 @@ form. The same escaping solution can be used even here. Admin/ModelForms ================ -The solution that has been proposed so many times in the past [2], [3] is -to extend the quote function used in the admin to also quote the comma and -then use an unquoted comma as the separator. Even though this solution -looks ugly to some, I don't think there is much choice -- there needs to -be a way to separate the values and in theory, any character could be -contained inside a value so we can't really avoid choosing one and -escaping it. GIS Framework: ============== -Notes on Porting previous work on top of master: -================================================ -Considering the huge changes in ORM internals it is neither practical nor -trivial to rebase & port previous works related to ForeignKey refactor and -CompositeKey without figuring out new approach based on present ORM internals -design on top of master. 
- -A better approach would be to Improve Field API, major cleanup of RealtionField -API, model._meta and internal field_valaue_cache and related areas first. - -After completing the major clean ups of Fields/RelationFields a REAL -VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could done. - -This appraoch should keep things sane and easier to approach on smaller chunks. - -Later any VirtualField derived Field like CompositeField implementation -should be less complex after the completion of virtualField based refactors. From a12712482b6e147a2316c8cc277f690cf6983047 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 2 Apr 2017 00:46:07 +0600 Subject: [PATCH 29/80] organize --- draft/orm-field-api-related-improvement.rst | 107 +++++--------------- 1 file changed, 28 insertions(+), 79 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index f33a611b..219a09ae 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -73,7 +73,7 @@ class CommentFlag(models.Model): flag = models.ForeignKey(Flag, unique=True) While this works, the query interface remains cumbersome. 
To access -the comment from a flag, I have to call: +the comment from a flag, have to call: comment = flag.comment_set.all()[0] @@ -132,19 +132,20 @@ To keep thing sane it would be better to split the Dep in some major Parts: Notes on Porting previous work on top of master: ================================================ -Considering the huge changes in ORM internals it is neither practical nor -trivial to rebase & port previous works related to ForeignKey refactor and -CompositeKey without figuring out new approach based on present ORM internals +Considering the huge changes in ORM internals it is neither trivial nor +practical to rebase & port previous works related to ForeignKey refactor +without figuring out new approach based on present ORM internals design on top of master. -A better approach would be to Improve Field API, major cleanup of RealtionField -API, model._meta and internal field_valaue_cache and related areas first. +A better approach would be to Improve Field API, major cleanup of +RealtionField API, model._meta and internal field_valaue_cache and +related areas first. After completing the major clean ups of Fields/RelationFields a REAL VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could done. +of ForeignKey and relationFields could have been done. -This appraoch should keep things sane and easier to approach on smaller chunks. +This appraoch should keep things easier to approach with smaller steps. Later any VirtualField derived Field like CompositeField implementation should be less complex after the completion of virtualField based refactors. @@ -205,28 +206,28 @@ New split out Field API 1. BaseField: ------------- Base structure for all Field types in django ORM wheather it is Concrete, -relation or VirtualField +RelationField or VirtualField 2. 
ConcreteField: ----------------- -ConcreteField will have all the common attributes of a Regular concrete field +ConcreteField will extract all the common attributes of a Regular concrete field 3. Field: --------- -Presence base Field class with should refactored using BaseField and ConcreteField. -If it is decided to provide the optional virtual type to regular fields then +Field class should be refactored using BaseField and ConcreteField. If it +is decided to provide the optional virtual type to regular fields then VirtualField's features can also be added to specific fields. 4. RelationField: ----------------- -Based Field for All relation fields. +Base Field for All relation fields extended from new BaseField class. 5. VirtualField: ---------------- -A true stand alone virtula field will be added to the system to be used to solve -some long standing design limitations of django orm. initially RelationFields, -GenericRelations etc will be benefitted by using VirtualFields and later -CompositeField or any virtual type field can be benefitted from VirtualField. +A true stand alone virtula field will be added to solve some long standing +design limitations of django orm. initially RelationFields, GenericRelations +etc will be benefitted by using VirtualFields and later CompositeField or +any virtual type field can be benefitted from VirtualField. @@ -281,14 +282,15 @@ A relation in Django consits of: - As last step of contribut_to_class method the prepare_remote() method is added as a lazy loaded method. It will be called when both Book and Author are ready. As it happens, they are both ready in the example, - so the method is called immediately.If the Author model was defined later - than Book, and Book had a string reference to Author, then the method would + so the method is called immediately. If the Author model was defined later + than Book and Book had a string reference to Author, then the method would be called only after Author was ready. 3. 
The prepare_remote() method is called. - The remote field is created based on attributes of the origin field. The field is added to the remote model (the field's contribute_to_class is called) - - The post_relation_ready() method is called for both the origin and the remote field. This will create the descriptor on both the origin and remote field + - The post_relation_ready() method is called for both the origin and the remote field. + This will create the descriptor on both the origin and remote field (unless the remote relation is hidden, in which case no descriptor is created) @@ -301,8 +303,7 @@ field.rel instance (for reverse side of user defined fields), or a real field instance(for example ForeignKey). These behave differently, so that the user must always remember which one he is dealing with. This creates lots of non-necessary conditioning -in multiple places of -Django. +in multiple places of Django. For example, the select_related descent has one branch for descending foreign keys and one to one fields, and another branch for descending to reverse one @@ -314,7 +315,8 @@ The remote_field is just a field subclass, just like everything else in Django. The benefits are: -Conceptual simplicity - dealing with fields and rels is non-necessaryand confusing. Everything from get_fields() should be a field. +Conceptual simplicity - dealing with fields and rels is non-necessaryand confusing. +Everything from get_fields() should be a field. Code simplicity - no special casing based on if a given relation is described by a rel or not Code reuse - ReverseManyToManyField is in most regard exactly like @@ -323,7 +325,9 @@ ManyToManyField. The expected problems are mostly from 3rd party code. Users of _meta that already work on expectation of getting rel instances will likely need updating. Those users who subclass Django's fields (or duck-type Django's fields) will -need updating. Examples of such projects include django-rest-framework and django-taggit. 
+need updating. Examples of such projects include django-rest-framework and +django-taggit. + Proposed API and workd flow for clean ups: ========================================== @@ -598,23 +602,6 @@ composite primary key containing any special columns. This should be extremely rare anyway. -GenericForeignKeys -~~~~~~~~~~~~~~~~~~ - -Even though the admin uses the contenttypes framework to log the history -of actions, it turns out proper handling on the admin side will make -things work without the need to modify GenericForeignKey code at all. This -is thanks to the fact that the admin uses only the ContentType field and -handles the relations on its own. Making sure the unquoting function -recreates the whole CompositeObjects where necessary should suffice. - -At a later stage, however, GenericForeignKeys could also be improved to -support composite primary keys. Using the same quoting solution as in the -admin could work in theory, although it would only allow fields capable of -storing arbitrary strings to be usable for object_id storage. This has -been left out of the scope of this project, though. - - QuerySet filtering ~~~~~~~~~~~~~~~~~~ @@ -668,44 +655,6 @@ possible. ``__in`` lookups for ``VirtualField`` ======================================= -The existing implementation of ``CompositeField`` handles ``__in`` lookups -in the generic, backend-independent ``WhereNode`` class and uses a -disjunctive normal form expression as in the following example:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 5 AND c = 6); - -The problem with this solution is that in cases where the list of values -contains tens or hundreds of tuples, this DNF expression will be extremely -long and the database will have to evaluate it for each and every row, -without a possibility of optimizing the query. 
- -Certain database backends support the following alternative:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE (a, b, c) IN [(1, 2, 3), (4, 5, 6)]; - -This would probably be the best option, but it can't be used by SQLite, -for instance. This is also the reason why the DNF expression was -implemented in the first place. - -In order to support this more natural syntax, the ``DatabaseOperations`` -needs to be extended with a method such as ``composite_in_sql``. - -However, this leaves the issue of the inefficient DNF unresolved for -backends without support for tuple literals. For such backends, the -following expression is proposed:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE EXISTS (SELECT a1, b1, c1, FROM (SELECT 1 as a, 2 as b, 3 as c - UNION SELECT 4, 5, 6) - WHERE a1=1 AND b1=b AND c1=c); - -Since both syntaxes are rather generic and at least one of them should fit -any database backend directly, a new flag will be introduced, -``DatabaseFeatures.supports_tuple_literals`` which the default -implementation of ``composite_in_sql`` will consult in order to choose -between the two options. ModelChoiceFields ~~~~~~~~~~~~~~~~~ From e1999922571dccd19d6f582d2ff8c93375f619c6 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 2 Apr 2017 14:47:26 +0600 Subject: [PATCH 30/80] re ogranize n clean ups --- draft/orm-field-api-related-improvement.rst | 168 +++++++------------- 1 file changed, 60 insertions(+), 108 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 219a09ae..4c03e077 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -415,14 +415,65 @@ concrete fields, but doesn't add or alter columns in the database." +Changes in ``RelationField`` +============================= +Relationship fields +~~~~~~~~~~~~~~~~~~~ + +This turns out to be, not too surprisingly, the toughest problem. 
The fact +that related fields are spread across about fifteen different classes, +most of which are quite nontrivial, makes the whole bundle pretty fragile, +which means the changes have to be made carefully not to break anything. + +What we need to achieve is that the ForeignKey, ManyToManyField and +OneToOneField detect when their target field is a CompositeField in +several situations and act accordingly since this will require different +handling than regular fields that map directly to database columns. + +The first one to look at is ForeignKey since the other two rely on its +functionality, OneToOneField being its descendant and ManyToManyField +using ForeignKeys in the intermediary model. Once the ForeignKeys work, +OneToOneField should require minimal to no changes since it inherits +almost everything from ForeignKey. + +The easiest part is that for composite related fields, the db_type will be +None since the data will be stored elsewhere. + +ForeignKey and OneToOneField will also be able to create the underlying +fields automatically when added to the model. I'm proposing the following +default names: "fkname_targetname" where "fkname" is the name of the +ForeignKey field and "targetname" is the name of the remote field name +corresponding to the local one. I'm open to other suggestions on this. + +There will also be a way to override the default names using a new field +option "enclosed_fields". This option will expect a tuple of fields each +of whose corresponds to one individual field in the same order as +specified in the target CompositeField. This option will be ignored for +non-composite ForeignKeys. + +The trickiest part, however, will be relation traversals in QuerySet +lookups. Currently the code in models.sql.query.Query that creates joins +only joins on single columns. 
To be able to span a composite relationship +the code that generates joins will have to recognize column tuples and add +a constraint for each pair of corresponding columns with the same aliases +in all conditions. + +For the sake of completeness, ForeignKey will also have an extra_filters +method allowing to filter by a related object or its primary key. + +With all this infrastructure set up, ManyToMany relationships using +composite fields will be easy enough. Intermediary model creation will +work thanks to automatic underlying field creation for composite fields +and traversal in both directions will be supported by the query code. + + Changes in ``ForeignKey`` ========================= Currently ``ForeignKey`` is a regular concrete field which manages both the raw value stored in the database and the higher-level relationship semantics. Managing the raw value is simple enough for simple -(single-column) targets. However, in the case of a composite target field, -this task becomes more complex. The biggest problem is that many parts of +(single-column) targets. The biggest problem is that many parts of the ORM work under the assumption that for each database column there is a model field it can assign the value from the column to. While it might be possible to lift this restriction, it would be a really complex project by @@ -446,15 +497,10 @@ uses a field specifically intended for the task. In order to keep this backwards compatible and avoid the need to explicitly create two fields for each ``ForeignKey``, the auxiliary field needs to be created automatically during the phase where a model class is -created by its metaclass. Initially I implemented this as a method on -``ForeignKey`` which takes the target field and creates its copy, touches -it up and adds it to the model class. 
However, this requires performing -special tasks with certain types of fields, such as ``AutoField`` which -needs to be turned into an ``IntegerField`` or ``CompositeField`` which -requires copying its enclosed fields as well. - -A better approach is to add a method such as ``create_auxiliary_copy`` on -``Field`` which would create all new field instances and add them to the +created by its metaclass. + +A better approach could be to add a method such as ``create_auxiliary_copy`` +on ``Field`` which would create all new field instances and add them to the appropriate model class. One possible problem with these changes is that they change the contents @@ -484,84 +530,15 @@ where ``place_ptr`` is a ``OneToOneField`` and ``chef`` is a 'chef', 'chef_id'] -This causes a lot of failures in the Django test suite, because there are -a lot of tests relying on the contents of ``_meta.fields`` or other -related attributes/properties. (Actually, this example is taken from one -of these tests, -``model_inheritance.tests.ModelInheritanceTests.test_multiple_table``.) -Fixing these is fairly simple, all they need is to add the appropriate -``__id`` fields. However, this raises a concern of how ``_meta`` is -regarded. It has always been a private API officially, but everyone uses -it in their projects anyway. I still think the change is worth it, but it -might be a good idea to include a note about the change in the release -notes. - - -Changes in ``RelationField`` -============================= -Relationship fields -~~~~~~~~~~~~~~~~~~~ - -This turns out to be, not too surprisingly, the toughest problem. The fact -that related fields are spread across about fifteen different classes, -most of which are quite nontrivial, makes the whole bundle pretty fragile, -which means the changes have to be made carefully not to break anything. 
- -What we need to achieve is that the ForeignKey, ManyToManyField and -OneToOneField detect when their target field is a CompositeField in -several situations and act accordingly since this will require different -handling than regular fields that map directly to database columns. - -The first one to look at is ForeignKey since the other two rely on its -functionality, OneToOneField being its descendant and ManyToManyField -using ForeignKeys in the intermediary model. Once the ForeignKeys work, -OneToOneField should require minimal to no changes since it inherits -almost everything from ForeignKey. - -The easiest part is that for composite related fields, the db_type will be -None since the data will be stored elsewhere. - -ForeignKey and OneToOneField will also be able to create the underlying -fields automatically when added to the model. I'm proposing the following -default names: "fkname_targetname" where "fkname" is the name of the -ForeignKey field and "targetname" is the name of the remote field name -corresponding to the local one. I'm open to other suggestions on this. - -There will also be a way to override the default names using a new field -option "enclosed_fields". This option will expect a tuple of fields each -of whose corresponds to one individual field in the same order as -specified in the target CompositeField. This option will be ignored for -non-composite ForeignKeys. - -The trickiest part, however, will be relation traversals in QuerySet -lookups. Currently the code in models.sql.query.Query that creates joins -only joins on single columns. To be able to span a composite relationship -the code that generates joins will have to recognize column tuples and add -a constraint for each pair of corresponding columns with the same aliases -in all conditions. - -For the sake of completeness, ForeignKey will also have an extra_filters -method allowing to filter by a related object or its primary key. 
- -With all this infrastructure set up, ManyToMany relationships using -composite fields will be easy enough. Intermediary model creation will -work thanks to automatic underlying field creation for composite fields -and traversal in both directions will be supported by the query code. ``contenttypes`` and ``GenericForeignKey`` ========================================== -It's fairly easy to represent composite values as strings. Given an -``escape`` function which uniquely escapes commas, something like the -following works quite well:: - - ",".join(escape(value) for value in composite_value) - -However, in order to support JOINs generated by ``GenericRelation``, we -need to be able to reproduce exactly the same encoding using an SQL -expression which would be used in the JOIN condition. +However, in order to support JOINs generated by ``GenericRelation``, +we need to be able to reproduce exactly the same encoding using an +SQL expression which would be used in the JOIN condition. Luckily, while thus encoded strings need to be possible to decode in Python (for example, when retrieving the related object using @@ -570,27 +547,6 @@ this isn't necessary at the database level. Using SQL we only ever need to perform this in one direction, that is from a tuple of values into a string. -That means we can use a generalized version of the function -``django.contrib.admin.utils.quote`` which replaces each unsafe -character with its ASCII value in hexadecimal base, preceded by an escape -character. In this case, only two characters are unsafe -- comma (which is -used to separate the values) and an escape character (which I arbitrarily -chose as '~'). 
- -To reproduce this encoding, all values need to be cast to strings and then -for each such string two calls to the ``replace`` functions are made:: - - replace(replace(CAST (`column` AS text), '~', '~7E'), ',', '~2C') - -According to available documentation, all four supported database backends -provide the ``replace`` function. [2]_ [3]_ [4]_ [5]_ - -Even though the ``replace`` function seems to be available in all major -database servers (even ones not officially supported by Django, including -MSSQL, DB2, Informix and others), this is still probably best left to the -database backend and will be implemented as -``DatabaseOperations.composite_value_to_text_sql``. - One possible pitfall of this implementation might be that it may not work with any column type that isn't an integer or a text string due to a simple fact – the string the database would cast it to will probably @@ -605,12 +561,8 @@ extremely rare anyway. QuerySet filtering ~~~~~~~~~~~~~~~~~~ -This is where the real fun begins. - The fundamental problem here is that Q objects which are used all over the code that handles filtering are designed to describe single field lookups. -On the other hand, CompositeFields will require a way to describe several -individual field lookups by a single expression. 
Since the Q objects themselves have no idea about fields at all and the actual field resolution from the filter conditions happens deeper down the From 6f802f68c2f6f213c76458e419d5155378860950 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 2 Apr 2017 20:19:28 +0600 Subject: [PATCH 31/80] re arrange n clean up --- draft/orm-field-api-related-improvement.rst | 79 +++++---------------- 1 file changed, 16 insertions(+), 63 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 4c03e077..f1886693 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -110,12 +110,23 @@ fair shake clearer than the existing generic foreign key solutions. Aim of the Proposal: ==================== This DEP aims to improve different part of django ORM and associated -parts of django to support Real VirtualField type in django. There were -several attempt to fix this problem before. So in this Dep we will try -to follow the suggested approaches from Michal Patrucha's previous works -and suggestions in tickets and IRC chat/mailing list. Few other related -tickets were also analyzed to find out possible way's of API design. +parts of Django to support a real VirtualField type in Django. In this +DEP we will try to follow the suggested approaches from Michal Petrucha's +previous work and suggestions in tickets and IRC chat/mailing list. +Related tickets were also analyzed to find possible ways of designing the API. +A better approach would be to improve the Field API first, with a major cleanup of the +RelationField API, model._meta, the internal field_value_cache and +related areas. + +After completing the major cleanups of Fields/RelationFields, a real +VirtualField type should be introduced, and a VirtualField-based refactor +of ForeignKey and RelationFields can then be done. + +This approach should keep the work manageable by proceeding in smaller steps. 
+ +Later any VirtualField derived Field like CompositeField implementation +should be less complex after the completion of virtualField based refactors. To keep thing sane it would be better to split the Dep in some major Parts: @@ -130,26 +141,6 @@ To keep thing sane it would be better to split the Dep in some major Parts: -Notes on Porting previous work on top of master: -================================================ -Considering the huge changes in ORM internals it is neither trivial nor -practical to rebase & port previous works related to ForeignKey refactor -without figuring out new approach based on present ORM internals -design on top of master. - -A better approach would be to Improve Field API, major cleanup of -RealtionField API, model._meta and internal field_valaue_cache and -related areas first. - -After completing the major clean ups of Fields/RelationFields a REAL -VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could have been done. - -This appraoch should keep things easier to approach with smaller steps. - -Later any VirtualField derived Field like CompositeField implementation -should be less complex after the completion of virtualField based refactors. - Key steps of to follow to improve ORM Field API internals: ============================================================== @@ -564,44 +555,6 @@ QuerySet filtering The fundamental problem here is that Q objects which are used all over the code that handles filtering are designed to describe single field lookups. -Since the Q objects themselves have no idea about fields at all and the -actual field resolution from the filter conditions happens deeper down the -line, inside models.sql.query.Query, this is where we can handle the -filters properly. - -There is already some basic machinery inside Query.add_filter and -Query.setup_joins that is in use by GenericRelations, this is -unfortunately not enough. 
The optional extra_filters field method will be -of great use here, though it will have to be extended. - -Currently the only parameters it gets are the list of joins the -filter traverses, the position in the list and a negate parameter -specifying whether the filter is negated. The GenericRelation instance can -determine the value of the content type (which is what the extra_filters -method is used for) easily based on the model it belongs to. - -This is not the case for a CompositeField -- it doesn't have any idea -about the values used in the query. Therefore a new parameter has to be -added to the method so that the CompositeField can construct all the -actual filters from the iterable containing the values. - -Afterwards the handling inside Query is pretty straightforward. For -CompositeFields (and virtual fields in general) there is no value to be -used in the where node, the extra_filters are responsible for all -filtering, but since the filter should apply to a single object even after -join traversals, the aliases will be set up while handling the "root" -filter and then reused for each one of the extra_filters. - -This way of extending the extra_filters mechanism will allow the field -class to create conjunctions of atomic conditions. This is sufficient for -the "__exact" lookup type which will be implemented. - -Of the other lookup types, the only one that looks reasonable is "__in". -This will, however, have to be represented as a disjunction of multiple -"__exact" conditions since not all database backends support tuple -construction inside expressions. Therefore this lookup type will be left -out of this project as the mechanism would need much more work to make it -possible. 
``__in`` lookups for ``VirtualField`` From a0fcd98dffd522641930b5643fa551f33ead9bed Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 2 Apr 2017 22:58:18 +0600 Subject: [PATCH 32/80] minor clean up --- draft/orm-field-api-related-improvement.rst | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index f1886693..97e833c5 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -215,7 +215,7 @@ Base Field for All relation fields extended from new BaseField class. 5. VirtualField: ---------------- -A true stand alone virtula field will be added to solve some long standing +A true stand alone VirtualField will be added to solve some long standing design limitations of django orm. initially RelationFields, GenericRelations etc will be benefitted by using VirtualFields and later CompositeField or any virtual type field can be benefitted from VirtualField. @@ -346,11 +346,11 @@ A relational field consist of: The name of the field. This is the key of the field in _meta.get_field() calls, and thus this is also the name used in ORM queries. - .. attribute:: attname + .. attribute:: attr_name - ForeignKeys have the concrete value in field.attname, and the model instance in + ForeignKeys have the concrete value in field.attr_name, and the model instance in field.name. For example Author.book_id contains an integer, and Author.book contains - a book instance. Attname is the book_id value. + a book instance. attr_name is the book_id value. .. method:: get_query_name() @@ -385,7 +385,6 @@ A relational field consist of: Same as self.remote_field.model. - ******************************** RANDOM DESIGN DOCUMENTATION *********************** Abstract models and relational fields: - If an abstract model defines a relation to non-abstract model, we must not add the remote field. 
@@ -401,8 +400,12 @@ Introduce standalone ``VirtualField`` ===================================== what is ``VirtualField``? ------------------------- -"A virtual field is a model field which it correlates to one or multiple -concrete fields, but doesn't add or alter columns in the database." +A VirtualField is a model field type which correlates with one or multiple +concrete fields, but doesn't add or alter columns in the database. + +ORM or migrations certainly can't ignore ForeignKey once it becomes virtual; +instead, migrations will have to hide any auto-generated auxiliary concrete +fields to make migrations backwards-compatible. From 026cd8a47541282c61e4fd46c241cfd9fcb4b023 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 02:05:38 +0600 Subject: [PATCH 33/80] VirtualField n other changes --- draft/orm-field-api-related-improvement.rst | 115 ++++++++++++++------ 1 file changed, 83 insertions(+), 32 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 97e833c5..84aaecd6 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -109,19 +109,17 @@ fair shake clearer than the existing generic foreign key solutions. Aim of the Proposal: ==================== -This DEP aims to improve different part of django ORM and associated -parts of django to support Real VirtualField type in django. So in this -Dep we will try to follow the suggested approaches from Michal Patrucha's -previous works and suggestions in tickets and IRC chat/mailing list. -Related tickets were also analyzed to find out possible way's of API design. +This DEP aims to improve the Django ORM's internal Field and related-field +private API to provide a sane API and mechanism for relation fields. +Parts of it also propose introducing a true VirtualField type in Django. 
-A better approach would be to Improve Field API, major cleanup of -RealtionField API, model._meta and internal field_valaue_cache and -related areas first. +To achieve these goals, a better approach would be to improve the Field API first, +along with a major cleanup of the RelationField API, model._meta, the internal field_value_cache +and related areas. -After completing the major clean ups of Fields/RelationFields a REAL -VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could have been done. +After completing the major cleanups of Fields/RelationFields, a standalone +VirtualField and VirtualField-based refactors of ForeignKey, RelationFields +and other parts of the ORM, contenttypes etc. can be done. This approach should keep things easier to approach with smaller steps. @@ -131,18 +129,22 @@ should be less complex after the completion of virtualField based refactors. To keep thing sane it would be better to split the Dep in some major Parts: 1. Logical refactor of present Field API and RelationField API, to make - them simpler and consistant with _meta API calls + them simpler and return consistent results with _meta API calls. 2. Introduce new sane API for RelationFields [internal/provisional] -3. Fields internal value cache refactor for relation fields (may be) +3. Make it possible to use reverse relations directly if necessary. -4. VirtualField Based refactor of RelationFields API +4. Take care of the Fields' internal value cache for relation fields. [maybe] +5. VirtualField-based refactor of the RelationFields API +6. ContentTypes refactor. -Key steps of to follow to improve ORM Field API internals: +Key steps to refactor ORM Fields API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, BaseField, RelationField etc and adjust codes based on that API. @@ -166,23 +168,25 @@ Key steps of to follow to improve ORM Field API internals: 7. 
Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey. -8. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey +8. AuxiliaryField + +9. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey -9. ContentTypes/GenericRelations/GenericForeginKey works well with new Fields API +10. ContentTypes/GenericRelations/GenericForeignKey work well with the new Fields API -10. Make changes to migrations framework to work properly with Reafctored Field +11. Make changes to the migrations framework to work properly with the refactored Field API. -11. Migrations work well with VirtualField based refactored API +12. Migrations work well with the VirtualField based refactored API -12. Make sure new class based Index API ise used properly with refactored Field +13. Make sure the new class based Index API is used properly with the refactored Field API. -13. Query/QuerySets/Expressions work well with new refactored API's +14. Query/QuerySets/Expressions work well with the new refactored APIs -14. refactor GIS framework based on the changes in ORM +15. Refactor the GIS framework based on the changes in the ORM -15. ModelForms/Admin work well with posposed changes +16. ModelForms/Admin work well with the proposed changes @@ -407,6 +411,57 @@ ORM or migrations certainly can't ignore ForeignKey once it becomes virtual; instead, migrations will have to hide any auto-generated auxiliary concrete fields to make migrations backwards-compatible. +A VirtualField class could look like the following: + + +class VirtualField(Field): + """ + Base class for field types with no direct database representation. + """ + def __init__(self, **kwargs): + kwargs.setdefault('serialize', False) + kwargs.setdefault('editable', False) + super().__init__(**kwargs) + + def db_type(self, connection): + """ + By default no db representation, and thus also no db_type. 
+ """ + return None + + def contribute_to_class(self, cls, name): + super().contribute_to_class(cls, name) + + def get_column(self): + return None + + @cached_property + def fields(self): + return [] + + @cached_property + def concrete_fields(self): + return [f + for myfield in self.fields + for f in myfield.concrete_fields] + + def resolve_concrete_values(self, data): + if data is None: + return [None] * len(self.concrete_fields) + if len(self.concrete_fields) > 1: + if not isinstance(data, (list, tuple)): + raise ValueError( + "Can't resolve data that isn't list or tuple to values for field %s" % + self.name) + elif len(data) != len(self.concrete_fields): + raise ValueError( + "Invalid amount of values for field %s. Required %s, got %s." % + (self.name, len(self.concrete_fields), len(data))) + return data + else: + return [data] + + Changes in ``RelationField`` @@ -414,15 +469,11 @@ Changes in ``RelationField`` Relationship fields ~~~~~~~~~~~~~~~~~~~ -This turns out to be, not too surprisingly, the toughest problem. The fact -that related fields are spread across about fifteen different classes, -most of which are quite nontrivial, makes the whole bundle pretty fragile, -which means the changes have to be made carefully not to break anything. - -What we need to achieve is that the ForeignKey, ManyToManyField and -OneToOneField detect when their target field is a CompositeField in -several situations and act accordingly since this will require different -handling than regular fields that map directly to database columns. +The fact that related fields are spread across about fifteen different +classes, most of which are quite nontrivial, makes the whole bundle +pretty fragile, which means the changes have to be made carefully not +to break anything. This will require different handling than regular +fields that map directly to database columns. 
The first one to look at is ForeignKey since the other two rely on its functionality, OneToOneField being its descendant and ManyToManyField From 74514d186e9c02ea6220cbc630fea1ce0918fda9 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 02:21:40 +0600 Subject: [PATCH 34/80] VirtualField n other changes --- draft/orm-field-api-related-improvement.rst | 24 +++------------------ 1 file changed, 3 insertions(+), 21 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 84aaecd6..c8746438 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -578,29 +578,11 @@ where ``place_ptr`` is a ``OneToOneField`` and ``chef`` is a -``contenttypes`` and ``GenericForeignKey`` +``ContentTypes`` and ``GenericForeignKey`` ========================================== +Following the refactor of Fields API and introduction of true +VirtualField type, this part will also be refactored. -However, in order to support JOINs generated by ``GenericRelation``, -we need to be able to reproduce exactly the same encoding using an -SQL expression which would be used in the JOIN condition. - -Luckily, while thus encoded strings need to be possible to decode in -Python (for example, when retrieving the related object using -``GenericForeignKey`` or when the admin decodes the primary key from URL), -this isn't necessary at the database level. Using SQL we only ever need to -perform this in one direction, that is from a tuple of values into a -string. - -One possible pitfall of this implementation might be that it may not work -with any column type that isn't an integer or a text string due to a -simple fact – the string the database would cast it to will probably -differ from the one Python will use. 
However, I'm not sure there's -anything we can do about this, especially since the string representation -chosen by the database may be specific for each database server. Therefore -I'm inclined to declare ``GenericRelation`` unsupported for models with a -composite primary key containing any special columns. This should be -extremely rare anyway. QuerySet filtering From e8c369e8b29b84be6c39c79f0f652baab97793af Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 02:25:34 +0600 Subject: [PATCH 35/80] relation field clean up --- draft/orm-field-api-related-improvement.rst | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index c8746438..057fd4fd 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -493,23 +493,13 @@ corresponding to the local one. I'm open to other suggestions on this. There will also be a way to override the default names using a new field option "enclosed_fields". This option will expect a tuple of fields each of whose corresponds to one individual field in the same order as -specified in the target CompositeField. This option will be ignored for +specified in the target Field. This option will be ignored for non-composite ForeignKeys. -The trickiest part, however, will be relation traversals in QuerySet -lookups. Currently the code in models.sql.query.Query that creates joins -only joins on single columns. To be able to span a composite relationship -the code that generates joins will have to recognize column tuples and add -a constraint for each pair of corresponding columns with the same aliases -in all conditions. For the sake of completeness, ForeignKey will also have an extra_filters method allowing to filter by a related object or its primary key. 
-With all this infrastructure set up, ManyToMany relationships using -composite fields will be easy enough. Intermediary model creation will -work thanks to automatic underlying field creation for composite fields -and traversal in both directions will be supported by the query code. Changes in ``ForeignKey`` @@ -585,6 +575,7 @@ VirtualField type, this part will also be refactored. + QuerySet filtering ~~~~~~~~~~~~~~~~~~ From ec9801d35131ed018ca502a37c28e26ec5e60e44 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 15:45:22 +0600 Subject: [PATCH 36/80] changes --- draft/orm-field-api-related-improvement.rst | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 057fd4fd..ebd13067 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -54,9 +54,9 @@ class Comment(models.Model): text = models.TextField() flags = models.OneToMany(Flag) -That way, if you had a new content type (say, a "Post"), it could also +That way, if we had a new content type (say, a "Post"), it could also participate in flagging, without having to modify the model definition -of "Flag" to add a new foreign key. Without baking in migrations, +of "Flag" to add a new foreign key. Without baking in migrations, there's obviously no way to make the underlying SQL play nice in this circumstance: one-to-many relationships with just two tables can only be expressed in SQL with a reverse foreign key relationship. However, @@ -78,7 +78,7 @@ the comment from a flag, have to call: comment = flag.comment_set.all()[0] as the ORM doesn't know for a fact that each flag could only have one -comment. But Django _could_ implement a OneToManyField in this way +comment. 
But Django can implement a OneToManyField in this way (using the underlying ManyToMany paradigm), and provide sugar such that this would all be nice and flexible, without having to do cumbersome ORM calls or explicitly define extra join tables: @@ -216,6 +216,7 @@ VirtualField's features can also be added to specific fields. 4. RelationField: ----------------- Base Field for All relation fields extended from new BaseField class. +In new class hirerarchy RelationFields will be Virtual. 5. VirtualField: ---------------- @@ -323,6 +324,17 @@ Those users who subclass Django's fields (or duck-type Django's fields) will need updating. Examples of such projects include django-rest-framework and django-taggit. +While the advised approach was: +1. Find places where rield.remote_field responds to different API than Field. +Fix these one at a time while trying to have backwards compat, even if the +API isn't public. + +2. In addition, simplifications to the APIs are welcome, as is a high level +documentation of how related fields actually work. + +3. We need to try to keep backwards compat as many projects are forced to +use the private APIs. But most of all, do small incremental changes. + Proposed API and workd flow for clean ups: ========================================== From 35d9e552e147971a6e43e44eefab0a04e33f6abf Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 16:36:02 +0600 Subject: [PATCH 37/80] changes --- draft/orm-field-api-related-improvement.rst | 22 ++++++++++----------- 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index ebd13067..54ea7c1c 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -325,6 +325,7 @@ need updating. Examples of such projects include django-rest-framework and django-taggit. While the advised approach was: + 1. 
Find places where rield.remote_field responds to different API than Field. Fix these one at a time while trying to have backwards compat, even if the API isn't public. @@ -336,6 +337,14 @@ documentation of how related fields actually work. use the private APIs. But most of all, do small incremental changes. +I would like to try the more direct approach. The reasons are, + +1. Define clear definition of relation fields class hierarchy and naming. + at present the class names for reverse relation and backreference is + quite confusing, like BackReference of any relation class is being called + + + Proposed API and workd flow for clean ups: ========================================== Relational field API @@ -502,12 +511,6 @@ default names: "fkname_targetname" where "fkname" is the name of the ForeignKey field and "targetname" is the name of the remote field name corresponding to the local one. I'm open to other suggestions on this. -There will also be a way to override the default names using a new field -option "enclosed_fields". This option will expect a tuple of fields each -of whose corresponds to one individual field in the same order as -specified in the target Field. This option will be ignored for -non-composite ForeignKeys. - For the sake of completeness, ForeignKey will also have an extra_filters method allowing to filter by a related object or its primary key. @@ -596,15 +599,10 @@ code that handles filtering are designed to describe single field lookups. -``__in`` lookups for ``VirtualField`` -======================================= - - ModelChoiceFields ~~~~~~~~~~~~~~~~~ -Again, we need a way to specify the value as a parameter passed in the -form. The same escaping solution can be used even here. 
+As the virtualField itself won't be backed by any real db field Admin/ModelForms ================ From f6d76a31526b62b230ce7ba73b3915038107134c Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 8 Apr 2017 14:53:25 +0600 Subject: [PATCH 38/80] changes --- draft/orm-field-api-related-improvement.rst | 36 +++++++++++++-------- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 54ea7c1c..d896e385 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -339,9 +339,16 @@ use the private APIs. But most of all, do small incremental changes. I would like to try the more direct approach. The reasons are, -1. Define clear definition of relation fields class hierarchy and naming. - at present the class names for reverse relation and backreference is - quite confusing, like BackReference of any relation class is being called +1. Define clear definition of relation fields class hierarchy and naming + At present the class names for reverse relation and backreference is + quite confusing. Like RemoteField is actually holding the information about + any Fields relation which are now + +2. I have plan to introduce OneToManyField which can be used directly + and will be the main ReverseForeignKey + +3. + @@ -349,8 +356,8 @@ Proposed API and workd flow for clean ups: ========================================== Relational field API ==================== -Currently the main use case is that we have a single place where I -can check that we don't define redundant APIs for related fields. +Currently the main use case is that we have a single place where +can be checked that we don't define redundant APIs for related fields. 
Structure of a relational field ------------------------------- @@ -358,7 +365,7 @@ Structure of a relational field A relational field consist of: - The user created field - - Possibly of a remote field, which is auto-created by the user created field + - Possibly of a remote_field, which is auto-created by the user created field Both the created field and the remote field can possibly add a descriptor to the field's model. @@ -415,7 +422,9 @@ A relational field consist of: field. - If an model defines a relation to abstract model, this should just fail (check this!) +This was basically taken from a old work on Relational API clean up, but not well tested. +I believe I can adjust these later. Part-2: @@ -496,25 +505,23 @@ pretty fragile, which means the changes have to be made carefully not to break anything. This will require different handling than regular fields that map directly to database columns. +For that reason the Relational API will be cleaned up to return consistant +result and later VirtualField based refactor will take place. + The first one to look at is ForeignKey since the other two rely on its functionality, OneToOneField being its descendant and ManyToManyField using ForeignKeys in the intermediary model. Once the ForeignKeys work, -OneToOneField should require minimal to no changes since it inherits +OneToOneField should require minimal changes since it inherits almost everything from ForeignKey. -The easiest part is that for composite related fields, the db_type will be -None since the data will be stored elsewhere. ForeignKey and OneToOneField will also be able to create the underlying fields automatically when added to the model. I'm proposing the following -default names: "fkname_targetname" where "fkname" is the name of the +default names: "fk_targetname" where "fkname" is the name of the ForeignKey field and "targetname" is the name of the remote field name corresponding to the local one. I'm open to other suggestions on this. 
-For the sake of completeness, ForeignKey will also have an extra_filters -method allowing to filter by a related object or its primary key. - Changes in ``ForeignKey`` @@ -612,3 +619,6 @@ Admin/ModelForms GIS Framework: ============== + + + From 934132be1c6de1c238e1d0c5e211c2f0ed598e5d Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 13 Apr 2017 20:21:58 +0600 Subject: [PATCH 39/80] changes --- draft/orm-field-api-related-improvement.rst | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index d896e385..a893d671 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -15,8 +15,7 @@ DEP : ORM Relation Fields API Improvements using VirtualField Background: =========== -Django's ORM is a simple & powerful tool which suits most use-cases. -However, historicaly it has some design limitations and complex internal +Historically Django's ORM has some design limitations and complex internal API which makes it not only hard to maintain but also produce inconsistant behaviours. From c85c1ddcc3c2cab416c6550b84d8fe8f3fea01b8 Mon Sep 17 00:00:00 2001 From: Asif Saif Uddin Date: Fri, 6 Oct 2017 16:46:05 +0600 Subject: [PATCH 40/80] drop un needed texts --- draft/orm-field-api-related-improvement.rst | 74 +-------------------- 1 file changed, 1 insertion(+), 73 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index a893d671..9c3c1074 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -31,79 +31,7 @@ incorporate virtualField type based refctors of RelationFields. Limitations of ORM that will be taken care of: ============================================== -One limitation is, - -Django supports many-to-one relationships -- the foreign keys live on -the "many", and point to the "one". 
So, in a simple app where you -have Comments that can get Flagged, one Comment can have many Flag's, -but each Flag refers to one and only one Comment: - -class Comment(models.Model): - text = models.TextField() - -class Flag(models.Model): - comment = models.ForeignKey(Comment) - -However, there are circumstances where it's much more convenient to -express the relationship as a one-to-many relationship. Suppose, for -example, you want to have a generic "flagging" app which other models -can use: - -class Comment(models.Model): - text = models.TextField() - flags = models.OneToMany(Flag) - -That way, if we had a new content type (say, a "Post"), it could also -participate in flagging, without having to modify the model definition -of "Flag" to add a new foreign key. Without baking in migrations, -there's obviously no way to make the underlying SQL play nice in this -circumstance: one-to-many relationships with just two tables can only -be expressed in SQL with a reverse foreign key relationship. However, -it's possible to describe OneToMany as a subset of ManyToMany, with a -uniqueness constraint on the "One" -- we rely on the join table to -handle the relationship: - -class Comment(models.Model): - text = models.TextField() - flags = models.ManyToMany(Flag, through=CommentFlag) - -class CommentFlag(models.Model): - comment = models.ForeignKey(Comment) - flag = models.ForeignKey(Flag, unique=True) - -While this works, the query interface remains cumbersome. To access -the comment from a flag, have to call: - -comment = flag.comment_set.all()[0] - -as the ORM doesn't know for a fact that each flag could only have one -comment. 
But Django can implement a OneToManyField in this way -(using the underlying ManyToMany paradigm), and provide sugar such -that this would all be nice and flexible, without having to do cumbersome -ORM calls or explicitly define extra join tables: - -class Comment(models.Model): - text = models.TextField() - flags = models.OneToMany(Flag) - -class Post(models.Model): - body = models.TextField() - flags = models.OneToMany(Flag) - -# in a separate reusable app... -class Flag(models.Model) - reason = models.TextField() - resolved = models.BooleanField() - -# in a view... -comment = flag.comment -post = flag.post - -It's obviously less database efficient than simple 2-table reverse -ForeignKey relationships, as you have to do an extra join on the third -table; but you gain semantic clarity and a nice way to use it in -reusable apps, so in many circumstances it's worth it. And it's a -fair shake clearer than the existing generic foreign key solutions. + Aim of the Proposal: From ca3c3865f725246631fe870d6cc8043d74a04654 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 6 Mar 2017 01:36:30 +0600 Subject: [PATCH 41/80] Initial draft proposal WIP --- draft/orm-improvements-for-composite-pk.rst | 437 ++++++++++++++++++++ 1 file changed, 437 insertions(+) create mode 100644 draft/orm-improvements-for-composite-pk.rst diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst new file mode 100644 index 00000000..9e7abb65 --- /dev/null +++ b/draft/orm-improvements-for-composite-pk.rst @@ -0,0 +1,437 @@ +========================================================= +DEP : ORM Fields and related improvement for composite PK +========================================================= + +:DEP: 0201 +:Author: Asif Saif Uddin +:Implementation Team: Asif Saif Uddin, django core team +:Shepherd: Django Core Team +:Status: Draft +:Type: Feature +:Created: 2017-3-2 +:Last-Modified: 2017-00-00 + +.. 
contents:: Table of Contents + :depth: 3 + :local: + + +Abstract +======== + +This DEP aims to improve different part of django ORM and other associated parts of django to support composite primary key in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. + +Key concerns of New Approach to implement ``CompositeField`` +============================================================== + +1. Change ForeignObjectRel subclasses to real field instances. (For example, + ForeignKey generates a ManyToOneRel in the related model). The Rel instances are already returned from get_field(), but they aren't yet field subclasses. +2. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be advantageous to be able to define reverse relations directly. For example, see ​https://github.com/akaariai/django-reverse-unique. + +3. Partition ForeignKey to virtual relation field, and concrete data field. The former is the model.author, the latter model.author_id's backing implementation. +Consider other cases where true virtual fields are needed. + +4. Introduce new standalone ``VirtualField`` +5. Incorporate ``VirtualField`` related changes in django +6. Split out existing Fields API into ``ConcreteField`` and BaseField + to utilize ``VirtualField``. +7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` +8. Refactor all RelationFields based on ``VirtualField`` based ForeignKey +9. Refactor GenericForeignKey based on ``VirtualField`` based ForeignKey + + + +Summary of ``CompositeField`` +============================= + +This section summarizes the basic API as established in the proposal for +GSoC 2011 [1]_. 
+ +A ``CompositeField`` requires a list of enclosed regular model fields as +positional arguments, as shown in this example:: + + class SomeModel(models.Model): + first_field = models.IntegerField() + second_field = models.CharField(max_length=100) + composite = models.CompositeField(first_field, second_field) + +The model class then contains a descriptor for the composite field, which +returns a ``CompositeValue``, a customized namedtuple. The +descriptor accepts any iterable of the appropriate length. An example +interactive session:: + + >>> instance = SomeModel(first_field=47, second_field="some string") + >>> instance.composite + CompositeValue(first_field=47, second_field='some string') + >>> instance.composite.first_field + 47 + >>> instance.composite[1] + 'some string' + >>> instance.composite = (74, "other string") + >>> instance.first_field, instance.second_field + (74, 'other string') + +``CompositeField`` supports the following standard field options: +``unique``, ``db_index``, ``primary_key``. The first two will simply add a +corresponding tuple to ``model._meta.unique_together`` or +``model._meta.index_together``. Other field options don't make much sense +in the context of composite fields. + +Supported ``QuerySet`` filters will be ``exact`` and ``in``. The former +should be clear enough; the latter is elaborated in a separate section. + +It will be possible to use a ``CompositeField`` as a target field of +``ForeignKey``, ``OneToOneField`` and ``ManyToManyField``. This is +described in more detail in the following section. + +Changes in ``ForeignKey`` +========================= + +Currently ``ForeignKey`` is a regular concrete field which manages both +the raw value stored in the database and the higher-level relationship +semantics. Managing the raw value is simple enough for simple +(single-column) targets. However, in the case of a composite target field, +this task becomes more complex.
The biggest problem is that many parts of +the ORM work under the assumption that for each database column there is a +model field it can assign the value from the column to. While it might be +possible to lift this restriction, it would be a really complex project by +itself. + +On the other hand, there is the abstraction of virtual fields working on +top of other fields which is required for this project anyway. The way +forward would be to use this abstraction for relationship fields. +Currently, ``ForeignKey`` (and by extension ``OneToOneField``) is the only +field whose ``name`` and ``attname`` differ, where ``name`` stores the +value dictated by the semantics of the field and ``attname`` stores the +raw value from the database. + +We can use this to our advantage and put an auxiliary field into the +``attname`` of each ``ForeignKey``, which would be of the same database +type as the target field, and turn ``ForeignKey`` into a virtual field on +top of the auxiliary field. This solution has the advantage that it +offloads the need to manage the raw database value off ``ForeignKey`` and +uses a field specifically intended for the task. + +In order to keep this backwards compatible and avoid the need to +explicitly create two fields for each ``ForeignKey``, the auxiliary field +needs to be created automatically during the phase where a model class is +created by its metaclass. Initially I implemented this as a method on +``ForeignKey`` which takes the target field and creates its copy, touches +it up and adds it to the model class. However, this requires performing +special tasks with certain types of fields, such as ``AutoField`` which +needs to be turned into an ``IntegerField`` or ``CompositeField`` which +requires copying its enclosed fields as well. + +A better approach is to add a method such as ``create_auxiliary_copy`` on +``Field`` which would create all new field instances and add them to the +appropriate model class. 
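The auxiliary-field idea above can be sketched in plain Python. This is only an illustration under stated assumptions: the ``Field``, ``AutoField`` and ``ForeignKey`` classes below are toy stand-ins rather than Django's classes, ``add_to_model`` is a hypothetical helper, and ``create_auxiliary_copy`` is the method name suggested in the text, not an existing API.

```python
import copy


class Field:
    """Toy stand-in for a concrete model field (not Django's Field)."""

    def __init__(self, db_type):
        self.db_type = db_type
        self.name = None

    def create_auxiliary_copy(self, name):
        # Hypothetical hook: by default the referencing side stores the
        # raw value in a plain copy of the target field.
        aux = copy.copy(self)
        aux.name = name
        return aux


class AutoField(Field):
    def __init__(self):
        super().__init__("serial")

    def create_auxiliary_copy(self, name):
        # The referencing column must not auto-increment, so the copy
        # degrades to a plain integer field.
        aux = Field("integer")
        aux.name = name
        return aux


class ForeignKey:
    """Toy virtual field: owns no column, delegates storage to a copy."""

    def __init__(self, target):
        self.target = target

    def add_to_model(self, fields, name):
        # Register the virtual field under ``name`` and the auxiliary
        # concrete field under ``attname`` (here ``<name>_id``).
        self.name = name
        self.attname = name + "_id"
        fields[name] = self
        fields[self.attname] = self.target.create_auxiliary_copy(self.attname)


fields = {}
ForeignKey(AutoField()).add_to_model(fields, "chef")
print(list(fields))               # ['chef', 'chef_id']
print(fields["chef_id"].db_type)  # integer
```

Letting each field type decide how it is copied keeps the special cases (such as the auto-incrementing target here) out of ``ForeignKey`` itself, which is the motivation the text gives for placing the method on ``Field``.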
+ +One possible problem with these changes is that they change the contents +of ``_meta.fields`` in each model out there that contains a relationship +field. For example, if a model contains the following fields:: + + ['id', + 'name', + 'address', + 'place_ptr', + 'rating', + 'serves_hot_dogs', + 'serves_pizza', + 'chef'] + +where ``place_ptr`` is a ``OneToOneField`` and ``chef`` is a +``ForeignKey``, after the change it will contain the following list:: + + ['id', + 'name', + 'address', + 'place_ptr', + 'place_ptr_id', + 'rating', + 'serves_hot_dogs', + 'serves_pizza', + 'chef', + 'chef_id'] + +This causes a lot of failures in the Django test suite, because there are +a lot of tests relying on the contents of ``_meta.fields`` or other +related attributes/properties. (Actually, this example is taken from one +of these tests, +``model_inheritance.tests.ModelInheritanceTests.test_multiple_table``.) +Fixing these is fairly simple: all they need is the appropriate +``_id`` fields added. However, this raises a concern of how ``_meta`` is +regarded. It has always been a private API officially, but everyone uses +it in their projects anyway. I still think the change is worth it, but it +might be a good idea to include a note about the change in the release +notes. + +Porting previous work on top of master +====================================== + +The first major task of this project is to take the code I wrote as part +of GSoC 2011 and sync it with the current state of master. Two years ago +I implemented ``CompositeField`` first and then refactored +``ForeignKey``, which is required to make it support ``CompositeField``. +This turned out to be +inefficient with respect to the development process, because some parts of +the refactor broke the introduced ``CompositeField`` functionality, +meaning I had to effectively reimplement parts of it again.
Also, some +abstractions introduced by the refactor made it possible to rewrite +certain parts in a cleaner way than what was necessary for +``CompositeField`` alone (e.g. database creation or certain features of +``model._meta``). + +In light of these findings I am convinced that a better approach would be +to first do the required refactor of ``ForeignKey`` and implement +``CompositeField`` as the next step. This will result in a more maintainable +development branch and a cleaner revision history, making it easier to +review the work before its eventual inclusion into Django. + +``__in`` lookups for ``CompositeField`` +======================================= + +The existing implementation of ``CompositeField`` handles ``__in`` lookups +in the generic, backend-independent ``WhereNode`` class and uses a +disjunctive normal form expression as in the following example:: + + SELECT a, b, c FROM tbl1, tbl2 + WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 5 AND c = 6); + +The problem with this solution is that in cases where the list of values +contains tens or hundreds of tuples, this DNF expression will be extremely +long and the database will have to evaluate it for each and every row, +without a possibility of optimizing the query. + +Certain database backends support the following alternative:: + + SELECT a, b, c FROM tbl1, tbl2 + WHERE (a, b, c) IN ((1, 2, 3), (4, 5, 6)); + +This would probably be the best option, but it can't be used by SQLite, +for instance. This is also the reason why the DNF expression was +implemented in the first place. + +In order to support this more natural syntax, the ``DatabaseOperations`` +class needs to be extended with a method such as ``composite_in_sql``. + +However, this leaves the issue of the inefficient DNF unresolved for +backends without support for tuple literals.
For such backends, the +following expression is proposed:: + + SELECT a, b, c FROM tbl1, tbl2 + WHERE EXISTS (SELECT a1, b1, c1 FROM (SELECT 1 AS a1, 2 AS b1, 3 AS c1 + UNION SELECT 4, 5, 6) AS vals + WHERE a1 = a AND b1 = b AND c1 = c); + +Since both syntaxes are rather generic and at least one of them should fit +any database backend directly, a new flag, +``DatabaseFeatures.supports_tuple_literals``, will be introduced, which the default +implementation of ``composite_in_sql`` will consult in order to choose +between the two options. + +``contenttypes`` and ``GenericForeignKey`` +========================================== + + +It's fairly easy to represent composite values as strings. Given an +``escape`` function which uniquely escapes commas, something like the +following works quite well:: + + ",".join(escape(value) for value in composite_value) + +However, in order to support JOINs generated by ``GenericRelation``, we +need to be able to reproduce exactly the same encoding using an SQL +expression which would be used in the JOIN condition. + +Luckily, while strings encoded this way need to be decodable in +Python (for example, when retrieving the related object using +``GenericForeignKey`` or when the admin decodes the primary key from the URL), +this isn't necessary at the database level. Using SQL we only ever need to +perform this in one direction, that is from a tuple of values into a +string. + +That means we can use a generalized version of the function +``django.contrib.admin.utils.quote`` which replaces each unsafe +character with its ASCII value in hexadecimal base, preceded by an escape +character. In this case, only two characters are unsafe -- comma (which is +used to separate the values) and an escape character (which I arbitrarily +chose as '~').
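The escaping scheme described above can be sketched in Python as follows. The names ``escape``, ``encode`` and ``decode`` are illustrative assumptions, not existing Django helpers; only the '~' escape character, the comma separator and the hexadecimal replacement come from the text.

```python
ESCAPE = "~"
SEPARATOR = ","


def escape(value):
    # Cast to text, then replace each unsafe character with the escape
    # character followed by its two-digit hexadecimal ASCII code.
    # The escape character itself must be handled first, otherwise the
    # '~' introduced for commas would be escaped a second time.
    text = str(value)
    for char in (ESCAPE, SEPARATOR):
        text = text.replace(char, "%s%02X" % (ESCAPE, ord(char)))
    return text


def encode(composite_value):
    # Join the escaped components with the (now unambiguous) separator.
    return SEPARATOR.join(escape(value) for value in composite_value)


def decode(text):
    # Inverse of encode(); every component comes back as a string.
    parts = []
    for chunk in text.split(SEPARATOR):
        out = []
        i = 0
        while i < len(chunk):
            if chunk[i] == ESCAPE:
                out.append(chr(int(chunk[i + 1:i + 3], 16)))
                i += 3
            else:
                out.append(chunk[i])
                i += 1
        parts.append("".join(out))
    return tuple(parts)


print(encode((47, "a,b~c")))          # 47,a~2Cb~7Ec
print(decode(encode((47, "a,b~c"))))  # ('47', 'a,b~c')
```

Because decoding returns strings only, converting each component back to the Python type of its field would be left to the enclosed fields.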
+ +To reproduce this encoding, all values need to be cast to strings and then +for each such string two calls to the ``replace`` functions are made:: + + replace(replace(CAST (`column` AS text), '~', '~7E'), ',', '~2C') + +According to available documentation, all four supported database backends +provide the ``replace`` function. [2]_ [3]_ [4]_ [5]_ + +Even though the ``replace`` function seems to be available in all major +database servers (even ones not officially supported by Django, including +MSSQL, DB2, Informix and others), this is still probably best left to the +database backend and will be implemented as +``DatabaseOperations.composite_value_to_text_sql``. + +One possible pitfall of this implementation might be that it may not work +with any column type that isn't an integer or a text string due to a +simple fact – the string the database would cast it to will probably +differ from the one Python will use. However, I'm not sure there's +anything we can do about this, especially since the string representation +chosen by the database may be specific for each database server. Therefore +I'm inclined to declare ``GenericRelation`` unsupported for models with a +composite primary key containing any special columns. This should be +extremely rare anyway. + +Database introspection, ``inspectdb`` +===================================== + +There are three main goals concerning database introspection in this +project. The first is to ensure the output of ``inspectdb`` remains the +same as it is now for models with simple primary keys and simple foreign +key references, or at least equivalent. While this shouldn't be too +difficult to achieve, it will still be regarded with high importance. + +The second goal is to extend ``inspectdb`` to also create a +``CompositeField`` in models where the table contains a composite primary +key. 
This part shouldn't be too difficult, +``DatabaseIntrospection.get_primary_key_column`` will be renamed to +``get_primary_key`` which will return a tuple of columns and in case the +tuple contains more than one element, an appropriate ``CompositeField`` +will be added. This will also require updating +``DatabaseWrapper.check_constraints`` for certain backends since it uses +``get_primary_key_column``. + +The third goal is to also make ``inspectdb`` aware of composite foreign +keys. This will need a rewrite of ``get_relations`` which will have to +return a mapping between tuples of columns instead of single columns. It +should also ensure each tuple of columns pointed to by a foreign key gets +a ``CompositeField``. This part will also probably require some changes in +other backend methods as well, especially since each backend has a unique +tangle of introspection methods. + +This part requires a tremendous amount of work, because practically every +single change needs to be done four times and needs separate research of +the specific backend in question. Therefore I can't promise to deliver full support +for all features mentioned in this section for all backends. I'd say +backwards compatibility is a requirement, recognition of composite primary +keys is a highly wanted feature that I'll try to implement for as many +backends as possible and recognition of composite foreign keys would be a +nice extra to have for at least one or two backends. + +I'll be implementing the features for the individual backends in the +following order: PostgreSQL, MySQL, SQLite and Oracle. I put PostgreSQL +first because, well, this is the backend with the best support in Django +(and also because it is the one where I'd actually use the features I'm +proposing). Oracle comes last because I don't have any way to test it and +I'm afraid I'd be stabbing in the dark anyway. Of the two remaining +backends I put MySQL first for two reasons. 
First, I don't think people +need to run ``inspectdb`` on SQLite databases too often (if ever). Second, +on MySQL the task seems marginally easier as the database has +introspection features other than just “give me the SQL statement used to +create this table”, whose parsing is most likely going to be a complete +mess. + +All in all, extending ``inspectdb`` features is a tedious and difficult +task with shady outcome, which I'm well aware of. Still, I would like to +try to at least implement the easier parts for the most used backends. It +might quite possibly turn out that I won't manage to implement more than +composite primary key detection for PostgreSQL. This is the reason I keep +this as one of the last features I intend to work on, as shown in the +timeline. It isn't a necessity, we can always just add a note to the docs +that ``inspectdb`` just can't detect certain scenarios and ask people to +edit their models manually. + +Updatable primary keys in models +================================ + +The algorithm that determines what kind of database query to issue on +``model.save()`` is a fairly simple and well-documented one [6]_. If a row +exists in the database with the value of its primary key equal to the +saved object, it is updated, otherwise a new row is inserted. This +behavior is intuitive and works well for models where the primary key is +automatically created by the framework (be it an ``AutoField`` or a parent +link in the case of model inheritance). + +However, as soon as the primary key is explicitly created, the behavior +becomes less intuitive and might be confusing, for example, to users of the +admin. 
For instance, say we have the following model:: + + class Person(models.Model): + first_name = models.CharField(max_length=47) + last_name = models.CharField(max_length=47) + shoe_size = models.PositiveSmallIntegerField() + + full_name = models.CompositeField(first_name, last_name, + primary_key=True) + +Then we register the model in the admin using the standard one-liner:: + + admin.site.register(Person) + +Since we haven't excluded any fields, all three fields will be editable in +the admin. Now, suppose there's an instance whose ``full_name`` is +``CompositeValue(first_name='Darth', last_name='Vadur')``. A user decides +to fix the last name using the admin, hits the “Save” button and instead +of fixing an existing record, a new one will appear with the new value, +while the old one remains untouched. This behavior is clearly broken from +the point of view of the user. + +It can be argued that it is the developer's fault that the database schema +is poorly chosen and that they expose the primary key to their users. +While this may be true in some cases, it is still to some extent a +subjective matter. + +Therefore I propose a new behavior for ``model.save()`` where it would +detect a change in the instance's primary key and in that case issue an +``UPDATE`` for the right row, i.e. ``WHERE primary_key = previous_value``. + +Of course, just going ahead and changing the behavior in this way for all +models would be backwards incompatible. To do this properly, we would need +to make this an opt-in feature. This can be achieved in multiple ways. + +1) add a keyword argument such as ``update_pk`` to ``Model.save`` +2) add a new option to ``Model.Meta``, ``updatable_pk`` +3) make this a project-wide setting + +Option 3 doesn't look pleasant and I think I can safely eliminate that. +Option 2 is somewhat better, although it adds a new ``Meta`` option. +Option 1 is the most flexible solution, however, it does not change the +behavior of the admin, at least not by default. 
This can be worked around +by overriding the ``save`` method to use a different default:: + + class MyModel(models.Model): + def save(self, update_pk=True, **kwargs): + kwargs['update_pk'] = update_pk + return super(MyModel, self).save(**kwargs) + +To avoid the need to repeat this for each model, a class decorator might +be provided to perform this automatically. + +In order to implement this new behavior a little bit of extra complexity +would have to be added to models. Model instances would need to store the +last known value of the primary key as retrieved from the database. On +save it would just find out whether the last known value is present and in +that case issue an ``UPDATE`` using the old value in the ``WHERE`` +condition. + +So far so good, this could be implemented fairly easily. However, the +problem becomes considerably more difficult as soon as we take into +account the fact that updating a primary key value may break foreign key +references. In order to avoid breaking references the ``on_delete`` +mechanism of ``ForeignKey`` would have to be extended to support updates +as well. This means that the collector used by deletion will need to be +extended as well. + +The problem becomes particularly nasty if we realize that a ``ForeignKey`` +might be part of a primary key, which means the collector needs to keep +track of which field depends on which in a graph of potentially unlimited +size. Compared to this, deletion is simpler as it only needs to find a +list of all affected model instances as opposed to having to keep track of +which field to update using which value. + +Given the complexity of this problem and the fact that it is not directly +related to composite fields, this is left as the last feature which will +be implemented only if I manage to finish everything else on time. 
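The bookkeeping described in this section (remember the last known primary key and use it in the ``WHERE`` condition) can be sketched with the standard-library ``sqlite3`` module. This is a minimal sketch under stated assumptions: ``Person`` is a toy class rather than a Django model, the key is a single column to keep the example short, ``_stored_pk`` is a hypothetical attribute name, and updating foreign key references is ignored entirely.

```python
import sqlite3


class Person:
    """Toy record object with a user-editable primary key."""

    def __init__(self, conn, name, shoe_size):
        self.conn = conn
        self.name = name            # primary key
        self.shoe_size = shoe_size
        self._stored_pk = None      # last key value known to be in the database

    def save(self, update_pk=False):
        if update_pk and self._stored_pk is not None:
            # Opt-in behaviour: address the row by its *old* key, so a
            # changed key updates the existing row.
            self.conn.execute(
                "UPDATE person SET name = ?, shoe_size = ? WHERE name = ?",
                (self.name, self.shoe_size, self._stored_pk),
            )
        else:
            # Default behaviour: the row is looked up by the *current*
            # key, so a changed key produces a second row.
            self.conn.execute(
                "INSERT OR REPLACE INTO person (name, shoe_size) VALUES (?, ?)",
                (self.name, self.shoe_size),
            )
        self._stored_pk = self.name


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT PRIMARY KEY, shoe_size INTEGER)")

darth = Person(conn, "Vadur", 47)
darth.save()
darth.name = "Vader"
darth.save(update_pk=True)   # existing row is updated in place

han = Person(conn, "Solo", 43)
han.save()
han.name = "Organa"
han.save()                   # default: old row stays, new row appears

rows = conn.execute("SELECT name, shoe_size FROM person ORDER BY name").fetchall()
print(rows)  # [('Organa', 43), ('Solo', 43), ('Vader', 47)]
```

The second half of the usage shows the behaviour the section calls confusing: without the opt-in flag the old row remains untouched.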
+ + +# https://people.ksp.sk/~johnny64/GSoC-full-proposal + From e99b8314064a391fe6605f666f19e151340d09e6 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 9 Mar 2017 01:32:17 +0600 Subject: [PATCH 42/80] changes --- draft/orm-improvements-for-composite-pk.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 9e7abb65..67091693 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -38,6 +38,8 @@ Consider other cases where true virtual fields are needed. 7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` 8. Refactor all RelationFields based on ``VirtualField`` based ForeignKey 9. Refactor GenericForeignKey based on ``VirtualField`` based ForeignKey +10. Make changes to migrations framework to work properly with Reafctored Field + API. From 7b08f7e54d62e7cf96021c3f9278bc6ec7c087f0 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 9 Mar 2017 02:30:14 +0600 Subject: [PATCH 43/80] re order --- draft/orm-improvements-for-composite-pk.rst | 121 +++++++++++--------- 1 file changed, 69 insertions(+), 52 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 67091693..3df41f5a 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -23,68 +23,32 @@ This DEP aims to improve different part of django ORM and other associated parts Key concerns of New Approach to implement ``CompositeField`` ============================================================== - -1. Change ForeignObjectRel subclasses to real field instances. (For example, +1. Split out Field API to ConcreteField, BaseField etc and change on ORM based on the splitted API. +2. Introduce new standalone well defined ``VirtualField`` +3. Incorporate ``VirtualField`` related changes in django +4. 
Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API +5. Figure out other cases where true virtual fields are needed. +6. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey +7. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey +8. Change ForeignObjectRel subclasses to real field instances. (For example, ForeignKey generates a ManyToOneRel in the related model). The Rel instances are already returned from get_field(), but they aren't yet field subclasses. -2. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be advantageous to be able to define reverse relations directly. For example, see ​https://github.com/akaariai/django-reverse-unique. - -3. Partition ForeignKey to virtual relation field, and concrete data field. The former is the model.author, the latter model.author_id's backing implementation. -Consider other cases where true virtual fields are needed. - -4. Introduce new standalone ``VirtualField`` -5. Incorporate ``VirtualField`` related changes in django -6. Split out existing Fields API into ``ConcreteField`` and BaseField - to utilize ``VirtualField``. -7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` -8. Refactor all RelationFields based on ``VirtualField`` based ForeignKey -9. Refactor GenericForeignKey based on ``VirtualField`` based ForeignKey +9. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be advantageous to be able to define reverse relations directly. For example, see ​https://github.com/akaariai/django-reverse-unique. + 10. Make changes to migrations framework to work properly with Reafctored Field API. +11. Consider Database Contraints work of lan-foote and +12. Changes in AutoField -Summary of ``CompositeField`` -============================= - -This section summarizes the basic API as established in the proposal for -GSoC 2011 [1]_. 
- -A ``CompositeField`` requires a list of enclosed regular model fields as -positional arguments, as shown in this example:: - - class SomeModel(models.Model): - first_field = models.IntegerField() - second_field = models.CharField(max_length=100) - composite = models.CompositeField(first_field, second_field) +New split out Field API +========================= -The model class then contains a descriptor for the composite field, which -returns a ``CompositeValue`` which is a customized namedtuple, the -descriptor accepts any iterable of the appropriate length. An example -interactive session:: - >>> instance = new SomeModel(first_field=47, second_field="some string") - >>> instance.composite - CompositeObject(first_field=47, second_field='some string') - >>> instance.composite.first_field - 47 - >>> instance.composite[1] - 'some string' - >>> instance.composite = (74, "other string") - >>> instance.first_field, instance.second_field - (74, 'other string') +Introduce ``VirtualField`` +========================= -``CompositeField`` supports the following standard field options: -``unique``, ``db_index``, ``primary_key``. The first two will simply add a -corresponding tuple to ``model._meta.unique_together`` or -``model._meta.index_together``. Other field options don't make much sense -in the context of composite fields. -Supported ``QuerySet`` filters will be ``exact`` and ``in``. The former -should be clear enough, the latter is elaborated in a separate section. - -It will be possible to use a ``CompositeField`` as a target field of -``ForeignKey``, ``OneToOneField`` and ``ManyToManyField``. This is -described in more detail in the following section. Changes in ``ForeignKey`` ========================= @@ -167,6 +131,57 @@ it in their projects anyway. I still think the change is worth it, but it might be a good idea to include a note about the change in the release notes. 
+ + +Summary of ``CompositeField`` +============================= + +This section summarizes the basic API as established in the proposal for +GSoC 2011 [1]_. + +A ``CompositeField`` requires a list of enclosed regular model fields as +positional arguments, as shown in this example:: + + class SomeModel(models.Model): + first_field = models.IntegerField() + second_field = models.CharField(max_length=100) + composite = models.CompositeField(first_field, second_field) + +The model class then contains a descriptor for the composite field, which +returns a ``CompositeValue``, a customized namedtuple. On assignment the +descriptor accepts any iterable of the appropriate length. An example +interactive session:: + + >>> instance = SomeModel(first_field=47, second_field="some string") + >>> instance.composite + CompositeValue(first_field=47, second_field='some string') + >>> instance.composite.first_field + 47 + >>> instance.composite[1] + 'some string' + >>> instance.composite = (74, "other string") + >>> instance.first_field, instance.second_field + (74, 'other string') + +``CompositeField`` supports the following standard field options: +``unique``, ``db_index``, ``primary_key``. The first two will simply add a +corresponding tuple to ``model._meta.unique_together`` or +``model._meta.index_together``. Other field options don't make much sense +in the context of composite fields. + +Supported ``QuerySet`` filters will be ``exact`` and ``in``. The former +should be clear enough; the latter is elaborated in a separate section. + +It will be possible to use a ``CompositeField`` as a target field of +``ForeignKey``, ``OneToOneField`` and ``ManyToManyField``. This is +described in more detail in the following section.
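The descriptor behavior summarized above can be approximated in plain Python. The following is a minimal sketch, not the proposed implementation: ``CompositeDescriptor`` is a hypothetical name, the model is an ordinary class rather than a ``models.Model``, and the enclosed fields are referenced by attribute name instead of by field instance.

```python
from collections import namedtuple


class CompositeDescriptor:
    """Hypothetical descriptor exposing several attributes as one namedtuple."""

    def __init__(self, *field_names):
        self.field_names = field_names
        self.value_class = namedtuple("CompositeValue", field_names)

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Build a fresh namedtuple from the current values of the enclosed attributes.
        return self.value_class(*(getattr(instance, name) for name in self.field_names))

    def __set__(self, instance, value):
        # Any iterable of the right length is accepted on assignment.
        values = tuple(value)
        if len(values) != len(self.field_names):
            raise ValueError(
                f"expected {len(self.field_names)} values, got {len(values)}"
            )
        for name, item in zip(self.field_names, values):
            setattr(instance, name, item)


class SomeModel:
    composite = CompositeDescriptor("first_field", "second_field")

    def __init__(self, first_field, second_field):
        self.first_field = first_field
        self.second_field = second_field


instance = SomeModel(first_field=47, second_field="some string")
first_read = instance.composite
instance.composite = [74, "other string"]  # a list works as well as a tuple
```

Because the value is rebuilt on every read, it always reflects the enclosed attributes, and the enclosed attributes remain the only place where data is stored.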
+ + + +Alternative Approach of compositeFiled +======================================= + + Porting previous work on top of master ====================================== @@ -231,6 +246,7 @@ any database backend directly, a new flag will be introduced, implementation of ``composite_in_sql`` will consult in order to choose between the two options. + ``contenttypes`` and ``GenericForeignKey`` ========================================== @@ -283,6 +299,7 @@ I'm inclined to declare ``GenericRelation`` unsupported for models with a composite primary key containing any special columns. This should be extremely rare anyway. + Database introspection, ``inspectdb`` ===================================== From 3e591562b08c86d8119a4b11c782d6689fe6c889 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 11 Mar 2017 18:27:49 +0600 Subject: [PATCH 44/80] modifications --- draft/orm-improvements-for-composite-pk.rst | 59 ++++++++++----------- 1 file changed, 28 insertions(+), 31 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 3df41f5a..0e9caf98 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -41,6 +41,30 @@ Key concerns of New Approach to implement ``CompositeField`` 12. Changes in AutoField +Porting previous work on top of master +====================================== + +The first major task of this project is to take the code written as part +of GSoC 2013 and sync it with the current state of master. The order in +which It was implemented two years ago was to implement +``CompositeField`` first and then a refactor of ``ForeignKey`` which +is required to make it support ``CompositeField``. This turned out to be +inefficient with respect to the development process, because some parts of +the refactor broke the introduced ``CompositeField`` functionality, +meaning that it was needed effectively reimplement parts of it again. 
+ +Also, some abstractions introduced by the refactor made it possible to +rewrite certain parts in a cleaner way than what was necessary for +``CompositeField`` alone (e.g. database creation or certain features of +``model._meta``). + +In light of these findings I am convinced that a better approach would be +to first do the required refactor of ``ForeignKey`` and implement +CompositeField as the next step. This will result in a better maintainable +development branch and a cleaner revision history, making it easier to +review the work before its eventual inclusion into Django. + + New split out Field API ========================= @@ -182,27 +206,6 @@ Alternative Approach of compositeFiled ======================================= -Porting previous work on top of master -====================================== - -The first major task of this project is to take the code I wrote as part -of GSoC 2011 and sync it with the current state of master. The order in -which I implemented things two years ago was to implement -``CompositeField`` first and then I did a refactor of ``ForeignKey`` which -is required to make it support ``CompositeField``. This turned out to be -inefficient with respect to the development process, because some parts of -the refactor broke the introduced ``CompositeField`` functionality, -meaning I had to effectively reimplement parts of it again. Also, some -abstractions introduced by the refactor made it possible to rewrite -certain parts in a cleaner way than what was necessary for -``CompositeField`` alone (e.g. database creation or certain features of -``model._meta``). - -In light of these findings I am convinced that a better approach would be -to first do the required refactor of ``ForeignKey`` and implement -CompositeField as the next step. This will result in a better maintainable -development branch and a cleaner revision history, making it easier to -review the work before its eventual inclusion into Django. 
``__in`` lookups for ``CompositeField`` ======================================= @@ -359,13 +362,14 @@ timeline. It isn't a necessity, we can always just add a note to the docs that ``inspectdb`` just can't detect certain scenarios and ask people to edit their models manually. + Updatable primary keys in models ================================ The algorithm that determines what kind of database query to issue on -``model.save()`` is a fairly simple and well-documented one [6]_. If a row -exists in the database with the value of its primary key equal to the -saved object, it is updated, otherwise a new row is inserted. This +``model.save()`` is a fairly simple and well-documented one [6]_. If a +row exists in the database with the value of its primary key equal to +the saved object, it is updated, otherwise a new row is inserted. This behavior is intuitive and works well for models where the primary key is automatically created by the framework (be it an ``AutoField`` or a parent link in the case of model inheritance). @@ -447,10 +451,3 @@ size. Compared to this, deletion is simpler as it only needs to find a list of all affected model instances as opposed to having to keep track of which field to update using which value. -Given the complexity of this problem and the fact that it is not directly -related to composite fields, this is left as the last feature which will -be implemented only if I manage to finish everything else on time. 
- - -# https://people.ksp.sk/~johnny64/GSoC-full-proposal - From 94956935a27c031453b7aaf79f93a3298ba684e9 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 11 Mar 2017 20:08:25 +0600 Subject: [PATCH 45/80] more modifications --- draft/orm-improvements-for-composite-pk.rst | 39 ++++++++++++++++++--- 1 file changed, 34 insertions(+), 5 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 0e9caf98..6748bb78 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -18,8 +18,31 @@ DEP : ORM Fields and related improvement for composite PK Abstract ======== +Django's ORM is a powerful tool which suits perfectly most use-cases, +however, there are cases where having exactly one primary key column per +table induces unnecessary redundancy. + +One such case is the many-to-many intermediary model. Even though the pair +of ForeignKeys in this model identifies uniquely each relationship, an +additional field is required by the ORM to identify individual rows. While +this isn't a real problem when the underlying database schema is created +by Django, it becomes an obstacle as soon as one tries to develop a Django +application using a legacy database. + +Since there is already a lot of code relying on the pk property of model +instances and the ability to use it in QuerySet filters, it is necessary +to implement a mechanism to allow filtering of several actual fields by +specifying a single filter. + +The proposed solution is using Virtualfield type, CompositeField. This field +type will enclose several real fields within one single object. + + +Motivation +========== +This DEP aims to improve different part of django ORM and other associated parts of django to support composite primary key in django. There were several attempt to fix this problem and several ways to implement this. 
There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API +and design everything as much simple and small as possible to be able to implement separately. -This DEP aims to improve different part of django ORM and other associated parts of django to support composite primary key in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. Key concerns of New Approach to implement ``CompositeField`` ============================================================== @@ -36,9 +59,15 @@ Key concerns of New Approach to implement ``CompositeField`` 10. Make changes to migrations framework to work properly with Reafctored Field API. -11. Consider Database Contraints work of lan-foote and -12. Changes in AutoField +11. Make sure new class based Index API ise used properly with refactored Field + API. + +12. Consider Database Contraints work of lan-foote and + +13. SubField/AuxilaryField + +14. 
Update in AutoField Porting previous work on top of master @@ -69,8 +98,8 @@ New split out Field API ========================= -Introduce ``VirtualField`` -========================= +Introduce standalone ``VirtualField`` +===================================== From 90805c77716546cb39c668a2489d4bafc7f82f3e Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 11 Mar 2017 20:37:48 +0600 Subject: [PATCH 46/80] more modifications --- draft/orm-improvements-for-composite-pk.rst | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 6748bb78..3dfab7e5 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -74,8 +74,9 @@ Porting previous work on top of master ====================================== The first major task of this project is to take the code written as part -of GSoC 2013 and sync it with the current state of master. The order in -which It was implemented two years ago was to implement +of GSoC 2013 and compare it aganist master to have Idea of valid part. + +The order in which It was implemented few years ago was to implement ``CompositeField`` first and then a refactor of ``ForeignKey`` which is required to make it support ``CompositeField``. This turned out to be inefficient with respect to the development process, because some parts of @@ -87,11 +88,11 @@ rewrite certain parts in a cleaner way than what was necessary for ``CompositeField`` alone (e.g. database creation or certain features of ``model._meta``). -In light of these findings I am convinced that a better approach would be -to first do the required refactor of ``ForeignKey`` and implement -CompositeField as the next step. This will result in a better maintainable -development branch and a cleaner revision history, making it easier to -review the work before its eventual inclusion into Django. 
+I am convinced that a better approach would be to improve the Field API first, then +implement the VirtualField type, use it to do the required refactor of ``ForeignKey``, +and implement CompositeField as the next step. This will result in a more +maintainable development branch and a cleaner revision history, making it easier +to review the work before its eventual inclusion into Django. New split out Field API From 71bf93c36af2cb06d46ada1fd25523862a6b33cc Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 11 Mar 2017 23:34:59 +0600 Subject: [PATCH 47/80] more modifications --- draft/orm-improvements-for-composite-pk.rst | 30 +++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 3dfab7e5..c9ebd82e 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -97,6 +97,33 @@ to review the work before its eventual inclusion into Django. New split out Field API ========================= +1. BaseField: +------------- +Base structure for all Field types in the Django ORM, whether Concrete +or Virtual. + +2. ConcreteField: +----------------- +ConcreteField will have all the common attributes of a regular concrete field. + +3. Field: +--------- +The present base Field class, which should be refactored using BaseField and ConcreteField. +If it is decided to provide the optional virtual type to regular fields, then VirtualField's features can also be added to specific fields. + +4. VirtualField: +---------------- +A true standalone virtual field will be added to the system to solve some long-standing design limitations of the Django ORM. Initially RelationFields, GenericRelations etc. will benefit from using VirtualFields, and later CompositeField +or any other virtual field type can benefit from VirtualField. + +5. RelationField: +----------------- + + +6.
CompositeField: +------------------ +A composite field can be implemented based on BaseField and VirtualField to solve +the CompositeKey/Multi column PrimaryKey issue. Introduce standalone ``VirtualField`` @@ -186,6 +213,9 @@ might be a good idea to include a note about the change in the release notes. +Changes in ``RelationField`` +============================= + Summary of ``CompositeField`` ============================= From 23c4cc3ee22af6e1aba9d6378b03bca984a75951 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 12 Mar 2017 00:04:31 +0600 Subject: [PATCH 48/80] more detail break down from older references --- draft/orm-improvements-for-composite-pk.rst | 231 ++++++++++++++++++++ 1 file changed, 231 insertions(+) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index c9ebd82e..b53d6103 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -266,6 +266,212 @@ Alternative Approach of compositeFiled ======================================= +Implementation +-------------- + +Specifying a CompositeField in a Model +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The constructor of a CompositeField will accept the supported options as +keyword parameters and the enclosed fields will be specified as positional +parameters. The order in which they are specified will determine their +order in the namedtuple representing the CompositeField value (i. e. when +retrieving and assigning the CompositeField's value; see example below). + +unique and db_index +~~~~~~~~~~~~~~~~~~~ +Implementing these will require some modifications in the backend code. +The table creation code will have to handle virtual fields as well as +local fields in the table creation and index creation routines +respectively. 
+ +When the code handling CompositeField.unique is finished, the +models.options.Options class will have to be modified to create a unique +CompositeField for each tuple in the Meta.unique_together attribute. The +code handling unique checks in models.Model will also have to be updated +to reflect the change. + +Retrieval and assignment +~~~~~~~~~~~~~~~~~~~~~~~~ + +Jacob has actually already provided a skeleton of the code that takes care +of this as seen in [1]. I'll only summarize the behaviour in a brief +example of my own:: + + class SomeModel(models.Model): + first_field = models.IntegerField() + second_field = models.CharField(max_length=100) + composite = models.CompositeField(first_field, second_field) + + >>> instance = SomeModel(first_field=47, second_field="some string") + >>> instance.composite + CompositeObject(first_field=47, second_field='some string') + >>> instance.composite.first_field + 47 + >>> instance.composite[1] + 'some string' + >>> instance.composite = (74, "other string") + >>> instance.first_field, instance.second_field + (74, 'other string') + +Accessing the field attribute will create a CompositeObject instance which +will behave like a tuple but also give direct access to enclosed field +values via appropriately named attributes. + +Assignment will be possible using any iterable. The order of the values in +the iterable will have to be the same as the order in which the underlying +fields have been specified to the CompositeField. + +QuerySet filtering +~~~~~~~~~~~~~~~~~~ + +This is where the real fun begins. + +The fundamental problem here is that Q objects which are used all over the +code that handles filtering are designed to describe single field lookups. +On the other hand, CompositeFields will require a way to describe several +individual field lookups by a single expression.
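To make "several individual field lookups by a single expression" concrete, here is a small sketch of the expansion for the ``exact`` case. ``expand_exact_lookup`` is a hypothetical helper written for illustration only. The proposal performs this step inside the query construction code, not in a user-facing function like this one.

```python
def expand_exact_lookup(field_names, value):
    """Turn one composite ``exact`` lookup into one lookup per enclosed field.

    ``field_names`` lists the enclosed fields in declaration order and
    ``value`` is any iterable holding one value per field.
    """
    values = tuple(value)
    if len(values) != len(field_names):
        raise ValueError(
            f"expected {len(field_names)} values, got {len(values)}"
        )
    # Each enclosed field gets its own atomic condition; together they form
    # a conjunction, which is all an exact match needs.
    return {f"{name}__exact": item for name, item in zip(field_names, values)}


lookups = expand_exact_lookup(("first_field", "second_field"), (47, "some string"))
```

The resulting dictionary has the shape of keyword arguments to ``filter()``, where multiple conditions are combined with ``AND``.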
+ +Since the Q objects themselves have no idea about fields at all and the +actual field resolution from the filter conditions happens deeper down the +line, inside models.sql.query.Query, this is where we can handle the +filters properly. + +There is already some basic machinery inside Query.add_filter and +Query.setup_joins that is in use by GenericRelations, this is +unfortunately not enough. The optional extra_filters field method will be +of great use here, though it will have to be extended. + +Currently the only parameters it gets are the list of joins the +filter traverses, the position in the list and a negate parameter +specifying whether the filter is negated. The GenericRelation instance can +determine the value of the content type (which is what the extra_filters +method is used for) easily based on the model it belongs to. + +This is not the case for a CompositeField -- it doesn't have any idea +about the values used in the query. Therefore a new parameter has to be +added to the method so that the CompositeField can construct all the +actual filters from the iterable containing the values. + +Afterwards the handling inside Query is pretty straightforward. For +CompositeFields (and virtual fields in general) there is no value to be +used in the where node, the extra_filters are responsible for all +filtering, but since the filter should apply to a single object even after +join traversals, the aliases will be set up while handling the "root" +filter and then reused for each one of the extra_filters. + +This way of extending the extra_filters mechanism will allow the field +class to create conjunctions of atomic conditions. This is sufficient for +the "__exact" lookup type which will be implemented. + +Of the other lookup types, the only one that looks reasonable is "__in". +This will, however, have to be represented as a disjunction of multiple +"__exact" conditions since not all database backends support tuple +construction inside expressions. 
Therefore this lookup type will be left +out of this project as the mechanism would need much more work to make it +possible. + +CompositeField.primary_key +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As with db_index and unique, the backend table generating code will have +to be updated to set the PRIMARY KEY to a tuple. In this case, however, +the impact on the rest of the ORM and some other parts of Django is more +serious. + +A (hopefully) complete list of things affected by this is: +- the admin: the possibility to pass the value of the primary key as a + parameter inside the URL is a necessity to be able to work with a model +- contenttypes: since the admin uses GenericForeignKeys to log activity, + there will have to be some support +- forms: more precisely, ModelForms and their ModelChoiceFields +- relationship fields: ForeignKey, ManyToManyField and OneToOneField will + need a way to point to a model with a CompositeField as its primary key + +Let's look at each one of them in more detail. + +Admin +~~~~~ + +The solution that has been proposed so many times in the past [2], [3] is +to extend the quote function used in the admin to also quote the comma and +then use an unquoted comma as the separator. Even though this solution +looks ugly to some, I don't think there is much choice -- there needs to +be a way to separate the values and in theory, any character could be +contained inside a value so we can't really avoid choosing one and +escaping it. + +GenericForeignKeys +~~~~~~~~~~~~~~~~~~ + +Even though the admin uses the contenttypes framework to log the history +of actions, it turns out proper handling on the admin side will make +things work without the need to modify GenericForeignKey code at all. This +is thanks to the fact that the admin uses only the ContentType field and +handles the relations on its own. Making sure the unquoting function +recreates the whole CompositeObjects where necessary should suffice. 
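The comma-separated URL encoding discussed in the Admin subsection can be sketched as follows. This is an assumption-laden illustration: ``quote_part``, ``unquote_part``, ``join_pk`` and ``split_pk`` are hypothetical helpers, and the escape set and ``_XX`` escape format are chosen for the example rather than taken from the admin's own quoting function.

```python
# Characters that must not appear literally inside one quoted component.
# The underscore is included because it introduces an escape sequence.
ESCAPED = set(",_/:?#%")


def quote_part(value):
    """Escape one primary key component so it contains no literal comma."""
    return "".join(
        f"_{ord(ch):02X}" if ch in ESCAPED else ch for ch in str(value)
    )


def unquote_part(text):
    """Reverse quote_part(); every underscore starts a two-digit hex escape."""
    result = []
    index = 0
    while index < len(text):
        if text[index] == "_":
            result.append(chr(int(text[index + 1:index + 3], 16)))
            index += 3
        else:
            result.append(text[index])
            index += 1
    return "".join(result)


def join_pk(values):
    """Encode all components of a composite key as one URL segment."""
    return ",".join(quote_part(value) for value in values)


def split_pk(segment):
    """Decode a URL segment back into a tuple of string components."""
    return tuple(unquote_part(part) for part in segment.split(","))


segment = join_pk(("Darth", "Vader, Lord"))
restored = split_pk(segment)
```

Decoding yields strings only. Converting each component back to the Python type of its enclosed field, and rebuilding the composite value from the tuple, is left to the fields themselves.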
+ +At a later stage, however, GenericForeignKeys could also be improved to +support composite primary keys. Using the same quoting solution as in the +admin could work in theory, although it would only allow fields capable of +storing arbitrary strings to be usable for object_id storage. This has +been left out of the scope of this project, though. + +ModelChoiceFields +~~~~~~~~~~~~~~~~~ + +Again, we need a way to specify the value as a parameter passed in the +form. The same escaping solution can be used even here. + +Relationship fields +~~~~~~~~~~~~~~~~~~~ + +This turns out to be, not too surprisingly, the toughest problem. The fact +that related fields are spread across about fifteen different classes, +most of which are quite nontrivial, makes the whole bundle pretty fragile, +which means the changes have to be made carefully not to break anything. + +What we need to achieve is that the ForeignKey, ManyToManyField and +OneToOneField detect when their target field is a CompositeField in +several situations and act accordingly since this will require different +handling than regular fields that map directly to database columns. + +The first one to look at is ForeignKey since the other two rely on its +functionality, OneToOneField being its descendant and ManyToManyField +using ForeignKeys in the intermediary model. Once the ForeignKeys work, +OneToOneField should require minimal to no changes since it inherits +almost everything from ForeignKey. + +The easiest part is that for composite related fields, the db_type will be +None since the data will be stored elsewhere. + +ForeignKey and OneToOneField will also be able to create the underlying +fields automatically when added to the model. I'm proposing the following +default names: "fkname_targetname" where "fkname" is the name of the +ForeignKey field and "targetname" is the name of the remote field name +corresponding to the local one. I'm open to other suggestions on this. 
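The proposed naming scheme is easy to state in code. The helper below is hypothetical (``default_enclosed_names`` is not an existing or proposed API) and only shows how the ``fkname_targetname`` pattern would be applied to each field of the target ``CompositeField``.

```python
def default_enclosed_names(fk_name, target_field_names):
    """Derive names for auto-created underlying fields of a composite ForeignKey."""
    return tuple(f"{fk_name}_{target}" for target in target_field_names)


names = default_enclosed_names("owner", ("first_name", "last_name"))
```

For a ``ForeignKey`` called ``owner`` that points at a composite key of ``first_name`` and ``last_name``, this gives ``owner_first_name`` and ``owner_last_name``, in the same order as the target fields.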
+ +There will also be a way to override the default names using a new field +option "enclosed_fields". This option will expect a tuple of fields each +of whose corresponds to one individual field in the same order as +specified in the target CompositeField. This option will be ignored for +non-composite ForeignKeys. + +The trickiest part, however, will be relation traversals in QuerySet +lookups. Currently the code in models.sql.query.Query that creates joins +only joins on single columns. To be able to span a composite relationship +the code that generates joins will have to recognize column tuples and add +a constraint for each pair of corresponding columns with the same aliases +in all conditions. + +For the sake of completeness, ForeignKey will also have an extra_filters +method allowing to filter by a related object or its primary key. + +With all this infrastructure set up, ManyToMany relationships using +composite fields will be easy enough. Intermediary model creation will +work thanks to automatic underlying field creation for composite fields +and traversal in both directions will be supported by the query code. + ``__in`` lookups for ``CompositeField`` ======================================= @@ -423,6 +629,31 @@ that ``inspectdb`` just can't detect certain scenarios and ask people to edit their models manually. +Other considerations +-------------------- + +This infrastructure will allow reimplementing the GenericForeignKey as a +CompositeField at a later stage. Thanks to the modifications in the +joining code it should also be possible to implement bidirectional generic +relationship traversal in QuerySet filters. This is, however, out of scope +of this project. + +CompositeFields will have the serialize option set to False to prevent +their serialization. Otherwise the enclosed fields would be serialized +twice which would not only infer redundancy but also ambiguity. 
+ +Also CompositeFields will be ignored in ModelForms by default, for two +reasons: +- otherwise the same field would be inside the form twice +- there aren't really any form fields usable for tuples and a fieldset + would require even more out-of-scope machinery + +The CompositeField will not allow enclosing other CompositeFields. The +only exception might be the case of composite ForeignKeys which could also +be implemented after successful finish of this project. With this feature +the autogenerated intermediary M2M model could make the two ForeignKeys +its primary key, dropping the need to have a redundant id AutoField. + Updatable primary keys in models ================================ From 55c4da730779ecbf2aa0abaa609989a95709fe7e Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Fri, 17 Mar 2017 00:04:04 +0600 Subject: [PATCH 49/80] modification --- draft/orm-improvements-for-composite-pk.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index b53d6103..6e620891 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -22,6 +22,8 @@ Django's ORM is a powerful tool which suits perfectly most use-cases, however, there are cases where having exactly one primary key column per table induces unnecessary redundancy. +Django ORM fields does have some historical design decisions like + One such case is the many-to-many intermediary model. Even though the pair of ForeignKeys in this model identifies uniquely each relationship, an additional field is required by the ORM to identify individual rows. 
While From 4925fbd46cbb67f98c29e8eb742c3a1503ea88d9 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 11:19:54 +0600 Subject: [PATCH 50/80] modification --- draft/orm-improvements-for-composite-pk.rst | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index 6e620891..f161d1f8 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -18,12 +18,15 @@ DEP : ORM Fields and related improvement for composite PK Abstract ======== -Django's ORM is a powerful tool which suits perfectly most use-cases, -however, there are cases where having exactly one primary key column per -table induces unnecessary redundancy. - -Django ORM fields does have some historical design decisions like +Django's ORM is a simple & powerful tool which suits most use-cases, +however, there are some historical design decisions like all the fields are +concreteField by default. This type of design limitation made it difficult +to add support for composite primarykey or working with relationField/genericRelations +very inconsistant behaviour. +cases where having exactly one primary key column per +table induces unnecessary redundancy. + One such case is the many-to-many intermediary model. Even though the pair of ForeignKeys in this model identifies uniquely each relationship, an additional field is required by the ORM to identify individual rows. While @@ -36,13 +39,13 @@ instances and the ability to use it in QuerySet filters, it is necessary to implement a mechanism to allow filtering of several actual fields by specifying a single filter. -The proposed solution is using Virtualfield type, CompositeField. This field -type will enclose several real fields within one single object. +The proposed solution is using Virtualfield type, and necessary VirtualField desendent +Fields[CompositeField]. 
The Virtual field type will enclose several real fields within one single object. Motivation ========== -This DEP aims to improve different part of django ORM and other associated parts of django to support composite primary key in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API +This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. From cd81d9df33916e0063733ba1092d34ff4af91b67 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 13:08:37 +0600 Subject: [PATCH 51/80] modification --- draft/orm-improvements-for-composite-pk.rst | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-improvements-for-composite-pk.rst index f161d1f8..f8339055 100644 --- a/draft/orm-improvements-for-composite-pk.rst +++ b/draft/orm-improvements-for-composite-pk.rst @@ -93,8 +93,7 @@ rewrite certain parts in a cleaner way than what was necessary for ``CompositeField`` alone (e.g. database creation or certain features of ``model._meta``). 
-I am convinced that a better approach would be to Improve Field API and later -imlement VirtualField type to first do the required refactor of ``ForeignKey`` +I am convinced that a better approach would be to Improve Field API and RealtionField API and later imlement VirtualField type to first do the required refactor of ``ForeignKey`` and implement CompositeField as the next step. This will result in a better maintainable development branch and a cleaner revision history, making it easier to review the work before its eventual inclusion into Django. From 6ffbbc1fd22d78c4a9c730af42ae92f2b9e2364f Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 13:34:39 +0600 Subject: [PATCH 52/80] rename draft --- ...for-composite-pk.rst => orm-field-api-related-improvement.rst} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename draft/{orm-improvements-for-composite-pk.rst => orm-field-api-related-improvement.rst} (100%) diff --git a/draft/orm-improvements-for-composite-pk.rst b/draft/orm-field-api-related-improvement.rst similarity index 100% rename from draft/orm-improvements-for-composite-pk.rst rename to draft/orm-field-api-related-improvement.rst From 28dab6ed71dd0fb298af626b63d90af750393453 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 13:44:28 +0600 Subject: [PATCH 53/80] ORM Fields API and Related Improvements --- draft/orm-field-api-related-improvement.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index f8339055..28c05eaa 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -1,5 +1,5 @@ ========================================================= -DEP : ORM Fields and related improvement for composite PK +DEP : ORM Fields API & Related Improvements ========================================================= :DEP: 0201 From 
19d731aaed56fc5a88a3df8437d558a178a44893 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 14:45:55 +0600 Subject: [PATCH 54/80] adjustments --- draft/orm-field-api-related-improvement.rst | 72 ++++++++++++--------- 1 file changed, 41 insertions(+), 31 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 28c05eaa..bb708d3f 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -7,8 +7,8 @@ DEP : ORM Fields API & Related Improvements :Implementation Team: Asif Saif Uddin, django core team :Shepherd: Django Core Team :Status: Draft -:Type: Feature -:Created: 2017-3-2 +:Type: Feature/Cleanup/Optimization +:Created: 2017-3-18 :Last-Modified: 2017-00-00 .. contents:: Table of Contents @@ -16,20 +16,19 @@ DEP : ORM Fields API & Related Improvements :local: -Abstract -======== +Background: +=========== Django's ORM is a simple & powerful tool which suits most use-cases, -however, there are some historical design decisions like all the fields are -concreteField by default. This type of design limitation made it difficult -to add support for composite primarykey or working with relationField/genericRelations -very inconsistant behaviour. +however, there are some historical design limitations and many inconsistant +implementation in orm relation fields API which produce many inconsistant +behaviour -cases where having exactly one primary key column per -table induces unnecessary redundancy. - -One such case is the many-to-many intermediary model. Even though the pair -of ForeignKeys in this model identifies uniquely each relationship, an -additional field is required by the ORM to identify individual rows. 
While +This type of design limitation made it difficult to add support for composite primarykey or working with relationField/genericRelations very annoying as it +produces inconsistant behaviour and a very hard implementation to maintain. + +Also there are such case is the many-to-many intermediary model. Even though +the pair of ForeignKeys in this model identifies uniquely each relationship, +an additional field is required by the ORM to identify individual rows. While this isn't a real problem when the underlying database schema is created by Django, it becomes an obstacle as soon as one tries to develop a Django application using a legacy database. @@ -39,11 +38,11 @@ instances and the ability to use it in QuerySet filters, it is necessary to implement a mechanism to allow filtering of several actual fields by specifying a single filter. -The proposed solution is using Virtualfield type, and necessary VirtualField desendent -Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. +The proposed solution is using Virtualfield type, and necessary VirtualField +desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. -Motivation +Abstract ========== This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. 
@@ -51,28 +50,39 @@ and design everything as much simple and small as possible to be able to impleme Key concerns of New Approach to implement ``CompositeField`` ============================================================== -1. Split out Field API to ConcreteField, BaseField etc and change on ORM based on the splitted API. -2. Introduce new standalone well defined ``VirtualField`` -3. Incorporate ``VirtualField`` related changes in django -4. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API -5. Figure out other cases where true virtual fields are needed. -6. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey -7. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey -8. Change ForeignObjectRel subclasses to real field instances. (For example, +1. Split out Field API logically to separate ConcreteField, + BaseField etc and change on ORM based on the splitted API. + +2. Change ForeignObjectRel subclasses to real field instances. (For example, ForeignKey generates a ManyToOneRel in the related model). The Rel instances are already returned from get_field(), but they aren't yet field subclasses. -9. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be advantageous to be able to define reverse relations directly. For example, see ​https://github.com/akaariai/django-reverse-unique. + +3. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be + advantageous to be able to define reverse relations directly. For example, + see ​https://github.com/akaariai/django-reverse-unique. + +5. Introduce new standalone well defined ``VirtualField`` + +6. Incorporate ``VirtualField`` related changes in django + +7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API + +8. Figure out other cases where true virtual fields are needed. + +9. 
Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey + +10. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey -10. Make changes to migrations framework to work properly with Reafctored Field +11. Make changes to migrations framework to work properly with Reafctored Field API. -11. Make sure new class based Index API ise used properly with refactored Field +12. Make sure new class based Index API ise used properly with refactored Field API. -12. Consider Database Contraints work of lan-foote and +13. Consider Database Contraints work of lan-foote and -13. SubField/AuxilaryField +14. SubField/AuxilaryField -14. Update in AutoField +15. Update in AutoField Porting previous work on top of master From 361dc5c99f40e21b9c200205b8daf12495c2d9a1 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 14:58:24 +0600 Subject: [PATCH 55/80] major steps --- draft/orm-field-api-related-improvement.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index bb708d3f..dcdfd2e2 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -47,6 +47,11 @@ Abstract This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. +To keep thing sane I will try to split the Dep in 3 major Part: +1. 
Logical refactor of present Field API and RelationField API +2. VirtualField Based refactor +3. CompositeField API formalization + Key concerns of New Approach to implement ``CompositeField`` ============================================================== From f5e2fbc45f198acea275e28998651e1f122b29cf Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 15:16:55 +0600 Subject: [PATCH 56/80] keep thing simple --- draft/orm-field-api-related-improvement.rst | 280 +++++++------------- 1 file changed, 101 insertions(+), 179 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index dcdfd2e2..3d492b71 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -114,6 +114,9 @@ maintainable development branch and a cleaner revision history, making it easier to review the work before its eventual inclusion into Django. +Specification: +=============== + New split out Field API ========================= 1. BaseField: @@ -139,6 +142,12 @@ or any virtual type field can be benefitted from VirtualField. ----------------- + + + + + + 6. CompositeField: ------------------ A composite field can be implemented based on BaseField and VirtualField to solve @@ -239,45 +248,6 @@ Changes in ``RelationField`` Summary of ``CompositeField`` ============================= -This section summarizes the basic API as established in the proposal for -GSoC 2011 [1]_. 
- -A ``CompositeField`` requires a list of enclosed regular model fields as -positional arguments, as shown in this example:: - - class SomeModel(models.Model): - first_field = models.IntegerField() - second_field = models.CharField(max_length=100) - composite = models.CompositeField(first_field, second_field) - -The model class then contains a descriptor for the composite field, which -returns a ``CompositeValue`` which is a customized namedtuple, the -descriptor accepts any iterable of the appropriate length. An example -interactive session:: - - >>> instance = new SomeModel(first_field=47, second_field="some string") - >>> instance.composite - CompositeObject(first_field=47, second_field='some string') - >>> instance.composite.first_field - 47 - >>> instance.composite[1] - 'some string' - >>> instance.composite = (74, "other string") - >>> instance.first_field, instance.second_field - (74, 'other string') - -``CompositeField`` supports the following standard field options: -``unique``, ``db_index``, ``primary_key``. The first two will simply add a -corresponding tuple to ``model._meta.unique_together`` or -``model._meta.index_together``. Other field options don't make much sense -in the context of composite fields. - -Supported ``QuerySet`` filters will be ``exact`` and ``in``. The former -should be clear enough, the latter is elaborated in a separate section. - -It will be possible to use a ``CompositeField`` as a target field of -``ForeignKey``, ``OneToOneField`` and ``ManyToManyField``. This is -described in more detail in the following section. @@ -288,107 +258,8 @@ Alternative Approach of compositeFiled Implementation -------------- -Specifying a CompositeField in a Model -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The constructor of a CompositeField will accept the supported options as -keyword parameters and the enclosed fields will be specified as positional -parameters. 
The order in which they are specified will determine their -order in the namedtuple representing the CompositeField value (i. e. when -retrieving and assigning the CompositeField's value; see example below). - -unique and db_index -~~~~~~~~~~~~~~~~~~~ -Implementing these will require some modifications in the backend code. -The table creation code will have to handle virtual fields as well as -local fields in the table creation and index creation routines -respectively. - -When the code handling CompositeField.unique is finished, the -models.options.Options class will have to be modified to create a unique -CompositeField for each tuple in the Meta.unique_together attribute. The -code handling unique checks in models.Model will also have to be updated -to reflect the change. - -Retrieval and assignment -~~~~~~~~~~~~~~~~~~~~~~~~ - -Jacob has actually already provided a skeleton of the code that takes care -of this as seen in [1]. I'll only summarize the behaviour in a brief -example of my own. - - class SomeModel(models.Model): - first_field = models.IntegerField() - second_field = models.CharField(max_length=100) - composite = models.CompositeField(first_field, second_field) - - >>> instance = new SomeModel(first_field=47, second_field="some string") - >>> instance.composite - CompositeObject(first_field=47, second_field='some string') - >>> instance.composite.first_field - 47 - >>> instance.composite[1] - 'some string' - >>> instance.composite = (74, "other string") - >>> instance.first_field, instance.second_field - (74, 'other string') - -Accessing the field attribute will create a CompositeObject instance which -will behave like a tuple but also with direct access to enclosed field -values via appropriately named attributes. - -Assignment will be possible using any iterable. The order of the values in -the iterable will have to be the same as the order in which undelying -fields have been specified to the CompositeField. 
- -QuerySet filtering -~~~~~~~~~~~~~~~~~~ - -This is where the real fun begins. - -The fundamental problem here is that Q objects which are used all over the -code that handles filtering are designed to describe single field lookups. -On the other hand, CompositeFields will require a way to describe several -individual field lookups by a single expression. - -Since the Q objects themselves have no idea about fields at all and the -actual field resolution from the filter conditions happens deeper down the -line, inside models.sql.query.Query, this is where we can handle the -filters properly. -There is already some basic machinery inside Query.add_filter and -Query.setup_joins that is in use by GenericRelations, this is -unfortunately not enough. The optional extra_filters field method will be -of great use here, though it will have to be extended. -Currently the only parameters it gets are the list of joins the -filter traverses, the position in the list and a negate parameter -specifying whether the filter is negated. The GenericRelation instance can -determine the value of the content type (which is what the extra_filters -method is used for) easily based on the model it belongs to. - -This is not the case for a CompositeField -- it doesn't have any idea -about the values used in the query. Therefore a new parameter has to be -added to the method so that the CompositeField can construct all the -actual filters from the iterable containing the values. - -Afterwards the handling inside Query is pretty straightforward. For -CompositeFields (and virtual fields in general) there is no value to be -used in the where node, the extra_filters are responsible for all -filtering, but since the filter should apply to a single object even after -join traversals, the aliases will be set up while handling the "root" -filter and then reused for each one of the extra_filters. 
- -This way of extending the extra_filters mechanism will allow the field -class to create conjunctions of atomic conditions. This is sufficient for -the "__exact" lookup type which will be implemented. - -Of the other lookup types, the only one that looks reasonable is "__in". -This will, however, have to be represented as a disjunction of multiple -"__exact" conditions since not all database backends support tuple -construction inside expressions. Therefore this lookup type will be left -out of this project as the mechanism would need much more work to make it -possible. CompositeField.primary_key ~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -492,47 +363,6 @@ work thanks to automatic underlying field creation for composite fields and traversal in both directions will be supported by the query code. -``__in`` lookups for ``CompositeField`` -======================================= - -The existing implementation of ``CompositeField`` handles ``__in`` lookups -in the generic, backend-independent ``WhereNode`` class and uses a -disjunctive normal form expression as in the following example:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 5 AND c = 6); - -The problem with this solution is that in cases where the list of values -contains tens or hundreds of tuples, this DNF expression will be extremely -long and the database will have to evaluate it for each and every row, -without a possibility of optimizing the query. - -Certain database backends support the following alternative:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE (a, b, c) IN [(1, 2, 3), (4, 5, 6)]; - -This would probably be the best option, but it can't be used by SQLite, -for instance. This is also the reason why the DNF expression was -implemented in the first place. - -In order to support this more natural syntax, the ``DatabaseOperations`` -needs to be extended with a method such as ``composite_in_sql``. 
- -However, this leaves the issue of the inefficient DNF unresolved for -backends without support for tuple literals. For such backends, the -following expression is proposed:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE EXISTS (SELECT a1, b1, c1, FROM (SELECT 1 as a, 2 as b, 3 as c - UNION SELECT 4, 5, 6) - WHERE a1=1 AND b1=b AND c1=c); - -Since both syntaxes are rather generic and at least one of them should fit -any database backend directly, a new flag will be introduced, -``DatabaseFeatures.supports_tuple_literals`` which the default -implementation of ``composite_in_sql`` will consult in order to choose -between the two options. ``contenttypes`` and ``GenericForeignKey`` @@ -588,6 +418,98 @@ composite primary key containing any special columns. This should be extremely rare anyway. +QuerySet filtering +~~~~~~~~~~~~~~~~~~ + +This is where the real fun begins. + +The fundamental problem here is that Q objects which are used all over the +code that handles filtering are designed to describe single field lookups. +On the other hand, CompositeFields will require a way to describe several +individual field lookups by a single expression. + +Since the Q objects themselves have no idea about fields at all and the +actual field resolution from the filter conditions happens deeper down the +line, inside models.sql.query.Query, this is where we can handle the +filters properly. + +There is already some basic machinery inside Query.add_filter and +Query.setup_joins that is in use by GenericRelations, this is +unfortunately not enough. The optional extra_filters field method will be +of great use here, though it will have to be extended. + +Currently the only parameters it gets are the list of joins the +filter traverses, the position in the list and a negate parameter +specifying whether the filter is negated. 
The GenericRelation instance can +determine the value of the content type (which is what the extra_filters +method is used for) easily based on the model it belongs to. + +This is not the case for a CompositeField -- it doesn't have any idea +about the values used in the query. Therefore a new parameter has to be +added to the method so that the CompositeField can construct all the +actual filters from the iterable containing the values. + +Afterwards the handling inside Query is pretty straightforward. For +CompositeFields (and virtual fields in general) there is no value to be +used in the where node, the extra_filters are responsible for all +filtering, but since the filter should apply to a single object even after +join traversals, the aliases will be set up while handling the "root" +filter and then reused for each one of the extra_filters. + +This way of extending the extra_filters mechanism will allow the field +class to create conjunctions of atomic conditions. This is sufficient for +the "__exact" lookup type which will be implemented. + +Of the other lookup types, the only one that looks reasonable is "__in". +This will, however, have to be represented as a disjunction of multiple +"__exact" conditions since not all database backends support tuple +construction inside expressions. Therefore this lookup type will be left +out of this project as the mechanism would need much more work to make it +possible. 
+ +``__in`` lookups for ``CompositeField`` +======================================= + +The existing implementation of ``CompositeField`` handles ``__in`` lookups +in the generic, backend-independent ``WhereNode`` class and uses a +disjunctive normal form expression as in the following example:: + + SELECT a, b, c FROM tbl1, tbl2 + WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 5 AND c = 6); + +The problem with this solution is that in cases where the list of values +contains tens or hundreds of tuples, this DNF expression will be extremely +long and the database will have to evaluate it for each and every row, +without a possibility of optimizing the query. + +Certain database backends support the following alternative:: + + SELECT a, b, c FROM tbl1, tbl2 + WHERE (a, b, c) IN [(1, 2, 3), (4, 5, 6)]; + +This would probably be the best option, but it can't be used by SQLite, +for instance. This is also the reason why the DNF expression was +implemented in the first place. + +In order to support this more natural syntax, the ``DatabaseOperations`` +needs to be extended with a method such as ``composite_in_sql``. + +However, this leaves the issue of the inefficient DNF unresolved for +backends without support for tuple literals. For such backends, the +following expression is proposed:: + + SELECT a, b, c FROM tbl1, tbl2 + WHERE EXISTS (SELECT a1, b1, c1, FROM (SELECT 1 as a, 2 as b, 3 as c + UNION SELECT 4, 5, 6) + WHERE a1=1 AND b1=b AND c1=c); + +Since both syntaxes are rather generic and at least one of them should fit +any database backend directly, a new flag will be introduced, +``DatabaseFeatures.supports_tuple_literals`` which the default +implementation of ``composite_in_sql`` will consult in order to choose +between the two options. 
+ + Database introspection, ``inspectdb`` ===================================== From 46006ff964f35acb4ca8ac83615854897ea7d3dd Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 15:47:53 +0600 Subject: [PATCH 57/80] organization --- draft/orm-field-api-related-improvement.rst | 171 +++----------------- 1 file changed, 23 insertions(+), 148 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 3d492b71..a58fcfc9 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -42,6 +42,21 @@ The proposed solution is using Virtualfield type, and necessary VirtualField desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. +Notes on Porting previous work on top of master: +================================================ +Considering the huge changes in ORM internals it is not practical and trivial +to try and rebase the previous works related to ForeignKey refactor and +CompositeKey without figuring out new approach on top of master and present +ORM internals design. + +A better approach would be to Improve Field API, RealtionField API and model._meta +first. +Later imlement VirtualField type to first and star refactor of ``ForeignKey`` +and implement CompositeField as the next step. This will result in a better +maintainable development branch and a cleaner revision history, making it easier +to review the work before its eventual inclusion into Django. + + Abstract ========== This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. 
There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API @@ -53,7 +68,7 @@ To keep thing sane I will try to split the Dep in 3 major Part: 3. CompositeField API formalization -Key concerns of New Approach to implement ``CompositeField`` +Key steps of New Approach to improve ORM Field API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, BaseField etc and change on ORM based on the splitted API. @@ -90,33 +105,14 @@ Key concerns of New Approach to implement ``CompositeField`` 15. Update in AutoField -Porting previous work on top of master -====================================== -The first major task of this project is to take the code written as part -of GSoC 2013 and compare it aganist master to have Idea of valid part. -The order in which It was implemented few years ago was to implement -``CompositeField`` first and then a refactor of ``ForeignKey`` which -is required to make it support ``CompositeField``. This turned out to be -inefficient with respect to the development process, because some parts of -the refactor broke the introduced ``CompositeField`` functionality, -meaning that it was needed effectively reimplement parts of it again. - -Also, some abstractions introduced by the refactor made it possible to -rewrite certain parts in a cleaner way than what was necessary for -``CompositeField`` alone (e.g. database creation or certain features of -``model._meta``). - -I am convinced that a better approach would be to Improve Field API and RealtionField API and later imlement VirtualField type to first do the required refactor of ``ForeignKey`` -and implement CompositeField as the next step. 
This will result in a better -maintainable development branch and a cleaner revision history, making it easier -to review the work before its eventual inclusion into Django. - - -Specification: +Specifications: =============== +Part-1: +======= + New split out Field API ========================= 1. BaseField: @@ -141,19 +137,15 @@ or any virtual type field can be benefitted from VirtualField. 5. RelationField: ----------------- - - - - - - - 6. CompositeField: ------------------ A composite field can be implemented based on BaseField and VirtualField to solve the CompositeKey/Multi column PrimaryKey issue. +Part-2: +======= + Introduce standalone ``VirtualField`` ===================================== @@ -245,41 +237,11 @@ Changes in ``RelationField`` ============================= -Summary of ``CompositeField`` -============================= - - - - -Alternative Approach of compositeFiled -======================================= Implementation -------------- - - - -CompositeField.primary_key -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -As with db_index and unique, the backend table generating code will have -to be updated to set the PRIMARY KEY to a tuple. In this case, however, -the impact on the rest of the ORM and some other parts of Django is more -serious. - -A (hopefully) complete list of things affected by this is: -- the admin: the possibility to pass the value of the primary key as a - parameter inside the URL is a necessity to be able to work with a model -- contenttypes: since the admin uses GenericForeignKeys to log activity, - there will have to be some support -- forms: more precisely, ModelForms and their ModelChoiceFields -- relationship fields: ForeignKey, ManyToManyField and OneToOneField will - need a way to point to a model with a CompositeField as its primary key - -Let's look at each one of them in more detail. - Admin ~~~~~ @@ -595,91 +557,4 @@ be implemented after successful finish of this project. 
With this feature the autogenerated intermediary M2M model could make the two ForeignKeys its primary key, dropping the need to have a redundant id AutoField. -Updatable primary keys in models -================================ - -The algorithm that determines what kind of database query to issue on -``model.save()`` is a fairly simple and well-documented one [6]_. If a -row exists in the database with the value of its primary key equal to -the saved object, it is updated, otherwise a new row is inserted. This -behavior is intuitive and works well for models where the primary key is -automatically created by the framework (be it an ``AutoField`` or a parent -link in the case of model inheritance). - -However, as soon as the primary key is explicitly created, the behavior -becomes less intuitive and might be confusing, for example, to users of the -admin. For instance, say we have the following model:: - - class Person(models.Model): - first_name = models.CharField(max_length=47) - last_name = models.CharField(max_length=47) - shoe_size = models.PositiveSmallIntegerField() - - full_name = models.CompositeField(first_name, last_name, - primary_key=True) - -Then we register the model in the admin using the standard one-liner:: - - admin.site.register(Person) - -Since we haven't excluded any fields, all three fields will be editable in -the admin. Now, suppose there's an instance whose ``full_name`` is -``CompositeValue(first_name='Darth', last_name='Vadur')``. A user decides -to fix the last name using the admin, hits the “Save” button and instead -of fixing an existing record, a new one will appear with the new value, -while the old one remains untouched. This behavior is clearly broken from -the point of view of the user. - -It can be argued that it is the developer's fault that the database schema -is poorly chosen and that they expose the primary key to their users. -While this may be true in some cases, it is still to some extent a -subjective matter. 
- -Therefore I propose a new behavior for ``model.save()`` where it would -detect a change in the instance's primary key and in that case issue an -``UPDATE`` for the right row, i.e. ``WHERE primary_key = previous_value``. - -Of course, just going ahead and changing the behavior in this way for all -models would be backwards incompatible. To do this properly, we would need -to make this an opt-in feature. This can be achieved in multiple ways. - -1) add a keyword argument such as ``update_pk`` to ``Model.save`` -2) add a new option to ``Model.Meta``, ``updatable_pk`` -3) make this a project-wide setting - -Option 3 doesn't look pleasant and I think I can safely eliminate that. -Option 2 is somewhat better, although it adds a new ``Meta`` option. -Option 1 is the most flexible solution, however, it does not change the -behavior of the admin, at least not by default. This can be worked around -by overriding the ``save`` method to use a different default:: - - class MyModel(models.Model): - def save(self, update_pk=True, **kwargs): - kwargs['update_pk'] = update_pk - return super(MyModel, self).save(**kwargs) - -To avoid the need to repeat this for each model, a class decorator might -be provided to perform this automatically. - -In order to implement this new behavior a little bit of extra complexity -would have to be added to models. Model instances would need to store the -last known value of the primary key as retrieved from the database. On -save it would just find out whether the last known value is present and in -that case issue an ``UPDATE`` using the old value in the ``WHERE`` -condition. - -So far so good, this could be implemented fairly easily. However, the -problem becomes considerably more difficult as soon as we take into -account the fact that updating a primary key value may break foreign key -references. In order to avoid breaking references the ``on_delete`` -mechanism of ``ForeignKey`` would have to be extended to support updates -as well. 
This means that the collector used by deletion will need to be -extended as well. - -The problem becomes particularly nasty if we realize that a ``ForeignKey`` -might be part of a primary key, which means the collector needs to keep -track of which field depends on which in a graph of potentially unlimited -size. Compared to this, deletion is simpler as it only needs to find a -list of all affected model instances as opposed to having to keep track of -which field to update using which value. From a59fc89eab745e03de931355eaf46f2f3f95855a Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 18 Mar 2017 16:04:48 +0600 Subject: [PATCH 58/80] simple and focused --- draft/orm-field-api-related-improvement.rst | 83 +-------------------- 1 file changed, 3 insertions(+), 80 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index a58fcfc9..8a885b2d 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -26,20 +26,7 @@ behaviour This type of design limitation made it difficult to add support for composite primarykey or working with relationField/genericRelations very annoying as it produces inconsistant behaviour and a very hard implementation to maintain. -Also there are such case is the many-to-many intermediary model. Even though -the pair of ForeignKeys in this model identifies uniquely each relationship, -an additional field is required by the ORM to identify individual rows. While -this isn't a real problem when the underlying database schema is created -by Django, it becomes an obstacle as soon as one tries to develop a Django -application using a legacy database. - -Since there is already a lot of code relying on the pk property of model -instances and the ability to use it in QuerySet filters, it is necessary -to implement a mechanism to allow filtering of several actual fields by -specifying a single filter. 
- -The proposed solution is using Virtualfield type, and necessary VirtualField -desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. +The proposed solution is using Cleanup/provisional RealatedField API, Virtualfield type, and necessary VirtualField desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. Notes on Porting previous work on top of master: @@ -143,6 +130,7 @@ A composite field can be implemented based on BaseField and VirtualField to solv the CompositeKey/Multi column PrimaryKey issue. + Part-2: ======= @@ -236,12 +224,6 @@ notes. Changes in ``RelationField`` ============================= - - - -Implementation --------------- - Admin ~~~~~ @@ -429,7 +411,7 @@ construction inside expressions. Therefore this lookup type will be left out of this project as the mechanism would need much more work to make it possible. -``__in`` lookups for ``CompositeField`` +``__in`` lookups for ``VirtualField`` ======================================= The existing implementation of ``CompositeField`` handles ``__in`` lookups @@ -472,65 +454,6 @@ implementation of ``composite_in_sql`` will consult in order to choose between the two options. -Database introspection, ``inspectdb`` -===================================== - -There are three main goals concerning database introspection in this -project. The first is to ensure the output of ``inspectdb`` remains the -same as it is now for models with simple primary keys and simple foreign -key references, or at least equivalent. While this shouldn't be too -difficult to achieve, it will still be regarded with high importance. - -The second goal is to extend ``inspectdb`` to also create a -``CompositeField`` in models where the table contains a composite primary -key. 
This part shouldn't be too difficult, -``DatabaseIntrospection.get_primary_key_column`` will be renamed to -``get_primary_key`` which will return a tuple of columns and in case the -tuple contains more than one element, an appropriate ``CompositeField`` -will be added. This will also require updating -``DatabaseWrapper.check_constraints`` for certain backends since it uses -``get_primary_key_column``. - -The third goal is to also make ``inspectdb`` aware of composite foreign -keys. This will need a rewrite of ``get_relations`` which will have to -return a mapping between tuples of columns instead of single columns. It -should also ensure each tuple of columns pointed to by a foreign key gets -a ``CompositeField``. This part will also probably require some changes in -other backend methods as well, especially since each backend has a unique -tangle of introspection methods. - -This part requires a tremendous amount of work, because practically every -single change needs to be done four times and needs separate research of -the specific backend in question. Therefore I can't promise to deliver full support -for all features mentioned in this section for all backends. I'd say -backwards compatibility is a requirement, recognition of composite primary -keys is a highly wanted feature that I'll try to implement for as many -backends as possible and recognition of composite foreign keys would be a -nice extra to have for at least one or two backends. - -I'll be implementing the features for the individual backends in the -following order: PostgreSQL, MySQL, SQLite and Oracle. I put PostgreSQL -first because, well, this is the backend with the best support in Django -(and also because it is the one where I'd actually use the features I'm -proposing). Oracle comes last because I don't have any way to test it and -I'm afraid I'd be stabbing in the dark anyway. Of the two remaining -backends I put MySQL first for two reasons. 
First, I don't think people -need to run ``inspectdb`` on SQLite databases too often (if ever). Second, -on MySQL the task seems marginally easier as the database has -introspection features other than just “give me the SQL statement used to -create this table”, whose parsing is most likely going to be a complete -mess. - -All in all, extending ``inspectdb`` features is a tedious and difficult -task with shady outcome, which I'm well aware of. Still, I would like to -try to at least implement the easier parts for the most used backends. It -might quite possibly turn out that I won't manage to implement more than -composite primary key detection for PostgreSQL. This is the reason I keep -this as one of the last features I intend to work on, as shown in the -timeline. It isn't a necessity, we can always just add a note to the docs -that ``inspectdb`` just can't detect certain scenarios and ask people to -edit their models manually. - Other considerations -------------------- From 93b257bd646c0daed4df09095232b6ff4a9fc920 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 19 Mar 2017 14:45:13 +0600 Subject: [PATCH 59/80] modifications --- draft/orm-field-api-related-improvement.rst | 111 ++++++++++---------- 1 file changed, 58 insertions(+), 53 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 8a885b2d..67058408 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -8,7 +8,7 @@ DEP : ORM Fields API & Related Improvements :Shepherd: Django Core Team :Status: Draft :Type: Feature/Cleanup/Optimization -:Created: 2017-3-18 +:Created: 2017-3-5 :Last-Modified: 2017-00-00 .. 
contents:: Table of Contents @@ -23,33 +23,41 @@ however, there are some historical design limitations and many inconsistant implementation in orm relation fields API which produce many inconsistant behaviour -This type of design limitation made it difficult to add support for composite primarykey or working with relationField/genericRelations very annoying as it -produces inconsistant behaviour and a very hard implementation to maintain. +This type of design limitation made it difficult to add support for composite primarykey or working +with relationField/genericRelations very annoying as they produces inconsistant behaviour and a +their implementaion is hard to maintain sue to many special casing. -The proposed solution is using Cleanup/provisional RealatedField API, Virtualfield type, and necessary VirtualField desendent Fields[CompositeField]. The Virtual field type will enclose several real fields within one single object. +In order to fix this design limitations and inconsistant API's the proposed solution is to introduce REAL +VirtualField types and refactor Fields/RelationFields API based on virtualFields type. Notes on Porting previous work on top of master: ================================================ -Considering the huge changes in ORM internals it is not practical and trivial -to try and rebase the previous works related to ForeignKey refactor and -CompositeKey without figuring out new approach on top of master and present -ORM internals design. +Considering the huge changes in ORM internals it is neither practical nor trivial +to rebase & port previous works related to ForeignKey refactor and CompositeKey without +figuring out new approach based on present ORM internals design on top of master. -A better approach would be to Improve Field API, RealtionField API and model._meta -first. -Later imlement VirtualField type to first and star refactor of ``ForeignKey`` -and implement CompositeField as the next step. 
This will result in a better -maintainable development branch and a cleaner revision history, making it easier -to review the work before its eventual inclusion into Django. +A better approach would be to Improve Field API, major cleanup of RealtionField API, model._meta, +and internal field_valaue_cache and related areas first. +Later after completing the major clean ups of Fields/RelationFields a REAL VirtualField type should be +introduced and VirtualField based refactor of ForeignKey and relationFields should take place. + +This appraoch should keep things sane and easier to approach on smaller chunks. + +Later any VirtualField derived Field like CompositeField implementation should be less complex after the completion of virtualField based refactors. Abstract ========== -This DEP aims to improve different part of django ORM and other associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem and several ways to implement this. There are two existing dep for solving this problem, but the aim of this dep is to incorporate Michal Petrucha's works suggestions/discussions from other related tickets and lesson learned from previous works. The main motivation of this Dep's approach is to improve django ORM's Field API -and design everything as much simple and small as possible to be able to implement separately. +This DEP aims to improve different part of django ORM and ot associated parts of django to support Real VirtualField +type in django. There were several attempt to fix this problem before. So in this Dep we will try to follow the suggested +approaches from Michal Patrucha's previous works and suggestions in tickets and IRC chat/mailing list. Few other related +tickets were also analyzed to find out the proper ways and API design. 
+ +The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. + +To keep thing sane it would be bette to split the Dep in 3 major Part: -To keep thing sane I will try to split the Dep in 3 major Part: 1. Logical refactor of present Field API and RelationField API 2. VirtualField Based refactor 3. CompositeField API formalization @@ -89,8 +97,6 @@ Key steps of New Approach to improve ORM Field API internals: 14. SubField/AuxilaryField -15. Update in AutoField - @@ -223,40 +229,6 @@ notes. Changes in ``RelationField`` ============================= - -Admin -~~~~~ - -The solution that has been proposed so many times in the past [2], [3] is -to extend the quote function used in the admin to also quote the comma and -then use an unquoted comma as the separator. Even though this solution -looks ugly to some, I don't think there is much choice -- there needs to -be a way to separate the values and in theory, any character could be -contained inside a value so we can't really avoid choosing one and -escaping it. - -GenericForeignKeys -~~~~~~~~~~~~~~~~~~ - -Even though the admin uses the contenttypes framework to log the history -of actions, it turns out proper handling on the admin side will make -things work without the need to modify GenericForeignKey code at all. This -is thanks to the fact that the admin uses only the ContentType field and -handles the relations on its own. Making sure the unquoting function -recreates the whole CompositeObjects where necessary should suffice. - -At a later stage, however, GenericForeignKeys could also be improved to -support composite primary keys. Using the same quoting solution as in the -admin could work in theory, although it would only allow fields capable of -storing arbitrary strings to be usable for object_id storage. This has -been left out of the scope of this project, though. 
- -ModelChoiceFields -~~~~~~~~~~~~~~~~~ - -Again, we need a way to specify the value as a parameter passed in the -form. The same escaping solution can be used even here. - Relationship fields ~~~~~~~~~~~~~~~~~~~ @@ -362,6 +334,23 @@ composite primary key containing any special columns. This should be extremely rare anyway. +GenericForeignKeys +~~~~~~~~~~~~~~~~~~ + +Even though the admin uses the contenttypes framework to log the history +of actions, it turns out proper handling on the admin side will make +things work without the need to modify GenericForeignKey code at all. This +is thanks to the fact that the admin uses only the ContentType field and +handles the relations on its own. Making sure the unquoting function +recreates the whole CompositeObjects where necessary should suffice. + +At a later stage, however, GenericForeignKeys could also be improved to +support composite primary keys. Using the same quoting solution as in the +admin could work in theory, although it would only allow fields capable of +storing arbitrary strings to be usable for object_id storage. This has +been left out of the scope of this project, though. + + QuerySet filtering ~~~~~~~~~~~~~~~~~~ @@ -453,6 +442,22 @@ any database backend directly, a new flag will be introduced, implementation of ``composite_in_sql`` will consult in order to choose between the two options. +ModelChoiceFields +~~~~~~~~~~~~~~~~~ + +Again, we need a way to specify the value as a parameter passed in the +form. The same escaping solution can be used even here. + +Admin +~~~~~ + +The solution that has been proposed so many times in the past [2], [3] is +to extend the quote function used in the admin to also quote the comma and +then use an unquoted comma as the separator. 
Even though this solution +looks ugly to some, I don't think there is much choice -- there needs to +be a way to separate the values and in theory, any character could be +contained inside a value so we can't really avoid choosing one and +escaping it. Other considerations From 62de41d49281df1de013ad5d677f0581419d630e Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Wed, 22 Mar 2017 12:32:26 +0600 Subject: [PATCH 60/80] changes about related field clean up --- draft/orm-field-api-related-improvement.rst | 108 ++++++++++++++++++-- 1 file changed, 98 insertions(+), 10 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 67058408..022e8dad 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -58,9 +58,11 @@ The main motivation of this Dep's approach is to improve django ORM's Field API To keep thing sane it would be bette to split the Dep in 3 major Part: -1. Logical refactor of present Field API and RelationField API +1. Logical refactor of present Field API and RelationField API and make + them consistant + 2. VirtualField Based refactor -3. CompositeField API formalization + Key steps of New Approach to improve ORM Field API internals: @@ -93,7 +95,7 @@ Key steps of New Approach to improve ORM Field API internals: 12. Make sure new class based Index API ise used properly with refactored Field API. -13. Consider Database Contraints work of lan-foote and +13. Consider Database Contraints work 14. SubField/AuxilaryField @@ -122,18 +124,104 @@ ConcreteField will have all the common attributes of a Regular concrete field Presence base Field class with should refactored using BaseField and ConcreteField. If it is decided to provide the optional virtual type to regular fields then VirtualField's features can also be added to specific fields. -4. VirtualField: +4. RelationField: +----------------- + +5. 
VirtualField: ---------------- A true stand alone virtula field will be added to the system to be used to solve some long standing design limitations of django orm. initially RelationFields, GenericRelations etc will be benefitted by using VirtualFields and later CompositeField or any virtual type field can be benefitted from VirtualField. -5. RelationField: ------------------ +Relation Field API clean up: +============================ + +How relation works in django now: +================================= +Before defining clean up mechanism, lets jump into how relations work in django + +A relation in Django consits of: + - The originating field itself + - A descriptor to access the objects of the relation + - The descriptor might need a custom manager + - Possibly a remote relation field (the field to travel the relation in other direction) + Note that this is different from the target and source fields, which define which concrete fields this relation use (essentially, which columns to equate in the JOIN condition) + - The remote field can also contain a descriptor and a manager. + - For deprecation period, field.rel is a bit like the remote field, but without + actually being a field instance. This is created only in the origin field, the remote field doesn't have a rel (as we don't need backwards compatibility + for the remote fields) + + The loading order is as follows: + - The origin field is created as part of importing the class (or separately + by migrations). + - The origin field is added to the origin model's meta (the field's contribute_to_class is called). + - When both the origin and the remote classes are loaded, the remote field is created and the descriptors are created. The remote field is added to the + target class' _meta + - For migrations it is possible that a model is replaced live in the app-cache. For example, + assume model Author is changed, and it is thus reloaded. 
Model Book has foreign key to + Author, so its reverse field must be recreated in the Author model, too. The way this is + done is that we collect all fields that have been auto-created as relationships into the + Author model, and recreate the related field once Author has been reloaded. + + Example: + + class Author(models.Model): + pass + + class Book(models.Model): + author = models.ForeignKey(Author) + + 1. Author is seen, and thus added to the appconfig. + 2. Book is seen, the field author is seen. + - The author field is created and assigned to Book's class level variable author. + - The author field's rel instance is created at the same time the field is created. + - The metaclass loading for models sees the field instance in Book's attrs, + and the field is added the class, that is author's contribute_to_class is called. + - In the contribute_to_class method, the field is added to Book's meta. + - As last step of contribut_to_class method the prepare_remote() method + is added as a lazy loaded method. It will be called when both Book and + Author are ready. As it happens, they are both ready in the example, + so the method is called immediately.If the Author model was defined later + than Book, and Book had a string reference to Author, then the method would + be called only after Author was ready. + 3. The prepare_remote() method is called. + - The remote field is created based on attributes of the origin field. + The field is added to the remote model (the field's contribute_to_class + is called) + - The post_relation_ready() method is called for both the origin and the remote field. 
This will create the descriptor on both the origin and remote field + (unless the remote relation is hidden, in which case no descriptor is created) + +Clean up Relation API to make it consistent: +============================================ +The problem is that when using get_fields(), you'll get either a +field.rel instance (for reverse side of user defined fields), or +a real field instance (for example ForeignKey). These behave +differently, so that the user must always remember which one +they are dealing with. This creates lots of unnecessary conditionals +in multiple places of +Django. + +For example, the select_related descent has one branch for descending foreign +keys and one to one fields, and another branch for descending to reverse one +to one fields. Conceptually both one to one and reverse one to one fields +are very similar, so this complication is unnecessary. + +The idea is to deprecate field.rel, and instead add field.remote_field. +The remote_field is just a field subclass, just like everything else +in Django. + +The benefits are: +Conceptual simplicity - dealing with fields and rels is unnecessary and confusing. Everything from get_fields() should be a field. +Code simplicity - no special casing based on if a given relation is described +by a rel or not +Code reuse - ReverseManyToManyField is in most regards exactly like +ManyToManyField. + +The expected problems are mostly from 3rd party code. Users of _meta that +already work on expectation of getting rel instances will likely need updating. +Those users who subclass Django's fields (or duck-type Django's fields) will +need updating. Examples of such projects include django-rest-framework and django-taggit. + + -6. CompositeField: ------------------- -A composite field can be implemented based on BaseField and VirtualField to solve -the CompositeKey/Multi column PrimaryKey issue.
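The clean-up idea in the patch above (both ends of a relation are fields, linked through ``remote_field``) can be sketched in plain Python. This is only an illustration under stated assumptions: ``Field``, ``ReverseForeignKey``, ``connect`` and ``describe`` are hypothetical names invented for this sketch, not Django classes or functions, and models are represented by plain strings.

```python
class Field:
    """Hypothetical base class: every end of a relation is a Field."""

    def __init__(self, name, model, related_model):
        self.name = name
        self.model = model                  # model the field is attached to
        self.related_model = related_model  # model at the other end
        self.remote_field = None            # filled in once both ends exist


class ForeignKey(Field):
    """Stand-in for the origin (forward) side of the relation."""
    many_to_one = True
    one_to_many = False


class ReverseForeignKey(Field):
    """Hypothetical field for the reverse side, replacing a rel object."""
    many_to_one = False
    one_to_many = True


def connect(origin_model, name, target_model, related_name):
    """Create both ends of a relation and point them at each other."""
    origin = ForeignKey(name, origin_model, target_model)
    remote = ReverseForeignKey(related_name, target_model, origin_model)
    origin.remote_field = remote
    remote.remote_field = origin
    return origin, remote


def describe(field):
    # One code path for both directions: no "is this a rel?" branch needed.
    return "%s.%s -> %s" % (field.model, field.name, field.related_model)


author_fk, book_set = connect("Book", "author", "Author", "book_set")
print(describe(author_fk))
print(describe(book_set))
```

Because both objects share one interface, code such as a ``select_related``-style traversal could walk either direction through ``remote_field`` without branching on the object type.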
From 4d5163834529e7075379a60e4d196a13575bd3b1 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 23 Mar 2017 00:50:39 +0600 Subject: [PATCH 61/80] define virtualfield --- draft/orm-field-api-related-improvement.rst | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 022e8dad..ba27a00c 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -221,6 +221,9 @@ already work on expectation of getting rel instances will likely need updating. Those users who subclass Django's fields (or duck-type Django's fields) will need updating. Examples of such projects include django-rest-framework and django-taggit. +Proposed API and workflow for clean ups: +========================================== + @@ -230,6 +233,10 @@ Part-2: Introduce standalone ``VirtualField`` ===================================== +What is ``VirtualField``? +------------------------- +"A virtual field is a model field which correlates to one or multiple +concrete fields, but doesn't add or alter columns in the database."
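The definition of a virtual field quoted in the patch above can be illustrated with a small, framework-free sketch. It assumes nothing about the eventual Django API: the ``VirtualField`` class below is a hypothetical Python descriptor, and ``Person.columns`` is an invented stand-in for the list of real database columns.

```python
from collections import namedtuple


class VirtualField:
    """Hypothetical descriptor: groups concrete attributes, owns no column."""

    def __init__(self, *concrete_names):
        self.concrete_names = concrete_names
        self.value_type = namedtuple("VirtualValue", concrete_names)

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Read through to the concrete attributes; nothing is stored here.
        values = (getattr(instance, name) for name in self.concrete_names)
        return self.value_type(*values)

    def __set__(self, instance, values):
        values = tuple(values)
        if len(values) != len(self.concrete_names):
            raise ValueError("expected %d values" % len(self.concrete_names))
        # Write through to the concrete attributes.
        for name, value in zip(self.concrete_names, values):
            setattr(instance, name, value)


class Person:
    # Invented stand-in for the table definition: only concrete fields appear.
    columns = ("first_name", "last_name")

    full_name = VirtualField("first_name", "last_name")

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name


person = Person("Ada", "Lovelace")
print(person.full_name)
person.full_name = ("Grace", "Hopper")
print(person.first_name, person.last_name)
```

The point of the sketch is that ``full_name`` reads from and writes to the concrete attributes while contributing nothing to ``columns``, which matches the "doesn't add or alter columns" part of the definition.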
From d84025a70c23b8e2be5edca4943af3f4ed370c61 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 23 Mar 2017 12:24:39 +0600 Subject: [PATCH 62/80] rewording --- draft/orm-field-api-related-improvement.rst | 69 +++++++++++---------- 1 file changed, 35 insertions(+), 34 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index ba27a00c..3e9011b8 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -1,6 +1,6 @@ -========================================================= -DEP : ORM Fields API & Related Improvements -========================================================= +============================================================== +DEP : ORM Relation Fields API Improvements using VirtualField +============================================================== :DEP: 0201 :Author: Asif Saif Uddin @@ -19,49 +19,60 @@ DEP : ORM Fields API & Related Improvements Background: =========== Django's ORM is a simple & powerful tool which suits most use-cases, -however, there are some historical design limitations and many inconsistant +however, there are some historical design limitations and inconsistant implementation in orm relation fields API which produce many inconsistant -behaviour +behaviour. -This type of design limitation made it difficult to add support for composite primarykey or working -with relationField/genericRelations very annoying as they produces inconsistant behaviour and a -their implementaion is hard to maintain sue to many special casing. +This type of design limitation made it difficult to add support for +composite primarykey or working with relationField/genericRelations +very annoying as they produces inconsistant behaviour and their +implementaion is hard to maintain due to many special casing. 
-In order to fix this design limitations and inconsistant API's the proposed solution is to introduce REAL -VirtualField types and refactor Fields/RelationFields API based on virtualFields type. +In order to fix this design limitations and inconsistant API's the proposed +solution is to introduce REAL VirtualField types and refactor +Fields/RelationFields API based on virtualFields type. Notes on Porting previous work on top of master: ================================================ -Considering the huge changes in ORM internals it is neither practical nor trivial -to rebase & port previous works related to ForeignKey refactor and CompositeKey without -figuring out new approach based on present ORM internals design on top of master. +Considering the huge changes in ORM internals it is neither practical nor +trivial to rebase & port previous works related to ForeignKey refactor and CompositeKey without figuring out new approach based on present ORM internals +design on top of master. -A better approach would be to Improve Field API, major cleanup of RealtionField API, model._meta, -and internal field_valaue_cache and related areas first. +A better approach would be to Improve Field API, major cleanup of RealtionField +API, model._meta and internal field_valaue_cache and related areas first. -Later after completing the major clean ups of Fields/RelationFields a REAL VirtualField type should be -introduced and VirtualField based refactor of ForeignKey and relationFields should take place. +After completing the major clean ups of Fields/RelationFields a REAL +VirtualField type should be introduced and VirtualField based refactor +of ForeignKey and relationFields could done. This appraoch should keep things sane and easier to approach on smaller chunks. -Later any VirtualField derived Field like CompositeField implementation should be less complex after the completion of virtualField based refactors. 
+Later any VirtualField derived Field like CompositeField implementation +should be less complex after the completion of virtualField based refactors. + Abstract ========== -This DEP aims to improve different part of django ORM and ot associated parts of django to support Real VirtualField -type in django. There were several attempt to fix this problem before. So in this Dep we will try to follow the suggested -approaches from Michal Patrucha's previous works and suggestions in tickets and IRC chat/mailing list. Few other related +This DEP aims to improve different part of django ORM and ot associated +parts of django to support Real VirtualFieldtype in django. There were +several attempt to fix this problem before. So in this Dep we will try +to follow the suggested approaches from Michal Patrucha's previous works +and suggestions in tickets and IRC chat/mailing list. Few other related tickets were also analyzed to find out the proper ways and API design. -The main motivation of this Dep's approach is to improve django ORM's Field API and design everything as much simple and small as possible to be able to implement separately. +The main motivation of this Dep's approach is to improve django ORM's +Field API and design everything as much simple and small as possible +to be able to implement separately. -To keep thing sane it would be bette to split the Dep in 3 major Part: +To keep thing sane it would be better to split the Dep in 3 major Part: 1. Logical refactor of present Field API and RelationField API and make them consistant -2. VirtualField Based refactor +2. Fields internal value cache refactor for relation fields + +3. VirtualField Based refactor @@ -101,7 +112,6 @@ Key steps of New Approach to improve ORM Field API internals: - Specifications: =============== @@ -375,11 +385,9 @@ and traversal in both directions will be supported by the query code. 
- ``contenttypes`` and ``GenericForeignKey`` ========================================== - It's fairly easy to represent composite values as strings. Given an ``escape`` function which uniquely escapes commas, something like the following works quite well:: @@ -574,10 +582,3 @@ reasons: - there aren't really any form fields usable for tuples and a fieldset would require even more out-of-scope machinery -The CompositeField will not allow enclosing other CompositeFields. The -only exception might be the case of composite ForeignKeys which could also -be implemented after successful finish of this project. With this feature -the autogenerated intermediary M2M model could make the two ForeignKeys -its primary key, dropping the need to have a redundant id AutoField. - - From 109b69ef03bc6706a24bd2fcb944eef85b8398f2 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 23 Mar 2017 12:52:40 +0600 Subject: [PATCH 63/80] addressed another limitation of present related api related to dorect reverse relation --- draft/orm-field-api-related-improvement.rst | 91 +++++++++++++++++---- 1 file changed, 76 insertions(+), 15 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 3e9011b8..2d0817cb 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -200,6 +200,81 @@ A relation in Django consits of: - The post_relation_ready() method is called for both the origin and the remote field. This will create the descriptor on both the origin and remote field (unless the remote relation is hidden, in which case no descriptor is created) +Another limitation is, + +Django supports many-to-one relationships -- the foreign keys live on +the "many", and point to the "one". 
So, in a simple app where you +have Comments that can get Flagged, one Comment can have many Flag's, +but each Flag refers to one and only one Comment: + +class Comment(models.Model): + text = models.TextField() + +class Flag(models.Model): + comment = models.ForeignKey(Comment) + +However, there are circumstances where it's much more convenient to +express the relationship as a one-to-many relationship. Suppose, for +example, you want to have a generic "flagging" app which other models +can use: + +class Comment(models.Model): + text = models.TextField() + flags = models.OneToMany(Flag) + +That way, if you had a new content type (say, a "Post"), it could also +participate in flagging, without having to modify the model definition +of "Flag" to add a new foreign key. Without baking in migrations, +there's obviously no way to make the underlying SQL play nice in this +circumstance: one-to-many relationships with just two tables can only +be expressed in SQL with a reverse foreign key relationship. However, +it's possible to describe OneToMany as a subset of ManyToMany, with a +uniqueness constraint on the "One" -- we rely on the join table to +handle the relationship: + +class Comment(models.Model): + text = models.TextField() + flags = models.ManyToMany(Flag, through=CommentFlag) + +class CommentFlag(models.Model): + comment = models.ForeignKey(Comment) + flag = models.ForeignKey(Flag, unique=True) + +While this works, the query interface remains cumbersome. To access +the comment from a flag, I have to call: + +comment = flag.comment_set.all()[0] + +as the ORM doesn't know for a fact that each flag could only have one +comment. 
But Django _could_ implement a OneToManyField in this way +(using the underlying ManyToMany paradigm), and provide sugar such +that this would all be nice and flexible, without having to do cumbersome +ORM calls or explicitly define extra join tables: + +class Comment(models.Model): + text = models.TextField() + flags = models.OneToMany(Flag) + +class Post(models.Model): + body = models.TextField() + flags = models.OneToMany(Flag) + +# in a separate reusable app... +class Flag(models.Model) + reason = models.TextField() + resolved = models.BooleanField() + +# in a view... +comment = flag.comment +post = flag.post + +It's obviously less database efficient than simple 2-table reverse +ForeignKey relationships, as you have to do an extra join on the third +table; but you gain semantic clarity and a nice way to use it in +reusable apps, so in many circumstances it's worth it. And it's a +fair shake clearer than the existing generic foreign key solutions. + + Clean up Relation API to make it consistant: ============================================ The problem is that when using get_fields(), you'll get either a @@ -503,6 +578,7 @@ construction inside expressions. Therefore this lookup type will be left out of this project as the mechanism would need much more work to make it possible. + ``__in`` lookups for ``VirtualField`` ======================================= @@ -566,19 +642,4 @@ escaping it. Other considerations -------------------- -This infrastructure will allow reimplementing the GenericForeignKey as a -CompositeField at a later stage. Thanks to the modifications in the -joining code it should also be possible to implement bidirectional generic -relationship traversal in QuerySet filters. This is, however, out of scope -of this project. - -CompositeFields will have the serialize option set to False to prevent -their serialization. Otherwise the enclosed fields would be serialized -twice which would not only infer redundancy but also ambiguity. 
- -Also CompositeFields will be ignored in ModelForms by default, for two -reasons: -- otherwise the same field would be inside the form twice -- there aren't really any form fields usable for tuples and a fieldset - would require even more out-of-scope machinery From 5eb2a1e9ecd97845e24922c97c965e8f370317b8 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Fri, 31 Mar 2017 01:06:52 +0600 Subject: [PATCH 64/80] modifications --- draft/orm-field-api-related-improvement.rst | 56 +++++++++------------ 1 file changed, 25 insertions(+), 31 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 2d0817cb..45717fac 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -33,50 +33,28 @@ solution is to introduce REAL VirtualField types and refactor Fields/RelationFields API based on virtualFields type. -Notes on Porting previous work on top of master: -================================================ -Considering the huge changes in ORM internals it is neither practical nor -trivial to rebase & port previous works related to ForeignKey refactor and CompositeKey without figuring out new approach based on present ORM internals -design on top of master. - -A better approach would be to Improve Field API, major cleanup of RealtionField -API, model._meta and internal field_valaue_cache and related areas first. - -After completing the major clean ups of Fields/RelationFields a REAL -VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could done. - -This appraoch should keep things sane and easier to approach on smaller chunks. - -Later any VirtualField derived Field like CompositeField implementation -should be less complex after the completion of virtualField based refactors. 
- - Abstract ========== This DEP aims to improve different part of django ORM and ot associated -parts of django to support Real VirtualFieldtype in django. There were +parts of django to support Real VirtualField type in django. There were several attempt to fix this problem before. So in this Dep we will try to follow the suggested approaches from Michal Patrucha's previous works and suggestions in tickets and IRC chat/mailing list. Few other related -tickets were also analyzed to find out the proper ways and API design. +tickets were also analyzed to find out possible way's of API design. -The main motivation of this Dep's approach is to improve django ORM's -Field API and design everything as much simple and small as possible -to be able to implement separately. -To keep thing sane it would be better to split the Dep in 3 major Part: +To keep thing sane it would be better to split the Dep in some major Parts: -1. Logical refactor of present Field API and RelationField API and make - them consistant +1. Logical refactor of present Field API and RelationField API, to make + them sipler and consistant with _meta API calls -2. Fields internal value cache refactor for relation fields +2. Fields internal value cache refactor for relation fields (may be) -3. VirtualField Based refactor +3. VirtualField Based refactor of RelationFields API -Key steps of New Approach to improve ORM Field API internals: +Key steps of to follow to improve ORM Field API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, BaseField etc and change on ORM based on the splitted API. @@ -108,7 +86,7 @@ Key steps of New Approach to improve ORM Field API internals: 13. Consider Database Contraints work -14. SubField/AuxilaryField +14. SubField/AuxilaryField [may be] @@ -642,4 +620,20 @@ escaping it. 
Other considerations -------------------- +Notes on Porting previous work on top of master: +================================================ +Considering the huge changes in ORM internals it is neither practical nor +trivial to rebase & port previous works related to ForeignKey refactor and CompositeKey without figuring out new approach based on present ORM internals +design on top of master. + +A better approach would be to Improve Field API, major cleanup of RealtionField +API, model._meta and internal field_valaue_cache and related areas first. +After completing the major clean ups of Fields/RelationFields a REAL +VirtualField type should be introduced and VirtualField based refactor +of ForeignKey and relationFields could done. + +This appraoch should keep things sane and easier to approach on smaller chunks. + +Later any VirtualField derived Field like CompositeField implementation +should be less complex after the completion of virtualField based refactors. From e2116c8fb99cf90853e123bbedf2dbec0e570f2e Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Fri, 31 Mar 2017 01:41:48 +0600 Subject: [PATCH 65/80] more modifications --- draft/orm-field-api-related-improvement.rst | 61 +++++++++++---------- 1 file changed, 32 insertions(+), 29 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 45717fac..a9c85003 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -11,31 +11,28 @@ DEP : ORM Relation Fields API Improvements using VirtualField :Created: 2017-3-5 :Last-Modified: 2017-00-00 -.. contents:: Table of Contents - :depth: 3 - :local: Background: =========== -Django's ORM is a simple & powerful tool which suits most use-cases, -however, there are some historical design limitations and inconsistant -implementation in orm relation fields API which produce many inconsistant -behaviour. 
+Django's ORM is a simple & powerful tool which suits most use-cases. +However, historicaly it has some design limitations and complex internal +API which makes it not only hard to maintain but also produce inconsistant +behaviours. This type of design limitation made it difficult to add support for composite primarykey or working with relationField/genericRelations -very annoying as they produces inconsistant behaviour and their +very annoying as they don't produce consistant behaviour and their implementaion is hard to maintain due to many special casing. -In order to fix this design limitations and inconsistant API's the proposed -solution is to introduce REAL VirtualField types and refactor -Fields/RelationFields API based on virtualFields type. +In order to fix these design limitations and inconsistancies, the proposed +solution is to refactor Fields/RelationFields to new simpler API and +incorporate virtualField type based refctors of RelationFields. Abstract ========== -This DEP aims to improve different part of django ORM and ot associated +This DEP aims to improve different part of django ORM and associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem before. So in this Dep we will try to follow the suggested approaches from Michal Patrucha's previous works @@ -46,11 +43,13 @@ tickets were also analyzed to find out possible way's of API design. To keep thing sane it would be better to split the Dep in some major Parts: 1. Logical refactor of present Field API and RelationField API, to make - them sipler and consistant with _meta API calls + them simpler and consistant with _meta API calls -2. Fields internal value cache refactor for relation fields (may be) +2. Introduce new sane API for RelationFields [internal/provisional] -3. VirtualField Based refactor of RelationFields API +3. Fields internal value cache refactor for relation fields (may be) + +4. 
VirtualField Based refactor of RelationFields API @@ -60,33 +59,37 @@ Key steps of to follow to improve ORM Field API internals: BaseField etc and change on ORM based on the splitted API. 2. Change ForeignObjectRel subclasses to real field instances. (For example, - ForeignKey generates a ManyToOneRel in the related model). The Rel instances are already returned from get_field(), but they aren't yet field subclasses. + ForeignKey generates a ManyToOneRel in the related model). The Rel instances + are already returned from get_field(), but they aren't yet field subclasses. -3. Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be - advantageous to be able to define reverse relations directly. For example, +3. Allow direct usage of ForeignObjectRel subclasses. In certain cases it + can be advantageous to be able to define reverse relations directly. For + example, see ​https://github.com/akaariai/django-reverse-unique. -5. Introduce new standalone well defined ``VirtualField`` +4. Introduce new standalone well defined ``VirtualField`` -6. Incorporate ``VirtualField`` related changes in django +5. Incorporate ``VirtualField`` related changes in django -7. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API +6. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API -8. Figure out other cases where true virtual fields are needed. +7. Figure out other cases where true virtual fields are needed. -9. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey +8. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey -10. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey +9. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey -11. Make changes to migrations framework to work properly with Reafctored Field +10. 
Make changes to migrations framework to work properly with Reafctored Field API. +11. Migrations work well with VirtualField based refactored API + 12. Make sure new class based Index API ise used properly with refactored Field API. -13. Consider Database Contraints work +13. Query/QuerySets/Expressions work well with new refactored API's -14. SubField/AuxilaryField [may be] +14. ContentTypes/GenericRelations/GenericForeginKey works well with new Fields API @@ -605,8 +608,8 @@ ModelChoiceFields Again, we need a way to specify the value as a parameter passed in the form. The same escaping solution can be used even here. -Admin -~~~~~ +Admin/ModelForms +================ The solution that has been proposed so many times in the past [2], [3] is to extend the quote function used in the admin to also quote the comma and From 1c0e12bf7ca2a79284fd844ebc2c9514d18da32f Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 1 Apr 2017 13:50:35 +0600 Subject: [PATCH 66/80] modifications --- draft/orm-field-api-related-improvement.rst | 70 ++++++++++++--------- 1 file changed, 42 insertions(+), 28 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index a9c85003..4a84bd79 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -30,8 +30,8 @@ solution is to refactor Fields/RelationFields to new simpler API and incorporate virtualField type based refctors of RelationFields. -Abstract -========== +Aim of the Proposal: +==================== This DEP aims to improve different part of django ORM and associated parts of django to support Real VirtualField type in django. There were several attempt to fix this problem before. 
So in this Dep we will try @@ -56,29 +56,31 @@ To keep thing sane it would be better to split the Dep in some major Parts: Key steps of to follow to improve ORM Field API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, - BaseField etc and change on ORM based on the splitted API. + BaseField, RelationField etc and adjust codes based on that API. -2. Change ForeignObjectRel subclasses to real field instances. (For example, - ForeignKey generates a ManyToOneRel in the related model). The Rel instances - are already returned from get_field(), but they aren't yet field subclasses. +2. Change ForeignObjectRel subclasses to real field instances. + The Rel instances are already returned from get_field(), but they + aren't yet field subclasses. (For example, ForeignKey generates + a ManyToOneRel in the related model). -3. Allow direct usage of ForeignObjectRel subclasses. In certain cases it - can be advantageous to be able to define reverse relations directly. For - example, - see ​https://github.com/akaariai/django-reverse-unique. +3. Allow direct usage of ForeignObjectRel subclasses. In certain cases + it could be advantageous to be able to define reverse relations directly. + For example, ​https://github.com/akaariai/django-reverse-unique. -4. Introduce new standalone well defined ``VirtualField`` +4. Introduce new standalone well defined ``VirtualField``. -5. Incorporate ``VirtualField`` related changes in django +5. Incorporate ``VirtualField`` related changes in django. -6. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` etc NEW Field API +6. Refactor ForeignKey based on ``VirtualField`` and ``ConcreteField`` + etc new Field API. -7. Figure out other cases where true virtual fields are needed. +7. Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` + and new Field API based ForeignKey. -8. 
Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey - -9. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey +8. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey +9. ContentTypes/GenericRelations/GenericForeginKey works well with new Fields API + 10. Make changes to migrations framework to work properly with Reafctored Field API. @@ -89,7 +91,9 @@ Key steps of to follow to improve ORM Field API internals: 13. Query/QuerySets/Expressions work well with new refactored API's -14. ContentTypes/GenericRelations/GenericForeginKey works well with new Fields API +14. refactor GIS framework based on the changes in ORM + +15. ModelForms/Admin work well with posposed changes @@ -103,8 +107,8 @@ New split out Field API ========================= 1. BaseField: ------------- -Base structure for all Field types in django ORM wheather it is Concrete -or VirtualField +Base structure for all Field types in django ORM wheather it is Concrete, +relation or VirtualField 2. ConcreteField: ----------------- @@ -113,15 +117,21 @@ ConcreteField will have all the common attributes of a Regular concrete field 3. Field: --------- Presence base Field class with should refactored using BaseField and ConcreteField. -If it is decided to provide the optional virtual type to regular fields then VirtualField's features can also be added to specific fields. +If it is decided to provide the optional virtual type to regular fields then +VirtualField's features can also be added to specific fields. 4. RelationField: ----------------- +Based Field for All relation fields. 5. VirtualField: ---------------- -A true stand alone virtula field will be added to the system to be used to solve some long standing design limitations of django orm. 
initially RelationFields, GenericRelations etc will be benefitted by using VirtualFields and later CompositeField -or any virtual type field can be benefitted from VirtualField. +A true stand alone virtula field will be added to the system to be used to solve +some long standing design limitations of django orm. initially RelationFields, +GenericRelations etc will be benefitted by using VirtualFields and later +CompositeField or any virtual type field can be benefitted from VirtualField. + + Relation Field API clean up: ============================ @@ -135,10 +145,13 @@ A relation in Django consits of: - A descriptor to access the objects of the relation - The descriptor might need a custom manager - Possibly a remote relation field (the field to travel the relation in other direction) - Note that this is different from the target and source fields, which define which concrete fields this relation use (essentially, which columns to equate in the JOIN condition) + Note that this is different from the target and source fields, which define which + concrete fields this relation use (essentially, which columns to equate in the + JOIN condition) - The remote field can also contain a descriptor and a manager. - For deprecation period, field.rel is a bit like the remote field, but without - actually being a field instance. This is created only in the origin field, the remote field doesn't have a rel (as we don't need backwards compatibility + actually being a field instance. This is created only in the origin field, + the remote field doesn't have a rel (as we don't need backwards compatibility for the remote fields) The loading order is as follows: @@ -620,13 +633,14 @@ contained inside a value so we can't really avoid choosing one and escaping it. 
-Other considerations --------------------- +GIS Framework: +============== Notes on Porting previous work on top of master: ================================================ Considering the huge changes in ORM internals it is neither practical nor -trivial to rebase & port previous works related to ForeignKey refactor and CompositeKey without figuring out new approach based on present ORM internals +trivial to rebase & port previous works related to ForeignKey refactor and +CompositeKey without figuring out new approach based on present ORM internals design on top of master. A better approach would be to Improve Field API, major cleanup of RealtionField From fe0df8004a3272faf35391019e13ce968780bf20 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 1 Apr 2017 22:12:45 +0600 Subject: [PATCH 67/80] drfat relational field api --- draft/orm-field-api-related-improvement.rst | 67 +++++++++++++++++++++ 1 file changed, 67 insertions(+) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 4a84bd79..d9678add 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -302,7 +302,74 @@ need updating. Examples of such projects include django-rest-framework and djang Proposed API and workd flow for clean ups: ========================================== +Relational field API +==================== +Currently the main use case is that we have a single place where I +can check that we don't define redundant APIs for related fields. + +Structure of a relational field +------------------------------- +A relational field consist of: + + - The user created field + - Possibly of a remote field, which is auto-created by the user created field + + Both the created field and the remote field can possibly add a descriptor to + the field's model. + + Both the remote field and the user created field have (mostly) matching API. 
+    The API consists of the following attributes and methods:
+
+    .. attribute:: name
+
+        The name of the field. This is the key of the field in _meta.get_field() calls, and
+        thus this is also the name used in ORM queries.
+
+    .. attribute:: attname
+
+        ForeignKeys have the concrete value in field.attname, and the model instance in
+        field.name. For example Author.book_id contains an integer, and Author.book contains
+        a book instance. Here the attname is book_id.
+
+    .. method:: get_query_name()
+
+        A method that generates the field's name. Only needed for remote fields.
+
+    .. method:: get_accessor_name()
+
+        A method that generates the name the field's descriptor should be placed into.
+
+    For remote fields, get_query_name() is essentially similar to the related_query_name
+    parameter, and get_accessor_name() is similar to the related_name parameter.
+
+    .. method:: get_path_info()
+
+        Tells Django which relations to travel when this field is queried. Essentially
+        returns one PathInfo structure for each join needed by this field.
+
+    .. method:: get_extra_restriction()
+
+        Tells Django which extra restrictions should be placed onto joins generated.
+
+    .. attribute:: model
+
+        The originating model of this field.
+
+    .. attribute:: remote_field
+
+        The remote field of this relation.
+
+    .. attribute:: remote_model
+
+        Same as self.remote_field.model.
+
+
+    ******************************** RANDOM DESIGN DOCUMENTATION ***********************
+    Abstract models and relational fields:
+      - If an abstract model defines a relation to a non-abstract model, we must not add the remote
+        field.
+      - If a model defines a relation to an abstract model, this should just fail (check this!)
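The relational field API drafted in the patch above can be sketched in plain Python. This is only an illustration under stated assumptions, not Django's implementation: ``SketchRelationField``, ``SketchRemoteField`` and the ``PathInfo`` tuple are hypothetical stand-ins, models are represented by plain strings, and the default names returned by ``get_query_name()`` and ``get_accessor_name()`` merely imitate Django's familiar lowercased-model-name and ``_set`` convention.

```python
from collections import namedtuple

# Hypothetical stand-in for the PathInfo structure named in the draft;
# the fields chosen here are an assumption, not Django's real PathInfo.
PathInfo = namedtuple("PathInfo", ["from_model", "to_model", "join_field"])


class SketchRelationField:
    """The user-created side of a relation (for example a ForeignKey)."""

    def __init__(self, model, name, to, related_name=None,
                 related_query_name=None):
        self.model = model              # originating model of this field
        self.name = name                # key used in queries
        self.attname = name + "_id"     # attribute holding the concrete value
        self.related_name = related_name
        self.related_query_name = related_query_name
        # The remote field is auto-created by the user-created field.
        self.remote_field = SketchRemoteField(model=to, origin=self)

    @property
    def remote_model(self):
        # Same as self.remote_field.model, as the draft specifies.
        return self.remote_field.model

    def get_path_info(self):
        # One PathInfo per join; a simple forward relation needs one join.
        return [PathInfo(self.model, self.remote_model, self)]

    def get_extra_restriction(self):
        # No extra join restriction in this sketch.
        return None


class SketchRemoteField:
    """The auto-created side, exposing (mostly) the same API."""

    def __init__(self, model, origin):
        self.model = model              # model the remote field lives on
        self.remote_field = origin      # points back at the user-created field

    @property
    def name(self):
        return self.get_query_name()

    @property
    def remote_model(self):
        return self.remote_field.model

    def get_query_name(self):
        origin = self.remote_field
        return (origin.related_query_name or origin.related_name
                or origin.model.lower())

    def get_accessor_name(self):
        origin = self.remote_field
        return origin.related_name or origin.model.lower() + "_set"

    def get_path_info(self):
        return [PathInfo(self.model, self.remote_model, self)]

    def get_extra_restriction(self):
        return None


# Usage, following the draft's Author.book / Author.book_id example.
book_field = SketchRelationField(model="Author", name="book", to="Book")
reverse = book_field.remote_field
print(book_field.name, book_field.attname, book_field.remote_model)
print(reverse.get_query_name(), reverse.get_accessor_name(), reverse.remote_model)
```

Keeping the two sides symmetric (each has ``model``, ``remote_field``, ``remote_model`` and ``get_path_info()``) is the point of the draft: code that walks relations would not need to special-case which side it started from.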
From 8d1359902776e2d3c3253a0e9c4545d511a8d780 Mon Sep 17 00:00:00 2001
From: Asif Saifuddin Auvi
Date: Sat, 1 Apr 2017 22:21:14 +0600
Subject: [PATCH 68/80] problem section

---
 draft/orm-field-api-related-improvement.rst | 194 ++++++++++----------
 1 file changed, 97 insertions(+), 97 deletions(-)

diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst
index d9678add..f33a611b 100644
--- a/draft/orm-field-api-related-improvement.rst
+++ b/draft/orm-field-api-related-improvement.rst
@@ -30,6 +30,83 @@ solution is to refactor Fields/RelationFields to new simpler API and
 incorporate virtualField type based refctors of RelationFields.


+Limitations of ORM that will be taken care of:
+==============================================
+One limitation is,
+
+Django supports many-to-one relationships -- the foreign keys live on
+the "many", and point to the "one". So, in a simple app where you
+have Comments that can get Flagged, one Comment can have many Flags,
+but each Flag refers to one and only one Comment::
+
+    class Comment(models.Model):
+        text = models.TextField()
+
+    class Flag(models.Model):
+        comment = models.ForeignKey(Comment)
+
+However, there are circumstances where it's much more convenient to
+express the relationship as a one-to-many relationship. Suppose, for
+example, you want to have a generic "flagging" app which other models
+can use::
+
+    class Comment(models.Model):
+        text = models.TextField()
+        flags = models.OneToMany(Flag)
+
+That way, if you had a new content type (say, a "Post"), it could also
+participate in flagging, without having to modify the model definition
+of "Flag" to add a new foreign key. Without baking in migrations,
+there's obviously no way to make the underlying SQL play nice in this
+circumstance: one-to-many relationships with just two tables can only
+be expressed in SQL with a reverse foreign key relationship. However,
+it's possible to describe OneToMany as a subset of ManyToMany, with a
+uniqueness constraint on the "One" -- we rely on the join table to
+handle the relationship::
+
+    class Comment(models.Model):
+        text = models.TextField()
+        flags = models.ManyToManyField(Flag, through="CommentFlag")
+
+    class CommentFlag(models.Model):
+        comment = models.ForeignKey(Comment)
+        flag = models.ForeignKey(Flag, unique=True)
+
+While this works, the query interface remains cumbersome. To access
+the comment from a flag, I have to call::
+
+    comment = flag.comment_set.all()[0]
+
+as the ORM doesn't know for a fact that each flag could only have one
+comment. But Django _could_ implement a OneToManyField in this way
+(using the underlying ManyToMany paradigm), and provide sugar such
+that this would all be nice and flexible, without having to do cumbersome
+ORM calls or explicitly define extra join tables::
+
+    class Comment(models.Model):
+        text = models.TextField()
+        flags = models.OneToMany(Flag)
+
+    class Post(models.Model):
+        body = models.TextField()
+        flags = models.OneToMany(Flag)
+
+    # in a separate reusable app...
+    class Flag(models.Model):
+        reason = models.TextField()
+        resolved = models.BooleanField()
+
+    # in a view...
+    comment = flag.comment
+    post = flag.post
+
+It's obviously less database efficient than simple 2-table reverse
+ForeignKey relationships, as you have to do an extra join on the third
+table; but you gain semantic clarity and a nice way to use it in
+reusable apps, so in many circumstances it's worth it. And it's a
+fair shake clearer than the existing generic foreign key solutions.
+ + Aim of the Proposal: ==================== This DEP aims to improve different part of django ORM and associated @@ -53,6 +130,26 @@ To keep thing sane it would be better to split the Dep in some major Parts: +Notes on Porting previous work on top of master: +================================================ +Considering the huge changes in ORM internals it is neither practical nor +trivial to rebase & port previous works related to ForeignKey refactor and +CompositeKey without figuring out new approach based on present ORM internals +design on top of master. + +A better approach would be to Improve Field API, major cleanup of RealtionField +API, model._meta and internal field_valaue_cache and related areas first. + +After completing the major clean ups of Fields/RelationFields a REAL +VirtualField type should be introduced and VirtualField based refactor +of ForeignKey and relationFields could done. + +This appraoch should keep things sane and easier to approach on smaller chunks. + +Later any VirtualField derived Field like CompositeField implementation +should be less complex after the completion of virtualField based refactors. + + Key steps of to follow to improve ORM Field API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, @@ -194,79 +291,7 @@ A relation in Django consits of: - The post_relation_ready() method is called for both the origin and the remote field. This will create the descriptor on both the origin and remote field (unless the remote relation is hidden, in which case no descriptor is created) -Another limitation is, - -Django supports many-to-one relationships -- the foreign keys live on -the "many", and point to the "one". 
So, in a simple app where you -have Comments that can get Flagged, one Comment can have many Flag's, -but each Flag refers to one and only one Comment: - -class Comment(models.Model): - text = models.TextField() - -class Flag(models.Model): - comment = models.ForeignKey(Comment) - -However, there are circumstances where it's much more convenient to -express the relationship as a one-to-many relationship. Suppose, for -example, you want to have a generic "flagging" app which other models -can use: - -class Comment(models.Model): - text = models.TextField() - flags = models.OneToMany(Flag) - -That way, if you had a new content type (say, a "Post"), it could also -participate in flagging, without having to modify the model definition -of "Flag" to add a new foreign key. Without baking in migrations, -there's obviously no way to make the underlying SQL play nice in this -circumstance: one-to-many relationships with just two tables can only -be expressed in SQL with a reverse foreign key relationship. However, -it's possible to describe OneToMany as a subset of ManyToMany, with a -uniqueness constraint on the "One" -- we rely on the join table to -handle the relationship: -class Comment(models.Model): - text = models.TextField() - flags = models.ManyToMany(Flag, through=CommentFlag) - -class CommentFlag(models.Model): - comment = models.ForeignKey(Comment) - flag = models.ForeignKey(Flag, unique=True) - -While this works, the query interface remains cumbersome. To access -the comment from a flag, I have to call: - -comment = flag.comment_set.all()[0] - -as the ORM doesn't know for a fact that each flag could only have one -comment. 
But Django _could_ implement a OneToManyField in this way -(using the underlying ManyToMany paradigm), and provide sugar such -that this would all be nice and flexible, without having to do cumbersome -ORM calls or explicitly define extra join tables: - -class Comment(models.Model): - text = models.TextField() - flags = models.OneToMany(Flag) - -class Post(models.Model): - body = models.TextField() - flags = models.OneToMany(Flag) - -# in a separate reusable app... -class Flag(models.Model) - reason = models.TextField() - resolved = models.BooleanField() - -# in a view... -comment = flag.comment -post = flag.post - -It's obviously less database efficient than simple 2-table reverse -ForeignKey relationships, as you have to do an extra join on the third -table; but you gain semantic clarity and a nice way to use it in -reusable apps, so in many circumstances it's worth it. And it's a -fair shake clearer than the existing generic foreign key solutions. Clean up Relation API to make it consistant: @@ -691,33 +716,8 @@ form. The same escaping solution can be used even here. Admin/ModelForms ================ -The solution that has been proposed so many times in the past [2], [3] is -to extend the quote function used in the admin to also quote the comma and -then use an unquoted comma as the separator. Even though this solution -looks ugly to some, I don't think there is much choice -- there needs to -be a way to separate the values and in theory, any character could be -contained inside a value so we can't really avoid choosing one and -escaping it. GIS Framework: ============== -Notes on Porting previous work on top of master: -================================================ -Considering the huge changes in ORM internals it is neither practical nor -trivial to rebase & port previous works related to ForeignKey refactor and -CompositeKey without figuring out new approach based on present ORM internals -design on top of master. 
- -A better approach would be to Improve Field API, major cleanup of RealtionField -API, model._meta and internal field_valaue_cache and related areas first. - -After completing the major clean ups of Fields/RelationFields a REAL -VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could done. - -This appraoch should keep things sane and easier to approach on smaller chunks. - -Later any VirtualField derived Field like CompositeField implementation -should be less complex after the completion of virtualField based refactors. From aabb4595bc4b8f91dff0df3e517b0a69430f51ea Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 2 Apr 2017 00:46:07 +0600 Subject: [PATCH 69/80] organize --- draft/orm-field-api-related-improvement.rst | 107 +++++--------------- 1 file changed, 28 insertions(+), 79 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index f33a611b..219a09ae 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -73,7 +73,7 @@ class CommentFlag(models.Model): flag = models.ForeignKey(Flag, unique=True) While this works, the query interface remains cumbersome. 
To access -the comment from a flag, I have to call: +the comment from a flag, have to call: comment = flag.comment_set.all()[0] @@ -132,19 +132,20 @@ To keep thing sane it would be better to split the Dep in some major Parts: Notes on Porting previous work on top of master: ================================================ -Considering the huge changes in ORM internals it is neither practical nor -trivial to rebase & port previous works related to ForeignKey refactor and -CompositeKey without figuring out new approach based on present ORM internals +Considering the huge changes in ORM internals it is neither trivial nor +practical to rebase & port previous works related to ForeignKey refactor +without figuring out new approach based on present ORM internals design on top of master. -A better approach would be to Improve Field API, major cleanup of RealtionField -API, model._meta and internal field_valaue_cache and related areas first. +A better approach would be to Improve Field API, major cleanup of +RealtionField API, model._meta and internal field_valaue_cache and +related areas first. After completing the major clean ups of Fields/RelationFields a REAL VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could done. +of ForeignKey and relationFields could have been done. -This appraoch should keep things sane and easier to approach on smaller chunks. +This appraoch should keep things easier to approach with smaller steps. Later any VirtualField derived Field like CompositeField implementation should be less complex after the completion of virtualField based refactors. @@ -205,28 +206,28 @@ New split out Field API 1. BaseField: ------------- Base structure for all Field types in django ORM wheather it is Concrete, -relation or VirtualField +RelationField or VirtualField 2. 
ConcreteField: ----------------- -ConcreteField will have all the common attributes of a Regular concrete field +ConcreteField will extract all the common attributes of a Regular concrete field 3. Field: --------- -Presence base Field class with should refactored using BaseField and ConcreteField. -If it is decided to provide the optional virtual type to regular fields then +Field class should be refactored using BaseField and ConcreteField. If it +is decided to provide the optional virtual type to regular fields then VirtualField's features can also be added to specific fields. 4. RelationField: ----------------- -Based Field for All relation fields. +Base Field for All relation fields extended from new BaseField class. 5. VirtualField: ---------------- -A true stand alone virtula field will be added to the system to be used to solve -some long standing design limitations of django orm. initially RelationFields, -GenericRelations etc will be benefitted by using VirtualFields and later -CompositeField or any virtual type field can be benefitted from VirtualField. +A true stand alone virtula field will be added to solve some long standing +design limitations of django orm. initially RelationFields, GenericRelations +etc will be benefitted by using VirtualFields and later CompositeField or +any virtual type field can be benefitted from VirtualField. @@ -281,14 +282,15 @@ A relation in Django consits of: - As last step of contribut_to_class method the prepare_remote() method is added as a lazy loaded method. It will be called when both Book and Author are ready. As it happens, they are both ready in the example, - so the method is called immediately.If the Author model was defined later - than Book, and Book had a string reference to Author, then the method would + so the method is called immediately. If the Author model was defined later + than Book and Book had a string reference to Author, then the method would be called only after Author was ready. 3. 
The prepare_remote() method is called. - The remote field is created based on attributes of the origin field. The field is added to the remote model (the field's contribute_to_class is called) - - The post_relation_ready() method is called for both the origin and the remote field. This will create the descriptor on both the origin and remote field + - The post_relation_ready() method is called for both the origin and the remote field. + This will create the descriptor on both the origin and remote field (unless the remote relation is hidden, in which case no descriptor is created) @@ -301,8 +303,7 @@ field.rel instance (for reverse side of user defined fields), or a real field instance(for example ForeignKey). These behave differently, so that the user must always remember which one he is dealing with. This creates lots of non-necessary conditioning -in multiple places of -Django. +in multiple places of Django. For example, the select_related descent has one branch for descending foreign keys and one to one fields, and another branch for descending to reverse one @@ -314,7 +315,8 @@ The remote_field is just a field subclass, just like everything else in Django. The benefits are: -Conceptual simplicity - dealing with fields and rels is non-necessaryand confusing. Everything from get_fields() should be a field. +Conceptual simplicity - dealing with fields and rels is non-necessaryand confusing. +Everything from get_fields() should be a field. Code simplicity - no special casing based on if a given relation is described by a rel or not Code reuse - ReverseManyToManyField is in most regard exactly like @@ -323,7 +325,9 @@ ManyToManyField. The expected problems are mostly from 3rd party code. Users of _meta that already work on expectation of getting rel instances will likely need updating. Those users who subclass Django's fields (or duck-type Django's fields) will -need updating. Examples of such projects include django-rest-framework and django-taggit. 
+need updating. Examples of such projects include django-rest-framework and +django-taggit. + Proposed API and workd flow for clean ups: ========================================== @@ -598,23 +602,6 @@ composite primary key containing any special columns. This should be extremely rare anyway. -GenericForeignKeys -~~~~~~~~~~~~~~~~~~ - -Even though the admin uses the contenttypes framework to log the history -of actions, it turns out proper handling on the admin side will make -things work without the need to modify GenericForeignKey code at all. This -is thanks to the fact that the admin uses only the ContentType field and -handles the relations on its own. Making sure the unquoting function -recreates the whole CompositeObjects where necessary should suffice. - -At a later stage, however, GenericForeignKeys could also be improved to -support composite primary keys. Using the same quoting solution as in the -admin could work in theory, although it would only allow fields capable of -storing arbitrary strings to be usable for object_id storage. This has -been left out of the scope of this project, though. - - QuerySet filtering ~~~~~~~~~~~~~~~~~~ @@ -668,44 +655,6 @@ possible. ``__in`` lookups for ``VirtualField`` ======================================= -The existing implementation of ``CompositeField`` handles ``__in`` lookups -in the generic, backend-independent ``WhereNode`` class and uses a -disjunctive normal form expression as in the following example:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 5 AND c = 6); - -The problem with this solution is that in cases where the list of values -contains tens or hundreds of tuples, this DNF expression will be extremely -long and the database will have to evaluate it for each and every row, -without a possibility of optimizing the query. 
- -Certain database backends support the following alternative:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE (a, b, c) IN [(1, 2, 3), (4, 5, 6)]; - -This would probably be the best option, but it can't be used by SQLite, -for instance. This is also the reason why the DNF expression was -implemented in the first place. - -In order to support this more natural syntax, the ``DatabaseOperations`` -needs to be extended with a method such as ``composite_in_sql``. - -However, this leaves the issue of the inefficient DNF unresolved for -backends without support for tuple literals. For such backends, the -following expression is proposed:: - - SELECT a, b, c FROM tbl1, tbl2 - WHERE EXISTS (SELECT a1, b1, c1, FROM (SELECT 1 as a, 2 as b, 3 as c - UNION SELECT 4, 5, 6) - WHERE a1=1 AND b1=b AND c1=c); - -Since both syntaxes are rather generic and at least one of them should fit -any database backend directly, a new flag will be introduced, -``DatabaseFeatures.supports_tuple_literals`` which the default -implementation of ``composite_in_sql`` will consult in order to choose -between the two options. ModelChoiceFields ~~~~~~~~~~~~~~~~~ From 5ef6124d6066427ef95a5584c5d8032aaba4b60e Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 2 Apr 2017 14:47:26 +0600 Subject: [PATCH 70/80] re ogranize n clean ups --- draft/orm-field-api-related-improvement.rst | 168 +++++++------------- 1 file changed, 60 insertions(+), 108 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 219a09ae..4c03e077 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -415,14 +415,65 @@ concrete fields, but doesn't add or alter columns in the database." +Changes in ``RelationField`` +============================= +Relationship fields +~~~~~~~~~~~~~~~~~~~ + +This turns out to be, not too surprisingly, the toughest problem. 
The fact +that related fields are spread across about fifteen different classes, +most of which are quite nontrivial, makes the whole bundle pretty fragile, +which means the changes have to be made carefully not to break anything. + +What we need to achieve is that the ForeignKey, ManyToManyField and +OneToOneField detect when their target field is a CompositeField in +several situations and act accordingly since this will require different +handling than regular fields that map directly to database columns. + +The first one to look at is ForeignKey since the other two rely on its +functionality, OneToOneField being its descendant and ManyToManyField +using ForeignKeys in the intermediary model. Once the ForeignKeys work, +OneToOneField should require minimal to no changes since it inherits +almost everything from ForeignKey. + +The easiest part is that for composite related fields, the db_type will be +None since the data will be stored elsewhere. + +ForeignKey and OneToOneField will also be able to create the underlying +fields automatically when added to the model. I'm proposing the following +default names: "fkname_targetname" where "fkname" is the name of the +ForeignKey field and "targetname" is the name of the remote field name +corresponding to the local one. I'm open to other suggestions on this. + +There will also be a way to override the default names using a new field +option "enclosed_fields". This option will expect a tuple of fields each +of whose corresponds to one individual field in the same order as +specified in the target CompositeField. This option will be ignored for +non-composite ForeignKeys. + +The trickiest part, however, will be relation traversals in QuerySet +lookups. Currently the code in models.sql.query.Query that creates joins +only joins on single columns. 
To be able to span a composite relationship +the code that generates joins will have to recognize column tuples and add +a constraint for each pair of corresponding columns with the same aliases +in all conditions. + +For the sake of completeness, ForeignKey will also have an extra_filters +method allowing to filter by a related object or its primary key. + +With all this infrastructure set up, ManyToMany relationships using +composite fields will be easy enough. Intermediary model creation will +work thanks to automatic underlying field creation for composite fields +and traversal in both directions will be supported by the query code. + + Changes in ``ForeignKey`` ========================= Currently ``ForeignKey`` is a regular concrete field which manages both the raw value stored in the database and the higher-level relationship semantics. Managing the raw value is simple enough for simple -(single-column) targets. However, in the case of a composite target field, -this task becomes more complex. The biggest problem is that many parts of +(single-column) targets. The biggest problem is that many parts of the ORM work under the assumption that for each database column there is a model field it can assign the value from the column to. While it might be possible to lift this restriction, it would be a really complex project by @@ -446,15 +497,10 @@ uses a field specifically intended for the task. In order to keep this backwards compatible and avoid the need to explicitly create two fields for each ``ForeignKey``, the auxiliary field needs to be created automatically during the phase where a model class is -created by its metaclass. Initially I implemented this as a method on -``ForeignKey`` which takes the target field and creates its copy, touches -it up and adds it to the model class. 
However, this requires performing -special tasks with certain types of fields, such as ``AutoField`` which -needs to be turned into an ``IntegerField`` or ``CompositeField`` which -requires copying its enclosed fields as well. - -A better approach is to add a method such as ``create_auxiliary_copy`` on -``Field`` which would create all new field instances and add them to the +created by its metaclass. + +A better approach could be to add a method such as ``create_auxiliary_copy`` +on ``Field`` which would create all new field instances and add them to the appropriate model class. One possible problem with these changes is that they change the contents @@ -484,84 +530,15 @@ where ``place_ptr`` is a ``OneToOneField`` and ``chef`` is a 'chef', 'chef_id'] -This causes a lot of failures in the Django test suite, because there are -a lot of tests relying on the contents of ``_meta.fields`` or other -related attributes/properties. (Actually, this example is taken from one -of these tests, -``model_inheritance.tests.ModelInheritanceTests.test_multiple_table``.) -Fixing these is fairly simple, all they need is to add the appropriate -``__id`` fields. However, this raises a concern of how ``_meta`` is -regarded. It has always been a private API officially, but everyone uses -it in their projects anyway. I still think the change is worth it, but it -might be a good idea to include a note about the change in the release -notes. - - -Changes in ``RelationField`` -============================= -Relationship fields -~~~~~~~~~~~~~~~~~~~ - -This turns out to be, not too surprisingly, the toughest problem. The fact -that related fields are spread across about fifteen different classes, -most of which are quite nontrivial, makes the whole bundle pretty fragile, -which means the changes have to be made carefully not to break anything. 
- -What we need to achieve is that the ForeignKey, ManyToManyField and -OneToOneField detect when their target field is a CompositeField in -several situations and act accordingly since this will require different -handling than regular fields that map directly to database columns. - -The first one to look at is ForeignKey since the other two rely on its -functionality, OneToOneField being its descendant and ManyToManyField -using ForeignKeys in the intermediary model. Once the ForeignKeys work, -OneToOneField should require minimal to no changes since it inherits -almost everything from ForeignKey. - -The easiest part is that for composite related fields, the db_type will be -None since the data will be stored elsewhere. - -ForeignKey and OneToOneField will also be able to create the underlying -fields automatically when added to the model. I'm proposing the following -default names: "fkname_targetname" where "fkname" is the name of the -ForeignKey field and "targetname" is the name of the remote field name -corresponding to the local one. I'm open to other suggestions on this. - -There will also be a way to override the default names using a new field -option "enclosed_fields". This option will expect a tuple of fields each -of whose corresponds to one individual field in the same order as -specified in the target CompositeField. This option will be ignored for -non-composite ForeignKeys. - -The trickiest part, however, will be relation traversals in QuerySet -lookups. Currently the code in models.sql.query.Query that creates joins -only joins on single columns. To be able to span a composite relationship -the code that generates joins will have to recognize column tuples and add -a constraint for each pair of corresponding columns with the same aliases -in all conditions. - -For the sake of completeness, ForeignKey will also have an extra_filters -method allowing to filter by a related object or its primary key. 
- -With all this infrastructure set up, ManyToMany relationships using -composite fields will be easy enough. Intermediary model creation will -work thanks to automatic underlying field creation for composite fields -and traversal in both directions will be supported by the query code. ``contenttypes`` and ``GenericForeignKey`` ========================================== -It's fairly easy to represent composite values as strings. Given an -``escape`` function which uniquely escapes commas, something like the -following works quite well:: - - ",".join(escape(value) for value in composite_value) - -However, in order to support JOINs generated by ``GenericRelation``, we -need to be able to reproduce exactly the same encoding using an SQL -expression which would be used in the JOIN condition. +However, in order to support JOINs generated by ``GenericRelation``, +we need to be able to reproduce exactly the same encoding using an +SQL expression which would be used in the JOIN condition. Luckily, while thus encoded strings need to be possible to decode in Python (for example, when retrieving the related object using @@ -570,27 +547,6 @@ this isn't necessary at the database level. Using SQL we only ever need to perform this in one direction, that is from a tuple of values into a string. -That means we can use a generalized version of the function -``django.contrib.admin.utils.quote`` which replaces each unsafe -character with its ASCII value in hexadecimal base, preceded by an escape -character. In this case, only two characters are unsafe -- comma (which is -used to separate the values) and an escape character (which I arbitrarily -chose as '~'). 
- -To reproduce this encoding, all values need to be cast to strings and then -for each such string two calls to the ``replace`` functions are made:: - - replace(replace(CAST (`column` AS text), '~', '~7E'), ',', '~2C') - -According to available documentation, all four supported database backends -provide the ``replace`` function. [2]_ [3]_ [4]_ [5]_ - -Even though the ``replace`` function seems to be available in all major -database servers (even ones not officially supported by Django, including -MSSQL, DB2, Informix and others), this is still probably best left to the -database backend and will be implemented as -``DatabaseOperations.composite_value_to_text_sql``. - One possible pitfall of this implementation might be that it may not work with any column type that isn't an integer or a text string due to a simple fact – the string the database would cast it to will probably @@ -605,12 +561,8 @@ extremely rare anyway. QuerySet filtering ~~~~~~~~~~~~~~~~~~ -This is where the real fun begins. - The fundamental problem here is that Q objects which are used all over the code that handles filtering are designed to describe single field lookups. -On the other hand, CompositeFields will require a way to describe several -individual field lookups by a single expression. 
Since the Q objects themselves have no idea about fields at all and the actual field resolution from the filter conditions happens deeper down the From 3e87416ad626e3688750adeef6488e085b2ea76c Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 2 Apr 2017 20:19:28 +0600 Subject: [PATCH 71/80] re arrange n clean up --- draft/orm-field-api-related-improvement.rst | 79 +++++---------------- 1 file changed, 16 insertions(+), 63 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 4c03e077..f1886693 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -110,12 +110,23 @@ fair shake clearer than the existing generic foreign key solutions. Aim of the Proposal: ==================== This DEP aims to improve different part of django ORM and associated -parts of django to support Real VirtualField type in django. There were -several attempt to fix this problem before. So in this Dep we will try -to follow the suggested approaches from Michal Patrucha's previous works -and suggestions in tickets and IRC chat/mailing list. Few other related -tickets were also analyzed to find out possible way's of API design. +parts of django to support Real VirtualField type in django. So in this +Dep we will try to follow the suggested approaches from Michal Patrucha's +previous works and suggestions in tickets and IRC chat/mailing list. +Related tickets were also analyzed to find out possible way's of API design. +A better approach would be to Improve Field API, major cleanup of +RealtionField API, model._meta and internal field_valaue_cache and +related areas first. + +After completing the major clean ups of Fields/RelationFields a REAL +VirtualField type should be introduced and VirtualField based refactor +of ForeignKey and relationFields could have been done. + +This appraoch should keep things easier to approach with smaller steps. 
+ +Later any VirtualField derived Field like CompositeField implementation +should be less complex after the completion of virtualField based refactors. To keep thing sane it would be better to split the Dep in some major Parts: @@ -130,26 +141,6 @@ To keep thing sane it would be better to split the Dep in some major Parts: -Notes on Porting previous work on top of master: -================================================ -Considering the huge changes in ORM internals it is neither trivial nor -practical to rebase & port previous works related to ForeignKey refactor -without figuring out new approach based on present ORM internals -design on top of master. - -A better approach would be to Improve Field API, major cleanup of -RealtionField API, model._meta and internal field_valaue_cache and -related areas first. - -After completing the major clean ups of Fields/RelationFields a REAL -VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could have been done. - -This appraoch should keep things easier to approach with smaller steps. - -Later any VirtualField derived Field like CompositeField implementation -should be less complex after the completion of virtualField based refactors. - Key steps of to follow to improve ORM Field API internals: ============================================================== @@ -564,44 +555,6 @@ QuerySet filtering The fundamental problem here is that Q objects which are used all over the code that handles filtering are designed to describe single field lookups. -Since the Q objects themselves have no idea about fields at all and the -actual field resolution from the filter conditions happens deeper down the -line, inside models.sql.query.Query, this is where we can handle the -filters properly. - -There is already some basic machinery inside Query.add_filter and -Query.setup_joins that is in use by GenericRelations, this is -unfortunately not enough. 
The optional extra_filters field method will be -of great use here, though it will have to be extended. - -Currently the only parameters it gets are the list of joins the -filter traverses, the position in the list and a negate parameter -specifying whether the filter is negated. The GenericRelation instance can -determine the value of the content type (which is what the extra_filters -method is used for) easily based on the model it belongs to. - -This is not the case for a CompositeField -- it doesn't have any idea -about the values used in the query. Therefore a new parameter has to be -added to the method so that the CompositeField can construct all the -actual filters from the iterable containing the values. - -Afterwards the handling inside Query is pretty straightforward. For -CompositeFields (and virtual fields in general) there is no value to be -used in the where node, the extra_filters are responsible for all -filtering, but since the filter should apply to a single object even after -join traversals, the aliases will be set up while handling the "root" -filter and then reused for each one of the extra_filters. - -This way of extending the extra_filters mechanism will allow the field -class to create conjunctions of atomic conditions. This is sufficient for -the "__exact" lookup type which will be implemented. - -Of the other lookup types, the only one that looks reasonable is "__in". -This will, however, have to be represented as a disjunction of multiple -"__exact" conditions since not all database backends support tuple -construction inside expressions. Therefore this lookup type will be left -out of this project as the mechanism would need much more work to make it -possible. 
``__in`` lookups for ``VirtualField`` From 60a0a3a4bbae543ac871918bd7cb2d377d10856e Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sun, 2 Apr 2017 22:58:18 +0600 Subject: [PATCH 72/80] minor clean up --- draft/orm-field-api-related-improvement.rst | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index f1886693..97e833c5 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -215,7 +215,7 @@ Base Field for All relation fields extended from new BaseField class. 5. VirtualField: ---------------- -A true stand alone virtula field will be added to solve some long standing +A true stand alone VirtualField will be added to solve some long standing design limitations of django orm. initially RelationFields, GenericRelations etc will be benefitted by using VirtualFields and later CompositeField or any virtual type field can be benefitted from VirtualField. @@ -346,11 +346,11 @@ A relational field consist of: The name of the field. This is the key of the field in _meta.get_field() calls, and thus this is also the name used in ORM queries. - .. attribute:: attname + .. attribute:: attr_name - ForeignKeys have the concrete value in field.attname, and the model instance in + ForeignKeys have the concrete value in field.attr_name, and the model instance in field.name. For example Author.book_id contains an integer, and Author.book contains - a book instance. Attname is the book_id value. + a book instance. attr_name is the book_id value. .. method:: get_query_name() @@ -385,7 +385,6 @@ A relational field consist of: Same as self.remote_field.model. - ******************************** RANDOM DESIGN DOCUMENTATION *********************** Abstract models and relational fields: - If an abstract model defines a relation to non-abstract model, we must not add the remote field. 
@@ -401,8 +400,12 @@ Introduce standalone ``VirtualField`` ===================================== what is ``VirtualField``? ------------------------- -"A virtual field is a model field which it correlates to one or multiple -concrete fields, but doesn't add or alter columns in the database." +A VirtualField is a model field type which correlates to one or multiple +concrete fields, but doesn't add or alter columns in the database. + +ORM or migrations certainly can't ignore ForeignKey once it becomes virtual; +instead, migrations will have to hide any auto-generated auxiliary concrete +fields to make migrations backwards-compatible. From 38cd4d1ca52f4c6f4dc1582c307c475682600c9b Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 02:05:38 +0600 Subject: [PATCH 73/80] VirtualField n other changes --- draft/orm-field-api-related-improvement.rst | 115 ++++++++++++++------ 1 file changed, 83 insertions(+), 32 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 97e833c5..84aaecd6 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -109,19 +109,17 @@ fair shake clearer than the existing generic foreign key solutions. Aim of the Proposal: ==================== -This DEP aims to improve different part of django ORM and associated -parts of django to support Real VirtualField type in django. So in this -Dep we will try to follow the suggested approaches from Michal Patrucha's -previous works and suggestions in tickets and IRC chat/mailing list. -Related tickets were also analyzed to find out possible way's of API design. +This DEP aims to improve django ORM internal Field and related Fields +private API to provide a sane API and mechanism for relation fields. +Parts of it also propose to introduce true VirtualField type in django.
-A better approach would be to Improve Field API, major cleanup of -RealtionField API, model._meta and internal field_valaue_cache and -related areas first. +To achieve these goals, a better approach would be to improve the Field API, +do a major cleanup of the RelationField API, model._meta and internal field_valaue_cache +and related areas first. -After completing the major clean ups of Fields/RelationFields a REAL -VirtualField type should be introduced and VirtualField based refactor -of ForeignKey and relationFields could have been done. +After completing the major clean ups of Fields/RelationFields a standalone +VirtualField and VirtualField based refactors of ForeignKey and relationFields +and other parts of orm/contenttypes etc can be done. This appraoch should keep things easier to approach with smaller steps. @@ -131,18 +129,22 @@ should be less complex after the completion of virtualField based refactors. To keep thing sane it would be better to split the Dep in some major Parts: 1. Logical refactor of present Field API and RelationField API, to make - them simpler and consistant with _meta API calls + them simpler and return consistent results with _meta API calls. 2. Introduce new sane API for RelationFields [internal/provisional] -3. Fields internal value cache refactor for relation fields (may be) +3. Make it possible to use Reverse relation directly if necessary. -4. VirtualField Based refactor of RelationFields API +4. Take care of Fields internal value cache for relation fields. [may be] +5. VirtualField Based refactor of RelationFields API +6. ContentTypes refactor. -Key steps of to follow to improve ORM Field API internals: + + +Key steps to refactor ORM Fields API internals: ============================================================== 1. Split out Field API logically to separate ConcreteField, BaseField, RelationField etc and adjust codes based on that API. @@ -166,23 +168,25 @@ Key steps of to follow to improve ORM Field API internals: 7.
Refactor all RelationFields [OneToOne, ManyToMany...] based on ``VirtualField`` and new Field API based ForeignKey. -8. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey +8. AuxiliaryField + +9. Refactor GenericForeignKey based on ``VirtualField`` based refactored ForeignKey -9. ContentTypes/GenericRelations/GenericForeginKey works well with new Fields API +10. ContentTypes/GenericRelations/GenericForeignKey works well with new Fields API -10. Make changes to migrations framework to work properly with Reafctored Field +11. Make changes to migrations framework to work properly with Refactored Field API. -11. Migrations work well with VirtualField based refactored API +12. Migrations work well with VirtualField based refactored API -12. Make sure new class based Index API ise used properly with refactored Field +13. Make sure new class based Index API is used properly with refactored Field API. -13. Query/QuerySets/Expressions work well with new refactored API's +14. Query/QuerySets/Expressions work well with new refactored APIs -14. refactor GIS framework based on the changes in ORM +15. Refactor GIS framework based on the changes in ORM -15. ModelForms/Admin work well with posposed changes +16. ModelForms/Admin work well with proposed changes @@ -407,6 +411,57 @@ ORM or migrations certainly can't ignore ForeignKey once it becomes virtual; instead, migrations will have to hide any auto-generated auxiliary concrete fields to make migrations backwards-compatible. +A VirtualField class could look like the following: + + +class VirtualField(Field): + """ + Base class for field types with no direct database representation. + """ + def __init__(self, **kwargs): + kwargs.setdefault('serialize', False) + kwargs.setdefault('editable', False) + super().__init__(**kwargs) + + def db_type(self, connection): + """ + By default no db representation, and thus also no db_type.
+ """ + return None + + def contribute_to_class(self, cls, name): + super().contribute_to_class(cls, name) + + def get_column(self): + return None + + @cached_property + def fields(self): + return [] + + @cached_property + def concrete_fields(self): + return [f + for myfield in self.fields + for f in myfield.concrete_fields] + + def resolve_concrete_values(self, data): + if data is None: + return [None] * len(self.concrete_fields) + if len(self.concrete_fields) > 1: + if not isinstance(data, (list, tuple)): + raise ValueError( + "Can't resolve data that isn't list or tuple to values for field %s" % + self.name) + elif len(data) != len(self.concrete_fields): + raise ValueError( + "Invalid amount of values for field %s. Required %s, got %s." % + (self.name, len(self.concrete_fields), len(data))) + return data + else: + return [data] + + Changes in ``RelationField`` @@ -414,15 +469,11 @@ Changes in ``RelationField`` Relationship fields ~~~~~~~~~~~~~~~~~~~ -This turns out to be, not too surprisingly, the toughest problem. The fact -that related fields are spread across about fifteen different classes, -most of which are quite nontrivial, makes the whole bundle pretty fragile, -which means the changes have to be made carefully not to break anything. - -What we need to achieve is that the ForeignKey, ManyToManyField and -OneToOneField detect when their target field is a CompositeField in -several situations and act accordingly since this will require different -handling than regular fields that map directly to database columns. +The fact that related fields are spread across about fifteen different +classes, most of which are quite nontrivial, makes the whole bundle +pretty fragile, which means the changes have to be made carefully not +to break anything. This will require different handling than regular +fields that map directly to database columns. 
The first one to look at is ForeignKey since the other two rely on its functionality, OneToOneField being its descendant and ManyToManyField From 010cb4a6d6156b55e9f5ee8f9098c9adeed4dedc Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 02:21:40 +0600 Subject: [PATCH 74/80] VirtualField n other changes --- draft/orm-field-api-related-improvement.rst | 24 +++------------------ 1 file changed, 3 insertions(+), 21 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 84aaecd6..c8746438 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -578,29 +578,11 @@ where ``place_ptr`` is a ``OneToOneField`` and ``chef`` is a -``contenttypes`` and ``GenericForeignKey`` +``ContentTypes`` and ``GenericForeignKey`` ========================================== +Following the refactor of Fields API and introduction of true +VirtualField type, this part will also be refactored. -However, in order to support JOINs generated by ``GenericRelation``, -we need to be able to reproduce exactly the same encoding using an -SQL expression which would be used in the JOIN condition. - -Luckily, while thus encoded strings need to be possible to decode in -Python (for example, when retrieving the related object using -``GenericForeignKey`` or when the admin decodes the primary key from URL), -this isn't necessary at the database level. Using SQL we only ever need to -perform this in one direction, that is from a tuple of values into a -string. - -One possible pitfall of this implementation might be that it may not work -with any column type that isn't an integer or a text string due to a -simple fact – the string the database would cast it to will probably -differ from the one Python will use. 
However, I'm not sure there's -anything we can do about this, especially since the string representation -chosen by the database may be specific for each database server. Therefore -I'm inclined to declare ``GenericRelation`` unsupported for models with a -composite primary key containing any special columns. This should be -extremely rare anyway. QuerySet filtering From ca1a3ba061b4c5c1a248c1c08b6bdae29762ade4 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 02:25:34 +0600 Subject: [PATCH 75/80] relation field clean up --- draft/orm-field-api-related-improvement.rst | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index c8746438..057fd4fd 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -493,23 +493,13 @@ corresponding to the local one. I'm open to other suggestions on this. There will also be a way to override the default names using a new field option "enclosed_fields". This option will expect a tuple of fields each of whose corresponds to one individual field in the same order as -specified in the target CompositeField. This option will be ignored for +specified in the target Field. This option will be ignored for non-composite ForeignKeys. -The trickiest part, however, will be relation traversals in QuerySet -lookups. Currently the code in models.sql.query.Query that creates joins -only joins on single columns. To be able to span a composite relationship -the code that generates joins will have to recognize column tuples and add -a constraint for each pair of corresponding columns with the same aliases -in all conditions. For the sake of completeness, ForeignKey will also have an extra_filters method allowing to filter by a related object or its primary key. 
-With all this infrastructure set up, ManyToMany relationships using -composite fields will be easy enough. Intermediary model creation will -work thanks to automatic underlying field creation for composite fields -and traversal in both directions will be supported by the query code. Changes in ``ForeignKey`` @@ -585,6 +575,7 @@ VirtualField type, this part will also be refactored. + QuerySet filtering ~~~~~~~~~~~~~~~~~~ From f7bc4258d5c114f2d7fe23f0f353e7714ebc67b8 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 15:45:22 +0600 Subject: [PATCH 76/80] changes --- draft/orm-field-api-related-improvement.rst | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 057fd4fd..ebd13067 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -54,9 +54,9 @@ class Comment(models.Model): text = models.TextField() flags = models.OneToMany(Flag) -That way, if you had a new content type (say, a "Post"), it could also +That way, if we had a new content type (say, a "Post"), it could also participate in flagging, without having to modify the model definition -of "Flag" to add a new foreign key. Without baking in migrations, +of "Flag" to add a new foreign key. Without baking in migrations, there's obviously no way to make the underlying SQL play nice in this circumstance: one-to-many relationships with just two tables can only be expressed in SQL with a reverse foreign key relationship. However, @@ -78,7 +78,7 @@ the comment from a flag, have to call: comment = flag.comment_set.all()[0] as the ORM doesn't know for a fact that each flag could only have one -comment. But Django _could_ implement a OneToManyField in this way +comment. 
But Django can implement a OneToManyField in this way (using the underlying ManyToMany paradigm), and provide sugar such that this would all be nice and flexible, without having to do cumbersome ORM calls or explicitly define extra join tables: @@ -216,6 +216,7 @@ VirtualField's features can also be added to specific fields. 4. RelationField: ----------------- Base Field for All relation fields extended from new BaseField class. +In new class hirerarchy RelationFields will be Virtual. 5. VirtualField: ---------------- @@ -323,6 +324,17 @@ Those users who subclass Django's fields (or duck-type Django's fields) will need updating. Examples of such projects include django-rest-framework and django-taggit. +While the advised approach was: +1. Find places where rield.remote_field responds to different API than Field. +Fix these one at a time while trying to have backwards compat, even if the +API isn't public. + +2. In addition, simplifications to the APIs are welcome, as is a high level +documentation of how related fields actually work. + +3. We need to try to keep backwards compat as many projects are forced to +use the private APIs. But most of all, do small incremental changes. + Proposed API and workd flow for clean ups: ========================================== From deef8aa96751e0e68f486dae05d5f46b92b8567d Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Mon, 3 Apr 2017 16:36:02 +0600 Subject: [PATCH 77/80] changes --- draft/orm-field-api-related-improvement.rst | 22 ++++++++++----------- 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index ebd13067..54ea7c1c 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -325,6 +325,7 @@ need updating. Examples of such projects include django-rest-framework and django-taggit. While the advised approach was: + 1. 
Find places where rield.remote_field responds to different API than Field. Fix these one at a time while trying to have backwards compat, even if the API isn't public. @@ -336,6 +337,14 @@ documentation of how related fields actually work. use the private APIs. But most of all, do small incremental changes. +I would like to try the more direct approach. The reasons are, + +1. Define clear definition of relation fields class hierarchy and naming. + at present the class names for reverse relation and backreference is + quite confusing, like BackReference of any relation class is being called + + + Proposed API and workd flow for clean ups: ========================================== Relational field API @@ -502,12 +511,6 @@ default names: "fkname_targetname" where "fkname" is the name of the ForeignKey field and "targetname" is the name of the remote field name corresponding to the local one. I'm open to other suggestions on this. -There will also be a way to override the default names using a new field -option "enclosed_fields". This option will expect a tuple of fields each -of whose corresponds to one individual field in the same order as -specified in the target Field. This option will be ignored for -non-composite ForeignKeys. - For the sake of completeness, ForeignKey will also have an extra_filters method allowing to filter by a related object or its primary key. @@ -596,15 +599,10 @@ code that handles filtering are designed to describe single field lookups. -``__in`` lookups for ``VirtualField`` -======================================= - - ModelChoiceFields ~~~~~~~~~~~~~~~~~ -Again, we need a way to specify the value as a parameter passed in the -form. The same escaping solution can be used even here. 
+As the virtualField itself won't be backed by any real db field Admin/ModelForms ================ From 27bf41d18d39b45e1916ae08082feb7b2e28170c Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Sat, 8 Apr 2017 14:53:25 +0600 Subject: [PATCH 78/80] changes --- draft/orm-field-api-related-improvement.rst | 36 +++++++++++++-------- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index 54ea7c1c..d896e385 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -339,9 +339,16 @@ use the private APIs. But most of all, do small incremental changes. I would like to try the more direct approach. The reasons are, -1. Define clear definition of relation fields class hierarchy and naming. - at present the class names for reverse relation and backreference is - quite confusing, like BackReference of any relation class is being called +1. Define a clear relation field class hierarchy and naming. + At present the class names for reverse relations and back-references are + quite confusing. For example, RemoteField actually holds the information about + any Fields relation which are now + +2. I plan to introduce a OneToManyField which can be used directly + and will be the main ReverseForeignKey + +3. + @@ -349,8 +356,8 @@ Proposed API and workd flow for clean ups: ========================================== Relational field API ==================== -Currently the main use case is that we have a single place where I -can check that we don't define redundant APIs for related fields. +Currently the main use case is that we have a single place where it +can be checked that we don't define redundant APIs for related fields.
Structure of a relational field ------------------------------- @@ -358,7 +365,7 @@ Structure of a relational field A relational field consist of: - The user created field - - Possibly of a remote field, which is auto-created by the user created field + - Possibly a remote_field, which is auto-created by the user created field Both the created field and the remote field can possibly add a descriptor to the field's model. @@ -415,7 +422,9 @@ A relational field consist of: field. - If an model defines a relation to abstract model, this should just fail (check this!) +This was basically taken from an old work on Relational API cleanup, but it is not well tested. +I believe I can adjust these later. Part-2: @@ -496,25 +505,23 @@ pretty fragile, which means the changes have to be made carefully not to break anything. This will require different handling than regular fields that map directly to database columns. +For that reason the Relational API will be cleaned up to return consistent +results, and the VirtualField based refactor will take place later. + The first one to look at is ForeignKey since the other two rely on its functionality, OneToOneField being its descendant and ManyToManyField using ForeignKeys in the intermediary model. Once the ForeignKeys work, -OneToOneField should require minimal to no changes since it inherits +OneToOneField should require minimal changes since it inherits almost everything from ForeignKey. -The easiest part is that for composite related fields, the db_type will be -None since the data will be stored elsewhere. ForeignKey and OneToOneField will also be able to create the underlying fields automatically when added to the model. I'm proposing the following -default names: "fkname_targetname" where "fkname" is the name of the +default names: "fk_targetname" where "fkname" is the name of the ForeignKey field and "targetname" is the name of the remote field name corresponding to the local one. I'm open to other suggestions on this.
-For the sake of completeness, ForeignKey will also have an extra_filters -method allowing to filter by a related object or its primary key. - Changes in ``ForeignKey`` @@ -612,3 +619,6 @@ Admin/ModelForms GIS Framework: ============== + + + From f6e3b84f6b64df1cf69ae0c391b90f1c00965bd4 Mon Sep 17 00:00:00 2001 From: Asif Saifuddin Auvi Date: Thu, 13 Apr 2017 20:21:58 +0600 Subject: [PATCH 79/80] changes --- draft/orm-field-api-related-improvement.rst | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index d896e385..a893d671 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -15,8 +15,7 @@ DEP : ORM Relation Fields API Improvements using VirtualField Background: =========== -Django's ORM is a simple & powerful tool which suits most use-cases. -However, historicaly it has some design limitations and complex internal +Historically Django's ORM has some design limitations and complex internal API which makes it not only hard to maintain but also produce inconsistant behaviours. From ed33a9d0d9357d97fcec07884a57200c347ef482 Mon Sep 17 00:00:00 2001 From: Asif Saif Uddin Date: Fri, 6 Oct 2017 16:46:05 +0600 Subject: [PATCH 80/80] drop un needed texts --- draft/orm-field-api-related-improvement.rst | 74 +-------------------- 1 file changed, 1 insertion(+), 73 deletions(-) diff --git a/draft/orm-field-api-related-improvement.rst b/draft/orm-field-api-related-improvement.rst index a893d671..9c3c1074 100644 --- a/draft/orm-field-api-related-improvement.rst +++ b/draft/orm-field-api-related-improvement.rst @@ -31,79 +31,7 @@ incorporate virtualField type based refctors of RelationFields. Limitations of ORM that will be taken care of: ============================================== -One limitation is, - -Django supports many-to-one relationships -- the foreign keys live on -the "many", and point to the "one". 
So, in a simple app where you -have Comments that can get Flagged, one Comment can have many Flag's, -but each Flag refers to one and only one Comment: - -class Comment(models.Model): - text = models.TextField() - -class Flag(models.Model): - comment = models.ForeignKey(Comment) - -However, there are circumstances where it's much more convenient to -express the relationship as a one-to-many relationship. Suppose, for -example, you want to have a generic "flagging" app which other models -can use: - -class Comment(models.Model): - text = models.TextField() - flags = models.OneToMany(Flag) - -That way, if we had a new content type (say, a "Post"), it could also -participate in flagging, without having to modify the model definition -of "Flag" to add a new foreign key. Without baking in migrations, -there's obviously no way to make the underlying SQL play nice in this -circumstance: one-to-many relationships with just two tables can only -be expressed in SQL with a reverse foreign key relationship. However, -it's possible to describe OneToMany as a subset of ManyToMany, with a -uniqueness constraint on the "One" -- we rely on the join table to -handle the relationship: - -class Comment(models.Model): - text = models.TextField() - flags = models.ManyToMany(Flag, through=CommentFlag) - -class CommentFlag(models.Model): - comment = models.ForeignKey(Comment) - flag = models.ForeignKey(Flag, unique=True) - -While this works, the query interface remains cumbersome. To access -the comment from a flag, have to call: - -comment = flag.comment_set.all()[0] - -as the ORM doesn't know for a fact that each flag could only have one -comment. 
But Django can implement a OneToManyField in this way -(using the underlying ManyToMany paradigm), and provide sugar such -that this would all be nice and flexible, without having to do cumbersome -ORM calls or explicitly define extra join tables: - -class Comment(models.Model): - text = models.TextField() - flags = models.OneToMany(Flag) - -class Post(models.Model): - body = models.TextField() - flags = models.OneToMany(Flag) - -# in a separate reusable app... -class Flag(models.Model) - reason = models.TextField() - resolved = models.BooleanField() - -# in a view... -comment = flag.comment -post = flag.post - -It's obviously less database efficient than simple 2-table reverse -ForeignKey relationships, as you have to do an extra join on the third -table; but you gain semantic clarity and a nice way to use it in -reusable apps, so in many circumstances it's worth it. And it's a -fair shake clearer than the existing generic foreign key solutions. + Aim of the Proposal: