Schema

If you have built castles in the air, your work need not be lost, that is where they should be. Now put the foundations under them.
— Henry David Thoreau

So far we have been discussing the things that influence how midPoint interacts with the outside. Resource definitions, outbound and inbound mappings, even the roles - the primary purpose of those things is to control how data get into midPoint and out of midPoint. Now it is time to discuss how midPoint works internally.

Early identity management systems were little more than smart data transformers. They took data from data sources, modified them in some way, applied an access control model such as RBAC, and then pushed the data out. There was very little crucial information stored inside the IDM system itself. However, that was a long time ago. The world is a different place now. The focus of the identity management field has shifted towards identity governance and high-level policies. It is not enough to just transform the data. Policies have to be applied. Regulatory compliance has to be evaluated. There are processes to follow, paperwork to do, evidence to collect, reports to compile, notifications, reviews and daily status reports. It is perhaps no big surprise that there is a good deal of management in Identity Management after all.

Many of the following chapters deal with these management concepts. However, we have to start from the basics, the very foundation of midPoint: schema. MidPoint is built for much more than mere data transformation. MidPoint is designed to unify the data. Schema plays a crucial part in that ambition.

MidPoint Schema

MidPoint is designed as a schema-aware system, from the bottom to the top. MidPoint has a definition for every bit of data that passes through it. We know whether a particular piece of data is string, integer or timestamp. We know whether it is single-valued or multi-valued. We know whether it is optional or mandatory. We know whether this is a sensitive piece of data that requires extra protection. We know whether it is part of technical meta-data that we usually do not want to show by default. We usually also know what label we should use when we are presenting the data, and how that label translates to other languages. We know quite a lot about the data that we work with. All the objects that midPoint works with are completely defined by the schema. There is a schema for user, role, org, resource, system configuration and everything else.

Such awareness of the schema brings significant advantages to midPoint. The most obvious advantage is in data presentation. We know that we need to render a calendar selector because that particular data property is a timestamp. We know that we need to render a text field with a plus button to add values because that particular property is a multi-valued string. We know that some fields should be disabled because those properties are read-only. This behavior is not hard-coded in the user interface code. The vast majority of the midPoint user interface is rendered by interpreting the midPoint schema.

This approach is absolutely crucial for any serious data management system to operate efficiently, doubly so for identity management. One of the reasons is that the identity management system works with data that are retrieved from other systems (resources). It is not realistically possible to hard-code midPoint user interface for all the various attributes that all the possible resources could have. A different strategy is needed here, a strategy that is much more dynamic.

When midPoint connects to a new resource for the first time, it attempts to retrieve the resource schema. The resource schema specifies what object classes the resource supports, which attributes the object classes have, what types those attributes are, and other details about the data model of the resource. MidPoint transforms this resource schema to its own native format, and stores that in the resource definition. This means that midPoint has the schema available anytime it is needed for dynamic interpretation. That schema is used to display resource data in the most natural and user-friendly way. It is also used by automatic data type conversions, which makes configuration of mappings easier.

Data Unification

MidPoint schema is not just a nice way to describe user, role or organizational structure. It has a much deeper meaning. The primary purpose of a schema is integration, data translation and unification. A clever reader would certainly remember that we have already talked about star topology or hub-and-spoke integration pattern. MidPoint is like a hub of the wheel and all the resources connect to midPoint as spokes. MidPoint is actively discouraging direct resource-to-resource communication. Everything in midPoint is built for resource-to-midPoint and midPoint-to-resource communication. MidPoint is always the center – for a very good reason. All resource data need to be translated to and from midPoint "data language". Thus, midPoint creates a common language that everybody can understand. This is the very purpose of midPoint schema. The schema of user, role, org and service is designed to contain properties that are often used in identity-related integration scenarios. Therefore, an engineer who is designing a mapping is quite likely to find a suitable property in midPoint schema that is prepared to be used.

MidPoint schema forms a lingua franca, a common language that can be translated to various data dialects used by the resources. Even more than that, it also provides a basic framework that can be reused for many midPoint deployments. Therefore, an engineer starting a new deployment does not need to start on a completely green field. The basic schema will always be there to provide a starting point.

Tip
Ever wondered why midPoint is called midPoint? Clever reader would have figured that out already.

Basic User Schema

When it comes to the identity management field, there is one concept that is at the center of everything: the concept of user. User is undoubtedly the most important object in the entire midPoint schema. Therefore, it is worth taking a closer look at what this object looks like. This is going to be a really educational lesson, as it will explain several fundamental principles of midPoint.

User is represented by a schema data type identified as UserType. Adding the Type suffix to data types is a common convention in midPoint: there are UserType, RoleType, OrgType, ResourceType and so on. This convention is partially historic, partially given by XML Schema conventions, partially a convenience to developers. Regardless of the origins, this convention is used for all the data types in midPoint schemas. You will get used to it eventually.

UserType is what we call an object definition in midPoint parlance. This means that UserType data structure specifies a complete midPoint object with all the things that any self-respecting object needs. There is object identifier (OID), name that can be presented in different forms and languages, free-form description and so on. All midPoint objects have those things.

The UserType data structure has many additional properties, containers and references. Property is a primitive data item such as string, integer or a timestamp. Container is a complex data structure that contains a bunch of properties or other containers. Reference is a pointer to another midPoint object.
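The three kinds of items can be told apart in a small fragment. This is only an illustrative sketch in the same simplified notation as the other examples in this chapter; the OID is made up for the purpose of the example.

```xml
<user>
    <!-- Properties: primitive values such as strings -->
    <name>alice</name>
    <fullName>Alice Anderson</fullName>

    <!-- Container: a structure that groups properties (or other containers) -->
    <activation>
        <administrativeStatus>enabled</administrativeStatus>
    </activation>

    <!-- Reference: a pointer to another midPoint object, identified by OID.
         The OID below is hypothetical. -->
    <linkRef oid="00000000-0000-0000-0000-0000000000a1" type="ShadowType"/>
</user>
```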

Note
Properties are primitive. However, there may be properties that have internal structure, even quite a complex internal structure. This is sometimes given by historic reasons. However, there are also properties that need to be complex, e.g. properties that require localizable presentation, or properties that provide protection of data. Indeed, this may be somewhat confusing. Even a clever reader looks puzzled now. However, this distinction is not a big issue, for now.

Definition of UserType is summarized in the following table:

[cols="1,1,4",options="header"]
|===
|Name |Type |Description

|name
|property
|Human-readable, mutable name of the object. It is typically a username or some kind of application-level identifier. The value must be unique among all the users. +
Example: jrandom

|description
|property
|Free-form textual description of the object. This is meant to be displayed in the user interface. +
Example: Random account for testing.

|extension
|container
|A container for custom schema extensions. We will discuss that later.

|metadata
|container
|Meta-data about object creation, modification, etc.

|lifecycleState
|property
|Lifecycle state of the object. This property defines whether the object represents a draft, proposed definition, whether it is active, deprecated, and so on. +
Example: active

|assignment
|container
|Set of object's assignments. Assignments define the privileges, policies and "features" that this object should have, that this object is entitled to. A typical assignment will point to a role, or define a construction of an account. +
Assignments represent what the object should have. The assignments represent a policy, a desired state of things.

|linkRef
|reference
|Set of shadows (projections) linked to this focal object, e.g. a set of accounts linked to a user. This is the set of shadows that belongs to the focal object in the sense that these shadows represent the focal object on the resource, e.g. the set of accounts that represent the same midPoint user (the same physical person, they are "analogous"). +
Links define what the object has. The links reflect the real state of things.

|activation
|container
|Type that defines activation properties. Determines whether something is active (and working) or inactive (e.g. disabled).

|jpegPhoto
|property
|Photo of a user (in a binary form).

|costCenter
|property
|The name, identifier or code of the cost center to which the user belongs.

|locality
|property
|Primary locality of the user, the place where the user usually works, the country, city or building that the user belongs to. The specific meaning and form of this property is deployment-specific.

|preferredLanguage
|property
|Indicates user's preferred language, usually for the purpose of localizing user interfaces. The format is IETF language tag defined in BCP 47, where underscore is used as a subtag separator. This is usually an ISO 639-1 two-letter language code optionally followed by ISO 3166-1 two-letter country code separated by underscore. +
Example: en_US

|locale
|property
|Defines user's preference in displaying currency, dates and other items related to location and culture. It has the same format as preferredLanguage. +
Example: en_US

|timezone
|property
|User's preferred timezone. It is specified in the "tz database" (a.k.a. "Olson") format. +
Example: Europe/Bratislava

|emailAddress
|property
|E-Mail address of the user, org. unit, etc. This is the address supposed to be used for communication with the user. +
Example: [email protected]

|telephoneNumber
|property
|Primary telephone number of the user. +
Example: +421 123 456 789

|fullName
|property
|Full name of the user with all the decorations, middle name initials, honorific title and any other structure that is usual in the cultural environment that the system operates in. This element is intended to be displayed to a common user of the system. +
Example: James W. Random, PhD.

|givenName
|property
|Given name of the user. It is usually the first name of the user, but the order of names may differ in various cultural environments. This element will always contain the name that was given to the user at birth or was chosen by the user. +
Example: James

|familyName
|property
|Family name of the user. It is usually the last name of the user, but the order of names may differ in various cultural environments. This element will always contain the name that was inherited from the family or was assigned to a user by some other means. +
Example: Random

|additionalName
|property
|Middle name, patronymic, matronymic or any other name of a person. It is usually the middle component of the name, however that may be culture-dependent. +
Example: Walker

|nickName
|property
|Familiar or otherwise informal way to address a person. +
Example: Randy

|honorificPrefix
|property
|Honorific titles that go before the name. +
Example: Sir

|honorificSuffix
|property
|Honorific titles that go after the name. +
Example: PhD.

|title
|property
|User's title defining a work position or a primary role in the organization. +
Example: CEO

|personalNumber
|property
|Unique, business-oriented identifier of the employee, e.g. employee number, student identifier, citizen identifier, ID card number, social security number, etc. Typically used as a correlation identifier and for auditing purposes. Should be immutable, but the specific properties and usage are deployment-specific.

|organization
|property
|Name or (preferably) immutable identifier of the organization that the user belongs to. The format is deployment-specific. This property together with organizationalUnit may be used to provide easy-to-use data about organizational membership of the user.

|organizationalUnit
|property
|Name or (preferably) immutable identifier of the organizational unit that the user belongs to. The format is deployment-specific. This property together with organization may be used to provide easy-to-use data about organizational membership of the user.

|credentials
|container
|The set of user's credentials (such as passwords).
|===

This is a basic outline of the schema for UserType. This description is slightly simplified. Not all the items that are defined for UserType are shown in the table above. Deprecated items are not shown at all. Only some operational properties are shown. Some items are simplified or entirely omitted for clarity.

The following example illustrates the use of the midPoint UserType schema:

<user>
    <name>alice</name>
    <activation>
        <administrativeStatus>enabled</administrativeStatus>
    </activation>
    <preferredLanguage>en_US</preferredLanguage>
    <assignment>
        <targetRef oid="aaa6cde4-0471-11e9-9b50-c743da469067" type="RoleType"/>
    </assignment>
    <assignment>
        <targetRef oid="4e73ed62-aef9-11e9-a7a8-57334ef1f991" type="RoleType"/>
    </assignment>
    <emailAddress>[email protected]</emailAddress>
    <fullName>Alice Anderson, PhD.</fullName>
    <givenName>Alice</givenName>
    <familyName>Anderson</familyName>
    <honorificSuffix>PhD.</honorificSuffix>
    <title>Business Analyst</title>
    <personalNumber>001</personalNumber>
    <organizationalUnit>10010</organizationalUnit>
</user>

Operational, Experimental and Deprecated Items

Most of the items in midPoint schema are quite ordinary and they behave as expected, such as the fullName property. The property can be set and changed by using the midPoint user interface. However, there are also some extraordinary items. Those are automatically determined and controlled by the midPoint core engine. Those items are essential for correct operation of midPoint. Therefore, they are called operational items. Operational items are usually not directly displayed in the user interface. They are either completely hidden, displayed indirectly or displayed only when the user chooses to display them.

MidPoint schema has grown and evolved over time, and it is still evolving. Therefore, it is quite expected that the schema will slightly change over time. However, we do not want to affect midPoint deployments by incompatible schema changes in every midPoint release. Therefore, items are usually not removed from midPoint schema without a warning. An item that we plan to remove is marked as deprecated first. At that point, such item is still working as it was working before. However, it is not displayed in the user interface, to discourage use of that item. Deprecated items are removed in one of the subsequent midPoint releases. This gives enough time for midPoint users to adapt to schema changes.

There is also another kind of schema evolution. Development of most midPoint features is quick and straightforward. Then there are features that are quite complex or features that involve some degree of exploration. Those features cannot be implemented in a single midPoint release. There are also features that are provided to the midPoint community as a "preview", to gather feedback for further development. All such features are marked as experimental. Those features are not officially supported, but you are free to use them at your own risk. Most new features require extensions of midPoint schema. This is also true for those experimental features. However, when going experimental, there is a fair chance that something will change in the future. Therefore, we are explicitly marking parts of the schema as experimental. This is a warning that those parts are likely to change. We are not promising any kind of compatibility for experimental parts of midPoint schema. They may change any time, they may even completely disappear. There will be no deprecation or any other warning. Simply speaking: if you are dealing with experimental features, you are completely on your own. Do not come crying when those things stop working. You have been warned.

Lifecycle and Activation

Time is cruel, everything that we do is in some way temporary. Except perhaps for stupidity, which seems to be utterly endless. Sadly, all other things have a beginning and an end. Employees have hiring date, contracts have end dates, users can be deactivated, roles may get replaced and so on. We use the terms lifecycle and activation to encompass all those things that deal with the questions of digital life and death of the objects.

Lifecycle State

Users, as well as other identity-related objects, have their cycle of life. They are enrolled into the system, such as when a record of a new hire is entered into the HR system. Then the objects are active, such as the user accounts of an active employee. The objects may be inactivated for a while, such as an employee on maternity leave or sabbatical. Employees do not work forever: they may resign, be laid off, or retire. Even then, a record of a former employee may be kept for some time.

While the details of the lifecycle may be subtly different for various organizations and types of objects, the basic outline is almost always the same. For that reason the basic lifecycle states are pre-configured in midPoint, as illustrated in the following diagram.

Lifecycle states

Objects have their lifecycle state specified in the lifecycleState property. If no explicit value is specified, the default value is active. Pre-defined lifecycle states are described in the following table.

[cols="1,1,3,3",options="header"]
|===
|State |Is active |Description |Examples

|draft
|No
|Definition of the new object in progress. The definition may change at any moment, it is not ready yet.
|Role definition in preparation (not finished yet).

|proposed
|No
|Definition of a new object is ready for use, but there is still a review process to be applied (e.g. approval). The definition should not change in this state.
|Finished new role definition in approval process. +
Self-registered user, not yet validated.

|active
|Yes
|Active and working definition. Ready to be used without any unusual limitations.
|Active employee. +
Role in production use.

|suspended
|No
|Suspended definition, temporarily disabled. It is expected that the object will return to active state eventually.
|Employee on temporary leave (maternity leave, sabbatical). +
Resource temporarily disabled for maintenance.

|deprecated
|Yes
|Active definition which is being phased out. The definition is still fully operational, but it should not be used for new assignments. E.g. it should not be requested, it should not be approved, etc.
|Deprecated role: still working, but not intended to be assigned any more. +
Legacy resource: we still want to read the data, but we do not want to create new accounts.

|archived
|No
|Inactive historical definition. It is no longer used. It is maintained only for historical, auditing and sentimental reasons. E.g. some systems require that the account exists to maintain referential consistency of historical data, audit records, etc. It may also be used to "block" the user or account identifier to avoid their reuse.
|Retired employee, keeping minimal record for accounting reasons and to avoid identifier recycling. +
Phased-out role definition, kept for historical reasons.

|failed
|No
|Unexpected error has occurred during object lifecycle. Result of that event is that the object is rendered inactive. The situation cannot be automatically remedied. Manual action is needed.
|Role definition rejected during approvals, without obvious continuation of the process. +
Role definition identified to be in violation of the policy, immediately taken out of use. +
Resource with unexpected critical errors, requiring attention of administrators.
|===

Lifecycle state of an object determines whether it is considered active or inactive, among other things. This is perhaps the most important effect of lifecycle state on the system. User in suspended state is inactive, accounts are disabled, user cannot log in. Role in draft state is also inactive, the definition is not complete yet, it is not ready for use. Objects in active state operate normally.

When it comes to users, lifecycle state is meant to be controlled automatically, if possible. The usual method is to synchronize user lifecycle states from the data source, such as the HR system. A candidate user record is meant to be proposed, then changed to active when hired, temporarily set to suspended during maternity leave, and finally end up in archived state until the object is deleted.

Users are not the only objects that are affected by lifecycle state. Many object types in midPoint have lifecycle states. Interpretation of the states is still almost the same. However, unlike users, lifecycle states of other objects are usually controlled manually. E.g. a business role begins its life in draft state. The role manager builds up the role definition while in draft state, looking for an appropriate combination of application roles to include in the definition, consulting with colleagues. This may take some time. In the meantime, the role is in draft state, inactive, without any risk of unintentional use. When the role definition is done, it is switched to proposed state, reviewed, approved and finally set as active. When in active state, the role is fully operational, assigned to users, maybe even modified a bit to adapt to changed circumstances. Sooner or later the role becomes obsolete. However, we cannot simply delete the role, as it is still assigned to users. We have to be careful, we do not want to disrupt the business. First, we switch the role to deprecated state. The role is still active, everything works fine, just the role cannot be requested and it should not be assigned to any users. Then we can take our time to clean up all existing role assignments, replacing the deprecated role with newer equivalents. Finally, when all assignments are gone, we can switch the role to archived state.
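The beginning of the role story above might look like the following fragment. It is a minimal sketch in the simplified notation used in this chapter; the role name is invented for illustration.

```xml
<role>
    <!-- Hypothetical business role, still being designed -->
    <name>Sales Manager</name>
    <description>Business role under preparation.</description>

    <!-- draft: inactive, safe from unintentional use.
         Later values along the path described above would be
         proposed, active, deprecated and finally archived. -->
    <lifecycleState>draft</lifecycleState>
</role>
```

Moving the role along its lifecycle then amounts to changing the single lifecycleState value, while the rest of the definition stays in place.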

Lifecycle state can be applied to many configuration elements in addition to objects. When lifecycle state is applied to parts of configuration, it controls whether the respective configuration is applied. E.g. setting a mapping to draft lifecycle state effectively disables the mapping. This approach can be used to temporarily deactivate parts of configuration.
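As a rough illustration, the fragment below sketches an outbound mapping that is switched off through its lifecycle state. It assumes a midPoint version in which mappings accept a lifecycleState element, and the resource attribute name ri:mail is hypothetical.

```xml
<attribute>
    <!-- ri:mail is a hypothetical resource attribute -->
    <ref>ri:mail</ref>
    <outbound>
        <!-- draft state: per the text above, the mapping is not applied -->
        <lifecycleState>draft</lifecycleState>
        <source>
            <path>emailAddress</path>
        </source>
    </outbound>
</attribute>
```

Removing the lifecycleState element, or setting it to active, would bring the mapping back into use without touching the rest of its definition.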

Lifecycle state provides an elegant, systematic, unified and controlled mechanism to control how active an object is. It guides objects from their digital cradle to their eventual binary death. As such, it is one of the essential mechanisms of identity management.

Activation

Identity management is all about the rules, policies and principles applied at scale. However, the world around us is not always completely systematic and elegant. There are always exceptions, special cases, data errors and other circumstances that do not entirely fit into our elegant identity management universe. For that reason, midPoint has activation mechanism, which provides ability for finer control as compared to simple lifecycle state. Activation also provides ability for manual overrides by administrator in case of need.

The activation in itself is a multi-dimensional and somewhat complex data structure. It is composed of several properties that may change in a partly independent and partly inter-dependent way. The following list provides a quick summary of activation properties:

  • Administrative status defines administrative state of the object, usually manually set by system administrator.

  • Validity properties specify when the object should be active. There is activation date and deactivation date.

  • Effective status is a computed operational property that shows the current effective status of the user. It is computed from lifecycle state and other activation properties.

  • Lockout status is used for automatic temporary inactivation of user, e.g. in case of numerous failed authentication attempts.

  • Additional operational properties provide (meta) data about the past changes of administrative status.

The best way to explain how activation works is to describe the meaning and behavior of individual properties.

Administrative status defines the "administrative state" of the object (user), i.e. the explicit decision of the administrator. If administrative status is set, this property overrides any other constraints in the activation type (but not the lifecycle state). E.g. if this is set to enabled and the user is not yet valid (according to validity below), the user should be considered active. If set to disabled the user is considered inactive regardless of other settings. Therefore, this property does not necessarily define an actual state of the object. It is a kind of "manual override". In fact, the most common setting for this property is to leave it unset and let other properties determine the state. If this property is not present then the other constraints in the activation type should be considered (namely validity properties, see below).

[cols="1,4",options="header"]
|===
|Administrative Status Value |Description

|no value
|No explicit override. Other activation properties determine the resulting status.

|enabled
|The entity is active. It is enabled and fully operational (if lifecycle state permits).

|disabled
|The entity is inactive. It has been disabled by an administrative action. +
This indicates temporary inactivation with an intent to enable the entity later. It is usually used to temporarily disable account for security reasons.
|===

If the administrative status is not present, and there are no other constraints in the activation type, or if there is no activation type at all, then the object is assumed to be "enabled", i.e. that the user is active - provided that the lifecycle state of the object allows it.
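The "manual override" nature of administrative status can be illustrated with a small sketch. The user and the date are invented; assume the validFrom date is still in the future at the time of evaluation.

```xml
<user>
    <name>jrandom</name>
    <activation>
        <!-- Explicit decision of the administrator -->
        <administrativeStatus>enabled</administrativeStatus>

        <!-- Assumed to be in the future: on its own, this would make
             the user not yet valid. Because administrative status is
             set, the user is considered active anyway (provided the
             lifecycle state allows it). -->
        <validFrom>2099-01-01T00:00:00Z</validFrom>
    </activation>
</user>
```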

Note
The archived state of administrative status should not be used. Lifecycle state archived should be used instead. This administrative status value is one of the leftovers from midPoint history, from the dark ages when lifecycle state did not exist yet.

Validity refers to times when the object is considered legal or otherwise usable. In midPoint, the validity is currently defined by two dates: the date from which the object is valid (validFrom) and the date to which an object is valid (validTo). When talking about users, these dates usually represent the date when the contract with the user started (hiring date) and the date when the contract ends. The user is considered valid (active) between these two dates. The user is considered inactive before the validFrom date or after the validTo date.

It is perfectly acceptable to set just one of the dates or no date at all. If any date is unset then it is assumed to extend to infinity. E.g. if validFrom date is not set, the user is considered active from the beginning of the universe to the moment specified by the validTo date.
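A fragment with only one of the dates set could look like this. It is a sketch with an invented user and date, not taken from any real deployment.

```xml
<user>
    <name>jrandom</name>
    <activation>
        <!-- No validFrom: validity is assumed to extend back
             to the beginning of the universe -->
        <!-- Hypothetical contract end date -->
        <validTo>2025-12-31T23:59:59Z</validTo>
    </activation>
</user>
```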

The validity is overridden by the administrative status. Therefore, if administrative status is set to any non-empty value then the validity dates are not considered at all.

Lockout status defines the state of user or account lock-out. Lock-out means that the account was temporarily disabled due to failed login attempts or a similar abuse attempt. This mechanism is usually used to avoid brute-force or dictionary password attacks and the lock will usually expire by itself in a matter of minutes.

This value is usually set by the resource or by midPoint internal authentication code. This value is mostly used to read the lockout status of a user or an account. This value is semi-writable. If the object is locked then it can be used to set it to the unlocked state. However, it does not work the other way around. It cannot be used to lock the account. Locking is always done by the authentication code.

[cols="1,4",options="header"]
|===
|Lockout Status Value |Description

|no value
|No information (generally means unlocked user or account)

|normal
|Unlocked and operational user or account.

|locked
|The user or account has been locked. Log-in to the account is temporarily disabled.
|===

Please note that even if a user or account is in the normal (unlocked) state, it can still be disabled by lifecycle, administrative status or validity, which will make it effectively inactive.

There is also an informational property lockoutExpirationTimestamp that provides information about the expiration of the lock. However, not all resources may be able to provide such information.
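For illustration, the activation of a locked account might read as follows. This is only a sketch: the timestamp is made up, and the expiration element may be missing if the resource cannot provide it.

```xml
<activation>
    <!-- Set by the resource or by the authentication code,
         not by the administrator -->
    <lockoutStatus>locked</lockoutStatus>

    <!-- Informational only; hypothetical value -->
    <lockoutExpirationTimestamp>2025-03-01T10:15:00Z</lockoutExpirationTimestamp>
</activation>
```

Unlocking such an account would correspond to changing lockoutStatus to normal, which is the only direction in which this property is writable according to the description above.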

Lifecycle vs Activation

Lifecycle state and activation work together, although the interaction between them might look slightly mysterious. Object lifecycle specifies phases of object’s life, separated by important life-changing events. Therefore, lifecycle state is the most important aspect when considering whether object is active or inactive. When lifecycle state specifies that object is inactive, then the decision is final. Such object is inactive, regardless of any other activation setting.

This makes perfect sense. E.g. when an object is in draft state, it is just being prepared for use. Such object may have validity dates or administrative status that would normally make it active. However, we do not want draft objects to be active yet. Such object may need a review and approval to transition to active lifecycle state. Only then it will really become active.

If lifecycle state indicates that the object is active, then the activation part is considered. Administrative status, validity dates and lockout status are evaluated, which determines the state of the object.
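The precedence of lifecycle state over activation can be seen in a fragment like the following. It is an illustrative sketch with an invented user.

```xml
<user>
    <name>newhire</name>

    <!-- Lifecycle state is evaluated first. draft is an inactive
         state, so the decision is final. -->
    <lifecycleState>draft</lifecycleState>

    <activation>
        <!-- Would normally make the user active, but it is not
             considered while the lifecycle state is inactive -->
        <administrativeStatus>enabled</administrativeStatus>
    </activation>
</user>
```

Once the lifecycle state is changed to active, the activation settings start to matter and the user becomes active.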

This may look complicated. However, it matches the needs of identity management reality quite well. As a rule of thumb, it is usually lifecycle state which is synchronized from the data source. E.g. the HR system computes whether an employee is in the hiring process, whether the employee is active or retired. MidPoint takes that HR status and translates it to the pre-defined lifecycle state values. However, the HR system may not be able to provide such an aggregate status. In that case midPoint inbound mappings should be used in a creative way to supplement that functionality. Unfortunately, the details are always deployment-specific, as the specific solution depends on the data that the HR system can provide. E.g. it is quite a common practice to map validity dates from the HR system, and let midPoint do the validity calculations. The specific mapping of lifecycle state and validity dates is always somewhat tricky in practice. It heavily depends on specific characteristics and capabilities of the data source (HR system). However, it is strongly recommended to control lifecycle state by using an inbound mapping from the data source (HR system), and not to control administrative status this way. Administrative status should remain a "last resort" mechanism for when a user needs to be quickly disabled by manual action, e.g. in case of a security incident.
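The recommendation above could be sketched as a fragment of inbound mappings in a resource definition. Everything specific here is an assumption made for illustration: the HR attribute names (ri:status, ri:contractStart, ri:contractEnd) and the HR status codes are invented, and a real deployment would need to decide how to handle unknown codes.

```xml
<!-- Hypothetical HR attributes and status codes, for illustration only -->
<attribute>
    <ref>ri:status</ref>
    <inbound>
        <expression>
            <script>
                <code>
                    // Translate assumed HR codes to pre-defined lifecycle states
                    switch (input) {
                        case 'CANDIDATE': return 'proposed'
                        case 'EMPLOYED':  return 'active'
                        case 'LEAVE':     return 'suspended'
                        case 'FORMER':    return 'archived'
                        default:          return null
                    }
                </code>
            </script>
        </expression>
        <target>
            <path>lifecycleState</path>
        </target>
    </inbound>
</attribute>
<attribute>
    <ref>ri:contractStart</ref>
    <inbound>
        <target>
            <path>activation/validFrom</path>
        </target>
    </inbound>
</attribute>
<attribute>
    <ref>ri:contractEnd</ref>
    <inbound>
        <target>
            <path>activation/validTo</path>
        </target>
    </inbound>
</attribute>
```

Note that administrative status is deliberately not mapped, leaving it free for manual intervention.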

The concepts of lifecycle and activation are not limited to users. Many midPoint objects have lifecycle state and activation. Roles can expire, organizational units can be disabled and so on. Lifecycle and activation are concepts that have a very broad application in midPoint. Even assignments have activation, which is a crucial element in some configurations (e.g. multi-affiliation). Assignments are often used to model employment contracts, student affiliations, service contracts and similar concepts that have time boundaries. This is usually achieved by a clever use of assignment activation.

Activation Operational Properties

Lifecycle state and activation are somewhat complex and spread out over several dimensions. Therefore, it may not be entirely obvious which objects are active and which are not. For that reason midPoint provides an operational property effectiveStatus which shows the computed "effective state" of the object. Simply speaking, it is a read-only property which tells whether the object should be considered active or inactive. The effective status is the result of combining lifecycle state and several activation settings (administrative status, validity dates, etc.).

The effective status holds the result of a computation, therefore it is an operational property that is recomputed every time the status changes. The effective status should not be set directly. The effective status can be changed only indirectly by changing other activation properties.

| Effective status value | Description |
|---|---|
| no value | Not yet computed. This should not happen under normal circumstances. |
| enabled | The entity is active. |
| disabled | The entity is inactive (temporary inactivation). |
| archived | The entity is inactive (permanent inactivation). |

The effective status is the property that is used by the majority of midPoint code when determining whether a particular object is active or inactive. This property should always have a value in a normal case. If this property is not present, then the computation has not taken place yet.

Similarly to effective status, there is yet another operational property: validityStatus. This property reflects the state of validity constraints with respect to the current time. The values are before, in and after, meaning that the current time is before the start of the validity interval, inside the validity interval, or after the end of the validity interval, respectively.

There are also other operational properties in the activation data structure that provide operational data about user activation:

| Name | Type | Description |
|---|---|---|
| disableReason | URI | URI that identifies a reason for the disable. This may be an indication that the identity was disabled explicitly, that the disabled status was computed, or it may indicate another source of the disable event. |
| disableTimestamp | dateTime | Timestamp of last modification of the activation status to the disabled state. |
| enableTimestamp | dateTime | Timestamp of last modification of the activation status to the enabled state. |
| archiveTimestamp | dateTime | Timestamp of last modification of the activation status to the archived state. |
| validityChangeTimestamp | dateTime | Timestamp of last modification of the effective validity state, i.e. last time the validity state was recomputed with a result that was different from the previous recomputation. It is used to avoid repeated validity change deltas. |

Those properties are operational, therefore from the user point of view they are read-only. The values are automatically computed by midPoint and stored in the database.

Activation and Lifecycle Examples

Let’s see how that works using a few examples. The simplest example is perhaps not even worth mentioning. A user without any lifecycle state or activation data structure is considered to be active (enabled). When such a user is stored in midPoint repository, midPoint will automatically compute effectiveStatus:

<user>
    <name>alice</name>
    ...
    <activation>
        <effectiveStatus>enabled</effectiveStatus>
    </activation>
</user>

The user can be temporarily inactivated by setting lifecycleState to suspended. Property effectiveStatus will indicate that user is inactive now.

<user>
    <name>alice</name>
    <lifecycleState>suspended</lifecycleState>
    ...
    <activation>
        <effectiveStatus>disabled</effectiveStatus>
    </activation>
</user>

Changing lifecycleState back to active re-activates the user:

<user>
    <name>alice</name>
    <lifecycleState>active</lifecycleState>
    ...
    <activation>
        <effectiveStatus>enabled</effectiveStatus>
    </activation>
</user>

Lifecycle state is usually maintained automatically, using an inbound mapping from the HR system. If an administrator tried to change the lifecycle state manually, synchronization might reset the state back to its previous value. However, an administrator can manually disable a user by using the administrativeStatus property:

<user>
    <name>alice</name>
    <lifecycleState>active</lifecycleState>
    ...
    <activation>
        <administrativeStatus>disabled</administrativeStatus>
    </activation>
</user>

When such user object is stored after the modification, midPoint computes the value of effective status:

<user>
    <name>alice</name>
    <lifecycleState>active</lifecycleState>
    ...
    <activation>
        <administrativeStatus>disabled</administrativeStatus>
        <effectiveStatus>disabled</effectiveStatus>
    </activation>
</user>

Even though the lifecycle state of the user is active, the user is manually inactivated using administrativeStatus.
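The reverse combination, described at the beginning of this section, can be sketched in the same way. The example below is illustrative only: the user name carol is made up, and the disabled value assumes that midPoint reports the inactive draft state as a temporary inactivation. The point is that the lifecycle state has the final word, even though the administrative status says enabled:

```xml
<user>
    <name>carol</name>
    <!-- draft objects are not active yet, regardless of activation settings -->
    <lifecycleState>draft</lifecycleState>
    ...
    <activation>
        <administrativeStatus>enabled</administrativeStatus>
        <effectiveStatus>disabled</effectiveStatus>
    </activation>
</user>
```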

The use of administrative status is usually quite harsh, and lifecycle state may be quite simplistic. MidPoint deployments are often using validity constraints in addition to lifecycle state. For example, an employee that has employment contract for a year would look like this:

<user>
    <name>bob</name>
    ...
    <activation>
        <validFrom>2019-01-01T00:00:00Z</validFrom>
        <validTo>2019-12-31T23:59:59Z</validTo>
        <validityStatus>in</validityStatus>
        <effectiveStatus>enabled</effectiveStatus>
    </activation>
</user>

Given that this chapter was written in 2019, such user will be active. It will automatically switch to inactive state after the last day of 2019. However, if there is ever a need to explicitly disable the user, administrative status can still be used:

<user>
    <name>bob</name>
    ...
    <activation>
        <administrativeStatus>disabled</administrativeStatus>
        <validFrom>2019-01-01T00:00:00Z</validFrom>
        <validTo>2019-12-31T23:59:59Z</validTo>
        <validityStatus>in</validityStatus>
        <effectiveStatus>disabled</effectiveStatus>
    </activation>
</user>

In this case the user is still in its validity interval. Hence the in value of validityStatus. However, the administrative status is explicitly set to disabled. Therefore, the resulting effective status is also disabled.
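For comparison, the following sketch shows how the same user might look once the validity interval has passed, e.g. when viewed in 2020, and without the manual disable. It assumes that midPoint has recomputed the object after the end of the interval. The disabled effective status follows from the "switch to inactive state" behavior described above; other operational properties that midPoint would also maintain are omitted:

```xml
<user>
    <name>bob</name>
    ...
    <activation>
        <validFrom>2019-01-01T00:00:00Z</validFrom>
        <validTo>2019-12-31T23:59:59Z</validTo>
        <!-- current time is past validTo -->
        <validityStatus>after</validityStatus>
        <effectiveStatus>disabled</effectiveStatus>
    </activation>
</user>
```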

Activation operational properties are very useful, not just for troubleshooting. These properties are often used in reports, dashboards and search filters. E.g. the best way to search for all active users is to search for activation/effectiveStatus = "enabled". Similarly, the filter activation/effectiveStatus = "disabled" looks for all inactive users, regardless of whether they are inactive due to lifecycle state, administrative status or validity constraints.
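In the XML form of midPoint queries, the first of those filters could look roughly like the sketch below. Namespace declarations are omitted in the same way as in the other examples in this chapter, and the exact form should be checked against the query documentation of the midPoint version in use:

```xml
<query>
    <filter>
        <equal>
            <!-- the computed operational property, not administrativeStatus -->
            <path>activation/effectiveStatus</path>
            <value>enabled</value>
        </equal>
    </filter>
</query>
```

Searching by effectiveStatus rather than by the individual activation settings means that the query does not need to repeat the lifecycle and validity logic.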

Schema Definition

So far we have talked mostly about the user schema (UserType data type). However, midPoint schema is much broader than that. There are many types of objects, and there are thousands of data types overall. It would be almost impossible to manage such a big schema if it were hard-coded in midPoint code. Therefore, the schema is defined in special definition files that are used by midPoint in several ways. The schema definition is used by the user interface to automatically render form fields. It is also used by midPoint expression engine to automatically convert data types. It is even used by midPoint build process (compilation), to make sure that midPoint code is using the schema properly. MidPoint is a completely schema-aware system, from the bottom to the top.

Schema obviously plays a crucial role in everything that midPoint does. Therefore, it may be interesting to have a closer look at the schema definition. This can be particularly useful for engineers who deploy midPoint professionally, and who often need to extend and customize the schema.

MidPoint schema is specified in XML Schema Definition (XSD) format. MidPoint schema is defined in several parts, but the most important is the "core" schema definition. The schema files reside in midPoint source code in schema component in the infra subsystem. Therefore, schema files can be found in the resources part under the infra/schema subdirectory of midPoint source code. Schema files are also included in midPoint distribution package for convenience.

Tip
Why XSD? Why did we choose to use the XML Schema Definition format for midPoint schema? There are historic reasons and there are pragmatic reasons. Back in the early 2010s when midPoint was born, XML was perhaps the only sensible choice for building a complex system. Alternatives such as JSON were young, and their schema languages ranged from very limited through useless to non-existent. Therefore, XML and XSD were a natural choice. We needed to extend XSD with custom features. Fortunately, XSD allowed that. We also needed to rewrite parts of the XML/XSD-processing code. Unfortunately, the XML ecosystem was not designed for this, and we have also hit other limitations of XML and XSD. We had to invent a new way to use XSD to describe generic data structures (a.k.a. "Prism objects") that can be represented in XML, JSON and YAML. Due to that innovation, XSD did not really hold us back that much. Despite all its limitations, XSD has worked for us quite well during all those years. However, we are getting very close to the limits of what XSD (or any similar schema language) can do. We are already working on a replacement: the Axiom data modeling language. Axiom is a next-generation language, supporting not just a data schema, but also a meta-data schema. Axiom is still very young, and it needs more work and time to mature. However, it is certainly the future for midPoint.

Every deployment engineer who takes midPoint deployments seriously should be aware of the schema. Hardcore engineers will surely open the XSD files in their favorite text editor in the terminal and analyze the definitions line-by-line. Developers could open the XSD files in their IDEs and have a nice organized look at the schema. However, even an ordinary engineer could benefit from learning the basics of XSD and having a look at a few important data types in midPoint schema.

Schema definition is not just about the properties, containers and data types. Crucial part of the schema definition is in-line documentation. Most of the data types and items are documented by using XSD in-line documentation mechanism. Therefore, a huge amount of details about midPoint can be learned by exploring the schema. We have tried to make that process easier by developing schemadoc mechanism. Schemadoc is a process that takes raw midPoint schema and generates HTML documentation out of that. This task is part of midPoint build process and generated documentation is a result of midPoint build. Schemadoc is also available online. Just search for "schemadoc" in midPoint docs.

Schema is not just a description of how midPoint works. MidPoint schema is part of midPoint itself. It is used when midPoint is compiled. It is parsed when midPoint starts. It is used by midPoint core and user interface. MidPoint is complex, and even the experts can sometimes be wrong about midPoint functionality. MidPoint documentation is quite extensive, therefore it may be misleading or out of date in places. But not the schema. Schema is always right. Otherwise midPoint won’t work at all. In midPoint world, schema is the law. When in doubt, look at the schema.

Schema Extensibility

MidPoint schema is quite rich. Many of the properties that are frequently used in identity management deployments are already part of midPoint schema. Given name, family name, full name, additional names, honorific titles, job title, personal number - it is all there, ready to be used. However, reality always has a way of bringing unexpected things. Therefore, midPoint deployments won’t get far if midPoint schema cannot be extended.

Vast majority of midPoint schema is available at compile-time. This means that such schema is used during compilation (build) of midPoint. That "static" part of schema is somehow hardcoded into midPoint itself, and it would be very difficult to change. Therefore, we have developed a mechanism to extend the schema at deployment-time. Small parts of the XSD definition can be provided when midPoint is deployed. MidPoint reads those definitions when it starts up. The static part of the schema is extended with those definitions. From that point on, the extensions are part of midPoint schema. The extensions are naturally used by midPoint user interface, expression-processing code and all other parts of midPoint.

Our ExAmPLE company was quite happy with the progress of their identity management deployment so far. Mappings were used to synchronize values of user names and all other common attributes. There are plenty of suitable properties for that in midPoint schema, such as givenName and fullName. Even personalNumber came in very handy. However, now they need to customize midPoint schema to better suit their very specific needs. The company management decided that the people are going to look really cool in fancy hats. Therefore, they are going to provide a hat for every employee. Which means that the identity management system needs to track hat size for all users. Hat size is not used in identity management deployments very often, therefore it is not a part of standard midPoint schema. Fortunately, it is quite easy to extend the schema.

First step to extend midPoint schema is to prepare a small XSD file:

example.xsd
<xsd:schema targetNamespace="http://example.com/xml/ns/midpoint/schema">
    ...
    <xsd:complexType name="UserTypeExtensionType">
        <xsd:annotation>
            <xsd:appinfo>
                <a:extension ref="c:UserType"/>
            </xsd:appinfo>
        </xsd:annotation>
        <xsd:sequence>
            <xsd:element name="hatSize" type="xsd:string"
                         minOccurs="0" maxOccurs="1"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

This file defines a new data structure UserTypeExtensionType. The name of this data structure does not really matter. What matters is that it is bound to an extension of UserType in the annotation part of the type definition. When midPoint reads this file, it extends the definition of UserType with this data type.

The extension specifies just a single property: hatSize. This is an optional single-valued string property, which is specified by the minOccurs and maxOccurs clauses. Every user in midPoint can have this property. When this schema extension is applied, the user interface is going to automatically display a text input field for this property for every user.
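The same two clauses control other combinations as well. As a hedged illustration, a hypothetical hatColor property (not part of the ExAmPLE scenario above) that is optional and multi-valued could be added to the same sequence like this, using the standard XSD value unbounded for maxOccurs:

```xml
    ...
    <!-- hypothetical property: optional (minOccurs=0), multi-valued (maxOccurs=unbounded) -->
    <xsd:element name="hatColor" type="xsd:string"
                 minOccurs="0" maxOccurs="unbounded"/>
    ...
```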

MidPoint administrator puts this XSD content into example.xsd file. Name of the file can be chosen arbitrarily as long as it has .xsd file extension. Administrator copies that file to schema subdirectory of midPoint home directory and restarts midPoint. From that point on the schema extension is active.

The users can be extended with custom property now:

<user xmlns:exmpl="http://example.com/xml/ns/midpoint/schema">
    <name>alice</name>
    <extension>
        <exmpl:hatSize>M</exmpl:hatSize>
    </extension>
    ...
</user>

There are a couple of important remarks to be made here. Firstly, all the extension properties are always placed in a special extension container in the objects. Even though the properties are placed inside a container, the user interface will present them in the same way as the other midPoint properties originating from the static midPoint schema. In midPoint, all schema items are treated equally, regardless of their origin.

Secondly, a clever reader surely noticed that we have used XML namespace here. We have omitted XML namespaces from the majority of other examples, as they are not that important when working with midPoint objects. However, schema is different. Namespaces are handled quite strictly when working with the schema. Namespaces must be declared and namespace prefixes must be properly used in all XSD definitions. This is how the XSD language was designed. The most important namespace in this case is the target namespace of the extended schema. The URI for this namespace should be chosen in such a way that it is globally unique. The use of your DNS domain is the recommended technique.
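The example.xsd listing above elides the namespace declarations for brevity. A fuller header might look like the following sketch. The URIs bound to the a and c prefixes are the ones commonly seen in midPoint schema files for prism annotations and the common schema, the tns prefix is just a conventional name, and all of them should be verified against the schema files shipped with the midPoint version in use:

```xml
<xsd:schema elementFormDefault="qualified"
            targetNamespace="http://example.com/xml/ns/midpoint/schema"
            xmlns:tns="http://example.com/xml/ns/midpoint/schema"
            xmlns:a="http://prism.evolveum.com/xml/ns/public/annotation-3"
            xmlns:c="http://midpoint.evolveum.com/xml/ns/public/common/common-3"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <!-- type definitions such as UserTypeExtensionType go here -->
    ...
</xsd:schema>
```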

Namespaces also should be used when working with extension container in users and other midPoint objects. This requirement is not that strict, as midPoint can usually figure out the namespace. However, this may be a problem in case that several schema extensions are combined. Such schema combinations are fully supported by midPoint. MidPoint simply parses all the XSD files in the schema directory and applies all of them as extensions. These files may contain conflicting definitions of the same items. The namespace is used to differentiate between them. Therefore, if there is an expectation that several schema extensions will be used in the same deployment, then the use of namespaces in object extension is more than recommended.
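To illustrate, assume that a second, hypothetical schema extension with the target namespace http://hats.example.net/xml/ns/midpoint/schema also defines an item named hatSize, this time holding a numeric size. Both the second namespace and its values are made up for this sketch. The prefixes make it unambiguous which definition each value belongs to:

```xml
<user xmlns:exmpl="http://example.com/xml/ns/midpoint/schema"
      xmlns:hats="http://hats.example.net/xml/ns/midpoint/schema">
    <name>alice</name>
    <extension>
        <!-- same local name, two different definitions -->
        <exmpl:hatSize>M</exmpl:hatSize>
        <hats:hatSize>57</hats:hatSize>
    </extension>
    ...
</user>
```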

Tip
Extension container

Why is there an extension container? Why are the extension properties not mixed among other static properties? XML should allow that. Yet, it does not really work well in practice. This is related to the intricacies of XML and XML schema.

Theoretically, XML is completely extensible. However, when XML Schema is applied to XML, some extensibility scenarios do not work very well. That is also the case for mixing of static XML elements and dynamic XML elements. We are hitting what is called the "Unique Particle Attribution" limitation of XML Schema. This was further amplified by limitations of Java XML libraries. The easiest and perhaps even most correct way to resolve this limitation was to create a dedicated XML element for schema extensions. That is what we have done in early midPoint versions. The schema processing code in midPoint has significantly improved since, and now we are almost at the point where we could remove the extension element. Unfortunately, we are not yet there. Moreover, there is still an aspect of compatibility to consider. The extension element stays, at least for now. However, we are trying hard to hide its existence from the end user.

MidPoint schema does not just specify the "core" data model. MidPoint schema goes a bit further, and it can also specify the details of data presentation. This means that the schema can specify a label that should be used for a particular data item, help text, tooltip and other characteristics. The XML Schema (XSD) cannot do this out-of-the-box. Fortunately, XSD schema can be extended by annotations. Those annotations can be used to define the presentation properties of an item:

    ...
    <xsd:element name="hatSize" type="xsd:string"
                 minOccurs="0" maxOccurs="1">
        <xsd:annotation>
            <xsd:appinfo>
                <a:displayName>Hat size</a:displayName>
                <a:help>
                    Your hat size, in whatever mysterious units the hatters
                    are using for measuring hats.
                </a:help>
            </xsd:appinfo>
        </xsd:annotation>
    </xsd:element>
    ...

This works fine if your system works in a single localization environment. Yet, this is not enough if you need to support more than one language. MidPoint was born in Europe, and here on the old continent we know quite well all the pain that comes with multi-language environments. Therefore, midPoint is designed to be localizable. You can simply use localization keys instead of actual text:

    ...
    <xsd:element name="hatSize" type="xsd:string"
                 minOccurs="0" maxOccurs="1">
        <xsd:annotation>
            <xsd:appinfo>
                <a:displayName>UserTypeExtensionType.hatSize.displayName</a:displayName>
                <a:help>UserTypeExtensionType.hatSize.help</a:help>
            </xsd:appinfo>
        </xsd:annotation>
    </xsd:element>
    ...

The actual text to be used for the label can be looked up in the localization catalog. However, using localization catalogs is a matter of its own. It will be covered by later chapters.

PolyString and Protected String

The majority of midPoint schema is pretty standard stuff. When you walk through the jungle of midPoint schema definition, you can see all the usual wildlife: strings, integers, booleans, timestamps and binary values. Yet, there are a few species that are quite strange. However strange they might look, they are immensely useful. Their names are PolyString and protected string.

PolyString (PolyStringType) is the stranger one of those two. Its name came from polymorphic string, which means a string that can take a variety of forms. In its simplest form, PolyString is just a simple string that can be normalized. Normalization means that we convert the original string into some standard form, e.g. by removing leading and trailing whitespace (trimming), converting all letters to lower case, simplifying national characters and so on.

Many ordinary midPoint properties are PolyStrings. Object name and user’s givenName, familyName and fullName are all PolyStrings. However, not even a clever reader has noticed anything suspicious about these properties so far. This is because normalization is almost transparent in midPoint PolyStrings, it happens on its own in the background. However, now it is time to have a peek inside. Let’s import a user that looks like this:

<user>
    <name>semančík</name>
    ...
    <fullName>Radovan Semančík,  PhD. </fullName>
    ...
</user>

What is really stored in midPoint repository is this:

<user>
    <name>
        <orig>semančík</orig>
        <norm>semancik</norm>
    </name>
    ...
    <fullName>
        <orig>Radovan Semančík,  PhD. </orig>
        <norm>radovan semancik phd</norm>
    </fullName>
    ...
</user>

This all happens transparently. PolyStrings are displayed as strings in the user interface. They are handled (almost completely) as strings in the mappings. Ordinary midPoint user has no idea that the normalization happens at all. If everything is so transparent, why do we bother to normalize strings at all? PolyString normalization has many practical uses. Two of them are embedded quite deep in the way how midPoint works.

Firstly, normalization is used to provide a reliable uniqueness mechanism. Usually we do not want a user with username semancik and another user with username Semancik or even Semančík. This may lead to confusion, and it even totally breaks some applications. As midPoint has uniqueness constraints on both the orig and norm parts of the name, such a situation is completely avoided. All those usernames have the same normalized form semancik, therefore the uniqueness constraint on the norm part of the name prohibits the use of all those forms at the same time.

Secondly, normalization is simple and elegant way to conveniently search for objects in international environments. When PolyStrings are searched, the value from the query is normalized. Then the norm part of the PolyString is searched. Therefore, whether the query contains semancik, Semancik or semančík, it will always find the user entry above.

The default normalization algorithm in midPoint should be a good fit for most environments. However, there are always deployments that are different. For example, characters such as hyphens (-) are usually not considered to be significant. Yet, some deployments may consider aliceanderson and alice-anderson to be two different usernames. The default midPoint normalization mechanism removes hyphens, therefore an attempt to have two such users will end up with an error. Fortunately, the normalization algorithm is customizable. There are several algorithms to choose from and they can even be parameterized. In the extreme case, there is a way to develop a completely custom algorithm. Therefore, the PolyString normalization should fit pretty much every deployment scenario.
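The hyphen case can be made concrete with a sketch of the stored form of such a user. The norm value below is derived from the behavior described above (lower-casing and removal of hyphens under default normalization), not copied from a running system:

```xml
<user>
    <name>
        <orig>alice-anderson</orig>
        <!-- the hyphen is dropped by the default normalization -->
        <norm>aliceanderson</norm>
    </name>
    ...
</user>
```

A user named aliceanderson would normalize to the same norm value, which is why the uniqueness constraint rejects the second user.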

PolyString still has more tricks to do. Simple normalization is not much of a polymorphism yet. PolyString becomes a real shape-shifter when used in fully localized environments. PolyString is designed to store values that can have individualized representations in national environments. E.g. in international deployments we probably want to provide localized role names. Like this:

<role>
    <name>
        <orig>System administrator</orig>
        <lang>
            <en>System administrator</en>
            <sk>Systémový správca</sk>
            <cs>Správce systému</cs>
        </lang>
    </name>
    ...
</role>

This is a mechanism to display midPoint to end users in their own language, complete with localized content of midPoint.

The other strange animal in the midPoint jungle is protected string (ProtectedStringType). Identity management systems often work with sensitive data such as user passwords. All the identity-related data usually need protection, but those sensitive data items need even better safeguards. This usually means that some kind of cryptographic technique needs to be employed. E.g. we do not want to store passwords in the cleartext form. We want them to be either hashed or encrypted. That is what protected string is for. Protected string is basically just a simple string, but it has extra cryptographic protection.

If you have ever dealt with cryptography, you will probably know that cryptography is not simple. Even such a seemingly simple thing as password hashing is quite complex when it comes to all the details. E.g. we do not want to store plain hash, as that would not provide sufficient protection. We want salted hash. Which means that the salt value needs to be stored together with the string. Many algorithms are parametric, and the parameters used during the hashing also need to be stored. Most importantly, we do not want to hard-wire midPoint to any specific algorithm. Cryptographic algorithms often do not age well, and they need to be replaced eventually. Therefore, we also need to store algorithm identifiers with the value. If the value is encrypted, we also need to store key identifier, as several keys may be active at the same time. Nothing is simple in cryptography. The cryptographic devil is in the tiny and often counter-intuitive details.

Protected string is a data structure that is designed to handle all those pesky cryptographic details, and still pretend that the content of the data structure is just a string. Similarly to PolyString, the basic usage of protected string is quite simple. Data can be imported into midPoint by using clearValue element:

<user>
    <name>alice</name>
    ...
    <credentials>
        <password>
            <value>
                <clearValue>sup3rSECRET</clearValue>
            </value>
        </password>
    </credentials>
</user>

The data are automatically protected when the object is imported into midPoint:

<user>
    <name>alice</name>
    ...
    <credentials>
        <password>
            <value>
                <t:encryptedData>
                    <t:encryptionMethod>
                        <t:algorithm>http://www.w3.org/2001/04/xmlenc#aes128-cbc</t:algorithm>
                    </t:encryptionMethod>
                    <t:keyInfo>
                        <t:keyName>1z0N17tv6hNQh5CAJ+jWHWDXeBM=</t:keyName>
                    </t:keyInfo>
                    <t:cipherData>
                        <t:cipherValue>g6Neg3ZEXY/ga00SpEa9w5MlJ9/IR+M1vEjdceni6bM=</t:cipherValue>
                    </t:cipherData>
                </t:encryptedData>
            </value>
        </password>
    </credentials>
</user>

Protected string data type supports cleartext representation, encryption using a symmetric algorithm and hashing. However, the protected string data type is just a mechanism for storing the data, it is just a place where the values can be stored. Whether specific protected string in the schema gets encrypted or hashed, and at which point that happens, is not controlled by the protected string data type itself. It is controlled by midPoint configuration and policies. For example, whether user password is encrypted or hashed is determined by midPoint security policy.
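As an illustration of that last point, a security policy that asks midPoint to hash user passwords rather than encrypt them might contain a fragment like the one below. This is a sketch only: the element names are given as an assumption about the security policy schema and should be verified in schemadoc for the midPoint version in use.

```xml
<securityPolicy>
    ...
    <credentials>
        <password>
            <storageMethod>
                <!-- assumed alternative values: encryption, hashing -->
                <storageType>hashing</storageType>
            </storageMethod>
        </password>
    </credentials>
</securityPolicy>
```

Keeping this decision in the policy, and not in the data type, is what allows the same ProtectedStringType to be used for values that must remain recoverable and for values that should never be readable again.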

Advanced Schema Concepts

This section describes schema concepts that go deeper into midPoint mechanisms and implementation. Awareness of those concepts will provide insight into how midPoint works. However, we have already talked about the schema quite a lot, and this chapter was quite low on practical examples. Feel free to skip the rest of this chapter if you want to get your hands dirty as soon as possible. However, please make sure to come back later. You will have to learn those schema concepts eventually, to get the best of midPoint functionality.

Type Hierarchy

So far we have presented midPoint schema as a simple set of data types. There is UserType for users, RoleType for roles and so on. However, all the midPoint objects have something in common. For example, all of them have object identifier (OID), name, description and so on. We could simply copy definitions of those properties to all the data types. However, that is not the best way to do data modeling. The proper way is to create a type hierarchy. Therefore, there is an ObjectType data type that specifies all the items that all the object types share. However, midPoint schema is substantial and one common ancestor won’t be enough. MidPoint type hierarchy was evolving during midPoint development, and now it forms quite a rich structure.

Type hierarchy

The following table summarizes midPoint data types and their purpose.

Data type Description

ObjectType

Common (abstract) data type for all midPoint objects. Specifies basic items that all midPoint objects have: name, description, metadata and so on.

AssignmentHolderType

Abstract supertype for all object types that can have assignments.

FocusType

Abstract supertype for all object types that can be focus of full midPoint computation. This basically means objects that have projections (accounts). Focal objects also have activation, they may have personas, etc.

UserType

User object represents a physical user of the system. Properties of user object typically describe the user as a physical person. Therefore, the user object defines a handful of properties that are commonly used to describe users in identity management solutions (employees, customers, partners, etc.).

AbstractRoleType

Abstract data type that contains the "essence" of a role. Roles and other objects that behave like roles are derived from this data type. All abstract roles may "grant" accounts on resources, attributes and entitlements for such accounts. The role can also imply (induce) organizational units, other roles or various identity management objects that can be assigned directly to user.

RoleType

A role in the role-based access control (RBAC) sense. The roles specify privileges that the user (or other object) should have.

Roles are intended to give privileges to users and other objects.

OrgType

Organizational unit, division, section, object group, team, project or any other form of organizing things and/or people. The OrgType objects are designed to form a hierarchical organizational structure (or rather several parallel organizational structures).

Orgs are intended to group objects. As orgs are abstract roles, they can also behave as roles.

ServiceType

This object type represents any kind of abstract or concrete services, or devices such as servers, virtual machines, printers, mobile devices, network nodes, application servers, applications or anything similar.

ArchetypeType

Archetype definition. Archetype defines custom object (sub)type. I.e. it defines specific behavior, look and feel of objects of a particular type, such as "employee", "project", "application", "business role" and so on.

ResourceType

Resource represents a system or component external to midPoint which is managed by midPoint. It is sometimes called identity resource, IT resource, target system, source system, provisioning target or by a variety of other names. MidPoint connects to the resource to create accounts, assign accounts to groups, etc. Resource may also be an authoritative source of data, a database that contains organizational structure and so on.

ConnectorType

Description of a generic connector. Connector in midPoint is any method of connection to the resource. This usually describes a ConnId identity connector.

ConnectorHostType

Host definition for remote connector, remote connector framework or a remote "gateway". This usually specifies the detail of a ConnId remote connector server.

SystemConfigurationType

System configuration object. It holds global system configuration settings. There is just one object of this type in the system. It has a fixed identifier (OID).

TaskType

Object that contains information about a task. This can represent an active running task, it may be a scheduled task waiting for execution, or the object may contain the results of a finished task.

ObjectTemplateType

An object that contains mappings and other configuration intended to apply to other object types. E.g. it may be used as “user template” to set up basic properties of new user objects.

LookupTableType

An object that represents a lookup table. The lookup table can be used for two purposes: value enumerations (e.g. for GUI or validation) and value mapping (translation). Simply speaking, it is a set of key-value pairs that can be efficiently stored and used in midPoint user interface, mappings and so on. It is designed to hold a large number of key-value pairs.

SecurityPolicyType

Object that contains definitions of overall security policy. It contains configuration of authentication mechanisms, credentials management (such as password resets) and so on.

ValuePolicyType

Policy for values of properties. This is almost always used to store password policies.

FunctionLibraryType

Object that contains a set of reusable functions. Those functions can be used in mappings and expressions in all parts of midPoint.

ObjectCollectionType

Object that specifies a collection of other objects. It is mostly just a named search filter that can be reused in other parts of midPoint. However, there are also some advanced functions that can be used in dashboards, for compliance purposes and so on.

ReportType

Specification of midPoint report. This specification defines what the report should contain, what it should look like, the output format and so on.

This object contains a report definition. It is a report “template” that can be executed, and it produces data. The output data are referred to by report output objects.

ReportDataType

Object that refers to data of the report. This is usually an output of a report, but it may also refer to input data that are to be imported to midPoint. It also contains metadata, e.g. the timestamp of when the report was created, what definition was used, etc.

SequenceType

Definition of a sequence object that produces unique values. The sequence state is persistently stored in the repository, therefore it can efficiently produce unique identifiers in a controlled and predictable manner.

FormType

Form definition. Forms define how a certain user interface form or dialog is presented in the user interface. It is used for user interface customization.

DashboardType

Object that specifies a look and a behavior of a dashboard. This is used for user interface customization. It can also specify some aspects of midPoint reports.

GenericObjectType

Generic type for objects that do not fit into any other category. However, support for this data type is extremely limited. We generally do not recommend using it at all.

ShadowType

Shadow of a resource object. Local copy of any object on the provisioning resource that is related to provisioning. It may be account, group, role (on the target system), privilege, security label, organizational unit or anything else that is worth managing in identity management.

NodeType

Node describes a single installation of midPoint. MidPoint installations can work in a cluster. The Node objects are how the nodes in a cluster know about each other.

Type hierarchy is a principle that is used in many software systems. This principle is probably quite obvious to all software developers, but other engineers may need some time to get used to it. However, the basic idea is quite simple. For example, AbstractRoleType has all the items that are needed for an object to behave like a role. RoleType, OrgType, ServiceType and ArchetypeType are subtypes of AbstractRoleType. Therefore, RoleType, OrgType, ServiceType and ArchetypeType can all behave like a role.
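The subtype relationships described above can be sketched as an ordinary class hierarchy. The following Python code is only an illustration, not midPoint code: the class names mirror the midPoint data types, but the attributes are a small illustrative subset and the helper `behaves_like_role` is a hypothetical name introduced here.

```python
class ObjectType:
    """Items shared by all objects (illustrative subset only)."""
    def __init__(self, oid, name, description=None):
        self.oid = oid
        self.name = name
        self.description = description


class AssignmentHolderType(ObjectType):
    """Objects that can have assignments."""
    def __init__(self, oid, name, description=None):
        super().__init__(oid, name, description)
        self.assignments = []


class FocusType(AssignmentHolderType):
    """Objects that can be the focus of computation; they have activation."""
    def __init__(self, oid, name, description=None):
        super().__init__(oid, name, description)
        self.activation = {"administrativeStatus": "enabled"}


class UserType(FocusType):
    """A physical user."""


class AbstractRoleType(FocusType):
    """The "essence" of a role: anything that can grant or induce things."""
    def __init__(self, oid, name, description=None):
        super().__init__(oid, name, description)
        self.inducements = []


class RoleType(AbstractRoleType):
    pass


class OrgType(AbstractRoleType):
    pass


class ServiceType(AbstractRoleType):
    pass


class ArchetypeType(AbstractRoleType):
    pass


def behaves_like_role(obj):
    # Hypothetical helper: every subtype of AbstractRoleType qualifies.
    return isinstance(obj, AbstractRoleType)
```

In this sketch an `OrgType` instance passes the `behaves_like_role` check while a `UserType` instance does not, although both share the items inherited from `ObjectType`.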

This may sound quite strange. Why would we want an organizational unit to behave like a role? Yet, the answer is quite obvious. Membership in an organizational unit may imply some privileges. Other identity systems need complex rules such as "if user belongs to organizational unit A then he will also have role X". That is not needed in midPoint. An organizational unit is like a role, therefore it can simply include all the roles that are needed. This means that the role-based access control (RBAC) principles can be applied to several object types. This approach illustrates a very typical trait of the midPoint philosophy: reuse of generic principles. We reuse existing principles instead of complicating the system by inventing a new single-purpose mechanism. As you will see later, this makes midPoint both elegant and powerful.
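The idea that an organizational unit can simply include roles can be illustrated with a toy model. The sketch below is an assumption-laden simplification: `AbstractRole`, `User` and `effective_roles` are hypothetical names, and real midPoint evaluation involves much more (conditions, activation, construction of accounts and so on).

```python
from dataclasses import dataclass, field


@dataclass
class AbstractRole:
    # Stands in for a role, an org, or any other role-like object.
    name: str
    inducements: list = field(default_factory=list)


@dataclass
class User:
    name: str
    assignments: list = field(default_factory=list)


def effective_roles(user):
    """Collect names of everything the user holds, directly or by inducement."""
    collected = set()
    pending = list(user.assignments)
    while pending:
        role = pending.pop()
        if role.name in collected:
            continue  # already processed; also protects against cycles
        collected.add(role.name)
        pending.extend(role.inducements)
    return collected


role_x = AbstractRole("X")
org_a = AbstractRole("A", inducements=[role_x])  # the org behaves like a role
jack = User("jack", assignments=[org_a])
```

Here the user is assigned only to organizational unit A, yet `effective_roles(jack)` also contains role X. No special "if member of A then grant X" rule is needed, because the org is processed by the same mechanism as any other role.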

Item Path

MidPoint configuration often needs to reference a particular item in a particular object. For example, mapping sources and targets are references to properties and containers. However, midPoint data structures can be quite complex. For example, password is stored in property value that is located in container password which is in container credentials defined in FocusType data type. It may be difficult to find a way in this little maze. There may even be some ambiguous situations. For example, user status is controlled by property administrativeStatus that is in the activation container. However, assignment also has an activation container, and therefore there is also an assignment administrativeStatus. Referencing an item by a simple name would not be enough. We need something more sophisticated.

MidPoint uses the concept of item path to reference items in the schema. In its simplest form, item path is just a sequence of item names separated by slash characters. For example, the path of user administrative status is

activation/administrativeStatus

whereas the path of assignment activation status is

assignment/activation/administrativeStatus

Item path provides an unambiguous reference to a specific item in midPoint schema. The path can be used in all the places where there is a need to reference a particular item. It is often used in mappings to specify sources and targets. Search filters often use item paths to form search queries. The path is also used in other places that we will mention in later chapters. The concept of item path is deeply embedded in all midPoint operations. For example, modification deltas use item paths to precisely pinpoint the places in the object that are modified.

The path is used to locate a particular item in midPoint schema. It is also used to reference a specific value in midPoint objects. In that case the path often looks simple, as we have seen above. As long as we are dealing only with single-valued containers, the path can unambiguously point to a specific item. However, we may get into trouble with multi-valued containers, which are often used in midPoint schema. Assignment is one of those multi-valued containers. A user can have many assignments. If we want to disable one particular assignment, how do we do it? If we used the path above, it would not be clear which assignment should be disabled. Therefore, in case of multi-valued containers, the path is extended with a container identifier in square brackets:

assignment[123]/activation/administrativeStatus

This path unambiguously references the administrativeStatus property in an activation container in a very specific assignment - an assignment container with identifier 123. This form of the path is used mostly in deltas, and users should never need to enter those paths manually. However, this form is often recorded in midPoint log files and other diagnostic output. Therefore, it is quite useful to be familiar with it.

You might wonder why there is an identifier for assignment but there is no identifier for activation. Both are containers, aren’t they? The clever reader already knows the answer. assignment is a multi-valued container. Therefore, identifier is needed to pinpoint a specific value of that container. However, activation is a single-valued container. There is no danger of ambiguity. Therefore, the identifier is not needed in this case.
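The path forms shown so far can be illustrated with a small parser. This is a minimal sketch, not midPoint's implementation: `Segment`, `parse_item_path` and `resolve` are hypothetical names, objects are modeled as plain nested dictionaries, and multi-valued containers are modeled as lists of dictionaries that carry their identifier under an assumed `id` key.

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    name: str
    container_id: Optional[int]  # set only for segments such as assignment[123]


# An item name optionally followed by a numeric identifier in square brackets.
_SEGMENT = re.compile(r"^([A-Za-z_][\w.-]*)(?:\[(\d+)\])?$")


def parse_item_path(path):
    """Split a slash-separated item path into named segments."""
    segments = []
    for raw in path.split("/"):
        match = _SEGMENT.match(raw)
        if match is None:
            raise ValueError(f"Unsupported path segment: {raw!r}")
        name, container_id = match.groups()
        segments.append(Segment(name, int(container_id) if container_id else None))
    return segments


def resolve(obj, path):
    """Follow the path through nested dicts; lists need a container identifier."""
    current = obj
    for segment in parse_item_path(path):
        current = current[segment.name]
        if isinstance(current, list):
            if segment.container_id is None:
                raise ValueError(
                    f"'{segment.name}' is multi-valued, a container identifier is required")
            matching = [v for v in current if v.get("id") == segment.container_id]
            if not matching:
                raise KeyError(f"No value of '{segment.name}' has id {segment.container_id}")
            current = matching[0]
    return current


user = {
    "activation": {"administrativeStatus": "enabled"},
    "assignment": [
        {"id": 123, "activation": {"administrativeStatus": "disabled"}},
        {"id": 124, "activation": {"administrativeStatus": "enabled"}},
    ],
}
```

With this sample data, `resolve(user, "assignment[123]/activation/administrativeStatus")` selects the status of one specific assignment, while the same path without the identifier is rejected as ambiguous. The single-valued `activation` container needs no identifier.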

This form of item path works fine if we need to identify an item in a particular object. However, sometimes we have a lot of objects and other data structures to choose from. For example, a mapping can have several sources and expression variables. Therefore, using simple paths would be ambiguous, as it would not specify where the path starts. In such a case the path can start with an optional variable identifier:

$focus/activation/administrativeStatus

The path above explicitly states that it should be applied to the content of variable focus. Therefore, there is no danger that this path could be applied to a shadow object which also has the activation container. This form of item path is often used in path expression evaluators.
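The role of the variable prefix can be sketched as follows. The code is illustrative only: `split_variable` and `evaluate` are hypothetical names, variables are plain dictionaries, and the choice of `focus` as the default starting point for paths without a prefix is an assumption made for this example.

```python
def split_variable(path, default_variable="focus"):
    """Return (variable name, remaining path) for paths such as $focus/activation."""
    if path.startswith("$"):
        variable, _, rest = path[1:].partition("/")
        return variable, rest
    # No explicit variable: fall back to the assumed default.
    return default_variable, path


def evaluate(path, variables, default_variable="focus"):
    """Pick the starting object by variable name, then walk the item names."""
    name, rest = split_variable(path, default_variable)
    value = variables[name]
    for item in (rest.split("/") if rest else []):
        value = value[item]
    return value


variables = {
    "focus": {"activation": {"administrativeStatus": "enabled"}},
    "shadow": {"activation": {"administrativeStatus": "disabled"}},
}
```

Both `focus` and `shadow` contain an activation container in this sample, so `activation/administrativeStatus` alone does not say which one is meant. The prefixed paths `$focus/activation/administrativeStatus` and `$shadow/activation/administrativeStatus` select different values.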

The clever reader is surely wondering about QNames now. The XML schema defines the elements in a form of QNames, which basically means "names in a namespace". Therefore, element names are supposed to be QNames, and item path should use QNames as well. Yet, so far all the names in the path looked like simple strings. In fact, they indeed are simple strings, but they point to elements in the schema. As long as the path is correct and unambiguous, midPoint does not need the namespaces. Simple strings (known as local parts of QNames) are enough to navigate through the schema and automatically determine the namespaces. The same principle is used for parsing XML, JSON or YAML documents without namespace definitions. However, there may be ambiguities when several custom schema extensions are used. Those extensions may have elements with conflicting local parts. In that case an alternative form of item path can be used:

declare namespace exmpl="http://example.com/xml/ns/midpoint/schema"; extension/exmpl:foo
Note
This alternative form is based on the XPath specification, which was used in early midPoint versions and was an inspiration for the concept of item path.
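The way local names can stand in for QNames, and where that breaks down, can be shown with a short sketch. It is not midPoint's resolution algorithm: `resolve_local_name` is a hypothetical function, item definitions are modeled as plain (namespace, local name) pairs, and the second namespace URI below is invented for the example.

```python
def resolve_local_name(local_name, known_items):
    """Map a local name to its full (namespace, local name) pair.

    known_items holds the (namespace, local name) pairs that the schema
    defines at the current position, e.g. all items of the extension container.
    """
    candidates = [(ns, name) for ns, name in known_items if name == local_name]
    if not candidates:
        raise KeyError(f"No item named '{local_name}'")
    if len(candidates) > 1:
        namespaces = ", ".join(ns for ns, _ in candidates)
        raise ValueError(
            f"'{local_name}' is ambiguous, an explicit namespace is needed: {namespaces}")
    return candidates[0]


EXAMPLE_NS = "http://example.com/xml/ns/midpoint/schema"
OTHER_NS = "http://example.org/other-extension"  # invented for this sketch

single_extension = [(EXAMPLE_NS, "foo"), (EXAMPLE_NS, "bar")]
two_extensions = single_extension + [(OTHER_NS, "foo")]
```

With a single extension schema, `foo` resolves to exactly one QName, so the namespace can be omitted. Once a second extension also defines `foo`, the lookup fails, and that is the situation where the namespace-qualified form of the path is required.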

The clever reader may have also noticed that there are two types of namespaces that are often used in midPoint:

http://prism.evolveum.com/xml/ns/public/...
http://midpoint.evolveum.com/xml/ns/public/...

Indeed, the schema is divided into two big parts:

  • Prism schema is used to express basic concepts that deal with objects, deltas, item paths, queries and similar mechanisms. Those are concepts of our data representation library that we dubbed Prism. Prism concepts are very generic mechanisms that (theoretically) have nothing to do with identity management or midPoint. Prism is supposed to be a general-purpose data representation library that can be reused to build other applications.

  • MidPoint schema is used to express all the objects and data types that midPoint works with. All the concepts specific to identity management are there: user, role, org, assignment and many, many others. This is the data model of identity management as it is implemented in midPoint.

Conclusion

This is all that you need to know about midPoint schema right now. There is still much more to learn, as the entire midPoint schema is big and complex. Understanding of midPoint schema is absolutely crucial for advanced use of midPoint, as the schema is a foundation of everything that midPoint does. Yet, the best way to learn is to do it on the go. You will learn more about midPoint schema as you explore midPoint functionality.