Export and Import - ShepherdDev/rock-misc GitHub Wiki
Before any encoding begins, all entities that are needed for the export are placed in a priority queue. The order of this queue is determined by the references between the various entities: entities that are referenced by other entities are placed earlier in the queue than the entities that reference them. Each queued item tracks all the property paths that were used to reach it. For example, if you enqueue a `WorkflowType`, it will include its `WorkflowActivityTypes`, which will in turn include their `WorkflowActionTypes`. One of the property paths to reach those `WorkflowActionTypes` would be `ActivityTypes.ActionTypes`, as those are the properties used to reach that entity from the root entity.
When adding something to the queue we first check for and ignore `EntityType` and `FieldType` entities. These are automatically generated by Rock and we should not attempt to create them. We may, down the road, need to take special care of these types and track them by full class name instead of by Guid, in case the Guids turn out not to be consistent across installs. Secondly, if the entity to be enqueued is already in the `EntityPath` that was used to reach it, then it is considered a circular reference and is ignored (for example, if you enqueue a `DefinedType` it will enqueue its `DefinedValues`, which will in turn attempt to enqueue their referenced `DefinedType` again).
Before the entity up for queue consideration is placed in the queue, we first find all referenced entities (`FindReferencedEntities()`) and attempt to enqueue them first. Then we add the currently considered entity to the queue, or if it is already in the queue we simply add another `EntityPath` reference to it. After it is added, we consider all child entities (i.e. entities on the Many side of a Many-To-One relationship). Each of those is enqueued as well.
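The queueing rules above can be sketched roughly like this. Python is used purely as illustration (the real code is C# working over Rock's CLR types), and all the data shapes here (`"type"`, `"refs"`, `"children"`) are invented:

```python
def enqueue_entity(queue, entity, path="", ancestors=frozenset()):
    """Queue `entity` plus everything it references, dependencies first.

    `queue` maps entity name -> list of EntityPath strings used to reach
    it. All shapes here are invented for illustration.
    """
    # EntityType / FieldType rows are generated by Rock itself: never export.
    if entity["type"] in ("EntityType", "FieldType"):
        return
    # An entity already on the path used to reach it is a circular
    # reference (DefinedValue -> its parent DefinedType); skip it.
    if entity["name"] in ancestors:
        return
    nxt = ancestors | {entity["name"]}
    # Referenced entities go in first so they end up earlier in the queue
    # than the entities that reference them.
    for prop, ref in entity.get("refs", {}).items():
        enqueue_entity(queue, ref, f"{path}.{prop}".lstrip("."), nxt)
    # Add the entity itself, or just record one more EntityPath to it.
    queue.setdefault(entity["name"], []).append(path)
    # Finally enqueue child entities (the Many side of Many-To-One).
    for prop, kids in entity.get("children", {}).items():
        for kid in kids:
            enqueue_entity(queue, kid, f"{path}.{prop}".lstrip("."), nxt)

# Hypothetical data: a WorkflowType with one activity and action type.
action = {"name": "WorkflowActionType", "type": "WorkflowActionType"}
activity = {"name": "WorkflowActivityType", "type": "WorkflowActivityType",
            "children": {"ActionTypes": [action]}}
workflow_type = {"name": "WorkflowType", "type": "WorkflowType",
                 "children": {"ActivityTypes": [activity]}}
queue = {}
enqueue_entity(queue, workflow_type)
```

After running this, the queue holds the root first and the action type last, with `"ActivityTypes.ActionTypes"` recorded as the path that reached it.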
We use a little cheating to find referenced entities. We get all `DataMember` properties of the entity and check for any of type `int` or `int?` whose name ends with `Id`. Then we check if there is a related property on the entity with the same name, minus the `Id` suffix (i.e. `CategoryId` -> `Category`). If such a property is found and it is of type `IEntity`, then we add it to the list of potential referenced entities.
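That naming-convention trick looks roughly like this as a Python sketch. The `PrayerRequest` class, the attribute names, and the "has a `Guid`" test (standing in for `IEntity`) are all illustrative assumptions:

```python
def find_referenced_entities(entity):
    """Guess foreign-key references by the CategoryId -> Category convention."""
    refs = {}
    for name in vars(entity):
        # Only int / nullable-int properties whose name ends with "Id".
        if name == "Id" or not name.endswith("Id"):
            continue
        value = getattr(entity, name)
        if not (value is None or isinstance(value, int)):
            continue
        # Look for the sibling navigation property: CategoryId -> Category.
        related = getattr(entity, name[:-2], None)
        # Treat "has a Guid" as our stand-in for implementing IEntity.
        if related is not None and hasattr(related, "Guid"):
            refs[name] = related
    return refs

class Category:
    def __init__(self):
        self.Guid = "cat-guid"

class PrayerRequest:  # hypothetical entity, not Rock's actual model
    def __init__(self):
        self.Id = 1
        self.CategoryId = 7
        self.Category = Category()
        self.ForeignId = 3  # no "Foreign" sibling property -> not a reference

refs = find_referenced_entities(PrayerRequest())
```

Only `CategoryId` qualifies: `Id` itself is excluded, and `ForeignId` has no matching `Foreign` property.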
Next we give any `EntityProcessor<>` implementations a chance to override that list by adding to or removing from it. This allows customization for entities that do not have full foreign key references, such as `AttributeValue`: a helper method can be defined to work out what the referenced entity really is.
Similarly, finding child entities works in nearly the same way. In this case we look for any `DataMember` properties of type `IEnumerable` whose element type is `IEntity`. Any entities found in that enumerable are added to the list of known children.
Next we check if the entity implements `IHasAttributes`, and if so we check for and add any `Attribute` and `AttributeValue` entities as children of this entity.
Again we give any `EntityProcessor<>` implementations a chance to override this list. An example of this can be seen with a `WorkflowType`. It has `Attribute` entities related to it, but they do not directly reference the `WorkflowType`. Instead they use the `EntityTypeQualifierColumn` and `EntityTypeQualifierValue` columns to mark themselves as belonging to that `WorkflowType`. These cannot be detected by the generic code in `FindChildEntities`.
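Putting the child-entity rules together, here is a rough sketch. Again Python stands in for C# reflection, the `Stub` class and property names are invented, and the processor hook is reduced to a plain callable:

```python
def find_child_entities(entity, processors=()):
    """Collect children: entity lists, attribute dictionaries, then
    whatever the processors add that the generic scan cannot see."""
    children = []
    # Generic scan: any list-typed property whose items all look like
    # entities (our stand-in for IEnumerable<IEntity>).
    for value in vars(entity).values():
        if isinstance(value, list) and value and all(hasattr(v, "Guid") for v in value):
            children.extend(value)
    # IHasAttributes stand-in: Attribute / AttributeValue become children.
    if isinstance(getattr(entity, "Attributes", None), dict):
        children.extend(entity.Attributes.values())
        children.extend(entity.AttributeValues.values())
    # Processors add children the scan misses, e.g. a WorkflowType's
    # qualifier-matched Attribute rows.
    for proc in processors:
        children = proc(entity, children)
    return children

class Stub:
    def __init__(self, guid, **props):
        self.Guid = guid
        self.__dict__.update(props)

attribute = Stub("attr-guid")
attribute_value = Stub("value-guid")
activity_type = Stub("activity-guid")
workflow_type = Stub("wf-guid",
                     ActivityTypes=[activity_type],
                     Attributes={"Key": attribute},
                     AttributeValues={"Key": attribute_value})
children = find_child_entities(workflow_type)
```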
Now that we have a list of entities that need to be encoded for export, we begin the easy work. The `ProcessQueue` method is passed a function that is used to determine whether the various entities should be given a new Guid. Each queued entity is passed in turn to this function and, if appropriate, flagged as wanting a new Guid. At the same time we actually encode the entity. Finally, this method puts all the encoded entities in a `DataContainer` object which is returned to the caller. This is what can then be exported as JSON data.
The actual encoding of an entity is pretty straightforward. We gather a list of all defined `DataMember` properties of the entity. Then we skip any of type `IEntity` or `IEnumerable<IEntity>` (though, thinking about it, we should probably just skip all `IEnumerable` types). We store each property name and value inside the `Properties` of the `EncodedEntity` object. After all the standard properties are stored we call `GenerateReferences()` to generate Guid references for any "by-Id" properties that we just encoded, e.g. `CategoryId`.
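A minimal sketch of the encoding step, before `GenerateReferences()` runs. The dictionary shape of the encoded entity and the `Workflow`/`Category` classes are assumptions for illustration:

```python
def encode_entity(entity):
    """Flatten an entity into its simple properties, skipping entity
    references and entity collections (those are handled separately as
    References and queued children)."""
    properties = {}
    for name, value in vars(entity).items():
        if hasattr(value, "Guid"):
            continue  # IEntity navigation property -> skipped
        if isinstance(value, (list, dict, set)):
            continue  # IEnumerable -> skipped
        properties[name] = value
    return {"EntityType": type(entity).__name__,
            "Guid": properties.pop("Guid", None),
            "Properties": properties,
            "References": []}

class Category:
    def __init__(self):
        self.Guid = "cat-guid"

class Workflow:  # hypothetical
    def __init__(self):
        self.Guid = "wf-guid"
        self.Name = "Sample"
        self.CategoryId = 7          # kept: "by-Id" value, for now
        self.Category = Category()   # skipped: entity reference
        self.Activities = []         # skipped: entity collection

encoded = encode_entity(Workflow())
```

Note that `CategoryId` survives this step; it is `GenerateReferences()` that later converts it into a Guid reference and removes it.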
Though it is not fully implemented in the code, at this point we would call any `EntityProcessor<>` `PostProcessExportedEntity()` methods. This would allow for any processing that needs to happen after the entity has been encoded. I have not needed this functionality yet, so while I implemented the `EntityProcessor<>` methods I have not implemented the lines of code to call them yet. An example might be an entity that needs to have its data modified for export without modifying the original entity (probably safe since we aren't saving changes, but still better not to modify the original).
References are stored as the `IEntity` class name (e.g. `Rock.Model.Category`), the referenced entity's Guid, and the original "by-Id" property name (e.g. `CategoryId`). When building the references we call the `FindReferencedEntities()` method again and then, for each returned referenced entity, call `MakeGuidReference()` with the data just mentioned. That method creates a new `Reference` object and attaches it to the `EncodedEntity`, as well as removing the original property name from the list of encoded `Properties`.
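Continuing the sketch with the same assumed dictionary shapes as before, `MakeGuidReference()` boils down to a swap: Id out, Guid reference in.

```python
def make_guid_reference(encoded, property_name, entity_type, guid):
    """Replace a raw "by-Id" property with a portable Guid reference."""
    encoded["References"].append({"EntityType": entity_type,
                                  "Guid": guid,
                                  "Property": property_name})
    # The raw Id number is meaningless on another install, so drop it.
    encoded["Properties"].pop(property_name, None)

encoded = {"EntityType": "Workflow", "Guid": "wf-guid",
           "Properties": {"Name": "Sample", "CategoryId": 7},
           "References": []}
make_guid_reference(encoded, "CategoryId", "Rock.Model.Category", "cat-guid")
```

On import the process is reversed: the reference's Guid is looked up in the target database and the resulting Id is written back into `CategoryId`.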
Exporting is the hard part; importing is fairly straightforward. For consistency's sake, everything is done inside a database transaction, so either everything will be created or nothing will be created. As a side note, because we are working in a transaction it is necessary to call `SaveChanges(true)` instead of `SaveChanges()` so that pre- and post-save actions do not run; those actions cause deadlock conditions for some entities when run inside the transaction.
The first thing the import process does is walk through all the `EncodedEntity` objects and check whether they should be assigned a new Guid. If so, we assign a new Guid and record a map between the old and new Guids. In the future this section could be made conditional on the user's preferences; they may wish to turn off new Guid creation, in which case the import operation becomes closer to a "restore data" operation.
After all the Guids have been assigned we walk through the `EncodedEntity` objects again. We attempt to load each entity by its Guid from the database. If we find an existing object then we ignore it and move on to the next. If we do not find an existing object, we call the `CreateNewEntity()` method to create a new database entity from the `EncodedEntity`. After all objects have been verified to exist (either already existing or newly created) we commit the transaction.
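The two import passes can be sketched like this. The `GenerateNewGuid` flag, the callback signatures, and the omission of the surrounding transaction are all simplifications, not the actual API:

```python
import uuid

def import_container(encoded_entities, find_by_guid, create_new_entity):
    # Pass 1: hand out new Guids and remember the old -> new mapping.
    guid_map = {}
    for enc in encoded_entities:
        if enc.get("GenerateNewGuid"):
            new_guid = str(uuid.uuid4())
            guid_map[enc["Guid"]] = new_guid
            enc["Guid"] = new_guid
    # Pass 2: create anything the database does not already contain.
    created = []
    for enc in encoded_entities:
        if find_by_guid(enc["Guid"]) is None:
            created.append(create_new_entity(enc, guid_map))
    return guid_map, created

container = [
    {"Guid": "wf-guid", "GenerateNewGuid": True},
    {"Guid": "cat-guid", "GenerateNewGuid": False},  # shared entity: keep Guid
]
existing = {"cat-guid"}  # pretend the category is already installed
guid_map, created = import_container(
    container,
    find_by_guid=lambda g: g if g in existing else None,
    create_new_entity=lambda enc, gmap: enc["Guid"])
```

The entity that kept its original Guid is found in the "database" and skipped; only the re-Guided workflow gets created.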
There is a lot of magic that happens in `CreateNewEntity()`. Meaning, if people don't follow standard practices when creating core entities, this will break :). We use reflection to determine the entity's `IService` object (e.g. `CategoryService`) as well as the `Add` method of that service.
Next we create an empty instance of the entity and restore all the standard properties and references (`RestoreEntityProperties()`). Any `EntityProcessor<>` types that are defined for this entity type get their `PreProcessImportedEntity()` methods called. This allows pre-save processing. For example, the `WorkflowActionFormProcessor` uses this method to clean up the form `Actions` by updating any Guids to their new values. Because all that information is stored as a long delimited string, it cannot be handled by the standard functionality.
After all pre-processing is done we save the new entity. Though it is not fully implemented in the code, at this point we would call any `EntityProcessor<>` `PostProcessImportedEntity()` methods and possibly save again. This would allow for any processing that needs to happen after the entity is in the database and has an Id number. I have not needed this functionality yet, so while I implemented the `EntityProcessor<>` methods I have not implemented the lines of code to call them yet.
`RestoreEntityProperties()` is fairly straightforward, but has a bit of logic in it to make sure we don't screw things up. For each entity `DataMember` property we check if we have an encoded property value in the `Properties` dictionary. If so, we check whether it is a Guid (either a real Guid or a string Guid; the "string Guid" code is probably not needed, as Newtonsoft seems to automatically convert strings to Guids, but better safe than sorry). If it is a Guid then we attempt to map it to a new Guid value, if one was generated. This updates "hard-coded" references in the form of Guid columns rather than Id columns (e.g. if somebody were to ever create a `CategoryGuid` column instead of `CategoryId`).
Once we have possibly re-mapped the value we convert the value to the target property data type and set the value.
If the property does not exist in the encoded `Properties` dictionary, then we check whether it exists in the `References` array. If so, we try to locate an existing entity for that reference. If found, we get the Id number of that entity and store it in the new entity's property.
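Both halves of the restore logic fit in a short sketch. The `CompletedGuid` property, the placeholder Guid strings, and the lookup callback are invented for illustration:

```python
import re
from types import SimpleNamespace

GUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")

def restore_entity_properties(entity, encoded, guid_map, find_by_guid):
    # Simple properties: remap Guid-shaped values through the old -> new
    # map (covers "hard coded" Guid columns), then copy the value across.
    for name, value in encoded["Properties"].items():
        if isinstance(value, str) and GUID_RE.match(value):
            value = guid_map.get(value, value)
        setattr(entity, name, value)
    # References: look the target up by Guid and store its Id number.
    for ref in encoded["References"]:
        target = find_by_guid(ref["EntityType"], ref["Guid"])
        if target is not None:
            setattr(entity, ref["Property"], target.Id)

OLD = "11111111-1111-1111-1111-111111111111"  # placeholder Guids
NEW = "22222222-2222-2222-2222-222222222222"
encoded = {"Properties": {"Name": "Sample", "CompletedGuid": OLD},
           "References": [{"EntityType": "Rock.Model.Category",
                           "Guid": "cat-guid", "Property": "CategoryId"}]}
entity = SimpleNamespace()
restore_entity_properties(entity, encoded, {OLD: NEW},
                          lambda etype, guid: SimpleNamespace(Id=42))
```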
Any entity that cannot be directly encoded/decoded by the core implementation will need to have an Entity Processor defined for it. Multiple Entity Processors can be defined for the same type and there is no particular order in which they are executed. Generally speaking, only a single `EntityProcessor<>` should be created for each data type unless you are doing something unique to your data.
`EntityProcessor<>` implementations should also not assume anything about the data they are working with. For example, even though a `WorkflowActionForm` must have the `Actions` property, a processor should not assume it exists, since another processor may have done something custom with it and stored it under a different name.
While importing should never really need any custom logic (other than some `EntityProcessor<>` implementations), exporting needs just a little bit of extra logic to determine which entities should get new Guids. Currently we only support exporting a `WorkflowType` via the `ExportWorkflowType()` method. Other methods should be created to support new root entity types. A few examples that come to mind as potentially good targets:
- `ExportPage`
- `ExportSite`
- `ExportGroup` (as in export a single group; might need some extra logic to handle recursion options)
- `ExportGroupTree`
- `ExportDefinedType`
- Others?
It would be good to include some version information with the encoded `DataContainer`: the Rock version and also version numbers of any used `EntityProcessor<>` implementations (these would also need to be defined with version numbers). This way, on import, a check can be done to see whether the currently available versions meet or exceed those version numbers. If not, a warning could be displayed informing the user that even though the import might succeed, data may not be created correctly, and asking them to verify that they wish to continue.
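Since this feature is only a proposal, here is one possible shape for the check: it emits warnings rather than failures, and every field name and version string below is hypothetical.

```python
def parse_version(text):
    # "1.7.4" -> (1, 7, 4); tuples compare component-wise.
    return tuple(int(part) for part in text.split("."))

def version_warnings(container_info, rock_version, processor_versions):
    """Compare a container's recorded versions against what is installed.

    Returns human-readable warnings; the import may still proceed.
    """
    warnings = []
    if parse_version(rock_version) < parse_version(container_info["RockVersion"]):
        warnings.append("Export was made on a newer Rock version; "
                        "data may not be created correctly.")
    for name, required in container_info.get("Processors", {}).items():
        installed = processor_versions.get(name)
        if installed is None or parse_version(installed) < parse_version(required):
            warnings.append(f"EntityProcessor '{name}' v{required} is newer "
                            "than what is installed.")
    return warnings

# Hypothetical container metadata and installed versions.
info = {"RockVersion": "1.8.0",
        "Processors": {"WorkflowActionFormProcessor": "1.1"}}
warnings = version_warnings(info, "1.7.4",
                            {"WorkflowActionFormProcessor": "1.0"})
```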
This would be especially true with `EntityProcessor<>` implementations. We might find a slight bug or omission in an existing `EntityProcessor<>` and fix it, but if the user has not updated their plug-in/core code they wouldn't know they have the bug. As I said, I think this should be a warning, as it might not really matter. The bug might be just a visual anomaly that they don't care about. Or it might be a real bug that jacks up that particular entity, which they know about and plan to fix manually.