A very generic datamodel.

I’ve come up with several projects in the past and a few have been mentioned here before. For example, the Garagesale project, which was based on a system I called “CART”. Or the WordChain project, which was a bit similar in structure. And because of those similarities, I’ve been thinking about a very generic datamodel that could be applied to almost any project.

The advantage of a generic database is that you can focus on the business layer while you don’t need to change much in the database itself. The datamodel would still need development, but by using the existing model and mapping to existing entities, you could keep it all very simple. And it resulted in this datamodel: [Image: ClassDiagram]

The top class is ‘Identifier’, which is just an ID of type GUID to find the records, and which works fine in derived classes too. Since I’m using Entity Framework 6, I can just use POCOs to keep it all very simple. All I have to do is define a DbContext that tells me which tables (classes) I want. If I don’t create an entry for ‘Identifier’, the table won’t be created either.
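A minimal sketch of that base class, written in Python rather than the C#/Entity Framework setup I actually use. The `name` field and the `TABLES` list are assumptions for illustration only; `TABLES` merely stands in for the entries of a DbContext.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Identifier:
    # Every record gets a GUID; uuid4() stands in for the database-generated key.
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class BaseItem(Identifier):
    # 'name' is a hypothetical field, added only to make the example concrete.
    name: str = ""


# Only the classes listed here would become tables; Identifier itself is left out.
TABLES = [BaseItem]

car = BaseItem(name="Car")
wheel = BaseItem(name="Wheel")
```

Both items inherit their own GUID from ‘Identifier’ without that class ever becoming a table.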

The next class is the ‘DataContent’ class, which can hold any XML. That way, this class can contain all information that I define in code without the need to create new tables. I also linked it to a ‘DataTemplate’ class, which can be used to validate the content of the XML with an XML schema or a special style sheet. (I still need to work out how, exactly.)
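Since the validation mechanism is still undecided, the following is only one possible reading, sketched in Python: the template lists the element names that must be present, as a crude stand-in for a real XML schema. The field names `root_tag` and `required_fields` are assumptions, not part of the model.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class DataTemplate:
    root_tag: str
    required_fields: tuple  # hypothetical stand-in for an XML schema

    def validate(self, xml_text: str) -> bool:
        # Valid means: the expected root element, and every required child present.
        root = ET.fromstring(xml_text)
        if root.tag != self.root_tag:
            return False
        return all(root.find(name) is not None for name in self.required_fields)


@dataclass
class DataContent:
    xml: str
    template: DataTemplate

    def is_valid(self) -> bool:
        return self.template.validate(self.xml)


book_template = DataTemplate("Book", ("Title", "Author"))
good = DataContent(
    "<Book><Title>Dune</Title><Author>Frank Herbert</Author></Book>", book_template
)
bad = DataContent("<Book><Title>Dune</Title></Book>", book_template)
```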

The ‘BaseItem’ and ‘BaseLink’ classes are the most important ones here. ‘BaseItem’ contains all fixed data within my system. In the CART system, this would be the catalog. And ‘BaseLink’ defines transactions that move a specific item from one item to another. And that’s basically three-quarters of the CART system. (The template is already defined in the ‘DataTemplate’ class.)

I also created two separate link types. One deals with whole numbers and is called ‘CountLink’, which you generally use for items. (One cup, two girls, etc.) The other is for fractional numbers like weights or money and is called ‘AmountLink’. These two will be the most used transaction types, although ‘BaseLink’ can be used to transfer unique items. Derived links could be created to support more special situations but I can’t think of any.
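A small Python sketch of the three link types. The field names (`subject`, `sender`, `receiver`, `count`, `amount`) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class BaseLink:
    subject: str   # the item being transferred
    sender: str    # the item it comes from
    receiver: str  # the item it goes to


@dataclass
class CountLink(BaseLink):
    count: int = 1  # whole numbers only


@dataclass
class AmountLink(BaseLink):
    amount: Decimal = Decimal("0")  # fractional values such as weight or money


cups = CountLink("Cup", "Shop", "Customer", count=2)
payment = AmountLink("Euro", "Customer", "Shop", amount=Decimal("7.50"))
painting = BaseLink("Mona Lisa", "Seller", "Buyer")  # a unique item needs no number
```

Using `Decimal` rather than a float for ‘AmountLink’ avoids rounding surprises with money.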

The ‘BaseItem’ class will be used to derive more special items. These special items will define the relations with other items in the system. The simplest of them is the ‘ChildItem’ class, which defines more information related to a specific item. Child items are strongly linked to the parent item, like wheels on a car or keys on a keyboard.

The ‘Relation’ class is used to group multiple items together. For example, we can have ‘Books’ defined as relation with multiple book items linked to it. A second group called ‘Possessions’ could also be created to contain all things I own. Items that would be in both groups would be what is in my personal library.

A special relation type is ‘Property’ which indicates that all items in the relation are owned by a specific owner. No matter what happens with those items, their owner stays the same. Such a property could e.g. be a bank account with a bank as owner. Even though customers use such accounts, the account itself could not be transferred to some other bank.

But the ‘Asset’ class is more interesting since assets are the only items that we can transfer. Any transaction will be about an asset moving from one item to another. Assets can still be anything and this class doesn’t differ much from the ‘BaseItem’ class.

A special asset is a contract. Contracts have a special purpose in transactions. Transactions are always between an item and a contract. Either you put an asset into a contract or extract it from a contract. And contracts themselves can be part of bigger contracts. By checking how much has been sent to or received from a contract, you can check if all transactions combined are valid. Transactions will have to specify if they’re sending items to the contract or receiving them from the contract.

The ‘BaseContract’ class is the more generic contract type and manages a list of transactions. When it has several transactions, it is important that there are no more ‘phantom items’. (A phantom item would be something that’s sent to the contract but not received by another item, or vice versa.) These contracts will need to be balanced as a check to see if they can be closed or not. They should be temporary and last from the first transaction to the last.
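The balancing check can be sketched in a few lines of Python. The `Transaction` fields and the method names `phantom_items` and `can_close` are assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Transaction:
    asset: str
    item: str          # the item on the other side of the contract
    amount: Decimal
    to_contract: bool  # True: the item sends the asset into the contract


@dataclass
class BaseContract:
    transactions: list = field(default_factory=list)

    def phantom_items(self) -> dict:
        """Net amount per asset that went in but never came out (or vice versa)."""
        totals = defaultdict(Decimal)
        for t in self.transactions:
            totals[t.asset] += t.amount if t.to_contract else -t.amount
        return {asset: net for asset, net in totals.items() if net != 0}

    def can_close(self) -> bool:
        return not self.phantom_items()


sale = BaseContract()
sale.transactions.append(Transaction("Euro", "Buyer", Decimal("10"), True))
sale.transactions.append(Transaction("Book", "Seller", Decimal("1"), True))
sale.transactions.append(Transaction("Euro", "Seller", Decimal("10"), False))
open_items = sale.phantom_items()  # the book is still inside the contract
sale.transactions.append(Transaction("Book", "Buyer", Decimal("1"), False))
```

After the last transaction nothing is left inside the contract, so it can be closed.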

The ‘Contract’ type derived from ‘BaseContract’ contains an extra owner. This owner will be the one who owns any phantom items in the contract. This reduces the number of transactions and makes the contract everlasting. (Although it can still be closed.) Balancing these contracts is not required, making them ideal as e.g. bank accounts.

Yes, it’s a bit more advanced than my earlier CART system but I’ve considered how I could use this for various projects that I have in mind. Not just the GarageSale project, but also a simple banking application, a chess notation application, a project to keep track of sugar measurements for people with diabetes and my WordChain application.

The banking application would be interesting. It would start with two ‘Relation’ records: “Banks” and “Clients”. The Banks relation would contain Bank records with information of multiple banks. The Clients relation would contain the client records for those banks. And because of the datamodel, clients can have multiple banks.

Banks would be owners of bank accounts, and those accounts would be contracts. All the bank needs to do is keep track of all money going in or out of the account. (That makes money just another item, and all transactions will be of type ‘AmountLink’.) But to link those accounts to the persons who are authorized to receive money from the account, each account would need to be the owner of a Property record. The property record then has a list of clients authorized to manage the account.

And we will need six different methods to create transactions. Authorized clients can add money to or withdraw money from the account. Other clients can send payments to or receive payments from the account, where any money taken out of the account needs to be authorized. Finally, the bank may charge interest, or pay interest. (Or not.) These interest transactions don’t need authorization from the client.
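Those six methods could look roughly like this in Python. It is a sketch only: the class name `Account`, the method names and the ledger layout are all assumptions, and the `authorized` set stands in for the clients listed in the Property record.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Account:
    bank: str                                     # owner of the contract
    authorized: set = field(default_factory=set)  # clients from the Property record
    ledger: list = field(default_factory=list)    # (kind, party, signed amount)

    def _post(self, kind, party, amount):
        self.ledger.append((kind, party, amount))

    def _require_auth(self, client):
        if client not in self.authorized:
            raise PermissionError(f"{client} is not authorized for this account")

    def deposit(self, client, amount):
        self._require_auth(client)
        self._post("deposit", client, amount)

    def withdraw(self, client, amount):
        self._require_auth(client)
        self._post("withdraw", client, -amount)

    def receive_payment(self, payer, amount):
        self._post("payment_in", payer, amount)   # incoming money needs no approval

    def send_payment(self, payee, amount, approved_by):
        self._require_auth(approved_by)           # outgoing money must be authorized
        self._post("payment_out", payee, -amount)

    def pay_interest(self, amount):
        self._post("interest_paid", self.bank, amount)

    def charge_interest(self, amount):
        self._post("interest_charged", self.bank, -amount)

    def balance(self):
        # The net of all transactions: what the contract currently holds.
        return sum((amount for _, _, amount in self.ledger), Decimal("0"))


account = Account(bank="Bank", authorized={"Alice"})
account.deposit("Alice", Decimal("100"))
account.receive_payment("Bob", Decimal("25"))
account.send_payment("Carol", Decimal("40"), approved_by="Alice")
account.pay_interest(Decimal("1.50"))
```

The balance is never forced back to zero, which is what separates this kind of contract from one that has to be balanced before closing.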

The Chess Notation project would also be interesting. It would start with a Board item and 64 square items plus a bunch of piece assets. The game itself would be a basic contract without an owner. The Game contract would contain a collection of transactions transferring all pieces to their first locations. A collection of ‘Move’ contracts would also be needed, owned by the Game contract. The Move would show which move it is (including branches of the game) and the transactions that take place on the board. (White Rook gone from A1, White Rook added to A4 and Black pawn removed from A4, which translates into rook takes pawn at A4.)
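The rook-takes-pawn example can be replayed with a short Python sketch. The names `Transfer`, `Move` and `apply_move` are assumptions, and the board is reduced to a plain dictionary from square to piece.

```python
from dataclasses import dataclass, field


@dataclass
class Transfer:
    piece: str
    square: str
    placed: bool  # True: the piece is added to the square, False: it is removed


@dataclass
class Move:
    number: int
    transfers: list = field(default_factory=list)


def apply_move(board: dict, move: Move) -> None:
    """Replay one move; removals run first so a capture frees the target square."""
    for t in sorted(move.transfers, key=lambda t: t.placed):
        if t.placed:
            if t.square in board:
                raise ValueError(f"{t.square} is occupied")
            board[t.square] = t.piece
        else:
            if board.get(t.square) != t.piece:
                raise ValueError(f"{t.piece} is not on {t.square}")
            del board[t.square]


board = {"A1": "White Rook", "A4": "Black Pawn"}
rook_takes_pawn = Move(1, [
    Transfer("White Rook", "A1", placed=False),
    Transfer("White Rook", "A4", placed=True),
    Transfer("Black Pawn", "A4", placed=False),
])
apply_move(board, rook_takes_pawn)
```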

It would be a very complex way to store a chess game, but it can be done in the same datamodel as my banking application.

With the diabetes project, each transaction would be a measurement. The contract would be owned by the person who is measuring his or her blood and we don’t need to send or receive these measurements, just link them to the contract.

The WordChain project would be a bit more complex. It would be a bunch of items with relations, properties and children. Contracts and assets would be used to support updates to the texts with every edit of a WordChain item kicking the old item out of the contract and adding a new item into the contract. That would result in a contract per word in the database.

A lot of work is still required to make sure it works as well as I expect. It would not be the most ideal datamodel for all these projects but it helps me to focus more on the business layer and the GUI without worrying about any database changes. Once the business model becomes more advanced, I could create a second data layer with a better datamodel to improve the performance of the data management.

The challenge for the CART system.

In The CART datamodel I displayed the datamodel that would be required for the CART system. Basically, the data model would store the items, transactions and contracts while the templates will be stored in code, as XML structures that are serialized to objects and back again. As a result, I would split the relations from the data, thus allowing me to avoid regular updates to the database structure. All that I might want to update are the templates but even that might not be required for as long as the software will support “older versions” of those objects.

But serializing those objects to and from XML isn’t as easy as it seems. Since I’ve separated data from relations, the data itself doesn’t know its own ID. Why? Because the ID is part of the relation, thus I would not need to store it within the XML. (It would be redundant.) But if I want to use these objects through a web service, having the ability to send this ID to the client is very practical, so I can send back changes through the same web service. I would need the ID to tell the server what to do with which object.

Thus I’ll end up with two methods of serializations. One is just to serialize the data itself. The other is the data plus its ID. And now I will have to decide a way to support both. Preferably in a simple way that would not require me to generate lots and lots of code.

In the data layer, I would split up every object into a relation and a data part. The data would be stored as XML within the relation object. To make sure the relation object will be able to expose the object, I would need to give it a DataObject property that would be the data object. Its get/set methods should connect to the XML inside, preferably even by using an object as buffer so I don’t have to serialize it over and over again.
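A rough Python sketch of such a relation object, with the buffered data object and both ways of serializing (with and without the ID). The `Product` template, the `Record` class and the helper names are assumptions for illustration; the real thing would be C# on top of the .NET serializers.

```python
import uuid
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class Product:  # hypothetical data template
    name: str
    price: str


def to_xml(obj, record_id=None) -> str:
    root = ET.Element(type(obj).__name__)
    if record_id is not None:
        root.set("id", str(record_id))  # only the web-service form carries the ID
    for key, value in asdict(obj).items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def product_from_xml(text: str) -> Product:
    root = ET.fromstring(text)
    return Product(name=root.findtext("name"), price=root.findtext("price"))


class Record:
    """The relation part: owns the ID and the stored XML."""

    def __init__(self, xml: str):
        self.id = uuid.uuid4()
        self._xml = xml
        self._buffer = None  # deserialized object, created on first access

    @property
    def data_object(self) -> Product:
        if self._buffer is None:
            self._buffer = product_from_xml(self._xml)
        return self._buffer

    @data_object.setter
    def data_object(self, value: Product) -> None:
        self._buffer = value
        self._xml = to_xml(value)

    def for_storage(self) -> str:
        return self._xml  # data only, no ID

    def for_client(self) -> str:
        return to_xml(self.data_object, self.id)  # data plus ID


record = Record(to_xml(Product("Flour", "1.25")))
```

Reading `record.data_object` twice returns the same buffered object, so the XML is parsed only once.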

In the business layer, I should not have an XML property, nor should I have a DataObject property. The data fields should be there, with the ID. And basically, I would need a mapping to convert between the data layer and the business layer. The trouble with this approach is that I would define all data twice. Once in the data template and once in the business layer. That’s not very smart. I need to re-use things…

I’m considering adding my own serialization method for the data templates. This means that I will include the ID within the template, so it becomes part of the object. All properties would be defined as data members, including the ID. That way, the ID is sent from the business layer to the client. But to store the template in the relation object, I would need to create my own solution.

One solution would be to implement methods that convert the data to XML, plus a constructor that accepts XML to create the object. It would also mean that I need a way to recognize each object type so I can call the proper constructor, and I would probably have to inherit a bunch of code or write interfaces to make the objects more practical to call. It would be complex…

Another solution would be by defining my own attributes. One would be for the class name, thus allowing me to find classes based on this custom attribute. The other would be for the property and my code would just use all of those to generate the XML or to read it back again. This too would allow custom field names. It would be a cleaner solution since I would define the template as just a bunch of properties. Which, basically, they are.
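To illustrate the idea, here is a Python sketch in which a class decorator and field metadata play the role that custom attributes would play in .NET. All names (`template`, `xml_field`, `TEMPLATES`, `Book`) are assumptions; the registry is what lets the code find the right class, and thus the right constructor, from the XML alone.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field, fields

TEMPLATES = {}  # XML element name -> class


def template(xml_name):
    """Class-level marker: registers the class under its XML name."""
    def register(cls):
        cls = dataclass(cls)
        cls.xml_name = xml_name
        TEMPLATES[xml_name] = cls
        return cls
    return register


def xml_field(name):
    """Property-level marker: stores a custom XML field name."""
    return field(default="", metadata={"xml": name})


@template("book")
class Book:
    title: str = xml_field("Title")
    author: str = xml_field("Author")


def serialize(obj) -> str:
    root = ET.Element(obj.xml_name)
    for f in fields(obj):
        ET.SubElement(root, f.metadata["xml"]).text = getattr(obj, f.name)
    return ET.tostring(root, encoding="unicode")


def deserialize(text: str):
    root = ET.fromstring(text)
    cls = TEMPLATES[root.tag]  # the registry picks the class to construct
    values = {f.name: root.findtext(f.metadata["xml"], default="") for f in fields(cls)}
    return cls(**values)


xml = serialize(Book("Dune", "Frank Herbert"))
restored = deserialize(xml)
```

The template itself stays a plain list of properties; everything else is driven by the markers.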

But this second solution is a bit complex, since I still need a way to call the constructor of these specific classes. So I’ve opened a question on StackOverflow, hoping I will get an interesting answer that would solve this easily. Why? Because part of being a good developer is to ask other experts for possible solutions when you don’t have a good answer yourself! 🙂

The CART datamodel

Well, my back problems made me think a lot about the CART system that I’ve mentioned before. And it made me consider how I need to handle the difference between plain data and the relationship between the objects. Because the most troubling thing I had to deal with was that I did not want to change my datamodel for every new item that I want to store. So it made me think…

The CART system is mostly created to handle relationships between items, transactions and contracts. It’s not about handling of the data itself. Actually, the system doesn’t even care about the data. All that matters are the relationships. Data is just something to display to the user, something to store but normally not something that you’ll need to process very often at the data layer. So, considering the fact that you can serialize objects in .NET to XML data, I’ve decided to support a basic structure for my Garage Sale project for all the items, transactions and contracts. And each of them will contain a Data property that has the serialized data, that I could convert to data objects and back again.

This idea makes it also more clear where the templates are within my system. My templates are these object definitions! I will have a very generic database with a very simple layout, and I can generate a very complex business layer around this all that I can change as often as I like without the need to change my database model. As a result, I might never have to change the data model again!

Of course it will have a drawback, since the database will contain serialized objects. If I change those objects, I will also need to keep track of the changes in those stored structures and either update them or keep up multiple versions of those objects. Updating those structures would mean that I have to create update apps that know both the old structures and the new structures. It should then convert each old structure to a new structure. Maintaining multiple versions might be more practical since that would leave all old data intact in your system. Anyways, it’s some added complexity that I’ll have to consider.
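One way to keep multiple versions readable is to store a version number with each serialized object and upgrade old structures when they are loaded. The Python sketch below assumes a `version` attribute on the root element and a made-up change (renaming `<cost>` to `<price>`); neither is part of the actual design.

```python
import xml.etree.ElementTree as ET

CURRENT_VERSION = 2


def upgrade_v1_to_v2(root: ET.Element) -> None:
    # Hypothetical change: version 2 renamed <cost> to <price>.
    cost = root.find("cost")
    if cost is not None:
        cost.tag = "price"
    root.set("version", "2")


# One upgrade step per old version; new steps are added as the templates evolve.
UPGRADES = {1: upgrade_v1_to_v2}


def load(xml_text: str) -> ET.Element:
    """Parse stored XML and upgrade it step by step to the current structure."""
    root = ET.fromstring(xml_text)
    version = int(root.get("version", "1"))
    while version < CURRENT_VERSION:
        UPGRADES[version](root)
        version = int(root.get("version"))
    return root


old = load('<Product version="1"><cost>3.50</cost></Product>')
new = load('<Product version="2"><price>4</price></Product>')
```

The stored data stays untouched; only the in-memory copy is brought up to date, which matches the “maintain multiple versions” option.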

But now, my datamodel as created by Visual Studio 2012 by using the Entity Framework: [Image: EF-CART]

So, what do you see?

  • The DataObject is the base class for any CART object. It has a unique identifier, a name that can be used to show the related object and a Data property that will contain an object as XML.
  • DataItem is a generic item class, containing an additional description just for more practical reasons. When a user wants to select an existing item, having a description makes it possible to avoid reading the object within the data.
  • The Collection table is just an item with an extra bonus. It can contain multiple child items without the need for any transactions. Actually, this is just a shortcut solution to allow more complex structures within your items. For example, you might create a complete Household collection containing husband, wife, four children and a dog. And although you could link them together by using transactions, having them in a collection just saves the need to create those transactions.
  • DataTransactions is the base class for the transactions, having a sender, receiver and subject item connected together. It also has a link to a rule and a timestamp, indicating when the transaction took place. (Or will take place for future transactions.)
  • IntegerTransaction is just a basic transaction with a multiplier. This way, you don’t have to add a lot of transactions when your customer buys ten bags of flour.
  • DecimalTransaction is also a basic transaction that will handle items that can be traded in all kinds of different numbers, including fractional amounts. For example, the price of a product, or its weight, length or light intensity.
  • DataRule is the basic contract type. It’s a collection of transactions that are all related to one another. For example, the sale of a product would result in a sale rule.
  • The Contract class is more than just a rule. It’s a rule that can contain child rules, thus allowing structured contracts that are made up of several subcontracts. These are used when you have to deal with complex situations, like mortgages. A mortgage would include a rule for purchasing a house, a rule for lending money and paying it back, plus other rules for all kinds of insurances.
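The classes in this list can be sketched as a small Python hierarchy. This is not the Entity Framework model itself: the field names are assumptions based on the descriptions above, and the defaults exist only to keep the example short.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass
class DataObject:
    name: str
    data: str = ""  # the serialized object, as XML
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class DataItem(DataObject):
    description: str = ""


@dataclass
class Collection(DataItem):
    children: list = field(default_factory=list)  # child items, no transactions needed


@dataclass
class DataRule(DataObject):
    transactions: list = field(default_factory=list)


@dataclass
class Contract(DataRule):
    child_rules: list = field(default_factory=list)


@dataclass
class DataTransactions(DataObject):
    sender: Optional[DataItem] = None
    receiver: Optional[DataItem] = None
    subject: Optional[DataItem] = None
    # Excluded from repr/compare because the rule points back at its transactions.
    rule: Optional[DataRule] = field(default=None, repr=False, compare=False)
    timestamp: datetime = field(default_factory=datetime.now)


@dataclass
class IntegerTransaction(DataTransactions):
    multiplier: int = 1


@dataclass
class DecimalTransaction(DataTransactions):
    amount: Decimal = Decimal("0")


shop, customer, flour = DataItem("Shop"), DataItem("Customer"), DataItem("Bag of flour")
sale = DataRule("Sale")
sale.transactions.append(IntegerTransaction(
    "Ten bags of flour", sender=shop, receiver=customer,
    subject=flour, rule=sale, multiplier=10))
household = Collection("Household", children=[DataItem("Husband"), DataItem("Wife")])
```

One ‘IntegerTransaction’ with a multiplier of ten replaces ten separate transactions.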

Now, as I’ve said before, this datamodel should offer me more than enough structural parts to use for my Garage Sale project. All I need to do is compile it and then just leave it alone. There should not be a need to include anything else.

Well, okay… That’s not completely true, since I might want to handle user accounts and store large images of products. But these things would actually require new database structures and should preferably be part of separate databases.

Looking back at this design, it surprises even me how little data it actually has. But the trick here is that I’ve separated the relationships between objects from the actual data itself. Each object can still contain a huge amount of data. But it’s just not important for the model itself.