Download the Local Data Model Generator source code.
Relational databases store the data in tables. When an application asks for the data, the data that matches the query is sent back to the application. What the developer should think about is “how the application should store the data locally so it can be effectively used”?
Before we actually try to answer the question we will talk about local data models. A local data model could be defined as the way the application stores the data in its internal data structures and the way it manipulates the data. The are 4 elements that form the local data model:
The answer is: it is up to you. The plain array is useful because of fast access to consecutive elements. The ArrayList is useful because it can grow and shrink without much overhead. The Hashtable is useful because the data row’s key can be used as the hash key, providing fast access to data that is somehow related inside the database.
What about the structure of classes?
class C_PublicData { ... public ArrayList Customers; } class C_Order { ... public int ID; public int ID_CUSTOMER; public DateTime date; } class C_Customer { ... public int ID; public string name; public ArrayList Orders; }Such structure is really handy when visual components must be filled with data. For example, TreeView control could be initialized very easily.
public class C_Data { ... public static Hashtable C_Customers; public static Hashtable C_Orders; } public class C_Customer { ... public string name; public Hashtable C_Orders { get { Hashtable hRet = new Hashtable(); foreach ( C_Order order in C_Data.C_Orders.Values ) if ( order.ID_CUSTOMER == this.ID ) hRet.Add( order.ID, order ); return hRet; } } } public class C_Order { ... public int ID; public int ID_CUSTOMER; public DateTime date; public C_Customer L_Customer { get { return C_Data[ID_CUSTOMER] as C_Customer; } } }
How then the data model for a real application should be chosen? Well, it depends on many factors among which we mention:
The model is then usable for relatively small databases and does not scale well to big databases (millions of records).
CREATE TABLE [dbo].[S_CITIES] ( [ID] [int] IDENTITY (1, 1) NOT NULL , [name] [varchar] (28) ) ON [PRIMARY] CREATE TABLE [dbo].[U_USERS] ( [ID] [int] IDENTITY (1, 1) NOT NULL , [ID_CITY] [int] NOT NULL , [surname] [varchar] (28), [dateOfBirth] [datetime] NOT NULL ) ON [PRIMARY]will have a corresponding classes
public class CS_City { public int ID; public string name; public override string ToString() { return this.name; } public CS_City() {} } public class CU_User { public int ID; public int ID_CITY; public string surname; public DateTime dateOfBirth; public override string ToString() { return this.surname; } public CU_User() {} }Look at the definitions above. The relations between tables are modelled but not handled directly. Note that ID_CITY is a foreign reference to S_CITIES table. To efficiently handle relations, we will use Hashtables to actually store the data
public class C_PData { public static C_OHashTable hS_Cities = new C_OHashTable( typeof(CS_City), "S_CITIES", "ID", "Cities"); public static C_OHashTable hU_Users = new C_OHashTable( typeof(CU_User), "U_USERS", "ID", "Users"); private C_PData() {} }and fast Hashtable lookups to pick up related data
public class CU_User { ... public CS_City City { get { return (CS_City)C_PData.hS_Cities[ID_CITY]; } } }(do not worry about C_OHashTable now, the class inherits directly from Hashtable and basically contains some additional information about the collection name, type of elements etc.)
This will work as expected. Having an element of type CU_User, we can ask for corresponding City:
CU_User user; ... string cname = user.City.name; // get the name of user’s cityAny relations between tables can be handled in similar manner.
public abstract class C_Base { public int ID; ... public override int GetHashCode() { return ID.GetHashCode(); } } public class CU_User : C_Base { ... public class CS_City : C_Base { ...To load the data from the database, we will use one clever routine that will populate consecutive tables, using reflection to get the information about fields. All fields that should be initialized with database data must be previously marked with C_Field custom attribute to distinguish them from other fields that do not come from the database.
public class CU_User { [C_Field()] public int ID_CITY; [C_Field()] public string surname; [C_Field()] public DateTime dateOfBirth; ... } public const string SSelect0 = "SELECT * FROM {0}"; public static void Init_Table( C_OHashTable hDict, bool bClearTable ) { SqlCommand cmd; object item; SqlDataReader r = null; cmd = new SqlCommand( String.Format( SSelect0, hDict.NameOfTable ), C_PData.SqlConn ); r = cmd.ExecuteReader(); if ( bClearTable==true ) hDict.Clear(); while ( r.Read() ) { item = Activator.CreateInstance( hDict.ItemType ); foreach ( FieldInfo fI in hDict.ItemType.GetFields() ) foreach ( object oA in fI.GetCustomAttributes(true) ) if ( oA is C_Field ) { hDict.ItemType.InvokeMember( fI.Name, BindingFlags.SetField, null, item, new object[] { r[fI.Name] } ); } C_Base bItem = (C_Base)item; bItem.ID_DB = bItem.ID; bItem.modified = false; hDict.Add ( bItem.ID, item ); }Look carefully what is going on here. We read all records from the table (SELECT * FROM …), then we step through consecutive rows and try to pick from the SqlDataReader these columns that correspond to fields marked by C_Field in class definition.
This way the Init_Table method is polymorphic - it only needs a reference to the collection and then it can populate the collection with the data from the database.
public static void InitAllData() { ... C_PDataInits.Init_Table ( C_PData.hS_Cities, true ); C_PDataInits.Init_Table ( C_PData.hU_Users, true ); ...
In our model, the changes will not be sent to the server immediately. They will be rather accumulated and sent to the server at wish as one big portion of information.
The most interesting and difficult part is the dynamic conversation with the database server. Feedback is necessary because the server assigns IDs to newly created rows. These newly assigned IDs must be at once used at client side to store the other newly created elements. What we need is a clever strategy that would allow to gather the information about new IDs from server and use it at client side.
A newly created ID for element E, returned by the server can be of course propagated to all elements that are in relation with E. This, however, would be extremely inefficient. Imagine that in one session 100 new cities would be added and 1000 new users that come from these new cities. After each city would be stored in the database, its new ID should be propagated to each customer that comes from that city (so that this customer could be stored in the database with proper ID_CITY value). It would either require to loop through the customers collection 100 times (to find customers from each new city) or hold an array of backward references from cites to its customers.
We will take another approach. Instead of updating the newly assigned ID, we will have TWO IDs in each element!
public abstract class C_Base { [C_Field()] public int ID; public int ID_DB; public bool modified; ...The first one, ID, will be used as the client ID. I call it client ID because it is used to link related elements at the client-side. The second one, ID_DB, will be dynamically assigned to new records by the database server and will be used in SQL queries by referenced elements.
public abstract class C_Base { ... static int newID = 0; public static int NewID { get { return newID--; } }Such element would be then added to the proper collection, using this new, fresh ID as the hash key. Note that this way there would be two categories of records in each collection:
public abstract class C_Base { ... virtual public string InsUpdSQLQuery( C_OHashTable hDict ) { return string.Empty; }Because of two categories of records, we will take two paths:
public class CS_City : C_Base { ... const string SInsert_0 = "INSERT {0} "+ "(name) "+ "VALUES ('{1}') "+ "SELECT CAST(SCOPE_IDENTITY() AS INT)"; const string SUpdate_0 = "UPDATE {1} "+ "SET name='{2}' "+ "WHERE {3}={0} "+ "SELECT {0}"; override public string InsUpdSQLQuery( C_OHashTable hDict ) { if ( this.ID_DB > 0 ) { return String.Format( SUpdate_0, ID_DB, hDict.NameOfTable, name, hDict.NameOfKey ); } else { return String.Format( SInsert_0, hDict.NameOfTable, name ); } } public class CU_User : C_Base { const string SInsert_0 = "INSERT {0} "+ "(ID_CITY, surname, dateOfBirth) "+ "VALUES ({1}, '{2}', '{3}') "+ "SELECT CAST(SCOPE_IDENTITY() AS INT)"; const string SUpdate_0 = "UPDATE {1} "+ "SET ID_CITY={2}, surname='{3}', dateOfBirth='{4}' "+ "WHERE {5}={0} "+ "SELECT {0}"; override public string InsUpdSQLQuery( C_OHashTable hDict ) { if ( this.ID_DB > 0 ) { return String.Format( SUpdate_0, ID_DB, hDict.NameOfTable, City.ID_DB, surname, dateOfBirth, hDict.NameOfKey ); } else { return String.Format( SInsert_0, hDict.NameOfTable, City.ID_DB, surname, dateOfBirth ); } }Look carefully what is going on here. If the element comes originally from the database, its ID_DB is positive (the field was initialized when record was loaded from the database). Such element produces the UPDATE query. All newly created elements produce INSERT query and the query returns the newly assigned ID back to the application. When the record needs a ID from another table, it asks the corresponding record for its ID_DB (user asks for its city’s ID_DB in the insert-part of InsUpdSQLQuery. These newly assigned ID_DBs are gathered at the client-side after the single insert/update query is sent to the database.
public static void InsUpd_Table_0( C_OHashTable hDict, int key ) { SqlCommand cmd; C_Base item; item = (C_Base)hDict[key]; cmd = new SqlCommand( item.InsUpdSQLQuery(hDict), C_PData.SqlConn, C_PData.sqlTran ); // assign a newly created database ID (or just copy the ID of updated element) item.ID_DB = (int)cmd.ExecuteScalar(); item.modified = false; } public static void InsUpd_Table_0( C_OHashTable hDict ) { C_Base item; IEnumerator ie = hDict.Keys.GetEnumerator(); while ( ie.MoveNext() ) { item = (C_Base)hDict[ie.Current]; if ( item.modified == true ) { InsUpd_Table_0 ( hDict, (int)ie.Current ); } } } public static void InsUpd_All() { ... InsUpd_Table_0 ( C_PData.hS_Cities ); InsUpd_Table_0 ( C_PData.hU_Users ); }Note that collections have to be stored in proper order. In particular, cities have to be stored before users, so that server could assign new ID_DBs for newly created cities and then these new ID_DBs could be possibly used when storing newly created users.
To help you with your research, I’ve developed a simple tool called Local Data Model Generator. It allows you to connect to a selected database, examine it and automatically produce all class files that form a local data model for that database according to the above description.
The tool is far from perfection and its usage is limited but it also a great start point for further research.
Local Data Model Generator v.1.0 profile:
Any feedback on this would be appreciated. Feel free to discuss any ideas from or from outside this paper.