As you probably recall, ADO.Net uses a "disconnected" database model, meaning that it makes a temporary connection to a database to pull the data and all of the data is locally cached in RAM on the client PC. The DataSet object is, in many respects, a miniature in-memory relational database engine.
This might be a good time to review Using ADO.Net DataRelations
By using the built-in features of ADO.Net you can perform many database kinds of things like:
- Perform SQL-style queries
- Perform aggregations (like SUM, etc.) via expressions
- Perform an SQL-style database join
SQL Join syntax
Joining multiple Tables in a relational database (such as SQL Server) is done via an SQL statement such as:
SELECT Customers.CompanyName, Orders.OrderDate
FROM Customers
INNER JOIN
Orders
ON Customers.CustomerID = Orders.CustomerID
In this example, the Customers table is joined to the Orders table based upon a relationship between the two tables where the CustomerID is the same.
There is no native ADO.Net support for parsing a complex SQL statement like the one above, but you can write a class module that will take two DataTables and join them together in much the same way as it's done via an SQL statement.
The trick is to leverage the ADO.Net feature called DataRelations. The DataSet object can hold a collection of relationships between DataTables. After you have created a relationship, you can then use this relationship to get data from a related table.
Documentation Links
The Example Code
Let's look at a simple example of a LEFT join:
Dim dr_left, dr_right, dr_dest As DataRow Dim dt_left, dt_right As DataTable Dim dt As New DataTable ' create a relationship between the two tables ds.Relations.Add(New DataRelation("__RELATIONSHIP__", dc_right, dc_left, False)) For Each dr_left In dt_left.Rows dr_dest = dt.NewRow ' copy the data from the columns of the Left table For Each column In dt_left.Columns dr_dest(column.ColumnName) = dr_left(column.ColumnName) Next ' get the related rows via the Relationship dr_right = dr_left.GetParentRow("__RELATIONSHIP__") If Not IsNull(dr_right) then ' copy the data from the columns of the Right table For Each column In dt_right.Columns dr_dest(column.ColumnName) = dr_right(column.ColumnName) Next End If ' add the row to the ResultSet dt.Rows.Add(dr_dest) Next ' delete the temporary relationship ds.Relations.Remove("__RELATIONSHIP__")
Let's examine this example a little closer. The first thing we do is create a new DataRelation between the two tables. The last argument in the DataRelations constructor is set to False because we do not want to create a corresponding foreign key constraint to go along with our relationship.
' create a relationship between the two tables ds.Relations.Add(New DataRelation("__RELATIONSHIP__", dc_right, dc_left, False))
Note: While it's true that most SQL joins are performed on Primary Keys in one table against Foreign Keys on the other table, that is not a requirement. So, we can't assume that creating a foreign key constraint would be successful.
The next step is to copy all of the data from the columns in the left table. Since this example is a LEFT join, all rows from the left table should appear in the result set.
' copy the data from the columns of the left table For Each column In dt_left.Columns dr_dest(column.ColumnName) = dr_left(column.ColumnName) Next
This step is where the magic of Relations comes to play... we take a row from the left table and then get the related row from the right table. Then we copy the data from the right table's row into the result set just like we did with the left table.
' get the related rows via the Relationship dr_right = dr_left.GetParentRow("__RELATIONSHIP__") If Not IsNull(dr_right) then ' copy the data from the columns of the Right table For Each column In dt_right.Columns dr_dest(column.ColumnName) = dr_right(column.ColumnName) Next End If
Real-World Example
OK, now that you've got the basics of doing Relationship joins, let's look at a real-world function that can join two tables using LEFT, RIGHT, INNER, and FULL join methods. In this demo function, we create unique column names for the Result Set, since column name collisions are otherwise possible.
' Join two tables together Private Function JoinTables(ByVal LeftTable As DataTable, ByVal RightTable As DataTable, _ ByVal LeftColumn As String, ByVal RightColumn As String, ByVal JoinType _ As String) As DataTable Dim dr_left, dr_right, dr_dest, dr_temp, drs() As DataRow Dim dt, left_dt, right_dt As DataTable Dim right_col, left_col, column As String Dim i As Integer Dim RightColumnList, LeftColumnList As ArrayList ' build the detached TempTable that will be returned dt = BuildTempTable() left_dt = LeftTable right_dt = RightTable left_col = LeftColumn right_col = RightColumn ' what kind of join is this? Select Case JoinType Case "INNER" Case "LEFT" Case "RIGHT" ' We don't really do right joins... we just flip the arguments and ' do a left join instead. left_dt = RightTable right_dt = LeftTable left_col = RightColumn right_col = LeftColumn JoinType = "LEFT" Case "FULL" ' A full join is kinda complicated... we do a right join, then a left, ' then combine the two together. The optimum execution plan for a full ' join should be a "merge" operation... but let's not make things any ' more complicated than they already are. Dim temp As DataTable temp = JoinTables(RightTable, LeftTable, RightColumn, LeftColumn, "LEFT") dt = JoinTables(LeftTable, RightTable, LeftColumn, RightColumn, "LEFT") drs = temp.Select(TabColName(LeftTable.TableName, LeftColumn) & " is null") For Each dr_right In drs dt.ImportRow(dr_right) Next Return dt Case Else Throw New ApplicationException("SQL syntax error: Unknown Join type '" & _ JoinType & "'") End Select ' create a relationship between the two tables ds.Relations.Add(New DataRelation("__RELATIONSHIP__", right_dt.Columns(right_col), _ left_dt.Columns(left_col), False)) ' let's go! dr_temp = dt.NewRow LeftColumnList = GetTableColumns(left_dt.TableName) RightColumnList = GetTableColumns(right_dt.TableName) For Each dr_left In left_dt.Rows ' Get the related rows from the "right" table drs = dr_left.GetParentRows("__RELATIONSHIP__") ' For inner joins, we don't record anything unless there is a matching row If UBound(drs) >= 0 Or JoinType <> "INNER" Then dr_dest = dt.NewRow ' Let's start by just copying the columns from the "left" table For Each column In LeftColumnList dr_dest(TabColName(left_dt.TableName, column)) = dr_left(column) Next ' There are three possibilities... there are no matching rows, there is ' only one related row, there are many related rows. Select Case UBound(drs) Case -1 ' Just record the row as it is now (with just the columns from ' the left table). dt.Rows.Add(dr_dest) Case 0 dr_right = drs(0) For Each column In RightColumnList dr_dest(TabColName(right_dt.TableName, column)) = dr_right(column) Next dt.Rows.Add(dr_dest) Case Else ' Make a copy of the prototype datarow that we already filled in ' above. It already has the column data from the left table. dr_temp.ItemArray = dr_dest.ItemArray For Each dr_right In drs dr_dest = dt.NewRow ' Copy prototype row (the left table's data) dr_dest.ItemArray = dr_temp.ItemArray ' Copy the columns from the related rows in the right table For Each column In RightColumnList dr_dest(TabColName(right_dt.TableName, column)) = dr_right(column) Next dt.Rows.Add(dr_dest) Next End Select End If Next ' delete the temporary relationship we created above ds.Relations.Remove("__RELATIONSHIP__") Return dt End Function ' Make a unique column name Private Function MakeColumnName(ByVal TableName As String, ByVal ColumnName As _ String) As String Return (TableName.Replace(" ", "_").Trim("[]".ToCharArray) & "_" & _ ColumnName.Replace(" ", "_").Trim("[]".ToCharArray)) End Function
Note: The example above does not show the GetTableColumns() function which included in the source code example.
Downloads/Links
Read a related article on Using ADO Expressions
Read a related article on Using ADO DataRelations
Read a related article on In-Memory SQL Engine
Download the complete VB.Net Source code example used in this article:
Join3.zip