LINQ to XML

Getting Started with LINQ and XML

By , About.com Guide

Updated February 11, 2012

Since it was introduced in VB.NET 2008, LINQ has become fundamental in VB.NET, but it's still widely misunderstood. This article is one of a series at About Visual Basic that covers the major ways to program with LINQ. The goal is for a beginning Visual Basic .NET programmer to get up to speed on LINQ. If you found this article from a search and would like to start at the beginning, you can find an introduction to the overall LINQ technology (what it covers and what you can do with it) in the article LINQ - An Example Driven Introduction, which includes an example showing how LINQ could be used to start web pages using selections made with CheckBox controls. There is an index to all of the articles in the series at the beginning of that article.

LINQ isn't just a single technology. It's a whole grab bag of technologies and one of the things that sometimes confuses people is just how to classify it in their mind. Since the fundamental purpose of LINQ is to access data stores, one way to classify the different parts of LINQ is to divide LINQ according to the data store that is addressed. For example, another major part of LINQ is LINQ to SQL for SQL Server data. (There's an About Visual Basic article about that here.) This article introduces the way to use LINQ to access XML data stores.

LINQ to XML - Taking the pain out of XML programming

XML was available to your application before LINQ to XML, but it wasn't even close to being as easy to use. Here's the old way compared with LINQ to XML.

In the bad old days ... not that old, actually, just before VB.NET 2008 ... you would typically use the W3C DOM API objects to code XML in VB.NET. Here's how that would look using the XML created for the example later in this article:


Imports System.Xml

Dim xmlDoc As New XmlDocument
Dim xmlQTopicCodes As XmlElement
Dim xmlQTCode As XmlElement
Dim xmlQTID As XmlAttribute
Dim xmlQCodes As XmlElement = xmlDoc.CreateElement("QCodes")
xmlDoc.AppendChild(xmlQCodes)
xmlQTopicCodes = xmlDoc.CreateElement("QTopicCodes")
xmlQTCode = xmlDoc.CreateElement("QTCode")
xmlQTCode.InnerText = "Knowledge of Chocolate"
xmlQTID = xmlDoc.CreateAttribute("QTID")
xmlQTID.InnerText = "CH"
xmlQTCode.Attributes.Append(xmlQTID)
xmlQTopicCodes.AppendChild(xmlQTCode)
xmlQTCode = xmlDoc.CreateElement("QTCode")
xmlQTCode.InnerText = "Underwater Basket Weaving"
xmlQTID = xmlDoc.CreateAttribute("QTID")
xmlQTID.InnerText = "TT"
xmlQTCode.Attributes.Append(xmlQTID)
xmlQTopicCodes.AppendChild(xmlQTCode)
xmlQTCode = xmlDoc.CreateElement("QTCode")
xmlQTCode.InnerText = "Chromatic Hue of Flea Feathers"
xmlQTID = xmlDoc.CreateAttribute("QTID")
xmlQTID.InnerText = "FH"
xmlQTCode.Attributes.Append(xmlQTID)
xmlQTopicCodes.AppendChild(xmlQTCode)
xmlQCodes.AppendChild(xmlQTopicCodes)

After you create an XML document using the DOM, processing the XML is just as ugly.
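To make "just as ugly" concrete, here is a minimal sketch of reading the topic codes back with the DOM. It is an illustration, not code from the original article: the module name is made up, and the document is loaded from a string (rather than built with the statements above) only to keep the sketch self-contained.

```vbnet
Imports System
Imports System.Xml

Module DomReadSketch
    Sub Main()
        ' Assumption: the same structure as above, loaded from a string for brevity.
        Dim xmlDoc As New XmlDocument()
        xmlDoc.LoadXml(
            "<QCodes><QTopicCodes>" &
            "<QTCode QTID=""CH"">Knowledge of Chocolate</QTCode>" &
            "<QTCode QTID=""TT"">Underwater Basket Weaving</QTCode>" &
            "</QTopicCodes></QCodes>")

        ' SelectNodes takes an XPath string; each result comes back as a general XmlNode.
        For Each node As XmlNode In xmlDoc.SelectNodes("/QCodes/QTopicCodes/QTCode")
            ' A cast is needed before element-specific members can be used.
            Dim code As XmlElement = CType(node, XmlElement)
            Console.WriteLine(code.GetAttribute("QTID") & ": " & code.InnerText)
        Next
    End Sub
End Module
```

The path is a plain string that the compiler cannot check, and every node has to be cast before use. Those are the two pain points that the LINQ to XML syntax removes.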

Fortunately, using the objects introduced with LINQ to XML, none of this is necessary anymore. The entire mess can be replaced with the much more direct and maintainable:


Dim xmlQTopicCodes As XElement =
    <QCodes>
        <QTopicCodes>
            <QTCode QTID="CH">Knowledge of Chocolate</QTCode>
            <QTCode QTID="TT">Underwater Basket Weaving</QTCode>
            <QTCode QTID="FH">Chromatic Hue of Flea Feathers</QTCode>
        </QTopicCodes>
    </QCodes>

The illustration shows that the result is identical, except for the initial XML declaration. The LINQ way is usually called functional construction.

[Illustration: the XML produced by the DOM code and by the XML literal]

The XML objects in System.Xml.Linq also have the advantage that you don't even have to create a complete XML document. You can just work with a fragment if that's all your program needs. As a syntax note, remember that in LINQ, you use XElement. The W3C DOM API uses XmlElement.
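As a rough sketch of working with a fragment, the code below builds a single QTCode element on its own, once with the XElement and XAttribute constructors and once as an XML literal. In Microsoft's documentation, the term functional construction is most often illustrated with the nested-constructor form. The module name and the choice of element are assumptions made for illustration.

```vbnet
Imports System
Imports System.Xml.Linq

Module FragmentSketch
    Sub Main()
        ' One element with one attribute; no document or root element is required.
        Dim fragment As New XElement("QTCode",
            New XAttribute("QTID", "CH"),
            "Knowledge of Chocolate")

        ' The XML literal form of the same fragment.
        Dim literal As XElement = <QTCode QTID="CH">Knowledge of Chocolate</QTCode>

        Console.WriteLine(fragment)
        ' DeepEquals compares the names, attributes, and content of the two trees.
        Console.WriteLine(XNode.DeepEquals(fragment, literal))   ' True
    End Sub
End Module
```

Both forms produce the same XElement object, so you can use whichever reads better at a given point in the program.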

In order to support LINQ, Microsoft had to add two basic capabilities to the VB.NET language: XML literals and XML axis properties.

-> XML literals enable you to include XML directly in your code.
-> XML axis properties enable you to access XML nodes and attributes.

The example above shows you an XML literal. You'll find the syntax to be almost the same as XML 1.0.

LINQ to XML programming is as simple as possible. The hardest thing to get right is the syntax. The concept of a namespace is a good example. It isn't required, but it's usually a good idea. The same XElement shown before can be updated with a namespace name very simply. First, add an Imports statement referencing the namespace:


Imports <xmlns:avb="http://VisualBasic.About.Com">

"xmlns" is a keyword that will always be the same. "avb" is the XML prefix. Namespaces and namespace prefixes cause a lot of confusion, but they're not a Microsoft invention; they're part of XML, and the rules for them are, again, very close to the rules for XML 1.0. In brief, a prefix is just shorthand for a longer, presumably unique name, here http://VisualBasic.About.Com. You will find URLs used almost exclusively for namespace names precisely because they're unique. (If they weren't unique, the Internet wouldn't work.) As namespace names, they're never used as actual URLs. Once the Imports statement is in place, you can just use the shorter prefix. For example:


Dim xmlQTopicCodes As XElement =
    <avb:QCodes>
        <avb:QTopicCodes>
... etc.

Visual Studio IntelliSense enters the closing tag, prefix included, for you.


... 
    </avb:QCodes>

When you save the file, the namespace name and prefix are saved as part of the file too, though not in exactly the form you typed, as shown below:

[Illustration: the saved file, with the namespace name and prefix declared in the XML]
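You can inspect the same thing without saving a file. The sketch below is an illustration with an assumed module name and a cut-down tree. It writes the element to the console and then shows that the element's real name carries the full namespace, with the prefix acting only as shorthand.

```vbnet
Imports System
Imports System.Xml.Linq
Imports <xmlns:avb="http://VisualBasic.About.Com">

Module NamespaceOutputSketch
    Sub Main()
        ' A cut-down version of the tree, just enough to show the namespace.
        Dim xmlQTopicCodes As XElement =
            <avb:QCodes>
                <avb:QTopicCodes/>
            </avb:QCodes>

        ' Writes the element's markup; the xmlns:avb declaration should appear
        ' on the root element even though it was never typed in the literal.
        Console.WriteLine(xmlQTopicCodes)

        ' The full name is the namespace name plus the local name.
        Console.WriteLine(xmlQTopicCodes.Name.NamespaceName)   ' http://VisualBasic.About.Com
        Console.WriteLine(xmlQTopicCodes.Name.LocalName)       ' QCodes
    End Sub
End Module
```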

You can also define a default namespace that will apply to any XML object that doesn't have a prefix by leaving the prefix out of the namespace declaration:


Imports <xmlns="http://About.com">
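A small sketch of the default namespace in action follows. The module name and the one-item tree are assumptions for illustration. The elements are written without any prefix, yet they end up in the imported namespace. Attributes are different: an unprefixed attribute stays in no namespace.

```vbnet
Imports System
Imports System.Xml.Linq
Imports <xmlns="http://About.com">

Module DefaultNamespaceSketch
    Sub Main()
        ' No prefix anywhere, yet both elements land in the default namespace.
        Dim root As XElement =
            <QCodes>
                <QTCode QTID="CH">Knowledge of Chocolate</QTCode>
            </QCodes>

        Console.WriteLine(root.Name.NamespaceName)   ' http://About.com

        ' Unprefixed names in axis properties use the same default namespace,
        ' so the child is found without a prefix. The attribute needs none either.
        Console.WriteLine(root.<QTCode>.@QTID)
    End Sub
End Module
```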

You might be wondering, "Why go to all this trouble? LINQ to XML works without namespaces and they look like extra keystrokes to me!"

Namespaces prevent one name from being confused with another. For example, how many times do you think "AccountNum" gets used as a name? Using namespaces, you can make one AccountNum different from another one. If you're using XML documents from Company A and Company B in the same program, you can code:


Imports <xmlns:coa="http://coa.com">
Imports <xmlns:cob="http://cob.com">
Imports <xmlns:avb="http://VisualBasic.About.Com">
...
Dim bothCompanies As XElement =
    <avb:AccountNums>
        <coa:AccountNum>12345</coa:AccountNum>
        <cob:AccountNum>67890</cob:AccountNum>
    </avb:AccountNums>
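To check that the two AccountNum elements really are different names, a sketch like the following can help. The module and variable names are illustrative. It uses GetXmlNamespace to turn each imported prefix into an XNamespace and then looks up each element by its full name.

```vbnet
Imports System
Imports System.Xml.Linq
Imports <xmlns:coa="http://coa.com">
Imports <xmlns:cob="http://cob.com">
Imports <xmlns:avb="http://VisualBasic.About.Com">

Module TwoCompaniesSketch
    Sub Main()
        Dim bothCompanies As XElement =
            <avb:AccountNums>
                <coa:AccountNum>12345</coa:AccountNum>
                <cob:AccountNum>67890</cob:AccountNum>
            </avb:AccountNums>

        ' Namespace plus local name gives the full XName of each element.
        Dim coaName As XName = GetXmlNamespace(coa) + "AccountNum"
        Dim cobName As XName = GetXmlNamespace(cob) + "AccountNum"

        Console.WriteLine(coaName = cobName)                      ' False
        Console.WriteLine(bothCompanies.Element(coaName).Value)   ' 12345
        Console.WriteLine(bothCompanies.Element(cobName).Value)   ' 67890
    End Sub
End Module
```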

Now that we have a more fully qualified XElement, we can use the LINQ to XML axis properties to reference the XML.

The XML axis properties include:


-> Attribute             attributes of an XElement
-> Child                 children of an XElement
-> Descendant            descendants of an XElement
-> Extension Indexer     individual elements in a collection
                            of XElement or XAttribute
-> Value                 the value of the first element of
                            a collection of XElement or XAttribute
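The list above can be sketched in a few lines. This example is an illustration with an assumed module name. It uses the first, namespace-free version of the XML so that no prefixes are needed, and it declares variable types explicitly so it does not depend on the Option Infer setting.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module AxisPropertySketch
    Sub Main()
        Dim xmlQTopicCodes As XElement =
            <QCodes>
                <QTopicCodes>
                    <QTCode QTID="CH">Knowledge of Chocolate</QTCode>
                    <QTCode QTID="TT">Underwater Basket Weaving</QTCode>
                    <QTCode QTID="FH">Chromatic Hue of Flea Feathers</QTCode>
                </QTopicCodes>
            </QCodes>

        ' Child axis: direct children named QTopicCodes.
        Dim topics As IEnumerable(Of XElement) = xmlQTopicCodes.<QTopicCodes>
        ' Descendant axis: every QTCode at any depth below the root.
        Dim codes As IEnumerable(Of XElement) = xmlQTopicCodes...<QTCode>

        Console.WriteLine(topics.Count())    ' 1
        Console.WriteLine(codes.Count())     ' 3
        ' Extension indexer (zero-based) followed by the attribute axis.
        Console.WriteLine(codes(1).@QTID)    ' TT
        ' Value: the text of the first element in the collection.
        Console.WriteLine(codes.Value)       ' Knowledge of Chocolate
    End Sub
End Module
```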

Using the XElement shown earlier, we can use a few of these axis properties to transform the XML, that is, to change it into a different XML structure.


Dim playerScores =
    <playerIDs>
        <%= From qtCode In
            xmlQTopicCodes...<avb:QTCode>
            Select <theIDs>
                       <%= qtCode.@avb:QTID %>
                   </theIDs>
        %>
    </playerIDs>

The Attribute property is used in:


<%= qtCode.@avb:QTID %>

The "@" syntax indicates that it's an attribute. The avb: prefix matches only if the attribute itself was written with that prefix (avb:QTID="CH"). An attribute written without a prefix is in no namespace and is read as qtCode.@QTID.

The Descendant property is used in:


xmlQTopicCodes...<avb:QTCode>

The three dots substitute for the missing levels in the XML structure.

The code also shows how to use an embedded expression: an expression placed inside an XML literal and evaluated at run time. The syntax is the same as the one used in ASP.NET: <%= expression %>.
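Putting the pieces together, here is a complete sketch of a transformation that can be pasted into a console project. It is an illustration under stated assumptions, not the article's original program: the module name and the theIDs/theID element names are made up, the tree is shortened to two topics, and the QTID attributes are left unprefixed, so they are read with .@QTID rather than with a prefix.

```vbnet
Imports System
Imports System.Linq
Imports System.Xml.Linq
Imports <xmlns:avb="http://VisualBasic.About.Com">

Module EmbeddedExpressionSketch
    Sub Main()
        ' Assumption: elements carry the avb prefix, attributes do not.
        Dim xmlQTopicCodes As XElement =
            <avb:QCodes>
                <avb:QTopicCodes>
                    <avb:QTCode QTID="CH">Knowledge of Chocolate</avb:QTCode>
                    <avb:QTCode QTID="TT">Underwater Basket Weaving</avb:QTCode>
                </avb:QTopicCodes>
            </avb:QCodes>

        ' The outer embedded expression holds a query that yields one element
        ' per topic; the inner one inserts a single attribute value as text.
        Dim idList As XElement =
            <theIDs>
                <%= From qtCode As XElement In xmlQTopicCodes...<avb:QTCode>
                    Select <theID><%= qtCode.@QTID %></theID>
                %>
            </theIDs>

        ' Prints CH and then TT, one per line.
        For Each idElement As XElement In idList.<theID>
            Console.WriteLine(idElement.Value)
        Next
    End Sub
End Module
```

The new theIDs and theID elements are written without a prefix and no default namespace is imported, so they are in no namespace. That is why idList.<theID> finds them without a prefix.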

LINQ to XML is a rich technology and we have only touched a few things that can be done. About Visual Basic has other articles covering this:

AutoTest - A LINQ to XML Example Program

... and ...

The end of Simple File Processing and VB.NET

Part 8 of the WPF Tutorial also includes a LINQ to XML example program.


©2012 About.com. All rights reserved.

A part of The New York Times Company.