
Reading XML via Go

posted Jul 29, 2011, 12:40 PM by Stan Steel   [ updated Jun 25, 2012, 10:08 PM ]
My 6-year-old daughter and I play Minecraft together on occasion to pass the time.  We are slowly building our own little world, just the two of us.  If you are not familiar with Minecraft, the game's central premise (the way we play) is surviving in a world filled with monsters while starting with nothing.  Success usually entails finding shelter and a fire source before the sun sets on the first day.  After that, the excitement is pretty sparse.  You spend a lot of time mining for resources and erecting humble structures.  To survive is to win.  A couple of weeks ago, we were playing together and I think we both came to the conclusion that we were bored.  Usually, when this happens, we start a new world together to relive the excitement of the first night.  This time, however, we decided it'd be cool to design a game together.  So, she spent the next hour drawing pictures of how the monsters, foliage, sky, and people should look.  I asked questions to keep her creativity directed toward the goal of game design.  After that, as with most of our projects, days came and went and I thought all was forgotten.

Fast forward to today.  I found myself trying to understand how to use the Go XML package to unmarshal an XML file into a set of types that I can use.  This was spurred by my daughter's recent question, "Are you still working on our game, Daddy?"  So, today I was trying to read a COLLADA file (a 3D model format) and display it in an OpenGL window.  My conclusion is that it isn't difficult at all.  It was actually easy; so easy, in fact, that I felt compelled to write this post afterward.  This is a summary of my experience after a couple of hours of effort.  Keep in mind that a lot of the time went toward remembering OpenGL, looking for examples to follow, and figuring out the COLLADA specification.  Here we focus on reading the COLLADA file.

Understanding the COLLADA File
The COLLADA format is defined by a pretty extensive XML schema; the full specification can be found here, and the summary that actually helped me understand it is found here.  In this post, I am focusing on the data I need to get to: vertices, normals, and triangles.  Here is an example file trimmed down to only the portions I cared about.

<?xml version="1.0" encoding="utf-8"?>
<COLLADA xmlns="http://www.collada.org/2005/11/COLLADASchema" version="1.4.1">
  <asset>
    <contributor>
      <author>Blender User</author>
      <authoring_tool>Blender 2.58.0 r37702</authoring_tool>
    </contributor>
    <created>2011-07-26T02:34:24</created>
    <modified>2011-07-26T02:34:24</modified>
    <unit name="meter" meter="1"/>
    <up_axis>Z_UP</up_axis>
  </asset>
  <library_geometries>
    <geometry id="Plane-mesh">
      <mesh>
        <source id="Plane-mesh-positions">
          <float_array id="Plane-mesh-positions-array" count="12">1 0.3503274 0 1 -0.3503274 0 -1 -0.9999998 0 -0.9999997 1 0</float_array>
        </source>
        <source id="Plane-mesh-normals">
          <float_array id="Plane-mesh-normals-array" count="6">0 0 1 0 0 1</float_array>
        </source>
        <vertices id="Plane-mesh-vertices">
          <input semantic="POSITION" source="#Plane-mesh-positions"/>
        </vertices>
        <polylist count="2">
          <input semantic="VERTEX" source="#Plane-mesh-vertices" offset="0"/>
          <input semantic="NORMAL" source="#Plane-mesh-normals" offset="1"/>
          <vcount>3 3 </vcount>
          <p>0 0 3 0 1 0 3 1 2 1 1 1</p>
        </polylist>
      </mesh>
    </geometry>
  </library_geometries>
</COLLADA>

When writing the code to read this file, I made some assumptions about the COLLADA file's data because I was the originator of the file and had control over it.  First, I assumed there were no texture coordinates in the file; their presence changes how one would parse the <p>0 0 3 0 1 0 3 1 2 1 1 1</p> section of the file above (this summary describes the alternate form here).  Second, I assumed the <polylist> always describes triangles and not quads.  These assumptions would have to change for production code, but the data is easy enough to get at that handling more than the single case I handle here wouldn't be a problem.
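
Under those two assumptions, the <p> list can be read as pairs of indices, one pair per triangle corner: a position index followed by a normal index. The following is a minimal sketch of that interpretation, not the code from the viewer; the names Corner, parseInts, and parseTriangles are made up for illustration, and it uses the Go 1 standard library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Corner pairs a position index with a normal index for one triangle corner.
type Corner struct {
	Vertex int
	Normal int
}

// parseInts splits whitespace-separated text into integers.
func parseInts(s string) ([]int, error) {
	fields := strings.Fields(s)
	out := make([]int, 0, len(fields))
	for _, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil {
			return nil, err
		}
		out = append(out, n)
	}
	return out, nil
}

// parseTriangles interprets <vcount> and <p> under the assumptions stated in
// the post: two inputs (VERTEX at offset 0, NORMAL at offset 1), triangles only.
func parseTriangles(vcount, p string) ([][3]Corner, error) {
	counts, err := parseInts(vcount)
	if err != nil {
		return nil, err
	}
	for i, n := range counts {
		if n != 3 {
			return nil, fmt.Errorf("polygon %d has %d vertices, want 3", i, n)
		}
	}
	idx, err := parseInts(p)
	if err != nil {
		return nil, err
	}
	const stride = 2 // one index per input: VERTEX, then NORMAL
	if want := len(counts) * 3 * stride; len(idx) != want {
		return nil, fmt.Errorf("expected %d indices in <p>, got %d", want, len(idx))
	}
	tris := make([][3]Corner, len(counts))
	for t := range tris {
		for c := 0; c < 3; c++ {
			base := (t*3 + c) * stride
			tris[t][c] = Corner{Vertex: idx[base], Normal: idx[base+1]}
		}
	}
	return tris, nil
}

func main() {
	tris, err := parseTriangles("3 3 ", "0 0 3 0 1 0 3 1 2 1 1 1")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, t := range tris {
		fmt.Println(t)
	}
}
```

If texture coordinates were present, the stride would grow to three and each corner would carry one more index, which is why the first assumption matters.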

Organizing and Annotating Data Types in Preparation of Unmarshalling the COLLADA File
Using Go's XML package to read the file data in couldn't be any easier.  You create struct types that represent the tags you care about and 'annotate' fields with metadata that tells the parser how to handle data and where to put it.  If you follow the rules properly (found in the XML package documentation), you should have no problem getting the XML package to load the data into your types for you.  In this case, some post-processing was required to get the <float_array> and <p> data into arrays of appropriate numeric types.  Here are the structures I used:

type Collada struct {
    Id                    string               `xml:"attr"` 
    Version               string               `xml:"attr"`
    Library_Geometries    LibraryGeometries
    Library_Visual_Scenes LibraryVisualScenes
}

type LibraryGeometries struct{
    XMLName  xml.Name   `xml:"library_geometries"`
    Geometry []Geometry
}

type Geometry struct{
    XMLName xml.Name  `xml:"geometry"`
    Id      string    `xml:"attr"`
    Mesh    Mesh
}

type Mesh struct {
    XMLName  xml.Name `xml:"mesh"`
    Source   []Source
    Polylist Polylist
}

type Source struct{
    XMLName     xml.Name   `xml:"source"`
    Id          string     `xml:"attr"`
    Float_array FloatArray `xml:"float_array"`
}

type FloatArray struct{
    XMLName xml.Name `xml:"float_array"`
    Id      string   `xml:"attr"`
    CDATA   string   `xml:"chardata"`
    Count   string   `xml:"attr"`
}

type Polylist struct{
    XMLName xml.Name  `xml:"polylist"`
    Id      string    `xml:"attr"`
    Count   string    `xml:"attr"`
   
    // List of integers, each specifying the number of vertices for one polygon
    VCount  string    `xml:"vcount"`
   
    // list of integers that specify the vertex attributes
    P       string    `xml:"p"` 
}

type LibraryVisualScenes struct {
    XMLName      xml.Name       `xml:"library_visual_scenes"`
    VisualScene  VisualScene
}

type VisualScene struct{
    XMLName      xml.Name       `xml:"visual_scene"`   
}

A few parts of the structures above are of particular interest.  The Id  string `xml:"attr"` field shows how to tell the XML parser to load an attribute value into a field.  The field name must be exported (first letter capitalized) and be tagged with the `xml:"attr"` metadata.  The XMLName  xml.Name `xml:"library_geometries"` field shows how you can identify that a structure is associated with a particular XML element.  The Source []Source field is a slice that the XML parser will allocate and fill for you.  Finally, the CDATA string `xml:"chardata"` field shows how you tell the parser where to put an element's text.  In this case, it is loaded with the string of float numbers found in this element:
    
        <float_array id="Plane-mesh-positions-array" count="12">1 0.3503274 0 1 -0.3503274 0 -1 -0.9999998 0 -0.9999997 1 0</float_array>  
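
The tags above use the syntax of the XML package as it was when this post was written. In Go 1 and later, encoding/xml expects the attribute name and a flag in the tag (`xml:"id,attr"`), and character data is marked `xml:",chardata"`. Here is a minimal sketch of the same two structures in that newer form, together with the post-processing step that turns the text into numbers. The field names, the parseSource helper, and the Floats method are illustrative assumptions, not the original code.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strconv"
	"strings"
)

// Source mirrors a <source> element using Go 1+ "name,flag" tags.
type Source struct {
	XMLName    xml.Name   `xml:"source"`
	ID         string     `xml:"id,attr"`
	FloatArray FloatArray `xml:"float_array"`
}

// FloatArray mirrors <float_array>; the count attribute decodes into an int.
type FloatArray struct {
	ID    string `xml:"id,attr"`
	Count int    `xml:"count,attr"`
	Data  string `xml:",chardata"`
}

// Floats converts the whitespace-separated character data into numbers
// and checks the result against the count attribute.
func (fa FloatArray) Floats() ([]float64, error) {
	fields := strings.Fields(fa.Data)
	if len(fields) != fa.Count {
		return nil, fmt.Errorf("float_array %q: count=%d but found %d values",
			fa.ID, fa.Count, len(fields))
	}
	out := make([]float64, len(fields))
	for i, f := range fields {
		v, err := strconv.ParseFloat(f, 64)
		if err != nil {
			return nil, err
		}
		out[i] = v
	}
	return out, nil
}

// parseSource unmarshals a single <source> fragment.
func parseSource(doc string) (Source, error) {
	var src Source
	err := xml.Unmarshal([]byte(doc), &src)
	return src, err
}

const sample = `<source id="Plane-mesh-positions">
  <float_array id="Plane-mesh-positions-array" count="12">1 0.3503274 0 1 -0.3503274 0 -1 -0.9999998 0 -0.9999997 1 0</float_array>
</source>`

func main() {
	src, err := parseSource(sample)
	if err != nil {
		fmt.Println("unmarshal:", err)
		return
	}
	positions, err := src.FloatArray.Floats()
	if err != nil {
		fmt.Println("convert:", err)
		return
	}
	fmt.Println(src.ID, len(positions), positions)
}
```

Checking the parsed length against the count attribute is a cheap way to catch a truncated or hand-edited file before the values reach OpenGL.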

Unmarshalling the COLLADA File
Oh, I almost forgot.  Once you've created your data types, the call to fill your data simply looks like the following:

//
// Create meshes from COLLADA file
//
// There are some major limitations for this currently
// 1) Must only contain Triangles
// 2) No support for animation or materials at the moment
// 3) Will not read translations or rotations
//
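func BuildModel(filename string) *data.Model {
    file, err := os.Open(filename)
    if err != nil {
        fmt.Println(err.String())
        return nil // nothing to parse if the file could not be opened
    }
    defer file.Close()

    c := new(Collada)
    err = xml.Unmarshal(file, c)
    if err != nil {
        fmt.Println(err.String())
        return nil // do not build a model from a partially read document
    }
    ...
}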

In the above code, I've used a filename to open a file and I've constructed the root document element (the Collada type) on this line:

         c := new(Collada)

and called xml.Unmarshal, passing the file and a pointer to the Collada instance.  Everything we've properly modeled will be accessible from this root element.  Again, I had to do some post-processing to convert some textual numbers to float and int arrays, but in the end, an ugly model I built in Blender was being shown in a Go COLLADA model viewer I built.



Again, this was super easy and only took about 2-3 hours (I've been writing this post for about the same amount of time).  The next step will be to create and apply textures to the model and see how to get it displayed in the model viewer.