1   Description

This post describes several methods for accessing the fields in Xmerl records:

  1. Access through the Elixir Record module.
  2. Access using functions whose source code is generated from the Xmerl records.
  3. Access using functions that are created from the Xmerl record definitions using Elixir metaprogramming.
  4. Access by converting Xmerl record instances to Elixir Structs.

The various Xmerl record types (xmlElement, xmlAttribute, etc.) are available to Elixir as very regular data structures: tagged tuples with a fixed field order. Once you are a little familiar with those record definitions, accessing individual fields through any of the methods listed above and explained below is quite easy. This post attempts to help you get started doing that.

Regardless of the strategy that you choose, you will want to familiarize yourself with the Xmerl record definitions. They are part of the Erlang source code distribution; look in this file: otp_src_22.2/lib/xmerl/include/xmerl.hrl.
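
If you would rather not open the header file, the field names can also be listed from iex. The following is a minimal sketch that uses Record.extract/2, the same call used in the modules later in this post; the variable name fields is only for illustration:

```elixir
# Read the xmlElement record definition from xmerl.hrl.
# The result is a keyword list of field names and default values,
# in the same order as the fields appear in the record tuple.
fields = Record.extract(:xmlElement, from_lib: "xmerl/include/xmerl.hrl")

fields
|> Keyword.keys()
|> IO.inspect(label: "xmlElement fields")
```

Replace :xmlElement with :xmlAttribute, :xmlText, etc. to inspect the other record types.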

In all of these strategies, you will need to convert the XML instance document that you want to process into Xmerl tuples. I suggest that you use SweetXml for that.

Here is an example of converting an XML document:

$ cd my_mix_project
$ iex -S mix
Erlang/OTP 22 [erts-10.6] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1] [hipe]

Interactive Elixir (1.10.0) - press Ctrl+C to exit (type h() ENTER for help)
iex> rec = File.stream!("path/to/my/xml/doc.xml") |> SweetXml.parse
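
For quick experiments without a file on disk, and assuming you are willing to call Erlang's Xmerl scanner directly rather than going through SweetXml, something like the following sketch should produce the same kind of tuple. The sample XML string is made up for illustration:

```elixir
# A made-up sample document, used only for illustration.
xml = "<person id=\"42\">Ada</person>"

# :xmerl_scan.string/1 expects a charlist and returns {document, rest}.
{doc, _rest} = :xmerl_scan.string(String.to_charlist(xml))

# Position 0 of the tuple is the record tag; position 1 is the element name.
IO.inspect(elem(doc, 0), label: "record tag")
IO.inspect(elem(doc, 1), label: "element name")
```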

2   Access through the Elixir Record module

Given that you have this module in your project:

defmodule XmerlRecs do
  @moduledoc """
  Define Xmerl records using record definitions extracted from Erlang Xmerl.
  """

  require Record
  Record.defrecord(:xmlElement, Record.extract(:xmlElement,
    from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecord(:xmlText, Record.extract(:xmlText,
    from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecord(:xmlAttribute, Record.extract(:xmlAttribute,
    from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecord(:xmlNamespace, Record.extract(:xmlNamespace,
    from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecord(:xmlComment, Record.extract(:xmlComment,
    from_lib: "xmerl/include/xmerl.hrl"))

end

You can write something like the following:

defmodule Test do

  # Record.defrecord generates macros, so XmerlRecs must be required before use.
  require XmerlRecs

  def demo1 do
    element = File.stream!("path/to/my/doc.xml") |> SweetXml.parse
    name = XmerlRecs.xmlElement(element, :name)
    IO.puts("element name: #{name}")
    XmerlRecs.xmlElement(element, :attributes)
    |> Enum.each(fn attr ->
      attrname = XmerlRecs.xmlAttribute(attr, :name)
      attrvalue = XmerlRecs.xmlAttribute(attr, :value)
      IO.puts("    attribute -- name: #{attrname}  value: #{attrvalue}")
    end)
    XmerlRecs.xmlElement(element, :content)
    |> Enum.each(fn item ->
      case elem(item, 0) do
        :xmlText ->
          IO.puts("    text -- value: #{XmerlRecs.xmlText(item, :value)}")
        _ -> nil
      end
    end)
  end

end

Notes:

  • We use SweetXml to parse an XML document. That returns nested tuples that represent an element. The element has a name, a list of attributes, and a list of (sub-)contents.

  • We use the following to access the fields in that element (tuple):

    XmerlRecs.xmlElement(element, :name)
    XmerlRecs.xmlElement(element, :attributes)
    XmerlRecs.xmlElement(element, :content)
    
  • While handling the content, we match the first element of each content tuple against the atom :xmlText to determine whether that item is a text item.
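
As an alternative to checking elem(item, 0), the record macros can also be used as patterns. Here is a minimal sketch of that idea. The module name ContentDemo and the function texts/1 are hypothetical, the records are defined privately with Record.defrecordp so that the example stands alone, and the sample document is parsed with :xmerl_scan.string/1 only to avoid needing a file:

```elixir
defmodule ContentDemo do
  require Record

  # Private record macros, extracted from the Xmerl header file.
  Record.defrecordp(:xmlElement,
    Record.extract(:xmlElement, from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecordp(:xmlText,
    Record.extract(:xmlText, from_lib: "xmerl/include/xmerl.hrl"))

  # Match the element in the function head and pull out its content.
  def texts(xmlElement(content: content)) do
    # Items that do not match the xmlText pattern are skipped by `for`.
    for xmlText(value: value) <- content, do: to_string(value)
  end
end

{doc, _rest} = :xmerl_scan.string(String.to_charlist("<a>one<b/>two</a>"))
texts = ContentDemo.texts(doc)
IO.inspect(texts)
```

The pattern form does the tag check and the field access in one step, at the cost of being a little less explicit than the case expression above.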

3   Generating source code for access functions

With quite a small amount of code, we can use the Xmerl record definitions to generate Elixir source code containing functions that access each field in the Xmerl records. For example:

defmodule Xml.Element do
  require XmerlRecs
  def get_name(item), do: XmerlRecs.xmlElement(item, :name)
  def get_expanded_name(item), do: XmerlRecs.xmlElement(item, :expanded_name)
  def get_nsinfo(item), do: XmerlRecs.xmlElement(item, :nsinfo)
  def get_namespace(item), do: XmerlRecs.xmlElement(item, :namespace)
  def get_parents(item), do: XmerlRecs.xmlElement(item, :parents)
  def get_pos(item), do: XmerlRecs.xmlElement(item, :pos)
  def get_attributes(item), do: XmerlRecs.xmlElement(item, :attributes)
  def get_content(item), do: XmerlRecs.xmlElement(item, :content)
  def get_language(item), do: XmerlRecs.xmlElement(item, :language)
  def get_xmlbase(item), do: XmerlRecs.xmlElement(item, :xmlbase)
  def get_elementdef(item), do: XmerlRecs.xmlElement(item, :elementdef)
end

defmodule Xml.Attribute do
  require XmerlRecs
  def get_name(item), do: XmerlRecs.xmlAttribute(item, :name)
  def get_expanded_name(item), do: XmerlRecs.xmlAttribute(item, :expanded_name)
  def get_nsinfo(item), do: XmerlRecs.xmlAttribute(item, :nsinfo)
  def get_namespace(item), do: XmerlRecs.xmlAttribute(item, :namespace)
  def get_parents(item), do: XmerlRecs.xmlAttribute(item, :parents)
  def get_pos(item), do: XmerlRecs.xmlAttribute(item, :pos)
  def get_language(item), do: XmerlRecs.xmlAttribute(item, :language)
  def get_value(item), do: XmerlRecs.xmlAttribute(item, :value)
  def get_normalized(item), do: XmerlRecs.xmlAttribute(item, :normalized)
end

As you can see from the above code, these functions are simple wrappers around the access technique that we saw in the previous section.

You can find the code that generates these functions in the XmlElixirStructs repo at: https://github.com/dkuhlman/xmlelixirstructs.git. It's in lib/generate.ex, and is short enough that I'll repeat it here:

defmodule GenerateFuncs do

  @moduledoc """
  This module can be used to generate an Elixir (.ex) file containing Xmerl accessor functions.
  """

  @typedoc "A one-argument function that writes one line of generated code."
  @type writer :: (String.t() -> any)

  @type_names [:attribute, :comment, :element, :namespace, :text]

  @doc """
  Write out source code for accessor functions for Xmerl records.

  The argument is either the path of the output file or a writer function.

  **Caution:** This function is destructive.  It will over-write an
  existing file without warning.

  ## Examples

      iex> GenerateFuncs.generate("path/to/output/file.ex")

  """
  @spec generate(Path.t() | writer) :: :ok
  def generate(path) when is_binary(path) do
    {:ok, dev} = File.open(path, [:write])
    wrt = fn val -> IO.write(dev, val <> "\n") end
    generate(wrt)
    # Close the file once all of the code has been written.
    File.close(dev)
  end

  def generate(wrt) when is_function(wrt, 1) do
    # For a quick preview, pass &IO.puts/1 as wrt to print the code instead.
    @type_names
    |> Enum.each(fn item ->
      name = to_string(item)
      cap_name = String.capitalize(name)
      #ident = to_atom("xml#{String.capitalize(to_string(item))}")
      wrt.("defmodule Xml.#{cap_name} do")
      wrt.("  require XmerlRecs")

      Record.extract(
        String.to_atom("xml#{cap_name}"),
        from_lib: "xmerl/include/xmerl.hrl")
      #|> IO.inspect(label: "fields")
      |> Enum.each(fn {field, _} ->
        wrt.("  def get_#{field}(item), do: XmerlRecs.xml#{cap_name}(item, :#{field})")
      end)
      wrt.("end\n")
    end)
  end

end
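
To see the core of the generator in isolation, here is a stripped-down sketch that builds the source for a single record type and returns it as a string instead of writing it to a file. The module name GenSketch and the function source/1 are hypothetical and are not part of the repository:

```elixir
defmodule GenSketch do
  # Build the source code of one accessor module, e.g. Xml.Text for :text.
  def source(type) when is_atom(type) do
    cap_name = String.capitalize(to_string(type))

    # One "def get_<field>" line per field in the record definition.
    defs =
      String.to_atom("xml#{cap_name}")
      |> Record.extract(from_lib: "xmerl/include/xmerl.hrl")
      |> Enum.map(fn {field, _default} ->
        "  def get_#{field}(item), do: XmerlRecs.xml#{cap_name}(item, :#{field})\n"
      end)

    IO.iodata_to_binary([
      "defmodule Xml.#{cap_name} do\n",
      "  require XmerlRecs\n",
      defs,
      "end\n"
    ])
  end
end

src = GenSketch.source(:text)
IO.puts(src)
```

Returning a string makes the generated code easy to inspect in iex before anything is written to disk.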

And, here is our demo, rewritten to use those generated access functions:

defmodule Test do

  def demo2 do
    element = File.stream!("Data/test02.xml") |> SweetXml.parse
    name = Xml.Element.get_name(element)
    IO.puts("element name: #{name}")
    Xml.Element.get_attributes(element)
    |> Enum.each(fn attr ->
      attrname = Xml.Attribute.get_name(attr)
      attrvalue = Xml.Attribute.get_value(attr)
      IO.puts("    attribute -- name: #{attrname}  value: #{attrvalue}")
    end)
    Xml.Element.get_content(element)
    |> Enum.each(fn item ->
      case elem(item, 0) do
        :xmlText ->
          IO.puts("    text -- value: #{Xml.Text.get_value(item)}")
        _ -> nil
      end
    end)
  end

end

4   Using Elixir metaprogramming to generate access functions

Now, let's try to use Elixir metaprogramming to produce functions similar to those described in the previous section.

Here is some code. You can find this in lib/xmlmetaprogramming.ex in the XmlElixirStructs repository at https://github.com/dkuhlman/xmlelixirstructs.git:

defmodule XmerlAccess do

  @moduledoc """
  Use Elixir meta-programming to generate test and accessor functions.

  For each Xmerl record type generate the following:

  - A test function, e.g. `is_element/1`, `is_attribute/1`, etc.

  - A set of accessor functions, one for each field, e.g. `get_element_name/1`,
    `get_element_attributes/1`, ..., `get_attribute_name/1`, etc.

  """

  require XmerlRecs

  @record_types ["element", "attribute", "text", "namespace", "comment"]

  @record_types
  |> Enum.each(fn record_type_str ->
    record_type_string = "xml#{String.capitalize(record_type_str)}"
    record_type_atom = String.to_atom(record_type_string)
    is_method_name_str = "is_#{record_type_str}"                           #1
    is_method_name_atom = String.to_atom(is_method_name_str)
    #2
    is_method_body_str = """
      if is_tuple(item) and tuple_size(item) > 0 do
        case elem(item, 0) do
          :#{record_type_string} -> true
          _ -> false
        end
      else
        false
      end
    """
    {:ok, is_method_body_ast} = Code.string_to_quoted(is_method_body_str)  #3
    def unquote(is_method_name_atom)(item) do                              #4
      unquote(is_method_body_ast)
    end
    Record.extract(record_type_atom, from_lib: "xmerl/include/xmerl.hrl")
    |> Enum.each(fn {field_name_atom, _} ->                                #5
      method_name_str = "get_#{record_type_str}_#{to_string(field_name_atom)}"
      method_name_atom = String.to_atom(method_name_str)
      method_body_str = "XmerlRecs.#{to_string(record_type_atom)}(item, :#{to_string(field_name_atom)})"
      {:ok, method_body_ast} = Code.string_to_quoted(method_body_str)
      def unquote(method_name_atom)(item) do
        unquote(method_body_ast)
      end
    end)
  end)

end

Notes -- These notes correspond to the comment numbers in the above code:

  1. For each XML record type (e.g. xmlElement, xmlAttribute, xmlText, etc.) we create a string to represent the name of the function that we want to define. Then, we convert that string to an atom.
  2. We create a string containing the body of the function that we want to define.
  3. We convert the string that represents the code in the body of the function into an AST (abstract syntax tree) by calling Code.string_to_quoted/1.
  4. We define the function. Note that because def is a macro and because macros return an AST, the def will result in inserting the AST into our module. That defines a function.
  5. We do something similar to the above for each field in that XML type (Xmerl record), which defines our "getter" functions (e.g. get_element_name/1, get_element_attributes/1, get_attribute_name/1, etc.).
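
The same functions can also be defined without building code as strings. The sketch below is a variation on the technique, not the code from the repository: it uses unquote fragments directly and reads each field by its position in the tuple instead of calling the XmerlRecs macros. The module name XmerlAccessSketch is hypothetical, and only three record types are covered to keep it short:

```elixir
defmodule XmerlAccessSketch do
  # Short name used in function names => Xmerl record tag.
  @record_types [element: :xmlElement, attribute: :xmlAttribute, text: :xmlText]

  for {type, tag} <- @record_types do
    fields = Record.extract(tag, from_lib: "xmerl/include/xmerl.hrl")

    # Test function, e.g. is_element/1.
    def unquote(:"is_#{type}")(item) do
      is_tuple(item) and tuple_size(item) > 0 and elem(item, 0) == unquote(tag)
    end

    # Position 0 of a record tuple holds the tag, so fields start at index 1.
    for {{field, _default}, index} <- Enum.with_index(fields, 1) do
      # Getter function, e.g. get_element_name/1.
      def unquote(:"get_#{type}_#{field}")(item) when elem(item, 0) == unquote(tag) do
        elem(item, unquote(index))
      end
    end
  end
end

{doc, _rest} = :xmerl_scan.string(String.to_charlist("<a b=\"1\">t</a>"))
[attr] = XmerlAccessSketch.get_element_attributes(doc)
IO.inspect(XmerlAccessSketch.get_element_name(doc), label: "element name")
IO.inspect(XmerlAccessSketch.get_attribute_name(attr), label: "attribute name")
```

Passing a record of the wrong type to one of these getters raises a FunctionClauseError because of the guard, which is a design choice of this sketch rather than a property of the original code.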

Given that you have included the above in your Elixir mix project, you can write code like this:

defmodule Test do

  def test1() do
    rec = File.stream!("path/to/my/doc.xml") |> SweetXml.parse
    if XmerlAccess.is_element(rec) do
      IO.puts("element -- name: #{XmerlAccess.get_element_name(rec)}")
      XmerlAccess.get_element_attributes(rec)
      |> Enum.each(fn attr ->
        attrname = XmerlAccess.get_attribute_name(attr)
        attrvalue = XmerlAccess.get_attribute_value(attr)
        IO.puts("    attribute -- name: #{attrname}  value: #{attrvalue}")
      end)
    end
  end

end

You can get some help by typing h XmerlAccess:

iex> h XmerlAccess

                                  XmerlAccess

Use Elixir meta-programming to generate test and accessor functions.

For each Xmerl record type generate the following:

  • A test function, e.g. is_element/1, is_attribute/1, etc.

  • A set of accessor functions, one for each field, e.g.
    get_element_name/1, get_element_attributes/1, ..., get_attribute_name/1,
    etc.

5   Access by converting Xmerl record instances to Elixir Structs

This strategy is described in a previous post, which is here: http://davekuhlman.org/xml-elixir-structs.html.

You can find the code here: https://github.com/dkuhlman/xmlelixirstructs

You can add this code to your Elixir Mix project by adding the following to the dependencies in your mix.exs file:

defp deps do
  [
    # {:dep_from_hexpm, "~> 0.3.0"},
    # {:dep_from_git, git: "https://github.com/elixir-lang/my_dep.git", tag: "0.1.0"}
    {:xmlelixirstructs, github: "dkuhlman/xmlelixirstructs"},
  ]
end
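
To give a rough idea of what such a conversion involves, here is a minimal sketch. It is not the API of the xmlelixirstructs package: the module StructSketch, its nested Element struct, and the function convert/1 are all hypothetical, and only elements, attributes, and text are handled:

```elixir
defmodule StructSketch do
  require Record

  Record.defrecordp(:xmlElement,
    Record.extract(:xmlElement, from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecordp(:xmlAttribute,
    Record.extract(:xmlAttribute, from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecordp(:xmlText,
    Record.extract(:xmlText, from_lib: "xmerl/include/xmerl.hrl"))

  defmodule Element do
    # A deliberately small struct; the real records have many more fields.
    defstruct name: nil, attributes: %{}, content: []
  end

  # Convert an element record, and everything below it, recursively.
  def convert(xmlElement(name: name, attributes: attrs, content: content)) do
    %Element{
      name: name,
      attributes: Map.new(attrs, fn xmlAttribute(name: n, value: v) -> {n, to_string(v)} end),
      content: Enum.map(content, &convert/1)
    }
  end

  # Text nodes become plain Elixir strings.
  def convert(xmlText(value: value)), do: to_string(value)

  # Anything else (comments, etc.) is passed through unchanged.
  def convert(other), do: other
end

{doc, _rest} = :xmerl_scan.string(String.to_charlist("<a b=\"1\">hi<c/></a>"))
tree = StructSketch.convert(doc)
IO.inspect(tree)
```

Once the data is in structs, fields can be read with the usual dot syntax, e.g. tree.name or tree.attributes.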

6   Ideas for further exploration

An earlier version of this post suggested using Elixir metaprogramming to generate the field access functions described above, since the Xmerl data structures are so regular and are available to Elixir code. That is now done; see the section "Using Elixir metaprogramming to generate access functions" above.

