RuleML in JSON

From RuleML Wiki

Authors: Harold Boley, Tara Athan, Adrian Paschke, Gen Zou, Davide Sottara, Mark Proctor


1 Overview

JavaScript Object Notation (JSON) is a human- and machine-readable data-interchange format for lists (ordered) of values and sets (unordered) of name-value pairs. Rule Markup Language (RuleML) is a system of families of Web-rule languages serialized by combining XML use in the positional (ordered) style and in an object-centered (unordered) style. Because they share this basic ordered/unordered distinction, RuleML and JSON can be matched to each other closely. In particular, RuleML can employ JSON as an alternate presentation and interchange syntax -- round-trippable with its XML serialization -- while keeping the schema definitions of RuleML unchanged (using XML Schema Definition and RELAX NG, as e.g. for the Deliberation RuleML family).

This specification of RuleML in JSON builds, e.g., on RuleML's XML/RDF-unifying OrdLab data model, its POSL presentation syntax, and its PSOA RuleML development. Its initial version was prepared for a talk by the first author, entitled How Object-Centered XML Rules are Configured from the RuleML Lattice, can be Exchanged in JSON, and enable Decision Making over Object-Relational Data.

Since RuleML 0.87, the main distinction within the data model has been indicated as follows: Node (Type) elements have started with an upper-case letter while edge (role) elements have started with a lower-case letter. RuleML's two normal forms, the compact stripe-skipped and the expanded fully striped forms, can be viewed as focusing on JSON's two structures, ordered lists (arrays) and pair sets (objects), respectively.

Generally, on each nesting level, arbitrary combinations of stripe-skipped ('positional') RuleML/XML and fully striped ('object-centered') RuleML/XML are allowed. Such RuleML/XML elements can be translated to corresponding RuleML/JSON combinations of arrays and objects in a round-trippable (bi-directional) fashion.

XML tools, e.g. XSLT, used for the XML-to-JSON direction could be made generally aware of the Node/edge (upper-case/lower-case) distinction. However, for RuleML/XML, the XSLT stylesheet will take care of this distinction as part of its exhaustive analysis of all (Node and edge) elements used in the RuleML sublanguage to be translated.

While the RuleML-JSON match to a large extent allows direct modeling, there is some need for indirect encoding. First, as for any XML-JSON mapping, XML's elements need to be encoded, and we adopted the usual encoding as one-pair objects (to be refined through XML attributes). Second, since RuleML has multi-valued edges like <formula>, while JSON's name:value pairs SHOULD have a unique name within an object, we use the recommended and widely implemented arrays for encoding multiple values (although the order information of such arrays needs to be disregarded). The resulting two kinds of arrays are disambiguated through their occurrence in two different syntactic contexts:

  • Each Node's array value directly models the intended child order
  • Each edge's array value indirectly encodes its single value or multiple values without intending the textual child order (taken from the XML) to carry information

Should a future version of JSON definitely allow non-unique names for pairs, this encoding would no longer be needed.

2 RuleML XML-to-JSON Translation Principles

The translation principles for the two basic cases -- from which advanced ones can be composed -- are as shown in this section (dots indicate recursive translation). For simplicity, we will initially omit RuleML/XML attributes.

To emphasize the underlying tree structures of XML and JSON, and demonstrate the XML-JSON mapping, we will employ essentially "2-space indentation" pretty-print layouts both for RuleML/XML elements and for RuleML/JSON structures. For clarity, the RuleML/JSON pretty-print layout is "encoding-aware" in that it displays structures (arrays and objects) depending on whether they perform 1. direct modeling or 2. indirect encoding (of XML's elements as one-pair objects and of RuleML's single/multiple values as arrays). This distinction is emphasized through different displays of opening curly braces (for objects) and opening square brackets (for arrays):

  1. Direct modeling: Opening curly braces and square brackets are separated from (the first part of) their content by whitespace (used here: newline)
  2. Indirect encoding: Opening curly braces and square brackets are not separated from (the first part of) their content by whitespace

However, this distinction is already implicit in the syntax, the convention is only employed for emphasis, and users are free to employ their own layouts.

2.1 Node with Nodes as Children

The RuleML/XML element Node0 with subelements Node1 . . . NodeK

<Node0>
  <Node1>...</Node1>
  .  .  .
  <NodeK>...</NodeK>
</Node0>

becomes a RuleML/JSON object having one pair with name Node0 whose value is an array of one-pair objects for the subelements Node1 . . . NodeK,

{"Node0":[
   {"Node1":...},
   .  .  .
   {"NodeK":...}]}

This encoding convention for an XML Node element name and its content as a JSON object having one pair that associates the element name with some representation of the content value is usual for XML-to-JSON translators. As the value of a Node, our content array directly models the ordered sequence of children of that Node.

For example, with K=2,

<Implies>
  <And>...</And>
  <Atom>...</Atom>
</Implies>

becomes

{"Implies":[
   {"And":...},
   {"Atom":...}]}

Here, the order is needed to distinguish the <And> condition from the <Atom> conclusion.
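
The "Node with Nodes" principle can be sketched in a few lines of Python. This is only an illustrative sketch, not the XSLT-based translator discussed in this specification: the function name node_to_json is hypothetical, the standard-library xml.etree.ElementTree parser is assumed, and attributes and edges are ignored. Leaf Nodes are mapped to their text content, as in the complete examples of Section 3.

```python
import json
import xml.etree.ElementTree as ET


def node_to_json(elem):
    """Translate a stripe-skipped Node into a one-pair object (hypothetical helper)."""
    children = list(elem)
    if not children:
        # Leaf Node such as <Rel>buy</Rel>: the value is the text content.
        return {elem.tag: (elem.text or "").strip()}
    # The array directly models the ordered sequence of child Nodes.
    return {elem.tag: [node_to_json(child) for child in children]}


xml_source = """
<Implies>
  <And>
    <Atom><Rel>buy</Rel><Var>person</Var></Atom>
  </And>
  <Atom><Rel>own</Rel><Var>person</Var></Atom>
</Implies>
"""

result = node_to_json(ET.fromstring(xml_source))
print(json.dumps(result, indent=2))
```

Because Python lists preserve order, the condition stays in first position and the conclusion in second position of the "Implies" array.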

2.2 Node with edges as Children

We will distinguish unique (single-valued) from non-unique (multi-valued) edges, but encode the former as a special case of the latter for uniformity.

2.2.1 Single-valued edges

The RuleML/XML element Node0 containing a unique edgeI

<Node0>
  .  .  .
  <edgeI>...</edgeI>
  .  .  .
</Node0>

becomes a RuleML/JSON object having one pair with name Node0 whose value is an object containing a pair with the edgeI child in its array value,

{"Node0":{
    .  .  .
    "edgeI":[...],
    .  .  .}}

As the value of an edge, this singleton array indirectly encodes the singleton set of the single child of that edge (for uniformity with the below case of a multi-valued edge). This encoding convention was introduced in JSON-LD.

As an example of a Node with two unique edges, expanding the next stripe of Nodes (to capture the "Node with Nodes" example above),

<Implies>
  <if>
    <And>...</And>
  </if>
  <then>
    <Atom>...</Atom>
  </then>
</Implies>

becomes

{"Implies":{
   "if":
     [{"And":...}],
   "then":
     [{"Atom":...}]}}

Here, the <if> edge distinguishes the condition and the <then> edge distinguishes the conclusion. No order exists or would be needed within the singleton arrays.

2.2.2 Multi-valued edges

The RuleML/XML element Node0 containing multiple edgeI's (without loss of generality assumed to occur consecutively)

<Node0>
  .  .  .
  <edgeI>...</edgeI>
  . . .
  <edgeI>...</edgeI>
  .  .  .
</Node0>

becomes a RuleML/JSON object having one pair with name Node0 whose value is an object containing a pair with the consecutive edgeI children in its array value,

{"Node0":{
   .  .  .
   "edgeI":
     [...,
      . . .,
      ...],
   .  .  .}}

As the value of an edge, this array indirectly encodes (via a textually ordered sequence) the unordered set of multiple children of that edge.

Refining the above "Node with edges" example at a child Node with two non-unique edges, also expanding the next stripes of Nodes,

<And>
  <formula>
    <Atom>
      <op><Rel>buy</Rel></op>
      ...
    </Atom>
  </formula>
  <formula>
    <Atom>
      <op><Rel>keep</Rel></op>
      ...
    </Atom>
  </formula>
</And>

becomes

{"And":{
   "formula":
     [{"Atom":{
         "op":[{"Rel":"buy"}],
         ...}},
      {"Atom":{
         "op":[{"Rel":"keep"}],
         ...}}]}}

Here, the two children of the multi-valued <formula> edge become encoded as two textually ordered array elements.
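
The "Node with edges" principle, covering both single-valued and multi-valued edges, can be sketched as follows. Again this is a minimal Python sketch under stated assumptions: the function name striped_node_to_json is hypothetical, every edge is assumed to wrap exactly one Node (fully striped form), and attributes are not handled.

```python
import json
import xml.etree.ElementTree as ET


def striped_node_to_json(node):
    """Translate a fully striped Node into a one-pair object (hypothetical helper)."""
    edges = list(node)
    if not edges:
        # Leaf Node: the value is the text content.
        return {node.tag: (node.text or "").strip()}
    pairs = {}
    for edge in edges:
        # Assumption: each edge wraps exactly one Node; otherwise this raises ValueError.
        (child,) = list(edge)
        # Same-named edges are collected into one array, so a single-valued
        # edge simply yields a singleton array.
        pairs.setdefault(edge.tag, []).append(striped_node_to_json(child))
    return {node.tag: pairs}


xml_source = """
<And>
  <formula><Atom><op><Rel>buy</Rel></op></Atom></formula>
  <formula><Atom><op><Rel>keep</Rel></op></Atom></formula>
</And>
"""

result = striped_node_to_json(ET.fromstring(xml_source))
print(json.dumps(result, indent=2))
```

The array under "formula" keeps the textual order of the XML, but a consumer of the JSON should not attach meaning to that order.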

3 Advancements and Complete Examples

In this section, we will present three complete examples, advance to attributes, and exemplify them with the index attribute.

As a first unabridged (dot-less) example, the stripe-skipped RuleML/XML

<Implies>
  <And>
    <Atom>
      <Rel>buy</Rel>
      <Var>person</Var>
      <Var>merchant</Var>
      <Var>object</Var>
    </Atom>
    <Atom>
      <Rel>keep</Rel>
      <Var>person</Var>
      <Var>object</Var>
    </Atom>
  </And>
  <Atom>
    <Rel>own</Rel>
    <Var>person</Var>
    <Var>object</Var>
  </Atom>
</Implies>

corresponds to this array-focusing RuleML/JSON (one-pair objects are only used for associating each element with its content -- arrays are used for the positional subelements):

{"Implies":[
   {"And":[
      {"Atom":[
         {"Rel":"buy"},
         {"Var":"person"},
         {"Var":"merchant"},
         {"Var":"object"}]},
      {"Atom":[
         {"Rel":"keep"},
         {"Var":"person"},
         {"Var":"object"}]}]},
   {"Atom":[
      {"Rel":"own"},
      {"Var":"person"},
      {"Var":"object"}]}]}

Alternatively, as a second example, the fully striped RuleML/XML

<Implies>
  <if>
    <And>
      <formula>
        <Atom>
          <op><Rel>buy</Rel></op>
          <arg index="1"><Var>person</Var></arg>
          <arg index="2"><Var>merchant</Var></arg>
          <arg index="3"><Var>object</Var></arg>
        </Atom>
      </formula>
      <formula>
        <Atom>
          <op><Rel>keep</Rel></op>
          <arg index="1"><Var>person</Var></arg>
          <arg index="2"><Var>object</Var></arg>
        </Atom>
      </formula>
    </And>
  </if>
  <then>
    <Atom>
      <op><Rel>own</Rel></op>
      <arg index="1"><Var>person</Var></arg>
      <arg index="2"><Var>object</Var></arg>
    </Atom>
  </then>
</Implies>

corresponds to this object-focusing RuleML/JSON (objects are also used for describing Node elements with edge elements; attributes are distinguished from edge elements by an "@" prefix -- arrays are used only for encoding single/multi-valued edge children):

{"Implies":{
   "if":
     [{"And":{
         "formula":
           [{"Atom":{
               "op":[{"Rel":"buy"}],
               "arg":
                 [{"@index":"1",
                   "Var":"person"},
                  {"@index":"2",
                   "Var":"merchant"},
                  {"@index":"3",
                   "Var":"object"}]}},
            {"Atom":{
               "op":[{"Rel":"keep"}],
               "arg":
                 [{"@index":"1",
                   "Var":"person"},
                  {"@index":"2",
                   "Var":"object"}]}}]}}],
   "then":
     [{"Atom":{
         "op":[{"Rel":"own"}],
         "arg":
           [{"@index":"1",
             "Var":"person"},
            {"@index":"2",
             "Var":"object"}]}}]}}

Finally, as a third example, the partially striped RuleML/XML

<Implies>
  <if>
    <And>
      <Atom>
        <Rel>buy</Rel>
        <Var>person</Var>
        <Var>merchant</Var>
        <Var>object</Var>
      </Atom>
      <Atom>
        <Rel>keep</Rel>
        <Var>person</Var>
        <Var>object</Var>
      </Atom>
    </And>
  </if>
  <then>
    <Atom>
      <op><Rel>own</Rel></op>
      <arg index="1"><Var>person</Var></arg>
      <arg index="2"><Var>object</Var></arg>
    </Atom>
  </then>
</Implies>

corresponds to this premise-array/conclusion-object RuleML/JSON:

{"Implies":{
   "if":
     [{"And":[
         {"Atom":[
            {"Rel":"buy"},
            {"Var":"person"},
            {"Var":"merchant"},
            {"Var":"object"}]},
         {"Atom":[
            {"Rel":"keep"},
            {"Var":"person"},
            {"Var":"object"}]}]}],
   "then":
     [{"Atom":{
         "op":[{"Rel":"own"}],
         "arg":
           [{"@index":"1",
             "Var":"person"}, 
            {"@index":"2",
             "Var":"object"}]}}]}}
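
The three examples suggest a single recursive translation that decides per Node between the array and the object form. The following Python sketch illustrates this under several assumptions that go beyond the text above: the names is_node and translate_node are hypothetical, Nodes and edges are told apart only by the case of the first letter of the tag, a Node that mixes Node children with edge children is rejected, and only attributes on edges (such as index) are carried over.

```python
import xml.etree.ElementTree as ET


def is_node(elem):
    # Node (Type) tags start with an upper-case letter, edge (role) tags do not.
    return elem.tag[:1].isupper()


def translate_node(node, edge_attributes=None):
    """Translate a Node; attributes of the enclosing edge become "@" pairs."""
    obj = {"@" + name: value for name, value in (edge_attributes or {}).items()}
    children = list(node)
    if not children:
        obj[node.tag] = (node.text or "").strip()
    elif all(is_node(child) for child in children):
        # Stripe-skipped level: the array directly models the child order.
        obj[node.tag] = [translate_node(child) for child in children]
    elif not any(is_node(child) for child in children):
        # Fully striped level: arrays only encode single/multiple edge values.
        pairs = {}
        for edge in children:
            (child,) = list(edge)  # assumption: one Node per edge
            pairs.setdefault(edge.tag, []).append(translate_node(child, edge.attrib))
        obj[node.tag] = pairs
    else:
        raise ValueError("mixed Node/edge children are not covered by this sketch")
    return obj


xml_source = """
<Implies>
  <if>
    <And>
      <Atom><Rel>buy</Rel><Var>person</Var><Var>object</Var></Atom>
      <Atom><Rel>keep</Rel><Var>person</Var><Var>object</Var></Atom>
    </And>
  </if>
  <then>
    <Atom>
      <op><Rel>own</Rel></op>
      <arg index="1"><Var>person</Var></arg>
      <arg index="2"><Var>object</Var></arg>
    </Atom>
  </then>
</Implies>
"""

result = translate_node(ET.fromstring(xml_source))
```

Applied to this shortened variant of the third example, the sketch yields an array for the premise and an object for the conclusion, with each index attribute placed next to the Var pair it accompanies.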

4 State of the Art on Mapping XML to JSON

[1], [2], [3], [4], [5], [6]

5 XML2JSON Mapping Principles

5.1 Version 1 - using a specialized property @children

[7], [8]

  • The mapping must round-trip; therefore, any datatype information relevant to XML validation should be preserved.
  • Each XML element maps to an anonymous JSON object having one property named after the element.

If the element has one text node as child

<Ind>John</Ind>

maps to

{
"Ind" : {"@children":["John"]}
}

If the element has 2..N children

<And>
 <Atom>...</Atom>
 <Atom>...</Atom>
...
</And>

maps to

{
"And": {
    "@children":[
          {"Atom": ...},
          {"Atom": ...},
          ...
         ]
    }
}

If the element is empty

<Or/>

maps to

{
"Or" : {"@children":[]}
}

If the element has attributes

<Ind iri="http://example.com/people/jfk">John</Ind>

maps to

{
"Ind" : {
  "iri": "http://example.com/people/jfk",
  "@children":[ "John"]
  ...
  }
}


NOTE1: In general, the value of @children, like JSON values[9], is a string, an array, or null.

NOTE2: It is possible to map XML attributes in the same way as XML elements (and, as a consequence, to eliminate the unnecessary @children), but in this case the round trip is not possible (the inverse mapping will yield XML without attributes).
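
The Version 1 cases above (text child, element children, empty element, attributes) can be sketched in Python as follows. The function name to_children_form is hypothetical, and the sketch makes two simplifying assumptions: @children is always emitted as an array, and text that follows a child element (mixed content) is ignored.

```python
import xml.etree.ElementTree as ET


def to_children_form(elem):
    """Map an element to {name: {attributes..., "@children": [...]}} (hypothetical helper)."""
    body = dict(elem.attrib)  # attributes keep their plain names, e.g. "iri"
    children = []
    text = (elem.text or "").strip()
    if text:
        children.append(text)  # a text node becomes a string member
    for child in elem:
        children.append(to_children_form(child))
    body["@children"] = children  # an empty element yields an empty array
    return {elem.tag: body}


with_attribute = to_children_form(
    ET.fromstring('<Ind iri="http://example.com/people/jfk">John</Ind>'))
empty = to_children_form(ET.fromstring("<Or/>"))
nested = to_children_form(
    ET.fromstring("<And><Atom>a</Atom><Atom>b</Atom></And>"))
print(with_attribute, empty, nested, sep="\n")
```

Keeping the children under the reserved @children name is what separates them from attributes, so an inverse mapping can still tell the two apart.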

5.2 Version 2

Using

6 XSLT

For formally defining (and in suitable circumstances, implementing) the conversion from the XML-based RuleML syntax into JSON, an XSLT transformation may be used[10], as has been done in other cases[11], [12]. XSLT performance can be optimized following best practices[13] or using new developments in XSLT 3.0[14]. The inverse transformation (RuleML/JSON to RuleML/XML) can be implemented in JavaScript[15].
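
To illustrate the inverse direction for the encoding of Sections 2 and 3, here is a minimal sketch. It is written in Python rather than in the JavaScript mentioned above, and it is not based on any of the cited tools. The function name build_node is hypothetical, and the sketch assumes well-formed RuleML/JSON input in which "@" pairs only occur on the members of an edge's array.

```python
import xml.etree.ElementTree as ET


def build_node(obj):
    """Rebuild a Node element; also return the attributes meant for its enclosing edge."""
    edge_attrs = {key[1:]: value for key, value in obj.items() if key.startswith("@")}
    # Exactly one non-"@" pair names the Node; anything else raises ValueError.
    (tag,) = [key for key in obj if not key.startswith("@")]
    value = obj[tag]
    node = ET.Element(tag)
    if isinstance(value, str):
        node.text = value
    elif isinstance(value, list):
        # Array as the value of a Node: ordered child Nodes (no edges to carry attributes).
        for member in value:
            child, _ = build_node(member)
            node.append(child)
    elif isinstance(value, dict):
        # Object as the value of a Node: one edge element per array member.
        for edge_name, members in value.items():
            for member in members:
                child, attrs = build_node(member)
                edge = ET.SubElement(node, edge_name, attrs)
                edge.append(child)
    else:
        raise TypeError("unexpected JSON value for " + tag)
    return node, edge_attrs


atom = {"Atom": {
    "op": [{"Rel": "own"}],
    "arg": [{"@index": "1", "Var": "person"},
            {"@index": "2", "Var": "object"}]}}

element, _ = build_node(atom)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

The sketch relies on the syntactic context to tell the two kinds of arrays apart: an array directly under a Node name is ordered content, while an array under an edge name inside an object lists that edge's values.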

7 References

  1. RFC 4627, http://www.ietf.org/rfc/rfc4627.txt
  2. John Boyer, Sandy Gao, Susan Malaika, Michael Maximilien, Rich Salz, Jerome Simeon. Experiences with JSON and XML Transformations, IBM Submission to W3C Workshop on Data and Services Integration, October 20-21 2011, Bedford, MA, USA, (Experiences with JSON and XML Transformations slides)
  3. The JSON Data Interchange Format
  4. Apache Camel
  5. JSON-LD 1.0 A JSON-based Serialization for Linked Data, W3C Recommendation 16 January 2014. See also http://json-ld.org/
  6. Mark Joseph. 2010. XML to JSON and Back
  7. Extensible Markup Language (XML) 1.0, (Fifth Edition), W3C Recommendation 26 November 2008
  8. XML Schema Part 1: Structures, Second Edition W3C Recommendation 28 October 2004
  9. http://json.org/
  10. http://www.w3.org/TR/xslt-30/
  11. https://code.google.com/p/xml2json-xslt/
  12. http://www.w3.org/TR/xslt-30/xml-to-json.xsl
  13. http://www.dpawson.co.uk/xsl/sect4/N9883.html
  14. http://www.w3.org/TR/xslt-30/#whats-new-in-xslt3
  15. https://code.google.com/p/x2js/