Dexter and RuleML

From RuleML Wiki
Jump to: navigation, search

Dexter is the Stanford Logic Group's browser-based, domain-independent data explorer for the everyday user, synthesizing and extending similar earlier systems by the group: Logical Spreadsheets, Logicalc, etc. Dexter enables users to plug-n-play with heterogeneous, Web-accessible, structured data sources in a unified framework.

In particular, Dexter allows the following (see, e.g., http://dexter.stanford.edu/primer.pdf and http://stanford.edu/~abhijeet/papers/abhijeetSARA13.pdf):

  1. Create, edit and query tables locally in the browser.
  2. Export and share local tables.
  3. Import the data from local and remote CSV, XML and JSON files into local table(s).
  4. Extract the contents of Web-pages into local tables.
  5. Dynamically connect to remote MySQL databases and JSON(P)-APIs, and query them as well as shared tables.
  6. Define (integrity and derivation) rules in Dexlog, an extension of Datalog using negation-as-failure, sets, tuples, aggregates, and built-in arithmetic and comparison operators

Abhijeet Mohapatra's and Sudhir Agarwal's UNB Computer Science Seminar Series talk Dexter: A Browser-based Data Explorer for End Users gives an overview (slides). A RuleML 2015 paper gives details (slides).

A video explaining some basic features of Dexter is available at Introduction to Dexter.

RuleML is described in these wiki pages, including the survey Introducing RuleML.

Beginning with Michael Genesereth's Keynote at RuleML 2010, several connections between RuleML and (what later became) Dexter have been explored. We describe here some of the specific efforts planned by a joint Dexter and RuleML team:

  • Short-term planning stage: Export and import of Dexter tables as Naf Datalog RuleML (ground) facts
  • Midterm planning stage: Export and import of Dexter rules as Naf Datalog RuleML (integrity and derivation) rules
  • Strategic planning stage (to be described): Library of built-ins and aggregates

The Dexter Team has already finished the implementation -- in JavaScript, for modern browsers such as Firefox and Chrome -- of Dexter fact and rule export to RuleML/XML:

  • In Version 2, create a new table [by pressing the upper-left "+" button, and then, e.g., copy and paste the newline-separated rules ancestor(X, Y) :- parent(X, Y) and ancestor(X, Y) :- parent(X, Z) & ancestor(Z, Y) into the blank area above the data grid]
  • In the "Export Table" menu, select "Data and Rules as RuleML/XML"

The following sections discuss the relevant correspondences between Dexter and RuleML, with examples taken from the above-linked Dexter primer. While the various Dexter examples require different minimal expressivity for their RuleML schemas, the discussion will converge on the schema nafdatalog.xsd (Naf Datalog: Datalog with Negation as failure), which will be used for the final version of all examples.

1 Dexter Tables and RuleML Facts

Each Dexter table named t can be represented as a rulebase of Datalog RuleML ground facts with the same relation symbol t, and vice versa.

For example, the 3-row, 2-column Dexter people table

X Y
Alex male
Alex female
Siri female

in Datalog RuleML could be represented as the 3-fact rulebase with 1 binary relation symbol, people,

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://deliberation.ruleml.org/1.01/xsd/datalog.xsd"?>
<RuleML xmlns="http://ruleml.org/spec">
  <Assert>
    <Atom>
      <Rel>people</Rel>
      <Ind>Alex</Ind>
      <Data>male</Data>
    </Atom>
    <Atom>
      <Rel>people</Rel>
      <Ind>Alex</Ind>
      <Data>female</Data>
    </Atom>
    <Atom>
      <Rel>people</Rel>
      <Ind>Siri</Ind>
      <Data>female</Data>
    </Atom>
  </Assert>
</RuleML>

Here, Alex and Siri are represented as individuals (hence marked up as <Ind>), while male and female are represented as data (hence marked up as <Data>). For uniformity, all cells of Dexter tables will be represented as data (hence marked up as <Data>), obtaining

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://deliberation.ruleml.org/1.01/xsd/datalog.xsd"?>
<RuleML xmlns="http://ruleml.org/spec">
  <Assert>
    <Atom>
      <Rel>people</Rel>
      <Data>Alex</Data>
      <Data>male</Data>
    </Atom>
    <Atom>
      <Rel>people</Rel>
      <Data>Alex</Data>
      <Data>female</Data>
    </Atom>
    <Atom>
      <Rel>people</Rel>
      <Data>Siri</Data>
      <Data>female</Data>
    </Atom>
  </Assert>
</RuleML>

Here and below, the schema-language-agnostic xml-model processing instruction (XML Core Working Group Note) is used to associate schema documents -- for now RuleML's datalog.xsd -- with the XML instance document in which they occur.

Similarly, the 2-row, 2-column Dexter parent table

X Y
Alex Mary
Mary Tom

in Datalog RuleML could be represented as the 2-fact rulebase with 1 binary relation symbol, parent,

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://deliberation.ruleml.org/1.01/xsd/datalog.xsd"?>
<RuleML xmlns="http://ruleml.org/spec">
  <Assert>
    <Atom>
      <Rel>parent</Rel>
      <Ind>Alex</Ind>
      <Ind>Mary</Ind>
    </Atom>
    <Atom>
      <Rel>parent</Rel>
      <Ind>Mary</Ind>
      <Ind>Tom</Ind>
    </Atom>
  </Assert>
</RuleML>

Here, Alex, Mary, and Tom, are represented as individuals. Since, for uniformity, all cells of Dexter tables are again represented as data, we obtain

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://deliberation.ruleml.org/1.01/xsd/datalog.xsd"?>
<RuleML xmlns="http://ruleml.org/spec">
  <Assert>
    <Atom>
      <Rel>parent</Rel>
      <Data>Alex</Data>
      <Data>Mary</Data>
    </Atom>
    <Atom>
      <Rel>parent</Rel>
      <Data>Mary</Data>
      <Data>Tom</Data>
    </Atom>
  </Assert>
</RuleML>

The representation of a system of k Dexter tables uses a RuleML rulebase containing facts for k relation symbols. For example, with the data-only method, the k=2 above Dexter tables can be combined in a single 5-fact Datalog RuleML rulebase for k=2 relation symbols, people and parent,

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://deliberation.ruleml.org/1.01/xsd/datalog.xsd"?>
<RuleML xmlns="http://ruleml.org/spec">
  <Assert>
    <Atom>
      <Rel>people</Rel>
      <Data>Alex</Data>
      <Data>male</Data>
    </Atom>
    <Atom>
      <Rel>people</Rel>
      <Data>Alex</Data>
      <Data>female</Data>
    </Atom>
    <Atom>
      <Rel>people</Rel>
      <Data>Siri</Data>
      <Data>female</Data>
    </Atom>
    <Atom>
      <Rel>parent</Rel>
      <Data>Alex</Data>
      <Data>Mary</Data>
    </Atom>
    <Atom>
      <Rel>parent</Rel>
      <Data>Mary</Data>
      <Data>Tom</Data>
    </Atom>
  </Assert>
</RuleML>

The above rulebases refer to the RuleML anchor schema for Datalog in XSD, http://deliberation.ruleml.org/1.01/xsd/datalog.xsd, allowing both facts and rules (which will be needed later). Employing MYNG 1.01, a more fine-grained, table-oriented schema, http://deliberation.ruleml.org/1.01/myng-b1-d7-a7-l1-p6-i4-t200-q0-e0-s2.rnc, has been configured for data-only ground-fact Datalog in RNC. This can be referred to in xml-model declarations instead of the above datalog.xsd references, and could also be provided as a new RuleML anchor schema in XSD, e.g. as http://deliberation.ruleml.org/1.02/xsd/tablefact.xsd. However, for uniformity across all possible Dexter facts and rules, the datalog.xsd schema will be relaxed to the more course-grained anchor schema nafdatalog.xsd.

2 Dexter Rules and RuleML Rules

Two kinds of rules will be considered, which are usually combined. We will start with integrity rules and proceed to derivation rules.

2.1 Integrity Rules

A Dexter integrity rule can be represented as a Datalog RuleML rule with a special head.

For example, the Dexter integrity rule (using 'head-if-body' syntax)

illegal :- people(X, "male") & people(X, "female")

in Datalog RuleML could be represented as the rulebase with 1 rule (using 'body-then-head' syntax)

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://deliberation.ruleml.org/1.01/xsd/nafnegdisdatalogplus.xsd"?>
<RuleML xmlns="http://ruleml.org/spec">
  <Assert>
    <Forall>
      <Var>X</Var>
      <Implies>
        <And>
          <Atom>
            <Rel>people</Rel>
            <Var>X</Var>
            <Data>male</Data>
          </Atom>
          <Atom>
            <Rel>people</Rel>
            <Var>X</Var>
            <Data>female</Data>
          </Atom>
        </And>
        <Or/>
      </Implies>
    </Forall>
  </Assert>
</RuleML>

Here, the negative integrity in the head of the implication is written as an empty disjunction, <Or/> (which requires a more general Deliberation RuleML 1.01 anchor schema, nafnegdisdatalogplus.xsd, to be referred to in xml-model). For a direct Dexter correspondence, an atomic formula with a distinguished nullary illegal relation will instead be used in the head of the implication (which requires a less general anchor schema, nafdatalog.xsd), obtaining

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://deliberation.ruleml.org/1.01/xsd/nafdatalog.xsd"?>
<RuleML xmlns="http://ruleml.org/spec">
  <Assert>
    <Forall>
      <Var>X</Var>
      <Implies>
        <And>
          <Atom>
            <Rel>people</Rel>
            <Var>X</Var>
            <Data>male</Data>
          </Atom>
          <Atom>
            <Rel>people</Rel>
            <Var>X</Var>
            <Data>female</Data>
          </Atom>
        </And>
        <Atom>
          <Rel>illegal</Rel>
        </Atom>
      </Implies>
    </Forall>
  </Assert>
</RuleML>

Employing MYNG 1.01, a schema refining nafnegdisdatalogplus.xsd has been configured as http://deliberation.ruleml.org/1.01/myng-b7-d7-a7-l1-p22-i32-t300-q0-e0-s3.rnc, and a schema refining nafdatalog.xsd has been configured as http://deliberation.ruleml.org/1.01/myng-b7-d7-a7-l1-p22-i2-t300-q0-e0-s3.rnc. However, the schema nafdatalog.xsd will be again kept for uniformity.

On top of the people table/rulebase of the 'Tables and Facts' section, the above Dexter/RuleML rule for X = Alex in the body will conclude illegal, signaling an integrity violation.

2.2 Derivation Rules

A Dexter derivation rule can be represented as a Datalog RuleML rule with a non-illegal head.

For example, the Dexter derivation rules (the second being recursive)

ancestor(X, Y) :- parent(X, Y)
ancestor(X, Y) :- parent(X, Z) & ancestor(Z, Y)

in Datalog RuleML will be represented as the rulebase with 2 rules

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://deliberation.ruleml.org/1.01/xsd/nafdatalog.xsd"?>
<RuleML xmlns="http://ruleml.org/spec">
  <Assert>
    <Forall>
      <Var>X</Var>
      <Var>Y</Var>
      <Implies>
        <Atom>
          <Rel>parent</Rel>
          <Var>X</Var>
          <Var>Y</Var>
        </Atom>
        <Atom>
          <Rel>ancestor</Rel>
          <Var>X</Var>
          <Var>Y</Var>
        </Atom>
      </Implies>
    </Forall>
    <Forall>
      <Var>X</Var>
      <Var>Y</Var>
      <Var>Z</Var>
      <Implies>
        <And>
          <Atom>
            <Rel>parent</Rel>
            <Var>X</Var>
            <Var>Z</Var>
          </Atom>
          <Atom>
            <Rel>ancestor</Rel>
            <Var>Z</Var>
            <Var>Y</Var>
          </Atom>
        </And>
        <Atom>
          <Rel>ancestor</Rel>
          <Var>X</Var>
          <Var>Y</Var>
        </Atom>
      </Implies>
    </Forall>
  </Assert>
</RuleML>

On top of the parent table/rulebase of the 'Tables and Facts' section, the above Dexter and RuleML rules will derive, respectively, Dexter's 3-row, 2-column ancestor computed table

X Y
Alex Mary
Mary Tom
Alex Tom

and RuleML's corresponding 3-fact, binary ancestor rulebase

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://deliberation.ruleml.org/1.01/xsd/nafdatalog.xsd"?>
<RuleML xmlns="http://ruleml.org/spec">
  <Assert>
    <Atom>
      <Rel>ancestor</Rel>
      <Data>Alex</Data>
      <Data>Mary</Data>
    </Atom>
    <Atom>
      <Rel>ancestor</Rel>
      <Data>Mary</Data>
      <Data>Tom</Data>
    </Atom>
    <Atom>
      <Rel>ancestor</Rel>
      <Data>Alex</Data>
      <Data>Tom</Data>
    </Atom>
  </Assert>
</RuleML>

Similarly to the above discussion about base relations, a more specific schema (ultimately, tablefact.xsd), could have been employed for such computed tables, but nafdatalog.xsd is used here for uniformity.

3 Dexter/RuleML Rules Merged with Tables/Facts

Both kinds of rules can be combined with each other as well as merged with Dexter tables and RuleML rulebases.

For example, the combined Dexter/RuleML people-integrity rule and ancestor-derivation rules of the 'Rules' section can be merged with the system of people and parent tables/facts of the 'Tables and Facts' section. When merging rules and facts, their merged xml-model declaration has to use the more expressive schema nafdatalog.xsd in any case.