PSOATransRun Development Tutorial

From RuleML Wiki

Authors: Gen Zou, Harold Boley, Tara Athan


PSOA RuleML, as described in the recent PSOAPerspectivalKnowledge paper, is a novel language for integrated object-relational data and knowledge. PSOATransRun is the PSOA RuleML reference implementation. It is realized by high-level source-to-source translation: a PSOA-to-PSOA transformation is followed by a PSOA-to-TPTP or PSOA-to-Prolog conversion, the latter targeting, e.g., the XSB Prolog engine for runtime execution.

PSOATransRun (which you can try with examples given or referenced in PSOA_RuleML, e.g. PSOARuleMLInvitedTutorialExamplesRW2015) has been developed by the first author over several years and has reached main Version 1.3 (current Version: PSOA RuleML#Prolog_Instantiation). While PSOA_RuleML#PSOATransRun gives an overview, and the README at http://psoa.ruleml.org/transrun/1.3/local/ gives a (brief) user introduction for PSOATransRun 1.3, this 5-session tutorial addresses (open-source) developers who desire a deeper understanding of PSOATransRun, e.g. because they are working on similar systems and/or are considering contributing to future releases.

The tutorial is being continued by open-source project meetings (see descriptions in Session 5):

  • Project 3: Mon, 15 Jan 2018, 7am Pacific / 10am Eastern / 11am Atlantic / 3pm UK / 4pm Central Europe / 5pm Greece
  • Project 1: Tue, 16 Jan 2018, 10am Pacific / 1pm Eastern / 2pm Atlantic / 6pm UK / 7pm Central Europe / 8pm Greece
  • Project 4: Wed, 17 Jan 2018, 7am Pacific / 10am Eastern / 11am Atlantic / 3pm UK / 4pm Central Europe / 5pm Greece

If you want to participate and do not know the Skype details, please send an email with your Skype ID to Harold Boley (address via above homepage).

To get the most out of this tutorial and the open-source projects, install:

(Across the two READMEs, the same JDK, containing JRE, should be used for both of these bullets.)

More links to information about PSOA RuleML 1.0 and PSOATransRun 1.3 are in the Newsbar item of 2017-12-11.

1 Overview

(Tue, 19 Dec 2017, 2pm Eastern)

This session presents the architectural big picture (data and control flow) of the PSOATransRun 1.3 instantiation targeting Prolog. It is implemented with GitHub-documented components in Java and ANTLR, which are kept in the repository https://github.com/RuleML/PSOATransRunComponents:

  1. Translator parts
    1. PSOA RuleML 1.0 text-to-tree parser
    2. Normalization: Chain of tree-to-tree transformers
    3. Conversion: Final tree-to-text generator for XSB Prolog
  2. Runtime part

Advanced features of PSOATransRun as shown by

java -jar PSOATransRunLocal.jar --longhelp

include the automated PSOATransRun (unit-)testing environment.

Development infrastructure:

  • Eclipse comes with the plugin EGit, a light-weight Git client (with a connection to users' GitHub accounts)
  • Directory structure (same between GitHub and website)
  • How to compile with Eclipse on the level of all directories and files
  • How to manage *.java (source) vs. *.jar (executable), e.g. GitHub no longer has separate download pages for *.jar files
  • What you may want to know on a low ("nuts and bolts") level
  • How to maintain the system, as it interacts with RuleML's virtual host (e.g., should something break such as a cron job no longer working)

2 PSOATransRun Workflow

(Thu, 21 Dec 2017, 2pm Eastern)

This session covers the overall workflow of PSOATransRun in Java, which is partly generated from ANTLR grammars. PSOATransRun uses ANother Tool for Language Recognition (ANTLR v3) to: 1. parse PSOA RuleML texts, 2. normalize the resulting trees (analogous to XSLT over XML DOM trees etc.), and 3. generate, from them, the targeted (e.g., XSB Prolog) texts.
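The three ANTLR uses can be pictured as a small pipeline. The following Python sketch is only an illustration of that data flow, not the actual Java/ANTLR code: the tuple-based tree, the function names, the two toy normalization steps, and the shape of the generated memterm fact are all assumptions made for the example.

```python
from functools import reduce

def parse(text):
    """Stage 1 (toy): read 'oid#class' membership text into a tree."""
    oid, cls = (part.strip() for part in text.split("#"))
    return ("MEMBER", oid, cls)

def underscore_locals(tree):
    """Toy transformer: make omitted underscores of local constants explicit."""
    label, *names = tree
    return (label, *[n if n.startswith("_") else "_" + n for n in names])

def quote_constants(tree):
    """Toy transformer: wrap constants in single quotes for the target text."""
    label, *names = tree
    return (label, *["'" + n + "'" for n in names])

# Stage 2: normalization as a chain of tree-to-tree transformers.
NORMALIZERS = [underscore_locals, quote_constants]

def normalize(tree):
    return reduce(lambda current, step: step(current), NORMALIZERS, tree)

def generate(tree):
    """Stage 3 (toy): emit target text from the normalized tree."""
    _, oid, cls = tree
    return "memterm({}, {}).".format(oid, cls)

def translate(text):
    return generate(normalize(parse(text)))

print(translate("John#Teacher"))
```

Keeping the transformers in a list mirrors the idea of a chain: a further normalization step would be added by appending one more tree-to-tree function.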

3 Parsing PSOA into ASTs

(Wed, 3 Jan 2018, 2pm Eastern)

This session covers ANTLR use 1 -- parsing a KB and query from PSOA RuleML 1.0 Presentation Syntax (PS) into an Abstract Syntax Tree (AST). The next sessions will cover ANTLR uses 2 and 3.

4 Normalizing ASTs

(Fri, 5 Jan 2018, 2pm Eastern)

This session discusses the normalization of ASTs, e.g. the central describution step, in which an oidful atom is normalized into a conjunction of single-descriptor atoms with the same OID. An example is given in PSOAPerspectivalKnowledge, where the (KB2) atoms describing the OID John are normalized to (KB1) atoms.
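To make the describution step concrete, here is a minimal Python sketch over a toy data model. The classes Tup and Slot and the function describute are hypothetical names, not part of PSOATransRun. The sketch also assumes that a predicate-independent descriptor is re-scoped to Top (as suggested by the ?w#Top(...) queries in the KB with Queries section), while a predicate-dependent one keeps its predicate.

```python
from typing import List, NamedTuple, Tuple, Union

class Tup(NamedTuple):
    dependent: bool          # True for +[...], False for -[...]
    terms: Tuple[str, ...]

class Slot(NamedTuple):
    dependent: bool          # True for name+>value, False for name->value
    name: str
    value: str

Descriptor = Union[Tup, Slot]

def show(d: Descriptor) -> str:
    """Render one descriptor in PSOA presentation syntax."""
    if isinstance(d, Tup):
        return ("+" if d.dependent else "-") + "[" + " ".join(d.terms) + "]"
    return d.name + ("+>" if d.dependent else "->") + d.value

def describute(oid: str, pred: str, descriptors: List[Descriptor]) -> List[str]:
    """Split one oidful atom into a membership plus single-descriptor atoms."""
    atoms = [f"{oid}#{pred}"]
    for d in descriptors:
        scope = pred if d.dependent else "Top"   # assumption, see lead-in
        atoms.append(f"{oid}#{scope}({show(d)})")
    return atoms

for atom in describute("John", "Student", [
        Tup(True, ("Mon", "Tue", "Fri")),
        Tup(False, ("1995", "8", "17")),
        Slot(True, "coursehours", "20"),
        Slot(True, "dept", "Math"),
        Slot(False, "gender", "male")]):
    print(atom)
```

All resulting atoms share the OID John, so their conjunction carries the same information as the original multi-descriptor atom.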

5 Generating Prolog from Normalized ASTs and Defining Projects for Enhancing PSOATransRun

(Tue, 9 Jan 2018, 11am Eastern)

This session first discusses AST normalization and conversion to Prolog.

  • Prolog runtime predicates for PSOA RuleML 1.0: memterm, sloterm, tupterm, prdsloterm, prdtupterm
  • Generating Prolog with runtime predicates from the transformed trees
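As a rough illustration of the generation step, the Python sketch below renders single-descriptor atoms as Prolog facts using the runtime predicate names listed above. Only the five predicate names come from this page. The argument orders (OID first, then the predicate for the prd* variants), the quoting of constants, and the helper names are assumptions that should be checked against the actual PSOATransRun output.

```python
def prolog_atom(term: str) -> str:
    """Quote constants as Prolog atoms; leave integer literals bare."""
    if term.lstrip("-").isdigit():
        return term
    return "'" + term.replace("'", "''") + "'"

def call(functor: str, *args: str) -> str:
    """Build one Prolog fact from a functor and its arguments."""
    return functor + "(" + ", ".join(prolog_atom(a) for a in args) + ")."

def membership(oid, cls):
    return call("memterm", oid, cls)

def slot(oid, cls, name, value, dependent):
    # Assumed: dependent slots keep their predicate, independent ones drop it.
    if dependent:
        return call("prdsloterm", oid, cls, name, value)
    return call("sloterm", oid, name, value)

def tuple_(oid, cls, terms, dependent):
    # Assumed: same dependent/independent split as for slots.
    if dependent:
        return call("prdtupterm", oid, cls, *terms)
    return call("tupterm", oid, *terms)

print(membership("_John", "_Teacher"))
print(tuple_("_John", "_Teacher", ["_Wed", "_Thu"], dependent=True))
print(slot("_John", "_Teacher", "_dept", "_Physics", dependent=True))
print(tuple_("_John", "_Student", ["1995", "8", "17"], dependent=False))
print(slot("_John", "_Student", "_gender", "_male", dependent=False))
```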

We will then discuss project proposals for enhancing PSOATransRun.

All versions of PSOATransRun through 1.3 have required explicit underscore prefixes to mark local constants (individuals, functions, predicates, and slot names). In PSOA RuleML's abridged presentation syntax (used as the de facto 'publication syntax'), underscores are understood -- and can be discovered and reconstructed -- for relevant non-prefixed names.

The first author (performing a preliminary project on local-constant discovery) has automated, in Java and ANTLR, a basic PSOATransRun 1.3 upgrade for the discovery of relevant names and the reconstruction of their omitted underscores as a tokenizer-level 'pre-processing' translation, realizing the central part of an abridged-to-unabridged PSOA RuleML transformation.

Basically, a non-prefixed name is tokenized as if it were _name, with the notable exception of name being one of the following:

  • Special predicates: Top, ...
  • Primitive predicates: External [might be deprecated after PSOA RuleML 1.0 release], ...
  • Datatype individuals: Integers, ...
  • Translator directives: Prefix, Import, ...

The exceptions should be represented in a declarative, easily maintainable and reusable manner. E.g., the PSOA RuleML 1.0 prerelease may thus be augmented by new special individuals:

  • Special individuals: ..., False [for Or()], True [for And()]
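The reconstruction of omitted underscores, with its exceptions kept in one declarative table, could look roughly like the Python sketch below. It is not the actual Java/ANTLR tokenizer. The regular expression, the function name reconstruct, and the KEEP set are assumptions: only Top, External, Prefix, and Import are named above, while the remaining keywords are taken from the sample KB.

```python
import re

# Declarative exception table: names that never receive an underscore.
KEEP = {
    "Top",                                   # special predicates
    "External",                              # primitive predicates
    "Prefix", "Import",                      # translator directives
    "RuleML", "Assert", "Forall", "Exists",  # assumed structural keywords
    "And", "Or",
}

TOKEN = re.compile(r'''
    (?P<comment>%[^\n]*)                           # "%" comment up to end of line
  | (?P<iri><[^>\s]*>)                             # full IRI
  | (?P<string>"[^"]*")                            # string literal
  | (?P<var>\?\w*)                                 # variable such as ?o or ?
  | (?P<prefixed>[A-Za-z]\w*:(?:\w+(?:-\w+)*)?)    # pred:numeric-greater-than
  | (?P<number>\d+(?:\.\d+)?)                      # datatype individuals (numbers)
  | (?P<name>[A-Za-z_]\w*(?:-\w+)*)                # candidate local constant
''', re.VERBOSE)

def reconstruct(text: str) -> str:
    """Prefix relevant non-prefixed names with '_'; leave the rest as-is."""
    def fix(match):
        name = match.group("name")
        if name is None or name.startswith("_") or name in KEEP:
            return match.group(0)
        return "_" + name
    return TOKEN.sub(fix, text)

print(reconstruct("John#Teacher(+[Wed Thu] coursehours+>12 income->29400)"))
print(reconstruct("?w#Top(gender->male)"))
```

Because already-underscored names are left untouched, a text in which some local-constant occurrences are unabridged passes through unchanged for those occurrences. Augmenting the exceptions, e.g. by new special individuals, would only require extending the KEEP table.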

Building on the results of the preliminary local-constant discovery project, two PSOATransRun-evolution projects are proposed, Project 1 preparing Project 2. Based on orthogonal platform considerations, PSOATransRun-evolution Project 3 is proposed. For further evolution, Project 4 is proposed. Moreover, several PSOATransRun-application projects have emerged, including the Loan Processor decision model transcribed from POSL to PSOA (LoanProcessor) and a medical-legal classification of devices formalized from an EU Regulation (Medical Devices Rules).

Project 1: PSOATransRun Parameterization: [Done] Allow, even as the default, executing PSOATransRun with a KB and queries using abridged local-constant syntax from which unabridged syntax will be reconstructed (one or more local-constant occurrences may already be unabridged). Add an optional parameter -c with long form --explicitLocalConstants for the explicitly specified case of no local-constant reconstruction but rigid syntax checking:

java -jar PSOATransRunLocal.jar -c ...

[ToDo] See if further helpful parameterizations could be easily added such as for targeting the XSB vs. SWI engine (cf. Project 3).

Project 2: PSOATransRun GUI: [ToDo] Bring the top-level loop, e.g. loading different KBs, from the command-line console into a GUI for the PSOATransRun[PSOA2Prolog,XSBProlog] instantiation such that users can more easily vary selected parameters without need for quitting and re-entering the system. For the convenience of all users, employ the same command-line names -short with long form --long (e.g., -c with long form --explicitLocalConstants) also in the GUI, and make them additionally available as graphic (e.g., drop-down menu) actions. Take into account that the GUI should ultimately be Web-based (cf. PSOATransRun_Development_Agenda#Openshift), e.g. informed by the earlier PSOATransRun[PSOA2TPTP,VampirePrime] instantiation (i.e., by http://wiki.ruleml.org/index.php/PSOA_RuleML#TPTP_Instantiation).

Project 3: PSOATransRun SWI: [ToDo] Explore the use of SWI Prolog for PSOATransRun, i.e. how PSOATransRun[PSOA2Prolog,XSBProlog] can be changed to PSOATransRun[PSOA2Prolog,SWIProlog], where declarations for XSB Prolog's predicate tabling may be adapted for SWI Prolog's tabling. Even without a tabling substitute, this will facilitate porting the relational case study "A Rule-Based Approach for Air Traffic Control in the Vicinity of the Airport" by Theodoros Mitsikas, Petros Stefaneas, and Iakovos Ouranos from POSL to PSOA. Since this case study currently employs POSL (on OO jDREW) and SWI Prolog in a (purely) relational manner, its version employing PSOA (on PSOATransRun) and SWI Prolog can also help explore the behavior of PSOATransRun 1.3's static/dynamic OID virtualization for larger relational KBs. Generalized KB clauses can then be added, such as by extending aircraft modeling from (oidless) relationships to psoa atoms with OIDs for unique tracking and slots for optional properties. At that time, SWI Prolog's tabling may become unavoidable.

Project 4: PSOATransRun Maintenance: [ToDo] Utilizing the automated (unit-)testing environment (see the README.txt at the bottom of https://github.com/RuleML/PSOATransRunComponents/tree/master/PSOATransRun/test), construct further PSOATransRun test cases in PSOA RuleML, some of which could become the seeds of later applications. For test cases that pinpoint an issue whose resolution is agreed upon, plan and implement fixes.

[Done] To exemplify, for #KB_with_Queries, the test cases

> _Student(+[_Mon _Tue _Fri])    % Relational query with explicit tuple, marked as predicate-dependent
line 1:100 mismatched character '<EOF>' expecting '\n'
Answer(s):
Yes

> _Student(+[_Mon _Tue _Fri])    %
line 1:32 no viable alternative at character '<EOF>'
Answer(s):
Yes

pinpoint that "%"-inline-commented queries yield correct Answer(s) but have an <EOF> issue. Since "%"-comment parsing inadvertently 'eats up' its terminating newline, \n, the resolution is to skip characters that are not \n in a "%"-prefixed line (even if empty).
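The difference between the two comment-skipping strategies can be reproduced with plain regular expressions. The Python sketch below is an analogy, not the ANTLR lexer rule itself. Both patterns and the helper strip_comments are assumptions chosen to mimic the described behavior.

```python
import re

# Mimics the issue: the comment pattern insists on consuming a final "\n".
NEWLINE_EATING = re.compile(r"%[^\n]*\n")

# Mimics the resolution: skip only characters that are not "\n".
NEWLINE_SAFE = re.compile(r"%[^\n]*")

def strip_comments(text: str) -> str:
    """Remove "%" comments while leaving every newline in place.

    Simplification: a "%" inside a string literal or IRI is not protected.
    """
    return NEWLINE_SAFE.sub("", text)

query = "_Student(+[_Mon _Tue _Fri])    %"   # empty comment, input ends here

print(NEWLINE_EATING.search(query))   # None: no "\n" before end of input
print(repr(strip_comments(query)))
print(repr(strip_comments("p(a)  % note\nq(b)")))
```

With the newline-safe pattern, an inline comment at the very end of the input no longer needs a terminating \n, and a newline that does follow a comment remains available as a line separator.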

For all projects, practice the PSOATransRun Production Cycle (e.g., towards PSOATransRun 1.4, as planned in PSOATransRun Development Agenda):

  1. Prepare unit testing with current counter-examples (cf. Project 4)
    1. Create test (-KB, -query, -answer) files for testing environment
    2. Execute test files with expected presence of errors
  2. Do Java and ANTLR changes with instant inline documentation
  3. Perform unit testing with what now should be examples (cf. Project 4)
    1. Reuse or augment test files of 1.1
    2. Execute test files with expected absence of errors
    3. Go to 2 if any error persists
  4. Document (I)
    1. Add more inline documentation
    2. Update README
  5. Publish on GitHub with expressive Commit message
  6. Document (II)
    1. Update/Create MediaWiki page(s)
    2. Update/Create LaTeX paper(s)

6 KB with Queries

Modifying the visualized Rich TA clauses (by changing a specific workload fact into a general workload rule, based on two new coursehours slots), the following abridged/unabridged KB reproduces Fig. 2 of the PSOAPerspectivalKnowledge paper, augmented by sample Queries, all for copy&paste into PSOATransRun. The subsequent hints facilitate further explorations by readers.

6.1 From Abridged to Unabridged

KB (abridged, but still RuleML/Prefix+Assert-wrapped):

RuleML (
  Prefix(pred: <http://www.w3.org/2007/rif-builtin-predicate#>)
  
  Assert (                      % (KB)
                                   % (KB2#)
    Teacher##Scholar                 % Taxonomy
    Student##Scholar
    TA##Teacher
    TA##Student
                                     % Data (ii)
    John#Teacher(+[Wed Thu]
                   coursehours+>12 dept+>Physics salary+>29400
                                                 income->29400)
    John#Student(+[Mon Tue Fri] -[1995 8 17] 
                   coursehours+>20 dept+>Math gender->male)
    
    Forall ?o (                    % (R1)
      ?o#TA(workload+>high) :-
        And(?o#Teacher(coursehours+>?ht)
            External(pred:numeric-greater-than(?ht 10))    % ?ht > 10
            ?o#Student(coursehours+>?hs)
            External(pred:numeric-greater-than(?hs 18)))   % ?hs > 18
    )
  )
)

[Successful] Queries (abridged):

% Fact ("look-up") querying: Boolean (everything fixed)
John#Teacher(+[Wed Thu] coursehours+>12 dept+>Physics salary+>29400 income->29400)

% Fact ("distributed look-up") querying: Boolean (everything fixed)
And(John#Teacher(+[Wed Thu] coursehours+>12) John#Teacher(dept+>Physics salary+>29400 income->29400))

% Fact ("look-up") querying: OID-anchored (OID fixed, descriptors variable)
John#Teacher(+[?dy1 ?dy2] coursehours+>?ch dept+>?dt salary+>?sa income->?in)
John#Teacher(+[? ?] coursehours+>?ch dept+>?dt salary+>?si income->?si)

% Fact ("look-up") querying: Descriptor-associative (OID variable, descriptors fixed)
?who#Teacher(+[Wed Thu] coursehours+>12 dept+>Physics salary+>29400 income->29400)

% Fact ("look-up") querying: Descriptor-pattern (OID variable, descriptors variable)
?who#Teacher(+[?dy1 ?dy2] coursehours+>?ch dept+>?dt salary+>?sa income->?in)
?who#Teacher(+[? ?] coursehours+>?ch dept+>?dt salary+>?si income->?si)

% Fact ("look-in") querying: Relational- and Graph-style

Student(+[Mon Tue Fri])    % Relational query with explicit tuple, marked as predicate-dependent
Student(Mon Tue Fri)       % Relational query with implicit tuple, understood as predicate-dependent
?#Student(Mon Tue Fri)     % Relational-like query with don't-care Object-IDentifier (OID) variable
?w#Student(Mon Tue Fri)    % Relational-like query with named OID variable
John#Student(Mon Tue Fri)  % Relational-like query with OID constant 
John#Student(?d1 ?d2 ?d3)
?w#Teacher(Wed Thu)
?w#Teacher(?d1 ?d2)
?w#Student(-[1995 ?m 17])  % Relational-like query with explicit tuple, marked as predicate-independent
?w#Student(-[?y ?m ?d])
?w#Teacher(-[?y ?m ?d])
?w#Top(-[?y ?m ?d])

John#Student(gender->male)    % Graph query with OID constant typed by Student predicate and described by independent slot
John#Teacher(gender->male)    % Graph query with OID constant typed by Teacher predicate and described by independent slot
?w#Top(gender->male)          % Graph query with OID variable typed by any predicate and described by independent slot
John#Student(dept+>?d)    % Graph query with OID constant typed by Student predicate and described by dependent slot
John#Teacher(dept+>?d)    % Graph query with OID constant typed by Teacher predicate and described by dependent slot
John#?p(dept+>?d)         % Graph query with OID constant typed by predicate variable and described by dependent slot
?w#?p(coursehours+>?h)
And(?w#Teacher(coursehours+>?ht) ?w#Student(coursehours+>?hs))

% Explicit (R1) Rule ("inferential") querying: Deriving a descriptor via thresholds of other descriptors 

John#TA(workload+>?l)   % Graph query with OID constant typed by TA predicate and described by dependent slot
?w#?p(workload+>?l)     % Graph query with OID variable typed by predicate variable and described by dependent slot

% Implicit (##) Rule ("inferential") querying: Deriving an OID membership from OID's membership in a subclass

John#Scholar  % Derivable from describution-obtained John#Teacher (via Teacher##Scholar) or John#Student (via Student##Scholar)

% Querying of two Facts as well as the Explicit and one Implicit Rule: Combining retrieval with (general and taxonomic) inference

John#?p        % Finds all four predicates for John

KB (unabridged):

RuleML (
  Prefix(pred: <http://www.w3.org/2007/rif-builtin-predicate#>)
  
  Assert (                      % (KB)
                                   % (KB2#)
    _Teacher##_Scholar               % Taxonomy
    _Student##_Scholar
    _TA##_Teacher
    _TA##_Student
                                     % Data (ii)
    _John#_Teacher(+[_Wed _Thu]
                   _coursehours+>12 _dept+>_Physics _salary+>29400
                                                    _income->29400)
    _John#_Student(+[_Mon _Tue _Fri] -[1995 8 17] 
                   _coursehours+>20 _dept+>_Math _gender->_male)
    
    Forall ?o (                    % (R1)
      ?o#_TA(_workload+>_high) :-
        And(?o#_Teacher(_coursehours+>?ht)
            External(pred:numeric-greater-than(?ht 10))    % ?ht > 10
            ?o#_Student(_coursehours+>?hs)
            External(pred:numeric-greater-than(?hs 18)))   % ?hs > 18
    )
  )
)

[Successful] Queries (unabridged, although not Query-wrapped):

% Fact ("look-up") querying: Boolean (everything fixed)
_John#_Teacher(+[_Wed _Thu] _coursehours+>12 _dept+>_Physics _salary+>29400 _income->29400)

% Fact ("distributed look-up") querying: Boolean (everything fixed)
And(_John#_Teacher(+[_Wed _Thu] _coursehours+>12) _John#_Teacher(_dept+>_Physics _salary+>29400 _income->29400))

% Fact ("look-up") querying: OID-anchored (OID fixed, descriptors variable)
_John#_Teacher(+[?dy1 ?dy2] _coursehours+>?ch _dept+>?dt _salary+>?sa _income->?in)
_John#_Teacher(+[? ?] _coursehours+>?ch _dept+>?dt _salary+>?si _income->?si)

% Fact ("look-up") querying: Descriptor-associative (OID variable, descriptors fixed)
?who#_Teacher(+[_Wed _Thu] _coursehours+>12 _dept+>_Physics _salary+>29400 _income->29400)

% Fact ("look-up") querying: Descriptor-pattern (OID variable, descriptors variable)
?who#_Teacher(+[?dy1 ?dy2] _coursehours+>?ch _dept+>?dt _salary+>?sa _income->?in)
?who#_Teacher(+[? ?] _coursehours+>?ch _dept+>?dt _salary+>?si _income->?si)

% Fact ("look-in") querying: Relational- and Graph-style

_Student(+[_Mon _Tue _Fri])    % Relational query with explicit tuple, marked as predicate-dependent
_Student(_Mon _Tue _Fri)       % Relational query with implicit tuple, understood as predicate-dependent
?#_Student(_Mon _Tue _Fri)     % Relational-like query with don't-care Object-IDentifier (OID) variable
?w#_Student(_Mon _Tue _Fri)    % Relational-like query with named OID variable
_John#_Student(_Mon _Tue _Fri) % Relational-like query with OID constant 
_John#_Student(?d1 ?d2 ?d3)
?w#_Teacher(_Wed _Thu)
?w#_Teacher(?d1 ?d2)
?w#_Student(-[1995 ?m 17])     % Relational-like query with explicit tuple, marked as predicate-independent
?w#_Student(-[?y ?m ?d])
?w#_Teacher(-[?y ?m ?d])
?w#Top(-[?y ?m ?d])

_John#_Student(_gender->_male)    % Graph query with OID constant typed by _Student predicate and described by independent slot
_John#_Teacher(_gender->_male)    % Graph query with OID constant typed by _Teacher predicate and described by independent slot
?w#Top(_gender->_male)            % Graph query with OID variable typed by any predicate and described by independent slot
_John#_Student(_dept+>?d)    % Graph query with OID constant typed by _Student predicate and described by dependent slot
_John#_Teacher(_dept+>?d)    % Graph query with OID constant typed by _Teacher predicate and described by dependent slot
_John#?p(_dept+>?d)          % Graph query with OID constant typed by predicate variable and described by dependent slot
?w#?p(_coursehours+>?h)
And(?w#_Teacher(_coursehours+>?ht) ?w#_Student(_coursehours+>?hs))

% Explicit (R1) Rule ("inferential") querying: Deriving a descriptor via thresholds of other descriptors

_John#_TA(_workload+>?l)   % Graph query with OID constant typed by _TA predicate and described by dependent slot
?w#?p(_workload+>?l)       % Graph query with OID variable typed by predicate variable and described by dependent slot

% Implicit (##) Rule ("inferential") querying: Deriving an OID membership from OID's membership in a subclass

_John#_Scholar  % Derivable from describution-obtained _John#_Teacher (via _Teacher##_Scholar) or _John#_Student (via _Student##_Scholar)

% Querying of two Facts as well as the Explicit and one Implicit Rule: Combining retrieval with (general and taxonomic) inference

_John#?p        % Finds all four predicates for _John

6.2 Further Explorations

  • Modify and add Queries
  • Modify and add facts and rules to the abridged or unabridged KB variant, watching answers for Queries change and new Queries become possible
  • Start KB expansion with Mary/_Mary facts, e.g. with the abridged/unabridged relational-like fact below, and request more answers for Queries
    Mary#Teacher(Wed Fri)
    _Mary#_Teacher(_Wed _Fri)
  • Continue KB expansion with slot-enrichment rules, e.g. with a rule that adds a conclusion slot coursehours+>?h for a condition conjunct ?h = External(func:numeric-add(?ht ?hs)) [the original and the new rule can then be merged into one augmented rule]
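As a worked check of what the merged rule should derive, the following Python sketch emulates rule (R1) plus the proposed coursehours sum over a toy fact table. It does not call PSOATransRun. The dictionary layout and the function derive_ta are assumptions for illustration, with the thresholds 10 and 18 and John's values 12 and 20 taken from the KB above.

```python
# Dependent coursehours slots per (OID, predicate); Mary has none yet.
FACTS = {
    ("John", "Teacher"): {"coursehours": 12},
    ("John", "Student"): {"coursehours": 20},
    ("Mary", "Teacher"): {},
}

def derive_ta(facts):
    """Emulate (R1) merged with the slot-enrichment rule.

    For each OID that is both Teacher and Student with ?ht > 10 and
    ?hs > 18, conclude workload+>high and coursehours+>(?ht + ?hs).
    """
    derived = {}
    for oid in sorted({oid for oid, _ in facts}):
        ht = facts.get((oid, "Teacher"), {}).get("coursehours")
        hs = facts.get((oid, "Student"), {}).get("coursehours")
        if ht is not None and hs is not None and ht > 10 and hs > 18:
            derived[oid] = {"workload": "high", "coursehours": ht + hs}
    return derived

print(derive_ta(FACTS))
```

For John, 12 > 10 and 20 > 18 hold, so the sketch yields workload high and a summed coursehours value of 32. Mary has no coursehours slots, so nothing is derived for her until corresponding facts are added.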