I developed the PILS programming language and system between 1979 and 2008, originally as an attempt to improve on Lisp. However, under the influence of Prolog, C++, XSLT and SQL, and certain features of spoken Danish, it grew into something quite different.
Like Lisp, PILS uses a unified data model for programs and data, but whereas Lisp uses lists, PILS uses attribute-based nodes as the building blocks of its data model. For brevity, attributes are referred to as legs. The simple functions of Lisp have, by Prolog inspiration, given way to rulesets based on pattern matching, which (inspired by C++) are treated as objects and can be combined in many ways.
PILS is usable for all sorts of projects except those that need heavy number crunching or finer synchronization primitives than those offered by PILS. A rich set of fast text and list processing operations is offered, and the programming system, though small in size, is quite mature and flexible, with support for pinning bugs in the source code. Bindings to the juce library and worker threads allow for smooth GUI applications, though the bindings and the library that wraps them are not yet full-featured.
The source code for the PILS system is public domain (i.e. you can do what you want with it), except for the Juce GUI library (see http://www.rawmaterialsoftware.com/juce/), which is currently used under the GPL license. This implies that the complete PILS system is currently under GPL terms, but more liberal terms are to be expected in the near future.
PILS was designed for open-source projects and does not support obfuscation or compilation.
Military institutions and the weapons industry should expect no cooperation from my side. Aside from that, I will help with PILS when possible.
I have tried to arrange the material in an order that makes sense, but due to the intertwined dependencies between various aspects of the language and programming system, some sections refer to material that is introduced in later sections. If you are new to PILS, you should probably first read it and grasp as much as you can, then try some PILS coding, then reread.
PILS has – or attempts to provide – a programmer friendly syntax, covering a Lisp/XML flavour data model.
PILS is a pure functional programming language; nodes, lists and text strings cannot be changed once they are created. This is exploited internally to specialize nodes according to their contents.
PILS is dynamically typed and lexically scoped. Objects are constructed by binding rulesets and combining them. External objects – such as juce components – can be mixed with PILS objects.
Pinning of bugs in PILS source code is supported by responders and their model of responsibility, in combination with a parser mode that reports positions in the source.
Memory management is reference counter based, with immediate release. This ensures fast automatic freeing of resources.
Note that the compacting garbage collection used by Java and .NET is incompatible with the PILS data model, which relies heavily on hashing of pointers. There is no way to create a fast PILS system as managed code in those systems, and no plans to adapt PILS to them.
Built-in operations have precedence over those defined by rulesets; you cannot redefine 2 + 2 to be 5 , though you could use a language object to redirect + to another symbol. Rulesets may however define rules for the built-in operators to handle cases that are not predefined, such as calculations on tuples.
PILS is implemented by an interpreter that was originally written in assembler and later rewritten in crack style C++. The source is available but hardly documented. Patterns are compiled to blocks of C++ objects specialized for matching.
The principal building block of PILS data is a node, consisting of a node name and one or more uniquely named attributes, called legs. The node name and leg names can be arbitrary constants, including constant nodes. Leg values can be arbitrary PILS data.
The node name and leg names are kept in a cliche which is shared by all nodes with the same combination of node name and leg names, regardless of leg order.
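The sharing of cliches can be pictured with a small Python model. This is only a sketch under assumptions: the function name `cliche` and the table `_cliches` are hypothetical, and the real interpreter's C++ structures are not documented here. It shows that the leg order does not influence which cliche is found.

```python
# Hypothetical model, not the interpreter's code: one shared cliche object
# per combination of node name and set of leg names.
_cliches = {}

def cliche(node_name, leg_names):
    """Return the shared cliche for a node name and its (unique) leg names."""
    legs = frozenset(leg_names)
    if len(legs) != len(leg_names):
        raise ValueError("leg names must be unique within a node")
    key = (node_name, legs)
    # setdefault stores the key on first use and returns the stored object after that
    return _cliches.setdefault(key, key)

first = cliche("line", ["from", "to"])
second = cliche("line", ["to", "from"])   # same legs, different order
other = cliche("line", ["from"])
```

Here `first is second` holds, while `other` is a different cliche.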
Besides nodes and cliches, PILS has lists, numbers, strings and a few special types. Strings are utf-8 encoded, numbers are high precision floating point (doubles) of which integers are a special case, lists are sequences indexed from 1 and support piped operations: when a list is subjected to a series of operations, they will be performed on each list element in turn when possible, saving the need for constructing temporary list objects.
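The idea behind piped operations can be illustrated in Python with a generator. This is an analogy only, assuming a hypothetical helper named `piped`; it does not show how the PILS interpreter implements piping. Each element passes through the whole series of operations before the next element is touched, so no intermediate list is built.

```python
def piped(items, *operations):
    """Apply every operation to one element at a time, lazily."""
    for item in items:
        for operation in operations:
            item = operation(item)
        yield item

# Two operations in series; only the final list is ever materialized.
result = list(piped([1, 2, 3], lambda x: x + 1, lambda x: x * 10))
```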
Depending on how the interpreter treats them, all PILS data can be classified as constants or expressives. Both are immutable, but constants are immune to evaluation: the result of evaluating a constant is the constant itself, and the only thing that fits a constant pattern is that same constant. Expressives, on the other hand, may evaluate to something else, and can be matched by something else.
Constants have no identity and are always uniquely represented. When a constant is created, a global hash table is first searched, and if an equal constant already exists, it is reused.
All programming constructs are represented by specially named nodes. Such nodes, and nodes and lists that contain them, are expressives; all other data are constants. Expressives have identity, which the programming system uses to trace a failing expression back to its position in the PILS source text from which it was parsed.
External objects are treated as constants, though they may have state.
PILS objects are built from rulesets – lists of rules of the form {pattern|action} . When a ruleset is evaluated, it is bound to the current context, creating an object which can process a call by trying rules with patterns that match the call. For the call to succeed, an action must take responsibility, typically by means of an :ok responder.
Rulesets can be created dynamically, simply by constructing nodes of the appropriate form – they will be compiled and indexed automatically. Bound rulesets can be combined to aggregates.
The pattern matching exploits the uniqueness of constants; often, complex structures can be recognized or rejected based on simple pointer comparisons. There is an overhead on constructing data structures, but this is mostly outweighed by the fast data recognition you get in return, except for number crunching which suffers from the boxing and hashing required to make numbers fit into a data model built for pattern-matching and searching.
PILS has a fine-grained error handling system which, to my knowledge, is not found in similar form elsewhere.
In weakly-typed languages, bugs can be hard to locate, as failures will happen in unexpected places, far from the bugs that cause them.
PILS remedies this by means of responders and blaming – the failure of an expression can be blamed on another expression, responsible for calling it. This is accomplished by calling with the :try and :need responders, which will blame the caller of their containing rule in case of failure.
The mechanism bears some resemblance to the concept known to mainstream programmers as structured exception handling, but PILS blaming is more fine-grained and integrated with lexical scoping.
PILS projects are stored in PILS library files. These are flat, notepad-compatible utf-8 text files that consist of a list of other library files required, followed by a list of named PILS modules. Module names are lists of PILS names, shown as a tree by the programming system.
When a PILS file is opened, a program strap is created, merging the library and the libraries it imports, and their imports, and the libraries that constitute the programming system, and possibly some configuration files.
The executable has bindings for the juce library. These are used through a framework library that hides some of the framework-specific details and allows programming in languages other than English.
PILS comes as an installer created with InnoSetup – the interface will be familiar to most users. The executable is statically linked against the run time libraries of Visual C++ 2008 Express, so no redistributables are required. An uninstall function is supplied.
PILS was developed and tested on the Danish edition of Windows XP.
Numbers are written as usual, with a leading - for negative numbers. The decimal point, if present, must be between digits and can be , (comma) or . (dot). C style hexadecimal notation is supported for integers.
Strings are written as arbitrary character sequences delimited by " (double-quote) which must be doubled in strings.
"This is a string containing a double-quote """
After the last " may follow a scraper – a sequence of two-digit hexadecimal byte values, doubled double-quotes and scrapes.
The scrapes are:
= for LF (linefeed, standard line terminator in Unix and PILS)
< for CR (Carriage return, Macintosh line terminator)
> for TAB
~ or * for NULL
/ for CR+LF (DOS/Windows line terminator)
Example:
"This string ends with a line terminator"=
A scraper can be followed by more text:
"This is a"="two-line string"=
Text strings can be split over several lines with - (hyphen).
"This is a single line of text "-
"that has been spread over two lines in the source."
The encoding is always utf-8, so accented letters are represented as two-byte-sequences.
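As an illustration of the scraper notation described above, here is a Python sketch that turns a scraper into the bytes it stands for. It is not the PILS parser: the name `decode_scraper` is hypothetical, doubled double-quotes are left out, and accepting lower-case hexadecimal digits is an assumption.

```python
# Scrapes as listed above, mapped to the bytes they denote.
SCRAPES = {
    "=": b"\n",      # LF
    "<": b"\r",      # CR
    ">": b"\t",      # TAB
    "~": b"\x00",    # NULL
    "*": b"\x00",    # NULL (alternative spelling)
    "/": b"\r\n",    # CR+LF
}
HEX_DIGITS = set("0123456789abcdefABCDEF")  # lower case accepted by assumption

def decode_scraper(scraper):
    """Decode scrapes and two-digit hexadecimal byte values into bytes."""
    out = bytearray()
    i = 0
    while i < len(scraper):
        ch = scraper[i]
        if ch in SCRAPES:
            out += SCRAPES[ch]
            i += 1
        elif (ch in HEX_DIGITS and i + 1 < len(scraper)
              and scraper[i + 1] in HEX_DIGITS):
            out.append(int(scraper[i:i + 2], 16))
            i += 2
        else:
            raise ValueError(f"unexpected character {ch!r} in scraper")
    return bytes(out)
```

With this model, `decode_scraper("=")` gives a single linefeed byte, and `decode_scraper("0D0A")` gives the same two bytes as `decode_scraper("/")`.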
Numbers and strings can be used directly as node and leg names.
PILS has three built-in types for handling of time.
Timestamps are stored internally as GMT, but written using local time with timezone indicators:
2008-12-06T19:00:05.161+01:00
Minutes, seconds and milliseconds are optional, as are the minutes in the timezone indicator. GMT can be indicated by +00:00 , +00 or Z .
Datings are abstract time indications consisting of a date and time. They are not associated with any timezone. They are written like timestamps but without the timezone indicator.
Durations are written like:
1001d5h20m4.567s
for 1001 days, 5 hours, 20 minutes, 4 seconds and 567 milliseconds. Weeks, months and years are not supported in durations. Parts of the duration may be omitted; a quarter of an hour can be written as:
15m
Durations can be negative:
-3h20m
for minus three hours and 20 minutes.
The decimal point in time indications is always . and is used only to separate seconds from milliseconds.
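The duration notation can be modeled with a regular expression. The following Python sketch is not taken from the PILS parser; `parse_duration` is a hypothetical name, and two points are assumptions: that the parts always appear in the order days, hours, minutes, seconds, and that a leading - negates the whole duration.

```python
import re
from datetime import timedelta

_DURATION = re.compile(
    r"(?P<sign>-)?"
    r"(?:(?P<d>\d+)d)?"
    r"(?:(?P<h>\d+)h)?"
    r"(?:(?P<m>\d+)m)?"
    r"(?:(?P<s>\d+)(?:\.(?P<ms>\d{1,3}))?s)?"
)

def parse_duration(text):
    """Parse a duration such as 1001d5h20m4.567s into a timedelta."""
    match = _DURATION.fullmatch(text)
    if match is None or text in ("", "-"):
        raise ValueError(f"not a duration: {text!r}")
    # Pad the fraction so that "4.5s" means 500 milliseconds, not 5.
    milliseconds = int((match["ms"] or "").ljust(3, "0"))
    value = timedelta(
        days=int(match["d"] or 0),
        hours=int(match["h"] or 0),
        minutes=int(match["m"] or 0),
        seconds=int(match["s"] or 0),
        milliseconds=milliseconds,
    )
    return -value if match["sign"] else value
```

Only . is accepted before the milliseconds, in line with the rule that time indications never use a comma as decimal point.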
Composite constants (cliches, nodes and lists) are usually enclosed in constant brackets [ ] . For cliches, the brackets are mandatory and part of their syntax; for nodes and lists, they are optional. Inside constant brackets, simplified syntax rules are used: the constructs described in the chapter on expressions are not valid, and names and operators are treated as simple constants. An example:
2 + 2 is an expression that will result in the value 4
[2 + 2] is a list of three constants 2 , [+] and 2
Cliches: [node-name|leg-name|leg-name ...]
Node constants: node-name: .leg-name .leg-value ...
List constants: element, ... (the , after the last element is optional)
The empty list [] can be omitted when this does not lead to ambiguities. Inside constant brackets, ? also denotes the empty list.
In constant nodes, leg values can be omitted when identical to the leg name.
Nodes are greedy: whenever a node is started, it goes as far as possible. As a consequence of this, embedded control structures usually do not need parentheses.
Within constant brackets, list constants of two or more elements can be written as shortlists without commas. Shortlists can be embedded in comma separated lists, as in this matrix:
[1 0 0, 0 1 0, 0 0 1]
Generally, string lists should not be written as shortlists, as this leads to confusing syntax. Write:
s *=* ["bad", "good"]
rather than
s *=* ["bad" "good"]
A tail is a leg whose name is the empty list, [] . The PILS interpreter links many structures by their tail. The tail can be written in any position – directly after the node head:
node: tail
or later, using ; (semicolon)
node: .name value; tail
Other legs may follow the tail.
The tail must be separated from the preceding : or ; by whitespace.
A principal leg is a leg whose name is the same as the node name. This has no special significance to PILS except in namespace-nodes as described below, but the convention is often useful and is supported by a special notation: if the principal leg is written first, the name can be omitted:
message: . "Hi" is the same as message: .message "Hi"
The PILS data model is very liberal with names: any constant, including node and list constants, can be used as a node name or leg name, though name cliches are commonly used.
A name cliche is a cliche of two strings [namespace-identifier|name-string] , which is produced by parsing a token, with an optional namespace prefix, through a language node.
The language node maps prefixes to namespace-identifiers, and can map specific name-strings to specific constants for a given prefix, adapting PILS to use keywords of your preferred language. Language objects are used for user interfaces of PILS applications as well.
All built-in names recognized by PILS interpreter kernel have "pils.org/ns/sne" as their namespace-identifier – sne is an acronym for Scandinavian Nerd-English. Native speakers of the English language should not hesitate to create language objects with a more natural terminology if the sne conventions seem inappropriate to them – and speakers of other languages should seriously consider using or creating mappings for their language.
Localized – or polyglot – programming languages have been shunned since localized macro languages messed up word processing and spreadsheets long ago; however, PILS has been designed with these issues in mind.
Parsing is always controlled by a language node:
[language: language-definition-node]
The language-definition-node is a string-named node constant whose string-named legs hold namespace-nodes. The leg names are used as namespace prefixes. A principal leg is required and used to resolve names with no prefix.
A namespace-node is a node constant whose name is used as namespace-identifier for untranslated names with this namespace prefix. String-named legs of the namespace-node legs define translations. If the node name is - , only translated identifiers are allowed; this is useful for restricting namespaces.
The tail of a namespace-selector is a two-byte write-control string. Its first byte holds flags that control the use of various syntactic conventions when writing; the last byte is used as decimal point and should be "." or "," .
The write control flags are:
0x01
control characters are escaped when writing strings
(protects against dos/unix newline conversion and null characters)
0x02
all non-ascii characters are escaped
(allows writing PILS expressions in pure ASCII)
0x04
non-utf-8 compliant data go unescaped (normally, such data will be escaped)
(saves space when writing binary data as strings)
0x10
all names are written as cliches with string literals
(verbose but independent of language)
0x20
sugar-free, expressions and rulesets are written using the basal node syntax
(instructional – shows how PILS parses your expressions)
All combinations are allowed; 0x02 overrules 0x04. The escape flags affect both names and string literals; if a name has characters that must be escaped, it is written using the string convention.
Parsing is not affected by the write-control string; in particular, both "." and "," are valid decimal points, regardless of which one is specified in the write-control string.
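To make the layout of the write-control string concrete, here is a Python sketch that splits it into flags and decimal point. The flag values are those listed above; the Python names (`WriteFlags`, `parse_write_control`, `raw_non_utf8_active`) are hypothetical and only serve the illustration.

```python
from enum import IntFlag

class WriteFlags(IntFlag):
    ESCAPE_CONTROL = 0x01     # control characters are escaped
    ESCAPE_NON_ASCII = 0x02   # all non-ascii characters are escaped
    RAW_NON_UTF8 = 0x04       # non-utf-8 compliant data go unescaped
    NAMES_AS_CLICHES = 0x10   # names written as cliches with string literals
    SUGAR_FREE = 0x20         # basal node syntax for expressions and rulesets

def parse_write_control(control):
    """Split a two-byte write-control string into (flags, decimal point)."""
    if len(control) != 2:
        raise ValueError("the write-control string must be two bytes long")
    point = chr(control[1])
    if point not in ".,":
        raise ValueError("the decimal point should be '.' or ','")
    return WriteFlags(control[0]), point

def raw_non_utf8_active(flags):
    """0x04 only takes effect when it is not overruled by 0x02."""
    return bool(flags & WriteFlags.RAW_NON_UTF8) and not (
        flags & WriteFlags.ESCAPE_NON_ASCII)

flags, point = parse_write_control(b"\x01.")
```

The value `b"\x01."` corresponds to escaping control characters and writing numbers with a dot as decimal point.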
PILS is booted with a language node similar to this:
[ language:
"system": ! default namespace prefix
."system" ["pils.org/ns/sne":] ! default namespace identifier
."pils-configuration" ! auxiliary namespace
[-: ! only the following translated names are accepted in this namespace
."platform". "juce" ! values used by the boot process
."system". "Win32" ! replace with your favourite OS
! (other configuration values omitted here for brevity)
]
;
""01"." ! writing flags, decimal point
]
In this particular case, names like pils-configuration:framework are translated to strings which are used in the boot process to decide which PILS libraries to load.
PILS can be localized to Danish by copying the library <lib>/pils/danish/system/config.pils to the user's Application Data folder. This library will then be included in all running programs, redirecting certain functions (notably say and saying) to use the Danish language object defined in <lib>/pils/danish/system/danish.pils .
Presently, only English and Danish are supported. To support another language, such as French, you should add these files to the system:
<lib>/pils/french/system/config.pils
which controls the indirection, and
<lib>/pils/french/system/french.pils
which defines the language object. A simplistic French language object might look like:
[ language:
"fr":
."fr"
[ "pils.org/ns/fr":
."bon" good
."un" one
."di" say
."blanc" white
]
."system" ["pils.org/ns/sne":]
;
""01","
]
This will translate the name un to one (or, to spell it out: ["pils.org/ns/sne"|"one"] ), while vin would become ["pils.org/ns/fr"|"vin"] since wine is not included in this simple vocabulary.
If a French-speaking programmer writes this:
di [un bon vin blanc]
an English user will see:
One good vin white
which isn't quite what they teach at Harvard, but still intelligible.
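The name translation in this example can be modeled in a few lines of Python. This is a sketch of the mapping only, with a hypothetical function `read_name`; it represents a name cliche as a (namespace-identifier, name-string) pair and ignores prefixes, the write-control string and everything else a language node does.

```python
SNE = "pils.org/ns/sne"   # namespace of the built-in names
FR = "pils.org/ns/fr"     # namespace of the simplistic French object

# The translations from the French language object shown above.
FRENCH = {"bon": "good", "un": "one", "di": "say", "blanc": "white"}

def read_name(token):
    """Map an unprefixed French token to a (namespace, name) pair."""
    if token in FRENCH:
        return (SNE, FRENCH[token])   # translated to a built-in name
    return (FR, token)                # untranslated: stays in the fr namespace

words = [read_name(token)[1] for token in "un bon vin blanc".split()]
```

`words` comes out as one, good, vin, white, which is the word order the English user sees.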
It is still possible to define rules that refine the translation somewhat.
The Danish language library defines a rude rule that tries to translate the English word no differently according to the context: no as an answer (No thanks) and no as a qualifier (No smoking) must be translated differently to make sense in Danish.
For an example of a full-blown language node, see lib/danish/system/danish.pils . Note that even language neutral operators should be defined as translations; otherwise they will get a national namespace identifier and not be recognized as built-in operators by PILS – though you could still use namespace prefixing to refer to the built-in symbols.
The İstanbul express workaround – described in the section on text operators – explains how translations can be used to adapt the case conversion operators for Turkish, where the letter i preserves its dot in upper case.
Arbitrary strings can be used as name-strings, but most of the time, regular names are used. These are written without string quotes.
A regular name is any sequence of letters, operator symbols + - * / \ ^ # $ % & , relational symbols < > = ~ , digits 0 1 2 3 4 5 6 7 8 9 and . (dot), treating ' @ ` _ and all non-ascii characters (i.e. utf-8 multibyte characters) as letters, with the following exceptions:
A regular name cannot start with a dot (this would be read as an attribute)
or with a digit, or a hyphen followed by a digit (this would be read as a number)
or end with an operator or relational symbol followed by a dot (this would escape the operator)
The general idea is, anything goes unless it means something else to the parser.
Names ending in an operator or relational symbol are classified as operators or relational operators, respectively. Inside constant brackets [] and for node or leg names, this classification is irrelevant, operators and relational operators can be used as node and leg names. The assignment symbol := is an example of a relational operator being used as an attribute name.
Operators and relational operators can be escaped by adding a single dot. This causes them to be read as ordinary names, not including the dot.
A prefix is a regular name and a : with no spaces. The prefix name is looked up in the namespace node, and this value is used to translate the following name or combine it with a namespace-identifier.
When no prefix is used, or when the dummy prefix ?: is used with string literal names, the default namespace will be used.
A PILS expression cannot define namespace prefixes locally; the only valid namespace prefixes are those defined by the language object. It is still possible to read and write names of other namespaces, using the cliche notation:
["namespace-uri"|"local-name"]
As a convenience, the parser accepts {} as a shorthand for the language node.
The namespace-uri for a namespace prefix can be referred to by: prefix:? or, for the default namespace, ?:?
Generally, nodes are evaluated by evaluating their legs and creating a node from them with the same head and leg names (for node constants, this always results in the exact same node and is equivalent to simply returning the node).
Nodes with names [] (the empty list) or [|action] are interpreted differently, and are written with special syntax:
;name value ... instead of []: .name value ...
;: value ... instead of []: value ...
:name ... instead of [|action]: .name ...
: value ... instead of [|action]: value ...
Empty-named nodes are used for declaring local bindings:
;name value; ...
Action nodes are used for control structures:
:if condition; expression .else else-expression
A standalone name
name
is read as:
:call [name]
In patterns, this binds the name to the corresponding value. In expressions, the current context will be searched for bindings or rules that match the name.
If the name is followed by a constant, a ruleset or an expression in parentheses, a node is formed – this is called a phrase.
a 3 is read as :call [a: 3]
b {x|y} is read as :call b: {x|y}
Phrases can have named legs
line (.from a .to b) is the same as: :call line: .from a .to b
Principal legs and hidden leg values can be used:
message (.) is the same as: :call message: .message message
Constant brackets can be used to form phrases:
message [. "Hello"] is the same as: :call [message: .message "Hello"]
In patterns, constant phrases work like names but expressive phrases are treated as ordinary nodes, which rarely makes sense. In expressions, expressive phrases are evaluated before the call is executed.
Two consecutive items are read as an object call:
(a) (b) is the same as :who a .call b
or, to spell it out: :who (:call [a]) .call :call [b]
If the second item is a name or phrase, it is read as a method call:
a b is the same as :who a .call [b]
or, to spell it out: :who (:call [a]) .call [b]
Object calls are chained from the left:
a b c is the same as (a b) c
In patterns, object calls are not generally meaningful. Some method calls are reserved for typechecks and various other purposes. To match an object call node, a pattern should specify an escaped object call, such as : (a) (b) or, using a more explicit syntax, ::who a .call b .
An expression can return to its caller in one of the following ways:
The expression succeeds, and a value is returned.
2 + 2 returns 4
The expression misses, that is, PILS has no methods of evaluating it.
2 + "two" misses, assuming no rule processes it
The expression fails, that is, a built-in operator or a supplied rule explicitly fails.
{} read "[shit}" fails with [error: . ?:"Missing constant" .start 5 .end 5]
Failures signal that something is wrong with the program or its data, and should generally be reported to the user, while misses can be treated more lightly, depending on the context.
When a :call node is evaluated, this happens:
The call leg is evaluated.
The context is searched for rules or bindings that can process the call.
If no rules or bindings process the argument, a miss is signaled.
When a :who .call node is evaluated, this happens:
1. If the node is a built-in operation, special logic takes over. If the built-in operator cannot deal with the data, it will simulate the following steps 2 and 3 using already evaluated parts of the expression, so that things are as if the built-in operator was never tried. If a piped operation like every or fold misses, this rollback is not possible and the miss is treated as in 6.
2. The call leg is evaluated, this is the argument.
3. The who leg is evaluated, this is the object.
4. The object is searched for rules or bindings that can process the argument.
5. If no rules or bindings take effect, a :who .call node is constructed and treated as a call in the current context, allowing fallback rules to supplement built-in operators and object calls with locally defined rules.
6. If this errs or misses and the expression was the action of a :try or :need responder, the error or miss is treated by the responder. If the expression was evaluated as the expression of a match condition, the condition is rejected.
7. If this does not apply, an :error -call is made in the current context. For misses, an error value is constructed, for errors, the supplied value is used.
8. If the :error call misses, the error value is returned and execution continues.
Note that the argument is evaluated before the object. This convention was chosen mainly because it fits in with the optimization strategies used in the interpreter, especially the piping of lists. It is possible that future PILS implementations might evaluate them in parallel.
An open-air dot can be used to separate a name from a following constant, ruleset or parenthesised expression, to prevent the forming of a phrase.
a . 3
is read as:
:who (:call [a]) .call 3
(If a yields a list, this will get the 3rd element.)
An open-air dot can also be used to separate a name or phrase from a preceding element, to indicate that the name or phrase should be embedded in a :call node.
a . b
is the same as
(a) (b)
or:
:who (:call [a]) .call :call [b]
(If a and b yield texts, this will concatenate them.)
When an operator or relational operator starts a sequence or follows another operator or relational operator, it is read as a prefix operator.
+ n
is read as:
:who (:call n) .call [+]
When operators are used infix, they have the same precedence as sequence calls.
a + b
is read as
:who (:call a) .call +: :call b
To clarify precedences,
a b + - * c 14 d
is the same as:
((a b) + (- (* (c (14))))) d
Relational operators, i.e. operators ending in = , < , > or ~ , have lower precedence and associate from the right:
a + a = b + b = c + c
is the same as
(a + a) = ((b + b) = (c + c))
(This only serves to illustrate the rules of precedence, the expression does not do anything sensible.)
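The grouping rule for relational operators can be reproduced with a short Python sketch. It is a model of the rule as stated, not the PILS parser: the function `group` is hypothetical, it works on space-separated tokens only, and it ignores prefix operators and phrases.

```python
RELATIONAL_ENDINGS = "=<>~"

def group(expression):
    """Show how relational operators split and group an expression."""
    parts, operators, current = [], [], []
    for token in expression.split():
        if token[-1] in RELATIONAL_ENDINGS:
            # A relational operator closes the sequence collected so far.
            parts.append(current)
            operators.append(token)
            current = []
        else:
            current.append(token)
    parts.append(current)
    rendered = [
        "(" + " ".join(part) + ")" if len(part) > 1 else " ".join(part)
        for part in parts
    ]
    # Fold from the right: the rightmost relation is grouped first.
    result = rendered[-1]
    innermost = True
    for left, operator in zip(reversed(rendered[:-1]), reversed(operators)):
        right = result if innermost else "(" + result + ")"
        result = f"{left} {operator} {right}"
        innermost = False
    return result

grouped = group("a + a = b + b = c + c")
```

For the expression above, `grouped` is the string `(a + a) = ((b + b) = (c + c))`.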
The building blocks of PILS objects are rulesets. A ruleset such as
{ pattern2 | action2 }
{ pattern1 | action1 }
is represented by a :ruleset node:
:ruleset (;match pattern2 .action action2), (;match pattern1 .action action1)
When a ruleset is evaluated in a given context, it is bound to the context by creating a node
:ruleset .where
which can be used as an object. When a call is performed on a bound ruleset, the rules are tried, last-written rule first, using an internal index for fast skipping of irrelevant rules. When trying a rule, the call is first matched into the pattern, and if this succeeds, the action is evaluated, with bindings established by the pattern match.
For the rule to have any effect, the action must respond to the call by performing a responder statement, typically :ok but other responders allow refined control. If no action responds, control passes to the next rule. If no rule responds to the call, it falls through.
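The trial order and the role of responders can be sketched in Python. This is a simplified model under assumptions: rules are (matcher, action) pairs of plain functions, the class `Ok` stands in for the :ok responder, and `MISS` marks a call that falls through. The internal index and the binding of variables are left out.

```python
class Ok:
    """Stands in for an :ok response carrying the result value."""
    def __init__(self, value):
        self.value = value

MISS = object()   # returned when no rule responds to the call

def call(rules, argument):
    """Try the rules last-written first; a rule counts only if it responds."""
    for matches, action in reversed(rules):
        if matches(argument):
            response = action(argument)
            if isinstance(response, Ok):
                return response.value
            # The action did not respond: control passes to the next rule.
    return MISS

rules = [
    # Written first, tried last: a catch-all rule.
    (lambda x: True, lambda x: Ok("fallback")),
    # Written last, tried first: responds to non-negative integers only.
    (lambda x: isinstance(x, int), lambda x: Ok(x + 1) if x >= 0 else None),
]
```

With these rules, `call(rules, 3)` gives 4, while `call(rules, -1)` matches the integer rule, gets no response from it, and ends up with "fallback".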
Rules can query various aspects of the call by the implicit binders, of which :who is the most important and roughly corresponds to the this pointer of C++ and the like.
Patterns are data with a structure similar to that of the data to be matched. Any constant is a pattern that accepts itself and nothing else. Expressive lists match lists with the same length and matching elements. Expressive nodes generally match nodes with the same cliche and matching legs, though some nodes have special interpretations in patterns.
Variables are bound to the corresponding value, testing for identity if a variable occurs more than once in a pattern. They are written simply as names, represented by nodes :call [name] .
The joker – written as ? and represented as :call [] – is not bound or tested.
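The basic matching rules (constants, lists, variables and the joker) fit in a short Python sketch. It is a model only, with hypothetical names `Var`, `JOKER` and `match`; Python equality stands in for the identity test, which is reasonable for constants since PILS represents them uniquely. Nodes and the special pattern nodes are not covered.

```python
class Var:
    """A pattern variable, written as a plain name in PILS."""
    def __init__(self, name):
        self.name = name

JOKER = object()   # written as ? in PILS; matches anything, binds nothing

def match(pattern, value, bindings=None):
    """Return the bindings if value fits pattern, otherwise None."""
    bindings = {} if bindings is None else bindings
    if pattern is JOKER:
        return bindings
    if isinstance(pattern, Var):
        if pattern.name in bindings:
            # A repeated variable must meet the same value again.
            return bindings if bindings[pattern.name] == value else None
        bindings[pattern.name] = value
        return bindings
    if isinstance(pattern, list):
        if not isinstance(value, list) or len(value) != len(pattern):
            return None
        for sub_pattern, sub_value in zip(pattern, value):
            if match(sub_pattern, sub_value, bindings) is None:
                return None
        return bindings
    # Anything else is a constant: it accepts itself and nothing else.
    return bindings if pattern == value else None
```

For example, matching `[Var("x"), 2, Var("x")]` against `[5, 2, 5]` binds x to 5, while matching `[Var("x"), Var("x")]` against `[1, 2]` is rejected.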
Other special nodes allow named sub-patterns, typechecks, compares with numeric literals, text and list search constraints and length extraction, matching specific positions in a list of unspecified length, simple translations of scalar values, and partial specifications of nodes with default values.
The special nodes can be used as ordinary nodes by escaping them:
: special-node
Besides their use in rules, patterns can be used with the => operator in conditionals.
:if expression => pattern; ...
Though generally a pure functional language, PILS has an assignment statement which is used in various list processing commands, and to set attributes of special objects.
It looks like:
target := value
target := value; tail
Assignments are evaluated by first evaluating value, then target is evaluated with a special variant of calling that searches for assignment rules or assignment-sensitive built-in operations.
If tail is present, the result is discarded after performing the assignment, and tail is evaluated and returned. The tail form is equivalent to:
(target := value) and: tail
Rules for assignments look like:
{ target := value | action }
These rules are dealt with separately by the rule compiler. General rules like {x|...} cannot be invoked by assignments, and assignment rules cannot be invoked by ordinary calls, even with arguments that look like assignments.
The PILS parser does not recognize assignments as such; they are parsed as sequences that end with a greedy node, and represented as:
:who (target) .call (:= value)
:who (target) .call (:= value; tail)
The following conventions apply to expressions only; they are not valid within constant brackets [] .
In the common case where a named leg holds a call to its name, the leg value can be omitted.
This is used in patterns as well as expressions, in the very common case when the name of a leg is bound to its value, i.e. named parameters.
The rule also applies to principal legs, but not to tails.
message: . is the same as message: .message message
A hidden leg is often used with the :ok responder for simple extraction rules. To extract the tail of a node:
{* ;: tail|:ok tail}
or shorter:
{* ;: ok|:ok} which is the same as {* ;: ok|:ok ok}
If the leg value starts with a call to the leg name, this can be represented as an open-air dot.
This rule adds 1 to its argument:
{ok|:ok . + 1}
Principal legs can be dotted – no space is required between the dots:
message: .. "!" is the same as message: .message (message) "!"
If a rule pattern starts with a dot, an immediately following phrase or name is not made into a call. This is handy for specifying parameterless operations:
{ .name | ... }
is easier to type than
{ [name] | ... }
A sequence of the form
completion .(incomplete-expression)
is read as:
incomplete-expression completion
In the original form, completion is parsed with the priority of a sequence; it can be a series of operations. When incomplete-expression is parsed and ) is reached, completion is inserted as a unit.
a b .(c d:) is the same as: c d: a b
An abbreviated form exists for single-leg action nodes, escapes and double-escapes:
expression .:name is the same as :name expression
expression .:: is the same as :: expression
The :need and :list constructs are often used in this form:
expression .:need is the same as :need expression
(... list := ...) .:list is the same as :list (... list := ...)
Inversions can be used to give complex expressions a narrative flow: you can write an expression and test it, then embed it in a control structure by adding an inversion at the bottom, without the need to change the part you already wrote.
The construct is modeled after a feature of natural language, as in this English phrase:
Things I like to do
Strictly speaking, things should follow do, but natural languages allow us to put essentials first, and so does PILS, to some degree.
PILS supports 3 types of comments:
! comment (until the first CR or LF character)
!! comment (may include newlines) !!
:- comment-expression; expression
! and !! comments are treated as white space when parsing; :- comments are parsed as action nodes, allowing comment-expressions to be embedded in expressions and patterns.
They correspond vaguely to //comment newline, /*comment*/ and #pragma of C++.
Most built-in operations are implemented as intercepted object calls. They are represented by nodes of form :who .call which are given special treatment by the interpreter.
Most of these operations only work when called directly, not through other constructions such as the (argument) call (object) form. However, list element extraction by index and the operations defined on language objects also work when called indirectly.
Note the difference: This is a direct call:
{.cliche|:ok [my-cliche]} cliche returns [[|action]|where|ruleset]
The builtin cliche operation gets the cliche of the node used to represent a bound ruleset.
This is an indirect call and invokes the [cliche] method defined by the ruleset:
[cliche] call {.cliche|:ok [my-cliche]} returns [my-cliche]
All numeric operations are done using doubles (typically 64 bit floating point with 52 bit mantissa, but implementations may differ). After a chain of operations, a trial conversion of the result is made to a 32 bit signed integer value and back to double. If this results in the same double as before, the number is recognized as an integer. Both integers and non-integers are hashed and boxed, though the integers 0 – 2^16-1 have prefabricated boxes, to speed indexing calculations.
The overhead of interpretation, hashing and boxing makes numeric performance slow. This is by design: fast pattern matching and node lookup are of greater importance.
Infinities, not-a-number etc. are not supported by PILS. Handling them has only been sporadically tested, so they may create havoc.
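The integer recognition described above can be modelled outside PILS. The following Python sketch only illustrates the round-trip test; it is not taken from the PILS source, the name is_recognized_integer is made up here, and treating infinities and not-a-number as non-integers is an assumption.

```python
import math

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def is_recognized_integer(x: float) -> bool:
    """Model of the trial conversion: double -> 32 bit signed int -> double."""
    if math.isnan(x) or math.isinf(x):
        return False  # assumption: unsupported values never count as integers
    if not (INT32_MIN <= x <= INT32_MAX):
        return False  # does not fit in a 32 bit signed integer
    return float(int(x)) == x  # unchanged by the round trip?

print(is_recognized_integer(3.0))      # True
print(is_recognized_integer(3.5))      # False
print(is_recognized_integer(2.0**31))  # False
```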
Binary operations are
+ addition
- subtraction
* multiplication
/ floating point division (even when supplied with integer arguments)
\ integer division
% modulo
Beware: all infix operators have the same precedence: the result of 1 + 2 * 3 is 9 , not 7 .
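To make the consequence of equal precedence concrete, here is a small Python model that evaluates a flat sequence of numbers and arithmetic operators strictly from left to right. It is a sketch, not PILS code; eval_flat and the token list format are invented for the illustration, and only + - * / are covered.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,  # floating point division, as in PILS
}

def eval_flat(tokens):
    """Evaluate [number, op, number, op, ...] from left to right."""
    result = float(tokens[0])
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, float(tokens[i + 1]))
    return result

print(eval_flat([1, "+", 2, "*", 3]))  # 9.0, the PILS reading
print(1 + 2 * 3)                       # 7, Python's own precedence
```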
Prefix - (minus) works as you expect: ;x 7; - x returns -7 .
abs returns the absolute value
round returns the nearest integral value, exact halves are rounded away from 0
trunc truncates towards 0
These postfix operators are like the corresponding C++ functions with double values:
sin cos tan asin acos atan sqrt log exp
The round and truncate operators can be used with a unit argument:
x round 10
rounds x to the nearest 10.
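Rounding halves away from 0 differs from the default in several other languages, so a model may help. The Python sketch below is an illustration under assumptions: the names pils_round and pils_trunc are made up, the unit is taken to be positive, and halves are assumed to round away from 0 when a unit is given as well.

```python
import math

def pils_round(x, unit=1):
    """Nearest multiple of unit; exact halves go away from 0."""
    return math.copysign(math.floor(abs(x) / unit + 0.5), x) * unit

def pils_trunc(x, unit=1):
    """Multiple of unit obtained by truncating towards 0."""
    return math.trunc(x / unit) * unit

print(pils_round(2.5), round(2.5))  # 3.0 2  (Python's round goes to even)
print(pils_round(14, 10))           # 10.0
print(pils_trunc(-17, 10))          # -10
```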
Comparisons = <> < <= > >= return 1 for true, 0 for false; PILS has no boolean type.
PILS attempts to write numbers so that parsing them will produce the exact same number but unfortunately this cannot be relied on for large exponents.
There is no support for format strings.
Ordinary arithmetic operations can be used with timestamps and durations in these combinations:
timestamp + duration
dating + duration
duration + timestamp
duration + dating
duration + duration
timestamp - timestamp
timestamp - duration
dating - dating
dating - duration
duration - duration
duration * number
number * duration
duration / number
duration / duration
duration \ duration
timestamp round duration
dating round duration
duration round duration
timestamp truncate duration
dating truncate duration
duration truncate duration
The system operations now and timestamp both get the current system time, but timestamp will produce unique values within each process, incrementing when necessary.
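The difference between now and timestamp can be pictured with a small Python model. This is a sketch, not the PILS implementation: the class name UniqueClock, the nanosecond resolution and the increment of one tick are all assumptions, and no locking is shown.

```python
import time

class UniqueClock:
    """Hands out strictly increasing time values within one process."""

    def __init__(self, now=time.time_ns):
        self._now = now    # source of the current system time
        self._last = None  # last value handed out

    def timestamp(self):
        t = self._now()
        if self._last is not None and t <= self._last:
            t = self._last + 1  # increment when necessary
        self._last = t
        return t

clock = UniqueClock(now=lambda: 1000)  # a frozen clock, for demonstration
print([clock.timestamp() for _ in range(3)])  # [1000, 1001, 1002]
```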
s utf-8 converts between utf-8 encoded strings and lists of unicode values as integers.
s utf-16 converts between utf-16-encoded strings with byte order mark, and integer unicode value lists
s utf-16le same, for little-endian utf-16 without byte order mark
s utf-16be same, big-endian
s bytes converts between strings and lists of byte values 0 – 255.
This concatenates 2 strings:
(s) (t)
This concatenates a list of strings ss using a separator string t:
ss splice (t)
To split it again:
s split (t)
The replacement operator
s *=* ss
expects ss to be an even-length list of strings and interprets them as replacement pairs. For each position in s, the pairs are tried. If a match is found, the replacement is done and the search advances to the position after the replaced text, starting from the beginning of ss.
The prefix/suffix-replacement operator
s (<)$*=* ss
only covers matches at the beginning/end of s and never replaces more than once.
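The scanning order of the two replacement operators can be modelled as follows. This Python sketch is an illustration, not PILS code: replace_all stands for *=* and replace_prefix for $*=* , both names are invented, Python strings are compared by character rather than by byte, and letting the first matching pair win in replace_prefix is an assumption.

```python
def _pairs(ss):
    """Split an even-length list into (old, new) replacement pairs."""
    if len(ss) % 2:
        raise ValueError("ss must contain an even number of strings")
    return list(zip(ss[0::2], ss[1::2]))

def replace_all(s, ss):
    """Model of s *=* ss: replaced text is never scanned again."""
    rules = _pairs(ss)
    out, i = [], 0
    while i < len(s):
        for old, new in rules:  # pairs are tried from the beginning of ss
            if old and s.startswith(old, i):
                out.append(new)
                i += len(old)   # continue after the replaced text
                break
        else:
            out.append(s[i])    # no pair matched at this position
            i += 1
    return "".join(out)

def replace_prefix(s, ss):
    """Model of s $*=* ss: at most one replacement, at the start of s."""
    for old, new in _pairs(ss):
        if old and s.startswith(old):
            return new + s[len(old):]
    return s

print(replace_all("abba", ["a", "b", "b", "a"]))  # baab
print(replace_prefix("iii", ["i", "İ"]))          # İii
```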
Strings can be compared with the relational operators = <> < <= > >= . As always, = and <> test for identity; the unique constants mechanism ensures that equal strings are identical. The other operators use bytewise unsigned comparisons – for utf-8 data, this is equivalent to character-wise comparison. For case-insensitive comparisons, use the lower operation on both operands. For proper alphabetical sorting, use the order operation with a key-function that produces collation keys, i.e. strings that, when sorted by raw comparisons, get sorted in the proper localized alphabetical order of the original strings.
To convert a string to upper or lower casing:
s upper
s lower
To convert the first character of a string to title or upper casing, leaving the rest untouched:
s title
The casing operations are implemented by code generated from unicode tables to work directly on utf-8, and should work with most languages.
Unfortunately, the Turks won't like this:
"istanbul" title returns "Istanbul"
To get the correct capital letter, the operator must be supplemented by a replacement operation:
"istanbul" $*=* ["i", "İ", "ı", "I"] title returns "İstanbul"
The İstanbul Express is a workaround designed to make this easier:
The [===,] module, available to all PILS modules, defines these operator replacements, combining the casing conversion operators with appropriate replacements:
{ text . ([upper] // replace) | :try text *=* replace .:need upper }
{ text . ([lower] // replace) | :try text *=* replace .:need lower }
{ text . ([title] // replace) | :try text $*=* replace .:need title }
By replacing the untranslated principal namespace with a translating version that intercepts the upper , lower and title operators as follows:
{} **=** ;[[?:?]:]
[ [?:?]:
."upper" [upper|"i", "İ", "ı", "I"]
."lower" [lower|"I", "ı", "İ", "I"]
."title" [title|"i", "İ", "ı", "I"]
]
and using this language to parse our test expression, we get
"istanbul" title returns "İstanbul"
So the casing conversion is configurable by means of the language object.
The splitter is an extended split operation, designed for writing tokenizers but useful for text analysis in general. The splitter was added to PILS mainly because of difficulties with integrating regular expressions with the uniquely represented strings of PILS.
A splitter is a constant node specifying a grammar for recursive descent parsing. The tail is a prioritized list of top level target names (start nonterminals); the corresponding legs specify roads to the targets (productions). The roads consist of targets, text snippets and control instructions. Targets can be invoked recursively, including internal targets not found in the priority list.
A road can have alternate lanes, each of which can have several steps and tests. Tests are steps that test a condition without advancing.
s split
[ target, ...
.target match-instruction
...
]
This walks through s and produces a list of pairs (target, substring), identifying top level matches. When no targets match, (, singlebyte-string) is produced.
Empty matches are rejected at the top level and by the repeaters [*: ...] and [+: ...] , to prevent infinite loops, but allowed elsewhere.
Valid match instructions are:
1 – wildcard, matches any single byte
"a-z" – a 3 byte string, matches one byte in the range
any other string – requires exact match
[*: instruction] – zero or more instruction matches
[+: instruction] – one or more instruction matches
Valid test instructions are:
[-: instruction] – instruction must fail
[/: target] – the last successful top-level target must be target...
[/: target, ...] – ...or one of them. $ means no top level targets emitted yet.
Lists are used in two levels with different interpretation:
alternative, ... – one of the alternatives must match.
An alternative can be a sequence – a list of instructions that must match consecutively.
For clarity, alternatives should be separated by commas while sequences should be written as shortlists.
When a sequence is the only alternative, append a comma to make it into a 1-element alternative list, like the .hexnumber below.
This example recognizes C-style hexadecimal numbers:
s split
[ hexnumber,
.hexnumber "0x" [+: hexdigit],
.hexdigit "0-9", "A-F", "a-f"
]
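For readers who know regular expressions, the output of the hexnumber splitter can be imitated in Python. Note that this sketch uses Python's re module in place of the splitter's recursive descent, so it only mimics the shape of the result. The function name split_hex is made up, and None stands in for the empty target of an unmatched byte.

```python
import re

# "0x" followed by one or more hex digits, like the hexnumber road above
HEXNUMBER = re.compile(r"0x[0-9A-Fa-f]+")

def split_hex(s):
    """Return (target, substring) pairs; unmatched characters come out singly."""
    out, i = [], 0
    while i < len(s):
        m = HEXNUMBER.match(s, i)
        if m:
            out.append(("hexnumber", m.group()))
            i = m.end()
        else:
            out.append((None, s[i]))
            i += 1
    return out

print(split_hex("a0x1Fz"))  # [(None, 'a'), ('hexnumber', '0x1F'), (None, 'z')]
```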
This splits s in single utf-8 characters (valid) and lumps of non utf-8 conformant data (bad):
s split
[ valid bad
.valid ""00"-"7f, ""c0"-"df x, ""e0"-"ef x x, ""f0"-"f7 x x x
.x "80"-"bf
.bad [+: [-: valid] 1,]
]
The -: instruction is useful for tests like these.
[-: 1] – end of string
[-: -: instruction] – lookahead, instruction must succeed but is not included
The operations in this section work on strings as well as lists. Strings are processed bytewise, so certain operations can produce unexpected results with utf-8 multibyte characters.
String and list handling is optimized to avoid creating and hashing of temporary objects. Binding the temporaries to variables breaks these optimizations – if you have performance problems, use chained operations when possible.
In the following, s and t are strings or lists, n a non-negative integer and op an operator. An (<) in front of an operator – such as (<)+# – indicates that the operator +# has a pendant <+# that works in the reverse direction, counting indexes backwards.
s . n – the nth byte/element of s, counting from 1, fails unless 1 <= n <= s count .
s count yields the byte/element count of s. Use s utf-8 count for character count.
s count (t) counts non-overlapping occurrences of t within s.
s reverse – s backwards. Use s utf-8 reverse utf-8 if you need character based reversal.
s (<)+# n – the first n elements of s; fails if n > s count .
s (<)-# n – s without the first n elements; fails if n > s count .
s (<)++# n – the first n elements of s, or all of s if n > s count .
For lists, ++# is piped and only evaluates the first n elements of s.
s (<)--# n – s without the first n elements; "" if n > s count .
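The four slicing operators differ only in how they treat a count that is too large. The Python model below is a sketch with invented names; the Miss exception stands in for a failing PILS operation, and the reading of <+# as "the last n elements" is inferred from the description of reverse-direction operators.

```python
class Miss(Exception):
    """Stands in for a failing PILS operation."""

def take(s, n):       # s +# n
    if n > len(s):
        raise Miss("n exceeds the count of s")
    return s[:n]

def drop(s, n):       # s -# n
    if n > len(s):
        raise Miss("n exceeds the count of s")
    return s[n:]

def take_upto(s, n):  # s ++# n
    return s[:n]

def drop_upto(s, n):  # s --# n
    return s[n:]

def take_last(s, n):  # s <+# n (assumed meaning: the last n elements)
    if n > len(s):
        raise Miss("n exceeds the count of s")
    return s[len(s) - n:]

print(take("hello", 2), drop("hello", 2), take_last("hello", 2))  # he llo lo
print(repr(drop_upto("hi", 5)))                                   # ''
```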
In the following, (<)(+#/-#)op indicates that op has 2 pendants:
s +#op t = s +# (s op t)
s -#op t = s -# (s op t)
as well as reverse direction pendants <op <+#op <-#op
These are safe with utf-8 multibyte characters:
s (<)(+#/-#)=* t – count of s to and including first occurrence of t, or 0 if t is not found.
s (<)(+#/-#)^* t – count of s to but excluding first occurrence of t, or s count if t is not found.
s (<)(+#/-#)$* t – count of t if s begins with t, else 0 .
These are unsafe if t has utf-8 multibyte characters:
s (<)(+#/-#)#* t – count of common start of s and t.
s (<)(+#/-#)~* t – count of s to and including last element of sparse occurrence of t, or 0 .
s (<)(+#/-#)+* t – count of initial elements of s also found somewhere in t.
s (<)(+#/-#)-* t – count of initial elements of s not present in t.
To illustrate the workings of the combined operators, the expression below extracts the name part of a filename by excluding directory and extension, allowing for \ or / as directory separators:
fn <+#-* "\/" <-#=* "."
The <+#-* operator is safe with utf-8 filenames because ASCII characters \ and / never occur in utf-8 multibyte sequences.
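The two steps of the filename expression can be spelled out in Python. This is a sketch of how the expression reads to me, not a tested equivalent: name_part is an invented name, and the interpretation of the reverse operators (count from the end of the string) is an assumption based on the descriptions above.

```python
def name_part(fn):
    """Model of: fn <+#-* "\\/" <-#=* "." """
    # <+#-* "\/" : keep the trailing run of characters that are not separators
    n = 0
    while n < len(fn) and fn[len(fn) - 1 - n] not in "\\/":
        n += 1
    base = fn[len(fn) - n:]
    # <-#=* "." : drop everything from the last "." to the end, if there is one
    pos = base.rfind(".")
    k = 0 if pos < 0 else len(base) - pos
    return base[:len(base) - k]

print(name_part("C:\\docs/report.final.txt"))  # report.final
print(name_part("readme"))                     # readme
```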
The following paragraphs describe operations specific to lists.
The built-in list operations of PILS work by piping: instead of constructing lists directly, elements are passed one by one to pending list operations. If no list operations are pending, the elements are collected in an internal structure and the list is built when the operation is finished.
In the following, m and n are integers, s and t are lists unless otherwise mentioned.
listwise and singlewise are convenience operations for handling situations where a single element or a list of elements may be passed interchangeably, such as parameter lists vs. passing a single parameter.
s listwise is like s call {e|:ok e,} {& ok|:ok}
If s is a list, it is simply passed on. If not, it is wrapped in a single element list.
s singlewise is like s try {ok,|:ok}
The opposite of listwise : if s is a single-element list, the element is extracted; all other values are simply passed on.
s & t concatenates s listwise and t listwise .
s first (e) prepends e to s, for use with the fold operation
The up and down operations produce lists of increasing and decreasing integers:
m up (n) m, m + 1, m + 2 ... n or [] if n < m
m down (n) m, m - 1, ... n or [] if n > m
n up same as 1 up (n)
n down same as n down 1
Lists can be split in sublists of a certain length:
s split (n)
s split 2 splits s in pairs.
A list of lists can be joined to a simple list:
s splice
Non-list elements of s are included in the spliced list.
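The up, down, split and splice operations have close counterparts in most languages. The Python model below is a sketch with made-up function names; keeping a shorter last chunk when the list length is not a multiple of n is an assumption, as this section does not say what PILS does in that case.

```python
def up(m, n=None):
    """m up (n); with one argument, n up, which is 1 up (n)."""
    if n is None:
        m, n = 1, m
    return list(range(m, n + 1))

def down(m, n=1):
    """m down (n); with one argument, n down, which is n down 1."""
    return list(range(m, n - 1, -1))

def split_list(s, n):
    """s split (n): sublists of length n (assumption: last one may be shorter)."""
    return [s[i:i + n] for i in range(0, len(s), n)]

def splice(s):
    """s splice: join one level of sublists; non-list elements are kept."""
    out = []
    for e in s:
        if isinstance(e, list):
            out.extend(e)
        else:
            out.append(e)
    return out

print(up(3, 6), down(3), up(5, 2))  # [3, 4, 5, 6] [3, 2, 1] []
print(split_list(up(6), 2))         # [[1, 2], [3, 4], [5, 6]]
print(splice([[1, 2], 3, [4]]))     # [1, 2, 3, 4]
```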
Lists can be built by the list builder:
:list ... list := value ... ...
:list [tag]; ... list [tag] := value ...
This produces a list of all the values assigned. The tagged form allows selective writing in nested list builders.
List builders are often written as inversions: (... list := value ... ...) .:list
Lists can be filtered by the operations below. The filters are usually rulesets and do not need parentheses.
s each (filter)
Try to apply the filter function to all elements of s in turn. Pass a list of the results, ignoring misses.
s except (filter)
Like each, but pass only the elements that miss the filter. Results returned by the filter are ignored.
s legs (assign-filter)
Like each, but for each element e at position n (starting from 1 as always), the assignment n := e is tried, instead of just e.
s every (filter)
Like each but requires the filter function to succeed for all elements.
s find (filter)
Like s each (filter) 1 but faster – returns the first filtered element, fails if the filter never succeeds.
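The relation between each, except, every and find can be summarized by a Python model in which a filter is a function that returns either a result or a special MISS value. This is a sketch only: the names, the MISS sentinel and the choice to let every return MISS on failure are inventions for the illustration.

```python
MISS = object()  # stands in for a filter that does not accept its argument

def each(s, f):
    """Results of f for the elements it accepts, ignoring misses."""
    return [r for r in map(f, s) if r is not MISS]

def except_(s, f):
    """The elements that f misses; results of f are ignored."""
    return [e for e in s if f(e) is MISS]

def every(s, f):
    """Like each, but fails unless f accepts all elements."""
    results = [f(e) for e in s]
    return MISS if any(r is MISS for r in results) else results

def find(s, f):
    """The first result of f, or MISS if f never succeeds."""
    for e in s:
        r = f(e)
        if r is not MISS:
            return r
    return MISS

def half_of_even(n):
    return n // 2 if n % 2 == 0 else MISS

print(each([1, 2, 3, 4], half_of_even))     # [1, 2]
print(except_([1, 2, 3, 4], half_of_even))  # [1, 3]
print(find([1, 3, 4, 6], half_of_even))     # 2
```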
To eliminate duplicates from a list:
s distinct
To eliminate duplicates based on a key function rather than the element:
s distinct (filter)
All elements of s will be tried by the filter. If this produces a constant that has not been used, the original element is passed through. If a ;name .value node is produced and the name has not been used, the value is passed through, instead of the original element.
To consume a list:
s fold (assign-filter)
Implements a list consumer with state. s must have at least one element; the first element is set to the state. This is usually specified like
s first (initial-state) fold (assign-filter)
For all following elements e, assign-filter is required to process the assignment {state := e|...}, resulting in a new state. Finally, the state is returned.
The fold operation is often combined with a list builder to create stateful filtering. Example:
["Rene" male "John" "Peter" female "Jane" "Susan"]
first [unknown] fold
{ gender := $ name | :ok list := gender, name; gender }
{ ? := / gender | :ok gender }
.:list
The result is:
[unknown "Rene", male "John", male "Peter", female "Jane", female "Susan"]
($ and / are typechecks, as explained in the section on patterns.)
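The gender example can be replayed in Python to show how the state travels through the list. This is a model, not PILS code: the Name class stands in for PILS names such as male, the Python list out plays the role of the list builder, and all function names are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Name:
    """Stands in for a PILS name such as male or female."""
    text: str

def fold(items, step):
    """The first element is the initial state; step(state, e) gives the next."""
    it = iter(items)
    try:
        state = next(it)
    except StopIteration:
        raise ValueError("fold needs at least one element")
    for e in it:
        state = step(state, e)
    return state

def tag_by_gender(items):
    out = []  # plays the role of the list builder

    def step(gender, e):
        if isinstance(e, str):        # like { gender := $ name | ... }
            out.append((gender.text, e))
            return gender
        return e                      # like { ? := / gender | :ok gender }

    fold([Name("unknown")] + items, step)
    return out

data = ["Rene", Name("male"), "John", "Peter", Name("female"), "Jane", "Susan"]
print(tag_by_gender(data))
```

The printed pairs correspond to the result shown above: Rene is tagged unknown, John and Peter male, Jane and Susan female.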
To sort a list by keys, use:
s order (key-function)
s order [ok] (simple sorting using a trivial key-function)
key-function is required to process all elements of s. A sorted copy is then passed, based on comparing the resulting keys. Numbers and time values are compared numerically; for strings, binary comparison is used. To achieve collated ordering, the key-function must transform the strings to collation keys. For multiple-key sorting, the key-function should return a list.
To extract a single element of a list by smallest/largest key:
s smallest (key-function)
s largest (key-function)
To simply get the smallest/largest element:
s smallest [ok]
s largest [ok]
To get the longest string in a list:
s largest {? $ ok|:ok}
If the largest key is shared by several elements, the first of them is returned.
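Sorting and selection by key map directly onto Python's built-ins, which may help readers coming from that language. The sketch below is a model with invented function names; byte_length mimics the byte-based string length of PILS, and a tuple plays the role of the list returned for multiple-key sorting.

```python
def order(s, key):
    """s order (key-function): a sorted copy, compared by key."""
    return sorted(s, key=key)

def smallest(s, key):
    return min(s, key=key)  # the first element wins among equal keys

def largest(s, key):
    return max(s, key=key)  # the first element wins among equal keys

def byte_length(text):
    """Length in bytes of the utf-8 encoding, as PILS counts strings."""
    return len(text.encode("utf-8"))

words = ["pear", "fig", "plum"]
print(largest(words, byte_length))   # pear (plum is just as long but comes later)
print(smallest(words, byte_length))  # fig
print(order(["b2", "a2", "a1"], lambda w: (w[1], w[0])))  # ['a1', 'a2', 'b2']
```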
To group data by key:
list-expression groups (key-function)
The key-function is tried for all list elements, and should either return a constant name or a ;name .value node or fail; returning an expressive other than a ;name .value node is an error. Returning a constant name is equivalent to returning a ;name .value node with that name, and the original element as a value.
The resulting lists of values for each name are assembled in a node:
groups: .name value-list ...
If no values are present, [] is returned.
The expression
15 up groups {n|:ok {} write (n) count}{13|?}
will group the numbers from 1 to 15 by digit count, leaving out 13. The result is:
[groups: .2 10 11 12 14 15 .1 1 2 3 4 5 6 7 8 9]
For each name, the values are listed in the order in which they occur in the original list, however the order of the keys is arbitrary, as is always the case with attribute names.
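The grouping example can be mirrored in Python with a dictionary in place of the groups node. This is a sketch: the SKIP sentinel models a key-function that leaves an element out, the ;name .value form of key results is not modelled, and the function names are made up.

```python
SKIP = object()  # models a key-function that leaves an element out

def groups(s, key):
    """Collect the elements of s in lists, one list per key."""
    out = {}
    for e in s:
        k = key(e)
        if k is SKIP:
            continue
        out.setdefault(k, []).append(e)
    return out

def digit_count_except_13(n):
    return SKIP if n == 13 else len(str(n))

result = groups(range(1, 16), digit_count_except_13)
print(result[2])  # [10, 11, 12, 14, 15]
print(result[1])  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```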
For the common case where only the first or last value for each key is wanted:
list-expression firsts (key-function)
list-expression lasts (key-function)
works like groups , but records only the first/last element for each key, instead of a list of all its elements.
list-expression singles (key-function)
like firsts but fails if a key is used more than once
list-expression folds (combined-key-function-and-assign-filter)
like firsts but when an already-used key is encountered, the old and new value are folded, using assign rules of the same filter that was used for key extraction. The whole expression misses if a fold misses.
This can be used for summing etc.:
[apples 1, oranges 2, bananas 3, apples 4]
folds {kind, count|:ok ;name kind .value count} {a := b|:try a + b}
returns
[folds: .apples 5 .bananas 3 .oranges 2]
as the apples counts are added in the folding rule.
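The fruit example, and the way firsts and lasts relate to folds, can be modelled in Python. This is a sketch with invented names in which the key extraction is assumed to have happened already, so the input is a list of (name, value) pairs and the result is a dictionary instead of a node.

```python
def folds(pairs, combine):
    """Keep one value per name; combine(old, new) handles repeated names."""
    out = {}
    for name, value in pairs:
        out[name] = combine(out[name], value) if name in out else value
    return out

def firsts(pairs):
    return folds(pairs, lambda old, new: old)

def lasts(pairs):
    return folds(pairs, lambda old, new: new)

fruit = [("apples", 1), ("oranges", 2), ("bananas", 3), ("apples", 4)]
print(folds(fruit, lambda a, b: a + b)["apples"])      # 5
print(firsts(fruit)["apples"], lasts(fruit)["apples"])  # 1 4
```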
The traverse operation rearranges a rectangular two-dimensional list, swapping dimensions:
[one 1, two 2, three 3] traverse returns [one two three, 1 2 3]
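In other languages this operation is usually called transposition. A minimal Python model follows; the name traverse is borrowed from the PILS operation, and raising an error for non-rectangular input is an assumption, as the behaviour of PILS in that case is not described here.

```python
def traverse(rows):
    """Swap the two dimensions of a rectangular list of lists."""
    if rows and any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("input must be rectangular")
    return [list(column) for column in zip(*rows)]

print(traverse([["one", 1], ["two", 2], ["three", 3]]))
# [['one', 'two', 'three'], [1, 2, 3]]
```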
Named nodes can be constructed by simple evaluation:
oops: 1 + 2 returns the node constant [oops: 3]
Action and binding nodes can be constructed similarly with escapes.
: 4 + 5 + 6
is the same as
::who (:who 4 .call +: 5) .call +: 6
and returns
9 + 6
:: 4 + 5 + 6 returns : 15 – the double escape constructs an escape.
An escaped ruleset evaluates all patterns and actions, and constructs a new ruleset from them.
To process the legs of a node in turn:
node legs (assign-filter)
For each leg, the assignment name := value is tried in the filter; the results are passed as a list, ignoring misses.
To copy a node but without a specific leg:
node without (leg-name)
To add or replace a leg:
node merge (leg-name, leg-value)
To merge two nodes a and b:
a merge (b)
When merging nodes a and b this way, the legs of b override legs of a with the same names. The head of b is used unless it is [] ; in that case the head of a is used.
If either argument of merge is [] , the other argument is returned, except for this case:
[] merge (nonempty-list)
which is undefined unless nonempty-list is a pair (name, value) where name is a constant. In that case, the result is a node anyway: .name value
This convention serves for building nodes with merge, starting with the empty list.
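The merge rules are easier to follow with a model. The Python sketch below is an illustration under assumptions: a node is represented by an invented Node class with a head and a dictionary of legs, None models the empty head, a Python list models a PILS list, and the check that a leg name is a constant is left out.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    head: object = None  # None models the empty head []
    legs: dict = field(default_factory=dict)

def without(node, name):
    """node without (leg-name)"""
    return Node(node.head, {k: v for k, v in node.legs.items() if k != name})

def merge(a, b):
    """a merge (b), where b is a node, a (name, value) pair or []."""
    if isinstance(b, list):
        if b == []:
            return a                 # [] as second argument: a is returned
        if len(b) != 2:
            raise ValueError("undefined: expected a (name, value) pair")
        name, value = b
        base = Node() if a == [] else a
        return Node(base.head, {**base.legs, name: value})
    if a == []:
        return b                     # [] as first argument: b is returned
    head = b.head if b.head is not None else a.head
    return Node(head, {**a.legs, **b.legs})  # legs of b override legs of a

p = Node("point", {"x": 1, "y": 2})
print(merge(p, Node(None, {"y": 5, "z": 9})))
print(merge([], ["x", 1]))
```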
To replace the name of a node or cliche:
a head (h)
To build a single-leg cliche as used for names:
namespace-identifier // local-name
This operation is mostly used with text strings but works with other constants as well.
To search a structure of nodes or lists recursively, possibly replacing some parts:
s **=** filter
First, filter gets a chance to process s. If this succeeds, the result is returned with no further processing. If it misses and s is a list or node, its leg values or elements are processed by first calling the assignment n := value , n being the name or number of the leg, and then simply value , in filter; if both miss and value is a node or list, the process is applied recursively. When the filter calls succeed and return other values than the original, new nodes or lists are generated if the result of the operation is to be used.
This example will zero all numbers except price legs which are doubled. Note that the assignment is tried first, regardless of rule order.
s **=** {[price] := # ok|:ok . * 2} {#?|:ok 0}
Similarly, when transforming a list element, the assignment count := value is tried before value ; count being the position in the list, starting from 1.
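The search order of the recursive transformation can be modelled in Python, with dictionaries for nodes and lists for lists. This is a sketch and not the PILS algorithm itself: the filter is split into two invented functions, one for plain values and one for assignments, each returning a MISS sentinel when it does not apply.

```python
MISS = object()

def transform(s, on_value, on_assign):
    """Model of s **=** filter."""
    r = on_value(s)  # the filter gets a chance at the top level first
    if r is not MISS:
        return r
    return _legs(s, on_value, on_assign)

def _legs(s, on_value, on_assign):
    if isinstance(s, dict):
        return {k: _leg(k, v, on_value, on_assign) for k, v in s.items()}
    if isinstance(s, list):  # positions count from 1
        return [_leg(i, v, on_value, on_assign) for i, v in enumerate(s, 1)]
    return s

def _leg(name, value, on_value, on_assign):
    r = on_assign(name, value)  # the assignment is tried first
    if r is not MISS:
        return r
    r = on_value(value)         # then the plain value
    if r is not MISS:
        return r
    return _legs(value, on_value, on_assign)  # both missed: recurse

def is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def double_price(name, v):  # like {[price] := # ok|:ok . * 2}
    return v * 2 if name == "price" and is_number(v) else MISS

def zero_number(v):         # like {#?|:ok 0}
    return 0 if is_number(v) else MISS

order = {"price": 10, "qty": 3, "parts": [{"price": 1.5, "weight": 2}]}
print(transform(order, zero_number, double_price))
# {'price': 20, 'qty': 0, 'parts': [{'price': 3.0, 'weight': 0}]}
```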
To search a structure without transforming it, simply do a transformation in a statement where the result is not used – typically a list builder. To list all node names in s:
s **=** {name * ?|list := name} .:list
Tip: If the filter only has assignment rules, the top level will not be matched. If the topmost rule (which is the last to be searched) is:
{ ? := ok | :ok }
only direct legs of the top node or list will be searched.
PILS expressions can be evaluated by the evaluate-operator --- which can be used as a prefix or infix operator:
--- e evaluates the value of e in the current context (e gets evaluated twice).
e --- c evaluates the value of e using the value of c as context.
Expressions can be quoted with the :quote statement:
:quote e or, using an inversion: e .:quote
This returns e as it stands, without evaluation.
Note: :quote is rarely needed in PILS. There is no reason to use it for constant nodes or lists.
To bind names to values locally:
;name expression; tail
expression is evaluated in the current context cc, and tail is evaluated in a context ;name ev; cc where ev is the value of expression.
Several names can be bound in one go:
;name1 expr1 .name2 expr2 ...; tail
All the exprs are evaluated in the original current context, in undefined order. (Future PILS implementations might evaluate them in parallel.)
All constants except [] are valid names for binding.
To use a ruleset locally:
:use rules;
expression
or
expression
===
rules
These have the same effect: rules is evaluated in the current ruleset and used to extend it, then expression is evaluated in the extended context. Typically, rules is a ruleset and gets bound by the evaluation, but any expressions – in particular, references to PILS modules – can be used.
A conditional has this general form:
:if condition,... ; success-expression .else else-expression
else-expression is optional:
:if cc; e is like :if cc; e .else []
A single condition can be used instead of a list.
Conditions take these forms:
expression (positive condition)
expression => pattern (match condition)
expression +> pattern (positive match condition)
Conditions are tested by evaluating expression and testing the value. For positive conditions, the value must be a positive integer, for match conditions, it must match the pattern. If a pattern binds variables, the bindings take effect for the following conditions and success-expression.
Note: if a variable occurs twice in a pattern, the corresponding values are required to be identical, but if it occurs in two separate condition patterns, independent bindings will be created, the latter overriding the former.
For positive match conditions, the value must be a positive integer and match the pattern, which should be a simple variable or ? . Positive match conditions are mainly used with test searching, to test for a hit and bind its position.
If expression fails, a positive condition will raise an error, whereas match conditions and positive match conditions will silently transfer control to else-expression.
A call can be attempted with the try operation:
(argument) try (filter)
This is like the call operation, but if filter does not accept argument , argument is simply passed on.
Looping is done with the repeat operation:
(argument) repeat (filter)
This is like the try operation, but filter is repeatedly applied on the last value. When filter fails, the last value is returned.
This example lists the integers from 1 to 10 (though 10 up is an easier way):
1 repeat {n <= 10|:ok list := n; n + 1} .:list
To iterate a constant as long as it changes:
(argument) again (filter)
argument is evaluated and must return a constant, or the operation will fail. Then, filter is repeatedly applied and must always return a constant, or the operation will fail. When this constant is the same as the last value, it is returned.
1 again {x|:ok 1 / x + 1}
is similar to:
1 repeat {x|;new 1 / x + 1; :if new <> x; :ok new}
and returns 1.618033988749895 on the x86 (it may fail on architectures with different floating-point rounding characteristics).
To prevent infinite loops, the again operator has built-in cycle detection: at iterations 64, 128, 256, 512, 1024... a value is sampled and compared against the following values. If it reappears after at least one other value, a cycle has been detected and the again operation fails. In examples like the above, floating point rounding errors may cause the loop to end up flipping between two or more values – without the cycle detection, this would result in an endless loop.
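The repeat and again loops, including the sampling scheme for cycle detection, can be modelled in Python. This is a sketch based only on the description above, not on the PILS source. The names are invented, MISS marks a filter that does not accept its argument, and the Miss exception marks a failing operation.

```python
MISS = object()

class Miss(Exception):
    """Stands in for a failing PILS operation."""

def repeat(value, step):
    """Apply step until it misses; return the last value."""
    while True:
        new = step(value)
        if new is MISS:
            return value
        value = new

def again(value, step):
    """Apply step until the value stops changing; fail on a detected cycle."""
    sample, next_sample, i = None, 64, 0
    while True:
        new = step(value)
        i += 1
        if new == value:
            return new                  # fixed point reached
        if sample is not None and new == sample:
            raise Miss("cycle detected")
        if i == next_sample:            # iterations 64, 128, 256, ...
            sample, next_sample = new, next_sample * 2
        value = new

collected = []

def count_to_ten(n):  # like {n <= 10|:ok list := n; n + 1}
    if n <= 10:
        collected.append(n)
        return n + 1
    return MISS

print(repeat(1, count_to_ten), collected)
print(again(100, lambda x: x // 2))  # 0
```

In this model, the golden ratio example above corresponds to again(1.0, lambda x: 1 / x + 1) . Whether that call returns or raises Miss depends on floating point rounding, as noted in the text.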
The :exit statement looks similar to the :list statement:
:exit ... exit := value ... ...
:exit [tag]; ... exit [tag] := value ...
Like the list statement, exit statements are often written using inversion:
(... exit := value ... ...) .:exit
If the assignment is performed, value is returned immediately from the :exit statement.
Two expressions can be combined by the operators and , or and anyway :
e1 and: e2
e1 is evaluated. If successful, the result is discarded and control passes to e2. If e1 fails, an error is raised and e2 is not evaluated.
e1 or: e2
e1 is evaluated and returned if successful. If e1 fails, control passes to e2.
e1 anyway: e2
e1 is evaluated. Whether it fails or succeeds, control is passed to e2.
As an alternative to the form (object) (argument) , object calls can be specified with the call operation:
(argument) call (object)
The operation applies object to argument .
This differs from (object) (argument) in the following ways:
The (object) (argument) form may interpret argument as a built-in operation. The call operation does not.
If object does not accept argument , the (object) (argument) form proceeds to create a call node :who object .call argument and call it in the current context, allowing backing rules to deal with failed operations – whereas the call operation simply fails.
object gets evaluated before argument
Callarounds - nodes of form
call: argument
have a special interpretation when used as objects: they switch the argument with the object. To illustrate its use, consider this expression:
list each {item|:try item price}
The expression above will return a list of the prices of those list items that have a price. However, a shorter and faster way to get it is:
list each [call: price]
This works as follows: each item from list is given as argument to the object [call: price] , which deals with them by trying to call their price methods.
Two objects a and b can be combined into an extending aggregate with the +++ operator:
a +++ b
When the extending aggregate is called upon, b gets a try first. If b does not handle the call, a gets a try.
Extender aggregates are similar to subclassing in traditional object oriented languages, but aggregates are a runtime concept and work directly on objects, as PILS has no class concept.
If b = [] , the +++ operator simply returns a .
A filtering aggregate can be created with the -> operator:
a -> b
When this aggregate is called upon, a must process the call, and b must process the result, or the call fails.
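The calling order of the two aggregate types can be modelled with plain Python functions. This is a sketch: objects are functions that return a MISS sentinel when they do not handle a call, None plays the role of [] , the names extend and pipe are invented, and :who bindings and assignments are not modelled.

```python
MISS = object()

def extend(a, b):
    """Model of a +++ b: b gets the first try, then a."""
    if b is None:  # None models []
        return a

    def call(arg):
        r = b(arg)
        return r if r is not MISS else a(arg)
    return call

def pipe(a, b):
    """Model of a -> b: a must handle the call and b must handle the result."""
    def call(arg):
        r = a(arg)
        return MISS if r is MISS else b(r)
    return call

def increment(x):
    return x + 1 if isinstance(x, int) else MISS

def name_zero(x):
    return "zero" if x == 0 else MISS

obj = extend(increment, name_zero)
print(obj(0), obj(5))                  # zero 6
print(pipe(increment, name_zero)(-1))  # zero
```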
Serial compounds cannot process assignments.
Note: the -> operator has the priority of a relational operator, associating from the right. The priority follows from the last character.
The :who binders will refer to the entire aggregate if used in a or b, allowing the component objects to call methods on the aggregate; this is similar to virtual methods. To isolate an aggregate component, wrap it like this:
who: object
Any :who binders in object will now refer to object, rather than the containing aggregate.
For use with the programming system's wrapping of platform specific objects, an extended form is supported:
who: . who-binding ; object
where who-binding specifies the value of :who -bindings inside object.
Patterns are used in rules and match conditions; they specify a structure into which data can be fitted.
A constant matches the exact same constant and nothing else.
A standalone name (written simply as name but represented by a node :call [name] ) is a variable that matches anything and binds it to the name. If a name is bound more than once in the same pattern, the values must be identical.
The joker ? or, to spell it out: :call [] , matches anything and ignores it.
Any other node :call constant is treated as a variable.
Beware: this rule:
{ fac (0) | :ok 1 }
may not do what you expect. The phrase fac (0) is read as :call [fac: 0] which is a variable with the name [fac: 0] .
A list pattern matches a list of the same length, with matching elements.
A node pattern matches a node with the same name and leg names, and matching leg values, unless otherwise specified in the following paragraphs. As always, leg order has no significance.
For constant nodes and lists, this is equivalent to matching the exact same constant, as specified for constants.
Action nodes should generally be escaped when used in patterns with the intention of matching a similar node, as some action nodes have special meaning in patterns.
An escaped pattern
: pattern
works the same as if pattern had been specified directly, except that if pattern is a node, any special conventions for node patterns are ignored. This allows the use of node patterns that would otherwise mean something else. A double-escape
:: x
matches a single escape – the outer escape tells the pattern compiler to treat the inner escape as a normal node, i.e. to match a node of the same structure, which is an escape.
These operations specify type checks. s is typically a variable or ? .
# s – number
% s – integer
+ s – integer > 0, same as % s > 0 , see below
s count – integer >= 0, same as % s >= 0
s time – timestamp
s duration – duration
s dating – dating
$ s – string
+$ s – nonempty string, same as s $ (? > 0) , see below...
++$ s – string of 2 or more bytes, same as s $ (? > 1)
& s – list
+& s – nonempty list, same as s & (? > 0)
++& s – list of length > 1, same as s & (? > 1)
* s – node
/ s – cliche
= s – any constant
s legs – node or list
The = type check can be combined with other typechecks:
= & s – list constant
The system interface may define additional type checks for various system objects.
The search operators (<)=* (<)^* (<)$* (<)#* (<)~* (<)+* (<)-* can be used against string literals and list constants. The resulting integer cannot be extracted in the pattern but is required to be > 0 .
To accept strings that begin with "http://" and end with "/" :
s $* "http://" <$* "/"
To accept lists containing the name [language] :
s =* [language,]
(Note the , – the search operations use lists, not list elements.)
Single attribute cliches like [namespace|namestring] can be split by a pattern like:
namespace // namestring
namespace and namestring can be any constants, but typically they are strings.
This pattern will match a name in the default namespace, binding name to the unsplit name and namestring to the namestring.
name = ?:? // namestring
Splitting of cliches with more than one leg name is, by design, not supported since the undefined order of the legs would lead to ill-defined semantics.
A pattern like alias = pattern requires the data to match both alias and pattern. Typically, alias is a variable, used to bind a node or list that is processed further by pattern.
A value can be required to differ from a constant
s <> constant
or compared against numeric literals:
s compare-operator literal
literal compare-operator s
where literal is a numeric literal and compare-operator is one of the operators < > <= >=
The compares can be chained. This pattern accepts an integer in the interval 10 – 20:
10 <= % x <= 20
Note: chained compares are a pattern-only concept. If used as an expression, 10 <= x <= 20 may not do what you expect.
To extract the length of a list or string:
list & length
string $ length
Lengths can be constrained using compares. This pattern matches a string of no less than 5, no more than 10 bytes:
string $ (5 <= length <= 10)
Specific list elements can be retrieved, using integer literals as indexes. Positive integers specify elements counting from 1 (as usual), 0 specifies the last element, -1 the last but one etc.
This extracts the 3rd element e from a list:
(e) 3
This accepts a list q with identical first and last elements:
q = (e) 1 = (e) 0
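The index convention can be illustrated with a small Python sketch. This is not PILS code: the function names are hypothetical, and raising an error for an out-of-range index is an assumption, since the text does not say what a pattern does in that case.

```python
def pils_element(items, index):
    """Select an element the way the indexes described above are read.

    Positive indexes count from 1, 0 is the last element,
    -1 the last but one, and so on.
    """
    length = len(items)
    position = index if index > 0 else length + index  # 1-based position
    if not 1 <= position <= length:
        # Assumption: an out-of-range index is treated as an error here.
        raise IndexError(f"index {index} is outside a list of length {length}")
    return items[position - 1]


def same_first_and_last(items):
    """Rough counterpart of the pattern  q = (e) 1 = (e) 0 ."""
    return len(items) > 0 and pils_element(items, 1) == pils_element(items, 0)
```

Under these assumptions, index 3 selects the third element, index 0 the last one, and a list passes the second check only when its first and last elements are equal.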
When specifying unescaped nodes in a pattern, the node name [] acts as a joker:
;x
matches any node with an x leg and no other legs, binding x to the leg value.
To extract the node name:
h ;x
This is as ;x but h is bound to the node name.
Using the node typecheck operator * with a node allows nodes with more legs than those specified in the pattern:
* point: .x .y will accept nodes like point: .x .y .z etc.
Anyway-specifiers allow nodes with fewer legs to be accepted, specifying constant default values for the missing legs:
(name: .leg ...) anyway [.leg ...]
This can be combined with the * operator to allow more legs as well:
(* name: .leg ...) anyway [.leg ...]
When an anyway-specifier is used, the node name [] is not treated as a joker. If all the specified legs have defaults, the node name itself is acceptable to the pattern as a placeholder for a node with no legs.
The anyway-specifier is designed to allow default values for named parameters.
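The interplay between the * operator and the anyway-specifier can be sketched in Python. This is only an analogy, not PILS code: a node is modelled as a plain dictionary of legs, the node name is ignored, and the function name match_legs is hypothetical.

```python
def match_legs(node_legs, pattern_legs, defaults=None, allow_extra=False):
    """Return the bindings for a matching node, or None for a miss.

    node_legs    -- dict of leg name -> value (the data)
    pattern_legs -- leg names listed in the pattern
    defaults     -- constant defaults, as an anyway-specifier would supply
    allow_extra  -- True models the * operator (extra legs are accepted)
    """
    defaults = defaults or {}
    bindings = {}
    for leg in pattern_legs:
        if leg in node_legs:
            bindings[leg] = node_legs[leg]
        elif leg in defaults:
            bindings[leg] = defaults[leg]  # missing leg, default applies
        else:
            return None  # required leg is absent
    extra = set(node_legs) - set(pattern_legs)
    if extra and not allow_extra:
        return None  # more legs than the pattern permits
    return bindings
```

With defaults for every leg, an empty dictionary matches as well, which corresponds to accepting the bare node name as a placeholder for a node with no legs.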
The constant replacer operations perform simple replacements on data before passing it to the qualified pattern. There are two variations:
pattern call [.key replacement ...]
The keys are the only values accepted. The corresponding replacement is passed to pattern.
pattern try [.key replacement ...]
The keys are replaced, other values are allowed and passed unchanged.
Constant replacers are useful when dealing with enumerations in system interfacing, but they have other uses as well.
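The difference between the two replacer variations can be shown with a short Python analogy. It is not PILS code: the table stands in for the constant node of keys and replacements, and the MISS marker and function names are hypothetical.

```python
MISS = object()  # hypothetical marker for "the pattern does not match"


def replace_call(value, table):
    """Model of  pattern call [...] : only the keys are accepted."""
    return table.get(value, MISS)


def replace_try(value, table):
    """Model of  pattern try [...] : other values pass through unchanged."""
    return table.get(value, value)
```

For an enumeration such as {"left": 0, "right": 2}, the first function rejects "up" while the second passes "up" on to the pattern unchanged.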
To extract the full paths of files and folders:
file file: filename
folder folder: path
To extract PILS extenders of system objects:
systemobjekt når: pils-extender
While PILS is generally an interpreted language, patterns are compiled into an intermediate form called quicksteps, which are substantially faster than the interpreted structures. During a match, quicksteps never perform reference counting or create objects, or use the context. When a match has succeeded, bindings are created and reference counters updated.
Compilation is a fast process that happens automatically whenever match conditions and rulesets are read by the parser or generated by PILS code.
To speed up rule search, the rules of a ruleset are indexed when possible. For a pattern to be indexable, the top level must be a directly specified list, a node with fully specified head and leg names, or a cliche, string or number.
Aliases do not affect indexing, so rules like { alias = indexable-pattern | ... } are indexed as well.
Assignment rules are indexed by the left side of the assignment, and kept separate from normal rules.
Indexing is transparent to the programmer, except for the performance implications. When constructing large rulesets, such as state machines for parsers, the rules should be constructed as indexable whenever possible, and non-indexable rules should be lumped together, preferably near the top of the ruleset.
When a rule action is entered, the rule call is unresponded: it has not been decided whether the rule will succeed or fail. The responders and binders described below are valid in unresponded rule calls only; in a responded rule call, they will fail.
PILS has the following responders:
:ok tail
:try tail
:need expression (often written as expression .:need )
:do tail
:error expression
The lexical scoping of responders and their use for individual operations rather than statement blocks allow a more fine-grained error handling than the try-catch approach used in mainstream languages.
The :ok responder is the preferred way of getting a result from a rule. The responder takes responsibility for the call and returns the evaluated expression which should not contain further responders for the call.
This is similar to the return statement of C and its derivatives, with tail flattening granted.
Implementation note: Whenever the :ok responder is the first and only rule action, with nothing but binders before it, it is compiled and the stack setup required for responders is eliminated. If the following expression is a constant, a parameter value or a simple node or list construction based only on constants and parameter values, the rule will be fully compiled.
The :try responder tries to pass the responsibility to tail, without taking responsibility by itself. If the top level construct of tail succeeds or fails, the same happens to the original call, with tail flattening all the way. If the :try falls through, an empty list is returned. The rule will then miss, unless another responder responds.
Unlike the try statement of C++ and derivatives, the :try responder does not cover subexpressions that might fail or miss.
The :need responder is used for partial results that are required but not sufficient for the rule to succeed.
Failure will be forwarded to the original call, whereas success will be returned as result of the :need statement. Fall-throughs will make the innermost :try for the same rule miss. If no :try is active, the rule will miss.
If a needed expression has subexpressions susceptible of failure, nested :need responders can be used to cover these. The inversion form expression .:need is convenient for this use.
The :possibly responder returns the evaluated expression, but in contrast to the :ok responder, embedded responders are allowed, especially the :need responder which has the same effect as when used inside a :try responder.
The :error responder throws an error directly to the original call.
These binder statements:
:self tail
:who tail
:where tail
:what tail
bind their names to data circumstantial to the call, as follows.
self is bound to the bound ruleset.
who is bound to the object being called, possibly an aggregate containing the bound ruleset. For object calls on a bound ruleset that is not aggregated, who is the same as self .
For simple calls that do not specify an object, who is the context in which the call was made, i.e. the same as where .
where is bound to the context in which the call was made.
what is bound to the call node that initiated the call.
The self and who binders are typically used when an object needs to use its own methods. Using the who binder is similar to virtual calls of OO languages.
The where and what binders are used by the bug pinner to locate failing calls. When a call is forwarded by the :try or :need responder, binders in the forwarded call will behave as follows:
:where and :what refer to the original call.
:self and :who refer to the forwarded call.
The binders and responders exist in tagged versions, for use within embedded rules.
The tagged binders
:self [tag]; tail
:who [tag]; tail
:where [tag]; tail
:what [tag]; tail
bind name [tag] instead of name.
The tagged responders
:ok [tag]; tail
:try [tag]; tail
:need [tag]; expression
:error [tag]; expression
are used with the tagger:
:tag [tag]; tail
allowing an inner rule to act on behalf of an encompassing rule.
{ ...
| ... :tag [tag]; ...
... { ... | ... :ok [tag]; ... }
}
Generally, tagged responders are rarely used, though the PILS programming system uses them to deal with syntax errors in modules.
The explicit managing of the responsibility of calls helps in locating bugs. A major problem of weakly typed languages will surface in situations as shown by the following simplistic example:
! Anna's application tries to double a number using Benny's doubler library
double "MMVII"
===
! Benny's library uses Barry's library
{ double: x | :ok multiply (x .by 2) }
===
! Barry's library routine expects numbers
{ multiply: x .by | :ok x * by }
When Anna executes her application, the operation x * by in Barry's library will fail, leaving Anna mystified, not knowing how Barry's code got involved in her application.
The situation is as illustrated by this nursery rhyme (by Halfdan Rasmussen, translation suggested by Ian Noble in a usenet discussion):
Benny's breeks were burning.
Barry roared anon.
Barry having, namely,
Benny's britches on.
This is a general problem with weakly typed programming languages – when the behaviour of the data is not defined by an explicit contract, programmers have to resort to stack dumps, single stepping, or test logs to find the real cause of such failures. In PILS however, a sensible use of responders can help pinning a bug in one swift move.
Using the :try or :need responder, PILS allows the library routine to perform specific operations on behalf of its caller, so when things go wrong, the caller is blamed.
In the above example, Barry should revamp his library to use a more pessimistic approach that doesn't trust the arguments. Anna will now have this setup:
double "MMVII" ! Anna's application
===
{ double: x | :ok multiply (x .by 2) } ! Benny's library
===
{ multiply: x .by | :try x * by } ! Barry's revised library
Now, the multiply (x .by 2) operation in Benny's library gets the blame. Benny might revise his library, so Anna gets this setup:
double "MMVII" ! Anna's application
===
{ double: x | :try multiply (x .by 2) } ! Benny's revised library
===
{ multiply: x .by | :try x * by } ! Barry's revised library
And the failure is now immediately blamed on the double "MMVII" call – pointing right to the smoking match in Anna's hand. Mystery solved.
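The way blame moves in the three setups above can be mimicked in Python. This is a toy model, not the PILS mechanism: the Failure class, the rule and call helpers and the forwards flag are all hypothetical, with forwards=False standing in for :ok and forwards=True for :try. It covers only the three setups shown in the text.

```python
class Failure(Exception):
    """A failed operation, tagged with the expression that gets the blame."""

    def __init__(self, blamed, forwarded=False):
        super().__init__(blamed)
        self.blamed = blamed
        self.forwarded = forwarded  # True: the rule declined responsibility


def rule(body, forwards):
    """Wrap a rule body; forwards=True models :try, False models :ok."""
    def wrapped(*args):
        try:
            return body(*args)
        except Failure as failure:
            raise Failure(failure.blamed, forwarded=forwards) from None
    return wrapped


def call(site, target, *args):
    """Call a rule from the expression `site`; take the blame if forwarded."""
    try:
        return target(*args)
    except Failure as failure:
        if failure.forwarded:
            raise Failure(site) from None
        raise


def times(x, by):
    """The primitive operation that fails on non-numeric arguments."""
    if not isinstance(x, (int, float)) or not isinstance(by, (int, float)):
        raise Failure("x * by")
    return x * by


def blamed_expression(barry_forwards, benny_forwards, argument):
    """Run Anna's call and report which expression is blamed, if any."""
    multiply = rule(lambda x, by: times(x, by), barry_forwards)
    double = rule(lambda x: call("multiply (x .by 2)", multiply, x, 2),
                  benny_forwards)
    try:
        call('double "MMVII"', double, argument)
    except Failure as failure:
        return failure.blamed
    return None
```

In this model, the original libraries blame x * by, Barry's revision moves the blame to multiply (x .by 2), and with both revisions the blame lands on the double "MMVII" call.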
The above example only serves as a simple illustration of the concept; such simple calculations would really be better served by typechecks in the pattern. However, in cases where the arguments are objects supposed to implement certain methods, type checks are of little help, PILS not having classes or interfaces to test against. Instead, the methods can be called without taking responsibility, so that the caller that supplied the arguments gets the blame when needed methods are not supported by the arguments.
PILS is crafted to support this technique without the code bloat that would result if individual operations had to be dealt with by traditional try-catch blocks.
The PILS kernel supports extending the data model with constant types, by means of registering parser plugins and methods.
When the parser encounters a starting brace { a lookahead is performed, keeping track of strings and nested parentheses. If a closing brace } is encountered with no unnested rule bar | in between, the parser plugins get a try at parsing the enclosed characters.
The supplementary types are provided by the framework bindings and may vary between frameworks.
So far, all PILS data structures presented are stateless and immutable. To accommodate the needs of interacting with a changing world, a few state-bearing objects have been introduced into PILS. They have been crafted so as to minimize the risk of creating memory and resource leaks by circular structures.
Let channel be a node constant of the form
[channel: key]
where key can be any constant. The operation
channel listen (listener)
creates an opaque plug, which for its lifetime plugs listener into channel. The only way to unplug it is to trash the plug by losing all references to it.
When a call is made to channel, it gets forwarded to its listeners in turn, the youngest listeners first, until one of them responds, as if the listeners had been aggregated with the +++ operator, except that the :who binder will bind to the plug.
Like all constants, channels are unique. If an expression like (channel: filename) is performed several times for the same file name, the channel will be reused. The PILS programming system uses such channels to keep file windows and module editors unique.
Channels are thread safe.
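The plug mechanism resembles a registry of weak references. The following Python sketch is an analogy only: the Channel and Plug classes, the of method and the MISS marker are hypothetical, the sketch is not thread safe, and it relies on the Python runtime reclaiming a plug once the last reference to it is gone.

```python
import weakref

MISS = object()  # hypothetical marker: "this listener did not respond"


class Plug:
    """Opaque handle; the listener stays plugged in while the plug is alive."""

    def __init__(self, listener):
        self.listener = listener


class Channel:
    _channels = {}  # one channel per key, mirroring "channels are unique"

    def __init__(self, key):
        self.key = key
        self._plugs = []  # weak references to plugs, oldest first

    @classmethod
    def of(cls, key):
        """Return the single channel for this key, creating it on first use."""
        if key not in cls._channels:
            cls._channels[key] = cls(key)
        return cls._channels[key]

    def listen(self, listener):
        """Plug a listener in; the caller must keep the returned plug alive."""
        plug = Plug(listener)
        self._plugs.append(weakref.ref(plug))
        return plug

    def call(self, *args):
        """Forward the call, youngest listener first, until one responds."""
        self._plugs = [ref for ref in self._plugs if ref() is not None]
        for ref in reversed(self._plugs):
            plug = ref()
            if plug is None:
                continue
            result = plug.listener(*args)
            if result is not MISS:
                return result
        return MISS
```

Dropping the last reference to a plug is the only way to unplug its listener in this model, just as described for PILS plugs.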
When PILS is used with a windowing framework, the life span of windows is generally controlled by the user or the framework, not by PILS. Such externally controlled objects are said to be aliens. PILS creates special wrappers for them and treats them as constants, with methods and event handling defined by the framework bindings.
Aliens can have PILS objects attached to them in possibly circular ways.
PILS uses the event handling mechanism of the framework to track the destruction of aliens. When an alien is destroyed, PILS blinds its wrapper and releases any associated data.
To associate data with an alien and retrieve it:
alien mind . key := value;
...
alien mind . key
Values can be overwritten but keys cannot be deleted. There is no way to get at the values without knowing their keys.
Nodes of form [strap: key] are similar to channels, but can only have a single listener, and only when strapped to one or more aliens. Once these aliens perish, the strap loses its mind.
Straps are used by the programming system for program straps which are strapped to all windows of a particular program. When all windows of a program are destroyed, the program strap loses its mind and resources are freed.
During startup of programs, their straps are temporarily strapped to the universal key.
PILS standard file and folder objects allow simple operations on files and file systems.
A folder is the same as a directory.
Reading and writing of files is only supported on whole files. File and folder objects are really name wrappers with allowance to use the file system; PILS has no concept of open files.
File and folder names have full paths, using / as directory separator on all systems, including MS Windows. The MS Windows style separator \ is not used in PILS filenames.
Folder names always end with / .
The file and folder functions are methods of the universal key and do not work without it. By hiding the universal key, sandboxes can be constructed, allowing foreign scripts to execute with restricted file system access.
file (filename)
datafile (filename)
creates a filename wrapper object – file is the built-in function, datafile is a wrapper for use with data dependent modules.
To retrieve the filename:
file name
To read and write its contents in one go:
file text
file text := text
Reading and writing is done with raw bytes – encoding conversions are handled as separate operations:
file text bytes utf-8
reads an ANSI encoded file and converts it to a utf-8 encoded string.
file text := text utf-8 bytes
writes a utf-8 encoded string to an ANSI encoded file.
To manipulate files:
file (filename) copy (newname)
file (filename) move (newname)
file (filename) delete
These do the obvious thing – newname is a string; the result of copy and move is a filename wrapper for newname.
To test the existence of a file:
file ok
misses if the file does not exist, else file is returned.
file writable returns 1 if writing is possible and permitted, 0 if not.
file readable same for reading
file count returns the file size in bytes
file timestamp reads the last-modified timestamp of the file
file timestamp (time) sets the last-modified timestamp
There are no methods for extracting parts of the name; this is easily accomplished with the standard PILS string operators:
fn <-#-* "/" returns the folder name
fn <+#-* "/" returns the file name without folder
fn <+#-* "/" <-#=* "." returns the file name without folder and type
fn <+#-* "/" <+#=* "." --# 1 returns the file type
fn <-# (file name <+#-* "/" <=* ".") returns the file name without type
folder (folder-name)
creates a folder-name wrapper object. folder-name must end with a "/" character. To retrieve the name:
folder name
To search the folder for files and subfolders:
folder files returns a list of file objects
folder folders returns a list of folder objects
Search patterns are not supported – to get all files of type .pils in a folder:
folder files each {file|:if file name <$* ".pils"; :ok}
To travel a folder tree and list all files:
folder call {folder|:who :ok folder files & (folder folders every (who) splice)}
To create and delete empty folders:
folder create
folder delete
To facilitate filtering of files and folders, file and folder can be used as operators in patterns.
file file (fn) matches a file whilst matching its full name to fn
folder folder (fn) matches a folder whilst matching its full name to fn
This example will list all .pils files inside your document folder with their timestamps, excluding backup folders:
folder (platform path documents) call
{ folder folder (?) | :who :ok folder files & (folder folders each (who) splice) }
{ ? folder (? <$* "/backup/") | ? }
each { file file (filename <$* ".pils") | :ok filename, file timestamp }
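For readers more familiar with mainstream languages, the same traversal can be approximated in Python. This is a loose counterpart, not a translation of the PILS code: the function name list_pils_files is hypothetical, and a backup folder is recognised simply by being named backup.

```python
import os


def list_pils_files(root):
    """List (path, last-modified time) for .pils files below root.

    Folders named "backup" are skipped together with their contents.
    Paths use / as separator, as PILS filenames do.
    """
    found = []
    for folder, subfolders, files in os.walk(root):
        # Prune backup folders in place so os.walk does not descend into them.
        subfolders[:] = [name for name in subfolders if name != "backup"]
        for name in sorted(files):
            if name.endswith(".pils"):
                path = os.path.join(folder, name).replace(os.sep, "/")
                found.append((path, os.path.getmtime(path)))
    return found
```

The pruning step plays the role of the rule that filters out folders ending in /backup/, and the final filter corresponds to matching the filename against ".pils".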
To read a zip file:
file (zipfilename) zip
returns a list of lists (path, data, timestamp) for all archive entries in the zipfile.
To create a zipfile:
file (zipfilename) zip := entries
Note: zipfile creation is currently not supported in jucePILS.
As with filenames, directories are separated by / , though the zip format uses \ internally.
There is presently no support for modifying zip files or reading single entries, and no other archive formats are supported.
Note: OpenOffice files use the zip format. The PILS documentation files are authored using OpenOffice writer, and converted to HTML by a PILS script.
Worker threads allow the PILS user interface to stay responsive while lengthy computations are going on, and allow programs to benefit from multiprocessor systems.
PILS threads and knots are designed for this use and do not offer the fine-grained synchronization mechanisms required for process control. In return, the programmer does not need to worry about deadlocks or calling objects from the wrong threads.
The lib/pils/english/compute/compute.pils library offers a simple wrapper for using a single worker thread for parsing of files or similar processing.
:thread expression
This construct is only valid in the main thread. A worker thread is created and started, and [] is returned. The worker thread evaluates expression, discards the result and terminates.
System calls and user interface manipulations cannot be performed directly by worker threads. However, a worker thread can temporarily take over the main thread by calling a knot.
The only means of synchronization available for PILS threads is the knot call, that is, calling methods of an object wrapped in a knot:
(knot: object) method
The knot wrapper effectively makes object thread safe, by forwarding calls from worker threads to the main thread, for processing in its idle time, while the worker thread is blocked.
There is no support for parking a thread to awake it later.
When the call succeeds, fails or falls through, the worker thread is restarted in a state that reflects this. The thread switch is generally transparent, except that tail flattening and piping is suppressed and responders, list builders and exits do not work across thread borders.
Generally, worker threads should do lengthy computation on their own most of the time, occasionally using knots for system access, user interface updates and information exchange. If a worker thread stays in a knotted call, the PILS system will be blocked.
When used by the main thread, knotted calls pass through with no synchronization. They are still thread safe because only the main thread performs knotted calls.
The PILS module that displays error messages is knotted, so when a worker thread throws an error, the bug pinner window will be shown by the main thread.
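The knot idea, forwarding a call to the main thread while the worker waits, can be sketched with Python threads and queues. This is a toy model under stated assumptions, not how PILS implements knots: the MainThread class and its methods are hypothetical, and the polling loop merely stands in for the idle-time processing of the main thread.

```python
import queue
import threading


class MainThread:
    """Toy stand-in for the main thread and its idle-time processing."""

    def __init__(self):
        self._owner = threading.current_thread()
        self._requests = queue.Queue()

    def knotted(self, function, *args):
        """Run function in the owning thread and return its result."""
        if threading.current_thread() is self._owner:
            return function(*args)  # main thread: passes straight through
        reply = queue.Queue(maxsize=1)
        self._requests.put((function, args, reply))
        ok, value = reply.get()  # the worker thread blocks here
        if ok:
            return value
        raise value  # failure is handed back to the worker

    def run_until_done(self, workers):
        """Serve knotted calls until every worker thread has terminated."""
        while any(worker.is_alive() for worker in workers):
            try:
                function, args, reply = self._requests.get(timeout=0.01)
            except queue.Empty:
                continue
            try:
                reply.put((True, function(*args)))
            except Exception as error:
                reply.put((False, error))
```

A worker would do its lengthy computation on its own and call knotted only for the short step that must happen in the main thread, such as updating a shared result list.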
During construction of user interfaces, some operations need to be queued for later execution, at a time when the window being constructed has been sized and shown.
:later expression
expression is queued with the current context, and [] is returned immediately. When the main thread is idle, expression is evaluated in the supplied context, and then trashed.
Latecomers can be created from the main thread as well as from worker threads; their execution is always in the main thread.
A PILS program is a collection of modules, stored in one or more library files. All modules and libraries involved in running a program are gathered in a program strap.
Program straps are managed by the library pils/english/system/system.pils which is located by the executable at startup, searching the directory of the executable and its parent directories. When found, the module [pils system boot] is parsed and called.
As these modules have to work before the waxball is initialized, they rely on a somewhat ruder mechanism for referencing each other. You should not edit them unless you know what you are doing – if you break them, the error reporting of PILS will break down too and you will have to resort to primitive and tedious methods for sorting out what went wrong.
The basic unit of the PILS programming system is a module. Generally, a module is a named PILS expression, very often one that results in a bound ruleset of exported rules.
{ ... | } ! exported rules
...
===
private-definitions
In the library file, each module is stored with a header containing the module name and some attributes, of which the .timestamp attribute keeps track of when the module was last edited, and the .language attribute indicates a PILS language object to be used for parsing the module, as described below.
Modules are similar to singleton objects of object-oriented programming, on a per program basis: when several documents are open, they will use different program straps and each of these will have separate instances of the modules. Furthermore, when a module is being edited while in use, multiple instances can sometimes exist within the same program.
A module name is a list of PILS names. The programming system presents the modules as a tree, merging common beginnings.
A module has implicit access to its parents, allowing a module [game board chess] to use the exported rules of [game board] and [game,] .
Modules cannot implicitly access their own exported methods. If a rule needs to access other rules of the same ruleset or aggregate, the :who or :self binder should be used.
Modules can refer to each other by absolute name:
@ [game board checkers] always refers to [game board checkers]
Or by relative names. In a module named [game board] :
@ chess
refers to [game board chess] , [game chess] or [chess,]
@ . [chess sizer]
refers to [game board chess sizer] , [game chess sizer] or [chess sizer]
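The lookup of relative names can be illustrated in Python. This is a sketch under assumptions, not the implementation of the @ helper: module names are modelled as tuples, the function names are hypothetical, and trying the most specific candidate first simply follows the order in which the candidates are listed above.

```python
def candidates(current, relative):
    """Names a relative reference may stand for, most specific first."""
    current = tuple(current)
    relative = tuple(relative)
    return [current[:depth] + relative
            for depth in range(len(current), -1, -1)]


def resolve(current, relative, modules):
    """Return the first candidate that exists in modules, or None."""
    for name in candidates(current, relative):
        if name in modules:
            return name
    return None
```

From a module named [game board], a reference to chess yields the candidates [game board chess], [game chess] and [chess,], in that order.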
Absolute module references are implemented by a function rule {@: module-name|...} whereas relative module references are implemented by a helper object named @ .
Both absolute and relative module references can use computed module names.
For the common case when a module exports a method of the same name, the helper object @@ implements a shorthand:
@@ board (...) is the same as @ board board (...)
@@ board is the same as @ board board
Functionality that is commonly used throughout an application can be implemented by rules in a module with === as its last name. The rules can be used directly by all other modules in that branch of the module tree.
The system library system.pils contains a module [===,] defining functionality available to all PILS modules.
Module references are basically dynamic: if a module is edited and saved while in use, further references to the module will retrieve an instance of the updated module.
However, module references that were executed before the change was saved still hold the old version.
This distinction is important when experimenting with changes in a running application: the features accessed via module references will change immediately as the modules are edited, whereas features implemented by objects that were created at application startup will not change unless the application is restarted.
The latter procedure – restarting the application to test a change – is still a common way of doing things, especially with compiled languages, but rarely necessary when working with PILS.
When a module is referenced for the first time by a running program, the programming system will instantiate it by parsing and evaluating the expression, keeping the instance in an instance mind for further references.
Instantiation happens on program basis. When several PILS programs are running, each will have its own program strap with separate module instantiations.
Whenever a module is edited and saved, it is removed from the instance mind of all affected program straps. Other modules that referenced the module during their instantiation are removed too. The old instances are still usable but future references will refer to new instances.
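The caching and flushing behaviour can be modelled in a few lines of Python. This is an illustrative sketch, not the PILS programming system: the InstanceMind class and its methods are hypothetical, and a module is modelled as a factory function that receives a ref function for referencing other modules.

```python
class InstanceMind:
    """Toy model of lazy module instantiation with dependency flushing."""

    def __init__(self, sources):
        self.sources = dict(sources)  # module name -> factory(ref)
        self.instances = {}           # module name -> current instance
        self.dependents = {}          # module name -> modules that used it
        self._instantiating = []      # stack of modules being instantiated

    def ref(self, name):
        """Return the instance of a module, instantiating it on first use."""
        if self._instantiating:
            # Record who referenced this module during instantiation.
            user = self._instantiating[-1]
            self.dependents.setdefault(name, set()).add(user)
        if name not in self.instances:
            self._instantiating.append(name)
            try:
                self.instances[name] = self.sources[name](self.ref)
            finally:
                self._instantiating.pop()
        return self.instances[name]

    def save(self, name, factory):
        """Store an edited module and flush it and everything that used it."""
        self.sources[name] = factory
        self._flush(name)

    def _flush(self, name):
        self.instances.pop(name, None)
        for user in self.dependents.pop(name, set()):
            self._flush(user)  # direct and indirect dependents alike
```

An instance obtained before a save keeps working with the old data, while the next reference instantiates the module again from the edited source.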
If, during instantiation, a module uses data from text files, XML files, spreadsheets etc., changes in the files may necessitate reinstantiation of the module so that the changed data are reflected by the PILS program. This is supported by the functions
datafile (filename)
which creates a file object and, if called during module instantiation, registers the module's dependency on the timestamped file, and
datafile-check
which checks the timestamps of all datafiles registered by the program, flushing from the instance mind of the program strap all module instances that depend on files with a changed timestamp.
The name this can be used to query certain properties of the current module and program.
this program filename
this program path
this program language
get the filename, path and language of the program file for which the module is instantiated.
this module name
this module text
this module expression
this module language
this module library
get attributes of the module.
The bug pinner depends on this, and the name this should never be used for other purposes.
Each module has an associated language attribute which stores a language name, which can be a list of names.
The language of a module is found by prepending [pils language] to the language name and looking this up as a relative module reference, like the expression:
@ . ([pils language] & language-name)
To get started, the language name [system] always refers to the language object used for booting PILS.
In most cases, the language module is a PILS language object, possibly with national translations of the PILS vocabulary, possibly with application or library specific namespace prefixes such as the j: used by the juce library bindings, or namespace prefixes suitable for dealing with specific XML formats such as OpenOffice documents.
Alternatively, custom parser objects can be defined in language modules. A simple example of this is the text language, with a trivial parser that simply returns the text, which is useful for saving text data in PILS libraries.
The PILS editor exposes language modules as menu options.
A library is a file that contains modules and possibly references to other libraries. When a library is loaded, its referenced libraries will be loaded too if this has not already been done.
A PILS program is simply a library which is used as an application of its own.
When the user opens a program, it is loaded with the libraries it depends on. The libraries are then merged into a program library which includes all modules present in all the libraries.
The life time of a program library is managed by a strap which is strapped to all top level windows created from that program. The program is released when the last of its windows is closed.
When a library is used as part of a program library, its module references pertain to the whole program library.
Four libraries – the system library, the editor library, the platform library and a user configuration library – must be available and loaded for the PILS system to work. The system, editor and platform libraries are located by relative paths from the executable, the user configuration library is chosen at startup, based on your computer's language settings:
If your computer is set to a language other than English, configuration and language files for that language will be linked in if they exist, and the say function – which should be used for all messages and labels in the user interfaces of PILS programs – will then use that language.
Currently, only English and Danish are supported.
On startup, the PILS executable will try to detect a previous instance and pass the command line to it. If this fails for one reason or another, a new instance is started.
At startup, PILS creates an object that must be able to process command lines. This object is defined by the module [pils system start] in the waxball organizer library.
This design is slightly complicated by the fact that waxballs come and go as programs are opened and closed, so a channel [channel: pils: command-line] is used to pass the requests to a live waxball with a suitable instance of the [pils system start] module.
The MS Windows installer defines operations open and edit for the PILS file type in the registry; the command lines are distinguished by an -edit option in the edit command line. This is recognized by a rule in [pils system start] and has the effect of directly opening the editor, regardless of whether another action is specified in the [pils run start] module.
PILS programs are created, edited and tested using the PILS editor – a simple tabbed text editor with facilities for searching and navigating PILS libraries and executing test functions.
A PILS editor window always deals with a specific library and will only edit modules within that library. To edit multiple libraries, multiple windows are used.
The editor itself is located in a PILS library, lib/english/pils/editor , and runs in the waxball of the library being edited, allowing libraries to modify the behavior of the editor by redefining its modules.
When a module is saved, all modules that referred to it directly or indirectly during their instantiation are flushed from the instance minds.
Note: redefining of editor modules should be done with caution. If a broken editor module prevents the editor from working, you cannot use that same editor to remove or mend the module. You will then need to open the library file using another editor, or restore an earlier version of the PILS editor.
To create a PILS program, simply create an empty file with the extension .pils , and open it with the PILS executable. You will be asked to choose the language you want to use for programming.
The file
lib/pils/language/pils/new.pils
serves as a simple template with the appropriate settings and will be copied over your empty file and opened. (Currently, only Danish and English are supported).
The language you choose will be used to store and show module names and will be the initial language for new modules in your program. You can change the file language later.
Note: The user interface language is controlled by your computer's language setting, not by the program language. If you write your programs using English terms and execute them on a system with Danish language settings, or the other way around, PILS will attempt to translate messages and labels.
Double-clicking a PILS file will call the open function in the module [pils run command open] . If this module has not been redefined, an editor window will be opened.
Similarly, the Edit command will call the edit function in the module [pils run command edit] , which will also open an editor window.
When you build PILS applications, you will usually start the application by starting the editor and executing a test function. When testing and polishing becomes more important, you can create a module [pils run command open] and define a rule:
{ .open | ... }
The editor window will still be available by using the Edit command.
In the command line, the option -edit (always in English, with a prepended hyphen) is used to distinguish the Edit command from the Open command which is the default.
When you open a program with the editor, the last changed module will be shown in a tab.
The editor is simple – the usual shortcuts apply:
Ctrl-C copy
Ctrl-X cut
Ctrl-V paste
Ctrl-Z undo
Ctrl-Y redo
Ctrl-S save
All saving goes straight to the disk file. You cannot test a module without saving it.
Attempts to save modules with syntax errors will be rejected, and you will be directed to the spot where the parser failed.
Tip: If you need to save a module with syntax errors, set the language to text as described below.
Ctrl-M shows the modules of your program in a tree view.
To open a module, activate it by double clicking or hitting the Enter key. Use the Esc key to close the tree pane.
To create a new module:
Ctrl-N creates a submodule of the current module.
To create a root module, create a submodule of an existing root module and move it down, using the Module menu.
Module->move->down moves the module and its submodules towards the root
Module->move->up->sibling moves the module and its submodules up, onto a sibling
Ctrl-K copies the current module and submodules to another name.
Ctrl-R renames the current module and submodules.
Module->delete deletes the current module if it is empty and has no submodules.
Module->move->library->library moves to another library, with submodules.
Tip: Deleting modules is – by design – a bit troublesome, to save you from accidentally deleting code by hitting the wrong key. If you need to delete many modules, create a trash library, put them there and delete the trash library file.
To change the language of a module, make your pick from the Language menu. The module must be saved after the language change.
Except for system which always refers to the language object used to boot PILS and to read program headers, the entries in the language menu refer to modules with the words pils language in their name. To add a language to the menu, create a module named
pils language name
and write a language object or an expression that creates one, or an object of your own making with at least a read method, like the languages text and textlist defined by modules in the editor library – text simply delivers the module text, textlist splits it into lines.
Language modules can be specific to a branch in the module tree:
job myapp pils language mylanguage
defines a language which is only available to modules in the job myapp branch of the module tree.
If you set the language of a module to a branch-specific language, you should not move the module outside that branch.
The juce binding library – which wraps the PILS juce bindings – uses a branch-specific language which associates the j: prefix with the namespace used internally by PILS for juce methods and classes.
You can change the program language if you do not want to stay with the language you chose when you created the file. First, set your program to use an appropriate language library:
lib/pils/language/pils/language.pils
Then, use the menu File->settings->language->language to set the program language.
This triggers a rebuild of the editor window, to ensure that module names are shown correctly.
The language change does not affect the language setting of modules already created.
To use PILS libraries, select
Libraries->use
and use the Insert and Delete keys to add or remove libraries from the list. Insert will open a standard file picker dialog, and encode the selected file name by one of the following prefixes if applicable:
<lib>/ – the lib/pils/ folder
<doc>/ – the user's document folder
<.>/ – the folder of the importing library
<..>/ – parent of the folder of the importing library
This helps keep the imports valid when PILS libraries are moved.
When you start using multiple libraries, you will occasionally need to search through all the libraries involved in your project to find something whose location you have forgotten. To search across libraries, press
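As an illustration of the prefix encoding, here is a small Python sketch. It is not taken from the PILS sources: the function name, its parameters, and the rule that the prefixes are tried in the order listed above are all assumptions made for the example.

```python
from pathlib import PurePosixPath

def encode_library_path(selected, lib_dir, doc_dir, importer_dir):
    """Replace a known folder at the start of `selected` with its prefix.

    Assumption: prefixes are tried in the order <lib>, <doc>, <.>, <..>;
    the real editor may use a different priority.
    """
    selected = PurePosixPath(selected)
    importer_dir = PurePosixPath(importer_dir)
    candidates = [
        ("<lib>", PurePosixPath(lib_dir)),   # the lib/pils/ folder
        ("<doc>", PurePosixPath(doc_dir)),   # the user's document folder
        ("<.>", importer_dir),               # folder of the importing library
        ("<..>", importer_dir.parent),       # parent of that folder
    ]
    for prefix, base in candidates:
        try:
            relative = selected.relative_to(base)
        except ValueError:
            continue  # `selected` is not inside this folder
        return f"{prefix}/{relative.as_posix()}"
    return selected.as_posix()  # no prefix applies, keep the full path
```

Because the stored name is relative to a well-known folder, moving the whole folder leaves the import intact; a path outside all four folders is stored unchanged.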
Ctrl-D
This will open a Detective pane, which combines a high and a low search stripe with a module tree. Besides the usual Whole words only and Case sensitive checkboxes, the Detective supports a Structural mode which will parse your search term as a PILS expression and use it as part of a PILS rule, allowing you to search for particular constructs without knowing the detailed contents.
The module tree will update immediately as you type your terms, marking all modules with hits. Their parent modules will also be marked, down to the root, to help you find the hits.
When you select a module with hits, a hit list will be shown, with line numbers of the hits. As you select the hits, they will be selected automatically in the editing windows of their respective libraries.
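The marking of parent modules can be pictured as follows. This is a Python sketch of the idea only; it assumes module names are sequences of words, with each parent being a shorter prefix of its child's name, and the function name is made up for the example.

```python
def marked_modules(hit_modules):
    """Return the modules with hits plus all their parents, down to the root.

    Each module name is a sequence of words; a parent is assumed to be any
    non-empty prefix of the name.
    """
    marked = set()
    for name in hit_modules:
        for length in range(1, len(name) + 1):
            marked.add(tuple(name[:length]))
    return marked
```

A hit in a module named job myapp gui would thus also mark job myapp and job.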
Libraries have the file type .pils and are utf-8 text files with CR+LF (DOS/Windows style) line breaks. If needed, they can be edited with MS Windows Notepad or similar. The program strap manager ignores the extra bytes (BOM) prepended to utf-8 files by Windows applications.
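If you process library files with your own tools, the same conventions are easy to follow. The Python sketch below is an assumption-laden illustration, not part of PILS: the function names are invented, and writing the file without a BOM is a choice made for the example. It uses Python's standard utf-8-sig codec, which drops a leading BOM if one is present.

```python
def decode_library_bytes(data: bytes) -> str:
    """Decode library bytes as UTF-8, ignoring a leading BOM if there is one."""
    return data.decode("utf-8-sig")

def encode_library_text(text: str) -> bytes:
    """Encode text as UTF-8 with CR+LF line breaks (no BOM is written here)."""
    # Normalise every kind of line break to LF first, then expand to CR+LF.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalised.replace("\n", "\r\n").encode("utf-8")
```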
A PILS program consists of an optional program-header and a sequence of module entries stored in a utf-8 encoded text file, separated by markers:
library-header><:module-header><;module-body><:module-header><;module-body ...
with any other occurrences of >< encoded as ><. (adding a dot) to distinguish them from markers. Their order has no significance; they are sorted by binary string comparisons of the headers, which is fast and convenient when working with non-PILS-aware text editors or versioning systems.
Module entries consist of a module header and a module body. A module header is a serialised constant node [module: . ...] with a mandatory principal leg holding the name, and some optional legs such as language and timestamp . The module name is a list of 1 or more PILS names.
The program-header, if present, holds information common to all modules in the program. Its language leg, if present, determines what language is used for the module headers.
When a library is read in, the modules are stored in a node, using the module names as leg names; the leg values are module headers with the module text in the principal leg. The library header is held in the tail leg. This allows modules to be found by fast binary search.
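The marker layout is simple enough to handle outside PILS. The following Python sketch splits a library text into its header and module entries and writes it back. It is an illustration under stated assumptions, not the PILS reader: the function names are invented, headers are treated as opaque strings rather than parsed nodes, no line breaks are assumed between entries, and sorting is done on the escaped header bytes.

```python
MODULE_MARK = "><:"  # precedes each module header
BODY_MARK = "><;"    # precedes each module body
ESCAPED = "><."      # how a literal "><" is stored inside headers and bodies

def escape(text):
    return text.replace("><", ESCAPED)

def unescape(text):
    return text.replace(ESCAPED, "><")

def parse_library(raw):
    """Split library text into (header, [(module_header, module_body), ...])."""
    head, *entries = raw.split(MODULE_MARK)
    modules = []
    for entry in entries:
        header, _, body = entry.partition(BODY_MARK)
        modules.append((unescape(header), unescape(body)))
    return unescape(head), modules

def dump_library(head, modules):
    """Serialise a library with entries sorted by binary header comparison."""
    escaped = [(escape(header), escape(body)) for header, body in modules]
    escaped.sort(key=lambda entry: entry[0].encode("utf-8"))
    parts = [escape(head)]
    for header, body in escaped:
        parts.append(MODULE_MARK + header + BODY_MARK + body)
    return "".join(parts)
```

Since every literal >< in the content is stored with an added dot, a plain split on the markers cannot be fooled by module text, and parsing followed by dumping gives back the same entries in sorted order.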
Rules of the form
{ .name | :ok test-expression }
are available through a module's Test menu when the module is in a saved state. When you activate the menu, the rule is called and the result – if any – is displayed in a Result window. If you do not want to see the result, simply omit the responder.
{ .name | test-expression }
To make life easier, you can name your test rule like this:
{ .test | :ok test-expression }
The quicktest shortcut
Ctrl-T
will save your module if not already saved, and then call a .test rule if it exists.
Test modules serve to define test rules that can be used from all modules in a branch.
Say you work with a module named [a b c] . When you save the module or press Ctrl-T, these module names will be searched for test functions in listed order:
[a b c]
[a b c test]
[a b test]
[a test]
[test,]
For each of these module names, the whole waxball is searched. Every test function is bound to the first module in which it is found.
If you define your test functions in [test,] , you can use them from everywhere. However, you should consider limiting their scope, or your test menus may become cluttered.
When testing a library that is to be used with several programs, you may want to include a program for testing, without including it in other programs that use the library. Such a program is called a test project.
Open the library's Use libraries panel, add the tester, and mark it with Ctrl-T . The tester will now be included only when the library is started directly, not when it is loaded by another library.