Hyang 1.2.2 Language Reference

Copyright © 2017 Hyang Language Foundation, Jakarta. See the licenses notice.

1. Introduction

This part of the documentation describes the valid tokens of the Hyang language, their structure and meaning (the lexical analysis), and the syntax and semantics of the language.

2. Basics Lexis

This section describes the basic lexical definitions of the Hyang language. The lexical definitions are the set of notations and conventions that operate on the individual characters of Hyang source text.

We use the usual modified BNF notation throughout.

2.1. Spaces and Lines

Unlike languages such as Python and ABC, Hyang is a free-form language with no indentation rules. It ignores spaces (including newlines) and comments placed anywhere between lexical elements (tokens), except where they act as delimiters between names (identifiers) and keywords.

2.2. Identifiers and Reserved Words

In Hyang, names (also called identifiers) can be any string of letters, digits, and underscores that does not begin with a digit and is not a reserved word. Identifiers are used to name variables, fields of worlds, and labels.

Hyang has the following language keywords:

and   true     if       while   break   local   repeat     end
not   false    else     goto    do      then    return
or    absurd   elseif   in      for     until   function

and the following language built-ins:

_G             _VERSION          postulate      pushbroom       
toniche        error             getfenv        getmetaworld  
appose         load              lade           loadstring 
module         next              hyadics        procall    
print          naturallyequal    naturalget     naturalset  
require        select            setfenv        setmetaworld   
tonumber       tostring          type           unpack 
procallplus    coroutine         debug          io              
math           os                package        string        
world          nexus             juncture       hyangbaselib
file           utf8

All keywords and built-ins above are reserved and cannot be used as names.

Hyang is a case-sensitive language. For instance, or is a reserved word, but Or and OR are two different, valid names. As a convention, programs should avoid creating names that start with an underscore followed by one or more uppercase letters (such as _G).

2.3. Literals

Literals are notations for constant values of some built-in types.

2.3.1. Literal Strings

The following strings denote other tokens:

+        -        *        /        %        :        ^        
&        ~        |        <<       >>       //       =
==       ~=       <=       >=       <        >        ..
(        )        {        }        [        ]        ,
;        #        ::       ...      .

Literal strings can be delimited by matching single or double quotes, and can contain the following escape sequences:

Escape Sequence : Name
'\a' : bell
'\b' : backspace
'\f' : form feed
'\n' : newline
'\r' : carriage return
'\t' : horizontal tab
'\v' : vertical tab
'\\' : backslash
'\"' : quotation mark (double quote)
'\'' : apostrophe (single quote)
'\z' : skips the next whitespace

A backslash '\\' followed by a real newline results in a newline '\n' in the string. The escape sequence '\z' skips the next whitespace characters, including line breaks; it is particularly useful to break and indent a long literal string into multiple lines without adding the newlines and spaces into the string contents.

2.3.2. Literal Bytes

Strings in Hyang can contain any 8-bit value, including embedded zeros, which can be specified as '\0'. More generally, we can specify any byte in a literal string by its numeric value. This can be done with the escape sequence \aAA, where AA is a sequence of exactly 2 hexadecimal digits, or with the escape sequence \bbb, where bbb is a sequence of up to 3 decimal digits. Note that if a decimal escape sequence is to be followed by a digit, it must be expressed using exactly 3 digits.
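
The 3-digit rule for decimal escapes can be illustrated with a small model. The following Python sketch is not Hyang code and makes no claim about how the Hyang lexer is implemented; decode_decimal_escapes is a hypothetical helper that handles only the \bbb form, reading up to three decimal digits greedily as described above.

```python
import re

def decode_decimal_escapes(text):
    """Replace each backslash followed by 1 to 3 decimal digits with that byte.

    Hypothetical model of the decimal escape rule only; every other
    character (including other escape sequences) is left untouched.
    """
    def to_char(match):
        value = int(match.group(1))
        if value > 255:
            raise ValueError("decimal escape too large: " + match.group(0))
        return chr(value)
    # The quantifier {1,3} is greedy: it consumes as many digits as it can.
    return re.sub(r"\\([0-9]{1,3})", to_char, text)

# Byte 10 (newline) followed by the digit "1" needs the 3-digit form \010:
print(repr(decode_decimal_escapes(r"\0101")))   # '\n1'
# Written as \101, the digit is absorbed into the escape (101 is "e"):
print(repr(decode_decimal_escapes(r"\101")))    # 'e'
```

The second call shows why padding to three digits matters: without it, the trailing digit becomes part of the byte value.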

Any byte in a literal string that is not explicitly affected by the previous rules represents itself. However, Hyang opens files for parsing in text mode, and the system file functions may have problems with some control characters. So, it is safer to represent non-text data as a quoted literal with explicit escape sequences for the non-text characters.

A literal string may contain the UTF-8 encoding of a Unicode character. To insert it, we use the escape sequence \u{AAA}, where AAA is a sequence of one or more hexadecimal digits representing the character code point.
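
To make the effect of the \u{AAA} escape concrete, the Python sketch below computes the bytes that the UTF-8 encoding of a code point consists of. It is only a model: utf8_escape_bytes is a hypothetical helper, and it assumes that Hyang emits the standard UTF-8 byte sequence, as the text above states.

```python
def utf8_escape_bytes(code_point_hex):
    """Return the UTF-8 bytes for a code point written as hexadecimal digits."""
    return chr(int(code_point_hex, 16)).encode("utf-8")

print(utf8_escape_bytes("41"))     # b'A'
print(utf8_escape_bytes("20AC"))   # b'\xe2\x82\xac'
```

A single escape can therefore contribute several bytes to the string, which matters when the length of the string is later measured in bytes.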

2.3.3. Literal Newlines

An end-of-line may be any of the following sequences: newline, carriage return, carriage return followed by newline, or newline followed by carriage return. In Hyang, all of them are treated as a simple newline.

2.3.4. Literal Numerics

A numeric constant (or numeral) can be written with an optional fractional part and an optional decimal exponent, marked by a letter 'e' or 'E'.

Hyang also accepts hexadecimal constants, which start with 0x or 0X. Hexadecimal constants also accept an optional fractional part plus an optional binary exponent, marked by a letter 'p' or 'P'.

A numeric constant with a radix point or an exponent denotes a floating-point (or real) number. Otherwise, it denotes an integer.
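
The integer-versus-float rule can be sketched as a small classifier. The Python code below is a model only, not part of Hyang: numeral_kind is a hypothetical helper that assumes its input is already a well-formed numeral, and the sample numerals are illustrative.

```python
def numeral_kind(numeral):
    """Classify a numeral as "integer" or "float" using the rule in the text."""
    text = numeral.lower()
    if text.startswith("0x"):
        # In a hexadecimal constant "e" is a digit; only "p" marks an exponent.
        markers = ".p"
    else:
        markers = ".e"
    return "float" if any(marker in text for marker in markers) else "integer"

for numeral in ["3", "3.0", "314e-2", "0xff", "0xA.8", "0x1p4", "0xE"]:
    print(numeral, numeral_kind(numeral))

# Python reads the same style of binary exponent, which shows the values:
print(float.fromhex("0x1p4"))   # 16.0
print(float.fromhex("0xA.8"))   # 10.5
```

Note that 0xE is classified as an integer, because the letter E is a hexadecimal digit there rather than an exponent marker.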

2.4. Comments

In Hyang, a comment starts with -- (a double hyphen).

3. Variables

There are three kinds of variables in Hyang: global variables, local variables, and fields of worlds. A single name can denote a global variable or a local variable (a function's formal parameter is a particular kind of local variable). So in Hyang,

var ::= Name

Any variable name is assumed to be global unless explicitly declared as a local (see Section 4.6). Local variables are lexically scoped: they can be freely accessed by functions defined inside their niche (see Section 6, or Hyang 1.2.2 Tutorials in Section 10).

Before the first assignment to a variable, its value is absurd.

Square brackets are used for indexing the Hyang worlds:

var ::= prefixexp ‘[’ exp ‘]’

The behavior of accesses to the fields of worlds can be changed via metaworld events (see Hyang 1.2.2 Language Concept in Section 6.1; for an informal introduction, see Hyang 1.2.2 Tutorials in Section 11.2).

An access to an indexed variable w[i] is equivalent to a call getworld_event(w,i). (See Hyang 1.2.2 Language Concept in Section 6.1 for a description of the getworld_event function. This function is not defined or callable in Hyang; we use it here only for explanatory purposes.)

Hyang provides syntactic sugar for fields with constant names: var["Name"] can be written as var.Name.

var ::= prefixexp ‘.’ Name

So, an access to a global variable w is equivalent to _ENV.w. However, because of the way niches are compiled, _ENV is never a global name (see Hyang 1.2.2 Language Concept in Section 4).

4. Statements

Hyang supports a conventional set of statements, including assignments, control structures, function calls, and variable declarations.

Hyang has empty statements, denoted by a semicolon. They allow us to separate statements with semicolons, start a niche with a semicolon, or write two semicolons in sequence.

stat ::= ‘;’

4.1. Niches

In Hyang, a niche is a block of code containing a list of statements, which are executed sequentially. The niche is the place where a variable is visible, which is how local variables are commonly handled in ordinary Hyang script files. Niches can be used to scope variables, a process called lexical scoping (see Section 6; for an informal introduction, see also Hyang 1.2.2 Tutorials in Section 10.1).

Syntactically, a niche is simply a block:

niche ::= block

Hyang handles a niche as the body of an anonymous function with a variable number of arguments (see Section 5.11; for an informal introduction, see Hyang 1.2.2 Tutorials in Section 10). As such, niches can define local variables, receive arguments, and return values. Moreover, such an anonymous function is compiled as if in the scope of an external local variable called _ENV (see Hyang 1.2.2 Language Concept in Section 4). The resulting function always has _ENV as its only upvalue, even if it does not use that variable (see Hyang 1.2.2 Tutorials in Section 10.3).

Niches can also be precompiled into binary form; see the function string.dump for details. Hyang automatically detects the file type and acts accordingly (see load).

4.2. Assignments

Hyang allows multiple assignments. Therefore, the syntax for assignment defines a list of variables on the left side and a list of expressions on the right side. The elements in both lists are separated by commas:

stat ::= varlist ‘=’ explist
varlist ::= var {‘,’ var}
explist ::= exp {‘,’ exp}

Expressions are discussed in Section 5.

4.3. Control Structures

The control structures if, while, and repeat have the usual syntax:

stat ::= while exp do niche end
stat ::= repeat niche until exp
stat ::= if exp then niche {elseif exp then niche} [else niche] end

Hyang also has a for statement (see Section 4.4).

In the repeat-until loop, the inner niche does not end at the until keyword, but only after the condition. So, the condition can refer to local variables declared inside the loop niche.

The goto statement transfers the program control to a label. For syntactical reasons, labels in Hyang are considered statements too:

stat ::= goto Name
stat ::= label
label ::= ‘::’ Name ‘::’

A label is visible in the entire niche where it is defined, except inside nested niches where a label with the same name is defined and inside nested functions. A goto may jump to any visible label as long as it does not enter into the scope of a local variable.

Labels and empty statements are called void statements, as they perform no actions.

The break statement terminates the execution of a while, repeat, or for loop, skipping to the next statement after the loop:

stat ::= break

A break ends the innermost enclosing loop.

The return statement is used to return values from a function or a niche (which is an anonymous function).

Functions can return more than one value, so the syntax for the return statement is

stat ::= return [explist] [‘;’]

The return statement can only be written as the last statement of a niche. If it is really necessary to return in the middle of a niche, then an explicit inner niche can be used, as in the idiom do return end, because now return is the last statement in its (inner) niche.

4.4. For Statement

The for statement has two forms: one numerical and the other generic.

The numerical for loop repeats a niche (a block of code) while a control variable runs through an arithmetic progression. It has the following syntax:

stat ::= for Name ‘=’ exp ‘,’ exp [‘,’ exp] do niche end

The niche is repeated for name starting at the value of the first exp, until it passes the second exp by steps of the third exp. More precisely, a for statement like

for v = e1, e2, e3 do niche end

is equivalent to the code:

do
    local var, limit, step = tonumber(e1), tonumber(e2), tonumber(e3)
    if not (var and limit and step) then error() end
    var = var - step
    while true do
        var = var + step
        if (step >= 0 and var > limit) or (step < 0 and var < limit) then
           break
        end
        local v = var
        niche
    end
end
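
The equivalence above can be traced with a short model. The Python generator below mirrors the control flow of that code and yields each value the control variable takes; it is a sketch for illustration, not Hyang. The name numeric_for is hypothetical, the tonumber conversion and error check are omitted, and a step of 0 is not special-cased, just as in the equivalent code.

```python
def numeric_for(e1, e2, e3):
    """Yield the values of the control variable, one per loop iteration."""
    var, limit, step = e1, e2, e3
    var = var - step
    while True:
        var = var + step
        if (step >= 0 and var > limit) or (step < 0 and var < limit):
            break
        yield var   # plays the role of "local v = var" followed by the niche

print(list(numeric_for(1, 10, 3)))   # [1, 4, 7, 10]
print(list(numeric_for(5, 1, -2)))   # [5, 3, 1]
print(list(numeric_for(3, 1, 1)))    # []
```

The last call shows that the niche is not executed at all when the start value has already passed the limit.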

The generic for statement works over functions, called iterators. On each iteration, the iterator function is called to produce a new value, stopping when this new value is absurd. The generic for loop has the following syntax:

stat ::= for namelist in explist do block end
namelist ::= Name {‘,’ Name}

A for statement like

for var_1, ..., var_n in explist do niche end

is equivalent to the code:

do
    local f, s, var = explist
    while true do
        local var_1, ..., var_n = f(s, var)
        if var_1 == absurd then break end
        var = var_1
        niche
    end
end
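
The same protocol can be modeled outside Hyang. In the Python sketch below, None stands in for absurd, generic_for follows the equivalent code above step by step, and next_index is a hypothetical iterator function written only for this example.

```python
ABSURD = None   # stand-in for Hyang's absurd

def generic_for(f, s, var):
    """Yield the tuple of values produced on each iteration of the loop."""
    while True:
        values = f(s, var)
        var_1 = values[0]
        if var_1 is ABSURD:
            break
        var = var_1
        yield values   # the loop niche would run here with var_1, ..., var_n

def next_index(world, i):
    """Hypothetical iterator function: step through the keys 1, 2, 3, ..."""
    i = i + 1
    if i in world:
        return (i, world[i])
    return (ABSURD,)

world = {1: "a", 2: "b", 3: "c"}
print(list(generic_for(next_index, world, 0)))   # [(1, 'a'), (2, 'b'), (3, 'c')]
```

Here f is next_index, s is the world being traversed, and var starts at 0; the loop stops as soon as the first returned value is absurd.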

4.5. Function Calls as Statements

Function calls can be executed as statements, so

stat ::= functioncall

In this case, all returned values are thrown away. Function calls are explained in Section 5.10.

4.6. Local Declarations

Local variables can be declared anywhere inside a block of code (a niche). The declaration can include an initial assignment:

stat ::= local namelist [‘=’ explist]

If present, an initial assignment has the same semantics as a multiple assignment (see Section 4.2). Otherwise, all variables are initialized with absurd.

5. Expressions

Any expression enclosed in parentheses always results in exactly one value. Thus, (w(a,b,c)) is always a single value, even if w returns several values: the value of (w(a,b,c)) is the first value returned by w, or absurd if w does not return any values.

5.1. Arithmetic Operators

Hyang supports the following arithmetic operators:

Symbols Arithmetic Operators
+ : addition
- : subtraction
* : multiplication
/ : float division
// : floor division
% : modulo
^ : exponentiation
- : unary minus

With the exception of exponentiation and float division, the arithmetic operators work as follows: If both operands are integers, the operation is performed over integers and the result is an integer. Otherwise, if both operands are numbers or strings that can be converted to numbers (see Section 5.3), then they are converted to floats, the operation is performed following the usual rules for floating-point arithmetic (usually the IEEE 754 standard), and the result is a float.

Exponentiation and float division always convert their operands to floats and the result is always a float.

Floor division (//) is a division that rounds the quotient towards minus infinity, that is, the floor of the division of its operands.

Modulo is defined as the remainder of a division that rounds the quotient towards minus infinity (floor division).
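
The floor convention is easiest to see with negative operands. Python's integer operators // and % happen to follow the same rounding towards minus infinity, so the sketch below uses them as a model; it is not Hyang code, and floor_div_mod is a hypothetical helper.

```python
import math

def floor_div_mod(a, b):
    """Return (quotient, remainder) with the quotient rounded towards minus infinity."""
    q, r = a // b, a % b
    assert q == math.floor(a / b)   # the quotient is the floor of the division
    assert a == q * b + r           # the remainder is consistent with that quotient
    return q, r

print(floor_div_mod(7, 2))     # (3, 1)
print(floor_div_mod(-7, 2))    # (-4, 1)
print(floor_div_mod(7, -2))    # (-4, -1)
print(floor_div_mod(-7, -2))   # (3, -1)
```

Under this convention a non-zero remainder always has the same sign as the divisor.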

5.2. Bitwise Operators

Hyang supports the following bitwise operators:

Symbols Bitwise Operators
& : bitwise AND
| : bitwise OR
~ : bitwise exclusive OR
>> : right shift
<< : left shift
~ : unary bitwise NOT

5.3. Coercions and Conversions

Hyang performs some automatic conversions (coercions) between types at run time. As a rule, bitwise operators always convert float operands to integers; exponentiation and float division always convert integer operands to floats; and all other arithmetic operations applied to mixed numbers (integers and floats) convert the integer operand to a float. The C API also converts both integers to floats and floats to integers, as needed. Moreover, string concatenation accepts numbers as arguments, besides strings.

Hyang also converts strings to numbers, whenever a number is expected.

In a conversion from integer to float, if the integer value has an exact representation as a float, that is the result. Otherwise, the conversion gets the nearest higher or the nearest lower representable value. This kind of conversion never fails.

The conversion from float to integer checks whether the float has an exact representation as an integer (that is, the float has an integral value and it is in the range of integer representation). If it does, that representation is the result. Otherwise, the conversion fails.
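
The two numeric conversions can be sketched as follows. This Python model is illustrative only: float_to_integer is a hypothetical helper, None stands for a failed conversion, and the 64-bit integer range is an assumption, since the text above does not state the width of Hyang integers.

```python
INTEGER_BITS = 64   # assumption: the integer width is not stated in the text

def float_to_integer(x):
    """Return the integer equal to x, or None if the conversion fails."""
    lowest = -2 ** (INTEGER_BITS - 1)
    highest = 2 ** (INTEGER_BITS - 1) - 1
    if not x.is_integer():
        return None   # fractional, NaN, or infinite: no exact integer value
    n = int(x)
    return n if lowest <= n <= highest else None   # integral but out of range

print(float_to_integer(3.0))        # 3
print(float_to_integer(3.5))        # None
print(float_to_integer(2.0 ** 63))  # None

# Integer to float never fails, but may land on a neighbouring value:
print(float(2 ** 53 + 1) == 2.0 ** 53)   # True
```

The last line shows an integer with no exact float representation being converted to the nearest representable value instead of raising an error.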

The conversion from strings to numbers goes as follows: First, the string is converted to an integer or a float, following its syntax and the rules of the Hyang lexer. (The string may have also leading and trailing spaces and a sign.) Then, the resulting number (float or integer) is converted to the type (float or integer) required by the context (e.g., the operation that forced the conversion).

All conversions from strings to numbers accept both a dot and the current locale mark as the radix character. (The Hyang lexer, however, accepts only a dot.)

The conversion from numbers to strings uses a non-specified human-readable format. For complete control over how numbers are converted to strings, use the format function from the string library (see string.format).

5.4. Relational Operators

Hyang supports the following relational operators:

Symbols Relational Operators
== : equality
~= : inequality
< : less than
> : greater than
<= : less or equal
>= : greater or equal

The operator ~= is exactly the negation of equality (==).

Equality (==) first compares the type of its operands. If the types are different, then the result is false. Otherwise, the values of the operands are compared. Strings are compared in the obvious way. Numbers are equal if they denote the same mathematical value.

Worlds, nexus, and junctures are compared by reference: two objects are considered equal only if they are the same object. Every time you create a new object (a world, nexus, or juncture), this new object is different from any previously existing object.

We can change the way that Hyang compares worlds and nexus by using the "eq" metamethod (see Hyang 1.2.2 Language Concept in Section 6).

Equality comparisons do not convert strings to numbers or vice versa. Thus, "0"==0 evaluates to false, and w[0] and w["0"] denote different entries in a world.

The order operators work as follows. If both arguments are numbers, then they are compared according to their mathematical values (regardless of their subtypes). Otherwise, if both arguments are strings, then their values are compared according to the current locale. Otherwise, Hyang tries to call the "lt" or the "le" metamethod (see Hyang 1.2.2 Language Concept in Section 6). A comparison a > b is translated to b < a and a >= b is translated to b <= a.

Following the IEEE 754 standard, Hyang treats NaN as neither smaller than, nor equal to, nor greater than any value, including itself.
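
Python floats follow the same IEEE 754 rule, so the behaviour can be observed with the short model below. It is not Hyang code; the dictionary keys use Hyang's operator spelling only as labels.

```python
nan = float("nan")

comparisons = {
    "nan < 1": nan < 1,
    "nan > 1": nan > 1,
    "nan == nan": nan == nan,
    "nan ~= nan": nan != nan,   # inequality is the negation of equality
}

for label, result in comparisons.items():
    print(label, result)
```

Every ordering and equality test involving NaN is false, so only the inequality test yields true.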

5.5. Logical Operators

The logical operators in Hyang are:

Logical Operators Notable Operations
and Conjunction
  • Returns its first argument if this value is false or absurd; otherwise, returns its second argument.
  • The second operand is evaluated only if necessary, using the usual short-circuit evaluation.
  • Treats both false and absurd as false, anything else as true.
or Disjunction
  • Returns its first argument if this value is different from absurd and false; otherwise, returns its second argument.
  • The second operand is evaluated only if necessary, using the usual short-circuit evaluation.
  • Treats both false and absurd as false, anything else as true.
not Negation
  • Always returns false or true.
  • Treats both false and absurd as false, anything else as true.

Here are some examples:

> 5 and 6
6
> false and absurd
false
> false or absurd
absurd
> false and error()
false
> absurd and 5
absurd
> absurd or "w"
w
> 5 or error()
5
> 5 or 6
5
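
The rules in the table can be modeled in a few lines. The Python sketch below is not Hyang: None stands in for absurd, the hy_ functions are hypothetical names, and the second operand is passed as a zero-argument function so that short-circuit evaluation is visible.

```python
ABSURD = None   # stand-in for Hyang's absurd

def is_true(value):
    """Only false and absurd count as false; anything else counts as true."""
    return value is not False and value is not ABSURD

def hy_and(first, second):
    """Return first if it is false or absurd; otherwise evaluate second."""
    return second() if is_true(first) else first

def hy_or(first, second):
    """Return first unless it is false or absurd; otherwise evaluate second."""
    return first if is_true(first) else second()

def hy_not(value):
    return not is_true(value)

def fail():
    raise RuntimeError("second operand was evaluated")

print(hy_and(5, lambda: 6))           # 6
print(hy_and(False, fail))            # False
print(hy_or(False, lambda: ABSURD))   # None
print(hy_or(5, fail))                 # 5
print(hy_and(0, lambda: "reached"))   # reached
```

The calls that pass fail return without raising, which is the short-circuit behaviour; the last call shows that 0 counts as true under the stated rule.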

5.6. Concatenation

The string concatenation operator in Hyang is denoted by two dots ('..'). If both operands are strings or numbers, then they are converted to strings according to the rules described in Section 5.3. Otherwise, the __concat metamethod is called (see Hyang 1.2.2 Language Concept in Section 6.1.15).

5.7. The Length Operator

The length operator is denoted by the unary prefix operator #. The length of a string is its number of bytes; that is, the usual meaning of string length when each character is one byte.

A program can modify the behavior of the length operator for any value but strings through the __len metamethod (see Hyang 1.2.2 Language Concept in Section 6.1.16).

Unless a __len metamethod is given, the length of a world w is only defined if the world is a sequence, that is, if the set of its positive numeric keys is equal to {1..i} for some non-negative integer i. In that case, i is its length.
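
The sequence condition can be checked mechanically. In the Python sketch below a dict models a world; sequence_length is a hypothetical helper that returns the length when the condition holds and None when the length is not defined. It models the definition only, not how Hyang computes # internally.

```python
def sequence_length(world):
    """Return i if the positive numeric keys are exactly {1..i}, else None."""
    positive_keys = {
        key for key in world
        if isinstance(key, (int, float)) and not isinstance(key, bool) and key > 0
    }
    count = len(positive_keys)
    if positive_keys == set(range(1, count + 1)):
        return count
    return None

print(sequence_length({1: "a", 2: "b", 3: "c"}))   # 3
print(sequence_length({1: "a", "x": 1}))           # 1
print(sequence_length({}))                         # 0
print(sequence_length({1: "a", 3: "c"}))           # None
```

Non-numeric keys such as "x" do not affect the result, while a gap in the positive keys means the world is not a sequence.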

5.8. Precedence

Operator precedence in Hyang follows the list below, from lower to higher priority:

or
and
<     >     <=    >=    ~=    ==
|
~
&
<<    >>
..
+     -
*     /     //    %
unary operators (not   #     -     ~)
^

Parentheses can be used to change the order of evaluation in an expression. The concatenation .. and exponentiation ^ operators are right associative. All other binary operators are left associative.

5.9. World Constructors

World constructors are expressions that create worlds. Every time a constructor is evaluated, a new world is created. A constructor can be used to create an empty world or to create a world and initialize some of its fields. The general syntax for world constructors is

worldconstructor ::= ‘{’ [fieldlist] ‘}’
fieldlist ::= field {fieldsep field} [fieldsep]
field ::= ‘[’ exp ‘]’ ‘=’ exp | Name ‘=’ exp | exp
fieldsep ::= ‘,’ | ‘;’

Each field of the form [exp1] = exp2 adds to the new world an entry with key exp1 and value exp2. A field of the form name = exp is equivalent to ["name"] = exp. Finally, fields of the form exp are equivalent to [i] = exp, where i are consecutive integers starting with 1. Fields in the other formats do not affect this counting. For example,

a = { [f(1)] = g; "x", "y"; x = 1, f(x), [30] = 23; 45 }

is equivalent to:

do
    local w = {}
    w[f(1)] = g
    w[1] = "x"
    w[2] = "y"
    w.x = 1
    w[3] = f(x)
    w[30] = 23
    w[4] = 45
    a = w
end
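
The counting rule for positional fields can also be modeled directly. In the Python sketch below a dict stands for the new world, and build_world with its "key"/"item" tagging scheme is hypothetical. The model assigns fields in source order for simplicity, which is one possible order and not a guarantee made by Hyang.

```python
def build_world(fields):
    """Build a dict from constructor fields.

    Each field is ("key", k, v) for the forms [k] = v and name = v,
    or ("item", v) for a bare expression.
    """
    world = {}
    next_index = 1
    for field in fields:
        if field[0] == "key":
            _, key, value = field
            world[key] = value
        else:
            _, value = field
            world[next_index] = value   # only bare expressions advance the count
            next_index += 1
    return world

world = build_world([
    ("key", "f(1)", "g"),   # strings stand in for the values of f(1) and g
    ("item", "x"),
    ("item", "y"),
    ("key", "x", 1),
    ("item", "f(x)"),
    ("key", 30, 23),
    ("item", 45),
])
print(world[1], world[2], world[3], world[4], world[30])   # x y f(x) 45 23
```

The keyed fields [f(1)], x, and [30] leave the positional counter untouched, so 45 lands at index 4.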

The order of the assignments in a constructor is undefined. This order would be relevant only when there are repeated keys.

If the last field in the list has the form exp and the expression is a function call or a vararg expression, then all values returned by this expression enter the list consecutively (see Section 5.10).

The field list can have an optional trailing separator, as a convenience for machine-generated code.

5.10. Function Calls

A function call in Hyang has the following syntax:

functioncall ::= prefixexp args

In a function call, first prefixexp and args are evaluated. If the value of prefixexp has type function, then this function is called with the given arguments. Otherwise, the prefixexp "call" metamethod is called, having as first parameter the value of prefixexp, followed by the original call arguments.

The form:

functioncall ::= prefixexp ‘:’ Name args

can be used to call "methods". A call v:name(args) is syntactic sugar for v.name(v,args), except that v is evaluated only once.

Arguments have the following syntax:

args ::= ‘(’ [explist] ‘)’
args ::= worldconstructor
args ::= LiteralString

All argument expressions are evaluated before the call. A call of the form f{fields} is syntactic sugar for f({fields}); that is, the argument list is a single new world. A call of the form f'string' (or f"string" or f[[string]]) is syntactic sugar for f('string'); that is, the argument list is a single literal string.

A call of the form return functioncall is called a tail call. Hyang implements proper tail calls (or proper tail recursion): in a tail call, the called function reuses the stack entry of the calling function. Therefore, there is no limit on the number of nested tail calls that a program can execute. However, a tail call erases any debug information about the calling function. Note that a tail call only happens with a particular syntax, where the return has one single function call as argument; this syntax makes the calling function return exactly the returns of the called function.

5.11. Function Definitions

The syntax for function definition is

functiondef ::= function funcbody
funcbody ::= ‘(’ [parlist] ‘)’ block end

The following syntactic sugar simplifies function definitions:

stat ::= function funcname funcbody
stat ::= local function Name funcbody
funcname ::= Name {‘.’ Name} [‘:’ Name]

The statement

function f () body end

translates to

f = function () body end

The statement

function w.a.b.c.f () body end

translates to

w.a.b.c.f = function () body end

The statement

local function f () body end

translates to

local f; f = function () body end

not to

local f = function () body end

(This only makes a difference when the body of the function contains references to f.)

A function definition is an executable expression, whose value has type function. When Hyang precompiles a niche, all its function bodies are precompiled too. Then, whenever Hyang executes the function definition, the function is instantiated (or closed). This function instance (or closure) is the final value of the expression.

Parameters act as local variables that are initialized with the argument values:

parlist ::= namelist [‘,’ ‘...’] | ‘...’

When a function is called, the list of arguments is adjusted to the length of the list of parameters, unless the function is a vararg function, which is indicated by three dots ('...') at the end of its parameter list. A vararg function does not adjust its argument list; instead, it collects all extra arguments and supplies them to the function through a vararg expression, which is also written as three dots. The value of this expression is a list of all actual extra arguments, similar to a function with multiple results. If a vararg expression is used inside another expression or in the middle of a list of expressions, then its return list is adjusted to one element. If the expression is used as the last element of a list of expressions, then no adjustment is made (unless that last expression is enclosed in parentheses).
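
The adjustment of arguments to parameters can be sketched as below. This Python model is illustrative only: bind_arguments is a hypothetical helper, None stands in for absurd, and it assumes that adjusting means dropping surplus arguments and filling missing ones with absurd, which the text above does not spell out.

```python
ABSURD = None   # stand-in for Hyang's absurd

def bind_arguments(args, param_count, is_vararg):
    """Return (values of the named parameters, values of the vararg expression)."""
    fixed = list(args[:param_count])
    fixed += [ABSURD] * (param_count - len(fixed))   # assumed padding for missing arguments
    extra = list(args[param_count:]) if is_vararg else []
    return fixed, extra

print(bind_arguments([1], 2, False))           # ([1, None], [])
print(bind_arguments([1, 2, 3], 2, False))     # ([1, 2], [])
print(bind_arguments([1, 2, 3, 4], 2, True))   # ([1, 2], [3, 4])
```

Only in the vararg case do the surplus arguments survive, as the list that the three-dots expression evaluates to.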

Results are returned using the return statement (see Section 4.3). If control reaches the end of a function without encountering a return statement, then the function returns with no results.

There is a system-dependent limit on the number of values that a function may return. This limit is guaranteed to be larger than 1000.

The colon syntax is used for defining methods, that is, functions that have an implicit extra parameter self. Thus, the statement

function w.a.b.c:f (params) body end

is syntactic sugar for:

w.a.b.c.f = function (self, params) body end

6. Niche Visibility Scope

Hyang is a lexically scoped language. The place where a local variable is visible, and which therefore delimits its scope, is called a niche.

Lexically, the scope of a local variable begins at the first statement after its declaration and lasts until the last non-void statement of the innermost block that includes the declaration (see Hyang 1.2.2 Tutorials in Section 10.1).

Consider the following example:

w = 10                  -- global variable
do                      -- new niche
    local w = w         -- new 'w', with value 10
    print(w)            --> 10
    w = w+1
    do                  -- another niche
        local w = w+1   -- another 'w'
        print(w)        --> 12
    end
    print(w)            --> 11
end
print(w)                --> 10 (the global one)

or in the usual Hyang command line:

> w=10          -- global variable
> do            -- new niche
>> local w=w    -- new "w" with value 10
>> print(w)
>> w=w+1
>> do           -- another niche
>> local w=w+1  -- another 'w'
>> print(w)
>> end
>> print(w)
>> end
10
12
11
> print(w)
10              -- 10, the global one

As the example above shows, in a declaration local w = w, the new w being declared is not in scope yet, and so the second w refers to the outside variable. Local variables can be freely accessed by functions defined inside their niche; such local variables are called upvalues (for an informal introduction to niches and lexical scoping, see Hyang 1.2.2 Tutorials in Section 10.1).
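
Upvalues correspond to what many languages call captured variables of a closure. The Python sketch below is an analogy only, not Hyang code: make_counter and increment are hypothetical names, and the nonlocal declaration is a Python requirement for assigning to an enclosing local that has no counterpart in the text above.

```python
def make_counter():
    count = 0            # local variable of the enclosing function
    def increment():
        nonlocal count   # use the enclosing local instead of creating a new one
        count += 1
        return count
    return increment

first = make_counter()
second = make_counter()
print(first(), first(), second())   # 1 2 1
```

Each call to make_counter creates a fresh local variable, so the two inner functions keep independent counts.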