# The Language

## Section 1 - Files

### Folder Structure

A normal TNSL project has a root source folder, with TNSL files contained in that folder and its sub-folders. The root folder usually represents a single library or binary, although no strict rule enforces this. Standard organization is to place sub-modules in sub-folders. The file name for a module's entry point should match the folder name. The file representing the compile target is known as the root file, and it generally resides in the root source folder. If the program is built as an executable, it requires a function named `main` as the entry point to the program.

### TNSL Files

TNSL files end with the `.tnsl` extension and may contain the following:

- Modules
- Constant and variable definitions
- Enum declarations
- Struct definitions
- Named function blocks
- Method and interface blocks
- Import statements
- Asm statements

Other language constructs may only be used within functions:

- Re-assignment of variables
- Control flow blocks
- Value statements
- Anonymous blocks (Scope blocks)
- Stream semantics

Comments may appear anywhere in the file.

### Comments

Comments begin with `#` and end with a new line. Comment blocks start with `/#` and end with `#/`.

## Section 2 - Blocks

Blocks in TNSL open with `/;` and close with `;/`. Keywords directly after the opening (and on the same line) affect the type of block created. A quicker syntax for closing one block and opening a new one is `;;`, which is equivalent to `;//;`. This can be helpful with a series of `else if` blocks or `case` blocks.

### Modules

Modules are akin to namespaces in C++. They hold a group of related sub-modules, functions, structs, and variables. These named definitions may be used by other projects if the `export` keyword is used before the `module` keyword. Otherwise, the names are not exported into the program's or library's symbol table.
### Module definition example:

*File a.tnsl (project a)*

```
/; export module pubmod

    /; module hidden
        # Can access all from pubmod, and pubmod.hidden
    ;/

    # Can access all from pubmod, and pubmod.hidden
;/

# Can access all from pubmod, and pubmod.hidden
```

*File aa.tnsl (project a)*

```
/; my_function_a
    # Can access all from pubmod, and pubmod.hidden
;/

# Can access all from pubmod, and pubmod.hidden
```

*File b.tnsl (project b)*

```
/; my_function
    # Can import all from pubmod, but not pubmod.hidden
;/

# Can import all from pubmod, but not pubmod.hidden
```

### Functions

Functions are blocks followed by a user defined name (not a keyword). Functions may have inputs and/or outputs. Inputs are enclosed by `()` and outputs are enclosed by `[]`.

Input lists may begin with a type or be empty. If they begin with a type, they must conclude with at least one named parameter. Parameters are separated by commas and use the previous type unless a new one is specified. Output lists consist of a comma-separated list of types.

Functions *may* be overloaded (that is, two functions may share a name but have differing input type lists). Overloaded functions *must not* have the same inputs and differ only in their outputs, but *may* differ in both inputs and outputs.

Symbols can be defined in a separate build file or auto-generated by the compiler. There is a standard for how the compiler will auto-generate names; this can be found in another chapter.
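
The overloading rules above can be sketched as follows. This is a minimal illustration that uses only syntax shown elsewhere in this chapter; the function name `describe` and its bodies are hypothetical.

```
# two overloads of the hypothetical function "describe";
# their input type lists differ, so both may coexist
/; describe (int a) [int]
    return a
;/

# differs in both inputs and outputs, which is also allowed
/; describe (float a) [float]
    return a
;/

# not allowed: same inputs as the first overload,
# differing only in the output list
# /; describe (int a) [bool]
#     return a == 0
# ;/
```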
### Function definition example:

*simple function with no inputs or outputs named "my_function"*

```
/; my_function
    tnsl.print("Hello from my_function!")
;/
```

*functions with inputs and/or outputs*

```
/; my_second_function (int input1, bool input2) [bool, int]
    return input2, input1
;/

/; sum_lists ({}int a, b) [int]
    int sum = 0

    /; loop (int i = 0; i < len a) [++i]
        sum += a{i}
    ;/

    /; loop (int i = 0; i < len b) [++i]
        sum += b{i}
    ;/

    # return the combined total of both lists
    return sum
;/

int global_lol = 0

/; next [int]
    return ++global_lol
;/

/; set_global (int i)
    global_lol = i
;/
```

### Control Flow Blocks

Control flow blocks begin with the keywords `if`, `else`, `loop`, `match`, `case`, or `default`. Control flow blocks have a series of lists; generally these can be thought of as 'beginnings' enclosed in `()` and 'endings' enclosed in `[]`. What 'beginning' and 'ending' mean varies by the type of block and is explained below.

### `if` Blocks

`if` blocks generally work as you would expect in other procedural languages. They can be followed by any number of `else if` blocks as well as a final `else` block.

The 'beginning' `()` of an `if` or `else if` block is a series of statements separated by `;`. These are executed in order. The last of these must evaluate to a boolean (type `bool` with value `true` or `false`). The code within the block is executed if the boolean evaluates to `true` and is skipped if it evaluates to `false`.

The first block in the series whose condition evaluates to `true` is executed and the others are skipped. If none evaluate to `true`, the `else` block is executed if present.

The 'ending' `[]` of an `if` block is currently reserved and has undefined behavior.

*Examples:*

```
/; if (true)
    tnsl.print("this always prints")
;/

/; if (false)
    tnsl.print("this never prints")
;; else if (true)
    tnsl.print("this one will now print")
;; else
    tnsl.print("this never prints either")
;/

/; if (0 !== 0)
    tnsl.print("you have to use boolean values")
;; else if (1 < 0)
    tnsl.print("standard equality operators work, see appendix for a list.")
;; else if (int i = 0; i < 2 && 5 > i)
    tnsl.print("Statements!")
;/

/; if (false)
    tnsl.print("this never prints")
;; else if (false)
    tnsl.print("this never prints either")
;; else
    tnsl.print("this one will now print")
;/
```

### `loop` Blocks

The `loop` block can be configured (based on whether or not each boolean statement is omitted) to act as any type of loop.

The 'beginning' `()` of a `loop` is similar to that of an `if` in that it is a series of statements; however, in the case of a `loop` the conditional is optional and defaults to `true` if omitted. If the conditional evaluates to `true`, the inner code is executed.

The 'ending' `[]` of a `loop` is similar to the 'beginning' in that it is a series of statements with an optional conditional at the end. If the conditional is omitted here, it defaults to the conditional from the 'beginning'. Each of these statements is evaluated at the end of the loop, and if the conditional evaluates to `true` the loop repeats its execution from just after the 'beginning'.

*Examples*

```
# Same as a do ... while block
/; loop [ <conditional> ]
;/

# Same as a while loop
/; loop ( <conditional> )
;/

# Infinite loop
/; loop
;/

# Adding statements to mimic a for loop
# Since i++ is not a bool it does not count
# as the conditional
/; loop (int i = 0; i < 10) [i++]
;/
```

### `match` Blocks

`TODO`

#### `case` Block

`TODO`

#### `default` Block

`TODO`

## Section 3 - Types

An exhaustive list of built-in and special types can be found in Appendix B.

### Standard Types in `tnsl`

The standard set of types will be familiar to programmers with experience in procedural languages.
Some common types are:

- Signed integer variants (positive or negative): `int`, `int8`, `int16`, `int32`, `int64`
- Unsigned integer variants (positive only): `uint`, `uint8`, `uint16`, `uint32`, `uint64`
- Floating point variants: `float`, `float32`, `float64`
- Boolean (`true` or `false`): `bool`

TNSL restricts valid platforms to those with byte-addressable memory and whose processors support at least 16-bit integers. TNSL basic types with unspecified length (`int`, `uint`, and `float`) default to the largest size supported in standard registers (not SIMD or vector registers). For example, on x86_32 `int` defaults to `int32`, and on x86_64 `int` defaults to `int64`.

### `libtnsl` Types

The following are well supported but rely on libtnsl:

- The meta-type: `type`
- The vector (SIMD) type: `vect`

They are discussed in more detail in the advanced features section of the specification.

### Pointers

Pointer types are prefixed with the `~` (pointer to) operator. This operator serves both as part of the type and as a way to get a pointer from a variable. The de-reference operator `` ` `` is used as a postfix to pointer variables when getting or setting the underlying value.

*Examples*

```
# define int i as 0
int i = 0

# pointer to int i
~int p = ~i

# set the value of i using p (i is set to 1)
p` = 1

/; if (i == 1)
    tnsl.print("That's pointers!")
;/
```

### References

Reference types are typically for use in function parameters but can be defined anywhere. Their type signature ends in the de-reference operator `` ` ``.

A few quirks of reference types:

- Reference types are similar to pointers and must be initialized with a pointer to be useful.
- When accessed or set in a normal statement they automatically de-reference the pointer they hold.
- When set in a definition or function call they expect a pointer to the underlying variable they will access.
- The pointer which the reference variable uses can be set by prefixing the variable with `~`.

*Examples*

```
# this will be our underlying integer
int a = 0

# basic definition and immediate assignment of a reference type
# (immediate assignment is special as it allows setting the
# pointer of the reference without use of the ~ operator)
int` r = ~a

# sets or gets in normal statements will use the underlying 'a'
# a becomes 1
r = 1
# b is defined and set to 1
int b = r

# an example of declaration without immediate assignment
int` s
# setting what 's' points to requires use of the ~ operator
~s = ~a
# a becomes 2
s++

# a function with a reference parameter
/; add_one (int` i)
    i++
;/

# you must explicitly call using a pointer to the variable being referenced
add_one(~a)

/; if (a == 3)
    tnsl.print("a is now three")
;/
```

### Fixed-length Arrays

Arrays are a repeated sequence of the same type of data in memory. Arrays store their length as a `uint`, immediately followed by the contents of the array. The length of any array can be read with the `len` operator, written `len <array>`.

Arrays are created by prefixing a type with `{ <# of elements> }`. One can similarly access an element of an array by suffixing the variable name with `{ <index> }`. When initializing or assigning a new value to an entire array, use `{}` to encase a list of values.

*Examples*

```
# create an array of five integers
{5}int i

# assign values to the array
i{0} = 0
i{1} = 2
i{2} = 0
i{3} = 2
i{4} = 1

# store the length of the array (5)
uint array_length = len i

# create an initialized array with length five
{5}int j = {1, 2, 3, 4, 5}

# loop through the arrays and add each element of j to i
/; loop (int k = 0; k < array_length) [k++]
    i{k} += j{k}
;/
```

### Unknown-length Arrays

When creating an array whose length is not known at compile time (or accepting an array of unknown length as a parameter), use the `{}` prefix.

***When would I use this?***

- When defining an array within a function body or module, the compiler will optimize however it thinks best, and the result is functionally equivalent to a fixed-length array.
- The difference matters more when **defining functions** or **defining structs**, since in these cases `{}` ***always*** denotes a ***pointer*** to an array.
- This can be useful when you want to accept arbitrarily long lists or have a recursive struct which has an array of itself as a member.

*Examples*

```
# when defining an array
{}int i = {1, 2, 3, 4}
{}int j = {5, 6, 7, 8, 9, 10}

# when defining a recursive struct
struct Node {
    int i,
    # using a fixed-length here would result in a
    # compile time error because the size could
    # not be computed
    {}Node sub
}

# when defining a function
/; sum({}int arr) [int]
    int out = 0
    /; loop (int i = 0; i < len arr) [++i]
        out += arr{i}
    ;/
    # return the accumulated total
    return out
;/

# sum can take any array of integers
int a = sum(i), b = sum(j)
```

### NOTE: Evaluation Order

Type prefixes and postfixes are evaluated as follows: first all prefixes in right-to-left order, then all postfixes in left-to-right order. This can be overridden using parentheses.

*Convoluted Examples*

```
# a reference to a pointer which points to an int
~int`
(~int)`

# a reference to a reference to an array (unknown length) which holds pointers to floats
{}~float``
({}(~(float)))``
```

### The `void` Type

The `void` type can represent two different things: unknown memory or a function. When `void` is prefixed with `~` it represents a pointer to arbitrary memory (byte aligned). When the `void` type is paired with input and output parameters, `void( <input types> )[ <output types> ]`, it represents a function. The parameter lists are considered part of the type and are not a postfix.
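
A hedged sketch of a function type with non-empty lists follows. It assumes that the lists inside `void( )[ ]` are written as comma-separated types, in the same form as a function's output list; this chapter does not confirm that detail. The names `double_it` and `use_func_type` are hypothetical.

```
# hypothetical function taking one int and returning one int
/; double_it (int a) [int]
    return a * 2
;/

/; use_func_type
    # assumed syntax: input types in (), output types in []
    void(int)[int] f = double_it

    # call double_it through the function-typed variable (r is set to 8)
    int r = f(4)
;/
```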

*Examples*

```
# simple function
/; func_1
    tnsl.print("hello!")
;/

# void example func
/; func_2
    # create a variable of function type and assign it func_1's value
    void()[] func_ref = func_1

    # call func_1 using func_ref
    func_ref()
;/
```

### Casting Types

Cast by enclosing a value in `()` and following it with the type to cast to, enclosed in `[]`.

*Examples*

```
# define an int and a float
int i = 10
float f = 11.5

# define a void pointer and set it to reference i
~void v = ~i

# define an int pointer and cast the void pointer to initialize it
~int p = (v)[~int]

# cast the float to an int and set the value of i
p` = (f)[int]
```

### Defining Types

In TNSL, types may be defined by using the `struct` keyword. `struct` must be used in conjunction with a user defined name and a set of members enclosed in `{}`. Unless declared `raw`, instances of a struct type may be larger than the sum of their members, because of type information and extension. Certain restrictions apply to `raw` types; these can be found in Appendix C.

Types may extend other types and interfaces, with some caveats. Raw structs may not extend other structs, but may extend interfaces. Non-raw structs may not extend raw structs. If a struct extends two or more structs, they may not have any conflicting member names.

Methods may be added to a struct with the `method` block. The user defined name of the struct must immediately follow `method`. Methods may use the `override` or `operator` keywords in their function definitions. `override` must be used for functions which are named and typed equivalently to methods of the extended types. `operator` allows types to use operators as methods. The keyword must be immediately followed by the operator to overload, and the method may have at most one input, depending on whether the operator is binary or not.

Methods may access the special keywords `self` and `super`. `self` is a reference to the instance of the struct that the function was called on.
`super` is a reference to any structs or interfaces extended by the struct. If there is only one extended type, it references the methods of that type. Otherwise, it is an array of such objects. It may be called to invoke the equivalent method on the extended type. `super` may also be used in the member set to position the extended types' members in relation to the new struct's members.

Examples:

    # normal struct
    struct box {
        float x, y, z
    }

    # method block
    /; method box
        /; area [float]
            return self.x * self.y * self.z
        ;/
    ;/

### Interface Types

Interfaces are defined using the `interface` keyword. Interfaces have methods but no struct or members to accompany them. Instances of interfaces may not be created. Methods defined by interfaces must be overridden unless marked in the interface (the example below marks such a method with `override`). Such marked methods may call other methods, but may not use any members, as interfaces have none.

Example:

    /; interface shape
        /; area [float]
        ;/

        # this method does not need to be overridden
        /; override area_sq [float]
            float a = self.area()
            return a*a
        ;/
    ;/

    struct box extends shape {
        float x, y, z
    }

    /; method box
        /; override area [float]
            return self.x * self.y * self.z
        ;/
    ;/

### Enum Types

Enums are defined using the `enum` keyword. An enum represents a set of possible states, and requires a single output type which can be compared. Enums may be defined in conjunction with the `raw` keyword. When defined in this way, each state is mutually exclusive and must be represented by a single bit of a `uint` type. Raw enums may be thought of as more akin to bit-masks.

Examples:

    # non-raw enums must define each value
    enum color [int] {
        # In standard styling, these use UPPER_SNAKE_CASE
        RED = 1,
        BLUE = 2,
        ...
        YELLOW = 12
    }

    # raw enums may not define any value
    raw enum object_material {
        WOOD,
        METAL,
        GLASS,
        PLASTIC,
        ...
        ROCK
    }

## Section 4 - Statements

`TODO`

## Section 5 - Operators

An exhaustive list of operators can be found in Appendix A.

### Operator Precedence

Operator precedence is as follows (from greatest to least):

```
Pointer operators (p0):

    ~ - address of
    ` - de-reference

Access operator (p1):

    . - get/access

Increment/decrement (p2):

    ++ - increment
    -- - decrement

Multiplication/division (p3):

    * - multiply
    / - divide

Addition and subtraction (p4):

    + - addition
    - - subtraction

Modulus (p5):

    % - modulus

Bitwise operators (p6):

    & - and
    | - or
    ^ - xor
    << - shift left
    >> - shift right
    !& - nand
    !| - nor
    !^ - xnor
    ! - not (bitwise or boolean)

Boolean operators (p7):

    && - boolean and
    || - boolean or
    == - boolean eq
    > - greater than
    < - less than
    !&& - boolean nand
    !|| - boolean nor
    !== - boolean neq
    !> - boolean not greater than
    !< - boolean not less than
    >== - boolean greater than or equal to
    <== - boolean less than or equal to
```

## Section 6 - `asm`

`TODO`

## Section 7 - Crosscalling to C

`TODO`

## License

This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/.