Name

Little

Synopsis

L [options] script.l [args]

Introduction

Little is a compiled-to-byte-code language that draws heavily from C and Perl. From C, Little gets C syntax, simple types (int, float, string), and complex types (arrays, structs). From Perl, Little gets associative arrays and regular expressions (PCRE). And from neither, Little gets its own simplistic form of classes.

The name "Little", abbreviated as simply "L", alludes to the language's simplicity. The idea was to distill the useful parts of other languages and combine them into a scripting language with type checking, classes (not full-blown OO but useful nonetheless), and direct access to a cross-platform graphical toolkit.

Little provides a set of built-in functions, drawn from Perl and the standard C library.

Little is built on top of the Tcl/Tk system. The Little compiler generates Tcl byte codes and uses the Tcl calling convention. This means that L and Tcl code may be intermixed. More importantly, it means that Little may use all of the Tcl API and libraries as well as Tk widgets. The net result is a type-checked scripting language which may be used for cross-platform GUIs.

Little is open source under the same license as Tcl/Tk (BSD-like), with any bits that are unencumbered by the Tcl license also being available under the Apache License, Version 2.0.

Running Little programs

You can run a Little program from the command line.

L [options] progname.l [args]

Alternatively, put the following as the first line of your script and make sure the script is executable (chmod 755 script.l under Unix).

#!/path/to/L [options]

Options:

--fnhook=myhook: When function tracing is enabled, use myhook as the trace hook.
--fntrace=on | entry | exit | off: Enable function tracing on both function entry and exit, entry only, exit only, or disable tracing altogether.
--norun: Compile only (do not run). This is useful to check for compilation errors.
--nowarn: Disable compiler warnings. This is useful when you know you have unused variables or other warnings that you don't want to be bothered with.
--poly: Treat all types as poly. This effectively disables type checking.
--trace-depth=n: When function tracing is enabled, trace only to a maximum call depth of n.
--trace-files=colon-separated list of glob | /regexpr/: Enable tracing of all functions in the given files, specified either as globs or regular expressions. A leading + before a glob or regexp means to add to what is otherwise being traced and a leading - means to remove. No leading + or - means to trace exactly what is specified.
--trace-funcs=colon-separated list of glob | /regexpr/: Like --trace-files but specifies functions.
--trace-out=filename | host:port: Send default trace output to a file or a TCP socket.
--version: Print the L build version and immediately exit.

The tracing-related command-line options also can be specified in a #pragma inside the program; see the DEBUGGING section.

The optional [args] is a white-space separated list of arguments that are passed to the script's main() function as an array of strings (argv).

Language syntax

A Little script or program consists of one or more statements. These may be executable statements, variable or type declarations, function or class declarations, or #pragma statements which specify tracing directives. Statements outside of functions are said to be at the top level and are executed in the order they appear, although you can use a return statement to bail out. There is no need to have a main() function, but if one is present, it is executed after all of the top-level statements (even if you did a return from the top level).

puts("This is printed first.");
void main()
{
    puts("This is printed last.");
}
puts("This is printed second.");

Little statements end in a semicolon.

printf("Hello, world\n");

Both C-style and hash-style comments are allowed, but a hash-style comment is allowed only on the first line of the file and must start in column 1.

# This is a comment
    // So is this
/* And this too */
    # But this is an error, only allowed on column 1, line 1

Whitespace usually is irrelevant.

printf(
    "Hello, world\n")
    ;

... except inside quoted strings:

// this would print with a linebreak in the middle
printf("Hello\
world\n");

and around the string-concatenation operator " . " so it can be distinguished from the struct-member selection operator ".".

Double quotes or single quotes may be used around literal strings:

puts("Hello, world");
puts('Hello, world');

However, only double quotes "interpolate" variables and handle character escapes such as for newlines (\n):

puts("Hello, ${name}");     // works fine
puts('Hello, ${name}');     // prints Hello, ${name} literally

Inside single quotes, you can still escape a line break, the single quote character (\'), and the escape character (\\).

puts('Here \' and \\ are escaped.');
puts('This one spans a\
line');

If you put a line break in the middle of a string but forget to escape it, Little will complain.

Adjacent string constants are automatically concatenated, like in C.

printf("This " "prints "
       "the concatenation "
       "of ""all"" strings\n");

prints "This prints the concatenation of all strings" followed by a newline.

Little requires that adjacent string constants all be of the same "type": all interpolated ("") or all not interpolated (''). To mix the two, the dot operator comes to the rescue, as in this contrived example:

'Hi there. ${USER} is ' . "${USER} today"

Variables, types, and constants

Little is a statically typed language with both scalar and complex types. All variables are typed and must be declared before use.

The scalar types are int, float, and string. The complex types are array, hash, struct, and list. Little also supports function pointers, classes, and a special type called poly which matches any type and normally is used to disable type checking. Finally, Little has the concept of an undefined value which a variable of any type can possess.

Strong typing means that you can assign something of one type only to something else of a compatible type. Normally, to be compatible the types must be structurally the same, but there are exceptions such as an int being compatible with float and a list sometimes being compatible with an array or struct.

Variable names begin with a letter and can contain letters, numerals, and underscores, but they cannot begin with an underscore (_). This is because the Little compiler reserves names starting with _ for internal use.

A variable declaration includes the type, the variable name, and optionally an initial value which can be any Little expression:

int i = 3*2;

printf("i = %d\n", i);  // prints i = 6

If an initial value is omitted, the variable starts out with the undefined value undef.
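
As a small sketch of what that means in practice, a declaration without an initializer can be tested with the defined() built-in, which is covered in the section on undef. The variable name count is only for illustration:

int count;      // no initial value, so count starts out as undef

if (defined(count)) {
    puts("count has a value");
} else {
    puts("count is undef");     // this branch is taken
}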

Scalars

A scalar represents a single value that is a string, integer, or floating-point number. Strings are wide char (Unicode), integers are arbitrarily large, and floats are like C's double.

Examples:

string animal = "camel";
int answer = 42;
float pi = 3.14159;

When one of these types is expected, supplying another one usually is an error, except that an int always can be used as a float. You can override this behavior with a type cast.

Hex and octal integer constants are specified like this:

int space = 0x20;
int escape = 0o33;

Integer constants can be arbitrarily large; they are not limited by the machine's word size.

Strings have a special feature where they can be indexed like arrays, to get a character or range of characters, or to change a character (but you cannot change a range of characters):

string s1, s2;

s1 = "hello";
s2 = s1[1];     // s2 gets "e"
s2 = s1[1..3];  // s2 gets "ell"
s1[1] = "x";    // changes s1 to "hxllo"
s1[END+1] = "there";    // changes s1 to "hxllothere"

The pre-defined identifier END is the index of the last character, or is -1 if the string is empty. You always can write to one past the end of a string to append to it, but writing beyond END+1 is an error.

You delete a character within a string by indexing the string and setting it to "" (the empty string), or by using the undef() built-in:

s[3] = "";    // deletes fourth character of s
undef(s[3]);  // same thing

After the deletion any characters after the deleted character are shifted left by one:

s = "123X456";
undef(s[3]);  // s is now "123456", the X is gone, the rest left shifted

Undef

Sometimes you want to signify that a variable has no legal value, such as when returning an error from a function. Little has a read-only pre-defined identifier called undef which you can assign to any variable.

int_var = undef;
array_var = undef;

This is different than the undef() built-in function which deletes array, hash, or string elements.

When used in comparisons, a variable that is undefined is never seen as true, or as equal to anything defined, so you can easily check for error conditions:

unless (f = fopen(file, "r")) {
    die(file);
}
while (s = <f>) {
    printf("%s\n", s);
}

You have to be a little careful with numeric values (and poly values holding a number), because they can be false in a condition either when they are undefined or when they have the value zero:

int i;

i = 0;
if (i)              // false

i = 1;
if (i)              // true

i = undef;
if (i)              // false

i = 0;
if (defined(i))     // true
if (i)              // false

Other than numeric types, you can skip the defined() and just use if (var). That's true for arrays, structs, hashes, and FILE types.

Little itself sometimes uses undef to tell you that no value is available. One case is when you assign to an array element that is more than one past the end of an array. Little auto-extends the array and sets the unassigned elements to undef.

Arrays

An array holds a list of values, all of the same type:

string animals[] = { "camel", "llama", "owl" };
int numbers[] = { 23, 42, 69 };

You do not specify a size when declaring an array, because arrays grow dynamically.

Arrays are zero-indexed. Here's how you get at elements in an array:

puts(animals[0]);              // prints "camel"
puts(animals[1]);              // prints "llama"

The pre-defined identifier END is the index of the last element of an array, or is -1 if the array is empty.

puts(animals[END]);       // last element, prints "owl"

END is valid only inside of an array subscript (strings are a kind of array so END works there).

If you need the length of an array, use a built-in function:

num_elems = length(animals);   // will get 3

If the array is empty, length() returns 0.

To get multiple values from an array, you use what's called an array slice which is a sub-array of the array being sliced. Slices are for reading values only; you cannot write to a slice.

animals[0..1];            // gives { "camel", "llama" }
animals[1..END];          // gives all except the first element

In this last example where END is used, you must be careful, because if the array is empty, END will be -1, and an array slice where the second index is less than the first causes a run-time error.
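
One way to guard against this, sketched here from the rule just stated, is to check the length before slicing. The name rest is illustrative only. Note that a one-element array needs the guard too, since END is then 0 and the slice 1..0 has a second index smaller than the first:

string rest[];

if (length(animals) > 1) {
    rest = animals[1..END];     // safe: END is at least 1 here
}
// otherwise rest keeps its initial value of undef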

You can add and remove from an array with push and pop, unshift and shift, and insert. The push and pop functions add and remove from the end:

string birds[], next;

push(&birds, "robin");
push(&birds, "dove", "cardinal", "bluejay");
next = pop(&birds);   // next gets "bluejay"
// birds is now { "robin", "dove", "cardinal" }

The & means that birds is passed by reference, because it will be changed. This is discussed in more detail in the section on functions.

Another way to append:

birds[END+1] = "towhee";

The unshift and shift functions are similar but they add and remove from the beginning of the array.
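
A brief sketch, assuming unshift and shift take their arguments in the same form as push and pop above (this section does not spell out their signatures). The names queue and first are illustrative:

string queue[] = { "dove", "cardinal" };
string first;

unshift(&queue, "robin");   // queue is now { "robin", "dove", "cardinal" }
first = shift(&queue);      // first gets "robin"
// queue is back to { "dove", "cardinal" }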

You can insert anywhere in an array with insert:

insert(&birds, 2, "crow");  // insert crow before birds[2]
insert(&birds, 3, "hawk", "eagle");

In these examples we inserted one or more single elements, but wherever you can put an element you also can splice in a list:

string new_birds[] = { "chickadee", "turkey" };
push(&birds, new_birds);  // appends chickadee and turkey

In this example the variable new_birds is not required; an array constant could have been pushed instead:

push(&birds, { "chickadee", "turkey" });

On its face this call is ambiguous: it could be pushing two strings as two new entries in the array, or pushing a single item that is itself an array. Little resolves the ambiguity by the type of the first argument, so you have to know that type to know which is which.

You can remove from anywhere in an array with undef:

string dev_team[] = { "larry", "curly", "mo" };
undef(dev_team[0]);  // delete "larry" from dev_team

When you delete an element, all subsequent elements slide down by one index. Note that undef() works only on a variable; it cannot remove an element from a function return value, for example.

You also can directly assign to any array index even if the array hasn't yet grown up to that index. If you assign more than one past the current end, the unassigned elements are assigned undef:

string colors[] = { "blue", "red" };
colors[3] = "green";   // colors[2] gets undef and
                       // colors[3] gets "green"

You can read from any non-negative array index as well. You will simply get undef if the element doesn't exist. Reading from a negative index causes a run-time error.

An array can hold elements of any type, including other arrays. Although Little does not have true multi-dimensional arrays, arrays of arrays give you basically the same thing:

int matrix[][] = {
    { 1, 2, 3 },
    { 4, 5, 6 },
    { 7, 8, 9 }
};
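
Elements of such a matrix can be reached by indexing twice, or by nesting foreach loops (foreach is covered later under control transfer statements). The following is a sketch only; the variable names row and cell are illustrative:

int row[], cell;

printf("%d\n", matrix[1][2]);   // second row, third column: 6

foreach (row in matrix) {
    foreach (cell in row) {
        printf("%d ", cell);
    }
    printf("\n");
}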

When declaring an array, it is legal to put the brackets after the type instead of the name. This sometimes is useful for readability, and is required in function prototypes that omit the parameter name.

int[] mysort(int[]);    // prototype
int[] mysort(int vector[]) { ... }

An array in Little is implemented as a Tcl list under the covers.

Hashes

A hash holds a set of key/value pairs:

int grades{string} = { "Tom"=>85, "Rose"=>90 };

When you declare a hash, you specify both the key type (within the {}) and the value type. The keys must be of scalar type but the values can be of any type, allowing you to create hashes of arrays or other hashes.

To get at a hash element, you index the hash with the key:

grades{"Rose"};           // gives 90

If the given key does not exist in the hash, you get back undef. Using an undefined key causes a run-time error.
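
As a sketch of checking for a missing key: because the values in grades are ints, a plain if (g) could not tell a missing key apart from a grade of zero, so defined() is used. The key "Bob" and the variable g are illustrative:

int g = grades{"Bob"};      // "Bob" is not a key, so g gets undef

unless (defined(g)) puts("no grade recorded for Bob");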

You get a list of all the keys in a hash with the keys() built-in, which returns an array:

string students[] = keys(grades);

Because hashes have no particular internal order, the order in which the keys ("Tom" and "Rose") appear is undefined. However, you can obtain a sorted array of keys like this:

string students[] = sort(keys(grades));

The length built-in works on hashes too and returns the number of key/value pairs.
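
Putting keys(), sort(), and indexing together, a hash can be walked in sorted key order with foreach (described later). This is a sketch that assumes sort's default ordering is acceptable for student names; the variable name is illustrative:

string name;

foreach (name in sort(keys(grades))) {
    printf("%s: %d\n", name, grades{name});
}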

You remove an element from a hash with undef:

undef(grades{"Tom"});  // removes "Tom" from the hash

Note that this is different than assigning undef to a hash element, which does not remove that element from the hash; it instead leaves (or creates) an element whose value is undef. It is not an error to remove something that's not in the hash. Note that undef() works only on a variable; it cannot remove an element from a function return value, for example.
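
The difference can be sketched as follows, going only by the rules stated above. The hash name scores is illustrative:

int scores{string} = { "Tom"=>85, "Rose"=>90 };

scores{"Tom"} = undef;      // "Tom" is still a key; its value is now undef
// length(scores) still counts two key/value pairs here

undef(scores{"Tom"});       // the key "Tom" itself is removed
// length(scores) now counts one pair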

When declaring a hash, it is legal to put the braces after the type instead of the name. This comes in handy for function definitions. Here is a function that returns a hash of integer grades, indexed by student name, after adjusting them:

int{string} adjust_grades(int{string});    // prototype
int{string} adjust_grades(int grades{string}) { ... }

A hash in Little is implemented as a Tcl dict.

Structs

Little structs are much like structs in C. They contain a fixed number of named things of various types:

struct my_struct {
    int    i;
    int    j;
    string s;
};
struct my_struct st = { 1, 2, "hello" };

You index a struct with the "." operator except when it is a call-by-reference parameter and then you must use "->":

void foo(struct my_struct &byref) {
    puts(byref->s);  // prints hello
}
puts(st.i);    // prints 1
puts(st.j);    // prints 2
puts(st.s);    // prints hello
foo(&st);      // pass st by reference

It is an error to use "." when "->" is required and vice-versa. Be careful to not put any whitespace around the "." or else you will get the string concatenation operator and not struct-member selection (this is a questionable overload of the "." operator but too useful to pass up).
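
A short sketch of the two meanings of the dot side by side, reusing the st variable from above (the variable msg is illustrative):

string msg;

msg = st.s . " world";      // st.s selects the member; " . " concatenates
puts(msg);                  // prints hello world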

Structs can be named like my_struct above or they can be anonymous:

struct {
    int    i;
    int    j;
} var1;

Struct names have their own namespace, so they will never clash with function, variable, or type names.

A struct in Little is implemented as a Tcl list.

Lists

In the examples above, we have been initializing arrays, hashes, and structs by putting values inside of {}:

string nums[] = { "one", "two", "three" };

In Little, the {} is an operator that creates a list and can be used anywhere an expression is valid. The array could instead be initialized like this:

string nums[];
nums = { "one", "two", "three" };

We said before that you can assign a value to something only if it has a compatible type. Lists are special in that they can be compatible with arrays, hashes, and structs. A list where all the elements are of the same type, say T, is compatible with an array of things of type T. The example above illustrates this.

A list also is compatible with a struct if the list elements agree in type and number with the struct. The assignment of the variable st above illustrates this.

A list is compatible with a hash if it has a sequence of key/value pairs and they are all compatible with the key/value types of the hash:

int myhash{string} = { "one"=>1, "two"=>2, "three"=>3 };

Lists are very useful at times because you can use them to build up larger complex structures. To concatenate two arrays, you could do this:

{ (expand)array1, (expand)array2 };

The (expand) operator takes an array (or struct or list) and moves its elements out a level as if they were between the { and } separated by commas. The section on manipulating complex data structures has more details on the (expand) operator.
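
For instance, the following sketch contrasts a list built with (expand) against one built without it. The variable names are illustrative, and the last line relies on the rule above that a list of same-typed elements is compatible with an array of that type:

int low[]  = { 1, 2 };
int high[] = { 3, 4 };

int all[] = { (expand)low, (expand)high };  // all is { 1, 2, 3, 4 }
int pairs[][] = { low, high };              // no expand: { { 1, 2 }, { 3, 4 } }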

A list in Little is implemented as a Tcl list.

Poly

Sometimes you don't want Little to do type checking. In this case, you use the poly type, which is compatible with any type. Poly effectively disables type checking, allowing you to use or assign values without regard to their types. Obviously, care must be taken when using poly.

The --poly option to Little causes all variables to be treated as if they were of type poly, regardless of how they are declared.
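
A minimal sketch of a poly variable holding values of different types in turn (the variable name p is illustrative):

poly p;

p = 42;                 // an int
p = "forty-two";        // now a string; no type error is reported
p = { 4, 2 };           // now a list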

Type Casts

Something of one type can be converted into something of another type with a type cast like in C:

string_var = (string)13;

If the thing being cast cannot be converted to the requested type, the result of the cast is undef.
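
A sketch of checking for a failed cast, assuming a non-numeric string is one of the things that cannot be converted to int (the variable names are illustrative):

string s = (string)13;      // s gets "13"
int n = (int)"abc";         // "abc" is not a number, so n gets undef

unless (defined(n)) puts("not a number");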

Typedefs

You can declare a type name to be a shorthand for another type, as you would in C:

typedef struct {
    int     x, y;
} point;

And then use the shorthand as you would any other type name:

point points[];
points = { {1,1}, {2,2}, {3,3} };

You can typedef a function pointer too. This declares compar_t as a function type that takes two ints and returns an int:

typedef int compar_t(int a, int b);

Type names belong to their own namespace, so you can define a typedef with the same name as a variable, function, or struct without ambiguity (though it is poor practice to do so).

Name scoping

Variables must be declared before use, or a compile-time error will result. However, functions need not be declared before use although it is good practice to do so.

Declarations at the top-level code exist at the global scope and are visible across all scripts executed by a single run of Little. You can qualify a global declaration with private to restrict it to the current file only; this is similar to a static in C, except that private globals are not allowed to shadow public globals. Names declared in a function, or in a block within a function, are local and are scoped to the block in which they are declared.

Functions and global variables share the same namespace, so a variable and function cannot have identical names. Struct tags have their own namespace, and type names have theirs.

Inside a function, two locals cannot share the same name, even if they are in parallel scopes. This is different than C where this is allowed. If a local shares the same name as a global, the local is said to shadow the global. Locals cannot shadow globals that have been previously declared.
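
The sketch below illustrates these rules. It assumes the private qualifier is written in front of the type, which this section does not show explicitly; the names hits, tally, pos, and neg are illustrative:

private int hits = 0;       // global, but visible only within this file

void tally(int n)
{
    if (n > 0) {
        int pos = n;
        hits += pos;
    } else {
        int neg = -n;       // calling this one "pos" as well would be
        hits += neg;        // rejected, even though the scopes are parallel
    }
}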

Names declared inside of a class can be either local or global depending on how they are qualified.

String interpolation

Expressions can be interpolated into double-quoted strings, which means that within a string you can write an expression and at run-time its value will be inserted. For example, this interpolates two variables:

int    a = 12;
string b = "hello";

/* This will print "A is 12 and b is hello". */

printf("A is ${a} and b is ${b}\n");

Everything inside the ${} is evaluated like any other Little expression, so it is not limited to just variables:

printf("The time is ${`date`}\n");
s = "The result is ${some_function(a, b, c) / 100}";

Here documents

Sometimes you need to assign a multi-line string to a variable. Here documents help with that:

string s = <<EOF
This is the first line in s.
This is the second.
And the last.
EOF;

Everything in the line starting after the initial <<EOF delimiter and before the final EOF delimiter gets put into the variable s. You can use any identifier you want as the delimiter; it doesn't have to be EOF. A semicolon after the EOF is optional.

The text inside the here document undergoes interpolation and escape processing. If you don't want that, put the initial delimiter inside of single quotes:

string s = <<'EOF'
None of this text is interpolated.
So this ${xyz} appears literally as '${xyz}'.
And so does \ and ' and " and anything else.
EOF;

To help readability, you can indent your here document but have the indenting white space ignored. Put the initial delimiter on the next line and then whatever whitespace you put before it gets ignored:

string s =
    <<EOF
    This is the first line in s and gets no leading white space.
     This line ends up with a single leading space.
      And this ends up with two.
    EOF;

Exceptions to the indentation rule: a blank line is processed as if it is indented, and the end delimiter can have any amount of leading white space so that you can indent it more or less if you like.

Operators

Arithmetic

+   addition
++  increment by 1 (integer only)
-   subtraction
--  decrement by 1 (integer only)
*   multiplication
/   division
%   remainder

Numeric and String comparison

==  equality
!=  inequality
<   less than
>   greater than
<=  less than or equal
>=  greater than or equal

String comparison

=~  regexp match or substitute
!~  negated regexp match

Comparison of composite types (array, hash, struct)

eq(a,b)
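
A minimal sketch of eq() on two arrays, assuming it yields true when both hold the same values (the variable names are illustrative):

int a[] = { 1, 2, 3 };
int b[] = { 1, 2, 3 };

if (eq(a, b)) puts("a and b hold the same values");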

Bit operations

&   bit and
|   bit or
^   bit exclusive or
~   bit complement
<<  left shift
>>  right shift

Boolean logic

&&  and
||  or
!   not

Conditional

?:  ternary conditional (as in C)

Indexing

[]  array index
{}  hash index
.   struct index (no whitespace around the dot)
->  struct index (call-by-reference parameters dereference)
->  class and instance variable access (object dereference)

Miscellaneous

=   assignment
,   statement sequence
.   string concatenation (must have whitespace around the dot)
``  command expansion

Assignment

+=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=, .=

Operator precedence (highest to lowest) and associativity

`` (non associative)
[] {} . (struct index) -> ++ -- (left)
unary + unary - ! ~ & (right)
* / % (left)
+ - . (string concatenation) (left)
<< >> (left)
< <= > >= (left)
== != =~ !~ (left)
& (left)
^ (left)
| (left)
&& (left)
|| (left)
?: (right)
= += -= *= /= %= &= |= ^= <<= >>= .= (right)
, (left)

Control transfer statements

Little has most of the usual conditional and looping constructs.

In conditionals, numeric variables (including poly variables with a number in them) evaluate to true or false based on their value. If you want to know whether a numeric variable is defined, you have to use the defined() built-in.

For all other variable types (arrays, hashes, structs, strings, etc.), the variable itself yields true if it is defined and false if it is not.

int undefined = undef;
int zero = 0;
int one = 1;
string args[];
string more[] = { "hi", "there", "mom" };

if (undefined)          // false (undef)
if (zero)               // false (0 value)
if (defined(zero))      // true (not undef)
if (one)                // true (1 value)
if (defined(one))       // true (set to some value)
if (args)               // false, not initialized
if (more)               // true, initialized

See the list of operators in the previous section for information on comparison and logic operators, which are commonly used in conditional statements.

if

The if statement comes in the traditional form:

if ( condition ) {
    ...
} else if ( other condition ) {
    ...
} else {
    ...
}

And there's a negated version of it (from Perl) provided as a more readable version of if (!condition).

unless ( condition ) {
    ...
}

while

while ( condition ) {
    ...
}

do {
    ...
} while ( condition );

for

for (i = 0; i < max; ++i) {
    ...
}

foreach

The foreach statement lets you iterate through the elements of an array:

string element;
string myarray[];

foreach (element in myarray) {
    printf("This element is %s\n", element);
}

... or of a hash:

string key;
int value;
int myhash{string};

foreach (key=>value in myhash) {
    printf("Key %s has value %d\n", key, value);
}

... or of a string:

string char;

foreach (char in mystring) {
    printf("This char is %s\n", char);
}

... or through the lines in a string:

int i = 0;
string s;
string lines = "a\nbb\nccc\ndddd\n";

// (questionable) alias for foreach (s in split(/\n/, lines))
foreach (s in <lines>) {
    puts("line #${++i}: ${s}");
}

Inside the loop, the index variable(s) (element, key, value, and char above) get copies of the iterated elements, so if you assign to them, the thing you're iterating over does not change.

If you want to stride through more than one array element, character, or line in each iteration, just use a list of value variables instead of one:

foreach (e1,e2,e3 in myarray) {
    printf("Next three are %s:%s:%s\n", e1, e2, e3);
}

If there isn't a multiple of three things to iterate through, the stragglers get undef on the last iteration. Strides work only for arrays and strings, not hashes.
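
As a sketch of the straggler case, striding by two over a three-element array leaves the second variable undefined on the last iteration (the names letters, e1, and e2 are illustrative):

string letters[] = { "a", "b", "c" };
string e1, e2;

foreach (e1,e2 in letters) {
    if (defined(e2)) {
        printf("%s and %s\n", e1, e2);      // first pass: a and b
    } else {
        printf("%s is left over\n", e1);    // second pass: c is left over
    }
}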

After completing the loop and falling through, all loop counters become undefined (they get the undef value). If the loop is prematurely ended with a break or by jumping out of the loop with a goto, the loop counters keep their values.
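
This makes it possible to tell afterwards whether a loop ended early. A sketch under the rule just stated, with illustrative names:

int n;
int nums[] = { 3, 8, 5 };

foreach (n in nums) {
    if (n > 4) break;
}
if (defined(n)) {
    printf("stopped early at %d\n", n);     // prints: stopped early at 8
} else {
    puts("no element was greater than 4");
}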

braceless control flow

Little allows braceless versions of if, unless, while, for, and foreach, provided that the body is a single statement. Note that else is not in the list: an if with an else always requires braces.
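
A short sketch of both forms (the variable x is illustrative):

int x = 3;

if (x > 0) puts("positive");
unless (x % 2) puts("even");    // skipped here, since 3 % 2 is 1

// With an else, braces are required on both branches:
if (x > 0) {
    puts("positive");
} else {
    puts("zero or negative");
}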

switch

The switch statement is like C's except that regular expressions and/or strings can be used as case expressions:

switch (string_var) {
    case "true":
    case "false":
        puts("boolean (sort of)");
        break;
    case /[0-9]+/:
        puts("numeric");
        break;
    case /[a-zA-Z][0-9a-zA-Z]*/:
        puts("alphanumeric");
        break;
    default:
        puts("neither");
        break;
}

The default case is optional. The expression being switched on must be of type int, string, or poly.

In addition to checking the value of the switch expression, you can test whether it is undefined. This is useful when switching on a function return value which could be undef to signal an error condition.

switch (myfunc(arg)) {
    case /OK/:
        puts("all is A-OK");
        break;
    case undef:
        puts("error");
        break;
    default:
        puts("unknown return value");
        break;
}

Regular expressions have an alternative syntax (borrowed from Perl) that is used when the expression may contain "/". In switch statements that syntax is somewhat restricted because of the parsing ambiguities illustrated below:

switch (str) {
    case m|x*y|:    // "|" and most other punctuation as the delim
                    // are OK,
                    // except "(" and ":" -- error
                    // and any alphabetic character -- error
        break;
    case m:         // is the variable m (not a regexp) -- ok
        break;
    case mvar:      // and variables starting with "m" -- ok
        break;
}

break and continue

Little has break and continue statements that behave like C's. They work in all Little loops including foreach loops, and break works in switch case bodies.
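
A small sketch of both statements in a foreach loop (the names n and nums are illustrative):

int n;
int nums[] = { 1, 2, 3, 4, 5, 6 };

foreach (n in nums) {
    if (n % 2) continue;    // skip odd values
    if (n > 4) break;       // stop at the first even value above 4
    printf("%d\n", n);      // prints 2, then 4
}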

goto

The goto statement unconditionally transfers control to a label in the same function, or to a label at the global scope if the goto is at the global scope. You cannot use a goto to transfer in to or out of a function. Labels have their own namespace so they will not clash with variable, function, or type names.

/* A goto at the global scope. */
goto L1;
puts("this is not executed");
L1: puts("but this is");

void foo()
{
    goto L2;
    puts("this is not executed");
L2:     puts("but this is");
}

Some caveats: do not jump into a foreach loop or a run-time error may result due to bypassing the loop set-up. Do not bypass a variable declaration or else the variable will be inaccessible.

Functions

Little's functions are much like functions in C. Like variable names, function names cannot begin with an underscore (_).

Each function must be declared with a return type and a formal-parameter list:

int sum(int a, int b)
{
    return (a + b);
}

void is a legal return type for a function that returns no value. Functions cannot be nested.

Function prototypes are allowed, where all but the function body is declared. In a prototype, you can omit any parameter names or use void for an empty parameter list:

void no_op1(void);
void no_op2();
int sum(int, int);

Unlike Perl, when calling a function you must use parentheses around the arguments:

sum(a, b);

Little does a special kind of call called a pattern function call when the function name is capitalized and contains an underscore; these are useful for calling Tcl commands and are described later in the section "Calling Tcl from Little". Normal function names therefore should not be capitalized and contain an underscore.

Parameters are passed by value by default. To pass by reference, you use a & in the declaration and in the function call:

void inc(int &arg)
{
    ++arg;
}

inc(&x);  // inc() can change x

The & only tells Little to pass by reference. It is not a pointer (no pointer arithmetic), it is a reference. You use a reference to give the called function the ability to change the caller's variable.

Only variables can be passed by reference, not elements of arrays, hashes, or structs. This is one significant difference from C. Passing an array, hash, or struct element with & uses copy in/out, not a true reference. The element value is copied into a temp variable and the temp is passed by reference. Then when the function returns, any changes to the temp are copied back into the array, hash, or struct element. In most cases this behaves like call-by-reference and you don't need to worry about it. But if you access the passed element during the function call, by referencing it directly instead of through the formal parameter, then you must be careful:

string array[] = { "one", "two" };
void fn(string &var, string val)
{
    var = val;
    array[0] = "this gets overwritten by the copy-out";
}
void main()
{
    fn(&array[0], "new");
    puts(array[0]);  // will print "new"
}

Instead of passing a reference, you can pass undef like you would a NULL pointer in C. You test for this with the defined() operator:

void inc(int &arg)
{
    if (defined(&arg)) ++arg;
}

inc(undef);   // does nothing
inc(&x);      // increments x

In the example above, you might question why it is defined(&arg) instead of defined(arg). Think of it in terms of C: &arg is like looking at a pointer to see whether it is non-null, and ++arg is like *p += 1.

If you pass undef as a reference and then attempt to access the parameter, a run-time error results, similar to dereferencing a NULL pointer in C.

When accessing a struct argument inside a function, if the struct was passed by reference, the "->" operator must be used instead of ".". This makes it clear to the reader that the struct variable is passed by reference; it is intended to allude to a C pointer even though Little does not have general-purpose pointers.
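
As a minimal sketch of the difference, the following fragment passes the same struct once by reference and once by value. The struct point type and the functions move_right and show are hypothetical names chosen for illustration, and the named-struct declaration form is an assumption:

struct point {
    int x;
    int y;
};

void move_right(struct point &p, int dx)
{
    p->x += dx;     // "->" because p was passed by reference
}

void show(struct point p)
{
    printf("%d,%d\n", p.x, p.y);    // "." because p was passed by value
}

void main()
{
    struct point pt = { 1, 2 };

    move_right(&pt, 5);     // the & matches the & in the declaration
    show(pt);
}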

Variable arguments to functions

Functions can take a variable number of arguments, like printf does. In the function declaration, you use the qualifier "..." in front of the last formal parameter name and omit its type:

void dump(...args)
{
    string s;
    foreach (s in args) puts(s);
}
dump("just one");
dump("but two", "or three", "or more is OK");

Inside the function, args has type array of poly, allowing any number of parameters of any type to be passed.

The main() function

If main() is present, it is called after all of the top-level statements have executed. The main() function may be defined in any of the following ways:

void|int main(void) {}
void|int main(string argv[]) {}
void|int main(int argc, string argv[], string env{string}) {}

The argv array is populated from the script name and any arguments that appear after the name on the Little command line. In this example, argc is 4 and argv[] contains "script.l", "arg1", "arg2", and "arg3":

L script.l arg1 arg2 arg3

The env hash is populated with the environment variables present when Little is invoked. Although you can change this hash, writes to it are not reflected back into the environment. To do that use the putenv library function.

Only a main written in Little is automatically called. You can write a main in Tcl but Little will not call it automatically.

If main is declared to have return type int then any value returned will be the exit value.
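
For instance, a script might report a usage error through its exit status. This is only a sketch; the usage message and the choice of exit values are made up for illustration:

int main(string argv[])
{
    unless (argv[1]) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return (1);     // becomes the exit value
    }
    return (0);
}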

Function pointers

Function pointers are supported, but only as arguments -- you cannot otherwise assign a function pointer to a variable. It is common to first typedef the function-pointer type; here is one for a function that compares two strings:

typedef int str_compar_t(string a, string b);

You can then pass such a compare function as follows:

string    a[];

bubble_sort(a, &unary_compar);

Where the sort function looks like this:

string[] bubble_sort(string a[], str_compar_t &compar)
{
    do {
        ...
        if (compar(a[i], a[i+1]) > 0) { ... }
        ...
    } ...
}

And the compare function looks like this:

int unary_compar(string a, string b)
{
    int al = length(a);
    int bl = length(b);

    if (al < bl) {
        return -1;
    } else if (al > bl) {
        return 1;
    } else {
        return 0;
    }
}
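
One way the elided parts of bubble_sort might be filled in is sketched below. The loop structure and the local names i, swapped, and tmp are assumptions made for illustration, not the original implementation; the sketch relies on the str_compar_t typedef shown earlier:

string[] bubble_sort(string a[], str_compar_t &compar)
{
    int    i, swapped;
    string tmp;

    do {
        swapped = 0;
        for (i = 0; i < length(a) - 1; i++) {
            // Swap neighbors that the compare function says are out of order.
            if (compar(a[i], a[i+1]) > 0) {
                tmp = a[i];
                a[i] = a[i+1];
                a[i+1] = tmp;
                swapped = 1;
            }
        }
    } while (swapped);
    return (a);
}

Because the array is passed by value, the caller would use the returned array, for example a = bubble_sort(a, &unary_compar);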

Regular expressions

Little's regular expression support is based on the PCRE (Perl Compatible Regular Expressions) library http://www.pcre.org. The basics are documented here but for more extensive documentation please see http://www.pcre.org/pcre.txt.

Simple matching

if (s =~ /foo/) { ... }  // true if s contains "foo"
if (s !~ /foo/) { ... }  // false if s contains "foo"

The // matching operator must be used in conjunction with =~ and !~ to tell Little what variable to look at.

If your regular expression contains forward slashes, you must escape them with a backslash, or you can use an alternate syntax where almost any punctuation becomes the delimiter:

if (s =~ m|/path/to/foo|) { ... }
if (s =~ m#/path/to/foo#) { ... }
if (s =~ m{/path/to/foo}) { ... }

In the last case, note that the end delimiter } is different from the start delimiter { and you must escape all uses of either delimiter inside the regular expression.

Simple substitution

x =~ s/foo/bar/;         // replaces first foo with bar in x
x =~ s/foo/bar/g;        // replaces all instances of foo
                         // with bar in x
x =~ s/foo/bar/i;        // does a case-insensitive search

This form also has several alternate syntaxes:

x =~ s|/bin/root|~root|;
x =~ s{foo}{bar};
x =~ s{foo}/bar/;

More complex regular expressions

.                   a single character
\s                  a whitespace character
                    (space, tab, newline, ...)
\S                  non-whitespace character
\d                  a digit (0-9)
\D                  a non-digit
\w                  a word character (a-z, A-Z, 0-9, _)
\W                  a non-word character
[aeiou]             matches a single character in the given set
[^aeiou]            matches a single character not in given set
(foo|bar|baz)       matches any of the alternatives specified

^                   start of string
$                   end of string

Quantifiers can be used to specify how many of the previous thing you want to match on, where "thing" means either a literal character, one of the meta characters listed above, or a group of characters or meta characters in parentheses.

*                   zero or more of the previous thing
+                   one or more of the previous thing
?                   zero or one of the previous thing
{3}                 matches exactly 3 of the previous thing
{3,6}               matches between 3 and 6 of the previous thing
{3,}                matches 3 or more of the previous thing

Some brief examples:

/^\d+/              string starts with one or more digits
/^$/                nothing in the string (length == 0)
/(\d\s){3}/         three digits, each followed by a whitespace
                    character (e.g. "3 4 5 ")
/(a.)+/             one or more repetitions of "a" followed by any
                    single character (e.g. "abacadaf")

// This loop reads from stdin, and prints non-blank lines.
string buf;
while (buf = <stdin>) {
    unless (buf =~ /^$/) puts(buf);
}

Unicode

Both regular expressions and the strings they are matched against can contain Unicode characters or binary data. This example looks for a null byte in a string:

if (s =~ /\0/) puts("has a null");

Parentheses for capturing

As well as grouping, parentheses serve a second purpose. They can be used to capture the results of parts of the regexp match for later use. The results end up in $1, $2 and so on, and these capture variables are available in the substitution part of the operator as well as afterward. You can use up to nine captures ($1 - $9).

// Break an e-mail address into parts.
if (email =~ /([^@]+)@(.+)/) {
    printf("Username is %s\n", $1);
    printf("Hostname is %s\n", $2);
}

// Use $1,$2 in the substitution to swap two words.
str =~ s/(\w+) (\w+)/$2 $1/;

Capturing has a limitation. If you have more than one regexp with captures in an expression, the last one evaluated sets $1, $2, etc.

// This loses email1's captures.
if ((email1 =~ /([^@]+)@(.+)/) && (email2 =~ /([^@]+)@(.+)/)) {
    printf("Username is %s\n", $1);
    printf("Hostname is %s\n", $2);
}

In situations like this, care must be taken because the evaluation order of sub-expressions generally is undefined. But this example is an exception because the && operator always evaluates its operands in order.
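
If you need the captures from both matches, one approach is to copy the first set into ordinary variables before the second match runs. This is a sketch only; user1 and host1 are hypothetical names:

string user1, host1;

if (email1 =~ /([^@]+)@(.+)/) {
    user1 = $1;     // save before the next match overwrites $1, $2
    host1 = $2;
    if (email2 =~ /([^@]+)@(.+)/) {
        printf("First is %s at %s\n", user1, host1);
        printf("Second is %s at %s\n", $1, $2);
    }
}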

Includes

Little has an #include statement like the one in the C pre-processor. A #include can appear anywhere a statement can appear as long as it begins in the first column and is contained entirely on one line:

#include <types.l>
#include "myglobals.l"
void main()
{
    ...
}

Unless given an absolute path, when the file name is in angle brackets (like <types.l>), Little searches these paths, where BIN is where the running tclsh exists:

$BIN/include
/usr/local/include/L
/usr/include/L

When the file name is in quotes (like "myglobals.l"), Little searches only the directory containing the script that did the #include.

Little also remembers which files have been included and will not include a file more than once, allowing you to have #include files that include each other.

Classes

Little has a class abstraction for encapsulating data and functions that operate on that data. Little classes are simpler than full-blown object-oriented programming (there is no inheritance), but they get you most of the way there.

You declare a class like this:

class myclass
{
    ...
}

The name myclass becomes a global type name, allowing you to declare an object of myclass:

myclass obj;

You can declare both variables and functions inside the class. These all must be declared inside one class declaration at the global scope. You cannot have one class declaration that has some of the declarations and another with the rest, and you cannot nest classes inside of functions or other classes.

Inside the class, you can have class variables and instance variables. Class variables are associated with the class and not the individual objects that you allocate, so there is only one copy of each. Instance variables get attached to each object.

class myclass
{
    /* Class variables. */
    public string pub_var;
    private int num = 0;

    /* Instance variables. */
    instance {
        public string inst_var;
        private int n;
    }
    ...
}

All declarations (except the constructors and destructors) must be qualified with either public or private to say whether the name is visible at the global scope or only inside the class.

A class can have one or more constructors and destructors but they are optional. Inside a constructor, the variable self is automatically declared as the object being constructed. A constructor should return self, although it also could return undef to signal an error. A destructor must be declared with self as the first parameter.

constructor myclass_new()
{
    n = num++;
    return (self);
}
destructor myclass_delete(myclass self) {}

If omitted, Little creates a default constructor or destructor named classname_new and classname_delete. Although not shown in this example, you can declare them with any number of parameters, just like regular functions.

A public class member function is visible at the global scope, so its name must not clash with any other global function or variable. A private member function is local to the class.

The first parameter to each public function must be self, the object being operated on. Private functions do not explicitly include self in the parameter list because it is implicitly passed by the compiler.

private void bump_num()
{
    ++n;
}
public int myclass_getnum(myclass self)
{
    bump_num();
    return (n);
}

To create an object, you must call a constructor, because just declaring the variable does not allocate anything:

myclass obj;

obj = myclass_new();

To operate on an object, you call one of its public member functions, passing the object as the first argument:

int n = myclass_getnum(obj);

Little allows you to directly access public class and instance variables from outside the class. To get a class variable, you dereference the class name (you must use ->):

string s = myclass->pub_var;

To get a public instance variable, you dereference the object whose data you want to access:

string s = obj->inst_var;

Once you free an object

myclass_delete(obj);

you must be careful to not use obj again unless you assign a new object to it, or else a run-time error will result.
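
Putting these pieces together, a complete class might look like the following sketch. The class counter and all of its member names are hypothetical and chosen only for illustration:

class counter
{
    /* Class variable: how many objects have been created. */
    private int num = 0;

    /* Instance variable: each object's own count. */
    instance {
        private int n;
    }

    constructor counter_new()
    {
        n = 0;
        ++num;
        return (self);
    }

    public void counter_bump(counter self)
    {
        ++n;
    }

    public int counter_get(counter self)
    {
        return (n);
    }
}

void main()
{
    counter c = counter_new();

    counter_bump(c);
    counter_bump(c);
    printf("%d\n", counter_get(c));
    counter_delete(c);  // default destructor, since none is declared
}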

Working with Tcl/TK

Little is built on top of Tcl: Little functions are compiled down to Tcl procs, Little local variables are just Tcl variables local to the proc, and Little global variables are Tcl globals. Although Little is designed to hide its Tcl underpinnings, sometimes it is useful for Little and Tcl to cooperate.

Mixing Little and Tcl Code

When you invoke Little with a script whose name ends in .l, the script must contain only Little code. If you run a .tcl script, you can mix Little and Tcl:

puts "This is Tcl code"
#lang L
printf("This is Little code\n");
#lang tcl
puts "Back to Tcl code"

You also can run Little code from within Tcl by passing the Little code to the Tcl command named L:

puts "Tcl code again"
L { printf("Called from the Little Tcl command.\n"); }

Calling Tcl from Little

You call a Tcl proc from Little like you would a Little function:

string s = "hello world";
puts(s);

In this example, puts is the Tcl command that outputs its argument to the stdout channel, appending a trailing newline.

If you want argument type checking, you can provide a prototype for the Tcl functions you call. Otherwise, no type checking is performed.

In Tcl, options usually are passed as strings like "-option1" or "-option2". Little has a feature to pass these options more pleasantly:

func(option1:);            // passes "-option1"
func(option2: value, arg); // passes "-option2", value, arg

Without this, you would have to say:

func("-option1");
func("-option2", value, arg);

A similar feature is for passing sub-commands to Tcl commands:

String_length("xyzzy");   // like Tcl's [string length xyzzy]
String_isSpace(s);        // like Tcl's [string is space $s]

Whenever the function name is capitalized and contains an underscore, the sequence of capitalized names after the underscore is converted to (lower case) arguments (although capitalizing the first name after the underscore is optional). This is called a pattern function call.

x = Something_firstSecondThird(a, b);

is like this in Tcl:

set x [something first second third $a $b]

A pattern-function call often is used to call a Tcl proc, but you can call a Little function just as easily, and Little has a special case when the function is named like Myfunc_*:

void Myfunc_*(...args)
{
    poly p;

    printf("Myfunc_%s called with:\n", $1);
    foreach (p in args) printf("%s\n", p);
}
void main()
{
    Myfunc_cmd1(1);
    Myfunc_cmd2(3,4,5);
}

If Myfunc_* is declared, then any call like Myfunc_x becomes a call to Myfunc_* where the string x is put into the local variable $1 inside Myfunc_*. The remaining parameters are handled normally. This gives you a way to handle a collection of sub-commands without having to declare each as a separate Little function. Note that this use of $1 clashes with regular expression captures (described earlier in "Parentheses for capturing"), so if you use both, you should save off $1 before using any such regular expressions.

Note: we are going to change this to not conflict with regular expressions.
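
Until then, saving the sub-command name might look like the following sketch, where cmd is a hypothetical local name and the "get" prefix is made up for illustration:

void Myfunc_*(...args)
{
    string cmd = $1;    // save the sub-command name first

    if (cmd =~ /^get(.+)/) {
        // From here on, $1 holds the regexp capture, not the sub-command.
        printf("getter for %s\n", $1);
    } else {
        printf("other sub-command %s\n", cmd);
    }
}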

If you need to execute arbitrary Tcl code rather than just call a proc, you pass it to Tcl's eval command:

eval("puts {you guessed it, Tcl code again}");

Calling Little from Tcl

Little functions are easily called from Tcl, because a Little function foo compiles down to a Tcl proc named foo in the global namespace. Let's say this is run from a script named script.tcl:

#lang L
int avg(...args)
{
    int i, sum=0;
    unless (length(args)) return (0);
    foreach (i in args) sum += i;
    return (sum/length(args));
}
#lang tcl
set x [avg 4 5 6]
puts "The average is $x"

The Little code defines a proc named avg which the Tcl code then calls.

An exception is that private Little functions are not callable from Tcl.

Variables

Because Little variables are just Tcl variables, you can access Little variables from Tcl code. Here is an example from the Little library:

int size(string path)
{
    int sz;

    if (catch("set sz [file size $path]")) {
        return (-1);
    } else {
        return (sz);
    }
}

In this Tcl code, $path refers to the Little formal parameter path, and the Little local sz is set to the file size. This example also illustrates how you can use Tcl's exception-handling facility to catch an exception raised within some Tcl code.

An exception is that private Little global names are mangled (to make them unique per-file). You can pass the mangled name to Tcl code with the & operator. Here we are passing the name of the private function mycallback to register a Tcl fileevent "readable" handler:

private void mycallback(FILE f) { ... }

fileevent(f, "readable", {&mycallback, f});

Complex variables

Passing scalar variables works because they have the same representation in Little and in Tcl.

Passing complex variables is trickier and is not supported, but if you want to try here is what you need to know. This is subject to change. A Little array is a Tcl list. A Little struct is a Tcl list with the first struct member as the first list element and so on. A Little hash table is a Tcl dict. If a Little variable is deeply nested, so is the Tcl variable.

So long as you understand that and do the appropriate thing in both languages, passing complex variables usually is possible.

Namespaces

You can access Tcl procs and variables in namespaces other than the global namespace by qualifying the name:

extern string ::mynamespace::myvar;

/* Print a bytecode disassembly of the proc "foo". */
puts(::tcl::unsupported::disassemble("proc", "foo"));

/* Print a variable in another namespace. */
puts(::mynamespace::myvar);

Calling Tk

To help call Tk widgets, Little has a widget type that is used with the pattern function calls described above. A widget value behaves like a string except in a pattern function call where it is the name of the widget to call:

widget w = Text_new();
Text_insert(w, "end", "hi!");   // like Tk's $w insert end hi!

Another feature is useful for calling Tk widgets that take the name of a variable whose value is updated when the user changes a widget field. You can use a Little variable like this:

string msg;
ttk::label(".foo", textvariable: &msg);

The ampersand (&) in front of msg alludes to a C pointer but it really passes just the name of the variable. Little does this when the option name ends in "variable", as "textvariable" does in the example above (yes, this is a hack).

Learning more about Tcl/Tk

The Little language distribution includes the Tcl and Tk repositories. Each of those has a doc/ subdirectory containing files whose names start with either an upper-case or a lower-case letter. Ignore the upper-case files ending in .3; those document the internal C API. The lower-case files document the Tcl and Tk APIs exposed to scripts. They are all written in nroff -man markup. To view one:

$ nroff -man file.n | less

For books we like these:

Tcl and the Tk Toolkit 2nd Edition
Effective Tcl/Tk Programming: Writing Better Programs with Tcl and Tk

Manipulating complex structures

Little has built-in operators for turning complex data structures into something else: (expand) and (tcl).

(expand) takes an array of things and pushes them all onto the run-time stack to call a function that expects such a list. It is identical to Tcl's {*}:

void foo(string a, string b, string c);

string v[] = { "one", "two", "three" };

foo((expand)v);     // passes three string arguments to foo

It expands only one level, so if the array contains three hashes instead of three strings, (expand)v passes three hashes to foo. (expand) works with structs too.

If you have this structure:

struct {
    int   i[];
    int   h{string};
} foo = {
    { 0, 1, 2, 3, },
    { "big" => 100, "medium" => 50, "small" => 10 }
};

And you use (expand) when passing these as arguments:

func((expand)foo);

you need a function definition like this:

void func(int nums[], int sizes{string})
{
}

There is no way to recursively expand at this time.

(tcl) is used to pass a single string to a Tcl proc for processing. It adds the Tcl quoting. So

(tcl)foo

is

0 1 2 3 { big 100 medium 50 small 10 }

Another example:

string v[] = { "a b c", "d", "e" };
string arg = (tcl)v;     // arg is "{a b c} d e"

Sometimes you need to assign a group of variables all at once. You can do this by assigning a list of values to a list of variables:

{a, b, c} = {1, 2, 3};

This is more than a short-cut for the three individual assignments. The entire right-hand side gets evaluated first, then the assignment occurs, so you can use this to swap the value of two variables:

{a, b} = {b, a};

If you want to ignore one of the elements in the right-hand list, you can put undef in the corresponding element of left-hand list instead of having to use a dummy variable:

{a, undef, b} = {1, 2, 3};  // a gets 1, b gets 3

If the right-hand side list isn't as long as the left-hand list, the stragglers get undef:

{a, b, c} = {1, 2};  // a gets 1, b gets 2, c gets undef

These composite assignments also work with arrays or structs on the right-hand side:

int dev, inode;
struct stat st;

lstat(file, &st);
{dev, inode} = st;  // pull out first two fields of the stat struct

{first, second} = split(line);  // get first two words in line

HTML with embedded Little

For building web-based applications, Little has a mode where the input can be HTML with embedded Little code (which we call Lhtml). This works in a way similar to PHP. To invoke this mode, the input file must end in .lhtml:

L [options] home.lhtml

All text in home.lhtml is passed through to stdout except that anything between <? and ?> is taken to be one or more Little statements that are replaced by whatever that Little code outputs, and anything between <?= and ?> is taken to be a single Little expression that is replaced by its value. All Little code is compiled at the global scope, so you can include Little variable declarations early in the Lhtml document and reference them later.

Here's an example that iterates over an array of key/value pairs and formats them into a rudimentary table:

<? key_value_pair row, rows[]; ?>
<html>
<body>
<p>This is a table of data</p>

<table>
<? rows = get_data();
   foreach (row in rows) { ?>
     <tr>
         <td><?= row.key ?></td>
         <td><?= row.value ?></td>
     </tr>
<? } ?>
</table>

</body>
</html>

Pre-defined identifiers

__FILE__: A string containing the name of the current source file, or "<stdin>" if the script is read from stdin instead of from a file. Read only.
__LINE__: An int containing the current line number within the script. Read only.
__FUNC__: A string containing the name of the enclosing function. At the top level, this will contain a unique name created internally by the compiler to uniquely identify the current file's top-level code. Read only.
END: An int containing the index of the last character of a non-empty string or the last element of a non-empty array. If the array or string is empty, END is -1. Valid only inside of a string or array subscript. Read only.
stdio_status: A struct of type STATUS (see system()) containing the status of the last system(), `command`, successful waitpid(), or failed spawn().
undef: A poly containing the undef value, where defined(undef) is false. Assigning this to something makes it undefined. However, undef is not guaranteed to have any particular value, so applications should not rely on the value. Read only.
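
A short sketch of how some of these identifiers might be used; the variable names and values are made up for illustration:

void main()
{
    string s = "hello";
    int    a[] = { 10, 20, 30 };

    // Report where we are in the source.
    printf("%s:%d in %s\n", __FILE__, __LINE__, __FUNC__);

    // END is valid only inside a subscript.
    printf("%s\n", s[END]);     // last character of s
    printf("%d\n", a[END]);     // last element of a
}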

Reserved words

The following identifiers are reserved. They cannot be used for variable, function, or type names:

break
case
class
constructor
continue
default
defined
destructor
do
else
END
expand
extern
float
for
foreach
goto
if
instance
int
poly
private
public
return
string
struct
switch
typedef
undef
unless
void
while
widget

Debugging

Function tracing

Little function tracing is controlled with #pragma statements, _attribute clauses in function declarations, command-line options, environment variables, and a run-time API. When a function is marked for tracing, by default its entry and exit are traced to stderr, but you can use your own custom hooks to do anything you want.

A #pragma takes a comma-separated list of attribute assignments:

#pragma fntrace=on
string myfunc(int arg)
{
    return("return value");
}
void main()
{
    myfunc(123);
}

When this program runs, traces go to stderr with a millisecond timestamp, the function name, parameter values, and return value:

1: enter main
1: enter myfunc: '123'
2: exit myfunc: '123' ret 'return value'
3: exit main

The allowable tracing attributes are as follows.

fntrace=on | entry | exit | off: Enable tracing on both function entry and exit, entry only, exit only, or disable tracing altogether.
trace_depth=n: Trace only to a maximum call depth of n.
fnhook=myhook: Use myhook as the trace hook (see below).

A #pragma stays in effect until overridden by another #pragma or by an _attribute clause in a function declaration which provides per-function tracing control:

// don't trace this function
void myfunc2(int arg) _attribute (fntrace=off)
{
}

Tracing also can be controlled with command-line options:

--fntrace=on | entry | exit | off: Enable tracing of all functions on both function entry and exit, entry only, exit only, or disable all tracing. This overrides any #pragma or _attribute clauses in the program.
--trace-out=stdin | stderr | filename | host:port: Send default trace output to stdin, stderr, a file, or a TCP socket.
--trace-files=colon-separated list of glob | /regexp/: Enable tracing of all functions in the given files, specified either as globs or regular expressions. A + before a glob or regexp enables tracing, a - disables, and no + or - is like having a +, except that the leading one is special: if omitted, it means trace exactly what is specified, overriding any #pragmas or _attribute clauses in the code, by first removing all traces and then processing the file list.
--trace-funcs=colon-separated list of glob | /regexp/: Like trace-files but specifies functions.
--fnhook=myhook: Use myhook as the trace hook, overriding any #pragmas in the program.
--trace-script=script.l | Little code: Get the trace hook from a file, or use the given Little code (see below).

Some examples:

# Trace all functions
$ L --fntrace=on myscript.l

# Trace only foo
$ L --trace-funcs=foo myscript.l

# Trace foo in addition to what the source marks for tracing
$ L --trace-funcs=+foo myscript.l

# Trace all functions except foo
$ L --trace-funcs=*:-foo myscript.l
# This does it too
$ L --fntrace=on --trace-funcs=-foo myscript.l

Environment variables also can control tracing and take precedence over the other ways above:

L_TRACE_ALL=on | entry | exit | off
L_TRACE_OUT=stdin | stderr | filename | host:port
L_TRACE_FILES=colon-separated list of glob | /regexp/
L_TRACE_FUNCS=colon-separated list of glob | /regexp/
L_TRACE_DEPTH=n
L_TRACE_HOOK=myhook
L_TRACE_SCRIPT=script.l | <Little code>

Things in L_TRACE_FUNCS are applied after things in L_TRACE_FILES. As with the command-line options, they also can begin with + or - to add or subtract from what is specified elsewhere.

As a short-cut,

L_TRACE=stdin | stderr | filename | host:port

traces all functions and sets the trace output location.

More examples:

# Trace all files except foo.l
L_TRACE_FILES=*:-foo.l L myscript.l

# Trace main() and buggy() in addition to whatever is marked
# for tracing with #pragmas or _attribute clauses in the code.
L_TRACE_FUNCS=+main:buggy L myscript.l

# Trace *only* main() and buggy().
L_TRACE_FUNCS=main:buggy L myscript.l

There also is a run-time API that takes a hash of named arguments analogous to those above:

Ltrace({ "fntrace" => "on",
         "fnhook" => "myhook",
         "trace_depth" => 3,
         "trace_out" => "tracing.out",
         "trace_files" => "foo.l",
         "trace_funcs" => "+main:buggy" });

To use your own tracing function, specify fnhook in any of the above ways. Your hook is called on function entry and exit instead of the default hook. Its prototype must look like this:

void myhook(int pre, poly argv[], poly ret);

where pre is 1 when your hook is called upon function entry and 0 when called upon exit, argv contains the function's arguments (argv[0] is the function name; argv[1] is the first parameter), and ret is the return value (exit hook only; it is undef for entry).

If you use your own hook and then want to go back to the default, set fnhook=def.

To avoid infinite recursion, during the call of a hook, further calls into the hook are disabled. Also, functions defined as hooks, and the Little library functions, are not traced.
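
A minimal custom hook might look like the sketch below. The output format and the function work are made up for illustration, and combining both attributes in one #pragma is assumed to behave as the comma-separated form described above suggests:

void myhook(int pre, poly argv[], poly ret)
{
    if (pre) {
        fprintf(stderr, "-> %s\n", argv[0]);    // function entry
    } else {
        fprintf(stderr, "<- %s\n", argv[0]);    // function exit
    }
}

#pragma fntrace=on, fnhook=myhook
void work(int n)
{
}
void main()
{
    work(1);
}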

The trace-script attribute is a useful way to provide your own hook:

L_TRACE_SCRIPT=my-trace-hook.l  // filename must end in .l
L_TRACE_SCRIPT=<Little code>

In the latter case, the Little code gets wrapped in a function like this:

void L_fn_hook(int pre, poly av[], poly ret)
{
    ...code from L_TRACE_SCRIPT...
}

and L_fn_hook is used as the default trace hook.

As one example of where this is useful: say you are trying to find whether the function foo is ever called with the first argument of 123, and if so, to print all the arguments:

L_TRACE_FUNCS=foo \
    L_TRACE_SCRIPT='if (av[1]==123) puts(av)' L myscript.l

Built-in and library functions

Little has built-in functions and a set of library functions modeled after the standard C library and Perl.

string <>

string <FILE f>

Get the next line from a FILE handle and return it, or return undef for EOF or errors. Trailing newlines are removed. If a file handle is specified, it is not closed by this function.

The form without a file handle

while (buf = <>) {
    ...
}

means

unless (argv[1]) {
    while (buf = <stdin>) {
        ...
    }
} else for (i = 1; argv[i]; i++) {
    unless (f = fopen(argv[i], "r")) {
        perror(argv[i]);
        continue;
    }
    while (buf = <f>) {
        ...
    }
}

A trivial grep implementation:

void
main(int ac, string argv[])
{
    string  regexp = argv[1];
    string  buf;

    unless (regexp) die("usage: grep regexp [files]");
    undef(argv[1]); // left shift down the args
    while (buf = <>) {
        if (buf =~ m|${regexp}|) puts(buf);
    }
}

string `command`

Execute the command (the string enclosed within back-ticks) and substitute its stdout as the value of the expression. Any output to stderr is passed through to the calling application's stderr and is not considered an error. The command is executed using the Tcl exec command which understands I/O re-direction and pipes, except that command is split into arguments using Bourne shell style quoting instead of Tcl quoting (see shsplit). The command string is interpolated. Backslash escapes $, `, and \, \<newline> is ignored, but otherwise backslash is literally interpreted. An embedded newline is an error. If the command cannot be run, undef is returned. The global variable stdio_status (see system()) contains the command's exit status.
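
For example, the following sketch captures the output of a command and checks whether it could be run. The command itself is arbitrary and chosen only for illustration:

string out = `ls -l /tmp`;

if (defined(out)) {
    puts(out);
} else {
    puts("the command could not be run");
}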

int abs(int val)

float abs(float val)

Return the absolute value of the argument.

void assert(int condition)

Print an error and exit with status 1 if condition is false. The filename, line number, and text of the condition are printed.

string basename(string path)

Return the file portion of a path name.

string caller(int frame)

Return the name of a calling function, or the caller's caller, etc. Use a frame of 0 to get the caller, 1 to get the caller's caller, and so on.

int chdir(string dir)

Change directory to dir. Return 0 on success, -1 on error.

int chmod(string path, string permissions)

Not available on Windows. Change the mode of the file or directory named by path. Permissions can be the octal code that chmod(1) uses, or symbolic attributes that chmod(1) uses of the form [ugo]?[[+-=][rwxst],[...]], where multiple symbolic attributes can be separated by commas (example: u+s,go-rw sets the setuid bit for user and removes read and write permissions for group and other). A simplified ls-style string, of the form rwxrwxrwx (must be 9 characters), is also supported (example: rwxr-xr-t is equivalent to 01755). Return 0 on success, -1 on error.

int chown(string owner, string group, string path)

Not available on Windows. Change the file ownership of the file or directory named by path. If either owner or group is an empty string, the attribute will not be modified. Return 0 on success, -1 on error.

int cpus()

Return the number of processors (if known). Defaults to 1.

void die(string fmt, ...args)

Output a printf-like message to stderr and exit 1. If fmt does not end with a newline, append " in <filename> at line <linenum>.\n"

string dirname(string path)

Return the directory portion of a pathname.

int eq(compositeType a, compositeType b)

Compare two arrays, hashes, structs, or lists for equality. The two arguments are compared recursively element by element.

int exists(string path)

Return 1 if the given path exists or 0 if it does not exist.

int fclose(FILE f)

Close an open FILE handle. Return 0 on success, -1 on error.

FILE fopen(string path, string mode)

Open a file. The mode string indicates how the file will be accessed.

"r"	Open the file for reading only; the file must already exist. This is the default value if access is not specified.
"r+"	Open the file for both reading and writing; the file must already exist.
"w"	Open the file for writing only. Truncate it if it exists. If it doesn't exist, create a new file.
"w+"	Open the file for reading and writing. Truncate it if it exists. If it doesn't exist, create a new file.
"a"	Open the file for writing only. The file must already exist, and the file is positioned so that new data is appended to the file.
"a+"	Open the file for reading and writing. If the file doesn't exist, create a new empty file. Set the initial access position to the end of the file.
"v"	This mode can be added to any of the above and causes open errors to be written to stderr.

Return a FILE handle on success and undef on error.
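A minimal sketch of writing a file and reading it back, using only functions documented in this section plus the <f> line-read form that the shapes.l example uses. The path /tmp/little-demo.txt is a hypothetical scratch file chosen for illustration, and combining "w" with "v" as "wv" is an assumption based on the description of the "v" mode above.

```little
FILE    f;
string  line;
string  path = "/tmp/little-demo.txt";  // hypothetical scratch file

// "w" truncates or creates the file; "v" asks for open errors on stderr.
unless (defined(f = fopen(path, "wv"))) exit(1);
fprintf(f, "first line\n");
fprintf(f, "second line\n");
fclose(f);

// Re-open read-only and echo each line.
unless (defined(f = fopen(path, "r"))) die("cannot reopen %s", path);
while (defined(line = <f>)) {
        printf("read: %s\n", line);
}
fclose(f);
```

Checking the result with defined() rather than relying on truthiness keeps the error test explicit, since fopen() signals failure with undef.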

int fprintf(FILE f, string fmt, ...args)

Format and print a string to the given FILE handle. The FILE handles stdin, stdout, and stderr are pre-defined.

Return 0 on success, -1 on error.

int Fprintf(string filename, string fmt, ...args)

Like fprintf but write to the given file name. The file is overwritten if it already exists. Return 0 on success, -1 on error.

string ftype(string path)

Return the type of file at the given path. Type can be directory, file, character, block, fifo, symlink or socket. Return undef on error.

string[] getdir(string dir)

string[] getdir(string dir, string pattern)

Return the files in the given directory, as a sorted string array. Optionally filter the list by pattern which is a glob and may contain the following special characters:

?	Matches any single character.
*	Matches any sequence of zero or more characters.
[chars]	Matches any single character in chars. If chars contains a sequence of the form a-b then any character between a and b (inclusive) will match.
\x	Matches the character x.
{a,b,...}	Matches any of the strings a, b, etc.

If the first character in a pattern is "~" then it refers to the home directory of the user whose name follows the "~". If the "~" is followed immediately by "/" then the value of the HOME environment variable is used.

dirent[] getdirx(string dir)

Return the files in the given directory as an array of structs, with the directories sorted and coming first in the array followed by the sorted file names. Return undef on error. The dirent struct is defined as follows:

typedef struct dirent {
    string  name;
    string  type;    // "file", "directory", "other"
    int     hidden;
} dirent;
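As a rough illustration, the following sketch lists the non-hidden entries of the current directory. It assumes the dirent typedef shown above is available to scripts without being redeclared; the variable names are made up for the example.

```little
dirent  d;
dirent  entries[] = getdirx(".");

unless (defined(entries)) die("cannot read the current directory");
foreach (d in entries) {
        if (d.hidden) continue;         // skip hidden entries
        printf("%-10s %s\n", d.type, d.name);
}
```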

string getenv(string varname)

Return the value of an environment variable if it exists and is of non-zero length, or return undef if it has zero length or does not exist. This allows you to say putenv("VAR=") and have getenv("VAR") return undef.
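The interaction with putenv() described above can be sketched as follows. DEMO_VAR is a hypothetical variable name used only for illustration.

```little
// Set the variable, read it back, then clear it.
putenv("DEMO_VAR=%d", getpid());
printf("DEMO_VAR=%s\n", getenv("DEMO_VAR"));

putenv("DEMO_VAR=");
unless (defined(getenv("DEMO_VAR"))) {
        printf("DEMO_VAR is empty, so getenv() returned undef\n");
}
```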

string getopt(string av[], string opts, string longopts[])

Parse command-line arguments. This version (from BitKeeper, same semantics) recognizes the following types of short and long options in the av array:

-            leaves it and stops processing options
             (for indicating stdin)
--           end of options
-a
-abcd
-r <arg>
-r<arg>
-abcr <arg>  same as -a -b -c -r <arg>
-abcr<arg>   same as -a -b -c -r<arg>
--long
--long:<arg>
--long=<arg>
--long <arg>

Short options are all specified in a single opts string as follows:

d           boolean option          -d
d:          required arg            -dARG or -d ARG
d;          required arg no space   -dARG
d|          optional arg no space   -dARG or -d

Long options are specified in the longopts array (one option per element) as follows:

long        boolean option          --long
long:       required arg            --long=ARG or --long ARG
long;       required arg no space   --long=ARG
long|       optional arg no space   --long=ARG or --long

The function returns the name of the next recognized option, or undef if no more options exist. The global variable optind is set to the next av[] index to process. If the option takes an argument, the global variable optarg is set to that argument; if the option has no arg, optarg is set to undef.

If an unrecognized option is seen, the empty string ("") is returned and the global variable optopt is set to the name of the offending option (unless the option is a long option).

This example shows a typical usage of both short and long options.

int     debug_level, verbose;
string  c, lopts[] = { "verbose" };

while (c = getopt(av, "d|v", lopts)) {
    switch (c) {
        case "d":
            if (optarg) debug_level = (int)optarg;
            break;
        case "v":
        case "verbose":
            verbose = 1;
            break;
        default:
            die("unrecognized option ${optopt}");
    }
}

int getpid()

Return the caller's process id.

void here()

Output a message like "myfunc() in script.l:86" to stderr which contains the file name, line number, and currently executing function name. Typically used for debugging.

void insert(type &array[], int index, type elem | type elems[], ...)

Insert one or more elements into array before the element specified by index. If index is 0, the elements are inserted at the beginning of the array; this is what unshift() does. If index is -1, or is greater than or equal to the number of elements in the array, the elements are inserted at the end; this is what push() does. You can insert single elements or arrays of elements.
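A short sketch of the three index cases. The array contents are arbitrary, and the comments state what the description above implies rather than output captured from a run.

```little
string  a[] = { "a", "d" };

insert(&a, 1, "b", "c");        // before index 1: a b c d
insert(&a, 0, "start");         // like unshift(&a, "start")
insert(&a, -1, "end");          // like push(&a, "end")
printf("%s\n", join(" ", a));
```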

int isalpha(string s)

Return 1 if the given string contains only alphabetic characters, else return 0. An empty string also returns 0.

int isalnum(string s)

Return 1 if the given string contains only alphabetic or digit characters, else return 0. An empty string also returns 0.

int isdigit(string s)

Return 1 if the given string contains only digit characters, else return 0. An empty string also returns 0.

int isdir(string path)

Return 1 if the given path exists and is a directory, else return 0.

int islink(string path)

Return 1 if the given path exists and is a link, else return 0.

int islower(string s)

Return 1 if the given string contains only lower case alphabetic characters, else return 0. An empty string also returns 0.

int isreg(string path)

Return 1 if the given path exists and is a regular file, else return 0.

int isspace(string buf)

Return 1 if all characters in the argument are space characters, else return 0. An empty string also returns 0.

int isupper(string s)

Return 1 if the given string contains only upper-case alphabetic characters, else return 0. An empty string also returns 0.

int iswordchar(string s)

Return 1 if the given string contains only alphanumeric or connector punctuation characters (such as underscore), else return 0. An empty string also returns 0.

string join(string sep, type array[])

Convert an array into a string by concatenating its elements, with sep inserted between each adjacent pair.

keyType[] keys(valType hash{keyType})

Return an array containing the keys of a given hash. Note that the return type depends on the argument type.

string lc(string s)

Return a copy of the string that is in all lower case.

int length(string s)

Return the number of characters in the given string. Returns 0 if the argument is undef.

int length(type array[])

Return the number of elements in the given array. Returns 0 if the argument is undef.

for (i = 0; i < length(array); i++)

int length(valType hash{keyType})

Return the number of key/value pairs in the given hash. Returns 0 if the argument is undef.

int link(string sourcePath, string targetPath)

Create a hard link from sourcePath to targetPath. Return 0 on success, -1 on error.

int lstat(string path, struct stat &buf)

Call lstat(2) on path and place the information in buf. Return 0 on success, -1 on error. The struct stat type is defined as follows:

struct stat {
    int     st_dev;
    int     st_ino;
    int     st_mode;
    int     st_nlink;
    int     st_uid;
    int     st_gid;
    int     st_size;
    int     st_atime;
    int     st_mtime;
    int     st_ctime;
    string  st_type;
};

where st_type is a string giving the type of the file, which will be one of file, directory, characterSpecial, blockSpecial, fifo, link, or socket.
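A minimal sketch of a typical call, following the pattern used in the shapes.l example. The path is an arbitrary choice for illustration; any existing path will do.

```little
struct  stat sb;
string  path = "/etc/passwd";   // illustrative path only

if (lstat(path, &sb)) {
        perror(path);           // report why the call failed
        exit(1);
}
printf("%s: type %s, %d bytes\n", path, sb.st_type, sb.st_size);
```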

int|float max(int|float, int|float)

int|float min(int|float, int|float)

Return the maximum or minimum of two numbers. The return type is float if either of the arguments is a float, otherwise the return type is int.

int milli()

Return the number of milliseconds since the currently executing script started.

void milli_reset()

Reset the internal state for milli() to begin counting from 0 again.

int mkdir(string path)

Create a directory at the given path. This creates all non-existing parent directories. The directories are created with mode 0775 (rwxrwxr-x). Return 0 on success, -1 on error.

int mtime(string path)

Return the modified time of path, or 0 to indicate error.

string normalize(string path)

Return a normalized version of path. The pathname will be an absolute path with all "../" and "./" removed.

int ord(string c)

Return the numeric value of the encoding (ASCII, Unicode) of the first character of c, or -1 on error or if c is the empty string.

int pclose(FILE f)

int pclose(FILE f, STATUS &s)

Close an open pipe created by popen(). Return 0 on success, -1 on error. See system() for details of the STATUS struct.

void perror()

void perror(string message)

Print the error message corresponding to the last error from various Little library calls. If message is not undef, it is prepended to the error string with a ": ".

type pop(type &array[])

Remove and return the element at the end of array. Return undef if the array is already empty.

FILE popen(string cmd | argv[], string mode)

FILE popen(string cmd | argv[], string mode, void &stderr_callback(string cmd, FILE f))

Open a file handle to a process running the command specified in argv[] or cmd. In the cmd case, the command is split into arguments respecting Bourne shell style quoting. The returned FILE handle may be used to write to the command's input pipe or read from its output pipe, depending on the value of mode. If write-only access is used ("w"), then standard output for the pipeline is directed to the current standard output unless overridden by the command. If read-only access is used ("r"), standard input for the pipeline is taken from the current standard input unless overridden by the command.

The optional third argument is a callback function that is invoked by Tcl's event loop when the command's stderr pipe has data available to be read. The second argument of the callback is a non-blocking FILE for the read end of this pipe. Care must be taken to run the event loop often enough that the callback drains the pipe before it fills; otherwise the command can block on stderr and deadlock. In console apps, this may mean calling Tcl's update() function. The pclose() function also invokes the callback, so it is guaranteed to be called at least once.

If the third argument to popen is undef, the command's stderr output is ignored. Otherwise, unless re-directed by the command, any stderr output is passed through to the calling script's stderr and is not considered an error.

If Tk is being used, there is a default callback that pops up a window with any output the command writes to stderr.

Return the FILE handle on success, or undef on error.
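The read-only case can be sketched as below, without a stderr callback. The command "ls -l" is only an example, and the <f> line-read form is the one used in shapes.l.

```little
FILE    f;
string  line;
STATUS  st;

unless (defined(f = popen("ls -l", "r"))) die("popen failed");
while (defined(line = <f>)) {
        printf("%s\n", line);
}
// pclose() returns 0 on success and fills in the STATUS struct.
if (pclose(f, &st)) warn("pclose reported an error");
```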

int printf(string fmt, ...args)

Format arguments and print to stdout, as in printf(3). Return 0 on success, -1 on error.

void push(type &array[], type element1 | type elements1[], ...)

Push one or more elements onto the end of array. You can insert single elements or arrays of elements.

string putenv(string var_fmt, ...args)

Set an environment variable, overwriting any pre-existing value, using printf-like arguments:

putenv("VAR=val");
putenv("MYPID=%d", getpid());

Return the new value or undef if var_fmt contains no "=".

int read(FILE f, string &buffer)

int read(FILE f, string &buffer, int numBytes)

Read at most numBytes from the given FILE handle into the buffer, or read the entire file if numBytes == -1 or is omitted. Return the number of bytes read, or -1 on error or EOF.
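A small sketch of reading an entire file in one call by omitting numBytes. The path is an arbitrary example.

```little
FILE    f;
string  buf;
int     n;

unless (defined(f = fopen("/etc/passwd", "r"))) die("cannot open file");
n = read(f, &buf);              // no numBytes: read the whole file
fclose(f);
if (n >= 0) printf("read %d bytes\n", n);
```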

int rename(string oldpath, string newpath)

Rename a file. Return 0 on success, -1 on error.

string require(string packageName)

Find and load the given Tcl package packageName. Return the version string of the package loaded on success, and undef on error.

int rmdir(string dir)

Delete the given directory. Return 0 on success, -1 on error.

type shift(type &array[])

Remove and return the element at the beginning of array. Return undef if the array is already empty.

string[] shsplit(string cmd)

Split the string the same way as the Bourne shell and return the array.

int size(string path)

Return the size, in bytes, of the named file path, or -1 on error.

void sleep(float seconds)

Sleep for seconds seconds. Note that seconds can be fractional to get sub-second sleeps.

type[] sort(type[] array)

type[] sort([decreasing: | increasing:], type[] array)

type[] sort([integer: | real: | ascii:], type[] array)

type[] sort(command: &compar, type[] array)

Sort the array array and return a new array of sorted elements. The first variation sorts the elements into ascending order, and does an integer, real, or ascii sort based on the type of array. The second and third variations show optional arguments that can be passed to change this behavior. The last variation shows how a custom compare function can be specified. The function must take two arguments of the array's element type and return -1 if the first comes before the second in the sort order, +1 if the first comes after the second, and 0 if the two are equal.
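The custom-compare variation might look like the following sketch, which orders strings by length. The function name bylen and the sample data are invented for the example; the call form follows the command: signature shown above.

```little
// Compare two strings by length, as the sort contract requires.
int
bylen(string a, string b)
{
        if (length(a) < length(b)) return (-1);
        if (length(a) > length(b)) return (1);
        return (0);
}

string  w;
string  words[] = { "ccc", "a", "bb" };

foreach (w in sort(command: &bylen, words)) {
        printf("%s\n", w);
}
```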

int spawn(string cmd)

int spawn(string cmd, STATUS &s)

int spawn(string argv[])

int spawn(string argv[], STATUS &s)

int spawn(cmd | argv[], FILE in, FILE out, FILE err)

int spawn(cmd | argv[], FILE in, FILE out, FILE err, STATUS &s)

int spawn(cmd | argv[], string in, FILE out, FILE err)

int spawn(cmd | argv[], string in, FILE out, FILE err, STATUS &s)

int spawn(cmd | argv[], string[] in, FILE out, FILE err)

int spawn(cmd | argv[], string[] in, FILE out, FILE err, STATUS &s)

int spawn(cmd | argv[], "input", "${outf}", "errors")

int spawn(cmd | argv[], "input", "${outf}", "errors", STATUS &s)

Execute a command in the background. All forms return either a process id or undef to indicate an error. In the error case the STATUS argument is set; otherwise it remains untouched and the status can be reaped by waitpid().

See the system() function for information about the arguments.

See the waitpid() function for information about waiting on the child.

string[] split(string s)

string[] split(/regexp/, string s)

string[] split(/regexp/, string s, int limit)

Split a string into substrings. In the first variation, the string is split on whitespace, and any leading or trailing whitespace does not produce a null field in the result. This is useful when you just want to get at the things delimited by the whitespace:

split("a b c");    // returns {"a", "b", "c"}
split(" x y z ");  // returns {"x", "y", "z"}

In the second variation, the string is split using a regular expression as the delimiter:

split(/,/, "we,are,commas");  // returns {"we", "are", "commas"}
split(/xxx/, "AxxxBxxxC");    // returns {"A", "B", "C"}
split(/[;,]/, "1;10,20");     // returns {"1", "10", "20"}

When a delimiter is used, split returns a null first field if the string begins with the delimiter, but if the string ends with the delimiter no trailing null field is returned. This provides compatibility with Perl's split:

split(/xx/, "xxAxxBxxCxx");  // returns {"", "A", "B", "C"}

You can avoid the leading null fields in the result if you put a t after the regular expression (to tell it to "trim" the result):

split(/xx/t, "xxAxxBxxCxx");  // returns {"A", "B", "C"}

If a limit argument is given, at most limit substrings are returned (limit <= 0 means no limit):

split(/ /, "a b c d e f", 3);  // returns {"a", "b", "c d e f"}

To allow splitting on variables or function calls that start with m, the alternate regular expression delimiter syntax is restricted:

split(m|/|, pathname);  // "|" and most other punctuation -- ok
                        // but ( and ) as delimiters -- error
split(m);               // splits the variable "m" -- ok
split(m(arg));          // splits the result of m(arg) -- ok

Both regular expressions and the strings to split can contain Unicode characters or binary data as well as ASCII:

split(/\0/, string_with_nulls);  // split on null
split(/ש/, "זו השפה שלנו");      // unicode regexp and string

string sprintf(string fmt, ...args)

Format arguments and return a formatted string like sprintf(3). Return undef on error.

int stat(string path, struct stat &buf)

Call stat(2) on path and place the information in buf. Return 0 on success, -1 on error. See the lstat() command for the definition of struct stat.

int strchr(string s, string c)

Return the index of the first occurrence of c in s, or -1 if c is not found.

int strlen(string s)

Return the string length.

int strrchr(string s, string c)

Return the index of the last occurrence of c in s, or -1 if c is not found.

int symlink(string sourcePath, string targetPath)

Create a symbolic link from sourcePath to targetPath. Return 0 on success, -1 on failure.

int system(string cmd)

int system(string cmd, STATUS &s)

int system(string argv[])

int system(string argv[], STATUS &s)

int system(cmd | argv[], string in, string &out, string &err)

int system(cmd | argv[], string in, string &out, string &err, STATUS &s)

int system(cmd | argv[], string[] in, string[] &out, string[] &err)

int system(cmd | argv[], string[] in, string[] &out, string[] &err, STATUS &s)

int system(cmd | argv[], FILE in, FILE out, FILE err)

int system(cmd | argv[], FILE in, FILE out, FILE err, STATUS &s)

int system(cmd | argv[], "input", "${outf}", "errors")

int system(cmd | argv[], "input", "${outf}", "errors", STATUS &s)

Execute a command and wait for it to finish (see spawn() for the asynchronous version).

The command is executed using Tcl's exec, which understands I/O re-direction and pipes, except that the command is split into arguments using Bourne shell style quoting instead of Tcl quoting (see shsplit).

If the number of arguments is one or two, then the existing stdin, stdout, stderr channels are used.

If the number of arguments is four or five, then the second, third, and fourth arguments specify stdin, stdout, stderr, respectively. Each can be a string variable or string array (a reference is required for stdout and stderr), a FILE variable which must be an open file handle, or a string literal which is interpreted as a file path name. If you want to specify a file name from a variable, use the string literal "${filename}". It is an error to both re-direct input/output in the command string and to specify the corresponding input/output argument; in such a case, the command is not run and undef is returned.

If stdout or stderr are sent to strings or string arrays and no output is produced, then out or err are undef upon return.

The optional last argument is a reference to the following structure:

typedef struct {
    string  argv[]; // args passed in
    string  path;   // if defined, this is the path to the exe
                    // if undef, the executable was not found
    int     exit;   // if defined, the process exited with <exit>
    int     signal; // if defined, the process was killed
                    // by <signal>
} STATUS;

The global variable stdio_status is also set. If the command is a pipeline and a process in that pipeline fails, the returned status is for the first process that failed.

If there is an error executing the command, or if the process is killed by a signal, undef is returned; otherwise, the return value is the process exit status (for a pipeline, the status of the first process that exited with error).

Examples:

// No futzing with input/output, uses stdin/out/err.
ret = system(cmd);

// Same thing but no quoting issues, like execve(2).
ret = system(argv);

// Get detailed status.
unless (defined(ret = system(cmd, &status))) {
    unless (defined(status.path)) {
        // status.path is undef here, so report the command name.
        warn("%s not found or bad perm\n", status.argv[0]);
    }
    if (defined(status.signal)) {
        warn("%s killed with %d\n",
            status.argv[0], status.signal);
    }
}

// Taking input and sending output to string arrays.
// The in_vec elements should not contain newlines and
// the out/err_vec elements will not contain newlines.
string in_vec[], out_vec[], err_vec[];
ret = system(cmd, in_vec, &out_vec, &err_vec);

// Taking input and sending output to files.
string outf = sprintf("/tmp/out%d", getpid());
ret = system(cmd, "/etc/passwd", "${outf}", "/tmp/errors");

// Using open file handles.
FILE in = popen("/some/producer/process", "r");
FILE out = popen("/some/consumer/process", "w");
FILE err = popen("cat > /dev/tty", "w");
ret = system(argv, in, out, err, &status);
// error handling here
pclose(in, &status);
// error handling here
...

// Mixing and matching.
ret = system(argv, buf, &out, "/tmp/errors", &status);

string trim(string s)

Return a copy of the string that has been trimmed of any leading and trailing whitespace (spaces, tabs, newlines, and carriage returns).

string typeof(<variable>)

Return the simple type name of the given variable. This is one of "int", "string", "poly", "widget", "array", "hash", or "struct"; or if the variable's type is a typedef, the typedef name; or if the variable has a class type, the class name; or if the variable is really a function name, "function".

string uc(string s)

Return a copy of the string that is in all upper case.

void undef(<array>[index])

void undef(<string>[index])

void undef(<hash>{index})

void undef(<variable>)

In the first three forms, remove an array, string, or hash element from the specified variable. In the last form, set the variable to undef. When setting a hash or array to undef, all of its old elements are freed (unless they were shared with some other variable).

int unlink(string path)

Delete the named file. Return 0 on success, -1 on failure.

void unshift(type &array[], type element1 | type elements1[], ...)

Add one or more elements onto the beginning of array. You can insert single elements or arrays of elements.

int waitpid(int pid, STATUS &status, int nohang)

Given a pid returned by spawn(), wait for it and place the exit information in the (optional) STATUS struct. If pid is -1, wait for any child process and return the pid of one that has exited, or return -1 if no more child processes exist; otherwise return pid, or -1 on error. If nohang is non-zero, do not block: return -1 if the process does not exist or on any other error, return 0 if the process exists and has not exited, and return pid and update status if the process has exited.
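A minimal sketch of pairing spawn() with a blocking waitpid(). The command "sleep 1" is only a stand-in for real background work.

```little
int     pid;
STATUS  st;

unless (defined(pid = spawn("sleep 1"))) die("spawn failed");

// ... other work can happen here while the child runs ...

// nohang is 0, so this blocks until the child has exited.
if ((waitpid(pid, &st, 0) == pid) && defined(st.exit)) {
        printf("child %d exited with status %d\n", pid, st.exit);
}
```

To poll instead of blocking, pass a non-zero nohang and treat a return value of 0 as "still running".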

int wait(STATUS &status)

Same as waitpid(-1, &status, 0).

void warn(string fmt, ...args)

Output a printf-like message to stderr. If fmt does not end with a newline, append " in <filename> at line <linenum>.\n"

int write(FILE f, string buffer, int numBytes)

Write at most numBytes to the given FILE handle from the buffer. Return the number of bytes written, or -1 on error.

Example code

shapes.l

This is something we hand to our customers to see what "shape" their source trees have.

#!/usr/bin/bk tclsh
/*
 * Determine the files/size of each directory under a bk repository.
 * Optionally transform the directory names to obscure their structure.
 *
 * The idea is that you can run this script like this:
 *
 *   bk little shapes.l <path_to_root_of_repo>
 *
 * and get a list of directories with their sizes and number of files in
 * each of them. Save the output, then run it again with -o:
 *
 *   bk little shapes.l -o <path_to_root_of_repo>
 *
 * and send the output to BitMover.
 *
 * The names of all the directories will be rot13'd and sorted (since
 * sort is a destructive transform, it makes it harder to reverse the
 * rot13). This is a weak form of obfuscation, but it lets BitMover
 * work with the directory structure without inadvertently learning
 * about the client's projects.
 *
 * The line numbers at the beginning are there so that we can talk about a
 * certain directory by number without BitMover knowing its name.
 *
 *  ob@dirac.bitmover.com|src/contrib/shapes.l|20100723224240|23777
 *
 */

string      obscure(string s);      // pathname to no-IP-leak pathname
string      pp(float n);            // pretty print a number, like df -h
string      rot13(string str);      // if you don't know, you don't know

int
main(int ac, string[] av)
{
        int         size, files, maxlen, n;
        int         do_obscure = 0;
        string      fn, root, dir, d, ob;
        FILE        f;
        struct      stat sb;
        struct      dirstats {
            int     files;
            int     size;
            int     total_files;
            int     total_size;
        } dirs{string};

        dir = ".";
        if (ac == 3) {
                if (av[1] == "-o") {
                        do_obscure = 1;
                } else {
                        fprintf(stderr, "usage: %s [-o] [<dir>]\n", av[0]);
                        exit(1);
                }
                dir = av[2];
        } else if (ac == 2) {
                if (av[1] == "-o") {
                        do_obscure = 1;
                        dir = ".";
                } else {
                        dir = av[1];
                }
        } else if (ac > 3) {
                fprintf(stderr, "usage: %s [-o] [<dir>]\n", av[0]);
                exit(1);
        }
        if (chdir(dir)) {
                fprintf(stderr, "Could not chdir to %s\n", dir);
                exit(1);
        }
        root = `bk root`;
        if (root == "") {
                fprintf(stderr, "Must be run in a BitKeeper repository\n");
                exit(1);
        }
        if (chdir(root)) {
                fprintf(stderr, "Could not chdir to %s\n", root);
                exit(1);
        }

        size = 0;
        files = 0;
        f = popen("bk sfiles", "r");
        while (defined(fn = <f>)) {
                dir = dirname(fn);
                if (dir == "SCCS") {
                        dir = ".";
                } else {
                        // remove SCCS and obscure
                        dir = dirname(dir);
                }
                unless (defined(dirs{dir})) dirs{dir} = {0, 0, 0, 0};
                if (maxlen < length(dir)) maxlen = length(dir);
                dirs{dir}.files++;
                files++;
                if (lstat(fn, &sb)) {
                        fprintf(stderr, "Could not stat %s\n", fn);
                        continue;
                }
                dirs{dir}.size += sb.st_size;
                size += sb.st_size;
                // add our size/file count to each parent dir
                for (d = dirname(dir); d != "."; d = dirname(d)) {
                        unless (defined(dirs{d})) dirs{d} = {0,0,0,0};
                        dirs{d}.total_size += sb.st_size;
                        dirs{d}.total_files++;
                }
                dirs{"."}.total_size += sb.st_size;
                dirs{"."}.total_files++;
        }
        close(f);
        // now print it
        printf("  N   | %-*s | FILES | SIZE    | T_FILES | T_SIZE \n",
            maxlen, "DIRS");
        n = 1;
        foreach (dir in sort(keys(dirs))) {
                ob = dir;
                if (do_obscure) {
                        ob = obscure(dir);
                }
                if (dirs{dir}.total_files > 0) {
                        printf("%5d | %-*s | %5d | %7s | %7s | %7s\n",
                            n, maxlen,
                            ob, dirs{dir}.files, pp(dirs{dir}.size),
                            dirs{dir}.total_files, pp(dirs{dir}.total_size));
                } else {
                        printf("%5d | %-*s | %5d | %7s | %7s | %7s\n",
                            n, maxlen,
                            ob, dirs{dir}.files, pp(dirs{dir}.size),
                            "","");
                }
                n++;
        }
        printf("TOTAL: %u files, %s\n",
            dirs{"."}.total_files, pp(dirs{"."}.total_size));
        return (0);
}

/* Pretty print a number */
string
pp(float n)
{
        int         i;
        float       num = (float)n;
        string      sizes[] = {"b", "K", "M", "G", "T"};

        for (i = 0; i < 5; i++) {
                if (num < 1024.0) return (sprintf("%3.2f%s", num, sizes[i]));
                num /= 1024.0;
        }
}

/* Table for rot13 function below */
string rot13_table{string} = {
        "A" => "N", "B" => "O", "C" => "P", "D" => "Q", "E" => "R", "F" => "S",
        "G" => "T", "H" => "U", "I" => "V", "J" => "W", "K" => "X", "L" => "Y",
        "M" => "Z", "N" => "A", "O" => "B", "P" => "C", "Q" => "D", "R" => "E",
        "S" => "F", "T" => "G", "U" => "H", "V" => "I", "W" => "J", "X" => "K",
        "Y" => "L", "Z" => "M", "a" => "n", "b" => "o", "c" => "p", "d" => "q",
        "e" => "r", "f" => "s", "g" => "t", "h" => "u", "i" => "v", "j" => "w",
        "k" => "x", "l" => "y", "m" => "z", "n" => "a", "o" => "b", "p" => "c",
        "q" => "d", "r" => "e", "s" => "f", "t" => "g", "u" => "h", "v" => "i",
        "w" => "j", "x" => "k", "y" => "l", "z" => "m",
};

/* rot13 a string */
string
rot13(string str)
{
        int         i;
        string      ret = "";

        for (i = 0; i < length(str); i++) {
                ret .= rot13_table{str[i]};
        }
        return (ret);
}

/*
 * Print an obscured version of the string
 * rot13 + sort
 */
string
obscure(string s)
{
        string      p;
        string[]    ret;
        string[]    sp = split(s, "/");

        foreach (p in sp) {
                push(&ret, rot13(join("", lsort(split(p, "")))));
        }
        return (join("/", ret));
}

photos.l

#!/usr/bin/bk tclsh
/*
 * A rewrite of Eric Pop's fine igal program in Little.  I talked to Eric and he
 * really doesn't want anything to do with supporting igal or copycats so
 * while credit here is cool, don't stick his name on the web pages.
 * I completely understand that, people still ask me about webroff and
 * lmbench.
 *
 * First version by Larry McVoy Sun Dec 19 2010.  Public Domain.
 *
 * usage: photos [options] [dir]
 *
 * TODO
 * - slideshow mode
 * - move the next/prev/index to the sides along w/ EXIF info
 */
int bigy = 750;             // --bigy=%d for medium images
int dates = 0;              // --date-split
int exif = 0;               // --exif under titles
int exif_hover = 0;         // --exif-hover, exif data in thumbnail hover
int exif_thumbs = 0;        // --exif-thumbnails, use the camera thumbnail
int force = 0;              // -f force regen of everything
int names = 0;              // put names below the image
int nav = 0;                // month/year nav
int parallel = 1;           // -j%d for multiple processes
int sharpen = 0;            // --sharpen to turn it on
int thumbnails = 0;         // force regen of those
int quiet = 1;              // turn off verbose
string      title = "McVoy photos"; // --title=whatever
int ysize = 120;            // --ysize=%d for thumbnails
int rotate[];               // amount to rotate, -+90
string      indexf = "~/.photos/index.html";
string      slidef = "~/.photos/slide.html";

int
main(int ac, string av[])
{
        string      c;
        string      lopts[] = {
                "bigy:",
                "date-split",
                "exif",
                "exif-thumbnails",
                "exif-hover",
                "force",
                "index:",
                "names",
                "nav",
                "parallel:",
                "quiet",
                "regen",
                "sharpen",
                "slide:",
                "thumbnails",
                "title:",
                "ysize:",
        };

        if (0) ac = 0;      // lint
        parallel = cpus();
        dotfiles();

        while (c = getopt(av, "fj:", lopts)) {
                switch (c) {
                    case "bigy": bigy = (int)optarg; break;
                    case "date-split": dates = 1; break;
                    case "exif": exif = 1; break;
                    case "exif-hover": exif_hover = 1; break;
                    case "exif-thumbnails": exif_thumbs = 1; break;
                    case "f":
                    case "force":
                    case "regen":
                        force = 1; break;
                    case "index": indexf = optarg; break;
                    case "j":
                    case "parallel": parallel = (int)optarg; break;
                    case "quiet": quiet = 1; break;
                    case "names": names = 1; break;
                    case "nav": nav = 1; break;
                    case "sharpen": sharpen = 1; break;
                    case "slide": slidef = optarg; break;
                    case "title": title = optarg; break;
                    case "thumbnails": thumbnails = 1; break;
                    case "ysize": ysize = (int)optarg; break;
                    default:
                        printf("Usage: photos.l");
                        foreach (c in lopts) {
                                if (c =~ /(.*):/) {
                                        printf(" --%s=<val>", $1);
                                } else {
                                        printf(" --%s", c);
                                }
                        }
                        printf("\n");
                        return (0);
                }
        }
        unless (av[optind]) {
                dir(".");
        } else {
                while (av[optind]) dir(av[optind++]);
        }
        return (0);
}

void
dir(string d)
{
        string      jpegs[];
        string      tmp[];
        string      buf;
        int i;

        if (chdir(d)) die("can't chdir to %s", d);
        tmp = getdir(".", "*.jpeg");
        unless (tmp[0]) tmp = getdir(".", "*.jpg");
        unless (tmp[0]) tmp = getdir(".", "*.png");
        unless (tmp[0]) tmp = getdir(".", "*.PNG");
        unless (tmp[0]) die("No jpegs found in %s", d);
        // XXX - should getdir do this?
        for (i = 0; defined(tmp[i]); i++) tmp[i] =~ s|^\./||;

        /* so we start at one not zero */
        jpegs[0] = '.';
        rotate[0] = 0;
        // XXX - I want push(&jpegs, list)
        foreach (buf in tmp) {
                push(&jpegs, buf);
                push(&rotate, rotation(buf));
        }

        slides(jpegs);
        thumbs(jpegs);
        html(jpegs);
}

/*
 * Create .thumb-$file if
 * - it does not exist
 * - .ysize is different than ysize
 * - $file is newer than thumbnail
 */
void
thumbs(string jpegs[])
{
        string      cmd[];
        string      jpeg, file, slide;
        int i;
        int all = 0;
        int my_parallel = parallel, bg = 0;
        int pid, reaped;
        int pids{int};

        unless (exists(".ysize")) {
save:               Fprintf(".ysize", "%d\n", ysize);
        }
        if ((int)`cat .ysize` != ysize) {
                all = 1;
                goto save;
        }
        if (force || thumbnails) all = 1;
        if (exif_thumbs) my_parallel = 1;
        for (i = 1; defined(jpeg = jpegs[i]); i++) {
                file = sprintf(".thumb-%s", jpeg);
                slide = sprintf(".slide-%s", jpeg);
                if (!all && exists(file) && (mtime(file) > mtime(jpeg))) {
                        continue;
                }

                if (exif_thumbs && do_exif(undef, jpeg)) {
                        unlink(file);
                        cmd = {
                            "exif",
                            "-e",
                            "-o", file,
                            jpeg
                        };
                } else {
                        cmd = {
                            "convert",
                            "-thumbnail",
                            "x${ysize}",
                            "-quality", "85",
                        };
                        if (sharpen) {
                                push(&cmd, "-unsharp");
                                //push(&cmd, "0x.5");
                                push(&cmd, "2x0.5+0.7+0");
                        }
                        push(&cmd, exists(slide) ? slide : jpeg);
                        push(&cmd, file);
                }
                while (bg >= my_parallel) {
                        reaped = 0;
                        foreach (pid in keys(pids)) {
                                if (waitpid(pid, undef, 1) > 0) {
                                        reaped++;
                                        bg--;
                                        undef(pids{pid});
                                        break;
                                }
                        }
                        if (reaped) break;
                        sleep(0.100);
                }
                unless (quiet) {
                        printf("Creating %s from %s\n",
                            file, exists(slide) ? slide : jpeg);
                }
                pid = spawn(cmd);
                unless (defined(stdio_status.path)) {
                        die("%s: command not found.\n", cmd[0]);
                }
                bg++;
                pids{pid} = 1;
        }
        foreach (pid in keys(pids)) waitpid(pid, undef, 0);
}

/*
 * Create .slide-$file if
 * - it does not exist
 * - .bigy is different than bigy
 * - $file is newer than slide
 * - $file is bigger than bigy
 */
void
slides(string jpegs[])
{
        string      cmd[];
        string      jpeg, file;
        int all = 0;
        int i;
        int bg = 0;
        int pid, reaped;
        int pids{int};

        unless (exists(".bigy")) {
save:               Fprintf(".bigy", "%d\n", bigy);
        }
        if ((int)`cat .bigy` != bigy) {
                all = 1;
                goto save;
        }
        if (force) all = 1;
        for (i = 1; defined(jpeg = jpegs[i]); i++) {
                file = sprintf(".slide-%s", jpeg);
                if (!all && exists(file) && (mtime(file) > mtime(jpeg))) {
                        continue;
                }
                if (small(jpeg)) {
                        unlink(file);
                        if (link(jpeg, file)) warn("link ${jpeg} ${file}");
                        continue;
                }
                cmd = {
                    "convert",
                    "+profile", "*",
                    "-scale", "x" . "${bigy}",
                    "-quality", "85",
                };
                if (rotate[i]) {
                        push(&cmd, "-rotate");
                        push(&cmd, sprintf("%d", rotate[i]));
                }
                if (sharpen) {
                        push(&cmd, "-unsharp");
                        //push(&cmd, "0x.5");
                        push(&cmd, "2x0.5+0.7+0");
                }
                push(&cmd, jpeg);
                push(&cmd, file);
                while (bg >= parallel) {
                        reaped = 0;
                        foreach (pid in keys(pids)) {
                                if (waitpid(pid, undef, 1) > 0) {
                                        reaped++;
                                        bg--;
                                        undef(pids{pid});
                                        break;
                                }
                        }
                        if (reaped) break;
                        sleep(0.150);
                }
                unless (quiet) {
                        printf("Creating %s from %s\n", file, jpeg);
                }
                printf("%s\n", join(" ", cmd));
                pid = spawn(cmd);
                unless (defined(stdio_status.path)) {
                        die("%s: command not found.\n", cmd[0]);
                }
                bg++;
                pids{pid} = 1;
        }
        foreach (pid in keys(pids)) waitpid(pid, undef, 0);
}

int
small(string file)
{
        string      buf;

        // Hack to avoid exif calls on small files
        if (size(file) < 100000) return (1);
        if (size(file) > 200000) return (0);
        unless (buf = `identify '${file}'`) return (0);
        if (buf =~ /JPEG (\d+)x(\d+)/) return ((int)$2 <= bigy);
        return (0);
}

string num2mon{int} = {
        1 => "January",
        2 => "February",
        3 => "March",
        4 => "April",
        5 => "May",
        6 => "June",
        7 => "July",
        8 => "August",
        9 => "September",
        10 => "October",
        11 => "November",
        12 => "December",
};

typedef     struct {
        int day;    // day 1..31
        int mon;    // month 1..12
        int year;   // year as YYYY
        string      sdate;  // YYYY-MM-DD
} date;

/*
 * Return the date from the filename if it starts with YYYY-MM-DD,
 * or from the exif data,
 * or fall back to mtime.
 */
date
f2date(string file)
{
        date        d;
        string      buf;
        FILE        f;
        int t;

        if (file =~ /^(\d\d\d\d)-(\d\d)-(\d\d)/) {
match:              
                buf = (string)$3; buf =~ s/^0//; d.day = (int)buf;
                buf = (string)$2; buf =~ s/^0//; d.mon = (int)buf;
                d.year = (int)$1;
                d.sdate = sprintf("%d-%02d-%02d", d.year, d.mon, d.day);
                return (d);
        }

        if (f = popen("exif -t DateTime '${file}' 2>/dev/null", "r")) {
                while (buf = <f>) {
                        // Value: 2006:02:04 22:59:24
                        if (buf =~ /Value: (\d\d\d\d):(\d\d):(\d\d)/) {
                                pclose(f);
                                goto match;
                        }
                }
                pclose(f);
                // fall through to mtime
        }

        if (t = mtime(file)) {
                buf = Clock_format(t, format: "%Y:%m:%d");
                buf =~ /(\d\d\d\d):(\d\d):(\d\d)/;
                goto match;
        }

        return (undef);
}

/*
 * Create the html slide files and index.html
 * XXX - could stub this out if mtime(html) > mtime(.slide) etc.
 */
void
html(string jpegs[])
{
        string      template, file, stitle, ntitle, ptitle, buf;
        string      cap = '';
        string      date_nav = '';
        string      dir, jpeg, escaped, thumbs = '';
        int i, next, prev;
        int first = 1;
        FILE        f;
        string      map[];
        string      exdata;
        date        d, d2;

        unless (f = fopen(slidef, "rv")) die("slide.html");
        read(f, &template, -1);
        fclose(f);

        for (i = 1; defined(jpeg = jpegs[i]); i++) {
                file = sprintf("%d.html", i);
                if (i > 1) {
                        prev = i - 1;
                } else {
                        prev = length(jpegs) - 1;
                }
                if (jpegs[i+1]) {
                        next = i + 1;
                } else {
                        next = 1;
                }
                undef(map);
                stitle = jpeg;
                stitle =~ s/\.jp.*//;
                ntitle = jpegs[next];
                ntitle =~ s/\.jp.*//;
                ptitle = jpegs[prev];
                ptitle =~ s/\.jp.*//;
                escaped = jpeg;
                escaped =~ s/:/%3A/g;
                dir = `pwd`;
                dir =~ s|.*/||;
                map = {
                        "%FOLDER%",
                        dir,
                        "%TITLE%",
                        stitle,
                        "%NEXT_HTML%",
                        sprintf("%d.html", next),
                        "%NEXT_TITLE%",
                        ntitle,
                        "%PREV_HTML%",
                        sprintf("%d.html", prev),
                        "%PREV_TITLE%",
                        ptitle,
                        "%NEXT_SLIDE%",
                        sprintf(".slide-%s", jpegs[next]),
                        "%ORIG%",
                        escaped,
                        "%SLIDE%",
                        sprintf(".slide-%s", escaped),
                };
                push(&map, "%CAPTION%");
                cap = '';   // reset: cap is reused as scratch in the dates block below
                if (names || exif) cap = '<P class="center">';
                if (names) {
                        cap .= stitle .
                            '   ' .
                            sprintf("(%d/%d)\n", i, length(jpegs) - 1);
                }
                undef(exdata);
                if (exif) {
                        do_exif(&exdata, jpeg);
                        if (names) cap .= "<br>";
                        cap .= exdata;
                }
                if (names || exif) cap .= "</P>\n";
                push(&map, cap);

                push(&map, "%NAV%");
                date_nav = '';
                do_nav(&date_nav, jpeg, prev, next, 1);
                push(&map, date_nav);

                buf = String_map(map, template);
                Fprintf(file, "%s\n", buf);

                if (dates &&
                    defined(d2 = f2date(jpeg)) &&
                    (first || (d.sdate != d2.sdate))) {
                        d = d2;
                        unless (first) thumbs .= "</DIV>\n";
                        buf = num2mon{d.mon};
                        thumbs .= "<p><a name=\"${buf}_${d.day}\">";
                        cap = "${buf} ${d.day} ${d.year}";
                        thumbs .= cap . "</a>";
                        cap = ".cap-${buf}-${d.day}-${d.year}";
                        // .cap-January-9-2011, if it exists, is appended
                        if (exists(cap) && (cap = `cat ${cap}`)) {
                                thumbs .= ': ' . cap;
                        }
                        thumbs .= "<br>\n<DIV class=\"center\">\n";
                }

                if (exif && exif_hover) stitle .= " " . exdata;
                thumbs .= sprintf(
                    '<a href="%s">' .
                    '<img src=".thumb-%s" alt="%s" title="%s" border="0"/>' . 
                    '</a>' . "\n",
                    file, escaped, stitle, stitle);
                first = 0;
        }

        /* do index.html */
        unless (f = fopen(indexf, "rv")) die("index.html");
        read(f, &template, -1);
        fclose(f);
        undef(map);
        push(&map, "%TITLE%");
        push(&map, title);
        push(&map, "%THUMBS%");
        thumbs .= "</DIV>\n";
        push(&map, thumbs);
        date_nav = '';
        push(&map, "%NAV%");
        do_nav(&date_nav, jpegs[1], undef, undef, 0);
        push(&map, date_nav);
        buf = String_map(map, template);
        if (exists(".index-include")) {
                buf .= `cat .index-include`;
        }
        Fprintf("index.html", "%s", buf);
        unless (f = fopen("~/.photos/photos.css", "rv")) die("photos.css");
        read(f, &buf, -1);
        fclose(f);
        Fprintf("photos.css", "%s", buf);
}

/*
 * XXX - what this needs is a hash and then at the end I push the info
 * I want in the order I want.
 */
int
do_exif(string &cap, string jpeg)
{
        FILE        f = popen("exiftags -a '${jpeg}'", "rv");
        string      save, buf, maker = '';
        string      v[];
        string      iso = undef;
        int thumb = 0;
        int i;
        string      tags{string};

        while (buf = <f>) {
                switch (trim(buf)) {
                    case /^Equipment Make: (.*)/:
                        maker = $1;
                        if (maker == "OLYMPUS IMAGING CORP.") {
                                maker = "Olympus";
                        }
                        if (maker == "NIKON CORPORATION") {
                                maker = "Nikon";
                        }
                        break;
                    case /^Camera Model: (.*)/:
                        save = $1;
                        if (save =~ /${maker}/i) {
                                tags{"camera"} = save;
                        } else {
                                tags{"camera"} = "${maker} ${save}";
                        }
                        if (save == "TG-1") tags{"lens"} = "25-100mm f2.0";
                        if (save =~ /Canon PowerShot S95/) {
                                tags{"lens"} = "28-105 mm";
                        }
                        if (save =~ /Canon PowerShot S100/) {
                                tags{"lens"} = "24-120mm";
                        }
                        break;
                    case /Lens Name: (.*)/:
                        if ($1 =~ /EF\d/) $1 =~ s/EF/EF /;
                        if ($1 =~ /EF-S\d/) $1 =~ s/EF-S/EF-S /;
                        if ($1 =~ / USM/) $1 =~ s/ USM//;
                        if ($1 == "30mm") $1 = "Sigma 30mm f/1.4";
                        if ($1 == "90mm") $1 = "Tamron 90mm macro";
                        if ($1 == "18-200mm") $1 = "Tamron 18-200mm";
                        if ($1 == "18-250mm") $1 = "Tamron 18-250mm";
                        if ($1 == "18-270mm") $1 = "Tamron 18-270mm";
                        if ($1 == "170-500mm") $1 = "Sigma 170-500mm";
                        $1 =~ s|f/|f|;
                        tags{"lens"} = $1;
                        break;
                    case /Lens Size: 10.00 - 22.00 mm/:
                        tags{"lens"} = "EF-S 10-22mm f/3.5-4.5";
                        break;
                    case /Exposure Bias: (.*)/:
                        if ($1 != "0 EV") {
                                unless ($1 =~ /^-/) $1 = "+" . $1;
                                tags{"bias"} = $1;
                        }
                        break;
                    case /^Exposure Time: (.*)/:
                        save = $1;
                        $1 =~ /(\d+)\/(\d+) sec/;
                        if ((int)$1 > 1) {
                                i = (int)$2/(int)$1;
                                save = "1/${i}";
                        }
                        tags{"time"} = save;
                        break;
                    case /Lens Aperture: (.*)/:
                    case /F-Number: (.*)/:
                        $1 =~ s|/||;
                        tags{"fstop"} = $1;
                        break;
                    case /ISO Speed Rating: (.*)/:
                        iso = undef;
                        if ($1 == "Auto") {
                                iso = "ISO ${$1}";
                        } else if ($1 == "Unknown") {
                                ;
                        } else unless ((int)$1 == 0) {
                                iso = "ISO ${$1}";
                        }
                        if (defined(iso)) tags{"iso"} = iso;
                        break;
                    case /Focal Length .35mm Equiv.: (.*)/:
                    case /Focal Length: (.*)/:
                        save = $1;
                        if (tags{"camera"} =~ /Canon PowerShot S95/) {
                                save =~ s/ mm//;
                                save = (string)(int)((float)save * 4.7);
                                save .= " mm";
                        }
                        if (tags{"camera"} =~ /Canon PowerShot S100/) {
                                save =~ s/ mm//;
                                save = (string)(int)((float)save * 4.61538);
                                save .= " mm";
                        }
                        unless (defined(tags{"focal"})) {
                                tags{"focal"} = save;
                        }
                        break;
                    case /Metering Mode: (.*)/:
                        unless (defined(tags{"metering"})) {
                                tags{"metering"} = "${$1} metering";
                        }
                        break;
                    case /White Balance: (.*)/:
                        unless ($1 =~ /white balance/) $1 .= " white balance";
                        $1 =~ s/white balance/WB/;
                        unless (defined(tags{"balance"})) {
                                tags{"balance"} = $1;
                        }
                        break;
                    case /Compression Scheme: JPEG Compression .Thumbnail./:
                        thumb = 1;
                        break;
                }
        }
        pclose(f);
        cap = "";
        if (defined(tags{"camera"})) push(&v, tags{"camera"});
        if (defined(tags{"lens"})) {
                if (defined(tags{"focal"}) && 
                    (tags{"lens"} =~ /[0-9]-[0-9]/)) {
                        tags{"lens"} .= " @ " . tags{"focal"};
                }
                push(&v, tags{"lens"});
        }
        if (defined(tags{"fstop"})) push(&v, tags{"fstop"});
        if (defined(tags{"time"})) push(&v, tags{"time"});
        if (defined(tags{"bias"})) push(&v, tags{"bias"});
        if (defined(tags{"iso"})) push(&v, tags{"iso"});
        if (defined(tags{"metering"})) push(&v, tags{"metering"});
        if (defined(tags{"balance"})) push(&v, tags{"balance"});
        if (defined(v)) cap = join(", ", v);
        return (thumb);
}

int
rotation(string file)
{
        string      r = `exif -m -t Orientation '${file}'`;

        switch (r) {
            case /right.*top/i:
                return (90);
            case /left.*bottom/i:
                return (-90);
            default:
                return (0);
        }
}

/*
 * This is called for both index nav and slide nav.
 * For index nav, unless nav is set, do nothing.
 * For slide nav, always do at least 
 * prev | index | next 
 * and optionally
 * prev | next | prev month | index | next month | prev year | next year
 */
void
do_nav(string &date_nav, string jpeg, int prev, int next, int slide)
{
        int i, mon, did_it;
        string      buf, month;
        date        d;

        date_nav = '';
        if (!nav && !slide) return;

        unless (defined(d = f2date(jpeg))) return;
        month = num2mon{d.mon}[0..2];

        if (slide) {
                /* <<< prev | January | next >>> */
                date_nav .= '<a href="' . sprintf("%d.html", prev) .
                    '">&lt;&lt; prev pic</a>&nbsp;&nbsp;';
                date_nav .= "\n";
                unless (nav) {
                        date_nav .= '<a href="index.html">Index</a>&nbsp;&nbsp;';
                        date_nav .= "\n";
                }
                date_nav .= '<a href="' . sprintf("%d.html", next) .
                    '">next pic &gt;&gt;</a>';
                date_nav .= "\n";

                unless (nav) return;
        }

        /* <<< prev | next >>> |  <<< January >>> | <<< 2003 >>> */
        date_nav .= "\n";
        date_nav .= '&nbsp;&nbsp;&nbsp;&nbsp;';
        date_nav .= "\n";

        /* do the <<< for the prev month */
        for (i = 0; i < 12; i++) {
                mon = d.mon - i;
                if (mon == 1) {
                        buf = sprintf("../../%d/%02d/index.html", d.year-1, 12);
                } else {
                        buf = sprintf("../../%d/%02d/index.html", d.year,mon-1);
                }
                if (exists(buf)) break;
        }
        if (exists(buf)) date_nav .= '<a href="' . buf . '">&lt;&lt;&lt;</a>';
        date_nav .= "\n";

        /* do the link to index.html for this month */
        if (slide) {
                date_nav .= '&nbsp;&nbsp;<a href="index.html">' .
                    month . " index" . '</a>&nbsp;&nbsp;';
        } else {
                date_nav .= "&nbsp;&nbsp;${month}&nbsp;&nbsp;";
        }
        date_nav .= "\n";

        /* do the >>> for next month */
        for (i = 0; i < 12; i++) {
                mon = d.mon + i;
                if (mon == 12) {
                        buf = sprintf("../../%d/%02d/index.html", d.year+1, 1);
                } else {
                        buf = sprintf("../../%d/%02d/index.html", d.year,mon+1);
                }
                if (exists(buf)) break;
        }
        if (exists(buf)) {
                date_nav .= '<a href="' . buf . '">&gt;&gt;&gt;</a>';
        }

        date_nav .= "\n";
        date_nav .= '&nbsp;&nbsp;&nbsp;&nbsp;';
        date_nav .= "\n";

        did_it = 0;
        buf = sprintf("../../%d/%02d/index.html", d.year - 1, d.mon);
        unless (exists(buf)) for (i = 1; i < 12; i++) {
                buf = sprintf("../../%d/%02d/index.html", d.year - 1, d.mon+i);
                if (exists(buf)) break;
                buf = sprintf("../../%d/%02d/index.html", d.year - 1, d.mon-i);
                if (exists(buf)) break;
        }
        if (exists(buf)) {
                date_nav .= '<a href="' .
                    buf . '">&lt;&lt;&lt;</a> ' .  "${d.year}";
                date_nav .= "\n";
                did_it++;
        }
        buf = sprintf("../../%d/%02d/index.html", d.year + 1, d.mon);
        unless (exists(buf)) for (i = 1; i < 12; i++) {
                buf = sprintf("../../%d/%02d/index.html", d.year + 1, d.mon+i);
                if (exists(buf)) break;
                buf = sprintf("../../%d/%02d/index.html", d.year + 1, d.mon-i);
                if (exists(buf)) break;
        }
        if (exists(buf)) {
                unless (did_it) date_nav .= "${d.year}";
                date_nav .= ' <a href="' . buf . '">&gt;&gt;&gt;</a>';
                date_nav .= "\n";
        }
}

void
dotfiles(void)
{
        string      file, buf;

        unless (isdir("~/.photos")) mkdir("~/.photos");
        file = "~/.photos/slide.html";
        unless (exists(file)) {
                buf = <<'END'
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
  <HEAD>
    <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
    <TITLE>%TITLE%</TITLE>
    <LINK rel="stylesheet" type="text/css" href="photos.css">
    <LINK rel="contents" href="index.html">
    <LINK rel="next" href="%NEXT_HTML%" title="%NEXT_TITLE%">
    <LINK rel="previous" href="%PREV_HTML%" title="%PREV_TITLE%">
    <SCRIPT type="text/javascript" language="javascript" defer>
       <!--
       if (document.images)    {
          Image1          = new Image();
          Image1.src      = "%NEXT_SLIDE%";
       }       //-->   
    </SCRIPT>
  </HEAD>

  <BODY>
    <P class="center">
      %NAV%
    </P>
    <DIV class="center">
      <TABLE bgcolor="#ffffff" cellspacing=0 cellpadding=4>
        <TR>
          <TD class="slide">
            <A href="%ORIG%">
            <IMG src="%SLIDE%" alt="%TITLE%"
            title="Click here to see full size, then use your back button."
            border=0></a>
          </TD>
        </TR>
      </TABLE>
      <P>
      %CAPTION%
    </DIV>
  </BODY>
</HTML>
END;
                Fprintf(file, "%s", buf);
        }
        file = "~/.photos/index.html";
        unless (exists(file)) {
                buf = <<'END'
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<HTML>
  <HEAD>
    <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
    <TITLE>%TITLE%</TITLE>
    <LINK rel="stylesheet" type="text/css" href="photos.css">
  </HEAD>

  <BODY>
    %TITLE%
     
     
     
     
    %NAV%
    <p>
    %THUMBS%
    <p align="center">
    %NAV%
    <P class="small">
    For each picture there are 3 sizes:
    (1) the index thumbnails you are looking at,
    (2) a mid sized picture that you get to by clicking the thumbnail,
    (3) the original that you get to by clicking the midsize.
    Legal crud: everything is copyrighted by whoever took the picture.
    In the unlikely event you want to use a picture, please ask just to make
    us feel good.
    </P>
  </BODY>
</HTML>
END;
                Fprintf(file, "%s", buf);
        }
        file = "~/.photos/photos.css";
        unless (exists(file)) {
                buf = <<'END'
.center { 
  text-align: center;
}

.center table { 
  margin-left: auto;
  margin-right: auto;
  text-align: center;
}

body {
  font-family: verdana, sans-serif;
  background: #000000;
  color: #DDDDDD;
}

a:link {
  color: #95DDFF;
  background: transparent;
}

a:visited {
  color: #AAAAAA;
  background: transparent;
}

a:hover {
  color: #BBDDFF;
  background: #555555;
}

.small {
  font-size: 50%;
}

.large {
  font-size: 200%;
}

.tiled {
  background-image: url(".tile.png");
  background-repeat: repeat-x;
  background-color: #000000;
  padding: 0;
}

.thumb {
  background-color: #000000;
  text-align: center;
  vertical-align: middle;
}

.slide {
  background-color: #ffffff;
  text-align: center;
  vertical-align: middle;
}
END;
                Fprintf(file, "%s", buf);
        }
}

pod2html.l

This is a Little implementation of pod2html. It is pretty stripped down but slightly prettier than the Perl pod2html.

int
main(int ac, string av[])
{
        FILE        f;
        int         i, ul;
        int         space = 0, dd = 0, p = 0, pre = 0, table = 0;
        string      head, buf, tmp, title, trim, all[];

        // lint
        if (0) ac++;

        /*
         * -t<title> or --title=<title>
         */
        for (i = 1; defined(av[i]) && (av[i] =~ /^-/); i++) {
                if (av[i] == "--") {
                        i++;
                        break;
                }
                if ((av[i] =~ /--title=(.*)/) || (av[i] =~ /-t(.*)/)) {
                        title = $1;
                } else {
                        die("usage: ${av[0]} [--title=whatever]");
                }
        }
        if (!defined(av[i]) ||
            defined(av[i+1]) || !defined(f = fopen(av[i], "r"))) {
                die("usage: ${av[0]} filename");
        }
        unless (defined(title)) title = av[i];

        header(title);

        /*
         * Load up the whole file in all[] and spit out the index.
         */
        puts("<ul>");
        ul = 1;
        while (defined(buf = <f>)) {
                push(&all, buf);
                if (buf =~ /^=head(\d+)\s+(.*)/) {
                        i = (int)$1;
                        while (ul > i) {
                                puts("</ul>");
                                ul--;
                        }
                        while (i > ul) {
                                puts("<ul>");
                                ul++;
                        }
                        tmp = $2;
                        tmp =~ s/\s+/_/g;
                        buf =~ s/^=head(\d+)\s+//;
                        puts("<li><a href=\"#${tmp}\">${buf}</a></li>");
                }
        }
        while (ul--) puts("</ul>");
        fclose(f);

        /*
         * Now walk all[] and process the markup.  We currently handle:
         * =head%d title
         * =over
         * =item name
         * =proto return_type func(args)
         * =back
         * <blank line>
         * B<bold this>
         * C<some code>
         * I<italics>
         */
        for (i = 0; i <= length(all); i++) {
                buf = inline(all[i]);
                if (buf =~ /^=head(\d+)\s+(.*)/) {
                        if ((int)$1 == 1) puts("<HR>");
                        tmp = $2;
                        tmp =~ s/\s+/_/g;
                        printf("<H%d><a name=\"%s\">%s</a></H%d>\n",
                            $1, tmp, $2, $1);
                } else if (buf =~ /^=over/) {
                        puts("<dl>");
                } else if (buf =~ /^=item\s+(.*)/) {
                        if (dd) {
                                puts("</dd>");
                                dd--;
                        }
                        puts("<dt><strong>${$1}</strong></dt><dd>");
                        dd++;
                } else if (buf =~ /^=proto\s+([^ \t]+)\s+(.*)/) {
                        if (dd) {
                                puts("</dd>");
                                dd--;
                        }
                        puts("<dt><b>${$1} ${$2}</b></dt><dd>");
                        dd++;
                } else if (buf =~ /=table/) {
                } else if (buf =~ /^=back/) {
                        if (dd) {
                                puts("</dd>");
                                dd--;
                        }
                        puts("</dl>");
                } else if (buf =~ /^\s*$/) {
                        if (p) {
                                puts("</p>");
                                p = 0;
                        }
                        if (pre) {
                                /*
                                 * If we see a blank line in a preformatted
                                 * block, we don't want to stop the pre
                                 * unless the next line is not indented.
                                 * So peek ahead.
                                 */
                                if (defined(buf = all[i+1]) && (buf =~ /^\s/)) {
                                        puts("");
                                        continue;
                                }
                                puts("</pre>");
                                pre = 0;
                                trim = undef;
                        }
                        space = 1;
                } else {
                        if (space) {
                                if (buf =~ /^(\s+)[^ \t]+/) {
                                        trim = $1;
                                        puts("<pre>");
                                        pre = 1;
                                } else {
                                        puts("<p>");
                                        p = 1;
                                }
                                space = 0;
                        }
                        if (defined(trim)) buf =~ s/^${trim}//;
                        puts(buf);
                }
        }
        puts("</body></html>");
        return (0);
}

/*
 * header and style sheet
 */
void
header(string title)
{
        string      head = <<EOF
<html>
<head>
<title>${title}</title>
<style>
pre {
        background: #eeeedd;
        border-width: 1px;
        border-style: solid solid solid solid;
        border-color: #ccc;
        padding: 5px 5px 5px 5px;
        font-family: monospace;
        font-weight: bolder;
}
body {
        padding-left: 10px;
}
dt {
        font-size: large;
}
</style>
</head>
<body>
EOF
        puts(head);
        puts("<h1>${title}</h1>");
}

/*
 * Process B<bold>, C<code>, I<italic>, F<italic>, L<link>, S<non-breaking>.
 * This will handle nested stuff like B<C<if (!condition)>>
 * but dies if there are nested ones of the same type.
 */
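/*
 * Illustrative sketch, traced by hand from the code below rather than
 * captured from a run (the input string is hypothetical):
 *      inline("See B<this> and C<that>.")
 * would return
 *      See <B>this</B> and <CODE>that</CODE>.
 */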
string
inline(string buf)
{
        string      c, prev, result, link, stack[];
        int         B = 0, C = 0, I = 0, L = 0, S = 0;

        foreach (c in buf) {
                if ((c == "<") && defined(prev)) {
                        if (prev == "B") {
                                if (B++) die("Nested B<> unsupported: ${buf}");
                                result[END] = "";
                                result .= "<B>";
                                push(&stack, "B");
                        } else if (prev == "C") {
                                if (C++) die("Nested C<> unsupported: ${buf}");
                                result[END] = "";
                                result .= "<CODE>";
                                push(&stack, "CODE");
                        } else if (prev == "I" || prev == "F") {
                                if (I++) die("Nested I<> unsupported: ${buf}");
                                result[END] = "";
                                result .= "<I>";
                                push(&stack, "I");
                        } else if (prev == "L") {
                                if (L++) die("Nested L<> unsupported: ${buf}");
                                result[END] = "";
                                result .= "<a href=\"";
                                link = "";
                                push(&stack, "L");
                        } else if (prev == "S") {
                                if (S++) die("Nested S<> unsupported: ${buf}");
                                result[END] = "";
                                push(&stack, "S");
                        } else {
                                result .= "&lt;";
                                prev = c;
                        }
                } else if ((c == ">") && length(stack)) {
                        c = pop(&stack);
                        if (c == "B") {
                                B--;
                        } else if (c == "CODE") {
                                C--;
                        } else if (c == "I") {
                                I--;
                        } else if (c == "L") {
                                L--;
                                result .= "\">${link}</a>";
                                c = undef;
                        } else {
                                S--;
                                c = undef;
                        }
                        if (defined(c)) {
                                result .= "</" . c . ">";
                        }
                        prev = undef;
                } else {
                        if (S && isspace(c)) {
                                result .= "&nbsp;";
                        } else if (c == "<") {
                                result .= "&lt;";
                        } else if (c == ">") {
                                result .= "&gt;";
                        } else {
                                result .= c;
                        }
                        if (L) link .= c;
                        prev = c;
                }
        }
        return (result);
}