Name
Little
Synopsis
L [options] script.l [args]
Introduction
Little is a compiled-to-byte-code language that draws heavily from C and Perl. From C, Little gets C syntax, simple types (int, float, string), and complex types (arrays, structs). From Perl, Little gets associative arrays and regular expressions (PCRE). And from neither, Little gets its own simplistic form of classes.
The name "Little", abbreviated as simply "L", alludes to the language's simplicity. The idea was to distill the useful parts of other languages and combine them into a scripting language, with type checking, classes (not full-blown OO but useful nonetheless), and direct access to a cross-platform graphical toolkit.
Little provides a set of built-in functions, drawn from Perl and the standard C library.
Little is built on top of the Tcl/TK system. The Little compiler generates Tcl byte codes and uses the Tcl calling convention. This means that L and Tcl code may be intermixed. More importantly, it means that Little may use all of the Tcl API and libraries as well as TK widgets. The net result is a type-checked scripting language which may be used for cross-platform GUIs.
Little is open source under the same license as Tcl/TK (BSD like) with any bits that are unencumbered by the Tcl license also being available under the Apache License, Version 2.0.
Running Little programs
You can run a Little program from the command line.
L [options] progname.l [args]
Alternatively, put this as the first line of your script, but make sure your script is executable (chmod 755 script.l under Unix).
#!/path/to/L [options]
Options:
- --fnhook=myhook
    When function tracing is enabled, use myhook as the trace hook.
- --fntrace=on | entry | exit | off
    Enable function tracing on both function entry and exit, entry only, exit only, or disable tracing altogether.
- --norun
    Compile only (do not run). This is useful to check for compilation errors.
- --nowarn
    Disable compiler warnings. This is useful when you know you have unused variables or other warnings that you don't want to be bothered with.
- --poly
    Treat all types as poly. This effectively disables type checking.
- --trace-depth=n
    When function tracing is enabled, trace only to a maximum call depth of n.
- --trace-files=colon-separated list of glob | /regexpr/
    Enable tracing of all functions in the given files, specified either as globs or regular expressions. A leading + before a glob or regexp means to add to what is otherwise being traced and a leading - means to remove. No leading + or - means to trace exactly what is specified.
- --trace-funcs=colon-separated list of glob | /regexpr/
    Like --trace-files but specifies functions.
- --trace-out=filename | host:port
    Send default trace output to a file or a TCP socket.
- --version
    Print the L build version and immediately exit.
The tracing-related command-line options also can be specified in a #pragma inside the program; see the DEBUGGING section.
The optional [args] is a white-space separated list of arguments that are passed to the script's main() function as an array of strings (argv).
Language syntax
A Little script or program consists of one or more statements. These may be executable statements, variable or type declarations, function or class declarations, or #pragma statements which specify tracing directives. Statements outside of functions are said to be at the top level and are executed in the order they appear, although you can use a return statement to bail out.
There is no need to have a main() function, but if one is present, it is executed after all of the top-level statements (even if you did a return from the top level).
puts("This is printed first.");
void main()
{
    puts("This is printed last.");
}
puts("This is printed second.");
Little statements end in a semicolon.
printf("Hello, world\n");
Both C style and hash style comments are allowed, but the hash-style comments are only for the first line and must start on column 1.
# This is a comment
// So is this
/* And this too */
    # But this is an error, only allowed on column 1, line 1
Whitespace usually is irrelevant.
printf( "Hello, world\n") ;
... except inside quoted strings:
// this would print with a linebreak in the middle
printf("Hello\
world\n");
and around the string-concatenation operator " . " so it can be distinguished from the struct-member selection operator ".".
Double quotes or single quotes may be used around literal strings:
puts("Hello, world"); puts('Hello, world');
However, only double quotes "interpolate" variables and handle character escapes such as for newlines (\n):
puts("Hello, ${name}");    // works fine
puts('Hello, ${name}');    // prints ${name}\n literally
Inside single quotes, you can still escape a line break, the single quote character (\'), and the escape character (\\).
puts('Here \' and \\ are escaped.');
puts('This one spans a\
line');
If you put a line break in the middle of a string but forget to escape it, Little will complain.
Adjacent string constants are automatically concatenated, like in C.
printf("This " "prints " "the concatenation " "of ""all"" strings\n");
prints "This prints the concatenation of all strings" followed by a newline.
Little requires that adjacent string constants all be the same "type", either all interpolated ("") or all not interpolated (''), but the dot operator comes to the rescue in this contrived example:
'Hi there. ${USER} is ' . "${USER} today"
Variables, types, and constants
Little is a statically typed language with both scalar and complex types. All variables are typed and must be declared before use.
The scalar types are int, float, and string. The complex types are array, hash, struct, and list. Little also supports function pointers, classes, and a special type called poly which matches any type and normally is used to disable type checking. Finally, Little has the concept of an undefined value which a variable of any type can possess.
Strong typing means that you can assign something of one type only to something else of a compatible type. Normally, to be compatible the types must be structurally the same, but there are exceptions such as an int being compatible with float and a list sometimes being compatible with an array or struct.
Variable names begin with a letter and can contain letters, numerals, and underscores. They cannot begin with an underscore (_) because the Little compiler reserves names starting with _ for internal use.
A variable declaration includes the type, the variable name, and optionally an initial value which can be any Little expression:
int i = 3*2;
printf("i = %d\n", i);    // prints i = 6
If an initial value is omitted, the variable starts out with the undefined value undef.
A scalar represents a single value that is a string, integer, or floating-point number. Strings are wide char (Unicode), integers are arbitrarily large, and floats are like C's double.
Examples:
string animal = "camel";
int    answer = 42;
float  pi = 3.14159;
When one of these types is expected, supplying another one usually is an error, except that an int always can be used as a float. You can override this behavior with a type cast.
Hex and octal integer constants are specified like this:
int space = 0x20;
int escape = 0o33;
Integer constants can be arbitrarily large; they are not limited by the machine's word size.
Strings have a special feature where they can be indexed like arrays, to get a character or range of characters, or to change a character (but you cannot change a range of characters):
string s1, s2;

s1 = "hello";
s2 = s1[1];             // s2 gets "e"
s2 = s1[1..3];          // s2 gets "ell"
s1[1] = "x";            // changes s1 to "hxllo"
s1[END+1] = "there";    // changes s1 to "hxllothere"
The pre-defined identifier END is the index of the last character, or is -1 if the string is empty. You always can write to one past the end of a string to append to it, but writing beyond END+1 is an error.
You delete a character within a string by indexing the string and setting it to "" (the empty string), or by using the undef() built-in:
s[3] = "";      // deletes fourth character of s
undef(s[3]);    // same thing
After the deletion any characters after the deleted character are shifted left by one:
s = "123X456";
undef(s[3]);    // s is now "123456", the X is gone, the rest left shifted
Sometimes you want to signify that a variable has no legal value, such as when returning an error from a function. Little has a read-only pre-defined identifier called undef which you can assign to any variable.
int_var = undef;
array_var = undef;
This is different than the undef() built-in function which deletes array, hash, or string elements.
When used in comparisons, a variable that is undefined is never seen as true, or as equal to anything defined, so you can easily check for error conditions:
unless (f = fopen(file, "r")) {
    die(file);
}
while (s = <f>) {
    printf("%s\n", s);
}
You have to be a little careful because any numeric value (and poly) can be false in a condition because it can have the value of zero:
int i;

i = 0;
if (i)             // false
i = 1;
if (i)             // true
i = undef;
if (i)             // false
i = 0;
if (defined(i))    // true
if (i)             // false
Other than numeric types, you can skip the defined() and just use if (var). That's true for arrays, structs, hashes, and FILE types.
Little itself sometimes uses undef to tell you that no value is available. One case is when you assign to an array element that is more than one past the end of an array. Little auto-extends the array and sets the unassigned elements to undef.
An array holds a list of values, all of the same type:
string animals[] = { "camel", "llama", "owl" };
int    numbers[] = { 23, 42, 69 };
You do not specify a size when declaring an array, because arrays grow dynamically.
Arrays are zero-indexed. Here's how you get at elements in an array:
puts(animals[0]);    // prints "camel"
puts(animals[1]);    // prints "llama"
The pre-defined identifier END is the index of the last element of an array, or is -1 if the array is empty.
puts(animals[END]); // last element, prints "owl"
END is valid only inside of an array subscript (strings are a kind of array so END works there).
If you need the length of an array, use a built-in function:
num_elems = length(animals); // will get 3
If the array is empty, length() returns 0.
To get multiple values from an array, you use what's called an array slice which is a sub-array of the array being sliced. Slices are for reading values only; you cannot write to a slice.
animals[0..1];      // gives { "camel", "llama" }
animals[1..END];    // gives all except the first element
In this last example where END is used, you must be careful, because if the array is empty, END will be -1, and an array slice where the second index is less than the first causes a run-time error.
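One way to avoid that run-time error is to check the length before slicing. The following is only a sketch based on the rules stated above; the variable name rest is made up for illustration.

```little
string animals[] = { "camel", "llama", "owl" };
string rest[];

// animals[1..END] needs at least two elements: with zero or one
// element, END is less than 1 and the slice would be an error.
if (length(animals) > 1) {
    rest = animals[1..END];
}
```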
You can add and remove from an array with push and pop, unshift and shift, and insert.
The push and pop functions add and remove from the end:
string birds[], next;

push(&birds, "robin");
push(&birds, "dove", "cardinal", "bluejay");
next = pop(&birds);    // next gets "bluejay"
// birds is now { "robin", "dove", "cardinal" }
The & means that birds is passed by reference, because it will be changed. This is discussed in more detail in the section on functions.
Another way to append:
birds[END+1] = "towhee";
The unshift and shift functions are similar but they add and remove from the beginning of the array.
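As a rough illustration, and assuming unshift and shift take their arguments the same way push and pop do (this text does not spell out their signatures), usage might look like this:

```little
string birds[] = { "robin", "dove" };
string first;

// Assumption: same calling convention as push() and pop().
unshift(&birds, "wren");    // adds "wren" at the beginning
first = shift(&birds);      // removes the first element again
```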
You can insert anywhere in an array with insert:
insert(&birds, 2, "crow");             // insert crow before birds[2]
insert(&birds, 3, "hawk", "eagle");
In these examples we inserted one or more single elements but wherever you can put an element you also can splice in a list:
string new_birds[] = { "chickadee", "turkey" };
push(&birds, new_birds);    // appends chickadee and turkey
In this example the variable new_birds is not required; an array constant could have been pushed instead:
push(&birds, { "chickadee", "turkey" });
There is an ambiguity, resolved by the type of the first argument, as to whether it is two strings being pushed as two new entries in the array, or if it is a single item being pushed. You have to know the type of the first argument to know which is which.
You can remove from anywhere in an array with undef:
string dev_team[] = { "larry", "curly", "mo" };
undef(dev_team[0]);    // delete "larry" from dev_team
When you delete an element, all subsequent elements slide down by one index. Note that undef() works only on a variable; it cannot remove an element from a function return value, for example.
You also can directly assign to any array index even if the array hasn't yet grown up to that index. If you assign more than one past the current end, the unassigned elements are assigned undef:
string colors[] = { "blue", "red" };
colors[3] = "green";    // colors[2] gets undef and
                        // colors[3] gets "green"
You can read from any non-negative array index as well. You will simply get undef if the element doesn't exist. Reading from a negative index causes a run-time error.
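A short sketch of reading past the end, relying only on the behavior described above (the index 5 and the message text are arbitrary):

```little
string colors[] = { "blue", "red" };
string c = colors[5];    // no such element, so c gets undef

unless (defined(c)) {
    puts("no color at index 5");
}
```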
An array can hold elements of any type, including other arrays. Although Little does not have true multi-dimensional arrays, arrays of arrays give you basically the same thing:
int matrix[][] = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9 } };
When declaring an array, it is legal to put the brackets after the type instead of the name. This sometimes is useful for readability, and is required in function prototypes that omit the parameter name.
int[] mysort(int[]);    // prototype
int[] mysort(int vector[])
{
    ...
}
An array in Little is implemented as a Tcl list under the covers.
A hash holds a set of key/value pairs:
int grades{string} = { "Tom"=>85, "Rose"=>90 };
When you declare a hash, you specify both the key type (within the {}) and the value type. The keys must be of scalar type but the values can be of any type, allowing you to create hashes of arrays or other hashes.
To get at a hash element, you index the hash with the key:
grades{"Rose"}; // gives 90
If the given key does not exist in the hash, you get back undef. Using an undefined key causes a run-time error.
You get a list of all the keys in a hash with the keys() built-in, which returns an array:
string students[] = keys(grades);
Because hashes have no particular internal order, the order in which the keys ("Tom" and "Rose") appear is undefined. However, you can obtain a sorted array of keys like this:
string students[] = sort(keys(grades));
The length built-in works on hashes too and returns the number of key/value pairs.
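Putting keys(), sort(), and length() together, a report over the grades hash could be sketched as follows. It uses the foreach statement, which is described later under control transfer statements; the output format is just an example.

```little
int    grades{string} = { "Tom"=>85, "Rose"=>90 };
string name;

printf("%d students\n", length(grades));
// Iterate in sorted key order, since hash order is undefined.
foreach (name in sort(keys(grades))) {
    printf("%s: %d\n", name, grades{name});
}
```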
You remove an element from a hash with undef:
undef(grades{"Tom"});    // removes "Tom" from the hash
Note that this is different than assigning undef to a hash element, which does not remove that element from the hash; it creates an element with the value undef.
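The difference can be seen by contrasting the two operations. This is a sketch that follows the description above; the comments restate that description rather than verified output.

```little
int grades{string} = { "Tom"=>85, "Rose"=>90 };

grades{"Tom"} = undef;    // "Tom" is still a key; its value is undef
undef(grades{"Tom"});     // now "Tom" is removed from the hash entirely
```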
It is not an error to remove something that's not in the hash. Note that undef() works only on a variable; it cannot remove an element from a function return value, for example.
When declaring a hash, it is legal to put the braces after the type instead of the name. This comes in handy for function definitions; here is a function that returns a hash of integer grades indexed by student name after adjusting them:
int{string} adjust_grades(int{string});    // prototype
int{string} adjust_grades(int grades{string})
{
    ...
}
A hash in Little is implemented as a Tcl dict.
Little structs are much like structs in C. They contain a fixed number of named things of various types:
struct my_struct {
    int    i;
    int    j;
    string s;
};
struct my_struct st = { 1, 2, "hello" };
You index a struct with the "." operator except when it is a call-by-reference parameter and then you must use "->":
void foo(struct my_struct &byref)
{
    puts(byref->s);    // prints hello
}
puts(st.i);    // prints 1
puts(st.j);    // prints 2
puts(st.s);    // prints hello
foo(&st);      // pass st by reference
It is an error to use "." when "->" is required and vice-versa. Be careful to not put any whitespace around the "." or else you will get the string concatenation operator and not struct-member selection (this is a questionable overload of the "." operator but too useful to pass up).
Structs can be named like my_struct above or they can be anonymous:
struct {
    int i;
    int j;
} var1;
Struct names have their own namespace, so they will never clash with function, variable, or type names.
A struct in Little is implemented as a Tcl list.
In the examples above, we have been initializing arrays, hashes, and structs by putting values inside of {}:
string nums[] = { "one", "two", "three" };
In Little, the {} is an operator that creates a list and can be used anywhere an expression is valid. The array could instead be initialized like this:
string nums[];

nums = { "one", "two", "three" };
We said before that you can assign a value to something only if it has a compatible type. Lists are special in that they can be compatible with arrays, hashes, and structs. A list where all the elements are of the same type, say T, is compatible with an array of things of type T. The example above illustrates this.
A list also is compatible with a struct if the list elements agree in type and number with the struct. The assignment of the variable st above illustrates this.
A list is compatible with a hash if it has a sequence of key/value pairs and they are all compatible with the key/value types of the hash:
int myhash{string} = { "one"=>1, "two"=>2, "three"=>3 };
Lists are very useful at times because you can use them to build up larger complex structures. To concatenate two arrays, you could do this:
{ (expand)array1, (expand)array2 };
The (expand) operator takes an array (or struct or list) and moves its elements out a level as if they were between the { and } separated by commas. The section on manipulating complex data structures has more details on the (expand) operator.
A list in Little is implemented as a Tcl list.
Sometimes you don't want Little to do type checking. In this case, you use the poly type, which is compatible with any type. Poly effectively disables type checking, allowing you to use or assign values without regard to their types. Obviously, care must be taken when using poly.
The --poly option to Little causes all variables to be treated as if they were of type poly, regardless of how they are declared.
Something of one type can be converted into something of another type with a type cast like in C:
string_var = (string)13;
If the thing being cast cannot be converted to the requested type, the result of the cast is undef.
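Because a failed cast yields undef, a cast can double as a validity check. A minimal sketch, assuming the string "abc" is not convertible to an int (the variable names are illustrative):

```little
string input = "abc";
int    n = (int)input;    // not a valid integer, so n gets undef

// n is numeric, so test it with defined() rather than if (n).
unless (defined(n)) {
    puts("input is not an integer");
}
```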
You can declare a type name to be a shorthand for another type, as you would in C:
typedef struct { int x, y; } point;
And then use the shorthand as you would any other type name:
point points[]; points[] = { 1,1, 2,2, 3,3 };
You can typedef a function pointer too. This declares compar_t as type function that takes two ints and returns an int:
typedef int compar_t(int a, int b);
Type names belong to their own namespace, so you can define a typedef with the same name as a variable, function, or struct without ambiguity (though it is poor practice to do so).
Name scoping
Variables must be declared before use, or a compile-time error will result. However, functions need not be declared before use although it is good practice to do so.
Declarations at the top-level code exist at the global scope and are visible across all scripts executed by a single run of Little. You can qualify a global declaration with private to restrict it to the current file only; this is similar to a static in C, except that private globals are not allowed to shadow public globals. Names declared in a function, or in a block within a function, are local and are scoped to the block in which they are declared.
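The three kinds of scope might look like this in practice. This is a sketch only: the exact placement of the private qualifier is assumed from the description above, and all names are invented.

```little
int shared_count = 0;          // global: visible to every script in this run
private int file_count = 0;    // private global: visible in this file only

void bump()
{
    int step = 1;              // local: scoped to this function body

    shared_count += step;
    file_count += step;
}
```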
Functions and global variables share the same namespace, so a variable and function cannot have identical names. Struct tags have their own namespace, and type names have theirs.
Inside a function, two locals cannot share the same name, even if they are in parallel scopes. This is different than C where this is allowed. If a local shares the same name as a global, the local is said to shadow the global. Locals cannot shadow globals that have been previously declared.
Names declared inside of a class can be either local or global depending on how they are qualified.
String interpolation
Expressions can be interpolated into double-quoted strings, which means that within a string you can write an expression and at run-time its value will be inserted. For example, this interpolates two variables:
int    a = 12;
string b = "hello";
/* This will print "A is 12 and b is hello". */
printf("A is ${a} and b is ${b}\n");
Everything inside the ${} is evaluated like any other Little expression, so it is not limited to just variables:
printf("The time is ${`date`}\n");
s = "The result is ${some_function(a, b, c) / 100}";
Here documents
Sometimes you need to assign a multi-line string to a variable. Here documents help with that:
string s = <<EOF
This is the first line in s.
This is the second.
And the last.
EOF;
Everything in the line starting after the initial <<EOF delimiter and before the final EOF delimiter gets put into the variable s. You can use any identifier you want as the delimiter; it doesn't have to be EOF.
A semicolon after the EOF is optional.
The text inside the here document undergoes interpolation and escape processing. If you don't want that, put the initial delimiter inside of single quotes:
string s = <<'EOF'
None of this text is interpolated.
So this ${xyz} appears literally as '${xyz}'.
And so does \ and ' and " and anything else.
EOF;
To help readability, you can indent your here document but have the indenting white space ignored. Put the initial delimiter on the next line and then whatever whitespace you put before it gets ignored:
string s =
    <<EOF
    This is the first line in s and gets no leading white space.
     This line ends up with a single leading space.
      And this ends up with two.
    EOF;
Exceptions to the indentation rule: a blank line is processed as if it is indented, and the end delimiter can have any amount of leading white space so that you can indent it more or less if you like.
Operators
- Arithmetic
    +     addition
    ++    increment by 1 (integer only)
    -     subtraction
    --    decrement by 1 (integer only)
    *     multiplication
    /     division
    %     remainder
- Numeric and String comparison
    ==    equality
    !=    inequality
    <     less than
    >     greater than
    <=    less than or equal
    >=    greater than or equal
- String comparison
    =~    regexp match or substitute
    !~    negated regexp match
- Comparison of composite types (array, hash, struct)
    eq(a,b)
- Bit operations
    &     bit and
    |     bit or
    ^     bit exclusive or
    ~     bit complement
    <<    left shift
    >>    right shift
- Boolean logic
    &&    and
    ||    or
    !     not
- Conditional
    ?:    ternary conditional (as in C)
- Indexing
    []    array index
    {}    hash index
    .     struct index (no whitespace around the dot)
    ->    struct index (call-by-reference parameters dereference)
    ->    class and instance variable access (object dereference)
- Miscellaneous
    =     assignment
    ,     statement sequence
    .     string concatenation (must have whitespace around the dot)
    ``    command expansion
- Assignment
    +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=, .=
- Operator precedence (highest to lowest) and associativity
    ``                                        (non associative)
    [] {} . (struct index) -> ++ --           (left)
    unary + unary - ! ~ &                     (right)
    * / %                                     (left)
    + - . (string concatenation)              (left)
    << >>                                     (left)
    < <= > >=                                 (left)
    == != =~ !~                               (left)
    &                                         (left)
    ^                                         (left)
    |                                         (left)
    &&                                        (left)
    ||                                        (left)
    ?:                                        (right)
    = += -= *= /= %= &= |= ^= <<= >>= .=      (right)
    ,                                         (left)
Control transfer statements
Little has most of the usual conditional and looping constructs.
For conditionals, numeric variables (including poly variables with a number in them) evaluate to true or false based on their value. If you want to know if a numeric variable is defined you have to use the defined() built-in.
For all other variable types (arrays, hashes, structs, strings, etc), the variable itself will yield true or false if it is / is not defined.
int    undefined = undef;
int    zero = 0;
int    one = 1;
string args[];
string more[] = { "hi", "there", "mom" };

if (undefined)        // false (undef)
if (zero)             // false (0 value)
if (defined(zero))    // true (not undef)
if (one)              // true (1 value)
if (defined(one))     // true (set to some value)
if (args)             // false, not initialized
if (more)             // true, initialized
See the list of operators in the Operators section for information on comparison and logic operators, which are commonly used in conditional statements.
- if
    The if statement comes in the traditional form:

    if ( condition ) {
        ...
    } else if ( other condition ) {
        ...
    } else {
        ...
    }

    And there's a negated version of it (from Perl) provided as a more readable version of if (!condition).

    unless ( condition ) {
        ...
    }
- while
    while ( condition ) {
        ...
    }
    do {
        ...
    } while ( condition )
- for
    for (i = 0; i < max; ++i) {
        ...
    }
- foreach
    The foreach statement lets you iterate through the elements of an array:

    string element;
    string myarray[];

    foreach (element in myarray) {
        printf("This element is %s\n", element);
    }

    ... or of a hash:

    string key;
    int    value;
    int    myhash{string};

    foreach (key=>value in myhash) {
        printf("Key %s has value %d\n", key, value);
    }

    ... or of a string:

    string char;

    foreach (char in mystring) {
        printf("This char is %s\n", char);
    }

    ... or through the lines in a string:

    int    i = 0;
    string s;
    string lines = "a\nbb\nccc\ndddd\n";

    // (questionable) alias for foreach (s in split(/\n/, lines))
    foreach (s in <lines>) {
        puts("line #${++i}: ${s}");
    }

    Inside the loop, the index variable(s) (element, key, value, and char above) get copies of the iterated elements, so if you assign to them, the thing you're iterating over does not change.

    If you want to stride through more than one array element, character, or line in each iteration, just use a list of value variables instead of one:

    foreach (e1,e2,e3 in myarray) {
        printf("Next three are %s:%s:%s\n", e1, e2, e3);
    }

    If there isn't a multiple of three things to iterate through, the stragglers get undef on the last iteration. Strides work only for arrays and strings, not hashes.

    After completing the loop and falling through, all loop counters become undefined (they get the undef value). If the loop is prematurely ended with a break or by jumping out of the loop with a goto, the loop counters keep their values.
- braceless control flow
    Little allows braceless versions of if, unless, while, for, and foreach provided that a single statement follows the control flow keyword. Note that else is not in the list; an if/else always requires braces.
- switch
    The switch statement is like C's except that regular expressions and/or strings can be used as case expressions:

    switch (string_var) {
        case "true":
        case "false":
            puts("boolean (sort of)");
            break;
        case /[0-9]+/:
            puts("numeric");
            break;
        case /[a-zA-Z][0-9a-zA-Z]*/:
            puts("alphanumeric");
            break;
        default:
            puts("neither");
            break;
    }

    The default case is optional. The expression being switched on must be of type int, string, or poly.

    In addition to checking the value of the switch expression, you can test whether it is undefined. This is useful when switching on a function return value which could be undef to signal an error condition.

    switch (myfunc(arg)) {
        case /OK/:
            puts("all is A-OK");
            break;
        case undef:
            puts("error");
            break;
        default:
            puts("unknown return value");
            break;
    }

    Regular expressions have an alternative syntax (borrowed from Perl) that is used when the expression may contain "/". In switch statements that syntax is somewhat restricted because of the parsing problems you can imagine below:

    switch (str) {
        case m|x*y|:    // "|" and most other punctuation as the delim are OK,
                        // except "(" and ":" -- error
                        // and any alphabetic character -- error
            break;
        case m:         // is the variable m (not a regexp) -- ok
            break;
        case mvar:      // and variables starting with "m" -- ok
            break;
    }
- break and continue
    Little has break and continue statements that behave like C's. They work in all Little loops including foreach loops, and break works in switch case bodies.
- goto
    The goto statement unconditionally transfers control to a label in the same function, or to a label at the global scope if the goto is at the global scope. You cannot use a goto to transfer in to or out of a function. Labels have their own namespace so they will not clash with variable, function, or type names.

    /* A goto at the global scope. */
    goto L1;
    puts("this is not executed");
    L1: puts("but this is");

    void foo()
    {
        goto L2;
        puts("this is not executed");
        L2: puts("but this is");
    }

    Some caveats: do not jump into a foreach loop or a run-time error may result due to bypassing the loop set-up. Do not bypass a variable declaration or else the variable will be inaccessible.
Functions
Little's functions are much like functions in C. Like variable names, function names cannot begin with an underscore (_).
Each function must be declared with a return type and a formal-parameter list:
int sum(int a, int b) { return (a + b); }
void is a legal return type for a function that returns no value.
Functions cannot be nested.
Function prototypes are allowed, where all but the function body is declared. In a prototype, you can omit any parameter names or use void for an empty parameter list:
void no_op1(void);
void no_op2();
int sum(int, int);
Unlike Perl, when calling a function you must use parentheses around the arguments:
sum(a, b);
Little does a special kind of call called a pattern function call when the function name is capitalized and contains an underscore; these are useful for calling Tcl commands and are described later in the section "Calling Tcl from Little". Names of normal functions therefore should not be both capitalized and contain an underscore.
Parameters are passed by value by default. To pass by reference, you use a & in the declaration and in the function call:
void inc(int &arg)
{
    ++arg;
}
inc(&x);    // inc() can change x
The & only tells Little to pass by reference. It is not a pointer (no pointer arithmetic), it is a reference. You use a reference to give the called function the ability to change the caller's variable.
Only variables can be passed by reference, not elements of arrays, hashes, or structs. This is one significant difference from C. Passing an array, hash, or struct element with & uses copy in/out, not a true reference. The element value is copied into a temp variable and the temp is passed by reference. Then when the function returns, any changes to the temp are copied back into the array, hash, or struct element. In most cases this behaves like call-by-reference and you don't need to worry about it. But if you access the passed element during the function call, by referencing it directly instead of through the formal parameter, then you must be careful:
string array[] = { "one", "two" };

void fn(string &var, string val)
{
    var = val;
    array[0] = "this gets overwritten by the copy-out";
}

void main()
{
    fn(&array[0], "new");
    puts(array[0]);    // will print "new"
}
Instead of passing a reference, you can pass undef like you would a NULL pointer in C. You test for this with the defined() operator:
void inc(int &arg)
{
    if (defined(&arg)) ++arg;
}
inc(undef);    // does nothing
inc(&x);       // increments x
In the example above, you might question why it is defined(&arg) instead of defined(arg). If you think of it in terms of C, the &arg is like looking at a pointer to see if it is non-null, and the ++arg is like *p += 1.
If you pass undef as a reference and then attempt to access the parameter, a run-time error results similar to dereferencing a NULL pointer in C.
When accessing a struct argument inside a function, if the struct was passed by reference, the "->" operator must be used instead of ".". This makes it clear to the reader that the struct variable is passed by reference; it is intended to allude to a C pointer even though Little does not have general-purpose pointers.
Variable arguments to functions
Functions can take a variable number of arguments, like printf does. In the function declaration, you use the qualifier "..." in front of the last formal parameter name and omit its type:
    void dump(...args)
    {
        string s;
        foreach (s in args) puts(s);
    }
    dump("just one");
    dump("but two", "or three", "or more is OK");
Inside the function, args has type array of poly, allowing any number of parameters of any type to be passed.
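As a further sketch of the same mechanism, the loop variable can be given a concrete type when every caller is known to pass values of that type. The sum() helper below is hypothetical (it is not a Little library function) and assumes all arguments are integers:

```l
// Hypothetical helper: add up any number of int arguments.
int sum(...args)
{
    int i, total = 0;

    // args is an array of poly; each element is assigned to the int i.
    foreach (i in args) total += i;
    return (total);
}

void main()
{
    printf("%d\n", sum(4));
    printf("%d\n", sum(1, 2, 3));
}
```

Declaring the loop variable as poly instead, as the Myfunc_* example later in this document does, would accept arguments of mixed types.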
The main() function
If main() is present, it is called after all of the top-level statements have executed. The main() function may be defined in any of the following ways:
    void|int main(void) {}
    void|int main(string argv[]) {}
    void|int main(int argc, string argv[], string env{string}) {}
The argv array is populated from the script name and any arguments
that appear after the name on the Little command line.
In this example, argc is 4 and argv[] contains "script.l", "arg1",
"arg2", and "arg3":
L script.l arg1 arg2 arg3
The env hash is populated with the environment variables present when Little is invoked. Although you can change this hash, writes to it are not reflected back into the environment. To do that use the putenv library function.
Only a main written in Little is automatically called. You can write a main in Tcl but Little will not call it automatically.
If main is declared to have return type int then any value returned will be the exit value.
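A minimal sketch of the three-parameter form follows. It assumes the script is run with at least one argument and that hash elements are read with the h{"key"} subscript; the usage message and the choice of the HOME variable are illustrative only:

```l
int main(int argc, string argv[], string env{string})
{
    int i;

    // argv[0] is the script name, so real arguments start at index 1.
    unless (argc > 1) {
        fprintf(stderr, "usage: %s arg ...\n", argv[0]);
        return (1);        // becomes the exit value
    }
    for (i = 1; i < argc; i++) {
        printf("argv[%d] = %s\n", i, argv[i]);
    }
    // env is a snapshot; reading it does not touch the real environment.
    if (defined(env{"HOME"})) printf("HOME = %s\n", env{"HOME"});
    return (0);
}
```

Run as "L script.l arg1 arg2", this would list the two arguments and exit with status 0; run with no arguments, it would print the usage line and exit with status 1.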
Function pointers
Function pointers are supported, but only as arguments -- you cannot otherwise assign a function pointer to a variable. It is common to first typedef the function-pointer type; here is one for a function that compares two strings:
typedef int str_compar_t(string a, string b);
You can then pass such a compare function as follows:
    string a[];
    bubble_sort(a, &unary_compar);
Where the sort function looks like this:
    string[] bubble_sort(string a[], str_compar_t &compar)
    {
        do {
            ...
            if (compar(a[i], a[i+1]) > 0) {
                ...
            }
            ...
        } ...
    }
And the compare function looks like this:
    int unary_compar(string a, string b)
    {
        int al = length(a);
        int bl = length(b);
        if (al < bl) {
            return -1;
        } else if (al > bl) {
            return 1;
        } else {
            return 0;
        }
    }
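The elided parts of bubble_sort above are left as the original gives them. For illustration only, here is one way a complete version might look; the loop body, the swapped flag, and the by_length comparison function are assumptions of this sketch, not the author's implementation:

```l
typedef int str_compar_t(string a, string b);

// Hypothetical comparison: order strings by length.
int by_length(string a, string b)
{
    return (length(a) - length(b));
}

string[] bubble_sort(string a[], str_compar_t &compar)
{
    int i, swapped;
    string tmp;

    do {
        swapped = 0;
        for (i = 0; i < length(a) - 1; i++) {
            // Call through the function pointer like a normal function.
            if (compar(a[i], a[i+1]) > 0) {
                tmp = a[i];
                a[i] = a[i+1];
                a[i+1] = tmp;
                swapped = 1;
            }
        }
    } while (swapped);
    return (a);
}

void main()
{
    string s;
    string words[] = { "ccc", "a", "bb" };
    string sorted[];

    sorted = bubble_sort(words, &by_length);
    foreach (s in sorted) puts(s);
}
```

Because the comparison is passed in, the same sort can be reused with unary_compar or any other function matching str_compar_t.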
Regular expressions
Little's regular expression support is based on the PCRE (Perl Compatible Regular Expressions) library http://www.pcre.org. The basics are documented here but for more extensive documentation please see http://www.pcre.org/pcre.txt.
- Simple matching
-
    if (s =~ /foo/) { ... }    // true if s contains "foo"
    if (s !~ /foo/) { ... }    // false if s contains "foo"
The // matching operator must be used in conjunction with =~ and !~ to tell Little what variable to look at.
If your regular expression contains forward slashes, you must escape them with a backslash, or you can use an alternate syntax where almost any punctuation becomes the delimiter:
    if (s =~ m|/path/to/foo|) { ... }
    if (s =~ m#/path/to/foo#) { ... }
    if (s =~ m{/path/to/foo}) { ... }
In the last case, note that the end delimiter } is different than the start delimiter { and you must escape all uses of either delimiter inside the regular expression.
- Simple substitution
-
    x =~ s/foo/bar/;     // replaces first foo with bar in x
    x =~ s/foo/bar/g;    // replaces all instances of foo with bar in x
    x =~ s/foo/bar/i;    // does a case-insensitive search
This form also has several alternate syntaxes:
    x =~ s|/bin/root|~root|;
    x =~ s{foo}{bar};
    x =~ s{foo}/bar/;
- More complex regular expressions
-
    .              a single character
    \s             a whitespace character (space, tab, newline, ...)
    \S             non-whitespace character
    \d             a digit (0-9)
    \D             a non-digit
    \w             a word character (a-z, A-Z, 0-9, _)
    \W             a non-word character
    [aeiou]        matches a single character in the given set
    [^aeiou]       matches a single character not in given set
    (foo|bar|baz)  matches any of the alternatives specified
    ^              start of string
    $              end of string
Quantifiers can be used to specify how many of the previous thing you want to match on, where "thing" means either a literal character, one of the meta characters listed above, or a group of characters or meta characters in parentheses.
    *              zero or more of the previous thing
    +              one or more of the previous thing
    ?              zero or one of the previous thing
    {3}            matches exactly 3 of the previous thing
    {3,6}          matches between 3 and 6 of the previous thing
    {3,}           matches 3 or more of the previous thing
Some brief examples:
    /^\d+/         string starts with one or more digits
    /^$/           nothing in the string (length == 0)
    /(\d\s){3}/    three digits, each followed by a whitespace character (eg "3 4 5 ")
    /(a.)+/        matches a string in which every odd-numbered letter is "a" (eg "abacadaf")
    // This loop reads from stdin, and prints non-blank lines.
    string buf;
    while (buf = <stdin>) {
        unless (buf =~ /^$/) puts(buf);
    }
- Unicode
-
Both regular expressions and the strings they are matched against can contain Unicode characters or binary data. This example looks for a null byte in a string:
if (s =~ /\0/) puts("has a null");
- Parentheses for capturing
-
As well as grouping, parentheses serve a second purpose. They can be used to capture the results of parts of the regexp match for later use. The results end up in $1, $2 and so on, and these capture variables are available in the substitution part of the operator as well as afterward. You can use up to nine captures ($1 - $9).
    // Break an e-mail address into parts.
    if (email =~ /([^@]+)@(.+)/) {
        printf("Username is %s\n", $1);
        printf("Hostname is %s\n", $2);
    }
    // Use $1,$2 in the substitution to swap two words.
    str =~ s/(\w+) (\w+)/$2 $1/;
Capturing has a limitation. If you have more than one regexp with captures in an expression, the last one evaluated sets $1, $2, etc.
    // This loses email1's captures.
    if ((email1 =~ /([^@]+)@(.+)/) && (email2 =~ /([^@]+)@(.+)/)) {
        printf("Username is %s\n", $1);
        printf("Hostname is %s\n", $2);
    }
In situations like this, care must be taken because the evaluation order of sub-expressions generally is undefined. But this example is an exception because the && operator always evaluates its operands in order.
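One way to keep the first match's captures is to copy them into ordinary variables before the second regexp runs. This is only a sketch; the variable names and the sample addresses are made up for illustration:

```l
string email1 = "alice@example.com";
string email2 = "bob@example.org";
string user1, host1;

if (email1 =~ /([^@]+)@(.+)/) {
    // Save the captures now; the next match overwrites $1 and $2.
    user1 = $1;
    host1 = $2;
    if (email2 =~ /([^@]+)@(.+)/) {
        printf("First:  %s at %s\n", user1, host1);
        printf("Second: %s at %s\n", $1, $2);
    }
}
```

Splitting the test into two nested if statements also removes any doubt about which regexp was evaluated last.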
Includes
Little has an #include statement like the one in the C pre-processor.
A #include can appear anywhere a statement can appear as long as it begins
in the first column and is contained entirely on one line:
    #include <types.l>
    #include "myglobals.l"
    void main() { ... }
Unless given an absolute path, when the file name is in angle brackets (like <types.l>), Little searches these paths, where BIN is where the running tclsh exists:
    $BIN/include
    /usr/local/include/L
    /usr/include/L
When the file name is in quotes (like "myglobals.l"), Little searches only the directory containing the script that did the #include.
Little also remembers which files have been included and will not include a file more than once, allowing you to have #include files that include each other.
Classes
Little has a class abstraction for encapsulating data and functions that operate on that data. Little classes are simpler than full-blown object-oriented programming (there is no inheritance), but they get you most of the way there.
You declare a class like this:
    class myclass { .... }
The name myclass becomes a global type name, allowing you to declare an object of myclass:
    myclass obj;
You can declare both variables and functions inside the class. These all must be declared inside one class declaration at the global scope. You cannot have one class declaration that has some of the declarations and another with the rest, and you cannot nest classes inside of functions or other classes.
Inside the class, you can have class variables and instance variables. Class variables are associated with the class and not the individual objects that you allocate, so there is only one copy of each. Instance variables get attached to each object.
    class myclass
    {
        /* Class variables. */
        public string pub_var;
        private int num = 0;
        /* Instance variables. */
        instance {
            public string inst_var;
            private int n;
        }
        ...
    }
All declarations (except the constructors and destructors) must be qualified with either public or private to say whether the name is visible at the global scope or only inside the class.
A class can have one or more constructors and destructors but they are optional.
Inside a constructor, the variable self is automatically declared as the object being constructed. A constructor should return self, although it also could return undef to signal an error. A destructor must be declared with self as the first parameter.
    constructor myclass_new()
    {
        n = num++;
        return (self);
    }
    destructor myclass_delete(myclass self) {}
If omitted, Little creates a default constructor or destructor named classname_new and classname_delete. Although not shown in this example, you can declare them with any number of parameters, just like regular functions.
A public class member function is visible at the global scope, so its name must not clash with any other global function or variable. A private member function is local to the class.
The first parameter to each public function must be self, the object being operated on. Private functions do not explicitly include self in the parameter list because it is implicitly passed by the compiler.
    private void bump_num()
    {
        ++n;
    }
    public int myclass_getnum(myclass self)
    {
        bump_num();
        return (n);
    }
To create an object, you must call a constructor, because just declaring the variable does not allocate anything:
    myclass obj;
    obj = myclass_new();
To operate on an object, you call one of its public member functions, passing the object as the first argument:
int n = myclass_getnum(obj);
Little allows you to directly access public class and instance variables from outside the class. To get a class variable, you dereference the class name (you must use ->):
string s = myclass->pub_var;
To get a public instance variable, you dereference the object whose data you want to access:
string s = obj->inst_var;
Once you free an object
    myclass_delete(obj);
you must be careful to not use obj again unless you assign a new object to it, or else a run-time error will result.
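Putting the pieces of this section together, a small class might be written as follows. The counter class and all of its member names are hypothetical, chosen only to show a class variable, an instance variable, a constructor, a destructor, and a public member function in one place:

```l
class counter
{
    /* Class variable: one copy shared by every counter object. */
    private int created = 0;

    /* Instance variable: one copy per object. */
    instance {
        private int count;
    }

    constructor counter_new()
    {
        count = 0;
        ++created;
        return (self);
    }

    destructor counter_delete(counter self) {}

    /* Public functions take the object as their first parameter. */
    public int counter_bump(counter self)
    {
        return (++count);
    }
}

void main()
{
    counter c;

    c = counter_new();      // declaring c alone allocates nothing
    counter_bump(c);
    printf("count is %d\n", counter_bump(c));
    counter_delete(c);      // c must not be used after this
}
```

The public names carry a counter_ prefix because they are visible at the global scope and must not clash with other global names.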
Working with Tcl/TK
Little is built on top of Tcl: Little functions are compiled down to Tcl procs, Little local variables are just Tcl variables local to the proc, and Little global variables are Tcl globals. Although Little is designed to hide its Tcl underpinnings, sometimes it is useful for Little and Tcl to cooperate.
Mixing Little and Tcl Code
When you invoke Little with a script whose name ends in .l, the script must contain only Little code. If you run a .tcl script, you can mix Little and Tcl:
    puts "This is Tcl code"
    #lang L
    printf("This is Little code\n");
    #lang tcl
    puts "Back to Tcl code"
You also can run Little code from within Tcl by passing the Little code to the Tcl command named L:
    puts "Tcl code again"
    L { printf("Called from the Little Tcl command.\n"); }
Calling Tcl from Little
You call a Tcl proc from Little like you would a Little function:
    string s = "hello world";
    puts(s);
In this example, puts is the Tcl command that outputs its argument to the stdout channel appending a trailing newline.
If you want argument type checking, you can provide a prototype for the Tcl functions you call. Otherwise, no type checking is performed.
In Tcl, options usually are passed as strings like "-option1" or "-option2". Little has a feature to pass these options more pleasantly:
    func(option1:);               // passes "-option1"
    func(option2: value, arg);    // passes "-option2", value, arg
Without this, you would have to say:
    func("-option1");
    func("-option2", value, arg);
A similar feature is for passing sub-commands to Tcl commands:
    String_length("xyzzy");    // like Tcl's [string length xyzzy]
    String_isSpace(s);         // like Tcl's [string is space $s]
Whenever the function name is capitalized and contains an underscore, the sequence of capitalized names after the underscore is converted to (lower case) arguments (although capitalizing the first name after the underscore is optional). This is called a pattern function call.
    x = Something_firstSecondThird(a, b)
is like this in Tcl:
    set x [something first second third $a $b]
A pattern-function call often is used to call a Tcl proc, but you can call a Little function just as easily, and Little has a special case when the function is named like Myfunc_*:
    void Myfunc_*(...args)
    {
        poly p;
        printf("Myfunc_%s called with:\n", $1);
        foreach (p in args) printf("%s\n", p);
    }
    void main()
    {
        Myfunc_cmd1(1);
        Myfunc_cmd2(3,4,5);
    }
If Myfunc_* is declared, then any call like Myfunc_x becomes a call to Myfunc_* where the string x is put into the local variable $1 inside Myfunc_*. The remaining parameters are handled normally.
This gives you a way to handle a collection of sub-commands without having to declare each as a separate Little function.
Note that this use of $1 clashes with regular expression captures (described earlier), so if you use both, you should save off $1 before using any such regular expressions.
Note: we are going to change this to not conflict with regular expressions.
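The advice about saving $1 can be sketched as follows. The Cfg_* function, its sub-command names, and the key format are all hypothetical, and the sketch assumes $1 can be assigned to a string like any other value:

```l
void Cfg_*(...args)
{
    string cmd = $1;        // save the sub-command name before any regexp runs
    string key = args[0];

    if (key =~ /^(\w+)\.(\w+)$/) {
        // From here on $1 and $2 are regexp captures, not the sub-command.
        printf("%s: section=%s field=%s\n", cmd, $1, $2);
    } else {
        printf("%s: plain key %s\n", cmd, key);
    }
}

void main()
{
    Cfg_get("user.name");
    Cfg_set("color", "blue");
}
```

Without the copy into cmd, the first printf would have no way to recover the sub-command name once the match had succeeded.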
If you need to execute arbitrary Tcl code rather than just call a proc, you pass it to Tcl's eval command:
    eval("puts {you guessed it, Tcl code again}");
Calling Little from Tcl
Little functions are easily called from Tcl, because a Little function foo compiles down to a Tcl proc named foo in the global namespace. Let's say this is run from a script named script.tcl:
    #lang L
    int avg(...args)
    {
        int i, sum=0;
        unless (length(args)) return (0);
        foreach (i in args) sum += i;
        return (sum/length(args));
    }
    #lang tcl
    set x [avg 4 5 6]
    puts "The average is $x"
The Little code defines a proc named avg which the Tcl code then calls.
An exception is that private Little functions are not callable from Tcl.
Variables
Because Little variables are just Tcl variables, you can access Little variables from Tcl code. Here is an example from the Little library:
    int size(string path)
    {
        int sz;
        if (catch("set sz [file size $path]")) {
            return (-1);
        } else {
            return (sz);
        }
    }
In this Tcl code, $path refers to the Little formal parameter path, and the Little local sz is set to the file size. This example also illustrates how you can use Tcl's exception-handling facility to catch an exception raised within some Tcl code.
An exception is that private Little global names are mangled (to make them unique per-file). You can pass the mangled name to Tcl code with the & operator. Here we are passing the name of the private function mycallback to register a Tcl fileevent "readable" handler:
    private void mycallback(FILE f) { ... }
    fileevent(f, "readable", {&mycallback, f});
Complex variables
Passing scalar variables works because they have the same representation in Little and in Tcl.
Passing complex variables is trickier and is not supported, but if you want to try here is what you need to know. This is subject to change. A Little array is a Tcl list. A Little struct is a Tcl list with the first struct member as the first list element and so on. A Little hash table is a Tcl dict. If a Little variable is deeply nested, so is the Tcl variable.
So long as you understand that and do the appropriate thing in both languages, passing complex variables usually is possible.
Namespaces
You can access Tcl procs and variables in namespaces other than the global namespace by qualifying the name:
    extern string ::mynamespace::myvar;
    /* Print a bytecode disassembly of the proc "foo". */
    puts(::tcl::unsupported::disassemble("proc", "foo"));
    /* Print a variable in another namespace. */
    puts(::mynamespace::myvar);
Calling Tk
To help call Tk widgets, Little has a widget type that is used with the pattern function calls described above. A widget value behaves like a string except in a pattern function call where it is the name of the widget to call:
    widget w = Text_new();
    Text_insert(w, "end", "hi!");    // like Tk's $w insert end hi!
Another feature is useful for calling Tk widgets that take the name of a variable whose value is updated when the user changes a widget field. You can use a Little variable like this:
    string msg;
    ttk::label(".foo", textvariable: &msg);
The ampersand (&) in front of msg alludes to a C pointer but it really passes just the name of the variable. Little does this when the option name ends in "variable", as "textvariable" does in the example above (yes, this is a hack).
Learning more about Tcl/Tk
The Little language distribution includes the Tcl and Tk repositories. Each of those has a doc/ subdirectory with files whose names start with either an upper-case or a lower-case letter. Ignore the upper-case files ending in .3; those are internal C API documentation. The lower-case files document the exposed Tcl / Tk APIs. They are all in nroff -man markup. To view one:
    $ nroff -man file.n | less
For books we like these:
    Tcl and the Tk Toolkit 2nd Edition
    Effective Tcl/Tk Programming: Writing Better Programs with Tcl and Tk
Manipulating complex structures
Little has built-in operators for turning complex data structures into something else: (expand) and (tcl).
(expand) takes an array of things and pushes them all onto the run-time stack to call a function that expects such a list. It is identical to Tcl's {*}:
    void foo(string a, string b, string c);
    string v[] = { "one", "two", "three" };
    foo((expand)v);    // passes three string arguments to foo
It expands only one level, so if the array contains three hashes instead of three strings, (expand)v passes three hashes to foo. (expand) works with structs too.
If you have this structure:
    struct {
        int i[];
        int h{string};
    } foo = {
        { 0, 1, 2, 3, },
        { "big" => 100, "medium" => 50, "small" => 10 }
    };
And you use (expand) when passing these as arguments:
func((expand)foo);
you need a function definition like this:
void func(int nums[], int sizes{string}) { }
There is no way to recursively expand at this time.
(tcl) is used to pass a single string to a Tcl proc for processing. It puts in the Tcl quotes. So
(tcl)foo
is
0 1 2 3 { big 100 medium 50 small 10 }
Another example:
    string v[] = { "a b c", "d", "e" };
    string arg = (tcl)v;    // arg is "{a b c} d e"
Sometimes you need to assign a group of variables all at once. You can do this by assigning a list of values to a list of variables:
{a, b, c} = {1, 2, 3};
This is more than a short-cut for the three individual assignments. The entire right-hand side gets evaluated first, then the assignment occurs, so you can use this to swap the value of two variables:
{a, b} = {b, a};
If you want to ignore one of the elements in the right-hand list, you can put undef in the corresponding element of left-hand list instead of having to use a dummy variable:
    {a, undef, b} = {1, 2, 3};    // a gets 1, b gets 3
If the right-hand side list isn't as long as the left-hand list, the stragglers get undef:
    {a, b, c} = {1, 2};    // a gets 1, b gets 2, c gets undef
These composite assignments also work with arrays or structs on the right-hand side:
    int dev, inode;
    struct stat st;
    lstat(file, &st);
    {dev, inode} = st;                // pull out first two fields of the stat struct
    {first, second} = split(line);    // get first two words in line
Html with embedded Little
For building web-based applications, Little has a mode where the input can be HTML with embedded Little code (which we call Lhtml). This works in a way similar to PHP.
To invoke this mode, the input file must end in .lhtml:
L [options] home.lhtml
All text in home.lhtml is passed through to stdout except that anything between <? and ?> is taken to be one or more Little statements that are replaced by whatever that Little code outputs, and anything between <?= and ?> is taken to be a single Little expression that is replaced by its value. All Little code is compiled at the global scope, so you can include Little variable declarations early in the Lhtml document and reference them later.
Here's an example that iterates over an array of key/value pairs and formats them into a rudimentary table:
    <? key_value_pair row, rows[]; ?>
    <html>
    <body>
    <p>This is a table of data</p>
    <table>
    <? rows = get_data();
       foreach (row in rows) { ?>
        <tr>
            <td><?= row.key ?></td>
            <td><?= row.value ?></td>
        </tr>
    <? } ?>
    </table>
    </body>
    </html>
Pre-defined identifiers
- __FILE__
-
A string containing the name of the current source file, or "<stdin>" if the script is read from stdin instead of from a file. Read only.
- __LINE__
-
An int containing the current line number within the script. Read only.
- __FUNC__
-
A string containing the name of the enclosing function. At the top level, this will contain a unique name created internally by the compiler to uniquely identify the current file's top-level code. Read only.
- END
-
An int containing the index of the last character of a non-empty string or the last element of a non-empty array. If the array or string is empty, END is -1. Valid only inside of a string or array subscript. Read only.
- stdio_status
-
A struct of type STATUS (see system()) containing status of the last system(), `command`, successful waitpid(), or failed spawn().
- undef
-
A poly containing the undef value, where defined(undef) is false. Assigning this to something makes it undefined. However, undef is not guaranteed to have any particular value, so applications should not rely on the value. Read only.
Reserved words
The following identifiers are reserved. They cannot be used for variable, function, or type names:
break case class constructor continue default defined destructor do else END expand extern float for foreach goto if instance int poly private public return string struct switch typedef undef unless void while widget
Debugging
Function tracing
Little function tracing is controlled with #pragma statements, _attribute clauses in function declarations, command-line options, environment variables, and a run-time API. When a function is marked for tracing, by default its entry and exit are traced to stderr, but you can use your own custom hooks to do anything you want.
A #pragma takes a comma-separated list of attribute assignments:
    #pragma fntrace=on
    string myfunc(int arg)
    {
        return("return value");
    }
    void main()
    {
        myfunc(123);
    }
When this program runs, traces go to stderr with a millisecond timestamp, the function name, parameter values, and return value:
    1: enter main
    1: enter myfunc: '123'
    2: exit myfunc: '123' ret 'return value'
    3: exit main
The allowable tracing attributes are as follows.
- fntrace=on | entry | exit | off
-
Enable tracing on both function entry and exit, entry only, exit only, or disable tracing altogether.
- trace_depth=n
-
Trace only to a maximum call depth of n.
- fnhook=myhook
-
Use myhook as the trace hook (see below).
A #pragma stays in effect until overridden by another #pragma or by an _attribute clause in a function declaration which provides per-function tracing control:
    // don't trace this function
    void myfunc2(int arg) _attribute (fntrace=off)
    {
    }
Tracing also can be controlled with command-line options:
- --fntrace=on | entry | exit | off
-
Enable tracing of all functions on both function entry and exit, entry only, exit only, or disable all tracing. This overrides any #pragma or _attribute clauses in the program.
- --trace-out=stdin | stderr | filename | host:port
-
Send default trace output to stdin, stderr, a file, or a TCP socket.
- --trace-files=colon-separated list of glob | /regexp/
-
Enable tracing of all functions in the given files, specified either as globs or regular expressions. A + before a glob or regexp enables tracing, a - disables, and no + or - is like having a +, except that the leading one is special: if omitted, it means trace exactly what is specified, overriding any #pragmas or _attribute clauses in the code, by first removing all traces and then processing the file list.
- --trace-funcs=colon-separated list of glob | /regexp/
-
Like trace-files but specifies functions.
- --fnhook=myhook
-
Use myhook as the trace hook, overriding any #pragmas in the program.
- --trace-script=script.l | Little code
-
Get the trace hook from a file, or use the given Little code (see below).
Some examples:
    # Trace all functions
    $ L --fntrace=on myscript.l
    # Trace only foo
    $ L --trace-funcs=foo myscript.l
    # Trace foo in addition to what the source marks for tracing
    $ L --trace-funcs=+foo myscript.l
    # Trace all functions except foo
    $ L --trace-funcs=*:-foo myscript.l
    # This does it too
    $ L --fntrace=on --trace-funcs=-foo myscript.l
Environment variables also can control tracing and take precedence over the other ways above:
    L_TRACE_ALL=on | entry | exit | off
    L_TRACE_OUT=stdin | stderr | filename | host:port
    L_TRACE_FILES=colon-separated list of glob | /regexp/
    L_TRACE_FUNCS=colon-separated list of glob | /regexp/
    L_TRACE_DEPTH=n
    L_TRACE_HOOK=myhook
    L_TRACE_SCRIPT=script.l | <Little code>
Things in L_TRACE_FUNCS are applied after things in L_TRACE_FILES. As with the command-line options, they also can begin with + or - to add or subtract from what is specified elsewhere.
As a short-cut,
L_TRACE=stdin | stderr | filename | host:port
traces all functions and sets the trace output location.
More examples:
    # Trace all files except foo.l
    L_TRACE_FILES=*:-foo.l L myscript.l
    # Trace main() and buggy() in addition to whatever is marked
    # for tracing with #pragmas or _attribute clauses in the code.
    L_TRACE_FUNCS=+main:buggy L myscript.l
    # Trace *only* main() and buggy().
    L_TRACE_FUNCS=main:buggy L myscript.l
There also is a run-time API that takes a hash of named arguments analogous to those above:
    Ltrace({ "fntrace" => "on",
             "fnhook_out" => "myhook",
             "trace_depth" => 3,
             "trace_out" => "tracing.out",
             "trace_files" => "foo.l",
             "trace_funcs" => "+main:buggy" });
To use your own tracing function, specify fnhook in any of the
above ways.
Your hook is called on function entry and exit instead of
the default hook.
Its prototype must look like this:
void myhook(int pre, poly argv[], poly ret);
where pre is 1 when your hook is called upon function entry and 0 when called upon exit, argv contains the function's arguments (argv[0] is the function name; argv[1] is the first parameter), and ret is the return value (exit hook only; it is undef for entry).
If you use your own hook and then want to go back to the default, set fnhook=def.
To avoid infinite recursion, during the call of a hook, further calls into the hook are disabled. Also, functions defined as hooks, and the Little library functions, are not traced.
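A custom hook matching the prototype above might look like the sketch below. The hook body, the calls counter, and the square function are made up for illustration, and the sketch assumes that fntrace and fnhook can be combined in a single #pragma as a comma-separated list:

```l
#pragma fntrace=on, fnhook=myhook

int calls = 0;    // hypothetical global: number of traced function entries

void myhook(int pre, poly argv[], poly ret)
{
    if (pre) {
        // Entry: argv[0] is the function name, argv[1] the first parameter.
        ++calls;
        fprintf(stderr, "-> %s\n", argv[0]);
    } else {
        // Exit: ret holds the return value.
        fprintf(stderr, "<- %s (%d entries so far)\n", argv[0], calls);
    }
}

int square(int x)
{
    return (x * x);
}

void main()
{
    square(3);
    square(4);
}
```

Since functions used as hooks are not themselves traced, myhook does not need an _attribute clause to exclude it.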
The trace-script attribute is a useful way to provide your own hook:
    L_TRACE_SCRIPT=my-trace-hook.l    // filename must end in .l
    L_TRACE_SCRIPT=<Little code>
In the latter case, the Little code gets wrapped in a function like this:
    void L_fn_hook(int pre, poly av[], poly ret)
    {
        ...code from L_TRACE_SCRIPT...
    }
and L_fn_hook is used as the default trace hook.
As one example of where this is useful: say you are trying to find whether the function foo is ever called with the first argument of 123, and if so, to print all the arguments. Because av[0] holds the function name, the first argument is av[1]:
    L_TRACE_FUNCS=foo \
    L_TRACE_SCRIPT='if (av[1]==123) puts(av)' L myscript.l
Built-in and library functions
Little has built-in functions and a set of library functions modeled after the standard C library and Perl.
- string <>
- string <FILE f>
-
Get the next line from a FILE handle and return it, or return undef for EOF or errors. Trailing newlines are removed. If a file handle is specified, it is not closed by this function.
The form without a file handle
while (buf = <>) { ... }
means
    unless (argv[1]) {
        while (buf = <stdin>) {
            ...
        }
    } else for (i = 1; argv[i]; i++) {
        unless (f = open(argv[i], "r")) {
            perror(argv[i]);
            continue;
        }
        while (buf = <f>) {
            ...
        }
    }
A trivial grep implementation:
    void main(int ac, string argv[])
    {
        string regexp = argv[1];
        string buf;
        unless (regexp) die("usage: grep regexp [files]");
        undef(argv[1]);    // left shift down the args
        while (buf = <>) {
            if (buf =~ m|${regexp}|) puts(buf);
        }
    }
- string `command`
-
Execute the command (the string enclosed within back-ticks) and substitute its stdout as the value of the expression. Any output to stderr is passed through to the calling application's stderr and is not considered an error. The command is executed using the Tcl exec command which understands I/O re-direction and pipes, except that command is split into arguments using Bourne shell style quoting instead of Tcl quoting (see shsplit). The command string is interpolated. Backslash escapes $, `, and \, \<newline> is ignored, but otherwise backslash is literally interpreted. An embedded newline is an error. If the command cannot be run, undef is returned. The global variable stdio_status (see system()) contains the command's exit status.
- int abs(int val)
- float abs(float val)
-
Return the absolute value of the argument.
- void assert(int condition)
-
Print an error and exit with status 1 if condition is false. The filename, line number, and text of the condition are printed.
- string basename(string path)
-
Return the file portion of a path name.
- string caller(int frame)
-
Return the name of a calling function, or the caller's caller, etc. To get the caller, use a frame of 0; to get the caller's caller, use 1; and so on.
- int chdir(string dir)
-
Change directory to dir. Return 0 on success, -1 on error.
- int chmod(string path, string permissions)
-
Not available on Windows. Change the mode of the file or directory named by path. Permissions can be the octal code that chmod(1) uses, or symbolic attributes that chmod(1) uses of the form [ugo]?[[+-=][rwxst],[...]], where multiple symbolic attributes can be separated by commas (example: u+s,go-rw sets the set-user-ID bit for user and removes read and write permissions for group and other). A simplified ls-style string, of the form rwxrwxrwx (must be 9 characters), is also supported (example: rwxr-xr-t is equivalent to 01755). Return 0 on success, -1 on error.
- int chown(string owner, string group, string path)
-
Not available on Windows. Change the file ownership of the file or directory named by path. If either owner or group is an empty string, the attribute will not be modified. Return 0 on success, -1 on error.
- int cpus()
-
Return the number of processors (if known). Defaults to 1.
- void die(string fmt, ...args)
-
Output a printf-like message to stderr and exit 1. If fmt does not end with a newline, append " in <filename> at line <linenum>.\n"
- string dirname(string path)
-
Return the directory portion of a pathname.
- int eq(compositeType a, compositeType b)
-
Compare two arrays, hashes, structs, or lists for equality. The two arguments are compared recursively element by element.
- int exists(string path)
-
Return 1 if the given path exists or 0 if it does not exist.
- int fclose(FILE f)
-
Close an open FILE handle. Return 0 on success, -1 on error.
- FILE fopen(string path, string mode)
-
Open a file. The mode string indicates how the file will be accessed.
"r" Open the file for reading only; the file must already exist. This is the default value if access is not specified.
"r+" Open the file for both reading and writing; the file must already exist.
"w" Open the file for writing only. Truncate it if it exists. If it doesn't exist, create a new file.
"w+" Open the file for reading and writing. Truncate it if it exists. If it doesn't exist, create a new file.
"a" Open the file for writing only. The file must already exist, and the file is positioned so that new data is appended to the file.
"a+" Open the file for reading and writing. If the file doesn't exist, create a new empty file. Set the initial access position to the end of the file.
"v" This mode can be added to any of the above and causes open errors to be written to stderr.
Return a FILE handle on success and undef on error.
- int fprintf(FILE f, string fmt, ...args)
-
Format and print a string to the given FILE handle. The FILE handles stdin, stdout, and stderr are pre-defined.
Return 0 on success, -1 on error.
- int Fprintf(string filename, string fmt, ...args)
-
Like fprintf but write to the given file name. The file is overwritten if it already exists. Return 0 on success, -1 on error.
- string ftype(string path)
-
Return the type of file at the given path. Type can be directory, file, character, block, fifo, symlink or socket. Return undef on error.
- string[] getdir(string dir)
- string[] getdir(string dir, string pattern)
-
Return the files in the given directory, as a sorted string array. Optionally filter the list by pattern which is a glob and may contain the following special characters:
? Matches any single character.
* Matches any sequence of zero or more characters.
[chars] Matches any single character in chars. If chars contains a sequence of the form a-b then any character between a and b (inclusive) will match.
\x Matches the character x.
{a,b,...} Matches any of the strings a, b, etc.
If the first character in a pattern is ``~'' then it refers to the home directory for the user whose name follows the ``~''. If the ``~'' is followed immediately by ``/'' then the value of the HOME environment variable is used.
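A small sketch of the two-argument form, using the brace alternation from the table above. The pattern and the void main() entry point are illustrative assumptions:

```l
void main()
{
	string f, files[];

	// Match both spellings of the JPEG extension in one call.
	files = getdir(".", "*.{jpg,jpeg}");
	foreach (f in files) printf("%s\n", f);
}
```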
- dirent[] getdirx(string dir)
-
Return the files in the given directory as an array of structs, with the directories sorted and coming first in the array followed by the sorted file names. Return undef on error. The dirent struct is defined as follows:
    typedef struct dirent {
        string name;
        string type;    // "file", "directory", "other"
        int hidden;
    } dirent;
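As a sketch of how the dirent fields might be used (assuming a void main() entry point; the output format is arbitrary), the following lists the non-hidden entries of the current directory with their types:

```l
void main()
{
	dirent e, ents[];

	unless (defined(ents = getdirx("."))) die("getdirx failed");
	foreach (e in ents) {
		if (e.hidden) continue;		// skip hidden entries
		printf("%-10s %s\n", e.type, e.name);
	}
}
```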
Return the value of an environment variable if it exists and is of non-zero length, or return undef if it has zero length or does not exist. This allows you to say putenv("VAR=") and have getenv("VAR") return undef.
Parse command-line arguments. This version (from BitKeeper, same semantics) recognizes the following types of short and long options in the av array:
    -               leaves it and stops processing options (for indicating stdin)
    --              end of options
    -a
    -abcd
    -r <arg>
    -r<arg>
    -abcr <arg>     same as -a -b -c -r <arg>
    -abcr<arg>      same as -a -b -c -r<arg>
    -r<arg>
    --long
    --long:<arg>
    --long=<arg>
    --long <arg>
Short options are all specified in a single opts string as follows:
    d       boolean option            -d
    d:      required arg              -dARG or -d ARG
    d;      required arg no space     -dARG
    d|      optional arg no space     -dARG or -d
Long options are specified in the longopts array (one option per element) as follows:
    long    boolean option            --long
    long:   required arg              --long=ARG or --long ARG
    long;   required arg no space     --long=ARG
    long|   optional arg no space     --long=ARG or --long
The function returns the name of the next recognized option or undef if no more options exist. The global variable optind is set to the next av[] index to process. If the option has no arg, optarg is set to undef. If an unrecognized option is seen, the empty string ("") is returned and the global variable optopt is set to the name of the offending option (unless the option is a long option).
This example shows a typical usage of both short and long options.
    int debug_level, verbose;
    string c, lopts[] = { "verbose" };

    while (c = getopt(av, "d|v", lopts)) {
        switch (c) {
            case "d":
                if (optarg) debug_level = (int)optarg;
                break;
            case "v":
            case "verbose":
                verbose = 1;
                break;
            default:
                die("unrecognized option ${optopt}");
        }
    }
Return the caller's process id.
Output a message like "myfunc() in script.l:86" to stderr which contains the file name, line number, and currently executing function name. Typically used for debugging.
Insert one or more elements into array before the element specified by index. If index is 0, the elements are inserted at the beginning of the array; this is what unshift() does. If index is -1 or greater than or equal to the number of elements in the array, the elements are inserted at the end; this is what push() does. You can insert single elements or arrays of elements.
Return 1 if the given string contains only alphabetic characters, else return 0. An empty string also returns 0.
Return 1 if the given string contains only alphabetic or digit characters, else return 0. An empty string also returns 0.
Return 1 if the given string contains only digit characters, else return 0. An empty string also returns 0.
Return 1 if the given path exists and is a directory, else return 0.
Return 1 if the given path exists and is a link, else return 0.
Return 1 if the given string contains only lower case alphabetic characters, else return 0. An empty string also returns 0.
Return 1 if the given path exists and is a regular file, else return 0.
Return 1 if all characters in the argument are space characters, else return 0. An empty string also returns 0.
Return 1 if the given string contains only upper-case alphabetic characters, else return 0. An empty string also returns 0.
Return 1 if the given string contains only alphanumeric or connector punctuation characters (such as underscore), else return 0. An empty string also returns 0.
Convert an array into a string by joining its elements, inserting sep between each pair.
Return an array containing the keys of a given hash. Note that the return type depends on the argument type.
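A brief sketch combining the two functions just described. The call shapes follow those used in shapes.l and photos.l later in this document (join(sep, array), sort(keys(hash))); the hash contents and the void main() entry point are made up for illustration:

```l
void main()
{
	string k;
	int count{string} = { "pear" => 1, "apple" => 3 };

	// keys() returns the hash keys; sorting gives a stable order.
	foreach (k in sort(keys(count))) {
		printf("%s=%d\n", k, count{k});
	}
	// join() glues the same keys into a single string.
	printf("%s\n", join(", ", sort(keys(count))));
}
```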
Return a copy of the string that is in all lower case.
Return the number of characters in the given string. Return 0 if the argument is undef.
Return the number of elements in the given array. Return 0 if the argument is undef.
    for (i = 0; i < length(array); i++)
Return the number of key/value pairs in the given hash. Return 0 if the argument is undef.
Create a hard link from sourcePath to targetPath. Return 0 on success, -1 on error.
Call lstat(2) on path and place the information in buf. Return 0 on success, -1 on error. The struct stat type is defined as follows:
    struct stat {
        int st_dev;
        int st_ino;
        int st_mode;
        int st_nlink;
        int st_uid;
        int st_gid;
        int st_size;
        int st_atime;
        int st_mtime;
        int st_ctime;
        string st_type;
    };
where st_type is a string giving the type of the file, which will be one of file, directory, characterSpecial, blockSpecial, fifo, link, or socket.
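A minimal sketch of filling in and reading a struct stat, modeled on the lstat(fn, &sb) call in shapes.l. The path is only an example and the void main() entry point is an assumption:

```l
void main()
{
	struct stat sb;
	string path = "/etc/passwd";	// example path, adjust as needed

	// A non-zero return means the call failed.
	if (lstat(path, &sb)) die("could not lstat %s", path);
	printf("%s: type %s, %d bytes\n", path, sb.st_type, sb.st_size);
}
```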
Return the maximum or minimum of two numbers. The return type is float if either of the arguments is a float, otherwise the return type is int.
Return the number of milliseconds since the currently executing script started.
Reset the internal state for milli() to begin counting from 0 again.
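As a sketch of using milli() for coarse timing (assuming a void main() entry point; the sleep stands in for real work), the difference between two readings gives the elapsed milliseconds:

```l
void main()
{
	int start = milli();

	sleep(0.25);	// stand-in for the work being timed
	printf("elapsed: about %d ms\n", milli() - start);
}
```

Taking a difference this way avoids having to reset the counter between measurements.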
Create a directory at the given path. This creates all non-existing parent directories. The directories are created with mode 0775 (rwxrwxr-x). Return 0 on success, -1 on error.
Return the modified time of path, or 0 to indicate error.
Return a normalized version of path. The pathname will be an absolute path with all "../" and "./" removed.
Return the numeric value of the encoding (ASCII, Unicode) of the first character of c, or -1 on error or if c is the empty string.
Close an open pipe created by popen(). Return 0 on success, -1 on error. See system() for details of the STATUS struct.
Print the error message corresponding to the last error from various Little library calls. If message is not undef, it is prepended to the error string with a ": ".
Remove an element from the end of array. Return undef if the array is already empty.
Open a file handle to a process running the command specified in argv[] or cmd. In the cmd case, the command is split into arguments respecting Bourne shell style quoting. The returned FILE handle may be used to write to the command's input pipe or read from its output pipe, depending on the value of mode. If write-only access is used ("w"), then standard output for the pipeline is directed to the current standard output unless overridden by the command. If read-only access is used ("r"), standard input for the pipeline is taken from the current standard input unless overridden by the command.
The optional third argument is a callback function that is invoked by Tcl's event loop when the command's stderr pipe has data available to be read. The second argument of the callback is a non-blocking FILE for the read end of this pipe. Care must be taken to ensure that the event loop is run often enough for the callback to reap data from the pipe and avoid deadlock. In console apps, this may mean calling Tcl's update() function. The pclose() function also invokes the callback, so it is guaranteed to be called at least once.
If the third argument to popen is undef, the command's stderr output is ignored. Otherwise, unless re-directed by the command, any stderr output is passed through to the calling script's stderr and is not considered an error.
If Tk is being used, there is a default callback that pops up a window with any output the command writes to stderr.
Return the FILE handle on success, or undef on error.
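A sketch of the read-only case, following the popen()/<f>/pclose() pattern used in the example programs later in this document. The command is arbitrary and the void main() entry point is an assumption:

```l
void main()
{
	FILE f;
	string line;

	unless (f = popen("ls -l", "r")) die("popen failed");
	// Read the command's output one line at a time.
	while (defined(line = <f>)) {
		printf("%s\n", line);
	}
	pclose(f);
}
```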
Format arguments and print to stdout, as in printf(3). Return 0 on success, -1 on error.
Push one or more elements onto the end of array. You can insert single elements or arrays of elements.
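A sketch of both ways of pushing, building a command vector in the style of photos.l. The push(&array, ...) call shape is taken from that example; the array contents and the void main() entry point are illustrative assumptions:

```l
void main()
{
	string s, cmd[] = { "convert", "-quality", "85" };
	string extra[] = { "-rotate", "90" };

	push(&cmd, "-unsharp");		// a single element
	push(&cmd, extra);		// an array of elements
	foreach (s in cmd) printf("%s\n", s);
	printf("%d elements\n", length(cmd));
}
```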
Set an environment variable, overwriting any pre-existing value, using printf-like arguments:
    putenv("VAR=val");
    putenv("MYPID=%d", getpid());
Return the new value or undef if var_fmt contains no "=".
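A small sketch of the round trip between putenv() and getenv(), including the empty-value case described for getenv() earlier in this section. The variable name GREETING is hypothetical and the void main() entry point is an assumption:

```l
void main()
{
	putenv("GREETING=%s", "hello");
	printf("GREETING is %s\n", getenv("GREETING"));

	// A zero-length value makes getenv() report undef.
	putenv("GREETING=");
	unless (defined(getenv("GREETING"))) {
		printf("GREETING is now undefined\n");
	}
}
```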
Read at most numBytes from the given FILE handle into the buffer, or read the entire file if numBytes == -1 or is omitted. Return the number of bytes read, -1 on error or EOF.
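A sketch of slurping a whole file with numBytes set to -1, following the read(f, &template, -1) call in photos.l. The path is only an example and the void main() entry point is an assumption:

```l
void main()
{
	FILE f;
	string buf;
	int n;

	unless (f = fopen("/etc/hosts", "r")) die("cannot open /etc/hosts");
	n = read(f, &buf, -1);		// -1 reads the entire file
	fclose(f);
	if (n < 0) die("read failed or file was empty");
	printf("read %d bytes\n", n);
}
```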
Rename a file. Return 0 on success, -1 on error.
Find and load the given Tcl package packageName. Return the version string of the package loaded on success, and undef on error.
Delete the given directory. Return 0 on success, -1 on error.
Remove and return the element at the beginning of array. Return undef if the array is already empty.
Split the string the same way as the Bourne shell and return the array.
Return the size, in bytes, of the named file path, or -1 on error.
Sleep for the number of seconds given by seconds. Note that seconds can be fractional to get sub-second sleeps.
Sort the array array and return a new array of sorted elements. The first variation sorts the elements into ascending order, and does an integer, real, or ASCII sort based on the type of array. The second two variations show optional arguments that can be passed to change this behavior. The last variation shows how a custom compare function can be specified. The function must take two array elements of type T as arguments and return -1 if the first comes before the second in the sort order, +1 if the first comes after the second, and 0 if the two are equal.
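A sketch of the default (first) variation only, showing that the kind of sort follows the element type. The data and the void main() entry point are illustrative assumptions; the optional arguments and the custom compare form are not shown because their exact call syntax is not given here:

```l
void main()
{
	int n, nums[] = { 30, 4, 100 };
	string s, words[] = { "pear", "apple", "fig" };

	// An int array gets an integer sort, so 4 sorts before 30.
	foreach (n in sort(nums)) printf("%d\n", n);
	// A string array gets an ASCII sort.
	foreach (s in sort(words)) printf("%s\n", s);
}
```

Since sort() returns a new array, nums and words themselves are left in their original order.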
Execute a command in the background. All forms return either a process id or undef to indicate an error. In the error case the STATUS argument is set; otherwise it remains untouched and the status can be reaped by waitpid().
See the system() function for information about the arguments.
See the waitpid() function for information about waiting on the child.
Split a string into substrings. In the first variation, the string is split on whitespace, and any leading or trailing white space does not produce a null field in the result. This is useful when you just want to get at the things delimited by the white space:
    split("a b c");     // returns {"a", "b", "c"}
    split(" x y z ");   // returns {"x", "y", "z"}
In the second variation, the string is split using a regular expression as the delimiter:
    split(/,/, "we,are,commas");    // returns {"we", "are", "commas"}
    split(/xxx/, "AxxxBxxxC");      // returns {"A", "B", "C"}
    split(/[;,]/, "1;10,20");       // returns {"1", "10", "20"}
When a delimiter is used, split returns a null first field if the string begins with the delimiter, but if the string ends with the delimiter no trailing null field is returned. This provides compatibility with Perl's split:
    split(/xx/, "xxAxxBxxCxx");     // returns {"", "A", "B", "C"}
You can avoid the leading null fields in the result if you put a t after the regular expression (to tell it to "trim" the result):
    split(/xx/t, "xxAxxBxxCxx");    // returns {"A", "B", "C"}
If a limit argument is given, at most limit substrings are returned (limit <= 0 means no limit):
    split(/ /, "a b c d e f", 3);   // returns {"a", "b", "c d e f"}
To allow splitting on variables or function calls that start with m, the alternate regular expression delimiter syntax is restricted:
    split(m|/|, pathname);  // "|" and most other punctuation -- ok
                            // but ( and ) as delimiters -- error
    split(m);               // splits the variable "m" -- ok
    split(m(arg));          // splits the result of m(arg) -- ok
Regular expressions, and the strings to split, both can contain Unicode characters or binary data as well as ASCII:
    split(/\0/, string_with_nulls);     // split on null
    split(/ש/, "זו השפה שלנו");         // unicode regexp and string
Format arguments and return a formatted string like sprintf(3). Return undef on error.
Call stat(2) on path and place the information in buf. Return 0 on success, -1 on error. See the lstat() command for the definition of struct stat.
Return the first index of c in s, or -1 if c is not found.
Return the string length.
Return the last index of c in s, or -1 if c is not found.
Create a symbolic link from sourcePath to targetPath. Return 0 on success, -1 on failure.
Execute a command and wait for it to finish (see spawn() for the asynchronous version). The command is executed using Tcl's exec which understands I/O re-direction and pipes, except that command is split into arguments using Bourne shell style quoting instead of Tcl quoting (see shsplit).
If the number of arguments is one or two, then the existing stdin, stdout, stderr channels are used.
If the number of arguments is four or five, then the second, third, and fourth arguments specify stdin, stdout, stderr, respectively. Each can be a string variable or string array (a reference is required for stdout and stderr), a FILE variable which must be an open file handle, or a string literal which is interpreted as a file path name. If you want to specify a file name from a variable, use the string literal "${filename}".
It is an error to both re-direct input/output in the command string and to specify the corresponding input/output argument; in such a case, the command is not run and undef is returned.
If stdout or stderr are sent to strings or string arrays and no output is produced, then out or err are undef upon return.
The optional last argument is a reference to the following structure:
    typedef struct {
        string argv[];  // args passed in
        string path;    // if defined, this is the path to the exe
                        // if undef, the executable was not found
        int exit;       // if defined, the process exited with <exit>
        int signal;     // if defined, the process was killed
                        // by <signal>
    } STATUS;
The global variable stdio_status is also set. If the command is a pipeline and a process in that pipeline fails, the returned status is for the first process that failed.
If there is an error executing the command, or if the process is killed by a signal, undef is returned; otherwise, the return value is the process exit status (for a pipeline, the status of the first process that exited with error).
Examples:
    // No futzing with input/output, uses stdin/out/err.
    ret = system(cmd);

    // Same thing but no quoting issues, like execve(2).
    ret = system(argv);

    // Get detailed status.
    unless (defined(ret = system(cmd, &status))) {
        unless (defined(status.path)) {
            // status.path is undef here, so report the command name.
            warn("%s not found or bad perm\n", status.argv[0]);
        }
        if (defined(status.signal)) {
            warn("%s killed with %d\n", status.argv[0], status.signal);
        }
    }

    // Taking input and sending output to string arrays.
    // The in_vec elements should not contain newlines and
    // the out/err_vec elements will not contain newlines.
    string in_vec[], out_vec[], err_vec[];
    ret = system(cmd, in_vec, &out_vec, &err_vec);

    // Taking input and sending output to files.
    string outf = sprintf("/tmp/out%d", getpid());
    ret = system(cmd, "/etc/passwd", "${outf}", "/tmp/errors");

    // Using open file handles.
    FILE in = popen("/some/producer/process", "r");
    FILE out = popen("/some/consumer/process", "w");
    FILE err = popen("cat > /dev/tty", "w");
    ret = system(argv, in, out, err, &status);
    // error handling here
    pclose(in, &status);
    // error handling here
    ...

    // Mixing and matching.
    ret = system(argv, buf, &out, "/tmp/errors", &status);
Return a copy of the string that has been trimmed of any leading and trailing whitespace (spaces, tabs, newlines, and carriage returns).
Return the simple type name of the given variable. This is one of "int", "string", "poly", "widget", "array", "hash", or "struct"; or if the variable's type is a typedef, the typedef name; or if the variable has a class type, the class name; or if the variable is really a function name, "function".
Return a copy of the string that is in all upper case.
In the first three forms, remove an array, string, or hash element from the specified variable. In the last form, sets the variable to undef. When setting a hash or array to undef, all of its old elements are freed (unless they were shared with some other variable).
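A sketch of two of the forms described above, removing one hash element and then clearing the whole variable. The undef(pids{pid}) shape is taken from photos.l; the hash contents and the void main() entry point are illustrative assumptions:

```l
void main()
{
	int pids{int} = { 101 => 1, 102 => 1 };

	undef(pids{101});	// remove a single hash element
	printf("%d entries left\n", length(pids));

	undef(pids);		// set the whole variable to undef
	printf("%d entries left\n", length(pids));	// length of undef is 0
}
```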
Delete the named file. Return 0 on success, -1 on failure.
Add one or more elements onto the beginning of array. You can insert single elements or arrays of elements.
Given a pid returned by spawn(), wait for it, and place the exit information in the (optional) STATUS struct. If pid is -1, return any process that has exited or return -1 if no more child processes exist; otherwise return pid or -1 on error. If nohang is non-zero, return -1 if the process does not exist or on any other error, return 0 if the process exists and has not exited, and return pid and update status if the process has exited.
Same as waitpid(-1, &status, 0).
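A sketch of running a child in the background and polling for it with nohang set, following the spawn(cmd) and waitpid(pid, undef, 1) calls in photos.l. The command and the void main() entry point are illustrative assumptions:

```l
void main()
{
	int pid;
	string cmd[] = { "sleep", "1" };

	pid = spawn(cmd);
	unless (defined(pid)) die("spawn failed");

	// With nohang non-zero, waitpid() returns 0 while the child
	// is still running, so poll until that changes.
	while (waitpid(pid, undef, 1) == 0) {
		sleep(0.1);
	}
	printf("child %d is done\n", pid);
}
```

Passing 0 for nohang instead would block until the child exits, which is simpler when there is no other work to do in the meantime.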
Output a printf-like message to stderr. If fmt does not end with a newline, append " in <filename> at line <linenum>.\n"
Write at most numBytes to the given FILE handle from the buffer. Return the number of bytes written, or -1 on error.
Example code
shapes.l
This is something we hand to our customers to see what "shape" their source trees have.
    #!/usr/bin/bk tclsh
    /*
     * Determine the files/size of each directory under a bk repository.
     * Optionally transform the directory names to obscure their structure.
     *
     * The idea is that you can run this script like this:
     *
     * bk little shapes.l <path_to_root_of_repo>
     *
     * and get a list of directories with their sizes and number of files in
     * each of them. Save the output, then run it again with -o:
     *
     * bk little shapes.l -o <path_to_root_of_repo>
     *
     * and send the output to BitMover.
     *
     * The names of all the directories will be rot13'd and sorted (since
     * sort is a destructive transform, it makes it harder to reverse the
     * rot13). This is a weak form of obfuscation, but it lets BitMover
     * work with the directory structure without inadvertently learning
     * about the client's projects.
     *
     * The line numbers at the beginning is so that we can talk about a certain
     * directory by number without BitMover knowing the name of the directory.
     *
     * ob@dirac.bitmover.com|src/contrib/shapes.l|20100723224240|23777
     *
     */
    string obscure(string s);   // pathname to no-IP-leak pathname
    string pp(float n);         // pretty print a number, like df -h
    string rot13(string str);   // if you don't know, you don't know

    int main(int ac, string[] av)
    {
        int size, files, maxlen, n;
        int do_obscure = 0;
        string fn, root, dir, d, ob;
        FILE f;
        struct stat sb;
        struct dirstats {
            int files;
            int size;
            int total_files;
            int total_size;
        } dirs{string};

        dir = ".";
        if (ac == 3) {
            if (av[1] == "-o") {
                do_obscure = 1;
            } else {
                fprintf(stderr, "usage: %s [-o] [<dir>]\n", av[0]);
                exit(1);
            }
            dir = av[2];
        } else if (ac == 2) {
            if (av[1] == "-o") {
                do_obscure = 1;
                dir = ".";
            } else {
                dir = av[1];
            }
        } else if (ac > 3) {
            fprintf(stderr, "usage: %s [-o] [<dir>]\n", av[0]);
            exit(1);
        }
        if (chdir(dir)) {
            fprintf(stderr, "Could not chdir to %s\n", dir);
            exit(1);
        }
        root = `bk root`;
        if (root == "") {
            fprintf(stderr, "Must be run in a BitKeeper repository\n");
            exit(1);
        }
        if (chdir(root)) {
            fprintf(stderr, "Could not chdir to %s\n", root);
            exit(1);
        }
        size = 0;
        files = 0;
        f = popen("bk sfiles", "r");
        while (defined(fn = <f>)) {
            dir = dirname(fn);
            if (dir == "SCCS") {
                dir = ".";
            } else {
                // remove SCCS and obscure
                dir = dirname(dir);
            }
            unless (defined(dirs{dir})) dirs{dir} = {0, 0, 0, 0};
            if (maxlen < length(dir)) maxlen = length(dir);
            dirs{dir}.files++;
            files++;
            if (lstat(fn, &sb)) {
                fprintf(stderr, "Could not stat %s\n", fn);
                continue;
            }
            dirs{dir}.size += sb.st_size;
            size += sb.st_size;
            // add our size/file count to each parent dir
            for (d = dirname(dir); d != "."; d = dirname(d)) {
                unless (defined(dirs{d})) dirs{d} = {0,0,0,0};
                dirs{d}.total_size += sb.st_size;
                dirs{d}.total_files++;
            }
            dirs{"."}.total_size += sb.st_size;
            dirs{"."}.total_files++;
        }
        close(f);

        // now print it
        printf(" N | %-*s | FILES | SIZE | T_FILES | T_SIZE \n",
            maxlen, "DIRS");
        n = 1;
        foreach (dir in sort(keys(dirs))) {
            ob = dir;
            if (do_obscure) {
                ob = obscure(dir);
            }
            if (dirs{dir}.total_files > 0) {
                printf("%5d | %-*s | %5d | %7s | %7s | %7s\n",
                    n, maxlen, ob, dirs{dir}.files,
                    pp(dirs{dir}.size),
                    dirs{dir}.total_files,
                    pp(dirs{dir}.total_size));
            } else {
                printf("%5d | %-*s | %5d | %7s | %7s | %7s\n",
                    n, maxlen, ob, dirs{dir}.files,
                    pp(dirs{dir}.size), "","");
            }
            n++;
        }
        printf("TOTAL: %u files, %s\n",
            dirs{"."}.total_files, pp(dirs{"."}.total_size));
        return (0);
    }

    /* Pretty print a number */
    string pp(float n)
    {
        int i;
        float num = (float)n;
        string sizes[] = {"b", "K", "M", "G", "T"};

        for (i = 0; i < 5; i++) {
            if (num < 1024.0) return (sprintf("%3.2f%s", num, sizes[i]));
            num /= 1024.0;
        }
    }

    /* Table for rot13 function below */
    string rot13_table{string} = {
        "A" => "N", "B" => "O", "C" => "P", "D" => "Q", "E" => "R",
        "F" => "S", "G" => "T", "H" => "U", "I" => "V", "J" => "W",
        "K" => "X", "L" => "Y", "M" => "Z", "N" => "A", "O" => "B",
        "P" => "C", "Q" => "D", "R" => "E", "S" => "F", "T" => "G",
        "U" => "H", "V" => "I", "W" => "J", "X" => "K", "Y" => "L",
        "Z" => "M", "a" => "n", "b" => "o", "c" => "p", "d" => "q",
        "e" => "r", "f" => "s", "g" => "t", "h" => "u", "i" => "v",
        "j" => "w", "k" => "x", "l" => "y", "m" => "z", "n" => "a",
        "o" => "b", "p" => "c", "q" => "d", "r" => "e", "s" => "f",
        "t" => "g", "u" => "h", "v" => "i", "w" => "j", "x" => "k",
        "y" => "l", "z" => "m",
    };

    /* rot13 a string */
    string rot13(string str)
    {
        int i;
        string ret = "";

        for (i = 0; i < length(str); i++) {
            ret .= rot13_table{str[i]};
        }
        return (ret);
    }

    /*
     * Print an obscured version of the string
     * rot13 + sort
     */
    string obscure(string s)
    {
        string p;
        string[] ret;
        string[] sp = split(s, "/");

        foreach (p in sp) {
            push(&ret, rot13(join("", lsort(split(p, "")))));
        }
        return (join("/", ret));
    }
photos.l
#!/usr/bin/bk tclsh /* * A rewrite of Eric Pop's fine igal program in Little. I talked to Eric and he * really doesn't want anything to do with supporting igal or copycats so * while credit here is cool, don't stick his name on the web pages. * I completely understand that, people still ask me about webroff and * lmbench. * * First version by Larry McVoy Sun Dec 19 2010. Public Domain. * * usage photos [options] [dir] * * TODO * - slideshow mode * - move the next/prev/index to the sides along w/ EXIF info */ int bigy = 750; // --bigy=%d for medium images int dates = 0; // --date-split int exif = 0; // --exif under titles int exif_hover = 0; // --exif-hover, exif data in thumbnail hover int exif_thumbs = 0; // --exif-thumbnails, use the camera thumbnail int force = 0; // -f force regen of everything int names = 0; // put names below the image int nav = 0; // month/year nav int parallel = 1; // -j%d for multiple processes int sharpen = 0; // --sharpen to turn it on int thumbnails = 0; // force regen of those int quiet = 1; // turn off verbose string title = "McVoy photos"; // --title=whatever int ysize = 120; // -ysize=%d for thumbnails int rotate[]; // amount to rotate, -+90 string indexf = "~/.photos/index.html"; string slidef = "~/.photos/slide.html"; int main(int ac, string av[]) { string c; string lopts[] = { "bigy:", "date-split", "exif", "exif-thumbnails", "exif-hover", "force", "index:", "names", "nav", "parallel:", "quiet", "regen", "sharpen", "slide:", "thumbnails", "title:", "ysize:", }; if (0) ac = 0; // lint parallel = cpus(); dotfiles(); while (c = getopt(av, "fj:", lopts)) { switch (c) { case "bigy": bigy = (int)optarg; break; case "date-split": dates = 1; break; case "exif": exif = 1; break; case "exif-hover": exif_hover = 1; break; case "exif-thumbnails": exif_thumbs = 1; break; case "f": case "force": case "regen": force = 1; break; case "index": indexf = optarg; break; case "j": case "parallel": parallel = (int)optarg; break; case "quiet": quiet = 
1; break; case "names": names = 1; break; case "nav": nav = 1; break; case "sharpen": sharpen = 1; break; case "slide": slidef = optarg; break; case "title": title = optarg; break; case "thumbnails": thumbnails = 1; break; case "ysize": ysize = (int)optarg; break; default: printf("Usage: photos.l"); foreach(c in lopts) { if (c =~ /(.*):/) { printf(" --%s=<val>", $1); } else { printf(" --%s", c); } } printf("\n"); return(0); } } unless (av[optind]) { dir("."); } else { while (av[optind]) dir(av[optind++]); } return (0); } void dir(string d) { string jpegs[]; string tmp[]; string buf; int i; if (chdir(d)) die("can't chdir to %s", d); tmp = getdir(".", "*.jpeg"); unless (tmp[0]) tmp = getdir(".", "*.jpg"); unless (tmp[0]) tmp = getdir(".", "*.png"); unless (tmp[0]) tmp = getdir(".", "*.PNG"); unless (tmp[0]) die("No jpegs found in %s", d); // XXX - should getdir do this? for (i = 0; defined(tmp[i]); i++) tmp[i] =~ s|^\./||; /* so we start at one not zero */ jpegs[0] = '.'; rotate[0] = 0; // XXX - I want push(&jpegs, list) foreach (buf in tmp) { push(&jpegs, buf); push(&rotate, rotation(buf)); } slides(jpegs); thumbs(jpegs); html(jpegs); } /* * Create .thumb-$file if * - it does not exist * - .ysize is different than ysize * - $file is newer than thumbnail */ void thumbs(string jpegs[]) { string cmd[]; string jpeg, file, slide; int i; int all = 0; int my_parallel = parallel, bg = 0; int pid, reaped; int pids{int}; unless (exists(".ysize")) { save: Fprintf(".ysize", "%d\n", ysize); } if ((int)`cat .ysize` != ysize) { all = 1; goto save; } if (force || thumbnails) all = 1; if (exif_thumbs) my_parallel = 1; for (i = 1; defined(jpeg = jpegs[i]); i++) { file = sprintf(".thumb-%s", jpeg); slide = sprintf(".slide-%s", jpeg); if (!all && exists(file) && (mtime(file) > mtime(jpeg))) { continue; } if (exif_thumbs && do_exif(undef, jpeg)) { unlink(file); cmd = { "exif", "-e", "-o", file, jpeg }; } else { cmd = { "convert", "-thumbnail", "x${ysize}", "-quality", "85", }; if 
(sharpen) { push(&cmd, "-unsharp"); //push(&cmd, "0x.5"); push(&cmd, "2x0.5+0.7+0"); } push(&cmd, exists(slide) ? slide : jpeg); push(&cmd, file); } while (bg >= parallel) { reaped = 0; foreach (pid in keys(pids)) { if (waitpid(pid, undef, 1) > 0) { reaped++; bg--; undef(pids{pid}); break; } } if (reaped) break; sleep(0.100); } unless (quiet) { printf("Creating %s from %s\n", file, exists(slide) ? slide : jpeg); } pid = spawn(cmd); unless (defined(stdio_status.path)) { die("%s: command not found.\n", cmd[0]); } bg++; pids{pid} = 1; } foreach (pid in keys(pids)) waitpid(pid, undef, 0); } /* * Create .slide-$file if * - it does not exist * - .bigy is different than bigy * - $file is newer than slide * - $file is bigger than bigy */ void slides(string jpegs[]) { string cmd[]; string jpeg, file; int all = 0; int i; int bg = 0; int pid, reaped; int pids{int}; unless (exists(".bigy")) { save: Fprintf(".bigy", "%d\n", bigy); } if ((int)`cat .bigy` != bigy) { all = 1; goto save; } if (force) all = 1; for (i = 1; defined(jpeg = jpegs[i]); i++) { file = sprintf(".slide-%s", jpeg); if (!all && exists(file) && (mtime(file) > mtime(jpeg))) { continue; } if (small(jpeg)) { unlink(file); if (link(jpeg, file)) warn("link ${jpeg} ${file}"); continue; } cmd = { "convert", "+profile", "*", "-scale", "x" . 
"${bigy}", "-quality", "85", }; if (rotate[i]) { push(&cmd, "-rotate"); push(&cmd, sprintf("%d", rotate[i])); } if (sharpen) { push(&cmd, "-unsharp"); //push(&cmd, "0x.5"); push(&cmd, "2x0.5+0.7+0"); } push(&cmd, jpeg); push(&cmd, file); while (bg >= parallel) { reaped = 0; foreach (pid in keys(pids)) { if (waitpid(pid, undef, 1) > 0) { reaped++; bg--; undef(pids{pid}); break; } } if (reaped) break; sleep(0.150); } unless (quiet) { printf("Creating %s from %s\n", file, jpeg); } printf("%s\n", join(" ", cmd)); pid = spawn(cmd); unless (defined(stdio_status.path)) { die("%s: command not found.\n", cmd[0]); } bg++; pids{pid} = 1; } foreach (pid in keys(pids)) waitpid(pid, undef, 0); } int small(string file) { string buf; // Hack to avoid exif calls on small files if (size(file) < 100000) return (1); if (size(file) > 200000) return (0); unless (buf = `identify '${file}'`) return (0); if (buf =~ /JPEG (\d+)x(\d+)/) return ((int)$2 <= bigy); return (0); } string num2mon{int} = { 1 => "January", 2 => "February", 3 => "March", 4 => "April", 5 => "May", 6 => "June", 7 => "July", 8 => "August", 9 => "September", 10 => "October", 11 => "November", 12 => "December", }; typedef struct { int day; // day 1..31 int mon; // month 1..12 int year; // year as YYYY string sdate; // YYYY-MM-DD } date; /* * Return the date either from the filename if it is one of date ones, * or from the exif data, * or fall back to mtime. 
*/ date f2date(string file) { date d; string buf; FILE f; int t; if (file =~ /^(\d\d\d\d)-(\d\d)-(\d\d)/) { match: buf = (string)$3; buf =~ s/^0//; d.day = (int)buf; buf = (string)$2; buf =~ s/^0//; d.mon = (int)buf; d.year = (int)$1; d.sdate = sprintf("%d-%02d-%02d", d.year, d.mon, d.day); return (d); } if (f = popen("exif -t DateTime '${file}' 2>/dev/null", "r")) { while (buf = <f>) { // Value: 2006:02:04 22:59:24 if (buf =~ /Value: (\d\d\d\d):(\d\d):(\d\d)/) { pclose(f); goto match; } } pclose(f); // fall through to mtime } if (t = mtime(file)) { buf = Clock_format(t, format: "%Y:%m:%d"); buf =~ /(\d\d\d\d):(\d\d):(\d\d)/; goto match; } return (undef); } /* * Create the html slide files and index.html * XXX - could stub this out if mtime(html) > mtime(.slide) etc. */ void html(string jpegs[]) { string template, file, stitle, ntitle, ptitle, buf; string cap = ''; string date_nav = ''; string dir, jpeg, escaped, thumbs = ''; int i, next, prev; int first = 1; FILE f; string map[]; string exdata; date d, d2; unless (f = fopen(slidef, "rv")) die("slide.html"); read(f, &template, -1); fclose(f); for (i = 1; defined(jpeg = jpegs[i]); i++) { file = sprintf("%d.html", i); if (i > 1) { prev = i - 1; } else { prev = length(jpegs) - 1; } if (jpegs[i+1]) { next = i + 1; } else { next = 1; } undef(map); stitle = jpeg; stitle =~ s/\.jp.*//; ntitle = jpegs[next]; ntitle =~ s/\.jp.*//; ptitle = jpegs[prev]; ptitle =~ s/\.jp.*//; escaped = jpeg; escaped =~ s/:/%3A/g; dir = `pwd`; dir =~ s|.*/||; map = { "%FOLDER%", dir, "%TITLE%", stitle, "%NEXT_HTML%", sprintf("%d.html", next), "%NEXT_TITLE%", ntitle, "%PREV_HTML%", sprintf("%d.html", prev), "%PREV_TITLE%", ptitle, "%NEXT_SLIDE%", sprintf(".slide-%s", jpegs[next]), "%ORIG%", escaped, "%SLIDE%", sprintf(".slide-%s", escaped), }; push(&map, "%CAPTION%"); if (names || exif) cap = '<P class="center">'; if (names) { cap .= stitle . ' ' . 
sprintf("(%d/%d)\n", i, length(jpegs) - 1); } undef(exdata); if (exif) { do_exif(&exdata, jpeg); if (names) cap .= "<br>"; cap .= exdata; } if (names || exif) cap .= "</P>\n"; push(&map, cap); push(&map, "%NAV%"); date_nav = ''; do_nav(&date_nav, jpeg, prev, next, 1); push(&map, date_nav); buf = String_map(map, template); Fprintf(file, "%s\n", buf); if (dates && defined(d2 = f2date(jpeg)) && (first || (d.sdate != d2.sdate))) { d = d2; unless (first) thumbs .= "</DIV>\n"; buf = num2mon{d.mon}; thumbs .= "<p><a name=\"${buf}_${d.day}\">"; cap = "${buf} ${d.day} ${d.year}"; thumbs .= cap . "</a>"; cap = ".cap-${buf}-${d.day}-${d.year}"; // .cap-January-09-2011, if exists, is appended if (exists(cap) && (cap = `cat ${cap}`)) { thumbs .= ': ' . cap; } thumbs .= "<br>\n<DIV class=\"center\">\n"; } if (exif && exif_hover) stitle .= " " . exdata; thumbs .= sprintf( '<a href="%s">' . '<img src=".thumb-%s" alt="%s" title="%s" border="0"/>' . '</a>' . "\n", file, escaped, stitle, stitle); first = 0; } /* do index.html */ unless (f = fopen(indexf, "rv")) die("index.html"); read(f, &template, -1); fclose(f); undef(map); push(&map, "%TITLE%"); push(&map, title); push(&map, "%THUMBS%"); thumbs .= "</DIV>\n"; push(&map, thumbs); date_nav = ''; push(&map, "%NAV%"); do_nav(&date_nav, jpegs[1], undef, undef, 0); push(&map, date_nav); buf = String_map(map, template); if (exists(".index-include")) { buf .= `cat .index-include`; } Fprintf("index.html", "%s", buf); unless (f = fopen("~/.photos/photos.css", "rv")) die("photos.css"); read(f, &buf, -1); fclose(f); Fprintf("photos.css", "%s", buf); } /* * XXX - what this needs is a hash and then at the end I push the info * I want in the order I want. 
 */
int do_exif(string &cap, string jpeg)
{
    FILE f = popen("exiftags -a '${jpeg}'", "rv");
    string save, buf, maker = '';
    string v[];
    string iso = undef;
    int thumb = 0;
    int i;
    string tags{string};

    while (buf = <f>) {
        switch (trim(buf)) {
        case /^Equipment Make: (.*)/:
            maker = $1;
            if (maker == "OLYMPUS IMAGING CORP.") {
                maker = "Olympus";
            }
            if (maker == "NIKON CORPORATION") {
                maker = "Nikon";
            }
            break;
        case /^Camera Model: (.*)/:
            save = $1;
            if (save =~ /${maker}/i) {
                tags{"camera"} = save;
            } else {
                tags{"camera"} = "${maker} ${save}";
            }
            if (save == "TG-1") tags{"lens"} = "25-100mm f2.0";
            if (save =~ /Canon PowerShot S95/) {
                tags{"lens"} = "28-105 mm";
            }
            if (save =~ /Canon PowerShot S100/) {
                tags{"lens"} = "24-120mm";
            }
            break;
        case /Lens Name: (.*)/:
            if ($1 =~ /EF\d/) $1 =~ s/EF/EF /;
            if ($1 =~ /EF-S\d/) $1 =~ s/EF-S/EF-S /;
            if ($1 =~ / USM/) $1 =~ s/ USM//;
            if ($1 == "30mm") $1 = "Sigma 30mm f/1.4";
            if ($1 == "90mm") $1 = "Tamron 90mm macro";
            if ($1 == "18-200mm") $1 = "Tamron 18-200mm";
            if ($1 == "18-250mm") $1 = "Tamron 18-250mm";
            if ($1 == "18-270mm") $1 = "Tamron 18-270mm";
            if ($1 == "170-500mm") $1 = "Sigma 170-500mm";
            $1 =~ s|f/|f|;
            tags{"lens"} = $1;
            break;
        case /Lens Size: 10.00 - 22.00 mm/:
            tags{"lens"} = "EF-S 10-22mm f/3.5-4.5";
            break;
        case /Exposure Bias: (.*)/:
            if ($1 != "0 EV") {
                unless ($1 =~ /^-/) $1 = "+" .
                    $1;
                tags{"bias"} = $1;
            }
            break;
        case /^Exposure Time: (.*)/:
            save = $1;
            $1 =~ /(\d+)\/(\d+) sec/;
            if ((int)$1 > 1) {
                i = (int)$2/(int)$1;
                save = "1/${i}";
            }
            tags{"time"} = save;
            break;
        case /Lens Aperture: (.*)/:
        case /F-Number: (.*)/:
            $1 =~ s|/||;
            tags{"fstop"} = $1;
            break;
        case /ISO Speed Rating: (.*)/:
            iso = undef;
            if ($1 == "Auto") {
                iso = "ISO ${$1}";
            } else if ($1 == "Unknown") {
                ;
            } else unless ((int)$1 == 0) {
                iso = "ISO ${$1}";
            }
            if (defined(iso)) tags{"iso"} = iso;
            break;
        case /Focal Length .35mm Equiv.: (.*)/:
        case /Focal Length: (.*)/:
            save = $1;
            if (tags{"camera"} =~ /Canon PowerShot S95/) {
                save =~ s/ mm//;
                save = (string)(int)((float)save * 4.7);
                save .= " mm";
            }
            if (tags{"camera"} =~ /Canon PowerShot S100/) {
                save =~ s/ mm//;
                save = (string)(int)((float)save * 4.61538);
                save .= " mm";
            }
            unless (defined(tags{"focal"})) {
                tags{"focal"} = save;
            }
            break;
        case /Metering Mode: (.*)/:
            unless (defined(tags{"metering"})) {
                tags{"metering"} = "${$1} metering";
            }
            break;
        case /White Balance: (.*)/:
            unless ($1 =~ /white balance/) $1 .= " white balance";
            $1 =~ s/white balance/WB/;
            unless (defined(tags{"balance"})) {
                tags{"balance"} = $1;
            }
            break;
        case /Compression Scheme: JPEG Compression .Thumbnail./:
            thumb = 1;
            break;
        }
    }
    fclose(f);

    cap = "";
    if (defined(tags{"camera"})) push(&v, tags{"camera"});
    if (defined(tags{"lens"})) {
        if (defined(tags{"focal"}) && (tags{"lens"} =~ /[0-9]-[0-9]/)) {
            tags{"lens"} .= " @ " .
                tags{"focal"};
        }
        push(&v, tags{"lens"});
    }
    if (defined(tags{"fstop"})) push(&v, tags{"fstop"});
    if (defined(tags{"time"})) push(&v, tags{"time"});
    if (defined(tags{"bias"})) push(&v, tags{"bias"});
    if (defined(tags{"iso"})) push(&v, tags{"iso"});
    if (defined(tags{"metering"})) push(&v, tags{"metering"});
    if (defined(tags{"balance"})) push(&v, tags{"balance"});
    if (defined(v)) cap = join(", ", v);
    return (thumb);
}

int rotation(string file)
{
    string r = `exif -m -t Orientation '${file}'`;

    switch (r) {
    case /right.*top/i:
        return (90);
    case /left.*bottom/i:
        return (-90);
    default:
        return (0);
    }
}

/*
 * This is called for both index nav and slide nav.
 * For index nav, unless nav is set, do nothing.
 * For slide nav, always do at least
 *     prev | index | next
 * and optionally
 *     prev | next | prev month | index | next month | prev year | next year
 */
void do_nav(string &date_nav, string jpeg, int prev, int next, int slide)
{
    int i, mon, did_it;
    string buf, month;
    date d;

    date_nav = '';
    if (!nav && !slide) return;
    unless (defined(d = f2date(jpeg))) return;
    month = num2mon{d.mon}[0..2];

    if (slide) {
        /* <<< prev | January | next >>> */
        date_nav .= '<a href="' . sprintf("%d.html", prev) .
            '">&lt;&lt; prev pic</a> ';
        date_nav .= "\n";
        unless (nav) {
            date_nav .= '<a href="index.html">Index</a> ';
            date_nav .= "\n";
        }
        date_nav .= '<a href="' . sprintf("%d.html", next) .
            '">next pic &gt;&gt;</a>';
        date_nav .= "\n";
        unless (nav) return;
    }

    /* <<< prev | next >>> | <<< January >>> | <<< 2003 >>> */
    date_nav .= "\n";
    date_nav .= ' ';
    date_nav .= "\n";

    /* do the <<< for the prev month */
    for (i = 0; i < 12; i++) {
        mon = d.mon - i;
        if (mon == 1) {
            buf = sprintf("../../%d/%02d/index.html", d.year-1, 12);
        } else {
            buf = sprintf("../../%d/%02d/index.html", d.year,mon-1);
        }
        if (exists(buf)) break;
    }
    if (exists(buf)) date_nav .= '<a href="' . buf . '">&lt;&lt;&lt;</a>';
    date_nav .= "\n";

    /* do the link to index.html for this month */
    if (slide) {
        date_nav .= ' <a href="index.html">' . month . " index" .
            '</a> ';
    } else {
        date_nav .= " ${month} ";
    }
    date_nav .= "\n";

    /* do the >>> for next month */
    for (i = 0; i < 12; i++) {
        mon = d.mon + i;
        if (mon == 12) {
            buf = sprintf("../../%d/%02d/index.html", d.year+1, 1);
        } else {
            buf = sprintf("../../%d/%02d/index.html", d.year,mon+1);
        }
        if (exists(buf)) break;
    }
    if (exists(buf)) {
        date_nav .= '<a href="' . buf . '">&gt;&gt;&gt;</a>';
    }
    date_nav .= "\n";
    date_nav .= ' ';
    date_nav .= "\n";

    did_it = 0;
    buf = sprintf("../../%d/%02d/index.html", d.year - 1, d.mon);
    unless (exists(buf)) for (i = 1; i < 12; i++) {
        buf = sprintf("../../%d/%02d/index.html", d.year - 1, d.mon+i);
        if (exists(buf)) break;
        buf = sprintf("../../%d/%02d/index.html", d.year - 1, d.mon-i);
        if (exists(buf)) break;
    }
    if (exists(buf)) {
        date_nav .= '<a href="' . buf . '">&lt;&lt;&lt;</a> ' . "${d.year}";
        date_nav .= "\n";
        did_it++;
    }
    buf = sprintf("../../%d/%02d/index.html", d.year + 1, d.mon);
    unless (exists(buf)) for (i = 1; i < 12; i++) {
        buf = sprintf("../../%d/%02d/index.html", d.year + 1, d.mon+i);
        if (exists(buf)) break;
        buf = sprintf("../../%d/%02d/index.html", d.year + 1, d.mon-i);
        if (exists(buf)) break;
    }
    if (exists(buf)) {
        unless (did_it) date_nav .= "${d.year}";
        date_nav .= ' <a href="' . buf .
            '">&gt;&gt;&gt;</a>';
        date_nav .= "\n";
    }
}

void dotfiles(void)
{
    string file, buf;

    unless (isdir("~/.photos")) mkdir("~/.photos");
    file = "~/.photos/slide.html";
    unless (exists(file)) {
        buf = <<'END'
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<TITLE>%TITLE%</TITLE>
<LINK rel="stylesheet" type="text/css" href="photos.css">
<LINK rel="contents" href="index.html">
<LINK rel="next" href="%NEXT_HTML%" title="%NEXT_TITLE%">
<LINK rel="previous" href="%PREV_HTML%" title="%PREV_TITLE%">
<SCRIPT type="text/javascript" language="javascript" defer>
<!--
if (document.images) {
    Image1 = new Image();
    Image1.src = "%NEXT_SLIDE%";
}
//-->
</SCRIPT>
</HEAD>
<BODY>
<P class="center">
%NAV%
</P>
<DIV class="center">
<TABLE bgcolor="#ffffff" cellspacing=0 cellpadding=4>
<TR>
<TD class="slide">
<A href="%ORIG%">
<IMG src="%SLIDE%" alt="%TITLE%" title="Click here to see full size, then use your back button." border=0></a>
</TD>
</TR>
</TABLE>
<P>
%CAPTION%
</DIV>
</BODY>
</HTML>
END;
        Fprintf(file, "%s", buf);
    }
    file = "~/.photos/index.html";
    unless (exists(file)) {
        buf = <<'END'
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<TITLE>%TITLE%</TITLE>
<LINK rel="stylesheet" type="text/css" href="photos.css">
</HEAD>
<BODY>
%TITLE%
%NAV%
<p>
%THUMBS%
<p align="center">
%NAV%
<P class="small">
For each picture there are 3 sizes:
(1) the index thumbnails you are looking at,
(2) a mid sized picture that you get to by clicking the thumbnail,
(3) the original that you get to by clicking the midsize.
Legal crud: everything is copyrighted by whoever took the picture.
In the unlikely event you want to use a picture, please ask just to
make us feel good.
</P>
</BODY>
</HTML>
END;
        Fprintf(file, "%s", buf);
    }
    file = "~/.photos/photos.css";
    unless (exists(file)) {
        buf = <<'END'
.center {
    text-align: center;
}
.center table {
    margin-left: auto;
    margin-right: auto;
    text-align: center;
}
body {
    font-family: verdana, sans-serif;
    background: #000000;
    color: #DDDDDD;
}
a:link {
    color: #95DDFF;
    background: transparent;
}
a:visited {
    color: #AAAAAA;
    background: transparent;
}
a:hover {
    color: #BBDDFF;
    background: #555555;
}
.small {
    font-size: 50%;
}
.large {
    font-size: 200%;
}
.tiled {
    background-image: url(".tile.png");
    background-repeat: repeat-x;
    background-color: #000000;
    padding: 0;
}
.thumb {
    background-color: #000000;
    text-align: center;
    vertical-align: middle;
}
.slide {
    background-color: #ffffff;
    text-align: center;
    vertical-align: middle;
}
END;
        Fprintf(file, "%s", buf);
    }
}
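The slide and index pages above are produced by filling %NAME% placeholders in a template from a flat list of key/value pairs passed to String_map(). The following is a minimal Python sketch of that substitution step, not Little code. The helper name string_map is hypothetical, and the single left-to-right pass (replaced text is never rescanned) is an assumption modeled on Tcl's string map rather than something the listing itself states.

```python
def string_map(pairs, text):
    """Replace keys with values in one left-to-right pass.

    `pairs` is a flat list [key, value, key, value, ...], the same shape
    as the map[] array built in html() above. This is an illustrative
    stand-in, not the real String_map().
    """
    keys = pairs[0::2]
    values = pairs[1::2]
    out = []
    i = 0
    while i < len(text):
        for key, value in zip(keys, values):
            # First key (in list order) that matches at this position wins.
            if key and text.startswith(key, i):
                out.append(value)
                i += len(key)
                break
        else:
            # No key matched here: copy one character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


template = '<TITLE>%TITLE%</TITLE> <LINK rel="next" href="%NEXT_HTML%">'
page = string_map(["%TITLE%", "beach", "%NEXT_HTML%", "2.html"], template)
print(page)
```

Keeping the pairs in a flat list is what lets html() append the %CAPTION% and %NAV% entries later with push().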
pod2html.l
This is a Little implementation of pod2html. It is pretty stripped down, but its output is slightly prettier than that of the Perl pod2html.
int main(int ac, string av[])
{
    FILE f;
    int i, ul;
    int space = 0, dd = 0, p = 0, pre = 0, table = 0;
    string head, buf, tmp, title, trim, all[];

    // lint
    if (0) ac++;

    /*
     * -t<title> or --title=<title>
     */
    for (i = 1; defined(av[i]) && (av[i] =~ /^-/); i++) {
        if (av[i] == "--") {
            i++;
            break;
        }
        if ((av[i] =~ /--title=(.*)/) || (av[i] =~ /-t(.*)/)) {
            title = $1;
        } else {
            die("usage: ${av[0]} [--title=whatever]");
        }
    }
    if (!defined(av[i]) || defined(av[i+1]) ||
        !defined(f = fopen(av[i], "r"))) {
        die("usage: ${av[0]} filename");
    }
    unless (defined(title)) title = av[i];

    header(title);

    /*
     * Load up the whole file in all[] and spit out the index.
     */
    puts("<ul>");
    ul = 1;
    while (defined(buf = <f>)) {
        push(&all, buf);
        if (buf =~ /^=head(\d+)\s+(.*)/) {
            i = (int)$1;
            while (ul > i) {
                puts("</ul>");
                ul--;
            }
            while (i > ul) {
                puts("<ul>");
                ul++;
            }
            tmp = $2;
            tmp =~ s/\s+/_/g;
            buf =~ s/^=head(\d+)\s+//;
            puts("<li><a href=\"#${tmp}\">${buf}</a></li>");
        }
    }
    while (ul--) puts("</ul>");
    fclose(f);

    /*
     * Now walk all[] and process the markup.  We currently handle:
     * =head%d title
     * =over
     * =item name
     * =proto return_type func(args)
     * =back
     * <blank line>
     * B<bold this>
     * C<some code>
     * I<italics>
     */
    for (i = 0; i <= length(all); i++) {
        buf = inline(all[i]);
        if (buf =~ /^=head(\d+)\s+(.*)/) {
            if ((int)$1 == 1) puts("<HR>");
            tmp = $2;
            tmp =~ s/\s+/_/g;
            printf("<H%d><a name=\"%s\">%s</a></H%d>\n",
                $1, tmp, $2, $1);
        } else if (buf =~ /^=over/) {
            puts("<dl>");
        } else if (buf =~ /^=item\s+(.*)/) {
            if (dd) {
                puts("</dd>");
                dd--;
            }
            puts("<dt><strong>${$1}</strong></dt><dd>");
            dd++;
        } else if (buf =~ /^=proto\s+([^ \t]+)\s+(.*)/) {
            if (dd) {
                puts("</dd>");
                dd--;
            }
            puts("<dt><b>${$1} ${$2}</b></dt><dd>");
            dd++;
        } else if (buf =~ /=table/) {
        } else if (buf =~ /^=back/) {
            if (dd) {
                puts("</dd>");
                dd--;
            }
            puts("</dl>");
        } else if (buf =~ /^\s*$/) {
            if (p) {
                puts("</p>");
                p = 0;
            }
            if (pre) {
                /*
                 * If we see a blank line in a preformatted
                 * block, we don't want to stop the pre
                 * unless the next line is not indented.
                 * So peek ahead.
                 */
                if (defined(buf = all[i+1]) && (buf =~ /^\s/)) {
                    puts("");
                    continue;
                }
                puts("</pre>");
                pre = 0;
                trim = undef;
            }
            space = 1;
        } else {
            if (space) {
                if (buf =~ /^(\s+)[^ \t]+/) {
                    trim = $1;
                    puts("<pre>");
                    pre = 1;
                } else {
                    puts("<p>");
                    p = 1;
                }
                space = 0;
            }
            if (defined(trim)) buf =~ s/^${trim}//;
            puts(buf);
        }
    }
    puts("</body></html>");
    return (0);
}

/*
 * header and style sheet
 */
void header(string title)
{
    string head = <<EOF
<html>
<head>
<title>${title}</title>
<style>
pre {
    background: #eeeedd;
    border-width: 1px;
    border-style: solid solid solid solid;
    border-color: #ccc;
    padding: 5px 5px 5px 5px;
    font-family: monospace;
    font-weight: bolder;
}
body {
    padding-left: 10px;
}
dt {
    font-size: large;
}
</style>
</head>
<body>
EOF

    puts(head);
    puts("<h1>${title}</h1>");
}

/*
 * Process B<bold>, C<code>, I<italic>, F<italic>, L<link>, S<non-breaking>.
 * This will handle nested stuff like C<if (!I<condition>)>
 * but dies if there are nested ones of the same type.
 */
string inline(string buf)
{
    string c, prev, result, link, stack[];
    int B = 0, C = 0, I = 0, L = 0, S = 0;

    foreach (c in buf) {
        if ((c == "<") && defined(prev)) {
            if (prev == "B") {
                if (B++) die("Nested B<> unsupported: ${buf}");
                result[END] = "";
                result .= "<B>";
                push(&stack, "B");
            } else if (prev == "C") {
                if (C++) die("Nested C<> unsupported: ${buf}");
                result[END] = "";
                result .= "<CODE>";
                push(&stack, "CODE");
            } else if (prev == "I" || prev == "F") {
                if (I++) die("Nested I<> unsupported: ${buf}");
                result[END] = "";
                result .= "<I>";
                push(&stack, "I");
            } else if (prev == "L") {
                if (L++) die("Nested L<> unsupported: ${buf}");
                result[END] = "";
                result .= "<a href=\"";
                link = "";
                push(&stack, "L");
            } else if (prev == "S") {
                if (S++) die("Nested S<> unsupported: ${buf}");
                result[END] = "";
                push(&stack, "S");
            } else {
                result .= "&lt;";
                prev = c;
            }
        } else if ((c == ">") && length(stack)) {
            c = pop(&stack);
            if (c == "B") {
                B--;
            } else if (c == "CODE") {
                C--;
            } else if (c == "I") {
                I--;
            } else if (c == "L") {
                L--;
                result .= "\">${link}</a>";
                c = undef;
            } else {
                S--;
                c = undef;
            }
            if (defined(c)) {
                result .= "</" . c . ">";
            }
            prev = undef;
        } else {
            if (S && isspace(c)) {
                result .= " ";
            } else if (c == "<") {
                result .= "&lt;";
            } else if (c == ">") {
                result .= "&gt;";
            } else {
                result .= c;
            }
            if (L) link .= c;
            prev = c;
        }
    }
    return (result);
}
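The inline() routine above tracks open formatting codes on a stack, which is why differently typed codes can nest while a code nested inside the same type is rejected. The following is a simplified Python sketch of that idea, not Little code and not a port of the full routine. It handles only B, C, I and F, omits L and S, and resets the previous-character marker after opening a tag, which the Little version does not do. The names MARKERS and inline_sketch are made up for illustration.

```python
# Map a POD formatting letter to the HTML tag it opens (subset only).
MARKERS = {"B": "B", "C": "CODE", "I": "I", "F": "I"}


def inline_sketch(buf):
    """Convert B<>, C<>, I<>, F<> codes to HTML using a tag stack."""
    result = []   # output pieces; a marker letter is always its own piece
    stack = []    # currently open HTML tags, innermost last
    prev = None   # previous literal character, if any
    for c in buf:
        if c == "<" and prev in MARKERS:
            tag = MARKERS[prev]
            if tag in stack:
                raise ValueError("Nested %s<> unsupported: %s" % (prev, buf))
            result.pop()                  # drop the marker letter just emitted
            result.append("<%s>" % tag)
            stack.append(tag)
            prev = None
        elif c == ">" and stack:
            result.append("</%s>" % stack.pop())
            prev = None
        else:
            # Anything else is literal text; escape bare angle brackets.
            result.append({"<": "&lt;", ">": "&gt;"}.get(c, c))
            prev = c
    return "".join(result)


print(inline_sketch("Use C<if (!I<condition>)> for B<negation>."))
# prints: Use <CODE>if (!<I>condition</I>)</CODE> for <B>negation</B>.
```

Because closing always pops the innermost tag, the output stays well nested without any lookahead.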