Datastructures in Tcl
This document will teach you a little about building more complex data structures in Tcl. I’m assuming you already know basic Tcl. If you don’t, you can pick some up at <a href="http://www.arsdigita.com/books/tcl/">Tcl for Web Nerds</a>.
Take Advantage Of Arrays
You probably know well by now that Tcl doesn’t support complex data structures very well. You might also know from <a href="http://sicp.arsdigita.org/text/sicp/" title="Abelson and Sussman: Structure and Interpretation of Computer Programs">the textbook</a> that any data structure can be represented with lists. So theoretically, Tcl does support data structures, although building them with lists alone can be very cumbersome.
Tcl associative arrays are good for things like a C struct or a Pascal record. Except, of course, in Tcl they’re not declared ahead of time, so there’s no way to know in advance what keys exist in the array.
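As a rough sketch of what that means in practice, here is a record stored in an array, with the keys discovered at run time. The key names (name, genre, born) are just made up for illustration.

```tcl
# A struct-like record stored in an associative array.
# The keys are illustrative; nothing declares them ahead of time.
set artist(name)  "Miles Davis"
set artist(genre) "Jazz"

# The only way to learn which keys exist is to ask at run time.
set keys [lsort [array names artist]]
puts "Keys: $keys"

# Guard against a key that may never have been set.
if {![info exists artist(born)]} {
    puts "No 'born' key on this record"
}
```

Sorting the result of array names matters here because the order in which an array hands back its keys is not something to rely on.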
Unfortunately, you can’t pass the value of an array to a procedure like you can a string or a list: lists and strings are really the same thing, but arrays are different. There are two ways to pass an array. The first is to use [array get <i>arrayname</i>] to get a list representation of the array, which can then be turned back into an array with array set <i>arrayname</i> $<i>arraylist</i>. Here’s how that could look:
proc print_artist { artist_info } {
    array set artist $artist_info
    puts "Name : $artist(name)"
    puts "Genre: $artist(genre)"
}

set myartist(name) "Miles Davis"
set myartist(genre) "Jazz"

print_artist [array get myartist]
The advantage of this is that it’s clean. You use arrays both inside and outside the procedure, and just need to write [array get myartist] instead of $myartist when you pass it to the proc. The downside is performance: if you have thousands of keys, translating the array into a list and then back into an array is slow.
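What "clean" buys you is that the proc works on its own copy. A small sketch, assuming a hypothetical helper called relabel_copy that is not part of the examples in this document:

```tcl
# Hypothetical helper: changes the genre in its own local copy only.
proc relabel_copy { artist_info } {
    array set artist $artist_info
    set artist(genre) "Fusion"
    return [array get artist]
}

set myartist(name)  "Miles Davis"
set myartist(genre) "Jazz"

# The return value is a new list; the caller's array is untouched.
array set changed [relabel_copy [array get myartist]]

puts "Original: $myartist(genre)"   ;# Jazz
puts "Copy    : $changed(genre)"    ;# Fusion
```

If the caller does want the change, it has to ask for it explicitly by storing the returned list, which is exactly why this style is hard to get wrong.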
If the proc doesn’t modify the value of the argument, you can instead pass the name of the array to the proc, and the proc can then use upvar to access the values. This is much faster, since you only need to pass the reference and not the whole value, but it is also more error-prone, since you must be very careful not to alter the value of the array from the proc. Here’s how that could look:
proc print_artist { artist_varname } {
    upvar $artist_varname artist
    puts "Name : $artist(name)"
    puts "Genre: $artist(genre)"
}

set myartist(name) "Miles Davis"
set myartist(genre) "Jazz"

print_artist myartist
I strongly prefer the first type, unless what I want really is pass-by-reference, i.e. I want side-effects. I’d rather let the computer work a little harder to make my life easier than save a few CPU cycles and risk spending hours looking for a bug caused by unintended side-effects. Heck, I even <a href="http://setiathome.berkeley.edu/">give away extra clock cycles for free</a>.
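To make the side-effect risk concrete, here is a minimal sketch. The proc normalize_genre is a hypothetical helper invented for this illustration; the point is only that anything it writes through upvar lands in the caller’s array.

```tcl
# Hypothetical helper: lowercases the genre through upvar,
# which writes straight into the caller's array.
proc normalize_genre { artist_varname } {
    upvar 1 $artist_varname artist
    set artist(genre) [string tolower $artist(genre)]
}

set myartist(name)  "Miles Davis"
set myartist(genre) "Jazz"

normalize_genre myartist
puts "Genre is now: $myartist(genre)"   ;# jazz, the caller's value changed
```

Nothing at the call site hints that myartist is about to change, which is what makes this kind of bug slow to find.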
Use Clever Naming
What if you want a list of arrays? You can have the elements of the list be [array get] list representations of the arrays, but that’s not very efficient if you want to access those values frequently, since you’d be converting from list representation to arrays over and over again.
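Here is roughly what that list-of-lists approach looks like, as a sketch. The key/value lists are written out by hand here, but they have the same shape as what array get returns.

```tcl
# Build a list whose elements are array-get style key/value lists.
set artists [list]
lappend artists [list name "Miles Davis" genre "Jazz"]
lappend artists [list name "Bruce Springsteen" genre "Rock"]

set names [list]
foreach artist_info $artists {
    # Every access converts the list back into a fresh array.
    array unset artist
    array set artist $artist_info
    lappend names $artist(name)
    puts "Name : $artist(name)"
    puts "Genre: $artist(genre)"
}
```

The array unset on each pass keeps keys from one record from leaking into the next, and the repeated array set is the conversion cost mentioned above.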
A much more efficient approach is to give the arrays cleverly chosen names. In Tcl, the name of a variable, including an array, can be any string. It can include spaces and special characters such as dots, commas, newlines, tabs, parentheses, etc. You just need to quote it right, and the name can be anything. We can make use of this.
global last_artist_id artists
set last_artist_id 0
set artists [list]

proc add_artist { name genre } {
    global last_artist_id artists
    set id [incr last_artist_id]

    global artist,$id
    set artist,${id}(name) $name
    set artist,${id}(genre) $genre

    lappend artists artist,$id
}

proc print_artists {} {
    global artists
    foreach artist $artists {
        global $artist
        puts "Name : [set ${artist}(name)]"
        puts "Genre: [set ${artist}(genre)]"
    }
}

add_artist "Miles Davis" "Jazz"
add_artist "Bruce Springsteen" "Rock"

print_artists
In this example, the entries in the list are really names of arrays, names which are automatically generated (i.e. artist,1 and artist,2) in a way that we know is globally unique. The trick here is to use proper quoting of your variable names. Note for instance how we say set artist,${id}(name) $name, with braces around id, so Tcl doesn’t interpret id as the name of an array, but rather takes the whole string artist,${id} as the array name. We also have to use [set <i>varname</i>] to get the value of a variable when we construct the variable name dynamically.
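If the [set ...] calls get tiresome, one alternative worth knowing is to alias the generated name to a plain local name with upvar #0. This is a sketch of a different technique than the one used above, and artist_field is a hypothetical helper, not something from the example.

```tcl
# A global array with a generated name, in the same style as artist,$id.
set artist,1(name)  "Miles Davis"
set artist,1(genre) "Jazz"

# Hypothetical helper: look up one field of a dynamically named array.
proc artist_field { array_name field } {
    # Link the local name "a" to the global array named in $array_name.
    upvar #0 $array_name a
    return $a($field)
}

puts [artist_field artist,1 name]
puts [artist_field artist,1 genre]
```

Once the link is made, the ordinary $a(key) syntax works, so the awkward quoting stays in one place. The usual upvar caveat applies: writes through a would change the global array.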
Another thing to be aware of is just how parentheses are interpreted. Only if the last character of the variable name is a right parenthesis is it taken to be an array element. That is, set foo(bar)baz 1 just gets interpreted as setting the value of a variable with the strange name "foo(bar)baz". Tcl doesn’t care, since any string is allowed for a variable name. If the name ends with a right parenthesis, and there’s no left parenthesis anywhere, it’s also taken as just a silly name of a variable.
If, however, the name ends with a right parenthesis, and there’s a left parenthesis somewhere before it, then everything before the first left parenthesis is taken to be the name of the array, and everything between the first left parenthesis and the right parenthesis at the end of the name is taken to be the key into the array.
This has some funny implications. Consider, for example:
# variable name is "", array key is "foo"
set (foo) 1

# variable name is "foo", array key is "bar)baz(greble"
set foo(bar)baz(greble) 1

# variable name is "foo(bar)baz", and is not an array
set foo(bar)baz 1
Note how even the empty string is a valid variable name.
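You don’t have to take my word for it. A quick sketch that asks Tcl itself how it read each of those names, using only array exists, array names and info exists:

```tcl
set (foo) 1
set foo(bar)baz(greble) 1
set foo(bar)baz 1

# The variable with the empty name is an array with the key "foo".
puts "empty name is an array: [array exists {}]"
puts "its keys: [array names {}]"

# "foo" is an array whose single key contains parentheses.
set key [lindex [array names foo] 0]
puts "foo is an array: [array exists foo]"
puts "its key: $key"

# "foo(bar)baz" exists as a plain variable, not as an array.
puts "foo(bar)baz exists: [info exists foo(bar)baz]"
puts "foo(bar)baz is an array: [array exists foo(bar)baz]"
```

Note that foo and foo(bar)baz end up as two separate variables that just happen to share a prefix.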
Now you’re all set to build the most clever of data structures in Tcl. But beware: if you do, you’ll soon feel sorry that you’re not working in a real programming language.
About Calvin Correli
I've spent the last 17 years learning, growing, healing, and discovering who I truly am, so that I'm now living every day aligned with my life's purpose.