Datastructures in Tcl

This document will teach you a little about building more complex data structures in Tcl. I’m assuming you already know basic Tcl. If you don’t, you can pick some up at <a href="http://www.arsdigita.com/books/tcl/">Tcl for Web Nerds</a>.



Take Advantage Of Arrays



You probably know well by now that Tcl doesn’t support complex data structures very well. You might also know from <a href="http://sicp.arsdigita.org/text/sicp/" title="Abelson and Sussman: Structure and Interpretation of Computer Programs">the textbook</a> that any data structure can be represented with lists. So theoretically, Tcl does support data structures, although building them from lists alone can be very cumbersome.





Tcl associative arrays are good for C struct or Pascal record type things. Except, of course, in Tcl they’re not declared ahead of time, so there’s no way to know in advance what keys exist in the array.
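A minimal sketch of the struct-like use described above. The names artist, name, genre and born are only illustrative. Since nothing declares the keys, info exists and array names are the usual ways to find out what is actually there.

```tcl
# An array used like a C struct; keys are created on first assignment.
set artist(name)  "Miles Davis"
set artist(genre) "Jazz"

# Nothing declares which keys exist, so check before reading one.
if {[info exists artist(born)]} {
    puts "Born : $artist(born)"
} else {
    puts "No 'born' key set"
}

# array names lists the keys currently present (in no guaranteed order),
# so the result is sorted here before printing.
puts [lsort [array names artist]]
```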



Unfortunately, you can’t pass the value of an array to a procedure like you can a string or a list: lists and strings are really the same thing, but arrays are different. There are two ways to pass an array. The first is to use [array get <i>arrayname</i>] to get a list representation of the array, which can then be turned back into an array with array set <i>arrayname</i> $<i>arraylist</i>. Here’s how that could look:



proc print_artist { artist_info } {
    array set artist $artist_info
    puts "Name : $artist(name)"
    puts "Genre: $artist(genre)"
}

set myartist(name) "Miles Davis"
set myartist(genre) "Jazz"

print_artist [array get myartist]




The advantage to this is that it’s clean. You use arrays both inside and outside the procedure and just need to use [array get myartist] instead of $myartist when you pass it to the proc. The downside is performance: if you have thousands of keys, translating the array into a list and then back into an array is slow.
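Part of what makes this style clean is that the proc works on its own copy. The following is a small sketch of that, not something from the original example: the proc name rename_artist and the array name renamed are made up for illustration.

```tcl
# Hypothetical proc: receives a list copy of the array, not the array itself.
proc rename_artist { artist_info new_name } {
    array set artist $artist_info
    set artist(name) $new_name      ;# changes the local copy only
    return [array get artist]
}

set myartist(name)  "Miles Davis"
set myartist(genre) "Jazz"

# The result is a flat key/value list; the caller's array is untouched.
array set renamed [rename_artist [array get myartist] "John Coltrane"]

puts "Original: $myartist(name)"
puts "Copy    : $renamed(name)"
```

If the caller wants the change, it has to take the returned list and array set it explicitly, which keeps the data flow visible at the call site.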





If the proc doesn’t modify the value of the argument, you can pass the name of the array to the proc, and the proc can then do an upvar to access the values. This is much faster, since you only need to pass the reference and not the whole value, but it is also more error-prone, since you must be very careful not to alter the value of the array from the proc. Here’s how that could look:



proc print_artist { artist_varname } {
    upvar $artist_varname artist
    puts "Name : $artist(name)"
    puts "Genre: $artist(genre)"
}

set myartist(name) "Miles Davis"
set myartist(genre) "Jazz"

print_artist myartist


I strongly prefer the first type, unless what I want really is pass-by-reference, i.e. I want side-effects. I’d rather let the computer work a little harder to make my life easier than save a few CPU cycles and risk spending hours looking for a bug caused by unintended side-effects. Heck, I even <a href="http://setiathome.berkeley.edu/">give away extra clock cycles for free</a>.
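To make the side-effect risk concrete, here is a minimal sketch of an upvar proc that writes to the array it was handed. The proc name tag_artist is hypothetical and only there to show the effect.

```tcl
# Hypothetical proc: upvar links the local name "artist" to the caller's array.
proc tag_artist { artist_varname } {
    upvar 1 $artist_varname artist
    set artist(genre) "Unknown"     ;# this writes to the caller's array
}

set myartist(name)  "Miles Davis"
set myartist(genre) "Jazz"

tag_artist myartist
puts "Genre: $myartist(genre)"
```

After the call, the caller's genre is no longer "Jazz". That is exactly what you want for deliberate pass-by-reference, and exactly the kind of bug that is hard to track down when it happens by accident.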



Use Clever Naming



What if you want a list of arrays? You can have the elements of the list be [array get] list representations of the arrays, but that’s not very efficient if you want to access those values frequently, since you’d be converting from list representation to arrays over and over again.
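For reference, the list-of-representations approach might look like the sketch below. The variable names a, b and artist_info are assumptions made for this example.

```tcl
# Each list element is the [array get] form of one artist.
set a(name)  "Miles Davis"
set a(genre) "Jazz"
set b(name)  "Bruce Springsteen"
set b(genre) "Rock"
set artists [list [array get a] [array get b]]

foreach artist_info $artists {
    # The list has to be turned back into an array on every access.
    # Clear the scratch array first so keys from the previous entry don't linger.
    array unset artist
    array set artist $artist_info
    puts "Name : $artist(name)"
    puts "Genre: $artist(genre)"
}
```

The conversion inside the loop is the cost the paragraph above refers to: it is paid again every time an entry is read.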





A much more efficient approach is to name the arrays cleverly. In Tcl, the name of a variable, including an array, can be any string. It can include spaces and special characters such as dots, commas, newlines, tabs, parentheses, etc. You just need to quote it right. We can make use of this.



global last_artist_id artists
set last_artist_id 0
set artists [list]

proc add_artist { name genre } {
    global last_artist_id artists
    set id [incr last_artist_id]

    global artist,$id
    set artist,${id}(name) $name
    set artist,${id}(genre) $genre

    lappend artists artist,$id
}

proc print_artists {} {
    global artists
    foreach artist $artists {
        global $artist
        puts "Name : [set ${artist}(name)]"
        puts "Genre: [set ${artist}(genre)]"
    }
}

add_artist "Miles Davis" "Jazz"
add_artist "Bruce Springsteen" "Rock"

print_artists


In this example, the entries in the list are really names of arrays, names which are automatically generated (i.e. artist,1 and artist,2) in a way that we know is globally unique. The trick here is to use proper quoting of your variable names. Note for instance how we write set artist,${id}(name) $name: the braces stop Tcl from reading $id(name) as an element of an array named id, so the array name is the whole string artist,${id}. We also have to use [set <i>varname</i>] to get the value of a variable when we construct the variable name dynamically.
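A short sketch of the quoting point, run at the top level rather than inside a proc. The variables failed, msg and varname are introduced only for this illustration.

```tcl
set id 1
set artist,${id}(name) "Miles Davis"   ;# sets element "name" of array "artist,1"

# Without the braces, Tcl tries to read element "name" of an array called "id".
# Here id is a scalar, so the substitution fails; catch records that.
set failed [catch { set artist,$id(name) "oops" } msg]
puts "Without braces: failed=$failed"

# Reading back: build the name first, then let [set] do the lookup.
set varname "artist,${id}"
puts [set ${varname}(name)]
```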



Another thing to be aware of is just how parentheses are interpreted. Only if the last character of the variable name is a right parenthesis is it taken to be an array element. That is, set foo(bar)baz 1 is just interpreted as setting the value of a variable with the strange name "foo(bar)baz". Tcl doesn’t care, since any string is allowed for a variable name. If the name ends with a right parenthesis, and there’s no left parenthesis anywhere, it’s also taken as just a silly name of a variable.





If, however, the name ends with a right parenthesis, and there’s a left paren somewhere else, then everything before the first left parenthesis is taken to be the name of the array, and everything from the first left parenthesis up until the right parenthesis at the end of the name is taken to be the key into the array.





This has some funny implications. Consider, for example:



# variable name is "", array key is "foo"
set (foo) 1

# variable name is "foo", array key is "bar)baz(greble"
set foo(bar)baz(greble) 1

# variable name is "foo(bar)baz", and is not an array
set foo(bar)baz 1


Note how even the empty string is a valid variable name.
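These interpretations can be checked from a Tcl shell. The sketch below repeats the three assignments and then inspects the result with array names, array exists and set.

```tcl
set (foo) 1
set foo(bar)baz(greble) 1
set foo(bar)baz 1

# The array with the empty name has one key.
puts [array names ""]

# "foo" is an array whose single key contains parentheses.
puts [array exists foo]
puts [lindex [array names foo] 0]

# "foo(bar)baz" is a plain scalar, not an array.
puts [array exists {foo(bar)baz}]

# $foo(bar)baz would be parsed as element "bar" of "foo" followed by "baz",
# so the scalar is read with [set] and a braced name instead.
puts [set {foo(bar)baz}]
```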





Now you’re all set to build the most clever of data structures in Tcl. But beware: if you do, you’ll soon feel sorry that you’re not working in a real programming language.

1 comment

Ron TheMystic
 

DON'T use special characters in variable names!!
<PRE>
Hi,

Your discussion of array-passing methods was interesting, but I disagreed with one of your other points. You SHOULD NOT use special characters - such as commas and parenthesis - in your variable names. It makes source code harder to read and support. It also won't pass a code review at many businesses.

I re-wrote your artist example using simpler variable names and code:

# artist_info is our main data array
# artist_info(names) : list of all artist names
# artist_info($name,genre) : genre of the specified artist
global artist_info
set artist_info(names) ""

proc add_artist {name genre} {
    global artist_info
    lappend artist_info(names) $name
    set artist_info($name,genre) $genre
}

proc print_artists {} {
    global artist_info
    foreach name $artist_info(names) {
        puts "Name : $name"
        puts "Genre: $artist_info($name,genre)"
    }
}

add_artist "Miles Davis" "Jazz"
add_artist "Bruce Springsteen" "Rock"
print_artists

Regards,
Ron
</PRE>