Arguing Semantics

We all search for meaning in things. Software development seems to be an exception. Some of the things I've seen make me wonder if the people responsible ever took a step back and thought "what is the reason for this?" At a very basic level, understanding the meaning of a thing lets you use it far more effectively than you otherwise could, and tells you when to use it and when not to. From a more artistic point of view, trying to understand the meaning of something gives you a deeper connection with it and a greater range of expression with it.

The study of meanings, according to Webster, is called semantics. I'd like to elaborate on a few simple areas where I think looking at the semantics of something would improve software development practices, including student work.

Programming Languages

Something student supervisors often run into, at least at my university, is students selecting the programming language for their project based on what they already know. Apart from the various other problems with this approach, I find that most students don't actually know the languages they've been taught, at least not in any proper sense. It usually turns out that what they know is the syntax of a language and not its semantics. Just because you had to use Java for a semester-long subject doesn't mean you really understand how to apply Java properly.

To use an analogy: it's the difference between knowing how to write so that your spelling and grammar are correct, and being able to write a great literary work of art. Being able to do the first does not guarantee the second.

So the first thing to do is accept that this is the case and try to understand the semantics of a language. For instance, where did Java come from? What deficiencies in other languages was it trying to correct? What deficiencies of its own does it have? What was the intention for the language? Where is it usually used? What other languages is it related to? How does it compare to other languages? How is it usually used? What problems are solved easily with it? What problems are hard to solve with it? And so on.

I think if you looked into these questions for each of the languages you use then you'd have a much better understanding of those languages. Most programming languages have entire cultures and communities surrounding them, and these are a key resource when working out the various meanings behind a language.

Naming

When developing software, engineers have to assign names to things on a regular basis. Variables, functions and classes probably spring immediately to mind but other things like directories, files, database tables, team member roles, documents and project phases all need (or want) names as well. How do you come up with those names? What do the names mean?

What I often see happen is students select a name based on a "template" given to them because they don't have enough experience or confidence to select a name themselves. This is to be expected but I think some thought still needs to go into the semantics of that name.

For example, when creating documents in LaTeX, students are usually told that they can break them up into sections, placing each section in its own file. I like this kind of organisation but often students will split up their sections into files called part1.tex, part2.tex, section1.tex, appendix1.tex, etc. This makes things very confusing and makes it harder to make changes later on. Say I wanted to go in and change the meeting procedures section of the management plan: how would I do it? How do I know which file contains the relevant information? What happens if I want to add a section later on? Do I go through and rename all the files to match the new section numbers? If so, I don't see much point in using LaTeX's built-in ability to generate section numbers for you. It's far better to name each file according to what sort of content is in it, rather than according to its (current) section number. Names like meetings.tex, constraints.tex and so on are far more meaningful and allow for easy adjustments later on.

Variables are another area where most people get a bit slack and occasionally don't bother naming things properly. I think everyone knows that variables should be named according to what they're going to be used for, such as user_city, and not given generic names like var45. Shortened names for common tasks, like using i for a looping variable, are totally acceptable as long as the purpose of the variable is clear and its scope is limited to the area in or around the loop. Variable names can be used more intelligently though. You can use a name to specify how the variable is meant to be used and not just what it will store. Joel Spolsky goes through some examples of this in his excellent article Making Wrong Code Look Wrong, which is a highly recommended read.
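
To make the idea concrete, here is a small Python sketch in the spirit of that article: a prefix on the name says whether a string is still unsafe (raw user input) or safe (already escaped for HTML). The us_ and s_ prefixes follow the convention as I understand it from the article; the function name render_greeting and the sample input are invented for illustration.

```python
from html import escape


def render_greeting(us_name: str) -> str:
    """Build a greeting from a name typed in by the user."""
    # "us" = unsafe string: it came from the user and has not been escaped.
    # "s"  = safe string: it has been escaped and may be placed in HTML.
    s_name = escape(us_name)
    return f"<p>Hello, {s_name}!</p>"


# Hypothetical hostile input, used only to show the effect of escaping.
us_input = "<script>alert('hi')</script>"
print(render_greeting(us_input))
```

With this convention, any line that drops a us_ value straight into markup looks wrong at a glance, which is the whole point of putting the usage into the name.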

Web Pages

This is a rather obvious one but there are still plenty of "unenlightened" people out there making pages with table-based layouts and using <font> tags all over the place. I hope to write a full post on this topic to go into it in some depth but for now I'd like to say that HTML does include some tags other than <font>, <table>, <div>, <span> and <a>! Some tags that might be far more appropriate are <em> for emphasised text, <dl> for lists of definitions, <abbr> for abbreviations and acronyms (there's also an <acronym> tag but it's going to be deprecated in XHTML 2) and so on. Tags also have various attributes for adding semantic content that are often underused or simply misused.

Coming back to the naming topic above, it's common to see CSS classes and ids that are called things like "left_div". I think we should try to avoid naming things like this (I'm guilty of it myself). Putting an area of the page into a hook name doesn't actually tell us anything about the semantics of that hook or what it's meant to be for. If you use that hook for your navigation menu, what happens when you want to move the menu over to the right-hand side? The name is now wrong, whereas a name that describes the content, such as "navigation", would still fit.

I also think classes and ids could be used better. Most pages I look at seem to have something against ids and don't use them at all, preferring classes for things like "header" or "content", of which there is only one. Using an id carries semantics of its own: it tells you that the tag is the only one of its kind. I like to use ids for the overall structure of a page and classes for things which I expect will be repeated or which just make more sense as a class. As I wrote in my post on [CSS image replacement][3], id basically says “this thing is the one and only X” whereas a class says “this thing is a type of Y”. From this you can see that certain types of meaning are implied with classes and ids.
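
As a rough illustration of the "one and only" meaning of an id, here is a minimal Python sketch that lists any id used more than once on a page, while leaving repeated classes alone. The names IdCollector and duplicate_ids and the sample markup are made up for this example; it relies only on the standard library's html.parser and is not a substitute for a real validator.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Counts how often each id value appears in the markup it is fed."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(markup: str) -> list[str]:
    """Return the id values that occur more than once, sorted by name."""
    collector = IdCollector()
    collector.feed(markup)
    return sorted(i for i, count in collector.ids.items() if count > 1)


# Hypothetical page: "note" is a class, so repeating it is fine;
# "content" is an id, so the second use contradicts its meaning.
page = """
<div id="header">Site title</div>
<div id="content">
  <p class="note">First note</p>
  <p class="note">Second note</p>
</div>
<div id="content">A second content block</div>
"""
print(duplicate_ids(page))
```

Repeating the "note" class raises nothing here because a class only claims "this is a type of Y"; only the repeated id breaks the promise its name makes.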

Conclusion

Hopefully with some of these example areas you can see the importance of proper semantics in software development. I think if we take a look at the meanings behind the things we use then we gain access to a much richer and deeper understanding of our work.

[3]: http://www.chnorton.com.au/2007/06/07/css-image-replacement-tutorial/