Command names

From ProgClub

I am eminently unqualified for this job of generating grammars. But I needed to figure out: if I was going to write a backup script for my database server 'unity', was I gonna name it 'unity-backup.sh' or 'backup-unity.sh'? And if I was going to defer my 'unity' backup script to a more general script that can back up any particular server, taking the MySQL/MariaDB server host name as an argument, was I going to call that script 'mysql-backup.sh' or 'backup-mysql.sh'? They say that form is liberating, and I believe them. So I wanted to develop some guidelines which can help me in such matters.

The "grammar" described here might not yet be sufficiently powerful for all possible scripts; this is a work in progress and these are just first thoughts. If you have suggestions for me (or are aware of prior art; I can't be the first person to have considered this issue) I would be very pleased to hear from you. Feel free to get in contact.

Scope

These are my nascent thoughts on naming things, specifically naming Linux commands which will be issued in an interactive console session (a subset might be issued directly and potentially noninteractively from an invoking script). The commands may be implemented as BASH functions or BASH aliases; or they might be implemented as scripts or executables, living typically in some 'bin' directory of some project. Commands may or may not be in a user's environment or $PATH. Obviously BASH functions and BASH aliases must be in the user's environment to be useful; scripts and executables on the other hand need not be in the user's $PATH as they can be invoked directly with either a fully-qualified command or a suitable relative command with respect to the user's $PWD.

The scope of these guidelines is limited to the command name only. This document does not concern itself with other details such as command arguments, or command input/output, or the formats thereof.

Note too that these guidelines are for project-specific stuff. If you're naming a script for inclusion in /bin or /usr/bin you would be well advised to consult the Debian Policy Manual instead.

Standard

Command names must be all lower case and can contain alphanumeric ASCII characters and 'dash' (-) and 'dot' (.), but they must not start or end with dash or dot.
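The rule above is mechanical enough to check in a few lines of shell. What follows is only a sketch: the function name 'is_valid_command_name' is made up for illustration and is not part of any existing tool.

```shell
# Return 0 if the name follows the standard above, 1 otherwise.
# 'is_valid_command_name' is a hypothetical helper, not an existing command.
is_valid_command_name() {
  # Force the C locale so that a-z means exactly the 26 ASCII lower case letters.
  local LC_ALL=C
  case "$1" in
    '' | *[!a-z0-9.-]*) return 1 ;;   # empty, or contains a character outside a-z 0-9 . -
    [.-]* | *[.-])      return 1 ;;   # starts or ends with a dash or a dot
  esac
  return 0
}

is_valid_command_name 'backup-mysql.sh' && echo 'backup-mysql.sh is fine'
is_valid_command_name 'Backup-MySQL.sh' || echo 'Backup-MySQL.sh breaks the standard'
```

Note that the check only covers the character rules; it can't tell you whether the name is any good.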

If the command is implemented as an ELF binary, a BASH function, or a BASH alias, then the name has no standard suffix (and typically no dots). Otherwise the command name ends with a suffix (file extension) that indicates the implementation, typically '.sh' for BASH scripts and '.php' for PHP scripts and so on for your weapon of choice. Yes this leaks an implementation detail into the interface, and yes we're gonna do that anyway.

Generally a script with a file extension as a suffix provides fewer guarantees of interface stability than executable/function/alias commands which do not have such suffixes, but that is not a hard rule.

Commands which users will execute often should be short, say one to four characters long; commands which are invoked by scripts or less frequently should have longer more descriptive names.

Short names

For short command names an acronym, abbreviation, or contraction is generally appropriate.

Try to pick something that:

  1. is short (say one to four characters long)
  2. is memorable (yes this is highly subjective)
  3. is easy to type on a QWERTY keyboard (e.g. "qzwx" is not easy to type)
  4. doesn't conflict with any existing commands you know of (at least check in your environment before you commit to a name)
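For point 4, a quick check in your own environment might look like the sketch below. The helper name 'name_is_free' is invented for this example. It relies on 'command -v', which in BASH reports aliases, functions, builtins, keywords, and executables found on $PATH.

```shell
# Return 0 if nothing in the current environment already answers to the name.
# 'name_is_free' is a hypothetical helper name.
name_is_free() {
  if command -v "$1" >/dev/null 2>&1; then
    return 1   # already an alias, function, builtin, keyword, or file on $PATH
  fi
  return 0
}

name_is_free 'cd' || echo "'cd' is already taken"
```

Run it in the same kind of session your users will have, because aliases and functions only show up if the files defining them have been sourced. In an interactive BASH session 'type -a name' will also list every match rather than just the first.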

Note that if you are planning to use an acronym/initialism then you might like to consider the long name specification for the order of the parts.

Long names

For long command names we want to be consistent and semantic so the rest of this specification is about how to compose the parts of a longer command name.

The number one goal of this naming convention is that if you know there's a script for what you want to do, you can guess its name from what it is that you want to do. Similarly, if you know what the script you're about to write will do, you know mechanically what it should be called.

The basic rule is "verb first", so verb-noun or verb-noun-noun. The nouns might be hardcoded/global, or they might be supplied as command-line arguments.

If you have a process that operates on multiple nouns you can "stack" them "containerwise". E.g. a database server contains multiple databases so if you were to operate on a variable database on a variable server the name would be $process-$server-$database (as verb-noun-noun, e.g. 'backup-mysql-database.sh') and the server host name and database name would be passed as arguments.

Note that if your script can do something to an entire container or optionally just to some parts of a container then just name it for the container, not the parts. For example if your script can back up all the databases on a server, or potentially just a few specific nominated databases, then don't name your script 'backup-mysql-database.sh', just name it 'backup-mysql.sh'. If you found you were backing up single databases a lot you could create the narrower 'backup-mysql-database.sh' script and just defer the implementation to your 'backup-mysql.sh' with appropriate options set automatically.
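Here is a rough sketch of that deferral. To keep it self-contained, two shell functions stand in for the two scripts, and the argument layout (host name first, then zero or more database names) is an assumption made for illustration only; this document doesn't specify command arguments.

```shell
# Stand-in for 'backup-mysql.sh' (hypothetical interface):
# a server host name, then optionally the names of specific databases.
backup_mysql() {
  host="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "backing up all databases on $host"
  else
    echo "backing up $* on $host"
  fi
}

# Stand-in for the narrower 'backup-mysql-database.sh':
# it insists on exactly one database and defers everything else.
backup_mysql_database() {
  if [ "$#" -ne 2 ]; then
    echo "usage: backup-mysql-database.sh HOST DATABASE" >&2
    return 2
  fi
  backup_mysql "$1" "$2"
}

backup_mysql_database unity wiki
```

The narrower command adds no behaviour of its own; it only pins down the arguments before handing over to the container-level command.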

The idea with noun stacking is that nouns go from the outermost to the innermost when there is a container hierarchy. If there is no containership these guidelines don't presently indicate how to handle that. If you have a use case let me know.
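Because the composition is mechanical, it can be written down as a tiny function. This is a sketch only; 'compose_command_name' and its argument order (suffix, verb, then nouns from outermost to innermost) are made up for illustration.

```shell
# compose_command_name SUFFIX VERB [NOUN...]
# Hypothetical helper: joins the verb and the nouns with dashes, verb first,
# then appends the suffix (pass '' for commands that have no suffix).
# The body runs in a subshell so its variables do not leak into the caller.
compose_command_name() (
  suffix="$1"
  name="$2"            # the verb always comes first
  shift 2
  for noun in "$@"; do # nouns are listed outermost container first
    name="$name-$noun"
  done
  if [ -n "$suffix" ]; then
    name="$name.$suffix"
  fi
  printf '%s\n' "$name"
)

compose_command_name sh backup mysql database
```

Called as above it prints 'backup-mysql-database.sh'.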

Rationale

Why lower case ASCII etc?

So we use lower case because capitals require the use of the Shift key (or CAPS LOCK if you even have one) which is a hassle to type.

We use ASCII alphanumeric because those keys are generally fairly easy to type and sufficiently expressive.

We don't start or end with a 'dot' (.) because hidden files start with dots and sentences end with dots and we don't need that sort of potential confusion in our lives.

We don't start with a 'dash' (-) because we like for a dash to indicate a command option/argument/switch, not a command itself. And we don't end with a dash because that just seems untidy, a dash is typically used to separate parts and there is no value in having a null/empty part.

Why include a suffix?

Basically so you can reimplement and phase out gradually.

Also it's nice to know a little bit more about what you're dealing with when you use it.

And when there's only one implementation the suffix is just one auto-complete TAB away anyway.

And if you do reimplement using a different technology and you have interface parity, you can simply alias the old command name and *lie* about the implementation. Your users might never even know.

And if commands which are files include an appropriate file extension you can open them automatically in an appropriate editor based on file type associations in your window manager of choice.

Oh, and because John likes it that way and this is John's doco so "whatever, I'll do what I want!"

Why QWERTY?

Because that's a very common keyboard layout...

Implementation

As we said scripts should end with a suffix but executables/functions/aliases should not. But this is not a hard rule. Do what you think is most appropriate. If you don't want to think then just lean on the suggestion.

File names

For ELF binaries and scripts the "command name" will be the "file name", and the files may or may not be in the user's $PATH.

For BASH functions and BASH aliases the "command name" doesn't have anything to do with files, except that they must individually be defined in some file which has been 'sourced' by a BASH environment.

Shebangs

Commands which are scripts should start with a shebang on the first line and be chmod'ed +x. Use '#!/bin/bash' for BASH and '#!/usr/bin/env php' for PHP. This lets the file be executed directly, which makes the file name the command name, which is desirable.
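As a small illustration, the sketch below writes a throwaway script into a temporary directory, makes it executable, and runs it by file name. The script name 'print-greeting.sh' and the temporary directory are placeholders invented for the example.

```shell
# Stand-in for a project's 'bin' directory (hypothetical location).
bin_dir="$(mktemp -d)"

# Write a tiny BASH script; the quoted 'EOF' stops this shell from expanding ${0##*/}.
cat > "$bin_dir/print-greeting.sh" <<'EOF'
#!/bin/bash
# ${0##*/} strips the directory part, leaving the name the command was invoked by.
echo "hello from ${0##*/}"
EOF

# Make it executable so the shebang decides which interpreter runs it.
chmod +x "$bin_dir/print-greeting.sh"

# Invoke it by file name; no 'bash' prefix is needed.
greeting="$("$bin_dir/print-greeting.sh")"
echo "$greeting"
```

Without the shebang and the execute bit you would have to type 'bash print-greeting.sh' instead, and the command name would no longer be just the file name.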

Legacy names

There are already many, many command names out there in the wild. Can't do much about that. The conventions here are all about what we might do henceforth; they have little to contribute to what has already been done...

Examples

Database backups

So we have a database server called 'unity' which has a bunch of databases with various names.

The script which can back up any MySQL/MariaDB database server (indicated with a command argument) is called 'backup-mysql.sh'.

The name of the script which backs up 'unity' is 'backup-unity.sh'.

The script which can restore a database backup to any MySQL/MariaDB database server is called 'restore-mysql-database.sh'. Notice that nouns stack "containerwise" (i.e. the "MySQL/MariaDB" server contains the "database", so the container is listed first in the noun list).

The script which can restore a specific database backup to 'unity' is called 'restore-unity.sh' (because it only operates on 'unity', there is no option for server name). This script could have been called 'restore-unity-database.sh' because the database is an argument; but as the only things which will ever be restored to 'unity' are "databases" this part can be elided, or not, at your option.