CS:2820 Notes, Lecture 31

When you have a project that is broken up into multiple source files, distributing the code for that project to others becomes a problem. Consider the road-network simulator, which, after breaking it into multiple files, consists of the following:

Error.java         MyScanner.java  RoadNetwork.java  Source.java
Intersection.java  NoStop.java     Simulator.java    StopLight.java
MyRandom.java      Road.java       Sink.java

RoadFiles

testfile

If you just give someone a directory containing all this, you are likely to do little more than baffle them. Any project this big should be distributed with a roadmap. This need led the Unix community, back in the 1970s, to adopt the convention of including, in every source directory, a file called README that tells the newcomer what they have. The name is all upper case to make it stand out from other file names.

Any component of a README that gets big may be moved into its own file, so some project directories contain Copyright or License as separate files. Here is a minimal README appropriate for our broken-up road-network simulator:

Road Network Simulator
Version: Nov 3, 2020
Author: Douglas Jones

For the list of source files see:  RoadFiles

To build the simulator use the shell command: javac @RoadFiles

To test the build, use the shell command: java RoadNetwork testfile

An example input file for the simulator is: testfile
    BUG:  we need to document the input file format somewhere

To generate HTML documentation, run the shell command: javadoc @RoadFiles

With the above, the recipient of the source for the project has a place to start. They can build the project, run it, and have at least a clue about where to begin.
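The README relies on javac and javadoc accepting @RoadFiles, an argument file that names the source files. As a minimal sketch, assuming RoadFiles is nothing more than a list of the .java file names, one per line, such a file could be generated as shown here. The scratch directory and the empty stand-in files are assumptions made so the example runs anywhere; they are not the real simulator sources.

```shell
# Sketch: build an argument file like RoadFiles in a scratch directory.
# The empty .java files are stand-ins, not the real simulator sources.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch Road.java RoadNetwork.java Simulator.java

# List every Java source file, one name per line.
ls *.java > RoadFiles
cat RoadFiles

# With real sources present, "javac @RoadFiles" would compile each file listed.
```

Keeping the list in one file means the build command and the documentation command cannot drift apart, since both read the same list.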

Note: The GNU Coding Standards call for a README file in every distribution, and GitHub displays a repository's README on the repository's front page. Some modern distribution systems encourage use of a formal structure for README files, but there is no universal format. GitHub, for example, renders README files written in its own dialect of Markdown.

Aside: Shell Archives

Once you create a directory ready to distribute, there are numerous ways you can actually distribute it. You can copy the directory to removable media. Today, USB drives work well for this. In the past, people used floppy disks, and before that, various magnetic tape formats. Some have referred to this as software distribution by sneakernet, because you are distributing the software in hand-carried network packets.

The Internet provides a variety of software distribution tools. Git and GitHub are a widely used example. Some of these are commercial cloud computing services, some are free software.

One old and durable tool from the early days of Unix is called a shell archive. The shell command shar is still around on many Unix, Linux and macOS systems, and it is not difficult to write a shell script that does what the shar command does. If you type this command in our project directory, you can create a shell archive of the project:

shar README RoadFiles *.java testfile > RoadNetwork.shar

Different versions of shar produce different shell archive formats, but as a rule, a .shar file is a shell script which, when you run it, recreates the archived files in the current directory. Most .shar files begin with comments explaining that they are shell archives, and as a rule, it is possible to unbundle a .shar file on a Unix, Linux or macOS system by simply using it as shell input. For example, the road network source files could be extracted from the archive created above with this shell command:

sh < RoadNetwork.shar

Note that simple versions of shar can only archive text files such as source code. They do not work on binary files, although some versions can encode binary files as text first. Shell archive file formats are simple enough that it is always possible to extract files from a shell archive with a text editor.
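To make concrete the claim that a shar-like tool is easy to write, here is a minimal sketch. The function name minishar, the delimiter MINISHAR_EOF and the file name sample.txt are all made up for this illustration, and this is not how any real shar is implemented. It assumes text files that end with a newline, that contain no line consisting only of MINISHAR_EOF, and whose names contain no single quotes.

```shell
# minishar: a minimal sketch of a shar-like archiver (not the real shar).
# Usage: minishar file ... > archive.shar
minishar() {
    echo "# This is a shell archive.  Extract with: sh < archive"
    for f in "$@"
    do
        # Quoting the delimiter stops the shell from expanding the file text.
        echo "cat > '$f' <<'MINISHAR_EOF'"
        cat "$f"
        echo "MINISHAR_EOF"
    done
}

# Demonstration in a scratch directory: archive a file, then extract it.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf '// a little one-line test file\n' > sample.txt
minishar sample.txt > sample.shar

mkdir out && cd out && sh < ../sample.shar
cmp sample.txt ../sample.txt && echo "round trip ok"
```

The archive is extracted in a separate subdirectory so that the extracted copy can be compared with the original instead of overwriting it.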

Shell archives pose one significant danger. They are just shell scripts, and a shell script is a piece of active software that can do malicious things. In the case of shell archives, the defense is simple: If someone you don't trust hands you a shell archive, don't just extract it; check it first. See that the only thing it contains is shell commands to create the source files in the archive, plus comments. Typically, each source file is created by a cat command, although some versions of shar use sed to strip a one-character prefix off of each source line. Here is how a simple version of shar packages a brief source file:

cat > test <<\xxxxxxxxxx
// a little one-line test file
xxxxxxxxxx

A more sophisticated version of shar packages the same file like this:

echo "x - extracting test (text)"
sed 's/^X//' << 'SHAR_EOF' > 'test' &&
// a little one-line test file
SHAR_EOF

This version of shar puts an X prefix on every line of the original source file that begins with either X or SHAR_EOF. This prevents the archive from prematurely terminating the recovery of a file that happens to contain the text SHAR_EOF on a line by itself.
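Returning to the advice about checking an archive before running it: in a large archive, the commands are hard to spot among the file contents. The following sketch prints only the lines that the shell would treat as commands or comments, skipping the bodies of here-documents. The function name shar_commands and the archive demo.shar are hypothetical, and the script handles only simple here-document forms, so treat it as an aid to reading and not as a guarantee that an archive is safe.

```shell
# shar_commands: print the lines of an archive that are not file contents.
shar_commands() {
    awk -v q="'" '
        delim != "" {                  # inside a here-document body
            if ($0 == delim) delim = ""
            next
        }
        { print }                      # a comment or command line
        /<</ {                         # this line opens a here-document
            d = $0
            sub(/.*<<-?[ \t]*/, "", d) # keep what follows the << operator
            sub(/[ \t].*/, "", d)      # keep only the delimiter word
            gsub(/\\/, "", d)          # strip backslash quoting
            gsub(q, "", d)             # strip single quotes
            gsub(/"/, "", d)           # strip double quotes
            delim = d
        }
    ' "$1"
}

# Demonstration on a small stand-in archive in a scratch directory.
demo=$(mktemp -d)
cd "$demo" || exit 1
cat > demo.shar <<'END_OF_ARCHIVE'
# This is a shell archive.
cat > hello.txt <<\xxxxxxxxxx
rm -rf is only text here, inside a file body
xxxxxxxxxx
echo "this line is a command and deserves a look"
END_OF_ARCHIVE

shar_commands demo.shar
```

For the stand-in archive, the output is the comment, the cat command and the echo command. Anything in such a listing other than comments and file-creating commands is a reason to read the archive more closely.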

Even more sophisticated versions of shar do things like checking the checksum (or more sophisticated hash) of the file to make sure nothing was lost in transmission, and they include shell conditionals to prevent deletion or replacement of existing files and to set the access rights of the restored files to the access rights of the files that were archived.
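As a rough sketch of what such protections can look like, here is one hand-written archive entry. The file name sample.txt, the wording of the messages and the mode 0644 are assumptions made for illustration, and a byte count stands in for a real checksum. The output of any particular version of shar will differ in detail.

```shell
# Work in a scratch directory so no existing file is involved.
cd "$(mktemp -d)" || exit 1

if test -f 'sample.txt'
then
    # Never overwrite a file that is already present.
    echo "x - SKIPPING sample.txt (file already exists)"
else
    echo "x - extracting sample.txt (text)"
    # The second body line began with X, so the archiver prefixed another X.
    sed 's/^X//' << 'SHAR_EOF' > 'sample.txt' &&
// a little one-line test file
XX marks the spot
SHAR_EOF
    chmod 0644 'sample.txt' ||
    echo "restore of sample.txt failed"

    # Unquoted on purpose: some versions of wc pad their output with spaces.
    if test $(wc -c < 'sample.txt') -ne 48
    then
        echo "sample.txt: size is not 48 bytes, it may be damaged"
    fi
fi
```

After extraction, the second line of sample.txt again begins with a single X, because sed removed the prefix that the archiver added.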
