http://homepage.cs.uiowa.edu/~dwjones/voting/
Prepared for the
Workshop on the Future of Voting Technology in a Networked Environment
Georgia Tech Research Institute
Atlanta, June 4-5, 2002.
by
Douglas W. Jones
Associate Professor of
Computer Science,
University of Iowa
Chair, Iowa Board of Examiners for Voting Machines
and Electronic Voting Systems
Abstract: Can we construct a trustworthy remote-site Internet voting system? More generally, can we construct trustworthy applications that run on untrusted systems? The answer to this question may be yes. Specifically, if the untrusted system is constructed to earn the trust of its user community, it must behave in a trustworthy manner most of the time, rarely violating the user's trust and doing so covertly only when specific target applications are detected. A trustworthy application designed to run on such a system must avoid detection by the system. The designers of viruses and E-mail spam delivery systems have explored a range of tools crafted to avoid identification, and we can use similar tools to design unrecognizable trusted applications and therefore defend ourselves against attack by untrusted host systems.
Can we construct a trustworthy remote-site Internet voting system? I suspect that the answer is a qualified yes, but I believe that we will only be able to do so if the voting application itself applies every lesson learned by the authors of viruses and the distributors of junk E-mail! The same answer may apply generally to the problem of running trusted applications on untrusted platforms.
To see why this is so, we must first make what appear to be some very paranoid assumptions. The usual assumption about remote-site Internet voting is that the voter will download some kind of voting application, perhaps a Java applet, to run on the voter's own computer. This implies that the voting application will run in an unknown operating environment, and if this is so, we must make the worst possible assumptions about this environment.
Let us assume that the operating software for all personal computers is provided by a single monopoly supplier, and let us assume that this supplier is actively intent on manipulating the outcome of elections. This is fiction, but it is not unreasonable! If remote-site Internet voting becomes widespread and if the underlying security mechanisms are based on less pessimistic assumptions, we will have created an incredible plum, ripe for the picking by any crook who can gain control of the right part of the monopoly's software development group.
Therefore, software for remote-site Internet voting should be designed from the start with the assumption that every component of the host system is an adversary, as suggested in Figure 1.
Of course, the monopoly software supplier would lose the monopoly rapidly if the software obviously violated the users' trust. Therefore, we can trust the software in question most of the time; what we must defend against are subtle attacks, attacks targeting only one application and covering their tracks as they do so.
For an Internet voting application, we should therefore expect attacks from the system components as described below:
- The window manager
- Several years ago, I proposed a simple attack directed against a voting application by the window manager [1]. In summary, the proposed evil window manager might contain code that interchanges the strings democratic and green when these appear as labels on the buttons of a "radio button widget", but only in the context of a window containing the string straight party, only on the first Tuesday after the first Monday of November in an even-numbered year, and then only 10% of the time. The effect would be to throw 10% of the straight-party vote for one of the major parties to the minor party most likely to attract the support of voters from that major party, thereby throwing many close elections to the other major party. If the voter pushes the straight-party button labeled democratic and is then allowed to immediately cast their ballot, without examining the effect of this straight-party vote on the various races, the voter will never notice anything irregular!
More generally, we should assume that the window manager monitors the material being displayed on the screen and the input from the user, attempting to detect that a voting application is being run. If the window manager is able to detect this application, we should assume that it will attempt to surreptitiously alter the voter actions seen by the application. The voter must not be led to suspect anything abnormal, so these alterations must be subtle; the proposed window-manager attack demonstrates that the logic required for such a subtle attack is not difficult to implement, as the sketch following this list illustrates!
- The Network Interface
- Designers of secure communication protocols have long designed their software under the assumption that there may be interference along the data path from the sending machine to the receiving machine. Thus, today, we speak routinely of cryptographic tunneling being used to create a secure communications channel on an insecure network. It is natural, therefore, that almost all proposals for Internet voting use cryptography to prevent a third party from examining or altering any information involved in the voting transaction.
If we assume an untrusted operating environment, though, we must go one step further. We must not trust any cryptographic tools provided by the operating environment! Thus, for example, if our untrustworthy monopoly supplier of operating systems provides standard tools for establishing a secure channel to a remote server, we must not rely on these tools! Indeed, we must assume that these tools actively snoop in the data being encrypted, seeking data relating to the voting application and carefully corrupting that data in order to manipulate the outcome of the election.
- The Programming Language Interpreter
- Applications programmers usually assume that their code will be faithfully executed by the interpreter on which it runs. Here, I use the word interpreter in the broadest sense, to include any and all mechanisms used to execute the object code of the voting application. Conventionally, the hardware of the CPU makes up the larger part of the interpreter, but this is typically extended by software, in the form of the operating system's trap handlers, which interpret those instructions (supervisor calls) that are not part of the hardware instruction set. For some languages, the interpreter may be entirely software; for example, Java is typically compiled to J-code, which is executed by a software implementation of the J machine.
If we assume an untrusted interpreter, we must assume that the interpreter actively attempts to identify the voting application. It may do so by examining the code of the application, by examining the contents of memory, and by examining the execution trace. If the target application is not seen, the interpreter must behave exactly as advertised in order to retain the trust of the larger user community, but when the target is seen, we should assume that the interpreter includes code to covertly substitute its own improper behavior for the behavior that is advertised.
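To make the first of these threats concrete, here is a minimal sketch, in Java, of the trigger logic for the window-manager attack described above [1]. Everything here is illustrative: no real window manager exposes such an interface, and the class and method names are invented for this example.

    import java.time.DayOfWeek;
    import java.time.LocalDate;
    import java.util.Random;

    // Hypothetical trigger logic for the evil window manager's
    // label-swap attack; all names are invented for illustration.
    class LabelSwapAttack {
        private static final Random rng = new Random();

        // True only on the first Tuesday after the first Monday of
        // November in an even-numbered year: U.S. general election day.
        static boolean isElectionDay(LocalDate d) {
            return d.getYear() % 2 == 0
                && d.getMonthValue() == 11
                && d.getDayOfWeek() == DayOfWeek.TUESDAY
                && d.getDayOfMonth() >= 2 && d.getDayOfMonth() <= 8;
        }

        // Hypothetically called whenever a radio-button label is drawn.
        static String labelFor(String requested, String windowText) {
            if (isElectionDay(LocalDate.now())
                    && windowText.contains("straight party")
                    && rng.nextInt(10) == 0) {          // only 10% of the time
                if (requested.equals("democratic")) return "green";
                if (requested.equals("green")) return "democratic";
            }
            return requested;                // behave honestly otherwise
        }
    }

The point is how little code the trigger requires; the whole attack hides comfortably inside a single widget routine.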
These pessimistic assumptions apply not only to voting applications, but also to any applications developed by organizations that are seen by the monopoly vendor as posing a serious threat to their monopoly. In addition, Internet-based applications where there is a significant potential payoff to someone who can subvert them are also vulnerable. For example, consider the potential payoff to an employee or a small group of employees of the monopoly vendor if they could add a credit-card number sniffer to the system. As a result, the techniques proposed here may have far broader applicability than just Internet voting.
The developer of the voting application operates at a great disadvantage: generally, the voting application must go through the long review process specified by the FEC/NASED Voting Systems Standards [2]. In contrast, many operating system vendors have implemented instant update mechanisms for their Internet-based operating systems, so that systems belonging to registered owners will automatically install the newest system updates soon after they become available.
As a result, it is extremely likely that each version of the voting application will remain in use for many months and perhaps years. During this time, a potential attacker has plenty of time to study the voting application, reverse engineer it, and develop software that will carry out an attack. The automatic update mechanisms of the operating system allow the voting-system attack version of the operating system to be auto-installed on computers in the week prior to the election, and it is entirely possible that the attack software could delete itself from the voters' machines immediately after election day!
We must defend our voting application against each of the threats outlined above! An absolute defense may not be possible, but for cryptographic defenses, we are used to accepting the idea that a defense is acceptably strong if we can control the expected cost of an attack. If we can gain similar control over the cost of an attack from an untrusted system, we will deem this equally sufficient.
For example, if we can show that malicious software must solve NP-hard problems or solve the halting problem in order to carry out an attack, we will deem the voting application to be sufficiently secure. In these examples, while exact attacks are computationally infeasible, there may be approximate methods that the attacker finds useful; even so, we can control the difficulty of the approximate solution, and we can work to make sure that errors in the approximate solutions are very likely to be generally visible.
We may have to accept even weaker defenses! If we can invent defenses that force an attacker to adopt methods that require the cooperation of a large number of co-conspirators, we may deem this to be adequate. This is because the more people that must be involved in a criminal conspiracy, the more likely it is that a participant will defect and reveal the nature of the attack.
The first line of defense is to prevent the attacker from recognizing the application! Therefore, we must deny the attacker easy access to the content of the material displayed on the screen. Of course, the window manager has complete access to every pixel displayed on the screen, but there is a huge difference between giving the window manager access to the pixels and giving it access to the content expressed by those pixels, as illustrated in Figure 2.
This is easily illustrated by the window manager attack cited previously [1]. This attack rested on the fact that an application displaying certain text on election day was extremely likely to be a voting application. It is easy for the window manager to scan the text of character strings that are passed to it as explicit text, but far harder for it to scan the pixels on the screen to determine if they make up letters, and then to scan those letters to see if they contain a particular message.
Therefore, our voting application should not directly display text on the screen; rather, it should construct text on the screen from simple graphical elements. Furthermore, in order to complicate the job of any software in the window manager that seeks to identify the voting application by detecting predictable patterns on the screen, the voting application should randomize its display. We refer to this randomization as obfuscation. David Jefferson has referred to this technique as pseudo-encryption since "the data looks encrypted with respect to pixel- and simple OCR-level operations, but the eye nonetheless easily decodes the message" [3].
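As a first gesture in this direction, consider the following sketch, which assumes Java's standard AWT imaging classes: rather than passing a character string to the window system, the application renders the text into a private off-screen pixel buffer and hands over only pixels. (A hardened application would go further and build the glyphs from graphic primitives, since here the font engine still sees the string.)

    import java.awt.Color;
    import java.awt.Font;
    import java.awt.Graphics2D;
    import java.awt.image.BufferedImage;

    // Sketch: draw text off screen so the window manager receives only
    // pixels, never an explicit character string.
    class PixelText {
        static BufferedImage render(String text, Font font, Color fg, Color bg) {
            BufferedImage img =
                new BufferedImage(400, 40, BufferedImage.TYPE_INT_RGB);
            Graphics2D g = img.createGraphics();
            g.setColor(bg);
            g.fillRect(0, 0, img.getWidth(), img.getHeight());
            g.setColor(fg);
            g.setFont(font);
            g.drawString(text, 5, 28);   // rendered into our own buffer
            g.dispose();
            return img;                  // the caller blits raw pixels only
        }
    }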
As an example of the many elements we could randomize, consider randomly selecting the colors we use. Instead of using an exactly white background, for example, we could select any of several pale shades, and instead of using black text, we could use any of several dark shades. To a voter, all would appear close enough to black and white that it would make no difference. Similarly, a random element could be added to the position of each item on the screen, and any of several equally readable fonts could be used for text. Yahoo already does this in an attempt to foil robotic harvesting of Yahoo.com user IDs [4].
We could randomly present vote choices using elliptical and rectangular voting targets; we could randomly indicate voter selections by using a variety of X-marks and check-marks, and we could vary the sizes of these marks over a significant range. Similarly, the wording of auxiliary text could be varied by changing the order of phrases and substituting any of several synonyms. What matters is that, to a human viewer, the content of the screen has a fixed meaning.
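A minimal sketch of such randomization follows; the specific shade ranges, font list, and jitter bounds are invented for illustration, but each call produces a display that a human reads as identical to any other:

    import java.awt.Color;
    import java.awt.Font;
    import java.util.Random;

    // Sketch of display obfuscation: "white", "black", fonts, and
    // positions all vary randomly within ranges a voter will not notice.
    class DisplayObfuscator {
        private final Random rng = new Random();

        Color background() {             // any very pale shade
            return new Color(250 + rng.nextInt(6),
                             250 + rng.nextInt(6),
                             250 + rng.nextInt(6));
        }

        Color foreground() {             // any very dark shade
            return new Color(rng.nextInt(16),
                             rng.nextInt(16),
                             rng.nextInt(16));
        }

        Font textFont() {                // any of several readable fonts
            String[] faces = { "Serif", "SansSerif", "Dialog" };
            return new Font(faces[rng.nextInt(faces.length)],
                            Font.PLAIN, 13 + rng.nextInt(3));
        }

        int jitter(int coordinate) {     // nudge positions +/- 3 pixels
            return coordinate + rng.nextInt(7) - 3;
        }
    }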
The above randomizations are similar in their effect to ballot rotation. Ballot rotation involves changing the order of presentation of the candidates on the ballot. Traditionally, this is done in order to prevent one candidate from having an unfair advantage by virtue of being listed first, but rotation also complicates the job that must be done by pattern recognition software that is attempting to detect the fact that a ballot is being presented on the screen.
If all of the obfuscation techniques outlined above are used, it is still, in principle, possible to write software that recognizes the presence of a ballot on the screen, but instead of simply searching for character strings such as republican, the attacker would now have to perform complex pattern analysis on the ballot, first doing pattern recognition to recognize the letters and graphic elements associated with ballot design, and then assembling letters into character strings for content analysis.
A system containing the code needed to perform this attack will be slow! The system must apply this analysis to any screen display that might prove to be a ballot, and as a result, many applications will be significantly slowed by the presence of the software needed to carry out this attack.
In addition, the code required to carry out such an attack will be sufficiently large that its presence within the window manager should be difficult to hide. Instead of being an attack that one programmer can carry out, it now becomes an attack that will be visible to large numbers of people within the vendor's programming staff. As such, the conspirators face the complex problem of hiding their work behind layers of cover stories that explain why they had to include these complex mechanisms in the system, and this kind of cover-up is dangerous.
All data delivered to the network must be encrypted prior to being handed to any system software. This encryption must, therefore, be done by the voting application itself! It is not unusual for applications to avoid the use of the underlying system's encryption software, nor is it unusual for applications to use multilayer encryption, performing some kind of encryption locally prior to passing the data to the system for re-encryption and transmission.
It is not sufficient to encrypt! We must also guarantee that the encrypted data does not contain visible signatures indicating the type of the data being transmitted. If we transmit a standard prefix on the data stream that says "I am a ballot", this allows the network interface to detect the voting application and report this fact to other system components that might be trying to corrupt the results. Therefore, standard prefixes should themselves hide under a layer of encryption, and the key used for this encryption must vary from one transaction to the next so that the encrypted prefix does not always appear in the same form.
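The following sketch shows one way to do this, using Java's standard javax.crypto classes. The ballot and its identifying prefix are sealed under a freshly generated key and initialization vector for each transaction, so the bytes handed to the system carry no fixed signature; how the one-transaction key reaches the election server (for example, wrapped under the server's public key) is deliberately omitted here.

    import java.security.SecureRandom;
    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.IvParameterSpec;

    // Sketch: encrypt prefix + ballot under a one-transaction key so
    // the ciphertext shows no recognizable "I am a ballot" signature.
    class BallotEnvelope {
        static byte[] seal(byte[] prefix, byte[] ballot) throws Exception {
            byte[] plain = new byte[prefix.length + ballot.length];
            System.arraycopy(prefix, 0, plain, 0, prefix.length);
            System.arraycopy(ballot, 0, plain, prefix.length, ballot.length);

            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(128);
            SecretKey key = kg.generateKey();   // fresh key per transaction

            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);   // fresh IV per transaction

            Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
            c.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] ct = c.doFinal(plain);

            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return out;            // indistinguishable from random bytes
        }
    }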
We must also hide the network addresses involved! If we are trying not to reveal to the system that a vote is being cast, we cannot simply open a connection to http://polling.place.gov to download the voting software, nor can the voted ballot be deposited at http://ballot.box.gov. We must assume that our adversary has a list of the fixed Internet addresses of each election jurisdiction in the country! Therefore, on election day, there must be no discernible link between the network addresses used by the voter and the fixed addresses of the election jurisdictions; instead, we must use transiently valid intermediaries.
The use of transient intermediaries in Internet addressing is a well established practice in the E-mail spam industry, and it has been proven to be quite effective. Spammers will open temporary accounts using a free E-mail service such as hotmail.com; these accounts are used to send the spam through a bulk remailer, and after use, the spammer expects the account to be shut down within hours of sending their spam. We can do exactly the same thing using E-mail to deposit votes, where the E-mail address is generated immediately prior to opening the on-line polls, and we can do something quite similar with transient domain names using one of several services that will set up temporary Internet domains [5].
Figure 3 summarizes the network transactions that we will use to hide the identity of the voting application from adversarial network software on the voter's computer. These are described in more detail below, but it should be noted that this sketch focuses on hiding from the voter's computer the fact that a vote is taking place; as such, the full cryptographic protocol used to protect the integrity of the ballot is only hinted at.
If the interpreter for the voting application recognizes the application, either as it is loaded or while it is running, the interpreter could modify the application, monitor its use, or extract data such as votes and encryption keys. Therefore, we must prevent the application from being recognized.
The art of writing applications that evade software designed to recognize them is extremely well developed! Virus writers do this, attempting to create viruses that evade detection by anti-virus software. Andrew Tanenbaum's discussion of the ever-escalating war between virus writers and anti-virus software developers is a good introduction to this [6].
If we wish to prevent the system from recognizing the voting application, our first line of defense is to make sure that no two versions of the application run the same code. Polymorphic or self-mutating viruses use this defense, but the job we face is far simpler. Instead of constructing a voting application that self-mutates, we can rely on the server that distributes the voting application to generate a new variant for each voter, perhaps by randomly reordering permutable elements of the source code and then recompiling it each time it is sent to the voter.
As an experiment in this direction, I have created a source-code obfuscation engine for languages with syntax based on C (including C++ and Java). This takes, as input, annotated source code for the language and produces, as output, a randomly selected equivalent program. The annotations indicate sections of the source code that may be reordered, and they indicate alternative constructs for operations that may be performed in any of several ways, as illustrated in Figure 4.
Figure 4. Annotated input to the obfuscation engine and two of its randomly generated outputs:

Annotated Input:

    int f(int x)
    { <$ int a = <$ x $+$ 1 $>; $$ int b = <$ x - 1 $?$ -1 + x $>; $> return ($ x $*$ y $); }

Output A:

    int f(int x)
    { int b = -1 + x; int a = x + 1; return (x * y); }

Output B:

    int f(int x)
    { int a = 1 + x; int b = x - 1; return (y * x); }
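The core of such an engine can be sketched briefly. The simplified Java sketch below handles only the reorderable-group annotation (<$ ... $$ ... $>) from Figure 4, ignoring the commutable-operator ($+$) and alternative-construct ($?$) annotations; the real engine is a stand-alone ANSI C program.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;

    // Simplified sketch of the obfuscation engine: statements bracketed
    // by <$ ... $> and separated by $$ may be emitted in any order.
    class Obfuscator {
        private final String src;
        private final Random rng;
        private int pos = 0;

        Obfuscator(String src, long seed) {
            this.src = src;
            this.rng = new Random(seed);
        }

        String run() { return parse(false); }

        // Parse to end of input (or to the $> closing the current group),
        // shuffling the $$-separated segments of each <$ ... $> group.
        private String parse(boolean inGroup) {
            List<String> segments = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            while (pos < src.length()) {
                if (src.startsWith("<$", pos)) {
                    pos += 2;
                    current.append(parse(true));      // recurse into nested group
                } else if (inGroup && src.startsWith("$$", pos)) {
                    pos += 2;
                    segments.add(current.toString()); // segment boundary
                    current = new StringBuilder();
                } else if (inGroup && src.startsWith("$>", pos)) {
                    pos += 2;
                    break;                            // end of this group
                } else {
                    current.append(src.charAt(pos++));
                }
            }
            segments.add(current.toString());
            if (inGroup) Collections.shuffle(segments, rng);  // random order
            return String.join("", segments);
        }
    }

Running the engine with a different seed for each download yields a different but equivalent ordering every time.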
The experimental obfuscation engine is written in annotated ANSI C so that it may be applied to itself. A diligent attempt has been made to annotate all parts of the program that may be reordered, from the top-level structure of the source file down to the level of terms in expressions. As a result, there are potentially well over 10^60 distinct versions of the program, although the use of a 32-bit random-number seed in the current version of the program prevents more than 2^32 distinct versions from being generated.
Obfuscating the source code in this way prior to compilation raises one important question: do optimizing compilers significantly undo the obfuscation? It is quite likely that local optimization will produce identical code for a+1 and 1+a, but most compilers do not appear to reorder declarations, so our obfuscation still makes a mess of data structures and the overall layout of code in memory. In addition, obfuscated declarations of enumerated types that have no ordering constraints randomly shuffle the integer constants assigned to members of those types. In sum, it appears that this methodology should make it extremely hard to show that two different versions of any particular application are indeed versions of the same application.
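For instance, when an enumerated type has no ordering constraint, two generated versions (shown here with invented names) encode the same constants differently, so any compiled code that stores or compares those constants differs as well:

    // Version A: OVAL = 0, BOX = 1, ARROW = 2
    enum TargetShape { OVAL, BOX, ARROW }

    // Version B: the same type as reshuffled by the obfuscation engine,
    // so that BOX = 0, ARROW = 1, OVAL = 2:
    // enum TargetShape { BOX, ARROW, OVAL }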
When this approach is applied to a language like Java, one significant problem emerges. Java requires that compilers embed a considerable amount of information about the source of a program into the object code. We need to suppress as much of this as is possible! For example, the obfuscation engine should substitute meaningless and randomly selected identifiers for the textual names which Java will carry into the object code.
The equivalence problem for arbitrary programs is as hard as the halting problem, but there has been considerable recent interest in simple-to-compute measures of program similarity. It appears that the obfuscation engine, if used with care, should be able to defeat Baker and Manber's Java bytecode similarity tools [7], but no experiments with this have yet been conducted.
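For a sense of what must be defeated, the sketch below computes a generic similarity score: Jaccard similarity over byte n-grams of two object files. This is not Baker and Manber's algorithm [7], which is more sophisticated, but the goal is the same: the obfuscation engine succeeds when two of its variants score no higher against each other than unrelated programs do.

    import java.util.HashSet;
    import java.util.Set;

    // Generic sketch of a program-similarity measure: Jaccard
    // similarity over the sets of byte n-grams of two object files.
    class NGramSimilarity {
        static Set<String> grams(byte[] code, int n) {
            Set<String> s = new HashSet<>();
            for (int i = 0; i + n <= code.length; i++) {
                StringBuilder b = new StringBuilder();
                for (int j = 0; j < n; j++)
                    b.append((char) (code[i + j] & 0xff));
                s.add(b.toString());
            }
            return s;
        }

        // 1.0 means identical n-gram sets; near 0.0 means unrelated.
        static double jaccard(byte[] a, byte[] b, int n) {
            Set<String> sa = grams(a, n), sb = grams(b, n);
            Set<String> inter = new HashSet<>(sa);
            inter.retainAll(sb);
            Set<String> union = new HashSet<>(sa);
            union.addAll(sb);
            return union.isEmpty() ? 1.0
                                   : (double) inter.size() / union.size();
        }
    }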
It should be noted that the tools proposed here, if used in conjunction, effectively guarantee that different versions of the voting application will not, in general, be formally equivalent! Versions will differ in the order in which elements of the graphical display are rendered, they will differ in the precise dither patterns they use for backgrounds, and they will differ in the synonymous messages they output. Furthermore, one or more of the encryption keys for encrypting the ballot can be embedded in the code, so that the output sent to the remote site will only be recognizable as a ballot to software possessing the matching decryption key.
We have one final defense we can apply: Instead of delivering the voting application in the language understood by the untrusted interpreter, for example, Java bytecodes, we can deliver an obfuscated interpreter to the run-time system, along with object code in a form understood only by this version of the interpreter. An obfuscated interpreter could, for example, execute bytecodes from a one-time-use instruction set that was generated at the same time the interpreter was generated.
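A minimal sketch of such an interpreter follows, built around an invented four-instruction stack machine: the opcode numbering is freshly permuted from a seed when the interpreter is generated, and a matching code generator (not shown) would emit object code using the same permutation.

    import java.util.Random;

    // Sketch of a one-time-use instruction set: the opcode numbering of
    // a tiny stack machine is permuted when the interpreter is generated,
    // so object code for one download is gibberish to any other copy.
    class OneTimeVM {
        static final int OPS = 4;              // PUSH, ADD, MUL, HALT
        final int[] opcodeOf = new int[OPS];   // logical op -> wire opcode
        final int[] opOf = new int[OPS];       // wire opcode -> logical op

        OneTimeVM(long seed) {
            Random rng = new Random(seed);
            for (int i = 0; i < OPS; i++) opcodeOf[i] = i;
            for (int i = OPS - 1; i > 0; i--) {   // Fisher-Yates shuffle
                int j = rng.nextInt(i + 1);
                int t = opcodeOf[i]; opcodeOf[i] = opcodeOf[j]; opcodeOf[j] = t;
            }
            for (int i = 0; i < OPS; i++) opOf[opcodeOf[i]] = i;
        }

        // Execute code emitted for *this* interpreter's opcode mapping.
        int run(int[] code) {
            int[] stack = new int[64];
            int sp = 0, pc = 0;
            while (true) {
                switch (opOf[code[pc++]]) {
                    case 0: stack[sp++] = code[pc++]; break;        // PUSH imm
                    case 1: sp--; stack[sp - 1] += stack[sp]; break; // ADD
                    case 2: sp--; stack[sp - 1] *= stack[sp]; break; // MUL
                    default: return stack[sp - 1];                   // HALT
                }
            }
        }
    }

Object code emitted for one seed is meaningless to an interpreter generated from any other seed, so a generic interpreter-watcher cannot key on fixed opcode patterns.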
Will all this guarantee that our adversary interpreter will not be able to recognize the application? No. What we have done is make the recognition problem difficult enough that the software to solve it could not be easily hidden. Again, the conspiracy required for a successful attack is too large to be reliable, and we can hope that the tools required to carry out the attack will be sufficiently visible, and will threaten a sufficient number of other applications, that the marketplace in general will not accept their presence in the system.
If the voting application is the only application using this suite of defense tools, is it possible that our adversary will be able to recognize the application by the tools it uses to hide itself? This threat can be addressed in two ways: first, by checking, tool by tool, that the application's behavior is sufficiently like that of other programs to make recognition unlikely; and second, by recruiting other application areas that should be similarly defended, so that the voting application is only one of many applications that behave this way.
As already noted, many of these tools are already common. Spammers use the tools we have outlined to prevent E-mail filters from successfully blocking their junk E-mail. Yahoo uses images instead of text in order to foil robotic harvesting of user IDs. Obfuscated one-time downloads of code are not yet widely used, but they should be!
Consider the very common class of web applications that request credit-card numbers from users. It is now common for these to use a secure network connection to transmit the collected credit-card number to the server, but in almost all cases, the text "enter credit card number" and "expiration date" appears adjacent to the blank on the screen where this data is entered, and it would be very easy to include a credit-card harvester in both the web browser's forms mechanism and the window manager that is used to present the form.
We cannot expect the browser or window-manager vendors to prevent such an attack! A single dishonest but competent programmer could probably add the code necessary for one or the other of these attacks without its attracting any attention, assuming that the vendor's management style is typical of the software industry today. The potential profit margin to the crook from the harvest of even one day's worth of credit card numbers from all users of the dominant browser or window-manager should provide ample motivation for determined crooks to work their way into the jobs that give them access to these system components.
Therefore, merchants who request credit card numbers over the web should probably apply much or all of the tool set outlined above whenever their code requests such a number. The request should be presented using a difficult-to-OCR image and not text. The digits of the number should be collected by an obfuscated Java applet and not by the standard forms interface, and the number should not land in memory as a recognizable text string; rather, it should be at least partly encrypted at all times.
All users of similar defenses gain considerable security if, from the outside, their applications are difficult to distinguish. Thus, users should attempt to use common application frameworks, common general screen layouts and ranges of color schemes, and similar sizes of downloads and messages. The net result is that a crook attempting to subvert some component of the host system will end up facing a range of deliberately similar applications, some of which collect credit card numbers, some of which collect votes, and some of which use this general approach not because they need it, but because their authors see the benefit of adding to the population of similar applications.
The defense mechanisms outlined above have obvious costs, measured in terms of performance penalties and network overheads. What is less obvious is the cost to the developer and the added complexity faced by those involved in certifying the correctness of the voting application.
Not only must we have confidence in the underlying logic of the application, but we must have confidence that this has been correctly annotated in order to allow the obfuscation engine to generate an astronomical number of distinct versions from the base version of the system. We cannot possibly test all of these versions, so we must rely on careful development and a thorough source-code audit.
In addition, we must trust the obfuscation engine itself. We must assure ourselves that the engine generates correct output from a correctly annotated correct input, and we must assure ourselves that the engine does not somehow sign the output with some easy-to-recognize signature identifying the specific original that it has processed. In effect, we must also certify the correctness of the obfuscation engine. This tool is an ideal candidate for open-source distribution, with strict controls to assure that the version used in the voting application is the version that has been subject to long-term public scrutiny!
Software tools such as the obfuscation engine discussed here will not completely automate the defenses outlined here, so the programmers must learn a paranoid programming style, a style quite different from that required by most applications. Where the programmer's normal goal is to assure only that the source code is correct and easy to understand, now, in addition, the programmer must be constantly aware that the program's run-time behavior is to be as thoroughly obfuscated as possible.