This section is designed to provide a bit of background on some of the more technical aspects of writing a PAC File.
As mentioned elsewhere, PAC files are written in JavaScript. JavaScript is a formal, structured programming language with specific rules that must be followed when writing code. The goal of this section is to help you understand some of those rules and how they should be used.
This is an important document to read, but it's not going to teach you how to write JavaScript or even to develop perfect PAC files. This is a reference that you should review to understand some of the basics to be able to better interpret what you're seeing in some of the examples and other guides.
When I wrote this page I had a difficult time deciding what to call a "PAC File" component or a "JavaScript" component. Since PAC files are written in JavaScript, most of what's here isn't unique to PAC files and is generally usable in JavaScript. You might see me interchange the two terms occasionally.
A variable in JavaScript is simply a way to give a value a name so that it can be easily reused later and referenced by that name.
For example, PAC files have a function called "myIpAddress()" which returns the machine's local IP address. You can create a variable called "localIP" that contains the local workstation IP address by using the following command.
localIP = myIpAddress();
Now, each time you need to check the machine's IP address you can use the variable "localIP" instead of having to call the myIpAddress() function again. This is computationally cheaper and easier to understand.
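Here is a small sketch of that idea that you can run outside of a browser. The myIpAddress() function below is only a stand-in I made up for testing (the real one exists only inside the browser's PAC environment), and the address and the "lookups" counter are assumptions for illustration.

```javascript
// Stand-in for the PAC helper myIpAddress(). The address is made up, and
// the counter is only here to show how often the function gets called.
var lookups = 0;
function myIpAddress() {
  lookups = lookups + 1;
  return "192.168.1.25";
}

// Call the function once and keep the answer in a variable.
var localIP = myIpAddress();

// Reuse the variable as often as needed; no further lookups happen.
var firstCheck = (localIP == "192.168.1.25");
var secondCheck = (localIP != "10.0.0.1");
```

Both checks read the stored value, so the function itself is only called the one time.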
JavaScript, like many other languages, is built from a series of building blocks called "functions". A function executes a specific action or runs a specific test based on the values you send it.
The browser provides a number of built-in functions for use in PAC files. These functions help you perform tests like "Is this host resolvable in DNS?" "Is this host in this DNS domain?" "Is this IP address in this subnet?" "What is my IP address?" "What time is it?"
When you call a function you're going to need to give it some information. For example, "Is www.cnn.com in the cnn.com domain?" This is done by passing specific values or variables to the function and letting it process them.
Functions return one of two things - Either a specific value ("What is the local workstation IP address?") or a true/false answer ("Is www.cnn.com in the cnn.com domain?").
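To make the two kinds of answers concrete, here is a rough sketch. Both functions are simplified stand-ins that I wrote so the example runs outside a browser; they are not the browser's real implementations, and the IP address is made up.

```javascript
// Simplified stand-ins for two PAC helpers, for illustration only.
function myIpAddress() {
  return "192.168.1.25";
}
function dnsDomainIs(host, domain) {
  // True when the host name ends with the given domain.
  return host.length >= domain.length &&
         host.substring(host.length - domain.length) == domain;
}

// A function that returns a specific value.
var address = myIpAddress();

// A function that returns a true/false answer.
var inDomain = dnsDomainIs("www.cnn.com", ".cnn.com");
var notInDomain = dnsDomainIs("www.example.com", ".cnn.com");
```

The first call hands back a piece of text you can store or compare. The other two hand back true or false, which is exactly what a conditional needs.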
The Base References section, in the Reference section of this document, points to a number of basic documents from Netscape and Microsoft that list the basic functions and what they do. There are also more details on a few of those functions in the "Details on Selected Functions" section.
Comments in JavaScript are very simple - Anytime JavaScript sees two forward slashes (outside of a quoted string) it ignores the rest of the line. This can be used to add notes or comments to the code to improve readability. Comments can be used on their own line or at the end of an existing line. For example:
// PAC file Date 7/1/2009, Version 2.33
if (host == "intranet.company.com") { //Check for intranet.company.com
Every PAC file must follow the same general structure, which is pre-defined by the system. As mentioned elsewhere, a PAC file is written in JavaScript and can use many common JavaScript functions and operators.
A PAC file always begins with the same line:
function FindProxyForURL(url, host) {
To decode this, a JavaScript function called "FindProxyForURL" is being created. Two variables are being passed to the function from the browser, "url" and "host".
The "url" variable contains the full URL of the request - "http://www.yahoo.com/index.html". The "host" variable contains just the host name - "www.yahoo.com"
In general, you want to do almost all of your PAC file checks on the host and NOT on the URL. Internet Explorer has a "feature" called Automatic Proxy Results Cache, discussed in the "Lessons Learned" section of this site. This "feature" means IE will only process the PAC file ONCE per hostname per browser session. Because of this it is generally inadvisable to make proxy routing decisions based on a URL. There are some cases where this is appropriate, but be sure to think through the ramifications.
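As a rough sketch of a host-based decision, the function below looks only at "host" and never at "url", so every page on the same host gets the same answer. The second URL path is made up for illustration.

```javascript
function FindProxyForURL(url, host) {
  // Decide on the host name only, so the answer does not depend on
  // which page of the site is being requested.
  if (host == "www.yahoo.com") {
    return "PROXY proxy.company.com:8080";
  }
  return "DIRECT";
}

// Two different URLs on the same host get the same routing decision.
var first = FindProxyForURL("http://www.yahoo.com/index.html", "www.yahoo.com");
var second = FindProxyForURL("http://www.yahoo.com/news/today.html", "www.yahoo.com");
```

Because the result is the same for the whole host, it doesn't matter if the browser only asks once per hostname.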
The basic role of a PAC file is to tell the browser if it should retrieve the site directly or if it should use a proxy. If the return is via proxy, the PAC file specifies which proxy should be used.
The way you do this is by using a "return". This tells the browser to stop processing the PAC file and gives the browser instructions on what to do.
Instructing your browser to retrieve the site directly, bypassing the proxy, is simple. The return value is "DIRECT".
return "DIRECT";
Instructing the browser to use a proxy is also very simple. The return value is the word "PROXY" plus the proxy you want the browser to use. Either give the name or the IP address, a colon, then the port the proxy listens on.
return "PROXY proxy.company.com:8080";
If you use a name, the browser will accept a simple name "proxy" or a fully-qualified domain name (FQDN) "proxy.company.com". In general, it's better form to use the FQDN since it's a bit less work on the client PC and doesn't have any dependencies on the search path configuration of the PC.
You can also return more than one proxy server in the list, separated by a semi-colon. The browser will always start with the first one in the list and only move to the second one if it thinks the first has failed. This is, however, not foolproof - See the "Lessons Learned" section for more reading on this topic.
return "PROXY proxy1.company.com:8080; PROXY proxy2.company.com:8080";
I've seen a lot of questions around what you can return from a PAC file. The answer is a simple one - You can return DIRECT or one or more proxies. Nothing else. No alternate PAC files, etc.
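Putting the three kinds of return value side by side, a minimal sketch might look like this. The host names are just the example names used on this page, and which host gets which answer is an arbitrary choice for illustration.

```javascript
function FindProxyForURL(url, host) {
  // Retrieve this site directly, bypassing the proxy.
  if (host == "intranet.company.com") {
    return "DIRECT";
  }
  // Send this site through a single proxy.
  if (host == "webmail.company.com") {
    return "PROXY proxy.company.com:8080";
  }
  // Everything else: try proxy1 first, then proxy2 if proxy1 seems down.
  return "PROXY proxy1.company.com:8080; PROXY proxy2.company.com:8080";
}

var direct = FindProxyForURL("http://intranet.company.com/", "intranet.company.com");
var single = FindProxyForURL("http://webmail.company.com/", "webmail.company.com");
var list = FindProxyForURL("http://www.cnn.com/", "www.cnn.com");
```

Each "return" ends processing right there, so the last line is only reached when neither of the checks above it matched.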
This document isn't the place to get deep into the details of developing JavaScript, but you should understand how a number of special characters are used and why. For examples of each of these things, check the sample PAC files on this site.
A semicolon is used to end a statement or "action", for lack of a better term. Whenever you have a line that does something, it needs to end in a semicolon. Examples are below.
return "DIRECT";
proxy = "PROXY nyproxy.company.com:8080";
alert("I got here, something must be wrong below");
Semicolons should NOT be used after conditionals (IF statements).
There are two very different ways that equals signs can be used. CHECKING a value or SETTING a value.
The most common thing to do with an equals sign is to check if two things are the same. To check a value, you use two equals signs together. For example "if (host == "www.company.com")" checks to see if the variable called "host" has the value of "www.company.com". Note that you can do a "Not equals" check by using the "!=" test - if (host != "www.company.com")
Using the equals to SET a value uses a single equals sign. For example, "localip = myIpAddress();" will set a variable called "localip" to have the value returned by the function myIpAddress.
Be careful not to confuse these two uses. A statement of "if (host = "www.company.com")" will always be true - It is read as "IF I can successfully set the variable called "host" to the value of "www.company.com"". It also overwrites whatever value was in "host".
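If you want to see the difference for yourself, here is a small sketch. The variable names and the www.example.com host are made up for illustration, and the first "if" contains the mistake on purpose.

```javascript
var host = "www.example.com";
var wronglyMatched = false;

// MISTAKE ON PURPOSE: a single equals sign SETS host instead of checking it.
if (host = "www.company.com") {
  wronglyMatched = true;
}
// host has now been overwritten with "www.company.com".

var otherHost = "www.example.com";
var correctlyMatched = false;

// Correct: two equals signs CHECK the value and leave it alone.
if (otherHost == "www.company.com") {
  correctlyMatched = true;
}
```

The first test "matches" even though the host was www.example.com, and the original value is lost. The second test correctly does not match.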
In general, JavaScript will ignore carriage returns/newline characters, extra blank spaces and tabs. The only time where these really make any difference is when they split apart something inside of quotes. This can be used to add formatting and increase readability. To a PAC file, the two examples below are processed identically.
Example 1:
if (host == "www.cnn.com" ||
host == "images.cnn.com") {
alert("Found CNN!");
return "PROXY proxy.company.com:8080";
}
Example 2:
if (host=="www.cnn.com" || host == "images.cnn.com")
{ alert("Found CNN!"); return "PROXY proxy.company.com:8080"; }
Obviously, example #1 is far easier to read, but both will function identically.
Braces are used to group statements together and to signify a beginning and an end of a section of code. Braces ALWAYS come in pairs - Whenever you see an open brace "{" you should always be able to match it to a close brace - "}". Braces are most commonly used in conditionals (IF statements).
When writing code, it's often recommended to indent the lines grouped together inside a set of braces. This makes it easier to follow.
It is, unfortunately, very easy to get mismatched braces. You might put an open brace in and forget to close it. This can lead to some odd results that are hard to troubleshoot, so check your braces carefully.
You'll notice the first line of the PAC file ends in an open brace. When building a PAC file, you're defining a function called FindProxyForURL. The opening brace here signifies the beginning of the function. As with any brace, there MUST be a closing brace somewhere. In this case, you will ALWAYS close this brace as the very last line of your PAC file, as can be seen in some of the examples on this site.
Parentheses are used in two different ways - Sending values to other functions or grouping values together in conditional (IF) statements. As with braces, parentheses always come in pairs - An open "(" and close ")" must be matched together.
If you want to call a JavaScript function (either built-in or one you created) you need to use a set of parentheses after the function name. If you need to pass a variable to the function you put it inside of the parentheses. Examples:
isResolvable(host)
shExpMatch(host, "*cnn.com")
alert("Hello World!")
Whenever you use a conditional (IF statement) you need to put the "test" inside a set of parentheses. In some cases, you might be using a function which has parentheses of its own. This means multiple layers of parentheses, each of which has to have an open and close. Here are some examples.
if (host == "www.cnn.com")
if (dnsDomainIs(host, ".cnn.com"))
if (isInNet(myIpAddress(), "192.168.1.0", "255.255.255.0"))
With parentheses, it's VERY easy to get them mismatched, especially in a long line. The trick is to count the open parentheses and make sure you've got the same number of closing parentheses.
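If counting by eye gets tedious, the counting trick can be sketched in a few lines of JavaScript. The countChar function is a hypothetical helper of my own, not a PAC function, and it is deliberately simple: it would also count any parentheses that happen to sit inside quoted text.

```javascript
// Hypothetical helper: counts how often a single character appears
// in a line of text.
function countChar(text, ch) {
  var count = 0;
  for (var i = 0; i < text.length; i++) {
    if (text.charAt(i) == ch) {
      count = count + 1;
    }
  }
  return count;
}

// The line to check is stored as text, wrapped in single quotes so the
// double quotes inside it are kept as-is.
var line = 'if (isInNet(myIpAddress(), "192.168.1.0", "255.255.255.0"))';
var opens = countChar(line, "(");
var closes = countChar(line, ")");
var balanced = (opens == closes);
```

Matching counts don't prove the parentheses are in the right places, but a mismatch tells you right away that something is missing.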
Lastly, parentheses can "group" different things together in a conditional, combined with a Boolean operator, discussed below in more detail.
This is one of the key aspects of a PAC file - Testing and checking values to determine how the PAC file should respond.
The "if" statement is the fundamental building block for PAC file conditionals.
The general format for using an "if" statement is the word "if" followed by a value or values to be tested in parentheses followed by a series of actions to be taken inside of braces. If the value is true, the actions inside the braces are run, otherwise the PAC file skips down to the line below the close brace and continues processing. For example, to send traffic to a proxy for the host webmail.company.com you would do the following.
if (host == "webmail.company.com") {
return "PROXY proxy.company.com:8080";
}
In some cases, you will want to check a value and perform an action if that value is true and a DIFFERENT action if that value is not true. To do this, you use an ELSE at the end of your IF statement. If the IF statement is true, it executes normally. If the IF statement is false, the code in the ELSE braces is executed, as shown below.
if (host == "testsite.company.com") {
alert("You are going to the test site!");
}
else {
alert("You are NOT going to the test site!");
}
ELSE statements are rarely used in most PAC files. A PAC file is typically very linear - You check one thing, if that doesn't match, you go onto the next. There are few absolute "If this do that, otherwise do that" kind of scenarios. There are, however, cases where using an ELSE can be very useful. For an example of an ELSE "in action" check out the Simple PAC File with Load Balancing sample PAC file.
Boolean operators are a way to increase the flexibility of a conditional. The primary Boolean operators are AND, OR and NOT. These very simple building blocks can be combined to form some very powerful tests. JavaScript uses special characters for these operators. && is used to represent AND, || is used to represent OR and ! is used to represent NOT.
A Boolean AND means that all of the things being compared must be true for the overall statement to be true. The OR means that only one of the things in the list needs to be true for the overall statement to be true.
Examples:
if (host == "www.company.com" && isInNet(myIpAddress(), "192.168.0.0", "255.255.0.0"))
if (host == "webmail.company.com" || host == "intranet.company.com")
if (dnsDomainIs(host, ".company.com") || isResolvable(host))
The Boolean NOT reverses the result of whatever follows it - The conditional matches only when the thing being tested is NOT true. Examples:
if (!isPlainHostName(host))
if (!dnsDomainIs(host, "www.company.com"))
Boolean operators can also combine together more complex conditionals put in parentheses. When this occurs, the sub-conditional inside the parentheses is evaluated first, then the larger conditional. For example...
if ( (host == "www.company.com" || host == "webmail.company.com") && isInNet(myIpAddress(), "192.168.0.0", "255.255.0.0"))
The browser will first check to see if the host is www.company.com or webmail.company.com. If one of those is true then it will check to see if the machine's local IP address is in the specified subnet. If that all is true, then the overall condition will be true, otherwise it will be false.
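Here is a sketch of that grouped test that you can run outside a browser. The myIpAddress() and isInNet() functions are simplified stand-ins that I wrote for this example (dotted IPv4 addresses only, no DNS lookups), and the local addresses and the matches() wrapper are assumptions for illustration.

```javascript
// Stand-in for myIpAddress(); the address can be changed for testing.
var fakeLocalIP = "192.168.5.20";
function myIpAddress() {
  return fakeLocalIP;
}

// Turns "a.b.c.d" into a single number so a mask can be applied.
function ipToNumber(ip) {
  var p = ip.split(".");
  return ((Number(p[0]) * 256 + Number(p[1])) * 256 + Number(p[2])) * 256 + Number(p[3]);
}

// Simplified stand-in for isInNet(): compares the masked address
// with the masked network.
function isInNet(ip, network, mask) {
  return (ipToNumber(ip) & ipToNumber(mask)) == (ipToNumber(network) & ipToNumber(mask));
}

// The grouped conditional from the text, wrapped so it can be tried out.
function matches(host) {
  return (host == "www.company.com" || host == "webmail.company.com") &&
         isInNet(myIpAddress(), "192.168.0.0", "255.255.0.0");
}

var rightHostInSubnet = matches("webmail.company.com");
var wrongHostInSubnet = matches("www.cnn.com");
fakeLocalIP = "10.1.1.1";
var rightHostOutsideSubnet = matches("www.company.com");
```

Only the first case passes both halves of the test. The second fails the grouped host check, and the third fails the subnet check.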
Conditionals and Boolean logic are immensely powerful but often require a lot of thinking and processing, especially when trying to work out how a number of AND, OR, and NOT operators interact (especially the NOT operator - It adds a lot of power but can be very confusing). If you're having trouble trying to visualize a complex conditional statement, write it out on a piece of paper. You might try to substitute simple letters or variable names for each condition: "IF (A and B) OR NOT C" or "IF (localdomain and localsubnet) AND NOT (dmzsubnet or InternetHostedServer)...". Get your head wrapped around it there in simple terms and walk through some possibilities to make sure it works right.
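The paper exercise can also be sketched in code by walking through every combination of true and false. In this sketch A, B and C are placeholders for real conditions, and the evaluate function name is my own choice for illustration.

```javascript
// The first example from the text: IF (A and B) OR NOT C
function evaluate(A, B, C) {
  return (A && B) || !C;
}

// Try every combination of true/false for the three placeholders.
var choices = [true, false];
var rows = [];
for (var i = 0; i < choices.length; i++) {
  for (var j = 0; j < choices.length; j++) {
    for (var k = 0; k < choices.length; k++) {
      var A = choices[i];
      var B = choices[j];
      var C = choices[k];
      rows.push("A=" + A + " B=" + B + " C=" + C + " -> " + evaluate(A, B, C));
    }
  }
}

// One line per combination, eight in total.
console.log(rows.join("\n"));
```

Reading down the list shows which combinations match, which is a quick way to catch a NOT that doesn't do what you expected.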