2011年7月28日星期四

AMPL Scite





http://archive.ite.journal.informs.org/Vol7No1/LeeRaffensperger/scite/ampl.properties


Volume 7, Number 1, September 2006
Using AMPL for teaching the TSP
 
 
Jon Lee
IBM T.J. Watson Research Center
P.O. Box 218, Yorktown Heights, NY 10598 USA
 
 
John F. Raffensperger
Dept. of Management, Private Bag 4800
University of Canterbury, Christchurch, New Zealand
 
 
Abstract
In this paper, we discuss the use of AMPL in teaching students about the traveling salesman problem (TSP). The paper gives suggestions for pedagogical devices, homework assignments and exams, PowerPoint presentations, and a convenient package of AMPL models and scripts. The AMPL files include different formulations for the TSP, its relaxations, scripts for its solution, and – particularly useful in class – scripts for visualization of those solutions using SVG. We have a special focus on visualization, to provide convenient ways for students to view and report their solutions. We observe that the TSP is such a classical O.R. problem, that it can play a central role in an undergraduate course about integer programming.

1. Why teach the TSP?
This paper describes our experience in teaching O.R. students about the traveling salesman problem (TSP), as part of advanced courses in math programming, for third year, fourth year, and masters level courses. The TSP remains a very important problem. Though it is very well studied, it continues to tantalize researchers. Nothing humbles a graduate student so quickly as a brutally difficult problem, and the TSP is perhaps foremost in that category.
The TSP leads into many other related areas of O.R., including integer programming and cutting planes, approximation algorithms, Lagrangean relaxation and the subgradient method, and various heuristic approaches (such as greedy and k-opt). The TSP and its variations, such as multiple vehicles, time windows, tour length constraints, have enormous real-world applicability. Because of its importance, its ease in description, its many applications, and its historical inspiration for countless algorithms, the TSP will continue to be an important part of teaching integer programming and combinatorial optimization.
This paper's contributions include suggestions for pedagogical devices, homework assignments and exams, and a convenient package of AMPL models and scripts. We have a special focus on visualization, to provide convenient ways for students to view and report their solutions. The TSP lends itself well to graphical output, and students find this immensely helpful for learning and for model and algorithm debugging. Furthermore, the lecturer can require that students turn in a graphical print-out of the tour, thus allowing some determination of solution quality at a glance.
Computationally difficult problems are sometimes a source of pedagogical challenge. A model instance that is too small does not give the right impression of computational complexity to a student. On the other hand, student versions of solvers are usually too small to allow solution of large model instances. One solution to this dilemma is programming ability. Lecturers whose students have good programming skills have an advantage in teaching the TSP. Unfortunately, undergraduate students in business have a widespread lack of programming skills. Only recently has the University of Canterbury added a modest programming requirement for third-year courses in operations research; it is still too early to see whether this new requirement will pay off. The students' lack of programming skill puts a greater burden on the lecturer to explain algorithms in more detail and to provide canned tools.
Because it does not require programming and because many students already have some familiarity with it, the spreadsheet bears some consideration, especially to solve small problems in first or second year courses, and in MBA programs. We use Excel and What'sBest for such courses; students are required to solve a 30-city TSP, starting with the assignment problem, and then manually add subtour-breaking constraints and binary integrality. This exercise is extremely effective in conveying the problem's difficulty as well as hinting at some of the key elements that can be effectively employed to solve large instances.
Unfortunately, the spreadsheet has several drawbacks for solving the TSP, and for teaching students about it. The spreadsheet does not scale to larger problem instances easily. Creating algorithms (such as adding subtour-breaking constraints) is unnatural, awkward, and time consuming, so the spreadsheet does not lend itself to exploring advanced techniques of integer programming. And Excel has no convenient tools to display network graphs (not to be confused with charts), as each arc must be a new series. We have spent some time coding Visual Basic macros to produce visualizations within Excel, only to find that these macros are difficult to write, clumsy, and unsatisfactory. In contrast to spreadsheets, algebraic modeling languages provide an easy means for making use of integer linear programming methods for combinatorial optimization problems.
We have chosen to use the proprietary modeling language AMPL, which has scripting capability. Useful alternatives to AMPL include FLOPC++ (an open source algebraic modeling language implemented as a C++ class library), GAMSMapleMathematica, the optimization tools in MatlabZIMPL (open source), and, of course, the standard programming languages. We have chosen AMPL primarily for its scripting ability, but also for its low cost, its easy links to many solvers (including open-source solvers such as Cbc), and its popularity.
Perhaps the most interesting AMPL alternative is the open-source Gnu LP Kit, with the useful LP_Solve IDE (see also Introduction to lp_solve 5.5.0.9). In fact, these are not really alternatives, but overlapping programs, because AMPL can call LP_Solve, and because the Gnu LP Kit language is a subset of AMPL. Though LP_Solve IDE cannot run AMPL scripts, it has a surprisingly nice interface and can solve many AMPL models. LP_Solve IDE has the remarkable ability to convert models into different formats (e.g. it can convert MPS toLP-FML, among others).
For editing AMPL, one alternative is AMPL Studio, which we have not tested. Some popular text editors have basic syntax highlighting, and this is the path we have taken, partly for its low cost. In this paper, we give directions and help files for using AMPL with the open-source editor Scite.
While using AMPL feels like programming to students, many AMPL models do not require scripts, and the programming overhead associated with AMPL scripts is less than with Visual Basic or Java. We have found that when students are given some reasonable initial help in AMPL in a workshop, including some scripts to get them started, the programming demands are modest. We have no empirical data regarding the effectiveness of our materials. However, based on students' look of relief when they get their first AMPL model working, the speed with which they can get the basic algorithms going, and on the overall quality of their work, the workshop approach is clearly positive and worthwhile. Students who cannot write AMPL scripts can modify them and learn from that, while learning about the TSP, and thereby eventually learn to write their own AMPL models and scripts. Of course, AMPL models are the first step, but this naturally leads to the harder step of writing AMPL scripts, and this can potentally lead to real programming.
The AMPL code that we provide does not usually incorporate theoretically-efficient algorithms. Except for the minimum spanning tree and the 1-tree, the subproblems are modeled as linear or integer linear programs, and AMPL then accesses a general linear programming or integer linear programming code.
The outline of this paper is as follows. First, we discuss other teaching materials for the TSP. Second, we give a syllabus, with suggested software, lectures and tutorials, a homework assignment, exam questions, and describe the relevant AMPL scripts and data. Third, we give some further ideas to expand the material beyond what we currently teach, along with additional AMPL models and scripts. We conclude with a listing of internet resources for the TSP.
2. Literature review
Lawler et al (1985) and Reinelt (1994) are, of course, excellent texts on the TSP. These books thoroughly cover branch and bound, cutting planes, subgradient optimization, a thorough review of different relaxations, and many important heuristics. Similarly, Nemhauser & Wolsey (1988) has quite rich material on the TSP, and encapsulates it well. Such textbooks are too advanced for our students in business, who do not have advanced skills in math or computer programming.
Several classic textbooks that are currently used at the undergraduate level either do not mention the TSP at all, or contain a shallow introduction to the subject. To mention a few, Dantzig's (1963) classic text on linear programming gives three different formulations over a couple pages. Wagner's (1975) classic text gives a branch and bound algorithm. Hillier & Lieberman (2001) has, surprisingly, nothing on the TSP. Ragsdale (2004) gives one example with the heuristic in Frontline System's Premium Solver for Excel. Lawrence & Pasternack (1998) give a similarly brief example, with a branch and bound algorithm in a supplement on a CD-ROM. Winston's (2004) discussion of the TSP includes a dynamic program (aimed at very small instances), a branch and bound example, a brief discussion about heuristics, and a loose integer programming formulation. We currently use Winston's text, but liberally add to the material.
We feel that the TSP is such a classical problem, perhaps the classical O.R. integer program, that it can play a central role in an undergraduate course about integer programming, even in a school of business. Yet the undergraduate textbooks do not seem to do it justice. We have found just one article written specifically for teaching the TSP to undergraduates. Pataki's (2003) excellent paper covers the relative tightness of different formulations, using Matlab.
Specifically for AMPL, Fourer, Gay & Kernighan (2003) is excellent and authoritative, but much more than students need for one course; by all means the book should be available to students as a reference.
3. TSP Syllabus
Here, we describe the software required and give our course material on the TSP. The course material is comprised of three lectures, two tutorials, homework problems, and exam problems.
3.1. Software required
The student version of AMPL with CPLEX may be downloaded fromwww.ampl.com. In fact, we intentionally avoid making a more powerful solver available to students, as we have found that the homework is more compelling with the limited version. Students are still free to try to find a larger solver on their own, but few ever do so.
If you wish to solve models, but not use scripts, we also suggest the LP_Solve IDE. It has a nice interface and can read a subset of AMPL's language. The solver is not limited in the number of variables or constraints, but is much slower than CPLEX.
SVG viewer. To see the graphical output from the scripts, you should install an SVG viewer, and we highly recommend this approach. You can get a free viewer from Adobe.com. The graphical output is displayed in scalable vector graphics (SVG). From the website of the World Wide Web Consortium, "SVG is a language for describing two-dimensional graphics and graphical applications in XML." Almost all web browsers can display SVG natively or with an add-in. The file format is plain text, and easy to create with a script. For example, here is the minimum spanning tree for 1,000 random points on the Euclidean plane, as intermediate output from the Christofides heuristic.
Scite. For editing AMPL files, we prefer the open-source cross-platform Scite, downloadable from scintilla.sourceforge.net. This editor is highly configurable for various languages via properties files. We discovered a rudimentary AMPL properties file for Scite in Italian, and have modified and extended it extensively (ampl.api, ampl.propertiesSciTEGlobal.
properties
).
  • Scite shows syntax coloring, block indenting, block commenting, and section folding.
  • Scite can recognize AMPL keywords for statement completion with the tab key.
  • Scite displays in-line pop-up help for AMPL functions such as "first(x)".
  • With our attached amplhelp.html file, Scite provides context-sensitive help for AMPL.
  • The user can start AMPL with the current run file, simply by pressing the F5 key. AMPL runs in a side panel.
  • Scite allows infinite undo, and many other features.
More details on installing the files for Scite are provided in section 4.2.
The amplhelp.html file has roughly eight pages of beginner-level and reference material for AMPL. It includes a section on getting started with AMPL, basic syntax, descriptions of some important AMPL words and commands, commonly used options and functions, and a list of frequently-asked questions. We recommend that the lecturer using these materials modify this help file for the local requirements as needed.
The included AMPL models and scripts. These are organized in their own directories, with a separate directory for data. This encourages students to think about modeling in an object-oriented way, to keep the model as generic as possible, and separate from the data.
The scripts and models have, for the most part, uniform data formats and uniform definitions for sets, variables, and parameters. For example, "param N;" is used in all models to define the number of nodes, and "Tourlength" is the name of every objective, even though some will not produce tours. This uniformity has a couple of advantages, the primary one being that we can use a single script to produce the SVG output graph, and a secondary advantage being that students become accustomed to the notation more quickly.
3.2. Lectures and tutorials
Lecture 1. Description & history of the TSP. Some formulations. PowerPoint presentation
In the first lecture, we first describe the problem as simply as possible: drive a car via n cities, minimizing travel distance. We then follow this with some history, especially the seminal Dantzig, Fulkerson & Johnson (1954) paper and formulation, with its key ideas of branch and bound and IP cuts. We discuss the difficulty of the TSP, touching on complexity theory. We also mention Concorde (Applegate, Bixby, Chvatal & Cook 1998), so students realize how far the research has come.
Following this introduction, the remainder of the lecture covers different formulations related to the TSP, most with their AMPL formulations and solutions.
(Note: if the files do not open on click, consider setting the ".mod" and ".run" file extensions to open with your favorite text editor.)
The algebraic formulations for these are in the PowerPoint presentation, and will not be repeated here. Generally, we show the algebraic form and the AMPL model directly in the slides (which are printed and given to students), so the AMPL formulation itself is used in the lecture. This gets students accustomed to seeing AMPL formulations, which helps them when it comes time to write their own. It allows them to ask questions in class about AMPL syntax. Further, if they wish, they can copy the examples straight from the lecture notes for testing. In some cases, we also show the expanded model for further clarification.
Throughout, we follow a careful format for models, first giving the indices, the parameters, the decision variables, the model, and an explanation of each constraint set using "NPS format" (Brown 2004), due to its invention and use at the O.R. Department of the Naval Postgraduate School. (For a published example of NPS format, see Baker, Morton, Rosenthal, & Williams 2002.) To give students a mnemonic, we call this IPDME format (i.e., Indices, Parameters,Decision variables, Model, Explanation of constraints). This format has several advantages over a less formal approach. The format requires the author to write out formulations in a methodical and uniform way, making sure that every term of the model is explained, and encouraging the use of a unique meaning for each term. The format gives the reader a convenient directory of all symbols. The format leads naturally to an algebraic modeling language, as indices become sets, and the rest follows analogously. We encourage educators and editors to standardize on this format.
For example, here is the classic Dantzig-Fulkerson-Johnson (DFJ) formulation, in IPDME format.
Indices: ij city.
Parameters: n = number of cities; cij = cost to travel from city i to city j.
Decision variables: xij = 1 if the salesman should travel between cities i and j, else 0.
  1. Model DFJ: min å ni=1 å j:j>cij xij,
  2. åj:j>(xij + xji) = 2 for all i=1,...,n.
  3. åi,jÎxij £ |S| – 1, for every subset S.
  4. xij Î {0, 1} for all i, j:j>i.
Explanation:
  1. Minimize total distance traveled.
  2. Each city must have degree 2, i.e., must be entered and departed.
  3. No subset of cities S consist of a tour, unless that subset has all the cities.
  4. The salesman cannot split his travel between cities.
We write algebraic models in plain text (rather than Equation Editor or images from LaTeX) using text fonts, subscripts, and superscripts carefully. This allows algebraic models to be copied more easily between programs, animated in PowerPoint, and edited in HTML.
While familiar to the O.R. teacher and researcher, the summations in model DFJ can be quite a challenge to students. We suggest taking a little time to explain how the summations work, and explain that the vertical bars on the set, |S|, mean the cardinality of the set, like a spreadsheet's Count() function.
Part of understanding the model is understanding the solution. Students sometimes have difficulty making sense of the model until they understand the form of the decision variables. Thus, we use a uniform example for all formulations, and show the solution for each formulation. A tour can be encoded in different ways, and it is easy to forget that the IP edge formulation is not immediate for a newcomer. Without any guidance, a student may suggest making a solution vector be simply a permutation of the integers 1,...,n. It is instructive to pursue this direction, exploring the difficulties of trying to write down an IP formulation from these variables. The first issue that one encounters is the difficulty of expressing the "all-different" constraint that the values of each of the n variables should be distinct. For example, for n=3, we see that 1,2,3 and 3,2,1 are both valid solutions. Their midpoint 2,2,2 is on the convex hull and is therefore feasible in any LP formulation based on these variables. But it is infeasible to the IP. Furthermore, it is not possible to formulate the objective as a linear function in these variables, as there are only n variables but O(n2) pairwise distances. So it is a useful exercise in modeling, to work one's way to the inevitably fruitful idea of using binary variables to select edges for a tour.
Tutorial 1. Intro to AMPL, and two key relaxations of the TSP.
The goal of this tutorial is to give students a quick hands-on success with AMPL, and to teach two key relaxations of the TSP, in a workshop approach. Please note that we assume that the lecturer is familiar with AMPL. Some instructions for running these files are given here and further information is given in amplhelp.html but the authoritative discussion on AMPL is Fourer, Gay & Kernighan (2003).
Before the tutorial starts:
  • students should be aware of the requirements for Homework 1, so they are thinking about how to solve it;
  • the lab computers should have Scite, AMPL, CPLEX, and an SVG viewer installed.
The lab should have a chalkboard. At the beginning of the tutorial, the algebraic and the AMPL formulations for the assignment (assignment.mod,assignment.run), and 2-matching problems (2matching.mod2matching.run), are put on the board. Do not give the files on disk, but do give students the complete AMPL formulation either on paper or on the chalkboard. Remind students to end commands with a semicolon. By giving students an example straight off, they will start to catch on to AMPL's syntax. Give the model to students without integrality, and let students discover for themselves the need for integrality.
Students should copy the AMPL formulations into a text editor, creating their own files, and running AMPL as they are ready. By requiring students to type in two simple models, they are forced to engage with AMPL, and make it work, at a time when they can get immediate help. The concept of the AMPL run file is taught, and students put it to use. Separation of model, script, and data then begins to make sense, especially as they realize that changing from the tutorial example to the homework set involves changing only the data file. This contrasts with the need to rewrite the entire model, as would be required with Lindo.
The models can be run either through the Scite interface or at the AMPL command prompt.
  • To run a script in AMPL through the Scite interface, the user simply opens the script (usually a ".run" file) in Scite, and presses the F5 function key. AMPL then runs in a side panel. Note that if you have a model file open without a "solve" command, AMPL will read the model, but not solve it because it has not received the "solve" command.
  • To run a script at the AMPL command prompt, click Start, Run, type ampl in the dialog and press enter. This opens an AMPL command window. To run the 2matching.run script, at the "ampl:" prompt, type "run 2matching.run;", and press enter. In the command window, they will need to enter the reset command to clear AMPL's memory. ("Reset" is not needed if they use Scite as the interface.) Remind them not to reset before using "display" to see results. Explain that when they are frequently rerunning a script, they may want to put "reset;" at the end of the script.
After students get the two models running, they are given access to the script (makesvg.run) which produces the SVG graphical output, and shown how to call the script from their existing run files. This visualization gives them a powerful and immediate indication as to whether their model is correct, and how it is related to the TSP. This would be the time to explain the commands filename command, which is used to give AMPL a list of commands from a file.
SVG can be viewed in a web browser with a free add-in (such as Adobe's SVG Viewer, http://www.adobe.com/svg). We have carefully written all models and scripts to have uniform variable and set definitions, so they all will work with makesvg.run. For algorithms such as the 1-tree subgradient optimization, the graph can be viewed as the algorithm is progressing, simply by refreshing the browser. Copying the image to a word processing program is convenient. The images are valuable for teaching, and are also conveniently copied to presentation software such as PowerPoint.
Given sufficient time, students may be given AMPL scripts for the minimum spanning tree (Prims.run or kruskal.run, originally developed by Olinick (2006)). Alternatively, the minimum spanning tree may be covered in a later tutorial, after the material is presented during the lecture.
The workshop approach has multiple benefits, as immediate help can be given to those students who have difficulty starting and running AMPL. Points of syntax can be addressed immediately, and students get a quick positive result, with a problem that is easy computationally. Key AMPL commands, such as "reset", "display", and "expand", are explained and immediately put to use. Sometimes, students become sufficiently confident and enthused that they want to attempt the homework assignment immediately.
Provision of this one-hour workshop has alleviated a host of problems in using AMPL. Students come up to speed much more quickly compared to simply being given a sheet of instructions.
Lecture 2. Relaxations and solution methods. PowerPoint presentation
Following Lecture 1's introduction to TSP formulations, Lecture 2 introduces some solution methods. We first present the assignment problem with branch and bound, using an example from Winston (2003, pp. 530-534), then give the 1-tree, with Prim's and Kruskal's algorithms. This information is easy to explain, and tends to be understood easily, partly because of the highly graphical nature of the information, and as it is given just a few weeks following more general IP cuts and branch and bound.
The final section of this lecture covers the Held & Karp 1-tree subgradient optimization algorithm. Many students get completely thrown at this point, as the lecture has moved from tree, to 1-tree, to an entirely new algorithm. Considerable patience is required. The subgradient optimization material is presented in more detail in Tutorial 2.
Tutorial 2. Subgradient optimization. PowerPoint presentation
In this tutorial, the subgradient optimization algorithm is presented in detail.
With sufficient time for a computer lab, students may be given the relevant AMPL scripts, including those for Prim's (Prims.run), Kruskal's (kruskal.run, originally developed by Olinick (2006)), and subgradient optimization (HeldKarp.run). Students should be challenged to match up the steps of the algorithm from the notes with lines of code in AMPL.
These scripts do not use separate model files. Since these do not call the IP solver, AMPL's script processor can solve larger spanning trees than would be solvable through calls to the student-version IP solver. Of course, this depends on the commands employed. If you modify the script and find that AMPL complains about the number of variables, try using parameters instead of variables.
Lecture 3. More solution methods for the TSP. PowerPoint presentation
The majority of Lecture 3 presents the DFJ subtour cut generation algorithm. We use a simple 6-city example from Winston (2003), an example which produces subtours with the 2-matching relaxation. The max-flow algorithm is then developed as a method to find subtours. The presentation shows the output from AMPL running, so students see the iterative nature of the algorithm.
The presentation does not cover all the various AMPL scripts we have for the DFJ model, but the lecturer may want to discuss them with students. These include, in increasing order of sophistication:
  • DFJ_all_subtours.mod and DFJ_all_subtours.run (modified fromhttp://www.ampl.com/FAQ/tsp.mod). This script explicitly uses the model with all O(2n) subtour constraints. This is meant only for very small instances.
  • DFJ-simple.mod and DFJ-simple.run. This script starts with the model consisting of just the simple bounds and the degree constraints. At each stage the current integer linear programming formulation is solved, and then a single violated subtour inequality is appended to the formulation. The script stops when there is no violated subtour constraint.
  • DFJ.mod and DFJ.run. This script also starts with the model consisting of just the simple bounds and the degree constraints. The scripts executes a sequence of stages, repeating until a substage finds an optimal tour. At each stage we do the following:
    1. Repeatedly solve the current linear program; identify a violated subtour constraints by solving further (max-flow/min-cut) linear programs, and append it to the current formulation; this sub-stage terminates when there are no further violated subtour constraints; if the resulting solution is integer, then we have an optimal tour, and we terminate the script.
    2. Solve the current formulation as an integer linear program. Identify a violated subtour constraint and append it to the current formulation. If there is no violated subtour constraint, then we have an optimal tour, and we terminate the script; otherwise, we initiate a new stage.
  • DFJ_restricted_arcs.mod and DFJ_restricted_arcs.run. This script executes roughly as DFJ.mod/run, but initially the number of arcs is restricted to the shorter ones. Then, as described at the end of the PowerPoint presentation, new arcs are added if they might improve on the solution of the linear programming relaxation. Arcs with sufficiently large reduced cost can be omitted from the model. In our experience, the 300-variable limit in the student version of AMPL/CPLEX has been sufficient to prove optimality for 50-city problems. Note that the restricted arcs versions can be used to solve the homework directly, so the scripts probably should not be given to students before the homework is due.
The max-flow algorithm is then developed further into a new formulation. Martin (1991) gave an LP formulation for the MST, which is best understood in this context. From this formulation for the MST, we then develop a formulation for the TSP with O(n3) variables. The relevant AMPL files are martin_tree_lp.mod, and martin_tree_lp.run, which can be used to solve for a minimum spanning tree, a 1-tree, or the TSP.
The lecture concludes with hints for the homework, which we describe next.
3.3. Homework.
We assign a 50-city problem, which is too big to solve directly in the student version of AMPL with CPLEX. (We have included a script which will generate a data file with random x,y coordinates, get_random_coordinates.run. This script calls another script, makesvg_points.run, which creates an SVG graph of the points. Alternatively, the lecturer can use a spreadsheet "Random_data_generator_for_TSP_50_points.xls".) Students are required to give valid upper & lower bounds on the tour length, with a description of their procedures for finding these bounds.
The use of the restricted student version of CPLEX requires students to omit rows and variables. The limitation on the student solver should be used to emphasize that one's computer is never fast enough, and that one's solver is never powerful enough. Real world problems can be extremely demanding, and the resourcefulness of the operations researcher is therefore put to the test. This invites students to use creativity (e.g., learn to use the NEOS server (see 5.1 below) or other software, such as the open source COIN-OR Cbc, GNU LP Kit, or LP_Solve).
As mentioned above, the solution is to use DFJ_restricted_arcs.mod andDFJ_restricted_arcs.run, which should not be given to students before the homework is due.
The most common error is to claim optimality when they cannot. Students realize that they must follow the common practice of eliminating unlikely variables (e.g. long edges) from consideration, but then they make claims about optimality that they should not.
Students may gain nearly full marks without an optimal solution, but their approach must be reasonable. Minimal nonzero marks would be given for a tour found visually for the upper bound and an ordinary 1-tree for the lower bound. Good answers should include a good feasible solution, and at least a few iterations of the Held-Karp subgradient optimization algorithm. The lecturer can choose whether or not to give the students our included script (HeldKarp.run).
3.4. Exam problems
3.4.1. A minimum spanning tree.
To write the problem:
  1. Create a random problem. A 20-node problem is reasonable. Because new problems are easy to generate, students cannot memorize a given solution from a previous year's exam paper.
  2. For this random problem, print the graph and the matrix of distances on the exam paper.
  • Ask students to find the MST by hand, using Prim's or Kruskal's algorithm.
  • Ask the students to draw the solution on the exam paper, and show the order in which the arcs were added.
  • To mark the question, find the correct solution with AMPL, and create the graph. Then simply compare the student's answer to the correct answer.
3.4.2. One step of subgradient optimization.
To write the problem:
  1. Create a random problem. A ten-node problem is reasonable.
  2. Create the initial 1-tree with AMPL.
  3. Print the 1-tree and the matrix of distance on the exam paper.
  4. Include a table of the current arcs, labeled with From nodeTo node,Distance, and Penalized distance.
  5. Provide a line or two of space for students to write node penalties, for question a.
  6. Fill in all the entries except the Penalized distance column, for question b.
  7. Print a graph of the nodes without arcs, for question c.
Questions could be as follows:
  • "a. (1 point) Starting with initial penalties of zero and a step size of 30, calculate the new node penalties. Write them here." (Provide space.)
  • "b. (2 points) Write the new penalized distances in the table above."
  • "c. (7 points) With the new penalized distances, draw the new 1-tree below."
3.4.3. Subtour cut generation.
To write the problem, create a random problem. Modify the DFJ AMPL script to stop with subtours. Copy the graph into the exam paper, and ask students to write constraints that break the displayed subtours.
3.4.4. Understanding of specific algorithms.
Show students an AMPL script that they have studied. Ask them to explain several lines in the script, and how the algorithm requires those lines of script. For example, the Kruskal.run script early on contains the line, "let Best_Edges := {(i,j) in Unscanned: length[i,j] = min{(u,v) in Unscanned} length[u,v]};" For a short answer question, you can ask students to explain in two or three sentences the purpose of this line. A valid answer would be something like, "This line initializes an iteration for one arc. The line finds the shortest unconnected arc among the set Unscanned, and adds this arc to the set Best_Edges. The algorithm then connects this arc to a component of the still-disconnected tree."
3.4.5. Short answer question.
"Describe two ways to get a good lower bound on the traveling salesman problem." Answer: Method 1. Solve the assignment or 2-matching problem, then add subtour breaking constraints. Every solution is a valid lower bound, and the lower bound improves with each added constraint. Method 2: Use Held & Karp subgradient optimization for the 1-tree relaxation. Method 3: Solve any other TSP formulation as an LP.
3.4.6. Short answer question.
"Give two reasons why a lower bound would be helpful in solving the traveling salesman problem to optimality." Answer: First, if we find a tour that is equal to the lower bound, then we know it is optimal. Second, during branch and bound, we can prune off subproblems that have a worse lower bound than the current best lower bound. Third, with both an upper bound U and a lower bound L, we can delete variables that have a reduced cost (in the LP) that is larger than U −L. Putting those variables in the solution will increase the objective above the upper bound, so they cannot be optimal.
3.5. Further ideas
The TSP allows a wide range of creative possibilities. Here are additional ideas for the classroom and homework.
3.5.1. A half lecture could cover Christofides' heuristic (Christofides.run).
Teaching heuristics requires that students already have good programming skills and an appreciation for the TSP's difficulty. In short, we think that the integer programming approach should be taught first.
3.5.2. A half lecture could cover the 2-matching problem in depth, including comb constraints.
This is an example of an easy problem (i.e., there is an efficient algorithm), but the IP is not naturally integral. Instead of relaxing the degree 2 constraints, we could relax the subtour breaking constraints, resulting in a 2-matching problem. The 2-matching problem can be solved in polynomial time, but its LP relaxation is not naturally integer. However, Edmonds (1965) showed that a complete linear description of the 2-matching problem is obtained by the constraints 1 and 2 from the DFJ formulation above, with 5 below.
5. å (i,j):iÎW, jÎW xi,j + å (i,j)ÎF xi,j ≤ |W| + ë|F|/2û, for all node subsets W, and all sets of arcs F, where (i,jΠF if i Î W xor j Î W.
These constraints are called the 2-matching inequalities, not to be confused with the constraint set that requires that nodes have degree 2. The 2-matching inequalities are part of a large set of constraints called comb constraints; set Wis often called the "handle" and set F is called the "teeth ". Unfortunately, the 2-matching problem has an exponential number of two matching inequalities. We provide a script to find and add 2-matching inequalities iteratively,2matching_with_combs.mod and 2matching_with_combs.run.
3.5.3. Add Dantzig-Wolfe decomposition with the 1-tree subproblem (Held & Karp).
Held & Karp (1969) showed how to find v(DFJ) with the following model (note we use underscore to designate fixed variables, as this is easier to type than overscore):
Model HK: min åtπtåiåj:j>icijxtij, where xtij is the set of arcs in 1-tree t,
åπt (åj:j>(xtj,i + xti,j)) = 2, for all i (dual prices λi),
åπ= 1,
πt ≥ 0, for all t.
A solution method for Model HK is as follows:
Step 0. Pick dual prices λi.
Step 1. Solve the 1-tree: min åiåj>(cij – λi – λj)xij + 2åλi, x ÎT.
Step 2. Add one variable πt+1 to Model HK above and resolve HK. Update dual prices λi from degree constraints. Quit when v(HKt) = v(1-treet).
This requires many iterations, with a "tail-off " effect. To improve on this method, Held & Karp (1970) gave their now-famous subgradient optimization algorithm.
3.5.4. Assign a Petersen graph, giving only the numerical data.
(An example, Petersen10.txt is given in the AMPL data files.) Students who take the time to create a visualization will have additional insight. A bit of a trick assignment, perhaps best for a tutorial.
3.5.5. Add a capacity constraint, such as for fuel.
This is easy to do, and lends realism to the problem.
3.5.6. Use of AMPL's database access to take TSP data from a spreadsheet.
This is easy to do, and useful for students.
3.5.7. The lecturer can change innocuous data
(e.g., arcs not in the solution) to test for plagiarism. A glance at the student's data confirms whether the student has taken another student's files.
3.5.8. Ask students to code the nearest neighbor heuristic in AMPL script, and examine its performance.
4. A directory of teaching materials: files, models and scripts for the TSP
This section gives a directory of all the files associated with this paper. (Here are all supporting files in a single zipped file, Using AMPL for teaching the TSP, supporting files.zip.)
4.1. PowerPoint presentations
4.2. Scite configuration files for AMPL
The following represents one attempt at setting up Scite to work with AMPL. We would welcome additional suggestions and improvements from readers.
  1. Download Scite from scintilla.sourceforge.net. The single executable Sc1 is convenient. Place this file in the directory of your choice, perhaps "C:\Program Files\Scite".
  2. Add a shortcut to sc1.exe on your Windows Launch tool bar.
  3. Copy all the files from this paper's Scite directory to your Scite directory.
  4. In Start, Control Panel, System, Advanced, Environment variables, System variables, add your Scite directory to the path variable.
  5. Then, to start Scite, simply click on the shortcut on your launch bar. You can also drag files onto the shortcut to open them with Scite.
  6. If Scite does not seem to find the AMPL files, review the settings in the ampl.properties file.
The file SciTEGlobal.properties imports the ampl properties file. SciTEGlobal.properties allows a wide range of configurations. See the Scite documentation file for more information.
The file ampl.properties provides support for the file extensions ".mod", "dat" and "run", lists AMPL key words for colored syntax, and specifies a variety of other miscellaneous properties. Settings in ampl.properties override settings in SciTEGlobal.properties. Ampl.properties controls the behavior of function key F5, which currently calls AMPL with the current ".run" file. You can modify this file to use the GNU LP Kit in place of AMPL.
The file ampl.api lists many of AMPL's functions, for the tooltips.
The file amplhelp.html provides context-sensitive help for AMPL key words in Scite. This help file has taken some effort, but could probably still be improved. The file uses images AMPLcmdline.PNGSciteAMPL.PNG, andSciteAMPL0.PNG.
The file amplwizard.hta represents an experimental approach to guiding a first-time user to AMPL syntax. This is ".hta" application, which runs in Internet Explorer as a trusted executable. Amplwizard.hta is written in ordinary HTML and VB script, which is easy to modify. It uses the "SendKeys" function to pass a string to Scite. The string is inserted into Scite at the current cursor location in Scite. This also shows how an external command can be placed into Scite's menu and called.
4.3. TSP and TSP-related formulations
The major source of scripts for this paper was the first author's University of Kentucky web site, which is not currently available. We discuss other internet resources later in the paper.
Some of the directories also contain sample SVG graph files, usually named tspgraph.svg.
4.4. Algorithms
  • Prim's algorithm for the MST, Prims.run.
  • Kruskal's algorithm for the MST, kruskal.run, script originally developed by Olinick (2006).
  • Held-Karp subgradient optimization, HeldKarp.run.
  • The DFJ cut-generation algorithm, adding one subtour cut at a time, DFJ-simple.mod and DFJ-simple.run.
  • The classical DFJ cut-generation algorithm, using a max flow model as a subroutine, DFJ.mod and DFJ.run.
  • The DFJ cut-generation algorithm with restricted arcs and reduced cost test, DFJ_restricted_arcs.mod and DFJ_restricted_arcs.run.
  • The 2-matching problem with comb generation, using a max flow model as a subroutine, 2matching_with_combs.mod and 
    2matching_with_combs.run
    .
  • Christofides heuristic, Christofides.run. Intermediate and final SVG output from this model is contained the files 1000points_mst.svg,1000points_oddmatching.svg, and 1000points_christo_tour.svg.
  • Script to create SVG output, makesvg.run. The script is assumed to be called from another script. The script assumes N nodes, a parameter or objective function named "Tourlength", set "Nodes" and "Arcs", variables or parameters named "x" defined over the set Arcs, and appropriately defined parameters "length", "xcoord" and "ycoord". The script creates an SVG graph with solid lines for x[i,j]=1 and dotted lines for 0 < x[i,j] < 1. It tries to scale the graphical objects to the data, but you may still need to adjust stroke-width, circle radius, and font-size.
4.5. Sample data
Note that non-Euclidean examples have dummy x,y coordinates for the SVG visualization.
5. Additional web resources
The Internet has rich materials on the TSP, with lots of free software and interactive visualizations. Google reports 115,000 hits on the search site:edu"traveling salesman problem". (Note that with the double-elled ‘travelling', one gets about 25,000 hits.) Here, we aim to give a few key educational sites that are appropriate for third or fourth year students. (These were last accessed in June 2006.)
5.1. General web sites
  • TSPLIB. "This page intends to be a comprehensive listing of papers, source code, preprints, technical reports, etc, available on the Internet about the Traveling Salesman Problem (TSP) and some associated problems." http://www.ing.unlp.edu.ar/cetad/mos/TSPBIB_home.html orhttp://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/.
  • Concorde. General site concerning the TSP. Highlights: "The full source code to the Concorde network optimization package, as well as executables for various platforms, and a Windows graphical user interface to Concorde's traveling salesman solver are available for academic research use." http://www.tsp.gatech.edu/.
  • COIN-OR, www.coin-or.org. A repository for open-source code for O.R. Notably, Cbc (COIN Branch and Cut) can be used as a C++ library, as a standalone solver, and as an AMPL solver (with all Cbc options available from an AMPL script). In addition, the Coin-OR website has a variety of other solvers and algorithms available for linear, nonlinear, continuous, and discrete optimization.
  • NEOS Server for Optimization. NEOS provides open access to a large number of solvers which run on NEOS servers. The Concorde server at NEOS provides for exact solution using the TSP solver of Applegate, Bixby, Chvátal, and Cook. In addition, the Concorde server provides for solution using the Lin-Kernighan heuristic. For integer programming, the open-source solver Cbc and others are available. Several of the IP solvers on NEOS provide for user input via AMPL, in two different ways. First, AMPL files can be uploaded to the server and then AMPL and the desired solver execute on a NEOS server. Second, a user can run AMPL locally on their own machine, and only use the server as a back-end solver. This latter possibility utilizes the Kestrel client/server. The advantage of using Kestrel is that the solution provided by the remote solver is returned to the running AMPL session on your own machine, where, for example, a script can then continue. Also, if one has a complicated script that calls many different solvers, this is the only way to proceed if the solvers are available on different server machines of NEOS.http://neos.mcs.anl.gov/neos/.
  • Chvatal's web site, The Traveling Salesman Problem.
  • Steven S. Skiena's "The Stony Brook Algorithm Repository," has a section devoted to the TSP: The Traveling Salesman Problem. In addition, Skiena (1997) has a section on the TSP.
  • Website of the 8th DIMACS Implementation Challenge: The Traveling Salesman Problem. Instances. results and comparisons. The Traveling Salesman Problem.
  • Wikipedia article, The Traveling Salesman Problem.
  • An introductory article by Karla Hoffman, The Traveling Salesman Problem.
  • Fractal Instances of the Traveling Salesman Problem. Resources on generation of instances of the TSP with known optimal solution. Fractal Instances of the Traveling Salesman Problem.
  • A few links and experimental resources. The Traveling Salesman Problem.
5.2 Optimal solutions and other scripts
5.3. Heuristics
6. References
Applegate, David, Robert Bixby, Vasek Chvatal, and William Cook (1998), "On the Solution of Travelling Salesman Problems," Documenta Math., Vol. 3, pp. 645-656.
Brown, Gerald G. (2004), "Top Ten Secrets to Success with Optimization,"The Phalanx, Vol. 37, No. 4, p. 12ff.
Dantzig, G., Fulkerson, R., and Johnson, S. (1954), Solution of a Large-Scale Traveling-Salesman Problem, Operations Research, Vol. 2, p. 393.
Edmonds, J. (1965), "Maximum matching and a polyhedron with 0,1-vertices," J. Res. Nat. Bur. Stand. B, Vol. 69, pp. 125-130.
Fourer, R., Gay, D.M., & Kernighan, B.W. (2003) AMPL: A Modeling Language for Mathematical Programming, Thomson Publishing, Pacific Grove, CA.
Guignard, Monique, and Kim, Siwhan (1987), "Lagrangean Decomposition: a Model Yielding Stronger Lagrangean Bounds," Math Programming, Vol. 39, pp. 215-228.
Held, Michael, and Karp, Richard M. (1970), "The Travelling-Salesman Problem and Minimum Spanning Trees," Operations Research, Vol. 18, No. 6, Nov-Dec, pp. 1138-1162.
Held, Michael, and Karp, Richard M. (1971), "The Travelling-Salesman Problem and Minimum Spanning Trees: Part II," Math Programming, Vol. 1, pp. 6-25.
Hillier, F. S. and G. J. Lieberman (2001), Introduction to Operations Research, 7th edition, McGraw-Hill, Boston, MA.
Lawler, E.L., J.K. Lenstra, A.H.G. Rinnooy Kan, and D.B. Shmoys (1985), The Traveling Salesman Problem: a Guided Tour of Combinatorial Optimization, John Wiley & Sons, Chichester.
Lawrence, John A., and Pasternack, Barry A. (1998), Applied Management Science, John Wiley and Sons, New York, NY.
Martin, R. Kipp (1991), "Using Separation Algorithms to Generate Mixed Integer Model Reformulations," Operations Res. Letters, Vol. 10, pp. 119-128.
Nemhauser, George L, and Wolsey, Lawrence A. (1988), Integer and Combinatorial Optimization, John Wiley & Sons, Inc.
Olinick, Eli (2006), available online at  (last accessed on 7 March 2006).
Pataki, Gabor (2003), "Teaching Integer Programming Formulations Using the Traveling Salesman Problem," SIAM Review, Vol. 45, No. 1, pp. 116–123.
Ragsdale, Cliff (2004), Spreadsheet Modeling & Decision Analysis, 4th edition, Thomson-Southwestern, Mason, Ohio.
Reinelt, Gerhard (1994), The Traveling Salesman: Computational Solutions for TSP Applications, Springer-Verlag, Berlin.
Baker, S.F., Morton, D.P., Rosenthal, R.E., and Williams, L.M. (2002), "Optimizing Strategic Airlift," Operations Research, Vol. 50, July-August, pp. 582-602.
Skiena, Steven S. (1997), The Algorithm Design Manual, Springer-Verlag, New York.
Winston, Wayne (2003), Operations Research: Applications and Algorithms, 4th edition, Duxbury Press, Belmont, CA.

To download a printable version (pdf) of this paper, click here. To download the Adobe Acrobat reader for viewing and printing pdf files, click here.
To reference this paper, please use: 
Lee J., and J.F. Raffensperger (2006), "Using AMPL for teaching the TSP," INFORMS Transactions on Education, Vol. 7, No 1, http://ite.pubs.informs.org/Vol7No1/LeeRaffensperger/