CCP V3 for the Iris.3 Mobile Robot Project

Program by Andy Schmidgall

Report by Bob Arrigo

PART I. A Quick Introduction to Iris.3

IRIS.3 (read "iris-dot-three") is the name given to the current robot being developed by the Mind Project. IRIS.3 represents the IRIS design team's more focused goal of building an intelligent robot that can interact with the world in an expandable number of ways. IRIS.3 is being created with the idea that many different developers may produce peripheral abilities for the robot that can be implemented independently of her other components. IRIS.3 will be composed of a number of different computer platforms and programs networked together in a manner conducive to achieving maximum expandability; IRIS.3 must be platform independent and able to 'grow.' To achieve this end, careful planning and design have been necessary.

To better explain IRIS.3 we will use a simple analogy: IRIS.3's architecture is in some ways analogous to the architecture of a human. Where humans have a Central Nervous System, IRIS.3 has the Central Control Program (CCP). Just as the Central Nervous System coordinates the body's many systems (cardiovascular, muscular, etc.), the CCP coordinates the major systems of IRIS.3. The CCP handles communication between systems; it receives input and produces output specific to each system it controls. In this manner different and otherwise incompatible components and systems may cooperate, much like different components and systems in your body cooperate. One such example would be your hands and eyes. Though entirely separate from your eyes, your hands require the capabilities and functionality of your eyes to know where and how to move. Similarly, IRIS.3 will possess a robotic hand that will be coordinated through the CCP with other systems, including a vision system, so that IRIS.3's hand movements will be both precise and meaningful. The human body has all sorts of systems that provide functionality for movement, thought, communication, and sensory perception. One goal of IRIS.3 is to provide similar systems that allow for much of the same functionality. But here the analogy between human and IRIS.3 breaks down: where the human body is confined to certain biologically determined abilities, IRIS.3 will be as expandable as her developers wish her to be. This is made possible in large part by the careful implementation of the Central Control Program.

PART II. A High Level Look at the CCP

The CCP, also known as the IRIS.3 Windows Central Control Program, is a computer program whose duties are twofold:

1) Handle the data transactions between programs

2) Keep a log of these activities

The CCP is basically a router of information. The CCP will take data sent to it by the other programs and systems that compose IRIS.3 and, depending on the incoming data, take certain actions and relay certain information. Here it would be acceptable to view the CCP as a type of interpreter. Suppose you had a task to complete with a few other students in your dorm. Now imagine that a couple of the students with whom you must cooperate are exchange students who do not speak any English. One may speak German, one may speak Cantonese, and a third may even speak Spanish. There is no way for all of you to communicate directly with one another; you need a translator. You need to find a person who is capable of both listening to what each member of your group has to say and conveying this information to the other members of the group to whom the message is directed. In a case like this a good translator is invaluable; when it comes to 'dialog' between computers a good translator is mandatory.

To understand the need for an interpreter between computer programs we must first understand the concept of Modularization. As any student of Computer Science can tell you, Modularization is one of the basic design philosophies employed today for creating computer programs and systems. Modularization is the idea that any given program or system can be broken down into small, distinct, manageable parts called modules. Each of these modules is a sub-program; it will have certain jobs to fulfill and will require that other modules fulfill their jobs. The sum of all of these little modules composes what you might commonly see as a computer program, but this concept expands even further. Suppose we consider your computer as a whole. Any individual program on your computer may have one or more jobs. You may use Internet Explorer to surf the web, Microsoft Word to type papers and letters, and Winamp to play MP3 music files. Yet, in totality, all of these programs combined with many other programs running on your computer will yield what is most commonly called a Personal Computer (PC). If asked what your computer does, you might rightly say that it helps you to type papers, play music, and surf the net. Your computer certainly does perform all of these tasks, and yet each of these duties is performed by separate modules that work with a program called your operating system to produce all of the functionality you may commonly attribute to your computer as a whole. So now we should have a pretty clear understanding of what Modularization is: the different jobs of your computer and its programs are broken up into separate, smaller jobs that are fulfilled by objects called modules.

The applications of Modularization are endless. One such use allows different programs and sub-programs running on the same machine to act independently of one another. But what if these modules need to talk to one another? You are already familiar with this concept of programs talking to one another, even if you didn't recognize it as such. Every time you surf the web (which is how you got here, isn't it?) your computer, a module, is communicating with other computer modules throughout the world. These modules may have vastly different roles. They may have been written in different computer languages or run on different types of computers. And yet they can all 'talk' to each other. This communication is much like the communication we imagined earlier between you and some foreign exchange students, and again an interpreter is needed. In our case the interpreter is really just a set of standards that some computer company or governing board put together to give computers a standard way to talk to each other. This idea of a standard should be fairly comfortable for you; just imagine how you would talk with other human beings if we didn't have a standard language like English or French. It would be chaos! One way to implement a standard of communication is to say that all programs that want to talk with each other must support a particular standard. For instance, to communicate with and over the internet a computer or other such device must support the Transmission Control Protocol/Internet Protocol standard, a.k.a. the TCP/IP standard. Another way for different computers and programs to communicate with one another is through an actual software or hardware translator serving as a 'router' which listens to programs talking in any number of different languages, analyzes what they've said, and then distributes the message(s) to the proper recipients. And thus we have come full circle back to IRIS.3 and the CCP.

As we learned earlier, IRIS.3 is a collection of different systems, which we can now call modules, that together produce a robot. However, each system is entirely separate from the others. These individual modules, like the Vision module or the Robot Arm module, have specific jobs that they are very good at doing, but producing a functional robot requires that these modules coordinate their activities and exchange data with one another to accomplish 'robot-wide' tasks. We also learned that for different modules to communicate with each other a translator is needed that can route messages and data between the systems. The CCP, then, is fundamentally a router of information and will act as a translator between any new or existing systems that the developers of IRIS.3 want her to make use of.

One aspect of the CCP that we have not mentioned yet is its ability to communicate with programs written in other languages or running on entirely different systems altogether. This capability was implemented by its creators to allow for maximum expandability of IRIS.3 across platforms and programming languages. It allows new modules for IRIS.3 to be written in any language and for any system. As long as each of these modules can communicate with the CCP, its designers will be able to easily integrate them into the existing robot without rewriting all of the other programs or reconfiguring the rest of the robot. This means that even after IRIS.3 is complete a computer programmer may write a new program that performs some task like playing Tic-Tac-Toe. This new program will be able to talk with IRIS.3's Vision system to see the Tic-Tac-Toe board and her Robot Arm to draw X's and O's. Such capabilities allow IRIS.3 to be as useful and as expandable as the ingenuity of her developers allows.

We now have all of the knowledge needed to walk through a simplified example of the CCP at work. As of right now, when users wish to interact with IRIS.3 they will primarily work with ProtoThinker, a natural language processing program. ProtoThinker can understand and appropriately respond to English sentences provided by the user. The user provides these sentences either by typing them into the program or by speaking them into IRIS.3's ears (a microphone), where they are eventually converted into regular text. If the user wishes to have IRIS.3 perform a task like picking up a block and putting it into a cup, the user can simply tell her to do so through ProtoThinker by telling it to 'Put the block in the cup.' ProtoThinker is smart enough to recognize this as a command that must be handled by another part of the robot. To send this instruction to the proper recipient, the Robotic Hand program, ProtoThinker will establish a connection with the CCP. This is analogous to picking up a telephone to call someone: you are in essence attempting to establish a connection with the outside world. Once ProtoThinker has established a connection with the CCP it then sends the necessary data, in this case the command 'Put the block in the cup,' through that connection to be received by the CCP. The CCP will receive the command, close the connection, and finish the job of communication by looking up the command it received in a Command Action File look-up table and then performing the corresponding action it finds in the table. In our example the Command Action File look-up table would tell the CCP to execute the arm control program and send it the command, 'Put block in cup.'

And so we conclude our high-level discussion of IRIS.3's Central Control Program. For more information on this program continue reading the next section (Part III), which will provide a more in-depth look at how the CCP works.

PART III. A Mid Level Look at the CCP

This section of coverage on the Central Control Program (CCP) assumes that you have read Parts I and II of the CCP's documentation, have a good understanding of computers, and have some programming experience under your belt. Before we begin our discussion on the CCP you will first learn a bit about GTK+/GLIB, multi-threading, and sockets.

A. Background Information

1. GTK+/GLIB

GTK+ stands for the GIMP Toolkit, and GIMP, in turn, refers to the GNU Image Manipulation Program. GTK+ is a toolkit of software libraries that includes GLib. These libraries provide support for threads, GUIs, and other useful functions. One advantage of using GTK+ is that programs written in a language like C or C++ using the toolkit can easily be made to run on both Windows and Unix machines. This portability is at the source level: a program can be moved between systems by recompiling it on the target platform, without having to rewrite its code. GTK+ is mentioned here because it was used for much of the development of the Central Control Program. One of the primary requirements of the CCP was portability, so using GTK+ was an obvious choice. The CCP's entire user interface was written with calls to the GTK+ toolkit, as was much of the rest of the program. To learn more about this toolkit, see www.gtk.org.

2. Multithreading

This section on multithreading is meant as a general introduction to the idea of concurrent programming. The Central Control Program utilizes this concept as a basic part of its execution. Concurrent programming is implemented here by multithreading, which refers to a single program having multiple (two or more) threads of execution running at a time. A thread is defined here as a sequence of instructions or steps for the computer to carry out. A thread is the smallest part of a program that can have an independent existence. When a uniprocessor computer such as a standard PC is running, it can execute only one thread at a time. When using popular operating systems like Windows you may often hear about their ability to multi-task. Do not confuse the idea of multithreading with multi-tasking. Multi-tasking refers to a single computer running multiple applications at the same time, whereas multithreading refers to a single application having multiple threads of execution that run concurrently. An example of multi-tasking is when your computer is seemingly running both MS Word and Internet Explorer at the same time. In reality, however, your computer is only processing one application at a time. The CPU divides its time among the programs running on the system. It executes a part of the code, saves the necessary data, and moves on to the next program or thread that needs to be executed. In this way each program and its threads get a certain portion of the CPU's time.

For many programs just one thread of execution is ample but there are situations where multiple threads may be needed to allow the program to perform appropriately. One example of multithreading that you should be familiar with occurs when you use most word processors like MS Word. After you start up MS Word and begin typing you may notice that as you type the program is constantly checking what you have written for spelling and grammatical correctness. What is going on behind the scenes here is an example of two threads working together to perform one task (word processing). The first thread is responsible for receiving data from the keyboard and mouse, processing it, and updating the screen. Seemingly at the same time, as the CPU switches from one thread to another, the second thread will independently analyze what has been typed and alert the user of spelling and grammatical mistakes. In this way the threads are like separate programs that operate on the same data (the text) yet they are contained within a single application. Because threads share the same resources like data or address space there must be a manner of synchronizing the threads. There are a number of approaches to synchronization but the basic goal of any approach is to prevent one thread from corrupting the data of another thread. This would occur if one thread changes the value of some variable or object when another thread was expecting the data to remain unchanged. Careful programming is required to implement synchronization and it is an important topic to understand when working on multithreaded programs.

There are a number of different libraries and languages that allow multiple threads within the same program. The CCP uses POSIX threads (pthreads), which many software development environments provide through the header <pthread.h>.
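The synchronization problem described above can be sketched with POSIX threads. This is a minimal illustration and not code from the CCP: the names shared_counter, counter_lock, worker, and run_workers are hypothetical. Several threads update the same variable, and a mutex ensures that only one of them changes it at a time.

```c
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

/* Hypothetical shared data, not taken from the CCP source. */
static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds INCREMENTS to the shared counter, one step at a time. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);    /* only one thread may update */
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts the threads, waits for them to finish, and returns the final
   count, or -1 if a thread could not be created. */
long run_workers(void)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;

    shared_counter = 0;
    while (started < NUM_THREADS) {
        if (pthread_create(&threads[started], NULL, worker, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started == NUM_THREADS ? shared_counter : -1;
}
```

With the lock in place, run_workers() returns NUM_THREADS * INCREMENTS. Without the lock and unlock calls the increments of different threads could interleave and some updates could be lost, which is the kind of data corruption that synchronization is meant to prevent.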

This section has been a very basic introduction to concurrent programming with multithreading but it will be sufficient for our purpose, to understand the CCP. For the curious reader, more information regarding threads can be found by clicking here:

Introduction to Programming Threads

3. Sockets

Sockets are objects through which applications may send and receive data to and from other sockets and applications independently of the network protocol used. A number of different socket libraries exist and are commonly used; the CCP uses two of them. When running on Windows systems the CCP uses winsock.h and when running on Unix systems the CCP uses socket.h. One goal of sockets is to abstract away the particular implementation of the network being communicated through. In this manner a standard socket can communicate with other sockets and programs running on the same machine as itself just as easily as it can communicate with a distant Internet Service Provider or other platforms connected to it. A socket is associated with a particular running application and has a unique 'name' to identify it. A socket may serve one of two roles: it may act as either a server or a client. If the socket is a server it will be initialized and then it will simply listen for another socket to request a connection. In the other role, a client socket asks the server socket for a connection, which can be either accepted or denied. Once a connection is established data can flow freely between the server and client. Sockets are said to be full-duplex, which means that they can communicate in both directions at the same time, which of course adds to their usefulness. Next we will see an example of this usefulness in action when we consider the CCP and its implementation of a socket server system.
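The server and client roles described above can be sketched with the Unix socket API. This is an illustration under stated assumptions, not the CCP's code: the function name loopback_roundtrip is hypothetical, both sockets live in one process on the loopback interface, and the operating system picks the port. A single thread can play both roles here because a pending connection waits in the listening socket's queue until accept() is called.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends a short message from a client socket to a server socket on the
   loopback interface and copies what the server received into out.
   Returns the number of bytes received, or -1 on error. */
long loopback_roundtrip(const char *msg, char *out, size_t out_size)
{
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof(addr);
    int server = -1, client = -1, conn = -1;
    size_t total = 0;
    long result = -1;

    if (out_size == 0)
        return -1;

    /* Server role: create a socket, bind it, and listen for connections. */
    server = socket(AF_INET, SOCK_STREAM, 0);
    if (server < 0)
        goto done;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the OS choose a free port */
    if (bind(server, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(server, 1) < 0 ||
        getsockname(server, (struct sockaddr *)&addr, &addr_len) < 0)
        goto done;

    /* Client role: request a connection to the server's address. */
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0 ||
        connect(client, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;

    /* Server accepts the pending connection. */
    conn = accept(server, NULL, NULL);
    if (conn < 0)
        goto done;

    /* Client sends, then closes its sending side to mark the end. */
    if (send(client, msg, strlen(msg), 0) < 0)
        goto done;
    shutdown(client, SHUT_WR);

    /* Server reads until end of stream or until the buffer is full. */
    while (total < out_size - 1) {
        ssize_t n = recv(conn, out + total, out_size - 1 - total, 0);
        if (n < 0)
            goto done;
        if (n == 0)
            break;
        total += (size_t)n;
    }
    out[total] = '\0';
    result = (long)total;

done:
    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    if (server >= 0) close(server);
    return result;
}
```

In a real system the client and server would normally be separate programs, possibly on separate machines; only the address passed to connect() would change.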

B. The Action File

Technically, an action file is not a part of the Central Control Program. However, familiarity with action files is necessary to understand how the CCP works. There are two files, action_file.h and action_file.c, included with the CCP project that compile into a Dynamic-Link Library (DLL). The action_file.dll contains the Application Programming Interface (API) and data structures required to create and use an action file. Basically, an action file is a simple data file that consists of two columns of information, a command and an action, separated by a colon:

command : action

put the block in the cup : run arm.exe block in cup

initiate text to speech : run tts.exe

These files will be used by the CCP to look up commands that it receives and take appropriate action. The Action File DLL can be used to create new action files, add more lines to existing files, load files, and save files. The CCP is capable of interacting with the Action File DLL to perform all of its action file needs.
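The file format shown above can be sketched as a small parser and look-up routine. This is a minimal illustration and not the Action File DLL's actual API, which this report does not list: the type af_entry and the functions af_parse_line and af_lookup are hypothetical names, and the sketch assumes that the first colon on a line separates the command from the action and that commands are matched exactly.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define AF_FIELD_MAX 128

/* Hypothetical in-memory form of one action file line. */
typedef struct {
    char command[AF_FIELD_MAX];
    char action[AF_FIELD_MAX];
} af_entry;

/* Copies the first len characters of src into dst, trimming surrounding
   whitespace and truncating to fit the field. */
static void copy_trimmed(char *dst, const char *src, size_t len)
{
    while (len > 0 && isspace((unsigned char)*src)) {
        src++;
        len--;
    }
    while (len > 0 && isspace((unsigned char)src[len - 1]))
        len--;
    if (len >= AF_FIELD_MAX)
        len = AF_FIELD_MAX - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Splits "command : action" at the first colon (an assumption about the
   format). Returns 0 on success, or -1 if the line has no colon or the
   command is empty. */
int af_parse_line(const char *line, af_entry *entry)
{
    const char *colon = strchr(line, ':');
    if (colon == NULL)
        return -1;
    copy_trimmed(entry->command, line, (size_t)(colon - line));
    copy_trimmed(entry->action, colon + 1, strlen(colon + 1));
    return entry->command[0] != '\0' ? 0 : -1;
}

/* Returns the action stored for an exactly matching command, or NULL if
   the command is not in the table. */
const char *af_lookup(const af_entry *table, size_t count, const char *command)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].command, command) == 0)
            return table[i].action;
    }
    return NULL;
}
```

Given the two sample lines above, looking up "put the block in the cup" in the resulting table yields "run arm.exe block in cup".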

C. Breakdown of the CCP

First we will take a 'file-level' look at the CCP. The Central Control Program is currently broken up into four separate files: the ccp.h header file, the ccp.c source file, the server.c source file, and the iris3.h header file. The pair ccp.h and ccp.c are coupled together; they initialize the program, handle the Graphical User Interface (GUI) work, and start the server portion of the program. The (socket) server portion of the program is contained in server.c. Some constants and typedefs are contained in iris3.h. The files include a number of different libraries to accomplish their jobs. The libraries used by the CCP are:

For Windows Machines:

-winsock.h

For Unix Machines:

-types.h -socket.h -unistd.h -netdb.h -inet.h

For Both Platforms:

-string.h -pthread.h -gtk.h -glib.h
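One common way to arrange per-platform header lists like these is conditional compilation. The following is a sketch rather than the CCP's actual include block: it assumes the standard locations of the Unix headers (sys/ and arpa/), uses the _WIN32 macro that common Windows compilers predefine, leaves out gtk.h and glib.h so that it compiles without GTK+ installed, and adds a hypothetical helper, ccp_platform_name, only to show which branch was selected.

```c
/* Platform-specific networking headers, following the lists above. */
#ifdef _WIN32
#include <winsock.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <netdb.h>
#include <arpa/inet.h>
#endif

/* Headers used on both platforms (GTK+ and GLib omitted in this sketch). */
#include <string.h>
#include <pthread.h>

/* Hypothetical helper: reports which branch was compiled in. */
const char *ccp_platform_name(void)
{
#ifdef _WIN32
    return "windows";
#else
    return "unix";
#endif
}
```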

D. Detailed Flow of Execution of the CCP

In this part we will look at the Central Control Program's flow of execution during a sample run. Just as in any other C/C++ program, when the CCP is executed main() is the first function called. In main() we begin by initializing both data and GTK (with a call to gtk_init()). The next step is a call to create_main_window(), a function which uses GTK to construct a slick graphical user interface (GUI). In this call to create_main_window() much of the work to set up the program for user interaction is done, including tying buttons and 'signals' to functions. ('Signals' are similar to 'events' in event-driven programming; see www.gtk.org for more information.) When this call ends a pointer to the newly created window is returned. From here the logging functionality is set up and the main window is displayed on the screen.

The next few processing steps involve creating and initializing the thread that will run the socket server. The main thread also calls gdk_threads_enter(), which acquires GDK's global lock so that GTK can be used safely in a multithreaded program. Once created, the socket server thread runs separately from the main thread and begins its concurrent execution. Before we see what is going on with the socket server thread you will learn just a bit more about the main thread's activities.

Again, the main thread runs the GUI and will respond to user actions such as closing the program or opening an action file. When the user chooses to open an action file, a standard GTK file browser window opens and the user may locate and select the action file containing the commands/responses they wish to use. The selected file's data will then be used as the current action file. When the CCP receives a message from one of IRIS.3's other programs it will analyze the message by looking it up in its current action file, finding the proper entry, and executing the appropriate response. Again, this response may be to send data to another running program or to start a new program and send it certain command line arguments. An action file must be loaded for the CCP to process incoming requests and messages. Thus the CCP will control the dissemination of information in a dynamic way that is tailored to the needs of currently installed peripherals.
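One of the responses described above is starting a new program with command line arguments. The first step of that can be sketched as splitting an action such as "run arm.exe block in cup" into a program name and its arguments. This is an illustration only: the function action_to_args is hypothetical, and treating "run" as a keyword that means "start a program" is an assumption based on the sample action file, not something taken from the CCP source.

```c
#include <stddef.h>
#include <string.h>

/* Splits an action such as "run arm.exe block in cup" into a
   NULL-terminated argument list. The action buffer is modified in place
   and the entries of args point into it. Returns the number of arguments
   (program name included), or -1 if the action does not start with "run",
   names no program, or has too many words for the array. */
int action_to_args(char *action, char *args[], int max_args)
{
    const char *delims = " \t\r\n";
    int count = 0;
    char *word;

    if (max_args < 1)
        return -1;

    word = strtok(action, delims);
    if (word == NULL || strcmp(word, "run") != 0)
        return -1;                       /* assumed keyword is missing */

    while ((word = strtok(NULL, delims)) != NULL) {
        if (count >= max_args - 1)
            return -1;                   /* keep room for the final NULL */
        args[count++] = word;
    }
    args[count] = NULL;
    return count > 0 ? count : -1;
}
```

On a Unix system an argument list in this form could be handed to execvp() in a child process created with fork(); how the CCP itself launches programs is not covered in this report.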

As mentioned above, the socket server thread breaks off from the main thread of execution and initially enters the socket_server() function. This thread will continue executing the CCP's socket server code until the program is shut down. In this function a socket is initialized and established. Once it is up and running the socket begins listening for other sockets or programs to connect with it, and it accepts connections from clients as they come in. When a connection is accepted and established the thread calls read_data(). This function will read its client's incoming message and save it. The message is then sent to af_perform_action(), an Action File DLL function. This function will search through the current action file for the message and take the appropriate action. After this process is complete the connection is terminated and the socket resumes listening for connections.
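The accept, read, dispatch, and close cycle described above can be sketched as follows. This is not the code in server.c: the functions read_message, serve_one, serve_forever, and remember_message are hypothetical, the action_handler callback merely stands in for af_perform_action() (whose real signature is not given here), and the sketch assumes that a client sends one message per connection and then closes its sending side.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MSG_MAX 256

/* Hypothetical stand-in for af_perform_action(). */
typedef int (*action_handler)(const char *message);

/* Example handler: remembers the last message instead of acting on it. */
char last_message[MSG_MAX];

int remember_message(const char *message)
{
    strncpy(last_message, message, MSG_MAX - 1);
    last_message[MSG_MAX - 1] = '\0';
    return 0;
}

/* Plays the part of read_data(): reads one message from the connection
   until end of stream or until the buffer is full. Returns the message
   length, or -1 on error. size must be at least 1. */
static long read_message(int conn, char *buf, size_t size)
{
    size_t total = 0;

    while (total < size - 1) {
        ssize_t n = recv(conn, buf + total, size - 1 - total, 0);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                   /* client finished sending */
        total += (size_t)n;
    }
    buf[total] = '\0';
    return (long)total;
}

/* Handles one accepted connection: read the message, pass it to the
   handler, then terminate the connection. Returns the handler's result,
   or -1 if no message could be read. */
int serve_one(int conn, action_handler perform)
{
    char message[MSG_MAX];
    int result = -1;

    if (read_message(conn, message, sizeof message) > 0)
        result = perform(message);
    close(conn);
    return result;
}

/* Accepts clients one at a time on an already listening socket and
   returns only when accept() fails. */
void serve_forever(int listener, action_handler perform)
{
    for (;;) {
        int conn = accept(listener, NULL, NULL);
        if (conn < 0)
            break;
        serve_one(conn, perform);
    }
}
```

Because each connection is closed before the next accept() call, this sketch serves one client at a time, which matches the sequence described above.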

When the user terminates the CCP, the GUI is destroyed and the main thread calls gdk_threads_leave(), the GDK counterpart to gdk_threads_enter(), to release the GDK lock. The program then closes.