Frequently Asked Questions
- Part 2 -
Technical Questions

Q: How can I restrict software generation (*.prog files) to just the *programmable* devices in my system?
A: To restrict software generation to only the programmable processors:
   1. Add the following line into the header of your
        programmable device models:

                programmable

        It should be inserted after the device-type name,
        but before the port-list, local variables, and threads.
        Like this:

                DEFINE_DEVICE_TYPE:     Sharc
                   programmable
                   PORT_LIST( p0, p1, p2, p3, p4, p5, p6 );
                   /* Local Variables */
                   int my_id, ii, pc, done;

        The magic keyword is: "programmable".

   2. Re-build your simulation to yield a new "netinfo" file.

This causes the SCHEDULER to restrict software generation
to only the devices of the "programmable" type.
(See the Scheduler documentation about the -e option.)

Q: How to get statistics or metrics for performance/resource utilization for processing elements in a model? How can we tell the % utilization of a PE over a given time period?
A: There is a file called summaries.dat that gets generated by the models in the Perf.Mod.Lib. (core_models) after you run a simulation. Simply view it; it lists the processor utilizations.

The file is created for writing by the "monitor" model and written to by each of the PEs at the end of the simulation.

The time-window, over which the utilization is calculated, is set by the global variables Time1 and Time2. These are defined in parameters.sim. By default, they are set to:

Time1 = 0.0 Seconds
Time2 = 1.0 Seconds
You may wish to set a different time window over which to take statistics.
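For example, to take statistics over the window from 0.5 to 2.5 Seconds, parameters.sim could contain something like the following (a hedged sketch; the exact declaration style in your parameters.sim may differ):

```c
/* Measurement window for utilization statistics (Seconds): */
double Time1 = 0.5;     /* start of the statistics window */
double Time2 = 2.5;     /* end of the statistics window */
```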

Q: How to get histograms and similar plots of the CPU utilization of the modeled system?
A: Use UtilHist to extract the processor-utilization results from your summaries.dat file to produce a histogram plot file.
Then plot with:     xgraph   utilhist.dat

If the above utility does not suffice, in general you can collect and plot data from your models by writing the data to a file, and then calling XGRAPH on the file. XGRAPH will plot the x-y data points. See XGRAPH.

You can add your required statistics to your models. I'll explain briefly how it would be done.

For example, to show a utilization histogram, have each PE write its utilization to a file:

  Declare a global file in parameters.sim:
	FILE *utilfile;

  Then open the file in the monitor,
   and set the histogram-bar thickness:
	utilfile = fopen("utilhistogram.dat","w");
	fprintf(utilfile,"thickness = 4.0\n");

  Then write from the PE model:
	fprintf(utilfile,"%d 0.0\n", MY_ID);
	fprintf(utilfile,"%d %f\n", MY_ID, utilization);

  Remember to close the file at the end of the simulation
   in the monitor.

This will create a file with entries like:

	thickness = 4.0

	10 0.0
	10 87.9

	12 0.0
	12 62.3

Each set of four numbers defines a histogram-bar, which is a line segment defined by two pairs of x-y points. In this case it is a vertical line extending from y=0.0, i.e. the x-axis, up to the utilization level of the processor, e.g. 87.9%. The x position is the processor's logical ID number.

You can get much fancier with these graphs, for instance by adding axis titles, graph titles, colorizing, etc.

To view the graph, type:

xgraph utilhistogram.dat

Q: With regard to Data Flow Graphs, if you have a task-B, for example, that is receiving messages from task-A, is it possible to have task-B examine the message and pass on a message to task-C or task-D based on the contents of the message received from task-A?
A: You can make conditional flow-graph nodes (i.e. branch nodes, valves, etc..) by using the Dynamic Scheduler. See appendix-C of the Scheduler document. (Specifically section 4.3.) Note: The Dynamic version of the Scheduler is not part of the externally distributed CSIM package.

Q: For some reason, the RECEIVE command returns a length value for me that is always 0. I am attempting to use the length returned from RECEIVE to set the length for a subsequent send. The result is that the messages are being sent, but the length of the message appears as 0.
A: The length value returned by the RECEIVE function is calculated from the time elapsed during the transfer:
length = (T2 - T1 (Seconds)) * (Link_xfer_rate (Bytes/Second))
Notice the units: Seconds cancel out, leaving only Bytes. T1 is the time at the start of the transfer, and T2 is the time at the end of the transfer.

This estimate is fairly accurate when the link-rate and message length are finite quantities. However, it cannot be accurate for extreme values, such as infinite link rates.

You might check the link rate where the RECEIVE is occurring. If it is set to "*", then it is infinite, so the transfer time (T2 - T1) will be zero and the length calculation will underflow to zero in your computer.

If the above is true, and you actually want zero-delay transfers, then the length parameter is not appropriate for you. You would do better to pack the length value into a field of the message.

Background: The RECEIVE function was designed this way (to return a length estimate) to support models where transfers may be preempted. Please see Message Preemption. Without this feature, the receiver of a message would not know how much of the message was received, i.e. whether the whole message came across, just a portion, or nothing. Obviously, this applies only to certain kinds of models, such as those used in network architecture studies.

> I have found 3 csim references to Verbosity.
> (1) There is a command line argument described on p. 21, section 4.1, of
> the CSIM Documentation which describes a command line argument.  The
> scope of this command line argument is stated as being limited to the
> precompiler.
> (2)  There is a menu item on the runtime gui that lists three categories
> which can be selected.  I am not certain whether choosing one of these
> items turns on that specific type of verbosity or chooses it instead of
> the other two.
> (3) There are statements in the core library such as If (VERBOSITY >
> XXX) ...  It is not clear how the variable VERBOSITY is set by a user.
> My question is "How do I set the VERBOSITY variable described in #3
> above.
A: Verbosity is, in general, the degree of textual output to put-out while processing. All the CSIM tools, and many of the models, have some control over their verbosity. Each is distinct and individually controllable. It is probably useful to divide the verbosities into three groups:
	1. Tools, 2. Simulator, 3. User-Models

 1. Verbosity levels of the CSIM TOOLs:		(all but sim.exe)
	Users can set the verbosity level of each CSIM tool by the
	"-v xxx" command-line option when you run each tool.  
	You cannot create new verbosity levels or messages.
	You can only choose from the available settings.

 2. Built-in verbosities of the SIMULATOR:	(sim.exe only)
	The verbosity of the simulator is set interactively;  not by
	command line argument.  It can be changed during simulation.
	There are three independent verbosities that can be selected
	within a simulation.  They are produced by the simulator and
	are independent from your models.  For example, you can have
	it print the time in your text window whenever it changes,
	so that it intersperses with your other printf's.  And/or you 
	can monitor every SEND/RECEIVE event.  And/or you see the
	event-queue on every event (this one is very heavy-duty).

 3. Verbosity of individual MODELs:
	Users can set and create new verbosity levels and messages 
	in their models.  (This is your code.)
	We'll call this category: "User Verbosity".
You seem to be interested in the third category. There are as many ways to control User-Verbosity as there are users and models. However, in CSIM's Computer-Architecture Performance Model Library (Core-Perf-Mod-Lib) models, this is how it works:
	There is a global integer variable, VERBOSE, that is visible
	from all model-code.  It is declared in subroutines.sim, and
	it is usually set in "Monitor.sim".   Monitor.sim is the
	logical place to set the verbosity level, because it is the
	central place to put all simulation oriented things that do
	not have anything to do specifically with a modeled element.

	If you are debugging something far into your simulation, you
	could do something like this in the Monitor:

		VERBOSE = 0;
		DELAY( 10000.0 );
		VERBOSE = 5;

	Sprinkled throughout the model code are comparisons to 
	VERBOSE.  Feel free to add your own, as needed.

	So the answer to one of your questions is:  

		Set the User-Verbosity, which controls the verbosity 
		of your models, in the "Monitor.sim".

>  When I attempt to open a leaf node, e.g. PE4, in demo1,
>  a vi editor appears.  The problem is that the window containing
>  the vi editor erases everything behind it, so I can't read the
>  diagram on the window behind the vi editor, until I close the
>  vi editor.
A: The GUI is modal, meaning it allows you to do only one thing at a time: editing the diagram or editing a text file. When you open a text file with the editor, we force the GUI to sleep until you finish editing. We have it this way for safety, to keep all files in sync. Otherwise, you could accidentally open multiple editing sessions on the same file and potentially lose work.

When the GUI opens the text editor, it replaces its canvas with a message to "Close the text editor before continuing". However, I suspect that message might have been obscured by the vi editor itself, depending on the size of your screen.

Q: The curves on XGRAPH are quite invisible when printed on my printer. Can you tell me how to plot a curve with a line thicker than the default?
A: You can control line thickness by placing the command:
thickness = 10
in your data file. (The value 10 is an example; the default is 1.0, 2.0 is double thickness, etc.)     See: Thickness

However, the printing problem you describe is more likely due to the color rendering of your printer, assuming you are not using a color printer. Often black&white print-drivers render colors (such as red) as medium gray tones which are difficult to see.

If sending to a B&W printer, select the "B&W/Color" button on the print-dialogue. You should see the message:

"Toggled printer-setting to Black&White printer."
in your text window. The lines should then print black and crisp.

Q: Now that I have a ProcTline file and associated Spider plot file arranged the way we want, how do we select different colors for the plots other than manually editing the ProcTline.dat file?
A: The ProcTline.dat and Spider.dat files are produced by the models. There are several ways to set the colors:
  1. Modify the models to set the colors differently.
  2. Modify the .dat files manually.
  3. Modify the .dat files automatically.
I assume you already thought of the first two methods, but you might not be aware of the third.

There is a utility, called "filter", in the directory $CSIM_ROOT/tools/$CSIM_MTYPE/general_utilities. You can use filter to change the colors without manually editing the .dat files. Therefore, you can put filter commands within scripts that are run automatically.

(If you place the $CSIM_ROOT/tools/$CSIM_MTYPE/general_utilities directory in your $PATH, then you can give the commands directly, without the long path name!)
Specifically, the colors are set in the .dat files by lines of the following kind:
        color = Red
        color = 9
You want to change the color names or numbers specified on these lines. The filter utility does global search-and-replace operations. For example, the following command:
filter ProcTLine.dat   -s   "color = Red"   -r   "color = Blue"   ProcTLine.dat
... will read the ProcTLine.dat file and search for the phrase: "color = Red" and replace it with: "color = Blue". And it will write the resulting file back to ProcTLine.dat.

Here is the man-page on filter:

filter - A general purpose file-filtering utility. Does global search and replaces on files of any size. Expects the source file to be the first file on the command-line. If no file is specified on the command-line, it will prompt you. By default, if no destination file name is specified, it will write the filtered output to a file of the same root-name as the source, but with ".new" suffix. If you wish to write the file to a specific name, then specify that name as the second file argument.
Filter correctly handles replacing a file by itself. (It automatically uses a temporary file.)
You may specify the search and/or replacement string(s) on the command line also by prefacing them with the "-s" or "-r" command-line options respectively. If you do not supply them, filter will prompt you for them.
                filter   file1   -s   "was"   -r   "is"
This would create "file1.new".
                filter   file1   file1   -s   "was"   -r   "is"
This replaces "file1" with its filtered version. Using the "-v" verbose option will show the replacement lines as they occur (after replacement). Filter always provides a summary of the number of replacements made.
There are many other similar automatic methods as well, such as using sed or awk. But I am not aware of a way to do this as simply with them, by a single-line command, as with filter. If interested, see the man pages on sed and awk.

Q: How to use C++?       How to mix C++ models with C?
A: The following is one method for linking C++ code with C programs. The first thing to know is that linking works the same for both C and C++; therefore, you can use C functions and global variables in C++, and vice versa, by matching the symbol names correctly. However, one difficulty is that the C++ compiler mangles symbol names to support features such as templates and namespaces. Note also that complex C++ class types are not directly supported in C. So, to link a C++ object file with a C object file, you need to do a couple of tricks to get around these limitations.

First, force the C++ compiler not to mangle the symbol names that you will share with the C program, by declaring those symbols (variables and functions) within an extern "C" block. This is usually done in a common header file, but C compilers will not understand extern "C". The solution is to conditionally compile the extern "C" keywords in only when compiling as C++.

The other trick lets you use C++ objects in a C program. The trick is to create special wrapper functions in your C++ code that the C compiler can understand. The functions take arguments of type void * instead of the complex class type and then cast the void * variable back to its original type. Below is an example:

	---=== begin ===---
	#include "foo.hh"

	class Foo {
	  public:
	    int bar1;
	    char *bar2;
	    Foo() { bar1 = 0;  bar2 = 0; }
	};

	void *make_foo()
	{
	    Foo *new_foo;
	    new_foo = new Foo();
	    return( (void *)new_foo );
	}

	int get_bar1( void *a_foo_var )
	{
	    Foo *a_foo = (Foo *)a_foo_var;
	    return( a_foo->bar1 );
	}
	---=== end ===---
	---=== begin foo.hh ===---
	#ifdef __cplusplus
	 extern "C" {
	#endif /* __cplusplus */
		void *make_foo();
		int get_bar1( void *a_foo_var );
	#ifdef __cplusplus
	 }
	#endif /* __cplusplus */
	---=== end foo.hh ===---
	---=== begin c_prog.c ===---
	#include <stdio.h>
	#include "foo.hh"

	int main()
	{
	    void *my_foo;
	    my_foo = make_foo();
	    printf( "the bar is %d\n", get_bar1( my_foo ) );
	    return( 0 );
	}
	---=== end c_prog.c ===---
Functions are extern by default. You do not need to declare them that way.

For an example of using C++ directly in CSIM models, see C++ Example.



Q: What pending event queue or future event list structure is CSIM using. Does it use the Calendar Queue structure?
A: CSIM includes a couple of queuing mechanisms which are selected based on the statistics of the pending event set. All are equivalent to a time-ordered event list. One is an n-ary sort-tree, for large event lists, and gives log-order access and traversal. The n-ary tree is dynamic, efficient, and continually (reasonably) self-balancing. This is superior to the basic Calendar Queue structure, which maintains a single array of linked lists.

Q: Are there any limitations to using local variables within each thread?
A: On some platforms, the amount of stack space per thread may be limited. Local variables are allocated from the stack space. This is sufficient for a few scalar variables, since the minimum stack space is usually at least 16K Bytes. But large local arrays could exceed the stack, so they should be allocated dynamically, since dynamically allocated memory comes from the virtually unlimited heap. Remember that the stack space must support all nested subroutines; some system calls, like file-I/O with complex formatting, may consume a good portion of that 16K. On other platforms, thread stack space grows as needed. But it is always good to be careful.

In addition to dynamically allocating data structures, three other methods are available to avoid thread stack limitations:

  1. Use CSIM's TRIGGER_THREAD_STCKSZ( threadname, delayamnt, var, requested_stacksize_in_bytes ).

  2. Use CSIM's CALL_THREAD instead of TRIGGER_THREAD. CALL_THREAD does not create a separate actual thread, but instead runs model code under the main process thread, which has virtually unlimited stack (> 4GB).

  3. Use globally declared data structures. These are virtually unlimited, since they are statically allocated, not placed on the stack. (We have also observed faster access to statically allocated storage than stack, probably because addressing is absolute and direct, while stack access is indirect, i.e. relative to the stack pointer.) Just use distinct data items for each box instance. If multiple instances exist, then, for example, create an array of a data structure and have each thread operate on its own index. You could use MY_ID, since that is guaranteed to be unique to each box, to start at zero, and to be contiguously assigned. For example, if you expect fewer than 100 boxes, then you could declare the array(s) to be of dimension 100.
Symptoms of running out of stack space include confusing results. Generally there is no immediate warning or error message; one thread simply collides with and overwrites another thread's stack, and the first observable effects may then occur far from the fault. Tools like Valgrind can detect and isolate these situations quickly and precisely.

Q: Are there limitations to how many threads can exist at a given time?
A: On some platforms, there is a limit to how many threads can co-exist. On Linux platforms, this limit can be controlled by the system variable ulimit. On some older 32-bit distributions it was set by default to 16K, which is usually more than enough, but it can easily be raised if needed. On newer 64-bit Linux distributions, either there is no limit, or it is far higher than could ever be practically used (e.g. 4,294,967,295).

On Microsoft platforms the thread limit is known to be about 800 threads. We know of no way to expand the limit on that platform, so this appears to be a hard limit on that OS family.

On Sun Solaris, we have found no limits other than total system memory. We have operated several million simultaneous threads under Solaris without any problems.

More FAQs - New PerfMod User Questions
