
Performance Analysis - Simulations and Models

Taught by Gianfranco Balbo, Gaeta

Systems

This course focuses on systems that are:

  • stochastic
  • dynamic
  • discrete-event

Introduction

Discrete-event simulation

  • purpose: modeling, simulating, and analyzing systems
  • means: computational and mathematical techniques
  • experiments on the model

A model is a conceptual framework describing a system. The simulations will use Monte Carlo-based techniques1

Model Development (typically done iteratively):

  1. Goals and objectives
  2. Build a conceptual model
  3. Convert into a specification model
  4. Convert into a computational model
  5. Verify
  6. Validate

Formalisms

  • Queueing Networks
    • concurrency
    • congestion
  • Stochastic Petri Nets
    • concurrency
    • congestion
    • synchronization

Queues can use different disciplines (scheduling algorithms) to regulate which job is handled by the server

  • FIFO
  • LIFO
  • SIRO, serve in random order
  • priority-based, e.g. SJF (shortest job first)

Basic assumptions on queues:

  • FIFO
  • non-preemptive
    • once initiated, service will continue until completion
  • conservative
    • the server never stays idle if there are one or more jobs in the service node

For a job \(i\)

  • arrival time \(a_{i}\)
  • delay \(d_{i}\)
  • service begins at time \(b_{i} = a_{i} + d_{i}\)
  • service time \(s_{i}\)
  • wait time \(w_{i} = d_{i} + s_{i}\)
  • departure time \(c_{i} = a_{i} + w_{i}\)
  • interarrival time \(\tau_{i} = a_{i} - a_{i-1}\)
  • idle time \(e_{i} = a_{i} - c_{i-1}\)
    • \(0\) if the job arrives while the server is busy

In general, the total busy time and total idle time are: \[B = \sum_{i=1}^{n} s_{i}\] \[E = \sum_{i=1}^{n} e_{i}\]

Observing this information we can distinguish:

  • job-averaged statistics
    • average interarrival time

\[\overline \tau = \frac{1}{n}\sum^{n}_{i=1}\tau_{i}=\frac{a_{n}}{n}\]

  • input rate \(\theta = \frac{1}{\overline \tau}\)
  • average service time

\[\overline s = \frac{1}{n}\sum^{n}_{i=1} s_{i}=\frac{B}{n}\]

  • service rate \(\mu = \frac{1}{\overline s}\)
    • the maximum possible throughput, attained only if the server is never idle
  • total simulation time \(T = c_{n} - c_{0}\)
  • throughput \[X = \frac{n}{T}\]
  • utilization \[U = \frac{B}{T}\]
  • time-averaged statistics are defined by the area under a curve (integration)
    • we define 3 functions
      • \(n(t)\), number of jobs in the service node at time \(t\)
      • \(q(t)\), number of jobs in the queue at time \(t\)
      • \(y(t)\), number of jobs in service at time \(t\)
      • \(n(t) = q(t) + y(t)\)
    • \[\overline n = \frac{1}{T} \int_{0}^{T} n(t) dt\]
    • \[\overline q = \frac{1}{T} \int_{0}^{T} q(t) dt\]
    • \[\overline y = \frac{1}{T} \int_{0}^{T} y(t) dt\]
      • this is also the server utilization (in the limit, the probability that the server is busy)
      • \[\overline y = \frac{\sum_{i=1}^{n} s_{i}}{c_{n}} = \frac{B}{c_{n}} = \frac{c_{n}- E}{c_{n}}\]
    • traffic intensity, input rate to service rate ratio
      • \[\rho = \frac{1/\overline \tau}{1/\overline s} = \frac{c_{n}B}{a_{n} c_{n}} = \bigg (\frac{c_{n}}{a_{n}}\bigg )\overline y\]

A Trace File is a log of all the arrival and service times.
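
As a minimal sketch (not the course's reference program), the job-averaged statistics above can be computed in one pass over such a trace. The file name trace.dat, the one "arrival service" pair per line format, and the assumption that the observation starts at time 0 are illustrative choices, not from the notes.

#include <stdio.h>

/* Sketch: single-server FIFO statistics from a trace of (a_i, s_i) pairs,
   assumed sorted by arrival time; observation assumed to start at 0. */
int main(void) {
    FILE  *fp = fopen("trace.dat", "r");
    double a, s;                 /* a_i, s_i read from the trace       */
    double c = 0.0;              /* c_{i-1}, departure of previous job */
    double a_n = 0.0, B = 0.0, W = 0.0;
    long   n = 0;

    if (fp == NULL) return 1;
    while (fscanf(fp, "%lf %lf", &a, &s) == 2) {
        double b = (a > c) ? a : c;   /* service begins: b_i = a_i + d_i */
        c   = b + s;                  /* departure:      c_i = b_i + s_i */
        B  += s;                      /* total busy time                 */
        W  += c - a;                  /* wait w_i = d_i + s_i            */
        a_n = a;
        n++;
    }
    fclose(fp);

    printf("avg interarrival = %f\n", a_n / n);   /* tau-bar = a_n / n  */
    printf("avg service      = %f\n", B / n);     /* s-bar   = B / n    */
    printf("avg wait         = %f\n", W / n);     /* w-bar              */
    printf("utilization      = %f\n", B / c);     /* U = B / c_n        */
    printf("throughput       = %f\n", n / c);     /* X = n / c_n        */
    return 0;
}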

Discrete-Event Simulation

The general structure of the simulator is made of different handlers:

void initialize() {
    clock = 0;
    halt = false;
    nsys = 0;
    Busy = 0;
    Area_n = 0;

    arrival_t = GetArrival(fp);
    service_t = GetService(fp);

    // scheduling the first arrival
    event_notice = get_new_node();
    event_notice->event.type = ARRIVAL;
    event_notice->event.occur_time = arrival_t;
    event_notice->event.service_time = service_t;
    schedule(event_notice);

    // scheduling the end-of-simulation event
    event_notice = get_new_node();
    event_notice->event.type = END;
    event_notice->event.occur_time = End_time;
    schedule(event_notice);
}

In the engine we call the handlers; these are best kept as small and fast as possible. It is better to create many event variants with simple handlers than to have fewer but more complex handlers.

void engine(void) {
    int event_type;
    double oldclock, delta;
    nodePointer new_event;

    new_event = event_pop();

    // advance the clock to the next event
    oldclock = clock;
    clock = new_event->event.occur_time;
    delta = clock - oldclock;

    // nsys > 0: somebody was in the service node, accumulate busy time and area
    if (nsys > 0) {
        Busy = Busy + delta;
        Area_n = Area_n + nsys * delta;
    }

    // transition
    event_type = new_event->event.type;
    switch (event_type) {
        case ARRIVAL : arrival(new_event);
            break;
        case DEPARTURE : departure(new_event);
            break;
        case END : end(new_event);
    }
}

void arrival(struct node* node_event) {
    struct node* next_job;

    nsys++;
    if (nsys == 1) {
        // nobody was in the node: the engine popped the event,
        // arrival puts it back as a departure
        node_event->event.type = DEPARTURE;
        node_event->event.occur_time = clock + node_event->event.service_time;
        schedule(node_event);
    } else {
        enqueue(node_event);
    }

    // schedule the next arrival read from the trace
    arrival_t = GetArrival(fp);
    service_t = GetService(fp);

    next_job = get_new_node();
    next_job->event.type = ARRIVAL;
    next_job->event.occur_time = arrival_t;
    next_job->event.service_time = service_t;
    schedule(next_job);
}

void departure(struct node* node_event) {
    double waiting_time;
    struct node* next_job;

    nsys--;
    if (nsys > 0) {
        // promote the first queued job to service
        next_job = dequeue();
        next_job->event.type = DEPARTURE;
        next_job->event.occur_time = clock + next_job->event.service_time;
        schedule(next_job);
    }
    return_node(node_event);
}

void end(struct node* node_event) {
    halt = true;
    return_node(node_event);
}

Memory Management

For efficient memory management it is important not to use a GC; instead it is better to reuse as many chunks of memory as possible, because memory manager operations are very time-intensive. This can be done with an available list of chunks, from which we take nodes and to which we put them back, interrogating the memory manager as little as possible (a sketch follows).
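
A minimal sketch of such an available list for the event notices used above. The names get_new_node and return_node are the ones already used by the simulator; the assumption that struct node carries a next pointer (as an event-list node normally does) is mine.

#include <stdlib.h>

/* Sketch of the available list: a singly linked stack of recycled nodes.
   malloc is called only when the list is empty, and nodes are never
   freed back to the memory manager during the simulation. */
static struct node *available = NULL;

struct node *get_new_node(void) {
    struct node *p;
    if (available != NULL) {        /* reuse a recycled chunk            */
        p = available;
        available = p->next;
    } else {                        /* ask the memory manager only here  */
        p = malloc(sizeof(struct node));
    }
    return p;
}

void return_node(struct node *p) {  /* put the chunk back on the list    */
    p->next = available;
    available = p;
}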

Tandem Network

A network made of two queues, each with its output connected to the other's input. The Event Notice follows the life of a single job: in this structure the clients/jobs continuously pass from one queue to another. This means there is no need for an Available Queue anymore, as the number of notices is not going to change during the simulation. The basic assumption on the clock is that it always has the time granularity needed to distinguish between events that occur “at the same time”.

  • so, while events pass from one queue to another in 0 time, the clock is able to distinguish and establish which event (departure from one | arrival to the other) happens first (see the sketch below)
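
A possible shape of the station-1 departure handler in such a tandem network, reusing the structure of departure() above. The names nsys1, dequeue_1, DEPARTURE_1 and ARRIVAL_2 are illustrative assumptions, not from the notes.

/* Sketch: departure from station 1 in a tandem network.  The same event
   notice is re-scheduled, at the current clock, as an arrival to
   station 2 -- no node is returned to an available list. */
void departure_1(struct node* node_event) {
    struct node* next_job;

    nsys1--;
    if (nsys1 > 0) {                       // start the next job at station 1
        next_job = dequeue_1();
        next_job->event.type = DEPARTURE_1;
        next_job->event.occur_time = clock + next_job->event.service_time;
        schedule(next_job);
    }

    // the departing job immediately becomes an arrival at station 2
    node_event->event.type = ARRIVAL_2;
    node_event->event.occur_time = clock;  // zero transfer time
    schedule(node_event);
}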

Operational Analysis

See Denning78/Notes.

Bottleneck

State Transition Balance

State of the system can be represented by the number of waiting customers.

  • \(C(m,n)\) is the number of times the system moves from state \(m\) to state \(n\)

The Balance Equation holds for all states except the first and the last one \[\sum_{k} C(k,n) = \sum_{m}C(n,m)\]

  • the first has one more exit than entries
  • the last has one more entry than exits

To simplify the state graph we organize the states in a chain

  • One-Step Behavior

Convolution Method

Load Independent Stations

the service time of each station is independent of the number of clients in its queue

  • Utilization Law

\[\overline n_{m} (n) = U_{m}(n)[1+\overline n_{m} (n-1)]\]

\[\overline w_{m}(n) = S_{m}[1+\overline n_{m}(n-1)]\]

Mean Value Analysis

see Arrival Theorem (Sevcik-Mitrani)

  • in single-server stations with a Poisson input process, arriving customers see the system as if they were external observers not involved in its operation

Then we group the recurrence relations for the \(i\text{-th}\) station

\[\overline w_{i}(n) = S_{i}[1+\overline n_{i} (n-1)]\] \[U_{i}(n) = X_{i}(n) S_{i}\] \[\overline n_{i} (n) = U_{i}(n)[1+\overline n_{i} (n-1)]\]

In a system where we identified the station \(ref\) as the reference for the computation of the \(V_{i}\)

\[Y_{ref}(n) = S_{ref} + \sum_{i=1,i\neq ref}^{M} V_{i} \overline w_{i} (n)\]

For the \(i\text{-th}\) station the derivation steps are:

\begin{align*} \overline w_{i} (n) &= S_{i} [1 + \overline n_{i}(n-1)]\\
X_{ref}(n) &= \frac{n}{\sum^{M}_{j=1} [V_{j} \overline w_{j}(n)]}\\
X_{i}(n) &= V_{i} X_{ref}(n)\\
U_{i}(n) &= X_{i}(n)S_{i}\\
\overline n_{i}(n) &= U_{i}(n)[1 + \overline n_{i} (n-1)] \end{align*}

Very simple from a computational point of view

  • just loop \(M\) times
  • the number of multiplications is on the order of the number of stations (see the sketch after the Delay-station case below)

With a Delay station the algorithm has to be modified

\begin{align*} \overline w_{i}(n) = \begin{cases} Z_{i} \quad &\text{Delay Station}\\
S_{i}[1 + \overline n_{i}(n-1)] \quad &\text{Load Independent Station} \end{cases} \end{align*}

\begin{align*} \overline n_{i}(n) = \begin{cases} Z_{i} X_{i}(n) \quad &\text{Delay Station}\\
U_{i}(n)[1 + \overline n_{i}(n-1)] \quad &\text{Load Independent Station} \end{cases} \end{align*}

\begin{align*} U_{i}(n) = \begin{cases} \frac{\overline n_{i}(n)}{n}\quad &\text{Delay Station}\\
X_{i}(n)S_{i} \quad &\text{Load Independent Station} \end{cases} \end{align*}
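
A minimal sketch of the resulting exact MVA loop, covering both Load Independent and Delay stations. The function signature and the array names (V for visit ratios, S for mean service times, Z for think times, is_delay for the station type) are assumptions for illustration, not from the notes.

/* Sketch: exact MVA for a closed network with N customers and M stations,
   each either Load Independent or Delay.  V[i] are visit ratios w.r.t.
   the reference station; on return nbar[i] holds the mean queue length
   at population N. */
void mva(int M, int N, const int is_delay[], const double V[],
         const double S[], const double Z[],
         double w[], double X[], double U[], double nbar[]) {
    for (int i = 0; i < M; i++) nbar[i] = 0.0;           // empty network

    for (int n = 1; n <= N; n++) {
        double cycle = 0.0;
        for (int i = 0; i < M; i++) {
            w[i] = is_delay[i] ? Z[i]                    // Delay station
                               : S[i] * (1.0 + nbar[i]); // LI station
            cycle += V[i] * w[i];                        // sum V_j w_j(n)
        }
        double Xref = n / cycle;                         // X_ref(n)
        for (int i = 0; i < M; i++) {
            X[i] = V[i] * Xref;                          // X_i(n)
            if (is_delay[i]) {
                nbar[i] = Z[i] * X[i];                   // n_i(n) = Z_i X_i(n)
                U[i]    = nbar[i] / n;                   // U_i(n) = n_i(n)/n
            } else {
                U[i]    = X[i] * S[i];                   // U_i(n) = X_i(n) S_i
                nbar[i] = U[i] * (1.0 + nbar[i]);        // uses n_i(n-1)
            }
        }
    }
}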

BCMP Theorem

Baskett, Chandy, Muntz, Palacios

Random Values

Generation

Distributions

  • Pareto Type 1

    \begin{align*} F_X (x) &= \begin{cases} 1- \big(\frac{\beta}{x}\big)^\alpha \quad &x \ge \beta \\
    0 \quad &x < \beta \end{cases} \\
    f(x) &= \frac{\alpha \beta^\alpha}{x^{\alpha+1}} \\
    E[X] &= \begin{cases} \infty \quad &\alpha \le 1\\
    \frac{\alpha \beta}{\alpha - 1} \quad &\alpha > 1 \end{cases} \end{align*}
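
Since the CDF above inverts in closed form, Pareto Type 1 variates can be generated by inverse transform. A minimal sketch, assuming Random() returns a Uniform(0,1) variate as in the acceptance-rejection snippet below:

#include <math.h>

/* Sketch: Pareto Type 1 variate via inverse transform.
   F^{-1}(u) = beta * (1 - u)^(-1/alpha); since 1 - u is itself
   Uniform(0,1), u can be used directly.
   Assumes Random() returns a Uniform(0,1) variate. */
double pareto(double alpha, double beta) {
    double u = Random();
    return beta * pow(u, -1.0 / alpha);
}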

Acceptance-Rejection

Generates random variates

  • usually continuous
  • for distributions whose IDF (inverse distribution function) cannot be easily computed

Generate 2 random numbers for \(x\) and \(y\)

  • these identify a point
  • a point inside the area under the density function is a hit

This is a brute-force approach

  • the shape of the function does not matter
  • efficiency can be really low
    • many points can be wasted
  • efficiency depends on how well \(c\, g(x)\) approximates \(f(x)\)

do {
    u = Random();          // u ~ Uniform(0,1)
    x = g_inverse(u);      // candidate x drawn from the majorizing density g
    v = Random();          // v ~ Uniform(0,1), vertical coordinate
} while (c * g(x) * v > f(x)); // rejection criterion: keep only hits under f
return x;

Composition

  • usually continuous random variates
  • the target distribution can be expressed as a weighted sum (mixture) of elementary distributions (see the sketch below)
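
A minimal sketch of composition for a two-stage hyperexponential (a weighted mixture of two exponentials). The mixing probability p and the rates mu1, mu2 are illustrative parameters, and Random() is again assumed to return a Uniform(0,1) variate.

#include <math.h>

/* Sketch: composition method for a 2-stage hyperexponential,
   F(x) = p * Exp(mu1) + (1 - p) * Exp(mu2).
   Step 1: pick the component with the mixing probabilities.
   Step 2: sample from the selected elementary distribution
   (here by inverse transform of the exponential). */
double hyperexponential(double p, double mu1, double mu2) {
    double mu = (Random() < p) ? mu1 : mu2;   // choose the component
    return -log(1.0 - Random()) / mu;         // Exp(mu) variate
}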

Cox Theorem

Any distribution can be represented in this manner

  • has to have a Rational Laplace Transform
  • it is still an approximation

The schema chains \(n\) negative exponential stages: at each step, with probability \(\alpha\) another exponential is added to the sum, otherwise the result is returned.
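
A minimal sketch of that schema with identical stages (in a general Coxian each stage may have its own rate and continuation probability). The parameters n, mu, alpha and the use of Random() as the Uniform(0,1) generator are assumptions.

#include <math.h>

/* Sketch: generation of a Coxian variate with at most n identical
   exponential stages.  After each stage the walk continues with
   probability alpha, otherwise the accumulated sum is returned. */
double coxian(int n, double mu, double alpha) {
    double x = 0.0;
    for (int k = 0; k < n; k++) {
        x += -log(1.0 - Random()) / mu;    // add one exponential stage
        if (Random() >= alpha)             // stop with probability 1 - alpha
            break;
    }
    return x;
}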


  1. Computational algorithms using repeated random sampling to obtain results ↩︎