diff --git a/report/milestone3.pdf b/report/milestone3.pdf
index 4ad7285..57b073e 100644
Binary files a/report/milestone3.pdf and b/report/milestone3.pdf differ
diff --git a/report/milestone3.tex b/report/milestone3.tex
index b4695f2..fb6956d 100644
--- a/report/milestone3.tex
+++ b/report/milestone3.tex
@@ -239,6 +239,8 @@ \subsection{Model}
 \get{}s and \set{}s are modelled separately, i.e. this is a \emph{multiclass} queueing network.
 
+To analyse the model, I will perform mean value analysis (MVA) using the Octave package \href{http://www.moreno.marzolla.name/software/queueing/queueing.html}{queueing}.
+
 \subsection{Problems of the model}
 \label{sec:part3:problems}
 
@@ -266,26 +268,18 @@ \subsection{Parameter estimation}
 One issue with this definition is that $s_{network}$ includes the queueing time in \linkmain{LoadBalancer}, but this is not an issue because the service time in that node is extremely low and thus, the queueing time is extremely low. Accordingly, our estimate for $s_{network}$ is not off by more than a few tenths of a percent.
 
-MVA was performed using the Octave package \href{http://www.moreno.marzolla.name/software/queueing/queueing.html}{queueing}.
-
 \subsection{Data}
 \label{sec:part3:data}
 
 The experimental data used in this section comes from Milestone~2 Section~2 and can be found in \texttt{\href{https://gitlab.inf.ethz.ch/pungast/asl-fall16-project/tree/master/results/replication}{results/replication}}. For this section, only data from one repetition (rep. no. 5) and one configuration ($S=5$, $R=1$) were used (short name \texttt{replication-S5-R1-r5}). As a reminder, that experiment had $W=5\%$, $T=32$ and $C=180$.
 
 \subsection{Comparison of model and experiments}
-
-
-
-\subsubsection{Mean value analysis}
-
 \input{../results/analysis/part3_network/comparison_table.txt}
 
-\todo{}
+As shown in Table~\ref{tbl:part3:comparison_table}, the predictions of the model match the experimental results fairly well, especially compared to the previous two sections. Total throughput is off by roughly 15\%, and both the response time to \get{}s and the number of items in \linkmain{ReadWorker}s are off by about 10\%. The response time to \set{}s, however, is off by a factor of 2. This is explained by the discussion in Section~\ref{sec:part3:problems}: the behaviour of \linkmain{WriteWorker}s is much less predictable than that of \linkmain{ReadWorker}s -- but since there are many more \get{}s than \set{}s, the throughput estimate is still reasonable.
 
-performance of writeworker \emph{depends on queue length} because if there are no elements in queue then we check memcached responses every 1ms, whereas if there are elements in queue then we check always after dequeueing an element. This dependence violates \todo{} what assumption?
+In summary, the model is quite accurate for \linkmain{ReadWorker}s but less accurate for the more unpredictable \linkmain{WriteWorker}s.
 
-if there is nothing in the queue and no responses from memcached then we will wait for 2ms!
 
 \subsubsection{Bottleneck analysis}
 
@@ -296,9 +290,12 @@ \subsubsection{Bottleneck analysis}
 \label{fig:part3:utilisation}
 \end{figure}
 
-\todo{} Book 33-03 (33.6)
+To find the bottleneck, we can compare the utilisation of each node in the queueing network. Figure~\ref{fig:part3:utilisation} shows that \linkmain{WriteWorker}s are the bottleneck in the actual system as well as in the model. However, since only \set{}s pass through \linkmain{WriteWorker}s, \get{}s have a different bottleneck -- which, based on Figure~\ref{fig:part3:utilisation}, is clearly \linkmain{ReadWorker}s.
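+
+As a sketch of the reasoning (the standard utilisation law, written here for a single job class; $V_k$, $s_k$ and $m_k$ denote the visit ratio, service time and number of parallel servers of node $k$ -- notation used only in this sketch), the per-server utilisation of node $k$ at total throughput $X$ is
+\[
+U_k = \frac{X \, D_k}{m_k}, \qquad D_k = V_k \, s_k,
+\]
+so the node with the largest per-server demand $D_k / m_k$ saturates first and caps the throughput at $X \le \min_k m_k / D_k$.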
+
+We can estimate an upper bound on throughput by finding out what would happen if the bottlenecks had utilisation $U=1$. For each \linkmain{WriteWorker} this is roughly 320 requests per second and for each \linkmain{ReadWorker} roughly 10500 requests per second, so the total throughput of the system is bounded above by roughly 49900 requests per second. The response time of a \linkmain{WriteWorker} is bounded below by 3.1 ms and that of a \linkmain{ReadWorker} by 0.095 ms.
+
+
 
-calculate bounds of throughput and response time
 \clearpage
 
 % --------------------------------------------------------------------------------
diff --git a/scripts/r/part3_network_model3.r b/scripts/r/part3_network_model3.r
index 525544e..e138f95 100644
--- a/scripts/r/part3_network_model3.r
+++ b/scripts/r/part3_network_model3.r
@@ -157,7 +157,7 @@ comparisons_to_save <- comparison %>%
   melt(id.vars=c("type")) %>%
   dcast(variable ~ type) %>%
   select(variable, predicted, actual)
-comparison_table <- xtable(comparisons_to_save, caption="\\todo{} loadbalancer items and response times have been left out because they were extremely low",
+comparison_table <- xtable(comparisons_to_save, caption="Parameters of the system calculated using MVA. \\texttt{lb} stands for \\linkmain{LoadBalancer}. The throughput and number of items in workers are given as totals over all threads. The response time of \\linkmain{LoadBalancer} and the number of items in it have been left out of the table because both were extremely low.",
                            label="tbl:part3:comparison_table",
                            digits=c(NA, NA, 2, 2),
                            align="|l|l|r|r|")
@@ -167,50 +167,13 @@ print.xtable(comparison_table, file=paste0(output_dir, "/comparison_table.txt"),
 # Bottleneck analysis
 Z <- 0 # waiting time
+X <- max(mva$X)
 D <- mva$U / mva$X
 D_sum <- sum(D, na.rm=TRUE) # sum(D[1,ind_RW]) + sum(D[2,ind_WW]) + sum(D[2,])
 D_max <- max(D, na.rm=TRUE)
-throughput_slope <- 1 / (D_sum + Z)
-throughput_constant <- 1/D_max
-responsetime_slope <- D_max
-responsetime_constant <- D_sum
-
-N_max <- 100
-M <- K
-S <- mva$S[1,] #(1-prop_writes) * mva$S[1,] + prop_writes * mva$S[2,]
-V <- mva$V[1,] #(1-prop_writes) * mva$V[1,] + prop_writes * mva$V[2,]
-delay_centers <- c(1, M)
-multiple_servers <- 2:(2+num_servers-1)
-
-manual_mva_res <- get_mva_results(N_max, Z, M, S, V, delay_centers, multiple_servers)
-
-N <- 1:N_max
-rt_bound <- pmax(responsetime_constant, responsetime_slope * N)
-data_rt <- data.frame(N, rt_bound, rt_mva=manual_mva_res$response_times)
-ggplot(data_rt, aes(x=N)) +
-  geom_hline(aes(yintercept=responsetime_constant), linetype = 2) +
-  geom_abline(aes(intercept=-Z, slope=responsetime_slope), linetype = 2) +
-  geom_line(aes(y=rt_bound), size=1) +
-  geom_line(aes(y=rt_mva), color="red") +
-  xlab("Number of clients") +
-  ylab("Response time") +
-  asl_theme
-#ggsave(paste0(output_dir, "/graphs/asymptotics_responsetime.pdf"),
-#       width=fig_width, height=fig_height)
-
-
-tp_bound <- pmin(throughput_constant, throughput_slope * N)
-data_tp <- data.frame(N, tp_bound, tp_mva=manual_mva_res$throughputs)
-ggplot(data_tp, aes(x=N)) +
-  geom_hline(aes(yintercept=throughput_constant), linetype = 2) +
-  geom_abline(aes(intercept=0, slope=throughput_slope), linetype = 2) +
-  geom_line(aes(y=tp_bound), size=1) +
-  geom_line(aes(y=tp_mva), color="red") +
-  xlab("Number of clients") +
-  ylab("Throughput") +
-  asl_theme
-#ggsave(paste0(output_dir, "/graphs/asymptotics_throughput.pdf"),
-#       width=fig_width, height=fig_height)
+throughput_constant <- num_servers * ((1-prop_writes) * 1/D[1,3] + prop_writes * 1/D[2,8])
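+# Sketch of the reasoning behind throughput_constant above, assuming (as the
+# surrounding code suggests) that D[1,3] is the demand of a get at a ReadWorker
+# and D[2,8] the demand of a set at a WriteWorker: at utilisation U = 1 a node
+# serves at most 1/D requests per second, and weighting these per-node bounds
+# by the get/set mix and scaling by the number of memcached servers gives the
+# system-wide upper bound. The two helper variables below are illustrative only.
+per_readworker_bound  <- 1 / D[1,3]  # roughly 10500 get requests per second
+per_writeworker_bound <- 1 / D[2,8]  # roughly 320 set requests per second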
+responsetime_constant_get <- D[1,3] * 1000 # ms
+responsetime_constant_set <- D[2,8] * 1000 # ms
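+# A rough, self-contained sketch of the classic asymptotic bounds that D_sum,
+# D_max and Z computed above feed into, for a closed network with N clients
+# (the range 1:100 is an illustrative choice, not taken from the experiments):
+N_range <- 1:100
+throughput_upper_bound   <- pmin(N_range / (D_sum + Z), 1 / D_max)  # X(N) upper bound
+responsetime_lower_bound <- pmax(D_sum, N_range * D_max - Z)        # R(N) lower bound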