% --- Template for thesis / report with tktltiki2 class ---
%
% last updated 2013/02/15 for tktltiki2 v1.02
\documentclass[english]{tktltiki2}
% tktltiki2 automatically loads babel, so you can simply
% give the language parameter (e.g. finnish, swedish, english, british) as
% a parameter for the class: \documentclass[finnish]{tktltiki2}.
% The information on title and abstract is generated automatically depending on
% the language, see below if you need to change any of these manually.
%
% Class options:
% - grading -- Print labels for grading information on the front page.
% - disablelastpagecounter -- Disables the automatic generation of page number information
% in the abstract. See also \numberofpagesinformation{} command below.
%
% The class also respects the following options of article class:
% 10pt, 11pt, 12pt, final, draft, oneside, twoside,
% openright, openany, onecolumn, twocolumn, leqno, fleqn
%
% The default font size is 11pt. The paper size used is A4, other sizes are not supported.
%
% rubber: module pdftex
% --- General packages ---
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage{microtype}
\usepackage{amsfonts,amsmath,amssymb,amsthm,booktabs,color,enumitem,graphicx}
\usepackage[pdftex,hidelinks]{hyperref}
\usepackage{subfigure}
\usepackage[textsize=tiny, disable]{todonotes}
\usepackage{multirow}
\usepackage{array}
\usepackage{setspace}
\usepackage{morefloats}
\usepackage{algorithm}% http://ctan.org/pkg/algorithm
\usepackage{algpseudocode}% http://ctan.org/pkg/algorithmicx
\newcommand{\var}[1]{{\ttfamily#1}}% variable
% Automatically set the PDF metadata fields
\makeatletter
\AtBeginDocument{\hypersetup{pdftitle = {\@title}, pdfauthor = {\@author}}}
\makeatother
% --- Language-related settings ---
%
% these should be modified according to your language
% babelbib for non-english bibliography using bibtex
\usepackage[fixlanguage]{babelbib}
% add bibliography to the table of contents
\usepackage[nottoc]{tocbibind}
% --- Theorem environment definitions ---
\newtheorem{thm}{Theorem}
\newtheorem{lem}[thm]{Lemma}
\newtheorem{cor}[thm]{Corollary}
\theoremstyle{definition}
\newtheorem{definition}[thm]{Definition}
\theoremstyle{remark}
\newtheorem*{remark}{Remark}
% --- tktltiki2 options ---
%
% The following commands define the information used to generate title and
% abstract pages. The following entries should be always specified:
\title{Theory and practice of rapid elasticity in cloud applications}
\author{Mika Majakorpi}
\date{\today}
\level{MSc Thesis}
\abstract{
This thesis is a study of the theory of scalability and its application in an
infrastructure as a service (IaaS) cloud context. The cloud-based utility
computing paradigm is presented along with how scalability principles are
applied in the cloud. The differences between scalability in general and the
cloud concept of elasticity are discussed.
A \textit{quality of elasticity (QoE)} metric is developed to facilitate factual
discussion and comparison of different cloud platforms' elasticity capabilities
and the effectiveness of elastic scaling strategies. The metric is based on
business requirements expressed as \textit{preference functions} over a set of
lower level metrics. Multi-criteria analysis is applied to these possibly
conflicting preferences to arrive at a unified value of \textit{utility} based
on a weighted sum of the preferences. QoE reflects the utility of the system
over time.
The concept of an elasticity controller application is presented and a prototype
implementation described in order to exercise the QoE metric. Two load testing
scenarios are executed against a simple test application whose deployment is
managed by the prototype controller. The elastic scaling behavior of the system
is analyzed in terms of the QoE results to confirm the prototype is functional
and to find areas of improvement.
}
% The following can be used to specify keywords and classification of the paper:
\keywords{cloud computing, scalability, elasticity}
% classification according to ACM Computing Classification System (http://www.acm.org/about/class/)
% This is probably mostly relevant for computer scientists
\classification{\protect{\ \\
\textbf{Networks $\rightarrow$ Cloud computing} \\
\textbf{Software and its engineering $\rightarrow$ Software performance} \\
\textit{Computer systems organization $\rightarrow$ Reliability} \\
General and reference $\rightarrow$ Metrics \\
}}
%
% If the automatic page number counting is not working as desired in your case,
% uncomment the following to manually set the number of pages displayed in the abstract page:
%
% \numberofpagesinformation{16 pages + 10 appendix pages}
%
% If you are not a computer scientist, you will want to uncomment the following by hand and specify
% your department, faculty and subject by hand:
%
% \faculty{Faculty of Science}
% \department{Department of Computer Science}
% \subject{Computer Science}
%
% If you are not from the University of Helsinki, then you will most likely want to set these also:
%
% \university{University of Helsinki}
% \universitylong{HELSINGIN YLIOPISTO --- HELSINGFORS UNIVERSITET --- UNIVERSITY OF HELSINKI} % displayed on the top of the abstract page
% \city{Helsinki}
%
%\doublespacing
%\singlespacing
\onehalfspacing
\setlength{\parindent}{0mm}
\setlength{\parskip}{1ex}
\begin{document}
% --- Front matter ---
\maketitle % title page
\makeabstract % abstract page
\tableofcontents % table of contents
\newpage % clear page after the table of contents
%\newenvironment{acknowledgements}%
% {\cleardoublepage\thispagestyle{empty}\null\vfill\begin{center}%
% \bfseries Acknowledgements\end{center}}%
~\\[10\baselineskip]
%{\vfill\null}
\begin{center}
\textit{For Rashmi}\\
\textit{Your resolve inspires me}
\end{center}
%\listoffigures
%\listoftables
\listoftodos
\newpage
% --- Main matter ---
\section{Introduction}
Computer science is a discipline built on layers upon layers of abstraction. We
build entire worlds out of combinations of binary states. When complexity
increases over a practical threshold, we apply another abstraction layer and
continue until we face another technological or conceptual limit. Progress
happens when new abstractions emerge either leveraging existing ones or
replacing and simplifying them.
The context of this thesis is scalability in cloud computing, a recent
abstraction built on virtualization and distributed computing~\cite{handbook}.
Technologies related to cloud computing accelerate the provisioning of computing
resources by several orders of magnitude compared to a non-virtualized process.
Resources are provided to users as virtual units which draw on a pool of
distributed physical resources collectively called a cloud. The lead time to
acquire a virtual server instance is measured in seconds or minutes instead of
days or even weeks~\cite{elasticsiteMarshall2010}. When the server is no longer
needed or during times of inactivity the resources reserved for the server are
allocated to other virtual resources or released back to the cloud as free
capacity. This flexibility drives down costs and provides the possibility for
new kinds of agile ICT.
The apparently unlimited supply and instant delivery of resources have inspired
researchers to consider cloud computing as a utility similar to water and
electricity~\cite{Buyya2009a}: it is ubiquitously available and billed based on
usage. The ease with which cloud resources can be provisioned makes it possible to
run applications with an adjustable number of server instances depending on the
current or anticipated usage level of the application. This flexibility and the
speed with which the deployment can be adjusted have enabled e.g. web
applications to scale from a handful of concurrent sessions to millions~\cite{
cloudberkeleyviewacm} and back without committing to a large amount of computing
resources which would remain deployed but unused during periods of low usage. A
deployment capable of serving millions of users is understandably expensive to
maintain, but the cloud approach with its prevalent pay-per-use pricing enables
such scenarios to be realized without large upfront investment in computing
resources as would be the case with dedicated hardware servers.
The ability of an application deployment on a cloud platform to change in size
dynamically at runtime is referred to as elasticity or rapid elasticity~\cite{
nistdefinition}. This capability to automatically scale the deployment in
(smaller) or out (larger) depending on current demand is a major factor in
the hype and success~\cite{cloudberkeleyviewacm} of cloud platforms in recent years.
The goal of this thesis is to explore the theory and practice of rapid
elasticity. The concept of quality of elasticity is developed and put to test
using a prototype implementation of an elasticity controller~\cite{VRB11}, a
piece of cloud infrastructure software whose responsibility is to decide on and
implement cloud provisioning actions. The focus is on infrastructure as a
service (IaaS) clouds and the provisioning of virtual machines in such clouds.
The elasticity controller concept is a step towards a more service oriented
cloud offering. Rather than provide infrastructure with an interface modeled
exactly after the operations performed on IaaS VMs, a service-oriented approach
aims to provide more abstract interfaces which address the cloud customer’s
problem domain rather than the cloud provider’s. This includes e.g. cross-cloud
capabilities~\cite{frominfratoservice}.
The thesis is structured as follows. Chapter~\ref{sec:scalability} discusses the
different forms of scalability in system and software architectures.
Chapter~\ref{sec:cloudscalability} presents scalability in cloud context.
Chapter~\ref{sec:elasticity} discusses the theory of elastic scaling and
develops a metric, quality of elasticity, for it. Architectural patterns to take
full advantage of elasticity are also presented in this chapter.
Chapter~\ref{sec:elasticScalingPrototype} presents an elasticity controller
prototype along with its requisite monitoring infrastructure used in the thesis
to put the theory to test in practice.
Results of two test load scenarios are presented in chapter~\ref{sec:results}. The
two scenarios compare quality of elasticity under a gradually growing load and a
sudden spike of load. Areas of further development are discussed based on the
findings of the tests. Finally, chapter~\ref{sec:conclusion} concludes the thesis.
\subsection{Cloud computing terminology} Cloud computing
is often referred to quite vaguely as a massively scalable model for
infrastructure services in information technology. As academic research and
practical use grow, more and more terms and conceptual frameworks related to
cloud computing are emerging, some of them short-lived or focused on marketing.
Published taxonomies~\cite{Hofer2011taxonomy} offer a snapshot of a fast-moving
target. The following terms for deployment models and abstraction levels are
fixed in common usage~\cite{nistdefinition}~\cite{handbook} and essential to
understanding the scope of cloud computing.
A cloud is \emph{public} if it is available for the general public to access and
\emph{private} if it is only available internally to some organization or
selected group of organizations. Obvious differences from a cloud user’s
perspective are the location of data and management of physical resources on
which the virtualization environment is built. \emph{Hybrid} clouds are a
combination of the above such that a private cloud is bridged to another private
or public cloud. They remain functionally independent but the private clouds
gain benefits in tolerance against hardware failure and resource exhaustion as
workload (i.e. virtual machines) can be shifted elsewhere in case of a shortage
of capacity. Such expansion of a private cloud is called
cloudbursting~\cite{nistdefinition}. When a private cloud is the actor in
cloudbursting, it is considered functionally transparent to the users of the
cloud. The cloud user has a single interface towards the cloud which handles
bursting behind the scenes. Bursting may also be implemented outside any cloud
infrastructure layer, closer to the application. In this case bursting is
typically handled by an application controller component in charge of elastic
scaling (elasticity controller).
Varying the size of a deployment or the amount of resources reserved for a task is called scaling. Adding more resource instances (e.g. virtual machines) is referred to as \textit{scaling out}. Decreasing the number of resource instances is called \textit{scaling in}. This is in contrast to modifying the capabilities of an existing resource instance, which is referred to as \textit{scaling up} for more and \textit{scaling down} for less.
Customers can benefit from clouds at different levels of service. The simplest
case for a customer is using cloud deployed \emph{software as a service (SaaS)}
without having to consider any operative aspects of the software. Gmail is
an example in this category. Google operates the service supposedly deployed on
their private cloud infrastructure and customers merely log in to the service
and use it over the Internet. Moving down to the next level of service,
customers can deploy their own applications to \emph{platform as a service
(PaaS)} clouds like Heroku, Microsoft Azure or Google App Engine. The service
provided is a platform for applications with related application programming
interfaces (APIs) and services for managing and monitoring the deployment. A
PaaS cloud enables customers to focus on the application instead of
infrastructure at the cost of losing control and ownership of it. One further
level down, \emph{infrastructure as a service (IaaS)} clouds enable customers to
provision virtualized infrastructure resources (virtual machines, storage,
network) to build their own infrastructure, platform and application. IaaS gives
the most control on the deployment, but requires considerably more management
compared to the other service levels.
Cloud service levels form a hierarchy with infrastructure at the bottom, a
platform deployed on the infrastructure and software on the platform offered as
a service to customers. A new service or application may be built by leveraging
any of these service levels. For example, the Heroku PaaS platform uses Amazon’s
EC2 infrastructure and an application deployed on Heroku will then complete the
stack. On the other hand, an application could simply be deployed on EC2,
skipping the PaaS layer, if it was deemed beneficial to gain additional control
of the stack down to virtual infrastructure. Starting at a lower abstraction
level increases the responsibilities of the application or organization
operating it to include infrastructure or platform management as well as
managing the application.
Ultimately all deployment models and service levels are meant to provide
scalable computing resources to customers. How to best benefit from scalability and what it actually means in each case is up to the customer.
\section{Scalability}
\label{sec:scalability}
Scalability is one of the elusive ``-ilities'' in information systems, a quality
attribute whose importance for an application deployment is clearly demonstrated
when the usage of a system grows and resource demands increase. Yet it is
hard to pin down exactly what scalability means in each discussion of
it~\cite{ScalabilityHill1990}.
Computing resources are limited and eventually any system which grows in data or
usage will saturate the resources available to it. The system may then also end
up needlessly large or expensive in case resource requirements decrease
afterwards. The resources in question may be e.g. processing capacity for
computationally intensive systems or storage capacity for data intensive
systems. Network capacity is a notable scalability point in distributed systems.
Structural scalability concerns the internal design of a system and how the
design lends itself to growth or shrinking of the system’s data model or, for
example, its deployment.
\subsection{Dimensions}
%The vocabulary: scale up, out, down, in wrt vertical, horizontal
Scalability has multiple dimensions as illustrated in
figure~\ref{fig:scalabilityDimensions}. Scaling is said to be \emph{vertical} if
the scaling point or points in question are internal to a server. For example,
the amount of RAM available to a specific server or its CPU speed is a vertical
scaling point in terms of that server. A vertically scaled system remains
logically equivalent in the process of scaling. Scaling this way is
straightforward as software requires no changes to take advantage of further
resources on a system. In contrast, adding more server instances,
\emph{horizontal} scaling, is a more coarse grained operation and requires
software to be written specifically to leverage the multiple servers by running
tasks in parallel~\cite{handbook-scaling}.
%The following paragraph is possibly better suited for the rapid elasticity or
%elastic architectures chapter?
Switching to a higher level of abstraction in system design changes the
viewpoint from horizontal to vertical. Horizontal scaling of nodes in a server
cluster or cloud can be considered vertical scaling from the viewpoint of the
utility provided (e.g. processing capacity) by it to a higher level system using
it as a component. This is how infrastructure as a service (IaaS) and platform
as a service (PaaS) scaling viewpoints differ. Horizontal IaaS virtual machine
scaling is vertical platform capacity scaling from the viewpoint of an
application deployed on a PaaS cloud where the platform manages the
infrastructure.
The vertical dimension gets exponentially more expensive as the system size
increases. The scaled system needs to be changed to a more capable type of
server as the need for more resources increases past what the current server can
physically support. This scaling path eventually leads to mainframes and
supercomputers. The limits for horizontal scaling, on the other hand, are
traditionally in the domain of data centers and the number of servers that can
fit in their racks. Horizontal scaling has been very common in Internet
architectures since the early days of the network, but recently virtualization
has made vertical scaling an important option to consider as well~\cite{VRB11}.
With vertical scaling, system configuration can be adjusted dynamically at
runtime in a matter of seconds, which is faster than the minute-or-two time
frame of horizontal scaling in a cloud. A combination of both scaling dimensions
can be used to implement a fine grained scaling solution.
Cloud computing pushes both scaling dimensions past their traditional
boundaries. Hybrid clouds and cloud interoperability make it possible to scale
out a system past the boundaries of data centers and cloud providers. The
network becomes the limiting factor here as the communication between nodes in a
system needs to be transmitted between clouds over the Internet.
A third dimension to consider is \emph{structural scalability} which has to do
with the behavior of a piece of software as its data model, amount of data or
number of tasks to execute varies in size~\cite{handbook-scaling}. A requirement
for scalable software is to be internally efficient in terms of the asymptotic
time and space complexity of its algorithms~\cite{algorithmBook} and
additionally support parallel processing in terms of tasks and
data~\cite{foundationsOfParallelBook}.
\begin{figure}[htbp]
\includegraphics[width=\textwidth]{images/scalingDimensions}
\caption{Three dimensions of scalability}
\label{fig:scalabilityDimensions}
\end{figure}
Task parallelism is a feature of software systems which are capable of
simultaneously executing multiple tasks on the same or different data. A serial
program, in contrast, must proceed with a single task at a time. Task parallel
software lends itself well to horizontal scaling as separate tasks can be
executed on distributed nodes of a system. Vertical scaling, on the other hand,
can be applied to adjust the performance of each task independently.
Data parallelism is the capability of a software system to perform the same
operation in parallel to different instances of data. In distributed systems, a
large computing task is typically split into multiple independent tasks which
can be executed on separate server nodes simultaneously without communication
between them. Results of the split tasks are then sent back to a controller
which combines them and computes the final result. This was already done on a
large scale in 1999 with the SETI@home project. Since then millions of
ordinary computer users have donated CPU time to search for extraterrestrial
intelligence by running an application which analyzes pieces of a large set of
radio telescope data~\cite{Korpela2001}\cite{setiathomewebsite}. More recently
Google and e.g.\ the open source distributed data processing framework Hadoop
have made use of the MapReduce programming model~\cite{handbook-mapreduce} for distributed data
parallel computing.
Structural scalability is closely related to the horizontal scaling dimension.
To take full advantage of horizontal scaling, the application has to support
parallel execution and minimize synchronization between the parallel threads of
execution. Depending on the use case, parallelism can be either task or data
based, but in both cases the notion of parallelism has to be built in to the
application.
\subsection{Tradeoffs}
Scaling a system is not without its negatives. Vertical scaling gets expensive
at an exponential rate when the system grows in available resources. Horizontal
scaling increases complexity of coordination between distributed nodes.
Structural scaling requires algorithm and data model design to fit the chosen
scaling mechanism.
Advances in computer science and technology have reduced the impact of these
tradeoffs from what it used to be with older technology. Virtualization and
dynamic provisioning of virtual machines have made it possible to use computing
resources more efficiently in a modern data center. High capacity servers are
not kept idle waiting for spikes in system load. Virtual resources can be
allocated dynamically based on demand at any given time. Nevertheless, for
vertical scaling, the cost is still the definitive tradeoff.
With horizontal scaling, a common tradeoff point is the need for communication
between nodes in a distributed system. If such communication can be avoided or
minimized, the software scales well: adding servers does not cause excessive use
of network bandwidth and the benefit of additional servers does not decrease as
the number of servers increases. Such a decrease could happen due to increased processing
needed for keeping the system's increasingly complex state synchronized. For
data parallel computations, the MapReduce model enables this on a massive scale
but requires algorithms to fit its mold with two distinct phases, \emph{map} and
\emph{reduce}~\cite{handbook-mapreduce}. The map phase distributes data to
mapper nodes where it is processed. Intermediate results from the map phase are
fed to reducer nodes for another round of processing which ends with the final
result. Both of the steps can be processed in parallel on distributed systems.
The model clearly requires a specific approach to algorithm implementation in
order to take advantage of parallel computation and is only applicable to
structurally similar problems which can be expressed in terms of the map and
reduce functions.
Task parallel software can get congested due to synchronized access to common
data. ACID (Atomicity, Consistency, Isolation, Durability) transactions are
inherently serial in nature so a shared relational database, for example,
quickly emerges as a bottleneck for scalability. To remedy this, databases can
be scaled by applying various techniques such as
sharding~\cite{scalableDataStores}. Need for ACID transactions should also be
scrutinized. Many highly scalable systems make do with the BASE (Basically
Available, Soft state, Eventually Consistent) consistency model instead of the
stricter ACID model~\cite{tivitbestpractices}\cite{VRB11}\cite{Buyya2010intercloud}.
When scaling in, the tradeoff is between ensuring performance and
reliability and minimizing cost. When a deployment is at its minimum size,
it is difficult to react reliably to increased load without false positives
while keeping the application responsive. Scaling out to accommodate the
load will take some time, so the decision should be made early enough to keep
the application responsive during the scaling activity~\cite{Roy2011}. Ensuring
performance, reliability and fault tolerance as required by e.g.\ a service level
agreement~\cite{Funika2011}\cite{Iqbal2011} sets limits for the minimum system
configuration. A capacity buffer of appropriate size has to be kept to allow
time for scaling activities.
At code level, in addition to the need for communication between horizontal
nodes, tradeoffs are made between the asymptotic complexity of algorithms in
terms of CPU time or data storage space needed for execution. System design
principles are of key importance to minimize the impact of scalability
tradeoffs.
Scalability should be considered in context. Discussing e.g. only the CPU
capacity of a system is a moot point if the system becomes overly complex or
expensive to maintain due to the increase in computational capacity. Designing a
system with one scalability factor in mind may reduce scalability in terms of
other factors. Tradeoffs like these are important to understand when designing
systems. The relative importance of scalability factors can be derived from the
requirements of the system in question. By going after the most important
factors, the utility (performance maintained by scaling which the stakeholders
experience as a tangible benefit)~\cite{Duboc2007} of the scaling effort is
maximized.
Scalability can be analyzed as a multi-criteria optimization problem where, given
the priorities of the system in question, different scaling strategies will
perform differently as the system grows. Multi-criteria analysis will help to
choose the correct scaling factors from both technical and stakeholder benefit
viewpoints as shown by Duboc in her work on the subject~\cite{Duboc2007}.
Choosing the strategy with the most utility for the system’s stakeholders should
be the goal.
\subsection{Bounds}
\label{sec:scalabilityBounds}
%speedup, amount of work done in same time, asymptotic complexity
Scalability analysis in the design phase of a system can save effort and costs
during a system’s lifetime as changes are easiest and cheapest to make in the
beginning. Any real system will have its bounds set by its environment and
stakeholders through functional and non-functional requirements. Some
requirements are harder to meet than others and an understanding of the laws of
scalability helps in managing expectations and succeeding in system
implementation. This chapter presents basic laws of scalability to establish the
limits within which scalability engineering takes place.
Typically a portion of any computation is not parallelizable. The size of this
portion determines the lower bound in terms of execution time for a program
according to Amdahl’s law~\cite{amdahlslaw}. The law can be expressed as a
function which gives the maximum speedup $S$ that can be achieved with $N$ nodes
working in parallel,
\begin{equation}
S(N) = \frac{1}{(1-P)+\frac{P}{N}},
\label{eq:amdahlslaw}
\end{equation}
where $P$ is the portion of the program that can be executed in parallel and
conversely $(1-P)$ the serial portion. As $N$ tends to infinity, the speedup
tends to $1/(1-P)$.
For example, if a given computation has a serial part which is 10\% of the
complete computation, then the benefit of increasing parallelism for the
remainder of the computation will tend towards zero as the number of parallel
nodes increases. The upper bound for $S(N)$ in this case is 10. The computation
can be sped up at most by a factor of 10 regardless of the number of parallel
processors introduced to the system. This case, along with other values of $P$, is
illustrated in figure~\ref{fig:amdahlsLaw}.
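As a worked illustration of this example, substituting $P = 0.9$ into
equation~(\ref{eq:amdahlslaw}) for a modest and for a very large number of nodes gives
\begin{equation*}
S(10) = \frac{1}{0.1 + \frac{0.9}{10}} \approx 5.26,
\qquad
S(1000) = \frac{1}{0.1 + \frac{0.9}{1000}} \approx 9.91,
\end{equation*}
so even a thousandfold increase in the number of parallel nodes cannot push the
speedup past the bound of $1/(1-P) = 10$.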
Before reaching the theoretical limit given by Amdahl's law, typically a
practical limit for evenly dividing $P$ into parallel tasks or data sets would
be reached. This implies that software design level structural scalability is
very important in order to keep the non-parallelizable code to a minimum.
\begin{figure}[h!]
\includegraphics[width=\textwidth]{images/AmdahlsLaw}
\caption{Amdahl's law states that the maximum speedup of a program when running
it on multiple processors is limited by the portion of the program that can be
run in parallel. (Image source: Wikimedia Commons~\cite{amdahlslawimage})}
\label{fig:amdahlsLaw}
\end{figure}
Amdahl’s law underlines the importance of algorithmic optimization to maximize
the speedup achievable with parallel processing. However, although the benefit
of adding more parallel nodes tends to zero, the processing capacity of each of
the nodes does not of course diminish in the process. In fact, the entire array
of parallel nodes is idle for $1-P$ percent of the execution. To make efficient
use of a horizontally scalable system, the problem therefore needs to be of a
nature which benefits from a large number of $N$. That is, the portion of
inherently serial code $1-P$ needs to be minimized and the problem needs to be
divisible to $N$ or more parallel parts.
Dividing a fixed set of data or tasks can only be done in a limited number of
practical ways. Amdahl’s law assumes a static size and even division for data
over $N$ nodes and gives the maximum proportional speedup but does not consider
that more data or tasks can be processed in the same time. In practice, the
benefit of parallel processing is larger given a problem with dynamic data or
task set size. Having more data or tasks is key to being able to split them $N$
ways. With virtualization, $N$ can also be adjusted to fit the input size.
\todo{could also discuss efficiency of parallel processing, E = S/N.}
Gustafson’s law~\cite{gustafsonslaw} shows that parallel processing is efficient
given the right kind of problem. It assumes the serial fraction of computation
$\alpha=1-P$ is static while the divisible amount of data or tasks grows evenly
with the number of parallel nodes $N$. The speedup according to the law is then
\begin{equation}
S(N) = N - \alpha(N-1).
\label{eq:gustafsonslaw}
\end{equation}
In practice $\alpha$ will also grow due to overhead caused by increased
parallelism, but as long as the overhead is insignificant, Gustafson’s law shows
that scaling horizontally is efficient up to large values of $N$ if the data or
task size grows with the system. This is illustrated in figure~\ref{fig:gustafsonsLaw}.
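As a corresponding worked illustration, keeping the serial fraction at $\alpha = 0.1$
as in the earlier example but letting the workload grow with the node count,
equation~(\ref{eq:gustafsonslaw}) gives
\begin{equation*}
S(10) = 10 - 0.1 \cdot 9 = 9.1,
\qquad
S(100) = 100 - 0.1 \cdot 99 = 90.1,
\end{equation*}
so the speedup keeps growing nearly linearly with $N$ instead of saturating at 10
as under Amdahl's fixed-size assumption.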
In contrast to Amdahl's law, Gustafson shows that parallel computing in a
dynamic environment (data divided into parts equal in number to that of
computing nodes) scales very well. Recently multicore processors have brought
more possibilities to architecting scalability~\cite{amhdalmulticore} but the
basic principles from the 1960s and 1980s still apply for computations done either with
processor cores or virtual nodes on an elastic cloud platform. Similar high level
algorithms work for both cases, and the source of computing resources can be
thought of as an abstract concept.
\begin{figure}[h!]
\includegraphics[width=\textwidth]{images/GustafsonsLaw}
\caption{Gustafson's law states that speedup increases linearly provided that the
amount of work grows evenly with the number of nodes available to process it.
(Image source: Wikimedia Commons~\cite{gustafsonslawimage})}
\label{fig:gustafsonsLaw}
\end{figure}
%Scalability models
\section{Scalability in cloud infrastructures}
\label{sec:cloudscalability}
Public infrastructure clouds (IaaS clouds) make computing resources available to
customers on a pay-per-use basis. Customers provision virtual servers, storage
and networks from a pool of physical resources. Part of the allure of IaaS
clouds is that the availability of further resources is made to seem infinite.
Cloud service providers do set limits to the size of deployment under a single
account, but those limits can be raised by separate agreement. The initial
limits are there more to avert denial of service attacks than to safeguard
against actual resource depletion.
Clouds are massively scalable systems in terms of performance, reliability,
cost, maintenance and a multitude of other quality attributes as the size of
the physical hardware deployment on which the virtual resources are
provisioned varies. Cloud computing takes distributed computing
forward by increasing dynamism in the structure of distributed systems.
Virtualization enables quick provisioning and deprovisioning of servers, storage
and networks. Setting up infrastructure on a public cloud for a new business can be
accomplished in a matter of minutes or hours. The pay-per-use model enables
quick responsiveness to change since adding servers to the environment does not
come with a large up front cost. Similarly, when removing servers from the
system, the released capacity will not necessarily go to waste. It becomes
available to other users of the cloud.
This thesis focuses on the user level of IaaS clouds and the benefits attainable
at that level for building scalable information systems. The underlying physical
implementation of a cloud and scalability therein is left mostly out of scope.
Higher levels of the cloud service stack (PaaS, SaaS) are not discussed
directly, but the elasticity measures presented in later chapters do apply to
them as well.
In cloud context, the basic principles of scalability remain as discussed in
chapter \ref{sec:scalability}. Vertical scaling is achieved by adjusting the
performance of existing virtual machines by changing the amount of available
resources. This can imply relocating the virtual machine to a different physical
host if the current host can’t accommodate the scaled up
VM~\cite{Verma2010-CostOfReconfigurationInCloud}. In practice, vertical scaling
is currently slower than it could be due to limitations on adjusting CPU and RAM
dynamically at runtime~\cite{VRB11}. Changing these parameters requires a
restart and, for example on Amazon EC2, a newly provisioned VM instance will
replace the old one. The process is heavyweight relative to the gained benefit and,
as discussed above, grows exponentially more expensive as resource demands
increase.
Horizontal scaling is where clouds excel. Virtual machines are cloned as needed
and load is balanced among them. Scaling the network, load balancers and other
infrastructure tools like monitoring is needed when the system grows to surpass
their capacity~\cite{VRB11}.
\subsection{Rapid elasticity}
A cloud is said to be elastic~\cite{nistdefinition} if the resources it provides
can be provisioned and deprovisioned dynamically and automatically. This implies
the necessity to monitor the cloud so that provisioning decisions can be made
based on performance data. Provisioning must be automatic, i.e. decisions to
scale out or scale in should be acted on without human intervention. This
implies the need for cloud customers to access a programmatic interface with
which cloud provisioning actions are carried out. The actions should resolve as
fast as possible to enable constant matching of the size of deployment to
service demand.
The benefit of elasticity is realized when the gap between demand and capacity
can be kept as small as possible (see figure~\ref{fig:oldVsNewScalability}). When
demand increases and more capacity is needed, rapid elasticity can enable the
service to scale out quickly enough so that no requests need to be refused.
Scaling in rapidly when demand decreases means unneeded resources are kept
reserved for a shorter time and consequently less money is wasted on unused
capacity. The utilization rate of provisioned resources can be kept at a better
level compared to a system that would prepare for demand spikes by
overprovisioning resources which then end up being idle during off-peak demand.
\begin{figure}[h!]
\includegraphics[width=\textwidth]{images/oldVsNewScaling}
\caption{Elastic scaling allows capacity to closely follow demand whereas
traditional non-virtualized capacity is slower to provision and typically
remains unused but reserved when demand decreases.}
\label{fig:oldVsNewScalability}
\end{figure}
\subsection{Virtual machine lifecycle} \label{sec:VMLifeCycle}
Rapid elasticity is all about adjusting the size of a system by instantiating
new virtual machines (VMs) and terminating existing ones. This takes the VMs
through a lifecycle. Optimizing this lifecycle is key to successful rapid
elasticity.
The VMs go through a number of phases during the lifecycle. The high level
phases from an application perspective are
\begin{itemize}
\item template preparation,
\item instance configuration,
\item instance start,
\item instance contextualization,
\item instance monitoring (running state) and
\item instance termination.
\end{itemize}
In the template preparation phase, the virtual machine and its data are prepared
up to a point from which it can be instantiated in the cloud. The template could
be a basic installation of an operating system on virtual hardware or further
specialized for a specific purpose. The tradeoff between generic and specialized
templates is the time it takes to configure and contextualize an instantiated
generic VM for a specific purpose and, on the other hand, the
effort needed to maintain specialized templates which can be applied quickly to
newly provisioned VMs.
Instance configuration is the first phase on the way to instantiating a specific
VM instance from the template. This phase may include steps like choosing the
size of the VM instance, i.e.\ how much memory and CPU capacity the instance will
have. Network configuration is set at this phase as well as other virtual
hardware configuration. Security settings such as SSH access keys are configured
in this phase before the VM is started up.
With the template chosen and configuration set, the VM instance is ready to be
started. This phase is in the cloud provider’s domain, but customers need to be
able to monitor the progress in order to have up to date information on their
deployment. Behind the scenes, the cloud provider chooses a physical server on
which to allocate the VM instance and makes the necessary changes in their
system to allocate portions of physical CPU, memory, storage and other resources
to the VM.
When startup is complete, the customer system will learn of the availability of
the new instance via some reporting mechanism offered by the cloud provider.
This is typically an API query over HTTP, i.e. a request-response cycle. An
event mechanism whereby the cloud notifies the customer would be preferable to
shorten the feedback time and avoid busy-looping to query the status, but
scalability and security considerations on the cloud provider side may prevent
such a scenario.
After starting up, the virtual machine needs to be contextualized for the
dynamic runtime environment of the service it is part of. The VM could be added
to a group of workers fetching work items from a queue or added to a load
balanced cluster of application servers, for example. Monitoring and other
infrastructure services are configured with runtime information at this point.
To work around waiting time in a scenario where a controller component would
connect to the new VM to perform contextualization tasks after it starts up, the
virtual machine may be configured to pull its context from another server by
executing a script at startup. Context may additionally be provided as a
mountable block storage volume separate from the template. The Open
Virtualization Format (OVF) standard advocates the use of ISO CD images for this
purpose~\cite{ovf11}. Amazon and Eucalyptus among others provide a local network
service for querying instance specific metadata over HTTP.
The contextualization phase has seen a lot of research activity. Examples
include one-off solutions to accomplish a specific goal such as joining
instantiated VMs to a scientific computing cluster~\cite{Kijsiponge2010},
standardizing an interface between VM instances and a configurator component
to separate the concerns of VM internal implementation and deployment
configuration following the inversion of control principle~\cite{Liu2011}, and
using this phase to carry out tasks related to a higher level service
management approach~\cite{frominfratoservice}\cite{Kirschnick2010}\cite{Chapman2010}.
After contextualization, the VM instance is in the running state. The VM carries
out its tasks and reports its status as configured until, at some point in time,
the VM will be shut down. The termination phase is where the VM should inform
all related system components of its eventual termination so that the system as
a whole can react to it by e.g.\ removing the instance from the load balancing
setup or from the monitoring scope.
These phases need to be customizable so that cloud customers can add their own
logic to them. Template preparation, configuration, contextualization and
termination phases are the main customization points. Automation tools like
Puppet \cite{puppetlabswebsite} and Chef \cite{chefwebsite} exist to help system
administrators carry out configuration tasks. Claudia \cite{frominfratoservice}
proposes a new abstraction layer on top of IaaS to enable more purposeful cloud
service management including use of multiple cloud service providers.
\subsection{Triggers and bounds -- monitoring an elastic cloud deployment}
Clouds have the capability to scale, but system specific logic is needed to make
decisions on when and how to scale. Scaling decisions can be based on the
business requirements set for the system. Good requirements are measurable and
unambiguous. What is measurable depends on the monitoring capabilities of the
cloud system. The monitoring subsystem needs to be customizable so that service
specific metrics can be included in the data set and scaling logic. Cloud
providers typically provide monitoring facilities, but separate monitoring tools
like Ganglia \cite{gangliapaper} serve this purpose in hybrid or highly
specialized configurations. With separate solutions, the cloud customer has full
control over the monitoring subsystem and it can be used in private clouds as
well as in hybrid configurations. The tradeoff is having to maintain the monitoring components if they are not provided as a service.
Quality of monitoring data is important to make timely decisions. With large
deployments, the amount of data can be large and analyzing it all can put load
on the system. Data is typically aggregated from service tiers or groups of
servers to reduce the amount of raw data that is to be processed by the
monitoring subsystem. Another way to reduce monitoring load is to gather data at
longer intervals. This quickly reduces the quality of the scaling metrics. Cloud
systems aiming at just-in-time scalability already have to account for
provisioning delays of tens of seconds or a few minutes. If the data on which
scaling decisions are based is also a few minutes old, this makes the total
reaction time sum up to e.g. 10 minutes. Balancing the monitoring overhead and
scaling reaction time is an exercise needed to optimize each system.
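As an illustrative decomposition of this total (the actual components and their
magnitudes vary from system to system), the reaction time to a change in demand can
be thought of as
\begin{equation*}
t_{\mathrm{react}} \approx t_{\mathrm{age}} + t_{\mathrm{decision}} + t_{\mathrm{provision}},
\end{equation*}
where $t_{\mathrm{age}}$ is the age of the newest monitoring data, $t_{\mathrm{decision}}$
the time spent analyzing it and deciding on an action and $t_{\mathrm{provision}}$ the
provisioning delay of the cloud. For example, data aggregated at five minute intervals,
a one minute decision window and a four minute provisioning delay already add up to a
worst-case reaction time of ten minutes.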
The metrics used to make scaling decisions are typically related to performance
or fault tolerance. CPU and network load and available storage capacity are
straightforward metrics on a subsystem level as well as a heartbeat metric
indicating the live status of each VM. System-wide and service specific metrics
like requests handled per second, time spent on each service tier and the size
of work queues are understandable by business stakeholders and therefore usable
for concretely agreeing on and discussing system performance. Such metrics are
typical for quantifying the quality of service (QoS) and are referred to in
service level agreements (SLA)~\cite{Boloor2011} with specific thresholds that should not be crossed.
%predictive, reactive provisioning, Urgaonkar2008
Operating a system has to be profitable or at least sustainable. Cost is often
the upper bound for scaling a system in the cloud. Business stakeholders need to
set limits above which the system is not allowed to scale based on cost. The
lower bound is set by technical limitations of system architecture or business
requirements on fault-tolerance and availability. Understanding the economics of
IT systems deployed on clouds is a key success factor in the long run
\cite{Suleiman2011}. Cloud adoption in enterprises began with simple cost saving
goals but is moving towards enablement of lean enterprises capable of quick
changes in business direction~\cite{Marston2011}.
Clouds are a technology which levels the IT system playing field considerably
between startups and large corporations. With the pay-per-use model, large up
front investments in computing infrastructure are not required to start a
business, yet the scalability is available in case the service popularity
explodes.
%quality of elasticity, Suleiman
%performance variation of VMs
% \begin{itemize}
% \item Elasticity: when/how to scale
% \begin{itemize}
% \item Infrastructure prerequisites
% \item Lifecycle
% \item Triggers
% \item Business considerations
% \begin{itemize}
% \item Elasticity window (min/max)
% \end{itemize}
% \end{itemize}
% \end{itemize}
\section{Elasticity}
\label{sec:elasticity}
This chapter introduces a theory of elasticity based on the concept of a
controlled process loop governed by business requirements. Metrics are
identified as tools for defining rules that govern a system's measures to stay
conformant to the requirements. The concept of \textit{utility} of a system's
performance based on multi-criteria analysis of conflicting requirements is
developed. The overall utility over a range of time is introduced as
\textit{quality of elasticity, QoE} and the ability of a system to maximize its
utility within a defined range given a specific usage pattern is defined as the
\textit{QoE score}.
The fact that utility and QoE are based on business requirements is key here. Cloud adoption in enterprises is increasingly a step towards making the business more agile instead of saving on IT operations costs~\cite{cloudberkeleyviewacm}. The QoE concept facilitates measurement and scoring of system behavior given a set of business requirements. The system's performance can be optimized by varying either the cloud platform, the way the elastic scaling is handled or the parallelizability of the application implementation. The direction and need for optimizations originates from the business requirements and the QoE results are understandable by business stakeholders. This enables a quick feedback loop between business and IT stakeholders.
\subsection{Elasticity as a controlled process}
\label{sec:elasticity_as_a_controlled_process}
Efficient cloud deployed applications can change their deployment configuration in a matter of minutes if not seconds. To effectively manage a system at such speeds it
is essential that reactions to regularly occurring or anticipated events are
built in to the system and automated.
Concepts from process control theory and autonomic computing can be applied to
implement a cloud application system which knows its state and reacts to changes
in it. An essential part of such a system is a controller component external to
the application itself. The responsibilities of such an \emph{elasticity controller}
are to monitor the system, analyze the metrics, plan corrective actions and
execute them. This is known as a MAPE-K control
loop~\cite{Huebscher2008}\cite{Mueller2009} named after its phases (Monitoring,
Analysis, Planning, Execution) and the shared Knowledge. The knowledge in MAPE-K loops is
shared data between the actual MAPE phases. The loop is illustrated in figure~\ref{fig:mape-k}.
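As a minimal sketch, not a description of the prototype presented later, one iteration
of such a control loop can be expressed in pseudocode with the steps corresponding
directly to the MAPE-K phases.
\begin{algorithm}[h!]
\caption{A sketch of one iteration of an elasticity controller's MAPE-K loop}
\begin{algorithmic}[1]
\State $metrics \gets$ \Call{Monitor}{$deployment$} \Comment{collect raw data from monitoring agents}
\State $knowledge \gets$ \Call{Analyze}{$metrics, knowledge$} \Comment{update the shared system model}
\If{$knowledge$ shows a requirement is violated or about to be violated}
\State $plan \gets$ \Call{Plan}{$knowledge$} \Comment{e.g.\ scale out, scale in or do nothing}
\State \Call{Execute}{$plan$} \Comment{invoke cloud provisioning APIs and other effectors}
\EndIf
\end{algorithmic}
\end{algorithm}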
The cloud application deployment is monitored and the configuration is adjusted
based on metrics reported by monitoring agents (software components) attached to
the application or its environment. This attachment can be non-intrusive, where
the agent is located outside the application and monitors external phenomena
like network traffic or CPU load. Intrusive monitor attachment works by
instrumenting the execution environment or application itself for monitoring.
For example, a Java virtual machine (JVM) can be instrumented using the
java.lang.instrument API to monitor the internal workings of the JVM. Aspect
oriented programming can be used to instrument at the application level to
monitor metrics unique to the application or its business logic.
\begin{figure}[h!]
\includegraphics[width=\textwidth]{images/mape-k}
\caption{The elasticity controller's functionality is modeled after the \mbox{
MAPE-K} control loop.}
\label{fig:mape-k}
\end{figure}
Monitoring data is analyzed by the controller in the corresponding phase of the
MAPE-K loop. The raw sensor data is turned into knowledge in this phase. The
MAPE-K knowledge can be an elaborate modeled abstraction of the system into which the
data is fed, or simply a group of variables reflecting the current state of the
monitored system and the way it is changing over time.
Analysis of the system model may indicate that one or more criteria of
acceptable system behavior are no longer met (reactive trigger) or some metric
is about to exit its tolerated range (proactive trigger). Given such a
situation, the controller will enter the planning phase with the purpose of creating
a plan of action to bring the metric values back to or keep them in the
tolerance zone. This plan can be based on a set of rules that govern the
operation of the controller component or again a more elaborate model driven
approach which approximates the behavior of the actual system.
The execution phase is where the controller or its delegate effector components
interface with the application and the cloud environment to carry out the actions
decided in the planning phase. This phase relies on automation APIs available
for the environment and the runtime configurability of the application.
The executed actions will cause changes in the behavior of the system which are
then reported back to the controller in subsequent control loops.
\subsection{Rules to satisfy requirements} The control loop needs metrics that
are relevant to the system in question and bounds to specify acceptable value
ranges for the metrics. Each metric and its acceptable range represent a
\emph{requirement} for the controller. Rules for controlling the system are
created with the purpose of making sure the system will always meet these
requirements.
Requirements expressed in terms of the implementation technology (system load,
network traffic, etc.) are straightforward to set up for monitoring and further
processing. If non-technical stakeholders like business decision makers are
involved in the requirements elicitation, technical requirements may be
difficult to communicate understandably. Therefore higher level requirements
(e.g. cost per visit to a website, type of user activity, etc.) expressed in
business terms may be the starting point of defining the elasticity requirements
for a system.
To monitor and make scaling decisions based on metrics expressed in business
terms, it is necessary to instrument the application code or monitor the state
of the application's domain model (database). This kind of monitoring takes more
effort compared to non-intrusive technical metrics since the monitoring has to
be customized for the application. The choice of customization or relying on
lower level metrics is a tradeoff one has to make when designing an elastic
system. A mapping from business requirements to technical requirements~
\cite{Chen2008}\cite{Emeakaroha2010}\cite{Wu2011}\cite{Suleiman2011} may be
necessary to facilitate communication of the requirements from their source down
to the implementation of the controller.
\subsection{Multi-criteria decision analysis} Often the requirements given for
the performance of a system conflict with each other. If, for example, a system is
optimized in terms of response time by adding more virtual machines to the
deployment, the cost may rise too high to operate the system. Or if memory usage is
minimized by writing data to disk, the performance may suffer due to increased
access time to data. The requirements may form a complex network of such
interdependencies. It quickly becomes difficult to specify simple rules for
satisfying all the requirements simultaneously.
Multi-criteria decision analysis~\cite{Duboc2007}\cite{Ke2012} is a method
for finding an optimal decision considering conflicting criteria. It can be
applied here to formalize the decision making under conflicting requirements.
\missingfigure{Could show conflicting preference function graphs here and a pareto frontier with their mutually optimum points...}
The multiple criteria are considered together by the use of a \emph{utility
function}
\begin{equation}
U(X) = \sum\limits_{i=1}^k w_{i}P_{i}(X) \label{eq:utilityfunction}
\end{equation}
with a normalized range $U(X) \in [0, 1]$ in the domain of real numbers, where a
value of 0 denotes the worst possible utility and 1 denotes that the system
fully satisfies its combined requirements. $X$ is a set of $j$ parameters
$\{x_{1}, \dots, x_{j}\}$ which are needed to calculate the utility. Metric
values and other knowledge of the system state are typical parameters. The
utility function is a weighted sum of $k$ \emph{preference functions} $P_{i}(X)$
with $1 \le i \le k$. Each elasticity related requirement is defined as a
preference function $P_{i}$ with a normalized range $P_{i}(X) \in [0, 1]$, where
a value of 0 denotes the worst possible preference for this requirement and 1
denotes that the requirement has been optimally fulfilled. The weights $w_{i}$
represent the relative importance of each preference function to overall
utility, with $\sum_{i=1}^k w_{i} = 1$.
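As a hypothetical worked example, independent of the prototype discussed later,
consider two requirements: a response time preference that is fully satisfied up to a
mean response time $t$ of two seconds (with $t$ being one of the parameters in $X$)
and degrades linearly to zero at five seconds,
\begin{equation*}
P_{1}(X) =
\begin{cases}
1 & \text{if } t \le 2 \\
(5 - t)/3 & \text{if } 2 < t < 5 \\
0 & \text{if } t \ge 5,
\end{cases}
\end{equation*}
and a cost preference $P_{2}$ that currently evaluates to $0.5$. If the measured mean
response time is $t = 2.6$, then $P_{1}(X) = (5 - 2.6)/3 = 0.8$. Weighting response
time as three times more important than cost, $w_{1} = 0.75$ and $w_{2} = 0.25$, the
utility of the system at that moment is
\begin{equation*}
U(X) = 0.75 \cdot 0.8 + 0.25 \cdot 0.5 = 0.725,
\end{equation*}
i.e.\ the system satisfies roughly three quarters of its combined, weighted requirements.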
\subsection{Quality of elasticity}
The utility function (\ref{eq:utilityfunction}),
given business-related preferences, measures the business utility of a system
with regard to its performance metrics. Plotting the utility over time as the
usage pattern changes shows how the system responds to these changes. A perfectly
elastic system would adjust its capacity to match or slightly surpass the
required level for maximum utility. The aggregate measure of utility over time
shows how well the system responds to changes, i.e. how well the system scales
out and in as a response to changes in its environment.
The \emph{quality of elasticity} (QoE) for a system over time can be quantified
as the integral of the utility function from some moment of time $a$ to time
$b$ divided by the duration of the measurement $b - a$:
\begin{equation}
QoE = \frac{\int\limits_a^b U(X)~dt}{b-a} \label{eq:qoefunction}
\end{equation}
The range for $QoE$ is the same as that of the utility function, i.e. $QoE \in [0, 1]$ in the domain of real numbers. Figure~\ref{fig:utility-qoe} illustrates the QoE concept as the area on a graph between the values of $U(X)$ and the x-axis over time.
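As a simple hypothetical illustration, if a system maintained a utility of $0.9$ for the
first half of a test run and, after failing to scale out quickly enough, a utility of
$0.5$ for the second half, its quality of elasticity over the whole run would be
\begin{equation*}
QoE = \frac{0.9 \cdot \frac{b-a}{2} + 0.5 \cdot \frac{b-a}{2}}{b-a} = 0.7,
\end{equation*}
i.e.\ the time-averaged utility of the system over the measurement period.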