Posted on 10 Nov 2008, 06:47 by Alberto Bartoli [updated on 22 Jul 2009, 03:31]
Alberto Bartoli, Giorgio Davanzo, Eric Medvet,
IEEE Internet Computing vol. 13, n. 4, pp. 52-58, July/August 2009.
Web site defacement has become a common threat for organizations
exposed on the web. Several statistics indicate the
number of incidents of this sort, but a crucial piece of
information is still lacking: the typical duration
of a defacement. Clearly, a defacement lasting one week is much more
harmful than one lasting a few minutes. In this paper we present the results
of a two-month monitoring activity that we performed over more than
62000 defacements in order to figure out whether and when
a reaction to the defacement is taken. We show that such time tends to
be unacceptably long (in the order of several days) and to follow a
long-tailed distribution. We believe our findings may improve the
understanding of this phenomenon and highlight issues deserving
attention by the research community.
|
Posted on 10 Nov 2008, 06:47 by Alberto Bartoli [updated on 3 Dec 2008, 04:54]
Giorgio Davanzo, Eric Medvet and Alberto Bartoli,
Proc. 12-th IEEE International Conference on Information Visualization,
July 2008, pp. 527-532.
Hand-held devices have become widespread and now offer
significant computing capabilities, which creates increasing
pressure to use these devices for tasks formerly limited to
notebooks, like web browsing. Due to their small screens, however,
hand-held devices cannot directly display documents that were not
designed explicitly for small-screen rendering. Such documents may be
rendered either at a scale too small to be useful, or at a scale that
requires intensive scrolling operations by the user. Unfortunately,
scrolling a small window across a large document with a hand-held
device is quite cumbersome.
In this paper we propose a scrolling system that is much simpler and
more natural to use, based on the embedded camera, a component available
in every modern hand-held device. We detect device motion by analyzing
the video stream generated by the camera and then translate that motion
into scrolling of the content rendered on the screen. This way, the user
experiences the device screen as a small movable window on a larger
virtual view, without requiring any dedicated motion-detection hardware.
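The abstract does not say how motion is extracted from the video stream, so the following is only a minimal sketch of one common option: exhaustive block matching between two consecutive grayscale frames, scored by the mean absolute difference. The names `estimate_shift` and `update_scroll`, the `gain` parameter, and the sign convention are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale frame stored as rows of pixel intensities


def estimate_shift(prev: Frame, curr: Frame, max_shift: int = 2) -> Tuple[int, int]:
    """Return (dx, dy) such that curr[y][x] best matches prev[y - dy][x - dx].

    Every candidate shift in [-max_shift, max_shift] is tried and scored by the
    mean absolute difference over the region where the two frames overlap.
    """
    h, w = len(prev), len(prev[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0, 0
            for y in range(max(0, dy), min(h, h + dy)):
                for x in range(max(0, dx), min(w, w + dx)):
                    total += abs(curr[y][x] - prev[y - dy][x - dx])
                    count += 1
            cost = total / count
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def update_scroll(scroll: Tuple[int, int], shift: Tuple[int, int],
                  gain: int = 10) -> Tuple[int, int]:
    """Move the viewport opposite to the apparent image motion.

    Assumed convention: when the scene appears to move by (dx, dy) in the
    image, the device moved the other way, so the viewport follows the device.
    """
    return (scroll[0] - gain * shift[0], scroll[1] - gain * shift[1])
```

A real camera stream would need larger search windows and some smoothing across frames; the brute-force search is kept here only because it makes the matching criterion easy to read.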
We performed an experimental evaluation aimed at assessing the
effectiveness of the proposed system in the considered scenario, which is
characterized by low image quality and an unpredictable framed scene,
among other factors. We ran an objective benchmark quantifying the accuracy of the
detected trajectory and a subjective benchmark examining users'
confidence with the proposed system. For the latter evaluation, we
involved a panel of 20 subjects who followed a trajectory with our
system and, as a comparison, with keyboard, mouse and touchpad. The
results demonstrate that our approach is indeed practical.
|
Posted on 10 Nov 2008, 06:45 by Alberto Bartoli [updated on 11 Nov 2008, 00:07]
Giorgio Davanzo, Eric Medvet and Alberto Bartoli,
Proc. 23-rd IFIP International Information Security Conference,
September 2008, pp. 711-716.
Web site defacement, the process of introducing unauthorized
modifications to a web site, is a very common form of attack.
Detecting such events automatically is very difficult because web pages
are highly dynamic and their degree of dynamism, as well as their
typical content and appearance, may vary widely across different pages.
Anomaly-based detection can be a feasible and effective solution for
this task because it does not require any prior knowledge about the
page to be monitored.
Instead, a profile may be generated automatically by observing the page
for a while, and any later deviation from that profile may be considered
a defacement.
We previously developed an anomaly detection algorithm tailored to this
problem and showed that the approach indeed delivers satisfactory
performance.
A key feature of our proposal is that it incorporates domain-specific
knowledge about the nature of web content.
In this paper we broaden our analysis of automatic detection of web
defacements by examining several anomaly detection techniques that have
been proposed in the literature for network/host intrusion detection.
We assess the performance of such techniques in terms of False Positive
Rate and False Negative Rate, by using our earlier domain
knowledge-based algorithm as a baseline.
Our evaluation is based on a dataset that we constructed by observing
15 highly dynamic web pages for two months and that includes a set of
95 real defacements.
This study provides further insight into the problem of
automatic detection of web defacements. We want to ascertain whether
existing techniques for anomaly-based intrusion detection may be applied to
this problem, and we want to assess the pros and cons of incorporating
domain knowledge into the detection algorithm.
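To make the profile-and-deviation idea and the two evaluation metrics concrete, here is a minimal sketch. It is a generic anomaly detector, not the authors' algorithm: it assumes each page reading is reduced to a single numeric feature (for instance the page size in bytes), and the function names and the threshold `k` are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def build_profile(readings: Sequence[float]) -> Tuple[float, float]:
    """Learn a profile (mean, standard deviation) from attack-free readings."""
    return mean(readings), pstdev(readings)


def is_anomalous(value: float, profile: Tuple[float, float], k: float = 3.0) -> bool:
    """Flag a reading that deviates from the profile by more than k deviations."""
    mu, sigma = profile
    return abs(value - mu) > k * max(sigma, 1e-9)


def rates(alerts: Sequence[bool], labels: Sequence[bool]) -> Tuple[float, float]:
    """Return (False Positive Rate, False Negative Rate).

    labels[i] is True when reading i is a real defacement;
    alerts[i] is True when the detector raised an alert for it.
    """
    false_pos = sum(a and not l for a, l in zip(alerts, labels))
    false_neg = sum(l and not a for a, l in zip(alerts, labels))
    negatives = sum(not l for l in labels)
    positives = sum(bool(l) for l in labels)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr
```

A single feature with a fixed threshold is far cruder than the techniques compared in the paper; it is used here only so that the False Positive Rate and False Negative Rate computations are easy to follow.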
|
Posted on 10 Nov 2008, 06:44 by Alberto Bartoli [updated on 11 Nov 2008, 00:07]
P. Vercesi, A. Bartoli,
Proc. 9-th Modern Information Technology in the
Innovation Processes of the Industrial Enterprises (MITIP 2007).
|
Posted on 10 Nov 2008, 06:43 by Alberto Bartoli [updated on 11 Nov 2008, 00:07]
C. Fillon, A. Bartoli,
Proc. 2007 IEEE Conference on Evolutionary Computation, Sept. 2007, pp. 23-30.
Symbolic regression is aimed at discovering mathematical expressions, in symbolic form, that
fit a given sample of data points. While Genetic Programming (GP) constitutes a powerful tool
for solving this class of problems, its effectiveness is still severely limited when the data
sample requires different expressions in different regions of the input space, i.e., when
the approximating function should be discontinuous.
In this paper we present a new GP-based approach for symbolic regression of discontinuous
functions in multivariate data-sets. We identify the portions of the input space that require
different approximating functions by means of a new algorithm that we call Hyper-Volume Error
Separation (HVES). To this end we run a preliminary GP evolution and partition the input space
based on the error exhibited by the best individual across the data-set. Then we partition the
data-set based on the partition of the input space and use each such partition for driving an
independent, preliminary GP evolution. The populations resulting from such preliminary
evolutions are finally merged and evolved again.
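The abstract does not give the details of HVES, so the sketch below is not that algorithm. It only illustrates the general idea of using the error of a preliminary model to decide where to partition the data, under strong simplifying assumptions: one input variable, sorted inputs, a least-squares line standing in for the best GP individual, and a split at the largest jump in residual. The final merge-and-evolve step has no counterpart here. All function names are hypothetical.

```python
from typing import Sequence, Tuple

Line = Tuple[float, float]  # (slope, intercept)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Line:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def split_by_error(xs: Sequence[float], ys: Sequence[float]) -> int:
    """Return i so that points [0, i) and [i, n) form the two regions.

    Assumes xs is sorted in ascending order. The split is placed where the
    residual of the preliminary fit changes most between neighbouring points.
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    jumps = [abs(residuals[i] - residuals[i - 1]) for i in range(1, len(xs))]
    return jumps.index(max(jumps)) + 1


def piecewise_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, Line, Line]:
    """Fit one model per region; returns (boundary x, left model, right model)."""
    i = split_by_error(xs, ys)
    return xs[i], fit_line(xs[:i], ys[:i]), fit_line(xs[i:], ys[i:])
```

On a step-shaped sample, a single line fits both regions poorly, while the two per-region fits recover each constant level.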
We compared our approach to the standard GP search and to a GP search for discontinuous
functions in univariate data-sets. Our results show that coupling HVES with GP is an effective
approach and provides significant accuracy improvements while requiring fewer computational
resources.
|
Posted on 10 Nov 2008, 06:42 by Alberto Bartoli [updated on 11 Nov 2008, 00:08]
Eric Medvet, Cyril Fillon, Alberto Bartoli,
Proc. 3-rd IEEE International Symposium on Information Assurance
and Security (IAS'07).
Web site defacement, the process of introducing unauthorized
modifications to a web site, is a very common form of attack. In this
paper we propose a novel approach aimed at monitoring the integrity of
remote web pages automatically while remaining fully decoupled from
them, in particular, without requiring any prior knowledge about the
page. Our approach is based on Genetic Programming (GP), an automatic
method for generating computer programs inspired by analogies with the
evolution theory described by Darwin. In a preliminary learning phase,
GP builds an algorithm based on a sequence of readings of the observed
page and a sample set of attacks. Then, we monitor the page at regular
intervals applying the algorithm, which raises an alert when a suspect
modification is found. We developed a prototype and tested our approach
over a dataset of 15 dynamic web pages, observed for about a month, and
a collection of real web defacements. We compared the experimental
results with those of an anomaly-based approach of known effectiveness,
and the results are encouraging: Genetic Programming is an effective
approach for this task. |
Posted on 10 Nov 2008, 06:41 by Alberto Bartoli [updated on 11 Nov 2008, 00:08]
P. Vercesi, A. Bartoli, Proc. IEEE-ICCCN'07 - Workshop on Advanced Networking and Communications.
Organizations are increasingly aggregating their computing resources
to form Internet-based grids specialized in specific application
workflows and made available to other organizations. The scheduling of
jobs in such grids may clearly have a substantial impact on
performance, but finding effective scheduling policies is hard due to
the very nature of this scenario. Performance may greatly depend
on a myriad of parameters whose values can hardly be determined in
practice. Moreover, the load injected by users is typically
unpredictable, the performance of Internet links may vary widely during an
execution, and the computing resources at participating organizations could
also vary dynamically, perhaps because of additional workloads injected
by other competing activities.
In this paper we propose
mechanisms and policies for controlling the scheduling of jobs in such
a highly dynamic environment. We attempt to minimize the resource usage
at the participating organizations while maintaining the performance
delivered to clients at an acceptable level. Our approach consists of a
form of admission control at the entrance point of the application
workflow that is simple to deploy in practice and does not need any
hook from the participating organizations. We simply vary dynamically
the maximum number of jobs that can be injected within the grid, based
on performance measures taken online, and delay excess jobs. We
have evaluated our proposal in detail, by simulation, focussing on its
ability to adapt automatically to perturbations in the form of
substantial and unexpected changes in the amount of computing resources
available. We have found that our proposal is indeed capable of finding
automatically a suitable trade-off between throughput and resource
usage, even in such a dynamic scenario. |
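The admission control described in the abstract above (a cap on the number of jobs injected into the grid, with excess jobs delayed) can be sketched as a small gate placed at the workflow entrance. This is a minimal illustration, not the authors' implementation: the class name `AdmissionGate`, its methods, and the FIFO ordering of delayed jobs are assumptions.

```python
from collections import deque
from typing import Deque, List


class AdmissionGate:
    """Admit at most `limit` jobs at once; delay the rest in arrival order."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.in_flight = 0
        self.waiting: Deque[str] = deque()

    def submit(self, job: str) -> List[str]:
        """Queue a new job and return the jobs admitted as a result."""
        self.waiting.append(job)
        return self._release()

    def complete(self) -> List[str]:
        """Record that one admitted job finished; may admit delayed jobs."""
        self.in_flight -= 1
        return self._release()

    def set_limit(self, limit: int) -> List[str]:
        """Change the cap at run time, e.g. from online performance measures."""
        self.limit = max(1, limit)
        return self._release()

    def _release(self) -> List[str]:
        admitted = []
        while self.waiting and self.in_flight < self.limit:
            admitted.append(self.waiting.popleft())
            self.in_flight += 1
        return admitted
```

Because the gate only counts submissions and completions at the entrance point, it needs no cooperation from the organizations that execute the jobs, which is the property the abstract emphasizes.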
Posted on 10 Nov 2008, 06:39 by Alberto Bartoli [updated on 11 Nov 2008, 00:09]
Eric Medvet, Alberto Bartoli,
Proc. Fourth International Conference on Detection of Intrusions & Malware, and
Vulnerability Assessment, pp. 60-78, Lecture Notes in Computer Science 4579, Springer, 2007
(http://www.dimva2007.org/).
Anomaly detection is a commonly used approach for constructing
intrusion detection systems. A key requirement is that the data used
for building the resource profile are indeed attack-free, but this
issue is often skipped or taken for granted. In this work we consider
the problem of corruption in the learning data, with respect to a
specific detection system, i.e., a web site integrity checker. We used
corrupted learning sets and observed their impact on performance (in
terms of false positives and false negatives). This analysis enabled us
to gain important insights into this rather unexplored issue. Based on
this analysis we also present a procedure for detecting whether a
learning set is corrupted. We evaluated the performance of our proposal
and obtained very good results up to a corruption rate close to 50%.
Our experiments are based on collections of real data and consider
three different flavors of anomaly detection. |
Posted on 10 Nov 2008, 06:39 by Alberto Bartoli [updated on 11 Nov 2008, 00:09]
P. Vercesi, A. Bartoli,
in Proc. 31-st IEEE COMPSAC - Computer Software and Applications Conference 2007.
Composition of Internet-based services exported by different
organizations has quickly become a key software paradigm for
engineering and scientific communities. The widespread diffusion and
acceptance of protocols for programmatic interactions across different
organizations (e.g., web services, grid computing) has made feasible
the development of workflows encompassing geographically remote
organizations connected to the Internet. Although the scheduling of
workflow invocations in such a multi-organization, multi-tiered and
geographically dispersed environment may have a strong impact on
performance, this issue has not been explored very much so far.
In this work we consider the scheduling of workflow invocations from the
perspective of the service providers. We focus on the trade-off
between the performance of the workflow and the cost
incurred at the participating organizations, where the cost
metric is the time spent at each organization for executing the workflow.
We propose an adaptive mechanism to be deployed at the workflow engine
that is capable of keeping performance close to the optimal level,
while decreasing significantly the cost at the participating
organizations. Our proposal can be implemented very simply and
consists of a form of admission control: we place an upper bound on the
number of jobs concurrently being executed and delay excess jobs. The
number of jobs that can be injected into the composite service is varied dynamically
with an adaptation policy driven by the current estimates of latency
and throughput for the composite service as a whole. The key feature of
our approach is that we do not require any hook from the participating
organizations and treat the entire workflow as a black-box. We
evaluated our technique in a broad range of scenarios, by means of a
detailed event-driven simulator. These experimental results suggest
that our proposal can indeed provide significant benefits to service
providers. |
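The abstract above says the job limit is adapted from current latency and throughput estimates but does not spell out the policy. The following is therefore only one plausible sketch, a hill-climbing rule: keep raising the limit while throughput still improves, lower it while throughput does not suffer, and back off whenever latency exceeds a bound. The function `adapt_limit` and the parameters `max_latency` and `tolerance` are hypothetical.

```python
from typing import Tuple


def adapt_limit(limit: int, direction: int, prev_throughput: float,
                throughput: float, latency: float, max_latency: float,
                tolerance: float = 0.05) -> Tuple[int, int]:
    """One adaptation step; returns (new limit, new direction).

    `direction` is +1 if the previous step raised the limit, -1 if it lowered it.
    The whole workflow is treated as a black box: only end-to-end estimates
    of throughput and latency are used.
    """
    if latency > max_latency:
        direction = -1  # clients wait too long: admit fewer jobs
    elif direction > 0:
        # The limit was raised last time: continue only if it paid off.
        improved = throughput > prev_throughput * (1 + tolerance)
        direction = 1 if improved else -1
    else:
        # The limit was lowered last time: continue only if nothing was lost.
        degraded = throughput < prev_throughput * (1 - tolerance)
        direction = 1 if degraded else -1
    if limit + direction < 1:
        direction = 1  # never go below one admitted job
    return limit + direction, direction
```

A rule of this kind keeps probing in both directions, so the limit hovers around the point where extra jobs stop improving throughput rather than settling on a fixed value; that is what lets it follow changes in the available resources.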
Posted on 10 Nov 2008, 06:39 by Alberto Bartoli [updated on 11 Nov 2008, 00:10]
C. Fillon, A. Bartoli,
in Proc. 10-th European Conference on Genetic Programming
(EuroGP), Lecture Notes in Computer
Science 4445, Springer Verlag (April 2007), pp. 170-180.
TCP is one of the fundamental components of the Internet. The
performance of TCP is heavily dependent on the quality of its round-trip
time (RTT) estimator, i.e., the formula that dynamically predicts the delay
experienced by packets along a network connection. In this paper we apply
multi-objective genetic programming to construct an RTT estimator. We used
two different approaches for multi-objective optimization and a collection
of real traces gathered
at the mail server of our University. The solutions that we found
outperform the RTT estimator currently used by all TCP implementations.
This result could lead to several applications of genetic programming
in the networking field. |
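For context, here is a minimal sketch of the kind of formula an RTT estimator is. It assumes that the estimator "currently used" in the abstract above refers to the standard Jacobson/Karels smoothing specified in RFC 6298; the abstract itself does not name it. The class name is hypothetical, and the clock-granularity term and the minimum timeout required by the RFC are left out for brevity.

```python
from typing import Optional


class RttEstimator:
    """Smoothed RTT and retransmission timeout in the style of RFC 6298."""

    ALPHA = 1 / 8  # weight of a new sample in the smoothed RTT
    BETA = 1 / 4   # weight of a new sample in the RTT variation
    K = 4          # multiplier applied to the variation

    def __init__(self) -> None:
        self.srtt: Optional[float] = None
        self.rttvar: Optional[float] = None

    def update(self, sample: float) -> float:
        """Feed one RTT measurement (seconds); return the new timeout."""
        if self.srtt is None or self.rttvar is None:
            # First measurement initializes both state variables.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # The variation is updated first, using the old smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.srtt + self.K * self.rttvar
```

An evolved estimator would replace the two update rules with expressions discovered by genetic programming; the fixed constants above are the hand-chosen baseline such a search competes against.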
|