Analysis of User Behavior
(2020)
Online behavior analysis consists of extracting patterns from server logs. The work presented here was carried out within the "mBook" project, which aimed to develop indicators of the quantity and quality of pupils' learning processes from their usage of an eponymous electronic textbook for History. In this thesis, the research group investigates several models that adopt different points of view on the data. The studied methods are either well established in the field of pattern mining or transferred from other fields of machine learning and data mining. The authors improve the performance of archetypal analysis in large dimensions and apply it to unveil correlations between the visibility time of particular objects in the e-textbook and pupils' motivation. They next present two models based on mixtures of Markov chains. The first extracts users' weekly browsing patterns. The second is designed to process sessions at a fine resolution, which is essential for revealing the significance of scrolling behaviors. The authors also propose a new paradigm for online behavior analysis that interprets sessions as trajectories within the page graph. In this respect, they establish a general framework for the study of similarity measures between spatio-temporal trajectories, of which the study of sessions is a particular case. Finally, they construct two centroid-based clustering methods using neural networks and thus lay the foundations for unsupervised behavior analysis using neural networks.
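As a rough illustration of the Markov-chain view of sessions mentioned in the abstract, the following sketch estimates first-order transition probabilities between pages from a few sessions. The page names, the toy sessions, and the `estimate_transitions` helper are hypothetical and not taken from the thesis; the thesis works with mixtures of Markov chains, of which a single chain like this is only a building block.

```python
from collections import Counter, defaultdict


def estimate_transitions(sessions):
    """Maximum-likelihood estimate of first-order page-to-page transition probabilities."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Count every consecutive (source page, destination page) pair.
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    # Normalise the counts of each source page into probabilities.
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical sessions: each is the ordered list of pages one user visited.
sessions = [
    ["home", "chapter1", "gallery"],
    ["home", "chapter1", "chapter2"],
    ["home", "chapter2"],
]
P = estimate_transitions(sessions)
```

Here `P["home"]` holds the estimated probabilities of the next page after `home`. A mixture model would fit several such matrices and assign each session to one of them, for example with expectation-maximization.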
Technological development has made it possible to store and process data on a scale not imaginable decades ago, and this development also includes network data. A particular characteristic of network data is that, unlike standard data, the objects of interest, called nodes, have relationships to (possibly all) other objects in the network. Collecting empirical data is often complicated and cumbersome; hence, the observed data are typically incomplete and might also contain other types of errors. Because of the interdependent structure of network data, these errors have a severe impact on network analysis methods. This cumulative dissertation is about the impact of erroneous network data on centrality measures, which are methods to assess the position of an object, for example a person, with respect to all other objects in a network. Existing studies have shown that even small errors can substantially alter these positions. The impact of errors on centrality measures is typically quantified using a concept called robustness. The articles included in this dissertation contribute to a better understanding of the robustness of centrality measures in several respects. It is argued why robustness needs to be estimated, and a new estimation method is proposed. This method allows researchers to estimate the robustness of a centrality measure in a specific network and can be used as a basis for decision making. The relationship between network properties and the robustness of centrality measures is analyzed. Experimental and analytical approaches show that centrality measures are often more robust in networks with a larger average degree. The study of the impact of non-random errors on robustness suggests that centrality measures are often more robust if missing nodes are more likely to belong to the same community than if nodes are missing completely at random. For the development of imputation procedures based on machine learning techniques, a process for the evaluation of node embedding methods is proposed.
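To make the notion of robustness more concrete, here is a minimal sketch that simulates missing edges and compares the resulting centrality ranking with the error-free one. It rests on several assumptions that are not from the dissertation: degree centrality is used for simplicity, edges go missing completely at random, agreement is measured with Kendall's tau-a, and the toy graph and all function names are hypothetical.

```python
import random


def degree_centrality(nodes, edges):
    """Unnormalised degree of every node in an undirected graph."""
    deg = {v: 0 for v in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def kendall_tau(x, y):
    """Kendall's tau-a between two score dicts defined on the same keys."""
    keys = list(x)
    concordant = discordant = 0
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            a = x[keys[i]] - x[keys[j]]
            b = y[keys[i]] - y[keys[j]]
            if a * b > 0:
                concordant += 1
            elif a * b < 0:
                discordant += 1
    n_pairs = len(keys) * (len(keys) - 1) / 2
    return (concordant - discordant) / n_pairs


def estimate_robustness(nodes, edges, missing_fraction, runs=100, seed=0):
    """Average rank agreement between true and error-affected centrality."""
    rng = random.Random(seed)
    true_scores = degree_centrality(nodes, edges)
    keep = round(len(edges) * (1 - missing_fraction))
    taus = []
    for _ in range(runs):
        # Simulate an observed network in which some edges are missing at random.
        observed = rng.sample(edges, keep)
        taus.append(kendall_tau(true_scores, degree_centrality(nodes, observed)))
    return sum(taus) / len(taus)


# Hypothetical toy network.
nodes = list(range(6))
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (2, 3), (4, 5)]
r_none = estimate_robustness(nodes, edges, missing_fraction=0.0)
r_half = estimate_robustness(nodes, edges, missing_fraction=0.5)
```

Values closer to `r_none` indicate that the ranking is less affected by the simulated errors. Because tau-a does not correct for ties, `r_none` itself can be below 1 when several nodes share the same degree.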