Institut für Wirtschaftsinformatik (IIS)
Detecting and Assessing Road Damages for Autonomous Driving Utilizing Conventional Vehicle Sensors
(2021)
Environmental perception is one of the biggest challenges in autonomous driving: to navigate complex traffic situations properly, the vehicle must perceive the road's condition in order to calculate the drivable space; in manual driving, this is realized by the human visual cortex. Enabling the vehicle to detect road conditions is a critical and complex task from many perspectives. The complexity lies on the one hand in developing tools for detecting damage, ideally using sensors already installed in the vehicle, and on the other hand in integrating detected damage into the autonomous driving task and thus into the subsystems of autonomous driving. High-definition feature maps, for instance, should be prepared for mapping road damage, which includes online and in-vehicle implementation. Furthermore, the motion planning system should react to detected damage in order to actively increase driving comfort and safety. Road damage detection is essential, especially in areas with poor infrastructure, and should be integrated as early as possible to enable even less developed countries to reap the benefits of autonomous driving systems. Besides its application in autonomous driving, an up-to-date solution for assessing road conditions is likewise desirable for the infrastructure planning of municipalities and federal states, which must make optimal use of the limited resources available for maintaining infrastructure quality. Addressing the challenges mentioned above, the research approach of this work is pragmatic and problem-solving. In designing technical solutions for road damage detection, the researchers apply research methods from engineering, including modeling, prototyping, and field studies. They utilize design science research to integrate road damage into an end-to-end concept for autonomous driving while drawing on previous knowledge, the requirements of the application domain, and expert workshops. This thesis provides various contributions to theory and practice.
The investigators design two individual solutions for assessing road conditions with existing vehicle sensor technology. The first is based on a quarter-vehicle model calculated from the vehicle's level sensor and an acceleration sensor. This novel model-based calculation measures the road elevation under the tires, enabling common vehicles to assess road conditions with standard hardware. The second solution uses images from front-facing vehicle cameras to detect road damage with deep neural networks. In contrast to other research in this area, the algorithms are designed to run on edge devices in autonomous vehicles with limited computational resources while still delivering cutting-edge performance. In addition, the analyses of deep learning tools and the introduction of new data into training provide valuable opportunities for researchers in other application areas to optimize the detection performance and runtime of their own deep learning algorithms. Besides detecting road damage, the authors provide novel algorithms for classifying its severity in order to deliver additional information for improved motion planning. Alongside the technical solutions, they address the lack of an end-to-end solution for road damage in autonomous driving by providing a concept that starts from data generation and ends with serving the vehicle's motion planning. This includes solutions for detecting road damage, assessing its severity, aggregating the data in the vehicle and a cloud platform, and making the data available via that platform to other vehicles. Fundamental limitations of this dissertation are due to the boundaries of modeling: the pragmatic approach simplifies reality, which always distorts the degree of truth in the result.
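To illustrate the quarter-vehicle model mentioned above, the following minimal sketch simulates a standard two-degree-of-freedom quarter-car (sprung body mass on a suspension, unsprung wheel mass on a tire spring) driving over a road profile. The parameter values and the semi-implicit Euler integration are illustrative assumptions; the thesis's actual model-based road elevation estimator is not reproduced here:

```python
def simulate_quarter_car(road, dt=1e-4,
                         m_s=400.0, m_u=50.0,   # sprung/unsprung mass [kg]
                         k_s=2.0e4, c_s=1.5e3,  # suspension stiffness [N/m], damping [Ns/m]
                         k_t=1.8e5):            # tire stiffness [N/m]
    """Semi-implicit Euler simulation of a 2-DOF quarter-car model.

    road: iterable of road elevations z_r(t) sampled at step dt.
    Returns the sprung-mass displacement z_s for each sample.
    """
    z_s = z_u = v_s = v_u = 0.0
    out = []
    for z_r in road:
        f_susp = k_s * (z_s - z_u) + c_s * (v_s - v_u)  # suspension force
        f_tire = k_t * (z_u - z_r)                      # tire force
        v_s += dt * (-f_susp) / m_s           # sprung-mass acceleration
        v_u += dt * (f_susp - f_tire) / m_u   # unsprung-mass acceleration
        z_s += dt * v_s
        z_u += dt * v_u
        out.append(z_s)
    return out

# A 5 cm step in the road profile: the body overshoots, then settles
# onto the new elevation, since all forces vanish at z_s = z_u = z_r.
h = 0.05
profile = [h] * 100_000  # 10 s at dt = 1e-4
response = simulate_quarter_car(profile)
```

The estimation problem the thesis addresses runs this model in reverse: given measured body motion (level sensor, accelerometer), infer the road input z_r.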
Extracting meaningful representations of data is a fundamental problem in machine learning. Those representations can be viewed from two different perspectives. First, there is the representation of data in terms of the number of data points. Representative subsets that compactly summarize the data without superfluous redundancies help to reduce the data size. Those subsets allow for scaling existing learning algorithms up without approximating their solution. Second, there is the representation of every individual data point in terms of its dimensions. Often, not all dimensions carry meaningful information for the learning task, or the information is implicitly embedded in a low-dimensional subspace. A change of representation can also simplify important learning tasks such as density estimation and data generation. This thesis deals with the aforementioned views on data representation and contributes to them. The authors first focus on computing representative subsets for a matrix factorization technique called archetypal analysis and the setting of optimal experimental design. For these problems, they motivate and investigate the usability of the data boundary as a representative subset. The authors also present novel methods to efficiently compute the data boundary, even in kernel-induced feature spaces. Based on the coreset principle, they derive another representative subset for archetypal analysis, which provides additional theoretical guarantees on the approximation error. Empirical results confirm that all compact representations of data derived in this thesis perform significantly better than uniform subsets of data. In the second part of the thesis, the research group is concerned with efficient data representations for density estimation. The researchers analyze spatio-temporal problems, which arise, for example, in sports analytics, and demonstrate how to learn (contextual) probabilistic movement models of objects using trajectory data. 
Furthermore, they highlight issues with interpolating data in normalizing flows, a technique that changes the representation of data so that it follows a specific distribution. The authors show how to solve this issue and obtain more natural transitions, using image data as an example.
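The idea of the data boundary as a representative subset can be illustrated in two dimensions, where the boundary reduces to the convex hull: interior points are discarded and only the extreme points that span the data remain. The monotone-chain routine below is a standard textbook algorithm, not the thesis's efficient or kernel-space boundary method:

```python
def convex_hull(points):
    """Convex hull of 2-D points (Andrew's monotone chain).
    Interior points are discarded, leaving the data boundary
    as a compact representative subset."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):  # z-component of (a - o) x (b - o)
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

    lower, upper = [], []
    for p in pts:                 # build lower hull left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):       # build upper hull right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# Corners of the square survive; interior points are dropped.
data = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 1), (2, 3), (3, 2)]
boundary = convex_hull(data)
```

Archetypal analysis benefits from exactly this property: archetypes lie on the boundary of the data's convex hull, so restricting the computation to boundary points can shrink the problem without changing the solution.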
Maximizing the value from data has become a key challenge for companies as it helps improve operations and decision making, enhances products and services, and, ultimately, leads to new business models. While enterprise architecture (EA) management and modeling have proven their value for IT-related projects, the support of enterprise architecture for data-driven business models (DDBMs) is a rather new and unexplored field. The research group argues that the current understanding of the intersection of data-driven business model innovation and enterprise architecture is incomplete because of five challenges that have not been addressed in existing research: (1) lack of knowledge of how companies design and realize data-driven business models from a process perspective, (2) lack of knowledge on the implementation phase of data-driven business models, (3) lack of knowledge on the potential support enterprise architecture modeling and management can provide to data-driven business model endeavors, (4) lack of knowledge on how enterprise architecture modeling and management support data-driven business model design and realization in practice, (5) lack of knowledge on how to deploy data-driven business models. The researchers address these challenges by examining how enterprise architecture modeling and management can benefit data-driven business model innovation. The mixed-method approach of this thesis draws on a systematic literature review, qualitative empirical research as well as the design science research paradigm. The investigators conducted a systematic literature search on data-driven business models and enterprise architecture. Considering the novelty of data-driven business models for academia and practice, they conducted explorative qualitative research to explain "why" and "how" companies embark on realizing data-driven business models. Throughout these studies, the primary data source was semi-structured interviews. 
In order to provide an artifact for DDBM innovation, the researchers developed a theory for design and action. The data-driven business model innovation artifact was inductively developed in two design iterations based on the design science paradigm and the design science research framework.
Mental health is an important factor in an individual's life. Online-based interventions have been developed for the treatment of various mental disorders. During these interventions, a large amount of patient-specific data is gathered that can be utilized to improve treatment outcomes by informing the decision-making processes of psychotherapists, experts in the field, and patients. The articles included in this dissertation focus on the analysis of such data collected in digital psychological treatments using machine learning approaches. This dissertation utilizes various machine learning methods, such as Bayesian models, regularization techniques, and decision trees, to predict different psychological factors such as mood or self-esteem, patient dropout, and treatment outcomes and costs. These models are evaluated using a variety of performance metrics, for example the receiver operating characteristic curve, the root mean square error, or specialized performance metrics for Bayesian inference. These types of analyses can support decision-making for psychologists and patients, which can in turn lead to better recommendations and subsequently to improved outcomes for patients, along with more insight into the interplay between psychological factors. The analysis of user journey data has not yet been fully examined in the field of psychological research. A process for this endeavor is developed and a technical implementation is provided for the research community. The application of machine learning in this context is still in its infancy. Thus, another contribution is the exploration and application of machine learning techniques to reveal correlations between psychological factors or characteristics and treatment outcomes, as well as to predict them. Additionally, economic factors are predicted in order to develop a process for treatment type recommendations.
This approach can be utilized to find the optimal treatment type for patients on an individual level, considering predicted treatment outcomes and costs. By evaluating the predictive accuracy of multiple machine learning techniques based on various performance metrics, some articles highlight the importance of considering heterogeneity in patients' behavior and affect. Furthermore, the potential of machine learning-based decision support systems in clinical practice is examined from the psychotherapists' point of view.
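Of the performance metrics named above, the area under the receiver operating characteristic curve has a simple rank-based formulation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below is a generic illustration with invented numbers, not the evaluation code used in the articles:

```python
def roc_auc(labels, scores):
    """AUC via the rank formulation: the fraction of positive/negative
    pairs in which the positive case receives the higher score
    (ties count as half a concordant pair)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    concordant = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))

# e.g. hypothetical dropout-risk scores against observed dropout (1 = dropped out)
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.8, 0.7]
auc = roc_auc(labels, scores)  # 8.5 concordant of 9 pairs
```

An AUC of 0.5 corresponds to random scoring, 1.0 to a perfect separation of dropouts from completers.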
Analysis of User Behavior
(2020)
Online behavior analysis consists of extracting patterns from server logs. The work presented here was carried out within the "mBook" project, which aimed to develop indicators of the quantity and quality of pupils' learning processes from their usage of an eponymous electronic history textbook. In this thesis, the research group investigates several models that adopt different points of view on the data. The studied methods are either well established in the field of pattern mining or transferred from other fields of machine learning and data mining. The authors improve the performance of archetypal analysis in large dimensions and apply it to unveil correlations between the visibility time of particular objects in the e-textbook and pupils' motivation. They then present two models based on mixtures of Markov chains. The first extracts users' weekly browsing patterns. The second is designed to process sessions at a fine resolution, which is a sine qua non for revealing the significance of scrolling behavior. The authors also propose a new paradigm for online behavior analysis that interprets sessions as trajectories within the page graph. In this respect, they establish a general framework for the study of similarity measures between spatio-temporal trajectories, of which the study of sessions is a particular case. Finally, they construct two centroid-based clustering methods using neural networks and thus lay the foundations for unsupervised behavior analysis with neural networks.
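A building block of the Markov-chain mixture models above is the maximum-likelihood estimate of a single chain's transition probabilities from session logs. The sketch below shows that estimation step only; the page names are invented, and the mixture/EM machinery of the thesis is omitted:

```python
from collections import Counter, defaultdict

def transition_matrix(sessions):
    """Maximum-likelihood transition probabilities of a first-order
    Markov chain, estimated from observed page sequences: each row is
    the normalized count of transitions leaving that page."""
    counts = defaultdict(Counter)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    return {src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
            for src, dsts in counts.items()}

# Hypothetical sessions over three pages of an e-textbook
sessions = [
    ["toc", "chapter1", "chapter1", "quiz"],
    ["toc", "chapter1", "quiz"],
    ["toc", "quiz"],
]
P = transition_matrix(sessions)  # e.g. P["toc"]["chapter1"] = 2/3
```

A mixture model fits several such matrices at once and assigns each session a soft membership over them, which is what lets distinct browsing styles emerge from the logs.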
Technological development has made it possible to store and process data on a scale not imaginable decades ago, a development that also includes network data. A particular characteristic of network data is that, unlike standard data, the objects of interest, called nodes, have relationships to (possibly all) other objects in the network. Collecting empirical data is often complicated and cumbersome; hence, the observed data are typically incomplete and might also contain other types of errors. Because of the interdependent structure of network data, such errors have a severe impact on network analysis methods. This cumulative dissertation is about the impact of erroneous network data on centrality measures, which are methods to assess the position of an object, for example a person, with respect to all other objects in a network. Existing studies have shown that even small errors can substantially alter these positions. The impact of errors on centrality measures is typically quantified using a concept called robustness. The articles included in this dissertation contribute to a better understanding of the robustness of centrality measures in several respects. It is argued why robustness needs to be estimated, and a new method is proposed. This method allows researchers to estimate the robustness of a centrality measure in a specific network and can be used as a basis for decision making. The relationship between network properties and the robustness of centrality measures is analyzed. Experimental and analytical approaches show that centrality measures are often more robust in networks with a larger average degree. The study of the impact of non-random errors on robustness suggests that centrality measures are often more robust if missing nodes are likely to belong to the same community than if nodes are missing completely at random.
For the development of imputation procedures based on machine learning techniques, a process for the evaluation of node embedding methods is proposed.
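The robustness concept described above can be sketched as a simple resampling experiment: compute a centrality ranking on the observed network, delete a random fraction of nodes, recompute, and compare the two rankings. The toy graph, the choice of degree centrality, and the top-k overlap comparison below are simplifying assumptions; the thesis's estimation method is more involved:

```python
import random

def degree_centrality(adj):
    """Degree centrality: number of neighbors per node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def top_k_overlap(c1, c2, k):
    """Fraction of the top-k nodes of ranking c1 recovered by ranking c2."""
    top1 = set(sorted(c1, key=c1.get, reverse=True)[:k])
    top2 = set(sorted(c2, key=c2.get, reverse=True)[:k])
    return len(top1 & top2) / k

def robustness(adj, frac_missing=0.2, k=3, trials=200, seed=0):
    """Average top-k overlap between degree centrality on the full
    network and on copies with a random fraction of nodes removed."""
    rng = random.Random(seed)
    full = degree_centrality(adj)
    total = 0.0
    for _ in range(trials):
        removed = set(rng.sample(sorted(adj), int(frac_missing * len(adj))))
        sub = {v: nbrs - removed for v, nbrs in adj.items() if v not in removed}
        total += top_k_overlap(full, degree_centrality(sub), k)
    return total / trials

# A small star-plus-ring toy graph: node 0 is clearly most central.
adj = {0: {1, 2, 3, 4, 5}, 1: {0, 2}, 2: {0, 1, 3},
       3: {0, 2, 4}, 4: {0, 3, 5}, 5: {0, 4}}
r = robustness(adj)  # 1.0 means the top-k ranking is unaffected by errors
```

Values near 1 indicate that the centrality ranking survives this error level; values near 0 indicate that conclusions drawn from the observed network are unreliable.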
Online marketing, especially Paid Search advertising, has become one of the most important paid media channels for companies selling their products and services online. Despite having been under intensive examination by a number of researchers for several years, this topic still offers interesting opportunities to contribute to the community, particularly because of its large economic impact and practical relevance, as well as the detailed and widely unfiltered view of consumer behavior that such marketing offers. To provide answers to some of the important questions advertisers face in this context, the author presents four papers in his thesis, in which he extends previous work on optimization topics such as click and conversion prediction. He applies and extends methods from other fields of research to specific problems in Paid Search. After a short introduction, the dissertation starts with a paper in which the authors illustrate a new method that helps advertisers predict conversion probabilities in Paid Search using sparse keyword-level data. They address one of the central problems in Paid Search advertising, which is optimizing one's own investments in this channel by placing bids in keyword auctions. In many cases, evaluations and decisions are made with extremely sparse data, although anecdotal evidence suggests that online marketing is a typical "Big Data" topic. In the algorithm presented in this paper, the authors use information such as the average time that users spend on the advertiser's website and the bounce rate for every given keyword. This previously unused data is shared across all keywords and used as prior knowledge in the proposed model. A modified version of this algorithm is now the core prediction engine of a productive Paid Search bid optimization system that calculates and places millions of bids every day for some of the most recognized retailers and service providers in the German market.
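The general idea of using shared prior knowledge for sparse keyword-level prediction can be illustrated with a simple empirical-Bayes sketch: a Beta prior fitted at account level shrinks each keyword's raw conversion rate toward the shared mean, with the shrinkage vanishing as click volume grows. All numbers are invented, and this generic shrinkage estimator is not the algorithm developed in the paper:

```python
def shrunk_conversion_rate(conversions, clicks, prior_mean, prior_strength):
    """Posterior-mean conversion rate under a Beta(a, b) prior with
    a = prior_mean * prior_strength and b = (1 - prior_mean) * prior_strength.
    Sparse keywords are pulled toward the shared prior mean; keywords
    with many clicks keep roughly their observed rate."""
    a = prior_mean * prior_strength
    b = (1.0 - prior_mean) * prior_strength
    return (conversions + a) / (clicks + a + b)

# Account-wide prior: 2% conversion rate, weighted like 100 pseudo-clicks.
sparse = shrunk_conversion_rate(conversions=1, clicks=5,
                                prior_mean=0.02, prior_strength=100)
dense = shrunk_conversion_rate(conversions=300, clicks=10_000,
                               prior_mean=0.02, prior_strength=100)
```

The 1-in-5 keyword is pulled from a raw 20% down near the 2% prior, while the high-volume keyword stays close to its observed 3%, which is the behavior a bid optimizer needs when most keywords see only a handful of clicks.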
Next, the author illustrates the development of a non-reactive experimental method for A/B testing of Paid Search Advertising activities. In that paper, the authors provide an answer to the question of whether and under what circumstances it makes economic sense for brand owners to pay for Paid Search ads for their own brand keywords in Google AdWords auctions. Finally, the author presents two consecutive papers with the same theoretical foundation in which he applies Bayesian methods to evaluate the impact of specific text features in Paid Search Advertisements.