<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>7 Mathematisch-Naturwissenschaftliche Fakultät</title>
<link>http://hdl.handle.net/10900/42133</link>
<description/>
<pubDate>Fri, 12 Jun 2026 10:22:37 GMT</pubDate>
<dc:date>2026-06-12T10:22:37Z</dc:date>
<item>
<title>Improved Exploration through Transfer Learning in Multi-Armed Bandits</title>
<link>http://hdl.handle.net/10900/180746</link>
<description>Improved Exploration through Transfer Learning in Multi-Armed Bandits
Bilaj, Steven
Reinforcement learning is a branch of machine learning that focuses on training an agent to interact with a dynamic environment to maximize its expected cumulative reward. In contrast to offline learning problems, where the training data is immediately available, the agent typically assembles its own data set by interacting with the environment. Since model performance depends heavily on the gathered training data, a key challenge is determining an exploration strategy that finds a good policy at the potential cost of low immediate rewards. This is known as the exploration-exploitation trade-off.&#13;
This dissertation examines how to effectively re-balance the exploration-exploitation trade-off by transferring knowledge between tasks with a shared structure to enhance the agent's overall performance. The primary focus is on developing algorithms that reduce uncertainties in the model estimations by transferring information given various assumptions while providing theoretical bounds on the regret. The contributions span single-task transfer, meta-learning, multi-task learning and non-stationary environments in the multi-armed bandit setting.&#13;
In a straightforward task-to-task transfer approach, where an expert is assumed to be available to the learner, we propose a dynamic convex combination of the expert and target models. We prove that when the expert's parameter vector is close to the true task-related feature vector, the learner can exploit the expert's knowledge in the early steps of the algorithm and reduce the regret with high probability.&#13;
A generalization is to consider feature vectors that are close in a subspace. We address this idea within the concept of meta-learning, where an agent sequentially interacts with multiple tasks sampled from a common meta distribution. Under the assumption of a low-dimensional subspace structure in the meta distribution, we propose a framework to estimate the subspace with projection matrices and exploit it as prior information within an OFUL- and Thompson-sampling-based algorithm. With each task the agent interacts with, it improves its estimate of the projections for exploitation in future tasks. Theoretical guarantees are provided, with an emphasis on improving the regret bound with respect to the dimensionality.&#13;
In a clustered setting, we assume that tasks are grouped in clusters such that only tasks of the same cluster share the same feature vector. When the number of clusters is lower than the number of dimensions, this can be interpreted as a special case of the low-dimensional subspace setting. We explore the general clustered setting in a multi-task framework, where an agent interacts with a fixed number of tasks in parallel. The agent has access to a graph, where each node is associated with a different task. We introduce a network-lasso-based bandit algorithm that exploits the given graph such that it implicitly learns the cluster structure. Theoretical bounds show that, with a well-suited graph, this approach offers significant improvements over other baselines.&#13;
Finally, we address dynamic environments or piecewise-stationary settings, where the agent typically discards all collected data points upon detecting changes in the environment and retrains its model from scratch. Instead, we propose an algorithm that only discards data points directly associated with the environmental change and retains the rest. We show that intelligent transfer of data from previous segments can reduce exploration after each change and increase overall reward.&#13;
This dissertation thus proposes multiple algorithms for transferring information in several multi-armed bandit settings with the purpose of optimizing exploration. We provide both theoretical guarantees and empirical evaluations showcasing significant improvements over existing methods.
</description>
<pubDate>Fri, 12 Jun 2026 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10900/180746</guid>
<dc:date>2026-06-12T00:00:00Z</dc:date>
</item>
<item>
<title>Empirical Likelihood Estimators for Robust and Causal Learning</title>
<link>http://hdl.handle.net/10900/180527</link>
<description>Empirical Likelihood Estimators for Robust and Causal Learning
Kremer, Heiner Stephan
Some of the central problems in robust and causal machine learning, including learning under covariate shifts and instrumental variable regression, can be expressed as conditional moment restrictions (CMR). By restricting the conditional expectation of a signed error metric, models identified via CMR exhibit robustness against shifts in the distribution of the conditioning variable. In practice, this generally results in an ill-posed problem, as it requires the solution of an over-identified infinite-dimensional system of equations. For the unconditional case, empirical likelihood estimators have emerged as general and powerful tools to address over-identified moment restriction problems. These methods learn a model along with an approximation of the population distribution by minimizing a φ-divergence constrained by the moment restrictions. The main goal of this work is to advance the state of the art in CMR estimation by extending and refining the idea of empirical likelihood estimation in several directions. First, we generalize the classical framework to conditional moment restrictions using a functional formulation that leverages modern machine learning models. Then, we extend the principle to alternative distributional distance notions based on kernel methods and optimal transport. The resulting estimators exhibit superior small-sample properties and robustness against data corruptions at training time and adversarial attacks at test time, respectively. Finally, drawing inspiration from the close relation between empirical likelihood estimation and distributionally robust optimization (DRO), we provide an application of kernel-based DRO to chance-constrained programming.
</description>
<pubDate>Tue, 09 Jun 2026 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10900/180527</guid>
<dc:date>2026-06-09T00:00:00Z</dc:date>
</item>
<item>
<title>Vpr-Controlled Manipulation of T Cell Physiology: NF-AT Activation and Protein Degradation as Key Mechanisms for HIV-1 Pathogenesis</title>
<link>http://hdl.handle.net/10900/180503</link>
<description>Vpr-Controlled Manipulation of T Cell Physiology: NF-AT Activation and Protein Degradation as Key Mechanisms for HIV-1 Pathogenesis
Vanegas Torres, Carlos Alberto
Human immunodeficiency virus type 1 (HIV-1), the lentiviral pathogen behind the global AIDS pandemic, preferentially infects CD4+ T lymphocytes, leading to their progressive depletion through both direct viral cytotoxicity and increased rates of apoptosis. To achieve full pathogenicity in vivo, HIV-1 encodes multiple accessory proteins, most of which play defined roles at various steps of the viral replication cycle. In contrast, the 96-amino acid Viral Protein R (Vpr) is implicated in disrupting host cell physiology through a variety of mechanisms, such as facilitating the nuclear import of viral pre-integration complexes, as well as significantly boosting viral production by enhancing the transcriptional activity of viral LTRs. Further, Vpr is actively encapsidated into HIV-1 virions, allowing its direct delivery into host cells upon de novo infection. Collectively, these characteristics epitomize Vpr as a crucial supporting element in the establishment of a productive HIV-1 infection. Nevertheless, multiple gaps exist in the understanding of the mechanisms whereby Vpr allows HIV-1 to exert control over its host cell at various organizational levels, and many studies still fail to answer these questions in physiologically relevant models, such as donor-derived CD4+ T lymphocytes. To address these issues, HIV-1 infection assays employing inhibitors for various signaling pathways were performed on T cell-derived models and primary CD4+ T cells alike, focusing on Vpr’s role in the induction of NF-AT signaling. Subsequently, a thorough bioinformatic analysis was executed on an RNA-Seq dataset derived from HIV-1-infected primary T lymphocytes, aiming to identify how Vpr presence can influence the transcriptomic footprint left by HIV-1 on its host. Finally, Vpr’s ability to hijack and redirect its host’s proteasomal activity was studied in the context of two putative protein targets previously identified through non-targeted proteomics: TCF7 and G3BP1.
The present work demonstrated that the role virion-delivered Vpr plays in supporting the establishment of a productive HIV-1 infection is highly reliant on its ability to induce the activation of NF-AT, as artificially inhibiting this transcription factor completely curtailed Vpr’s characteristic boost in viral productivity and spread. The aforementioned bioinformatic analyses revealed that Vpr-mediated NF-AT induction leads to the transcriptional reprogramming of the host T cell, differentially affecting a variety of physiological processes, including cell cycle progression, ribosome assembly, protein translation, immune and inflammatory function, intracellular signaling, and cell proliferation, amongst others. In addition, this study established the mechanism whereby Vpr leads to the proteasomal degradation of TCF7, a trans-acting factor of central relevance to T cell development, differentiation, and survival. Taken together, these results establish Vpr-mediated NF-AT activation as a central mechanism through which HIV-1 reprograms T cell physiology to enhance viral replication, expanding Vpr’s array of virus-supporting roles and illustrating their eventual outcome on T cell physiology. Future work ought to prioritize validating many of these phenomena in primary CD4+ T cells, while also exploring the downstream effects of Vpr-targeted protein degradation on T cell differentiation and exhaustion, as well as on the establishment and reactivation of potential HIV-1 reservoir populations.
</description>
<pubDate>Tue, 09 Jun 2026 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10900/180503</guid>
<dc:date>2026-06-09T00:00:00Z</dc:date>
</item>
<item>
<title>Hardware-Aware Machine Learning Methods for Medical Edge Devices</title>
<link>http://hdl.handle.net/10900/180431</link>
<description>Hardware-Aware Machine Learning Methods for Medical Edge Devices
Werner, Julia Helga
Real-time embedded edge devices are of great importance in many applications, such as autonomous driving, robots or smartwatches. Additionally, various medical procedures involve small, embedded in-body sensor edge devices. Equipping such devices with artificial intelligence can improve the procedures by incorporating new functionalities. However, this is often accompanied by a high demand for energy and computational resources if the models are not optimized accordingly. For some applications involving in-body edge devices equipped with machine learning models, the neural network parameters must be stored directly on-device and the model executed locally. Notably, resource-constrained devices impose stringent requirements on machine learning models with respect to on-chip area and electrical energy consumption. These restrictions need to be considered in the final model design. Deep learning methods involving neural networks with millions or even billions of parameters or operations, which cannot simply be transferred to hardware, are not a viable solution. Furthermore, it is necessary to use lightweight, quantized models in fixed-point representation to realize efficient inference on hardware. Storing model parameters in lower precision potentially impairs the overall performance of the classifier, which needs to be addressed by dedicated techniques, such as hardware-aware training. Additional challenges include general data sparsity and class imbalances, which often occur in medical datasets since pathologies are naturally underrepresented compared to healthy samples. Importantly, if well-designed, machine-learning-based decision models provide new energy-saving functionalities that can lower the energy demand of the whole system.
This thesis specifically addresses these problems by examining two important medical applications: Video Capsule Endoscopy, a methodology to investigate the otherwise inaccessible small intestine using a small, pill-sized capsule, and seizure detection using neuroimplants intended for drug-resistant epilepsy patients.&#13;
The main objective of this thesis is to overcome the described challenges and design artificial intelligence-based classification models suitable for tiny edge devices as present in both introduced medical applications. It is further expected that other medical applications can benefit from the presented methods as well. Overall, this work is dedicated to the development of hardware-aware, specialized machine learning techniques for Video Capsule Endoscopy and preictal seizure detection. The approaches are tailored for on-device application, providing the groundwork for future innovations and enhancements, such as an actively controlled capsule. For both applications, hybrid models are proposed that combine machine learning classifiers based on deep neural networks with time-series techniques, such as Hidden Markov Models. The resulting methods are accurate and highly efficient, and are verified on FPGA-based hardware demonstrators to measure their power consumption. This enhances both medical procedures involving low-power edge devices without increasing the energy demand of the whole system.
</description>
<pubDate>Mon, 08 Jun 2026 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10900/180431</guid>
<dc:date>2026-06-08T00:00:00Z</dc:date>
</item>
</channel>
</rss>
