Papers made digestible
Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications.
Visual Teach and Repeat 3 (VT&R3), a generalization of stereo VT&R, achieves
long-term autonomous path-following using topometric mapping and localization
from a single rich sensor stream. In this paper, we improve the capabilities of
a LiDAR implementation of VT&R3 to reliably detect and avoid obstacles in
changing environments. Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. We then extend the
behaviour of generic sample-based motion planners to better suit the
teach-and-repeat problem structure by introducing a new edge-cost metric paired
with a curvilinear planning space. The resulting planner generates naturally
smooth paths that avoid local obstacles while minimizing lateral path deviation
to best exploit prior terrain knowledge. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications. Experimental
results from online run-time analysis, unit testing, and qualitative
experiments on a differential drive robot show the promise of the technique for
reliable long-term autonomous operation in complex unstructured environments.
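To make the planner modification concrete, here is a minimal sketch, not the authors' implementation, of an edge cost in a curvilinear planning space, where a state is an arc length s along the taught path plus a lateral offset d and the cost penalizes lateral deviation on top of edge length; the deviation weight lam is an illustrative tuning parameter.

    # Minimal sketch (not the paper's code): edge cost in a curvilinear frame,
    # with s = arc length along the taught path and d = lateral offset from it.
    # The deviation weight `lam` is an illustrative assumption.
    import math

    def edge_cost(state_a, state_b, lam=2.0):
        s_a, d_a = state_a
        s_b, d_b = state_b
        length = math.hypot(s_b - s_a, d_b - d_a)          # edge length in (s, d)
        deviation = 0.5 * (abs(d_a) + abs(d_b)) * length   # penalty for straying from the taught path
        return length + lam * deviation

    # A sample-based planner (e.g., RRT*) summing this cost over its edges will
    # prefer detours that return to the taught path as soon as obstacles allow.
    print(edge_cost((0.0, 0.0), (1.0, 0.4)))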
Authors: Jordy Sehn, Yuchen Wu, Timothy D. Barfoot.
The statistical and design considerations that pertain to
dose optimization are discussed. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%.
The traditional more-is-better dose selection paradigm, developed based on
cytotoxic chemotherapeutics, is often problematic when applied to the
development of novel molecularly targeted agents (e.g., kinase inhibitors,
monoclonal antibodies, and antibody-drug conjugates). The US Food and Drug
Administration (FDA) initiated Project Optimus to reform the dose optimization
and dose selection paradigm in oncology drug development and to call for more
attention to benefit-risk considerations.
We systematically investigated the operating characteristics of the seamless
phase 2-3 design as a strategy for dose optimization, where in stage 1
(corresponding to phase 2) patients are randomized to multiple doses, with or
without a control; and in stage 2 (corresponding to phase 3) the efficacy of
the selected optimal dose is evaluated with a randomized concurrent control or
historical control. Depending on whether the concurrent control is included and
the type of endpoints used in stages 1 and 2, we describe four types of
seamless phase 2-3 dose-optimization designs, which are suitable for different
clinical settings. The statistical and design considerations that pertain to
dose optimization are discussed. Simulations show that the phase 2-3
dose-optimization designs control the familywise type I error rate and yield
appropriate statistical power with a substantially smaller sample size than the
conventional approach. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%. Due to the
interim dose selection, the phase 2-3 dose-optimization design is logistically
and operationally more challenging, and should be carefully planned and
implemented to ensure trial integrity.
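As a rough illustration of how such operating characteristics can be studied, below is a minimal Monte Carlo sketch of a seamless phase 2-3 trial with a binary response endpoint and a concurrent control in stage 2; the dose-selection rule, per-arm sample sizes, and test statistic are simplified assumptions for illustration, not the designs evaluated in the paper.

    # Minimal sketch (illustrative assumptions, not the paper's designs):
    # stage 1 selects the dose with the best observed response rate,
    # stage 2 compares it against a randomized concurrent control.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    def simulate_trial(p_doses, p_control, n1=30, n2=100, alpha=0.025):
        stage1 = [rng.binomial(n1, p) for p in p_doses]    # phase 2 responses per dose
        best = int(np.argmax(stage1))                      # interim dose selection
        x_trt = rng.binomial(n2, p_doses[best])            # phase 3, selected dose
        x_ctl = rng.binomial(n2, p_control)                # phase 3, concurrent control
        p_hat = (x_trt + x_ctl) / (2 * n2)
        se = np.sqrt(2 * p_hat * (1 - p_hat) / n2)
        z = (x_trt - x_ctl) / n2 / se if se > 0 else 0.0   # one-sided two-proportion z-test
        return z > stats.norm.ppf(1 - alpha)

    # Empirical familywise type I error under the global null (all doses = control).
    rejections = [simulate_trial([0.2, 0.2, 0.2], 0.2) for _ in range(2000)]
    print("empirical type I error:", np.mean(rejections))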
Authors: Liyun Jiang, Ying Yuan.
We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples.
We present the GPry algorithm for fast Bayesian inference of general
(non-Gaussian) posteriors with a moderate number of parameters. GPry does not
need any pre-training or special hardware such as GPUs, and is intended as a
drop-in replacement for traditional Monte Carlo methods for Bayesian inference.
Our algorithm is based on generating a Gaussian Process surrogate model of the
log-posterior, aided by a Support Vector Machine classifier that excludes
extreme or non-finite values. An active learning scheme allows us to reduce the
number of required posterior evaluations by two orders of magnitude compared to
traditional Monte Carlo inference. Our algorithm allows for parallel
evaluations of the posterior at optimal locations, further reducing wall-clock
times. We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples. GPry outperforms traditional Monte Carlo methods when
the evaluation time of the likelihood (or the calculation of theoretical
observables) is of the order of seconds; for evaluation times of over a minute
it can perform inference in days that would take months using traditional
methods. GPry is distributed as an open source Python package (pip install
gpry) and can also be found at https://github.com/jonaselgammal/GPry.
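To illustrate the principle, here is a minimal sketch of a Gaussian Process surrogate of the log-posterior driven by a simple upper-confidence-bound active-learning rule, built with scikit-learn on a toy 2-D Gaussian posterior. This is not the GPry API, and GPry's actual acquisition function and SVM classifier for non-finite values are not reproduced.

    # Minimal sketch of a GP surrogate of the log-posterior with active learning
    # (illustrative only; not the GPry package or its acquisition function).
    import numpy as np
    from scipy.stats import multivariate_normal
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    def log_post(x):  # toy 2-D Gaussian posterior standing in for an expensive likelihood
        return multivariate_normal.logpdf(x, mean=[0.3, -0.5], cov=[[1.0, 0.3], [0.3, 0.5]])

    rng = np.random.default_rng(1)
    X = rng.uniform(-4, 4, size=(8, 2))            # initial design
    y = np.array([log_post(x) for x in X])

    gp = GaussianProcessRegressor(ConstantKernel() * RBF([1.0, 1.0]), normalize_y=True)
    for _ in range(40):                            # active-learning iterations
        gp.fit(X, y)
        cand = rng.uniform(-4, 4, size=(512, 2))   # random candidate pool
        mu, sd = gp.predict(cand, return_std=True)
        x_new = cand[np.argmax(mu + 2.0 * sd)]     # upper-confidence-bound acquisition
        X = np.vstack([X, x_new])
        y = np.append(y, log_post(x_new))

    print("surrogate built from", len(y), "posterior evaluations")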
Authors: Jonas El Gammal, Nils Schöneberg, Jesús Torrado, Christian Fidler.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
To the best of our knowledge, kill-and-restart schedules have never been
considered for the total completion time objective in this model. We obtain a
performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
$b$-scaling algorithm and of $\approx 3.032$ for its randomized version.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
While no non-preemptive algorithm is constant competitive, Motwani, Phillips,
and Torng (SODA '93) proved that the simple preemptive round robin procedure is
$2$-competitive and that no better competitive ratio is possible, initiating a
long line of research focused on preemptive algorithms for generalized variants
of the problem. As an alternative model, Shmoys, Wein, and Williamson (FOCS
'91) introduced kill-and-restart schedules, where running jobs may be killed
and restarted from scratch later, and analyzed them for the makespan objective.
However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model.
We contribute to both models: First we give for any $b > 1$ a tight analysis
for the natural $b$-scaling kill-and-restart strategy for scheduling jobs
without release dates, as well as for a randomized variant of it. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version. Second, we show
that the preemptive Weighted Shortest Elapsed Time First (WSETF) rule is
$2$-competitive for jobs released in an online fashion over time, matching the
lower bound by Motwani et al. Using this result as well as the competitiveness
of round robin for multiple machines, we prove performance guarantees of
adaptations of the $b$-scaling algorithm to online release dates and unweighted
jobs on identical parallel machines.
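For intuition, the following is a minimal sketch of one plausible reading of the deterministic $b$-scaling kill-and-restart strategy without release dates: in round k every unfinished job is run for a budget of $b^k$ and killed, to be restarted from scratch later, if it does not finish. The job order within a round and the exact budgets are illustrative assumptions rather than the precise variant analyzed in the paper.

    # Minimal sketch of a deterministic b-scaling kill-and-restart schedule
    # (no release dates); round structure and job order are assumptions.
    def b_scaling_schedule(proc_times, weights, b=2.0):
        n = len(proc_times)
        remaining = set(range(n))
        completion = [0.0] * n
        t, k = 0.0, 0
        while remaining:
            quantum = b ** k
            for j in sorted(remaining):
                if proc_times[j] <= quantum:   # job fits in this quantum and completes
                    t += proc_times[j]
                    completion[j] = t
                    remaining.discard(j)
                else:                          # kill after the quantum; restart from scratch later
                    t += quantum
            k += 1
        return sum(w * c for w, c in zip(weights, completion))

    print(b_scaling_schedule([3.0, 1.0, 7.0], [1.0, 2.0, 1.0]))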
Authors: Sven Jäger, Guillaume Sagnol, Daniel Schmidt genannt Waldschmidt, Philipp Warode.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. However, with
frozen models there are relatively few parameters available for adapting to
downstream tasks, which is problematic in computer vision where tasks vary
significantly in input/output format and the type of information that is of
value. In this paper, we present a study of frozen pretrained models when
applied to diverse and representative computer vision tasks, including object
detection, semantic segmentation and video action recognition. From this
empirical analysis, our work answers the questions of what pretraining task
fits best with this frozen setting, how to make the frozen setting more
flexible to various downstream tasks, and the effect of larger model sizes. We
additionally examine the upper bound of performance using a giant frozen
pretrained model with 3 billion parameters (SwinV2-G) and find that it reaches
competitive performance on a varied set of major benchmarks with only one
shared frozen base network: 60.0 box mAP and 52.2 mask mAP on COCO object
detection test-dev, 57.6 val mIoU on ADE20K semantic segmentation, and 81.7
top-1 accuracy on Kinetics-400 action recognition. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
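As a minimal PyTorch sketch of the frozen setting, the snippet below freezes a pretrained backbone and trains only a lightweight task head; an ImageNet ResNet-50 and a 10-class linear head are illustrative stand-ins for the paper's SwinV2 backbones and task-specific heads.

    # Minimal sketch of the frozen-backbone setting (illustrative stand-ins:
    # an ImageNet ResNet-50 and a 10-class linear head, not SwinV2-G).
    import torch
    import torchvision

    backbone = torchvision.models.resnet50(weights="IMAGENET1K_V2")
    backbone.fc = torch.nn.Identity()              # expose 2048-d features
    for p in backbone.parameters():
        p.requires_grad = False                    # freeze all pretrained weights
    backbone.eval()

    head = torch.nn.Linear(2048, 10)               # only the head is trained
    optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)

    x, y = torch.randn(4, 3, 224, 224), torch.randint(0, 10, (4,))
    with torch.no_grad():
        feats = backbone(x)                        # frozen feature extraction
    loss = torch.nn.functional.cross_entropy(head(feats), y)
    loss.backward()
    optimizer.step()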
Authors: Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao.
We evaluate the plausibility of these embeddings across different models in
predicting target entities. We also evaluate the meaningfulness of knowledge
proximity to explain the domain expansion profiles of inventors and assignees.
Knowledge proximity refers to the strength of association between any two
entities in a structural form that embodies certain aspects of a knowledge
base. In this work, we operationalize knowledge proximity within the context of
the US Patent Database (knowledge base) using a knowledge graph (structural
form) named PatNet built using patent metadata, including citations, inventors,
assignees, and domain classifications. Using several graph embedding models
(e.g., TransE, RESCAL), we obtain the embeddings of entities and relations that
constitute PatNet. The cosine similarity between the corresponding (or
transformed) entity embeddings denotes the knowledge proximity between these entities.
We evaluate the plausibility of these embeddings across different models in
predicting target entities. We also evaluate the meaningfulness of knowledge
proximity to explain the domain expansion profiles of inventors and assignees.
We then apply the embeddings of the best-preferred model to associate
homogeneous (e.g., patent-patent) and heterogeneous (e.g., inventor-assignee)
pairs of entities.
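As a small illustration of the proximity computation described above, the sketch below evaluates knowledge proximity as the cosine similarity between two entity embeddings; random vectors stand in for embeddings that would come from a model such as TransE trained on PatNet, and the entity names are hypothetical.

    # Minimal sketch: knowledge proximity as cosine similarity between entity
    # embeddings. Random vectors stand in for TransE/RESCAL embeddings of PatNet.
    import numpy as np

    rng = np.random.default_rng(42)
    embeddings = {e: rng.normal(size=64) for e in ["patent_A", "patent_B", "inventor_X"]}

    def knowledge_proximity(e1, e2):
        u, v = embeddings[e1], embeddings[e2]
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    print(knowledge_proximity("patent_A", "patent_B"))    # homogeneous pair
    print(knowledge_proximity("patent_A", "inventor_X"))  # heterogeneous pair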
Authors: Guangtong Li, L Siddharth, Jianxi Luo.
It resembles a phase transition in that the scalar configuration only appears when a certain quantity that characterizes the compact object, e.g., its compactness or spin, is beyond a threshold.
Scalarization is a mechanism that endows strongly self-gravitating bodies,
such as neutron stars and black holes, with a scalar field configuration. It
resembles a phase transition in that the scalar configuration only appears when
a certain quantity that characterizes the compact object, e.g., its compactness
or spin, is beyond a threshold. We provide a critical and comprehensive review
of scalarization, including the mechanism itself, theories that exhibit it, its
manifestation in neutron stars, black holes, and their binaries, potential
extension to other fields, and a thorough discussion of future perspectives.
Authors: Daniela D. Doneva, Fethi M. Ramazanoğlu, Hector O. Silva, Thomas P. Sotiriou, Stoytcho S. Yazadjiev.
More recently, two new directions for studying
a heavy impurity with FDA have been developed. One is to extend FDA to a
strongly correlated superfluid background, a Bardeen-Cooper-Schrieffer (BCS)
superfluid. Multidimensional Ramsey spectroscopy allows us to investigate
correlations between spectral peaks of an impurity-medium system that are not
accessible in the conventional one-dimensional spectrum.
In this brief review, we report some new developments in the functional
determinant approach (FDA), an exact numerical method, in the studies of a
heavy quantum impurity immersed in Fermi gases and manipulated with
radio-frequency pulses. FDA has been successfully applied to investigate the
universal dynamical responses of a heavy impurity in an ultracold ideal Fermi
gas in both the time and frequency domain, which allows the exploration of the
renowned Anderson's orthogonality catastrophe (OC). In such a system, OC is
induced by the multiple particle-hole excitations of the Fermi sea, which is
beyond a simple perturbation picture and manifests itself as the absence of
quasiparticles named polarons. More recently, two new directions for studying
a heavy impurity with FDA have been developed. One is to extend FDA to a
strongly correlated superfluid background, a Bardeen-Cooper-Schrieffer (BCS)
superfluid. In this system, Anderson's orthogonality catastrophe is prohibited
due to the suppression of multiple particle-hole excitations by the superfluid
gap, which leads to the existence of a genuine polaron. The other direction is
to generalize the FDA to schemes with multiple RF pulses, which extends the
well-established one-dimensional Ramsey spectroscopy in ultracold atoms to
multidimensional spectroscopy, in the same spirit as the well-known
multidimensional nuclear magnetic resonance and optical multidimensional
coherent spectroscopy. Multidimensional Ramsey spectroscopy allows us to
investigate correlations between spectral peaks of an impurity-medium system
that are not accessible in the conventional one-dimensional spectrum.
Authors: Jia Wang.
Nevertheless, in many real-world applications, e.g., magnetohydrodynamics, plasma physics, superconductors, etc., dynamical gauge fields and Coulomb interactions are fundamental. We numerically study the spectrum of the lowest quasi-normal modes and successfully compare the obtained results to magnetohydrodynamics theory in $2+1$ dimensions.
Within the framework of holography, the Einstein-Maxwell action with
Dirichlet boundary conditions corresponds to a dual conformal field theory in
the presence of an external gauge field. Nevertheless, in many real-world
applications, e.g., magnetohydrodynamics, plasma physics, superconductors,
etc., dynamical gauge fields and Coulomb interactions are fundamental. In this work,
we consider bottom-up holographic models at finite magnetic field and (free)
charge density in the presence of dynamical boundary gauge fields, which are
introduced using mixed boundary conditions. We numerically study the spectrum
of the lowest quasi-normal modes and successfully compare the obtained results
to magnetohydrodynamics theory in $2+1$ dimensions. Surprisingly, as long as the
electromagnetic coupling is small enough, we find perfect agreement even in the
large magnetic field limit. Our results prove that a holographic description of
magnetohydrodynamics does not necessarily need higher-form bulk fields but can
be consistently derived using mixed boundary conditions for standard gauge
fields.
Authors: Yongjun Ahn, Matteo Baggioli, Kyoung-Bum Huh, Hyun-Sik Jeong, Keun-Young Kim, Ya-Wen Sun.
We conclude by providing numerical results
comparing our methods to the state of the art.
Inspired by regularization techniques in statistics and machine learning, we
study complementary composite minimization in the stochastic setting. This
problem corresponds to the minimization of the sum of a (weakly) smooth
function endowed with a stochastic first-order oracle, and a structured
uniformly convex (possibly nonsmooth and non-Lipschitz) regularization term.
Despite intensive work on closely related settings, prior to our work no
complexity bounds for this problem were known. We close this gap by providing
novel excess risk bounds, both in expectation and with high probability. Our
algorithms are nearly optimal, which we prove via novel lower complexity bounds
for this class of problems. We conclude by providing numerical results
comparing our methods to the state of the art.
Authors: Alexandre d'Aspremont, Cristóbal Guzmán, Clément Lezane.
We evaluate our proposed approach on the benchmark dataset IEMOCAP, and demonstrate high performance surpassing that in the literature. The code to reproduce the results is available at github.com/skakouros/s3prl_attentive_correlation.
When recognizing emotions from speech, we encounter two common problems: how
to optimally capture emotion-relevant information from the speech signal and
how to best quantify or categorize the noisy subjective emotion labels.
Self-supervised pre-trained representations can robustly capture information
from speech enabling state-of-the-art results in many downstream tasks
including emotion recognition. However, better ways of aggregating the
information across time need to be considered as the relevant emotion
information is likely to appear piecewise and not uniformly across the signal.
For the labels, we need to take into account that there is a substantial degree
of noise that comes from the subjective human annotations. In this paper, we
propose a novel approach to attentive pooling based on correlations between the
representations' coefficients combined with label smoothing, a method aiming to
reduce the confidence of the classifier on the training labels. We evaluate our
proposed approach on the benchmark dataset IEMOCAP, and demonstrate high
performance surpassing that in the literature. The code to reproduce the
results is available at github.com/skakouros/s3prl_attentive_correlation.
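To make two of the ingredients concrete, here is a minimal PyTorch sketch combining a simple learned attentive pooling over time with label smoothing on the classification loss; the correlation-based pooling of the paper is not reproduced, and the dimensions and class count are illustrative assumptions.

    # Minimal sketch: learned attentive pooling over time plus label smoothing
    # (illustrative stand-in, not the paper's correlation-based pooling).
    import torch

    class AttentivePooling(torch.nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.scorer = torch.nn.Linear(dim, 1)
        def forward(self, frames):                   # frames: (batch, time, dim)
            weights = torch.softmax(self.scorer(frames), dim=1)
            return (weights * frames).sum(dim=1)     # weighted average over time

    pool = AttentivePooling(dim=768)
    classifier = torch.nn.Linear(768, 4)             # e.g., 4 emotion classes
    loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)

    frames = torch.randn(8, 200, 768)                # stand-in for SSL representations
    labels = torch.randint(0, 4, (8,))
    loss = loss_fn(classifier(pool(frames)), labels)
    loss.backward()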
Authors: Sofoklis Kakouros, Themos Stafylakis, Ladislav Mosner, Lukas Burget.
A continuous bundle of $C^*$-algebras provides a rigorous framework to study
the thermodynamic limit of quantum theories. The classical limit is a
mathematical formalization in which convergence of algebraic quantum states to
probability measures on phase space (typically a Poisson or symplectic
manifold) is studied. We additionally show that the ensuing limit corresponds
to the unique probability measure satisfying the so-called classical or static
KMS condition.
A continuous bundle of $C^*$-algebras provides a rigorous framework to study
the thermodynamic limit of quantum theories. If the bundle admits the
additional structure of a strict deformation quantization (in the sense of
Rieffel) one is allowed to study the classical limit of the quantum system,
i.e. a mathematical formalization in which convergence of algebraic quantum
states to probability measures on phase space (typically a Poisson or
symplectic manifold) is studied. In this manner we first prove the existence of
the classical limit of Gibbs states illustrated with a class of Schr\"{o}dinger
operators in the regime where Planck's constant $\hbar$ appearing in front of
the Laplacian approaches zero. We additionally show that the ensuing limit
corresponds to the unique probability measure satisfying the so-called
classical or static KMS condition. Subsequently, a similar study is conducted
for the free energy in the classical limit of mean-field quantum spin systems
in the regime of a large number of particles, and the existence of the classical limit of
the relevant Gibbs states is discussed.
Authors: Christiaan J. F. van de Ven.
Context: Smart TVs have become one of the most popular television types. Many app developers and service providers have designed TV versions for their smartphone applications. The relationship between phone and TV apps has not been the subject of research works. Method: We gather a large-scale set of phone/TV app pairs from the Google Play Store and analyze them from non-code, code, and security and privacy perspectives (e.g., reports of AndroBugs and FlowDroid).
Context: Smart TVs have become one of the most popular television types. Many
app developers and service providers have designed TV versions for their
smartphone applications. Despite the extensive studies on mobile app analysis,
their TV equivalents receive far too little attention, and the relationship
between phone and TV apps has not been the subject of research works.
Objective: In this paper, we aim to characterize the relationship between
smartphone and smart TV apps. To fill this gap, we conduct a comparative study
of smartphone and smart TV apps, which is the starting and fundamental step
toward uncovering the domain-specific challenges. Method: We gather a
large-scale set of phone/TV app pairs from the Google Play Store. We then
analyze the app pairs quantitatively and
qualitatively from a variety of perspectives, including non-code (e.g.,
metadata, resources, permissions, etc.), code (e.g., components, methods, user
interactions, etc.), security and privacy (e.g., reports of AndroBugs and
FlowDroid). Results: Our experimental results indicate that (1) the code of the
smartphone and TV apps can be released in the same app package or in separate
app packages with the same package name; (2) 43% of resource files and 50% of
code methods are reused between phone/TV app pairs; (3) TV and phone versions
of the same app often encounter different kinds of security vulnerabilities;
and (4) TV apps encounter fewer user interactions than their phone versions,
but the types of user interaction events, surprisingly, are similar between
phone/TV apps. Conclusion: Our findings are valuable for developers and
academics in comprehending the TV app ecosystem by providing additional insight
into the migration of phone apps to TVs and the design mechanism of analysis
tools for TV apps.
Authors: Yonghui Liu, Xiao Chen, Yue Liu, Pingfan Kong, Tegawendé F. Bissyande, Jacques Klein, Xiaoyu Sun, Chunyang Chen, John Grundy.
Streaming models are an essential component of real-time speech enhancement
tools. We demonstrate that the proposed technique leads to stable
improvement across different architectures and training scenarios.
Streaming models are an essential component of real-time speech enhancement
tools. The streaming regime constrains speech enhancement models to use only a
tiny context of future information; thus, the low-latency streaming setup is
generally assumed to be challenging and to have a significant negative effect
on model quality. However, due to the sequential nature of streaming
generation, it provides a natural possibility for autoregression, i.e., using
previous predictions when making current ones. In this paper, we present a
simple yet effective trick for training autoregressive low-latency speech
enhancement models. We demonstrate that the proposed technique leads to stable
improvement across different architectures and training scenarios.
Authors: Pavel Andreev, Nicholas Babaev, Azat Saginbaev, Ivan Shchekotov.
The magneto-optic Kerr effect can probe the process of magnetization reversal in ferromagnetic thin films and thus be used as an alternative to magnetometry. The Kerr effect is wavelength-dependent, and the Kerr rotation can reverse sign, vanishing at particular wavelengths. We investigate epitaxial heterostructures of ferromagnetic manganite, La$_{0.7}$Sr$_{0.3}$Mn$_{0.9}$Ru$_{0.1}$O$_3$, by polar Kerr effect and magnetometry. The manganite layers are separated by or interfaced with a layer of nickelate, NdNiO$_3$.
The magneto-optic Kerr effect can probe the process of magnetization reversal
in ferromagnetic thin films and thus be used as an alternative to magnetometry.
The Kerr effect is wavelength-dependent, and the Kerr rotation can reverse sign,
vanishing at particular wavelengths. We investigate epitaxial heterostructures
of ferromagnetic manganite, La$_{0.7}$Sr$_{0.3}$Mn$_{0.9}$Ru$_{0.1}$O$_3$, by
polar Kerr effect and magnetometry. The manganite layers are separated by or
interfaced with a layer of nickelate, NdNiO$_3$. Kerr rotation hysteresis loops
of trilayers, with two manganite layers of different thickness separated by a
nickelate layer, have intriguing humplike features, when measured with light of
400 nm wavelength. By investigating additional reference samples we disentangle
the contributions of the individual layers to the loops: we show that the humps
originate from the opposite sense of the Kerr rotation of the two different
ferromagnetic layers, combined with the additive behavior of the Kerr signal.
Authors: Jörg Schöpf, Paul H. M. van Loosdrecht, Ionela Lindfors-Vrejoiu.