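Early Results of Bypass Surgery Using Only Bilateral Internal Thoracic Arteries with a Y-Anastomosis in Patients with Multivessel Coronary Artery Disease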

Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit submarine fan systems better. Calciclastic submarine fans are consequently rarely described and poorly understood. As a result, very little is known about them, especially about mud-dominated calciclastic submarine fan systems. This study presents a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of the Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) calciturbidites, comprising mostly high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones, which are characterised by planar-laminated and unlaminated mud-dominated facies; and 3) calcidebrites, which are muddy or hyper-concentrated debris-flow deposits occurring as poorly sorted, chaotic, mud-supported floatstones. These

PhD by thesis

Predicting user responses, such as clicks and conversions, is of great importance and is used in many Web applications, including recommender systems, web search and online advertising. The data in those applications is mostly categorical and contains multiple fields; it is typically transformed into a high-dimensional sparse binary feature representation via one-hot encoding. Faced with this extreme sparsity, traditional models are limited in their capacity to mine anything beyond shallow patterns in the data, i.e. low-order feature combinations. Deep models such as deep neural networks, on the other hand, cannot be applied directly to the high-dimensional input because of the huge feature space. In this paper, we propose Product-based Neural Networks (PNN), with an embedding layer to learn a distributed representation of the categorical data, a product layer to capture interactive patterns between inter-field categories, and further fully connected layers to explore high-order feature interactions. Our experimental results on two large-scale real-world ad click datasets demonstrate that PNNs consistently outperform the state-of-the-art models on various metrics.

Product-based Neural Networks for User Response Prediction
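
The three stages named in the abstract above (embedding layer, product layer, fully connected layers) can be illustrated with a small forward pass. This is a minimal sketch, not the authors' implementation: it assumes the product operation is a pairwise inner product between field embeddings, and the function names, layer sizes, and random weights are all hypothetical.

```python
import math
import random

def embed(field_values, tables):
    """Embedding lookup: a one-hot vector times an embedding matrix selects one row."""
    return [tables[f][v] for f, v in enumerate(field_values)]

def product_layer(embs):
    """Concatenate the embeddings (linear part) with all pairwise inner products."""
    linear = [x for e in embs for x in e]
    pairs = [
        sum(a * b for a, b in zip(embs[i], embs[j]))
        for i in range(len(embs))
        for j in range(i + 1, len(embs))
    ]
    return linear + pairs

def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def init_params(field_sizes, dim, hidden, seed=0):
    """Random, untrained parameters (assumption: small uniform initialisation)."""
    rng = random.Random(seed)
    n = len(field_sizes)
    width = n * dim + n * (n - 1) // 2  # linear part + one value per field pair

    def rand():
        return rng.uniform(-0.1, 0.1)

    return {
        "tables": [[[rand() for _ in range(dim)] for _ in range(size)]
                   for size in field_sizes],
        "w1": [[rand() for _ in range(width)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [[rand() for _ in range(hidden)]],
        "b2": [0.0],
    }

def pnn_forward(field_values, params):
    """Predicted response probability for one record of category indices."""
    embs = embed(field_values, params["tables"])
    h = relu(dense(product_layer(embs), params["w1"], params["b1"]))
    return sigmoid(dense(h, params["w2"], params["b2"])[0])

# Three fields with 3, 4 and 2 categories; the record picks one category per field.
params = init_params(field_sizes=[3, 4, 2], dim=4, hidden=8)
p_click = pnn_forward([2, 0, 1], params)
```

With untrained weights the output is only a placeholder probability; the point is the data flow, in which the product layer widens the input to the first hidden layer by one term per pair of fields.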

User response prediction is a crucial component of personalized information retrieval and filtering scenarios, such as recommender systems and web search. The data in user response prediction is mostly in a multi-field categorical format and is transformed into sparse representations via one-hot encoding. Due to the sparsity problems in representation and optimization, most research focuses on feature engineering and shallow modeling. Recently, deep neural networks have attracted research attention on this problem for their high capacity and end-to-end training scheme. In this article, we study user response prediction in the scenario of click prediction. We first analyze a coupled gradient issue in latent vector-based models and propose kernel product to learn field-aware feature interactions. Then, we discuss an insensitive gradient issue in DNN-based models and propose Product-based Neural Network, which adopts a feature extractor to explore feature interactions. Generalizing the kernel product to a net-in-net architecture, we further propose Product-network in Network (PIN), which can generalize previous models. Extensive experiments on four industrial datasets and one contest dataset demonstrate that our models consistently outperform eight baselines on both area under curve and log loss. In addition, PIN achieves a large click-through rate improvement (34.67% relative) in an online A/B test.

Product-Based Neural Networks for User Response Prediction over Multi-Field Categorical Data
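
The one-hot transformation of multi-field categorical data mentioned in the abstract above can be sketched as follows. The field names, category vocabularies, and helper names are invented for illustration; only the general idea (one active position per field in a concatenated binary vector) comes from the text.

```python
def one_hot_indices(record, vocab):
    """Return the positions of the 1-entries in the concatenated one-hot vector.

    vocab maps each field name to its ordered list of categories; fields are
    laid out one after another in the order they appear in vocab.
    """
    indices = []
    offset = 0
    for field, categories in vocab.items():
        indices.append(offset + categories.index(record[field]))
        offset += len(categories)
    return indices

def to_dense(indices, size):
    """Expand the sparse index list into an explicit 0/1 vector."""
    dense = [0] * size
    for i in indices:
        dense[i] = 1
    return dense

# Hypothetical vocabulary: three fields with 3, 2 and 2 categories (7 columns total).
vocab = {
    "weekday": ["Mon", "Tue", "Wed"],
    "gender": ["F", "M"],
    "city": ["London", "Paris"],
}
record = {"weekday": "Tue", "gender": "M", "city": "London"}

sparse = one_hot_indices(record, vocab)   # one active index per field
dense = to_dense(sparse, size=7)
```

Every record activates exactly one position per field, so with realistic vocabularies of thousands or millions of categories almost all entries are zero. This is why the index list is usually stored instead of the dense vector.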

Unmanned surface vehicles (USVs) have seen rapid growth in the recent decade and have been applied in various practical applications in both military and civilian domains. USVs can be deployed either as a single unit or as multiple vehicles in a fleet to conduct ocean missions. Central to the control of USVs and USV formations, path planning is the key technology that ensures navigation safety by generating collision-free trajectories. Compared with conventional path planning algorithms, deep reinforcement learning (RL) based planning algorithms provide a new solution by integrating high-level artificial intelligence. This work investigates the application of deep reinforcement learning algorithms to USV and USV formation path planning, with a specific focus on reliable obstacle avoidance in constrained maritime environments. For single USV planning, where the primary aim is to calculate the shortest collision-avoiding path, the designed RL path planning algorithm is also able to address other complex issues such as compliance with vehicle motion constraints. The USV formation maintenance algorithm is capable of calculating suitable paths for the formation and of retaining the formation shape robustly or varying shapes where necessary, which is promising for assisting navigation in environments with cluttered obstacles. The three sets of algorithms developed are validated and tested in computer-based simulations and in practical maritime environments extracted from real harbour areas in the UK.

/pdf/learn-to-navigate-cooperative-path-planning-for-unmanned-237ely8ng0.pdf

Learn to Navigate: Cooperative Path Planning for Unmanned Surface Vehicles Using Deep Reinforcement Learning

Coordination is one of the essential problems in multi-agent systems. Typically, multi-agent reinforcement learning (MARL) methods treat agents equally, and the goal is to solve the Markov game to an arbitrary Nash equilibrium (NE) when multiple equilibria exist, thus lacking a solution for NE selection. In this paper, we treat agents unequally and consider Stackelberg equilibrium as a potentially better convergence point than Nash equilibrium in terms of Pareto superiority, especially in cooperative environments. Under Markov games, we formally define the bi-level reinforcement learning problem of finding Stackelberg equilibrium. We propose a novel bi-level actor-critic learning method that allows agents to have different knowledge bases (thus intelligent), while their actions can still be executed simultaneously and in a distributed manner. A convergence proof is given, and the resulting learning algorithm is tested against the state of the art. We found that the proposed bi-level actor-critic algorithm successfully converged to the Stackelberg equilibria in matrix games and found an asymmetric solution in a highway merge environment.

/pdf/bi-level-actor-critic-for-multi-agent-coordination-1ewx24wqmg.pdf

Bi-level Actor-Critic for Multi-agent Coordination
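
The equilibrium-selection issue described in the abstract above can be made concrete on a tiny matrix game. The sketch below is plain enumeration, not the bi-level actor-critic algorithm: it lists the pure Nash equilibria of a two-player game and computes the Stackelberg outcome, in which a leader commits to an action and the follower best-responds. The payoff values and function names are made up for illustration.

```python
def pure_nash(A, B):
    """All pure-strategy Nash equilibria of a bimatrix game.

    A[i][j] is the row player's payoff, B[i][j] the column player's payoff.
    """
    rows, cols = len(A), len(A[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            row_best = all(A[i][j] >= A[k][j] for k in range(rows))
            col_best = all(B[i][j] >= B[i][l] for l in range(cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

def stackelberg(A, B):
    """Leader (row player) commits first; follower (column player) best-responds.

    Assumption: follower ties are broken in favour of the lowest action index.
    """
    best = None
    for i in range(len(A)):
        j = max(range(len(B[i])), key=lambda l: B[i][l])
        if best is None or A[i][j] > A[best[0]][best[1]]:
            best = (i, j)
    return best

# Cooperative coordination game with two equilibria of different quality.
A = [[10, 0],
     [0, 5]]
B = [[10, 0],
     [0, 5]]

nash = pure_nash(A, B)      # both (0, 0) and (1, 1) are Nash equilibria
leader_follower = stackelberg(A, B)
```

In this example the Nash concept alone does not say which of the two equilibria the agents reach, whereas the leader-follower ordering singles out the Pareto-superior one.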

Knowledge tracing (KT) is the task of predicting whether students can correctly answer questions based on their historical responses. Although much research has been devoted to exploiting question information, the rich information among questions and skills has not been well extracted, making it challenging for previous work to perform adequately. In this paper, we demonstrate that large gains on KT can be realized by pre-training embeddings for each question on abundant side information, followed by training deep KT models on the obtained embeddings. Specifically, the side information includes question difficulty and three kinds of relations contained in a bipartite graph between questions and skills. To pre-train the question embeddings, we propose to use product-based neural networks to recover the side information. As a result, adopting the pre-trained embeddings in existing deep KT models significantly outperforms state-of-the-art baselines on three common KT datasets.

/pdf/improving-knowledge-tracing-via-pre-training-question-597o7u06tg.pdf

Improving Knowledge Tracing via Pre-training Question Embeddings.

Learning and predicting user responses, such as clicks and conversions, are crucial for many Internet-based businesses including web search, e-commerce, and online advertising. Typically, a user response model is established by optimizing the prediction accuracy, e.g., minimizing the error between the prediction and the ground truth user response. However, in many practical cases, predicting user responses is only part of a larger predictive or optimization task, where on one hand, the accuracy of a user response prediction determines the final (expected) utility to be optimized, but on the other hand, its learning may also be influenced by the follow-up stochastic process. It is, thus, of great interest to optimize the entire process as a whole rather than treat the parts independently or sequentially. In this paper, we take real-time display advertising as an example, where the user's predicted ad click-through rate (CTR) is employed to calculate a bid for an ad impression in the second price auction. We reformulate a common logistic regression CTR model by putting it back into its subsequent bidding context: rather than minimizing the prediction error, the model parameters are learned directly by optimizing campaign profit. The gradient update resulting from our formulations naturally fine-tunes the cases where the market competition is high, leading to more cost-effective bidding. Our experiments demonstrate that, while maintaining comparable CTR prediction accuracy, our proposed user response learning leads to campaign profit gains of as much as 78.2% in offline tests and 25.5% in online A/B tests over strong baselines.

/pdf/user-response-learning-for-directly-optimizing-campaign-2h1xxrg64b.pdf

User Response Learning for Directly Optimizing Campaign Performance in Display Advertising
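
To see why prediction error and campaign profit can diverge, as the abstract above argues, consider a replay of logged impressions through a second price auction. This is a hedged sketch, not the paper's learning method: it assumes a linear bid (value per click times predicted CTR), that the winner pays the logged market price, and a made-up three-impression log. All names are hypothetical.

```python
def campaign_profit(impressions, value_per_click, predict_ctr):
    """Replay logged impressions and sum the profit of the auctions that are won.

    Each impression is (features, market_price, clicked). In a second price
    auction the winner pays the highest competing bid, here the market price.
    """
    profit = 0.0
    for features, market_price, clicked in impressions:
        bid = value_per_click * predict_ctr(features)  # assumed linear bidding
        if bid > market_price:
            profit += value_per_click * clicked - market_price
    return profit

# Toy log: the "features" are just a number that the toy models read as CTR.
log = [
    (0.02, 1.0, 1),   # cheap impression that led to a click
    (0.001, 0.5, 0),  # low-value impression
    (0.03, 5.0, 0),   # expensive impression with no click
]

calibrated = campaign_profit(log, value_per_click=100.0, predict_ctr=lambda x: x)
overbidding = campaign_profit(log, value_per_click=100.0, predict_ctr=lambda x: 2 * x)
```

The over-predicting model wins the expensive third impression and loses money on it, so the cost of an error depends on where the bid sits relative to the market price, which a plain accuracy objective does not take into account.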

Offline reinforcement learning leverages previously collected offline datasets to learn optimal policies without needing to access the real environment. Such a paradigm is also desirable for multi-agent reinforcement learning (MARL) tasks, given the combinatorially increased interactions among agents and with the environment. However, in MARL, the paradigm of offline pre-training with online fine-tuning has not been studied, nor are datasets or benchmarks for offline MARL research available. In this paper, we facilitate the research by providing large-scale datasets and using them to examine the usage of the decision transformer in the context of MARL. We investigate the generalization of MARL offline pre-training in the following three aspects: 1) between single agents and multiple agents, 2) from offline pre-training to online fine-tuning, and 3) across multiple downstream tasks with few-shot and zero-shot capabilities. We start by introducing the first offline MARL dataset with diverse quality levels based on the StarCraft II environment, and then propose the novel architecture of multi-agent decision transformer (MADT) for effective offline learning. MADT leverages the transformer's ability for sequence modelling and integrates it seamlessly with both offline and online MARL tasks. A significant benefit of MADT is that it learns generalizable policies that can transfer between different types of agents under different task scenarios. On the StarCraft II offline dataset, MADT outperforms the state-of-the-art offline reinforcement learning (RL) baselines, including BCQ and CQL. When applied to online tasks, the pre-trained MADT significantly improves sample efficiency and enjoys strong performance in both few-shot and zero-shot cases. To the best of our knowledge, this is the first work that studies and demonstrates the effectiveness of offline pre-trained models in terms of sample efficiency and generalizability enhancements for MARL.

/pdf/offline-pre-trained-multi-agent-decision-transformer-baadhg5z.pdf

Haifeng Zhang

Papers

Learn to Navigate: Cooperative Path Planning for Unmanned Surface Vehicles Using Deep Reinforcement Learning

Bi-level Actor-Critic for Multi-agent Coordination

Improving Knowledge Tracing via Pre-training Question Embeddings.

User Response Learning for Directly Optimizing Campaign Performance in Display Advertising

Offline Pre-trained Multi-agent Decision Transformer