
Showing papers by "Stevens Institute of Technology published in 2019"


Posted Content
TL;DR: This paper analyzes the convergence of Federated Averaging on non-iid data and establishes a convergence rate of $\mathcal{O}(\frac{1}{T})$ for strongly convex and smooth problems, where $T$ is the number of SGDs.
Abstract: Federated learning enables a large number of edge computing devices to jointly learn a model without data sharing. As a leading algorithm in this setting, Federated Averaging (FedAvg) runs Stochastic Gradient Descent (SGD) in parallel on a small subset of the total devices and averages the sequences only once in a while. Despite its simplicity, it lacks theoretical guarantees under realistic settings. In this paper, we analyze the convergence of FedAvg on non-iid data and establish a convergence rate of $\mathcal{O}(\frac{1}{T})$ for strongly convex and smooth problems, where $T$ is the number of SGDs. Importantly, our bound demonstrates a trade-off between communication-efficiency and convergence rate. As user devices may be disconnected from the server, we relax the assumption of full device participation to partial device participation and study different averaging schemes; low device participation rate can be achieved without severely slowing down the learning. Our results indicate that heterogeneity of data slows down the convergence, which matches empirical observations. Furthermore, we provide a necessary condition for FedAvg on non-iid data: the learning rate $\eta$ must decay, even if full-gradient is used; otherwise, the solution will be $\Omega(\eta)$ away from the optimal.

919 citations
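
The Federated Averaging loop described in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: each device holds a scalar quadratic objective `0.5 * a * (w - c)**2`, local steps use the full gradient rather than stochastic gradients, and the names `local_sgd`, `fedavg`, `DEVICES`, `lr0`, and `decay` are hypothetical. The two devices have different curvatures and minimizers, which stands in for non-iid data.

```python
import random


def local_sgd(w, a, c, lrs):
    """Run one device's local steps on f(w) = 0.5 * a * (w - c)**2 (full gradient)."""
    for lr in lrs:
        grad = a * (w - c)
        w -= lr * grad
    return w


def fedavg(devices, rounds, local_steps, lr0, decay, sample_size, seed=0):
    """Toy FedAvg: sample devices, run local steps, average the local models."""
    rng = random.Random(seed)
    w = 0.0   # global model, a single scalar in this toy setup
    t = 0     # global step counter that drives the learning-rate decay
    for _ in range(rounds):
        # Learning rate for each local step; decay=0.0 gives a constant rate.
        lrs = [lr0 / (1.0 + decay * (t + i)) for i in range(local_steps)]
        chosen = rng.sample(devices, sample_size)   # partial participation
        local_models = [local_sgd(w, a, c, lrs) for a, c in chosen]
        w = sum(local_models) / len(local_models)   # server-side averaging
        t += local_steps
    return w


# Two heterogeneous devices given as (curvature a, local minimizer c).
DEVICES = [(1.0, 0.0), (3.0, 4.0)]
# Minimizer of the equally weighted average objective.
W_STAR = sum(a * c for a, c in DEVICES) / sum(a for a, _ in DEVICES)

w_const = fedavg(DEVICES, rounds=200, local_steps=5, lr0=0.1, decay=0.0, sample_size=2)
w_decay = fedavg(DEVICES, rounds=200, local_steps=5, lr0=0.1, decay=0.05, sample_size=2)
print(f"optimum={W_STAR:.3f} constant-lr={w_const:.3f} decaying-lr={w_decay:.3f}")
```

In this toy setup the constant learning rate stalls roughly 0.3 away from the optimum, while the decaying rate ends close to it, which is consistent with the abstract's statement that the learning rate must decay. Setting `sample_size` below the number of devices exercises partial participation.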


Proceedings ArticleDOI
25 Jul 2019
TL;DR: The core idea is to capture the normal patterns of multivariate time series by learning their robust representations with key techniques such as stochastic variable connection and planar normalizing flow, reconstruct input data by the representations, and use the reconstruction probabilities to determine anomalies.
Abstract: Industry devices (i.e., entities) such as server machines, spacecraft, engines, etc., are typically monitored with multivariate time series, whose anomaly detection is critical for an entity's service quality management. However, due to the complex temporal dependence and stochasticity of multivariate time series, their anomaly detection remains a big challenge. This paper proposes OmniAnomaly, a stochastic recurrent neural network for multivariate time series anomaly detection that works robustly for various devices. Its core idea is to capture the normal patterns of multivariate time series by learning their robust representations with key techniques such as stochastic variable connection and planar normalizing flow, reconstruct input data by the representations, and use the reconstruction probabilities to determine anomalies. Moreover, for a detected entity anomaly, OmniAnomaly can provide interpretations based on the reconstruction probabilities of its constituent univariate time series. The evaluation experiments are conducted on two public datasets from aerospace and a new server machine dataset (collected and released by us) from an Internet company. OmniAnomaly achieves an overall F1-Score of 0.86 in three real-world datasets, significantly outperforming the best performing baseline method by 0.09. The interpretation accuracy for OmniAnomaly is up to 0.89.

541 citations


Journal ArticleDOI
TL;DR: This paper develops several methods to represent modulated signals in data formats with gridlike topologies for the CNN and demonstrates the significant performance advantage and application feasibility of the DL-based approach for modulation classification.
Abstract: Deep learning (DL) is a new machine learning (ML) methodology that has found successful implementations in many application domains. However, its usage in communications systems has not been well explored. This paper investigates the use of the DL in modulation classification, which is a major task in many communications systems. The DL relies on a massive amount of data and, for research and applications, this can be easily available in communications systems. Furthermore, unlike the ML, the DL has the advantage of not requiring manual feature selections, which significantly reduces the task complexity in modulation classification. In this paper, we use two convolutional neural network (CNN)-based DL models, AlexNet and GoogLeNet. Specifically, we develop several methods to represent modulated signals in data formats with gridlike topologies for the CNN. The impacts of representation on classification performance are also analyzed. In addition, comparisons with traditional cumulant and ML-based algorithms are presented. Experimental results demonstrate the significant performance advantage and application feasibility of the DL-based approach for modulation classification.

355 citations


Journal ArticleDOI
TL;DR: An efficient CH election scheme is proposed that rotates the CH position among the nodes with higher energy levels than the others, using initial energy, residual energy, and an optimum value of CHs to elect the next group of CHs; it suits IoT applications such as environmental monitoring, smart cities, and systems.
Abstract: Wireless sensor networks (WSNs) group specialized transducers that provide sensing services to Internet of Things (IoT) devices with limited energy and storage resources. Since replacement or recharging of batteries in sensor nodes is almost impossible, power consumption becomes one of the crucial design issues in WSNs. Clustering algorithms play an important role in power conservation for the energy-constrained network. Choosing a cluster head (CH) appropriately can balance the load in the network, thereby reducing energy consumption and enhancing lifetime. This paper focuses on an efficient CH election scheme that rotates the CH position among the nodes with higher energy levels than the others. The algorithm considers initial energy, residual energy, and an optimum value of CHs to elect the next group of CHs for the network, which suits IoT applications such as environmental monitoring, smart cities, and systems. Simulation analysis shows the modified version performs better than the low energy adaptive clustering hierarchy protocol by enhancing the throughput by 60%, lifetime by 66%, and residual energy by 64%.

317 citations


Journal ArticleDOI
17 Jul 2019
TL;DR: A simple yet effective Horizontal Pyramid Matching (HPM) approach to fully exploit various partial information of a given person, so that correct person candidates can still be identified even if some key parts are missing.
Abstract: Despite the remarkable progress in person re-identification (Re-ID), such approaches still suffer from the failure cases where the discriminative body parts are missing. To mitigate this type of failure, we propose a simple yet effective Horizontal Pyramid Matching (HPM) approach to fully exploit various partial information of a given person, so that correct person candidates can be identified even if some key parts are missing. With HPM, we make the following contributions to produce more robust feature representations for the Re-ID task: 1) we learn to classify using partial feature representations at different horizontal pyramid scales, which successfully enhance the discriminative capabilities of various person parts; 2) we exploit average and max pooling strategies to account for person-specific discriminative information in a global-local manner. To validate the effectiveness of our proposed HPM method, extensive experiments are conducted on three popular datasets including Market-1501, DukeMTMCReID and CUHK03. Respectively, we achieve mAP scores of 83.1%, 74.5% and 59.7% on these challenging benchmarks, which are the new state-of-the-arts.

308 citations


Journal ArticleDOI
TL;DR: In this paper, a nonlinear vibration analysis of metal foam circular cylindrical shells reinforced with graphene platelets is performed, and the results demonstrate that GPL reinforced metal foam (GPLRMF) shells exhibit hardening-spring vibration characteristics.

259 citations


Journal ArticleDOI
TL;DR: A review of the burgeoning literature dedicated to Energy Economics/Finance applications of ML suggests that Support Vector Machine, Artificial Neural Network, and Genetic Algorithms are among the most popular techniques used in energy economics papers.

220 citations


Journal ArticleDOI
TL;DR: This paper introduces the concept of wireless aware joint scheduling and computation offloading (JSCO) for multi-component applications, where an optimal decision is made on which components need to be offloaded as well as the scheduling order of these components.
Abstract: Cloud offloading is an indispensable solution to supporting computationally demanding applications on resource constrained mobile devices. In this paper, we introduce the concept of wireless aware joint scheduling and computation offloading (JSCO) for multi-component applications, where an optimal decision is made on which components need to be offloaded as well as the scheduling order of these components. The JSCO approach allows for more degrees of freedom in the solution by moving away from a compiler pre-determined scheduling order for the components towards a more wireless aware scheduling order. For some component dependency graph structures, the proposed algorithm can shorten execution times by processing appropriate components in parallel on the mobile and in the cloud. We define a net utility that trades off the energy saved by the mobile, subject to constraints on the communication delay, overall application execution time, and component precedence ordering. The linear optimization problem is solved using real data measurements obtained from running multi-component applications on an HTC smartphone and the Amazon EC2, using WiFi for cloud offloading. The performance is further analyzed using various component dependency graph topologies and sizes. Results show that the energy saved increases with longer application runtime deadline, higher wireless rates, and smaller offload data sizes.

219 citations


Journal ArticleDOI
TL;DR: A three-phase air pollution monitoring system analogous to Google traffic or the navigation application of Google Maps is proposed, and air quality data can be used to predict future air quality index (AQI) levels.
Abstract: Internet of Things (IoT) is a worldwide system of “smart devices” that can sense and connect with their surroundings and interact with users and other systems. Global air pollution is one of the major concerns of our era. Existing monitoring systems have inferior precision, low sensitivity, and require laboratory analysis. Therefore, improved monitoring systems are needed. To overcome the problems of existing systems, we propose a three-phase air pollution monitoring system. An IoT kit was prepared using gas sensors, Arduino integrated development environment (IDE), and a Wi-Fi module. This kit can be physically placed in various cities to monitor air pollution. The gas sensors gather data from air and forward the data to the Arduino IDE. The Arduino IDE transmits the data to the cloud via the Wi-Fi module. We also developed an Android application termed IoT-Mobair, so that users can access relevant air quality data from the cloud. If a user is traveling to a destination, the pollution level of the entire route is predicted, and a warning is displayed if the pollution level is too high. The proposed system is analogous to Google traffic or the navigation application of Google Maps. Furthermore, air quality data can be used to predict future air quality index (AQI) levels.

214 citations


Journal ArticleDOI
TL;DR: It is found that, under the compound effects of SLR and TC climatology change, the historical 100-year flood level would occur annually in New England and mid-Atlantic regions and every 1–30 years in southeast Atlantic and Gulf of Mexico regions in the late 21st century.
Abstract: One of the most destructive natural hazards, tropical cyclone (TC)–induced coastal flooding, will worsen under climate change. Here we conduct climatology–hydrodynamic modeling to quantify the effects of sea level rise (SLR) and TC climatology change (under RCP 8.5) on late 21st century flood hazards at the county level along the US Atlantic and Gulf Coasts. We find that, under the compound effects of SLR and TC climatology change, the historical 100-year flood level would occur annually in New England and mid-Atlantic regions and every 1–30 years in southeast Atlantic and Gulf of Mexico regions in the late 21st century. The relative effect of TC climatology change increases continuously from New England, mid-Atlantic, southeast Atlantic, to the Gulf of Mexico, and the effect of TC climatology change is likely to be larger than the effect of SLR for over 40% of coastal counties in the Gulf of Mexico. Tropical cyclone-induced coastal flooding will increase under climate change. Here the authors estimate the effects of sea level rise and tropical cyclone climatology change on late–21st–century flood hazards along the US Atlantic and Gulf Coasts and find that the effect of tropical cyclone change could surpass the effect of sea level rise at some areas in the Gulf of Mexico.

206 citations


Journal ArticleDOI
TL;DR: The results provide the first large-sample evidence for the predictive power of textual disclosures and suggest that simpler models such as averaging embedding are more effective than convolutional neural networks.

Journal ArticleDOI
TL;DR: This paper proposes a mass detection method based on CNN deep features and unsupervised extreme learning machine (ELM) clustering and builds a feature set fusing deep features, morphological features, texture features, and density features.
Abstract: A computer-aided diagnosis (CAD) system based on mammograms enables early breast cancer detection, diagnosis, and treatment. However, the accuracy of the existing CAD systems remains unsatisfactory. This paper explores a breast CAD method based on feature fusion with convolutional neural network (CNN) deep features. First, we propose a mass detection method based on CNN deep features and unsupervised extreme learning machine (ELM) clustering. Second, we build a feature set fusing deep features, morphological features, texture features, and density features. Third, an ELM classifier is developed using the fused feature set to classify benign and malignant breast masses. Extensive experiments demonstrate the accuracy and efficiency of our proposed mass detection and breast cancer classification method.

Journal ArticleDOI
TL;DR: A Joint Sentiment-Topic model is used to extract the topics and associated sentiments in review texts and proposes that numerical rating mediates the effects of textual sentiments.

Journal ArticleDOI
TL;DR: The authors implant sacrificial templates subcutaneously to build an organised ECM scaffold, and following template removal and decellularisation use these scaffolds to create functionally integrated muscle, nerve and artery in vivo.
Abstract: Implanted scaffolds with inductive niches can facilitate the recruitment and differentiation of host cells, thereby enhancing endogenous tissue regeneration. Extracellular matrix (ECM) scaffolds derived from cultured cells or natural tissues exhibit superior biocompatibility and trigger favourable immune responses. However, the lack of hierarchical porous structure fails to provide cells with guidance cues for directional migration and spatial organization, and consequently limit the morpho-functional integration for oriented tissues. Here, we engineer ECM scaffolds with parallel microchannels (ECM-C) by subcutaneous implantation of sacrificial templates, followed by template removal and decellularization. The advantages of such ECM-C scaffolds are evidenced by close regulation of in vitro cell activities, and enhanced cell infiltration and vascularization upon in vivo implantation. We demonstrate the versatility and flexibility of these scaffolds by regenerating vascularized and innervated neo-muscle, vascularized neo-nerve and pulsatile neo-artery with functional integration. This strategy has potential to yield inducible biomaterials with applications across tissue engineering and regenerative medicine.

Book ChapterDOI
05 Jun 2019
TL;DR: Password guessing tools such as HashCat and John the Ripper can expand password dictionaries using password generation rules, such as concatenation of words (e.g., “password123456”) and leet speak.
Abstract: State-of-the-art password guessing tools, such as HashCat and John the Ripper, enable users to check billions of passwords per second against password hashes. In addition to performing straightforward dictionary attacks, these tools can expand password dictionaries using password generation rules, such as concatenation of words (e.g., “password123456”) and leet speak (e.g., “password” becomes “p4s5w0rd”). Although these rules work well in practice, creating and expanding them to model further passwords is a labor-intensive task that requires specialized expertise.
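
The two rule types named in the abstract, word concatenation and leet speak, can be illustrated with a short sketch. This is plain Python for illustration only and is not the rule syntax of HashCat or John the Ripper; the substitution table `LEET` and the helper names `leet_variants` and `concatenations` are assumptions.

```python
from itertools import product

# Illustrative leet-speak substitutions (an assumption, not a tool's rule set).
LEET = {"a": "4", "e": "3", "o": "0", "s": "5"}


def leet_variants(word):
    """Return every spelling obtained by optionally substituting each mapped letter."""
    # Each character offers one choice (kept as-is) or two (kept or substituted).
    options = [(ch, LEET[ch]) if ch in LEET else (ch,) for ch in word]
    return {"".join(combo) for combo in product(*options)}


def concatenations(words, suffixes):
    """Return every word followed by every suffix."""
    return {word + suffix for word in words for suffix in suffixes}


variants = leet_variants("password")
candidates = concatenations(["password"], ["123456"])
print(len(variants), "p4s5w0rd" in variants, "password123456" in candidates)
```

Enumerating substitutions per character, instead of replacing every occurrence at once, is what lets the sketch reach mixed spellings such as “p4s5w0rd”, where only one of the two letters “s” is replaced.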

Journal ArticleDOI
TL;DR: In this paper, an overview of the rheological properties of UHPC, applicable flow models, measurement techniques and errors associated with the interpretation of Rheological measurements are discussed.

Journal ArticleDOI
20 Sep 2019
TL;DR: In this paper, a chip-integrated lithium niobate microring resonator with a quasi-phase-matched frequency conversion achieved 230,000%/W or $10^{-6}$ per single photon.
Abstract: We demonstrate quasi-phase-matched frequency conversion in a chip-integrated lithium niobate microring resonator, whose normalized efficiency reaches 230,000%/W or $10^{-6}$ per single photon.

Journal ArticleDOI
TL;DR: A comprehensive review of different versions of the KH algorithm and their engineering applications is presented and specific features of KH and future directions are discussed.
Abstract: Krill herd (KH) is a novel swarm-based metaheuristic optimization algorithm inspired by the krill herding behavior. The objective function in the KH optimization process is based on the least distance between the food location and position of a krill. The KH method has been proven to outperform several state-of-the-art metaheuristic algorithms on many benchmarks and engineering cases. This paper presents a comprehensive review of different versions of the KH algorithm and their engineering applications. The study is divided into the following general parts: KH variants, engineering optimization/application, and theoretical analysis. In addition, specific features of KH and future directions are discussed.

Journal ArticleDOI
TL;DR: A comprehensive review of the design guidelines of TENGs, their performance, and their designs in the context of Internet of Things (IoT) applications is presented, and different designs of power management circuits, supercapacitors, and batteries that can be integrated with TENG devices are reviewed.
Abstract: Since their debut in 2012, triboelectric nanogenerators (TENGs) have attained high performance in terms of both energy density and instantaneous conversion, reaching up to 500 W m$^{-2}$ and 85%, respectively, synchronous with multiple energy sources and hybridized designs. Here, a comprehensive review of the design guidelines of TENGs, their performance, and their designs in the context of Internet of Things (IoT) applications is presented. The development stages of TENGs in large-scale self-powered systems and technological applications enabled by harvesting energy from water waves or wind energy sources are also reviewed. This self-powered capability is essential considering that IoT applications should be capable of operation anywhere and anytime, supported by a network of energy harvesting systems in arbitrary environments. In addition, this review paper investigates the development of self-charging power units (SCPUs), which can be realized by pairing TENGs with energy storage devices, such as batteries and capacitors. Consequently, different designs of power management circuits, supercapacitors, and batteries that can be integrated with TENG devices are also reviewed. Finally, the significant factors that need to be addressed when designing and optimizing TENG-based systems for energy harvesting and self-powered sensing applications are discussed.

Journal ArticleDOI
TL;DR: Graphene-based conductive nanofibrous scaffolds are explored with the possibility of combining the conductive properties of graphene with electrospun nanofiber to create the electroactive biomimetic scaffolds for nerve tissue regeneration.

Journal ArticleDOI
TL;DR: An alternating direction method of multipliers-based distributed state estimation method is developed to overcome the limitation of conventional state estimation and performance analysis of smart grid against a single type of cyber attacks.
Abstract: Smart grid (SG) represents a large-scale network system with the tight integration of a physical power network and an information network, which makes it more vulnerable to hybrid cyber attacks against different regional subsystems. First, an alternating direction method of multipliers-based distributed state estimation method is developed to overcome the limitation of conventional state estimation and performance analysis of SG against a single type of cyber attacks. Regional subsystems are partitioned via the $K$-means method. Second, a novel distributed state estimation method integrated with the characteristics of data deception attacks and denial of service (DoS) attacks is proposed to account for the simultaneous presence of different cyber attacks on individual regional subsystems. Third, the convergence of a distributed state estimation algorithm under hybrid cyber attacks is proved theoretically. Furthermore, the relationships between the convergence and algorithm parameters as well as the occurring probability of DoS attacks are established. Finally, the simulations on a modified IEEE 118-bus system are given to demonstrate the feasibility and effectiveness of the proposed method.

Journal ArticleDOI
TL;DR: A thought experiment is considered where a massive body in a spatial superposition leads to entanglement of temporal orders between time-like events, resulting in a violation of a Bell-type inequality.
Abstract: Time has a fundamentally different character in quantum mechanics and in general relativity. In quantum theory events unfold in a fixed order while in general relativity temporal order is influenced by the distribution of matter. When matter requires a quantum description, temporal order is expected to become non-classical, a scenario beyond the scope of current theories. Here we provide a direct description of such a scenario. We consider a thought experiment with a massive body in a spatial superposition and show how it leads to entanglement of temporal orders between time-like events. This entanglement enables accomplishing a task, violation of a Bell inequality, that is impossible under local classical temporal order; it means that temporal order cannot be described by any pre-defined local variables. A classical notion of a causal structure is therefore untenable in any framework compatible with the basic principles of quantum mechanics and classical general relativity.

Posted Content
TL;DR: This paper presents an Asynchronous Online Federated Learning (ASO-Fed) framework, where the edge devices perform online learning with continuous streaming local data and a central server aggregates model parameters from clients in an asynchronous manner.
Abstract: Federated learning (FL) is a machine learning paradigm where a shared central model is learned across distributed edge devices while the training data remains on these devices. Federated Averaging (FedAvg) is the leading optimization method for training non-convex models in this setting with a synchronized protocol. However, the assumptions made by FedAvg are not realistic given the heterogeneity of devices. In particular, the volume and distribution of collected data vary in the training process due to different sampling rates of edge devices. The edge devices themselves also vary in their available communication bandwidth and system configurations, such as memory, processor speed, and power requirements. This leads to vastly different training times as well as model/data transfer times. Furthermore, availability issues at edge devices can lead to a lack of contribution from specific edge devices to the federated model. In this paper, we present an Asynchronous Online Federated Learning (ASO-Fed) framework, where the edge devices perform online learning with continuous streaming local data and a central server aggregates model parameters from clients. Our framework updates the central model in an asynchronous manner to tackle the challenges associated with both varying computational loads at heterogeneous edge devices and edge devices that lag behind or drop out. We perform extensive experiments on a simulated benchmark image dataset and three real-world non-IID streaming datasets. The results demonstrate the effectiveness of ASO-Fed in converging fast and maintaining good prediction performance.

Journal ArticleDOI
TL;DR: A best-response decentralized algorithm is proposed to identify the optimal operation schedule of the coupled infrastructure, which interprets a market equilibrium as neither system has an incentive to alter their strategies.
Abstract: Combined harnessing of electrical and thermal energies could leverage their complementary nature, inspiring the integration of power grids and centralized heating systems in future smart cities. This paper considers interconnected power distribution network (PDN) and district heating network (DHN) infrastructures through combined heat and power units and heat pumps. In the envisioned market framework, the DHN operator solves an optimal thermal flow problem given the nodal electricity prices and determines the best heat production strategy. Variate coefficients of performance of heat pumps with respect to different load levels are considered and modeled in a disciplined convex optimization format. A two-step hydraulic-thermal decomposition method is suggested to approximately solve the optimal thermal flow problem via a second-order cone program. Simultaneously, the PDN operator clears the distribution power market via an optimal power flow problem given the demands from the DHN. Electricity prices are revealed by dual variables at the optimal solution. The whole problem gives rise to a Nash-type game between the two systems. A best-response decentralized algorithm is proposed to identify the optimal operation schedule of the coupled infrastructure, which interprets a market equilibrium as neither system has an incentive to alter their strategies. Numeric results demonstrate the potential benefits of the proposed framework in terms of reducing wind curtailment and system operation cost.

Journal ArticleDOI
TL;DR: In this article, two types of peer-to-peer (P2P) electricity trading mechanisms, namely auction-based and bilateral contract-based P2P electricity trading, are discussed.

Journal ArticleDOI
TL;DR: The proposed distributionally robust scheduling model maximizes the base-case system social welfare minus the worst-case expected load shedding cost, and is cast into a mixed-integer linear programming problem to enhance computational tractability.
Abstract: This paper proposes a distributionally robust scheduling model for the integrated gas-electricity system (IGES) with electricity and gas load uncertainties, and further studies the impact of integrated gas-electricity demand response (DR) on energy market clearing, as well as locational marginal electricity and gas prices (LMEPs and LMGPs). The proposed model maximizes the base-case system social welfare (i.e., revenue from price-sensitive DR loads minus energy production cost) minus the worst-case expected load shedding cost. Price-based gas-electricity DRs are formulated via price-sensitive demand bidding curves while considering DR participation levels and energy curtailment limits. By linearizing nonlinear Weymouth gas flow equations via Taylor series expansion and further approximating recourse decisions as affine functions of uncertainty parameters, the formulation is cast into a mixed-integer linear programming problem to enhance computational tractability. Case studies illustrate effectiveness of the proposed model for ensuring system security against uncertainties, avoiding potential transmission congestions, and increasing financial stability of DR providers.
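
The objective described in the abstract has the generic shape of a distributionally robust program. The form below is a hedged sketch only, and all symbols are assumptions not taken from the paper: $x \in \mathcal{X}$ stands for the base-case scheduling decisions, $W(x)$ for the base-case social welfare, $\xi$ for the uncertain electricity and gas loads, $\mathcal{P}$ for the set of candidate load distributions, and $Q(x, \xi)$ for the load shedding cost.

```latex
\max_{x \in \mathcal{X}} \; W(x) \;-\; \sup_{\mathbb{P} \in \mathcal{P}} \mathbb{E}_{\mathbb{P}}\!\left[ Q(x, \xi) \right]
```

The inner supremum picks the distribution in $\mathcal{P}$ that makes the expected load shedding cost largest, so the schedule is evaluated against the worst case rather than a single assumed distribution.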

Journal ArticleDOI
TL;DR: This paper proposes an energy-efficient solution minimizing the UAV and/or sensors energy consumption while accomplishing a tour to collect data from the spatially distributed wireless sensors.
Abstract: Unmanned aerial vehicles (UAVs) or drones have attracted growing interest in the last few years for multiple applications, thanks to their advantages in terms of mobility, easy movement, and flexible positioning. In UAV-based communications, mobility and higher line-of-sight probability represent opportunities for the flying UAVs while the limited battery capacity remains their major challenge. Thus, they can be employed for specific applications where their permanent presence is not mandatory. Data gathering from wireless sensor networks is one of these applications. This paper proposes an energy-efficient solution minimizing the UAV and/or sensors energy consumption while accomplishing a tour to collect data from the spatially distributed wireless sensors. The objective is to determine the positions of the UAV “stops” from which it can collect data from a subset of sensors located in the same neighborhood and find the path that the UAV should follow to complete its data gathering tour in an energy-efficient manner. A non-convex optimization problem is first formulated; then, an efficient and low-complexity technique is proposed to iteratively achieve a sub-optimal solution. The initial problem is decomposed into three sub-problems: The first sub-problem optimizes the positioning of the stops using linearization. The second one determines the sensors assignment to stops using clustering. Finally, the path among these stops is optimized using the traveling salesman problem. Selected numerical results show the behavior of the UAV versus various system parameters and that the achieved energy is considerably reduced compared to the one of existing approaches.
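
The last of the three sub-problems, ordering the stops into a tour, can be illustrated with a brute-force sketch. It is not the authors' method: it assumes the stop positions are already fixed, uses Euclidean travel distance as a stand-in for energy, and is only practical for a handful of stops. The names `tour_length`, `shortest_tour`, `BASE`, and `STOPS` are hypothetical.

```python
from itertools import permutations
from math import dist


def tour_length(start, order):
    """Length of the closed tour start -> stops in the given order -> start."""
    path = [start, *order, start]
    return sum(dist(p, q) for p, q in zip(path, path[1:]))


def shortest_tour(start, stops):
    """Try every visiting order and keep the shortest closed tour."""
    best = min(permutations(stops), key=lambda order: tour_length(start, order))
    return list(best), tour_length(start, best)


BASE = (0.0, 0.0)                               # assumed take-off and landing point
STOPS = [(0.0, 3.0), (4.0, 3.0), (4.0, 0.0)]    # assumed hovering positions
order, length = shortest_tour(BASE, STOPS)
print(order, length)
```

Exhaustive search grows factorially with the number of stops, so a larger instance would need a heuristic or a dedicated solver.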

Journal ArticleDOI
01 Dec 2019
TL;DR: CCSA is a successful improvement to tackle the imbalance search strategy and premature convergence problems of the crow search algorithm and finds the best optimal solution for the applied problems of engineering design.
Abstract: In this paper, a conscious neighborhood-based crow search algorithm (CCSA) is proposed for solving global optimization and engineering design problems. It is a successful improvement to tackle the imbalance search strategy and premature convergence problems of the crow search algorithm. CCSA introduces three new search strategies called neighborhood-based local search (NLS), non-neighborhood based global search (NGS) and wandering around based search (WAS) in order to improve the movement of crows in different search spaces. Moreover, a neighborhood concept is defined to select the movement strategy between NLS and NGS consciously, which enhances the balance between local and global search. The proposed CCSA is evaluated on several benchmark functions and four applied problems of engineering design. In all experiments, CCSA is compared with other state-of-the-art swarm intelligence algorithms: CSA, BA, CLPSO, GWO, EEGWO, WOA, KH, ABC, GABC, and Best-so-far ABC. The experimental and statistical results show that CCSA is very competitive especially for large-scale optimization problems, and it is significantly superior to the compared algorithms. Furthermore, the proposed algorithm also finds the best optimal solution for the applied problems of engineering design.

Proceedings ArticleDOI
08 Apr 2019
TL;DR: In this paper, a federated learning framework for securely accessing and meta-analyzing any biomedical data without sharing individual information is proposed, which is first tested on synthetic data and then applied to multi-centric, multidatabase studies including ADNI, PPMI, MIRIAD and UK Biobank.
Abstract: At this moment, databanks worldwide contain brain images of previously unimaginable numbers. Combined with developments in data science, these massive data provide the potential to better understand the genetic underpinnings of brain diseases. However, different datasets, which are stored at different institutions, cannot always be shared directly due to privacy and legal concerns, thus limiting the full exploitation of big data in the study of brain disorders. Here we propose a federated learning framework for securely accessing and meta-analyzing any biomedical data without sharing individual information. We illustrate our framework by investigating brain structural relationships across diseases and clinical cohorts. The framework is first tested on synthetic data and then applied to multi-centric, multi-database studies including ADNI, PPMI, MIRIAD and UK Biobank, showing the potential of the approach for further applications in distributed analysis of multi-centric cohorts.

Journal ArticleDOI
TL;DR: The authors show that the transcription factors HNF4A and HNF4G regulate the transcriptome of the intestinal epithelium, which provides a framework to understand regenerative tissue homeostasis, particularly in tissues with inherent cellular plasticity.
Abstract: BMP/SMAD signaling is a crucial regulator of intestinal differentiation [1–4]. However, the molecular underpinnings of the BMP pathway in this context are unknown. Here, we characterize the mechanism by which BMP/SMAD signaling drives enterocyte differentiation. We establish that the transcription factor HNF4A acts redundantly with an intestine-restricted HNF4 paralog, HNF4G, to activate enhancer chromatin and upregulate the majority of transcripts enriched in the differentiated epithelium; cells fail to differentiate on double knockout of both HNF4 paralogs. Furthermore, we show that SMAD4 and HNF4 function via a reinforcing feed-forward loop, activating each other’s expression and co-binding to regulatory elements of differentiation genes. This feed-forward regulatory module promotes and stabilizes enterocyte cell identity; disruption of the HNF4–SMAD4 module results in loss of enterocyte fate in favor of progenitor and secretory cell lineages. This intersection of signaling and transcriptional control provides a framework to understand regenerative tissue homeostasis, particularly in tissues with inherent cellular plasticity [5]. The authors show that the transcription factors HNF4A and HNF4G regulate the transcriptome of the intestinal epithelium. HNF4 factors cooperate with BMP/SMAD signaling to promote enterocyte identity.