
Showing papers on "Hierarchy (mathematics) published in 2018"


Proceedings Article
27 Sep 2018
TL;DR: The novel recurrent architecture, ordered neurons LSTM (ON-LSTM), achieves good performance on four different tasks: language modeling, unsupervised parsing, targeted syntactic evaluation, and logical inference.
Abstract: Natural language is hierarchically structured: smaller units (e.g., phrases) are nested within larger units (e.g., clauses). When a larger constituent ends, all of the smaller constituents that are nested within it must also be closed. While the standard LSTM architecture allows different neurons to track information at different time scales, it does not have an explicit bias towards modeling a hierarchy of constituents. This paper proposes to add such an inductive bias by ordering the neurons; a vector of master input and forget gates ensures that when a given neuron is updated, all the neurons that follow it in the ordering are also updated. Our novel recurrent architecture, ordered neurons LSTM (ON-LSTM), achieves good performance on four different tasks: language modeling, unsupervised parsing, targeted syntactic evaluation, and logical inference.

245 citations
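The ordering mechanism lends itself to a compact sketch. The gate below uses a cumulative-softmax ("cumax") activation of the kind ordered master gates are built from; the array sizes and logit values are illustrative, not taken from the paper.

```python
import numpy as np

def cumax(logits):
    """Cumulative softmax: a monotonically non-decreasing vector in [0, 1].

    Used to build 'master' gates over ordered neurons: once the gate
    opens at some position in the ordering, it stays open for every
    later position, so updating one neuron also updates all the neurons
    that follow it."""
    e = np.exp(logits - logits.max())
    return np.cumsum(e / e.sum())

# Illustrative master forget gate over 6 ordered neurons.
f_master = cumax(np.array([0.1, 0.2, 3.0, 0.1, 0.1, 0.1]))

assert np.all(np.diff(f_master) >= 0)   # monotone: hierarchy-respecting
assert np.isclose(f_master[-1], 1.0)    # the last neuron is fully gated open
```

The monotonicity is the whole point: it encodes the nesting constraint from the abstract (closing a large constituent closes everything nested inside it) directly in the gate's shape.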


Journal ArticleDOI
Alsallakh Bilal, Amin Jourabloo, Mao Ye, Xiaoming Liu, Liu Ren
TL;DR: This work presents visual-analytics methods to reveal and analyze a hierarchy of similar classes that dictates the confusion patterns between the classes, and designs hierarchy-aware CNNs that accelerate model convergence and alleviate overfitting.
Abstract: Convolutional Neural Networks (CNNs) currently achieve state-of-the-art accuracy in image classification. With a growing number of classes, the accuracy usually drops as the possibilities of confusion increase. Interestingly, the class confusion patterns follow a hierarchical structure over the classes. We present visual-analytics methods to reveal and analyze this hierarchy of similar classes in relation to CNN-internal data. We found that this hierarchy not only dictates the confusion patterns between the classes, it furthermore dictates the learning behavior of CNNs. In particular, the early layers in these networks develop feature detectors that can separate high-level groups of classes quite well, even after a few training epochs. In contrast, the later layers require substantially more epochs to develop specialized feature detectors that can separate individual classes. We demonstrate how these insights are key to significant improvements in accuracy, achieved by designing hierarchy-aware CNNs that accelerate model convergence and alleviate overfitting. We further demonstrate how our methods help in identifying various quality issues in the training data.

186 citations



Journal ArticleDOI
TL;DR: A new solution to the hierarchy problem utilizing nonlinearly realized discrete symmetries is presented, which can be used to solve the little hierarchy problem as well as give rise to light axions.
Abstract: We present a new solution to the hierarchy problem utilizing nonlinearly realized discrete symmetries. The cancellations occur due to a discrete symmetry that is realized as a shift symmetry on the scalar and as an exchange symmetry on the particles with which the scalar interacts. We show how this mechanism can be used to solve the little hierarchy problem as well as give rise to light axions.

86 citations


Journal ArticleDOI
TL;DR: A decision-making method is developed to deal with multiple criteria decision-making (MCDM) problems on the basis of distance and similarity measures, defined from different angles, for double hierarchy hesitant fuzzy linguistic elements (DHFLEs) and double hierarchy HFLTSs.

84 citations


Journal ArticleDOI
TL;DR: An approach to obtain solutions for the nonlocal nonlinear Schrödinger hierarchy from the known ones of the Ablowitz–Kaup–Newell–Segur hierarchy by reduction in terms of double Wronskian is proposed.

77 citations


Journal ArticleDOI
TL;DR: In this paper, a hierarchical tubular section is proposed, inspired by the micro- to nano-architecture of biological materials such as tendon and muscle; this architecture can be mimicked by packing smaller tubes into a tube of a higher hierarchical level.

76 citations


Proceedings ArticleDOI
15 Oct 2018
TL;DR: In this article, a hierarchical semantic embedding (HSE) framework is proposed to simultaneously predict categories of different levels in the hierarchy and integrate this structured correlation information into the deep neural network, which can effectively regularize the semantic space and thus make prediction less ambiguous.
Abstract: Object categories inherently form a hierarchy with different levels of concept abstraction, especially for fine-grained categories. For example, birds (Aves) can be categorized according to a four-level hierarchy of order, family, genus, and species. This hierarchy encodes rich correlations among various categories across different levels, which can effectively regularize the semantic space and thus make prediction less ambiguous. However, previous studies of fine-grained image recognition primarily focus on categories of one certain level and usually overlook this correlation information. In this work, we investigate simultaneously predicting categories of different levels in the hierarchy and integrating this structured correlation information into the deep neural network by developing a novel Hierarchical Semantic Embedding (HSE) framework. Specifically, the HSE framework sequentially predicts the category score vector of each level in the hierarchy, from highest to lowest. At each level, it incorporates the predicted score vector of the higher level as prior knowledge to learn finer-grained feature representation. During training, the predicted score vector of the higher level is also employed to regularize label prediction by using it as soft targets of corresponding sub-categories. To evaluate the proposed framework, we organize the 200 bird species of the Caltech-UCSD birds dataset with the four-level category hierarchy and construct a large-scale butterfly dataset that also covers four levels of categories. Extensive experiments on these two and the newly released VegFru datasets demonstrate the superiority of our HSE framework over the baseline methods and existing competitors.

66 citations
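The soft-target idea from the abstract above can be sketched in a few lines. The two-level hierarchy, scores, and the uniform spreading rule below are invented for illustration; HSE's actual regularizer may weight sub-categories differently.

```python
import numpy as np

# Hypothetical two-level hierarchy: 2 orders -> 4 species.
# parent_of[s] is the index of species s's higher-level category.
parent_of = np.array([0, 0, 1, 1])

def soft_targets(parent_scores, parent_of):
    """Spread each predicted higher-level score uniformly over its
    sub-categories, giving soft targets that regularize the finer level.
    (Uniform spreading is an illustrative choice, not HSE's exact rule.)"""
    counts = np.bincount(parent_of)
    return parent_scores[parent_of] / counts[parent_of]

parent_scores = np.array([0.8, 0.2])        # e.g. softmax over the 2 orders
targets = soft_targets(parent_scores, parent_of)

assert np.isclose(targets.sum(), 1.0)       # still a probability distribution
assert np.allclose(targets, [0.4, 0.4, 0.1, 0.1])
```

A confident higher-level prediction (0.8 for the first order) thus biases the finer level toward that order's species without forcing a hard choice.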


Journal ArticleDOI
TL;DR: In this paper, the Riemann theta function of the trigonal curve, the related Baker-Akhiezer function, and an algebraic function carrying the data of the divisor were derived from the theory of trigonal curves.
Abstract: Starting with a discrete 3×3 matrix spectral problem, the hierarchy of Bogoyavlensky lattices, which are pure differential-difference equations, is derived with the aid of the Lenard recursion equations and the stationary discrete zero-curvature equation. By using the characteristic polynomial of the Lax matrix for the hierarchy of stationary Bogoyavlensky lattices, we introduce a trigonal curve $\mathcal{K}_{m-1}$ of arithmetic genus $m-1$ and a basis of holomorphic differentials on it, from which we construct the Riemann theta function of the trigonal curve, the related Baker–Akhiezer function, and an algebraic function carrying the data of the divisor. Based on the theory of trigonal curves, the Riemann theta function representations of the Baker–Akhiezer function, the meromorphic function, and, in particular, of solutions of the hierarchy of Bogoyavlensky lattices are obtained.

64 citations


Journal ArticleDOI
TL;DR: In this paper, a sparse version of the bounded-degree SOS hierarchy BSOS for polynomial optimization problems is presented, which makes it possible to treat large-scale problems that satisfy a structured sparsity pattern.
Abstract: We provide a sparse version of the bounded-degree SOS hierarchy BSOS [7] for polynomial optimization problems. It allows one to treat large-scale problems which satisfy a structured sparsity pattern. When the sparsity pattern satisfies the running intersection property, this Sparse-BSOS hierarchy of semidefinite programs (with semidefinite constraints of fixed size) converges to the global optimum of the original problem. Moreover, for the class of SOS-convex problems, finite convergence takes place at the first step of the hierarchy, just as in the dense version.

62 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose general notions to deal with large scale polynomial optimization problems and demonstrate their efficiency on a key industrial problem of the 21st century, namely the optimal power flow problem.
Abstract: We propose general notions to deal with large-scale polynomial optimization problems and demonstrate their efficiency on a key industrial problem of the 21st century, namely the optimal power flow problem. These notions enable us to find global minimizers on instances with up to 4,500 variables and 14,500 constraints. First, we generalize the Lasserre hierarchy from real to complex numbers in order to enhance its tractability when dealing with complex polynomial optimization. Complex numbers are typically used to represent oscillatory phenomena, which are omnipresent in physical systems. Using the notion of hyponormality in operator theory, we provide a finite convergence criterion which generalizes the Curto–Fialkow conditions of the real Lasserre hierarchy. Second, we introduce the multi-ordered Lasserre hierarchy in order to exploit sparsity in polynomial optimization problems (in real or complex variables) while preserving global convergence. It is based on two ideas: (1) to use a different relaxation...

Journal ArticleDOI
TL;DR: A physically inspired model and an efficient algorithm to infer hierarchical rankings of nodes in directed networks that assigns real-valued ranks to nodes rather than simply ordinal ranks and formalizes the assumption that interactions are more likely to occur between individuals with similar ranks.
Abstract: We present a physically inspired model and an efficient algorithm to infer hierarchical rankings of nodes in directed networks. It assigns real-valued ranks to nodes rather than simply ordinal ranks, and it formalizes the assumption that interactions are more likely to occur between individuals with similar ranks. It provides a natural statistical significance test for the inferred hierarchy, and it can be used to perform inference tasks such as predicting the existence or direction of edges. The ranking is obtained by solving a linear system of equations, which is sparse if the network is; thus, the resulting algorithm is extremely efficient and scalable. We illustrate these findings by analyzing real and synthetic data, including data sets from animal behavior, faculty hiring, social support networks, and sports tournaments. We show that our method often outperforms a variety of others, in both speed and accuracy, in recovering the underlying ranks and predicting edge directions.
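A minimal dense-matrix sketch of the rank inference described above: each directed edge acts like a spring pulling the winner's rank above the loser's, and the ranks come from a single regularized linear solve. The ridge constant and the toy network are assumptions made for this illustration, not the paper's exact formulation.

```python
import numpy as np

def spring_ranks(A, alpha=0.01):
    """Infer real-valued ranks from a directed adjacency matrix A by one
    regularized linear solve: each edge i -> j pulls rank(i) above
    rank(j). `alpha` is a small ridge term (an assumption here) that
    pins the otherwise free offset of the ranks."""
    d_out = A.sum(axis=1)
    d_in = A.sum(axis=0)
    L = np.diag(d_out + d_in) - (A + A.T)    # graph-Laplacian-like matrix
    return np.linalg.solve(L + alpha * np.eye(len(A)), d_out - d_in)

# Toy dominance network: 0 beats 1 and 2; 1 beats 2 twice.
A = np.array([[0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])
r = spring_ranks(A)

assert r[0] > r[1] > r[2]   # recovered ordering is consistent with the wins
```

Because the system matrix inherits the sparsity of the network, the same solve scales to large sparse graphs with a sparse solver in place of `np.linalg.solve`.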

Journal ArticleDOI
TL;DR: This paper proposes a methodology that simultaneously handles synergy and redundancy between criteria by aggregating indicators by means of the Choquet integral, and applies the Multiple Criteria Hierarchy Process.
Abstract: The evaluation of sustainable development, and in particular rural development, through composite indices requires taking into account a plurality of indicators related to economic, social, and environmental aspects. The points of view evaluated by these indices naturally interact: thus, a bonus has to be granted to units performing well on synergic criteria, whereas a penalty has to be assigned for redundant criteria. An additional difficulty of the modeling is the elicitation of the parameters for the composite indices, since they are typically affected by some imprecision. In most approaches, all these critical points are neglected, which in turn yields an unpleasant degree of approximation in the computation of indices. In this paper we propose a methodology that allows one to handle these delicate issues simultaneously. Specifically, to take into account synergy and redundancy between criteria, we aggregate indicators by means of the Choquet integral. Further, to obtain recommendations that take into account the space of fluctuation related to imprecision in the non-additive weights (the capacity of the Choquet integral), we adopt Robust Ordinal Regression (ROR) and Stochastic Multicriteria Acceptability Analysis (SMAA). Finally, to study sustainability not only at a comprehensive level (taking into account all criteria) but also at a local level (separately taking into account economic, social, and environmental aspects), we apply the Multiple Criteria Hierarchy Process (MCHP). We illustrate the advantages of our approach in a concrete example, in which we measure the rural sustainability of 51 municipalities in the province of Catania, the largest city of the East Coast of Sicily (Italy).
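The Choquet-integral aggregation step can be sketched directly from its discrete definition. The two-criterion capacity below is hypothetical and chosen so that the pair is synergic: the weight of the coalition exceeds the sum of the singleton weights.

```python
def choquet(x, capacity):
    """Discrete Choquet integral of scores x (dict: criterion -> value in
    [0, 1]) w.r.t. a capacity (dict: frozenset of criteria -> weight).

    Criteria are weighted through coalitions rather than one by one, so
    a super-additive capacity rewards synergic criteria and a
    sub-additive one penalizes redundant criteria."""
    crits = sorted(x, key=x.get)             # criteria in ascending score
    total, prev = 0.0, 0.0
    for i, c in enumerate(crits):
        coalition = frozenset(crits[i:])     # criteria scoring >= x[c]
        total += (x[c] - prev) * capacity[coalition]
        prev = x[c]
    return total

# Hypothetical synergic pair: the coalition weight (1.0) exceeds the sum
# of the singleton weights (0.4 + 0.4), so jointly good scores earn a bonus.
cap = {frozenset({"econ", "env"}): 1.0,
       frozenset({"econ"}): 0.4,
       frozenset({"env"}): 0.4}
score = choquet({"econ": 0.6, "env": 0.8}, cap)

assert abs(score - 0.68) < 1e-9   # 0.6 * 1.0 + (0.8 - 0.6) * 0.4
```

A weighted average with weights 0.4/0.4 normalized would score 0.7 regardless of interaction; the capacity makes the joint performance on both criteria matter.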

Journal ArticleDOI
TL;DR: In this article, a reference prior was constructed from principles of the Objective Bayesian approach, which is minimally informative in the specific sense that the information gain after collection of data is maximised.
Abstract: Given the precision of current neutrino data, priors still impact noticeably the constraints on neutrino masses and their hierarchy. To avoid our understanding of neutrinos being driven by prior assumptions, we construct a prior that is mathematically minimally informative. Using the constructed uninformative prior, we find that the normal hierarchy is favoured but with inconclusive posterior odds of 5.1:1. Better data is hence needed before the neutrino masses and their hierarchy can be well constrained. We find that the next decade of cosmological data should provide conclusive evidence if the normal hierarchy with negligible minimum mass is correct, and if the uncertainty in the sum of neutrino masses drops below 0.025 eV. On the other hand, if neutrinos obey the inverted hierarchy, achieving strong evidence will be difficult with the same uncertainties. Our uninformative prior was constructed from principles of the Objective Bayesian approach. The prior is called a reference prior and is minimally informative in the specific sense that the information gain after collection of data is maximised. The prior is computed for the combination of neutrino oscillation data and cosmological data and still applies if the data improve.
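As a quick sanity check on why 5.1:1 counts as inconclusive (a sketch, assuming equal prior odds between the two mass orderings):

```python
# Posterior odds of 5.1:1 for normal vs inverted hierarchy, with equal
# prior odds, correspond to a posterior probability of about 0.84 for
# the normal hierarchy: suggestive, but far from decisive.
odds_normal = 5.1
p_normal = odds_normal / (1.0 + odds_normal)

assert abs(p_normal - 0.836) < 5e-3
```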

Journal ArticleDOI
TL;DR: This work develops a template for BN construction that allows sufficient flexibility to address most cases, but enough commonality and structure that the flow of information in the BN is readily recognised at a glance.
Abstract: The hierarchy of propositions has been accepted amongst the forensic science community for some time. It is also accepted that the higher up the hierarchy the propositions are against which scientists are competent to evaluate their results, the more directly useful the testimony will be to the court. Because each case represents a unique set of circumstances and findings, it is difficult to come up with a standard structure for evaluation. One common tool that assists in this task is Bayesian networks (BNs). There is much diversity in the way that BNs can be constructed. In this work, we develop a template for BN construction that allows sufficient flexibility to address most cases, but enough commonality and structure that the flow of information in the BN is readily recognised at a glance. We provide seven steps that can be used to construct BNs within this structure and demonstrate how they can be applied, using a case example.

Journal ArticleDOI
TL;DR: In this article, a new class of estimators that make use of multiple group penalties to capture structural parsimony is proposed, and a general-purpose algorithm is developed with guaranteed convergence and global optimality.
Abstract: Variable selection for models including interactions between explanatory variables often needs to obey certain hierarchical constraints. Weak or strong structural hierarchy requires that the existence of an interaction term implies at least one or both associated main effects to be present in the model. Lately this problem has attracted a lot of attention, but existing computational algorithms converge slowly even with a moderate number of predictors. Moreover, in contrast to the rich literature on ordinary variable selection, there is a lack of statistical theory to show reasonably low error rates of hierarchical variable selection. This work investigates a new class of estimators that make use of multiple group penalties to capture structural parsimony. We show that the proposed estimators enjoy sharp rate oracle inequalities, and give the minimax lower bounds in strong and weak hierarchical variable selection. A general-purpose algorithm is developed with guaranteed convergence and global optimality...
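The strong/weak hierarchy constraints themselves are simple to state in code; this small checker (with invented variable indices) makes the distinction concrete.

```python
def obeys_hierarchy(mains, interactions, strong=True):
    """Check structural hierarchy for a selected model: an included
    interaction (j, k) requires both main effects j and k (strong
    hierarchy) or at least one of them (weak hierarchy) to be present.

    mains: set of selected main-effect indices.
    interactions: iterable of selected (j, k) interaction pairs.
    """
    for j, k in interactions:
        present = (j in mains) + (k in mains)
        if present < (2 if strong else 1):
            return False
    return True

mains = {1, 2}
assert obeys_hierarchy(mains, [(1, 2)], strong=True)        # both present
assert not obeys_hierarchy(mains, [(1, 3)], strong=True)    # 3 is missing
assert obeys_hierarchy(mains, [(1, 3)], strong=False)       # one suffices
```

The estimators in the abstract enforce these constraints through the penalty structure during fitting rather than by post-hoc filtering, but the feasible set they target is exactly what this predicate describes.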

Proceedings ArticleDOI
23 Jul 2018
TL;DR: For the first time, it is shown that solutions to a number of fundamental tasks in distributed computing can be obtained quickly using finite-state protocols, including leader election, aggregate and threshold functions on the population, such as majority computation, and plurality consensus.
Abstract: A population protocol describes a set of state change rules for a population of n indistinguishable finite-state agents (automata), undergoing random pairwise interactions. Within this very basic framework, it is possible to resolve a number of fundamental tasks in distributed computing, including: leader election, aggregate and threshold functions on the population, such as majority computation, and plurality consensus. For the first time, we show that solutions to all of these problems can be obtained quickly using finite-state protocols. For any input, the designed finite-state protocols converge under a fair random scheduler to an output which is correct with high probability in expected O(polylog n) parallel time. We also show protocols which always reach a valid solution, in expected parallel time O(n^ε), where the number of states depends only on the choice of ε > 0. The stated time bounds hold for any semi-linear predicate computable in the population protocol framework. The key ingredient of our result is the decentralized design of a hierarchy of phase-clocks, which tick at different rates, with the rates of adjacent clocks separated by a factor of Θ(log n). The construction of this clock hierarchy relies on a new protocol composition technique, combined with an adapted analysis of a self-organizing process of oscillatory dynamics. This clock hierarchy is used to provide nested synchronization primitives, which allow us to view the population in a global manner and design protocols using a high-level imperative programming language with a (limited) capacity for loops and branching instructions.
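To make the framework concrete, here is a toy simulation of the classic 4-state exact-majority population protocol under a fair random scheduler. This illustrates the model of the abstract, not the paper's phase-clock construction; the rule set is the standard one in which strong opinions cancel and survivors recruit the weak.

```python
import random

def exact_majority(n_A, n_B, seed=1):
    """Simulate the classic 4-state exact-majority population protocol.

    Strong opinions 'A'/'B' cancel into weak ones 'a'/'b' when they
    meet; surviving strong agents then convert weak agents to their own
    side. For a strict majority, the protocol converges almost surely."""
    rules = {("A", "B"): ("a", "b"), ("B", "A"): ("b", "a"),
             ("A", "b"): ("A", "a"), ("b", "A"): ("a", "A"),
             ("B", "a"): ("B", "b"), ("a", "B"): ("b", "B")}
    rng = random.Random(seed)
    pop = ["A"] * n_A + ["B"] * n_B

    def rule_applicable():
        s = set(pop)
        return ("A" in s and ("B" in s or "b" in s)) or \
               ("B" in s and "a" in s)

    while rule_applicable():                   # fair random scheduler
        i, j = rng.sample(range(len(pop)), 2)  # random ordered pair
        pop[i], pop[j] = rules.get((pop[i], pop[j]), (pop[i], pop[j]))
    return pop

final = exact_majority(12, 8)
assert set(final) <= {"A", "a"}    # strict A-majority wins everywhere
```

The cancellation rule preserves the difference between strong A and strong B counts, which is the invariant that makes the computed majority exact rather than approximate.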

Journal ArticleDOI
TL;DR: The present treatment provides a general framework where this kind of analysis can be carried out in full generality of higher-order quantum functions, and can be exported in the context of any operational probabilistic theory.
Abstract: Higher-order quantum theory is an extension of quantum theory where one introduces transformations whose input and output are transformations, thus generalizing the notion of channels and quantum operations. The generalization then goes recursively, with the construction of a full hierarchy of maps of increasingly higher order. The analysis of special cases already showed that higher-order quantum functions exhibit features that cannot be tracked down to the usual circuits, such as indefinite causal structures, providing provable advantages over circuital maps. The present treatment provides a general framework where this kind of analysis can be carried out in full generality. The hierarchy of higher-order quantum maps is introduced axiomatically with a formulation based on the language of types of transformations. Complete positivity of higher-order maps is derived from the general admissibility conditions instead of being postulated as in previous approaches. The recursive characterization of convex sets of maps of a given type is used to prove equivalence relations between different types. The axioms of the framework do not refer to the specific mathematical structure of quantum theory, and can therefore be exported in the context of any operational probabilistic theory.

Proceedings ArticleDOI
01 Jan 2018
TL;DR: An autoencoder model with a latent space defined by a hierarchy of categorical variables, utilizing a recently proposed vector quantization based approach, which allows continuous embeddings to be associated with each latent variable value.
Abstract: Scripts define knowledge about how everyday scenarios (such as going to a restaurant) are expected to unfold. One of the challenges in learning scripts is the hierarchical nature of the knowledge. For example, an arrested suspect might plead innocent or guilty, and a very different track of events is then expected to happen. To capture this type of information, we propose an autoencoder model with a latent space defined by a hierarchy of categorical variables. We utilize a recently proposed vector quantization based approach, which allows continuous embeddings to be associated with each latent variable value. This permits the decoder to softly decide what portions of the latent hierarchy to condition on by attending over the value embeddings for a given setting. Our model effectively encodes and generates scripts, outperforming a recent language modeling-based method on several standard tasks and achieving substantially lower perplexity scores.
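The vector-quantization step that associates a continuous embedding with each discrete latent value can be sketched as a nearest-neighbor codebook lookup; the codebook and query vector below are invented for illustration.

```python
import numpy as np

def vq_lookup(z, codebook):
    """Vector quantization: map a continuous encoder output z to its
    nearest codebook entry, returning the discrete latent index and the
    continuous embedding associated with that value (an illustrative
    sketch of the mechanism, not the paper's exact model)."""
    distances = np.linalg.norm(codebook - z, axis=1)
    k = int(distances.argmin())
    return k, codebook[k]

codebook = np.array([[0.0, 0.0],
                     [1.0, 1.0],
                     [4.0, 0.0]])
k, embedding = vq_lookup(np.array([0.9, 1.2]), codebook)

assert k == 1                                # nearest entry is [1.0, 1.0]
assert np.allclose(embedding, [1.0, 1.0])
```

In training, the straight-through gradient trick usually accompanies this lookup so the non-differentiable argmin does not block backpropagation into the encoder.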

Journal ArticleDOI
TL;DR: This paper presents an efficient method to perform structured matrix approximation by separation and hierarchy (SMASH), when the original dense matrix is associated with a kernel function.

Journal ArticleDOI
TL;DR: Spanners, emulators, and approximate distance oracles can be viewed as lossy compression schemes that represent an unweighted graph metric in small space, say $\tilde{O}(n^{1+\delta})$ bits.
Abstract: Spanners, emulators, and approximate distance oracles can be viewed as lossy compression schemes that represent an unweighted graph metric in small space, say $\tilde{O}(n^{1+\delta})$ bits. There ...

Journal ArticleDOI
TL;DR: In this article, the connection between interference and computational power within the operationally defined framework of generalised probabilistic theories was investigated, and it was shown that any theory satisfying four natural physical principles (i.e., causality, purification, strong symmetry, and informationally consistent composition) possesses a well-defined oracle model.
Abstract: We investigate the connection between interference and computational power within the operationally defined framework of generalised probabilistic theories. To compare the computational abilities of different theories within this framework we show that any theory satisfying four natural physical principles possesses a well-defined oracle model. Indeed, we prove a subroutine theorem for oracles in such theories which is a necessary condition for the oracle model to be well-defined. The four principles are: causality (roughly, no signalling from the future), purification (each mixed state arises as the marginal of a pure state of a larger system), strong symmetry (existence of a rich set of nontrivial reversible transformations), and informationally consistent composition (roughly: the information capacity of a composite system is the sum of the capacities of its constituent subsystems). Sorkin has defined a hierarchy of conceivable interference behaviours, where the order in the hierarchy corresponds to the number of paths that have an irreducible interaction in a multi-slit experiment. Given our oracle model, we show that if a classical computer requires at least n queries to solve a learning problem, because fewer queries provide no information about the solution, then the corresponding “no-information” lower bound in theories lying at the kth level of Sorkin’s hierarchy is $\lceil n/k \rceil$. This lower bound leaves open the possibility that quantum oracles are less powerful than general probabilistic oracles, although it is not known whether the lower bound is achievable in general. Hence searches for higher-order interference are not only foundationally motivated, but constitute a search for a computational resource that might have power beyond that offered by quantum computation.

Journal ArticleDOI
15 Nov 2018
TL;DR: In this article, a simple sub-universal quantum computing model, called the Hadamard-classical circuit with one-qubit (HC1Q) model, was introduced, which is in the second level of the Fourier hierarchy.
Abstract: We introduce a simple sub-universal quantum computing model, which we call the Hadamard-classical circuit with one-qubit (HC1Q) model. It consists of a classical reversible circuit sandwiched by two layers of Hadamard gates, and therefore it is in the second level of the Fourier hierarchy. We show that output probability distributions of the HC1Q model cannot be classically efficiently sampled within a multiplicative error unless the polynomial-time hierarchy collapses to the second level. The proof technique is different from those used for previous sub-universal models, such as IQP, Boson Sampling, and DQC1, and therefore the technique itself might be useful for finding other sub-universal models that are hard to classically simulate. We also study the classical verification of quantum computing in the second level of the Fourier hierarchy. To this end, we define a promise problem, which we call the probability distribution distinguishability with maximum norm (PDD-Max). It is a promise problem to decide whether output probability distributions of two quantum circuits are far apart or close. We show that PDD-Max is BQP-complete, but if the two circuits are restricted to some types in the second level of the Fourier hierarchy, such as the HC1Q model or the IQP model, PDD-Max has a Merlin-Arthur system with quantum polynomial-time Merlin and classical probabilistic polynomial-time Arthur.
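The sandwich structure of the model can be sketched numerically: a layer of Hadamards, a classical reversible circuit (which always acts as a permutation of the computational basis states), and a second Hadamard layer. This is only a structural illustration on toy circuits, not the paper's construction or hardness argument.

```python
import numpy as np

def hc1q_distribution(perm, x, n):
    """Output distribution of a toy 'Hadamard / classical reversible
    circuit / Hadamard' sandwich on n qubits with basis-state input |x>.
    The classical circuit is modeled by a permutation of the 2^n basis
    states."""
    H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    Hn = np.array([[1.0]])
    for _ in range(n):
        Hn = np.kron(Hn, H)                  # Hadamard on every qubit
    P = np.eye(2 ** n)[:, perm]              # P|j> = |perm[j]>
    state = Hn @ (P @ (Hn @ np.eye(2 ** n)[x]))
    return np.abs(state) ** 2

# A linear reversible circuit (CNOT) stays deterministic under the sandwich:
p_cnot = hc1q_distribution([0, 1, 3, 2], x=1, n=2)
assert np.isclose(p_cnot[3], 1.0)

# A nonlinear one (Toffoli) produces genuine interference: on input |011>
# the output is uniform over the four odd basis states.
p_tof = hc1q_distribution([0, 1, 2, 3, 4, 5, 7, 6], x=3, n=3)
assert np.allclose(p_tof[1::2], 0.25) and np.allclose(p_tof[0::2], 0.0)
```

The contrast between the two cases hints at why such shallow sandwiches can already generate distributions that are nontrivial to reproduce classically.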

Book ChapterDOI
08 Sep 2018
TL;DR: A hierarchy of specialist networks is presented, which disentangles the intra-class variation and inter-class similarity in a coarse-to-fine manner, and it is demonstrated that this leads to better performance than using a single type of representation as well as the fused features.
Abstract: We introduce a method for improving convolutional neural networks (CNNs) for scene classification. We present a hierarchy of specialist networks, which disentangles the intra-class variation and inter-class similarity in a coarse-to-fine manner. Our key insight is that each subset within a class is often associated with different types of inter-class similarity. This suggests that existing network-of-experts approaches that organize classes into coarse categories are suboptimal. In contrast, we group images based on high-level appearance features rather than their class membership and dedicate a specialist model per group. In addition, we propose an alternating architecture with a global ordered and a global orderless representation to account for both the coarse layout of the scene and the transient objects. We demonstrate that it leads to better performance than using a single type of representation as well as the fused features. We also introduce a mini-batch soft k-means that allows end-to-end fine-tuning, as well as a novel routing function for assigning images to specialists. Experimental results show that the proposed approach achieves a significant improvement over baselines, including the existing tree-structured CNNs with class-based grouping.

Journal ArticleDOI
TL;DR: A methodology for building a hierarchical framework of system modeling by engaging the concepts and design methodology of granular computing is presented, and it is demonstrated that such a framework arises as a result of designing and using locally constructed models to develop a model of a global nature.
Abstract: In this study, we present a methodology for building a hierarchical framework of system modeling by engaging the concepts and design methodology of granular computing. We demonstrate that such a framework arises as a result of designing and using locally constructed models to develop a model of a global nature. Two main categories of development of hierarchical models are proposed and discussed. In the first, given a collection of local models, a granular output space is designed, and the ensuing hierarchical model produces information granules of the corresponding type depending upon the depth of the overall hierarchical structure. The crux of the second category is selecting one of the original models and elevating its level of information granularity so that it becomes representative of the entire family of local models. The formation of the most "promising" granular model identified in this way involves mechanisms of allocation of information granularity. The focus of the study is on information granules represented as intervals and fuzzy sets (which, in the case of type-2 information granules, lead to so-called granular intervals and interval-valued fuzzy sets), while the detailed models come as rule-based architectures and neural networks. A series of experiments is presented along with a comparative analysis.
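The second design route — elevating one local model's level of information granularity until it represents the whole family — can be made concrete with interval granules. The toy sketch below rests on our own simplifying assumptions (local models are linear, and granularity is allocated as coordinate-wise coefficient intervals), not the paper's construction:

```python
import numpy as np

# Rows: coefficient vectors of three locally constructed linear models.
local_models = np.array([
    [1.0, 0.5],
    [1.2, 0.4],
    [0.9, 0.7],
])

rep = local_models[0]                # the model elevated to granular form
lo = local_models.min(axis=0)        # widen rep's coefficients down...
hi = local_models.max(axis=0)        # ...and up, until every model is covered
granular = np.stack([lo, hi])        # interval-valued coefficients [lo, hi]

def granular_predict(x):
    """Interval-valued output of the granular model for nonnegative inputs x."""
    x = np.asarray(x, float)
    return granular @ x              # [lower bound, upper bound] of the prediction

print(granular_predict([1.0, 1.0]))
```

For a nonnegative input, the interval-valued coefficients give an interval prediction guaranteed to contain every local model's output, and the widths `hi - lo` quantify how much information granularity had to be allocated to make one model representative of the family.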

Journal ArticleDOI
TL;DR: Experimental results show that in certain configurations, grammatical production can in fact favor linear order over hierarchical structure, and demonstrate that agreement morphology may be computed in a series of steps, one of which is partly independent from syntactic hierarchy.
Abstract: Hierarchical structure has been cherished as a grammatical universal. We use experimental methods to show where linear order is also a relevant syntactic relation. An identical methodology and design were used across six research sites on South Slavic languages. Experimental results show that in certain configurations, grammatical production can in fact favor linear order over hierarchical structure. However, these findings are limited to coordinate structures and distinct from the kind of production errors found with comparable configurations such as “attraction” errors. The results demonstrate that agreement morphology may be computed in a series of steps, one of which is partly independent from syntactic hierarchy.

Journal ArticleDOI
TL;DR: This approach is proved to be the maximum-entropy choice, and a motivating example applicable to neutrino-hierarchy inference is provided.
Abstract: We propose a method for transforming probability distributions so that parameters of interest are forced into a specified distribution. We prove that this approach is the maximum entropy choice, and provide a motivating example applicable to neutrino hierarchy inference.
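One common way to "force a parameter into a specified distribution" is multiplicative reweighting by the ratio of the target marginal to the current one; among all distributions with the required marginal, this reweighting is the closest (in KL divergence) to the original, which matches the maximum-entropy flavour of the abstract. The sketch below uses our own toy choice of distributions, not the paper's neutrino-hierarchy setting:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.normal(0.0, 1.0, 100_000)   # samples whose marginal is N(0, 1)

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Force theta's marginal into the target N(1, 0.5) by weighting each sample
# with target(theta) / current(theta), then normalizing the weights.
w = normal_pdf(theta, 1.0, 0.5) / normal_pdf(theta, 0.0, 1.0)
w /= w.sum()

mean = (w * theta).sum()
var = (w * (theta - mean) ** 2).sum()
print(round(mean, 2), round(var, 2))    # close to 1.0 and 0.25
```

The same weights can be applied to any other quantity carried along with the samples, which is how a full joint distribution gets transformed while only the marginal of the parameter of interest is pinned to the specified target.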

Journal ArticleDOI
TL;DR: In this paper, a generalized Dirac integrable hierarchy is derived from a two-by-two matrix spectral problem, a Hamiltonian structure is established via the trace identity, and its Liouville integrability is proved.