
Showing papers in "Automatic Documentation and Mathematical Linguistics in 2012"


Journal ArticleDOI
TL;DR: An approach to the definition of an ontology and a set of operations as an instrument of the formation and quantitatively estimated correlation of identifying images of objects of the subject area in a dialectical relationship of objective, conceptual, and symbolic spaces is proposed.
Abstract: An approach to the definition of an ontology and a set of operations as an instrument for the formation and quantitatively estimated correlation of identifying images of objects of the subject area in a dialectical relationship of objective, conceptual, and symbolic spaces is proposed. The ontological representation of the image of an object in a computing environment corresponds to an object-oriented approach and includes not only properties but also behavior. In practice, this approach will automate the dynamic reformulation and correlation of the retrieval images of queries and documents based on their reduction to a common conceptual and terminological context.

26 citations


Journal ArticleDOI
TL;DR: In this article, a probability approach to hypothesis generation in plausible reasoning of the JSM type using special Markov chains is described, similar to the Monte Carlo methods, which are actively used to calculate combinatory configurations and the volumes of concave bodies.
Abstract: A probability approach to hypothesis generation in plausible reasoning of the JSM type using special Markov chains is described. This approach is similar to the Monte Carlo methods, which are actively used to calculate combinatory configurations and the volumes of concave bodies. The simplest Markov chains are discussed, including nonmonotonous, monotonous, and "coupling in the future" chains. Their properties are proven, and coupling with probability 1 is proven.
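The "coupling" of two chains driven by shared randomness can be illustrated with a toy sketch (the chain, parameters, and function name are my own illustration; the paper's special JSM chains are more involved):

```python
import random

def coupled_walk(n=5, x0=0, y0=5, seed=1, max_steps=1_000_000):
    """Run two lazy random walks on {0, ..., n} driven by the SAME
    random bits. Under this monotone coupling the gap can only shrink
    (it does so at the boundaries), so the chains meet -- couple --
    with probability 1, after which they move identically forever."""
    rng = random.Random(seed)
    x, y = x0, y0
    for t in range(max_steps):
        if x == y:
            return t, x  # coupling time and the common state
        # shared randomness: both chains attempt the same move
        step = 1 if rng.random() < 0.5 else -1
        x = min(n, max(0, x + step))  # clip at the boundaries
        y = min(n, max(0, y + step))
    return None, None  # did not couple within max_steps
```

Because both copies use one random stream, their difference is non-increasing, which is the standard route to proving coupling with probability 1.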

15 citations


Journal ArticleDOI
TL;DR: In this paper, the basic concepts and rules of the JSM method are presented in terms of conventional set theory without using the apparatus of non-classical logics, and the architecture of JSM system and its operational principles are considered.
Abstract: An original presentation of the basic concepts and rules of the JSM method is made in terms of conventional set theory (without using the apparatus of non-classical logics). The architecture of the JSM system and its operational principles are considered. The connection between the JSM method and formal concept analysis is discussed. Practical recommendations are given for the developers of non-standard versions of JSM systems.

12 citations


Journal ArticleDOI
TL;DR: The technologies for searching for empirical laws using the JSM method for the automated generation of hypotheses are described step by step, based on research material that was obtained by a group of authors at two firms.
Abstract: The problems of sociological data and knowledge representation for the situational analysis of labour relations are described. The technologies for searching for empirical laws using the JSM method for the automated generation of hypotheses are described step by step, based on research material that was obtained by a group of authors at two firms. Conclusions are made as to the potentialities and scope of the method.

10 citations


Journal ArticleDOI
TL;DR: An ontology web-server project in Web 2.0 that allows the collaborative development of ontologies is presented and the CASL language (Common Algebraic Specification Language) is reviewed.
Abstract: This paper presents an ontology web-server project in Web 2.0 that allows the collaborative development of ontologies. The levels of presenting ontologies and mathematical models underlying these presentations are discussed. The CASL language (Common Algebraic Specification Language) is reviewed.

6 citations


Journal ArticleDOI
TL;DR: Paired diagrams with the axes En, An, and T are used and describe separation and mixing processes that occur in nature, technology, and society in the most adequate manner.
Abstract: A method for the graphic representation of evolutionary processes of compositions is described. Preparation of materials includes descending ranking of the contents of components and standardization of the length of the obtained sequences while discarding excess contents. The following three parameters are calculated for the persistent distribution of contents: (1) information entropy H = −Σ p_i log p_i as a measure of the complexity of the system's composition, (2) anentropy A = −Σ log p_i as a measure of the composition's purity, and (3) tolerance T = log[(Σ 1/p_i)/n] as a measure of special purity. In order to represent the process of compositional change, paired diagrams with the axes En, An, and T are used. The obtained entropy diagrams describe separation and mixing processes that occur in nature, technology, and society in the most adequate manner.
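A minimal sketch of the three measures, assuming base-10 logarithms (the abstract does not fix the base) and a composition that is already ranked and normalized:

```python
import math

def composition_measures(p):
    """Compute the three measures from the abstract for a composition
    p = (p_1, ..., p_n) with p_i > 0 and sum(p) == 1:
      entropy   H = -sum p_i log p_i    (complexity of the composition)
      anentropy A = -sum log p_i        (purity of the composition)
      tolerance T = log((sum 1/p_i)/n)  (special purity)
    Base-10 logs are an assumption made here for concreteness."""
    n = len(p)
    H = -sum(pi * math.log10(pi) for pi in p)
    A = -sum(math.log10(pi) for pi in p)
    T = math.log10(sum(1.0 / pi for pi in p) / n)
    return H, A, T
```

For a uniform four-component composition all three measures collapse to log 4, which is a quick sanity check on the formulas.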

6 citations


Journal ArticleDOI
TL;DR: An algorithm for knowledge base formation is presented for learning samples in almost real-life ontologies of medical diagnosis.
Abstract: Statements of the major tasks of the inductive formation of knowledge are suggested. These are classification and clustering, which are part of machine learning and are applied for dependence models with parameters that are not flawed in their traditional statement. An algorithm for knowledge base formation is presented for learning samples in almost real-life ontologies of medical diagnosis.

5 citations


Journal ArticleDOI
TL;DR: The typical architecture of a question-answering system includes a question classification module, and different methods for creating this module are examined in the paper.
Abstract: A number of problems that are involved in creating question-answering systems are discussed. A review is provided of the systems of this type that are most popular. The typical architecture of a question-answering system includes a question classification module. Different methods for creating this module are examined in the paper.

5 citations


Journal ArticleDOI
TL;DR: The Data Center project, which provides integration of the scientific electronic resources (mainly databases) that are developed and supported by the Russian Academy of Sciences, is discussed.
Abstract: This paper discusses the Data Center project, which provides integration of the scientific electronic resources (mainly databases) that are developed and supported by the Russian Academy of Sciences. The integration technology has been verified within the framework of the Properties of Substances and Materials interdisciplinary theme, which is represented in many Institutes of engineering and natural-science profiles of the RAS. The possibilities of the XML language and ontological modeling are considered for the formalized description of the subject field of the properties of substances. Successful examples of work with databases on properties demonstrate that software engineering has achieved a high level and allows for the development of common exchange standards for heterogeneous resources.

5 citations


Journal ArticleDOI
TL;DR: This paper suggests interpreting Bradford’s law in terms of a geometric progression; it introduces a constant, which allows simplifying the application of the law, and outlines the methodology for using the law to analyze the data related to various subject areas.
Abstract: This paper suggests interpreting Bradford's law in terms of a geometric progression; it introduces a constant, which allows simplifying the application of the law, and outlines the methodology for using the law to analyze the data related to various subject areas.
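A toy illustration of the geometric-progression reading of Bradford's law: split ranked journals into three zones of equal article yield and estimate the zone multiplier. The function name and the three-zone choice are illustrative assumptions, not the paper's constant:

```python
def bradford_multiplier(counts):
    """Given per-journal article counts ranked in descending order,
    split the journals into three zones of roughly equal article yield
    and estimate the Bradford multiplier k, i.e. the common ratio of
    the geometric progression 1 : k : k^2 of journal counts per zone.
    Illustrative sketch only."""
    total = sum(counts)
    target = total / 3.0
    zones, acc, cur = [], 0, 0
    for c in counts:
        cur += 1
        acc += c
        # close a zone once its cumulative yield reaches a third
        if acc >= target * (len(zones) + 1) and len(zones) < 2:
            zones.append(cur)
            cur = 0
    zones.append(cur)
    # sqrt of the outer zone ratio estimates the common multiplier
    k = (zones[2] / zones[0]) ** 0.5
    return zones, k
```

On synthetic data built to follow the law exactly (1, 3, and 9 journals per zone), the estimate recovers k = 3.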

5 citations


Journal ArticleDOI
TL;DR: This article considers the partition of a set of statements P = Pa ∪ Pb∪ P′, where Pa, Pb, P′ are, correspondingly, sets of statements only being argumented, both being argumenting and argumenting, and only argumenting (basic statements).
Abstract: In this article we consider the partition of a set of statements P = Pa ∪ Pb ∪ P′, where Pa, Pb, and P′ are, correspondingly, the sets of statements that are only argumented, both argumented and argumenting, and only argumenting (basic statements). Two functions of argument and counter-argument selection, which form the semantics for logics of argumentation, are defined on P. A four-valued logic of argumentation is proposed. By means of graph theory, argument trees and the forest consisting of them are formed, along with transformations of the forest whose result may be either a planar or a non-planar graph. Argument systems (systems of arguments) are defined together with their characterizations that use analytic tableaux. With the help of argument trees, a specification of the idea of the hermeneutic ("vicious") circle is formalized.

Journal ArticleDOI
TL;DR: A method for constructing a two-parameter crystal chemical alphabet for coding crystal chemical formulas (CCFs) for the purpose of their systematization is described, using the example of the mineral tourmaline.
Abstract: A method for constructing a two-parameter crystal chemical alphabet for coding crystal chemical formulas (CCFs) for the purpose of their systematization is described. The general principles for coding the crystal chemical formulas are formulated. The two-parameter alphabet is an ordered collection of pairs of symbols in which data on a position and element (PE) that are peculiar to this structural type of mineral are fixed. Stoichiometric coefficients of elements in positions of particular CCFs allow one to construct rank crystal chemical formulas, i.e., sequences of PE pairs in order of decreasing coefficients. The collection of these rank formulas is sorted on the dictionary principle in accordance with the PE in the proposed alphabet, allowing one to obtain a hierarchical classification of the CCF codes. The construction of rank CCF formulas enables one to calculate the entropy characteristics of the obtained codes in studies of transient processes of the structural-chemical states of substances. The method is described using the example of the mineral tourmaline.
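The rank-formula step can be sketched as a simple sort of position-element (PE) pairs by decreasing coefficient, with ties broken by the pair's place in the alphabet. The PE labels below are illustrative, not the paper's actual tourmaline coding:

```python
def rank_formula(ccf, alphabet):
    """Build a rank crystal chemical formula: order the PE pairs of a
    coded formula (pe, stoichiometric_coefficient) by decreasing
    coefficient, breaking ties by the pair's position in the
    two-parameter alphabet (dictionary principle)."""
    order = {pe: i for i, pe in enumerate(alphabet)}
    return sorted(ccf, key=lambda item: (-item[1], order[item[0]]))
```

Sorting rank formulas lexicographically by their PE sequences then yields the hierarchical classification the abstract describes.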

Journal ArticleDOI
TL;DR: In this paper, the relevance of the history of philosophy in terms of information technology (IT) is noted and two contradictory views of IT, that is, engineering and anthropocentric, are addressed.
Abstract: Various approaches to the definition of the concept "information," viz., mathematical-physical, semantic, pragmatic, hermeneutic-existential, and angeletic, are considered. The relevance of the study of the history of philosophy in terms of information technology (IT) is noted. Two contradictory views of information technology, that is, engineering and anthropocentric, are addressed. The future promising lines in IT philosophy are considered.

Journal ArticleDOI
TL;DR: Mathematical models based on the use of methods of mathematical statistics and machine learning were designed to solve the problem of virtual screening of chemical compounds and showed good results.
Abstract: Mathematical models based on the use of methods of mathematical statistics and machine learning were designed to solve the problem of virtual screening of chemical compounds. The performance of the models was assessed using a large experimental data set.


Journal ArticleDOI
TL;DR: Four electronic corpora created in 2011 within the framework of the “Corpus Linguistics: the Albanian, Kalmyk, Lezgian, and Ossetic Languages” Program of Fundamental Research of the RAS are presented.
Abstract: Four electronic corpora created in 2011 within the framework of the "Corpus Linguistics: the Albanian, Kalmyk, Lezgian, and Ossetic Languages" Program of Fundamental Research of the RAS are presented. The interface and functionalities of these corpora are described, the engineering problems to be solved in their creation are elucidated, and the prospects for their development are discussed. Particular emphasis is placed on the compilation of dictionaries and the automatic grammatical markup of the corpora.

Journal ArticleDOI
TL;DR: The major approaches to solving the classical problems of simulation and the models of time are considered, viz., discrete-event and continuous modeling, as well as Monte-Carlo modeling, and their main propositions, advantages, shortcomings, and concrete realizations are discussed.
Abstract: The major approaches to solving the classical problems of simulation and the models of time are considered, viz., discrete-event and continuous modeling, as well as Monte-Carlo modeling. Their main propositions, advantages, shortcomings, and concrete realizations are discussed. On the basis of the conducted research, the place of the original software tool G-IPS Ultimate is shown in a series of other software products for the solution of applied simulation problems.

Journal ArticleDOI
TL;DR: The methods of research to reveal the relationship between human mental disorders and the functional itch disorder are described and a hypothesis is suggested about the formation of a pathological system in a patient’s body (and the disease).
Abstract: This paper describes the methods of research to reveal the relationship between human mental disorders and the functional itch disorder. A hypothesis is suggested about the formation of a pathological system in a patient's body (and the disease; we conventionally call the system "Psikhozud"). To test the hypothesis, it is suggested that a model of this system be built using artificial-intelligence methods. Some information from systems theory is presented. The problems of the simulation of multi-agent systems are considered. The potential for using intelligent data mining based on the JSM method for the automatic generation of the behavioral patterns of the Psikhozud system is discussed.

Journal ArticleDOI
TL;DR: Rapid quantitative monitoring of subject area (catalysis) term-base change dynamics over time by analysis of representative collections of texts with a known time reference was developed and tested.
Abstract: Rapid quantitative monitoring of subject area (catalysis) term-base change dynamics over time by analysis of representative collections of texts with a known time reference was developed and tested. An L-gram representation of texts followed by selection of term-like chains of arbitrary length was used. The amount of the information to be analyzed by an expert was minimized. The data thus obtained can be used to identify key trends of subject area development.
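A toy sketch of the monitoring idea: compare word n-gram frequencies between an older and a newer collection and rank chains by growth. This stands in for the paper's L-gram selection of term-like chains of arbitrary length, which is more elaborate:

```python
from collections import Counter

def ngrams(tokens, n):
    """Yield word n-grams of a token list as tuples."""
    return zip(*(tokens[i:] for i in range(n)))

def rising_terms(old_texts, new_texts, n=2, top=5):
    """Count word n-grams in two time-referenced collections and rank
    chains by raw frequency growth, flagging candidate new terms for
    expert review. Illustrative only: no term-likeness filtering."""
    old = Counter(g for t in old_texts for g in ngrams(t.lower().split(), n))
    new = Counter(g for t in new_texts for g in ngrams(t.lower().split(), n))
    return sorted(new, key=lambda g: new[g] - old.get(g, 0), reverse=True)[:top]
```

Presenting only the top-growing chains is one way to minimize the amount of material an expert has to inspect, as the abstract emphasizes.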

Journal ArticleDOI
TL;DR: In this paper, theoretical developments in the field of nanorobots and methods of nanorobot construction at the microscopic and molecular levels are analyzed; the issues of the simulation of nanorobots and their components and the questions of control and energy sources are also considered.
Abstract: In this article theoretical developments in the field of nanorobots and methods of nanorobot construction at the microscopic and molecular levels are analyzed. The issues of simulation of nanorobots and their components and the questions of control and energy sources are considered. Some experimental results and examples of applications are presented.

Journal ArticleDOI
TL;DR: The paper suggests a new linear-time method of cross-lingual transliteration of named entities using a finite-state machine, and provides a proof of the linear complexity of the transliteration procedure.
Abstract: The paper suggests a new linear-time method of cross-lingual transliteration of named entities. Strings are processed with an extended finite-state automaton that is constructed from a system of rules with contexts. Transliteration with the FSM is equivalent to transliteration with the system of rules. We provide a proof of the linear complexity of the transliteration procedure.
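A minimal longest-match rule transliterator shows why a single left-to-right pass is linear in the input for a fixed rule set (the paper instead compiles context-sensitive rules into an extended finite-state automaton; the rule table below is an illustrative assumption):

```python
def transliterate(s, rules):
    """Scan the string left to right, at each position applying the
    longest matching source substring from the rule table. One pass
    with a bounded amount of work per position, hence linear time in
    the input length for a fixed rule set. Context conditions, which
    the paper's rules support, are omitted here."""
    max_len = max(len(k) for k in rules)
    out, i = [], 0
    while i < len(s):
        for l in range(min(max_len, len(s) - i), 0, -1):
            chunk = s[i:i + l]
            if chunk in rules:  # longest match wins
                out.append(rules[chunk])
                i += l
                break
        else:
            out.append(s[i])  # pass unknown characters through
            i += 1
    return "".join(out)
```

With a toy Latin-to-Cyrillic table {"sh": "ш", "s": "с", "a": "а"}, longest match correctly maps "sh" to one letter instead of two.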

Journal ArticleDOI
TL;DR: The grammar and algorithms that are required to perform decomposition by clauses are described and the features of different types of decomposition that are employed in preliminary text processing are considered.
Abstract: The features of different types of decomposition that are employed in preliminary text processing are considered. Linguistic problems of decomposition by clauses via the transformation of communicative and modal planes of a text are discussed. The grammar and algorithms that are required to perform decomposition by clauses are described.

Journal ArticleDOI
TL;DR: A classification of lexicographic resources is proposed to support automatic text-analysis systems, and two techniques for dictionary generation are described.
Abstract: A classification of lexicographic resources is proposed to support automatic text-analysis systems. Four types of dictionaries are distinguished and described, namely, terminological, terminological-statistical, thesauri, and ontologies. Methods for dictionary generation are divided into static and dynamic, as well as linear and stepwise. Methods for weighting terms are divided into intertextual and intratextual. The features of the TF*IDF algorithm are considered in detail. Two techniques for dictionary generation are described.
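For reference, a textbook TF*IDF sketch (tf = raw term count, idf = log(N/df)); the abstract does not specify which TF*IDF variant the paper analyzes:

```python
import math
from collections import Counter

def tfidf(docs):
    """Weight terms in tokenized documents by TF*IDF.
    tf  = term count within a document,
    idf = log(N / df), where df is the number of documents
          containing the term and N the number of documents.
    Returns one {term: weight} dict per document."""
    N = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(N / df[t]) for t in tf})
    return weights
```

A term occurring in every document gets weight 0, which is exactly the intertextual discounting that makes TF*IDF useful for selecting dictionary terms.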

Journal ArticleDOI
TL;DR: An algorithm for lexical analysis is proposed that allows the correct processing of various ambiguities in languages with lexical ambiguity; a parsing algorithm based on it makes it possible to correctly process both lexical and syntactic ambiguities.
Abstract: The problem of parsing languages with lexical ambiguities is considered. An algorithm for lexical analysis is proposed that allows the correct processing of various ambiguities. For this algorithm, we propose an algorithm for parsing to make it possible to correctly process both lexical and syntactic ambiguities.

Journal ArticleDOI
TL;DR: The author attempts to show a possible variant of the semantic representation of a text that describes situations of force interaction by means of an ontology and an ontology-based lexicon.
Abstract: The domain of force processes in a universal ontology is described very poorly compared with the adjacent domain of spatial relationships. The author attempts to show a possible variant of the semantic representation of a text that describes situations of force interaction by means of an ontology and an ontology-based lexicon.

Journal ArticleDOI
TL;DR: The features of the expert evaluation of intractable properties (parameters) in the form of interval values on number scales are analyzed, and it is proposed that the "weighting" of the interval boundaries be implemented according to the interval width.
Abstract: The features of the expert evaluation of intractable properties (parameters) in the form of interval values on number scales are analyzed. To find a consistent evaluation, two methods for averaging the evaluations in interval form are considered. The first is based on simple (arithmetic-mean) averaging of the interval boundaries, and the second is concerned with weighted averaging. It is proposed that the "weighting" of the interval boundaries be implemented according to the interval width, using the following principle: the smaller the width of the evaluation interval, the more qualified the expert evaluation of the property under investigation, and the higher the weight of the boundaries of the corresponding interval under averaging. When two intervals are averaged, weighted averaging increases the qualification of the resulting consistent expert evaluation.
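A sketch of the two averaging modes, assuming weights proportional to the inverse of the interval width (the abstract states the inverse-width principle but not the exact formula, so that choice is an assumption):

```python
def average_intervals(intervals, weighted=True):
    """Average expert interval evaluations [a_i, b_i] of a property.
    Plain mode: arithmetic mean of the lower and upper boundaries.
    Weighted mode: boundary weights proportional to 1/(b_i - a_i),
    so a narrower interval (a more confident expert) counts more.
    Assumes every interval has positive width."""
    if not weighted:
        n = len(intervals)
        return (sum(a for a, _ in intervals) / n,
                sum(b for _, b in intervals) / n)
    w = [1.0 / (b - a) for a, b in intervals]
    W = sum(w)
    return (sum(wi * a for wi, (a, _) in zip(w, intervals)) / W,
            sum(wi * b for wi, (_, b) in zip(w, intervals)) / W)
```

Averaging [0, 1] and [0, 3], for example, gives [0, 2] in plain mode but [0, 1.5] in weighted mode: the result is pulled toward the narrower, more qualified evaluation.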