
Showing papers from "Turku Centre for Computer Science" published in 2007


Journal ArticleDOI
TL;DR: A corpus targeted at protein, gene, and RNA relationships which serves as a resource for the development of information extraction systems and their components such as parsers and domain analyzers is introduced.
Abstract: Lately, there has been great interest in the application of information extraction methods to the biomedical domain, in particular to the extraction of relationships of genes, proteins, and RNA from scientific publications. The development and evaluation of such methods requires annotated domain corpora. We present BioInfer (Bio Information Extraction Resource), a new public resource providing an annotated corpus of biomedical English. We describe an annotation scheme capturing named entities and their relationships along with a dependency analysis of sentence syntax. We further present ontologies defining the types of entities and relationships annotated in the corpus. Currently, the corpus contains 1100 sentences from abstracts of biomedical research articles annotated for relationships, named entities, and syntactic dependencies. Supporting software is provided with the corpus. The corpus is unique in the domain in combining these annotation types for a single set of sentences, and in the level of detail of the relationship annotation. We introduce a corpus targeted at protein, gene, and RNA relationships which serves as a resource for the development of information extraction systems and their components such as parsers and domain analyzers. The corpus will be maintained and further developed, with the current version available at http://www.it.utu.fi/BioInfer.
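A hypothetical sketch of what one such combined annotation record might look like, shown only to make the three annotation layers concrete; the field names, entity types, and dependency labels are illustrative assumptions, not the actual BioInfer file format.

```python
# One sentence with the three BioInfer-style annotation layers:
# named entities, typed relationships between them, and a dependency
# analysis of sentence syntax. The schema below is a hypothetical
# illustration, not the corpus's real XML format.
annotated_sentence = {
    "text": "Profilin binds to actin monomers.",
    "entities": [
        {"id": "e1", "span": (0, 8),   "type": "protein", "text": "Profilin"},
        {"id": "e2", "span": (18, 32), "type": "protein", "text": "actin monomers"},
    ],
    "relationships": [
        {"type": "BIND", "args": ["e1", "e2"]},  # e1 binds e2
    ],
    "dependencies": [             # (head, dependent, relation)
        ("binds", "Profilin", "subject"),
        ("binds", "monomers", "object"),
        ("monomers", "actin", "modifier"),
    ],
}
```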

479 citations


Journal ArticleDOI
TL;DR: This paper develops a methodology for valuing options on R&D projects when future cash flows are estimated by trapezoidal fuzzy numbers, and discusses how the methodology can be used to build decision support tools for optimal R&D project selection in a corporate environment.
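A minimal sketch of the arithmetic such a methodology rests on, assuming trapezoidal fuzzy numbers written as (a, b, alpha, beta) with core [a, b], left width alpha, right width beta, and the Carlsson-Fuller possibilistic mean E(A) = (a + b)/2 + (beta - alpha)/6; the discounting scheme and numbers below are illustrative assumptions, not the paper's exact model.

```python
from dataclasses import dataclass

@dataclass
class Trapezoid:
    """Trapezoidal fuzzy number: core [a, b], left/right widths alpha, beta."""
    a: float
    b: float
    alpha: float
    beta: float

    def __add__(self, other):
        # Fuzzy addition of trapezoids is component-wise.
        return Trapezoid(self.a + other.a, self.b + other.b,
                         self.alpha + other.alpha, self.beta + other.beta)

    def scale(self, c):
        # Multiplication by a positive scalar scales every component.
        assert c > 0
        return Trapezoid(c * self.a, c * self.b, c * self.alpha, c * self.beta)

    def possibilistic_mean(self):
        # Carlsson-Fuller possibilistic mean of a trapezoid.
        return (self.a + self.b) / 2 + (self.beta - self.alpha) / 6

def fuzzy_npv(cash_flows, rate):
    """Discount fuzzy yearly cash flows to a fuzzy net present value."""
    total = Trapezoid(0.0, 0.0, 0.0, 0.0)
    for t, cf in enumerate(cash_flows, start=1):
        total = total + cf.scale(1.0 / (1.0 + rate) ** t)
    return total

# Illustrative R&D project: uncertain cash flows over three years.
flows = [Trapezoid(90, 110, 20, 30), Trapezoid(100, 130, 25, 35),
         Trapezoid(120, 150, 30, 40)]
npv = fuzzy_npv(flows, rate=0.10)
print(npv, npv.possibilistic_mean())
```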

209 citations


Book ChapterDOI
01 Jan 2007
TL;DR: This work presents the lattice-theoretical foundations of rough set theory, proceeding from basic notions, orders, and lattices through closure operators and Galois connections to rough set approximations and the lattices of rough sets.
Abstract: This work focuses on lattice-theoretical foundations of rough set theory. It consists of the following sections: 1: Introduction, 2: Basic Notions and Notation, 3: Orders and Lattices, 4: Distributive, Boolean, and Stone Lattices, 5: Closure Systems and Topologies, 6: Fixpoints and Closure Operators on Ordered Sets, 7: Galois Connections and Their Fixpoints, 8: Information Systems, 9: Rough Set Approximations, and 10: Lattices of Rough Sets. At the end of each section, brief bibliographic remarks are presented.
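A minimal sketch of the rough set approximation operators on which the lattice constructions of Sections 9 and 10 are built: given the partition of the universe induced by an equivalence relation, the lower approximation of X collects the blocks contained in X, and the upper approximation collects the blocks meeting X. The toy universe is illustrative.

```python
def approximations(partition, X):
    """Lower and upper rough approximations of X w.r.t. a partition of U."""
    X = set(X)
    lower, upper = set(), set()
    for block in map(set, partition):
        if block <= X:       # block entirely inside X
            lower |= block
        if block & X:        # block meets X
            upper |= block
    return lower, upper

# Universe {1..6} partitioned by an indiscernibility relation.
partition = [{1, 2}, {3, 4}, {5}, {6}]
X = {2, 3, 4, 5}
low, up = approximations(partition, X)
print(low, up)   # {3, 4, 5} and {1, 2, 3, 4, 5}
# The pairs (lower, upper) over all X are the "rough sets" whose
# order-theoretic structure the chapter studies.
```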

133 citations


Journal ArticleDOI
TL;DR: It is shown that the state complexity of a combined operation can be very different from the composition of the state complexities of the participating individual operations.

107 citations


Journal ArticleDOI
TL;DR: Based on the responses of 1,501 customers of a passenger cruise company, the study finds that prior experience of traditional and online channels, together with perceived usefulness, had a substantial effect on behavioral intention to use the online channel in the future.

Abstract: Security and privacy issues have drawn much attention in the electronic commerce research area, and e-vendors have adjusted their online shopping systems to convince customers that vendors and systems are trustworthy. This study therefore concentrates on how consumers choose their purchasing channel when the environment is relatively secure. Is the choice based on a preference for conversation with customer service, the complexity of the product, prior online shopping experience, social influence, or perceptions of system usefulness and ease of use? The technology acceptance model, media richness theory, and the social influence model were combined in a research model tested with a Web survey. The responses of 1,501 customers of a passenger cruise company showed that prior experience of traditional and online channels and perceived usefulness had a substantial effect on behavioral intention to use the online channel in the future.

78 citations


Journal ArticleDOI
TL;DR: New lower bounds are established on the number of matrices for the mortality, zero in the left upper corner, vector reachability, matrix reachability, scalar reachability, and freeness problems, and a short proof is given of a strengthened result due to Bell and Potapov stating that the membership problem is undecidable.
Abstract: There are several known undecidable problems for 3 × 3 integer matrices whose proofs use a reduction from the Post Correspondence Problem (PCP). We establish new lower bounds on the number of matrices for the mortality, zero in the left upper corner, vector reachability, matrix reachability, scalar reachability, and freeness problems. Also, we give a short proof of a strengthened result due to Bell and Potapov: for finitely generated matrix semigroups R ⊆ ℤ^{4×4}, it is undecidable whether kI_4 ∈ R for any given |k| > 1. These bounds are obtained by using the Claus instances of the PCP.
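A minimal sketch of the standard Paterson-style embedding that powers such PCP reductions: a pair of words over nonzero digits maps to a 3 × 3 integer matrix so that concatenation of pairs becomes matrix multiplication, and a PCP solution shows up as equal entries in the bottom row of a product. The PCP instance below is an illustrative solvable example, not one of the Claus instances used in the paper.

```python
N = 4  # letters a, b, c encoded as digits 1, 2, 3; words read in base N

def sigma(w: str) -> int:
    """Base-N value of a word over nonzero digits (injective on words)."""
    value = 0
    for d in w:
        value = value * N + int(d)
    return value

def embed(u: str, v: str):
    """Map a word pair to a 3x3 integer matrix; the map is a morphism:
    embed(u1 + u2, v1 + v2) == matmul(embed(u1, v1), embed(u2, v2))."""
    return [[N ** len(u), 0, 0],
            [0, N ** len(v), 0],
            [sigma(u), sigma(v), 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# PCP instance (a, ab), (b, ca), (ca, a), (abc, c) encoded as digits;
# the index sequence 1, 2, 3, 1, 4 is a solution.
pairs = [("1", "12"), ("2", "31"), ("31", "1"), ("123", "3")]
mats = [embed(u, v) for u, v in pairs]

M = mats[0]
for i in (1, 2, 0, 3):
    M = matmul(M, mats[i])
# Equal bottom-row entries <=> the concatenated u- and v-words coincide.
print(M[2][0] == M[2][1])   # True
```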

48 citations



Journal ArticleDOI
TL;DR: This article obtains the Allee effect by adding different mate-finding mechanisms to the within-year dynamics of discrete-time population models and, by adding cannibalism, obtains a wider variety of models.
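A minimal sketch of how a mate-finding mechanism induces an Allee effect in a discrete-time model; the Ricker growth term and the mate-finding probability N/(N + theta) below are standard illustrative choices, not necessarily the authors' exact within-year dynamics.

```python
import math

def step(N, r=2.0, K=100.0, theta=15.0):
    """One year of an illustrative Ricker model with mate finding.
    N / (N + theta) is the probability of finding a mate; it depresses
    per-capita growth at low density, producing an Allee effect."""
    if N <= 0:
        return 0.0
    mate_prob = N / (N + theta)
    return N * math.exp(r * (1 - N / K)) * mate_prob

# Populations below the Allee threshold decline; larger ones persist.
for N0 in (2.0, 10.0, 50.0):
    N = N0
    for _ in range(50):
        N = step(N)
    print(f"N0 = {N0:5.1f} -> N50 = {N:8.3f}")
```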

39 citations


Book ChapterDOI
20 Jan 2007
TL;DR: Constructions are given of small probabilistic and MO-type quantum automata whose cut-point languages have an undecidable emptiness problem.
Abstract: We give constructions of small probabilistic and MO-type quantum automata that have an undecidable emptiness problem for the cut-point languages.
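For context, a minimal sketch of the objects involved, under the standard definitions (an assumption of this illustration; the paper's specific small constructions are more delicate): a probabilistic automaton assigns each word an acceptance probability p(w), the cut-point language at λ is {w : p(w) > λ}, and the emptiness problem asks whether that set contains any word.

```python
import numpy as np

# Illustrative two-state probabilistic automaton: one row-stochastic
# matrix per letter, initial distribution pi, accepting indicator eta.
pi = np.array([1.0, 0.0])
eta = np.array([0.0, 1.0])
M = {"a": np.array([[0.5, 0.5], [0.0, 1.0]]),
     "b": np.array([[1.0, 0.0], [0.9, 0.1]])}

def accept_prob(word: str) -> float:
    """p(w) = pi * M_{w1} * ... * M_{wn} * eta."""
    v = pi
    for c in word:
        v = v @ M[c]
    return float(v @ eta)

# The paper's point: already for small such automata, deciding whether
# {w : p(w) > lam} is empty is undecidable.
lam = 0.6
for w in ("a", "aa", "ab", "aab"):
    print(w, accept_prob(w), accept_prob(w) > lam)
```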

37 citations


Proceedings Article
30 Jan 2007
TL;DR: In this paper, the authors present results from a wide-ranging phenomenographic study of computing academics' understanding of teaching, focusing upon four areas: the role of lab practical sessions, the experience of teaching success, conceptions of motivating and engaging students, and the granularity of the teacher's focus.
Abstract: This paper presents first results from a wide-ranging phenomenographic study of computing academics' understanding of teaching. These first results focus upon four areas: the role of lab practical sessions, the experience of teaching success, conceptions of motivating and engaging students, and the granularity of the teacher's focus. The findings are comparable with prior work on the understandings of academics in other disciplines. The study was started as part of a workshop on phenomenography, at which most participants received their first training in the method; the paper also summarises the structure of the workshop.

36 citations


Proceedings ArticleDOI
27 May 2007
TL;DR: The results indicate that a network structure built from simple 3-port routers provides better fault tolerance than a structure based on more complex multiport routers, and that the area overhead can be kept moderate.
Abstract: The paper presents an approach for analyzing and improving fault tolerance in NoC architectures, a necessary step towards implementing reliable systems in future nanoscale technologies. Several NoC architectures, the router structures they require, and the associated network interfaces are presented and compared with respect to fault tolerance, area, and performance. The results indicate that a network structure built from simple 3-port routers provides better fault tolerance than a structure based on more complex multiport routers, and that the area overhead can be kept moderate.

Journal ArticleDOI
TL;DR: This work analyzed the pre-implementation frames that could be discerned in 24 interviews with hospital personnel, finding that the social context appeared to have a significant influence on the users' interpretation processes and that the frames seemed to be congruent within each group.

Journal ArticleDOI
TL;DR: It is shown that each fuzzy set determines a preorder and an Alexandrov topology, that similar correspondences hold also in the other direction, and that these results suggest how to define the lattice operations for fuzzy subsets of a given universe in a canonical way.

Book ChapterDOI
03 Sep 2007
TL;DR: The part of the deep Web consisting of dynamic pages in one particular national domain is surveyed and the estimation of the national deep Web is performed using the proposed sampling techniques.
Abstract: With the advances in web technologies, more and more information on the Web is contained in dynamically generated web pages. Among several types of web "dynamism", the most important is the case in which web pages are generated as results of queries submitted via search web forms to databases available online. These pages constitute the portion of the Web known as the deep Web. The existing estimates of the deep Web are predominantly based on studies of English deep web sites. The key parameters of other-than-English segments of the deep Web have not been investigated so far. Thus, currently known characteristics of the deep Web may be biased, especially owing to a steady increase in non-English web content. In this paper, we survey the part of the deep Web consisting of dynamic pages in one particular national domain. The estimation of the national deep Web is performed using the proposed sampling techniques. We report our observations and findings based on the experiments conducted in summer 2005.
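A minimal sketch of the flavor of random-sampling estimation used in such surveys, assuming the classic approach of probing uniformly sampled hosts of a national segment and scaling the hit rate up to the whole population; the helper names and numbers are hypothetical placeholders, not the paper's actual procedure.

```python
import random

def estimate_deep_web_sites(hosts, is_deep_web_site, sample_size):
    """Estimate the number of deep web sites in a national web segment
    by uniform host sampling: total ~ (hits / sample_size) * len(hosts)."""
    sample = random.sample(hosts, sample_size)
    hits = sum(1 for h in sample if is_deep_web_site(h))
    p = hits / sample_size
    # Rough 95% binomial confidence interval on the proportion.
    se = (p * (1 - p) / sample_size) ** 0.5
    total = len(hosts)
    return p * total, (max(0.0, p - 1.96 * se) * total,
                       (p + 1.96 * se) * total)

# Hypothetical usage: all_national_hosts would be a crawl-derived host
# list, and has_db_backed_search_form a predicate that fetches a host
# and looks for query forms backed by an online database.
# estimate, ci = estimate_deep_web_sites(all_national_hosts,
#                                        has_db_backed_search_form,
#                                        sample_size=1000)
```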

Proceedings ArticleDOI
03 Jan 2007
TL;DR: Six dimensions of strategic IS planning (comprehensiveness, formalization, focus, flow, participation, and consistency) are adapted to the post-merger IS integration context; evidence is found that contradicts the typical view presented in the M & A literature, and it is argued that there are several different approaches to post-merger IS integration planning.
Abstract: Many researchers and business professionals have emphasized the importance and difficulties of successful information systems (IS) integration in the context of mergers and acquisitions (M & A). However, existing research remains sparse, failing to explain how firms design their IS integration strategy and how that strategy relates to successful IS integration. In order to overcome this shortcoming, we adapt six dimensions of strategic IS planning, namely comprehensiveness, formalization, focus, flow, participation, and consistency, to the post-merger IS integration context. We then use two in-depth case studies to shed light on these constructs. We find evidence that contradicts the typical view presented in the M & A literature, and argue that there are several different approaches to post-merger IS integration planning. In the analysis, we point out specific differences between the cases that eventually lead up to these fundamentally different approaches to IS integration design.

Journal ArticleDOI
TL;DR: This paper proposes a string-based framework inspired by the principle of self-assembly: two strings with a common overlap, say uv and vw, yield the string uvw; string uvw is said to have been assembled from strings uv and vw.
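A minimal sketch of the overlap-assembly operation as just described; the function name and the requirement that the overlap be non-empty are assumptions of this illustration.

```python
def assemble(x: str, y: str):
    """Yield every string uvw obtainable by overlapping a non-empty
    suffix v of x with an equal prefix of y (x = uv, y = vw)."""
    for k in range(1, min(len(x), len(y)) + 1):
        v = x[-k:]            # candidate common overlap
        if y.startswith(v):
            yield x + y[k:]   # uv + vw -> uvw

print(list(assemble("abcab", "cabde")))   # overlap "cab" -> ['abcabde']
```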

Journal Article
TL;DR: It is shown that each Galois connection between two complete lattices determines an Armstrong system, that is, a closed set of dependencies.
Abstract: In this paper we show that each Galois connection between two complete lattices determines an Armstrong system, that is, a closed set of dependencies. In particular, we study Galois connections and Armstrong systems determined by Pawlak's information systems.
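A minimal sketch of the structure in question, using the derivation operators of an object-attribute table, a standard example of a Galois connection between powerset lattices; the induced closure operator yields closed attribute sets, which behave like an Armstrong-style family of dependencies B -> closure(B). The toy data is illustrative.

```python
# Toy information system: object -> set of attributes it has.
context = {
    "o1": {"a", "b"},
    "o2": {"a", "b", "c"},
    "o3": {"a", "c"},
}

def extent(B):
    """Objects possessing every attribute in B."""
    return {o for o, attrs in context.items() if set(B) <= attrs}

def intent(A):
    """Attributes shared by every object in A."""
    attr_sets = [context[o] for o in A]
    return set.intersection(*attr_sets) if attr_sets else \
        set().union(*context.values())

# extent/intent form a Galois connection, so closure(B) =
# intent(extent(B)) is a closure operator; its closed sets determine
# the dependencies B -> closure(B).
def closure(B):
    return intent(extent(B))

print(closure({"b"}))   # {'a', 'b'}: every object with b also has a
```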

Journal ArticleDOI
TL;DR: A new algorithm is presented that inherits the strengths of alias-free shadow maps and admits an efficient hardware-accelerated implementation; novel acceleration techniques and a few hardware modifications to enable real-time implementations are also presented.
Abstract: The alias-free shadow maps technique is the first shadow map-based algorithm that computes shadows at floating-point accuracy and handles transparent surfaces elegantly. However, its major disadvantage is that it is a specialized software rendering technique, because shadow computations are performed by using irregular sampling point locations. We present a new algorithm that inherits the strengths of alias-free shadow maps and has an efficient hardware-accelerated implementation. With our method, the sampling points are transformed into multiple layers of regular pixel boundaries with a point sprite rendering technique, and the rasterized geometry is modified to include the necessary information for computing accurate shadow terms. We also present novel acceleration techniques and propose a few hardware modifications to enable real-time implementations.

Journal ArticleDOI
TL;DR: A number of experiments show that there is a basic constant overhead for every Java program, and that a subset of Java opcodes has an almost constant energy cost.

Journal Article
TL;DR: Four modal-like operators on Boolean lattices are introduced and their theory is presented from lattice-theoretical, topological, and algebraic points of view.
Abstract: In this work, four modal-like operators on Boolean lattices are introduced and their theory is presented from lattice-theoretical, topological, and algebraic points of view. It is also shown how rough set approximation operators, modal operators in temporal logic, and linguistic modifiers determined by L-sets can be interpreted as modal-like operators.

Journal ArticleDOI
TL;DR: This work focuses on the question of how patient confidentiality can be ensured when developing language technology for the nursing documentation domain, and identifies the ethical issues that arise when natural language processing is used to support clinical judgement and decision-making.

Proceedings ArticleDOI
29 Aug 2007
TL;DR: A novel agent-based reconfiguring concept for future network-on-chip (NoC) systems is introduced that is able to increase application fault tolerance and performance through autonomous reactions of agents.
Abstract: We introduce a novel agent-based reconfiguring concept for future network-on-chip (NoC) systems. The properties necessary to increase architecture-level fault tolerance are introduced. The system control is modeled as a multi-level agent hierarchy that is able to increase application fault tolerance and performance through autonomous reactions of agents. The agent technology adds a system-level intelligence layer to traditional NoC system design. The architecture and functions of this system are described at a conceptual level. Communication and reconfiguration data flows are presented as case studies. Principles of reconfiguring a NoC in a faulty environment are demonstrated and simulated. The probability of reconfiguration success is measured under different latency requirements and amounts of redundancy by Monte Carlo simulations. The effect of network topology on the reconfiguration of a faulty mesh was also investigated in the simulations.
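A minimal sketch of the kind of Monte Carlo experiment described, under simplifying assumptions of this illustration: links of a 2D mesh NoC fail independently with probability p, and a reconfiguration succeeds if some route within a hop (latency) bound still connects source to destination.

```python
import random
from collections import deque

def mesh_neighbors(x, y, n):
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < n and 0 <= y + dy < n:
            yield x + dx, y + dy

def reachable_within(src, dst, n, alive, max_hops):
    """BFS over surviving links; True if dst is within max_hops of src."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == dst:
            return True
        if d == max_hops:
            continue
        for nb in mesh_neighbors(*node, n):
            if nb not in seen and alive[frozenset((node, nb))]:
                seen.add(nb)
                frontier.append((nb, d + 1))
    return False

def success_probability(n=4, p_fail=0.1, max_hops=8, trials=10_000):
    src, dst = (0, 0), (n - 1, n - 1)
    ok = 0
    for _ in range(trials):
        alive = {}
        for x in range(n):
            for y in range(n):
                for nb in mesh_neighbors(x, y, n):
                    link = frozenset(((x, y), nb))
                    if link not in alive:
                        alive[link] = random.random() > p_fail
        ok += reachable_within(src, dst, n, alive, max_hops)
    return ok / trials

print(success_probability())  # estimated reconfiguration success rate
```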

Book ChapterDOI
26 Sep 2007
TL;DR: This paper provides a definition of contracts and refinement using the action system formalism; contracts enable abstract specifications of model parts, while refinement offers a framework to reason about the correctness of contract implementations as well as the composition of model parts.
Abstract: Simulink is a popular tool for model-based development of control systems. However, due to the complexity caused by the increasing demand for sophisticated controllers, validation of Simulink models is becoming a more difficult task. To ensure correctness and reliability of large models, it is important to be able to reason about model parts and their interactions. This paper provides a definition of contracts and refinement using the action system formalism. Contracts enable abstract specifications of model parts, while refinement offers a framework to reason about correctness of implementation of contracts, as well as composition of model parts. An example is provided to illustrate system development using contracts and refinement.

Journal ArticleDOI
TL;DR: This study presents a novel algorithm for reverse engineering gene regulatory networks with linear systems; the algorithm combines orthogonal least squares, second-order-derivative network pruning, and Bayesian model comparison, and can be used to elucidate gene regulatory networks from a limited number of experimental data points.
Abstract: Reverse engineering of a gene regulatory network with a large number of genes and a limited number of experimental data points is a computationally challenging task. In particular, reverse engineering using linear systems is an underdetermined and ill-conditioned problem: the amount of microarray data is limited and the solution is very sensitive to noise in the data. Therefore, the reverse engineering of gene regulatory networks with a large number of genes and a limited number of data points requires a rigorous optimization algorithm. This study presents a novel algorithm for reverse engineering with linear systems. The proposed algorithm is a combination of orthogonal least squares, a second-order derivative for network pruning, and Bayesian model comparison. In this study, the entire network is decomposed into a set of small networks that are defined as unit networks. The algorithm provides each unit network with P(D|Hi), which is used as a confidence level: a unit network with a higher P(D|Hi) has higher confidence of being correctly elucidated. Thus, the proposed algorithm is able to locate true positive interactions using P(D|Hi), which is a unique property of the proposed algorithm. The algorithm is evaluated with synthetic and Saccharomyces cerevisiae expression data using the dynamic Bayesian network. With synthetic data, it is shown that the performance of the algorithm depends on the number of genes, the noise level, and the number of data points. With Yeast expression data, it is shown that there is a remarkable number of known physical or genetic events among all interactions elucidated by the proposed algorithm. The performance of the algorithm is compared with the Sparse Bayesian Learning algorithm using both synthetic and Saccharomyces cerevisiae expression data sets. The comparison experiments show that the algorithm produces sparser solutions with fewer false positives than the Sparse Bayesian Learning algorithm. From our evaluation experiments, we draw the following conclusions: 1) simulation results show that the algorithm can be used to elucidate gene regulatory networks from a limited number of experimental data points; 2) simulation results also show that the algorithm is able to handle noisy data; 3) the experiment with Yeast expression data shows that the proposed algorithm reliably elucidates known physical or genetic events; 4) the comparison experiments show that the algorithm performs more efficiently than the Sparse Bayesian Learning algorithm with noisy data and a limited number of data points.
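A minimal sketch of the underlying linear-system formulation, assuming the common model in which each gene's rate of expression change is a linear combination of expression levels, fitted per gene by least squares; the crude magnitude-threshold pruning below is an illustrative stand-in for the paper's orthogonal least squares and Bayesian model comparison machinery.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic time series: x[t+1] = x[t] + dt * A_true @ x[t] + noise.
n_genes, n_points, dt = 5, 12, 0.1
A_true = np.zeros((n_genes, n_genes))
A_true[0, 1], A_true[2, 0], A_true[3, 3] = 0.8, -0.6, -0.4  # sparse network
x = np.zeros((n_points, n_genes))
x[0] = rng.uniform(0.5, 1.5, n_genes)
for t in range(n_points - 1):
    x[t + 1] = x[t] + dt * A_true @ x[t] + 0.01 * rng.standard_normal(n_genes)

# Least squares fit of dx/dt ~ A @ x (one row of A per gene).
dxdt = (x[1:] - x[:-1]) / dt
A_hat = np.linalg.lstsq(x[:-1], dxdt, rcond=None)[0].T

# Illustrative pruning step: drop small coefficients to recover a
# sparse network (the paper instead scores unit networks by P(D|Hi)).
A_hat[np.abs(A_hat) < 0.15] = 0.0
print(np.round(A_hat, 2))
```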

Journal ArticleDOI
TL;DR: A negative answer is given, for a large class of systems, to the open question, formulated in 1983 by Culik II and Karhumaki, of whether there exist independent systems of three equations over three unknowns admitting non-periodic solutions.
Abstract: We investigate the open question asking whether there exist independent systems of three equations over three unknowns admitting non-periodic solutions, formulated in 1983 by Culik II and Karhumaki. In particular, we give a negative answer to this question for a large class of systems. More specifically, the question remains open only for a well specified class of systems. We also investigate systems of two equations over three unknowns for which we give necessary and sufficient conditions for admitting at most quasi-periodic solutions, i.e., solutions where the images of two unknowns are powers of a common word. In doing so, we also give a number of examples showing that these conditions represent a boundary point between systems admitting purely non-periodic solutions and those admitting at most quasi-periodic ones.
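For orientation, a standard worked fact from combinatorics on words in the same vein (an illustration chosen here, not the paper's own example): a single commutation equation already forces two unknowns to be powers of a common word, the situation the paper calls quasi-periodic.

```latex
% Commutation (a special case of the defect theorem): in a free monoid,
xy = yx \;\Longleftrightarrow\;
  \exists\, t \in \Sigma^{*},\; i, j \ge 0 :\; x = t^{i},\ y = t^{j}.
% A solution over unknowns x, y, z is quasi-periodic when the images of
% two of the unknowns are powers of a common word, e.g. x = t^{i} and
% z = t^{j}, while the third may remain arbitrary.
```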

Journal ArticleDOI
TL;DR: An algorithm is described for testing whether or not a finite set of words is an (R,S)-code; it explores coding properties of finite sets of words by finding maximal and minimal relations with respect to relational codes.

Journal ArticleDOI
TL;DR: The results indicate that tools handling the free text in nursing documentation could be one way to develop electronic patient records; the ability of the machine-learning algorithm to perform the classification at an acceptable level is also addressed.

Proceedings ArticleDOI
01 Jul 2007
TL;DR: Computer simulations show that the newly proposed construction method for asymmetric space-time block codes can compete with the puncturing method, and in some cases outperforms it; the method can also be converted to produce multi-block space-time codes that achieve the diversity-multiplexing (D-M) tradeoff.
Abstract: In this paper, the need for the construction of asymmetric space-time block codes (ASTBCs) is discussed, mostly concentrating on the case of four transmitting and two receiving antennas for simplicity. Beyond the trivial puncturing method, i.e. switching off the extra layers in the symmetric multiple-input multiple-output (MIMO) setting, a more sophisticated yet simple asymmetric construction method is proposed. This method can be converted to produce multi-block space-time codes that achieve the diversity-multiplexing (D-M) tradeoff. It is also shown that maximizing the density of the newly proposed codes is equivalent to minimizing the discriminant of a certain order. The use of the general method is then demonstrated by building explicit, sphere-decodable codes using different cyclic division algebras (CDAs). We verify by computer simulations that the newly proposed method can compete with the puncturing method, and in some cases outperforms it. Our construction exploiting maximal orders improves even upon the punctured perfect code and the DjABBA code.

Journal ArticleDOI
TL;DR: An approach for educators to evaluate student progress throughout a course, and not merely based on a final exam is presented, and results from using this approach in introductory programming courses at secondary level are presented.
Abstract: This paper presents an approach for educators to evaluate student progress throughout a course, and not merely based on a final exam. We introduce progress reports and describe how these can be used as a tool to evaluate student learning and understanding during programming courses. Complemented with data from surveys and the exam, the progress reports can be used to build an overall picture of individual student progress in a course, and to answer questions related to how students (1) understand program code as a whole, (2) understand individual constructs, and (3) perceive the difficulty level of different programming topics. We also present results from using this approach in introductory programming courses at secondary level. Our initial experience from using the progress reports is positive, as they provide valuable information during the course, which most likely would remain uncovered otherwise.

Journal ArticleDOI
TL;DR: A mathematical model is proposed for contextual variants of ld and dlad on strings, in which recombinations can be done only if certain contexts are present, and the proposed model is proved to be Turing-universal.
Abstract: The process of gene assembly in ciliates, an ancient group of organisms, is one of the most complex instances of DNA manipulation known in any organism. Three molecular operations ld, hi, and dlad have been postulated for the gene assembly process. We propose in this paper a mathematical model for contextual variants of ld and dlad on strings: recombinations can be done only if certain contexts are present. We prove that the proposed model is Turing-universal.
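A minimal sketch of the string view of one operation: an ld-style excision removes the segment between two occurrences of the same pointer, and a contextual variant additionally demands prescribed contexts around the pointer occurrences. The pointer/context encoding below is an assumption of this illustration, not the paper's exact formalism.

```python
def contextual_ld(s: str, p: str, left_ctx: str, right_ctx: str):
    """Contextual ld-style excision on strings: if s = u p v p w, the
    required left context ends immediately before the first p, and the
    required right context starts immediately after the second p,
    excise v and yield u p w."""
    results = []
    i = s.find(p)
    while i != -1:
        j = s.find(p, i + len(p))
        while j != -1:
            has_left = s[:i].endswith(left_ctx)
            has_right = s[j + len(p):].startswith(right_ctx)
            if has_left and has_right:
                results.append(s[:i + len(p)] + s[j + len(p):])
            j = s.find(p, j + 1)
        i = s.find(p, i + 1)
    return results

# "2" is the pointer; excision is allowed only between contexts a...b.
print(contextual_ld("a2xyz2b", p="2", left_ctx="a", right_ctx="b"))
# -> ['a2b']
```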