Author

C. S. Yang

Bio: C. S. Yang is an academic researcher from Cornell University. The author has contributed to research in the topics of Document retrieval and Apoptosis. The author has an h-index of 1 and has co-authored 1 publication receiving 6,281 citations.

Papers
Journal Article
TL;DR: An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents, demonstrating the usefulness of the model.
Abstract: In a document retrieval, or other pattern matching environment where stored entities (documents) are compared with each other or with incoming patterns (search requests), it appears that the best indexing (property) space is one where each entity lies as far away from the others as possible; in these circumstances the value of an indexing system may be expressible as a function of the density of the object space; in particular, retrieval performance may correlate inversely with space density. An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents. Typical evaluation results are shown, demonstrating the usefulness of the model.
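The space-density idea above can be illustrated with a short sketch. The snippet below is a reconstruction added for this summary, not code from the paper: it takes average pairwise cosine similarity as the density measure and uses a made-up toy matrix, both of which are assumptions here. It computes the quantity that, per the abstract, should correlate inversely with retrieval performance.

```python
import numpy as np

def space_density(doc_vectors):
    """Average pairwise cosine similarity of a document collection.

    Under the hypothesis above, a denser space (higher average
    similarity between documents) should go with worse retrieval
    performance, so an indexing vocabulary can be chosen to keep
    this value low.
    """
    X = np.asarray(doc_vectors, dtype=float)
    X = X / np.clip(np.linalg.norm(X, axis=1, keepdims=True), 1e-12, None)
    sims = X @ X.T                              # all pairwise cosine similarities
    n = X.shape[0]
    return (sims.sum() - n) / (n * (n - 1))     # drop the diagonal self-similarities

# toy term weights: rows are documents, columns index terms
docs = np.array([[2, 0, 1, 0],
                 [0, 3, 0, 1],
                 [1, 1, 0, 2]])
print(space_density(docs))
```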

6,619 citations

Journal Article
C. S. Yang, Y. Lou, Qinjian Ke, X. J. Xu, Yi Zhang
23 Mar 2022
TL;DR: CircZNF609 and miR-153 are mutually targeted, and down-regulation of circZNF609 inhibits proliferation and induces apoptosis in diffuse large B-cell lymphoma cells by targeting miR-153.
Abstract: Objective: To investigate the molecular mechanism of circZNF609 targeting miR-153 to regulate the proliferation and apoptosis of diffuse large B-cell lymphoma. Methods: Fifty cases of lymphoma tissue from patients with diffuse large B-cell lymphoma who were diagnosed and treated in the First Affiliated Hospital of Zhengzhou University from July 2018 to December 2019 were collected. Thirty cases of normal lymph node tissues that were confirmed to be reactive hyperplasia by pathological diagnosis during the same period were selected as controls. Real-time quantitative polymerase chain reaction (PCR) was used to detect the expression of circZNF609 in diffuse large B-cell lymphoma tissues and control hyperplastic lymph nodes. Diffuse large B-cell lymphoma OCI-LY19 cells were divided into a control group (blank control), an si-con group (transfected with siRNA control), an si-ZNF609 group (transfected with circZNF609 siRNA), an si-ZNF609+Anti-NC group (co-transfected with circZNF609 siRNA and inhibitor control), and an si-ZNF609+Anti-miR-153 group (co-transfected with circZNF609 siRNA and miR-153 inhibitor). Cell Counting Kit-8 (CCK-8) was used to detect proliferation, and flow cytometry was used to detect cell cycle and apoptosis. Western blot was used to detect the protein expression of C-caspase-3, cyclin D1, and p21. The luciferase reporter system was used to identify the relationship between circZNF609 and miR-153. Results: The expression level of circZNF609 in diffuse large B-cell lymphoma tissue was (1.44±0.22), higher than (0.37±0.14) in the control tissues (P<0.001). The cell survival rate of the si-ZNF609 group was (51.74±6.39)%, lower than (100.00±10.23)% of the control group and (99.64±11.67)% of the si-con group (P<0.001). The proportion of cells in the G0/G1 phase was (63.25±4.11)%, higher than (48.62±4.32)% of the control group and (47.12±3.20)% of the si-con group (P<0.001), and the apoptosis rate was (13.36±1.42)%, higher than (3.65±0.47)% of the control group and (3.84±0.62)% of the si-con group (P<0.05). The expression levels of C-caspase-3 and p21 protein were (0.85±0.09) and (0.90±0.08), higher than (0.38±0.04) and (0.65±0.07) in the control group and (0.39±0.05) and (0.66±0.05) in the si-con group (P<0.001). The expression level of cyclin D1 protein was (0.40±0.03), lower than (0.52±0.06) of the control group and (0.53±0.04) of the si-con group (all P<0.001). CircZNF609 and miR-153 were shown to target each other. The cell survival rate of the si-ZNF609+Anti-miR-153 group was (169.92±13.25)%, higher than (100.00±9.68)% of the si-ZNF609+Anti-NC group (P<0.001), and the proportion of cells in the G0/G1 phase and the apoptosis rate were (52.01±3.62)% and (8.20±0.87)%, respectively, lower than (64.51±5.17)% and (14.03±1.17)% in the si-ZNF609+Anti-NC group (P<0.001). The protein expression levels of C-caspase-3 and p21 were (0.42±0.06) and (0.52±0.06), lower than (0.80±0.07) and (0.92±0.10) of the si-ZNF609+Anti-NC group (P<0.001). The protein expression level of cyclin D1 was (0.68±0.07), higher than (0.39±0.04) in the si-ZNF609+Anti-NC group (P<0.001). Conclusion: Down-regulation of circZNF609 inhibits the proliferation of diffuse large B-cell lymphoma OCI-LY19 cells and induces apoptosis by targeting miR-153.

2 citations


Cited by
Journal Article
TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and examines in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
Abstract: The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last 10 years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
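As a deliberately minimal illustration of the inductive approach described above, the sketch below trains a classifier from a handful of pre-classified documents with scikit-learn. The toy corpus, the labels, and the choice of TF-IDF features with a naive Bayes learner are assumptions made here for illustration; they are not the survey's own experimental setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# tiny pre-classified corpus (placeholder data)
train_docs = [
    "stocks rallied after the earnings report",
    "the central bank raised interest rates",
    "the striker scored twice in the final",
    "the team clinched the championship title",
]
train_labels = ["finance", "finance", "sports", "sports"]

# document representation (TF-IDF) + classifier construction (naive Bayes)
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(train_docs, train_labels)

# classify an unseen document
print(clf.predict(["the goalkeeper saved a penalty"]))   # expected: ['sports']
```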

7,539 citations

22 May 2010
TL;DR: This work describes a Natural Language Processing software framework which is based on the idea of document streaming, i.e. processing corpora document after document in a memory-independent fashion, and implements several popular algorithms for topical inference, including Latent Semantic Analysis and Latent Dirichlet Allocation, in a way that makes them completely independent of the training corpus size.
Abstract: Large corpora are ubiquitous in today's world and memory quickly becomes the limiting factor in practical applications of the Vector Space Model (VSM). We identify a gap in existing VSM implementations: their scalability and ease of use. We describe a Natural Language Processing software framework which is based on the idea of document streaming, i.e. processing corpora document after document in a memory-independent fashion. In this framework, we implement several popular algorithms for topical inference, including Latent Semantic Analysis and Latent Dirichlet Allocation, in a way that makes them completely independent of the training corpus size. Particular emphasis is placed on straightforward and intuitive framework design, so that modifications and extensions of the methods and/or their application by interested practitioners are effortless. We demonstrate the usefulness of our approach on a real-world scenario of computing document similarities within the existing digital library DML-CZ.
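A minimal sketch of the document-streaming idea, written on the assumption that the framework described is the open-source gensim package; the file path, tokenisation, and topic counts are placeholders. Each document is turned into a bag-of-words vector only when a model asks for it, so memory use stays independent of corpus size.

```python
from gensim import corpora, models

class StreamedCorpus:
    """Iterate over a corpus one document (one line of the file) at a time."""
    def __init__(self, path, dictionary):
        self.path = path
        self.dictionary = dictionary

    def __iter__(self):
        with open(self.path, encoding="utf-8") as fh:
            for line in fh:
                # yield a sparse bag-of-words vector; the corpus is never held in memory
                yield self.dictionary.doc2bow(line.lower().split())

# one streaming pass to build the vocabulary ("corpus.txt" is a placeholder path)
dictionary = corpora.Dictionary(
    line.lower().split() for line in open("corpus.txt", encoding="utf-8")
)
corpus = StreamedCorpus("corpus.txt", dictionary)

# topical inference over the stream: LSA and LDA never see the whole corpus at once
lsi = models.LsiModel(corpus, id2word=dictionary, num_topics=200)
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=100)
```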

3,965 citations

Journal Article
TL;DR: The goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs, and to provide pointers into the literature for those who are less familiar with the field.
Abstract: Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
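To make the first two matrix classes concrete, the sketch below (added here for illustration, not taken from the survey; the toy corpus and the ±2-word window are assumptions) builds a term-document matrix and a word-context co-occurrence matrix from the same tiny corpus.

```python
from collections import Counter
import numpy as np

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "the cat chased the dog"]
tokens = [d.split() for d in docs]
vocab = sorted({w for t in tokens for w in t})
index = {w: i for i, w in enumerate(vocab)}

# term-document matrix: rows are terms, columns are documents
td = np.zeros((len(vocab), len(docs)), dtype=int)
for j, t in enumerate(tokens):
    for w, c in Counter(t).items():
        td[index[w], j] = c

# word-context matrix: co-occurrence counts within a +/-2 word window
wc = np.zeros((len(vocab), len(vocab)), dtype=int)
window = 2
for t in tokens:
    for i, w in enumerate(t):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if j != i:
                wc[index[w], index[t[j]]] += 1

print(td.shape, wc.shape)   # (terms x documents), (words x context words)
```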

2,843 citations

Journal Article
01 Sep 2001
TL;DR: This paper examines the sensitivity of retrieval performance to the smoothing parameters and compares several popular smoothing methods on different test collections.
Abstract: Language modeling approaches to information retrieval are attractive and promising because they connect the problem of retrieval with that of language model estimation, which has been studied extensively in other application areas such as speech recognition. The basic idea of these approaches is to estimate a language model for each document, and then rank documents by the likelihood of the query according to the estimated language model. A core problem in language model estimation is smoothing, which adjusts the maximum likelihood estimator so as to correct the inaccuracy due to data sparseness. In this paper, we study the problem of language model smoothing and its influence on retrieval performance. We examine the sensitivity of retrieval performance to the smoothing parameters and compare several popular smoothing methods on different test collections.
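One smoothing method commonly examined in this line of work is Dirichlet-prior smoothing, which estimates p(w|d) = (c(w,d) + μ·p(w|C)) / (|d| + μ), where p(w|C) is the collection language model. The sketch below is an illustration written for this summary, not the paper's code; the toy collection and the value of μ are assumptions. It ranks documents by their smoothed query log-likelihood.

```python
import math
from collections import Counter

def rank(query, docs, mu=2000.0):
    """Rank documents by Dirichlet-smoothed query likelihood."""
    doc_tokens = [d.lower().split() for d in docs]
    collection = Counter(w for t in doc_tokens for w in t)
    coll_len = sum(collection.values())

    scores = []
    for idx, tokens in enumerate(doc_tokens):
        counts, dlen = Counter(tokens), len(tokens)
        score = 0.0
        for w in query.lower().split():
            p_coll = collection[w] / coll_len
            if p_coll == 0.0:
                continue            # query term unseen in the collection: skip in this toy sketch
            # maximum-likelihood estimate adjusted by the Dirichlet prior
            p = (counts[w] + mu * p_coll) / (dlen + mu)
            score += math.log(p)
        scores.append((score, idx))
    return sorted(scores, reverse=True)

docs = ["language models for information retrieval",
        "speech recognition with statistical language models",
        "smoothing methods for sparse data"]
print(rank("language model smoothing", docs))
```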

1,597 citations

Proceedings Article
01 Jan 2000
TL;DR: In this article, an inner product in the feature space consisting of all subsequences of length k was introduced for comparing two text documents, where a subsequence is any ordered sequence of k characters occurring in the text though not necessarily contiguously.
Abstract: We introduce a novel kernel for comparing two text documents. The kernel is an inner product in the feature space consisting of all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text though not necessarily contiguously. The subsequences are weighted by an exponentially decaying factor of their full length in the text, hence emphasising those occurrences which are close to contiguous. A direct computation of this feature vector would involve a prohibitive amount of computation even for modest values of k, since the dimension of the feature space grows exponentially with k. The paper describes how despite this fact the inner product can be efficiently evaluated by a dynamic programming technique. A preliminary experimental comparison of the performance of the kernel compared with a standard word feature space kernel [6] is made showing encouraging results.
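The recursion behind the kernel can be written down compactly. The sketch below is a memoised recursive rendering added for illustration; the paper itself describes an efficient iterative dynamic-programming evaluation, and the decay factor lam and the example strings are arbitrary choices made here.

```python
from functools import lru_cache

def ssk(s, t, k, lam=0.5):
    """Unnormalised string subsequence kernel of order k (recursive form)."""

    @lru_cache(maxsize=None)
    def kprime(i, m, n):
        # auxiliary kernel K'_i evaluated on the prefixes s[:m] and t[:n]
        if i == 0:
            return 1.0
        if min(m, n) < i:
            return 0.0
        x = s[m - 1]
        total = lam * kprime(i, m - 1, n)
        for j in range(1, n + 1):
            if t[j - 1] == x:
                total += kprime(i - 1, m - 1, j - 1) * lam ** (n - j + 2)
        return total

    @lru_cache(maxsize=None)
    def kernel(m, n):
        # K_k evaluated on the prefixes s[:m] and t[:n]
        if min(m, n) < k:
            return 0.0
        x = s[m - 1]
        total = kernel(m - 1, n)
        for j in range(1, n + 1):
            if t[j - 1] == x:
                total += kprime(k - 1, m - 1, j - 1) * lam ** 2
        return total

    return kernel(len(s), len(t))

# the only shared length-2 subsequence of "cat" and "car" is c-a, weighted lam**2
# in each string, so the kernel value is lam**4
print(ssk("cat", "car", 2, lam=0.5))   # 0.0625
```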

1,464 citations