
Showing papers by "Jun Yang" published in 2022



Proceedings ArticleDOI
22 Feb 2022
TL;DR: It is argued that this problem is undecidable for general domain relational calculus queries, and practical algorithms for computing a minimum collection of such instances subject to other constraints are developed.
Abstract: A powerful way to understand a complex query is by observing how it operates on data instances. However, specific database instances are not ideal for such observations: they often include large amounts of superfluous details that are not only irrelevant to understanding the query but also cause cognitive overload; and one specific database may not be enough. Given a relational query, is it possible to provide a simple and generic "representative" instance that (1) illustrates how the query can be satisfied, and (2) summarizes all specific instances that would satisfy the query in the same way by abstracting away unnecessary details? Furthermore, is it possible to find a collection of such representative instances that together completely characterize all possible ways in which the query can be satisfied? This paper takes initial steps towards answering these questions. We design what these representative instances look like, define what they stand for, and formalize what it means for them to satisfy a query in "all possible ways." We argue that this problem is undecidable for general domain relational calculus queries, and develop practical algorithms for computing a minimum collection of such instances subject to other constraints. We evaluate the efficiency of our approach experimentally, and show its effectiveness in helping users debug relational queries through a user study.

4 citations


Proceedings ArticleDOI
10 Jun 2022
TL;DR: A useful extension is considered, durable temporal joins, which further selects results with long enough valid intervals so they are not merely transient patterns, and proposes output-sensitive algorithms for non-r-hierarchical joins.
Abstract: This paper studies multi-way join queries over temporal data, where each tuple is associated with a valid time interval indicating when the tuple is valid. A temporal join requires that joining tuples' valid intervals intersect. Previous work on temporal joins has focused on joining two relations, but pairwise processing is often inefficient because it may generate unnecessarily large intermediate results. This paper investigates how to efficiently process complex temporal joins involving multiple relations. We also consider a useful extension, durable temporal joins, which further selects results with long enough valid intervals so they are not merely transient patterns. We classify temporal join queries into different classes based on their computational complexity. We identify the class of r-hierarchical joins and show that a linear-time algorithm exists for a temporal join if and only if it is r-hierarchical (assuming the 3SUM conjecture holds). We further propose output-sensitive algorithms for non-r-hierarchical joins. We implement our algorithms and evaluate them on both synthetic and real datasets.
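To make the definitions in this abstract concrete, here is a minimal sketch of a durable temporal join. It is a naive enumeration over all tuple combinations, not the linear-time or output-sensitive algorithms the paper proposes. The names `TemporalTuple` and `durable_temporal_join`, the single shared join key, and the closed-interval convention are all assumptions made for illustration.

```python
from itertools import product
from typing import NamedTuple


class TemporalTuple(NamedTuple):
    """Hypothetical tuple layout: one join key plus a closed valid interval."""
    key: str
    start: float
    end: float


def durable_temporal_join(relations, min_duration=0.0):
    """Naive multi-way temporal join on a single shared key.

    A combination joins if all keys match and the valid intervals share
    at least one point. With min_duration > 0, only results whose common
    interval is at least that long are kept (the "durable" extension).
    """
    results = []
    for combo in product(*relations):
        if len({t.key for t in combo}) != 1:
            continue  # join keys differ
        start = max(t.start for t in combo)  # intersection start
        end = min(t.end for t in combo)      # intersection end
        if start <= end and end - start >= min_duration:
            results.append((combo[0].key, start, end))
    return results


# Illustrative data only (not from the paper).
r = [TemporalTuple("a", 0, 10), TemporalTuple("b", 0, 5)]
s = [TemporalTuple("a", 5, 20), TemporalTuple("b", 6, 9)]
t = [TemporalTuple("a", 8, 9)]

print(durable_temporal_join([r, s]))
print(durable_temporal_join([r, s, t], min_duration=2))
```

The enumeration examines every combination of input tuples, which is the kind of blow-up that motivates the algorithms studied in the paper.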

4 citations


Journal ArticleDOI
TL;DR: The combined promoter methylation assay for SHOX2 and RASSF1A can be used for screening and diagnosis of early LUAD, with good sensitivity and specificity.
Abstract: Objective: Methylation of the promoters of SHOX2 and RASSF1A is a potentially informative biomarker for the diagnosis of early lung adenocarcinoma (LUAD). Abnormal methylation of SHOX2 and RASSF1A promoters may promote the occurrence and facilitate the progression of LUAD. Materials and Methods: We selected 54 patients with early LUAD and 31 patients with benign lung nodules as a NJDT cohort and evaluated their DNA methylation and mRNA sequencing levels. The DNA methylation sequencing, mRNA sequencing, and clinical data for patients with LUAD were obtained from The Cancer Genome Atlas, and served as a TCGA cohort. We evaluated the diagnostic potential of a SHOX2 and RASSF1A combined promoter methylation assay for detection of early LUAD in the NJDT cohort. Then we explored the promoter methylation levels of SHOX2 and RASSF1A and their gene expression between normal and tumor samples at different stages in both cohorts. Pathways enriched between tumor and normal samples of methylation-positive patients in the NJDT cohort were analyzed. Results: In the NJDT cohort, the sensitivity of the combined promoter methylation assay on tumor samples was 74.07%, the sensitivity on paired tumor and paracancerous samples was 77.78%, and the specificities in both contexts were 100%. The combined promoter methylation-positive patients had clinicopathologic features including older age, larger tumors, deeper invasion, and higher Ki-67 expression. In both cohorts, SHOX2 expression increased and RASSF1A expression decreased in tumor samples. The promoter methylation levels of SHOX2 and RASSF1A were significantly higher in tumor samples at stage I-II than in normal samples. The promoter methylation levels of these two genes were both negatively associated with their expression in early tumor samples. In the NJDT cohort, methylation-positive patients of both individual SHOX2 and RASSF1A assays exhibited upregulation of folic acid metabolism and nucleotide metabolism in tumor samples. The SHOX2 methylation-positive and RASSF1A methylation-positive patients showed downregulation of pathways related to cell proliferation and apoptosis and of pathways involved in DNA repair, cell growth and cell adhesion, respectively. Conclusion: The combined promoter methylation assay for SHOX2 and RASSF1A can be used for screening and diagnosis of early LUAD, with good sensitivity and specificity. The promoter methylation levels of SHOX2 and RASSF1A were associated with their abnormal mRNA expression, and affected DNA instability, cell proliferation, apoptosis and tumor microenvironment in patients with LUAD.
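As a rough check on the arithmetic behind the reported percentages, the sketch below applies the standard definitions of sensitivity and specificity. The abstract does not state the underlying counts. The values 40 and 42 positives out of 54 early-LUAD patients are inferred here because they reproduce 74.07% and 77.78%. Treating the 31 benign-nodule patients as the negative group is likewise an assumption.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased cases that the assay flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-diseased cases that the assay flags as negative."""
    return true_neg / (true_neg + false_pos)


# Inferred counts (assumption): 54 early-LUAD patients in the NJDT cohort.
tumor_only = sensitivity(true_pos=40, false_neg=14)
paired = sensitivity(true_pos=42, false_neg=12)

# Assumption: all 31 benign-nodule patients tested negative.
benign = specificity(true_neg=31, false_pos=0)

print(f"{tumor_only:.2%} {paired:.2%} {benign:.2%}")
```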

2 citations


Journal ArticleDOI
TL;DR: Wang et al. used three independent, orthogonal channels of an opto-electromechanical microscopic observation system to obtain three views (e.g., front view, top view, side view) of a single fruit cell, using tomato and strawberry at two ripening stages as fruit samples.

1 citation



Journal ArticleDOI
TL;DR: The Structural Topic Model (STM) is a text modeling method that uses an unsupervised machine learning model to statistically derive latent semantic topics underlying a collection of text documents.
Abstract: Although text data are ubiquitous in organizations, the advancement of text analysis methods has created an unfortunate bottleneck for many organizational researchers. This paper introduced the Structural Topic Model (STM; Roberts et al., 2014), a cutting-edge text modeling method that uses an unsupervised machine learning model to statistically derive latent semantic topics underlying a collection of text documents. More importantly, the STM method has the advantage of modeling document-level variables (i.e., metadata) as covariates, which is critical for organizational research. We also demonstrated the application of STM in diversity research: comprehensively analyzing a large number of diversity statements publicly released by Fortune 1000 companies. Our structural topic modeling uncovered six underlying latent semantic topics: 1) general DEI terms, 2) supporting Black community, 3) acknowledging Black community, 4) committing to diversifying workforce, 5) miscellaneous words, and 6) titles and companies. We further explored and found that the prevalence of these topics varied as a function of company characteristics, including industry sector, CEO race, corporate political orientation, etc. Our paper not only demonstrates the promising application of Structural Topic Models in organizational research, but also provides important theoretical implications for current diversity research through the meaningful findings.

Journal ArticleDOI
TL;DR: A thermal model with highly accurate temperature prediction for the high frequency insulated core transformer (HF-ICT) accelerator is proposed, considering the thermal coupling of the ambient, cores and windings.
Abstract: The high frequency insulated core transformer (HF-ICT) accelerator is expected to replace the traditional insulated core transformer (ICT) for its small size and high power density in the field of irradiation below 1 MeV. As a key component of the accelerator, the ICT high-voltage power supply is used to feed the accelerator tube. The segmented core structure of the ICT leads to a different magnetic flux in each core section. The nonuniformity of power loss and structure results in an uneven temperature distribution in the HF-ICT. In this regard, a thermal model with highly accurate temperature prediction for the HF-ICT parts is required. A thermal resistance network based on the HF-ICT is proposed, considering the thermal coupling of the ambient, cores and windings. The multiphysics coupling analysis is carried out with the COMSOL Electromagnetic Thermal Module and Nonisothermal Flow Module. Instead of linear materials, the measured B-P curve and B-H curve are applied for a more realistic simulation model. Finally, the thermal model demonstrates acceptable accuracy and small computational expense compared with the simulation results, which verifies the effectiveness of the proposed model.