Author

Todd Nicholson

Bio: Todd Nicholson is an academic researcher from the University of Illinois at Urbana–Champaign. The author has contributed to research in the topics of cloud computing and metadata. The author has an h-index of 3 and has co-authored 6 publications receiving 35 citations.

Papers
Proceedings ArticleDOI
22 Jul 2018
TL;DR: Some of the challenges encountered in designing and developing a system that can be easily adapted to different scientific areas are discussed, including support for large amounts of data, horizontal scaling of domain-specific preprocessing algorithms, and the ability to provide new data visualizations in the web browser.
Abstract: Clowder is an open source data management system to support data curation of long-tail data and metadata across multiple research domains and diverse data types. Institutions and labs can install and customize their own instance of the framework on local hardware or on remote cloud computing resources to provide a shared service to distributed communities of researchers. Data can be ingested directly from instruments or manually uploaded by users and then shared with remote collaborators using a web front end. We discuss some of the challenges encountered in designing and developing a system that can be easily adapted to different scientific areas, including digital preservation, geoscience, materials science, medicine, social science, cultural heritage, and the arts. These challenges include support for large amounts of data, horizontal scaling of domain-specific preprocessing algorithms, the ability to provide new data visualizations in the web browser, a comprehensive web service API for automatic data ingestion and curation, a suite of social annotation and metadata management features to support data annotation by communities of users and algorithms, and a web-based front end to interact with code running on heterogeneous clusters, including HPC resources.

20 citations
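As a rough illustration of the web service API for automatic data ingestion mentioned in the abstract above, the following is a minimal sketch of uploading a file to a Clowder instance and attaching metadata to it. The endpoint paths, the key query parameter, and the payload shapes are assumptions for illustration and should be checked against the API documentation of the target Clowder instance.

```python
# Minimal sketch of programmatic ingestion against a Clowder-style REST API.
# Endpoint paths, the ?key= query parameter, and the metadata payload shape
# are assumptions for illustration; consult the actual instance's API docs.
import requests

CLOWDER_URL = "https://clowder.example.org"   # hypothetical instance
API_KEY = "YOUR_API_KEY"                      # issued by the instance
DATASET_ID = "5f2b0000example"                # hypothetical dataset id

def upload_file(path: str) -> str:
    """Upload a local file to a dataset and return the new file id."""
    with open(path, "rb") as fh:
        resp = requests.post(
            f"{CLOWDER_URL}/api/uploadToDataset/{DATASET_ID}",
            params={"key": API_KEY},
            files={"File": fh},
        )
    resp.raise_for_status()
    return resp.json()["id"]

def attach_metadata(file_id: str, metadata: dict) -> None:
    """Attach user- or algorithm-generated metadata to an uploaded file."""
    resp = requests.post(
        f"{CLOWDER_URL}/api/files/{file_id}/metadata.jsonld",
        params={"key": API_KEY},
        json=metadata,
    )
    resp.raise_for_status()

if __name__ == "__main__":
    fid = upload_file("scan_001.tif")
    attach_metadata(fid, {"instrument": "SEM-01", "operator": "jdoe"})
```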

Proceedings ArticleDOI
14 May 2017
TL;DR: The evaluation results show that the novel cloud framework 4CeeD can help researchers significantly reduce the time and cost spent on experiments, and that it efficiently handles high-volume, fast-changing workloads of heterogeneous experimental data.
Abstract: In this paper, we present a data acquisition and analysis framework for materials-to-devices processes, named 4CeeD, that focuses on the immense potential of capturing, accurately curating, correlating, and coordinating materials-to-devices digital data in a real-time and trusted manner before fully archiving and publishing them for wide access and sharing. In particular, 4CeeD consists of two novel services: a curation service for collecting, curating, and wrapping data from microscopes and fabrication instruments with extensive metadata in real time and in a trusted manner, and a cloud-based coordination service for storing data, extracting metadata, and analyzing and finding correlations among the data. Our evaluation results show that our novel cloud framework can help researchers significantly reduce the time and cost spent on experiments, and that it efficiently handles high-volume, fast-changing workloads of heterogeneous experimental data.

17 citations
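The curation-at-acquisition idea described above can be sketched roughly as follows. This is an illustrative outline, not 4CeeD's actual code; the endpoint URL, payload fields, instrument identifiers, and file extension are hypothetical.

```python
# Illustrative sketch (not 4CeeD's implementation) of wrapping each new
# instrument file with metadata at acquisition time and handing the bundle
# to a cloud coordination service. All names and the endpoint are hypothetical.
import hashlib
import json
import time
from pathlib import Path
import requests

WATCH_DIR = Path("/instrument/output")             # where the microscope writes files
CURATION_URL = "https://4ceed.example.org/curate"  # hypothetical service endpoint

def wrap_with_metadata(path: Path) -> dict:
    """Build a metadata envelope for a raw instrument file."""
    data = path.read_bytes()
    return {
        "filename": path.name,
        "sha256": hashlib.sha256(data).hexdigest(),  # integrity check for trusted transfer
        "acquired_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "instrument": "TEM-2100",                    # hypothetical instrument id
        "sample_id": "wafer-42",                     # hypothetical sample label
    }

def push(path: Path) -> None:
    """Upload the raw bytes plus the metadata envelope in one request."""
    envelope = wrap_with_metadata(path)
    with path.open("rb") as fh:
        requests.post(
            CURATION_URL,
            files={"file": fh},
            data={"metadata": json.dumps(envelope)},
            timeout=60,
        ).raise_for_status()

def watch(poll_seconds: int = 5) -> None:
    """Poll the instrument output directory and push each new file exactly once."""
    seen: set[str] = set()
    while True:
        for path in WATCH_DIR.glob("*.dm3"):   # .dm3 used here as an example TEM format
            if path.name not in seen:
                push(path)
                seen.add(path.name)
        time.sleep(poll_seconds)
```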

Proceedings ArticleDOI
01 Feb 2019
TL;DR: BRACELET is proposed, an edge-cloud infrastructure that augments the existing cloud-based infrastructure with edge devices and helps to tackle the unique performance and security challenges that scientific instruments face when they are connected to the cloud through a public network.
Abstract: Recent advances in cyber-infrastructure have enabled digital data sharing and ubiquitous network connectivity between scientific instruments and cloud-based storage infrastructure for uploading, storing, curating, and correlating large amounts of materials and semiconductor fabrication data and metadata. However, a significant number of scientific instruments still run on old operating systems and are kept offline, unable to connect to the cloud infrastructure, due to security and network performance concerns. In this paper, we propose BRACELET, an edge-cloud infrastructure that augments the existing cloud-based infrastructure with edge devices and helps to tackle the unique performance and security challenges that scientific instruments face when they are connected to the cloud through a public network. With BRACELET, we put a networked edge device, called a cloudlet, between the scientific instruments and the cloud as the middle tier of a three-tier hierarchy. The cloudlet shapes and protects the data traffic from scientific instruments to the cloud, and plays a foundational role in keeping each instrument connected throughout its lifetime, continuously providing the otherwise missing performance and security features as the instrument's operating system ages.

5 citations
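A minimal sketch of the cloudlet's middle-tier role described above, assuming a simple HTTP relay with token-bucket traffic shaping; host names, ports, rate limits, and endpoints are illustrative and not taken from BRACELET itself.

```python
# Sketch of a cloudlet relay: accept traffic from a legacy instrument on the
# local network, shape it, and forward it to the cloud over TLS. Illustrative
# only; not BRACELET's code. Addresses and limits are placeholders.
import time
import requests
from flask import Flask, request

app = Flask(__name__)
CLOUD_URL = "https://cloud.example.org/ingest"   # hypothetical cloud endpoint

RATE_BYTES_PER_SEC = 10 * 1024 * 1024            # shape instrument bursts to ~10 MB/s
_bucket, _last = RATE_BYTES_PER_SEC, time.monotonic()

def _throttle(nbytes: int) -> None:
    """Token-bucket shaping so one instrument cannot saturate the uplink."""
    global _bucket, _last
    now = time.monotonic()
    _bucket = min(RATE_BYTES_PER_SEC, _bucket + (now - _last) * RATE_BYTES_PER_SEC)
    _last = now
    if nbytes > _bucket:
        time.sleep((nbytes - _bucket) / RATE_BYTES_PER_SEC)
        _bucket = 0
    else:
        _bucket -= nbytes

@app.route("/upload", methods=["POST"])
def upload():
    payload = request.get_data()   # raw bytes from the instrument PC
    _throttle(len(payload))
    # The cloudlet, not the aging instrument OS, speaks modern TLS to the cloud.
    requests.post(CLOUD_URL, data=payload, timeout=120).raise_for_status()
    return {"status": "forwarded", "bytes": len(payload)}

if __name__ == "__main__":
    # Listen only on the lab-side interface; the instrument never touches the
    # public network directly.
    app.run(host="192.168.10.1", port=8080)
```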

01 Jul 2018
TL;DR: BRACELET is proposed, an edge-cloud infrastructure that augments the existing cloud-based infrastructure with edge devices and helps to tackle the unique performance and security challenges that scientific instruments face when they are connected to the cloud through a public network.
Abstract: Recent advances in cyber-infrastructure have enabled digital data sharing and ubiquitous network connectivity between scientific instruments and cloud-based storage infrastructure for uploading, storing, curating, and correlating large amounts of materials and semiconductor fabrication data and metadata. However, a significant number of scientific instruments still run on old operating systems and are kept offline, unable to connect to the cloud infrastructure, due to security and performance concerns. In this paper, we propose BRACELET, an edge-cloud infrastructure that augments the existing cloud-based infrastructure with edge devices and helps to tackle the unique performance and security challenges that scientific instruments face when they are connected to the cloud through a public network. With BRACELET, we put a networked edge device, called a cloudlet, between the scientific instruments and the cloud as the middle tier of a three-tier hierarchy. The cloudlet shapes and protects the data traffic from scientific instruments to the cloud, and plays a foundational role in keeping each instrument connected throughout its lifetime, continuously providing the otherwise missing performance and security features as the instrument's operating system ages.

3 citations

Journal ArticleDOI
TL;DR: The limitations of current electron microscopy data curation practices are felt whenever a scientist wishes to share and revisit data; data and metadata managed individually by project discipline, chronological order, or some other arbitrary user preference are fundamentally lacking in transparency, longevity, and reusability.
Abstract: The limitations of current electron microscopy data curation practices are felt whenever a scientist wishes to share and revisit data. As soon as raw instrument data is written to a file, determining the contents of each file (with or without proprietary software) tends to be a serial, time-consuming task. In some cases, an image thumbnail may be available, but the thumbnail alone usually lacks the readily accessible contextual information that would make it valuable. This forces scientists to comb through files sequentially, sometimes requiring instrument- or detector-specific proprietary software to view data and metadata. This method of data examination limits the significance of each file to a combination of the researcher's notes or memory, OS-generated metadata (file size and time stamp), and perhaps a file naming convention. This is not a tractable premise for the hundreds of images acquired for a given sample, the thousands of images that may have contributed to publications, and hard drives full of project data contributed by multiple researchers over the span of a project. Ultimately, data and metadata managed individually by project discipline, chronological order, or some other arbitrary user preference are fundamentally lacking in transparency, longevity, and reusability.

1 citation
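To make the point about OS-generated metadata concrete, the short sketch below shows roughly everything a file system records about a raw instrument file in the absence of proprietary software; the file name and format are hypothetical.

```python
# Illustration of the limitation described above: without proprietary software,
# the only metadata readily available for a raw microscope file is what the
# file system records, which says nothing about the sample, instrument
# settings, or experimental context.
import os
import time
from pathlib import Path

def os_level_metadata(path: Path) -> dict:
    """Everything the OS knows about a raw instrument file."""
    st = os.stat(path)
    return {
        "name": path.name,        # file naming convention, if any
        "size_bytes": st.st_size, # file size
        "modified": time.strftime("%Y-%m-%d %H:%M:%S",
                                  time.localtime(st.st_mtime)),  # time stamp
    }

# Example (hypothetical file): print(os_level_metadata(Path("sample_0042.dm3")))
```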


Cited by
Journal ArticleDOI
TL;DR: This paper provides a tutorial on fog computing and its related computing paradigms, including their similarities and differences, and provides a taxonomy of research topics in fog computing.

783 citations

Journal ArticleDOI
TL;DR: In this paper, the authors provide a tutorial on fog computing and its related computing paradigms, including their similarities and differences, and provide a taxonomy of research topics in fog computing.
Abstract: With the Internet of Things (IoT) becoming part of our daily life and our environment, we expect rapid growth in the number of connected devices. The IoT is expected to connect billions of devices and humans, bringing promising advantages for us. With this growth, fog computing, along with its related edge computing paradigms such as multi-access edge computing (MEC) and the cloudlet, is seen as a promising solution for handling the large volume of security-critical and time-sensitive data being produced by the IoT. In this paper, we first provide a tutorial on fog computing and its related computing paradigms, including their similarities and differences. Next, we provide a taxonomy of research topics in fog computing, and through a comprehensive survey, we summarize and categorize the efforts on fog computing and its related computing paradigms. Finally, we provide challenges and future directions for research in fog computing.

360 citations

Journal ArticleDOI
TL;DR: This work uses examples to show how MDF and DLHub capabilities can be leveraged to link data with machine learning models and how users can access those capabilities through web and programmatic interfaces.
Abstract: Facilitating the application of machine learning (ML) to materials science problems requires enhancing the data ecosystem to enable discovery and collection of data from many sources, automated dissemination of new data across the ecosystem, and the connecting of data with materials-specific ML models. Here, we present two projects, the Materials Data Facility (MDF) and the Data and Learning Hub for Science (DLHub), that address these needs. We use examples to show how MDF and DLHub capabilities can be leveraged to link data with ML models and how users can access those capabilities through web and programmatic interfaces.

58 citations
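A hedged sketch of the programmatic interfaces mentioned in the abstract, using the mdf_forge and dlhub_sdk Python clients; the import paths, method names, query field, record layout, and model identifier are written from memory or invented for illustration and should be verified against the current MDF and DLHub documentation.

```python
# Sketch of linking an MDF data query to a DLHub-hosted model. Assumptions:
# the import paths, the "material.elements" query field, the record layout,
# and the model identifier are not confirmed here; check the official docs.
from mdf_forge import Forge          # MDF search client (assumed import path)
from dlhub_sdk import DLHubClient    # DLHub model-serving client (assumed import path)

# 1) Discover materials data in MDF.
forge = Forge()
records = (
    forge.match_field("material.elements", "Al")   # assumed query field
         .search(limit=10)
)

# 2) Feed a retrieved composition to a model hosted on DLHub.
client = DLHubClient()
composition = records[0]["material"]["composition"]          # assumed record layout
prediction = client.run("someuser/formation_energy_model",   # hypothetical model name
                        inputs=[composition])
print(prediction)
```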

Journal ArticleDOI
Abstract: Ongoing, rapid innovations in fields ranging from microelectronics, aerospace, and automotive to defense, energy, and health demand new advanced materials at ever greater rates and lower costs. Traditional materials R&D methods offer few paths to achieve both outcomes simultaneously. Materials informatics, while a nascent field, offers such a promise through screening growing databases of materials for new applications, learning new relationships from existing data resources, and building fast predictive models. We highlight key materials informatics successes from the atomic-scale modeling community, and discuss the ecosystem of open data, software, services, and infrastructure that has led to broad adoption of materials informatics approaches. We then examine emerging opportunities for informatics in materials science and describe an ideal data ecosystem capable of supporting similar widespread adoption of materials informatics, which we believe will enable the faster design of materials.

30 citations

Proceedings ArticleDOI
22 Jul 2018
TL;DR: The technical architecture for the TERRA-REF data and computing pipeline provides a suite of components to convert raw imagery to standard formats, geospatially subset data, and identify biophysical and physiological plant features related to crop productivity, resource use, and stress tolerance.
Abstract: The Transportation Energy Resources from Renewable Agriculture Phenotyping Reference Platform (TERRA-REF) provides a data and computation pipeline responsible for collecting, transferring, processing and distributing large volumes of crop sensing and genomic data from genetically informative germplasm sets. The primary source of these data is a field scanner system built over an experimental field at the University of Arizona Maricopa Agricultural Center. The scanner uses several different sensors to observe the field at a dense collection frequency with high resolution. These sensors include RGB stereo, thermal, pulse-amplitude modulated chlorophyll fluorescence, imaging spectrometer cameras, a 3D laser scanner, and environmental monitors. In addition, data from sensors mounted on tractors, UAVs, an indoor controlled-environment facility, and manually collected measurements are integrated into the pipeline. Up to 2 TB of data per day are collected and transferred to the National Center for Supercomputing Applications at the University of Illinois (NCSA), where they are processed. In this paper we describe the technical architecture for the TERRA-REF data and computing pipeline. This modular and scalable pipeline provides a suite of components to convert raw imagery to standard formats, geospatially subset data, and identify biophysical and physiological plant features related to crop productivity, resource use, and stress tolerance. Derived data products are uploaded to the Clowder content management system and the BETYdb traits and yields database for querying, supporting research at an experimental plot level. All software is open source under a BSD 3-clause or similar license, and the data products are open access (currently for evaluation, with a full release in fall 2019). In addition, we provide computing environments in which users can explore data and develop new tools. The goal of this system is to enable scientists to evaluate and use data, create new algorithms, and advance the science of digital agriculture and crop improvement.

21 citations
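One pipeline step described above (clip a georeferenced image to an experimental plot boundary, derive a simple per-plot plant feature, and record it in a traits database) might look roughly like the sketch below. The file paths, plot geometry, greenness heuristic, and BETYdb-style endpoint are placeholders rather than TERRA-REF's actual implementation.

```python
# Illustrative sketch (not the TERRA-REF codebase) of geospatial subsetting and
# per-plot feature extraction. All identifiers and endpoints are placeholders.
import rasterio
import rasterio.mask
import requests

TRAITS_URL = "https://betydb.example.org/api/v1/traits"   # hypothetical endpoint
API_KEY = "YOUR_BETYDB_KEY"

# Hypothetical plot boundary as a GeoJSON polygon in the image's CRS.
plot_geom = {
    "type": "Polygon",
    "coordinates": [[[409000, 3660000], [409010, 3660000],
                     [409010, 3660020], [409000, 3660020],
                     [409000, 3660000]]],
}

def canopy_cover(geotiff_path: str) -> float:
    """Clip the image to the plot and return the fraction of 'green' pixels."""
    with rasterio.open(geotiff_path) as src:
        clipped, _ = rasterio.mask.mask(src, [plot_geom], crop=True)
    red = clipped[0].astype(float)
    green = clipped[1].astype(float)
    blue = clipped[2].astype(float)
    # Crude greenness heuristic; real pipelines use calibrated, validated methods.
    green_pixels = (green > red) & (green > blue)
    valid = clipped[0] != 0
    return float(green_pixels[valid].mean()) if valid.any() else 0.0

def post_trait(plot_id: str, value: float) -> None:
    """Record the derived trait for the plot (payload shape is illustrative)."""
    requests.post(
        TRAITS_URL,
        params={"key": API_KEY},
        json={"plot": plot_id, "trait": "canopy_cover", "mean": value},
        timeout=30,
    ).raise_for_status()

if __name__ == "__main__":
    post_trait("MAC Field Scanner Plot 42", canopy_cover("rgb_2018-07-22.tif"))
```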