KNIME: The Konstanz Information Miner
Frequently Asked Questions (15)
Q2. What type of processing is easy to extend?
The type of processing ranges from simple data operations, such as filtering or merging, to more complex statistical functions, such as computing the mean, standard deviation, or linear regression coefficients, to computation-intensive data modeling operators (clustering, decision trees, and neural networks, to name just a few).
Q3. What is the main purpose of Knime?
To accommodate the increasing availability of multi-core machines, support for shared-memory parallelism also becomes increasingly important.
Q4. What is the function that can be used to mark a selected point in a view?
By receiving events from a HiLiteHandler (and sending events to it), a view can mark selected points (so-called HiLiting) to enable visual brushing.
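The eventing described above is essentially a listener pattern: views register with a shared handler and are notified when rows are hilit elsewhere. The following is a minimal, self-contained sketch of that idea; the class and method names are simplified illustrations, not the actual KNIME API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Listener interface: views implement this to react to hilite changes.
interface HiLiteListener {
    void hiLite(Set<String> rowIds);
    void unHiLite(Set<String> rowIds);
}

// Central handler: views register with it and broadcast selections
// through it, so all views stay in sync (visual brushing).
class HiLiteHandler {
    private final List<HiLiteListener> listeners = new ArrayList<>();
    private final Set<String> hilit = new HashSet<>();

    void addListener(HiLiteListener l) { listeners.add(l); }

    // Mark rows as hilit and notify every registered view.
    void fireHiLite(Set<String> rowIds) {
        hilit.addAll(rowIds);
        for (HiLiteListener l : listeners) l.hiLite(rowIds);
    }

    Set<String> getHiLitKeys() { return new HashSet<>(hilit); }
}
```

Because the handler only knows the listener interface, any number of independent views can participate in brushing without depending on each other.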
Q5. What are the main principles of Knime?
• Customized applications can be modelled through individual data pipelines.
• Modularity: processing units and data containers should not depend on each other, in order to enable easy distribution of computation and allow for independent development of different algorithms.
Q6. Why is it important to avoid accessing data by row ID?
The reason to avoid access by row ID or index is scalability: the desire to process large amounts of data without being forced to keep all rows in memory for fast, random access.
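The consequence of this design is that nodes consume tables through sequential iteration rather than indexed lookup, so memory use stays bounded regardless of table size. A minimal sketch of that access pattern, using simplified stand-ins for the real table classes (the names here are illustrative only):

```java
// Hypothetical row type; the real framework uses richer row/cell classes.
class Row {
    final String id;
    final double value;
    Row(String id, double value) { this.id = id; this.value = value; }
}

// A table exposes rows only through an iterator, so a node can stream
// arbitrarily large inputs without holding all rows in memory.
interface RowTable extends Iterable<Row> {}

class MeanNode {
    // Computes a mean in a single sequential pass; no access by
    // row ID or index is ever required.
    static double mean(RowTable table) {
        double sum = 0;
        long n = 0;
        for (Row r : table) { sum += r.value; n++; }
        return n == 0 ? 0 : sum / n;
    }
}
```

Any backing store (in-memory list, file, database cursor) can implement the iterator, which is what makes the pipeline scale.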
Q7. What is the function of a meta-node?
A meta-node can be exported to other users as a predefined module and allows the creation of wrappers for repeated execution, as needed for cross-validation, bagging and boosting, ensemble learning, etc.
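To make the "wrapper for repeated execution" idea concrete, here is a hedged sketch of a meta-node that re-runs an inner workflow once per cross-validation fold. The class name, the representation of the inner workflow as a plain function, and the fold-splitting scheme are all simplifying assumptions for illustration, not the actual KNIME implementation.

```java
import java.util.function.Function;

// Hypothetical meta-node: wraps an inner "workflow" (here just a function
// that maps training data to a score) and executes it once per fold.
class CrossValidationMetaNode {
    private final int folds;
    private final Function<double[], Double> innerWorkflow;

    CrossValidationMetaNode(int folds, Function<double[], Double> inner) {
        this.folds = folds;
        this.innerWorkflow = inner;
    }

    // Runs the wrapped workflow on each training partition (all data
    // except the held-out fold) and averages the resulting scores.
    double run(double[] data) {
        double total = 0;
        int foldSize = data.length / folds;
        for (int f = 0; f < folds; f++) {
            double[] train = new double[data.length - foldSize];
            int j = 0;
            for (int i = 0; i < data.length; i++) {
                if (i / foldSize != f) train[j++] = data[i];
            }
            total += innerWorkflow.apply(train);
        }
        return total / folds;
    }
}
```

The same wrapping pattern generalizes to bagging, boosting, or ensemble learning: only the inner workflow and the partitioning scheme change.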
Q8. What is the main purpose of the Konstanz Information Miner?
In order to make use of the vast variety of data analysis methods around, it is essential that such an environment is easy and intuitive to use, allows for quick and interactive changes to the analysis and enables the user to visually explore the results.
Q9. What is the purpose of the class Node?
The class Node wraps all functionality and makes use of user defined implementations of a NodeModel, possibly a NodeDialog, and one or more NodeView instances if appropriate.
Q10. What is the need for a NodeFactory?
In addition to the model, dialog, and view classes, the programmer also needs to provide a NodeFactory that creates new instances.
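The split between model and factory can be sketched as follows. This is a deliberately simplified illustration of the pattern, assuming toy signatures; the real NodeModel and NodeFactory classes have much richer interfaces (ports, settings, views, dialogs).

```java
// Simplified stand-in for the model class: holds the node's processing logic.
abstract class NodeModel {
    // Core processing: transform input values into output values.
    abstract double[] execute(double[] input);
}

// Simplified stand-in for the factory: its only job is to create fresh
// model (and, in the real framework, dialog/view) instances, so the
// framework never needs to know the concrete classes.
abstract class NodeFactory {
    abstract NodeModel createNodeModel();
}

// Example node implementation: doubles every value.
class DoublerModel extends NodeModel {
    double[] execute(double[] input) {
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) out[i] = input[i] * 2;
        return out;
    }
}

class DoublerFactory extends NodeFactory {
    NodeModel createNodeModel() { return new DoublerModel(); }
}
```

The framework instantiates nodes only through factories, which is what allows new node types to be plugged in without changes to the core.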
Q11. How does the workflow manager determine which nodes to execute?
Thanks to the underlying graph structure, the workflow manager is able to determine all nodes required to be executed along the paths leading to the node the user actually wants to execute.
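This dependency resolution amounts to walking the predecessor edges of the workflow graph from the requested node. A minimal sketch of that traversal, assuming the workflow is an acyclic graph and using illustrative class names rather than the real workflow manager API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical workflow graph: stores, for each node, the nodes whose
// output it consumes (its predecessors).
class WorkflowGraph {
    private final Map<String, List<String>> predecessors = new HashMap<>();

    void connect(String from, String to) {
        predecessors.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    // Collects every node on a path leading to the target, ordered so
    // that predecessors always appear before their successors.
    List<String> executionOrder(String target) {
        List<String> order = new ArrayList<>();
        visit(target, new HashSet<>(), order);
        return order;
    }

    private void visit(String node, Set<String> visited, List<String> order) {
        if (!visited.add(node)) return;  // already scheduled
        List<String> preds = predecessors.getOrDefault(node,
                Collections.<String>emptyList());
        for (String p : preds) visit(p, visited, order);
        order.add(node);  // post-order: after all predecessors
    }
}
```

Executing the nodes in the returned order guarantees that every node's inputs are available before it runs, which is exactly what the user observes when requesting execution of a single downstream node.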
Q12. What is the main idea of a data analysis process?
In order to achieve this, a data analysis process consists of a pipeline of nodes, connected by edges that transport either data or models.
Q13. What is the purpose of nested workflows?
Such nested workflows introduce modularity and allow the user to design complex workflows while focusing on different levels of detail (abstraction).
Q14. What is the way to create a new node?
A wizard integrated into the Eclipse-based development environment allows the developer to quickly generate all required class bodies for a new node.
Q15. What is the purpose of the Konstanz Information Miner?
From this large variety of nodes, one can select data sources, data preprocessing steps, model building algorithms, visualization techniques, and model I/O tools, and drag them onto the workbench, where they can be connected to other nodes.