Enhancing Performance Prediction Robustness by Combining Analytical Modeling and Machine Learning
Citations
Pattern Recognition and Machine Learning
Performance Prediction for Apache Spark Platform
Machine Learning Methods for Reliable Resource Provisioning in Edge-Cloud Computing: A Survey
Rafiki: a middleware for parameter tuning of NoSQL datastores for dynamic metagenomics workloads
All versus one: an empirical comparison on retrained and incremental machine learning for modeling performance of adaptable software
References
Pattern Recognition and Machine Learning
Dynamic Programming
Pattern Recognition and Machine Learning (Information Science and Statistics)
Related Papers (5)
Identifying the optimal level of parallelism in transactional memory applications
Frequently Asked Questions (13)
Q2. What are the techniques used to identify the parameters’ values for the AM?
Techniques employed to identify the parameters’ values for the AM include regression [44, 9], clustering [34], Genetic Programming [18] or a combination of Kalman Filters and autoregressive models [45].
Q3. How did the authors evaluate the effectiveness of their proposals?
The authors evaluated the effectiveness of their proposals by relying on case studies related to two highly relevant open-source middleware platforms, namely a key-value data store and a group communication system.
Q4. Can the proposed techniques be used when the AM is updated dynamically?
It is worth noting that, by assuming the analytical model ΓAM to be an immutable object, the authors can ensure that the proposed techniques can also be employed in case ΓAM can be dynamically updated.
Q5. How does the paper assess the validity of the proposed ensemble techniques?
The authors assess the validity of their proposal through an extensive experimental evaluation carried out in two different application domains: throughput prediction of a popular open-source NoSQL distributed key-value store, Red Hat’s Infinispan [25], and response time prediction of a total order broadcast service, a key building block for fault-tolerant replicated systems.
Q6. How is an analytical model defined as a function?
Analogously to an ML-based model, an analytical model ΓAM is a function FAM → C, which can be queried to predict the performance of the modeled system, ŷ = ΓAM(x), over a given configuration x ∈ FAM.
Q7. What is the purpose of the ML-based learner?
By narrowing the scope in which the ML-based learner Γreg is used to the regions of high error for ΓAM, the complexity of the function that needs to be learnt via ML may be reduced, which may ultimately benefit the accuracy of Γreg.
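The idea of confining the ML learner to the regions where the analytical model is inaccurate can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the toy `analytical_model`, the saturating `true_system`, the k-nearest-neighbour regressor standing in for Γreg, and the 5% error `threshold` are all hypothetical.

```python
def analytical_model(x):
    # Hypothetical white-box model: assumes performance grows linearly with x.
    return 10.0 * x

def true_system(x):
    # Hypothetical ground truth: linear up to x = 5, then saturates at 50.
    return min(10.0 * x, 50.0)

# Training samples: (configuration, measured performance).
train = [(float(x), true_system(float(x))) for x in range(1, 11)]

def nearest_samples(x, samples, k):
    # The k training samples whose configuration is closest to x.
    return sorted(samples, key=lambda s: abs(s[0] - x))[:k]

def knn_predict(x, samples, k=2):
    # Plain k-nearest-neighbour regressor standing in for the ML learner.
    nearest = nearest_samples(x, samples, k)
    return sum(y for _, y in nearest) / len(nearest)

def am_error_nearby(x, samples, k=2):
    # Mean relative error of the analytical model on the samples closest to x.
    nearest = nearest_samples(x, samples, k)
    return sum(abs(analytical_model(xi) - yi) / yi for xi, yi in nearest) / len(nearest)

def hybrid_predict(x, samples, threshold=0.05, k=2):
    # Trust the analytical model where it has been accurate locally;
    # use the ML learner only in the analytical model's high-error regions.
    if am_error_nearby(x, samples, k) <= threshold:
        return analytical_model(x)
    return knn_predict(x, samples, k)

print(hybrid_predict(3.0, train))  # region where the analytical model is accurate
print(hybrid_predict(8.0, train))  # region where the analytical model overshoots
```

In this toy setting the learner only ever has to capture the flat, saturated region, which is a simpler function than the full performance curve.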
Q8. What is the use of ML for prediction of performance?
In the former case, AM is employed to capture the effects of data and CPU contention on performance, whereas ML is employed to forecast response time of network-bound operations.
Q9. What is the reason for the lack of a regressor?
The authors argue that this is because the error function of the ensemble composed of ΓAM and one ML-based regressor was extremely irregular, and hence not easy to learn using additional black-box regressors.
Q10. What is the key factor that affects the performance of the proposed solutions?
Their experimental study suggests that one of the key factors affecting the performance of the proposed solutions is the “shape” of the error distribution of ΓAM.
Q11. How are the parameters of the ensemble tuned?
The correct settings of these parameters can be identified by resorting to the standard methodology used to tune the internal parameters of ML-based algorithms: performing a parameter sweep during the training phase, and using cross-validation to evaluate the accuracy achieved by a candidate parameter configuration over a test set disjoint from the training set used to initialize the ensemble [2].
[...] incurs the largest errors are relatively circumscribed, solutions like KNN and Probing, which are based on the idea of determining in which regions to use which learner, are the most effective.
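The parameter sweep with cross-validation mentioned in this answer can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' tuning code: the tuned parameter (the neighbourhood size `k` of a toy nearest-neighbour regressor), the candidate values, the synthetic `data`, and the helper names are all hypothetical.

```python
def knn_predict(x, samples, k):
    # Toy k-nearest-neighbour regressor whose parameter k is being tuned.
    nearest = sorted(samples, key=lambda s: abs(s[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def cross_val_error(samples, k, folds=5):
    # Mean absolute error over held-out folds; every test fold is
    # disjoint from the samples used to build the model for that fold.
    total, count = 0.0, 0
    for f in range(folds):
        test = samples[f::folds]
        train = [s for i, s in enumerate(samples) if i % folds != f]
        for x, y in test:
            total += abs(knn_predict(x, train, k) - y)
            count += 1
    return total / count

def sweep(samples, candidates, folds=5):
    # Evaluate every candidate setting and keep the one with the lowest error.
    scores = {k: cross_val_error(samples, k, folds) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic (configuration, performance) samples, for illustration only.
data = [(float(x), float(x * x)) for x in range(20)]
best_k, scores = sweep(data, candidates=[1, 3, 5, 7])
print(best_k, scores)
```

The same loop applies to any ensemble parameter: replace `knn_predict` with the ensemble's prediction function and `candidates` with the values to be swept.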
Q12. What are the three ensemble techniques that are used to achieve the same objectives?
As already mentioned, the authors present three ensemble techniques that pursue the same objectives (minimizing training time and achieving an accuracy better than or comparable to that of both black-box and white-box techniques) using different algorithmic approaches.
Q13. Why was Cubist chosen as the reference base learner?
Cubist was chosen as the reference base learner for the results presented in this section because, at least in the considered case studies, it consistently proved to be the most accurate individual (non-ensembled) ML technique when compared to (Weka’s implementations of) ANN and SVM.