THEMIS: Fairness in Federated Stream Processing under Overload
Citations
GrandSLAm: Guaranteeing SLAs for Jobs in Microservices Execution Frameworks
Overload Control for Scaling WeChat Microservices
A holistic view of stream partitioning costs
Distributed resource management across process boundaries
Load-aware shedding in stream processing systems
References
The Tragedy of the Commons
A Quantitative Measure Of Fairness And Discrimination For Resource Allocation In Shared Computer Systems
Web Services Architecture
Aurora: a new model and architecture for data stream management
The CQL continuous query language: semantic foundations and query execution
Frequently Asked Questions (16)
Q2. What is the SIC assignment for derived tuples?
The assignment of SIC values to derived tuples is performed as per Equation (3), which requires the sets of input and output tuples.
Q3. How does the load shedder calculate the result SIC?
To reduce the impact of delays when disseminating the result SIC values by the query coordinator to nodes hosting query fragments, the load shedder estimates the result SIC values of queries based on its local shedding.
Q4. What is the domain of tuples in the query graph?
Certain operators in the query graph are connected to a finite set of sources, denoted by S, which produce source tuples at time-varying rates.
Q5. How does the algorithm increase the result SIC of all queries?
The algorithm follows a gradient ascent approach to gradually increase the result SIC values of all queries while minimising the pairwise SIC difference between the two queries with the lowest SIC values.
Q6. What is the FSPS's policy for balancing the SIC values of queries?
Overloaded nodes in the FSPS invoke a distributed semantic fair load-shedding algorithm that aims to balance the SIC values of query results across all queries, referred to as the BALANCE-SIC fairness policy.
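The balancing idea described above can be illustrated with a minimal single-node sketch. It is not the paper's distributed algorithm: it is a simplified greedy loop, under the assumption that each pending tuple already carries a SIC value and that the node can process a fixed number of tuples per interval. The function name `balance_sic_shed` and its parameters are hypothetical.

```python
import heapq


def balance_sic_shed(pending, capacity):
    """Greedy sketch: repeatedly serve the query with the lowest kept SIC.

    pending  -- dict mapping a query id to the SIC values of its waiting tuples
                (query ids must be mutually comparable, e.g. all strings)
    capacity -- number of tuples this node can process in the interval
                (an assumed, simplified cost model: one unit per tuple)
    Returns a dict mapping each query id to the SIC values that were kept;
    everything else is considered shed.
    """
    # Sort ascending so that list.pop() yields the most valuable tuple first.
    queues = {q: sorted(values) for q, values in pending.items()}
    kept = {q: [] for q in pending}

    # Min-heap keyed on the SIC each query has accumulated so far.
    heap = [(0.0, q) for q in sorted(queues) if queues[q]]
    heapq.heapify(heap)

    while capacity > 0 and heap:
        total, q = heapq.heappop(heap)   # query that is currently worst off
        sic = queues[q].pop()            # its most valuable pending tuple
        kept[q].append(sic)
        capacity -= 1
        if queues[q]:
            heapq.heappush(heap, (total + sic, q))
    return kept


if __name__ == "__main__":
    demo = {"q1": [0.1, 0.1, 0.1, 0.1], "q2": [0.25, 0.25]}
    print(balance_sic_shed(demo, capacity=4))
```

With a capacity of four tuples, the sketch keeps three tuples of `q1` and one of `q2`, so both queries end up with similar accumulated SIC rather than similar tuple counts.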
Q7. Why is it difficult to capture the tuples that are not dropped?
It is challenging in practice to capture the sets $\tilde{T}^S$, $T^S$, $\tilde{T}^R$ and $T^R$ accurately because source tuples are successively transformed into derived tuples by operators and some are shed: derived tuples are “lost”, e.g. due to filters and joins, which only select a subset of their input tuples.
Q8. What is the SIC metric used to determine the importance of tuples?
In their model, the SIC metric captures the importance of tuples: the higher the SIC value, the more important the tuple. The algorithm thus always keeps the most valuable tuples ($\max(x_{SIC})$ in line 16).
Q9. What is the effect of shedding on the rest of the nodes?
Since queries span across sites and share resources, such effects are spread across sites, affecting shedding decisions on the rest of the nodes.
Q10. What is the SIC value of an individual source tuple $t_s$?
The SIC value of an individual source tuple $t_s$ is inversely proportional to $|T^S_s|$ and is also normalised by the number of sources $|S|$ in a query, which makes the metric query-independent.
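As a small worked illustration of this definition, the sketch below assumes the proportionality constant is 1, i.e. the SIC value is 1 / (|T^S_s| · |S|). Under that assumption the SIC values of all source tuples of a query sum to 1 when nothing is shed. The function name and parameter names are hypothetical.

```python
def source_tuple_sic(tuples_from_source: int, num_sources: int) -> float:
    """SIC value of one source tuple, assuming a proportionality constant of 1.

    tuples_from_source -- number of tuples the source produced (|T^S_s|)
    num_sources        -- number of sources feeding the query (|S|)
    """
    if tuples_from_source <= 0 or num_sources <= 0:
        raise ValueError("both counts must be positive")
    return 1.0 / (tuples_from_source * num_sources)


if __name__ == "__main__":
    # Two sources, each producing four tuples.
    sic = source_tuple_sic(tuples_from_source=4, num_sources=2)
    print(sic)          # per-tuple SIC
    print(sic * 4 * 2)  # total over all source tuples of the query
```

A source that emits many tuples therefore contributes less per tuple than a source that emits few, so no single high-rate source dominates the metric.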
Q11. What is the way to measure processing quality?
The authors require a measure of processing quality that quantifies the processing degradation under shedding but is query-independent, i.e. it does not have to be adapted manually to the semantics of specific queries.
Q12. How many fragments does the BALANCE-SIC fairness algorithm accept?
Figure 11 shows that, when more queries are multi-fragmented, the BALANCE-SIC fairness algorithm converges to a fairer system, as more queries span nodes.
Q13. What is the SIC value of a result tuple?
The query SIC value of result tuples is:

$$q_{SIC} := \sum_{t_r \in \tilde{T}^R} {t_r}_{SIC}, \tag{4}$$

where the authors only consider result tuples $t_r \in \tilde{T}^R \subseteq T^R$ that are derived from source tuples ∈
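Equation (4) is a plain sum over the SIC values of the result tuples that survive shedding. The sketch below shows that arithmetic only; the function name `query_sic` and the example values are hypothetical, and the per-tuple SIC values are assumed to have been computed already.

```python
import math


def query_sic(result_tuple_sics):
    """Sum the SIC values of the result tuples that were actually produced."""
    # math.fsum limits floating-point error when many small values are added.
    return math.fsum(result_tuple_sics)


if __name__ == "__main__":
    # Eight result tuples of SIC 0.125 each: no information was lost.
    print(query_sic([0.125] * 8))
    # Only four of them survive shedding: half the information remains.
    print(query_sic([0.125] * 4))
```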
Q14. What are the different approaches for load shedding?
There exist semantic load shedding approaches for specific operator types, such as joins [21, 26, 17], aggregates [10, 35] and XML operators [38].
Q15. What is the effect of shedding tuples on the other nodes?
The shedding of tuples eventually converges to global fairness as each node continuously adjusts its shedding behaviour in response to that of other FSPS nodes.
Q16. How does Jain’s fairness index work?
The solution of [44] is obtained using Matlab, and Jain’s fairness index for the resulting distribution of utilities (normalised log-output rates) equals 0.87.
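For reference, Jain's fairness index of n non-negative values x_1, ..., x_n is (Σ x_i)² / (n · Σ x_i²). It equals 1 when all values are equal and 1/n when a single value holds everything. A minimal sketch follows; the function name `jain_index` is chosen here for illustration.

```python
def jain_index(values):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2)."""
    xs = list(values)
    if not xs:
        raise ValueError("need at least one value")
    total = sum(xs)
    squares = sum(x * x for x in xs)
    if squares == 0:
        raise ValueError("index is undefined when all values are zero")
    return (total * total) / (len(xs) * squares)


if __name__ == "__main__":
    print(jain_index([1, 1, 1, 1]))  # perfectly even allocation
    print(jain_index([1, 0, 0, 0]))  # one participant receives everything
```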