Efficient balanced sampling: The cube method
Citations
Handling class imbalance in customer churn prediction
Comparing Oversampling Techniques to Handle the Class Imbalance Problem: A Customer Churn Prediction Case Study
Socioeconomic impacts of COVID-19 in low-income countries.
TRIÈST: Counting Local and Global Triangles in Fully-Dynamic Streams with Fixed Memory Size
Geostatistical Model-Based Estimates of Schistosomiasis Prevalence among Individuals Aged ≤20 Years in West Africa
References
Model assisted survey sampling
On the Two Different Aspects of the Representative Method: the Method of Stratified Sampling and the Method of Purposive Selection
On the Theory of Sampling from Finite Populations
Frequently Asked Questions (11)
Q2. What is the strategy for using balanced sampling?
Balanced sampling protects against extreme or negative weights, which, as mentioned before, can be very problematic, particularly with small samples.
Q3. What is the second way to sort the data?
The second step consists of taking $v(1) = (0\ 1\ 0\ \dots\ 0)'$. The second way consists of sorting the data randomly before applying the cube method with any vectors $v(t)$.
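The random-sort option can be sketched as a small wrapper. This is a minimal illustration, not the authors' code: `apply_in_random_order` and its `sampler` argument are hypothetical names, and `sampler` stands for any routine that takes a list of inclusion probabilities and returns a list of 0/1 indicators in the same order.

```python
import random

def apply_in_random_order(pi, sampler, rng=None):
    """Shuffle the frame, run `sampler` on it, and undo the shuffle.

    `sampler` is a hypothetical callable: probabilities in, 0/1 indicators out.
    """
    rng = rng or random.Random()
    order = list(range(len(pi)))
    rng.shuffle(order)                        # random sort of the units
    shuffled_pi = [pi[k] for k in order]
    shuffled_sample = sampler(shuffled_pi)
    sample = [0] * len(pi)
    for position, k in enumerate(order):      # map back to the original order
        sample[k] = shuffled_sample[position]
    return sample
```

Keeping the permutation in `order` is what lets the selected units be reported against the original frame positions.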
Q4. What is the balancing variable in equation 2?
When the population size is known before selecting the sample, it could be important to select a sample such that $\sum_{k \in U} S_k / \pi_k = N$ (2). Equation (2) is a balancing equation, in which the balancing variable is $x_k = 1$ ($k \in U$).
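A balancing equation of this kind can be checked numerically for a given sample. The sketch below assumes the sample is stored as a list of 0/1 indicators; the function names are illustrative and not taken from the paper.

```python
import math

def ht_total(x, s, pi):
    """Probability-weighted total: sum over k of s_k * x_k / pi_k."""
    return sum(sk * xk / p for xk, sk, p in zip(x, s, pi))

def satisfies_balancing_equation(x, s, pi, rel_tol=1e-9):
    """True when the weighted sample total of x equals the population total."""
    return math.isclose(ht_total(x, s, pi), sum(x), rel_tol=rel_tol)

# Balancing variable x_k = 1: the weighted sample count must equal N.
N = 10
pi = [0.4] * N                                # equal probabilities, expected size 4
ones = [1] * N
sample_of_4 = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
sample_of_3 = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0]
print(satisfies_balancing_equation(ones, sample_of_4, pi))  # balanced on N
print(satisfies_balancing_equation(ones, sample_of_3, pi))  # not balanced
```

With equal inclusion probabilities, balancing on $x_k = 1$ amounts to fixing the sample size, which is why the sample of three units fails the check.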
Q5. How can the authors implement the cube method?
Nearly all existing methods, except the rejective ones and the variations of systematic sampling, can easily be implemented by means of the cube method.
Q6. What is the general method for detecting when the balancing equations are exactly satisfied?
At the end of the flight phase, a vertex of K is chosen randomly in such a way that the inclusion probabilities $\pi_k$ ($k \in U$) and the balancing equations (1) are exactly satisfied.
Q7. What is the variance approximation for balanced sampling?
A variance approximation is proposed for balanced sampling based on regression residuals, which is validated by a theoretical development and a large set of simulations.
Q8. What is the way to solve the problem of sampling with unequal probability?
In order to satisfy this constraint, expression (5) implies that $\sum_{k \in U} u_k(t) = 0$ (13). Each choice, random or not, of vectors $u(t)$ that satisfy (13) produces another method for sampling with unequal probability.
Q9. What is the first step in the generating of the vector u(t)?
For generating the vector $u(t)$, the authors first generate any, random or not, vector $v(t) = \{v_k(t)\}$ in $\mathbb{R}^N$ that is independent of $\pi(t-1), \dots, \pi(1)$.
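The following is a minimal sketch of how a zero-sum direction $u(t)$ might be built from an arbitrary $v(t)$ and used to draw a fixed-size sample with unequal probabilities. Several parts are assumptions rather than the paper's construction: $u(t)$ is obtained by centring $v(t)$ over the units whose probability is still strictly between 0 and 1, the step is the largest move that keeps every probability in $[0, 1]$, and the input probabilities are assumed to sum to an integer. All function names are illustrative.

```python
import random

EPS = 1e-9  # tolerance for treating a probability as already 0 or 1

def centred_direction(v, pi):
    """Assumed construction: centre v over undecided units so that sum(u) == 0."""
    free = [k for k, p in enumerate(pi) if EPS < p < 1 - EPS]
    mean = sum(v[k] for k in free) / len(free)
    u = [0.0] * len(pi)
    for k in free:
        u[k] = v[k] - mean
    return u

def max_step(pi, u):
    """Largest lam >= 0 such that 0 <= pi_k + lam * u_k <= 1 for every k."""
    lam = float("inf")
    for p, d in zip(pi, u):
        if d > EPS:
            lam = min(lam, (1 - p) / d)
        elif d < -EPS:
            lam = min(lam, p / -d)
    return lam

def fixed_size_sample(pi, rng=None):
    """Random walk on the probabilities until each one is 0 or 1."""
    rng = rng or random.Random()
    pi = list(pi)
    while sum(1 for p in pi if EPS < p < 1 - EPS) >= 2:
        v = [rng.random() for _ in pi]        # drawn independently of the past
        u = centred_direction(v, pi)
        if all(abs(d) <= EPS for d in u):
            continue                          # degenerate direction, redraw
        lam1 = max_step(pi, u)
        lam2 = max_step(pi, [-d for d in u])
        # Step probabilities chosen so that E[pi(t) | pi(t-1)] = pi(t-1).
        step = lam1 if rng.random() < lam2 / (lam1 + lam2) else -lam2
        pi = [min(1.0, max(0.0, p + step * d)) for p, d in zip(pi, u)]
    return [round(p) for p in pi]

sample = fixed_size_sample([0.2, 0.3, 0.5, 0.4, 0.6], random.Random(0))
print(sample, sum(sample))
```

Because every direction sums to zero, the total of the probabilities never changes, so the sample size equals the sum of the initial inclusion probabilities. Each step pushes at least one unit to 0 or 1, so the loop stops after at most N steps.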
Q10. What is the standard probability weighted estimator?
The calibration estimator is defined as $\hat{Y}_R = \hat{Y} + (X - \hat{X})' b$, where $b = \left( \sum_{k \in U} \frac{s_k x_k x_k'}{\pi_k} \right)^{-1} \sum_{k \in U} \frac{s_k x_k y_k}{\pi_k}$ is the 'standard' probability weighted estimator.
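For a single scalar auxiliary variable, the matrix inverse in $b$ reduces to a division, which makes the estimator easy to sketch. The code below is an illustration under that simplifying assumption; the function name and the toy data are hypothetical.

```python
def calibration_estimator(y, x, s, pi):
    """Y_hat + (X - X_hat) * b for one scalar auxiliary variable x.

    y needs to be known only for sampled units (s_k == 1).
    """
    sampled = [k for k, sk in enumerate(s) if sk]
    y_hat = sum(y[k] / pi[k] for k in sampled)          # weighted total of y
    x_hat = sum(x[k] / pi[k] for k in sampled)          # weighted total of x
    sxx = sum(x[k] * x[k] / pi[k] for k in sampled)
    sxy = sum(x[k] * y[k] / pi[k] for k in sampled)
    b = sxy / sxx                                       # weighted slope
    return y_hat + (sum(x) - x_hat) * b                 # sum(x) is the known X

# Toy data: y is exactly proportional to x, so the estimate recovers 3 * X.
x = [1, 2, 3, 4, 5, 6]
y = [3 * xk for xk in x]
s = [1, 0, 1, 0, 0, 1]
pi = [0.5] * 6
print(calibration_estimator(y, x, s, pi))
```

When the sample is already balanced on x, the correction term vanishes because the weighted sample total of x equals X, and the estimator coincides with the plain weighted total of y.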
Q11. How many auxiliary variables can be used to calculate the variance?
With some adjustments, the cube method can thus be applied to any sampling frame, even with millions of units and a large number of auxiliary variables.