Non-parametric Tests: Mann-Whitney U. Module 6
Analysis of the Relationship Between Quantitative and Qualitative Variables
In this section, the analysis focuses on the relationship between a quantitative variable and a qualitative one, with particular emphasis on the Mann-Whitney U test. The discussion begins with an example involving meat consumption in two Spanish autonomous communities, Extremadura and Castilla y León.
Understanding Variable Relationships
- Differences in meat consumption are observed between Castilla y León and Extremadura; the question is whether they are statistically significant.
- The analysis aims to determine if meat consumption depends on the autonomous community of residence.
- The study seeks to establish a relationship between a qualitative variable (autonomous community) and a quantitative variable (meat consumption).
Statistical Testing Approaches
- Analyzing the relationship between variables involves comparing central tendencies through statistical tests.
- Different statistical approaches are used depending on whether the data are normal; for non-normal data, as in this case, median-comparison tests are employed.
Mann-Whitney U Test: Non-parametric Analysis
This section presents the Mann-Whitney U test as a non-parametric method for analyzing data that deviate from normality or involve small sample sizes.
Considerations for Non-Normal Data
- When dealing with small sample sizes or non-normal data, it is crucial to assess whether the samples are independent or paired.
- Tests like the Mann-Whitney U compare medians rather than means and operate on the ranks of the data points.
Power and Hypothesis Testing
- Non-parametric tests are generally less powerful than parametric ones, because rank-based calculations discard information about the magnitudes of the values.
- The hypothesis testing framework involves comparing medians under specified null and alternative hypotheses using calculated test statistics.
Calculation Process
- The U statistic in the Mann-Whitney test is determined by calculating two values (U1 and U2), which depend on the sample sizes (N1, N2) and the rank sums.
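The two U values described above can be sketched in Python; the sample sizes and rank sum here are illustrative placeholders, not the lecture's actual data:

```python
# Sketch of the two U values of the Mann-Whitney test (toy numbers, assumed).
n1, n2 = 8, 10          # sample sizes N1, N2
R1 = 70.0               # sum of the ranks of sample 1 (assumed value)

# Standard Mann-Whitney formulas:
U1 = n1 * n2 + n1 * (n1 + 1) / 2 - R1
U2 = n1 * n2 - U1       # the two values always add up to n1*n2

# The experimental statistic is the smaller of the two.
U = min(U1, U2)
```

The identity U1 + U2 = n1·n2 is a handy check on the arithmetic.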
Understanding Statistical Calculations
In this section, the speaker explains the process of calculating R1 and R2 in statistical analysis.
Calculating R1 and R2
- The process involves giving tied values the same (average) rank and then calculating R1 and R2.
- R1 is defined as the sum of the rank orders of the first sample, while R2 is defined as the sum of the rank orders of the second sample.
- These calculations are crucial for determining experimental values used in statistical analysis.
Working with Data Samples
This part focuses on organizing and analyzing data samples for statistical testing.
Organizing Data Samples
- Data from different groups are sorted from lowest to highest values for analysis.
- Rank orders are assigned to each value within the samples to facilitate further calculations.
- Ties, where values repeat, are addressed by assigning average ranks.
Handling Ties
- Ties occur when a value appears more than once in the pooled samples, requiring adjustment by assigning average ranks.
- The calculation involves determining the mean rank for each set of tied values to ensure an accurate statistical analysis.
Calculating Experimental Values
This segment delves into calculating experimental values based on organized data samples.
Determining Experimental Values
- After assigning rank orders, values are replaced with corresponding ranks for further computations.
- Summing up rank orders provides R1 for one group and R2 for another group in preparation for experimental value calculation.
Finalizing Experimental Value Calculation
- The calculated R1 and R2 are utilized in a formula to derive an experimental value critical for hypothesis testing.
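The full sequence of steps — pooling, ranking with ties averaged, summing R1 and R2, and deriving the experimental value — can be sketched end to end; the two groups below are illustrative, not the consumption data from the lecture:

```python
# End-to-end sketch of the rank-sum procedure (toy data, assumed).
group1 = [12.0, 15.5, 9.0, 14.0]
group2 = [10.5, 8.0, 15.5, 11.0, 13.0]

pooled = sorted(group1 + group2)

def avg_rank(v):
    # Mean of the 1-based positions where v appears in the pooled, sorted
    # data; this handles ties by assigning the average rank.
    positions = [i + 1 for i, x in enumerate(pooled) if x == v]
    return sum(positions) / len(positions)

R1 = sum(avg_rank(v) for v in group1)   # rank sum of sample 1
R2 = sum(avg_rank(v) for v in group2)   # rank sum of sample 2

n1, n2 = len(group1), len(group2)
U1 = n1 * n2 + n1 * (n1 + 1) / 2 - R1
U2 = n1 * n2 + n2 * (n2 + 1) / 2 - R2
U = min(U1, U2)                          # experimental value
```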
Normal Approximation and Decision Rule
In this section, the speaker discusses statistical analysis techniques related to sample sizes and hypothesis testing.
Statistical Analysis Techniques
- The speaker explains how U is standardized: its mean under the null hypothesis, n1·n2/2, is subtracted, and the result is divided by its standard deviation, √(n1·n2·(n1+n2+1)/12). The transformed statistic approximately follows a standard normal model, which eases comparison.
- The transformed value is then compared with the critical points of the normal distribution at a 95% confidence level (5% error level). If its absolute value is less than 1.96 (the critical point), the null hypothesis of no differences in consumption is accepted.
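The standardization and decision rule can be sketched as follows; the sample sizes and U value are assumed for illustration:

```python
import math

# Normal approximation of the Mann-Whitney U (toy numbers, assumed).
n1, n2 = 20, 22
U = 205.0

mu = n1 * n2 / 2                                  # mean of U under H0
sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # std. dev. of U under H0
                                                  # (no tie correction applied)
z = (U - mu) / sigma

# Decision at the 5% error level: |z| < 1.96 -> accept H0 (no differences).
accept_H0 = abs(z) < 1.96
```

With large samples this approximation is very close; for heavy ties a corrected variance is normally used, which is omitted here for simplicity.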
Reading the Computer Output
This part delves into interpreting results based on calculated values and their significance in hypothesis testing.
Interpreting Results
- A typical computer output displays the sample sizes, mean ranks, and rank sums needed for the subsequent calculations and decisions.
- Values such as R1, R2, and U are calculated and shown in the output, supporting decisions based on the experimental data.
Normal Approximation Results
Here, the discussion centers around approximations using normal distributions and interpreting outputs for significance determination.
Normal Distribution Approximation
- Although this is an illustrative example, the normal approximation yields a value of 0.739 that agrees with the previous calculations, and the associated p-value of 0.46 indicates a non-significant result.
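The reported p-value can be recovered from the standardized value using the standard normal CDF, here via the error function:

```python
import math

z = 0.739  # normal-approximation value reported in the output

# Two-sided p-value from the standard normal CDF: p = 2 * (1 - Phi(|z|)).
phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
p_value = 2 * (1 - phi)

# p_value comes out near 0.46, matching the non-significant result:
# since p > 0.05, the null hypothesis of equal consumption is kept.
```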
Conclusion
The final segment concludes by emphasizing the absence of significant differences in meat consumption between residents of the two regions.
Conclusive Remarks
- Based on the data analysis, no significant difference in meat consumption is detected between residents of the two regions; consumption does not appear to depend on the autonomous community of residence.