class: center, middle, inverse, title-slide

# ASP 460 2.0 Special Topics in Statistics
## Higher-dimensional Displays and Special Structures
### Thiyanga Talagala
### 2020-06-10

---

## Higher-dimensional Displays and Special Structures

- Scatterplot Matrices (Sploms)

- Parallel Coordinates

- Mosaic Plots

- Small Multiples and Trellis Displays

- Time Series

---

## Scatterplot Matrices

![](lecture10_files/figure-html/unnamed-chunk-2-1.png)<!-- -->

---

![](lecture10_files/figure-html/unnamed-chunk-3-1.png)<!-- -->

---

![](lecture10_files/figure-html/unnamed-chunk-4-1.png)<!-- -->

---

```r
set.seed(2020)
x <- as.factor(c(rep(1, 100), rep(0, 100)))
y <- as.factor(sample(x))
z <- as.factor(c(rep(1, 25), rep(2, 75), rep(3, 50), rep(4, 50)))
x <- factor(x, labels=c("A", "B"))
y <- factor(y, labels=c("Yes", "No"))
z <- factor(z, labels=c("High", "Middle", "Low", "Never"))
df <- data.frame(v1=x, v2=y, v3=z)
summary(df)
```

```
 v1        v2           v3
 A:100   Yes:100   High  :25
 B:100   No :100   Middle:75
                   Low   :50
                   Never :50
```

---

```r
table(df$v1, df$v2)
```

```
    Yes No
  A  54 46
  B  46 54
```

```r
table(df$v1, df$v3)
```

```
    High Middle Low Never
  A    0      0  50    50
  B   25     75   0     0
```

```r
table(df$v2, df$v3)
```

```
      High Middle Low Never
  Yes   12     34  28    26
  No    13     41  22    24
```

---

# Mosaic Plot (1 ~ X)

.pull-left[

```
    Yes No
  A  54 46
  B  46 54
```

```
    High Middle Low Never
  A    0      0  50    50
  B   25     75   0     0
```

```
      High Middle Low Never
  Yes   12     34  28    26
  No    13     41  22    24
```

]

.pull-right[

```r
library(ggmosaic)
ggplot(df) +
  geom_mosaic(aes(x=product(v1))) +
  ggtitle("V1")
```

![](lecture10_files/figure-html/unnamed-chunk-8-1.png)<!-- -->

]

---

# Mosaic Plot (1 ~ X)

.pull-left[

```
 v1        v2           v3
 A:100   Yes:100   High  :25
 B:100   No :100   Middle:75
                   Low   :50
                   Never :50
```

```
    Yes No
  A  54 46
  B  46 54
```

```
    High Middle Low Never
  A    0      0  50    50
  B   25     75   0     0
```

```
      High Middle Low Never
  Yes   12     34  28    26
  No    13     41  22    24
```

]

.pull-right[

```r
ggplot(df) +
  geom_mosaic(aes(x=product(v1), fill=v1)) +
  ggtitle("V1")
```
![](lecture10_files/figure-html/unnamed-chunk-10-1.png)<!-- -->

]

---

# Mosaic Plot (1 ~ X)

.pull-left[

```
 v1        v2           v3
 A:100   Yes:100   High  :25
 B:100   No :100   Middle:75
                   Low   :50
                   Never :50
```

```
    Yes No
  A  54 46
  B  46 54
```

```
    High Middle Low Never
  A    0      0  50    50
  B   25     75   0     0
```

```
      High Middle Low Never
  Yes   12     34  28    26
  No    13     41  22    24
```

]

.pull-right[

```r
ggplot(df) +
  geom_mosaic(aes(x=product(v3), fill=v3)) +
  ggtitle("V3")
```

![](lecture10_files/figure-html/unnamed-chunk-12-1.png)<!-- -->

]

---

# Mosaic Plot (1 ~ Y + X)

.pull-left[

```
 v1        v2           v3
 A:100   Yes:100   High  :25
 B:100   No :100   Middle:75
                   Low   :50
                   Never :50
```

```
    Yes No
  A  54 46
  B  46 54
```

```
    High Middle Low Never
  A    0      0  50    50
  B   25     75   0     0
```

```
      High Middle Low Never
  Yes   12     34  28    26
  No    13     41  22    24
```

]

.pull-right[

```r
ggplot(df) +
  geom_mosaic(aes(x=product(v1, v2), fill=v2)) +
  ggtitle("V1 and V2*")
```

![](lecture10_files/figure-html/unnamed-chunk-14-1.png)<!-- -->

]

---

# Mosaic Plot (1 ~ Y + X)

.pull-left[

```
 v1        v2           v3
 A:100   Yes:100   High  :25
 B:100   No :100   Middle:75
                   Low   :50
                   Never :50
```

```
    Yes No
  A  54 46
  B  46 54
```

```
    High Middle Low Never
  A    0      0  50    50
  B   25     75   0     0
```

```
      High Middle Low Never
  Yes   12     34  28    26
  No    13     41  22    24
```

]

.pull-right[

```r
ggplot(df) +
  geom_mosaic(aes(x=product(v1, v2), fill=v1)) +
  ggtitle("V1* and V2")
```

![](lecture10_files/figure-html/unnamed-chunk-16-1.png)<!-- -->

]

---

# Mosaic Plot (1 ~ Y + X)

.pull-left[

```
 v1        v2           v3
 A:100   Yes:100   High  :25
 B:100   No :100   Middle:75
                   Low   :50
                   Never :50
```

```
    Yes No
  A  54 46
  B  46 54
```

```
    High Middle Low Never
  A    0      0  50    50
  B   25     75   0     0
```

```
      High Middle Low Never
  Yes   12     34  28    26
  No    13     41  22    24
```

]

.pull-right[

```r
ggplot(df) +
  geom_mosaic(aes(x=product(v1, v3), fill=v1)) +
  ggtitle("V1 and V3*")
```

![](lecture10_files/figure-html/unnamed-chunk-18-1.png)<!-- -->

]

---

# Mosaic Plot (1 ~ Y + X)

.pull-left[

```
 v1        v2           v3
 A:100   Yes:100   High  :25
 B:100   No :100   Middle:75
                   Low   :50
                   Never :50
```

```
    Yes No
  A  54 46
  B  46 54
```

```
    High Middle Low Never
  A    0      0  50    50
  B   25     75   0     0
```

```
      High Middle Low Never
  Yes   12     34  28    26
  No    13     41  22    24
```

]

.pull-right[

```r
ggplot(df) +
  geom_mosaic(aes(x=product(v3, v1), fill=v1)) +
  ggtitle("V1 and V3*")
```

![](lecture10_files/figure-html/unnamed-chunk-20-1.png)<!-- -->

]

---

# Mosaic Plot (1 ~ Y + X)

.pull-left[

```
 v1        v2           v3
 A:100   Yes:100   High  :25
 B:100   No :100   Middle:75
                   Low   :50
                   Never :50
```

```
    Yes No
  A  54 46
  B  46 54
```

```
    High Middle Low Never
  A    0      0  50    50
  B   25     75   0     0
```

```
      High Middle Low Never
  Yes   12     34  28    26
  No    13     41  22    24
```

]

.pull-right[

```r
ggplot(df) +
  geom_mosaic(aes(x=product(v3, v1), fill=v3)) +
  ggtitle("V1 and V3*")
```

![](lecture10_files/figure-html/unnamed-chunk-22-1.png)<!-- -->

]

---

# Mosaic Plot (1 ~ Y + X|Z)

.pull-left[

```
    High Middle Low Never
  A    0      0  28    26
  B   12     34   0     0
```

```
    High Middle Low Never
  A    0      0  22    24
  B   13     41   0     0
```

]

.pull-right[

```r
ggplot(df) +
  geom_mosaic(aes(x=product(v3, v1), fill=v3)) +
  ggtitle("V1 and V3*") +
  facet_wrap(~v2)
```

![](lecture10_files/figure-html/unnamed-chunk-24-1.png)<!-- -->

]

---

# Conditioning plots (Coplots)

## Trellis Graphics

- Grid graphic system

- The idea is to condition on the values taken by one or more of the variables in a data set.

- Categorical: the same plot is drawn for each level of the conditioning variable

- Numeric: the same plot is drawn for each range (interval) of the conditioning variable

---

# Trellis Graphics: Categorical

![](lecture10_files/figure-html/unnamed-chunk-25-1.png)<!-- -->

---

# Trellis Graphics: Numeric

`quakes`: magnitude of earthquakes under the Tonga Trench, to the north of New Zealand.

```r
str(quakes)
```

```
'data.frame':	1000 obs. of  5 variables:
 $ lat     : num  -20.4 -20.6 -26 -18 -20.4 ...
 $ long    : num  182 181 184 182 182 ...
 $ depth   : int  562 650 42 626 649 195 82 194 211 622 ...
 $ mag     : num  4.8 4.2 5.4 4.1 4 4 4.8 4.4 4.7 4.3 ...
 $ stations: int  41 15 43 19 11 12 43 15 35 19 ...
```

```r
library(lattice)
```

---

# Trellis Graphics: Numeric

![](lecture10_files/figure-html/unnamed-chunk-28-1.png)<!-- -->

---

.pull-left[

## Unconditional plot

![](lecture10_files/figure-html/unnamed-chunk-29-1.png)<!-- -->

]

.pull-right[

## Conditional plot

![](lecture10_files/figure-html/unnamed-chunk-30-1.png)<!-- -->

Condition on `depth`

]

---

# Coplots using ggplot

.pull-left[

![](lecture10_files/figure-html/unnamed-chunk-31-1.png)<!-- -->

]

.pull-right[

![](lecture10_files/figure-html/unnamed-chunk-32-1.png)<!-- -->

]

---

# Coplots using ggplot

.pull-left[

![](lecture10_files/figure-html/unnamed-chunk-33-1.png)<!-- -->

]

.pull-right[

![](lecture10_files/figure-html/unnamed-chunk-34-1.png)<!-- -->

]

---

# Parallel Coordinates Plots

```r
df1 <- data.frame(x=c(1, 2), y=c(20, 10), z=c(10, 10))
df1
```

```
  x  y  z
1 1 20 10
2 2 10 10
```

![](lecture10_files/figure-html/unnamed-chunk-36-1.png)<!-- -->

---

![](lecture10_files/figure-html/unnamed-chunk-37-1.png)<!-- -->

---

![](lecture10_files/figure-html/unnamed-chunk-38-1.png)<!-- -->

---

![](lecture10_files/figure-html/unnamed-chunk-39-1.png)<!-- -->

---

![](lecture10_files/figure-html/unnamed-chunk-40-1.png)<!-- -->

---

![](lecture10_files/figure-html/unnamed-chunk-41-1.png)<!-- -->

---

## ggparcoord: scales

The different types of scales are as follows:

- `std`: univariately, subtract mean and divide by standard deviation

- `robust`: univariately, subtract median and divide by median absolute deviation

- `uniminmax`: univariately, scale so the minimum of the variable is zero, and the maximum is one

- `globalminmax`: no scaling is done; the range of the graphs is defined by the global minimum and the global maximum

- `center`: use `uniminmax` to standardize vertical height, then center each variable at a value specified by the `scaleSummary` param

- `centerObs`: use `uniminmax` to standardize vertical height, then center each variable at the value of the observation specified by the `centerObsID` param

---

![](lecture10_files/figure-html/unnamed-chunk-42-1.png)<!-- -->
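
---

The `std`, `robust`, and `uniminmax` scalings listed on the `ggparcoord: scales` slide can be reproduced by hand, one column at a time. The sketch below is illustrative only. It assumes the built-in `iris` data, which is not necessarily the data behind the plots in these slides. The helper `scale_columns` is a hypothetical name introduced here. The `robust` case uses R's `mad()` with its default constant, which may differ from what `ggparcoord` does internally. The final `GGally::ggparcoord()` call runs only if GGally is installed.

```r
# Numeric columns of the built-in iris data
# (an assumption; any all-numeric data frame would do).
vars <- iris[, 1:4]

# Hypothetical helper: apply a scaling function to every column.
scale_columns <- function(data, f) as.data.frame(lapply(data, f))

# std: subtract the mean, divide by the standard deviation.
std <- scale_columns(vars, function(v) (v - mean(v)) / sd(v))

# robust: subtract the median, divide by the median absolute deviation.
# mad() applies its default scaling constant (an assumption about ggparcoord).
robust <- scale_columns(vars, function(v) (v - median(v)) / mad(v))

# uniminmax: map each variable's minimum to 0 and its maximum to 1.
uniminmax <- scale_columns(vars, function(v) (v - min(v)) / (max(v) - min(v)))

# Per-column minimum and maximum after uniminmax scaling.
sapply(uniminmax, range)

# In ggparcoord the same choice is made through the `scale` argument.
if (requireNamespace("GGally", quietly = TRUE)) {
  p <- GGally::ggparcoord(iris, columns = 1:4, groupColumn = 5,
                          scale = "uniminmax")
  print(p)
}
```

With `globalminmax` no such per-column step is applied, so variables measured on very different ranges share one vertical axis and the small-range variables appear nearly flat.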