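This book explains how to use R to read data (from files, through APIs, or by web scraping), clean and process data, perform exploratory data analysis, visualize data, present data interactively (with Shiny), and carry out data mining. It also introduces ways to interface R with the Hadoop ecosystem.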

The data mining chapter is not yet complete, and the formatting of the EPUB edition is still being fine-tuned.

To install all the packages used in this book at once, run the following code in R:
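The original listing is not reproduced in this copy. What follows is a minimal sketch rather than the author's code: it assumes the package names given in the version table later in this preface (leaving out `datasets`, which ships with R), and `install_missing` is a hypothetical helper name. Some of these packages may no longer be available from CRAN, so a few installs could fail.

```r
# Package names copied from the version table in this preface.
# "datasets" is omitted because it is bundled with R itself.
book_packages <- c(
  "ggplot2", "dplyr", "lubridate", "bit64", "bookdown", "knitr",
  "rmarkdown", "RCurl", "data.table", "stringr", "reshape2",
  "SportsAnalytics", "readr", "readxl", "httr", "jsonlite", "XML",
  "Rfacebook", "rvest", "rgdal", "rgeos", "maptools", "ggmap",
  "choroplethr", "choroplethrMaps", "WDI", "treemapify", "shiny",
  "plotly", "ggvis", "googleVis", "rpart", "rpart.plot", "fields",
  "arules", "arulesViz", "MASS", "caret", "purrr", "treemap",
  "curl", "xml2", "tidyr"
)

# Hypothetical helper: install only the packages that are not yet present.
# The `installer` argument defaults to install.packages() and exists so the
# function can be tried out without touching the network.
install_missing <- function(pkgs, installer = utils::install.packages) {
  missing <- setdiff(pkgs, rownames(utils::installed.packages()))
  if (length(missing) > 0) {
    installer(missing)
  }
  # Return the names that were passed to the installer, invisibly.
  invisible(missing)
}
```

Calling `install_missing(book_packages)` would then install whatever is absent from the current library and skip packages that are already there.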

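This book is the textbook for the Big Data Analytics Methods course (大數據分析方法) in the Department of Information Management at Chang Gung University. It can be used alongside the teaching videos on YouTube; the video playlist is given in the book's final chapter, Ch 13 教學影片資訊 (Teaching Video Information).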

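If you would like to revise the text or the examples, you are welcome to send suggestions and feedback through this link or through a GitHub issue.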

Environment in which the code in this book was run:

## R version 4.0.1 (2020-06-06)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=Chinese (Traditional)_Taiwan.950 
## [2] LC_CTYPE=Chinese (Traditional)_Taiwan.950   
## [3] LC_MONETARY=Chinese (Traditional)_Taiwan.950
## [4] LC_NUMERIC=C                                
## [5] LC_TIME=Chinese (Traditional)_Taiwan.950    
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] tidyr_1.1.0           curl_4.3              treemap_2.4-2        
##  [4] purrr_0.3.4           caret_6.0-86          lattice_0.20-41      
##  [7] MASS_7.3-51.6         arulesViz_1.3-3       arules_1.6-6         
## [10] Matrix_1.2-18         fields_10.3           maps_3.3.0           
## [13] spam_2.5-1            dotCall64_1.0-0       rpart.plot_3.0.8     
## [16] rpart_4.1-15          googleVis_0.6.6       ggvis_0.4.5          
## [19] plotly_4.9.2.1        shiny_1.4.0.2         treemapify_2.5.3     
## [22] WDI_2.7.1             choroplethrMaps_1.0.1 choroplethr_3.6.3    
## [25] acs_2.1.4             ggmap_3.0.0.902       maptools_1.0-1       
## [28] rgeos_0.5-3           rgdal_1.5-12          sp_1.4-2             
## [31] rvest_0.3.5           xml2_1.3.2            Rfacebook_0.6.15     
## [34] httpuv_1.5.4          rjson_0.2.20          XML_3.99-0.3         
## [37] jsonlite_1.6.1        httr_1.4.1            readxl_1.3.1         
## [40] readr_1.3.1           SportsAnalytics_0.2   reshape2_1.4.4       
## [43] stringr_1.4.0         data.table_1.12.8     RCurl_1.98-1.2       
## [46] rmarkdown_2.3         knitr_1.28            bookdown_0.20        
## [49] bit64_0.9-7.1         bit_1.1-15.2          lubridate_1.7.9      
## [52] dplyr_1.0.0           ggplot2_3.3.1        
## 
## loaded via a namespace (and not attached):
##   [1] uuid_0.1-4           backports_1.1.7      Hmisc_4.4-0         
##   [4] igraph_1.2.5         plyr_1.8.6           lazyeval_0.2.2      
##   [7] splines_4.0.1        gridBase_0.4-7       digest_0.6.25       
##  [10] foreach_1.5.0        htmltools_0.4.0      viridis_0.5.1       
##  [13] gdata_2.18.0         magrittr_1.5         checkmate_2.0.0     
##  [16] cluster_2.1.0        gclus_1.3.2          recipes_0.1.12      
##  [19] ggfittext_0.9.0      gower_0.2.1          jpeg_0.1-8.1        
##  [22] colorspace_1.4-1     rappdirs_0.3.1       xfun_0.14           
##  [25] crayon_1.3.4         zoo_1.8-8            survival_3.1-12     
##  [28] tigris_1.0           iterators_1.0.12     glue_1.4.1          
##  [31] registry_0.5-1       gtable_0.3.0         ipred_0.9-9         
##  [34] scales_1.1.1         DBI_1.1.0            Rcpp_1.0.4.6        
##  [37] viridisLite_0.3.0    xtable_1.8-4         htmlTable_2.0.1     
##  [40] units_0.6-7          foreign_0.8-80       Formula_1.2-3       
##  [43] stats4_4.0.1         lava_1.6.7           prodlim_2019.11.13  
##  [46] DT_0.13              vcd_1.4-7            htmlwidgets_1.5.1   
##  [49] gplots_3.0.3         RColorBrewer_1.1-2   acepack_1.4.1       
##  [52] ellipsis_0.3.1       pkgconfig_2.0.3      nnet_7.3-14         
##  [55] RJSONIO_1.3-1.4      tidyselect_1.1.0     rlang_0.4.6         
##  [58] later_1.1.0.1        visNetwork_2.0.9     munsell_0.5.0       
##  [61] cellranger_1.1.0     tools_4.0.1          generics_0.0.2      
##  [64] evaluate_0.14        fastmap_1.0.1        yaml_2.2.1          
##  [67] ModelMetrics_1.2.2.2 caTools_1.18.0       RgoogleMaps_1.4.5.3 
##  [70] dendextend_1.13.4    packrat_0.5.0        nlme_3.1-148        
##  [73] mime_0.9             compiler_4.0.1       rstudioapi_0.11     
##  [76] png_0.1-7            e1071_1.7-3          tibble_3.0.1        
##  [79] stringi_1.4.6        classInt_0.4-3       vctrs_0.3.1         
##  [82] pillar_1.4.4         lifecycle_0.2.0      lmtest_0.9-37       
##  [85] bitops_1.0-6         seriation_1.2-8      R6_2.4.1            
##  [88] latticeExtra_0.6-29  promises_1.1.1       TSP_1.1-10          
##  [91] KernSmooth_2.23-17   gridExtra_2.3        codetools_0.2-16    
##  [94] gtools_3.8.2         assertthat_0.2.1     withr_2.2.0         
##  [97] hms_0.5.3            timeDate_3043.102    class_7.3-17        
## [100] pROC_1.16.2          sf_0.9-5             scatterplot3d_0.3-41
## [103] base64enc_0.1-3
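A listing of this shape is what R's `sessionInfo()` prints (the `##` prefix is how knitr marks console output). As a minimal sketch, the following produces the equivalent description for your own machine so it can be compared with the one above; the variable name `info` is only illustrative.

```r
# Capture a description of the current R session.
info <- sessionInfo()

# Print the full listing: R version, platform, locale, and packages.
print(info)

# Individual fields can also be inspected directly.
info$R.version$version.string  # R version string of the running session
info$platform                  # platform the running R was built for
info$basePkgs                  # attached base packages
```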

Package versions used in this book:

Package Version
ggplot2 3.3.1
dplyr 1.0.0
lubridate 1.7.9
bit64 0.9-7.1
bookdown 0.20
knitr 1.28
rmarkdown 2.3
RCurl 1.98-1.2
data.table 1.12.8
stringr 1.4.0
reshape2 1.4.4
SportsAnalytics 0.2
readr 1.3.1
readxl 1.3.1
httr 1.4.1
jsonlite 1.6.1
XML 3.99-0.3
Rfacebook 0.6.15
rvest 0.3.5
rgdal 1.5-12
rgeos 0.5-3
maptools 1.0-1
ggmap 3.0.0.902
choroplethr 3.6.3
choroplethrMaps 1.0.1
WDI 2.7.1
treemapify 2.5.3
shiny 1.4.0.2
plotly 4.9.2.1
ggvis 0.4.5
googleVis 0.6.6
rpart 4.1-15
rpart.plot 3.0.8
fields 10.3
arules 1.6-6
datasets 4.0.1
arulesViz 1.3-3
MASS 7.3-51.6
caret 6.0-86
purrr 0.3.4
treemap 2.4-2
curl 4.3
xml2 1.3.2
tidyr 1.1.0
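To compare a local installation with the table above, a similar two-column listing can be built with `packageVersion()`. This is a sketch under stated assumptions, not the code that generated the table: `package_versions` is a hypothetical helper, and it stops with an error if any named package is not installed.

```r
# Hypothetical helper: return a Package/Version data frame for the
# given package names, using the versions installed locally.
package_versions <- function(pkgs) {
  data.frame(
    Package = pkgs,
    Version = vapply(
      pkgs,
      function(p) as.character(utils::packageVersion(p)),
      character(1),
      USE.NAMES = FALSE
    ),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Example with packages that ship with every R installation.
base_versions <- package_versions(c("stats", "utils", "datasets"))
print(base_versions)
```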

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Taiwan license.