Lecture 5 - Data Manipulation Advanced

Summary

#1 Data Manipulation Basics (Single Dataset)

flowchart TD
  A((dplyr)) --> AA{{selection}}
  AA -->|columns|AAA(select):::api
  AA -->|rows|AAB(filter):::api
  A -->AB{{transformation}}
  AB -->|columns|ABA(mutate):::api
  AB -->|rows|ABB(arrange):::api
  A -->AC{{aggregation}}
  AC -->ACA(summarize):::api
  ACA --o ACAA(group_by):::api
  classDef api fill:#f96,color:#fff
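
A minimal sketch of the six verbs in the chart above, assuming dplyr 1.0 or later is installed. The `sales` table and its columns are hypothetical, invented only for illustration.

```r
library(dplyr)

# Hypothetical toy data, not from the lecture
sales <- tibble(
  region = c("north", "north", "south", "south"),
  units  = c(10, 5, 8, 12),
  price  = c(2.0, 3.0, 2.5, 1.5)
)

# Selection: keep some columns (select) and some rows (filter)
big_orders <- sales %>%
  select(region, units) %>%
  filter(units >= 8)

# Transformation: add a column (mutate) and reorder rows (arrange)
with_revenue <- sales %>%
  mutate(revenue = units * price) %>%
  arrange(desc(revenue))

# Aggregation: one summary row per group (group_by + summarize)
by_region <- sales %>%
  group_by(region) %>%
  summarize(total_units = sum(units), .groups = "drop")
```

`group_by` computes nothing by itself; it only marks the groups that `summarize` then collapses, which is why the chart attaches it to `summarize`.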

#2 Joining Data (Two Datasets)

flowchart TD
  A((dplyr)) --> AD{{join}}
  AD --> ADA{{mutating}}
  ADA --> ADAA(inner):::api
  ADA --> ADAB{{outer}}
  ADAB --> ADAC(left):::api
  ADAB --> ADAD(right):::api
  ADAB --> ADAE(full):::api
  AD --> ADB{{filtering}}
  ADB --> ADBA(semi):::api
  ADB --> ADBB(anti):::api
  classDef api fill:#f96,color:#fff
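
A short sketch of the joins in the chart above, using dplyr's `*_join` functions. The `customers` and `orders` tables are hypothetical toy data sharing an `id` key.

```r
library(dplyr)

# Hypothetical toy tables, not from the lecture
customers <- tibble(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
orders    <- tibble(id = c(2, 3, 4), amount = c(50, 75, 20))

# Mutating joins: add columns from the second table
inner <- inner_join(customers, orders, by = "id")  # ids 2, 3
left  <- left_join(customers, orders, by = "id")   # ids 1, 2, 3 (amount is NA for id 1)
right <- right_join(customers, orders, by = "id")  # ids 2, 3, 4 (name is NA for id 4)
full  <- full_join(customers, orders, by = "id")   # ids 1, 2, 3, 4

# Filtering joins: keep or drop rows of the first table, add no columns
matched   <- semi_join(customers, orders, by = "id")  # ids 2, 3
unmatched <- anti_join(customers, orders, by = "id")  # id 1
```

Note that dplyr has no `outer_join` function: left, right and full joins are the three kinds of outer join.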

#3 Pivoting Data (Reshaping a Dataset)

flowchart TD
  A((tidyr)) -->B{{reshaping}}
  B --> C(pivot_wider):::api
  B --> D(pivot_longer):::api
  classDef api fill:#f96,color:#fff
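
A small round-trip sketch of the two reshaping functions, assuming tidyr 1.0 or later. The `wide` table of scores is hypothetical.

```r
library(tidyr)

# Hypothetical wide table: one row per student, one column per subject
wide <- data.frame(
  student = c("a", "b"),
  math    = c(90, 70),
  art     = c(80, 85)
)

# pivot_longer: stack the subject columns into name/value pairs
long <- pivot_longer(
  wide,
  cols      = c(math, art),
  names_to  = "subject",
  values_to = "score"
)

# pivot_wider: spread the pairs back into one column per subject
wide_again <- pivot_wider(long, names_from = subject, values_from = score)
```

Here `long` has one row per student and subject, and `wide_again` recovers the original layout, so in this example the two functions act as inverses.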