blog/content/cheatsheet.md

98 lines
3.9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: bovender's personal cheat sheet
description: >
On this page, I collect bits and pieces of information that I
never seem to be able to memorize. This stuff is helpful for me
and maybe it is helpful for you, too, internet wanderer.
date: 2024-11-05T08:00:00+0100
draft: false
---
{{< git-info >}}
{{< param "description" >}}
## R
### Print R objects so that they can be re-generated from code
```r
# dput takes an R object
dput(x, file = "", control = c("keepNA", "keepInteger", "niceNames", "showAttributes"))
dget(file, keep.source = FALSE)
# dump takes a list of R object names
dump(list, file = "dumpdata.R", append = FALSE, control = "all", envir = parent.frame(), evaluate = TRUE)
```
Example:
``` r
library(dplyr)
my_data <-
starwars |>
select(species, homeworld)
dput(my_data)
#> structure(list(species = c("Human", "Droid", "Droid", "Human",
#> "Human", "Human", "Human", "Droid", "Human", "Human", "Human",
#> "Human", "Wookiee", "Human", "Rodian", "Hutt", "Human", NA, "Yoda's species",
#> "Human", "Human", "Droid", "Trandoshan", "Human", "Human", "Mon Calamari",
#> "Human", "Human", "Ewok", "Sullustan", "Human", "Neimodian",
#> "Human", "Human", "Gungan", "Gungan", "Gungan", "Human", "Toydarian",
#> "Dug", "Human", "Human", "Zabrak", "Twi'lek", "Twi'lek", "Aleena",
#> "Vulptereen", "Xexto", "Toong", "Human", "Cerean", "Nautolan",
#> "Zabrak", "Tholothian", "Iktotchi", "Quermian", "Kel Dor", "Chagrian",
#> NA, NA, "Human", "Geonosian", "Mirialan", "Mirialan", "Human",
#> "Human", "Human", "Human", "Clawdite", "Besalisk", "Kaminoan",
#> "Kaminoan", "Human", "Droid", "Skakoan", "Muun", "Togruta", "Kaleesh",
#> "Wookiee", "Human", NA, "Pau'an", "Human", "Human", "Human",
#> "Droid", "Human"), homeworld = c("Tatooine", "Tatooine", "Naboo",
#> "Tatooine", "Alderaan", "Tatooine", "Tatooine", "Tatooine", "Tatooine",
#> "Stewjon", "Tatooine", "Eriadu", "Kashyyyk", "Corellia", "Rodia",
#> "Nal Hutta", "Corellia", "Bestine IV", NA, "Naboo", "Kamino",
#> NA, "Trandosha", "Socorro", "Bespin", "Mon Cala", "Chandrila",
#> NA, "Endor", "Sullust", NA, "Cato Neimoidia", "Coruscant", "Naboo",
#> "Naboo", "Naboo", "Naboo", "Naboo", "Toydaria", "Malastare",
#> "Naboo", "Tatooine", "Dathomir", "Ryloth", "Ryloth", "Aleen Minor",
#> "Vulpter", "Troiken", "Tund", "Haruun Kal", "Cerea", "Glee Anselm",
#> "Iridonia", "Coruscant", "Iktotch", "Quermia", "Dorin", "Champala",
#> "Naboo", "Naboo", "Tatooine", "Geonosis", "Mirial", "Mirial",
#> "Naboo", "Serenno", "Alderaan", "Concord Dawn", "Zolan", "Ojom",
#> "Kamino", "Kamino", "Coruscant", NA, "Skako", "Muunilinst", "Shili",
#> "Kalee", "Kashyyyk", "Alderaan", "Umbara", "Utapau", NA, NA,
#> NA, NA, NA)), row.names = c(NA, -87L), class = c("tbl_df", "tbl",
#> "data.frame"))
```
<sup>Created on 2024-11-05 with [reprex v2.1.1](https://reprex.tidyverse.org)</sup>
### Calculate percentages of subgroups
Principle: `Group` the dataset and `summarize` by counting (`n()`), drop the last group
while doing so, then `mutate` the dataset by adding the count divided by the sum of counts.
Hint to better understand the example: Look at the sexes in black-eyed subjects.
``` r
library(dplyr)
starwars |>
group_by(eye_color, gender) |>
summarize(n = n(), .groups = "drop_last") |>
mutate(p = n/sum(n))
#> # A tibble: 23 × 4
#> # Groups: eye_color [15]
#> eye_color gender n p
#> <chr> <chr> <int> <dbl>
#> 1 black feminine 2 0.2
#> 2 black masculine 8 0.8
#> 3 blue feminine 6 0.316
#> 4 blue masculine 12 0.632
#> 5 blue <NA> 1 0.0526
#> 6 blue-gray masculine 1 1
#> 7 brown feminine 4 0.190
#> 8 brown masculine 15 0.714
#> 9 brown <NA> 2 0.0952
#> 10 dark masculine 1 1
#> # 13 more rows
```
<sup>Created on 2024-11-05 with [reprex v2.1.1](https://reprex.tidyverse.org)</sup>