From 8d15e4da526f07d4952283c19dc3b5ec853a546a Mon Sep 17 00:00:00 2001
From: "Edward A. Roualdes"
Date: Wed, 31 Jan 2024 14:32:54 -0800
Subject: [PATCH] fix issue 08

---
 week-02.md | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/week-02.md b/week-02.md
index aebbfe1..ee6cfde 100644
--- a/week-02.md
+++ b/week-02.md
@@ -91,7 +91,7 @@ The plotting package plotnine, by default, includes `NaN`s as its own category,
 which can be undesirable.
 
 ```{code-cell}
-p = pn.ggplot(data = msleep) + pn.geom_bar(pn.aes("conservation"))
+p = pn.ggplot(data = msleep) + pn.geom_bar(pn.aes(x = "conservation"))
 p.draw()
 ```
 
@@ -104,7 +104,7 @@ you do care about.
 
 ```{code-cell}
 df = msleep.dropna(subset = "conservation")
-p = pn.ggplot(data = df) + pn.geom_bar(pn.aes("conservation"))
+p = pn.ggplot(data = df) + pn.geom_bar(pn.aes(x = "conservation"))
 p.draw()
 ```
 
@@ -280,6 +280,13 @@ msleep["smrt"] = msleep["smrt"].cat.remove_unused_categories()
 msleep["smrt"]
 ```
 
+The function `remove_unused_categories()` is a safe bet, because no used
+category will be removed. Alternatively, the function
+[`remove_categories([...])`](https://pandas.pydata.org/docs/reference/api/pandas.Series.cat.remove_categories.html#pandas.Series.cat.remove_categories)
+will remove any specified categories, whether or not they are used. The
+function documentation warns "Values which were in the removed categories will
+be set to NaN".
+
 ```{seealso}
 Week 02 Assignment
 ```