These Languages are Accumulating

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2024/11/28/these-languages-are-accumulating/

I keep saying that the more programming languages you know, the more you will
understand all the others you know – I’m now at the point where I want to solve
every problem I see in a handful of different languages. They all offer
different functionality, and some are certainly more suited to particular
problems than others, but there’s a world of difference between two characters
and importing from two libraries.

A newsletter I follow (and can’t find online copies of) that demonstrates neat
things in python (gotta learn it, despite not loving it) recently covered
accumulate, showing that sum and accumulate were sort of related

>>> my_list = [42, 73, 0, 16, 10]
>>> sum(my_list)
141
>>> from itertools import accumulate
>>> list(accumulate(my_list))
[42, 115, 115, 131, 141]

sum adds up all the elements of the list, while accumulate does the same but
keeps each successive partial sum.
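
To make that relationship concrete, here is a small check (just a sketch, reusing `my_list` from above) that the last partial sum from `accumulate` is exactly what `sum` returns:

```python
from itertools import accumulate

my_list = [42, 73, 0, 16, 10]

# accumulate yields every running total; sum only reports the final one.
partial_sums = list(accumulate(my_list))

print(partial_sums)                      # [42, 115, 115, 131, 141]
print(partial_sums[-1] == sum(my_list))  # True
```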

It rounds out the demo with an alternative function being used in accumulate

>>> from itertools import accumulate
>>> from operator import mul  # mul(2, 3) == 6
>>> initial_investment = 1000
>>> rates = [1.01, 1.01, 1.02, 1.025, 1.035, 1.035, 1.06]
>>> list(
...     accumulate(rates, mul, initial=initial_investment)
... )
[1000, 1010.0, 1020.1, 1040.502, 1066.515, 1103.843, 1142.477, 1211.026]

Now, firstly… from operator import mul??? It looks like there’s no way to
pass * as an argument to a function. I could define a function that performs
the same on known arguments, e.g. lambda x, y: x * y

>>> list(accumulate(rates, lambda x, y: x*y, initial=initial_investment))
[1000, 1010.0, 1020.1, 1040.502, 1066.5145499999999, 1103.8425592499998, 1142.4770488237498, 1211.0256717531747]

but… ew.

It’s possible that there’s a different way to approach this. A list
comprehension comes to mind, e.g. something like

>>> [sum(my_list[0:i]) for i in range(1, len(my_list)+1)]
[42, 115, 115, 131, 141]

but that requires performing a sum for each sub-interval, so performance would
not scale well (admittedly, that was not a consideration here at all). I also
don’t believe there’s a built-in prod so one must import math in order to do
similar

>>> import math
>>> x = [initial_investment] + rates
>>> [math.prod(x[0:i]) for i in range(1, len(x)+1)]
[1000, 1010.0, 1020.1, 1040.502, 1066.5145499999999, 1103.8425592499998, 1142.4770488237498, 1211.0256717531747]
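
For comparison, a single-pass version avoids recomputing every prefix. The sketch below uses a hypothetical helper, `running`, which is not part of the standard library; it simply carries the previous result forward, which is roughly the behaviour the `itertools` documentation describes for `accumulate`.

```python
from itertools import accumulate
from operator import mul


def running(func, values):
    """Yield the running result of applying func across values in one pass."""
    iterator = iter(values)
    try:
        total = next(iterator)  # the first element seeds the running result
    except StopIteration:
        return                  # empty input: nothing to yield
    yield total
    for value in iterator:
        total = func(total, value)  # reuse the previous result
        yield total


initial_investment = 1000
rates = [1.01, 1.01, 1.02, 1.025, 1.035, 1.035, 1.06]

balances = list(running(mul, [initial_investment] + rates))

# Same multiplications in the same order as the standard library version.
print(balances == list(accumulate(rates, mul, initial=initial_investment)))  # True
```

Each element costs one call to `func`, rather than a fresh `sum` or `math.prod` over an ever-longer slice.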

In R that could use the built-in cumprod for the cumulative product

initial_investment <- 1000
rates = c(1.01, 1.01, 1.02, 1.025, 1.035, 1.035, 1.06)

cumprod(c(initial_investment, rates))
## [1] 1000.000 1010.000 1020.100 1040.502 1066.515 1103.843 1142.477 1211.026

but that has the ‘multiply’ operation hardcoded. cumsum uses + as the
function… hmmm. Maybe R doesn’t have a generalised accumulate?

I’ve been playing around with Haskell lately, so recursive functions to the
rescue! One feature of recursive functions in R that I really like is Recall
which calls the function in which it is defined with a new set of arguments –
perfect for recursion!

accumulate_recall <- function(x, f, i=x[1]) {
  if (!length(x)) return(NULL)
  c(i, Recall(tail(x, -1), f, f(i, x[2])))
}

It’s also robust against renaming the function; the body doesn’t actually call
accumulate_recall by name at all.

This might be inefficient, though – it’s not uncommon to blow out the stack, so
a new Tailcall function (which doesn’t have the same elegance of being robust
against renaming) helps with flagging this as something that can be optimised

accumulate <- function(x, f, i=x[1]) {
  if (!length(x)) return(NULL)
  c(i, Tailcall(accumulate, tail(x, -1), f, f(i, x[2])))
}

With this, I can emulate the cumsum() and cumprod() functions

cumprod(1:6)
## [1]   1   2   6  24 120 720
accumulate(1:6, `*`)
## [1]   1   2   6  24 120 720
cumsum(2:6)
## [1]  2  5  9 14 20
accumulate(2:6, `+`)
## [1]  2  5  9 14 20

unless I try to calculate something too big…

cumprod(5:15)
##  [1]           5          30         210        1680       15120      151200
##  [7]     1663200    19958400   259459200  3632428800 54486432000
accumulate(5:15, `*`)
## Warning in f(i, x[2]): NAs produced by integer overflow
##  [1]         5        30       210      1680     15120    151200   1663200
##  [8]  19958400 259459200        NA        NA

It appears that the built-in functions convert to numeric. That’s easily fixed
on input

accumulate(as.numeric(5:15), `*`)
##  [1]           5          30         210        1680       15120      151200
##  [7]     1663200    19958400   259459200  3632428800 54486432000
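
For contrast, the same running product in Python needs no conversion, because Python integers are arbitrary precision rather than 32-bit. A quick sketch:

```python
from itertools import accumulate
from operator import mul

# 5 * 6 * ... * 15, keeping every intermediate product
products = list(accumulate(range(5, 16), mul))

print(products[-2:])              # [3632428800, 54486432000]
print(products[-2] > 2**31 - 1)   # True: beyond the largest 32-bit signed integer
```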

In any case, there’s a generalised accumulate that takes the bare functions as
arguments.

But it can be so much cleaner than this!

In APL you won’t find any function named “sum” because it is just a reduction
(Reduce in R) with the function +

      sum←+/
      
      sum ⍳6 ⍝ sum the values 1:6
21

      sum 1↓⍳6 ⍝ sum the values 2:6
20

which in R is

sum(1:6)
## [1] 21
sum(2:6)
## [1] 20

Why would you write sum if you can just use +/? It’s fewer
characters to write out the implementation than the name!

For accumulate the terminology in APL is scan, which uses a very similar
glyph because the operation itself is very similar; a reduce (/) is just the
last value of a scan (\), which keeps the progressive values. In both cases,
the operator (either slash) takes a binary function as the left argument and
produces a modified function (in these examples, effectively sum and prod),
which is then applied to values on the right. The scan version does the same

      +\⍳6
1 3 6 10 15 21

      ×\⍳6
1 2 6 24 120 720

accumulate(1:6, `+`)
## [1]  1  3  6 10 15 21
accumulate(1:6, `*`)
## [1]   1   2   6  24 120 720

As for the rates example above, we concatenate the initial value with catenate
(,) just like the R example, but otherwise this works fine

      rates ← 1.01 1.01 1.02 1.025 1.035 1.035 1.06
      inv ← 1000
      
      ×/inv, rates
1211.025672

      ×\inv, rates
1000 1010 1020.1 1040.502 1066.51455 1103.842559 1142.477049 1211.025672

So all of that recursive R code made to generalise the cumulative application of
a function provided as an argument is boiled down to just the single glyph \.
Outstanding!

What’s more, there are lots of binary functions one would use this with, all
of which have spelled-out names in other languages

      +/ ⍝ sum (add)
      ×/ ⍝ prod (multiply)
      ∧/ ⍝ all (and)
      ∨/ ⍝ any (or)
      ⌈/ ⍝ maximum (max)
      ⌊/ ⍝ minimum (min)
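
For reference, Python spells these out as separate names, but each one can be recovered as a reduction. A minimal sketch using `functools.reduce` from the standard library (the sample `values` and `flags` are made up):

```python
from functools import reduce
from operator import add, mul

values = [1, 2, 3, 4, 5, 6]
flags = [True, False, True]

# Each named built-in is a reduction with a particular binary function.
print(reduce(add, values) == sum(values))                 # True (both 21)
print(reduce(mul, values))                                # 720
print(reduce(lambda a, b: a and b, flags) == all(flags))  # True (both False)
print(reduce(lambda a, b: a or b, flags) == any(flags))   # True (both True)
print(reduce(max, values) == max(values))                 # True (both 6)
print(reduce(min, values) == min(values))                 # True (both 1)
```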

In summary, looking across these languages, the available options
range from a single glyph for scan along with the bare binary operator, e.g.
×\; a cumprod() function which isn’t well-generalised but works out of the
box; and then there’s whatever mess this is (once you’ve installed these)

>>> from itertools import accumulate
>>> from operator import mul
>>> list(accumulate(rates, mul, initial=initial_investment))

Where did we go so wrong?

For what it’s worth, Julia has a reduce and an accumulate that behave very
nicely; generalised for the binary function as an argument

julia> reduce(+, 1:6)
21

julia> reduce(*, 1:6)
720

julia> accumulate(+, 1:6)
6-element Vector{Int64}:
  1
  3
  6
 10
 15
 21

julia> accumulate(*, 1:6)
6-element Vector{Int64}:
   1
   2
   6
  24
 120
 720

This is extremely close to the APL approach, but with longer worded names for
the reduce and scan operators. It also defines the more convenient sum,
prod, cumsum, and cumprod; no shortage of ways to do this in Julia!

In Haskell, foldl and scanl are the (left-associative) versions of reduce
and accumulate, and passing an infix operator as an argument necessitates
wrapping it in parentheses

ghci> foldl (+) 0 [1..6]
21

ghci> scanl (+) 0 [1..6]
[0,1,3,6,10,15,21]

ghci> foldl (*) 1 [1..6]
720

ghci> scanl (*) 1 [1..6]
[1,1,2,6,24,120,720]

This requires an explicit starting value, unless one uses the specialised
versions which use the first value as an initial value

ghci> foldl1 (+) [1..6]
21

ghci> scanl1 (+) [1..6]
[1,3,6,10,15,21]

ghci> foldl1 (*) [1..6]
720

ghci> scanl1 (*) [1..6]
[1,2,6,24,120,720]
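
The same distinction shows up in Python: `accumulate` without `initial` behaves like scanl1, and with `initial` it behaves like scanl, while `functools.reduce` takes an optional starting value as its third argument. A short sketch mirroring the Haskell calls above:

```python
from functools import reduce
from itertools import accumulate
from operator import add, mul

xs = range(1, 7)  # the values 1 to 6, like [1..6]

print(reduce(add, xs, 0))                    # 21, like foldl (+) 0 [1..6]
print(list(accumulate(xs, add, initial=0)))  # [0, 1, 3, 6, 10, 15, 21], like scanl (+) 0
print(reduce(mul, xs))                       # 720, like foldl1 (*)
print(list(accumulate(xs, mul)))             # [1, 2, 6, 24, 120, 720], like scanl1 (*)
```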

I started this post hoping to demonstrate how nice the APL syntax was for this,
but the detour through generalising the R function was a lot of unexpected fun
as well.

Comments, improvements, or your own solutions are most welcome. I can be found
on Mastodon or use the comments below.

Addendums

It should probably be noted that R does have a function scan but it’s for
reading data into a vector – if you ever spot someone using it for that… run.
I have war stories about that function.

I’d love to hear how this is accomplished in some other languages, too – do they
have a built-in accumulate that takes a binary function?

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.4.1 (2024-06-14)
##  os       macOS Sonoma 14.6
##  system   aarch64, darwin20
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Australia/Adelaide
##  date     2024-11-28
##  pandoc   3.5 @ /opt/homebrew/bin/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version    date (UTC) lib source
##  blogdown      1.19       2024-02-01 [1] CRAN (R 4.4.0)
##  bookdown      0.41       2024-10-16 [1] CRAN (R 4.4.1)
##  bslib         0.8.0      2024-07-29 [1] CRAN (R 4.4.0)
##  cachem        1.1.0      2024-05-16 [1] CRAN (R 4.4.0)
##  cli           3.6.3      2024-06-21 [1] CRAN (R 4.4.0)
##  devtools      2.4.5      2022-10-11 [1] CRAN (R 4.4.0)
##  digest        0.6.37     2024-08-19 [1] CRAN (R 4.4.1)
##  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.4.0)
##  evaluate      1.0.1      2024-10-10 [1] CRAN (R 4.4.1)
##  fastmap       1.2.0      2024-05-15 [1] CRAN (R 4.4.0)
##  fs            1.6.5      2024-10-30 [1] CRAN (R 4.4.1)
##  glue          1.8.0      2024-09-30 [1] CRAN (R 4.4.1)
##  htmltools     0.5.8.1    2024-04-04 [1] CRAN (R 4.4.0)
##  htmlwidgets   1.6.4      2023-12-06 [1] CRAN (R 4.4.0)
##  httpuv        1.6.15     2024-03-26 [1] CRAN (R 4.4.0)
##  jquerylib     0.1.4      2021-04-26 [1] CRAN (R 4.4.0)
##  jsonlite      1.8.9      2024-09-20 [1] CRAN (R 4.4.1)
##  knitr         1.48       2024-07-07 [1] CRAN (R 4.4.0)
##  later         1.3.2      2023-12-06 [1] CRAN (R 4.4.0)
##  lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.4.0)
##  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.4.0)
##  memoise       2.0.1      2021-11-26 [1] CRAN (R 4.4.0)
##  mime          0.12       2021-09-28 [1] CRAN (R 4.4.0)
##  miniUI        0.1.1.1    2018-05-18 [1] CRAN (R 4.4.0)
##  pkgbuild      1.4.5      2024-10-28 [1] CRAN (R 4.4.1)
##  pkgload       1.4.0      2024-06-28 [1] CRAN (R 4.4.0)
##  profvis       0.4.0      2024-09-20 [1] CRAN (R 4.4.1)
##  promises      1.3.0      2024-04-05 [1] CRAN (R 4.4.0)
##  purrr         1.0.2      2023-08-10 [1] CRAN (R 4.4.0)
##  R6            2.5.1      2021-08-19 [1] CRAN (R 4.4.0)
##  Rcpp          1.0.13-1   2024-11-02 [1] CRAN (R 4.4.1)
##  remotes       2.5.0.9000 2024-11-03 [1] Github (r-lib/remotes@5b7eb08)
##  rlang         1.1.4      2024-06-04 [1] CRAN (R 4.4.0)
##  rmarkdown     2.28       2024-08-17 [1] CRAN (R 4.4.0)
##  rstudioapi    0.17.1     2024-10-22 [1] CRAN (R 4.4.1)
##  sass          0.4.9      2024-03-15 [1] CRAN (R 4.4.0)
##  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.4.0)
##  shiny         1.9.1      2024-08-01 [1] CRAN (R 4.4.0)
##  urlchecker    1.0.1      2021-11-30 [1] CRAN (R 4.4.0)
##  usethis       3.0.0      2024-07-29 [1] CRAN (R 4.4.0)
##  vctrs         0.6.5      2023-12-01 [1] CRAN (R 4.4.0)
##  xfun          0.49       2024-10-31 [1] CRAN (R 4.4.1)
##  xtable        1.8-4      2019-04-21 [1] CRAN (R 4.4.0)
##  yaml          2.3.10     2024-07-26 [1] CRAN (R 4.4.0)
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────

How to Create a Julia Package from Scratch

By: Great Lakes Consulting

Re-posted from: https://blog.glcs.io/package-creation

This post was written by Steven Whitaker.

The Julia programming language is a high-level language that is known, at least in part, for its excellent package manager and outstanding composability. (See another blog post that illustrates this composability.)

Julia makes it super easy for anybody to create their own package. Julia’s package manager enables easy development and testing of packages. The ease of package development encourages developers to split reusable chunks of code into individual packages, further enhancing Julia’s composability.

In this post, we will learn what comprises a Julia package. We will also discuss tools that automate the creation of packages. Finally, we will talk about the basics of package development and walk through how to publish (register) a package for others to use.

This post assumes you are comfortable navigating the Julia REPL. If you need a refresher, check out our post on the Julia REPL.

Components of a Package

Packages are easy enough to use: just install them with add PkgName in the package prompt and then run using PkgName in the julia prompt. But what actually goes into a package?

Packages must follow a specific directory structure and include certain information to be recognized as a package by Julia.

Suppose we are creating a package called PracticePackage.jl. First, we create a directory called PracticePackage. This directory is the package root. Within the root directory we need a file called Project.toml and another directory called src.

The Project.toml requires the following information:

name = "PracticePackage"
uuid = "11111111-2222-3333-aaaa-bbbbbbbbbbbb"
authors = ["Your Name <youremail@email.com>"]
version = "0.1.0"

  • uuid stands for universally unique identifier, and can be generated in Julia with using UUIDs; uuid4(). The purpose of a UUID is to allow different packages of the same name to coexist.
  • version should be set to whatever version is appropriate for your package, typically "0.1.0" or "1.0.0" for an initial release. The versioning of Julia packages follows SemVer.
  • The Project.toml will also include information about package dependencies, but more on that later.

The src directory requires one Julia file named PracticePackage.jl that defines a module named PracticePackage:

module PracticePackage

# Package code goes here.

end

So, the directory structure of the package looks like the following:

PracticePackage
├── Project.toml
└── src
    └── PracticePackage.jl

And that’s all there is to a package! (Well, at least minimally.)
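
To make the layout concrete, here is a sketch in Python (not Julia, and not how Julia itself creates packages) that writes the minimal files described above into a temporary directory. The helper name `write_minimal_package` and the author string are made up for illustration; the UUID is freshly generated with Python's `uuid.uuid4()`.

```python
import tempfile
import uuid
from pathlib import Path


def write_minimal_package(parent: Path, name: str) -> Path:
    """Create <parent>/<name> containing Project.toml and src/<name>.jl."""
    root = parent / name
    (root / "src").mkdir(parents=True)

    # The four fields listed above; the UUID is random on every run.
    project_toml = (
        f'name = "{name}"\n'
        f'uuid = "{uuid.uuid4()}"\n'
        'authors = ["Your Name <youremail@email.com>"]\n'
        'version = "0.1.0"\n'
    )
    (root / "Project.toml").write_text(project_toml)

    # The module name matches both the file name and the name field.
    module_source = f"module {name}\n\n# Package code goes here.\n\nend\n"
    (root / "src" / f"{name}.jl").write_text(module_source)
    return root


with tempfile.TemporaryDirectory() as tmp:
    root = write_minimal_package(Path(tmp), "PracticePackage")
    created = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))
    project_text = (root / "Project.toml").read_text()

print(created)  # ['Project.toml', 'src', 'src/PracticePackage.jl']
```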

Some Technicalities

Feel free to skip this section, but if you are curious about some technicalities for what comprises a valid package, read on.

  • The Project.toml only needs the name and uuid fields for Julia to recognize the package. Without the version field, Julia treats the version as v0.0.0.
    • However, the version and authors fields are needed to register the package.
  • The name of the package root directory doesn’t matter, meaning it doesn’t have to match the package name. However, the name field in Project.toml does have to match the name of the module defined in src/PracticePackage.jl, and the file name of src/PracticePackage.jl also has to match.
    • For example, we could change the name of the package by setting name = "Oops" in Project.toml, renaming src/PracticePackage.jl to src/Oops.jl, and defining module Oops in that file. We would not have to rename the package root directory from PracticePackage to Oops (though that would be a good idea to avoid confusion).
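
The naming rule above can be expressed as a small check. The following Python sketch is only an illustration: `check_package_names` is a hypothetical helper, it uses regular expressions rather than a real TOML or Julia parser, and it is not the check Julia itself performs.

```python
import re
import tempfile
from pathlib import Path


def check_package_names(root):
    """Return a list of naming problems found under the package root."""
    project_text = (root / "Project.toml").read_text()
    match = re.search(r'^name\s*=\s*"([^"]+)"', project_text, flags=re.MULTILINE)
    if match is None:
        return ["Project.toml has no name field"]
    name = match.group(1)

    # The source file must be named after the package, not after the directory.
    source_file = root / "src" / f"{name}.jl"
    if not source_file.is_file():
        return [f"missing src/{name}.jl"]

    problems = []
    module_pattern = rf"^\s*module\s+{re.escape(name)}\b"
    if re.search(module_pattern, source_file.read_text(), flags=re.MULTILINE) is None:
        problems.append(f"src/{name}.jl does not define module {name}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "PracticePackage"  # the root directory name may differ
    (root / "src").mkdir(parents=True)
    (root / "Project.toml").write_text(
        'name = "Oops"\nuuid = "11111111-2222-3333-aaaa-bbbbbbbbbbbb"\n'
    )
    (root / "src" / "Oops.jl").write_text("module Oops\nend\n")
    consistent = check_package_names(root)

    # Change only the name field: the file and module no longer match it.
    (root / "Project.toml").write_text(
        'name = "PracticePackage"\nuuid = "11111111-2222-3333-aaaa-bbbbbbbbbbbb"\n'
    )
    inconsistent = check_package_names(root)

print(consistent)    # []
print(inconsistent)  # ['missing src/PracticePackage.jl']
```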

Automatically Generating Packages

The basic structure of a package is pretty simple, so there ought to be a way to automate it, right? (I mean, who wants to manually generate a UUID?) Good news: package creation can be automated!

Package generate Command

Julia comes with a generate package command built-in. First, change directories to where the package root directory should live, then run generate in the Julia package prompt:

pkg> generate PracticePackage

This command creates the package root directory PracticePackage and the Project.toml and src/PracticePackage.jl files. Some notes:

  • The Project.toml is pre-filled with the correct fields and values, including an automatically generated UUID. When I ran generate on my computer, it also pre-filled the authors field with my name and email from my ~/.gitconfig file.
  • src/PracticePackage.jl is pre-filled with a definition for the module PracticePackage. It also defines a function greet in the module, but typically you will replace that with your own code.

PkgTemplates.jl

The generate command works fine, but it’s barebones. For example, if you are planning on hosting your package on GitHub, you might want to include a GitHub Action for continuous integration (CI), so it would be nice to automate the creation of the appropriate .yml file. This is where PkgTemplates.jl comes in.

PkgTemplates.jl is a normal Julia package, so install it as usual and run using PkgTemplates. Then we can create our PracticePackage.jl:

t = Template(; dir = ".")
t("PracticePackage")

Running this code creates the package with the following directory structure:

PracticePackage
├── .git
├── .github
│   ├── dependabot.yml
│   └── workflows
│       ├── CI.yml
│       ├── CompatHelper.yml
│       └── TagBot.yml
├── .gitignore
├── LICENSE
├── Manifest.toml
├── Project.toml
├── README.md
├── src
│   └── PracticePackage.jl
└── test
    └── runtests.jl

As you can see, PkgTemplates.jl automatically generates a lot of files that aid in following package development best practices, like adding CI and tests.

Note that many options can be supplied to Template to customize what files are generated. See the PkgTemplates.jl docs for all the options.

Basic Package Development

Once your package is set up, the next step is to actually add code. Add the functions, types, constants, etc. that your package needs directly in the PracticePackage module in src/PracticePackage.jl, or add additional files in the src directory and include them in the module. (See a previous blog post for more information about modules, though note that using modules directly works slightly differently than using packages.)

To add dependencies for your package to use, you will need to activate your project’s package environment and then add packages. For example, if you want your package to use the DataFrames.jl package, start Julia and navigate to your package root directory. Then, activate the package environment and add the package:

(@v1.X) pkg> activate .
(PracticePackage) pkg> add DataFrames

After this, you will be able to include using DataFrames in your package code to enable the functionality provided by DataFrames.jl.

Adding packages after activating the package environment edits the package’s Project.toml file. It adds a [deps] section that lists the added packages and their UUIDs. In the example above, adding DataFrames.jl adds the following lines to the Project.toml file:

[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"

(And (PracticePackage) pkg> rm DataFrames would remove the DataFrames = ... line, so it is best not to edit the [deps] section manually.)

Finally, to try out your package, activate your package environment (as above) and then load your package as usual:

julia> using PracticePackage # No need to `add PracticePackage` first.

Note that by default Julia will have to be restarted to reload any changes you make to your package code. If you want to avoid restarting Julia whenever you make changes, check out Revise.jl.

Publishing/Registering a Package

Once your package is in working order, it is natural to want to publish the package for others to use.

A package can be published by registering it in a package registry, which basically is a map that tells the Julia package manager where to find a package so it can be downloaded.

[Image: treasure map]

The General registry is the largest registry as well as the default registry used by Julia; most, if not all, of the most popular open-source packages (DataFrames.jl, Plots.jl, StaticArrays.jl, ModelingToolkit.jl, etc.) exist in General. Once a package is registered in General, it can be installed with pkg> add PracticePackage.

(Note that if registering a package is not desired for some reason, a package can be added via URL, e.g., pkg> add https://github.com/username/PracticePackage.jl, assuming the package is in a public git repository. However, the package manager has limited ability to manage packages added in this way; in particular, managing package versions must be done manually.)

The most common way to register a package in General is to use Registrator.jl as a GitHub App. See the README for detailed instructions, but the process basically boils down to:

  1. Write/test package code.
  2. Update the version field in the Project.toml (e.g., to "0.1.0" or "1.0.0" for the first registered version).
  3. Add a comment with @JuliaRegistrator register to the latest commit that should be included in the registered version of the package.

Note that there are additional steps for preparing a package for publishing that we did not discuss in this post (such as specifying compatible versions of Julia and package dependencies). Refer to the General registry’s documentation and links therein for details.

Summary

In this post, we discussed creating Julia packages. We learned what comprises a package, how to automate package creation, and how to register a package in Julia’s General registry.

What package development tips do you have? Let us know in the comments below!

Additional Links

Cover image background provided by www.proflowers.com at https://www.flickr.com/photos/127365614@N08/16011252136.

Treasure map image source: https://openclipart.org/detail/299283/x-marks-the-spot

The Chart Missing From ALL Spreadsheet Software

By: DSB

Re-posted from: https://davisbarreira.medium.com/the-chart-missing-from-all-spreadsheet-software-b7fede90d634?source=rss-8bd6ec95ab58------2

And how to implement it in Julia

This chart was discussed in a recent video on the Minute Physics YouTube channel. It is actually quite simple, consisting of a floating stacked bar chart. The idea of such a plot is that we can use it to visualize ranges of values, while assigning a color to represent each range.

Despite its simplicity, one might be surprised to find out that this chart is not readily available in many spreadsheet programs. It is possible to draw it, but one needs to resort to some ad-hoc methods. This and more are discussed in the Minute Physics video.

The Challenge

We want to implement this visualization using the Julia programming language; but we want to be able to do this without resorting to a bunch of ad-hoc methods, and without having to write a humongous amount of code.

One possible solution would be to search for a charting library that covers such an example. Yet, as is the case with spreadsheets, this search might be futile. Another possible approach would be to use a more generic visualization library that enables us to “directly” specify such a graphic.

The Graphic Description

Before we can implement a solution, we must first get a clear picture of how this graphic is constructed. Consider the following dataset:

In order to reproduce the desired chart, we start by rearranging the dataset. We transform the locations (“Ontario”, “England” and “Kentucky”) into row values, and we pair together the temperature labels, for example, “winter mean low | annual mean low”.

With the dataset organized this way, we can more easily describe how to construct our graphic. Here is the construction logic: for each row, draw a “bar” with the horizontal position given by the location column, the lower end of the bar given by the low_temp column, the higher end of the bar given by the high_temp column, and the bar color given by the label_temp column.
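
The rearrangement can be sketched in a few lines of Python. Everything in the input below is hypothetical: the temperatures are invented for illustration, and apart from “winter mean low” and “annual mean low” the measure names are guesses at what such a dataset might contain. The helper `to_ranges` pairs consecutive measures into one row per bar segment.

```python
# Hypothetical wide-format data; the temperatures (in °C) are invented.
wide = {
    "Ontario": {"winter mean low": -10, "annual mean low": 2,
                "annual mean high": 12, "summer mean high": 26},
    "England": {"winter mean low": 2, "annual mean low": 7,
                "annual mean high": 14, "summer mean high": 22},
    "Kentucky": {"winter mean low": -3, "annual mean low": 8,
                 "annual mean high": 19, "summer mean high": 31},
}

# Measures ordered from coldest to warmest, so neighbours bound one segment.
measures = ["winter mean low", "annual mean low",
            "annual mean high", "summer mean high"]


def to_ranges(wide, measures):
    """Turn one record per location into one row per (low, high) segment."""
    rows = []
    for location, temps in wide.items():
        for low_label, high_label in zip(measures, measures[1:]):
            rows.append({
                "location": location,
                "low_temp": temps[low_label],
                "high_temp": temps[high_label],
                "label_temp": f"{low_label} | {high_label}",
            })
    return rows


rows = to_ranges(wide, measures)
print(len(rows))  # 9 (three segments for each of three locations)
print(rows[0])
```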

That’s it! With the dataset structured in such a way, the procedure to construct our graphic is fairly intuitive. We are now ready to implement our solution.

The Implementation

In order to draw our graphic, we need a visualization library that allows us to implement the construction logic we just described. This can be done using the package called Vizagrams.jl.

Vizagrams is a visualization grammar with a syntax very similar to VegaLite. If you have used packages such as Altair or ggplot, you probably know what I am talking about. The difference compared to these grammars is that Vizagrams implements what is called “graphic expression”.

A graphic expression is a constructive description of a graphic. In other words, we can use graphic expressions to encode our constructive logic without having to resort to ad-hoc methods, or writing a bunch of code.

This tutorial does not aim to thoroughly explain Vizagrams or graphic expressions. So let us cut short our explanation here, and move on to the solution.

In the code below, we start by importing Vizagrams. We assume that df holds the dataframe already properly transformed, and we use my_colors to specify the colors to be used (one could also just pick an existing colorscheme). We then specify the plot, which is stored in the plt variable.

The plot specification contains the data, the encoding variables (x, y, color, low_y) and the graphic, which is where we write the graphic expression. Our graphic expression simply iterates over each row in the dataset, and draws a trail having width 20 and with the respective color, lower point and higher point. The trail mark is simpler to use (in this example) and delivers the same result as using the bar mark.

using Vizagrams

df # Our transformed dataframe
my_colors = ["#F28E2B","#4E79A7","#FFBE7D","#A0CBE8"] # Our bar colors

plt = Plot(
    data=df,
    x = :location,
    y = (field=:high_temp, scale_domain=(-10,40), scale_range=(0,200)),
    low_y = (field=:low_temp, scale_domain=(-10,40), scale_range=(0,200)),
    color = (field=:label_temp, scale_range=my_colors),
    graphic =
        ∑() do row
            S(:fill=>row.color)*Trail([[row.x,row.low_y],[row.x,row.y]],20)
        end
)

draw(plt)
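
As a rough intuition for the scale_domain and scale_range settings, the Python sketch below assumes the mapping is a plain linear interpolation from the temperature domain (-10, 40) to the drawing range (0, 200). That assumption is mine; I have not checked how Vizagrams computes its scales, and `linear_scale` is a hypothetical helper, not part of the library.

```python
def linear_scale(value, domain, output_range):
    """Map value linearly from domain (d0, d1) onto output_range (r0, r1)."""
    d0, d1 = domain
    r0, r1 = output_range
    return r0 + (value - d0) / (d1 - d0) * (r1 - r0)


# Both y and low_y use the same domain and range, so bar ends line up.
print(linear_scale(-10, (-10, 40), (0, 200)))  # 0.0
print(linear_scale(15, (-10, 40), (0, 200)))   # 100.0
print(linear_scale(40, (-10, 40), (0, 200)))   # 200.0
```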

Conclusion

And there we have it! We implemented the desired chart. The example with all the code and data necessary can be found in Vizagrams’ gallery under the “Floating Stacked Bar” section.

To get a better understanding of what is going on, you should hop on a notebook and try the code yourself. Try changing the graphic expression using different graphical marks, or changing how the encoding variables are used.