Author Archives: DSB

Diagramming + Data Visualization with Julia

By: DSB

Re-posted from: https://medium.com/coffee-in-a-klein-bottle/diagramming-data-visualization-with-julia-37147ce63168?source=rss-8bd6ec95ab58------2

A new approach to Data Visualization using Vizagrams.jl

Data visualization and diagramming are usually treated as distinct subjects. Yet, for the keen observer, they are quite similar. In fact, one could say that data visualization is a subset of diagramming, where data is used in a structured manner to generate a diagram drawing. This idea is explored in the Julia package Vizagrams.jl, which implements a diagramming Domain Specific Language (DSL) with a data visualization grammar built on top of it.

In this article, we show how to use Vizagrams.jl both for diagramming and data visualization. I highly recommend using a notebook such as Jupyter or Pluto in order to follow along.

Installation

The package is registered in Julia's General registry, so it can be installed with:

julia>]
pkg> add Vizagrams

The Basics of Diagramming

Vizagrams implements a diagramming DSL inspired by the great Haskell library Diagrams. We can think of a diagram as a collection of primitive graphical objects (circles, rectangles, lines…), where the final drawing renders each object one after the other, much like an SVG.

We start with the most basic example, a single circle:

using Vizagrams
# My first Diagram
d = Circle()
draw(d)

By default, our circle is drawn as black. We can modify its color by applying a style transformation.

d = S(:fill=>:red)*Circle()
draw(d)

We can make things more interesting by adding another object to our drawing. How can we do this? Well, just add it:

d = S(:fill=>:red)*Circle() + Square()
draw(d)

Note that the order of the summation matters. By adding a square after the circle, Vizagrams renders the square on top of the circle.

We have used S(:fill=>:red) to apply the fill color red to the circle. Besides stylistic transformations, we can also apply geometric transformations, such as translation T, rotation R and scaling U.

d = S(:fill=>:red)*Circle() +
S(:stroke=>:blue,:fill=>:white)T(2,0)*Square() +
S(:fill=>:white)R(3.14/4)*Square()

draw(d)
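
For intuition about what a rotation by roughly π/4 does to coordinates, here is a minimal sketch in plain Julia, independent of Vizagrams. The helper rotate_point is a hypothetical name introduced only for illustration, and it uses the standard counterclockwise convention, which may or may not match the sign convention Vizagrams uses for R.

```julia
# Hypothetical helper (not part of Vizagrams): rotate a 2D point about the
# origin by θ radians, counterclockwise in a y-up coordinate system.
rotate_point(p, θ) = (cos(θ) * p[1] - sin(θ) * p[2],
                      sin(θ) * p[1] + cos(θ) * p[2])

corner = (1.0, 0.0)
rotated = rotate_point(corner, π / 4)   # a 45° turn

# 3.14/4 is only an approximation of π/4; the difference is tiny but nonzero.
angle_error = abs(π / 4 - 3.14 / 4)
```

For a drawing, the approximation 3.14/4 is visually indistinguishable from π/4, but writing π/4 states the intent more clearly.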

Finally, we can also combine existing diagrams to form new ones.

d_2 = d + T(4,0) * d
draw(d_2)
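
Since diagrams compose with + and transformations apply with *, ordinary Julia tools such as comprehensions and reduce can be used to build larger diagrams. The following is a minimal sketch that relies only on the operations shown above (S, T, Circle, + and draw); the variable names and the spacing of 3 units are arbitrary choices for illustration.

```julia
using Vizagrams

colors = [:red, :green, :blue]

# One circle per color, each shifted 3 units further to the right.
shapes = [S(:fill => c) * T(3 * (i - 1), 0) * Circle() for (i, c) in enumerate(colors)]

# Summing the pieces yields a single diagram; later terms are drawn on top.
row = reduce(+, shapes)
draw(row)
```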

Constructing Data Visualizations

Let us now go to plotting. Again, we start with the most basic example:

plt = plot(x=rand(100),y=rand(100),color=rand(["a","b"],100))
draw(plt)

As expected, the result is a simple scatter plot. Yet, there is something interesting going on. The variable plt is actually holding a diagram. This means that we can manipulate it just like we did previously with other diagrams, and we can also combine it with other diagrams.

d = R(π/4)plt +
T(250,0)*plt +
Line([[180,250],[350,350],[350,200]]) +
T(350,350)Circle(r=10)

draw(d)

Although the example above was silly, it does illustrate the possibilities of what can be done. In fact, it is easy to see how one can combine diagramming operations to construct more useful visualizations, such as:

Scatter Plot with a PCA fit over the data. The histogram represents the error distribution in the PCA fit. This plot was drawn using Vizagrams.jl.

Visualization Grammars with Diagrams

In Vizagrams, the diagramming DSL was used to build a data visualization grammar, i.e. a specification that can produce a variety of visualizations. The syntax for the grammar is based on Vega-Lite, which is a very popular grammar in the Data Visualization community (the Python package Altair is based on Vega-Lite, and there is also a Julia package called VegaLite.jl).

Explaining visualization grammars would take an article of its own. Yet, the specification style is fairly simple, so hopefully even those not familiar with Vega-Lite will understand what is going on.

# Importing DataFrames to store the data
using DataFrames

# VegaDatasets is used to load the dataset `cars`
using VegaDatasets
df = DataFrame(dataset("cars"));
df = dropmissing(df);

# Here is where Vizagrams plot specification actually starts
plt = Plot(
data=df,
encodings=(
x=(field=:Horsepower,),
y=(field=:Miles_per_Gallon,),
color=(field=:Origin,),
size=(field=:Acceleration,),
),
graphic=Circle()
)
draw(plt)

Note that we specified our plot by first passing the dataset. Then, we defined the “encodings”, which were x, y, color and size. Each encoding variable has a parameter field which says which column in the dataset is mapped to it. Thus, in our example, we are mapping “Horsepower” to the x-axis, “Miles_per_Gallon” to the y-axis, the color varies according to “Origin” and the size according to “Acceleration”. Finally, the “graphic” specifies what is to be drawn in this plot. Since we passed Circle(), we are drawing circles.
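
To isolate the column-to-encoding mapping, the same specification can be tried on a tiny hand-made table. This is only a sketch: the data and the column names (speed, cost, group) are made up for illustration, and it assumes that Plot handles a small DataFrame the same way it handles the cars dataset above.

```julia
using Vizagrams
using DataFrames

# Made-up data, only to show how columns map to encodings.
toy = DataFrame(
    speed = [1.0, 2.0, 3.0, 4.0],
    cost  = [4.0, 3.0, 2.5, 1.0],
    group = ["a", "a", "b", "b"],
)

plt = Plot(
    data = toy,
    encodings = (
        x = (field = :speed,),      # column `speed` drives the x position
        y = (field = :cost,),       # column `cost` drives the y position
        color = (field = :group,),  # one color per distinct value of `group`
    ),
    graphic = Circle(),
)
draw(plt)
```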

Here is where things get interesting. We can pass diagrams to this “graphic” parameter in the plot specification. First, let us create a diagram:

d = S(:fill=>:white,:stroke=>:black)*Circle(r=2) +
Circle() +
T(2,0)*Square()
draw(d)

Now, we place this diagram inside the plot specification, and…

plt = Plot(
data=df,
encodings=(
x=(field=:Horsepower,),
y=(field=:Miles_per_Gallon,),
color=(field=:Origin,),
size=(field=:Acceleration,),
),
graphic = Mark(d)
)

draw(plt)

Again, this example is not very useful, yet it illustrates the sort of things that can be achieved. Here is perhaps a more “useful” example:

Scatter plot using a Penguin mark for the famous Palmer Penguins dataset. The complete example can be found in the Vizagrams documentation.

Some Final Words

This article was a mere introduction to Vizagrams. There is much more to be explored, such as graphical marks creation, graphic expressions, stacking operations, and so on.

If you are interested in learning more about Vizagrams, check out the documentation. Again, I recommend using a notebook (Jupyter or Pluto), as one can quickly experiment with different designs.


Diagramming + Data Visualization with Julia was originally published in Coffee in a Klein Bottle on Medium, where people are continuing the conversation by highlighting and responding to this story.

Drawing Vector Graphics with Julia can be Awesome

By: DSB

Re-posted from: https://medium.com/coffee-in-a-klein-bottle/drawing-vector-graphics-with-julia-can-be-awesome-298169609032?source=rss-8bd6ec95ab58------2

A quick tutorial on Luxor.jl, the go-to package for static vector graphics

Image from Luxor’s documentation.

We are all very used to plotting packages, yet, from time to time, we want to create visualizations that require more freedom. When that happens, it might be better to go straight to vector graphics. This is an area where people tend to be less experienced, and it’s not always clear what package to use.

Vector Graphics Packages

When it comes to vector graphics, you’ve probably heard the name Cairo, since it is perhaps one of the most famous 2D graphics libraries, and it’s commonly used by plotting packages as a backend. Thus, a first option for drawing your vector graphics is using Cairo.jl, which is Julia’s API for the Cairo library. Yet, it might not be a very pleasant experience, as Cairo can be fairly complicated for newcomers.

Here is where Luxor.jl comes in.

Luxor is a Julia package for drawing simple static vector graphics. It provides basic drawing functions and utilities for working with shapes, polygons, clipping masks, PNG and SVG images, turtle graphics, and simple animations.

The focus of Luxor is on simplicity and ease of use: it should be easier to use than plain Cairo.jl, with shorter names, fewer underscores, default contexts, and simplified functions. — Luxor’s documentation.

I could actually stop here and tell you to go read the docs, because this package is insanely well documented. But since you are already here, I might as well take some time to quickly go over some of the basics.

Quick Introduction to Luxor.jl

Perhaps the first thing to note about Luxor is that it’s procedural and static, which means that you code a series of drawing commands until you say it’s finished, which will in turn produce either a PNG, SVG, PDF or EPS based on what you defined at the start.

If you are used to working dynamically and interactively (which you probably are, since you are using Julia), then this static way of drawing with Luxor will be a bit strange at first. Yet, there is a flip side. Luxor is fast! What do I mean? Just try it and you’ll see that not only is the precompilation time minimal, but every redraw is also quite fast (especially compared to plotting packages such as VegaLite.jl and Plotly.jl).

Let’s do our first drawing in Luxor.

using Luxor
Drawing(500, 500, "my-drawing.svg")
origin()
setcolor("red")
circle(Point(0, 0), 100, :fill)
finish()

The code above encapsulates very well the workflow in Luxor. The very first line is just importing the package. In the second line, we create a drawing with a size of 500 by 500 pixels, and once we finish the drawing, Luxor will create a file called “my-drawing.svg” in the current working directory. All good, let’s proceed to the third line of the code.

You might think that the origin() line is “starting” the drawing, but what it’s actually doing is moving the origin of the coordinate system to the center of the image… What do I mean? Well, in Luxor the default origin is in the top left of the figure, and the y-axis actually points down. Thus, the origin() function is responsible for changing the location of the origin. And if we call this function without arguments, it places the origin at the very center of the drawing (you might be a bit confused, unless you’ve already used other vector drawing packages, in which case, this convention might actually make sense to you).
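
To make the coordinate convention concrete, here is a small sketch in plain Julia (no Luxor calls) that converts a point given relative to the centered origin into the default top-left coordinates. The function name to_topleft is hypothetical and exists only for this illustration.

```julia
# Hypothetical helper: map a point expressed relative to the drawing's center
# (what you get after calling origin()) to default coordinates, where (0, 0)
# is the top-left corner. In both systems the y-axis points down.
to_topleft(p, width, height) = (p[1] + width / 2, p[2] + height / 2)

w, h = 500, 500
center = to_topleft((0, 0), w, h)     # center of the 500 by 500 drawing
below  = to_topleft((0, 100), w, h)   # positive y moves down the page
```

So Point(0, 0) after origin() corresponds to (250, 250) in the default system of a 500 by 500 drawing.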

Moving on, the next line is setcolor("red") and it sets the current color to red. The important thing to note here is that this is similar to choosing the current color of the drawing pen. Hence, every drawing that we do after calling setcolor("red") will be red, until we change the color of the pen.
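
The pen analogy can be checked with a short sketch that uses only the functions from the first drawing. The file name pen-state.svg and the circle positions are arbitrary choices for illustration.

```julia
using Luxor

Drawing(300, 100, "pen-state.svg")
origin()
setcolor("red")
circle(Point(-100, 0), 30, :fill)   # red
circle(Point(0, 0), 30, :fill)      # still red: the pen color persists
setcolor("blue")
circle(Point(100, 0), 30, :fill)    # blue from here on
finish()
```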

Let’s finally draw something in our canvas. And the command circle(Point(0,0),100,:fill) does exactly that. This command is drawing a circle with center at point (0,0), which is the center of our figure, since we used the command origin(). The next argument sets the radius, which in our case is 100 pixels. Finally, the :fill argument indicates that we must paint the inside of the circle. Again, this is all very common if you have worked with vector drawings before.

We conclude our drawing with the finish() command, which then saves the drawing file. Now, if you want to see the drawing right after finishing it, you can just use the function preview(), which will show the most recently finished drawing.

Another Simple Example

Let’s change our first drawing a bit.

Drawing(200, 200, "my-drawing.png")
background(0.9, 0.2, 0.2, .5)
setcolor("blue")
setline(10)
setdash("dash")
circle(Point(0, 0), 100, :stroke)
finish()

This is very similar to our initial drawing, but we’ve added some new functions and removed the origin() call. Note that now our circle is actually centered in the top left corner of our figure, which can now be seen because of the background() function. The numbers inside the background function are just the color specification in RGBA (red, green, blue and alpha).

Another difference is that we used :stroke instead of :fill in the circle function. Thus, instead of filling the circle, it just draws the border. The setline() function alters the thickness of the stroke, and the setdash() changes the line style. There are many more variations in line styles, which you can check in the function docstring.

Luxor for the Mathematically Inclined

All I have shown up until now is very basic, and perhaps not very useful, so maybe this next example will grab your interest, especially if you have a more mathematical inclination.

Let’s use Luxor to draw some diagrams, which occur in a discipline called Category Theory 🥸.

using MathTeXEngine
Drawing(200, 200, "my-drawing.png")
# Drawing the borders of our drawing. Note that the `O` is a
# shortcut to Point(0,0).
rect(O,200,200,:stroke)
# Let's change the origin to the center of the drawing
origin()
# We now draw our diagram
setline(10)
arrow(Point(-40,0),Point(40,0))
# Let's define the font size and write some text
fontsize(15)
text(L"f",Point(0,20), halign = :center)
circle(Point(-50,0),5,:fill)
text(L"ℕ",Point(-50,20),halign = :center)
circle(Point(50,0),5,:fill)
text(L"ℚ",Point(50,20),halign = :center)
#### From here, we are drawing the looping arrows ####
loopx = 30
loopy = 40
adjx = 6
adjy = -2
arrow(
Point(-50,0) + Point(-adjx,adjy),
Point(-50,0) + Point(-loopx,-loopy),
Point(-50,0)+ Point(loopx,-loopy),
Point(-50,0)+Point(adjx,adjy)
)
arrow(
Point(50,0) + Point(-adjx,adjy),
Point(50,0) + Point(-loopx,-loopy),
Point(50,0) + Point(loopx,-loopy),
Point(50,0) + Point(adjx,adjy)
)
finish()

Pretty neat, no? You might be a bit overwhelmed by the code above, but just read it through and you will see it’s quite straightforward. Here are some comments on things that might be obscure.

First, you might have noticed that we can write LaTeX in Luxor (I know, it’s awesome)! Just write L"\sum^n_{i=10}", i.e. place “L” before the string. MathTeXEngine is the package required to render the LaTeX string without needing a LaTeX installation on your computer.

The final part of the code is more complicated, but the only thing going on there is that one can pass control points in order to draw curves. Hence, instead of passing only the start and finish of the arrow, I’m actually passing four points in order to make the arrow loop, and I’m also altering the start and finish positions so that they do not touch the circle. The loopx and the other defined variables are just some auxiliary variables that I used in order to figure out the control points that give me a nice looking curve.
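
To see where such control points send the curve, one can evaluate the cubic Bézier formula directly. The sketch below is plain Julia and assumes that the four-point arrow traces a standard cubic Bézier curve through the given control points; bezier_point is a hypothetical helper, not a Luxor function.

```julia
# Hypothetical helper: point at parameter t in [0, 1] on the cubic Bézier curve
# with endpoints p0, p3 and control points p1, p2 (all given as (x, y) tuples).
function bezier_point(p0, p1, p2, p3, t)
    s = 1 - t
    weights = (s^3, 3 * s^2 * t, 3 * s * t^2, t^3)
    points = (p0, p1, p2, p3)
    x = sum(w * p[1] for (w, p) in zip(weights, points))
    y = sum(w * p[2] for (w, p) in zip(weights, points))
    return (x, y)
end

# The left loop from the diagram: node at (-50, 0), same offsets as in the drawing.
loopx, loopy, adjx, adjy = 30, 40, 6, -2
p0 = (-50 - adjx, adjy)
p1 = (-50 - loopx, -loopy)
p2 = (-50 + loopx, -loopy)
p3 = (-50 + adjx, adjy)

apex = bezier_point(p0, p1, p2, p3, 0.5)   # midpoint of the loop
```

Under this assumption the midpoint of the loop sits directly above the node (remember that negative y is up), which is why the arrow curls over the dot instead of crossing it.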

Final Words and a Hot Tip

There are many other things I could say about Luxor, since the package is quite extensive. Fortunately, as I’ve already said, the docs are very good. Not only do they have many examples, but they also explain the many functionalities in detail.

Yet, the fact that the docs are so comprehensive also makes it a bit harder to find some specific things one might be looking for. Hence, here are some hot tips that might come in handy.

If you are working in a notebook environment such as Jupyter, you might wish to store the drawings in variables, instead of saving them to files right away. Also, you might wish to do small drawings, and then place them inside another drawing… Is this possible? YES! Here is how to do both.

d1 = Drawing(200,200,:svg)
origin()
circle(O+Point(30,0),10,:fill)
finish()
d2 = Drawing(200,200,:svg)
origin()
setcolor("blue")
circle(O,10,:fill)
finish()
d = Drawing(200,200,:svg)
placeimage(d1)
placeimage(d2)
finish()

I recommend always working with svg if your drawing is not too heavy, since you’ll always have crystal clear quality. With the code above, you can just run d in a cell in order to visualize the drawing stored in the variable.

Finally, you might wish to see the actual svg specification. Here is a possible solution.

dsvg = String(copy(d.bufferdata))

This will give you the svg string, which you can just save into a mydrawing.svg file.
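
Saving the string needs nothing beyond Julia's built-in write function. In the sketch below, the SVG content is a hand-written stand-in so that the snippet runs on its own; in practice dsvg would be the string obtained above.

```julia
# Stand-in for the string obtained from String(copy(d.bufferdata)).
dsvg = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200"><circle cx="100" cy="100" r="10"/></svg>"""

# write(filename, content) creates or overwrites the file and returns the
# number of bytes written.
nbytes = write("mydrawing.svg", dsvg)
```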

That’s all. Now go read the docs!


Drawing Vector Graphics with Julia can be Awesome was originally published in Coffee in a Klein Bottle on Medium, where people are continuing the conversation by highlighting and responding to this story.

Visualizing Data with Julia using Makie

By: DSB

Re-posted from: https://medium.com/coffee-in-a-klein-bottle/visualizing-data-with-julia-using-makie-7685d7850f06?source=rss-8bd6ec95ab58------2

Plotting in Julia with Makie

A Brief Tutorial on Makie.jl

When starting to learn Julia, one might get lost in the many different packages available to do data visualization. Right off the cuff, there is Plots, Gadfly, VegaLite … and there is Makie.

Makie is fairly new (~2018), yet it’s very versatile, actively developed and quickly growing in number of users. This article is a quick introduction to Makie, but by the end of it, you will be able to do a plethora of different plots.

The Future of Plotting in Julia

When I started coding in Julia, Makie was not one of the contenders for “best” plotting libraries. As time passed, I started to hear more and more about it around the community. For some reason, people were saying that:

“Makie is the future” — People in the Julia Community

I never fully understood why that was the case, and every time I tried to learn it, I’d be turned off by the verbose syntax, and, frankly, ugly examples. It was only when I bumped into Beautiful Makie that I decided to put aside my prejudices and get on with the times.

Hence, if you are starting to code in Julia, and are wondering which plotting package you should invest your time in learning, I say to you that Makie is the way to go, since I guess “Makie is the future”.

Number of GitHub stars per repository. I guess indeed Makie is the future, if this trend keeps going.

Starting with Makie… Pick your backend

The versatility in Makie can make it a bit unwelcoming for those that “just want to do a damn scatter plot”. First of all, there is Makie.jl, CairoMakie.jl, GLMakie.jl and WGLMakie.jl 😰. Which one should you use?

Well, here is the deal. Makie.jl is the main plotting package, but you have to choose a backend with which to display your plots. The choice depends on your objectives. So yes, besides Makie.jl, you will need to install one of the backends. Here is a small description to help you choose:

  • CairoMakie.jl: It’s the easiest to use of all three, and it’s the ideal choice if you just want to produce static plots (not interactive);
  • GLMakie.jl: Uses OpenGL to display the plots, hence, you need to have OpenGL installed. Once you do a plot and run display(myplot), it’ll open an interactive window with your plot. If you want to do interactive 3D plots, then this is the backend for you;
  • WGLMakie.jl: It’s the hardest one to work with. Still, if you want to create interactive visualizations in the web, this is your choice.

In this tutorial, we’ll use CairoMakie.jl.

Your first plot

After picking our backend, we can now start plotting! I’ll go out on a limb and say that Makie is very similar to Matplotlib. It does not work with any fancy “Grammar of Graphics” (but if you like this sort of stuff, take a look at AlgebraOfGraphics.jl, which implements an “Algebra of Graphics” on top of Makie).

Thus, there are a bunch of ready to use functions for some of the most common plots.

using CairoMakie #Yeah, no need to import Makie
scatter(rand(10,2))

Easy breezy… Yet, if you are plotting this in a Jupyter Notebook, you might be slightly ticked off by two things. First, the image is just too large. And second, it’s kind of low quality. What is going on?

By default, CairoMakie uses raster format for images, and the default size is a bit large. If you are like me and prefer your plots to be in svg and a bit smaller, then no worries! Just do the following:

using CairoMakie
CairoMakie.activate!(type = "svg")
scatter(rand(10,2),figure=(;resolution=(300,300)))

In the code above, CairoMakie.activate!() is a command that tells Makie which backend you are using. You can import more than one backend at a time, and switch between them using these activation commands. Also, the CairoMakie backend has the option to do svg plots (to my knowledge, this is not possible for the other backends). Hence, with this small line of code, all our plots will now be displayed in high quality.

Next, we defined a “resolution” for our figure. In my opinion, this is a bit of an unfortunate name, because the resolution is actually the size of our image (newer Makie releases have renamed this attribute to size). Yet, as we’ll see further on, the attribute resolution actually belongs to our figure, and not to the scatter plot itself. For this reason we pass the whole figure = (; resolution=(300,300)) (if you are new to Julia, the leading ; makes the parenthesized expression a NamedTuple; in function calls, ; separates positional arguments from keyword arguments, i.e. args and kwargs).
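
For readers new to this syntax, here is a minimal sketch in plain Julia (no Makie required) of what the ; does in each position. The function describe is a made-up example, not part of any package.

```julia
# A leading `;` inside parentheses builds a NamedTuple, even with a single entry.
figure_options = (; resolution = (300, 300))

# Without the `;`, the parentheses just wrap an assignment, which evaluates
# to the assigned value and also creates a variable named `resolution`.
not_a_namedtuple = (resolution = (300, 300))

# In a function signature and call, `;` separates positional from keyword arguments.
describe(x; scale = 1) = "x = $x, scale = $scale"   # hypothetical function
message = describe(2; scale = 10)
```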

Congrats! You now know the bare minimum of Makie to do a whole bunch of different plots! Just go to Makie’s website and see how to use all the different ready-to-use plotting functions! In order to be self-contained, here is a small cheat sheet from the great book Julia Data Science.

Of course, we still haven’t talked about a bunch of important things, like titles, subplots, legends, axes limits, etc. Just keep on reading…

Storopoli, Huijzer and Alonso (2021). Julia Data Science. https://juliadatascience.io. ISBN: 9798489859165.

Figure, Axis and Plot

Commands like scatter produce a “FigureAxisPlot” object, which contains a figure, a set of axes and the actual plot. Each of these objects has different attributes, and all of them are fundamental for customizing your visualization. By doing:

fig, ax, plt = scatter(rand(10,2))

We save each of these objects in a different variable, and can more easily modify them. In this example, the function scatter is actually creating all three objects, and not only the plot. We could instead create each of these objects individually. Here is how we do it:

fig = Figure(resolution=(600, 400)) 
ax = Axis(fig[1, 1], xlabel = "x label", ylabel = "y label",
title = "Title")
lines!(ax, 1:0.1:10, x->sin(x))

Plot from the code above.

Let’s explain the code above. First, we created the empty figure and stored it in fig. Next, we created an “Axis”. But we need to say which figure this object belongs to, and this is where the fig[1,1] comes in. But what is this “[1,1]”?

Every figure in Makie comes with a grid layout underneath, which enables us to easily create subplots in the same figure. Hence, fig[1,1] means “Axis belongs to fig row 1 and column 1”. Since our figure only has one element, our axis will occupy the whole thing. Still confused? Don’t worry, once we do subplots you’ll understand why this is so useful.

The rest of the arguments in “Axis” are easy to understand. We are just defining the names in each axis and then the title.

Finally, we add the plot using lines!. The exclamation mark is a convention in Julia indicating that a function modifies one of its arguments. In our case, lines!(ax, 1:0.1:10, x->sin(x)) appends a line plot to the ax axis.

It’s clear now how we can, for example, add more line plots. Each additional call to a plotting function such as lines! appends another plot to our ax axis. In this case, let’s also add a legend to our plot.
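
The same naming convention appears throughout Julia's standard library, so it can be illustrated without Makie. A minimal sketch using only built-in functions:

```julia
v = [3, 1, 2]

sorted_copy = sort(v)   # no `!`: returns a new vector, `v` is untouched
v_before = copy(v)      # snapshot showing that `v` is still unsorted here

sort!(v)                # with `!`: sorts `v` in place
push!(v, 4)             # `push!` also mutates its first argument
```

In the same spirit, lines! mutates the axis it receives by adding a plot to it, whereas lines without the exclamation mark creates a new figure, axis and plot.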

fig = Figure(resolution=(600, 400)) 
ax = Axis(fig[1, 1], xlabel = "x label", ylabel = "y label",
title = "Title")
lines!(ax, 1:0.1:10, x->sin(x), label="sin")
stairs!(ax, 1:0.1:10, x->cos(x), label="cos", color=:black)
axislegend(ax)
#*Tip*: if you are using Jupyter and want to display your
# visualization, you can do display(fig) or just write fig in
# the end of the cell.

Ok, our plots are starting to look good. Let me end this section talking about subplots. As I said, this is where the whole “fig[1,1]” comes into play. If instead of doing two plots in the same axis we wanted to create two parallel plots in the same figure, here is how we would do this.

fig = Figure(resolution=(600, 300)) 
ax1 = Axis(fig[1, 1], xlabel = "x label", ylabel = "y label",
title = "Title1")
ax2 = Axis(fig[1, 2], xlabel = "x label", ylabel = "y label",
title = "Title2")
lines!(ax1, 1:0.1:10, x->sin(x), label="sin")
stairs!(ax1, 1:0.1:10, x->cos(x), label="cos", color=:black)
density!(ax2, randn(100))
axislegend(ax1)
save("figure.png", fig)

This time, in the same figure, we created two axes, but the first one is in the first row and first column, while the second one is in the second column. We then just append each plot to its respective axis. Lastly, we save the figure in “png” format.
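
The grid indexing extends naturally to more rows and columns. Here is a small sketch of a 2 by 2 layout filled in a loop; the file name grid.png and the plotted curves are arbitrary choices for illustration, and the figure is created with default settings.

```julia
using CairoMakie

fig = Figure()
xs = 0:0.1:10

# fig[i, j] addresses row i, column j of the figure's grid layout.
for i in 1:2, j in 1:2
    ax = Axis(fig[i, j], title = "row $i, column $j")
    lines!(ax, xs, sin.(i .* xs .+ j))
end

save("grid.png", fig)
```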

Final Words

That’s it for this tutorial. Of course, there is much more to talk about, as we have only scratched the surface. Makie has some awesome capabilities in terms of animations, and many more attributes/objects to play with in order to create truly astonishing visualizations. If you want to learn more, take a look at Makie’s documentation, it’s very nice. Also, the Julia Data Science book has a whole chapter on Makie.

References

This article draws heavily on the Julia Data Science book and Makie’s own documentation.

Storopoli, Huijzer and Alonso (2021). Julia Data Science. https://juliadatascience.io. ISBN: 9798489859165.

Danisch & Krumbiegel, (2021). Makie.jl: Flexible high-performance data visualization for Julia. Journal of Open Source Software, 6(65), 3349, https://doi.org/10.21105/joss.03349


Visualizing Data with Julia using Makie was originally published in Coffee in a Klein Bottle on Medium, where people are continuing the conversation by highlighting and responding to this story.