Tag Archives: Julia

Using DataFrames and PyPlot in Julia

By: Jaafar Ballout

Re-posted from: https://www.supplychaindataanalytics.com/using-dataframes-and-pyplot-in-julia/

Julia includes many packages that can be used in operations research and optimization in general. This article serves as a brief introduction to some of the packages available in Julia’s ecosystem. I will focus on two packages that are stepping stones for future work: DataFrames.jl and PyPlot.jl. DataFrames.jl provides a set of tools for working with tabular data, similar to Pandas in Python. The PyPlot module provides a Julia interface to the Matplotlib plotting library from Python. PyPlot therefore needs a Python installation with Matplotlib, but Julia can install a private distribution that can’t be accessed outside Julia’s environment.

Adding the necessary packages

Open a new terminal window and run Julia. Use the code below to add the packages required for this article.

julia> using Pkg
julia> Pkg.add("DataFrames")
julia> Pkg.add("CSV")
julia> Pkg.add("Arrow")

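In the same Julia session, set the PYTHON environment variable to an empty string. This tells Julia to use its own private Python distribution instead of one already installed on the system: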

julia> ENV["PYTHON"] = ""
""

Install PyPlot:

julia> using Pkg
julia> Pkg.add("PyPlot")

After adding all the required packages in the Julia REPL, the following code imports them in the Jupyter editor (i.e. Jupyter Notebook).

using DataFrames
using CSV
using Arrow
using PyPlot

Using DataFrames in Julia

DataFrames.jl handles the tabular data itself, while companion packages such as CSV.jl and Arrow.jl read and write the files behind it. This is why those packages were added in the previous section.
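
To give a feel for the tabular tools mentioned above, here is a minimal sketch built on a small made-up table. The table name `toy`, the column names `day`, `cases`, and `new_cases`, and all values are illustrative assumptions, not data from this article.

```julia
using DataFrames

# Hypothetical toy table; the column names and values are made up for illustration
toy = DataFrame(day = 1:5, cases = [2, 4, 9, 17, 33])

nrow(toy), ncol(toy)                     # (5, 2)
first(toy, 3)                            # the first three rows
toy.cases                                # access a column as a vector
late = filter(:day => d -> d > 3, toy)   # keep only the rows where day > 3

# Add a derived column: the daily increase in cases
toy.new_cases = [toy.cases[1]; diff(toy.cases)]
```

Column access with `toy.cases` and row selection with `filter` cover much of what the later plotting code needs, since each plot only requires a few columns as plain vectors.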

Using PyPlot in Julia

Because PyPlot is a Julia interface to Matplotlib, most of the Matplotlib documentation applies directly to the code in this article.

Covid-19 showcase using Julia

The first COVID-19 case was reportedly detected on the 17th of November, 2019. Although two years have passed since the pandemic started, the virus persists and cases are still increasing exponentially. Thus, data analysis is important for understanding the growth of cases around the world. I will be using a .csv file containing data about the cases in country X. The file is available in a public repository on my GitHub page, so I can copy it into an Excel sheet and save it in the directory where the Jupyter notebook is located. Alternatively, I can import the data from the web using Excel by pointing it to the link of the raw file on GitHub. In both cases, the file must be saved as time.csv to match the code in later stages.

Here, I am showing the dataset, in CSV format, opened in Excel on my desktop.

I will read the csv file, in the Jupyter notebook, using the code block below:

df  = CSV.File("time.csv") |> DataFrame; # read the csv file with the CSV package and convert it to a DataFrame using the pipe operator |>
df[1:5,:] # output the first five rows
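
As a rough, self-contained sketch of the same reading step, the block below writes a tiny made-up file and reads it back. The file name `sample.csv`, its column names, and its values are assumptions for illustration only; they are not the contents of time.csv.

```julia
using CSV, DataFrames

# Write a tiny made-up file so the example runs on its own (this is not time.csv)
path = joinpath(mktempdir(), "sample.csv")
write(path, "date,test,negative,confirmed\n2020-02-08,10,8,1\n2020-02-09,25,20,3\n")

# CSV.read(source, sink) parses the file straight into the requested table type
sample = CSV.read(path, DataFrame)

names(sample)      # the column names as strings
first(sample, 2)   # the first two rows, an alternative to sample[1:2, :]
```

`CSV.read(path, DataFrame)` and `CSV.File(path) |> DataFrame` should give the same table here; which one to use is mostly a matter of taste.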

Importing data files is smoother using DataFrames. Some of the operations in the code blocks below are explained in a previous post introducing Julia. Plotting is another important tool for understanding and visualizing data. Matplotlib was introduced on this blog before, but in Python.

The code block below creates a bar chart showing the number of cumulative tests and cumulative negative cases over a period of six days from the DataFrame, df, imported above.

y1 = df[20:25,2]; 
y2 = df[20:25,3];
x = df[20:25,1];

fig = plt.figure() 
ax = plt.subplot() 
ax.bar(x, y1, label="cumulative tests",color="black")
ax.bar(x, y2, label="cumulative negative cases",color="grey")
ax.set_title("Covid-19 data in Country X",fontsize=18,color="green")
ax.set_xlabel("date",fontsize=14,color="red")
ax.set_ylabel("number of cases",fontsize=14,color="red")
ax.legend(fontsize=10)
ax.grid(true,color="blue",alpha=0.1) # true turns the grid on; the b keyword has been renamed to visible in newer Matplotlib versions
plt.show()

The bar chart above could be improved by allocating a bar for each category. This is done using the code block below.

barWidth = 0.25
br1 = 1:1:length(x)
br2 = [b + barWidth for b in br1] # shift the second series right by one bar width

fig = plt.figure()
ax = plt.subplot() 
ax.bar(br1, y1,color="r", width=barWidth, edgecolor ="grey",label="cumulative tests")
ax.bar(br2, y2,color ="g",width=barWidth, edgecolor ="grey", label="cumulative negative cases")
ax.set_title("Covid-19 data in Country X",fontsize=18,color="green")
ax.set_xlabel("date", fontweight ="bold",fontsize=14)
ax.set_ylabel("number of cases", fontweight ="bold",fontsize=14)
ax.legend(fontsize=10)
ax.grid(true,color="blue",alpha=0.1) # true turns the grid on; the b keyword has been renamed to visible in newer Matplotlib versions
plt.xticks([r + barWidth/2 for r in br1],  ["2/8/2020", "2/9/2020", "2/10/2020", "2/11/2020", "2/12/2020","2/13/2020"]) # center each label between the two bars of its group
plt.show()
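
The positioning idea behind the grouped bars can be sketched on its own, without any plotting. The helper `grouped_positions` below is hypothetical (it is not part of PyPlot or Matplotlib) and assumes that each bar is centered on its x position, which is Matplotlib's default alignment.

```julia
# Hypothetical helper (not part of PyPlot): x positions for grouped bars
function grouped_positions(ngroups, nseries, width)
    base = collect(1:ngroups)
    # series j is shifted right by (j - 1) * width
    bars = [base .+ (j - 1) * width for j in 1:nseries]
    # the middle of each group sits halfway between its first and last bar
    ticks = base .+ (nseries - 1) * width / 2
    return bars, ticks
end

bars, ticks = grouped_positions(6, 2, 0.25)
bars[2][1]   # 1.25, the position of the second series in the first group
ticks[1]     # 1.125, the center of the first group
```

Each vector in `bars` could then be passed as the x positions of one `ax.bar` call, and `ticks` as the tick locations, so adding a third category would not require any new position arithmetic.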

Understanding the exponential growth associated with COVID-19 requires plotting the whole dataset, which spans 44 days. Therefore, the next code block shows the confirmed cases for the complete dataset.

days = 1:1:44
days_array = collect(days)
confirmed_cases = df[:,4]

fig = plt.figure(figsize=(10,10))
ax = plt.subplot()
ax.bar(days_array,confirmed_cases , color ="blue",width = 0.4)
ax.set_title("A bar chart showing cumulative confirmed covid-19 cases in country X",fontsize=22,color="darkgreen")
ax.set_xlabel("day number",fontsize=16,color="darkgreen")
ax.set_ylabel("number of cases",fontsize=16,color="darkgreen")
ax.xaxis.set_ticks_position("none")
ax.yaxis.set_ticks_position("none")
ax.xaxis.set_tick_params(pad = 5)
ax.yaxis.set_tick_params(pad = 10)
ax.grid(true, color ="grey",linestyle ="-.", linewidth = 0.5,alpha = 0.2)
fig.text(0.3, 0.8, "SCDA-JaafarBallout", fontsize = 10,color ="grey", ha ="right", va ="bottom",alpha = 0.6)
plt.show()

Even though bar charts are powerful in our case, it is still nice to observe the exponential curve. For that, a code block is presented below to plot the growth curve of confirmed cases.

fig = plt.figure(figsize=(10,10))
ax = plt.subplot()
ax.plot(days_array,confirmed_cases , color ="blue",marker="o",markersize=6, linewidth=2, linestyle ="--")
ax.set_title("A line chart showing cumulative confirmed covid-19 cases in country X",fontsize=22,color="darkgreen")
ax.set_xlabel("day number",fontsize=16,color="darkgreen")
ax.set_ylabel("number of cases",fontsize=16,color="darkgreen")
ax.grid(true, color ="grey",linestyle ="-.", linewidth = 0.5,alpha = 0.2)
fig.text(0.3, 0.8, "SCDA-JaafarBallout", fontsize = 10,color ="grey", ha ="right", va ="bottom",alpha = 0.6)
plt.xticks(size=16, color ="black")
plt.yticks(size=16, color ="black")
plt.show()
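
One way to check whether a curve like this is roughly exponential is to fit a straight line to the logarithm of the counts. The sketch below does this with a synthetic series that doubles every three days; the numbers are made up for illustration and are not the article's data, and the variable names are assumptions.

```julia
using LinearAlgebra

# Synthetic series that doubles every 3 days; NOT the article's data
days = collect(1:10)
cases = 5 .* 2.0 .^ (days ./ 3)

# If cases ≈ a * exp(r * day), then log(cases) is linear in day.
# Solve the least-squares problem [1 day] * [log(a), r] ≈ log(cases)
X = [ones(length(days)) days]
coef = X \ log.(cases)

growth_rate = coef[2]                  # r, per day
doubling_time = log(2) / growth_rate   # about 3 days for this synthetic series
```

On real data the points will not fall exactly on a line, so the fitted doubling time is only a summary of the average growth over the chosen period.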

Realizing a meme! (with Julia)

Recently, this meme went viral, so it is fun to work out the missing functions by plotting them.

fig, axs = plt.subplots(2, 2);
fig.tight_layout(pad=4);

x1 = range(-10, 10, length=1000);
y1 = fill(1.0, length(x1)); # constant function y = 1

axs[1].plot(x1, y1, color="blue", linewidth=2.0, linestyle="-");
axs[1].set_xlabel("x1");
axs[1].set_ylabel("y1");
axs[1].set_title("Constant");

x2 = range(-10, 10, length=1000);
y2 = x2.^3;

axs[2].plot(x2, y2, color="blue", linewidth=2.0, linestyle="-");
axs[2].set_xlabel("x2");
axs[2].set_ylabel("y2");
axs[2].set_title("Y = X^(3)");

x3 = range(-10, 10, length=1000);
y3 = x3.^2;

axs[3].plot(x3, y3, color="blue", linewidth=2.0, linestyle="-");
axs[3].set_xlabel("x3");
axs[3].set_ylabel("y3");
axs[3].set_title("Y = X^(2)");

x4 = range(-5, 5, length=1000);
y4 = cos.(x4);

axs[4].plot(x4, y4, color="blue", linewidth=2.0, linestyle="-");
axs[4].set_xlabel("x4");
axs[4].set_ylabel("y4");
axs[4].set_title("Y = cos(X)");

Drawing networks by hand or with traditional software seems exhausting. In future posts, I will demonstrate how to draw complicated networks using Julia and its packages. Then, I will solve the network problem using the JuMP package.

The post Using DataFrames and PyPlot in Julia appeared first on Supply Chain Data Analytics.

Introduction to Julia

By: Jaafar Ballout

Re-posted from: https://www.supplychaindataanalytics.com/introduction-to-julia/

Having introduced optimization and linear programming in one of my previous posts, I explain the basics of the Julia programming language in this article. The article is a tutorial based on the official Julia documentation. It covers the key aspects that I will use later, in upcoming blog posts, for solving supply chain and operations research problems (e.g. network problems). It also highlights some major syntactic and functional differences compared to other popular programming languages (namely Python and Matlab).

Defining variables in Julia

A variable is a name associated with a value stored in the computer's memory. A value is assigned to a variable using the = operator. Unicode characters can be used in variable names. Most Julia editors support LaTeX-like syntax that can be used to enter Unicode characters. Double quotes (") are used for strings, while single quotes (') are used for a single character.

Here are some examples demonstrating the above explanation.

In[]:
x = 7 # variable name: x; variable value = 7

Out[]:
7
In[]:
y = 10 # variable name: y; variable value = 10

Out[]:
10

Variable names are case-sensitive and have no semantic meaning.

In[]:
Y = 100 

Out[]:
100
In[]:
n = "Jaafar" # variable name: n; variable value = "Jaafar" which is  a string

Out[]:
"Jaafar"
In[]:
Letter = 'a' # character

Out[]:
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
In[]:
α = 1 # unicode is easier in Julia: write \alpha and press tab 

Out[]:
1
In[]:
😠=0 # \:angry: and then press <tab>

Out[]: 
0

Integer and floating point numbers

Integers and floating-point numbers are the building blocks of mathematical operations. Using the typeof function in Julia, I can find the type of any variable.

In[]:
x = 1; #semicolon is to avoid printing the variables
y = 10000;
z = 1.1;
d = 5e-3

println("x is ", typeof(x))
println("y is ", typeof(y))
println("z is ", typeof(z))
println("d is ", typeof(d))

Out[]:
x is Int64
y is Int64
z is Float64
d is Float64

Using Sys.WORD_SIZE, one of Julia’s internal variables, I can check whether the target system is 32-bit or 64-bit.

In[]:
# 64-bit system
Sys.WORD_SIZE

Out[]:
64
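
The word size matters because it determines the default integer type. The short sketch below shows what this means in practice; the variable names are illustrative assumptions.

```julia
default_int = Int          # Int64 on a 64-bit system, Int32 on a 32-bit one
largest = typemax(Int64)   # 9223372036854775807
wrapped = largest + 1      # wraps around to typemin(Int64); no error is raised
huge = big(2)^100          # BigInt avoids overflow for very large values
```

Because fixed-width integers wrap around silently, converting to `BigInt` with `big` is one option when intermediate values might exceed the 64-bit range.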

Main mathematical operations in Julia

The arithmetic operations are similar to Python except for the power. For power operations, Julia uses ^ instead of ** (as known from Python).

Boolean operations are as follows:

  • a && b: a and b
  • a || b: a or b
  • !a: negation
In[]:
a = 1;
b = 3;

# Addition
a + b;

# Subtraction
a - b;

# Multiplication
a*b;

# Division
a/b;

# Power 
a^b; # this is different from python where ** is used to raise a to the bth power

#Updating operators 
a+=1; # a = a + 1
b*=2; # b = b * 2

Vectorized operators are very important in linear algebra. Dot operators apply an operation element by element to arrays.

In[]:
[1,2,3].^1 # [1^1,2^1,3^1]

Out[]:
3-element Vector{Int64}:
 1
 2
 3
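
A few more dot operations, as a minimal sketch with made-up vectors `v` and `w`:

```julia
v = [1, 2, 3]
w = [10, 20, 30]

squares  = v .^ 2            # [1, 4, 9]
products = v .* w            # [10, 40, 90], element by element
shifted  = v .+ 1            # [2, 3, 4], the scalar is applied to every element
roots    = sqrt.([1, 4, 9])  # [1.0, 2.0, 3.0], any function can be broadcast with a dot

# Without the dot, * between two vectors is not defined:
# v * w   # would throw a MethodError
```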

Basic collections

Tuples, named tuples, and dictionaries

  • Tuples: ordered, immutable collections of elements
  • NamedTuples: exactly like tuples, but each element also has a name
  • Dictionaries: unordered, mutable collections of key-value pairs
In[]:
# Tuple 
favoritesongs = ("outnumbered", "Power Over Me", "Bad Habits") # elements have the same type

# Tuple
favoritethings= ("yellow", 'j', pi) # elements with different types

# NamedTuple
favoritesongs_named = (a = "outnumbered", b = "Power Over Me", c = "Bad Habits") # it is between a Tuple and Dictionary

# Dictionary
myDict = Dict("name" => "Jaafar", "age" => "twenty", "hobby"=> "biking")

Out[]:
Dict{String, String} with 3 entries:
  "name"  => "Jaafar"
  "hobby" => "biking"
  "age"   => "twenty"
In[]:
# Tuple access
favoritesongs[1] # indexing starts by 1 not 0 (unlike Python)

# NamedTuple access
favoritesongs_named[1] # accessed by index
favoritesongs_named.a # accessed by key

# Dictionary access
myDict["name"] # use the key to look up its value in the dictionary 

Out[]:
"Jaafar"

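The difference between immutable and mutable collections is easiest to see by trying to change them. The sketch below uses made-up values; `point`, `person`, and `stock` are illustrative names.

```julia
point  = (3, 4)                              # tuple
person = (name = "Jaafar", age = 20)         # named tuple; the values are illustrative
stock  = Dict("bolts" => 120, "nuts" => 80)  # hypothetical inventory

# Tuples are immutable: assigning to an element raises an error
tuple_is_immutable = try
    point[1] = 10
    false
catch
    true
end

# Dictionaries are mutable: entries can be added, updated, and removed
stock["washers"] = 45
stock["bolts"] += 30
delete!(stock, "nuts")

haskey(stock, "nuts")     # false
get(stock, "screws", 0)   # 0, a default for a missing key
```
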
Vectors, arrays, and matrices

Like any language for numerical computation, Julia provides an easy way to handle matrices and their corresponding operations. Unlike Matlab, which uses parentheses, Julia indexes arrays with square brackets, A[i,j]. As in Matlab, however, indexing starts at one, not zero, which is convenient in loops: the i-th element is accessed as f[i] rather than f[i-1].

  • array: ordered and mutable collection of items of the same type
  • vector: array of dimension one
  • matrix: array of dimension two
  • tensor: array of n dimensions (usually three or more)
In[]:
#vector
array_V = [1, 2, 3, 4, 5, 6, 7, 8, 9] # acts as a one-dimensional vector
typeof(array_V)

Out[]:
Vector{Int64} (alias for Array{Int64, 1})
In[]:
#Matrix
array_M = [1 2 3; 4 5 6; 7 8 9] # acts as a two-dimensional matrix

Out[]:
3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9
In[]:
# Random vector
vec = rand(9)

Out[]:
9-element Vector{Float64}:
 0.7130265942088201
 0.9545688377050932
 0.7878361868436774
 0.4973658015754845
 0.44265779030703434
 0.01870528656705095
 0.010563833645745424
 0.8906392694739755
 0.5416448302194592
In[]:
# Random matrix 
mat = rand(3,3)

Out[]:
3×3 Matrix{Float64}:
 0.412231  0.0180507  0.862113
 0.534452  0.711949   0.541887
 0.52126   0.894952   0.443401
In[]:
# Random tensor
ten = rand(3,3,3) # three-dimensional 

Out[]:
3×3×3 Array{Float64, 3}:
[:, :, 1] =
 0.517095    0.976259  0.114393
 0.00295048  0.759259  0.302369
 0.988611    0.688391  0.438473

[:, :, 2] =
 0.163933  0.138108  0.770564
 0.899507  0.109004  0.577751
 0.63999   0.280642  0.751499

[:, :, 3] =
 0.361409  0.575224  0.525733
 0.858351  0.586987  0.638436
 0.101579  0.447222  0.364909

It is important to note that ranges act like vectors, while being more compact to write. The syntax is as follows: range_name = start:step:end

In[]:
# Range
r = 1:1:9 

Out[]:
1:1:9
In[]:
collect(r) # transform the range into a vector (more readable output)

Out[]:
9-element Vector{Int64}:
 1
 2
 3
 4
 5
 6
 7
 8
 9
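
Ranges are not limited to integer steps of one. A few variations, as a minimal sketch with illustrative names:

```julia
r = 0:0.25:1                     # start:step:end with a fractional step
n = length(r)                    # 5
values = collect(r)              # [0.0, 0.25, 0.5, 0.75, 1.0]

evens = 2:2:10                   # 2, 4, 6, 8, 10
countdown = 5:-1:1               # a negative step counts down
grid = range(0, 1, length = 5)   # give the number of points instead of the step

total = sum(1:100)               # 5050, ranges work wherever a vector is expected
```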

More on indices and ranges in Julia

Working with matrices and arrays in general requires a good command of indexing and slicing operations in Julia. Ranges make this easier because they are compact to write.

In[]:
# Define an array
name_letters = ['j','a','a','f','a','r']

# Index the array
name_letters[1] # returns j

# slice the array using a range: start:end
name_letters[1:3] # returns jaa

# slice the array using a range with step of 3: start:step:end
name_letters[1:3:6] # returns jf

Out[]:
2-element Vector{Char}:
 'j': ASCII/Unicode U+006A (category Ll: Letter, lowercase)
 'f': ASCII/Unicode U+0066 (category Ll: Letter, lowercase)
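
The same indexing rules carry over to matrices, where a colon selects a whole row or column. A minimal sketch, with `A` and `v` as illustrative names:

```julia
A = [1 2 3; 4 5 6; 7 8 9]

A[2, 3]        # 6, row 2 and column 3
A[2, :]        # [4, 5, 6], the whole second row
A[:, 1]        # [1, 4, 7], the whole first column
A[1:2, 2:3]    # the 2×2 block [2 3; 5 6]
A[end, end]    # 9, end refers to the last index along a dimension

v = [10, 20, 30, 40, 50]
v[end-1:end]   # [40, 50], the last two elements
v[1:2:end]     # [10, 30, 50], every second element
```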

Printing output in Julia

Although printing output is simple, it is important for tasks such as debugging. Print commands also help readers who are trying to understand the function and purpose of the code.

In[]:
# Print on new line
println("Jaafar") ;
println("Ballout");

# Print on the same line
print("Jaafar");
print(" Ballout");

Out[]:
Jaafar
Ballout
Jaafar Ballout
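
When debugging, it is often handy to print values inside a sentence. A minimal sketch using string interpolation, with illustrative variable names:

```julia
first_name = "Jaafar"
n = length(first_name)

# $name inserts a variable; $(...) inserts the result of an expression
message = "The name $first_name has $n characters, or $(2n) when doubled."
println(message)

# string() joins values into one string without interpolation
label = string("day ", 7)   # "day 7"
```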

Specifying loops and conditions in Julia

Both loops and conditions in Julia require an end command, unlike Python where indentation is enough to end the if-condition, loop, or function-definition.

In[]:
#If condition
if length("Jaafar") > length("Ballout")
    println("Your first name is bigger than your last name")
elseif length("Jaafar") == length("Ballout")
    println("First name and last name have same number of characters")
else
    println("Your first name is smaller than your last name")
end

Out[]:
Your first name is smaller than your last name

The code block below prints each character in my name into a new line.

In[]:
name_letters = ['j','a','a','f','a','r']
# For loop
for i in 1:length(name_letters)
    println(name_letters[i])
end

Out[]:
j
a
a
f
a
r

The code below finds the indices at which the character 'a' appears in my name.

In[]:
#If condition in For Loop
for i in 1:length(name_letters)
    if name_letters[i] == 'a'
        println(i)
    end
end

Out[]:
2
3
5
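
The same search can be written more compactly with built-in functions. This is an alternative sketch, not a replacement for the loop above:

```julia
name_letters = ['j', 'a', 'a', 'f', 'a', 'r']

# findall returns every index where the condition holds
positions = findall(==('a'), name_letters)   # [2, 3, 5]

# enumerate yields (index, value) pairs, which avoids manual indexing
for (i, letter) in enumerate(name_letters)
    letter == 'a' && println(i)
end

how_many = count(==('a'), name_letters)          # 3
first_match = findfirst(==('a'), name_letters)   # 2
```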

Defining functions in Julia

Functions (called routines or methods in some other languages) can be defined in Julia. The linear optimization model, below, is from my previous blog post. In the coding example that follows, I define its objective function as a function in Julia.

Here is how I can define the objective function in Julia:

In[]:
function objective(x,y) # named objective rather than max to avoid clashing with Julia's built-in max function
    return 5x + 4y # a numeric literal placed directly before a variable implies multiplication
end

Out[]:
objective (generic function with 1 method)
In[]:
# Optimal solution of the constrained problem above from previous post 
z = objective(3.75,1.25) # optimal value
print(z)

Out[]:
23.75
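
Julia also offers shorter ways to define functions. The sketch below reuses the same 5x + 4y expression; the names `obj`, `double`, `weighted`, `cx`, and `cy` are illustrative assumptions.

```julia
# Compact "assignment form" of a function with the same 5x + 4y expression
obj(x, y) = 5x + 4y

# An anonymous function, handy as an argument to other functions
double = v -> 2v

z = obj(3.75, 1.25)                # 23.75
doubled = map(double, [1, 2, 3])   # [2, 4, 6]

# Keyword arguments with default values; the names cx and cy are illustrative
weighted(x, y; cx = 5, cy = 4) = cx * x + cy * y
w = weighted(1, 1; cy = 10)        # 15
```

The assignment form suits one-line formulas such as an objective function, while the `function ... end` form reads better once the body grows beyond a single expression.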

Import files into Julia

Importing files is very important, especially in supply chain management and logistics problems. Because I might use .csv files in future posts, it is worth explaining how to import these files in Julia at this stage. I will be using Pandas.jl, which is a Julia interface to the excellent Pandas package in Python.

In[]:
using Pandas
In[]:
df_list = Pandas.read_csv("https://gist.githubusercontent.com/brooksandrew/e570c38bcc72a8d102422f2af836513b/raw/89c76b2563dbc0e88384719a35cba0dfc04cd522/edgelist_sleeping_giant.csv");

I will introduce useful packages like DataFrames.jl and PyPlot in Julia in future work. These packages are very useful for solving the network problems that arise in supply chain management and operations research.

The post Introduction to Julia appeared first on Supply Chain Data Analytics.