Author Archives: Julia Computing, Inc.

Julia Computing Wins RiskTech100® 2018 Rising Star Award

New York, NY – Julia Computing was selected by Chartis Research as a RiskTech Rising Star for 2018.

The RiskTech100® Rankings are acknowledged globally as the most comprehensive and independent study of the world’s major players in risk and compliance technology. Based on nine months of detailed analysis by Chartis Research, the RiskTech100® Rankings assess the market effectiveness and performance of firms in this rapidly evolving space.

Rob Stubbs, Chartis Research Head of Research, explains, “We interviewed thousands of risk technology buyers, vendors, consultants and systems integrators to identify the leading RiskTech firms for 2018. We know that risk analysis, risk management and regulatory requirements are increasingly complex and require solutions that demand speed, performance and ease of use. Julia Computing has been developing next-generation solutions to meet many of these requirements.”

For example, Aviva, Britain’s second-largest insurer, selected Julia to achieve compliance with the European Union’s new Solvency II requirements. According to Tim Thornham, Aviva’s Director of Financial Modeling Solutions, “Solvency II compliant models in Julia are 1,000x faster than our legacy system, use 93% fewer lines of code and took 1/10 the time to implement.” Furthermore, the server cluster required to run Aviva’s risk model simulations shrank by 95%, from 100 servers to 5. Simpler code not only saves programming, testing and execution time and reduces mistakes, but also makes the code more transparent and readable for regulators and easier to update, maintain, analyze and check for errors.

About Julia and Julia Computing

Julia is the fastest high-performance open source computing language for data, analytics, algorithmic trading, machine learning, artificial intelligence, and many other domains. Julia solves the two-language problem by combining the ease of use of Python and R with the speed of C++. Julia provides parallel computing capabilities out of the box and unlimited scalability with minimal effort. For example, Julia has run at petascale on 650,000 cores with 1.3 million threads to analyze over 56 terabytes of data using Cori, the world’s sixth-largest supercomputer. With more than 1.2 million downloads and 161% annual growth, Julia is one of the top programming languages developed on GitHub. Julia adoption is growing rapidly in finance, insurance, machine learning, energy, robotics, genomics, aerospace, medicine and many other fields.

Julia Computing was founded in 2015 by all the creators of Julia to develop products and provide professional services to businesses and researchers using Julia. Julia Computing offers the following products:

- JuliaPro for data science professionals and researchers to install and run Julia with more than one hundred carefully curated popular Julia packages on a laptop or desktop computer.
- JuliaRun for deploying Julia at scale on dozens, hundreds or thousands of nodes in the public or private cloud, including AWS and Microsoft Azure.
- JuliaFin for financial modeling, algorithmic trading and risk analysis including Bloomberg and Excel integration, Miletus for designing and executing trading strategies and advanced time-series analytics.
- JuliaDB for in-database in-memory analytics and advanced time-series analysis.
- JuliaBox for students or new Julia users to experience Julia in a Jupyter notebook right from a Web browser with no download or installation required.

To learn more about how Julia users deploy these products to solve problems, please visit the Case Studies section on the Julia Computing website.

Julia users, partners and employers hiring Julia programmers in 2017 include Amazon, Apple, BlackRock, Capital One, Citibank, Comcast, Disney, Facebook, Ford, Google, IBM, Intel, KPMG, Microsoft, NASA, Oracle, PwC, Uber, and many more.

About Chartis Research

Chartis Research is the leading provider of research and analysis on the global market for risk technology. It is part of Infopro Digital, which owns market-leading brands such as Risk and WatersTechnology. Chartis’ goal is to support enterprises as they drive business performance through improved risk management, corporate governance and compliance, and to help clients make informed technology and business decisions by providing in-depth analysis and actionable advice on virtually all aspects of risk technology.

Big Data Analytics with OnlineStats.jl

OnlineStats is a package for computing statistics and models via online algorithms. It is designed for taking on big data and can naturally handle out-of-core processing, parallel/distributed computing, and streaming data. JuliaDB fully integrates OnlineStats to provide analytics on large persistent datasets. While future posts will dive into this integration, this post serves as a light introduction to OnlineStats.

What are Online Algorithms?

Online algorithms accept input one observation at a time. Consider a mean of n data points:

$\theta^{(n)} = \frac{1}{n}\sum_{i=1}^n x_i.$

When a single observation is added, the mean could be recalculated from scratch (offline):

$\theta^{(n+1)} = \frac{1}{n+1}\sum_{i=1}^{n+1} x_i.$

Or we could use only the current estimate and the new observation (online):

$\theta^{(n+1)} = \left(1 - \frac{1}{n+1}\right)\theta^{(n)} + \frac{1}{n+1}x_{n+1}$
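
The online form follows from the offline one by separating the newest observation from the sum and using $\sum_{i=1}^n x_i = n\,\theta^{(n)}$:

$\theta^{(n+1)} = \frac{1}{n+1}\left(\sum_{i=1}^{n} x_i + x_{n+1}\right) = \frac{n}{n+1}\theta^{(n)} + \frac{1}{n+1}x_{n+1} = \left(1 - \frac{1}{n+1}\right)\theta^{(n)} + \frac{1}{n+1}x_{n+1}.$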

A big advantage of online algorithms is that data does not need to be revisited when new observations are added. It is therefore not necessary for the dataset to be fixed in size or small enough to fit in computer memory. The disadvantage is that not every statistic can be computed exactly, as the mean above can. Whenever an exact solution is impossible, OnlineStats relies on state-of-the-art stochastic approximation algorithms.
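
To make the online update concrete, here is a minimal sketch in plain Julia that uses no packages. The names RunningMean and update! are made up for this illustration and are not part of OnlineStats. The sketch keeps only the current estimate and the observation count, and never stores the data.

```julia
# Running mean that stores only the current estimate and the observation count.
# RunningMean and update! are hypothetical names for this sketch, not OnlineStats API.
mutable struct RunningMean
    value::Float64   # current estimate θ⁽ⁿ⁾
    n::Int           # number of observations seen so far
end

RunningMean() = RunningMean(0.0, 0)

# Apply θ⁽ⁿ⁺¹⁾ = (1 - γ) θ⁽ⁿ⁾ + γ x with γ = 1/(n+1).
function update!(m::RunningMean, x::Real)
    m.n += 1
    γ = 1 / m.n
    m.value = (1 - γ) * m.value + γ * x
    return m
end

data = [2.0, 4.0, 6.0, 8.0]
m = RunningMean()
for x in data
    update!(m, x)
end

offline = sum(data) / length(data)   # recomputed from scratch for comparison
println(m.value, " vs ", offline)
```

Both numbers should agree up to floating-point rounding, even though the online version only ever sees one observation at a time.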

OnlineStats Basics

The statistics/models of OnlineStats are subtypes of OnlineStat:

using OnlineStats, Plots

# Each OnlineStat is a type
o = IHistogram(100)  
o2 = Sum()

# OnlineStats are grouped together in a Series
s = Series(o, o2)

# Updating the Series updates the grouped OnlineStats
y = randexp(100_000)

# fit!(s, y) translates to:
for yi in y
    fit!(s, yi)
end

plot(o)

Working with Series of Different Inputs

A Series groups together any number of OnlineStats which share a common input. The input (single observation) of an OnlineStat can be a scalar (e.g. Variance), a vector (e.g. CovMatrix), or a vector/scalar pair (e.g. LinReg).

The Series constructor optionally accepts data to fit! right away.

Scalar-input Series

julia> Series(randn(100), Mean(), Variance())
▦ Series{0} with EqualWeight
  ├── nobs = 100
  ├── Mean(0.0899071)
  └── Variance(0.952008)

Vector-input Series
- The MV type can turn a scalar-input OnlineStat into a vector-input version.

julia> Series(randn(100, 2), CovMatrix(2), MV(2, Mean()))
▦ Series{1} with EqualWeight
  ├── nobs = 100
  ├── CovMatrix([0.916472 0.089655; 0.089655 0.984442])
  └── MV{Mean}(0.17287277199330608, -0.12199728546589127)

Vector/Scalar-input Series
- The Vector holds predictor variables and the Scalar is a response.

julia> Series((randn(100, 3), randn(100)), LinReg(3))
▦ Series{(1, 0)} with EqualWeight
  ├── nobs = 100
  └── LinReg: β(0.0) = [-0.0486756 -0.0437766 -0.160813]

Working with Series and Individual OnlineStats

value returns the value of a single stat:

julia> o = Mean()
Mean(0.0)

julia> value(o)
0.0

value on a Series maps value over its stats and returns a tuple:

julia> s = Series(Mean(), Variance())
▦ Series{0} with EqualWeight
  ├── nobs = 0
  ├── Mean(0.0)
  └── Variance(-0.0)

julia> value(s)
(0.0, -0.0)

stats returns a tuple of the stats in a Series:

julia> m, v = stats(s)
(Mean(0.0), Variance(-0.0))

(Embarrassingly) Parallel Computation

At first glance, it appears that a Series must be fitted serially, but OnlineStats provides merge/merge! methods for combining two Series into one. This is how JuliaDB is able to use OnlineStats in a distributed fashion. Below is a simple (not actually parallel) example of merging.

s1 = Series(Mean(), Variance())
s2 = Series(Mean(), Variance())
s3 = Series(Mean(), Variance())

fit!(s1, randn(1000))
fit!(s2, randn(1000))
fit!(s3, randn(1000))

merge!(s1, s2)
merge!(s1, s3)
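
To see why merging can work without revisiting data, here is a minimal sketch in plain Julia of combining the mean and variance summaries of two chunks. It uses a standard pairwise-combination formula and is not taken from the OnlineStats source. The names Moments, summarize, combine and samplevar are hypothetical.

```julia
# Summary of a data chunk: count, mean, and sum of squared deviations from the mean.
# Moments, summarize, combine and samplevar are hypothetical names, not OnlineStats API.
struct Moments
    n::Int
    mean::Float64
    m2::Float64
end

# Summarize one chunk on its own (the step that could run on separate workers).
function summarize(x::AbstractVector{<:Real})
    n = length(x)
    μ = sum(x) / n
    return Moments(n, μ, sum(xi -> (xi - μ)^2, x))
end

# Combine two summaries without touching the underlying data again.
function combine(a::Moments, b::Moments)
    n = a.n + b.n
    δ = b.mean - a.mean
    μ = a.mean + δ * b.n / n
    m2 = a.m2 + b.m2 + δ^2 * a.n * b.n / n
    return Moments(n, μ, m2)
end

# Unbiased sample variance from a summary.
samplevar(s::Moments) = s.m2 / (s.n - 1)

chunk1 = [1.0, 2.0, 3.0]
chunk2 = [4.0, 5.0, 6.0, 7.0]

merged = combine(summarize(chunk1), summarize(chunk2))
whole = summarize(vcat(chunk1, chunk2))

println((merged.mean, samplevar(merged)))
println((whole.mean, samplevar(whole)))
```

The merged summary should match the one computed over all the data at once, up to floating-point rounding. This is the property that lets each worker summarize its own chunk independently.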

Resources

This is a small sample of OnlineStats functionality. For more information, stay tuned for future posts or check out the OnlineStats GitHub repo and documentation.

Julia Featured in insideHPC’s “AI-HPC Is Happening Now” White Paper

Julia and Julia Computing are featured in a new insideHPC white paper titled “AI-HPC Is Happening Now.”

insideHPC is a leading blog in the high-performance computing (HPC) community.

The article notes that “Julia … recently delivered a peak performance of 1.54 petaflops using 1.3 million threads on 9,300 Intel Xeon Phi processor nodes of the Cori supercomputer at NERSC. The Celeste project utilized a code written entirely in Julia that processed approximately 178 terabytes of celestial image data and produced estimates for 188 million stars and galaxies in 14.6 minutes.”

Julia Computing CTO (Tools) Keno Fischer explains, “We used Julia on the world’s sixth most powerful supercomputer to achieve a performance improvement of 1,000x over unoptimized single core execution. We have demonstrated that Julia scales effectively and efficiently from a single laptop or desktop to dozens or hundreds of nodes in the cloud and multithreaded parallel supercomputing at petascale. Julia has been downloaded more than 1.2 million times, an annual increase of +161%. Julia is also helping quantitative finance analysts on Wall Street and rocket scientists at NASA’s Jet Propulsion Laboratory achieve faster computing speeds with higher productivity.”

About Julia and Julia Computing

Julia is the fastest modern high-performance open source computing language for data, analytics, algorithmic trading, machine learning and artificial intelligence. Julia combines the functionality and ease of use of Python, R, Matlab, SAS and Stata with the speed of C++ and Java. Julia delivers dramatic improvements in simplicity, speed, capacity and productivity. Julia provides parallel computing capabilities out of the box and unlimited scalability with minimal effort. With more than 1.2 million downloads and 161% annual growth, Julia is one of the top programming languages developed on GitHub, and adoption is growing rapidly in finance, insurance, energy, robotics, genomics, aerospace and many other fields.

Julia users, partners and employers hiring Julia programmers in 2017 include Amazon, Apple, BlackRock, Capital One, Citibank, Comcast, Disney, Facebook, Ford, Google, Grindr, IBM, Intel, KPMG, Microsoft, NASA, Oracle, PwC and Uber.

- Julia is lightning fast. Julia is being used in production today and has generated speed improvements up to 1,000x for insurance model estimation and parallel supercomputing astronomical image analysis.
- Julia provides unlimited scalability. Julia applications can be deployed on large clusters with a click of a button and can run parallel and distributed computing quickly and easily on tens of thousands of nodes.
- Julia is easy to learn. Julia’s flexible syntax is familiar and comfortable for users of Python, R and Matlab.
- Julia integrates well with existing code and platforms. Users of C, C++, Python, R and other languages can easily integrate their existing code into Julia.
- Elegant code. Julia was built from the ground up for mathematical, scientific and statistical computing. It has advanced libraries that make programming simple and fast and dramatically reduce the number of lines of code required – in some cases, by 90% or more.
- Julia solves the two language problem. Because Julia combines the ease of use and familiar syntax of Python, R and Matlab with the speed of C, C++ or Java, programmers no longer need to estimate models in one language and reproduce them in a faster production language. This saves time and reduces error and cost.

Julia Computing was founded in 2015 by the creators of the open source Julia language to develop products and provide support for businesses and researchers who use Julia.

juliabloggers.com

A Julia Language Blog Aggregator
