I just released Query.jl v0.9.0. The new version adds the @take and @drop
standalone query operators and brings pretty printing to uncollected
queries.
Pretty printing
In previous versions, displaying a query in the REPL printed a mess of
internal data. In practice one always had to collect a query into
something like a DataFrame to get a readable view of the query result.
The new version changes that and prints readable output for any query,
even an uncollected one. Here is an example:
julia> using FileIO, Query, CSVFiles
julia> filename = "https://gist.githubusercontent.com/davidanthoff/bebfd24c1a3f32f576eb61bee77f5944/raw/dd9233ad860037a2155f3a9ca3c37eb2d5572573/testdata2.csv";
julia> load(filename) |> @map({_.Year, _.Cause_Name})
15028x2 query result
Year │ Cause_Name
─────┼───────────────────────
1999 │ Unintentional Injuries
1999 │ Unintentional Injuries
1999 │ Unintentional Injuries
1999 │ Unintentional Injuries
1999 │ Unintentional Injuries
1999 │ Unintentional Injuries
1999 │ Unintentional Injuries
1999 │ Unintentional Injuries
1999 │ Unintentional Injuries
1999 │ Unintentional Injuries
... with 15018 more rows
The pretty printing should work for the values returned from any of the
query operators. The output format is heavily inspired by R’s tibbles.
I hope this will make interactive work much more pleasant because it
should be easier to build up more complicated queries step by step, while
periodically running a query to check intermediate results.
I also plan to add this to the whole tabular file IO of the iterable
tables ecosystem at a later date (e.g. CSVFiles.jl, FeatherFiles.jl,
ExcelFiles.jl, StatFiles.jl etc.).
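The step-by-step workflow mentioned above could look roughly like the following sketch. The DataFrame and its column names are made up for illustration, and the sketch assumes DataFrames.jl is installed alongside Query.jl. Evaluating `older` or `older_names` at the REPL without collecting should show the pretty printed view of that intermediate result.

```julia
using Query, DataFrames

# Hypothetical sample data, not taken from the post.
df = DataFrame(name=["John", "Sally", "Kirk"], age=[23., 42., 59.])

# Step 1: build the first stage and inspect it at the REPL.
older = df |> @filter(_.age > 30)

# Step 2: extend the uncollected query with another operator.
older_names = older |> @map({_.name})

# Collect only once the query does what you want.
result = older_names |> DataFrame
```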
The @take and @drop query commands
Those are fairly straightforward: both restrict which elements of a
sequence are passed on. @take(n) keeps at most the first n elements, and
@drop(n) skips the first n elements. Here is an example of how one can
use these:
using FileIO, Query, CSVFiles
filename = "https://gist.githubusercontent.com/davidanthoff/bebfd24c1a3f32f576eb61bee77f5944/raw/dd9233ad860037a2155f3a9ca3c37eb2d5572573/testdata2.csv"
load(filename) |>
@filter(_.Cause_Name!="All Causes" && !isnull(_.Age_adjusted_Death_Rate)) |>
@groupby(_.Cause_Name) |>
@map({cause=_.key, death_rate=sum(_..Age_adjusted_Death_Rate)}) |>
@orderby_descending(_.death_rate) |>
@drop(2) |>
@take(3) |>
save("output.feather")
This example showcases a whole range of features, including the use of
the @drop and @take operations. The official documentation for these two
new operators is in the “Experimental Features” section in the Query.jl
documentation.
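To see the two operators in isolation, here is a minimal sketch on a plain array instead of a file. The array and the variable names are only illustrative. Combining @drop with @take in this order gives a simple form of paging.

```julia
using Query

# Illustrative source; an ordinary array is used here as the sequence.
source = [1, 2, 3, 4, 5]

# Keep at most the first three elements.
first_three = source |> @take(3) |> collect

# Skip the first three elements and keep the rest.
rest = source |> @drop(3) |> collect

# Skip one element, then keep at most two of what remains.
page = source |> @drop(1) |> @take(2) |> collect

println(first_three)  # [1, 2, 3]
println(rest)         # [4, 5]
println(page)         # [2, 3]
```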
Any feedback on these new features (and old ones) is most welcome, and of
course any help with the overall package would also be fantastic!
This post is being discussed here.