Releases: medialab/xan
v0.56.0
Features
- Adding `xan bisect`.
- Adding `xan flatten -N/--non-empty`.
- Adding the `soundex`, `refined_soundex` & `phonogram` moonblade functions for phonetic encoding.
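
For readers unfamiliar with phonetic encoding, here is a minimal Python sketch of the classic American Soundex scheme. It is an illustration only: the exact rules used by xan's `soundex` function may differ, and `refined_soundex` and `phonogram` are not covered here.

```python
# Classic American Soundex, as an illustrative sketch.
# Assumption: xan's `soundex` may not follow these exact rules.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for letter in letters:
        CODES[letter] = digit


def soundex(name: str) -> str:
    # Keep ASCII letters only; accented letters are dropped in this sketch.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    encoded = letters[0]
    previous = CODES.get(letters[0], "")

    for letter in letters[1:]:
        if letter in "HW":
            # H and W are skipped and do not separate identical codes.
            continue
        code = CODES.get(letter, "")
        if code and code != previous:
            encoded += code
        # Vowels have no code, so they reset `previous` and allow repeats.
        previous = code

    # Pad with zeros and keep one letter plus three digits.
    return (encoded + "000")[:4]
```

With this sketch, names that sound alike such as "Robert" and "Rupert" share the same code, which is what makes such encodings useful for fuzzy matching.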
Fixes
- Fixing `xan to (md|html) --no-headers`.
- Fixing `xan plot -R/--regression-line`.
Quality of Life
- Adding `xan to markdown` as an alias for `xan to md`.
- `xan flatten` & `xan view` will stop masquerading trimmed empty cells as empty.
v0.55.0
Breaking
- Changing how `xan separate` generates default column names.
- `xan from -f=(json|ndjson|jsonl)` will now emit columns in input order by default.
- Changing `xan to -B/--buffer-size` to `--sample-size` to harmonize flag names with `xan from`.
Features
- Adding the `xan complete` command.
- Adding an optional unit to the `ceil`, `floor`, `round` & `trunc` moonblade functions. E.g. floor to nearest decade: `floor(year, 10)`.
- Adding `basename` & `dirname` moonblade functions.
- Adding the `parse_py_literal` moonblade function. Useful to deal with files dubiously serialized using `pandas`.
- Adding `xan view --repeat-headers=(auto|always|never)`.
- Adding `xan view --reveal-whitespace=(auto|always|never)`.
- Adding `--color` support to `XAN_VIEW_ARGS`.
- Adding `xan from -f json --sample-size -1` to sample the whole file.
- Adding `xan from -f json --single-object`.
- Adding `xan from --sort-keys`.
- Adding `xan to (json|ndjson|jsonl) --sample-size -1` to sample the whole file.
- Adding `xan to (json|ndjson|jsonl) --strings` flag.
- Adding `xan separate --prefix`.
- Adding `xan heatmap -C` short flag for `--cram`.
- Adding `xan heatmap --repeat-headers`.
- Adding `rank`, `cume_dist`, `percent_rank` and `ntile` window functions.
- Adding `xan help --color`.
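
The optional unit on the rounding functions can be pictured with the following Python sketch. The helper names `floor_to`, `ceil_to` and `trunc_to` are hypothetical and only mirror the idea behind `floor(year, 10)`. How moonblade treats negative or fractional units is not stated in these notes, so the behavior below is an assumption.

```python
import math


def floor_to(value, unit=1):
    # Round down to the nearest multiple of `unit`.
    return math.floor(value / unit) * unit


def ceil_to(value, unit=1):
    # Round up to the nearest multiple of `unit`.
    return math.ceil(value / unit) * unit


def trunc_to(value, unit=1):
    # Round toward zero to the nearest multiple of `unit`.
    return math.trunc(value / unit) * unit


decade = floor_to(1987, 10)  # the "floor to nearest decade" case
```

Note that flooring and truncating only differ for negative values: `floor_to(-1987, 10)` moves away from zero while `trunc_to(-1987, 10)` moves toward it.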
Fixes
- Fixing `xan select -ne` incorrectly emitting headers.
Quality of Life
- `xan view -p` no longer prints the bottom header by default.
- `xan view` no longer reveals problematic whitespace by default when the output is not colored.
- Better `xan hist` error messages and help.
- Testing more file name variants when searching for a `.gzi` index.
v0.54.1
Fixes
- Fixing `xan freq --groupby` incorrectly unescaping group cells.
- Fixing help related to `xan pivot` & `xan unpivot`.
- Upgrading `simd-csv` to get safety fixes.
v0.54.0
The SIMD update.
Breaking
- Bumping MSRV to `1.83.0`.
- Dropping `xan plot -Y/--add-series`. It is now possible to select multiple columns as `<y>` in `xan plot <x> <y>` instead.
- Dropping the `-C/--force-colors` flag in `flatten`, `heatmap`, `hist`, `plot` and `view` in favor of the more standardized and flexible `--color=(auto|never|always)` flag.
- `xan join` will now automatically drop joined columns from one of the files when it is obviously safe to do so.
- `xan behead` & `xan rename` do not normalize the output anymore to be as fast as possible.
- The new SIMD CSV parser might not deal with irregular CSV cases the same way `rust-csv` did. In any case, `xan input` will still continue to use `rust-csv`.
- `xan slice -B/--byte-offset` & `xan slice -A/--accumulate` are now mutually exclusive.
- `xan input` has been overhauled.
- Dropping `xan count --sample-size`.
- Overhauling `xan fixlengths` to accept streams by shifting default from double-pass read to buffering the whole stream into memory.
- `xan plot --x-scale log` & `--y-scale log` are now natural log. Use `log10` for the base 10 log as before.
- Dropping `xan reverse -m/--in-memory` flag. Behavior is now automatically detected.
- Dropping `xan shuffle -m/--in-memory` flag. Loading the file into memory is now the default. The `xan shuffle -e/--external` flag has been added if you want the old default behavior.
- `xan bins` now outputs `<empty>` values instead of `<nulls>`.
- Overhauling `xan bins`. The default is now to find nice boundaries for the bins. Use `-e/--exact` to revert to the old behavior. The default number of bins is now `10`, and won't use Freedman-Diaconis rule by default. A `-H/--heuristic` flag has been added if you want to automatically select a suitable number of bins.
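
As a reminder of what the Freedman-Diaconis rule mentioned above computes, here is a short Python sketch. The rule sets the bin width to `2 * IQR / n^(1/3)` and derives the number of bins from the data range. The function name and the fallbacks for degenerate inputs are assumptions made for this illustration, and the quartile method may not match the one used by `xan bins`.

```python
import math
import statistics


def freedman_diaconis_bins(values):
    """Suggest a number of bins using the Freedman-Diaconis rule."""
    n = len(values)
    if n < 2:
        return 1  # assumption: fall back to a single bin

    # Quartiles using Python's default ("exclusive") method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lowest, highest = min(values), max(values)

    if iqr == 0 or highest == lowest:
        return 1  # assumption: degenerate spread yields a single bin

    width = 2 * iqr / n ** (1 / 3)
    return max(1, math.ceil((highest - lowest) / width))
```

Because the bin count depends on the spread and size of the data, it can vary a lot between files, which is one plausible reason to prefer a fixed default of `10` and keep the heuristic opt-in.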
Features
- Adding `xan flatten -F/--flatter`.
- `xan pivot` can now target multiple columns.
- Adding the `xan grep` command for fast but coarse filtering.
- Adding `xan search -f/--flag`.
- Adding `xan map -F/--filter`.
- `xan search -B/--breakdown` now consolidates the results when multiple patterns have the same name.
- Adding `xan flatten --row-separator`.
- Adding `xan flatten --csv`.
- Adding `xan headers --color`.
- Adding the `xan join <columns> <input1> <input2>` arity as a convenience when joined column names are the same in both inputs.
- Adding `xan join -D/--drop-key=(none|both|left|right)`.
- Adding `xan fuzzy-join -D/--drop-key=(none|both|left|right)`.
- Adding `xan plot -A/--aggregate`.
- Adding support for plural selection clauses in both `xan select -e` & `xan map`, e.g. `xan map 'full_name.split(" ") as (first_name, last_name)'`.
- Adding `xan search -P/--add-pattern`.
- Adding `xan groupby -M/--along-matrix`.
- Adding `xan groupby -T/--total`.
- Adding support for `.ndjson` & `.jsonl` files. Those are considered as headless TSV files with null byte quoting so you can easily use them with `xan` commands.
- Adding out-of-the-box support for `.vcf`, `.sam`, `.bed`, `.gtf` & `.gff2` files.
- Adding a `xan cat cols` alias to `xan cat columns`.
- Adding `zstd` support.
- Adding `earliest` & `latest` moonblade functions.
- Adding `xan dedup -f/--flag`.
- Adding `-k` short flag for `xan dedup --keep-duplicates`, and `-C` short flag for `xan dedup --choose`.
- Adding `xan fixlengths -H/--trust-header`.
- Adding `xan separate`.
- Adding full log scale support to `xan plot`.
- Adding `xan hist --scale`.
- `xan window` is now able to run total aggregations.
- Adding `thousands_sep`, `comma` and `significance` kwargs to `numfmt` moonblade function.
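
The plural selection clause listed above maps one list-valued expression to several new columns at once. The Python sketch below mimics that idea on rows represented as dictionaries. The helper `split_into_columns` is hypothetical, and padding or truncating when the number of parts does not match the number of target columns is an assumption, since these notes do not say how xan handles that case.

```python
def split_into_columns(rows, source, targets, sep=" "):
    """Yield rows extended with one new column per split part."""
    for row in rows:
        parts = row[source].split(sep)
        # Assumption: pad with empty strings or truncate to fit `targets`.
        parts = (parts + [""] * len(targets))[: len(targets)]
        yield {**row, **dict(zip(targets, parts))}


people = [{"full_name": "Ada Lovelace"}, {"full_name": "Plato"}]
result = list(
    split_into_columns(people, "full_name", ("first_name", "last_name"))
)
```

Each output row keeps the original `full_name` column and gains `first_name` and `last_name`, which roughly corresponds to what `full_name.split(" ") as (first_name, last_name)` expresses.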
Fixes
- Fixing `xan dedup --check` bug where the first record was ignored.
- Fixing `xan hist -D` when the same date is found multiple times.
- Fixing `xan from -f xls` datetime conversion.
- Fixing `xan flatten` & `xan view` when column names contain line breaks.
- Fixing invalid argument parsing error being printed to stdout instead of stderr.
- Fixing `xan progress` SIGINT corrupting output.
- Fixing `xan enum -A/--accumulate`.
- Fixing `xan from -f tar` when tarball archive is not gzipped.
- Fixing `min` & `max` moonblade functions when passing a list of numbers.
- Fixing `xan flatten -H` edge cases.
- Fixing commands requiring seekable streams accepting unindexed compressed files by error.
- Fixing `xan plot --count --y-scale log`.
Performance
- Wildly improving performance of most `xan` commands by leveraging a novel SIMD CSV parser/writer.
- Improving performance of `xan from -f txt` & `xan from -f npy`.
- Improving memory footprint of hash-based commands (e.g. `frequency`, `groupby`, `dedup` etc.).
- Improving performance of `xan progress`, `xan range`, `xan enum`, `xan behead`, `xan rename`.
Quality of Life
- `xan parallel cat` now flushes more consistently.
- Better highlighting of problematic strings in `xan flatten`, `xan view` & `xan headers`.
- `xan parallel` will now generally stop as soon as an error is detected in a subprocess and cleanly report errors.
- Better argv parsing error UX in general.
- The `-p` flag will now avoid going further than 16 threads to avoid issues on servers with many CPUs, where hogging resources is a problem and where using too many threads at once could hurt performance. The `-t` flag remains available to tweak the number of threads.
- `xan hist` will now dim bars having a `0` count so you can easily distinguish them from non-empty bars.
v0.54.0-rc.4
Bump 0.54.0-rc.4
v0.54.0-rc.3
Bump 0.54.0-rc.3
v0.54.0-rc.2
Bump 0.54.0-rc.2
v0.54.0-rc.1
Bump 0.54.0-rc.1
v0.53.0
Breaking
- `xan partition` now normalizes filenames to lowercase to correctly deal with case-insensitive filesystems.
- `xan partition` also gets a related `-C/--case-sensitive` flag.
Features
- Adding
allandanymoonblade higher-order functions. - Allowing moonblade
printffunction to be called with lists. - Adding
-f/--evaluate-fileflag tomap,filter,flatmap&transformcommands. - Adding
xan map -O/--overwrite.
Fixes
- Fixing `xan top -T/--ties` edge case.
- Fixing broken pipe panics for some commands.
- Dropping remnant `dbg!` macro when reading files in reverse.
Performance
- Using `jemallocator` for musl builds.
Quality of Life
- Better moonblade `printf` function error messages.
v0.53.0-rc.1
Bump 0.53.0-rc.1