Skip to content

Delimiter-separated values (DSV) format parser for GNU Guile.

License

Notifications You must be signed in to change notification settings

artyom-poptsov/guile-dsv

Repository files navigation

Guile-DSV

https://github.com/artyom-poptsov/guile-dsv/actions/workflows/guile2.2.yml/badge.svg https://github.com/artyom-poptsov/guile-dsv/actions/workflows/guile3.0.yml/badge.svg https://github.com/artyom-poptsov/guile-dsv/actions/workflows/guix.yml/badge.svg

Guile-DSV is a GNU Guile module for working with the delimiter-separated values (DSV) data format.

Guile-DSV supports the Unix-style DSV format and RFC 4180 format.

Also Guile-DSV ships with a program named dsv (source code is here: utils/dsv.in) that allows to read and process DSV format (including delimiter change and conversion from one standard to another.)

Note that if you want to use Guile-DSV from an environment where syslog is unavailable, then you must set the log-driver option for dsv->scm to “file” or “none” to prevent it from trying to log messages to the syslog. See the Texinfo documentation for details.

Requirements

Build dependencies

  • Texinfo (contains makeinfo tool that is required for making the documentation in Texinfo format)
  • Texlive (also is needed for documentation.)
  • help2man

Installation

GNU Guix

$ guix install guile-dsv

Manual

$ git clone https://github.com/artyom-poptsov/guile-dsv.git
$ cd guile-dsv
$ autoreconf -vif
$ ./configure --prefix=/usr
$ make -j$(nproc)
$ sudo make install

For a basic explanation of the installation of the package, see the INSTALL file.

Please note that you will need Automake 1.12 or later to run self-tests with make check (but the library itself can be built with older Automake version such as 1.11).

important You probably want to call configure with the --with-guilesitedir option so that this package is installed in Guile’s default path. But, if you don’t know where your Guile site directory is, run configure without the option, and it will give you a suggestion.

dsv tool

Options

$ dsv --help
Usage: dsv [options] [file]

The default behavior of the program is to print a formatted table from a
<file> to stdout.  The options listed below can be used to change or modify
this behavior.

When no <file> is provided, dsv reads data from stdin.

Options:
  --help, -h                 Print this message and exit.
  --summary, -s              Print summary information for a file.
  --delimiter, -D <delim>    Set a delimiter.
  --guess-delimiter, -d      Guess a file delimiter and print the result.
  --number, -n               Number rows and columns.
  --width, -w <width>        Wrap long lines of text inside cells to fit the table
                             into the specified width.  If with is specified as
                             "auto" (default value) then current terminal width
                             is used.
                             When the required width is too small for the table
                             wrapping, an error will be issued.
                             Zero width means no wrapping so the table might not
                             fit into the screen.
  --map-cell, -m <code>      Apply an arbitrary Scheme code on each cell value
                             before printing.
                             There are three variables that can be used in the code:
                             - $value -- current cell value.
                             - $row   -- current row number
                             - $col   -- current column number.

                             Code examples:
                             '(if (> $value 0) $value 0)'
                             '(string-append "\"" $value "\"")'

                             Note that the code must return a string, that in turn
                             will be printed in a cell.

  --filter-row, -f <code>    Keep only rows for which CODE returns #t.
                             There are two variables that can be used in the code:
                             - $value -- current row content.
                             - $row   -- current row number.

                             For example with this code Guile-DSV keeps only rows
                             that are 5 columns in length:
                             '(= (length $value) 5)'

  --filter-column, -c <procedure>
                             Keep only columns for which PROCEDURE returns #t.
                             There are two variables that can be used in the code:
                             - $value -- current column content as a list.
                             - $row   -- current column number.

                             For example with this code Guile-DSV keeps only the 2nd
                             column from the input data:
                              '(= $col 2)'

  --file-format, -F <fmt>    Set a file format.  Possible formats are:
                             "unix" (default), "rfc4180"
  --with-header, -H          Use the first row of a table as a header when
                             printing the table to the screen.
  --table-borders, -b <spec> Set table borders for printing.  The value can be
                             either a borders specification or a preset name.

                             Spec can be a comma-separated list of key=value
                             pairs that specify the table style.  The list of
                             possible keys can be found below
                             (see "Table parameters".)

                             Also a table preset name can be used as the value.
                             See "Table presets" below.

                             Table preset parameters can be overridden by specifying
                             extra parameters after the preset name.  E.g.:
                               "graphic,bs=3;31"

                             Example values:
                               - "v=|,h=-,j=+"
                               - org

  --table-presets-path <path>
                             Set the table preset path.
                             This option can be also set by
                              "GUILE_DSV_TABLE_PRESETS_PATH" environment
                             variable.
                             Default value: /gnu/store/448pzfcwaaa8smrrdbn1shmk45s7agwx-guile-dsv-git/share/guile-dsv/presets/
  --to, -t <fmt>             Convert a file to a specified format, write
                             the result to stdout.
  --to-delimiter, -T <delim> Convert delimiters to the specified variant.
                             When this option is not used, default delimiters
                             for the chosen output format will be used.
  --version                  Print information about Guile-DSV version.
  --debug                    Enable state machine debugging.

Table parameters:
  bt   border-top                The top border.
  btl  border-top-left           The top left corner.
  btr  border-top-right          The top right corner.
  btj  border-top-joint          The top border joint.
  bl   border-left               The left table border.
  blj  border-left-joint         The left table border joint.
  br   border-right              The right table border.
  brj  border-right-joint        The right table border joint.
  bb   border-bottom             The bottom border.
  bbl  border-bottom-left        The left corner of the bottom border.
  bbr  border-bottom-right       The right corner of the bottom border.
  bbj  border-bottom-joint       The bottom border joint.
  bs   border-style              The style of the borders ("fg;bg".)
  ts   text-style                The text style ("fg;bg".)
  s    shadow                    The table shadow.
  so   shadow-offset             The table shadow offset in format "x;y" (e.g. "2;2".)
  ss   shadow-style              The style of the shadow ("fg;bg".)
  rs   row-separator             The table row separator.
  rj   row-joint                 The row joint.
  cs   column-separator          The table column separator
  hs   header-style              The header style ("fg;bg".)
  ht   header-top                The header top border.
  htl  header-top-left           The header top left border.
  htr  header-top-right          The header top right border.
  htj  header-top-joint          The header top joint.
  hl   header-left               The header left border.
  hr   header-right              The header right border.
  hcs  header-column-separator   The header column separator.
  hb   header-bottom             The header bottom border.
  hbl  header-bottom-left        The header bottom left corner.
  hbr  header-bottom-right       The header bottom right border.
  hbj  header-bottom-joint       The header bottom joint.

Table presets:
  ascii
  graphic-bold
  graphic-double
  graphic
  graphic-with-shadow
  markdown
  org

Print DSV files

To show DSV files (Unix-style) in human-readable manner, just invoke the tool like this:

$ head -4 /etc/passwd | dsv
 root    x  0  0  root    /root      /bin/bash         
 daemon  x  1  1  daemon  /usr/sbin  /usr/sbin/nologin 
 bin     x  2  2  bin     /bin       /usr/sbin/nologin 
 sys     x  3  3  sys     /dev       /usr/sbin/nologin

Show a DSV file as a fancy table with custom borders:

$ head -4 /etc/passwd | dsv -b "rs=-,cs=|,rj=+"
 root   | x | 0 | 0 | root   | /root     | /bin/bash         
--------+---+---+---+--------+-----------+-------------------
 daemon | x | 1 | 1 | daemon | /usr/sbin | /usr/sbin/nologin 
--------+---+---+---+--------+-----------+-------------------
 bin    | x | 2 | 2 | bin    | /bin      | /usr/sbin/nologin 
--------+---+---+---+--------+-----------+-------------------
 sys    | x | 3 | 3 | sys    | /dev      | /usr/sbin/nologin

The same output but with box-drawing characters:

$ head -4 /etc/passwd | dsv -b "rs=─,cs=│,rj=┼"
 root   │ x │ 0 │ 0 │ root   │ /root     │ /bin/bash         
────────┼───┼───┼───┼────────┼───────────┼───────────────────
 daemon │ x │ 1 │ 1 │ daemon │ /usr/sbin │ /usr/sbin/nologin 
────────┼───┼───┼───┼────────┼───────────┼───────────────────
 bin    │ x │ 2 │ 2 │ bin    │ /bin      │ /usr/sbin/nologin 
────────┼───┼───┼───┼────────┼───────────┼───────────────────
 sys    │ x │ 3 │ 3 │ sys    │ /dev      │ /usr/sbin/nologin

Table presets

There are table presets that can be used to draw tables with specified border styles. Some examples:

ascii

$ echo -e "a,b,c\na1,b1,c1\na2,b2,c2\n" | dsv -b "ascii"
.--------------.
| a  | b  | c  |
|----+----+----|
| a1 | b1 | c1 |
|----+----+----|
| a2 | b2 | c2 |
'--------------'

$ echo -e "a,b,c\na1,b1,c1\na2,b2,c2\n" | dsv -b "ascii" --with-header
.--------------.
| a  | b  | c  |
|====+====+====|
| a1 | b1 | c1 |
|----+----+----|
| a2 | b2 | c2 |
'--------------'

graphic

$ echo -e "a,b,c\na1,b1,c1\na2,b2,c2\n" | dsv -b "graphic"
┌────┬────┬────┐
│ a  │ b  │ c  │
├────┼────┼────┤
│ a1 │ b1 │ c1 │
├────┼────┼────┤
│ a2 │ b2 │ c2 │
└────┴────┴────┘
$ echo -e "a,b,c\na1,b1,c1\na2,b2,c2\n" | dsv -b "graphic-bold"
┏━━━━┳━━━━┳━━━━┓
┃ a  ┃ b  ┃ c  ┃
┣━━━━╋━━━━╋━━━━┫
┃ a1 ┃ b1 ┃ c1 ┃
┣━━━━╋━━━━╋━━━━┫
┃ a2 ┃ b2 ┃ c2 ┃
┗━━━━┻━━━━┻━━━━┛
$ echo -e "a,b,c\na1,b1,c1\na2,b2,c2\n" | dsv -b "graphic-double"
╔════╦════╦════╗
║ a  ║ b  ║ c  ║
╠════╬════╬════╣
║ a1 ║ b1 ║ c1 ║
╠════╬════╬════╣
║ a2 ║ b2 ║ c2 ║
╚════╩════╩════╝

org

This is the preset that allows to generate org-mode tables from CSV/DSV data.

$ echo -e "a,b,c\na1,b1,c1\na2,b2,c2\n" | dsv -b "org"
| a  | b  | c  |
| a1 | b1 | c1 |
| a2 | b2 | c2 |
$ echo -e "a,b,c\na1,b1,c1\na2,b2,c2\n" | dsv -b "org" --with-header
| a  | b  | c  |
|----+----+----|
| a1 | b1 | c1 |
| a2 | b2 | c2 |

Guessing the delimiter for a file

$ dsv -d /etc/passwd
:

Getting the summary for a CSV/DSV file

$ dsv -s /etc/passwd
File:      /etc/passwd
Format:    unix
Delimiter: ':' (0x3a)
Records:   50

column       width       
1            19          
2            1           
3            5           
4            5           
5            34          
6            26          
7            17

Converting files between formats

From Unix DSV to RFC4180:

$ dsv -t rfc4180 /etc/passwd | head -4
root,x,0,0,root,/root,/bin/bash
daemon,x,1,1,daemon,/usr/sbin,/usr/sbin/nologin
bin,x,2,2,bin,/bin,/usr/sbin/nologin
sys,x,3,3,sys,/dev,/usr/sbin/nologin

Convert delimiters:

$ dsv -t unix -T "|" /etc/passwd | head -4
root|x|0|0|root|/root|/bin/bash
daemon|x|1|1|daemon|/usr/sbin|/usr/sbin/nologin
bin|x|2|2|bin|/bin|/usr/sbin/nologin
sys|x|3|3|sys|/dev|/usr/sbin/nologin

Apply an arbitrary Scheme code to each cell of a table

Wrap each table value in double quotes:

dsv -m '(string-append "\"" $value "\"")' /etc/group

Table filtering

Remove 2nd row from a table:

$ dsv -f '(not (= $row 1))' /etc/passwd

Remove 2nd column from a table:

$ dsv -f '(not (= $col 1))' /etc/passwd