Support PPRINT barred input (#1472)
* Support PPRINT barred input

* regression-test files

* output from `make dev`

* doc updates
johnkerl authored Jan 20, 2024
1 parent 76408f3 commit 794a754
Showing 21 changed files with 565 additions and 57 deletions.
33 changes: 32 additions & 1 deletion docs/src/file-formats.md
@@ -366,7 +366,7 @@ Note that while Miller is a line-at-a-time processor and retains input lines in

See [Record Heterogeneity](record-heterogeneity.md) for how Miller handles changes of field names within a single data stream.

For output only (this isn't supported in the input-scanner as of 5.0.0) you can use `--barred` with pprint output format:
Since Miller 5.0.0, you can use `--barred` or `--barred-output` with pprint output format:

<pre class="pre-highlight-in-pair">
<b>mlr --opprint --barred cat data/small</b>
@@ -383,6 +383,37 @@ For output only (this isn't supported in the input-scanner as of 5.0.0) you can
+-----+-----+---+----------+----------+
</pre>

Since Miller 6.11.0, you can use `--barred-input` with pprint input format:

<pre class="pre-highlight-in-pair">
<b>mlr -o pprint --barred cat data/small | mlr -i pprint --barred-input -o json filter '$b == "pan"'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
[
{
"a": "pan",
"b": "pan",
"i": 1,
"x": 0.346791,
"y": 0.726802
},
{
"a": "eks",
"b": "pan",
"i": 2,
"x": 0.758679,
"y": 0.522151
},
{
"a": "wye",
"b": "pan",
"i": 5,
"x": 0.573288,
"y": 0.863624
}
]
</pre>
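To illustrate what reading barred input involves, here is a minimal sketch in Go. It is not Miller's actual reader: `parseBarred` is a hypothetical helper, and it assumes that border lines start with `+` and that field values never contain `|`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseBarred is a hypothetical helper (not part of Miller). It reads barred
// PPRINT text, skips the "+-----+" border lines, and splits the remaining
// lines on "|" to recover the header and the data rows.
// Assumption: field values never contain "|".
func parseBarred(text string) (header []string, rows [][]string) {
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "+") {
			continue // blank line or border line
		}
		// Drop the leading and trailing bars, then split the interior.
		line = strings.TrimPrefix(line, "|")
		line = strings.TrimSuffix(line, "|")
		parts := strings.Split(line, "|")
		fields := make([]string, len(parts))
		for i, p := range parts {
			fields[i] = strings.TrimSpace(p)
		}
		if header == nil {
			header = fields // first non-border line is the header
		} else {
			rows = append(rows, fields)
		}
	}
	return header, rows
}

func main() {
	input := `+-----+-----+---+
| a   | b   | i |
+-----+-----+---+
| pan | pan | 1 |
| eks | pan | 2 |
+-----+-----+---+`
	header, rows := parseBarred(input)
	fmt.Println(header)
	for _, row := range rows {
		fmt.Println(row)
	}
}
```

Splitting on `|` mirrors the way the new flag sets the input field separator to `|`; trimming each field removes the padding that PPRINT adds for column alignment.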

## Markdown tabular

Markdown format looks like this:
8 changes: 7 additions & 1 deletion docs/src/file-formats.md.in
@@ -153,12 +153,18 @@ Note that while Miller is a line-at-a-time processor and retains input lines in

See [Record Heterogeneity](record-heterogeneity.md) for how Miller handles changes of field names within a single data stream.

For output only (this isn't supported in the input-scanner as of 5.0.0) you can use `--barred` with pprint output format:
Since Miller 5.0.0, you can use `--barred` or `--barred-output` with pprint output format:

GENMD-RUN-COMMAND
mlr --opprint --barred cat data/small
GENMD-EOF

Since Miller 6.11.0, you can use `--barred-input` with pprint input format:

GENMD-RUN-COMMAND
mlr -o pprint --barred cat data/small | mlr -i pprint --barred-input -o json filter '$b == "pan"'
GENMD-EOF

## Markdown tabular

Markdown format looks like this:
16 changes: 7 additions & 9 deletions docs/src/manpage.md
@@ -19,9 +19,7 @@ Quick links:
This is simply a copy of what you should see on running `man mlr` at a command prompt, once Miller is installed on your system.

<pre class="pre-non-highlight-non-pair">
MILLER(1) MILLER(1)


4mMILLER24m(1) 4mMILLER24m(1)

1mNAME0m
Miller -- like awk, sed, cut, join, and sort for name-indexed data such
@@ -697,8 +695,10 @@ MILLER(1) MILLER(1)
1mPPRINT-ONLY FLAGS0m
These are flags which are applicable to PPRINT format.

--barred Prints a border around PPRINT output (not available
for input).
--barred or --barred-output
Prints a border around PPRINT output.
--barred-input When used in conjunction with --pprint, accepts
barred input.
--right Right-justifies all fields for PPRINT output.

1mPROFILING FLAGS0m
@@ -807,7 +807,7 @@ MILLER(1) MILLER(1)
markdown " " N/A "\n"
nidx " " N/A "\n"
pprint " " N/A "\n"
tsv " " N/A "\n"
tsv " " N/A "\n"
xtab "\n" " " "\n\n"

--fs {string} Specify FS for input and output.
@@ -3687,7 +3687,5 @@ MILLER(1) MILLER(1)
MIME Type for Comma-Separated Values (CSV) Files, the Miller docsite
https://miller.readthedocs.io



2024-01-01 MILLER(1)
2024-01-20 4mMILLER24m(1)
</pre>
16 changes: 7 additions & 9 deletions docs/src/manpage.txt
@@ -1,6 +1,4 @@
MILLER(1) MILLER(1)


4mMILLER24m(1) 4mMILLER24m(1)

1mNAME0m
Miller -- like awk, sed, cut, join, and sort for name-indexed data such
@@ -676,8 +674,10 @@ MILLER(1) MILLER(1)
1mPPRINT-ONLY FLAGS0m
These are flags which are applicable to PPRINT format.

--barred Prints a border around PPRINT output (not available
for input).
--barred or --barred-output
Prints a border around PPRINT output.
--barred-input When used in conjunction with --pprint, accepts
barred input.
--right Right-justifies all fields for PPRINT output.

1mPROFILING FLAGS0m
@@ -786,7 +786,7 @@ MILLER(1) MILLER(1)
markdown " " N/A "\n"
nidx " " N/A "\n"
pprint " " N/A "\n"
tsv " " N/A "\n"
tsv " " N/A "\n"
xtab "\n" " " "\n\n"

--fs {string} Specify FS for input and output.
@@ -3666,6 +3666,4 @@ MILLER(1) MILLER(1)
MIME Type for Comma-Separated Values (CSV) Files, the Miller docsite
https://miller.readthedocs.io



2024-01-01 MILLER(1)
2024-01-20 4mMILLER24m(1)
3 changes: 2 additions & 1 deletion docs/src/reference-main-flag-list.md
@@ -373,7 +373,8 @@ These are flags which are applicable to PPRINT format.

**Flags:**

* `--barred`: Prints a border around PPRINT output (not available for input).
* `--barred or --barred-output`: Prints a border around PPRINT output.
* `--barred-input`: When used in conjunction with --pprint, accepts barred input.
* `--right`: Right-justifies all fields for PPRINT output.

## Profiling flags
16 changes: 7 additions & 9 deletions man/manpage.txt
@@ -1,6 +1,4 @@
MILLER(1) MILLER(1)


4mMILLER24m(1) 4mMILLER24m(1)

1mNAME0m
Miller -- like awk, sed, cut, join, and sort for name-indexed data such
@@ -676,8 +674,10 @@ MILLER(1) MILLER(1)
1mPPRINT-ONLY FLAGS0m
These are flags which are applicable to PPRINT format.

--barred Prints a border around PPRINT output (not available
for input).
--barred or --barred-output
Prints a border around PPRINT output.
--barred-input When used in conjunction with --pprint, accepts
barred input.
--right Right-justifies all fields for PPRINT output.

1mPROFILING FLAGS0m
@@ -786,7 +786,7 @@ MILLER(1) MILLER(1)
markdown " " N/A "\n"
nidx " " N/A "\n"
pprint " " N/A "\n"
tsv " " N/A "\n"
tsv " " N/A "\n"
xtab "\n" " " "\n\n"

--fs {string} Specify FS for input and output.
@@ -3666,6 +3666,4 @@ MILLER(1) MILLER(1)
MIME Type for Comma-Separated Values (CSV) Files, the Miller docsite
https://miller.readthedocs.io



2024-01-01 MILLER(1)
2024-01-20 4mMILLER24m(1)
2 changes: 1 addition & 1 deletion man/mkman.rb
@@ -19,7 +19,7 @@ def main
# Live code-generation needs to be using mlr from *this* tree, not from
# somewhere else in the PATH.
unless File.executable?('../mlr')
$stderr.puts "#{$0}: Need ../../mlr to exist: please check 'make build' in ../.."
$stderr.puts "#{$0}: Need ../mlr to exist: please check 'make build' in ../.."
exit 1
end
`../mlr --version`
10 changes: 6 additions & 4 deletions man/mlr.1
@@ -2,12 +2,12 @@
.\" Title: mlr
.\" Author: [see the "AUTHOR" section]
.\" Generator: ./mkman.rb
.\" Date: 2024-01-01
.\" Date: 2024-01-20
.\" Manual: \ \&
.\" Source: \ \&
.\" Language: English
.\"
.TH "MILLER" "1" "2024-01-01" "\ \&" "\ \&"
.TH "MILLER" "1" "2024-01-20" "\ \&" "\ \&"
.\" -----------------------------------------------------------------
.\" * Portability definitions
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -811,8 +811,10 @@ those can be joined with a "-", like "red-bold", "bold-170", "bold-underline", e
.nf
These are flags which are applicable to PPRINT format.

--barred Prints a border around PPRINT output (not available
for input).
--barred or --barred-output
Prints a border around PPRINT output.
--barred-input When used in conjunction with --pprint, accepts
barred input.
--right Right-justifies all fields for PPRINT output.
.fi
.if n \{\
15 changes: 13 additions & 2 deletions pkg/cli/option_parse.go
@@ -494,13 +494,24 @@ var PPRINTOnlyFlagSection = FlagSection{
},

{
name: "--barred",
help: "Prints a border around PPRINT output (not available for input).",
name: "--barred",
altNames: []string{"--barred-output"},
help: "Prints a border around PPRINT output.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
options.WriterOptions.BarredPprintOutput = true
*pargi += 1
},
},

{
name: "--barred-input",
help: "When used in conjunction with --pprint, accepts barred input.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
options.ReaderOptions.BarredPprintInput = true
options.ReaderOptions.IFS = "|"
*pargi += 1
},
},
},
}
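The flag-table pattern in this hunk (a primary name, optional alternate names, and a parser callback that mutates the options) can be sketched on its own as follows. This is a simplified illustration, not Miller's code: `options`, `flagSpec`, `flagTable`, and `parseFlags` are hypothetical names, and the default separator of a single space is an assumption taken from the PPRINT row of the separator table in the manpage.

```go
package main

import "fmt"

// options is a simplified, hypothetical stand-in for Miller's TOptions.
type options struct {
	barredOutput bool
	barredInput  bool
	ifs          string
}

// flagSpec is a hypothetical, pared-down flag-table entry.
type flagSpec struct {
	name     string
	altNames []string
	apply    func(o *options)
}

var flagTable = []flagSpec{
	{
		name:     "--barred",
		altNames: []string{"--barred-output"},
		apply:    func(o *options) { o.barredOutput = true },
	},
	{
		name: "--barred-input",
		apply: func(o *options) {
			o.barredInput = true
			o.ifs = "|" // barred input is split on the bar character
		},
	},
}

// matches reports whether arg is the flag's primary name or one of its alternates.
func (f flagSpec) matches(arg string) bool {
	if arg == f.name {
		return true
	}
	for _, alt := range f.altNames {
		if arg == alt {
			return true
		}
	}
	return false
}

// parseFlags applies every recognized flag in args and rejects unknown ones.
func parseFlags(args []string) (options, error) {
	opts := options{ifs: " "} // assumed default separator for PPRINT
	for _, arg := range args {
		found := false
		for _, f := range flagTable {
			if f.matches(arg) {
				f.apply(&opts)
				found = true
				break
			}
		}
		if !found {
			return opts, fmt.Errorf("unrecognized flag %q", arg)
		}
	}
	return opts, nil
}

func main() {
	opts, err := parseFlags([]string{"--barred-output", "--barred-input"})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", opts)
}
```

Keeping the alternate names on the same entry means `--barred` and `--barred-output` share one parser and one help string, so the two spellings cannot drift apart.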

1 change: 1 addition & 0 deletions pkg/cli/option_types.go
@@ -57,6 +57,7 @@ type TReaderOptions struct {
AllowRaggedCSVInput bool
CSVLazyQuotes bool
CSVTrimLeadingSpace bool
BarredPprintInput bool

CommentHandling TCommentHandling
CommentString string
20 changes: 0 additions & 20 deletions pkg/input/record_reader_csvlite.go
@@ -78,26 +78,6 @@ func NewRecordReaderCSVLite(
return reader, nil
}

func NewRecordReaderPPRINT(
readerOptions *cli.TReaderOptions,
recordsPerBatch int64,
) (*RecordReaderCSVLite, error) {
reader := &RecordReaderCSVLite{
readerOptions: readerOptions,
recordsPerBatch: recordsPerBatch,
fieldSplitter: newFieldSplitter(readerOptions),

useVoidRep: true,
voidRep: "-",
}
if reader.readerOptions.UseImplicitCSVHeader {
reader.recordBatchGetter = getRecordBatchImplicitCSVHeader
} else {
reader.recordBatchGetter = getRecordBatchExplicitCSVHeader
}
return reader, nil
}

func (reader *RecordReaderCSVLite) Read(
filenames []string,
context types.Context,