Commit
Showing 5 changed files with 103 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
name = "JMPReader" | ||
uuid = "d9f7e686-cf87-4d12-8d7a-0e9b8c9fba29" | ||
authors = ["Jaakko Ruohio <[email protected]>"] | ||
- version = "0.1.10" | ||
+ version = "0.1.11" | ||
|
||
[deps] | ||
CodecZlib = "944b1d66-785c-5afd-91f1-9de20f533193" | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
build/ | ||
site/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
[deps] | ||
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4" | ||
JMPReader = "d9f7e686-cf87-4d12-8d7a-0e9b8c9fba29" | ||
|
||
[compat] | ||
Documenter = "1" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
## Testing | ||
|
||
Basic testing with a limited number of files: | ||
```julia | ||
using Pkg | ||
Pkg.test("JMPReader") | ||
``` | ||
|
||
The utility function `JMPReader.scandir` recursively scans the directory given as its argument. | ||
For example, | ||
```julia | ||
JMPReader.scandir(joinpath(pathof(JMPReader), "..", "..", "test")) | ||
``` | ||
reads 12 JMP files, and | ||
```julia | ||
JMPReader.scandir(raw"C:\Program Files\SAS\JMPPRO\17\Samples\Data") | ||
``` | ||
successfully reads 605 JMP files. | ||
|
||
## Looking into the binary .jmp file | ||
|
||
### Finding strings | ||
|
||
The location of a string in a binary `.jmp` file can be found using a snippet like | ||
```julia | ||
using JMPReader | ||
fn = joinpath(pathof(JMPReader), "..", "..", "test", "example1.jmp") | ||
raw = read(fn) | ||
seq = reinterpret(UInt8, codeunits("jäääär")) | ||
findall(seq, raw) | ||
``` | ||
returns | ||
``` | ||
1-element Vector{UnitRange{Int64}}: | ||
1986:1995 | ||
``` | ||
|
||
A hex editor can be useful, for example [Hex Editor for VS Code](https://github.com/microsoft/vscode-hexeditor). | ||
|
||
If the string is not found, the columns may be GZ-compressed. In that case, see the options in JMP under File->Preferences. | ||
|
||
### Reading columns | ||
|
||
This snippet reads the fourth column: | ||
|
||
```julia | ||
using JMPReader | ||
fn = joinpath(pathof(JMPReader), "..", "..", "test", "example1.jmp") | ||
io = open(fn) | ||
info = JMPReader.metadata(io) | ||
d = JMPReader.column_data(io, info, 4, Vector{UInt8}()) | ||
close(io) | ||
``` |
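The byte search described under "Finding strings" can be tried without a `.jmp` file at hand. The following is a minimal sketch on a synthetic buffer; the zero padding and the offsets are assumptions made for illustration, not real JMP data. It relies only on `findall` from Base Julia, which accepts byte vectors for both pattern and haystack.

```julia
# Synthetic byte buffer standing in for the contents of a binary file.
# (Assumption: zero padding around a UTF-8 string, not real .jmp data.)
needle = "jäääär"
buf = vcat(zeros(UInt8, 20), Vector{UInt8}(needle), zeros(UInt8, 5))

# Convert the search string to its UTF-8 bytes and search the buffer.
seq = Vector{UInt8}(needle)
hits = findall(seq, buf)

# Each hit is a byte range. Its length is the UTF-8 byte count,
# which is larger than the character count for non-ASCII text.
for r in hits
    println(r, " => ", String(buf[r]), " (", length(r), " bytes)")
end
```

Here the match spans 10 bytes although the string has 6 characters, because each `ä` takes two bytes in UTF-8. The range `1986:1995` in the example above is 10 bytes long for the same reason.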
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
# JMPReader.jl Documentation | ||
|
||
[JMP](https://en.wikipedia.org/wiki/JMP_(statistical_software)) is commercial statistical software. This package provides an independent reader for `.jmp` files | ||
implemented in Julia. | ||
|
||
## Basic usage | ||
|
||
Basic usage is | ||
```julia | ||
using JMPReader | ||
fn = joinpath(pathof(JMPReader), "..", "..", "test", "example1.jmp") | ||
df = readjmp(fn) | ||
``` | ||
to read the file `fn` and get the data as a Julia `DataFrame`. All columns are included: | ||
``` | ||
4×12 DataFrame | ||
Row │ ints floats charconstwidth time date duration charconstwidth2 charvariable16 formula pressures char utf8 charvariable8 | ||
│ Int8 Float64 String DateTime? Date? Millisec… String String String Float64? String String | ||
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── | ||
1 │ 1 11.1 a 1976-04-01T21:12:00 2024-01-13 2322000 milliseconds a aa 2 101.325 ꙮꙮꙮ a | ||
2 │ 2 22.2 b 1984-08-06T23:58:00 2024-01-14 364000 milliseconds bb bbbb 4 missing 🚴💨 bb | ||
3 │ 3 33.3 c 2003-06-02T17:00:00 missing 229000 milliseconds ccc cccccccc 6 2.6 jäääär cc | ||
4 │ 4 44.4 d missing 2032-02-12 0 milliseconds dddd abcdefghijabcdefghijabcdefghijab… 8 4.63309e110 辛口 abcdefghijkl | ||
``` | ||
|
||
## Choosing columns | ||
|
||
Two keyword arguments are available, `include_columns` and `exclude_columns`: | ||
```julia | ||
df = readjmp(fn, include_columns=[2, "date", r"^char"], exclude_columns=[r"varia"]) | ||
``` | ||
returns the second column `floats`, the column named `date`, and the columns whose names start with `char`, | ||
but excludes columns whose names contain the string `varia`. | ||
``` | ||
4×5 DataFrame | ||
Row │ floats charconstwidth date charconstwidth2 char utf8 | ||
│ Float64 String Date? String String | ||
─────┼───────────────────────────────────────────────────────────────── | ||
1 │ 11.1 a 2024-01-13 a ꙮꙮꙮ | ||
2 │ 22.2 b 2024-01-14 bb 🚴💨 | ||
3 │ 33.3 c missing ccc jäääär | ||
4 │ 44.4 d 2032-02-12 dddd 辛口 | ||
``` |
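The selection rule described above (keep a column if any include selector matches it and no exclude selector does) can be sketched in plain Julia. This is an illustration only: `selector_matches` and `select_columns` are hypothetical helpers, not part of JMPReader, and the package's actual implementation may differ. The sketch also assumes a non-empty include list.

```julia
# Hypothetical helpers, not part of JMPReader.
# A selector is a column index, an exact name, or a regex on the name.
selector_matches(sel::Integer, i, name) = sel == i
selector_matches(sel::AbstractString, i, name) = sel == name
selector_matches(sel::Regex, i, name) = occursin(sel, name)

# Keep names matched by any include selector and by no exclude selector.
function select_columns(colnames, incl, excl)
    selected = String[]
    for (i, name) in enumerate(colnames)
        included = any(s -> selector_matches(s, i, name), incl)
        excluded = any(s -> selector_matches(s, i, name), excl)
        if included && !excluded
            push!(selected, name)
        end
    end
    return selected
end

# Column names taken from the example DataFrame above.
colnames = ["ints", "floats", "charconstwidth", "time", "date", "duration",
            "charconstwidth2", "charvariable16", "formula", "pressures",
            "char utf8", "charvariable8"]

selected = select_columns(colnames, [2, "date", r"^char"], [r"varia"])
println(selected)
```

With these selectors the sketch keeps `floats`, `charconstwidth`, `date`, `charconstwidth2` and `char utf8`, the same five columns shown in the output above.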