v3
coderosh committed Dec 17, 2021
1 parent d2fb464 commit 25ca585
Showing 14 changed files with 5,142 additions and 1,955 deletions.
127 changes: 126 additions & 1 deletion .gitignore
@@ -1,2 +1,127 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
lerna-debug.log*
.pnpm-debug.log*

# Diagnostic reports (https://nodejs.org/api/report.html)
report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json

# Runtime data
pids
*.pid
*.seed
*.pid.lock

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage
*.lcov

# nyc test coverage
.nyc_output

# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# Bower dependency directory (https://bower.io/)
bower_components

# node-waf configuration
.lock-wscript

# Compiled binary addons (https://nodejs.org/api/addons.html)
build/Release

# Dependency directories
node_modules/
dist/
jspm_packages/

# Snowpack dependency directory (https://snowpack.dev/)
web_modules/

# TypeScript cache
*.tsbuildinfo

# Optional npm cache directory
.npm

# Optional eslint cache
.eslintcache

# Optional stylelint cache
.stylelintcache

# Microbundle cache
.rpt2_cache/
.rts2_cache_cjs/
.rts2_cache_es/
.rts2_cache_umd/

# Optional REPL history
.node_repl_history

# Output of 'npm pack'
*.tgz

# Yarn Integrity file
.yarn-integrity

# dotenv environment variable files
.env
.env.development.local
.env.test.local
.env.production.local
.env.local

# parcel-bundler cache (https://parceljs.org/)
.cache
.parcel-cache

# Next.js build output
.next
out

# Nuxt.js build / generate output
.nuxt
dist

# Gatsby files
.cache/
# Comment in the public line in if your project uses Gatsby and not Next.js
# https://nextjs.org/blog/next-9-1#public-directory-support
# public

# vuepress build output
.vuepress/dist

# vuepress v2.x temp and cache directory
.temp
.cache

# Serverless directories
.serverless/

# FuseBox cache
.fusebox/

# DynamoDB Local files
.dynamodb/

# TernJS port file
.tern-port

# Stores VSCode versions used for testing VSCode extensions
.vscode-test

# yarn v2
.yarn/cache
.yarn/unplugged
.yarn/build-state.yml
.yarn/install-state.gz
.pnp.*
10 changes: 0 additions & 10 deletions .npmignore

This file was deleted.

201 changes: 58 additions & 143 deletions README.md
@@ -1,6 +1,6 @@
# jsonFromTable

Convert html tables to javascript objects, array or json.
Convert html tables to object (or array). Supports complex rowspan and colspan.

<a href="https://www.npmjs.com/package/sjsonfromtable"><img alt="NPM" src="https://img.shields.io/npm/v/jsonfromtable" /></a>
<a href="https://github.com/coderosh/jsonfromtable"><img alt="MIT" src="https://img.shields.io/badge/license-MIT-blue.svg" /></a>
@@ -25,163 +25,78 @@ yarn add jsonfromtable
## Usage

```js
const { jsonFromTable } = require('jsonfromtable')
// OR import { jsonFromTable } from "jsonfromtable"

const obj = jsonFromTable({
html: `<table>...</table>`,
})

const json = jsonFromTable({
html: `<table>...</table>`,
format: 'json',
})

const arr = jsonFromTable({
html: `<table>...</table>`,
format: 'array',
})

const [headers, body] = jsonFromTable({
html: `<table>...</table>`,
format: 'raw',
})
```

`jsonFromTable` function accepts only one argument `options`;

```ts
interface Options {
url?: string // url to page which contains table
html?: string // html which contains table
selector?: string // table selector
hSelector?: string // head selector
bSelector?: [string, string] // body selector [row, td]
format?: 'json' | 'array' | 'raw' | 'object' // output format
headers?: string[] // custom headers
}
```

## Options
const { JSONFromTable } = require('jsonfromtable')

### url
const obj = JSONFromTable.fromString(`<table>...</table>`)
// [ { title1: value1, title2: value2, ... }, ... ]

If you want the output from a url then you need to pass `url` option. The url should be of a webpage which has a table. If url parameter is passed then the function will return a promise.
const { headers, body } = JSONFromTable.arrayFromString(`<table>...</table>`)
// { headers: [title1, title2, ...], body: [[val1, val2, ...], ...] }

```js
;(async () => {
const obj = await jsonFromTable({ url: 'https://example.com' })

console.log(obj)
const obj = await JSONFromTable.fromUrl(`https://...`)
const { headers, body } = await JSONFromTable.arrayFromUrl(`https://...`)
})()
```

### html

If you want the output from a html then you need to pass `html` option. The html should contain `table` tag.

```js
const obj = jsonFromTable({
html: `<table>...</table>`,
})

console.log(obj)
```

### format

If you want the json or array or raw output then you can pass `format` option. Default value is `object`.

```js
const json = jsonFromTable({
html: `<table>...</table>`,
format: 'json',
})
console.log(json)

jsonFromTable({
url: `https://example.com`,
format: 'array',
}).then((arr) => console.log(arr))

const [headers, body] = jsonFromTable({
html: `<table>...</table>`,
format: 'raw',
})
console.log({ headers, body })
```

### selector

If the page has more than one table, then you can pass css selector of the table as `selector`.

```js
const html = `
<html>
<table>...</table>
<table class="table">...</table>
</html>
`

const obj = jsonFromTable({
html: html,
selector: '.table',
})

console.log(obj)
```

### hSelector
Each function in `JSONFromTable` accepts two arguments. The first is the source (an HTML string or a URL) and the second is `options`.

By default `tr:first-child th` is used to get the headings from the table. Sometimes that selector may not give you the best result. In that case you can provide a css selector which selects all headings.

```js
const obj = jsonFromTable({
html: `<table>...</table>`,
hSelector: `thead tr:first-child th`,
})

console.log(obj)
```ts
interface Options {
titles?: string[] // custom titles (eg: ["sn", "name", "title"])
firstRowIsHeading?: boolean // use first row for titles ?
includeFirstRowInBody?: boolean // add first row in body ?
tableSelector?: string // css selector for table (eg: table.wikitable)
rowColSelector?: [string, string] // css selectors for row and col (eg: ["tr", "th,td"])
shouldBeText?: boolean // if true value is text, if false value is html
trim?: boolean // should trim the value ?
}
```
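
The interface above does not spell out how `titles`, `firstRowIsHeading` and `includeFirstRowInBody` combine. The sketch below shows one plausible reading on a plain array of rows. `rowsToObjects`, the default values, and the precedence (custom titles first, then the first row, then column indexes) are all assumptions made for illustration and are not taken from the jsonfromtable source.

```js
// Hypothetical helper, NOT part of jsonfromtable: turns a rectangular grid of
// cell strings into row objects, guided by three of the options above.
function rowsToObjects(grid, options = {}) {
  const {
    titles,
    firstRowIsHeading = true, // assumed default
    includeFirstRowInBody = false, // assumed default
  } = options

  const firstRow = grid[0] ?? []

  // Assumed precedence: custom titles, then the first row, then column indexes.
  const keys =
    titles ?? (firstRowIsHeading ? firstRow : firstRow.map((_, i) => String(i)))

  // Assumed: the first row leaves the body only when it was used as the heading
  // and the caller did not ask to keep it.
  const body = firstRowIsHeading && !includeFirstRowInBody ? grid.slice(1) : grid

  return body.map((row) => Object.fromEntries(keys.map((key, i) => [key, row[i]])))
}

const grid = [
  ['sn', 'name'],
  ['1', 'Ada'],
  ['2', 'Linus'],
]

console.log(rowsToObjects(grid))
// [ { sn: '1', name: 'Ada' }, { sn: '2', name: 'Linus' } ]

console.log(rowsToObjects(grid, { titles: ['id', 'person'] }))
// [ { id: '1', person: 'Ada' }, { id: '2', person: 'Linus' } ]
```

If the library resolves these options differently, only the two marked assumptions in the sketch need to change.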

### bSelector

By default `['tr:not(:first-child)', 'td']` is used to get the body from the table. Sometimes that selector may not give you the best result. In that case you can provide your own css selectors.
## Example

```js
const obj = jsonFromTable({
html: `<table>...</table>`,
bSelector: ['tbody tr:not(:first-child)', 'td'],
})

const str = `<table>
<tr>
<th>name</th>
<th>alias</th>
<th>class</th>
<th>info</th>
</tr>
<tr>
<td colspan="2">Roshan</td>
<td>Eng</td>
<td rowspan="2">na</td>
</tr>
<tr>
<td rowspan="2">John</td>
<td colspan="2">Cook</td>
</tr>
<tr>
<td rowspan="2">Danger</td>
<td colspan="2">Ninja</td>
</tr>
<tr>
<td>AGuy</td>
<td>Eng</td>
<td rowspan="2">Eats a lot</td>
</tr>
<tr>
<td colspan="2">Dante</td>
<td rowspan="2">Art</td>
</tr>
<tr>
<td>Jake</td>
<td>ake</td>
<td>Actor</td>
</tr>
</table>`

const obj = JSONFromTable.fromString(str)
console.log(obj)
```

> Note that if the provided `hSelector` and `bSelector` fail to select the headers/body, then the following selectors will be used to select and get the headers and body.
```js
const hSelectors = [
'thead tr:first-child th',
'tr:first-child th',
'tr:first-child td',
]
const bSelectors = [
['tbody tr', 'td'],
['tr:not(:first-child)', 'td'],
['tr', 'td'],
]
```

### headers

If you don't like the headers in the table, you can add your own.

```js
jsonFromTable({
html: `<table>...</table>`,
headers: ["SN", "Name", "Age"]
})
```
<img src="./example.png" height="350" />
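
A common way to handle `rowspan` and `colspan` is to copy each spanning cell into every grid position it covers, so that every row ends up with the same number of columns. The sketch below illustrates that idea on plain objects instead of parsed HTML. `expandSpans` and the `{ text, rowspan, colspan }` cell shape are hypothetical and are not taken from the jsonfromtable source, which may work differently.

```js
// Hypothetical helper, NOT part of jsonfromtable: expands rowspan/colspan
// cells into a rectangular grid of strings.
// Each input row is a list of cells shaped like { text, rowspan?, colspan? }.
function expandSpans(rows) {
  const grid = []

  rows.forEach((cells, r) => {
    grid[r] = grid[r] || []
    let c = 0

    for (const { text, rowspan = 1, colspan = 1 } of cells) {
      // Skip columns already filled by a rowspan from an earlier row.
      while (grid[r][c] !== undefined) c++

      // Copy the cell text into every position the span covers.
      for (let dr = 0; dr < rowspan; dr++) {
        grid[r + dr] = grid[r + dr] || []
        for (let dc = 0; dc < colspan; dc++) {
          grid[r + dr][c + dc] = text
        }
      }

      c += colspan
    }
  })

  return grid
}

const grid = expandSpans([
  [{ text: 'name' }, { text: 'alias' }, { text: 'info' }],
  [{ text: 'Roshan', colspan: 2 }, { text: 'na', rowspan: 2 }],
  [{ text: 'John' }, { text: 'Cook' }],
])

console.log(grid)
// [
//   [ 'name', 'alias', 'info' ],
//   [ 'Roshan', 'Roshan', 'na' ],
//   [ 'John', 'Cook', 'na' ]
// ]
```

Once the grid is rectangular, pairing each body row with the heading row is a plain index lookup, which is why the expansion step is done first.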

## License

Binary file added example.png
11 changes: 6 additions & 5 deletions jest.config.ts
@@ -1,8 +1,9 @@
import type { Config } from "@jest/types";
import type { Config } from '@jest/types'

const config: Config.InitialOptions = {
preset: "ts-jest",
testMatch: ["**/*.test.ts"],
};
preset: 'ts-jest',
testMatch: ['**/*.test.ts'],
collectCoverage: true,
}

export default config;
export default config