init
bartveneman committed Mar 23, 2019
1 parent 0172342 commit a93b939
Showing 11 changed files with 355 additions and 322 deletions.
33 changes: 27 additions & 6 deletions package.json
@@ -1,14 +1,17 @@
{
"name": "distill-css",
"description": "PACKAGE_DESCRIPTION",
"name": "extract-css-core",
"description": "Extract all CSS from a given url, both server side and client side rendered.",
"version": "0.1.0",
"homepage": "https://www.projectwallace.com/oss",
"repository": "PACKAGE_REPOSITORY",
"issues": "PACKAGE_ISSUES",
"repository": "https://github.com/bartveneman/extract-css-core",
"issues": "https://github.com/bartveneman/extract-css-core/issues",
"license": "MIT",
"author": "Bart Veneman",
"keywords": [
"PACKAGE_KEYWORD"
"extract",
"css",
"scrape",
"get-css"
],
"scripts": {
"test": "xo && ava test"
@@ -21,7 +24,25 @@
"node": ">=8.0"
},
"xo": {
"prettier": true
"prettier": true,
"rules": {
"ava/no-skip-test": "warn"
}
},
"prettier": {
"useTabs": true,
"semi": false,
"singleQuote": true,
"bracketSpacing": false,
"proseWrap": "always",
"overrides": [
{
"files": "*.json",
"options": {
"useTabs": false
}
}
]
},
"devDependencies": {
"ava": "^1.3.1",
54 changes: 33 additions & 21 deletions readme.md
@@ -16,15 +16,24 @@

### Problem

Existing packages like [get-css](https://github.com/cssstats/cssstats/tree/master/packages/get-css) look at a server-generated piece of HTML and get all the `<link>` and `<style>` tags from it. This works fine for 100% server rendered pages, but apps that employ style injection with JavaScript will not be covered.
Existing packages like
[get-css](https://github.com/cssstats/cssstats/tree/master/packages/get-css)
look at a server-generated piece of HTML and get all the `<link>` and `<style>`
tags from it. This works fine for 100% server rendered pages, but apps that
employ style injection with JavaScript will not be covered.

### Solution

This module uses an instance of Chromium to render a page. This has the benefit that most of the styles can be rendered, even when generated by JavaScript. The [Puppeteer CSSCoverage API](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#coveragestartcsscoverageoptions) is the power behind finding most of the CSS.
This module uses an instance of Chromium to render a page. This has the benefit
that most of the styles can be rendered, even when generated by JavaScript. The
[Puppeteer CSSCoverage API](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#coveragestartcsscoverageoptions)
is the power behind finding most of the CSS.

### Shortcomings

Currently, there is no solution to get the CSS from modules that use [Styled Components](https://www.styled-components.com) or something similar. Any help resolving this issue will be very much appreciated.
Currently, there is no solution to get the CSS from modules that use
[Styled Components](https://www.styled-components.com) or something similar. Any
help resolving this issue will be very much appreciated.

## Installation

@@ -38,57 +47,60 @@ yarn add extract-css-core
## Usage

```js
const extractCss = require("extract-css-core");
const extractCss = require('extract-css-core')

extractCss("http://www.projectwallace.com").then(css => console.log(css));
extractCss('http://www.projectwallace.com').then(css => console.log(css))
```

## API

### extractCss(url, [options])

Extract CSS from a page. Returns a Promise that contains the CSS as a single String.
Extract CSS from a page. Returns a Promise that contains the CSS as a single
String.

### Options

Type: `Object`

#### debug

Type: `Boolean`
Default: `false`
Type: `Boolean` Default: `false`

Set to `true` if you want a Chromium window to open as it works to get all the CSS from the page.
Set to `true` if you want a Chromium window to open as it works to get all the
CSS from the page.

#### waitUntil

Type: `String`
Default: `networkidle2`
Type: `String` Default: `networkidle2`

Can be any value as provided by the [Puppeteer docs](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagegotourl-options).
Can be any value as provided by the
[Puppeteer docs](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagegotourl-options).

#### customBrowser

Type: `Object`
Default: `null`
Type: `Object` Default: `null`

This is useful if you want to run extract-css on AWS Lambda for example.

##### executablePath

Type: `String`
Default: `null`
Type: `String` Default: `null`

Pass in the executable path for a custom Chromium instance.

##### customPuppeteer

Type: `Object`
Default: `null`
Type: `Object` Default: `null`

You probably want to provide [puppeteer-core](https://www.npmjs.com/package/puppeteer-core) for a custom browser instead of [puppeteer](https://www.npmjs.com/package/puppeteer) which brings its own Chromium instance.
You probably want to provide
[puppeteer-core](https://www.npmjs.com/package/puppeteer-core) for a custom
browser instead of [puppeteer](https://www.npmjs.com/package/puppeteer) which
brings its own Chromium instance.

## Related

- [extract-css lambda](https://github.com/bartveneman/extract-css) - Extract CSS running as a serverless function
- [get-css](https://github.com/cssstats/cssstats/tree/master/packages/get-css) - The original get-css
- [extract-css lambda](https://github.com/bartveneman/extract-css) - Extract CSS
running as a serverless function
- [get-css](https://github.com/cssstats/cssstats/tree/master/packages/get-css) -
The original get-css
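
The `customBrowser` option documented in the readme above can be illustrated with a small sketch. It mirrors the condition used in `src/index.js`: the default Puppeteer is only replaced when both `executablePath` and `customPuppeteer` are provided. The helper name `buildBrowserOptions`, the stub objects, and the `/path/to/chromium` path are hypothetical and exist only for illustration; they are not part of the package.

```js
// Hypothetical helper that mirrors the option handling in src/index.js.
// `defaultPuppeteer` stands in for require('puppeteer').
function buildBrowserOptions(
	defaultPuppeteer,
	{debug = false, customBrowser = {}} = {}
) {
	const browserOptions = {
		// A visible Chromium window is only opened when debug is true
		headless: debug !== true,
		puppeteer: defaultPuppeteer
	}

	// Both properties must be set, otherwise the defaults are kept
	if (
		customBrowser &&
		customBrowser.executablePath &&
		customBrowser.customPuppeteer
	) {
		browserOptions.executablePath = customBrowser.executablePath
		browserOptions.puppeteer = customBrowser.customPuppeteer
	}

	return browserOptions
}

// Stub objects, for illustration only (not real Puppeteer instances)
const bundled = {name: 'puppeteer'}
const core = {name: 'puppeteer-core'}

// No options: headless, bundled Puppeteer, no executablePath
const defaults = buildBrowserOptions(bundled)

// Both custom properties set: the custom Puppeteer and path are used
const custom = buildBrowserOptions(bundled, {
	customBrowser: {executablePath: '/path/to/chromium', customPuppeteer: core}
})

// Only one property set: the custom browser settings are ignored
const partial = buildBrowserOptions(bundled, {
	customBrowser: {executablePath: '/path/to/chromium'}
})
```

Note that passing only `executablePath` has no effect under this logic, which may be surprising to callers who expect the bundled Puppeteer to launch a custom Chromium binary.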
136 changes: 68 additions & 68 deletions src/index.js
@@ -1,85 +1,85 @@
const puppeteer = require("puppeteer");
const puppeteer = require('puppeteer')

function InvalidUrlError({ url, statusCode, statusText }) {
this.name = "InvalidUrlError";
this.message = `There was an error retrieving CSS from ${url}.\n\tHTTP status code: ${statusCode} (${statusText})`;
function InvalidUrlError({url, statusCode, statusText}) {
this.name = 'InvalidUrlError'
this.message = `There was an error retrieving CSS from ${url}.\n\tHTTP status code: ${statusCode} (${statusText})`
}

InvalidUrlError.prototype = Error.prototype;
InvalidUrlError.prototype = Error.prototype

module.exports = async (
url,
{ debug = false, waitUntil = "networkidle2", customBrowser = {} } = {}
url,
{debug = false, waitUntil = 'networkidle2', customBrowser = {}} = {}
) => {
const browserOptions = {
headless: debug !== true,
puppeteer
};
const browserOptions = {
headless: debug !== true,
puppeteer
}

// Replace the puppeteer instance if a custom one is provided
// This also means that the executablePath needs to be set to
// a custom path where some chromium instance is running.
if (
customBrowser &&
customBrowser.executablePath &&
customBrowser.customPuppeteer
) {
browserOptions.executablePath = customBrowser.executablePath;
browserOptions.puppeteer = customBrowser.customPuppeteer;
}
// Replace the puppeteer instance if a custom one is provided
// This also means that the executablePath needs to be set to
// a custom path where some chromium instance is running.
if (
customBrowser &&
customBrowser.executablePath &&
customBrowser.customPuppeteer
) {
browserOptions.executablePath = customBrowser.executablePath
browserOptions.puppeteer = customBrowser.customPuppeteer
}

// Setup a browser instance
const browser = await browserOptions.puppeteer.launch(browserOptions);
// Setup a browser instance
const browser = await browserOptions.puppeteer.launch(browserOptions)

// Create a new page and navigate to it
const page = await browser.newPage();
// Create a new page and navigate to it
const page = await browser.newPage()

// Start CSS coverage. This is the meat and bones of this module
await page.coverage.startCSSCoverage();
const response = await page.goto(url, { waitUntil });
// Start CSS coverage. This is the meat and bones of this module
await page.coverage.startCSSCoverage()
const response = await page.goto(url, {waitUntil})

// Make sure that we only try to extract CSS from valid pages.
// Bail out if the response is an invalid request (400, 500)
if (response.status() >= 400) {
await browser.close(); // Don't leave any resources behind
// Make sure that we only try to extract CSS from valid pages.
// Bail out if the response is an invalid request (400, 500)
if (response.status() >= 400) {
await browser.close() // Don't leave any resources behind

return Promise.reject(
new InvalidUrlError({
url,
statusCode: response.status(),
statusText: response.statusText()
})
);
}
return Promise.reject(
new InvalidUrlError({
url,
statusCode: response.status(),
statusText: response.statusText()
})
)
}

// Coverage contains a lot of <style> and <link> CSS,
// but not all...
const coverage = await page.coverage.stopCSSCoverage();
// Coverage contains a lot of <style> and <link> CSS,
// but not all...
const coverage = await page.coverage.stopCSSCoverage()

// Fetch all <style> tags from the page, because the coverage
// API may have missed some JS-generated <style> tags.
// Some of them *were* already caught by the coverage API,
// but they will be removed later on to prevent duplicates.
const styleTagsCss = (await page.$$eval("style", styles => {
// Get the text inside each <style> tag and trim() the
// results to prevent all the inside-html indentation
// clogging up the results and making it look
// bigger than it actually is
return styles.map(style => style.innerHTML.trim());
})).join("");
// Fetch all <style> tags from the page, because the coverage
// API may have missed some JS-generated <style> tags.
// Some of them *were* already caught by the coverage API,
// but they will be removed later on to prevent duplicates.
const styleTagsCss = (await page.$$eval('style', styles => {
// Get the text inside each <style> tag and trim() the
// results to prevent all the inside-html indentation
// clogging up the results and making it look
// bigger than it actually is
return styles.map(style => style.innerHTML.trim())
})).join('')

await browser.close();
await browser.close()

// Turn the coverage Array into a single string of CSS
const coverageCss = coverage
// Filter out the <style> tags that were found in the coverage
// report since we've conducted our own search for them.
// A coverage CSS item with the same url as the url of the page
// we requested is an indication that this was a <style> tag
.filter(styles => styles.url !== url)
// The `text` property contains the actual CSS
.map(({ text }) => text)
.join("");
// Turn the coverage Array into a single string of CSS
const coverageCss = coverage
// Filter out the <style> tags that were found in the coverage
// report since we've conducted our own search for them.
// A coverage CSS item with the same url as the url of the page
// we requested is an indication that this was a <style> tag
.filter(styles => styles.url !== url)
// The `text` property contains the actual CSS
.map(({text}) => text)
.join('')

return Promise.resolve(coverageCss + styleTagsCss);
};
return Promise.resolve(coverageCss + styleTagsCss)
}
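
The de-duplication step at the end of `src/index.js` can be tried in isolation. The sketch below assumes coverage entries shaped like `{url, text}`, as used by the code above; the function name `combineCss`, the URLs, and the sample data are hypothetical. Entries whose `url` equals the requested page URL are treated as inline `<style>` tags and dropped, because the same CSS is collected separately from the DOM.

```js
// Hypothetical stand-alone version of the final step in src/index.js.
function combineCss(coverage, styleTagsCss, url) {
	const coverageCss = coverage
		// Entries reported under the page's own url are inline <style> tags,
		// which are already present in styleTagsCss
		.filter(entry => entry.url !== url)
		// The `text` property contains the actual CSS
		.map(({text}) => text)
		.join('')

	return coverageCss + styleTagsCss
}

// Sample data, for illustration only
const pageUrl = 'http://localhost/page.html'
const coverage = [
	{url: 'http://localhost/fixture.css', text: 'body{color:teal}'},
	{url: pageUrl, text: 'html{color:#f00}'}
]
const styleTagsCss = 'html{color:#f00}'

const css = combineCss(coverage, styleTagsCss, pageUrl)
console.log(css) // → body{color:teal}html{color:#f00}
```

The filter relies on a strict string comparison with the URL that was passed in. If the browser reports a different URL for the page, for example after a redirect, the inline styles may not be filtered out and could appear twice in the result.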
58 changes: 29 additions & 29 deletions test/css-in-js.html
@@ -1,38 +1,38 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<title>CodePen - CSS-in-JS example for extract-css-core unit tests</title>
<head>
<meta charset="UTF-8" />
<title>CodePen - CSS-in-JS example for extract-css-core unit tests</title>

<style>
html {
color: #f00;
}
</style>
</head>
<style>
html {
color: #f00;
}
</style>
</head>

<body translate="no">
<div id="app"></div>
<body translate="no">
<div id="app"></div>

<script src="https://cdnjs.cloudflare.com/ajax/libs/react/16.8.4/umd/react.production.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/react-dom/16.8.4/umd/react-dom.production.min.js"></script>
<script src="https://unpkg.com/styled-components/dist/styled-components.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/react/16.8.4/umd/react.production.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/react-dom/16.8.4/umd/react-dom.production.min.js"></script>
<script src="https://unpkg.com/styled-components/dist/styled-components.min.js"></script>

<script id="rendered-js">
const Title = styled.h1`
color: blue;
font-family: sans-serif;
font-size: 3em;
`;
<script id="rendered-js">
const Title = styled.h1`
color: blue;
font-family: sans-serif;
font-size: 3em;
`

const App = () => {
return React.createElement(Title, null, "Title");
};
const App = () => {
return React.createElement(Title, null, 'Title')
}

ReactDOM.render(
React.createElement(App, null),
document.querySelector("#app")
);
</script>
</body>
ReactDOM.render(
React.createElement(App, null),
document.querySelector('#app')
)
</script>
</body>
</html>
2 changes: 1 addition & 1 deletion test/fixture.css
@@ -1,3 +1,3 @@
body {
color: teal;
color: teal;
}