init
bartveneman committed Mar 23, 2019
0 parents commit 0172342
Showing 14 changed files with 543 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
node_modules
package-lock.json
4 changes: 4 additions & 0 deletions .travis.yml
@@ -0,0 +1,4 @@
language: node_js
node_js:
- '8'
- '10'
21 changes: 21 additions & 0 deletions license
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2019 Bart Veneman

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
37 changes: 37 additions & 0 deletions package.json
@@ -0,0 +1,37 @@
{
  "name": "distill-css",
  "description": "PACKAGE_DESCRIPTION",
  "version": "0.1.0",
  "homepage": "https://www.projectwallace.com/oss",
  "repository": "PACKAGE_REPOSITORY",
  "issues": "PACKAGE_ISSUES",
  "license": "MIT",
  "author": "Bart Veneman",
  "keywords": [
    "PACKAGE_KEYWORD"
  ],
  "scripts": {
    "test": "xo && ava test"
  },
  "files": [
    "src"
  ],
  "main": "src/index.js",
  "engines": {
    "node": ">=8.0"
  },
  "xo": {
    "prettier": true
  },
  "devDependencies": {
    "ava": "^1.3.1",
    "chromium": "^2.1.0",
    "create-test-server": "^2.4.0",
    "prettier": "^1.16.4",
    "puppeteer-core": "^1.13.0",
    "xo": "^0.24.0"
  },
  "dependencies": {
    "puppeteer": "^1.13.0"
  }
}
94 changes: 94 additions & 0 deletions readme.md
@@ -0,0 +1,94 @@
<div align="center">
<h1>extract-css-core</h1>
<p>Extract all CSS from a given url, both server side and client side rendered.</p>
</div>

[![NPM Version](https://img.shields.io/npm/v/extract-css-core.svg)](https://www.npmjs.com/package/extract-css-core)
[![Build Status](https://travis-ci.org/bartveneman/extract-css-core.svg?branch=master)](https://travis-ci.org/bartveneman/extract-css-core)
[![Known Vulnerabilities](https://snyk.io/test/github/bartveneman/extract-css-core/badge.svg)](https://snyk.io/test/github/bartveneman/extract-css-core)
[![Weekly downloads](https://img.shields.io/npm/dw/extract-css-core.svg)](https://www.npmjs.com/package/extract-css-core)
![Dependencies Status](https://img.shields.io/david/bartveneman/extract-css-core.svg)
![Dependencies Status](https://img.shields.io/david/dev/bartveneman/extract-css-core.svg)
[![XO code style](https://img.shields.io/badge/code_style-XO-5ed9c7.svg)](https://github.com/sindresorhus/xo)
[![Project: Wallace](https://img.shields.io/badge/Project-Wallace-29c87d.svg)](https://www.projectwallace.com/oss)

## Problem, solution and shortcomings

### Problem

Existing packages like [get-css](https://github.com/cssstats/cssstats/tree/master/packages/get-css) look at a server-generated piece of HTML and get all the `<link>` and `<style>` tags from it. This works fine for 100% server rendered pages, but apps that employ style injection with JavaScript will not be covered.
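To make the gap concrete, here is a minimal sketch of the static approach. It is not how get-css is implemented; `extractStaticStyles` is a hypothetical helper that only reads `<style>` blocks present in the HTML string, so anything a script injects later never shows up.

```js
// Hypothetical helper: collects the contents of <style> tags that are
// literally present in an HTML string. Styles injected by JavaScript at
// runtime are invisible to this approach.
function extractStaticStyles(html) {
  const pattern = /<style[^>]*>([\s\S]*?)<\/style>/gi;
  const blocks = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    blocks.push(match[1].trim());
  }
  return blocks.join("");
}

const html = [
  "<html><head><style>html { color: #f00; }</style></head><body>",
  "<script>",
  'var s = document.createElement("style");',
  's.textContent = "h1 { color: blue; }";',
  "document.head.appendChild(s);",
  "</script>",
  "</body></html>"
].join("\n");

// Only the server-rendered rule is found; the h1 rule exists only as a
// string inside the script and is never returned.
console.log(extractStaticStyles(html));
```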

### Solution

This module uses an instance of Chromium to render a page. This has the benefit that most of the styles can be rendered, even when generated by JavaScript. The [Puppeteer CSSCoverage API](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#coveragestartcsscoverageoptions) is the power behind finding most of the CSS.

### Shortcomings

Currently, there is no solution to get the CSS from modules that use [Styled Components](https://www.styled-components.com) or something similar. Any help resolving this issue will be very much appreciated.

## Installation

```sh
npm install extract-css-core

# Or with Yarn
yarn add extract-css-core
```

## Usage

```js
const extractCss = require("extract-css-core");

extractCss("http://www.projectwallace.com").then(css => console.log(css));
```

## API

### extractCss(url, [options])

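Extract CSS from a page. Returns a Promise that resolves with the CSS as a single `String`. The Promise rejects with an `InvalidUrlError` if the page responds with an HTTP status code of 400 or higher.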
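A minimal sketch of handling a failed extraction, based on the `InvalidUrlError` that `src/index.js` rejects with for responses with a status of 400 or higher. The helper name `getCssOrEmpty` is hypothetical, and `extractCss` is passed in as an argument so the sketch can run without launching Chromium; in real use it would be `require("extract-css-core")`.

```js
// Hypothetical wrapper: returns an empty string for pages that respond
// with an error status, and rethrows anything unexpected.
async function getCssOrEmpty(extractCss, url) {
  try {
    return await extractCss(url);
  } catch (error) {
    if (error.name === "InvalidUrlError") {
      console.error(error.message);
      return "";
    }

    throw error;
  }
}

// Usage (assumes the package is installed):
// const extractCss = require("extract-css-core");
// getCssOrEmpty(extractCss, "http://www.projectwallace.com").then(css => console.log(css));
```

Checking `error.name` rather than using `instanceof` is a deliberate choice here, since the error constructor is not exported by the module.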

### Options

Type: `Object`

#### debug

Type: `Boolean`
Default: `false`

Set to `true` if you want a Chromium window to open as it works to get all the CSS from the page.

#### waitUntil

Type: `String`
Default: `networkidle2`

Accepts any `waitUntil` value listed for `page.goto()` in the [Puppeteer docs](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagegotourl-options).
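As a sketch of passing these options, the hypothetical helper below asks for the page's `load` event instead of the default `networkidle2`. `load` is one of the values Puppeteer documents for `waitUntil`; whether it captures all late-injected styles on a given page is not guaranteed. `extractCss` is injected so the sketch does not need a browser to run.

```js
// Hypothetical helper: calls extractCss with explicit options.
// `load` resolves earlier than `networkidle2`, which may be faster but
// can miss styles that are injected after the load event.
function extractOnLoad(extractCss, url) {
  return extractCss(url, { debug: false, waitUntil: "load" });
}

// Usage (assumes the package is installed):
// extractOnLoad(require("extract-css-core"), "http://www.projectwallace.com")
//   .then(css => console.log(css));
```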

#### customBrowser

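Type: `Object`
Default: `{}`

Options for running against a custom Chromium build, which is useful if you want to run extract-css on AWS Lambda, for example. Both `executablePath` and `customPuppeteer` must be set; if either is missing, the bundled Puppeteer is used.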

##### executablePath

Type: `String`
Default: `null`

Pass in the executable path for a custom Chromium instance.

##### customPuppeteer

Type: `Object`
Default: `null`

You probably want to provide [puppeteer-core](https://www.npmjs.com/package/puppeteer-core) for a custom browser instead of [puppeteer](https://www.npmjs.com/package/puppeteer), which brings its own Chromium instance.
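A minimal sketch of building the `customBrowser` option. In `src/index.js` the custom Puppeteer is only used when both `executablePath` and `customPuppeteer` are present, so the hypothetical helper `customBrowserOptions` fails early when one is missing. The usage comment assumes `puppeteer-core` and the `chromium` package (both listed in this commit's devDependencies) are installed.

```js
// Hypothetical helper: builds the options object for a custom Chromium.
// Throws if either part is missing, because src/index.js would otherwise
// silently fall back to the bundled Puppeteer.
function customBrowserOptions(customPuppeteer, executablePath) {
  if (!customPuppeteer || !executablePath) {
    throw new TypeError(
      "customBrowser needs both customPuppeteer and executablePath"
    );
  }

  return { customBrowser: { customPuppeteer, executablePath } };
}

// Usage (assumes puppeteer-core and chromium are installed):
// const options = customBrowserOptions(
//   require("puppeteer-core"),
//   require("chromium").path
// );
// extractCss("http://www.projectwallace.com", options).then(css => console.log(css));
```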

## Related

- [extract-css lambda](https://github.com/bartveneman/extract-css) - Extract CSS running as a serverless function
- [get-css](https://github.com/cssstats/cssstats/tree/master/packages/get-css) - The original get-css
85 changes: 85 additions & 0 deletions src/index.js
@@ -0,0 +1,85 @@
const puppeteer = require("puppeteer");

function InvalidUrlError({ url, statusCode, statusText }) {
  this.name = "InvalidUrlError";
  this.message = `There was an error retrieving CSS from ${url}.\n\tHTTP status code: ${statusCode} (${statusText})`;
}

InvalidUrlError.prototype = Object.create(Error.prototype);

module.exports = async (
  url,
  { debug = false, waitUntil = "networkidle2", customBrowser = {} } = {}
) => {
  const browserOptions = {
    headless: debug !== true,
    puppeteer
  };

  // Replace the puppeteer instance if a custom one is provided
  // This also means that the executablePath needs to be set to
  // a custom path where some chromium instance is running.
  if (
    customBrowser &&
    customBrowser.executablePath &&
    customBrowser.customPuppeteer
  ) {
    browserOptions.executablePath = customBrowser.executablePath;
    browserOptions.puppeteer = customBrowser.customPuppeteer;
  }

  // Setup a browser instance
  const browser = await browserOptions.puppeteer.launch(browserOptions);

  // Create a new page and navigate to it
  const page = await browser.newPage();

  // Start CSS coverage. This is the meat and bones of this module
  await page.coverage.startCSSCoverage();
  const response = await page.goto(url, { waitUntil });

  // Make sure that we only try to extract CSS from valid pages.
  // Bail out if the response is an invalid request (400, 500)
  if (response.status() >= 400) {
    await browser.close(); // Don't leave any resources behind

    return Promise.reject(
      new InvalidUrlError({
        url,
        statusCode: response.status(),
        statusText: response.statusText()
      })
    );
  }

  // Coverage contains a lot of <style> and <link> CSS,
  // but not all...
  const coverage = await page.coverage.stopCSSCoverage();

  // Fetch all <style> tags from the page, because the coverage
  // API may have missed some JS-generated <style> tags.
  // Some of them *were* already caught by the coverage API,
  // but they will be removed later on to prevent duplicates.
  const styleTagsCss = (await page.$$eval("style", styles => {
    // Get the text inside each <style> tag and trim() the
    // results to prevent all the inside-html indentation
    // clogging up the results and making it look
    // bigger than it actually is
    return styles.map(style => style.innerHTML.trim());
  })).join("");

  await browser.close();

  // Turn the coverage Array into a single string of CSS
  const coverageCss = coverage
    // Filter out the <style> tags that were found in the coverage
    // report since we've conducted our own search for them.
    // A coverage CSS item with the same url as the url of the page
    // we requested is an indication that this was a <style> tag
    .filter(styles => styles.url !== url)
    // The `text` property contains the actual CSS
    .map(({ text }) => text)
    .join("");

  return Promise.resolve(coverageCss + styleTagsCss);
};
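Review note: the de-duplication step at the end of `src/index.js` can be isolated as a pure function, which makes it testable without a browser. The sketch below mirrors the filter, map and join from the module; the name `combineCss` and the sample data are hypothetical. The comparison is an exact string match on the requested URL, so if Chromium reports a normalized or redirected URL (a trailing slash, for instance), inline styles may end up in the output twice.

```js
// Hypothetical extraction of the final step in src/index.js.
// `coverage` has the shape of Puppeteer's CSS coverage entries
// ({ url, text, ... }); `styleTagsCss` is the CSS collected from <style> tags.
function combineCss(coverage, styleTagsCss, pageUrl) {
  const coverageCss = coverage
    // Entries whose url equals the page url came from inline <style> tags,
    // which are already covered by styleTagsCss
    .filter(entry => entry.url !== pageUrl)
    .map(({ text }) => text)
    .join("");

  return coverageCss + styleTagsCss;
}

const sampleCoverage = [
  { url: "http://example.test/fixture.css", text: "body{color:teal}" },
  { url: "http://example.test/page", text: "html{color:#f00}" }
];

console.log(
  combineCss(sampleCoverage, "html{color:#f00}", "http://example.test/page")
);
```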
38 changes: 38 additions & 0 deletions test/css-in-js.html
@@ -0,0 +1,38 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>CodePen - CSS-in-JS example for extract-css-core unit tests</title>

    <style>
      html {
        color: #f00;
      }
    </style>
  </head>

  <body translate="no">
    <div id="app"></div>

    <script src="https://cdnjs.cloudflare.com/ajax/libs/react/16.8.4/umd/react.production.min.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/react-dom/16.8.4/umd/react-dom.production.min.js"></script>
    <script src="https://unpkg.com/styled-components/dist/styled-components.min.js"></script>

    <script id="rendered-js">
      const Title = styled.h1`
        color: blue;
        font-family: sans-serif;
        font-size: 3em;
      `;

      const App = () => {
        return React.createElement(Title, null, "Title");
      };

      ReactDOM.render(
        React.createElement(App, null),
        document.querySelector("#app")
      );
    </script>
  </body>
</html>
3 changes: 3 additions & 0 deletions test/fixture.css
@@ -0,0 +1,3 @@
body {
  color: teal;
}