Transform a JSON feed into Jekyll posts!
This is a simple Node.js project for parsing a JSON feed into markdown files — one for each feed item.
Each markdown file is named using the Jekyll post naming convention and contains valid YAML front-matter.
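As a rough illustration of the Jekyll naming convention (YYYY-MM-DD-title.md), a filename could be derived from a feed item's date and title as in the sketch below. The helper name toJekyllFileName and the slug rules are assumptions made for this example, not necessarily what this project's script does.

```javascript
// Hypothetical helper: builds a Jekyll post filename (YYYY-MM-DD-title.md).
function toJekyllFileName(isoDate, title) {
  // Keep only the date portion (in UTC) of an ISO 8601 timestamp.
  const datePart = new Date(isoDate).toISOString().slice(0, 10);
  // Lowercase the title, collapse runs of non-alphanumerics into hyphens,
  // then trim any leading or trailing hyphen.
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
  return `${datePart}-${slug}.md`;
}

console.log(toJekyllFileName('2021-03-04T10:15:00.000Z', 'Hello, World!'));
// 2021-03-04-hello-world.md
```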
- Node.js or NVM (Node Version Manager)
Use the Node.js version specified in .nvmrc.
If you use NVM, you can use this project's .nvmrc file by running nvm use (while in the project directory).
git clone rss-to-markdown
cd rss-to-markdown
Run the main script using npm:
npm run process-feed -- --output=my-folder ## Generates the markdown files in ./dist/my-folder
The following works too:
npm run process-feed -- --output my-folder
npm run process-feed -- -o=my-folder
npm run process-feed -- -o my-folder
You can optionally specify a parent directory to build the files into using the --dir-parent option:
npm run process-feed -- --output=my-folder --dir-parent=my-parent ## Generates files in ./my-parent/my-folder
The following also works:
npm run process-feed -- --output=my-folder --dir-parent my-parent
npm run process-feed -- --output=my-folder -d=my-parent
npm run process-feed -- --output=my-folder -d my-parent
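The way the two options combine into a final path can be sketched as follows. This is a minimal illustration that assumes the output folder is simply nested inside the parent directory, with dist as the fallback; the helper name resolveOutputDir is hypothetical and the real script may resolve paths differently.

```javascript
const path = require('node:path');

// Hypothetical helper mirroring the documented behaviour: the output folder
// is nested inside a parent directory, which falls back to "dist".
function resolveOutputDir(output, parent = 'dist') {
  return path.join(parent, output);
}

console.log(resolveOutputDir('my-folder'));              // dist/my-folder (POSIX separators)
console.log(resolveOutputDir('my-folder', 'my-parent')); // my-parent/my-folder (POSIX separators)
```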
View the help message using -h or --help:
npm run process-feed -- -h
# Or
npm run process-feed -- --help
usage: npm run process-update -- [options]
options:
-o, --output Specify the output folder to save files to [string] [required]
-d, --dir-prefix Specify directory prefix for output folder [string] [default: "dist"]
If you need to change the URL to the feed, or where images are kept, adjust the FEED_URL and IMAGE_BASEURL constants in ./process-update.mjs.
It is not recommended that you use the old scripts/instructions.
This is a simple Node.js project for parsing an RSS feed into markdown files — one for each feed item.
Each markdown file is named using the Jekyll post naming convention and contains valid YAML front-matter.
Note: colors is not recommended because it has had security issues in the past. If you use it, lock it to exactly version 1.4.0, including in your package.json file (use 1.4.0 and not ^1.4.0).
This project has minimal dependencies:
- rss-parser to parse a string of XML (our RSS feed) into a JavaScript object.
- colors to create more aesthetically pleasing console output.
There's also a folder ./helpers/ with a module cleanContent.js used for cleaning up less-than-ideal HTML image elements. See cleanContent Helper for more information.
- Node.js or NVM (Node Version Manager)
This project was built using Node.js version 14.5.4.
If you use NVM, you can use this project's .nvmrc file by running nvm use (while in the project directory).
git clone rss-to-markdown
cd rss-to-markdown
npm i # or `npm install` if you like typing more
- Place your XML feed file in the ./src folder
- In the ./index.js file:
  - Update config.input.source to match the filename of your XML feed
  - Set config.output.path to the path where you want your markdown files built (e.g. ./dist/my-feed)
- Run the main file (./index.js) using: npm start
The configuration object is found towards the top of ./index.js:
// index.js
const config = {
  input: {
    source: 'my-rss-feed.xml' // ./src/my-rss-feed.xml
  },
  output: {
    path: './dist/my-feed'
  }
};
Existing files will be overwritten!
If the same file exists (in the output directory), it will be overwritten by running npm start.
If the output folder specified in config.output.path does not exist, it will be created, including any subfolders.
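The overwrite and folder-creation behaviour described here can be reproduced with Node's built-in fs module. The sketch below is an assumption about how such a step might look, not the code in index.js; the helper name writePost is hypothetical, and the demo writes to a temporary directory so nothing in the project is touched.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: writes one file, creating missing folders first.
function writePost(outputDir, fileName, contents) {
  // `recursive: true` creates nested folders and does nothing if they exist.
  fs.mkdirSync(outputDir, { recursive: true });
  // writeFileSync replaces any existing file with the same name.
  fs.writeFileSync(path.join(outputDir, fileName), contents);
}

// Demo inside a fresh temporary directory.
const tmpRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'feed-'));
const demoDir = path.join(tmpRoot, 'dist', 'my-feed');
writePost(demoDir, 'post.md', 'first');
writePost(demoDir, 'post.md', 'second'); // overwrites the first version
console.log(fs.readFileSync(path.join(demoDir, 'post.md'), 'utf8')); // second
```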
The contents of each file are defined inside index.js, within a function named processFeed().
The array fileArray becomes the contents for each file:
function processFeed(feed) {
  feed.items.forEach((item) => {
    const { title, link, pubDate, author, content, guid, isoDate } = item;
    /* JS omitted ... */
    // This array becomes the markdown file contents:
    const fileArray = [
      '---', // YAML front-matter start
      `\ntitle: "${cleanTitle}"`,
      `\nlink: ${link}`,
      `\nauthor: ${author}`,
      `\npublish_date: ${pubDate}`,
      `\nguid: ${guid}`,
      `\nisoDate: ${isoDate}`,
      `\n---`, // YAML front-matter end
      `\n`,
      `\n${clean}`
    ];
    /* JS omitted ... */
  });
}
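To see what such an array produces, the sketch below builds one for a made-up feed item and joins it into a single string. Everything here is illustrative: the item values are invented, the quote-escaping used for cleanTitle is an assumed stand-in for the omitted code, and joining with an empty string is a guess at how the entries are combined (each entry already carries its own leading newline).

```javascript
// Hypothetical feed item; the values are made up for illustration.
const item = {
  title: 'My "first" post',
  link: 'https://example.com/posts/1',
  author: 'Jane Doe',
  pubDate: 'Thu, 04 Mar 2021 10:15:00 GMT',
  guid: 'abc-123',
  isoDate: '2021-03-04T10:15:00.000Z'
};

// Assumed stand-in for the omitted code: escape double quotes so the
// double-quoted YAML title stays valid.
const cleanTitle = item.title.replace(/"/g, '\\"');

const fileArray = [
  '---',
  `\ntitle: "${cleanTitle}"`,
  `\nlink: ${item.link}`,
  `\nauthor: ${item.author}`,
  `\npublish_date: ${item.pubDate}`,
  `\nguid: ${item.guid}`,
  `\nisoDate: ${item.isoDate}`,
  `\n---`,
  `\n`,
  `\n<p>Post body</p>`
];

// Each entry starts with its own newline, so join with an empty string.
const fileContents = fileArray.join('');
console.log(fileContents);
```

The result is a front-matter block between two --- lines, a blank line, and then the post body.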
The ./helpers/ folder contains a cleanContent.js module for cleaning up HTML coming from the feed.
Use the ./helpers/ directory for any helper modules/functions and import them into index.js.
The existing helper functions were created for cleaning up old SharePoint code. Specifically, they clean up strings of HTML containing image elements and alter their src attributes to point to a new folder location (/uploads/).
The image elements coming from the feed I am processing vary.
Some have the src followed by the alt attribute while others have the reversed order.
Some images have class attributes with old/unneeded SharePoint classes.
There are also images with style attributes that need removing.
import {cleanContent} from './helpers/cleanContent';
const cleanImageString = cleanContent(stringWithHTMLImageElements);
cleanContent() takes a string containing multiple HTML image elements (possibly inside other HTML elements) and cleans them all up:
import {cleanContent} from './helpers/cleanContent';
const html =
`<img class="ms-old-class" src="some-old-path/image.jpg" alt="alt text" />
<img src="some-old-path/image.jpg" alt="alt text" />
<img alt="" src="some-old-path/image.jpg" />
<img class="ms-old-class" src="some-old-path/image.jpg" alt="alt text" style="border: none;" />`;
const cleanHTML = cleanContent(html);
console.log(cleanHTML);
// <img class="img-fluid" src="/uploads/image.jpg" alt="alt text">
// <img class="img-fluid" src="/uploads/image.jpg" alt="alt text">
// <img class="img-fluid" alt="" src="/uploads/image.jpg">
// <img class="img-fluid" src="/uploads/image.jpg" alt="alt text">
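For readers curious how this kind of cleanup might be approached, here is a minimal regex-based sketch. It is a hypothetical re-implementation, not the code in cleanContent.js: the function name cleanImages and its uploadsPath parameter are assumptions, and it only handles double-quoted attributes with no > character inside attribute values.

```javascript
// Hypothetical re-implementation for illustration; NOT the project's helper.
function cleanImages(html, uploadsPath = '/uploads/') {
  // Match each <img ...> tag, capturing its attributes without the
  // optional trailing " /" of a self-closing tag.
  return html.replace(/<img\b([^>]*?)\s*\/?>/gi, (match, attrs) => {
    const cleaned = attrs
      // Drop any existing class and style attributes.
      .replace(/\s+(?:class|style)="[^"]*"/gi, '')
      // Point src at the new folder, keeping only the file name.
      .replace(/(\ssrc=")(?:[^"]*\/)?([^"\/]+)"/i, `$1${uploadsPath}$2"`);
    // Re-emit the tag with a single new class in front of the kept attributes.
    return `<img class="img-fluid"${cleaned}>`;
  });
}

const sample =
  '<p><img class="ms-old-class" src="some-old-path/image.jpg" alt="alt text" style="border: none;" /></p>' +
  '<img alt="" src="some-old-path/image.jpg" />';

console.log(cleanImages(sample));
// <p><img class="img-fluid" src="/uploads/image.jpg" alt="alt text"></p><img class="img-fluid" alt="" src="/uploads/image.jpg">
```

Regular expressions keep the sketch dependency-free, but an HTML parser would be more robust for markup that is less predictable than the feed described above.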