validate-website

Description

Web crawler for checking the validity of your documents

Installation

Debian

apt install ruby-dev libxslt1-dev libxml2-dev

For complete local validation, also install the tidy packages.

RubyGems

gem install validate-website

Synopsis

validate-website [OPTIONS]
validate-website-static [OPTIONS]

Examples

validate-website -v -s https://www.ruby-lang.org/
validate-website -v -x tidy -s https://www.ruby-lang.org/
validate-website -v -x nu -s https://www.ruby-lang.org/
validate-website -h

Description

validate-website is a web crawler that checks markup validity against XML Schema / DTD and reports URLs that are not found (more info in doc/validate-website.adoc).

validate-website-static checks the markup validity of your local documents against XML Schema / DTD (more info in doc/validate-website-static.adoc).

HTML5 is supported through libtidy5 or the Validator.nu web service.

Exit status

  • 0: Markup is valid and no 404 found.
  • 64: Invalid markup found.
  • 65: Some pages were not found.
  • 66: Both invalid markup and pages not found.
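
The exit codes above can be turned into readable messages in a wrapper script. The following is a minimal sketch: `EXIT_MESSAGES`, `describe_exit_status`, and `run_validation` are hypothetical names chosen for illustration and are not part of validate-website; only the numeric codes and the `-s` flag come from this README.

```ruby
# Hypothetical lookup table built from the exit statuses listed above.
EXIT_MESSAGES = {
  0  => 'markup is valid and no 404 found',
  64 => 'invalid markup found',
  65 => 'pages not found',
  66 => 'invalid markup and pages not found'
}.freeze

# Return a human-readable description for an exit status.
# Unknown codes (for example 127 when the command is missing) get a generic message.
def describe_exit_status(code)
  EXIT_MESSAGES.fetch(code) { "unexpected exit status #{code}" }
end

# Hypothetical wrapper: run the crawler on a URL and return [status, message].
def run_validation(url)
  system('validate-website', '-s', url)
  status = $?.exitstatus
  [status, describe_exit_status(status)]
end
```

A CI script could call `run_validation('https://www.ruby-lang.org/')`, print the message, and pass the status to `exit` so the build fails on 64, 65, or 66.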

In your application

require 'validate_website/validator'
body = '<!DOCTYPE html><html></html>'
v = ValidateWebsite::Validator.new(Nokogiri::HTML(body), body)
v.valid? # => false

Jekyll static site validation

You can add this Rake task to validate a jekyll site:

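desc 'validate _site with validate website'
task validate: :build do
  Dir.chdir("_site") do
    # HTTP_URL must be defined elsewhere in the Rakefile (the site's public URL).
    system("validate-website-static",
           "--verbose",
           "--exclude", "examples",
           "--site", HTTP_URL)
    # Propagate the validator's exit status to the caller.
    exit($?.exitstatus)
  end
end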

More info

HTML5

Tidy5

If libtidy5 is found on your system, it is used by default to validate your HTML5 documents. This does not depend on a third-party service; everything is done locally.

nokogiri

nokogiri can validate HTML5 documents without a third-party service, but it reports fewer errors than tidy.

Validator.nu web service

When the --html5-validator nu option is used, HTML5 validation is done through the Validator.nu web service, so the content of your web pages is logged by a third party. This is not the case for the other validations, because validate-website uses the XML Schema or DTD files stored in the data/ directory.

Please read http://about.validator.nu/#tos for more info on the HTML5 validation service.

Use validator standalone web server locally

You can download the validator jar and start it with:

java -cp PATH_TO/vnu.jar nu.validator.servlet.Main 8888

Then you can use the validate-website option:

--html5-validator-service-url http://localhost:8888/
# or
export VALIDATOR_NU_URL="http://localhost:8888/"

This prevents you from being blacklisted by the validator web service.
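
When the crawler is launched from a Ruby script, the local validator URL can be added to the command only when one is available. This is a sketch under assumptions: `validate_website_command` is a hypothetical helper, not part of the gem, and it only combines flags that appear in this README (`-s`, `--html5-validator nu`, `--html5-validator-service-url`).

```ruby
# Hypothetical helper: build the argument list for validate-website.
# site          - URL to crawl
# validator_url - optional URL of a locally running validator (assumption:
#                 started as shown above, e.g. http://localhost:8888/)
def validate_website_command(site, validator_url: nil)
  cmd = ['validate-website', '-s', site, '--html5-validator', 'nu']
  # Point at the local validator only when a URL was given.
  cmd += ['--html5-validator-service-url', validator_url] if validator_url
  cmd
end
```

The resulting array can be passed to `system(*validate_website_command(site, validator_url: url))`, which avoids shell quoting issues because each argument is passed separately.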

Tests

With standard environment:

bundle exec rake

Credits

  • Thanks to tenderlove for Nokogiri; this tool is inspired by markup_validity.
  • And Chris Kite for Anemone web-spider framework and postmodern for Spidr.

Contributors

See GitHub.

License

The MIT License

Copyright (c) 2009-2022 Laurent Arnoud

