Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validator Generated (validator-generated.php file) needs update #231

Open
sarath49 opened this issue Apr 19, 2018 · 8 comments
Open

Validator Generated (validator-generated.php file) needs update #231

sarath49 opened this issue Apr 19, 2018 · 8 comments

Comments

@sarath49
Copy link

validator-generated.php file has not been updated for 2 years and there are several implementation like the issue

#222

which removed second or more implementation of side bar.

@sarath49 sarath49 changed the title Validator Genratrated validator-generated.php needs update Validator Generated (validator-generated.php file) needs update Apr 19, 2018
@mxhCodes
Copy link

Another thing which does not work because the validator is old like my gandpa:
amp-list when using further attributes "items" and "single-item".

Please update the validator.

@mxhCodes
Copy link

Seems that the fork needs a synchronization with the original AMPHTML repo first to get a new generated validation file: "This branch is 12 commits ahead, 5647 commits behind ampproject:master."

@timonweb
Copy link

Any plans on updating validator-generated.php? AMP has got many new tags which isn't possible to use due to outdated validator-generated.php.

@mxhCodes
Copy link

Currently seems to be not easy, see also https://www.drupal.org/project/amp/issues/2962305

@marcelovani
Copy link

See another discussion here #181

@westonruter
Copy link

I suggest using the generated PHP file in the AMP WordPress plugin: https://github.com/Automattic/amp-wp/blob/develop/includes/sanitizers/class-amp-allowed-tags-generated.php

We'd like to work on extracting this into a Composer package as well to make it easier to incorporate.

Here's the Python script that generates the file from the latest amphtml release: https://github.com/Automattic/amp-wp/blob/develop/bin/amphtml-update.sh

@marcelovani
Copy link

marcelovani commented Dec 7, 2018

Hi @westonruter

I was very curious to see this working, and I did some experiments as a proof
of concept. I actually did the opposite first, I moved some code from AMP WP into
the AMP Library to prove that WP can work with it.
The next step would be to start porting the code from WP Plugin into the library
or perhaps updating the library, whatever makes more sense.
Please have a look at this and comment below.

This experiment is an attempt to merge the functionalites of
Wordpress AMP plugin and Lullabot AMP Library by making an assessment
of what could be abstracted and put into a PHP package that could be used by
different platforms.

I work with AMP for Drupal and I am one of the contributors to the Drupal AMP project.
The Drupal module uses the AMP Library to validate and convert the markup.
The Library comes with many interesting functionalities including:

  • The ability to validate the AMP markup
  • The ability to convert HTML markup to AMP Markup, fixing and removing invalid tags.
  • Reporting changes and errors found
  • A command line tool that can be used to convert files manually, useful for testing
  • PHPUnit tests
  • I works with third party dependencies that are managed with composer

The AMP Library was developed in 2016 by Lullabot in partnership with Google.
Some of the key people involved:

Here is the Full list of contributors

Maintainers of the Drupal AMP Module:

Here is the Full list of contributors

The problem

As Google AMP team keep pushing new functionalities into the AMP Html Project
on a daily basis, we have a situation where the AMP Library no longer works with
the latest AMP Tags and attributes, which are stripped out by the AMP Library.
Here are some of the issues for AMP Library and Drupal AMP module:

  • Lullabot AMP library Validate Generator needs update: #2962305
  • Move Power User mode to separate module: #2970810
  • Incorrectly removing AMP-Auto-Ads code: #2959782
  • Validator Generated (validator-generated.php file) needs update: #231
  • Question: validator-generated.php: #181
  • AMP bind script is being removed: #226
  • Amp-sidebar is being removed: #222
  • rtc-config attribute is getting stripped from tags: #220
  • validator-generated.php seems to be incomplete: #208

AMP Library uses an auto-generated class to validate and strip invalid AMP tags,
see more details here
This file is generated by a the Lullabot fork
of the ampproject/amphtml.

The problem is that the generated file no longer works with the latest AMP HTML code.
There is another problem, the Validators on this folder
are basically a port from validate.js to PHP classes. The original validate.js
has changed a lot and has been moved into a folder called engine.

So, the fix requires a total re-write of the PHP classes and the Python script
that creates the generated validator class.

The experiment

1. WordPress

I was curious about the Wordpress plugin

The Wordpress plugin is mainly maintained by:

Here is the Full list of contributors

I decided to have a look into the code to see how it works and what bits of
functionalities are common with AMP Library.
I can see that the module has some Sanitizers
which are called Passes in the AMP library.
This is a good sign that we can probably join some of the code.

Since the AMP Library is already a standalone package, I started by moving some
of the code from the Wordpress plugin into the AMP Library. I also removed the
git submodule. See all comments on this Pull request Abstracting AMP validators and sanitizers into AMP Library .

So, now we have WordPress plugin using AMP Library!

It does the job and also prints some debugging information, see screenshots:
Screenshots
On the left, you see the Desktop version on a WordPress site.
In the middle, you see the AMP version of the article using the original AMP WP plugin.
On the right, it's the Forked WP Plugin using the Lullabot Library with some extra
debugging information (optional).

This is the HTML markup of the article

AMP Converter Test
<p style="text-align: center;">Image
  <img class="wp-image-13 size-medium aligncenter" src="https://cdn.shopify.com/s/files/1/2222/0805/products/Stereo-845-PP-e1441807241782_480x480.jpg" alt="" width="300" height="198" />
</p>
<p style="text-align: center;">YouTube Video
  <iframe src="https://www.youtube.com/embed/cKgIsOA-G-Q" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe>
</p>

This is the code that replaces the class:
see https://github.com/marcelovani/amp-wp/pull/4/files#diff-35292ad861a5ed6690412c3ccfe5b0adR318

private function build_post_content() {
    $amp = new \Lullabot\AMP\AMP();
     $content = apply_filters('the_content', $this->post->post_content);
     $options = array(
      'scope' => 'body',
      'add_stats_html_comment' => false
    );
    $amp->loadHtml($content, $options);
    $this->add_data_by_key('post_amp_content', $amp->convertToAmpHtml());
    $this->merge_data_for_key('amp_component_scripts', $amp->getComponentJs());
    $this->add_data_by_key('post_amp_debug', $amp->warningsHumanHtml());
    return;

I wanted to change the scope to apply the AMP markup on the full HTML that is
rendered by WordPress, but I don't know how to do that. But for the proof of
concept, this should be a good start.

How to test this

  • Install WordPress as usual
  • Check out the forked AMP WP pulgin inside the wp-content/plugins
    folder buy running: git clone -b amp-library [email protected]:marcelovani/amp-wp.git amp
  • Go inside the wp-content/plugins/amp folder and run composer install.
  • Enable and configure the AMP plugin
  • Create an article and visit the /amp url

2. Lullabot AMP Library

I moved some code from the WP plugin and from the Lullabot fork of AMPHTML into
This fork of AMP Library
See this pull request Amphtml generator update

How to regenerate the validator scripts

The AMP Library currently has both validators (class-amp-allowed-tags-generated.php and validator-generated.php)
side by side, until we do the actual work to update the classes and then get rid of one of them.

  • Check out the forked AMP library buy running: git clone -b amphtml-generator [email protected]:dennisinteractive/amp-library.git amp-library
  • Go inside the amp-library and run Vagrant up --provision to start the VM (Requires Vagrant and VirtualBox)
  • Get inside the VM vagrant ssh
  • Go to the bin folder cd /vagrant/bin
  • Run the build script sh ./build.sh

The build script will check out the latest tag of AMPHtml, copy the Python
scripts inside vendor/amphtml/validator folder, generate the PHP files and copy
them into src/Spec folder.
These files can be committed and the AMP Library can be tagged and released.

The command line tool

  • Go into wp-content/plugins/amp/vendor/lullabot/amp folder and run composer install
  • Run vendor/bin/amp-console amp:convert vendor/lullabot/amp/sample-html/sample-html-fragment.html
    Options:
--no-orig-and-warn  If set, the original HTML and warnings/messages encountered during conversion will not be printed out
      --no-lines          If set, the line numbers will be not printed alongside the AMPized HTML.
      --diff              If set, a diff of the input and output HTML will be printed out instead of the AMP html. Note that the original HTML will be formatted before being diffed with output HTML for best results. This is because the output HTML is also formatted automatically.
      --js                If set, a list of custom amp components and the url to include the js is printed out
      --full-document     If set, assumes this is a whole document html document and not an html fragment underneath the body (which is the default)
      --options=OPTIONS   If set, loads options from the file indicated
  -h, --help              Display this help message
  -q, --quiet             Do not output any message
  -V, --version           Display this application version
      --ansi              Force ANSI output
      --no-ansi           Disable ANSI output
  -n, --no-interaction    Do not ask any interactive question
  -v|vv|vvv, --verbose    Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug

Pull requests:

Challenges and ideas

We can either discuss this here, or use

@westonruter
Copy link

I've filed ampproject/amp-wp#2315 to begin an exploration into extracting the logic from the AMP WordPress plugin into something which can be re-used by PHP-based CMSes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants