Problem/Motivation

'Top Stories with AMP' is a carousel at the top of Google's search results, which highlights AMP-enabled content. Google's documentation specifies what metadata is allowed/required to support this AMP feature. Much of that data may already exist in a Drupal site, so we should try to expose it where possible.

See the docs at https://developers.google.com/structured-data/carousels/top-stories#publ...

Proposed resolution

This module should:

* expose canonical url, title, image, publisher name and logo, date, author and description metadata, as described in the docs linked above.
* provide some configuration mechanism allowing a content editor to select the image that should be used for the Top Stories carousel
* cooperate with metadata or other popular contrib modules to provide this data.

CommentFileSizeAuthor
#67 interdiff-2717641-66-67.txt593 bytesrainbowarray
#67 2717641-67-top-stories-config.patch79.87 KBrainbowarray
#66 2717641-64-top-stories-config.patch79.77 KBrainbowarray
#64 interdiff-2717641-63-64.txt879 bytesrainbowarray
#64 2717641-64-top-stories-config.patch0 bytesrainbowarray
#63 interdiff-2717641-58-63.txt5.19 KBrainbowarray
#63 2717641-63-top-stories-config.patch79.87 KBrainbowarray
#59 interdiff-2717641-55-58.txt7.42 KBrainbowarray
#59 2717641-58-top-stories-config.patch77.01 KBrainbowarray
#55 interdiff-2717641-52-55.txt1.15 KBrainbowarray
#55 2717641-55-top-stories-config.patch74.99 KBrainbowarray
#53 interdiff.patch2.34 KBkarens
#52 2717641-52-top-stories-config.patch74.89 KBkarens
#51 interdiff-2717641-50-51.txt3.13 KBrainbowarray
#51 2717641-51-top-stories-config.patch74.63 KBrainbowarray
#50 interdiff-2717641-49-50.txt1012 bytesrainbowarray
#50 2717641-50-top-stories-config.patch73.95 KBrainbowarray
#49 2717641-49-top-stories-config.patch73.97 KBrainbowarray
#49 interdiff-2717641-48-49.txt3.57 KBrainbowarray
#48 interdiff-2717641-47-48.txt16.38 KBrainbowarray
#48 2717641-48-top-stories-config.patch74.08 KBrainbowarray
#47 interdiff-2717641-44-47.txt2.92 KBrainbowarray
#47 2717641-47-top-stories-config.patch74.18 KBrainbowarray
#44 interdiff-2717641-41-44.txt6.87 KBrainbowarray
#44 2717641-44-top-stories-config.patch73.91 KBrainbowarray
#43 Screenshot from 2016-06-02 16-04-38.png30.32 KBmtift
#42 Screenshot from 2016-06-02 16-00-36.png10.6 KBmtift
#41 2717641-41-top-stories-config.patch72.48 KBrainbowarray
#39 interdiff-2717641-38-39.txt14.86 KBrainbowarray
#39 2717641-39-top-stories-config.patch72.31 KBrainbowarray
#38 2717641-38-top-stories-config.patch63.93 KBrainbowarray
#37 interdiff-2717641-32-37.txt22.9 KBrainbowarray
#37 2717641-37-top-stories-config.patch63.96 KBrainbowarray
#34 no-metadata.png32.12 KBmtift
#33 no-global.png23.64 KBmtift
#33 changing-global-config.png59.83 KBmtift
#32 interdiff-2717641-27-32.txt3.18 KBrainbowarray
#32 2717641-32-top-stories-config.patch45.45 KBrainbowarray
#28 interdiff-2717641-26-27.txt918 bytesrainbowarray
#28 2717641-27-top-stories-config.patch46.69 KBrainbowarray
#26 2717641-24-top-stories-config.patch46.59 KBrainbowarray
#26 interdiff-2717641-23-24.txt1.43 KBrainbowarray
#24 2717641-23-top-stories-config.patch45.93 KBrainbowarray
#24 interdiff-2717641-22-23.txt7.34 KBrainbowarray
#23 interdiff-2717641-22-23.txt0 bytesrainbowarray
#23 2717641-23-top-stories-config.patch45.93 KBrainbowarray
#22 interdiff-27176411-19-22.txt5.67 KBrainbowarray
#22 2717641-22-top-stories-config.patch48.95 KBrainbowarray
#19 interdiff-2717641-18-19.txt14.78 KBrainbowarray
#19 2717641-19-top-stories-config.patch45.57 KBrainbowarray
#18 interdiff-2717641-16-18.txt5.98 KBrainbowarray
#18 2717641-18-top-stories-config.patch37.83 KBrainbowarray
#17 interdiff-2717641-14-16.txt31.74 KBrainbowarray
#17 2717641-16-top-stories-config.patch37.42 KBrainbowarray
#14 interdiff-2717641-11-14.txt19.22 KBrainbowarray
#14 2717641-14-top-stories-config.patch32.23 KBrainbowarray
#11 interdiff-2717641-9-11.txt10.41 KBrainbowarray
#11 2717641-11-top-stories-config.patch12.99 KBrainbowarray
#10 2717641-top-stories-config.patch8.76 KBrainbowarray

Comments

dsayswhat created an issue. See original summary.

dsayswhat’s picture

Issue summary: View changes
dsayswhat’s picture

Issue summary: View changes
dsayswhat’s picture

Issue tags: +1.0 blocker

Adding tags to this issue

rainbowarray’s picture

Assigned: Unassigned » rainbowarray
rainbowarray’s picture

rainbowarray’s picture

On the AMP module config page (or on a subpage), we should have a form for settings that will then be used by the AMP theme to provide schema info in the head section of the document for AMP pages. The form items we will need are as follows:

  • Schema type
  • Organization name
  • Organization logo (600x60)
  • Article author name
  • Article image for carousel (at least 696 wide)
  • Article description (limited to 150 characters?)

Specific markup details are available at https://developers.google.com/structured-data/carousels/top-stories#mark....

Tokens would be useful for several of these items, such as the author, article image and article description.

We looked at integrating with metatag module, but there really isn't anything that meshes well with what is needed for this schema information.

The schema bit looks like one of the trickier parts. In theory a site could have both news articles and blog posts and want each noted separately? It's unclear right now what effect choosing NewsArticle or BlogPosting has. The documentation specifies those two schemas, but I'm unsure if theoretically it might be possible to use other schemas in the future as well. For now we maybe make that a global setting, and allow for setting that on a per-content-type basis if there's a demand for that later?

berdir’s picture

Per node type settings could be done as third party settings of the node type config entity and be part of the node type form.

Need to check how this is implemented, would be good to have an alter hook so it can be customized easily.

rainbowarray’s picture

Posting my progress so far on this. So far, this sets up a separate tab on the AMP admin page with the ability to set an organization name and logo.

My plan is to split out the settings for the organization (name, logo) from the settings for content (author, image, description, schema type). For the latter, I am planning to make a config entity. That will make it possible to add config settings per content type. I'd rather have those settings in the AMP config section rather than on content type edit pages. I'll also need to set up default settings for content. Not entirely sure if simple config or a config entity is better suited for that. I'm thinking maybe simple config, since the config entity will likely have a content type associated with it, while default settings would not.

rainbowarray’s picture

StatusFileSize
new8.76 KB

Didn't upload patch.

rainbowarray’s picture

Status: Active » Needs work
StatusFileSize
new12.99 KB
new10.41 KB

This adds default content config with simple config values. That is working fine, huzzah.

I am having loads of trouble with the organizational logo. I was getting errors that led me to find that we need to set file.usage on the logo after it is saved. You need to provide an entity type and entity ID for the parent of the managed file, neither of which exist for a file saved using simple config. If we had a config entity, maybe, but that seems like overkill. So for now adding a fake type called 'organization' and entity ID based on the file ID. If you go to the file usage page for that file, that leads to an error, but c'est la vie.

Now however it appears that once you upload a file in this field, it cannot be removed. If you press remove and then hit save, the file reappears. Because reasons.

I added a hidden field to help determine when a file has been removed so it can be deleted and usage deleted as well. In theory it seems like that should work, but it does not. Because reasons.

Letting this go for the night.

Next step is probably to set up config entities for the per-content type overrides of content settings.

berdir’s picture

  1. +++ b/amp.routing.yml
    @@ -6,6 +6,14 @@ amp.settings:
    +amp.settings.top_stories:
    +  path: '/admin/config/content/amp/top_stories'
    +  defaults:
    +    _form: '\Drupal\amp\Form\AmpTopStoriesSettingsForm'
    +    _title: 'Provide Top Stories carousel information'
    +  requirements:
    +    _permission: 'administer site configuration'
    

    I'm not convinced about calling this "Top Stories".

    That's just one example of how google currently uses it. all of it is 100% generic metadata like author/image/site name/..

    So I'd go with something like metadata settings or so.

  2. +++ b/src/Form/AmpTopStoriesSettingsForm.php
    @@ -0,0 +1,271 @@
    +    $amp_top_stories_config = $this->config('amp.top_stories');
    

    I would strongly recommend to use third party settings storage for this.

    Having a single config file to store configuration for config entities makes for example anything related to default config/features integration very painful, you can't put the settings for a specific node type into default config of a module that defines that node type.

    Also, you need to take care of cleaning up your settings when node types are deleted. All of that just works with third party settings.

    You can still have your own settings form if you want to.

  3. +++ b/src/Form/AmpTopStoriesSettingsForm.php
    @@ -0,0 +1,271 @@
    +    $form['organization_group']['description'] = [
    +      '#type' => 'item',
    +      '#description' => t('Provide information about your organization for use in the Top Stories carousel in Google Search.'),
    +    ];
    

    Same here and for 90% of the other mentions. I'd keep the introduction at the top (although that might work better in hook_help() ?). But anything else is just generic metadata and not at all just "for Top stories"

  4. +++ b/src/Form/AmpTopStoriesSettingsForm.php
    @@ -0,0 +1,271 @@
    +    // Add an option to select the schema type for AMP pages.
    +    $schema_type_options = [
    +      'Article' => 'Article',
    +      'NewsArticle' => 'NewsArticle',
    +      'BlogPosting' => 'BlogPosting',
    +    ];
    +    $form['default_content_group']['amp_content_schema_type'] = [
    +      '#type' => 'select',
    +      '#title' => t('AMP schema type'),
    +      '#options' => $schema_type_options,
    

    Given that we need per node type settings anyway, I think it would be easy to use it for this as well?

  5. +++ b/src/Form/AmpTopStoriesSettingsForm.php
    @@ -0,0 +1,271 @@
    +    // Add an option to set the image on AMP pages.
    +    $top_stories_image_guidelines_url = Url::fromUri('https://developers.google.com/search/docs/data-types/articles#article_types');
    +    $form['default_content_group']['amp_content_image'] = [
    +      '#type' => 'textfield',
    +      '#title' => t('Article image for carousel'),
    +      '#description' => t('An article image to appear in the Top Stories carousel. Images must be at least 696px wide: refer to <a href=":image_guidelines">article image guidelines</a> for further details. Use tokens to provide the correct image for each article page.', [':image_guidelines' => $top_stories_image_guidelines_url->toString()]),
    +      '#default_value' => $amp_top_stories_config->get('amp_content_image'),
    +    ];
    

    Related: image tokens really need image style support. There's currently a separate module for that that needs an 8.x port, I want to get it into token module.

berdir’s picture

For 2, the settings are not yet per node type, so it doesn't apply to what's there right now but what will come :)

rainbowarray’s picture

StatusFileSize
new32.23 KB
new19.22 KB

I am in the process of transitioning this to using a config entity for settings. Worked on getting the entity set up today. Used Drupal Console to generate boilerplate files to get started.

So this doesn't work yet, but posting my work to give an update on the progress. Will be working tomorrow on setting up the config forms to work with the config entity.

Right now planning to provide global metadata settings and per content type overrides with the same config entities. Would be possible to allow for third-party settings to add overrides on the content type forms, but I think we'll be able to centralize this in the AMP config settings using a model similar to metatag's admin screens.

berdir’s picture

I don't really understand why the global settings need to be in the config entity too, that adds a whole bunch of properties that by definition will only ever be used once. And makes the form more complicated?

rainbowarray’s picture

This adds the form for adding and editing AMP Metadata config entities.

There's a bug with how remove image is working for the organizational logo. It seems to persist no matter what I do.

#15: The organization logo needs to have an entity as its parent for the file usage table to not have an error. So now we have a config entity that can provide that. For the content type override, I'm collapsing the organization info, but still making it available. I could see somebody wanting to provide a different org name and logo for their articles versus their blog posts, for example. A rare use case maybe, but possible. I could hide those fields completely for content type overrides but I don't think that's necessary.

rainbowarray’s picture

StatusFileSize
new37.42 KB
new31.74 KB
rainbowarray’s picture

StatusFileSize
new37.83 KB
new5.98 KB

This fixes the bug with organizational logo file removal.

rainbowarray’s picture

StatusFileSize
new45.57 KB
new14.78 KB

This adds the summary page for the AMP metadata.

berdir’s picture

I still think this is a *huge* amount of code to just to send a handful of properties. But apparently I can't convince you, so giving up on that. Below is a review of some things I noticed.

At least in D7, it was absolutely possible to just set a fake type/id for a global setting image. Themes have to handle that somehow as well?

  1. +++ b/src/AmpMetadataInfo.php
    @@ -0,0 +1,132 @@
    +    $amp_metadata_settings = $this->entityTypeManager->getStorage('amp_metadata')->loadMultiple();
    +    /** @var AmpMetadataInterface $amp_metadata  */
    +    foreach ($amp_metadata_settings as $amp_metadata_id => $amp_metadata) {
    +      if ($amp_metadata->isGlobal()) {
    +        $amp_metadata_global_exists = TRUE;
    +      }
    +    }
    +    $this->cache->set('amp_metadata_has_global', $amp_metadata_global_exists);
    +
    

    You can do this with an entity query. And actually, since you hardcode the ID based on the selection, you can just check if your global entity exists with a single load() call. Which you don't need to cache as that is already cached.

  2. +++ b/src/AmpMetadataInfo.php
    @@ -0,0 +1,132 @@
    +   */
    +  public function getAmpContentTypesWithoutMetadataSettings() {
    +    if ($cache = $this->cache->get('amp_content_types_without_metadata_settings')) {
    +      $options = $cache->data;
    +    }
    +    else {
    +      $options = $this->updateAmpContentTypesWithoutMetadataSettings();
    +    }
    +    return $options;
    +  }
    

    I assume this is only used in the backend? Caching that information seems like overkill to me. Loading the config is pretty fast and cached already itself.

  3. +++ b/src/AmpMetadataInterface.php
    @@ -0,0 +1,185 @@
    +  /**
    +   * Apply these settings per content type.
    +   *
    +   * @param string $content_type_id
    +   *   The identifier of the content type.
    +   */
    +  public function setContentType($content_type_id);
    +
    

    code usually uses node type, not content type, that's a UI thing. (just like node vs. content).

  4. +++ b/src/Entity/AmpMetadata.php
    @@ -0,0 +1,294 @@
    + * @ConfigEntityType(
    + *   id = "amp_metadata",
    + *   label = @Translation("AMP Metadata"),
    

    You can put something like this in the annotation:

    * lookup_keys = {
    * "type",
    * "status",
    * },

    with the fields that you do entity queries on (content type I guess). Then core automatically manages an index for you and can just directly load the right config entity. Even more reason then to just drop the caching that you have atm.

  5. +++ b/src/Entity/AmpMetadata.php
    @@ -0,0 +1,294 @@
    + *   config_prefix = "amp_metadata",
    

    conig_prefix only needs to be specified when you want to have a shorter key. For example, this now reads amp.amp_metadata, which is repetitive, you could set it to metadata to just have amp.metadata.

  6. +++ b/src/Entity/AmpMetadata.php
    @@ -0,0 +1,294 @@
    + *   admin_permission = "administer site configuration",
    

    Does amp not have a permission for its own settings yet?

    Can be a separate issue but I think that would be useful. Using this can be problematic if you want to give users access to a few items in admin/config as there is no way to prevent amp settings, which they most likely don't need to change, from showing up.

  7. +++ b/src/EntityTypeInfo.php
    @@ -1,10 +1,5 @@
     
    -/**
    - * @file
    - * Contains Drupal\amp\EntityTypeInfo.
    - */
    -
     namespace Drupal\amp;
     
    

    unrelated, also most of your new files are adding them again ;)

  8. +++ b/src/Form/AmpMetadataForm.php
    @@ -0,0 +1,351 @@
    +    $introduction_title = $amp_metadata_global_exists && !$amp_metadata->isGlobal() ? 'Content type override for' : 'Global';
    ...
    +      '#title' => $introduction_title . ' metadata settings',
    

    This can't be translated, avoid merging strings together, even if you make the parts translatable. Make complete strings, pass each separately trough t()

  9. +++ b/src/Form/AmpMetadataForm.php
    @@ -0,0 +1,351 @@
    +        $amp_metadata->set('id', 'global_metadata');
    +        $amp_metadata->set('label', $this->t('Global AMP metadata'));
    ...
    +        $amp_metadata->set('id', $content_type . '_metadata_override');
    +        $amp_metadata->set('label', $this->t('@type override for AMP metadata', [
    

    given that the full name is already amp.metadata (or amp.amp_metadata right now), repeating that suffix in the id makes it unnecessary long?

    Setting the id like this is also problematic, as the name will change depending on your current interface language without the ability to change it.

    Building the entity is usually done in buildEntity(), this is too late for validation for example. Having this all as custom code (most of core just uses 1:1 mappings between form elements and config) will result in setting a lot of weird stuff in the default buildEntity() implementation, so you probably want to override it completely.

berdir’s picture

One more note. Since this metadata can actually be added on any page, not just AMP, I've implemented something myself now that I'm adding to all my article pages, not just AMP.

Haven't found a contrib module yet that does that, but there was quite some discussion to remove rdf.module in core in favor of this: #2152459: [Policy] Deprecate RDF module and move it to contrib.

I know you prefer to have a simple approach and not many dependencies, but it might still be worthwhile to consider moving this to a separate module. The code doesn't even need to know about AMP, I simply implemented hook_page_attachments() and then I'm adding it there to the output and it shows up anywhere.

One problem seems to be exact structure, I'm following the AMP examples from the link you posted, but that doesn't validate.

rainbowarray’s picture

StatusFileSize
new48.95 KB
new5.67 KB

Trying to only show the add metadata settings link when there are no global settings or when there are no amp-enabled content types without settings. I'm finding this to be practically impossible to do. Tried adding a cache context to vary the cache based on whether or not there are amp-enabled content types without settings. That doesn't seem to work. Nor does the cache context on the local action seem to bubble up to the local actions block. Tried setting that context on the block manually. No dice. Tried setting cache max-age to 0. Nope. Tried creating a cache tag and invalidating it when we update whether content types without settings (the possibility of doing that is one reason I moved that into its own cached function). That didn't work either.

I'll probably end up just doing a redirect on the add page if there are no available settings and then provide a message to that effect.

Wanted to post a patch with the caching things I tried in case I want to refer to it later. For what it's worth, I found it very difficult to find out how to create a cache tag and invalidate it.

rainbowarray’s picture

StatusFileSize
new45.93 KB
new0 bytes

Stripped out the cache settings for the add metadata link. Was able to add a message on the add page if global settings and all content type overrides have already been set. Would be nice to do a page redirect as well, but cannot find a way to do that. This isn't a controller, so I can't return a redirect response to the collection page. You can set a redirect URL on form state, but that only takes effect when the form is submitted. Boo.

rainbowarray’s picture

StatusFileSize
new7.34 KB
new45.93 KB

Something was wrong with the last patch and interdiff. Here's the correct ones.

rainbowarray’s picture

#20: Sorry I didn't reply to this sooner. Appreciate the feedback.

Yes, you can create a fake entity type and ID. However, if you do so, then when you go to the files page and go to the file usage page for a particular file, a white screen with an ugly error appears because there isn't a valid entity type.

We had decided to create a config entity for the per content type overrides. At that point, adding the extra items for the organization name and logo didn't seem that much extra. This also solves the issue with file usage error.

1. I'll take a look at using an entity query.

2. I was looking at the AmpEntityInfo service, which did similar caching. I can take a look at pulling out the caching. There are a couple thorny caching bugs I'm trying to fix, and I think there's still a chance these lookup caches might be useful, but I'll make sure to look back at this.

3. Thanks. I'll look to switch from ContentType to NodeType in the code.

4. Good tip. Thanks.

5. Okay, I'll take a look at this Thanks.

6. Yes, we do not have a separate permission on that. Good point, definitely should be a separate issue I think.

7. I used Drupal Console to generate most of the files for this, and they still had the file statements in there. I think I've pulled those out now.

8. I'll take a look, thanks.

9. Just to be clear, this is the machine name of the content type. I don't think that will change with translations, right?

#21. I'm not sure about moving this to a separate module. The documentation for this makes it seem like this is mostly useful for AMP Top Stories carousels. I thought it might be useful for non-AMP Top Stories, but I'm not seeing that on the documentation page.

rainbowarray’s picture

StatusFileSize
new1.43 KB
new46.59 KB

Fixing some whitespace bugs mtift found, and adding token module as a dependency.

mtift’s picture

Applying this patch gives me the following errors when I visit admin/config/content/amp/amp_metadata:

TypeError: Argument 1 passed to Drupal\amp\AmpMetadataListBuilder::buildRow() must implement interface Drupal\Core\Entity\EntityInterface, null given, called in /var/www/amp/docroot/core/lib/Drupal/Core/Entity/EntityListBuilder.php on line 229 in Drupal\amp\AmpMetadataListBuilder->buildRow() (line 42 of /var/www/amp/docroot/modules/contrib/amp/src/AmpMetadataListBuilder.php).

and

Notice: Undefined index: global_metadata in Drupal\amp\AmpMetadataListBuilder->load() (line 28 of /var/www/amp/docroot/modules/contrib/amp/src/AmpMetadataListBuilder.php).

rainbowarray’s picture

StatusFileSize
new46.69 KB
new918 bytes

This should fix a bug on the AMP Metadata list page if there were no global settings.

rainbowarray’s picture

Follow-up issue to add tests.

mtift’s picture

If we are going to go this route, we'll need to clean up the UX. For example, if I go to admin/config/content/amp/amp_metadata/add and click "Save" and then do that again, I get a WSOD.

If I then go back to enable AMP for a content type, such as article, and then head back to the Metadata tab, "article" does not appear in the list of choices.

If I go back and delete the "Global" config, it says "There is no AMP Metadata yet," but if I click "Add AMP Metadata" then it says "Cannot add new settings: please edit existing settings and overrides."

In other words, having this "global config" setup as it is can cause all sorts of problems. This is making me question whether or not it is wise to continue down the path of using a configuration entity (which is something used for zero or more things) to store settings that need exactly one copy (the purpose of simple configuration). But if the logos are causing problems in simple config, then the path forward is not totally clear to me.

Either way, this workflow introduced in the latest patch will not work. Hmmm.....

rainbowarray’s picture

#30: There are some weird caching bugs going on. Not sure what exactly is causing that, but I'll investigate further.

rainbowarray’s picture

StatusFileSize
new45.45 KB
new3.18 KB

I think this should fix the caching bugs. Berdir had pointed out the caching of AMP metadata info checks was probably unnecessary. As it turns out, I think it was also causing these caching bugs.

I think the problem was that the metadata check cache wasn't updated when entities were deleted. So in theory a change in the delete form could also fix this problem. But since remove the cache for these checks was already recommended, I went with that.

mtift’s picture

Issue summary: View changes
StatusFileSize
new59.83 KB
new23.64 KB

Looking better. Still some issues.

1. Changing the image results in three messages:

2. Removing the global config results in settings that override nothing. That's probably OK in theory, but seems confusing:

3. Perhaps most importantly, I don't see any of the metadata appearing in my article node. For example, with this configuration, I see neither "My org" nor "Org override" when I visit an article URL such as node/1?amp.

mtift’s picture

StatusFileSize
new32.12 KB

I guess that last image did not work. Trying again

rainbowarray’s picture

Issue summary: View changes

mtift: Yes, I still need to add the theming parts to this that takes the data and adds the JSON schema info from this into head.

I'll look into those error messages, that definitely is a problem.

Not sure what to say about removing the global settings. I suppose we could change the language for the per content type settings to no longer say that they are overrides. Or only change that language if global settings get removed maybe?

rainbowarray’s picture

Providing a link to the potential D7 solution for this, which is slightly different.

rainbowarray’s picture

We are now outputting the JSON schema metadata if all of the necessary metadata is present.

I added an additional config field for a headline token, as there is a 110 character limit for the headline, so somebody may want to use a separate short headline field.

Working with content image tokens is kind of clunky. The best solution I found was to suggest using the Image URL Formatter module, then setting a Token view mode and use the Image URL formatter for the image field in question. In D7, you can change an image field token to file token fields to get the URL or fid, but that doesn't seem possible at present in D8.

I also made the canonical URL absolute on AMP pages, as that seemed important based on #2737729: Canonical and AMPHTML must be absolute, and gets reused in the schema we output here.

rainbowarray’s picture

StatusFileSize
new63.93 KB

Reroll.

rainbowarray’s picture

StatusFileSize
new72.31 KB
new14.86 KB

The one about usability

This adds a number of usability improvements.

  1. The add metadata settings button no longer appears if you already have global settings and have settings for all AMP-enabled content types.
    • The caching on this add button is really, really thorny. There's no controller for the link: the render array gets built up in the LocalActionsBlock plugin. You can use a local actions link alter hook to unset the link to make it disappear, but even with appropriate caching settings, that unsetting gets cached and isn't rechecked even after a cache tag is invalidated, probably because the block never gets created.
    • You can remove the content in a preprocess block, but if you remove it entirely the absence of the block ends up getting cached, and unless you set a cache tag on maybe the region, there's nothing that ends up rechecking the block's existence.
    • Yes, in theory all these caching settings should bubble up. In practice that doesn't seem to happen if the plugin or block never appear because their contents have been unset.
    • So, my diabolical solution was to replace the block contents with some markup with a message about not being able to add more settings, then hiding the previous block contents in a theme wrapper with a hidden class. That actually works. It is very ugly, but it works, so huzzah for that.
  2. The message about adding metadata settings or viewing existing settings is no contextual depending on which settings have already been set.
    • The label "Article overrides" (as an example) is tied to the name of the config entity. That would need to be changed in order for it to not say overrides if there are no global settings. That seems like an uncommon scenario that doesn't necessarily need to be optimized for beyond the changes that have already been made.
  3. Added image styles to help resize the organization logo and content image to the appropriate standards. Is there a way to add this config in an install hook if it doesn't exist? That might be handy for people who have already installed the module.

I am likely going to move the large chunks of code in amp.module that build the render array with the JSON metadata. I think it probably makes the most sense to create a render element for that. In theory a generic service class could work too. However the purpose of a render element is to output a render array and encapsulate extra processing that needs to happen in order to do so. That's pretty much what we're doing here, so that seems like a good idea. Moving that code into a class should make it easier to test if we decide that's something we'd like to test.

I also need to add caching info to that render array. That's lacking now, so it takes a cache clear to get the schema info to appear in head (if all the metadata is complete).

The one about why this patch takes the approach it's taking

It's also worth noting that yes, this is a fair amount of code to collect some metadata settings and output some JSON. The approach we're taking with configuration entities is very similar to the approach used in the Metatag module, which also collects global settings and provides overrides for particular content types or taxonomies.

We're not extending the Metatag module itself for a number of reasons. We only need metadata settings for particular content types that have AMP enabled, not all of them. No AMP metadata settings are needed for taxonomies. Communicating where to go to add metadata settings in Metatag module, but only on certain pages and not others, seems fairly thorny. Metatag is a fairly hefty module and requiring it in order to provide an essential function for AMP seems like a pretty big burden, particularly when the user experience might be a bit awkward.

So instead we'e creating configuration entities, which work well for the content type overrides, and we also allow for global settings on those entities.

I'm feeling pretty good about the user experience of how this is working now. Glad for feedback.

mtift’s picture

2717641-39-top-stories-config.patch does not apply

rainbowarray’s picture

Status: Needs work » Needs review
StatusFileSize
new72.48 KB

Reroll.

mtift’s picture

Issue summary: View changes
StatusFileSize
new10.6 KB

Applying this patch, I'm seeing the following error:

mtift’s picture

Issue summary: View changes
Status: Needs review » Needs work
StatusFileSize
new30.32 KB

And once I add global settings, I don't see anywhere to add more:

Not sure if I'm missing something here.

rainbowarray’s picture

Status: Needs work » Needs review
StatusFileSize
new73.91 KB
new6.87 KB

The screenshot in #43 is happening because with a fresh install, there are no AMP-enabled content types. That effectively looks the same as when all AMP-enabled content types have settings with the way I was checking things. So I've now added an additional check to see if there are any AMP-enabled content types. If not, I provide a message to that effect. I needed to add an additional cache tag for AMP-enabled content types for the caching to work properly.

I didn't run into the exact error as noted in #42. However I did run into an error installing the AMP module when this patch is applied. I'm fairly sure that's because there are problems in the schema definition file for the config entity. I'd very much appreciate some help figuring out where I went wrong there. This is the first time I've made a config entity, and there's a very solid likelihood that I haven't defined the schema properly. Any feedback on that file would be much appreciated.

berdir’s picture

I think you have a dependency on serialization.module now with that service. I don't think that is needed, \Drupal\Component\Serialization\Json::encode should be enough for what you need?

I'm not saying we have to depend on metadata module. I'm not using that either and agree that it is very heavy. But, all this ld+json schema stuff is in no way AMP specific. AMP uses it, but if you have all that code to build it, why not expose it on normal pages as well? Google is going to thank you.

At the same time, someone might want to do it themself, for example because they have different data to consider (I have my own date fields for example).

It's fine if it stays optional, but it could also work fine as its own standalone module. And as mentioned above, an alter hook would be useful as well if you want to switch out a few properties. That is hard without as it is then json encoded.

rainbowarray’s picture

#45:

serialization.json maps to \Drupal\Component\Serialization\Json, so I'm not sure how using the service call is different from directly calling that service?

I'm planning on moving the JSON prep into a render element, which should hopefully help improve the extensibility of this.

I'm leery of extracting this into a separate module because of both the scope increase and the potential to degrade the user experience for using this with AMP.

If the goal becomes providing rich snippets for not only AMP-enabled content, there's a lot of other potential additions that might be necessary for that. And the more additional options, the more complex this could become for somebody who just wants to get AMP working.

The JSON processing will be removed from the theme once this is complete, so this will be essential in order to have valid AMP. I'm reluctant to make an essential part of this module more complex than it needs to be, when it's not entirely clear what the benefits of that scope expansion would be.

rainbowarray’s picture

StatusFileSize
new74.18 KB
new2.92 KB

A recent reroll stopped the JSON schema output from working. This fixes that. I tried to add cache tags for the script tag render array so that it would update when the metadata entities are updated, but that doesn't seem to be working. The cache tag does not appear to be bubbling up. I tried to trace that through the attachment renderer. It seemed like head tags should bubble up cacheability metadata, so not sure why this isn't working.

rainbowarray’s picture

StatusFileSize
new74.08 KB
new16.38 KB

This should address nearly all of the comments Berdir brought up in #20.

1. Added an entity query... then realized you were suggesting doing an entity load instead of doing that. So that's what's happening now. Fewer lines of code, huzzah!

2. I removed the caching of the lookups in an earlier patch.

3. Refactored this patch to refer to node type in code, and content type in the UI.

4. Added lookup keys.

5. Shortened the config prefix down to metadata. Also tweaked the URLs to just be amp/metadata rather than amp/amp_metadata.

6. Did not address the permissions issue.

7. I believe all the file statements have been eliminated now. There may still be a few lingering in other parts of the amp module. I'm cleaning these up as I work on files that have them.

8. Updated the translation strings to be complete rather than fragments.

9. Updated the ids of the config entities (as well as the labels). The latter should help reduce confusion if there are settings for a node type, but not for global.

I'd be glad for more code review to find things like this. Hopefully it's getting cleaner.

rainbowarray’s picture

StatusFileSize
new3.57 KB
new73.97 KB

This fixes the error with installation. Apparently declaring dependencies in the image style config.install files was causing problems. Not sure why.

rainbowarray’s picture

StatusFileSize
new73.95 KB
new1012 bytes

Thanks to some help from Wim Leers, this patch fixes the caching on the script element with the JSON. Turns out, an html_head attachment can't have caching info declared within its render array, since it's an attachment, and attachments bubble separately from caching. So instead we attach the caching info to $build itself rather than the render array. And now that works. Yay!

Going to refactor the gobs of code in amp.module that preps the script element to go into a render element instead. That's next on the list!

rainbowarray’s picture

StatusFileSize
new74.63 KB
new3.13 KB

This patch fixes three things:

1) When global settings and all content type settings, an error message will only display if you somehow try to add new settings, not when you edit existing settings.
2) Previously unless all JSON settings were exactly correct, no JSON would output in order to prevent invalid and incomplete JSON from making it seem like things were working when in reality they would not. In theory that's good, but in practice that makes this really hard to test and debug. So instead we allow incomplete JSON schema now. In a follow-up patch we can work on better ways to allow for debugging this.
3) The content image URL token display mode Image URL formatter thing is pretty fiddly and hard to get working right. So now we do some fallback processing if the token outputs an image element. This now works either with a URL token or with an img element token (or an amp-img element for that matter).

So hopefully this works much better for testing. Still would like to do some more refactoring of the amp.module code that preps the JSON, whether that ends up being in this issue or a follow-up.

karens’s picture

StatusFileSize
new74.89 KB

I tested this and ran into the problem mentioned above in #42. Then I tried a fresh install and did not. After trying several combinations I found that happens if you enable amp before applying the patch, which effectively will be like updating to this code in an existing installation. That error seems related to making changes in config and needs to be fixed using an update hook that runs \Drupal::entityDefinitionUpdateManager()->applyUpdates(); I added an update hook and was able to confirm that error went away no matter whether I applied the patch before or after AMP is enabled.

The code for grabbing the image urls has been expanded to extract it from markup, so the complicated message about using the image formatter module and creating another view mode is no longer necessary, so I removed it.

I also swapped in a different method of creating the json using JSON_PRETTY_PRINT, so it is human-readable. The Json serializer provided by core won't do that, but all it does is run json_encode(), so I just used json_encode() directly. This makes it much easier to see what has been created and matches the examples provided in the AMP documentation.

I have one remaining problem. I have created images in the simple way (using the regular image field and letting the code parse out the src). The result in my json is escaping the url, "url": "http:\/\/amp.dev\/sites\/default\/files\/news\/generateImage_c0sYes.jpg", instead of "url": "http://amp.dev/sites/default/files/news/generateImage_c0sYes.jpg", I see the same escaping on the id: "@id": "http:\/\/amp.dev\/node\/23" instead of "@id": "http://amp.dev/node/23". I'm not quite sure where that is coming from but I suspect that's from using things like $content_image_url = PlainTextOutput::renderFromHtml($content_image_url). I am out of time for now to dig into that more.

It would be nice to have default values out of the box for obvious things like site:name, but when I tried to set them up I saw the problem. You can't force them in any time the box is empty because that means you'd have to keep checking every place you really wanted to leave the values empty to be sure the module hasn't put in a value you didn't want there. So I don't think that can be done.

Finally it would be nice to add a line to the description at the very top of the form if Token module is not enabled suggesting that enabling Token module will add a nice token browser to the form to make it easier to insert tokens (as opposed to trying to figure out some other way what tokens are available).

karens’s picture

StatusFileSize
new2.34 KB

And this is the interdiff for #52

mtift’s picture

I'll continue to review this, but I just wanted to say that the patch in #52 is working for me! Thanks, @KarenS!

rainbowarray’s picture

StatusFileSize
new74.99 KB
new1.15 KB

Thanks a million, Karen, for those fixes!

I tweaked the JSON encode to use JSON_UNESCAPED_SLASHES, which eliminates the escaping you saw in the URLs. I also added the JSON options used within Drupal's serialization service's encode method: I assume those were selected for a good reason. In theory we could override the serialization service, but that seems excessive and would change serialization throughout the codebase, which seems too heavy-handed a change.

I don't think we need a notice about enabling the Token module, because Token is already declared as a dependency for AMP, so will definitely be installed.

I agree, Karen, putting in actual defaults for the config settings could be problematic: the best alternative I could find was setting placeholders to at least give people a good idea of what might work for each field.

mtift’s picture

I don't think we need a notice about enabling the Token module, because Token is already declared as a dependency for AMP, so will definitely be installed.

That is kind of true. Before this patch is applied, Token is not a dependency of AMP. This patch adds that dependency.

karens’s picture

If this patch adds a new dependency we should warn existing users. This could be a simple as adding a message to the update hook that says something like "Token module is now a dependency. Please download and enable it. This will also provide you with a token browser to make it easier to select tokens on the metatags configuration page."

mtift’s picture

I'm thinking we should do something like this:

--- a/src/AmpMetadataListBuilder.php
+++ b/src/AmpMetadataListBuilder.php
@@ -312,6 +312,12 @@ class AmpMetadataListBuilder extends ConfigEntityListBuilder {
       $build['header']['#markup'] = '<p>There should be a message here, but apparently something unforseen has happened.</p>';
     }
 
+    // Early versions of the AMP module did not require Token, so we add one
+    // final check here.
+    if (!\Drupal::moduleHandler()->moduleExists('token')) {
+      $build['header']['#markup'] = '<p>In order for metadata to appear properly, the Token module must be enabled.</p>';
+    }
+
rainbowarray’s picture

StatusFileSize
new77.01 KB
new7.42 KB

This adds warning messages if somebody upgrades with this patch applied but has not installed the Token module.

mtift’s picture

I have a lot more testing to do, but here is some initial feedback:

  1. +++ b/amp.module
    @@ -77,17 +83,12 @@ function amp_entity_view_alter(array &$build, EntityInterface $entity, EntityVie
    +              $current_canonical = $config[$key][0]['href'];
    

    We need to check that $config[$key][0]['href'] exists before using it. This does not exist, for example, when a node is added.

  2. +++ b/amp.module
    @@ -410,9 +411,385 @@ function amp_view_modes_submit(&$form, \Drupal\Core\Form\FormStateInterface $for
    +  // Provide description of content, if available: maximum 150 characters.
    +  if (isset($amp_metadata['content_description_token']) && !empty($amp_metadata['content_description_token'])) {
    +    $description = PlainTextOutput::renderFromHtml(\Drupal::service('token')->replace($amp_metadata['content_description_token'], ['node' => $node]));
    +    $amp_temp_json_metadata['description'] = (strlen($description) > 150) ? mb_strimwidth($description, 0, 147, "...") : $description;
    +  }
    

    If a node does not have a description, this will result in having a description of "[node:summary]" but it seems like we would just want this to be blank?

  3. All of the config in config/install/image.style.amp_metadata_content_image_696.yml and config/install/image.style.amp_metadata_logo_600x60.yml needs to be added to an update hook for existing modules.
  4. +++ b/amp.install
    @@ -40,3 +45,9 @@ function amp_update_8001(&$sandbox) {
    +/**
    + * Create the new AMP metadata config entity on existing sites.
    + */
    +function amp_update_8002() {
    

    This docbloc needs a space above it

mtift’s picture

Just a note that in follow-up issues (*not* this one), we should convert _amp_get_amp_url(), _amp_build_metadata(), _amp_get_merged_metadata(), and _amp_prepare_metadata_json() into services.

mtift’s picture

Here are some additional links with more information about the metadata that needs to appear:

https://developers.google.com/search/docs/data-types/articles#use-cases

https://search.google.com/structured-data/testing-tool

rainbowarray’s picture

StatusFileSize
new79.87 KB
new5.19 KB

1) Added an extra check around this. Keep in mind, this is essentially the same code as when the canonical url was made absolute: just using an existing shorthand here.

2) I'm not quite sure what you mean with this. content_description_token is the value of the config field. So that can be a token or a string like 'All my snippets will have this description.' If somebody enters [node:summary] in this field, then that token will be processed. If a summary isn't explicitly declared, this value is an automatic shortening of the body field.

3) Added the image styles in the update hook. Fun learning how to do this! Also, one gotcha I ran into when doing this from a fresh install is that you need to have at least one node created or the update hook will fail (regardless of the addition of the image style creation code) due to a message about the batch table not existing.

4) Space added.

Thanks for the review mtift!

rainbowarray’s picture

StatusFileSize
new0 bytes
new879 bytes

This fixes a bug that was preventing the amphtml link from showing up on non-AMP pages.

mtift’s picture

Status: Needs review » Needs work

@mdrummund: there is no patch attached. Also, based on your interdiff, it gets us back to the problem of using $config[0]['href']; before checking that it exists.

rainbowarray’s picture

StatusFileSize
new79.77 KB

Fixed patch.

rainbowarray’s picture

StatusFileSize
new79.87 KB
new593 bytes

This adds an extra check for $config[0]['href']

mtift’s picture

Status: Needs work » Needs review

I think this is just about ready! Any other feedback welcome. Otherwise, we will probably commit this later today and then open lots of follow-ups.

mtift’s picture

Status: Needs review » Reviewed & tested by the community

We're going to do this

  • mtift committed 06259b2 on 8.x-1.x authored by mdrummond
    Issue #2717641 by mdrummond, KarenS, mtift, Berdir: Add support for 'Top...
mtift’s picture

Status: Reviewed & tested by the community » Fixed

Committed to 8.x-1.x. Congratulations, Marc!

rainbowarray’s picture

Crediting mtift and berdir with the d.o. credit system.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

mtift’s picture