Unwanted attributes appearing in index #172

Open
gazconroy opened this issue Apr 16, 2021 · 13 comments
Comments

@gazconroy

I want to report a bug:

What is the current behavior?

I've defined an algolia_hooks.rb in the _plugins directory and filled it with classes/attributes that I don't want indexed. However, the dashboard tells me that they have still been indexed.

What is your expected behavior?

None of these classes should turn up on the site's index.

Git repository to reproduce the issue:

https://github.com/gazconroy/digital-comma/tree/gh-pages

Ruby version used:

2.5

Jekyll version used:

3.9

@Haroenv
Contributor

Haroenv commented Apr 19, 2021

Is it possible that your CI hasn't run since you updated the configuration to avoid those nodes? It looks to me as if it's failing for another reason: https://travis-ci.org/github/gazconroy/digital-comma/builds/767322353

@gazconroy
Author

Cheers. Travis has never worked, so I have been updating the index manually from the command line as required. Any other ideas about why the Algolia update is failing to remove those classes?

@Haroenv
Contributor

Haroenv commented Apr 19, 2021

Unfortunately, I have no further idea what could cause it. If you manually add logging to the plugin, can you see whether your method is called?
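
One way to add that logging, as a minimal sketch: put a `warn` call inside the hook in _plugins/algolia_hooks.rb. It assumes the hook in use is `before_indexing_each(record, node, context)` and that record keys are Ruby symbols; the `[algolia_hooks]` prefix is only an illustrative label. If nothing is printed during indexing, the hook file is probably not being loaded at all.

```ruby
# _plugins/algolia_hooks.rb
# Sketch only: logs every candidate record so you can confirm the hook runs.
module Jekyll
  module Algolia
    module Hooks
      def self.before_indexing_each(record, node, _context)
        # In a real run the node is an HTML element; fall back to the Ruby
        # class name if the object has no #name method.
        label = node.respond_to?(:name) ? node.name : node.class
        # Kernel#warn writes to standard error, once per candidate record.
        warn "[algolia_hooks] #{label}: keys=#{record.keys.inspect}"
        # Return the record unchanged so indexing behaves as before.
        record
      end
    end
  end
end
```

The log line lists the keys present at the moment the hook runs, which also shows which attributes exist at that stage.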

@gazconroy
Author

gazconroy commented Apr 19, 2021

What flag adds logging?

I have a bit more insight into this, though. It has partially worked, in that it has reduced the number of classes from over 140 down to 25. I suspect some of them may be 'protected'. Here are those I attempted to remove but could not:

  • objectID
  • date
  • custom_ranking
  • author_profile
  • read_time
  • comments
  • share
  • related
  • show_date
  • header
  • classes
  • excerpt_html
  • slug
  • type

Some of them may well be required for Algolia to work, but it would be nice to have a list of such 'protected' items.

@Haroenv
Contributor

Haroenv commented Apr 19, 2021

objectID is required, so trying to remove it may indeed be what leaves the index inconsistent. The others aren't required.

@gazconroy
Author

Mmm. Doesn't seem to make a difference. Perhaps those protections are within the jekyll-algolia code?

@pixelastic
Collaborator

Hello @gazconroy,

The Algolia API does not force any specific attribute, except the objectID. All other attributes are generated by jekyll-algolia.

You can find here in the JSON example the base keys added by the plugin (and needed for it to correctly sort your results): https://community.algolia.com/jekyll-algolia/how-it-works.html

I don't remember whether those keys are added before or after the hook, though. If they are added after, you won't be able to remove them. If they are added before, you can remove them, but you might then break the relevance of the plugin.
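
For reference, removing keys from a record inside the per-record hook could look like the sketch below. It assumes the `before_indexing_each(record, node, context)` hook and symbol keys on the record; the `UNWANTED_KEYS` constant is a hypothetical name, filled with some of the keys mentioned in this thread. It can only delete keys that already exist when the hook runs, so anything the plugin adds afterwards would still appear in the index.

```ruby
# _plugins/algolia_hooks.rb
# Sketch only: strips selected keys from each record before it is indexed.
module Jekyll
  module Algolia
    module Hooks
      # Hypothetical list; adjust to the keys you do not want in the index.
      UNWANTED_KEYS = %i[
        author_profile read_time comments share related show_date header
      ].freeze

      def self.before_indexing_each(record, _node, _context)
        # Hash#delete is a no-op for keys that are not present.
        UNWANTED_KEYS.each { |key| record.delete(key) }
        # Returning the record keeps it in the index, minus the deleted keys.
        record
      end
    end
  end
end
```

objectID is deliberately left out of the list, since Algolia requires it.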

@gazconroy
Author

Thank you for the update, @pixelastic. It does look like those keys are added after the hook. However, I also can't remove other keys that are not in that 'how it works' list:

  • author_profile
  • read_time
  • comments
  • share
  • related
  • show_date
  • header
  • tags

The algolia_hooks.rb code is successfully removing some unwanted keys. Just not those...

@pixelastic
Collaborator

Wait, I had a look at your hook code and I think there might be some confusion here.

You're talking about removing keys from Algolia records, but the hook you shared seems to remove entire records based on their CSS classes. So I think we might not be talking about the same thing here.

The plugin works by creating one Algolia record (the items you see in your dashboard) per HTML node (the things you're matching against in your hook). If the hook returns nil, that record is not created. If the record is created, it contains a bunch of keys (the default ones I shared earlier, but you can also use the hook to add custom keys).
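
To make the distinction concrete, here is a minimal sketch of a hook that drops whole records based on the CSS class of the matched node. It assumes the `before_indexing_each(record, node, context)` hook and that the node exposes its attributes through `node['class']`; the `EXCLUDED_CLASSES` constant and the class names in it are hypothetical.

```ruby
# _plugins/algolia_hooks.rb
# Sketch only: skips the whole record when its HTML node has an excluded class.
module Jekyll
  module Algolia
    module Hooks
      # Hypothetical class names; replace with the ones used by your theme.
      EXCLUDED_CLASSES = %w[page__share page__related].freeze

      def self.before_indexing_each(record, node, _context)
        # A class attribute may hold several space-separated names, or be absent.
        node_classes = node['class'].to_s.split
        # Returning nil means no record is created for this node.
        return nil if (node_classes & EXCLUDED_CLASSES).any?

        # Otherwise the record is indexed with all of its keys intact.
        record
      end
    end
  end
end
```

A hook like this changes how many records reach the index, not which keys each remaining record carries.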

Does that help? If not, could you share a screenshot of what you see when you say "However, the dashboard tells me that they still have been [indexed]"?

@gazconroy
Author

gazconroy commented Jun 6, 2021

Sure. Here's the dashboard display of those keys.

[Screenshot: algolia index]

As you can imagine, my interpretation of this is that the plugin converts CSS classes to Algolia keys (which seems like a mighty fine idea to me).

@pixelastic
Collaborator

@gazconroy Could you also share the frontmatter of the matching post?

@gazconroy
Author

layout: splashplace
title: Writing human-readable JavaScript for APIs
categories:
  - Javascript
header:
  overlay_color: "#000"
  overlay_filter: "0.5"
  overlay_image: assets/images/javascript.jpg
  teaser: assets/images/javascript.jpg
excerpt: URLSearchParams allows you to compose easy-to-understand API calls

@gazconroy
Author

gazconroy commented Jun 9, 2021

This is the Minimal Mistakes theme with a customised layout for this post, but all other content (standard Minimal Mistakes posts/pages) exhibits the same behaviour.
