Storage location of XML Sitemap should be freely determinable #8728
Comments
I think that's sufficient. In a multidomain installation you might want to add redirects for sitemap.xml anyway if you want search engines to be able to find the correct sitemap.xml for each domain automatically, e.g. http://www.example1.org/sitemap.xml etc. Also, why do you use these virtual subfolders instead of (sub)domains? How does that even work within Contao? I don't think that's a supported use case in general? |
Yeah, that's how I do it (redirects), it's just not very user friendly and it "blows up" the htaccess over time. I use virtual subfolders instead of subdomains for SEO reasons. Our main domain has been around for quite a while and has earned significant trust, backlinks and "ranking power" for certain keywords. Google used to treat subdomains pretty much as individual domains (so the subdomains hardly inherit any domain authority that the main domain has earned), while URLs with subfolders under the same domain benefitted much more from the authority of the main domain. We're ranked well for keywords x, y and z -- this way the new sites pretty quickly rank well for the keywords "location1 + x, y, z". Nowadays Google says it doesn't really make a difference anymore -- however, this is the system we started out with so that's why we're still operating that way. :-) In Contao I simply set "example.com" as domain in every root page -- the "subfolder" is determined by the alias of the root page. |
check https://github.com/hofff/contao-robots-txt-editor in combination with https://github.com/hofff/contao-htaccess There are some more problems with the sitemap:
We solve these 3 steps with the 2 extensions. |
@madmaharaja Did you check the two extensions above? |
Has anybody considered that placing the sitemap.xml in a subdirectory of the webroot it is used for violates the standard? https://www.sitemaps.org/protocol.html#location A solution like the one used in Contao always requires either a symlink or a redirect. Maybe the cross-submit rule also applies to subfolders, so using a modified robots.txt would work, too. But anyway, relying on either extensions or on administrators actively working around things doesn't feel right. |
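To make the location rule concrete: per the sitemaps.org protocol, a sitemap may only list URLs that share its scheme and host and start with the directory the sitemap file lives in (unless cross-submission via robots.txt is used). A minimal sketch of that check, with hypothetical function names and the example URLs from this issue:

```python
import posixpath
from urllib.parse import urlsplit


def sitemap_scope(sitemap_url: str) -> str:
    """Return the URL prefix that a sitemap at this location may cover."""
    parts = urlsplit(sitemap_url)
    directory = posixpath.dirname(parts.path)
    if not directory.endswith("/"):
        directory += "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"


def is_in_scope(sitemap_url: str, page_url: str) -> bool:
    """True if page_url may be listed in the sitemap without cross-submission."""
    return page_url.startswith(sitemap_scope(sitemap_url))


# A sitemap stored in /share does not cover pages below /location1/ ...
print(is_in_scope("https://www.example.com/share/location1_sitemap.xml",
                  "https://www.example.com/location1/page.html"))
# ... while a sitemap stored in /location1/ does.
print(is_in_scope("https://www.example.com/location1/sitemap.xml",
                  "https://www.example.com/location1/page.html"))
```

This ignores the robots.txt cross-submit exception mentioned above, so it only models the default rule.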
So actually, |
Yes. That's something we should have by default. It makes no sense to enable a sitemap by checkbox etc. We just have to make sure the correct one is output. That would be a superb feature ;) |
Indeed :). Also - couldn't the (appropriate) sitemap simply be generated on the fly within that route instead of going through the trouble of generating the XML files in the cron whenever there was a change? On large sites this can cause memory overflow problems and (as discussed in contao/check#134) its generation blocks the response in Contao 4 (if you do not use php-fpm). |
Yeah, it can be generated on demand but obviously not every time it is requested. So I'd still cache it somewhere in |
I would like to - unfortunately we are overbooked currently ... |
@leofeyer can you move that to contao/core-bundle please? Because it's not going to change for Contao 3.5 anyway but would be a super nice addition to any future Contao 4 version. |
There is no need to move the ticket. Do you want me to assign it to you? |
Talked to @frontendschlampe about it, maybe they'll be working on a PR :) |
I've talked to @Toflar via Mumble, because we're currently updating our hofff/contao-robots-txt-editor and hofff/contao-htaccess. If you want, we will make a PR for this:
For a website with several languages under the same domain, there will be one sitemap per language (maybe we will add the language to the sitemap name) and one robots.txt containing the absolute paths to all sitemaps. I hope I described it correctly. :-) /cc @cliffparnitzky |
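For illustration, this is roughly what the robots.txt part of that proposal could look like. The `Sitemap:` directive with an absolute URL is part of the sitemaps.org protocol; the `sitemap-<lang>.xml` naming and the function name are assumptions made only for this sketch, not something the extensions are known to use:

```python
def sitemap_directives(base_url: str, languages: list[str]) -> str:
    """Render one 'Sitemap:' line per language, each with an absolute URL."""
    base = base_url.rstrip("/")
    # The file naming scheme below is hypothetical.
    return "\n".join(f"Sitemap: {base}/sitemap-{lang}.xml" for lang in languages)


print(sitemap_directives("https://www.example.com/", ["en", "de"]))
```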
Very good, except the "create a robots.txt file" part. We have discussed this several times and decided not to mess with user generated files. |
Should the URL limit per sitemap be considered? A sitemap may not contain more than 50,000 URLs. Are use cases like a huge news portal, shop (e.g. Isotope), music catalogue, ... with more than 50k "objects" relevant? |
If it is a route, we do not mess with it at all :) It's sort of fallback. If you upload a |
Yes ... we will do that. Should we use the limit of 50,000 URLs or a lower one? Maybe 20,000? |
Google recommends splitting them up (I did not check how exactly) and I'm sure there's some recommendation on the threshold somewhere :) |
I will check! |
The sitemap is split into multiple files, and a sitemap index file is built that references them.
|
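The splitting discussed above can be sketched as follows: chunk the URL list at the protocol limit of 50,000 entries, write one `<urlset>` per chunk, and reference the chunks from a `<sitemapindex>`. This is only an illustration with hypothetical function names, not how Contao or the mentioned extensions implement it:

```python
from xml.sax.saxutils import escape

MAX_URLS = 50_000  # per-file limit from the sitemaps.org protocol
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
HEADER = '<?xml version="1.0" encoding="UTF-8"?>'


def chunk_urls(urls, size=MAX_URLS):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls):
    """Build one sitemap file for a single chunk of page URLs."""
    entries = "".join(f"<url><loc>{escape(u)}</loc></url>" for u in urls)
    return f'{HEADER}<urlset xmlns="{NS}">{entries}</urlset>'


def build_index(sitemap_urls):
    """Build the sitemap index that points to the individual sitemap files."""
    entries = "".join(
        f"<sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls
    )
    return f'{HEADER}<sitemapindex xmlns="{NS}">{entries}</sitemapindex>'


pages = [f"https://www.example.com/page-{i}.html" for i in range(5)]
files = [build_urlset(chunk) for chunk in chunk_urls(pages, size=2)]
index = build_index(
    f"https://www.example.com/sitemap-{n}.xml" for n in range(1, len(files) + 1)
)
```

A lower threshold such as 20,000 only changes the `size` argument; the protocol additionally limits the uncompressed file size, which this sketch does not check.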
Please use the existing cache! The response simply needs appropriate cache headers, and everything's taken care of 😉 . No need to store the files anywhere. I've used |
Currently all xml sitemaps in Contao are automatically stored in the /share folder. For multi-site installations that make use of subfolders rather than different (sub)domains this turns out to be a problem when wanting to submit a sitemap to Google Webmaster tools.
If you have the following structure:
www.example.com (main page)
www.example.com/location1/ (local site with its own root page)
www.example.com/location2/
etc.
All sitemaps will be stored in /share:
www.example.com/share/main_sitemap.xml
www.example.com/share/location1_sitemap.xml
etc.
The problem
If you have individual Google properties for each site and you want to submit a sitemap for each site to Google Webmaster Tools, the form looks as shown in this screenshot.
Currently the only workaround I could think of is to set a redirect in .htaccess:
RedirectPermanent /location1/location1_sitemap.xml /share/location1_sitemap.xml
However, I don't find this very convenient to do for every new location we add online.
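As a stopgap, the per-location rules do not have to be written by hand. A minimal sketch, assuming the `<alias>_sitemap.xml` naming from the example above; the function name and the list of aliases are hypothetical. Note that Apache's `Redirect` directives expect the source URL-path to begin with a slash.

```python
def redirect_rules(aliases):
    """Return one RedirectPermanent line per root page alias."""
    return [
        f"RedirectPermanent /{alias}/{alias}_sitemap.xml /share/{alias}_sitemap.xml"
        for alias in aliases
    ]


# Paste the printed lines into .htaccess.
print("\n".join(redirect_rules(["location1", "location2"])))
```

This still grows the .htaccess with every location, so it only automates the workaround rather than replacing it.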
My suggestion
Give the option to select an alternative folder where the sitemap can be stored (e.g. see illustration)
Or another alternative:
Override storing a sitemap in the standard /share folder by typing in folder + name of sitemap into the respective field: