Support Substack with a custom domain #3244
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,14 +2,14 @@ | |
"translatorID": "ac3b958f-0581-4117-bebc-44af3b876545", | ||
"label": "Substack", | ||
"creator": "Abe Jellinek", | ||
"target": "^https://[^.]+\\.substack\\.com/(p/|archive)", | ||
"target": "/p/|/archive", | ||
"minVersion": "3.0", | ||
"maxVersion": "", | ||
"priority": 100, | ||
"inRepository": true, | ||
"translatorType": 4, | ||
"browserSupport": "gcsibv", | ||
"lastUpdated": "2022-10-05 15:16:38" | ||
"lastUpdated": "2024-01-29 20:12:55" | ||
} | ||
|
||
/* | ||
|
@@ -37,6 +37,8 @@ | |
|
||
|
||
function detectWeb(doc, url) { | ||
if (!url.match(/^https:\/\/[^.]+\.substack\.com\/(p\/|archive)/) && !text(doc, "a.footer-substack-cta")) | ||
return false; | ||
if (url.includes('/p/')) { | ||
return "blogPost"; | ||
} | ||
|
@@ -49,7 +51,7 @@ function detectWeb(doc, url) { | |
function getSearchResults(doc, checkOnly) { | ||
var items = {}; | ||
var found = false; | ||
var rows = doc.querySelectorAll('a.post-preview-title[href*="/p/"]'); | ||
var rows = doc.querySelectorAll('a[data-testid="post-preview-title"][href*="/p/"]'); | ||
This is a good one - looks like this part is plain broken right now due to this change in the markup. In view of that, even if the pattern change doesn't get in (e.g., I can imagine maintainers proposing an alternative approach to keep the |
||
for (let row of rows) { | ||
let href = row.href; | ||
let title = ZU.trimInternal(row.textContent); | ||
|
@@ -225,6 +227,37 @@ var testCases = [ | |
"seeAlso": [] | ||
} | ||
] | ||
}, | ||
{ | ||
"type": "web", | ||
"url": "https://www.latent.space/p/ai-ux-moat", | ||
"items": [ | ||
{ | ||
"itemType": "blogPost", | ||
"title": "How to Make AI UX Your Moat", | ||
"creators": [ | ||
{ | ||
"firstName": "Anshul", | ||
"lastName": "Ramachandran", | ||
"creatorType": "author" | ||
} | ||
], | ||
"date": "2023-07-07", | ||
"abstractNote": "Design great AI Products that go beyond \"just LLM Wrappers\": make AI more present, more practical, and then more powerful.", | ||
"blogTitle": "Latent Space", | ||
"url": "https://www.latent.space/p/ai-ux-moat", | ||
"websiteType": "Substack newsletter", | ||
"attachments": [ | ||
{ | ||
"title": "Snapshot", | ||
"mimeType": "text/html" | ||
} | ||
], | ||
"tags": [], | ||
"notes": [], | ||
"seeAlso": [] | ||
} | ||
] | ||
} | ||
] | ||
/** END TEST CASES **/ |
This is of course quite generic, so I suspect this translator will be considered for quite a few URLs (e.g., many sites plausibly have some "archive"). A couple of suggestions in view of that:

- The `priority` field should be bumped up to "250", I think. See the translator priority docs for details. This accounts for the `a.footer-substack-cta` check, which I think can be counted as a "unique check in detectWeb".
- Restrict the `target` pattern as much as possible, at least by adding a `(?:$|\?)` (non-capturing because performance 🙂) for the `archive` part (see a proposal and a counterexample below). `/p/` is trickier, as I'm not sure what symbols Substack could use there, but maybe something along the same lines, i.e., `/p/<something-but-not-/>$`? Though I see in the test set that there may be a `/p/<post name>/comments`, so maybe not exactly that. Essentially, this is just a suggestion to think twice about any additional options.

As for the `archive`, below is the proposed alteration and here's the counterexample link I could think of OTMH: IACR ePrint paper version record.
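
To make the effect of the `(?:$|\?)` suffix concrete, here is a minimal sketch comparing the `target` value from this diff with an anchored variant. The anchored pattern is only an illustrative assumption, not necessarily the exact alteration proposed in the review, and the `example.org` URLs are hypothetical placeholders rather than real pages.

```javascript
// The "target" value introduced in this diff.
const loose = new RegExp("/p/|/archive");

// Hypothetical anchored variant (an assumption for illustration):
// "/archive" must be followed by the end of the URL or a query string.
const anchored = new RegExp("/p/|/archive(?:$|\\?)");

// Placeholder URLs; example.org is not a real Substack site.
const urls = [
	"https://example.org/archive",
	"https://example.org/archive?sort=new",
	"https://example.org/archive/versions/2024/1", // non-Substack-style path
	"https://example.org/p/some-post",
	"https://example.org/p/some-post/comments",
];

// Returns one boolean per URL: does the pattern match it?
function matches(pattern) {
	return urls.map(url => pattern.test(url));
}

const looseMatches = matches(loose);
const anchoredMatches = matches(anchored);

urls.forEach((url, i) => {
	console.log(`${url}  loose=${looseMatches[i]}  anchored=${anchoredMatches[i]}`);
});
```

With the anchored variant, a path such as `/archive/versions/...` no longer matches, while `/archive`, `/archive?...`, and both `/p/` forms (including `/comments`) still do.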