A Go toolkit for web content processing, analysis, and SEO optimization, offering utilities to efficiently extract favicons, links, descriptions and titles.
Note
Each method is thoroughly tested and optimized for performance, but the package is still in development and may contain unseen bugs. Please don't hesitate to report any issues you encounter!
Features
- LRU Caching
- Timeout
- User-Agent
go get github.com/devnyxie/katsuragi
The GetTitle() function currently supports the following title tags:
<title>Title</title>
<meta name="twitter:title" content="Title">
<meta property="og:title" content="Title">
package main

import (
	"fmt"

	. "github.com/devnyxie/katsuragi"
)

func main() {
	// Create a new fetcher with a 3-second timeout and a cache capacity of 10
	fetcher := NewFetcher(
		&FetcherProps{
			Timeout:  3000, // 3 seconds
			CacheCap: 10,   // up to 10 network requests are cached
		},
	)
	defer fetcher.ClearCache()

	// Get the website's title
	title, err := fetcher.GetTitle("https://www.example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(title)
}
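To make the CacheCap setting more concrete, here is a minimal sketch of how a capacity-bounded LRU cache behaves in general. It is not katsuragi's implementation: the lruCache type, its methods, and the choice of caching page bodies by URL are all assumptions made for illustration.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one cached item. Hypothetical type, not part of katsuragi.
type entry struct {
	key   string
	value string
}

// lruCache is an illustrative least-recently-used cache keyed by URL.
type lruCache struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
}

func newLRUCache(capacity int) *lruCache {
	return &lruCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the cached value and marks the key as most recently used.
func (c *lruCache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put stores a value, evicting the least recently used entry when full.
func (c *lruCache) Put(key, value string) {
	if c.capacity <= 0 {
		return
	}
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
}

func main() {
	cache := newLRUCache(2)
	cache.Put("https://a.example", "<html>A</html>")
	cache.Put("https://b.example", "<html>B</html>")
	cache.Get("https://a.example")                   // a becomes most recently used
	cache.Put("https://c.example", "<html>C</html>") // evicts b, the least recently used
	_, ok := cache.Get("https://b.example")
	fmt.Println(ok) // false
}
```

With a capacity of 10, as in the example above, an eleventh distinct entry would push out whichever entry was used least recently.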
The GetDescription() function currently supports the following description meta tags:
<meta name="description" content="Description">
<meta name="twitter:description" content="Description">
<meta property="og:description" content="Description">
...
// Get website's description
description, err := fetcher.GetDescription("https://www.example.com")
...
The GetFavicons() function currently supports the following favicon tags:
<link rel="icon" href="favicon.ico">
<link rel="apple-touch-icon" href="favicon.png">
<meta property="og:image" content="favicon.png">
Open Graph image (og:image) will be used only if both og:image:width and og:image:height are present and equal, forming a square image.
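The square-image rule can be sketched as a small check. This is an illustration only: isSquareOGImage is a hypothetical helper, not a katsuragi function, and rejecting non-positive or non-numeric values is an assumption on top of the rule as stated.

```go
package main

import (
	"fmt"
	"strconv"
)

// isSquareOGImage reports whether the og:image:width and og:image:height
// values are both present, parse as positive integers, and are equal.
// Hypothetical helper for illustration; not part of the katsuragi API.
func isSquareOGImage(width, height string) bool {
	w, err := strconv.Atoi(width)
	if err != nil || w <= 0 {
		return false // missing or invalid width
	}
	h, err := strconv.Atoi(height)
	if err != nil || h <= 0 {
		return false // missing or invalid height
	}
	return w == h
}

func main() {
	fmt.Println(isSquareOGImage("512", "512"))  // true
	fmt.Println(isSquareOGImage("1200", "630")) // false
	fmt.Println(isSquareOGImage("", "512"))     // false
}
```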
...
// Get website's favicons
favicons, err := fetcher.GetFavicons("https://www.example.com")
// [https://www.example.com/favicon.ico, https://www.example.com/favicon.png]
...
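The example output above shows absolute URLs even though the tags carry relative href values. One way such resolution could be done with the standard library is sketched below; resolveHref is a hypothetical helper and may differ from what the package does internally.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveHref turns a possibly relative href into an absolute URL,
// using the page URL as the base. Hypothetical helper for illustration.
func resolveHref(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

func main() {
	for _, href := range []string{"favicon.ico", "/favicon.png", "https://cdn.example.com/icon.png"} {
		abs, err := resolveHref("https://www.example.com/", href)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(abs)
	}
}
```

An href that is already absolute passes through unchanged, so the same helper covers both cases.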
The GetLinks() function searches for all <a> tags in the HTML document and returns a slice of links.
Options:
- Url (required): The URL of the website to fetch.
- Category (optional): The category of links to fetch. Possible values are internal, external, and all. Default is all.
// Get website's links
links, err := fetcher.GetLinks(GetLinksProps{
	Url:      "https://www.example.com",
	Category: "external",
})
// [https://www.youtube.com/example, https://www.facebook.com/example]
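As a rough illustration of the internal/external distinction, the sketch below classifies a link by comparing its host with the page's host. classifyLink is a hypothetical helper, and treating "same hostname" as internal (so subdomains and non-HTTP links such as mailto: count as external) is an assumption; the package's actual rule may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// classifyLink returns "internal" when href resolves to the same hostname
// as pageURL and "external" otherwise. Hypothetical helper for illustration.
func classifyLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	// Resolve relative links against the page URL before comparing hosts.
	target := base.ResolveReference(ref)
	if strings.EqualFold(target.Hostname(), base.Hostname()) {
		return "internal", nil
	}
	return "external", nil
}

func main() {
	page := "https://www.example.com"
	for _, href := range []string{"/about", "https://www.example.com/blog", "https://www.youtube.com/example"} {
		category, err := classifyLink(page, href)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(href, "->", category)
	}
}
```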
go test -v
# Generate coverage.out report, generate HTML report from coverage.out, and open the HTML report in the browser:
go test -coverprofile=coverage.out && go tool cover -html=coverage.out -o coverage.html && open coverage.html
This project is licensed under the GNU General Public License (GPL). You can find the full text of the license here.