A Go toolkit offering utilities to efficiently extract favicons, links, descriptions, and titles, with support for LRU caching.

devnyxie/katsuragi

katsuragi

A Go toolkit for web content processing, analysis, and SEO, offering utilities to efficiently extract favicons, links, descriptions, and titles.

Note

Each method is thoroughly tested and optimized for performance, but the package is still in development and may contain undiscovered bugs. Please report any issues you encounter!

Features

  • LRU Caching
  • Timeout
  • User-Agent

Installation

go get github.com/devnyxie/katsuragi

Usage

Title

The GetTitle() function currently reads the title from the following tags:

  • <title>Title</title>
  • <meta name="twitter:title" content="Title">
  • <meta property="og:title" content="Title">

package main

import (
	"fmt"

	. "github.com/devnyxie/katsuragi"
)

func main() {
	// Create a new fetcher with a timeout of 3 seconds and a cache capacity of 10
	fetcher := NewFetcher(
		&FetcherProps{
			Timeout:  3000, // 3 seconds
			CacheCap: 10,   // 10 network requests will be cached
		},
	)

	// Clear the cache when main returns
	defer fetcher.ClearCache()

	// Get the website's title
	title, err := fetcher.GetTitle("https://www.example.com")
	if err != nil {
		fmt.Println("failed to get title:", err)
		return
	}
	fmt.Println(title)
}
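
To illustrate what a cache capacity such as CacheCap: 10 implies, here is a minimal sketch of a least-recently-used (LRU) cache built on the standard library. It is not katsuragi's implementation; the LRU type and its NewLRU, Get, and Put methods are hypothetical names chosen for this example, and the cached values are plain strings standing in for fetched pages.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one cached item: a URL and the content stored for it.
type entry struct {
	key   string
	value string
}

// LRU is a hypothetical fixed-capacity cache (not katsuragi's own type).
type LRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
}

// NewLRU creates an empty cache holding at most capacity entries.
func NewLRU(capacity int) *LRU {
	return &LRU{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the cached value and marks the key as most recently used.
func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put stores a value, evicting the least recently used key when the cache is full.
func (c *LRU) Put(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	if c.capacity <= 0 {
		return // a cache without capacity stores nothing
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
}

func main() {
	cache := NewLRU(2)
	cache.Put("https://a.example", "<html>A</html>")
	cache.Put("https://b.example", "<html>B</html>")
	cache.Get("https://a.example")                   // a becomes the most recently used entry
	cache.Put("https://c.example", "<html>C</html>") // cache is full, so b is evicted
	_, ok := cache.Get("https://b.example")
	fmt.Println(ok) // false
}
```

With a capacity of 10, the eleventh distinct URL would push out whichever cached URL was used least recently.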

Description

The GetDescription() function currently supports the following description meta tags:

  • <meta name="description" content="Description">
  • <meta name="twitter:description" content="Description">
  • <meta property="og:description" content="Description">

...
  // Get website's description
  description, err := fetcher.GetDescription("https://www.example.com")
...

Favicons

The GetFavicons() function currently supports the following favicon tags:

  • <link rel="icon" href="favicon.ico">
  • <link rel="apple-touch-icon" href="favicon.png">
  • <meta property="og:image" content="favicon.png">

    Open Graph image (og:image) will be used only if both og:image:width and og:image:height are present and equal, forming a square image.
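
The square-image rule can be sketched roughly as follows. This is only an illustration of the rule as stated above, not the library's code: isSquareOGImage is a hypothetical helper, and it assumes the page's Open Graph properties have already been collected into a map from property name to content value.

```go
package main

import (
	"fmt"
	"strconv"
)

// isSquareOGImage reports whether the Open Graph metadata describes a square
// image. meta maps property names (e.g. "og:image:width") to content values.
// Hypothetical helper for illustration; not part of katsuragi's API.
func isSquareOGImage(meta map[string]string) bool {
	w, err := strconv.Atoi(meta["og:image:width"])
	if err != nil {
		return false // width missing or not a number
	}
	h, err := strconv.Atoi(meta["og:image:height"])
	if err != nil {
		return false // height missing or not a number
	}
	return w > 0 && w == h
}

func main() {
	square := map[string]string{"og:image:width": "512", "og:image:height": "512"}
	banner := map[string]string{"og:image:width": "1200", "og:image:height": "630"}
	noSize := map[string]string{"og:image": "favicon.png"}

	fmt.Println(isSquareOGImage(square)) // true
	fmt.Println(isSquareOGImage(banner)) // false
	fmt.Println(isSquareOGImage(noSize)) // false
}
```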

...
  // Get website's favicons
  favicons, err := fetcher.GetFavicons("https://www.example.com")
  // [https://www.example.com/favicon.ico, https://www.example.com/favicon.png]
...
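
The sample output above shows relative href values such as favicon.ico returned as absolute URLs. One way to do that kind of resolution with the standard net/url package is sketched below; resolveHref is a hypothetical helper written for this example, and whether katsuragi resolves URLs this way internally is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveHref turns an href found on a page into an absolute URL by resolving
// it against the page's own URL. Hypothetical helper, not katsuragi's API.
func resolveHref(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

func main() {
	for _, href := range []string{"favicon.ico", "/favicon.png", "https://cdn.example.com/icon.png"} {
		abs, err := resolveHref("https://www.example.com/", href)
		if err != nil {
			fmt.Println("skipping", href, err)
			continue
		}
		fmt.Println(abs)
	}
}
```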

Links/Backlinks

The GetLinks() function searches for all <a> tags in the HTML document and returns a slice of links.

Options:

  • Url (required): The URL of the website to fetch.
  • Category (optional): The category of links to fetch. Possible values are internal, external, and all. Default is all.

  // Get website's links
  links, err := fetcher.GetLinks(GetLinksProps{
    Url: "https://www.example.com",
    Category: "external",
  })
  // [https://www.youtube.com/example, https://www.facebook.com/example]
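
The internal/external distinction can be sketched as a host comparison. The classifyLink helper below is hypothetical and only illustrates the idea; it assumes a link is internal when its host matches the page's host exactly (so subdomains count as external here), which may differ from the rule katsuragi actually applies.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// classifyLink labels a link found on pageURL as "internal" or "external" by
// comparing hosts. Hypothetical helper for illustration; the exact rule used
// by katsuragi is an assumption.
func classifyLink(pageURL, link string) (string, error) {
	page, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	// Resolve the link against the page so relative links get the page's host.
	target, err := page.Parse(link)
	if err != nil {
		return "", err
	}
	if strings.EqualFold(target.Hostname(), page.Hostname()) {
		return "internal", nil
	}
	return "external", nil
}

func main() {
	page := "https://www.example.com"
	for _, link := range []string{"/about", "https://www.youtube.com/example"} {
		kind, err := classifyLink(page, link)
		if err != nil {
			fmt.Println("skipping", link, err)
			continue
		}
		fmt.Println(link, "->", kind)
	}
}
```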

Local Development

Testing

go test -v

Code Coverage

# Generate coverage.out report, generate HTML report from coverage.out, and open the HTML report in the browser:
go test -coverprofile=coverage.out && go tool cover -html=coverage.out -o coverage.html && open coverage.html

License

This project is licensed under the GNU General Public License (GPL). See the repository's license file for the full text.
