Library for scraping Bandcamp content. (supports Cloudflare Pages)

bandcamp-fetch

A JS library for scraping Bandcamp content; inspired by bandcamp-scraper.

Installation

npm i bandcamp-fetch --save

Usage

const bcfetch = require('bandcamp-fetch');

bcfetch.discover(...).then( results => {
    ...
});

API

Each function returns a Promise which resolves to the fetched data.
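Because every call returns a Promise, failures can be handled in one place. The sketch below is a hypothetical helper (`fetchOrNull` is not part of the library) and assumes that a failed fetch rejects the Promise, which this README does not state explicitly.

```javascript
// Hypothetical helper (not part of bandcamp-fetch): awaits any call that
// returns a Promise and turns a rejection into a logged message plus null.
async function fetchOrNull(fetchCall) {
  try {
    return await fetchCall();
  } catch (err) {
    // Assumption: failures surface as rejected Promises.
    console.error('Fetch failed:', err && err.message ? err.message : err);
    return null;
  }
}

// Usage, assuming the package is installed and Bandcamp is reachable:
//   const bcfetch = require('bandcamp-fetch');
//   fetchOrNull(() => bcfetch.getTags()).then(tags => console.log(tags));
```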

discover([params], [options])

Example (output)

Fetches albums through Bandcamp Discover.

  • params (optional) - object specifying params to be passed to Bandcamp Discover

    • genre
    • subgenre: only valid when genre is something other than 'all'
    • location
    • sortBy
    • artistRecommendationType: only valid when sortBy is 'rec' (artist recommended)
    • format
    • time
    • page

    All properties are optional. Possible values for each property can be obtained with the getDiscoverOptions() function.

    params passed to this function will be sanitized with sanitizeDiscoverParams(). A copy of the sanitized params can be obtained through the params property of the returned result.

  • options (optional) - object specifying options to be used when formulating results:

    • albumImageFormat: name, ID, or object referring to an image format.
    • artistImageFormat

    All properties are optional. Image formats can be obtained with the getImageFormats() function.
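The two validity rules above (subgenre and artistRecommendationType) can be made explicit before calling discover(). This is only a sketch: `buildDiscoverParams` is a hypothetical helper, the library already sanitizes params on its own, and treating an omitted genre as "no subgenre allowed" is an assumption.

```javascript
// Hypothetical helper: copies only the discover() params that are set and
// applies the two documented validity rules.
function buildDiscoverParams(input) {
  const params = {};
  for (const key of ['genre', 'location', 'sortBy', 'format', 'time', 'page']) {
    if (input[key] !== undefined) params[key] = input[key];
  }
  // subgenre is only valid when genre is something other than 'all'.
  // Assumption: an omitted genre is treated like 'all' here.
  if (input.subgenre !== undefined && input.genre !== undefined && input.genre !== 'all') {
    params.subgenre = input.subgenre;
  }
  // artistRecommendationType is only valid when sortBy is 'rec'.
  if (input.artistRecommendationType !== undefined && input.sortBy === 'rec') {
    params.artistRecommendationType = input.artistRecommendationType;
  }
  return params;
}

// Usage, assuming the package is installed; real values should come from
// getDiscoverOptions():
//   const bcfetch = require('bandcamp-fetch');
//   bcfetch.discover(buildDiscoverParams({ genre: 'all', sortBy: 'rec' }))
//     .then(result => console.log(result.params));
```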

getDiscoverOptions()

Example (output)

Fetches Bandcamp Discover options that can be passed back to discover().

sanitizeDiscoverParams(params)

Example (output)

Sanitizes params by setting default values for omitted params and removing irrelevant ones.

You don't have to call this function on params passed to discover() - they will be sanitized automatically.
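To see what sanitizing changed, the requested params can be compared with the sanitized copy. A minimal sketch, assuming sanitizeDiscoverParams() resolves to a plain object and that a removed param is either absent or undefined in it; `droppedKeys` and `explainSanitizing` are hypothetical names, and the library object is passed in so the helper can also run against a stub.

```javascript
// Hypothetical helper: lists requested keys that have no value after sanitizing.
function droppedKeys(requested, sanitized) {
  return Object.keys(requested).filter(key => sanitized[key] === undefined);
}

// Hypothetical helper: returns the sanitized params together with the
// keys that were removed. `bcfetch` is the object from require('bandcamp-fetch').
async function explainSanitizing(bcfetch, params) {
  const sanitized = await bcfetch.sanitizeDiscoverParams(params);
  return { sanitized, dropped: droppedKeys(params, sanitized) };
}

// Usage, assuming the package is installed:
//   explainSanitizing(require('bandcamp-fetch'), { genre: 'all', subgenre: 'x' })
//     .then(({ sanitized, dropped }) => console.log(sanitized, dropped));
```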

getImageFormats([filter])

Example (output)

Fetches the list of image formats used in Bandcamp.

  • filter (optional) - 'artist' or 'album'. If specified, narrows down the result to include only formats applicable to the specified value.

getImageFormat(idOrName)

Fetches the image format that matches the given ID or name. Resolves to null if no matching format is found.
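Since an unknown ID or name yields null rather than an error, callers may want to fail early. The following is a sketch of one way to do that; `requireImageFormat` is a hypothetical helper, not a library function.

```javascript
// Hypothetical helper: resolves to the image format, or rejects with a
// descriptive error when getImageFormat() reports no match (null).
// `bcfetch` is the object returned by require('bandcamp-fetch').
async function requireImageFormat(bcfetch, idOrName) {
  const format = await bcfetch.getImageFormat(idOrName);
  if (format === null) {
    throw new Error(`Unknown Bandcamp image format: ${idOrName}`);
  }
  return format;
}

// Usage, assuming the package is installed; valid names can be looked up
// with getImageFormats():
//   requireImageFormat(require('bandcamp-fetch'), someIdOrName)
//     .then(format => console.log(format))
//     .catch(err => console.error(err.message));
```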

getArtistOrLabelInfo(artistOrLabelUrl, [options])

Example (output)

Fetches information about an artist or label.

  • artistOrLabelUrl
  • options (optional)
    • imageFormat

getLabelArtists(labelUrl, [options])

Example (output)

Fetches the list of artists belonging to a label.

  • labelUrl
  • options (optional)
    • imageFormat

getDiscography(artistOrLabelUrl, [options])

Example (output)

Fetches the list of albums and standalone tracks belonging to an artist or label.

  • artistOrLabelUrl
  • options (optional)
    • imageFormat
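getArtistOrLabelInfo() and getDiscography() take the same URL and the same imageFormat option, so they can be requested together. A minimal sketch, with `fetchArtistOverview` as a hypothetical helper name; the shape of each result is not described here, so both are returned untouched.

```javascript
// Hypothetical helper: fetches profile info and discography in parallel.
// `bcfetch` is the object returned by require('bandcamp-fetch').
async function fetchArtistOverview(bcfetch, artistOrLabelUrl, imageFormat) {
  // Only pass imageFormat when the caller supplied one.
  const options = imageFormat === undefined ? {} : { imageFormat };
  const [info, discography] = await Promise.all([
    bcfetch.getArtistOrLabelInfo(artistOrLabelUrl, options),
    bcfetch.getDiscography(artistOrLabelUrl, options),
  ]);
  return { info, discography };
}

// Usage, assuming the package is installed; the URL is a placeholder:
//   fetchArtistOverview(require('bandcamp-fetch'), 'https://example.bandcamp.com')
//     .then(overview => console.log(overview));
```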

getAlbumInfo(albumUrl, [options])

Example (output)

Fetches information about an album.

  • albumUrl
  • options (optional)
    • albumImageFormat
    • artistImageFormat
    • includeRawData

getTrackInfo(trackUrl, [options])

Example (output)

Fetches information about a track.

  • trackUrl
  • options (optional)
    • albumImageFormat
    • artistImageFormat
    • includeRawData
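getAlbumInfo() and getTrackInfo() accept the same options, so a caller holding a mixed list of URLs could dispatch on the URL path. This sketch assumes Bandcamp's usual `/album/...` and `/track/...` path convention; `fetchReleaseInfo` is a hypothetical helper.

```javascript
// Hypothetical helper: picks getAlbumInfo() or getTrackInfo() from the URL path.
// Assumption: album pages live under /album/ and track pages under /track/.
// `bcfetch` is the object returned by require('bandcamp-fetch').
async function fetchReleaseInfo(bcfetch, releaseUrl, options = {}) {
  const { pathname } = new URL(releaseUrl);
  if (pathname.startsWith('/album/')) {
    return bcfetch.getAlbumInfo(releaseUrl, options);
  }
  if (pathname.startsWith('/track/')) {
    return bcfetch.getTrackInfo(releaseUrl, options);
  }
  throw new Error(`Not an album or track URL: ${releaseUrl}`);
}

// Usage, assuming the package is installed; the URL is a placeholder:
//   fetchReleaseInfo(require('bandcamp-fetch'),
//     'https://example.bandcamp.com/album/some-album')
//     .then(info => console.log(info));
```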

getAlbumHighlightsByTag(tagUrl, [options])

Example (output)

Fetches album highlights for the tag referred to by tagUrl. The result is an array of album collections, with each collection corresponding to a highlight category such as 'new and notable' and 'all-time best selling'.

  • tagUrl

    Tag URLs can be obtained with the getTags() function.

  • options (optional)

    • imageFormat

getTags()

Example (output)

Fetches Bandcamp tags. The result is an object with the following properties:

  • tags: non-location tags
  • locations: location tags
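The result of getTags() can feed getAlbumHighlightsByTag(). The structure of individual tag entries is not described here, so this sketch leaves the choice of URL to a caller-supplied function; `highlightsForTag` and `selectTagUrl` are hypothetical names.

```javascript
// Hypothetical helper: fetches the tags, lets the caller pick a tag URL,
// then fetches the highlight collections for that tag.
// `bcfetch` is the object returned by require('bandcamp-fetch').
async function highlightsForTag(bcfetch, selectTagUrl, options = {}) {
  const { tags, locations } = await bcfetch.getTags();
  // selectTagUrl receives both groups and must return a tag URL string.
  const tagUrl = selectTagUrl(tags, locations);
  // Documented result: an array of album collections, one per highlight category.
  return bcfetch.getAlbumHighlightsByTag(tagUrl, options);
}

// Usage, assuming the package is installed; inspect the getTags() output
// first to see how a tag entry exposes its URL:
//   highlightsForTag(require('bandcamp-fetch'), (tags, locations) => pickUrl(tags))
//     .then(collections => console.log(collections.length));
```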

search(params, [options])

Example (output)

Searches for params.query.

  • params
    • query: search string
    • page (1 if omitted)
  • options (optional)
    • albumImageFormat
    • artistImageFormat
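Because search() takes a page number, several result pages can be collected in a loop. A minimal sketch, with `searchPages` as a hypothetical helper; it fetches pages one at a time and returns each page's result as-is, since the result shape is not described here.

```javascript
// Hypothetical helper: runs search() for pages 1..pageCount sequentially
// and returns the per-page results in order.
// `bcfetch` is the object returned by require('bandcamp-fetch').
async function searchPages(bcfetch, query, pageCount, options = {}) {
  const pages = [];
  for (let page = 1; page <= pageCount; page++) {
    pages.push(await bcfetch.search({ query, page }, options));
  }
  return pages;
}

// Usage, assuming the package is installed:
//   searchPages(require('bandcamp-fetch'), 'ambient', 2)
//     .then(pages => console.log(pages));
```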

Caching

The library maintains an in-memory cache for two types of resources:

  1. page - pages fetched during scraping
  2. constant - image formats and discover options

Functions related to the cache can be called this way:

const bcfetch = require('bandcamp-fetch');

bcfetch.cache.setTTL('page', 500);
bcfetch.cache.setMaxPages(20);
bcfetch.cache.clear('constant');

cache.setTTL(type, TTL)

Sets the expiry time, in seconds, of cache entries for the given resource type.

  • type: 'page' or 'constant'
  • TTL: expiry time in seconds (default: 300 for 'page' and 3600 for 'constant')

cache.setMaxPages(maxPages)

Sets the maximum number of pages that can be stored in the cache. A negative value means unlimited. Default: 10.

cache.clear([type])

Clears the cache entries for the given resource type.

  • type (optional): 'page' or 'constant'. If unspecified, clears the entire cache.
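The three cache functions can be grouped into one configuration step at startup. This is only a sketch: `configureCache` and its settings object are hypothetical, and the cache functions are assumed to be synchronous.

```javascript
// Hypothetical helper: applies only the cache settings that are provided.
// `bcfetch` is the object returned by require('bandcamp-fetch').
function configureCache(bcfetch, settings) {
  const { pageTTL, constantTTL, maxPages } = settings;
  // TTL values are in seconds (documented defaults: 300 for 'page', 3600 for 'constant').
  if (pageTTL !== undefined) bcfetch.cache.setTTL('page', pageTTL);
  if (constantTTL !== undefined) bcfetch.cache.setTTL('constant', constantTTL);
  // A negative maxPages means unlimited (documented default: 10).
  if (maxPages !== undefined) bcfetch.cache.setMaxPages(maxPages);
}

// Usage, assuming the package is installed:
//   const bcfetch = require('bandcamp-fetch');
//   configureCache(bcfetch, { pageTTL: 600, maxPages: -1 });
//   bcfetch.cache.clear('page'); // drop cached pages, keep constants
```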

License

MIT