Library for scraping Bandcamp content. (supports Cloudflare Pages)

Go to file

patrickkfkan f1b9100635 Update version		2021-01-26 17:24:32 +08:00
examples	Update examples	2021-01-26 17:22:46 +08:00
lib	Implement proper caching	2021-01-26 15:58:39 +08:00
.gitignore	Initial commit	2021-01-20 04:37:31 +08:00
package.json	Update version	2021-01-26 17:24:32 +08:00
README.md	Update readme	2021-01-26 17:24:10 +08:00

README.md

bandcamp-fetch

A JS library for scraping Bandcamp content; inspired by bandcamp-scraper.

Installation

npm i bandcamp-fetch --save

Usage

const bcfetch = require('bandcamp-fetch');

bcfetch.discover(...).then( results => {
    ...
});

API

Each function returns a Promise which resolves to the fetched data.

`discover([params], [options])`

Example (output)

Fetches albums through Bandcamp Discover.

params (optional) - object specifying params to be passed to Bandcamp Discover
- genre
- subgenre: only valid when genre is something other than 'all'
- location
- sortBy
- artistRecommendationType: only valid when sortBy is 'rec' (artist recommended)
- format
- time
- page
All properties are optional. Possible values for each property can be obtained with the getDiscoverOptions() function.

params passed to this function will be sanitized with sanitizeDiscoverParams(). A copy of the sanitized params can obtained through the params property of the returned result.
options (optional) - object specifying options to be used when formulating results:
- albumImageFormat: name, Id or object referring to an image format.
- artistImageFormat
All properties are optional. Image formats can be obtained with the getImageFormats() function.

`getDiscoverOptions()`

Example (output)

Fetches Bandcamp Discover options that can be passed back to discover().

`sanitizeDiscoverParams(params)`

Example (output)

Sanitizes params by setting default values for omitted params and removing irrelevant ones.

You don't have to call this function on params passed to discover() - they will be sanitized automatically.

`getImageFormats([filter])`

Example (output)

Fetches the list of image formats used in Bandcamp.

filter (optional) - 'artist' or 'album'. If specified, narrows down the result to include only formats applicable to the specified value.

`getImageFormat(idOrName)`

Fetches the image format that matches Id or name. If none is found, the result will be null.

`getArtistOrLabelInfo(artistOrLabelUrl, [options])`

Example (output)

Fetches information about an artist or label.

artistOrLabelUrl
options (optional)
- imageFormat

`getLabelArtists(labelUrl, [options])`

Example (output)

Fetches the list of artists belonging to a label.

labelUrl
options (optional)
- imageFormat

`getDiscography(artistOrLabelUrl, [options])`

Example (output)

Fetches the list of albums and standalone tracks belonging to an artist or label.

artistOrLabelUrl
options (optional)
- imageFormat

`getAlbumInfo(albumUrl, [options])`

Example (output)

Fetches information about an album.

albumUrl
options (optional)
- albumImageFormat
- artistImageFormat
- includeRawData

`getTrackInfo(trackUrl, [options])`

Example (output)

Fetches information about a track.

trackUrl
options (optional)
- albumImageFormat
- artistImageFormat
- includeRawData

`getAlbumHighlightsByTag(tagUrl, [options])`

Example (output)

Fetches album highlights for the tag referred to by tagUrl. The result is an array of album collections, with each collection corresponding to a highlight category such as 'new and notable' and 'all-time best selling'.

tagUrl

Tag URLs can be obtained with the getTags() function.
options (optional)
- imageFormat

`getTags()`

Example (output)

Fetches Bandcamp tags. The result is an object with the following properties:

tags: non-location tags
locations: location tags

`search(params, [options])`

Example (output)

Searches for params.query.

params
- query: search string
- page (1 if omitted)
options (optional)
- albumImageFormat
- artistImageFormat

Caching

The library maintains an in-memory cache for two types of resources:

page - pages fetched during scraping
constant - image formats and discover options

Functions related to the cache can be called this way:

const bcfetch = require('bandcamp-fetch');

bcfetch.cache.setTTL('page', 500);
bcfetch.cache.setMaxPages(20);
bcfetch.cache.clear('constant');

`cache.setTTL(type, TTL)`

Sets the expiry time, in seconds, of cache entries for the given resource type.

type: 'page' or 'constant'
TTL: expiry time in seconds (default: 300 for 'page' and 3600 for 'constant')

`cache.setMaxPages(maxPages)`

Sets the maximum number of pages that can be stored in the cache. A negative value means unlimited. Default: 10.

`cache.clear([type])`

Clears the cache entries for the given resource type.

type (optional): 'page' or 'constant'. If unspecified, clears the entire cache.

License

MIT