tum-conf-video-scraper - DEV


tum-conf-scraper

This is a scraper, written in Node.js and based on Puppeteer, that retrieves the videos served by Tum Conf services.

Install

To install tum-conf-scraper, run:

$ npm install tum-conf-scraper

Project purpose

This module was written because videos hosted on Tum Conf are difficult to download and can be watched only in the browser. It is built on the video-scraper-core module and allows those videos to be recorded.

Project usage

To scrape a video available at "https://tum-conf.zoom.us/rec/share/myvideo" and save it to "./saved.webm":

const { TumConfVideoScraper } = require('tum-conf-scraper');

async function main() {
    // Create an instance of the scraper
    const scraper = new TumConfVideoScraper('mypasscode', {
        debug: true
    });
    // Launch the Chrome browser
    await scraper.launch();
    // Scrape and save the video
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm');
    // Close the browser
    await scraper.close();
}
main();
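
In the example above, if scrape rejects, close is never reached and the browser window stays open. A minimal sketch of a wrapper that always closes the browser, assuming only the launch, scrape and close methods listed in the API section; withScraper is a hypothetical helper name and is not part of tum-conf-scraper:

```javascript
// Hypothetical helper (not part of tum-conf-scraper): launches the browser,
// runs the given task and always closes the browser, even if the task throws.
async function withScraper(scraper, task) {
    await scraper.launch();
    try {
        return await task(scraper);
    } finally {
        await scraper.close();
    }
}

// Usage sketch, assuming tum-conf-scraper is installed:
// const { TumConfVideoScraper } = require('tum-conf-scraper');
// withScraper(new TumConfVideoScraper('mypasscode'), (scraper) =>
//     scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm')
// ).catch(console.error);
```

The helper takes the scraper instance as a parameter, so the same wrapper can be reused with differently configured instances.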

To scrape and download more than one video:

const { TumConfVideoScraper } = require('tum-conf-scraper');

async function main() {
    // Create an instance of the scraper
    const scraper = new TumConfVideoScraper('mypasscode', {
        debug: true
    });
    // Launch the Chrome browser
    await scraper.launch();
    // Scrape and save the first video
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm');
    // Scrape and save the second video
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo-bis', './saved_bis.webm');
    // Close the browser
    await scraper.close();
}
main();

To scrape and download more than one video in parallel:

const { TumConfVideoScraper } = require('tum-conf-scraper');

async function scrape(dest, link) {
    // Create an instance of the scraper
    const scraper = new TumConfVideoScraper('mypasscode', {
        debug: true
    });
    // Launch the Chrome browser
    await scraper.launch();
    // Scrape and save the video
    await scraper.scrape(link, dest);
    // Close the browser
    await scraper.close();
}

async function main() {
    const tasks = [
        ['./saved.webm', 'https://tum-conf.zoom.us/rec/share/myvideo'],
        ['./saved_bis.webm', 'https://tum-conf.zoom.us/rec/share/myvideo-bis']
    ].map(([dest, link]) => scrape(dest, link));
    await Promise.all(tasks);
}
main();
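
The parallel example opens one browser window per video, which can become heavy with many links. A minimal sketch of capping how many scrapes run at once; runWithLimit is a hypothetical helper written in plain JavaScript, and the limit of 2 in the usage comment is an arbitrary choice:

```javascript
// Hypothetical helper: runs async job functions with at most `limit` in flight.
// Results are returned in the same order as `jobs`.
async function runWithLimit(jobs, limit) {
    const results = new Array(jobs.length);
    let next = 0;

    // Each worker repeatedly takes the next unclaimed job until none remain.
    async function worker() {
        while (next < jobs.length) {
            const index = next++;
            results[index] = await jobs[index]();
        }
    }

    // Start at least one worker and never more workers than jobs.
    const workerCount = Math.max(1, Math.min(limit, jobs.length));
    await Promise.all(Array.from({ length: workerCount }, () => worker()));
    return results;
}

// Usage sketch with the scrape(dest, link) function from the example above:
// const jobs = [
//     ['./saved.webm', 'https://tum-conf.zoom.us/rec/share/myvideo'],
//     ['./saved_bis.webm', 'https://tum-conf.zoom.us/rec/share/myvideo-bis']
// ].map(([dest, link]) => () => scrape(dest, link));
// runWithLimit(jobs, 2).catch(console.error);
```

Note that the jobs are passed as functions rather than as already started promises, so a job only begins when a worker picks it up.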

With custom options:

const { TumConfVideoScraper } = require('tum-conf-scraper');

async function main() {
    // Browser options
    const scraper = new TumConfVideoScraper('mypasscode', {
        debug: true,
        debugScope: 'This will be written as scope of the euberlog debug',
        windowSize: {
            width: 1000,
            height: 800
        },
        browserExecutablePath: '/usr/bin/firefox'
    });
    await scraper.launch();

    // Scraping options
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm', { duration: 1000 });
    await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo-bis', './saved_bis.webm', {
        audio: false,
        delayAfterVideoStarted: 3000,
        delayAfterVideoFinished: 2000
    });

    await scraper.close();
}
main();

All the options are listed in the API section and in the TypeScript definitions.

API

The documentation site is: tum-conf-scraper documentation

The development documentation site is: tum-conf-scraper dev documentation

TumConfVideoScraper

The TumConfVideoScraper class scrapes a video from Tum Conf and saves it to a file.

Syntax:

const scraper = new TumConfVideoScraper(passcode, options);

Parameters:

  • passcode: A string that specifies the passcode to access the video page.
  • options: Optional. A BrowserOptions object that specifies the options for this instance.

Methods:

  • setBrowserOptions(options: BrowserOptions): void: Changes the browser options to the ones given by the options parameter.
  • launch(): Promise: Launches the browser window.
  • close(): Promise: Closes the browser window.
  • scrape(url: string, destPath: string, options: ScrapingOptions): Promise: Scrapes the video at url and saves it to destPath. The options parameter is optional and takes a ScrapingOptions object.

BrowserOptions

The options given to the TumConfVideoScraper constructor. See video-scraper-core for more information.

ScrapingOptions

The options passed to the scrape method. See video-scraper-core for more information.

Errors

This module can also throw some error classes. See video-scraper-core for more information.
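
Since the concrete error classes are defined in video-scraper-core, here is a minimal sketch that does not depend on their names. It scrapes a list of videos one by one with an already launched scraper and records which ones failed instead of stopping at the first error. scrapeAll and the shape of the returned report are hypothetical; only the scrape(url, destPath, options) call comes from the API above:

```javascript
// Hypothetical helper: scrapes each job in order and collects failures.
// `scraper` is expected to be already launched.
// Each job is an object { url, destPath, options }, where options may be omitted.
async function scrapeAll(scraper, jobs) {
    const report = { saved: [], failed: [] };
    for (const { url, destPath, options } of jobs) {
        try {
            await scraper.scrape(url, destPath, options);
            report.saved.push(destPath);
        } catch (error) {
            // Keep the original error so the caller can inspect it,
            // for example with instanceof against the classes of video-scraper-core.
            report.failed.push({ url, error });
        }
    }
    return report;
}
```

The caller still has to launch the scraper before and close it after calling the helper.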

Tests

The package is tested with jest and ts-jest. The tests actually try to download some videos and check that they are saved. Because they cannot run headless, they are not run in the CI.

Notes

  • The default browser is Google Chrome at /usr/bin/google-chrome, because Chromium did not support the BBB videos. You can always change the browser executable path with the browserExecutablePath option.
  • By default (if the duration option is null), the duration of the recording is detected automatically by looking at the vjs player of the page, and a stopping delay of 15 seconds is added.
  • This module can be used only in headful mode.
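
The automatic duration rule from the notes can be written down as simple arithmetic. A minimal sketch, assuming that durations are expressed in milliseconds and that an explicit duration is used as it is; recordingDuration is a hypothetical function that only illustrates the rule and is not the module's actual implementation:

```javascript
// The 15 second stopping delay mentioned in the notes, in milliseconds.
const STOPPING_DELAY_MS = 15 * 1000;

// Hypothetical illustration of the rule, not the module's real code.
// durationOption: explicit duration in ms, or null to use the detected one.
// detectedMs: duration read from the player of the page, in ms (assumed unit).
function recordingDuration(durationOption, detectedMs) {
    if (durationOption !== null && durationOption !== undefined) {
        // Assumption: an explicit duration overrides the detection entirely.
        return durationOption;
    }
    return detectedMs + STOPPING_DELAY_MS;
}
```

Under these assumptions, a video detected as 60000 ms long would be recorded for 75000 ms when duration is null.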
