This is a continuation of my earlier post about the requirements of The Public Sector Bodies (Websites and Mobile Applications) Accessibility Regulations 2018 for documents supplied to students. It is quite possible that there will be extensive online teaching in autumn 2020, much of which would be delivered through live or pre-recorded video. This post is about what the PSB regulations require for this sort of content.

Scope

I have ignored requirements for audio-only or video-only content, because I don’t expect this to be widely used. The world has enough podcasts.

PSB part 1 section 3 (2) says

These Regulations do not apply to the following content of a website and mobile application of a public sector body…(c) live time-based media.

This seems pretty clear: the PSB regulations don’t impose any additional requirements on live audio/video teaching (even though EN 301 549 has section 6 on two-way voice and video communication). For this reason I’ve omitted anything that applies only to live media.

Glossary

Here are the definitions and abbreviations from EN 301 549

In particular: it is clear that ICT in EN 301 549 can refer to web pages (e.g. 9.2.1 where ICT is a webpage, it shall satisfy…), and that documents include webpages.

From PSB part 1 section 2: “‘time-based media’ means media of one or more of the following types: audio-only, video-only, audio-video, audio and/or video combined with interaction.”

See also the WCAG 2.0 glossary. In particular, synchronized media includes a video with an audio track. An alternative for time-based media is a “document including correctly sequenced text descriptions of time-based visual and auditory information…” An audio description is also called a video description or descriptive narration.

The requirements

As in the previous post, the PSB regulations require that content adhere to EN 301 549, which in turn harmonizes with WCAG 2.0. The part of EN 301 549 relevant to pre-recorded video is section 9 on web content.

EN 301 549 section 9: Web content

9.2.1 non-text content

All non-text content has a text alternative. Text alternatives to time-based media provide descriptive identification of the non-text content.

9.2.3 captions (pre-recorded)

Captions are provided for all pre-recorded audio in synchronized media, except when the media is itself an alternative to text.

9.2.4 audio description or media alternative (pre-recorded)

Refers to WCAG 2.0 1.2.3. Requires an alternative for time-based media or an audio description of the pre-recorded video component of synchronized media, same exception as before.

9.2.6 audio description (pre-recorded)

Refers to WCAG 2.0 1.2.5. Pre-recorded video content must have an audio description, same exception. The difference between 9.2.4 and 9.2.6 is that 9.2.4 can be met with either an alternative for time-based media or an audio description, whereas 9.2.6 specifically requires an audio description.

Summary of the requirements

Very simply, pre-recorded video must have captions as well as an audio description. I don’t have any helpful thoughts about audio descriptions for math teaching videos at the moment, but the next part of the post will look at creating captions.

Captions

Producing captions for your own videos is not difficult. You just need to produce a text file with timing information and the text content, one possible format being .srt, and supply it to whatever player you use. There is free software that can help, e.g. Gnome Subtitles.
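To make the .srt format concrete, here is a minimal Python sketch that writes one. An .srt file is a sequence of blocks separated by blank lines: a counter, a line of start and end times in the form HH:MM:SS,mmm, then the caption text. The function names and the timings in the example are my own inventions for illustration; they are not taken from a real video.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build SRT text from a list of (start_seconds, end_seconds, text) cues."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Blocks are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical timings, for illustration only.
cues = [
    (0.0, 3.5, "In this video we're going to prove Fermat's little theorem."),
    (3.5, 8.25, "It's called that to distinguish it from Fermat's last theorem."),
]
srt_text = to_srt(cues)
print(srt_text)
```

Saving `srt_text` to a file with a .srt extension should give something most players will accept, though I haven't tried it with every player.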

However, producing your own captions is very time-consuming: even if you get really fluent, it probably takes two or three times the length of the video.

Auto-captions example

YouTube will automatically generate captions using speech-to-text. Here’s the text of the auto-captions for the first part of a video I made about Fermat’s little theorem.

in this video we’re going to prove Fermat’s little theorem it’s called that to distinguish it from thermals last theorem so the thing about a to the n plus B to the N equals C to the N but here’s the statement so firm as little theorem says that if you take a prime number P and if you take an integer X which is not divisible by P then X to the power P minus 1 is congruent to 1 mod p like their difference is divisible by P so the way we’re going to prove this is by applying the corollary from our last lecture which I’ve copied at the bottom there to a particular group so in fact to the group said P cross so you’ll remember that Z P cross under multiplication was a group and the set said P cross consisted of the equivalence classes class of 1 mod p class of 2 mod p and so on up to the class of P minus 1 mod p so all the equivalence classes in Z P except for the equivalence class of 0 which as you’ll remember consists of all the multiples of P okay well what we can say about this integer X we’ve got is since X is not divisible by P then the …

The results are…mixed. It got one Fermat correct, for example, and it knows technical vocabulary like equivalence, mod, integer, and congruent, but lots of it is wrong.

Can we get away with just using the auto-captions?

Probably not. You can see from the above that they’re not comparable to being able to hear and understand the audio.

Here is a post from the University of Minnesota Duluth on YouTube auto-captioning. They link to a story about Harvard and MIT being sued for failing to caption their online media, implying in particular that auto-captions don’t suffice. They cite a figure of 60-70% accuracy for YouTube automatic captioning, which feels plausible (though you would expect technical vocabulary to reduce this further).
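If you want a rough accuracy figure for your own videos, one option is to compare the auto-captions against a corrected transcript word by word. The sketch below does this using word-level edit distance (substitutions, insertions and deletions). This is an assumption on my part about how to measure it: the source of the 60-70% figure doesn't say how it was calculated, so the numbers may not be directly comparable. The function names are hypothetical, and the "reference" sentence is my own correction of one line of the auto-captions.

```python
def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # word missing from the captions
                current[j - 1] + 1,      # extra word in the captions
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_accuracy(reference_text, hypothesis_text):
    """1 minus the word error rate, ignoring case."""
    reference = reference_text.lower().split()
    hypothesis = hypothesis_text.lower().split()
    if not reference:
        raise ValueError("reference transcript is empty")
    return 1 - word_edit_distance(reference, hypothesis) / len(reference)


# What I actually said (my correction) versus what the auto-captions produced.
reference = "it's called that to distinguish it from Fermat's last theorem"
auto = "it's called that to distinguish it from thermals last theorem"
print(f"{word_accuracy(reference, auto):.0%}")
```

A single sentence is far too small a sample to say anything about overall accuracy; you would want to run this over a whole corrected transcript.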

Information from the US about this is not quite as irrelevant as it might seem: this page suggests you can comply with the ADA by meeting WCAG 2.0 level AA, which EN 301 549 is also supposed to harmonize with.

That doesn’t mean the automatically generated captions are useless: correcting them is probably faster than creating captions from scratch. If you’re willing to upload your videos to YouTube (they don’t have to be publicly viewable) you can edit the autogenerated captions to improve them. You can also download the subtitle file for use elsewhere. Correcting the captions will be even faster if you’re fluent with Vim motions, and easymotion may help too.
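Some mistakes recur throughout a video (the recognizer tends to get the same name wrong in the same way), so once you have the subtitle file it may be worth fixing those in bulk before editing by hand. Here is a rough Python sketch, assuming the file is in .srt format. The correction table is built from errors in the transcript above; the function name and the sample timings are made up for illustration.

```python
import re

# Misrecognized phrases seen in the auto-captions, with their corrections.
CORRECTIONS = {
    "thermals last theorem": "Fermat's last theorem",
    "firm as little theorem": "Fermat's little theorem",
    "said P cross": "Z P cross",
}

# Matches an SRT timing line such as "00:00:03,500 --> 00:00:08,250".
TIMING_LINE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def correct_srt(srt_text, corrections):
    """Apply phrase corrections to caption text, leaving counters and timings alone."""
    corrected_lines = []
    for line in srt_text.splitlines():
        is_counter = line.strip().isdigit()
        is_timing = TIMING_LINE.match(line) is not None
        if not (is_counter or is_timing):
            for wrong, right in corrections.items():
                line = line.replace(wrong, right)
        corrected_lines.append(line)
    return "\n".join(corrected_lines) + "\n"


# Hypothetical excerpt of a downloaded subtitle file.
sample = (
    "1\n"
    "00:00:03,500 --> 00:00:08,250\n"
    "it's called that to distinguish it from thermals last theorem\n"
    "\n"
    "2\n"
    "00:00:13,000 --> 00:00:16,000\n"
    "so firm as little theorem says that\n"
)
fixed = correct_srt(sample, CORRECTIONS)
print(fixed)
```

This only catches phrases that sit on a single caption line; anything split across two captions still needs fixing by hand, so it is a first pass rather than a replacement for proofreading.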