Chapter 10: Creating Quality Videos

Video production is a time-intensive process that requires extensive planning and preparation. Please review the following sections to ensure that you have allotted enough time and resources to create the amount of video you want for your course.

Timed Transcripts

To meet most online accessibility guidelines, you will need a transcript (timed if possible) for every video you produce. Some video hosting systems require a specific transcript format (such as SRT, a timed format that allows specific words to appear on the screen as they are being spoken in the video), while others accept many different formats, so make sure to check with your video hosting service. You can always pay a transcription service to create transcript files from completed videos, but keep in mind that this can get expensive. One way to save time and money on transcriptions is to create your videos using the recommended guidelines below. These guidelines will help you create a professional video with a transcript already in hand that can be quickly converted to any transcription format.
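To make the SRT format mentioned above more concrete, here is a minimal Python sketch that turns timed script lines into SRT text. Each SRT cue consists of a sequence number, a start and end time written as HH:MM:SS,mmm and separated by -->, and the caption text, followed by a blank line. The function names and the example cues are hypothetical and only for illustration; your hosting service may have additional requirements.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated from each other by one blank line.
    return "\n".join(blocks)


# Hypothetical cues: script lines with timings noted while reviewing the video.
example_cues = [
    (0.0, 3.5, "Welcome to the course."),
    (3.5, 8.25, "In this video we look at timed transcripts."),
]

print(to_srt(example_cues))
```

Because the caption text comes straight from the script, most of the remaining work is noting when each line starts and ends in the finished video.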

Time Estimates for Online Course Video Production

Video content (up to 10 minutes)    Pre-prep*    Filming      Processing †
No slides, graphics, etc.           5 hours      0.5 hours    1 hour §
25% slides, no moving parts ‡       5.5 hours    0.5 hours    2 hours
50% slides, no moving parts ‡       6 hours      0.5 hours    5 hours
75% slides, no moving parts ‡       6.5 hours    0.5 hours    8 hours
25% moving parts ‡                  8 hours      0.5 hours    8 hours
50% moving parts ‡                  9 hours      0.5 hours    16 hours
75% moving parts ‡                  10 hours     0.5 hours    25 hours

*  includes creating scripts, gathering slides, providing documentation and content for moving parts, etc. All materials are required to be submitted in native digital format (i.e. no scans of handwritten content). Please see Script Creation Process below.
†  processing time will vary depending on the quality of content provided from the prep stage. Estimates given are for high quality native digital formats.
‡  moving parts refers to self-revealing equations, animations, moving graphics, or anything beyond a static PowerPoint slide. There are no estimates for 100% because all videos need to contain some video of the speaker.
§  includes 0.5 hours for audio processing and 0.5 hours for video processing. This amount is included in all estimates in this column.
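As a rough planning aid, the estimates in the table above can be added up per video. The following Python sketch simply stores the table values and sums the three stages; the dictionary keys and the function name are hypothetical labels chosen for this example, not part of any official tool.

```python
# Hours per stage for one video of up to 10 minutes, copied from the table:
# (pre-prep, filming, processing). Keys are (content type, percent of video).
ESTIMATES = {
    ("none", 0): (5.0, 0.5, 1.0),
    ("slides", 25): (5.5, 0.5, 2.0),
    ("slides", 50): (6.0, 0.5, 5.0),
    ("slides", 75): (6.5, 0.5, 8.0),
    ("moving", 25): (8.0, 0.5, 8.0),
    ("moving", 50): (9.0, 0.5, 16.0),
    ("moving", 75): (10.0, 0.5, 25.0),
}


def total_hours(kind, percent):
    """Sum pre-prep, filming, and processing hours for one video."""
    return sum(ESTIMATES[(kind, percent)])


for (kind, percent), stages in ESTIMATES.items():
    print(f"{kind:>6} {percent:>3}%: {sum(stages):.1f} hours in total")
```

Multiplying a total by the number of videos you plan gives a first, optimistic estimate, since these figures assume high-quality native digital materials and no problems along the way.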

The Importance of the Script Creation Process

Creating a script is the most time-intensive step for presenters, but it is also the single most important piece of preparation. The script heavily affects editing time for the final media, how well students digest the material, and compliance with government-mandated closed-captioning requirements.

Impact on students: Research has shown that students pay attention for only a short time (about 4.5 minutes) before they start to lose interest in the media (Guo, Kim, & Rubin, 2014). If this is true, why not capitalize on the value of a well-performed, efficient presentation (devoid of digressions, miscues, fumbling over misplaced slides, and fillers like “Ummmm”)? All of these issues can be avoided if the presenter has a script and performs it with a degree of enthusiasm and engagement. With a script, it takes far less air time to get a point across than if the idea were presented ad lib.

Impact on delivery: Often, the delivery of online media (especially when prepared using taxpayer dollars at state institutions) is required to be multi-channel; that is, closed captioning is often mandated for all audio tracks. If your subject matter is highly technical, it is far better that you (as the expert) provide a transcript of what is said than to have a transcript prepared by a third party, who probably does not know your subject matter or your specialized terminology.

Impact on editing: There are two extremes in editing the audio track of a presentation.

  1. The audio is a scripted performance: the performer has practiced the script a few times and performs it “on camera” with relatively few errors or retakes. A scripted performance usually results in a recording in which the presenter sounds knowledgeable, authoritative, and well prepared. In a well-scripted performance, the presenter delivers the material with a degree of dynamic enthusiasm. The speaker gains credibility.
  2. The audio is an unscripted performance, in which the performer must conceive and present the material on the fly. Ad-lib performances tend to ramble, to be disorganized, and to have a slow speech register. Ideas and terminology are being retrieved from memory, which requires mental processing time; because the performer tends to speak faster than they think, the delivery slows down to match the time it takes to construct the discussion in the presenter’s mind. If the performer makes verbal mistakes, they will employ conversation repair mechanisms to redirect the audience back to the corrected train of thought. These repair mechanisms involve a louder speaking volume and changes in pitch and speaking rate. If the editor removes the mistakes, the voice quality before and after the splice (where the mistake was removed) does not match, and the audience knows that a substantial edit has occurred. Also, in an unscripted performance, the presenter usually sounds distracted or unknowledgeable, unsure and unauthoritative, and disorganized and unprepared. The audience wonders why they are listening to someone who does not know the topic. Students wonder why they are required to submit well-organized assignments on time when the instructor clearly is disorganized and doing things at the last minute. The speaker loses credibility.

In terms of production schedule, there is a trade-off between pre-production workload and post-production workload: time invested in scripting reduces the time needed for editing. Usually, the presenter is responsible for most of the pre-production workload, and the editor is responsible for the bulk of the post-production workload.

Script Creation Process

One highly recommended method for script creation is to pre-record the presenter delivering the material live (in class or into a Dictaphone device) to serve as raw content. This pre-recording is transcribed and then edited for three purposes: to improve rhetorical organization (eliminating digressions and backtracking, or combining answers to clarification questions with the original statements), to correct improper vocabulary, and to adjust the information density to match the requirements of potential on-screen video effects. The last step usually involves replacing minimal grammatical phrases such as pronouns or generic verbs like “do”; for example, replace “if you do this” with “if you perform this calculation.”

Please note that parts of the script preparation can be performed by persons other than the presenter. The presenter delivers a live performance of the presentation, but can then hand off that recording to a graduate teaching assistant (GTA) or transcription service to get a transcription, and then the GTA or the presenter can edit the script. The edited script is then given back to the presenter for review and practice. Therefore, the presenter may need to be involved in only about 2 hours of the 5-hour script-prep activity.

The rule of thumb is that live lectures usually shrink by about two-thirds when converted to professional audio, so roughly 30 minutes of live material yields about 10 minutes of finished audio. Therefore, to create a video segment of up to 10 minutes, the following table gives minimum estimates for pre-prep work:

Task                                                                   Time
Live performance raw content capture (lecture, etc.)                   0.5 hours
Transcription of raw content (can be sent to transcription service)    2.0 hours
Presenter edits transcription                                          2.0 hours
Presenter reads through edited transcription for practice              0.5 hours
Minimum script generation time                                         5.0 hours

Also, as noted earlier, a complete script is much easier to convert to the required SRT format. For instructions on how to create SRT files yourself, please see this guide:

In some cases, you may not be able to go through this entire process. In those cases, focus on what you can accomplish. Start by writing your script out in a conversational tone. Then read it out loud to time it and to hear how it sounds to you. Edit your script and read it aloud again until you feel you have it right, removing extra parts and rabbit trails that aren’t needed as you go. Then practice with the final version a few times before recording it. Even if you think you are good at “winging it,” preparation and practice will make you sound more professional. Finally, look at suggestions for how to make good marketing or branding videos, since these contain tips that will help any video script writing process. For one example, see:

Audio Editing

At a minimum, the audio editor must listen to the full length of the recording. Even if a 10-minute recording is absolutely clean, with zero edits required, the audio editor will still have spent 10 minutes “editing.” However, the typical scripted session will involve a small number of edits to correct for mispronunciations, stumbles, page-turning, and so on. Typically, about 20% of the recorded scripted session is discarded. If the session is scripted, it is fairly easy to make splices that are “undetectable” in the final cut.

In contrast, if the session is unscripted (for example, lecture capture), typically, a very large number of edits have to be made to generate a final cut that is devoid of information errors, organization errors, speech errors, etc.  As a result, typically, two-thirds of the unscripted studio session is discarded.
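The discard rates above can be turned around to estimate how much raw recording is needed for a given amount of finished audio. The following Python sketch assumes the typical rates quoted in this section (about 20% discarded for scripted sessions, about two-thirds for unscripted ones); the constant and function names are hypothetical and chosen only for this example.

```python
from fractions import Fraction

# Typical share of a recording that is discarded during audio editing.
SCRIPTED_DISCARD = Fraction(1, 5)    # about 20% of a scripted session
UNSCRIPTED_DISCARD = Fraction(2, 3)  # about two-thirds of an unscripted session


def raw_minutes_needed(finished_minutes, discard_fraction):
    """Minutes of raw recording needed to end up with the finished length."""
    return float(finished_minutes / (1 - discard_fraction))


print(raw_minutes_needed(10, SCRIPTED_DISCARD))    # 12.5
print(raw_minutes_needed(10, UNSCRIPTED_DISCARD))  # 30.0
```

In other words, a 10-minute finished segment needs roughly 12.5 minutes of scripted recording but about 30 minutes of unscripted recording, and the editor has to listen to all of it.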

Overall Time Considerations

Therefore, to create up to 10 minutes of quality video, a minimum of 6.5 hours is required if no problems are encountered (5 hours of pre-prep, 0.5 hours of filming, and 1 hour of processing). However, the chance that there will be no issues is very small. Additionally, this is the estimate for straight video with no graphics, animations, self-revealing equations, etc. (see “Graphics and Video Preparation” below).

To put this into perspective, let’s compare to full in-person lecture capture. Consider a presenter going to class 3 times per week for a one-hour presentation for 15 weeks (a standard MWF one-hour lecture format). Let’s assume that it takes these 45 hours of unscripted sessions (contact hours) to deliver a course load of information. That will require a total of about 540 man-hours to produce and, because about two-thirds of unscripted material is discarded, will yield only 15 hours of finished audio tracks.

On the other hand, if the presenter were to take those 45 hours of live-presentation time and reallocate them to 30 hours of script prep (enhanced by additional hours of GTA/transcription service involvement) and 15 hours of studio presentation, then this will require only 135 total man-hours to process and will yield 12 hours of finished audio tracks. These audio tracks will also be of higher quality: the presenter will appear to be better informed, more authoritative, and better prepared; that is, more professional, with higher credibility.

Thus, with scripted studio performances for the audio of a standard course, there is a savings of approximately 400 man-hours. This block of 400 man-hours can be redirected to providing more elaborate graphics for this course (or providing audio editing for 2 additional courses, or providing the audio editing for this course for 1/3 of the price…).
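The comparison above can be checked with a few lines of arithmetic. This Python sketch uses the figures from the text (45 unscripted hours at 540 man-hours versus 15 scripted studio hours at 135 man-hours) and assumes the typical discard rates from the Audio Editing section apply; the variable and function names are hypothetical.

```python
from fractions import Fraction

# Figures taken from the comparison in the text; discard rates are the
# typical values from the Audio Editing section (an assumption here).
UNSCRIPTED = {"recorded_hours": 45, "discard": Fraction(2, 3), "man_hours": 540}
SCRIPTED = {"recorded_hours": 15, "discard": Fraction(1, 5), "man_hours": 135}


def finished_hours(plan):
    """Hours of finished audio left after the discarded share is cut."""
    return plan["recorded_hours"] * (1 - plan["discard"])


def man_hours_per_finished_hour(plan):
    """Production effort spent for each hour of finished audio."""
    return Fraction(plan["man_hours"]) / finished_hours(plan)


savings = UNSCRIPTED["man_hours"] - SCRIPTED["man_hours"]

print(float(finished_hours(UNSCRIPTED)))               # 15.0
print(float(finished_hours(SCRIPTED)))                 # 12.0
print(savings)                                         # 405
print(float(man_hours_per_finished_hour(UNSCRIPTED)))  # 36.0
print(float(man_hours_per_finished_hour(SCRIPTED)))    # 11.25
```

The exact difference is 405 man-hours, which the text rounds to approximately 400.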

Graphics and Video Preparation

The graphics/moving parts portion of video creation is much more time-consuming, but it typically requires less presenter time. The presenter provides raw materials to the media editor, and the media editor may then spend weeks building the media. This affects both monetary budgeting and time budgeting (to ensure the finished media is deployed on time). The table at the beginning of this chapter shows how quickly the overall production time increases with the addition of slides, animations, self-revealing equations, etc. However, these graphics are important for illustrating points and examples and should not be left out just because of time considerations.

The difficulty in estimating time requirements for these additions is that complexity can vary greatly. For example, self-revealing equations look very simple when comparing the final product to computer animated graphics, but can take much longer to create due to the extensive needs of proper placement, spacing, and readability of each individual character. This process can also be slowed down by inferior raw material provided by instructors (for example, handwritten notes that are hard to read and not properly written out).

To ensure the best possible delivery of content with the least amount of wasted time, we recommend that all materials be created in a native digital format (for example, graphics created in a graphics editor and not just a scan of a textbook page). If the media specialist can copy and paste high-quality content right into the video editing program, this can save them and the presenter enormous amounts of time. It may seem faster to write out a formula on paper than to enter it in an equation editor. However, if the media specialist has to call you to clarify what doesn’t look right, and then meet with you for an hour or more to clear up each equation, you might actually spend more time in the long run by trying to save time up front.

Therefore, please sit down with your technical media consultant to go through all graphical needs to ensure you are using the highest quality creation method from the beginning. Even if it does turn out that they need to do a hand sketch to create a completely new animation from scratch (which is rare), at least they will have created their own hand sketch that they will understand from the beginning.

Licensing and Videos

Another issue to consider when creating your videos is how they will be licensed. In many cases, your institution or company will already have decided how the videos you create will be licensed. In other cases, you may have some or total freedom to license the videos that you create as you like. If you decide to use any type of open license for your video, this could increase the impact and spread of your content outside of the course. For more details on open licensing (and licensing in general), see the chapter on Open Educational Resources.



Guo, P. J., Kim, J., & Rubin, R. (2014, March). How video production affects student engagement: An empirical study of MOOC videos. In Proceedings of the First ACM Conference on Learning @ Scale (pp. 41-50). ACM.


Chapter 10: Creating Quality Videos by Matt Crosslin is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.
